Re: [PATCH] Fix PR 103288, ICE after PHI-OPT, move an assignment when still in use for another bb

2021-11-16 Thread Richard Biener via Gcc-patches
On November 17, 2021 8:46:54 AM GMT+01:00, apinski--- via Gcc-patches wrote:
>From: Andrew Pinski 
>
>The problem is r12-5300-gf98f373dd822b35c allows phiopt to recognize more 
>basic blocks
>but missed one location where phiopt could move an assignment from the middle 
>block
>to the non-middle one.  This patch fixes that.
>
>OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

OK. 
Richard. 

>   PR 103288
>
>gcc/ChangeLog:
>
>   * tree-ssa-phiopt.c (value_replacement): Return early if middle
>   block has more than one pred.
>
>gcc/testsuite/ChangeLog:
>
>   * gcc.c-torture/compile/pr103288-1.c: New test.
>---
> gcc/testsuite/gcc.c-torture/compile/pr103288-1.c | 6 ++
> gcc/tree-ssa-phiopt.c| 3 +++
> 2 files changed, 9 insertions(+)
> create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr103288-1.c
>
>diff --git a/gcc/testsuite/gcc.c-torture/compile/pr103288-1.c 
>b/gcc/testsuite/gcc.c-torture/compile/pr103288-1.c
>new file mode 100644
>index 000..88d1c675599
>--- /dev/null
>+++ b/gcc/testsuite/gcc.c-torture/compile/pr103288-1.c
>@@ -0,0 +1,6 @@
>+
>+int ui_5;
>+long func_14_uli_8;
>+void func_14() {
>+ui_5 &= (func_14_uli_8 ? 60 : ui_5) ? 5 : 0;
>+}
>diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c
>index 6b22f6bedd4..8984a5e15ab 100644
>--- a/gcc/tree-ssa-phiopt.c
>+++ b/gcc/tree-ssa-phiopt.c
>@@ -1381,6 +1381,9 @@ value_replacement (basic_block cond_bb, basic_block 
>middle_bb,
>   }
> }
> 
>+  if (!single_pred_p (middle_bb))
>+return 0;
>+
>   /* Now optimize (x != 0) ? x + y : y to just x + y.  */
>   gsi = gsi_last_nondebug_bb (middle_bb);
>   if (gsi_end_p (gsi))



[PATCH] Fix PR 103288, ICE after PHI-OPT, move an assignment when still in use for another bb

2021-11-16 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

The problem is that r12-5300-gf98f373dd822b35c allows phiopt to recognize more
basic blocks but missed one location where phiopt could move an assignment from
the middle block to the non-middle one.  This patch fixes that.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR 103288

gcc/ChangeLog:

* tree-ssa-phiopt.c (value_replacement): Return early if middle
block has more than one pred.

gcc/testsuite/ChangeLog:

* gcc.c-torture/compile/pr103288-1.c: New test.
---
 gcc/testsuite/gcc.c-torture/compile/pr103288-1.c | 6 ++
 gcc/tree-ssa-phiopt.c| 3 +++
 2 files changed, 9 insertions(+)
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr103288-1.c

diff --git a/gcc/testsuite/gcc.c-torture/compile/pr103288-1.c 
b/gcc/testsuite/gcc.c-torture/compile/pr103288-1.c
new file mode 100644
index 000..88d1c675599
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr103288-1.c
@@ -0,0 +1,6 @@
+
+int ui_5;
+long func_14_uli_8;
+void func_14() {
+ui_5 &= (func_14_uli_8 ? 60 : ui_5) ? 5 : 0;
+}
diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c
index 6b22f6bedd4..8984a5e15ab 100644
--- a/gcc/tree-ssa-phiopt.c
+++ b/gcc/tree-ssa-phiopt.c
@@ -1381,6 +1381,9 @@ value_replacement (basic_block cond_bb, basic_block 
middle_bb,
}
 }
 
+  if (!single_pred_p (middle_bb))
+return 0;
+
   /* Now optimize (x != 0) ? x + y : y to just x + y.  */
   gsi = gsi_last_nondebug_bb (middle_bb);
   if (gsi_end_p (gsi))
-- 
2.17.1
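
The context line in the hunk above mentions the value-replacement case that turns (x != 0) ? x + y : y into x + y. As a minimal sketch of why the two forms agree, the following standalone C compares them directly. The function names cond_form and folded_form are made up for illustration and do not appear in tree-ssa-phiopt.c; unsigned operands are an assumption chosen so that wraparound is well defined.

```c
/* Illustrative only: not code from tree-ssa-phiopt.c.  Unsigned operands
   are used so that wraparound is well defined for every input.  */

/* The conditional form named in the comment in value_replacement.  */
unsigned int
cond_form (unsigned int x, unsigned int y)
{
  return (x != 0) ? x + y : y;
}

/* The form it simplifies to: when x is 0, x + y is already y.  */
unsigned int
folded_form (unsigned int x, unsigned int y)
{
  return x + y;
}
```

Both functions return the same value for every pair of inputs, which is what lets the pass drop the branch. The patch only restricts when the pass may move statements out of the middle block, not the identity itself.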



Re: [PATCH 12/15] i386: Fix non-robust split condition in define_insn_and_split

2021-11-16 Thread Uros Bizjak via Gcc-patches
On Thu, Nov 11, 2021 at 12:25 PM Kewen Lin  wrote:
>
> This patch is to fix some non-robust split conditions in some
> define_insn_and_splits, to make each of them applied on top of
> the corresponding condition for define_insn part, otherwise the
> splitting could perform unexpectedly.
>
> gcc/ChangeLog:
>
> * config/i386/i386.md (*add3_doubleword, *addv4_doubleword,
> *addv4_doubleword_1, *sub3_doubleword,
> *subv4_doubleword, *subv4_doubleword_1,
> *add3_doubleword_cc_overflow_1, *divmodsi4_const,
> *neg2_doubleword, *tls_dynamic_gnu2_combine_64_): Fix split
> condition.

OK.

Thanks,
Uros.

> ---
>  gcc/config/i386/i386.md | 20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 6eb9de81921..2bd09e502ae 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -5491,7 +5491,7 @@ (define_insn_and_split "*add3_doubleword"
> (clobber (reg:CC FLAGS_REG))]
>"ix86_binary_operator_ok (PLUS, mode, operands)"
>"#"
> -  "reload_completed"
> +  "&& reload_completed"
>[(parallel [(set (reg:CCC FLAGS_REG)
>(compare:CCC
>  (plus:DWIH (match_dup 1) (match_dup 2))
> @@ -6300,7 +6300,7 @@ (define_insn_and_split "*addv4_doubleword"
> (plus: (match_dup 1) (match_dup 2)))]
>"ix86_binary_operator_ok (PLUS, mode, operands)"
>"#"
> -  "reload_completed"
> +  "&& reload_completed"
>[(parallel [(set (reg:CCC FLAGS_REG)
>(compare:CCC
>  (plus:DWIH (match_dup 1) (match_dup 2))
> @@ -6347,7 +6347,7 @@ (define_insn_and_split "*addv4_doubleword_1"
> && CONST_SCALAR_INT_P (operands[2])
> && rtx_equal_p (operands[2], operands[3])"
>"#"
> -  "reload_completed"
> +  "&& reload_completed"
>[(parallel [(set (reg:CCC FLAGS_REG)
>(compare:CCC
>  (plus:DWIH (match_dup 1) (match_dup 2))
> @@ -6641,7 +6641,7 @@ (define_insn_and_split "*sub3_doubleword"
> (clobber (reg:CC FLAGS_REG))]
>"ix86_binary_operator_ok (MINUS, mode, operands)"
>"#"
> -  "reload_completed"
> +  "&& reload_completed"
>[(parallel [(set (reg:CC FLAGS_REG)
>(compare:CC (match_dup 1) (match_dup 2)))
>   (set (match_dup 0)
> @@ -6817,7 +6817,7 @@ (define_insn_and_split "*subv4_doubleword"
> (minus: (match_dup 1) (match_dup 2)))]
>"ix86_binary_operator_ok (MINUS, mode, operands)"
>"#"
> -  "reload_completed"
> +  "&& reload_completed"
>[(parallel [(set (reg:CC FLAGS_REG)
>(compare:CC (match_dup 1) (match_dup 2)))
>   (set (match_dup 0)
> @@ -6862,7 +6862,7 @@ (define_insn_and_split "*subv4_doubleword_1"
> && CONST_SCALAR_INT_P (operands[2])
> && rtx_equal_p (operands[2], operands[3])"
>"#"
> -  "reload_completed"
> +  "&& reload_completed"
>[(parallel [(set (reg:CC FLAGS_REG)
>(compare:CC (match_dup 1) (match_dup 2)))
>   (set (match_dup 0)
> @@ -7542,7 +7542,7 @@ (define_insn_and_split 
> "*add3_doubleword_cc_overflow_1"
> (plus: (match_dup 1) (match_dup 2)))]
>"ix86_binary_operator_ok (PLUS, mode, operands)"
>"#"
> -  "reload_completed"
> +  "&& reload_completed"
>[(parallel [(set (reg:CCC FLAGS_REG)
>(compare:CCC
>  (plus:DWIH (match_dup 1) (match_dup 2))
> @@ -9000,7 +9000,7 @@ (define_insn_and_split "*divmodsi4_const"
> (clobber (reg:CC FLAGS_REG))]
>"!optimize_function_for_size_p (cfun)"
>"#"
> -  "reload_completed"
> +  "&& reload_completed"
>[(set (match_dup 0) (match_dup 2))
> (set (match_dup 1) (match_dup 4))
> (parallel [(set (match_dup 0)
> @@ -10515,7 +10515,7 @@ (define_insn_and_split "*neg2_doubleword"
> (clobber (reg:CC FLAGS_REG))]
>"ix86_unary_operator_ok (NEG, mode, operands)"
>"#"
> -  "reload_completed"
> +  "&& reload_completed"
>[(parallel
>  [(set (reg:CCC FLAGS_REG)
>   (ne:CCC (match_dup 1) (const_int 0)))
> @@ -16898,7 +16898,7 @@ (define_insn_and_split 
> "*tls_dynamic_gnu2_combine_64_"
> (clobber (reg:CC FLAGS_REG))]
>"TARGET_64BIT && TARGET_GNU2_TLS"
>"#"
> -  ""
> +  "&& 1"
>[(set (match_dup 0) (match_dup 4))]
>  {
>operands[4] = can_create_pseudo_p () ? gen_reg_rtx (ptr_mode) : 
> operands[0];
> --
> 2.27.0
>

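For readers unfamiliar with the convention being fixed here: in a define_insn_and_split, a split condition that starts with "&&" is combined with the insn condition, while one that does not is used on its own. The following is a small C model of that rule, not GCC code. The function split_may_fire and its parameters are hypothetical names introduced only for this sketch.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of how the effective split condition is formed.
   INSN_COND is the value of the define_insn condition, SPLIT_TEXT is the
   split condition as written in the pattern, and SPLIT_VALUE is what the
   text after any leading "&&" evaluates to.  */
bool
split_may_fire (bool insn_cond, const char *split_text, bool split_value)
{
  /* A leading "&&" means "insn condition AND this".  */
  bool relative = strncmp (split_text, "&&", 2) == 0;
  return relative ? (insn_cond && split_value) : split_value;
}
```

With the old text "reload_completed" the model lets the split fire even when the insn condition is false, which is the non-robust behaviour the patch description refers to. With "&& reload_completed" it cannot.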

[PATCH] Fix tree-optimization/101941: IPA splitting out function with error attribute

2021-11-16 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

The Linux kernel started failing to compile when the jump threader was improved
(r12-2591-g2e96b5f14e4025691). The failure occurred because the IPA splitting
code now decided to split off a basic block containing two function calls, one
of which was to a function with the error attribute.  This patch fixes the
problem by refusing to split off basic blocks that contain calls to functions
with either the error or warning attribute.

The two new testcases make sure we still split the function in other
places when we reject that one case.
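
The check this patch adds can be sketched outside of GCC as a simple predicate over a callee's attribute names. This is a rough model only: GCC keeps attributes in a tree list and queries them with lookup_attribute, whereas the string array, has_attribute and call_prevents_split below are hypothetical stand-ins.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an attribute lookup: ATTRS is an array of N
   attribute names attached to the called function.  */
static bool
has_attribute (const char *const *attrs, size_t n, const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (attrs[i], name) == 0)
      return true;
  return false;
}

/* Mirrors the new condition: a call to a function carrying either the
   warning or the error attribute makes its block ineligible for
   splitting.  */
bool
call_prevents_split (const char *const *attrs, size_t n)
{
  return has_attribute (attrs, n, "warning")
         || has_attribute (attrs, n, "error");
}
```

A callee with only noreturn would still allow the split in this model, which matches the intent that other blocks of the function remain splittable.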

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/101941

gcc/ChangeLog:

* ipa-split.c (visit_bb): Disallow function calls where
the function has either error or warning attribute.

gcc/testsuite/ChangeLog:

* gcc.c-torture/compile/pr101941-1.c: New test.
* gcc.dg/tree-ssa/pr101941-1.c: New test.
---
 gcc/ipa-split.c   | 12 -
 .../gcc.c-torture/compile/pr101941-1.c| 44 +
 gcc/testsuite/gcc.dg/tree-ssa/pr101941-1.c| 48 +++
 3 files changed, 103 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/pr101941-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr101941-1.c

diff --git a/gcc/ipa-split.c b/gcc/ipa-split.c
index c68577d04a9..070e894ef31 100644
--- a/gcc/ipa-split.c
+++ b/gcc/ipa-split.c
@@ -873,7 +873,7 @@ visit_bb (basic_block bb, basic_block return_bb,
   gimple *stmt = gsi_stmt (bsi);
   tree op;
   ssa_op_iter iter;
-  tree decl;
+  tree decl = NULL_TREE;
 
   if (is_gimple_debug (stmt))
continue;
@@ -927,6 +927,16 @@ visit_bb (basic_block bb, basic_block return_bb,
break;
  }
 
+  /* If a function call and that function has either the
+warning or error attribute on it, don't split.  */
+  if (decl && (lookup_attribute ("warning", DECL_ATTRIBUTES (decl))
+  || lookup_attribute ("error", DECL_ATTRIBUTES (decl))))
+   {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file, "Cannot split: warning or error attribute.\n");
+ can_split = false;
+   }
+
   FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_DEF)
bitmap_set_bit (set_ssa_names, SSA_NAME_VERSION (op));
   FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_USE)
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr101941-1.c 
b/gcc/testsuite/gcc.c-torture/compile/pr101941-1.c
new file mode 100644
index 000..ab3bbea8ed7
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr101941-1.c
@@ -0,0 +1,44 @@
+/* { dg-additional-options "-fconserve-stack" } */
+struct crypto_aes_ctx {
+  char key_dec[128];
+};
+
+int rfc4106_set_hash_subkey_hash_subkey;
+
+void __write_overflow(void)__attribute__((__error__("")));
+void __write_overflow1(void);
+void aes_encrypt(void*);
+
+void fortify_panic(const char*) __attribute__((__noreturn__)) ;
+
+char *rfc4106_set_hash_subkey(struct crypto_aes_ctx *ctx) {
+  void *a = &ctx->key_dec[0];
+  unsigned p_size =  __builtin_object_size(a, 0);
+#ifdef __OPTIMIZE__
+  if (p_size < 16) {
+__write_overflow1();
+fortify_panic(__func__);
+  }
+  if (p_size < 32) {
+__write_overflow();
+fortify_panic(__func__);
+  }
+#endif
+  aes_encrypt(ctx);
+  return ctx->key_dec;
+}
+
+char *(*gg)(struct crypto_aes_ctx *) = rfc4106_set_hash_subkey;
+
+void a(void)
+{
+  struct crypto_aes_ctx ctx;
+  rfc4106_set_hash_subkey(&ctx);
+}
+void b(void)
+{
+  struct crypto_aes_ctx ctx;
+  ctx.key_dec[0] = 0;
+  rfc4106_set_hash_subkey(&ctx);
+}
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr101941-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr101941-1.c
new file mode 100644
index 000..21c1d1ec466
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr101941-1.c
@@ -0,0 +1,48 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fconserve-stack -fdump-tree-optimized" } */
+struct crypto_aes_ctx {
+  char key_dec[128];
+};
+
+int rfc4106_set_hash_subkey_hash_subkey;
+
+void __write_overflow(void)__attribute__((__error__("")));
+void __write_overflow1(void);
+void aes_encrypt(void*);
+
+void fortify_panic(const char*) __attribute__((__noreturn__)) ;
+
+char *rfc4106_set_hash_subkey(struct crypto_aes_ctx *ctx) {
+  void *a = &ctx->key_dec[0];
+  unsigned p_size =  __builtin_object_size(a, 0);
+#ifdef __OPTIMIZE__
+  if (p_size < 16) {
+__write_overflow1();
+fortify_panic(__func__);
+  }
+  if (p_size < 32) {
+__write_overflow();
+fortify_panic(__func__);
+  }
+#endif
+  aes_encrypt(ctx);
+  return ctx->key_dec;
+}
+
+char *(*gg)(struct crypto_aes_ctx *) = rfc4106_set_hash_subkey;
+
+void a(void)
+{
+  struct crypto_aes_ctx ctx;
+  rfc4106_set_hash_subkey(&ctx);
+}
+void b(void)
+{
+  struct crypto_aes_ctx ctx;
+  ctx.key_dec[0] = 0;
+  rfc4106_set_hash_subkey(&ctx);
+}
+
+/* This testcase should still split out one of the above 

Re: [RFC] c++: Print function template parms when relevant (was: [PATCH v4] c++: Add gnu::diagnose_as attribute)

2021-11-16 Thread Jason Merrill via Gcc-patches

On 11/8/21 11:40, Matthias Kretz wrote:

On Tuesday, 17 August 2021 20:31:54 CET Jason Merrill wrote:

2. Given a DECL_TI_ARGS tree, can I query whether an argument was deduced
or explicitly specified? I'm asking because I still consider diagnostics
of function templates unfortunate. `template  void f()` is fine,
as is `void f(T) [with T = float]`, but `void f() [with T = float]` could
be better. I.e. if the template parameter appears somewhere in the
function parameter list, dump_template_parms would only produce noise.
If, however, the template parameter was given explicitly, it would be
nice if it could show up accordingly in diagnostics.


NON_DEFAULT_TEMPLATE_ARGS_COUNT has that information, though there are
some issues with it.  Attached is my WIP from May to improve it
somewhat, if that's interesting.


It is interesting. I used your patch to come up with the attached patch. I
must say, I didn't try to read through all the cp/pt.c code to understand all
of what you did there (which is why my ChangeLog entry says "Jason?"), but it
works for me (and all of `make check`).

Anyway, I'd like to propose the following before finishing my diagnose_as
patch. I believe it's useful to fix this part first. The diagnostic/default-
template-args-[12].C tests show a lot of examples of the intent of this patch.
And the remaining changes to the testsuite show how it changes diagnostic
output.

-- 8< 

The choice when to print a function template parameter was still
suboptimal. That's because sometimes the function template parameter
list only adds noise, while in other situations the lack of a function
template parameter list makes diagnostic messages hard to understand.

The general idea of this change is to print template parms wherever they
would appear in the source code as well. Thus, the diagnostics code
needs to know whether any template parameter was given explicitly.

Signed-off-by: Matthias Kretz 

gcc/testsuite/ChangeLog:

 * g++.dg/debug/dwarf2/template-params-12n.C: Optionally, allow
 DW_AT_default_value.
 * g++.dg/diagnostic/default-template-args-1.C: New.
 * g++.dg/diagnostic/default-template-args-2.C: New.
 * g++.dg/diagnostic/param-type-mismatch-2.C: Expect template
 parms in diagnostic.
 * g++.dg/ext/pretty1.C: Expect function template specialization
 to not pretty-print template parms.
 * g++.old-deja/g++.ext/pretty3.C: Ditto.
 * g++.old-deja/g++.pt/memtemp77.C: Ditto.
 * g++.dg/goacc/template.C: Expect function template parms for
 explicit arguments.
 * g++.dg/gomp/declare-variant-7.C: Expect no function template
 parms for deduced arguments.
 * g++.dg/template/error40.C: Expect only non-default template
 arguments in diagnostic.

gcc/cp/ChangeLog:

 * cp-tree.h (GET_NON_DEFAULT_TEMPLATE_ARGS_COUNT): Return
 absolute value of stored constant.
 (EXPLICIT_TEMPLATE_ARGS_P): New.
 (SET_EXPLICIT_TEMPLATE_ARGS_P): New.
 (TFF_AS_PRIMARY): New constant.
 * error.c (get_non_default_template_args_count): Avoid
 GET_NON_DEFAULT_TEMPLATE_ARGS_COUNT if
 NON_DEFAULT_TEMPLATE_ARGS_COUNT is a NULL_TREE. Make independent
 of flag_pretty_templates.
 (dump_template_bindings): Add flags parameter to be passed to
 get_non_default_template_args_count. Print only non-default
 template arguments.
 (dump_function_decl): Call dump_function_name and dump_type of
 the DECL_CONTEXT with specialized template and set
 TFF_AS_PRIMARY for their flags.
 (dump_function_name): Add and document conditions for calling
 dump_template_parms.
 (dump_template_parms): Print only non-default template
 parameters.
 * pt.c (determine_specialization): Jason?


Also copy the inner TREE_VECs.


 (template_parms_level_to_args): Jason?


Always count non-default args.  Also, I think


-  if (CHECKING_P)
-SET_NON_DEFAULT_TEMPLATE_ARGS_COUNT (a, TREE_VEC_LENGTH (a));
+  SET_NON_DEFAULT_TEMPLATE_ARGS_COUNT (a, nondefault);


should have been

if (CHECKING_P || nondefault != TREE_VEC_LENGTH (a))
  SET_NON_DEFAULT_TEMPLATE_ARGS_COUNT (a, nondefault);


 (copy_template_args): Jason?


Only copy the non-default template args count on TREE_VECs that should 
have it.



 (fn_type_unification): Set EXPLICIT_TEMPLATE_ARGS_P on the
 template arguments tree if any template parameter was explicitly
 given.
 (type_unification_real): Jason?


Count non-default args sooner.


 (get_partial_spec_bindings): Jason?


Set non-default args count.


 (tsubst_template_args): Determine number of defaulted arguments
 from new argument vector, if possible.
---
  gcc/cp/cp-tree.h  | 18 +++-
  gcc/cp/error.c| 83 

Re: [PATCH 06/15] visium: Fix non-robust split condition in define_insn_and_split

2021-11-16 Thread Kewen.Lin via Gcc-patches
Hi Eric,

on 2021/11/17 12:57 AM, Eric Botcazou wrote:
>> gcc/ChangeLog:
>>
>>  * config/visium/visium.md (*add3_insn, *addsi3_insn, *addi3_insn,
>>  *sub3_insn, *subsi3_insn, *subdi3_insn, *neg2_insn,
>>  *negdi2_insn, *and3_insn, *ior3_insn, *xor3_insn,
>>  *one_cmpl2_insn, *ashl3_insn, *ashr3_insn,
>>  *lshr3_insn, *trunchiqi2_insn, *truncsihi2_insn,
>>  *truncdisi2_insn, *extendqihi2_insn, *extendqisi2_insn,
>>  *extendhisi2_insn, *extendsidi2_insn, *zero_extendqihi2_insn,
>>*zero_extendqisi2_insn, *zero_extendsidi2_insn): Fix split condition.
> 
> OK for mainline, thanks.
> 

Thanks!  Committed as r12-5332.

BR,
Kewen


[PATCH] Enhance optimize_atomic_bit_test_and to handle truncation.

2021-11-16 Thread liuhongt via Gcc-patches
r12-5102-gfb161782545224f5 improves integer bit test on
__atomic_fetch_[or|and]_* returns only for nop_convert, i.e.

transform

  mask_5 = 1 << bit_4(D);
  mask.0_1 = (unsigned int) mask_5;
  _2 = __atomic_fetch_or_4 (a_7(D), mask.0_1, 0);
  t1_9 = (int) _2;
  t2_10 = mask_5 & t1_9;

to

  mask_5 = 1 << n_4(D);
  mask.1_1 = (unsigned int) mask_5;
  _11 = .ATOMIC_BIT_TEST_AND_SET (_a_1_4, n_4(D), 0);
  _8 = (int) _11;

This patch extends the original patch to handle truncation,
i.e.

transform

  long int mask;
  mask_8 = 1 << n_7(D);
  mask.0_1 = (long unsigned int) mask_8;
  _2 = __sync_fetch_and_or_8 (_a_2_3, mask.0_1);
  _3 = (unsigned int) _2;
  _4 = (unsigned int) mask_8;
  _5 = _3 & _4;
  _6 = (int) _5;

to

  long int mask;
  mask_8 = 1 << n_7(D);
  mask.0_1 = (long unsigned int) mask_8;
  _14 = .ATOMIC_BIT_TEST_AND_SET (_a_2_3, n_7(D), 0);
  _5 = (unsigned int) _14;
  _6 = (int) _5;
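
As a rough guide to what source code leads to the two GIMPLE shapes above, here is a sketch in C using the GCC atomic builtins. The function names are made up for illustration, and it is an assumption that these exact functions lower to the statements shown; the dumps depend on the target and options. The second function assumes a target where long is wider than int, so that the conversion to unsigned int is a real truncation.

```c
/* Same-width case: the fetched value is only nop-converted before the
   mask is applied.  BIT is assumed to be in the range 0..30.  */
int
test_and_set_bit (unsigned int *a, int bit)
{
  int mask = 1 << bit;
  int old = (int) __atomic_fetch_or (a, (unsigned int) mask,
                                     __ATOMIC_RELAXED);
  return mask & old;
}

/* Truncating case: the fetched value and the mask are both narrowed to
   unsigned int before the AND.  N is assumed to be in the range 0..30.  */
int
test_and_set_bit_trunc (unsigned long *a, int n)
{
  long mask = 1L << n;
  unsigned long old = __sync_fetch_and_or (a, (unsigned long) mask);
  return (int) ((unsigned int) old & (unsigned int) mask);
}
```

In both functions the result is nonzero exactly when the bit was already set before the call, which is the property a bit-test-and-set instruction reports.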

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}
Ok for trunk?

2021-11-17  Hongtao Liu  
H.J. Lu  

gcc/ChangeLog:

PR tree-optimization/103194
* match.pd (gimple_nop_atomic_bit_test_and_p): Extended to
match truncation.
* tree-ssa-ccp.c (gimple_nop_convert): Declare.
(optimize_atomic_bit_test_and): Enhance
optimize_atomic_bit_test_and to handle truncation.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr103194-2.c: New test.
* gcc.target/i386/pr103194-3.c: New test.
* gcc.target/i386/pr103194-4.c: New test.
* gcc.target/i386/pr103194-5.c: New test.
* gcc.target/i386/pr103194.c: New test.
---
 gcc/match.pd   | 48 ++-
 gcc/testsuite/gcc.target/i386/pr103194-2.c | 64 ++
 gcc/testsuite/gcc.target/i386/pr103194-3.c | 64 ++
 gcc/testsuite/gcc.target/i386/pr103194-4.c | 61 +
 gcc/testsuite/gcc.target/i386/pr103194-5.c | 61 +
 gcc/testsuite/gcc.target/i386/pr103194.c   | 16 
 gcc/tree-ssa-ccp.c | 99 +++---
 7 files changed, 345 insertions(+), 68 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr103194-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr103194-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr103194-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr103194-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr103194.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 7f76925b6c6..6c68534fff5 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4021,39 +4021,43 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 
 #if GIMPLE
 (match (nop_atomic_bit_test_and_p @0 @1 @4)
- (bit_and (nop_convert?@4 (ATOMIC_FETCH_OR_XOR_N @2 INTEGER_CST@0 @3))
+ (bit_and (convert?@4 (ATOMIC_FETCH_OR_XOR_N @2 INTEGER_CST@0 @3))
   INTEGER_CST@1)
  (with {
 int ibit = tree_log2 (@0);
 int ibit2 = tree_log2 (@1);
}
   (if (ibit == ibit2
-  && ibit >= 0
+  && ibit >= 0
+  && TYPE_PRECISION (type) <= TYPE_PRECISION (TREE_TYPE (@2))
 
 (match (nop_atomic_bit_test_and_p @0 @1 @3)
- (bit_and (nop_convert?@3 (SYNC_FETCH_OR_XOR_N @2 INTEGER_CST@0))
+ (bit_and (convert?@3 (SYNC_FETCH_OR_XOR_N @2 INTEGER_CST@0))
  INTEGER_CST@1)
  (with {
 int ibit = tree_log2 (@0);
 int ibit2 = tree_log2 (@1);
}
   (if (ibit == ibit2
-  && ibit >= 0
+  && ibit >= 0
+  && TYPE_PRECISION (type) <= TYPE_PRECISION (TREE_TYPE (@2))
 
 (match (nop_atomic_bit_test_and_p @0 @0 @4)
  (bit_and:c
-  (nop_convert?@4
+  (convert1?@4
(ATOMIC_FETCH_OR_XOR_N @2 (nop_convert? (lshift@0 integer_onep@5 @6)) @3))
-  @0))
+  (convert2? @0))
+ (if (TYPE_PRECISION (type) <= TYPE_PRECISION (TREE_TYPE (@2)
 
 (match (nop_atomic_bit_test_and_p @0 @0 @4)
  (bit_and:c
-  (nop_convert?@4
+  (convert1?@4
(SYNC_FETCH_OR_XOR_N @2 (nop_convert? (lshift@0 integer_onep@3 @5
-  @0))
+  (convert2? @0))
+ (if (TYPE_PRECISION (type) <= TYPE_PRECISION (TREE_TYPE (@2)
 
 (match (nop_atomic_bit_test_and_p @0 @1 @3)
- (bit_and@4 (nop_convert?@3 (ATOMIC_FETCH_AND_N @2 INTEGER_CST@0 @5))
+ (bit_and@4 (convert?@3 (ATOMIC_FETCH_AND_N @2 INTEGER_CST@0 @5))
INTEGER_CST@1)
  (with {
 int ibit = wi::exact_log2 (wi::zext (wi::bit_not (wi::to_wide (@0)),
@@ -4061,11 +4065,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 int ibit2 = tree_log2 (@1);
}
   (if (ibit == ibit2
-  && ibit >= 0
+  && ibit >= 0
+  && TYPE_PRECISION (type) <= TYPE_PRECISION (TREE_TYPE (@2))
 
 (match (nop_atomic_bit_test_and_p @0 @1 @3)
  (bit_and@4
-  (nop_convert?@3 (SYNC_FETCH_AND_AND_N @2 INTEGER_CST@0))
+  (convert?@3 (SYNC_FETCH_AND_AND_N @2 INTEGER_CST@0))
   INTEGER_CST@1)
  (with {
 int ibit = wi::exact_log2 (wi::zext (wi::bit_not (wi::to_wide (@0)),
@@ -4073,19 +4078,22 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 int ibit2 = tree_log2 (@1);
}
   (if (ibit == ibit2
-  && ibit >= 

[r12-5324 Regression] FAIL: gcc.dg/tree-ssa/modref-dse-4.c scan-tree-dump dse2 "Deleted dead store: kill_me" on Linux/x86_64

2021-11-16 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

6dc90c4dbb6f9589dea9c670c3468496bb207de5 is the first bad commit
commit 6dc90c4dbb6f9589dea9c670c3468496bb207de5
Author: Jan Hubicka 
Date:   Tue Nov 16 23:01:28 2021 +0100

Use modref summaries for byte-wise dead store elimination.

caused

FAIL: gcc.dg/tree-ssa/modref-dse-4.c scan-tree-dump dse2 "Deleted dead store: 
kill_me"

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-5324/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/modref-dse-4.c 
--target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me
at skpgkp2 at gmail dot com)


Re: [PATCH v3] libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]

2021-11-16 Thread Marek Polacek via Gcc-patches
On Tue, Nov 16, 2021 at 09:28:21PM -0500, David Malcolm wrote:
> On Tue, 2021-11-16 at 19:37 -0500, Marek Polacek wrote:
> > Sorry for a dumb question, but is this what you have in mind?
> > 
> > /* LRE
> >    PDF */
> > /* FSI
> >    PDI */
> > and check that we warn for these?
> 
> I mean something like the following multiline comments in which lines
> within them at the start, middle and end have unpaired constructs
> within a given line:
> 
> 
> /* RLI
>  *
>  */
> 
> /*
>  * RLI
>  */
> 
> /*
>  *  
>  * RLI */
> 
> and that we should warn for each case at the line containing the
> unpaired control character.
> 
> (the above lines don't have the actual chars, just "RLI")
> 
> Mostly this is just me trying to think about it from a black-box
> testing perspective, or in case we ever touch this code in the future
> (perhaps it's obviously correct by inspection of the implementation
> now, but let's have regression tests for these cases).
> 
> Sorry to add more work, but here's an idea for another test case:
> multiple comments on one line:
> 
>   /* RLI */  /* PDF */
> 
> where the closure of a comment should trigger closing a "context", so
> we should complain about the above.

No problem, I've added these.
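
To make the "unpaired within a context" idea concrete, here is a simplified sketch in C. It is not the libcpp implementation: the names bidi_kind, classify and context_has_unpaired_bidi are hypothetical, the input is assumed to be already decoded into code points for a single context (one comment, one string literal, and so on), and the pairing rules are a reduced model in which PDF closes the most recent embedding or override and PDI closes the nearest open isolate together with anything opened after it.

```c
#include <stdbool.h>
#include <stddef.h>

enum bidi_kind
{
  BIDI_OTHER,
  BIDI_EMBED_OPEN,    /* LRE, RLE, LRO, RLO */
  BIDI_ISOLATE_OPEN,  /* LRI, RLI, FSI */
  BIDI_PDF,
  BIDI_PDI
};

static enum bidi_kind
classify (unsigned int cp)
{
  switch (cp)
    {
    case 0x202A: /* LRE */
    case 0x202B: /* RLE */
    case 0x202D: /* LRO */
    case 0x202E: /* RLO */
      return BIDI_EMBED_OPEN;
    case 0x2066: /* LRI */
    case 0x2067: /* RLI */
    case 0x2068: /* FSI */
      return BIDI_ISOLATE_OPEN;
    case 0x202C:
      return BIDI_PDF;
    case 0x2069:
      return BIDI_PDI;
    default:
      return BIDI_OTHER;
    }
}

#define MAX_DEPTH 64

/* Return true if the N code points in CPS, taken as one context, leave
   an embedding, override or isolate open at the end.  */
bool
context_has_unpaired_bidi (const unsigned int *cps, size_t n)
{
  enum bidi_kind stack[MAX_DEPTH];
  size_t depth = 0;

  for (size_t i = 0; i < n; i++)
    {
      enum bidi_kind k = classify (cps[i]);
      if (k == BIDI_EMBED_OPEN || k == BIDI_ISOLATE_OPEN)
        {
          if (depth == MAX_DEPTH)
            return true;  /* Too deep to track; report conservatively.  */
          stack[depth++] = k;
        }
      else if (k == BIDI_PDF)
        {
          /* PDF only closes an embedding or override on top.  */
          if (depth > 0 && stack[depth - 1] == BIDI_EMBED_OPEN)
            depth--;
        }
      else if (k == BIDI_PDI)
        {
          /* PDI closes the nearest open isolate and everything above.  */
          for (size_t j = depth; j > 0; j--)
            if (stack[j - 1] == BIDI_ISOLATE_OPEN)
              {
                depth = j - 1;
                break;
              }
        }
    }
  return depth != 0;
}
```

Under this model the two-comment example, /* RLI */ followed by /* PDF */, is flagged because each comment is checked as its own context and the first one ends with the isolate still open.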
 
> > 
> > > > > > @@ -1505,13 +1855,17 @@ lex_identifier (cpp_reader *pfile,
> > > > > > const uchar *base, bool starts_ucn,
> > > > > >  {
> > > > > >    /* Slower version for identifiers containing UCNs
> > > > > >  or extended chars (including $).  */
> > > > > > -  do {
> > > > > > -   while (ISIDNUM (*pfile->buffer->cur))
> > > > > > - {
> > > > > > -   NORMALIZE_STATE_UPDATE_IDNUM (nst, *pfile->buffer-
> > > > > > >cur);
> > > > > > -   pfile->buffer->cur++;
> > > > > > - }
> > > > > > -  } while (forms_identifier_p (pfile, false, nst));
> > > > > > +  do
> > > > > > +   {
> > > > > > + while (ISIDNUM (*pfile->buffer->cur))
> > > > > > +   {
> > > > > > + NORMALIZE_STATE_UPDATE_IDNUM (nst, *pfile-
> > > > > > >buffer->cur);
> > > > > > + pfile->buffer->cur++;
> > > > > > +   }
> > > > > > +   }
> > > > > > +  while (forms_identifier_p (pfile, false, nst));
> > > > > 
> > > > > Is the above purely a whitespace change?
> > > > 
> > > > Yes.
> > > 
> > > If I'm reading things correctly, these lines in the existing code
> > > were
> > > correctly indented, so is there a purpose to this change?  If not,
> > > please can you remove this change from the patch (to minimize the
> > > change to the history).
> > 
> > I dropped that change then.  Sometimes it's hard to resist fixing
> > formatting.  ;)
> 
> Thanks.  But I don't think the existing formatting in the code *was*
> broken; I thought the patch was taking correct formatting and breaking
> it (hence my objection to a whitespace change).  If I misread this,
> sorry.

I think it was, we're supposed to format do-while as

  do
    {
    }
  while (...);

but it's obviously not a big deal.

> Hopefully the above makes sense and is constructive; let me know when
> you push your patch so that I can work on my followup.

Pushed now.  Thanks!

Marek



Re: [PATCH v3] libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]

2021-11-16 Thread David Malcolm via Gcc-patches
On Tue, 2021-11-16 at 19:37 -0500, Marek Polacek wrote:
> On Tue, Nov 16, 2021 at 06:00:58PM -0500, David Malcolm wrote:
> > > On Mon, Nov 15, 2021 at 06:15:40PM -0500, David Malcolm wrote:
> > > > > On Mon, Nov 08, 2021 at 04:33:43PM -0500, Marek Polacek wrote:
> > > > > > Ping, can we conclude on the name?   IMHO, -Wbidirectional is
> > > > > > just fine,
> > > > > > but changing the name is a trivial operation. 
> > > > > 
> > > > > Here's a patch with a better name (suggested by Jonathan W.). 
> > > > > Otherwise no
> > > > > changes.
> > > > 
> > > > Thanks for implementing this.
> > > > 
> > > > > 
> > > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > > > > 
> > > > > -- >8 --
> > > > > From a link below:
> > > > > "An issue was discovered in the Bidirectional Algorithm in the
> > > > > Unicode
> > > > > Specification through 14.0. It permits the visual reordering of
> > > > > characters via control sequences, which can be used to craft
> > > > > source code
> > > > > that renders different logic than the logical ordering of
> > > > > tokens
> > > > > ingested by compilers and interpreters. Adversaries can
> > > > > leverage this to
> > > > > encode source code for compilers accepting Unicode such that
> > > > > targeted
> > > > > vulnerabilities are introduced invisibly to human reviewers."
> > > > > 
> > > > > More info:
> > > > > https://nvd.nist.gov/vuln/detail/CVE-2021-42574
> > > > > https://trojansource.codes/
> > > > > 
> > > > > This is not a compiler bug.  However, to mitigate the problem,
> > > > > this patch
> > > > > implements -Wbidi-chars=[none|unpaired|any] to warn about
> > > > > possibly
> > > > > misleading Unicode bidirectional characters the preprocessor
> > > > > may encounter.
> > 
> > [...snip...]
> > 
> > > > 
> > > > Terminology nit:
> > > > The patch is referring to "bidirectional characters", but I think
> > > > the
> > > > term "bidirectional control characters" would be better.
> > > 
> > > Adjusted.
> > 
> > Thanks.
> > 
> > I wonder if the warning should be -Wbidi-control-chars, but I don't
> > care enough to insist on it being changed.
> > 
> > >  
> > > > For example, a passage of text containing both numbers and
> > > > characters
> > > > in a right-to-left script could be considered "bidirectional",
> > > > since
> > > > the numbers are written from left-to-right.
> > > > 
> > > > Specifically, the patch looks for these specific characters:
> > > >   * U+202A LEFT-TO-RIGHT EMBEDDING
> > > >   * U+202B RIGHT-TO-LEFT EMBEDDING
> > > >   * U+202C POP DIRECTIONAL FORMATTING
> > > >   * U+202D LEFT-TO-RIGHT OVERRIDE
> > > >   * U+202E RIGHT-TO-LEFT OVERRIDE
> > > >   * U+2066 LEFT-TO-RIGHT ISOLATE
> > > >   * U+2067 RIGHT-TO-LEFT ISOLATE
> > > >   * U+2068 FIRST STRONG ISOLATE
> > > >   * U+2069 POP DIRECTIONAL ISOLATE
> > > > 
> > > > However, the following characters could also be considered as
> > > > "bidirectional control characters":
> > > >   * U+200E ‎LEFT-TO-RIGHT MARK (UTF-8: E2 80 8E)
> > > >   * U+200F ‎RIGHT-TO-LEFT MARK (UTF-8: E2 80 8F)
> > > > but aren't checked for in the patch.  Should they be?  I can
> > > > imagine
> > > > ways in which they could be abused, so I think so.
> > > 
> > > I'd only intended to check the bidi chars described in the original
> > > trojan source pdf, but I added checking for U+200E/U+200F too,
> > > since
> > > it was easy enough.  AFAIK they aren't popped by a PDF/PDI like the
> > > rest, so don't need to go on the vec, and so we only warn with
> > > =any.
> > > Tests: Wbidi-chars-16.c + Wbidi-chars-17.c
> > 
> > Thanks.  I took a look through the revised patch and I think you
> > updated things correctly.
> > 
> > [...snip...]
> > 
> > > > > diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars-4.c
> > > > > b/gcc/testsuite/c-c++-common/Wbidi-chars-4.c
> > > > > new file mode 100644
> > > > > index 000..9fd4bc535ca
> > > > > --- /dev/null
> > > > > +++ b/gcc/testsuite/c-c++-common/Wbidi-chars-4.c
> > > > > @@ -0,0 +1,166 @@
> > > > > +/* PR preprocessor/103026 */
> > > > > +/* { dg-do compile } */
> > > > > +/* { dg-options "-Wbidi-chars=any -Wno-multichar -Wno-
> > > > > overflow" } */
> > > > > +/* Test all bidi chars in various contexts (identifiers,
> > > > > comments,
> > > > > +   string literals, character constants), both UCN and UTF-8. 
> > > > > The bidi
> > > > > +   chars here are properly terminated, except for the
> > > > > character constants.  */
> > > > > +
> > > > > +/* a b c LRE‪ 1 2 3 PDF‬ x y z */
> > > > > +/* { dg-warning "U\\+202A" "" { target *-*-* } .-1 } */
> > > > > +/* a b c RLE‫ 1 2 3 PDF‬ x y z */
> > > > > +/* { dg-warning "U\\+202B" "" { target *-*-* } .-1 } */
> > > > > +/* a b c LRO‭ 1 2 3 PDF‬ x y z */
> > > > > +/* { dg-warning "U\\+202D" "" { target *-*-* } .-1 } */
> > > > > +/* a b c RLO‮ 1 2 3 PDF‬ x y z */
> > > > > +/* { dg-warning "U\\+202E" "" { target *-*-* } .-1 } */
> > > > > +/* a b c LRI⁦ 1 2 3 PDI⁩ x y z */
> > > > > +/* { dg-warning "U\\+2066" 

Re: [PATCH] regrename: Skip renaming if instruction is noop move.

2021-11-16 Thread Jojo R via Gcc-patches


— Jojo
On November 16, 2021 at 8:12 PM +0800, Richard Biener wrote:
> On Tue, Nov 16, 2021 at 12:45 PM Jojo R via Gcc-patches
>  wrote:
> >
> > Skip renaming if instruction is noop move, and it will
> > be removed for performance.
>
> Is there any (target specific) testcase you can add? Such commits are
> problematic
> when later bisected to since the intent isn't clear.

I filed an issue in Bugzilla; please check it, thanks.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103296
>
> > gcc/
> > * regrename.c (find_rename_reg): Return satisfied regno
> > if instruction is noop move.
> > ---
> > gcc/regrename.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/gcc/regrename.c b/gcc/regrename.c
> > index b8a9ca36f22..cb605f5176b 100644
> > --- a/gcc/regrename.c
> > +++ b/gcc/regrename.c
> > @@ -394,6 +394,9 @@ find_rename_reg (du_head_p this_head, enum reg_class 
> > super_class,
> > this_head, *unavailable))
> > return this_head->tied_chain->regno;
> >
> > + if (noop_move_p (this_head->first->insn))
> > + return best_new_reg;
> > +
> > /* If PREFERRED_CLASS is not NO_REGS, we iterate in the first pass
> > over registers that belong to PREFERRED_CLASS and try to find the
> > best register within the class. If that failed, we iterate in
> > --
> > 2.24.3 (Apple Git-128)


[committed] analyzer: fix missing -Wanalyzer-write-to-const [PR102695]

2021-11-16 Thread David Malcolm via Gcc-patches
This patch fixes -Wanalyzer-write-to-const so that it will complain
about attempts to write to functions and to labels.
It also "teaches" the analyzer about strchr, in that strchr can either
return a pointer into the input area (and thus -Wanalyzer-write-to-const
can now complain about writes into a string literal seen this way),
or return NULL (and thus the analyzer can complain about NULL
dereferences if the result is used without a check).

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-5330-g111fd515f2894d7cddf62f80c69765c43ae18577.
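
To illustrate the two outcomes of strchr described above, here is a minimal
sketch in C.  It is not taken from the new tests; the function names
prefix_len and cut_at_colon are made up for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* Returns the length of the text before the first ':' in STR,
   or -1 if STR contains no ':'.  (Hypothetical helper.)  */
long
prefix_len (const char *str)
{
  const char *colon = strchr (str, ':');
  /* Without this check, the "strchr returns NULL" path would reach
     the pointer subtraction below with a null pointer.  */
  if (!colon)
    return -1;
  return (long) (colon - str);
}

/* Truncates BUF at the first ':' and returns 1, or returns 0 if
   there is none.  BUF must be writable: the result of strchr points
   into BUF itself, so passing a string literal here would be a write
   into constant data.  (Hypothetical helper.)  */
int
cut_at_colon (char *buf)
{
  char *colon = strchr (buf, ':');
  if (!colon)
    return 0;
  *colon = '\0';
  return 1;
}
```

Calling cut_at_colon on a char array is fine; calling it on a string
literal is the sort of code the description above says the analyzer can
now complain about.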

gcc/analyzer/ChangeLog:
PR analyzer/102695
* region-model-impl-calls.cc (region_model::impl_call_strchr): New.
* region-model-manager.cc
(region_model_manager::maybe_fold_unaryop): Simplify cast to
pointer type of an existing pointer to a region.
* region-model.cc (region_model::on_call_pre): Handle
BUILT_IN_STRCHR and "strchr".
(write_to_const_diagnostic::emit): Add auto_diagnostic_group.  Add
alternate wordings for functions and labels.
(write_to_const_diagnostic::describe_final_event): Add alternate
wordings for functions and labels.
(region_model::check_for_writable_region): Handle RK_FUNCTION and
RK_LABEL.
* region-model.h (region_model::impl_call_strchr): New decl.

gcc/testsuite/ChangeLog:
PR analyzer/102695
* gcc.dg/analyzer/pr102695.c: New test.
* gcc.dg/analyzer/strchr-1.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/region-model-impl-calls.cc  | 69 
 gcc/analyzer/region-model-manager.cc |  7 +++
 gcc/analyzer/region-model.cc | 52 --
 gcc/analyzer/region-model.h  |  1 +
 gcc/testsuite/gcc.dg/analyzer/pr102695.c | 44 +++
 gcc/testsuite/gcc.dg/analyzer/strchr-1.c | 26 +
 6 files changed, 196 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr102695.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/strchr-1.c

diff --git a/gcc/analyzer/region-model-impl-calls.cc 
b/gcc/analyzer/region-model-impl-calls.cc
index 90d4cf9c2db..ae50e69542e 100644
--- a/gcc/analyzer/region-model-impl-calls.cc
+++ b/gcc/analyzer/region-model-impl-calls.cc
@@ -678,6 +678,75 @@ region_model::impl_call_realloc (const call_details &cd)
 }
 }
 
+/* Handle the on_call_pre part of "strchr" and "__builtin_strchr".  */
+
+void
+region_model::impl_call_strchr (const call_details &cd)
+{
+  class strchr_call_info : public call_info
+  {
+  public:
+strchr_call_info (const call_details &cd, bool found)
+: call_info (cd), m_found (found)
+{
+}
+
+label_text get_desc (bool can_colorize) const FINAL OVERRIDE
+{
+  if (m_found)
+   return make_label_text (can_colorize,
+   "when %qE returns non-NULL",
+   get_fndecl ());
+  else
+   return make_label_text (can_colorize,
+   "when %qE returns NULL",
+   get_fndecl ());
+}
+
+bool update_model (region_model *model,
+  const exploded_edge *,
+  region_model_context *ctxt) const FINAL OVERRIDE
+{
+  const call_details cd (get_call_details (model, ctxt));
+  if (tree lhs_type = cd.get_lhs_type ())
+   {
+ region_model_manager *mgr = model->get_manager ();
+ const svalue *result;
+ if (m_found)
+   {
+ const svalue *str_sval = cd.get_arg_svalue (0);
+ const region *str_reg
+   = model->deref_rvalue (str_sval, cd.get_arg_tree (0),
+  cd.get_ctxt ());
+ /* We want str_sval + OFFSET for some unknown OFFSET.
+Use a conjured_svalue to represent the offset,
+using the str_reg as the id of the conjured_svalue.  */
+ const svalue *offset
+   = mgr->get_or_create_conjured_svalue (size_type_node,
+ cd.get_call_stmt (),
+ str_reg);
+ result = mgr->get_or_create_binop (lhs_type, POINTER_PLUS_EXPR,
+str_sval, offset);
+   }
+ else
+   result = mgr->get_or_create_int_cst (lhs_type, 0);
+ cd.maybe_set_lhs (result);
+   }
+  return true;
+}
+  private:
+bool m_found;
+  };
+
+  /* Bifurcate state, creating a "not found" out-edge.  */
+  if (cd.get_ctxt ())
+cd.get_ctxt ()->bifurcate (new strchr_call_info (cd, false));
+
+  /* The "unbifurcated" state is the "found" case.  */
+  strchr_call_info found (cd, true);
+  found.update_model (this, NULL, cd.get_ctxt ());
+}
+
 /* Handle the on_call_pre part of "strcpy" and "__builtin_strcpy_chk".  */
 
 void
diff --git 

[committed] analyzer: don't assume target has alloca [PR102779]

2021-11-16 Thread David Malcolm via Gcc-patches
Successfully regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-5329-ga80d4e098b10d5cd161f55e4fce64a6be9683ed3.

gcc/testsuite/ChangeLog:
PR analyzer/102779
* gcc.dg/analyzer/capacity-1.c: Add dg-require-effective-target
alloca.  Use __builtin_alloca rather than alloca.
* gcc.dg/analyzer/capacity-3.c: Likewise.

Signed-off-by: David Malcolm 
---
 gcc/testsuite/gcc.dg/analyzer/capacity-1.c | 4 +++-
 gcc/testsuite/gcc.dg/analyzer/capacity-3.c | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/analyzer/capacity-1.c 
b/gcc/testsuite/gcc.dg/analyzer/capacity-1.c
index 9ea41f72e1d..2d124833296 100644
--- a/gcc/testsuite/gcc.dg/analyzer/capacity-1.c
+++ b/gcc/testsuite/gcc.dg/analyzer/capacity-1.c
@@ -1,3 +1,5 @@
+/* { dg-require-effective-target alloca } */
+
 #include 
 #include "analyzer-decls.h"
 
@@ -53,7 +55,7 @@ test_malloc (void)
 void
 test_alloca (size_t sz)
 {
-  void *p = alloca (sz);
+  void *p = __builtin_alloca (sz);
   __analyzer_dump_capacity (p); /* { dg-warning "capacity: 
'INIT_VAL\\(sz_\[^\n\r\]*\\)'" } */
 }
 
diff --git a/gcc/testsuite/gcc.dg/analyzer/capacity-3.c 
b/gcc/testsuite/gcc.dg/analyzer/capacity-3.c
index 41e282cee92..c099ff5725d 100644
--- a/gcc/testsuite/gcc.dg/analyzer/capacity-3.c
+++ b/gcc/testsuite/gcc.dg/analyzer/capacity-3.c
@@ -1,10 +1,12 @@
+/* { dg-require-effective-target alloca } */
+
 #include 
 #include "analyzer-decls.h"
 
 static void __attribute__((noinline))
 __analyzer_callee_1 (size_t inner_sz)
 {
-  void *p = alloca (inner_sz);
+  void *p = __builtin_alloca (inner_sz);
   __analyzer_dump_capacity (p); /* { dg-warning "capacity: 
'INIT_VAL\\(outer_sz_\[^\n\r\]*\\)'" } */
 }
 
-- 
2.26.3



Re: [PATCH, rs6000] Optimization for vec_xl_sext

2021-11-16 Thread HAO CHEN GUI via Gcc-patches
Bill,

    Sorry, I mixed up the patches. There is one vec_reve patch which hasn't 
gotten approval for a long time. I will re-send it.  Thanks a lot.

On 16/11/2021 9:10 PM, Bill Schmidt wrote:
> Hi Hao Chen,
>
> I don't understand.  This patch was already approved and you committed it. 
> :-)  I know
> because I needed to make corresponding adjustments to the new builtins code.
>
> Thanks,
> Bill
>
> On 11/15/21 8:16 PM, HAO CHEN GUI wrote:
>> Hi,
>>
>>    The patch optimizes the code generation for vec_xl_sext builtin. Now all 
>> the sign extensions are done on VSX registers directly.
>>
>>    Bootstrapped and tested on powerpc64le-linux with no regressions. Is this 
>> okay for trunk? Any recommendations? Thanks a lot.
>>
>> ChangeLog
>>
>> 2021-11-16 Haochen Gui 
>>
>> gcc/
>>     * config/rs6000/rs6000-call.c (altivec_expand_lxvr_builtin): Modify
>>     the expansion for sign extension. All extensions are done on VSX
>>     registers.
>>
>> gcc/testsuite/
>>     * gcc.target/powerpc/p10_vec_xl_sext.c: New test.
>>
>> patch.diff
>>
>> diff --git a/gcc/config/rs6000/rs6000-call.c 
>> b/gcc/config/rs6000/rs6000-call.c
>> index b4e13af4dc6..587e9fa2a2a 100644
>> --- a/gcc/config/rs6000/rs6000-call.c
>> +++ b/gcc/config/rs6000/rs6000-call.c
>> @@ -9779,7 +9779,7 @@ altivec_expand_lxvr_builtin (enum insn_code icode, 
>> tree exp, rtx target, bool bl
>>
>>    if (sign_extend)
>>  {
>> -  rtx discratch = gen_reg_rtx (DImode);
>> +  rtx discratch = gen_reg_rtx (V2DImode);
>>    rtx tiscratch = gen_reg_rtx (TImode);
>>
>>    /* Emit the lxvr*x insn.  */
>> @@ -9788,20 +9788,31 @@ altivec_expand_lxvr_builtin (enum insn_code icode, 
>> tree exp, rtx target, bool bl
>>     return 0;
>>    emit_insn (pat);
>>
>> -  /* Emit a sign extension from QI,HI,WI to double (DI).  */
>> -  rtx scratch = gen_lowpart (smode, tiscratch);
>> +  /* Emit a sign extension from V16QI,V8HI,V4SI to V2DI.  */
>> +  rtx temp1, temp2;
>>    if (icode == CODE_FOR_vsx_lxvrbx)
>> -   emit_insn (gen_extendqidi2 (discratch, scratch));
>> +   {
>> + temp1  = simplify_gen_subreg (V16QImode, tiscratch, TImode, 0);
>> + emit_insn (gen_vsx_sign_extend_qi_v2di (discratch, temp1));
>> +   }
>>    else if (icode == CODE_FOR_vsx_lxvrhx)
>> -   emit_insn (gen_extendhidi2 (discratch, scratch));
>> +   {
>> + temp1  = simplify_gen_subreg (V8HImode, tiscratch, TImode, 0);
>> + emit_insn (gen_vsx_sign_extend_hi_v2di (discratch, temp1));
>> +   }
>>    else if (icode == CODE_FOR_vsx_lxvrwx)
>> -   emit_insn (gen_extendsidi2 (discratch, scratch));
>> -  /*  Assign discratch directly if scratch is already DI.  */
>> -  if (icode == CODE_FOR_vsx_lxvrdx)
>> -   discratch = scratch;
>> +   {
>> + temp1  = simplify_gen_subreg (V4SImode, tiscratch, TImode, 0);
>> + emit_insn (gen_vsx_sign_extend_si_v2di (discratch, temp1));
>> +   }
>> +  else if (icode == CODE_FOR_vsx_lxvrdx)
>> +   discratch = simplify_gen_subreg (V2DImode, tiscratch, TImode, 0);
>> +  else
>> +   gcc_unreachable ();
>>
>> -  /* Emit the sign extension from DI (double) to TI (quad).  */
>> -  emit_insn (gen_extendditi2 (target, discratch));
>> +  /* Emit the sign extension from V2DI (double) to TI (quad).  */
>> +  temp2 = simplify_gen_subreg (TImode, discratch, V2DImode, 0);
>> +  emit_insn (gen_extendditi2_vector (target, temp2));
>>
>>    return target;
>>  }
>> diff --git a/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c 
>> b/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c
>> new file mode 100644
>> index 000..78e72ac5425
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c
>> @@ -0,0 +1,35 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target int128 } */
>> +/* { dg-require-effective-target power10_ok } */
>> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
>> +
>> +#include 
>> +
>> +vector signed __int128
>> +foo1 (signed long a, signed char *b)
>> +{
>> +  return vec_xl_sext (a, b);
>> +}
>> +
>> +vector signed __int128
>> +foo2 (signed long a, signed short *b)
>> +{
>> +  return vec_xl_sext (a, b);
>> +}
>> +
>> +vector signed __int128
>> +foo3 (signed long a, signed int *b)
>> +{
>> +  return vec_xl_sext (a, b);
>> +}
>> +
>> +vector signed __int128
>> +foo4 (signed long a, signed long *b)
>> +{
>> +  return vec_xl_sext (a, b);
>> +}
>> +
>> +/* { dg-final { scan-assembler-times {\mvextsd2q\M} 4 } } */
>> +/* { dg-final { scan-assembler-times {\mvextsb2d\M} 1 } } */
>> +/* { dg-final { scan-assembler-times {\mvextsh2d\M} 1 } } */
>> +/* { dg-final { scan-assembler-times {\mvextsw2d\M} 1 } } */
>>


Re: [PATCH] Fix spelling of ones' complement.

2021-11-16 Thread Koning, Paul via Gcc-patches



> On Nov 16, 2021, at 4:19 PM, Marek Polacek via Gcc-patches 
>  wrote:
> 
> On Tue, Nov 16, 2021 at 01:09:15PM -0800, Mike Stump via Gcc-patches wrote:
>> On Nov 15, 2021, at 5:48 PM, Marek Polacek via Gcc-patches 
>>  wrote:
>>> 
>>> Nitpicking time.  It's spelled "ones' complement" rather than "one's
>>> complement".  I didn't go into config/.
>>> 
>>> Ok for trunk?
>> 
>> So, is it two's complement or twos' complement then?  Seems like it should 
>> be the same, but  wikipedia suggests it is two's complement, as does google. 
>>  If that is wrong, you should go edit it as well.  :-)
> 
> It is "two's complement":
> https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584543.html
> but Knuth also continues to say that there's "twos' complement notation",
> which "has radix 3 and complementation with respect to (2...22)_3."
> 
> 
> It's not lost on me how inconsequential this patch is; I'm happy to just
> drop it and let the copy editor in me sleep.
> 
> Marek

To me it isn't so much a question of copy editing, but rather the fact that 
there clearly are two spellings, and if anything the one in the current text is 
the common one and the Knuth one found less often (perhaps only in Knuth).  My 
answer is to go fix Wikipedia, if possible.

paul




Re: [PATCH] restore ancient -Waddress for weak symbols [PR33925]

2021-11-16 Thread Martin Sebor via Gcc-patches

On 11/16/21 1:23 PM, Jason Merrill wrote:

On 10/23/21 19:06, Martin Sebor wrote:

On 10/4/21 3:37 PM, Jason Merrill wrote:

On 10/4/21 14:42, Martin Sebor wrote:

While resolving the recent -Waddress enhancement request
(PR102103) I came across a 2007 problem report about GCC 4 having
stopped warning for using the address of inline functions in
equality comparisons with null.  With inline functions being
commonplace in C++ this seems like an important use case for
the warning.

The change that resulted in suppressing the warning in these
cases was introduced inadvertently in a fix for PR 22252.

To restore the warning, the attached patch enhances
the decl_with_nonnull_addr_p() function to return true also for
weak symbols for which a definition has been provided.
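
A minimal sketch of the two cases discussed above, assuming GCC-style
attribute syntax.  The names twice, hook, call_twice and call_hook are
invented for this illustration and do not come from the patch or its tests.

```c
/* An inline function always has a definition, so its address can
   never be null.  */
static inline int
twice (int x)
{
  return 2 * x;
}

/* A weak symbol that is also defined here: the definition guarantees
   a non-null address even though the symbol is weak.  */
__attribute__ ((weak)) int
hook (int x)
{
  return x + 1;
}

int
call_twice (int x)
{
  /* Always true; the kind of comparison -Waddress is meant to flag.  */
  if (twice != 0)
    return twice (x);
  return x;
}

int
call_hook (int x)
{
  /* Also always true, because hook is defined above.  */
  if (hook)
    return hook (x);
  return x;
}
```

By contrast, a weak declaration without any definition may legitimately
resolve to a null address, so a null check on it is meaningful and should
stay quiet.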


I think you probably want to merge this function with 
fold-const.c:maybe_nonzero_address, which already handles more cases.


maybe_nonzero_address() doesn't behave quite like
decl_with_nonnull_addr_p() expects and I'm reluctant to muck
around with the former too much since it's used for codegen,
while the latter just for warnings.  (There is even a case
where the functions don't behave the same, and would result
in different warnings between C and C++ without some extra
help.)

So in the attached revision I just have maybe_nonzero_address()
call decl_with_nonnull_addr_p() and then refine the failing
(or uncertain) cases separately, with some overlap between
them.

Since I worked on this someone complained that some instances
of the warning newly enhanced under PR102103 aren't suppressed
in code resulting from macro expansion.  Since it's trivial,
I include the fix for that report in this patch as well.



+   allocated stroage might have a null address.  */


typo.

OK with that fixed.


After retesting the patch before committing I noticed it triggers
a regression in weak/weak-3.c that I missed the first time around.
Here's the test case:

extern void * ffoo1f (void);
void * foo1f (void)
{
  if (ffoo1f) /* { dg-warning "-Waddress" } */
ffoo1f ();
  return 0;
}

void * ffoox1f (void) { return (void *)0; }
extern void * ffoo1f (void)  __attribute__((weak, alias ("ffoox1f")));

The unexpected error is:

a.c: At top level:
a.c:1:15: error: ‘ffoo1f’ declared weak after being used
1 | extern void * ffoo1f (void);
  |   ^~

The error is caused by the new call to maybe_nonzero_address()
made from decl_with_nonnull_addr_p().  The call registers
the symbol as used.

So unless the error is desirable for this case I think it's
best to go back to the originally proposed solution.  I attach
it for reference and will plan to commit it tomorrow unless I
hear otherwise.

Martin

PS I don't know enough about the logic behind issuing this error
in other situations to tell for sure that it's wrong in this one,
but I see no difference in the emitted code between this case and
a case in the same test that declares the alias first, before taking
its address, and is accepted.  I also checked that both Clang and
ICC accept the code either way, so I'm inclined to think the error
would be a bug.
Restore ancient -Waddress for weak symbols [PR33925].

Resolves:
PR c/33925 - gcc -Waddress lost some useful warnings
PR c/102867 - -Waddress from macro expansion in readelf.c

gcc/c-family/ChangeLog:

	PR c++/33925
	PR c/102867
	* c-common.c (decl_with_nonnull_addr_p): Call maybe_nonzero_address
	and improve handling of defined symbols.

gcc/c/ChangeLog:

	PR c++/33925
	PR c/102867
	* c-typeck.c (maybe_warn_for_null_address): Suppress warnings for
	code resulting from macro expansion.

gcc/cp/ChangeLog:

	PR c++/33925
	PR c/102867
	* typeck.c (warn_for_null_address): Suppress warnings for code
	resulting from macro expansion.

gcc/ChangeLog:

	PR c++/33925
	PR c/102867
	* doc/invoke.texi (-Waddress): Update.

gcc/testsuite/ChangeLog:

	PR c++/33925
	PR c/102867
	* g++.dg/warn/Walways-true-2.C: Adjust to avoid a valid warning.
	* c-c++-common/Waddress-5.c: New test.
	* c-c++-common/Waddress-6.c: New test.
	* g++.dg/warn/Waddress-7.C: New test.
	* gcc.dg/weak/weak-3.c: Expect a warning.

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 436df45df68..5ab34c9eed8 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -3400,16 +3400,43 @@ c_wrap_maybe_const (tree expr, bool non_const)
 
 /* Return whether EXPR is a declaration whose address can never be NULL.
The address of the first struct member could be NULL only if it were
-   accessed through a NULL pointer, and such an access would be invalid.  */
+   accessed through a NULL pointer, and such an access would be invalid.
+   The address of a weak symbol may be null unless it has a definition.  */
 
 bool
 decl_with_nonnull_addr_p (const_tree expr)
 {
-  return (DECL_P (expr)
-	  && (TREE_CODE (expr) == FIELD_DECL
-	  || TREE_CODE (expr) == PARM_DECL
-	  || TREE_CODE (expr) == 

Fix optimization difference caused by -fdump-ipa-inline

2021-11-16 Thread Jan Hubicka via Gcc-patches
Hi,
This patch fixes a bug that caused some optimizations to be dropped with
-fdump-ipa-inline.

gcc/ChangeLog:

2021-11-17  Jan Hubicka  

PR ipa/103246
* ipa-modref.c (ipa_merge_modref_summary_after_inlining): Fix clearing
of to_info_lto.

diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index a70575bc807..90cd1be764c 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -5123,6 +5123,7 @@ ipa_merge_modref_summary_after_inlining (cgraph_edge 
*edge)
fprintf (dump_file, "Removed mod-ref summary for %s\n",
 to->dump_name ());
  summaries_lto->remove (to);
+ to_info_lto = NULL;
}
   else if (to_info_lto && dump_file)
{
@@ -5130,7 +5131,6 @@ ipa_merge_modref_summary_after_inlining (cgraph_edge 
*edge)
fprintf (dump_file, "Updated mod-ref summary for %s\n",
 to->dump_name ());
  to_info_lto->dump (dump_file);
- to_info_lto = NULL;
}
   if (callee_info_lto)
summaries_lto->remove (edge->callee);


[PATCH v3] libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]

2021-11-16 Thread Marek Polacek via Gcc-patches
On Tue, Nov 16, 2021 at 06:00:58PM -0500, David Malcolm wrote:
> > On Mon, Nov 15, 2021 at 06:15:40PM -0500, David Malcolm wrote:
> > > > On Mon, Nov 08, 2021 at 04:33:43PM -0500, Marek Polacek wrote:
> > > > > Ping, can we conclude on the name?   IMHO, -Wbidirectional is just 
> > > > > fine,
> > > > > but changing the name is a trivial operation. 
> > > > 
> > > > Here's a patch with a better name (suggested by Jonathan W.).  
> > > > Otherwise no
> > > > changes.
> > > 
> > > Thanks for implementing this.
> > > 
> > > > 
> > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > > > 
> > > > -- >8 --
> > > > From a link below:
> > > > "An issue was discovered in the Bidirectional Algorithm in the Unicode
> > > > Specification through 14.0. It permits the visual reordering of
> > > > characters via control sequences, which can be used to craft source code
> > > > that renders different logic than the logical ordering of tokens
> > > > ingested by compilers and interpreters. Adversaries can leverage this to
> > > > encode source code for compilers accepting Unicode such that targeted
> > > > vulnerabilities are introduced invisibly to human reviewers."
> > > > 
> > > > More info:
> > > > https://nvd.nist.gov/vuln/detail/CVE-2021-42574
> > > > https://trojansource.codes/
> > > > 
> > > > This is not a compiler bug.  However, to mitigate the problem, this 
> > > > patch
> > > > implements -Wbidi-chars=[none|unpaired|any] to warn about possibly
> > > > misleading Unicode bidirectional characters the preprocessor may 
> > > > encounter.
> 
> [...snip...]
> 
> > > 
> > > Terminology nit:
> > > The patch is referring to "bidirectional characters", but I think the
> > > term "bidirectional control characters" would be better.
> > 
> > Adjusted.
> 
> Thanks.
> 
> I wonder if the warning should be -Wbidi-control-chars, but I don't
> care enough to insist on it being changed.
> 
> >  
> > > For example, a passage of text containing both numbers and characters
> > > in a right-to-left script could be considered "bidirectional", since
> > > the numbers are written from left-to-right.
> > > 
> > > Specifically, the patch looks for these specific characters:
> > >   * U+202A LEFT-TO-RIGHT EMBEDDING
> > >   * U+202B RIGHT-TO-LEFT EMBEDDING
> > >   * U+202C POP DIRECTIONAL FORMATTING
> > >   * U+202D LEFT-TO-RIGHT OVERRIDE
> > >   * U+202E RIGHT-TO-LEFT OVERRIDE
> > >   * U+2066 LEFT-TO-RIGHT ISOLATE
> > >   * U+2067 RIGHT-TO-LEFT ISOLATE
> > >   * U+2068 FIRST STRONG ISOLATE
> > >   * U+2069 POP DIRECTIONAL ISOLATE
> > > 
> > > However, the following characters could also be considered as
> > > "bidirectional control characters":
> > >   * U+200E ‎LEFT-TO-RIGHT MARK (UTF-8: E2 80 8E)
> > >   * U+200F ‎RIGHT-TO-LEFT MARK (UTF-8: E2 80 8F)
> > > but aren't checked for in the patch.  Should they be?  I can imagine
> > > ways in which they could be abused, so I think so.
> > 
> > I'd only intended to check the bidi chars described in the original
> > trojan source pdf, but I added checking for U+200E/U+200F too, since
> > it was easy enough.  AFAIK they aren't popped by a PDF/PDI like the
> > rest, so don't need to go on the vec, and so we only warn with =any.
> > Tests: Wbidi-chars-16.c + Wbidi-chars-17.c
> 
> Thanks.  I took a look through the revised patch and I think you
> updated things correctly.
> 
> [...snip...]
> 
> > > > diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars-4.c 
> > > > b/gcc/testsuite/c-c++-common/Wbidi-chars-4.c
> > > > new file mode 100644
> > > > index 000..9fd4bc535ca
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/c-c++-common/Wbidi-chars-4.c
> > > > @@ -0,0 +1,166 @@
> > > > +/* PR preprocessor/103026 */
> > > > +/* { dg-do compile } */
> > > > +/* { dg-options "-Wbidi-chars=any -Wno-multichar -Wno-overflow" } */
> > > > +/* Test all bidi chars in various contexts (identifiers, comments,
> > > > +   string literals, character constants), both UCN and UTF-8.  The bidi
> > > > +   chars here are properly terminated, except for the character 
> > > > constants.  */
> > > > +
> > > > +/* a b c LRE‪ 1 2 3 PDF‬ x y z */
> > > > +/* { dg-warning "U\\+202A" "" { target *-*-* } .-1 } */
> > > > +/* a b c RLE‫ 1 2 3 PDF‬ x y z */
> > > > +/* { dg-warning "U\\+202B" "" { target *-*-* } .-1 } */
> > > > +/* a b c LRO‭ 1 2 3 PDF‬ x y z */
> > > > +/* { dg-warning "U\\+202D" "" { target *-*-* } .-1 } */
> > > > +/* a b c RLO‮ 1 2 3 PDF‬ x y z */
> > > > +/* { dg-warning "U\\+202E" "" { target *-*-* } .-1 } */
> > > > +/* a b c LRI⁦ 1 2 3 PDI⁩ x y z */
> > > > +/* { dg-warning "U\\+2066" "" { target *-*-* } .-1 } */
> > > > +/* a b c RLI⁧ 1 2 3 PDI⁩ x y */
> > > > +/* { dg-warning "U\\+2067" "" { target *-*-* } .-1 } */
> > > > +/* a b c FSI⁨ 1 2 3 PDI⁩ x y z */
> > > > +/* { dg-warning "U\\+2068" "" { target *-*-* } .-1 } */
> > > 
> > > AIUI the Unicode bidirectionality algorithm works at the line level,
> > > and so each line in a block comment should be checked individually for

[PATCH] handle folded nonconstant array bounds [PR101702]

2021-11-16 Thread Martin Sebor via Gcc-patches

-Warray-parameter and -Wvla-parameter assume that array bounds
in function parameters are either constant integers or variable,
but not something in between like a cast of a constant that's
not recognized as an INTEGER_CST until we strip the cast from
it.  This leads to an ICE as the internal checks fail.

The attached patch fixes the problem by stripping the casts
earlier than before, preventing the inconsistency.  In addition,
it also folds the array bound, avoiding a class of false
positives and negatives that not doing so would lead to otherwise.

Tested on x86_64-linux.

Martin
Handle folded nonconstant array bounds [PR101702]

PR c/101702 - ICE: in handle_argspec_attribute, at c-family/c-attribs.c:3623

gcc/c/ChangeLog:

	PR c/101702
	* c-decl.c (get_parm_array_spec): Strip casts earlier and fold array
	bounds before deciding if they're constant.

gcc/testsuite/ChangeLog:

	PR c/101702
	* gcc.dg/Warray-parameter-11.c: New test.

diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 186fa1692c1..63d806a84c9 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -5866,6 +5866,12 @@ get_parm_array_spec (const struct c_parm *parm, tree attrs)
   if (pd->u.array.static_p)
 	spec += 's';
 
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (nelts)))
+	/* Avoid invalid NELTS.  */
+	return attrs;
+
+  STRIP_NOPS (nelts);
+  nelts = c_fully_fold (nelts, false, nullptr);
   if (TREE_CODE (nelts) == INTEGER_CST)
 	{
 	  /* Skip all constant bounds except the most significant one.
@@ -5883,13 +5889,9 @@ get_parm_array_spec (const struct c_parm *parm, tree attrs)
 	  spec += buf;
 	  break;
 	}
-  else if (!INTEGRAL_TYPE_P (TREE_TYPE (nelts)))
-	/* Avoid invalid NELTS.  */
-	return attrs;
 
   /* Each variable VLA bound is represented by a dollar sign.  */
   spec += "$";
-  STRIP_NOPS (nelts);
   vbchain = tree_cons (NULL_TREE, nelts, vbchain);
 }
 
diff --git a/gcc/testsuite/gcc.dg/Warray-parameter-11.c b/gcc/testsuite/gcc.dg/Warray-parameter-11.c
new file mode 100644
index 000..8ca1b55bd28
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Warray-parameter-11.c
@@ -0,0 +1,24 @@
+/* PR c/101702 - ICE on invalid function redeclaration
+   { dg-do compile }
+   { dg-options "-Wall" } */
+
+typedef __INTPTR_TYPE__ intptr_t;
+
+#define copysign(x, y) __builtin_copysign (x, y)
+
+void f0 (double[!copysign (~2, 3)]);
+
+void f1 (double[!copysign (~2, 3)]);
+void f1 (double[1]);// { dg-warning "-Warray-parameter" }
+
+void f2 (int[(int)+1.0]);
+void f2 (int[(int)+1.1]);
+
+/* Also verify that equivalent expressions don't needlessly cause false
+   positives or negatives.  */
+struct S { int a[1]; };
+extern struct S *sp;
+
+void f3 (int[(intptr_t)((char*)sp->a - (char*)sp)]);
+void f3 (int[(intptr_t)((char*)&sp->a[0] - (char*)sp)]);
+void f3 (int[(intptr_t)((char*)&sp->a[1] - (char*)sp)]);   // { dg-warning "-Warray-parameter" }


Re: [PATCH] Fix spelling of ones' complement.

2021-11-16 Thread Eric Botcazou
> Nitpicking time.  It's spelled "ones' complement" rather than "one's
> complement".  I didn't go into config/.
> 
> Ok for trunk?
> 
> gcc/ChangeLog:
> 
>   * doc/implement-c.texi: Fix spelling.
>   * doc/md.texi: Likewise.
>   * expmed.c (emit_store_flag_int): Likewise.
>   * optabs.c (expand_abs): Likewise.
>   (expand_one_cmpl_abs_nojump): Likewise.
>   * optabs.h (expand_abs): Likewise.
>   * tree-ssa-ccp.c (gimple_nop_atomic_bit_test_and_p): Likewise.

IMO either you change them all or you change none, any intermediate stage is 
worse than the current situation, which is probably OK for 99.99% of people.

-- 
Eric Botcazou




Re: [PATCH v2] libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]

2021-11-16 Thread David Malcolm via Gcc-patches
> On Mon, Nov 15, 2021 at 06:15:40PM -0500, David Malcolm wrote:
> > > On Mon, Nov 08, 2021 at 04:33:43PM -0500, Marek Polacek wrote:
> > > > Ping, can we conclude on the name?   IMHO, -Wbidirectional is just fine,
> > > > but changing the name is a trivial operation. 
> > > 
> > > Here's a patch with a better name (suggested by Jonathan W.).  Otherwise 
> > > no
> > > changes.
> > 
> > Thanks for implementing this.
> > 
> > > 
> > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > > 
> > > -- >8 --
> > > From a link below:
> > > "An issue was discovered in the Bidirectional Algorithm in the Unicode
> > > Specification through 14.0. It permits the visual reordering of
> > > characters via control sequences, which can be used to craft source code
> > > that renders different logic than the logical ordering of tokens
> > > ingested by compilers and interpreters. Adversaries can leverage this to
> > > encode source code for compilers accepting Unicode such that targeted
> > > vulnerabilities are introduced invisibly to human reviewers."
> > > 
> > > More info:
> > > https://nvd.nist.gov/vuln/detail/CVE-2021-42574
> > > https://trojansource.codes/
> > > 
> > > This is not a compiler bug.  However, to mitigate the problem, this patch
> > > implements -Wbidi-chars=[none|unpaired|any] to warn about possibly
> > > misleading Unicode bidirectional characters the preprocessor may 
> > > encounter.

[...snip...]

> > 
> > Terminology nit:
> > The patch is referring to "bidirectional characters", but I think the
> > term "bidirectional control characters" would be better.
> 
> Adjusted.

Thanks.

I wonder if the warning should be -Wbidi-control-chars, but I don't
care enough to insist on it being changed.

>  
> > For example, a passage of text containing both numbers and characters
> > in a right-to-left script could be considered "bidirectional", since
> > the numbers are written from left-to-right.
> > 
> > Specifically, the patch looks for these specific characters:
> >   * U+202A LEFT-TO-RIGHT EMBEDDING
> >   * U+202B RIGHT-TO-LEFT EMBEDDING
> >   * U+202C POP DIRECTIONAL FORMATTING
> >   * U+202D LEFT-TO-RIGHT OVERRIDE
> >   * U+202E RIGHT-TO-LEFT OVERRIDE
> >   * U+2066 LEFT-TO-RIGHT ISOLATE
> >   * U+2067 RIGHT-TO-LEFT ISOLATE
> >   * U+2068 FIRST STRONG ISOLATE
> >   * U+2069 POP DIRECTIONAL ISOLATE
> > 
> > However, the following characters could also be considered as
> > "bidirectional control characters":
> >   * U+200E ‎LEFT-TO-RIGHT MARK (UTF-8: E2 80 8E)
> >   * U+200F ‎RIGHT-TO-LEFT MARK (UTF-8: E2 80 8F)
> > but aren't checked for in the patch.  Should they be?  I can imagine
> > ways in which they could be abused, so I think so.
> 
> I'd only intended to check the bidi chars described in the original
> trojan source pdf, but I added checking for U+200E/U+200F too, since
> it was easy enough.  AFAIK they aren't popped by a PDF/PDI like the
> rest, so don't need to go on the vec, and so we only warn with =any.
> Tests: Wbidi-chars-16.c + Wbidi-chars-17.c

Thanks.  I took a look through the revised patch and I think you
updated things correctly.
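
For readers following along, here is a small standalone sketch of how the
code points listed above can be recognized in UTF-8 input.  It is not the
libcpp implementation; the function name first_bidi_control is invented
for this example.  It relies on the fact that all eleven code points lie
in U+2000..U+2FFF and therefore share the UTF-8 lead byte E2.

```c
#include <stddef.h>

/* Returns the code point of the first bidirectional control character
   in the NUL-terminated UTF-8 string S, or 0 if there is none.
   (Hypothetical helper, not part of libcpp.)  */
unsigned
first_bidi_control (const char *s)
{
  const unsigned char *p = (const unsigned char *) s;
  for (; *p; p++)
    {
      /* Every character of interest is a 3-byte sequence starting
	 with 0xE2; skip anything else.  The checks short-circuit, so
	 we never read past the terminating NUL.  */
      if (p[0] != 0xE2
	  || (p[1] & 0xC0) != 0x80
	  || (p[2] & 0xC0) != 0x80)
	continue;

      unsigned cp = ((p[0] & 0x0Fu) << 12)
		    | ((p[1] & 0x3Fu) << 6)
		    | (p[2] & 0x3Fu);

      if ((cp >= 0x200E && cp <= 0x200F)	/* LRM, RLM  */
	  || (cp >= 0x202A && cp <= 0x202E)	/* LRE, RLE, PDF, LRO, RLO  */
	  || (cp >= 0x2066 && cp <= 0x2069))	/* LRI, RLI, FSI, PDI  */
	return cp;
    }
  return 0;
}
```

This only detects the characters; deciding whether they are properly
paired, as the =unpaired mode has to, needs extra bookkeeping that this
sketch leaves out.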

[...snip...]

> > > diff --git a/gcc/testsuite/c-c++-common/Wbidi-chars-4.c 
> > > b/gcc/testsuite/c-c++-common/Wbidi-chars-4.c
> > > new file mode 100644
> > > index 000..9fd4bc535ca
> > > --- /dev/null
> > > +++ b/gcc/testsuite/c-c++-common/Wbidi-chars-4.c
> > > @@ -0,0 +1,166 @@
> > > +/* PR preprocessor/103026 */
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-Wbidi-chars=any -Wno-multichar -Wno-overflow" } */
> > > +/* Test all bidi chars in various contexts (identifiers, comments,
> > > +   string literals, character constants), both UCN and UTF-8.  The bidi
> > > +   chars here are properly terminated, except for the character 
> > > constants.  */
> > > +
> > > +/* a b c LRE‪ 1 2 3 PDF‬ x y z */
> > > +/* { dg-warning "U\\+202A" "" { target *-*-* } .-1 } */
> > > +/* a b c RLE‫ 1 2 3 PDF‬ x y z */
> > > +/* { dg-warning "U\\+202B" "" { target *-*-* } .-1 } */
> > > +/* a b c LRO‭ 1 2 3 PDF‬ x y z */
> > > +/* { dg-warning "U\\+202D" "" { target *-*-* } .-1 } */
> > > +/* a b c RLO‮ 1 2 3 PDF‬ x y z */
> > > +/* { dg-warning "U\\+202E" "" { target *-*-* } .-1 } */
> > > +/* a b c LRI⁦ 1 2 3 PDI⁩ x y z */
> > > +/* { dg-warning "U\\+2066" "" { target *-*-* } .-1 } */
> > > +/* a b c RLI⁧ 1 2 3 PDI⁩ x y */
> > > +/* { dg-warning "U\\+2067" "" { target *-*-* } .-1 } */
> > > +/* a b c FSI⁨ 1 2 3 PDI⁩ x y z */
> > > +/* { dg-warning "U\\+2068" "" { target *-*-* } .-1 } */
> > 
> > AIUI the Unicode bidirectionality algorithm works at the line level,
> > and so each line in a block comment should be checked individually for
> > unclosed bidi control chars, rather than a block comment as a whole. 
> > Hence I think the test case needs to have block comment test coverage
> > for:
> > - single line blocks
> > - first line of a multiline block comment
> > - middle line of a multiline block comment
> > - final line of a multiline 

[PATCH] PR tree-optimization/96779 Adding a missing pattern to match.pd

2021-11-16 Thread Navid Rahimi via Gcc-patches
Hi GCC community,

This patch will add the missed pattern described in bug 96779 [1] to
match.pd.

Tree-optimization/96779: Adding new optimization to match.pd:

* match.pd (-x == x) -> (x == 0): New optimization.
* gcc.dg/tree-ssa/pr96779.c: testcase for this optimization.
* gcc.dg/tree-ssa/pr96779-disabled.c: testcase for this 
optimization when -fwrapv passed.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96779

Best wishes,
Navid.

0001-tree-optimization-96779.patch
Description: 0001-tree-optimization-96779.patch
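
A note on why a separate -fwrapv testcase is plausible (this is an
inference, the message above does not spell it out): with wrapping
arithmetic, negation has a second fixed point besides zero, so
(-x == x) and (x == 0) are no longer equivalent.  The sketch below
checks this exhaustively on small types; the function names are
illustrative only.

```cpp
#include <cstdint>

// Counts the 8-bit values whose wrapping negation equals the value itself.
// Under wrapping semantics both 0 and 128 qualify.
int count_self_negating_u8()
{
  int n = 0;
  for (int v = 0; v < 256; ++v)
    {
      const auto x = static_cast<std::uint8_t>(v);
      const auto neg = static_cast<std::uint8_t>(0u - x);  // wraps modulo 256
      if (neg == x)
        ++n;
    }
  return n;
}

// When negating the minimum value is excluded (it overflows for a signed
// type without -fwrapv), -x == x holds exactly when x == 0.
bool fold_holds_without_minimum()
{
  for (int v = INT16_MIN + 1; v <= INT16_MAX; ++v)
    if ((-v == v) != (v == 0))
      return false;
  return true;
}
```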


[committed] libstdc++: Fix tests for constexpr std::string

2021-11-16 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux, pushed to trunk.


Some tests fail when run with -D_GLIBCXX_USE_CXX11_ABI=0 or -std=gnu++20.

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (operator<=>): Use constexpr
unconditionally.
* testsuite/21_strings/basic_string/modifiers/constexpr.cc:
Require cxx11-abi effective target.
* testsuite/21_strings/headers/string/synopsis.cc: Add
conditional constexpr to declarations, and adjust relational
operators for C++20.
---
 libstdc++-v3/include/bits/basic_string.h  |  6 ++--
 .../basic_string/modifiers/constexpr.cc   |  1 +
 .../21_strings/headers/string/synopsis.cc | 33 +--
 3 files changed, 33 insertions(+), 7 deletions(-)

diff --git a/libstdc++-v3/include/bits/basic_string.h 
b/libstdc++-v3/include/bits/basic_string.h
index b6945f1cdfb..0b7d6c0a981 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -3546,8 +3546,7 @@ _GLIBCXX_END_NAMESPACE_CXX11
*  greater than, or incomparable with `__rhs`.
*/
   template
-_GLIBCXX20_CONSTEXPR
-inline auto
+constexpr auto
 operator<=>(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
const basic_string<_CharT, _Traits, _Alloc>& __rhs) noexcept
 -> decltype(__detail::__char_traits_cmp_cat<_Traits>(0))
@@ -3561,8 +3560,7 @@ _GLIBCXX_END_NAMESPACE_CXX11
*  greater than, or incomparable with `__rhs`.
*/
   template
-_GLIBCXX20_CONSTEXPR
-inline auto
+constexpr auto
 operator<=>(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
const _CharT* __rhs) noexcept
 -> decltype(__detail::__char_traits_cmp_cat<_Traits>(0))
diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/constexpr.cc 
b/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/constexpr.cc
index c875a3a19ad..a4627714d9a 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/constexpr.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/constexpr.cc
@@ -1,5 +1,6 @@
 // { dg-options "-std=gnu++20" }
 // { dg-do compile { target c++20 } }
+// { dg-require-effective-target cxx11-abi }
 
 #include 
 #include 
diff --git a/libstdc++-v3/testsuite/21_strings/headers/string/synopsis.cc 
b/libstdc++-v3/testsuite/21_strings/headers/string/synopsis.cc
index f14c4ae831c..f12345ed426 100644
--- a/libstdc++-v3/testsuite/21_strings/headers/string/synopsis.cc
+++ b/libstdc++-v3/testsuite/21_strings/headers/string/synopsis.cc
@@ -26,6 +26,12 @@
 # define NOTHROW
 #endif
 
+#if __cplusplus >= 202002L
+# define CONSTEXPR constexpr
+#else
+# define CONSTEXPR
+#endif
+
 namespace std {
   //  lib.char.traits, character traits:
   template
@@ -40,33 +46,52 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 _GLIBCXX_END_NAMESPACE_CXX11
 
   template
+  CONSTEXPR
   basic_string
   operator+(const basic_string& lhs,
const basic_string& rhs);
   template
+  CONSTEXPR
   basic_string
   operator+(const charT* lhs,
const basic_string& rhs);
   template
+  CONSTEXPR
   basic_string
   operator+(charT lhs, const basic_string& rhs);
   template
+  CONSTEXPR
   basic_string
   operator+(const basic_string& lhs,
const charT* rhs);
   template
+  CONSTEXPR
   basic_string
   operator+(const basic_string& lhs, charT rhs);
 
   template
+  CONSTEXPR
   bool operator==(const basic_string& lhs,
  const basic_string& rhs) NOTHROW;
   template
-  bool operator==(const charT* lhs,
- const basic_string& rhs);
-  template
+  CONSTEXPR
   bool operator==(const basic_string& lhs,
  const charT* rhs);
+
+#if __cpp_lib_three_way_comparison
+  template
+  constexpr
+  bool operator<=>(const basic_string& lhs,
+  const basic_string& rhs) NOTHROW;
+  template
+  constexpr
+  bool operator<=>(const basic_string& lhs,
+  const charT* rhs);
+#else
+  template
+  CONSTEXPR
+  bool operator==(const charT* lhs,
+ const basic_string& rhs);
   template
   bool operator!=(const basic_string& lhs,
  const basic_string& rhs) NOTHROW;
@@ -114,9 +139,11 @@ _GLIBCXX_END_NAMESPACE_CXX11
   template
   bool operator>=(const charT* lhs,
  const basic_string& rhs);
+#endif
 
   //  lib.string.special:
   template
+  CONSTEXPR
   void swap(basic_string& lhs,
basic_string& rhs)
 #if __cplusplus >= 201103L
-- 
2.31.1



Re: [PATCH] Fix spelling of ones' complement.

2021-11-16 Thread Marek Polacek via Gcc-patches
On Tue, Nov 16, 2021 at 01:09:15PM -0800, Mike Stump via Gcc-patches wrote:
> On Nov 15, 2021, at 5:48 PM, Marek Polacek via Gcc-patches 
>  wrote:
> > 
> > Nitpicking time.  It's spelled "ones' complement" rather than "one's
> > complement".  I didn't go into config/.
> > 
> > Ok for trunk?
> 
> So, is it two's complement or twos' complement then?  Seems like it should be 
> the same, but  wikipedia suggests it is two's complement, as does google.  If 
> that is wrong, you should go edit it as well.  :-)
 
It is "two's complement":
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584543.html
but Knuth also continues to say that there's "twos' complement notation",
which "has radix 3 and complementation with respect to (2...22)_3."
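
To make the apostrophe placement concrete: ones' complement subtracts
from a string of ones, while two's complement subtracts from a single
power of two.  A minimal 8-bit sketch, with function names chosen here
purely for illustration:

```cpp
#include <cstdint>

// Ones' complement: complement with respect to (11111111)_2, i.e. 2^8 - 1.
std::uint8_t ones_complement(std::uint8_t x)
{
  return static_cast<std::uint8_t>(0xFFu - x);
}

// Two's complement: complement with respect to 2^8 (one power of two).
std::uint8_t twos_complement(std::uint8_t x)
{
  return static_cast<std::uint8_t>(0x100u - x);  // wraps to 0 for x == 0
}
```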


It's not lost on me how inconsequential this patch is; I'm happy to just
drop it and let the copy editor in me sleep.

Marek



Re: [PATCH] Fix spelling of ones' complement.

2021-11-16 Thread Mike Stump via Gcc-patches
On Nov 15, 2021, at 5:48 PM, Marek Polacek via Gcc-patches 
 wrote:
> 
> Nitpicking time.  It's spelled "ones' complement" rather than "one's
> complement".  I didn't go into config/.
> 
> Ok for trunk?

So, is it two's complement or twos' complement then?  Seems like it should be 
the same, but  wikipedia suggests it is two's complement, as does google.  If 
that is wrong, you should go edit it as well.  :-)


Re: [RFC] c++: Print function template parms when relevant (was: [PATCH v4] c++: Add gnu::diagnose_as attribute)

2021-11-16 Thread Matthias Kretz
On Tuesday, 16 November 2021 21:49:31 CET Jason Merrill wrote:
> On 11/16/21 15:42, Matthias Kretz wrote:
> > On Tuesday, 16 November 2021 21:25:33 CET Jason Merrill wrote:
> >> On 11/8/21 15:00, Matthias Kretz wrote:
> >>> I forgot to mention why I tagged it [RFC]: I needed one more bit of
> >>> information on the template args TREE_VEC to encode
> >>> EXPLICIT_TEMPLATE_ARGS_P. Its TREE_CHAIN already points to an integer
> >>> constant denoting the number of non-default arguments, so I couldn't
> >>> trivially replace that. Therefore, I used the sign of that integer. I
> >>> was
> >>> hoping to find a cleaner solution, though.
> >> 
> >> It seems that we aren't using any TREE_LANG_FLAG_n on TREE_VEC, so that
> >> would be a cleaner solution.
> > 
> > I tried that first but realized that TREE_VEC doesn't allow any
> > TREE_LANG_FLAGs (it uses those bits for the length IIRC). And setting the
> > TREE_LANG_FLAGs on the TREE_CHAIN of the TREE_VEC can't work either (since
> > the int constants are shared between many trees).
> > 
> > Should I maybe turn the TREE_CHAIN into a TREE_LIST using TREE_PURPOSE and
> > TREE_VALUE for EXPLICIT_TEMPLATE_ARGS_P and non-default arguments,
> > respectively? (And where would I document this?)
> 
> Maybe a TREE_LIST if there are explicit template arguments to a function
> template, where TREE_PURPOSE is the number of explicit arguments and
> TREE_VALUE is the number of non-default arguments.
> 
> I'd document it at the definition of NON_DEFAULT_TEMPLATE_ARGS_COUNT.
> The SET/GET macros should become functions.

Sounds good. I'll come up with a new patch ASAP.
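
As a toy model of the two encodings discussed in this thread (plain
integers and a struct stand in for the tree nodes; none of this is
GCC's actual API, and whether a zero count can coincide with explicit
arguments is an assumption):

```cpp
#include <cstdlib>

// Encoding 1: keep the non-default argument count and use its sign to
// mark "explicit template arguments were given".
int encode_signed(int non_default_count, bool explicit_p)
{
  return explicit_p ? -non_default_count : non_default_count;
}

int decoded_count(int stored) { return std::abs(stored); }
bool decoded_explicit_p(int stored) { return stored < 0; }

// Encoding 2: two separate fields, modelling the suggested TREE_LIST with
// TREE_PURPOSE (explicit argument count) and TREE_VALUE (non-default count).
struct ArgsInfo
{
  int explicit_count;
  int non_default_count;
};

bool explicit_p(const ArgsInfo &info) { return info.explicit_count > 0; }
```

In this model the sign trick cannot represent the flag when the count
is zero, while the two-field form keeps both values independent.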

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──


Re: [RFC] c++: Print function template parms when relevant (was: [PATCH v4] c++: Add gnu::diagnose_as attribute)

2021-11-16 Thread Jason Merrill via Gcc-patches

On 11/16/21 15:42, Matthias Kretz wrote:

On Tuesday, 16 November 2021 21:25:33 CET Jason Merrill wrote:

On 11/8/21 15:00, Matthias Kretz wrote:

I forgot to mention why I tagged it [RFC]: I needed one more bit of
information on the template args TREE_VEC to encode
EXPLICIT_TEMPLATE_ARGS_P. Its TREE_CHAIN already points to an integer
constant denoting the number of non-default arguments, so I couldn't
trivially replace that. Therefore, I used the sign of that integer. I was
hoping to find a cleaner solution, though.

It seems that we aren't using any TREE_LANG_FLAG_n on TREE_VEC, so that
would be a cleaner solution.


I tried that first but realized that TREE_VEC doesn't allow any
TREE_LANG_FLAGs (it uses those bits for the length IIRC). And setting the
TREE_LANG_FLAGs on the TREE_CHAIN of the TREE_VEC can't work either (since the
int constants are shared between many trees).

Should I maybe turn the TREE_CHAIN into a TREE_LIST using TREE_PURPOSE and
TREE_VALUE for EXPLICIT_TEMPLATE_ARGS_P and non-default arguments,
respectively? (And where would I document this?)


Maybe a TREE_LIST if there are explicit template arguments to a function 
template, where TREE_PURPOSE is the number of explicit arguments and 
TREE_VALUE is the number of non-default arguments.


I'd document it at the definition of NON_DEFAULT_TEMPLATE_ARGS_COUNT. 
The SET/GET macros should become functions.


Jason



Re: [RFC] c++: Print function template parms when relevant (was: [PATCH v4] c++: Add gnu::diagnose_as attribute)

2021-11-16 Thread Matthias Kretz
On Tuesday, 16 November 2021 21:25:33 CET Jason Merrill wrote:
> On 11/8/21 15:00, Matthias Kretz wrote:
> > I forgot to mention why I tagged it [RFC]: I needed one more bit of
> > information on the template args TREE_VEC to encode
> > EXPLICIT_TEMPLATE_ARGS_P. Its TREE_CHAIN already points to an integer
> > constant denoting the number of non-default arguments, so I couldn't
> > trivially replace that. Therefore, I used the sign of that integer. I was
> > hoping to find a cleaner solution, though.
> It seems that we aren't using any TREE_LANG_FLAG_n on TREE_VEC, so that
> would be a cleaner solution.

I tried that first but realized that TREE_VEC doesn't allow any 
TREE_LANG_FLAGs (it uses those bits for the length IIRC). And setting the 
TREE_LANG_FLAGs on the TREE_CHAIN of the TREE_VEC can't work either (since the 
int constants are shared between many trees).

Should I maybe turn the TREE_CHAIN into a TREE_LIST using TREE_PURPOSE and 
TREE_VALUE for EXPLICIT_TEMPLATE_ARGS_P and non-default arguments, 
respectively? (And where would I document this?)

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──


[PATCH v2] rs6000: Test case adjustments for new builtins

2021-11-16 Thread Bill Schmidt via Gcc-patches
Hi!  I recently submitted [1] to make adjustments to test cases for the new
builtins support, mostly due to error messages changing for consistency.
Thanks for the previous review.  I've reviewed the reasons for the changes and
removed unrelated changes as requested.

A couple of comments:

 - For fold-vec-splat-floatdouble.c and fold-vec-splat-longlong.c, the existing
   test cases have some bad tests in them (checking two bits when only one bit
   is meaningful).  The new builtin support catches this but the old support did
   not.  Removing those bad cases changes some of the scan-assembler-times
   expected values.
 - For int_128bit-runnable.c, I chose not to do gimple folding on the 128-bit
   comparison operations in the new implementation, because doing so results in
   bad code that splits things into two 64-bit values.  That needs separate
   attention; but the point here is, when I did that, I started generating
   more of the vcmpequq, vcmpgtsq, and vcmpgtuq instructions.

Everything else here is hopefully straightforward, and unchanged from the
previous submission.

Bootstrapped and tested on powerpc64le-linux-gnu, and on powerpc64-linux-gnu
with -m32 and -m64.  Is this okay for trunk?

Thanks!
Bill

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578615.html


2021-11-15  Bill Schmidt  

gcc/testsuite/
* gcc.target/powerpc/bfp/scalar-extract-exp-2.c: Adjust error
message.
* gcc.target/powerpc/bfp/scalar-extract-sig-2.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-2.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-5.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-8.c: Likewise.
* gcc.target/powerpc/bfp/scalar-test-neg-2.c: Likewise.
* gcc.target/powerpc/bfp/scalar-test-neg-3.c: Likewise.
* gcc.target/powerpc/bfp/scalar-test-neg-5.c: Likewise.
* gcc.target/powerpc/byte-in-set-2.c: Likewise.
* gcc.target/powerpc/cmpb-2.c: Likewise.
* gcc.target/powerpc/cmpb32-2.c: Likewise.
* gcc.target/powerpc/crypto-builtin-2.c: Likewise.
* gcc.target/powerpc/fold-vec-splat-floatdouble.c: Remove invalid
test and adjust xxpermdi count.
* gcc.target/powerpc/fold-vec-splat-longlong.c: Remove invalid
tests and adjust instruction counts.
* gcc.target/powerpc/fold-vec-splat-misc-invalid.c: Adjust error
messages.
* gcc.target/powerpc/int_128bit-runnable.c: Adjust instruction
counts since we do better by not gimple-folding some builtins.
* gcc.target/powerpc/pr80315-1.c: Adjust error message.
* gcc.target/powerpc/pr80315-2.c: Likewise.
* gcc.target/powerpc/pr80315-3.c: Likewise.
* gcc.target/powerpc/pr80315-4.c: Likewise.
* gcc.target/powerpc/pr88100.c: Likewise.
* gcc.target/powerpc/pragma_misc9.c: Likewise.
* gcc.target/powerpc/pragma_power8.c: Undef _RS6000_VECDEFINES_H.
* gcc.target/powerpc/pragma_power9.c: Likewise.
* gcc.target/powerpc/test_fpscr_drn_builtin_error.c: Adjust error
messages.
* gcc.target/powerpc/test_fpscr_rn_builtin_error.c: Likewise.
* gcc.target/powerpc/vec-gnb-2.c: Likewise.
* gcc.target/powerpc/vsu/vec-all-nez-7.c: Likewise.
* gcc.target/powerpc/vsu/vec-any-eqz-7.c: Likewise.
* gcc.target/powerpc/vsu/vec-cmpnez-7.c: Likewise.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-2.c: Likewise.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-2.c: Likewise.
* gcc.target/powerpc/vsu/vec-xl-len-13.c: Likewise.
* gcc.target/powerpc/vsu/vec-xst-len-12.c: Likewise.
---
 .../gcc.target/powerpc/bfp/scalar-extract-exp-2.c  |  2 +-
 .../gcc.target/powerpc/bfp/scalar-extract-sig-2.c  |  2 +-
 .../gcc.target/powerpc/bfp/scalar-insert-exp-2.c   |  2 +-
 .../gcc.target/powerpc/bfp/scalar-insert-exp-5.c   |  2 +-
 .../gcc.target/powerpc/bfp/scalar-insert-exp-8.c   |  2 +-
 .../gcc.target/powerpc/bfp/scalar-test-neg-2.c |  2 +-
 .../gcc.target/powerpc/bfp/scalar-test-neg-3.c |  2 +-
 .../gcc.target/powerpc/bfp/scalar-test-neg-5.c |  2 +-
 gcc/testsuite/gcc.target/powerpc/byte-in-set-2.c   |  2 +-
 gcc/testsuite/gcc.target/powerpc/cmpb-2.c  |  2 +-
 gcc/testsuite/gcc.target/powerpc/cmpb32-2.c|  2 +-
 .../gcc.target/powerpc/crypto-builtin-2.c  | 14 +++---
 .../powerpc/fold-vec-splat-floatdouble.c   |  4 ++--
 .../gcc.target/powerpc/fold-vec-splat-longlong.c   | 10 +++---
 .../powerpc/fold-vec-splat-misc-invalid.c  |  8 
 .../gcc.target/powerpc/int_128bit-runnable.c   |  6 +++---
 gcc/testsuite/gcc.target/powerpc/pr80315-1.c   |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-2.c   |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-3.c   |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-4.c   |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr88100.c | 12 ++--
 

Re: [RFC] c++: Print function template parms when relevant (was: [PATCH v4] c++: Add gnu::diagnose_as attribute)

2021-11-16 Thread Jason Merrill via Gcc-patches

On 11/8/21 15:00, Matthias Kretz wrote:

I forgot to mention why I tagged it [RFC]: I needed one more bit of
information on the template args TREE_VEC to encode EXPLICIT_TEMPLATE_ARGS_P.
Its TREE_CHAIN already points to an integer constant denoting the number of
non-default arguments, so I couldn't trivially replace that. Therefore, I used
the sign of that integer. I was hoping to find a cleaner solution, though.


It seems that we aren't using any TREE_LANG_FLAG_n on TREE_VEC, so that 
would be a cleaner solution.



On Monday, 8 November 2021 17:40:44 CET Matthias Kretz wrote:

On Tuesday, 17 August 2021 20:31:54 CET Jason Merrill wrote:

2. Given a DECL_TI_ARGS tree, can I query whether an argument was
deduced
or explicitly specified? I'm asking because I still consider diagnostics
of function templates unfortunate. `template  void f()` is
fine,
as is `void f(T) [with T = float]`, but `void f() [with T = float]`
could
be better. I.e. if the template parameter appears somewhere in the
function parameter list, dump_template_parms would only produce noise.
If, however, the template parameter was given explicitly, it would be
nice if it could show up accordingly in diagnostics.


NON_DEFAULT_TEMPLATE_ARGS_COUNT has that information, though there are
some issues with it.  Attached is my WIP from May to improve it
somewhat, if that's interesting.


It is interesting. I used your patch to come up with the attached patch. I
must say, I didn't try to read through all the cp/pt.c code to understand
all of what you did there (which is why my ChangeLog entry says "Jason?"),
but it works for me (and all of `make check`).

Anyway, I'd like to propose the following before finishing my diagnose_as
patch. I believe it's useful to fix this part first. The diagnostic/default-
template-args-[12].C tests show a lot of examples of the intent of this
patch. And the remaining changes to the testsuite show how it changes
diagnostic output.

-- 8< 

The choice when to print a function template parameter was still
suboptimal. That's because sometimes the function template parameter
list only adds noise, while in other situations the lack of a function
template parameter list makes diagnostic messages hard to understand.

The general idea of this change is to print template parms wherever they
would appear in the source code as well. Thus, the diagnostics code
needs to know whether any template parameter was given explicitly.

Signed-off-by: Matthias Kretz 

gcc/testsuite/ChangeLog:

 * g++.dg/debug/dwarf2/template-params-12n.C: Optionally, allow
 DW_AT_default_value.
 * g++.dg/diagnostic/default-template-args-1.C: New.
 * g++.dg/diagnostic/default-template-args-2.C: New.
 * g++.dg/diagnostic/param-type-mismatch-2.C: Expect template
 parms in diagnostic.
 * g++.dg/ext/pretty1.C: Expect function template specialization
 to not pretty-print template parms.
 * g++.old-deja/g++.ext/pretty3.C: Ditto.
 * g++.old-deja/g++.pt/memtemp77.C: Ditto.
 * g++.dg/goacc/template.C: Expect function template parms for
 explicit arguments.
 * g++.dg/gomp/declare-variant-7.C: Expect no function template
 parms for deduced arguments.
 * g++.dg/template/error40.C: Expect only non-default template
 arguments in diagnostic.

gcc/cp/ChangeLog:

 * cp-tree.h (GET_NON_DEFAULT_TEMPLATE_ARGS_COUNT): Return
 absolute value of stored constant.
 (EXPLICIT_TEMPLATE_ARGS_P): New.
 (SET_EXPLICIT_TEMPLATE_ARGS_P): New.
 (TFF_AS_PRIMARY): New constant.
 * error.c (get_non_default_template_args_count): Avoid
 GET_NON_DEFAULT_TEMPLATE_ARGS_COUNT if
 NON_DEFAULT_TEMPLATE_ARGS_COUNT is a NULL_TREE. Make independent
 of flag_pretty_templates.
 (dump_template_bindings): Add flags parameter to be passed to
 get_non_default_template_args_count. Print only non-default
 template arguments.
 (dump_function_decl): Call dump_function_name and dump_type of
 the DECL_CONTEXT with specialized template and set
 TFF_AS_PRIMARY for their flags.
 (dump_function_name): Add and document conditions for calling
 dump_template_parms.
 (dump_template_parms): Print only non-default template
 parameters.
 * pt.c (determine_specialization): Jason?
 (template_parms_level_to_args): Jason?
 (copy_template_args): Jason?
 (fn_type_unification): Set EXPLICIT_TEMPLATE_ARGS_P on the
 template arguments tree if any template parameter was explicitly
 given.
 (type_unification_real): Jason?
 (get_partial_spec_bindings): Jason?
 (tsubst_template_args): Determine number of defaulted arguments
 from new argument vector, if possible.
---
  gcc/cp/cp-tree.h  | 18 +++-
  gcc/cp/error.c  

Re: [PATCH] restore ancient -Waddress for weak symbols [PR33925]

2021-11-16 Thread Jason Merrill via Gcc-patches

On 10/23/21 19:06, Martin Sebor wrote:

On 10/4/21 3:37 PM, Jason Merrill wrote:

On 10/4/21 14:42, Martin Sebor wrote:

While resolving the recent -Waddress enhancement request
(PR102103) I came across a 2007 problem report about GCC 4 having
stopped warning for using the address of inline functions in
equality comparisons with null.  With inline functions being
commonplace in C++ this seems like an important use case for
the warning.

The change that resulted in suppressing the warning in these
cases was introduced inadvertently in a fix for PR 22252.

To restore the warning, the attached patch enhances
the decl_with_nonnull_addr_p() function to return true also for
weak symbols for which a definition has been provided.
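
For illustration, the kind of comparison the problem report is about
might look like the sketch below.  The names are invented, and whether
a particular GCC version diagnoses it depends on the state of the patch.

```cpp
// An inline function with a definition in this translation unit.
inline int answer() { return 42; }

bool answer_is_defined()
{
  // The address of a defined function cannot be null, so this test is
  // always true; per the thread, -Waddress is meant to point that out.
  return &answer != nullptr;
}
```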


I think you probably want to merge this function with 
fold-const.c:maybe_nonzero_address, which already handles more cases.


maybe_nonzero_address() doesn't behave quite like
decl_with_nonnull_addr_p() expects and I'm reluctant to muck
around with the former too much since it's used for codegen,
while the latter just for warnings.  (There is even a case
where the functions don't behave the same, and would result
in different warnings between C and C++ without some extra
help.)

So in the attached revision I just have maybe_nonzero_address()
call decl_with_nonnull_addr_p() and then refine the failing
(or uncertain) cases separately, with some overlap between
them.

Since I worked on this someone complained that some instances
of the warning newly enhanced under PR102103 aren't suppressed
in code resulting from macro expansion.  Since it's trivial,
I include the fix for that report in this patch as well.



+   allocated stroage might have a null address.  */


typo.

OK with that fixed.

Jason



[PATCH, committed] PR fortran/103286 - ICE in resolve_select, at fortran/resolve.c:8848

2021-11-16 Thread Harald Anlauf via Gcc-patches
Committed to mainline as obvious after regtesting.

When issuing an error on an invalid range in a SELECT CASE statement
with a logical case expression, we need to be careful to use the
right locus information.
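
The idea can be modelled outside the compiler as follows.  The types
are simplified stand-ins, not the gfortran ones; the point is only that
an open-ended range such as (:0) has no lower bound expression, so the
location must come from whichever bound exists.

```cpp
// Simplified stand-ins for the front end's location and expression types.
struct Locus { int line; };
struct Expr { Locus where; };

// A CASE range; either bound may be missing, as in (:0) or (2:).
struct CaseRange
{
  const Expr *low;
  const Expr *high;
};

// Picks a location for the diagnostic without dereferencing a null bound.
// Assumes at least one of the two bounds is present.
const Locus *error_locus(const CaseRange &range)
{
  return range.low ? &range.low->where : &range.high->where;
}
```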

Thanks,
Harald

From 3b3c9932338650c9a402cf1bfbdf7dfc03e185e7 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Tue, 16 Nov 2021 21:06:06 +0100
Subject: [PATCH] Fortran: avoid NULL pointer dereference on invalid range in
 logical SELECT CASE

gcc/fortran/ChangeLog:

	PR fortran/103286
	* resolve.c (resolve_select): Choose appropriate range limit to
	avoid NULL pointer dereference when generating error message.

gcc/testsuite/ChangeLog:

	PR fortran/103286
	* gfortran.dg/pr103286.f90: New test.
---
 gcc/fortran/resolve.c  |  3 ++-
 gcc/testsuite/gfortran.dg/pr103286.f90 | 11 +++
 2 files changed, 13 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr103286.f90

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 705d2326a29..f074a0ab3a1 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -8846,7 +8846,8 @@ resolve_select (gfc_code *code, bool select_type)
 		  || cp->low != cp->high))
 	{
 	  gfc_error ("Logical range in CASE statement at %L is not "
-			 "allowed", &cp->low->where);
+			 "allowed",
+			 cp->low ? &cp->low->where : &cp->high->where);
 	  t = false;
 	  break;
 	}
diff --git a/gcc/testsuite/gfortran.dg/pr103286.f90 b/gcc/testsuite/gfortran.dg/pr103286.f90
new file mode 100644
index 000..1c18b7136ce
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr103286.f90
@@ -0,0 +1,11 @@
+! { dg-do compile }
+! { dg-options "std=gnu" }
+! PR fortran/103286 - ICE in resolve_select
+
+program p
+  select case (.true.) ! { dg-warning "Extension: Conversion" }
+  case (1_8)
+  case (:0)! { dg-error "Logical range in CASE statement" }
+  case (2:)! { dg-error "Logical range in CASE statement" }
+  end select
+end
--
2.26.2



Re: [PATCH] Fix spelling of ones' complement.

2021-11-16 Thread Bernhard Reutner-Fischer via Gcc-patches
On Tue, 16 Nov 2021 15:55:55 +0100
Aldy Hernandez via Gcc-patches  wrote:

> All sources before Knuth are clearly wrong.  How could they not?
> Folks living in the pre-Knuth era lived without a deity.
> 
> :-P

Not sure if this one's a compliment.

Speaking of which:

$ git grep -i "complim"
gcc/ChangeLog-2000: addition over compliments over shifts.
gcc/ada/sem_util.adb:  --  Assume that the main unit does not have a 
complimentary unit
gcc/ada/sem_util.adb:  --  Obtain the complimentary unit of the main unit
gcc/config/fr30/fr30.c:  /* Convert GCC's comparison operators into the 
complimentary FR30
gcc/config/mn10300/mn10300.md:  /* Recall that twos-compliment is 
ones-compliment plus one.  When
gcc/config/nds32/constraints.md:  "A constant whose compliment value is in the 
range of imm15u
gcc/config/nds32/nds32.md:;; 'ONE_COMPLIMENT' operation
gcc/config/sparc/sparc.h:   compliment of ordered and unordered comparisons, 
but until generic
gcc/config/visium/visium.h:   compliment of ordered and unordered comparisons, 
but until generic
gcc/d/expr.cc:  /* Build a compliment expression, where all the bits in the 
value are
gcc/d/intrinsics.cc:   Variants of `bt' will then update that bit. `btc' 
compliments the bit, `bts'
gcc/doc/md.texi:A constant whose compliment value is in the range of imm15u
gcc/ipa-reference.c:  /* Create the complimentary sets.  */
libstdc++-v3/testsuite/data/thirty_years_among_the_dead_preproc.txt:compliment

Maybe someone competent should contemplate to complement the fixes
for ones' two's complement in the above, except the first and last... ;)


Re: [PATCH v4] Fix ICE when mixing VLAs and statement expressions [PR91038]

2021-11-16 Thread Jason Merrill via Gcc-patches

On 11/16/21 08:48, Uecker, Martin wrote:

Am Montag, den 08.11.2021, 19:13 +0100 schrieb Martin Uecker:

Am Montag, den 08.11.2021, 12:13 -0500 schrieb Jason Merrill:

On 11/7/21 01:40, Uecker, Martin wrote:

Am Mittwoch, den 03.11.2021, 10:18 -0400 schrieb Jason Merrill:


...


Thank you! I made these changes and ran
bootstrap and tests again.


Hmm, it doesn't look like you made the change to use the save_expr
function instead of build1?


Oh, sorry. I wanted to change it and then forgot.
Now also with this change (changelog as before).



Ok, with this change?


OK.


Best,
Martin




Ok for trunk?


Any idea how to fix returning structs with
VLA member from statement expressions?


Testcase?


void foo(void)
{
   ({ int N = 3; struct { char x[N]; } x; x; });
}

The difference to the tests in this patch (which I
also forgot to include in the last version) is that
the object of variable size is returned from the
statement expression and not a pointer to it.
This can not happen with arrays because they decay
to pointers.


Martin



Otherwise, I will add an error message to
the FE in another patch.

Martin



diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 436df45df68..95083f95442 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -3306,7 +3306,19 @@ pointer_int_sum (location_t loc, enum tree_code 
resultcode,
 TREE_TYPE (result_type)))
  size_exp = integer_one_node;
else
-size_exp = size_in_bytes_loc (loc, TREE_TYPE (result_type));
+{
+  size_exp = size_in_bytes_loc (loc, TREE_TYPE (result_type));
+  /* Wrap the pointer expression in a SAVE_EXPR to make sure it
+is evaluated first when the size expression may depend
+on it for VM types.  */
+  if (TREE_SIDE_EFFECTS (size_exp)
+ && TREE_SIDE_EFFECTS (ptrop)
+ && variably_modified_type_p (TREE_TYPE (ptrop), NULL))
+   {
+ ptrop = save_expr (ptrop);
+ size_exp = build2 (COMPOUND_EXPR, TREE_TYPE (intop), ptrop, size_exp);
+   }
+}
  
/* We are manipulating pointer values, so we don't need to warn

   about relying on undefined signed overflow.  We disable the
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index c2ab96e7e18..84f7dc3c248 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -2964,7 +2964,9 @@ gimplify_var_or_parm_decl (tree *expr_p)
   declaration, for which we've already issued an error.  It would
   be really nice if the front end wouldn't leak these at all.
   Currently the only known culprit is C++ destructors, as seen
- in g++.old-deja/g++.jason/binding.C.  */
+ in g++.old-deja/g++.jason/binding.C.
+ Another possible culprit are size expressions for variably modified
+ types which are lost in the FE or not gimplified correctly.  */
if (VAR_P (decl)
&& !DECL_SEEN_IN_BIND_EXPR_P (decl)
&& !TREE_STATIC (decl) && !DECL_EXTERNAL (decl)
@@ -3109,16 +3111,22 @@ gimplify_compound_lval (tree *expr_p, gimple_seq 
*pre_p, gimple_seq
*post_p,
   expression until we deal with any variable bounds, sizes, or
   positions in order to deal with PLACEHOLDER_EXPRs.
  
- So we do this in three steps.  First we deal with the annotations

- for any variables in the components, then we gimplify the base,
- then we gimplify any indices, from left to right.  */
+ The base expression may contain a statement expression that
+ has declarations used in size expressions, so has to be
+ gimplified before gimplifying the size expressions.
+
+ So we do this in three steps.  First we deal with variable
+ bounds, sizes, and positions, then we gimplify the base,
+ then we deal with the annotations for any variables in the
+ components and any indices, from left to right.  */
+
for (i = expr_stack.length () - 1; i >= 0; i--)
  {
tree t = expr_stack[i];
  
if (TREE_CODE (t) == ARRAY_REF || TREE_CODE (t) == ARRAY_RANGE_REF)

{
- /* Gimplify the low bound and element type size and put them into
+ /* Deal with the low bound and element type size and put them into
 the ARRAY_REF.  If these values are set, they have already been
 gimplified.  */
  if (TREE_OPERAND (t, 2) == NULL_TREE)
@@ -3127,18 +3135,8 @@ gimplify_compound_lval (tree *expr_p, gimple_seq *pre_p, 
gimple_seq
*post_p,
  if (!is_gimple_min_invariant (low))
{
  TREE_OPERAND (t, 2) = low;
- tret = gimplify_expr (&TREE_OPERAND (t, 2), pre_p,
-   post_p, is_gimple_reg,
-   fb_rvalue);
- ret = MIN (ret, tret);
}
}
- else
-   {
- tret = gimplify_expr (&TREE_OPERAND (t, 2), pre_p, post_p,
-   is_gimple_reg, fb_rvalue);
- ret = MIN (ret, tret);
- 

[PATCH v2] libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]

2021-11-16 Thread Marek Polacek via Gcc-patches
On Mon, Nov 15, 2021 at 06:15:40PM -0500, David Malcolm wrote:
> > On Mon, Nov 08, 2021 at 04:33:43PM -0500, Marek Polacek wrote:
> > > Ping, can we conclude on the name?   IMHO, -Wbidirectional is just fine,
> > > but changing the name is a trivial operation. 
> > 
> > Here's a patch with a better name (suggested by Jonathan W.).  Otherwise no
> > changes.
> 
> Thanks for implementing this.
> 
> > 
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > 
> > -- >8 --
> > From a link below:
> > "An issue was discovered in the Bidirectional Algorithm in the Unicode
> > Specification through 14.0. It permits the visual reordering of
> > characters via control sequences, which can be used to craft source code
> > that renders different logic than the logical ordering of tokens
> > ingested by compilers and interpreters. Adversaries can leverage this to
> > encode source code for compilers accepting Unicode such that targeted
> > vulnerabilities are introduced invisibly to human reviewers."
> > 
> > More info:
> > https://nvd.nist.gov/vuln/detail/CVE-2021-42574
> > https://trojansource.codes/
> > 
> > This is not a compiler bug.  However, to mitigate the problem, this patch
> > implements -Wbidi-chars=[none|unpaired|any] to warn about possibly
> > misleading Unicode bidirectional characters the preprocessor may encounter.
> > 
> > The default is =unpaired, which warns about improperly terminated
> > bidirectional characters; e.g. a LRE without its appertaining PDF.  The
> 
> I like the default.

Great.

> Wording nit: maybe use "corresponding" rather than "appertaining"; I
> believe the latter has a sense that one is part of the other, when they
> are more like peers.

OK, fixed.

> > level =any warns about any use of bidirectional characters.
> 
> Terminology nit:
> The patch is referring to "bidirectional characters", but I think the
> term "bidirectional control characters" would be better.

Adjusted.
 
> For example, a passage of text containing both numbers and characters
> in a right-to-left script could be considered "bidirectional", since
> the numbers are written from left-to-right.
> 
> Specifically, the patch looks for these specific characters:
>   * U+202A LEFT-TO-RIGHT EMBEDDING
>   * U+202B RIGHT-TO-LEFT EMBEDDING
>   * U+202C POP DIRECTIONAL FORMATTING
>   * U+202D LEFT-TO-RIGHT OVERRIDE
>   * U+202E RIGHT-TO-LEFT OVERRIDE
>   * U+2066 LEFT-TO-RIGHT ISOLATE
>   * U+2067 RIGHT-TO-LEFT ISOLATE
>   * U+2068 FIRST STRONG ISOLATE
>   * U+2069 POP DIRECTIONAL ISOLATE
> 
> However, the following characters could also be considered as
> "bidirectional control characters":
>   * U+200E ‎LEFT-TO-RIGHT MARK (UTF-8: E2 80 8E)
>   * U+200F ‎RIGHT-TO-LEFT MARK (UTF-8: E2 80 8F)
> but aren't checked for in the patch.  Should they be?  I can imagine
> ways in which they could be abused, so I think so.

I'd only intended to check the bidi chars described in the original
trojan source pdf, but I added checking for U+200E/U+200F too, since
it was easy enough.  AFAIK they aren't popped by a PDF/PDI like the
rest, so don't need to go on the vec, and so we only warn with =any.
Tests: Wbidi-chars-16.c + Wbidi-chars-17.c
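
For illustration only, here is a minimal sketch of the kind of check discussed
above.  It is not the code from the patch: the names bidi_kind, classify_bidi
and scan_bidi are hypothetical, and the pairing model is deliberately
simplified (a single counter, with no distinction between PDF and PDI and no
per-comment or per-literal context).  It only shows that every character in
the list is a three-byte UTF-8 sequence starting with E2 80 or E2 81, and that
LRM/RLM can be counted for =any without ever being pushed.

```c
#include <stddef.h>

/* Classification of a bidirectional control character.  These names
   are illustrative and are not the identifiers used in the patch.  */
enum bidi_kind
{
  BIDI_NONE,   /* Not a bidi control character.  */
  BIDI_OPEN,   /* LRE, RLE, LRO, RLO, LRI, RLI, FSI: opens a context.  */
  BIDI_CLOSE,  /* PDF, PDI: closes a context.  */
  BIDI_MARK    /* LRM, RLM: stand-alone marks, never paired.  */
};

/* Classify the three-byte UTF-8 sequence starting at P.  */
static enum bidi_kind
classify_bidi (const unsigned char *p)
{
  if (p[0] != 0xE2)
    return BIDI_NONE;
  if (p[1] == 0x80)
    switch (p[2])
      {
      case 0x8E: case 0x8F:             /* U+200E, U+200F */
        return BIDI_MARK;
      case 0xAA: case 0xAB:             /* U+202A, U+202B */
      case 0xAD: case 0xAE:             /* U+202D, U+202E */
        return BIDI_OPEN;
      case 0xAC:                        /* U+202C */
        return BIDI_CLOSE;
      default:
        return BIDI_NONE;
      }
  if (p[1] == 0x81)
    switch (p[2])
      {
      case 0xA6: case 0xA7: case 0xA8:  /* U+2066, U+2067, U+2068 */
        return BIDI_OPEN;
      case 0xA9:                        /* U+2069 */
        return BIDI_CLOSE;
      default:
        return BIDI_NONE;
      }
  return BIDI_NONE;
}

/* Scan LEN bytes at BUF.  Store in *ANY the number of bidi control
   characters seen (roughly what =any would report) and return the
   number of contexts still open at the end (roughly what =unpaired
   would report in this simplified model).  */
static int
scan_bidi (const unsigned char *buf, size_t len, int *any)
{
  int open = 0;
  *any = 0;
  for (size_t i = 0; i + 2 < len; i++)
    {
      enum bidi_kind k = classify_bidi (buf + i);
      if (k == BIDI_NONE)
        continue;
      ++*any;
      if (k == BIDI_OPEN)
        open++;
      else if (k == BIDI_CLOSE && open > 0)
        open--;
      i += 2;  /* Skip the rest of the three-byte sequence.  */
    }
  return open;
}
```

In this sketch a lone U+202E leaves one context open, an LRE followed by a
PDF leaves none, and a lone U+200E is counted in *ANY but never affects the
open count.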
  
> [...snip...]
> 
> > diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> > index 06457ac739e..b047df0f125 100644
> > --- a/gcc/c-family/c.opt
> > +++ b/gcc/c-family/c.opt
> > @@ -374,6 +374,30 @@ Wbad-function-cast
> >  C ObjC Var(warn_bad_function_cast) Warning
> >  Warn about casting functions to incompatible types.
> >  
> > +Wbidi-chars
> > +C ObjC C++ ObjC++ Warning Alias(Wbidi-chars=,any,none)
> > +;
> > +
> > +Wbidi-chars=
> > +C ObjC C++ ObjC++ RejectNegative Joined Warning 
> > CPP(cpp_warn_bidirectional) CppReason(CPP_W_BIDIRECTIONAL) 
> > Var(warn_bidirectional) Init(bidirectional_unpaired) 
> > Enum(cpp_bidirectional_level)
> > +-Wbidi-chars=[none|unpaired|any] Warn about UTF-8 bidirectional characters.
> 
> "control characters"
 
Fixed.

> [...snip...]
> 
> >  
> > +@item -Wbidi-chars=@r{[}none@r{|}unpaired@r{|}any@r{]}
> > +@opindex Wbidi-chars=
> > +@opindex Wbidi-chars
> > +@opindex Wno-bidi-chars
> > +Warn about possibly misleading UTF-8 bidirectional characters in comments,
> 
> (and here again)
 
Fixed.

> > +string literals, character constants, and identifiers.  Such characters can
> > +change left-to-right writing direction into right-to-left (and vice versa),
> > +which can cause confusion between the logical order and visual order.  This
> > +may be dangerous; for instance, it may seem that a piece of code is not
> > +commented out, whereas it in fact is.
> > +
> > +There are three levels of warning supported by GCC@.  The default is
> > +@option{-Wbidi-chars=unpaired}, which warns about improperly terminated
> > +bidi contexts.  @option{-Wbidi-chars=none} turns the warning off.
> > +@option{-Wbidi-chars=any} warns about any use of bidirectional characters.
> 
> (and again)

Fixed.

> [...snip...]
> 
> 
> > diff --git 

[pushed] configure, Darwin: Set appropriate defaults for host-shared.

2021-11-16 Thread Iain Sandoe via Gcc-patches
Darwin x86_64 and aarch64 platforms are PIC (shared) by default,
and user-space code must be built in this mode.  The patch
ensures that this is set correctly and applies a default when
--enable-host-shared is not set.

Tested on *-darwin* and {x86_64,powerpc64le}-linux-gnu;
pushed to master, thanks.
Iain

Signed-off-by: Iain Sandoe 

ChangeLog:

* configure: Regenerate.
* configure.ac: Ensure that PIC (shared) defaults are set
correctly for Darwin.
---
 configure| 16 +++-
 configure.ac | 15 ++-
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 58979d6e3b1..3062495da31 100755
--- a/configure
+++ b/configure
@@ -8447,8 +8447,20 @@ fi
 # Check whether --enable-host-shared was given.
 if test "${enable_host_shared+set}" = set; then :
   enableval=$enable_host_shared; host_shared=$enableval
+ case $target in
+   x86_64-*-darwin* | aarch64-*-darwin*)
+ if test x$host_shared != xyes ; then
+   # PIC is the default, and actually cannot be switched off.
+   echo configure.ac: warning: PIC code is required for the configured 
target, host-shared setting ignored. 1>&2
+   host_shared=yes
+ fi ;;
+  *) ;;
+ esac
 else
-  host_shared=no
+  case $target in
+  x86_64-*-darwin* | aarch64-*-darwin*) host_shared=yes ;;
+  *) host_shared=no ;;
+ esac
 fi
 
 
@@ -10083,6 +10095,8 @@ done
 
 
 
+
+
 # Generate default definitions for YACC, M4, LEX and other programs that run
 # on the build machine.  These are used if the Makefile can't locate these
 # programs in objdir.
diff --git a/configure.ac b/configure.ac
index 550e6993b59..bed60bcaf72 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1859,7 +1859,20 @@ AC_SUBST(extra_linker_plugin_flags)
 AC_ARG_ENABLE(host-shared,
 [AS_HELP_STRING([--enable-host-shared],
[build host code as shared libraries])],
-[host_shared=$enableval], [host_shared=no])
+[host_shared=$enableval
+ case $target in
+   x86_64-*-darwin* | aarch64-*-darwin*)
+ if test x$host_shared != xyes ; then
+   # PIC is the default, and actually cannot be switched off.
+   echo configure.ac: warning: PIC code is required for the configured 
target, host-shared setting ignored. 1>&2
+   host_shared=yes
+ fi ;;
+  *) ;;
+ esac],
+[case $target in
+  x86_64-*-darwin* | aarch64-*-darwin*) host_shared=yes ;;
+  *) host_shared=no ;;
+ esac])
 AC_SUBST(host_shared)
 
 # By default, C and C++ are the only stage 1 languages.
-- 
2.24.3 (Apple Git-128)



[r12-5301 Regression] FAIL: gcc.dg/tree-ssa/if-to-switch-3.c scan-tree-dump iftoswitch "Condition chain with [^\n\r]* BBs transformed into a switch statement." on Linux/x86_64

2021-11-16 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

045206450386bcd774db3bde0c696828402361c6 is the first bad commit
commit 045206450386bcd774db3bde0c696828402361c6
Author: Richard Biener 
Date:   Fri Nov 12 10:21:22 2021 +0100

tree-optimization/102880 - improve CD-DCE

caused

FAIL: gcc.dg/tree-ssa/if-to-switch-3.c scan-tree-dump iftoswitch "Condition 
chain with [^\n\r]* BBs transformed into a switch statement."

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-5301/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/if-to-switch-3.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/if-to-switch-3.c 
--target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me 
at skpgkp2 at gmail dot com)


[r12-5292 Regression] FAIL: gcc.dg/tree-ssa/modref-dse-5.c scan-tree-dump dse2 "Deleted dead store: wrap" on Linux/x86_64

2021-11-16 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

e69b7c5779863469479698f863ab25e0d9b4586e is the first bad commit
commit e69b7c5779863469479698f863ab25e0d9b4586e
Author: Jan Hubicka 
Date:   Tue Nov 16 09:15:39 2021 +0100

Fix uninitialized access in merge_call_side_effects

caused

FAIL: gcc.dg/tree-ssa/modref-dse-5.c scan-tree-dump dse2 "Deleted dead store: 
wrap"

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-5292/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/modref-dse-5.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/modref-dse-5.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/modref-dse-5.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/modref-dse-5.c 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me 
at skpgkp2 at gmail dot com)


[PATCH v2] rs6000: Fix a handful of 32-bit built-in function problems

2021-11-16 Thread Bill Schmidt via Gcc-patches
Hi!  I previously posted [1] to correct some problems with the new builtins
support targeting 32-bit code gen.  Based on the discussion, I've made some
adjustments and would like to submit this for consideration.

We eventually agreed that the strange behavior for -m32 -mpowerpc64 for certain
HTM builtins should be removed.  All of the registers TEXASR, TEXASRU, TFHAR,
and TFIAR are now accessed using the unsigned long data type in all 
configurations.

Segher didn't like the change in the error message for the cmpb-3.c test case,
but I think this should be fine.  The test case just tests for the error 
message,
but there is also a "note" message that provides additional information.  The
diagnostics that the user sees will look like this:

cmpb-3.c:11:3: error: '__builtin_p6_cmpb' requires the '-mcpu=power6' option 
and either the '-m64' or '-mpowerpc64' option
cmpb-3.c:11:3: note: builtin '__builtin_cmpb' requires builtin 
'__builtin_p6_cmpb'

So it's clear to the user that their use of __builtin_cmpb at line 11 triggered
the error.

Bootstrapped and tested on powerpc64le-linux-gnu, and on powerpc64-linux-gnu
using -m32/-m64.  Is this okay for trunk?

Thanks!
Bill

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583905.html


2021-11-16  Bill Schmidt  

gcc/
* config/rs6000/rs6000-builtin-new.def (CMPB): Flag as no32bit.
(BPERMD): Flag as 32bit (needing special handling for 32-bit).
(UNPACK_TD): Return unsigned long long instead of unsigned long.
(GET_TEXASR): Return unsigned long instead of unsigned long long.
(GET_TEXASRU): Likewise.
(GET_TFHAR): Likewise.
(GET_TFIAR): Likewise.
(SET_TEXASR): Pass unsigned long instead of unsigned long long.
(SET_TEXASRU): Likewise.
(SET_TFHAR): Likewise.
(SET_TFIAR): Likewise.
(TABORTDC): Likewise.
(TABORTDCI): Likewise.
* config/rs6000/rs6000-call.c (rs6000_expand_new_builtin): Fix error
handling for no32bit.  Add 32bit handling for RS6000_BIF_BPERMD.

gcc/testsuite/
* gcc.target/powerpc/cmpb-3.c: Adjust error message.
---
 gcc/config/rs6000/rs6000-builtin-new.def  | 30 +++
 gcc/config/rs6000/rs6000-call.c   |  9 ---
 gcc/testsuite/gcc.target/powerpc/cmpb-3.c |  2 +-
 3 files changed, 22 insertions(+), 19 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def 
b/gcc/config/rs6000/rs6000-builtin-new.def
index 58dfce1ca37..30556e5c7f2 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -273,7 +273,7 @@
 ; Power6 builtins requiring 64-bit GPRs (even with 32-bit addressing).
 [power6-64]
   const signed long __builtin_p6_cmpb (signed long, signed long);
-CMPB cmpbdi3 {}
+CMPB cmpbdi3 {no32bit}
 
 
 ; AltiVec builtins.
@@ -2018,7 +2018,7 @@
 ADDG6S addg6s {}
 
   const signed long __builtin_bpermd (signed long, signed long);
-BPERMD bpermd_di {}
+BPERMD bpermd_di {32bit}
 
   const unsigned int __builtin_cbcdtd (unsigned int);
 CBCDTD cbcdtd {}
@@ -2971,7 +2971,7 @@
   void __builtin_set_fpscr_drn (const int[0,7]);
 SET_FPSCR_DRN rs6000_set_fpscr_drn {}
 
-  const unsigned long __builtin_unpack_dec128 (_Decimal128, const int<1>);
+  const unsigned long long __builtin_unpack_dec128 (_Decimal128, const int<1>);
 UNPACK_TD unpacktd {}
 
 
@@ -3014,39 +3014,39 @@
 
 
 [htm]
-  unsigned long long __builtin_get_texasr ();
+  unsigned long __builtin_get_texasr ();
 GET_TEXASR nothing {htm,htmspr}
 
-  unsigned long long __builtin_get_texasru ();
+  unsigned long __builtin_get_texasru ();
 GET_TEXASRU nothing {htm,htmspr}
 
-  unsigned long long __builtin_get_tfhar ();
+  unsigned long __builtin_get_tfhar ();
 GET_TFHAR nothing {htm,htmspr}
 
-  unsigned long long __builtin_get_tfiar ();
+  unsigned long __builtin_get_tfiar ();
 GET_TFIAR nothing {htm,htmspr}
 
-  void __builtin_set_texasr (unsigned long long);
+  void __builtin_set_texasr (unsigned long);
 SET_TEXASR nothing {htm,htmspr}
 
-  void __builtin_set_texasru (unsigned long long);
+  void __builtin_set_texasru (unsigned long);
 SET_TEXASRU nothing {htm,htmspr}
 
-  void __builtin_set_tfhar (unsigned long long);
+  void __builtin_set_tfhar (unsigned long);
 SET_TFHAR nothing {htm,htmspr}
 
-  void __builtin_set_tfiar (unsigned long long);
+  void __builtin_set_tfiar (unsigned long);
 SET_TFIAR nothing {htm,htmspr}
 
   unsigned int __builtin_tabort (unsigned int);
 TABORT tabort {htm,htmcr}
 
-  unsigned int __builtin_tabortdc (unsigned long long, unsigned long long, \
-   unsigned long long);
+  unsigned int __builtin_tabortdc (unsigned long, unsigned long, \
+   unsigned long);
 TABORTDC tabortdc {htm,htmcr}
 
-  unsigned int __builtin_tabortdci (unsigned long long, unsigned long long, \
-unsigned long long);
+  

[PATCH] x86: Add -mindirect-branch-cs-prefix

2021-11-16 Thread H.J. Lu via Gcc-patches
Add -mindirect-branch-cs-prefix, which adds a CS prefix to the call and jmp
to the thunk when an indirect call or jump via the r8-r15 registers is
converted.  This increases the instruction length to 6 bytes, allowing the
non-thunk form to be inlined.
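
As a rough sketch of the length arithmetic behind this, assuming (this is not
stated in the patch) that the non-thunk form is an lfence followed by the
indirect branch itself: the byte counts below come from the x86-64
instruction encoding, the helper names are hypothetical, and regno is the
hardware register number 0..15 rather than GCC's internal register number.

```c
/* Encoded lengths in bytes.  These come from the x86-64 instruction
   encoding, not from the patch.  */
enum
{
  LEN_CS_PREFIX = 1,     /* 2E: CS segment override prefix.  */
  LEN_REL32_BRANCH = 5,  /* E8 or E9 plus rel32: call/jmp to the thunk.  */
  LEN_LFENCE = 3,        /* 0F AE E8.  */
  LEN_REX_PREFIX = 1     /* 41: REX.B, needed to name r8-r15.  */
};

/* Length of "call *%reg" or "jmp *%reg": opcode FF plus a ModRM byte,
   plus a REX prefix when REGNO is 8..15.  */
static int
indirect_branch_len (int regno)
{
  return 2 + (regno >= 8 ? LEN_REX_PREFIX : 0);
}

/* Length of the direct branch to the thunk.  The CS prefix is only
   added for r8-r15, mirroring the condition in the patch.  */
static int
thunk_branch_len (int regno, int cs_prefix)
{
  return LEN_REL32_BRANCH
	 + (cs_prefix && regno >= 8 ? LEN_CS_PREFIX : 0);
}

/* Nonzero if the assumed inline form (lfence plus indirect branch)
   fits in the bytes occupied by the branch to the thunk.  */
static int
inline_form_fits (int regno, int cs_prefix)
{
  return LEN_LFENCE + indirect_branch_len (regno)
	 <= thunk_branch_len (regno, cs_prefix);
}
```

Under these assumptions the legacy registers already fit in the 5-byte branch
(3 + 2 bytes), while r8-r15 need 3 + 3 = 6 bytes, which is only available
once the prefix is emitted.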

gcc/

PR target/102952
* config/i386/i386.c (ix86_output_jmp_thunk_or_indirect): Emit
CS prefix for -mindirect-branch-cs-prefix.
(ix86_output_indirect_branch_via_reg): Likewise.
* config/i386/i386.opt: Add -mindirect-branch-cs-prefix.
* doc/invoke.texi: Document -mindirect-branch-cs-prefix.

gcc/testsuite/

PR target/102952
* gcc.target/i386/indirect-thunk-cs-prefix-1.c: New test.
* gcc.target/i386/indirect-thunk-cs-prefix-2.c: Likewise.
---
 gcc/config/i386/i386.c|  6 ++
 gcc/config/i386/i386.opt  |  4 
 gcc/doc/invoke.texi   |  8 +++-
 .../gcc.target/i386/indirect-thunk-cs-prefix-1.c  | 14 ++
 .../gcc.target/i386/indirect-thunk-cs-prefix-2.c  | 15 +++
 5 files changed, 46 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-cs-prefix-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/indirect-thunk-cs-prefix-2.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 7e9b7bc347f..0a902d66321 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -15983,6 +15983,9 @@ ix86_output_jmp_thunk_or_indirect (const char 
*thunk_name, const int regno)
 {
   if (thunk_name != NULL)
 {
+  if (regno >= FIRST_REX_INT_REG
+ && ix86_indirect_branch_cs_prefix)
+   fprintf (asm_out_file, "\tcs\n");
   fprintf (asm_out_file, "\tjmp\t");
   assemble_name (asm_out_file, thunk_name);
   putc ('\n', asm_out_file);
@@ -16036,6 +16039,9 @@ ix86_output_indirect_branch_via_reg (rtx call_op, bool 
sibcall_p)
 {
   if (thunk_name != NULL)
{
+ if (regno >= FIRST_REX_INT_REG
+ && ix86_indirect_branch_cs_prefix)
+   fprintf (asm_out_file, "\tcs\n");
  fprintf (asm_out_file, "\tcall\t");
  assemble_name (asm_out_file, thunk_name);
  putc ('\n', asm_out_file);
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 8d499a5a4df..c5452c49597 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -1076,6 +1076,10 @@ Enum(indirect_branch) String(thunk-inline) 
Value(indirect_branch_thunk_inline)
 EnumValue
 Enum(indirect_branch) String(thunk-extern) Value(indirect_branch_thunk_extern)
 
+mindirect-branch-cs-prefix
+Target Var(ix86_indirect_branch_cs_prefix) Init(0)
+Add CS prefix to call and jmp to thunk when converting indirect call and jump.
+
 mindirect-branch-register
 Target Var(ix86_indirect_branch_register) Init(0)
 Force indirect call and jump via register.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index f3b4b467765..c992a7152f5 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1425,7 +1425,8 @@ See RS/6000 and PowerPC Options.
 -mstack-protector-guard-symbol=@var{symbol} @gol
 -mgeneral-regs-only  -mcall-ms2sysv-xlogues -mrelax-cmpxchg-loop @gol
 -mindirect-branch=@var{choice}  -mfunction-return=@var{choice} @gol
--mindirect-branch-register -mharden-sls=@var{choice} -mneeded}
+-mindirect-branch-register -mharden-sls=@var{choice} @gol
+-mindirect-branch-cs-prefix -mneeded}
 
 @emph{x86 Windows Options}
 @gccoptlist{-mconsole  -mcygwin  -mno-cygwin  -mdll @gol
@@ -32390,6 +32391,11 @@ hardening.  @samp{return} enables SLS hardening for 
function return.
 @samp{indirect-branch} enables SLS hardening for indirect branch.
 @samp{all} enables all SLS hardening.
 
+@item -mindirect-branch-cs-prefix
+@opindex mindirect-branch-cs-prefix
+Add CS prefix to call and jmp to thunk via r8-r15 registers when
+converting indirect call and jump.
+
 @end table
 
 These @samp{-m} switches are supported in addition to the above
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-cs-prefix-1.c 
b/gcc/testsuite/gcc.target/i386/indirect-thunk-cs-prefix-1.c
new file mode 100644
index 000..db2f3416823
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-cs-prefix-1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -ffixed-rax -ffixed-rbx -ffixed-rcx -ffixed-rdx 
-ffixed-rdi -ffixed-rsi -mindirect-branch-cs-prefix 
-mindirect-branch=thunk-extern" } */
+/* { dg-additional-options "-fno-pic" { target { ! *-*-darwin* } } } */
+
+extern void (*fptr) (void);
+
+void
+foo (void)
+{
+  fptr ();
+}
+
+/* { dg-final { scan-assembler-times "jmp\[ 
\t\]+_?__x86_indirect_thunk_r\[0-9\]+" 1 } } */
+/* { dg-final { scan-assembler-times "\tcs" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-cs-prefix-2.c 
b/gcc/testsuite/gcc.target/i386/indirect-thunk-cs-prefix-2.c
new file mode 100644
index 000..adfc39a49d4
--- /dev/null
+++ 

[PATCH] x86: Add -mharden-sls=[none|all|return|indirect-branch]

2021-11-16 Thread H.J. Lu via Gcc-patches
Add -mharden-sls= to mitigate against straight line speculation (SLS)
for function return and indirect branch by adding an INT3 instruction
after function return and indirect branch.
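
A stand-alone sketch of how the option values combine, for readers who want
to see the bit tests in isolation.  The enum is copied from the i386-opts.h
hunk in this patch; parse_harden_sls and return_template are hypothetical
helpers written for this note (in GCC the string-to-value mapping is
generated from the .opt records, and the real output templates also carry
the "%!" and "%;" operand punctuation, which is omitted here).

```c
#include <string.h>

/* Bit flags, copied from the i386-opts.h hunk of the patch.  */
enum harden_sls
{
  harden_sls_none = 0,
  harden_sls_return = 1 << 0,
  harden_sls_indirect_branch = 1 << 1,
  harden_sls_all = harden_sls_return | harden_sls_indirect_branch
};

/* Hypothetical helper: map the -mharden-sls= argument to its flag
   value.  Returns -1 for an unknown choice.  */
static int
parse_harden_sls (const char *arg)
{
  if (strcmp (arg, "none") == 0)
    return harden_sls_none;
  if (strcmp (arg, "return") == 0)
    return harden_sls_return;
  if (strcmp (arg, "indirect-branch") == 0)
    return harden_sls_indirect_branch;
  if (strcmp (arg, "all") == 0)
    return harden_sls_all;
  return -1;
}

/* Hypothetical helper modeled on the ix86_output_function_return
   change: with return hardening the return is followed by int3,
   otherwise the existing short or long return form is used.  */
static const char *
return_template (int flags, int long_p)
{
  if (flags & harden_sls_return)
    return "ret\n\tint3";
  return long_p ? "rep; ret" : "ret";
}
```

Because the two mitigations are independent bits, "all" is simply their
union, and each output routine only has to test the bit that concerns it.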

gcc/

PR target/102952
* config/i386/i386-opts.h (harden_sls): New enum.
* config/i386/i386.c (output_indirect_thunk): Mitigate against
SLS for function return.
(ix86_output_function_return): Likewise.
(ix86_output_jmp_thunk_or_indirect): Mitigate against indirect
branch.
(ix86_output_indirect_jmp): Likewise.
(ix86_output_call_insn): Likewise.
* config/i386/i386.opt: Add -mharden-sls=.
* doc/invoke.texi: Document -mharden-sls=.

gcc/testsuite/

PR target/102952
* gcc.target/i386/harden-sls-1.c: New test.
* gcc.target/i386/harden-sls-2.c: Likewise.
* gcc.target/i386/harden-sls-3.c: Likewise.
* gcc.target/i386/harden-sls-4.c: Likewise.
---
 gcc/config/i386/i386-opts.h  |  7 +
 gcc/config/i386/i386.c   | 30 
 gcc/config/i386/i386.opt | 20 +
 gcc/doc/invoke.texi  | 10 ++-
 gcc/testsuite/gcc.target/i386/harden-sls-1.c | 14 +
 gcc/testsuite/gcc.target/i386/harden-sls-2.c | 14 +
 gcc/testsuite/gcc.target/i386/harden-sls-3.c | 14 +
 gcc/testsuite/gcc.target/i386/harden-sls-4.c | 14 +
 8 files changed, 116 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/harden-sls-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/harden-sls-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/harden-sls-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/harden-sls-4.c

diff --git a/gcc/config/i386/i386-opts.h b/gcc/config/i386/i386-opts.h
index 04e4ad608fb..171d3106d0a 100644
--- a/gcc/config/i386/i386-opts.h
+++ b/gcc/config/i386/i386-opts.h
@@ -121,4 +121,11 @@ enum instrument_return {
   instrument_return_nop5
 };
 
+enum harden_sls {
+  harden_sls_none = 0,
+  harden_sls_return = 1 << 0,
+  harden_sls_indirect_branch = 1 << 1,
+  harden_sls_all = harden_sls_return | harden_sls_indirect_branch
+};
+
 #endif
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index cc9f9322fad..0a902d66321 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -5914,6 +5914,8 @@ output_indirect_thunk (unsigned int regno)
 }
 
   fputs ("\tret\n", asm_out_file);
+  if ((ix86_harden_sls & harden_sls_return))
+fputs ("\tint3\n", asm_out_file);
 }
 
 /* Output a function with a call and return thunk for indirect branch.
@@ -15987,6 +15989,8 @@ ix86_output_jmp_thunk_or_indirect (const char 
*thunk_name, const int regno)
   fprintf (asm_out_file, "\tjmp\t");
   assemble_name (asm_out_file, thunk_name);
   putc ('\n', asm_out_file);
+  if ((ix86_harden_sls & harden_sls_indirect_branch))
+   fputs ("\tint3\n", asm_out_file);
 }
   else
 output_indirect_thunk (regno);
@@ -16212,10 +16216,14 @@ ix86_output_indirect_jmp (rtx call_op)
gcc_unreachable ();
 
   ix86_output_indirect_branch (call_op, "%0", true);
-  return "";
+  if ((ix86_harden_sls & harden_sls_indirect_branch))
+   return "int3";
+  else
+   return "";
 }
   else
-return "%!jmp\t%A0";
+return ((ix86_harden_sls & harden_sls_indirect_branch)
+   ? "%!jmp\t%A0\n\tint3" : "%!jmp\t%A0");
 }
 
 /* Output return instrumentation for current function if needed.  */
@@ -16283,10 +16291,15 @@ ix86_output_function_return (bool long_p)
   return "";
 }
 
-  if (!long_p)
-return "%!ret";
+  if ((ix86_harden_sls & harden_sls_return))
+return "%!ret\n\tint3";
+  else
+{
+  if (!long_p)
+   return "%!ret";
 
-  return "rep%; ret";
+  return "rep%; ret";
+}
 }
 
 /* Output indirect function return.  RET_OP is the function return
@@ -16381,7 +16394,12 @@ ix86_output_call_insn (rtx_insn *insn, rtx call_op)
   if (output_indirect_p && !direct_p)
ix86_output_indirect_branch (call_op, xasm, true);
   else
-   output_asm_insn (xasm, &call_op);
+   {
+ output_asm_insn (xasm, &call_op);
+ if (!direct_p
+ && (ix86_harden_sls & harden_sls_indirect_branch))
+   return "int3";
+   }
   return "";
 }
 
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index b38ac13fc91..c5452c49597 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -1121,6 +1121,26 @@ mrecord-return
 Target Var(ix86_flag_record_return) Init(0)
 Generate a __return_loc section pointing to all return instrumentation code.
 
+mharden-sls=
+Target RejectNegative Joined Enum(harden_sls) Var(ix86_harden_sls) 
Init(harden_sls_none)
+Generate code to mitigate against straight line speculation.
+
+Enum
+Name(harden_sls) Type(enum harden_sls)
+Known choices for mitigation against straight line speculation with 
-mharden-sls=:
+

Re: [PATCH] rs6000: MMA test case emits wrong code when building a vector pair

2021-11-16 Thread Peter Bergner via Gcc-patches
On 11/13/21 7:25 AM, Segher Boessenkool wrote:
> On Wed, Oct 27, 2021 at 08:37:57PM -0500, Peter Bergner wrote:
>> PR102976 shows a test case where we generate wrong code when building
>> a vector pair from 2 vector registers.  The bug here is that with unlucky
>> register assignments, we can clobber one of the input operands before
>> we write both registers of the output operand.  The solution is to use
>> early-clobbers in the assemble pair and accumulator patterns.
> 
> Because of what insns there are after the split.  Aha.
> 
> Please add a comment explaining this, near the earlyclobber itself.

Done for both patterns.



> You can just write this as {\mxxlor \d+,44,44\M} etc., that will be
> simplest I think.

Done and tested that it still works.


> Okay for trunk with comments added near the earlyclobber, and the RE
> improved.  Also fine for 11 after some burn-in.  Thanks!

Ok, I pushed with both changes.  I'll push a change to GCC11 in a few days.
Thanks!

Peter




Re: [PATCH 4/5] if-conv: Apply VN to hoisted conversions

2021-11-16 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches  writes:
> On Mon, Nov 15, 2021 at 3:00 PM Richard Sandiford
>  wrote:
>>
>> Richard Biener via Gcc-patches  writes:
>> > On Fri, Nov 12, 2021 at 7:05 PM Richard Sandiford via Gcc-patches
>> >  wrote:
>> >>
>> >> This patch is a prerequisite for a later one.  At the moment,
>> >> if-conversion converts predicated POINTER_PLUS_EXPRs into
>> >> non-wrapping forms, which for:
>> >>
>> >> … = base + offset
>> >>
>> >> becomes:
>> >>
>> >> tmp = (unsigned long) base
>> >> … = tmp + offset
>> >>
>> >> It then hoists these conversions out of the loop where possible.
>> >>
>> >> However, because “base” is a valid gimple operand, there can be
>> >> multiple POINTER_PLUS_EXPRs with the same base, which can in turn
>> >> lead to multiple instances of the same conversion.  The later VN pass
>> >> is (and I think needs to be) restricted to the new if-converted code,
>> >> whereas here we're deliberately inserting the conversions before the
>> >> .LOOP_VECTORIZED condition:
>> >>
>> >> /* If we versioned loop then make sure to insert invariant
>> >>stmts before the .LOOP_VECTORIZED check since the vectorizer
>> >>will re-use that for things like runtime alias versioning
>> >>whose condition can end up using those invariants.  */
>> >>
>> >> We can therefore enter the vectoriser with redundant conversions.
>> >>
>> >> The easiest fix seemed to be to defer the hoisting until after VN.
>> >> This catches other hoisting opportunities too.
>> >>
>> >> Hoisting the code from the (artificial) loop in pr99102.c means
>> >> that it's no longer worth vectorising.  The patch forces vectorisation
>> >> instead of relying on the cost model.
>> >>
>> >> The patch also reverts pr87007-4.c and pr87007-5.c back to their
>> >> original forms, undoing changes in 783dc66f9ccb0019c3dad.
>> >> The code at the time the tests were added was:
>> >>
>> >> testl   %edi, %edi
>> >> je  .L10
>> >> vxorps  %xmm1, %xmm1, %xmm1
>> >> vsqrtsd d3(%rip), %xmm1, %xmm0
>> >> vsqrtsd d2(%rip), %xmm1, %xmm1
>> >> ...
>> >> .L10:
>> >> ret
>> >>
>> >> with the operations being hoisted, and the vxorps was specifically
>> >> wanted (compared to the previous code).  This patch restores the code
>> >> to that form, with the hoisted operations and the vxorps.
>> >>
>> >> Regstrapped on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>> >>
>> >> Richard
>> >>
>> >>
>> >> gcc/
>> >> * tree-if-conv.c: Include tree-eh.h.
>> >> (predicate_statements): Remove pe argument.  Don't hoist
>> >> statements here.
>> >> (combine_blocks): Remove pe argument.
>> >> (ifcvt_can_hoist, ifcvt_can_hoist_further): New functions.
>> >> (ifcvt_hoist_invariants): Likewise.
>> >> (tree_if_conversion): Update call to combine_blocks.  Call
>> >> ifcvt_hoist_invariants after VN.
>> >>
>> >> gcc/testsuite/
>> >> * gcc.dg/vect/pr99102.c: Add -fno-vect-cost-model.
>> >>
>> >> Revert:
>> >>
>> >> 2020-09-09  Richard Biener  
>> >>
>> >> * gcc.target/i386/pr87007-4.c: Adjust.
>> >> * gcc.target/i386/pr87007-5.c: Likewise.
>> >> ---
>> >>  gcc/testsuite/gcc.dg/vect/pr99102.c   |   2 +-
>> >>  gcc/testsuite/gcc.target/i386/pr87007-4.c |   2 +-
>> >>  gcc/testsuite/gcc.target/i386/pr87007-5.c |   2 +-
>> >>  gcc/tree-if-conv.c| 122 --
>> >>  4 files changed, 114 insertions(+), 14 deletions(-)
>> >>
>> >> diff --git a/gcc/testsuite/gcc.dg/vect/pr99102.c 
>> >> b/gcc/testsuite/gcc.dg/vect/pr99102.c
>> >> index 6c1a13f0783..0d030d15c86 100644
>> >> --- a/gcc/testsuite/gcc.dg/vect/pr99102.c
>> >> +++ b/gcc/testsuite/gcc.dg/vect/pr99102.c
>> >> @@ -1,4 +1,4 @@
>> >> -/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-details" } */
>> >> +/* { dg-options "-O2 -ftree-vectorize -fno-vect-cost-model 
>> >> -fdump-tree-vect-details" } */
>> >>  /* { dg-additional-options "-msve-vector-bits=256" { target 
>> >> aarch64_sve256_hw } } */
>> >>  long a[44];
>> >>  short d, e = -7;
>> >> diff --git a/gcc/testsuite/gcc.target/i386/pr87007-4.c 
>> >> b/gcc/testsuite/gcc.target/i386/pr87007-4.c
>> >> index 9c4b8005af3..e91bdcbac44 100644
>> >> --- a/gcc/testsuite/gcc.target/i386/pr87007-4.c
>> >> +++ b/gcc/testsuite/gcc.target/i386/pr87007-4.c
>> >> @@ -15,4 +15,4 @@ foo (int n, int k)
>> >>d1 = ceil (d3);
>> >>  }
>> >>
>> >> -/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 0 } } 
>> >> */
>> >> +/* { dg-final { scan-assembler-times "vxorps\[^\n\r\]*xmm\[0-9\]" 1 } } 
>> >> */
>> >> diff --git a/gcc/testsuite/gcc.target/i386/pr87007-5.c 
>> >> b/gcc/testsuite/gcc.target/i386/pr87007-5.c
>> >> index e4d956a5d7f..20d13cf650b 100644
>> >> --- a/gcc/testsuite/gcc.target/i386/pr87007-5.c
>> >> +++ b/gcc/testsuite/gcc.target/i386/pr87007-5.c
>> >> @@ -15,4 +15,4 @@ foo (int n, int k)
>> >>  

Re: [musl] Re: [PATCH v2] configure: define TARGET_LIBC_GNUSTACK on musl

2021-11-16 Thread Ilya Lipnitskiy via Gcc-patches
On Tue, Nov 16, 2021 at 8:41 AM Rich Felker  wrote:
>
> On Tue, Nov 16, 2021 at 03:40:00PM +0100, Dragan Mladjenovic wrote:
> > Hi,
> >
> > Looks fine to me. If possible, maybe it should even be back-ported
> > to stable branches.
The change cherry-picks fine onto 10.x and 11.x branches. Should I
send out separate patches or can the committer of this patch apply it
to 10.x and 11.x?
> >
> > Not sure if MIPS assembly sources (if any) in musl would need
> > explicit .note.GNU-stack
> >
> > to complement this?
>
> What are the actual consequences of making this change, and what is
> the goal? I'm concerned that it might produce object files which don't
> include annotation that they don't need executable stack, in which
> case the final executable file will be marked as executable-stack and
> the kernel will load it as such. That would be very bad.
It is actually the other way around - for MIPS hard-float targets on
non-glibc (or glibc < 2.31) without this change the .note.GNU-stack
annotation is not emitted by GCC.

Ilya
>
> Rich
>
>
> > On 16-Nov-21 06:13, Ilya Lipnitskiy wrote:
> > >musl only uses PT_GNU_STACK to set default thread stack size and has no
> > >executable stack support[0], so there is no reason not to emit the
> > >.note.GNU-stack section on musl builds.
> > >
> > >[0]: 
> > >https://lore.kernel.org/all/20190423192534.gn23...@brightrain.aerifal.cx/T/#u
> > >
> > >gcc/ChangeLog:
> > >
> > > * configure: Regenerate.
> > > * configure.ac: define TARGET_LIBC_GNUSTACK on musl
> > >
> > >Signed-off-by: Ilya Lipnitskiy 
> > >---
> > >  gcc/configure| 3 +++
> > >  gcc/configure.ac | 3 +++
> > >  2 files changed, 6 insertions(+)
> > >
> > >diff --git a/gcc/configure b/gcc/configure
> > >index 74b9d9be4c85..7091a838aefa 100755
> > >--- a/gcc/configure
> > >+++ b/gcc/configure
> > >@@ -31275,6 +31275,9 @@ fi
> > >  # Check if the target LIBC handles PT_GNU_STACK.
> > >  gcc_cv_libc_gnustack=unknown
> > >  case "$target" in
> > >+  mips*-*-linux-musl*)
> > >+gcc_cv_libc_gnustack=yes
> > >+;;
> > >mips*-*-linux*)
> > >  if test $glibc_version_major -gt 2 \
> > >diff --git a/gcc/configure.ac b/gcc/configure.ac
> > >index c9ee1fb8919e..8a2d34179a75 100644
> > >--- a/gcc/configure.ac
> > >+++ b/gcc/configure.ac
> > >@@ -6961,6 +6961,9 @@ fi
> > >  # Check if the target LIBC handles PT_GNU_STACK.
> > >  gcc_cv_libc_gnustack=unknown
> > >  case "$target" in
> > >+  mips*-*-linux-musl*)
> > >+gcc_cv_libc_gnustack=yes
> > >+;;
> > >mips*-*-linux*)
> > >  GCC_GLIBC_VERSION_GTE_IFELSE([2], [31], [gcc_cv_libc_gnustack=yes], )
> > >  ;;


[PATCH] rs6000: Better error messages for power8/9-vector builtins

2021-11-16 Thread Bill Schmidt via Gcc-patches
Hi!  During a previous patch review, Segher asked that I provide better
messages when builtins are unavailable because they require both a minimum
CPU and the enablement of VSX instructions.  This patch does just that.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this okay for trunk?

Thanks!
Bill


2021-11-11  Bill Schmidt  

gcc/
* config/rs6000/rs6000-call.c (rs6000_invalid_new_builtin): Change
error messages for ENB_P8V and ENB_P9V.
---
 gcc/config/rs6000/rs6000-call.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 85fec80c6d7..035266eb001 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -11943,7 +11943,8 @@ rs6000_invalid_new_builtin (enum rs6000_gen_builtins fncode)
   error ("%qs requires the %qs option", name, "-mcpu=power8");
   break;
 case ENB_P8V:
-  error ("%qs requires the %qs option", name, "-mpower8-vector");
+  error ("%qs requires the %qs and %qs options", name, "-mcpu=power8",
+"-mvsx");
   break;
 case ENB_P9:
   error ("%qs requires the %qs option", name, "-mcpu=power9");
@@ -11953,7 +11954,8 @@ rs6000_invalid_new_builtin (enum rs6000_gen_builtins fncode)
 name, "-mcpu=power9", "-m64", "-mpowerpc64");
   break;
 case ENB_P9V:
-  error ("%qs requires the %qs option", name, "-mpower9-vector");
+  error ("%qs requires the %qs and %qs options", name, "-mcpu=power9",
+"-mvsx");
   break;
 case ENB_IEEE128_HW:
   error ("%qs requires ISA 3.0 IEEE 128-bit floating point", name);
-- 
2.27.0




Re: [PATCH] rs6000: Add [power6-64] stanza to new builtin support

2021-11-16 Thread Bill Schmidt via Gcc-patches
Sorry, I forgot to CC maintainers on this one.

Thanks!
Bill

On 11/16/21 11:06 AM, Bill Schmidt wrote:
> Hi!  While reviewing the recent 32-bit changes for the new builtin 
> infrastructure,
> I realized that I needed another stanza to represent builtins requiring both
> -mcpu=power6 and -mpowerpc64.  (There's only one of these, but nonetheless...)
> So this patch adds that support in the same fashion as [power7-64] and
> [power9-64].  Bootstrapped and tested on powerpc64le-linux-gnu, and on
> powerpc64-linux-gnu with -m32/-m64.  Is this okay for trunk?
>
> Thanks!
> Bill
>
>
> 2021-11-16  Bill Schmidt  
>
> gcc/
>   * config/rs6000/rs6000-builtin-new.def: Add power6-64 stanza.
>   Move CMPB to power6-64 stanza.
>   * config/rs6000/rs6000-call.c (rs6000_invalid_new_builtin): Handle
>   ENB_P6_64 case.
>   (rs6000_new_builtin_is_supported): Likewise.
>   (rs6000_expand_new_builtin): Likewise.
>   (rs6000_init_builtins): Likewise.
>   * config/rs6000/rs6000-gen-builtins.c (bif_stanza): Add
>   BSTZ_P6_64.
>   (stanza_map): Add entry mapping power6-64 to BSTZ_P6_64.
>   (enable_string): Add "ENB_P6_64".
>   (write_decls): Add ENB_P6_64 to bif_enable enum.
> ---
>  gcc/config/rs6000/rs6000-builtin-new.def |  9 ++---
>  gcc/config/rs6000/rs6000-call.c  | 10 ++
>  gcc/config/rs6000/rs6000-gen-builtins.c  |  4 
>  3 files changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/config/rs6000/rs6000-builtin-new.def 
> b/gcc/config/rs6000/rs6000-builtin-new.def
> index 1dd8f6b40b2..58dfce1ca37 100644
> --- a/gcc/config/rs6000/rs6000-builtin-new.def
> +++ b/gcc/config/rs6000/rs6000-builtin-new.def
> @@ -266,13 +266,16 @@
>  
>  ; Power6 builtins (ISA 2.05).
>  [power6]
> -  const signed long __builtin_p6_cmpb (signed long, signed long);
> -CMPB cmpbdi3 {}
> -
>const signed int __builtin_p6_cmpb_32 (signed int, signed int);
>  CMPB_32 cmpbsi3 {}
>  
>  
> +; Power6 builtins requiring 64-bit GPRs (even with 32-bit addressing).
> +[power6-64]
> +  const signed long __builtin_p6_cmpb (signed long, signed long);
> +CMPB cmpbdi3 {}
> +
> +
>  ; AltiVec builtins.
>  [altivec]
>const vsc __builtin_altivec_abs_v16qi (vsc);
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index 83e1abb6118..822a9736591 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -11919,6 +11919,10 @@ rs6000_invalid_new_builtin (enum rs6000_gen_builtins 
> fncode)
>  case ENB_P6:
>error ("%qs requires the %qs option", name, "-mcpu=power6");
>break;
> +case ENB_P6_64:
> +  error ("%qs requires the %qs option and either the %qs or %qs option",
> +  name, "-mcpu=power6", "-m64", "-mpowerpc64");
> +  break;
>  case ENB_ALTIVEC:
>error ("%qs requires the %qs option", name, "-maltivec");
>break;
> @@ -13346,6 +13350,8 @@ rs6000_new_builtin_is_supported (enum 
> rs6000_gen_builtins fncode)
>return TARGET_POPCNTB;
>  case ENB_P6:
>return TARGET_CMPB;
> +case ENB_P6_64:
> +  return TARGET_CMPB && TARGET_POWERPC64;
>  case ENB_P7:
>return TARGET_POPCNTD;
>  case ENB_P7_64:
> @@ -15697,6 +15703,8 @@ rs6000_expand_new_builtin (tree exp, rtx target,
>if (!(e == ENB_ALWAYS
>   || (e == ENB_P5 && TARGET_POPCNTB)
>   || (e == ENB_P6 && TARGET_CMPB)
> + || (e == ENB_P6_64  && TARGET_CMPB
> + && TARGET_POWERPC64)
>   || (e == ENB_ALTIVEC&& TARGET_ALTIVEC)
>   || (e == ENB_CELL   && TARGET_ALTIVEC
>   && rs6000_cpu == PROCESSOR_CELL)
> @@ -16419,6 +16427,8 @@ rs6000_init_builtins (void)
>   continue;
> if (e == ENB_P6 && !TARGET_CMPB)
>   continue;
> +   if (e == ENB_P6_64 && !(TARGET_CMPB && TARGET_POWERPC64))
> + continue;
> if (e == ENB_ALTIVEC && !TARGET_ALTIVEC)
>   continue;
> if (e == ENB_VSX && !TARGET_VSX)
> diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c 
> b/gcc/config/rs6000/rs6000-gen-builtins.c
> index 1655a2fd765..4ce83bd2290 100644
> --- a/gcc/config/rs6000/rs6000-gen-builtins.c
> +++ b/gcc/config/rs6000/rs6000-gen-builtins.c
> @@ -212,6 +212,7 @@ enum bif_stanza
>   BSTZ_ALWAYS,
>   BSTZ_P5,
>   BSTZ_P6,
> + BSTZ_P6_64,
>   BSTZ_ALTIVEC,
>   BSTZ_CELL,
>   BSTZ_VSX,
> @@ -245,6 +246,7 @@ static stanza_entry stanza_map[NUMBIFSTANZAS] =
>  { "always",  BSTZ_ALWAYS },
>  { "power5",  BSTZ_P5 },
>  { "power6",  BSTZ_P6 },
> +{ "power6-64",   BSTZ_P6_64  },
>  { "altivec", BSTZ_ALTIVEC},
>  { "cell",BSTZ_CELL   },
>  { "vsx", BSTZ_VSX},
> @@ -269,6 +271,7 @@ static const char *enable_string[NUMBIFSTANZAS] =
>  "ENB_ALWAYS",
>  "ENB_P5",
>  "ENB_P6",
> +"ENB_P6_64",
>  

[PATCH] rs6000: Add [power6-64] stanza to new builtin support

2021-11-16 Thread Bill Schmidt via Gcc-patches
Hi!  While reviewing the recent 32-bit changes for the new builtin 
infrastructure,
I realized that I needed another stanza to represent builtins requiring both
-mcpu=power6 and -mpowerpc64.  (There's only one of these, but nonetheless...)
So this patch adds that support in the same fashion as [power7-64] and
[power9-64].  Bootstrapped and tested on powerpc64le-linux-gnu, and on
powerpc64-linux-gnu with -m32/-m64.  Is this okay for trunk?

Thanks!
Bill


2021-11-16  Bill Schmidt  

gcc/
* config/rs6000/rs6000-builtin-new.def: Add power6-64 stanza.
Move CMPB to power6-64 stanza.
* config/rs6000/rs6000-call.c (rs6000_invalid_new_builtin): Handle
ENB_P6_64 case.
(rs6000_new_builtin_is_supported): Likewise.
(rs6000_expand_new_builtin): Likewise.
(rs6000_init_builtins): Likewise.
* config/rs6000/rs6000-gen-builtins.c (bif_stanza): Add
BSTZ_P6_64.
(stanza_map): Add entry mapping power6-64 to BSTZ_P6_64.
(enable_string): Add "ENB_P6_64".
(write_decls): Add ENB_P6_64 to bif_enable enum.
---
 gcc/config/rs6000/rs6000-builtin-new.def |  9 ++---
 gcc/config/rs6000/rs6000-call.c  | 10 ++
 gcc/config/rs6000/rs6000-gen-builtins.c  |  4 
 3 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtin-new.def b/gcc/config/rs6000/rs6000-builtin-new.def
index 1dd8f6b40b2..58dfce1ca37 100644
--- a/gcc/config/rs6000/rs6000-builtin-new.def
+++ b/gcc/config/rs6000/rs6000-builtin-new.def
@@ -266,13 +266,16 @@
 
 ; Power6 builtins (ISA 2.05).
 [power6]
-  const signed long __builtin_p6_cmpb (signed long, signed long);
-CMPB cmpbdi3 {}
-
   const signed int __builtin_p6_cmpb_32 (signed int, signed int);
 CMPB_32 cmpbsi3 {}
 
 
+; Power6 builtins requiring 64-bit GPRs (even with 32-bit addressing).
+[power6-64]
+  const signed long __builtin_p6_cmpb (signed long, signed long);
+CMPB cmpbdi3 {}
+
+
 ; AltiVec builtins.
 [altivec]
   const vsc __builtin_altivec_abs_v16qi (vsc);
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 83e1abb6118..822a9736591 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -11919,6 +11919,10 @@ rs6000_invalid_new_builtin (enum rs6000_gen_builtins fncode)
 case ENB_P6:
   error ("%qs requires the %qs option", name, "-mcpu=power6");
   break;
+case ENB_P6_64:
+  error ("%qs requires the %qs option and either the %qs or %qs option",
+name, "-mcpu=power6", "-m64", "-mpowerpc64");
+  break;
 case ENB_ALTIVEC:
   error ("%qs requires the %qs option", name, "-maltivec");
   break;
@@ -13346,6 +13350,8 @@ rs6000_new_builtin_is_supported (enum rs6000_gen_builtins fncode)
   return TARGET_POPCNTB;
 case ENB_P6:
   return TARGET_CMPB;
+case ENB_P6_64:
+  return TARGET_CMPB && TARGET_POWERPC64;
 case ENB_P7:
   return TARGET_POPCNTD;
 case ENB_P7_64:
@@ -15697,6 +15703,8 @@ rs6000_expand_new_builtin (tree exp, rtx target,
   if (!(e == ENB_ALWAYS
|| (e == ENB_P5 && TARGET_POPCNTB)
|| (e == ENB_P6 && TARGET_CMPB)
+   || (e == ENB_P6_64  && TARGET_CMPB
+   && TARGET_POWERPC64)
|| (e == ENB_ALTIVEC&& TARGET_ALTIVEC)
|| (e == ENB_CELL   && TARGET_ALTIVEC
&& rs6000_cpu == PROCESSOR_CELL)
@@ -16419,6 +16427,8 @@ rs6000_init_builtins (void)
continue;
  if (e == ENB_P6 && !TARGET_CMPB)
continue;
+ if (e == ENB_P6_64 && !(TARGET_CMPB && TARGET_POWERPC64))
+   continue;
  if (e == ENB_ALTIVEC && !TARGET_ALTIVEC)
continue;
  if (e == ENB_VSX && !TARGET_VSX)
diff --git a/gcc/config/rs6000/rs6000-gen-builtins.c b/gcc/config/rs6000/rs6000-gen-builtins.c
index 1655a2fd765..4ce83bd2290 100644
--- a/gcc/config/rs6000/rs6000-gen-builtins.c
+++ b/gcc/config/rs6000/rs6000-gen-builtins.c
@@ -212,6 +212,7 @@ enum bif_stanza
  BSTZ_ALWAYS,
  BSTZ_P5,
  BSTZ_P6,
+ BSTZ_P6_64,
  BSTZ_ALTIVEC,
  BSTZ_CELL,
  BSTZ_VSX,
@@ -245,6 +246,7 @@ static stanza_entry stanza_map[NUMBIFSTANZAS] =
 { "always",BSTZ_ALWAYS },
 { "power5",BSTZ_P5 },
 { "power6",BSTZ_P6 },
+{ "power6-64", BSTZ_P6_64  },
 { "altivec",   BSTZ_ALTIVEC},
 { "cell",  BSTZ_CELL   },
 { "vsx",   BSTZ_VSX},
@@ -269,6 +271,7 @@ static const char *enable_string[NUMBIFSTANZAS] =
 "ENB_ALWAYS",
 "ENB_P5",
 "ENB_P6",
+"ENB_P6_64",
 "ENB_ALTIVEC",
 "ENB_CELL",
 "ENB_VSX",
@@ -2227,6 +2230,7 @@ write_decls (void)
   fprintf (header_file, "  ENB_ALWAYS,\n");
   fprintf (header_file, "  ENB_P5,\n");
   fprintf (header_file, "  ENB_P6,\n");
+  fprintf (header_file, "  ENB_P6_64,\n");
   fprintf (header_file, "  

Re: [PATCH RFC] c-family: don't cache large vecs

2021-11-16 Thread Marek Polacek via Gcc-patches
On Tue, Nov 16, 2021 at 11:53:14AM -0500, Jason Merrill via Gcc-patches wrote:
> Patrick observed recently that an element of the vector cache could be
> arbitrarily large.  Let's only cache relatively small vecs.
> 
> This has no effect on compiling the libstdc++ stdc++.h, presumably because
> nothing in the library requires a vec that large.  I figure that this makes it
> more likely that a subsequent long list will reuse the same memory when the
> later vec gets expanded.
> 
> Does this make sense to others?

Looks good to me.
 
> gcc/c-family/ChangeLog:
> 
>   * c-common.c (release_tree_vector): Only cache vecs smaller than
>   16 elements.
> ---
>  gcc/c-family/c-common.c | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
> index 436df45df68..90e8ec87b6b 100644
> --- a/gcc/c-family/c-common.c
> +++ b/gcc/c-family/c-common.c
> @@ -8213,8 +8213,16 @@ release_tree_vector (vec *vec)
>  {
>if (vec != NULL)
>  {
> -  vec->truncate (0);
> -  vec_safe_push (tree_vector_cache, vec);
> +  if (vec->allocated () >= 16)
> + /* Don't cache vecs that have expanded more than once.  On a p64
> +target, vecs double in alloc size with each power of 2 elements, e.g
> +at 16 elements the alloc increases from 128 to 256 bytes.  */
> + vec_free (vec);
> +  else
> + {
> +   vec->truncate (0);
> +   vec_safe_push (tree_vector_cache, vec);
> + }
>  }
>  }
>  
> 
> base-commit: 132f1c27770fa6dafdf14591878d301aedd5ae16
> -- 
> 2.27.0
> 

Marek



Re: [PATCH 06/15] visium: Fix non-robust split condition in define_insn_and_split

2021-11-16 Thread Eric Botcazou via Gcc-patches
> gcc/ChangeLog:
> 
>  * config/visium/visium.md (*add3_insn, *addsi3_insn, *addi3_insn,
>   *sub3_insn, *subsi3_insn, *subdi3_insn, *neg2_insn,
>   *negdi2_insn, *and3_insn, *ior3_insn, *xor3_insn,
>   *one_cmpl2_insn, *ashl3_insn, *ashr3_insn,
>   *lshr3_insn, *trunchiqi2_insn, *truncsihi2_insn,
>   *truncdisi2_insn, *extendqihi2_insn, *extendqisi2_insn,
>   *extendhisi2_insn, *extendsidi2_insn, *zero_extendqihi2_insn,
>*zero_extendqisi2_insn, *zero_extendsidi2_insn): Fix split condition.

OK for mainline, thanks.

-- 
Eric Botcazou




Re: [committed 2/2] libstdc++: Implement constexpr std::basic_string for C++20

2021-11-16 Thread Jonathan Wakely via Gcc-patches
Oops, the subject line was not supposed to say 2/2 for this commit, and I
was not supposed to have Michael de Lang as the author ... I messed up my
git send-email and git cherry-pick commands!

Sorry Michael, I originally tried to use your tests from
https://github.com/Oipo/gcc/ but as noted in https://gcc.gnu.org/PR93989
those tests are incorrect, and so I didn't actually use any of them (nor
the std::string code itself). But apparently the commit still had you as
the author, because I reset the content of the git tree, but not the commit
author. I'll fix that in GCC's ChangeLog file after it regenerates
overnight.



On Tue, 16 Nov 2021 at 16:47, Jonathan Wakely wrote:

> From: Michael de Lang
>
> Tested x86_64-linux, committed to trunk.
>
>
> This is only supported for the cxx11 ABI, not for COW strings.
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/basic_string.h (basic_string, operator""s): Add
> constexpr for C++20.
> (basic_string::basic_string(basic_string&&)): Only copy
> initialized portion of the buffer.
> (basic_string::basic_string(basic_string&&, const Alloc&)):
> Likewise.
> * include/bits/basic_string.tcc (basic_string): Add constexpr
> for C++20.
> (basic_string::swap(basic_string&)): Only copy initialized
> portions of the buffers.
> (basic_string::_M_replace): Add constexpr implementation that
> doesn't depend on pointer comparisons.
> * include/bits/cow_string.h: Adjust comment.
> * include/ext/type_traits.h (__is_null_pointer): Add constexpr.
> * include/std/string (erase, erase_if): Add constexpr.
> * include/std/version (__cpp_lib_constexpr_string): Update
> value.
> * testsuite/21_strings/basic_string/cons/char/constexpr.cc:
> New test.
> * testsuite/21_strings/basic_string/cons/wchar_t/constexpr.cc:
> New test.
> * testsuite/21_strings/basic_string/literals/constexpr.cc:
> New test.
> * testsuite/21_strings/basic_string/modifiers/constexpr.cc: New
> test.
> *
> testsuite/21_strings/basic_string/modifiers/swap/char/constexpr.cc:
> New test.
> *
> testsuite/21_strings/basic_string/modifiers/swap/wchar_t/constexpr.cc:
> New test.
> * testsuite/21_strings/basic_string/version.cc: New test.
>
>
>


[PATCH RFC] c-family: don't cache large vecs

2021-11-16 Thread Jason Merrill via Gcc-patches
Patrick observed recently that an element of the vector cache could be
arbitrarily large.  Let's only cache relatively small vecs.

This has no effect on compiling the libstdc++ stdc++.h, presumably because
nothing in the library requires a vec that large.  I figure that this makes it
more likely that a subsequent long list will reuse the same memory when the
later vec gets expanded.

Does this make sense to others?

gcc/c-family/ChangeLog:

* c-common.c (release_tree_vector): Only cache vecs smaller than
16 elements.
---
 gcc/c-family/c-common.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 436df45df68..90e8ec87b6b 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -8213,8 +8213,16 @@ release_tree_vector (vec *vec)
 {
   if (vec != NULL)
 {
-  vec->truncate (0);
-  vec_safe_push (tree_vector_cache, vec);
+  if (vec->allocated () >= 16)
+   /* Don't cache vecs that have expanded more than once.  On a p64
+  target, vecs double in alloc size with each power of 2 elements, e.g
+  at 16 elements the alloc increases from 128 to 256 bytes.  */
+   vec_free (vec);
+  else
+   {
+ vec->truncate (0);
+ vec_safe_push (tree_vector_cache, vec);
+   }
 }
 }
 

base-commit: 132f1c27770fa6dafdf14591878d301aedd5ae16
-- 
2.27.0
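
For readers unfamiliar with the vector cache, the idea in the patch can be
sketched outside of GCC as follows.  This is only an illustration under
stated assumptions: std::vector stands in for GCC's vec, and the names
small_vec_cache, acquire and release are hypothetical, not part of
c-common.c.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the tree vector cache; std::vector replaces
// GCC's vec<> here, so this is not the real API.
class small_vec_cache
{
public:
  using buffer = std::vector<int>;

  explicit small_vec_cache (std::size_t limit) : m_limit (limit) {}

  // Hand out a cached buffer if one is available, else a fresh one.
  std::unique_ptr<buffer> acquire ()
  {
    if (m_cache.empty ())
      return std::make_unique<buffer> ();
    std::unique_ptr<buffer> v = std::move (m_cache.back ());
    m_cache.pop_back ();
    return v;
  }

  // Keep small buffers for reuse; let large ones be freed.
  void release (std::unique_ptr<buffer> v)
  {
    if (!v)
      return;
    if (v->capacity () >= m_limit)
      return;			// v is destroyed here and its memory freed.
    v->clear ();		// Drops the elements but keeps the storage.
    m_cache.push_back (std::move (v));
  }

  // Number of buffers currently held for reuse.
  std::size_t cached () const { return m_cache.size (); }

private:
  std::size_t m_limit;
  std::vector<std::unique_ptr<buffer>> m_cache;
};
```

With a limit of 16, a buffer that grew to 64 elements is freed on release
rather than pinned in the cache, while a buffer that stayed small is
recycled by the next acquire.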



[PATCH] Do not abort compilation when dump file is /dev/*

2021-11-16 Thread Giuliano Belinassi via Gcc-patches
The `configure` scripts generated with autoconf often test compiler
features by setting the output to `/dev/null`, which then sets the dump
folder to /dev/*, and the compilation halts with an error because GCC
cannot create files in /dev/. This is a problem when configure is
testing for compiler features because it cannot tell whether the failure
was due to an unsupported feature or to some other problem, so it
disables the feature even if it works.

As an example, running configure with CFLAGS="-fdump-ipa-clones"
results in several compiler features being reported as disabled because
gcc halts with an error when creating files in /dev/*.

This commit fixes the issue by checking whether the dump folder is /dev/.
If so, it just informs the user and disables dumping, but does not halt
the compilation, and the compiler returns 0 to the shell.

gcc/ChangeLog
2021-11-16  Giuliano Belinassi  

* dumpfile.c (dump_open): Do not halt compilation when file
matches /dev/*.

gcc/testsuite/ChangeLog
2021-11-16  Giuliano Belinassi  

* gcc.dg/devnull-dump.c: New.

Signed-off-by: Giuliano Belinassi 
---
 gcc/dumpfile.c  | 17 -
 gcc/testsuite/gcc.dg/devnull-dump.c |  7 +++
 2 files changed, 23 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/devnull-dump.c

diff --git a/gcc/dumpfile.c b/gcc/dumpfile.c
index 8169daf7f59..b1dbfb371af 100644
--- a/gcc/dumpfile.c
+++ b/gcc/dumpfile.c
@@ -378,7 +378,22 @@ dump_open (const char *filename, bool trunc)
   FILE *stream = fopen (filename, trunc ? "w" : "a");
 
   if (!stream)
-error ("could not open dump file %qs: %m", filename);
+{
+  /* Autoconf tests compiler functionalities by setting output to /dev/null.
+In this case, if dumps are enabled, it will try to set the output
+folder to /dev/*, which is of course invalid and the compiler will exit
+with an error, resulting in configure script reporting the tested
+feature as being unavailable. Here we test this case by checking if the
+output file prefix has /dev/ and only inform the user in this case
+rather than refusing to compile.  */
+
+  const char *const slash_dev = "/dev/";
+  if (strncmp(slash_dev, filename, strlen(slash_dev)) == 0)
+   inform (UNKNOWN_LOCATION,
+   "could not open dump file %qs: %m. Dumps are disabled.", filename);
+  else
+   error ("could not open dump file %qs: %m", filename);
+}
   return stream;
 }
 
diff --git a/gcc/testsuite/gcc.dg/devnull-dump.c b/gcc/testsuite/gcc.dg/devnull-dump.c
new file mode 100644
index 000..378e0901c28
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/devnull-dump.c
@@ -0,0 +1,7 @@
+/* { dg-do assemble } */
+/* { dg-options "-fdump-ipa-clones -o /dev/null" } */
+
+int main()
+{
+  return 0;
+}
-- 
2.33.1
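
The prefix test used in dump_open can be tried in isolation.  The sketch
below assumes a hypothetical helper named is_under_dev (the patch inlines
the comparison rather than adding a function) and only shows which file
names the strncmp check accepts.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper, not part of dumpfile.c: return true when FILENAME
// starts with the literal prefix "/dev/".
inline bool
is_under_dev (const char *filename)
{
  static const char slash_dev[] = "/dev/";
  // sizeof includes the terminating NUL, so subtract one for the length.
  return std::strncmp (slash_dev, filename, sizeof slash_dev - 1) == 0;
}
```

Note that the check is purely textual: a relative path such as
"out/dev/null" or the bare name "/dev" does not match, and a path that
reaches /dev through a symlink would not be recognized either.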



[committed 2/2] libstdc++: Implement constexpr std::basic_string for C++20

2021-11-16 Thread Jonathan Wakely via Gcc-patches
From: Michael de Lang 

Tested x86_64-linux, committed to trunk.


This is only supported for the cxx11 ABI, not for COW strings.

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (basic_string, operator""s): Add
constexpr for C++20.
(basic_string::basic_string(basic_string&&)): Only copy
initialized portion of the buffer.
(basic_string::basic_string(basic_string&&, const Alloc&)):
Likewise.
* include/bits/basic_string.tcc (basic_string): Add constexpr
for C++20.
(basic_string::swap(basic_string&)): Only copy initialized
portions of the buffers.
(basic_string::_M_replace): Add constexpr implementation that
doesn't depend on pointer comparisons.
* include/bits/cow_string.h: Adjust comment.
* include/ext/type_traits.h (__is_null_pointer): Add constexpr.
* include/std/string (erase, erase_if): Add constexpr.
* include/std/version (__cpp_lib_constexpr_string): Update
value.
* testsuite/21_strings/basic_string/cons/char/constexpr.cc:
New test.
* testsuite/21_strings/basic_string/cons/wchar_t/constexpr.cc:
New test.
* testsuite/21_strings/basic_string/literals/constexpr.cc:
New test.
* testsuite/21_strings/basic_string/modifiers/constexpr.cc: New test.
* testsuite/21_strings/basic_string/modifiers/swap/char/constexpr.cc:
New test.
* testsuite/21_strings/basic_string/modifiers/swap/wchar_t/constexpr.cc:
New test.
* testsuite/21_strings/basic_string/version.cc: New test.
---
 libstdc++-v3/include/bits/basic_string.h  | 274 --
 libstdc++-v3/include/bits/basic_string.tcc|  69 -
 libstdc++-v3/include/bits/cow_string.h|   2 +-
 libstdc++-v3/include/ext/type_traits.h|   4 +-
 libstdc++-v3/include/std/string   |   2 +
 libstdc++-v3/include/std/version  |   6 +-
 .../basic_string/cons/char/constexpr.cc   | 174 +++
 .../basic_string/cons/wchar_t/constexpr.cc| 174 +++
 .../basic_string/literals/constexpr.cc|  22 ++
 .../basic_string/modifiers/constexpr.cc   |  52 
 .../modifiers/swap/char/constexpr.cc  |  49 
 .../modifiers/swap/wchar_t/constexpr.cc   |  49 
 .../21_strings/basic_string/version.cc|  25 ++
 13 files changed, 869 insertions(+), 33 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/21_strings/basic_string/cons/char/constexpr.cc
 create mode 100644 libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/constexpr.cc
 create mode 100644 libstdc++-v3/testsuite/21_strings/basic_string/literals/constexpr.cc
 create mode 100644 libstdc++-v3/testsuite/21_strings/basic_string/modifiers/constexpr.cc
 create mode 100644 libstdc++-v3/testsuite/21_strings/basic_string/modifiers/swap/char/constexpr.cc
 create mode 100644 libstdc++-v3/testsuite/21_strings/basic_string/modifiers/swap/wchar_t/constexpr.cc
 create mode 100644 libstdc++-v3/testsuite/21_strings/basic_string/version.cc

diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
index a6575fa9e26..b6945f1cdfb 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -57,12 +57,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 _GLIBCXX_BEGIN_NAMESPACE_CXX11
 
 #ifdef __cpp_lib_is_constant_evaluated
-// Support P1032R1 in C++20 (but not P0980R1 yet).
-# define __cpp_lib_constexpr_string 201811L
+// Support P0980R1 in C++20.
+# define __cpp_lib_constexpr_string 201907L
 #elif __cplusplus >= 201703L && _GLIBCXX_HAVE_BUILTIN_IS_CONSTANT_EVALUATED
 // Support P0426R1 changes to char_traits in C++17.
 # define __cpp_lib_constexpr_string 201611L
-#elif __cplusplus > 201703L
 #endif
 
   /**
@@ -131,6 +130,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
  _Res>;
 
   // Allows an implicit conversion to __sv_type.
+  _GLIBCXX20_CONSTEXPR
   static __sv_type
   _S_to_string_view(__sv_type __svt) noexcept
   { return __svt; }
@@ -141,7 +141,9 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   // is provided.
   struct __sv_wrapper
   {
-   explicit __sv_wrapper(__sv_type __sv) noexcept : _M_sv(__sv) { }
+   _GLIBCXX20_CONSTEXPR explicit
+   __sv_wrapper(__sv_type __sv) noexcept : _M_sv(__sv) { }
+
__sv_type _M_sv;
   };
 
@@ -151,6 +153,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
*  @param  __svw  string view wrapper.
*  @param  __a  Allocator to use.
*/
+  _GLIBCXX20_CONSTEXPR
   explicit
   basic_string(__sv_wrapper __svw, const _Alloc& __a)
   : basic_string(__svw._M_sv.data(), __svw._M_sv.size(), __a) { }
@@ -163,9 +166,11 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
_Alloc_hider(pointer __dat, const _Alloc& __a = _Alloc())
: allocator_type(__a), _M_p(__dat) { }
 #else
+   _GLIBCXX20_CONSTEXPR
_Alloc_hider(pointer __dat, const 

[committed 1/2] libstdc++: Use hidden friends for vector::reference swap overloads

2021-11-16 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux, committed to trunk.

These swap overloads are non-standard, but are needed to make swap work
for vector::reference rvalues. They don't need to be called
explicitly, only via ADL, so hide them from normal lookup. This is what
I've proposed as the resolution to LWG 3638.

libstdc++-v3/ChangeLog:

* include/bits/stl_bvector.h (swap(_Bit_reference, _Bit_reference))
(swap(_Bit_reference, bool&), swap(bool&, _Bit_reference)):
Define as hidden friends of _Bit_reference.
---
 libstdc++-v3/include/bits/stl_bvector.h | 50 -
 1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
index 381c47b6132..68070685baf 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -125,36 +125,36 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 void
 flip() _GLIBCXX_NOEXCEPT
 { *_M_p ^= _M_mask; }
-  };
 
 #if __cplusplus >= 201103L
-  _GLIBCXX20_CONSTEXPR
-  inline void
-  swap(_Bit_reference __x, _Bit_reference __y) noexcept
-  {
-bool __tmp = __x;
-__x = __y;
-__y = __tmp;
-  }
+_GLIBCXX20_CONSTEXPR
+friend void
+swap(_Bit_reference __x, _Bit_reference __y) noexcept
+{
+  bool __tmp = __x;
+  __x = __y;
+  __y = __tmp;
+}
 
-  _GLIBCXX20_CONSTEXPR
-  inline void
-  swap(_Bit_reference __x, bool& __y) noexcept
-  {
-bool __tmp = __x;
-__x = __y;
-__y = __tmp;
-  }
+_GLIBCXX20_CONSTEXPR
+friend void
+swap(_Bit_reference __x, bool& __y) noexcept
+{
+  bool __tmp = __x;
+  __x = __y;
+  __y = __tmp;
+}
 
-  _GLIBCXX20_CONSTEXPR
-  inline void
-  swap(bool& __x, _Bit_reference __y) noexcept
-  {
-bool __tmp = __x;
-__x = __y;
-__y = __tmp;
-  }
+_GLIBCXX20_CONSTEXPR
+friend void
+swap(bool& __x, _Bit_reference __y) noexcept
+{
+  bool __tmp = __x;
+  __x = __y;
+  __y = __tmp;
+}
 #endif
+  };
 
   struct _Bit_iterator_base
   : public std::iterator
-- 
2.31.1
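
As an illustration of the hidden friend technique outside of libstdc++,
here is a minimal sketch.  The namespace demo and the class bit_ref are
hypothetical and only loosely modelled on _Bit_reference; the point is
that swap is declared inside the class, so it is found by
argument-dependent lookup only.

```cpp
#include <cassert>

namespace demo
{
  // Hypothetical proxy reference to a single bit of a word.
  class bit_ref
  {
    unsigned long *m_word;
    unsigned long m_mask;

  public:
    bit_ref (unsigned long *word, unsigned bit)
    : m_word (word), m_mask (1UL << bit) { }

    bit_ref (const bit_ref &) = default;

    // Read the referenced bit.
    operator bool () const { return (*m_word & m_mask) != 0; }

    // Write the referenced bit.
    bit_ref &operator= (bool b)
    {
      if (b)
	*m_word |= m_mask;
      else
	*m_word &= ~m_mask;
      return *this;
    }

    // Assign the value of another bit, not the reference itself.
    bit_ref &operator= (const bit_ref &other)
    { return *this = bool (other); }

    // Hidden friend: not visible to qualified or ordinary unqualified
    // lookup, only to ADL on bit_ref arguments.  Taking the proxies by
    // value lets the function accept rvalues.
    friend void swap (bit_ref x, bit_ref y) noexcept
    {
      bool tmp = x;
      x = y;
      y = tmp;
    }
  };
}
```

An unqualified call such as swap (a, b) with two demo::bit_ref arguments
resolves to the friend, whereas demo::swap (a, b) does not compile because
the name is not declared at namespace scope.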



Re: [PATCH] simplify get_range_strlen interface

2021-11-16 Thread Martin Sebor via Gcc-patches

On 11/15/21 3:05 PM, Martin Sebor wrote:

The deeply nested PHI handling in get_range_strlen_dynamic makes
the code bigger and harder to follow than it would be if done in
its own function.  The attached patch does that.

In addition, the get_range_strlen family of functions use a bitmap
to avoid infinite recursion.  Rather than dynamically allocating
and freeing it on demand the attached patch simplifies the code
by using an instance of auto_bitmap.  This avoids the risk of
neglecting to deallocate the bitmap.


I forgot over the weekend that this change also fixes a bug:
PR 102960.

I have committed the fix in r12-5310 along with a test.

Martin



Tested on x86_64-linux.

Martin
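
The shape of the simplification can be sketched without GCC's data
structures.  In this illustration std::set stands in for auto_bitmap, and
the node type and the max_len functions are hypothetical; the visited set
has automatic storage, so no return path can forget to release it, and it
also stops recursion through cycles such as those formed by PHIs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical dependency graph: the value of node I depends on its own
// length and on the nodes listed in deps.  Cycles are allowed.
struct node
{
  int len;
  std::vector<std::size_t> deps;
};

// Worker: VISITED guards against infinite recursion through cycles.
inline int
max_len_1 (const std::vector<node> &g, std::size_t i,
	   std::set<std::size_t> &visited)
{
  if (!visited.insert (i).second)
    return 0;			// Already seen; contributes nothing new.
  int best = g[i].len;
  for (std::size_t d : g[i].deps)
    best = std::max (best, max_len_1 (g, d, visited));
  return best;
}

// Entry point: the set lives on the stack and is destroyed automatically.
inline int
max_len (const std::vector<node> &g, std::size_t start)
{
  std::set<std::size_t> visited;
  return max_len_1 (g, start, visited);
}
```

Keeping the set in the entry point rather than allocating it lazily in the
worker is the design choice the patch description argues for: ownership is
in one place and cleanup needs no explicit code.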




Re: [musl] Re: [PATCH v2] configure: define TARGET_LIBC_GNUSTACK on musl

2021-11-16 Thread Rich Felker
On Tue, Nov 16, 2021 at 03:40:00PM +0100, Dragan Mladjenovic wrote:
> Hi,
> 
> Looks fine to me. If possible, maybe it should even be back-ported
> to stable branches.
> 
> Not sure if MIPS assembly sources (if any) in musl would need
> explicit .note.GNU-stack
> 
> to complement this?

What are the actual consequences of making this change, and what is
the goal? I'm concerned that it might produce object files which don't
include annotation that they don't need executable stack, in which
case the final executable file will be marked as executable-stack and
the kernel will load it as such. That would be very bad.

Rich


> On 16-Nov-21 06:13, Ilya Lipnitskiy wrote:
> >musl only uses PT_GNU_STACK to set default thread stack size and has no
> >executable stack support[0], so there is no reason not to emit the
> >.note.GNU-stack section on musl builds.
> >
> >[0]: 
> >https://lore.kernel.org/all/20190423192534.gn23...@brightrain.aerifal.cx/T/#u
> >
> >gcc/ChangeLog:
> >
> > * configure: Regenerate.
> > * configure.ac: define TARGET_LIBC_GNUSTACK on musl
> >
> >Signed-off-by: Ilya Lipnitskiy 
> >---
> >  gcc/configure| 3 +++
> >  gcc/configure.ac | 3 +++
> >  2 files changed, 6 insertions(+)
> >
> >diff --git a/gcc/configure b/gcc/configure
> >index 74b9d9be4c85..7091a838aefa 100755
> >--- a/gcc/configure
> >+++ b/gcc/configure
> >@@ -31275,6 +31275,9 @@ fi
> >  # Check if the target LIBC handles PT_GNU_STACK.
> >  gcc_cv_libc_gnustack=unknown
> >  case "$target" in
> >+  mips*-*-linux-musl*)
> >+gcc_cv_libc_gnustack=yes
> >+;;
> >mips*-*-linux*)
> >  if test $glibc_version_major -gt 2 \
> >diff --git a/gcc/configure.ac b/gcc/configure.ac
> >index c9ee1fb8919e..8a2d34179a75 100644
> >--- a/gcc/configure.ac
> >+++ b/gcc/configure.ac
> >@@ -6961,6 +6961,9 @@ fi
> >  # Check if the target LIBC handles PT_GNU_STACK.
> >  gcc_cv_libc_gnustack=unknown
> >  case "$target" in
> >+  mips*-*-linux-musl*)
> >+gcc_cv_libc_gnustack=yes
> >+;;
> >mips*-*-linux*)
> >  GCC_GLIBC_VERSION_GTE_IFELSE([2], [31], [gcc_cv_libc_gnustack=yes], )
> >  ;;


Re: [PATCH 4/5] vect: Make reduction code handle calls

2021-11-16 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches  writes:
> On Wed, Nov 10, 2021 at 1:48 PM Richard Sandiford via Gcc-patches
>  wrote:
>>
>> This patch extends the reduction code to handle calls.  So far
>> it's a structural change only; a later patch adds support for
>> specific function reductions.
>>
>> Most of the patch consists of using code_helper and gimple_match_op
>> to describe the reduction operations.  The other main change is that
>> vectorizable_call now needs to handle fully-predicated reductions.
>>
>> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>>
>> Richard
>>
>>
>> gcc/
>> * builtins.h (associated_internal_fn): Declare overload that
>> takes a (combined_cfn, return type) pair.
>> * builtins.c (associated_internal_fn): Split new overload out
>> of original fndecl version.  Also provide an overload that takes
>> a (combined_cfn, return type) pair.
>> * internal-fn.h (commutative_binary_fn_p): Declare.
>> (associative_binary_fn_p): Likewise.
>> * internal-fn.c (commutative_binary_fn_p): New function,
>> split out from...
>> (first_commutative_argument): ...here.
>> (associative_binary_fn_p): New function.
>> * gimple-match.h (code_helper): Add a constructor that takes
>> internal functions.
>> (commutative_binary_op_p): Declare.
>> (associative_binary_op_p): Likewise.
>> (canonicalize_code): Likewise.
>> (directly_supported_p): Likewise.
>> (get_conditional_internal_fn): Likewise.
>> (gimple_build): New overload that takes a code_helper.
>> * gimple-fold.c (gimple_build): Likewise.
>> * gimple-match-head.c (commutative_binary_op_p): New function.
>> (associative_binary_op_p): Likewise.
>> (canonicalize_code): Likewise.
>> (directly_supported_p): Likewise.
>> (get_conditional_internal_fn): Likewise.
>> * tree-vectorizer.h: Include gimple-match.h.
>> (neutral_op_for_reduction): Take a code_helper instead of a 
>> tree_code.
>> (needs_fold_left_reduction_p): Likewise.
>> (reduction_fn_for_scalar_code): Likewise.
>> (vect_can_vectorize_without_simd_p): Declare a new overload that
>> takes a code_helper.
>> * tree-vect-loop.c: Include case-cfn-macros.h.
>> (fold_left_reduction_fn): Take a code_helper instead of a tree_code.
>> (reduction_fn_for_scalar_code): Likewise.
>> (neutral_op_for_reduction): Likewise.
>> (needs_fold_left_reduction_p): Likewise.
>> (use_mask_by_cond_expr_p): Likewise.
>> (build_vect_cond_expr): Likewise.
>> (vect_create_partial_epilog): Likewise.  Use gimple_build rather
>> than gimple_build_assign.
>> (check_reduction_path): Handle calls and operate on code_helpers
>> rather than tree_codes.
>> (vect_is_simple_reduction): Likewise.
>> (vect_model_reduction_cost): Likewise.
>> (vect_find_reusable_accumulator): Likewise.
>> (vect_create_epilog_for_reduction): Likewise.
>> (vect_transform_cycle_phi): Likewise.
>> (vectorizable_reduction): Likewise.  Make more use of
>> lane_reduc_code_p.
>> (vect_transform_reduction): Use gimple_extract_op but expect
>> a tree_code for now.
>> (vect_can_vectorize_without_simd_p): New overload that takes
>> a code_helper.
>> * tree-vect-stmts.c (vectorizable_call): Handle reductions in
>> fully-masked loops.
>> * tree-vect-patterns.c (vect_mark_pattern_stmts): Use
>> gimple_extract_op when updating STMT_VINFO_REDUC_IDX.
>> ---
>>  gcc/builtins.c   |  46 -
>>  gcc/builtins.h   |   1 +
>>  gcc/gimple-fold.c|   9 +
>>  gcc/gimple-match-head.c  |  70 +++
>>  gcc/gimple-match.h   |  20 ++
>>  gcc/internal-fn.c|  46 -
>>  gcc/internal-fn.h|   2 +
>>  gcc/tree-vect-loop.c | 420 +++
>>  gcc/tree-vect-patterns.c |  23 ++-
>>  gcc/tree-vect-stmts.c|  66 --
>>  gcc/tree-vectorizer.h|  10 +-
>>  11 files changed, 455 insertions(+), 258 deletions(-)
>>
>> diff --git a/gcc/builtins.c b/gcc/builtins.c
>> index 384864bfb3a..03829c03a5a 100644
>> --- a/gcc/builtins.c
>> +++ b/gcc/builtins.c
>> @@ -2139,17 +2139,17 @@ mathfn_built_in_type (combined_fn fn)
>>  #undef SEQ_OF_CASE_MATHFN
>>  }
>>
>> -/* If BUILT_IN_NORMAL function FNDECL has an associated internal function,
>> -   return its code, otherwise return IFN_LAST.  Note that this function
>> -   only tests whether the function is defined in internals.def, not whether
>> -   it is actually available on the target.  */
>> +/* Check whether there is an internal function associated with function FN
>> +   and return type RETURN_TYPE.  Return the function if so, otherwise return
>> +   IFN_LAST.
>>
>> -internal_fn
>> -associated_internal_fn 

Re: [PATCH 1/5] libstdc++: Import the fast_float library

2021-11-16 Thread Daniel Krügler via Gcc-patches
On Tue, 16 Nov 2021 at 16:31, Patrick Palka via Libstdc++ wrote:
>
[..]
> -- >8 --
>
> Subject: [PATCH 1/5] libstdc++: Import the fast_float library
>
[..]
> +## Reference
> +
> +- Daniel Lemire, [Number Parsing at a Gigabyte per 
> Second](https://arxiv.org/abs/2101.11408), Software: Pratice and Experience 
> 51 (8), 2021.

There is a typo in the title at the very end:

s/Pratice/Practice

(See https://arxiv.org/abs/2101.11408)

- Daniel


Re: [PATCH 2/5] gimple-match: Add a gimple_extract_op function

2021-11-16 Thread Richard Sandiford via Gcc-patches
Richard Biener  writes:
> On Wed, Nov 10, 2021 at 1:46 PM Richard Sandiford via Gcc-patches
>  wrote:
>>
>> code_helper and gimple_match_op seem like generally useful ways
>> of summing up a gimple_assign or gimple_call (or gimple_cond).
>> This patch adds a gimple_extract_op function that can be used
>> for that.
>>
>> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>>
>> Richard
>>
>>
>> gcc/
>> * gimple-match.h (gimple_extract_op): Declare.
>> * gimple-match-head.c (gimple_extract): New function, extracted from...
>> (gimple_simplify): ...here.
>> (gimple_extract_op): New function.
>> ---
>>  gcc/gimple-match-head.c | 261 +++-
>>  gcc/gimple-match.h  |   1 +
>>  2 files changed, 149 insertions(+), 113 deletions(-)
>>
>> diff --git a/gcc/gimple-match-head.c b/gcc/gimple-match-head.c
>> index 9d88b2f8551..4c6e0883ba4 100644
>> --- a/gcc/gimple-match-head.c
>> +++ b/gcc/gimple-match-head.c
>> @@ -890,12 +890,29 @@ try_conditional_simplification (internal_fn ifn, 
>> gimple_match_op *res_op,
>>return true;
>>  }
>>
>> -/* The main STMT based simplification entry.  It is used by the fold_stmt
>> -   and the fold_stmt_to_constant APIs.  */
>> +/* Common subroutine of gimple_extract_op and gimple_simplify.  Try to
>> +   describe STMT in RES_OP.  Return:
>>
>> -bool
>> -gimple_simplify (gimple *stmt, gimple_match_op *res_op, gimple_seq *seq,
>> -tree (*valueize)(tree), tree (*top_valueize)(tree))
>> +   - -1 if extraction failed
>> +   - otherwise, 0 if no simplification should take place
>> +   - otherwise, the number of operands for a GIMPLE_ASSIGN or GIMPLE_COND
>> +   - otherwise, -2 for a GIMPLE_CALL
>> +
>> +   Before recording an operand, call:
>> +
>> +   - VALUEIZE_CONDITION for a COND_EXPR condition
>> +   - VALUEIZE_NAME if the rhs of a GIMPLE_ASSIGN is an SSA_NAME
>
> I think at least VALUEIZE_NAME is unnecessary, see below

Yeah, it's unnecessary.  The idea was to (try to) ensure that
gimple_simplify keeps all the microoptimisations that it had
previously.  This includes open-coding do_valueize for SSA_NAMEs
and jumping straight to the right gimplify_resimplifyN routine
when the number of operands is already known.

(The two calls to gimple_extract<> produce different functions
that ought to get inlined into their single callers.  A lot of the
jumps should then be threaded.)

I can drop all that if you don't think it's worth it though.
Just wanted to double-check first.

Thanks,
Richard

>> +   - VALUEIZE_OP for every other top-level operand
>> +
>> +   Each routine takes a tree argument and returns a tree.  */
>> +
>> +template<typename ValueizeOp, typename ValueizeCondition,
>> +         typename ValueizeName>
>> +inline int
>> +gimple_extract (gimple *stmt, gimple_match_op *res_op,
>> +   ValueizeOp valueize_op,
>> +   ValueizeCondition valueize_condition,
>> +   ValueizeName valueize_name)
>>  {
>>switch (gimple_code (stmt))
>>  {
>> @@ -911,100 +928,53 @@ gimple_simplify (gimple *stmt, gimple_match_op 
>> *res_op, gimple_seq *seq,
>> || code == VIEW_CONVERT_EXPR)
>>   {
>> tree op0 = TREE_OPERAND (gimple_assign_rhs1 (stmt), 0);
>> -   bool valueized = false;
>> -   op0 = do_valueize (op0, top_valueize, valueized);
>> -   res_op->set_op (code, type, op0);
>> -   return (gimple_resimplify1 (seq, res_op, valueize)
>> -   || valueized);
>> +   res_op->set_op (code, type, valueize_op (op0));
>> +   return 1;
>>   }
>> else if (code == BIT_FIELD_REF)
>>   {
>> tree rhs1 = gimple_assign_rhs1 (stmt);
>> -   tree op0 = TREE_OPERAND (rhs1, 0);
>> -   bool valueized = false;
>> -   op0 = do_valueize (op0, top_valueize, valueized);
>> +   tree op0 = valueize_op (TREE_OPERAND (rhs1, 0));
>> res_op->set_op (code, type, op0,
>> TREE_OPERAND (rhs1, 1),
>> TREE_OPERAND (rhs1, 2),
>> REF_REVERSE_STORAGE_ORDER (rhs1));
>> -   if (res_op->reverse)
>> - return valueized;
>> -   return (gimple_resimplify3 (seq, res_op, valueize)
>> -   || valueized);
>> +   return res_op->reverse ? 0 : 3;
>>   }
>> -   else if (code == SSA_NAME
>> -&& top_valueize)
>> +   else if (code == SSA_NAME)
>>   {
>> tree op0 = gimple_assign_rhs1 (stmt);
>> -   tree valueized = top_valueize (op0);
>> +   tree valueized = valueize_name (op0);
>> if (!valueized || op0 == valueized)
>> - return false;
>> + return -1;
>> res_op->set_op (TREE_CODE (op0), type, valueized);
>> 
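
The design discussed in this message (one template parameterised on the valueization callbacks, instantiated once per caller so that each copy can be inlined into its single user) can be illustrated with a small standalone sketch. This is not GCC code: `Stmt`, `Op`, `extract`, `extract_only`, `extract_valueized` and `add_one` are hypothetical stand-ins chosen for the example, and the return-value convention is simplified to "-1 on failure, otherwise the operand count".

```cpp
#include <cassert>

// Hypothetical stand-ins for a gimple statement and a gimple_match_op.
// kind 0: unary operation on `operand`; kind 1: plain copy of a name.
struct Stmt { int kind; int operand; };
struct Op { int value = 0; };

// Shared extraction routine.  Returns -1 if extraction failed,
// otherwise the number of operands recorded in RES.
template<typename ValueizeOp, typename ValueizeName>
inline int
extract (const Stmt &stmt, Op *res, ValueizeOp valueize_op,
         ValueizeName valueize_name)
{
  switch (stmt.kind)
    {
    case 0:
      res->value = valueize_op (stmt.operand);
      return 1;
    case 1:
      {
        int valueized = valueize_name (stmt.operand);
        if (valueized == stmt.operand)
          return -1;            // valueization learned nothing
        res->value = valueized;
        return 1;
      }
    default:
      return -1;
    }
}

// Caller 1: pure extraction, identity callbacks.
inline int
extract_only (const Stmt &stmt, Op *res)
{
  auto nop = [] (int x) { return x; };
  return extract (stmt, res, nop, nop);
}

// Caller 2: extraction with a caller-supplied valueization function.
inline int
extract_valueized (const Stmt &stmt, Op *res, int (*valueize) (int))
{
  auto fn = [valueize] (int x) { return valueize (x); };
  return extract (stmt, res, fn, fn);
}

// Example valueization function used only for demonstration.
inline int add_one (int x) { return x + 1; }
```

Each wrapper produces its own instantiation of `extract`, so a compiler is free to inline the callbacks and fold away the branches that cannot trigger for that caller. Whether that is worth the extra template machinery is exactly the question raised in the reply above.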

[PATCH][committed]middle-end signbit-2: make test check for scalar or vector versions

2021-11-16 Thread Tamar Christina via Gcc-patches
Hi All,

This updates the signbit-2 test to check for
the scalar optimization if the target does not
support vectorization.

Regtested on aarch64-none-linux-gnu and
x86_64-pc-linux-gnu with no regressions.

Committed under the gcc obvious rule.

Thanks,
Tamar

gcc/testsuite/ChangeLog:

* gcc.dg/signbit-2.c: Check vect or scalar.

--- inline copy of patch -- 
diff --git a/gcc/testsuite/gcc.dg/signbit-2.c b/gcc/testsuite/gcc.dg/signbit-2.c
index d8501e9b7a2d82b511ad0b3a44c0121d635972c0..b609f67dc9f8a949b86f0ec84144db834b9d531a 100644
--- a/gcc/testsuite/gcc.dg/signbit-2.c
+++ b/gcc/testsuite/gcc.dg/signbit-2.c
@@ -19,5 +19,6 @@ void fun2(int32_t *x, int n)
   x[i] = (-x[i]) >> 30;
 }
 
-/* { dg-final { scan-tree-dump {\s+>\s+\{ 0, 0, 0(, 0)+ \}} optimized } } */
+/* { dg-final { scan-tree-dump {\s+>\s+\{ 0(, 0)+ \}} optimized { target vect_int } } } */
+/* { dg-final { scan-tree-dump {\s+>\s+0} optimized { target { ! vect_int } } } } */
 /* { dg-final { scan-tree-dump-not {\s+>>\s+31} optimized } } */





[PATCH][committed]AArch64 shrn-combine-10: update test to current codegen.

2021-11-16 Thread Tamar Christina via Gcc-patches
Hi All,

When the rshrn commit was reverted I missed this testcase.
This now updates it.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Committed under the obvious rule.

Thanks,
Tamar

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/shrn-combine-10.c: Use shrn.

--- inline copy of patch -- 
diff --git a/gcc/testsuite/gcc.target/aarch64/shrn-combine-10.c b/gcc/testsuite/gcc.target/aarch64/shrn-combine-10.c
index 3a1cfce93e9065e8d5b43a770b0ef24a17586411..dc9e9be94cbe4ba81d936dfaf178674b9da31040 100644
--- a/gcc/testsuite/gcc.target/aarch64/shrn-combine-10.c
+++ b/gcc/testsuite/gcc.target/aarch64/shrn-combine-10.c
@@ -6,7 +6,7 @@
 
 uint32x4_t foo (uint64x2_t a, uint64x2_t b)
 {
-  return vrshrn_high_n_u64 (vrshrn_n_u64 (a, 32), b, 32);
+  return vshrn_high_n_u64 (vshrn_n_u64 (a, 32), b, 32);
 }
 
 /* { dg-final { scan-assembler-times {\tuzp2\t} 1 } } */





[PATCH]middle-end: Fix FMA detection when inspecting gimple which have no LHS.

2021-11-16 Thread Tamar Christina via Gcc-patches
Hi All,

convert_mult_to_fma assumes that all gimple_assigns have a LHS set.  That
assumption does not hold when an IFN is kept around only for its
side effects.  In those situations there is just the IFN and the LHS is null.

Without a LHS there cannot be an ADD that uses the result, so the statement
cannot be part of an FMA and it is correct to return early.

Bootstrapped and regtested on aarch64-none-linux-gnu and
x86_64-pc-linux-gnu with no regressions.

Ok for master?

Thanks,
Tamar



gcc/ChangeLog:

PR tree-optimization/103253
* tree-ssa-math-opts.c (convert_mult_to_fma): Check for LHS.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr103253.c: New test.

--- inline copy of patch -- 
diff --git a/gcc/testsuite/gcc.dg/vect/pr103253.c b/gcc/testsuite/gcc.dg/vect/pr103253.c
new file mode 100644
index ..abe3f09f3818d79a53f2aa962c6b6c06855d618e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr103253.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target fopenmp } */
+/* { dg-additional-options "-O2 -fexceptions -fopenmp -fno-delete-dead-exceptions -fno-trapping-math" } */
+
+double
+do_work (double do_work_pri)
+{
+  int i;
+
+#pragma omp simd
+  for (i = 0; i < 17; ++i)
+do_work_pri = (!i ? 0.5 : i) * 2.0;
+
+  return do_work_pri;
+}
+
diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index c4a6492b50df25b4cf296a75bd51e5af34eeacc7..cc8496c3c325f3cc303a90b9b9cac383e5a7942d 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -3224,6 +3224,10 @@ convert_mult_to_fma (gimple *mul_stmt, tree op1, tree op2,
 fma_deferring_state *state, tree mul_cond = NULL_TREE)
 {
   tree mul_result = gimple_get_lhs (mul_stmt);
+  /* If there isn't a LHS then this can't be an FMA.  There can be no LHS
+ if the statement was left just for the side-effects.  */
+  if (!mul_result)
+return false;
   tree type = TREE_TYPE (mul_result);
   gimple *use_stmt, *neguse_stmt;
   use_operand_p use_p;
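
The shape of the fix (check for a missing result before dereferencing it) can be shown in isolation. The following is a minimal sketch, not the GCC implementation: `Value`, `Stmt`, `kFloatType` and `can_form_fma` are hypothetical names, and the "is it a float" test merely stands in for the real legality checks.

```cpp
#include <cassert>

// Hypothetical stand-ins: a result value with a type tag, and a
// statement whose result may be absent when it is kept only for its
// side effects.
constexpr int kFloatType = 1;
struct Value { int type; };
struct Stmt { const Value *lhs; };

// Returns true if STMT's result could feed a fused multiply-add.
inline bool
can_form_fma (const Stmt &stmt)
{
  const Value *result = stmt.lhs;
  // No result means nothing can consume the product, so there is no
  // addition to fuse with.  Bail out before touching result->type.
  if (!result)
    return false;
  return result->type == kFloatType;
}
```

Without the early return, the `result->type` access is the analogue of calling `TREE_TYPE` on a null `mul_result`.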





Re: [PATCH 1/5] libstdc++: Import the fast_float library

2021-11-16 Thread Patrick Palka via Gcc-patches
On Tue, 16 Nov 2021, Florian Weimer wrote:

> * Patrick Palka via Libstdc:
> 
> > This copies the fast_float library[1] into the compiled-in library
> > sources.  We're going to use this library in our floating-point
> > std::from_chars implementation for faster and more portable parsing of
> > binary32/64 decimal strings.
> >
> > [1]: https://github.com/fastfloat/fast_float
> >
> > Series tested on x86_64, i686, ppc64, ppc64le and aarch64, does it
> > look OK for trunk?
> 
> Missing Signed-off-by:?

Oops, fixed in the below patch.

> 
> > diff --git a/libstdc++-v3/src/c++17/fast_float/LICENSE-APACHE 
> > b/libstdc++-v3/src/c++17/fast_float/LICENSE-APACHE
> > new file mode 100644
> > index 000..26f4398f249
> > --- /dev/null
> > +++ b/libstdc++-v3/src/c++17/fast_float/LICENSE-APACHE
> > @@ -0,0 +1,190 @@
> > + Apache License
> > +   Version 2.0, January 2004
> > +http://www.apache.org/licenses/
> 
> > diff --git a/libstdc++-v3/src/c++17/fast_float/LICENSE-MIT 
> > b/libstdc++-v3/src/c++17/fast_float/LICENSE-MIT
> > new file mode 100644
> > index 000..2fb2a37ad7f
> > --- /dev/null
> > +++ b/libstdc++-v3/src/c++17/fast_float/LICENSE-MIT
> > @@ -0,0 +1,27 @@
> > +MIT License
> > +
> > +Copyright (c) 2021 The fast_float authors
> 
> You also need to include the README file, which makes it clear that
> recipients can choose between Apache and MIT.  GCC needs to use the MIT
> option, I think.

Also fixed.

I noticed that the source repository contains the script
./script/amalgamate.py that generates a single-file version of the
library for us, complete with an embedded copyright/license banner.
This seems like a simpler way of integrating the library, so the below
patch uses the amalgamation instead.

-- >8 --

Subject: [PATCH 1/5] libstdc++: Import the fast_float library

We're going to use the fast_float library in our (compiled-in)
floating-point std::from_chars implementation for faster and more
portable parsing of binary32/64 decimal strings.

The single file fast_float.h is an amalgamation of the entire library,
which can be (re)generated with the command

  python3 ./script/amalgamate.py --license=MIT \
> $GCC_SRC/libstdc++-v3/src/c++17/fast_float/fast_float.h

[1]: https://github.com/fastfloat/fast_float

libstdc++-v3/ChangeLog:

* src/c++17/fast_float/LOCAL_PATCHES: New file.
* src/c++17/fast_float/MERGE: New file.
* src/c++17/fast_float/README.md: New file, copied from the
fast_float library sources.
* src/c++17/fast_float/fast_float.h: New file, an amalgamation
of the fast_float library.

Signed-off-by: Patrick Palka 
---
 .../src/c++17/fast_float/LOCAL_PATCHES|0
 libstdc++-v3/src/c++17/fast_float/MERGE   |4 +
 libstdc++-v3/src/c++17/fast_float/README.md   |  218 ++
 .../src/c++17/fast_float/fast_float.h | 2944 +
 4 files changed, 3166 insertions(+)
 create mode 100644 libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES
 create mode 100644 libstdc++-v3/src/c++17/fast_float/MERGE
 create mode 100644 libstdc++-v3/src/c++17/fast_float/README.md
 create mode 100644 libstdc++-v3/src/c++17/fast_float/fast_float.h

diff --git a/libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES 
b/libstdc++-v3/src/c++17/fast_float/LOCAL_PATCHES
new file mode 100644
index 000..e69de29bb2d
diff --git a/libstdc++-v3/src/c++17/fast_float/MERGE 
b/libstdc++-v3/src/c++17/fast_float/MERGE
new file mode 100644
index 000..43bdc3981c8
--- /dev/null
+++ b/libstdc++-v3/src/c++17/fast_float/MERGE
@@ -0,0 +1,4 @@
+d35368cae610b4edeec61cd41e4d2367a4d33f58
+
+The first line of this file holds the git revision number of the
+last merge done from the master library sources.
diff --git a/libstdc++-v3/src/c++17/fast_float/README.md 
b/libstdc++-v3/src/c++17/fast_float/README.md
new file mode 100644
index 000..1e1c06d0a3e
--- /dev/null
+++ b/libstdc++-v3/src/c++17/fast_float/README.md
@@ -0,0 +1,218 @@
+## fast_float number parsing library: 4x faster than strtod
+
+![Ubuntu 20.04 CI (GCC 
9)](https://github.com/lemire/fast_float/workflows/Ubuntu%2020.04%20CI%20(GCC%209)/badge.svg)
+![Ubuntu 18.04 CI (GCC 
7)](https://github.com/lemire/fast_float/workflows/Ubuntu%2018.04%20CI%20(GCC%207)/badge.svg)
+![Alpine 
Linux](https://github.com/lemire/fast_float/workflows/Alpine%20Linux/badge.svg)
+![MSYS2-CI](https://github.com/lemire/fast_float/workflows/MSYS2-CI/badge.svg)
+![VS16-CLANG-CI](https://github.com/lemire/fast_float/workflows/VS16-CLANG-CI/badge.svg)
+[![VS16-CI](https://github.com/fastfloat/fast_float/actions/workflows/vs16-ci.yml/badge.svg)](https://github.com/fastfloat/fast_float/actions/workflows/vs16-ci.yml)
+
+The fast_float library provides fast header-only implementations for the C++ 
from_chars
+functions for `float` and `double` types.  These functions convert ASCII 
strings representing
+decimal values (e.g., `1.3e10`) into binary 

[committed] analyzer: fix overeager sharing of bounded_range instances [PR102662]

2021-11-16 Thread David Malcolm via Gcc-patches
This was leading to an assertion failure ICE on a switch stmt when using
-fstrict-enums, due to erroneously reusing a range involving one enum
with a range involving a different enum.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-5307-ge1c0c908f85816240b685a5be4f0e5a0e6634979.

gcc/analyzer/ChangeLog:
PR analyzer/102662
* constraint-manager.cc (bounded_range::operator==): Require the
types to be the same for equality.

gcc/testsuite/ChangeLog:
PR analyzer/102662
* g++.dg/analyzer/pr102662.C: New test.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/constraint-manager.cc   |  4 ++-
 gcc/testsuite/g++.dg/analyzer/pr102662.C | 39 
 2 files changed, 42 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/analyzer/pr102662.C

diff --git a/gcc/analyzer/constraint-manager.cc 
b/gcc/analyzer/constraint-manager.cc
index 6df23fb477e..ea6b5dc60e0 100644
--- a/gcc/analyzer/constraint-manager.cc
+++ b/gcc/analyzer/constraint-manager.cc
@@ -432,7 +432,9 @@ bounded_range::intersects_p (const bounded_range &other,
 bool
 bounded_range::operator== (const bounded_range &other) const
 {
-  return (tree_int_cst_equal (m_lower, other.m_lower)
+  return (TREE_TYPE (m_lower) == TREE_TYPE (other.m_lower)
+ && TREE_TYPE (m_upper) == TREE_TYPE (other.m_upper)
+ && tree_int_cst_equal (m_lower, other.m_lower)
  && tree_int_cst_equal (m_upper, other.m_upper));
 }
 
diff --git a/gcc/testsuite/g++.dg/analyzer/pr102662.C 
b/gcc/testsuite/g++.dg/analyzer/pr102662.C
new file mode 100644
index 000..99252c7d109
--- /dev/null
+++ b/gcc/testsuite/g++.dg/analyzer/pr102662.C
@@ -0,0 +1,39 @@
+/* { dg-additional-options "-fstrict-enums" } */
+
+enum OpCode {
+  OP_MOVE,
+  OP_LOADK,
+  OP_LOADBOOL,
+  OP_LOADNIL,
+  OP_GETUPVAL,
+  OP_SETUPVAL
+};
+
+enum OpArg {
+  OpArgN,
+  OpArgU,
+  OpArgR,
+  OpArgK
+};
+
+void
+symbexec_lastpc (enum OpCode symbexec_lastpc_op, enum OpArg luaP_opmodes)
+{
+  switch (luaP_opmodes)
+{
+case OpArgN:
+case OpArgK:
+  {
+switch (symbexec_lastpc_op)
+  {
+  case OP_LOADNIL:
+  case OP_SETUPVAL:
+break;
+  default:
+break;
+  }
+  }
+default:
+  break;
+}
+}
-- 
2.26.3



[PATCH v2] c++: improve print_node of PTRMEM_CST

2021-11-16 Thread Jason Merrill via Gcc-patches

On 11/4/21 16:32, Jakub Jelinek wrote:

On Thu, Nov 04, 2021 at 11:52:34AM -0400, Jason Merrill via Gcc-patches wrote:

It's been inconvenient that pretty-printing of PTRMEM_CST didn't display
what member the constant refers to.

Adding that is complicated by the absence of a langhook for CONSTANT_CLASS_P
nodes; the simplest fix for that is to use the tcc_exceptional hook for
tcc_constant as well.

Tested x86_64-pc-linux-gnu.  OK for trunk, or should I add a new hook for
constants?

gcc/cp/ChangeLog:

* ptree.c (cxx_print_xnode): Handle PTRMEM_CST.

gcc/ChangeLog:

* print-tree.c (print_node): Also call print_xnode hook for
tcc_constant class.


I think using the same langhook is fine, but in that case certainly
   /* Called by print_tree when there is a tree of class tcc_exceptional
  that it doesn't know how to display.  */
should be adjusted so that it mentions also tcc_constant.


Done.


And maybe rename it from print_xnode to print_node?


I think changing the comment is enough, it's still just exceptional and 
constant.


This is what I'm pushing:

From 761b128dbfa2fbc1f1a0138160a39db95db7759a Mon Sep 17 00:00:00 2001
From: Jason Merrill 
Date: Fri, 29 Oct 2021 16:39:01 -0400
Subject: [PATCH] c++: improve print_node of PTRMEM_CST
To: gcc-patches@gcc.gnu.org

It's been inconvenient that pretty-printing of PTRMEM_CST didn't display
what member the constant refers to.

Adding that is complicated by the absence of a langhook for CONSTANT_CLASS_P
nodes; the simplest fix for that is to use the tcc_exceptional hook for
tcc_constant as well.

gcc/cp/ChangeLog:

	* ptree.c (cxx_print_xnode): Handle PTRMEM_CST.

gcc/ChangeLog:

	* langhooks.h (struct lang_hooks): Adjust comment.
	* print-tree.c (print_node): Also call print_xnode hook for
	tcc_constant class.
---
 gcc/langhooks.h  | 2 +-
 gcc/cp/ptree.c   | 3 +++
 gcc/print-tree.c | 3 +--
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/gcc/langhooks.h b/gcc/langhooks.h
index 3e89134e8b4..3db8f2a550d 100644
--- a/gcc/langhooks.h
+++ b/gcc/langhooks.h
@@ -477,7 +477,7 @@ struct lang_hooks
   void (*print_statistics) (void);
 
   /* Called by print_tree when there is a tree of class tcc_exceptional
- that it doesn't know how to display.  */
+ or tcc_constant that it doesn't know how to display.  */
   lang_print_tree_hook print_xnode;
 
   /* Called to print language-dependent parts of tcc_decl, tcc_type,
diff --git a/gcc/cp/ptree.c b/gcc/cp/ptree.c
index ca7884db39b..d514aa2cad2 100644
--- a/gcc/cp/ptree.c
+++ b/gcc/cp/ptree.c
@@ -379,6 +379,9 @@ cxx_print_xnode (FILE *file, tree node, int indent)
   if (tree message = STATIC_ASSERT_MESSAGE (node))
 	print_node (file, "message", message, indent+4);
   break;
+case PTRMEM_CST:
+  print_node (file, "member", PTRMEM_CST_MEMBER (node), indent+4);
+  break;
 default:
   break;
 }
diff --git a/gcc/print-tree.c b/gcc/print-tree.c
index d1fbd044c27..b5dc523fcb1 100644
--- a/gcc/print-tree.c
+++ b/gcc/print-tree.c
@@ -1004,8 +1004,7 @@ print_node (FILE *file, const char *prefix, tree node, int indent,
 	  break;
 
 	default:
-	  if (EXCEPTIONAL_CLASS_P (node))
-	lang_hooks.print_xnode (file, node, indent);
+	  lang_hooks.print_xnode (file, node, indent);
 	  break;
 	}
 
-- 
2.27.0



Re: [PATCH] Fix spelling of ones' complement.

2021-11-16 Thread Aldy Hernandez via Gcc-patches
On Tue, Nov 16, 2021 at 3:40 PM Koning, Paul  wrote:
>
>
>
> > On Nov 16, 2021, at 2:03 AM, Aldy Hernandez via Gcc-patches 
> >  wrote:
> >
> > On Tue, Nov 16, 2021, 03:20 Marek Polacek via Gcc-patches <
> > gcc-patches@gcc.gnu.org> wrote:
> >
> >> On Tue, Nov 16, 2021 at 02:01:47AM +, Koning, Paul via Gcc-patches
> >> wrote:
> >>>
> >>>
> >>>> On Nov 15, 2021, at 8:48 PM, Marek Polacek via Gcc-patches <
> >> gcc-patches@gcc.gnu.org> wrote:
> >>>>
> >>>> Nitpicking time.  It's spelled "ones' complement" rather than "one's
> >>>> complement".
> >>>
> >>> Is that so?  I see Wikipedia claims it is, but there are no sources for
> >> that claim.  (There is an assertion that it is "discussed at length on the
> >> talk page" of an article about number representation, but in fact there is
> >> no discussion there at all.)
> >>>
> >>> I have never seen this spelling before, and I very much doubt its
> >> validity.  For one thing, why then have "two's complement"?  For another,
> >> to pick one random authority, J.E. Thornton in "Design of a computer -- the
> >> Control Data 6600" refers to "one's complement" to describe the well known
> >> mode used by that machine and its relatives.
> >>
> >> Knuth, The Art of Computer Programming Volume 2, page 203-4:
> >>
> >> "A two's complement number is complemented with respect to a single
> >> power of 2, while a ones' complement number is complemented with respect
> >> to a long sequence of 1s."
> >>
> >
> > I think you get to do a drop mike when you pull out Knuth.
> >
> > :-)
>
> If that were the only source, sure.  But with authoritative sources for both 
> terms (with the ones I quoted being the earlier ones) at the very least there 
> is an argument that both terms are used.
>
> Some more: DEC PDP-1 handbook (April 1960), page 9: "Negative numbers are 
> represented as the 1's complement of the positive numbers."
>
> Univac 1107 CPU manual, page 2-6: "Next, the adder subtracts the one's 
> complement..."
>
> CDC 160 programming manual (1963), page 2-1: "All arithmetic is binary, one's 
> complement notation".
>
> Incidentally, these are the four of the five machines cited by the Wikipedia 
> article.

All sources before Knuth are clearly wrong.  How could they not?
Folks living in the pre-Knuth era lived without a deity.

:-P



Re: [PATCH] Loop unswitching: support gswitch statements.

2021-11-16 Thread Martin Liška

On 11/11/21 08:15, Richard Biener wrote:

So I'd try to do no functional change first, improving the costing and
setting up the transform to simply pick up the stmts to "fold" as discovered
during analysis (as I hinted you possibly can use gimple_uid to mark
the stmts that simplify, IIRC gimple_uid is preserved during copying.
gimple_uid would also scale better than gimple_plf in case we do
the analysis for all candidates at once).


Thinking about the analysis: am I correct that we want to properly calculate
the loop size for the true and false edges of a potential gcond before
actually unswitching?

We can do that by finding a first gcond candidate, evaluating (symbolic +
irange approach) all other gconds in the loop body, and using BB_REACHABLE
discovery, similarly to what we do now at lines 378-446.  Then
tree_num_loop_insns can be adjusted to count only these reachable blocks.
Having that, we can calculate the number of insns that will live in the
true/false loops.

Then we can call tree_unswitch_loop and do the gcond folding as we do in the
versioned loops.

Is that a step in the right direction?  Having that, we can then extend it to
gswitch statements.

Cheers,
Martin


Re: [PATCH] Fix spelling of ones' complement.

2021-11-16 Thread Koning, Paul via Gcc-patches



> On Nov 16, 2021, at 2:03 AM, Aldy Hernandez via Gcc-patches 
>  wrote:
> 
> On Tue, Nov 16, 2021, 03:20 Marek Polacek via Gcc-patches <
> gcc-patches@gcc.gnu.org> wrote:
> 
>> On Tue, Nov 16, 2021 at 02:01:47AM +, Koning, Paul via Gcc-patches
>> wrote:
>>> 
>>> 
>>>> On Nov 15, 2021, at 8:48 PM, Marek Polacek via Gcc-patches <
>> gcc-patches@gcc.gnu.org> wrote:
>>>>
>>>> Nitpicking time.  It's spelled "ones' complement" rather than "one's
>>>> complement".
>>> 
>>> Is that so?  I see Wikipedia claims it is, but there are no sources for
>> that claim.  (There is an assertion that it is "discussed at length on the
>> talk page" of an article about number representation, but in fact there is
>> no discussion there at all.)
>>> 
>>> I have never seen this spelling before, and I very much doubt its
>> validity.  For one thing, why then have "two's complement"?  For another,
>> to pick one random authority, J.E. Thornton in "Design of a computer -- the
>> Control Data 6600" refers to "one's complement" to describe the well known
>> mode used by that machine and its relatives.
>> 
>> Knuth, The Art of Computer Programming Volume 2, page 203-4:
>> 
>> "A two's complement number is complemented with respect to a single
>> power of 2, while a ones' complement number is complemented with respect
>> to a long sequence of 1s."
>> 
> 
> I think you get to do a drop mike when you pull out Knuth.
> 
> :-)

If that were the only source, sure.  But with authoritative sources for both 
terms (with the ones I quoted being the earlier ones) at the very least there 
is an argument that both terms are used.  

Some more: DEC PDP-1 handbook (April 1960), page 9: "Negative numbers are 
represented as the 1's complement of the positive numbers."

Univac 1107 CPU manual, page 2-6: "Next, the adder subtracts the one's 
complement..."

CDC 160 programming manual (1963), page 2-1: "All arithmetic is binary, one's 
complement notation".

Incidentally, these are four of the five machines cited by the Wikipedia
article.

Re: [PATCH v2] configure: define TARGET_LIBC_GNUSTACK on musl

2021-11-16 Thread Dragan Mladjenovic

Hi,

Looks fine to me. If possible, maybe it should even be back-ported to
stable branches.

Not sure if MIPS assembly sources (if any) in musl would need explicit
.note.GNU-stack to complement this?

Best regards,

Dragan

On 16-Nov-21 06:13, Ilya Lipnitskiy wrote:

musl only uses PT_GNU_STACK to set default thread stack size and has no
executable stack support[0], so there is no reason not to emit the
.note.GNU-stack section on musl builds.

[0]: 
https://lore.kernel.org/all/20190423192534.gn23...@brightrain.aerifal.cx/T/#u

gcc/ChangeLog:

* configure: Regenerate.
* configure.ac: define TARGET_LIBC_GNUSTACK on musl

Signed-off-by: Ilya Lipnitskiy 
---
  gcc/configure| 3 +++
  gcc/configure.ac | 3 +++
  2 files changed, 6 insertions(+)

diff --git a/gcc/configure b/gcc/configure
index 74b9d9be4c85..7091a838aefa 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -31275,6 +31275,9 @@ fi
  # Check if the target LIBC handles PT_GNU_STACK.
  gcc_cv_libc_gnustack=unknown
  case "$target" in
+  mips*-*-linux-musl*)
+gcc_cv_libc_gnustack=yes
+;;
   mips*-*-linux*)
  
  if test $glibc_version_major -gt 2 \

diff --git a/gcc/configure.ac b/gcc/configure.ac
index c9ee1fb8919e..8a2d34179a75 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -6961,6 +6961,9 @@ fi
  # Check if the target LIBC handles PT_GNU_STACK.
  gcc_cv_libc_gnustack=unknown
  case "$target" in
+  mips*-*-linux-musl*)
+gcc_cv_libc_gnustack=yes
+;;
   mips*-*-linux*)
  GCC_GLIBC_VERSION_GTE_IFELSE([2], [31], [gcc_cv_libc_gnustack=yes], )
  ;;


[committed] libstdc++: Fix out-of-bound array accesses in testsuite

2021-11-16 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux, pushed to trunk.


I fixed some undefined behaviour in string tests in r238609, but I only
fixed the narrow char versions. This applies the same fixes to the
wchar_t ones. These problems were found when testing a patch to make
std::basic_string usable in constexpr.

libstdc++-v3/ChangeLog:

* testsuite/21_strings/basic_string/modifiers/append/wchar_t/1.cc:
Fix reads past the end of strings.
* testsuite/21_strings/basic_string/operations/compare/wchar_t/1.cc:
Likewise.
* testsuite/experimental/string_view/operations/compare/wchar_t/1.cc:
Likewise.
---
 .../21_strings/basic_string/modifiers/append/wchar_t/1.cc | 2 +-
 .../21_strings/basic_string/operations/compare/wchar_t/1.cc   | 4 ++--
 .../experimental/string_view/operations/compare/wchar_t/1.cc  | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/append/wchar_t/1.cc 
b/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/append/wchar_t/1.cc
index bb2d682de8e..684209f143e 100644
--- 
a/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/append/wchar_t/1.cc
+++ 
b/libstdc++-v3/testsuite/21_strings/basic_string/modifiers/append/wchar_t/1.cc
@@ -117,7 +117,7 @@ void test01(void)
   VERIFY( str06 == L"corpus, corpus" );
 
   str06 = str02;
-  str06.append(L"corpus, ", 12);
+  str06.append(L"corpus, ", 9); // n=9 includes null terminator
   VERIFY( str06 != L"corpus, corpus, " );
 
 
diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string/operations/compare/wchar_t/1.cc
 
b/libstdc++-v3/testsuite/21_strings/basic_string/operations/compare/wchar_t/1.cc
index 27836f8e6fb..6f2113fb16a 100644
--- 
a/libstdc++-v3/testsuite/21_strings/basic_string/operations/compare/wchar_t/1.cc
+++ 
b/libstdc++-v3/testsuite/21_strings/basic_string/operations/compare/wchar_t/1.cc
@@ -81,8 +81,8 @@ test01()
   test_value(wcsncmp(str_1.data(), str_0.data(), 6), z);
   test_value(wcsncmp(str_1.data(), str_0.data(), 14), lt);
   test_value(wmemcmp(str_1.data(), str_0.data(), 6), z);
-  test_value(wmemcmp(str_1.data(), str_0.data(), 14), lt);
-  test_value(wmemcmp(L"costa marbella", L"costa rica", 14), lt);
+  test_value(wmemcmp(str_1.data(), str_0.data(), 10), lt);
+  test_value(wmemcmp(L"costa marbella", L"costa rica", 10), lt);
 
   // int compare(const basic_string& str) const;
   test_value(str_0.compare(str_1), gt); //because r>m
diff --git 
a/libstdc++-v3/testsuite/experimental/string_view/operations/compare/wchar_t/1.cc
 
b/libstdc++-v3/testsuite/experimental/string_view/operations/compare/wchar_t/1.cc
index db523e6a83c..20bb030970b 100644
--- 
a/libstdc++-v3/testsuite/experimental/string_view/operations/compare/wchar_t/1.cc
+++ 
b/libstdc++-v3/testsuite/experimental/string_view/operations/compare/wchar_t/1.cc
@@ -81,8 +81,8 @@ test01()
   test_value(wcsncmp(str_1.data(), str_0.data(), 6), z);
   test_value(wcsncmp(str_1.data(), str_0.data(), 14), lt);
   test_value(wmemcmp(str_1.data(), str_0.data(), 6), z);
-  test_value(wmemcmp(str_1.data(), str_0.data(), 14), lt);
-  test_value(wmemcmp(L"costa marbella", L"costa rica", 14), lt);
+  test_value(wmemcmp(str_1.data(), str_0.data(), 10), lt);
+  test_value(wmemcmp(L"costa marbella", L"costa rica", 10), lt);
 
   // int compare(const basic_string_view& str) const;
   test_value(str_0.compare(str_1), gt); //because r>m
-- 
2.31.1
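
The reads past the end fixed here come from passing wmemcmp a count larger than the shorter operand: L"costa rica" holds 10 characters plus the terminator, so a count of 14 reads beyond the array. A minimal sketch of a bounded comparison in C follows; bounded_wmemcmp is a hypothetical helper, not part of the testsuite.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: compare at most min(alen, blen) wide characters,
   so neither buffer is read past its stated length.  */
int bounded_wmemcmp(const wchar_t *a, size_t alen,
                    const wchar_t *b, size_t blen)
{
  assert(a != NULL && b != NULL);
  size_t n = alen < blen ? alen : blen;
  return wmemcmp(a, b, n);
}
```

Called with the two literals from the test and their lengths 14 and 10, it compares only the first 10 characters and still reports "less than", because 'm' sorts before 'r'.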



[committed] libstdc++: Fix typos in tests

2021-11-16 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux, pushed to trunk.


libstdc++-v3/ChangeLog:

* testsuite/21_strings/basic_string/allocator/71964.cc: Fix
typo.
* testsuite/23_containers/set/allocator/71964.cc: Likewise.
---
 .../testsuite/21_strings/basic_string/allocator/71964.cc| 2 +-
 libstdc++-v3/testsuite/23_containers/set/allocator/71964.cc | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/testsuite/21_strings/basic_string/allocator/71964.cc 
b/libstdc++-v3/testsuite/21_strings/basic_string/allocator/71964.cc
index c57cb96e971..4196b331aca 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/allocator/71964.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/allocator/71964.cc
@@ -40,7 +40,7 @@ template
   a.moved_from = true;
 }
 
-T* allocate(unsigned n) { return std::allocator<T>{}.allcoate(n); }
+T* allocate(unsigned n) { return std::allocator<T>{}.allocate(n); }
 void deallocate(T* p, unsigned n) { std::allocator<T>{}.deallocate(p, n); }
 
 bool moved_to;
diff --git a/libstdc++-v3/testsuite/23_containers/set/allocator/71964.cc 
b/libstdc++-v3/testsuite/23_containers/set/allocator/71964.cc
index 34a02d85e66..a2c166afd0f 100644
--- a/libstdc++-v3/testsuite/23_containers/set/allocator/71964.cc
+++ b/libstdc++-v3/testsuite/23_containers/set/allocator/71964.cc
@@ -40,7 +40,7 @@ template
   a.moved_from = true;
 }
 
-T* allocate(unsigned n) { return std::allocator<T>{}.allcoate(n); }
+T* allocate(unsigned n) { return std::allocator<T>{}.allocate(n); }
 void deallocate(T* p, unsigned n) { std::allocator<T>{}.deallocate(p, n); }
 
 bool moved_to;
-- 
2.31.1



Re: [PATCH] Loop unswitching: support gswitch statements.

2021-11-16 Thread Martin Liška

On 11/11/21 08:15, Richard Biener wrote:

If you look at simplify_using_entry_checks then this is really really simple,
so I'd try to abstract this, recording sth like a unswitch_predicate where
we store the condition we unswitch on plus maybe cache the constant
range of a VAR cmp CST variable condition on the true/false edge.  We
can then try to simplify each gcond/gswitch based on such an unswitch_predicate
(when we ever scan the loop once to discover all opportunities we'd have a
set of unswitch_predicates to try simplifying against).  As said the integer
range thing would be an improvement over the current state so even that
can be done as followup but I guess for gswitch support that's going to be
the thing to use.


I started working on the unswitch_predicate, where I also record the 
true/false-edge irange of the expression we unswitch on.

I noticed one significant problem, let's consider:

  for (int i = 0; i < size; i++)
  {
    double tmp;

    if (order == 1)
      tmp = -8 * a[i];
    else
      {
        if (order == 2)
          tmp = -4 * b[i];
        else
          tmp = a[i];
      }

    r[i] = 3.4f * tmp + d[i];
  }

We can end up with the first unswitching candidate being 'if (order == 2)' (I 
have a real benchmark where it happens).
So I collect ranges and they are [2,2] for the true edge and [-INF, 0], [3, INF] 
for the false edge (because we reach the condition only through the order != 1 path).
Then the loop is cloned and we have

if (order == 2)
   loop_version_1
else
   loop_version_2

but in loop_version_2 we wrongly fold 'if (order == 1)' to false because it's 
reflected in the range.

So the question is, can one iterate get_loop_body stmts in some dominator order?

Thanks,
Martin
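
To make the problem concrete, here is a hand-written C sketch of what a correct unswitching on 'order == 2' has to look like. It is not GCC output; kernel_ref, kernel_unswitched and kernels_agree are illustrative names, and the array contents are arbitrary. The point is that the false copy may only assume order != 2, so the 'order == 1' test has to stay.

```c
#include <assert.h>
#include <stddef.h>

/* Original loop: 'order == 2' is only tested when 'order != 1'.  */
void kernel_ref(int order, const double *a, const double *b,
                const double *d, double *r, size_t size)
{
  assert(a && b && d && r);
  for (size_t i = 0; i < size; i++)
    {
      double tmp;
      if (order == 1)
        tmp = -8 * a[i];
      else if (order == 2)
        tmp = -4 * b[i];
      else
        tmp = a[i];
      r[i] = 3.4f * tmp + d[i];
    }
}

/* Hand-unswitched on 'order == 2'.  */
void kernel_unswitched(int order, const double *a, const double *b,
                       const double *d, double *r, size_t size)
{
  assert(a && b && d && r);
  if (order == 2)
    {
      for (size_t i = 0; i < size; i++)
        r[i] = 3.4f * (-4 * b[i]) + d[i];
    }
  else
    {
      /* Outside the loop the false edge only tells us order != 2;
         order == 1 is still possible and must not be folded away.  */
      for (size_t i = 0; i < size; i++)
        {
          double tmp = (order == 1) ? -8 * a[i] : a[i];
          r[i] = 3.4f * tmp + d[i];
        }
    }
}

/* Returns 1 when both versions produce the same output for ORDER.  */
int kernels_agree(int order)
{
  const double a[4] = { 1.0, 2.0, 3.0, 4.0 };
  const double b[4] = { 0.5, 1.5, 2.5, 3.5 };
  const double d[4] = { 10.0, 20.0, 30.0, 40.0 };
  double r1[4], r2[4];

  kernel_ref(order, a, b, d, r1, 4);
  kernel_unswitched(order, a, b, d, r2, 4);
  for (size_t i = 0; i < 4; i++)
    {
      double diff = r1[i] - r2[i];
      if (diff > 1e-9 || diff < -1e-9)
        return 0;
    }
  return 1;
}
```

Folding 'order == 1' to false in the else branch would make the two kernels disagree for order == 1, which is the miscompilation described above.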




Re: [PATCH v4] Fix ICE when mixing VLAs and statement expressions [PR91038]

2021-11-16 Thread Uecker, Martin
Am Montag, den 08.11.2021, 19:13 +0100 schrieb Martin Uecker:
> Am Montag, den 08.11.2021, 12:13 -0500 schrieb Jason Merrill:
> > On 11/7/21 01:40, Uecker, Martin wrote:
> > > Am Mittwoch, den 03.11.2021, 10:18 -0400 schrieb Jason Merrill:
> 
> ...
> 
> > > Thank you! I made these changes and ran
> > > bootstrap and tests again.
> > 
> > Hmm, it doesn't look like you made the change to use the save_expr 
> > function instead of build1?
> 
> Oh, sorry. I wanted to change it and then forgot.
> Now also with this change (changelog as before).


Ok with this change?

Best,
Martin



> > > Ok for trunk?
> > > 
> > > 
> > > Any idea how to fix returning structs with
> > > VLA member from statement expressions?
> > 
> > Testcase?
> 
> void foo(void)
> {
>   ({ int N = 3; struct { char x[N]; } x; x; });
> }
> 
> The difference from the tests in this patch (which
> I also forgot to include in the last version) is that
> the object of variable size is returned from the
> statement expression and not a pointer to it.
> This can not happen with arrays because they decay
> to pointers.
> 
> 
> Martin
> 
> 
> > > Otherwise, I will add an error message to
> > > the FE in another patch.
> > > 
> > > Martin
> > > 
> 
> diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
> index 436df45df68..95083f95442 100644
> --- a/gcc/c-family/c-common.c
> +++ b/gcc/c-family/c-common.c
> @@ -3306,7 +3306,19 @@ pointer_int_sum (location_t loc, enum tree_code 
> resultcode,
>TREE_TYPE (result_type)))
>  size_exp = integer_one_node;
>else
> -size_exp = size_in_bytes_loc (loc, TREE_TYPE (result_type));
> +{
> +  size_exp = size_in_bytes_loc (loc, TREE_TYPE (result_type));
> +  /* Wrap the pointer expression in a SAVE_EXPR to make sure it
> +  is evaluated first when the size expression may depend
> +  on it for VM types.  */
> +  if (TREE_SIDE_EFFECTS (size_exp)
> +   && TREE_SIDE_EFFECTS (ptrop)
> +   && variably_modified_type_p (TREE_TYPE (ptrop), NULL))
> + {
> +   ptrop = save_expr (ptrop);
> +   size_exp = build2 (COMPOUND_EXPR, TREE_TYPE (intop), ptrop, size_exp);
> + }
> +}
>  
>/* We are manipulating pointer values, so we don't need to warn
>   about relying on undefined signed overflow.  We disable the
> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
> index c2ab96e7e18..84f7dc3c248 100644
> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c
> @@ -2964,7 +2964,9 @@ gimplify_var_or_parm_decl (tree *expr_p)
>   declaration, for which we've already issued an error.  It would
>   be really nice if the front end wouldn't leak these at all.
>   Currently the only known culprit is C++ destructors, as seen
> - in g++.old-deja/g++.jason/binding.C.  */
> + in g++.old-deja/g++.jason/binding.C.
> + Another possible culprit are size expressions for variably modified
> + types which are lost in the FE or not gimplified correctly.  */
>if (VAR_P (decl)
>&& !DECL_SEEN_IN_BIND_EXPR_P (decl)
>&& !TREE_STATIC (decl) && !DECL_EXTERNAL (decl)
> @@ -3109,16 +3111,22 @@ gimplify_compound_lval (tree *expr_p, gimple_seq 
> *pre_p, gimple_seq
> *post_p,
>   expression until we deal with any variable bounds, sizes, or
>   positions in order to deal with PLACEHOLDER_EXPRs.
>  
> - So we do this in three steps.  First we deal with the annotations
> - for any variables in the components, then we gimplify the base,
> - then we gimplify any indices, from left to right.  */
> + The base expression may contain a statement expression that
> + has declarations used in size expressions, so has to be
> + gimplified before gimplifying the size expressions.
> +
> + So we do this in three steps.  First we deal with variable
> + bounds, sizes, and positions, then we gimplify the base,
> + then we deal with the annotations for any variables in the
> + components and any indices, from left to right.  */
> +
>for (i = expr_stack.length () - 1; i >= 0; i--)
>  {
>tree t = expr_stack[i];
>  
>if (TREE_CODE (t) == ARRAY_REF || TREE_CODE (t) == ARRAY_RANGE_REF)
>   {
> -   /* Gimplify the low bound and element type size and put them into
> +   /* Deal with the low bound and element type size and put them into
>the ARRAY_REF.  If these values are set, they have already been
>gimplified.  */
> if (TREE_OPERAND (t, 2) == NULL_TREE)
> @@ -3127,18 +3135,8 @@ gimplify_compound_lval (tree *expr_p, gimple_seq 
> *pre_p, gimple_seq
> *post_p,
> if (!is_gimple_min_invariant (low))
>   {
> TREE_OPERAND (t, 2) = low;
> -   tret = gimplify_expr (&TREE_OPERAND (t, 2), pre_p,
> - post_p, is_gimple_reg,
> - fb_rvalue);
> -   ret = MIN (ret, tret);
>   }
>   }
> -   else
> -  

Re: [PATCH, rs6000] Optimization for vec_xl_sext

2021-11-16 Thread Bill Schmidt via Gcc-patches
Hi Hao Chen,

I don't understand.  This patch was already approved and you committed it. :-)
I know because I needed to make corresponding adjustments to the new builtins code.

Thanks,
Bill

On 11/15/21 8:16 PM, HAO CHEN GUI wrote:
> Hi,
>
>    The patch optimizes the code generation for vec_xl_sext builtin. Now all 
> the sign extensions are done on VSX registers directly.
>
>    Bootstrapped and tested on powerpc64le-linux with no regressions. Is this 
> okay for trunk? Any recommendations? Thanks a lot.
>
> ChangeLog
>
> 2021-11-16 Haochen Gui 
>
> gcc/
>     * config/rs6000/rs6000-call.c (altivec_expand_lxvr_builtin): Modify
>     the expansion for sign extension. All extensions are done on VSX
>     registers.
>
> gcc/testsuite/
>     * gcc.target/powerpc/p10_vec_xl_sext.c: New test.
>
> patch.diff
>
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index b4e13af4dc6..587e9fa2a2a 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -9779,7 +9779,7 @@ altivec_expand_lxvr_builtin (enum insn_code icode, tree 
> exp, rtx target, bool bl
>
>    if (sign_extend)
>  {
> -  rtx discratch = gen_reg_rtx (DImode);
> +  rtx discratch = gen_reg_rtx (V2DImode);
>    rtx tiscratch = gen_reg_rtx (TImode);
>
>    /* Emit the lxvr*x insn.  */
> @@ -9788,20 +9788,31 @@ altivec_expand_lxvr_builtin (enum insn_code icode, 
> tree exp, rtx target, bool bl
>     return 0;
>    emit_insn (pat);
>
> -  /* Emit a sign extension from QI,HI,WI to double (DI).  */
> -  rtx scratch = gen_lowpart (smode, tiscratch);
> +  /* Emit a sign extension from V16QI,V8HI,V4SI to V2DI.  */
> +  rtx temp1, temp2;
>    if (icode == CODE_FOR_vsx_lxvrbx)
> -   emit_insn (gen_extendqidi2 (discratch, scratch));
> +   {
> + temp1  = simplify_gen_subreg (V16QImode, tiscratch, TImode, 0);
> + emit_insn (gen_vsx_sign_extend_qi_v2di (discratch, temp1));
> +   }
>    else if (icode == CODE_FOR_vsx_lxvrhx)
> -   emit_insn (gen_extendhidi2 (discratch, scratch));
> +   {
> + temp1  = simplify_gen_subreg (V8HImode, tiscratch, TImode, 0);
> + emit_insn (gen_vsx_sign_extend_hi_v2di (discratch, temp1));
> +   }
>    else if (icode == CODE_FOR_vsx_lxvrwx)
> -   emit_insn (gen_extendsidi2 (discratch, scratch));
> -  /*  Assign discratch directly if scratch is already DI.  */
> -  if (icode == CODE_FOR_vsx_lxvrdx)
> -   discratch = scratch;
> +   {
> + temp1  = simplify_gen_subreg (V4SImode, tiscratch, TImode, 0);
> + emit_insn (gen_vsx_sign_extend_si_v2di (discratch, temp1));
> +   }
> +  else if (icode == CODE_FOR_vsx_lxvrdx)
> +   discratch = simplify_gen_subreg (V2DImode, tiscratch, TImode, 0);
> +  else
> +   gcc_unreachable ();
>
> -  /* Emit the sign extension from DI (double) to TI (quad).  */
> -  emit_insn (gen_extendditi2 (target, discratch));
> +  /* Emit the sign extension from V2DI (double) to TI (quad).  */
> +  temp2 = simplify_gen_subreg (TImode, discratch, V2DImode, 0);
> +  emit_insn (gen_extendditi2_vector (target, temp2));
>
>    return target;
>  }
> diff --git a/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c 
> b/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c
> new file mode 100644
> index 000..78e72ac5425
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/p10_vec_xl_sext.c
> @@ -0,0 +1,35 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target int128 } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
> +
> +#include <altivec.h>
> +
> +vector signed __int128
> +foo1 (signed long a, signed char *b)
> +{
> +  return vec_xl_sext (a, b);
> +}
> +
> +vector signed __int128
> +foo2 (signed long a, signed short *b)
> +{
> +  return vec_xl_sext (a, b);
> +}
> +
> +vector signed __int128
> +foo3 (signed long a, signed int *b)
> +{
> +  return vec_xl_sext (a, b);
> +}
> +
> +vector signed __int128
> +foo4 (signed long a, signed long *b)
> +{
> +  return vec_xl_sext (a, b);
> +}
> +
> +/* { dg-final { scan-assembler-times {\mvextsd2q\M} 4 } } */
> +/* { dg-final { scan-assembler-times {\mvextsb2d\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mvextsh2d\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mvextsw2d\M} 1 } } */
>


[PATCH, v5, OpenMP 5.0] Improve OpenMP target support for C++ [PR92120 v5]

2021-11-16 Thread Chung-Lin Tang

Hi Jakub,

On 2021/6/24 9:15 PM, Jakub Jelinek wrote:

On Fri, Jun 18, 2021 at 10:25:16PM +0800, Chung-Lin Tang wrote:

Note, you'll need to rebase your patch, it clashes with
r12-1768-g7619d33471c10fe3d149dcbb701d99ed3dd23528.
Sorry for that.  And sorry for patch review delay.


--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -13104,6 +13104,12 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> 
&types,
  return error_mark_node;
}
  t = TREE_OPERAND (t, 0);
+ if ((ort == C_ORT_ACC || ort == C_ORT_OMP)


Map clauses never appear on declare simd, so
(ort == C_ORT_ACC || ort == C_ORT_OMP)
previously meant always and since the in_reduction change is incorrect
(as C_ORT_OMP_TARGET is used for target construct but not for
e.g. target data* or target update).


+ && TREE_CODE (t) == MEM_REF)


Upon reviewing, it appears that most of these C_ORT_* tests are no longer 
needed, removed in new patch.


So please just use if (TREE_CODE (t) == MEM_REF)
or explain when it shouldn't trigger.


@@ -14736,6 +14743,11 @@ c_finish_omp_clauses (tree clauses, enum 
c_omp_region_type ort)
{
  while (TREE_CODE (t) == COMPONENT_REF)
t = TREE_OPERAND (t, 0);
+ if (TREE_CODE (t) == MEM_REF)
+   {
+ t = TREE_OPERAND (t, 0);
+ STRIP_NOPS (t);
+   }


This doesn't look correct.  At least the parsing (and the spec AFAIK)
doesn't ensure that if there is ->, it must come before all the dots.
So, if one uses map (s->x.y) the above would work, but if map (s->x.y->z) or
map (s.a->b->c->d->e) is used, it wouldn't.  I'd expect a single
while loop that looks through COMPONENT_REFs and MEM_REFs as they appear.
Maybe the handle_omp_array_sections_1 MEM_REF case too?

Or do you want to have it done incrementally, start with supporting only
a single -> first before all the dots and later on add support for the rest?
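
The "single while loop" suggested above can be sketched on a toy expression tree. This is plain C with made-up types (toy_tree, toy_code, toy_strip_refs); it does not use GCC's tree API and leaves out details such as STRIP_NOPS, so treat it only as an illustration of the control flow.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for GCC tree codes; this is not the real tree API.  */
enum toy_code { TOY_DECL, TOY_COMPONENT_REF, TOY_MEM_REF };

struct toy_tree
{
  enum toy_code code;
  const struct toy_tree *operand0; /* object for '.', pointer for '->' */
};

/* Walk through any interleaving of COMPONENT_REF ('.field') and
   MEM_REF (the dereference hidden in '->') down to the base.  */
const struct toy_tree *toy_strip_refs(const struct toy_tree *t)
{
  assert(t != NULL);
  while (t->code == TOY_COMPONENT_REF || t->code == TOY_MEM_REF)
    t = t->operand0;
  return t;
}

/* Builds the shape of 's->x.y->z' and returns 1 if stripping reaches 's'.  */
int toy_example_reaches_base(void)
{
  struct toy_tree s = { TOY_DECL, NULL };
  struct toy_tree deref_s = { TOY_MEM_REF, &s };        /* *s          */
  struct toy_tree x = { TOY_COMPONENT_REF, &deref_s };  /* (*s).x      */
  struct toy_tree y = { TOY_COMPONENT_REF, &x };        /* (*s).x.y    */
  struct toy_tree deref_y = { TOY_MEM_REF, &y };        /* *(*s).x.y   */
  struct toy_tree z = { TOY_COMPONENT_REF, &deref_y };  /* s->x.y->z   */

  return toy_strip_refs(&z) == &s;
}
```

A loop that strips COMPONENT_REFs first and then at most one MEM_REF would stop at the inner 'y' node for this shape, which is the limitation pointed out in the review.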

I think the 5.0 and especially 5.1 wording basically says that map clause
operand is arbitrary lvalue expression that includes array section support
too, so eventually we should just have somewhere in parsing scope a bool
whether OpenMP array sections are allowed or not, add OMP_ARRAY_REF or
similar tree code for those and after parsing the expression, ensure
array sections appear only where they can appear and for a subset of the
lvalue expressions where we have decl plus series of -> field or . field
or [ index ] or [ array section stuff ] handle those specially.
That arbitrary lvalue can certainly be done incrementally.
map (foo(123)->a.b[3]->c.d[:7]) and the like.


Indeed this kind of modification is sort of "as encountered", so there are
probably many cases that are not completely handled yet; it's not just
the front-end, but also changes in gimplify_scan_omp_clauses().

However, I had another patch that should've plowed a bit further on this:
https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570075.html
as well as those patch sets that Julian is working on.
(our current plan is to have my sets go in first, and Julian's on top,
to minimize clashing)


  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
  && OMP_CLAUSE_MAP_IMPLICIT (c)
  && (bitmap_bit_p (&map_head, DECL_UID (t))
@@ -14802,6 +14814,15 @@ c_finish_omp_clauses (tree clauses, enum 
c_omp_region_type ort)
   bias) to zero here, so it is not set erroneously to the pointer
   size later on in gimplify.c.  */
OMP_CLAUSE_SIZE (c) = size_zero_node;
+ indir_component_ref_p = false;
+ if ((ort == C_ORT_ACC || ort == C_ORT_OMP)


Same comment about ort tests.


+ && TREE_CODE (t) == COMPONENT_REF
+ && TREE_CODE (TREE_OPERAND (t, 0)) == MEM_REF)
+   {
+ t = TREE_OPERAND (TREE_OPERAND (t, 0), 0);
+ indir_component_ref_p = true;
+ STRIP_NOPS (t);
+   }


Again, this can handle only a single ->


@@ -42330,16 +42328,10 @@ cp_parser_omp_target (cp_parser *parser, cp_token 
*pragma_tok,
cclauses[C_OMP_CLAUSE_SPLIT_TARGET] = tc;
  }
}
- tree stmt = make_node (OMP_TARGET);
- TREE_TYPE (stmt) = void_type_node;
- OMP_TARGET_CLAUSES (stmt) = cclauses[C_OMP_CLAUSE_SPLIT_TARGET];
- c_omp_adjust_map_clauses (OMP_TARGET_CLAUSES (stmt), true);
- OMP_TARGET_BODY (stmt) = body;
- OMP_TARGET_COMBINED (stmt) = 1;
- SET_EXPR_LOCATION (stmt, pragma_tok->location);
- add_stmt (stmt);
- pc = &OMP_TARGET_CLAUSES (stmt);
- goto check_clauses;
+ c_omp_adjust_map_clauses (cclauses[C_OMP_CLAUSE_SPLIT_TARGET], true);
+ finish_omp_target (pragma_tok->location,
+cclauses[C_OMP_CLAUSE_SPLIT_TARGET], body, true);


What is 

Re: [PATCH 3/3] elf: Add _dl_find_eh_frame function

2021-11-16 Thread Adhemerval Zanella via Gcc-patches



On 03/11/2021 13:28, Florian Weimer via Gcc-patches wrote:
> This function is similar to __gnu_Unwind_Find_exidx as used on arm.
> It can be used to speed up the libgcc unwinder.

Besides the terse patch description, the design seems ok to accomplish the
lock-free read and update.  There are some questions and remarks below,
and I still need to revise the tests.

However the code is somewhat complex and I would like to have some feedback
if gcc will be willing to accept this change (I assume it would require
this code merge on glibc beforehand).

> ---
>  NEWS  |   4 +
>  bits/dlfcn_eh_frame.h |  33 +
>  dlfcn/Makefile|   2 +-
>  dlfcn/dlfcn.h |   2 +
>  elf/Makefile  |  31 +-
>  elf/Versions  |   3 +
>  elf/dl-close.c|   4 +
>  elf/dl-find_eh_frame.c| 864 ++
>  elf/dl-find_eh_frame.h|  90 ++
>  elf/dl-find_eh_frame_slow.h   |  55 ++
>  elf/dl-libc_freeres.c |   2 +
>  elf/dl-open.c |   5 +
>  elf/rtld.c|   7 +
>  elf/tst-dl_find_eh_frame-mod1.c   |  10 +
>  elf/tst-dl_find_eh_frame-mod2.c   |  10 +
>  elf/tst-dl_find_eh_frame-mod3.c   |  10 +
>  elf/tst-dl_find_eh_frame-mod4.c   |  10 +
>  elf/tst-dl_find_eh_frame-mod5.c   |  11 +
>  elf/tst-dl_find_eh_frame-mod6.c   |  11 +
>  elf/tst-dl_find_eh_frame-mod7.c   |  10 +
>  elf/tst-dl_find_eh_frame-mod8.c   |  10 +
>  elf/tst-dl_find_eh_frame-mod9.c   |  10 +
>  elf/tst-dl_find_eh_frame-threads.c| 237 +
>  elf/tst-dl_find_eh_frame.c| 179 
>  include/atomic_wide_counter.h |  14 +
>  include/bits/dlfcn_eh_frame.h |   1 +
>  include/link.h|   3 +
>  manual/Makefile   |   2 +-
>  manual/dynlink.texi   |  69 ++
>  manual/libdl.texi |  10 -
>  manual/probes.texi|   2 +-
>  manual/threads.texi   |   2 +-
>  sysdeps/i386/bits/dlfcn_eh_frame.h|  34 +
>  sysdeps/mach/hurd/i386/ld.abilist |   1 +
>  sysdeps/nios2/bits/dlfcn_eh_frame.h   |  34 +
>  sysdeps/unix/sysv/linux/aarch64/ld.abilist|   1 +
>  sysdeps/unix/sysv/linux/alpha/ld.abilist  |   1 +
>  sysdeps/unix/sysv/linux/arc/ld.abilist|   1 +
>  sysdeps/unix/sysv/linux/arm/be/ld.abilist |   1 +
>  sysdeps/unix/sysv/linux/arm/le/ld.abilist |   1 +
>  sysdeps/unix/sysv/linux/csky/ld.abilist   |   1 +
>  sysdeps/unix/sysv/linux/hppa/ld.abilist   |   1 +
>  sysdeps/unix/sysv/linux/i386/ld.abilist   |   1 +
>  sysdeps/unix/sysv/linux/ia64/ld.abilist   |   1 +
>  .../unix/sysv/linux/m68k/coldfire/ld.abilist  |   1 +
>  .../unix/sysv/linux/m68k/m680x0/ld.abilist|   1 +
>  sysdeps/unix/sysv/linux/microblaze/ld.abilist |   1 +
>  .../unix/sysv/linux/mips/mips32/ld.abilist|   1 +
>  .../sysv/linux/mips/mips64/n32/ld.abilist |   1 +
>  .../sysv/linux/mips/mips64/n64/ld.abilist |   1 +
>  sysdeps/unix/sysv/linux/nios2/ld.abilist  |   1 +
>  .../sysv/linux/powerpc/powerpc32/ld.abilist   |   1 +
>  .../linux/powerpc/powerpc64/be/ld.abilist |   1 +
>  .../linux/powerpc/powerpc64/le/ld.abilist |   1 +
>  sysdeps/unix/sysv/linux/riscv/rv32/ld.abilist |   1 +
>  sysdeps/unix/sysv/linux/riscv/rv64/ld.abilist |   1 +
>  .../unix/sysv/linux/s390/s390-32/ld.abilist   |   1 +
>  .../unix/sysv/linux/s390/s390-64/ld.abilist   |   1 +
>  sysdeps/unix/sysv/linux/sh/be/ld.abilist  |   1 +
>  sysdeps/unix/sysv/linux/sh/le/ld.abilist  |   1 +
>  .../unix/sysv/linux/sparc/sparc32/ld.abilist  |   1 +
>  .../unix/sysv/linux/sparc/sparc64/ld.abilist  |   1 +
>  sysdeps/unix/sysv/linux/x86_64/64/ld.abilist  |   1 +
>  sysdeps/unix/sysv/linux/x86_64/x32/ld.abilist |   1 +
>  64 files changed, 1795 insertions(+), 16 deletions(-)
>  create mode 100644 bits/dlfcn_eh_frame.h
>  create mode 100644 elf/dl-find_eh_frame.c
>  create mode 100644 elf/dl-find_eh_frame.h
>  create mode 100644 elf/dl-find_eh_frame_slow.h
>  create mode 100644 elf/tst-dl_find_eh_frame-mod1.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod2.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod3.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod4.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod5.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod6.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod7.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod8.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod9.c
>  create mode 100644 

Re: Use modref kills in tree-ssa-dse

2021-11-16 Thread Richard Biener via Gcc-patches
On Tue, 16 Nov 2021, Jan Hubicka wrote:

> > 
> > Not sure, tree-ssa-dse.c doesn't seem to handle MEM_REF with offset?
> > 
> > VN has adjust_offsets_for_equal_base_address for this purpose.  I
> > agree that some common functionality like
> > 
> > bool
> > get_relative_extent_of (const ao_ref *base, const ao_ref *ref,
> > poly_int64 *offset);
> > 
> > that computes [offset, offset + ref->[max_]size] of REF adjusted as to
> > make ao_ref_base have the same address (or return false if not
> > possible).  Then [ base->offset, base->offset + base->max_size ]
> > can be compared against that.
> 
> OK, I will look into that.
> > > +  if (valid_ao_ref_for_dse (write)
> > > +  && operand_equal_p (write->base, ref->base, OEP_ADDRESS_OF)
> > > +  && known_eq (write->size, write->max_size)
> > > +  && normalize_ref (write, ref)
> > 
> > normalize_ref alters 'write', I think we should work on a local
> > copy here.  See live_bytes_read which takes a copy of 'use_ref'.
> 
> We never process the same write twice (get_ao_ref always constructs a
> fresh copy), so this should be safe.  Or shall I turn the write
> parameter into "ao_ref write" instead of "ao_ref *write" just to be sure
> we do not break this in the future?

Yes.

Thanks,
Richard.


Re: [PATCH] ivopts: Improve code generated for very simple loops.

2021-11-16 Thread Richard Biener via Gcc-patches
On Mon, Nov 15, 2021 at 2:04 PM Roger Sayle  wrote:
>
>
> This patch tidies up the code that GCC generates for simple loops,
> by selecting/generating a simpler loop bound expression in ivopts.
> The original motivation came from looking at the following loop (from
> gcc.target/i386/pr90178.c)
>
> int *find_ptr (int* mem, int sz, int val)
> {
>   for (int i = 0; i < sz; i++)
>     if (mem[i] == val)
>       return &mem[i];
>   return 0;
> }
>
> which GCC currently compiles to:
>
> find_ptr:
>         movq    %rdi, %rax
>         testl   %esi, %esi
>         jle     .L4
>         leal    -1(%rsi), %ecx
>         leaq    4(%rdi,%rcx,4), %rcx
>         jmp     .L3
> .L7:    addq    $4, %rax
>         cmpq    %rcx, %rax
>         je      .L4
> .L3:    cmpl    %edx, (%rax)
>         jne     .L7
>         ret
> .L4:    xorl    %eax, %eax
>         ret
>
> Notice the relatively complex leal/leaq instructions, that result
> from ivopts using the following expression for the loop bound:
> inv_expr 2: ((unsigned long) ((unsigned int) sz_8(D) + 4294967295)
> * 4 + (unsigned long) mem_9(D)) + 4
>
> which results from NITERS being (unsigned int) sz_8(D) + 4294967295,
> i.e. (sz - 1), and the logic in cand_value_at determining the bound
> as BASE + NITERS*STEP at the start of the final iteration and as
> BASE + NITERS*STEP + STEP at the end of the final iteration.
>
> Ideally, we'd like the middle-end optimizers to simplify
> BASE + NITERS*STEP + STEP as BASE + (NITERS+1)*STEP, especially
> when NITERS already has the form BOUND-1, but with type conversions
> and possible overflow to worry about, the above "inv_expr 2" is the
> best that can be done by fold (without additional context information).
>
> This patch improves ivopts' cand_value_at by instead of using just
> the tree expression for NITERS, passing the data structure that
> explains how that expression was derived.  This allows us to peek
> under the surface to check that NITERS+1 doesn't overflow, and in
> this patch to use the SSA_NAME already holding the required value.
>
> In the motivating loop above, inv_expr 2 now becomes:
> (unsigned long) sz_8(D) * 4 + (unsigned long) mem_9(D)
>
> And as a result, on x86_64 we now generate:
>
> find_ptr:
>         movq    %rdi, %rax
>         testl   %esi, %esi
>         jle     .L4
>         movslq  %esi, %rsi
>         leaq    (%rdi,%rsi,4), %rcx
>         jmp     .L3
> .L7:    addq    $4, %rax
>         cmpq    %rcx, %rax
>         je      .L4
> .L3:    cmpl    %edx, (%rax)
>         jne     .L7
>         ret
> .L4:    xorl    %eax, %eax
>         ret
>
>
> This improvement required one minor tweak to GCC's testsuite for
> gcc.dg/wrapped-binop-simplify.c, where we again generate better
> code, and therefore no longer find as many optimization opportunities
> in later passes (vrp2).
>
> Previously:
>
> void v1 (unsigned long *in, unsigned long *out, unsigned int n)
> {
>   int i;
>   for (i = 0; i < n; i++) {
>     out[i] = in[i];
>   }
> }
>
> on x86_64 generated:
> v1:     testl   %edx, %edx
>         je      .L1
>         movl    %edx, %edx
>         xorl    %eax, %eax
> .L3:    movq    (%rdi,%rax,8), %rcx
>         movq    %rcx, (%rsi,%rax,8)
>         addq    $1, %rax
>         cmpq    %rax, %rdx
>         jne     .L3
> .L1:    ret
>
> and now instead generates:
> v1:     testl   %edx, %edx
>         je      .L1
>         movl    %edx, %edx
>         xorl    %eax, %eax
>         leaq    0(,%rdx,8), %rcx
> .L3:    movq    (%rdi,%rax), %rdx
>         movq    %rdx, (%rsi,%rax)
>         addq    $8, %rax
>         cmpq    %rax, %rcx
>         jne     .L3
> .L1:    ret

Is that actually better?  IIRC the addressing modes are both complex
and we now have an extra lea?  For this case I see we generate

  _15 = n_10(D) + 4294967295;
  _8 = (unsigned long) _15;
  _7 = _8 + 1;

where n is unsigned int so if we know that n is not zero we can simplify the
addition and conveniently the loop header test provides this guarantee.
IIRC there were some attempts to enhance match.pd for some
cases of such expressions.
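
Why the n != 0 guarantee matters can be seen by replaying the three statements in C with fixed-width types. This is a sketch only; niter_plus_one and niter_plus_one_simplified are illustrative names, and uint32_t/uint64_t stand in for unsigned int and unsigned long on x86_64.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the three GIMPLE statements: a 32-bit wrapping subtract,
   zero-extension to 64 bits, then a 64-bit add.  */
uint64_t niter_plus_one(uint32_t n)
{
  uint32_t t15 = n + UINT32_C(4294967295); /* n - 1, modulo 2^32 */
  uint64_t t8 = (uint64_t) t15;
  return t8 + 1;
}

/* The simplified form, valid only when n is known to be nonzero.  */
uint64_t niter_plus_one_simplified(uint32_t n)
{
  assert(n != 0); /* in the example above the loop header test ensures this */
  return (uint64_t) n;
}
```

For any nonzero n the two functions agree, but for n == 0 the unsimplified form yields 4294967296 rather than 0, so the rewrite is only safe under the guard that the loop header provides.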

>
> This patch has been tested on x86_64-pc-linux-gnu with a make bootstrap
> and make -k check with no new failures.  Ok for mainline?

+  /* If AFTER_ADJUST is required, the code below generates the equivalent
+   * of BASE + NITER * STEP + STEP, when ideally we'd prefer the expression
+   * BASE + (NITER + 1) * STEP, especially when NITER is often of the form
+   * SSA_NAME - 1.  Unfortunately, guaranteeing that adding 1 to NITER
+   * doesn't overflow is tricky, so we peek inside the TREE_NITER_DESC
+   * class for common idioms that we know are safe.  */

No '* ' each line.

+  if (after_adjust
+  && desc->control.no_overflow
+  && integer_onep (desc->control.step)
+  && integer_onep (desc->control.base)
+  && desc->cmp == LT_EXPR
+  && TREE_CODE (desc->bound) == SSA_NAME)
+{
+  niter = desc->bound;
+  after_adjust = false;
+}

I wonder if the 

Re: Use modref kills in tree-ssa-dse

2021-11-16 Thread Jan Hubicka via Gcc-patches
> 
> Not sure, tree-ssa-dse.c doesn't seem to handle MEM_REF with offset?
> 
> VN has adjust_offsets_for_equal_base_address for this purpose.  I
> agree that some common functionality like
> 
> bool
> get_relative_extent_of (const ao_ref *base, const ao_ref *ref,
> poly_int64 *offset);
> 
> that computes [offset, offset + ref->[max_]size] of REF adjusted as to
> make ao_ref_base have the same address (or return false if not
> possible).  Then [ base->offset, base->offset + base->max_size ]
> can be compared against that.

OK, I will look into that.
> > +  if (valid_ao_ref_for_dse (write)
> > +  && operand_equal_p (write->base, ref->base, OEP_ADDRESS_OF)
> > +  && known_eq (write->size, write->max_size)
> > +  && normalize_ref (write, ref)
> 
> normalize_ref alters 'write', I think we should work on a local
> copy here.  See live_bytes_read which takes a copy of 'use_ref'.

We never process the same write twice (get_ao_ref always constructs a
fresh copy), so this should be safe.  Or shall I turn the write
parameter into "ao_ref write" instead of "ao_ref *write" just to be sure
we do not break it in the future?

Thank you,
Honza


Re: [PATCH][GCC] arm: add armv9-a architecture to -march

2021-11-16 Thread Ramana Radhakrishnan via Gcc-patches
Hi There,

I think for AArch32 mapping it back to armv8-a sounds sufficient.  Unless we 
have string or math routines in newlib that make use of any ACLE guards that 
are beyond armv8-a …

Ramana


From: Richard Earnshaw 
Date: Tuesday, 16 November 2021 at 11:48
To: Christophe Lyon , Przemyslaw Wirkus 

Cc: Ramana Radhakrishnan , 
gcc-patches@gcc.gnu.org , Richard Earnshaw 

Subject: Re: [PATCH][GCC] arm: add armv9-a architecture to -march
You can't make an omelette without breaking eggs, as they say.  New
architectures need new assemblers.

However, I wonder if there's anything in v9-a that significantly affects
the quality of the base multilib code needed for building the libraries.
  It might be that we can deal with v9-a by just mapping it to the v8-a
equivalents.  That would then avoid the need for an updated assembler,
and reduce the build time and install footprint.

R.


On 16/11/2021 08:03, Christophe Lyon via Gcc-patches wrote:
> Hi,
>
>
> On Tue, Nov 9, 2021 at 12:36 PM Przemyslaw Wirkus via Gcc-patches <
> gcc-patches@gcc.gnu.org> wrote:
>
> -Original Message-
> From: Przemyslaw Wirkus
> Sent: 18 October 2021 10:37
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw ; Ramana
> Radhakrishnan ; Kyrylo Tkachov
> ; ni...@redhat.com
> Subject: [PATCH][GCC] arm: add armv9-a architecture to -march
>
> Hi,
>
> This patch is adding `armv9-a` to -march in Arm GCC.
>
> In this patch:
>+ Add `armv9-a` to -march.
>+ Update multilib with armv9-a and armv9-a+simd.
>
> After this patch three additional multilib directories are available:
>
> $ arm-none-eabi-gcc --print-multi-lib
> .;
> [...vanilla multi-lib dirs...]
> thumb/v9-a/nofp;@mthumb@march=armv9-a@mfloat-abi=soft
> thumb/v9-a+simd/softfp;@mthumb@march=armv9-a+simd@mfloat-abi=softfp
> thumb/v9-a+simd/hard;@mthumb@march=armv9-a+simd@mfloat-abi=hard
>
>>
>
> This is causing a GCC build failure when using "old" binutils (I'm using
> 2.36.1),
> because the new -march=armv9-a option is not supported. This breaks the
> multilib support.
>
> I don't remember how we handled similar cases in the past? Is that just
> "expected", and
> "current" GCC needs "current" binutils, or should we have a multilib list
> dependent on
> the actual binutils support? (I think this is not the case, and it sounds
> like an undesirable
> extra complication in an already overcrowded multilib Makefile)
>
> Christophe
>
> New multi-lib directories under
> $GCC_INSTALL_DIR/lib/gcc/arm-none-eabi/12.0.0/thumb are created:
>
> thumb/
> +--- v9-a
> |    |--- nofp
> |
> +--- v9-a+simd
>      |--- hard
>      |--- softfp
>
> Regtested on arm-none-eabi cross and no issues.
>
> OK for master?
>>
>> Thanks.
>>
>> commit 32ba7860ccaddd5219e6dae94a3d0653e124c9dd
>>
>>> Ok.
>>> Thanks,
>>> Kyrill
>>>
>>>
>
> gcc/ChangeLog:
>
>* config/arm/arm-cpus.in (armv9): New define.
>(ARMv9a): New group.
>(armv9-a): New arch definition.
>* config/arm/arm-tables.opt: Regenerate.
>* config/arm/arm.h (BASE_ARCH_9A): New arch enum value.
>* config/arm/t-aprofile: Added armv9-a and armv9+simd.
>* config/arm/t-arm-elf: Added arm9-a, v9_fps and all_v9_archs
>to MULTILIB_MATCHES.
>* config/arm/t-multilib: Added v9_a_nosimd_variants and
>v9_a_simd_variants to MULTILIB_MATCHES.
>* doc/invoke.texi: Update docs.
>
> gcc/testsuite/ChangeLog:
>
>* gcc.target/arm/multilib.exp: Update test with armv9-a entries.
>* lib/target-supports.exp (v9a): Add new armflag.
>(__ARM_ARCH_9A__): Add new armdef.
>
> --
> kind regards,
> Przemyslaw Wirkus
>>
>>


Re: [PATCH 5/5] vect: Support masked gather loads with SLP

2021-11-16 Thread Richard Biener via Gcc-patches
On Fri, Nov 12, 2021 at 7:06 PM Richard Sandiford via Gcc-patches
 wrote:
>
> This patch extends the previous SLP gather load support so
> that it can handle masked loads too.
>
> Regstrapped on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

OK.

Thanks,
Richard.

> Richard
>
>
> gcc/
> * tree-vect-slp.c (arg1_arg4_map): New variable.
> (vect_get_operand_map): Handle IFN_MASK_GATHER_LOAD.
> (vect_build_slp_tree_1): Likewise.
> (vect_build_slp_tree_2): Likewise.
> * tree-vect-stmts.c (vectorizable_load): Expect the mask to be
> the last SLP child node rather than the first.
>
> gcc/testsuite/
> * gcc.dg/vect/vect-gather-3.c: New test.
> * gcc.dg/vect/vect-gather-4.c: Likewise.
> * gcc.target/aarch64/sve/mask_gather_load_8.c: Likewise.
> ---
>  gcc/testsuite/gcc.dg/vect/vect-gather-3.c | 64 ++
>  gcc/testsuite/gcc.dg/vect/vect-gather-4.c | 48 ++
>  .../aarch64/sve/mask_gather_load_8.c  | 65 +++
>  gcc/tree-vect-slp.c   | 15 -
>  gcc/tree-vect-stmts.c | 21 --
>  5 files changed, 203 insertions(+), 10 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/vect/vect-gather-3.c
>  create mode 100644 gcc/testsuite/gcc.dg/vect/vect-gather-4.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/mask_gather_load_8.c
>
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-gather-3.c 
> b/gcc/testsuite/gcc.dg/vect/vect-gather-3.c
> new file mode 100644
> index 000..738bd3f3106
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-gather-3.c
> @@ -0,0 +1,64 @@
> +#include "tree-vect.h"
> +
> +#define N 16
> +
> +void __attribute__((noipa))
> +f (int *restrict y, int *restrict x, int *restrict indices)
> +{
> +  for (int i = 0; i < N; ++i)
> +{
> +  y[i * 2] = (indices[i * 2] < N * 2
> + ? x[indices[i * 2]] + 1
> + : 1);
> +  y[i * 2 + 1] = (indices[i * 2 + 1] < N * 2
> + ? x[indices[i * 2 + 1]] + 2
> + : 2);
> +}
> +}
> +
> +int y[N * 2];
> +int x[N * 2] = {
> +  72704, 52152, 51301, 96681,
> +  57937, 60490, 34504, 60944,
> +  42225, 28333, 88336, 74300,
> +  29250, 20484, 38852, 91536,
> +  86917, 63941, 31590, 21998,
> +  22419, 26974, 28668, 13968,
> +  3451, 20247, 44089, 85521,
> +  22871, 87362, 50555, 85939
> +};
> +int indices[N * 2] = {
> +  15, 0x1, 0xcafe0, 19,
> +  7, 22, 19, 1,
> +  0x2, 0x7, 15, 30,
> +  5, 12, 11, 11,
> +  10, 25, 5, 20,
> +  22, 24, 32, 28,
> +  30, 19, 6, 0xabcdef,
> +  7, 12, 8, 21
> +};
> +int expected[N * 2] = {
> +  91537, 2, 1, 22000,
> +  60945, 28670, 21999, 52154,
> +  1, 2, 91537, 50557,
> +  60491, 29252, 74301, 74302,
> +  88337, 20249, 60491, 22421,
> +  28669, 3453, 1, 22873,
> +  50556, 22000, 34505, 2,
> +  60945, 29252, 42226, 26976
> +};
> +
> +int
> +main (void)
> +{
> +  check_vect ();
> +
> +  f (y, x, indices);
> +  for (int i = 0; i < 32; ++i)
> +if (y[i] != expected[i])
> +  __builtin_abort ();
> +
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump "Loop contains only SLP stmts" vect { target 
> { vect_gather_load_ifn && vect_masked_load } } } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-gather-4.c 
> b/gcc/testsuite/gcc.dg/vect/vect-gather-4.c
> new file mode 100644
> index 000..ee2e4e4999a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-gather-4.c
> @@ -0,0 +1,48 @@
> +/* { dg-do compile } */
> +
> +#define N 16
> +
> +void
> +f1 (int *restrict y, int *restrict x1, int *restrict x2,
> +int *restrict indices)
> +{
> +  for (int i = 0; i < N; ++i)
> +{
> +  y[i * 2] = (indices[i * 2] < N * 2
> + ? x1[indices[i * 2]] + 1
> + : 1);
> +  y[i * 2 + 1] = (indices[i * 2 + 1] < N * 2
> + ? x2[indices[i * 2 + 1]] + 2
> + : 2);
> +}
> +}
> +
> +void
> +f2 (int *restrict y, int *restrict x, int *restrict indices)
> +{
> +  for (int i = 0; i < N; ++i)
> +{
> +  y[i * 2] = (indices[i * 2] < N * 2
> + ? x[indices[i * 2]] + 1
> + : 1);
> +  y[i * 2 + 1] = (indices[i * 2 + 1] < N * 2
> + ? x[indices[i * 2 + 1] * 2] + 2
> + : 2);
> +}
> +}
> +
> +void
> +f3 (int *restrict y, int *restrict x, int *restrict indices)
> +{
> +  for (int i = 0; i < N; ++i)
> +{
> +  y[i * 2] = (indices[i * 2] < N * 2
> + ? x[indices[i * 2]] + 1
> + : 1);
> +  y[i * 2 + 1] = (indices[i * 2 + 1] < N * 2
> + ? x[(unsigned int) indices[i * 2 + 1]] + 2
> + : 2);
> +}
> +}
> +
> +/* { dg-final { scan-tree-dump-not "Loop contains only SLP stmts" vect { 
> target vect_gather_load_ifn } } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/mask_gather_load_8.c 
> 

Re: [PATCH] regrename: Skip renaming if instruction is noop move.

2021-11-16 Thread Richard Biener via Gcc-patches
On Tue, Nov 16, 2021 at 12:45 PM Jojo R via Gcc-patches
 wrote:
>
> Skip renaming if the instruction is a noop move, as it will
> be removed for performance.

Is there any (target specific) testcase you can add?  Such commits are
problematic
when later bisected to since the intent isn't clear.

> gcc/
> * regrename.c (find_rename_reg): Return satisfied regno
> if instruction is noop move.
> ---
>  gcc/regrename.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/gcc/regrename.c b/gcc/regrename.c
> index b8a9ca36f22..cb605f5176b 100644
> --- a/gcc/regrename.c
> +++ b/gcc/regrename.c
> @@ -394,6 +394,9 @@ find_rename_reg (du_head_p this_head, enum reg_class 
> super_class,
>   this_head, *unavailable))
>  return this_head->tied_chain->regno;
>
> +  if (noop_move_p (this_head->first->insn))
> +return best_new_reg;
> +
>/* If PREFERRED_CLASS is not NO_REGS, we iterate in the first pass
>   over registers that belong to PREFERRED_CLASS and try to find the
>   best register within the class.  If that failed, we iterate in
> --
> 2.24.3 (Apple Git-128)
>


Re: [AArch64] Enable generation of FRINTNZ instructions

2021-11-16 Thread Richard Biener via Gcc-patches
On Fri, 12 Nov 2021, Andre Simoes Dias Vieira wrote:

> 
> On 12/11/2021 10:56, Richard Biener wrote:
> > On Thu, 11 Nov 2021, Andre Vieira (lists) wrote:
> >
> >> Hi,
> >>
> >> This patch introduces two IFN's FTRUNC32 and FTRUNC64, the corresponding
> >> optabs and mappings. It also creates a backend pattern to implement them
> >> for
> >> aarch64 and a match.pd pattern to idiom recognize these.
> >> These IFN's (and optabs) represent a truncation towards zero, as if
> >> performed
> >> by first casting it to a signed integer of 32 or 64 bits and then back to
> >> the
> >> same floating point type/mode.
> >>
> >> The match.pd pattern chooses to use these, when supported, regardless of
> >> trapping math, since these new patterns mimic the original behavior of
> >> truncating through an integer.
> >>
> >> I didn't think any of the existing IFN's represented these. I know it's a
> >> bit
> >> late in stage 1, but I thought this might be OK given it's only used by a
> >> single target and should have very little impact on anything else.
> >>
> >> Bootstrapped on aarch64-none-linux.
> >>
> >> OK for trunk?
> > On the RTL side ftrunc32/ftrunc64 would probably be better a conversion
> > optab (with two modes), so not
> >
> > +OPTAB_D (ftrunc32_optab, "ftrunc$asi2")
> > +OPTAB_D (ftrunc64_optab, "ftrunc$adi2")
> >
> > but
> >
> > OPTAB_CD (ftrunc_shrt_optab, "ftrunc$a$I$b2")
> >
> > or so?  I know that gets somewhat awkward for the internal function,
> > but IMHO we shouldn't tie our hands because of that?
> I tried doing this originally, but indeed I couldn't find a way to correctly
> tie the internal function to it.
> 
> direct_optab_supported_p with multiple types expect those to be of the same
> mode. I see convert_optab_supported_p does but I don't know how that is
> used...
> 
> Any ideas?

No "nice" ones.  The "usual" way is to provide fake arguments that
specify the type/mode.  We could use an integer argument directly
specifying the mode (then the IL would look host dependent - ugh),
or specify a constant zero in the intended mode (less visibly
obvious - but at least with -gimple dumping you'd see the type...).

In any case if people think going with two optabs is OK then
please consider using ftruncsi and ftruncdi instead of 32/64.

Richard.
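
For reference, the semantics described in the patch ("as if performed by first
casting it to a signed integer of 32 or 64 bits and then back") can be modelled
in plain C.  This is only a sketch of the value computed; the function names
are hypothetical, and it says nothing about exceptions or how the optab is
expanded.

```c
#include <assert.h>
#include <stdint.h>

/* Model of FTRUNC32 on double: truncate toward zero by converting to
   a signed 32-bit integer and back.  The conversion is only defined
   when the truncated value fits in the integer type.  */
double
ftrunc32_model (double x)
{
  assert (x > -2147483649.0 && x < 2147483648.0);
  return (double) (int32_t) x;
}

/* Same idea for FTRUNC64, going through a signed 64-bit integer.  */
double
ftrunc64_model (double x)
{
  assert (x >= -9223372036854775808.0 && x < 9223372036854775808.0);
  return (double) (int64_t) x;
}
```

For example, ftrunc32_model (-2.75) gives -2.0, and ftrunc64_model accepts
values such as 4294967296.5 that are out of range for the 32-bit variant.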


Re: Use modref kills in tree-ssa-dse

2021-11-16 Thread Richard Biener via Gcc-patches
On Mon, 15 Nov 2021, Jan Hubicka wrote:

> Hi,
> this patch extends tree-ssa-dse to use modref kill summary to clear
> live_bytes.  This makes it possible to remove calls that are killed
> in parts.
> 
> I noticed that DSE duplicates the logic of tree-ssa-alias that is 
> matching bases of memory accesses.  Here operand_equal_p (base1, base, 
> OEP_ADDRESS_OF) is used. So it won't work with mismatching memref 
> offsets.  We probably want to commonize this and add common function 
> that matches bases and returns offset adjustments. I wonder however if 
> it can catch any cases that the tree-ssa-alias code doesn't?

Not sure, tree-ssa-dse.c doesn't seem to handle MEM_REF with offset?

VN has adjust_offsets_for_equal_base_address for this purpose.  I
agree that some common functionality like

bool
get_relative_extent_of (const ao_ref *base, const ao_ref *ref,
poly_int64 *offset);

that computes [offset, offset + ref->[max_]size] of REF adjusted as to
make ao_ref_base have the same address (or return false if not
possible).  Then [ base->offset, base->offset + base->max_size ]
can be compared against that.
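
As a rough illustration of the comparison meant here, once both references are
expressed relative to the same base address the check reduces to interval
arithmetic.  The struct and function names below are hypothetical stand-ins
(plain longs instead of poly_int64, no ao_ref), not a proposed implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the offset/size part of an ao_ref; values
   are in bits, relative to a common base address.  */
struct extent
{
  long offset;
  long size;
};

/* Return true if [outer.offset, outer.offset + outer.size) fully
   contains [inner.offset, inner.offset + inner.size).  */
bool
extent_contains_p (struct extent outer, struct extent inner)
{
  assert (outer.size >= 0 && inner.size >= 0);
  return (inner.offset >= outer.offset
	  && inner.offset + inner.size <= outer.offset + outer.size);
}

/* Return true if the two extents share at least one bit.  */
bool
extents_overlap_p (struct extent a, struct extent b)
{
  return (a.offset < b.offset + b.size
	  && b.offset < a.offset + a.size);
}
```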

> Other check that stmt_kills_ref_p has and tree-ssa-dse is for 
> non-call-exceptions.
> 
> Bootstrapped/regtested x86_64-linux, OK?

See below.

> gcc/ChangeLog:
> 
>   * ipa-modref.c (get_modref_function_summary): New function.
>   * ipa-modref.h (get_modref_function_summary): Declare.
>   * tree-ssa-dse.c (clear_live_bytes_for_ref): Break out from ...
>   (clear_bytes_written_by): ... here; add handling of modref summary.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/tree-ssa/modref-dse-4.c: New test.
> 
> diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
> index df4612bbff9..8966f9fd2a4 100644
> --- a/gcc/ipa-modref.c
> +++ b/gcc/ipa-modref.c
> @@ -724,6 +724,22 @@ get_modref_function_summary (cgraph_node *func)
>return r;
>  }
>  
> +/* Get function summary for CALL if it exists, return NULL otherwise.
> +   If INTERPOSED is non-NULL set it to true if call may be interposed.  */
> +
> +modref_summary *
> +get_modref_function_summary (gcall *call, bool *interposed)
> +{
> +  tree callee = gimple_call_fndecl (call);
> +  if (!callee)
> +return NULL;
> +  struct cgraph_node *node = cgraph_node::get (callee);
> +  if (!node)
> +return NULL;
> +  if (interposed)
> +*interposed = !node->binds_to_current_def_p ();
> +  return get_modref_function_summary (node);
> +}
> +
>  namespace {
>  
>  /* Construct modref_access_node from REF.  */
> diff --git a/gcc/ipa-modref.h b/gcc/ipa-modref.h
> index 9e8a30fd80a..72e608864ce 100644
> --- a/gcc/ipa-modref.h
> +++ b/gcc/ipa-modref.h
> @@ -50,6 +50,7 @@ struct GTY(()) modref_summary
>  };
>  
>  modref_summary *get_modref_function_summary (cgraph_node *func);
> +modref_summary *get_modref_function_summary (gcall *call, bool *interposed);
>  void ipa_modref_c_finalize ();
>  void ipa_merge_modref_summary_after_inlining (cgraph_edge *e);
>  
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-4.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-4.c
> new file mode 100644
> index 000..81aa7dc587c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/modref-dse-4.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-dse2-details"  } */
> +struct a {int a,b,c;};
> +__attribute__ ((noinline))
> +void
> +kill_me (struct a *a)
> +{
> +  a->a=0;
> +  a->b=0;
> +  a->c=0;
> +}
> +__attribute__ ((noinline))
> +void
> +my_pleasure (struct a *a)
> +{
> +  a->a=1;
> +  a->c=2;
> +}
> +void
> +set (struct a *a)
> +{
> +  kill_me (a);
> +  my_pleasure (a);
> +  a->b=1;
> +}
> +/* { dg-final { scan-tree-dump "Deleted dead store: kill_me" "dse2" } } */
> diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c
> index ce0083a6dab..d2f54b0faad 100644
> --- a/gcc/tree-ssa-dse.c
> +++ b/gcc/tree-ssa-dse.c
> @@ -209,6 +209,24 @@ normalize_ref (ao_ref *copy, ao_ref *ref)
>return true;
>  }
>  
> +/* Update LIVE_BYTES tracking REF for write to WRITE:
> +   Verify we have the same base memory address, the write
> +   has a known size and overlaps with REF.  */
> +static void
> +clear_live_bytes_for_ref (sbitmap live_bytes, ao_ref *ref, ao_ref *write)
> +{
> +  HOST_WIDE_INT start, size;
> +
> +  if (valid_ao_ref_for_dse (write)
> +  && operand_equal_p (write->base, ref->base, OEP_ADDRESS_OF)
> +  && known_eq (write->size, write->max_size)
> +  && normalize_ref (write, ref)

normalize_ref alters 'write', I think we should work on a local
copy here.  See live_bytes_read which takes a copy of 'use_ref'.
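
To make the concern concrete, here is a simplified sketch of the
"work on a local copy" shape.  The types and helpers are hypothetical
(byte offsets, a 32-bit mask instead of an sbitmap) and only loosely follow
what normalize_ref does; the point is that the caller's write is not modified.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical byte range, standing in for the offset/size of an ao_ref.  */
struct byte_ref
{
  int offset;
  int size;
};

/* Clamp *COPY to the bytes tracked by REF.  Like normalize_ref, this
   modifies its first argument.  Return 0 if nothing is left.  */
static int
clamp_to_ref (struct byte_ref *copy, const struct byte_ref *ref)
{
  int start = copy->offset, end = copy->offset + copy->size;
  int ref_end = ref->offset + ref->size;
  if (start < ref->offset)
    start = ref->offset;
  if (end > ref_end)
    end = ref_end;
  if (start >= end)
    return 0;
  copy->offset = start;
  copy->size = end - start;
  return 1;
}

/* Clear from LIVE (one bit per byte of REF, at most 32 bytes) the bytes
   covered by WRITE.  WRITE is copied first, so the caller's object is
   left untouched.  */
uint32_t
clear_live_bytes (uint32_t live, const struct byte_ref *ref,
		  const struct byte_ref *write)
{
  struct byte_ref w = *write;	/* local copy */
  assert (ref->size > 0 && ref->size <= 32);
  if (!clamp_to_ref (&w, ref))
    return live;
  for (int i = 0; i < w.size; i++)
    live &= ~(UINT32_C (1) << (w.offset - ref->offset + i));
  return live;
}
```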

Otherwise looks good to me.

Thanks,
Richard.

> +  && (write->offset - ref->offset).is_constant ()
> +  && write->size.is_constant ())
> +bitmap_clear_range (live_bytes, start / BITS_PER_UNIT,
> + size / BITS_PER_UNIT);
> +}
> +
>  /* Clear any bytes written by STMT from the bitmap LIVE_BYTES.  The base
> address written by STMT must match the one 

[PATCH] middle-end/103248 - fix RDIV_EXPR handling with fixed point

2021-11-16 Thread Richard Biener via Gcc-patches
This fixes the previous adjustment to operation_could_trap_helper_p
where I failed to realize that RDIV_EXPR is also used for
fixed-point types.  It also fixes that handling by properly
checking for a fixed_zerop divisor.
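
The decision made by the patched RDIV_EXPR case can be summarised by the
following simplified model.  It is a sketch only: the real function takes a
tree divisor, whereas here the two relevant properties of the divisor are
passed as hypothetical boolean flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the RDIV_EXPR case after the patch.  */
bool
rdiv_could_trap_p (bool fp_operation, bool honor_snans,
		   bool trapping_math, bool divisor_constant,
		   bool divisor_zero)
{
  /* A divisor can only be known to be zero if it is constant.  */
  assert (!divisor_zero || divisor_constant);

  if (fp_operation)
    {
      if (honor_snans)
	return true;
      return trapping_math;
    }
  /* Fixed-point division: only a known nonzero constant divisor
     is treated as safe.  */
  if (!divisor_constant || divisor_zero)
    return true;
  return false;
}
```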

Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?

Thanks,
Richard.

2021-11-16  Richard Biener  

PR middle-end/103248
* tree-eh.c (operation_could_trap_helper_p): Properly handle
fixed-point RDIV_EXPR.

* gcc.dg/pr103248.c: New testcase.
---
 gcc/testsuite/gcc.dg/pr103248.c |  8 
 gcc/tree-eh.c   | 12 +---
 2 files changed, 17 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr103248.c

diff --git a/gcc/testsuite/gcc.dg/pr103248.c b/gcc/testsuite/gcc.dg/pr103248.c
new file mode 100644
index 000..da6232d21ee
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr103248.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target fixed_point } */
+/* { dg-options "-fnon-call-exceptions" } */
+
+_Accum sa;
+int c;
+
+void div_csa() { c /= sa; }
diff --git a/gcc/tree-eh.c b/gcc/tree-eh.c
index 3eff07fc8fe..916da85af2e 100644
--- a/gcc/tree-eh.c
+++ b/gcc/tree-eh.c
@@ -2474,10 +2474,16 @@ operation_could_trap_helper_p (enum tree_code op,
   return false;
 
 case RDIV_EXPR:
-  if (honor_snans)
+  if (fp_operation)
+   {
+ if (honor_snans)
+   return true;
+ return flag_trapping_math;
+   }
+  /* Fixed point operations also use RDIV_EXPR.  */
+  if (!TREE_CONSTANT (divisor) || fixed_zerop (divisor))
return true;
-  gcc_assert (fp_operation);
-  return flag_trapping_math;
+  return false;
 
 case LT_EXPR:
 case LE_EXPR:
-- 
2.31.1


[PATCH] OpenMP: Ensure that offloaded variables are public

2021-11-16 Thread Andrew Stubbs

Hi,

This patch is needed for AMD GCN offloading when we use the assembler 
from LLVM 13+.


The GCN runtime (libgomp+ROCm) requires that the location of all 
variables in the offloaded variables table are discoverable at runtime 
(using the "hsa_executable_symbol_get_info" API), and this only works 
when the symbols are exported from the binary. Previously we solved this 
by having mkoffload insert ".global" directives into the assembler text, 
but newer LLVM assemblers emit an error if we do this when the variable 
was previously declared ".local" (which happens when a variable is 
zero-initialized and placed in the BSS).


Since we can no longer easily fix them up after the fact, this patch 
fixes them up during OMP lowering.


OK?

Andrew

OpenMP: Ensure that offloaded variables are public

The AMD GCN runtime loader requires that variables in the offload table are
exported (public) so that it can locate the load address and do the mapping.

gcc/ChangeLog:

* config/gcn/mkoffload.c (process_asm): Don't add .global directives.
* omp-offload.c (pass_omp_target_link::execute): Make offload_vars
public.

diff --git a/gcc/config/gcn/mkoffload.c b/gcc/config/gcn/mkoffload.c
index b2e71ea5aa00..5b130cc6de71 100644
--- a/gcc/config/gcn/mkoffload.c
+++ b/gcc/config/gcn/mkoffload.c
@@ -573,10 +573,6 @@ process_asm (FILE *in, FILE *out, FILE *cfile)
  abort ();
obstack_int_grow (_os, varsize);
var_count++;
-
-   /* The HSA Runtime cannot locate the symbol if it is not
-  exported from the kernel.  */
-   fprintf (out, "\t.global %s\n", varname);
  }
break;
  }
diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 833f7ddea58f..c6fb87a5dee2 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -2799,6 +2799,18 @@ pass_omp_target_link::execute (function *fun)
}
 }
 
+  /* Variables in the offload table may need to be public for the runtime
+ loader to be able to locate them.  (This is true for at least amdgcn.)  */
+  if (offload_vars)
+for (auto it = offload_vars->begin (); it != offload_vars->end (); it++)
+if (!TREE_PUBLIC (*it))
+  {
+   TREE_PUBLIC (*it) = 1;
+
+   if (dump_enabled_p () && dump_flags & TDF_DETAILS)
+ dump_printf (MSG_NOTE, "Make offload var public: %T\n", *it);
+  }
+
   return 0;
 }
 


Re: [PATCH][GCC] arm: add armv9-a architecture to -march

2021-11-16 Thread Richard Earnshaw via Gcc-patches
You can't make an omelette without breaking eggs, as they say.  New 
architectures need new assemblers.


However, I wonder if there's anything in v9-a that significantly affects 
the quality of the base multilib code needed for building the libraries. 
 It might be that we can deal with v9-a by just mapping it to the v8-a 
equivalents.  That would then avoid the need for an updated assembler, 
and reduce the build time and install footprint.


R.


On 16/11/2021 08:03, Christophe Lyon via Gcc-patches wrote:

Hi,


On Tue, Nov 9, 2021 at 12:36 PM Przemyslaw Wirkus via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:


-Original Message-
From: Przemyslaw Wirkus
Sent: 18 October 2021 10:37
To: gcc-patches@gcc.gnu.org
Cc: Richard Earnshaw ; Ramana
Radhakrishnan ; Kyrylo Tkachov
; ni...@redhat.com
Subject: [PATCH][GCC] arm: add armv9-a architecture to -march

Hi,

This patch is adding `armv9-a` to -march in Arm GCC.

In this patch:
   + Add `armv9-a` to -march.
   + Update multilib with armv9-a and armv9-a+simd.

After this patch three additional multilib directories are available:

$ arm-none-eabi-gcc --print-multi-lib
.;
[...vanilla multi-lib dirs...]
thumb/v9-a/nofp;@mthumb@march=armv9-a@mfloat-abi=soft
thumb/v9-a+simd/softfp;@mthumb@march=armv9-a+simd@mfloat-abi=softfp
thumb/v9-a+simd/hard;@mthumb@march=armv9-a+simd@mfloat-abi=hard





This is causing a GCC build failure when using "old" binutils (I'm using
2.36.1),
because the new -march=armv9-a option is not supported. This breaks the
multilib support.

I don't remember how we handled similar cases in the past? Is that just
"expected", and
"current" GCC needs "current" binutils, or should we have a multilib list
dependent on
the actual binutils support? (I think this is not the case, and it sounds
like an undesirable
extra complication in an already overcrowded multilib Makefile)

Christophe


New multi-lib directories under
$GCC_INSTALL_DIR/lib/gcc/arm-none-eabi/12.0.0/thumb are created:

thumb/
+--- v9-a
|    |--- nofp
|
+--- v9-a+simd
     |--- hard
     |--- softfp

Regtested on arm-none-eabi cross and no issues.

OK for master?


Thanks.

commit 32ba7860ccaddd5219e6dae94a3d0653e124c9dd


Ok.
Thanks,
Kyrill




gcc/ChangeLog:

   * config/arm/arm-cpus.in (armv9): New define.
   (ARMv9a): New group.
   (armv9-a): New arch definition.
   * config/arm/arm-tables.opt: Regenerate.
   * config/arm/arm.h (BASE_ARCH_9A): New arch enum value.
   * config/arm/t-aprofile: Added armv9-a and armv9+simd.
   * config/arm/t-arm-elf: Added arm9-a, v9_fps and all_v9_archs
   to MULTILIB_MATCHES.
   * config/arm/t-multilib: Added v9_a_nosimd_variants and
   v9_a_simd_variants to MULTILIB_MATCHES.
   * doc/invoke.texi: Update docs.

gcc/testsuite/ChangeLog:

   * gcc.target/arm/multilib.exp: Update test with armv9-a entries.
   * lib/target-supports.exp (v9a): Add new armflag.
   (__ARM_ARCH_9A__): Add new armdef.

--
kind regards,
Przemyslaw Wirkus





[PATCH] regrename: Skip renaming if instruction is noop move.

2021-11-16 Thread Jojo R via Gcc-patches
Skip renaming if the instruction is a noop move, as it will
be removed for performance.

gcc/
* regrename.c (find_rename_reg): Return satisfied regno
if instruction is noop move.
---
 gcc/regrename.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/regrename.c b/gcc/regrename.c
index b8a9ca36f22..cb605f5176b 100644
--- a/gcc/regrename.c
+++ b/gcc/regrename.c
@@ -394,6 +394,9 @@ find_rename_reg (du_head_p this_head, enum reg_class 
super_class,
  this_head, *unavailable))
 return this_head->tied_chain->regno;
 
+  if (noop_move_p (this_head->first->insn))
+return best_new_reg;
+
   /* If PREFERRED_CLASS is not NO_REGS, we iterate in the first pass
  over registers that belong to PREFERRED_CLASS and try to find the
  best register within the class.  If that failed, we iterate in
-- 
2.24.3 (Apple Git-128)



Re: [PATCH 1/5] libstdc++: Import the fast_float library

2021-11-16 Thread Jonathan Wakely via Gcc-patches
On Tue, 16 Nov 2021 at 09:46, Florian Weimer via Libstdc++ <
libstd...@gcc.gnu.org> wrote:

> * Jonathan Wakely:
>
> > On Tue, 16 Nov 2021 at 08:01, Florian Weimer wrote:
> >>
> >> * Patrick Palka via Libstdc:
> >>
> >> > This copies the fast_float library[1] into the compiled-in library
> >> > sources.  We're going to use this library in our floating-point
> >> > std::from_chars implementation for faster and more portable parsing of
> >> > binary32/64 decimal strings.
> >> >
> >> > [1]: https://github.com/fastfloat/fast_float
> >> >
> >> > Series tested on x86_64, i686, ppc64, ppc64le and aarch64, does it
> >> > look OK for trunk?
> >>
> >> Missing Signed-off-by:?
> >
> > That's not needed if Patrick is still covered by an FSF assignment.
>
> But the submission is not covered by the FSF assignment.
>

Good point.


> > I think we could use Apache as well, because this code isn't going to
> > appear in public headers so the problematic clause doesn't apply. But
> > MIT is simpler.
>
> Okay, so you consider dynamic linking only?  I think the historic
> libstdc++ license is more permissive than Apache or MIT when used with
> GCC.  There aren't any notification or other requirements.
>
>
Another good point - the Apache license is (once again) problematic here.
So it's good we can choose the MIT one.


[committed] arc: Update (u)maddhisi4 patterns

2021-11-16 Thread Claudiu Zissulescu via Gcc-patches
The (u)maddhisi4 patterns are using the ARC's VMAC2H(U)
instruction with null destination, however, VMAC2H(U) doesn't
rewrite the accumulator.  This patch solves the destination issue
of VMAC2H by replacing it with DMACH(U) instruction.
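
For readers unfamiliar with the pattern names, the value that maddhisi4 and
umaddhisi4 are expected to compute can be written in C as below.  This is a
sketch of the arithmetic only; the function names are made up for
illustration and it does not model the accumulator register or DMACH(U)
itself.

```c
#include <assert.h>
#include <stdint.h>

/* maddhisi4: sign-extend two 16-bit operands, multiply them and add a
   32-bit addend.  The sketch asserts that the sum fits in 32 bits
   rather than modelling wraparound.  */
int32_t
maddhisi4_model (int16_t a, int16_t b, int32_t acc)
{
  int64_t wide = (int64_t) a * b + acc;
  assert (wide >= INT32_MIN && wide <= INT32_MAX);
  return (int32_t) wide;
}

/* umaddhisi4: the same with zero extension.  The casts avoid the
   implicit promotion of uint16_t to (signed) int; the result wraps
   modulo 2^32.  */
uint32_t
umaddhisi4_model (uint16_t a, uint16_t b, uint32_t acc)
{
  return (uint32_t) a * (uint32_t) b + acc;
}
```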

gcc/

* config/arc/arc.md (maddhisi4): Use a single move to accumulator.
(umaddhisi4): Likewise.
(machi): Update pattern.
(umachi): Likewise.

gcc/testsuite/

* gcc.target/arc/tmac-4.c: New test.

Signed-off-by: Claudiu Zissulescu 
---
 gcc/config/arc/arc.md | 34 +--
 gcc/testsuite/gcc.target/arc/tmac-4.c | 29 +++
 2 files changed, 46 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/tmac-4.c

diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 4919d275820..74ec38f1526 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -6023,26 +6023,26 @@ (define_insn "stack_irq_dwarf"
 (define_expand "maddhisi4"
   [(match_operand:SI 0 "register_operand" "")
(match_operand:HI 1 "register_operand" "")
-   (match_operand:HI 2 "extend_operand"   "")
+   (match_operand:HI 2 "register_operand" "")
(match_operand:SI 3 "register_operand" "")]
   "TARGET_PLUS_MACD"
   "{
-   rtx acc_reg = gen_rtx_REG (SImode, ACC_REG_FIRST);
+   rtx acc_reg = gen_rtx_REG (SImode, ACCL_REGNO);
 
emit_move_insn (acc_reg, operands[3]);
-   emit_insn (gen_machi (operands[1], operands[2]));
-   emit_move_insn (operands[0], acc_reg);
+   emit_insn (gen_machi (operands[0], operands[1], operands[2], acc_reg));
DONE;
   }")
 
 (define_insn "machi"
-  [(set (reg:SI ARCV2_ACC)
+  [(set (match_operand:SI 0 "register_operand" "=Ral,r")
(plus:SI
-(mult:SI (sign_extend:SI (match_operand:HI 0 "register_operand" "%r"))
- (sign_extend:SI (match_operand:HI 1 "register_operand" "r")))
-(reg:SI ARCV2_ACC)))]
+(mult:SI (sign_extend:SI (match_operand:HI 1 "register_operand" 
"%r,r"))
+ (sign_extend:SI (match_operand:HI 2 "register_operand" 
"r,r")))
+(match_operand:SI 3 "accl_operand" "")))
+   (clobber (reg:DI ARCV2_ACC))]
   "TARGET_PLUS_MACD"
-  "vmac2h\\t0,%0,%1"
+  "dmach\\t%0,%1,%2"
   [(set_attr "length" "4")
(set_attr "type" "multi")
(set_attr "predicable" "no")
@@ -6056,22 +6056,22 @@ (define_expand "umaddhisi4"
(match_operand:SI 3 "register_operand" "")]
   "TARGET_PLUS_MACD"
   "{
-   rtx acc_reg = gen_rtx_REG (SImode, ACC_REG_FIRST);
+   rtx acc_reg = gen_rtx_REG (SImode, ACCL_REGNO);
 
emit_move_insn (acc_reg, operands[3]);
-   emit_insn (gen_umachi (operands[1], operands[2]));
-   emit_move_insn (operands[0], acc_reg);
+   emit_insn (gen_umachi (operands[0], operands[1], operands[2], acc_reg));
DONE;
   }")
 
 (define_insn "umachi"
-  [(set (reg:SI ARCV2_ACC)
+  [(set (match_operand:SI 0 "register_operand" "=Ral,r")
(plus:SI
-(mult:SI (zero_extend:SI (match_operand:HI 0 "register_operand" "%r"))
- (zero_extend:SI (match_operand:HI 1 "register_operand" "r")))
-(reg:SI ARCV2_ACC)))]
+(mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" 
"%r,r"))
+ (zero_extend:SI (match_operand:HI 2 "register_operand" 
"r,r")))
+(match_operand:SI 3 "accl_operand" "")))
+   (clobber (reg:DI ARCV2_ACC))]
   "TARGET_PLUS_MACD"
-  "vmac2hu\\t0,%0,%1"
+  "dmachu\\t%0,%1,%2"
   [(set_attr "length" "4")
(set_attr "type" "multi")
(set_attr "predicable" "no")
diff --git a/gcc/testsuite/gcc.target/arc/tmac-4.c 
b/gcc/testsuite/gcc.target/arc/tmac-4.c
new file mode 100644
index 000..3c6b99327a7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/tmac-4.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-skip-if "" { ! { clmcpu } } } */
+/* { dg-options "-O3 -mbig-endian -mcpu=hs38" } */
+
+struct a {};
+struct b {
+  int c;
+  int d;
+};
+
+struct {
+  struct a e;
+  struct b f[];
+} g;
+short h;
+
+extern void bar (int *);
+
+int foo(void)
+{
+  struct b *a;
+  for (;;)
+{
+  a = [h];
+  bar(>d);
+}
+}
+
+/* { dg-final { scan-assembler "dmach" } } */
-- 
2.31.1



[PATCH] tree-optimization/102880 - improve CD-DCE

2021-11-16 Thread Richard Biener via Gcc-patches
The PR shows a missed control-dependent DCE caused by CFG cleanup
merging a forwarder resulting in a partially degenerate PHI node.
With control-dependent DCE we need to mark control dependences
of incoming edges into PHIs as necessary but that is unnecessarily
conservative for the case when two edges have the same value.
There is no easy way to mark only a subset of control dependences
of both edges necessary so the fix is to produce forwarder blocks
where then the control dependence captures the requirements more
precisely.

For gcc.dg/tree-ssa/ssa-dom-thread-7.c the number of edges in the
CFG decreases as we have commonized PHI arguments which in turn
results in different threadings.  The testcase is too complex
and the dump scanning too simple to do anything meaningful here
but to adjust the number of expected threads.

The same CFG massaging could be useful at RTL expansion time to
reduce the number of copies we need to insert on edges.
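
As a rough illustration of the grouping idea described above, the following C sketch sorts PHI arguments by a hash of their value and counts the groups of incoming edges that carry the same value. It is not the code from tree-ssa-dce.c: struct phi_arg, cmp_phi_arg and count_forwarder_candidates are hypothetical names, and equal hashes are simply assumed to mean equal values.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one PHI argument: the index of the incoming
   edge and a hash of the value arriving over that edge.  */
struct phi_arg
{
  int edge;
  unsigned value_hash;
};

/* qsort comparator ordering arguments by value hash, so that arguments
   with equal values end up adjacent.  */
static int
cmp_phi_arg (const void *a_, const void *b_)
{
  const struct phi_arg *a = (const struct phi_arg *) a_;
  const struct phi_arg *b = (const struct phi_arg *) b_;
  if (a->value_hash < b->value_hash)
    return -1;
  if (a->value_hash > b->value_hash)
    return 1;
  return 0;
}

/* Sort ARGS and return the number of groups of two or more edges that
   carry the same value.  Assumption: equal hashes mean equal values;
   real code would have to compare the values themselves.  */
static int
count_forwarder_candidates (struct phi_arg *args, int n)
{
  int groups = 0;
  assert (n >= 0);
  qsort (args, (size_t) n, sizeof (args[0]), cmp_phi_arg);
  for (int i = 0; i < n; )
    {
      int j = i + 1;
      while (j < n && args[j].value_hash == args[i].value_hash)
        j++;
      if (j - i >= 2)
        groups++;
      i = j;
    }
  return groups;
}
```

Each counted group stands for a set of incoming edges that could be routed through one forwarder block, so that the PHI in the original block sees a single argument for them.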

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-11-12  Richard Biener  

PR tree-optimization/102880
* tree-ssa-dce.c (sort_phi_args): New function.
(make_forwarders_with_degenerate_phis): Likewise.
(perform_tree_ssa_dce): Call
make_forwarders_with_degenerate_phis.

* gcc.dg/tree-ssa/pr102880.c: New testcase.
* gcc.dg/tree-ssa/pr69270-3.c: Robustify.
* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Change the number of
expected threadings.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr102880.c  |  27 +++
 gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c |   2 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-7.c|   2 +-
 gcc/tree-ssa-dce.c| 171 +-
 4 files changed, 196 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr102880.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr102880.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr102880.c
new file mode 100644
index 000..0306deedb6c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr102880.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+void foo(void);
+
+static int b, c, d, e, f, ah;
+static short g, ai, am, aq, as;
+static char an, at, av, ax, ay;
+static char a(char h, char i) { return i == 0 || h && i == 1 ? 0 : h % i; }
+static void ae(int h) {
+  if (a(b, h))
+foo();
+
+}
+int main() {
+  ae(1);
+  ay = a(0, ay);
+  ax = a(g, aq);
+  at = a(0, as);
+  av = a(c, 1);
+  an = a(am, f);
+  int al = e || ((a(1, ah) && b) & d) == 2;
+  ai = al;
+}
+
+/* We should eliminate the call to foo.  */
+/* { dg-final { scan-tree-dump-not "foo" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c
index 89735f67de2..5ffd5f71506 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c
@@ -3,7 +3,7 @@
 
 /* We're looking for a constant argument a PHI node.  There
should only be one if we unpropagate correctly.  */
-/* { dg-final { scan-tree-dump-times ", 1" 1 "uncprop1"} } */
+/* { dg-final { scan-tree-dump-times "<1\|, 1" 1 "uncprop1"} } */
 
 typedef long unsigned int size_t;
 typedef union gimple_statement_d *gimple;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
index d40a61fd725..b64e71dae22 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
@@ -11,7 +11,7 @@
to change decisions in switch expansion which in turn can expose new
jump threading opportunities.  Skip the later tests on aarch64.  */
 /* { dg-final { scan-tree-dump-not "Jumps threaded"  "dom3" { target { ! 
aarch64*-*-* } } } } */
-/* { dg-final { scan-tree-dump "Jumps threaded: 11"  "thread2" { target { ! 
aarch64*-*-* } } } } */
+/* { dg-final { scan-tree-dump "Jumps threaded: 7"  "thread2" { target { ! 
aarch64*-*-* } } } } */
 /* { dg-final { scan-tree-dump "Jumps threaded: 18"  "thread2" { target { 
aarch64*-*-* } } } } */
 
 enum STATE {
diff --git a/gcc/tree-ssa-dce.c b/gcc/tree-ssa-dce.c
index 1281e67489c..dbf02c434de 100644
--- a/gcc/tree-ssa-dce.c
+++ b/gcc/tree-ssa-dce.c
@@ -67,6 +67,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-scalar-evolution.h"
 #include "tree-ssa-propagate.h"
 #include "gimple-fold.h"
+#include "tree-ssa.h"
 
 static struct stmt_stats
 {
@@ -1612,6 +1613,164 @@ tree_dce_done (bool aggressive)
   worklist.release ();
 }
 
+/* Sort PHI argument values for make_forwarders_with_degenerate_phis.  */
+
+static int
+sort_phi_args (const void *a_, const void *b_)
+{
+  auto *a = (const std::pair *) a_;
+  auto *b = (const std::pair *) b_;
+  hashval_t ha = a->second;
+  hashval_t hb = b->second;
+  if (ha < hb)
+return -1;
+  else if (ha > hb)
+return 1;
+  else
+return 0;
+}
+
+/* Look for a non-virtual PHIs and make a forwarder block when all PHIs
+   have the 

RE: [vect-patterns] Refactor widen_plus/widen_minus as internal_fns

2021-11-16 Thread Joel Hutton via Gcc-patches
Updated patch 2 with explanation included in commit message and changes 
requested.

Bootstrapped and regression tested on aarch64
> -Original Message-
> From: Joel Hutton
> Sent: 12 November 2021 11:42
> To: Richard Biener 
> Cc: gcc-patches@gcc.gnu.org; Richard Sandiford
> 
> Subject: RE: [vect-patterns] Refactor widen_plus/widen_minus as
> internal_fns
> 
> > please use #define INCLUDE_MAP before the system.h include instead.
Done.

> > Is it really necessary to build a new std::map for each optab lookup?!
> > That looks quite ugly and inefficient.  We'd usually - if necessary at
> > all - build a auto_vec > and .sort () and .bsearch () 
> > it.
> Ok, I'll rework this part. In the meantime, to address your other comment.
Done.

> > I'm not sure I understand DEF_INTERNAL_OPTAB_MULTI_FN, neither this
> > cover letter nor the patch ChangeLog explains anything.
> 
> I'll attempt to clarify, if this makes things clearer I can include this in 
> the
> commit message of the respun patch:
> 
> DEF_INTERNAL_OPTAB_MULTI_FN is like DEF_INTERNAL_OPTAB_FN except it
> provides convenience wrappers for defining conversions that require a hi/lo
> split, like widening and narrowing operations.  Each definition for 
> will require an optab named  and two other optabs that you specify
> for signed and unsigned. The hi/lo pair is necessary because the widening
> operations take n narrow elements as inputs and return n/2 wide elements
> as outputs. The 'lo' operation operates on the first n/2 elements of input.
> The 'hi' operation operates on the second n/2 elements of input. Defining an
> internal_fn along with hi/lo variations allows a single internal function to 
> be
> returned from a vect_recog function that will later be expanded to hi/lo.
> 
> DEF_INTERNAL_OPTAB_MULTI_FN is used in internal-fn.def to register a
> widening internal_fn. It is defined differently in different places and 
> internal-
> fn.def is sourced from those places so the parameters given can be reused.
>   internal-fn.c: defined to expand to hi/lo signed/unsigned optabs, later
> defined to generate the  'expand_' functions for the hi/lo versions of the fn.
>   internal-fn.def: defined to invoke DEF_INTERNAL_OPTAB_FN for the original
> and hi/lo variants of the internal_fn
> 
>  For example:
>  IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI,
> IFN_VEC_WIDEN_PLUS_LO
> for aarch64: IFN_VEC_WIDEN_PLUS_HI   -> vec_widen_addl_hi_
> -> (u/s)addl2
>IFN_VEC_WIDEN_PLUS_LO  -> vec_widen_addl_lo_
> -> (u/s)addl
> 
> This gives the same functionality as the previous
> WIDEN_PLUS/WIDEN_MINUS tree codes which are expanded into
> VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
> 
> Let me know if I'm not expressing this clearly.
> 
> Thanks,
> Joel
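
To make the hi/lo split concrete, here is a small scalar model in C, assuming 8-bit inputs widened to 16 bits. The function names are made up for illustration and are not GCC internal functions or optabs; which half counts as 'lo' follows the description quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Widening add on the first n/2 input elements: each pair of narrow
   inputs produces one wide output.  */
static void
widen_plus_lo (const int8_t *a, const int8_t *b, int16_t *out, size_t n)
{
  assert (n % 2 == 0);
  for (size_t i = 0; i < n / 2; i++)
    out[i] = (int16_t) ((int16_t) a[i] + (int16_t) b[i]);
}

/* Widening add on the second n/2 input elements.  */
static void
widen_plus_hi (const int8_t *a, const int8_t *b, int16_t *out, size_t n)
{
  assert (n % 2 == 0);
  for (size_t i = 0; i < n / 2; i++)
    out[i] = (int16_t) ((int16_t) a[n / 2 + i] + (int16_t) b[n / 2 + i]);
}
```

With n narrow inputs, each function writes n/2 wide results, so the two calls together cover the whole input without any sum overflowing the narrow type.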


0001-vect-patterns-Refactor-to-allow-internal_fn-s.patch
Description: 0001-vect-patterns-Refactor-to-allow-internal_fn-s.patch


0002-vect-patterns-Refactor-widen_plus-as-internal_fn.patch
Description: 0002-vect-patterns-Refactor-widen_plus-as-internal_fn.patch


0003-Remove-widen_plus-minus_expr-tree-codes.patch
Description: 0003-Remove-widen_plus-minus_expr-tree-codes.patch


Re: [PATCH] PR tree-optimization/103216: optimize some A ? (b op CST) : b into b op (A?CST:CST2)

2021-11-16 Thread Richard Biener via Gcc-patches
On Mon, Nov 15, 2021 at 1:09 AM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> For this PR, we have:
>   if (d_5 < 0)
> goto ; [INV]
>   else
> goto ; [INV]
>
>:
>   v_7 = c_4 | -128;
>
>:
>   # v_1 = PHI 
>
> Which PHI-OPT will try to simplify
> "(d_5 < 0) ? (c_4 | -128) : c_4" which is not handled currently.
> This adds a few patterns which try to see if (a ? CST : CST1) simplifies,
> where CST1 is either 0, 1 or -1 depending on the operator.
> Note to optimize this case always, we should check to make sure that
> the a?CST:CST1 gets simplified to not include the conditional expression.
> The ! flag does not work as we want to have more simplifications than just
> when we simplify it to a leaf node (SSA_NAME or CONSTANT). This adds a new
> flag ^ to genmatch which says the simplification should happen but not down
> to the same kind of node.
> We could allow this for !GIMPLE and use fold_* rather than fold_buildN but I
> didn't see any use of it for now.
>
> Also all of these patterns need to be done late as other optimizations can be
> done without them.
>
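
The equivalence this transformation relies on can be checked in plain C for the bitwise-or case from the PR. This is only a sketch: the function names are invented here, and 0 is used as the second constant because it is the neutral element of |.

```c
#include <assert.h>

/* Original form: the operation sits inside one arm of the conditional.  */
static int
cond_of_or (int d, int c)
{
  return d < 0 ? (c | -128) : c;
}

/* Rewritten form: the operation is unconditional and the conditional
   only selects between the constant and the neutral element 0.  */
static int
or_of_cond (int d, int c)
{
  return c | (d < 0 ? -128 : 0);
}

/* Compare both forms over a small range of inputs.  */
static void
check_forms_agree (void)
{
  for (int d = -1; d <= 1; d++)
    for (int c = -256; c <= 255; c++)
      assert (cond_of_or (d, c) == or_of_cond (d, c));
}
```

The rewritten form only pays off when the inner conditional itself simplifies further, which is the condition the description above wants the new flag to express.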
> OK? Bootstrapped and tested on x86_64 with no regressions.
>
> gcc/ChangeLog:
>
> * doc/match-and-simplify.texi: Document ^ flag.
> * genmatch.c (expr::expr): Add Setting of force_simplify.
> (expr): Add force_simplify field.
> (expr::gen_transform): Add support for force_simplify field.
> (parser::parse_expr): Add parsing of ^ flag for the expr.
> * match.pd: New patterns to optimize "a ? (b op CST) : b".
> ---
>  gcc/doc/match-and-simplify.texi | 16 +
>  gcc/genmatch.c  | 35 ++--
>  gcc/match.pd| 41 +
>  3 files changed, 90 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/doc/match-and-simplify.texi b/gcc/doc/match-and-simplify.texi
> index e7e5a4f7299..4e3407c0263 100644
> --- a/gcc/doc/match-and-simplify.texi
> +++ b/gcc/doc/match-and-simplify.texi
> @@ -377,6 +377,22 @@ of the @code{vec_cond} expression but only if the actual 
> plus
>  operations both simplify.  Note this is currently only supported
>  for code generation targeting @code{GIMPLE}.
>
> +Another modifier for generated expressions is @code{^} which
> +tells the machinery to only consider the simplification in case
> +the marked expression simplified away from the original code.
> +Consider for example
> +
> +@smallexample
> +(simplify
> + (cond @@0 (plus:s @@1 INTEGER_CST@@2) @@1)
> + (plus @@1 (cond^ @@0 @@2 @{ build_zero_cst (type); @})))
> +@end smallexample
> +
> +which moves the inner @code{plus} operation to the outside of the
> +@code{cond} expression but only if the actual cond operation simplify
> +wayaway from cond.  Note this is currently only supported for code

s/wayaway/away/

> +generation targeting @code{GIMPLE}.
> +
>  As intermediate conversions are often optional there is a way to
>  avoid the need to repeat patterns both with and without such
>  conversions.  Namely you can mark a conversion as being optional
> diff --git a/gcc/genmatch.c b/gcc/genmatch.c
> index 95248455ec5..2dca1141df6 100644
> --- a/gcc/genmatch.c
> +++ b/gcc/genmatch.c
> @@ -698,12 +698,13 @@ public:
>  : operand (OP_EXPR, loc), operation (operation_),
>ops (vNULL), expr_type (NULL), is_commutative (is_commutative_),
>is_generic (false), force_single_use (false), force_leaf (false),
> -  opt_grp (0) {}
> +  force_simplify(false), opt_grp (0) {}
>expr (expr *e)
>  : operand (OP_EXPR, e->location), operation (e->operation),
>ops (vNULL), expr_type (e->expr_type), is_commutative 
> (e->is_commutative),
>is_generic (e->is_generic), force_single_use (e->force_single_use),
> -  force_leaf (e->force_leaf), opt_grp (e->opt_grp) {}
> +  force_leaf (e->force_leaf), force_simplify(e->force_simplify),
> +  opt_grp (e->opt_grp) {}
>void append_op (operand *op) { ops.safe_push (op); }
>/* The operator and its operands.  */
>id_base *operation;
> @@ -721,6 +722,9 @@ public:
>/* Whether in the result expression this should be a leaf node
>   with any children simplified down to simple operands.  */
>bool force_leaf;
> +  /* Whether in the result expression this should be a node
> + with any children simplified down not to use the original operator.  */
> +  bool force_simplify;
>/* If non-zero, the group for optional handling.  */
>unsigned char opt_grp;
>virtual void gen_transform (FILE *f, int, const char *, bool, int,
> @@ -2527,6 +2531,17 @@ expr::gen_transform (FILE *f, int indent, const char 
> *dest, bool gimple,
> fprintf (f, ", _o%d[%u]", depth, i);
>fprintf (f, ");\n");
>fprintf_indent (f, indent, "tem_op.resimplify (lseq, valueize);\n");

I wonder if with force_simplify we should pass NULL as lseq to resimplify?
That is, should we allow (plus^ (convert @0) @1) to simplify to
(convert 

[PATCH 2/2][GCC] arm: Declare MVE types internally via pragma

2021-11-16 Thread Murray Steele via Gcc-patches
Hi all,

This patch moves the implementation of MVE ACLE types from
arm_mve_types.h to inside GCC via a new pragma, which replaces the prior
type definitions. This allows for the types to be used internally for
intrinsic function definitions.

Bootstrapped and regression tested on arm-none-linux-gnuabihf, and
regression tested on arm-eabi -- no issues.

Thanks,
Murray

gcc/ChangeLog:

* config.gcc: Add arm-mve-builtins.o to extra_objs for arm-*-*-*
targets.
* config/arm/arm-c.c (arm_pragma_arm): Handle new pragma.
(arm_register_target_pragmas): Register new pragma.
* config/arm/arm-protos.h: Add arm_mve namespace and declare
arm_handle_mve_types_h.
* config/arm/arm_mve_types.h: Replace MVE type definitions with
new pragma.
* config/arm/t-arm: Add arm-mve-builtins.o target.
* config/arm/arm-mve-builtins.cc: New file.
* config/arm/arm-mve-builtins.def: New file.
* config/arm/arm-mve-builtins.h: New file.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/mve.exp: Add new subdirectories.
* gcc.target/arm/mve/general-c/type_redef_1.c: New test.
* gcc.target/arm/mve/general-c/type_redef_10.c: New test.
* gcc.target/arm/mve/general-c/type_redef_11.c: New test.
* gcc.target/arm/mve/general-c/type_redef_12.c: New test.
* gcc.target/arm/mve/general-c/type_redef_13.c: New test.
* gcc.target/arm/mve/general-c/type_redef_14.c: New test.
* gcc.target/arm/mve/general-c/type_redef_15.c: New test.
* gcc.target/arm/mve/general-c/type_redef_16.c: New test.
* gcc.target/arm/mve/general-c/type_redef_17.c: New test.
* gcc.target/arm/mve/general-c/type_redef_18.c: New test.
* gcc.target/arm/mve/general-c/type_redef_19.c: New test.
* gcc.target/arm/mve/general-c/type_redef_2.c: New test.
* gcc.target/arm/mve/general-c/type_redef_20.c: New test.
* gcc.target/arm/mve/general-c/type_redef_21.c: New test.
* gcc.target/arm/mve/general-c/type_redef_22.c: New test.
* gcc.target/arm/mve/general-c/type_redef_23.c: New test.
* gcc.target/arm/mve/general-c/type_redef_24.c: New test.
* gcc.target/arm/mve/general-c/type_redef_25.c: New test.
* gcc.target/arm/mve/general-c/type_redef_26.c: New test.
* gcc.target/arm/mve/general-c/type_redef_27.c: New test.
* gcc.target/arm/mve/general-c/type_redef_28.c: New test.
* gcc.target/arm/mve/general-c/type_redef_29.c: New test.
* gcc.target/arm/mve/general-c/type_redef_3.c: New test.
* gcc.target/arm/mve/general-c/type_redef_30.c: New test.
* gcc.target/arm/mve/general-c/type_redef_31.c: New test.
* gcc.target/arm/mve/general-c/type_redef_4.c: New test.
* gcc.target/arm/mve/general-c/type_redef_5.c: New test.
* gcc.target/arm/mve/general-c/type_redef_6.c: New test.
* gcc.target/arm/mve/general-c/type_redef_7.c: New test.
* gcc.target/arm/mve/general-c/type_redef_8.c: New test.
* gcc.target/arm/mve/general-c/type_redef_9.c: New test.
* gcc.target/arm/mve/general/double_pragmas_1.c: New test.
* gcc.target/arm/mve/general/nomve_1.c: New test.



diff --git a/gcc/config.gcc b/gcc/config.gcc
index 
3675e063a5365ff84854eb5c2c27921216494c69..50d3401e3aa94f077d7e0675ee443a94431dba1e
 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -352,7 +352,7 @@ arc*-*-*)
;;
 arm*-*-*)
cpu_type=arm
-   extra_objs="arm-builtins.o aarch-common.o"
+   extra_objs="arm-builtins.o arm-mve-builtins.o aarch-common.o"
extra_headers="mmintrin.h arm_neon.h arm_acle.h arm_fp16.h arm_cmse.h 
arm_bf16.h arm_mve_types.h arm_mve.h arm_cde.h"
target_type_format_char='%'
c_target_objs="arm-c.o"
diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
index 
cc7901bca8dc9c5c27ed6afc5bc26afd42689e6d..d1414f6e0e1c2bd0a7364b837c16adf493221376
 100644
--- a/gcc/config/arm/arm-c.c
+++ b/gcc/config/arm/arm-c.c
@@ -28,6 +28,7 @@
 #include "c-family/c-pragma.h"
 #include "stringpool.h"
 #include "arm-builtins.h"
+#include "arm-protos.h"
 
 tree
 arm_resolve_cde_builtin (location_t loc, tree fndecl, void *arglist)
@@ -129,6 +130,24 @@ arm_resolve_cde_builtin (location_t loc, tree fndecl, void 
*arglist)
   return call_expr;
 }
 
+/* Implement "#pragma GCC arm".  */
+static void
+arm_pragma_arm (cpp_reader *)
+{
+  tree x;
+  if (pragma_lex (&x) != CPP_STRING)
+{
+  error ("%<#pragma GCC arm%> requires a string parameter");
+  return;
+}
+
+  const char *name = TREE_STRING_POINTER (x);
+  if (strcmp (name, "arm_mve_types.h") == 0)
+arm_mve::handle_arm_mve_types_h ();
+  else
+error ("unknown %<#pragma GCC arm%> option %qs", name);
+}
+
 /* Implement TARGET_RESOLVE_OVERLOADED_BUILTIN.  This is currently only
used for the MVE related builtins for the CDE extension.
Here we ensure the type of arguments is such 

[PATCH 1/2][GCC] arm: Move arm_simd_info array declaration into header

2021-11-16 Thread Murray Steele via Gcc-patches
Hi all,

This patch moves the arm_simd_type and arm_type_qualifiers enums, and
arm_simd_info struct from arm-builtins.c into arm-builtins.h header.

This is a first step towards internalising the type definitions for MVE
predicate, vector, and tuple types.  By moving arm_simd_types into a
header, we allow future patches to use these type trees externally to
arm-builtins.c, which is a crucial step towards developing an MVE
intrinsics framework similar to the current SVE implementation.

Thanks,
Murray
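
For readers unfamiliar with the qualifier enum being moved: its values are bit flags, and the composite entries are unions of the basic ones. The sketch below mirrors a few values from the patch under shortened, hypothetical names (Q_CONST and friends are not GCC identifiers).

```c
#include <assert.h>

/* Hypothetical mirror of a few arm_type_qualifiers values.  */
enum qual
{
  Q_NONE = 0x0,
  Q_UNSIGNED = 0x1,  /* 1 << 0 */
  Q_CONST = 0x2,     /* 1 << 1 */
  Q_POINTER = 0x4,   /* 1 << 2 */
  Q_MAP_MODE = 0x80  /* 1 << 7 */
};

/* Return nonzero if every flag in MASK is set in Q.  */
static int
has_all (unsigned q, unsigned mask)
{
  return (q & mask) == mask;
}

/* Check that the composite values listed in the patch are unions of
   the basic flags.  */
static void
check_composites (void)
{
  assert ((Q_CONST | Q_POINTER) == 0x6);               /* const pointer */
  assert ((Q_POINTER | Q_MAP_MODE) == 0x84);           /* pointer, map mode */
  assert ((Q_CONST | Q_POINTER | Q_MAP_MODE) == 0x86); /* const pointer, map mode */
}
```

A test such as has_all (q, Q_CONST | Q_POINTER) then answers whether a given qualifier value describes a const pointer, whatever other flags are set.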

gcc/ChangeLog:

* config/arm/arm-builtins.c (enum arm_type_qualifiers): Move to
arm-builtins.h
(enum arm_simd_type): Move to arm-builtins.h
(struct arm_simd_type_info): Move to arm-builtins.h
* config/arm/arm-builtins.h (enum arm_simd_type): Move from
arm-builtins.c
(enum arm_type_qualifiers): Move from arm-builtins.c
(struct arm_simd_type_info): Move from arm-builtins.c



diff --git a/gcc/config/arm/arm-builtins.h b/gcc/config/arm/arm-builtins.h
index 
bee9f9bb83758820ca7faedf80b7e138026c1ca0..a40fa8950707314d3cc1372fb5c47a8891a18516
 100644
--- a/gcc/config/arm/arm-builtins.h
+++ b/gcc/config/arm/arm-builtins.h
@@ -32,4 +32,91 @@ enum resolver_ident {
 enum resolver_ident arm_describe_resolver (tree);
 unsigned arm_cde_end_args (tree);
 
+#define ENTRY(E, M, Q, S, T, G) E,
+enum arm_simd_type
+{
+#include "arm-simd-builtin-types.def"
+  __TYPE_FINAL
+};
+#undef ENTRY
+
+enum arm_type_qualifiers
+{
+  /* T foo.  */
+  qualifier_none = 0x0,
+  /* unsigned T foo.  */
+  qualifier_unsigned = 0x1, /* 1 << 0  */
+  /* const T foo.  */
+  qualifier_const = 0x2, /* 1 << 1  */
+  /* T *foo.  */
+  qualifier_pointer = 0x4, /* 1 << 2  */
+  /* const T * foo.  */
+  qualifier_const_pointer = 0x6,
+  /* Used when expanding arguments if an operand could
+ be an immediate.  */
+  qualifier_immediate = 0x8, /* 1 << 3  */
+  qualifier_unsigned_immediate = 0x9,
+  qualifier_maybe_immediate = 0x10, /* 1 << 4  */
+  /* void foo (...).  */
+  qualifier_void = 0x20, /* 1 << 5  */
+  /* Some patterns may have internal operands, this qualifier is an
+ instruction to the initialisation code to skip this operand.  */
+  qualifier_internal = 0x40, /* 1 << 6  */
+  /* Some builtins should use the T_*mode* encoded in a simd_builtin_datum
+ rather than using the type of the operand.  */
+  qualifier_map_mode = 0x80, /* 1 << 7  */
+  /* qualifier_pointer | qualifier_map_mode  */
+  qualifier_pointer_map_mode = 0x84,
+  /* qualifier_const_pointer | qualifier_map_mode  */
+  qualifier_const_pointer_map_mode = 0x86,
+  /* Polynomial types.  */
+  qualifier_poly = 0x100,
+  /* Lane indices - must be within range of previous argument = a vector.  */
+  qualifier_lane_index = 0x200,
+  /* Lane indices for single lane structure loads and stores.  */
+  qualifier_struct_load_store_lane_index = 0x400,
+  /* A void pointer.  */
+  qualifier_void_pointer = 0x800,
+  /* A const void pointer.  */
+  qualifier_const_void_pointer = 0x802,
+  /* Lane indices selected in pairs - must be within range of previous
+ argument = a vector.  */
+  qualifier_lane_pair_index = 0x1000,
+  /* Lane indices selected in quadtuplets - must be within range of previous
+ argument = a vector.  */
+  qualifier_lane_quadtup_index = 0x2000
+};
+
+struct arm_simd_type_info
+{
+  enum arm_simd_type type;
+
+  /* Internal type name.  */
+  const char *name;
+
+  /* Internal type name(mangled).  The mangled names conform to the
+ AAPCS (see "Procedure Call Standard for the ARM Architecture",
+ Appendix A).  To qualify for emission with the mangled names defined in
+ that document, a vector type must not only be of the correct mode but also
+ be of the correct internal Neon vector type (e.g. __simd64_int8_t);
+ these types are registered by arm_init_simd_builtin_types ().  In other
+ words, vector types defined in other ways e.g. via vector_size attribute
+ will get default mangled names.  */
+  const char *mangle;
+
+  /* Internal type.  */
+  tree itype;
+
+  /* Element type.  */
+  tree eltype;
+
+  /* Machine mode the internal type maps to.  */
+  machine_mode mode;
+
+  /* Qualifiers.  */
+  enum arm_type_qualifiers q;
+};
+
+extern struct arm_simd_type_info arm_simd_types[];
+
 #endif /* GCC_ARM_BUILTINS_H */
diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 
3a9ff8f26b8e222c52cb70f7509b714c3e475758..b6bf31349d8f0e996a6c169b061ebe05a2cf9acb
 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -48,53 +48,6 @@
 
 #define SIMD_MAX_BUILTIN_ARGS 7
 
-enum arm_type_qualifiers
-{
-  /* T foo.  */
-  qualifier_none = 0x0,
-  /* unsigned T foo.  */
-  qualifier_unsigned = 0x1, /* 1 << 0  */
-  /* const T foo.  */
-  qualifier_const = 0x2, /* 1 << 1  */
-  /* T *foo.  */
-  qualifier_pointer = 0x4, /* 1 << 2  */
-  /* const T * foo.  */
-  qualifier_const_pointer = 0x6,
-  /* Used when expanding arguments if an 

[PATCH 0/2][GCC] arm: Define MVE types internally

2021-11-16 Thread Murray Steele via Gcc-patches
Hi all,

This patch series implements the arm MVE ACLE types currently found
under config/arm/arm_mve_types.h internally via a new pragma. Exposing
the MVE ACLE types internally allows for an MVE intrinsics
implementation similar to the current SVE implementation.

Any prefix of the patch series should build and pass regression tests.

Thanks,
Murray

---

Murray Steele (2):
  arm: Move arm_simd_info array declaration into header
  arm: Define MVE types internally via pragma

 gcc/config.gcc|   2 +-
 gcc/config/arm/arm-builtins.c |  87 +---
 gcc/config/arm/arm-builtins.h |  87 
 gcc/config/arm/arm-c.c|  21 ++
 gcc/config/arm/arm-mve-builtins.cc| 192 ++
 gcc/config/arm/arm-mve-builtins.def   |  41 
 gcc/config/arm/arm-mve-builtins.h |  34 
 gcc/config/arm/arm-protos.h   |   5 +
 gcc/config/arm/arm_mve_types.h|  30 +--
 gcc/config/arm/t-arm  |  10 +
 .../arm/mve/general-c/type_redef_1.c  |   7 +
 .../arm/mve/general-c/type_redef_10.c |   7 +
 .../arm/mve/general-c/type_redef_11.c |   7 +
 .../arm/mve/general-c/type_redef_12.c |   7 +
 .../arm/mve/general-c/type_redef_13.c |   7 +
 .../arm/mve/general-c/type_redef_14.c |   7 +
 .../arm/mve/general-c/type_redef_15.c |   7 +
 .../arm/mve/general-c/type_redef_16.c |   7 +
 .../arm/mve/general-c/type_redef_17.c |   7 +
 .../arm/mve/general-c/type_redef_18.c |   7 +
 .../arm/mve/general-c/type_redef_19.c |   7 +
 .../arm/mve/general-c/type_redef_2.c  |   7 +
 .../arm/mve/general-c/type_redef_20.c |   7 +
 .../arm/mve/general-c/type_redef_21.c |   7 +
 .../arm/mve/general-c/type_redef_22.c |   7 +
 .../arm/mve/general-c/type_redef_23.c |   7 +
 .../arm/mve/general-c/type_redef_24.c |   7 +
 .../arm/mve/general-c/type_redef_25.c |   7 +
 .../arm/mve/general-c/type_redef_26.c |   7 +
 .../arm/mve/general-c/type_redef_27.c |   7 +
 .../arm/mve/general-c/type_redef_28.c |   7 +
 .../arm/mve/general-c/type_redef_29.c |   7 +
 .../arm/mve/general-c/type_redef_3.c  |   7 +
 .../arm/mve/general-c/type_redef_30.c |   7 +
 .../arm/mve/general-c/type_redef_31.c |   7 +
 .../arm/mve/general-c/type_redef_4.c  |   7 +
 .../arm/mve/general-c/type_redef_5.c  |   7 +
 .../arm/mve/general-c/type_redef_6.c  |   7 +
 .../arm/mve/general-c/type_redef_7.c  |   7 +
 .../arm/mve/general-c/type_redef_8.c  |   7 +
 .../arm/mve/general-c/type_redef_9.c  |   7 +
 .../arm/mve/general/double_pragmas_1.c|   8 +
 .../gcc.target/arm/mve/general/nomve_1.c  |   3 +
 gcc/testsuite/gcc.target/arm/mve/mve.exp  |   6 +
 44 files changed, 627 insertions(+), 116 deletions(-)
 create mode 100644 gcc/config/arm/arm-mve-builtins.cc
 create mode 100644 gcc/config/arm/arm-mve-builtins.def
 create mode 100644 gcc/config/arm/arm-mve-builtins.h
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_10.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_11.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_12.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_13.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_14.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_15.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_16.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_17.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_18.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_19.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_20.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_21.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_22.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_23.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_24.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_25.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_26.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_27.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_28.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_29.c
 create mode 100644 gcc/testsuite/gcc.target/arm/mve/general-c/type_redef_3.c
 create mode 100644 

Re: [PATCH] tree-optimization: [PR103218] Fold ((type)(a<0)) << SIGNBITOFA into ((type)a) & signbit

2021-11-16 Thread Richard Biener via Gcc-patches
On Sat, Nov 13, 2021 at 9:14 PM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> This folds ((type)(a<0)) << SIGNBITOFA into ((type)a) & signbit inside 
> match.pd.
> This was already handled in fold-const by:
> /* A < 0 ? <sign bit of A> : 0 is simply (A & <sign bit of A>).  */
> I have not removed it as we only simplify "a ? POW2 : 0" at the gimple level to 
> "a << CST1"
> and fold actually does the reverse of folding "(a<0)< 1<
>
> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

OK.

Thanks,
Richard.
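
The fold in the subject line can be sanity-checked in plain C for an 8-bit signed operand, where SIGNBITOFA is 7 and the sign bit mask is 0x80. The helper names below are illustrative only and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* ((type)(a < 0)) << 7, with type being an unsigned 8-bit integer.  */
static uint8_t
shift_form (int8_t a)
{
  return (uint8_t) ((a < 0) << 7);
}

/* ((type)a) & signbit: keep only the top bit of the converted operand.  */
static uint8_t
mask_form (int8_t a)
{
  return (uint8_t) ((uint8_t) a & 0x80);
}

/* Compare both forms for every 8-bit signed value.  */
static void
check_sign_bit_fold (void)
{
  for (int v = INT8_MIN; v <= INT8_MAX; v++)
    assert (shift_form ((int8_t) v) == mask_form ((int8_t) v));
}
```

The mask form needs no comparison at all, which is why the test in the patch expects "< 0" to be absent from the optimized dump.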

> PR tree-optimization/103218
>
> gcc/ChangeLog:
>
> * match.pd: New pattern for "((type)(a<0)) << SIGNBITOFA".
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/pr103218-1.c: New test.
> ---
>  gcc/match.pd   | 10 
>  gcc/testsuite/gcc.dg/tree-ssa/pr103218-1.c | 28 ++
>  2 files changed, 38 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr103218-1.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index a319aefa808..df31964e02f 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -865,6 +865,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  { tree utype = unsigned_type_for (type); }
>  (convert (rshift (lshift (convert:utype @0) @2) @3))
>
> +/* Fold ((type)(a<0)) << SIGNBITOFA into ((type)a) & signbit. */
> +(simplify
> + (lshift (convert (lt @0 integer_zerop@1)) INTEGER_CST@2)
> + (if (TYPE_SIGN (TREE_TYPE (@0)) == SIGNED
> +  && wi::eq_p (wi::to_wide (@2), TYPE_PRECISION (TREE_TYPE (@0)) - 1))
> +  (with { wide_int wone = wi::one (TYPE_PRECISION (type)); }
> +   (bit_and (convert @0)
> +{ wide_int_to_tree (type,
> +   wi::lshift (wone, wi::to_wide (@2))); }
> +
>  /* Fold (-x >> C) into -(x > 0) where C = precision(type) - 1.  */
>  (for cst (INTEGER_CST VECTOR_CST)
>   (simplify
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr103218-1.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr103218-1.c
> new file mode 100644
> index 000..f086f073b38
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr103218-1.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +/* PR tree-optimization/103218 */
> +
> +/* These first two are removed during forwprop1 */
> +signed char f(signed char a)
> +{
> +  signed char t = a < 0;
> +  int tt = (unsigned char)(t << 7);
> +  return tt;
> +}
> +signed char f0(signed char a)
> +{
> +  unsigned char t = a < 0;
> +  int tt = (unsigned char)(t << 7);
> +  return tt;
> +}
> +
> +/* This one is removed during phiopt. */
> +signed char  f1(signed char a)
> +{
> +if (a < 0)
> +  return 1u<<7;
> +return 0;
> +}
> +
> +/* These three examples should remove "a < 0" by optimized. */
> +/* { dg-final { scan-tree-dump-times "< 0" 0 "optimized"} } */
> --
> 2.17.1
>


Re: [GCC-11 PATCH] aarch64: enable Ampere-1 CPU (backport to GCC11)

2021-11-16 Thread Richard Sandiford via Gcc-patches
Philipp Tomsich  writes:
> This adds support and a basic tuning model for the Ampere Computing
> "Ampere-1" CPU.
>
> The Ampere-1 implements the ARMv8.6 architecture in A64 mode and is
> modelled as a 4-wide issue (as with all modern micro-architectures,
> the chosen issue rate is a compromise between the maximum dispatch
> rate and the maximum rate of uops issued to the scheduler).
>
> This adds the -mcpu=ampere1 command-line option and the relevant cost
> information/tuning tables for the Ampere-1.
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64-cores.def (AARCH64_CORE): New Ampere-1
>   core.
>   * config/aarch64/aarch64-tune.md: Regenerate.
>   * config/aarch64/aarch64-cost-tables.h: Add extra costs for
>   Ampere-1.
>   * config/aarch64/aarch64.c: Add tuning structures for Ampere-1.
>
> (cherry picked from 67b0d47e20e655c0dd53a76ea88aab60fafb2059)
>
> ---
> This is a backport from master and only affects the AArch64 backend.
>
> OK for GCC-11?

Yes, thanks.

Richard.

>
>  gcc/config/aarch64/aarch64-cores.def |   3 +-
>  gcc/config/aarch64/aarch64-cost-tables.h | 104 +++
>  gcc/config/aarch64/aarch64-tune.md   |   2 +-
>  gcc/config/aarch64/aarch64.c |  78 +
>  gcc/doc/invoke.texi  |   2 +-
>  5 files changed, 186 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64-cores.def 
> b/gcc/config/aarch64/aarch64-cores.def
> index b2aa1670561..4643e0e2795 100644
> --- a/gcc/config/aarch64/aarch64-cores.def
> +++ b/gcc/config/aarch64/aarch64-cores.def
> @@ -68,7 +68,8 @@ AARCH64_CORE("octeontx83",octeontxt83,   thunderx,  8A, 
>  AARCH64_FL_FOR_ARCH
>  AARCH64_CORE("thunderxt81",   thunderxt81,   thunderx,  8A,  
> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
> 0x0a2, -1)
>  AARCH64_CORE("thunderxt83",   thunderxt83,   thunderx,  8A,  
> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
> 0x0a3, -1)
>  
> -/* Ampere Computing cores. */
> +/* Ampere Computing ('\xC0') cores. */
> +AARCH64_CORE("ampere1", ampere1, cortexa57, 8_6A, AARCH64_FL_FOR_ARCH8_6, 
> ampere1, 0xC0, 0xac3, -1)
>  /* Do not swap around "emag" and "xgene1",
> this order is required to handle variant correctly. */
>  AARCH64_CORE("emag",emag,  xgene1,8A,  AARCH64_FL_FOR_ARCH8 
> | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, emag, 0x50, 0x000, 3)
> diff --git a/gcc/config/aarch64/aarch64-cost-tables.h 
> b/gcc/config/aarch64/aarch64-cost-tables.h
> index dd2e7e7cbb1..4b7e4e034a2 100644
> --- a/gcc/config/aarch64/aarch64-cost-tables.h
> +++ b/gcc/config/aarch64/aarch64-cost-tables.h
> @@ -650,4 +650,108 @@ const struct cpu_cost_table a64fx_extra_costs =
>}
>  };
>  
> +const struct cpu_cost_table ampere1_extra_costs =
> +{
> +  /* ALU */
> +  {
> +0, /* arith.  */
> +0, /* logical.  */
> +0, /* shift.  */
> +COSTS_N_INSNS (1), /* shift_reg.  */
> +0, /* arith_shift.  */
> +COSTS_N_INSNS (1), /* arith_shift_reg.  */
> +0, /* log_shift.  */
> +COSTS_N_INSNS (1), /* log_shift_reg.  */
> +0, /* extend.  */
> +COSTS_N_INSNS (1), /* extend_arith.  */
> +0, /* bfi.  */
> +0, /* bfx.  */
> +0, /* clz.  */
> +0, /* rev.  */
> +0, /* non_exec.  */
> +true   /* non_exec_costs_exec.  */
> +  },
> +  {
> +/* MULT SImode */
> +{
> +  COSTS_N_INSNS (3),   /* simple.  */
> +  COSTS_N_INSNS (3),   /* flag_setting.  */
> +  COSTS_N_INSNS (3),   /* extend.  */
> +  COSTS_N_INSNS (4),   /* add.  */
> +  COSTS_N_INSNS (4),   /* extend_add.  */
> +  COSTS_N_INSNS (18)   /* idiv.  */
> +},
> +/* MULT DImode */
> +{
> +  COSTS_N_INSNS (3),   /* simple.  */
> +  0,   /* flag_setting (N/A).  */
> +  COSTS_N_INSNS (3),   /* extend.  */
> +  COSTS_N_INSNS (4),   /* add.  */
> +  COSTS_N_INSNS (4),   /* extend_add.  */
> +  COSTS_N_INSNS (34)   /* idiv.  */
> +}
> +  },
> +  /* LD/ST */
> +  {
> +COSTS_N_INSNS (4), /* load.  */
> +COSTS_N_INSNS (4), /* load_sign_extend.  */
> +0, /* ldrd (n/a).  */
> +0, /* ldm_1st.  */
> +0, /* ldm_regs_per_insn_1st.  */
> +0, /* ldm_regs_per_insn_subsequent.  */
> +COSTS_N_INSNS (5), /* loadf.  */
> +COSTS_N_INSNS (5), /* loadd.  */
> +COSTS_N_INSNS (5), /* load_unaligned.  */
> +0, /* store.  */
> +0, /* strd.  */
> +0, /* stm_1st.  */
> +0, /* stm_regs_per_insn_1st.  */
> +  

[committed] arc: Update arc specific tests

2021-11-16 Thread Claudiu Zissulescu via Gcc-patches
Update the assembly output test patterns, taking into account which
platform the tests are executed on (bare-metal or Linux).

gcc/testsuite/ChangeLog:

* gcc.target/arc/add_n-combine.c: Update test patterns.
* gcc.target/arc/builtin_eh.c: Update test for linux platforms.
* gcc.target/arc/mul64-1.c: Disable this test while running on
linux.
* gcc.target/arc/tls-gd.c: Update matching patterns.
* gcc.target/arc/tls-ie.c: Likewise.
* gcc.target/arc/tls-ld.c: Likewise.
* gcc.target/arc/uncached-8.c: Likewise.

Signed-off-by: Claudiu Zissulescu 
---
 gcc/testsuite/gcc.target/arc/add_n-combine.c | 4 ++--
 gcc/testsuite/gcc.target/arc/builtin_eh.c| 3 ++-
 gcc/testsuite/gcc.target/arc/mul64-1.c   | 2 +-
 gcc/testsuite/gcc.target/arc/tls-gd.c| 4 ++--
 gcc/testsuite/gcc.target/arc/tls-ie.c| 4 ++--
 gcc/testsuite/gcc.target/arc/tls-ld.c| 6 +++---
 gcc/testsuite/gcc.target/arc/uncached-8.c| 5 +++--
 7 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arc/add_n-combine.c 
b/gcc/testsuite/gcc.target/arc/add_n-combine.c
index bc400df669e..84e261ece8f 100644
--- a/gcc/testsuite/gcc.target/arc/add_n-combine.c
+++ b/gcc/testsuite/gcc.target/arc/add_n-combine.c
@@ -45,6 +45,6 @@ void f() {
   a(at3.bn[bu]);
 }
 
-/* { dg-final { scan-assembler "add1" } } */
-/* { dg-final { scan-assembler "add2" } } */
+/* { dg-final { scan-assembler "@at1\\+1" } } */
+/* { dg-final { scan-assembler "@at2\\+2" } } */
 /* { dg-final { scan-assembler "add3" } } */
diff --git a/gcc/testsuite/gcc.target/arc/builtin_eh.c 
b/gcc/testsuite/gcc.target/arc/builtin_eh.c
index 717a54bb084..83f4f1d2ee0 100644
--- a/gcc/testsuite/gcc.target/arc/builtin_eh.c
+++ b/gcc/testsuite/gcc.target/arc/builtin_eh.c
@@ -19,4 +19,5 @@ foo (int x)
 /* { dg-final { scan-assembler "r13" } } */
 /* { dg-final { scan-assembler "r0" } } */
 /* { dg-final { scan-assembler "fp" } } */
-/* { dg-final { scan-assembler "fp,64" } } */
+/* { dg-final { scan-assembler "fp,64" { target { *-elf32-* } } } } */
+/* { dg-final { scan-assembler "fp,60" { target { *-linux-* } } } } */
diff --git a/gcc/testsuite/gcc.target/arc/mul64-1.c 
b/gcc/testsuite/gcc.target/arc/mul64-1.c
index 2543fc33d3f..1a351feee87 100644
--- a/gcc/testsuite/gcc.target/arc/mul64-1.c
+++ b/gcc/testsuite/gcc.target/arc/mul64-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-skip-if "MUL64 is ARC600 extension." { ! { clmcpu } } } */
+/* { dg-skip-if "MUL64 is ARC600 extension." { { ! { clmcpu } } || *-linux-* } 
} */
 /* { dg-options "-O2 -mmul64 -mbig-endian -mcpu=arc600" } */
 
 /* Check if mlo/mhi registers are correctly layout when we compile for
diff --git a/gcc/testsuite/gcc.target/arc/tls-gd.c 
b/gcc/testsuite/gcc.target/arc/tls-gd.c
index aa1b5429b08..d02af9537f8 100644
--- a/gcc/testsuite/gcc.target/arc/tls-gd.c
+++ b/gcc/testsuite/gcc.target/arc/tls-gd.c
@@ -13,5 +13,5 @@ int *ae2 (void)
   return &e2;
 }
 
-/* { dg-final { scan-assembler "add r0,pcl,@e2@tlsgd" } } */
-/* { dg-final { scan-assembler "bl @__tls_get_addr@plt" } } */
+/* { dg-final { scan-assembler "add\\s+r0,pcl,@e2@tlsgd" } } */
+/* { dg-final { scan-assembler "bl\\s+@__tls_get_addr@plt" } } */
diff --git a/gcc/testsuite/gcc.target/arc/tls-ie.c 
b/gcc/testsuite/gcc.target/arc/tls-ie.c
index 0c981cfbf67..f4ad635c4d3 100644
--- a/gcc/testsuite/gcc.target/arc/tls-ie.c
+++ b/gcc/testsuite/gcc.target/arc/tls-ie.c
@@ -13,5 +13,5 @@ int *ae2 (void)
   return &e2;
 }
 
-/* { dg-final { scan-assembler "ld r0,\\\[pcl,@e2@tlsie\\\]" } } */
-/* { dg-final { scan-assembler "add_s r0,r0,r25" } } */
+/* { dg-final { scan-assembler "ld\\s+r0,\\\[pcl,@e2@tlsie\\\]" } } */
+/* { dg-final { scan-assembler "add_s\\s+r0,r0,r25" } } */
diff --git a/gcc/testsuite/gcc.target/arc/tls-ld.c 
b/gcc/testsuite/gcc.target/arc/tls-ld.c
index 351c3f02abd..68ab9bf809c 100644
--- a/gcc/testsuite/gcc.target/arc/tls-ld.c
+++ b/gcc/testsuite/gcc.target/arc/tls-ld.c
@@ -13,6 +13,6 @@ int *ae2 (void)
   return &e2;
 }
 
-/* { dg-final { scan-assembler "add r0,pcl,@.tbss@tlsgd" } } */
-/* { dg-final { scan-assembler "bl @__tls_get_addr@plt" } } */
-/* { dg-final { scan-assembler "add_s r0,r0,@e2@dtpoff" } } */
+/* { dg-final { scan-assembler "add\\s+r0,pcl,@.tbss@tlsgd" } } */
+/* { dg-final { scan-assembler "bl\\s+@__tls_get_addr@plt" } } */
+/* { dg-final { scan-assembler "add_s\\s+r0,r0,@e2@dtpoff" } } */
diff --git a/gcc/testsuite/gcc.target/arc/uncached-8.c 
b/gcc/testsuite/gcc.target/arc/uncached-8.c
index 060229b11df..b5ea2359a9a 100644
--- a/gcc/testsuite/gcc.target/arc/uncached-8.c
+++ b/gcc/testsuite/gcc.target/arc/uncached-8.c
@@ -29,5 +29,6 @@ void bar (void)
   x.c.b.a = 10;
 }
 
-/* { dg-final { scan-assembler-times "st\.di" 1 } } */
-/* { dg-final { scan-assembler-times "st\.as\.di" 1 } } */
+/* { dg-final { scan-assembler-times "st\.di" 2 { target { *-linux-* } } } } */
+/* { dg-final { scan-assembler-times "st\.di" 1 { target { 
