date:20211130

On Tue, 30 Nov 2021, Jason Merrill wrote:

> On 11/29/21 10:03, Richard Biener via Gcc-patches wrote:
> > This cleans up unreachable code diagnosed by -Wunreachable-code-ctrl.
> > It largely follows the previous series but discovers a few extra
> > cases, namely dead code after break or continue or loops without
> > exits.
> > 
> > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> > 
> > Richard.
> > 
> > 2021-11-29  Richard Biener  
> > 
> > gcc/c/
> >  * gimple-parser.c (c_parser_gimple_postfix_expression):
> >  avoid unreachable code after break.
> > 
> > gcc/
> >  * cfgrtl.c (skip_insns_after_block): Refactor code to
> >  be more easily readable.
> >  * expr.c (op_by_pieces_d::run): Remove unreachable
> >  assert.
> >  * sched-deps.c (sched_analyze): Remove unreachable
> >  gcc_unreachable.
> >  * sel-sched-ir.c (in_same_ebb_p): Likewise.
> >  * tree-ssa-alias.c (nonoverlapping_refs_since_match_p):
> >  Remove unreachable code.
> >  * tree-vect-slp.c (vectorize_slp_instance_root_stmt):
> >  Refactor to avoid unreachable loop iteration.
> >  * tree.c (walk_tree_1): Remove unreachable break.
> >  * vec-perm-indices.c (vec_perm_indices::series_p): Remove
> >  unreachable return.
> > 
> > gcc/cp/
> >  * parser.c (cp_parser_postfix_expression): Remove
> >  unreachable code.
> >  * pt.c (tsubst_expr): Remove unreachable breaks.
> > 
> > gcc/fortran/
> >  * frontend-passes.c (gfc_expr_walker): Remove unreachable
> >  break.
> >  * scanner.c (skip_fixed_comments): Remove unreachable
> >  gcc_unreachable.
> >  * trans-expr.c (gfc_expr_is_variable): Refactor to make
> >  control flow more obvious.
> > ---
> >   gcc/c/gimple-parser.c |  8 +---
> >   gcc/cfgrtl.c  | 10 ++
> >   gcc/cp/parser.c   |  4 
> >   gcc/cp/pt.c   |  2 --
> >   gcc/expr.c|  3 ---
> >   gcc/fortran/frontend-passes.c |  1 -
> >   gcc/fortran/scanner.c |  1 -
> >   gcc/fortran/trans-expr.c  | 11 +++
> >   gcc/sched-deps.c  |  2 --
> >   gcc/sel-sched-ir.c|  3 ---
> >   gcc/tree-ssa-alias.c  |  3 ---
> >   gcc/tree-vect-slp.c   | 22 --
> >   gcc/tree.c|  2 --
> >   gcc/vec-perm-indices.c|  1 -
> >   14 files changed, 14 insertions(+), 59 deletions(-)
> > 
> > diff --git a/gcc/c/gimple-parser.c b/gcc/c/gimple-parser.c
> > index 32f22dbb8a7..f594a8ccb31 100644
> > --- a/gcc/c/gimple-parser.c
> > +++ b/gcc/c/gimple-parser.c
> > @@ -1698,13 +1698,7 @@ c_parser_gimple_postfix_expression (gimple_parser
> > )
> >}
> >  break;
> > }
> > -  else
> > -   {
> > - c_parser_error (parser, "expected expression");
> > - expr.set_error ();
> > - break;
> > -   }
> > -  break;
> > +  /* Fallthru.  */
> >   default:
> > c_parser_error (parser, "expected expression");
> > expr.set_error ();
> > diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c
> > index 3744adcc2ba..287a3db643a 100644
> > --- a/gcc/cfgrtl.c
> > +++ b/gcc/cfgrtl.c
> > @@ -3539,14 +3539,8 @@ skip_insns_after_block (basic_block bb)
> >  continue;
> >   
> > case NOTE:
> > - switch (NOTE_KIND (insn))
> > -   {
> > -   case NOTE_INSN_BLOCK_END:
> > - gcc_unreachable ();
> > -   default:
> > - continue;
> > -   }
> > - break;
> > + gcc_assert (NOTE_KIND (insn) != NOTE_INSN_BLOCK_END);
> > + continue;
> >   
> >case CODE_LABEL:
> >   if (NEXT_INSN (insn)
> > diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
> > index 0bd58525726..cc88a36dd39 100644
> > --- a/gcc/cp/parser.c
> > +++ b/gcc/cp/parser.c
> > @@ -7892,10 +7892,6 @@ cp_parser_postfix_expression (cp_parser *parser, bool
> > address_p, bool cast_p,
> >return postfix_expression;
> >}
> >   }
> > -
> > -  /* We should never get here.  */
> > -  gcc_unreachable ();
> 
> Hmm, I generally disagree with removing gcc_unreachable() asserts because they
> are unreachable; it seems like it increases the fragility of the code in case
> later changes wrongly make them reachable.

It seems to be used quite inconsistently in the code base though.  Are
you suggesting that the coding conventions be amended with something like

"If a function returns non-void and the last statement in the outermost
lexical scope is not a return statement you must add a gcc_unreachable ()
call at this place."

?  Most definitely most functions do _not_ follow this.

The case above involves

  while (true)
    {
      loop without exit (break or goto)
    }
}

If somebody added code doing a break without also adding a return,
they would get a diagnostic now.  With the gcc_unreachable () in place
they would _not_ get a diagnostic, but possibly some ICEs at compile
time for a testcase we have yet to discover.

Not sure what I think is better ;)
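
The trade-off above can be sketched outside GCC.  This is a minimal
illustration only: first_multiple_of_7 is a hypothetical helper, and
abort () is used as a stand-in for gcc_unreachable (); neither comes
from the patch.

```c
#include <stdlib.h>

/* Hypothetical helper, not from the patch: returns the first multiple
   of 7 that is >= start.  The loop has no break or goto, so the only
   way out of the function is the return inside the loop.  */
static int
first_multiple_of_7 (int start)
{
  int i = start;
  while (1)
    {
      if (i % 7 == 0)
        return i;
      i++;
    }
  /* Stand-in for gcc_unreachable ().  Nothing can reach this call
     today.  If a later change adds a break to the loop, this call
     turns the missing return into a run-time abort instead of a
     compile-time diagnostic about falling off a non-void function.  */
  abort ();
}
```

Without the trailing call, adding a break would leave a path that falls
off the end of a non-void function, which the compiler can report.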

I'm sure we do not want to diagnose

 "warning: gcc_unreachable () might be reached"

correct?  So placing a

Re: [PATCH] tree-optimization/103456 - Record only successes from object_sizes_set

On Wed, Dec 01, 2021 at 09:47:21AM +0530, Siddhesh Poyarekar wrote:
> Avoid overwriting osi->changed if object_sizes_set does not update the
> size, so that a previous success in the same pass is not overwritten.
> This fixes the bootstrap-ubsan build config, which was failing due to
> incorrect object size.
> 
> Also completed a bootstrap on x86_64 which didn't show any new failures.
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/103456
>   * tree-object-size.c (merge_object_sizes): Update osi->changed
>   only if object_sizes_set succeeded.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/103456
>   * gcc.dg/pr103456.c: New test.
> 
> Co-authored-by: Martin Liška 
> Signed-off-by: Siddhesh Poyarekar 
> ---
>  gcc/testsuite/gcc.dg/pr103456.c | 21 +
>  gcc/tree-object-size.c  |  3 ++-
>  2 files changed, 23 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/pr103456.c
> 
> diff --git a/gcc/testsuite/gcc.dg/pr103456.c b/gcc/testsuite/gcc.dg/pr103456.c
> new file mode 100644
> index 000..20322fbaab1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr103456.c

With -fsanitize=undefined that testcase belongs to
gcc/testsuite/gcc.dg/ubsan/pr103456.c, please move it there.

> @@ -0,0 +1,21 @@
> +/* PR tree-optimization/103456 */
> +/* { dg-do compile } */
> +/* { dg-options "-fsanitize=undefined -O -fdump-tree-objsz" } */
> +
> +static char *multilib_options = "m64/m32";
> +
> +void
> +used_arg_t (void)
> +{
> +  char *q = multilib_options;
> +  for (;;)
> +{
> +  while (*q)
> + q++;
> +  while (__builtin_strchr (q, '_') == 0)
> + while (*q)
> +   q++;
> +}
> +}
> +
> +/* { dg-final { scan-tree-dump-not "maximum object size 0" "objsz1" } } */
> diff --git a/gcc/tree-object-size.c b/gcc/tree-object-size.c
> index 3780437ff91..b4881ef198f 100644
> --- a/gcc/tree-object-size.c
> +++ b/gcc/tree-object-size.c
> @@ -854,7 +854,8 @@ merge_object_sizes (struct object_size_info *osi, tree 
> dest, tree orig,
>  orig_bytes = (offset > orig_bytes)
>? HOST_WIDE_INT_0U : orig_bytes - offset;
>  
> -  osi->changed = object_sizes_set (osi, varno, orig_bytes);
> +  if (object_sizes_set (osi, varno, orig_bytes))
> +osi->changed = true;
>  
>return bitmap_bit_p (osi->reexamine, SSA_NAME_VERSION (orig));
>  }

Otherwise LGTM.

Jakub

[PATCH] tree-optimization/103456 - Record only successes from object_sizes_set

2021-11-30 Thread Siddhesh Poyarekar

Avoid overwriting osi->changed if object_sizes_set does not update the
size, so that a previous success in the same pass is not overwritten.
This fixes the bootstrap-ubsan build config, which was failing due to
incorrect object size.

Also completed a bootstrap on x86_64 which didn't show any new failures.

gcc/ChangeLog:

PR tree-optimization/103456
* tree-object-size.c (merge_object_sizes): Update osi->changed
only if object_sizes_set succeeded.

gcc/testsuite/ChangeLog:

PR tree-optimization/103456
* gcc.dg/pr103456.c: New test.

Co-authored-by: Martin Liška 
Signed-off-by: Siddhesh Poyarekar 
---
 gcc/testsuite/gcc.dg/pr103456.c | 21 +
 gcc/tree-object-size.c  |  3 ++-
 2 files changed, 23 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr103456.c

diff --git a/gcc/testsuite/gcc.dg/pr103456.c b/gcc/testsuite/gcc.dg/pr103456.c
new file mode 100644
index 000..20322fbaab1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr103456.c
@@ -0,0 +1,21 @@
+/* PR tree-optimization/103456 */
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=undefined -O -fdump-tree-objsz" } */
+
+static char *multilib_options = "m64/m32";
+
+void
+used_arg_t (void)
+{
+  char *q = multilib_options;
+  for (;;)
+{
+  while (*q)
+   q++;
+  while (__builtin_strchr (q, '_') == 0)
+   while (*q)
+ q++;
+}
+}
+
+/* { dg-final { scan-tree-dump-not "maximum object size 0" "objsz1" } } */
diff --git a/gcc/tree-object-size.c b/gcc/tree-object-size.c
index 3780437ff91..b4881ef198f 100644
--- a/gcc/tree-object-size.c
+++ b/gcc/tree-object-size.c
@@ -854,7 +854,8 @@ merge_object_sizes (struct object_size_info *osi, tree 
dest, tree orig,
 orig_bytes = (offset > orig_bytes)
 ? HOST_WIDE_INT_0U : orig_bytes - offset;
 
-  osi->changed = object_sizes_set (osi, varno, orig_bytes);
+  if (object_sizes_set (osi, varno, orig_bytes))
+osi->changed = true;
 
   return bitmap_bit_p (osi->reexamine, SSA_NAME_VERSION (orig));
 }
-- 
2.31.1
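
The effect of the one-line fix can be sketched in isolation.  This is
an illustration under assumptions, not the GCC code: set_if_larger is a
hypothetical stand-in for object_sizes_set that returns true only when
it actually updates the stored value, and the two merge functions are
made up to contrast assignment with accumulation.

```c
#include <stdbool.h>

/* Hypothetical stand-in for object_sizes_set: returns true only when
   the stored value is actually updated.  */
static bool
set_if_larger (unsigned *slot, unsigned val)
{
  if (val > *slot)
    {
      *slot = val;
      return true;
    }
  return false;
}

/* Old shape: every call overwrites the flag, so a later "no update"
   erases an earlier success.  */
static bool
merge_overwrite (unsigned *slot, const unsigned *vals, int n)
{
  bool changed = false;
  for (int i = 0; i < n; i++)
    changed = set_if_larger (slot, vals[i]);
  return changed;
}

/* New shape: the flag is only ever set, never cleared, so one success
   is enough to report a change.  */
static bool
merge_accumulate (unsigned *slot, const unsigned *vals, int n)
{
  bool changed = false;
  for (int i = 0; i < n; i++)
    if (set_if_larger (slot, vals[i]))
      changed = true;
  return changed;
}
```

With the inputs 5 then 3 and a slot starting at 0, the first variant
reports no change even though the slot was updated, while the second
reports the change.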

Re: [PATCH] Modify combine pattern by anding a pseudo with its nonzero bits

2021-11-30 Thread HAO CHEN GUI via Gcc-patches

Hi Segher,

   Thanks for your review. Please see my comments.

On 1/12/2021 2:11 AM, Segher Boessenkool wrote:
> Hi!
>
> On Tue, Nov 30, 2021 at 04:46:34PM +0800, HAO CHEN GUI wrote:
>>     This patch modifies the combine pattern with a helper - 
>> change_pseudo_and_mask when recog fails. The helper converts a single pseudo 
>> to the pseudo and with a mask if the outer operator is IOR/XOR/PLUS and the 
>> inner operator is ASHIFT/LSHIFTRT/AND. The conversion helps match shift + 
>> ior pattern.
>>
>>     Bootstrapped and tested on powerpc64-linux BE and LE with no 
>> regressions. Is this okay for trunk? Any recommendations? Thanks a lot.
> (Please make shorter lines in email.  70 chars is usual).
>
>> gcc/
>>     * combine.c (change_pseudo_and_mask): New.
>>     (recog_for_combine): If recog fails, try again with the pattern
>>     modified by change_pseudo_and_mask.
>>
>> gcc/testsuite/
>>     * gcc.target/powerpc/20050603-3.c: Modify the dump check conditions.
>>     * gcc.target/powerpc/rlwimi-2.c: Likewise.
>> +/* When the outer code of set_src is IOR/XOR/PLUS and the inner code is
>> +   ASHIFT/LSHIFTRT/AND, convert a pseudo to pseudo AND with a mask if its
>> +   nonzero_bits is less than its mode mask.  */
> Please add some words *why* we do this (namely, because you cannot use
> nonzero_bits in combine as well as after combine and expect the same
> answer).
>
>> +static bool
>> +change_pseudo_and_mask (rtx pat)
>> +{
>> +  bool changed = false;
>> +
>> +  rtx src = SET_SRC (pat);
>> +  if ((GET_CODE (src) == IOR
>> +   || GET_CODE (src) == XOR
>> +   || GET_CODE (src) == PLUS)
>> +  && (((GET_CODE (XEXP (src, 0)) == ASHIFT
>> +   || GET_CODE (XEXP (src, 0)) == LSHIFTRT
>> +   || GET_CODE (XEXP (src, 0)) == AND)
>> +  && REG_P (XEXP (src, 1)))
>> + || ((GET_CODE (XEXP (src, 1)) == ASHIFT
>> +  || GET_CODE (XEXP (src, 1)) == LSHIFTRT
>> +  || GET_CODE (XEXP (src, 1)) == AND)
>> + && REG_P (XEXP (src, 0)
> If one arm is a pseudo and the other is compound, the compound one is
> first always.  This is one of those canonicalisations that simplifies a
> lot of code -- including this new code :-)
>
>> +    {
>> +  rtx *reg = REG_P (XEXP (src, 0))
>> +    ? &XEXP (SET_SRC (pat), 0)
>> +    : &XEXP (SET_SRC (pat), 1);
> This is indented wrong.  But, in fact, all tabs are changed to spaces in
> your patch?

When I paste the patch from the terminal, tabs are automatically
converted to 4 spaces.  I will try to send the patch via
"git send-email" next time.

>> @@ -11586,7 +11622,14 @@ recog_for_combine (rtx *pnewpat, rtx_insn *insn, 
>> rtx *pnotes)
>>     }
>>     }
>>    else
>> -   changed = change_zero_ext (pat);
>> +   {
>> + if (change_pseudo_and_mask (pat))
>> +   {
>> + maybe_swap_commutative_operands (SET_SRC (pat));
>> + changed = true;
>> +   }
>> + changed |= change_zero_ext (pat);
>> +   }
>>  }
>>    else if (GET_CODE (pat) == PARALLEL)
>>  {
>
>   changed = change_zero_ext (pat);
>   if (!changed)
> changed = change_pseudo_and_mask (pat);
>
>   if (changed)
> maybe_swap_commutative_operands (SET_SRC (pat));
>
>
>> --- a/gcc/testsuite/gcc.target/powerpc/20050603-3.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/20050603-3.c
>> @@ -12,7 +12,7 @@ void rotins (unsigned int x)
>>    b.y = (x<<12) | (x>>20);
>>  }
>>
>> -/* { dg-final { scan-assembler-not {\mrlwinm} } } */
>> +/* { dg-final { scan-assembler-not {\mrlwinm} { target ilp32 } } } */
>>  /* { dg-final { scan-assembler-not {\mrldic} } } */
>>  /* { dg-final { scan-assembler-not {\mrot[lr]} } } */
>>  /* { dg-final { scan-assembler-not {\ms[lr][wd]} } } */
> Please show the -m32 code before and after the change?  Why is it okay
> to get an rlwinm there?

The patch doesn't affect -m32 code. The original also fails with -m64 on 
"\mrldic" as it generates an "rldicl" instruction.

My patch fails with -m64 on "\mrlwinm" as it generates an "rlwinm" instruction. 
So I changed it.

original regression test

PASS: gcc.target/powerpc/20050603-3.c scan-assembler-not \\mrlwinm
FAIL: gcc.target/powerpc/20050603-3.c scan-assembler-not \\mrldic
PASS: gcc.target/powerpc/20050603-3.c scan-assembler-not \\mrot[lr]
PASS: gcc.target/powerpc/20050603-3.c scan-assembler-not \\ms[lr][wd]
PASS: gcc.target/powerpc/20050603-3.c scan-assembler-times \\mrl[wd]imi 1

original -m64 assembly

    addis 10,2,.LANCHOR0@toc@ha
    rldicl 3,3,52,32
    lwz 9,.LANCHOR0@toc@l(10)
    rlwimi 9,3,0,3840
    stw 9,.LANCHOR0@toc@l(10)
    blr

patch -m64 assembly

    addis 10,2,.LANCHOR0@toc@ha
    rlwinm 3,3,20,20,23
    lwz 9,.LANCHOR0@toc@l(10)
    rlwimi 9,3,0,3840
    stw 9,.LANCHOR0@toc@l(10)
    blr

-m32 assembly (both original and patch)

    lis 10,b@ha
    lwz 9,b@l(10)
    rlwimi 9,3,20,20,23

[PATCH RFA (fold/symtab)] c++: constexpr, fold, weak redecl, fp/0 [PR103310]

For PR61825, honza changed tree_single_nonzero_warnv_p to prevent a later
declaration from marking a function as weak after we've determined that it
wasn't weak before.  But we shouldn't do that for speculative folding; we
should only do it when we actually need a constant value.  In C++, such a
context is called "manifestly constant-evaluated".  In fold, this seems to
correspond to the folding_initializer flag, since in C this situation only
occurs in static initializers.

This change makes nonzero-1.c well-formed; I've added a nonzero-1a.c to
verify that we delete the null check eventually if there is no weak
redeclaration.

The varasm.c change is so that if we do get the weak redeclaration error, we
get it at the position of the weak declaration rather than the previous
declaration.

Using the FOLD_INIT paths also affects floating point arithmetic: notably,
this makes floating point division by zero in a manifestly
constant-evaluated context constant, as in a C static initializer.  I've had
some success convincing CWG that this is the right direction; C++ should
follow C's floating point semantics more than we have been doing, and Joseph
says that the C policy is that Annex F overrides other parts of the standard
that say that some operations are undefined.  But since we're in stage 3,
I'm only making this change with the new flag -fconstexpr-fp-except.  It may
turn on by default in a future release.

I think this distinction is only relevant for binary operations; arithmetic
for the floating point case, comparison for possibly non-zero addresses.
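
For comparison, the C behaviour this follows can be seen in a static
initializer.  A minimal sketch, assuming GCC's usual constant folding
of floating-point division by zero in C initializers; the variable and
function names are illustrative only and do not come from the patch.

```c
#include <float.h>

/* In C, objects with static storage duration need constant
   initializers.  GCC folds these divisions to +Inf and -Inf at
   compile time instead of rejecting them.  */
static const double pos_inf = 1.0 / 0.0;
static const double neg_inf = -1.0 / 0.0;

/* Returns nonzero if both initializers were folded to infinities.  */
static int
initializers_are_infinite (void)
{
  return pos_inf > DBL_MAX && neg_inf < -DBL_MAX;
}
```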

Tested x86_64-pc-linux-gnu, OK for trunk?

PR c++/103310

gcc/ChangeLog:

* fold-const.c (maybe_nonzero_address): Use get_create or get
depending on folding_initializer.
(fold_binary_initializer_loc): New.
* fold-const.h (fold_binary_initializer_loc): Declare.
* varasm.c (mark_weak): Don't use the decl location.
* doc/invoke.texi: Document -fconstexpr-fp-except.

gcc/c-family/ChangeLog:

* c.opt: Add -fconstexpr-fp-except.

gcc/cp/ChangeLog:

* constexpr.c (cxx_eval_binary_expression): Use
fold_binary_initializer_loc if manifestly cxeval.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-fp-except1.C: New test.
* g++.dg/cpp1z/constexpr-if36.C: New test.
* gcc.dg/tree-ssa/nonzero-1.c: Now well-formed.
* gcc.dg/tree-ssa/nonzero-1a.c: New test.
---
 gcc/doc/invoke.texi   | 14 ++
 gcc/c-family/c.opt|  4 +++
 gcc/fold-const.h  |  1 +
 gcc/cp/constexpr.c|  9 ++-
 gcc/fold-const.c  | 26 ---
 .../g++.dg/cpp0x/constexpr-fp-except1.C   |  4 +++
 gcc/testsuite/g++.dg/cpp1z/constexpr-if36.C   | 19 ++
 gcc/testsuite/gcc.dg/tree-ssa/nonzero-1.c |  5 ++--
 gcc/testsuite/gcc.dg/tree-ssa/nonzero-1a.c| 11 
 gcc/varasm.c  |  2 +-
 10 files changed, 88 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-fp-except1.C
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/constexpr-if36.C
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/nonzero-1a.c

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 3bddfbaae6a..d6858d834f9 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -3035,6 +3035,20 @@ users are likely to want to adjust it, but if your code 
does heavy
 constexpr calculations you might want to experiment to find which
 value works best for you.
 
+@item -fconstexpr-fp-except
+@opindex fconstexpr-fp-except
+Annex F of the C standard specifies that IEC559 floating point
+exceptions encountered at compile time should not stop compilation.
+C++ compilers have historically not followed this guidance, instead
+treating floating point division by zero as non-constant even though
+it has a well defined value.  This flag tells the compiler to give
+Annex F priority over other rules saying that a particular operation
+is undefined.
+
+@smallexample
+constexpr float inf = 1./0.; // OK with -fconstexpr-fp-except
+@end smallexample
+
 @item -fconstexpr-loop-limit=@var{n}
 @opindex fconstexpr-loop-limit
 Set the maximum number of iterations for a loop in C++14 constexpr functions
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 4b8a094b206..5654f044ae4 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1615,6 +1615,10 @@ fconstexpr-cache-depth=
 C++ ObjC++ Joined RejectNegative UInteger Var(constexpr_cache_depth) Init(8)
 -fconstexpr-cache-depth=   Specify maximum constexpr recursion 
cache depth.
 
+fconstexpr-fp-except
+C++ ObjC++ Var(flag_constexpr_fp_except) Init(0)
+Allow IEC559 floating point exceptions in constant expressions
+
 fconstexpr-loop-limit=
 C++ ObjC++ Joined RejectNegative UInteger Var(constexpr_loop_limit) 
Init(262144)
 -fconstexpr-loop-limit=

Re: [EXTERNAL] Re: [PATCH] tree-optimization/98956 Optimizing out boolean left shift

2021-11-30 Thread Navid Rahimi via Gcc-patches

I see. That makes sense. Thanks for the explanation. I was looking at PR 64992 
and it seems all the implementations there currently use only INTEGER_CST.

But now it makes sense.

Best wishes,
Navid.


From: Andrew Pinski 
Sent: Tuesday, November 30, 2021 15:18
To: Navid Rahimi
Cc: Navid Rahimi via Gcc-patches
Subject: Re: [EXTERNAL] Re: [PATCH] tree-optimization/98956 Optimizing out 
boolean left shift

On Tue, Nov 30, 2021 at 3:08 PM Navid Rahimi  wrote:
>
> Hi Andrew,
>
> Thanks for your detailed comment. There are two problems I wanted to discuss 
> with you:
>
> a) The optimization I sent the patch for also handles variable-length "<<" 
> (for example B0 << x, where x is variable). This [1] link shows the actual 
> optimization and a link for the proof is included in the editor.
>
> b) I am unable to prove the optimization you are describing for non-constant 
> length shift. You can take a look at the code example [2] and proof [3]. I am 
> getting "Transformation doesn't verify!" when I do implement the optimization 
> you mentioned for non-constant shift.
>
> The optimization you are describing only works for "(take: (t << 1) != 0) -> 
> ((t & 0x7fff) != 0)" which only is provable and works for INTEGER_CST.

No it works with non constants too:
t << y != 0 -> t & (-1u>>y) != 0

When y == 0, you have t != 0.
Which is exactly what you think it should be.
Which can be further reduced to t != 0 as y >= sizeof(t)*BITS_PER_UNIT
is undefined.

Thanks,
Andrew Pinski


>
> My understanding might be incorrect here, please don't hesitate to correct me.
>
> 1) https://compiler-explorer.com/z/r46znh4Tj
> 2) https://compiler-explorer.com/z/K1so39dbK
> 3) https://alive2.llvm.org/ce/z/-54zZv
>
> Best wishes,
> Navid.
>
> 
> From: Andrew Pinski 
> Sent: Tuesday, November 30, 2021 14:03
> To: Navid Rahimi
> Cc: Navid Rahimi via Gcc-patches
> Subject: [EXTERNAL] Re: [PATCH] tree-optimization/98956 Optimizing out 
> boolean left shift
>
> On Tue, Nov 30, 2021 at 8:35 AM Navid Rahimi via Gcc-patches
>  wrote:
> >
> > Hi GCC community,
> >
> > This patch will add the missed pattern described in bug 98956 [1] to the 
> > match.pd. The codegen and correctness proof for this pattern is here [2,3] 
> > in case anyone is curious. Tested on x86_64 Linux.
> >
>
> A better way to optimize this is the following (which I describe in PR 64992):
>  take: (t << 1) != 0;
>
> This should be transformed into:
> (t & 0x7fff) != 0
>
> The rest will just fall out really.  That is there is no reason to
> special case bool here.
> I have most of the patch except for creating the mask part which
> should be simple, I just did not want to look up the wi:: functions at
> the time I was writing it into the bug report.
>
> Thanks,
> Andrew Pinski
>
>
>
> > Tree-optimization/98956:
> >
> > Adding new optimization to match.pd:
> > * match.pd ((B0 << x) cmp 0) -> B0 cmp 0 : New optimization.
> > * gcc.dg/tree-ssa/pr98956.c: testcase for this optimization.
> > * gcc.dg/tree-ssa/pr98956-2.c: testcase for node with 
> > side-effect.
> >
> > 1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98956
> > 2) 
> >

[r12-5612 Regression] FAIL: gcc.target/i386/pr88531-1a.c (test for excess errors) on Linux/x86_64

2021-11-30 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

10833849b55401a52f2334eb032a70beb688e9fc is the first bad commit
commit 10833849b55401a52f2334eb032a70beb688e9fc
Author: Richard Sandiford 
Date:   Tue Nov 30 09:52:29 2021 +

vect: Support gather loads with SLP

caused

FAIL: gcc.target/i386/avx2-i32gatherpd256-2.c (internal compiler error)
FAIL: gcc.target/i386/avx2-i32gatherpd256-2.c (test for excess errors)
FAIL: gcc.target/i386/avx2-i32gatherq256-2.c (internal compiler error)
FAIL: gcc.target/i386/avx2-i32gatherq256-2.c (test for excess errors)
FAIL: gcc.target/i386/avx2-i64gatherpd256-2.c (internal compiler error)
FAIL: gcc.target/i386/avx2-i64gatherpd256-2.c (test for excess errors)
FAIL: gcc.target/i386/avx2-i64gatherq256-2.c (internal compiler error)
FAIL: gcc.target/i386/avx2-i64gatherq256-2.c (test for excess errors)
FAIL: gcc.target/i386/avx2-vpermpd-2.c (internal compiler error)
FAIL: gcc.target/i386/avx2-vpermpd-2.c (test for excess errors)
FAIL: gcc.target/i386/avx2-vpermq-2.c (internal compiler error)
FAIL: gcc.target/i386/avx2-vpermq-2.c (test for excess errors)
FAIL: gcc.target/i386/avx512f-vpermilpd-2.c (internal compiler error)
FAIL: gcc.target/i386/avx512f-vpermilpd-2.c (test for excess errors)
FAIL: gcc.target/i386/avx512f-vpermpd-2.c (internal compiler error)
FAIL: gcc.target/i386/avx512f-vpermpd-2.c (test for excess errors)
FAIL: gcc.target/i386/avx512f-vpermpdi-2.c (internal compiler error)
FAIL: gcc.target/i386/avx512f-vpermpdi-2.c (test for excess errors)
FAIL: gcc.target/i386/avx512f-vpermq-imm-2.c (internal compiler error)
FAIL: gcc.target/i386/avx512f-vpermq-imm-2.c (test for excess errors)
FAIL: gcc.target/i386/avx512f-vpermq-imm-3.c (internal compiler error)
FAIL: gcc.target/i386/avx512f-vpermq-imm-3.c (test for excess errors)
FAIL: gcc.target/i386/avx512f-vpermq-var-2.c (internal compiler error)
FAIL: gcc.target/i386/avx512f-vpermq-var-2.c (test for excess errors)
FAIL: gcc.target/i386/avx512vl-gather-1.c (internal compiler error)
FAIL: gcc.target/i386/avx512vl-gather-1.c (test for excess errors)
FAIL: gcc.target/i386/avx512vl-pr79299-1.c (internal compiler error)
FAIL: gcc.target/i386/avx512vl-pr79299-1.c (test for excess errors)
FAIL: gcc.target/i386/avx512vl-vpermilpd-2.c (internal compiler error)
FAIL: gcc.target/i386/avx512vl-vpermilpd-2.c (test for excess errors)
FAIL: gcc.target/i386/avx512vl-vpermpd-2.c (internal compiler error)
FAIL: gcc.target/i386/avx512vl-vpermpd-2.c (test for excess errors)
FAIL: gcc.target/i386/avx512vl-vpermpdi-2.c (internal compiler error)
FAIL: gcc.target/i386/avx512vl-vpermpdi-2.c (test for excess errors)
FAIL: gcc.target/i386/avx512vl-vpermq-imm-2.c (internal compiler error)
FAIL: gcc.target/i386/avx512vl-vpermq-imm-2.c (test for excess errors)
FAIL: gcc.target/i386/avx512vl-vpermq-var-2.c (internal compiler error)
FAIL: gcc.target/i386/avx512vl-vpermq-var-2.c (test for excess errors)
FAIL: gcc.target/i386/vect-gather-1.c (internal compiler error)
FAIL: gcc.target/i386/vect-gather-1.c (test for excess errors)
FAIL: gfortran.dg/pr88148.f90   -O  (internal compiler error)
FAIL: gfortran.dg/pr88148.f90   -O  (test for excess errors)
FAIL: gfortran.dg/vect/vect-8-epilogue.F90   -O  (internal compiler error)
FAIL: gfortran.dg/vect/vect-8-epilogue.F90   -O  (test for excess errors)
FAIL: gfortran.dg/vect/vect-8.f90   -O  (internal compiler error)
FAIL: gfortran.dg/vect/vect-8.f90   -O   scan-tree-dump-times vect "vectorized 
2[234] loops" 1
FAIL: gfortran.dg/vect/vect-8.f90   -O  (test for excess errors)
FAIL: libgomp.fortran/examples-4/simd-2.f90   -O1  (internal compiler error)
FAIL: libgomp.fortran/examples-4/simd-2.f90   -O1  (test for excess errors)
FAIL: libgomp.fortran/examples-4/simd-2.f90   -O2  (internal compiler error)
FAIL: libgomp.fortran/examples-4/simd-2.f90   -O2  (test for excess errors)
FAIL: libgomp.fortran/examples-4/simd-2.f90   -O3 -fomit-frame-pointer 
-funroll-loops -fpeel-loops -ftracer -finline-functions  (internal compiler 
error)
FAIL: libgomp.fortran/examples-4/simd-2.f90   -O3 -fomit-frame-pointer 
-funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess 
errors)
FAIL: libgomp.fortran/examples-4/simd-2.f90   -O3 -g  (internal compiler error)
FAIL: libgomp.fortran/examples-4/simd-2.f90   -O3 -g  (test for excess errors)
FAIL: libgomp.fortran/examples-4/simd-2.f90   -Os  (internal compiler error)
FAIL: libgomp.fortran/examples-4/simd-2.f90   -Os  (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-5612/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/avx2-i32gatherpd256-2.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/avx2-i32gatherq256-2.c

Re: [EXTERNAL] Re: [PATCH] tree-optimization/98956 Optimizing out boolean left shift

2021-11-30 Thread Andrew Pinski via Gcc-patches

On Tue, Nov 30, 2021 at 3:18 PM Andrew Pinski  wrote:
>
> On Tue, Nov 30, 2021 at 3:08 PM Navid Rahimi  
> wrote:
> >
> > Hi Andrew,
> >
> > Thanks for your detailed comment. There are two problems I wanted to discuss 
> > with you:
> >
> > a) The optimization I sent the patch for also handles variable-length "<<" 
> > (for example B0 << x, where x is variable). This [1] link shows the 
> > actual optimization and a link for the proof is included in the editor.
> >
> > b) I am unable to prove the optimization you are describing for 
> > non-constant length shift. You can take a look at the code example [2] and 
> > proof [3]. I am getting "Transformation doesn't verify!" when I do 
> > implement the optimization you mentioned for non-constant shift.
> >
> > The optimization you are describing only works for "(take: (t << 1) != 0) 
> > -> ((t & 0x7fff) != 0)" which only is provable and works for 
> > INTEGER_CST.
>
> No it works with non constants too:
> t << y != 0 -> t & (-1u>>y) != 0
>
> When y == 0, you have t != 0.
> Which is exactly what you think it should be.
> Which can be further reduced to t != 0 as y >= sizeof(t)*BITS_PER_UNIT
> is undefined.

I filed PR 103509 for the above issue.  Note it only works when
comparing against 0. I also notice that LLVM, ICC and MSVC do not do
the optimization either. But they (except for MSVC) do handle (-1u>>y)
!= 0 to be always true.
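
The identity t << y != 0 -> t & (-1u>>y) != 0 can be spot-checked with
a small program.  This is a sketch under assumptions: it uses 32-bit
unsigned values, only shift counts from 0 to 31 (larger counts are
undefined), and a handful of sample values rather than a proof.  All
function names are hypothetical.

```c
#include <stdint.h>

/* Original form: is the shifted value nonzero?  */
static int
shl_nonzero (uint32_t t, unsigned y)
{
  return (t << y) != 0;
}

/* Rewritten form: mask off the bits the shift would discard.  */
static int
mask_nonzero (uint32_t t, unsigned y)
{
  return (t & (UINT32_MAX >> y)) != 0;
}

/* Returns 1 if both forms agree for every sample value and every
   defined shift count, 0 otherwise.  */
static int
forms_agree (void)
{
  static const uint32_t samples[] = {
    0u, 1u, 2u, 0x00010000u, 0x7fffffffu,
    0x80000000u, 0xdeadbeefu, 0xffffffffu
  };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    for (unsigned y = 0; y < 32; y++)
      if (shl_nonzero (samples[i], y) != mask_nonzero (samples[i], y))
        return 0;
  return 1;
}
```

The two forms agree because a left shift by y keeps exactly the low
32 - y bits of t, which are the bits selected by the mask.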

Thanks,
Andrew Pinski

>
> Thanks,
> Andrew Pinski
>
>
> >
> > My understanding might be incorrect here, please don't hesitate to correct 
> > me.
> >
> > 1) https://compiler-explorer.com/z/r46znh4Tj
> > 2) https://compiler-explorer.com/z/K1so39dbK
> > 3) https://alive2.llvm.org/ce/z/-54zZv
> >
> > Best wishes,
> > Navid.
> >
> > 
> > From: Andrew Pinski 
> > Sent: Tuesday, November 30, 2021 14:03
> > To: Navid Rahimi
> > Cc: Navid Rahimi via Gcc-patches
> > Subject: [EXTERNAL] Re: [PATCH] tree-optimization/98956 Optimizing out 
> > boolean left shift
> >
> > On Tue, Nov 30, 2021 at 8:35 AM Navid Rahimi via Gcc-patches
> >  wrote:
> > >
> > > Hi GCC community,
> > >
> > > This patch will add the missed pattern described in bug 98956 [1] to the 
> > > match.pd. The codegen and correctness proof for this pattern is here 
> > > [2,3] in case anyone is curious. Tested on x86_64 Linux.
> > >
> >
> > A better way to optimize this is the following (which I describe in PR 
> > 64992):
> >  take: (t << 1) != 0;
> >
> > This should be transformed into:
> > (t & 0x7fff) != 0
> >
> > The rest will just fall out really.  That is there is no reason to
> > special case bool here.
> > I have most of the patch except for creating the mask part which
> > should be simple, I just did not want to look up the wi:: functions at
> > the time I was writing it into the bug report.
> >
> > Thanks,
> > Andrew Pinski
> >
> >
> >
> > > Tree-optimization/98956:
> > >
> > > Adding new optimization to match.pd:
> > > * match.pd ((B0 << x) cmp 0) -> B0 cmp 0 : New 
> > > optimization.
> > > * gcc.dg/tree-ssa/pr98956.c: testcase for this 
> > > optimization.
> > > * gcc.dg/tree-ssa/pr98956-2.c: testcase for node with 
> > > side-effect.
> > >
> > > 1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98956
> > > 2) https://compiler-explorer.com/z/nj4PTrecW
> > > 3) https://alive2.llvm.org/ce/z/jyJAoS
> > >
> > > Best wishes,
> > > Navid.

Re: [EXTERNAL] Re: [PATCH] tree-optimization/98956 Optimizing out boolean left shift

2021-11-30 Thread Andrew Pinski via Gcc-patches

On Tue, Nov 30, 2021 at 3:08 PM Navid Rahimi  wrote:
>
> Hi Andrew,
>
> Thanks for your detailed comment. There are two problems I wanted to discuss 
> with you:
>
> a) The optimization in the patch I sent also handles variable-length "<<" 
> (for example B0 << x, where x is variable). Link [1] shows the actual 
> optimization, and a link to the proof is included in the editor.
>
> b) I am unable to prove the optimization you are describing for a non-constant 
> shift count. You can take a look at the code example [2] and the proof [3]. I 
> get "Transformation doesn't verify!" when I implement the optimization you 
> mentioned for a non-constant shift.
>
> The optimization you are describing, "(take: (t << 1) != 0) -> 
> ((t & 0x7fff) != 0)", is only provable, and only works, for INTEGER_CST.

No, it works with non-constants too:
t << y != 0 -> t & (-1u>>y) != 0

When y == 0, you have t != 0,
which is exactly what you think it should be.
In the bool case this can be further reduced to t != 0, as
y >= sizeof(t)*BITS_PER_UNIT is undefined.
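
As a quick sanity check of the rewrite above, the following sketch compares both sides exhaustively for a 16-bit unsigned type. The function names and the choice of 16 bits are illustrative assumptions, not part of any patch in this thread; the mask 0xffff >> y is the 16-bit analogue of -1u >> y.

```cpp
#include <cstdint>

// Hypothetical helper (not from the patch): checks, for every 16-bit value t
// and every in-range shift count y, that
//   (t << y) != 0   is equivalent to   (t & (0xffff >> y)) != 0.
bool shift_rewrite_holds_for_u16()
{
  for (unsigned y = 0; y < 16; ++y)
    {
      const std::uint16_t mask = static_cast<std::uint16_t>(0xffffu >> y);
      for (unsigned v = 0; v <= 0xffffu; ++v)
        {
          const std::uint16_t t = static_cast<std::uint16_t>(v);
          // Truncate back to 16 bits, as a 16-bit left shift would.
          const std::uint16_t shifted = static_cast<std::uint16_t>(t << y);
          if ((shifted != 0) != ((t & mask) != 0))
            return false;
        }
    }
  return true;
}

// Hypothetical helper: for a value restricted to 0 or 1 (the bool case),
// the masked comparison collapses to t != 0 for every in-range y, because
// bit 0 of the mask is always set.
bool bool_case_reduces_to_t()
{
  for (unsigned y = 0; y < 16; ++y)
    for (unsigned t = 0; t <= 1; ++t)
      if (((t & (0xffffu >> y)) != 0) != (t != 0))
        return false;
  return true;
}
```

Both helpers only exercise shift counts below the bit width, which matches the observation that larger counts are undefined.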

Thanks,
Andrew Pinski


>
> My understanding might be incorrect here, please don't hesitate to correct me.
>
> 1) https://compiler-explorer.com/z/r46znh4Tj
> 2) https://compiler-explorer.com/z/K1so39dbK
> 3) https://alive2.llvm.org/ce/z/-54zZv
>
> Best wishes,
> Navid.
>
> 
> From: Andrew Pinski 
> Sent: Tuesday, November 30, 2021 14:03
> To: Navid Rahimi
> Cc: Navid Rahimi via Gcc-patches
> Subject: [EXTERNAL] Re: [PATCH] tree-optimization/98956 Optimizing out 
> boolean left shift
>
> On Tue, Nov 30, 2021 at 8:35 AM Navid Rahimi via Gcc-patches
>  wrote:
> >
> > Hi GCC community,
> >
> > This patch will add the missed pattern described in bug 98956 [1] to the 
> > match.pd. The codegen and correctness proof for this pattern is here [2,3] 
> > in case anyone is curious. Tested on x86_64 Linux.
> >
>
> A better way to optimize this is the following (which I describe in PR 64992):
>  take: (t << 1) != 0;
>
> This should be transformed into:
> (t & 0x7fff) != 0
>
> The rest will just fall out really.  That is, there is no reason to
> special-case bool here.
> I have most of the patch except for creating the mask, which
> should be simple; I just did not want to look up the wi:: functions at
> the time I was writing it into the bug report.
>
> Thanks,
> Andrew Pinski
>
>
>
> > Tree-optimization/98956:
> >
> > Adding new optimization to match.pd:
> > * match.pd ((B0 << x) cmp 0) -> B0 cmp 0 : New optimization.
> > * gcc.dg/tree-ssa/pr98956.c: testcase for this optimization.
> > * gcc.dg/tree-ssa/pr98956-2.c: testcase for node with 
> > side-effect.
> >
> > 1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98956
> > 2) https://compiler-explorer.com/z/nj4PTrecW
> > 3) https://alive2.llvm.org/ce/z/jyJAoS
> >
> > Best wishes,
> > Navid.

[committed] libstdc++: Fix tests that fail with fully-dynamic-string

Tested powerpc64le-linux (old ABI) and x86_64-linux (new ABI), pushed to
trunk.


Fix some tests that assume that a moved-from string is empty, or that
default constructing a string doesn't allocate.
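
A minimal sketch of the portable expectation these test fixes rely on: a moved-from std::string is valid but its contents are unspecified, so only operations without preconditions should be used on it. The function name below is hypothetical and the code is an illustration, not taken from the testsuite.

```cpp
#include <string>
#include <utility>

// Hypothetical illustration: returns true when the portable expectations
// about a moved-from string hold.
bool moved_from_string_is_reusable()
{
  std::string b("1");
  std::string c(std::move(b));

  // The destination holds the original contents.
  if (c.size() != 1 || c[0] != '1')
    return false;

  // b is valid but unspecified here, so b.size() == 0 is deliberately
  // not checked.  Operations without preconditions are still fine:
  b.clear();
  if (!b.empty())
    return false;

  b = "2";
  return b == "2";
}
```

Checking b.size() == 0 directly after the move, as the old tests did, only holds for particular implementations.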

libstdc++-v3/ChangeLog:

* testsuite/21_strings/basic_string/cons/char/moveable.cc: Allow
moved-from string to be non-empty.
* testsuite/21_strings/basic_string/cons/char/moveable2.cc:
Likewise.
* testsuite/21_strings/basic_string/cons/char/moveable2_c++17.cc:
Likewise.
* testsuite/21_strings/basic_string/cons/wchar_t/moveable.cc:
Likewise.
* testsuite/21_strings/basic_string/cons/wchar_t/moveable2.cc:
Likewise.
* testsuite/21_strings/basic_string/cons/wchar_t/moveable2_c++17.cc:
Likewise.
* testsuite/21_strings/basic_string/modifiers/assign/char/87749.cc:
Construct empty string before setting oom flag.
* testsuite/21_strings/basic_string/modifiers/assign/wchar_t/87749.cc:
Likewise.
---
 .../testsuite/21_strings/basic_string/cons/char/moveable.cc   | 4 +++-
 .../testsuite/21_strings/basic_string/cons/char/moveable2.cc  | 4 +++-
 .../21_strings/basic_string/cons/char/moveable2_c++17.cc  | 4 +++-
 .../21_strings/basic_string/cons/wchar_t/moveable.cc  | 4 +++-
 .../21_strings/basic_string/cons/wchar_t/moveable2.cc | 4 +++-
 .../21_strings/basic_string/cons/wchar_t/moveable2_c++17.cc   | 4 +++-
 .../21_strings/basic_string/modifiers/assign/char/87749.cc| 2 +-
 .../21_strings/basic_string/modifiers/assign/wchar_t/87749.cc | 2 +-
 8 files changed, 20 insertions(+), 8 deletions(-)

diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/moveable.cc 
b/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/moveable.cc
index 5de2a5f9330..3ba39ec432d 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/moveable.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/moveable.cc
@@ -35,7 +35,9 @@ void test01()
 
   std::string c(std::move(b));
   VERIFY( c.size() == 1 && c[0] == '1' );
-  VERIFY( b.size() == 0 );
+#if ! _GLIBCXX_FULLY_DYNAMIC_STRING
+  VERIFY( b.size() == 0 ); // not guaranteed by the standard
+#endif
 }
 
 int main()
diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/moveable2.cc 
b/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/moveable2.cc
index fe91c5ab539..5804ccb6bf8 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/moveable2.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/moveable2.cc
@@ -44,7 +44,9 @@ void test01()
 
   tstring c(std::move(b));
   VERIFY( c.size() == 1 && c[0] == '1' );
-  VERIFY( b.size() == 0 );
+#if ! _GLIBCXX_FULLY_DYNAMIC_STRING
+  VERIFY( b.size() == 0 ); // not guaranteed by the standard
+#endif
 }
 
 int main()
diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/moveable2_c++17.cc 
b/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/moveable2_c++17.cc
index 1caedcccfce..59d1d775134 100644
--- 
a/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/moveable2_c++17.cc
+++ 
b/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/moveable2_c++17.cc
@@ -42,7 +42,9 @@ void test01()
 
   tstring c(std::move(b));
   VERIFY( c.size() == 1 && c[0] == '1' );
-  VERIFY( b.size() == 0 );
+#if ! _GLIBCXX_FULLY_DYNAMIC_STRING
+  VERIFY( b.size() == 0 ); // not guaranteed by the standard
+#endif
 }
 
 int main()
diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/moveable.cc 
b/libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/moveable.cc
index d05afb7d466..67e25de2916 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/moveable.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/moveable.cc
@@ -35,7 +35,9 @@ void test01()
 
   std::wstring c(std::move(b));
   VERIFY( c.size() == 1 && c[0] == L'1' );
-  VERIFY( b.size() == 0 );
+#if ! _GLIBCXX_FULLY_DYNAMIC_STRING
+  VERIFY( b.size() == 0 ); // not guaranteed by the standard
+#endif
 }
 
 int main()
diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/moveable2.cc 
b/libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/moveable2.cc
index e301984612d..c72eb9bfddb 100644
--- a/libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/moveable2.cc
+++ b/libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/moveable2.cc
@@ -44,7 +44,9 @@ void test01()
 
   twstring c(std::move(b));
   VERIFY( c.size() == 1 && c[0] == L'1' );
-  VERIFY( b.size() == 0 );
+#if ! _GLIBCXX_FULLY_DYNAMIC_STRING
+  VERIFY( b.size() == 0 ); // not guaranteed by the standard
+#endif
 }
 
 int main()
diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/moveable2_c++17.cc
 
b/libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/moveable2_c++17.cc
index d3e4744ff34..6a2bc2714b5 100644
---

[committed] libstdc++: Fix fully-dynamic-string build

Tested powerpc64le-linux (old ABI) and x86_64-linux (new ABI), pushed to
trunk.


My last change to the fully-dynamic-string actually broke it. This fixes
the move constructor so it builds, and simplifies it slightly so that
more code is common between the fully-dynamic enabled/disabled cases.

libstdc++-v3/ChangeLog:

* include/bits/cow_string.h (basic_string(basic_string&&)): Fix
mem-initializer for _GLIBCXX_FULLY_DYNAMIC_STRING==0 case.
* 
testsuite/21_strings/basic_string/cons/char/noexcept_move_construct.cc:
Remove outdated comment.
* 
testsuite/21_strings/basic_string/cons/wchar_t/noexcept_move_construct.cc:
Likewise.
---
 libstdc++-v3/include/bits/cow_string.h| 8 +++-
 .../basic_string/cons/char/noexcept_move_construct.cc | 1 -
 .../basic_string/cons/wchar_t/noexcept_move_construct.cc  | 1 -
 3 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/libstdc++-v3/include/bits/cow_string.h 
b/libstdc++-v3/include/bits/cow_string.h
index bafca7bb313..ced395b80b8 100644
--- a/libstdc++-v3/include/bits/cow_string.h
+++ b/libstdc++-v3/include/bits/cow_string.h
@@ -621,14 +621,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*  @a __str is a valid, but unspecified string.
*/
   basic_string(basic_string&& __str) noexcept
-#if _GLIBCXX_FULLY_DYNAMIC_STRING == 0
   : _M_dataplus(std::move(__str._M_dataplus))
   {
+#if _GLIBCXX_FULLY_DYNAMIC_STRING == 0
+   // Make __str use the shared empty string rep.
__str._M_data(_S_empty_rep()._M_refdata());
-  }
 #else
-  : _M_dataplus(__str._M_rep())
-  {
// Rather than allocate an empty string for the rvalue string,
// just share ownership with it by incrementing the reference count.
// If the rvalue string was "leaked" then it was the unique owner,
@@ -637,8 +635,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  __gnu_cxx::__atomic_add_dispatch(&_M_rep()->_M_refcount, 2);
else
  __gnu_cxx::__atomic_add_dispatch(&_M_rep()->_M_refcount, 1);
-  }
 #endif
+  }
 
   /**
*  @brief  Construct string from an initializer %list.
diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/noexcept_move_construct.cc
 
b/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/noexcept_move_construct.cc
index f04a491370d..74b0ed3910c 100644
--- 
a/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/noexcept_move_construct.cc
+++ 
b/libstdc++-v3/testsuite/21_strings/basic_string/cons/char/noexcept_move_construct.cc
@@ -23,7 +23,6 @@
 
 typedef std::string stype;
 
-// True except for COW strings with _GLIBCXX_FULLY_DYNAMIC_STRING:
 static_assert(std::is_nothrow_move_constructible<stype>::value, "Error");
 
 // True for std::allocator because is_always_equal, but not true in general:
diff --git 
a/libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/noexcept_move_construct.cc
 
b/libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/noexcept_move_construct.cc
index d5dbf561ec0..53cb81d8aee 100644
--- 
a/libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/noexcept_move_construct.cc
+++ 
b/libstdc++-v3/testsuite/21_strings/basic_string/cons/wchar_t/noexcept_move_construct.cc
@@ -23,7 +23,6 @@
 
 typedef std::wstring wstype;
 
-// True except for COW strings with _GLIBCXX_FULLY_DYNAMIC_STRING:
 static_assert(std::is_nothrow_move_constructible<wstype>::value, "Error");
 
 // True for std::allocator because is_always_equal, but not true in general:
-- 
2.31.1

[committed] libstdc++: Ensure C++20 std::stringstream definitions use correct ABI

Tested powerpc64le-linux (old ABI) and x86_64-linux (new ABI), pushed to
trunk.


The definitions of the new C++20 members of std::stringstream etc. are
missing when --with-default-libstdcxx-abi=gcc4-compatible is used,
because all the explicit instantiations in src/c++20/sstream-inst.cc are
skipped.

This ensures the contents of that file are compiled with the new ABI, so
the same set of symbols is exported regardless of which ABI is active
by default.

libstdc++-v3/ChangeLog:

* src/c++20/sstream-inst.cc (_GLIBCXX_USE_CXX11_ABI): Define to
select new ABI.
---
 libstdc++-v3/src/c++20/sstream-inst.cc | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/src/c++20/sstream-inst.cc 
b/libstdc++-v3/src/c++20/sstream-inst.cc
index b3fbd6ead44..55d1fe5234d 100644
--- a/libstdc++-v3/src/c++20/sstream-inst.cc
+++ b/libstdc++-v3/src/c++20/sstream-inst.cc
@@ -26,7 +26,9 @@
 // ISO C++ 14882:
 //
 
-// Instantiations in this file are only for the new SSO std::string ABI
+// Instantiations in this file are only for the new SSO std::string ABI.
+#define _GLIBCXX_USE_CXX11_ABI 1
+
 #include <sstream>
 
 #if _GLIBCXX_USE_CXX11_ABI
-- 
2.31.1

Re: [EXTERNAL] Re: [PATCH] tree-optimization/98956 Optimizing out boolean left shift

2021-11-30 Thread Navid Rahimi via Gcc-patches

Hi Andrew,

Thanks for your detailed comment. There are two problems I wanted to discuss 
with you:

a) The optimization in the patch I sent also handles variable-length "<<" 
(for example B0 << x, where x is variable). Link [1] shows the actual 
optimization, and a link to the proof is included in the editor.

b) I am unable to prove the optimization you are describing for a non-constant 
shift count. You can take a look at the code example [2] and the proof [3]. I 
get "Transformation doesn't verify!" when I implement the optimization you 
mentioned for a non-constant shift.

The optimization you are describing, "(take: (t << 1) != 0) -> 
((t & 0x7fff) != 0)", is only provable, and only works, for INTEGER_CST.

My understanding might be incorrect here, please don't hesitate to correct me.

1) https://compiler-explorer.com/z/r46znh4Tj
2) https://compiler-explorer.com/z/K1so39dbK
3) https://alive2.llvm.org/ce/z/-54zZv

Best wishes,
Navid.


From: Andrew Pinski 
Sent: Tuesday, November 30, 2021 14:03
To: Navid Rahimi
Cc: Navid Rahimi via Gcc-patches
Subject: [EXTERNAL] Re: [PATCH] tree-optimization/98956 Optimizing out boolean 
left shift

On Tue, Nov 30, 2021 at 8:35 AM Navid Rahimi via Gcc-patches
 wrote:
>
> Hi GCC community,
>
> This patch will add the missed pattern described in bug 98956 [1] to the 
> match.pd. The codegen and correctness proof for this pattern is here [2,3] in 
> case anyone is curious. Tested on x86_64 Linux.
>

A better way to optimize this is the following (which I describe in PR 64992):
 take: (t << 1) != 0;

This should be transformed into:
(t & 0x7fff) != 0

The rest will just fall out really.  That is, there is no reason to
special-case bool here.
I have most of the patch except for creating the mask, which
should be simple; I just did not want to look up the wi:: functions at
the time I was writing it into the bug report.

Thanks,
Andrew Pinski



> Tree-optimization/98956:
>
> Adding new optimization to match.pd:
> * match.pd ((B0 << x) cmp 0) -> B0 cmp 0 : New optimization.
> * gcc.dg/tree-ssa/pr98956.c: testcase for this optimization.
> * gcc.dg/tree-ssa/pr98956-2.c: testcase for node with 
> side-effect.
>
> 1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98956
> 2) https://compiler-explorer.com/z/nj4PTrecW
> 3) https://alive2.llvm.org/ce/z/jyJAoS
>
> Best wishes,
> Navid.

Re: [PATCH] Remove more stray returns and gcc_unreachable ()s

2021-11-30 Thread Martin Sebor via Gcc-patches


On 11/30/21 12:51 AM, Richard Biener wrote:

On Mon, 29 Nov 2021, Martin Sebor wrote:


On 11/29/21 11:53 AM, Martin Sebor wrote:

On 11/29/21 6:09 AM, Richard Biener via Gcc-patches wrote:

This removes more cases that appear when bootstrap with
-Wunreachable-code-return progresses.


...

diff --git a/gcc/sel-sched-ir.h b/gcc/sel-sched-ir.h
index 8ee0529d5a8..18e03c4cb96 100644
--- a/gcc/sel-sched-ir.h
+++ b/gcc/sel-sched-ir.h
@@ -1493,8 +1493,6 @@ bb_next_bb (basic_block bb)
   default:
     return bb->next_bb;
   }
-
-  gcc_unreachable ();
   }


Just skimming the changes out of curiosity, this one makes me
wonder if the warning shouldn't be taught to avoid triggering
on calls to __builtin_unreachable().  They can help make code
more readable (e.g., after a case and switch statement that
handles all values).


I see someone else raised the same question in a patch I hadn't
gotten to yet:

https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585431.html

If you do end up removing the gcc_unreachable() calls, I would
suggest replacing them with a comment so as not to lose
the readability benefit.

But I still wonder if it might make sense to teach the warning
not just about __builtin_unreachable() but also about noreturn
calls like abort() that (as you explained in the thread above)
gcc_unreachable() expands to.  Is there a benefit to warning
on such calls?


I'm not sure.  I've chosen to eliminate only the "obvious"
cases, like above where there's a default: that returns immediately
visible (not always in the patch context).  I've left some in
the code base where that's not so obvious.

IMHO making the flow obvious without an unreachable marker is
superior to obfuscating it and clearing that up with one.

Yes, I thought about not diagnosing things like

return 1;
return 1;

but then what about

return 1;
return 0;

?  I've seen cases like

gcc_unreachable ();
return 0;

was that meant to be

return 0;
gcc_unreachable ();

?  So it's not entirely clear.  I think that if there was a way
to denote a definitive 'code should not reach here' function
(a new attribute?) then it would make sense to not warn about
control flow not reaching that.  But then it would make sense
to warn about stmts following such annotation.


How would such an attribute be different from
__builtin_unreachable?  (By the way, there is or was a proposal
before WG14 to add an annotation like it:
  http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2816.pdf
If I recall, in a discussion of the proposal more people
preferred a function than an attribute.)

I agree the cases above are not entirely clear but it occurs to
me that it's possible to discern at least two broad categories
of cases: 1) a statement made unreachable by a prior one with
the same effect where swapping the two wouldn't change anything
(the double return 1; above), and 2) an unreachable statement
(or a series of statements) with a different effect than
the prior one (the last three above).

The cases in (1) are completely innocuous and removing them might
be considered just a matter of cleanup.  Those in (2) are less
clear cut and more likely to harbor bugs and so when adopting
the warning in a code base like Binutils with lots of instances
of both kinds I'd expect to focus on (2) first and worry about
(1) later.

Even within (2) there might be separable subsets, like a return
statement followed by a break in a switch (common in Binutils
and I think you also cleaned up some in GCC).  In at least some
of these the return is hidden in a macro so the break after it
might serve as a visual clue that the case isn't meant to fall
through.  This subset would be different from two apparently
contradictory return statements each with a different value,
or from a return followed by more than one statement.  It might
make sense to treat these two classes separately (e.g., add
a level for them).

But these are just ideas for heuristics based on my limited
insight, and YMMV.  It's just food for thought.
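
To make the categories above concrete, here is a small sketch. The function names are hypothetical and chosen only for illustration; whether any given compiler version diagnoses each case is not claimed here.

```cpp
// Category (1): the unreachable statement has the same effect as the
// one before it, so removing or swapping them changes nothing.
int same_effect()
{
  return 1;
  return 1;   // unreachable duplicate
}

// Category (2): the unreachable statement has a different effect than
// the prior one, which is more likely to hide a bug.
int different_effect()
{
  return 1;
  return 0;   // unreachable, and contradicts the statement above
}

// Subset of (2): a return followed by a break in a switch.  The break is
// unreachable but can serve as a visual clue that the case is not meant
// to fall through.
int return_then_break(int kind)
{
  switch (kind)
    {
    case 0:
      return 10;
      break;  // unreachable
    default:
      return 20;
    }
}
```

All three functions are well defined at run time; the unreachable statements only affect readability and diagnostics.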

Martin



Richard.

Re: [PATCH v2] rs6000: Modify the way for extra penalized cost

2021-11-30 Thread Segher Boessenkool

Hi!

On Tue, Nov 30, 2021 at 01:05:48PM +0800, Kewen.Lin wrote:
> on 2021/11/30 6:06 AM, Segher Boessenkool wrote:
> > On Tue, Sep 28, 2021 at 04:16:04PM +0800, Kewen.Lin wrote:
> >> unsigned adjusted_cost = (nunits == 2) ? 2 : 1;
> >> unsigned extra_cost = nunits * adjusted_cost;
> > 
> >> For V2DI/V2DF, it uses 2 penalized cost for each scalar load
> >> while for the other modes, it uses 1.
> > 
> > So for V2D[IF] we get 4, for V4S[IF] we get 4, for V8HI it's 8, and
> > for V16QI it is 16?  Pretty terrible as well, heh (I would expect all
> > vector ops to be similar cost).
> 
> But for different vector units it has different number of loads, it seems
> reasonable to have more costs when it has more loads to be fed into those
> limited number of load/store units.

More expensive, yes.  This expensive?  That doesn't look optimal :-)
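
For reference, the per-mode numbers quoted above follow directly from the two-line formula in the patch excerpt. A minimal sketch (the function name is hypothetical; only the formula comes from the quoted code):

```cpp
// Hypothetical wrapper around the quoted formula: nunits is the number of
// vector elements, i.e. the number of scalar loads feeding the
// vector construction.
unsigned extra_penalized_cost(unsigned nunits)
{
  unsigned adjusted_cost = (nunits == 2) ? 2 : 1;
  unsigned extra_cost = nunits * adjusted_cost;
  return extra_cost;
}
```

With this, V2DI/V2DF (nunits 2) and V4SI/V4SF (nunits 4) both come out at 4, V8HI at 8 and V16QI at 16, which are the values discussed above.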

> > This also suggests we should cost vector construction separately, which
> > would pretty obviously be a good thing anyway (it happens often, it has
> > a quite different cost structure).
> 
> vectorizer does model vector construction separately, there is an enum
> vect_cost_for_stmt *vec_construct*, normally it works well.  But for this
> bwaves hotspot, it requires us to do some more penalization as evaluated,
> so we put the penalized cost onto this special vector construction when
> some heuristic thresholds are met.

Ah, heuristics.  We can adjust them forever :-)


Segher

[committed] analyzer: add regression test [PR94579]

2021-11-30 Thread David Malcolm via Gcc-patches

Successfully regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-5642-g955ea7b58e4f1e3cc5083e88575161168c147254.

gcc/testsuite/ChangeLog:
PR analyzer/94579
* gcc.dg/analyzer/pr94579.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/testsuite/gcc.dg/analyzer/pr94579.c | 11 +++
 1 file changed, 11 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr94579.c

diff --git a/gcc/testsuite/gcc.dg/analyzer/pr94579.c 
b/gcc/testsuite/gcc.dg/analyzer/pr94579.c
new file mode 100644
index 000..0ab88c2efb5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pr94579.c
@@ -0,0 +1,11 @@
+struct a *c;
+struct a {
+  int b;
+} d() {
+}
+
+void e()
+
+{
+  *c = d();
+}
-- 
2.26.3

[committed] analyzer: verify that -Wanalyzer-too-complex can be disabled via pragmas [PR100524]

2021-11-30 Thread David Malcolm via Gcc-patches

Successfully regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-5640-g03ea0ca1189a39e095188b0425c66446cc84a0a5.

gcc/testsuite/ChangeLog:
PR analyzer/100524
* gcc.dg/analyzer/pragma-2.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/testsuite/gcc.dg/analyzer/pragma-2.c | 57 
 1 file changed, 57 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pragma-2.c

diff --git a/gcc/testsuite/gcc.dg/analyzer/pragma-2.c 
b/gcc/testsuite/gcc.dg/analyzer/pragma-2.c
new file mode 100644
index 000..58fcaab11df
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pragma-2.c
@@ -0,0 +1,57 @@
+/* Verify that we can disable -Wanalyzer-too-complex via pragmas.  */
+/* { dg-additional-options "-Wanalyzer-too-complex 
-Werror=analyzer-too-complex -fno-analyzer-state-merge -g" } */
+
+#include <stdlib.h>
+
+extern int get (void);
+
+/* In theory each of p0...p4 can be in various malloc states,
+   independently, so the total combined number of states
+   at any program point within the loop is NUM_VARS * NUM_STATES.  */
+
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wanalyzer-too-complex"
+
+void test (void)
+{
+  void *p0 = NULL, *p1 = NULL, *p2 = NULL, *p3 = NULL, *p4 = NULL;
+  void **pp = NULL;
+  while (get ())
+{
+  switch (get ())
+   {
+   default:
+   case 0:
+ pp = &p0;
+ break;
+   case 1:
+ pp = &p1;
+ break;
+   case 2:
+ pp = &p2;
+ break;
+   case 3:
+ pp = &p3;
+ break;
+   case 4:
+ pp = &p4;
+ break;
+   }
+
+  switch (get ())
+   {
+   default:
+   case 0:
+ *pp = malloc (16); /* { dg-warning "leak" } */
+ break;
+   case 1:
+ free (*pp);
+ break;
+   case 2:
+ /* no-op.  */
+ break;
+   }
+}
+}
+
+#pragma GCC diagnostic pop
-- 
2.26.3

[PATCH v2 2/2] add -Wdangling-pointer [PR #63272]

2021-11-30 Thread Martin Sebor via Gcc-patches


Attached is a revision of this patch with adjustments for
the changes to the prerequisite patch 1 in the series and
a couple of minor simplifications and slightly improved
test coverage, tested on x86_64-linux.

On 11/1/21 4:18 PM, Martin Sebor wrote:

Patch 2 in this series adds support for detecting the uses of
dangling pointers: those to auto objects that have gone out of
scope.  Like patch 1, to minimize false positives this detection
is very simplistic.  However, thanks to the more deterministic
nature of the problem (all local objects go out of scope) it is able
to detect more instances of it.  The approach I used is to simply
search the IL for clobbers that dominate uses of pointers to
the clobbered objects.  If such a use is found that's not
followed by a clobber of the same object the warning triggers.
Similar to -Wuse-after-free, the new -Wdangling-pointer option
has multiple levels: level 1 to detect unconditional uses and
level 2 to flag conditional ones.  Unlike with -Wuse-after-free
there is no use case for testing dangling pointers for
equality, so there is no level 3.
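
The kind of code the description above targets can be sketched as follows. The function names are hypothetical, and whether a particular GCC version diagnoses these exact functions at a given level is an assumption, not something stated in this message. The first function is intentionally never meant to be called, since reading through the pointer there is undefined.

```cpp
// Unconditional use of a pointer to an auto object whose scope has ended
// (the kind of use described for level 1).  Do not call this function:
// the dereference has undefined behavior.
int unconditional_use()
{
  int *p;
  {
    int x = 1;
    p = &x;
  }            // x goes out of scope here
  return *p;   // p is dangling on every path
}

// Conditional use (the kind described for level 2): p dangles only on
// the path where cond is true.
int conditional_use(bool cond)
{
  int *p = nullptr;
  {
    int x = 2;
    if (cond)
      p = &x;
  }
  return p ? *p : 0;   // well defined only when cond is false
}

// A corrected variant: the pointed-to object outlives every use of p.
int fixed_use()
{
  int x = 1;
  int *p = &x;
  return *p;
}
```

Only fixed_use(), and conditional_use() with a false argument, are safe to execute.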

Tested on x86_64-linux and by building Glibc and Binutils/GDB.
It found no problems outside of the GCC test suite.

As with the first patch in this series, the tests contain a number
of xfails due to known limitations marked with pr??.  I'll
open bugs for them before committing the patch if I don't resolve
them first in a followup.

Martin


Add -Wdangling-pointer [PR63272].
Resolves:

PR c/63272 - GCC should warn when using pointer to dead scoped variable within the same function

gcc/c-family/ChangeLog:

	PR c/63272
	* c.opt (-Wdangling-pointer): New option.

gcc/ChangeLog:

	PR c/63272
	* diagnostic-spec.c (nowarn_spec_t::nowarn_spec_t): Handle
	-Wdangling-pointer.
	* doc/invoke.texi (-Wdangling-pointer): Document new option.
	* gimple-ssa-isolate-paths.c (diag_returned_locals): Suppress
	warning after issuing it.
	* gimple-ssa-warn-access.cc (pass_waccess::clone): Set new member.
	(pass_waccess::check_pointer_uses): New function.
	(pass_waccess::gimple_call_return_arg): New function.
	(pass_waccess::gimple_call_return_arg_ref): New function.
	(pass_waccess::check_call_dangling): New function.
	(pass_waccess::check_dangling_uses): New function overloads.
	(pass_waccess::check_dangling_stores): New function.
	(pass_waccess::check_dangling_stores): New function.
	(pass_waccess::m_clobbers): New data member.
	(pass_waccess::m_func): New data member.
	(pass_waccess::m_run_number): New data member.
	(pass_waccess::m_check_dangling_p): New data member.
	(pass_waccess::check_alloca): Check m_early_checks_p.
	(pass_waccess::check_alloc_size_call): Same.
	(pass_waccess::check_strcat): Same.
	(pass_waccess::check_strncat): Same.
	(pass_waccess::check_stxcpy): Same.
	(pass_waccess::check_stxncpy): Same.
	(pass_waccess::check_strncmp): Same.
	(pass_waccess::check_memop_access): Same.
	(pass_waccess::check_read_access): Same.
	(pass_waccess::check_builtin): Call check_pointer_uses.
	(pass_waccess::warn_invalid_pointer): Add arguments.
	(is_auto_decl): New function.
	(pass_waccess::check_stmt): New function.
	(pass_waccess::check_block): Call check_stmt.
	(pass_waccess::execute): Call check_dangling_uses,
	check_dangling_stores.  Empty m_clobbers.
	* passes.def (pass_warn_access): Invoke pass two more times.

gcc/testsuite/ChangeLog:

	PR c/63272
	* g++.dg/warn/Wfree-nonheap-object-6.C: Disable valid warnings.
	* gcc.dg/uninit-pr50476.c: Expect a new warning.
	* c-c++-common/Wdangling-pointer-2.c: New test.
	* c-c++-common/Wdangling-pointer-3.c: New test.
	* c-c++-common/Wdangling-pointer-4.c: New test.
	* c-c++-common/Wdangling-pointer-5.c: New test.
	* c-c++-common/Wdangling-pointer.c: New test.
	* gcc.dg/Wdangling-pointer-2.c: New test.
	* gcc.dg/Wdangling-pointer.c: New test.

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index fb1abc0de4c..2e978ae9071 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -548,6 +548,14 @@ Wdangling-else
 C ObjC C++ ObjC++ Var(warn_dangling_else) Warning LangEnabledBy(C ObjC C++ ObjC++,Wparentheses)
 Warn about dangling else.
 
+Wdangling-pointer
+C ObjC C++ LTO ObjC++ Alias(Wdangling-pointer=, 2, 0) Warning
+Warn for uses of pointers to auto variables whose lifetime has ended.
+
+Wdangling-pointer=
+C ObjC C++ ObjC++ Joined RejectNegative UInteger Var(warn_dangling_pointer) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall, 2, 0) IntegerRange(0, 2)
+Warn for uses of pointers to auto variables whose lifetime has ended.
+
 Wdate-time
 C ObjC C++ ObjC++ CPP(warn_date_time) CppReason(CPP_W_DATE_TIME) Var(cpp_warn_date_time) Init(0) Warning
 Warn about __TIME__, __DATE__ and __TIMESTAMP__ usage.
diff --git a/gcc/diagnostic-spec.c b/gcc/diagnostic-spec.c
index 921e7ab7423..3b1e37a6836 100644
--- a/gcc/diagnostic-spec.c
+++ b/gcc/diagnostic-spec.c
@@ -99,6 +99,7 @@ nowarn_spec_t::nowarn_spec_t (opt_code opt)
 	m_bits = NW_UNINIT;
   break;
 
+case OPT_Wdangling_pointer_:
 case

[committed] analyzer: add regression test [PR99269]

2021-11-30 Thread David Malcolm via Gcc-patches

Successfully regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-5641-g9603bccba62e250d0ff64863a1730a167d571a25.

gcc/testsuite/ChangeLog:
PR analyzer/99269
* gcc.dg/analyzer/pr99269.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/testsuite/gcc.dg/analyzer/pr99269.c | 16 
 1 file changed, 16 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr99269.c

diff --git a/gcc/testsuite/gcc.dg/analyzer/pr99269.c 
b/gcc/testsuite/gcc.dg/analyzer/pr99269.c
new file mode 100644
index 000..1cce3aef8dd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pr99269.c
@@ -0,0 +1,16 @@
+#include <stdlib.h>
+
+void example(void) {
+   int len;
+   int **namelist = NULL;
+
+   len = 2;
+   namelist = malloc(len * sizeof *namelist);
+   if (!namelist) return;
+   namelist[0] = malloc(sizeof **namelist);
+   namelist[1] = malloc(sizeof **namelist);
+
+   while(len--) { free(namelist[len]); }
+   free(namelist);
+   return;
+}
-- 
2.26.3

Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

On 11/30/21 2:44 PM, Qing Zhao via Gcc-patches wrote:
> Sorry for the confusion.
> My major question is:  
> 
> for a variable of type __vector_pair,  could it be in a register?

Yes.  To be pedantic, it will live in a vector register pair.

> If it could be in a register, can we initialize this register with some 
> constant value? 

For a __vector_pair, no, not as it is set up now.  We also do not have a
use case where we would want to initialize a __vector_pair to a constant.
Our normal (only?) use case with a __vector_pair is to load it up with
some actual data from memory that represents a (partial) row of a matrix. 

For __vector_quad, it too lives in a register (accumulator register) and
represents a small matrix.  We have the __builtin_mma_xxsetaccz ()
builtin to initialize it to a zero constant.

Peter

[PATCH v2 1/2] add -Wuse-after-free

2021-11-30 Thread Martin Sebor via Gcc-patches


Attached is a revised patch with the following changes based
on your comments:

1) Set and use statement uids to determine which statement
   precedes which in the same basic block.
2) Avoid testing flag_isolate_erroneous_paths_dereference.
3) Use post-dominance to decide whether to use the "maybe"
   phrasing vs a definite form.

David raised (and in our offline discussion today reiterated)
an objection to the default setting of the option being
the strictest.  I have not changed that in this revision.
See my rationale for this choice in my reply below:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583176.html

Martin

On 11/23/21 2:16 PM, Martin Sebor wrote:

On 11/22/21 6:32 PM, Jeff Law wrote:



On 11/1/2021 4:17 PM, Martin Sebor via Gcc-patches wrote:

Patch 1 in the series detects a small subset of uses of pointers
made indeterminate by calls to deallocation functions like free
or C++ operator delete.  To control the conditions the warnings
are issued under the new -Wuse-after-free= option provides three
levels.  At the lowest level the warning triggers only for
unconditional uses of freed pointers and doesn't warn for uses
in equality expressions.  Level 2 warns also for some conditional
uses, and level 3 also for uses in equality expressions.

I debated whether to make level 2 or 3 the default included in
-Wall.  I decided on 3 for two reasons: 1) to raise awareness
of both the problem and GCC's new ability to detect it: using
a pointer after it's been freed, even only in principle, by
a successful call to realloc, is undefined, and 2) because
it's trivial to lower the level either globally, or locally
by suppressing the warning around such misuses.
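
As a rough illustration of the realloc case mentioned above, here is a
minimal sketch of a pattern that never looks at the old pointer after the
call, so it should stay quiet at every level. The helper name
grow_keep_cursor and its signature are hypothetical (they are not part of
the patch), and the sketch assumes the cursor points into the buffer and
that its offset still fits in the new size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper, for illustration only: grow BUF to NEWSIZE bytes
// while keeping *CURSOR pointing at the same element.
char *grow_keep_cursor (char *buf, std::size_t newsize, char **cursor)
{
  assert (buf && cursor && *cursor >= buf);

  // Save the cursor as an offset while BUF is still valid.
  const std::size_t off = static_cast<std::size_t> (*cursor - buf);

  char *newbuf = static_cast<char *> (std::realloc (buf, newsize));
  if (!newbuf)
    return nullptr;   // realloc failed: BUF is untouched and still valid

  // From here on BUF must not be used, not even in a comparison such as
  //   if (newbuf != buf) ...
  // which is the kind of equality use the message assigns to level 3.
  *cursor = newbuf + off;
  return newbuf;
}
```

The design choice is simply to convert the pointer to an offset before the
deallocating call and to rebase it afterwards, rather than comparing the old
and new pointers to decide whether a fix-up is needed.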

I've tested the patch on x86_64-linux and by building Glibc
and Binutils/GDB.  It triggers a number of times in each, all
due to comparing invalidated pointers for equality (i.e., level
3).  I have suppressed these in GCC (libiberty) by a #pragma,
and will see how the Glibc folks want to deal with theirs (I
track them in BZ #28521).

The tests contain a number of xfails due to limitations I'm
aware of.  I marked them pr?? until the patch is approved.
I will open bugs for them before committing if I don't resolve
them in a followup.

Martin

gcc-63272-1.diff

Add -Wuse-after-free.

gcc/c-family/ChangeLog

* c.opt (-Wuse-after-free): New options.

gcc/ChangeLog:

* diagnostic-spec.c (nowarn_spec_t::nowarn_spec_t): Handle
OPT_Wreturn_local_addr and OPT_Wuse_after_free_.
* diagnostic-spec.h (NW_DANGLING): New enumerator.
* doc/invoke.texi (-Wuse-after-free): Document new option.
* gimple-ssa-warn-access.cc (pass_waccess::check_call): Rename...
(pass_waccess::check_call_access): ...to this.
(pass_waccess::check): Rename...
(pass_waccess::check_block): ...to this.
(pass_waccess::check_pointer_uses): New function.
(pass_waccess::gimple_call_return_arg): New function.
(pass_waccess::warn_invalid_pointer): New function.
(pass_waccess::check_builtin): Handle free and realloc.
(gimple_use_after_inval_p): New function.
(get_realloc_lhs): New function.
(maybe_warn_mismatched_realloc): New function.
(pointers_related_p): New function.
(pass_waccess::check_call): Call check_pointer_uses.
(pass_waccess::execute): Compute and free dominance info.

libcpp/ChangeLog:

* files.c (_cpp_find_file): Substitute a valid pointer for
an invalid one to avoid -Wuse-after-free.

libiberty/ChangeLog:

* regex.c: Suppress -Wuse-after-free.

gcc/testsuite/ChangeLog:

* gcc.dg/Wmismatched-dealloc-2.c: Avoid -Wuse-after-free.
* gcc.dg/Wmismatched-dealloc-3.c: Same.
* gcc.dg/attr-alloc_size-6.c: Disable -Wuse-after-free.
* gcc.dg/attr-alloc_size-7.c: Same.
* c-c++-common/Wuse-after-free-2.c: New test.
* c-c++-common/Wuse-after-free-3.c: New test.
* c-c++-common/Wuse-after-free-4.c: New test.
* c-c++-common/Wuse-after-free-5.c: New test.
* c-c++-common/Wuse-after-free-6.c: New test.
* c-c++-common/Wuse-after-free-7.c: New test.
* c-c++-common/Wuse-after-free.c: New test.
* g++.dg/warn/Wdangling-pointer.C: New test.
* g++.dg/warn/Wmismatched-dealloc-3.C: New test.
* g++.dg/warn/Wuse-after-free.C: New test.

diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc
index 63fc27a1487..2065402a2b9 100644
--- a/gcc/gimple-ssa-warn-access.cc
+++ b/gcc/gimple-ssa-warn-access.cc
@@ -3397,33 +3417,460 @@ pass_waccess::maybe_check_dealloc_call (gcall *call)

  }
  }
+/* Return true if either USE_STMT's basic block (that of a pointer's use)
+   is dominated by INVAL_STMT's (that of a pointer's invalidating statement,
+   which is either a clobber or a deallocation call), or if they're in
+   the same block, USE_STMT follows INVAL_STMT.  */
+
+static bool
+gimple_use_after_inval_p (gimple *inval_stmt, gimple *use_stmt,
+  bool last_block = false)
+{
+  tree clobvar =
+    gimple_clobber_p (inval_stmt)

Re: [PATCH] Avoid some -Wunreachable-code-ctrl


On 11/29/21 10:03, Richard Biener via Gcc-patches wrote:

This cleans up unreachable code diagnosed by -Wunreachable-code-ctrl.
It largely follows the previous series but discovers a few extra
cases, namely dead code after break or continue or loops without
exits.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2021-11-29  Richard Biener  

gcc/c/
* gimple-parser.c (c_parser_gimple_postfix_expression):
avoid unreachable code after break.

gcc/
* cfgrtl.c (skip_insns_after_block): Refactor code to
be more easily readable.
* expr.c (op_by_pieces_d::run): Remove unreachable
assert.
* sched-deps.c (sched_analyze): Remove unreachable
gcc_unreachable.
* sel-sched-ir.c (in_same_ebb_p): Likewise.
* tree-ssa-alias.c (nonoverlapping_refs_since_match_p):
Remove unreachable code.
* tree-vect-slp.c (vectorize_slp_instance_root_stmt):
Refactor to avoid unreachable loop iteration.
* tree.c (walk_tree_1): Remove unreachable break.
* vec-perm-indices.c (vec_perm_indices::series_p): Remove
unreachable return.

gcc/cp/
* parser.c (cp_parser_postfix_expression): Remove
unreachable code.
* pt.c (tsubst_expr): Remove unreachable breaks.

gcc/fortran/
* frontend-passes.c (gfc_expr_walker): Remove unreachable
break.
* scanner.c (skip_fixed_comments): Remove unreachable
gcc_unreachable.
* trans-expr.c (gfc_expr_is_variable): Refactor to make
control flow more obvious.
---
  gcc/c/gimple-parser.c |  8 +---
  gcc/cfgrtl.c  | 10 ++
  gcc/cp/parser.c   |  4 
  gcc/cp/pt.c   |  2 --
  gcc/expr.c|  3 ---
  gcc/fortran/frontend-passes.c |  1 -
  gcc/fortran/scanner.c |  1 -
  gcc/fortran/trans-expr.c  | 11 +++
  gcc/sched-deps.c  |  2 --
  gcc/sel-sched-ir.c|  3 ---
  gcc/tree-ssa-alias.c  |  3 ---
  gcc/tree-vect-slp.c   | 22 --
  gcc/tree.c|  2 --
  gcc/vec-perm-indices.c|  1 -
  14 files changed, 14 insertions(+), 59 deletions(-)

diff --git a/gcc/c/gimple-parser.c b/gcc/c/gimple-parser.c
index 32f22dbb8a7..f594a8ccb31 100644
--- a/gcc/c/gimple-parser.c
+++ b/gcc/c/gimple-parser.c
@@ -1698,13 +1698,7 @@ c_parser_gimple_postfix_expression (gimple_parser 
)
}
  break;
}
-  else
-   {
- c_parser_error (parser, "expected expression");
- expr.set_error ();
- break;
-   }
-  break;
+  /* Fallthru.  */
  default:
c_parser_error (parser, "expected expression");
expr.set_error ();
diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c
index 3744adcc2ba..287a3db643a 100644
--- a/gcc/cfgrtl.c
+++ b/gcc/cfgrtl.c
@@ -3539,14 +3539,8 @@ skip_insns_after_block (basic_block bb)
  continue;
  
  	case NOTE:

- switch (NOTE_KIND (insn))
-   {
-   case NOTE_INSN_BLOCK_END:
- gcc_unreachable ();
-   default:
- continue;
-   }
- break;
+ gcc_assert (NOTE_KIND (insn) != NOTE_INSN_BLOCK_END);
+ continue;
  
  	case CODE_LABEL:

  if (NEXT_INSN (insn)
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 0bd58525726..cc88a36dd39 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -7892,10 +7892,6 @@ cp_parser_postfix_expression (cp_parser *parser, bool address_p, bool cast_p,
  return postfix_expression;
}
  }
-
-  /* We should never get here.  */
-  gcc_unreachable ();


Hmm, I generally disagree with removing gcc_unreachable() asserts 
because they are unreachable; it seems like it increases the fragility 
of the code in case later changes wrongly make them reachable.



-  return error_mark_node;
  }
  
  /* Helper function for cp_parser_parenthesized_expression_list and

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 31ed773e145..f4b9d9673fb 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -18242,13 +18242,11 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl,
stmt = finish_co_yield_expr (input_location,
   RECUR (TREE_OPERAND (t, 0)));
RETURN (stmt);
-  break;
  
  case CO_AWAIT_EXPR:

stmt = finish_co_await_expr (input_location,
   RECUR (TREE_OPERAND (t, 0)));
RETURN (stmt);
-  break;
  
  case EXPR_STMT:

tmp = RECUR (EXPR_STMT_EXPR (t));
diff --git a/gcc/expr.c b/gcc/expr.c
index 5673902b1fc..b2815257509 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -1342,9 +1342,6 @@ op_by_pieces_d::run ()
}
  }
while (1);
-
-  /* The code above should have handled everything.  */
-  gcc_assert (!length);
  }
  
  /* Derived class from op_by_pieces_d, providing support for block move

diff --git

Re: [PATCH] tree-optimization/98956 Optimizing out boolean left shift

2021-11-30 Thread Andrew Pinski via Gcc-patches

On Tue, Nov 30, 2021 at 8:35 AM Navid Rahimi via Gcc-patches
 wrote:
>
> Hi GCC community,
>
> This patch adds the missed pattern described in bug 98956 [1] to
> match.pd. The codegen and correctness proofs for this pattern are at [2,3],
> in case anyone is curious. Tested on x86_64 Linux.
>

A better way to optimize this is the following (which I describe in PR 64992):
 take: (t << 1) != 0;

This should be transformed into:
(t & 0x7fff) != 0

The rest will just fall out really.  That is, there is no reason to
special-case bool here.
I have most of the patch except for creating the mask, which should be
simple; I just did not want to look up the wi:: functions at the time
I was writing it into the bug report.
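
A quick sanity check of the equivalence, as a sketch only: it assumes a
32-bit unsigned operand and derives the mask from the shift count instead
of hard-coding it. The width and the helper names shifted_nonzero and
masked_nonzero are assumptions for illustration; none of this is taken
from match.pd or from the PR.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only.

// Original form: (t << s) != 0, with the result truncated back to 32 bits.
// Valid for shift counts 0 <= s < 32.
constexpr bool shifted_nonzero (std::uint32_t t, unsigned s)
{
  return static_cast<std::uint32_t> (t << s) != 0;
}

// Rewritten form: keep only the bits that survive the shift and test those.
// For s == 1 the mask is 0x7fffffff.
constexpr bool masked_nonzero (std::uint32_t t, unsigned s)
{
  return (t & (UINT32_MAX >> s)) != 0;
}

// Both forms agree, including when only the top bit is set.
static_assert (!shifted_nonzero (0x80000000u, 1), "top bit is shifted out");
static_assert (!masked_nonzero (0x80000000u, 1), "top bit is masked out");
static_assert (shifted_nonzero (0x40000000u, 1), "bit 30 survives the shift");
static_assert (masked_nonzero (0x40000000u, 1), "bit 30 survives the mask");
```

Nothing here depends on the operand being a bool, which is the point made
above: the masked comparison covers the boolean case as one instance.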

Thanks,
Andrew Pinski



> Tree-optimization/98956:
>
> Adding new optimization to match.pd:
> * match.pd ((B0 << x) cmp 0) -> B0 cmp 0 : New optimization.
> * gcc.dg/tree-ssa/pr98956.c: testcase for this optimization.
> * gcc.dg/tree-ssa/pr98956-2.c: testcase for node with 
> side-effect.
>
> 1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98956
> 2) https://compiler-explorer.com/z/nj4PTrecW
> 3) https://alive2.llvm.org/ce/z/jyJAoS
>
> Best wishes,
> Navid.

[committed] wwwdocs: readings: Switch the DWARF Workgroup to https

2021-11-30 Thread Gerald Pfeifer

While we are at it, remove the unnecessary trailing slash.

Pushed, Gerald

---
 htdocs/readings.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/readings.html b/htdocs/readings.html
index e75bfc49..12755d7e 100644
--- a/htdocs/readings.html
+++ b/htdocs/readings.html
@@ -595,7 +595,7 @@ names.
   http://refspecs.linux-foundation.org/elf/elfspec_ppc.pdf;>System
   V PowerPC ABI
 
-  http://dwarfstd.org/;>DWARF Workgroup
+  https://dwarfstd.org;>DWARF Workgroup
 
 
 
-- 
2.34.0

[committed] wwwdocs: gcc-4.7: Update reference to Go 1 language standard

2021-11-30 Thread Gerald Pfeifer

Just a trivial, if permanent redirect, to follow.

Pushed, Gerald

---
 htdocs/gcc-4.7/changes.html | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/htdocs/gcc-4.7/changes.html b/htdocs/gcc-4.7/changes.html
index 21294cc3..846946d6 100644
--- a/htdocs/gcc-4.7/changes.html
+++ b/htdocs/gcc-4.7/changes.html
@@ -1017,8 +1017,7 @@ complete (that is, it is possible that some PRs that have been fixed
 are not listed here).
 
 The Go front end in the 4.7.1 release fully supports
-the https://golang.org/doc/go1;>Go 1 language
-standard.
+the https://go.dev/doc/go1;>Go 1 language standard.
 
 GCC 4.7.2
 
-- 
2.34.0

[committed] wwwdocs: gcc-12: Tweak language in the Fortran section

2021-11-30 Thread Gerald Pfeifer

Pushed.

---
 htdocs/gcc-12/changes.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index 10ac025f..45a8d99a 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -322,7 +322,7 @@ a work-in-progress.
 GCC 12 now uses OPERATION as the name of the function to
 the CO_REDUCE intrinsic for the pairwise reduction, thus
 conforming to the Fortran 2018 standard.  Previous versions
-used OPERATOR, which conformed to TS 18508.
+used OPERATOR which conforms to TS 18508.
   
 
 
-- 
2.34.0

[PATCH] c++: don't fold away 'if' with constant condition

richi's recent unreachable code warning experiments had trouble with the C++
front end folding away an 'if' with a constant condition.  Let's do less
folding at the statement level.  Thanks to Marek for finding the offending
code.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* cp-gimplify.c (genericize_if_stmt): Always build a COND_EXPR.
---
 gcc/cp/cp-gimplify.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index 0988655eeba..0a002db14e7 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -166,11 +166,8 @@ genericize_if_stmt (tree *stmt_p)
  can contain unfolded immediate function calls, we have to discard
  the then_ block regardless of whether else_ has side-effects or not.  */
   if (IF_STMT_CONSTEVAL_P (stmt))
-stmt = else_;
-  else if (integer_nonzerop (cond) && !TREE_SIDE_EFFECTS (else_))
-stmt = then_;
-  else if (integer_zerop (cond) && !TREE_SIDE_EFFECTS (then_))
-stmt = else_;
+stmt = build3 (COND_EXPR, void_type_node, boolean_false_node,
+  void_node, else_);
   else
 stmt = build3 (COND_EXPR, void_type_node, cond, then_, else_);
   protected_set_expr_location_if_unset (stmt, locus);

base-commit: 92de188ea3d36ec012b6d42959d4722e42524256
-- 
2.27.0

Re: [PATCH, fortran] Improve expansion of constant array expressions within constructors


Hello,

On 27/11/2021 21:56, Harald Anlauf via Fortran wrote:

diff --git a/gcc/fortran/array.c b/gcc/fortran/array.c
index 6552eaf3b0c..fbc66097c80 100644
--- a/gcc/fortran/array.c
+++ b/gcc/fortran/array.c
@@ -1804,6 +1804,12 @@ expand_constructor (gfc_constructor_base base)
   if (empty_constructor)
empty_ts = e->ts;

+  /* Simplify constant array expression/section within constructor.  */
+  if (e->expr_type == EXPR_VARIABLE && e->rank > 0 && e->ref
+ && e->symtree && e->symtree->n.sym
+ && e->symtree->n.sym->attr.flavor == FL_PARAMETER)
+   gfc_simplify_expr (e, 0);
+
   if (e->expr_type == EXPR_ARRAY)
{
  if (!expand_constructor (e->value.constructor))


There is another simplification call just a few lines below, that I 
thought could just be moved up.
But it works on a copy of the expression, and managing the copy makes it 
complex as well, so let’s do it your way.


OK.

Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

Sorry for the confusion.
My major question is:  

for a variable of type __vector_pair,  could it be in a register?
If it could be in a register, can we initialize this register with some 
constant value? 

Qing

> On Nov 30, 2021, at 2:07 PM, Peter Bergner  wrote:
> 
> On 11/30/21 1:50 PM, Qing Zhao via Gcc-patches wrote:
>>> void
>>> bar (__vector_pair *dst, __vector_pair *src)
>>> {
>>> __vector_pair pair;
>>> pair = *src;
>>> ...
>>> }
>> 
>> However, even with the above, the memory pointed to by “src” still needs
>> to be initialized somewhere. How is the initial value provided to a
>> variable of __vector_pair type in the first place?
> 
> Well no initialization is required here in this function.  Isn't that what
> matters here?  When generating code for bar(), we assume that src already
> points to initialized memory.
> 
> As for what src points to, that could be initialized the same way as any
> other memory or array: a static array, data read from a file into an
> array, array values computed in a loop, etc.
> 
> Peter
>

Re: [PATCH] c++, v2: Allow indeterminate unsigned char or std::byte in bit_cast - P1272R4


On 11/30/21 07:17, Jakub Jelinek wrote:

On Mon, Nov 29, 2021 at 10:25:58PM -0500, Jason Merrill wrote:

It's a DR.  Really, it was intended to be part of C++20; at the Cologne
meeting in 2019 CWG thought byteswap was going to make C++20, so this bugfix
could go in as part of that paper.


Ok, changed to be done unconditionally now.


Also, allowing indeterminate values that are never read was in C++20
(P1331).


Reading P1331R2 again, I'm still puzzled.
Our current behavior (both before and after this patch) is that if
some variable is scalar and has indeterminate value or if an aggregate
variable has some members (possibly nested) with indeterminate values,
in constexpr contexts we allow copying those into other vars of the
same type (e.g. the testcases in the patch below test mere copying
of the whole structures or unsigned char result of __builtin_bit_cast),


That seems to be a bug, since the copy involves an lvalue-to-rvalue 
conversion.



but we reject if we actually use them in some other way (e.g. try to
read a member from a variable that has that member indeterminate,
see e.g. bit-cast14.C (f5, f6, f7), even when reading it into an
unsigned char variable).


That's correct.


Then there is P1331R2 which makes the UB on
"an lvalue-to-rvalue conversion that is applied to an object with
indeterminate value ([basic.indet]);"
but isn't even the
   unsigned char a = __builtin_bit_cast (unsigned char, u);
   unsigned char b = a;
case non-constant then when __builtin_bit_cast returns indeterminate value?


Good point.  So it would seem to follow that if the output is going to 
have an indeterminate value, it's non-constant, we don't have to work 
hard in constexpr evaluation, and f1-4 are all non-constant.  And the 
new bit_cast text is only interesting for non-constant evaluation.



__builtin_bit_cast returns an rvalue, so no lvalue-to-rvalue conversion
happens in that case, so supposedly
   unsigned char a = __builtin_bit_cast (unsigned char, u);
is fine, but on


Eh, there's clearly an lvalue-rvalue conversion involved in reading from 
the source value.



   unsigned char b = a;
a is lvalue and is converted to rvalue.
Similarly
   T t = { 1, 2 };
   S s = __builtin_bit_cast (S, t);
   S u = s;
where S s = __builtin_bit_cast (S, t); could be ok even when some or all
members are indeterminate, but u = s; does lvalue-to-rvalue conversion?



Or there is http://eel.is/c++draft/basic.indet that has quite clear rules
what is and isn't UB and if C++ wanted to go further and allow all those
valid cases in there as constant...

Anyway, I hope this can be dealt with incrementally.



I think in all of them the result of the cast has (some) indeterminate
value.  So f1-3 are OK because the indeterminate value has unsigned char
type and is never used; f4() is non-constant because S::f has
non-byte-access type and so the new wording says it's undefined.


Ok, implemented the bitfield handling then.

Here is an updated patch, so far lightly tested.

2021-11-30  Jakub Jelinek 

* constexpr.c (clear_uchar_or_std_byte_in_mask): New function.
(cxx_eval_bit_cast): Don't error about padding bits if target
type is unsigned char or std::byte, instead return no clearing
ctor.  Use clear_uchar_or_std_byte_in_mask.

* g++.dg/cpp2a/bit-cast11.C: New test.
* g++.dg/cpp2a/bit-cast12.C: New test.
* g++.dg/cpp2a/bit-cast13.C: New test.
* g++.dg/cpp2a/bit-cast14.C: New test.

--- gcc/cp/constexpr.c.jj   2021-11-30 09:44:46.531607444 +0100
+++ gcc/cp/constexpr.c  2021-11-30 12:20:29.105251443 +0100
@@ -4268,6 +4268,121 @@ check_bit_cast_type (const constexpr_ctx
return false;
  }
  
+/* Helper function for cxx_eval_bit_cast.  For unsigned char or

+   std::byte members of CONSTRUCTOR (recursively) if they contain
+   some indeterminate bits (as set in MASK), remove the ctor elts,
+   mark the CONSTRUCTOR as CONSTRUCTOR_NO_CLEARING and clear the
+   bits in MASK.  */
+
+static void
+clear_uchar_or_std_byte_in_mask (location_t loc, tree t, unsigned char *mask)
+{
+  if (TREE_CODE (t) != CONSTRUCTOR)
+return;
+
+  unsigned i, j = 0;
+  tree index, value;
+  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (t), i, index, value)
+{
+  tree type = TREE_TYPE (value);
+  if (TREE_CODE (TREE_TYPE (t)) != ARRAY_TYPE
+ && DECL_BIT_FIELD_TYPE (index) != NULL_TREE)
+   {
+ if (is_byte_access_type (DECL_BIT_FIELD_TYPE (index))
+ && (TYPE_MAIN_VARIANT (DECL_BIT_FIELD_TYPE (index))
+ != char_type_node))
+   {
+ HOST_WIDE_INT fldsz = TYPE_PRECISION (TREE_TYPE (index));
+ gcc_assert (fldsz != 0);
+ HOST_WIDE_INT pos = int_byte_position (index);
+ HOST_WIDE_INT bpos
+   = tree_to_uhwi (DECL_FIELD_BIT_OFFSET (index));
+ bpos %= BITS_PER_UNIT;
+ HOST_WIDE_INT end
+   = ROUND_UP (bpos + fldsz, BITS_PER_UNIT) / BITS_PER_UNIT;

[committed] libstdc++: Skip tag dispatching for _S_relocate in C++17

Tested x86_64-linux, pushed to trunk.


In C++17 mode all callers of _S_relocate have already done:

  if constexpr (_S_use_relocate())

so we don't need to repeat that check and use tag dispatching to avoid
ill-formed instantiations.

libstdc++-v3/ChangeLog:

* include/bits/stl_vector.h (vector::_S_do_relocate): Remove
C++20 constexpr specifier.
(vector::_S_relocate) [__cpp_if_constexpr]: Call __relocate_a
directly without tag dispatching.
---
 libstdc++-v3/include/bits/stl_vector.h | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/bits/stl_vector.h b/libstdc++-v3/include/bits/stl_vector.h
index 4587757637e..36b2cff3d78 100644
--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -481,14 +481,14 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
return _S_nothrow_relocate(__is_move_insertable<_Tp_alloc_type>{});
   }
 
-  static _GLIBCXX20_CONSTEXPR pointer
+  static pointer
   _S_do_relocate(pointer __first, pointer __last, pointer __result,
 _Tp_alloc_type& __alloc, true_type) noexcept
   {
return std::__relocate_a(__first, __last, __result, __alloc);
   }
 
-  static _GLIBCXX20_CONSTEXPR pointer
+  static pointer
   _S_do_relocate(pointer, pointer, pointer __result,
 _Tp_alloc_type&, false_type) noexcept
   { return __result; }
@@ -497,8 +497,13 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   _S_relocate(pointer __first, pointer __last, pointer __result,
  _Tp_alloc_type& __alloc) noexcept
   {
+#if __cpp_if_constexpr
+   // All callers have already checked _S_use_relocate() so just do it.
+   return std::__relocate_a(__first, __last, __result, __alloc);
+#else
using __do_it = __bool_constant<_S_use_relocate()>;
return _S_do_relocate(__first, __last, __result, __alloc, __do_it{});
+#endif
   }
 #endif // C++11
 
-- 
2.31.1
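
To illustrate the difference the commit message describes, here is a small
sketch contrasting tag dispatching with `if constexpr`. The names
(path_impl, path_tagged, path_if_constexpr, Fast, Slow) and the use of
std::is_trivially_copyable as the condition are assumptions for
illustration; they are not libstdc++ internals. It needs C++17.

```cpp
#include <type_traits>

// Hypothetical example types.
struct Fast { int v; };                      // trivially copyable

struct Slow                                  // user-provided copy constructor,
{                                            // so not trivially copyable
  int v;
  constexpr explicit Slow (int x) : v (x) {}
  constexpr Slow (const Slow &o) : v (o.v) {}
};

// Tag dispatching: both overloads exist and the tag argument picks one.
template<typename T>
constexpr int path_impl (const T &, std::true_type)  { return 1; }

template<typename T>
constexpr int path_impl (const T &, std::false_type) { return 0; }

template<typename T>
constexpr int path_tagged (const T &x)
{
  return path_impl (x, std::is_trivially_copyable<T>{});
}

// C++17: the branch not taken is a discarded statement and is not
// instantiated, so no helper overloads are needed.
template<typename T>
constexpr int path_if_constexpr (const T &)
{
  if constexpr (std::is_trivially_copyable_v<T>)
    return 1;
  else
    return 0;
}

static_assert (path_tagged (Fast{1}) == path_if_constexpr (Fast{1}), "");
static_assert (path_tagged (Slow{1}) == path_if_constexpr (Slow{1}), "");
```

Once every caller has already made the choice with `if constexpr`, as the
message says is the case for _S_relocate in C++17, the callee no longer
needs the tagged overloads to keep the unwanted path from being
instantiated.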

[committed] libstdc++: Make Asan detection work for Clang [PR103453]

Tested x86_64-linux, pushed to trunk.


Clang doesn't define __SANITIZE_ADDRESS__ so use its __has_feature check
to detect Asan instead.

libstdc++-v3/ChangeLog:

PR libstdc++/103453
* config/allocator/malloc_allocator_base.h
(_GLIBCXX_SANITIZE_STD_ALLOCATOR): Define for Clang.
* config/allocator/new_allocator_base.h
(_GLIBCXX_SANITIZE_STD_ALLOCATOR): Likewise.
---
 libstdc++-v3/config/allocator/malloc_allocator_base.h | 10 --
 libstdc++-v3/config/allocator/new_allocator_base.h| 10 --
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/config/allocator/malloc_allocator_base.h b/libstdc++-v3/config/allocator/malloc_allocator_base.h
index d7b56e3c9ef..b798d3fd448 100644
--- a/libstdc++-v3/config/allocator/malloc_allocator_base.h
+++ b/libstdc++-v3/config/allocator/malloc_allocator_base.h
@@ -52,8 +52,14 @@ namespace std
 # define __allocator_base  __gnu_cxx::malloc_allocator
 #endif
 
-#if defined(__SANITIZE_ADDRESS__) && !defined(_GLIBCXX_SANITIZE_STD_ALLOCATOR)
-# define _GLIBCXX_SANITIZE_STD_ALLOCATOR 1
+#ifndef _GLIBCXX_SANITIZE_STD_ALLOCATOR
+# if defined(__SANITIZE_ADDRESS__)
+#  define _GLIBCXX_SANITIZE_STD_ALLOCATOR 1
+# elif defined __has_feature
+#  if __has_feature(address_sanitizer)
+#   define _GLIBCXX_SANITIZE_STD_ALLOCATOR 1
+#  endif
+# endif
 #endif
 
 #endif
diff --git a/libstdc++-v3/config/allocator/new_allocator_base.h b/libstdc++-v3/config/allocator/new_allocator_base.h
index 77ee8b73979..7c52fef63de 100644
--- a/libstdc++-v3/config/allocator/new_allocator_base.h
+++ b/libstdc++-v3/config/allocator/new_allocator_base.h
@@ -52,8 +52,14 @@ namespace std
 # define __allocator_base  __gnu_cxx::new_allocator
 #endif
 
-#if defined(__SANITIZE_ADDRESS__) && !defined(_GLIBCXX_SANITIZE_STD_ALLOCATOR)
-# define _GLIBCXX_SANITIZE_STD_ALLOCATOR 1
+#ifndef _GLIBCXX_SANITIZE_STD_ALLOCATOR
+# if defined(__SANITIZE_ADDRESS__)
+#  define _GLIBCXX_SANITIZE_STD_ALLOCATOR 1
+# elif defined __has_feature
+#  if __has_feature(address_sanitizer)
+#   define _GLIBCXX_SANITIZE_STD_ALLOCATOR 1
+#  endif
+# endif
 #endif
 
 #endif
-- 
2.31.1
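
The same detection idiom can be tried outside libstdc++. The following is
a minimal sketch, not the library's code: the macro EXAMPLE_HAVE_ASAN and
the function built_with_asan are hypothetical names chosen for this
illustration.

```cpp
// Hypothetical macro name, used only for this illustration.
#if defined(__SANITIZE_ADDRESS__)
// GCC defines this when compiling with -fsanitize=address.
# define EXAMPLE_HAVE_ASAN 1
#elif defined(__has_feature)
// Clang does not define __SANITIZE_ADDRESS__ but offers __has_feature.
# if __has_feature(address_sanitizer)
#  define EXAMPLE_HAVE_ASAN 1
# endif
#endif

#ifndef EXAMPLE_HAVE_ASAN
# define EXAMPLE_HAVE_ASAN 0
#endif

// True when this translation unit was built with AddressSanitizer.
constexpr bool built_with_asan ()
{
  return EXAMPLE_HAVE_ASAN != 0;
}
```

As in the patch, the __has_feature test sits in its own nested #if.
Folding it into the #elif line together with `defined __has_feature`
would, as far as I know, fail to preprocess on a compiler that does not
provide __has_feature at all, because the whole expression still has to
parse.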

Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

On 11/30/21 1:50 PM, Qing Zhao via Gcc-patches wrote:
>> void
>> bar (__vector_pair *dst, __vector_pair *src)
>> {
>>  __vector_pair pair;
>>  pair = *src;
>>  ...
>> }
> 
> However, even with the above, the memory pointed to by “src” still needs
> to be initialized somewhere. How is the initial value provided to a
> variable of __vector_pair type in the first place?

Well no initialization is required here in this function.  Isn't that what
matters here?  When generating code for bar(), we assume that src already
points to initialized memory.

As for what src points to, that could be initialized the same way as any
other memory or array: a static array, data read from a file into an
array, array values computed in a loop, etc.

Peter

Re: [PATCH, Fortran] Fix setting of array lower bound for named arrays

2021-11-30 Thread Toon Moene


On 11/30/21 8:54 PM, Harald Anlauf via Fortran wrote:


Hi Tobias,



You seem to be quite convinced with your interpretation,
while I am simply confused.


If both compiler developers are confused, and actual compiler 
implementations differ in their outcomes of the test case, IMNSHO it is 
time to ask the Fortran Standardization Committee for an interpretation 
(of the standard's text).


Kind regards,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands

Re: [PATCH, Fortran] Fix setting of array lower bound for named arrays

2021-11-30 Thread Harald Anlauf via Gcc-patches

Hi Tobias,

On 30.11.21 at 18:24, Tobias Burnus wrote:

On 29.11.21 22:11, Harald Anlauf wrote:

"A whole array is a named array or a structure component whose final
part-ref is an array component name; no subscript list is appended."

I think in "h(3)" there is not really a named array – thus I read it as
if the "Otherwise ... result value is 1" applies.

If you read on in the standard:

"The appearance of a whole array variable in an executable construct
specifies all the elements of the array ..."

which might make you/makes me think that the sentence before that one
could need an official interpretation...

I am not sure whether I understand what part of the spec you wonder
about. (I mean besides that 'variable' can also mean referencing a
data-pointer-returning function.)

Strictly speaking you're now talking about the text for LBOUND,
and your quote is not from the standard section about the ALLOCATE
statement. And there are several places in the standard document
where there is an explicit reference to LBOUND when talking about
what the bounds should be. This is why I am unhappy with the text
about ALLOCATE, not about LBOUND.

Question: What do NAG/flang/... report for lbound(h(3)) - also [3] – or
[1] as gfortran?

I've submitted a reduced example to the Intel Fortran Forum:
https://community.intel.com/t5/Intel-Fortran-Compiler/Allocate-with-SOURCE-and-bounds/m-p/1339992#M158535

There are good chances that Steve Lionel reads and comments on it.

So far only "FortranFan" has replied – and he comes to the same
conclusion as my reading, albeit without referring to the standard.

You seem to be quite convinced with your interpretation,
while I am simply confused.

So go ahead and apply to mainline. Let's see if we learn more.
I do hope I will.

Harald

Tobias


[r12-5612 Regression] FAIL: gcc.target/i386/pr88531-1a.c (test for excess errors) on Linux/x86_64

2021-11-30 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

10833849b55401a52f2334eb032a70beb688e9fc is the first bad commit
commit 10833849b55401a52f2334eb032a70beb688e9fc
Author: Richard Sandiford 
Date:   Tue Nov 30 09:52:29 2021 +

vect: Support gather loads with SLP

caused

FAIL: gcc.target/i386/pr88531-1a.c (internal compiler error)
FAIL: gcc.target/i386/pr88531-1a.c (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-5612/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr88531-1a.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr88531-1a.c --target_board='unix{-m32\ 
-march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me
at skpgkp2 at gmail dot com)

Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]



> On Nov 30, 2021, at 12:08 PM, Peter Bergner  wrote:
> 
> On 11/30/21 11:51 AM, Qing Zhao wrote:
>> So, looks like that the variable with OPAQUE_TYPE cannot be all explicitly 
>> initialized even at source code level. 
>> 
>> The only way to initialize such variable (only __vector_quad, not for 
>> __vector_pairs) at source code level is through call to 
>> __builtin_mma_xxsetaccz as:
>> 
>> void
>> foo (__vector_quad *dst)
>> {
>>  __vector_quad acc;
>>  __builtin_mma_xxsetaccz (&acc);
>>  *dst = acc;
>> }
>> 
>> Is this the correct understanding?
> 
> Correct.  Or via...
> 
> 
>> Is there way to initialize such variable to other values than zero at source 
>> code level?
> 
> Not for any constant values.  You can load it from memory though like below,
> which is also allowed for __vector_pair:
> 
> void
> foo (__vector_quad *dst, __vector_quad *src)
> {
>  __vector_quad acc;
>  acc = *src;
>  ...
> }
> void
> bar (__vector_pair *dst, __vector_pair *src)
> {
>  __vector_pair pair;
>  pair = *src;
>  ...
> }

However, even with the above, the memory pointed to by “src” still needs to be
initialized somewhere.  How is the initial value provided to a variable of
__vector_pair type in the first place?

Qing
> 
> We do not accept things like:
> 
>  acc = 0;
>  acc = {0, 0, ... };
>  etc.
> 
> Peter

Re: [PATCH] PR fortran/101565 - ICE in gfc_simplify_image_index, at fortran/simplify.c:8234

2021-11-30 Thread Harald Anlauf via Gcc-patches


Hi Mikael,

On 30.11.21 at 12:25, Mikael Morin wrote:
> Hello,
>
> On 29/11/2021 at 22:31, Harald Anlauf via Fortran wrote:
>> Dear all,
>>
>> a trivial one: we need to check the type of the SUB argument
>> to the coarray IMAGE_INDEX intrinsic.  It has to be an array
>> of type integer.
>>
>> Patch by Steve Kargl.
>
> I hope at some point he’ll finally come to a working git workflow.

Initially I had to rethink my workflow habits when switching from
svn to git.  But after a steep learning curve I wouldn't want to
go back.  One day Steve might see it the same way.

>> Regtested on x86_64-pc-linux-gnu.  OK for mainline?
>
> Sure.



Thanks,
Harald

Re: [PATCH] Fix alignment of stack slots for overaligned types [PR103500]

2021-11-30 Thread Florian Weimer via Gcc-patches

* Alex Coplan via Gcc-patches:

> Bootstrapped and regtested on aarch64-linux-gnu, x86_64-linux-gnu, and
> arm-linux-gnueabihf: no regressions.
>
> I'd appreciate any feedback. Is it OK for trunk?

Does this need an ABI warning?

Thanks,
Florian

Re: [PATCH] libcpp: Enable P1949R7 for C++98 too [PR100977]


On 11/30/21 13:19, Jakub Jelinek wrote:
> On Mon, Nov 29, 2021 at 05:53:58PM -0500, Jason Merrill wrote:
>> I'm inclined to go ahead and change C++98 as well; I doubt anyone is relying
>> on the particular C++98 extended character set rules, and we already accept
>> the union of the different sets when not pedantic.
>
> Ok, here is an incremental patch to do that also for -std={c,gnu}++98.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.



[PATCH] libcpp: Enable P1949R7 for C++98 too [PR100977]

On Mon, Nov 29, 2021 at 05:53:58PM -0500, Jason Merrill wrote:
> I'm inclined to go ahead and change C++98 as well; I doubt anyone is relying
> on the particular C++98 extended character set rules, and we already accept
> the union of the different sets when not pedantic.

Ok, here is an incremental patch to do that also for -std={c,gnu}++98.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-11-30  Jakub Jelinek  

* init.c (struct lang_flags): Remove cxx23_identifiers.
(lang_defaults): Remove cxx23_identifiers initializers.
(cpp_set_lang): Don't copy cxx23_identifiers.
* include/cpplib.h (struct cpp_options): Adjust comment about
c11_identifiers.  Remove cxx23_identifiers field.
* lex.c (warn_about_normalization): Use cplusplus instead of
cxx23_identifiers.
* charset.c (ucn_valid_in_identifier): Likewise.

* g++.dg/cpp/ucnid-1.C: Adjust expected diagnostics.
* g++.dg/cpp/ucnid-1-utf8.C: Likewise.

--- gcc/init.c.jj   2021-11-29 22:54:46.503750631 +0100
+++ gcc/init.c  2021-11-30 01:06:31.704473882 +0100
@@ -82,7 +82,6 @@ struct lang_flags
   char extended_numbers;
   char extended_identifiers;
   char c11_identifiers;
-  char cxx23_identifiers;
   char std;
   char digraphs;
   char uliterals;
@@ -100,31 +99,31 @@ struct lang_flags
 };
 
 static const struct lang_flags lang_defaults[] =
-{ /*  c99 c++ xnum xid c11 c++23 std digr ulit rlit udlit bincst digsep trig u8chlit vaopt scope dfp szlit elifdef */
-  /* GNUC89   */  { 0,  0,  1,  0,  0,  0,0,  1,   0,   0,   0,0, 0, 0,   0,  1,   1, 0,   0,   0 },
-  /* GNUC99   */  { 1,  0,  1,  1,  0,  0,0,  1,   1,   1,   0,0, 0, 0,   0,  1,   1, 0,   0,   0 },
-  /* GNUC11   */  { 1,  0,  1,  1,  1,  0,0,  1,   1,   1,   0,0, 0, 0,   0,  1,   1, 0,   0,   0 },
-  /* GNUC17   */  { 1,  0,  1,  1,  1,  0,0,  1,   1,   1,   0,0, 0, 0,   0,  1,   1, 0,   0,   0 },
-  /* GNUC2X   */  { 1,  0,  1,  1,  1,  0,0,  1,   1,   1,   0,1, 1, 0,   1,  1,   1, 1,   0,   1 },
-  /* STDC89   */  { 0,  0,  0,  0,  0,  0,1,  0,   0,   0,   0,0, 0, 1,   0,  0,   0, 0,   0,   0 },
-  /* STDC94   */  { 0,  0,  0,  0,  0,  0,1,  1,   0,   0,   0,0, 0, 1,   0,  0,   0, 0,   0,   0 },
-  /* STDC99   */  { 1,  0,  1,  1,  0,  0,1,  1,   0,   0,   0,0, 0, 1,   0,  0,   0, 0,   0,   0 },
-  /* STDC11   */  { 1,  0,  1,  1,  1,  0,1,  1,   1,   0,   0,0, 0, 1,   0,  0,   0, 0,   0,   0 },
-  /* STDC17   */  { 1,  0,  1,  1,  1,  0,1,  1,   1,   0,   0,0, 0, 1,   0,  0,   0, 0,   0,   0 },
-  /* STDC2X   */  { 1,  0,  1,  1,  1,  0,1,  1,   1,   0,   0,1, 1, 1,   1,  0,   1, 1,   0,   1 },
-  /* GNUCXX   */  { 0,  1,  1,  1,  0,  0,0,  1,   0,   0,   0,0, 0, 0,   0,  1,   1, 0,   0,   0 },
-  /* CXX98    */  { 0,  1,  0,  1,  0,  0,1,  1,   0,   0,   0,0, 0, 1,   0,  0,   1, 0,   0,   0 },
-  /* GNUCXX11 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,0, 0, 0,   0,  1,   1, 0,   0,   0 },
-  /* CXX11    */  { 1,  1,  0,  1,  1,  1,1,  1,   1,   1,   1,0, 0, 1,   0,  0,   1, 0,   0,   0 },
-  /* GNUCXX14 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,1, 1, 0,   0,  1,   1, 0,   0,   0 },
-  /* CXX14    */  { 1,  1,  0,  1,  1,  1,1,  1,   1,   1,   1,1, 1, 1,   0,  0,   1, 0,   0,   0 },
-  /* GNUCXX17 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,1, 1, 0,   1,  1,   1, 0,   0,   0 },
-  /* CXX17    */  { 1,  1,  1,  1,  1,  1,1,  1,   1,   1,   1,1, 1, 0,   1,  0,   1, 0,   0,   0 },
-  /* GNUCXX20 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,1, 1, 0,   1,  1,   1, 0,   0,   0 },
-  /* CXX20    */  { 1,  1,  1,  1,  1,  1,1,  1,   1,   1,   1,1, 1, 0,   1,  1,   1, 0,   0,   0 },
-  /* GNUCXX23 */  { 1,  1,  1,  1,  1,  1,0,  1,   1,   1,   1,1, 1, 0,   1,  1,   1, 0,   1,   1 },
-  /* CXX23    */  { 1,  1,  1,  1,  1,  1,1,  1,   1,   1,   1,1, 1, 0,   1,  1,   1, 0,   1,   1 },
-  /* ASM      */  { 0,  0,  1,  0,  0,  0,0,  0,   0,   0,   0,0, 0, 0,   0,  0,   0, 0,   0,   0 }
+{ /*  c99 c++ xnum xid c11 std digr ulit rlit udlit bincst digsep trig u8chlit vaopt scope dfp szlit elifdef */
+  /* GNUC89   */  { 0,  0,  1,  0,  0,  0,  1,   0,   0,   0,0, 0, 0,   0,  1,   1, 0,   0,   0 },
+  /* GNUC99   */  { 1,  0,  1,  1,  0,  0,  1,   1,   1,   0,0, 0, 0,   0,  1,   1, 0,   0,   0 },
+  /* GNUC11   */  { 1,  0,  1,

Re: [PATCH] Modify combine pattern by anding a pseudo with its nonzero bits

2021-11-30 Thread Segher Boessenkool

Hi!

On Tue, Nov 30, 2021 at 04:46:34PM +0800, HAO CHEN GUI wrote:
>     This patch modifies the combine pattern with a helper - 
> change_pseudo_and_mask when recog fails. The helper converts a single pseudo 
> to the pseudo and with a mask if the outer operator is IOR/XOR/PLUS and the 
> inner operator is ASHIFT/LSHIFTRT/AND. The conversion helps match shift + ior 
> pattern.
> 
>     Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. 
> Is this okay for trunk? Any recommendations? Thanks a lot.

(Please make shorter lines in email.  70 chars is usual).

> gcc/
>     * combine.c (change_pseudo_and_mask): New.
>     (recog_for_combine): If recog fails, try again with the pattern
>     modified by change_pseudo_and_mask.
> 
> gcc/testsuite/
>     * gcc.target/powerpc/20050603-3.c: Modify the dump check conditions.
>     * gcc.target/powerpc/rlwimi-2.c: Likewise.

> +/* When the outer code of set_src is IOR/XOR/PLUS and the inner code is
> +   ASHIFT/LSHIFTRT/AND, convert a pseudo to pseudo AND with a mask if its
> +   nonzero_bits is less than its mode mask.  */

Please add some words *why* we do this (namely, because you cannot use
nonzero_bits in combine as well as after combine and expect the same
answer).
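
As a rough illustration of the idea behind the helper: when only some bits
of a pseudo can be nonzero, ANDing the pseudo with that mask leaves its
value unchanged but spells the knowledge out in the expression itself.  The
sketch below is plain C, not GCC internals; the function names and the 0xff
mask are assumptions made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a "nonzero bits" fact: the caller promises
   that Y has no bits set outside MASK.  */
static uint32_t
and_with_known_mask (uint32_t y, uint32_t mask)
{
  assert ((y & ~mask) == 0);   /* the promised invariant */
  return y & mask;             /* same value, but the mask is now explicit */
}

/* Shift + IOR form that relies on implicit knowledge about Y.  */
uint32_t
insert_implicit (uint32_t x, uint32_t y)
{
  return (x << 8) | y;
}

/* Same computation with the mask written out, so a matcher does not have
   to consult side information about Y.  */
uint32_t
insert_explicit (uint32_t x, uint32_t y)
{
  return (x << 8) | and_with_known_mask (y, 0xff);
}
```

Both functions return the same value whenever y fits in the low eight bits;
only the second one carries that fact in the expression.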

> +static bool
> +change_pseudo_and_mask (rtx pat)
> +{
> +  bool changed = false;
> +
> +  rtx src = SET_SRC (pat);
> +  if ((GET_CODE (src) == IOR
> +   || GET_CODE (src) == XOR
> +   || GET_CODE (src) == PLUS)
> +  && (((GET_CODE (XEXP (src, 0)) == ASHIFT
> +   || GET_CODE (XEXP (src, 0)) == LSHIFTRT
> +   || GET_CODE (XEXP (src, 0)) == AND)
> +  && REG_P (XEXP (src, 1)))
> + || ((GET_CODE (XEXP (src, 1)) == ASHIFT
> +  || GET_CODE (XEXP (src, 1)) == LSHIFTRT
> +  || GET_CODE (XEXP (src, 1)) == AND)
> + && REG_P (XEXP (src, 0)))))

If one arm is a pseudo and the other is compound, the compound one is
first always.  This is one of those canonicalisations that simplifies a
lot of code -- including this new code :-)
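
A toy model of how much this canonical order shortens the test.  It is a
sketch only: the `toy_expr` type, the `TOY_*` codes and `matches_canonical`
are hypothetical names, not GCC's rtx representation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression codes; hypothetical, not GCC's rtx codes.  */
enum toy_code
{
  TOY_REG, TOY_ASHIFT, TOY_LSHIFTRT, TOY_AND, TOY_IOR, TOY_XOR, TOY_PLUS
};

struct toy_expr
{
  enum toy_code code;
  const struct toy_expr *op[2];   /* operands; NULL for a register */
};

static bool
is_outer (enum toy_code c)
{
  return c == TOY_IOR || c == TOY_XOR || c == TOY_PLUS;
}

static bool
is_inner (enum toy_code c)
{
  return c == TOY_ASHIFT || c == TOY_LSHIFTRT || c == TOY_AND;
}

/* With the compound operand canonically first, only one operand order
   has to be tested.  */
bool
matches_canonical (const struct toy_expr *src)
{
  if (!is_outer (src->code))
    return false;
  assert (src->op[0] != NULL && src->op[1] != NULL);
  return is_inner (src->op[0]->code) && src->op[1]->code == TOY_REG;
}
```

The swapped form (register first, compound second) is simply rejected here,
on the assumption that canonicalisation never produces it.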

> +    {
> +  rtx *reg = REG_P (XEXP (src, 0))
> +    ? &XEXP (SET_SRC (pat), 0)
> +    : &XEXP (SET_SRC (pat), 1);

This is indented wrong.  But, in fact, all tabs are changed to spaces in
your patch?

> @@ -11586,7 +11622,14 @@ recog_for_combine (rtx *pnewpat, rtx_insn *insn, rtx 
> *pnotes)
>     }
>     }
>    else
> -   changed = change_zero_ext (pat);
> +   {
> + if (change_pseudo_and_mask (pat))
> +   {
> + maybe_swap_commutative_operands (SET_SRC (pat));
> + changed = true;
> +   }
> + changed |= change_zero_ext (pat);
> +   }
>  }
>    else if (GET_CODE (pat) == PARALLEL)
>  {


  changed = change_zero_ext (pat);
  if (!changed)
    changed = change_pseudo_and_mask (pat);

  if (changed)
    maybe_swap_commutative_operands (SET_SRC (pat));


> --- a/gcc/testsuite/gcc.target/powerpc/20050603-3.c
> +++ b/gcc/testsuite/gcc.target/powerpc/20050603-3.c
> @@ -12,7 +12,7 @@ void rotins (unsigned int x)
>    b.y = (x<<12) | (x>>20);
>  }
> 
> -/* { dg-final { scan-assembler-not {\mrlwinm} } } */
> +/* { dg-final { scan-assembler-not {\mrlwinm} { target ilp32 } } } */
>  /* { dg-final { scan-assembler-not {\mrldic} } } */
>  /* { dg-final { scan-assembler-not {\mrot[lr]} } } */
>  /* { dg-final { scan-assembler-not {\ms[lr][wd]} } } */

Please show the -m32 code before and after the change?  Why is it okay
to get an rlwinm there?

> diff --git a/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c 
> b/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
> index bafa371db73..ffb5f9e450f 100644
> --- a/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
> @@ -2,14 +2,14 @@
>  /* { dg-options "-O2" } */
> 
>  /* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 14121 { target ilp32 } 
> } } */
> -/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 20217 { target lp64 } } 
> } */
> +/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 21279 { target lp64 } } 
> } */

No, it is not okay to generate worse code.  In what cases do you see
more insns now, and why?

>  /* { dg-final { scan-assembler-times {(?n)^\s+blr} 6750 } } */
>  /* { dg-final { scan-assembler-times {(?n)^\s+mr} 643 { target ilp32 } } } */
>  /* { dg-final { scan-assembler-times {(?n)^\s+mr} 11 { target lp64 } } } */
>  /* { dg-final { scan-assembler-times {(?n)^\s+rldicl} 7790 { target lp64 } } 
> } */
> 
>  /* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1692 { target ilp32 } 
> } } */
> -/* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1666 { target lp64 } } 
> } */
> +/* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1692 { target lp64 } } 
> } */
> 
>  /* { dg-final { scan-assembler-times {(?n)^\s+mulli} 5036 } } */

Are the new rlwimi's good to have, or can we do those with simpler or
fewer insns?


Segher

Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

On 11/30/21 11:51 AM, Qing Zhao wrote:
> So it looks like variables with OPAQUE_TYPE cannot all be explicitly
> initialized, even at the source code level.
> 
> The only way to initialize such a variable at the source code level (only
> __vector_quad, not __vector_pair) is through a call to
> __builtin_mma_xxsetaccz, as in:
> 
> void
> foo (__vector_quad *dst)
> {
>   __vector_quad acc;
>   __builtin_mma_xxsetaccz (&acc);
>   *dst = acc;
> }
> 
> Is this the correct understanding?

Correct.  Or via...

> Is there a way to initialize such a variable to values other than zero at the
> source code level?

Not for any constant values.  You can load it from memory though like below,
which is also allowed for __vector_pair:

void
foo (__vector_quad *dst, __vector_quad *src)
{
  __vector_quad acc;
  acc = *src;
  ...
}
void
bar (__vector_pair *dst, __vector_pair *src)
{
  __vector_pair pair;
  pair = *src;
  ...
}

We do not accept things like:

  acc = 0;
  acc = {0, 0, ... };
  etc.

Peter

Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]



> On Nov 30, 2021, at 2:37 AM, Richard Biener wrote:
> 
> On Mon, Nov 29, 2021 at 11:56 PM Qing Zhao  wrote:
>> 
>> Peter,
>> 
>> Thanks a lot for the patch.
>> 
>> Richard, what do you think of the patch?
>> 
>> (The major concern for me is:
>> 
>> With the current patch proposed by Peter, we will generate the call to
>> .DEFERRED_INIT for a variable with OPAQUE_TYPE during the gimplification
>> phase.  However, if this variable is in a register, then the call to
>> .DEFERRED_INIT will NOT be expanded during the RTL expansion phase.  Might
>> this unexpanded call to .DEFERRED_INIT cause IR issues later?
> 
> I think that's inconsistent indeed.

Can we treat the call to .DEFERRED_INIT as a NOP during the expansion phase if
we cannot expand it to valid RTL for the OPAQUE_TYPE?  Would doing this resolve
the issues?

>  Peter, what are "opaque"
> registers?  rs6000-modes.def suggests
> that there's __vector_pair and __vector_quad, what's the GIMPLE types
> for those?  It seems they
> are either SSA names or expanded to pseudo registers but there's no
> constants for them.
> 
>> 
>> If the above is a real issue, should we skip initialization for all
>> OPAQUE_TYPE variables, even when they are in memory and can be initialized
>> with memset?  Then we should update “is_var_need_auto_init” in gimplify.c
>> instead.  However, the issue with this approach is that we might miss the
>> opportunity to initialize an OPAQUE_TYPE variable that ends up in memory.
>> ).
> 
> I think we need to bite the bullet at some point to do register initialization
> not via expand_assignment but directly based on what the LHS expands to.

OPAQUE_TYPE is so special that, from my understanding, it should not be the
reason to rewrite the register initialization.
If more issues are exposed later, it might become necessary to rewrite this part.

Qing
> 
> Can they be initialized?  I see they can be copied at least.
> 
> If such "things" cannot be initialized they should indeed be exempt
> from auto-init.  The
> documentation suggests that they act as bit-buckets but even bit-buckets should
> be initializable, thus why exactly does CONST0_RTX not exist for them?
> 
> Richard.
> 
> 
>> 
>> Thanks.
>> 
>> Qing
>> 
>> 
>>> On Nov 29, 2021, at 3:56 PM, Peter Bergner  wrote:
>>> 
>>> Sorry for dropping the ball on testing the patch from the bugzilla!
>>> 
>>> The following patch fixes the ICE reported in the bugzilla on the 
>>> pre-existing
>>> gcc testsuite test case, bootstraps and shows no testsuite regressions
>>> on powerpc64le-linux.  Ok for trunk?
>>> 
>>> Peter
>>> 
>>> 
>>> For -ftrivial-auto-var-init=*, skip initializing the register variable if it
>>> is an opaque type, because CONST0_RTX(mode) is not defined for opaque modes.
>>> 
>>> gcc/
>>>  PR middle-end/103127
>>>  * internal-fn.c (expand_DEFERRED_INIT): Skip if VAR_TYPE is opaque.
>>> 
>>> diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
>>> index 0cba95411a6..7cc0e9d5293 100644
>>> --- a/gcc/internal-fn.c
>>> +++ b/gcc/internal-fn.c
>>> @@ -3070,6 +3070,10 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt)
>>>}
>>>  else
>>>{
>>> +  /* Skip variables of opaque types that are in a register.  */
>>> +  if (OPAQUE_TYPE_P (var_type))
>>> + return;
>>> +
>>>  /* If this variable is in a register use expand_assignment.
>>>   For boolean scalars force zero-init.  */
>>>  tree init;
>>

Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]




> On Nov 30, 2021, at 9:14 AM, Peter Bergner  wrote:
> 
> On 11/30/21 2:37 AM, Richard Biener wrote:
>> On Mon, Nov 29, 2021 at 11:56 PM Qing Zhao  wrote:
>> I think that's inconsistent indeed.  Peter, what are "opaque"
>> registers?  rs6000-modes.def suggests
>> that there's __vector_pair and __vector_quad, what's the GIMPLE types
>> for those?  It seems they
>> are either SSA names or expanded to pseudo registers but there's no
>> constants for them.
> 
> The __vector_pair and __vector_quad types are target specific types
> for use with our Matrix-Math-Assist (MMA) unit and they are only
> usable with our associated MMA built-in functions.  What they hold
> is really dependent on which MMA built-ins you use on them.
> You can think of them as a generic (and large) vector type where the
> subtype is undefined...or defined by which built-in function you
> happen to be using.
> 
> We do not have any constants defined for them.  How we initialize them
> is either by loading values from memory into them or by zeroing them
> out using the xxsetaccz instruction (only for __vector_quads).

So it looks like variables with OPAQUE_TYPE cannot all be explicitly
initialized, even at the source code level.

The only way to initialize such a variable at the source code level (only
__vector_quad, not __vector_pair) is through a call to __builtin_mma_xxsetaccz,
as in:

void
foo (__vector_quad *dst)
{
  __vector_quad acc;
  __builtin_mma_xxsetaccz (&acc);
  *dst = acc;
}

Is this the correct understanding?

Is there a way to initialize such a variable to values other than zero at the
source code level?

Qing

> 
> 
> 
> 
>> Can they be initialized?  I see they can be copied at least.
> 
> __vector_quads can be zero initialized using the __builtin_mma_xxsetaccz()
> built-in function.  We don't have a method (or use case) for zero initializing
> __vector_pairs.


> 
> 
> 
>> If such "things" cannot be initialized they should indeed be exempt
>> from auto-init.  The
>> documentation suggests that they act as bit-buckets but even bit-buckets
>> should be initializable, thus why exactly does CONST0_RTX not exist for them?
> 
> We used to have CONST0_RTX defined (but nothing else), but we had problems
> with the compiler CSEing the initialization for multiple __vector_quads and
> then copying the values around.  We'd end up with one xxsetaccz instruction
> and copies out of that accumulator register into the other accumulator
> registers.  Copies are VERY expensive, while xxsetaccz's are cheap, so we
> don't want that.  That said, I think a fix I put in to disable fwprop on
> these types may have been the culprit for that problem, so maybe we could
> add the CONST0_RTX back?  I'd have to verify that.  If so, then we'd at least
> be able to support -ftrivial-auto-var-init=zero.  The =pattern version
> would be more problematical...unless the value for pattern was loaded from
> memory.
> 
> Peter
> 
>

[committed] vect: Fix ncopies calculation for emulated gather/scatter [PR103494]

2021-11-30 Thread Richard Sandiford via Gcc-patches

I was too eager about removing ncopies calculations in g:10833849b55.
When emulating gather/scatter, the offset ncopies can be different from
the data ncopies.  This patch restores the original calculation.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  Pushed as obvious,
since it's essentially reverting part of my earlier patch (except for
obvious adjustments to keep slp_node).
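
A small arithmetic sketch of why the two counts can differ.  It assumes
16-byte vectors and a vectorization factor of 16, both chosen only for
illustration, and `num_copies` is a hypothetical helper rather than the
vectorizer's own function; the offset vector type the vectorizer actually
picks may differ.

```c
#include <assert.h>

/* Number of vector statements needed to cover one vectorized iteration:
   the vectorization factor divided by the lanes per vector.  */
unsigned
num_copies (unsigned vf, unsigned vector_bytes, unsigned elem_bytes)
{
  unsigned lanes = vector_bytes / elem_bytes;
  assert (lanes != 0 && vf % lanes == 0);
  return vf / lanes;
}
```

With 4-byte data elements and 1-byte offsets, as in the new C test,
num_copies (16, 16, 4) and num_copies (16, 16, 1) give different results,
so reusing the data ncopies for the offsets asks for the wrong number of
offset vectors.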

Richard


gcc/
PR tree-optimization/103494
* tree-vect-stmts.c (vect_get_gather_scatter_ops): Remove ncopies
argument and calculate ncopies from gs_info->offset_vectype
where necessary.
(vectorizable_store, vectorizable_load): Update accordingly.

gcc/testsuite/
PR tree-optimization/103494
* gcc.dg/vect/pr103494.c: New test.
* g++.dg/vect/pr103494.cc: Likewise.
---
 gcc/testsuite/g++.dg/vect/pr103494.cc | 26 ++
 gcc/testsuite/gcc.dg/vect/pr103494.c  | 14 ++
 gcc/tree-vect-stmts.c | 21 -
 3 files changed, 52 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/vect/pr103494.cc
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr103494.c

diff --git a/gcc/testsuite/g++.dg/vect/pr103494.cc 
b/gcc/testsuite/g++.dg/vect/pr103494.cc
new file mode 100644
index 000..c0b078105c2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/vect/pr103494.cc
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+
+void glFinish();
+struct _Vector_base {
+  struct {
+unsigned _M_start;
+  } _M_impl;
+};
+class vector : _Vector_base {
+public:
+  vector(long) {}
+  unsigned *data() { return &_M_impl._M_start; }
+};
+void *PutBitsIndexedImpl_color_table;
+int PutBitsIndexedImpl_dstRectHeight;
+char *PutBitsIndexedImpl_src_ptr;
+void PutBitsIndexedImpl() {
+  vector unpacked_buf(PutBitsIndexedImpl_dstRectHeight);
+  unsigned *dst_ptr = unpacked_buf.data();
+  for (int x; x; x++) {
+char i = *PutBitsIndexedImpl_src_ptr++;
+dst_ptr[x] = static_cast(PutBitsIndexedImpl_color_table)[i];
+  }
+  glFinish();
+}
diff --git a/gcc/testsuite/gcc.dg/vect/pr103494.c 
b/gcc/testsuite/gcc.dg/vect/pr103494.c
new file mode 100644
index 000..b544bf2379c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr103494.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+
+typedef int T1;
+typedef signed char T2;
+
+T1
+f (T1 *d, T2 *x, int n)
+{
+  unsigned char res = 0;
+  for (int i = 0; i < n; ++i)
+res += d[x[i]];
+  return res;
+}
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 8642acbc0b4..9726450ab2d 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -2962,8 +2962,7 @@ vect_build_gather_load_calls (vec_info *vinfo, 
stmt_vec_info stmt_info,
 static void
 vect_get_gather_scatter_ops (loop_vec_info loop_vinfo,
 class loop *loop, stmt_vec_info stmt_info,
-slp_tree slp_node, unsigned int ncopies,
-gather_scatter_info *gs_info,
+slp_tree slp_node, gather_scatter_info *gs_info,
 tree *dataref_ptr, vec *vec_offset)
 {
   gimple_seq stmts = NULL;
@@ -2978,9 +2977,13 @@ vect_get_gather_scatter_ops (loop_vec_info loop_vinfo,
   if (slp_node)
 vect_get_slp_defs (SLP_TREE_CHILDREN (slp_node)[0], vec_offset);
   else
-vect_get_vec_defs_for_operand (loop_vinfo, stmt_info, ncopies,
-  gs_info->offset, vec_offset,
-  gs_info->offset_vectype);
+{
+  unsigned ncopies
+   = vect_get_num_copies (loop_vinfo, gs_info->offset_vectype);
+  vect_get_vec_defs_for_operand (loop_vinfo, stmt_info, ncopies,
+gs_info->offset, vec_offset,
+gs_info->offset_vectype);
+}
 }
 
 /* Prepare to implement a grouped or strided load or store using
@@ -8149,8 +8152,8 @@ vectorizable_store (vec_info *vinfo,
  else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
{
  vect_get_gather_scatter_ops (loop_vinfo, loop, stmt_info,
-  slp_node, ncopies, &gs_info,
-  &dataref_ptr, &vec_offsets);
+  slp_node, &gs_info, &dataref_ptr,
+  &vec_offsets);
  vec_offset = vec_offsets[0];
}
  else
@@ -9454,8 +9457,8 @@ vectorizable_load (vec_info *vinfo,
  else if (STMT_VINFO_GATHER_SCATTER_P (stmt_info))
{
  vect_get_gather_scatter_ops (loop_vinfo, loop, stmt_info,
-  slp_node, ncopies, &gs_info,
-  &dataref_ptr, &vec_offsets);
+  slp_node, &gs_info, &dataref_ptr,
+  &vec_offsets);
}
  else

Re: [PATCH, Fortran] Fix setting of array lower bound for named arrays

2021-11-30 Thread Tobias Burnus


On 29.11.21 22:11, Harald Anlauf wrote:
>> "A whole array is a named array or a structure component whose final
>> part-ref is an array component name; no subscript list is appended."
>>
>> I think in "h(3)" there is not really a named array – thus I read it as
>> if the "Otherwise ... result value is 1" applies.
>
> If you read on in the standard:
>
> "The appearance of a whole array variable in an executable construct
> specifies all the elements of the array ..."
>
> which might make you/makes me think that the sentence before that one
> could need an official interpretation...

I am not sure whether I understand what part of the spec you wonder
about. (I mean besides that 'variable' can also mean referencing a
data-pointer-returning function.)

>> Question: What do NAG/flang/... report for lbound(h(3)) - also [3] – or
>> [1] as gfortran?
>
> I've submitted a reduced example to the Intel Fortran Forum:
> https://community.intel.com/t5/Intel-Fortran-Compiler/Allocate-with-SOURCE-and-bounds/m-p/1339992#M158535
>
> There are good chances that Steve Lionel reads and comments on it.

So far only "FortranFan" has replied – and he comes to the same
conclusion as my reading, albeit without referring to the standard.

Tobias


[PING, PATCH] darwin, d: Support outfile substitution for liphobos

Ping.

Are the common gcc parts OK (also for backporting)?

Iain.

Excerpts from Iain Buclaw's message of November 26, 2021 1:51 pm:
> Excerpts from Iain Sandoe's message of November 19, 2021 10:21 am:
>> Hi Iain
>> 
>>> On 19 Nov 2021, at 08:32, Iain Buclaw  wrote:
>> 
>>> This patch fixes a stage2 bootstrap failure in the D front-end on
>>> darwin due to libgphobos being dynamically linked despite
>>> -static-libphobos being on the command line.
>>> 
>>> In the gdc driver, this takes the previous fix for the Darwin D
>>> bootstrap, and extends it to the -static-libphobos option as well.
>>> Rather than pushing the -static-libphobos option back onto the command
>>> line, the setting of SKIPOPT is instead conditionally removed.  The same
>>> change has been repeated for -static-libstdc++ so there is now no need
>>> to call generate_option to re-add it.
>>> 
>>> In the gcc driver, -static-libphobos has been added as a common option,
>>> validated, and a new outfile substitution added to config/darwin.h to
>>> correctly replace -lgphobos with libgphobos.a.
>>> 
>>> Bootstrapped and regression tested on x86_64-linux-gnu and
>>> x86_64-apple-darwin20.
>>> 
>>> OK for mainline?  This would also be fine for gcc-11 release branch too,
>>> as well as earlier releases with D support.
>> 
>> the Darwin parts are fine, thanks 
>> 
>> The SKIPOPT in d-spec, presumably means “skip removing this opt”?
>> otherwise the #ifndef looks odd (because of the static-libgcc|static-libphobos,
>> darwin.h would do the substitution for -static-libgcc as well, so it’s not a
>> 100% test).
>> 
> 
> I've only just realised what you meant.  Yes you are of course right,
> and it should have been #ifdef, attaching a fixed-up patch.
> 
> Iain.
> 
> ---
> gcc/ChangeLog:
> 
> * common.opt (static-libphobos): Add option.
> * config/darwin.h (LINK_SPEC): Substitute -lgphobos with libgphobos.a
> when linking statically.
> * gcc.c (driver_handle_option): Set -static-libphobos as always valid.
> 
> gcc/d/ChangeLog:
> 
> * d-spec.cc (lang_specific_driver): Set SKIPOPT on -static-libstdc++
> and -static-libphobos only when target supports LD_STATIC_DYNAMIC.
> Remove generate_option to re-add -static-libstdc++.
> 
> libphobos/ChangeLog:
> 
> * testsuite/testsuite_flags.in: Add libphobos library directory as
> search path to --gdcldflags.
> 
> diff --git a/gcc/common.opt b/gcc/common.opt
> index db6010e4e20..73c12d933f3 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -3527,6 +3527,10 @@ static-libgfortran
>  Driver
>  ; Documented for Fortran, but always accepted by driver.
>  
> +static-libphobos
> +Driver
> +; Documented for D, but always accepted by driver.
> +
>  static-libstdc++
>  Driver
>  
> diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
> index 7ed01efa694..c4ddd623e8b 100644
> --- a/gcc/config/darwin.h
> +++ b/gcc/config/darwin.h
> @@ -443,6 +443,7 @@ extern GTY(()) int darwin_ms_struct;
>   %:replace-outfile(-lobjc libobjc-gnu.a%s); \
>  :%:replace-outfile(-lobjc -lobjc-gnu )}}\
> %{static|static-libgcc|static-libgfortran:%:replace-outfile(-lgfortran libgfortran.a%s)}\
> +   %{static|static-libgcc|static-libphobos:%:replace-outfile(-lgphobos libgphobos.a%s)}\
> %{static|static-libgcc|static-libstdc++|static-libgfortran:%:replace-outfile(-lgomp libgomp.a%s)}\
> %{static|static-libgcc|static-libstdc++:%:replace-outfile(-lstdc++ libstdc++.a%s)}\
> %{force_cpusubtype_ALL:-arch %(darwin_arch)} \
> diff --git a/gcc/d/d-spec.cc b/gcc/d/d-spec.cc
> index b12d28f1047..1304126a675 100644
> --- a/gcc/d/d-spec.cc
> +++ b/gcc/d/d-spec.cc
> @@ -253,13 +253,23 @@ lang_specific_driver (cl_decoded_option 
> **in_decoded_options,
>  
>   case OPT_static_libstdc__:
> saw_static_libcxx = true;
> +#ifdef HAVE_LD_STATIC_DYNAMIC
> +   /* Remove -static-libstdc++ from the command only if target supports
> +  LD_STATIC_DYNAMIC.  When not supported, it is left in so that a
> +  back-end target can use outfile substitution.  */
> args[i] |= SKIPOPT;
> +#endif
> break;
>  
>   case OPT_static_libphobos:
> if (phobos_library != PHOBOS_NOLINK)
>   phobos_library = PHOBOS_STATIC;
> +#ifdef HAVE_LD_STATIC_DYNAMIC
> +   /* Remove -static-libphobos from the command only if target supports
> +  LD_STATIC_DYNAMIC.  When not supported, it is left in so that a
> +  back-end target can use outfile substitution.  */
> args[i] |= SKIPOPT;
> +#endif
> break;
>  
>   case OPT_shared_libphobos:
> @@ -460,7 +470,7 @@ lang_specific_driver (cl_decoded_option 
> **in_decoded_options,
>  #endif
>  }
>  
> -  if (saw_libcxx || need_stdcxx)
> +  if (saw_libcxx || saw_static_libcxx || need_stdcxx)
>  {
>  #ifdef HAVE_LD_STATIC_DYNAMIC

[PING, PATCH] doc, d: Add note that D front end now requires GDC installed in order to bootstrap.

Ping.

Excerpts from Iain Buclaw's message of November 18, 2021 2:06 am:
> Hi,
> 
> As asked for, this adds the documentation note in install.texi about the
> upcoming bootstrap requirements.
> 
> Obviously this will be applied alongside the patch posted previously:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2021-October/582917.html
> 
> Final batch of testing before proceeding has taken a bit longer than I
> expected.  Currently bootstrapping on sparcv9-sun-solaris2.11, and will
> push forward once I have confirmed that it works as well as the current
> C++ implementation of the D front end.
> 
> OK for mainline?  Any improvements on wording?
> 
> Thanks,
> Iain.
> 
> ---
> gcc/ChangeLog:
> 
>   * doc/install.texi (Prerequisites): Add note that D front end now
>   requires GDC installed in order to bootstrap.
>   (Building): Add D compiler section, referencing prerequisites.
> ---
>  gcc/doc/install.texi | 28 
>  1 file changed, 28 insertions(+)
> 
> diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
> index 094469b9a4e..6f999a2fd5a 100644
> --- a/gcc/doc/install.texi
> +++ b/gcc/doc/install.texi
> @@ -289,6 +289,25 @@ Ada runtime libraries. You can check that your build 
> environment is clean
>  by verifying that @samp{gnatls -v} lists only one explicit path in each
>  section.
>  
> +@item @anchor{GDC-prerequisite}GDC
> +
> +In order to build GDC, the D compiler, you need a working GDC
> +compiler (GCC version 9.1 or later), as the D front end is written in D.
> +
> +Versions of GDC prior to 12 can be built with an ISO C++11 compiler, which 
> can
> +then be installed and used to bootstrap newer versions of the D front end.
> +
> +It is strongly recommended to use an older version of GDC to build GDC. More
> +recent versions of GDC than the version built are not guaranteed to work and
> +will often fail during the build with compilation errors relating to
> +deprecations or removed features.
> +
> +Note that @command{configure} does not test whether the GDC installation 
> works
> +and has a sufficiently recent version.  Though the implementation of the D
> +front end does not make use of any GDC-specific extensions, or novel features
> +of the D language, if too old a GDC version is installed and
> +@option{--enable-languages=d} is used, the build will fail.
> +
>  @item A ``working'' POSIX compatible shell, or GNU bash
>  
>  Necessary when running @command{configure} because some
> @@ -2977,6 +2996,15 @@ and network filesystems.
>  @uref{prerequisites.html#GNAT-prerequisite,,GNAT prerequisites}.
>  @end ifhtml
>  
> +@section Building the D compiler
> +
> +@ifnothtml
> +@ref{GDC-prerequisite}.
> +@end ifnothtml
> +@ifhtml
> +@uref{prerequisites.html#GDC-prerequisite,,GDC prerequisites}.
> +@end ifhtml
> +
>  @section Building with profile feedback
>  
>  It is possible to use profile feedback to optimize the compiler itself.  This
> -- 
> 2.30.2
> 
>

Re: [PATCH] OpenMP: Ensure that offloaded variables are public

On Tue, Nov 30, 2021 at 05:24:49PM +0100, Jakub Jelinek via Gcc-patches wrote:
> Consider in one TU
> 
> static int a = 5;
> static int baz (void) { static int b;
> #pragma omp declare target to (b)
> return ++b; }
> int foo (void) { return ++a + baz (); }
> #pragma omp declare target to (a, foo)
> 
> and
> 
> static int a = 5;
> static int baz (void) { static int b;
> #pragma omp declare target to (b)
> return ++b; }
> int bar (void) { return ++a + baz (); }
> #pragma omp declare target to (a, bar)
> 
> int
> main ()
> {
>   int v;
>   #pragma omp target (from: v)
>   v = foo () + bar ();
> }
> 
> in another one.  This has
>   .quad   a
>   .quad   4
>   .quad   b.0
>   .quad   4
> in .offload_var_table.  I'd guess this must fail to link or load
> with GCN if it makes them forcibly TREE_PUBLIC.
> 
> Why does the GCN plugin or runtime need to know those vars?
> It needs to know the single array that contains their addresses of course...

Actually, you've done it in ACCEL_COMPILER only, so
I assume linking the above two sources with -fopenmp into a single
binary or shared library will still work because LTO when reading
the byte-code in will remangle the names of those variables to something
where they are unique in that single *.s (or *.ptx) it emits.
But, if you put one of those TUs into a shared library and the other
into another shared library, I don't see how it can work anymore,
because both those ELF objects which will be in data sections of those
libraries might have clashing names.

If GCN can't support static variables (but isn't it ELF?) and there is no
other way than to sacrifice offloading from multiple shared libraries or binary
in the same process, it at least shouldn't be done for targets which don't
need it (e.g. PTX) and shouldn't be done in the pass you've done it in
(because that means it will walk all the vars for each function it
processes, rather than just once).  So, better place would be e.g.
offload_handle_link_vars in lto/*.c or so.

Jakub

[PATCH] Fix alignment of stack slots for overaligned types [PR103500]

2021-11-30 Thread Alex Coplan via Gcc-patches

Hi,

This fixes PR103500 i.e. ensuring that stack slots for
passed-by-reference overaligned types are appropriately aligned. For the
testcase:

typedef struct __attribute__((aligned(32))) {
  long x,y;
} S;
S x;
void f(S);
void g(void) { f(x); }

on AArch64, we currently generate (at -O2):

g:
adrp x1, .LANCHOR0
add x1, x1, :lo12:.LANCHOR0
stp x29, x30, [sp, -48]!
mov x29, sp
ldp q0, q1, [x1]
add x0, sp, 16
stp q0, q1, [sp, 16]
bl  f
ldp x29, x30, [sp], 48
ret

so the stack slot for the passed-by-reference copy of the structure is
at sp + 16, and the sp is only guaranteed to be 16-byte aligned, so the
structure is only 16-byte aligned. The PCS requires the structure to be
32-byte aligned. After this patch, we generate:

g:
adrp x1, .LANCHOR0
add x1, x1, :lo12:.LANCHOR0
stp x29, x30, [sp, -64]!
mov x29, sp
add x0, sp, 47
ldp q0, q1, [x1]
and x0, x0, -32
stp q0, q1, [x0]
bl  f
ldp x29, x30, [sp], 64
ret

i.e. we ensure 32-byte alignment for the struct.

The approach taken here is similar to that in
function.c:assign_parm_setup_block where it handles the case for
DECL_ALIGN (parm) > MAX_SUPPORTED_STACK_ALIGNMENT. This in turn is
similar to the approach taken in cfgexpand.c:expand_stack_vars (where
the function calls get_dynamic_stack_size) which is the code that
handles the alignment for overaligned structures as addressable local
variables (see the related case discussed in the PR).
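
The "over-allocate, then align the address inside the buffer" idea can be sketched in plain C++. This is only an illustration of the arithmetic visible in the new assembly, not the code in function.c; align_up and padded_size are hypothetical helper names, and the sketch assumes both alignments are powers of two.

```cpp
#include <cstddef>
#include <cstdint>

// Round ADDR up to the next multiple of ALIGN (ALIGN must be a power of two).
// This mirrors the "add x0, sp, 47" / "and x0, x0, -32" pair above:
// 47 = 16 (slot offset) + 31 (ALIGN - 1), and -32 is the mask ~(ALIGN - 1).
inline std::uintptr_t align_up (std::uintptr_t addr, std::uintptr_t align)
{
  return (addr + align - 1) & ~(align - 1);
}

// Worst-case number of bytes to reserve so that an ALIGN-aligned object of
// SIZE bytes fits inside a buffer that is only BASE_ALIGN-aligned
// (assumes ALIGN >= BASE_ALIGN, both powers of two).
inline std::size_t padded_size (std::size_t size, std::size_t align,
                                std::size_t base_align)
{
  return size + (align - base_align);
}
```

For the testcase this gives padded_size (32, 32, 16) == 48, which together with the 16 bytes for x29/x30 matches the frame growing from 48 to 64 bytes.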

This patch also updates the aapcs64 test mentioned in the PR to avoid
the frontend folding away the alignment check. I've confirmed that the
execution test actually fails on aarch64-linux-gnu prior to the patch
being applied and passes afterwards.

Bootstrapped and regtested on aarch64-linux-gnu, x86_64-linux-gnu, and
arm-linux-gnueabihf: no regressions.

I'd appreciate any feedback. Is it OK for trunk?

Thanks,
Alex

gcc/ChangeLog:

PR middle-end/103500
* function.c (get_stack_local_alignment): Align BLKmode overaligned
types to the alignment required by the type.
(assign_stack_temp_for_type): Handle BLKmode overaligned stack
slots by allocating a larger-than-necessary buffer and aligning
the address within appropriately.

gcc/testsuite/ChangeLog:

PR middle-end/103500
* gcc.target/aarch64/aapcs64/rec_align-8.c (test_pass_by_ref):
Prevent the frontend from folding our alignment check away by
using snprintf to store the pointer into a string and recovering
it with sscanf.
diff --git a/gcc/function.c b/gcc/function.c
index 61b3bd036b8..5ed722ab959 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -278,7 +278,9 @@ get_stack_local_alignment (tree type, machine_mode mode)
   unsigned int alignment;
 
   if (mode == BLKmode)
-alignment = BIGGEST_ALIGNMENT;
+alignment = (type && TYPE_ALIGN (type) > MAX_SUPPORTED_STACK_ALIGNMENT)
+  ? TYPE_ALIGN (type)
+  : BIGGEST_ALIGNMENT;
   else
 alignment = GET_MODE_ALIGNMENT (mode);
 
@@ -872,21 +874,35 @@ assign_stack_temp_for_type (machine_mode mode, poly_int64 
size, tree type)
 
   p = ggc_alloc<temp_slot> ();
 
-  /* We are passing an explicit alignment request to assign_stack_local.
-One side effect of that is assign_stack_local will not round SIZE
-to ensure the frame offset remains suitably aligned.
-
-So for requests which depended on the rounding of SIZE, we go ahead
-and round it now.  We also make sure ALIGNMENT is at least
-BIGGEST_ALIGNMENT.  */
-  gcc_assert (mode != BLKmode || align == BIGGEST_ALIGNMENT);
-  p->slot = assign_stack_local_1 (mode,
- (mode == BLKmode
-  ? aligned_upper_bound (size,
- (int) align
- / BITS_PER_UNIT)
-  : size),
- align, 0);
+  if (mode == BLKmode && align > MAX_SUPPORTED_STACK_ALIGNMENT)
+   {
+ rtx allocsize = gen_int_mode (size, Pmode);
+ get_dynamic_stack_size (&allocsize, 0, align, NULL);
+ gcc_assert (CONST_INT_P (allocsize));
+ size = UINTVAL (allocsize);
+ p->slot = assign_stack_local_1 (mode,
+ size,
+ BIGGEST_ALIGNMENT, 0);
+ rtx addr = align_dynamic_address (XEXP (p->slot, 0), align);
+ mark_reg_pointer (addr, align);
+ p->slot = gen_rtx_MEM (GET_MODE (p->slot), addr);
+ MEM_NOTRAP_P (p->slot) = 1;
+   }
+  else
+   /* We are passing an explicit alignment request to assign_stack_local.
+  One side effect of that is

Re: [PATCH 1/7] ifcvt: Check if cmovs are needed.

2021-11-30 Thread Richard Sandiford via Gcc-patches

BTW, in response to your earlier concern about stage 3: you posted the
series well in time for end of stage 1, so I think it can still go in
during stage 3.

Robin Dapp  writes:
> Hi Richard,
>
>> It's hard to judge this in isolation because it's not clear when
>> and how the new arguments are going to be used, but it seems OK
>> in principle.  Do you still want:
>> 
>>   /* If earliest == jump, try to build the cmove insn directly.
>>  This is helpful when combine has created some complex condition
>>  (like for alpha's cmovlbs) that we can't hope to regenerate
>>  through the normal interface.  */
>> 
>>   if (if_info->cond_earliest == if_info->jump)
>> {
>> 
>> to be used when cc_cmp and rev_cc_cmp are nonnull?
>
> My initial hunch was to just leave it in place as I did not manage to
> trigger it.  As it is going to be called and costed both ways (with
> cc_cmp, rev_cc_cmp and without) it is probably better to move it into
> the else branch.
>
> The single usage of this is in patch 5/7.  We are passing the already
> existing condition from the jump and its reverse to see if the backend
> can come up with something better than when creating a new comparison.
>
>>> +static rtx emit_conditional_move (rtx, rtx, rtx, rtx, machine_mode);
>>> +rtx emit_conditional_move (rtx, rtx, rtx, rtx, rtx, machine_mode);
>> 
>> This is redundant with the header file declaration.
>> 
>
> Removed it.
>
>> I think it'd be better to call one of these functions something else,
>> rather than make the interpretation of the third parameter depend on
>> the total number of parameters.  In the second overload, the comparison
>> rtx effectively replaces four parameters of the existing
>> emit_conditional_move, so perhaps that's the one that should remain
>> emit_conditional_move.  Maybe the first one should be called
>> emit_conditional_move_with_rev or something.
>
> Not entirely fond of calling the first one _with_rev because essentially
> both try normal and reversed variants but I agree that the naming is not
> ideal.  I don't have any great ideas how to properly untangle it so I
> would go with your suggestions in order to move forward.  As there is
> only one caller of the second function, we could also let the caller
> handle the reversing.  Then, the third function would need to be
> non-static, though.
>
> The third, static emit_conditional_move I already renamed locally to
> emit_conditional_move_1.

Thanks, renaming the third function helps.

>> Part of me wonders if this would be simpler if we created a structure
>> to describe a comparison and passed that around instead of individual
>> fields, but I guess it could become a rat hole.
>
> I also thought about this as it would allow us to use either
> representation as required by the usage site.  Even tried it in a branch
> locally but indeed it became ugly quickly so I postponed it for now.

Still, perhaps we could at least add (in rtl.h):

struct rtx_comparison {
  rtx_code code;
  machine_mode op_mode;
  rtx op0, op1;
};

and make the existing emit_conditional_moves use it instead of four
separate parameters.  These rtx arguments would then be replacing those
rtx_comparison arguments, which would avoid the ambiguity in the overloads.

With C++ it should be possible to rewrite the calls using { … }, e.g.:

  if (!emit_conditional_move (into_target, { cmp_code, op1_mode, cmp1, cmp2 },
  into_target, into_superword, word_mode, false))

so the new type wouldn't need to spread too far.
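
A self-contained sketch of how that might look at a call site. The types below are hypothetical stand-ins so the example compiles on its own (GCC's real rtx, rtx_code and machine_mode are different), and select_by_comparison is a toy function, not emit_conditional_move; only the shape of the struct and the braced call are the point.

```cpp
// Hypothetical stand-ins for GCC's real types.
enum rtx_code { EQ, NE, LT, GT };
enum machine_mode { SImode, DImode };
typedef int rtx;   // GCC's rtx is a pointer type; an int is enough here.

// The aggregate suggested above.
struct rtx_comparison {
  rtx_code code;
  machine_mode op_mode;
  rtx op0, op1;
};

// Toy function with a similar parameter shape: it evaluates the comparison
// on plain integers and returns one of the two values.
inline rtx select_by_comparison (const rtx_comparison &cmp,
                                 rtx if_true, rtx if_false)
{
  bool holds = false;
  switch (cmp.code)
    {
    case EQ: holds = cmp.op0 == cmp.op1; break;
    case NE: holds = cmp.op0 != cmp.op1; break;
    case LT: holds = cmp.op0 < cmp.op1; break;
    case GT: holds = cmp.op0 > cmp.op1; break;
    }
  return holds ? if_true : if_false;
}
```

A caller can then write select_by_comparison ({ LT, SImode, 1, 2 }, 10, 20) without naming the struct type, which is what keeps the new type from spreading.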

Does that sound OK?  If so, could you post the current version of the full
patch series and say which bits still need review?

Thanks,
Richard

[PATCH] tree-optimization/98956 Optimizing out boolean left shift

2021-11-30 Thread Navid Rahimi via Gcc-patches

Hi GCC community,

This patch adds the missed pattern described in bug 98956 [1] to match.pd.
The codegen and correctness proof for this pattern are here [2,3] in case
anyone is curious. Tested on x86_64 Linux.

Tree-optimization/98956:

Adding new optimization to match.pd:
* match.pd ((B0 << x) cmp 0) -> B0 cmp 0 : New optimization.
* gcc.dg/tree-ssa/pr98956.c: testcase for this optimization.
* gcc.dg/tree-ssa/pr98956-2.c: testcase for node with 
side-effect.

1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98956
2) https://compiler-explorer.com/z/nj4PTrecW
3) https://alive2.llvm.org/ce/z/jyJAoS
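
As a source-level illustration of the pattern (a sketch, not the testcase from the patch): a boolean value is 0 or 1, so shifting it left by any in-range amount cannot change whether it is zero. The function names are hypothetical, and the helper only checks shift amounts 0..30 to stay clear of shifting into the sign bit of int.

```cpp
// What the source might look like before the simplification.
inline bool shifted_nonzero (bool b, int x)
{
  return (b << x) != 0;
}

// What the pattern reduces it to: the shift is dropped.
inline bool plain_nonzero (bool b, int /* x is no longer needed */)
{
  return b != 0;
}

// Exhaustively compare both forms for both boolean values and shift
// amounts 0..30.
inline bool forms_agree ()
{
  for (int x = 0; x < 31; ++x)
    for (int v = 0; v < 2; ++v)
      if (shifted_nonzero (v != 0, x) != plain_nonzero (v != 0, x))
        return false;
  return true;
}
```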

Best wishes,
Navid.

0001-Tree-optimization-98956.patch
Description: 0001-Tree-optimization-98956.patch

Re: [PATCH] OpenMP: Ensure that offloaded variables are public

On Tue, Nov 16, 2021 at 11:49:18AM +, Andrew Stubbs wrote:
> This patch is needed for AMD GCN offloading when we use the assembler from
> LLVM 13+.
> 
> The GCN runtime (libgomp+ROCm) requires that the location of all variables
> in the offloaded variables table are discoverable at runtime (using the
> "hsa_executable_symbol_get_info" API), and this only works when the symbols
> are exported from the binary. Previously we solved this by having mkoffload
> insert ".global" directives into the assembler text, but newer LLVM
> assemblers emit an error if we do this when the variable was previously
> declared ".local" (which happens when a variable is zero-initialized and
> placed in the BSS).
> 
> Since we can no longer easily fix them up after the fact, this patch fixes
> them up during OMP lowering.

I'm confused, how can that ever work reliably?
The !TREE_PUBLIC offload_vars can be static locals or static globals
or static anon namespace vars, but their names can very easily clash with
either static or non-static variables from other TUs.
Consider in one TU

static int a = 5;
static int baz (void) { static int b;
#pragma omp declare target to (b)
return ++b; }
int foo (void) { return ++a + baz (); }
#pragma omp declare target to (a, foo)

and

static int a = 5;
static int baz (void) { static int b;
#pragma omp declare target to (b)
return ++b; }
int bar (void) { return ++a + baz (); }
#pragma omp declare target to (a, bar)

int
main ()
{
  int v;
  #pragma omp target (from: v)
  v = foo () + bar ();
}

in another one.  This has
.quad   a
.quad   4
.quad   b.0
.quad   4
in .offload_var_table.  I'd guess this must fail to link or load
with GCN if it makes them forcibly TREE_PUBLIC.

Why does the GCN plugin or runtime need to know those vars?
It needs to know the single array that contains their addresses of course...

Jakub

Re: [PATCH 2/5]AArch64 sve: combine nested if predicates

2021-11-30 Thread Richard Sandiford via Gcc-patches

Tamar Christina  writes:
> Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-linux-gnu and no 
> issues.
>
> gcc/ChangeLog:
>
>   * tree-vect-stmts.c (prepare_load_store_mask): Rename to...
>   (prepare_vec_mask): ...This and record operations that have already been
>   masked.
>   (vectorizable_call): Use it.
>   (vectorizable_operation): Likewise.
>   (vectorizable_store): Likewise.
>   (vectorizable_load): Likewise.
>   * tree-vectorizer.c (vec_cond_masked_key::get_cond_ops_from_tree): New.
>   * tree-vectorizer.h (struct vec_cond_masked_key): New.
>   (class _loop_vec_info): Add vec_cond_masked_set.
>   (vec_cond_masked_set_type): New.
>   (struct default_hash_traits): New.
>
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/aarch64/sve/pred-combine-and.c: New test.
>
> --- inline copy of patch ---
>
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pred-combine-and.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/pred-combine-and.c
> new file mode 100644
> index 
> ..ee927346abe518caa3cba397b11dfd1ee7e93630
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pred-combine-and.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3" } */
> +
> +void f5(float * restrict z0, float * restrict z1, float *restrict x, float * 
> restrict y, float c, int n)
> +{
> +for (int i = 0; i < n; i++) {
> +float a = x[i];
> +float b = y[i];
> +if (a > b) {
> +z0[i] = a + b;
> +if (a > c) {
> +z1[i] = a - b;
> +}
> +}
> +}
> +}
> +
> +/* { dg-final { scan-assembler-times {\tfcmgt\tp[0-9]+\.s, p[0-9]+/z, 
> z[0-9]+\.s, z[0-9]+\.s} 2 } } */
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 
> 2284ad069e4d521f4e0bd43d34181a258cd636ef..b1946b589043312a9b29d832f9b8398e24787a5f
>  100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -1796,23 +1796,30 @@ check_load_store_for_partial_vectors (loop_vec_info 
> loop_vinfo, tree vectype,
>  /* Return the mask input to a masked load or store.  VEC_MASK is the 
> vectorized
> form of the scalar mask condition and LOOP_MASK, if nonnull, is the mask
> that needs to be applied to all loads and stores in a vectorized loop.
> -   Return VEC_MASK if LOOP_MASK is null, otherwise return VEC_MASK & 
> LOOP_MASK.
> +   Return VEC_MASK if LOOP_MASK is null or if VEC_MASK is already masked,
> +   otherwise return VEC_MASK & LOOP_MASK.
>  
> MASK_TYPE is the type of both masks.  If new statements are needed,
> insert them before GSI.  */
>  
>  static tree
> -prepare_load_store_mask (tree mask_type, tree loop_mask, tree vec_mask,
> -  gimple_stmt_iterator *gsi)
> +prepare_vec_mask (tree mask_type, loop_vec_info loop_vinfo, tree loop_mask,
> +   tree vec_mask, gimple_stmt_iterator *gsi)

Minor, but: loop_vinfo normally comes first when present.

>  {
>gcc_assert (useless_type_conversion_p (mask_type, TREE_TYPE (vec_mask)));
>if (!loop_mask)
>  return vec_mask;
>  
>gcc_assert (TREE_TYPE (loop_mask) == mask_type);
> +
> +  vec_cond_masked_key cond (vec_mask, loop_mask);
> +  if (loop_vinfo->vec_cond_masked_set.contains (cond))
> +return vec_mask;
> +
>tree and_res = make_temp_ssa_name (mask_type, NULL, "vec_mask_and");
>gimple *and_stmt = gimple_build_assign (and_res, BIT_AND_EXPR,
> vec_mask, loop_mask);
> +
>gsi_insert_before (gsi, and_stmt, GSI_SAME_STMT);
>return and_res;
>  }
> @@ -3526,8 +3533,9 @@ vectorizable_call (vec_info *vinfo,
> gcc_assert (ncopies == 1);
> tree mask = vect_get_loop_mask (gsi, masks, vec_num,
> vectype_out, i);
> -   vargs[mask_opno] = prepare_load_store_mask
> - (TREE_TYPE (mask), mask, vargs[mask_opno], gsi);
> +   vargs[mask_opno] = prepare_vec_mask
> + (TREE_TYPE (mask), loop_vinfo, mask,
> +  vargs[mask_opno], gsi);
>   }
>  
> gcall *call;
> @@ -3564,8 +3572,8 @@ vectorizable_call (vec_info *vinfo,
> tree mask = vect_get_loop_mask (gsi, masks, ncopies,
> vectype_out, j);
> vargs[mask_opno]
> - = prepare_load_store_mask (TREE_TYPE (mask), mask,
> -vargs[mask_opno], gsi);
> + = prepare_vec_mask (TREE_TYPE (mask), loop_vinfo, mask,
> + vargs[mask_opno], gsi);
>   }
>  
> gimple *new_stmt;
> @@ -6302,10 +6310,46 @@ vectorizable_operation (vec_info *vinfo,
>   }
>else
>   {
> +   tree mask = NULL_TREE;
> +   /* When combining two masks check is

Re: [PATCH]AArch64 Optimize right shift rounding narrowing

2021-11-30 Thread Richard Sandiford via Gcc-patches

Tamar Christina  writes:
> Hi All,
>
> This optimizes right shift rounding narrow instructions to
> rounding add narrow high where one vector is 0 when the shift amount is half
> that of the original input type.
>
> i.e.
>
> uint32x4_t foo (uint64x2_t a, uint64x2_t b)
> {
>   return vrshrn_high_n_u64 (vrshrn_n_u64 (a, 32), b, 32);
> }
>
> now generates:
>
> foo:
> movi    v3.4s, 0
> raddhn  v0.2s, v2.2d, v3.2d
> raddhn2 v0.4s, v2.2d, v3.2d
>
> instead of:
>
> foo:
> rshrn   v0.2s, v0.2d, 32
> rshrn2  v0.4s, v1.2d, 32
> ret
>
> On Arm cores this is an improvement in both latency and throughput.
> Because a vector zero is needed I created a new method
> aarch64_gen_shareable_zero that creates zeros using V4SI and then takes a 
> subreg
> of the zero to the desired type.  This allows CSE to share all the zero
> constants.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
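
For readers following along, the equivalence behind the transformation can be checked with a scalar model of one 64-to-32-bit lane. This is a sketch under the usual reading of the two instructions (add a rounding constant, shift, keep the low half); rshrn_lane and raddhn_lane are hypothetical names and do not use the NEON intrinsics.

```cpp
#include <cstdint>

// One lane of a rounding shift right narrow, 64 -> 32 bits, for shift
// amounts 1..32.  Wrap-around in the 64-bit addition only changes bits
// that the narrowing discards.
inline std::uint32_t rshrn_lane (std::uint64_t a, unsigned shift)
{
  std::uint64_t round = std::uint64_t{1} << (shift - 1);
  return static_cast<std::uint32_t> ((a + round) >> shift);
}

// One lane of a rounding add narrow high, 64 -> 32 bits: add the operands
// plus a rounding constant and keep the high half of the sum.
inline std::uint32_t raddhn_lane (std::uint64_t a, std::uint64_t b)
{
  std::uint64_t round = std::uint64_t{1} << 31;
  return static_cast<std::uint32_t> ((a + b + round) >> 32);
}
```

With a shift of 32 (half the input width) and a zero second operand the two bodies compute the same expression, which is why the shift can be replaced by raddhn against a zero vector.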

LGTM.  Just a couple of nits:

>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64-protos.h (aarch64_gen_shareable_zero): New.
>   * config/aarch64/aarch64-simd.md (aarch64_rshrn,
>   aarch64_rshrn2): 

Missing description.

>   * config/aarch64/aarch64.c (aarch64_gen_shareable_zero): New.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/aarch64/advsimd-intrinsics/shrn-1.c: New test.
>   * gcc.target/aarch64/advsimd-intrinsics/shrn-2.c: New test.
>   * gcc.target/aarch64/advsimd-intrinsics/shrn-3.c: New test.
>   * gcc.target/aarch64/advsimd-intrinsics/shrn-4.c: New test.
>
> --- inline copy of patch -- 
> diff --git a/gcc/config/aarch64/aarch64-protos.h 
> b/gcc/config/aarch64/aarch64-protos.h
> index 
> f7887d06139f01c1591c4e755538d94e5e608a52..f7f5cae82bc9198e54d0298f25f7c0f5902d5fb1
>  100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -846,6 +846,7 @@ const char *aarch64_output_move_struct (rtx *operands);
>  rtx aarch64_return_addr_rtx (void);
>  rtx aarch64_return_addr (int, rtx);
>  rtx aarch64_simd_gen_const_vector_dup (machine_mode, HOST_WIDE_INT);
> +rtx aarch64_gen_shareable_zero (machine_mode);
>  bool aarch64_simd_mem_operand_p (rtx);
>  bool aarch64_sve_ld1r_operand_p (rtx);
>  bool aarch64_sve_ld1rq_operand_p (rtx);
> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 
> c71658e2bf52b26bf9fc9fa702dd5446447f4d43..d7f8694add540e32628893a7b7471c08de6f760f
>  100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -1956,20 +1956,32 @@ (define_expand "aarch64_rshrn"
> (match_operand:SI 2 "aarch64_simd_shift_imm_offset_")]
>"TARGET_SIMD"
>{
> -operands[2] = aarch64_simd_gen_const_vector_dup (mode,
> -  INTVAL (operands[2]));
> -rtx tmp = gen_reg_rtx (mode);
> -if (BYTES_BIG_ENDIAN)
> -  emit_insn (gen_aarch64_rshrn_insn_be (tmp, operands[1],
> - operands[2], CONST0_RTX (mode)));
> +if (INTVAL (operands[2]) == GET_MODE_UNIT_BITSIZE (mode))
> +  {
> + rtx tmp0 = aarch64_gen_shareable_zero (mode);
> + emit_insn (gen_aarch64_raddhn (operands[0], operands[1], tmp0));
> +  }
>  else
> -  emit_insn (gen_aarch64_rshrn_insn_le (tmp, operands[1],
> - operands[2], CONST0_RTX (mode)));
> -
> -/* The intrinsic expects a narrow result, so emit a subreg that will get
> -   optimized away as appropriate.  */
> -emit_move_insn (operands[0], lowpart_subreg (mode, tmp,
> -  mode));
> +  {
> + rtx tmp = gen_reg_rtx (mode);
> + operands[2] = aarch64_simd_gen_const_vector_dup (mode,
> +  INTVAL (operands[2]));
> + if (BYTES_BIG_ENDIAN)
> +   emit_insn (
> + gen_aarch64_rshrn_insn_be (tmp, operands[1],
> +  operands[2],
> +  CONST0_RTX (mode)));
> + else
> +   emit_insn (
> + gen_aarch64_rshrn_insn_le (tmp, operands[1],
> +  operands[2],
> +  CONST0_RTX (mode)));
> +
> + /* The intrinsic expects a narrow result, so emit a subreg that will
> +get optimized away as appropriate.  */
> + emit_move_insn (operands[0], lowpart_subreg (mode, tmp,
> +  mode));
> +  }
>  DONE;
>}
>  )
> @@ -2049,14 +2061,27 @@ (define_expand "aarch64_rshrn2"
> (match_operand:SI 3 "aarch64_simd_shift_imm_offset_")]
>"TARGET_SIMD"
>{
> -operands[3] = aarch64_simd_gen_const_vector_dup (mode,
> -  INTVAL (operands[3]));
> -if (BYTES_BIG_ENDIAN)
> -  emit_insn (gen_aarch64_rshrn2_insn_be

Re: [PATCH] libcpp, v2: Fix up #__VA_OPT__ handling [PR103415]


On 11/30/21 09:13, Jakub Jelinek wrote:

On Mon, Nov 29, 2021 at 07:28:10PM -0500, Jason Merrill wrote:

Please add some of this explanation to the "paste any tokens" comment in the
code.


Ok.


+ while (rhs->flags & PASTE_LEFT);
+ if ((flags & PREV_WHITE)
+ && (token->flags & PREV_WHITE) == 0)
+   const_cast<cpp_token *>(token)->flags
+ |= PREV_WHITE;


Hmm, shouldn't paste_tokens handle copying PREV_WHITE?


Copying there PREV_FALLTHROUGH fixes the new Wimplicit-fallthrough-38.c
testcase, I couldn't find where doing the copying of PREV_WHITE would
make an observable difference outside of __VA_OPT__, e.g.
#define F(x) #x
#define G(x) F(x)
#define H G({a##b)
#define I G({ a##b)
const char *h = H;
const char *i = I;
results in "{ab" and "{ ab" before/after the patch.  But copying it
in paste_tokens looks cleaner...
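
A minimal model of that flag copying, assuming made-up flag values (the real definitions live in libcpp's cpplib.h) and a hypothetical token type that carries nothing but flags:

```cpp
// Hypothetical flag values, for illustration only.
enum : unsigned
{
  PREV_WHITE = 1u << 0,
  PASTE_LEFT = 1u << 3,
  PREV_FALLTHROUGH = 1u << 5
};

struct token { unsigned flags; };

// After a successful paste, carry over only the "what preceded the
// left-hand token" bits from the old left operand to the pasted token.
// Other bits, such as PASTE_LEFT, are not copied.
inline token paste_flags (const token &lhs_before, token pasted)
{
  pasted.flags |= lhs_before.flags & (PREV_WHITE | PREV_FALLTHROUGH);
  return pasted;
}
```

The mask matters: the old left operand has PASTE_LEFT set by definition, and copying that bit would make the result look like it still needs pasting.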


OK, thanks.


2021-11-30  Jakub Jelinek  

PR preprocessor/103415
libcpp/
* macro.c (stringify_arg): Remove va_opt argument and va_opt handling.
(paste_tokens): On successful paste or in PREV_WHITE and
PREV_FALLTHROUGH flags from the *plhs token to the new token.
(replace_args): Adjust stringify_arg callers.  For #__VA_OPT__,
perform token pasting in a separate loop before stringify_arg call.
gcc/testsuite/
* c-c++-common/cpp/va-opt-8.c: New test.
* c-c++-common/Wimplicit-fallthrough-38.c: New test.

--- libcpp/macro.c.jj   2021-11-26 10:09:50.278020239 +0100
+++ libcpp/macro.c  2021-11-30 14:05:25.274132482 +0100
@@ -295,7 +295,7 @@ static cpp_context *next_context (cpp_re
  static const cpp_token *padding_token (cpp_reader *, const cpp_token *);
  static const cpp_token *new_string_token (cpp_reader *, uchar *, unsigned 
int);
  static const cpp_token *stringify_arg (cpp_reader *, const cpp_token **,
-  unsigned int, bool);
+  unsigned int);
  static void paste_all_tokens (cpp_reader *, const cpp_token *);
  static bool paste_tokens (cpp_reader *, location_t,
  const cpp_token **, const cpp_token *);
@@ -834,8 +834,7 @@ cpp_quote_string (uchar *dest, const uch
  /* Convert a token sequence FIRST to FIRST+COUNT-1 to a single string token
 according to the rules of the ISO C #-operator.  */
  static const cpp_token *
-stringify_arg (cpp_reader *pfile, const cpp_token **first, unsigned int count,
-  bool va_opt)
+stringify_arg (cpp_reader *pfile, const cpp_token **first, unsigned int count)
  {
unsigned char *dest;
unsigned int i, escape_it, backslash_count = 0;
@@ -852,24 +851,6 @@ stringify_arg (cpp_reader *pfile, const
  {
const cpp_token *token = first[i];
  
-  if (va_opt && (token->flags & PASTE_LEFT))

-   {
- location_t virt_loc = pfile->invocation_location;
- const cpp_token *rhs;
- do
-   {
- if (i == count)
-   abort ();
- rhs = first[++i];
- if (!paste_tokens (pfile, virt_loc, &token, rhs))
-   {
- --i;
- break;
-   }
-   }
- while (rhs->flags & PASTE_LEFT);
-   }
-
if (token->type == CPP_PADDING)
{
  if (source == NULL
@@ -1003,6 +984,7 @@ paste_tokens (cpp_reader *pfile, locatio
return false;
  }
  
+  lhs->flags |= (*plhs)->flags & (PREV_WHITE | PREV_FALLTHROUGH);

*plhs = lhs;
_cpp_pop_buffer (pfile);
return true;
@@ -1945,8 +1927,7 @@ replace_args (cpp_reader *pfile, cpp_has
if (src->flags & STRINGIFY_ARG)
  {
if (!arg->stringified)
- arg->stringified = stringify_arg (pfile, arg->first, arg->count,
-   false);
+ arg->stringified = stringify_arg (pfile, arg->first, arg->count);
  }
else if ((src->flags & PASTE_LEFT)
 || (src != macro->exp.tokens && (src[-1].flags & PASTE_LEFT)))
@@ -2066,11 +2047,46 @@ replace_args (cpp_reader *pfile, cpp_has
{
  unsigned int count
= start ? paste_flag - start : tokens_buff_count (buff);
- const cpp_token *t
-   = stringify_arg (pfile,
-start ? start + 1
-: (const cpp_token **) (buff->base),
-count, true);
+ const cpp_token **first
+   = start ? start + 1
+   : (const cpp_token **) (buff->base);
+ unsigned int i, j;
+
+ /* Paste any tokens that need to be pasted before calling
+stringify_arg, because stringify_arg uses pfile->u_buff
+which paste_tokens can use as well.  */
+

[PATCH] Allow loop crossing paths in back threader copier.

2021-11-30 Thread Aldy Hernandez via Gcc-patches

We are currently restricting loop crossing paths in the generic copier
used by the back threader, but we should be able to handle them after
loop_done has completed.

This fixes the PR at -O2, though the problem remains at -O1 because we
have no threaders smart enough to elide the undefined read.  DOM3 could
be a candidate when it is converted to either a hybrid threader or
replaced with the backward threader (when ranger can handle floats).

Tested on x86-64 Linux.

OK for trunk?

PR tree-optimization/80548

gcc/ChangeLog:

* attribs.c (sorted_attr_string): Add assert for -Wstringop-overread.
* tree-ssa-threadupdate.c
(back_jt_path_registry::duplicate_thread_path): Allow paths that
cross loops after loop_done.
(back_jt_path_registry::update_cfg): Diagnose dropped threads
after duplicate_thread_path.

gcc/testsuite/ChangeLog:

* gcc.dg/pr80548.c: New test.
---
 gcc/attribs.c  |  1 +
 gcc/testsuite/gcc.dg/pr80548.c | 23 +++
 gcc/tree-ssa-threadupdate.c| 19 +++
 3 files changed, 35 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr80548.c

diff --git a/gcc/attribs.c b/gcc/attribs.c
index c252f5af07b..9a079b8405a 100644
--- a/gcc/attribs.c
+++ b/gcc/attribs.c
@@ -1035,6 +1035,7 @@ sorted_attr_string (tree arglist)
   attr_str[str_len_sum + len] = TREE_CHAIN (arg) ? ',' : '\0';
   str_len_sum += len + 1;
 }
+  gcc_assert (arglist);
 
   /* Replace "=,-" with "_".  */
   for (i = 0; i < strlen (attr_str); i++)
diff --git a/gcc/testsuite/gcc.dg/pr80548.c b/gcc/testsuite/gcc.dg/pr80548.c
new file mode 100644
index 000..232743e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr80548.c
@@ -0,0 +1,23 @@
+// { dg-do compile }
+// { dg-options "-O2 -Wuninitialized" }
+
+int g (void);
+void h (int, int);
+
+void f (int b)
+{
+  int x, y;
+
+  if (b)
+{
+  x = g ();
+  y = g ();
+}
+
+  while (g ())
+if (b)
+  {
+h (x, y); // { dg-bogus "uninit" }
+y = g ();
+  }
+}
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index 8aac733ac25..b194c11e23d 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -2410,13 +2410,14 @@ back_jt_path_registry::duplicate_thread_path (edge entry,
  missuses of the functions.  I.e. if you ask to copy something weird,
  it will work, but the state of structures probably will not be
  correct.  */
-  for (i = 0; i < n_region; i++)
-{
-  /* We do not handle subloops, i.e. all the blocks must belong to the
-same loop.  */
-  if (region[i]->loop_father != loop)
-   return false;
-}
+  if (!(cfun->curr_properties & PROP_loop_opts_done))
+for (i = 0; i < n_region; i++)
+  {
+   /* We do not handle subloops, i.e. all the blocks must belong to the
+  same loop.  */
+   if (region[i]->loop_father != loop)
+ return false;
+  }
 
   initialize_original_copy_tables ();
 
@@ -2651,9 +2652,11 @@ back_jt_path_registry::update_cfg (bool /*peel_loop_headers*/)
  visited_starting_edges.add (entry);
  retval = true;
  m_num_threaded_edges++;
+ path->release ();
}
+  else
+   cancel_thread (path, "Failure in duplicate_thread_path");
 
-  path->release ();
   m_paths.unordered_remove (0);
   free (region);
 }
-- 
2.31.1
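
For readers who want to see why threading removes the bogus warning, here
is a hand-written sketch of the effect on the new test.  It is not GCC
output and not part of the patch.  The bodies of g() and h(), and the names
f_original, f_threaded and run, are made up for illustration; only the
shape of f_original comes from pr80548.c.  Once the path from the first
"if (b)" is threaded through the loop, x and y are only live on the arm
that initialized them:

```c
#include <assert.h>

/* Hypothetical stand-ins for the external g() and h() of pr80548.c.  */
static int g_budget;   /* g() counts down from this value, then returns 0 */
static int h_log;      /* every h() call is folded into this number */

static int g (void) { return g_budget > 0 ? g_budget-- : 0; }
static void h (int a, int b) { h_log = h_log * 31 + a * 7 + b; }

/* Same shape as f() in the test: x is read only when b is nonzero,
   but the read sits behind a second, separate test of b.  */
void
f_original (int b)
{
  int x, y;

  if (b)
    {
      x = g ();
      y = g ();
    }

  while (g ())
    if (b)
      {
        h (x, y);
        y = g ();
      }
}

/* What the function looks like once the test of b inside the loop has
   been threaded from the first one: one loop copy per value of b.  */
void
f_threaded (int b)
{
  if (b)
    {
      int x = g ();
      int y = g ();
      while (g ())
        {
          h (x, y);
          y = g ();
        }
    }
  else
    while (g ())
      ;
}

/* Run FN with a fresh budget for g() and return the log of h() calls.  */
int
run (void (*fn) (int), int b, int budget)
{
  assert (fn != 0);
  g_budget = budget;
  h_log = 0;
  fn (b);
  return h_log;
}
```

Calling run (f_original, b, 6) and run (f_threaded, b, 6) for b of 0 and 1
gives the same log for both versions, so the two functions behave
identically under these stubs; the threaded form simply has no path on
which x could be read before it is set.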

Re: [PATCH] libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]

2021-11-30 Thread Marek Polacek via Gcc-patches

On Tue, Nov 30, 2021 at 04:00:01PM +0100, Stephan Bergmann wrote:
> On 30/11/2021 14:26, Marek Polacek wrote:
> > On Tue, Nov 30, 2021 at 09:38:57AM +0100, Stephan Bergmann wrote:
> > > On 15/11/2021 18:28, Marek Polacek via Gcc-patches wrote:
> > > > On Mon, Nov 08, 2021 at 04:33:43PM -0500, Marek Polacek wrote:
> > > > > Ping, can we conclude on the name?   IMHO, -Wbidirectional is just fine,
> > > > > but changing the name is a trivial operation.
> > > > 
> > > > Here's a patch with a better name (suggested by Jonathan W.).  Otherwise no
> > > > changes.
> > > > 
> > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > > > 
> > > > -- >8 --
> > > >   From a link below:
> > > > "An issue was discovered in the Bidirectional Algorithm in the Unicode
> > > > Specification through 14.0. It permits the visual reordering of
> > > > characters via control sequences, which can be used to craft source code
> > > > that renders different logic than the logical ordering of tokens
> > > > ingested by compilers and interpreters. Adversaries can leverage this to
> > > > encode source code for compilers accepting Unicode such that targeted
> > > > vulnerabilities are introduced invisibly to human reviewers."
> > > > 
> > > > More info:
> > > > https://nvd.nist.gov/vuln/detail/CVE-2021-42574
> > > > https://trojansource.codes/
> > > > 
> > > > This is not a compiler bug.  However, to mitigate the problem, this patch
> > > > implements -Wbidi-chars=[none|unpaired|any] to warn about possibly
> > > > misleading Unicode bidirectional characters the preprocessor may encounter.
> > > > 
> > > > The default is =unpaired, which warns about improperly terminated
> > > > bidirectional characters; e.g. a LRE without its appertaining PDF.  The
> > > > level =any warns about any use of bidirectional characters.
> > > > 
> > > > This patch handles both UCNs and UTF-8 characters.  UCNs designating
> > > > bidi characters in identifiers are accepted since r204886.  Then r217144
> > > > enabled -fextended-identifiers by default.  Extended characters in C/C++
> > > > identifiers have been accepted since r275979.  However, this patch still
> > > > warns about mixing UTF-8 and UCN bidi characters; there seems to be no
> > > > good reason to allow mixing them.
> > > 
> > > I wonder what the rationale is to warn about UCNs, like in
> > > 
> > > >  aText = u"\u202D" + aText;
> > > 
> > > (as found in the LibreOffice source code).
> > 
> > Is this line mixing a UCN and a UTF-8?  Or is it just that you're
> > prepending a LRO to aText?  We warn because the LRO is not "closed"
> > in the context of its string literal, which was part of the Trojan
> > source attack.  So "\u202D ... \u202C" would not warn.
> > 
> > I'm not sure what workaround I could offer.  Maybe provide an option not to
> > warn about UCNs at all, though even that is potentially dangerous -- while
> > you can see UCNs in the source code, if you print strings containing them,
> > they won't be visible anymore.
> 
> I'm not sure what you mean with "mixing a UCN and a UTF-8", but what the
> code apparently does is programmatically constructing a larger piece of text
> by prepending LRO to an existing piece of text.
> 
> My understanding is that Trojan Source is concerned with presentation of
> program source code and not with properties of Unicode text constructed
> during the execution of such a program, and from the documentation quoted
> above I understand that -Wbidi-chars is meant to address Trojan Source, so I
> don't understand why you're concerned here with what happens "if you print
> strings containing [UCNs in the source code]".
> 
> Short of a source code viewer that interprets UCNs in C/C++ source code and
> renders them in the same way as their corresponding Unicode characters, I
> don't think that UCNs are relevant for Trojan Source, and don't understand
> why -Wbidi-chars would warn about them.

I guess we were concerned with programs that generate other programs.
Maybe UCNs should be ignored by default.  There's still time to adjust
the behavior.
 
> (Also, I noticed that it doesn't work to silence -Werror=bidi-chars= with a
> 
> > #pragma GCC diagnostic ignored "-Wbidi-chars"

Yeah, it doesn't work with C++, it's https://gcc.gnu.org/PR53431 :(

Marek
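
To make the "closed in the context of its string literal" rule concrete,
here is a minimal sketch of an unpaired check over the code points of one
literal.  It is an illustration only and not the libcpp implementation:
the function name bidi_unpaired_p is made up, and the bookkeeping is
simplified to two counters (a real implementation also has to deal with a
PDI closing embeddings opened inside the isolate).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if the N code points in CPS leave a bidi embedding,
   override, or isolate open at the end of the sequence.  */
bool
bidi_unpaired_p (const unsigned int *cps, size_t n)
{
  assert (cps != NULL || n == 0);

  size_t embeddings = 0;  /* open LRE/RLE/LRO/RLO, closed by PDF */
  size_t isolates = 0;    /* open LRI/RLI/FSI, closed by PDI */

  for (size_t i = 0; i < n; i++)
    switch (cps[i])
      {
      case 0x202A: /* LRE */
      case 0x202B: /* RLE */
      case 0x202D: /* LRO */
      case 0x202E: /* RLO */
        embeddings++;
        break;
      case 0x202C: /* PDF */
        if (embeddings > 0)
          embeddings--;
        break;
      case 0x2066: /* LRI */
      case 0x2067: /* RLI */
      case 0x2068: /* FSI */
        isolates++;
        break;
      case 0x2069: /* PDI */
        if (isolates > 0)
          isolates--;
        break;
      default:
        break;
      }

  return embeddings > 0 || isolates > 0;
}
```

Under this sketch, the code points of u"\u202D" alone are reported as
unpaired, while "\u202D ... \u202C" is not, which matches the behavior
described above for the LibreOffice line.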

[committed 19/19] libphobos: Update libphobos testsuite to pass on latest version

This adds new DejaGnu testing scripts, or updates the existing ones, for
the suite of libphobos tests.

Bootstrapped, regression tested, and committed to mainline.

Regards,
Iain.

---
libphobos/ChangeLog:

* testsuite/lib/libphobos.exp (libphobos-dg-test): Handle assembly
compile types.
(dg-test): Override.
(additional_prunes): Define.
(libphobos-dg-prune): Filter any additional_prunes set by tests.
* testsuite/libphobos.druntime/druntime.exp (version_flags): Add
-fversion=CoreUnittest.
* testsuite/libphobos.druntime_shared/druntime_shared.exp
(version_flags): Add -fversion=CoreUnittest -fversion=Shared.
* testsuite/libphobos.phobos/phobos.exp (version_flags): Add
-fversion=StdUnittest.
* testsuite/libphobos.phobos_shared/phobos_shared.exp (version_flags):
Likewise.
* testsuite/testsuite_flags.in: Add -fpreview=dip1000 to --gdcflags.
* testsuite/libphobos.betterc/betterc.exp: New test.
* testsuite/libphobos.config/config.exp: New test.
* testsuite/libphobos.gc/gc.exp: New test.
* testsuite/libphobos.imports/imports.exp: New test.
* testsuite/libphobos.lifetime/lifetime.exp: New test.
* testsuite/libphobos.unittest/unittest.exp: New test.
---
 libphobos/testsuite/lib/libphobos.exp | 60 +++
 .../testsuite/libphobos.betterc/betterc.exp   | 27 +
 .../testsuite/libphobos.config/config.exp | 46 ++
 .../testsuite/libphobos.druntime/druntime.exp |  2 +-
 .../druntime_shared.exp   |  2 +-
 libphobos/testsuite/libphobos.gc/gc.exp   | 27 +
 .../testsuite/libphobos.imports/imports.exp   | 29 +
 .../testsuite/libphobos.lifetime/lifetime.exp | 27 +
 .../testsuite/libphobos.phobos/phobos.exp |  2 +-
 .../libphobos.phobos_shared/phobos_shared.exp |  2 +-
 .../testsuite/libphobos.unittest/unittest.exp | 53 
 libphobos/testsuite/testsuite_flags.in|  2 +-
 12 files changed, 274 insertions(+), 5 deletions(-)
 create mode 100644 libphobos/testsuite/libphobos.betterc/betterc.exp
 create mode 100644 libphobos/testsuite/libphobos.config/config.exp
 create mode 100644 libphobos/testsuite/libphobos.gc/gc.exp
 create mode 100644 libphobos/testsuite/libphobos.imports/imports.exp
 create mode 100644 libphobos/testsuite/libphobos.lifetime/lifetime.exp
 create mode 100644 libphobos/testsuite/libphobos.unittest/unittest.exp

diff --git a/libphobos/testsuite/lib/libphobos.exp b/libphobos/testsuite/lib/libphobos.exp
index 2af430a0e45..66e3e80105f 100644
--- a/libphobos/testsuite/lib/libphobos.exp
+++ b/libphobos/testsuite/lib/libphobos.exp
@@ -54,6 +54,10 @@ proc libphobos-dg-test { prog do_what extra_tool_flags } {
 
 # Set up the compiler flags, based on what we're going to do.
 switch $do_what {
+   "compile" {
+   set compile_type "assembly"
+   set output_file "[file rootname [file tail $prog]].s"
+   }
"run" {
set compile_type "executable"
# FIXME: "./" is to cope with "." not being in $PATH.
@@ -89,8 +93,52 @@ proc libphobos-dg-test { prog do_what extra_tool_flags } {
 return [list $comp_output $output_file]
 }
 
+# Override the DejaGnu dg-test in order to clear flags after a test, as
+# is done for compiler tests in gcc-dg.exp.
+
+if { [info procs saved-dg-test] == [list] } {
+rename dg-test saved-dg-test
+
+proc dg-test { args } {
+   global additional_prunes
+   global errorInfo
+   global testname_with_flags
+   global shouldfail
+
+   if { [ catch { eval saved-dg-test $args } errmsg ] } {
+   set saved_info $errorInfo
+   set additional_prunes ""
+   set shouldfail 0
+   if [info exists testname_with_flags] {
+   unset testname_with_flags
+   }
+   unset_timeout_vars
+   error $errmsg $saved_info
+   }
+   set additional_prunes ""
+   set shouldfail 0
+   unset_timeout_vars
+   if [info exists testname_with_flags] {
+   unset testname_with_flags
+   }
+}
+}
+
+# Prune messages from gdc that aren't useful.
+
+set additional_prunes ""
+
 proc libphobos-dg-prune { system text } {
 
+global additional_prunes
+
+foreach p $additional_prunes {
+   if { [string length $p] > 0 } {
+   # Following regexp matches a complete line containing $p.
+   regsub -all "(^|\n)\[^\n\]*$p\[^\n\]*" $text "" text
+   }
+}
+
 # Ignore harmless warnings from Xcode.
 regsub -all "(^|\n)\[^\n\]*ld: warning: could not create compact unwind for\[^\n\]*" $text "" text
 
@@ -281,6 +329,18 @@ proc libphobos_skipped_test_p { test } {
 return "skipped test"
 }
 
+# Prune any messages matching ARGS[1] (a regexp) from test output.
+proc dg-prune-output { args } {
+global additional_prunes
+
+if { [llength $args] != 2 } {
+   error "[lindex

[ping^6] Make sure that we get unique test names if several DejaGnu directives refer to the same line [PR102735]

Hi!

I know I'm late this week ;-\ -- but here is another ping.


Grüße
 Thomas


On 2021-11-22T11:27:49+0100, Thomas Schwinge  wrote:
> Hi!
>
> Ping.
>
>
> Grüße
>  Thomas
>
>
> On 2021-11-15T15:50:58+0100, I wrote:
>> Hi!
>>
>> ..., and here is another ping.
>>
>>
>> Grüße
>>  Thomas
>>
>>
>> On 2021-11-08T11:45:12+0100, I wrote:
>>> Hi!
>>>
>>> Ping, once more.
>>>
>>>
>>> Grüße
>>>  Thomas
>>>
>>>
>>> On 2021-10-14T12:12:41+0200, I wrote:
 Hi!

 Ping, again.

 Commit log updated for 
 "privatization-1-compute.c results in both XFAIL and PASS".


 Grüße
  Thomas


 On 2021-09-30T08:42:25+0200, I wrote:
> Hi!
>
> Ping.
>
> On 2021-09-22T13:03:46+0200, I wrote:
>> On 2021-09-19T11:35:00-0600, Jeff Law via Gcc-patches wrote:
>>> A couple of goacc tests do not have unique names.
>>
>> Thanks for fixing this up, and sorry, largely my "fault", I suppose.  ;-|
>>
>>> This causes problems
>>> for the test comparison script when one of the tests passes and the other
>>> fails -- in this scenario the test comparison script claims there is a
>>> regression.
>>
>> So I understand correctly that this is a problem not just for actual
>> mixed PASS vs. FAIL (which we'd like you to report anyway!) that appear
>> for the same line, but also for mixed PASS vs. XFAIL?  (Because, the
>> latter appears to be what you're addressing with your commit here.)
>>
>>> This slipped through for a while because I had turned off x86_64 testing
>>> (others test it regularly and I was revamping the tester's hardware
>>> requirements).  Now that I've acquired more x86_64 resources and turned
>>> on native x86 testing again, it's been flagged.
>>
>> (I don't follow that argument -- these test cases should be all generic?
>> Anyway, not important, I guess.)
>>
>>> This patch just adds a numeric suffix to the TODO string to disambiguate
>>> them.
>>
>> So, instead of doing this manually (always error-prone!), like you've...
>>
>>> Committed to the trunk,
>>
>>> commit f75b237254f32d5be32c9d9610983b777abea633
>>> Author: Jeff Law 
>>> Date:   Sun Sep 19 13:31:32 2021 -0400
>>>
>>> [committed] Make test names unique for a couple of goacc tests
>>
>>> --- a/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute.f90
>>> +++ b/gcc/testsuite/gfortran.dg/goacc/privatization-1-compute.f90
>>> @@ -39,9 +39,9 @@ contains
>>>!$acc atomic write ! ... to force 'TREE_ADDRESSABLE'.
>>>y = a
>>>  !$acc end parallel
>>> -! { dg-note {variable 'i' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* } l_compute$c_compute }
>>> -! { dg-note {variable 'j' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* } l_compute$c_compute }
>>> -! { dg-note {variable 'a' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO" { xfail *-*-* } l_compute$c_compute }
>>> +! { dg-note {variable 'i' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO2" { xfail *-*-* } l_compute$c_compute }
>>> +! { dg-note {variable 'j' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO3" { xfail *-*-* } l_compute$c_compute }
>>> +! { dg-note {variable 'a' in 'private' clause potentially has improper OpenACC privatization level: 'parm_decl'} "TODO4" { xfail *-*-* } l_compute$c_compute }
>>
>> ... etc. (also similarly in a handful of earlier commits, if I remember
>> correctly), why don't we do that programmatically, like in the attached
>> "Make sure that we get unique test names if several DejaGnu directives
>> refer to the same line", once and for all?  OK to push after proper
>> testing?
>
> Attached again, for easy reference.
>
> I figure it may help if I showed an example of how this changes things;
> for the test case cited above (word-diff):
>
> PASS: gfortran.dg/goacc/privatization-1-compute.f90   -O   {+at line 40+} (test for warnings, line 39)
> PASS: gfortran.dg/goacc/privatization-1-compute.f90   -O   {+at line 41+} (test for warnings, line 22)
> PASS: gfortran.dg/goacc/privatization-1-compute.f90   -O   {+at line 42+} (test for warnings, line 39)
> PASS: gfortran.dg/goacc/privatization-1-compute.f90   -O   {+at line 43+} (test for warnings, line 22)
> PASS: gfortran.dg/goacc/privatization-1-compute.f90   -O   {+at line 44+} (test for warnings, line 39)
> PASS:
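
The collision and its fix can be sketched in a few lines of C.  This is
only an illustration of the naming idea visible in the word-diff above; it
is not the DejaGnu change itself (which is Tcl), and the helper name
format_test_name and its exact output format are assumptions.  Two
directives that sit on different lines but target the same line produce
the same name unless the directive's own line is made part of it:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the name reported for a directive written
   on DIRECTIVE_LINE that checks a diagnostic on TARGET_LINE.  Passing
   DIRECTIVE_LINE <= 0 gives the old, ambiguous form of the name.
   Returns what snprintf returns.  */
int
format_test_name (char *buf, size_t size, const char *test,
                  int directive_line, int target_line)
{
  assert (buf != NULL && test != NULL);

  if (directive_line > 0)
    return snprintf (buf, size, "%s at line %d (test for warnings, line %d)",
                     test, directive_line, target_line);

  return snprintf (buf, size, "%s (test for warnings, line %d)",
                   test, target_line);
}
```

With the directive line left out, the directives at lines 40 and 42 (both
targeting line 39) yield identical strings; with it included they differ,
so a comparison script no longer confuses a PASS for one with an XFAIL for
the other.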

Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

On 11/30/21 2:37 AM, Richard Biener wrote:
> On Mon, Nov 29, 2021 at 11:56 PM Qing Zhao  wrote:
> I think that's inconsistent indeed.  Peter, what are "opaque"
> registers?  rs6000-modes.def suggests
> that there's __vector_pair and __vector_quad, what's the GIMPLE types
> for those?  It seems they
> are either SSA names or expanded to pseudo registers but there's no
> constants for them.

The __vector_pair and __vector_quad types are target specific types
for use with our Matrix-Math-Assist (MMA) unit and they are only
usable with our associated MMA built-in functions.  What they hold
is really dependent on which MMA built-ins you use on them.
You can think of them as a generic (and large) vector type where the
subtype is undefined...or defined by which built-in function you
happen to be using.

We do not have any constants defined for them.  How we initialize them
is either by loading values from memory into them or by zeroing them
out using the xxsetaccz instruction (only for __vector_quads).

> Can they be initialized?  I see they can be copied at least.

__vector_quads can be zero initialized using the __builtin_mma_xxsetaccz()
built-in function.  We don't have a method (or use case) for zero initializing
__vector_pairs.

> If such "things" cannot be initialized they should indeed be exempt
> from auto-init.  The
> documentation suggests that they act as bit-buckets but even bit-buckets should
> be initializable, thus why exactly does CONST0_RTX not exist for them?

We used to have CONST0_RTX defined (but nothing else), but we had problems
with the compiler CSEing the initialization for multiple __vector_quads and
then copying the values around.  We'd end up with one xxsetaccz instruction
and copies out of that accumulator register into the other accumulator
registers.  Copies are VERY expensive, while xxsetaccz's are cheap, so we
don't want that.  That said, I think a fix I put in to disable fwprop on
these types may have been the culprit for that problem, so maybe we could
add the CONST0_RTX back?  I'd have to verify that.  If so, then we'd at least
be able to support -ftrivial-auto-var-init=zero.  The =pattern version
would be more problematical...unless the value for pattern was loaded from
memory.

Peter

[committed 18/19] testsuite: Update gdc testsuite to pass on latest version

This updates the GDC testsuite parts to be compatible with the current
language features/deprecations.  The DejaGnu gdc-utils helper has also
been updated to handle the new options and directives added to the D2
testsuite tests.

Bootstrapped, regression tested, and committed to mainline.

Regards,
Iain.

---
gcc/testsuite/ChangeLog:

* gdc.dg/Wcastresult2.d: Update test.
* gdc.dg/asm1.d: Likewise.
* gdc.dg/asm2.d: Likewise.
* gdc.dg/asm3.d: Likewise.
* gdc.dg/gdc282.d: Likewise.
* gdc.dg/imports/gdc170.d: Likewise.
* gdc.dg/intrinsics.d: Likewise.
* gdc.dg/pr101672.d: Likewise.
* gdc.dg/pr90650a.d: Likewise.
* gdc.dg/pr90650b.d: Likewise.
* gdc.dg/pr94777a.d: Likewise.
* gdc.dg/pr95250.d: Likewise.
* gdc.dg/pr96869.d: Likewise.
* gdc.dg/pr98277.d: Likewise.
* gdc.dg/pr98457.d: Likewise.
* gdc.dg/simd1.d: Likewise.
* gdc.dg/simd2a.d: Likewise.
* gdc.dg/simd2b.d: Likewise.
* gdc.dg/simd2c.d: Likewise.
* gdc.dg/simd2d.d: Likewise.
* gdc.dg/simd2e.d: Likewise.
* gdc.dg/simd2f.d: Likewise.
* gdc.dg/simd2g.d: Likewise.
* gdc.dg/simd2h.d: Likewise.
* gdc.dg/simd2i.d: Likewise.
* gdc.dg/simd2j.d: Likewise.
* gdc.dg/simd7951.d: Likewise.
* gdc.dg/torture/gdc309.d: Likewise.
* gdc.dg/torture/pr94424.d: Likewise.
* gdc.dg/torture/pr94777b.d: Likewise.
* lib/gdc-utils.exp (gdc-convert-args): Handle new compiler options.
(gdc-convert-test): Handle CXXFLAGS, EXTRA_OBJC_SOURCES, and ARG_SETS
test directives.
(gdc-do-test): Only import modules in the test run directory.
* gdc.dg/pr94777c.d: New test.
* gdc.dg/pr96156b.d: New test.
* gdc.dg/pr96157c.d: New test.
* gdc.dg/simd_ctfe.d: New test.
* gdc.dg/torture/simd17344.d: New test.
* gdc.dg/torture/simd20052.d: New test.
* gdc.dg/torture/simd6.d: New test.
* gdc.dg/torture/simd7.d: New test.
---
 gcc/testsuite/gdc.dg/Wcastresult2.d  |   2 +-
 gcc/testsuite/gdc.dg/asm1.d  |  18 +--
 gcc/testsuite/gdc.dg/asm2.d  |   2 +-
 gcc/testsuite/gdc.dg/asm3.d  |  10 +-
 gcc/testsuite/gdc.dg/gdc282.d|   6 +-
 gcc/testsuite/gdc.dg/imports/gdc170.d|   8 +-
 gcc/testsuite/gdc.dg/intrinsics.d|  36 +++---
 gcc/testsuite/gdc.dg/pr101672.d  |   2 +-
 gcc/testsuite/gdc.dg/pr90650a.d  |   2 +-
 gcc/testsuite/gdc.dg/pr90650b.d  |   2 +-
 gcc/testsuite/gdc.dg/pr94777a.d  |   2 +-
 gcc/testsuite/gdc.dg/pr94777c.d  |  62 +++
 gcc/testsuite/gdc.dg/pr95250.d   |   2 +-
 gcc/testsuite/gdc.dg/pr96156b.d  |  17 +++
 gcc/testsuite/gdc.dg/pr96157c.d  |  40 +++
 gcc/testsuite/gdc.dg/pr96869.d   |  26 ++---
 gcc/testsuite/gdc.dg/pr98277.d   |   2 +-
 gcc/testsuite/gdc.dg/pr98457.d   |   6 +-
 gcc/testsuite/gdc.dg/simd1.d |   8 --
 gcc/testsuite/gdc.dg/simd2a.d|   8 --
 gcc/testsuite/gdc.dg/simd2b.d|   8 --
 gcc/testsuite/gdc.dg/simd2c.d|   8 --
 gcc/testsuite/gdc.dg/simd2d.d|   8 --
 gcc/testsuite/gdc.dg/simd2e.d|   8 --
 gcc/testsuite/gdc.dg/simd2f.d|   8 --
 gcc/testsuite/gdc.dg/simd2g.d|   8 --
 gcc/testsuite/gdc.dg/simd2h.d|   8 --
 gcc/testsuite/gdc.dg/simd2i.d|   8 --
 gcc/testsuite/gdc.dg/simd2j.d|   8 --
 gcc/testsuite/gdc.dg/simd7951.d  |   1 +
 gcc/testsuite/gdc.dg/simd_ctfe.d |  87 +++
 gcc/testsuite/gdc.dg/torture/gdc309.d|   1 +
 gcc/testsuite/gdc.dg/torture/pr94424.d   |  16 +++
 gcc/testsuite/gdc.dg/torture/pr94777b.d  | 135 ---
 gcc/testsuite/gdc.dg/torture/simd17344.d |  11 ++
 gcc/testsuite/gdc.dg/torture/simd20052.d |  17 +++
 gcc/testsuite/gdc.dg/torture/simd6.d |  26 +
 gcc/testsuite/gdc.dg/torture/simd7.d |  18 +++
 gcc/testsuite/lib/gdc-utils.exp  |  81 --
 39 files changed, 435 insertions(+), 291 deletions(-)
 create mode 100644 gcc/testsuite/gdc.dg/pr94777c.d
 create mode 100644 gcc/testsuite/gdc.dg/pr96156b.d
 create mode 100644 gcc/testsuite/gdc.dg/pr96157c.d
 create mode 100644 gcc/testsuite/gdc.dg/simd_ctfe.d
 create mode 100644 gcc/testsuite/gdc.dg/torture/simd17344.d
 create mode 100644 gcc/testsuite/gdc.dg/torture/simd20052.d
 create mode 100644 gcc/testsuite/gdc.dg/torture/simd6.d
 create mode 100644 gcc/testsuite/gdc.dg/torture/simd7.d

diff --git a/gcc/testsuite/gdc.dg/Wcastresult2.d b/gcc/testsuite/gdc.dg/Wcastresult2.d
index 56d2dd20e82..83d189a6adf 100644
--- a/gcc/testsuite/gdc.dg/Wcastresult2.d
+++ b/gcc/testsuite/gdc.dg/Wcastresult2.d
@@ -1,5 +1,5 @@
 // { dg-do compile }
-// { dg-options "-Wcast-result" }
+// { dg-options "-Wcast-result -Wno-deprecated" }

[committed 17/19] libphobos: Import druntime testsuite v2.098.0-beta.1 (e6caaab9)

This is the updated D runtime library testsuite.

Bootstrapped, regression tested, and committed to mainline.

Regards,
Iain.

---
libphobos/ChangeLog:

* testsuite/libphobos.aa/test_aa.d: Update test.
* testsuite/libphobos.exceptions/unknown_gc.d: Likewise.
* testsuite/libphobos.hash/test_hash.d: Likewise.
* testsuite/libphobos.shared/host.c: Likewise.
* testsuite/libphobos.shared/load.d: Likewise.
* testsuite/libphobos.shared/load_13414.d: Likewise.
* testsuite/libphobos.thread/fiber_guard_page.d: Likewise.
* testsuite/libphobos.thread/tlsgc_sections.d: Likewise.
* testsuite/libphobos.shared/link_mod_collision.d: Removed.
* testsuite/libphobos.shared/load_mod_collision.d: Removed.
* testsuite/libphobos.allocations/alloc_from_assert.d: New test.
* testsuite/libphobos.betterc/test18828.d: New test.
* testsuite/libphobos.betterc/test19416.d: New test.
* testsuite/libphobos.betterc/test19421.d: New test.
* testsuite/libphobos.betterc/test19561.d: New test.
* testsuite/libphobos.betterc/test19924.d: New test.
* testsuite/libphobos.betterc/test20088.d: New test.
* testsuite/libphobos.betterc/test20613.d: New test.
* testsuite/libphobos.config/test19433.d: New test.
* testsuite/libphobos.config/test20459.d: New test.
* testsuite/libphobos.exceptions/assert_fail.d: New test.
* testsuite/libphobos.exceptions/catch_in_finally.d: New test.
* testsuite/libphobos.exceptions/future_message.d: New test.
* testsuite/libphobos.exceptions/long_backtrace_trunc.d: New test.
* testsuite/libphobos.exceptions/refcounted.d: New test.
* testsuite/libphobos.exceptions/rt_trap_exceptions.d: New test.
* testsuite/libphobos.exceptions/rt_trap_exceptions_drt.d: New test.
* testsuite/libphobos.gc/attributes.d: New test.
* testsuite/libphobos.gc/forkgc.d: New test.
* testsuite/libphobos.gc/forkgc2.d: New test.
* testsuite/libphobos.gc/nocollect.d: New test.
* testsuite/libphobos.gc/precisegc.d: New test.
* testsuite/libphobos.gc/recoverfree.d: New test.
* testsuite/libphobos.gc/sigmaskgc.d: New test.
* testsuite/libphobos.gc/startbackgc.d: New test.
* testsuite/libphobos.imports/bug18193.d: New test.
* testsuite/libphobos.init_fini/custom_gc.d: New test.
* testsuite/libphobos.init_fini/test18996.d: New test.
* testsuite/libphobos.lifetime/large_aggregate_destroy_21097.d: New test.
* testsuite/libphobos.thread/external_threads.d: New test.
* testsuite/libphobos.thread/join_detach.d: New test.
* testsuite/libphobos.thread/test_import.d: New test.
* testsuite/libphobos.thread/tlsstack.d: New test.
* testsuite/libphobos.typeinfo/enum_.d: New test.
* testsuite/libphobos.typeinfo/isbaseof.d: New test.
* testsuite/libphobos.unittest/customhandler.d: New test.
---
 libphobos/testsuite/libphobos.aa/test_aa.d|  79 ++-
 .../libphobos.allocations/alloc_from_assert.d |  25 +
 .../testsuite/libphobos.betterc/test18828.d   |  10 +
 .../testsuite/libphobos.betterc/test19416.d   |  14 +
 .../testsuite/libphobos.betterc/test19421.d   |  13 +
 .../testsuite/libphobos.betterc/test19561.d   |  16 +
 .../testsuite/libphobos.betterc/test19924.d   |  15 +
 .../testsuite/libphobos.betterc/test20088.d   |  14 +
 .../testsuite/libphobos.betterc/test20613.d   |  18 +
 .../testsuite/libphobos.config/test19433.d|   7 +
 .../testsuite/libphobos.config/test20459.d|   5 +
 .../libphobos.exceptions/assert_fail.d| 564 ++
 .../libphobos.exceptions/catch_in_finally.d   | 191 ++
 .../libphobos.exceptions/future_message.d |  71 +++
 .../long_backtrace_trunc.d|  37 ++
 .../libphobos.exceptions/refcounted.d |  96 +++
 .../libphobos.exceptions/rt_trap_exceptions.d |  15 +
 .../rt_trap_exceptions_drt.d  |  11 +
 .../libphobos.exceptions/unknown_gc.d |   4 +
 libphobos/testsuite/libphobos.gc/attributes.d |  30 +
 libphobos/testsuite/libphobos.gc/forkgc.d |  36 ++
 libphobos/testsuite/libphobos.gc/forkgc2.d|  22 +
 libphobos/testsuite/libphobos.gc/nocollect.d  |  15 +
 libphobos/testsuite/libphobos.gc/precisegc.d  | 126 
 .../testsuite/libphobos.gc/recoverfree.d  |  13 +
 libphobos/testsuite/libphobos.gc/sigmaskgc.d  |  42 ++
 .../testsuite/libphobos.gc/startbackgc.d  |  22 +
 .../testsuite/libphobos.hash/test_hash.d  | 140 -
 .../testsuite/libphobos.imports/bug18193.d|   4 +
 .../testsuite/libphobos.init_fini/custom_gc.d | 203 +++
 .../testsuite/libphobos.init_fini/test18996.d |  13 +
 .../large_aggregate_destroy_21097.d   |  78 +++
 libphobos/testsuite/libphobos.shared/host.c   |   8 +
 .../libphobos.shared/link_mod_collision.d |   5 -

[committed 14/19] libphobos: Update libphobos to build latest version

Updates the make files that build phobos.

Bootstrapped, regression tested, and committed to mainline.

Regards,
Iain.

---
libphobos/ChangeLog:

* src/Makefile.am (D_EXTRA_DFLAGS): Add -fpreview=dip1000 and
-fpreview=dtorfields flags.
(PHOBOS_DSOURCES): Update list of std modules.
* src/Makefile.in: Regenerate.
---
 libphobos/src/Makefile.am |  47 +++-
 libphobos/src/Makefile.in | 145 ++
 2 files changed, 142 insertions(+), 50 deletions(-)

diff --git a/libphobos/src/Makefile.am b/libphobos/src/Makefile.am
index 9f6251009f6..ba1579da8d7 100644
--- a/libphobos/src/Makefile.am
+++ b/libphobos/src/Makefile.am
@@ -19,7 +19,7 @@
 include $(top_srcdir)/d_rules.am
 
 # Make sure GDC can find libdruntime and libphobos include files
-D_EXTRA_DFLAGS=-nostdinc -I $(srcdir) \
+D_EXTRA_DFLAGS=-fpreview=dip1000 -fpreview=dtorfields -nostdinc -I $(srcdir) \
-I $(top_srcdir)/libdruntime -I ../libdruntime -I .
 
 # D flags for compilation
@@ -83,12 +83,12 @@ PHOBOS_DSOURCES =
 
 else
 
-PHOBOS_DSOURCES = etc/c/curl.d etc/c/sqlite3.d etc/c/zlib.d \
-   std/algorithm/comparison.d std/algorithm/internal.d \
-   std/algorithm/iteration.d std/algorithm/mutation.d \
-   std/algorithm/package.d std/algorithm/searching.d \
-   std/algorithm/setops.d std/algorithm/sorting.d std/array.d std/ascii.d \
-   std/base64.d std/bigint.d std/bitmanip.d std/compiler.d std/complex.d \
+PHOBOS_DSOURCES = etc/c/curl.d etc/c/zlib.d std/algorithm/comparison.d \
+   std/algorithm/internal.d std/algorithm/iteration.d \
+   std/algorithm/mutation.d std/algorithm/package.d \
+   std/algorithm/searching.d std/algorithm/setops.d \
+   std/algorithm/sorting.d std/array.d std/ascii.d std/base64.d \
+   std/bigint.d std/bitmanip.d std/compiler.d std/complex.d \
std/concurrency.d std/container/array.d std/container/binaryheap.d \
std/container/dlist.d std/container/package.d std/container/rbtree.d \
std/container/slist.d std/container/util.d std/conv.d std/csv.d \
@@ -99,7 +99,9 @@ PHOBOS_DSOURCES = etc/c/curl.d etc/c/sqlite3.d etc/c/zlib.d \
std/digest/murmurhash.d std/digest/package.d std/digest/ripemd.d \
std/digest/sha.d std/encoding.d std/exception.d \
std/experimental/allocator/building_blocks/affix_allocator.d \
+   std/experimental/allocator/building_blocks/aligned_block_list.d \
std/experimental/allocator/building_blocks/allocator_list.d \
+   std/experimental/allocator/building_blocks/ascending_page_allocator.d \
std/experimental/allocator/building_blocks/bitmapped_block.d \
std/experimental/allocator/building_blocks/bucketizer.d \
std/experimental/allocator/building_blocks/fallback_allocator.d \
@@ -123,27 +125,34 @@ PHOBOS_DSOURCES = etc/c/curl.d etc/c/sqlite3.d etc/c/zlib.d \
std/experimental/logger/core.d std/experimental/logger/filelogger.d \
std/experimental/logger/multilogger.d \
std/experimental/logger/nulllogger.d std/experimental/logger/package.d \
-   std/experimental/typecons.d std/file.d std/format.d std/functional.d \
-   std/getopt.d std/internal/cstring.d std/internal/math/biguintcore.d \
-   std/internal/math/biguintnoasm.d std/internal/math/errorfunction.d \
-   std/internal/math/gammafunction.d std/internal/scopebuffer.d \
+   std/experimental/typecons.d std/file.d std/format/internal/floats.d \
+   std/format/internal/read.d std/format/internal/write.d \
+   std/format/package.d std/format/read.d std/format/spec.d \
+   std/format/write.d std/functional.d std/getopt.d \
+   std/internal/attributes.d std/internal/cstring.d \
+   std/internal/math/biguintcore.d std/internal/math/biguintnoasm.d \
+   std/internal/math/errorfunction.d std/internal/math/gammafunction.d \
+   std/internal/memory.d std/internal/scopebuffer.d \
std/internal/test/dummyrange.d std/internal/test/range.d \
std/internal/test/uda.d std/internal/unicode_comp.d \
std/internal/unicode_decomp.d std/internal/unicode_grapheme.d \
std/internal/unicode_norm.d std/internal/unicode_tables.d \
-   std/internal/windows/advapi32.d std/json.d std/math.d \
+   std/internal/windows/advapi32.d std/json.d std/math/algebraic.d \
+   std/math/constants.d std/math/exponential.d std/math/hardware.d \
+   std/math/operations.d std/math/package.d std/math/remainder.d \
+   std/math/rounding.d std/math/traits.d std/math/trigonometry.d \
std/mathspecial.d std/meta.d std/mmfile.d std/net/curl.d \
-   std/net/isemail.d std/numeric.d std/outbuffer.d std/parallelism.d \
-   std/path.d std/process.d std/random.d std/range/interfaces.d \
-   std/range/package.d std/range/primitives.d \
+   std/net/isemail.d std/numeric.d std/outbuffer.d std/package.d \
+   std/parallelism.d std/path.d std/process.d std/random.d \
+

Re: [PATCH] libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]

2021-11-30 Thread Stephan Bergmann via Gcc-patches


On 30/11/2021 14:26, Marek Polacek wrote:

On Tue, Nov 30, 2021 at 09:38:57AM +0100, Stephan Bergmann wrote:

On 15/11/2021 18:28, Marek Polacek via Gcc-patches wrote:

On Mon, Nov 08, 2021 at 04:33:43PM -0500, Marek Polacek wrote:

Ping, can we conclude on the name?   IMHO, -Wbidirectional is just fine,
but changing the name is a trivial operation.


Here's a patch with a better name (suggested by Jonathan W.).  Otherwise no
changes.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
  From a link below:
"An issue was discovered in the Bidirectional Algorithm in the Unicode
Specification through 14.0. It permits the visual reordering of
characters via control sequences, which can be used to craft source code
that renders different logic than the logical ordering of tokens
ingested by compilers and interpreters. Adversaries can leverage this to
encode source code for compilers accepting Unicode such that targeted
vulnerabilities are introduced invisibly to human reviewers."

More info:
https://nvd.nist.gov/vuln/detail/CVE-2021-42574
https://trojansource.codes/

This is not a compiler bug.  However, to mitigate the problem, this patch
implements -Wbidi-chars=[none|unpaired|any] to warn about possibly
misleading Unicode bidirectional characters the preprocessor may encounter.

The default is =unpaired, which warns about improperly terminated
bidirectional characters; e.g. a LRE without its appertaining PDF.  The
level =any warns about any use of bidirectional characters.

This patch handles both UCNs and UTF-8 characters.  UCNs designating
bidi characters in identifiers are accepted since r204886.  Then r217144
enabled -fextended-identifiers by default.  Extended characters in C/C++
identifiers have been accepted since r275979.  However, this patch still
warns about mixing UTF-8 and UCN bidi characters; there seems to be no
good reason to allow mixing them.


I wonder what the rationale is to warn about UCNs, like in


 aText = u"\u202D" + aText;


(as found in the LibreOffice source code).


Is this line mixing a UCN and a UTF-8?  Or is it just that you're
prepending a LRO to aText?  We warn because the LRO is not "closed"
in the context of its string literal, which was part of the Trojan
source attack.  So "\u202D ... \u202C" would not warn.

I'm not sure what workaround I could offer.  Maybe provide an option not to
warn about UCNs at all, though even that is potentially dangerous -- while
you can see UCNs in the source code, if you print strings containing them,
they won't be visible anymore.


I'm not sure what you mean with "mixing a UCN and a UTF-8", but what the 
code apparently does is programmatically constructing a larger piece of 
text by prepending LRO to an existing piece of text.


My understanding is that Trojan Source is concerned with presentation of 
program source code and not with properties of Unicode text constructed 
during the execution of such a program, and from the documentation 
quoted above I understand that -Wbidi-chars is meant to address Trojan 
Source, so I don't understand why you're concerned here with what 
happens "if you print strings containing [UCNs in the source code]".


Short of a source code viewer that interprets UCNs in C/C++ source code 
and renders them in the same way as their corresponding Unicode 
characters, I don't think that UCNs are relevant for Trojan Source, and 
don't understand why -Wbidi-chars would warn about them.


(Also, I noticed that it doesn't work to silence -Werror=bidi-chars= with a


#pragma GCC diagnostic ignored "-Wbidi-chars"


?)
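
For readers following the thread, the "unpaired" rule discussed above (an opening control such as LRO must be closed within the same string literal) can be sketched roughly as follows. This is only an illustration and not libcpp's implementation: the function name has_unpaired_bidi and the constant names are made up for the example, the input is assumed to be already decoded into code points, and a stray closing control is simply ignored.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Code points of the bidirectional controls (names are local to this sketch).
constexpr char32_t LRE = 0x202A, RLE = 0x202B, PDF = 0x202C;
constexpr char32_t LRO = 0x202D, RLO = 0x202E;
constexpr char32_t LRI = 0x2066, RLI = 0x2067, FSI = 0x2068, PDI = 0x2069;

// Returns true when TEXT leaves an embedding, override or isolate open.
bool has_unpaired_bidi(const std::u32string &text)
{
  enum class Kind { Embedding, Isolate };
  std::vector<Kind> open;  // stack of currently open controls

  for (char32_t c : text)
    {
      if (c == LRE || c == RLE || c == LRO || c == RLO)
        open.push_back(Kind::Embedding);
      else if (c == LRI || c == RLI || c == FSI)
        open.push_back(Kind::Isolate);
      else if (c == PDF)
        {
          // PDF closes the innermost embedding or override, if any.
          if (!open.empty() && open.back() == Kind::Embedding)
            open.pop_back();
        }
      else if (c == PDI)
        {
          // PDI closes the nearest isolate and everything opened inside it.
          for (std::size_t i = open.size(); i-- > 0;)
            if (open[i] == Kind::Isolate)
              {
                open.resize(i);
                break;
              }
        }
    }
  return !open.empty();
}
```

Under this sketch a literal consisting of LRO followed by text counts as unpaired, while LRO ... PDF does not, which matches the behaviour described in the reply quoted above.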

[committed 10/19] libphobos: Update libgdruntime to build with latest version

Updates the make files, and the gdc-specific modules of druntime.

Bootstrapped, regression tested, and committed to mainline.

Regards,
Iain.

---
libphobos/ChangeLog:

* libdruntime/Makefile.am (D_EXTRA_DFLAGS): Build libdruntime with
-fpreview=dip1000, -fpreview=fieldwise, and -fpreview=dtorfields.
(ALL_DRUNTIME_SOURCES): Add DRUNTIME_DSOURCES_STDCXX.
(DRUNTIME_DSOURCES): Update list of C binding modules.
(DRUNTIME_DSOURCES_STDCXX): Likewise.
(DRUNTIME_DSOURCES_LINUX): Likewise.
(DRUNTIME_DSOURCES_OPENBSD): Likewise.
(DRUNTIME_DISOURCES): Remove __entrypoint.di.
* libdruntime/Makefile.in: Regenerated.
* libdruntime/__entrypoint.di: Removed.
* libdruntime/gcc/backtrace.d (FIRSTFRAME): Remove.
(LibBacktrace.MaxAlignment): Remove.
(LibBacktrace.this): Remove default initialization of firstFrame.
(UnwindBacktrace.this): Likewise.
* libdruntime/gcc/deh.d (_d_isbaseof): Update signature.
(_d_createTrace): Likewise.
(__gdc_begin_catch): Remove reference to the exception.
(_d_throw): Increment reference count of thrown object before unwind.
(__gdc_personality): Chain exceptions with Throwable.chainTogether.
* libdruntime/gcc/emutls.d: Update imports.
* libdruntime/gcc/sections/elf.d: Update imports.
(DSO.moduleGroup): Update signature.
* libdruntime/gcc/sections/macho.d: Update imports.
(DSO.moduleGroup): Update signature.
* libdruntime/gcc/sections/pecoff.d: Update imports.
(DSO.moduleGroup): Update signature.
* libdruntime/gcc/unwind/generic.d (__aligned__): Define.
---
 libphobos/libdruntime/Makefile.am   |   6 +-
 libphobos/libdruntime/Makefile.in   | 148 
 libphobos/libdruntime/__entrypoint.di   |  56 
 libphobos/libdruntime/gcc/deh.d |  22 +--
 libphobos/libdruntime/gcc/emutls.d  |   3 +-
 libphobos/libdruntime/gcc/sections/elf.d|   6 +-
 libphobos/libdruntime/gcc/sections/macho.d  |   6 +-
 libphobos/libdruntime/gcc/sections/pecoff.d |   6 +-
 8 files changed, 116 insertions(+), 137 deletions(-)
 delete mode 100644 libphobos/libdruntime/__entrypoint.di

diff --git a/libphobos/libdruntime/Makefile.am 
b/libphobos/libdruntime/Makefile.am
index 80fc0badcff..80c7567079a 100644
--- a/libphobos/libdruntime/Makefile.am
+++ b/libphobos/libdruntime/Makefile.am
@@ -19,7 +19,8 @@
 include $(top_srcdir)/d_rules.am
 
 # Make sure GDC can find libdruntime include files
-D_EXTRA_DFLAGS=-nostdinc -I $(srcdir) -I .
+D_EXTRA_DFLAGS=-fpreview=dip1000 -fpreview=fieldwise -fpreview=dtorfields \
+  -nostdinc -I $(srcdir) -I .
 
 # D flags for compilation
 AM_DFLAGS= \
@@ -119,6 +120,7 @@ endif
 DRUNTIME_DSOURCES_GENERATED = gcc/config.d gcc/libbacktrace.d
 
 ALL_DRUNTIME_SOURCES = $(DRUNTIME_DSOURCES) $(DRUNTIME_CSOURCES) \
+   $(DRUNTIME_DSOURCES_STDCXX) \
$(DRUNTIME_SOURCES_CONFIGURED) $(DRUNTIME_DSOURCES_GENERATED)
 
 # Need this library to both be part of libgphobos.a, and installed separately.
@@ -422,4 +424,4 @@ DRUNTIME_DSOURCES_WINDOWS = core/sys/windows/accctrl.d \
core/sys/windows/winuser.d core/sys/windows/winver.d \
core/sys/windows/wtsapi32.d core/sys/windows/wtypes.d
 
-DRUNTIME_DISOURCES = __entrypoint.di __main.di
+DRUNTIME_DISOURCES = __main.di
diff --git a/libphobos/libdruntime/Makefile.in 
b/libphobos/libdruntime/Makefile.in
index cdb1fe3cc18..b5f29da8540 100644
--- a/libphobos/libdruntime/Makefile.in
+++ b/libphobos/libdruntime/Makefile.in
@@ -245,7 +245,13 @@ am__objects_1 = core/atomic.lo core/attribute.lo 
core/bitop.lo \
rt/monitor_.lo rt/profilegc.lo rt/sections.lo rt/tlsgc.lo \
rt/util/typeinfo.lo rt/util/utility.lo
 am__objects_2 = core/stdc/libgdruntime_la-errno_.lo
-am__objects_3 = core/sys/posix/aio.lo core/sys/posix/arpa/inet.lo \
+am__objects_3 = core/stdcpp/allocator.lo core/stdcpp/array.lo \
+   core/stdcpp/exception.lo core/stdcpp/memory.lo \
+   core/stdcpp/new_.lo core/stdcpp/string.lo \
+   core/stdcpp/string_view.lo core/stdcpp/type_traits.lo \
+   core/stdcpp/typeinfo.lo core/stdcpp/utility.lo \
+   core/stdcpp/vector.lo core/stdcpp/xutility.lo
+am__objects_4 = core/sys/posix/aio.lo core/sys/posix/arpa/inet.lo \
core/sys/posix/config.lo core/sys/posix/dirent.lo \
core/sys/posix/dlfcn.lo core/sys/posix/fcntl.lo \
core/sys/posix/grp.lo core/sys/posix/iconv.lo \
@@ -272,8 +278,8 @@ am__objects_3 = core/sys/posix/aio.lo 
core/sys/posix/arpa/inet.lo \
core/sys/posix/syslog.lo core/sys/posix/termios.lo \
core/sys/posix/time.lo core/sys/posix/ucontext.lo \
core/sys/posix/unistd.lo core/sys/posix/utime.lo
-@DRUNTIME_OS_POSIX_TRUE@am__objects_4 = $(am__objects_3)
-am__objects_5 = core/sys/darwin/config.lo \
+@DRUNTIME_OS_POSIX_TRUE@am__objects_5 = $(am__objects_4)
+am__objects_6 =

[PATCH] tree-optimization/103464 - Also pre-process PHIs in range-of-stmt.

2021-11-30 Thread Andrew MacLeod via Gcc-patches

When I flattened the call stack for range_of_stmt in PR 103231 ( 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103231 ), I mentioned that I 
was only flattening it for chains of statements with range handlers. If 
it turned out that PHI chaining was also a problem, we could also do PHIs.


The cost of doing PHIs is quite nominal, and it resolves this testcase, so 
we might as well do them too.
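
The flattening idea can be sketched outside of GCC as an explicit worklist that resolves a node's operands before the node itself, instead of recursing down the definition chain. This is a minimal illustration under assumptions: Node and prefill_order are hypothetical names, a PHI is modelled as just another node with several operands, and back edges (which PHIs in loops create) are skipped while the target is still on the stack.

```cpp
#include <vector>

// A definition and the indices of the definitions it depends on.
struct Node { std::vector<int> operands; };

// Returns the order in which nodes get resolved when starting at ROOT,
// using an explicit stack rather than recursion.
std::vector<int> prefill_order(const std::vector<Node> &nodes, int root)
{
  std::vector<int> order;
  // 0 = unseen, 1 = on the stack, 2 = resolved.
  std::vector<char> state(nodes.size(), 0);
  std::vector<int> stack{root};
  state[root] = 1;

  while (!stack.empty())
    {
      int n = stack.back();
      bool pushed = false;
      // Queue every operand that has not been seen yet.
      for (int op : nodes[n].operands)
        if (state[op] == 0)
          {
            state[op] = 1;
            stack.push_back(op);
            pushed = true;
          }
      // All operands are resolved or pending further up: resolve N now.
      if (!pushed)
        {
          stack.pop_back();
          state[n] = 2;
          order.push_back(n);
        }
    }
  return order;
}
```

The depth of the native call stack stays constant however long the chain of definitions is; only the explicit vector grows.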


Bootstrapped on x86_64-pc-linux-gnu with no regressions.  OK?

Andrew
commit 99b0f5f03a04fd342461a67287d81250f86f0586
Author: Andrew MacLeod 
Date:   Mon Nov 29 12:00:26 2021 -0500

Also pre-process PHIs in range-of-stmt.

PR tree-optimization/103464
* gimple-range.cc (gimple_ranger::prefill_name): Process phis also.
(gimple_ranger::prefill_stmt_dependencies): Ditto.

diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
index 178a470a419..c8431a7180b 100644
--- a/gcc/gimple-range.cc
+++ b/gcc/gimple-range.cc
@@ -333,7 +333,7 @@ gimple_ranger::prefill_name (irange &r, tree name)
   if (!gimple_range_ssa_p (name))
 return;
   gimple *stmt = SSA_NAME_DEF_STMT (name);
-  if (!gimple_range_handler (stmt))
+  if (!gimple_range_handler (stmt) && !is_a<gphi *> (stmt))
 return;
 
   bool current;
@@ -356,8 +356,8 @@ gimple_ranger::prefill_stmt_dependencies (tree ssa)
   gimple *stmt = SSA_NAME_DEF_STMT (ssa);
   gcc_checking_assert (stmt && gimple_bb (stmt));
 
-  // Only pre-process range-ops.
-  if (!gimple_range_handler (stmt))
+  // Only pre-process range-ops and phis.
+  if (!gimple_range_handler (stmt) && !is_a<gphi *> (stmt))
 return;
 
   // Mark where on the stack we are starting.
@@ -401,13 +401,22 @@ gimple_ranger::prefill_stmt_dependencies (tree ssa)
 	  print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
 	}
 
-  gcc_checking_assert (gimple_range_handler (stmt));
-  tree op = gimple_range_operand2 (stmt);
-  if (op)
-	prefill_name (r, op);
-  op = gimple_range_operand1 (stmt);
-  if (op)
-	prefill_name (r, op);
+  gphi *phi = dyn_cast <gphi *> (stmt);
+  if (phi)
+	{
+	  for (unsigned x = 0; x < gimple_phi_num_args (phi); x++)
+	prefill_name (r, gimple_phi_arg_def (phi, x));
+	}
+  else
+	{
+	  gcc_checking_assert (gimple_range_handler (stmt));
+	  tree op = gimple_range_operand2 (stmt);
+	  if (op)
+	prefill_name (r, op);
+	  op = gimple_range_operand1 (stmt);
+	  if (op)
+	prefill_name (r, op);
+	}
 }
   if (idx)
 tracer.trailer (idx, "ROS ", false, ssa, r);

Re: [PATCH] ipa-sra: Check also ECF_LOOPING_CONST_OR_PURE when evaluating calls

On Tue, Nov 30, 2021 at 3:24 PM Martin Jambor  wrote:
>
> Hi,
>
> in PR 103267 Honza found out that IPA-SRA does not look at
> ECF_LOOPING_CONST_OR_PURE when evaluating if a call can have side
> effects.  Fixed with this patch.  The testcase infinitely loops in a
> const function, so it would not make a good addition to the testsuite.
>
> Bootstrapped and tested on x86_64-linux.  OK for trunk?

OK.

> Thanks,
>
> Martin
>
>
> gcc/ChangeLog:
>
> 2021-11-29  Martin Jambor  
>
> PR ipa/103267
> * ipa-sra.c (scan_function): Also check ECF_LOOPING_CONST_OR_PURE 
> flag.
> ---
>  gcc/ipa-sra.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
> index cb0e30507a1..12ccd049552 100644
> --- a/gcc/ipa-sra.c
> +++ b/gcc/ipa-sra.c
> @@ -1925,7 +1925,8 @@ scan_function (cgraph_node *node, struct function *fun)
> if (lhs)
>   scan_expr_access (lhs, stmt, ISRA_CTX_STORE, bb);
> int flags = gimple_call_flags (stmt);
> -   if ((flags & (ECF_CONST | ECF_PURE)) == 0)
> +   if (((flags & (ECF_CONST | ECF_PURE)) == 0)
> +   || (flags & ECF_LOOPING_CONST_OR_PURE))
>   bitmap_set_bit (final_bbs, bb->index);
>   }
>   break;
> --
> 2.33.1
>

Re: [PATCH] Avoid some -Wunreachable-code-ctrl

On Tue, 30 Nov 2021, Mikael Morin wrote:

> On 30/11/2021 14:25, Richard Biener wrote:
> > On Tue, 30 Nov 2021, Mikael Morin wrote:
> > 
> >> On 29/11/2021 at 16:03, Richard Biener via Gcc-patches wrote:
> >>> diff --git a/gcc/fortran/frontend-passes.c b/gcc/fortran/frontend-passes.c
> >>> index f5ba7cecd54..16ee2afc9c0 100644
> >>> --- a/gcc/fortran/frontend-passes.c
> >>> +++ b/gcc/fortran/frontend-passes.c
> >>> @@ -5229,7 +5229,6 @@ gfc_expr_walker (gfc_expr **e, walk_expr_fn_t
> >>> exprfn,
> >>> void *data)
> >>>   case EXPR_OP:
> >>> WALK_SUBEXPR ((*e)->value.op.op1);
> >>> WALK_SUBEXPR_TAIL ((*e)->value.op.op2);
> >>> - break;
> >>>   case EXPR_FUNCTION:
> >>> for (a = (*e)->value.function.actual; a; a = a->next)
> >>>   WALK_SUBEXPR (a->expr);
> >>
> >> I'm uncomfortable with the above change.
> >> It makes it look like there is a fall through, but there is not.
> >> Maybe inline the macro to make the continue explicit, or use WALK_SUBEXPR
> >> instead of WALK_SUBEXPR_TAIL and hope the compiler will do the tail call
> >> optimization.
> > 
> > Ah, it follows the style in tree.c:walk_tree_1 where break was used
> > inconsistently after WALK_SUBTREE_TAIL which was then more obvious
> > to me to clean up.  I didn't realize the fortran FE only had a
> > single WALK_SUBEXPR_TAIL.
> > 
> > I'm not sure inlining will make the situation more clear, for
> > sure using WALK_SUBEXPR would but it might lose the tailcall.
> > 
> > Would you accept an additional comment after WALK_SUBEXPR_TAIL like
> > 
> >case EXPR_OP:
> >  WALK_SUBEXPR ((*e)->value.op.op1);
> >  WALK_SUBEXPR_TAIL ((*e)->value.op.op2);
> >  /* tail-recurse  */
> > 
> My preference would be a gcc_unreachable() or something similar, but I
> understand it would get a warning as well?
> 
> Without a better idea, I'm fine with an even more explicit comment:
> 
> /* No fallthru because of the tail recursion above.  */
> 
> > ?  Btw, a fallthru would be diagnosed by GCC unless we put
> > 
> >  /* Fallthru  */
> > 
> > here.
> Sure, but my main concern was misreading from programmers (including me),
> which is not diagnosed by compilers.
> 
> > Maybe renaming WALK_SUBEXPR_TAIL to WALK_SUBEXPR_WITH_CONTINUE
> > or WALK_SUBEXPR_BY_TAIL_RECURSING or WALK_SUBEXPR_TAILRECURSE would
> > be more obvious?
> > 
> I think the comment above would be enough.

Installed as follows.

Richard.

From e5c2a436ef7596d254ffefd279742382b1ff546b Mon Sep 17 00:00:00 2001
From: Richard Biener 
Date: Tue, 30 Nov 2021 15:25:17 +0100
Subject: [PATCH] Add comment to indicate tail recursion
To: gcc-patches@gcc.gnu.org

My previous change removed an unreachable break; there (an
unreachable continue; would have been more to the point).  The
following re-adds a comment explaining that WALK_SUBEXPR_TAIL
does not fall through but tail recurses.
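
As a rough illustration of why nothing falls through after such a macro, here is a simplified walker in the same style. The macros and types below are hypothetical stand-ins, not the gfortran definitions; the point is only that the tail macro replaces the current node and ends in continue, so control returns to the top of the loop rather than reaching the next case label.

```cpp
enum class Kind { Leaf, Op };

struct Expr
{
  Kind kind;
  int value;
  const Expr *op1;
  const Expr *op2;
};

// Ordinary recursion into a subexpression (simplified, hypothetical).
#define WALK_SUB(NODE) sum += walk_sum (NODE)
// Tail recursion: make NODE the current node and restart the loop.
#define WALK_SUB_TAIL(NODE) e = (NODE); continue

// Sums the values of all nodes reachable from E.
int walk_sum(const Expr *e)
{
  int sum = 0;
  while (e)
    {
      sum += e->value;
      switch (e->kind)
        {
        case Kind::Op:
          WALK_SUB (e->op1);
          WALK_SUB_TAIL (e->op2);
          /* No fallthru because of the tail recursion above.  */
        case Kind::Leaf:
          break;
        }
      break;  // Only reached for leaves: nothing left to walk.
    }
  return sum;
}

#undef WALK_SUB
#undef WALK_SUB_TAIL
```

Note that the tail macro cannot be wrapped in the usual do { ... } while (0) idiom, because the continue would then bind to that inner loop.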

2021-11-30  Richard Biener  

gcc/fortran/
* frontend-passes.c (gfc_expr_walker): Add comment to
indicate tail recursion.
---
 gcc/fortran/frontend-passes.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/fortran/frontend-passes.c b/gcc/fortran/frontend-passes.c
index 16ee2afc9c0..4764c834f4f 100644
--- a/gcc/fortran/frontend-passes.c
+++ b/gcc/fortran/frontend-passes.c
@@ -5229,6 +5229,7 @@ gfc_expr_walker (gfc_expr **e, walk_expr_fn_t exprfn, 
void *data)
  case EXPR_OP:
WALK_SUBEXPR ((*e)->value.op.op1);
WALK_SUBEXPR_TAIL ((*e)->value.op.op2);
+   /* No fallthru because of the tail recursion above.  */
  case EXPR_FUNCTION:
for (a = (*e)->value.function.actual; a; a = a->next)
  WALK_SUBEXPR (a->expr);
-- 
2.31.1

[PATCH] ipa-sra: Check also ECF_LOOPING_CONST_OR_PURE when evaluating calls

2021-11-30 Thread Martin Jambor

Hi,

in PR 103267 Honza found out that IPA-SRA does not look at
ECF_LOOPING_CONST_OR_PURE when evaluating if a call can have side
effects.  Fixed with this patch.  The testcase infinitely loops in a
const function, so it would not make a good addition to the testsuite.
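
The condition being fixed can be sketched in isolation as below. The flag names mirror GCC's, but the bit values here are assumptions chosen for the example, and call_may_have_side_effects is a hypothetical helper rather than a function in ipa-sra.c.

```cpp
// Hypothetical bit values; GCC defines the real ECF_* flags itself.
enum : unsigned
{
  ECF_CONST = 1u << 0,
  ECF_PURE = 1u << 1,
  ECF_LOOPING_CONST_OR_PURE = 1u << 2
};

// A call is treated as side-effect free only if it is const or pure
// and is also known not to loop forever.
bool call_may_have_side_effects(unsigned flags)
{
  return (flags & (ECF_CONST | ECF_PURE)) == 0
         || (flags & ECF_LOOPING_CONST_OR_PURE) != 0;
}
```

With only the first half of the condition, a const function containing an infinite loop would be classified as harmless, which is the situation described in the PR.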

Bootstrapped and tested on x86_64-linux.  OK for trunk?

Thanks,

Martin


gcc/ChangeLog:

2021-11-29  Martin Jambor  

PT ipa/103267
* ipa-sra.c (scan_function): Also check ECF_LOOPING_CONST_OR_PURE flag.
---
 gcc/ipa-sra.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
index cb0e30507a1..12ccd049552 100644
--- a/gcc/ipa-sra.c
+++ b/gcc/ipa-sra.c
@@ -1925,7 +1925,8 @@ scan_function (cgraph_node *node, struct function *fun)
if (lhs)
  scan_expr_access (lhs, stmt, ISRA_CTX_STORE, bb);
int flags = gimple_call_flags (stmt);
-   if ((flags & (ECF_CONST | ECF_PURE)) == 0)
+   if (((flags & (ECF_CONST | ECF_PURE)) == 0)
+   || (flags & ECF_LOOPING_CONST_OR_PURE))
  bitmap_set_bit (final_bbs, bb->index);
  }
  break;
-- 
2.33.1

[PATCH] libsanitizer: Use SSE to save and restore XMM registers

2021-11-30 Thread H.J. Lu via Gcc-patches

Use SSE, instead of AVX, to save and restore XMM registers to support
processors without AVX.  The affected code has been unused upstream since

https://github.com/llvm/llvm-project/commit/66d4ce7e26a5

and will be removed in

https://reviews.llvm.org/D112604

This fixes

FAIL: g++.dg/tsan/pthread_cond_clockwait.C   -O0  execution test
FAIL: g++.dg/tsan/pthread_cond_clockwait.C   -O2  execution test

on machines without AVX.

PR sanitizer/103466
* tsan/tsan_rtl_amd64.S (__tsan_trace_switch_thunk): Replace
vmovdqu with movdqu.
(__tsan_report_race_thunk): Likewise.
---
 libsanitizer/tsan/tsan_rtl_amd64.S | 128 ++---
 1 file changed, 64 insertions(+), 64 deletions(-)

diff --git a/libsanitizer/tsan/tsan_rtl_amd64.S 
b/libsanitizer/tsan/tsan_rtl_amd64.S
index 632b19d1815..c15b01e49e5 100644
--- a/libsanitizer/tsan/tsan_rtl_amd64.S
+++ b/libsanitizer/tsan/tsan_rtl_amd64.S
@@ -45,22 +45,22 @@ ASM_SYMBOL(__tsan_trace_switch_thunk):
   # All XMM registers are caller-saved.
   sub $0x100, %rsp
   CFI_ADJUST_CFA_OFFSET(0x100)
-  vmovdqu %xmm0, 0x0(%rsp)
-  vmovdqu %xmm1, 0x10(%rsp)
-  vmovdqu %xmm2, 0x20(%rsp)
-  vmovdqu %xmm3, 0x30(%rsp)
-  vmovdqu %xmm4, 0x40(%rsp)
-  vmovdqu %xmm5, 0x50(%rsp)
-  vmovdqu %xmm6, 0x60(%rsp)
-  vmovdqu %xmm7, 0x70(%rsp)
-  vmovdqu %xmm8, 0x80(%rsp)
-  vmovdqu %xmm9, 0x90(%rsp)
-  vmovdqu %xmm10, 0xa0(%rsp)
-  vmovdqu %xmm11, 0xb0(%rsp)
-  vmovdqu %xmm12, 0xc0(%rsp)
-  vmovdqu %xmm13, 0xd0(%rsp)
-  vmovdqu %xmm14, 0xe0(%rsp)
-  vmovdqu %xmm15, 0xf0(%rsp)
+  movdqu %xmm0, 0x0(%rsp)
+  movdqu %xmm1, 0x10(%rsp)
+  movdqu %xmm2, 0x20(%rsp)
+  movdqu %xmm3, 0x30(%rsp)
+  movdqu %xmm4, 0x40(%rsp)
+  movdqu %xmm5, 0x50(%rsp)
+  movdqu %xmm6, 0x60(%rsp)
+  movdqu %xmm7, 0x70(%rsp)
+  movdqu %xmm8, 0x80(%rsp)
+  movdqu %xmm9, 0x90(%rsp)
+  movdqu %xmm10, 0xa0(%rsp)
+  movdqu %xmm11, 0xb0(%rsp)
+  movdqu %xmm12, 0xc0(%rsp)
+  movdqu %xmm13, 0xd0(%rsp)
+  movdqu %xmm14, 0xe0(%rsp)
+  movdqu %xmm15, 0xf0(%rsp)
   # Align stack frame.
   push %rbx  # non-scratch
   CFI_ADJUST_CFA_OFFSET(8)
@@ -78,22 +78,22 @@ ASM_SYMBOL(__tsan_trace_switch_thunk):
   pop %rbx
   CFI_ADJUST_CFA_OFFSET(-8)
   # Restore scratch registers.
-  vmovdqu 0x0(%rsp), %xmm0
-  vmovdqu 0x10(%rsp), %xmm1
-  vmovdqu 0x20(%rsp), %xmm2
-  vmovdqu 0x30(%rsp), %xmm3
-  vmovdqu 0x40(%rsp), %xmm4
-  vmovdqu 0x50(%rsp), %xmm5
-  vmovdqu 0x60(%rsp), %xmm6
-  vmovdqu 0x70(%rsp), %xmm7
-  vmovdqu 0x80(%rsp), %xmm8
-  vmovdqu 0x90(%rsp), %xmm9
-  vmovdqu 0xa0(%rsp), %xmm10
-  vmovdqu 0xb0(%rsp), %xmm11
-  vmovdqu 0xc0(%rsp), %xmm12
-  vmovdqu 0xd0(%rsp), %xmm13
-  vmovdqu 0xe0(%rsp), %xmm14
-  vmovdqu 0xf0(%rsp), %xmm15
+  movdqu 0x0(%rsp), %xmm0
+  movdqu 0x10(%rsp), %xmm1
+  movdqu 0x20(%rsp), %xmm2
+  movdqu 0x30(%rsp), %xmm3
+  movdqu 0x40(%rsp), %xmm4
+  movdqu 0x50(%rsp), %xmm5
+  movdqu 0x60(%rsp), %xmm6
+  movdqu 0x70(%rsp), %xmm7
+  movdqu 0x80(%rsp), %xmm8
+  movdqu 0x90(%rsp), %xmm9
+  movdqu 0xa0(%rsp), %xmm10
+  movdqu 0xb0(%rsp), %xmm11
+  movdqu 0xc0(%rsp), %xmm12
+  movdqu 0xd0(%rsp), %xmm13
+  movdqu 0xe0(%rsp), %xmm14
+  movdqu 0xf0(%rsp), %xmm15
   add $0x100, %rsp
   CFI_ADJUST_CFA_OFFSET(-0x100)
   pop %r11
@@ -163,22 +163,22 @@ ASM_SYMBOL(__tsan_report_race_thunk):
   # All XMM registers are caller-saved.
   sub $0x100, %rsp
   CFI_ADJUST_CFA_OFFSET(0x100)
-  vmovdqu %xmm0, 0x0(%rsp)
-  vmovdqu %xmm1, 0x10(%rsp)
-  vmovdqu %xmm2, 0x20(%rsp)
-  vmovdqu %xmm3, 0x30(%rsp)
-  vmovdqu %xmm4, 0x40(%rsp)
-  vmovdqu %xmm5, 0x50(%rsp)
-  vmovdqu %xmm6, 0x60(%rsp)
-  vmovdqu %xmm7, 0x70(%rsp)
-  vmovdqu %xmm8, 0x80(%rsp)
-  vmovdqu %xmm9, 0x90(%rsp)
-  vmovdqu %xmm10, 0xa0(%rsp)
-  vmovdqu %xmm11, 0xb0(%rsp)
-  vmovdqu %xmm12, 0xc0(%rsp)
-  vmovdqu %xmm13, 0xd0(%rsp)
-  vmovdqu %xmm14, 0xe0(%rsp)
-  vmovdqu %xmm15, 0xf0(%rsp)
+  movdqu %xmm0, 0x0(%rsp)
+  movdqu %xmm1, 0x10(%rsp)
+  movdqu %xmm2, 0x20(%rsp)
+  movdqu %xmm3, 0x30(%rsp)
+  movdqu %xmm4, 0x40(%rsp)
+  movdqu %xmm5, 0x50(%rsp)
+  movdqu %xmm6, 0x60(%rsp)
+  movdqu %xmm7, 0x70(%rsp)
+  movdqu %xmm8, 0x80(%rsp)
+  movdqu %xmm9, 0x90(%rsp)
+  movdqu %xmm10, 0xa0(%rsp)
+  movdqu %xmm11, 0xb0(%rsp)
+  movdqu %xmm12, 0xc0(%rsp)
+  movdqu %xmm13, 0xd0(%rsp)
+  movdqu %xmm14, 0xe0(%rsp)
+  movdqu %xmm15, 0xf0(%rsp)
   # Align stack frame.
   push %rbx  # non-scratch
   CFI_ADJUST_CFA_OFFSET(8)
@@ -196,22 +196,22 @@ ASM_SYMBOL(__tsan_report_race_thunk):
   pop %rbx
   CFI_ADJUST_CFA_OFFSET(-8)
   # Restore scratch registers.
-  vmovdqu 0x0(%rsp), %xmm0
-  vmovdqu 0x10(%rsp), %xmm1
-  vmovdqu 0x20(%rsp), %xmm2
-  vmovdqu 0x30(%rsp), %xmm3
-  vmovdqu 0x40(%rsp), %xmm4
-  vmovdqu 0x50(%rsp), %xmm5
-  vmovdqu 0x60(%rsp), %xmm6
-  vmovdqu 0x70(%rsp), %xmm7
-  vmovdqu 0x80(%rsp), %xmm8
-  vmovdqu 0x90(%rsp), %xmm9
-  vmovdqu 0xa0(%rsp), %xmm10
-  vmovdqu 0xb0(%rsp), %xmm11
-  vmovdqu 0xc0(%rsp), %xmm12
-  vmovdqu 0xd0(%rsp), %xmm13
-  vmovdqu 0xe0(%rsp), %xmm14
-  vmovdqu 0xf0(%rsp), %xmm15
+  movdqu 0x0(%rsp), %xmm0
+

Re: [PATCH] Avoid some -Wunreachable-code-ctrl


On 30/11/2021 14:25, Richard Biener wrote:

On Tue, 30 Nov 2021, Mikael Morin wrote:


On 29/11/2021 at 16:03, Richard Biener via Gcc-patches wrote:

diff --git a/gcc/fortran/frontend-passes.c b/gcc/fortran/frontend-passes.c
index f5ba7cecd54..16ee2afc9c0 100644
--- a/gcc/fortran/frontend-passes.c
+++ b/gcc/fortran/frontend-passes.c
@@ -5229,7 +5229,6 @@ gfc_expr_walker (gfc_expr **e, walk_expr_fn_t exprfn,
void *data)
  case EXPR_OP:
WALK_SUBEXPR ((*e)->value.op.op1);
WALK_SUBEXPR_TAIL ((*e)->value.op.op2);
-   break;
  case EXPR_FUNCTION:
for (a = (*e)->value.function.actual; a; a = a->next)
  WALK_SUBEXPR (a->expr);


I’m uncomfortable with the above change.
It makes it look like there is a fall through, but there is not.
Maybe inline the macro to make the continue explicit, or use WALK_SUBEXPR
instead of WALK_SUBEXPR_TAIL and hope the compiler will do the tail call
optimization.


Ah, it follows the style in tree.c:walk_tree_1 where break was used
inconsistently after WALK_SUBTREE_TAIL which was then more obvious
to me to clean up.  I didn't realize the fortran FE only had a
single WALK_SUBEXPR_TAIL.

I'm not sure inlining will make the situation more clear, for
sure using WALK_SUBEXPR would but it might lose the tailcall.

Would you accept an additional comment after WALK_SUBEXPR_TAIL like

   case EXPR_OP:
 WALK_SUBEXPR ((*e)->value.op.op1);
 WALK_SUBEXPR_TAIL ((*e)->value.op.op2);
 /* tail-recurse  */

My preference would be a gcc_unreachable() or something similar, but I 
understand it would get a warning as well?


Without a better idea, I’m fine with an even more explicit comment:

/* No fallthru because of the tail recursion above.  */


?  Btw, a fallthru would be diagnosed by GCC unless we put

 /* Fallthru  */

here.
Sure, but my main concern was misreading from programmers (including 
me), which is not diagnosed by compilers.



 Maybe renaming WALK_SUBEXPR_TAIL to WALK_SUBEXPR_WITH_CONTINUE
or WALK_SUBEXPR_BY_TAIL_RECURSING or WALK_SUBEXPR_TAILRECURSE would
be more obvious?


I think the comment above would be enough.

Thanks.

[PATCH] libcpp, v2: Fix up #__VA_OPT__ handling [PR103415]

On Mon, Nov 29, 2021 at 07:28:10PM -0500, Jason Merrill wrote:
> Please add some of this explanation to the "paste any tokens" comment in the
> code.

Ok.

> > + while (rhs->flags & PASTE_LEFT);
> > + if ((flags & PREV_WHITE)
> > + && (token->flags & PREV_WHITE) == 0)
> > +   const_cast<cpp_token *>(token)->flags
> > + |= PREV_WHITE;
> 
> Hmm, shouldn't paste_tokens handle copying PREV_WHITE?

Copying PREV_FALLTHROUGH there fixes the new Wimplicit-fallthrough-38.c
testcase.  I couldn't find a case where copying PREV_WHITE would
make an observable difference outside of __VA_OPT__, e.g.
#define F(x) #x
#define G(x) F(x)
#define H G({a##b)
#define I G({ a##b)
const char *h = H;
const char *i = I;
results in "{ab" and "{ ab" before/after the patch.  But copying it
in paste_tokens looks cleaner...
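
For anyone wanting to try the example above, it can be wrapped into a self-contained translation unit as sketched here. The variable names no_space and with_space are made up for the illustration; the values they hold are the ones reported in the message, not something re-derived here.

```cpp
#include <string>

#define F(x) #x
#define G(x) F(x)
// The paste happens while H and I are expanded; G then passes the
// result on to F, which stringifies it.
#define H G({a##b)
#define I G({ a##b)

// Per the message above, these are "{ab" and "{ ab" respectively.
const std::string no_space = H;
const std::string with_space = I;
```

The only difference between the two macros is the whitespace before the left operand of ##, which is what the PREV_WHITE flag records.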

2021-11-30  Jakub Jelinek  

PR preprocessor/103415
libcpp/
* macro.c (stringify_arg): Remove va_opt argument and va_opt handling.
(paste_tokens): On successful paste or in PREV_WHITE and
PREV_FALLTHROUGH flags from the *plhs token to the new token.
(replace_args): Adjust stringify_arg callers.  For #__VA_OPT__,
perform token pasting in a separate loop before stringify_arg call.
gcc/testsuite/
* c-c++-common/cpp/va-opt-8.c: New test.
* c-c++-common/Wimplicit-fallthrough-38.c: New test.

--- libcpp/macro.c.jj   2021-11-26 10:09:50.278020239 +0100
+++ libcpp/macro.c  2021-11-30 14:05:25.274132482 +0100
@@ -295,7 +295,7 @@ static cpp_context *next_context (cpp_re
 static const cpp_token *padding_token (cpp_reader *, const cpp_token *);
 static const cpp_token *new_string_token (cpp_reader *, uchar *, unsigned int);
 static const cpp_token *stringify_arg (cpp_reader *, const cpp_token **,
-  unsigned int, bool);
+  unsigned int);
 static void paste_all_tokens (cpp_reader *, const cpp_token *);
 static bool paste_tokens (cpp_reader *, location_t,
  const cpp_token **, const cpp_token *);
@@ -834,8 +834,7 @@ cpp_quote_string (uchar *dest, const uch
 /* Convert a token sequence FIRST to FIRST+COUNT-1 to a single string token
according to the rules of the ISO C #-operator.  */
 static const cpp_token *
-stringify_arg (cpp_reader *pfile, const cpp_token **first, unsigned int count,
-  bool va_opt)
+stringify_arg (cpp_reader *pfile, const cpp_token **first, unsigned int count)
 {
   unsigned char *dest;
   unsigned int i, escape_it, backslash_count = 0;
@@ -852,24 +851,6 @@ stringify_arg (cpp_reader *pfile, const
 {
   const cpp_token *token = first[i];
 
-  if (va_opt && (token->flags & PASTE_LEFT))
-   {
- location_t virt_loc = pfile->invocation_location;
- const cpp_token *rhs;
- do
-   {
- if (i == count)
-   abort ();
- rhs = first[++i];
- if (!paste_tokens (pfile, virt_loc, &token, rhs))
-   {
- --i;
- break;
-   }
-   }
- while (rhs->flags & PASTE_LEFT);
-   }
-
   if (token->type == CPP_PADDING)
{
  if (source == NULL
@@ -1003,6 +984,7 @@ paste_tokens (cpp_reader *pfile, locatio
   return false;
 }
 
+  lhs->flags |= (*plhs)->flags & (PREV_WHITE | PREV_FALLTHROUGH);
   *plhs = lhs;
   _cpp_pop_buffer (pfile);
   return true;
@@ -1945,8 +1927,7 @@ replace_args (cpp_reader *pfile, cpp_has
if (src->flags & STRINGIFY_ARG)
  {
if (!arg->stringified)
- arg->stringified = stringify_arg (pfile, arg->first, arg->count,
-   false);
+ arg->stringified = stringify_arg (pfile, arg->first, arg->count);
  }
else if ((src->flags & PASTE_LEFT)
 || (src != macro->exp.tokens && (src[-1].flags & PASTE_LEFT)))
@@ -2066,11 +2047,46 @@ replace_args (cpp_reader *pfile, cpp_has
{
  unsigned int count
= start ? paste_flag - start : tokens_buff_count (buff);
- const cpp_token *t
-   = stringify_arg (pfile,
-start ? start + 1
-: (const cpp_token **) (buff->base),
-count, true);
+ const cpp_token **first
+   = start ? start + 1
+   : (const cpp_token **) (buff->base);
+ unsigned int i, j;
+
+ /* Paste any tokens that need to be pasted before calling
+stringify_arg, because stringify_arg uses pfile->u_buff
+which paste_tokens can use as well.  */
+ for (i = 0, j = 0; i < count; i++, j++)
+   {
+

Re: [PATCH] simplify-rtx, v2: Punt on simplify_associative_operation with large operands [PR102356]

On Tue, 30 Nov 2021, Jakub Jelinek wrote:

> On Tue, Nov 30, 2021 at 10:43:28AM +0100, Richard Biener wrote:
> > I wonder given we now have 'simplify_context' whether we can
> > track a re-association budget we can eat from.  At least your
> > code to determine whether the expression is too large is
> > quadratic as well (but bound to 64, so just a very large constant
> > overhead for an outermost expression of size 63).  We already
> > have a mem_depth there,
> 
> Makes sense.
> 
> > so just have reassoc_times and punt
> > if that reaches --param max-simplify-reassoc-times, incrementing
> > it each time simplify_associative_operation is entered?
> 
> Though, is a --param worth it?  There is IMO no way the 64 limit
> can trigger for non-debug insns (I can certainly gather how many times
> it triggers when > 20 and in which pass during bootstrap/regtest
> to verify).

Probably not - but maybe use a (static) const unsigned int max_assoc_count
in the class then?
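
A rough sketch of what such a class-level constant could look like, outside of GCC. The class name simplify_budget_sketch and the method try_reassociate are hypothetical; only assoc_count, max_assoc_count and the limit of 64 come from the discussion.

```cpp
// Stand-in for the relevant part of simplify_context (hypothetical).
class simplify_budget_sketch
{
public:
  // Upper bound on reassociation attempts per outermost simplify call.
  static const unsigned int max_assoc_count = 64;

  // Number of attempts made so far.
  unsigned int assoc_count = 0;

  // Returns false once the budget is used up, mirroring the
  // "if (++assoc_count >= 64) return NULL_RTX;" check in the patch.
  bool try_reassociate()
  {
    if (++assoc_count >= max_assoc_count)
      return false;
    return true;
  }
};
```

With the pre-increment and >= comparison, the first 63 attempts succeed and every later one is refused.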

OK either way I guess.

Thanks,
Richard.

> 2021-11-30  Jakub Jelinek  
> 
>   PR rtl-optimization/102356
>   * rtl.h (simplify_context): Add assoc_count member.
>   * simplify-rtx.c (simplify_associative_operation): Don't reassociate
>   more than 64 times within one outermost simplify_* call.
>   * dwarf2out.c (mem_loc_descriptor): Optimize binary operation
>   with both operands the same using DW_OP_dup.
> 
>   * gcc.dg/pr102356.c: New test.
> 
> --- gcc/rtl.h.jj  2021-11-02 09:06:05.904396581 +0100
> +++ gcc/rtl.h 2021-11-30 14:55:39.701257736 +0100
> @@ -3433,6 +3433,10 @@ public:
>   inside a MEM than outside.  */
>unsigned int mem_depth = 0;
>  
> +  /* Tracks number of simplify_associative_operation calls performed during
> + outermost simplify* call.  */
> +  unsigned int assoc_count = 0;
> +
>  private:
>rtx simplify_truncation (machine_mode, rtx, machine_mode);
>rtx simplify_byte_swapping_operation (rtx_code, machine_mode, rtx, rtx);
> --- gcc/simplify-rtx.c.jj 2021-11-30 09:44:46.619606170 +0100
> +++ gcc/simplify-rtx.c2021-11-30 14:59:00.251321577 +0100
> @@ -2263,6 +2263,16 @@ simplify_context::simplify_associative_o
>  {
>rtx tem;
>  
> +  /* Normally expressions simplified by simplify-rtx.c are combined
> + at most from a few machine instructions and therefore the
> + expressions should be fairly small.  During var-tracking
> + we can see arbitrarily large expressions though and reassociating
> + those can be quadratic, so punt after encountering 64
> + simplify_associative_operation calls during outermost simplify_*
> + call.  */
> +  if (++assoc_count >= 64)
> +return NULL_RTX;
> +
>/* Linearize the operator to the left.  */
>if (GET_CODE (op1) == code)
>  {
> --- gcc/dwarf2out.c.jj2021-11-30 09:44:46.568606908 +0100
> +++ gcc/dwarf2out.c   2021-11-30 14:53:28.779174490 +0100
> @@ -16363,6 +16363,15 @@ mem_loc_descriptor (rtx rtl, machine_mod
>  do_binop:
>op0 = mem_loc_descriptor (XEXP (rtl, 0), mode, mem_mode,
>   VAR_INIT_STATUS_INITIALIZED);
> +  if (XEXP (rtl, 0) == XEXP (rtl, 1))
> + {
> +   if (op0 == 0)
> + break;
> +   mem_loc_result = op0;
> +   add_loc_descr (&mem_loc_result, new_loc_descr (DW_OP_dup, 0, 0));
> +   add_loc_descr (&mem_loc_result, new_loc_descr (op, 0, 0));
> +   break;
> + }
>op1 = mem_loc_descriptor (XEXP (rtl, 1), mode, mem_mode,
>   VAR_INIT_STATUS_INITIALIZED);
>  
> --- gcc/testsuite/gcc.dg/pr102356.c.jj2021-11-30 14:53:28.779174490 
> +0100
> +++ gcc/testsuite/gcc.dg/pr102356.c   2021-11-30 14:53:28.779174490 +0100
> @@ -0,0 +1,33 @@
> +/* PR rtl-optimization/102356 */
> +/* { dg-do compile { target int32plus } } */
> +/* { dg-options "-O3 -g" } */
> +
> +signed char a = 0;
> +unsigned char b = 9;
> +unsigned long long c = 0xF1FBFC17225F7A57ULL;
> +int d = 0x3A6667C6;
> +
> +unsigned char
> +foo (unsigned int x)
> +{
> +  unsigned int *e = &x;
> +  if ((c /= ((0 * (*e *= b)) <= 0)))
> +;
> +  for (d = 9; d > 2; d -= 2)
> +{
> +  c = -2;
> +  do
> + if ((*e *= *e))
> +   {
> + a = 4;
> + do
> +   {
> + a -= 3;
> + if ((*e *= *e))
> +   b = 9;
> +   }
> + while (a > 2);
> +   }
> +  while (c++);
> +}
> +}
> 
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)

[PATCH] simplify-rtx, v2: Punt on simplify_associative_operation with large operands [PR102356]

On Tue, Nov 30, 2021 at 10:43:28AM +0100, Richard Biener wrote:
> I wonder given we now have 'simplify_context' whether we can
> track a re-association budget we can eat from.  At least your
> code to determine whether the expression is too large is
> quadratic as well (but bound to 64, so just a very large constant
> overhead for an outermost expression of size 63).  We already
> have a mem_depth there,

Makes sense.

> so just have reassoc_times and punt
> if that reaches --param max-simplify-reassoc-times, incrementing
> it each time simplify_associative_operation is entered?

Though, is a --param worth it?  There is IMO no way the 64 limit
can trigger for non-debug insns (I can certainly gather how many times
it triggers when > 20 and in which pass during bootstrap/regtest
to verify).

2021-11-30  Jakub Jelinek  

PR rtl-optimization/102356
* rtl.h (simplify_context): Add assoc_count member.
* simplify-rtx.c (simplify_associative_operation): Don't reassociate
more than 64 times within one outermost simplify_* call.
* dwarf2out.c (mem_loc_descriptor): Optimize binary operation
with both operands the same using DW_OP_dup.

* gcc.dg/pr102356.c: New test.
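
To illustrate why the DW_OP_dup change in dwarf2out.c is value-preserving,
here is a toy stack evaluator.  It is a sketch only: the `OP_*` opcodes,
`struct insn` and `eval_stack` are hypothetical and merely imitate the
stack discipline of DWARF location expressions.  When both operands are the
same, the operand sequence is emitted once and duplicated on the stack
instead of being emitted twice.

```c
#include <assert.h>
#include <stddef.h>

enum op { OP_PUSH, OP_DUP, OP_MUL };

struct insn
{
  enum op op;
  long arg;			/* Only used by OP_PUSH.  */
};

#define STACK_MAX 16

/* Evaluate PROG (N instructions) and return the single remaining value.  */
long
eval_stack (const struct insn *prog, size_t n)
{
  long stack[STACK_MAX];
  size_t sp = 0;

  for (size_t i = 0; i < n; i++)
    switch (prog[i].op)
      {
      case OP_PUSH:
	assert (sp < STACK_MAX);
	stack[sp++] = prog[i].arg;
	break;
      case OP_DUP:		/* Copy the top of the stack.  */
	assert (sp >= 1 && sp < STACK_MAX);
	stack[sp] = stack[sp - 1];
	sp++;
	break;
      case OP_MUL:		/* Pop two entries, push their product.  */
	assert (sp >= 2);
	stack[sp - 2] *= stack[sp - 1];
	sp--;
	break;
      }
  assert (sp == 1);
  return stack[0];
}
```

The programs "push 7, push 7, mul" and "push 7, dup, mul" leave the same
value on the stack; the second one encodes the operand only once.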

--- gcc/rtl.h.jj2021-11-02 09:06:05.904396581 +0100
+++ gcc/rtl.h   2021-11-30 14:55:39.701257736 +0100
@@ -3433,6 +3433,10 @@ public:
  inside a MEM than outside.  */
   unsigned int mem_depth = 0;
 
+  /* Tracks number of simplify_associative_operation calls performed during
+ outermost simplify* call.  */
+  unsigned int assoc_count = 0;
+
 private:
   rtx simplify_truncation (machine_mode, rtx, machine_mode);
   rtx simplify_byte_swapping_operation (rtx_code, machine_mode, rtx, rtx);
--- gcc/simplify-rtx.c.jj   2021-11-30 09:44:46.619606170 +0100
+++ gcc/simplify-rtx.c  2021-11-30 14:59:00.251321577 +0100
@@ -2263,6 +2263,16 @@ simplify_context::simplify_associative_o
 {
   rtx tem;
 
+  /* Normally expressions simplified by simplify-rtx.c are combined
+ at most from a few machine instructions and therefore the
+ expressions should be fairly small.  During var-tracking
+ we can see arbitrarily large expressions though and reassociating
+ those can be quadratic, so punt after encountering 64
+ simplify_associative_operation calls during outermost simplify_*
+ call.  */
+  if (++assoc_count >= 64)
+return NULL_RTX;
+
   /* Linearize the operator to the left.  */
   if (GET_CODE (op1) == code)
 {
--- gcc/dwarf2out.c.jj  2021-11-30 09:44:46.568606908 +0100
+++ gcc/dwarf2out.c 2021-11-30 14:53:28.779174490 +0100
@@ -16363,6 +16363,15 @@ mem_loc_descriptor (rtx rtl, machine_mod
 do_binop:
   op0 = mem_loc_descriptor (XEXP (rtl, 0), mode, mem_mode,
VAR_INIT_STATUS_INITIALIZED);
+  if (XEXP (rtl, 0) == XEXP (rtl, 1))
+   {
+ if (op0 == 0)
+   break;
+ mem_loc_result = op0;
+ add_loc_descr (&mem_loc_result, new_loc_descr (DW_OP_dup, 0, 0));
+ add_loc_descr (&mem_loc_result, new_loc_descr (op, 0, 0));
+ break;
+   }
   op1 = mem_loc_descriptor (XEXP (rtl, 1), mode, mem_mode,
VAR_INIT_STATUS_INITIALIZED);
 
--- gcc/testsuite/gcc.dg/pr102356.c.jj  2021-11-30 14:53:28.779174490 +0100
+++ gcc/testsuite/gcc.dg/pr102356.c 2021-11-30 14:53:28.779174490 +0100
@@ -0,0 +1,33 @@
+/* PR rtl-optimization/102356 */
+/* { dg-do compile { target int32plus } } */
+/* { dg-options "-O3 -g" } */
+
+signed char a = 0;
+unsigned char b = 9;
+unsigned long long c = 0xF1FBFC17225F7A57ULL;
+int d = 0x3A6667C6;
+
+unsigned char
+foo (unsigned int x)
+{
+  unsigned int *e = &x;
+  if ((c /= ((0 * (*e *= b)) <= 0)))
+;
+  for (d = 9; d > 2; d -= 2)
+{
+  c = -2;
+  do
+   if ((*e *= *e))
+ {
+   a = 4;
+   do
+ {
+   a -= 3;
+   if ((*e *= *e))
+ b = 9;
+ }
+   while (a > 2);
+ }
+  while (c++);
+}
+}


Jakub

[PATCH] tree-optimization/103489 - fix ICE when bool pattern recog fails

bool pattern recog currently does not handle cycles correctly
and when it fails we can ICE later vectorizing PHIs with
mismatched bool and non-bool vector types.  The following avoids
blindly trusting bool pattern recog here and verifies things
more thoroughly in vectorizable_phi.  A bool pattern recog fix
is for GCC 13.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-11-30  Richard Biener  

PR tree-optimization/103489
* tree-vect-loop.c (vectorizable_phi): Verify argument
vector type compatibility to mitigate bool pattern recog
bug.

* gcc.dg/torture/pr103489.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr103489.c | 12 
 gcc/tree-vect-loop.c| 18 ++
 2 files changed, 30 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr103489.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr103489.c b/gcc/testsuite/gcc.dg/torture/pr103489.c
new file mode 100644
index 000..cd62623ece2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr103489.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-ftree-vectorize" } */
+
+_Bool a[80];
+short b, f;
+void g(short h[][8][16])
+{
+  for (_Bool c = 0; c < b;)
+for (_Bool d = 0; d < (_Bool)f; d = 1)
+  for (short e = 0; e < 16; e++)
+a[e] = h[b][1][e];
+}
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 841da78f1fd..7f544ba1fd5 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -7846,6 +7846,24 @@ vectorizable_phi (vec_info *,
   "incompatible vector types for invariants\n");
return false;
  }
+   else if (SLP_TREE_DEF_TYPE (child) == vect_internal_def
+&& !useless_type_conversion_p (vectype,
+   SLP_TREE_VECTYPE (child)))
+ {
+   /* With bools we can have mask and non-mask precision vectors,
+  while pattern recog is supposed to guarantee consistency here
+  bugs in it can cause mismatches (PR103489 for example).
+  Deal with them here instead of ICEing later.  */
+   if (dump_enabled_p ())
+ dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+  "incompatible vector type setup from "
+  "bool pattern detection\n");
+   gcc_checking_assert
+ (VECTOR_BOOLEAN_TYPE_P (SLP_TREE_VECTYPE (child))
+  != VECTOR_BOOLEAN_TYPE_P (vectype));
+   return false;
+ }
+
   /* For single-argument PHIs assume coalescing which means zero cost
 for the scalar and the vector PHIs.  This avoids artificially
 favoring the vector path (but may pessimize it in some cases).  */
-- 
2.31.1

Re: [PATCH 1v2/3][vect] Add main vectorized loop unrolling

On Tue, 30 Nov 2021, Andre Vieira (lists) wrote:

> 
> On 25/11/2021 12:46, Richard Biener wrote:
> > Oops, my fault, yes, it does.  I would suggest to refactor things so
> > that the mode_i = first_loop_i case is there only once.  I also wonder
> > if all the argument about starting at 0 doesn't apply to the
> > not unrolled LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P as well?  So
> > what's the reason to differ here?  So in the end I'd just change
> > the existing
> >
> >if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
> >  {
> >
> > to
> >
> >if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo)
> >|| first_loop_vinfo->suggested_unroll_factor > 1)
> >  {
> >
> > and maybe revisit this when we have an actual testcase showing that
> > doing sth else has a positive effect?
> >
> > Thanks,
> > Richard.
> 
> So I had a quick chat with Richard Sandiford and he is suggesting resetting
> mode_i to 0 for all cases.
> 
> He pointed out that for some tunings the SVE mode might come after the NEON
> mode, which means that even for not-unrolled loop_vinfos we could end up with
> a suboptimal choice of mode for the epilogue. I.e. it could be that we pick
> V16QI for main vectorization, but that's VNx16QI + 1 in the array, so we'd not
> try VNx16QI for the epilogue.
> 
> This would simplify the mode selection, by simply restarting mode_i at 0
> in all epilogue cases. Is that something you'd be OK with?

Works for me with an updated comment.  Even better with showing a
testcase exercising such tuning.
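
The ordering problem described above can be shown with a toy model.  This
is a sketch under assumptions, not GCC's actual mode iteration:
`first_usable` is a hypothetical helper and the `usable` array stands for
"this candidate mode works for the epilogue".  If the candidate list is
{ VNx16QI, V16QI, ... } and the main loop picked V16QI at index 1, a search
that continues after the main loop's index never looks at index 0.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the first candidate at or after START that is
   usable, or N if there is none.  */
size_t
first_usable (const int *usable, size_t n, size_t start)
{
  assert (usable != NULL);
  for (size_t i = start; i < n; i++)
    if (usable[i])
      return i;
  return n;
}
```

With `usable = { 1, 0, 0 }`, starting at index 2 finds nothing, while
restarting at 0 finds the earlier entry.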

Richard.

[PATCH][pushed] Change if-to-switch-conversion test.

2021-11-30 Thread Martin Liška


Small update of the test-case, approved by Richi.

Martin

PR tree-optimization/103278

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/if-to-switch-5.c: Make the test acceptable by
targets with no jump-tables.
---
 gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-5.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-5.c b/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-5.c
index ceeae908821..54771e64e59 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-5.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-5.c
@@ -4,8 +4,8 @@
 int crud (unsigned char c)
 {
   return (((((((((((int) c == 46) || (int) c == 44)
-|| (int) c == 58) || (int) c == 59) || (int) c == 60)
- || (int) c == 62) || (int) c == 34) || (int) c == 92)
+|| (int) c == 58) || (int) c == 60) || (int) c == 62)
+ || (int) c == 64) || (int) c == 34) || (int) c == 92)
   || (int) c == 39) != 0);
 }
 
--

2.34.0

Re: [PATCH] Modify combine pattern by anding a pseudo with its nonzero bits

2021-11-30 Thread David Edelsohn via Gcc-patches

On Tue, Nov 30, 2021 at 3:46 AM HAO CHEN GUI  wrote:
>
> Hi,
>
> This patch modifies the combine pattern with a helper - 
> change_pseudo_and_mask when recog fails. The helper converts a single pseudo 
> to the pseudo and with a mask if the outer operator is IOR/XOR/PLUS and the 
> inner operator is ASHIFT/LSHIFTRT/AND. The conversion helps match shift + ior 
> pattern.
>
> Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. 
> Is this okay for trunk? Any recommendations? Thanks a lot.
>
> ChangeLog
>
> 2021-11-30 Haochen Gui 
>
> gcc/
> * combine.c (change_pseudo_and_mask): New.
> (recog_for_combine): If recog fails, try again with the pattern
> modified by change_pseudo_and_mask.
>
> gcc/testsuite/
> * gcc.target/powerpc/20050603-3.c: Modify the dump check conditions.
> * gcc.target/powerpc/rlwimi-2.c: Likewise.
>
> patch.diff
>
> diff --git a/gcc/combine.c b/gcc/combine.c
> index 03e9a780919..c83c0aceb57 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -11539,6 +11539,42 @@ change_zero_ext (rtx pat)
>return changed;
>  }
>
> +/* When the outer code of set_src is IOR/XOR/PLUS and the inner code is
> +   ASHIFT/LSHIFTRT/AND, convert a psuedo to psuedo AND with a mask if its

^^^ spelling mistake in comment: pseudo not psuedo

Thanks, David

> +   nonzero_bits is less than its mode mask.  */
> +static bool
> +change_pseudo_and_mask (rtx pat)
> +{
> +  bool changed = false;
> +
> +  rtx src = SET_SRC (pat);
> +  if ((GET_CODE (src) == IOR
> +   || GET_CODE (src) == XOR
> +   || GET_CODE (src) == PLUS)
> +  && (((GET_CODE (XEXP (src, 0)) == ASHIFT
> +   || GET_CODE (XEXP (src, 0)) == LSHIFTRT
> +   || GET_CODE (XEXP (src, 0)) == AND)
> +  && REG_P (XEXP (src, 1)))
> + || ((GET_CODE (XEXP (src, 1)) == ASHIFT
> +  || GET_CODE (XEXP (src, 1)) == LSHIFTRT
> +  || GET_CODE (XEXP (src, 1)) == AND)
> + && REG_P (XEXP (src, 0)))))
> +{
> +  rtx *reg = REG_P (XEXP (src, 0))
> +? &XEXP (SET_SRC (pat), 0)
> +: &XEXP (SET_SRC (pat), 1);
> +  machine_mode mode = GET_MODE (*reg);
> +  unsigned HOST_WIDE_INT nonzero = nonzero_bits (*reg, mode);
> +  if (nonzero < GET_MODE_MASK (mode))
> +   {
> + rtx x = gen_rtx_AND (mode, *reg, GEN_INT (nonzero));
> + SUBST (*reg, x);
> + changed = true;
> +   }
> + }
> +  return changed;
> +}
> +
>  /* Like recog, but we receive the address of a pointer to a new pattern.
> We try to match the rtx that the pointer points to.
> If that fails, we may try to modify or replace the pattern,
> @@ -11586,7 +11622,14 @@ recog_for_combine (rtx *pnewpat, rtx_insn *insn, rtx 
> *pnotes)
> }
> }
>else
> -   changed = change_zero_ext (pat);
> +   {
> + if (change_pseudo_and_mask (pat))
> +   {
> + maybe_swap_commutative_operands (SET_SRC (pat));
> + changed = true;
> +   }
> + changed |= change_zero_ext (pat);
> +   }
>  }
>else if (GET_CODE (pat) == PARALLEL)
>  {
> diff --git a/gcc/testsuite/gcc.target/powerpc/20050603-3.c 
> b/gcc/testsuite/gcc.target/powerpc/20050603-3.c
> index 4017d34f429..e628be11532 100644
> --- a/gcc/testsuite/gcc.target/powerpc/20050603-3.c
> +++ b/gcc/testsuite/gcc.target/powerpc/20050603-3.c
> @@ -12,7 +12,7 @@ void rotins (unsigned int x)
>b.y = (x<<12) | (x>>20);
>  }
>
> -/* { dg-final { scan-assembler-not {\mrlwinm} } } */
> +/* { dg-final { scan-assembler-not {\mrlwinm} { target ilp32 } } } */
>  /* { dg-final { scan-assembler-not {\mrldic} } } */
>  /* { dg-final { scan-assembler-not {\mrot[lr]} } } */
>  /* { dg-final { scan-assembler-not {\ms[lr][wd]} } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c 
> b/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
> index bafa371db73..ffb5f9e450f 100644
> --- a/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
> @@ -2,14 +2,14 @@
>  /* { dg-options "-O2" } */
>
>  /* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 14121 { target ilp32 } 
> } } */
> -/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 20217 { target lp64 } } 
> } */
> +/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 21279 { target lp64 } } 
> } */
>  /* { dg-final { scan-assembler-times {(?n)^\s+blr} 6750 } } */
>  /* { dg-final { scan-assembler-times {(?n)^\s+mr} 643 { target ilp32 } } } */
>  /* { dg-final { scan-assembler-times {(?n)^\s+mr} 11 { target lp64 } } } */
>  /* { dg-final { scan-assembler-times {(?n)^\s+rldicl} 7790 { target lp64 } } 
> } */
>
>  /* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1692 { target ilp32 } 
> } } */
> -/* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1666 { target lp64 } } 
> } */
> +/* { dg-final { scan-assembler-times {(?n)^\s+rlwimi} 1692 { target lp64 } } 
> } */
>
>  /* { dg-final {

Re: [PATCH] libcpp: Implement -Wbidi-chars for CVE-2021-42574 [PR103026]

2021-11-30 Thread Marek Polacek via Gcc-patches

On Tue, Nov 30, 2021 at 09:38:57AM +0100, Stephan Bergmann wrote:
> On 15/11/2021 18:28, Marek Polacek via Gcc-patches wrote:
> > On Mon, Nov 08, 2021 at 04:33:43PM -0500, Marek Polacek wrote:
> > > Ping, can we conclude on the name?   IMHO, -Wbidirectional is just fine,
> > > but changing the name is a trivial operation.
> > 
> > Here's a patch with a better name (suggested by Jonathan W.).  Otherwise no
> > changes.
> > 
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > 
> > -- >8 --
> >  From a link below:
> > "An issue was discovered in the Bidirectional Algorithm in the Unicode
> > Specification through 14.0. It permits the visual reordering of
> > characters via control sequences, which can be used to craft source code
> > that renders different logic than the logical ordering of tokens
> > ingested by compilers and interpreters. Adversaries can leverage this to
> > encode source code for compilers accepting Unicode such that targeted
> > vulnerabilities are introduced invisibly to human reviewers."
> > 
> > More info:
> > https://nvd.nist.gov/vuln/detail/CVE-2021-42574
> > https://trojansource.codes/
> > 
> > This is not a compiler bug.  However, to mitigate the problem, this patch
> > implements -Wbidi-chars=[none|unpaired|any] to warn about possibly
> > misleading Unicode bidirectional characters the preprocessor may encounter.
> > 
> > The default is =unpaired, which warns about improperly terminated
> > bidirectional characters; e.g. a LRE without its appertaining PDF.  The
> > level =any warns about any use of bidirectional characters.
> > 
> > This patch handles both UCNs and UTF-8 characters.  UCNs designating
> > bidi characters in identifiers are accepted since r204886.  Then r217144
> > enabled -fextended-identifiers by default.  Extended characters in C/C++
> > identifiers have been accepted since r275979.  However, this patch still
> > warns about mixing UTF-8 and UCN bidi characters; there seems to be no
> > good reason to allow mixing them.
> 
> I wonder what the rationale is to warn about UCNs, like in
> 
> > aText = u"\u202D" + aText;
> 
> (as found in the LibreOffice source code).

Is this line mixing a UCN and a UTF-8 character?  Or is it just that you're
prepending a LRO to aText?  We warn because the LRO is not "closed"
in the context of its string literal, which was part of the Trojan
Source attack.  So "\u202D ... \u202C" would not warn.

I'm not sure what workaround I could offer.  Maybe provide an option not to
warn about UCNs at all, though even that is potentially dangerous -- while
you can see UCNs in the source code, if you print strings containing them,
they won't be visible anymore.
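
The "closed in the context of its string literal" rule can be illustrated
with a small stand-alone checker.  This is a simplified sketch and not
libcpp's implementation: `bidi_unpaired` is a hypothetical name, the input
is assumed to be already-decoded code points of one literal, and only
embeddings/overrides (LRE U+202A, RLE U+202B, LRO U+202D, RLO U+202E,
closed by PDF U+202C) and isolates (LRI U+2066, RLI U+2067, FSI U+2068,
closed by PDI U+2069) are tracked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DEPTH 64	/* Arbitrary nesting cap for this sketch; deeper
			   nesting is silently ignored.  */

/* Return 1 if CP[0..N-1] leaves a bidi control open at the end.  */
int
bidi_unpaired (const uint32_t *cp, size_t n)
{
  char stack[MAX_DEPTH];	/* 'e' = embedding/override, 'i' = isolate.  */
  size_t depth = 0;

  assert (cp != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    switch (cp[i])
      {
      case 0x202A: case 0x202B: case 0x202D: case 0x202E:
	if (depth < MAX_DEPTH)
	  stack[depth++] = 'e';
	break;
      case 0x2066: case 0x2067: case 0x2068:
	if (depth < MAX_DEPTH)
	  stack[depth++] = 'i';
	break;
      case 0x202C:		/* PDF closes the innermost embedding.  */
	if (depth > 0 && stack[depth - 1] == 'e')
	  depth--;
	break;
      case 0x2069:		/* PDI closes the innermost isolate and
				   anything opened after it.  */
	{
	  size_t j = depth;
	  while (j > 0 && stack[j - 1] != 'i')
	    j--;
	  if (j > 0)
	    depth = j - 1;
	}
	break;
      default:
	break;
      }
  return depth != 0;
}
```

A literal containing only U+202D is reported as unpaired, matching the
LibreOffice line quoted above, while U+202D ... U+202C is not.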

Marek

Re: [PATCH] Avoid some -Wunreachable-code-ctrl

On Tue, 30 Nov 2021, Mikael Morin wrote:

> On 29/11/2021 at 16:03, Richard Biener via Gcc-patches wrote:
> > diff --git a/gcc/fortran/frontend-passes.c b/gcc/fortran/frontend-passes.c
> > index f5ba7cecd54..16ee2afc9c0 100644
> > --- a/gcc/fortran/frontend-passes.c
> > +++ b/gcc/fortran/frontend-passes.c
> > @@ -5229,7 +5229,6 @@ gfc_expr_walker (gfc_expr **e, walk_expr_fn_t exprfn,
> > void *data)
> >  case EXPR_OP:
> >WALK_SUBEXPR ((*e)->value.op.op1);
> >WALK_SUBEXPR_TAIL ((*e)->value.op.op2);
> > -   break;
> >  case EXPR_FUNCTION:
> >for (a = (*e)->value.function.actual; a; a = a->next)
> >  WALK_SUBEXPR (a->expr);
> 
> I’m uncomfortable with the above change.
> It makes it look like there is a fall through, but there is not.
> Maybe inline the macro to make the continue explicit, or use WALK_SUBEXPR
> instead of WALK_SUBEXPR_TAIL and hope the compiler will do the tail call
> optimization.

Ah, it follows the style in tree.c:walk_tree_1 where break was used
inconsistently after WALK_SUBTREE_TAIL which was then more obvious
to me to clean up.  I didn't realize the Fortran FE only had a
single WALK_SUBEXPR_TAIL.

I'm not sure inlining will make the situation clearer; using
WALK_SUBEXPR certainly would, but it might lose the tail call.

Would you accept an additional comment after WALK_SUBEXPR_TAIL like

  case EXPR_OP:
WALK_SUBEXPR ((*e)->value.op.op1);
WALK_SUBEXPR_TAIL ((*e)->value.op.op2);
/* tail-recurse  */

?  Btw, a fallthru would be diagnosed by GCC unless we put

/* Fallthru  */

here.  Maybe renaming WALK_SUBEXPR_TAIL to WALK_SUBEXPR_WITH_CONTINUE
or WALK_SUBEXPR_BY_TAIL_RECURSING or WALK_SUBEXPR_TAILRECURSE would
be more obvious?
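
For readers without the source at hand, the control flow in question can
be reproduced in a self-contained sketch.  The macros `WALK_CHILD` and
`WALK_CHILD_TAIL`, the node type and `count_nodes` are hypothetical; they
only imitate the pattern of a walker macro that ends in `continue`.
Because of that `continue`, nothing after the tail macro is reached, so
there is no fall through into the next case even without a `break`.

```c
#include <assert.h>
#include <stddef.h>

enum kind { K_LEAF, K_NEG, K_OP };

struct node
{
  enum kind kind;
  const struct node *op1, *op2;
};

size_t count_nodes (const struct node *n);

/* Hypothetical macros modelled on the pattern discussed above.  */
#define WALK_CHILD(N) count += count_nodes (N)
/* Tail form: iterate instead of recursing.  It ends in `continue', so
   control never reaches whatever follows the macro invocation.  */
#define WALK_CHILD_TAIL(N) n = (N); continue

/* Count the nodes of the expression tree rooted at N.  */
size_t
count_nodes (const struct node *n)
{
  size_t count = 0;

  while (n != NULL)
    {
      count++;
      switch (n->kind)
	{
	case K_OP:
	  assert (n->op1 != NULL && n->op2 != NULL);
	  WALK_CHILD (n->op1);
	  WALK_CHILD_TAIL (n->op2);
	  /* tail-recurse: not a fall through into K_NEG.  */
	case K_NEG:
	  WALK_CHILD_TAIL (n->op1);
	case K_LEAF:
	  return count;
	}
    }
  return count;
}
```

Walking op (leaf, neg (leaf)) visits each of the four nodes exactly once.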

Thanks,
Richard.

[committed] libstdc++: Use gender-agnostic pronoun in docs

I've pushed this change for the libstdc++ docs (should be "their"), but
didn't notice the typo in the changelog, so I'll fix that tomorrow after
the file is regenerated.



libstdc++-v3/ChangeLog:

* doc/xml/manual/debug_mode.xml: Replace "his or her" with "they".
* doc/html/manual/debug_mode_design.html: Regenerate.
---
 libstdc++-v3/doc/html/manual/debug_mode_design.html | 10 +-
 libstdc++-v3/doc/xml/manual/debug_mode.xml  | 10 +-
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/libstdc++-v3/doc/xml/manual/debug_mode.xml b/libstdc++-v3/doc/xml/manual/debug_mode.xml
index dbd5c2b7775..988c4a93601 100644
--- a/libstdc++-v3/doc/xml/manual/debug_mode.xml
+++ b/libstdc++-v3/doc/xml/manual/debug_mode.xml
@@ -393,14 +393,14 @@ That alias is deprecated and may be removed in a future release.
 less recompilation) but are more complicated to implement than
 the lower-numbered conformance levels.
   
-   Full recompilation: The user must 
recompile his or
-   her entire application and all C++ libraries it depends on,
+   Full recompilation: The user must 
recompile
+   their entire application and all C++ libraries it depends on,
including the C++ standard library that ships with the
compiler. This must be done even if only a small part of the
program can use debugging features.
 
Full user recompilation: The user 
must recompile
-   his or her entire application and all C++ libraries it depends
+   their entire application and all C++ libraries it depends
on, but not the C++ standard library itself. This must be done
even if only a small part of the program can use debugging
features. This can be achieved given a full recompilation
@@ -409,7 +409,7 @@ That alias is deprecated and may be removed in a future release.
one, e.g., a multilibs approach.
 
Partial recompilation: The user 
must recompile the
-   parts of his or her application and the C++ libraries it
+   parts of their application and the C++ libraries it
depends on that will use the debugging facilities
directly. This means that any code that uses the debuggable
standard containers would need to be recompiled, but code
@@ -417,7 +417,7 @@ That alias is deprecated and may be removed in a future release.
would not have to be recompiled.
 
Per-use recompilation: The user 
must recompile the
-   parts of his or her application and the C++ libraries it
+   parts of their application and the C++ libraries it
depends on where debugging should occur, and any other code
that interacts with those containers. This means that a set of
translation units that accesses a particular standard
-- 
2.31.1

Re: [PATCH] Avoid some -Wunreachable-code-ctrl


On 29/11/2021 at 16:03, Richard Biener via Gcc-patches wrote:

diff --git a/gcc/fortran/frontend-passes.c b/gcc/fortran/frontend-passes.c
index f5ba7cecd54..16ee2afc9c0 100644
--- a/gcc/fortran/frontend-passes.c
+++ b/gcc/fortran/frontend-passes.c
@@ -5229,7 +5229,6 @@ gfc_expr_walker (gfc_expr **e, walk_expr_fn_t exprfn, void *data)
  case EXPR_OP:
WALK_SUBEXPR ((*e)->value.op.op1);
WALK_SUBEXPR_TAIL ((*e)->value.op.op2);
-   break;
  case EXPR_FUNCTION:
for (a = (*e)->value.function.actual; a; a = a->next)
  WALK_SUBEXPR (a->expr);


I’m uncomfortable with the above change.
It makes it look like there is a fall through, but there is not.
Maybe inline the macro to make the continue explicit, or use 
WALK_SUBEXPR instead of WALK_SUBEXPR_TAIL and hope the compiler will do 
the tail call optimization.


Mikael

gender-agnostic pronouns

2021-11-30 Thread Nathan Sidwell

I've committed this change to use gender-agnostic pronouns in the
non-historical web documents.


and if you're upset that Those Are Plural!, assemble this URL and watch
youtube  /watch?v=46ehrFk-gLk&t=87s at about the 2 minute mark


nathan
--
Nathan Sidwell

From b5a0f250f0f05364a51c331d040d78bf15057884 Mon Sep 17 00:00:00 2001
From: Nathan Sidwell 
Date: Tue, 30 Nov 2021 07:12:44 -0500
Subject: [PATCH] Use gender-agnostic pronouns

Use they/them/their in non-historical documents
---
 htdocs/bugs/management.html | 6 +++---
 htdocs/contribute.html  | 2 +-
 htdocs/develop.html | 2 +-
 htdocs/fortran/index.html   | 4 ++--
 htdocs/gitwrite.html| 2 +-
 5 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/htdocs/bugs/management.html b/htdocs/bugs/management.html
index 18fee991..97ef8299 100644
--- a/htdocs/bugs/management.html
+++ b/htdocs/bugs/management.html
@@ -203,7 +203,7 @@ fixing (the rationale is that a patch will have to go to the newest
 release branch before any other release branch).
 The priority of a regression should initially be set to P3.
 The milestone and the priority can
-be changed by the release manager and his/her delegates.
+be changed by the release manager and their delegates.
 
 If a patch fixing a PR has been submitted, a link
 to the message with the patch should be added to the PR, as well as the
@@ -224,8 +224,8 @@ release versions) should get "minor" severity and the additional keyword
 
 Bugs in component "bootstrap" that refer to older
 releases or snapshots/CVS versions should be put into state "WAITING",
-asking the reporter whether she can still reproduce the problem and to
-report her findings in any case (whether positive or negative).
+asking the reporter whether they can still reproduce the problem and to
+report their findings in any case (whether positive or negative).
 
 
 If the response is "works now", close the report,
diff --git a/htdocs/contribute.html b/htdocs/contribute.html
index 423ce9de..c0223738 100644
--- a/htdocs/contribute.html
+++ b/htdocs/contribute.html
@@ -397,7 +397,7 @@ to point out lack of write access in your initial submission, too.
 
 Announcing Changes (to our Users)
 
-Everything that requires a user to edit his Makefiles or his source code
+Everything that requires a user to edit their Makefiles or source code
 is a good candidate for being mentioned in the release notes.
 
 Larger accomplishments, either as part of a specific project, or long
diff --git a/htdocs/develop.html b/htdocs/develop.html
index 4b1f9468..9880ad42 100644
--- a/htdocs/develop.html
+++ b/htdocs/develop.html
@@ -60,7 +60,7 @@ branch in the publicly accessible GCC development tree.)
 
 
 There is no firm guideline for what constitutes a "major change"
-and what does not.  If a developer is unsure, he or she should ask for
+and what does not.  If a developer is unsure, they should ask for
 guidance on the GCC mailing lists.  In general, a change that has the
 potential to be extremely destabilizing should be done on a branch.
 
diff --git a/htdocs/fortran/index.html b/htdocs/fortran/index.html
index 1d140b3a..1984a297 100644
--- a/htdocs/fortran/index.html
+++ b/htdocs/fortran/index.html
@@ -117,11 +117,11 @@ changes.
 Approval should be necessary for
 patches which don't fall under the obvious rule. So, with the approver list
 put in place, everybody (except maintainers) should still seek approval for 
-his/her patches.  We have found the mutual peer review process really 
+their patches.  We have found the mutual peer review process really 
 works well.
 Patches should only be reviewed by
 people who know the affected parts of the compiler. (i.e. the
-reviewer has to be sure he/she knows stuff well enough to make a
+reviewer has to be sure they know stuff well enough to make a
 good judgment.)
 Large/complicated patches should
 still go by one of our maintainers, or team consensus.
diff --git a/htdocs/gitwrite.html b/htdocs/gitwrite.html
index 92740209..9de5de27 100644
--- a/htdocs/gitwrite.html
+++ b/htdocs/gitwrite.html
@@ -37,7 +37,7 @@ is not sufficient).
 
 If you already have an account on sourceware.org / gcc.gnu.org, ask
 overse...@gcc.gnu.org to add access to the GCC repository.
-Include the name of your sponsor and CC: her.
+Include the name of your sponsor and CC: them.
 Otherwise use https://sourceware.org/cgi-bin/pdw/ps_form.cgi;>this form,
 again specifying your sponsor.
-- 
2.31.1

Re: Gang-level reductions in OpenACC routine

Hi!

On 2020-03-19T17:12:02+, Kwok Cheung Yeung  wrote:
> On 18/03/2020 11:34 pm, Kwok Cheung Yeung wrote:
>> I was looking at the regression in c-c++-common/goacc/nested-reductions.c, 
>> which
>> has the following excess warnings in acc_routine:
>>
>> /scratch/kyeung/openacc/og10/nvidia/src/gcc-og10-branch/gcc/testsuite/c-c++-common/goacc/nested-reductions.c:360:15:
>> warning: insufficient partitioning available to parallelize loop
>> /scratch/kyeung/openacc/og10/nvidia/src/gcc-og10-branch/gcc/testsuite/c-c++-common/goacc/nested-reductions.c:369:17:
>> warning: insufficient partitioning available to parallelize loop
>> /scratch/kyeung/openacc/og10/nvidia/src/gcc-og10-branch/gcc/testsuite/c-c++-common/goacc/nested-reductions.c:375:17:
>> warning: insufficient partitioning available to parallelize loop
>> /scratch/kyeung/openacc/og10/nvidia/src/gcc-og10-branch/gcc/testsuite/c-c++-common/goacc/nested-reductions.c:320:6:
>> warning: region is gang partitioned but does not contain gang partitioned 
>> code
>>
>> It is caused by the following code in the patch 'Make OpenACC orphan
>> gang reductions errors' (originally by Cesar):
>>
>> +  /* Orphan reductions cannot have gang partitioning.  */
>> +  if ((loop->flags & OLF_REDUCTION)
>> + && oacc_get_fn_attrib (current_function_decl)
>> + && !lookup_attribute ("omp target entrypoint",
>> +   DECL_ATTRIBUTES (current_function_decl)))
>> +   this_mask = GOMP_DIM_MASK (GOMP_DIM_WORKER);

Right.  However, that code doesn't implement what the OpenACC
specification actually says.  ;-)

>> The problem is that acc_routine is not declared with 'omp target entrypoint',
>> but it does have '#pragma acc_routine gang' applied to it. From what I
>> understand of the OpenACC spec, this means that the function can be called 
>> from
>> the accelerator, and may contain a loop at the gang-level.

Right.

>> So is allowing gang
>> reductions for functions with '#pragma acc_routine gang' (but not for worker 
>> or
>> vector) the right thing to do here?

No, that's precisely the thing that the compiler needs to diagnose.  See
OpenACC 2.6, 2.9.11. "reduction clause", which places a restriction such
that "The 'reduction' clause may not be specified on an orphaned 'loop'
construct with the 'gang' clause, or on an orphaned 'loop' construct that
will generate gang parallelism in a procedure that is compiled with the
'routine gang' clause."

Cesar apparently read the last part to mean that inside a 'routine gang',
a 'loop reduction' with implicit 'gang' level of parallelism should be
demoted to 'worker' level of parallelism.  But what actually is meant,
simply, is that in such cases we raise the same "gang reduction on an
orphan loop" error diagnostic that we raise for explicit 'gang' level of
parallelism.  (..., and adjust our offending test cases).

Now, re your og10 etc. change:

> Allow gang-level reductions in OpenACC routines with gang-level 
> parallelism

>   gcc/
>   * omp-offload.c (oacc_loop_auto_partitions): Check for 'omp declare
>   target' attributes with a gang clause attached.

> --- a/gcc/omp-offload.c
> +++ b/gcc/omp-offload.c
> @@ -1374,14 +1374,32 @@ oacc_loop_auto_partitions (oacc_loop *loop, unsigned 
> outer_mask,

>/* Orphan reductions cannot have gang partitioning.  */
>if ((loop->flags & OLF_REDUCTION)
> -   && oacc_get_fn_attrib (current_function_decl)
> -   && !lookup_attribute ("omp target entrypoint",
> +   && oacc_get_fn_attrib (current_function_decl))
> + {
> +   bool gang_p = false;
> +   tree attr
> +   = lookup_attribute ("omp declare target",
> +   DECL_ATTRIBUTES (current_function_decl));
> +
> +   if (attr)
> + for (tree c = TREE_VALUE (attr); c; c = OMP_CLAUSE_CHAIN (c))
> +   if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_GANG)
> + {
> +   gang_p = true;
> +   break;
> + }
> +
> +   if (lookup_attribute ("omp target entrypoint",
>   DECL_ATTRIBUTES (current_function_decl)))
> - this_mask = GOMP_DIM_MASK (GOMP_DIM_WORKER);
> + gang_p = true;
> +
> +   if (!gang_p)
> + this_mask = GOMP_DIM_MASK (GOMP_DIM_WORKER);
> + }

..., I don't understand what exactly that is meant to do: as far as I can
tell, we always get 'gang_p == true' from that code?

Instead, I've pushed to master branch
commit 365cd5f9ba812c389b404a53d99ab5dded5097f4 '[OpenACC] Remove
erroneous "Orphan reductions cannot have gang partitioning" handling',
see attached.  This implements the desired "gang reduction on an orphan
loop" error diagnostics also for these implicit 'gang' cases, via the
middle-end checking that I've just added in
commit 77d24d43644909852998043335b5a0e09d1e8f02
'Consolidate OpenACC "gang reduction on an orphan loop" checking'.

Grüße
 Thomas

-
Siemens Electronic Design

[PATCH] c++, v2: Allow indeterminate unsigned char or std::byte in bit_cast - P1272R4

On Mon, Nov 29, 2021 at 10:25:58PM -0500, Jason Merrill wrote:
> It's a DR.  Really, it was intended to be part of C++20; at the Cologne
> meeting in 2019 CWG thought byteswap was going to make C++20, so this bugfix
> could go in as part of that paper.

Ok, changed to be done unconditionally now.

> Also, allowing indeterminate values that are never read was in C++20
> (P1331).

Reading P1331R2 again, I'm still puzzled.
Our current behavior (both before and after this patch) is that if
some variable is scalar and has indeterminate value or if an aggregate
variable has some members (possibly nested) with indeterminate values,
in constexpr contexts we allow copying those into other vars of the
same type (e.g. the testcases in the patch below test mere copying
of the whole structures or unsigned char result of __builtin_bit_cast),
but we reject if we actually use them in some other way (e.g. try to
read a member from a variable that has that member indeterminate,
see e.g. bit-cast14.C (f5, f6, f7), even when reading it into an
unsigned char variable).

Then there is P1331R2 which makes the UB on
"an lvalue-to-rvalue conversion that is applied to an object with
indeterminate value ([basic.indet]);"
but isn't even the
  unsigned char a = __builtin_bit_cast (unsigned char, u);
  unsigned char b = a;
case non-constant then when __builtin_bit_cast returns indeterminate value?
__builtin_bit_cast returns rvalue, so no lvalue-to-rvalue conversion happens
in that case, so supposedly
  unsigned char a = __builtin_bit_cast (unsigned char, u);
is fine, but on
  unsigned char b = a;
a is lvalue and is converted to rvalue.
Similarly
  T t = { 1, 2 };
  S s = __builtin_bit_cast (S, t);
  S u = s;
where S s = __builtin_bit_cast (S, t); could be ok even when some or all
members are indeterminate, but u = s; does lvalue-to-rvalue conversion?

Or there is http://eel.is/c++draft/basic.indet that has quite clear rules
what is and isn't UB and if C++ wanted to go further and allow all those
valid cases in there as constant...

Anyway, I hope this can be dealt with incrementally.

> I think in all of them the result of the cast has (some) indeterminate
> value.  So f1-3 are OK because the indeterminate value has unsigned char
> type and is never used; f4() is non-constant because S::f has
> non-byte-access type and so the new wording says it's undefined.

Ok, implemented the bitfield handling then.

Here is an updated patch, so far lightly tested.

2021-11-30  Jakub Jelinek 

* constexpr.c (clear_uchar_or_std_byte_in_mask): New function.
(cxx_eval_bit_cast): Don't error about padding bits if target
type is unsigned char or std::byte, instead return no clearing
ctor.  Use clear_uchar_or_std_byte_in_mask.

* g++.dg/cpp2a/bit-cast11.C: New test.
* g++.dg/cpp2a/bit-cast12.C: New test.
* g++.dg/cpp2a/bit-cast13.C: New test.
* g++.dg/cpp2a/bit-cast14.C: New test.

--- gcc/cp/constexpr.c.jj   2021-11-30 09:44:46.531607444 +0100
+++ gcc/cp/constexpr.c  2021-11-30 12:20:29.105251443 +0100
@@ -4268,6 +4268,121 @@ check_bit_cast_type (const constexpr_ctx
   return false;
 }

+/* Helper function for cxx_eval_bit_cast.  For unsigned char or
+   std::byte members of CONSTRUCTOR (recursively) if they contain
+   some indeterminate bits (as set in MASK), remove the ctor elts,
+   mark the CONSTRUCTOR as CONSTRUCTOR_NO_CLEARING and clear the
+   bits in MASK.  */
+
+static void
+clear_uchar_or_std_byte_in_mask (location_t loc, tree t, unsigned char *mask)
+{
+  if (TREE_CODE (t) != CONSTRUCTOR)
+return;
+
+  unsigned i, j = 0;
+  tree index, value;
+  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (t), i, index, value)
+{
+  tree type = TREE_TYPE (value);
+  if (TREE_CODE (TREE_TYPE (t)) != ARRAY_TYPE
+ && DECL_BIT_FIELD_TYPE (index) != NULL_TREE)
+   {
+ if (is_byte_access_type (DECL_BIT_FIELD_TYPE (index))
+ && (TYPE_MAIN_VARIANT (DECL_BIT_FIELD_TYPE (index))
+ != char_type_node))
+   {
+ HOST_WIDE_INT fldsz = TYPE_PRECISION (TREE_TYPE (index));
+ gcc_assert (fldsz != 0);
+ HOST_WIDE_INT pos = int_byte_position (index);
+ HOST_WIDE_INT bpos
+   = tree_to_uhwi (DECL_FIELD_BIT_OFFSET (index));
+ bpos %= BITS_PER_UNIT;
+ HOST_WIDE_INT end
+   = ROUND_UP (bpos + fldsz, BITS_PER_UNIT) / BITS_PER_UNIT;
+ gcc_assert (end == 1 || end == 2);
+ unsigned char *p = mask + pos;
+ unsigned char mask_save[2];
+ mask_save[0] = mask[pos];
+ mask_save[1] = end == 2 ? mask[pos + 1] : 0;
+ if (BYTES_BIG_ENDIAN != WORDS_BIG_ENDIAN)
+   sorry_at (loc, "PDP11 bit-field handling unsupported"
+  " in %qs", "__builtin_bit_cast");
+ else if (BYTES_BIG_ENDIAN)
+   {
+ /* Big endian.  */
+

Re: [gomp4] Make OpenACC orphan gang reductions errors

Hi!

On 2017-05-01T18:27:59-0700, Cesar Philippidis  wrote:
>   gcc/c/
>   * c-typeck.c (c_finish_omp_clauses): Emit an error on orphan OpenACC
>   gang reductions.
>
>   gcc/cp/
>   * semantics.c (finish_omp_clauses): Emit an error on orphan OpenACC
>   gang reductions.
>
>   gcc/fortran/
>   * openmp.c (resolve_oacc_loop_blocks): Emit an error on orphan OpenACC
>   gang reductions.

As a follow-up, I've pushed to master branch
commit 77d24d43644909852998043335b5a0e09d1e8f02
'Consolidate OpenACC "gang reduction on an orphan loop" checking',
see attached.


Grüße
 Thomas


-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 
München; limited liability company; Managing Directors: Thomas 
Heurung, Frank Thürauf; Registered office: München; Court of registration: 
München, HRB 106955
>From 77d24d43644909852998043335b5a0e09d1e8f02 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Fri, 26 Nov 2021 12:29:26 +0100
Subject: [PATCH] Consolidate OpenACC "gang reduction on an orphan loop"
 checking

No need to implement separately in all front ends what we may implement in the
middle end, once for all.

Follow-up to preceding commit 2b7dac2c0dcb087da9e4018943c023c0678234a3
"Make OpenACC orphan gang reductions errors".

	gcc/
	* omp-offload.c (oacc_loop_process): Implement "gang reduction on
	an orphan loop" checking.
	gcc/c/
	* c-typeck.c (c_finish_omp_clauses): Remove "gang reduction on an
	orphan loop" checking.
	gcc/cp/
	* semantics.c (finish_omp_clauses): Remove "gang reduction on an
	orphan loop" checking.
	gcc/fortran/
	* openmp.c (resolve_oacc_loop_blocks): Remove "gang reduction on
	an orphan loop" checking.
	(oacc_is_parallel, oacc_is_kernels, oacc_is_serial)
	(oacc_is_compute_construct): Remove.
	gcc/testsuite/
	* gfortran.dg/goacc/orphan-reductions-1.f90: Adjust.
---
 gcc/c/c-typeck.c  |  8 
 gcc/cp/semantics.c|  8 
 gcc/fortran/openmp.c  | 37 ---
 gcc/omp-offload.c | 20 --
 .../gfortran.dg/goacc/orphan-reductions-1.f90 |  8 ++--
 5 files changed, 20 insertions(+), 61 deletions(-)

diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index a025740e618..7524304f2bd 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -14135,14 +14135,6 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	  goto check_dup_generic;
 
 	case OMP_CLAUSE_REDUCTION:
-	  if (ort == C_ORT_ACC && oacc_get_fn_attrib (current_function_decl)
-	  && omp_find_clause (clauses, OMP_CLAUSE_GANG))
-	{
-	  error_at (OMP_CLAUSE_LOCATION (c),
-			"gang reduction on an orphan loop");
-	  remove = true;
-	  break;
-	}
 	  if (reduction_seen == 0)
 	reduction_seen = OMP_CLAUSE_REDUCTION_INSCAN (c) ? -1 : 1;
 	  else if (reduction_seen != -2
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index c84caf43251..cd1956497f8 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -6667,14 +6667,6 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	  field_ok = ((ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP);
 	  goto check_dup_generic;
 	case OMP_CLAUSE_REDUCTION:
-	  if (ort == C_ORT_ACC && oacc_get_fn_attrib (current_function_decl)
-	  && omp_find_clause (clauses, OMP_CLAUSE_GANG))
-	{
-	  error_at (OMP_CLAUSE_LOCATION (c),
-			"gang reduction on an orphan loop");
-	  remove = true;
-	  break;
-	}
 	  if (reduction_seen == 0)
 	reduction_seen = OMP_CLAUSE_REDUCTION_INSCAN (c) ? -1 : 1;
 	  else if (reduction_seen != -2
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 7950c7fb43d..d120be81467 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -8322,31 +8322,6 @@ resolve_omp_do (gfc_code *code)
 }
 }
 
-static bool
-oacc_is_parallel (gfc_code *code)
-{
-  return code->op == EXEC_OACC_PARALLEL || code->op == EXEC_OACC_PARALLEL_LOOP;
-}
-
-static bool
-oacc_is_kernels (gfc_code *code)
-{
-  return code->op == EXEC_OACC_KERNELS || code->op == EXEC_OACC_KERNELS_LOOP;
-}
-
-static bool
-oacc_is_serial (gfc_code *code)
-{
-  return code->op == EXEC_OACC_SERIAL || code->op == EXEC_OACC_SERIAL_LOOP;
-}
-
-static bool
-oacc_is_compute_construct (gfc_code *code)
-{
-  return (oacc_is_parallel (code)
-	  || oacc_is_kernels (code)
-	  || oacc_is_serial (code));
-}
 
 static gfc_statement
 omp_code_to_statement (gfc_code *code)
@@ -8650,18 +8625,6 @@ resolve_oacc_loop_blocks (gfc_code *code)
   if (!oacc_is_loop (code))
 return;
 
-  if (code->op == EXEC_OACC_LOOP
-  && code->ext.omp_clauses->lists[OMP_LIST_REDUCTION]
-  && code->ext.omp_clauses->gang)
-{
-  fortran_omp_context *c;
-  for (c = omp_current_ctx; c; c = c->previous)
-	if (!oacc_is_loop (c->code))
-	  break;
-  if (c == NULL || !(oacc_is_compute_construct (c->code)))
-	gfc_error ("gang reduction on an orphan loop at %L", &code->loc);

Re: [PATCH] [og10] libgomp, Fortran: Fix OpenACC "gang reduction on an orphan loop" error message

Hi!

On 2020-07-20T12:26:48+0200, Frederik Harwath  wrote:
> Thomas Schwinge  writes:
>>> Can I include the patch in OG10?

> This has been delayed a bit by my vacation, but I have now committed
> the patch.

>> (Ideally, we'd also test 'serial' construct in addition to 'kernels',
>> 'parallel'

> I have included the test cases for the "serial construct".

I've adapted the remaining relevant changes and pushed to master branch
commit c4f4c60457d1657cbd72015de3d818eb6462a0e9
'Re OpenACC "gang reduction on an orphan loop" error message', see
attached.


Grüße
 Thomas


>From c4f4c60457d1657cbd72015de3d818eb6462a0e9 Mon Sep 17 00:00:00 2001
From: Frederik Harwath 
Date: Mon, 20 Jul 2020 11:24:21 +0200
Subject: [PATCH] Re OpenACC "gang reduction on an orphan loop" error message

Follow-up to preceding commit 2b7dac2c0dcb087da9e4018943c023c0678234a3
"Make OpenACC orphan gang reductions errors".

	gcc/fortran/
	* openmp.c (oacc_is_parallel_or_serial): Evolve into...
	(oacc_is_compute_construct): ... this function.
	(resolve_oacc_loop_blocks): Use "oacc_is_compute_construct"
	instead of "oacc_is_parallel_or_serial" for checking that a
	loop is not orphaned.
	gcc/testsuite/
	* gfortran.dg/goacc/orphan-reductions-3.f90: New test
	verifying that the "gang reduction on an orphan loop" error message
	is not emitted for non-orphaned loops.
	* c-c++-common/goacc/orphan-reductions-3.c: Likewise for C and C++.

Co-Authored-By: Thomas Schwinge 
---
 gcc/fortran/openmp.c  |   9 +-
 .../c-c++-common/goacc/orphan-reductions-3.c  | 102 ++
 .../gfortran.dg/goacc/orphan-reductions-3.f90 |  89 +++
 3 files changed, 196 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/goacc/orphan-reductions-3.c
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/orphan-reductions-3.f90

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index b4100577e51..7950c7fb43d 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -8341,9 +8341,11 @@ oacc_is_serial (gfc_code *code)
 }
 
 static bool
-oacc_is_parallel_or_serial (gfc_code *code)
+oacc_is_compute_construct (gfc_code *code)
 {
-  return oacc_is_parallel (code) || oacc_is_serial (code);
+  return (oacc_is_parallel (code)
+	  || oacc_is_kernels (code)
+	  || oacc_is_serial (code));
 }
 
 static gfc_statement
@@ -8656,8 +8658,7 @@ resolve_oacc_loop_blocks (gfc_code *code)
   for (c = omp_current_ctx; c; c = c->previous)
 	if (!oacc_is_loop (c->code))
 	  break;
-  if (c == NULL || !(oacc_is_parallel_or_serial (c->code)
-			 || oacc_is_kernels (c->code)))
+  if (c == NULL || !(oacc_is_compute_construct (c->code)))
 	gfc_error ("gang reduction on an orphan loop at %L", &code->loc);
 }
 
diff --git a/gcc/testsuite/c-c++-common/goacc/orphan-reductions-3.c b/gcc/testsuite/c-c++-common/goacc/orphan-reductions-3.c
new file mode 100644
index 000..cd8ad274ebb
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/orphan-reductions-3.c
@@ -0,0 +1,102 @@
+/* Verify that the error message for gang reduction on orphaned OpenACC loops
+   is not reported for non-orphaned loops. */
+
+/* { dg-additional-options "-Wopenacc-parallelism" } */
+
+int
+kernels (int n)
+{
+  int i, s1 = 0, s2 = 0;
+#pragma acc kernels
+  {
+#pragma acc loop gang reduction(+:s1) /* { dg-bogus "gang reduction on an orphan loop" } */
+  for (i = 0; i < n; i++)
+s1 = s1 + 2;
+
+#pragma acc loop gang reduction(+:s2) /* { dg-bogus "gang reduction on an orphan loop" } */
+  for (i = 0; i < n; i++)
+s2 = s2 + 2;
+  }
+  return s1 + s2;
+}
+
+int
+parallel (int n)
+{
+  int i, s1 = 0, s2 = 0;
+#pragma acc parallel
+  {
+#pragma acc loop gang reduction(+:s1) /* { dg-bogus "gang reduction on an orphan loop" } */
+  for (i = 0; i < n; i++)
+s1 = s1 + 2;
+
+#pragma acc loop gang reduction(+:s2) /* { dg-bogus "gang reduction on an orphan loop" } */
+  for (i = 0; i < n; i++)
+s2 = s2 + 2;
+  }
+  return s1 + s2;
+}
+
+int
+serial (int n)
+{
+  int i, s1 = 0, s2 = 0;
+#pragma acc serial /* { dg-warning "region contains gang partitioned code but is not gang partitioned" } */
+  {
+#pragma acc loop gang reduction(+:s1) /* { dg-bogus "gang reduction on an orphan loop" } */
+  for (i = 0; i < n; i++)
+s1 = s1 + 2;
+
+#pragma acc loop gang reduction(+:s2) /* { dg-bogus "gang reduction on an orphan loop" } */
+  for (i = 0; i < n; i++)
+s2 = s2 + 2;
+  }
+  return s1 + s2;
+}
+
+int
+serial_combined (int n)
+{
+  int i, s1 = 0, s2 = 0;
+#pragma acc serial loop gang reduction(+:s1) /* { dg-bogus "gang reduction on an orphan loop" } */
+  /* { dg-warning "region contains gang partitioned code but is not gang partitioned" "" { target *-*-* } .-1 } */
+  for (i = 0;

Re: [gomp4] Make OpenACC orphan gang reductions errors

Hi!

On 2017-05-01T18:27:59-0700, Cesar Philippidis  wrote:
> --- a/gcc/fortran/openmp.c
> +++ b/gcc/fortran/openmp.c
> @@ -6090,6 +6090,18 @@ resolve_oacc_loop_blocks (gfc_code *code)

> +  if (code->op == EXEC_OACC_LOOP
> +  && code->ext.omp_clauses->lists[OMP_LIST_REDUCTION]
> +  && code->ext.omp_clauses->gang)
> +{
> +  for (c = omp_current_ctx; c; c = c->previous)
> + if (!oacc_is_loop (c->code))
> +   break;
> +  if (c == NULL || !(oacc_is_parallel (c->code)
> +  || oacc_is_kernels (c->code)))
> +  gfc_error ("gang reduction on an orphan loop at %L", &code->loc);
> +}

To avoid erroneous diagnostics, we also need to handle the OpenACC
'serial' construct here.  I've adapted Kwok's relevant patch, and pushed
to master branch commit f1a58ab0db20c0862e8b5039bd448fc8c9799cac
"[OpenACC] Allow gang reductions inside serial constructs", see attached.
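The check quoted above can be modelled outside the compiler as follows. This is only a sketch: Op, Context and both functions are simplified, hypothetical stand-ins for the Fortran front end's gfc_code ops and fortran_omp_context, not the real GCC types. It includes 'serial' among the compute constructs, which is the point of this fix.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins; not the real GCC types.
enum class Op { Loop, Parallel, Kernels, Serial, Other };

struct Context {
  Op op;
  const Context *previous;  // enclosing context, nullptr at the outermost level
};

inline bool
is_compute_construct (Op op)
{
  return op == Op::Parallel || op == Op::Kernels || op == Op::Serial;
}

// Skip enclosing 'loop' contexts.  The gang reduction is orphaned
// unless the first non-loop ancestor is a compute construct.
inline bool
gang_reduction_is_orphaned (const Context *ctx)
{
  const Context *c = ctx;
  while (c != nullptr && c->op == Op::Loop)
    c = c->previous;
  return c == nullptr || !is_compute_construct (c->op);
}
```

With this model a loop nested in a 'serial' context is not orphaned, while a loop with no enclosing context (as inside an acc routine) is.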


Grüße
 Thomas


>From f1a58ab0db20c0862e8b5039bd448fc8c9799cac Mon Sep 17 00:00:00 2001
From: Kwok Cheung Yeung 
Date: Fri, 13 Mar 2020 11:13:49 -0700
Subject: [PATCH] [OpenACC] Allow gang reductions inside serial constructs

... fixing a regression introduced in the preceding
commit 2b7dac2c0dcb087da9e4018943c023c0678234a3
"Make OpenACC orphan gang reductions errors".

	gcc/fortran/
	* openmp.c (oacc_is_serial, oacc_is_parallel_or_serial): New.
	(resolve_oacc_loop_blocks): Use oacc_is_parallel_or_serial instead of
	oacc_is_parallel.
	libgomp/
	* testsuite/libgomp.oacc-fortran/parallel-dims.f90: Remove
	temporary skip.

Co-Authored-By: Thomas Schwinge 
---
 gcc/fortran/openmp.c   | 14 +-
 .../libgomp.oacc-fortran/parallel-dims.f90 |  1 -
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 4fa38691c01..b4100577e51 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -8334,6 +8334,18 @@ oacc_is_kernels (gfc_code *code)
   return code->op == EXEC_OACC_KERNELS || code->op == EXEC_OACC_KERNELS_LOOP;
 }
 
+static bool
+oacc_is_serial (gfc_code *code)
+{
+  return code->op == EXEC_OACC_SERIAL || code->op == EXEC_OACC_SERIAL_LOOP;
+}
+
+static bool
+oacc_is_parallel_or_serial (gfc_code *code)
+{
+  return oacc_is_parallel (code) || oacc_is_serial (code);
+}
+
 static gfc_statement
 omp_code_to_statement (gfc_code *code)
 {
@@ -8644,7 +8656,7 @@ resolve_oacc_loop_blocks (gfc_code *code)
   for (c = omp_current_ctx; c; c = c->previous)
 	if (!oacc_is_loop (c->code))
 	  break;
-  if (c == NULL || !(oacc_is_parallel (c->code)
+  if (c == NULL || !(oacc_is_parallel_or_serial (c->code)
 			 || oacc_is_kernels (c->code)))
 	gfc_error ("gang reduction on an orphan loop at %L", &code->loc);
 }
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90 b/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90
index 80d64030414..fad3d9d6a80 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/parallel-dims.f90
@@ -3,7 +3,6 @@
 
 ! { dg-additional-sources parallel-dims-aux.c }
 ! { dg-do run }
-  ! { dg-skip-if TODO { *-*-* } }
 ! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }
 
 ! { dg-additional-options "-fopt-info-note-omp" }
-- 
2.33.0

Re: [gomp4] Make OpenACC orphan gang reductions errors

Hi!

On 2017-05-01T18:27:59-0700, Cesar Philippidis  wrote:
> This patch promotes all OpenACC gang reductions on orphan loops to
> errors. According to the spec, orphan loops are those which are not
> lexically nested inside an OpenACC parallel or kernels region. I.e.,
> acc loops inside acc routines.
>
> At first I thought this could be a warning because the gang reduction
> finalizer uses an atomic update. However, because there is no
> synchronization between gangs, there is no way to guarantee that the
> reduction will have completed once a single gang entity returns from the
> acc routine call.
>
> I've applied this patch to gomp-4_0-branch.

... which I've now adapted (with several things to be fixed in follow-up
commits) and pushed to master branch in
commit 2b7dac2c0dcb087da9e4018943c023c0678234a3
"Make OpenACC orphan gang reductions errors", see attached.


Grüße
 Thomas


>From 2b7dac2c0dcb087da9e4018943c023c0678234a3 Mon Sep 17 00:00:00 2001
From: Cesar Philippidis 
Date: Mon, 1 May 2017 18:27:59 -0700
Subject: [PATCH] Make OpenACC orphan gang reductions errors

This patch promotes all OpenACC gang reductions on orphan loops to
errors. According to the spec, orphan loops are those which are not
lexically nested inside an OpenACC parallel or kernels region. I.e.,
acc loops inside acc routines.

At first I thought this could be a warning because the gang reduction
finalizer uses an atomic update. However, because there is no
synchronization between gangs, there is no way to guarantee that the
reduction will have completed once a single gang entity returns from the
acc routine call.

	gcc/c/
	* c-typeck.c (c_finish_omp_clauses): Emit an error on orphan
	OpenACC gang reductions.
	gcc/cp/
	* semantics.c (finish_omp_clauses): Emit an error on orphan
	OpenACC gang reductions.
	gcc/fortran/
	* openmp.c (oacc_is_parallel, oacc_is_kernels): New 'static'
	functions.
	(resolve_oacc_loop_blocks): Emit an error on orphan OpenACC gang
	reductions.
	gcc/
	* omp-general.h (enum oacc_loop_flags): Add OLF_REDUCTION enum.
	* omp-low.c (lower_oacc_head_mark): Use it to mark OpenACC
	reductions.
	* omp-offload.c (oacc_loop_auto_partitions): Don't assign gang
	level parallelism to orphan reductions.
	gcc/testsuite/
	* c-c++-common/goacc/nested-reductions-1-routine.c: Adjust.
	* c-c++-common/goacc/nested-reductions-2-routine.c: Likewise.
	* gcc.dg/goacc/loop-processing-1.c: Likewise.
	* gfortran.dg/goacc/nested-reductions-1-routine.f90: Likewise.
	* gfortran.dg/goacc/nested-reductions-2-routine.f90: Likewise.
	* c-c++-common/goacc/orphan-reductions-1.c: New test.
	* c-c++-common/goacc/orphan-reductions-2.c: New test.
	* gfortran.dg/goacc/orphan-reductions-1.f90: New test.
	* gfortran.dg/goacc/orphan-reductions-2.f90: New test.
	libgomp/
	* testsuite/libgomp.oacc-fortran/parallel-dims.f90: Temporarily
	skip.

Co-Authored-By: Thomas Schwinge 
---
 gcc/c/c-typeck.c  |   8 +
 gcc/cp/semantics.c|   8 +
 gcc/fortran/openmp.c  |  24 ++
 gcc/omp-general.h |   3 +-
 gcc/omp-low.c |   4 +
 gcc/omp-offload.c |   7 +
 .../goacc/nested-reductions-1-routine.c   |   3 +
 .../goacc/nested-reductions-2-routine.c   |   9 +
 .../c-c++-common/goacc/orphan-reductions-1.c  |  56 +
 .../c-c++-common/goacc/orphan-reductions-2.c  |  87 
 .../gcc.dg/goacc/loop-processing-1.c  |   2 +-
 .../goacc/nested-reductions-1-routine.f90 |   3 +
 .../goacc/nested-reductions-2-routine.f90 |   9 +
 .../gfortran.dg/goacc/orphan-reductions-1.f90 | 206 ++
 .../gfortran.dg/goacc/orphan-reductions-2.f90 |  89 
 .../libgomp.oacc-fortran/parallel-dims.f90|   1 +
 16 files changed, 517 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/goacc/orphan-reductions-1.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/orphan-reductions-2.c
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/orphan-reductions-1.f90
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/orphan-reductions-2.f90

diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 7524304f2bd..a025740e618 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -14135,6 +14135,14 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	  goto check_dup_generic;
 
 	case OMP_CLAUSE_REDUCTION:
+	  if (ort == C_ORT_ACC && oacc_get_fn_attrib (current_function_decl)
+	  && omp_find_clause (clauses, OMP_CLAUSE_GANG))
+	{
+	  error_at (OMP_CLAUSE_LOCATION (c),
+			"gang reduction on an orphan loop");
+	  remove = true;
+	  break;
+	}
 	  if (reduction_seen == 0)
 	reduction_seen

Re: [PATCH] [og10] Fix goacc/routine-4-extern.c test

Hi!

On 2020-07-28T10:44:29+0200, I wrote:
> On 2020-07-26T14:05:32+0100, Kwok Cheung Yeung  wrote:
>> On 24/07/2020 8:27 am, Thomas Schwinge wrote:
>>> [proposed patch] however completely defeats what we're intending to test 
>>> here, which
>>> is to "Test invalid intra-routine parallelism".  The same problem has
>>> been introduced in og10 commit 6a0b5806b24bfdefe0b0f3ccbcc51299e5195dca
>>> "Various OpenACC reduction enhancements - test cases" for
>>> 'gcc/testsuite/c-c++-common/goacc/routine-4.c', which throughout changed:
>>>
>>>  -#pragma acc loop gang reduction (+:red) // { dg-error "disallowed by 
>>> containing routine" }
>>>  +#pragma acc loop seq reduction (+:red)
>>>
>>> Please revert that, and instead replace 'reduction (+:red)' with a
>>> different "dummy loop operation" (just an empty loop body?), and in the
>>> commit log state that this should've been included in the respective og10
>>> commit adding the "gang reduction on an orphan loop" checking.
>>
>> I have reverted all the previous changes and replaced the orphan loop gang
>> reductions with empty loops as suggested, and checked that the tests now 
>> pass.
>>
>> Is this version okay for OG10?
>
> Yes, thanks.

... which I've now adapted and pushed to master branch in
commit a83a07557085f6da83c63e86c1cd2e719a39b8b2
"Fix c-c++-common/goacc/routine-4.c and
c-c++-common/goacc/routine-4-extern.c testcases", see attached.


Grüße
 Thomas


>From a83a07557085f6da83c63e86c1cd2e719a39b8b2 Mon Sep 17 00:00:00 2001
From: Kwok Cheung Yeung 
Date: Tue, 28 Jul 2020 05:41:14 -0700
Subject: [PATCH] Fix c-c++-common/goacc/routine-4.c and
 c-c++-common/goacc/routine-4-extern.c testcases

... in preparation for checks that we're introducing for OpenACC gang
reductions on orphan loops.

	gcc/testsuite/
	* c-c++-common/goacc/routine-4.c (seq, vector, worker, gang):
	Remove loop reductions.
	* c-c++-common/goacc/routine-4-extern.c (seq, vector, worker, gang):
	Likewise.

Co-Authored-By: Thomas Schwinge 
---
 .../c-c++-common/goacc/routine-4-extern.c | 72 +--
 gcc/testsuite/c-c++-common/goacc/routine-4.c  | 72 +--
 2 files changed, 64 insertions(+), 80 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/goacc/routine-4-extern.c b/gcc/testsuite/c-c++-common/goacc/routine-4-extern.c
index ec21db1c319..ec4475818ad 100644
--- a/gcc/testsuite/c-c++-common/goacc/routine-4-extern.c
+++ b/gcc/testsuite/c-c++-common/goacc/routine-4-extern.c
@@ -26,23 +26,21 @@ void seq (void)
   extern_vector ();  /* { dg-error "routine call uses" } */
   extern_seq ();
 
-  int red;
-
-#pragma acc loop reduction (+:red) // { dg-warning "insufficient partitioning" }
+#pragma acc loop // { dg-warning "insufficient partitioning" }
   for (int i = 0; i < 10; i++)
-red ++;
+;
 
-#pragma acc loop gang reduction (+:red) // { dg-error "disallowed by containing routine" }
+#pragma acc loop gang // { dg-error "disallowed by containing routine" }
   for (int i = 0; i < 10; i++)
-red ++;
+;
 
-#pragma acc loop worker reduction (+:red) // { dg-error "disallowed by containing routine" }
+#pragma acc loop worker // { dg-error "disallowed by containing routine" }
   for (int i = 0; i < 10; i++)
-red ++;
+;
 
-#pragma acc loop vector reduction (+:red) // { dg-error "disallowed by containing routine" }
+#pragma acc loop vector // { dg-error "disallowed by containing routine" }
   for (int i = 0; i < 10; i++)
-red ++;
+;
 }
 
 void vector (void)
@@ -52,23 +50,21 @@ void vector (void)
   extern_vector ();
   extern_seq ();
 
-  int red;
-
-#pragma acc loop reduction (+:red)
+#pragma acc loop
   for (int i = 0; i < 10; i++)
-red ++;
+;
 
-#pragma acc loop gang reduction (+:red) // { dg-error "disallowed by containing routine" }
+#pragma acc loop gang // { dg-error "disallowed by containing routine" }
   for (int i = 0; i < 10; i++)
-red ++;
+;
 
-#pragma acc loop worker reduction (+:red) // { dg-error "disallowed by containing routine" }
+#pragma acc loop worker // { dg-error "disallowed by containing routine" }
   for (int i = 0; i < 10; i++)
-red ++;
+;
 
-#pragma acc loop vector reduction (+:red)
+#pragma acc loop vector
   for (int i = 0; i < 10; i++)
-red ++;
+;
 }
 
 void worker (void)
@@ -78,23 +74,21 @@ void worker (void)
   extern_vector ();
   extern_seq ();
 
-  int red;
-
-#pragma acc loop reduction (+:red)
+#pragma acc loop
   for (int i = 0; i < 10; i++)
-red ++;
+;
 
-#pragma acc loop gang reduction (+:red) // { dg-error "disallowed by containing routine" }
+#pragma acc loop gang // { dg-error "disallowed by containing routine" }
   for (int i = 0; i < 10; i++)
-red ++;
+;
 
-#pragma acc loop worker

Re: [PATCH] PR fortran/103473 - [11/12 Regression] ICE in simplify_minmaxloc_nodim, at fortran/simplify.c:5287


On 29/11/2021 at 23:01, Harald Anlauf via Fortran wrote:
> Dear all,
>
> another trivial and obvious one, discovered by Gerhard.
>
> We can have a NULL pointer dereference simplifying MINLOC/MAXLOC
> on an array that was not properly declared.
>
> OK for mainline / affected 11-branch after regtesting completes?


Yes, fine as well.

I have the impression that there are quite a number of bugs of this 
kind, and that maybe we could take a more systematic approach to not try 
to simplify something with errors.


Thanks.

Re: [PATCH 1v2/3][vect] Add main vectorized loop unrolling

2021-11-30 Thread Andre Vieira (lists) via Gcc-patches




On 25/11/2021 12:46, Richard Biener wrote:
> Oops, my fault, yes, it does.  I would suggest to refactor things so
> that the mode_i = first_loop_i case is there only once.  I also wonder
> if all the argument about starting at 0 doesn't apply to the
> not unrolled LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P as well?  So
> what's the reason to differ here?  So in the end I'd just change
> the existing
>
>    if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
>      {
>
> to
>
>    if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo)
>        || first_loop_vinfo->suggested_unroll_factor > 1)
>      {
>
> and maybe revisit this when we have an actual testcase showing that
> doing sth else has a positive effect?
>
> Thanks,
> Richard.


So I had a quick chat with Richard Sandiford and he is suggesting 
resetting mode_i to 0 for all cases.


He pointed out that for some tunings the SVE mode might come after the 
NEON mode, which means that even for not-unrolled loop_vinfos we could 
end up with a suboptimal choice of mode for the epilogue. I.e. it could 
be that we pick V16QI for main vectorization, but that's VNx16QI + 1 in 
the array, so we'd not try VNx16QI for the epilogue.


This would simplify the mode selection, by simply restarting mode_i 
at 0 in all epilogue cases. Is that something you'd be OK with?
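The ordering problem described above can be sketched with a toy mode list. The helper epilogue_candidates is hypothetical and not GCC code; the sketch assumes, as a simplification, that the epilogue search walks an ordered array of modes from a given start index.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Modes that would be tried for the epilogue when the search starts at
// index START of the ordered mode list.  Hypothetical helper.
inline std::vector<std::string>
epilogue_candidates (const std::vector<std::string> &modes, std::size_t start)
{
  if (start >= modes.size ())
    return {};
  return std::vector<std::string> (
    modes.begin () + static_cast<std::ptrdiff_t> (start), modes.end ());
}
```

With modes {"VNx16QI", "V16QI"} and V16QI picked for the main loop at index 1, continuing the search after that index never tries VNx16QI, whereas restarting at index 0 does.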


Regards,
Andre

Re: [PATCH] PR fortran/101565 - ICE in gfc_simplify_image_index, at fortran/simplify.c:8234


Hello,

On 29/11/2021 at 22:31, Harald Anlauf via Fortran wrote:
> Dear all,
>
> a trivial one: we need to check the type of the SUB argument
> to the coarray IMAGE_INDEX intrinsic.  It has to be an array
> of type integer.
>
> Patch by Steve Kargl.


I hope at some point he’ll finally come to a working git workflow.


> Regtested on x86_64-pc-linux-gnu.  OK for mainline?


Sure.

Re: [PATCH] Loop unswitching: support gswitch statements.

On Mon, Nov 29, 2021 at 1:45 PM Martin Liška  wrote:
>
> On 11/26/21 09:12, Richard Biener wrote:
> > On Wed, Nov 24, 2021 at 3:32 PM Martin Liška  wrote:
> >>
> >> On 11/24/21 15:14, Martin Liška wrote:
> >>> It likely miscompiles gcc.dg/loop-unswitch-5.c, working on that..
> >>
> >> Fixed that in the updated version.
> >
> > Function level comments need updating it seems.
>
> I've done that.
>
> >
> > +static unsigned
> > +evaluate_insns (class loop *loop,  basic_block *bbs,
> > +   predicate_vector _path,
> > +   auto_bb_flag _flag)
> > +{
> > +  auto_vec worklist (loop->num_nodes);
> > +  worklist.quick_push (bbs[0]);
> > ...
> >
> > so when adding gswitch support the easiest way to make
> >
> > +  FOR_EACH_EDGE (e, ei, bb->succs)
> > +   {
> > ...
> > +   {
> > + worklist.safe_push (dest);
> > + dest->flags |= reachable_flag;
> >
> > work is when the gcond/gswitch simplification would mark
> > outgoing edges as (non-)executable.  For gswitch this
> > could be achieved by iterating over the case labels and
> > intersecting that with the range while for gcond it's a
> > matter of setting an edge flag instead of returning true/false.
>
> Exactly, it can be quite naturally added to the current patch.
>
> > I'd call the common function evaluate_control_stmt_using_entry_checks
> > or so and invoke it on the last stmt of a block with >= 2 outgoing
> > edges.
>
> Yes, I'll do it for the gswitch support patch.
>
> >
> > We still seem to do the simplification work twice, once for costing
> > and once for transform, but that's OK for now I guess.
> >
> > I think you want to clear_aux_for_blocks at the end of the pass.
>
> Called that.
>
> >
> > Otherwise I like it - it seems you have some TODO around cost
> > modeling.  Did you try to do gswitch support ontop as I suggested
> > to see if the general structure keeps working?
>
> I vanished and tested the patch. No, I don't have the gswitch support patch
> as the current patch was reworked a few times.
>
> Can we please progress and have installed the suggested patch?

I'd like to see the gswitch support - that's what was posted before stage3
close, this patch on its own doesn't seem worth pushing for.  That said,
I have some comments below (and the already raised ones about how
things might need to change with gswitch support).  Is it so difficult to
develop gswitch support as a separate change ontop of this?

> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

+#include 

that's included unconditionally by system.h

+/* The type represents a predicate path leading to a basic block.  */
+typedef auto_vec> predicate_vector;

+static bool tree_unswitch_single_loop (class loop *, int,
+  predicate_vector _path,

I think we don't want to pass auto_vec by reference, instead auto_vec should
decay to vec<> when passed around.

+  unswitch_predicate *predicate = new unswitch_predicate (cond, lhs);
+  if (irange::supports_type_p (TREE_TYPE (lhs)) && CONSTANT_CLASS_P (rhs))
+{
+  ranger->range_on_edge (predicate->true_range, edge_true, lhs);
+  predicate->false_range = predicate->true_range;

-  return cond;
+  if (!predicate->false_range.varying_p ()
+ && !predicate->false_range.undefined_p ())
+   predicate->false_range.invert ();
+}

is that correct?  I would guess range_on_edge, for

   if (a > 10)
     if (a < 15)
       /* true */
     else
       /* false */

figures [11, 14] on the true edge of if (a < 15) (considered the unswitch
predicate); inverting that yields [0, 10] u [15, +INF], but that's at least
sub-optimal for the else range.  I think we want to call range_on_edge again
to determine the range on the else branch?
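The arithmetic behind this concern can be checked with a small interval model. Interval, intersect and invert are hypothetical helpers over unsigned closed intervals and are not GCC's irange API; the sketch only reproduces the numbers from the example above.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <vector>

// Hypothetical closed-interval model of an unsigned value range.
struct Interval { unsigned lo, hi; };

// Intersection of two intervals, returned as a 0- or 1-element list.
inline std::vector<Interval>
intersect (Interval a, Interval b)
{
  unsigned lo = std::max (a.lo, b.lo);
  unsigned hi = std::min (a.hi, b.hi);
  if (lo > hi)
    return {};
  return {Interval{lo, hi}};
}

// Complement of one interval within [0, UINT_MAX].
inline std::vector<Interval>
invert (Interval r)
{
  std::vector<Interval> out;
  if (r.lo > 0)
    out.push_back (Interval{0u, r.lo - 1});
  if (r.hi < UINT_MAX)
    out.push_back (Interval{r.hi + 1, UINT_MAX});
  return out;
}
```

Intersecting a > 10 with a < 15 gives [11, 14]; inverting that gives [0, 10] and [15, UINT_MAX], while intersecting a > 10 with a >= 15 gives the tighter [15, UINT_MAX] that a second query on the else edge would be expected to produce.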

 }

-/* Simplifies COND using checks in front of the entry of the LOOP.  Just very
-   simplish (sufficient to prevent us from duplicating loop in unswitching
-   unnecessarily).  */
+static void
+combine_range (predicate_vector _path, tree index, irange
_range)
+{

unless I misread the patch combine_range misses a comment.

+evaluate_control_stmt_using_entry_checks (gimple *stmt,
+ predicate_vector _path)
 {

so this function for ranger does combine all predicates on the predicate_path
but for the symbolic evaluation it looks at the last predicate only?  I guess
that's because other predicate simplification opportunities are applied already,
correct?  But doesn't that mean that the combine_range could be done once
when we build the predicate vector instead of for each stmt?  I'm just
looking at the difference in treating both cases - if we first analyze the whole
unswitching path (including all recursions) then we'd have to simplify all
opportunities at once, so iterating over all predicates would make sense.
Still, merging ranges when pushing them to the predicate vector rather than
for each stmt would make sense?  We'd then have at most one predicate

Re: [Patch 1/8, Arm, AArch64, GCC] Refactor mbranch-protection option parsing and make it common to AArch32 and AArch64 backends. [Was RE: [Patch 2/7, Arm, GCC] Add option -mbranch-protection.]

2021-11-30 Thread Andrea Corallo via Gcc-patches

Tejas Belagod via Gcc-patches  writes:

> Ping for this series.
>
> Thanks,
> Tejas.

Hi all,

pinging this series.

BR

  Andrea

Re: [PATCH] libstdc++: Add [[nodiscard]] to std::byteswap