Re: [PATCH] bpf: correct extra_headers

2021-09-28 Thread Jose E. Marchesi via Gcc-patches


Hi David.

> The BPF CO-RE support (commit 8bdabb37549f12ce727800a1c8aa182c0b1dd42a)
> mistakenly overwrote bpf-*-* extra_headers in config.gcc, causing
> bpf-helpers.h to not be installed. The redefinition with coreout.h is
> unneeded, so delete it.

This is OK.
Thanks.

>
> gcc/ChangeLog:
>
>   * config.gcc (bpf-*-*): Do not overwrite extra_headers.
> ---
>  gcc/config.gcc | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 498c51e619d..aa5bd5d1459 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -1531,7 +1531,6 @@ bpf-*-*)
>  use_collect2=no
>  extra_headers="bpf-helpers.h"
>  use_gcc_stdint=provide
> -extra_headers="coreout.h"
>  extra_objs="coreout.o"
>  target_gtfiles="$target_gtfiles \$(srcdir)/config/bpf/coreout.c"
>  ;;


Re: [PATCH] rs6000: Remove builtin mask check from builtin_decl [PR102347]

2021-09-28 Thread Kewen.Lin via Gcc-patches
Hi Bill,

Thanks for your prompt comments!

on 2021/9/29 3:24 AM, Bill Schmidt wrote:
> Hi Kewen,
> 
> Although I agree that what we do now is tragically bad (and will be fixed in 
> the builtin rewrite), this seems a little too cavalier to remove all checking 
> during initialization without adding any checking somewhere else. :-)  We 
> still need to check for invalid usage when the builtin is expanded, and I 
> don't think the old code does this at all.
> 

If I read the code right, the following places already check for invalid
usage:
  1) for folding, rs6000_gimple_fold_builtin -> rs6000_builtin_is_supported_p
     -> check mask -> defer to expand if invalid.
  2) for expanding, obtain func_valid_p, error in rs6000_invalid_builtin.

Both places seem to exist before the builtin rewrite, am I missing something?

btw, I remember using a gcc built with my fix to compile a test case that is
supposed to fail because of an invalid builtin use under -flto.  It failed
(errored) as expected, but at the LTRANS phase, since that is when expansion
happens in the no-fat-objs scenario.

> Unless you are planning to do a backport, I think the proper way forward here 
> is to just wait for the new builtin support to land.  In the new code, we 
> initialize all built-ins up front, and check properly at expansion time 
> whether the builtin is enabled in the environment that obtains during expand.

Good to know that!  Nice!  btw, for this issue itself, the current
implementation (without the rewrite) also initializes the built-ins in the
table, since the MMA built-ins are guarded by TARGET_EXTRA_BUILTINS.  The
root cause is that rs6000_builtin_mask can't be set up (switched) as
expected, because the check happens too early, right when the built-in
function_decl is being created.
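
To make the timing problem concrete, here is a minimal sketch in plain C. It is
not the rs6000 code: the feature bits and every sketch_* name are made up for
illustration, and only the subset test mirrors the
"(fnmask & rs6000_builtin_mask) != fnmask" check that the patch removes from
rs6000_builtin_decl.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only; the real
   rs6000 builtin masks are named and laid out differently.  */
#define SKETCH_MASK_ALTIVEC (UINT64_C (1) << 0)
#define SKETCH_MASK_VSX     (UINT64_C (1) << 1)
#define SKETCH_MASK_MMA     (UINT64_C (1) << 2)

/* Return true when every bit the builtin requires (FNMASK) is set in
   ENABLED.  */
static bool
sketch_builtin_enabled_p (uint64_t fnmask, uint64_t enabled)
{
  return (fnmask & enabled) == fnmask;
}

/* Mask in effect when the decl is created: the command-line CPU
   (think -mcpu=power9), which lacks MMA.  */
static uint64_t
sketch_mask_at_decl_time (void)
{
  return SKETCH_MASK_ALTIVEC | SKETCH_MASK_VSX;
}

/* Mask in effect once the "#pragma GCC target" of the enclosing
   function has been applied (think cpu=power10).  */
static uint64_t
sketch_mask_at_expand_time (void)
{
  return SKETCH_MASK_ALTIVEC | SKETCH_MASK_VSX | SKETCH_MASK_MMA;
}
```

With these assumed masks, an MMA builtin is rejected when the test runs at decl
time and accepted when the same test runs at expand time, which is why the
placement of the check matters more than the check itself.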

BR,
Kewen

> 
> My two cents,
> Bill
> 
> On 9/28/21 3:13 AM, Kewen.Lin wrote:
>> Hi,
>>
>> As discussed in PR102347, builtin_decl is currently invoked very
>> early, while the function_decl for builtin functions is being made
>> up.  At that time the rs6000_builtin_mask could be wrong for
>> builtins sitting in #pragma/attribute target functions, though it
>> will be updated properly later when LTO processes all nodes.
>>
>> This patch aligns with the practice the i386 port adopts, and also
>> with r10-7462, by relaxing builtin mask checking in some places.
>>
>> Bootstrapped and regress-tested on powerpc64le-linux-gnu P9 and
>> powerpc64-linux-gnu P8.
>>
>> Is it ok for trunk?
>>
>> BR,
>> Kewen
>> -
>> gcc/ChangeLog:
>>
>>  PR target/102347
>>  * config/rs6000/rs6000-call.c (rs6000_builtin_decl): Remove builtin
>>  mask check.
>>
>> gcc/testsuite/ChangeLog:
>>
>>  PR target/102347
>>  * gcc.target/powerpc/pr102347.c: New test.
>>
>> ---
>>  gcc/config/rs6000/rs6000-call.c | 14 --
>>  gcc/testsuite/gcc.target/powerpc/pr102347.c | 15 +++
>>  2 files changed, 19 insertions(+), 10 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102347.c
>>
>> diff --git a/gcc/config/rs6000/rs6000-call.c 
>> b/gcc/config/rs6000/rs6000-call.c
>> index fd7f24da818..15e0e09c07d 100644
>> --- a/gcc/config/rs6000/rs6000-call.c
>> +++ b/gcc/config/rs6000/rs6000-call.c
>> @@ -13775,23 +13775,17 @@ rs6000_init_builtins (void)
>>  }
>>  }
>>
>> -/* Returns the rs6000 builtin decl for CODE.  */
>> +/* Returns the rs6000 builtin decl for CODE.  Note that we don't check
>> +   the builtin mask here since there could be some #pragma/attribute
>> +   target functions and the rs6000_builtin_mask could be wrong when
>> +   this checking happens, though it will be updated properly later.  */
>>
>>  tree
>>  rs6000_builtin_decl (unsigned code, bool initialize_p ATTRIBUTE_UNUSED)
>>  {
>> -  HOST_WIDE_INT fnmask;
>> -
>>if (code >= RS6000_BUILTIN_COUNT)
>>  return error_mark_node;
>>
>> -  fnmask = rs6000_builtin_info[code].mask;
>> -  if ((fnmask & rs6000_builtin_mask) != fnmask)
>> -{
>> -  rs6000_invalid_builtin ((enum rs6000_builtins)code);
>> -  return error_mark_node;
>> -}
>> -
>>return rs6000_builtin_decls[code];
>>  }
>>
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr102347.c 
>> b/gcc/testsuite/gcc.target/powerpc/pr102347.c
>> new file mode 100644
>> index 000..05c439a8dac
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr102347.c
>> @@ -0,0 +1,15 @@
>> +/* { dg-do link } */
>> +/* { dg-require-effective-target power10_ok } */
>> +/* { dg-require-effective-target lto } */
>> +/* { dg-options "-flto -mdejagnu-cpu=power9" } */
>> +
>> +/* Verify there are no error messages in LTO mode.  */
>> +
>> +#pragma GCC target "cpu=power10"
>> +int main ()
>> +{
>> +  float *b;
>> +  __vector_quad c;
>> +  __builtin_mma_disassemble_acc (b, &c);
>> +  return 0;
>> +}
>> --
>> 2.27.0
>>
> 




[r12-3947 Regression] FAIL: gcc.dg/vect/bb-slp-pr97709.c (test for excess errors) on Linux/x86_64

2021-09-28 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

e12f66d96fe41c8ef8a0d01b6a8394cd6bce3978 is the first bad commit
commit e12f66d96fe41c8ef8a0d01b6a8394cd6bce3978
Author: Andrew Pinski 
Date:   Fri Sep 17 04:59:03 2021 +

c: [PR32122] Require pointer types for computed gotos

caused

FAIL: gcc.c-torture/compile/920826-1.c   -O0  (test for excess errors)
FAIL: gcc.c-torture/compile/920826-1.c   -O1  (test for excess errors)
FAIL: gcc.c-torture/compile/920826-1.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (test for excess errors)
FAIL: gcc.c-torture/compile/920826-1.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  (test for excess errors)
FAIL: gcc.c-torture/compile/920826-1.c   -O2  (test for excess errors)
FAIL: gcc.c-torture/compile/920826-1.c   -O3 -g  (test for excess errors)
FAIL: gcc.c-torture/compile/920826-1.c   -Os  (test for excess errors)
FAIL: gcc.c-torture/compile/920831-1.c   -O0  (test for excess errors)
FAIL: gcc.c-torture/compile/920831-1.c   -O1  (test for excess errors)
FAIL: gcc.c-torture/compile/920831-1.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (test for excess errors)
FAIL: gcc.c-torture/compile/920831-1.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  (test for excess errors)
FAIL: gcc.c-torture/compile/920831-1.c   -O2  (test for excess errors)
FAIL: gcc.c-torture/compile/920831-1.c   -O3 -g  (test for excess errors)
FAIL: gcc.c-torture/compile/920831-1.c   -Os  (test for excess errors)
FAIL: gcc.c-torture/compile/pr27863.c   -O0  (test for excess errors)
FAIL: gcc.c-torture/compile/pr27863.c   -O1  (test for excess errors)
FAIL: gcc.c-torture/compile/pr27863.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (test for excess errors)
FAIL: gcc.c-torture/compile/pr27863.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  (test for excess errors)
FAIL: gcc.c-torture/compile/pr27863.c   -O2  (test for excess errors)
FAIL: gcc.c-torture/compile/pr27863.c   -O3 -g  (test for excess errors)
FAIL: gcc.c-torture/compile/pr27863.c   -Os  (test for excess errors)
FAIL: gcc.c-torture/compile/pr70190.c   -O0  (test for excess errors)
FAIL: gcc.c-torture/compile/pr70190.c   -O1  (test for excess errors)
FAIL: gcc.c-torture/compile/pr70190.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (test for excess errors)
FAIL: gcc.c-torture/compile/pr70190.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  (test for excess errors)
FAIL: gcc.c-torture/compile/pr70190.c   -O2  (test for excess errors)
FAIL: gcc.c-torture/compile/pr70190.c   -O3 -g  (test for excess errors)
FAIL: gcc.c-torture/compile/pr70190.c   -Os  (test for excess errors)
FAIL: gcc.dg/torture/pr89135.c   -O0  (test for excess errors)
FAIL: gcc.dg/torture/pr89135.c   -O1  (test for excess errors)
FAIL: gcc.dg/torture/pr89135.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (test for excess errors)
FAIL: gcc.dg/torture/pr89135.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  (test for excess errors)
FAIL: gcc.dg/torture/pr89135.c   -O2  (test for excess errors)
FAIL: gcc.dg/torture/pr89135.c   -O3 -fomit-frame-pointer -funroll-loops 
-fpeel-loops -ftracer -finline-functions  (test for excess errors)
FAIL: gcc.dg/torture/pr89135.c   -O3 -g  (test for excess errors)
FAIL: gcc.dg/torture/pr89135.c   -Os  (test for excess errors)
FAIL: gcc.dg/torture/pr90071.c   -O0  (test for excess errors)
FAIL: gcc.dg/torture/pr90071.c   -O1  (test for excess errors)
FAIL: gcc.dg/torture/pr90071.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (test for excess errors)
FAIL: gcc.dg/torture/pr90071.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  (test for excess errors)
FAIL: gcc.dg/torture/pr90071.c   -O2  (test for excess errors)
FAIL: gcc.dg/torture/pr90071.c   -O3 -fomit-frame-pointer -funroll-loops 
-fpeel-loops -ftracer -finline-functions  (test for excess errors)
FAIL: gcc.dg/torture/pr90071.c   -O3 -g  (test for excess errors)
FAIL: gcc.dg/torture/pr90071.c   -Os  (test for excess errors)
FAIL: gcc.dg/vect/bb-slp-pr97709.c -flto -ffat-lto-objects (test for excess 
errors)
FAIL: gcc.dg/vect/bb-slp-pr97709.c (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-3947/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="compile.exp=gcc.c-torture/compile/920826-1.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="compile.exp=gcc.c-torture/compile/920826-1.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="compile.exp=gcc.c-torture/compile/920826-1.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="compile.exp=gcc.c-torture/compile/920826-1.c 

Re: [PATCH] rs6000: Remove builtin mask check from builtin_decl [PR102347]

2021-09-28 Thread Peter Bergner via Gcc-patches
On 9/28/21 2:24 PM, Bill Schmidt via Gcc-patches wrote:
> Unless you are planning to do a backport, I think the proper way forward
> here is to just wait for the new builtin support to land.  In the new code,
> we initialize all built-ins up front, and check properly at expansion time
> whether the builtin is enabled in the environment that obtains during expand.

Bill, have you tried the test case in the bugzilla with your builtin rewrite
and it works?  If so, then I think the correct thing would be to skip "fixing"
trunk and wait for your builtin rewrite to land there.

That said, this does fail on GCC11 and GCC10 (not sure about earlier, haven't
tried them yet), so we will need some type of fix for the releases.  I do think
your concern about not having some checking elsewhere is valid, unless there
already is some checking there and you and I are just not aware of it.

Peter




[PATCH] RISC-V: Pattern name fix mul*3_highpart -> smul*3_highpart.

2021-09-28 Thread Jim Wilson
From: Geng Qi 

No known code changes, just fixes an inconsistency that was noticed.

Committed.

Jim

gcc/
* config/riscv/riscv.md (mulv<mode>4): Call gen_smul<mode>3_highpart.
(<u>mulditi3): Call <su>muldi3_highpart.
(<u>muldi3_highpart): Rename to <su>muldi3_highpart.
(<u>mulsidi3): Call <su>mulsi3_highpart.
(<u>mulsi3_highpart): Rename to <su>mulsi3_highpart.
---
 gcc/config/riscv/riscv.md | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index f88877fd596..98364f00f6d 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -802,7 +802,7 @@ (define_expand "mulv<mode>4"
   rtx hp = gen_reg_rtx (<MODE>mode);
   rtx lp = gen_reg_rtx (<MODE>mode);
 
-  emit_insn (gen_mul<mode>3_highpart (hp, operands[1], operands[2]));
+  emit_insn (gen_smul<mode>3_highpart (hp, operands[1], operands[2]));
   emit_insn (gen_mul<mode>3 (operands[0], operands[1], operands[2]));
   emit_insn (gen_ashr<mode>3 (lp, operands[0],
  GEN_INT (BITS_PER_WORD - 1)));
@@ -899,14 +899,14 @@ (define_expand "<u>mulditi3"
   emit_insn (gen_muldi3 (low, operands[1], operands[2]));
 
   rtx high = gen_reg_rtx (DImode);
-  emit_insn (gen_<u>muldi3_highpart (high, operands[1], operands[2]));
+  emit_insn (gen_<su>muldi3_highpart (high, operands[1], operands[2]));
 
   emit_move_insn (gen_lowpart (DImode, operands[0]), low);
   emit_move_insn (gen_highpart (DImode, operands[0]), high);
   DONE;
 })
 
-(define_insn "<u>muldi3_highpart"
+(define_insn "<su>muldi3_highpart"
   [(set (match_operand:DI 0 "register_operand" "=r")
(truncate:DI
  (lshiftrt:TI
@@ -961,13 +961,13 @@ (define_expand "<u>mulsidi3"
 {
   rtx temp = gen_reg_rtx (SImode);
   emit_insn (gen_mulsi3 (temp, operands[1], operands[2]));
-  emit_insn (gen_<u>mulsi3_highpart (riscv_subword (operands[0], true),
+  emit_insn (gen_<su>mulsi3_highpart (riscv_subword (operands[0], true),
 operands[1], operands[2]));
   emit_insn (gen_movsi (riscv_subword (operands[0], false), temp));
   DONE;
 })
 
-(define_insn "<u>mulsi3_highpart"
+(define_insn "<su>mulsi3_highpart"
   [(set (match_operand:SI 0 "register_operand" "=r")
(truncate:SI
  (lshiftrt:DI
-- 
2.25.1



Re: [PATCH] RISC-V: Pattern name fix mulm3_highpart -> smulm3_highpart.

2021-09-28 Thread Jim Wilson
On Mon, Sep 27, 2021 at 4:38 AM Geng Qi via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> gcc/ChangeLog:
> * config/riscv/riscv.md
> (<u>muldi3_highpart): Rename to <su>muldi3_highpart.
> (<u>mulditi3): Emit <su>muldi3_highpart.
> (<u>mulsi3_highpart): Rename to <su>mulsi3_highpart.
> (<u>mulsidi3): Emit <su>mulsi3_highpart.
>

This doesn't build on top of tree sources.  It is missing the
mulv3_highpart change I mentioned in the riscv-gcc review.  Also, I
prefer that the order of the changelog entries match the order of hunks in
the patch, it is easier to review that way.  Otherwise, the patch is OK and
I committed it with minor changes.  Since I changed it, I need to send the
patch I did actually commit.

Jim


Re: [RFC PATCH 0/8] RISC-V: Bit-manipulation extension.

2021-09-28 Thread Jim Wilson
On Tue, Sep 28, 2021 at 3:05 PM Christoph Muellner <
cmuell...@ventanamicro.com> wrote:

> We talked about this in the T meeting on Monday.
> Philipp Tomsich mentioned, that he has a patchset from earlier this
> year, which adds support for Zbs.
> He proposed to rebase it and send it to the list in the next days.
>

And this is hopefully a patch that isn't contaminated by any of the changes
from Claire that we can't use.  That is one of the benefits of asking PLCT
as they never worked with Claire.  But at this point I think that enough
time has passed and enough changes were made to the zbs spec, and
considering that there is really only one good way to implement the zbs
binutils support anyways, I think we are probably safe to use patches from
one of us that did work with Claire.

Jim


Re: [RFC PATCH 0/8] RISC-V: Bit-manipulation extension.

2021-09-28 Thread Christoph Muellner
On Wed, Sep 29, 2021 at 12:01 AM Jim Wilson  wrote:
>
> On Thu, Sep 23, 2021 at 12:57 AM Kito Cheng  wrote:
>>
>> Bit manipulation extension[1] is finishing the public review and waiting for
>> the rest of the ratification process, I believe that will become a ratified
>> extension soon, so I think it's time to submit to upstream for review now :)
>
>
> We still don't have upstream zbs assembler support.  We have rejected other 
> patches because they didn't upstream the assembler support first.  We should 
> be following the same rule here for bitmanip.
>
> Maybe we can ask PLCT to write the missing assembler support?

We talked about this in the T meeting on Monday.
Philipp Tomsich mentioned, that he has a patchset from earlier this
year, which adds support for Zbs.
He proposed to rebase it and send it to the list in the next days.


Re: [RFC PATCH 0/8] RISC-V: Bit-manipulation extension.

2021-09-28 Thread Jim Wilson
On Mon, Sep 27, 2021 at 4:20 AM Christoph Muellner <
cmuell...@ventanamicro.com> wrote:

> In case somebody wants to test this patchset, a patchset for Binutils
> is required as well.
> AFAIK here would be the Binutils branch with the required changes:
>
> https://github.com/riscv-collab/riscv-binutils-gdb/tree/riscv-binutils-experiment


This branch only has the zba/zbb/zbc support that is already upstream.
There is nothing useful here.

Jim


Re: [RFC PATCH 0/8] RISC-V: Bit-manipulation extension.

2021-09-28 Thread Jim Wilson
On Thu, Sep 23, 2021 at 12:57 AM Kito Cheng  wrote:

> Bit manipulation extension[1] is finishing the public review and waiting
> for
> the rest of the ratification process, I believe that will become a ratified
> extension soon, so I think it's time to submit to upstream for review now
> :)
>

We still don't have upstream zbs assembler support.  We have rejected other
patches because they didn't upstream the assembler support first.  We
should be following the same rule here for bitmanip.

Maybe we can ask PLCT to write the missing assembler support?

Jim


RE: [PATCH] Make flag_trapping_math a non-binary Boolean.

2021-09-28 Thread Joseph Myers
On Tue, 28 Sep 2021, Roger Sayle wrote:

> Next, I'd like to state that your "five restrictions" ontology is an 
> excellent starting point, but I'd like to argue that your proposed list 
> of 5 is the wrong shape (insufficiently refined). Instead, I'd like to 
> counter-propose that an improvement/refinement of the Myers model, is 
> actually "3 primitive restrictions * N trapping conditions * 2 flow 
> control sensitivity".

It's true you can treat the rules on what code transformations are 
permitted as orthogonal to which exceptions or sub-exceptions those are 
applied to.  (I'm not sure exactly what you are including under "flow 
control sensitivity".)  And also that IEEE 754-2019 subclause 8.1 says 
that language standards should allow for alternate exception handling 
attributes to be associated with sets of exceptions or sub-exceptions (as 
well as being associated to particular blocks in the source code, as 
represented for C by the pragmas defined in TS 18661-5), which does tend 
to suggest such a model listing (sub-)exceptions separately for each rule 
on alternate exception handling.

Note that "trapping conditions" is not a good way of expressing things; 
the right way is much closer to "alternate exception handling" as defined 
in IEEE 754-2019 (or -2008), even if some of the more permissive modes 
(e.g. allowing spurious exceptions to be raised) don't actually correspond 
to any kind of alternate exception handling described in IEEE 754.  
Trapping, in the sense of transferring control to a trap handler 
(typically a SIGFPE signal handler) is, at least at the level of APIs for 
user code, an obsolescent form of exception handling: it was described in 
IEEE 754-1985 but removed in IEEE 754-2008, replaced by alternate 
exception handling.  Trapping and trap handlers are too machine-specific 
to form a good API for normal user code.  Some architectures may invoke a 
trap handler some time later than the instruction that signaled the 
exception.  Some may not support trapping on floating-point exceptions at 
all: for example, support is optional on Arm and many processors don't 
implement it, and RISC-V doesn't support trapping on floating-point 
exceptions at all.

So I think we should avoid reference to traps, when talking about 
floating-point exceptions, as much as possible, in the GCC documentation, 
command-line option names, source code and development discussion, except 
in limited cases where the specific legacy mechanism described in IEEE 
754-1985 is meant.  That doesn't make much difference to permitted 
optimizations; some forms of alternate exception handling would place 
similar restrictions on permitted code transformations to those 
restrictions coming from 1985-style trapping.

> Next your item [4] highlights what I consider the underlying problem that
> until now has been overlooked: different kinds of traps are
> observationally/behaviourally different.  Above you describe "underflow",
> but likewise there are traps for inexact result, "2.0 / 3.0", traps for
> division by 0.0, which invokes undefined behaviour in C++ (but sometimes
> not in C), and distinctions between quiet and signaling NaNs.  Your
> primitive restrictions [1], [2] and [5] may apply differently to these
> different kinds of exceptions.

As per the above, these aren't kinds of traps, but exceptions or 
sub-exceptions.

> Consider the following four lines of C++:
> constexpr t1 = 2.0 / 3.0;
> constexpr t2 = std::numeric_limits<double>::quiet_NaN() == 0.0;
> constexpr t2 = std::numeric_limits<double>::quiet_NaN() < 0.0;
> constexpr t3 = 1.0 / 0.0;
> which by IEEE generate four different types of exception, but as you've

t2 does not generate an exception; == is compareQuietEqual not 
compareSignalingEqual.

> Two very useful references I've been following are:
> https://docs.oracle.com/cd/E19957-01/806-3568/ncg_handle.html
> https://docs.oracle.com/cd/E88353_01/html/E37846/fex-getexcepthandler-3m.html

I don't think these are a good starting point; the TS 18661-5 APIs are a 
more appropriate basis for possible C bindings to alternate exception 
handling as described in IEEE 754-2008 or -2019, as opposed to 1985-style 
trapping or anything not based on 754-2008 or newer.

> numbers in match.pd, fold-const.c and simplify-rtx.c.  For example, what
> IEEE calls
> "FPE_INTOVF" is more commonly known as TRAPV inside GCC.  Likewise, IEEE

IEEE has no such name as FPE_INTOVF.

-ftrapv is itself an obsolescent feature.  Not because of any problems 
with its notion of trap, which is disjoint from that of floating-point 
exceptions (it's a synchronous call to abort or something equivalent), but 
because the implementation is problematic and an alias for certain 
sanitizer options is more maintainable.  We should be moving to that alias 
rather than adding any more internal representation related to -ftrapv.

> concepts such as FE_INVALID are really just groups of bits in our 
> enumeration, but we allow much finer 

Re: [PATCH] PR fortran/102520 - [10/11/12 Regression] ICE in expand_constructor, at fortran/array.c:1802

2021-09-28 Thread Thomas Koenig via Gcc-patches

Hi Harald,


> Gerhard's testcase triggers a NULL pointer dereference during the
> attempt to expand an invalid constructor.  The simple and obvious
> solution is to catch that case.
>
> Regtested on x86_64-pc-linux-gnu.  OK for all affected branches?


OK.

Thanks for the patch!

Best regards

Thomas


Re: [PATCH] [GIMPLE] Simplify (_Float16) ceil ((double) x) to .CEIL (x) when available.

2021-09-28 Thread Joseph Myers
On Tue, 28 Sep 2021, Hongtao Liu via Gcc-patches wrote:

> Yes, that's why I didn't follow the existing pattern, i think we can
> add optimize back to the condition, but not canonicalize_math_p ()
> since there's no math function for _Float16.

At some point we should add _Float16 functions to glibc, but that's 
trickier than for the _FloatN / _FloatNx types that are already supported 
there because of the implications for minimum compiler versions supported 
(_Float16 functions could be supported in glibc for AArch64 and Arm by 
increasing the minimum compiler version for building glibc for those 
architectures to GCC 7, but not for x86_64 and x86 until we're ready to 
require GCC 12 or later to build glibc for those architectures), and we'd 
need to think carefully about design questions such as when functions are 
implemented using _Float16 or float arithmetic or just wrap float 
functions as a starting point.  And for generating calls to such functions 
GCC would need to know whether the libm implementation provides them or 
not.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] PR fortran/102520 - [10/11/12 Regression] ICE in expand_constructor, at fortran/array.c:1802

2021-09-28 Thread Harald Anlauf via Gcc-patches
Dear Fortranners,

Gerhard's testcase triggers a NULL pointer dereference during the
attempt to expand an invalid constructor.  The simple and obvious
solution is to catch that case.

Regtested on x86_64-pc-linux-gnu.  OK for all affected branches?

Thanks,
Harald

Fortran: fix error recovery for invalid constructor

gcc/fortran/ChangeLog:

	PR fortran/102520
	* array.c (expand_constructor): Do not dereference NULL pointer.

gcc/testsuite/ChangeLog:

	PR fortran/102520
	* gfortran.dg/pr102520.f90: New test.

diff --git a/gcc/fortran/array.c b/gcc/fortran/array.c
index b858bada18a..8d66e009f66 100644
--- a/gcc/fortran/array.c
+++ b/gcc/fortran/array.c
@@ -1798,6 +1805,9 @@ expand_constructor (gfc_constructor_base base)

   e = c->expr;

+  if (e == NULL)
+	return false;
+
   if (empty_constructor)
 	empty_ts = e->ts;

diff --git a/gcc/testsuite/gfortran.dg/pr102520.f90 b/gcc/testsuite/gfortran.dg/pr102520.f90
new file mode 100644
index 000..1c98c185c17
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr102520.f90
@@ -0,0 +1,12 @@
+! { dg-do compile }
+! PR fortran/102520 - ICE in expand_constructor, at fortran/array.c:1802
+
+program p
+  type t
+  end type
+  type(t), parameter :: a(4)   = shape(1) ! { dg-error "Incompatible" }
+  type(t), parameter :: b(2,2) = reshape(a,[2,2]) ! { dg-error "Incompatible" }
+  type(t), parameter :: c(2,2) = transpose(b) ! { dg-error "Unclassifiable" }
+end
+
+! { dg-error "Different shape for array assignment" " " { target *-*-* } 7 }


Re: [PATCH] Loop unswitching: support gswitch statements.

2021-09-28 Thread Andrew MacLeod via Gcc-patches

On 9/28/21 7:50 AM, Richard Biener wrote:

On Wed, Sep 15, 2021 at 10:46 AM Martin Liška  wrote:

   /* Unswitch single LOOP.  NUM is number of unswitchings done; we do not allow
@@ -269,6 +311,7 @@ tree_unswitch_single_loop (class loop *loop, int num)
 class loop *nloop;
 unsigned i, found;
 tree cond = NULL_TREE;
+  edge cond_edge = NULL;
 gimple *stmt;
 bool changed = false;
 HOST_WIDE_INT iterations;
@@ -311,11 +354,12 @@ tree_unswitch_single_loop (class loop *loop, int num)
 bbs = get_loop_body (loop);
 found = loop->num_nodes;

+  gimple_ranger ranger;

ISTR constructing/destructing ranger has a non-negligible overhead -
is it possible
to keep it live for a longer time (note we're heavily modifying the CFG)?



There is some overhead.  Right now we determine all the imports and 
exports for each block ahead of time, but that's about it.  We can make 
adjustments for true on-demand clients like this so that even that 
doesn't happen.  We only do that so we know ahead of time which ssa-names 
are never used in outgoing edges, and never even have to check those.  
That's mostly an optimization for heavy users like EVRP.  If you want, I 
can make that an option so there is virtually no overhead.


More importantly, the longer it remains alive, the more "reuse" of 
ranges you will get.  If there is not a pattern of using variables 
from earlier in the program, it wouldn't really matter much.


In theory, modifying the IL should be fine; it happens already in 
places, but it's not extensively tested under those conditions yet.



 while (1)
   {
 /* Find a bb to unswitch on.  */
 for (; i < loop->num_nodes; i++)
-   if ((cond = tree_may_unswitch_on (bbs[i], loop)))
+   if ((cond = tree_may_unswitch_on (bbs[i], loop, &cond_edge)))
   break;

 if (i == loop->num_nodes)
@@ -333,24 +377,70 @@ tree_unswitch_single_loop (class loop *loop, int num)
   break;
 }

-  cond = simplify_using_entry_checks (loop, cond);

I also fear we're losing simplification of unswitching on float compares or even
run into endlessly unswitching on the same condition?

Ranger will only do integral/pointer stuff right now.  Floats are on the 
list for GCC13, fwiw.


In the testcases I failed to see some verifying we're _not_ repeatedly
processing
ifs, like scan for a definitive number of unswitchings for, say,

   for (..)
 {
  if (a)
   ...;
  xyz;
  if (a)
...;
 }

where we want to unswitch on if (a) only once (but of course simplify the second
if (a) ideally from within unswitching so CFG cleanup removes the dead paths).
The old code guaranteed this even for float compares IIRC.

At least also add scan-tree-dump-times overall expected unswitch count scans
to the new testcases.
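
As a concrete picture of the transformation being discussed, here is a
hand-written sketch of the loop above before and after unswitching on the
invariant condition. The function names and loop bodies are invented for
illustration; this is what the pass is expected to achieve, not output
produced by GCC.

```c
/* Loop as written: the invariant condition A is tested twice per
   iteration.  */
static long
sketch_loop_original (int a, const int *v, int n)
{
  long sum = 0;
  for (int i = 0; i < n; i++)
    {
      if (a)
        sum += v[i];
      sum += 1;
      if (a)
        sum += 2 * v[i];
    }
  return sum;
}

/* Hand-unswitched form: A is tested once, outside the loop, and each
   copy of the loop has both "if (a)" tests folded away.  */
static long
sketch_loop_unswitched (int a, const int *v, int n)
{
  long sum = 0;
  if (a)
    for (int i = 0; i < n; i++)
      {
        sum += v[i];
        sum += 1;
        sum += 2 * v[i];
      }
  else
    for (int i = 0; i < n; i++)
      sum += 1;
  return sum;
}
```

Only one unswitching is needed here: once the loop is versioned on a, the
second "if (a)" is known in both copies and should simplify away rather than
trigger another round of unswitching.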

Btw, I was hoping to use the relation stuff here, not so much use range
queries, but see below ...

I seem to recall a discussion about using predication and how that's 
really what we want here?  I had some thoughts on a predication engine 
we could add utilizing the relations oracle, but haven't had time to 
look into it yet.



 stmt = last_stmt (bbs[i]);
-  if (integer_nonzerop (cond))
+  gcond *condition = dyn_cast <gcond *> (stmt);
+  gswitch *swtch = dyn_cast <gswitch *> (stmt);
+
+  if (condition != NULL)
 {
- /* Remove false path.  */
- gimple_cond_set_condition_from_tree (as_a <gcond *> (stmt),
-  boolean_true_node);
- changed = true;
+ int_range_max r;
+ edge edge_true, edge_false;
+ extract_true_false_edges_from_block (bbs[i], &edge_true, &edge_false);
+ tree cond = gimple_cond_lhs (stmt);
+
+ if (r.supports_type_p (TREE_TYPE (cond)))
+   {
+ if (ranger.range_on_edge (r, edge_true, cond)
+ && r.undefined_p ())

Can you really use ranger this way to tell whether the edge is not executed?
I think you instead want to somehow evaluate the gcond or gswitch, looking
for a known taken edge?


Yes and no  :-)  I used to do that, but now that we allow uninitialized 
values to be treated as UNDEFINED, it may also mean that it's 
uninitialized on that edge.


Evaluating
if (c_3 == 0)   when we know c_3 = [1,1]

What you suggest is fundamentally what ranger does.  It evaluates what 
the full set of possible ranges is on the edge you ask about, then 
intersects it with the known range of c_3.  If the condition cannot 
ever be true, and is thus unexecutable, the result will be UNDEFINED.  
I.e., above, c_3 would have to have a range of [0,0] on the true edge, and 
its real range is [1,1]; intersecting the two values results in UNDEFINED.
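
The intersection argument can be sketched with a toy interval type. This is an
illustration only: sketch_range and its helpers are hypothetical and far
simpler than ranger's int_range classes, which track multiple sub-ranges and
types.

```c
#include <stdbool.h>

/* A toy closed interval [lo, hi]; UNDEFINED models the empty range.  */
struct sketch_range
{
  bool undefined;
  long lo, hi;
};

/* Build [LO, HI]; an inverted pair such as (1, 0) yields UNDEFINED.  */
static struct sketch_range
sketch_make (long lo, long hi)
{
  struct sketch_range r = { lo > hi, lo, hi };
  return r;
}

/* Intersect A and B; the result is UNDEFINED when they share no value
   or when either input is already UNDEFINED.  */
static struct sketch_range
sketch_intersect (struct sketch_range a, struct sketch_range b)
{
  if (a.undefined || b.undefined)
    return sketch_make (1, 0);

  long lo = a.lo > b.lo ? a.lo : b.lo;
  long hi = a.hi < b.hi ? a.hi : b.hi;
  return sketch_make (lo, hi);
}
```

For "if (c_3 == 0)" with c_3 known to be [1,1], the true edge requires [0,0],
and sketch_intersect of the two is UNDEFINED. An input that is already
UNDEFINED, as in a use-before-def, stays UNDEFINED whichever edge is queried,
so the result alone does not distinguish the two situations.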


So it can mean the edge is unexecutable.  It can also mean the value is 
actually undefined: if this was a use-before-def case, the range of c_3 
in the block would be UNDEFINED, and c_3 will be UNDEFINED on BOTH 
edges due to the intersection.

Re: [patch][gcc12-changes] Add a new item about the support for automatic static variable initialization

2021-09-28 Thread Kees Cook via Gcc-patches
On Tue, Sep 28, 2021 at 08:31:13PM +, Qing Zhao wrote:
> Hi,
> 
> This is the patch for the gcc12 changes  per your request. 
> 
> Kees provided most of the wording. 
> 
> Please take a look and let us know whether it’s good for commit.
> 
> thanks.
> 
> Qing
> 
> 
> 
> 
> From: qing zhao 
> Date: Tue, 28 Sep 2021 12:01:42 -0700
> Subject: [PATCH] gcc-12/changes.html: Uninitialized stack variables
>  initialization update
> 
>   * htdocs/gcc-12/changes.html (Eliminating uninitialized variables):
>   Item about the support for automatic static variable initialization.
> ---
>  htdocs/gcc-12/changes.html | 19 +++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
> index 1f156a9..8e2979c 100644
> --- a/htdocs/gcc-12/changes.html
> +++ b/htdocs/gcc-12/changes.html
> @@ -245,6 +245,25 @@ a work-in-progress.
>  
>  Other significant improvements
>  
> +Eliminating uninitialized variables
> +
> +
> +  GCC can now initialize all stack variables implicitly, including
> +  padding. This is intended to eliminate all classes of uninitialized
> +  stack variable flaws. Lack of explicit initialization will still
> +  warn when -Wuninitialized is active. For best
> +  debugging, use of the new command-line option
> +  -ftrivial-auto-var-init=pattern can be used to fill
> +  variables with a repeated 0xFE pattern, which tends to illuminate
> +  many bugs (e.g. pointers receive invalid addresses, sizes and indices
> +  are very large). For best production results, the new command-line
> +  option -ftrivial-auto-var-init=zero can be used to
> +  fill variables with 0x00, which tends to provide a safer state for
> +  bugs (e.g. pointers are NULL, strings are NULL filled, and sizes

Minor nit: I've always been corrected that "NULL" refers to a pointer, and
"NUL" refers to the "null character", so the latter use of NULL should be
"NUL": ... pointers are NULL, strings are NUL filled, and size ...

I mix this up all the time, so apologies if that got introduced by me!
:)

-Kees

> +  and indices are 0).
> +  
> +
> +
>  Debugging formats
>  
>  
> -- 
> 1.9.1
> 
> 

-- 
Kees Cook


[PATCH, v2] c++: Fix up synthetization of defaulted comparison operators on classes with bitfields [PR102490]

2021-09-28 Thread Jakub Jelinek via Gcc-patches
On Tue, Sep 28, 2021 at 03:33:35PM -0400, Jason Merrill wrote:
> > > According to the function comment for defaulted_late_check, won't
> > > COMPLETE_TYPE_P (ctx) always be false here?
> 
> Not for a function defaulted outside the class.
> 
> > If so, I wonder if we could get away with moving this entire fragment
> > from defaulted_late_check to finish_struct_1 instead of calling
> > defaulted_late_check from finish_struct_1.
> 
> The comment in check_bases_and_members says that we call it there so that
> it's before we clone [cd]tors.  Probably better to leave the call there for
> other functions, just skip it for comparisons.

So like this instead then?  Just tested with dg.exp=*spaceship* so far.

2021-09-28  Jakub Jelinek  

PR c++/102490
* method.c (defaulted_late_check): Don't synthetize constexpr
defaulted comparisons.
* class.c (finish_struct_1): Synthetize constexpr defaulted comparisons
here after layout_class_type.

* g++.dg/cpp2a/spaceship-eq11.C: New test.
* g++.dg/cpp2a/spaceship-eq12.C: New test.

--- gcc/cp/method.c.jj  2021-09-28 11:34:10.165412477 +0200
+++ gcc/cp/method.c 2021-09-28 22:28:23.637981709 +0200
@@ -3158,18 +3158,7 @@ defaulted_late_check (tree fn)
   special_function_kind kind = special_function_p (fn);
 
   if (kind == sfk_comparison)
-{
-  /* If the function was declared constexpr, check that the definition
-qualifies.  Otherwise we can define the function lazily.  */
-  if (DECL_DECLARED_CONSTEXPR_P (fn) && !DECL_INITIAL (fn))
-   {
- /* Prevent GC.  */
- function_depth++;
- synthesize_method (fn);
- function_depth--;
-   }
-  return;
-}
+return;
 
   bool fn_const_p = (copy_fn_p (fn) == 2);
   tree implicit_fn = implicitly_declare_fn (kind, ctx, fn_const_p,
--- gcc/cp/class.c.jj   2021-09-28 11:34:10.096413431 +0200
+++ gcc/cp/class.c  2021-09-28 22:29:59.072669058 +0200
@@ -7467,7 +7467,21 @@ finish_struct_1 (tree t)
  for any static member objects of the type we're working on.  */
   for (x = TYPE_FIELDS (t); x; x = DECL_CHAIN (x))
 if (DECL_DECLARES_FUNCTION_P (x))
-  DECL_IN_AGGR_P (x) = false;
+  {
+   /* Synthetize constexpr defaulted comparisons.  */
+   if (!DECL_ARTIFICIAL (x)
+   && DECL_DEFAULTED_IN_CLASS_P (x)
+   && special_function_p (x) == sfk_comparison
+   && DECL_DECLARED_CONSTEXPR_P (x)
+   && !DECL_INITIAL (x))
+ {
+   /* Prevent GC.  */
+   function_depth++;
+   synthesize_method (x);
+   function_depth--;
+ }
+   DECL_IN_AGGR_P (x) = false;
+  }
 else if (VAR_P (x) && TREE_STATIC (x)
 && TREE_TYPE (x) != error_mark_node
 && same_type_p (TYPE_MAIN_VARIANT (TREE_TYPE (x)), t))
--- gcc/testsuite/g++.dg/cpp2a/spaceship-eq11.C.jj  2021-09-28 22:27:40.524574708 +0200
+++ gcc/testsuite/g++.dg/cpp2a/spaceship-eq11.C 2021-09-28 22:27:40.524574708 +0200
@@ -0,0 +1,43 @@
+// PR c++/102490
+// { dg-do run { target c++20 } }
+
+struct A
+{
+  unsigned char a : 1;
+  unsigned char b : 1;
+  constexpr bool operator== (const A &) const = default;
+};
+
+struct B
+{
+  unsigned char a : 8;
+  int : 0;
+  unsigned char b : 7;
+  constexpr bool operator== (const B &) const = default;
+};
+
+struct C
+{
+  unsigned char a : 3;
+  unsigned char b : 1;
+  constexpr bool operator== (const C &) const = default;
+};
+
+void
+foo (C &x, int y)
+{
+  x.b = y;
+}
+
+int
+main ()
+{
+  A a{}, b{};
+  B c{}, d{};
+  C e{}, f{};
+  a.b = 1;
+  d.b = 1;
+  foo (e, 0);
+  foo (f, 1);
+  return a == b || c == d || e == f;
+}
--- gcc/testsuite/g++.dg/cpp2a/spaceship-eq12.C.jj  2021-09-28 22:27:40.524574708 +0200
+++ gcc/testsuite/g++.dg/cpp2a/spaceship-eq12.C 2021-09-28 22:27:40.524574708 +0200
@@ -0,0 +1,5 @@
+// PR c++/102490
+// { dg-do run { target c++20 } }
+// { dg-options "-O2" }
+
+#include "spaceship-eq11.C"


Jakub



[patch][gcc12-changes] Add a new item about the support for automatic static variable initialization

2021-09-28 Thread Qing Zhao via Gcc-patches
Hi,

This is the patch for the gcc12 changes  per your request. 

Kees provided most of the wording. 

Please take a look and let me know whether it’s good to commit.

thanks.

Qing




From: qing zhao 
Date: Tue, 28 Sep 2021 12:01:42 -0700
Subject: [PATCH] gcc-12/changes.html: Uninitialized stack variables
 initialization update

* htdocs/gcc-12/changes.html (Eliminating uninitialized variables):
Item about the support for automatic static variable initialization.
---
 htdocs/gcc-12/changes.html | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index 1f156a9..8e2979c 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -245,6 +245,25 @@ a work-in-progress.
 
 Other significant improvements
 
+Eliminating uninitialized variables
+
+
+  GCC can now initialize all stack variables implicitly, including
+  padding. This is intended to eliminate all classes of uninitialized
+  stack variable flaws. Lack of explicit initialization will still
+  warn when -Wuninitialized is active. For best
+  debugging, use of the new command-line option
+  -ftrivial-auto-var-init=pattern can be used to fill
+  variables with a repeated 0xFE pattern, which tends to illuminate
+  many bugs (e.g. pointers receive invalid addresses, sizes and indices
+  are very large). For best production results, the new command-line
+  option -ftrivial-auto-var-init=zero can be used to
+  fill variables with 0x00, which tends to provide a safer state for
+  bugs (e.g. pointers are NULL, strings are NULL filled, and sizes
+  and indices are 0).
+  
+
+
 Debugging formats
 
 
-- 
1.9.1
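
For readers who want to see what the two fill values mean for typical fields, here is a small simulation. It is a sketch under assumptions: `Record` and `filled_with` are hypothetical, and `memset` merely imitates the byte pattern that the text above says the option leaves behind; it is not how the compiler implements the feature.

```cpp
#include <cassert>  // assert, used by the checks in the note below
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical record standing in for an uninitialized stack variable.
struct Record {
  std::size_t length;
  char name[8];
};

// Fill every byte of a Record with one byte value, imitating the state
// described for -ftrivial-auto-var-init=pattern (0xFE) and =zero (0x00).
Record filled_with(unsigned char byte) {
  Record r;
  std::memset(&r, byte, sizeof r);
  return r;
}
```

After `filled_with(0x00)`, `length` is 0 and `name` is NUL filled, so `strlen` on it returns 0. After `filled_with(0xFE)`, `length` has its top bit set and is therefore larger than `SIZE_MAX / 2`, which is the "sizes and indices are very large" effect mentioned in the text.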




Re: Fix 48631_neg test in _GLIBCXX_VERSION_NAMESPACE mode

2021-09-28 Thread Jonathan Wakely via Gcc-patches
On Tue, 28 Sept 2021 at 21:21, François Dumont via Libstdc++
 wrote:
>
> On 27/09/21 11:06 pm, Jonathan Wakely wrote:
> > On Mon, 27 Sept 2021 at 21:26, François Dumont via Libstdc++
> >  wrote:
> >> Here is a small patch to fix a test which fails in
> >> _GLIBCXX_VERSION_NAMESPACE mode.
> >>
> >> IMHO it would be better to avoid putting  content in
> >> versioned namespace, no ?
>
> No opinion on this, you prefer to use consistently the versioned namespace ?

I haven't thought about it in much detail, but I think it's safer to
keep them in the versioned namespace.

Can we be sure that we'd never need to make any incompatible changes
to anything in that header? It seems likely, but I'm not entirely
confident.


> >> There is of course more work to do, so for now here is the simpler 
> >> approach.
> >>
> >> Ok to commit ?
> > Leaving the pattern ending with just "struct" isn't very useful.
> > Wouldn't it be better to do:
> >
> > // { dg-prune-output "no type named 'type' in" }
> >
> > or just:
> >
> > // { dg-prune-output "enable_if" }
> >
> > ?
> >
> > Either of those is OK to commit.
>
> Done with "enable_if"

Thanks.



[PATCH v4] attribs: Implement -Wno-attributes=vendor::attr [PR101940]

2021-09-28 Thread Marek Polacek via Gcc-patches
On Thu, Sep 23, 2021 at 02:25:16PM -0400, Jason Merrill wrote:
> On 9/20/21 18:59, Marek Polacek via Gcc-patches wrote:
> > +void
> > +handle_ignored_attributes_option (vec<char *> *v)
> > +{
> > +  if (v == nullptr)
> > +return;
> > +
> > +  for (auto opt : v)
> > +{
> > +  if (strcmp (opt, "clang") == 0)
> > +   {
> > + // TODO
> > + continue;
> > +   }
> 
> If this doesn't work yet, let's not accept it at all for now.

Ok.
 
> > +  char *q = strstr (opt, "::");
> > +  /* We don't accept '::attr'.  */
> > +  if (q == nullptr || q == opt)
> > +   {
> > + error ("wrong argument to ignored attributes");
> > + inform (input_location, "valid format is %, %, "
> > + "or %");
> 
> ...or even mention it.  Users can ignore clang:: instead, it doesn't matter
> to us if clang attributes are misspelled.

Removed.

> > + continue;
> > +   }
> > +  /* Cut off the vendor part.  */
> > +  *q = '\0';
> > +  char *vendor = opt;
> > +  char *attr = q + 2;
> > +  /* Verify that they look valid.  */
> > +  auto valid_p = [](const char *s) {
> > +   for (; *s != '\0'; ++s)
> > + if (!ISALNUM (*s) && *s != '_')
> > +   return false;
> > +   return true;
> > +  };
> > +  if (!valid_p (vendor) || !valid_p (attr))
> > +   {
> > + error ("wrong argument to ignored attributes");
> > + continue;
> > +   }
> > +  /* Turn "__attr__" into "attr" so that we have a canonical form of
> > +attribute names.  Likewise for vendor.  */
> > +  auto strip = [](char *&s) {
> > +   const size_t l = strlen (s);
> > +   if (l > 4 && s[0] == '_' && s[1] == '_'
> > +   && s[l - 1] == '_' && s[l - 2] == '_')
> > + {
> > +   s[l - 2] = '\0';
> > +   s += 2;
> > + }
> > +  };
> > +  strip (attr);
> > +  strip (vendor);
> > +  /* If we've already seen this vendor::attr, ignore it.  Attempting to
> > +register it twice would lead to a crash.  */
> > +  if (lookup_scoped_attribute_spec (get_identifier (vendor),
> > +   get_identifier (attr)))
> > +   continue;
> > +  /* In the "vendor::" case, we should ignore *any* attribute coming
> > +from this attribute namespace.  */
> > +  const bool ignored_ns = attr[0] == '\0';
> 
> Maybe set attr to nullptr instead of declaring ignored_ns?
> 
> > +  /* Create a table with extra attributes which we will register.
> > +We can't free it here, so squirrel away the pointers.  */
> > +  attribute_spec *table = new attribute_spec[2];
> > +  ignored_attributes_table.safe_push (table);
> > +  table[0] = { ignored_ns ? nullptr : attr, 0, 0, false, false,
> 
> ...so this can just use attr.

I also need ignored_ns...
 
> > +  false, false, nullptr, nullptr };
> > +  table[1] = { nullptr, 0, 0, false, false, false, false, nullptr, nullptr };
> > +  register_scoped_attributes (table, vendor, ignored_ns);

...here, but I tweaked this a bit to get rid of the bool.

> > +}
> > +}
> > +
> > +/* Free data we might have allocated when adding extra attributes.  */
> > +
> > +void
> > +free_attr_data ()
> > +{
> > +  for (auto x : ignored_attributes_table)
> > +delete[] x;
> > +}
> 
> You probably also want to zero out ignored_attributes_table at this point.

Done.

> >   /* Initialize attribute tables, and make some sanity checks if checking is
> >  enabled.  */
> > @@ -252,6 +353,9 @@ init_attributes (void)
> >   /* Put all the GNU attributes into the "gnu" namespace.  */
> >   register_scoped_attributes (attribute_tables[i], "gnu");
> > > +  vec<char *> *ignored = (vec<char *> *) flag_ignored_attributes;
> > +  handle_ignored_attributes_option (ignored);
> > +
> > invoke_plugin_callbacks (PLUGIN_ATTRIBUTES, NULL);
> > attributes_initialized = true;
> >   }
> > @@ -456,6 +560,19 @@ diag_attr_exclusions (tree last_decl, tree node, tree 
> > attrname,
> > return found;
> >   }
> > +/* Return true iff we should not complain about unknown attributes
> > +   coming from the attribute namespace NS.  This is the case for
> > +   the -Wno-attributes=ns:: command-line option.  */
> > +
> > +static bool
> > +attr_namespace_ignored_p (tree ns)
> > +{
> > +  if (ns == NULL_TREE)
> > +return false;
> > > +  scoped_attributes *r = find_attribute_namespace (IDENTIFIER_POINTER (ns));
> > +  return r && r->ignored_p;
> > +}
> > +
> >   /* Process the attributes listed in ATTRIBUTES and install them in *NODE,
> >  which is either a DECL (including a TYPE_DECL) or a TYPE.  If a DECL,
> >  it should be modified in place; if a TYPE, a copy should be created
> > @@ -556,7 +673,8 @@ decl_attributes (tree *node, tree attributes, int flags,
> > if (spec == NULL)
> > {
> > - if (!(flags & (int) ATTR_FLAG_BUILT_IN))
> > + if (!(flags & (int) ATTR_FLAG_BUILT_IN)
> > + && !attr_namespace_ignored_p (ns))
> > {
> >   if (ns == NULL_TREE || !cxx11_attr_p)
> > 
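
The argument handling discussed above (split at "::", reject "::attr", validate both halves, canonicalize "__name__" to "name") can be sketched outside of GCC as follows. This is only an illustration: `IgnoredAttr`, `valid_name`, `strip_underscores` and `parse_ignored_attr` are hypothetical names, `std::string` replaces the in-place `char *` editing of the patch, and no attribute registration is modelled.

```cpp
#include <cassert>  // assert, used by the checks in the note below
#include <cctype>
#include <cstddef>
#include <string>

// Result of parsing one -Wno-attributes= argument (illustrative type).
struct IgnoredAttr {
  bool valid;
  std::string vendor;
  std::string attr;  // empty means "ignore the whole namespace"
};

// Only letters, digits and '_' are accepted, as in the quoted valid_p.
static bool valid_name(const std::string &s) {
  for (unsigned char c : s)
    if (!std::isalnum(c) && c != '_')
      return false;
  return true;
}

// Turn "__name__" into "name" to obtain a canonical spelling.
static std::string strip_underscores(const std::string &s) {
  const std::size_t l = s.size();
  if (l > 4 && s[0] == '_' && s[1] == '_'
      && s[l - 1] == '_' && s[l - 2] == '_')
    return s.substr(2, l - 4);
  return s;
}

IgnoredAttr parse_ignored_attr(const std::string &opt) {
  const std::size_t pos = opt.find("::");
  // A missing "::" and the "::attr" form are both rejected.
  if (pos == std::string::npos || pos == 0)
    return IgnoredAttr{false, "", ""};
  const std::string vendor = opt.substr(0, pos);
  const std::string attr = opt.substr(pos + 2);
  if (!valid_name(vendor) || !valid_name(attr))
    return IgnoredAttr{false, "", ""};
  return IgnoredAttr{true, strip_underscores(vendor), strip_underscores(attr)};
}
```

For example, `parse_ignored_attr("__gnu__::__unused__")` yields vendor "gnu" and attr "unused", while `parse_ignored_attr("vendor::")` is valid with an empty attr, which corresponds to the "vendor::" case where the whole namespace is ignored.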

Re: Fix 48631_neg test in _GLIBCXX_VERSION_NAMESPACE mode

2021-09-28 Thread François Dumont via Gcc-patches

On 27/09/21 11:06 pm, Jonathan Wakely wrote:
> On Mon, 27 Sept 2021 at 21:26, François Dumont via Libstdc++
>  wrote:
>> Here is a small patch to fix a test which fails in
>> _GLIBCXX_VERSION_NAMESPACE mode.
>>
>> IMHO it would be better to avoid putting  content in
>> versioned namespace, no ?

No opinion on this, you prefer to use consistently the versioned namespace ?

>> There is of course more work to do, so for now here is the simpler approach.
>>
>> Ok to commit ?
> Leaving the pattern ending with just "struct" isn't very useful.
> Wouldn't it be better to do:
>
> // { dg-prune-output "no type named 'type' in" }
>
> or just:
>
> // { dg-prune-output "enable_if" }
>
> ?
>
> Either of those is OK to commit.

Done with "enable_if"



Re: [PATCH] c++: Suppress error when cv-qualified reference is introduced by typedef [PR101783]

2021-09-28 Thread Jason Merrill via Gcc-patches

On 9/26/21 21:31, nick huang via Gcc-patches wrote:

Hi Jason,

1. Thank you very much for your detailed comments for my patch and I really 
appreciate it! Here is my revised patch:

The root cause of this bug is that GCC treats a reference with
cv-qualifiers as an error by setting the variable "bad_quals".
However, this is not correct when the cv-qualifiers are introduced
through a typedef. Here I quote the spec:
"Cv-qualified references are ill-formed except when the cv-qualifiers
are introduced through the use of a typedef-name ([dcl.typedef],
[temp.param]) or decltype-specifier ([dcl.type.decltype]),
in which case the cv-qualifiers are ignored."
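
As a small illustration of the quoted rule, the following sketch shows a cv-qualifier applied to a reference through a typedef-name being ignored. `IntRef` and `write_through_const_ref_typedef` are made-up names; this is not the testcase from the patch.

```cpp
#include <cassert>  // assert, used by the check in the note below
#include <type_traits>

typedef int &IntRef;

// `const IntRef` applies const to a reference type through a
// typedef-name; the qualifier is ignored, so the type is still `int &`.
static_assert(std::is_same<const IntRef, int &>::value,
              "cv-qualifier on a typedef'd reference is ignored");

int write_through_const_ref_typedef() {
  int value = 0;
  const IntRef ref = value;  // same type as `int &ref`
  ref = 42;                  // allowed: the referenced int is not const
  return value;
}
```

Writing `int & const ref` directly would be ill-formed; only the typedef (or decltype) route makes the qualifier silently disappear, and `write_through_const_ref_typedef()` returns 42.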

2021-09-25  qingzhe huang  

gcc/cp/
PR c++/101783
* tree.c (cp_build_qualified_type_real):


git gcc-verify still rejects this line with

ERR: missing description of a change: "	* tree.c (cp_build_qualified_type_real):"


You may need to run contrib/gcc-git-customization.sh to get the git 
gcc-verify command.



gcc/testsuite/
PR c++/101783
* g++.dg/parse/pr101783.C: New test.
-- next part --


Please drop this line, it breaks git gcc-verify when I apply the patch 
with git am.  The patch should start immediately after the ChangeLog 
entries.



diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 8840932dba2..d5c8daeb340 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -1356,11 +1356,18 @@ cp_build_qualified_type_real (tree type,
/* A reference or method type shall not be cv-qualified.
   [dcl.ref], [dcl.fct].  This used to be an error, but as of DR 295
   (in CD1) we always ignore extra cv-quals on functions.  */
+
+  /* Cv-qualified references are ill-formed except when the cv-qualifiers


In my previous reply I meant please add "[dcl.ref]/1" at the beginning 
of this comment.



+ are introduced through the use of a typedef-name ([dcl.typedef],
+ [temp.param]) or decltype-specifier ([dcl.type.decltype]),
+ in which case the cv-qualifiers are ignored.
+   */
if (type_quals & (TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE)
&& (TYPE_REF_P (type)
  || FUNC_OR_METHOD_TYPE_P (type)))
  {
-  if (TYPE_REF_P (type))
+  if (TYPE_REF_P (type)
+ && (!typedef_variant_p (type) || FUNC_OR_METHOD_TYPE_P (type)))
bad_quals |= type_quals & (TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE);
type_quals &= ~(TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE);
  }
diff --git a/gcc/testsuite/g++.dg/parse/pr101783.C b/gcc/testsuite/g++.dg/parse/pr101783.C
new file mode 100644
index 000..4e0a435dd0b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/pr101783.C
@@ -0,0 +1,5 @@
+template struct A{
+typedef T& Type;
+};
+template void f(const typename A::Type){}
+template <> void f(const typename A::Type){}



2.

In Jonathan's earlier reply he asked how you tested the patch; this
message still doesn't say anything about that.

I communicated with Jonathan by private email, worried that my naive questions
might pollute the public mailing list. The following is the major part of that
communication; the original is in the attachment.


How has this patch been tested? Have you bootstrapped the compiler and
run the full testsuite?

Here is what I am doing:
a) Build the original 10.2.0 from scratch and run make check to get both
"testsuite/gcc/gcc.sum" and "testsuite/g++/g++.sum".
b) Apply my patch, build from scratch, and run make check to get the same two
files.
c) Compare the *.sum files of the two runs to see if there is any difference.

(Later I realized that the tool "contrib/compare_tests" is a good help for
doing so.)

3.

What is the legal status of your contributions?

I thought a small patch didn't require a copyright assignment. However, I have
just sent an email to ass...@gnu.org to request one.
Alternatively, I am not sure whether adding this sign-off tag to the submission
would help:
Signed-off-by: qingzhe huang 


Thank you again!



On 8/28/21 07:54, nick huang via Gcc-patches wrote:

Cv-qualifiers on a reference should be ignored instead of causing an error,
because the standard accepts cv-qualified references introduced by a typedef,
in which case the cv-qualifiers are ignored.
Therefore, the fix prevents GCC from reporting an error by not setting the
variable "bad_quals" when the reference is introduced by a typedef. The
cv-qualifier is still silently ignored.
Here I quote spec (https://timsong-cpp.github.io/cppwp/dcl.ref#1):
"Cv-qualified references are ill-formed except when the cv-qualifiers
are introduced through the use of a typedef-name ([dcl.typedef],
[temp.param]) or decltype-specifier ([dcl.type.decltype]),
in which case the cv-qualifiers are ignored."

PR c++/101783

gcc/cp/ChangeLog:

2021-08-27  qingzhe huang  

* tree.c (cp_build_qualified_type_real):


The git commit verifier rejects this commit message with

Checking 1fa0fbcdd15adf936ab4fae584f841beb35da1bb: FAILED ERR: missing
description of a change:
" * tree.c (cp_build_qualified_type_real):"

(your initial patch had a description here, you just need to copy it over)

ERR: PR 101783 in 

Re: [PATCH v2] libgcc: Add a backchain fallback to _Unwind_Backtrace() on PowerPC

2021-09-28 Thread Segher Boessenkool
Hi!

On Thu, Aug 26, 2021 at 11:53:24AM -0300, Raphael Moreira Zinsly wrote:
> Without dwarf2 unwind tables available _Unwind_Backtrace() is not
> able to return the full backtrace.
> This patch adds a fallback function on powerpc to get the backtrace
> by doing a backchain, this code was originally at glibc.

Okay, the backchain as fallback if other (better!) methods cannot work.

>   * config/rs6000/linux-unwind.h (struct rt_sigframe): Move it to
>   outside of get_regs() in order to use it in another function,
>   this is done twice: for __powerpc64__ and for !__powerpc64__.
>   (struct trace_arg): New struct.
>   (struct layout): New struct.
>   (ppc_backchain_fallback): New function.
>   * unwind.inc (_Unwind_Backtrace): Look for _URC_NORMAL_STOP
>   code state and call MD_BACKCHAIN_FALLBACK.

Changelog lines wrap at 80 chars, not 70 or so.  The emails from commits
(to bugzilla) are a bit malformed (it counts the number of columns for
leading tabs wrong it seems), but the actual commits are just fine.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/unwind-backchain.c
> @@ -0,0 +1,22 @@
> +/* { dg-do run { target { powerpc*-*-linux* } } } */

Don't say such targets in gcc.target/powerpc/ tests please.  Everything
in gcc.target is for powerpc*-*-* already, so if you really want to
limit to powerpc*-*-linux* just write *-*-linux*.  But there are better
ways to get what you want, like, testing for the actual feature you want
(which is if backtrace() works?)  But such an improvement can be done
later (and needs more testing etc).

But please write some simple comment saying why you need -linux* in the
test.

> +void
> +test_backtrace()
> +{
> +  int addresses;
> +  void *buffer[10];
> +
> +  addresses = backtrace(buffer, 10);
> +  if(addresses != 4)
> +__builtin_abort();
> +}

Does that work?!  Has this been tested on all powerpc*-linux configs?
Importantly also BE and 32-bit.

Okay for trunk with the testcase fix, if all testing works out.  Thanks!


Segher
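
For context, a back-chain walk follows the pointer stored at the start of each stack frame to the caller's frame and collects a return address from each one. The sketch below runs on fake frames only. `Frame`, `backchain_walk` and `demo_backchain_depth` are hypothetical names, and the field layout is an assumption for illustration; it is not taken from the patch and no real stack is inspected.

```cpp
#include <cassert>  // assert, used by the check in the note below

// Assumed frame header: the first word points at the caller's frame
// (the back chain) and a saved return address sits at a fixed offset.
struct Frame {
  Frame *next;             // back chain; null in the outermost frame
  long condition_register;
  void *return_address;
};

// Follow the back chain from `current`, storing up to `size` return
// addresses in `buffer`. Returns the number of entries written.
int backchain_walk(const Frame *current, void **buffer, int size) {
  int count = 0;
  for (; current != nullptr && count < size; current = current->next)
    buffer[count++] = current->return_address;
  return count;
}

// Build a fake three-frame stack and walk it.
int demo_backchain_depth() {
  static int pc0, pc1, pc2;  // stand-ins for code addresses
  Frame outer = {nullptr, 0, &pc2};
  Frame middle = {&outer, 0, &pc1};
  Frame inner = {&middle, 0, &pc0};
  void *buffer[10];
  return backchain_walk(&inner, buffer, 10);
}
```

`demo_backchain_depth()` returns 3 for the three fake frames. A real fallback would also have to recognize signal frames and validate each pointer before dereferencing it, which this sketch leaves out.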


Re: [PATCH] c++: ttp matching with constrained auto parm [PR99909]

2021-09-28 Thread Jason Merrill via Gcc-patches

On 9/28/21 15:15, Patrick Palka wrote:

Here, when unifying TT with S, processing_template_decl is unset, and
this foils the dependence checks in do_auto_deduction for avoiding
checking constraints on an auto when the initializer is dependent.

This patch fixes this issue by making sure processing_template_decl is
set during the call to unify from coerce_template_template_parms; this
seems sensible because we're unifying one set of template parameters
with another, so we're dealing with templated trees throughout.



Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?


OK.


PR c++/99909

gcc/cp/ChangeLog:

* pt.c (coerce_template_template_parms): Keep
processing_template_decl set during the call to unify as well.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-ttp3.C: New test.
---
  gcc/cp/pt.c|  4 ++--
  gcc/testsuite/g++.dg/cpp2a/concepts-ttp3.C | 11 +++
  2 files changed, 13 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-ttp3.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 41fa7ed5e43..1dcdffe322a 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -7994,12 +7994,12 @@ coerce_template_template_parms (tree parm_parms,
/* So coerce P's args to apply to A's parms, and then deduce between A's
 args and the converted args.  If that succeeds, A is at least as
 specialized as P, so they match.*/
+  processing_template_decl_sentinel ptds (/*reset*/false);
+  ++processing_template_decl;
tree pargs = template_parms_level_to_args (parm_parms);
pargs = add_outermost_template_args (outer_args, pargs);
-  ++processing_template_decl;
pargs = coerce_template_parms (arg_parms, pargs, NULL_TREE, tf_none,
 /*require_all*/true, /*use_default*/true);
-  --processing_template_decl;
if (pargs != error_mark_node)
{
  tree targs = make_tree_vec (nargs);
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-ttp3.C b/gcc/testsuite/g++.dg/cpp2a/concepts-ttp3.C
new file mode 100644
index 000..898524e0dfa
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-ttp3.C
@@ -0,0 +1,11 @@
+// PR c++/99909
+// { dg-do compile { target c++20 } }
+
+template constexpr bool always_true = true;
+template concept C = always_true;
+
+template struct S { };
+
+template class TT> void f() { }
+
+template void f();
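
The pt.c hunk above replaces a manual ++/-- pair around one call with a scope-long sentinel, so the later unify call also sees the raised counter. The effect can be sketched with a generic RAII guard. Everything here is a stand-in: `template_depth`, `DepthSentinel` and the two functions are hypothetical, and the `reset` parameter only imitates the `/*reset*/false` argument visible in the diff.

```cpp
#include <cassert>  // assert, used by the checks in the note below

// Stand-in for the front end's global template-processing counter.
static int template_depth = 0;

// Saves the counter on construction and restores it on destruction.
// With reset == false the current value is kept rather than zeroed.
class DepthSentinel {
  int saved_;

public:
  explicit DepthSentinel(bool reset = true) : saved_(template_depth) {
    if (reset)
      template_depth = 0;
  }
  ~DepthSentinel() { template_depth = saved_; }
  DepthSentinel(const DepthSentinel &) = delete;
  DepthSentinel &operator=(const DepthSentinel &) = delete;
};

// New shape: the counter stays raised until the end of the scope, so
// work done after the first call still sees it.
int depth_seen_with_sentinel() {
  DepthSentinel guard(/*reset*/ false);
  ++template_depth;
  // ... first call, then the later call, both with the counter raised ...
  return template_depth;
}

// Old shape: the counter is lowered again before the later call runs.
int depth_seen_with_manual_pair() {
  ++template_depth;
  // ... first call only ...
  --template_depth;
  // ... the later call would run here with the old value ...
  return template_depth;
}
```

`depth_seen_with_sentinel()` returns 1 and leaves the global back at 0 once the guard is destroyed, whereas `depth_seen_with_manual_pair()` returns 0 at the point where the later call would run.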





[committed] libstdc++: Remove obfuscating typedefs in <regex>

2021-09-28 Thread Jonathan Wakely via Gcc-patches
There is no benefit to using _SizeT instead of size_t, and IterT tells
you less about the type than const _CharT*. This removes some unhelpful
typedefs.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/bits/regex_automaton.h (_NFA_base::_SizeT): Remove.
* include/bits/regex_compiler.h (_Compiler::_IterT): Remove.
* include/bits/regex_compiler.tcc: Likewise.
* include/bits/regex_scanner.h (_Scanner::_IterT): Remove.
* include/bits/regex_scanner.tcc: Likewise.

Tested x86_64-linux. Committed to trunk.

commit c44c5f3d9f46705a262911c2098c1568d7e8ac2d
Author: Jonathan Wakely 
Date:   Tue Sep 28 13:39:36 2021

libstdc++: Remove obfuscating typedefs in <regex>

There is no benefit to using _SizeT instead of size_t, and IterT tells
you less about the type than const _CharT*. This removes some unhelpful
typedefs.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/bits/regex_automaton.h (_NFA_base::_SizeT): Remove.
* include/bits/regex_compiler.h (_Compiler::_IterT): Remove.
* include/bits/regex_compiler.tcc: Likewise.
* include/bits/regex_scanner.h (_Scanner::_IterT): Remove.
* include/bits/regex_scanner.tcc: Likewise.

diff --git a/libstdc++-v3/include/bits/regex_automaton.h b/libstdc++-v3/include/bits/regex_automaton.h
index 02d81f3e417..f108675f35e 100644
--- a/libstdc++-v3/include/bits/regex_automaton.h
+++ b/libstdc++-v3/include/bits/regex_automaton.h
@@ -183,7 +183,6 @@ namespace __detail
 
   struct _NFA_base
   {
-typedef size_t  _SizeT;
 typedef regex_constants::syntax_option_type _FlagT;
 
 explicit
@@ -206,14 +205,14 @@ namespace __detail
 _M_start() const noexcept
 { return _M_start_state; }
 
-_SizeT
+size_t
 _M_sub_count() const noexcept
 { return _M_subexpr_count; }
 
 _GLIBCXX_STD_C::vector _M_paren_stack;
 _FlagT_M_flags;
 _StateIdT _M_start_state;
-_SizeT_M_subexpr_count;
+size_t_M_subexpr_count;
 bool  _M_has_backref;
   };
 
diff --git a/libstdc++-v3/include/bits/regex_compiler.h b/libstdc++-v3/include/bits/regex_compiler.h
index 423ab823194..646766ebdf9 100644
--- a/libstdc++-v3/include/bits/regex_compiler.h
+++ b/libstdc++-v3/include/bits/regex_compiler.h
@@ -58,11 +58,10 @@ namespace __detail
 {
 public:
   typedef typename _TraitsT::char_type_CharT;
-  typedef const _CharT*   _IterT;
   typedef _NFA<_TraitsT> _RegexT;
   typedef regex_constants::syntax_option_type _FlagT;
 
-  _Compiler(_IterT __b, _IterT __e,
+  _Compiler(const _CharT* __b, const _CharT* __e,
const typename _TraitsT::locale_type& __traits, _FlagT __flags);
 
   shared_ptr
diff --git a/libstdc++-v3/include/bits/regex_compiler.tcc b/libstdc++-v3/include/bits/regex_compiler.tcc
index 9f04c1be686..1bd30972cbb 100644
--- a/libstdc++-v3/include/bits/regex_compiler.tcc
+++ b/libstdc++-v3/include/bits/regex_compiler.tcc
@@ -63,7 +63,7 @@ namespace __detail
 {
   template
 _Compiler<_TraitsT>::
-_Compiler(_IterT __b, _IterT __e,
+_Compiler(const _CharT* __b, const _CharT* __e,
  const typename _TraitsT::locale_type& __loc, _FlagT __flags)
 : _M_flags((__flags
& (regex_constants::ECMAScript
diff --git a/libstdc++-v3/include/bits/regex_scanner.h b/libstdc++-v3/include/bits/regex_scanner.h
index 05d8172a0ad..4e7d5efb34b 100644
--- a/libstdc++-v3/include/bits/regex_scanner.h
+++ b/libstdc++-v3/include/bits/regex_scanner.h
@@ -211,12 +211,11 @@ namespace __detail
 : public _ScannerBase
 {
 public:
-  typedef const _CharT*   _IterT;
   typedef std::basic_string<_CharT>   _StringT;
   typedef regex_constants::syntax_option_type _FlagT;
   typedef const std::ctype<_CharT>_CtypeT;
 
-  _Scanner(_IterT __begin, _IterT __end,
+  _Scanner(const _CharT* __begin, const _CharT* __end,
   _FlagT __flags, std::locale __loc);
 
   void
@@ -257,8 +256,8 @@ namespace __detail
   void
   _M_eat_class(char);
 
-  _IterT_M_current;
-  _IterT_M_end;
+  const _CharT* _M_current;
+  const _CharT* _M_end;
   _CtypeT&  _M_ctype;
   _StringT  _M_value;
   void (_Scanner::* _M_eat_escape)();
diff --git a/libstdc++-v3/include/bits/regex_scanner.tcc b/libstdc++-v3/include/bits/regex_scanner.tcc
index a9d6a613648..b2b709ce3cb 100644
--- a/libstdc++-v3/include/bits/regex_scanner.tcc
+++ b/libstdc++-v3/include/bits/regex_scanner.tcc
@@ -54,8 +54,7 @@ namespace 

[committed] libstdc++: Tweaks to <regex> to avoid warnings

2021-09-28 Thread Jonathan Wakely via Gcc-patches
Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/bits/regex_compiler.tcc: Add line break in empty while
statement.
* include/bits/regex_executor.tcc: Avoid unused parameter
warning.

Tested x86_64-linux. Committed to trunk.
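
The two kinds of change listed in the ChangeLog can be shown on a toy example. This is only a sketch with made-up names (`consume_one`, `drain`, `handle`); it is not the libstdc++ code.

```cpp
#include <cassert>  // assert, used by the checks in the note below

// Consumes one pending item per call and reports whether it did so.
static bool consume_one(int &pending) {
  if (pending == 0)
    return false;
  --pending;
  return true;
}

// All the work happens in the loop condition, so the body is empty.
// Putting the ';' on its own line shows that the empty body is
// intentional.
int drain(int pending) {
  int consumed = 0;
  while (consume_one(pending) && ++consumed)
    ;
  return consumed;
}

// The second parameter is kept for the interface but is not used;
// leaving it unnamed avoids an unused-parameter warning.
int handle(int mode, int /*state_id*/) {
  return mode * 2;
}
```

`drain(3)` returns 3 and `handle(21, 7)` returns 42; behaviour is the same as with `while (...);` on one line and a named parameter, only the diagnostics differ.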

commit b5f276b8c76d892f7fed229153cfbadc13f4696e
Author: Jonathan Wakely 
Date:   Mon Sep 27 20:44:24 2021

libstdc++: Tweaks to <regex> to avoid warnings

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/bits/regex_compiler.tcc: Add line break in empty while
statement.
* include/bits/regex_executor.tcc: Avoid unused parameter
warning.

diff --git a/libstdc++-v3/include/bits/regex_compiler.tcc b/libstdc++-v3/include/bits/regex_compiler.tcc
index 440669debe0..9f04c1be686 100644
--- a/libstdc++-v3/include/bits/regex_compiler.tcc
+++ b/libstdc++-v3/include/bits/regex_compiler.tcc
@@ -140,7 +140,8 @@ namespace __detail
return true;
   if (this->_M_atom())
{
- while (this->_M_quantifier());
+ while (this->_M_quantifier())
+   ;
  return true;
}
   return false;
@@ -440,7 +441,8 @@ namespace __detail
  __last_char.second = '-';
}
}
-  while (_M_expression_term(__last_char, __matcher));
+  while (_M_expression_term(__last_char, __matcher))
+   ;
   if (__last_char.first)
__matcher._M_add_char(__last_char.second);
   __matcher._M_ready();
diff --git a/libstdc++-v3/include/bits/regex_executor.tcc b/libstdc++-v3/include/bits/regex_executor.tcc
index 3cefeda48a3..2577265c33a 100644
--- a/libstdc++-v3/include/bits/regex_executor.tcc
+++ b/libstdc++-v3/include/bits/regex_executor.tcc
@@ -423,7 +423,7 @@ namespace __detail
   template
 void _Executor<_BiIter, _Alloc, _TraitsT, __dfs_mode>::
-_M_handle_accept(_Match_mode __match_mode, _StateIdT __i)
+_M_handle_accept(_Match_mode __match_mode, _StateIdT)
 {
   if (__dfs_mode)
{


[committed] libstdc++: Add noexcept to functions in <regex>

2021-09-28 Thread Jonathan Wakely via Gcc-patches
Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/bits/regex.h (basic_regex, swap): Add noexcept to
non-throwing functions.
* include/bits/regex_automaton.h (_State_base, _State)
(_NFA_base): Likewise.
* include/bits/regex_compiler.h (_Compiler): Likewise.
* include/bits/regex_error.h (regex_error::code()): Likewise.
* include/bits/regex_scanner.h (_Scanner): Likewise.

Tested x86_64-linux. Committed to trunk.
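
What adding noexcept to non-throwing members buys can be checked at compile time. The class below is a hypothetical stand-in (`Handle` is not `basic_regex`), shown only to illustrate how such guarantees become visible to callers and to type traits.

```cpp
#include <cassert>  // assert, used by the check in the note below
#include <type_traits>
#include <utility>

// Hypothetical small class with non-throwing members marked noexcept.
class Handle {
  int flags_;

public:
  Handle() noexcept : flags_(0) {}
  explicit Handle(int flags) noexcept : flags_(flags) {}
  int flags() const noexcept { return flags_; }
  void swap(Handle &other) noexcept { std::swap(flags_, other.flags_); }
};

inline void swap(Handle &lhs, Handle &rhs) noexcept { lhs.swap(rhs); }

// The guarantees can now be queried by generic code.
static_assert(std::is_nothrow_default_constructible<Handle>::value,
              "default construction cannot throw");
static_assert(noexcept(std::declval<const Handle &>().flags()),
              "flags() cannot throw");
static_assert(noexcept(swap(std::declval<Handle &>(),
                            std::declval<Handle &>())),
              "swap cannot throw");

int demo_swap() {
  Handle a(1), b(2);
  swap(a, b);
  return a.flags() * 10 + b.flags();
}
```

`demo_swap()` returns 21 after the swap. Without the noexcept specifiers the functions would behave the same at run time, but the static_asserts on `flags()` and `swap` would fail.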

commit df0dd04b78cfc0f723387b703978600caac93cbb
Author: Jonathan Wakely 
Date:   Mon Sep 27 20:42:17 2021

libstdc++: Add noexcept to functions in <regex>

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/bits/regex.h (basic_regex, swap): Add noexcept to
non-throwing functions.
* include/bits/regex_automaton.h (_State_base, _State)
(_NFA_base): Likewise.
* include/bits/regex_compiler.h (_Compiler): Likewise.
* include/bits/regex_error.h (regex_error::code()): Likewise.
* include/bits/regex_scanner.h (_Scanner): Likewise.

diff --git a/libstdc++-v3/include/bits/regex.h b/libstdc++-v3/include/bits/regex.h
index b8a0ad251d8..d4a7729de2c 100644
--- a/libstdc++-v3/include/bits/regex.h
+++ b/libstdc++-v3/include/bits/regex.h
@@ -421,7 +421,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
* Constructs a basic regular expression that does not match any
* character sequence.
*/
-  basic_regex()
+  basic_regex() noexcept
   : _M_flags(ECMAScript), _M_loc(), _M_automaton(nullptr)
   { }
 
@@ -697,7 +697,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
* expression.
*/
   unsigned int
-  mark_count() const
+  mark_count() const noexcept
   {
if (_M_automaton)
  return _M_automaton->_M_sub_count() - 1;
@@ -709,7 +709,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
* or in the last call to assign().
*/
   flag_type
-  flags() const
+  flags() const noexcept
   { return _M_flags; }
 
   // [7.8.5] locale
@@ -731,7 +731,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
*object.
*/
   locale_type
-  getloc() const
+  getloc() const noexcept
   { return _M_loc; }
 
   // [7.8.6] swap
@@ -741,7 +741,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
* @param __rhs Another regular expression object.
*/
   void
-  swap(basic_regex& __rhs)
+  swap(basic_regex& __rhs) noexcept
   {
std::swap(_M_flags, __rhs._M_flags);
std::swap(_M_loc, __rhs._M_loc);
@@ -848,7 +848,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   template
 inline void
 swap(basic_regex<_Ch_type, _Rx_traits>& __lhs,
-basic_regex<_Ch_type, _Rx_traits>& __rhs)
+basic_regex<_Ch_type, _Rx_traits>& __rhs) noexcept
 { __lhs.swap(__rhs); }
 
 
diff --git a/libstdc++-v3/include/bits/regex_automaton.h 
b/libstdc++-v3/include/bits/regex_automaton.h
index 872a17fe8cb..02d81f3e417 100644
--- a/libstdc++-v3/include/bits/regex_automaton.h
+++ b/libstdc++-v3/include/bits/regex_automaton.h
@@ -95,13 +95,13 @@ namespace __detail
 };
 
   protected:
-explicit _State_base(_Opcode __opcode)
+explicit _State_base(_Opcode __opcode) noexcept
 : _M_opcode(__opcode), _M_next(_S_invalid_state_id)
 { }
 
   public:
 bool
-_M_has_alt()
+_M_has_alt() const noexcept
 {
   return _M_opcode == _S_opcode_alternative
|| _M_opcode == _S_opcode_repeat
@@ -130,7 +130,7 @@ namespace __detail
"std::function");
 
   explicit
-  _State(_Opcode __opcode) : _State_base(__opcode)
+  _State(_Opcode __opcode) noexcept : _State_base(__opcode)
   {
if (_M_opcode() == _S_opcode_match)
  new (this->_M_matcher_storage._M_addr()) _MatcherT();
@@ -143,7 +143,7 @@ namespace __detail
_MatcherT(__rhs._M_get_matcher());
   }
 
-  _State(_State&& __rhs) : _State_base(__rhs)
+  _State(_State&& __rhs) noexcept : _State_base(__rhs)
   {
if (__rhs._M_opcode() == _S_opcode_match)
  new (this->_M_matcher_storage._M_addr())
@@ -162,7 +162,7 @@ namespace __detail
   // Since correct ctor and dtor rely on _M_opcode, it's better not to
   // change it over time.
   _Opcode
-  _M_opcode() const
+  _M_opcode() const noexcept
   { return _State_base::_M_opcode; }
 
   bool
@@ -170,11 +170,11 @@ namespace __detail
   { return _M_get_matcher()(__char); }
 
   _MatcherT&
-  _M_get_matcher()
+  _M_get_matcher() noexcept
   { return *static_cast<_MatcherT*>(this->_M_matcher_storage._M_addr()); }
 
   const _MatcherT&
-  _M_get_matcher() const
+  _M_get_matcher() const noexcept
   {
return *static_cast<const _MatcherT*>(
this->_M_matcher_storage._M_addr());
@@ -187,7 +187,7 @@ namespace __detail
 typedef regex_constants::syntax_option_type _FlagT;
 
 explicit
-  

[PATCH] libstdc++: Fix return values for atomic wait on futex

2021-09-28 Thread Jonathan Wakely via Gcc-patches
This fixes a logic error in the futex-based timed wait.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/bits/atomic_timed_wait.h (__platform_wait_until_impl):
Return false for ETIMEDOUT and true otherwise.

Tested x86_64-linux.

I'm not seeing any tests fail as a result of this, but it does seem to
be incorrect. Please check my working.
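
For anyone checking the logic, here is a minimal standalone sketch of the
intended mapping from the futex result to the function's return value. The
helper name and its parameters are hypothetical and are not part of
libstdc++; it only mirrors the branch structure of the patch below.

```cpp
#include <cassert>
#include <cerrno>
#include <system_error>

// Hypothetical helper (not libstdc++ code): maps the outcome of a timed
// futex wait to "true if the wait ended before the timeout".
// `rc` is the syscall return value, `err` the errno value seen after it.
inline bool
wait_ended_before_timeout(long rc, int err)
{
  if (rc != 0)
    {
      if (err == ETIMEDOUT)
        return false;                 // the deadline passed
      if (err != EINTR && err != EAGAIN)
        throw std::system_error(err, std::generic_category());
      // EINTR and EAGAIN fall through: the wait ended early.
    }
  return true;                        // woken, interrupted, or value changed
}
```

With the old code the same inputs gave the opposite answers: a timeout
returned true and a successful wake-up returned false.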


commit 94dc544bbf42e95a363b916ed0d665afcf88
Author: Jonathan Wakely 
Date:   Tue Aug 31 10:20:41 2021

libstdc++: Fix return values for atomic wait on futex

This fixes a logic error in the futex-based timed wait.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/bits/atomic_timed_wait.h (__platform_wait_until_impl):
Return false for ETIMEDOUT and true otherwise.

diff --git a/libstdc++-v3/include/bits/atomic_timed_wait.h 
b/libstdc++-v3/include/bits/atomic_timed_wait.h
index 3db08f82707..d423a7af7c3 100644
--- a/libstdc++-v3/include/bits/atomic_timed_wait.h
+++ b/libstdc++-v3/include/bits/atomic_timed_wait.h
@@ -101,12 +101,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
if (__e)
  {
-   if ((errno != ETIMEDOUT) && (errno != EINTR)
-   && (errno != EAGAIN))
+   if (errno == ETIMEDOUT)
+ return false;
+   if (errno != EINTR && errno != EAGAIN)
  __throw_system_error(errno);
-   return true;
  }
-   return false;
+   return true;
   }
 
 // returns true if wait ended before timeout


Re: [PATCH] c++: Fix up synthetization of defaulted comparison operators on classes with bitfields [PR102490]

2021-09-28 Thread Jason Merrill via Gcc-patches

On 9/28/21 09:53, Patrick Palka wrote:

On Tue, 28 Sep 2021, Patrick Palka wrote:


On Tue, 28 Sep 2021, Jakub Jelinek via Gcc-patches wrote:


Hi!

The testcases in the patch are either miscompiled or ICE with checking,
because the defaulted operator== is synthetized too early (but only if
constexpr), when the corresponding class type is still incomplete type.
The problem is that at that point the bitfield FIELD_DECLs still have as
TREE_TYPE their underlying type rather than integral type with their
precision and when layout_class_type is called for the class soon after
that, it changes those types but the COMPONENT_REFs type stay the way
that they were during the operator== synthetize_method type and the
middle-end is then upset by the mismatch of types.
As what exact type will be given isn't just a one liner but quite long code
especially for over-sized bitfields, I think it is best to just not
synthetize the comparison operators so early (the defaulted_late_check
change) and call defaulted_late_check for them once again as soon as the
class is complete.


Nice, this might also fix PR98712.



Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-09-28  Jakub Jelinek  

PR c++/102490
* method.c (defaulted_late_check): Don't synthetize constexpr
defaulted comparisons if context is still incomplete type.
(finish_struct_1): Call defaulted_late_check again for defaulted
comparisons.

* g++.dg/cpp2a/spaceship-eq11.C: New test.
* g++.dg/cpp2a/spaceship-eq12.C: New test.

--- gcc/cp/method.c.jj  2021-09-15 08:55:37.563497558 +0200
+++ gcc/cp/method.c 2021-09-27 13:48:12.139271830 +0200
@@ -3160,8 +3160,11 @@ defaulted_late_check (tree fn)
if (kind == sfk_comparison)
  {
/* If the function was declared constexpr, check that the definition
-qualifies.  Otherwise we can define the function lazily.  */
-  if (DECL_DECLARED_CONSTEXPR_P (fn) && !DECL_INITIAL (fn))
+qualifies.  Otherwise we can define the function lazily.
+Don't do this if the class type is still incomplete.  */
+  if (DECL_DECLARED_CONSTEXPR_P (fn)
+ && !DECL_INITIAL (fn)
+ && COMPLETE_TYPE_P (ctx))
{


According to the function comment for defaulted_late_check, won't
COMPLETE_TYPE_P (ctx) always be false here?


Not for a function defaulted outside the class.


If so, I wonder if we could get away with moving this entire fragment
from defaulted_late_check to finish_struct_1 instead of calling
defaulted_late_check from finish_struct_1.


The comment in check_bases_and_members says that we call it there so 
that it's before we clone [cd]tors.  Probably better to leave the call 
there for other functions, just skip it for comparisons.



  /* Prevent GC.  */
  function_depth++;
--- gcc/cp/class.c.jj   2021-09-03 09:46:28.801428380 +0200
+++ gcc/cp/class.c  2021-09-27 14:07:03.465562255 +0200
@@ -7467,7 +7467,14 @@ finish_struct_1 (tree t)
   for any static member objects of the type we're working on.  */
for (x = TYPE_FIELDS (t); x; x = DECL_CHAIN (x))
  if (DECL_DECLARES_FUNCTION_P (x))
-  DECL_IN_AGGR_P (x) = false;
+  {
+   /* Synthetize constexpr defaulted comparisons.  */
+   if (!DECL_ARTIFICIAL (x)
+   && DECL_DEFAULTED_IN_CLASS_P (x)
+   && special_function_p (x) == sfk_comparison)
+ defaulted_late_check (x);
+   DECL_IN_AGGR_P (x) = false;
+  }
  else if (VAR_P (x) && TREE_STATIC (x)
 && TREE_TYPE (x) != error_mark_node
 && same_type_p (TYPE_MAIN_VARIANT (TREE_TYPE (x)), t))
--- gcc/testsuite/g++.dg/cpp2a/spaceship-eq11.C.jj  2021-09-27 
14:20:04.723713371 +0200
+++ gcc/testsuite/g++.dg/cpp2a/spaceship-eq11.C 2021-09-27 14:20:20.387495858 
+0200
@@ -0,0 +1,43 @@
+// PR c++/102490
+// { dg-do run { target c++20 } }
+
+struct A
+{
+  unsigned char a : 1;
+  unsigned char b : 1;
+  constexpr bool operator== (const A &) const = default;
+};
+
+struct B
+{
+  unsigned char a : 8;
+  int : 0;
+  unsigned char b : 7;
+  constexpr bool operator== (const B &) const = default;
+};
+
+struct C
+{
+  unsigned char a : 3;
+  unsigned char b : 1;
+  constexpr bool operator== (const C &) const = default;
+};
+
+void
+foo (C &x, int y)
+{
+  x.b = y;
+}
+
+int
+main ()
+{
+  A a{}, b{};
+  B c{}, d{};
+  C e{}, f{};
+  a.b = 1;
+  d.b = 1;
+  foo (e, 0);
+  foo (f, 1);
+  return a == b || c == d || e == f;
+}
--- gcc/testsuite/g++.dg/cpp2a/spaceship-eq12.C.jj  2021-09-27 
14:20:12.050611625 +0200
+++ gcc/testsuite/g++.dg/cpp2a/spaceship-eq12.C 2021-09-27 14:20:39.633228602 
+0200
@@ -0,0 +1,5 @@
+// PR c++/102490
+// { dg-do run { target c++20 } }
+// { dg-options "-O2" }
+
+#include "spaceship-eq11.C"

Jakub










[committed] libstdc++: Define macro before it is first checked

2021-09-28 Thread Jonathan Wakely via Gcc-patches
On Thu, 2 Sept 2021 at 22:25, Jonathan Wakely wrote:
>
> On Thu, 2 Sept 2021 at 19:00, Jonathan Wakely wrote:
> >
> > * include/bits/atomic_wait.h (_GLIBCXX_HAVE_PLATFORM_WAIT):
> > Define before first attempt to check it.
> >
> > Tested x86_64-linux and powerpc64-linux, not committed yet.
>
> Actually ignore that ... I tested the wrong patch. This one introduces
> a new FAIL, which I have a fix for, but it will have to wait for next
> week.
>
>
> > I think we need this, otherwise __platform_wait_uses_type is false
> > for all T.

This is the fixed patch.

Tested x86_64-linux, pushed to trunk.
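
The underlying problem is just preprocessor ordering: an #ifdef only sees
macros defined earlier in the translation unit. A small sketch, using a
made-up macro name (DEMO_HAVE_PLATFORM_WAIT) rather than the real one:

```cpp
#include <cassert>

// All names here are hypothetical stand-ins for illustration only.
#ifdef DEMO_HAVE_PLATFORM_WAIT        // checked here: not defined yet
constexpr bool checked_too_early = true;
#else
constexpr bool checked_too_early = false;
#endif

#define DEMO_HAVE_PLATFORM_WAIT 1     // defined only after the first check

#ifdef DEMO_HAVE_PLATFORM_WAIT        // checked again: now defined
constexpr bool checked_after_define = true;
#else
constexpr bool checked_after_define = false;
#endif

static_assert(!checked_too_early, "first check runs before the #define");
static_assert(checked_after_define, "second check runs after the #define");
```

That is why the first check always took the "= false" branch until the
definition was moved above it.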
commit aeaea265cea3a2b2e772af7825351a4ceef29aac
Author: Jonathan Wakely 
Date:   Tue Aug 31 15:51:09 2021

libstdc++: Define macro before it is first checked

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/bits/atomic_wait.h (_GLIBCXX_HAVE_PLATFORM_WAIT):
Define before first attempt to check it.

diff --git a/libstdc++-v3/include/bits/atomic_wait.h 
b/libstdc++-v3/include/bits/atomic_wait.h
index 07bb744d822..35c92644146 100644
--- a/libstdc++-v3/include/bits/atomic_wait.h
+++ b/libstdc++-v3/include/bits/atomic_wait.h
@@ -56,9 +56,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   namespace __detail
   {
 #ifdef _GLIBCXX_HAVE_LINUX_FUTEX
+#define _GLIBCXX_HAVE_PLATFORM_WAIT 1
 using __platform_wait_t = int;
 static constexpr size_t __platform_wait_alignment = 4;
 #else
+// define _GLIBCX_HAVE_PLATFORM_WAIT and implement __platform_wait()
+// and __platform_notify() if there is a more efficient primitive supported
+// by the platform (e.g. __ulock_wait()/__ulock_wake()) which is better than
+// a mutex/condvar based wait.
 using __platform_wait_t = uint64_t;
 static constexpr size_t __platform_wait_alignment
   = __alignof__(__platform_wait_t);
@@ -70,7 +75,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
   = is_scalar_v<_Tp>
&& ((sizeof(_Tp) == sizeof(__detail::__platform_wait_t))
-   && (alignof(_Tp*) >= __platform_wait_alignment));
+   && (alignof(_Tp*) >= __detail::__platform_wait_alignment));
 #else
   = false;
 #endif
@@ -78,7 +83,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   namespace __detail
   {
 #ifdef _GLIBCXX_HAVE_LINUX_FUTEX
-#define _GLIBCXX_HAVE_PLATFORM_WAIT 1
 enum class __futex_wait_flags : int
 {
 #ifdef _GLIBCXX_HAVE_LINUX_FUTEX_PRIVATE
@@ -118,11 +122,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 static_cast(__futex_wait_flags::__wake_private),
 __all ? INT_MAX : 1);
   }
-#else
-// define _GLIBCX_HAVE_PLATFORM_WAIT and implement __platform_wait()
-// and __platform_notify() if there is a more efficient primitive supported
-// by the platform (e.g. __ulock_wait()/__ulock_wake()) which is better than
-// a mutex/condvar based wait
 #endif
 
 inline void
@@ -331,7 +330,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
if constexpr (__platform_wait_uses_type<_Up>)
  {
-   __val == __old;
+   __builtin_memcpy(&__val, &__old, sizeof(__val));
  }
else
  {


[pushed] Darwin, D : Add .d suffix to the list for invoking dsymutil.

2021-09-28 Thread Iain Sandoe via Gcc-patches
Hi,

Recognise .d for D source files on the command line.  This will
trigger an invocation of dsymutil when a D source is present.

tested along with D patches on i686, powerpc and x86_64 darwin,
pushed to master, thanks,
Iain

gcc/ChangeLog:

* config/darwin.h (DSYMUTIL_SPEC): Recognize D sources.
---
 gcc/config/darwin.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
index 50524a51511..0fa1c572bc9 100644
--- a/gcc/config/darwin.h
+++ b/gcc/config/darwin.h
@@ -251,7 +251,7 @@ extern GTY(()) int darwin_ms_struct;
 %{v} \
 %{g*:%{!gctf:%{!gbtf:%{!gstabs*:%{%:debug-level-gt(0): -idsym}\
 %{.c|.cc|.C|.cpp|.cp|.c++|.cxx|.CPP|.m|.mm|.s|.f|.f90|\
-  .f95|.f03|.f77|.for|.F|.F90|.F95|.F03: \
+  .f95|.f03|.f77|.for|.F|.F90|.F95|.F03|.d: \
 %{g*:%{!gctf:%{!gbtf:%{!gstabs*:%{%:debug-level-gt(0): -dsym}"
 
 #define LINK_COMMAND_SPEC LINK_COMMAND_SPEC_A DSYMUTIL_SPEC
-- 
2.24.3 (Apple Git-128)



[committed] libstdc++: Skip container adaptor tests that fail concept checks

2021-09-28 Thread Jonathan Wakely via Gcc-patches
As an extension, our container adaptors SFINAE away the default
constructor if the adapted sequence container is not default
constructible. When _GLIBCXX_CONCEPT_CHECKS is defined we enforce that
the sequence is default constructible, so the tests for the extension
fail. This disables the relevant parts of the tests.
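
A rough sketch of what the extension means in practice, assuming concept
checks are not enabled. The NonDefaultConstructible type follows the one
used in the tests; the other names are made up for this example. The
adaptor result is recorded but not asserted, since it depends on the
library implementation.

```cpp
#include <cassert>
#include <deque>
#include <queue>
#include <type_traits>

// Sequence type with no default constructor, as in the tests.
struct NonDefaultConstructible : std::deque<int>
{
  NonDefaultConstructible(int) { }
};

static_assert(std::is_default_constructible<std::queue<int>>::value,
              "the usual adaptor is default constructible");
static_assert(!std::is_default_constructible<NonDefaultConstructible>::value,
              "the sequence itself is not");

// Implementation-dependent: with the extension described above this is
// expected to be false, because the default constructor is SFINAE'd away.
constexpr bool adaptor_default_constructible
  = std::is_default_constructible<
      std::queue<int, NonDefaultConstructible>>::value;

// The adaptor is still usable when it is given a sequence explicitly.
inline std::queue<int, NonDefaultConstructible>
make_queue()
{ return std::queue<int, NonDefaultConstructible>(NonDefaultConstructible(0)); }
```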

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* 
testsuite/23_containers/priority_queue/requirements/explicit_instantiation/1.cc:
Do not check non-default constructible sequences when
_GLIBCXX_CONCEPT_CHECKS is defined.
* 
testsuite/23_containers/priority_queue/requirements/explicit_instantiation/1_c++98.cc:
Likewise.
* 
testsuite/23_containers/queue/requirements/explicit_instantiation/1.cc:
Likewise.
* 
testsuite/23_containers/queue/requirements/explicit_instantiation/1_c++98.cc:
Likewise.
* 
testsuite/23_containers/stack/requirements/explicit_instantiation/1.cc:
Likewise.
* 
testsuite/23_containers/stack/requirements/explicit_instantiation/1_c++98.cc:
Likewise.

Tested x86_64-linux. Committed to trunk.

commit 07fbdd7bda1166ab2722dbeb4fd3c6b8558b324b
Author: Jonathan Wakely 
Date:   Fri Sep 24 14:32:34 2021

libstdc++: Skip container adaptor tests that fail concept checks

As an extension, our container adaptors SFINAE away the default
constructor if the adapted sequence container is not default
constructible. When _GLIBCXX_CONCEPT_CHECKS is defined we enforce that
the sequence is default constructible, so the tests for the extension
fail. This disables the relevant parts of the tests.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* 
testsuite/23_containers/priority_queue/requirements/explicit_instantiation/1.cc:
Do not check non-default constructible sequences when
_GLIBCXX_CONCEPT_CHECKS is defined.
* 
testsuite/23_containers/priority_queue/requirements/explicit_instantiation/1_c++98.cc:
Likewise.
* 
testsuite/23_containers/queue/requirements/explicit_instantiation/1.cc:
Likewise.
* 
testsuite/23_containers/queue/requirements/explicit_instantiation/1_c++98.cc:
Likewise.
* 
testsuite/23_containers/stack/requirements/explicit_instantiation/1.cc:
Likewise.
* 
testsuite/23_containers/stack/requirements/explicit_instantiation/1_c++98.cc:
Likewise.

diff --git 
a/libstdc++-v3/testsuite/23_containers/priority_queue/requirements/explicit_instantiation/1.cc
 
b/libstdc++-v3/testsuite/23_containers/priority_queue/requirements/explicit_instantiation/1.cc
index d1e18f879df..a425001612d 100644
--- 
a/libstdc++-v3/testsuite/23_containers/priority_queue/requirements/explicit_instantiation/1.cc
+++ 
b/libstdc++-v3/testsuite/23_containers/priority_queue/requirements/explicit_instantiation/1.cc
@@ -24,12 +24,15 @@
 
 template class std::priority_queue;
 
-struct NonDefaultConstructible : std::vector {
-  NonDefaultConstructible(int) { }
-};
 struct Cmp : std::less {
   Cmp(int) { }
 };
+template class std::priority_queue, Cmp>;
+
+#ifndef _GLIBCXX_CONCEPT_CHECKS
+struct NonDefaultConstructible : std::vector {
+  NonDefaultConstructible(int) { }
+};
 template class std::priority_queue;
 template class std::priority_queue;
-template class std::priority_queue, Cmp>;
+#endif
diff --git 
a/libstdc++-v3/testsuite/23_containers/priority_queue/requirements/explicit_instantiation/1_c++98.cc
 
b/libstdc++-v3/testsuite/23_containers/priority_queue/requirements/explicit_instantiation/1_c++98.cc
index def9259dc6b..28549f5246e 100644
--- 
a/libstdc++-v3/testsuite/23_containers/priority_queue/requirements/explicit_instantiation/1_c++98.cc
+++ 
b/libstdc++-v3/testsuite/23_containers/priority_queue/requirements/explicit_instantiation/1_c++98.cc
@@ -24,12 +24,15 @@
 
 template class std::priority_queue;
 
-struct NonDefaultConstructible : std::vector {
-  NonDefaultConstructible(int) { }
-};
 struct Cmp : std::less {
   Cmp(int) { }
 };
+template class std::priority_queue, Cmp>;
+
+#ifndef _GLIBCXX_CONCEPT_CHECKS
+struct NonDefaultConstructible : std::vector {
+  NonDefaultConstructible(int) { }
+};
 template class std::priority_queue;
 template class std::priority_queue;
-template class std::priority_queue, Cmp>;
+#endif
diff --git 
a/libstdc++-v3/testsuite/23_containers/queue/requirements/explicit_instantiation/1.cc
 
b/libstdc++-v3/testsuite/23_containers/queue/requirements/explicit_instantiation/1.cc
index b737a15a30b..3b9090cb945 100644
--- 
a/libstdc++-v3/testsuite/23_containers/queue/requirements/explicit_instantiation/1.cc
+++ 
b/libstdc++-v3/testsuite/23_containers/queue/requirements/explicit_instantiation/1.cc
@@ -24,7 +24,9 @@
 
 template class std::queue;
 
+#ifndef _GLIBCXX_CONCEPT_CHECKS
 struct NonDefaultConstructible : std::deque {
   NonDefaultConstructible(int) { }
 };
 template 

[committed] libstdc++: Skip tests that fail with _GLIBCXX_CONCEPT_CHECKS

2021-09-28 Thread Jonathan Wakely via Gcc-patches
The extension that allows implicitly rebinding a container's allocator
is not allowed when _GLIBCXX_CONCEPT_CHECKS is defined, so skip the
tests for that extension.
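
For context, "rebinding" means turning an allocator for one type into the
matching allocator for another. The extension does this implicitly when the
allocator's value_type does not match the container's. The sketch below
shows only the explicit, portable form via std::allocator_traits; the alias
and function names are made up for the example.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <vector>

// An allocator for char, explicitly rebound to allocate int.
using CharAlloc = std::allocator<char>;
using Rebound = std::allocator_traits<CharAlloc>::rebind_alloc<int>;

static_assert(std::is_same<Rebound, std::allocator<int>>::value,
              "rebinding std::allocator<char> to int gives std::allocator<int>");
static_assert(std::is_same<Rebound::value_type, int>::value,
              "the rebound allocator's value_type matches the element type");

// A container whose allocator value_type matches its element type is
// valid with or without concept checks.
inline std::vector<int, Rebound>
make_vector()
{ return std::vector<int, Rebound>(3, 7); }
```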

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* 
testsuite/23_containers/deque/requirements/explicit_instantiation/3.cc:
Do not test implicit allocator rebinding when _GLIBCXX_CONCEPT_CHECKS
is defined.
* 
testsuite/23_containers/forward_list/requirements/explicit_instantiation/3.cc:
Likewise.
* testsuite/23_containers/list/requirements/explicit_instantiation/3.cc:
Likewise.
* testsuite/23_containers/list/requirements/explicit_instantiation/5.cc:
Likewise.
* testsuite/23_containers/map/requirements/explicit_instantiation/3.cc:
Likewise.
* testsuite/23_containers/map/requirements/explicit_instantiation/5.cc:
Likewise.
* 
testsuite/23_containers/multimap/requirements/explicit_instantiation/3.cc:
Likewise.
* 
testsuite/23_containers/multimap/requirements/explicit_instantiation/5.cc:
Likewise.
* 
testsuite/23_containers/multiset/requirements/explicit_instantiation/3.cc:
Likewise.
* 
testsuite/23_containers/multiset/requirements/explicit_instantiation/5.cc:
Likewise.
* testsuite/23_containers/set/requirements/explicit_instantiation/3.cc:
Likewise.
* testsuite/23_containers/set/requirements/explicit_instantiation/5.cc:
Likewise.
* 
testsuite/23_containers/unordered_map/requirements/explicit_instantiation/3.cc:
Likewise.
* 
testsuite/23_containers/unordered_multimap/requirements/explicit_instantiation/3.cc:
Likewise.
* 
testsuite/23_containers/unordered_multiset/requirements/explicit_instantiation/3.cc:
Likewise.
* 
testsuite/23_containers/unordered_set/requirements/explicit_instantiation/3.cc:
Likewise.
* 
testsuite/23_containers/vector/ext_pointer/explicit_instantiation/3.cc:
Likewise.
* 
testsuite/23_containers/vector/requirements/explicit_instantiation/3.cc:
Likewise.

Tested x86_64-linux. Committed to trunk.

commit b701f46ea6d651aff8dbd267c29213253045e2b6
Author: Jonathan Wakely 
Date:   Fri Sep 24 14:23:36 2021

libstdc++: Skip tests that fail with _GLIBCXX_CONCEPT_CHECKS

The extension that allows implicitly rebinding a container's allocator
is not allowed when _GLIBCXX_CONCEPT_CHECKS is defined, so skip the
tests for that extension.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* 
testsuite/23_containers/deque/requirements/explicit_instantiation/3.cc:
Do not test implicit allocator rebinding when 
_GLIBCXX_CONCEPT_CHECKS
is defined.
* 
testsuite/23_containers/forward_list/requirements/explicit_instantiation/3.cc:
Likewise.
* 
testsuite/23_containers/list/requirements/explicit_instantiation/3.cc:
Likewise.
* 
testsuite/23_containers/list/requirements/explicit_instantiation/5.cc:
Likewise.
* 
testsuite/23_containers/map/requirements/explicit_instantiation/3.cc:
Likewise.
* 
testsuite/23_containers/map/requirements/explicit_instantiation/5.cc:
Likewise.
* 
testsuite/23_containers/multimap/requirements/explicit_instantiation/3.cc:
Likewise.
* 
testsuite/23_containers/multimap/requirements/explicit_instantiation/5.cc:
Likewise.
* 
testsuite/23_containers/multiset/requirements/explicit_instantiation/3.cc:
Likewise.
* 
testsuite/23_containers/multiset/requirements/explicit_instantiation/5.cc:
Likewise.
* 
testsuite/23_containers/set/requirements/explicit_instantiation/3.cc:
Likewise.
* 
testsuite/23_containers/set/requirements/explicit_instantiation/5.cc:
Likewise.
* 
testsuite/23_containers/unordered_map/requirements/explicit_instantiation/3.cc:
Likewise.
* 
testsuite/23_containers/unordered_multimap/requirements/explicit_instantiation/3.cc:
Likewise.
* 
testsuite/23_containers/unordered_multiset/requirements/explicit_instantiation/3.cc:
Likewise.
* 
testsuite/23_containers/unordered_set/requirements/explicit_instantiation/3.cc:
Likewise.
* 
testsuite/23_containers/vector/ext_pointer/explicit_instantiation/3.cc:
Likewise.
* 
testsuite/23_containers/vector/requirements/explicit_instantiation/3.cc:
Likewise.

diff --git 
a/libstdc++-v3/testsuite/23_containers/deque/requirements/explicit_instantiation/3.cc
 
b/libstdc++-v3/testsuite/23_containers/deque/requirements/explicit_instantiation/3.cc
index 0cbedf4693b..2a23eaa3f17 100644
--- 

[committed] libstdc++: Fix concept checks for iterators

2021-09-28 Thread Jonathan Wakely via Gcc-patches
This adds some additional checks to the C++98-style concept checks for
iterators, and removes some bogus checks for mutable iterators. Instead
of requiring that the result of dereferencing a mutable iterator is
assignable (which is a property of the value type, not required for the
iterator) check that the reference type is a non-const reference to the
value type.
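
The new mutable-iterator requirement can be approximated outside the
library as follows. The trait name has_mutable_reference is hypothetical;
the real check is done with _SameTypeConcept inside boost_concept_check.h.

```cpp
#include <cassert>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical trait (not libstdc++'s): true when the iterator's reference
// type is exactly a non-const lvalue reference to its value type.
template<typename It>
struct has_mutable_reference
: std::is_same<typename std::iterator_traits<It>::reference,
               typename std::iterator_traits<It>::value_type&>
{ };

static_assert(has_mutable_reference<std::vector<int>::iterator>::value,
              "int& is a reference to the value type");
static_assert(has_mutable_reference<int*>::value,
              "pointers to non-const qualify");
static_assert(!has_mutable_reference<std::vector<int>::const_iterator>::value,
              "const int& is not int&");
static_assert(!has_mutable_reference<std::vector<bool>::iterator>::value,
              "vector<bool> uses a proxy reference");
```

The last assertion shows why the patch needs explicit specializations for
the vector<bool> iterators.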

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/bits/boost_concept_check.h (_ForwardIteratorConcept)
(_BidirectionalIteratorConcept, _RandomAccessIteratorConcept):
Check result types of iterator operations.
(_Mutable_ForwardIteratorConcept): Check that iterator's
reference type is a reference to its value type.
(_Mutable_BidirectionalIteratorConcept): Do not require the
value type to be assignable.
(_Mutable_RandomAccessIteratorConcept): Likewise.
* testsuite/24_iterators/operations/prev_neg.cc: Adjust dg-error
line number.

Tested x86_64-linux. Committed to trunk.

commit afffc96a5259ba4e3f3cca154dc5ea32a496875e
Author: Jonathan Wakely 
Date:   Fri Sep 24 13:56:33 2021

libstdc++: Fix concept checks for iterators

This adds some additional checks to the C++98-style concept checks for
iterators, and removes some bogus checks for mutable iterators. Instead
of requiring that the result of dereferencing a mutable iterator is
assignable (which is a property of the value type, not required for the
iterator) check that the reference type is a non-const reference to the
value type.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/bits/boost_concept_check.h (_ForwardIteratorConcept)
(_BidirectionalIteratorConcept, _RandomAccessIteratorConcept):
Check result types of iterator operations.
(_Mutable_ForwardIteratorConcept): Check that iterator's
reference type is a reference to its value type.
(_Mutable_BidirectionalIteratorConcept): Do not require the
value type to be assignable.
(_Mutable_RandomAccessIteratorConcept): Likewise.
* testsuite/24_iterators/operations/prev_neg.cc: Adjust dg-error
line number.

diff --git a/libstdc++-v3/include/bits/boost_concept_check.h 
b/libstdc++-v3/include/bits/boost_concept_check.h
index ba36c24abec..71c99c13e93 100644
--- a/libstdc++-v3/include/bits/boost_concept_check.h
+++ b/libstdc++-v3/include/bits/boost_concept_check.h
@@ -44,6 +44,14 @@
 #include 
 #include // for traits and tags
 
+namespace std  _GLIBCXX_VISIBILITY(default)
+{
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+  struct _Bit_iterator;
+  struct _Bit_const_iterator;
+_GLIBCXX_END_NAMESPACE_VERSION
+}
+
 namespace __gnu_cxx _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
@@ -470,6 +478,52 @@ struct _Aux_require_same<_Tp,_Tp> { typedef _Tp _Type; };
 _ValueT __val() const;
   };
 
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wunused-variable"
+
+  template 
+  struct _ForwardIteratorReferenceConcept
+  {
+void __constraints() {
+#if __cplusplus >= 201103L
+  typedef typename std::iterator_traits<_Tp>::reference _Ref;
+  static_assert(std::is_reference<_Ref>::value,
+ "reference type of a forward iterator must be a real reference");
+#endif
+}
+  };
+
+  template 
+  struct _Mutable_ForwardIteratorReferenceConcept
+  {
+void __constraints() {
+  typedef typename std::iterator_traits<_Tp>::reference _Ref;
+  typedef typename std::iterator_traits<_Tp>::value_type _Val;
+  __function_requires< _SameTypeConcept<_Ref, _Val&> >();
+}
+  };
+
+  // vector::iterator is not a real forward reference, but pretend it is.
+  template <>
+  struct _ForwardIteratorReferenceConcept
+  {
+void __constraints() { }
+  };
+
+  // vector::iterator is not a real forward reference, but pretend it is.
+  template <>
+  struct _Mutable_ForwardIteratorReferenceConcept
+  {
+void __constraints() { }
+  };
+
+  // And vector::const iterator too.
+  template <>
+  struct _ForwardIteratorReferenceConcept
+  {
+void __constraints() { }
+  };
+
   template 
   struct _ForwardIteratorConcept
   {
@@ -479,8 +533,12 @@ struct _Aux_require_same<_Tp,_Tp> { typedef _Tp _Type; };
   __function_requires< _ConvertibleConcept<
 typename std::iterator_traits<_Tp>::iterator_category,
 std::forward_iterator_tag> >();
+  __function_requires< _ForwardIteratorReferenceConcept<_Tp> >();
+  _Tp& __j = ++__i;
+  const _Tp& __k = __i++;
   typedef typename std::iterator_traits<_Tp>::reference _Ref;
-  _Ref __r _IsUnused = *__i;
+  _Ref __r = *__k;
+  _Ref __r2 = *__i++;
 }
 _Tp __i;
   };
@@ -490,7 +548,9 @@ struct _Aux_require_same<_Tp,_Tp> { typedef _Tp _Type; };
   {
 void __constraints() {
   __function_requires< _ForwardIteratorConcept<_Tp> >();
-  *__i++ = 

[committed] libstdc++: Improve types used as iterators in testsuite

2021-09-28 Thread Jonathan Wakely via Gcc-patches
Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* testsuite/25_algorithms/copy/34595.cc: Add missing operation
for type used as an iterator.
* testsuite/25_algorithms/unique_copy/check_type.cc: Likewise.

Tested x86_64-linux. Committed to trunk.

commit 5f1db7627f6eea2050c3d71f17bca5ecf586a813
Author: Jonathan Wakely 
Date:   Fri Sep 24 13:23:34 2021

libstdc++: Improve types used as iterators in testsuite

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* testsuite/25_algorithms/copy/34595.cc: Add missing operation
for type used as an iterator.
* testsuite/25_algorithms/unique_copy/check_type.cc: Likewise.

diff --git a/libstdc++-v3/testsuite/25_algorithms/copy/34595.cc 
b/libstdc++-v3/testsuite/25_algorithms/copy/34595.cc
index c534eeb17f5..513425a5a2c 100644
--- a/libstdc++-v3/testsuite/25_algorithms/copy/34595.cc
+++ b/libstdc++-v3/testsuite/25_algorithms/copy/34595.cc
@@ -27,11 +27,12 @@ class Counting_output_iterator
 public:
   Counting_output_iterator() : c(0) {}
   Counting_output_iterator& operator++() { return *this; }
+  Counting_output_iterator operator++(int) { return *this; }
   Counting_output_iterator& operator*() { return *this; }
-  
+
   template 
   void operator=(const T&) { ++c; }
-  
+
   std::size_t current_counter() const { return c; }
 };
 
diff --git a/libstdc++-v3/testsuite/25_algorithms/unique_copy/check_type.cc 
b/libstdc++-v3/testsuite/25_algorithms/unique_copy/check_type.cc
index af86548609f..27b35794e8a 100644
--- a/libstdc++-v3/testsuite/25_algorithms/unique_copy/check_type.cc
+++ b/libstdc++-v3/testsuite/25_algorithms/unique_copy/check_type.cc
@@ -25,27 +25,35 @@
 using __gnu_test::input_iterator_wrapper;
 using __gnu_test::output_iterator_wrapper;
 
-struct S1 { };
+template
+struct iter_facade
+{
+  T& operator++();
+  T operator++(int);
+  T& operator*() const;
+};
 
-struct S2
+struct S1 : iter_facade { };
+
+struct S2 : iter_facade
 {
   S2(const S1&) {}
 };
 
-bool 
+bool
 operator==(const S1&, const S1&) {return true;}
 
-struct X1 { };
+struct X1 : iter_facade  { };
 
-struct X2
+struct X2 : iter_facade
 {
   X2(const X1&) {}
 };
 
-bool 
+bool
 predicate(const X1&, const X1&) {return true;}
 
-output_iterator_wrapper 
+output_iterator_wrapper
 test1(input_iterator_wrapper& s1, output_iterator_wrapper& s2)
 { return std::unique_copy(s1, s1, s2); }
 


[committed] libstdc++: Fix tests that use invalid types in ordered containers

2021-09-28 Thread Jonathan Wakely via Gcc-patches
Types used in ordered containers need to be comparable, or the container
needs to use a custom comparison function. These tests fail when
_GLIBCXX_CONCEPT_CHECKS is defined, because the element types aren't
comparable.
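
As an illustration of the second option, an element type with no relational
operators can still be used if the container is given a comparison
function. Item, ByWeight and the two helper functions are made-up names for
this sketch and are not the types used in the tests.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <set>
#include <vector>

struct Item { int weight; };          // deliberately has no operator<

// Custom strict weak ordering supplied to the containers.
struct ByWeight
{
  bool operator()(const Item& a, const Item& b) const
  { return a.weight < b.weight; }
};

inline int
heaviest()
{
  std::priority_queue<Item, std::vector<Item>, ByWeight> pq;
  pq.push(Item{2});
  pq.push(Item{5});
  pq.push(Item{3});
  return pq.top().weight;             // largest element under ByWeight
}

inline std::size_t
distinct_weights()
{
  std::set<Item, ByWeight> s;
  s.insert(Item{1});
  s.insert(Item{1});                  // equivalent under ByWeight, ignored
  s.insert(Item{4});
  return s.size();
}
```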

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* testsuite/20_util/is_nothrow_swappable/value.h: Use custom
comparison function for priority_queue of type with no
relational operators.
* testsuite/20_util/is_swappable/value.h: Likewise.
* testsuite/24_iterators/output/concept.cc: Add operator< to
type used in set.

Tested x86_64-linux. Committed to trunk.

commit 4000d722e6091e923721b54911bb784eeec3
Author: Jonathan Wakely 
Date:   Fri Sep 24 13:21:34 2021

libstdc++: Fix tests that use invalid types in ordered containers

Types used in ordered containers need to be comparable, or the container
needs to use a custom comparison function. These tests fail when
_GLIBCXX_CONCEPT_CHECKS is defined, because the element types aren't
comparable.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* testsuite/20_util/is_nothrow_swappable/value.h: Use custom
comparison function for priority_queue of type with no
relational operators.
* testsuite/20_util/is_swappable/value.h: Likewise.
* testsuite/24_iterators/output/concept.cc: Add operator< to
type used in set.

diff --git a/libstdc++-v3/testsuite/20_util/is_nothrow_swappable/value.h 
b/libstdc++-v3/testsuite/20_util/is_nothrow_swappable/value.h
index 62b3db8dc1f..d6f166bee46 100644
--- a/libstdc++-v3/testsuite/20_util/is_nothrow_swappable/value.h
+++ b/libstdc++-v3/testsuite/20_util/is_nothrow_swappable/value.h
@@ -285,7 +285,9 @@ void test01()
   static_assert(test_property>(true), "");
   static_assert(test_property>(true), "");
+   std::priority_queue,
+   comps::CompareNoThrowCopyable>>(true), "");
   static_assert(test_property>(true), "");
   static_assert(test_property
+  bool operator()(const T&, const T&) const
+  { return false; }
+  };
 }
 void test01()
 {
@@ -152,7 +159,9 @@ void test01()
   static_assert(test_property[1][2][3]>(true), "");
   static_assert(test_property>(true), "");
+   std::priority_queue,
+   funny::DummyCmp>>(true), "");
   static_assert(test_property>(true), "");
   static_assert(test_property::iterator, int > );
 static_assert( output_iterator< array::iterator, B > );


[committed] libstdc++: Fix _OutputIteratorConcept checks in algorithms

2021-09-28 Thread Jonathan Wakely via Gcc-patches
The _OutputIteratorConcept should be checked using the correct value
category. The std::move_backward and std::copy_backward algorithms
should use _OutputIteratorConcept instead of _ConvertibleConcept.

In order to use the correct value category, the concept should use a
function that returns _ValueT instead of using an lvalue data member.
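
To see why the value category matters, consider a sink that can only be
assigned from an rvalue. The types below are hypothetical and only
illustrate the distinction. Assigning the result of a function returning
the value type is well-formed, while assigning an lvalue of the same type
is not.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

using Value = std::unique_ptr<int>;

// Hypothetical output target that accepts only rvalues of Value.
struct MoveOnlySink
{
  int received = 0;

  MoveOnlySink& operator=(Value&& p)
  { received = *p; return *this; }
};

static_assert(std::is_assignable<MoveOnlySink&, Value>::value,
              "assignment from an rvalue is allowed");
static_assert(!std::is_assignable<MoveOnlySink&, Value&>::value,
              "assignment from an lvalue is rejected");

inline Value
make_value()
{ return Value(new int(42)); }

inline int
assign_from_function_result()
{
  MoveOnlySink sink;
  sink = make_value();                // rvalue, like *__i++ = __val()
  return sink.received;
}
```

A check written as "*__i++ = __t" with an lvalue member __t would reject
such a sink even though an algorithm that only assigns rvalues can use it.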

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/bits/boost_concept_check.h (_OutputIteratorConcept):
Use a function to preserve value category of the type.
* include/bits/stl_algobase.h (copy, move, fill_n): Use a
reference as the second argument for _OutputIteratorConcept.
(copy_backward, move_backward): Use _OutputIteratorConcept
instead of _ConvertibleConcept.

Tested x86_64-linux. Committed to trunk.

commit 45a8cd256934be3770f7e000db7b13f10eabee9a
Author: Jonathan Wakely 
Date:   Fri Sep 24 15:35:20 2021

libstdc++: Fix _OutputIteratorConcept checks in algorithms

The _OutputIteratorConcept should be checked using the correct value
category. The std::move_backward and std::copy_backward algorithms
should use _OutputIteratorConcept instead of _ConvertibleConcept.

In order to use the correct value category, the concept should use a
function that returns _ValueT instead of using an lvalue data member.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/bits/boost_concept_check.h (_OutputIteratorConcept):
Use a function to preserve value category of the type.
* include/bits/stl_algobase.h (copy, move, fill_n): Use a
reference as the second argument for _OutputIteratorConcept.
(copy_backward, move_backward): Use _OutputIteratorConcept
instead of _ConvertibleConcept.

diff --git a/libstdc++-v3/include/bits/boost_concept_check.h 
b/libstdc++-v3/include/bits/boost_concept_check.h
index 5c87e32f36b..ba36c24abec 100644
--- a/libstdc++-v3/include/bits/boost_concept_check.h
+++ b/libstdc++-v3/include/bits/boost_concept_check.h
@@ -464,10 +464,10 @@ struct _Aux_require_same<_Tp,_Tp> { typedef _Tp _Type; };
   __function_requires< _AssignableConcept<_Tp> >();
   ++__i;// require preincrement operator
   __i++;// require postincrement operator
-  *__i++ = __t; // require postincrement and assignment
+  *__i++ = __val(); // require postincrement and assignment
 }
 _Tp __i;
-_ValueT __t;
+_ValueT __val() const;
   };
 
   template 
diff --git a/libstdc++-v3/include/bits/stl_algobase.h 
b/libstdc++-v3/include/bits/stl_algobase.h
index d0c49628d7f..e1443b8a92a 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -613,7 +613,7 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
   // concept requirements
   __glibcxx_function_requires(_InputIteratorConcept<_II>)
   __glibcxx_function_requires(_OutputIteratorConcept<_OI,
-   typename iterator_traits<_II>::value_type>)
+   typename iterator_traits<_II>::reference>)
   __glibcxx_requires_can_increment_range(__first, __last, __result);
 
   return std::__copy_move_a<__is_move_iterator<_II>::__value>
@@ -646,7 +646,7 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
   // concept requirements
   __glibcxx_function_requires(_InputIteratorConcept<_II>)
   __glibcxx_function_requires(_OutputIteratorConcept<_OI,
-   typename iterator_traits<_II>::value_type>)
+   typename iterator_traits<_II>::value_type&&>)
   __glibcxx_requires_can_increment_range(__first, __last, __result);
 
   return std::__copy_move_a(std::__miter_base(__first),
@@ -850,9 +850,8 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
   // concept requirements
   __glibcxx_function_requires(_BidirectionalIteratorConcept<_BI1>)
   __glibcxx_function_requires(_Mutable_BidirectionalIteratorConcept<_BI2>)
-  __glibcxx_function_requires(_ConvertibleConcept<
-   typename iterator_traits<_BI1>::value_type,
-   typename iterator_traits<_BI2>::value_type>)
+  __glibcxx_function_requires(_OutputIteratorConcept<_BI2,
+   typename iterator_traits<_BI1>::reference>)
   __glibcxx_requires_can_decrement_range(__first, __last, __result);
 
   return std::__copy_move_backward_a<__is_move_iterator<_BI1>::__value>
@@ -886,9 +885,8 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
   // concept requirements
   __glibcxx_function_requires(_BidirectionalIteratorConcept<_BI1>)
   __glibcxx_function_requires(_Mutable_BidirectionalIteratorConcept<_BI2>)
-  __glibcxx_function_requires(_ConvertibleConcept<
-   typename iterator_traits<_BI1>::value_type,
-   typename iterator_traits<_BI2>::value_type>)
+  __glibcxx_function_requires(_OutputIteratorConcept<_BI2,
+   typename iterator_traits<_BI1>::value_type&&>)
   

[committed] libstdc++: Specialize std::pointer_traits<__normal_iterator>

2021-09-28 Thread Jonathan Wakely via Gcc-patches
This allows std::__to_address to be used with __normal_iterator in
C++11/14/17 modes. Without the partial specialization the deduced
pointer_traits::element_type is incorrect, and so the return type of
__to_address is wrong.
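
The deduction problem can be reproduced outside the library. The sketch below is only an illustration under assumptions: `wrapper_iterator` and `some_container` are hypothetical stand-ins for `__normal_iterator` and a container type, not library names. Because the wrapper has no `element_type` member, the primary `std::pointer_traits` template falls back to the first template argument.

```cpp
#include <memory>
#include <type_traits>

// Hypothetical stand-in for __gnu_cxx::__normal_iterator: a class template
// whose first template argument is the wrapped pointer type.
template<typename Iterator, typename Container>
struct wrapper_iterator
{
  Iterator base() const { return current; }
  Iterator current;
};

// Hypothetical container tag, only used as the second template argument.
struct some_container { };

using wrapped = wrapper_iterator<char*, some_container>;

// wrapped has no element_type member, so the primary template uses the
// first template argument, which is char* rather than char.
using deduced = std::pointer_traits<wrapped>::element_type;
```

Here `deduced` names `char*`, so a function returning `element_type*` would yield `char**`, which appears to be the wrong return type the message describes.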

A similar partial specialization is probably needed for
__gnu_debug::_Safe_iterator.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/bits/stl_iterator.h (pointer_traits): Define partial
specialization for __normal_iterator.
* testsuite/24_iterators/normal_iterator/to_address.cc: New test.

Tested x86_64-linux. Committed to trunk.

commit 82626be2d633a9802a8b08727ef51c627e37fee5
Author: Jonathan Wakely 
Date:   Tue Sep 28 15:26:46 2021

libstdc++: Specialize std::pointer_traits<__normal_iterator>

This allows std::__to_address to be used with __normal_iterator in
C++11/14/17 modes. Without the partial specialization the deduced
pointer_traits::element_type is incorrect, and so the return type of
__to_address is wrong.

A similar partial specialization is probably needed for
__gnu_debug::_Safe_iterator.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/bits/stl_iterator.h (pointer_traits): Define partial
specialization for __normal_iterator.
* testsuite/24_iterators/normal_iterator/to_address.cc: New test.

diff --git a/libstdc++-v3/include/bits/stl_iterator.h 
b/libstdc++-v3/include/bits/stl_iterator.h
index c5b02408c1c..004d767224d 100644
--- a/libstdc++-v3/include/bits/stl_iterator.h
+++ b/libstdc++-v3/include/bits/stl_iterator.h
@@ -1285,6 +1285,34 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 { return __it.base(); }
 
 #if __cplusplus >= 201103L
+
+  // Need to specialize pointer_traits because the primary template will
+  // deduce element_type of __normal_iterator as T* rather than T.
+  template<typename _Iterator, typename _Container>
+struct pointer_traits<__gnu_cxx::__normal_iterator<_Iterator, _Container>>
+{
+private:
+  using _Base = pointer_traits<_Iterator>;
+
+public:
+  using element_type = typename _Base::element_type;
+  using pointer = __gnu_cxx::__normal_iterator<_Iterator, _Container>;
+  using difference_type = typename _Base::difference_type;
+
+  template<typename _Tp>
+   using rebind = __gnu_cxx::__normal_iterator<_Tp, _Container>;
+
+  static pointer
+  pointer_to(element_type& __e) noexcept
+  { return pointer(_Base::pointer_to(__e)); }
+
+#if __cplusplus >= 202002L
+  static element_type*
+  to_address(pointer __p) noexcept
+  { return __p.base(); }
+#endif
+};
+
   /**
* @addtogroup iterators
* @{
diff --git a/libstdc++-v3/testsuite/24_iterators/normal_iterator/to_address.cc 
b/libstdc++-v3/testsuite/24_iterators/normal_iterator/to_address.cc
new file mode 100644
index 000..510d627435f
--- /dev/null
+++ b/libstdc++-v3/testsuite/24_iterators/normal_iterator/to_address.cc
@@ -0,0 +1,6 @@
+// { dg-do compile { target { c++11 } } }
+#include <string>
+#include <memory>
+
+char* p = std::__to_address(std::string("1").begin());
+const char* q = std::__to_address(std::string("2").cbegin());


Re: [PATCH] rs6000: Remove builtin mask check from builtin_decl [PR102347]

2021-09-28 Thread Bill Schmidt via Gcc-patches
Hi Kewen,

Although I agree that what we do now is tragically bad (and will be fixed in 
the builtin rewrite), this seems a little too cavalier to remove all checking 
during initialization without adding any checking somewhere else. :-)  We still 
need to check for invalid usage when the builtin is expanded, and I don't think 
the old code does this at all.

Unless you are planning to do a backport, I think the proper way forward here 
is to just wait for the new builtin support to land.  In the new code, we 
initialize all built-ins up front, and check properly at expansion time whether 
the builtin is enabled in the environment that obtains during expand.

My two cents,
Bill

On 9/28/21 3:13 AM, Kewen.Lin wrote:
> Hi,
>
> As the discussion in PR102347, currently builtin_decl is invoked so
> early, it's when making up the function_decl for builtin functions,
> at that time the rs6000_builtin_mask could be wrong for those
> builtins sitting in #pragma/attribute target functions, though it
> will be updated properly later when LTO processes all nodes.
>
> This patch is to align with the practice i386 port adopts, also
> align with r10-7462 by relaxing builtin mask checking in some places.
>
> Bootstrapped and regress-tested on powerpc64le-linux-gnu P9 and
> powerpc64-linux-gnu P8.
>
> Is it ok for trunk?
>
> BR,
> Kewen
> -
> gcc/ChangeLog:
>
>   PR target/102347
>   * config/rs6000/rs6000-call.c (rs6000_builtin_decl): Remove builtin
>   mask check.
>
> gcc/testsuite/ChangeLog:
>
>   PR target/102347
>   * gcc.target/powerpc/pr102347.c: New test.
>
> ---
>  gcc/config/rs6000/rs6000-call.c | 14 --
>  gcc/testsuite/gcc.target/powerpc/pr102347.c | 15 +++
>  2 files changed, 19 insertions(+), 10 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102347.c
>
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index fd7f24da818..15e0e09c07d 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -13775,23 +13775,17 @@ rs6000_init_builtins (void)
>  }
>  }
>
> -/* Returns the rs6000 builtin decl for CODE.  */
> +/* Returns the rs6000 builtin decl for CODE.  Note that we don't check
> +   the builtin mask here since there could be some #pragma/attribute
> +   target functions and the rs6000_builtin_mask could be wrong when
> +   this checking happens, though it will be updated properly later.  */
>
>  tree
>  rs6000_builtin_decl (unsigned code, bool initialize_p ATTRIBUTE_UNUSED)
>  {
> -  HOST_WIDE_INT fnmask;
> -
>if (code >= RS6000_BUILTIN_COUNT)
>  return error_mark_node;
>
> -  fnmask = rs6000_builtin_info[code].mask;
> -  if ((fnmask & rs6000_builtin_mask) != fnmask)
> -{
> -  rs6000_invalid_builtin ((enum rs6000_builtins)code);
> -  return error_mark_node;
> -}
> -
>return rs6000_builtin_decls[code];
>  }
>
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr102347.c 
> b/gcc/testsuite/gcc.target/powerpc/pr102347.c
> new file mode 100644
> index 000..05c439a8dac
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr102347.c
> @@ -0,0 +1,15 @@
> +/* { dg-do link } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-require-effective-target lto } */
> +/* { dg-options "-flto -mdejagnu-cpu=power9" } */
> +
> +/* Verify there are no error messages in LTO mode.  */
> +
> +#pragma GCC target "cpu=power10"
> +int main ()
> +{
> +  float *b;
> +  __vector_quad c;
> +  __builtin_mma_disassemble_acc (b, &c);
> +  return 0;
> +}
> --
> 2.27.0
>



[pushed] Darwin, PPC : Fix R13 for PPC64.

2021-09-28 Thread Iain Sandoe via Gcc-patches
Hi,

We have a somewhat unusual situation in that for PPC64, R13 is
both reserved for future use by the ABI document and callee-saved.
In fact, it is already used internally by the pthreads
implementation to contain pthread_self.

So add R13 to the fixed regs, but also keep it in the callee-
saved set.

tested on powerpc-darwin9, pushed to master,
thanks
Iain

gcc/ChangeLog:

* config/rs6000/darwin.h (FIXED_R13): Add for PPC64.
(FIRST_SAVED_GP_REGNO): Save from R13 even when it is one
of the fixed regs.
---
 gcc/config/rs6000/darwin.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/darwin.h b/gcc/config/rs6000/darwin.h
index 6abf8e84f54..120b01f9a2b 100644
--- a/gcc/config/rs6000/darwin.h
+++ b/gcc/config/rs6000/darwin.h
@@ -203,7 +203,7 @@
 
 /* Make both r2 and r13 available for allocation.  */
 #define FIXED_R2 0
-#define FIXED_R13 0
+#define FIXED_R13 TARGET_64BIT
 
 /* Base register for access to local variables of the function.  */
 
@@ -213,6 +213,9 @@
 #undef  RS6000_PIC_OFFSET_TABLE_REGNUM
 #define RS6000_PIC_OFFSET_TABLE_REGNUM 31
 
+#undef FIRST_SAVED_GP_REGNO
+#define FIRST_SAVED_GP_REGNO 13
+
 /* Darwin's stack must remain 16-byte aligned for both 32 and 64 bit
ABIs.  */
 
-- 
2.24.3 (Apple Git-128)



[PATCH] c++: ttp matching with constrained auto parm [PR99909]

2021-09-28 Thread Patrick Palka via Gcc-patches
Here, when unifying TT with S, processing_template_decl is unset, and
this foils the dependence checks in do_auto_deduction for avoiding
checking constraints on an auto when the initializer is dependent.

This patch fixes this issue by making sure processing_template_decl is
set during the call to unify from coerce_template_template_parms; this
seems sensible because we're unifying one set of template parameters
with another, so we're dealing with templated trees throughout.
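
The save, bump and restore pattern that the patch relies on can be sketched in isolation. This is a simplified illustration, not the GCC source: `template_depth`, `depth_sentinel` and `depth_during_unify` are hypothetical names standing in for processing_template_decl, its sentinel class and the body of coerce_template_template_parms.

```cpp
#include <cassert>

// Hypothetical global standing in for processing_template_decl.
static int template_depth = 0;

// Hypothetical analogue of the sentinel: remember the counter on
// construction, optionally reset it, and restore it on destruction.
class depth_sentinel
{
  int saved;
public:
  explicit depth_sentinel (bool reset = true)
  : saved (template_depth)
  {
    if (reset)
      template_depth = 0;
  }
  ~depth_sentinel () { template_depth = saved; }
  depth_sentinel (const depth_sentinel&) = delete;
  depth_sentinel& operator= (const depth_sentinel&) = delete;
};

// Returns the depth observed by the step that stands in for unify.
int depth_during_unify ()
{
  depth_sentinel guard (/*reset*/false);
  ++template_depth;
  assert (template_depth > 0);          // still set for the coerce step
  int seen_by_unify = template_depth;   // and for the unify step
  return seen_by_unify;
}  // the destructor restores the counter on every exit path
```

Compared with a manual increment and decrement pair, the sentinel keeps the counter raised for everything up to the end of the scope, including early returns.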

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/99909

gcc/cp/ChangeLog:

* pt.c (coerce_template_template_parms): Keep
processing_template_decl set during the call to unify as well.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-ttp3.C: New test.
---
 gcc/cp/pt.c|  4 ++--
 gcc/testsuite/g++.dg/cpp2a/concepts-ttp3.C | 11 +++
 2 files changed, 13 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-ttp3.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 41fa7ed5e43..1dcdffe322a 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -7994,12 +7994,12 @@ coerce_template_template_parms (tree parm_parms,
   /* So coerce P's args to apply to A's parms, and then deduce between A's
 args and the converted args.  If that succeeds, A is at least as
 specialized as P, so they match.*/
+  processing_template_decl_sentinel ptds (/*reset*/false);
+  ++processing_template_decl;
   tree pargs = template_parms_level_to_args (parm_parms);
   pargs = add_outermost_template_args (outer_args, pargs);
-  ++processing_template_decl;
   pargs = coerce_template_parms (arg_parms, pargs, NULL_TREE, tf_none,
 /*require_all*/true, /*use_default*/true);
-  --processing_template_decl;
   if (pargs != error_mark_node)
{
  tree targs = make_tree_vec (nargs);
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-ttp3.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-ttp3.C
new file mode 100644
index 000..898524e0dfa
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-ttp3.C
@@ -0,0 +1,11 @@
+// PR c++/99909
+// { dg-do compile { target c++20 } }
+
+template constexpr bool always_true = true;
+template concept C = always_true;
+
+template struct S { };
+
+template class TT> void f() { }
+
+template void f();
-- 
2.33.0.591.gddb1055343



Re: [Patch] Fortran: Fix assumed-size to assumed-rank passing [PR94070]

2021-09-28 Thread Harald Anlauf via Gcc-patches
Hi Tobias,

let me first reach for my brown bag...

> Otherwise, the quote from F2018 of my previous email applies:
>
> F2018:16.9.109 LBOUND has for "case(i)", i.e. with a 'dim'
> argument the following. The case without 'dim' just iterates
> through case (i) for each dim. Thus:
>
> "If DIM is present,
>   ARRAY is a whole array,
>   and either ARRAY is an assumed-size array of rank DIM
>   or dimension DIM of ARRAY has nonzero extent,
>   the result has a value equal to the lower bound for subscript DIM of ARRAY.
> Otherwise, if DIM is present, the result value is 1."

It was probably too late, and I could no longer distinguish
"assumed-size" from "assumed-rank", and likely some more...

> Here, we assume dim=2 is present [either directly or via case(ii)],
> ARRAY is a whole array but it neither is of assumed size nor has nonzero
> extent.
> Hence, the "otherwise" applies and the result is 1 - as gfortran has
> and ifort has in the caller.

... which led to my complete confusion and loss of focus.

Of course you are right.  Sorry for that.  Will now put that bag on...

Harald



[pushed] libgcc, X86, Darwin: Export cpu_model and indicator.

2021-09-28 Thread Iain Sandoe via Gcc-patches
Hi,

These two symbols have been emitted since 4.8, but were not added
to the Darwin exports, so we have been using the ones from libgcc.a.

Added to libgcc_s now.

tested on i686 and x86_64-darwin, pushed to master,
thanks
Iain

Signed-off-by: Iain Sandoe 

libgcc/ChangeLog:

* config/i386/libgcc-darwin.ver: Add Symbols for
__cpu_model, __cpu_indicator_init.
---
 libgcc/config/i386/libgcc-darwin.ver | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/libgcc/config/i386/libgcc-darwin.ver 
b/libgcc/config/i386/libgcc-darwin.ver
index 5224cdc982e..c97dae73855 100644
--- a/libgcc/config/i386/libgcc-darwin.ver
+++ b/libgcc/config/i386/libgcc-darwin.ver
@@ -1,4 +1,7 @@
-
+GCC_4.8.0 {
+  __cpu_model
+  __cpu_indicator_init
+}
 
 %inherit GCC_12.0.0 GCC_7.0.0
 GCC_12.0.0 {
-- 
2.24.3 (Apple Git-128)



[PATCH] ctf: Do not warn for CTF not supported for GNU GIMPLE

2021-09-28 Thread Indu Bhagat via Gcc-patches
CTF is supported for C only.  Currently, a warning is emitted if the -gctf
command line option is specified for a non-C frontend.  This warning is also
used by the GCC testsuite framework - it skips adding -gctf to the list of
debug flags for automated testing, if CTF is not supported for the frontend.

The following warning, however, is not useful in case of LTO:

"lto1: note: CTF debug info requested, but not supported for ‘GNU GIMPLE’
frontend"

This patch disables the generation of the above warning for GNU GIMPLE.
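
The intended behaviour can be summarised with a small standalone sketch. It is an illustration only, written under assumptions: `has_prefix`, `ctf_decision` and `decide_ctf` are hypothetical names, and the language strings in any caller are examples rather than values taken from GCC.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a prefix test on the front end name.
inline bool has_prefix (const char *str, const char *prefix)
{
  return std::strncmp (str, prefix, std::strlen (prefix)) == 0;
}

// Hypothetical result of the check: whether to print the note and
// whether CTF generation stays enabled.
struct ctf_decision
{
  bool emit_note;
  bool keep_ctf;
};

// CTF stays enabled only for C.  For any other front end it is turned
// off, and the note is printed unless the front end is GNU GIMPLE.
inline ctf_decision decide_ctf (bool is_c_frontend, const char *lang_name)
{
  assert (lang_name != nullptr);
  if (is_c_frontend)
    return { false, true };
  return { !has_prefix (lang_name, "GNU GIMPLE"), false };
}
```

Note that in this sketch, as in the patch, CTF is still disabled for GNU GIMPLE; only the diagnostic is suppressed.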

Bootstrapped and regression tested on x86_64.

gcc/ChangeLog:

* toplev.c (process_options): Do not warn for GNU GIMPLE.
---
 gcc/toplev.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/gcc/toplev.c b/gcc/toplev.c
index e1688aa..511a343 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1416,14 +1416,16 @@ process_options (void)
debug_info_level = DINFO_LEVEL_NONE;
 }
 
-  /* CTF is supported for only C at this time.
- Compiling with -flto results in frontend language of GNU GIMPLE.  */
+  /* CTF is supported for only C at this time.  */
   if (!lang_GNU_C ()
   && ctf_debug_info_level > CTFINFO_LEVEL_NONE)
 {
-  inform (UNKNOWN_LOCATION,
- "CTF debug info requested, but not supported for %qs frontend",
- language_string);
+  /* Compiling with -flto results in frontend language of GNU GIMPLE.  It
+is not useful to warn in that case.  */
+  if (!startswith (lang_hooks.name, "GNU GIMPLE"))
+   inform (UNKNOWN_LOCATION,
+   "CTF debug info requested, but not supported for %qs frontend",
+   language_string);
   ctf_debug_info_level = CTFINFO_LEVEL_NONE;
 }
 
-- 
1.8.3.1



[PATCH] debug/102507: ICE in btf_finalize when compiling with -gbtf

2021-09-28 Thread Indu Bhagat via Gcc-patches
Fix the freeing of the btf_var_ids hash_map in btf_finalize ().

Testing notes:

- Bootstrapped GCC with -gbtf as an experiment.
- Usual bootstrap and regression testing on x86_64.
- BPF backend testing - make all-gcc, reg tested bpf.exp, btf.exp and ctf.exp.
  (tested using David Faust's config.gcc patch posted earlier
   https://gcc.gnu.org/pipermail/gcc-patches/2021-September/580422.html)

gcc/ChangeLog:

PR debug/102507
* btfout.c (GTY): Add GTY (()) albeit for cosmetic only purpose.
(btf_finalize): Empty the hash_map btf_var_ids.
---
 gcc/btfout.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/btfout.c b/gcc/btfout.c
index cdc6c63..a787815 100644
--- a/gcc/btfout.c
+++ b/gcc/btfout.c
@@ -70,7 +70,7 @@ static char btf_info_section_label[MAX_BTF_LABEL_BYTES];
converted to BTF_KIND_VAR type records. Strictly accounts for the index
from the start of the variable type entries, does not include the number
of types emitted prior to the variable records.  */
-static hash_map  *btf_var_ids;
+static GTY (()) hash_map  *btf_var_ids;
 
 /* Mapping of type IDs from original CTF ID to BTF ID. Types do not map
1-to-1 from CTF to BTF. To avoid polluting the CTF container when updating
@@ -1119,12 +1119,12 @@ btf_finalize (void)
 
   funcs = NULL;
 
+  btf_var_ids->empty ();
+  btf_var_ids = NULL;
+
   free (btf_id_map);
   btf_id_map = NULL;
 
-  ggc_free (btf_var_ids);
-  btf_var_ids = NULL;
-
   ctf_container_ref tu_ctfc = ctf_get_tu_ctfc ();
   ctfc_delete_container (tu_ctfc);
   tu_ctfc = NULL;
-- 
1.8.3.1



Re: [PATCH] c++: Fix up synthetization of defaulted comparison operators on classes with bitfields [PR102490]

2021-09-28 Thread Jakub Jelinek via Gcc-patches
On Tue, Sep 28, 2021 at 01:25:13PM -0400, Patrick Palka via Gcc-patches wrote:
> On Tue, 28 Sep 2021, Jakub Jelinek wrote:
> 
> > On Tue, Sep 28, 2021 at 06:49:38PM +0200, Jakub Jelinek via Gcc-patches 
> > wrote:
> > > On Tue, Sep 28, 2021 at 12:44:58PM -0400, Patrick Palka wrote:
> > > > Ah yeah, sorry for the noise, I misunderstood the function comment.
> > > > 
> > > > On a related note I think 'ctx' can also be a NAMESPACE_DECL here in
> > > > the case of a defaulted non-member operator<=> (as in the below), for
> > > > which I'd expect the added COMPLETE_TYPE_P check to crash, but it looks
> > > > like in this case DECL_INITIAL is error_mark_node instead of NULL_TREE
> > > > so a crash is averted.  If anyone else was wondering...
> > > > 
> > > >   struct A {
> > > > friend constexpr bool operator==(const A&, const A&);
> > > >   };
> > > > 
> > > >   constexpr bool operator==(const A&, const A&) = default;
> > > 
> > > That means maybe ctx isn't the right way to get at the type and we
> > > should look it up from the first argument's type?
> > > I guess I'll look at where the build_comparison_op takes it from...
> 
> I suspect this synthesize_method call from defaulted_late_check is
> really only needed when operator<=> has been defaulted inside the class
> definition, because out-of-class defaulted definitions generally already
> get eagerly synthesized IIUC.  So it might be fine to keep using ctx if
> we also check DECL_DEFAULTED_IN_CLASS_P in defaulted_late_check.  But
> Jason knows for sure..

Indeed, cp_finish_decl has:
      /* An out-of-class default definition is defined at
         the point where it is explicitly defaulted.  */
      if (DECL_DELETED_FN (decl))
        maybe_explain_implicit_delete (decl);
      else if (DECL_INITIAL (decl) == error_mark_node)
        synthesize_method (decl);

Jakub



[PATCH] bpf: correct extra_headers

2021-09-28 Thread David Faust via Gcc-patches
The BPF CO-RE support (commit 8bdabb37549f12ce727800a1c8aa182c0b1dd42a)
mistakenly overwrote bpf-*-* extra_headers in config.gcc, causing
bpf-helpers.h to not be installed. The redefinition with coreout.h is
unneeded, so delete it.

gcc/ChangeLog:

* config.gcc (bpf-*-*): Do not overwrite extra_headers.
---
 gcc/config.gcc | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 498c51e619d..aa5bd5d1459 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1531,7 +1531,6 @@ bpf-*-*)
 use_collect2=no
 extra_headers="bpf-helpers.h"
 use_gcc_stdint=provide
-extra_headers="coreout.h"
 extra_objs="coreout.o"
 target_gtfiles="$target_gtfiles \$(srcdir)/config/bpf/coreout.c"
 ;;
-- 
2.30.2



Re: [PATCH] coroutines: Only set parm copy guard vars if we have exceptions [PR 102454].

2021-09-28 Thread Jason Merrill via Gcc-patches

On 9/27/21 15:38, Iain Sandoe wrote:

For coroutines, we make copies of the original function arguments into
the coroutine frame.  Normally, these are destroyed on the proper exit
from the coroutine when the frame is destroyed.

However, if an exception is thrown before the first suspend point is
reached, the cleanup has to happen in the ramp function.  These cleanups
are guarded such that they are only applied to any param copies actually
made.

The ICE is caused by an attempt to set the guard variable when there are
no exceptions enabled (the guard var is not created in this case).

Fixed by checking for flag_exceptions in this case too.

While touching these code paths, also clean up the synthetic names used
when a function parm is unnamed.

tested on x86_64-darwin,
OK for master?


OK.


Signed-off-by: Iain Sandoe 

PR c++/102454

gcc/cp/ChangeLog:

* coroutines.cc (analyze_fn_parms): Clean up synthetic names for
unnamed function params.
(morph_fn_to_coro): Do not try to set a guard variable for param
DTORs in the ramp, unless we have exceptions active.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr102454.C: New test.
---
  gcc/cp/coroutines.cc   | 26 ---
  gcc/testsuite/g++.dg/coroutines/pr102454.C | 38 ++
  2 files changed, 52 insertions(+), 12 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/coroutines/pr102454.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index fbd5c49533f..c761e769c12 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -3829,13 +3829,12 @@ analyze_fn_parms (tree orig)
  
if (TYPE_HAS_NONTRIVIAL_DESTRUCTOR (parm.frame_type))

{
- char *buf = xasprintf ("_Coro_%s_live", IDENTIFIER_POINTER (name));
- parm.guard_var = build_lang_decl (VAR_DECL, get_identifier (buf),
-   boolean_type_node);
- free (buf);
- DECL_ARTIFICIAL (parm.guard_var) = true;
- DECL_CONTEXT (parm.guard_var) = orig;
- DECL_INITIAL (parm.guard_var) = boolean_false_node;
+ char *buf = xasprintf ("%s%s_live", DECL_NAME (arg) ? "_Coro_" : "",
+IDENTIFIER_POINTER (name));
+ parm.guard_var
+   = coro_build_artificial_var (UNKNOWN_LOCATION, get_identifier (buf),
+boolean_type_node, orig,
+boolean_false_node);
  parm.trivial_dtor = false;
}
else
@@ -4843,11 +4842,14 @@ morph_fn_to_coro (tree orig, tree *resumer, tree 
*destroyer)
 NULL, parm.frame_type,
 LOOKUP_NORMAL,
 tf_warning_or_error);
- /* This var is now live.  */
- r = build_modify_expr (fn_start, parm.guard_var,
-boolean_type_node, INIT_EXPR, fn_start,
-boolean_true_node, boolean_type_node);
- finish_expr_stmt (r);
+ if (flag_exceptions)
+   {
+ /* This var is now live.  */
+ r = build_modify_expr (fn_start, parm.guard_var,
+boolean_type_node, INIT_EXPR, fn_start,
+boolean_true_node, boolean_type_node);
+ finish_expr_stmt (r);
+   }
}
}
  }
diff --git a/gcc/testsuite/g++.dg/coroutines/pr102454.C 
b/gcc/testsuite/g++.dg/coroutines/pr102454.C
new file mode 100644
index 000..41aeda7b973
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/pr102454.C
@@ -0,0 +1,38 @@
+//  { dg-additional-options "-fno-exceptions" }
+
+#include 
+#include 
+
+template 
+struct looper {
+  struct promise_type {
+auto get_return_object () { return handle_type::from_promise (*this); }
+auto initial_suspend () { return suspend_always_prt {}; }
+auto final_suspend () noexcept { return suspend_always_prt {}; }
+void return_value (T);
+void unhandled_exception ();
+  };
+
+  using handle_type = std::coroutine_handle;
+
+  looper (handle_type);
+
+  struct suspend_always_prt {
+bool await_ready () noexcept;
+void await_suspend (handle_type) noexcept;
+void await_resume () noexcept;
+  };
+};
+
+template 
+looper
+with_ctorable_state (T)
+{
+  co_return T ();
+}
+
+auto
+foo ()
+{
+  return with_ctorable_state;
+}





Re: [PATCH] c++: Fix up synthetization of defaulted comparison operators on classes with bitfields [PR102490]

2021-09-28 Thread Patrick Palka via Gcc-patches
On Tue, 28 Sep 2021, Jakub Jelinek wrote:

> On Tue, Sep 28, 2021 at 06:49:38PM +0200, Jakub Jelinek via Gcc-patches wrote:
> > On Tue, Sep 28, 2021 at 12:44:58PM -0400, Patrick Palka wrote:
> > > Ah yeah, sorry for the noise, I misunderstood the function comment.
> > > 
> > > On a related note I think 'ctx' can also be a NAMESPACE_DECL here in
> > > the case of a defaulted non-member operator<=> (as in the below), for
> > > which I'd expect the added COMPLETE_TYPE_P check to crash, but it looks
> > > like in this case DECL_INITIAL is error_mark_node instead of NULL_TREE
> > > so a crash is averted.  If anyone else was wondering...
> > > 
> > >   struct A {
> > > friend constexpr bool operator==(const A&, const A&);
> > >   };
> > > 
> > >   constexpr bool operator==(const A&, const A&) = default;
> > 
> > That means maybe ctx isn't the right way to get at the type and we
> > should look it up from the first argument's type?
> > I guess I'll look at where the build_comparison_op takes it from...

I suspect this synthesize_method call from defaulted_late_check is
really only needed when operator<=> has been defaulted inside the class
definition, because out-of-class defaulted definitions generally already
get eagerly synthesized IIUC.  So it might be fine to keep using ctx if
we also check DECL_DEFAULTED_IN_CLASS_P in defaulted_late_check.  But
Jason knows for sure..

> 
>   tree lhs = DECL_ARGUMENTS (fndecl);
>   if (is_this_parameter (lhs))
>     lhs = cp_build_fold_indirect_ref (lhs);
>   else
>     lhs = convert_from_reference (lhs);
>   tree ctype = TYPE_MAIN_VARIANT (TREE_TYPE (lhs));
> apparently.
> 
>   Jakub
> 
> 



[committed] libstdc++: Improve std::forward static assert message

2021-09-28 Thread Jonathan Wakely via Gcc-patches
The previous message told you something was wrong, but not why it
happened or why it's bad. This changes it to explain that the function
is being misused.
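
For readers unfamiliar with the misuse in question, a minimal sketch follows. The helper names `is_lvalue`, `relay` and `sample` are hypothetical and exist only for this illustration; the commented-out line shows the kind of call that the static_assert is meant to reject.

```cpp
#include <utility>

// Hypothetical helper: reports whether its argument arrived as an lvalue.
constexpr bool is_lvalue (const int&) { return true; }
constexpr bool is_lvalue (int&&) { return false; }

// Intended use: forward a forwarding-reference parameter, preserving the
// value category that the caller supplied.
template<typename T>
constexpr bool relay (T&& t)
{
  return is_lvalue (std::forward<T> (t));
}

// Hypothetical named object, used to pass an lvalue to relay.
constexpr int sample = 0;

// Misuse: the argument is an rvalue, but the explicit template argument
// asks for an lvalue reference result, so the static_assert fires.
//   int& r = std::forward<int&> (42);
```

Calling `relay (sample)` forwards an lvalue and `relay (1)` forwards an rvalue; neither call trips the assertion because the template argument is deduced rather than forced.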

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/bits/move.h (forward(remove_reference_t&&)):
Improve text of static_assert.
* testsuite/20_util/forward/c_neg.cc: Adjust dg-error.
* testsuite/20_util/forward/f_neg.cc: Likewise.

Tested x86_64-linux. Committed to trunk.

commit a11052d98db2f2a61841f0c5ee84de4ca1b3e296
Author: Jonathan Wakely 
Date:   Tue Sep 28 12:35:29 2021

libstdc++: Improve std::forward static assert message

The previous message told you something was wrong, but not why it
happened or why it's bad. This changes it to explain that the function
is being misused.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/bits/move.h (forward(remove_reference_t&&)):
Improve text of static_assert.
* testsuite/20_util/forward/c_neg.cc: Adjust dg-error.
* testsuite/20_util/forward/f_neg.cc: Likewise.

diff --git a/libstdc++-v3/include/bits/move.h b/libstdc++-v3/include/bits/move.h
index 3abbb37ceeb..2dd7ed9e4f9 100644
--- a/libstdc++-v3/include/bits/move.h
+++ b/libstdc++-v3/include/bits/move.h
@@ -88,8 +88,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 constexpr _Tp&&
 forward(typename std::remove_reference<_Tp>::type&& __t) noexcept
 {
-  static_assert(!std::is_lvalue_reference<_Tp>::value, "template argument"
-   " substituting _Tp must not be an lvalue reference type");
+  static_assert(!std::is_lvalue_reference<_Tp>::value,
+ "std::forward must not be used to convert an rvalue to an lvalue");
   return static_cast<_Tp&&>(__t);
 }
 
diff --git a/libstdc++-v3/testsuite/20_util/forward/c_neg.cc 
b/libstdc++-v3/testsuite/20_util/forward/c_neg.cc
index dc7ec51bde6..3875792866e 100644
--- a/libstdc++-v3/testsuite/20_util/forward/c_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/forward/c_neg.cc
@@ -17,7 +17,7 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-error "must not be an lvalue reference" "" { target *-*-* } 0 }
+// { dg-error "convert an rvalue to an lvalue" "" { target *-*-* } 0 }
 
 #include 
 
diff --git a/libstdc++-v3/testsuite/20_util/forward/f_neg.cc 
b/libstdc++-v3/testsuite/20_util/forward/f_neg.cc
index 4ccd7264c65..51ccaf29c1a 100644
--- a/libstdc++-v3/testsuite/20_util/forward/f_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/forward/f_neg.cc
@@ -17,7 +17,7 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-error "must not be an lvalue reference" "" { target *-*-* } 0 }
+// { dg-error "convert an rvalue to an lvalue" "" { target *-*-* } 0 }
 
 #include 
 


[committed] libstdc++: Fix mismatched noexcept-specifiers in filesystem::path [PR102499]

2021-09-28 Thread Jonathan Wakely via Gcc-patches
Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

PR libstdc++/102499
* include/bits/fs_path.h (path::begin, path::end): Add noexcept
to declarations, to match definitions.
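
A small standalone sketch of the rule involved, assuming nothing about the real header beyond what the ChangeLog says: `tiny_path` is a hypothetical class, and its members return `const char*` rather than an iterator type. The point is only that an in-class declaration and its out-of-line definition should carry the same exception specification.

```cpp
#include <utility>

// Hypothetical miniature of a class whose begin()/end() members are
// defined out of line.
class tiny_path
{
  const char *text;
public:
  explicit constexpr tiny_path (const char *s) : text (s) { }

  // The exception specification here has to agree with the definitions.
  const char* begin () const noexcept;
  const char* end () const noexcept;
};

inline const char*
tiny_path::begin () const noexcept
{ return text; }

inline const char*
tiny_path::end () const noexcept
{
  const char *p = text;
  while (*p)
    ++p;
  return p;
}
```

If the noexcept were dropped from only the declarations or only the definitions in this sketch, a conforming compiler would be entitled to reject the mismatch.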

Tested x86_64-linux. Committed to trunk.

commit f2b7f56a15d9cbbd2f0db22e0e39c4dd161bab69
Author: Jonathan Wakely 
Date:   Mon Sep 27 22:07:12 2021

libstdc++: Fix mismatched noexcept-specifiers in filesystem::path [PR102499]

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

PR libstdc++/102499
* include/bits/fs_path.h (path::begin, path::end): Add noexcept
to declarations, to match definitions.

diff --git a/libstdc++-v3/include/bits/fs_path.h 
b/libstdc++-v3/include/bits/fs_path.h
index 92f7cbbe357..1918c243d74 100644
--- a/libstdc++-v3/include/bits/fs_path.h
+++ b/libstdc++-v3/include/bits/fs_path.h
@@ -489,8 +489,8 @@ namespace __detail
 class iterator;
 using const_iterator = iterator;
 
-iterator begin() const;
-iterator end() const;
+iterator begin() const noexcept;
+iterator end() const noexcept;
 
 /// Write a path to a stream
 template


Re: [PATCH] c++: Fix up synthetization of defaulted comparison operators on classes with bitfields [PR102490]

2021-09-28 Thread Jakub Jelinek via Gcc-patches
On Tue, Sep 28, 2021 at 06:49:38PM +0200, Jakub Jelinek via Gcc-patches wrote:
> On Tue, Sep 28, 2021 at 12:44:58PM -0400, Patrick Palka wrote:
> > Ah yeah, sorry for the noise, I misunderstood the function comment.
> > 
> > On a related note I think 'ctx' can also be a NAMESPACE_DECL here in
> > the case of a defaulted non-member operator<=> (as in the below), for
> > which I'd expect the added COMPLETE_TYPE_P check to crash, but it looks
> > like in this case DECL_INITIAL is error_mark_node instead of NULL_TREE
> > so a crash is averted.  If anyone else was wondering...
> > 
> >   struct A {
> > friend constexpr bool operator==(const A&, const A&);
> >   };
> > 
> >   constexpr bool operator==(const A&, const A&) = default;
> 
> That means maybe ctx isn't the right way to get at the type and we
> should look it up from the first argument's type?
> I guess I'll look at where the build_comparison_op takes it from...

  tree lhs = DECL_ARGUMENTS (fndecl);
  if (is_this_parameter (lhs))
lhs = cp_build_fold_indirect_ref (lhs);
  else
lhs = convert_from_reference (lhs);
  tree ctype = TYPE_MAIN_VARIANT (TREE_TYPE (lhs));
apparently.

Jakub



Re: [PATCH] c++: Fix up synthetization of defaulted comparison operators on classes with bitfields [PR102490]

2021-09-28 Thread Jakub Jelinek via Gcc-patches
On Tue, Sep 28, 2021 at 12:44:58PM -0400, Patrick Palka wrote:
> Ah yeah, sorry for the noise, I misunderstood the function comment.
> 
> On a related note I think 'ctx' can also be a NAMESPACE_DECL here in
> the case of a defaulted non-member operator<=> (as in the below), for
> which I'd expect the added COMPLETE_TYPE_P check to crash, but it looks
> like in this case DECL_INITIAL is error_mark_node instead of NULL_TREE
> so a crash is averted.  If anyone else was wondering...
> 
>   struct A {
> friend constexpr bool operator==(const A&, const A&);
>   };
> 
>   constexpr bool operator==(const A&, const A&) = default;

That means maybe ctx isn't the right way to get at the type and we
should look it up from the first argument's type?
I guess I'll look at where the build_comparison_op takes it from...

Jakub



Re: [PATCH] c++: Fix up synthetization of defaulted comparison operators on classes with bitfields [PR102490]

2021-09-28 Thread Patrick Palka via Gcc-patches
On Tue, 28 Sep 2021, Jakub Jelinek wrote:

> On Tue, Sep 28, 2021 at 09:49:11AM -0400, Patrick Palka via Gcc-patches wrote:
> > > --- gcc/cp/method.c.jj	2021-09-15 08:55:37.563497558 +0200
> > > +++ gcc/cp/method.c	2021-09-27 13:48:12.139271830 +0200
> > > @@ -3160,8 +3160,11 @@ defaulted_late_check (tree fn)
> > >if (kind == sfk_comparison)
> > >  {
> > >/* If the function was declared constexpr, check that the definition
> > > -  qualifies.  Otherwise we can define the function lazily.  */
> > > -  if (DECL_DECLARED_CONSTEXPR_P (fn) && !DECL_INITIAL (fn))
> > > +  qualifies.  Otherwise we can define the function lazily.
> > > +  Don't do this if the class type is still incomplete.  */
> > > +  if (DECL_DECLARED_CONSTEXPR_P (fn)
> > > +   && !DECL_INITIAL (fn)
> > > +   && COMPLETE_TYPE_P (ctx))
> > >   {
> > 
> > According to the function comment for defaulted_late_check, won't
> > COMPLETE_TYPE_P (ctx) always be false here?
> 
> It is true in the call from the following hunk.
> The function comment at least to me doesn't imply it is always called on
> incomplete types, and defaultable_fn_check also calls it.

Ah yeah, sorry for the noise, I misunderstood the function comment.

On a related note I think 'ctx' can also be a NAMESPACE_DECL here in
the case of a defaulted non-member operator<=> (as in the below), for
which I'd expect the added COMPLETE_TYPE_P check to crash, but it looks
like in this case DECL_INITIAL is error_mark_node instead of NULL_TREE
so a crash is averted.  If anyone else was wondering...

  struct A {
friend constexpr bool operator==(const A&, const A&);
  };

  constexpr bool operator==(const A&, const A&) = default;

> > 
> > > /* Prevent GC.  */
> > > function_depth++;
> > > --- gcc/cp/class.c.jj	2021-09-03 09:46:28.801428380 +0200
> > > +++ gcc/cp/class.c	2021-09-27 14:07:03.465562255 +0200
> > > @@ -7467,7 +7467,14 @@ finish_struct_1 (tree t)
> > >   for any static member objects of the type we're working on.  */
> > >for (x = TYPE_FIELDS (t); x; x = DECL_CHAIN (x))
> > >  if (DECL_DECLARES_FUNCTION_P (x))
> > > -  DECL_IN_AGGR_P (x) = false;
> > > +  {
> > > + /* Synthetize constexpr defaulted comparisons.  */
> > > + if (!DECL_ARTIFICIAL (x)
> > > + && DECL_DEFAULTED_IN_CLASS_P (x)
> > > + && special_function_p (x) == sfk_comparison)
> > > +   defaulted_late_check (x);
> > > + DECL_IN_AGGR_P (x) = false;
> > > +  }
> > >  else if (VAR_P (x) && TREE_STATIC (x)
> > >&& TREE_TYPE (x) != error_mark_node
> > >&& same_type_p (TYPE_MAIN_VARIANT (TREE_TYPE (x)), t))
> 
>   Jakub
> 
> 



Re: [PATCH] [PR102501] Adjust jump threading testcases for ppc64* and others.

2021-09-28 Thread Jeff Law via Gcc-patches




On 9/28/2021 10:09 AM, Aldy Hernandez wrote:

I really don't know what to do here.  This is a bit of whack-a-mole.
The IL is sufficiently different for various architectures that any
tweak can cause the number of jump threads to vary.

For the pr77445-2.c testcase, we have fewer threading candidates because 2
of them now cross loop boundaries.  Interestingly, this test matches
"Jumps threaded", not threads registered, so the block copier can
drop threads at copying time adding further confusion.

For example, we can register N threads, but the old copier can cancel
N-M threads while updating the CFG for a variety of different reasons
(removed edges, threading through loop exits, etc).  This means the
"Registering jump threads" count need not match the total number of threads
this test checks for with "Jumps threaded".

The pr66752-3.c test OTOH, is just a matter of thread4 eliminating the
"if".  I had erroneously thought it would always be eliminated by
thread3, but we really don't care where it gets cleaned up.  All we know
is that DCE can't depend on the early threaders doing this work, because
it may cross loop boundaries.  I've chosen thread4 arbitrarily, but we
could just as easily pick the ".optimized" dump.

Sorry, I'm really at my wits' end here.  I don't see any clean path
forward, except to rewrite these tests as gimple IL.  They're close to useless
as they sit.

OK?

gcc/testsuite/ChangeLog:

PR testsuite/102501
* gcc.dg/tree-ssa/pr66752-3.c: Adjust.
* gcc.dg/tree-ssa/pr77445-2.c: Adjust.

Note these were two of the consistent failures on other targets as well.
Jeff



[Patch] Fortran: Fix same_type_as

2021-09-28 Thread Tobias Burnus

Found when looking at Sandra's c535b-1.f90 and playing around.
When fixing same_type_as, I spotted by code reading another issue,
related to not catering for derived types. (Untested whether it
failed indeed.)

I added now a bunch of testcases.

OK for mainline?

Tobias

Fortran: Fix same_type_as

A test for CLASS(*) + assumed rank was missing; adding a test to
unlimited_polymorphic_1.f03 showed an ICE as backend_decl wasn't
set. While gfc_get_symbol_decl would fix it, the code also assumed
that the class(*) was a variable and could not be a subobject of
a derived type.
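
To make the intended semantics easier to follow, here is a rough C++ model of the comparison the patch sets up. It is only a sketch under assumptions: `ToyVtab` and `toy_same_type_as` are hypothetical names, and combining the two non-null conditions with the hash comparison by short-circuit AND is assumed, since the hunks shown here only build conda/condb and the EQ_EXPR.

```cpp
// Hypothetical model of a vtable entry; only the type hash matters here.
struct ToyVtab { int hash; };

// Each argument models a vptr that may be null (no dynamic type yet).
// The null checks come first, so the hash is never read through a null
// pointer; that mirrors why the vptr is tested before taking its hash.
inline bool toy_same_type_as(const ToyVtab* a, const ToyVtab* b) {
  return a != nullptr && b != nullptr && a->hash == b->hash;
}
```

With this model, two vptrs with equal hashes compare as the same type, and any null vptr makes the result false.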

gcc/fortran/ChangeLog:

	* trans-intrinsic.c (gfc_conv_same_type_as): Fix handling
	of UNLIMITED_POLY.
	* trans.h (gfc_vtpr_hash_get): Renamed prototype to ...
	(gfc_vptr_hash_get): ... this to match function name.

gcc/testsuite/ChangeLog:

	* gfortran.dg/c-interop/c535b-1.f90: Remove wrong comment.
	* gfortran.dg/unlimited_polymorphic_1.f03: Extend.
	* gfortran.dg/unlimited_polymorphic_32.f90: New test.

 gcc/fortran/trans-intrinsic.c  |  42 ++--
 gcc/fortran/trans.h|   2 +-
 gcc/testsuite/gfortran.dg/c-interop/c535b-1.f90|   2 -
 .../gfortran.dg/unlimited_polymorphic_1.f03|  17 +-
 .../gfortran.dg/unlimited_polymorphic_32.f90   | 254 +
 5 files changed, 296 insertions(+), 21 deletions(-)

diff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c
index 900a1a29817..2a2829c9f04 100644
--- a/gcc/fortran/trans-intrinsic.c
+++ b/gcc/fortran/trans-intrinsic.c
@@ -9126,21 +9126,14 @@ gfc_conv_same_type_as (gfc_se *se, gfc_expr *expr)
   a = expr->value.function.actual->expr;
   b = expr->value.function.actual->next->expr;
 
-  if (UNLIMITED_POLY (a))
+  bool unlimited_poly_a = UNLIMITED_POLY (a);
+  bool unlimited_poly_b = UNLIMITED_POLY (b);
+  if (unlimited_poly_a)
 {
-  tmp = gfc_class_vptr_get (a->symtree->n.sym->backend_decl);
-  conda = fold_build2_loc (input_location, NE_EXPR, logical_type_node,
-			   tmp, build_int_cst (TREE_TYPE (tmp), 0));
-}
-
-  if (UNLIMITED_POLY (b))
-{
-  tmp = gfc_class_vptr_get (b->symtree->n.sym->backend_decl);
-  condb = fold_build2_loc (input_location, NE_EXPR, logical_type_node,
-			   tmp, build_int_cst (TREE_TYPE (tmp), 0));
+  se1.want_pointer = 1;
+  gfc_add_vptr_component (a);
 }
-
-  if (a->ts.type == BT_CLASS)
+  else if (a->ts.type == BT_CLASS)
 {
   gfc_add_vptr_component (a);
   gfc_add_hash_component (a);
@@ -9149,7 +9142,12 @@ gfc_conv_same_type_as (gfc_se *se, gfc_expr *expr)
 a = gfc_get_int_expr (gfc_default_integer_kind, NULL,
 			  a->ts.u.derived->hash_value);
 
-  if (b->ts.type == BT_CLASS)
+  if (unlimited_poly_b)
+{
+  se2.want_pointer = 1;
+  gfc_add_vptr_component (b);
+}
+  else if (b->ts.type == BT_CLASS)
 {
   gfc_add_vptr_component (b);
   gfc_add_hash_component (b);
@@ -9161,6 +9159,22 @@ gfc_conv_same_type_as (gfc_se *se, gfc_expr *expr)
   gfc_conv_expr (&se1, a);
   gfc_conv_expr (&se2, b);
 
+  if (unlimited_poly_a)
+{
+  conda = fold_build2_loc (input_location, NE_EXPR, logical_type_node,
+			   se1.expr,
+			   build_int_cst (TREE_TYPE (se1.expr), 0));
+  se1.expr = gfc_vptr_hash_get (se1.expr);
+}
+
+  if (unlimited_poly_b)
+{
+  condb = fold_build2_loc (input_location, NE_EXPR, logical_type_node,
+			   se2.expr,
+			   build_int_cst (TREE_TYPE (se2.expr), 0));
+  se2.expr = gfc_vptr_hash_get (se2.expr);
+}
+
   tmp = fold_build2_loc (input_location, EQ_EXPR,
 			 logical_type_node, se1.expr,
 			 fold_convert (TREE_TYPE (se1.expr), se2.expr));
diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h
index 53f0f86b265..fa3e8651b44 100644
--- a/gcc/fortran/trans.h
+++ b/gcc/fortran/trans.h
@@ -438,7 +438,7 @@ tree gfc_class_vtab_def_init_get (tree);
 tree gfc_class_vtab_copy_get (tree);
 tree gfc_class_vtab_final_get (tree);
 /* Get an accessor to the vtab's * field, when a vptr handle is present.  */
-tree gfc_vtpr_hash_get (tree);
+tree gfc_vptr_hash_get (tree);
 tree gfc_vptr_size_get (tree);
 tree gfc_vptr_extends_get (tree);
 tree gfc_vptr_def_init_get (tree);
diff --git a/gcc/testsuite/gfortran.dg/c-interop/c535b-1.f90 b/gcc/testsuite/gfortran.dg/c-interop/c535b-1.f90
index 3de77b00106..748e027f897 100644
--- a/gcc/testsuite/gfortran.dg/c-interop/c535b-1.f90
+++ b/gcc/testsuite/gfortran.dg/c-interop/c535b-1.f90
@@ -297,8 +297,6 @@ end function
 ! coshape, lcobound, ucobound: requires CODIMENSION attribute, which is
 !   not permitted on an assumed-rank variable.
 !
-! extends_type_of, same_type_as: require a class argument.
-
 
 ! F2018 additionally permits the first arg to 

Re: [PATCH] Improve jump threading dump output.

2021-09-28 Thread Aldy Hernandez via Gcc-patches




On 9/28/21 6:05 PM, Richard Biener wrote:

On September 28, 2021 5:45:52 PM GMT+02:00, Jeff Law via Gcc-patches 
 wrote:



On 9/28/2021 7:53 AM, Aldy Hernandez wrote:



On 9/28/21 3:47 PM, Jeff Law wrote:



On 9/28/2021 3:45 AM, Aldy Hernandez wrote:

In analyzing PR102511, it has become abundantly clear that we need
better debugging aids for the jump threader solver.  Currently
debugging these issues is a nightmare if you're not intimately
familiar with the code.  This patch attempts to improve this.

First, I'm enabling path solver dumps with TDF_THREADING. None of the
available TDF_* flags are a good match, and using TDF_DETAILS would
blow
up the dump file, since both threaders continually call the solver to
try out candidates.  This will allow dumping path solver details
without
having to resort to hacking the source.

I am also dumping the current registered_jump_thread dbg counter used
by the registry, in the solver.  That way narrowing down a problematic
thread can then be examined by -fdump-*-threading and looking at the
solver details surrounding the appropriate counter (which the dbgcnt
also dumps to the dump file).

You still need knowledge of the solver to debug these issues, but at
least now it's not entirely opaque.

OK?

gcc/ChangeLog:

 * dbgcnt.c (dbg_cnt_counter): New.
 * dbgcnt.h (dbg_cnt_counter): New.
 * dumpfile.c (dump_options): Add entry for TDF_THREADING.
 * dumpfile.h (enum dump_flag): Add TDF_THREADING.
 * gimple-range-path.cc (DEBUG_SOLVER): Use TDF_THREADING.
 * tree-ssa-threadupdate.c (dump_jump_thread_path): Dump out
 debug counter.

OK.

Note we've got massive failures in the tester starting sometime
yesterday and I suspect all the threader work.    So I'm going to
slow down on reviews of that code as we stabilize stuff.


Fair enough.  Let's knock those out then.

So several are failing gcc.dg/loop-unswitch-3.c.

This test appears to be verifying that we unswitch a test in one of the
loops, which is no longer happening after the change to replace the VRP
threader with the hybrid forward threader.

So both the old VRP threader and the new style identify and realize a
single jump thread.

In the old VRP threader realization of the jump thread ends up creating
nested loops.  In the new implementation we end up creating a single
loop with two back edges to the header.

ie, the (partial) graphs look like this

OLD

       1 <---+
       |     |
+----> 2     |
|     / \    |
|    3   4   |
+----+   +---+

NEW

+----> 2 <---+
|     / \    |
|    3   4   |
+----+   +---+


I wonder if we're not doing proper loop fixups or something similar
after that change.  IIRC we have/had bits in the copier and CFG update
code to mark the loops that need re-analysis and fixing up.

Anyway, you should be able to trigger and analyze with a cross compiler.

I've got to switch to my day job, but I'll pass along more as I get a
chance to look at them.


If you're stuck I'm also happy to help. Note that relying on loop fixup is 
almost never good because we easily lose track of loop association of info like 
OMP simd loops and all loop pragmas.


I could absolutely use the help here.  Care to take a look?

Aldy



[PATCH] [PR102501] Adjust jump threading testcases for ppc64* and others.

2021-09-28 Thread Aldy Hernandez via Gcc-patches
I really don't know what to do here.  This is a bit of whack-a-mole.
The IL is sufficiently different for various architectures that any
tweak can cause the number of jump threads to vary.

For the pr77445-2.c testcase, we have fewer threading candidates because 2
of them now cross loop boundaries.  Interestingly, this test matches
"Jumps threaded", not threads registered, so the block copier can
drop threads at copying time adding further confusion.

For example, we can register N threads, but the old copier can cancel
N-M threads while updating the CFG for a variety of different reasons
(removed edges, threading through loop exits, etc).  This means the
"Registering jump threads" count need not match the total number of threads
this test checks for with "Jumps threaded".

The pr66752-3.c test OTOH, is just a matter of thread4 eliminating the
"if".  I had erroneously thought it would always be eliminated by
thread3, but we really don't care where it gets cleaned up.  All we know
is that DCE can't depend on the early threaders doing this work, because
it may cross loop boundaries.  I've chosen thread4 arbitrarily, but we
could just as easily pick the ".optimized" dump.
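
For anyone not familiar with what "eliminating the if" means here, a minimal hand-written sketch follows. It is not the pr66752-3.c testcase; `on_set`, `on_clear` and both functions are hypothetical, and the second function is what a threader would conceptually produce, written out by hand.

```cpp
// Hypothetical helpers standing in for real work.
inline int on_set(int x) { return x * 2; }
inline int on_clear(int x) { return x + 1; }

// Before threading: the second test re-checks a value that is already
// known on each incoming path.
inline int before_threading(int x) {
  int flag = 0;
  if (x > 10)
    flag = 1;
  if (flag)
    return on_set(x);
  return on_clear(x);
}

// After threading (written by hand): each path jumps straight to its
// final target, so the "if (flag)" test disappears.
inline int after_threading(int x) {
  if (x > 10)
    return on_set(x);
  return on_clear(x);
}
```

Both functions compute the same result for every input; only the number of conditional branches on each path differs, which is what the dump scans are trying to observe.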

Sorry, I'm really at my wits' end here.  I don't see any clean path
forward, except to rewrite these tests as gimple IL.  They're close to useless
as they sit.

OK?

gcc/testsuite/ChangeLog:

PR testsuite/102501
* gcc.dg/tree-ssa/pr66752-3.c: Adjust.
* gcc.dg/tree-ssa/pr77445-2.c: Adjust.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c | 4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c
index 922a331b217..ba7025ae33b 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-thread1-details -fdump-tree-thread3" } */
+/* { dg-options "-O2 -fdump-tree-thread1-details -fdump-tree-thread4" } */
 
 extern int status, pt;
 extern int count;
@@ -43,4 +43,4 @@ foo (int N, int c, int b, int *a)
run after loop optimizations , can successfully eliminate the
references to FLAG.  Verify that ther are no references by the late
threading passes.  */
-/* { dg-final { scan-tree-dump-not "if .flag" "thread3"} } */
+/* { dg-final { scan-tree-dump-not "if .flag" "thread4"} } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c
index 01a0f1f197d..18f7aab2be7 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c
@@ -123,7 +123,7 @@ enum STATES FMS( u8 **in , u32 *transitions) {
aarch64 has the highest CASE_VALUES_THRESHOLD in GCC.  It's high enough
to change decisions in switch expansion which in turn can expose new
jump threading opportunities.  Skip the later tests on aarch64.  */
-/* { dg-final { scan-tree-dump "Jumps threaded: 9" "thread1" } } */
+/* { dg-final { scan-tree-dump "Jumps threaded: \[7-9\]" "thread1" } } */
 /* { dg-final { scan-tree-dump-times "Invalid sum" 1 "thread1" } } */
 /* { dg-final { scan-tree-dump-not "optimizing for size" "thread1" } } */
 /* { dg-final { scan-tree-dump-not "optimizing for size" "thread2" } } */
-- 
2.31.1



Re: [PATCH] Improve jump threading dump output.

2021-09-28 Thread Richard Biener via Gcc-patches
On September 28, 2021 5:45:52 PM GMT+02:00, Jeff Law via Gcc-patches 
 wrote:
>
>
>On 9/28/2021 7:53 AM, Aldy Hernandez wrote:
>>
>>
>> On 9/28/21 3:47 PM, Jeff Law wrote:
>>>
>>>
>>> On 9/28/2021 3:45 AM, Aldy Hernandez wrote:
>>>> In analyzing PR102511, it has become abundantly clear that we need
>>>> better debugging aids for the jump threader solver.  Currently
>>>> debugging these issues is a nightmare if you're not intimately
>>>> familiar with the code.  This patch attempts to improve this.
>>>>
>>>> First, I'm enabling path solver dumps with TDF_THREADING. None of the
>>>> available TDF_* flags are a good match, and using TDF_DETAILS would
>>>> blow
>>>> up the dump file, since both threaders continually call the solver to
>>>> try out candidates.  This will allow dumping path solver details
>>>> without
>>>> having to resort to hacking the source.
>>>>
>>>> I am also dumping the current registered_jump_thread dbg counter used
>>>> by the registry, in the solver.  That way narrowing down a problematic
>>>> thread can then be examined by -fdump-*-threading and looking at the
>>>> solver details surrounding the appropriate counter (which the dbgcnt
>>>> also dumps to the dump file).
>>>>
>>>> You still need knowledge of the solver to debug these issues, but at
>>>> least now it's not entirely opaque.
>>>>
>>>> OK?
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>> * dbgcnt.c (dbg_cnt_counter): New.
>>>> * dbgcnt.h (dbg_cnt_counter): New.
>>>> * dumpfile.c (dump_options): Add entry for TDF_THREADING.
>>>> * dumpfile.h (enum dump_flag): Add TDF_THREADING.
>>>> * gimple-range-path.cc (DEBUG_SOLVER): Use TDF_THREADING.
>>>> * tree-ssa-threadupdate.c (dump_jump_thread_path): Dump out
>>>> debug counter.
>>> OK.
>>>
>>> Note we've got massive failures in the tester starting sometime 
>>> yesterday and I suspect all the threader work.    So I'm going to 
>>> slow down on reviews of that code as we stabilize stuff.
>>
>> Fair enough.  Let's knock those out then.
>So several are failing gcc.dg/loop-unswitch-3.c.
>
>This test appears to be verifying that we unswitch a test in one of the 
>loops, which is no longer happening after the change to replace the VRP 
>threader with the hybrid forward threader.
>
>So both the old VRP threader and the new style identify and realize a 
>single jump thread.
>
>In the old VRP threader realization of the jump thread ends up creating 
>nested loops.  In the new implementation we end up creating a single 
>loop with two back edges to the header.
>
>ie, the (partial) graphs look like this
>
>OLD
>
>       1 <---+
>       |     |
>+----> 2     |
>|     / \    |
>|    3   4   |
>+----+   +---+
>
>NEW
>
>+----> 2 <---+
>|     / \    |
>|    3   4   |
>+----+   +---+
>
>
>I wonder if we're not doing proper loop fixups or something similar 
>after that change.  IIRC we have/had bits in the copier and CFG update 
>code to mark the loops that need re-analysis and fixing up.
>
>Anyway, you should be able to trigger and analyze with a cross compiler.
>
>I've got to switch to my day job, but I'll pass along more as I get a 
>chance to look at them.

If you're stuck I'm also happy to help. Note that relying on loop fixup is 
almost never good because we easily lose track of loop association of info like 
OMP simd loops and all loop pragmas. 

Richard. 

>jeff
>
>
>



Re: [PATCH] Improve jump threading dump output.

2021-09-28 Thread Jeff Law via Gcc-patches




On 9/28/2021 7:53 AM, Aldy Hernandez wrote:



On 9/28/21 3:47 PM, Jeff Law wrote:



On 9/28/2021 3:45 AM, Aldy Hernandez wrote:

In analyzing PR102511, it has become abundantly clear that we need
better debugging aids for the jump threader solver.  Currently
debugging these issues is a nightmare if you're not intimately
familiar with the code.  This patch attempts to improve this.

First, I'm enabling path solver dumps with TDF_THREADING. None of the
available TDF_* flags are a good match, and using TDF_DETAILS would 
blow

up the dump file, since both threaders continually call the solver to
try out candidates.  This will allow dumping path solver details 
without

having to resort to hacking the source.

I am also dumping the current registered_jump_thread dbg counter used
by the registry, in the solver.  That way narrowing down a problematic
thread can then be examined by -fdump-*-threading and looking at the
solver details surrounding the appropriate counter (which the dbgcnt
also dumps to the dump file).

You still need knowledge of the solver to debug these issues, but at
least now it's not entirely opaque.

OK?

gcc/ChangeLog:

* dbgcnt.c (dbg_cnt_counter): New.
* dbgcnt.h (dbg_cnt_counter): New.
* dumpfile.c (dump_options): Add entry for TDF_THREADING.
* dumpfile.h (enum dump_flag): Add TDF_THREADING.
* gimple-range-path.cc (DEBUG_SOLVER): Use TDF_THREADING.
* tree-ssa-threadupdate.c (dump_jump_thread_path): Dump out
debug counter.

OK.

Note we've got massive failures in the tester starting sometime 
yesterday and I suspect all the threader work.    So I'm going to 
slow down on reviews of that code as we stabilize stuff.


Fair enough.  Let's knock those out then.

So several are failing gcc.dg/loop-unswitch-3.c.

This test appears to be verifying that we unswitch a test in one of the 
loops, which is no longer happening after the change to replace the VRP 
threader with the hybrid forward threader.


So both the old VRP threader and the new style identify and realize a 
single jump thread.


In the old VRP threader realization of the jump thread ends up creating 
nested loops.  In the new implementation we end up creating a single 
loop with two back edges to the header.


ie, the (partial) graphs look like this

OLD

       1 <---+
       |     |
+----> 2     |
|     / \    |
|    3   4   |
+----+   +---+

NEW

+----> 2 <---+
|     / \    |
|    3   4   |
+----+   +---+


I wonder if we're not doing proper loop fixups or something similar 
after that change.  IIRC we have/had bits in the copier and CFG update 
code to mark the loops that need re-analysis and fixing up.
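
The difference between the two shapes can be checked mechanically. The following is a small sketch, not GCC's loop discovery code: `Cfg`, `find_back_edges`, `loop_headers`, `old_shape` and `new_shape` are hypothetical names, and the edge lists are my reading of the partial graphs above (3 -> 2 and 4 -> 1 in OLD, 3 -> 2 and 4 -> 2 in NEW). It counts back edges per header with a depth-first walk.

```cpp
#include <map>
#include <set>
#include <vector>

using Cfg = std::map<int, std::vector<int>>;  // block -> successors

// DFS that records back edges (edges to a block still on the DFS stack),
// grouped by the loop header they target.
inline void find_back_edges(const Cfg& cfg, int block, std::set<int>& on_stack,
                            std::set<int>& visited,
                            std::map<int, int>& back_edges_per_header) {
  visited.insert(block);
  on_stack.insert(block);
  auto it = cfg.find(block);
  if (it != cfg.end()) {
    for (int succ : it->second) {
      if (on_stack.count(succ))
        ++back_edges_per_header[succ];
      else if (!visited.count(succ))
        find_back_edges(cfg, succ, on_stack, visited, back_edges_per_header);
    }
  }
  on_stack.erase(block);
}

// Returns header -> number of back edges reaching it.
inline std::map<int, int> loop_headers(const Cfg& cfg, int entry) {
  std::set<int> on_stack, visited;
  std::map<int, int> result;
  find_back_edges(cfg, entry, on_stack, visited, result);
  return result;
}

// OLD shape: 3 -> 2 closes the inner loop, 4 -> 1 closes the outer loop.
inline Cfg old_shape() { return {{1, {2}}, {2, {3, 4}}, {3, {2}}, {4, {1}}}; }
// NEW shape: both 3 and 4 branch back to the single header 2.
inline Cfg new_shape() { return {{2, {3, 4}}, {3, {2}}, {4, {2}}}; }
```

In this model the OLD graph has two headers with one back edge each (a loop nest), while the NEW graph has a single header with two back edges.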


Anyway, you should be able to trigger and analyze with a cross compiler.

I've got to switch to my day job, but I'll pass along more as I get a 
chance to look at them.


jeff





Re: [r12-3899 Regression] FAIL: gcc.dg/strlenopt-13.c scan-tree-dump-times strlen1 "memcpy \\(" 7 on Linux/x86_64

2021-09-28 Thread Martin Sebor via Gcc-patches

On 9/28/21 1:20 AM, Richard Biener wrote:

On Mon, 27 Sep 2021, sunil.k.pandey wrote:


On Linux/x86_64,

d06dc8a2c73735e9496f434787ba4c93ceee5eea is the first bad commit
commit d06dc8a2c73735e9496f434787ba4c93ceee5eea
Author: Richard Biener 
Date:   Mon Sep 27 13:36:12 2021 +0200

 middle-end/102450 - avoid type_for_size for non-existing modes

caused

FAIL: gcc.dg/out-of-bounds-1.c  (test for warnings, line 12)
FAIL: gcc.dg/pr78408-1.c scan-tree-dump-times fab1 "after previous" 17
FAIL: gcc.dg/strlenopt-13.c scan-tree-dump-times strlen1 "memcpy \\(" 7


After the change the new memcpy inlining limit using MOVE_MAX * MOVE_RATIO
comes into play and ends up using an OImode move which previously was
disregarded as there's no __int256 standard type in the frontend
(but now we build such type anyway after verifying the mode exists and
it has move support).

For example gcc.dg/out-of-bounds-1.c which looks like

void ProjectOverlay(const float localTextureAxis[2], char *lump)
{
   const void *d = &localTextureAxis;
   int size = sizeof(float)*8 ;
   __builtin_memcpy( &lump[ 0 ], d, size );  /* { dg-warning "reading" } */
}

gets turned into

 movq%rdi, -8(%rsp)
 vmovdqu64   -8(%rsp), %ymm31
 vmovdqu64   %ymm31, (%rsi)

which I guess is good but then the diagnostic is no longer emitted
because -Wstringop-overread only applies to the builtin.  Usually
we avoid the folding in such a case but

   /* Detect out-of-bounds accesses without issuing warnings.
  Avoid folding out-of-bounds copies but to avoid false
  positives for unreachable code defer warning until after
  DCE has worked its magic.
  -Wrestrict is still diagnosed.  */
   if (int warning = check_bounds_or_overlap (as_a <gcall *>(stmt),
                                              dest, src, len, len,
                                              false, false))
 if (warning != OPT_Wrestrict)
   return false;


The check_bounds_or_overlap() call only implements -Wrestrict and
a small subset of -Warray-bounds (the subset issued for forming
out-of-bounds pointers by built-ins).  It's a limitation/bug in
the gimple-ssa-warn-restrict.c code that it doesn't detect
the problem (it's confused by taking the address of a pointer).

To let the test pass I suggest either bumping up the size or making
it an odd number (or anything else that's not a power of 2).
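
As a quick illustration of that suggestion, here is a minimal sketch of the power-of-2 test, assuming (as the suggestion implies) that only power-of-2 lengths are candidates for folding the copy into a single move. `is_power_of_two` is a hypothetical helper, not a GCC function, and the sizes in the note below assume a 4-byte float.

```cpp
#include <cstddef>

// True when n has exactly one bit set.
constexpr bool is_power_of_two(std::size_t n) {
  return n != 0 && (n & (n - 1)) == 0;
}
```

With a 4-byte float, sizeof(float)*8 is 32 and sizeof(float)*4 is 16, both powers of 2; a size such as 33 or 36 would not be.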



does not seem to trigger here.  Changing the testcase to

void ProjectOverlay(const float localTextureAxis[2], char *lump)
{
   const void *d = &localTextureAxis;
   int size = sizeof(float)*4 ;
   __builtin_memcpy( &lump[ 0 ], d, size );  /* { dg-warning "reading" } */
}

also fails to warn.


The very late -Wstringop-{overflow,overread} warnings that run just
before expansion have historically only worked for built-in calls.
Now that they are in a GIMPLE pass of their own as opposed to
working with trees in builtins.c, it will be easy to handle plain
stores as well.  It's on my list of things to do.

Martin



Richard.





[PATCH] aarch64: Add command-line support for Armv8.7-a

2021-09-28 Thread Kyrylo Tkachov via Gcc-patches
Hi all,

This patch adds support for -march=armv8.7-a in GCC.
It adds the +ls64 extension that's included in this architecture revision.
Currently this is just the command-line option, and +ls64 allows the relevant
instructions to be used in inline assembly. The ACLE defines some intrinsics
for them, but those can be added separately later (together with the
appropriate __ARM_FEATURE_* predefine).

Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing to trunk.
Thanks,
Kyrill

2021-09-27  Kyrylo Tkachov  

* config/aarch64/aarch64.h (AARCH64_FL_LS64): Define
(AARCH64_FL_V8_7): Likewise.
(AARCH64_FL_FOR_ARCH8_7): Likewise.
* config/aarch64/aarch64-arches.def (armv8.7-a): Define.
* config/aarch64/aarch64-option-extensions.def (ls64): Define.
* doc/invoke.texi: Document the above.


v87.patch
Description: v87.patch


Fwd: [PATCH][testsuite][aarch64]: Fix gcc.target/aarch64/auto-init-* tests.

2021-09-28 Thread Qing Zhao via Gcc-patches
Ping…

Qing

Begin forwarded message:

From: Qing Zhao via Gcc-patches <gcc-patches@gcc.gnu.org>
Subject: [PATCH][testsuite][aarch64]: Fix gcc.target/aarch64/auto-init-* tests.
Date: September 21, 2021 at 2:20:58 PM CDT
To: gcc-patches Nick Alcock via <gcc-patches@gcc.gnu.org>
Reply-To: Qing Zhao <qing.z...@oracle.com>

Hi,

This is the patch to fix gcc.target/aarch64/auto-init-* tests.

I have tested the change on aarch64-linux with

make check-gcc 
RUNTESTFLAGS='--target_board=unix\{-mabi=lp64,-mabi=ilp32,-mabi=lp64/-fstack-clash-protection/-fstack-protector-all,-mabi=ilp32/-fstack-clash-protection/-fstack-protector-all,-mabi=lp64/-march=armv8-a,-mabi=ilp32/-march=armv8.2-a,-mabi=lp64/-march=armv8.4-a,-mabi=ilp32/-march=armv8.6-a,-mabi=lp64/-march=armv8-r\}
 aarch64.exp=auto-init*'

Everything works fine.

Okay for commit?

Thanks.

Qing

==



From c46888eed5621df842178a85adf7e221c7e00b48 Mon Sep 17 00:00:00 2001
From: qing zhao <qing.z...@oracle.com>
Date: Tue, 21 Sep 2021 12:05:32 -0700
Subject: [PATCH] testsuite: Fix gcc.target/aarch64/auto-init-* tests.

Add -fno-stack-protector for two testing cases and also different
pattern match for lp64 and ilp32 for the other two cases.

gcc/testsuite/ChangeLog:

2021-09-21  qing zhao  <qing.z...@oracle.com>

* gcc.target/aarch64/auto-init-1.c: Add -fno-stack-protector.
* gcc.target/aarch64/auto-init-7.c: Likewise.
* gcc.target/aarch64/auto-init-2.c: Different pattern match for
lp64 and ilp32.
* gcc.target/aarch64/auto-init-padding-5.c: Likewise.
---
gcc/testsuite/gcc.target/aarch64/auto-init-1.c | 2 +-
gcc/testsuite/gcc.target/aarch64/auto-init-2.c | 3 ++-
gcc/testsuite/gcc.target/aarch64/auto-init-7.c | 2 +-
gcc/testsuite/gcc.target/aarch64/auto-init-padding-5.c | 3 ++-
4 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/auto-init-1.c 
b/gcc/testsuite/gcc.target/aarch64/auto-init-1.c
index 0fa4708..a38d91b 100644
--- a/gcc/testsuite/gcc.target/aarch64/auto-init-1.c
+++ b/gcc/testsuite/gcc.target/aarch64/auto-init-1.c
@@ -1,6 +1,6 @@
/* Verify zero initialization for integer and pointer type automatic variables. 
 */
/* { dg-do compile } */
-/* { dg-options "-ftrivial-auto-var-init=zero -fdump-rtl-expand" } */
+/* { dg-options "-ftrivial-auto-var-init=zero -fdump-rtl-expand 
-fno-stack-protector" } */

#ifndef __cplusplus
# define bool _Bool
diff --git a/gcc/testsuite/gcc.target/aarch64/auto-init-2.c 
b/gcc/testsuite/gcc.target/aarch64/auto-init-2.c
index 2c54e6d..136dbf6 100644
--- a/gcc/testsuite/gcc.target/aarch64/auto-init-2.c
+++ b/gcc/testsuite/gcc.target/aarch64/auto-init-2.c
@@ -32,4 +32,5 @@ void foo()
/* { dg-final { scan-rtl-dump-times "0xfe\\\]" 1 "expand" } } */
/* { dg-final { scan-rtl-dump-times "0xfefe" 1 "expand" } } */
/* { dg-final { scan-rtl-dump-times "0xfefefefe" 2 "expand" } } */
-/* { dg-final { scan-rtl-dump-times "0xfefefefefefefefe" 2 "expand" } } */
+/* { dg-final { scan-rtl-dump-times "0xfefefefefefefefe" 2 "expand" { target 
lp64 } } } */
+/* { dg-final { scan-rtl-dump-times "0xfefefefefefefefe" 1 "expand" { target 
ilp32 } } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/auto-init-7.c 
b/gcc/testsuite/gcc.target/aarch64/auto-init-7.c
index ac27fbe..fde6e56 100644
--- a/gcc/testsuite/gcc.target/aarch64/auto-init-7.c
+++ b/gcc/testsuite/gcc.target/aarch64/auto-init-7.c
@@ -1,6 +1,6 @@
/* Verify zero initialization for array, union, and structure type automatic 
variables.  */
/* { dg-do compile } */
-/* { dg-options "-ftrivial-auto-var-init=zero -fdump-rtl-expand" } */
+/* { dg-options "-ftrivial-auto-var-init=zero -fdump-rtl-expand 
-fno-stack-protector" } */

struct S
{
diff --git a/gcc/testsuite/gcc.target/aarch64/auto-init-padding-5.c 
b/gcc/testsuite/gcc.target/aarch64/auto-init-padding-5.c
index 3c45a6c..7991367 100644
--- a/gcc/testsuite/gcc.target/aarch64/auto-init-padding-5.c
+++ b/gcc/testsuite/gcc.target/aarch64/auto-init-padding-5.c
@@ -17,6 +17,7 @@ int foo ()
  return var.four;
}

-/* { dg-final { scan-assembler-times "stp\txzr, xzr," 2 } } */
+/* { dg-final { scan-assembler-times "stp\txzr, xzr," 2 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "stp\txzr, xzr," 1 { target ilp32 } } } */


--
1.9.1




Re: [PATCH] c++: Fix up synthetization of defaulted comparison operators on classes with bitfields [PR102490]

2021-09-28 Thread Jakub Jelinek via Gcc-patches
On Tue, Sep 28, 2021 at 09:49:11AM -0400, Patrick Palka via Gcc-patches wrote:
> > --- gcc/cp/method.c.jj  2021-09-15 08:55:37.563497558 +0200
> > +++ gcc/cp/method.c 2021-09-27 13:48:12.139271830 +0200
> > @@ -3160,8 +3160,11 @@ defaulted_late_check (tree fn)
> >if (kind == sfk_comparison)
> >  {
> >/* If the function was declared constexpr, check that the definition
> > -qualifies.  Otherwise we can define the function lazily.  */
> > -  if (DECL_DECLARED_CONSTEXPR_P (fn) && !DECL_INITIAL (fn))
> > +qualifies.  Otherwise we can define the function lazily.
> > +Don't do this if the class type is still incomplete.  */
> > +  if (DECL_DECLARED_CONSTEXPR_P (fn)
> > + && !DECL_INITIAL (fn)
> > + && COMPLETE_TYPE_P (ctx))
> > {
> 
> According to the function comment for defaulted_late_check, won't
> COMPLETE_TYPE_P (ctx) always be false here?

It is true in the call from the following hunk.
The function comment at least to me doesn't imply it is always called on
incomplete types, and defaultable_fn_check also calls it.
> 
> >   /* Prevent GC.  */
> >   function_depth++;
> > --- gcc/cp/class.c.jj   2021-09-03 09:46:28.801428380 +0200
> > +++ gcc/cp/class.c  2021-09-27 14:07:03.465562255 +0200
> > @@ -7467,7 +7467,14 @@ finish_struct_1 (tree t)
> >   for any static member objects of the type we're working on.  */
> >for (x = TYPE_FIELDS (t); x; x = DECL_CHAIN (x))
> >  if (DECL_DECLARES_FUNCTION_P (x))
> > -  DECL_IN_AGGR_P (x) = false;
> > +  {
> > +   /* Synthetize constexpr defaulted comparisons.  */
> > +   if (!DECL_ARTIFICIAL (x)
> > +   && DECL_DEFAULTED_IN_CLASS_P (x)
> > +   && special_function_p (x) == sfk_comparison)
> > + defaulted_late_check (x);
> > +   DECL_IN_AGGR_P (x) = false;
> > +  }
> >  else if (VAR_P (x) && TREE_STATIC (x)
> >  && TREE_TYPE (x) != error_mark_node
> >  && same_type_p (TYPE_MAIN_VARIANT (TREE_TYPE (x)), t))

Jakub



Re: *PING* [PATCH] c++: fix cases of core1001/1322 by not dropping cv-qualifier of function parameter of type of typename or decltype[PR101402,PR102033,PR102034,PR102039,PR102044]

2021-09-28 Thread Jason Merrill via Gcc-patches

On 9/25/21 15:15, nick huang wrote:

Why doesn't the PR92010 fix address these testcases as well?


3. PR92010 creates the new function "rebuild_function_or_method_type". I used
gdb to trace the PR101402 code, which is as follows:

template<class T> struct A {
  typedef T arr[3];
};
template<class T> void f(const typename A<T>::arr) { }  // #1
template void f<int>(const A<int>::arr);                // #2

I added some code to print the function declaration before and after calling
"maybe_rebuild_function_decl_type"; it prints the parameter "r", which is the
function declaration inside "tsubst_function_decl".
Here is the result:
a) Before the call, the function declaration is "void f(int*)"; after the call, it is adjusted to the correct one,
"void f(const int*)". However, after the line "SET_DECL_IMPLICIT_INSTANTIATION (r);", it falls back to the original
dependent type, "void f(typename A<T>::arr) [with T = int; typename A<T>::arr = int [3]]", and stays that way until the end. This
completely defeats the purpose of the template substitution effort.


That's just an artifact of (bug in) how we print it as template+args 
once it's marked as an instantiation; the actual type of the function 
returned from tsubst_function_decl is still void (const int*).
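
As a point of reference for the expected type, here is a minimal
non-template sketch. It is not taken from the patch or the PR, and the
names arr3, g, first and check_first are made up for illustration. It
relies only on the standard rule that a const-qualified array parameter
is adjusted to a pointer to const elements, which is the
void (const int*) result mentioned above:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical non-dependent counterpart of A<int>::arr.
typedef int arr3[3];

// "const arr3" is an array of const int; as a parameter it is adjusted
// to "const int *", so g has type void (const int *).
void g(const arr3);

static_assert(std::is_same<decltype(g), void(const int*)>::value,
              "const on the array typedef applies to the elements");

// Reads through the adjusted pointer, so it accepts a const array.
int first(const arr3 a) { return a[0]; }

// Small self-check, meant to be called from main().
void check_first() {
  const int v[3] = {7, 8, 9};
  assert(first(v) == 7);
}
```

The template case in the PR should end up with the same parameter type
once T = int is substituted; the sketch deliberately avoids templates so
that it does not depend on the behavior under discussion.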


The problem seems to come when we get back to determine_specialization, 
where we have


  // Then, try to form the new function type.  
=>insttype = tsubst (TREE_TYPE (fn), targs, tf_fndecl_type, NULL_TREE);


which does the wrong substitution again, and not the correct one from 
maybe_rebuild_function_decl_type.


Both this substitution check and the constraint check just before it 
seem redundant with the checks we already did in fn_type_unification, so 
the right fix may be to just remove the broken ones here in 
determine_specialization.


Jason



Re: [PATCH] Improve jump threading dump output.

2021-09-28 Thread Jeff Law via Gcc-patches




On 9/28/2021 7:53 AM, Aldy Hernandez wrote:



On 9/28/21 3:47 PM, Jeff Law wrote:



On 9/28/2021 3:45 AM, Aldy Hernandez wrote:

In analyzing PR102511, it has become abundantly clear that we need
better debugging aids for the jump threader solver.  Currently
debugging these issues is a nightmare if you're not intimately
familiar with the code.  This patch attempts to improve this.

First, I'm enabling path solver dumps with TDF_THREADING.  None of the
available TDF_* flags are a good match, and using TDF_DETAILS would blow
up the dump file, since both threaders continually call the solver to
try out candidates.  This will allow dumping path solver details without
having to resort to hacking the source.

I am also dumping the current registered_jump_thread dbg counter used
by the registry, in the solver.  That way narrowing down a problematic
thread can then be examined by -fdump-*-threading and looking at the
solver details surrounding the appropriate counter (which the dbgcnt
also dumps to the dump file).

You still need knowledge of the solver to debug these issues, but at
least now it's not entirely opaque.

OK?

gcc/ChangeLog:

* dbgcnt.c (dbg_cnt_counter): New.
* dbgcnt.h (dbg_cnt_counter): New.
* dumpfile.c (dump_options): Add entry for TDF_THREADING.
* dumpfile.h (enum dump_flag): Add TDF_THREADING.
* gimple-range-path.cc (DEBUG_SOLVER): Use TDF_THREADING.
* tree-ssa-threadupdate.c (dump_jump_thread_path): Dump out
debug counter.

OK.

Note we've got massive failures in the tester starting sometime 
yesterday and I suspect all the threader work.    So I'm going to 
slow down on reviews of that code as we stabilize stuff.


Fair enough.  Let's knock those out then.

Yup.  I suspect it's just one or two issues that are showing up on a 
variety of targets.  And as I've said before, that's why we've got a 
tester :-)




I just fixed a P1 that was causing undefined behavior.  Other than 
that, I don't have any known regressions apart from the loop crossing 
restrictions which you and me haven't agreed upon yet. (Well...there 
are some archs that need testsuite tweaking, but they're not bugs per 
se.)

These could end up being testsuite issues.  I've only debugged as far as 
"there's a sea of red failures" on the dashboard.




Send anything my way.

Got a docker instance of the first one spinning right now for debugging 
purposes.  I'll look at it after I finish playing chauffeur for my daughter.


jeff


Re: [PATCH] Improve jump threading dump output.

2021-09-28 Thread Aldy Hernandez via Gcc-patches




On 9/28/21 3:47 PM, Jeff Law wrote:



On 9/28/2021 3:45 AM, Aldy Hernandez wrote:

In analyzing PR102511, it has become abundantly clear that we need
better debugging aids for the jump threader solver.  Currently
debugging these issues is a nightmare if you're not intimately
familiar with the code.  This patch attempts to improve this.

First, I'm enabling path solver dumps with TDF_THREADING.  None of the
available TDF_* flags are a good match, and using TDF_DETAILS would blow
up the dump file, since both threaders continually call the solver to
try out candidates.  This will allow dumping path solver details without
having to resort to hacking the source.

I am also dumping the current registered_jump_thread dbg counter used
by the registry, in the solver.  That way narrowing down a problematic
thread can then be examined by -fdump-*-threading and looking at the
solver details surrounding the appropriate counter (which the dbgcnt
also dumps to the dump file).

You still need knowledge of the solver to debug these issues, but at
least now it's not entirely opaque.

OK?

gcc/ChangeLog:

* dbgcnt.c (dbg_cnt_counter): New.
* dbgcnt.h (dbg_cnt_counter): New.
* dumpfile.c (dump_options): Add entry for TDF_THREADING.
* dumpfile.h (enum dump_flag): Add TDF_THREADING.
* gimple-range-path.cc (DEBUG_SOLVER): Use TDF_THREADING.
* tree-ssa-threadupdate.c (dump_jump_thread_path): Dump out
debug counter.

OK.

Note we've got massive failures in the tester starting sometime 
yesterday and I suspect all the threader work.    So I'm going to slow 
down on reviews of that code as we stabilize stuff.


Fair enough.  Let's knock those out then.

I just fixed a P1 that was causing undefined behavior.  Other than that, 
I don't have any known regressions apart from the loop crossing 
restrictions which you and me haven't agreed upon yet.  (Well...there 
are some archs that need testsuite tweaking, but they're not bugs per se.)


Send anything my way.

Aldy



Re: [PATCH] c++: Fix up synthetization of defaulted comparison operators on classes with bitfields [PR102490]

2021-09-28 Thread Patrick Palka via Gcc-patches
On Tue, 28 Sep 2021, Patrick Palka wrote:

> On Tue, 28 Sep 2021, Jakub Jelinek via Gcc-patches wrote:
> 
> > Hi!
> > 
> > The testcases in the patch are either miscompiled or ICE with checking,
> > because the defaulted operator== is synthetized too early (but only if
> > constexpr), when the corresponding class type is still incomplete type.
> > The problem is that at that point the bitfield FIELD_DECLs still have as
> > TREE_TYPE their underlying type rather than integral type with their
> > precision and when layout_class_type is called for the class soon after
> > that, it changes those types but the COMPONENT_REFs type stay the way
> > that they were during the operator== synthetize_method type and the
> > middle-end is then upset by the mismatch of types.
> > As what exact type will be given isn't just a one liner but quite long code
> > especially for over-sized bitfields, I think it is best to just not
> > synthetize the comparison operators so early (the defaulted_late_check
> > change) and call defaulted_late_check for them once again as soon as the
> > class is complete.
> 
> Nice, this might also fix PR98712.
> 
> > 
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > 
> > 2021-09-28  Jakub Jelinek  
> > 
> > PR c++/102490
> > * method.c (defaulted_late_check): Don't synthetize constexpr
> > defaulted comparisons if context is still incomplete type.
> > (finish_struct_1): Call defaulted_late_check again for defaulted
> > comparisons.
> > 
> > * g++.dg/cpp2a/spaceship-eq11.C: New test.
> > * g++.dg/cpp2a/spaceship-eq12.C: New test.
> > 
> > --- gcc/cp/method.c.jj  2021-09-15 08:55:37.563497558 +0200
> > +++ gcc/cp/method.c 2021-09-27 13:48:12.139271830 +0200
> > @@ -3160,8 +3160,11 @@ defaulted_late_check (tree fn)
> >if (kind == sfk_comparison)
> >  {
> >/* If the function was declared constexpr, check that the definition
> > -qualifies.  Otherwise we can define the function lazily.  */
> > -  if (DECL_DECLARED_CONSTEXPR_P (fn) && !DECL_INITIAL (fn))
> > +qualifies.  Otherwise we can define the function lazily.
> > +Don't do this if the class type is still incomplete.  */
> > +  if (DECL_DECLARED_CONSTEXPR_P (fn)
> > + && !DECL_INITIAL (fn)
> > + && COMPLETE_TYPE_P (ctx))
> > {
> 
> According to the function comment for defaulted_late_check, won't
> COMPLETE_TYPE_P (ctx) always be false here?

If so, I wonder if we could get away with moving this entire fragment
from defaulted_late_check to finish_struct_1 instead of calling
defaulted_late_check from finish_struct_1.

> 
> >   /* Prevent GC.  */
> >   function_depth++;
> > --- gcc/cp/class.c.jj   2021-09-03 09:46:28.801428380 +0200
> > +++ gcc/cp/class.c  2021-09-27 14:07:03.465562255 +0200
> > @@ -7467,7 +7467,14 @@ finish_struct_1 (tree t)
> >   for any static member objects of the type we're working on.  */
> >for (x = TYPE_FIELDS (t); x; x = DECL_CHAIN (x))
> >  if (DECL_DECLARES_FUNCTION_P (x))
> > -  DECL_IN_AGGR_P (x) = false;
> > +  {
> > +   /* Synthetize constexpr defaulted comparisons.  */
> > +   if (!DECL_ARTIFICIAL (x)
> > +   && DECL_DEFAULTED_IN_CLASS_P (x)
> > +   && special_function_p (x) == sfk_comparison)
> > + defaulted_late_check (x);
> > +   DECL_IN_AGGR_P (x) = false;
> > +  }
> >  else if (VAR_P (x) && TREE_STATIC (x)
> >  && TREE_TYPE (x) != error_mark_node
> >  && same_type_p (TYPE_MAIN_VARIANT (TREE_TYPE (x)), t))
> > --- gcc/testsuite/g++.dg/cpp2a/spaceship-eq11.C.jj  2021-09-27 
> > 14:20:04.723713371 +0200
> > +++ gcc/testsuite/g++.dg/cpp2a/spaceship-eq11.C 2021-09-27 
> > 14:20:20.387495858 +0200
> > @@ -0,0 +1,43 @@
> > +// PR c++/102490
> > +// { dg-do run { target c++20 } }
> > +
> > +struct A
> > +{
> > +  unsigned char a : 1;
> > +  unsigned char b : 1;
> > +  constexpr bool operator== (const A &) const = default;
> > +};
> > +
> > +struct B
> > +{
> > +  unsigned char a : 8;
> > +  int : 0;
> > +  unsigned char b : 7;
> > +  constexpr bool operator== (const B &) const = default;
> > +};
> > +
> > +struct C
> > +{
> > +  unsigned char a : 3;
> > +  unsigned char b : 1;
> > +  constexpr bool operator== (const C &) const = default;
> > +};
> > +
> > +void
> > +foo (C &x, int y)
> > +{
> > +  x.b = y;
> > +}
> > +
> > +int
> > +main ()
> > +{
> > +  A a{}, b{};
> > +  B c{}, d{};
> > +  C e{}, f{};
> > +  a.b = 1;
> > +  d.b = 1;
> > +  foo (e, 0);
> > +  foo (f, 1);
> > +  return a == b || c == d || e == f;
> > +}
> > --- gcc/testsuite/g++.dg/cpp2a/spaceship-eq12.C.jj  2021-09-27 
> > 14:20:12.050611625 +0200
> > +++ gcc/testsuite/g++.dg/cpp2a/spaceship-eq12.C 2021-09-27 
> > 14:20:39.633228602 +0200
> > @@ -0,0 +1,5 @@
> > +// PR c++/102490
> > +// { dg-do run { target c++20 } }
> > +// { dg-options "-O2" }
> > +
> > +#include "spaceship-eq11.C"
> > 
> > Jakub
> > 
> > 
> 



Re: [PATCH] c++: Fix up synthetization of defaulted comparison operators on classes with bitfields [PR102490]

2021-09-28 Thread Patrick Palka via Gcc-patches
On Tue, 28 Sep 2021, Jakub Jelinek via Gcc-patches wrote:

> Hi!
> 
> The testcases in the patch are either miscompiled or ICE with checking,
> because the defaulted operator== is synthetized too early (but only if
> constexpr), when the corresponding class type is still incomplete type.
> The problem is that at that point the bitfield FIELD_DECLs still have as
> TREE_TYPE their underlying type rather than integral type with their
> precision and when layout_class_type is called for the class soon after
> that, it changes those types but the COMPONENT_REFs type stay the way
> that they were during the operator== synthetize_method type and the
> middle-end is then upset by the mismatch of types.
> As what exact type will be given isn't just a one liner but quite long code
> especially for over-sized bitfields, I think it is best to just not
> synthetize the comparison operators so early (the defaulted_late_check
> change) and call defaulted_late_check for them once again as soon as the
> class is complete.

Nice, this might also fix PR98712.
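
For readers without a C++20 compiler at hand, the comparison the patch is
about can be approximated by hand. The sketch below mirrors struct C from
the new test; same(), set_b() and check_same() are illustrative names, and
same() is only a hand-written stand-in for the defaulted operator==, not
what the front end actually synthesizes:

```cpp
#include <cassert>

// Mirrors struct C from spaceship-eq11.C in the patch.
struct C {
  unsigned char a : 3;
  unsigned char b : 1;
};

// Hand-written stand-in for
//   constexpr bool operator== (const C &) const = default;
// It compares the bit-fields member by member.
constexpr bool same(const C &x, const C &y) {
  return x.a == y.a && x.b == y.b;
}

// Plays the role of foo() in the test: store into the 1-bit field.
void set_b(C &x, int y) { x.b = y; }

// Small self-check, meant to be called from main().
void check_same() {
  C e{}, f{};
  set_b(e, 0);
  set_b(f, 1);
  assert(!same(e, f));  // differ in b only
  set_b(f, 0);
  assert(same(e, f));
}
```

The test in the patch expects the same outcome from the defaulted
operator: objects that differ in a single bit-field must compare unequal.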

> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2021-09-28  Jakub Jelinek  
> 
>   PR c++/102490
>   * method.c (defaulted_late_check): Don't synthetize constexpr
>   defaulted comparisons if context is still incomplete type.
>   (finish_struct_1): Call defaulted_late_check again for defaulted
>   comparisons.
> 
>   * g++.dg/cpp2a/spaceship-eq11.C: New test.
>   * g++.dg/cpp2a/spaceship-eq12.C: New test.
> 
> --- gcc/cp/method.c.jj2021-09-15 08:55:37.563497558 +0200
> +++ gcc/cp/method.c   2021-09-27 13:48:12.139271830 +0200
> @@ -3160,8 +3160,11 @@ defaulted_late_check (tree fn)
>if (kind == sfk_comparison)
>  {
>/* If the function was declared constexpr, check that the definition
> -  qualifies.  Otherwise we can define the function lazily.  */
> -  if (DECL_DECLARED_CONSTEXPR_P (fn) && !DECL_INITIAL (fn))
> +  qualifies.  Otherwise we can define the function lazily.
> +  Don't do this if the class type is still incomplete.  */
> +  if (DECL_DECLARED_CONSTEXPR_P (fn)
> +   && !DECL_INITIAL (fn)
> +   && COMPLETE_TYPE_P (ctx))
>   {

According to the function comment for defaulted_late_check, won't
COMPLETE_TYPE_P (ctx) always be false here?

> /* Prevent GC.  */
> function_depth++;
> --- gcc/cp/class.c.jj 2021-09-03 09:46:28.801428380 +0200
> +++ gcc/cp/class.c2021-09-27 14:07:03.465562255 +0200
> @@ -7467,7 +7467,14 @@ finish_struct_1 (tree t)
>   for any static member objects of the type we're working on.  */
>for (x = TYPE_FIELDS (t); x; x = DECL_CHAIN (x))
>  if (DECL_DECLARES_FUNCTION_P (x))
> -  DECL_IN_AGGR_P (x) = false;
> +  {
> + /* Synthetize constexpr defaulted comparisons.  */
> + if (!DECL_ARTIFICIAL (x)
> + && DECL_DEFAULTED_IN_CLASS_P (x)
> + && special_function_p (x) == sfk_comparison)
> +   defaulted_late_check (x);
> + DECL_IN_AGGR_P (x) = false;
> +  }
>  else if (VAR_P (x) && TREE_STATIC (x)
>&& TREE_TYPE (x) != error_mark_node
>&& same_type_p (TYPE_MAIN_VARIANT (TREE_TYPE (x)), t))
> --- gcc/testsuite/g++.dg/cpp2a/spaceship-eq11.C.jj2021-09-27 
> 14:20:04.723713371 +0200
> +++ gcc/testsuite/g++.dg/cpp2a/spaceship-eq11.C   2021-09-27 
> 14:20:20.387495858 +0200
> @@ -0,0 +1,43 @@
> +// PR c++/102490
> +// { dg-do run { target c++20 } }
> +
> +struct A
> +{
> +  unsigned char a : 1;
> +  unsigned char b : 1;
> +  constexpr bool operator== (const A &) const = default;
> +};
> +
> +struct B
> +{
> +  unsigned char a : 8;
> +  int : 0;
> +  unsigned char b : 7;
> +  constexpr bool operator== (const B &) const = default;
> +};
> +
> +struct C
> +{
> +  unsigned char a : 3;
> +  unsigned char b : 1;
> +  constexpr bool operator== (const C &) const = default;
> +};
> +
> +void
> +foo (C &x, int y)
> +{
> +  x.b = y;
> +}
> +
> +int
> +main ()
> +{
> +  A a{}, b{};
> +  B c{}, d{};
> +  C e{}, f{};
> +  a.b = 1;
> +  d.b = 1;
> +  foo (e, 0);
> +  foo (f, 1);
> +  return a == b || c == d || e == f;
> +}
> --- gcc/testsuite/g++.dg/cpp2a/spaceship-eq12.C.jj2021-09-27 
> 14:20:12.050611625 +0200
> +++ gcc/testsuite/g++.dg/cpp2a/spaceship-eq12.C   2021-09-27 
> 14:20:39.633228602 +0200
> @@ -0,0 +1,5 @@
> +// PR c++/102490
> +// { dg-do run { target c++20 } }
> +// { dg-options "-O2" }
> +
> +#include "spaceship-eq11.C"
> 
>   Jakub
> 
> 



Re: [PATCH] Improve jump threading dump output.

2021-09-28 Thread Jeff Law via Gcc-patches




On 9/28/2021 3:45 AM, Aldy Hernandez wrote:

In analyzing PR102511, it has become abundantly clear that we need
better debugging aids for the jump threader solver.  Currently
debugging these issues is a nightmare if you're not intimately
familiar with the code.  This patch attempts to improve this.

First, I'm enabling path solver dumps with TDF_THREADING.  None of the
available TDF_* flags are a good match, and using TDF_DETAILS would blow
up the dump file, since both threaders continually call the solver to
try out candidates.  This will allow dumping path solver details without
having to resort to hacking the source.

I am also dumping the current registered_jump_thread dbg counter used
by the registry, in the solver.  That way narrowing down a problematic
thread can then be examined by -fdump-*-threading and looking at the
solver details surrounding the appropriate counter (which the dbgcnt
also dumps to the dump file).

You still need knowledge of the solver to debug these issues, but at
least now it's not entirely opaque.
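
The idea of gating the solver output behind its own flag can be sketched
as follows. This is an illustration only: FLAG_DETAILS, FLAG_THREADING,
Dump and solver_note are hypothetical names, and the bit values are not
those of GCC's dump_flag enum in dumpfile.h.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical flag bits for illustration; not GCC's real values.
enum : std::uint32_t {
  FLAG_DETAILS   = 1u << 0,
  FLAG_THREADING = 1u << 1
};

// Hypothetical stand-in for a dump file plus its enabled flags.
struct Dump {
  explicit Dump(std::uint32_t f) : flags(f) {}
  std::uint32_t flags;
  std::ostringstream out;
};

// Emit solver chatter only when the dedicated threading flag is set, so
// asking for "details" alone does not flood the dump.
void solver_note(Dump &d, const std::string &msg) {
  if (d.flags & FLAG_THREADING)
    d.out << msg << '\n';
}

// Small self-check, meant to be called from main().
void check_solver_note() {
  Dump details_only(FLAG_DETAILS);
  solver_note(details_only, "candidate path rejected");
  assert(details_only.out.str().empty());

  Dump threading(FLAG_DETAILS | FLAG_THREADING);
  solver_note(threading, "candidate path rejected");
  assert(threading.out.str() == "candidate path rejected\n");
}
```

Keeping the solver output on a separate bit is what lets a detailed dump
stay readable while the threading dump carries the per-candidate noise.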

OK?

gcc/ChangeLog:

* dbgcnt.c (dbg_cnt_counter): New.
* dbgcnt.h (dbg_cnt_counter): New.
* dumpfile.c (dump_options): Add entry for TDF_THREADING.
* dumpfile.h (enum dump_flag): Add TDF_THREADING.
* gimple-range-path.cc (DEBUG_SOLVER): Use TDF_THREADING.
* tree-ssa-threadupdate.c (dump_jump_thread_path): Dump out
debug counter.

OK.

Note we've got massive failures in the tester starting sometime 
yesterday and I suspect all the threader work.    So I'm going to slow 
down on reviews of that code as we stabilize stuff.


jeff



Re: [PATCH 04/13] arm: Add GENERAL_AND_VPR_REGS regclass

2021-09-28 Thread Christophe LYON via Gcc-patches



On 28/09/2021 13:18, Kyrylo Tkachov wrote:

Hi Christophe,


-Original Message-
From: Gcc-patches  On Behalf Of Christophe
LYON via Gcc-patches
Sent: 08 September 2021 08:49
To: Richard Earnshaw ; gcc-
patc...@gcc.gnu.org
Subject: Re: [PATCH 04/13] arm: Add GENERAL_AND_VPR_REGS regclass


On 07/09/2021 15:35, Richard Earnshaw wrote:


On 07/09/2021 13:05, Christophe LYON wrote:

On 07/09/2021 11:42, Richard Earnshaw wrote:


On 07/09/2021 10:15, Christophe Lyon via Gcc-patches wrote:

At some point during the development of this patch series, it appeared
that in some cases the register allocator wants “VPR or general”
rather than “VPR or general or FP” (which is the same thing as
ALL_REGS).  The series does not seem to require this anymore, but it
seems to be a good thing to do anyway, to give the register allocator
more freedom.

2021-09-01  Christophe Lyon 

 gcc/
 * config/arm/arm.h (reg_class): Add GENERAL_AND_VPR_REGS.
 (REG_CLASS_NAMES): Likewise.
 (REG_CLASS_CONTENTS): Likewise. Add VPR_REG to ALL_REGS.

diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 015299c1534..fab39d05916 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -1286,6 +1286,7 @@ enum reg_class
     SFP_REG,
     AFP_REG,
     VPR_REG,
+  GENERAL_AND_VPR_REGS,
     ALL_REGS,
     LIM_REG_CLASSES
   };
@@ -1315,6 +1316,7 @@ enum reg_class
     "SFP_REG",    \
     "AFP_REG",    \
     "VPR_REG",    \
+  "GENERAL_AND_VPR_REGS", \
     "ALL_REGS"    \
   }
   @@ -1343,7 +1345,8 @@ enum reg_class
     { 0x, 0x, 0x, 0x0040 }, /* SFP_REG
*/    \
     { 0x, 0x, 0x, 0x0080 }, /* AFP_REG
*/    \
     { 0x, 0x, 0x, 0x0400 }, /* VPR_REG.
*/    \
-  { 0x7FFF, 0x, 0x, 0x000F }  /* ALL_REGS.
*/    \
+  { 0x5FFF, 0x, 0x, 0x0400 }, /*
GENERAL_AND_VPR_REGS.  */ \
+  { 0x7FFF, 0x, 0x, 0x040F }  /* ALL_REGS.
*/    \
   }

You've changed the definition of ALL_REGS here (to include VPR_REG),
but not really explained why.  Is that the source of the underlying
issue with the 'appeared' you mention?


I first added VPR_REG to ALL_REGS, but Richard Sandiford suggested I
create a new GENERAL_AND_VPR_REGS that would be more restrictive. I
did not remove VPR_REG from ALL_REGS because I thought it was an
omission: shouldn't ALL_REGS contain all registers?

Surely that should be a separate patch then.

OK, I can remove that line from this patch and make a separate one-liner
for ALL_REGS.

Did you end up sending that patch out? (Sorry, I may have missed it in my 
archive).
This patch to add GENERAL_AND_VPR_REGS is okay with the ALL_REGS change 
separated out.


No I didn't send it yet: I suspect there will be iterations on the next 
patches in the series, this small change alone wasn't worth sending a v2 :-)


Thanks,

Christophe




Thanks,
Kyrill


Thanks,

Christophe



R.




R.



     #define FP_SYSREGS \



Re: [PATCH 03/13] arm: Add test for PR target/101325

2021-09-28 Thread Christophe LYON via Gcc-patches



On 28/09/2021 13:14, Kyrylo Tkachov wrote:



-Original Message-
From: Gcc-patches  On Behalf Of Christophe
Lyon via Gcc-patches
Sent: 07 September 2021 10:15
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 03/13] arm: Add test for PR target/101325

This test is derived from the one provided in the PR: it is a
compile-only test because I do not have access to anything that could
execute it.  We can switch it to 'dg-do run' later, however it would
be better to write a new executable test to ensure coverage in case
the tester cannot execute such code (and it will need a new
arm_v8_1m_mve_hw or similar effective-target).

The test is okay for now.
I think we'll want to have an arm_v8_1m_mve_hw target sooner or later.
Maybe Alex or Andrea can help to write one we can use?



Since I posted the patch series, QEMU has gained support for MVE, I plan 
to write a similar testcase which is executable.


There's already an executable testcase in the PR.

Thanks

Christophe




Thanks,
Kyrill


2021-09-01  Christophe Lyon  

gcc/testsuite/
PR target/101325
* gcc.target/arm/simd/pr101325.c: New.

diff --git a/gcc/testsuite/gcc.target/arm/simd/pr101325.c
b/gcc/testsuite/gcc.target/arm/simd/pr101325.c
new file mode 100644
index 000..a466683a0b1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/pr101325.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O3" } */
+
+#include <arm_mve.h>
+
+unsigned foo(int8x16_t v, int8x16_t w)
+{
+  return vcmpeqq (v, w);
+}
+/* { dg-final { scan-assembler {\tvcmp.i8  eq} } } */
+/* { dg-final { scan-assembler {\tvmrs\t r[0-9]+, P0} } } */
+/* { dg-final { scan-assembler {\tuxth} } } */
--
2.25.1


Re: [PATCH 02/13] arm: Add tests for PR target/100757

2021-09-28 Thread Christophe LYON via Gcc-patches



On 28/09/2021 13:12, Kyrylo Tkachov wrote:



-Original Message-
From: Gcc-patches  On Behalf Of Christophe
Lyon via Gcc-patches
Sent: 07 September 2021 10:15
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 02/13] arm: Add tests for PR target/100757

These tests currently trigger an ICE which is fixed later in the patch
series.

The pr100757*.c testcases are derived from
gcc.c-torture/compile/20160205-1.c, forcing the use of MVE, and using
various types and return values different from 0 and 1 to avoid
commonalization with boolean masks.  In addition, since we should not
need these masks, the tests make sure they are not present.
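
The .word values that the floating-point variants of these tests scan for
are the raw bit patterns of the float constants. A small sketch, assuming
a 32-bit IEEE-754 float (float_bits and check_float_bits are illustrative
names, not part of the patch), shows how to derive them:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Return the bit pattern of a 32-bit float as an unsigned integer.
std::uint32_t float_bits(float f) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "sketch assumes a 32-bit float");
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}

// Small self-check, meant to be called from main().
void check_float_bits() {
  assert(float_bits(2.0f) == 1073741824u);  // 0x40000000
  assert(float_bits(4.0f) == 1082130432u);  // 0x40800000
  assert(float_bits(5.0f) == 1084227584u);  // 0x40A00000
}
```

The integer variants scan for the small literals directly (for example 2
and 3), which is why values other than 0 and 1 were chosen: they cannot be
confused with boolean masks in the assembly output.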

Ok, but I'd rather it was committed together with the patch that fixes the ICE.
I don't mind if it's a separate commit or rolled into that patch.



Sure, I'll wait for the main patch approval. I split it this way to 
hopefully make the reviews easier, to exercise the testcase without the 
fix proposal.


Thanks,

Christophe




Thanks,
Kyrill


2021-09-01  Christophe Lyon  

gcc/testsuite/
PR target/100757
* gcc.target/arm/simd/pr100757-2.c: New.
* gcc.target/arm/simd/pr100757-3.c: New.
* gcc.target/arm/simd/pr100757-4.c: New.
* gcc.target/arm/simd/pr100757.c: New.

diff --git a/gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
b/gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
new file mode 100644
index 000..c2262b4d81e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
+/* Derived from gcc.c-torture/compile/20160205-1.c.  */
+
+float a[32];
+int fn1(int d) {
+  int c = 4;
+  for (int b = 0; b < 32; b++)
+if (a[b] != 2.0f)
+  c = 5;
+  return c;
+}
+
+/* { dg-final { scan-assembler-times {\t.word\t1073741824\n} 4 } } */ /*
Constant 2.0f.  */
+/* { dg-final { scan-assembler-times {\t.word\t4\n} 4 } } */ /* Initial value
for c.  */
+/* { dg-final { scan-assembler-times {\t.word\t5\n} 4 } } */ /* Possible
value for c.  */
+/* { dg-final { scan-assembler-not {\t.word\t1\n} } } */ /* 'true' mask.  */
+/* { dg-final { scan-assembler-not {\t.word\t0\n} } } */ /* 'false' mask.  */
diff --git a/gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
b/gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
new file mode 100644
index 000..e604555c04c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
+/* Copied from gcc.c-torture/compile/20160205-1.c.  */
+
+float a[32];
+float fn1(int d) {
+  float c = 4.0f;
+  for (int b = 0; b < 32; b++)
+if (a[b] != 2.0f)
+  c = 5.0f;
+  return c;
+}
+
+/* { dg-final { scan-assembler-times {\t.word\t1073741824\n} 4 } } */ /*
Constant 2.0f.  */
+/* { dg-final { scan-assembler-times {\t.word\t1084227584\n} 4 } } */ /*
Possible value for c (5.0).  */
+/* { dg-final { scan-assembler-times {\t.word\t1082130432\n} 4 } } */ /*
Initial value for c (4.0).  */
+/* { dg-final { scan-assembler-not {\t.word\t1\n} } } */ /* 'true' mask.  */
+/* { dg-final { scan-assembler-not {\t.word\t0\n} } } */ /* 'false' mask.  */
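The `.word` operands matched by these scan-assembler directives are the IEEE 754 single-precision bit patterns of the constants, printed as decimal integers.  A minimal sketch for checking such values by hand, assuming `float` is IEEE 754 binary32; the helper name `float_bits` is made up for illustration and is not part of the testsuite:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of a 32-bit float as an unsigned integer.
   Hypothetical helper, not part of the patch.  */
static uint32_t
float_bits (float f)
{
  uint32_t u;
  /* This sketch assumes float is IEEE 754 binary32.  */
  assert (sizeof u == sizeof f);
  memcpy (&u, &f, sizeof u);
  return u;
}
```

Under that assumption, `float_bits (2.0f)` is 1073741824 (0x40000000), `float_bits (4.0f)` is 1082130432 (0x40800000) and `float_bits (5.0f)` is 1084227584 (0x40a00000).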
diff --git a/gcc/testsuite/gcc.target/arm/simd/pr100757-4.c b/gcc/testsuite/gcc.target/arm/simd/pr100757-4.c
new file mode 100644
index 000..c12040c517f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/pr100757-4.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O3" } */
+/* Derived from gcc.c-torture/compile/20160205-1.c.  */
+
+unsigned int a[32];
+int fn1(int d) {
+  int c = 2;
+  for (int b = 0; b < 32; b++)
+    if (a[b])
+      c = 3;
+  return c;
+}
+
+/* { dg-final { scan-assembler-times {\t.word\t0\n} 4 } } */ /* 'false' mask.  */
+/* { dg-final { scan-assembler-not {\t.word\t1\n} } } */ /* 'true' mask.  */
+/* { dg-final { scan-assembler-times {\t.word\t2\n} 4 } } */ /* Initial value for c.  */
+/* { dg-final { scan-assembler-times {\t.word\t3\n} 4 } } */ /* Possible value for c.  */
diff --git a/gcc/testsuite/gcc.target/arm/simd/pr100757.c b/gcc/testsuite/gcc.target/arm/simd/pr100757.c
new file mode 100644
index 000..41d6e4e2d7a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/pr100757.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O3" } */
+/* Derived from gcc.c-torture/compile/20160205-1.c.  */
+
+int a[32];
+int fn1(int d) {
+  int c = 2;
+  for (int b = 0; b < 32; b++)
+    if (a[b])
+      c = 3;
+  return c;
+}
+
+/* { dg-final { scan-assembler-times 

Re: [PATCH] Always default to DWARF2_DEBUG if not specified, warn about deprecated STABS

2021-09-28 Thread Koning, Paul via Gcc-patches



> On Sep 28, 2021, at 2:14 AM, Richard Biener via Gcc-patches 
>  wrote:
> 
> On Tue, Sep 21, 2021 at 4:26 PM Richard Biener via Gcc-patches
>  wrote:
>> 
>> This makes defaults.h choose DWARF2_DEBUG if PREFERRED_DEBUGGING_TYPE
>> is not specified by the target and errors out if DWARF is not supported.
>> 
>> ...
>> 
>> This completes the series of deprecating STABS for GCC 12.
>> 
>> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>> 
>> OK for trunk?
> 
> Ping.

pdp11 is fine.

paul



[PATCH 8/7] ifcvt: Second try in order to avoid unnecessary temporaries

2021-09-28 Thread Robin Dapp via Gcc-patches

Hi,

this patch implements the latest of my attempts to avoid some of the 
unnecessary temporaries noce_convert_multiple currently emits.  I named 
it 8/7 because it actually applies on top of the last series that is not 
yet approved while being a rather minor change.


The idea is to go over the list of convertible sets a second time if, 
during the first try, we noticed that we potentially overwrite the 
condition but no later set makes use of it anymore (because it can rely 
on the CC directly instead).  In that case we omit creating a temporary.


The whole series was bootstrapped and regtested on s390, x86 and ppc64.

Regards
 Robin

--

gcc/ChangeLog:

* ifcvt.c (noce_convert_multiple_sets): Perform a second try 
with fewer temporaries.
commit dd5a0f8d7d39447025d36ed5305709d38fe3f16b
Author: Robin Dapp 
Date:   Fri Sep 17 20:22:10 2021 +0200

ifcvt: Run second pass if it is possible to omit a temporary.

If one of the to-be-converted SETs requires the original comparison
(i.e. in order to generate a min/max insn) but no other insn after it
does, we can omit creating temporaries, thus facilitating costing.

diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c
index 4f3af5e1b00..2243157e32c 100644
--- a/gcc/ifcvt.c
+++ b/gcc/ifcvt.c
@@ -3261,6 +3261,11 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
 
   need_cmov_or_rewire (then_bb, &need_no_cmov, &rewired_src);
 
+  int last_needs_comparison = -1;
+  bool second_try = false;
+
+restart:
+
   FOR_BB_INSNS (then_bb, insn)
 {
   /* Skip over non-insns.  */
@@ -3302,8 +3307,12 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
 	 Therefore we introduce a temporary every time we are about to
 	 overwrite a variable used in the check.  Costing of a sequence with
 	 these is going to be inaccurate so only use temporaries when
-	 needed.  */
-  if (reg_overlap_mentioned_p (target, cond))
+	 needed.
+
+	 If performing a second try, we know how many insns require a
+	 temporary.  For the last of these, we can omit creating one.  */
+  if (reg_overlap_mentioned_p (target, cond)
+	  && (!second_try || count < last_needs_comparison))
 	temp = gen_reg_rtx (GET_MODE (target));
   else
 	temp = target;
@@ -3386,6 +3395,8 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
 	{
 	  seq = seq1;
 	  temp_dest = temp_dest1;
+	  if (!second_try)
+	last_needs_comparison = count;
 	}
   else if (seq2 != NULL_RTX)
 	{
@@ -3409,6 +3420,24 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
   unmodified_insns.safe_push (insn);
 }
 
+/* If there are insns that overwrite part of the initial
+   comparison, we can still omit creating temporaries for
+   the last of them.
+   As the second try will always create a less expensive,
+   valid sequence, we do not need to compare and can discard
+   the first one.  */
+if (!second_try && last_needs_comparison >= 0)
+  {
+	end_sequence ();
+	start_sequence ();
+	count = 0;
+	targets.truncate (0);
+	temporaries.truncate (0);
+	unmodified_insns.truncate (0);
+	second_try = true;
+	goto restart;
+  }
+
   /* We must have seen some sort of insn to insert, otherwise we were
  given an empty BB to convert, and we can't handle that.  */
   gcc_assert (!unmodified_insns.is_empty ());
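
The two-pass idea in this patch can be modelled outside of GCC.  What follows is only a simplified sketch under stated assumptions: `struct set_model` and both helpers are hypothetical, and the real code works on RTL insns and sequences rather than on an array of flags.  It shows why the second try never needs more temporaries than the first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for one convertible SET; not a GCC data structure.  */
struct set_model
{
  bool overwrites_cond;   /* Target overlaps a register used in the condition.  */
  bool needs_comparison;  /* Conversion re-uses the original comparison
			     (the seq1 case in the patch).  */
};

/* Index of the last set that still needs the original comparison, or -1.  */
static int
last_needs_comparison (const struct set_model *sets, size_t n)
{
  int last = -1;
  for (size_t i = 0; i < n; i++)
    if (sets[i].needs_comparison)
      last = (int) i;
  return last;
}

/* Number of temporaries one pass would create.  On the first try every
   overwriting set gets one; on the second try only those before LAST.  */
static size_t
count_temporaries (const struct set_model *sets, size_t n,
		   bool second_try, int last)
{
  size_t temps = 0;
  assert (sets != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (sets[i].overwrites_cond && (!second_try || (int) i < last))
      temps++;
  return temps;
}
```

For three sets where the first and third overwrite the condition and only the second needs the comparison, the first try counts two temporaries and the second try one.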


Re: [Patch] libgomp: Only check for 2*sizeof(void*) int type with Fortran [PR96661]

2021-09-28 Thread Jakub Jelinek via Gcc-patches
On Tue, Sep 28, 2021 at 03:00:56PM +0200, Tobias Burnus wrote:
> The depend type is a struct with two pointer members for C/C++ - but for
> Fortran OpenMP requires an integer type with kind = omp_depend_kind. Thus,
> libgomp's configure checks that an integer type/kind with size 2*sizeof(void*)
> is available. However, this integer type/kind is not needed when building 
> without
> Fortran support. Thus, only check this when Fortran is enabled.
> 
> libgomp/
>   PR libgomp/96661
>   * configure.ac: Only check for int-type = 2*size_t support when
>   building with Fortran support.
>   * configure: Regenerate.

Ok, thanks.

Jakub



[Patch] libgomp: Only check for 2*sizeof(void*) int type with Fortran [PR96661]

2021-09-28 Thread Tobias Burnus

Found this one lurking around in one of my trees.

It does not solve the actual issue of John that hppa64-hp-hpux11.11 does
not have an __int128 alias integer(kind=16) type. The latter is required for
OpenMP's omp_depend_kind, as by implementation choice it has to be large
enough to store two pointers (2*sizeof(void*)).

However, when thinking about the check again: It does not make sense to
break the build if only C/C++ is enabled and Fortran disabled.

While that probably has no real-world impact, I still think it makes
sense to fix it.

OK for mainline and GCC 11?

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
libgomp: Only check for 2*sizeof(void*) int type with Fortran [PR96661]

The depend type is a struct with two pointer members for C/C++ - but for
Fortran OpenMP requires an integer type with kind = omp_depend_kind. Thus,
libgomp's configure checks that an integer type/kind with size 2*sizeof(void*)
is available. However, this integer type/kind is not needed when building without
Fortran support. Thus, only check this when Fortran is enabled.

libgomp/
	PR libgomp/96661
	* configure.ac: Only check for int-type = 2*size_t support when
	building with Fortran support.
	* configure: Regenerate.

diff --git a/libgomp/configure b/libgomp/configure
index 6161da579c0..4bc9b381c5c 100755
--- a/libgomp/configure
+++ b/libgomp/configure
@@ -17007,13 +17007,15 @@ fi
 if test $OMP_NEST_LOCK_25_SIZE -gt 8 || test $OMP_NEST_LOCK_25_ALIGN -gt $OMP_NEST_LOCK_25_SIZE; then
   OMP_NEST_LOCK_25_KIND=8
 fi
-if test $OMP_DEPEND_KIND -eq 16; then
-  if test $OMP_INT128_SIZE -ne 16; then
-as_fn_error $? "unsupported system, cannot find Fortran int kind=16, needed for omp_depend_kind" "$LINENO" 5
-  fi
-else
-  if test $OMP_DEPEND_KIND -ne 8; then
-as_fn_error $? "unsupported system, cannot find Fortran integer kind for omp_depend_kind" "$LINENO" 5
+if test "$ac_cv_fc_compiler_gnu" = yes; then
+  if test $OMP_DEPEND_KIND -eq 16; then
+if test $OMP_INT128_SIZE -ne 16; then
+  as_fn_error $? "unsupported system, cannot find Fortran int kind=16, needed for omp_depend_kind" "$LINENO" 5
+fi
+  else
+if test $OMP_DEPEND_KIND -ne 8; then
+  as_fn_error $? "unsupported system, cannot find Fortran integer kind for omp_depend_kind" "$LINENO" 5
+fi
   fi
 fi
 
diff --git a/libgomp/configure.ac b/libgomp/configure.ac
index 7df80a32765..bfb613b91f0 100644
--- a/libgomp/configure.ac
+++ b/libgomp/configure.ac
@@ -438,13 +438,15 @@ fi
 if test $OMP_NEST_LOCK_25_SIZE -gt 8 || test $OMP_NEST_LOCK_25_ALIGN -gt $OMP_NEST_LOCK_25_SIZE; then
   OMP_NEST_LOCK_25_KIND=8
 fi
-if test $OMP_DEPEND_KIND -eq 16; then
-  if test $OMP_INT128_SIZE -ne 16; then
-AC_MSG_ERROR([unsupported system, cannot find Fortran int kind=16, needed for omp_depend_kind])
-  fi
-else
-  if test $OMP_DEPEND_KIND -ne 8; then
-AC_MSG_ERROR([unsupported system, cannot find Fortran integer kind for omp_depend_kind])
+if test "$ac_cv_fc_compiler_gnu" = yes; then
+  if test $OMP_DEPEND_KIND -eq 16; then
+if test $OMP_INT128_SIZE -ne 16; then
+  AC_MSG_ERROR([unsupported system, cannot find Fortran int kind=16, needed for omp_depend_kind])
+fi
+  else
+if test $OMP_DEPEND_KIND -ne 8; then
+  AC_MSG_ERROR([unsupported system, cannot find Fortran integer kind for omp_depend_kind])
+fi
   fi
 fi
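
The size requirement behind this check can be illustrated in C.  This is a sketch only: `struct depend_model` is a hypothetical stand-in following the "two pointer members" description in the commit message, not libgomp's actual type definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the C/C++ depend type described above:
   two pointer members.  Not libgomp's actual definition.  */
struct depend_model
{
  void *first;
  void *second;
};

/* Byte size a Fortran integer kind would need to hold the same data.  */
static size_t
depend_kind_bytes (void)
{
  /* Two pointers are assumed to pack without padding.  */
  assert (sizeof (struct depend_model) == 2 * sizeof (void *));
  return sizeof (struct depend_model);
}
```

On a target with 8-byte pointers this yields 16, which corresponds to the `OMP_DEPEND_KIND -eq 16` branch that needs a 16-byte integer kind; with 4-byte pointers it yields 8.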
 


Re: [PATCH v3 3/3] reassoc: Test rank biasing

2021-09-28 Thread Richard Biener via Gcc-patches
On Tue, 28 Sep 2021, Ilya Leoshkevich wrote:

> On Tue, 2021-09-28 at 13:28 +0200, Richard Biener wrote:
> > On Sun, 26 Sep 2021, Ilya Leoshkevich wrote:
> > 
> > > Add both positive and negative tests.
> > 
> > The tests will likely be quite fragile with respect to what is
> > actually vectorized on which target.  If you move the tests
> > to gcc.dg/vect/ you could at least do
> > 
> > /* { dg-require-effective-target vect_int } */
> > 
> > do you need to look for the exact GIMPLE IL or is it enough to
> > verify we are vectorizing the reduction?
> 
> Actually I don't think vectorization is that important here, and I
> only check how many times sum_x = sum_y + _z appears.  So I use
> (?:vect_)?, which may or may not be there.
> 
> An alternative I considered was to use -fno-tree-vectorize to get
> smaller regexes, but I thought it would be nice to know that
> vectorization does not mess up reassociation results.

Ah, OK.  So let's go ahead with the patch unchanged, but be prepared
to deal with eventual fallout here on weird targets.

Thanks,
Richard.


Re: [Patch] Fortran: Fix assumed-size to assumed-rank passing [PR94070]

2021-09-28 Thread Thomas Schwinge
Hi!

On 2021-09-27T14:07:53+0200, Tobias Burnus  wrote:
> now committed r12-3897-g00f6de9c69119594f7dad3bd525937c94c8200d0


> Conclusion: Reviews are very helpful :-)

Ha!  :-) (... and I wasn't even involved here!)  ;-P


As testing showed here:

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/assumed_rank_22_aux.c
> @@ -0,0 +1,68 @@
> +/* Called by assumed_rank_22.f90.  */

> +  if (num == 0)
> +assert (x->dim[2].extent == -1);
> +  else if (num == 20)
> +assert (x->dim[2].extent == 1);
> +  else if (num == 40)
> +{
> +  /* FIXME: - dg-output = 'c_assumed ... OK' checked in .f90 file. */
> +  /* assert (x->dim[2].extent == 0); */
> +  if (x->dim[2].extent == 0)
> + __builtin_printf ("c_assumed - 40 - OK\n");
> +  else
> + __builtin_printf ("ERROR: c_assumed num=%d: "
> +   "x->dim[2].extent = %d != 0\n",
> +   num, x->dim[2].extent);
> +}
> +  else if (num == 60)
> +assert (x->dim[2].extent == 2);
> +  else if (num == 80)
> +assert (x->dim[2].extent == 2);
> +  else if (num == 100)
> +{
> +  /* FIXME: - dg-output = 'c_assumed ... OK' checked in .f90 file. */
> +  /* assert (x->dim[2].extent == 0); */
> +  if (x->dim[2].extent == 0)
> + __builtin_printf ("c_assumed - 100 - OK\n");
> +  else
> + __builtin_printf ("ERROR: c_assumed num=%d: "
> +   "x->dim[2].extent = %d != 0\n",
> +   num, x->dim[2].extent);
> +}
> +  else
> +assert (0);

... the 'ERROR:' prefixes printed do confuse DejaGnu...  As obvious,
pushed to master branch commit 95540a6d1d7b29cdd3ed06fbcb07465804504cfd
"'gfortran.dg/assumed_rank_22_aux.c' messages printed vs. DejaGnu", see
attached.


Regards
 Thomas


From 95540a6d1d7b29cdd3ed06fbcb07465804504cfd Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 28 Sep 2021 09:02:56 +0200
Subject: [PATCH] 'gfortran.dg/assumed_rank_22_aux.c' messages printed vs.
 DejaGnu

Print lower-case 'error: [...]' instead of upper-case 'ERROR: [...]', to not
confuse the DejaGnu log processing harness into thinking these are DejaGnu
harness ERRORs:

Running /scratch/tschwing/build2-trusty-cs/gcc/build/submit-big/source-gcc/gcc/testsuite/gfortran.dg/dg.exp ...
+ERROR: c_assumed num=100: x->dim[2].extent = -1 != 0
+ERROR: c_assumed num=100: x->dim[2].extent = -1 != 0
+ERROR: c_assumed num=100: x->dim[2].extent = -1 != 0
+ERROR: c_assumed num=100: x->dim[2].extent = -1 != 0
+ERROR: c_assumed num=100: x->dim[2].extent = -1 != 0
+ERROR: c_assumed num=100: x->dim[2].extent = -1 != 0
[...]

Fix-up for recent commit 00f6de9c69119594f7dad3bd525937c94c8200d0
"Fortran: Fix assumed-size to assumed-rank passing [PR94070]".

	gcc/testsuite/
	* gfortran.dg/assumed_rank_22_aux.c: Adjust messages printed.
---
 gcc/testsuite/gfortran.dg/assumed_rank_22_aux.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gfortran.dg/assumed_rank_22_aux.c b/gcc/testsuite/gfortran.dg/assumed_rank_22_aux.c
index 2fbf83d649a..e5fe02135e9 100644
--- a/gcc/testsuite/gfortran.dg/assumed_rank_22_aux.c
+++ b/gcc/testsuite/gfortran.dg/assumed_rank_22_aux.c
@@ -29,7 +29,7 @@ c_assumed (CFI_cdesc_t *x, int num)
   if (x->dim[2].extent == 0)
 	__builtin_printf ("c_assumed - 40 - OK\n");
   else
-	__builtin_printf ("ERROR: c_assumed num=%d: "
+	__builtin_printf ("error: c_assumed num=%d: "
 		  "x->dim[2].extent = %d != 0\n",
 		  num, x->dim[2].extent);
 }
@@ -44,7 +44,7 @@ c_assumed (CFI_cdesc_t *x, int num)
   if (x->dim[2].extent == 0)
 	__builtin_printf ("c_assumed - 100 - OK\n");
   else
-	__builtin_printf ("ERROR: c_assumed num=%d: "
+	__builtin_printf ("error: c_assumed num=%d: "
 		  "x->dim[2].extent = %d != 0\n",
 		  num, x->dim[2].extent);
 }
-- 
2.33.0



Re: [committed] libgomp.oacc-fortran/privatized-ref-2.f90: Fix dg-note (was: [Patch] Fortran: Fix assumed-size to assumed-rank passing [PR94070])

2021-09-28 Thread Thomas Schwinge
Hi!

On 2021-09-27T14:38:56+0200, Tobias Burnus  wrote:
> On 27.09.21 14:07, Tobias Burnus wrote:
>> now committed r12-3897-g00f6de9c69119594f7dad3bd525937c94c8200d0
>
> I accidentally changed dg-note to dg-message when updating the expected
> output, as the dump has changed. (Copying seemingly the sorry line
> instead of the dg-note lines as template.)

Strange.  ;-P

> Changed back to dg-note & committed as
> r12-3898-gda1f6391b7c255e4e2eea983832120eff4f7d3df.

As shown by offloading testing, a bit more is necessary here; I've
pushed to master branch commit a43ae03a053faad871e6f48099d21e64b8e316cf
'Further test case adjustment re "Fortran: Fix assumed-size to
assumed-rank passing"', see attached.


Regards
 Thomas


From a43ae03a053faad871e6f48099d21e64b8e316cf Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 28 Sep 2021 08:05:28 +0200
Subject: [PATCH] Further test case adjustment re "Fortran: Fix assumed-size to
 assumed-rank passing"

Fix-up for recent commit 00f6de9c69119594f7dad3bd525937c94c8200d0
"Fortran: Fix assumed-size to assumed-rank passing [PR94070]",
and commit da1f6391b7c255e4e2eea983832120eff4f7d3df
"libgomp.oacc-fortran/privatized-ref-2.f90: Fix dg-note".

Due to use of '#if !ACC_MEM_SHARED' conditionals in
'libgomp.oacc-fortran/if-1.f90', 'target { !  openacc_host_selected }'
needs some special care (ignoring the pre-existing mismatch of
'ACC_MEM_SHARED' vs. 'openacc_host_selected').

As seen with GCN offloading, we need to revert to another bit of the
original code in 'libgomp.oacc-fortran/privatized-ref-2.f90'.

	libgomp/
	* testsuite/libgomp.oacc-fortran/if-1.f90: Adjust.
	* testsuite/libgomp.oacc-fortran/privatized-ref-2.f90: Likewise.
---
 libgomp/testsuite/libgomp.oacc-fortran/if-1.f90 | 6 ++
 libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-2.f90 | 3 +--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/libgomp/testsuite/libgomp.oacc-fortran/if-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/if-1.f90
index 3089d6a0c43..9eadfcf9738 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/if-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/if-1.f90
@@ -394,6 +394,7 @@ program main
 
   !$acc data copyin (a(1:N)) copyout (b(1:N)) if (0 == 1)
   ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target { ! openacc_host_selected } } .-1 }
+  ! { dg-note {variable 'parm\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "" { target { ! openacc_host_selected } } .-2 }
 
 #if !ACC_MEM_SHARED
   if (acc_is_present (a) .eqv. .TRUE.) STOP 21
@@ -408,6 +409,7 @@ program main
   !$acc data copyin (a(1:N)) if (1 == 1)
   ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
   ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+  ! { dg-note {variable 'parm\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "" { target { ! openacc_host_selected } } .-3 }
 
 #if !ACC_MEM_SHARED
 if (acc_is_present (a) .eqv. .FALSE.) STOP 23
@@ -416,6 +418,7 @@ program main
 !$acc data copyout (b(1:N)) if (0 == 1)
 ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
 ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-2 }
+! { dg-note {variable 'parm\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "" { target { ! openacc_host_selected } } .-3 }
 #if !ACC_MEM_SHARED
   if (acc_is_present (b) .eqv. .TRUE.) STOP 24
 #endif
@@ -864,6 +867,7 @@ program main
 
   !$acc data copyin (a(1:N)) copyout (b(1:N)) if (0 == 1)
   ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target { ! openacc_host_selected } } .-1 }
+  ! { dg-note {variable 'parm\.[0-9]+' declared in block is candidate for adjusting OpenACC privatization level} "" { target { ! openacc_host_selected } } .-2 }
 
 #if !ACC_MEM_SHARED
   if (acc_is_present (a) .eqv. .TRUE.) STOP 56
@@ -878,6 +882,7 @@ program main
   !$acc data copyin (a(1:N)) if (1 == 1)
   ! { dg-note {variable 'D\.[0-9]+' declared in block isn't candidate for adjusting OpenACC privatization level: not addressable} "" { target *-*-* } .-1 }
   ! { dg-note {variable 'parm\.[0-9]+' declared in block isn't 

Re: [PATCH] Make flag_trapping_math a non-binary Boolean.

2021-09-28 Thread Richard Biener via Gcc-patches
On Tue, Sep 28, 2021 at 1:34 PM Roger Sayle  wrote:
>
>
> Hi Joseph,
> Firstly very many thanks for taking the time to respond, and especially for
> mentioning the discussion in PR 54192 (and Marc Glisse's -ffenv-access
> patches, but they are a little less relevant).  Indeed the starting point
> for this patch is Richard Biener's proposal in comment #9 for that PR.
> That you've partially misunderstood the goal of this patch is encouraging
> (if it was simple to understand/fix, there wouldn't be so many open PRs).
> Hopefully, I'm bringing some fresh thinking on how to solve/tackle these
> long standing issues.
>
> Next, I'd like to state that your "five restrictions" ontology is an
> excellent starting point, but I'd like to argue that your proposed list of
> 5 is the wrong shape (insufficiently refined).  Instead, I'd like to
> counter-propose that an improvement/refinement of the Myers model is
> actually "3 primitive restrictions * N trapping conditions * 2 flow
> control sensitivity".
>
> For reference, here's your original list:
> > [1] Disallowing code transformations that cause some code to raise more
> > exception flags than it would have before.
> > [2] Disallowing code transformations that cause some code to raise fewer
> > exception flags than it would have before.
> > [3] Ensuring the code generated allows for possible non-local control flow
> > from exception traps raised by floating-point operations (this is the part
> > where -fnon-call-exceptions might be relevant).
> > [4] Disallowing code transformations that might affect whether an exact
> > underflow exception occurs in some code (not observable through exception
> > flags, is observable through trap handlers).
> > [5] Ensuring floating-point operations that might raise exception flags
> > are not removed, or moved past code (asms or function calls) that might read
> > or modify the exception flag state
>
> Firstly your item [3], concerns the relationship between traps and flow
> control, such as C++ exception handling, which is as you correctly point out
> the role of "-fnon-call-exceptions", which Richard B has recently confirmed
> only applies to targets/languages supporting C++ style exceptions, i.e. this
> is controlled by -fexceptions.  On targets such as nvptx-none, that don't
> support non-local control flow, stack unwinding nor setjmp/longjmp, i.e.
> don't support exceptions, this is completely orthogonal to the others.
>
> Next your item [4] highlights what I consider the underlying problem that
> until now has been overlooked: different kinds of traps are
> observationally/behaviourally different.  Above you describe "underflow",
> but likewise there are traps for inexact result, "2.0 / 3.0", traps for
> division by 0.0, that invokes undefined behaviour in C++ (but sometimes not
> in C), and distinctions between quiet and signaling NaNs.  Your primitive
> restrictions, [1], [2] and [5] may apply differently to these different
> kinds of exceptions.
>
> More relevant than Marc Glisse's -fenv-access is actually my
> -fpreserve-traps patch from July:
> https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574885.html
> which tackles restriction [5] (and perhaps [2]).
>
> Working towards the Myers restriction model, I believe we'd be a significant
> step closer with three (command line) flags (or families of flags):
>
> -ftrapping-math    related to Myers restrictions [1],[2],[5]
> -fpreserve-traps   related to Myers restriction [5]
> -fcounted-traps    related to Myers restriction [2]

Just to throw in a comment without intending to interrupt the fruitful
argument...

I'd like to keep changes refined to the frontends / middle-ends until we
sort out the bigger picture and have an approach that is usable in the
actual implementation and also extensible, that is, it doesn't fall apart
when considering the related problems Joseph mentioned (-frounding-math,
FENV access).

And only _then_ think of how to expose this best to the user with new
user-visible options and tunables.  Because those tend to stick around
forever and so mistakes there are much more costly (and it's not that
we don't have too many entangled knobs in the area of math semantics...)

> The insight that untangles the Gordian knot, is that these three options are
> not simple true/false Binary flags, but actually (bit) sets of exception
> types
> (hopefully all actually using the same TRAPPING_MATH enumeration).
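
One possible reading of the "bit sets of exception types" proposal, written as a C sketch.  Everything here is an assumption for illustration: the enumerator names, the `struct trap_options` layout and the deletion rule are invented, and GCC has no such interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical exception kinds; the names are made up for this sketch
   and are not GCC's.  */
enum trap_kind
{
  TRAP_INVALID   = 1 << 0,
  TRAP_DIVBYZERO = 1 << 1,
  TRAP_OVERFLOW  = 1 << 2,
  TRAP_UNDERFLOW = 1 << 3,
  TRAP_INEXACT   = 1 << 4,
  TRAP_ALL       = (1 << 5) - 1
};

/* One bit set per proposed option, instead of one boolean each.  */
struct trap_options
{
  unsigned trapping;  /* Stands in for -ftrapping-math.  */
  unsigned preserve;  /* Stands in for -fpreserve-traps.  */
  unsigned counted;   /* Stands in for -fcounted-traps.  */
};

/* May an operation that can raise MAY_RAISE be deleted when its result
   is unused?  In this model, only if none of the kinds it can raise has
   to be preserved.  */
static bool
may_delete_unused_op (const struct trap_options *opts, unsigned may_raise)
{
  assert ((may_raise & ~(unsigned) TRAP_ALL) == 0);
  return (opts->preserve & may_raise) == 0;
}
```

With `preserve` set to `TRAP_DIVBYZERO | TRAP_INVALID`, an unused operation that can only raise inexact may be deleted in this model, while one that can also raise division by zero may not.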
>
> Consider the following four lines of C++:
> constexpr double t1 = 2.0 / 3.0;
> constexpr bool t2 = std::numeric_limits<double>::quiet_NaN() == 0.0;
> constexpr bool t3 = std::numeric_limits<double>::quiet_NaN() < 0.0;
> constexpr double t4 = 1.0 / 0.0;
> which by IEEE generate four different types of exception, but as you've
> expertly confirmed have (sometimes) different behaviours under the C++
> standard.
> Treating all trapping conditions identically is clearly insufficient.
>
> Hopefully, the argument/proposal above is 

Re: [RFC] Don't move cold code out of loop by checking bb count

2021-09-28 Thread Richard Biener via Gcc-patches
On Fri, Sep 24, 2021 at 8:29 AM Xionghu Luo  wrote:
>
> Update the patch to v3, not sure whether you prefer the paste style
> and continue to link the previous thread as Segher dislikes this...
>
>
> [PATCH v3] Don't move cold code out of loop by checking bb count
>
>
> Changes:
> 1. Handle max_loop in determine_max_movement instead of
> outermost_invariant_loop.
> 2. Remove unnecessary changes.
> 3. Add for_all_locs_in_loop (loop, ref, ref_in_loop_hot_body) in can_sm_ref_p.
> 4. Keep "gsi_next ();" in move_computations_worker: omitting it caused an
> infinite loop when implementing v1, because the iterator was never
> advanced.
>
> v1: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/576488.html
> v2: https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579086.html
>
> There was a patch trying to avoid moving cold blocks out of loops:
>
> https://gcc.gnu.org/pipermail/gcc/2014-November/215551.html
>
> Richard suggested to "never hoist anything from a bb with lower execution
> frequency to a bb with higher one in LIM invariantness_dom_walker
> before_dom_children".
>
> In gimple LIM analysis, add find_coldest_out_loop to move invariants to the
> expected target loop; if the profile count of the loop bb is colder
> than the target loop preheader, it won't be hoisted out of the loop.
> Likewise for store motion: if all locations of the REF in the loop are cold,
> don't do store motion of it.
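
The hoisting rule described above can be sketched with plain integers standing in for profile counts.  This is a hypothetical model, not the patch's find_coldest_out_loop (which is cut off in this archive); the function name and the array representation of the enclosing loops are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* PREHEADER_COUNTS[0] belongs to the outermost candidate loop, the last
   entry to the innermost one.  Return the index of the candidate with
   the coldest preheader (outermost wins ties), or -1 when the block
   holding the invariant is colder than every candidate preheader, in
   which case the statement stays where it is.  Hypothetical model.  */
static int
coldest_hoist_target (const uint64_t *preheader_counts, size_t n,
		      uint64_t bb_count)
{
  assert (preheader_counts != NULL || n == 0);
  int best = -1;
  for (size_t i = 0; i < n; i++)
    if (best < 0 || preheader_counts[i] < preheader_counts[best])
      best = (int) i;
  /* Same comparison as "preheader->count > bb->count" in the RTL part.  */
  if (best >= 0 && bb_count < preheader_counts[best])
    return -1;
  return best;
}
```

With preheader counts {100, 10, 1000}, a block executed 500 times would be hoisted to candidate 1, while a block executed 5 times would not be hoisted at all.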
>
> SPEC2017 performance evaluation shows 1% performance improvement for
> intrate GEOMEAN and no obvious regression for others.  Especially,
> 500.perlbench_r +7.52% (Perf shows function S_regtry of perlbench is
> largely improved.), and 548.exchange2_r +1.98%, 526.blender_r +1.00%
> on P8LE.
>
> gcc/ChangeLog:
>
> * loop-invariant.c (find_invariants_bb): Check profile count
> before motion.
> (find_invariants_body): Add argument.
> * tree-ssa-loop-im.c (find_coldest_out_loop): New function.
> (determine_max_movement): Use find_coldest_out_loop.
> (move_computations_worker): Adjust and fix iteration update.
> (execute_sm_exit): Check pointer validness.
> (class ref_in_loop_hot_body): New functor.
> (ref_in_loop_hot_body::operator): New.
> (can_sm_ref_p): Use for_all_locs_in_loop.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/recip-3.c: Adjust.
> * gcc.dg/tree-ssa/ssa-lim-18.c: New test.
> * gcc.dg/tree-ssa/ssa-lim-19.c: New test.
> * gcc.dg/tree-ssa/ssa-lim-20.c: New test.
> ---
>  gcc/loop-invariant.c   | 10 ++--
>  gcc/tree-ssa-loop-im.c | 61 --
>  gcc/testsuite/gcc.dg/tree-ssa/recip-3.c|  2 +-
>  gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-18.c | 20 +++
>  gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-19.c | 27 ++
>  gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-20.c | 25 +
>  gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-21.c | 28 ++
>  7 files changed, 165 insertions(+), 8 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-18.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-19.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-20.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-21.c
>
> diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
> index fca0c2b24be..5c3be7bf0eb 100644
> --- a/gcc/loop-invariant.c
> +++ b/gcc/loop-invariant.c
> @@ -1183,9 +1183,14 @@ find_invariants_insn (rtx_insn *insn, bool 
> always_reached, bool always_executed)
> call.  */
>
>  static void
> -find_invariants_bb (basic_block bb, bool always_reached, bool 
> always_executed)
> +find_invariants_bb (class loop *loop, basic_block bb, bool always_reached,
> +   bool always_executed)
>  {
>rtx_insn *insn;
> +  basic_block preheader = loop_preheader_edge (loop)->src;
> +
> +  if (preheader->count > bb->count)
> +return;
>
>FOR_BB_INSNS (bb, insn)
>  {
> @@ -1214,8 +1219,7 @@ find_invariants_body (class loop *loop, basic_block 
> *body,
>unsigned i;
>
>for (i = 0; i < loop->num_nodes; i++)
> -find_invariants_bb (body[i],
> -   bitmap_bit_p (always_reached, i),
> +find_invariants_bb (loop, body[i], bitmap_bit_p (always_reached, i),
> bitmap_bit_p (always_executed, i));
>  }
>
> diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
> index 4b187c2cdaf..655fab03442 100644
> --- a/gcc/tree-ssa-loop-im.c
> +++ b/gcc/tree-ssa-loop-im.c
> @@ -417,6 +417,28 @@ movement_possibility (gimple *stmt)
>return ret;
>  }
>
> +/* Find coldest loop between outmost_loop and loop by comparing profile count.  */
> +
> +static class loop *
> +find_coldest_out_loop (class loop *outmost_loop, class loop *loop,
> +  basic_block curr_bb)
> +{
> +  class loop *cold_loop, *min_loop;
> +  cold_loop = min_loop = outmost_loop;
> +  profile_count min_count = loop_preheader_edge (min_loop)->src->count;
> +
> +  if (curr_bb 

Re: [PATCH] Loop unswitching: support gswitch statements.

2021-09-28 Thread Richard Biener via Gcc-patches
On Wed, Sep 15, 2021 at 10:46 AM Martin Liška  wrote:
>
> Hello.
>
> The patch extends the loop unswitching pass so that gswitch
> statements are supported. The pass now uses ranger which marks
> switch edges that are known to be unreachable in a versioned loop.
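
For readers unfamiliar with the transformation, here is a hand-written C sketch of what unswitching on a loop-invariant switch amounts to.  The function names are made up, and this only illustrates the source-level effect, not how the pass or ranger implements it.

```c
#include <assert.h>
#include <stddef.h>

/* Loop with a loop-invariant switch inside the body.  */
static long
sum_switch_inside (const int *a, size_t n, int order)
{
  long sum = 0;
  for (size_t i = 0; i < n; i++)
    switch (order)
      {
      case 0: sum += -8L * a[i]; break;
      case 1: sum += 3L * a[i]; break;
      default: sum += a[i]; break;
      }
  return sum;
}

/* Hand-unswitched form: the switch is hoisted out of the loop and each
   case gets its own specialised copy of the loop.  */
static long
sum_unswitched (const int *a, size_t n, int order)
{
  long sum = 0;
  assert (a != NULL || n == 0);
  switch (order)
    {
    case 0:
      for (size_t i = 0; i < n; i++)
	sum += -8L * a[i];
      break;
    case 1:
      for (size_t i = 0; i < n; i++)
	sum += 3L * a[i];
      break;
    default:
      for (size_t i = 0; i < n; i++)
	sum += a[i];
      break;
    }
  return sum;
}
```

Both functions compute the same result for every value of `order`; in each specialised loop the other cases are unreachable, which is the property the pass is described as deriving from ranger.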
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?
> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> * tree-cfg.c (gimple_lv_add_condition_to_bb): Support non-gimple
> expressions that need to be gimplified.
> * tree-ssa-loop-unswitch.c (tree_unswitch_loop): Add new
> cond_edge parameter.
> (tree_may_unswitch_on): Support gswitch statements.
> (clean_up_switches): New function.
> (tree_ssa_unswitch_loops): Call clean_up_switches.
> (simplify_using_entry_checks): Removed and replaced with ranger.
> (tree_unswitch_single_loop): Change assumptions.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/loop-unswitch-6.c: New test.
> * gcc.dg/loop-unswitch-7.c: New test.
> * gcc.dg/loop-unswitch-8.c: New test.
> * gcc.dg/loop-unswitch-9.c: New test.
>
> Co-Authored-By: Richard Biener 
> ---
>   gcc/testsuite/gcc.dg/loop-unswitch-6.c |  56 +
>   gcc/testsuite/gcc.dg/loop-unswitch-7.c |  45 
>   gcc/testsuite/gcc.dg/loop-unswitch-8.c |  28 +++
>   gcc/testsuite/gcc.dg/loop-unswitch-9.c |  34 +++
>   gcc/tree-cfg.c |   7 +-
>   gcc/tree-ssa-loop-unswitch.c   | 284 ++---
>   6 files changed, 374 insertions(+), 80 deletions(-)
>   create mode 100644 gcc/testsuite/gcc.dg/loop-unswitch-6.c
>   create mode 100644 gcc/testsuite/gcc.dg/loop-unswitch-7.c
>   create mode 100644 gcc/testsuite/gcc.dg/loop-unswitch-8.c
>   create mode 100644 gcc/testsuite/gcc.dg/loop-unswitch-9.c
>
> diff --git a/gcc/testsuite/gcc.dg/loop-unswitch-6.c 
> b/gcc/testsuite/gcc.dg/loop-unswitch-6.c
> new file mode 100644
> index 000..8a022e0f200
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/loop-unswitch-6.c
> @@ -0,0 +1,56 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details 
> --param=max-unswitch-insns=1000 --param=max-unswitch-level=10" } */
> +
> +int
> +__attribute__((noipa))
> +foo(double *a, double *b, double *c, double *d, double *r, int size, int 
> order)
> +{
> +  for (int i = 0; i < size; i++)
> +  {
> +double tmp, tmp2;
> +
> +switch(order)
> +{
> +  case 0:
> +tmp = -8 * a[i];
> +tmp2 = 2 * b[i];
> +break;
> +  case 1:
> +tmp = 3 * a[i] -  2 * b[i];
> +tmp2 = 5 * b[i] - 2 * c[i];
> +break;
> +  case 2:
> +tmp = 9 * a[i] +  2 * b[i] + c[i];
> +tmp2 = 4 * b[i] + 2 * c[i] + 8 * d[i];
> +break;
> +  case 3:
> +tmp = 3 * a[i] +  2 * b[i] - c[i];
> +tmp2 = b[i] - 2 * c[i] + 8 * d[i];
> +break;
> +  default:
> +__builtin_unreachable ();
> +}
> +
> +double x = 3 * tmp + d[i] + tmp;
> +double y = 3.4f * tmp + d[i] + tmp2;
> +r[i] = x + y;
> +  }
> +
> +  return 0;
> +}
> +
> +#define N 16 * 1024
> +double aa[N], bb[N], cc[N], dd[N], rr[N];
> +
> +int main()
> +{
> +  for (int i = 0; i < 100 * 1000; i++)
> +foo (aa, bb, cc, dd, rr, N, i % 4);
> +}
> +
> +
> +/* Test that we actually unswitched something.  */
> +/* { dg-final { scan-tree-dump ";; Unswitching loop with condition: order.* 
> == 0" "unswitch" } } */
> +/* { dg-final { scan-tree-dump ";; Unswitching loop with condition: order.* 
> == 1" "unswitch" } } */
> +/* { dg-final { scan-tree-dump ";; Unswitching loop with condition: order.* 
> == 2" "unswitch" } } */
> +/* { dg-final { scan-tree-dump ";; Unswitching loop with condition: order.* 
> == 3" "unswitch" } } */
> diff --git a/gcc/testsuite/gcc.dg/loop-unswitch-7.c 
> b/gcc/testsuite/gcc.dg/loop-unswitch-7.c
> new file mode 100644
> index 000..00f2fcff64b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/loop-unswitch-7.c
> @@ -0,0 +1,45 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -funswitch-loops -fdump-tree-unswitch-details 
> --param=max-unswitch-insns=1000 --param=max-unswitch-level=10" } */
> +
> +int
> +foo(double *a, double *b, double *c, double *d, double *r, int size, int 
> order)
> +{
> +  for (int i = 0; i < size; i++)
> +  {
> +double tmp, tmp2;
> +
> +switch(order)
> +{
> +  case 5 ... 6:
> +  case 9:
> +tmp = -8 * a[i];
> +tmp2 = 2 * b[i];
> +break;
> +  case 11:
> +tmp = 3 * a[i] -  2 * b[i];
> +tmp2 = 5 * b[i] - 2 * c[i];
> +break;
> +  case 22:
> +tmp = 9 * a[i] +  2 * b[i] + c[i];
> +tmp2 = 4 * b[i] + 2 * c[i] + 8 * d[i];
> +break;
> +  case 33:
> +tmp = 3 * a[i] +  2 * b[i] - c[i];
> +tmp2 = b[i] - 2 * c[i] + 8 * d[i];
> +break;
> +  default:
> +__builtin_unreachable ();
> +}
> +
> +double x = 3 * 

Re: [PATCH v3 3/3] reassoc: Test rank biasing

2021-09-28 Thread Ilya Leoshkevich via Gcc-patches
On Tue, 2021-09-28 at 13:28 +0200, Richard Biener wrote:
> On Sun, 26 Sep 2021, Ilya Leoshkevich wrote:
> 
> > Add both positive and negative tests.
> 
> The tests will likely be quite fragile with respect to what is
> actually vectorized on which target.  If you move the tests
> to gcc.dg/vect/ you could at least do
> 
> /* { dg-require-effective-target vect_int } */
> 
> do you need to look for the exact GIMPLE IL or is it enough to
> verify we are vectorizing the reduction?

Actually I don't think vectorization is that important here, and I
only check how many times sum_x = sum_y + _z appears.  So I use
(?:vect_)?, which may or may not be there.

An alternative I considered was to use -fno-tree-vectorize to get
smaller regexes, but I thought it would be nice to know that
vectorization does not mess up reassociation results.
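To make the effect of the optional (?:vect_)? prefix concrete, here is a small sketch that translates the pattern into a POSIX extended regular expression and tries it on a few dump-like lines.  The translation ((?:...) becomes a plain group, \d becomes 0-9) and the sample lines are assumptions made for this example; the testsuite itself uses Tcl regexps.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* POSIX ERE rendering of the pattern from the tests.  */
static const char accumulator_last_ere[] =
  "(vect_)?sum_[0-9._]+ = "
  "((vect_)?_[0-9._]+ \\+ (vect_)?sum_[0-9._]+"
  "|(vect_)?sum_[0-9._]+ \\+ (vect_)?_[0-9._]+)";

/* Return 1 if LINE contains "sum = temporary + sum" or
   "sum = sum + temporary", with or without vect_ prefixes; else 0.  */
static int
matches_accumulator_last (const char *line)
{
  regex_t re;
  int rc = regcomp (&re, accumulator_last_ere, REG_EXTENDED | REG_NOSUB);

  assert (rc == 0);
  rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}

/* Exercise the pattern on made-up scalar and vectorized statements.  */
static void
check_pattern (void)
{
  assert (matches_accumulator_last ("sum_12 = _5 + sum_9;"));
  assert (matches_accumulator_last ("sum_12 = sum_9 + _5;"));
  assert (matches_accumulator_last
          ("vect_sum_12.7_30 = vect__5.6_29 + vect_sum_9.5_28;"));
  /* Two temporaries and no accumulator on the right-hand side.  */
  assert (!matches_accumulator_last ("sum_12 = _5 + _7;"));
}
```

So the same expression covers both the scalar and the vectorized spelling of the statement, which is why the tests do not need to care which one a given target produces.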

Best regards,
Ilya



RE: [PATCH] Make flag_trapping_math a non-binary Boolean.

2021-09-28 Thread Roger Sayle


Hi Joseph,
Firstly, very many thanks for taking the time to respond, and especially for
mentioning the discussion in PR 54192 (and Marc Glisse's -ffenv-access
patches, but they are a little less relevant).  Indeed, the starting point
for this patch is Richard Biener's proposal in comment #9 for that PR.  That
you've partially misunderstood the goal of this patch is encouraging (if it
were simple to understand/fix, there wouldn't be so many open PRs).
Hopefully, I'm bringing some fresh thinking on how to solve/tackle these
long-standing issues.

Next, I'd like to state that your "five restrictions" ontology is an
excellent starting point, but I'd like to argue that your proposed list of 5
is the wrong shape (insufficiently refined).  Instead, I'd like to
counter-propose that an improvement/refinement of the Myers model is
actually "3 primitive restrictions * N trapping conditions * 2 flow control
sensitivity".

For reference, here's your original list:
> [1] Disallowing code transformations that cause some code to raise more 
> exception flags than it would have before.
> [2] Disallowing code transformations that cause some code to raise fewer 
> exception flags than it would have before.
> [3] Ensuring the code generated allows for possible non-local control flow 
> from exception traps raised by floating-point operations (this is the part 
> where -fnon-call-exceptions might be relevant).
> [4] Disallowing code transformations that might affect whether an exact 
> underflow exception occurs in some code (not observable through exception 
> flags, is observable through trap handlers).
> [5] Ensuring floating-point operations that might raise exception flags are 
> not removed, or moved past code (asms or function calls) that might read 
> or modify the exception flag state

Firstly, your item [3] concerns the relationship between traps and flow
control, such as C++ exception handling, which is, as you correctly point
out, the role of "-fnon-call-exceptions", which Richard B has recently
confirmed only applies to targets/languages supporting C++ style exceptions,
i.e. this is controlled by -fexceptions.  On targets such as nvptx-none,
that don't support non-local control flow, stack unwinding or
setjmp/longjmp, i.e. don't support exceptions, this is completely orthogonal
to the others.

Next, your item [4] highlights what I consider the underlying problem that
until now has been overlooked: different kinds of traps are
observationally/behaviourally different.  Above you describe "underflow",
but likewise there are traps for inexact results, "2.0 / 3.0", traps for
division by 0.0, which invokes undefined behaviour in C++ (but sometimes not
in C), and distinctions between quiet and signaling NaNs.  Your primitive
restrictions, [1], [2] and [5], may apply differently to these different
kinds of exceptions.

More relevant than Marc Glisse's -ffenv-access is actually my
-fpreserve-traps patch from July:
https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574885.html
which tackles restriction [5] (and perhaps [2]).

Working towards the Myers restriction model, I believe we'd be a significant
step closer with three (command line) flags (or families of flags):

-ftrapping-math   related to Myers restrictions [1],[2],[5]
-fpreserve-traps  related to Myers restriction [5]
-fcounted-traps   related to Myers restriction [2]

The insight that untangles the Gordian knot is that these three options are
not simple true/false binary flags, but actually (bit) sets of exception
types (hopefully all actually using the same TRAPPING_MATH enumeration).
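To sketch what "a set of exception types instead of a Boolean" could look like, here is a minimal example.  Every name in it (trap_kind, trap_options, may_remove_op and so on) is invented for illustration; none of it is taken from the patch or from GCC.

```c
#include <assert.h>

/* Hypothetical kinds of trapping condition, one bit each.  */
enum trap_kind
{
  TRAP_NONE      = 0,
  TRAP_INEXACT   = 1 << 0,
  TRAP_UNDERFLOW = 1 << 1,
  TRAP_OVERFLOW  = 1 << 2,
  TRAP_DIVBYZERO = 1 << 3,
  TRAP_INVALID   = 1 << 4,
  TRAP_ALL       = (1 << 5) - 1
};

/* Hypothetical option state: each option holds a set of trap kinds rather
   than a single on/off value.  */
struct trap_options
{
  unsigned trapping_math;
  unsigned preserve_traps;
  unsigned counted_traps;
};

/* Return nonzero if an operation that might raise KIND may be removed,
   i.e. KIND is not in the preserved set.  */
static int
may_remove_op (const struct trap_options *opts, enum trap_kind kind)
{
  return (opts->preserve_traps & (unsigned) kind) == 0;
}

/* An existing Boolean setting still maps onto the set representation:
   false selects no kinds, true selects all of them.  */
static unsigned
trap_set_from_bool (int enabled)
{
  return enabled ? TRAP_ALL : TRAP_NONE;
}

/* Preserve only division-by-zero and invalid-operation traps.  */
static void
check_trap_sets (void)
{
  struct trap_options opts
    = { trap_set_from_bool (1), TRAP_DIVBYZERO | TRAP_INVALID, TRAP_NONE };

  assert (may_remove_op (&opts, TRAP_INEXACT));
  assert (!may_remove_op (&opts, TRAP_DIVBYZERO));
  assert (!may_remove_op (&opts, TRAP_INVALID));
}
```

With this shape, a transformation can ask about the one condition it affects instead of consulting a single global switch.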

Consider the following four lines of C++:
constexpr double t1 = 2.0 / 3.0;
constexpr bool t2 = std::numeric_limits<double>::quiet_NaN() == 0.0;
constexpr bool t3 = std::numeric_limits<double>::quiet_NaN() < 0.0;
constexpr double t4 = 1.0 / 0.0;
which by IEEE generate four different types of exception, but as you've
expertly confirmed have (sometimes) different behaviours under the C++
standard.  Treating all trapping conditions identically is clearly
insufficient.

Hopefully, the argument/proposal above is sufficient to convince the list
that we need some form of enumeration (following Richard Biener's proposal).
Perhaps the devil is in the details, as to what the final form of this
enumeration should look like [even though at this stage there are no
functional changes yet].

Two very useful references I've been following are:
https://docs.oracle.com/cd/E19957-01/806-3568/ncg_handle.html
https://docs.oracle.com/cd/E88353_01/html/E37846/fex-getexcepthandler-3m.html

Ultimately, the fields and naming of this enumeration are a middle-end
detail, and reflect constant folding transformations that the middle-end may
or may not perform on either trees or RTL.  In theory, they could be named
after line numbers in match.pd, fold-const.c and simplify-rtx.c.  For
example, what IEEE calls "FPE_INTOVF" is more commonly known as TRAPV inside
GCC.  Likewise, IEEE
concepts such as FE_INVALID are really 

Re: [PATCH v3 3/3] reassoc: Test rank biasing

2021-09-28 Thread Richard Biener via Gcc-patches
On Sun, 26 Sep 2021, Ilya Leoshkevich wrote:

> Add both positive and negative tests.

The tests will likely be quite fragile with respect to what is
actually vectorized on which target.  If you move the tests
to gcc.dg/vect/ you could at least do

/* { dg-require-effective-target vect_int } */

do you need to look for the exact GIMPLE IL or is it enough to
verify we are vectorizing the reduction?

Thanks,
Richard.


> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/tree-ssa/reassoc-46.c: New test.
>   * gcc.dg/tree-ssa/reassoc-46.h: Common code for new tests.
>   * gcc.dg/tree-ssa/reassoc-47.c: New test.
>   * gcc.dg/tree-ssa/reassoc-48.c: New test.
>   * gcc.dg/tree-ssa/reassoc-49.c: New test.
>   * gcc.dg/tree-ssa/reassoc-50.c: New test.
>   * gcc.dg/tree-ssa/reassoc-51.c: New test.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.c |  7 +
>  gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.h | 33 ++
>  gcc/testsuite/gcc.dg/tree-ssa/reassoc-47.c |  9 ++
>  gcc/testsuite/gcc.dg/tree-ssa/reassoc-48.c |  9 ++
>  gcc/testsuite/gcc.dg/tree-ssa/reassoc-49.c | 11 
>  gcc/testsuite/gcc.dg/tree-ssa/reassoc-50.c | 10 +++
>  gcc/testsuite/gcc.dg/tree-ssa/reassoc-51.c | 11 
>  7 files changed, 90 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.h
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-47.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-48.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-49.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-50.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-51.c
> 
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.c
> new file mode 100644
> index 000..97563dd929f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.c
> @@ -0,0 +1,7 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized -ftree-vectorize" } */
> +
> +#include "reassoc-46.h"
> +
> +/* Check that the loop accumulator is added last.  */
> +/* { dg-final { scan-tree-dump-times {(?:vect_)?sum_[\d._]+ = 
> (?:(?:vect_)?_[\d._]+ \+ (?:vect_)?sum_[\d._]+|(?:vect_)?sum_[\d._]+ \+ 
> (?:vect_)?_[\d._]+)} 1 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.h 
> b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.h
> new file mode 100644
> index 000..e60b490ea0d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.h
> @@ -0,0 +1,33 @@
> +#define M 1024
> +unsigned int arr1[M];
> +unsigned int arr2[M];
> +volatile unsigned int sink;
> +
> +unsigned int
> +test (void)
> +{
> +  unsigned int sum = 0;
> +  for (int i = 0; i < M; i++)
> +{
> +#ifdef MODIFY
> +  /* Modify the loop accumulator using a chain of operations - this 
> should
> + not affect its rank biasing.  */
> +  sum |= 1;
> +  sum ^= 2;
> +#endif
> +#ifdef STORE
> +  /* Save the loop accumulator into a global variable - this should not
> + affect its rank biasing.  */
> +  sink = sum;
> +#endif
> +#ifdef USE
> +  /* Add a tricky use of the loop accumulator - this should prevent its
> + rank biasing.  */
> +  i = (i + sum) % M;
> +#endif
> +  /* Use addends with different ranks.  */
> +  sum += arr1[i];
> +  sum += arr2[((i ^ 1) + 1) % M];
> +}
> +  return sum;
> +}
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-47.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-47.c
> new file mode 100644
> index 000..1b0f0fdabe1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-47.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized -ftree-vectorize" } */
> +
> +#define MODIFY
> +#include "reassoc-46.h"
> +
> +/* Check that if the loop accumulator is modified using a chain of operations
> +   other than addition, its new value is still added last.  */
> +/* { dg-final { scan-tree-dump-times {(?:vect_)?sum_[\d._]+ = 
> (?:(?:vect_)?_[\d._]+ \+ (?:vect_)?sum_[\d._]+|(?:vect_)?sum_[\d._]+ \+ 
> (?:vect_)?_[\d._]+)} 1 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-48.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-48.c
> new file mode 100644
> index 000..13836ebe8e6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-48.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized -ftree-vectorize" } */
> +
> +#define STORE
> +#include "reassoc-46.h"
> +
> +/* Check that if the loop accumulator is saved into a global variable, it's
> +   still added last.  */
> +/* { dg-final { scan-tree-dump-times {(?:vect_)?sum_[\d._]+ = 
> (?:(?:vect_)?_[\d._]+ \+ (?:vect_)?sum_[\d._]+|(?:vect_)?sum_[\d._]+ \+ 
> (?:vect_)?_[\d._]+)} 1 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-49.c 
> 

Re: [PATCH v3 2/3] reassoc: Propagate PHI_LOOP_BIAS along single uses

2021-09-28 Thread Richard Biener via Gcc-patches
On Sun, 26 Sep 2021, Ilya Leoshkevich wrote:

> PR tree-optimization/49749 introduced code that shortens dependency
> chains containing loop accumulators by placing them last on operand
> lists of associative operations.
> 
> 456.hmmer benchmark on s390 could benefit from this, however, the code
> that needs it modifies loop accumulator before using it, and since only
> so-called loop-carried phis are treated as loop accumulators, the
> code in the present form doesn't really help.   According to Bill
> Schmidt - the original author - such a conservative approach was chosen
> so as to avoid unnecessarily swapping operands, which might cause
> unpredictable effects.  However, giving special treatment to forms of
> loop accumulators is acceptable.
> 
> The definition of loop-carried phi is: it's a single-use phi, which is
> used in the same innermost loop it's defined in, at least one argument
> of which is defined in the same innermost loop as the phi itself.
> Given this, it seems natural to treat single uses of such phis as phis
> themselves.

OK.

Thanks,
Richard.
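For readers skimming the thread, the idea of "placing the accumulator last" can be sketched at source level as follows.  This is a hand-written illustration with made-up function names; the pass works on GIMPLE operand lists, and a compiler is free to reassociate either loop, so the source order here only shows the two shapes.

```c
#include <assert.h>

#define N 8

/* Accumulator first: each iteration computes (sum + a[i]) + b[i], so both
   additions sit on the loop-carried dependency chain.  */
static unsigned
sum_accumulator_first (const unsigned *a, const unsigned *b)
{
  unsigned sum = 0;
  for (int i = 0; i < N; i++)
    sum = (sum + a[i]) + b[i];
  return sum;
}

/* Accumulator last: a[i] + b[i] does not depend on the previous iteration,
   so only one addition remains on the loop-carried chain.  */
static unsigned
sum_accumulator_last (const unsigned *a, const unsigned *b)
{
  unsigned sum = 0;
  for (int i = 0; i < N; i++)
    sum = (a[i] + b[i]) + sum;
  return sum;
}

/* Unsigned addition wraps and is associative, so both orders must give the
   same result even when the intermediate sums overflow.  */
static int
orders_agree (void)
{
  const unsigned a[N] = { 1, 2, 3, 4, 0xfffffff0u, 6, 7, 8 };
  const unsigned b[N] = { 10, 20, 30, 40, 50, 0xffffff00u, 70, 80 };

  return sum_accumulator_first (a, b) == sum_accumulator_last (a, b);
}

static void
check_orders (void)
{
  assert (orders_agree ());
}
```

The results are identical; only the length of the dependency chain through sum differs.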

> gcc/ChangeLog:
> 
>   * tree-ssa-reassoc.c (biased_names): New global.
>   (propagate_bias_p): New function.
>   (loop_carried_phi): Remove.
>   (propagate_rank): Propagate bias along single uses.
>   (get_rank): Update biased_names when needed.
> ---
>  gcc/tree-ssa-reassoc.c | 109 -
>  1 file changed, 74 insertions(+), 35 deletions(-)
> 
> diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
> index 420c14e8cf5..db9fb4e1cac 100644
> --- a/gcc/tree-ssa-reassoc.c
> +++ b/gcc/tree-ssa-reassoc.c
> @@ -211,6 +211,10 @@ static int64_t *bb_rank;
>  /* Operand->rank hashtable.  */
>  static hash_map *operand_rank;
>  
> +/* SSA_NAMEs that are forms of loop accumulators and whose ranks need to be
> +   biased.  */
> +static auto_bitmap biased_names;
> +
>  /* Vector of SSA_NAMEs on which after reassociate_bb is done with
> all basic blocks the CFG should be adjusted - basic blocks
> split right after that SSA_NAME's definition statement and before
> @@ -256,6 +260,53 @@ reassoc_remove_stmt (gimple_stmt_iterator *gsi)
> the rank difference between two blocks.  */
>  #define PHI_LOOP_BIAS (1 << 15)
>  
> +/* Return TRUE iff PHI_LOOP_BIAS should be propagated from one of the STMT's
> +   operands to the STMT's left-hand side.  The goal is to preserve bias in 
> code
> +   like this:
> +
> + x_1 = phi(x_0, x_2)
> + a = x_1 | 1
> + b = a ^ 2
> + .MEM = b
> + c = b + d
> + x_2 = c + e
> +
> +   That is, we need to preserve bias along single-use chains originating from
> +   loop-carried phis.  Only GIMPLE_ASSIGNs to SSA_NAMEs are considered to be
> +   uses, because only they participate in rank propagation.  */
> +static bool
> +propagate_bias_p (gimple *stmt)
> +{
> +  use_operand_p use;
> +  imm_use_iterator use_iter;
> +  gimple *single_use_stmt = NULL;
> +
> +  if (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_reference)
> +return false;
> +
> +  FOR_EACH_IMM_USE_FAST (use, use_iter, gimple_assign_lhs (stmt))
> +{
> +  gimple *current_use_stmt = USE_STMT (use);
> +
> +  if (is_gimple_assign (current_use_stmt)
> +   && TREE_CODE (gimple_assign_lhs (current_use_stmt)) == SSA_NAME)
> + {
> +   if (single_use_stmt != NULL && single_use_stmt != current_use_stmt)
> + return false;
> +   single_use_stmt = current_use_stmt;
> + }
> +}
> +
> +  if (single_use_stmt == NULL)
> +return false;
> +
> +  if (gimple_bb (stmt)->loop_father
> +  != gimple_bb (single_use_stmt)->loop_father)
> +return false;
> +
> +  return true;
> +}
> +
>  /* Rank assigned to a phi statement.  If STMT is a loop-carried phi of
> an innermost loop, and the phi has only a single use which is inside
> the loop, then the rank is the block rank of the loop latch plus an
> @@ -313,49 +364,27 @@ phi_rank (gimple *stmt)
>return bb_rank[bb->index];
>  }
>  
> -/* If EXP is an SSA_NAME defined by a PHI statement that represents a
> -   loop-carried dependence of an innermost loop, return TRUE; else
> -   return FALSE.  */
> -static bool
> -loop_carried_phi (tree exp)
> -{
> -  gimple *phi_stmt;
> -  int64_t block_rank;
> -
> -  if (TREE_CODE (exp) != SSA_NAME
> -  || SSA_NAME_IS_DEFAULT_DEF (exp))
> -return false;
> -
> -  phi_stmt = SSA_NAME_DEF_STMT (exp);
> -
> -  if (gimple_code (SSA_NAME_DEF_STMT (exp)) != GIMPLE_PHI)
> -return false;
> -
> -  /* Non-loop-carried phis have block rank.  Loop-carried phis have
> - an additional bias added in.  If this phi doesn't have block rank,
> - it's biased and should not be propagated.  */
> -  block_rank = bb_rank[gimple_bb (phi_stmt)->index];
> -
> -  if (phi_rank (phi_stmt) != block_rank)
> -return true;
> -
> -  return false;
> -}
> -
>  /* Return the maximum of RANK and the rank that should be propagated
> from expression OP.  For 

RE: [PATCH 06/13] arm: Fix mve_vmvnq_n_ argument mode

2021-09-28 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Gcc-patches  bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Christophe
> Lyon via Gcc-patches
> Sent: 07 September 2021 10:17
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH 06/13] arm: Fix mve_vmvnq_n_ argument
> mode
> 
> The vmvnq_n* intrinsics have [u]int[16|32]_t arguments, so use the
> <V_elem> iterator instead of HI in mve_vmvnq_n_.

Ok. This can go in independently from the rest if testing is ok.
Thanks,
Kyrill

> 
> 2021-09-03  Christophe Lyon  
> 
>   gcc/
>   * config/arm/mve.md (mve_vmvnq_n_): Use V_elem
> mode
>   for operand 1.
> 
> diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
> index e393518ea88..14d17060290 100644
> --- a/gcc/config/arm/mve.md
> +++ b/gcc/config/arm/mve.md
> @@ -617,7 +617,7 @@ (define_insn "mve_vcvtaq_"
>  (define_insn "mve_vmvnq_n_"
>[
> (set (match_operand:MVE_5 0 "s_register_operand" "=w")
> - (unspec:MVE_5 [(match_operand:HI 1 "immediate_operand" "i")]
> + (unspec:MVE_5 [(match_operand:<V_elem> 1
> "immediate_operand" "i")]
>VMVNQ_N))
>]
>"TARGET_HAVE_MVE"
> --
> 2.25.1



RE: [PATCH 05/13] arm: Add support for VPR_REG in arm_class_likely_spilled_p

2021-09-28 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Gcc-patches  bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Christophe
> Lyon via Gcc-patches
> Sent: 07 September 2021 10:17
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH 05/13] arm: Add support for VPR_REG in
> arm_class_likely_spilled_p
> 
> VPR_REG is the only register in its class, so it should be handled by
> TARGET_CLASS_LIKELY_SPILLED_P.  No test fails without this patch, but
> it seems it should be implemented.

The documentation for the hook does recommend returning true when there is only 
one register in the class.
So this seems sensible to me. It's supposed to affect optimisation rather than 
correctness so I'm in favour of it.
Ok.
Thanks,
Kyrill
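A toy model of that recommendation, purely as a sketch: the register classes and masks below are invented for the example and are not the Arm definitions, and the predicate is only an analogue of the real hook.

```c
#include <assert.h>

/* Hypothetical register classes, each described by a bit mask of the hard
   registers it contains.  */
enum toy_class
{
  TOY_GENERAL,
  TOY_FLAGS,
  TOY_PREDICATE,
  TOY_NUM_CLASSES
};

static const unsigned toy_class_contents[TOY_NUM_CLASSES] = {
  0x00ffu,	/* TOY_GENERAL: eight registers.  */
  0x0100u,	/* TOY_FLAGS: a single register.  */
  0x0200u	/* TOY_PREDICATE: a single register.  */
};

/* Count the registers in a class mask by clearing the lowest set bit
   until none remain.  */
static int
count_regs (unsigned mask)
{
  int n = 0;
  for (; mask != 0; mask &= mask - 1)
    n++;
  return n;
}

/* Toy analogue of a class-likely-spilled predicate: a class with exactly
   one register is reported as likely to be spilled.  */
static int
toy_class_likely_spilled_p (enum toy_class rclass)
{
  return count_regs (toy_class_contents[rclass]) == 1;
}

static void
check_toy_classes (void)
{
  assert (!toy_class_likely_spilled_p (TOY_GENERAL));
  assert (toy_class_likely_spilled_p (TOY_FLAGS));
  assert (toy_class_likely_spilled_p (TOY_PREDICATE));
}
```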

> 
> 2021-09-01  Christophe Lyon  
> 
>   gcc/
>   * config/arm/arm.c (arm_class_likely_spilled_p): Handle VPR_REG.
> 
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 11dafc70067..1222cb0d0fe 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -29307,6 +29307,9 @@ arm_class_likely_spilled_p (reg_class_t rclass)
>|| rclass  == CC_REG)
>  return true;
> 
> +  if (TARGET_HAVE_MVE && (rclass == VPR_REG))
> +return true;
> +
>return false;
>  }
> 
> --
> 2.25.1



Re: [PATCH v3 1/3] reassoc: Do not bias loop-carried PHIs early

2021-09-28 Thread Richard Biener via Gcc-patches
On Sun, 26 Sep 2021, Ilya Leoshkevich wrote:

> Biasing loop-carried PHIs during the 1st reassociation pass interferes
> with reduction chains and does not bring measurable benefits, so do it
> only during the 2nd reassociation pass.

OK.

Thanks,
Richard.

> gcc/ChangeLog:
> 
> * passes.def (pass_reassoc): Rename parameter to early_p.
> * tree-ssa-reassoc.c (reassoc_bias_loop_carried_phi_ranks_p):
> New variable.
> (phi_rank): Don't bias loop-carried phi ranks
> before vectorization pass.
> (execute_reassoc): Add bias_loop_carried_phi_ranks_p parameter.
> (pass_reassoc::pass_reassoc): Add bias_loop_carried_phi_ranks_p
> initializer.
> (pass_reassoc::set_param): Set bias_loop_carried_phi_ranks_p
> value.
> (pass_reassoc::execute): Pass bias_loop_carried_phi_ranks_p to
> execute_reassoc.
> (pass_reassoc::bias_loop_carried_phi_ranks_p): New member.
> ---
>  gcc/passes.def |  4 ++--
>  gcc/tree-ssa-reassoc.c | 16 ++--
>  2 files changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/passes.def b/gcc/passes.def
> index d7a1f8c97a6..c5f915d04c6 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -242,7 +242,7 @@ along with GCC; see the file COPYING3.  If not see
>/* Identify paths that should never be executed in a conforming
>program and isolate those paths.  */
>NEXT_PASS (pass_isolate_erroneous_paths);
> -  NEXT_PASS (pass_reassoc, true /* insert_powi_p */);
> +  NEXT_PASS (pass_reassoc, true /* early_p */);
>NEXT_PASS (pass_dce);
>NEXT_PASS (pass_forwprop);
>NEXT_PASS (pass_phiopt, false /* early_p */);
> @@ -325,7 +325,7 @@ along with GCC; see the file COPYING3.  If not see
>NEXT_PASS (pass_lower_vector_ssa);
>NEXT_PASS (pass_lower_switch);
>NEXT_PASS (pass_cse_reciprocals);
> -  NEXT_PASS (pass_reassoc, false /* insert_powi_p */);
> +  NEXT_PASS (pass_reassoc, false /* early_p */);
>NEXT_PASS (pass_strength_reduction);
>NEXT_PASS (pass_split_paths);
>NEXT_PASS (pass_tracer);
> diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
> index 8498cfc7aa8..420c14e8cf5 100644
> --- a/gcc/tree-ssa-reassoc.c
> +++ b/gcc/tree-ssa-reassoc.c
> @@ -180,6 +180,10 @@ along with GCC; see the file COPYING3.  If not see
> point 3a in the pass header comment.  */
>  static bool reassoc_insert_powi_p;
>  
> +/* Enable biasing ranks of loop accumulators.  We don't want this before
> +   vectorization, since it interferes with reduction chains.  */
> +static bool reassoc_bias_loop_carried_phi_ranks_p;
> +
>  /* Statistics */
>  static struct
>  {
> @@ -269,6 +273,9 @@ phi_rank (gimple *stmt)
>use_operand_p use;
>gimple *use_stmt;
>  
> +  if (!reassoc_bias_loop_carried_phi_ranks_p)
> +return bb_rank[bb->index];
> +
>/* We only care about real loops (those with a latch).  */
>if (!father->latch)
>  return bb_rank[bb->index];
> @@ -6940,9 +6947,10 @@ fini_reassoc (void)
> optimization of a gimple conditional.  Otherwise returns zero.  */
>  
>  static unsigned int
> -execute_reassoc (bool insert_powi_p)
> +execute_reassoc (bool insert_powi_p, bool bias_loop_carried_phi_ranks_p)
>  {
>reassoc_insert_powi_p = insert_powi_p;
> +  reassoc_bias_loop_carried_phi_ranks_p = bias_loop_carried_phi_ranks_p;
>  
>init_reassoc ();
>  
> @@ -6983,15 +6991,19 @@ public:
>  {
>gcc_assert (n == 0);
>insert_powi_p = param;
> +  bias_loop_carried_phi_ranks_p = !param;
>  }
>virtual bool gate (function *) { return flag_tree_reassoc != 0; }
>virtual unsigned int execute (function *)
> -{ return execute_reassoc (insert_powi_p); }
> +  {
> +return execute_reassoc (insert_powi_p, bias_loop_carried_phi_ranks_p);
> +  }
>  
>   private:
>/* Enable insertion of __builtin_powi calls during execute_reassoc.  See
>   point 3a in the pass header comment.  */
>bool insert_powi_p;
> +  bool bias_loop_carried_phi_ranks_p;
>  }; // class pass_reassoc
>  
>  } // anon namespace
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


RE: [PATCH 04/13] arm: Add GENERAL_AND_VPR_REGS regclass

2021-09-28 Thread Kyrylo Tkachov via Gcc-patches
Hi Christophe,

> -Original Message-
> From: Gcc-patches  bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Christophe
> LYON via Gcc-patches
> Sent: 08 September 2021 08:49
> To: Richard Earnshaw ; gcc-
> patc...@gcc.gnu.org
> Subject: Re: [PATCH 04/13] arm: Add GENERAL_AND_VPR_REGS regclass
> 
> 
> On 07/09/2021 15:35, Richard Earnshaw wrote:
> >
> >
> > On 07/09/2021 13:05, Christophe LYON wrote:
> >>
> >> On 07/09/2021 11:42, Richard Earnshaw wrote:
> >>>
> >>>
> >>> On 07/09/2021 10:15, Christophe Lyon via Gcc-patches wrote:
>  At some point during the development of this patch series, it appeared
>  that in some cases the register allocator wants “VPR or general”
>  rather than “VPR or general or FP” (which is the same thing as
>  ALL_REGS).  The series does not seem to require this anymore, but it
>  seems to be a good thing to do anyway, to give the register allocator
>  more freedom.
> 
>  2021-09-01  Christophe Lyon 
> 
>  gcc/
>  * config/arm/arm.h (reg_class): Add GENERAL_AND_VPR_REGS.
>  (REG_CLASS_NAMES): Likewise.
>  (REG_CLASS_CONTENTS): Likewise. Add VPR_REG to ALL_REGS.
> 
>  diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
>  index 015299c1534..fab39d05916 100644
>  --- a/gcc/config/arm/arm.h
>  +++ b/gcc/config/arm/arm.h
>  @@ -1286,6 +1286,7 @@ enum reg_class
>      SFP_REG,
>      AFP_REG,
>      VPR_REG,
>  +  GENERAL_AND_VPR_REGS,
>      ALL_REGS,
>      LIM_REG_CLASSES
>    };
>  @@ -1315,6 +1316,7 @@ enum reg_class
>      "SFP_REG",    \
>      "AFP_REG",    \
>      "VPR_REG",    \
>  +  "GENERAL_AND_VPR_REGS", \
>      "ALL_REGS"    \
>    }
>    @@ -1343,7 +1345,8 @@ enum reg_class
>      { 0x, 0x, 0x, 0x0040 }, /* SFP_REG
>  */    \
>      { 0x, 0x, 0x, 0x0080 }, /* AFP_REG
>  */    \
>      { 0x, 0x, 0x, 0x0400 }, /* VPR_REG.
>  */    \
>  -  { 0x7FFF, 0x, 0x, 0x000F }  /* ALL_REGS.
>  */    \
>  +  { 0x5FFF, 0x, 0x, 0x0400 }, /*
>  GENERAL_AND_VPR_REGS.  */ \
>  +  { 0x7FFF, 0x, 0x, 0x040F }  /* ALL_REGS.
>  */    \
>    }
> >>>
> >>> You've changed the definition of ALL_REGS here (to include VPR_REG),
> >>> but not really explained why.  Is that the source of the underlying
> >>> issue with the 'appeared' you mention?
> >>
> >>
> >> I first added VPR_REG to ALL_REGS, but Richard Sandiford suggested I
> >> create a new GENERAL_AND_VPR_REGS that would be more restrictive. I
> >> did not remove VPR_REG from ALL_REGS because I thought it was an
> >> omission: shouldn't ALL_REGS contain all registers?
> >
> > Surely that should be a separate patch then.
> 
> OK, I can remove that line from this patch and make a separate one-liner
> for ALL_REGS.

Did you end up sending that patch out? (Sorry, I may have missed it in my 
archive).
This patch to add GENERAL_AND_VPR_REGS is okay with the ALL_REGS change 
separated out.

Thanks,
Kyrill

> 
> Thanks,
> 
> Christophe
> 
> 
> >
> > R.
> >
> >>
> >>
> >>>
> >>> R.
> >>>
> >>>
>      #define FP_SYSREGS \
> 


Re: [PATCH] Enable auto-vectorization at O2 with very-cheap cost model.

2021-09-28 Thread Richard Biener via Gcc-patches
On Sun, 26 Sep 2021, liuhongt wrote:

> Hi:
> > Please don't add the -fno- option to the warning tests.  As I said,
> > I would prefer to either suppress the vectorization for the failing
> > cases by tweaking the test code or xfail them.  That way future
> > regressions won't be masked by the option.  Once we've moved
> > the warning to a more suitable pass we'll add a new test to verify
> > it works as intended or remove the xfails.
> 
> Remove -fno-tree-vectorize from the warning tests, and add xfails to them.
> The warning information is mainly affected by vectorization of 4 or 2 char
> stores. Some targets support both, some targets support only one of them,
> and some targets support neither, which means the warning information
> differs from target to target.
> I only added xfail { x86_64-*-* i?86-*-* }; other backends may need to
> re-adjust these xfails.
> 
>   Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.

OK.

Thanks,
Richard.

> gcc/ChangeLog:
> 
>   * common.opt (ftree-vectorize): Add Var(flag_tree_vectorize).
>   * doc/invoke.texi (Options That Control Optimization): Update
>   documents.
>   * opts.c (default_options_table): Enable auto-vectorization at
>   O2 with very-cheap cost model.
>   (finish_options): Use cheap cost model for
>   explicit -ftree{,-loop}-vectorize.
> 
> gcc/testsuite/ChangeLog:
> 
>   * c-c++-common/Wstringop-overflow-2.c: Adjust testcase.
>   * g++.dg/tree-ssa/pr81408.C: Ditto.
>   * g++.dg/warn/Wuninitialized-13.C: Ditto.
>   * gcc.dg/Warray-bounds-51.c: Ditto.
>   * gcc.dg/Warray-parameter-3.c: Ditto.
>   * gcc.dg/Wstringop-overflow-14.c: Ditto.
>   * gcc.dg/Wstringop-overflow-21.c: Ditto.
>   * gcc.dg/Wstringop-overflow-68.c: Ditto.
>   * gcc.dg/Wstringop-overflow-76.c: Ditto.
>   * gcc.dg/gomp/pr46032-2.c: Ditto.
>   * gcc.dg/gomp/pr46032-3.c: Ditto.
>   * gcc.dg/gomp/simd-2.c: Ditto.
>   * gcc.dg/gomp/simd-3.c: Ditto.
>   * gcc.dg/graphite/fuse-1.c: Ditto.
>   * gcc.dg/pr67089-6.c: Ditto.
>   * gcc.dg/pr82929-2.c: Ditto.
>   * gcc.dg/pr82929.c: Ditto.
>   * gcc.dg/store_merging_1.c: Ditto.
>   * gcc.dg/store_merging_11.c: Ditto.
>   * gcc.dg/store_merging_15.c: Ditto.
>   * gcc.dg/store_merging_16.c: Ditto.
>   * gcc.dg/store_merging_19.c: Ditto.
>   * gcc.dg/store_merging_24.c: Ditto.
>   * gcc.dg/store_merging_25.c: Ditto.
>   * gcc.dg/store_merging_28.c: Ditto.
>   * gcc.dg/store_merging_30.c: Ditto.
>   * gcc.dg/store_merging_5.c: Ditto.
>   * gcc.dg/store_merging_7.c: Ditto.
>   * gcc.dg/store_merging_8.c: Ditto.
>   * gcc.dg/strlenopt-85.c: Ditto.
>   * gcc.dg/tree-ssa/dump-6.c: Ditto.
>   * gcc.dg/tree-ssa/pr19210-1.c: Ditto.
>   * gcc.dg/tree-ssa/pr47059.c: Ditto.
>   * gcc.dg/tree-ssa/pr86017.c: Ditto.
>   * gcc.dg/tree-ssa/pr91482.c: Ditto.
>   * gcc.dg/tree-ssa/predcom-1.c: Ditto.
>   * gcc.dg/tree-ssa/predcom-dse-3.c: Ditto.
>   * gcc.dg/tree-ssa/prefetch-3.c: Ditto.
>   * gcc.dg/tree-ssa/prefetch-6.c: Ditto.
>   * gcc.dg/tree-ssa/prefetch-8.c: Ditto.
>   * gcc.dg/tree-ssa/prefetch-9.c: Ditto.
>   * gcc.dg/tree-ssa/ssa-dse-18.c: Ditto.
>   * gcc.dg/tree-ssa/ssa-dse-19.c: Ditto.
>   * gcc.dg/uninit-40.c: Ditto.
>   * gcc.dg/unroll-7.c: Ditto.
>   * gcc.misc-tests/help.exp: Ditto.
>   * gcc.target/i386/avx512vpopcntdqvl-vpopcntd-1.c: Ditto.
>   * gcc.target/i386/pr34012.c: Ditto.
>   * gcc.target/i386/pr49781-1.c: Ditto.
>   * gcc.target/i386/pr95798-1.c: Ditto.
>   * gcc.target/i386/pr95798-2.c: Ditto.
>   * gfortran.dg/pr77498.f: Ditto.
> ---
>  gcc/common.opt|  2 +-
>  gcc/doc/invoke.texi   |  8 
>  gcc/opts.c| 17 +---
>  .../c-c++-common/Wstringop-overflow-2.c   | 20 +--
>  gcc/testsuite/g++.dg/tree-ssa/pr81408.C   |  2 +-
>  gcc/testsuite/g++.dg/warn/Wuninitialized-13.C |  2 +-
>  gcc/testsuite/gcc.dg/Warray-bounds-51.c   |  2 +-
>  gcc/testsuite/gcc.dg/Warray-parameter-3.c |  4 ++--
>  gcc/testsuite/gcc.dg/Wstringop-overflow-14.c  |  4 ++--
>  gcc/testsuite/gcc.dg/Wstringop-overflow-21.c  |  8 
>  gcc/testsuite/gcc.dg/Wstringop-overflow-68.c  | 10 +-
>  gcc/testsuite/gcc.dg/Wstringop-overflow-76.c  | 20 +--
>  gcc/testsuite/gcc.dg/gomp/pr46032-2.c |  2 +-
>  gcc/testsuite/gcc.dg/gomp/pr46032-3.c |  2 +-
>  gcc/testsuite/gcc.dg/gomp/simd-2.c|  2 +-
>  gcc/testsuite/gcc.dg/gomp/simd-3.c|  2 +-
>  gcc/testsuite/gcc.dg/graphite/fuse-1.c|  2 +-
>  gcc/testsuite/gcc.dg/pr67089-6.c  |  2 +-
>  gcc/testsuite/gcc.dg/pr82929-2.c  |  2 +-
>  gcc/testsuite/gcc.dg/pr82929.c|  2 +-
>  gcc/testsuite/gcc.dg/store_merging_1.c|  2 +-
>  

RE: [PATCH 03/13] arm: Add test for PR target/101325

2021-09-28 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Gcc-patches  bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Christophe
> Lyon via Gcc-patches
> Sent: 07 September 2021 10:15
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH 03/13] arm: Add test for PR target/101325
> 
> This test is derived from the one provided in the PR: it is a
> compile-only test because I do not have access to anything that could
> execute it.  We can switch it to 'dg-do run' later, however it would
> be better to write a new executable test to ensure coverage in case
> the tester cannot execute such code (and it will need a new
> arm_v8_1m_mve_hw or similar effective-target).

The test is okay for now.
I think we'll want to have an arm_v8_1m_mve_hw target sooner or later.
Maybe Alex or Andrea can help to write one we can use?

Thanks,
Kyrill

> 
> 2021-09-01  Christophe Lyon  
> 
>   gcc/testsuite/
>   PR target/101325
>   * gcc.target/arm/simd/pr101325.c: New.
> 
> diff --git a/gcc/testsuite/gcc.target/arm/simd/pr101325.c
> b/gcc/testsuite/gcc.target/arm/simd/pr101325.c
> new file mode 100644
> index 000..a466683a0b1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/simd/pr101325.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-additional-options "-O3" } */
> +
> +#include <arm_mve.h>
> +
> +unsigned foo(int8x16_t v, int8x16_t w)
> +{
> +  return vcmpeqq (v, w);
> +}
> +/* { dg-final { scan-assembler {\tvcmp.i8  eq} } } */
> +/* { dg-final { scan-assembler {\tvmrs\t r[0-9]+, P0} } } */
> +/* { dg-final { scan-assembler {\tuxth} } } */
> --
> 2.25.1
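
As a reading aid for the test above, here is a plain-C model of what the comparison is expected to produce. It is only a sketch: it assumes an MVE predicate carries one bit per byte lane, so comparing sixteen 8-bit lanes gives a 16-bit mask, and `model_vcmpeqq_s8` is a hypothetical name, not an ACLE intrinsic.

```c
#include <stdint.h>

/* Hypothetical scalar model, not an ACLE intrinsic: bit i of the result
   is set when byte lanes v[i] and w[i] compare equal.  */
uint16_t
model_vcmpeqq_s8 (const int8_t v[16], const int8_t w[16])
{
  uint16_t mask = 0;

  for (int i = 0; i < 16; i++)
    if (v[i] == w[i])
      mask |= (uint16_t) (1u << i);
  return mask;
}
```

Widening such a 16-bit mask to the `unsigned` return type is presumably why the test also scans for `uxth`.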



RE: [PATCH 02/13] arm: Add tests for PR target/100757

2021-09-28 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Gcc-patches  bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Christophe
> Lyon via Gcc-patches
> Sent: 07 September 2021 10:15
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH 02/13] arm: Add tests for PR target/100757
> 
> These tests currently trigger an ICE which is fixed later in the patch
> series.
> 
> The pr100757*.c testcases are derived from
> gcc.c-torture/compile/20160205-1.c, forcing the use of MVE, and using
> various types and return values different from 0 and 1 to avoid
> commonalization with boolean masks.  In addition, since we should not
> need these masks, the tests make sure they are not present.

Ok, but I'd rather it was committed together with the patch that fixes the ICE.
I don't mind if it's a separate commit or rolled into that patch.

Thanks,
Kyrill
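
The `.word` values that the pr100757 tests scan for are float constants printed as decimal integers. A minimal sketch for checking them by hand, assuming `float` is IEEE 754 binary32 (`float_bits` is a hypothetical helper, not part of the patch):

```c
#include <stdint.h>
#include <string.h>

/* Return the bit pattern of F as an unsigned 32-bit integer.
   memcpy avoids the aliasing problems of a pointer cast.
   Assumes float is IEEE 754 binary32.  */
uint32_t
float_bits (float f)
{
  uint32_t u;

  memcpy (&u, &f, sizeof u);
  return u;
}
```

Under that assumption 2.0f is 1073741824, 3.0f is 1077936128, 4.0f is 1082130432 and 5.0f is 1084227584.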

> 
> 2021-09-01  Christophe Lyon  
> 
>   gcc/testsuite/
>   PR target/100757
>   * gcc.target/arm/simd/pr100757-2.c: New.
>   * gcc.target/arm/simd/pr100757-3.c: New.
>   * gcc.target/arm/simd/pr100757-4.c: New.
>   * gcc.target/arm/simd/pr100757.c: New.
> 
> diff --git a/gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
> b/gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
> new file mode 100644
> index 000..c2262b4d81e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
> +/* { dg-add-options arm_v8_1m_mve_fp } */
> +/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
> +/* Derived from gcc.c-torture/compile/20160205-1.c.  */
> +
> +float a[32];
> +int fn1(int d) {
> +  int c = 4;
> +  for (int b = 0; b < 32; b++)
> +if (a[b] != 2.0f)
> +  c = 5;
> +  return c;
> +}
> +
> +/* { dg-final { scan-assembler-times {\t.word\t1073741824\n} 4 } } */ /*
> Constant 2.0f.  */
> +/* { dg-final { scan-assembler-times {\t.word\t4\n} 4 } } */ /* Initial value
> for c.  */
> +/* { dg-final { scan-assembler-times {\t.word\t5\n} 4 } } */ /* Possible
> value for c.  */
> +/* { dg-final { scan-assembler-not {\t.word\t1\n} } } */ /* 'true' mask.  */
> +/* { dg-final { scan-assembler-not {\t.word\t0\n} } } */ /* 'false' mask.  */
> diff --git a/gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
> b/gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
> new file mode 100644
> index 000..e604555c04c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
> +/* { dg-add-options arm_v8_1m_mve_fp } */
> +/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
> +/* Copied from gcc.c-torture/compile/20160205-1.c.  */
> +
> +float a[32];
> +float fn1(int d) {
> +  float c = 4.0f;
> +  for (int b = 0; b < 32; b++)
> +if (a[b] != 2.0f)
> +  c = 5.0f;
> +  return c;
> +}
> +
> +/* { dg-final { scan-assembler-times {\t.word\t1073741824\n} 4 } } */ /*
> Constant 2.0f.  */
> +/* { dg-final { scan-assembler-times {\t.word\t1084227584\n} 4 } } */ /*
> Possible value for c (5.0).  */
> +/* { dg-final { scan-assembler-times {\t.word\t1082130432\n} 4 } } */ /*
> Initial value for c (4.0).  */
> +/* { dg-final { scan-assembler-not {\t.word\t1\n} } } */ /* 'true' mask.  */
> +/* { dg-final { scan-assembler-not {\t.word\t0\n} } } */ /* 'false' mask.  */
> diff --git a/gcc/testsuite/gcc.target/arm/simd/pr100757-4.c
> b/gcc/testsuite/gcc.target/arm/simd/pr100757-4.c
> new file mode 100644
> index 000..c12040c517f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/simd/pr100757-4.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-additional-options "-O3" } */
> +/* Derived from gcc.c-torture/compile/20160205-1.c.  */
> +
> +unsigned int a[32];
> +int fn1(int d) {
> +  int c = 2;
> +  for (int b = 0; b < 32; b++)
> +if (a[b])
> +  c = 3;
> +  return c;
> +}
> +
> +/* { dg-final { scan-assembler-times {\t.word\t0\n} 4 } } */ /* 'false' mask.
> */
> +/* { dg-final { scan-assembler-not {\t.word\t1\n} } } */ /* 'true' mask.  */
> +/* { dg-final { scan-assembler-times {\t.word\t2\n} 4 } } */ /* Initial value
> for c.  */
> +/* { dg-final { scan-assembler-times {\t.word\t3\n} 4 } } */ /* Possible
> value for c.  */
> diff --git a/gcc/testsuite/gcc.target/arm/simd/pr100757.c
> b/gcc/testsuite/gcc.target/arm/simd/pr100757.c
> new file mode 100644
> index 000..41d6e4e2d7a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/simd/pr100757.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-additional-options "-O3" } */
> +/* Derived from gcc.c-torture/compile/20160205-1.c.  */
> +
> +int a[32];
> +int fn1(int d) {
> +  int c = 2;
> +  for (int b = 0; b < 32; b++)
> +if (a[b])
> +  c = 3;

RE: [PATCH 01/13] arm: Add new tests for comparison vectorization with Neon and MVE

2021-09-28 Thread Kyrylo Tkachov via Gcc-patches
Hi Christophe,

Sorry for the delay.

> -Original Message-
> From: Gcc-patches  bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Christophe
> Lyon via Gcc-patches
> Sent: 07 September 2021 10:15
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH 01/13] arm: Add new tests for comparison vectorization
> with Neon and MVE
> 
> This patch mainly adds Neon tests similar to existing MVE ones,
> to make sure we do not break Neon when fixing MVE.
> 
> mve-vcmp-f32-2.c is similar to mve-vcmp-f32.c but uses a conditional
> with 2.0f and 3.0f constants to help scan-assembler-times.
> 
> 2021-09-01  Christophe Lyon 
> 
>   gcc/testsuite/
>   * gcc.target/arm/simd/mve-vcmp-f32-2.c: New.
>   * gcc.target/arm/simd/neon-compare-1.c: New.
>   * gcc.target/arm/simd/neon-compare-2.c: New.
>   * gcc.target/arm/simd/neon-compare-3.c: New.
>   * gcc.target/arm/simd/neon-compare-scalar-1.c: New.
>   * gcc.target/arm/simd/neon-vcmp-f16.c: New.
>   * gcc.target/arm/simd/neon-vcmp-f32-2.c: New.
>   * gcc.target/arm/simd/neon-vcmp-f32-3.c: New.
>   * gcc.target/arm/simd/neon-vcmp-f32.c: New.
>   * gcc.target/arm/simd/neon-vcmp.c: New.

Thanks,
Kyrill

> 
> diff --git a/gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c
> b/gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c
> new file mode 100644
> index 000..917a95bf141
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c
> @@ -0,0 +1,32 @@
> +/* { dg-do assemble } */
> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
> +/* { dg-add-options arm_v8_1m_mve_fp } */
> +/* { dg-additional-options "-O3 -funsafe-math-optimizations" } */
> +
> +#include 
> +
> +#define NB 4
> +
> +#define FUNC(OP, NAME)							\
> +  void test_ ## NAME ##_f (float * __restrict__ dest, float *a, float *b) { \
> +    int i;								\
> +    for (i=0; i<NB; i++) {						\
> +      dest[i] = (a[i] OP b[i]) ? 2.0f : 3.0f;				\
> +    }									\
> +  }
> +
> +FUNC(==, vcmpeq)
> +FUNC(!=, vcmpne)
> +FUNC(<, vcmplt)
> +FUNC(<=, vcmple)
> +FUNC(>, vcmpgt)
> +FUNC(>=, vcmpge)
> +
> +/* { dg-final { scan-assembler-times {\tvcmp.f32\teq, q[0-9]+, q[0-9]+\n}
> 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcmp.f32\tne, q[0-9]+, q[0-9]+\n}
> 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcmp.f32\tlt, q[0-9]+, q[0-9]+\n}
> 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcmp.f32\tle, q[0-9]+, q[0-9]+\n}
> 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcmp.f32\tgt, q[0-9]+, q[0-9]+\n}
> 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcmp.f32\tge, q[0-9]+, q[0-9]+\n}
> 1 } } */
> +/* { dg-final { scan-assembler-times {\t.word\t1073741824\n} 24 } } */ /*
> Constant 2.0f.  */
> +/* { dg-final { scan-assembler-times {\t.word\t1077936128\n} 24 } } */ /*
> Constant 3.0f.  */
> diff --git a/gcc/testsuite/gcc.target/arm/simd/neon-compare-1.c
> b/gcc/testsuite/gcc.target/arm/simd/neon-compare-1.c
> new file mode 100644
> index 000..2e0222a71f2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/simd/neon-compare-1.c
> @@ -0,0 +1,78 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_neon_ok } */
> +/* { dg-add-options arm_neon } */
> +/* { dg-additional-options "-O3" } */
> +
> +#include "mve-compare-1.c"
> +
> +/* 64-bit vectors.  */
> +/* vmvn is used by 'ne' comparisons: 3 sizes * 2 (signed/unsigned) * 2
> +   (register/zero) = 12.  */
> +/* { dg-final { scan-assembler-times {\tvmvn\td[0-9]+, d[0-9]+\n} 12 } } */
> +
> +/* { 8 bits } x { eq, ne, lt, le, gt, ge }. */
> +/* ne uses eq, lt/le only apply to comparison with zero, they use gt/ge
> +   otherwise.  */
> +/* { dg-final { scan-assembler-times {\tvceq.i8\td[0-9]+, d[0-9]+, d[0-9]+\n}
> 4 } } */
> +/* { dg-final { scan-assembler-times {\tvceq.i8\td[0-9]+, d[0-9]+, #0\n} 4 } 
> }
> */
> +/* { dg-final { scan-assembler-times {\tvclt.s8\td[0-9]+, d[0-9]+, #0\n} 1 } 
> }
> */
> +/* { dg-final { scan-assembler-times {\tvcle.s8\td[0-9]+, d[0-9]+, #0\n} 1 } 
> }
> */
> +/* { dg-final { scan-assembler-times {\tvcgt.s8\td[0-9]+, d[0-9]+, d[0-9]+\n}
> 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.s8\td[0-9]+, d[0-9]+, #0\n} 1 } 
> }
> */
> +/* { dg-final { scan-assembler-times {\tvcge.s8\td[0-9]+, d[0-9]+, d[0-9]+\n}
> 2 } } */
> +/* { dg-final { scan-assembler-times {\tvcge.s8\td[0-9]+, d[0-9]+, #0\n} 1 } 
> }
> */
> +
> +/* { 16 bits } x { eq, ne, lt, le, gt, ge }. */
> +/* { dg-final { scan-assembler-times {\tvceq.i16\td[0-9]+, d[0-9]+, d[0-
> 9]+\n} 4 } } */
> +/* { dg-final { scan-assembler-times {\tvceq.i16\td[0-9]+, d[0-9]+, #0\n}
> 4 } } */
> +/* { dg-final { scan-assembler-times {\tvclt.s16\td[0-9]+, d[0-9]+, #0\n} 1 
> } }
> */
> +/* { dg-final { scan-assembler-times {\tvcle.s16\td[0-9]+, d[0-9]+, #0\n}
> 1 } } */
> +/* { dg-final { scan-assembler-times {\tvcgt.s16\td[0-9]+, d[0-9]+, d[0-

Re: [PATCH] Control all jump threading passes with -fjump-threads.

2021-09-28 Thread Richard Biener via Gcc-patches
On Tue, Sep 28, 2021 at 11:42 AM Aldy Hernandez  wrote:
>
>
>
> On 9/28/21 9:41 AM, Richard Biener wrote:
> > On Tue, Sep 28, 2021 at 8:29 AM Jeff Law via Gcc-patches
> >  wrote:
> >>
> >>
> >>
> >> On 9/28/2021 12:17 AM, Aldy Hernandez wrote:
> >>> On Tue, Sep 28, 2021 at 3:46 AM Jeff Law  wrote:
> 
> 
>  On 9/27/2021 9:00 AM, Aldy Hernandez wrote:
> > Last year I mentioned that -fthread-jumps was being ignored by the
> > majority of our jump threading passes, and Jeff said he'd be in favor
> > of fixing this.
> >
> > This patch remedies the situation, but it does change existing behavior.
> > Currently -fthread-jumps is only enabled for -O2, -O3, and -Os.  This
> > means that even if we restricted all jump threading passes with
> > -fthread-jumps, DOM jump threading would still seep through since it
> > runs at -O1.
> >
> > I propose this patch, but it does mean that DOM jump threading would
> > have to be explicitly enabled with -O1 -fthread-jumps.  An
> > alternative would be to also offer a specific -fno-dom-threading, but
> > that seems icky.
> >
> > OK pending tests?
> >
> > gcc/ChangeLog:
> >
> > * tree-ssa-threadbackward.c (pass_thread_jumps::gate): Check
> > flag_thread_jumps.
> > (pass_early_thread_jumps::gate): Same.
> > * tree-ssa-threadedge.c (jump_threader::thread_outgoing_edges):
> > Return if !flag_thread_jumps.
> > * tree-ssa-threadupdate.c
> > (jt_path_registry::register_jump_thread): Assert that
> > flag_thread_jumps is true.
>  OK.  Clearly this is going to be even better once we disentangle
>  threading from DOM.
> >>> Annoyingly, I had to tweak a few more tests, particularly some
> >>> -Wuninitialized -O1 ones which seem to depend on DOM jump threading to
> >>> give proper diagnostics.  It seems that every change to jump threading
> >>> needs tweaks to the Wuninitialized code :-(.
> >> Well, a lot of jump threading is there to help eliminate false positives
> >> from Wuninitialized by eliminating paths through the CFG that we can
> >> prove never execute at runtime.  So that's not a huge surprise.
> >
> > I would have suggested to enable -fthread-jumps at -O1 instead
> > and eventually just add && flag_expensive_optimizations to the
> > use in cfgcleanup.c to restrict that to -O2+
>
> Hmmm, that's a much better idea.  I was afraid of messing with existing
> behavior, but I suppose adding even more false positives for -O1
> -Wuninitialized is worse.
>
> BTW, I plugged one more tweak to the registry in
> remove_jump_threads_including.  No need to go add things to the removed
> edges hash table, if we're not going to thread.
>
> OK pending tests?

OK.

Richard.

> Aldy
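
To illustrate the point made in the thread about threading and -Wuninitialized, here is a small sketch of the kind of code involved. It is an illustration, not a reproducer: whether a particular GCC release warns on it at -O1 depends on the version and the options in effect, and `guarded_use` is a made-up name.

```c
/* X is assigned only when FLAG is nonzero, and read only when FLAG is
   nonzero.  */
int
guarded_use (int flag)
{
  int x;

  if (flag)
    x = 42;

  /* Unrelated work could sit here; the two tests of FLAG are what a
     jump threader can connect.  */

  if (flag)
    return x;
  return 0;
}
```

Once the path on which the first test is false is threaded straight to `return 0`, no path remains on which `x` is read without having been set, so a diagnostic for it would be a false positive.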


[PATCH] tree-optimization/99793 - testcase for the PR

2021-09-28 Thread Richard Biener via Gcc-patches
This adds a testcase for the PR which was fixed with the fix for
PR100112.

Tested on x86_64-unknown-linux-gnu, pushed.

2021-09-28  Richard Biener  

PR tree-optimization/99793
* gcc.dg/tree-ssa/pr99793.c: New testcase.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr99793.c | 14 ++
 1 file changed, 14 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr99793.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr99793.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr99793.c
new file mode 100644
index 000..912744928e5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr99793.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fstrict-aliasing -fdump-tree-optimized" } */
+
+extern void foo(void);
+static int a, *b = , c, *d = 
+int main()
+{
+  int **e = 
+  if (!((unsigned)((*e = d) == 0) - (*b = 1)))
+foo();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-not "foo" "optimized" } } */
-- 
2.31.1


Re: [PATCH] i386: Don't emit fldpi etc. if -frounding-math [PR102498]

2021-09-28 Thread Uros Bizjak via Gcc-patches
On Tue, Sep 28, 2021 at 11:33 AM Jakub Jelinek  wrote:
>
> Hi!
>
> i387 has instructions to store some transcendental numbers into the top of
> stack.  The problem is that what exact bit in the last place one gets for
> those depends on the current rounding mode, the CPU knows the number with
> slightly higher precision.  The compiler assumes rounding to nearest when
> comparing them against constants in the IL, but at runtime the rounding
> can be different and so some of these depending on rounding mode and the
> constant could be 1 ulp higher or smaller than expected.
> We only support changing the rounding mode at runtime if the non-default
> -frounding-math option is used, so the following patch just disables
> using those constants if that flag is on.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2021-09-28  Jakub Jelinek  
>
> PR target/102498
> * config/i386/i386.c (standard_80387_constant_p): Don't recognize
> special 80387 instruction XFmode constants if flag_rounding_math.
>
> * gcc.target/i386/pr102498.c: New test.

OK.

Thanks,
Uros.

>
> --- gcc/config/i386/i386.c.jj   2021-09-18 09:44:31.720743823 +0200
> +++ gcc/config/i386/i386.c  2021-09-27 16:55:37.928072249 +0200
> @@ -5035,7 +5035,8 @@ standard_80387_constant_p (rtx x)
>/* For XFmode constants, try to find a special 80387 instruction when
>   optimizing for size or on those CPUs that benefit from them.  */
>if (mode == XFmode
> -  && (optimize_function_for_size_p (cfun) || TARGET_EXT_80387_CONSTANTS))
> +  && (optimize_function_for_size_p (cfun) || TARGET_EXT_80387_CONSTANTS)
> +  && !flag_rounding_math)
>  {
>int i;
>
> --- gcc/testsuite/gcc.target/i386/pr102498.c.jj 2021-09-27 17:09:30.387509264 
> +0200
> +++ gcc/testsuite/gcc.target/i386/pr102498.c2021-09-27 17:09:22.548618148 
> +0200
> @@ -0,0 +1,59 @@
> +/* PR target/102498 */
> +/* { dg-do run { target fenv } } */
> +/* { dg-options "-frounding-math" } */
> +
> +#include <fenv.h>
> +#include <stdlib.h>
> +
> +__attribute__((noipa)) long double
> +fldlg2 (void)
> +{
> +  return 0.3010299956639811952256464283594894482L;
> +}
> +
> +__attribute__((noipa)) long double
> +fldln2 (void)
> +{
> +  return 0.6931471805599453094286904741849753009L;
> +}
> +
> +__attribute__((noipa)) long double
> +fldl2e (void)
> +{
> +  return 1.4426950408889634073876517827983434472L;
> +}
> +
> +__attribute__((noipa)) long double
> +fldl2t (void)
> +{
> +  return 3.3219280948873623478083405569094566090L;
> +}
> +
> +__attribute__((noipa)) long double
> +fldpi (void)
> +{
> +  return 3.1415926535897932385128089594061862044L;
> +}
> +
> +int
> +main ()
> +{
> +  long double a = fldlg2 ();
> +  long double b = fldln2 ();
> +  long double c = fldl2e ();
> +  long double d = fldl2t ();
> +  long double e = fldpi ();
> +  static int f[] = { FE_TONEAREST, FE_TOWARDZERO, FE_UPWARD, FE_DOWNWARD };
> +  int i;
> +  for (i = 0; i < 4; i++)
> +{
> +  fesetround (f[i]);
> +  if (a != fldlg2 ()
> + || b != fldln2 ()
> + || c != fldl2e ()
> + || d != fldl2t ()
> + || e != fldpi ())
> +   abort ();
> +}
> +  return 0;
> +}
>
> Jakub
>
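
The effect described in the patch, a value held with extra precision landing on a different last bit depending on the rounding mode, can be sketched with integer arithmetic alone. This is a simplified model under stated assumptions, not the x87 behaviour itself: the names are hypothetical, the value is treated as non-negative, and the caller is expected to pass 0 < drop < 64.

```c
#include <stdint.h>

enum round_mode { ROUND_NEAREST, ROUND_TOWARD_ZERO, ROUND_UP, ROUND_DOWN };

/* Round the non-negative value WIDE by discarding its low DROP bits
   (0 < DROP < 64) according to MODE.  */
uint64_t
round_off (uint64_t wide, unsigned drop, enum round_mode mode)
{
  uint64_t kept = wide >> drop;
  uint64_t rest = wide & ((UINT64_C (1) << drop) - 1);
  uint64_t half = UINT64_C (1) << (drop - 1);

  switch (mode)
    {
    case ROUND_UP:
      /* Any discarded nonzero bit pushes the result up.  */
      return kept + (rest != 0);
    case ROUND_NEAREST:
      /* Ties go to the even neighbour.  */
      if (rest > half || (rest == half && (kept & 1)))
	return kept + 1;
      return kept;
    case ROUND_TOWARD_ZERO:
    case ROUND_DOWN:
    default:
      /* For a non-negative value both modes truncate.  */
      return kept;
    }
}
```

With the made-up input 0x1234567B and four bits dropped, rounding down gives 0x1234567 while rounding up or to nearest gives 0x1234568, which is the 1 ulp difference the description refers to.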


[PATCH] Improve jump threading dump output.

2021-09-28 Thread Aldy Hernandez via Gcc-patches
In analyzing PR102511, it has become abundantly clear that we need
better debugging aids for the jump threader solver.  Currently
debugging these issues is a nightmare if you're not intimately
familiar with the code.  This patch attempts to improve this.

First, I'm enabling path solver dumps with TDF_THREADING.  None of the
available TDF_* flags are a good match, and using TDF_DETAILS would blow
up the dump file, since both threaders continually call the solver to
try out candidates.  This will allow dumping path solver details without
having to resort to hacking the source.

I am also dumping the current registered_jump_thread dbg counter used
by the registry, in the solver.  That way narrowing down a problematic
thread can then be examined by -fdump-*-threading and looking at the
solver details surrounding the appropriate counter (which the dbgcnt
also dumps to the dump file).

You still need knowledge of the solver to debug these issues, but at
least now it's not entirely opaque.

OK?

gcc/ChangeLog:

* dbgcnt.c (dbg_cnt_counter): New.
* dbgcnt.h (dbg_cnt_counter): New.
* dumpfile.c (dump_options): Add entry for TDF_THREADING.
* dumpfile.h (enum dump_flag): Add TDF_THREADING.
* gimple-range-path.cc (DEBUG_SOLVER): Use TDF_THREADING.
* tree-ssa-threadupdate.c (dump_jump_thread_path): Dump out
debug counter.
---
 gcc/dbgcnt.c|  8 
 gcc/dbgcnt.h|  1 +
 gcc/dumpfile.c  |  1 +
 gcc/dumpfile.h  |  3 +++
 gcc/gimple-range-path.cc|  2 +-
 gcc/tree-ssa-threadupdate.c | 13 +
 6 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/gcc/dbgcnt.c b/gcc/dbgcnt.c
index 934bbe033ee..6a7eb34cd3e 100644
--- a/gcc/dbgcnt.c
+++ b/gcc/dbgcnt.c
@@ -98,6 +98,14 @@ dbg_cnt (enum debug_counter index)
 return false;
 }
 
+/* Return the counter for INDEX.  */
+
+unsigned
+dbg_cnt_counter (enum debug_counter index)
+{
+  return count[index];
+}
+
 /* Compare limit_tuple intervals by first item in descending order.  */
 
 static int
diff --git a/gcc/dbgcnt.h b/gcc/dbgcnt.h
index 17f2091f5a7..3c35dcc3e0a 100644
--- a/gcc/dbgcnt.h
+++ b/gcc/dbgcnt.h
@@ -33,6 +33,7 @@ enum debug_counter {
 
 extern bool dbg_cnt_is_enabled (enum debug_counter index);
 extern bool dbg_cnt (enum debug_counter index);
+extern unsigned dbg_cnt_counter (enum debug_counter index);
 extern void dbg_cnt_process_opt (const char *arg);
 extern void dbg_cnt_list_all_counters (void);
 
diff --git a/gcc/dumpfile.c b/gcc/dumpfile.c
index 8169daf7f59..e6ead5debe5 100644
--- a/gcc/dumpfile.c
+++ b/gcc/dumpfile.c
@@ -145,6 +145,7 @@ static const kv_pair dump_options[] =
   {"missed", MSG_MISSED_OPTIMIZATION},
   {"note", MSG_NOTE},
   {"optall", MSG_ALL_KINDS},
+  {"threading", TDF_THREADING},
   {"all", dump_flags_t (TDF_ALL_VALUES
& ~(TDF_RAW | TDF_SLIM | TDF_LINENO | TDF_GRAPH
| TDF_STMTADDR | TDF_RHS_ONLY | TDF_NOUID
diff --git a/gcc/dumpfile.h b/gcc/dumpfile.h
index 892bfc9ae90..6c7758dd2fb 100644
--- a/gcc/dumpfile.h
+++ b/gcc/dumpfile.h
@@ -197,6 +197,9 @@ enum dump_flag
   /* For error.  */
   TDF_ERROR = (1 << 26),
 
+  /* Dumping for range path solver.  */
+  TDF_THREADING = (1 << 27),
+
   /* All values.  */
   TDF_ALL_VALUES = (1 << 29) - 1
 };
diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc
index 9da67d2a35b..a29d5318ca9 100644
--- a/gcc/gimple-range-path.cc
+++ b/gcc/gimple-range-path.cc
@@ -34,7 +34,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-iterator.h"
 
 // Internal construct to help facilitate debugging of solver.
-#define DEBUG_SOLVER (0 && dump_file)
+#define DEBUG_SOLVER (dump_file && dump_flags & TDF_THREADING)
 
 path_range_query::path_range_query (gimple_ranger &ranger, bool resolve)
   : m_ranger (ranger)
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index cf96c903668..905dea2e6ca 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -218,10 +218,15 @@ dump_jump_thread_path (FILE *dump_file,
   const vec<jump_thread_edge *> &path,
   bool registering)
 {
-  fprintf (dump_file,
-  "  %s jump thread: (%d, %d) incoming edge; ",
-  (registering ? "Registering" : "Cancelling"),
-  path[0]->e->src->index, path[0]->e->dest->index);
+  if (registering)
+fprintf (dump_file,
+"  [%u] Registering jump thread: (%d, %d) incoming edge; ",
+dbg_cnt_counter (registered_jump_thread),
+path[0]->e->src->index, path[0]->e->dest->index);
+  else
+fprintf (dump_file,
+"  Cancelling jump thread: (%d, %d) incoming edge; ",
+path[0]->e->src->index, path[0]->e->dest->index);
 
   for (unsigned int i = 1; i < path.length (); i++)
 {
-- 
2.31.1
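
For readers unfamiliar with the dump-flag idiom that the new DEBUG_SOLVER definition relies on, here is a standalone sketch. The enum values and the function name are hypothetical and chosen for illustration; GCC's real flags live in dumpfile.h.

```c
#include <stdio.h>

/* Hypothetical flag values for illustration only.  */
enum sketch_dump_flag
{
  SKETCH_TDF_DETAILS = 1 << 0,
  SKETCH_TDF_THREADING = 1 << 1
};

/* Nonzero when solver dumps should be emitted: a dump stream must be
   open and the threading bit must be set.  '&' binds tighter than
   '&&', so the parentheses only make the grouping explicit.  */
int
solver_dump_enabled (FILE *stream, unsigned flags)
{
  return stream != NULL && (flags & SKETCH_TDF_THREADING) != 0;
}
```

Keeping the solver output behind its own bit means a detailed dump can be requested without also getting the solver's output for every candidate path.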


