Re: [PATCH 2/7] PowerPC tests: Add PLI/PADDI tests.

2020-06-26 Thread Michael Meissner via Gcc-patches
On Thu, Jun 25, 2020 at 11:52:50AM -0500, Segher Boessenkool wrote:
> On Mon, Jun 01, 2020 at 03:53:37PM -0400, Michael Meissner wrote:
> > Add tests for -mcpu=future that test the generation of PADDI (and PLI which
> > becomes PADDI).
> > 
> > 2020-06-01  Michael Meissner  
> > 
> > * gcc.target/powerpc/prefix-add.c: New test.
> > * gcc.target/powerpc/prefix-si-constant.c: New test.
> > * gcc.target/powerpc/prefix-di-constant.c: New test.
> 
> This is okay for trunk (with required changes: -mdejagnu-cpu=power10,
> and the selector names have changed to be something with power10 instead
> of something with future; please retest before commit).

Since all of these tests are for code that is in GCC 10, can I apply these
patches to the GCC 10 branch after the completion of the -mcpu=power10 changes
for GCC 10 and a suitable waiting period and retest?

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH 5/7, V2] PowerPC tests: Prefixed insn with large offsets

2020-06-26 Thread Michael Meissner via Gcc-patches
On Thu, Jun 25, 2020 at 12:09:41PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Jun 04, 2020 at 01:03:51PM -0400, Michael Meissner wrote:
> > [PATCH 5/7, V2] PowerPC tests: Prefixed insn with large offsets
> > 
> > Add tests to make sure for -mcpu=future that prefixed load/store 
> > instructions
> > are generated if the offset is larger than 16 bits.  The only difference
> > is I reworded the comments, based on a suggestion by Will Schmidt.
> 
> That is the difference to v1?  Please mark this up clearly -- there are
> many examples of how you can do this, on the GCC lists even.
> 
> > +/* { dg-final { scan-assembler-times {\mpaddi\M|\mpli|\mpla\M} 3 } } */
> 
> Is there are reason you don't have \M on pli, or is that an oversight?

Just an oversight.

> Okay for trunk with that looked at, and the necessary "no future"
> changes (and retesting ofc).  Thanks!

I simplified the test to use just the instruction (PLI) that is currently
generated.  I was just trying to be general to accomidate possible future code
generation changes (i.e. instead of loading up the offset with PLI, we could
potentionally do a PADDI to the base register that was loaded), but we can
change the test if/when we change the compiler.

The test in question was for loading up _Decimal32, which we don't have a
D*-form instruction that we can use, and we must always do an X-form
instruction to load up the 32-bit value into the vector register.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH 7/7] PowerPC test: Add prefixed stack protect test

2020-06-26 Thread Michael Meissner via Gcc-patches
On Thu, Jun 25, 2020 at 12:18:42PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Mon, Jun 01, 2020 at 03:53:42PM -0400, Michael Meissner wrote:
> > Test that stack protection generates prefixed stack instructions if you are
> > using large stack frame for -mcpu=future.
> > 
> > 2020-06-01  Michael Meissner  
> > 
> > * gcc.target/powerpc/prefix-stack-protect.c: New test.
> 
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target powerpc_prefixed_addr } */
> 
> Is this test necessary anymore, does -mcpu=power10 not guarantee that?

I believe the test is necessary to prevent regressions.  As I said in the test,
we stumbled on this problem when building GLIBC with -mcpu=future/power10.

> Okay for trunk with that looked at.  Thanks!

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH 2/7] PowerPC tests: Add PLI/PADDI tests.

2020-06-26 Thread Michael Meissner via Gcc-patches
On Thu, Jun 25, 2020 at 11:52:50AM -0500, Segher Boessenkool wrote:
> On Mon, Jun 01, 2020 at 03:53:37PM -0400, Michael Meissner wrote:
> > Add tests for -mcpu=future that test the generation of PADDI (and PLI which
> > becomes PADDI).
> > 
> > 2020-06-01  Michael Meissner  
> > 
> > * gcc.target/powerpc/prefix-add.c: New test.
> > * gcc.target/powerpc/prefix-si-constant.c: New test.
> > * gcc.target/powerpc/prefix-di-constant.c: New test.
> 
> This is okay for trunk (with required changes: -mdejagnu-cpu=power10,
> and the selector names have changed to be something with power10 instead
> of something with future; please retest before commit).

Done.  I was just about to resubmit the patches with the -mcpu=power10 changes.
I did retest on power8/power9 little endian, and power8 big endian (both
32/64-bit).  In running the tests, I discovered 3 of the tests that needed
ILP64 to test the particular code that I was looking for that I previously
hadn't flagged.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH] Fix target clone indirection elimination.

2020-06-26 Thread Yichao Yu via Gcc-patches
On Fri, Jun 26, 2020 at 5:51 PM Jeff Law  wrote:
>
> On Mon, 2020-06-22 at 11:24 +0200, Richard Biener via Gcc-patches wrote:
> > On Sun, Jun 21, 2020 at 2:32 AM Yichao Yu via Gcc-patches
> >  wrote:
> > > On Sat, Jun 20, 2020 at 8:16 PM Yichao Yu  wrote:
> > > > On Sat, Jun 20, 2020 at 3:41 PM Yichao Yu  wrote:
> > > > > On Sat, Jun 20, 2020 at 3:26 PM Yichao Yu  wrote:
> > > > > > On Sat, Jun 20, 2020 at 3:25 PM H.J. Lu  wrote:
> > > > > > > On Sat, Jun 20, 2020 at 12:20 PM Yichao Yu  
> > > > > > > wrote:
> > > > > > > > On Sat, Jun 20, 2020 at 3:12 PM H.J. Lu  
> > > > > > > > wrote:
> > > > > > > > > On Sat, Jun 20, 2020 at 11:37 AM Yichao Yu 
> > > > > > > > >  wrote:
> > > > > > > > > > On Sat, Jun 20, 2020 at 1:54 PM Yichao Yu 
> > > > > > > > > >  wrote:
> > > > > > > > > > > > > The current logic seems to be comparing the whole 
> > > > > > > > > > > > > attribute tree between the callee
> > > > > > > > > > > > > and caller (or at least the tree starting from the 
> > > > > > > > > > > > > target attribute).
> > > > > > > > > > > > > This is unnecessary and causes strange dependency of 
> > > > > > > > > > > > > the indirection
> > > > > > > > > > > > > elimination on unrelated properties like 
> > > > > > > > > > > > > `noinline`(PR95780) and `visibility`(PR95778).
> > > > > > > > > > > >
> > > > > > > > > > > > Does it fix PR95780 and PR95778?  Can you include 
> > > > > > > > > > > > testcases for them?
> > > > > > > > > >
> > > > > > > > > > OK so replacing
> > > > > > > > > > https://github.com/gcc-mirror/gcc/commit/b8ce8129a560f64f8b2855c4a3812b7c3c0ebf3f#diff-e2d535917af8555baad2e9c8749e96a5
> > > > > > > > > > with/adding to the test the following one should work. I 
> > > > > > > > > > still
> > > > > > > > > > couldn't get test to run though..
> > > > > > > > > >
> > > > > > > > > Can you try the enclosed testcases?
> > > > > > > >
> > > > > > > > The assembly produced are the following with my patch and if I
> > > > > > > > understand it correctly those should work. Unfortunately I 
> > > > > > > > don't know
> > > > > > > > how to actually run the test as a test (if that makes 
> > > > > > > > sense).
> > > > > > > >
> > > > > > >
> > > > > > > Place 2 files under gcc/testsuite/gcc.target/i386 and in GCC 
> > > > > > > build directory, do
> > > > > > >
> > > > > > > $ make check-gcc RUNTESTFLAGS="--target_board='unix{-m32,}'
> > > > > > > i386.exp=pr95778-*.c"
> > > > > > >
> > > > >
> > > > > Finally got it start running after installing dejagnu...
> > > > > It seems to be runing something unrelated though
> > > > >
> > > > > e.g. Running 
> > > > > /home/yuyichao/projects/contrib/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/builtins.exp
> > > > > ...
> > > > > Running 
> > > > > /home/yuyichao/projects/contrib/gcc/gcc/testsuite/gcc.c-torture/execute/execute.exp
> > > >
> > > > OK it seems that the `-*.c` syntax doesn't work for me somehow, there
> > > > are still a lot of unrelated outputs but when I replaced the * with 1
> > > > and 2 separately I can confirm that there are only expected pass with
> > > > my patch and there are some unexpected failures when the patch is
> > > > reverted.
> > >
> > > And the updated patch is
> >
> > OK.
> I fixed up the obvious formatting goofs and constructed a ChangeLog and have
> pushed the change to the trunk.

Thank you.

> jeff
> >
>


Re: [PATCH] tree-ssa-threadbackward.c (profitable_jump_thread_path): Do not allow __builtin_constant_p.

2020-06-26 Thread Ilya Leoshkevich via Gcc-patches
On Fri, 2020-06-26 at 16:04 -0600, Jeff Law wrote:
> On Fri, 2020-06-26 at 23:54 +0200, Ilya Leoshkevich wrote:
> > How should this work ideally?  Suppose we have:
> > 
> > static inline void foo (int i)
> > {
> >   if (__builtin_is_constant_p (i))
> > asm volatile ("bar" :: "i" (i))
> >   else
> > asm volatile ("baz" :: "d" (i));
> > }
> > 
> > First of all, this is a supported use case, right?
> Yes, this is a supported case.
> 
> > Then, the way I see it, is that at least for a while there must
> > exist
> > trees like the ones above, regardless of whether their asm
> > arguments
> > match constraints.  But ultimately dead code elimination should get
> > rid
> > of the wrong ones before they reach RTL.
> > E.g. in the example above, the non-inlined version should have
> > `!__builtin_is_constant_p (i)`, so `bar` should not survive until
> > RTL.  The same for inlined `foo (1)` version's `baz`.
> In theory yes, but there are cases where paths converge (like you've
> shown) where
> you may have evaluated to a constant on the paths, but it's not a
> constant at the
> convergence point.  You have to be very careful using b_c_p like this
> and it's
> been a regular source of kernel bugs.

Is there something specific that a compiler user should look out for?
For example, here is the kernel code, from which the test was derived:

static inline void atomic_add(int i, atomic_t *v)
{
#ifdef CONFIG_HAVE_MARCH_Z196_FEATURES
if (__builtin_constant_p(i) && (i > -129) && (i < 128)) {
__atomic_add_const(i, >counter);
return;
}
#endif
__atomic_add(i, >counter);
}

It looks very straightforward - can there still be something wrong
with its usage of b_c_p?

> I'd recommend looking at the .ssa dump and walk forward from there if
> the .ssa
> dump looks correct.
> 
> jeff

Well, 021t.ssa already has:

__attribute__((gnu_inline))
__atomic_add_const (intD.6 valD.2269, intD.6 * ptrD.2270)
{
  intD.6 val_3(D) = valD.2269;
  intD.6 * ptr_2(D) = ptrD.2270;
;;   basic block 2, loop depth 0, maybe hot
;;prev block 0, next block 1, flags: (NEW)
;;pred:   ENTRY (FALLTHRU)
  # .MEM_4 = VDEF <.MEM_1(D)>
  __asm__ __volatile__("asi %0,%1
" : "ptr" "=Q" *ptr_2(D) : "val" "i" val_3(D), "Q" *ptr_2(D) :
"memory", "cc");
  # VUSE <.MEM_4>
  return;
;;succ:   EXIT

}

which is, strictly speaking, not correct, because val_3(D) and
valD.2269 are not constant.  But as far as I understand we are willing
to tolerate trees like this until a certain point.

What is this point supposed to be?  If I understood you right, 
106t.thread1 is already too late - why is it so?



Re: [PATCH] arc: add exceptions for PR92860.

2020-06-26 Thread Jeff Law via Gcc-patches
On Wed, 2020-06-24 at 09:43 +0200, Martin Liška wrote:
> Hey.
> 
> The patch is about addition of some exceptions for arc target that
> address:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92860#c26
> 
> It's again another example where optimization options influence target
> options.
> 
> Ready for master?
> Martin
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/92860
>   * optc-save-gen.awk: Add exceptions for arc target.
It doens't look like you're explicitly handling the OPT_mmillicode case.  Was
that intentional?  Regardless, this patch is OK as-is or with the millicode 
stuff
added.

jeff
> 



Re: [PATCH, rs6000] Add support to enable vmsumudm behind vec_msum builtin.

2020-06-26 Thread Segher Boessenkool
Hi!

On Fri, Jun 26, 2020 at 01:20:43PM -0500, will schmidt wrote:
>   * config/rs6000/altivec.h (vec_vmsumudm): New define.
>   * config/rs6000/altivec.md (UNSPEC_VMSUMUDM): New unspec.
>   (altivec_vmsumudm): New define_insn.
>   * config/rs6000/rs6000-builtin.def (altivec_vmsumudm): New
>   BU_ALTIVEC_3 entry. (vmsumudm): New BU_ALTIVEC_OVERLOAD_3 
>   entry.

> +(define_insn "altivec_vmsumudm"
> +  [(set (match_operand:V1TI 0 "register_operand" "=v")
> + (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v")
> +   (match_operand:V2DI 2 "register_operand" "v")
> +   (match_operand:V1TI 3 "register_operand" "v")]
> +  UNSPEC_VMSUMUDM))]
> +  "TARGET_P8_VECTOR"
> +  "vmsumudm %0,%1,%2,%3"
> +  [(set_attr "type" "veccomplex")])

I wonder if it would be better to actually describe what the insn does,
instead of using an unspec.  All similar insns are like this already of
course, it's not something that needs to be fixed right now.

TARGET_P8_VECTOR is wrong (it is ISA 3.0B).

> +The @code{vec_msum} functions perform a vector multiply-sum, returning
> +the result of arg1*arg2+arg3.  ISA 3.0 adds support for vec_msum returning
> +a vector int128 result.

Well, that doesn't describe the horizontal addition it does?  The muls
are widening, and two adjacent results are added (together with the
corresponding elt in arg3).

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-msum.c
> @@ -0,0 +1,25 @@
> +/* Verify that overloaded built-ins for vec_msum with __int128
> +   inputs generate the proper code.  */
> +
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_p8vector_ok } */
> +/* { dg-options "-mdejagnu-cpu=power8  -O3" } */

This needs to change to power9 as well?

Thanks,


Segher


Re: [PATCH] Linux/i386: Remove SUBTARGET_FRAME_POINTER_REQUIRED

2020-06-26 Thread Jeff Law via Gcc-patches
On Fri, 2020-06-12 at 05:56 -0700, H.J. Lu wrote:
> On Thu, Jun 11, 2020 at 8:24 PM Jeff Law  wrote:
> > On Sat, 2020-05-02 at 04:55 -0700, H.J. Lu wrote:
> > > Currently patchable area is at the wrong place.  It is placed immediately
> > > after function label, before both .cfi_startproc and ENDBR.  This patch
> > > adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and
> > > changes ENDBR insertion pass to also insert patchable area instruction.
> > > TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY is defined to avoid placing
> > > patchable area before .cfi_startproc and ENDBR.
> > > 
> > > OK for master?
> > > 
> > > Thanks.
> > > 
> > > H.J.
> > > ---
> > > gcc/
> > > 
> > >   PR target/93492
> > >   * config/i386/i386-features.c (rest_of_insert_endbranch):
> > >   Renamed to ...
> > >   (rest_of_insert_endbr_and_patchable_area): Change return type
> > >   to void. Add need_endbr and patchable_area_size arguments.
> > >   Don't call timevar_push nor timevar_pop.  Replace
> > >   endbr_queued_at_entrance with insn_queued_at_entrance.  Insert
> > >   UNSPECV_PATCHABLE_AREA for patchable area.
> > >   (pass_data_insert_endbranch): Renamed to ...
> > >   (pass_data_insert_endbr_and_patchable_area): This.  Change
> > >   pass name to endbr_and_patchable_area.
> > >   (pass_insert_endbranch): Renamed to ...
> > >   (pass_insert_endbr_and_patchable_area): This.  Add need_endbr
> > >   and patchable_area_size;.
> > >   (pass_insert_endbr_and_patchable_area::gate): Set and check
> > >   need_endbr and patchable_area_size.
> > >   (pass_insert_endbr_and_patchable_area::execute): Call
> > >   timevar_push and timevar_pop.  Pass need_endbr and
> > >   patchable_area_size to rest_of_insert_endbr_and_patchable_area.
> > >   (make_pass_insert_endbranch): Renamed to ...
> > >   (make_pass_insert_endbr_and_patchable_area): This.
> > >   * config/i386/i386-passes.def: Replace pass_insert_endbranch
> > >   with pass_insert_endbr_and_patchable_area.
> > >   * config/i386/i386-protos.h (ix86_output_patchable_area): New.
> > >   (make_pass_insert_endbranch): Renamed to ...
> > >   (make_pass_insert_endbr_and_patchable_area): This.
> > >   * config/i386/i386.c (ix86_asm_output_function_label): Set
> > >   function_label_emitted to true.
> > >   (ix86_print_patchable_function_entry): New function.
> > >   (ix86_output_patchable_area): Likewise.
> > >   (x86_function_profiler): Replace endbr_queued_at_entrance with
> > >   insn_queued_at_entrance.  Generate ENDBR only for TYPE_ENDBR.
> > >   Call ix86_output_patchable_area to generate patchable area if
> > >   needed.
> > >   (TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY): New.
> > >   * i386.h (queued_insn_type): New.
> > >   (machine_function): Add function_label_emitted.  Replace
> > >   endbr_queued_at_entrance with insn_queued_at_entrance.
> > >   * config/i386/i386.md (UNSPECV_PATCHABLE_AREA): New.
> > >   (patchable_area): New.
> > > 
> > > gcc/testsuite/
> > > 
> > >   PR target/93492
> > >   * gcc.target/i386/pr93492-1.c: New test.
> > >   * gcc.target/i386/pr93492-2.c: Likewise.
> > >   * gcc.target/i386/pr93492-3.c: Likewise.
> > >   * gcc.target/i386/pr93492-4.c: Likewise.
> > >   * gcc.target/i386/pr93492-5.c: Likewise.
> > OK
> > jeff
> 
> My patch triggered a latent x86 backend bug.   This patch fixes
> it.  OK for master?
OK
jeff



Re: [PATCH] tree-ssa-threadbackward.c (profitable_jump_thread_path): Do not allow __builtin_constant_p.

2020-06-26 Thread Jeff Law via Gcc-patches
On Fri, 2020-06-26 at 23:54 +0200, Ilya Leoshkevich wrote:
> 
> How should this work ideally?  Suppose we have:
> 
> static inline void foo (int i)
> {
>   if (__builtin_is_constant_p (i))
> asm volatile ("bar" :: "i" (i))
>   else
> asm volatile ("baz" :: "d" (i));
> }
> 
> First of all, this is a supported use case, right?
Yes, this is a supported case.

> 
> Then, the way I see it, is that at least for a while there must exist
> trees like the ones above, regardless of whether their asm arguments
> match constraints.  But ultimately dead code elimination should get rid
> of the wrong ones before they reach RTL.

> 
> E.g. in the example above, the non-inlined version should have
> `!__builtin_is_constant_p (i)`, so `bar` should not survive until
> RTL.  The same for inlined `foo (1)` version's `baz`.
In theory yes, but there are cases where paths converge (like you've shown) 
where
you may have evaluated to a constant on the paths, but it's not a constant at 
the
convergence point.  You have to be very careful using b_c_p like this and it's
been a regular source of kernel bugs.


I'd recommend looking at the .ssa dump and walk forward from there if the .ssa
dump looks correct.

jeff



Re: [PING][PATCH] underline null argument in -Wnonnull (PR c++/86568)

2020-06-26 Thread Jeff Law via Gcc-patches
On Mon, 2020-06-22 at 12:02 -0600, Martin Sebor wrote:
> Ping: https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547415.html
> 
> Jason already approved the C++ changes (with a couple of minor
> tweaks).  I'm still looking for an approval of the corresponding
> middle end diff.
Do you need an auto_digantosic_group in the tree-ssa-ccp.c changes too?  If so,
add it.  If not, then it's OK as-is.

jeff
> 



Re: [PATCH] tree-ssa-threadbackward.c (profitable_jump_thread_path): Do not allow __builtin_constant_p.

2020-06-26 Thread Ilya Leoshkevich via Gcc-patches
On Fri, 2020-06-26 at 15:34 -0600, Jeff Law wrote:
> On Fri, 2020-06-26 at 23:17 +0200, Ilya Leoshkevich wrote:
> > On Fri, 2020-06-26 at 14:27 -0600, Jeff Law wrote:
> > > On Thu, 2020-06-04 at 02:37 +0200, Ilya Leoshkevich via Gcc-
> > > patches
> > > wrote:
> > > > Bootstrapped and regtested on x86_64-redhat-linux, ppc64le-
> > > > redhat-
> > > > linux
> > > > and s390x-redhat-linux.
> > > > 
> > > > 
> > > > Linux Kernel (specifically, drivers/leds/trigger/ledtrig-cpu.c)
> > > > build
> > > > with GCC 10 fails on s390 with "impossible constraint".
> > > > 
> > > > The problem is that jump threading makes __builtin_constant_p
> > > > lie
> > > > when
> > > > an expression in question appears to be a constant on a
> > > > threading
> > > > path.
> > > What do you mean by this?
> > > 
> > > What should happen is the path where the queried object is
> > > constant,
> > > builtin_constant_p will return true.  On the other path where it
> > > is
> > > not,
> > > builtin_constant_p will return false.
> > 
> > But what if a path ends before we use the `builtin_constant_p`
> > result?
> > In the testcase from the patch we have `i = is_active ? 1 : -1`,
> > which 
> > produces two paths.  On each path `builtin_constant_p (i)` returns 
> > true, however, the asm statement which depends on this being true
> > is
> > not on either path and comes only later.
> > 
> > Here are the relevant tree dumps (sorry, a bit verbose, I couldn't
> > make
> > them smaller):
> > 
> > gcc/cc1 ../gcc/testsuite/gcc.target/s390/builtin-constant-p-
> > threading.c 
> > -O2 -march=z196 -mzarch -fdump-tree-all-all
> > 
> > Before 106t.thread1:
> > 
> > ;;   basic block 2
> >   if (is_active_2(D) != 0)
> > goto ; [50.00%]
> >   else
> > goto ; [50.00%]
> > 
> > ;;   basic block 3
> >   # iftmp.0_6 = PHI <1(2)>
> >   _7 = __builtin_constant_pD.1098 (iftmp.0_6);
> >   if (_7 != 0)
> > goto ; [50.00%]
> >   else
> > goto ; [50.00%]
> > 
> > ;;   basic block 4
> >   # iftmp.0_4 = PHI <-1(2)>
> >   _12 = __builtin_constant_pD.1098 (iftmp.0_4);
> >   if (_12 != 0)
> > goto ; [50.00%]
> >   else
> > goto ; [50.00%]
> > 
> > ;;   basic block 5
> >   if (iftmp.0_4 >= -128)
> > goto ; [20.00%]
> >   else
> > goto ; [80.00%]
> > 
> > ;;   basic block 6
> >   if (iftmp.0_6 <= 127)
> > goto ; [12.00%]
> >   else
> > goto ; [88.00%]
> > 
> > ;;   basic block 7
> >   # iftmp.0_13 = PHI 
> >   # .MEM_10 = VDEF <.MEM_3(D)>
> >   __asm__ __volatile__("asi %0,%1
> > " : "ptr" "=Q" MEM[(intD.6 *)_active_cpusD.2277] : "val" "i"
> > iftmp.0_13, "Q" MEM[(intD.6 *)_active_cpusD.2277] : "memory",
> > "cc");
> >   goto ; [100.00%]
> > ;;succ:   9 [always]  count:91268056 (estimated locally)
> > (FALLTHRU,EXECUTABLE)
> This looks broken already.  iftmp.0_13 is not a constant prior to
> thread1 and
> thus, IIUC is not suitable for use in this instruction.  Threading
> just
> simplified things, but it's just as broken before threading as it is
> after
> threading.

How should this work ideally?  Suppose we have:

static inline void foo (int i)
{
  if (__builtin_is_constant_p (i))
asm volatile ("bar" :: "i" (i))
  else
asm volatile ("baz" :: "d" (i));
}

First of all, this is a supported use case, right?

Then, the way I see it, is that at least for a while there must exist
trees like the ones above, regardless of whether their asm arguments
match constraints.  But ultimately dead code elimination should get rid
of the wrong ones before they reach RTL.

E.g. in the example above, the non-inlined version should have
`!__builtin_is_constant_p (i)`, so `bar` should not survive until
RTL.  The same for inlined `foo (1)` version's `baz`.

Is 106t.thread1 already too late for such trees?  Should some earlier
pass have cleaned them up already?

> Jeff



Re: [PATCH] Fix target clone indirection elimination.

2020-06-26 Thread Jeff Law via Gcc-patches
On Mon, 2020-06-22 at 11:24 +0200, Richard Biener via Gcc-patches wrote:
> On Sun, Jun 21, 2020 at 2:32 AM Yichao Yu via Gcc-patches
>  wrote:
> > On Sat, Jun 20, 2020 at 8:16 PM Yichao Yu  wrote:
> > > On Sat, Jun 20, 2020 at 3:41 PM Yichao Yu  wrote:
> > > > On Sat, Jun 20, 2020 at 3:26 PM Yichao Yu  wrote:
> > > > > On Sat, Jun 20, 2020 at 3:25 PM H.J. Lu  wrote:
> > > > > > On Sat, Jun 20, 2020 at 12:20 PM Yichao Yu  
> > > > > > wrote:
> > > > > > > On Sat, Jun 20, 2020 at 3:12 PM H.J. Lu  
> > > > > > > wrote:
> > > > > > > > On Sat, Jun 20, 2020 at 11:37 AM Yichao Yu  
> > > > > > > > wrote:
> > > > > > > > > On Sat, Jun 20, 2020 at 1:54 PM Yichao Yu  
> > > > > > > > > wrote:
> > > > > > > > > > > > The current logic seems to be comparing the whole 
> > > > > > > > > > > > attribute tree between the callee
> > > > > > > > > > > > and caller (or at least the tree starting from the 
> > > > > > > > > > > > target attribute).
> > > > > > > > > > > > This is unnecessary and causes strange dependency of 
> > > > > > > > > > > > the indirection
> > > > > > > > > > > > elimination on unrelated properties like 
> > > > > > > > > > > > `noinline`(PR95780) and `visibility`(PR95778).
> > > > > > > > > > > 
> > > > > > > > > > > Does it fix PR95780 and PR95778?  Can you include 
> > > > > > > > > > > testcases for them?
> > > > > > > > > 
> > > > > > > > > OK so replacing
> > > > > > > > > https://github.com/gcc-mirror/gcc/commit/b8ce8129a560f64f8b2855c4a3812b7c3c0ebf3f#diff-e2d535917af8555baad2e9c8749e96a5
> > > > > > > > > with/adding to the test the following one should work. I still
> > > > > > > > > couldn't get test to run though..
> > > > > > > > > 
> > > > > > > > Can you try the enclosed testcases?
> > > > > > > 
> > > > > > > The assembly produced are the following with my patch and if I
> > > > > > > understand it correctly those should work. Unfortunately I don't 
> > > > > > > know
> > > > > > > how to actually run the test as a test (if that makes sense).
> > > > > > > 
> > > > > > 
> > > > > > Place 2 files under gcc/testsuite/gcc.target/i386 and in GCC build 
> > > > > > directory, do
> > > > > > 
> > > > > > $ make check-gcc RUNTESTFLAGS="--target_board='unix{-m32,}'
> > > > > > i386.exp=pr95778-*.c"
> > > > > > 
> > > > 
> > > > Finally got it start running after installing dejagnu...
> > > > It seems to be runing something unrelated though
> > > > 
> > > > e.g. Running 
> > > > /home/yuyichao/projects/contrib/gcc/gcc/testsuite/gcc.c-torture/execute/builtins/builtins.exp
> > > > ...
> > > > Running 
> > > > /home/yuyichao/projects/contrib/gcc/gcc/testsuite/gcc.c-torture/execute/execute.exp
> > > 
> > > OK it seems that the `-*.c` syntax doesn't work for me somehow, there
> > > are still a lot of unrelated outputs but when I replaced the * with 1
> > > and 2 separately I can confirm that there are only expected pass with
> > > my patch and there are some unexpected failures when the patch is
> > > reverted.
> > 
> > And the updated patch is
> 
> OK.
I fixed up the obvious formatting goofs and constructed a ChangeLog and have
pushed the change to the trunk.
jeff
> 



Re: [PATCH v3] c++: Fix CTAD for aggregates in template [PR95568]

2020-06-26 Thread Marek Polacek via Gcc-patches
On Fri, Jun 26, 2020 at 04:19:08PM -0400, Jason Merrill wrote:
> Please also test scoped enum.

Here:

-- >8 --
95568 complains that CTAD for aggregates doesn't work within
requires-clause and it turned out that it doesn't work when we try
the deduction in a template.  The reason is that maybe_aggr_guide
creates a guide that can look like this

  template X(decltype (X::x))-> X

where the parameter is a decltype, which is a non-deduced context.  So
the subsequent build_new_function_call fails because unify_one_argument
can't deduce anything from it ([temp.deduct.type]: "If a template
parameter is used only in non-deduced contexts and is not explicitly
specified, template argument deduction fails.")

Those decltypes come from finish_decltype_type.  We can just use
TREE_TYPE instead.  I pondered using unlowered_expr_type, but that
didn't make any difference for the FIELD_DECLs I saw in
class-deduction-aggr6.C.

gcc/cp/ChangeLog:

PR c++/95568
* pt.c (collect_ctor_idx_types): Use TREE_TYPE.

gcc/testsuite/ChangeLog:

PR c++/95568
* g++.dg/cpp2a/class-deduction-aggr5.C: New test.
* g++.dg/cpp2a/class-deduction-aggr6.C: New test.
---
 gcc/cp/pt.c   |  2 +-
 .../g++.dg/cpp2a/class-deduction-aggr5.C  | 20 +++
 .../g++.dg/cpp2a/class-deduction-aggr6.C  | 35 +++
 3 files changed, 56 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/class-deduction-aggr5.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/class-deduction-aggr6.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 53a64c3a15e..618bf68b2d6 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -28329,7 +28329,7 @@ collect_ctor_idx_types (tree ctor, tree list, tree elt 
= NULL_TREE)
   tree idx, val; unsigned i;
   FOR_EACH_CONSTRUCTOR_ELT (v, i, idx, val)
 {
-  tree ftype = elt ? elt : finish_decltype_type (idx, true, tf_none);
+  tree ftype = elt ? elt : TREE_TYPE (idx);
   if (BRACE_ENCLOSED_INITIALIZER_P (val)
  && CONSTRUCTOR_NELTS (val)
  /* As in reshape_init_r, a non-aggregate or array-of-dependent-bound
diff --git a/gcc/testsuite/g++.dg/cpp2a/class-deduction-aggr5.C 
b/gcc/testsuite/g++.dg/cpp2a/class-deduction-aggr5.C
new file mode 100644
index 000..01253f42006
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/class-deduction-aggr5.C
@@ -0,0 +1,20 @@
+// PR c++/95568
+// { dg-do compile { target c++20 } }
+
+template struct X { T x; };
+template struct X2 { T x; U y; };
+template concept Y = requires { X{0}; };
+
+template
+void g()
+{
+  X{0};
+  X2{1, 2.2};
+  Y auto y = X{1};
+}
+
+void
+fn ()
+{
+  g();
+}
diff --git a/gcc/testsuite/g++.dg/cpp2a/class-deduction-aggr6.C 
b/gcc/testsuite/g++.dg/cpp2a/class-deduction-aggr6.C
new file mode 100644
index 000..95d7c5eec18
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/class-deduction-aggr6.C
@@ -0,0 +1,35 @@
+// PR c++/95568
+// { dg-do compile { target c++20 } }
+// CTAD with aggregates containing bit-fields.
+
+template struct same_type;
+template struct same_type {};
+
+enum E { e };
+enum class F { f };
+
+template
+struct X {
+  T a : 5;
+};
+
+template
+void g()
+{
+  auto x = X{ 0 };
+  same_type();
+  auto x2 = X{ E::e };
+  same_type();
+  auto x3 = X{ false };
+  same_type();
+  auto x4 = X{ 0u };
+  same_type();
+  auto x5 = X{ F::f };
+  same_type();
+}
+
+void
+fn ()
+{
+  g();
+}

base-commit: 0801f419440c14f6772b28f763ad7d40f7f7a580
-- 
2.26.2



Re: [PATCH] tree-ssa-threadbackward.c (profitable_jump_thread_path): Do not allow __builtin_constant_p.

2020-06-26 Thread Jeff Law via Gcc-patches
On Fri, 2020-06-26 at 23:17 +0200, Ilya Leoshkevich wrote:
> On Fri, 2020-06-26 at 14:27 -0600, Jeff Law wrote:
> > On Thu, 2020-06-04 at 02:37 +0200, Ilya Leoshkevich via Gcc-patches
> > wrote:
> > > Bootstrapped and regtested on x86_64-redhat-linux, ppc64le-redhat-
> > > linux
> > > and s390x-redhat-linux.
> > > 
> > > 
> > > Linux Kernel (specifically, drivers/leds/trigger/ledtrig-cpu.c)
> > > build
> > > with GCC 10 fails on s390 with "impossible constraint".
> > > 
> > > The problem is that jump threading makes __builtin_constant_p lie
> > > when
> > > an expression in question appears to be a constant on a threading
> > > path.
> > What do you mean by this?
> > 
> > What should happen is the path where the queried object is constant,
> > builtin_constant_p will return true.  On the other path where it is
> > not,
> > builtin_constant_p will return false.
> 
> But what if a path ends before we use the `builtin_constant_p` result?
> In the testcase from the patch we have `i = is_active ? 1 : -1`, which 
> produces two paths.  On each path `builtin_constant_p (i)` returns 
> true, however, the asm statement which depends on this being true is
> not on either path and comes only later.
> 
> Here are the relevant tree dumps (sorry, a bit verbose, I couldn't make
> them smaller):
> 
> gcc/cc1 ../gcc/testsuite/gcc.target/s390/builtin-constant-p-threading.c 
> -O2 -march=z196 -mzarch -fdump-tree-all-all
> 
> Before 106t.thread1:
> 
> ;;   basic block 2
>   if (is_active_2(D) != 0)
> goto ; [50.00%]
>   else
> goto ; [50.00%]
> 
> ;;   basic block 3
>   # iftmp.0_6 = PHI <1(2)>
>   _7 = __builtin_constant_pD.1098 (iftmp.0_6);
>   if (_7 != 0)
> goto ; [50.00%]
>   else
> goto ; [50.00%]
> 
> ;;   basic block 4
>   # iftmp.0_4 = PHI <-1(2)>
>   _12 = __builtin_constant_pD.1098 (iftmp.0_4);
>   if (_12 != 0)
> goto ; [50.00%]
>   else
> goto ; [50.00%]
> 
> ;;   basic block 5
>   if (iftmp.0_4 >= -128)
> goto ; [20.00%]
>   else
> goto ; [80.00%]
> 
> ;;   basic block 6
>   if (iftmp.0_6 <= 127)
> goto ; [12.00%]
>   else
> goto ; [88.00%]
> 
> ;;   basic block 7
>   # iftmp.0_13 = PHI 
>   # .MEM_10 = VDEF <.MEM_3(D)>
>   __asm__ __volatile__("asi %0,%1
> " : "ptr" "=Q" MEM[(intD.6 *)_active_cpusD.2277] : "val" "i"
> iftmp.0_13, "Q" MEM[(intD.6 *)_active_cpusD.2277] : "memory",
> "cc");
>   goto ; [100.00%]
> ;;succ:   9 [always]  count:91268056 (estimated locally)
> (FALLTHRU,EXECUTABLE)
This looks broken already.  iftmp.0_13 is not a constant prior to thread1 and
thus, IIUC is not suitable for use in this instruction.  Threading just
simplified things, but it's just as broken before threading as it is after
threading.


Jeff
> 



Re: Move simplification of statements using ranges into its own class.

2020-06-26 Thread Jeff Law via Gcc-patches
On Thu, 2020-06-18 at 13:24 +0200, Aldy Hernandez wrote:
> Howdy.
> 
> This moves all the simplification code from vr_values into a separate 
> class (simplify_using_ranges).  In doing so, we get rid of a bunch of 
> dependencies on the internals of vr_values.  The goal is to (a) remove 
> unnecessary interdependendcies (b) be able to use this engine with any 
> range infrastructure, as all it needs is a method to get the range for 
> an SSA name (get_value_range).
> 
> I also removed as many dependencies on value_range_equiv as possible, 
> preferring value_range.  A few value_range_equiv uses remain, but for 
> cases where equivalences are actually used (folding conditionals, etc).
> 
> The plan is to use it like this:
> 
>simplify_using_ranges simplifier (vr_values_thinggie);
>simplifier.simplify (gsi);
> 
> OK?
OK
jeff
> 



Re: [PATCH] tree-ssa-threadbackward.c (profitable_jump_thread_path): Do not allow __builtin_constant_p.

2020-06-26 Thread Ilya Leoshkevich via Gcc-patches
On Fri, 2020-06-26 at 14:27 -0600, Jeff Law wrote:
> On Thu, 2020-06-04 at 02:37 +0200, Ilya Leoshkevich via Gcc-patches
> wrote:
> > Bootstrapped and regtested on x86_64-redhat-linux, ppc64le-redhat-
> > linux
> > and s390x-redhat-linux.
> > 
> > 
> > Linux Kernel (specifically, drivers/leds/trigger/ledtrig-cpu.c)
> > build
> > with GCC 10 fails on s390 with "impossible constraint".
> > 
> > The problem is that jump threading makes __builtin_constant_p lie
> > when
> > an expression in question appears to be a constant on a threading
> > path.
> What do you mean by this?
> 
> What should happen is the path where the queried object is constant,
> builtin_constant_p will return true.  On the other path where it is
> not,
> builtin_constant_p will return false.

But what if a path ends before we use the `builtin_constant_p` result?
In the testcase from the patch we have `i = is_active ? 1 : -1`, which 
produces two paths.  On each path `builtin_constant_p (i)` returns 
true, however, the asm statement which depends on this being true is
not on either path and comes only later.

Here are the relevant tree dumps (sorry, a bit verbose, I couldn't make
them smaller):

gcc/cc1 ../gcc/testsuite/gcc.target/s390/builtin-constant-p-threading.c 
-O2 -march=z196 -mzarch -fdump-tree-all-all

Before 106t.thread1:

;;   basic block 2
  if (is_active_2(D) != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

;;   basic block 3
  # iftmp.0_6 = PHI <1(2)>
  _7 = __builtin_constant_pD.1098 (iftmp.0_6);
  if (_7 != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

;;   basic block 4
  # iftmp.0_4 = PHI <-1(2)>
  _12 = __builtin_constant_pD.1098 (iftmp.0_4);
  if (_12 != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

;;   basic block 5
  if (iftmp.0_4 >= -128)
goto ; [20.00%]
  else
goto ; [80.00%]

;;   basic block 6
  if (iftmp.0_6 <= 127)
goto ; [12.00%]
  else
goto ; [88.00%]

;;   basic block 7
  # iftmp.0_13 = PHI 
  # .MEM_10 = VDEF <.MEM_3(D)>
  __asm__ __volatile__("asi %0,%1
" : "ptr" "=Q" MEM[(intD.6 *)_active_cpusD.2277] : "val" "i"
iftmp.0_13, "Q" MEM[(intD.6 *)_active_cpusD.2277] : "memory",
"cc");
  goto ; [100.00%]
;;succ:   9 [always]  count:91268056 (estimated locally)
(FALLTHRU,EXECUTABLE)

;;   basic block 8
  # iftmp.0_14 = PHI 
  # .MEM_11 = VDEF <.MEM_3(D)>
  __asm__ __volatile__("laa %0,%2,%1
" : "old" "=d" old_8, "ptr" "=Q" MEM[(intD.6 *)_active_cpusD.2277]
: "val" "d" iftmp.0_14, "Q" MEM[(intD.6 *)_active_cpusD.2277] :
"memory", "cc");

`asi` instruction (bb 7) can be used only with constants, and that's 
what `__builtin_constant_p` ensures here.  For other values `laa`
instruction (block 8) must be used.

106t.thread1 says:

about to thread: path: 2 -> 4, 4 -> 5, 5 -> 6
about to thread: path: 3 -> 4, 4 -> 5, 5 -> 6, 6 -> 7

and as a result produces this:

;;   basic block 2
  if (is_active_2(D) != 0)
goto ; [50.00%]
  else
goto ; [50.00%]

;;   basic block 3

;;   basic block 4
  # iftmp.0_13 = PHI <-1(2), 1(3)>
  # .MEM_10 = VDEF <.MEM_3(D)>
  __asm__ __volatile__("asi %0,%1
" : "ptr" "=Q" MEM[(intD.6 *)_active_cpusD.2277] : "val" "i"
iftmp.0_13, "Q" MEM[(intD.6 *)_active_cpusD.2277] : "memory",
"cc");

This code won't compile - `iftmp.0_13` is not a constant.  My idea was
that the root cause is that `i` is constant on each threading path,
but is not constant anymore when they merge.  But maybe there is a
deeper issue?  What would be a better way to fix this?

> > Fix by disallowing __builtin_constant_p on threading paths.
> > 
> > gcc/ChangeLog:
> > 
> > 2020-06-03  Ilya Leoshkevich  
> > 
> > * tree-ssa-threadbackward.c
> > (thread_jumps::profitable_jump_thread_path):
> > Do not allow __builtin_constant_p on a threading path.
> Sorry, this is wrong.
> 
> jeff
> 
> 



[PATCH] x86-64: Define ASM_OUTPUT_ALIGNED_DECL_LOCAL

2020-06-26 Thread H.J. Lu via Gcc-patches
Define ASM_OUTPUT_ALIGNED_DECL_LOCAL for large local common symbol.

gcc/ChangeLog:

* config/i386/x86-64.h (ASM_OUTPUT_ALIGNED_DECL_LOCAL): New.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr95620.c: New test.
---
 gcc/config/i386/x86-64.h| 11 +++
 gcc/testsuite/gcc.target/i386/pr95620.c | 19 +++
 2 files changed, 30 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr95620.c

diff --git a/gcc/config/i386/x86-64.h b/gcc/config/i386/x86-64.h
index 88db428f592..0c5b8af5a13 100644
--- a/gcc/config/i386/x86-64.h
+++ b/gcc/config/i386/x86-64.h
@@ -59,6 +59,17 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If 
not, see
 #define ASM_OUTPUT_ALIGNED_DECL_COMMON(FILE, DECL, NAME, SIZE, ALIGN)  
\
   x86_elf_aligned_decl_common (FILE, DECL, NAME, SIZE, ALIGN);
 
+#undef  ASM_OUTPUT_ALIGNED_DECL_LOCAL
+#define ASM_OUTPUT_ALIGNED_DECL_LOCAL(FILE, DECL, NAME, SIZE, ALIGN)  \
+  do \
+{\
+  fprintf ((FILE), "%s", LOCAL_ASM_OP);  \
+  assemble_name ((FILE), (NAME));\
+  fprintf ((FILE), "\n");\
+  ASM_OUTPUT_ALIGNED_DECL_COMMON (FILE, DECL, NAME, SIZE, ALIGN); \
+}\
+  while (0)
+
 /* This is used to align code labels according to Intel recommendations.  */
 
 #define SUBALIGN_LOG 3
diff --git a/gcc/testsuite/gcc.target/i386/pr95620.c 
b/gcc/testsuite/gcc.target/i386/pr95620.c
new file mode 100644
index 000..ac9b40383e7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr95620.c
@@ -0,0 +1,19 @@
+/* { dg-do assemble { target lp64 } } */
+/* { dg-require-effective-target fopenmp } */
+/* { dg-require-effective-target lto } */
+/* { dg-options "-flto -fopenmp -mcmodel=medium" } */
+
+double a[353783808];
+int b, c, d;
+
+int
+main()
+{
+  for (; b;)
+#pragma omp parallel
+a[c] = 1;
+  for (;; b++)
+if (a[c])
+  d++;
+  return 0;
+}
-- 
2.26.2



Re: [PATCH] tree-ssa-threadbackward.c (profitable_jump_thread_path): Do not allow __builtin_constant_p.

2020-06-26 Thread Jeff Law via Gcc-patches
On Thu, 2020-06-04 at 02:37 +0200, Ilya Leoshkevich via Gcc-patches wrote:
> Bootstrapped and regtested on x86_64-redhat-linux, ppc64le-redhat-linux
> and s390x-redhat-linux.
> 
> 
> Linux Kernel (specifically, drivers/leds/trigger/ledtrig-cpu.c) build
> with GCC 10 fails on s390 with "impossible constraint".
> 
> The problem is that jump threading makes __builtin_constant_p lie when
> an expression in question appears to be a constant on a threading path.
What do you mean by this?

What should happen is the path where the queried object is constant,
builtin_constant_p will return true.  On the other path where it is not,
builtin_constant_p will return false.

> 
> Fix by disallowing __builtin_constant_p on threading paths.
> 
> gcc/ChangeLog:
> 
> 2020-06-03  Ilya Leoshkevich  
> 
>   * tree-ssa-threadbackward.c (thread_jumps::profitable_jump_thread_path):
>   Do not allow __builtin_constant_p on a threading path.
Sorry, this is wrong.

jeff


> 



Re: [PATCH v2] c++: Fix CTAD for aggregates in template [PR95568]

2020-06-26 Thread Jason Merrill via Gcc-patches

On 6/25/20 5:37 PM, Marek Polacek wrote:

On Wed, Jun 24, 2020 at 03:52:12PM -0400, Jason Merrill via Gcc-patches wrote:

On 6/23/20 6:58 PM, Marek Polacek wrote:

95568 complains that CTAD for aggregates doesn't work within
requires-clause and it turned out that it doesn't work when we try
the deduction in a template.  The reason is that maybe_aggr_guide
creates a guide that can look like this

template X(decltype (X::x))-> X


Then that's the bug; there's no reason the guide should be different just
because we're trying to do the deduction in a template.

I'm not sure why I used finish_decltype_type there.


Aha.  Here's a v2 using just TREE_TYPE; in the patch I explain why I didn't
go with unlowered_expr_type.  Which is fully expected since
is_bitfield_expr_with_lowered_type doesn't do anything with FIELD_DECLs.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
95568 complains that CTAD for aggregates doesn't work within
requires-clause and it turned out that it doesn't work when we try
the deduction in a template.  The reason is that maybe_aggr_guide
creates a guide that can look like this

   template X(decltype (X::x))-> X

where the parameter is a decltype, which is a non-deduced context.  So
the subsequent build_new_function_call fails because unify_one_argument
can't deduce anything from it ([temp.deduct.type]: "If a template
parameter is used only in non-deduced contexts and is not explicitly
specified, template argument deduction fails.")

Those decltypes come from finish_decltype_type.  We can just use
TREE_TYPE instead.  I pondered using unlowered_expr_type, but that
didn't make any difference for the FIELD_DECLs I saw in
class-deduction-aggr6.C.


Please also test scoped enum.


gcc/cp/ChangeLog:

PR c++/95568
* pt.c (collect_ctor_idx_types): Use TREE_TYPE.

gcc/testsuite/ChangeLog:

PR c++/95568
* g++.dg/cpp2a/class-deduction-aggr5.C: New test.
* g++.dg/cpp2a/class-deduction-aggr6.C: New test.
---
  gcc/cp/pt.c   |  2 +-
  .../g++.dg/cpp2a/class-deduction-aggr5.C  | 20 
  .../g++.dg/cpp2a/class-deduction-aggr6.C  | 32 +++
  3 files changed, 53 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/class-deduction-aggr5.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/class-deduction-aggr6.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 53a64c3a15e..618bf68b2d6 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -28329,7 +28329,7 @@ collect_ctor_idx_types (tree ctor, tree list, tree elt 
= NULL_TREE)
tree idx, val; unsigned i;
FOR_EACH_CONSTRUCTOR_ELT (v, i, idx, val)
  {
-  tree ftype = elt ? elt : finish_decltype_type (idx, true, tf_none);
+  tree ftype = elt ? elt : TREE_TYPE (idx);
if (BRACE_ENCLOSED_INITIALIZER_P (val)
  && CONSTRUCTOR_NELTS (val)
  /* As in reshape_init_r, a non-aggregate or array-of-dependent-bound
diff --git a/gcc/testsuite/g++.dg/cpp2a/class-deduction-aggr5.C 
b/gcc/testsuite/g++.dg/cpp2a/class-deduction-aggr5.C
new file mode 100644
index 000..01253f42006
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/class-deduction-aggr5.C
@@ -0,0 +1,20 @@
+// PR c++/95568
+// { dg-do compile { target c++20 } }
+
+template struct X { T x; };
+template struct X2 { T x; U y; };
+template concept Y = requires { X{0}; };
+
+template
+void g()
+{
+  X{0};
+  X2{1, 2.2};
+  Y auto y = X{1};
+}
+
+void
+fn ()
+{
+  g();
+}
diff --git a/gcc/testsuite/g++.dg/cpp2a/class-deduction-aggr6.C 
b/gcc/testsuite/g++.dg/cpp2a/class-deduction-aggr6.C
new file mode 100644
index 000..5332eb93bbe
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/class-deduction-aggr6.C
@@ -0,0 +1,32 @@
+// PR c++/95568
+// { dg-do compile { target c++20 } }
+// CTAD with aggregates containing bit-fields.
+
+template struct same_type;
+template struct same_type {};
+
+enum E { e };
+
+template
+struct X {
+  T a : 5;
+};
+
+template
+void g()
+{
+  auto x = X{ 0 };
+  same_type();
+  auto x2 = X{ E::e };
+  same_type();
+  auto x3 = X{ false };
+  same_type();
+  auto x4 = X{ 0u };
+  same_type();
+}
+
+void
+fn ()
+{
+  g();
+}

base-commit: 77d455ee81ec3a23f8b20259a31ab963716f8e82





[PATCH] Add TARGET_LOWER_LOCAL_DECL_ALIGNMENT [PR95237]

2020-06-26 Thread H.J. Lu via Gcc-patches
On Thu, Jun 25, 2020 at 1:10 AM Richard Biener
 wrote:
>
> On Thu, Jun 25, 2020 at 2:53 AM Sunil Pandey  wrote:
> >
> > On Wed, Jun 24, 2020 at 12:30 AM Richard Biener
> >  wrote:
> > >
> > > On Tue, Jun 23, 2020 at 5:31 PM Sunil K Pandey via Gcc-patches
> > >  wrote:
> > > >
> > > > From: Sunil K Pandey 
> > > >
> > > > Default for this hook is NOP. For x86, in 32 bit mode, this hook
> > > > sets alignment of long long on stack to 32 bits if preferred stack
> > > > boundary is 32 bits.
> > > >
> > > >  - This patch fixes
> > > > gcc.target/i386/pr69454-2.c
> > > > gcc.target/i386/stackalign/longlong-1.c
> > > >  - Regression test on x86-64, no new fail introduced.
> > >
> > > I think the name is badly chosen, TARGET_LOWER_LOCAL_DECL_ALIGNMENT
> >
> > Yes, I can change the target hook name.
> >
> > > would be better suited (and then asks for LOCAL_DECL_ALIGNMENT to be
> > > renamed to INCREASE_LOCAL_DECL_ALIGNMENT).
> >
> > It seems like LOCAL_DECL_ALIGNMENT macro documentation is incorrect.
> > It increases as well as decreases alignment based on condition(-m32
> > -mpreferred-stack-boundary=2)
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95885
> >
> > >
> > > You're calling it from do_type_align which IMHO is dangerous since that's
> > > invoked from FIELD_DECL layout as well.  Instead invoke it from
> > > layout_decl itself where we do
> > >
> > >   if (code != FIELD_DECL)
> > > /* For non-fields, update the alignment from the type.  */
> > > do_type_align (type, decl);
> > >
> > > and invoke the hook _after_ do_type_align.  Also avoid
> > > invoking the hook on globals or hard regs and only
> > > invoke it on VAR_DECLs, thus only
> > >
> > >   if (VAR_P (decl) && !is_global_var (decl) && !DECL_HARD_REGISTER (decl))
> >
> > It seems like decl property is not fully populated at this point call
> > to is_global_var (decl) on global variable return false.
> >
> > $ cat foo.c
> > long long x;
> > int main()
> > {
> > if (__alignof__(x) != 8)
> >   __builtin_abort();
> > return 0;
> > }
> >
> > Breakpoint 1, layout_decl (decl=0x77ffbb40, known_align=0)
> > at /local/skpandey/gccwork/gccwork/gcc/gcc/stor-layout.c:674
> > 674 do_type_align (type, decl);
> > Missing separate debuginfos, use: dnf debuginfo-install
> > gmp-6.1.2-10.fc31.x86_64 isl-0.16.1-9.fc31.x86_64
> > libmpc-1.1.0-4.fc31.x86_64 mpfr-3.1.6-5.fc31.x86_64
> > zlib-1.2.11-20.fc31.x86_64
> > (gdb) call debug_tree(decl)
> >   > type  > size 
> > unit-size 
> > align:64 warn_if_not_align:0 symtab:0 alias-set -1
> > canonical-type 0x7fffea801888 precision:64 min  > 0x7fffea7e8fd8 -9223372036854775808> max  > 9223372036854775807>
> > pointer_to_this >
> > DI foo.c:1:11 size  unit-size
> > 
> > align:1 warn_if_not_align:0>
> >
> > (gdb) p is_global_var(decl)
> > $1 = false
> > (gdb)
> >
> >
> > What about calling hook here
> >
> >  603 do_type_align (tree type, tree decl)
> >  604 {
> >  605   if (TYPE_ALIGN (type) > DECL_ALIGN (decl))
> >  606 {
> >  607   SET_DECL_ALIGN (decl, TYPE_ALIGN (type));
> >  608   if (TREE_CODE (decl) == FIELD_DECL)
> >  609 DECL_USER_ALIGN (decl) = TYPE_USER_ALIGN (type);
> >  610   else
> >  611 /* Lower local decl alignment */
> >  612 if (VAR_P (decl)
> >  613 && !is_global_var (decl)
> >  614 && !DECL_HARD_REGISTER (decl)
> >  615 && cfun != NULL)
> >  616   targetm.lower_local_decl_alignment (decl);
> >  617 }
>
> But that doesn't change anything (obviously).  layout_decl
> is called quite early, too early it looks like.
>
> Now there doesn't seem to be any other good place where
> we are sure to catch the decl before we evaluate things
> like __alignof__
>
> void __attribute__((noipa))
> foo (__SIZE_TYPE__ align, long long *p)
> {
>   if ((__SIZE_TYPE__)p & (align-1))
> __builtin_abort ();
> }
> int main()
> {
>   long long y;
>   foo (_Alignof y, );
>   return 0;
> }
>
> Joseph/Jason - do you have a good recommendation
> how to deal with targets where natural alignment
> is supposed to be lowered for optimization purposes?
> (this case is for i?86 to avoid dynamic stack re-alignment
> to align long long to 8 bytes with -mpreferred-stack-boundary=2)
>
> I note that for -mincoming-stack-boundary=2 we do perform
> dynamic stack re-alignment already.
>
> I can't find a suitable existing target macro/hook for this,
> but my gut feeling is that the default alignment should
> instead be the lower one and instead the alignment for
> globals should be raised as optimization?
>

Here is the updated patch from Sunil.

-- 
H.J.


0001-Add-TARGET_LOWER_LOCAL_DECL_ALIGNMENT-PR95237.patch
Description: Binary data


[RFC] rs6000: Add builtins for fegetround, feclearexcept and feraiseexcept [PR94193]

2020-06-26 Thread Raoni Fassina Firmino via Gcc-patches
Hi all,


This is an early draft I'm working on to add fegetround , feclearexcept
and feraiseexcept as builtins on rs6000.  This is my first patch so I
welcome any and all feedback.  Foremost I have some questions to ask as
I got stuck on some problems.


Q1) How to implement a target specific builtin for a C standard
function?

More specifically, how to make gcc use a rs6000 builtin for a
standard C function? Right now, I am getting a double define of the
builtin.  I don't know if define is the right word for it, may be
register an implementation?

The context is that I am creating builtin optimizations for fegetround,
feclearexcept and feraiseexcept.  Early on I discovered that there is
this file that defines builtins for all C library but not actually
implements them (in gcc/builtins.def) and trying to redefine them in
gcc/config/rs6000/rs6000-builtin.def ends up with a name clash.  So I
implemented the builtins with a suffix in its names and pushed this
problem for later...  And this later time is now.

I tried my best to find something about it on the gcc internal
documentation but I may have missed it.

So this is my question, how to I link the builtin defined in
gcc/builtins.def to use my implementation on rs6000? If someone has a
pointer about it or a patch that does it for some other c function (in
any target architecture) that would be great.


Q2) How to fallback to the default behavior of the function call when
the builtin is not suitable for the parameters?

Here, it is more specifically for feclearexcept and feraiseexcept.  The
builtin should only be used in the case of the parameter input is a
constant number with only 1bit mask (to work on only one exception).
Right now, I make the correctly check and it works (I validate the
builtins using a name suffix to avoid the problem mentioned in Q1)
But It aborts when the input is not valid instead of falling back to a
function call.


Q3) Are the implementations for the builtins more or less on the
right places?

The first one I did was fegetround and I based it on ppc_get_timebase
and other related builtins, so I used a define_expand on rs6000.md, but
when I was working on the fe*except I was basing it on other builtins
and ended up implementing it all on rs6000-call.c, but I am not sure if
there is a canonical way of doing it one way or another.


o/
Raoni Fassina Firmino

 8< 

This optimizations were originally in glibc, but was removed
and sugested that they were a good fit as gcc builtins[1].

The associated bugreport: PR target/94193

[1] https://sourceware.org/legacy-ml/libc-alpha/2020-03/msg00047.html
https://sourceware.org/legacy-ml/libc-alpha/2020-03/msg00080.html

Signed-off-by: Raoni Fassina Firmino 
---
 gcc/config/rs6000/rs6000-builtin.def | 13 ++
 gcc/config/rs6000/rs6000-call.c  | 69 
 gcc/config/rs6000/rs6000.md  | 18 
 3 files changed, 100 insertions(+)

diff --git a/gcc/config/rs6000/rs6000-builtin.def 
b/gcc/config/rs6000/rs6000-builtin.def
index 54f750c8384..d5ca15141b1 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2567,12 +2567,25 @@ BU_SPECIAL_X (RS6000_BUILTIN_GET_TB, 
"__builtin_ppc_get_timebase",
 BU_SPECIAL_X (RS6000_BUILTIN_MFTB, "__builtin_ppc_mftb",
  RS6000_BTM_ALWAYS, RS6000_BTC_MISC)
 
+BU_SPECIAL_X (RS6000_BUILTIN_FEGETROUND, "__builtin_fegetround",
+ RS6000_BTM_ALWAYS, RS6000_BTC_MISC)
+
 BU_SPECIAL_X (RS6000_BUILTIN_MFFS, "__builtin_mffs",
  RS6000_BTM_ALWAYS, RS6000_BTC_MISC)
 
 BU_SPECIAL_X (RS6000_BUILTIN_MFFSL, "__builtin_mffsl",
  RS6000_BTM_ALWAYS, RS6000_BTC_MISC)
 
+RS6000_BUILTIN_X (RS6000_BUILTIN_FECLEAREXCEPT, "__builtin_feclearexcept",
+ RS6000_BTM_ALWAYS,
+ RS6000_BTC_MISC | RS6000_BTC_UNARY,
+ CODE_FOR_nothing)
+
+RS6000_BUILTIN_X (RS6000_BUILTIN_FERAISEEXCEPT, "__builtin_feraiseexcept",
+ RS6000_BTM_ALWAYS,
+ RS6000_BTC_MISC | RS6000_BTC_UNARY,
+ CODE_FOR_nothing)
+
 RS6000_BUILTIN_X (RS6000_BUILTIN_MTFSF, "__builtin_mtfsf",
  RS6000_BTM_ALWAYS,
  RS6000_BTC_MISC | RS6000_BTC_UNARY | RS6000_BTC_VOID,
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 7621d6f5278..af93259e73d 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -8533,6 +8533,53 @@ rs6000_expand_zeroop_builtin (enum insn_code icode, rtx 
target)
 }
 
 
+static rtx
+rs6000_expand_feCRexcept_builtin (enum insn_code icode, tree exp, rtx target)
+{
+  rtx pat;
+  tree arg0 = CALL_EXPR_ARG (exp, 0);
+  rtx op0 = expand_normal (arg0);
+
+  if (icode == CODE_FOR_nothing)
+/* Builtin not supported on this processor.  */
+return 0;
+
+  if (rs6000_isa_flags & OPTION_MASK_SOFT_FLOAT)
+{
+  error ("%<__builtin_feclearexcept%> and "
+

Re: PSA: Default C++ dialect is now C++17

2020-06-26 Thread Marek Polacek via Gcc-patches
On Fri, Jun 26, 2020 at 03:34:09PM -0400, Marek Polacek via Gcc-patches wrote:
> As discussed last month:
> 
> it's time to change the C++ default to gnu++17.  I've committed the patch 
> after
> testing x86_64-pc-linux-gnu and powerpc64le-unknown-linux-gnu.  Brace 
> yourselves!

I've also updated wwwdocs to that effect:

commit 1528c44593180742ab827f9ea66d31e75b803de0
Author: Marek Polacek 
Date:   Fri Jun 26 15:56:12 2020 -0400

The default C++ dialect in GCC 11 is C++17.

diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html
index dc22f216..cea01a9c 100644
--- a/htdocs/gcc-11/changes.html
+++ b/htdocs/gcc-11/changes.html
@@ -30,6 +30,9 @@ a work-in-progress.
 
 Caveats
 
+  The default mode for C++ is now -std=gnu++17 instead of
+  -std=gnu++14.
+
   Naming and location of auxiliary and dump output files changed.
   If you compile multiple input files in a single command, if you
   enable Link Time Optimization, or if you use -dumpbase,
@@ -72,6 +75,7 @@ a work-in-progress.
 
 C++
 
+  The default mode has been changed to -std=gnu++17.
   Several C++ Defect Reports have been resolved, e.g.:
   
 DR 1512, Pointer comparison vs qualification conversions
diff --git a/htdocs/projects/cxx-status.html b/htdocs/projects/cxx-status.html
index b5cdd1a9..ed1a3440 100644
--- a/htdocs/projects/cxx-status.html
+++ b/htdocs/projects/cxx-status.html
@@ -577,11 +577,10 @@
   https://gcc.gnu.org/onlinedocs/libstdc++/manual/status.html#status.iso.2017;>the
 library documentation.
   
 
-  C++17 features are available since GCC 5. To enable C++17
-  support, add the command-line parameter -std=c++17
-  to your g++ command line. Or, to enable GNU
-  extensions in addition to C++17 features,
-add -std=gnu++17.
+  C++17 features are available since GCC 5.  This mode is the default
+  in GCC 11; it can be explicitly selected with the -std=c++17
+  command-line flag, or -std=gnu++17 to enable GNU extensions
+  as well.
 
   C++17 Language Features
 
@@ -902,8 +901,8 @@
   GCC has full support for the previous revision of the C++
   standard, which was published in 2014.
 
-  This mode is the default in GCC 6.1 and above; it can be explicitly
-  selected with the -std=c++14 command-line flag,
+  This mode is the default in GCC 6.1 up until GCC 10 (including); it can
+  be explicitly selected with the -std=c++14 command-line flag,
   or -std=gnu++14 to enable GNU extensions as well.
 
   C++14 Language Features



[PATCH] PR fortran/95881 - [9/10/11 Regression] ICE in resolve_symbol, at fortran/resolve.c:15175

2020-06-26 Thread Harald Anlauf
Dear all,

here's another NULL pointer dereference on invalid code.

Regtested on x86_64-pc-linux-gnu.

OK for master / backports where appropriate?

Thanks,
Harald


PR fortran/95881 - ICE in resolve_symbol, at fortran/resolve.c:15175

Avoid NULL pointer dereference.

gcc/fortran/
PR fortran/95881
* resolve.c (resolve_symbol): Avoid NULL pointer dereference.

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 6fa34caec54..55f57e2769b 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -15170,6 +15170,7 @@ resolve_symbol (gfc_symbol *sym)
   if (flag_coarray == GFC_FCOARRAY_LIB && sym->ts.type == BT_CLASS
   && sym->ts.u.derived && CLASS_DATA (sym)
   && CLASS_DATA (sym)->attr.codimension
+  && CLASS_DATA (sym)->ts.u.derived
   && (CLASS_DATA (sym)->ts.u.derived->attr.alloc_comp
 	  || CLASS_DATA (sym)->ts.u.derived->attr.pointer_comp))
 {
diff --git a/gcc/testsuite/gfortran.dg/pr95881.f90 b/gcc/testsuite/gfortran.dg/pr95881.f90
new file mode 100644
index 000..13146967e9f
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr95881.f90
@@ -0,0 +1,11 @@
+! { dg-do compile }
+! { dg-options "-fcoarray=lib" }
+! PR fortran/95881 - ICE in resolve_symbol, at fortran/resolve.c:15175
+
+program p
+  type t
+ real, allocatable :: a[:]
+  end type t
+  class(t) :: x ! { dg-error "must be dummy, allocatable or pointer" }
+  allocate (x%a[*])
+end


RE: [Patch][gcn, nvptx, offloading] mkoffload – handle -fpic/-fPIC

2020-06-26 Thread Moore, Catherine
Hi Tom,

It doesn't look like you were explicitly cc'd on this patch and probably 
haven't seen it.  Would you mind taking a look and approving the nvptx portions?

Thanks,
Catherine

>-Original Message-
>From: Gcc-patches [mailto:gcc-patches-boun...@gcc.gnu.org] On Behalf
>Of Burnus, Tobias
>Sent: Tuesday, June 23, 2020 11:21 AM
>To: gcc-patches ; Jakub Jelinek
>
>Cc: Stubbs, Andrew ; Schwinge, Thomas
>
>Subject: [Patch][gcn, nvptx, offloading] mkoffload – handle -fpic/-fPIC
>
>If the offloading code is (only) in a library, one can come up
>with the idea to build those parts as shared library – and link
>it to the nonoffloading code.(*)
>
>Currently, this fails as the mkoffload calls the nonoffloading
>compiler without the -fpic/-fPIC flags, even though the compiler
>was originally invoked with those options. – And at some point,
>the linker then complains.
>
>This patch simply adds -fpic/-fPIC to the calls to the nonoffloading
>("host") compiler, invoked from mkoffload, if they were present before.
>
>For the testcase at hand, this works with both AMDGCN and nvptx
>with the attached patch.
>
>OK for the trunk?
>
>Tobias
>
>PS: I think as mid-/longterm project it would be nice to test this
>in the testsuite, but that's unfortunately a larger task.
>
>(*) Thomas mentioned that this is supposed to work also in more
>complex cases than the one I outlined, although, that is probably
>currently the most common one.
>
>-
>Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München /
>Germany
>Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung,
>Alexander Walter


PSA: Default C++ dialect is now C++17

2020-06-26 Thread Marek Polacek via Gcc-patches
As discussed last month:

it's time to change the C++ default to gnu++17.  I've committed the patch after
testing x86_64-pc-linux-gnu and powerpc64le-unknown-linux-gnu.  Brace 
yourselves!

Marek



[PATCH] c++: Check uniqueness of concepts/variable templates [PR94553]

2020-06-26 Thread Marek Polacek via Gcc-patches
This patch wraps up PR94553.  Variable template names have no C
compatibility implications so they should be unique in their
declarative region.  It occurred to me that this applies to concepts
as well.  This is not specified in [basic.scope.declarative]/4.2
but that seems like a bug in the standard.

I couldn't use variable_template_p because that uses PRIMARY_TEMPLATE_P
which uses DECL_PRIMARY_TEMPLATE and that might not have been set up yet
(push_template_decl hasn't yet been called).  PRIMARY_TEMPLATE_P is
important to distinguish between a variable template and a variable in a
function template.  But I think we don't have to worry about that in
duplicate_decls: a template declaration cannot appear at block scope,
and additional checks in duplicate_decls suggest that it won't ever
see a TEMPLATE_DECL for a variable in a function template.  So
checking that the DECL_TEMPLATE_RESULT is a VAR_DECL seems to be fine.
I could have added a default argument to variable_template_p too to
avoid checking PRIMARY_TEMPLATE_P but it didn't seem worth the effort.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/94553
* decl.c (duplicate_decls): Make sure a concept or a variable
template is unique in its declarative region.

gcc/testsuite/ChangeLog:

PR c++/94553
* g++.dg/cpp1y/pr68578.C: Adjust dg-error.
* g++.dg/cpp1y/var-templ66.C: New test.
* g++.dg/cpp2a/concepts-redecl1.C: New test.
---
 gcc/cp/decl.c | 12 +++-
 gcc/testsuite/g++.dg/cpp1y/pr68578.C  |  2 +-
 gcc/testsuite/g++.dg/cpp1y/var-templ66.C  |  7 +++
 gcc/testsuite/g++.dg/cpp2a/concepts-redecl1.C |  7 +++
 4 files changed, 26 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/var-templ66.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-redecl1.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 3afad5ca805..45c871af741 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -1679,6 +1679,16 @@ duplicate_decls (tree newdecl, tree olddecl, bool 
newdecl_is_friend)
   else if (DECL_TYPE_TEMPLATE_P (olddecl)
   || DECL_TYPE_TEMPLATE_P (newdecl))
/* Class template conflicts.  */;
+  else if ((TREE_CODE (olddecl) == TEMPLATE_DECL
+   && DECL_TEMPLATE_RESULT (olddecl)
+   && TREE_CODE (DECL_TEMPLATE_RESULT (olddecl)) == VAR_DECL)
+  || (TREE_CODE (newdecl) == TEMPLATE_DECL
+  && DECL_TEMPLATE_RESULT (newdecl)
+  && TREE_CODE (DECL_TEMPLATE_RESULT (newdecl)) == VAR_DECL))
+   /* Variable template conflicts.  */;
+  else if (concept_definition_p (olddecl)
+  || concept_definition_p (newdecl))
+   /* Concept conflicts.  */;
   else if ((TREE_CODE (newdecl) == FUNCTION_DECL
&& DECL_FUNCTION_TEMPLATE_P (olddecl))
   || (TREE_CODE (olddecl) == FUNCTION_DECL
@@ -1701,7 +1711,7 @@ duplicate_decls (tree newdecl, tree olddecl, bool 
newdecl_is_friend)
  " literal operator template", newdecl);
  else
return NULL_TREE;
- 
+
  inform (olddecl_loc, "previous declaration %q#D", olddecl);
  return error_mark_node;
}
diff --git a/gcc/testsuite/g++.dg/cpp1y/pr68578.C 
b/gcc/testsuite/g++.dg/cpp1y/pr68578.C
index 18edd83cd7f..9b3898176f1 100644
--- a/gcc/testsuite/g++.dg/cpp1y/pr68578.C
+++ b/gcc/testsuite/g++.dg/cpp1y/pr68578.C
@@ -1,4 +1,4 @@
 // { dg-do compile { target c++14 } }
 
-template  struct bar foo; template <> struct foo<>:  // { dg-error 
"class template" }
+template  struct bar foo; template <> struct foo<>:  // { dg-error 
"class template|redeclared" }
 // { dg-error "-:expected" "" { target *-*-* } .+1 }
diff --git a/gcc/testsuite/g++.dg/cpp1y/var-templ66.C 
b/gcc/testsuite/g++.dg/cpp1y/var-templ66.C
new file mode 100644
index 000..65cd3d9d31b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/var-templ66.C
@@ -0,0 +1,7 @@
+// PR c++/94553
+// { dg-do compile { target c++14 } }
+
+struct C { };
+template int C; // { dg-error "different kind of entity" }
+template int D;
+struct D { }; // { dg-error "different kind of entity" }
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-redecl1.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-redecl1.C
new file mode 100644
index 000..33cd778a318
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-redecl1.C
@@ -0,0 +1,7 @@
+// PR c++/94553
+// { dg-do compile { target c++20 } }
+
+struct E { };
+template concept E = false; // { dg-error "different kind of entity" 
}
+template concept F = false;
+struct F { }; // { dg-error "different kind of entity" }

base-commit: b3d77404c060c0d65d8d4c97254995737d0fc032
-- 
Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA



[PATCH, rs6000] Add support to enable vmsumudm behind vec_msum builtin.

2020-06-26 Thread will schmidt via Gcc-patches
Hi,

  Add support for the vmsumudm instruction and tie it into the vec_msum
  built-in to support the variants of that built-in using vector
 _int128 parameters.

  vector _uint128_t vec_msum (vector unsigned long long,
  vector unsigned long long,
  vector _uint128_t);
  vector _int128_t vec_msum (vector signed long long,
 vector signed long long,
 vector _int128_t);

Regtests currently running on assorted powerpc targets.

OK for trunk?

Thanks,
-Will


[gcc]

2020-06-18  Will Schmidt  

* config/rs6000/altivec.h (vec_vmsumudm): New define.
* config/rs6000/altivec.md (UNSPEC_VMSUMUDM): New unspec.
(altivec_vmsumudm): New define_insn.
* config/rs6000/rs6000-builtin.def (altivec_vmsumudm): New
BU_ALTIVEC_3 entry. (vmsumudm): New BU_ALTIVEC_OVERLOAD_3 
entry.
* config/rs6000/rs6000-call.c (altivec_overloaded_builtins): 
Add entries for ALTIVEC_BUILTIN_VMSUMUDM variants of vec_msum.

[testsuite]

2020-06-18  Will Schmidt  

* gcc.target/powerpc/builtins-msum-runnable.c: New test.
* gcc.target/powerpc/vsx-builtin-msum.c: New test.

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index bb1524f..0d19939 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -159,10 +159,11 @@
 #define vec_vmsumuhm __builtin_vec_vmsumuhm
 #define vec_vmsummbm __builtin_vec_vmsummbm
 #define vec_vmsumubm __builtin_vec_vmsumubm
 #define vec_vmsumshs __builtin_vec_vmsumshs
 #define vec_vmsumuhs __builtin_vec_vmsumuhs
+#define vec_vmsumudm __builtin_vec_vmsumudm
 #define vec_vmulesb __builtin_vec_vmulesb
 #define vec_vmulesh __builtin_vec_vmulesh
 #define vec_vmuleuh __builtin_vec_vmuleuh
 #define vec_vmuleub __builtin_vec_vmuleub
 #define vec_vmulosh __builtin_vec_vmulosh
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 2ce9227..0481642 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -19,10 +19,11 @@
 ;; .
 
 (define_c_enum "unspec"
   [UNSPEC_VCMPBFP
UNSPEC_VMSUMU
+   UNSPEC_VMSUMUDM
UNSPEC_VMSUMM
UNSPEC_VMSUMSHM
UNSPEC_VMSUMUHS
UNSPEC_VMSUMSHS
UNSPEC_VMHADDSHS
@@ -970,10 +971,20 @@
 UNSPEC_VMSUMU))]
   "TARGET_ALTIVEC"
   "vmsumum %0,%1,%2,%3"
   [(set_attr "type" "veccomplex")])
 
+(define_insn "altivec_vmsumudm"
+  [(set (match_operand:V1TI 0 "register_operand" "=v")
+   (unspec:V1TI [(match_operand:V2DI 1 "register_operand" "v")
+ (match_operand:V2DI 2 "register_operand" "v")
+ (match_operand:V1TI 3 "register_operand" "v")]
+UNSPEC_VMSUMUDM))]
+  "TARGET_P8_VECTOR"
+  "vmsumudm %0,%1,%2,%3"
+  [(set_attr "type" "veccomplex")])
+
 (define_insn "altivec_vmsummm"
   [(set (match_operand:V4SI 0 "register_operand" "=v")
 (unspec:V4SI [(match_operand:VIshort 1 "register_operand" "v")
  (match_operand:VIshort 2 "register_operand" "v")
   (match_operand:V4SI 3 "register_operand" "v")]
diff --git a/gcc/config/rs6000/rs6000-builtin.def 
b/gcc/config/rs6000/rs6000-builtin.def
index 363656e..ee0d787 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -1140,10 +1140,11 @@ BU_ALTIVEC_3 (VMHADDSHS,  "vmhaddshs",  SAT,
altivec_vmhaddshs)
 BU_ALTIVEC_3 (VMHRADDSHS, "vmhraddshs", SAT,   
altivec_vmhraddshs)
 BU_ALTIVEC_3 (VMLADDUHM,  "vmladduhm",  CONST, fmav8hi4)
 BU_ALTIVEC_3 (VMSUMUBM,   "vmsumubm",   CONST, 
altivec_vmsumubm)
 BU_ALTIVEC_3 (VMSUMMBM,   "vmsummbm",   CONST, 
altivec_vmsummbm)
 BU_ALTIVEC_3 (VMSUMUHM,   "vmsumuhm",   CONST, 
altivec_vmsumuhm)
+BU_ALTIVEC_3 (VMSUMUDM,   "vmsumudm",   CONST, 
altivec_vmsumudm)
 BU_ALTIVEC_3 (VMSUMSHM,   "vmsumshm",   CONST, 
altivec_vmsumshm)
 BU_ALTIVEC_3 (VMSUMUHS,   "vmsumuhs",   SAT,   
altivec_vmsumuhs)
 BU_ALTIVEC_3 (VMSUMSHS,   "vmsumshs",   SAT,   
altivec_vmsumshs)
 BU_ALTIVEC_3 (VNMSUBFP,   "vnmsubfp",   FP,nfmsv4sf4)
 BU_ALTIVEC_3 (VPERM_1TI,  "vperm_1ti",  CONST, 
altivec_vperm_v1ti)
@@ -1497,10 +1498,11 @@ BU_ALTIVEC_OVERLOAD_3 (SEL,"sel")
 BU_ALTIVEC_OVERLOAD_3 (VMSUMMBM,   "vmsummbm")
 BU_ALTIVEC_OVERLOAD_3 (VMSUMSHM,   "vmsumshm")
 BU_ALTIVEC_OVERLOAD_3 (VMSUMSHS,   "vmsumshs")
 BU_ALTIVEC_OVERLOAD_3 (VMSUMUBM,   "vmsumubm")
 BU_ALTIVEC_OVERLOAD_3 (VMSUMUHM,   "vmsumuhm")
+BU_ALTIVEC_OVERLOAD_3 (VMSUMUDM,   "vmsumudm")
 BU_ALTIVEC_OVERLOAD_3 (VMSUMUHS,   "vmsumuhs")
 
 /* Altivec DST overloaded builtins.  */
 BU_ALTIVEC_OVERLOAD_D (DST,   "dst")
 BU_ALTIVEC_OVERLOAD_D (DSTT,  "dstt")
diff --git a/gcc/config/rs6000/rs6000-call.c 

Re: [PATCH] libstdc++: std::variant doesn't like types with a defaulted virtual destructor [PR95915]

2020-06-26 Thread Jonathan Wakely via Gcc-patches

On 26/06/20 19:12 +0300, Ville Voutilainen via Libstdc++ wrote:

This patch also deprecates std::is_literal_type.

2020-06-26  Ville Voutilainen  

   PR libstdc++/95915
   * include/std/type_traits (is_literal_type, is_literal_type_v):
   Deprecate in C++17.
   * include/std/variant (_Uninitialized):
   Adjust the condition and the comment.
   * testsuite/20_util/is_literal_type/deprecated-1z.cc: New.
   * testsuite/20_util/is_literal_type/requirements/explicit_instantiation.cc:
   Adjust.
   * testsuite/20_util/is_literal_type/requirements/typedefs.cc: Likewise.
   * testsuite/20_util/is_literal_type/value.cc: Likewise.
   * testsuite/20_util/variant/95915.cc: New.
   * testsuite/20_util/variant/compile.cc: Add new test.


+

+// { dg-prune-output "declared here" }
diff --git 
a/libstdc++-v3/testsuite/20_util/is_literal_type/requirements/explicit_instantiation.cc
 
b/libstdc++-v3/testsuite/20_util/is_literal_type/requirements/explicit_instantiation.cc
index d0a20f3cf4e..d9c57bb8ef4 100644
--- 
a/libstdc++-v3/testsuite/20_util/is_literal_type/requirements/explicit_instantiation.cc
+++ 
b/libstdc++-v3/testsuite/20_util/is_literal_type/requirements/explicit_instantiation.cc
@@ -1,3 +1,4 @@
+// { dg-options "-Wno-deprecated" }


For these three tests I think this would be slightly better:

// { dg-additional-options "-Wno-deprecated" { target c++17 } }

That way we only ignore the warning when actually needed.


// { dg-do compile { target c++11 } }
// 2010-02-21  Paolo Carlini  

diff --git 
a/libstdc++-v3/testsuite/20_util/is_literal_type/requirements/typedefs.cc 
b/libstdc++-v3/testsuite/20_util/is_literal_type/requirements/typedefs.cc
index 9b7ae894725..24f508805f2 100644
--- a/libstdc++-v3/testsuite/20_util/is_literal_type/requirements/typedefs.cc
+++ b/libstdc++-v3/testsuite/20_util/is_literal_type/requirements/typedefs.cc
@@ -1,3 +1,4 @@
+// { dg-options "-Wno-deprecated" }
// { dg-do compile { target c++11 } }

// 2010-02-21  Paolo Carlini  
diff --git a/libstdc++-v3/testsuite/20_util/is_literal_type/value.cc 
b/libstdc++-v3/testsuite/20_util/is_literal_type/value.cc
index a6624774ef0..3bd6fe373f7 100644
--- a/libstdc++-v3/testsuite/20_util/is_literal_type/value.cc
+++ b/libstdc++-v3/testsuite/20_util/is_literal_type/value.cc
@@ -1,3 +1,4 @@
+// { dg-options "-Wno-deprecated" }
// { dg-do compile { target c++11 } }

// 2010-03-23  Paolo Carlini  


OK for master and gcc-10 with that change. Thanks!




Re: [PATCH] libgomp, fortran: Apply if clause to all sub-constructs in combined OpenMP constructs

2020-06-26 Thread Jakub Jelinek via Gcc-patches
On Fri, Jun 26, 2020 at 06:55:30PM +0100, Kwok Cheung Yeung wrote:
> Okay for master/OG10?

Ok for master, not sure if I have anything to say for OG10.
With a nit.

> commit 04bdcaa20827d814c323847630c59ee843c51408
> Author: Kwok Cheung Yeung 
> Date:   Fri Jun 26 10:35:36 2020 -0700
> 
> Fix failure in gfortran.dg/gomp/combined-if.f90 test
> 
> Enabling nvptx offloading results in extra '#pragma omp simd' statements
> in the tree dump with an extra '_simt_'.
> 
> 2020-06-26  Kwok Cheung Yeung  
> 
>   gcc/testsuite/
>   * testsuite/gfortran.dg/gomp/combined-if.f90: Adjust expected number
>   of matches depending on whether nvptx offloading is supported.
>   * testsuite/lib/target-supports.exp
>   (check_effective_target_offload_nvptx): New.

The testsuite/ prefix shouldn't be there, guess you wouldn't be able
to commit it that way.

Jakub



Re: [PATCH] libgomp, fortran: Apply if clause to all sub-constructs in combined OpenMP constructs

2020-06-26 Thread Kwok Cheung Yeung

On 26/06/2020 8:35 am, Christophe Lyon wrote:

Hi,

I've noticed a regression since your commit, on arm aarch64 and x86:
for instance on arm-linux-gnueabi:
PASS: gfortran.dg/gomp/combined-if.f90   -O  (test for excess errors)
PASS: gfortran.dg/gomp/combined-if.f90   -O   scan-tree-dump-times
omplower "(?n)#pragma omp target.* if\\(" 9
gfortran.dg/gomp/combined-if.f90   -O  : pattern found 4 times
FAIL: gfortran.dg/gomp/combined-if.f90   -O   scan-tree-dump-times
omplower "(?n)#pragma omp simd.* if\\(" 7
PASS: gfortran.dg/gomp/combined-if.f90   -O   scan-tree-dump-times
omplower "(?n)#pragma omp parallel.* if\\(" 6

Not sure why you didn't see it?



Hello

I was working with a compiler built with offloading support for nvptx, and that 
emits extra '#pragma omp simd' statements with an extra _simt_:


#pragma omp simd _simduid_(simduid.21) _looptemp_(D.4127) _looptemp_(D.4128) 
_simt_ linear(i.13:1) linear(i:1) if(D.4108)
#pragma omp simd _simduid_(simduid.22) _looptemp_(D.4127) _looptemp_(D.4128) 
linear(i.13:1) linear(i:1) if(D.4108)
#pragma omp simd _simduid_(simduid.64) _looptemp_(D.4437) _looptemp_(D.4438) 
_simt_ linear(i.58:1) linear(i:1) if(D.3977)
#pragma omp simd _simduid_(simduid.65) _looptemp_(D.4437) _looptemp_(D.4438) 
linear(i.58:1) linear(i:1) if(D.3977)
#pragma omp simd _simduid_(simduid.106) _simt_ linear(i.101:1) linear(i:1) 
if(D.4681)

#pragma omp simd _simduid_(simduid.107) linear(i.101:1) linear(i:1) if(D.4681)
#pragma omp simd _simduid_(simduid.170) _looptemp_(D.5081) _looptemp_(D.5082) 
linear(i.164:1) linear(i:1) if(D.4069)


If offloading is disabled, or targeted for amdgcn, the lines sith _simt_ do not 
appear. This appears to be controlled by omp_max_simt_vf() in omp-general.c, and 
is currently only non-zero for nvptx.


I think the easiest fix would be to expect different numbers of matches 
depending on whether nvptx offloading is enabled. This requires an extra 
function in gcc/testsuite/lib/target-supports.exp.


Okay for master/OG10?

Thanks

Kwok
commit 04bdcaa20827d814c323847630c59ee843c51408
Author: Kwok Cheung Yeung 
Date:   Fri Jun 26 10:35:36 2020 -0700

Fix failure in gfortran.dg/gomp/combined-if.f90 test

Enabling nvptx offloading results in extra '#pragma omp simd' statements
in the tree dump with an extra '_simt_'.

2020-06-26  Kwok Cheung Yeung  

gcc/testsuite/
* testsuite/gfortran.dg/gomp/combined-if.f90: Adjust expected number
of matches depending on whether nvptx offloading is supported.
* testsuite/lib/target-supports.exp
(check_effective_target_offload_nvptx): New.

diff --git a/gcc/testsuite/gfortran.dg/gomp/combined-if.f90 
b/gcc/testsuite/gfortran.dg/gomp/combined-if.f90
index bf4a9a8..0bb6c28 100644
--- a/gcc/testsuite/gfortran.dg/gomp/combined-if.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/combined-if.f90
@@ -104,5 +104,6 @@ contains
 end module
 
 ! { dg-final { scan-tree-dump-times "(?n)#pragma omp target.* if\\(" 9 
"omplower" } }
-! { dg-final { scan-tree-dump-times "(?n)#pragma omp simd.* if\\(" 7 
"omplower" } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma omp simd.* if\\(" 4 
"omplower" { target { ! offload_nvptx } } } }
+! { dg-final { scan-tree-dump-times "(?n)#pragma omp simd.* if\\(" 7 
"omplower" { target { offload_nvptx } } } }
 ! { dg-final { scan-tree-dump-times "(?n)#pragma omp parallel.* if\\(" 6 
"omplower" } }
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index cf0cfa1..2279361 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -9820,6 +9820,14 @@ proc check_effective_target_vect_max_reduc { } {
 return 0
 }
 
+# Return 1 if the compiler has been configured with nvptx offloading.
+
+proc check_effective_target_offload_nvptx { } {
+return [check_no_compiler_messages offload_nvptx assembly {
+   int main () {return 0;}
+} "-foffload=nvptx-none" ]
+}
+
 # Return 1 if the compiler has been configured with hsa offloading.
 
 proc check_effective_target_offload_hsa { } {
@@ -9828,7 +9836,7 @@ proc check_effective_target_offload_hsa { } {
 } "-foffload=hsa" ]
 }
 
-# Return 1 if the compiler has been configured with hsa offloading.
+# Return 1 if the compiler has been configured with gcn offloading.
 
 proc check_effective_target_offload_gcn { } {
 return [check_no_compiler_messages offload_gcn assembly {


Re: [PATCH] libgomp: added simple functions and tests for OMPD

2020-06-26 Thread y2s1982 . via Gcc-patches
Hello,

On Fri, Jun 26, 2020 at 1:08 PM Jakub Jelinek  wrote:

> On Fri, Jun 26, 2020 at 12:57:59PM -0400, y2s1982 . wrote:
> > > >   * env.c(ompd_dll_locations_valid): Define with no compiler
> > > optimization.
> > >
> > > Again, missing space.
> > >
> > > The contrib/mklog.py, which I use to generate these messages, seems to
> > generate the statement without a space after the filename and before the
> (.
> > It seems like an easy fix of just adding a space like the following:
> >   diff --git a/contrib/mklog.py b/contrib/mklog.py
> >   index 243edbb15c5..0b01cde6fca 100755
> >   --- a/contrib/mklog.py
> >   +++ b/contrib/mklog.py
> >   @@ -215,7 +215,7 @@ def generate_changelog(data, no_functions=False,
> > fill_pr_titles=False):
> >if functions:
> >out += '\t* %s (%s):\n' % (relative_path,
> > functions[0])
> >for fn in functions[1:]:
> >   -out += '\t(%s):\n' % fn
> >   +   out += '\t (%s):\n' % fn
> >else:
> >   out += '\t* %s:\n' % relative_path
> >   out += '\n'
>
> That is strange.  The missing space I'm complaining about, e.g. between
> env.c and (, is there in the above snippet, that is
> '\t* %s (%s):\n'
>^- this one
> and the space you are adding to mklog.py is undesirable, when one modifies
> multiple functions, it should look like:
> * env.c (foo): Something.
> (bar): Something else.
> rather than
> * env.c (foo): Something.
>  (bar): Something else.
>

I see. I was thinking that doesn't make much sense, too, but when tested,
it placed
a space between the filename and the (.  I wasn't sure what to make of it.
Maybe my editor is somehow manipulating spaces and tabs when loading the
patch
for final editing or maybe the script I made is doing something more than
just pasting
the output of the contrib/mklog.py to the patch.
Thanks for pointing it out. I will investigate further.

Cheers,

Tony Sim

>
> Jakub
>
>


Re: PING Re: testsuite: clarify scan-dump file globbing behavior

2020-06-26 Thread Mike Stump via Gcc-patches
On Jun 2, 2020, at 10:37 PM, Frederik Harwath  wrote:
> 
> Frederik Harwath  writes:
> 
> ping :-)

Ok.

>> Frederik Harwath  writes:
>> 
>> Hi Rainer, hi Mike,
>> ping: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545803.html
>> 
>> Best regards,
>> Frederik
>> 
>>> Hi Thomas,
>>> 
>>> Thomas Schwinge  writes:
>>> 
 I can't formally approve testsuite patches, but did a review anyway:
>>> 
>>> Thanks for the review!
>>> 
 On 2020-05-15T12:31:54+0200, Frederik Harwath  
 wrote:
>>> 
> The dump
> scanning procedures are changed to make the test unresolved
> if globbing matches more than one file.
 
 (The code changes look good, but I have not tested that specific aspect.)
>>> 
>>> We do not have automated tests for the testsuite commands :-), but I
>>> have of course tested this manually.
>>> 
 As I said, not an approval, and minor comments (see below), but still:
 
Reviewed-by: Thomas Schwinge 
 
 Do we have to similarly also audit/alter other testsuite infrastructure
 files, anything that uses '[glob [...]]'?  (..., and then generalize
 'glob-dump-file' into 'glob-one-file', or similar.)  That can be done
 incrementally, as far as I'm concerned.
>>> 
>>> I also think it would make sense to adapt similar test commands as well.
>>> 
 May also make this more useful/explicit:
 
This is useful if, for example, if a pass has several static
instances [correct terminology?], and depending on torture testing
command-line flags, a different instance executes and produces a dump
file, and so in the test case you can use a generic [put example
here] to scan the varying dump files names.
 
 (Or similar.)
>>> 
>>> I have moved the explanation below the description of the individual
>>> commands and added an example. See the attached revised patch.
>>> 
>>> Best regards,
>>> Frederik
>>> 
>>> From 2a17749d6dbcac690d698323240438722d6119ef Mon Sep 17 00:00:00 2001
>>> From: Frederik Harwath 
>>> Date: Fri, 15 May 2020 10:35:48 +0200
>>> Subject: [PATCH] testsuite: clarify scan-dump file globbing behavior
>>> 
>>> The test commands for scanning optimization dump files
>>> perform globbing on the argument that specifies the suffix
>>> of the dump files to be scanned.  This behavior is currently
>>> undocumented.  Furthermore, the current implementation of
>>> "scan-dump" and similar procedures yields an error whenever
>>> the file name globbing matches more than one file (due to an
>>> attempt to call "open" on multiple files) while a failure to
>>> match any file results in an unresolved test.
>>> 
>>> This commit documents the globbing behavior.  The dump
>>> scanning procedures are changed to make the test unresolved
>>> if globbing matches more than one file.
>>> 
>>> gcc/ChangeLog:
>>> 
>>> 2020-05-19  Frederik Harwath  
>>> 
>>> * doc/sourcebuild.texi: Describe globbing of the
>>> dump file scanning commands "suffix" argument.
>>> 
>>> gcc/testsuite/ChangeLog:
>>> 
>>> 2020-05-19  Frederik Harwath  
>>> 
>>> * lib/scandump.exp (glob-dump-file): New proc.
>>> (scan-dump): Use glob-dump-file for file name expansion.
>>> (scan-dump-times): Likewise.
>>> (scan-dump-dem): Likewise.
>>> (scan-dump-dem-not): Likewise.



Re: [PATCH] libgomp: added simple functions and tests for OMPD

2020-06-26 Thread Jakub Jelinek via Gcc-patches
On Fri, Jun 26, 2020 at 12:57:59PM -0400, y2s1982 . wrote:
> > >   * env.c(ompd_dll_locations_valid): Define with no compiler
> > optimization.
> >
> > Again, missing space.
> >
> > The contrib/mklog.py, which I use to generate these messages, seems to
> generate the statement without a space after the filename and before the (.
> It seems like an easy fix of just adding a space like the following:
>   diff --git a/contrib/mklog.py b/contrib/mklog.py
>   index 243edbb15c5..0b01cde6fca 100755
>   --- a/contrib/mklog.py
>   +++ b/contrib/mklog.py
>   @@ -215,7 +215,7 @@ def generate_changelog(data, no_functions=False,
> fill_pr_titles=False):
>if functions:
>out += '\t* %s (%s):\n' % (relative_path,
> functions[0])
>for fn in functions[1:]:
>   -out += '\t(%s):\n' % fn
>   +   out += '\t (%s):\n' % fn
>else:
>   out += '\t* %s:\n' % relative_path
>   out += '\n'

That is strange.  The missing space I'm complaining about, e.g. between
env.c and (, is there in the above snippet, that is
'\t* %s (%s):\n'
   ^- this one
and the space you are adding to mklog.py is undesirable, when one modifies
multiple functions, it should look like:
* env.c (foo): Something.
(bar): Something else.
rather than
* env.c (foo): Something.
 (bar): Something else.

Jakub



Re: [PATCH] libgomp: added simple functions and tests for OMPD

2020-06-26 Thread y2s1982 . via Gcc-patches
Hello,

On Fri, Jun 26, 2020 at 5:03 AM Jakub Jelinek  wrote:

> Thanks, just nits, no need to repost, just commit it after you make those
> changes.
>
> On Thu, Jun 25, 2020 at 05:58:23PM -0400, y2s1982 via Gcc-patches wrote:
>
> >   * Makefile.am(toolexeclib_LTLIBRARIES): Add libgompd.la.
>
> Missing space between am and (.
>
> >   * env.c(ompd_dll_locations_valid): Define with no compiler
> optimization.
>
> Again, missing space.
>
> The contrib/mklog.py, which I use to generate these messages, seems to
generate the statement without a space after the filename and before the (.
It seems like an easy fix of just adding a space like the following:
  diff --git a/contrib/mklog.py b/contrib/mklog.py
  index 243edbb15c5..0b01cde6fca 100755
  --- a/contrib/mklog.py
  +++ b/contrib/mklog.py
  @@ -215,7 +215,7 @@ def generate_changelog(data, no_functions=False,
fill_pr_titles=False):
   if functions:
   out += '\t* %s (%s):\n' % (relative_path,
functions[0])
   for fn in functions[1:]:
  -out += '\t(%s):\n' % fn
  +   out += '\t (%s):\n' % fn
   else:
  out += '\t* %s:\n' % relative_path
  out += '\n'

Would it be okay for me to make this patch? I wasn't sure of the exact
procedure that I might have to do prior to sending in a patch.

Cheers,

Tony Sim


> Jakub
>
>


Re: [PATCH] libstdc++: std::variant doesn't like types with a defaulted virtual destructor [PR95915]

2020-06-26 Thread Ville Voutilainen via Gcc-patches
On Fri, 26 Jun 2020 at 19:12, Ville Voutilainen
 wrote:
>
> This patch also deprecates std::is_literal_type.

I forgot to ask, OK for trunk and GCC 10 if full suite testing passes?
The problematic compiler bug has been
gone since GCC 10, so we can just as well backport this there, but not further.


[PATCH] libstdc++: std::variant doesn't like types with a defaulted virtual destructor [PR95915]

2020-06-26 Thread Ville Voutilainen via Gcc-patches
This patch also deprecates std::is_literal_type.

2020-06-26  Ville Voutilainen  

PR libstdc++/95915
* include/std/type_traits (is_literal_type, is_literal_type_v):
Deprecate in C++17.
* include/std/variant (_Uninitialized):
Adjust the condition and the comment.
* testsuite/20_util/is_literal_type/deprecated-1z.cc: New.
* testsuite/20_util/is_literal_type/requirements/explicit_instantiation.cc:
Adjust.
* testsuite/20_util/is_literal_type/requirements/typedefs.cc: Likewise.
* testsuite/20_util/is_literal_type/value.cc: Likewise.
* testsuite/20_util/variant/95915.cc: New.
* testsuite/20_util/variant/compile.cc: Add new test.
diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index bc9a45b3746..261f25d21e7 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -703,7 +703,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   /// is_literal_type
   template
-struct is_literal_type
+struct
+_GLIBCXX17_DEPRECATED
+is_literal_type
 : public integral_constant
 {
   static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
@@ -3087,6 +3089,7 @@ template 
   inline constexpr bool is_pod_v = is_pod<_Tp>::value;
 #pragma GCC diagnostic pop
 template 
+  _GLIBCXX17_DEPRECATED
   inline constexpr bool is_literal_type_v = is_literal_type<_Tp>::value;
 template 
   inline constexpr bool is_empty_v = is_empty<_Tp>::value;
diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant
index 6eeb3c80ec2..c9504914365 100644
--- a/libstdc++-v3/include/std/variant
+++ b/libstdc++-v3/include/std/variant
@@ -202,15 +202,9 @@ namespace __variant
 	  std::forward<_Variants>(__variants)...);
 }
 
-  // _Uninitialized is guaranteed to be a literal type, even if T is not.
-  // We have to do this, because [basic.types]p10.5.3 (n4606) is not implemented
-  // yet. When it's implemented, _Uninitialized can be changed to the alias
-  // to T, therefore equivalent to being removed entirely.
-  //
-  // Another reason we may not want to remove _Uninitialzied may be that, we
-  // want _Uninitialized to be trivially destructible, no matter whether T
-  // is; but we will see.
-  template>
+  // _Uninitialized is guaranteed to be a trivially destructible type,
+  // even if T is not.
+  template>
 struct _Uninitialized;
 
   template
diff --git a/libstdc++-v3/testsuite/20_util/is_literal_type/deprecated-1z.cc b/libstdc++-v3/testsuite/20_util/is_literal_type/deprecated-1z.cc
new file mode 100644
index 000..a91ff56dcf6
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/is_literal_type/deprecated-1z.cc
@@ -0,0 +1,26 @@
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+//
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-std=gnu++17" }
+// { dg-do compile { target c++17 } }
+
+#include 
+
+static_assert(std::is_literal_type::value); // { dg-warning "is deprecated" }
+static_assert(std::is_literal_type_v); // { dg-warning "is deprecated" }
+
+// { dg-prune-output "declared here" }
diff --git a/libstdc++-v3/testsuite/20_util/is_literal_type/requirements/explicit_instantiation.cc b/libstdc++-v3/testsuite/20_util/is_literal_type/requirements/explicit_instantiation.cc
index d0a20f3cf4e..d9c57bb8ef4 100644
--- a/libstdc++-v3/testsuite/20_util/is_literal_type/requirements/explicit_instantiation.cc
+++ b/libstdc++-v3/testsuite/20_util/is_literal_type/requirements/explicit_instantiation.cc
@@ -1,3 +1,4 @@
+// { dg-options "-Wno-deprecated" }
 // { dg-do compile { target c++11 } }
 // 2010-02-21  Paolo Carlini  
 
diff --git a/libstdc++-v3/testsuite/20_util/is_literal_type/requirements/typedefs.cc b/libstdc++-v3/testsuite/20_util/is_literal_type/requirements/typedefs.cc
index 9b7ae894725..24f508805f2 100644
--- a/libstdc++-v3/testsuite/20_util/is_literal_type/requirements/typedefs.cc
+++ b/libstdc++-v3/testsuite/20_util/is_literal_type/requirements/typedefs.cc
@@ -1,3 +1,4 @@
+// { dg-options "-Wno-deprecated" }
 // { dg-do compile { target c++11 } }
 
 // 2010-02-21  Paolo Carlini  
diff --git a/libstdc++-v3/testsuite/20_util/is_literal_type/value.cc b/libstdc++-v3/testsuite/20_util/is_literal_type/value.cc
index a6624774ef0..3bd6fe373f7 100644
--- 

Re: [PATCH 3/4] libgcc: fix the handling of return address mangling [PR94891]

2020-06-26 Thread Szabolcs Nagy
The 06/05/2020 17:51, Szabolcs Nagy wrote:
> Mangling, currently only used on AArch64 for return address signing,
> is an internal representation that should not be exposed via
> 
>   __builtin_return_address return value,
>   __builtin_eh_return handler argument,
>   _Unwind_DebugHook handler argument.
> 
> Note that a mangled address might not even fit into a void *, e.g.
> with AArch64 ilp32 ABI the return address is stored as 64bit, so
> the mangled return address cannot be accessed via _Unwind_GetPtr.
> 
> This patch changes the unwinder hooks as follows:
> 
> MD_POST_EXTRACT_ROOT_ADDR is removed: root address comes from
> __builtin_return_address which is not mangled.
> 
> MD_POST_EXTRACT_FRAME_ADDR is renamed to MD_DEMANGLE_RETURN_ADDR,
> it now operates on _Unwind_Word instead of void *, so the hook
> should work when return address signing is enabled on AArch64 ilp32.
> (But for that __builtin_aarch64_autia1716 should be fixed to operate
> on 64bit input instead of a void *.)
> 
> MD_POST_FROB_EH_HANDLER_ADDR is removed: it is the responsibility of
> __builtin_eh_return to do the mangling if necessary.
> 
> libgcc/ChangeLog:
> 
> 2020-06-04  Szabolcs Nagy  
> 
>   * config/aarch64/aarch64-unwind.h (MD_POST_EXTRACT_ROOT_ADDR): Remove.
>   (MD_POST_FROB_EH_HANDLER_ADDR): Remove.
>   (MD_POST_EXTRACT_FRAME_ADDR): Rename to ...
>   (MD_DEMANGLE_RETURN_ADDR): This.
>   (aarch64_post_extract_frame_addr): Rename to ...
>   (aarch64_demangle_return_addr): This.
>   (aarch64_post_frob_eh_handler_addr): Remove.
>   * unwind-dw2.c (uw_update_context): Demangle return address.
>   (uw_frob_return_addr): Remove.

ping. (adding Ian on cc)

tested without regressions on aarch64 with pac-ret.

> ---
>  libgcc/config/aarch64/aarch64-unwind.h | 34 --
>  libgcc/unwind-dw2.c| 34 ++
>  2 files changed, 13 insertions(+), 55 deletions(-)
> 
> diff --git a/libgcc/config/aarch64/aarch64-unwind.h 
> b/libgcc/config/aarch64/aarch64-unwind.h
> index ed84a96db41..b1d732e0b2d 100644
> --- a/libgcc/config/aarch64/aarch64-unwind.h
> +++ b/libgcc/config/aarch64/aarch64-unwind.h
> @@ -27,11 +27,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
> If not, see
>  
>  #define DWARF_REGNUM_AARCH64_RA_STATE 34
>  
> -#define MD_POST_EXTRACT_ROOT_ADDR(addr)  __builtin_aarch64_xpaclri (addr)
> -#define MD_POST_EXTRACT_FRAME_ADDR(context, fs, addr) \
> -  aarch64_post_extract_frame_addr (context, fs, addr)
> -#define MD_POST_FROB_EH_HANDLER_ADDR(current, target, addr) \
> -  aarch64_post_frob_eh_handler_addr (current, target, addr)
> +#define MD_DEMANGLE_RETURN_ADDR(context, fs, addr) \
> +  aarch64_demangle_return_addr (context, fs, addr)
>  #define MD_FROB_UPDATE_CONTEXT(context, fs) \
>aarch64_frob_update_context (context, fs)
>  
> @@ -57,9 +54,10 @@ aarch64_cie_signed_with_b_key (struct _Unwind_Context 
> *context)
> using CFA of current frame.  */
>  
>  static inline void *
> -aarch64_post_extract_frame_addr (struct _Unwind_Context *context,
> -  _Unwind_FrameState *fs, void *addr)
> +aarch64_demangle_return_addr (struct _Unwind_Context *context,
> +   _Unwind_FrameState *fs, _Unwind_Word addr_word)
>  {
> +  void *addr = (void *)addr_word;
>if (context->flags & RA_SIGNED_BIT)
>  {
>_Unwind_Word salt = (_Unwind_Word) context->cfa;
> @@ -71,28 +69,6 @@ aarch64_post_extract_frame_addr (struct _Unwind_Context 
> *context,
>  return addr;
>  }
>  
> -/* Do AArch64 private frob on exception handler's address HANDLER_ADDR before
> -   installing it into current context CURRENT.  TARGET is currently not used.
> -   We need to sign exception handler's address if CURRENT itself is signed.  
> */
> -
> -static inline void *
> -aarch64_post_frob_eh_handler_addr (struct _Unwind_Context *current,
> -struct _Unwind_Context *target
> -ATTRIBUTE_UNUSED,
> -void *handler_addr)
> -{
> -  if (current->flags & RA_SIGNED_BIT)
> -{
> -  if (aarch64_cie_signed_with_b_key (current))
> - return __builtin_aarch64_pacib1716 (handler_addr,
> - (_Unwind_Word) current->cfa);
> -  return __builtin_aarch64_pacia1716 (handler_addr,
> - (_Unwind_Word) current->cfa);
> -}
> -  else
> -return handler_addr;
> -}
> -
>  /* Do AArch64 private initialization on CONTEXT based on frame info FS.  Mark
> CONTEXT as return address signed if bit 0 of 
> DWARF_REGNUM_AARCH64_RA_STATE is
> set.  */
> diff --git a/libgcc/unwind-dw2.c b/libgcc/unwind-dw2.c
> index 62d4a3d29a2..fe896565d2e 100644
> --- a/libgcc/unwind-dw2.c
> +++ b/libgcc/unwind-dw2.c
> @@ -1538,11 +1538,14 @@ uw_update_context (struct _Unwind_Context *context, 
> _Unwind_FrameState *fs)
>  {
>/* Compute the 

Re: [PATCH 2/4] aarch64: fix __builtin_eh_return with pac-ret [PR94891]

2020-06-26 Thread Szabolcs Nagy
The 06/05/2020 17:51, Szabolcs Nagy wrote:
> The handler argument must not be signed since that may come from
> outside the current module and exposing signed addresses is a pointer
> ABI break. (The signed address also may not be representable as void *
> which is why pac-ret is currently broken on ilp32.)
> 
> There is no point protecting the eh return path with pointer auth
> since arbitrary target can be reached with the instruction sequence
> in the caller function anyway, however this is a big hammer solution
> that turns off pac-ret for the caller completely not just on the eh
> return path.
> 
> 2020-06-04  Szabolcs Nagy  
> 
>   * config/aarch64/aarch64.c (aarch64_return_address_signing_enabled):
>   Disable return address signing if __builtin_eh_return is used.

ping.

this fixes a correctness bug in pac-ret, tested
on aarch64, with only the following regressions:

FAIL: gcc.target/aarch64/return_address_sign_1.c scan-assembler-times autiasp 4
FAIL: gcc.target/aarch64/return_address_sign_1.c scan-assembler-times paciasp 4
FAIL: gcc.target/aarch64/return_address_sign_b_1.c scan-assembler-times autibsp 
4
FAIL: gcc.target/aarch64/return_address_sign_b_1.c scan-assembler-times pacibsp 
4

which can be fixed by

-/* { dg-final { scan-assembler-times "autiasp" 4 } } */
-/* { dg-final { scan-assembler-times "paciasp" 4 } } */
+/* { dg-final { scan-assembler-times "autiasp" 3 } } */
+/* { dg-final { scan-assembler-times "paciasp" 3 } } */

since __builtin_eh_return path no longer uses pac/aut.

> ---
>  gcc/config/aarch64/aarch64.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 6a2f85c4af7..d9557f7c0a2 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -6954,6 +6954,10 @@ aarch64_return_address_signing_enabled (void)
>/* This function should only be called after frame laid out.   */
>gcc_assert (cfun->machine->frame.laid_out);
>  
> +  /* TODO: Big hammer handling of __builtin_eh_return.  */
> +  if (crtl->calls_eh_return)
> +return false;
> +
>/* If signing scope is AARCH64_FUNCTION_NON_LEAF, we only sign a leaf 
> function
>   if its LR is pushed onto stack.  */
>return (aarch64_ra_sign_scope == AARCH64_FUNCTION_ALL
> -- 
> 2.17.1
> 

-- 


Re: [PATCH 1/4] aarch64: fix return address access with pac [PR94891][PR94791]

2020-06-26 Thread Szabolcs Nagy
The 06/05/2020 17:51, Szabolcs Nagy wrote:
> This is a big hammer fix for __builtin_return_address (PR target/94891)
> returning signed addresses (sometimes, depending on wether lr happens
> to be signed or not at the time of call which depends on optimizations),
> and similarly -pg may pass signed return address to _mcount
> (PR target/94791).
> 
> At the time of return address expansion we don't know if it's signed or
> not so it is done unconditionally.
> 
> I wonder if allocate_initial_value for the lr reg may solve this better
> such that get_hard_reg_initial_val just gives the right (unsigned) value?
> 
> 2020-06-04  Szabolcs Nagy  
> 
>   * config/aarch64/aarch64-protos.h (aarch64_return_addr_rtx): Declare.
>   * config/aarch64/aarch64.c (aarch64_return_addr_rtx): New.
>   (aarch64_return_addr): Use aarch64_return_addr_rtx.
>   * config/aarch64/aarch64.h (PROFILE_HOOK): Likewise.

ping.

(this fixes a correctness bug in pac-ret, tested with no regressions).

> ---
>  gcc/config/aarch64/aarch64-protos.h |  1 +
>  gcc/config/aarch64/aarch64.c| 20 +++-
>  gcc/config/aarch64/aarch64.h|  2 +-
>  3 files changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/config/aarch64/aarch64-protos.h 
> b/gcc/config/aarch64/aarch64-protos.h
> index 9e43adb7db0..723d9ba6ac6 100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -578,6 +578,7 @@ int aarch64_vec_fpconst_pow_of_2 (rtx);
>  rtx aarch64_eh_return_handler_rtx (void);
>  rtx aarch64_mask_from_zextract_ops (rtx, rtx);
>  const char *aarch64_output_move_struct (rtx *operands);
> +rtx aarch64_return_addr_rtx (void);
>  rtx aarch64_return_addr (int, rtx);
>  rtx aarch64_simd_gen_const_vector_dup (machine_mode, HOST_WIDE_INT);
>  bool aarch64_simd_mem_operand_p (rtx);
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 6352d4ff78a..6a2f85c4af7 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -10819,6 +10819,24 @@ aarch64_initial_elimination_offset (unsigned from, 
> unsigned to)
>return cfun->machine->frame.frame_size;
>  }
>  
> +
> +/* Get return address without mangling.  */
> +
> +rtx
> +aarch64_return_addr_rtx (void)
> +{
> +  rtx val = get_hard_reg_initial_val (Pmode, LR_REGNUM);
> +  /* Note: aarch64_return_address_signing_enabled only
> + works after cfun->machine->frame.laid_out is set,
> + so here we don't know if the return address will
> + be signed or not.  */
> +  rtx lr = gen_rtx_REG (Pmode, LR_REGNUM);
> +  emit_move_insn (lr, val);
> +  emit_insn (GEN_FCN (CODE_FOR_xpaclri) ());
> +  return lr;
> +}
> +
> +
>  /* Implement RETURN_ADDR_RTX.  We do not support moving back to a
> previous frame.  */
>  
> @@ -10827,7 +10845,7 @@ aarch64_return_addr (int count, rtx frame 
> ATTRIBUTE_UNUSED)
>  {
>if (count != 0)
>  return const0_rtx;
> -  return get_hard_reg_initial_val (Pmode, LR_REGNUM);
> +  return aarch64_return_addr_rtx ();
>  }
>  
>  
> diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
> index 2be52fd4d73..f11941bbc86 100644
> --- a/gcc/config/aarch64/aarch64.h
> +++ b/gcc/config/aarch64/aarch64.h
> @@ -1112,7 +1112,7 @@ typedef struct
>  #define PROFILE_HOOK(LABEL)  \
>{  \
>  rtx fun, lr; \
> -lr = get_hard_reg_initial_val (Pmode, LR_REGNUM);
> \
> +lr = aarch64_return_addr_rtx (); \
>  fun = gen_rtx_SYMBOL_REF (Pmode, MCOUNT_NAME);   \
>  emit_library_call (fun, LCT_NORMAL, VOIDmode, lr, Pmode);
> \
>}
> -- 
> 2.17.1
> 

-- 


Re: [committed] d: Merge upstream dmd 90450f3ef.

2020-06-26 Thread Iain Buclaw via Gcc-patches
Hi,

This patch has been backported, bootstrapped and regression tested on the
releases/gcc-9 and releases/gcc-10 branches.

Regards
Iain.

On 25/06/2020 17:41, Iain Buclaw wrote:
> Hi,
> 
> This patch merges the D front-end implementation with upstream dmd
> 90450f3ef.  Fixes a regression caused by an incomplete backport of
> converting the Expression semantic pass to a Visitor.
> 
> Bootstrapped and regression tested on x86_64-linux-gnu, and committed to
> mainline.
> 
> Regards
> Iain.
> 
> ---
> gcc/d/ChangeLog:
> 
>   PR d/95250
>   * dmd/MERGE: Merge upstream dmd 90450f3ef.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR d/95250
>   * gdc.dg/pr95250.d: New test.
> ---
>  gcc/d/dmd/MERGE|  2 +-
>  gcc/d/dmd/expressionsem.c  |  1 +
>  gcc/testsuite/gdc.dg/pr95250.d | 18 ++
>  3 files changed, 20 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gdc.dg/pr95250.d
> 
> diff --git a/gcc/d/dmd/MERGE b/gcc/d/dmd/MERGE
> index 0e48f42a0e2..0d50149a750 100644
> --- a/gcc/d/dmd/MERGE
> +++ b/gcc/d/dmd/MERGE
> @@ -1,4 +1,4 @@
> -5fc1806cd7dd281e944022df2e11ef6b04ee4339
> +90450f3ef6ab8551b5f383d8c6190f80034dbf93
>  
>  The first line of this file holds the git revision number of the last
>  merge done from the dlang/dmd repository.
> diff --git a/gcc/d/dmd/expressionsem.c b/gcc/d/dmd/expressionsem.c
> index e3a5cb36a82..ac6b5bc81f3 100644
> --- a/gcc/d/dmd/expressionsem.c
> +++ b/gcc/d/dmd/expressionsem.c
> @@ -6883,6 +6883,7 @@ public:
>  if (Expression *ex = binSemanticProp(exp, sc))
>  {
>  result = ex;
> +return;
>  }
>  Expression *e = exp->op_overload(sc);
>  if (e)
> diff --git a/gcc/testsuite/gdc.dg/pr95250.d b/gcc/testsuite/gdc.dg/pr95250.d
> new file mode 100644
> index 000..c8810c41968
> --- /dev/null
> +++ b/gcc/testsuite/gdc.dg/pr95250.d
> @@ -0,0 +1,18 @@
> +// https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95250
> +// { dg-do compile }
> +
> +template Unsigned(T)
> +{
> +static assert(false, "Type " ~ T.stringof ~
> +  " does not have an Unsigned counterpart");
> +}
> +
> +
> +void* f95250(T)(T a, T b)
> +{
> +alias UnsignedVoid = Unsigned!(T);
> +return cast(T)(cast(T)(cast(UnsignedVoid)(a-b) / 2));
> +}
> +
> +static assert(is(typeof(f!(void*)(null, null)) == void*));
> +// { dg-error "static assert  \(.*\) is false" "" { target *-*-* } .-1 }
> 


Re: [committed] d: Fix ICE in uda_attribute_p when looking up unknown attribute

2020-06-26 Thread Iain Buclaw via Gcc-patches
Hi,

This patch has been backported, bootstrapped and regression tested on the
releases/gcc-9 and releases/gcc-10 branches.

Regards
Iain.

On 25/06/2020 17:38, Iain Buclaw wrote:
> Hi,
> 
> This patch fixes an ICE in uda_attribute_p when looking up an unknown
> attribute.  The target attribute table is not guaranteed to be set by
> all backends.
> 
> Bootstrapped and regression tested on x86_64-linux-gnu, and committed to
> mainline.
> 
> Regards
> Iain.
> 
> ---
> gcc/d/ChangeLog:
> 
>   PR d/95173
>   * d-attribs.cc (uda_attribute_p): Don't search target attribute table
>   if NULL.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR d/95173
>   * gdc.dg/pr95173.d: New test.
> ---
>  gcc/d/d-attribs.cc |  9 ++---
>  gcc/testsuite/gdc.dg/pr95173.d | 10 ++
>  2 files changed, 16 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gdc.dg/pr95173.d
> 
> diff --git a/gcc/d/d-attribs.cc b/gcc/d/d-attribs.cc
> index 964f59f96f4..f4086c0f0ee 100644
> --- a/gcc/d/d-attribs.cc
> +++ b/gcc/d/d-attribs.cc
> @@ -216,10 +216,13 @@ uda_attribute_p (const char *name)
>   return true;
>  }
>  
> -  for (const attribute_spec *p = targetm.attribute_table; p->name; p++)
> +  if (targetm.attribute_table)
>  {
> -  if (get_identifier (p->name) == ident)
> - return true;
> +  for (const attribute_spec *p = targetm.attribute_table; p->name; p++)
> + {
> +   if (get_identifier (p->name) == ident)
> + return true;
> + }
>  }
>  
>return false;
> diff --git a/gcc/testsuite/gdc.dg/pr95173.d b/gcc/testsuite/gdc.dg/pr95173.d
> new file mode 100644
> index 000..2a4b2ed8232
> --- /dev/null
> +++ b/gcc/testsuite/gdc.dg/pr95173.d
> @@ -0,0 +1,10 @@
> +// https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95173
> +// { dg-do compile }
> +// { dg-options "-Wattributes" }
> +
> +import gcc.attribute;
> +
> +@attribute("foo") // { dg-warning "unknown attribute .foo." }
> +void f95173() 
> +{
> +}
> 


Re: [PATCH] rs6000: Add support for __builtin_cpu_is ("power10")

2020-06-26 Thread Segher Boessenkool
On Fri, Jun 26, 2020 at 03:52:33AM -0400, Michael Meissner wrote:
> On Thu, Jun 25, 2020 at 06:54:26PM -0500, Segher Boessenkool wrote:
> > On Thu, Jun 25, 2020 at 06:36:51PM -0500, Peter Bergner wrote:
> > >   * config/rs6000/rs6000-call.c (cpu_is_info) : New.
> > >   : Remove unneeded ','.
> > 
> > The comma helps making the diff less for future additions (and, makes
> > merging/refactoring easier, that way).  A trailing comma was not allowed
> > with older C standards (or just with some implementations?), but it
> > should be fine with C++11 as we require now.  Is there something I am
> > missing here?
> 
> A trailing comma has always been allowed for structure and array
> initializations.  Where it is not allowed is for enumeration names.

It is allowed even there since C99.  But GCC never required C99 for the
host compiler, and C++ didn't get this feature until C++11, which we
require since a little over a month now!  Freedom at last!  ;-)


Segher


Re: [PATCH] coroutines: Handle awaiters that are sub-objects [PR95736]

2020-06-26 Thread Nathan Sidwell

On 6/26/20 9:02 AM, Iain Sandoe wrote:

Hi

During development, it seemed a reasonable strategy to defer
decisions on what needed to be captured in the coroutine
frame (to minimize the frame size).  In the case of awaitables
that are user-variables, params (or otherwise originate outside
the coroutine scope), that turns out to make things more
complicated than necessary (leading to the omission that results
in the PR).

So the revised solution is to deal with these cases when the await
expression is built.

tested on x86-64-linux, darwin, powerpc64-linux
OK for master and 10.2?
thanks
Iain


ok.  please consistify 'initializer' spelling, you have at least one 
'initialiser'






Move deciding on initializers for awaitables to the build of the
co_await, this allows us to analyse cases that do not need
a temporary at that point.

As the PR shows, the late analysis meant that we  were not
checking properly for the case that an awaiter is a sub-object
of an existing variable outside the current function scope (and
therefore does not need to be duplicated in the frame).

gcc/cp/ChangeLog:

PR c++/95736
* coroutines.cc (get_awaitable_var): New helper.
(build_co_await): Check more carefully before
copying an awaitable.
(expand_one_await_expression): No initializer
is required when the awaitable is not a temp.
(register_awaits): Remove handling that is now
completed when the await expression is built.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr95736.C: New test.
---
  gcc/cp/coroutines.cc  | 112 --
  gcc/testsuite/g++.dg/coroutines/pr95736.C |  84 
  2 files changed, 166 insertions(+), 30 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/coroutines/pr95736.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 8b8d00e8e0c..312d6a1f77b 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -740,6 +740,30 @@ enum suspend_point_kind {
FINAL_SUSPEND_POINT
  };
  
+/* Helper function to build a named variable for the temps we use for each

+   await point.  The root of the name is determined by SUSPEND_KIND, and
+   the variable is of type V_TYPE.  The awaitable number is reset each time
+   we encounter a final suspend.  */
+
+static tree
+get_awaitable_var (suspend_point_kind suspend_kind, tree v_type)
+{
+  static int awn = 0;
+  char *buf;
+  switch (suspend_kind)
+{
+  default: buf = xasprintf ("Aw%d", awn++); break;
+  case CO_YIELD_SUSPEND_POINT: buf =  xasprintf ("Yd%d", awn++); break;
+  case INITIAL_SUSPEND_POINT: buf =  xasprintf ("Is"); break;
+  case FINAL_SUSPEND_POINT: buf =  xasprintf ("Fs"); awn = 0; break;
+  }
+  tree ret = get_identifier (buf);
+  free (buf);
+  ret = build_lang_decl (VAR_DECL, ret, v_type);
+  DECL_ARTIFICIAL (ret) = true;
+  return ret;
+}
+
  /*  This performs [expr.await] bullet 3.3 and validates the interface 
obtained.
  It is also used to build the initial and final suspend points.
  
@@ -798,23 +822,57 @@ build_co_await (location_t loc, tree a, suspend_point_kind suspend_kind)

  return error_mark_node;
  
/* To complete the lookups, we need an instance of 'e' which is built from

- 'o' according to [expr.await] 3.4.  However, we don't want to materialize
- 'e' here (it might need to be placed in the coroutine frame) so we will
- make a temp placeholder instead.  If 'o' is a parameter or a local var,
- then we do not need an additional var (parms and local vars are already
- copied into the frame and will have lifetimes according to their original
- scope).  */
+ 'o' according to [expr.await] 3.4.
+
+ If we need to materialize this as a temporary, then that will have to be
+ 'promoted' to a coroutine frame var.  However, if the awaitable is a
+ user variable, parameter or comes from a scope outside this function,
+ then we must use it directly - or we will see unnecessary copies.
+
+ If o is a variable, find the underlying var.  */
tree e_proxy = STRIP_NOPS (o);
if (INDIRECT_REF_P (e_proxy))
  e_proxy = TREE_OPERAND (e_proxy, 0);
+  while (TREE_CODE (e_proxy) == COMPONENT_REF)
+{
+  e_proxy = TREE_OPERAND (e_proxy, 0);
+  if (INDIRECT_REF_P (e_proxy))
+   e_proxy = TREE_OPERAND (e_proxy, 0);
+  if (TREE_CODE (e_proxy) == CALL_EXPR)
+   {
+ /* We could have operator-> here too.  */
+ tree op = TREE_OPERAND (CALL_EXPR_FN (e_proxy), 0);
+ if (DECL_OVERLOADED_OPERATOR_P (op)
+ && DECL_OVERLOADED_OPERATOR_IS (op, COMPONENT_REF))
+   {
+ e_proxy = CALL_EXPR_ARG (e_proxy, 0);
+ STRIP_NOPS (e_proxy);
+ gcc_checking_assert (TREE_CODE (e_proxy) == ADDR_EXPR);
+ e_proxy = TREE_OPERAND (e_proxy, 0);
+   }
+   }
+  STRIP_NOPS (e_proxy);
+}
+
+  /* Only build a temporary if we need it.  */
if 

[PATCH] coroutines: Handle awaiters that are sub-objects [PR95736]

2020-06-26 Thread Iain Sandoe
Hi

During development, it seemed a reasonable strategy to defer
decisions on what needed to be captured in the coroutine
frame (to minimize the frame size).  In the case of awaitables
that are user-variables, params (or otherwise originate outside
the coroutine scope), that turns out to make things more
complicated than necessary (leading to the omission that results
in the PR).

So the revised solution is to deal with these cases when the await
expression is built.

tested on x86-64-linux, darwin, powerpc64-linux
OK for master and 10.2?
thanks
Iain



Move deciding on initializers for awaitables to the build of the
co_await, this allows us to analyse cases that do not need
a temporary at that point.

As the PR shows, the late analysis meant that we  were not
checking properly for the case that an awaiter is a sub-object
of an existing variable outside the current function scope (and
therefore does not need to be duplicated in the frame).

gcc/cp/ChangeLog:

PR c++/95736
* coroutines.cc (get_awaitable_var): New helper.
(build_co_await): Check more carefully before
copying an awaitable.
(expand_one_await_expression): No initializer
is required when the awaitable is not a temp.
(register_awaits): Remove handling that is now
completed when the await expression is built.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr95736.C: New test.
---
 gcc/cp/coroutines.cc  | 112 --
 gcc/testsuite/g++.dg/coroutines/pr95736.C |  84 
 2 files changed, 166 insertions(+), 30 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/coroutines/pr95736.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 8b8d00e8e0c..312d6a1f77b 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -740,6 +740,30 @@ enum suspend_point_kind {
   FINAL_SUSPEND_POINT
 };
 
+/* Helper function to build a named variable for the temps we use for each
+   await point.  The root of the name is determined by SUSPEND_KIND, and
+   the variable is of type V_TYPE.  The awaitable number is reset each time
+   we encounter a final suspend.  */
+
+static tree
+get_awaitable_var (suspend_point_kind suspend_kind, tree v_type)
+{
+  static int awn = 0;
+  char *buf;
+  switch (suspend_kind)
+{
+  default: buf = xasprintf ("Aw%d", awn++); break;
+  case CO_YIELD_SUSPEND_POINT: buf =  xasprintf ("Yd%d", awn++); break;
+  case INITIAL_SUSPEND_POINT: buf =  xasprintf ("Is"); break;
+  case FINAL_SUSPEND_POINT: buf =  xasprintf ("Fs"); awn = 0; break;
+  }
+  tree ret = get_identifier (buf);
+  free (buf);
+  ret = build_lang_decl (VAR_DECL, ret, v_type);
+  DECL_ARTIFICIAL (ret) = true;
+  return ret;
+}
+
 /*  This performs [expr.await] bullet 3.3 and validates the interface obtained.
 It is also used to build the initial and final suspend points.
 
@@ -798,23 +822,57 @@ build_co_await (location_t loc, tree a, 
suspend_point_kind suspend_kind)
 return error_mark_node;
 
   /* To complete the lookups, we need an instance of 'e' which is built from
- 'o' according to [expr.await] 3.4.  However, we don't want to materialize
- 'e' here (it might need to be placed in the coroutine frame) so we will
- make a temp placeholder instead.  If 'o' is a parameter or a local var,
- then we do not need an additional var (parms and local vars are already
- copied into the frame and will have lifetimes according to their original
- scope).  */
+ 'o' according to [expr.await] 3.4.
+
+ If we need to materialize this as a temporary, then that will have to be
+ 'promoted' to a coroutine frame var.  However, if the awaitable is a
+ user variable, parameter or comes from a scope outside this function,
+ then we must use it directly - or we will see unnecessary copies.
+
+ If o is a variable, find the underlying var.  */
   tree e_proxy = STRIP_NOPS (o);
   if (INDIRECT_REF_P (e_proxy))
 e_proxy = TREE_OPERAND (e_proxy, 0);
+  while (TREE_CODE (e_proxy) == COMPONENT_REF)
+{
+  e_proxy = TREE_OPERAND (e_proxy, 0);
+  if (INDIRECT_REF_P (e_proxy))
+   e_proxy = TREE_OPERAND (e_proxy, 0);
+  if (TREE_CODE (e_proxy) == CALL_EXPR)
+   {
+ /* We could have operator-> here too.  */
+ tree op = TREE_OPERAND (CALL_EXPR_FN (e_proxy), 0);
+ if (DECL_OVERLOADED_OPERATOR_P (op)
+ && DECL_OVERLOADED_OPERATOR_IS (op, COMPONENT_REF))
+   {
+ e_proxy = CALL_EXPR_ARG (e_proxy, 0);
+ STRIP_NOPS (e_proxy);
+ gcc_checking_assert (TREE_CODE (e_proxy) == ADDR_EXPR);
+ e_proxy = TREE_OPERAND (e_proxy, 0);
+   }
+   }
+  STRIP_NOPS (e_proxy);
+}
+
+  /* Only build a temporary if we need it.  */
   if (TREE_CODE (e_proxy) == PARM_DECL
-  || (VAR_P (e_proxy) && (!DECL_ARTIFICIAL (e_proxy)
- || DECL_HAS_VALUE_EXPR_P 

[pushed] coroutines: Handle bad g-r-o-o-a-f cases.

2020-06-26 Thread Iain Sandoe
Hi

This fixes an ice-on-invalid I found while working on other
cases.

tested on x86_64-linux, darwin, powerpc64-linux
applied to master as obvious/trivial.
thanks
Iain

---

If we see a get_return_object_on_allocation_failure in the
promise, we expect to be able to use it.  If this isn't
possible (because of some error in the declaration) then we
need to handle the erroneous return to allow following code
to complete.

gcc/cp/ChangeLog:

* coroutines.cc (morph_fn_to_coro): Handle error
returns in building g-r-o-o-a-f expressions.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/coro1-allocators.h (BAD_GROOAF_STATIC):
New.
* g++.dg/coroutines/coro-bad-grooaf-00-static.C: New test.
---
 gcc/cp/coroutines.cc  |  5 +
 .../g++.dg/coroutines/coro-bad-grooaf-00-static.C | 15 +++
 .../g++.dg/coroutines/coro1-allocators.h  |  5 -
 3 files changed, 24 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/coroutines/coro-bad-grooaf-00-static.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 64b97535c8d..ad276592231 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -3932,6 +3932,11 @@ morph_fn_to_coro (tree orig, tree *resumer, tree 
*destroyer)
NULL_TREE, LOOKUP_NORMAL, NULL,
tf_warning_or_error);
 
+  /* ... but if that fails, returning an error, the later stages can't handle
+ the erroneous expression, so we reset the call as if it was absent.  */
+  if (grooaf == error_mark_node)
+grooaf = NULL_TREE;
+
   /* Allocate the frame, this has several possibilities:
  [dcl.fct.def.coroutine] / 9 (part 1)
  The allocation function’s name is looked up in the scope of the promise
diff --git a/gcc/testsuite/g++.dg/coroutines/coro-bad-grooaf-00-static.C 
b/gcc/testsuite/g++.dg/coroutines/coro-bad-grooaf-00-static.C
new file mode 100644
index 000..e7d04346d57
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/coro-bad-grooaf-00-static.C
@@ -0,0 +1,15 @@
+/* g-r-o-o-a-f should be static.  */
+
+#define BAD_GROOAF_STATIC
+#define PROVIDE_GROOAF
+#include "coro1-allocators.h"
+
+int used_grooaf = 0;
+
+struct coro1
+f () noexcept
+{
+  PRINT ("coro1: about to return");
+  co_return;
+} // { dg-error {cannot call member function 'coro1 
coro1::promise_type::get_return_object_on_allocation_failure\(\)' without 
object} }
+
diff --git a/gcc/testsuite/g++.dg/coroutines/coro1-allocators.h 
b/gcc/testsuite/g++.dg/coroutines/coro1-allocators.h
index 3d869106006..f7a85e9e671 100644
--- a/gcc/testsuite/g++.dg/coroutines/coro1-allocators.h
+++ b/gcc/testsuite/g++.dg/coroutines/coro1-allocators.h
@@ -172,8 +172,11 @@ struct coro1 {
   }
 #endif
 
+#ifndef BAD_GROOAF_STATIC
+# define BAD_GROOAF_STATIC static
+#endif
 #ifdef PROVIDE_GROOAF
-  static coro1 get_return_object_on_allocation_failure () noexcept {
+  BAD_GROOAF_STATIC coro1 get_return_object_on_allocation_failure () noexcept {
 PRINT ("alloc fail return");
 used_grooaf++;
 return coro1 (nullptr);
-- 
2.24.1




[PATCH] tree-optimization/95839 - teach SLP vectorization about vector inputs

2020-06-26 Thread Richard Biener
(sorry for the duplicate, forgot to copy the list)

This teaches SLP analysis about vector typed externals that are
fed into the SLP operations via lane extracting BIT_FIELD_REFs.
It shows that there's currently no good representation for
vector code on the SLP side so I went a half way and represent
such vector externals uses always using a SLP permutation node
with a single external SLP child which has a non-standard
representation of no scalar defs but only a vector def.  That
works best for shielding the rest of the vectorizer from it.

I'm not sure it's actually worth the trouble and what real-world
cases benefit from this.  In theory vectorized unrolled code
interfacing with scalar code might be one case but there
we necessarily go through memory and there's no intermediate
pass transforming that to registers [to make BB vectorization
cheaper].   

It's also not even close to ready for re-vectorizing vectorized 
code with a larger VF.

Any opinions?   

Bootstrapped / tested on x86_64-unknown-linux-gnu.

Thanks, 
Richard. 

2020-06-26  Richard Biener  

PR tree-optimization/95839
* tree-vect-slp.c (vect_slp_tree_uniform_p): Pre-existing
vectors are not uniform.
(vect_build_slp_tree_1): Handle BIT_FIELD_REFs of
vector registers.
(vect_build_slp_tree_2): For groups of lane extracts
from a vector register generate a permute node
with a special child representing the pre-existing vector.
(vect_prologue_cost_for_slp): Pre-existing vectors cost nothing.
(vect_slp_analyze_node_operations): Use SLP_TREE_LANES.
(vectorizable_slp_permutation): Do not generate or cost identity
permutes.
(vect_schedule_slp_instance): Handle pre-existing vector
that are function arguments.

* gcc.dg/vect/bb-slp-pr95839-2.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-pr95839-2.c |  20 
 gcc/tree-vect-slp.c  | 119 ---
 2 files changed, 124 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-pr95839-2.c

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr95839-2.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-pr95839-2.c
new file mode 100644
index 000..49e75d8c95c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr95839-2.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_double } */
+/* { dg-additional-options "-w -Wno-psabi" } */
+
+typedef double __attribute__((vector_size(16))) v2df;
+
+v2df f(v2df a, v2df b)
+{
+  return (v2df){a[0] + b[0], a[1] + b[1]};
+}
+
+v2df g(v2df a, v2df b)
+{
+  return (v2df){a[0] + b[1], a[1] + b[0]};
+}
+
+/* Verify we manage to vectorize this with using the original vectors
+   and do not end up with any vector CTORs.  */
+/* { dg-final { scan-tree-dump-times "basic block vectorized" 2 "slp2" } } */
+/* { dg-final { scan-tree-dump-not "vect_cst" "slp2" } } */
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index b223956e3af..83ec382ee0d 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -247,6 +247,10 @@ vect_slp_tree_uniform_p (slp_tree node)
   gcc_assert (SLP_TREE_DEF_TYPE (node) == vect_constant_def
  || SLP_TREE_DEF_TYPE (node) == vect_external_def);
 
+  /* Pre-exsting vectors.  */
+  if (SLP_TREE_SCALAR_OPS (node).is_empty ())
+return false;
+
   unsigned i;
   tree op, first = NULL_TREE;
   FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_OPS (node), i, op)
@@ -838,7 +842,7 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char *swap,
   else
{
  rhs_code = gimple_assign_rhs_code (stmt);
- load_p = TREE_CODE_CLASS (rhs_code) == tcc_reference;
+ load_p = gimple_vuse (stmt);
}
 
   /* Check the operation.  */
@@ -899,6 +903,22 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char 
*swap,
   need_same_oprnds = true;
   first_op1 = gimple_assign_rhs2 (stmt);
 }
+ else if (!load_p
+  && rhs_code == BIT_FIELD_REF)
+   {
+ tree vec = TREE_OPERAND (gimple_assign_rhs1 (stmt), 0);
+ if (TREE_CODE (vec) != SSA_NAME
+ || !types_compatible_p (vectype, TREE_TYPE (vec)))
+   {
+ if (dump_enabled_p ())
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+"Build SLP failed: "
+"BIT_FIELD_REF not supported\n");
+ /* Fatal 

Re: [patch] Take into account range info to optimize range tests

2020-06-26 Thread Richard Biener via Gcc-patches
On Thu, Jun 25, 2020 at 1:07 PM Eric Botcazou  wrote:
>
> Hi,
>
> ...into bit tests, as done by optimize_range_tests_to_bit_test of the reassoc
> pass.  The patch is aimed at addressing the following two issues:
>
>   1. In order to protect the shift operation from undefinedness, the new bit
> test is guarded with a new test, but this new test uses the range of the bit
> test values, not that of the shift operation so, if the input is in the range
> of the shift operation but not of the bit test values, then the subsequent VRP
> pass cannot eliminate the new test.  Moreover changing the new test to use the
> range of the shift operation, instead of that of the bit test values, in the
> general case would pessimize the cases which are in between.
>
>   2. If the new test can be eliminated, then it becomes profitable to do the
> optimization into a bit test for one fewer comparison in the source code.
>
> Therefore the patch changes optimize_range_tests_to_bit_test to use the range
> info of the input in order to eliminate the new test if possible.
>
> Tested on x86-64/Linux, OK for the mainline?

OK.

Thanks,
Richard.

>
> 2020-06-25  Eric Botcazou  
>
> * tree-ssa-reassoc.c (dump_range_entry): New function.
> (debug_range_entry): New debug function.
> (update_range_test): Invoke dump_range_entry for dumping.
> (optimize_range_tests_to_bit_test): Merge the entry test in the bit
> test when possible and lower the profitability threshold in this case.
>
>
> 2020-06-25  Eric Botcazou  
>
> * exp_ch4.adb (Expand_Set_Membership): Expand the membership test
> using left associativity instead of right associativity.
>
>
> 2020-06-25  Eric Botcazou  
>
> * gnat.dg/opt86_pkg.ads: New helper.
> * gnat.dg/opt86a.adb: New test.
> * gnat.dg/opt86b.adb: Likewise.
> * gnat.dg/opt86c.adb: Likewise.
>
> --
> Eric Botcazou


RE: [PATCH] [arm] Don't generate invalid LDRD insns

2020-06-26 Thread Alex Coplan
> -Original Message-
> From: Gcc-patches  On Behalf Of Alex
> Coplan
> Sent: 18 May 2020 16:37
> To: Kyrylo Tkachov ; gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw ; nd ; Ramana
> Radhakrishnan 
> Subject: RE: [PATCH] [arm] Don't generate invalid LDRD insns
> 
> > -Original Message-
> > From: Kyrylo Tkachov 
> > Sent: 15 May 2020 11:57
> > To: Alex Coplan ; gcc-patches@gcc.gnu.org
> > Cc: nd ; ni...@redhat.com; Richard Earnshaw
> > ; Ramana Radhakrishnan
> 
> > Subject: RE: [PATCH] [arm] Don't generate invalid LDRD insns
> >
> > Hi Alex,
> >
> > > -Original Message-
> > > From: Alex Coplan 
> > > Sent: 15 May 2020 11:36
> > > To: gcc-patches@gcc.gnu.org
> > > Cc: nd ; ni...@redhat.com; Richard Earnshaw
> > > ; Ramana Radhakrishnan
> > > ; Kyrylo Tkachov
> > > 
> > > Subject: [PATCH] [arm] Don't generate invalid LDRD insns
> > >
> > > Hello,
> > >
> > > This patch fixes a bug in the arm backend where GCC generates invalid
> LDRD
> > > instructions. The LDRD instruction requires the first transfer
> register to be
> > > even, but GCC attempts to use odd registers here. For example, with
> the
> > > following C code:
> > >
> > > struct c {
> > >   double a;
> > > } __attribute((aligned)) __attribute((packed));
> > > struct c d;
> > > struct c f(struct c);
> > > void e() { f(d); }
> > >
> > > The struct d is passed in registers r1 and r2 to the function f, and
> GCC
> > > attempts to do this with a LDRD instruction when compiling with -
> > > march=armv7-a
> > > on a soft float toolchain.
> > >
> > > The fix is analogous to the corresponding one for STRD in the same
> function:
> > >
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=52057dc4ac5295caebf83147f6
> > > 88d769c93cbc8d
> > >
> > > Testing:
> > >  - New unit tests which pass after applying the patch.
> > >  - Tested on an x64 -> arm-none-eabi cross.
> > >  - Bootstrapped and regtested on arm-none-linux-gnueabihf (in both
> thumb
> > > and arm
> > >modes).
> > >
> > > OK for master?
> >
> > Ok.
> > Please apply for write-after-approval commit access to the repo by
> filling the
> > form at:
> > https://sourceware.org/cgi-bin/pdw/ps_form.cgi
> > listing me as your sponsor.
> > You can then push the patch yourself.
> 
> OK, I've pushed the patch to master.
> 
> Thanks,
> Alex

This patch applies cleanly on gcc-{8,9,10} branches. I've run bootstrap
and regression tests (both thumb and arm) with the patch applied to the
branches and those came back clean.

OK to backport to the branches?

Thanks,
Alex

> 
> >
> > Thanks,
> > Kyrill
> >
> > >
> > > Thanks,
> > > Alex
> > >
> > > ---
> > >
> > > gcc/ChangeLog:
> > >
> > > 2020-05-15  Alex Coplan  
> > > * config/arm/arm.c (output_move_double): Fix codegen when
> loading
> > > into
> > > a register pair with an odd base register.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > 2020-05-15  Alex Coplan  
> > > * gcc.c-torture/compile/packed-aligned-1.c: New test.
> > > * gcc.c-torture/execute/packed-aligned.c: New test.
> >




Re: [PATCH 07/13] OpenACC 2.6 deep copy: libgomp parts

2020-06-26 Thread Thomas Schwinge
Hi Julian!

On 2019-12-17T22:03:47-0800, Julian Brown  wrote:
> This part contains the libgomp runtime support for the GOMP_MAP_ATTACH and
> GOMP_MAP_DETACH mapping kinds (etc.), as introduced by the front-end
> patches following in this series.

> --- a/libgomp/target.c
> +++ b/libgomp/target.c

> @@ -1534,6 +1571,18 @@ gomp_unmap_vars_internal (struct target_mem_desc *tgt, 
> bool do_copyfrom,

This is the code path at the end of a structured OpenACC 'data'
construct.

> +  /* We must perform detachments before any copies back to the host.  */
> +  for (i = 0; i < tgt->list_count; i++)
> +{
> +  splay_tree_key k = tgt->list[i].key;
> +
> +  if (k != NULL && tgt->list[i].do_detach)
> + gomp_detach_pointer (devicep, aq, k, tgt->list[i].key->host_start
> +  + tgt->list[i].offset,
> +  k->refcount == 1, NULL);
> +}

Can you please explain (as a source code comment) the logic for here
using 'k->refcount == 1' for the 'bool finalize' parameter of
'gomp_detach_pointer'; this somehow feels "strange"?

Nonwithstanding the question whether that's a valid thing to do or not,
but doesn't the current code hide the "attach count underflow" error if
you reach the above code with 'attach_count == 0' (user already
explicitly 'detach'ed), but then given 'k->refcount == 1' (thus
'finalize' semantics), 'gomp_detach_pointer' will then re-initialize
'attach_count = 1', and then do another 'gomp_copy_host2dev', etc.
instead of emitting an error.

(I have not attempted to produce a libgomp test case.)

Shouldn't this just always be 'finalize = false' given that there is no
'finalize' semantics for 'detach' on a structured OpenACC 'data'
constructs -- at least that's what I remember right now?


Grüße
 Thomas
-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter


Re: RFA: Remove use of register keyword in libiberty.h

2020-06-26 Thread Nick Clifton via Gcc-patches
Hi Guys,

>> include/ChangeLog
>> 2020-06-25  Nick Clifton  
>>
>>  * libiberty.h (bsearch_r): Remove use of the register keyword from
>>  the prototype.
>>
>> libiberty/ChangeLog
>> 2020-06-25  Nick Clifton  
>>
>>  * bsearch.c (bsearch): Remove use of register keyword.
>>  * bsearch_r.c (bsearch_r): Likewise.
> 
> Sure, this is fine.

Committed.

Cheers
  Nick




Re: [PATCH] libgomp: added simple functions and tests for OMPD

2020-06-26 Thread Jakub Jelinek via Gcc-patches
Thanks, just nits, no need to repost, just commit it after you make those
changes.

On Thu, Jun 25, 2020 at 05:58:23PM -0400, y2s1982 via Gcc-patches wrote:

>   * Makefile.am(toolexeclib_LTLIBRARIES): Add libgompd.la.

Missing space between am and (.

> (libgompd_la_LDFLAGS, libgompd_la_DEPENDENCIES, libgompd_la_LINK,
> libgompd_la_SOURCES, libgompd_version_dep, libgompd_version_script,
> libgompd.ver-sun, libgompd.ver, libgompd_version_info): Set.

These lines are wrongly indented, should use tab instead of 8 spaces.

>   * env.c(ompd_dll_locations_valid): Define with no compiler optimization.

Again, missing space.

> --- /dev/null
> +++ b/libgomp/libgompd.map
> @@ -0,0 +1,49 @@
> +OMPD_5.0 {
> +  global:
> + ompd_dll_locations_valid;

ompd_dll_locations_valid is exported from libgomp.so.1, not libgompd.so.1,
so doesn't belong here.

> +ompd_rc_t
> +ompd_get_version_string (const char **string)
> +{
  *string = "GNU OpenMP Runtime implementing OpenMP 5.0 "
ompd_stringify (OMPD_VERSION);

Please align ompd_stringify below "GNU.

Jakub



[PATCH] tree-optimization/95897 - fix fold-left SLP reduction insert place

2020-06-26 Thread Richard Biener
This fixes computation of the insertion place for fold-left SLP
reductions where the PHIs do not have vectorized stmts.  The
SLP representation isn't perfect here thus the following.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

2020-06-26  Richard Biener  

PR tree-optimization/95897
* tree-vectorizer.h (vectorizable_induction): Remove
unused gimple_stmt_iterator * parameter.
* tree-vect-loop.c (vectorizable_induction): Likewise.
(vect_analyze_loop_operations): Adjust.
* tree-vect-stmts.c (vect_analyze_stmt): Likewise.
(vect_transform_stmt): Likewise.
* tree-vect-slp.c (vect_schedule_slp_instance): Adjust
for fold-left reductions, clarify existing reduction case.

* gcc.dg/vect/pr95897.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/pr95897.c | 13 +
 gcc/tree-vect-loop.c|  3 +--
 gcc/tree-vect-slp.c | 25 +++--
 gcc/tree-vect-stmts.c   |  4 ++--
 gcc/tree-vectorizer.h   |  1 -
 5 files changed, 39 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr95897.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr95897.c 
b/gcc/testsuite/gcc.dg/vect/pr95897.c
new file mode 100644
index 000..a17b72dd040
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr95897.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+
+double foo (double x, int n)
+{
+  double s = 0.;
+  for (int i = 0; i < n; ++i)
+{
+  s += x;
+  s += x;
+  s += x;
+}
+  return s;
+}
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 08c9f119626..bc913eeeb36 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -1573,7 +1573,7 @@ vect_analyze_loop_operations (loop_vec_info loop_vinfo)
   if (STMT_VINFO_DEF_TYPE (stmt_info) == vect_induction_def
  && ! PURE_SLP_STMT (stmt_info))
ok = vectorizable_induction (loop_vinfo,
-stmt_info, NULL, NULL, NULL,
+stmt_info, NULL, NULL,
 _vec);
  else if ((STMT_VINFO_DEF_TYPE (stmt_info) == vect_reduction_def
|| (STMT_VINFO_DEF_TYPE (stmt_info)
@@ -7285,7 +7285,6 @@ vect_worthwhile_without_simd_p (vec_info *vinfo, 
tree_code code)
 bool
 vectorizable_induction (loop_vec_info loop_vinfo,
stmt_vec_info stmt_info,
-   gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED,
gimple **vec_stmt, slp_tree slp_node,
stmt_vector_for_cost *cost_vec)
 {
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 4d3de77988d..47498b1d417 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -4331,8 +4331,15 @@ vect_schedule_slp_instance (vec_info *vinfo,
   si = gsi_for_stmt (last_stmt_info->stmt);
 }
   else if (SLP_TREE_CHILDREN (node).is_empty ())
-/* This happens for reduction PHIs.  */
-si = gsi_for_stmt (vect_find_last_scalar_stmt_in_slp (node)->stmt);
+{
+  /* This happens for reduction and induction PHIs where we do not use the
+insertion iterator.  */
+  gcc_assert (STMT_VINFO_TYPE (SLP_TREE_REPRESENTATIVE (node))
+ == cycle_phi_info_type
+ || (STMT_VINFO_TYPE (SLP_TREE_REPRESENTATIVE (node))
+ == induc_vec_info_type));
+  si = gsi_none ();
+}
   else
 {
   /* Emit other stmts after the children vectorized defs which is
@@ -4341,6 +4348,20 @@ vect_schedule_slp_instance (vec_info *vinfo,
   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
if (SLP_TREE_DEF_TYPE (child) == vect_internal_def)
  {
+   /* For fold-left reductions we are retaining the scalar
+  reduction PHI but we still have SLP_TREE_NUM_VEC_STMTS
+  set so the representation isn't perfect.  Resort to the
+  last scalar def here.  */
+   if (SLP_TREE_VEC_STMTS (child).is_empty ())
+ {
+   gcc_assert (STMT_VINFO_TYPE (SLP_TREE_REPRESENTATIVE (child))
+   == cycle_phi_info_type);
+   gphi *phi = as_a 
+ (vect_find_last_scalar_stmt_in_slp (child)->stmt);
+   if (!last_stmt
+   || vect_stmt_dominates_stmt_p (last_stmt, phi))
+ last_stmt = phi;
+ }
/* We are emitting all vectorized stmts in the same place and
   the last one is the last.
   ???  Unless we have a load permutation applied and that
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index ab1e70a50c3..7cd39f0f23d 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -10549,7 +10549,7 @@ vect_analyze_stmt (vec_info *vinfo,
  || vectorizable_reduction (as_a  (vinfo), stmt_info,
   

Re: [PATCH] rs6000: Add support for __builtin_cpu_is ("power10")

2020-06-26 Thread Michael Meissner via Gcc-patches
On Thu, Jun 25, 2020 at 06:54:26PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Jun 25, 2020 at 06:36:51PM -0500, Peter Bergner wrote:
> > rs6000: Add support for __builtin_cpu_is ("power10")
> > 
> > Add support for __builtin_cpu_is ("power10").  Also add documentation for
> > the recently added "arch_3_1" and "mma" __builtin_cpu_supports arguments.
> 
> > gcc/
> > * config/rs6000/rs6000-call.c (cpu_is_info) : New.
> > : Remove unneeded ','.
> 
> The comma helps making the diff less for future additions (and, makes
> merging/refactoring easier, that way).  A trailing comma was not allowed
> with older C standards (or just with some implementations?), but it
> should be fine with C++11 as we require now.  Is there something I am
> missing here?

A trailing comma has always been allowed for structure and array
initializations.  Where it is not allowed is for enumeration names.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH] libgomp, fortran: Apply if clause to all sub-constructs in combined OpenMP constructs

2020-06-26 Thread Christophe Lyon via Gcc-patches
On Thu, 25 Jun 2020 at 15:24, Kwok Cheung Yeung  wrote:
>
> On 24/06/2020 6:29 pm, Tobias Burnus wrote:
> > Hi Kwok,
> >
> > the TODO is fixed by the attached patch; I would be happy if you could 
> > handle
> > this patch,
> > e.g. together with your patch – or as follow up.
> >
> > (Lightly tested only, i.e. it fixes the ICE but I did not
> > do a full testsuite run. But I regard it as obvious.)
>
> Hello
>
> I have committed your patch along with the testcase as 'obvious'. I have
> confirmed that it does not regress the gfortran and libgomp testsuites.
>

Hi,

I've noticed a regression since your commit, on arm aarch64 and x86:
for instance on arm-linux-gnueabi:
PASS: gfortran.dg/gomp/combined-if.f90   -O  (test for excess errors)
PASS: gfortran.dg/gomp/combined-if.f90   -O   scan-tree-dump-times
omplower "(?n)#pragma omp target.* if\\(" 9
gfortran.dg/gomp/combined-if.f90   -O  : pattern found 4 times
FAIL: gfortran.dg/gomp/combined-if.f90   -O   scan-tree-dump-times
omplower "(?n)#pragma omp simd.* if\\(" 7
PASS: gfortran.dg/gomp/combined-if.f90   -O   scan-tree-dump-times
omplower "(?n)#pragma omp parallel.* if\\(" 6

Not sure why you didn't see it?

Thanks,

Christophe

> Kwok
>