Re: use -mno-strict-align for strlenopt-80.c on powerpc

2021-03-11 Thread Richard Biener via Gcc-patches
On Wed, Mar 10, 2021 at 7:04 AM Alexandre Oliva  wrote:
>
>
> ppc configurations that have -mstrict-align enabled by default fail
> gcc.dg/strlenopt-80.c, because some memcpy calls don't get turned into
> MEM_REFs, which defeats the tested-for strlen optimization.
>
> This was regstrapped on x86_64-linux-gnu, tested with a cross to a
> ppc64-vxworks7r2 configured with -mstrict-align enabled by default,
> and I'm now also regstrapping on ppc64-linux-gnu just to be sure.
> Ok to install?

Btw, I asked during two stage1s whether relaxing the folding to also
apply to strict-align targets would be OK.  But I didn't receive any
answers and thus dropped the ball ...

Basically for the folding turning integer-mode sized copies into
separate load + store also allow the load/store to be not aligned
according to their mode (and rely on RTL expansion to perform
bitfield load/store ops for them).

Richard.

>
> for  gcc/testsuite/ChangeLog
>
> * gcc.dg/strlenopt-80.c: Add -mno-strict-align on ppc.
> ---
>  gcc/testsuite/gcc.dg/strlenopt-80.c |6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/gcc.dg/strlenopt-80.c 
> b/gcc/testsuite/gcc.dg/strlenopt-80.c
> index 9124fe4740a1b..9978f5cf3bc56 100644
> --- a/gcc/testsuite/gcc.dg/strlenopt-80.c
> +++ b/gcc/testsuite/gcc.dg/strlenopt-80.c
> @@ -5,7 +5,11 @@
> such a store.
> { dg-do compile { target aarch64*-*-* i?86-*-* powerpc*-*-* x86_64-*-* } }
>
> -   { dg-options "-O2 -Wall -fdump-tree-optimized" } */
> +   { dg-options "-O2 -Wall -fdump-tree-optimized" }
> +
> +   On powerpc configurations that have -mstrict-align by default,
> +   the memcpy calls for ncpylog >= 3 are not turned into MEM_REFs.
> +   { dg-additional-options "-mno-strict-align" { target powerpc*-*-* } }  */
>
>  #define CHAR_BIT  __CHAR_BIT__
>  #define SIZE_MAX  __SIZE_MAX__
>
>
> --
> Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
>Free Software Activist GNU Toolchain Engineer
> Vim, Vi, Voltei pro Emacs -- GNUlius Caesar


Re: ipa-sra is mostly disabled by -mlongcall

2021-03-11 Thread Richard Biener via Gcc-patches
On Thu, Mar 11, 2021 at 8:51 PM Segher Boessenkool
 wrote:
>
> On Thu, Mar 11, 2021 at 04:52:18AM -0300, Alexandre Oliva wrote:
> > Several ipa-sra tests fail because -mlongcall on powerpc is a
> > TYPE_ATTRIBUTE, and those disable ipa-sra.
> >
> > This local workaround disables -mlongcall for the failing tests on
> > powerpc*-*-vxworks*, so they get a chance to run even in kernel mode.
> >
> > (AdaCore has -mlongcall enabled for powerpc*-wrs-vxworks* kernel-mode
> > targets.)
> >
> > This was regstrapped on x86_64-linux-gnu and ppc64-linux-gnu, and tested
> > with a cross to a ppc64-vxworks7r2 with -mlongcall enabled by default.
> >
> > I was leaning towards maintaing this change internal, but I thought it
> > wouldn't hurt to ask whether it's ok to install.
> >
> > I wonder whether it wouldn't be more desirable to have an alternate
> > implementation of -mlongcall on ppc that didn't conflict with IPA-SRA,
> > and IIRC with some scenarios involving C++ templates.
>
> It has been this way for decades (well, since 2002, not *quite* multiple
> decades yet).  It also is almost never necessary to use: not many
> programs have more than 32MB in a single module (and you do not need
> this for cross-module calls).
>
> > E.g., on ARM, -mlong-calls doesn't add an attribute to every function,
> > it just changes a default, that attributes short_call and long_call may
> > then override for specific functions.  Would a similar implementation be
> > acceptable for ppc?
>
> That is how it works already!  See the first entry on
>   https://gcc.gnu.org/onlinedocs/gcc/PowerPC-Function-Attributes.html
>
> Given Martin's and Richard's replies, I think we should not take this
> patch.  Sorry.

You could dg-skip-if -mlongcall if the option is explicit or XFAIL on
some new target selector checking for its effect when it is enabled
by default somehow.

Of course given what Segher says I wonder why it's the default
on vxworks...

Richard.

>
> Segher


Re: [RFC] decay vect tests from run to link for pr95401

2021-03-11 Thread Richard Biener via Gcc-patches
On Thu, Mar 11, 2021 at 3:47 PM Alexandre Oliva  wrote:
>
> On Mar 11, 2021, Richard Biener  wrote:
>
> > I think that's OK.
>
> Cool, here's the patch I'm nearly done regstrapping on x86_64-linux-gnu
> and x-ppc64-vx7r2.  Ok to install?

OK.

Richard.

> > It's probably difficult to make the test UNSUPPORTED
> > when dg-additional-sources is discovered with a dg-do compile test?
>
> Well, first of all, I really don't like the idea of skipping a test if
> we can still get some useful information out of it.
>
> That said, I suppose we could test for additional_sources_used in
> ${langdriver}_target_compile proces in gcc.exp, g++.exp et al, between
> their calling dg-additional-files-options and target_compile, to
> conditionally skip the latter, or pass $type to the former in all
> $langdriver.exp, so that the extra files can be flagged and/or discarded
> in unsupported modes.  I believe such changes would also require
> adjustments to library test infrastructures.
>
>
> decay vect tests from run to link for pr95401
>
> When vect.exp finds our configuration disables altivec by default, it
> disables the execution of vectorization tests, assuming the test
> hardware doesn't support it.
>
> Tests become just compile tests, but compile tests won't work
> correctly when additional sources are named, e.g. pr95401.cc, because
> GCC refuses to compile multiple files into the same asm output.
>
> With this patch, the default for when execution is not possible
> becomes link.
>
>
> for  gcc/testsuite/ChangeLog
>
> * lib/target-supports.exp (check_vect_support_and_set_flags):
> Decay to link rather than compile.
> ---
>  gcc/testsuite/lib/target-supports.exp |   14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/gcc/testsuite/lib/target-supports.exp 
> b/gcc/testsuite/lib/target-supports.exp
> index 52d3d036d3c5c..f5b9b0578de37 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -9632,7 +9632,7 @@ proc check_vect_support_and_set_flags { } {
>  if [check_750cl_hw_available] {
>  set dg-do-what-default run
>  } else {
> -set dg-do-what-default compile
> +set dg-do-what-default link
>  }
>  } elseif [istarget powerpc*-*-*] {
>  # Skip targets not supporting -maltivec.
> @@ -9656,14 +9656,14 @@ proc check_vect_support_and_set_flags { } {
>  # Specify a cpu that supports VMX for compile-only tests.
>  lappend DEFAULT_VECTCFLAGS "-mcpu=970"
>  }
> -set dg-do-what-default compile
> +set dg-do-what-default link
>  }
>  } elseif { [istarget i?86-*-*] || [istarget x86_64-*-*] } {
>  lappend DEFAULT_VECTCFLAGS "-msse2"
>  if { [check_effective_target_sse2_runtime] } {
>  set dg-do-what-default run
>  } else {
> -set dg-do-what-default compile
> +set dg-do-what-default link
>  }
>  } elseif { [istarget mips*-*-*]
>&& [check_effective_target_nomips16] } {
> @@ -9682,7 +9682,7 @@ proc check_vect_support_and_set_flags { } {
>  if [check_effective_target_ultrasparc_hw] {
>  set dg-do-what-default run
>  } else {
> -set dg-do-what-default compile
> +set dg-do-what-default link
>  }
>  } elseif [istarget alpha*-*-*] {
>  # Alpha's vectorization capabilities are extremely limited.
> @@ -9695,7 +9695,7 @@ proc check_vect_support_and_set_flags { } {
>  if [check_alpha_max_hw_available] {
>  set dg-do-what-default run
>  } else {
> -set dg-do-what-default compile
> +set dg-do-what-default link
>  }
>  } elseif [istarget ia64-*-*] {
>  set dg-do-what-default run
> @@ -9708,7 +9708,7 @@ proc check_vect_support_and_set_flags { } {
>  if [is-effective-target arm_neon_hw] {
>  set dg-do-what-default run
>  } else {
> -set dg-do-what-default compile
> +set dg-do-what-default link
>  }
>  } elseif [istarget "aarch64*-*-*"] {
>  set dg-do-what-default run
> @@ -9729,7 +9729,7 @@ proc check_vect_support_and_set_flags { } {
>  set dg-do-what-default run
>  } else {
> lappend DEFAULT_VECTCFLAGS "-march=z14" "-mzarch"
> -set dg-do-what-default compile
> +set dg-do-what-default link
>  }
>  } elseif [istarget amdgcn-*-*] {
>  set dg-do-what-default run
>
>
> --
> Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
>Free Software Activist GNU Toolchain Engineer
> Vim, Vi, Voltei pro Emacs -- GNUlius Caesar


Re: [PATCH] x86: Update 'P' operand modifier for -fno-plt

2021-03-11 Thread Uros Bizjak via Gcc-patches
On Thu, Mar 11, 2021 at 11:22 PM H.J. Lu  wrote:
>
> Update 'P' operand modifier for -fno-plt to support inline assembly
> statements.  In 64-bit, we can always load function address with
> @GOTPCREL.  In 32-bit, we load function address with @GOT only for
> non-PIC since PIC register may not be available at call site.
>
> gcc/
>
> PR target/99504
> * config/i386/i386.c (ix86_print_operand): Update 'P' handling
> for -fno-plt.
>
> gcc/testsuite/
>
> PR target/99504
> * gcc.target/i386/pr99530-1.c: New test.
> * gcc.target/i386/pr99530-2.c: Likewise.
> * gcc.target/i386/pr99530-3.c: Likewise.
> * gcc.target/i386/pr99530-4.c: Likewise.
> * gcc.target/i386/pr99530-5.c: Likewise.
> * gcc.target/i386/pr99530-6.c: Likewise.
> ---
>  gcc/config/i386/i386.c| 33 +--
>  gcc/testsuite/gcc.target/i386/pr99530-1.c | 11 
>  gcc/testsuite/gcc.target/i386/pr99530-2.c | 11 
>  gcc/testsuite/gcc.target/i386/pr99530-3.c | 11 
>  gcc/testsuite/gcc.target/i386/pr99530-4.c | 11 
>  gcc/testsuite/gcc.target/i386/pr99530-5.c | 11 
>  gcc/testsuite/gcc.target/i386/pr99530-6.c | 11 
>  7 files changed, 97 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-4.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-5.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-6.c
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 260f87b..8733fcecf65 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -12701,7 +12701,8 @@ print_reg (rtx x, int code, FILE *file)
> y -- print "st(0)" instead of "st" as a register.
> d -- print duplicated register operand for AVX instruction.
> D -- print condition for SSE cmp instruction.
> -   P -- if PIC, print an @PLT suffix.
> +   P -- if PIC, print an @PLT suffix.  For -fno-plt, load function
> +   address from GOT.
> p -- print raw symbol name.
> X -- don't print any sort of PIC '@' suffix for a symbol.
> & -- print some in-use local-dynamic symbol name.
> @@ -13445,7 +13446,35 @@ ix86_print_operand (FILE *file, rtx x, int code)
>   x = const0_rtx;
> }
>
> -  if (code != 'P' && code != 'p')
> +  if (code == 'P')
> +   {
> + if (current_output_insn == NULL_RTX
> + && (TARGET_64BIT || (!flag_pic && HAVE_AS_IX86_GOT32X))
> + && !TARGET_PECOFF
> + && !TARGET_MACHO
> + && ix86_cmodel != CM_LARGE
> + && ix86_cmodel != CM_LARGE_PIC
> + && GET_CODE (x) == SYMBOL_REF
> + && SYMBOL_REF_FUNCTION_P (x)
> + && (!flag_plt
> + || (SYMBOL_REF_DECL (x)
> + && lookup_attribute ("noplt",
> +  DECL_ATTRIBUTES (SYMBOL_REF_DECL 
> (x)
> + && !SYMBOL_REF_LOCAL_P (x))

You can use ix86_force_load_from_GOT_p instead.

Uros.

> +   {
> + /* For inline assembly statement, load function address
> +from GOT with 'P' operand modifier to avoid PLT.
> +NB: This works only with call or jmp.  */
> + const char *xasm;
> + if (TARGET_64BIT)
> +   xasm = "{*%p0@GOTPCREL(%%rip)|[QWORD PTR %p0@GOTPCREL[rip]]}";
> + else
> +   xasm = "{*%p0@GOT|[DWORD PTR %p0@GOT]}";
> + output_asm_insn (xasm, );
> + return;
> +   }
> +   }
> +  else if (code != 'p')
> {
>   if (CONST_INT_P (x))
> {
> diff --git a/gcc/testsuite/gcc.target/i386/pr99530-1.c 
> b/gcc/testsuite/gcc.target/i386/pr99530-1.c
> new file mode 100644
> index 000..080d7cc9399
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr99530-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile { target { i?86-*-linux* x86_64-*-linux* } } } */
> +/* { dg-require-effective-target lp64 } */
> +/* { dg-options "-O2 -fpic -mcmodel=large -fno-plt" } */
> +/* { dg-final { scan-assembler-not "foo@GOTPCREL" } } */
> +
> +extern void foo (void);
> +void
> +bar (void)
> +{
> +  asm ("call %P0" : : "X" (foo));
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/pr99530-2.c 
> b/gcc/testsuite/gcc.target/i386/pr99530-2.c
> new file mode 100644
> index 000..9808957d624
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr99530-2.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile { target { i?86-*-linux* x86_64-*-linux* } } } */
> +/* { dg-require-effective-target lp64 } */
> +/* { dg-options "-O2 -fno-pic -mcmodel=large -fno-plt" } */
> +/* { dg-final { scan-assembler-not "foo@GOTPCREL" } } */
> +
> +extern void foo (void);
> 

Re: [PATCH] x86: Update 'P' operand modifier for -fno-plt

2021-03-11 Thread Uros Bizjak via Gcc-patches
On Thu, Mar 11, 2021 at 11:22 PM H.J. Lu  wrote:
>
> Update 'P' operand modifier for -fno-plt to support inline assembly
> statements.  In 64-bit, we can always load function address with
> @GOTPCREL.  In 32-bit, we load function address with @GOT only for
> non-PIC since PIC register may not be available at call site.
>
> gcc/
>
> PR target/99504
> * config/i386/i386.c (ix86_print_operand): Update 'P' handling
> for -fno-plt.
>
> gcc/testsuite/
>
> PR target/99504
> * gcc.target/i386/pr99530-1.c: New test.
> * gcc.target/i386/pr99530-2.c: Likewise.
> * gcc.target/i386/pr99530-3.c: Likewise.
> * gcc.target/i386/pr99530-4.c: Likewise.
> * gcc.target/i386/pr99530-5.c: Likewise.
> * gcc.target/i386/pr99530-6.c: Likewise.
> ---
>  gcc/config/i386/i386.c| 33 +--
>  gcc/testsuite/gcc.target/i386/pr99530-1.c | 11 
>  gcc/testsuite/gcc.target/i386/pr99530-2.c | 11 
>  gcc/testsuite/gcc.target/i386/pr99530-3.c | 11 
>  gcc/testsuite/gcc.target/i386/pr99530-4.c | 11 
>  gcc/testsuite/gcc.target/i386/pr99530-5.c | 11 
>  gcc/testsuite/gcc.target/i386/pr99530-6.c | 11 
>  7 files changed, 97 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-4.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-5.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-6.c
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 260f87b..8733fcecf65 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -12701,7 +12701,8 @@ print_reg (rtx x, int code, FILE *file)
> y -- print "st(0)" instead of "st" as a register.
> d -- print duplicated register operand for AVX instruction.
> D -- print condition for SSE cmp instruction.
> -   P -- if PIC, print an @PLT suffix.
> +   P -- if PIC, print an @PLT suffix.  For -fno-plt, load function
> +   address from GOT.
> p -- print raw symbol name.
> X -- don't print any sort of PIC '@' suffix for a symbol.
> & -- print some in-use local-dynamic symbol name.
> @@ -13445,7 +13446,35 @@ ix86_print_operand (FILE *file, rtx x, int code)
>   x = const0_rtx;
> }
>
> -  if (code != 'P' && code != 'p')
> +  if (code == 'P')
> +   {
> + if (current_output_insn == NULL_RTX
> + && (TARGET_64BIT || (!flag_pic && HAVE_AS_IX86_GOT32X))
> + && !TARGET_PECOFF
> + && !TARGET_MACHO
> + && ix86_cmodel != CM_LARGE
> + && ix86_cmodel != CM_LARGE_PIC
> + && GET_CODE (x) == SYMBOL_REF
> + && SYMBOL_REF_FUNCTION_P (x)
> + && (!flag_plt
> + || (SYMBOL_REF_DECL (x)
> + && lookup_attribute ("noplt",
> +  DECL_ATTRIBUTES (SYMBOL_REF_DECL 
> (x)
> + && !SYMBOL_REF_LOCAL_P (x))
> +   {
> + /* For inline assembly statement, load function address
> +from GOT with 'P' operand modifier to avoid PLT.
> +NB: This works only with call or jmp.  */
> + const char *xasm;
> + if (TARGET_64BIT)
> +   xasm = "{*%p0@GOTPCREL(%%rip)|[QWORD PTR %p0@GOTPCREL[rip]]}";
> + else
> +   xasm = "{*%p0@GOT|[DWORD PTR %p0@GOT]}";
> + output_asm_insn (xasm, );
> + return;

This should be handled in output_pic_addr_const.

Uros.

> +   }
> +   }
> +  else if (code != 'p')
> {
>   if (CONST_INT_P (x))
> {
> diff --git a/gcc/testsuite/gcc.target/i386/pr99530-1.c 
> b/gcc/testsuite/gcc.target/i386/pr99530-1.c
> new file mode 100644
> index 000..080d7cc9399
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr99530-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile { target { i?86-*-linux* x86_64-*-linux* } } } */
> +/* { dg-require-effective-target lp64 } */
> +/* { dg-options "-O2 -fpic -mcmodel=large -fno-plt" } */
> +/* { dg-final { scan-assembler-not "foo@GOTPCREL" } } */
> +
> +extern void foo (void);
> +void
> +bar (void)
> +{
> +  asm ("call %P0" : : "X" (foo));
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/pr99530-2.c 
> b/gcc/testsuite/gcc.target/i386/pr99530-2.c
> new file mode 100644
> index 000..9808957d624
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr99530-2.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile { target { i?86-*-linux* x86_64-*-linux* } } } */
> +/* { dg-require-effective-target lp64 } */
> +/* { dg-options "-O2 -fno-pic -mcmodel=large -fno-plt" } */
> +/* { dg-final { scan-assembler-not "foo@GOTPCREL" } } */
> +
> +extern void foo (void);
> 

RE: [PATCH] aarch64: Improve generic SVE tuning defaults

2021-03-11 Thread qia...@fujitsu.com
Hello Kyrill,

As you said, this patch is only effective for generic SVE tuning.
So, I will evaluate performance without -mcpu option on a64fx.
I'll tell you the result next week.

Regards,
Qian

> -Original Message-
> From: Kyrylo Tkachov 
> Sent: Wednesday, March 10, 2021 10:56 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Sandiford ; Qian, Jianhua/钱 建华
> 
> Subject: [PATCH] aarch64: Improve generic SVE tuning defaults
> 
> Hi all,
> 
> This patch adds the recently-added tweak to split some SVE VL-based scalar
> operations [1] to the generic tuning used for SVE, as enabled by adding +sve 
> to
> the -march flag, for example -march=armv8.2-a+sve.
> 
> The recommendation for best performance on a particular CPU remains
> unchanged:
> use the -mcpu option for that CPU, where possible. -mcpu=native makes this
> straightforward for native compilation.
> 
> The tweak to split out SVE VL-based scalar operations is a consistent win for
> the Neoverse V1 CPU and should be neutral for the Fujitsu A64FX. A run of
> SPEC2017 on A64FX with this tweak on didn't show any non-noise differences.
> It is also expected to be neutral on SVE2 implementations.
> 
> Therefore, the patch enables the tweak for generic +sve tuning e.g.
> -march=armv8.2-a+sve. No SVE2 CPUs are expected to benefit from it,
> therefore the tweak is disabled for generic tuning when +sve2 is in -march 
> e.g.
> -march=armv8.2-a+sve2.
> 
> The implementation of this approach requires a bit of custom logic in
> aarch64_override_options_internal to handle these kinds of
> architecture-dependent decisions, but we do believe the user-facing principle
> here is important to implement.
> 
> Qian, as you've contributed the A64FX support to GCC, I would be grateful for
> your feedback on this approach and in particular on the performance evaluation
> of this change.
> 
> In general, for the generic target we're using a decision framework that looks
> like:
> 
> * If all cores that are known to benefit from an optimization are of 
> architecture X,
> and all other cores that implement X or above are not impacted, or have a very
> slight impact, we will consider it for generic tuning for architecture X.
> * We will not enable that optimisation for generic tuning for architecture 
> X+1 if
> no known cores of architecture X+1 or above will benefit.
> 
> This framework allows us to improve generic tuning for CPUs of generation X
> while avoiding accumulating tweaks for future CPUs of generation X+1, X+2...
> that do not need them, and thus avoid even the slight negative effects of 
> these
> optimisations if the user is willing to tell us the desired architecture 
> accurately.
> 
> X above can mean either annual architecture updates (Armv8.2-a, Armv8.3-a
> etc) or optional architecture extensions (like SVE, SVE2).
> 
> We think that this patch fits that framework, so would like to propose it for 
> the
> trunk default tunings for SVE.
> 
> Bootstrapped and tested on aarch64-none-linux-gnu.
> 
> Thanks,
> Kyrill
> 
> [1] http://gcc.gnu.org/g:a65b9ad863c5fc0aea12db58557f4d286a1974d7
> 
> gcc/ChangeLog:
> 
>   * config/aarch64/aarch64.c (aarch64_adjust_generic_arch_tuning):
> Define.
>   (aarch64_override_options_internal): Use it.
>   (generic_tunings): Add
> AARCH64_EXTRA_TUNE_CSE_SVE_VL_CONSTANTS to
>   tune_flags.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.target/aarch64/sve/aarch64-sve.exp: Add
> -moverride=tune=none to
>   sve_flags.
>   * g++.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp: Likewise.
>   * g++.target/aarch64/sve/acle/aarch64-sve-acle.exp: Likewise.
>   * gcc.target/aarch64/sve/aarch64-sve.exp: Likewise.
>   * gcc.target/aarch64/sve/acle/aarch64-sve-acle-asm.exp: Likewise.
>   * gcc.target/aarch64/sve/acle/aarch64-sve-acle.exp: Likewise.
> 



Re: Merge from trunk to gccgo branch

2021-03-11 Thread Ian Lance Taylor via Gcc-patches
I merged trunk revision 7ad5a72c8bc6aa71a0d195ddfa207db01265fe0b to
the gccgo branch.

Ian


[PATCH v2] c: don't drop typedef information in casts

2021-03-11 Thread David Lamparter

The TYPE_MAIN_VARIANT() here was, for casts to a typedef'd type name,
resulting in all information about the typedef's involvement getting
lost.  This drops necessary information for warnings and can make them
confusing or even misleading.  It also makes specialized warnings for
unspecified-size system types (pid_t, uid_t, ...) impossible.

gcc/c/ChangeLog:
2021-03-09  David Lamparter  

PR c/99526
* c-typeck.c (build_c_cast): retain (unqualified) typedefs in
  casts rather than stripping down to basic type.
---
 gcc/c/c-typeck.c| 37 ++---
 gcc/testsuite/gcc.dg/cast-typedef.c | 29 ++
 2 files changed, 63 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/cast-typedef.c

diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 4e6d369c4c9c..7323a8e82547 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -5816,6 +5816,7 @@ c_safe_function_type_cast_p (tree t1, tree t2)
 tree
 build_c_cast (location_t loc, tree type, tree expr)
 {
+  tree res_type, walk_type;
   tree value;
 
   bool int_operands = EXPR_INT_CONST_OPERANDS (expr);
@@ -5836,7 +5837,37 @@ build_c_cast (location_t loc, tree type, tree expr)
   if (objc_is_object_ptr (type) && objc_is_object_ptr (TREE_TYPE (expr)))
 return build1 (NOP_EXPR, type, expr);
 
+  /* Want to keep typedef information, but at the same time we need to strip
+ qualifiers for a proper rvalue.  Unfortunately, we don't know if any
+ qualifiers on a typedef are part of the typedef or were locally supplied.
+ So grab the original typedef and use that only if it has no qualifiers.
+ cf. cast-typedef testcase */
+
+  res_type = NULL;
+  walk_type = type;
+
+  while (walk_type && TYPE_NAME (walk_type)
+	 && TREE_CODE (TYPE_NAME (walk_type)) == TYPE_DECL)
+{
+  tree orig_decl = lookup_name (TYPE_IDENTIFIER (walk_type));
+
+  if (!orig_decl || TREE_CODE (orig_decl) != TYPE_DECL)
+	break;
+
+  walk_type = TREE_TYPE (orig_decl);
+  if (TYPE_QUALS (walk_type) == TYPE_UNQUALIFIED)
+	{
+	  res_type = walk_type;
+	  break;
+	}
+
+  walk_type = DECL_ORIGINAL_TYPE (TYPE_NAME (walk_type));
+}
+
   type = TYPE_MAIN_VARIANT (type);
+  if (!res_type)
+res_type = type;
+  gcc_assert (TYPE_MAIN_VARIANT (res_type) == type);
 
   if (TREE_CODE (type) == ARRAY_TYPE)
 {
@@ -5864,7 +5895,7 @@ build_c_cast (location_t loc, tree type, tree expr)
 		 "ISO C forbids casting nonscalar to the same type");
 
   /* Convert to remove any qualifiers from VALUE's type.  */
-  value = convert (type, value);
+  value = convert (res_type, value);
 }
   else if (TREE_CODE (type) == UNION_TYPE)
 {
@@ -6018,7 +6049,7 @@ build_c_cast (location_t loc, tree type, tree expr)
 		" from %qT to %qT", otype, type);
 
   ovalue = value;
-  value = convert (type, value);
+  value = convert (res_type, value);
 
   /* Ignore any integer overflow caused by the cast.  */
   if (TREE_CODE (value) == INTEGER_CST && !FLOAT_TYPE_P (otype))
@@ -6054,7 +6085,7 @@ build_c_cast (location_t loc, tree type, tree expr)
 		&& INTEGRAL_TYPE_P (TREE_TYPE (expr)))
 	   || TREE_CODE (expr) == REAL_CST
 	   || TREE_CODE (expr) == COMPLEX_CST)))
-  value = build1 (NOP_EXPR, type, value);
+  value = build1 (NOP_EXPR, res_type, value);
 
   /* If the expression has integer operands and so can occur in an
  unevaluated part of an integer constant expression, ensure the
diff --git a/gcc/testsuite/gcc.dg/cast-typedef.c b/gcc/testsuite/gcc.dg/cast-typedef.c
new file mode 100644
index ..488624c16f72
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cast-typedef.c
@@ -0,0 +1,29 @@
+/* Test cast <> typedef interactions */
+/* Origin: David Lamparter  */
+/* { dg-do compile } */
+/* { dg-options "-Wconversion" } */
+
+typedef int typedefname;
+typedef volatile int qual1;
+typedef volatile typedefname qual2;
+
+extern int val;
+extern void f2(unsigned char arg);
+
+void
+f (void)
+{
+  /* -Wconversion just used to print out the actual type */
+
+  f2 ((typedefname) val); /* { dg-warning "typedefname" } */
+  f2 ((volatile typedefname) val); /* { dg-warning "typedefname" } */
+  f2 ((qual1) val); /* { dg-warning "int" } */
+  f2 ((qual2) val); /* { dg-warning "typedefname" } */
+
+  /* { dg-bogus "volatile" "qualifiers should be stripped" { target { "*-*-*" } } 19  } */
+  /* { dg-bogus "volatile" "qualifiers should be stripped" { target { "*-*-*" } } 20  } */
+  /* { dg-bogus "volatile" "qualifiers should be stripped" { target { "*-*-*" } } 21  } */
+
+  /* { dg-bogus "qual1" "typedef with qualifier should not be used" { target { "*-*-*" } } 20  } */
+  /* { dg-bogus "qual2" "typedef with qualifier should not be used" { target { "*-*-*" } } 21  } */
+}


Re: use -mno-strict-align for strlenopt-80.c on powerpc

2021-03-11 Thread Segher Boessenkool
On Wed, Mar 10, 2021 at 02:41:50AM -0300, Alexandre Oliva wrote:
> ppc configurations that have -mstrict-align enabled by default fail
> gcc.dg/strlenopt-80.c, because some memcpy calls don't get turned into
> MEM_REFs, which defeats the tested-for strlen optimization.
> 
> This was regstrapped on x86_64-linux-gnu, tested with a cross to a
> ppc64-vxworks7r2 configured with -mstrict-align enabled by default,
> and I'm now also regstrapping on ppc64-linux-gnu just to be sure.
> Ok to install?

The -mstrict-align option is defined in sysv4.opt, which is not used in
all configurations (like, powerpc64-darwin*, and the AIX configs).  So
no, sorry; you'll need more work here.

Btw, aarch64 has this same option.


Segher


Re: [PATCH] rs6000: Fix check_effective_target_sqrt_insn (PR99352)

2021-03-11 Thread Segher Boessenkool
On Wed, Mar 10, 2021 at 02:06:24AM -0300, Alexandre Oliva wrote:
> On Mar  9, 2021, Segher Boessenkool  wrote:
> 
> > +return [check_no_compiler_messages powerpc_sqrt object {
> 
> I don't think you really need to assemble this.  s/object/assembly/
> would do.  Even preprocessing alone would do, but I don't think
> check_compiler can do that.

We use object in all similar tests.  It helps to check if the assembler
is set up for the options used.  That doesn't matter for this test of
course, but making things more tricky for no win is a bad tradeoff.

Also, I didn't even bother to consider not assembling it :-)


Segher


Re: [Patch] Fortran: Fix libgfortran I/O race with newunit_free [PR99529]

2021-03-11 Thread Jerry DeLisle

Looks good Tobias, OK,

Odd about that line in set_internal_unit. Probably leftover from a 
copy/paste. I see in comment #5 of the PR that you mentioned trying the 
assert to make sure it is useless code.


Thanks for the patch,

Jerry

On 3/11/21 2:38 AM, Tobias Burnus wrote:
Revised version – the previous had a lock inversion, which could lead 
to a deadlock, found by -fsanitize=thread. See transfer.c for the 
changes.


OK?

Tobias

On 11.03.21 10:42, Tobias Burnus wrote:

Hi all,

as found by Martin (thanks!) there is a race for newunit_free.
While that call is within the unitlock for the calls in io/unit.c,
the call in transfer.c did not use locks.

Additionally,
  unit = get_gfc_unit (dtp->common.unit, do_create);
  set_internal_unit (dtp, unit, kind);
gets first the unit (with proper locking when using the unit number
dtp->common.unit) but then in set_internal_unit it re-sets the
unit number to the same number without locking. That causes
race warnings and if the assignment is not atomic it is a true race.

OK for mainline? What about GCC 10?

As Martin notes in the email thread and in the PR there are more
race warnings (and likely true race issues).

Tobias





Re: [RFC][patch for gcc12][version 1] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-03-11 Thread Kees Cook via Gcc-patches
On Thu, Mar 11, 2021 at 03:47:17PM -0600, Qing Zhao wrote:
> Hi, Kees,
> 
> Sorry for the late reply (I have been busy with other work recently).
> 
> Currently, I am working on the issue of flexible length array as the last 
> field of the structure.
> 
> In order to fix it correctly, I have the following question:
> 
> 
> > On Feb 26, 2021, at 3:42 PM, Kees Cook  wrote:
> > 
> > On Thu, Feb 25, 2021 at 05:56:38PM -0600, Qing Zhao wrote:
> >> Just noticed that you didn’t add -fauto-var-init-approach=D to the command 
> >> line.
> > 
> > Ah-ha! I didn't realize that was needed; thanks. However, now some of the 
> > sources crash in a different way. Here's the reproducer:
> > 
> > $ cat poc.i
> > struct a {
> >  int b;
> >  int array[];
> > };
> > void c() {
> >  struct a d;
> > }
> > 
> 
> For such variable length array as the last field of the structure, static 
> initialization is not allowed, 
> User needs to explicitly allocate memory and initialize the allocated array 
> manually in the source code. 
> 
> So, if the compiler has to initialize this structure when requested by 
> -ftrivial-auto-var-init,  I think that 
> only the fields before the last fields need to be initialized, Is this the 
> correct behavior you expected?

Right, that would be my expectation as well. Putting such a struct on
the stack tends to be nonsensical, but maybe happens if part of a union,
which would get initialized correctly, etc:

union {
struct a {
int b;
int array[];
};
char buf[32];
};

-- 
Kees Cook


Go patch committed: create temporaries for heap variables

2021-03-11 Thread Ian Lance Taylor via Gcc-patches
The Go frontend generally doesn't create a temporary for an expression
that is a variable, because it's normally valid to simply reload
thevalue from the variable.  However, if the variable is in the heap,
then loading the value is a pointer indirection.  The process of
creating GCC IR can cause the variable load and the pointer
indirection to be split, such that the second evaluation only does the
pointer indirection.  If there are conditionals in between the two
uses, this can cause the second use to load the pointer from an
uninitialized register.

This patch avoids this by introducing a new Expression method that
returns whether it is safe to evaluate an expression multiple times,
and use it everywhere.

The test case is https://golang.org/cl/300789.

This fixes https://golang.org/issue/44383.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
b7781971f555a2b752ef1afd02732493bef1dec6
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 5b45f03a26e..58c881a7872 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-93380a9126e76b71fa208e62c31c7914084c0e37
+bf35249a7c752836741b1cab43a312f87916fcb0
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/expressions.cc b/gcc/go/gofrontend/expressions.cc
index 17b4cfd2c19..101cbe7ac31 100644
--- a/gcc/go/gofrontend/expressions.cc
+++ b/gcc/go/gofrontend/expressions.cc
@@ -526,7 +526,7 @@ Expression::convert_interface_to_interface(Type *lhs_type, 
Expression* rhs,
   // method table.
 
   // We are going to evaluate RHS multiple times.
-  go_assert(rhs->is_variable());
+  go_assert(rhs->is_multi_eval_safe());
 
   // Get the type descriptor for the right hand side.  This will be
   // NULL for a nil interface.
@@ -569,7 +569,7 @@ Expression::convert_interface_to_type(Gogo* gogo, Type 
*lhs_type, Expression* rh
   Location location)
 {
   // We are going to evaluate RHS multiple times.
-  go_assert(rhs->is_variable());
+  go_assert(rhs->is_multi_eval_safe());
 
   // Build an expression to check that the type is valid.  It will
   // panic with an appropriate runtime type error if the type is not
@@ -707,8 +707,8 @@ Expression::check_bounds(Expression* val, Operator op, 
Expression* bound,
 Statement_inserter* inserter,
 Location loc)
 {
-  go_assert(val->is_variable() || val->is_constant());
-  go_assert(bound->is_variable() || bound->is_constant());
+  go_assert(val->is_multi_eval_safe());
+  go_assert(bound->is_multi_eval_safe());
 
   Type* int_type = Type::lookup_integer_type("int");
   int int_type_size = int_type->integer_type()->bits();
@@ -3976,7 +3976,7 @@ Type_conversion_expression::do_flatten(Gogo*, 
Named_object*,
   if (((this->type()->is_string_type()
 && this->expr_->type()->is_slice_type())
|| this->expr_->type()->interface_type() != NULL)
-  && !this->expr_->is_variable())
+  && !this->expr_->is_multi_eval_safe())
 {
   Temporary_statement* temp =
   Statement::make_temporary(NULL, this->expr_, this->location());
@@ -4264,7 +4264,7 @@ 
Type_conversion_expression::do_get_backend(Translate_context* context)
   Array_type* a = expr_type->array_type();
   Type* e = a->element_type()->forwarded();
   go_assert(e->integer_type() != NULL);
-  go_assert(this->expr_->is_variable());
+  go_assert(this->expr_->is_multi_eval_safe());
 
   Expression* buf;
   if (this->no_escape_ && !this->no_copy_)
@@ -4711,7 +4711,7 @@ Unary_expression::do_flatten(Gogo* gogo, Named_object*,
 
   Location location = this->location();
   if (this->op_ == OPERATOR_MULT
-  && !this->expr_->is_variable())
+  && !this->expr_->is_multi_eval_safe())
 {
   go_assert(this->expr_->type()->points_to() != NULL);
   switch (this->requires_nil_check(gogo))
@@ -4731,7 +4731,7 @@ Unary_expression::do_flatten(Gogo* gogo, Named_object*,
 }
 }
 
-  if (this->create_temp_ && !this->expr_->is_variable())
+  if (this->create_temp_ && !this->expr_->is_multi_eval_safe())
 {
   Temporary_statement* temp =
   Statement::make_temporary(NULL, this->expr_, location);
@@ -5326,7 +5326,7 @@ Unary_expression::do_get_backend(Translate_context* 
context)
   bexpr = gogo->backend()->var_expression(decl, loc);
 }
 
-  go_assert(!this->create_temp_ || this->expr_->is_variable());
+  go_assert(!this->create_temp_ || this->expr_->is_multi_eval_safe());
   ret = gogo->backend()->address_expression(bexpr, loc);
   break;
 
@@ -5347,7 +5347,7 @@ Unary_expression::do_get_backend(Translate_context* 
context)
   }
 case NIL_CHECK_NEEDED:
   {
-go_assert(this->expr_->is_variable());
+go_assert(this->expr_->is_multi_eval_safe());
 
 

[committed] analyzer: new implementation of shortest feasible path [PR96374]

2021-03-11 Thread David Malcolm via Gcc-patches
The analyzer builds an exploded graph of (point,state) pairs and when
it finds a problem, records a diagnostic at the relevant exploded node.
Once it has finished exploring the graph, the analyzer needs to generate
the shortest feasible path through the graph to each diagnostic's node.
This is used:
- for rejecting diagnostics that are infeasible (due to impossible sets
  of constraints),
- for use in determining which diagnostic to use in each deduplication
  set (the one with the shortest path), and
- for building checker_paths for the "winning" diagnostics, giving a
  list of events

Prior to this patch the analyzer simply found the shortest path to the
node, and then checked it for feasibility, which could lead to falsely
rejecting diagnostics: "the shortest path, if feasible" is not the same
as "the shortest feasible path" (PR analyzer/96374).
An example is PR analyzer/93355, where this issue causes the analyzer
to fail to emit a leak warning for a missing fclose on an error-handling
path in intl/localealias.c.

This patch implements a new algorithm for finding the shortest feasible
path to an exploded node: instead of simply finding the shortest path,
the new algorithm uses a worklist to iteratively build a tree of path
prefixes, which are feasible paths by construction, until a path to the
target node is found.  The worklist is prioritized, so that the first
feasible path discovered is the shortest possible feasible path.  The
algorithm continues trying paths until the target node is reached or a
limit is exceeded, in which case the diagnostic is treated as being
infeasible (which could still be a false negative, but is much less
likely to happen than before).  Iteratively building a tree of paths
allows for work to be reused, and the tree can be dumped in .dot form
(via a new -fdump-analyzer-feasibility option), making it much easier to
debug compared to other approaches I tried.

Doing so fixes the missing leak warning for PR analyzer/93355 and
various other test cases.

Testing:
- I manually verified that the behavior is determistic using 50 builds
  of pr93355-localealias.c.  All dumps were identical.
- I manually verified that it still builds with --disable-analyzer.
- Lightly tested with valgrind; no additional issues.
- Lightly performance tested, showing a slight speed regression to the
  analyzer relative to before the patch, but correctness for this issue
  is more important than the slight performance hit for the analyzer.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

I also smoketested that it builds and runs on gcc211.fsffrance.org
(Solaris 11 on SPARC64 with gcc 5.5).

I'm taking some liberties by committing a patch this big at this point
in the GCC 11 cycle, but it fixes a large category of issues in
the analyzer, and the changes are confined to the analyzer; I've
verified that it builds with at least GCC 10 and GCC 5 to (I hope)
reduce the risk of breakage to anyone's build from the patch.

Pushed to trunk as r11-7640-g3857edb5d32dcdc11d9a2fe3ad7c156c52a1ec7f.

gcc/ChangeLog:
PR analyzer/96374
* Makefile.in (ANALYZER_OBJS): Add analyzer/feasible-graph.o and
analyzer/trimmed-graph.o.
* doc/analyzer.texi (Analyzer Paths): Rewrite description of
feasibility checking to reflect new implementation.
* doc/invoke.texi (-fdump-analyzer-feasibility): Document new
option.
* shortest-paths.h (shortest_paths::get_shortest_distance): New.

gcc/analyzer/ChangeLog:
PR analyzer/96374
* analyzer.opt (-param=analyzer-max-infeasible-edges=): New param.
(fdump-analyzer-feasibility): New flag.
* diagnostic-manager.cc: Include "analyzer/trimmed-graph.h" and
"analyzer/feasible-graph.h".
(epath_finder::epath_finder): Convert m_sep to a pointer and
only create it if !flag_analyzer_feasibility.
(epath_finder::~epath_finder): New.
(epath_finder::m_sep): Convert to a pointer.
(epath_finder::get_best_epath): Add param "diag_idx" and use it
when logging.  Rather than finding the shortest path and then
checking feasibility, instead use explore_feasible_paths unless
!flag_analyzer_feasibility, in which case simply use the shortest
path, and note if it is infeasible.  Update for m_sep becoming a
pointer.
(class feasible_worklist): New.
(epath_finder::explore_feasible_paths): New.
(epath_finder::process_worklist_item): New.
(class dump_eg_with_shortest_path): New.
(epath_finder::dump_trimmed_graph): New.
(epath_finder::dump_feasible_graph): New.
(saved_diagnostic::saved_diagnostic): Add "idx" param, using it
on new field m_idx.
(saved_diagnostic::to_json): Dump m_idx.
(saved_diagnostic::calc_best_epath): Pass m_idx to get_best_epath.
Remove assertion that m_problem was set when m_best_epath is NULL.

[committed] analyzer: support reverse direction in shortest-paths.h

2021-03-11 Thread David Malcolm via Gcc-patches
This patch generalizes shortest-path.h so that it can be used to
find the shortest path from each node to a given target node (on top
of the existing support for finding the shortest path from a given
origin node to each node).

I've marked this as "analyzer" as this is the only code using
shortest-paths.h.

This patch is required by followup work to fix PR analyzer/96374.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r11-7639-g5e33e5b042a6a830c40cee3d0a925bc49dcfe069.

gcc/analyzer/ChangeLog:
* diagnostic-manager.cc (epath_finder::epath_finder):
Update shortest_paths init for new param.

gcc/ChangeLog:
* digraph.cc (selftest::test_shortest_paths): Update
shortest_paths init for new param.  Add test of
SPS_TO_GIVEN_TARGET.
* shortest-paths.h (enum shortest_path_sense): New.
(shortest_paths::shortest_paths): Add "sense" param.
Update for renamings.  Generalize to use "sense" param.
(shortest_paths::get_shortest_path): Rename param.
(shortest_paths::m_sense): New field.
(shortest_paths::m_prev): Rename...
(shortest_paths::m_best_edge): ...to this.
(shortest_paths::get_shortest_path): Update for renamings.
Conditionalize flipping of path on sense of traversal.
---
 gcc/analyzer/diagnostic-manager.cc |   2 +-
 gcc/digraph.cc |  39 +-
 gcc/shortest-paths.h   | 121 +
 3 files changed, 125 insertions(+), 37 deletions(-)

diff --git a/gcc/analyzer/diagnostic-manager.cc 
b/gcc/analyzer/diagnostic-manager.cc
index 7f20841768b..e84953e82ee 100644
--- a/gcc/analyzer/diagnostic-manager.cc
+++ b/gcc/analyzer/diagnostic-manager.cc
@@ -73,7 +73,7 @@ class epath_finder
 public:
   epath_finder (const exploded_graph )
   : m_eg (eg),
-m_sep (eg, eg.get_origin ())
+m_sep (eg, eg.get_origin (), SPS_FROM_GIVEN_ORIGIN)
   {
   }
 
diff --git a/gcc/digraph.cc b/gcc/digraph.cc
index 3441a8586cb..e6966b076ca 100644
--- a/gcc/digraph.cc
+++ b/gcc/digraph.cc
@@ -147,7 +147,8 @@ test_shortest_paths ()
 
   /* Use "A" as the origin; all nodes should be reachable.  */
   {
-shortest_paths sp (g, a);
+shortest_paths sp (g, a,
+SPS_FROM_GIVEN_ORIGIN);
 
 test_path path_to_a = sp.get_shortest_path (a);
 ASSERT_EQ (path_to_a.m_edges.length (), 0); /* Trivial path.  */
@@ -181,7 +182,8 @@ test_shortest_paths ()
 
   /* Use "B" as the origin, so only E and F are reachable.  */
   {
-shortest_paths sp (g, b);
+shortest_paths sp (g, b,
+SPS_FROM_GIVEN_ORIGIN);
 
 test_path path_to_a = sp.get_shortest_path (a);
 ASSERT_EQ (path_to_a.m_edges.length (), 0); /* No path.  */
@@ -207,7 +209,8 @@ test_shortest_paths ()
 
   /* Use "C" as the origin, so only D and F are reachable.  */
   {
-shortest_paths sp (g, c);
+shortest_paths sp (g, c,
+SPS_FROM_GIVEN_ORIGIN);
 
 test_path path_to_a = sp.get_shortest_path (a);
 ASSERT_EQ (path_to_a.m_edges.length (), 0); /* No path.  */
@@ -229,6 +232,36 @@ test_shortest_paths ()
 ASSERT_EQ (path_to_f.m_edges.length (), 1);
 ASSERT_EQ (path_to_f.m_edges[0], cf);
   }
+
+  /* Test of SPS_TO_GIVEN_TARGET.  Use "F" as the target.  */
+  {
+shortest_paths sp (g, f,
+SPS_TO_GIVEN_TARGET);
+
+test_path path_to_a = sp.get_shortest_path (a);
+ASSERT_EQ (path_to_a.m_edges.length (), 2);
+ASSERT_EQ (path_to_a.m_edges[0], ac);
+ASSERT_EQ (path_to_a.m_edges[1], cf);
+
+test_path path_to_b = sp.get_shortest_path (b);
+ASSERT_EQ (path_to_b.m_edges.length (), 2);
+ASSERT_EQ (path_to_b.m_edges[0], be);
+ASSERT_EQ (path_to_b.m_edges[1], ef);
+
+test_path path_to_c = sp.get_shortest_path (c);
+ASSERT_EQ (path_to_c.m_edges.length (), 1);
+ASSERT_EQ (path_to_c.m_edges[0], cf);
+
+test_path path_to_d = sp.get_shortest_path (d);
+ASSERT_EQ (path_to_d.m_edges.length (), 0); /* No path.  */
+
+test_path path_to_e = sp.get_shortest_path (e);
+ASSERT_EQ (path_to_e.m_edges.length (), 1);
+ASSERT_EQ (path_to_e.m_edges[0], ef);
+
+test_path path_to_f = sp.get_shortest_path (f);
+ASSERT_EQ (path_to_f.m_edges.length (), 0);
+  }
 }
 
 /* Run all of the selftests within this file.  */
diff --git a/gcc/shortest-paths.h b/gcc/shortest-paths.h
index 5648a959895..40d2c2a015a 100644
--- a/gcc/shortest-paths.h
+++ b/gcc/shortest-paths.h
@@ -23,8 +23,24 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "timevar.h"
 
-/* A record of the shortest path to each node in an graph
-   from the origin node.
+enum shortest_path_sense
+{
+  /* Find the shortest path from the given origin node to each
+ node in the graph.  */
+  SPS_FROM_GIVEN_ORIGIN,
+
+  /* Find the shortest path from 

[committed] analyzer: gracefully handle impossible paths in shortest-paths.h

2021-03-11 Thread David Malcolm via Gcc-patches
This bulletproofs the shortest_paths code against unreachable nodes,
gracefully handling them, rather than failing an assertion.

I've marked this as "analyzer" as this is the only code using
shortest-paths.h.

This patch is required by followup work to fix PR analyzer/96374.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r11-7638-g3f958348e78f38d91f0611618bb909182170c0f3.

gcc/ChangeLog:
* digraph.cc (selftest::test_shortest_paths): Add test coverage
for paths from B and C.
* shortest-paths.h (shortest_paths::shortest_paths): Handle
unreachable nodes, rather than asserting.
---
 gcc/digraph.cc   | 101 +--
 gcc/shortest-paths.h |   6 ++-
 2 files changed, 83 insertions(+), 24 deletions(-)

diff --git a/gcc/digraph.cc b/gcc/digraph.cc
index 163909bb3ea..3441a8586cb 100644
--- a/gcc/digraph.cc
+++ b/gcc/digraph.cc
@@ -142,36 +142,93 @@ test_shortest_paths ()
   test_edge *ac = g.add_test_edge (a, c);
   test_edge *cd = g.add_test_edge (c, d);
   test_edge *be = g.add_test_edge (b, e);
-  g.add_test_edge (e, f);
+  test_edge *ef = g.add_test_edge (e, f);
   test_edge *cf = g.add_test_edge (c, f);
 
-  shortest_paths sp (g, a);
+  /* Use "A" as the origin; all nodes should be reachable.  */
+  {
+shortest_paths sp (g, a);
+
+test_path path_to_a = sp.get_shortest_path (a);
+ASSERT_EQ (path_to_a.m_edges.length (), 0); /* Trivial path.  */
+
+test_path path_to_b = sp.get_shortest_path (b);
+ASSERT_EQ (path_to_b.m_edges.length (), 1);
+ASSERT_EQ (path_to_b.m_edges[0], ab);
+
+test_path path_to_c = sp.get_shortest_path (c);
+ASSERT_EQ (path_to_c.m_edges.length (), 1);
+ASSERT_EQ (path_to_c.m_edges[0], ac);
+
+test_path path_to_d = sp.get_shortest_path (d);
+ASSERT_EQ (path_to_d.m_edges.length (), 2);
+ASSERT_EQ (path_to_d.m_edges[0], ac);
+ASSERT_EQ (path_to_d.m_edges[1], cd);
+
+test_path path_to_e = sp.get_shortest_path (e);
+ASSERT_EQ (path_to_e.m_edges.length (), 2);
+ASSERT_EQ (path_to_e.m_edges[0], ab);
+ASSERT_EQ (path_to_e.m_edges[1], be);
+
+test_path path_to_f = sp.get_shortest_path (f);
+ASSERT_EQ (path_to_f.m_edges.length (), 2);
+ASSERT_EQ (path_to_f.m_edges[0], ac);
+ASSERT_EQ (path_to_f.m_edges[1], cf);
+  }
+
+  /* Verify that we gracefully handle an origin from which some nodes
+ aren't reachable.  */
+
+  /* Use "B" as the origin, so only E and F are reachable.  */
+  {
+shortest_paths sp (g, b);
 
-  test_path path_to_a = sp.get_shortest_path (a);
-  ASSERT_EQ (path_to_a.m_edges.length (), 0);
+test_path path_to_a = sp.get_shortest_path (a);
+ASSERT_EQ (path_to_a.m_edges.length (), 0); /* No path.  */
 
-  test_path path_to_b = sp.get_shortest_path (b);
-  ASSERT_EQ (path_to_b.m_edges.length (), 1);
-  ASSERT_EQ (path_to_b.m_edges[0], ab);
+test_path path_to_b = sp.get_shortest_path (b);
+ASSERT_EQ (path_to_b.m_edges.length (), 0); /* Trivial path.  */
 
-  test_path path_to_c = sp.get_shortest_path (c);
-  ASSERT_EQ (path_to_c.m_edges.length (), 1);
-  ASSERT_EQ (path_to_c.m_edges[0], ac);
+test_path path_to_c = sp.get_shortest_path (c);
+ASSERT_EQ (path_to_c.m_edges.length (), 0); /* No path.  */
 
-  test_path path_to_d = sp.get_shortest_path (d);
-  ASSERT_EQ (path_to_d.m_edges.length (), 2);
-  ASSERT_EQ (path_to_d.m_edges[0], ac);
-  ASSERT_EQ (path_to_d.m_edges[1], cd);
+test_path path_to_d = sp.get_shortest_path (d);
+ASSERT_EQ (path_to_d.m_edges.length (), 0); /* No path.  */
 
-  test_path path_to_e = sp.get_shortest_path (e);
-  ASSERT_EQ (path_to_e.m_edges.length (), 2);
-  ASSERT_EQ (path_to_e.m_edges[0], ab);
-  ASSERT_EQ (path_to_e.m_edges[1], be);
+test_path path_to_e = sp.get_shortest_path (e);
+ASSERT_EQ (path_to_e.m_edges.length (), 1);
+ASSERT_EQ (path_to_e.m_edges[0], be);
 
-  test_path path_to_f = sp.get_shortest_path (f);
-  ASSERT_EQ (path_to_f.m_edges.length (), 2);
-  ASSERT_EQ (path_to_f.m_edges[0], ac);
-  ASSERT_EQ (path_to_f.m_edges[1], cf);
+test_path path_to_f = sp.get_shortest_path (f);
+ASSERT_EQ (path_to_f.m_edges.length (), 2);
+ASSERT_EQ (path_to_f.m_edges[0], be);
+ASSERT_EQ (path_to_f.m_edges[1], ef);
+  }
+
+  /* Use "C" as the origin, so only D and F are reachable.  */
+  {
+shortest_paths sp (g, c);
+
+test_path path_to_a = sp.get_shortest_path (a);
+ASSERT_EQ (path_to_a.m_edges.length (), 0); /* No path.  */
+
+test_path path_to_b = sp.get_shortest_path (b);
+ASSERT_EQ (path_to_b.m_edges.length (), 0); /* No path.  */
+
+test_path path_to_c = sp.get_shortest_path (c);
+ASSERT_EQ (path_to_c.m_edges.length (), 0); /* Trivial path.  */
+
+test_path path_to_d = sp.get_shortest_path (d);
+ASSERT_EQ (path_to_d.m_edges.length (), 1);
+ASSERT_EQ (path_to_d.m_edges[0], cd);
+
+test_path path_to_e = sp.get_shortest_path (e);
+ASSERT_EQ 

[PATCH] x86: Update 'P' operand modifier for -fno-plt

2021-03-11 Thread H.J. Lu via Gcc-patches
Update 'P' operand modifier for -fno-plt to support inline assembly
statements.  In 64-bit, we can always load function address with
@GOTPCREL.  In 32-bit, we load function address with @GOT only for
non-PIC since PIC register may not be available at call site.

gcc/

PR target/99504
* config/i386/i386.c (ix86_print_operand): Update 'P' handling
for -fno-plt.

gcc/testsuite/

PR target/99504
* gcc.target/i386/pr99530-1.c: New test.
* gcc.target/i386/pr99530-2.c: Likewise.
* gcc.target/i386/pr99530-3.c: Likewise.
* gcc.target/i386/pr99530-4.c: Likewise.
* gcc.target/i386/pr99530-5.c: Likewise.
* gcc.target/i386/pr99530-6.c: Likewise.
---
 gcc/config/i386/i386.c| 33 +--
 gcc/testsuite/gcc.target/i386/pr99530-1.c | 11 
 gcc/testsuite/gcc.target/i386/pr99530-2.c | 11 
 gcc/testsuite/gcc.target/i386/pr99530-3.c | 11 
 gcc/testsuite/gcc.target/i386/pr99530-4.c | 11 
 gcc/testsuite/gcc.target/i386/pr99530-5.c | 11 
 gcc/testsuite/gcc.target/i386/pr99530-6.c | 11 
 7 files changed, 97 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-6.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 260f87b..8733fcecf65 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12701,7 +12701,8 @@ print_reg (rtx x, int code, FILE *file)
y -- print "st(0)" instead of "st" as a register.
d -- print duplicated register operand for AVX instruction.
D -- print condition for SSE cmp instruction.
-   P -- if PIC, print an @PLT suffix.
+   P -- if PIC, print an @PLT suffix.  For -fno-plt, load function
+   address from GOT.
p -- print raw symbol name.
X -- don't print any sort of PIC '@' suffix for a symbol.
& -- print some in-use local-dynamic symbol name.
@@ -13445,7 +13446,35 @@ ix86_print_operand (FILE *file, rtx x, int code)
  x = const0_rtx;
}
 
-  if (code != 'P' && code != 'p')
+  if (code == 'P')
+   {
+ if (current_output_insn == NULL_RTX
+ && (TARGET_64BIT || (!flag_pic && HAVE_AS_IX86_GOT32X))
+ && !TARGET_PECOFF
+ && !TARGET_MACHO
+ && ix86_cmodel != CM_LARGE
+ && ix86_cmodel != CM_LARGE_PIC
+ && GET_CODE (x) == SYMBOL_REF
+ && SYMBOL_REF_FUNCTION_P (x)
+ && (!flag_plt
+ || (SYMBOL_REF_DECL (x)
+ && lookup_attribute ("noplt",
+  DECL_ATTRIBUTES (SYMBOL_REF_DECL 
(x)
+ && !SYMBOL_REF_LOCAL_P (x))
+   {
+ /* For inline assembly statement, load function address
+from GOT with 'P' operand modifier to avoid PLT.
+NB: This works only with call or jmp.  */
+ const char *xasm;
+ if (TARGET_64BIT)
+   xasm = "{*%p0@GOTPCREL(%%rip)|[QWORD PTR %p0@GOTPCREL[rip]]}";
+ else
+   xasm = "{*%p0@GOT|[DWORD PTR %p0@GOT]}";
+ output_asm_insn (xasm, );
+ return;
+   }
+   }
+  else if (code != 'p')
{
  if (CONST_INT_P (x))
{
diff --git a/gcc/testsuite/gcc.target/i386/pr99530-1.c 
b/gcc/testsuite/gcc.target/i386/pr99530-1.c
new file mode 100644
index 000..080d7cc9399
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr99530-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { i?86-*-linux* x86_64-*-linux* } } } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-O2 -fpic -mcmodel=large -fno-plt" } */
+/* { dg-final { scan-assembler-not "foo@GOTPCREL" } } */
+
+extern void foo (void); 
+void
+bar (void)
+{
+  asm ("call %P0" : : "X" (foo));
+} 
diff --git a/gcc/testsuite/gcc.target/i386/pr99530-2.c 
b/gcc/testsuite/gcc.target/i386/pr99530-2.c
new file mode 100644
index 000..9808957d624
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr99530-2.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { i?86-*-linux* x86_64-*-linux* } } } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-O2 -fno-pic -mcmodel=large -fno-plt" } */
+/* { dg-final { scan-assembler-not "foo@GOTPCREL" } } */
+
+extern void foo (void); 
+void
+bar (void)
+{
+  asm ("call %P0" : : "X" (foo));
+} 
diff --git a/gcc/testsuite/gcc.target/i386/pr99530-3.c 
b/gcc/testsuite/gcc.target/i386/pr99530-3.c
new file mode 100644
index 000..22fe81b25f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr99530-3.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { i?86-*-linux* 

Re: xfail fetestexcept test - ppc always uses fcmpu

2021-03-11 Thread Joseph Myers
On Thu, 11 Mar 2021, Alexandre Oliva wrote:

> I kind of like the notion of adding a comment to the test itself, but I
> wasn't sure that's what you meant.

Yes, adding a comment to the test itself is what I meant.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] AIX thread local uninitialized data (PR 99094)

2021-03-11 Thread David Edelsohn via Gcc-patches
GCC on AIX generates thread local uninitialized data in the common section,
which could conflict with another module.

This patch changes the code generation to place static uninitialized
thread local data into the local common section specified with .lcomm.
This change also removes the need to create a file-local name for the TBSS
data.

Bootstrapped on powerpc-ibm-aix7.2.3.0.

gcc/ChangeLog:

2021-03-11  David Edelsohn  

PR target/99094
* config/rs6000/rs6000.c (rs6000_xcoff_file_start): Don't create
xcoff_tbss_section_name.
* config/rs6000/xcoff.h (ASM_OUTPUT_TLS_COMMON): Use .lcomm.
* xcoffout.c (xcoff_tbss_section_name): Delete.
* xcoffout.h (xcoff_tbss_section_name): Delete.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 35ecf5ac9ab..46ddf49d34b 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -21261,8 +21261,6 @@ rs6000_xcoff_file_start (void)
   main_input_filename, ".ro_");
   rs6000_gen_section_name (_tls_data_section_name,
   main_input_filename, ".tls_");
-  rs6000_gen_section_name (_tbss_section_name,
-  main_input_filename, ".tbss_[UL]");

   fputs ("\t.file\t", asm_out_file);
   output_quoted_string (asm_out_file, main_input_filename);
diff --git a/gcc/config/rs6000/xcoff.h b/gcc/config/rs6000/xcoff.h
index c01667809a5..cb9aae753b2 100644
--- a/gcc/config/rs6000/xcoff.h
+++ b/gcc/config/rs6000/xcoff.h
@@ -255,11 +255,11 @@
  } while (0)

 #ifdef HAVE_AS_TLS
-#define ASM_OUTPUT_TLS_COMMON(FILE, DECL, NAME, SIZE)  \
-  do { fputs (COMMON_ASM_OP, (FILE));  \
-   RS6000_OUTPUT_BASENAME ((FILE), (NAME));\
-   fprintf ((FILE), "[UL]," HOST_WIDE_INT_PRINT_UNSIGNED"\n", \
-   (SIZE));\
+#define ASM_OUTPUT_TLS_COMMON(FILE, DECL, NAME, SIZE)   \
+  do { fputs (LOCAL_COMMON_ASM_OP, (FILE)); \
+   fprintf ((FILE), "%s," HOST_WIDE_INT_PRINT_UNSIGNED",%s[UL],3\n", \
+   (*targetm.strip_name_encoding) (NAME), (SIZE),  \
+   (*targetm.strip_name_encoding) (NAME)); \
   } while (0)
 #endif

diff --git a/gcc/xcoffout.c b/gcc/xcoffout.c
index d07c2fba44e..3dbea04ea5d 100644
--- a/gcc/xcoffout.c
+++ b/gcc/xcoffout.c
@@ -66,7 +66,6 @@ char *xcoff_bss_section_name;
 char *xcoff_private_data_section_name;
 char *xcoff_private_rodata_section_name;
 char *xcoff_tls_data_section_name;
-char *xcoff_tbss_section_name;
 char *xcoff_read_only_section_name;

 /* Last source file name mentioned in a NOTE insn.  */
diff --git a/gcc/xcoffout.h b/gcc/xcoffout.h
index 10a583b48e0..ae6aee0af2a 100644
--- a/gcc/xcoffout.h
+++ b/gcc/xcoffout.h
@@ -129,7 +129,6 @@ extern char *xcoff_bss_section_name;
 extern char *xcoff_private_data_section_name;
 extern char *xcoff_private_rodata_section_name;
 extern char *xcoff_tls_data_section_name;
-extern char *xcoff_tbss_section_name;
 extern char *xcoff_read_only_section_name;

 /* Last source file name mentioned in a NOTE insn.  */


Re: [RFC][patch for gcc12][version 1] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-03-11 Thread Qing Zhao via Gcc-patches
Hi, Kees,

Sorry for the late reply (I have been busy with other work recently).

Currently, I am working on the issue of flexible length array as the last field 
of the structure.

In order to fix it correctly, I have the following question:


> On Feb 26, 2021, at 3:42 PM, Kees Cook  wrote:
> 
> On Thu, Feb 25, 2021 at 05:56:38PM -0600, Qing Zhao wrote:
>> Just noticed that you didn’t add -fauto-var-init-approach=D to the command 
>> line.
> 
> Ah-ha! I didn't realize that was needed; thanks. However, now some of the 
> sources crash in a different way. Here's the reproducer:
> 
> $ cat poc.i
> struct a {
>  int b;
>  int array[];
> };
> void c() {
>  struct a d;
> }
> 

For such variable length array as the last field of the structure, static 
initialization is not allowed, 
User needs to explicitly allocate memory and initialize the allocated array 
manually in the source code. 

So, if the compiler has to initialize this structure when requested by 
-ftrivial-auto-var-init,  I think that 
only the fields before the last fields need to be initialized, Is this the 
correct behavior you expected?

Thanks.

Qing


> $ gcc -ftrivial-auto-var-init=pattern -fauto-var-init-approach=D -c /dev/null 
> poc.i
> during RTL pass: expand
> poc.i: In function ‘c’:
> poc.i:6:12: internal compiler error: in build_pattern_cst, at tree.c:2652
>6 |   struct a d;
>  |^
> 0x75b572 build_pattern_cst(tree_node*)
>../../../gcc/gcc/tree.c:2652
> 0x10db116 build_pattern_cst(tree_node*)
>../../../gcc/gcc/tree.c:2612
> 0xb8a230 expand_DEFERRED_INIT
>../../../gcc/gcc/internal-fn.c:2980
> 0x970e17 expand_call_stmt
>../../../gcc/gcc/cfgexpand.c:2749
> 0x970e17 expand_gimple_stmt_1
>../../../gcc/gcc/cfgexpand.c:3844
> 0x970e17 expand_gimple_stmt
>../../../gcc/gcc/cfgexpand.c:4008
> 0x9766b3 expand_gimple_basic_block
>../../../gcc/gcc/cfgexpand.c:6045
> 0x9780d6 execute
>../../../gcc/gcc/cfgexpand.c:6729
> Please submit a full bug report,
> with preprocessed source if appropriate.
> Please include the complete backtrace with any bug report.
> See  for instructions.
> 
> I assume it's not handling the flex-array happily?
> 
> -- 
> Kees Cook



c++: Fix unhiding friend with imports [PR 99248]

2021-03-11 Thread Nathan Sidwell


This was a simple thinko about which object held the reference to the 
binding vector.  I also noticed stale code in the tree dumper, as I 
recently removed the flags from a lazy number.


PR c++/99248
gcc/cp/
* name-lookup.c (lookup_elaborated_type_1): Access slot not bind
when there's a binding vector.
* ptree.c (cxx_print_xnode): Lazy flags are no longer a thing.
gcc/testsuite/
* g++.dg/modules/pr99248.h: New.
* g++.dg/modules/pr99248_a.H: New.
* g++.dg/modules/pr99248_b.H: New.

--
Nathan Sidwell
diff --git c/gcc/cp/name-lookup.c w/gcc/cp/name-lookup.c
index 28271ba0485..d8839e29fe5 100644
--- c/gcc/cp/name-lookup.c
+++ w/gcc/cp/name-lookup.c
@@ -7940,7 +7940,7 @@ lookup_elaborated_type_1 (tree name, TAG_how how)
 			  if (*slot == bind)
 			*slot = decl;
 			  else
-			BINDING_VECTOR_CLUSTER (bind, 0)
+			BINDING_VECTOR_CLUSTER (*slot, 0)
 			  .slots[BINDING_SLOT_CURRENT] = decl;
 			}
 		}
diff --git c/gcc/cp/ptree.c w/gcc/cp/ptree.c
index e06fe6f72c1..95a4fdf284a 100644
--- c/gcc/cp/ptree.c
+++ w/gcc/cp/ptree.c
@@ -310,8 +310,7 @@ cxx_print_xnode (FILE *file, tree node, int indent)
 		{
 		  indent_to (file, indent + 4);
 		  unsigned lazy = slot.get_lazy ();
-		  fprintf (file, "%s snum:%u flags:%d",
-			   pfx, lazy >> 2, lazy & 3);
+		  fprintf (file, "%s snum:%u", pfx, lazy);
 		}
 		  else if (slot)
 		print_node (file, pfx, slot, indent + 4);
diff --git c/gcc/testsuite/g++.dg/modules/pr99248.h w/gcc/testsuite/g++.dg/modules/pr99248.h
new file mode 100644
index 000..89d8dd5b39e
--- /dev/null
+++ w/gcc/testsuite/g++.dg/modules/pr99248.h
@@ -0,0 +1,5 @@
+class locale
+{
+  template
+  friend struct __use_cache;
+};
diff --git c/gcc/testsuite/g++.dg/modules/pr99248_a.H w/gcc/testsuite/g++.dg/modules/pr99248_a.H
new file mode 100644
index 000..7f3fddad3bc
--- /dev/null
+++ w/gcc/testsuite/g++.dg/modules/pr99248_a.H
@@ -0,0 +1,5 @@
+// PR 99248 ICE with friend
+// { dg-additional-options -fmodule-header }
+// { dg-module-cmi {} }
+#include "pr99248.h"
+
diff --git c/gcc/testsuite/g++.dg/modules/pr99248_b.H w/gcc/testsuite/g++.dg/modules/pr99248_b.H
new file mode 100644
index 000..33e242d4014
--- /dev/null
+++ w/gcc/testsuite/g++.dg/modules/pr99248_b.H
@@ -0,0 +1,7 @@
+// { dg-additional-options -fmodule-header }
+// { dg-module-cmi {} }
+#include "pr99248.h"
+import "pr99248_a.H";
+
+template
+struct __use_cache;


[PATCH] c++: Prune dead functions.

2021-03-11 Thread Marek Polacek via Gcc-patches
I was looking at the LCOV coverage report for the C++ FE and
found a bunch of unused functions that I think we can remove.
Obviously, I left alone various dump_* and debug_* routines.
I haven't removed cp_build_function_call although it is also
currently unused.

* lambda_return_type: was used in parser.c in GCC 7, unused since r255950,
* classtype_has_non_deleted_copy_ctor: appeared in GCC 10, its usage
  was removed in c++/95350,
* contains_wildcard_p: used in GCC 9, unused since r276764,
* get_template_head_requirements: seems to never have been used,
* check_constrained_friend: seems to never have been used,
* subsumes_constraints: unused since r276764,
* push_void_library_fn: usage removed in r248328,
* get_template_parms_at_level: unused since r157857,
* get_pattern_parm: unused since r275387.

(Some of the seemingly unused functions, such as set_global_friend, are
actually used in libcc1.)

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

* class.c (classtype_has_non_deleted_copy_ctor): Remove.
* constraint.cc (contains_wildcard_p): Likewise.
(get_template_head_requirements): Likewise.
(check_constrained_friend): Likewise.
(subsumes_constraints): Likewise.
* cp-tree.h (classtype_has_non_deleted_copy_ctor): Likewise.
(push_void_library_fn): Likewise.
(get_pattern_parm): Likewise.
(get_template_parms_at_level): Likewise.
(lambda_return_type): Likewise.
(get_template_head_requirements): Likewise.
(check_constrained_friend): Likewise.
(subsumes_constraints): Likewise.
* decl.c (push_void_library_fn): Likewise.
* lambda.c (lambda_return_type): Likewise.
* pt.c (get_template_parms_at_level): Likewise.
(get_pattern_parm): Likewise.
---
 gcc/cp/class.c   | 13 --
 gcc/cp/constraint.cc | 62 
 gcc/cp/cp-tree.h |  8 --
 gcc/cp/decl.c| 10 ---
 gcc/cp/lambda.c  | 18 -
 gcc/cp/pt.c  | 49 --
 6 files changed, 160 deletions(-)

diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index 856e81e3d1a..9fde7ab5bcc 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -5630,19 +5630,6 @@ classtype_has_non_deleted_move_ctor (tree t)
   return false;
 }
 
-/* True iff T has a copy constructor that is not deleted.  */
-
-bool
-classtype_has_non_deleted_copy_ctor (tree t)
-{
-  if (CLASSTYPE_LAZY_COPY_CTOR (t))
-lazily_declare_fn (sfk_copy_constructor, t);
-  for (ovl_iterator iter (CLASSTYPE_CONSTRUCTORS (t)); iter; ++iter)
-if (copy_fn_p (*iter) && !DECL_DELETED_FN (*iter))
-  return true;
-  return false;
-}
-
 /* If T, a class, has a user-provided copy constructor, copy assignment
operator, or destructor, returns that function.  Otherwise, null.  */
 
diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 5cf43bd6c18..3702c49f10b 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -278,21 +278,6 @@ get_concept_check_template (tree t)
   return tmpl;
 }
 
-/* Returns true if any of the arguments in the template argument list is
-   a wildcard or wildcard pack.  */
-
-bool
-contains_wildcard_p (tree args)
-{
-  for (int i = 0; i < TREE_VEC_LENGTH (args); ++i)
-{
-  tree arg = TREE_VEC_ELT (args, i);
-  if (TREE_CODE (arg) == WILDCARD_DECL)
-   return true;
-}
-  return false;
-}
-
 /*---
 Resolution of qualified concept names
 ---*/
@@ -1300,18 +1285,6 @@ maybe_substitute_reqs_for (tree reqs, const_tree decl_)
   return reqs;
 }
 
-/* Returns the template-head requires clause for the template
-   declaration T or NULL_TREE if none.  */
-
-tree
-get_template_head_requirements (tree t)
-{
-  tree ci = get_constraints (t);
-  if (!ci)
-return NULL_TREE;
-  return CI_TEMPLATE_REQS (ci);
-}
-
 /* Returns the trailing requires clause of the declarator of
a template declaration T or NULL_TREE if none.  */
 
@@ -3429,31 +3402,6 @@ check_function_concept (tree fn)
   return NULL_TREE;
 }
 
-
-// Check that a constrained friend declaration function declaration,
-// FN, is admissible. This is the case only when the declaration depends
-// on template parameters and does not declare a specialization.
-void
-check_constrained_friend (tree fn, tree reqs)
-{
-  if (fn == error_mark_node)
-return;
-  gcc_assert (TREE_CODE (fn) == FUNCTION_DECL);
-
-  // If there are not constraints, this cannot be an error.
-  if (!reqs)
-return;
-
-  // Constrained friend functions that don't depend on template
-  // arguments are effectively meaningless.
-  if (!uses_template_parms (TREE_TYPE (fn)))
-{
-  error_at (location_of (fn),
-   "constrained friend does not depend on template parameters");
-  return;
-

RE: printers.py issue

2021-03-11 Thread Hoyer, David via Gcc-patches
Thank you so much for taking the time to help me with this!   It was greatly 
appreciated.

-Original Message-
From: Jonathan Wakely  
Sent: Thursday, March 11, 2021 11:56 AM
To: Hoyer, David 
Cc: libstdc++ ; gcc-patches@gcc.gnu.org
Subject: Re: printers.py issue

NetApp Security WARNING: This is an external email. Do not click links or open 
attachments unless you recognize the sender and know the content is safe.




On 10/03/21 16:57 +, Hoyer, David wrote:
>I wanted to finally follow up on these issues I reported.
>
>On issue 1, the patch you provided did fix the problem

Thanks for confirming. It's committed to the master branch now (as attached to 
this mail). I also plan to backport it to the active release branches later.

>On issue 2, I changed our python script to convert it to a string and it now 
>works!

Great. I think it's correct that those constructor methods expect strings for 
the 'typename' parameter.





Re: ipa-sra is mostly disabled by -mlongcall

2021-03-11 Thread Segher Boessenkool
On Thu, Mar 11, 2021 at 04:52:18AM -0300, Alexandre Oliva wrote:
> Several ipa-sra tests fail because -mlongcall on powerpc is a
> TYPE_ATTRIBUTE, and those disable ipa-sra.
> 
> This local workaround disables -mlongcall for the failing tests on
> powerpc*-*-vxworks*, so they get a chance to run even in kernel mode.
> 
> (AdaCore has -mlongcall enabled for powerpc*-wrs-vxworks* kernel-mode
> targets.)
> 
> This was regstrapped on x86_64-linux-gnu and ppc64-linux-gnu, and tested
> with a cross to a ppc64-vxworks7r2 with -mlongcall enabled by default.
> 
> I was leaning towards maintaing this change internal, but I thought it
> wouldn't hurt to ask whether it's ok to install.
> 
> I wonder whether it wouldn't be more desirable to have an alternate
> implementation of -mlongcall on ppc that didn't conflict with IPA-SRA,
> and IIRC with some scenarios involving C++ templates.

It has been this way for decades (well, since 2002, not *quite* multiple
decades yet).  It also is almost never necessary to use: not many
programs have more than 32MB in a single module (and you do not need
this for cross-module calls).

> E.g., on ARM, -mlong-calls doesn't add an attribute to every function,
> it just changes a default, that attributes short_call and long_call may
> then override for specific functions.  Would a similar implementation be
> acceptable for ppc?

That is how it works already!  See the first entry on
  https://gcc.gnu.org/onlinedocs/gcc/PowerPC-Function-Attributes.html

Given Martin's and Richard's replies, I think we should not take this
patch.  Sorry.


Segher


Re: [committed] libstdc++: Handle EPERM for filesystem access errors on MacOS [PR 99537]

2021-03-11 Thread Jonathan Wakely via Gcc-patches

On 11/03/21 17:54 +, Jonathan Wakely wrote:

Contrary to what POSIX says, some directory operations on MacOS can fail
with EPERM instead of EACCES, so we need to handle both.

libstdc++-v3/ChangeLog:

PR libstdc++/99537


Oh drat, that's the wrong PR number. It should be PR 99533, as the
comment says in the code. I'll fix the ChangeLog tomorrow.


* src/c++17/fs_dir.cc (recursive_directory_iterator): Use new
helper function to check for permission denied errors.
* src/filesystem/dir.cc (recursive_directory_iterator):
Likewise.
* src/filesystem/dir-common.h (is_permission_denied_error): New
helper function.





--- a/libstdc++-v3/src/filesystem/dir-common.h
+++ b/libstdc++-v3/src/filesystem/dir-common.h
@@ -141,6 +141,18 @@ struct _Dir_base
  posix::DIR*   dirp;
};

+inline bool
+is_permission_denied_error(int e)
+{
+  if (e == EACCES)
+return true;
+#ifdef __APPLE__
+  if (e == EPERM) // See PR 99533
+return true;
+#endif
+  return false;
+}
+
} // namespace filesystem

// BEGIN/END macros must be defined before including this file.




Re: ipa-sra is mostly disabled by -mlongcall

2021-03-11 Thread Segher Boessenkool
On Thu, Mar 11, 2021 at 11:29:57AM -0300, Alexandre Oliva wrote:
> On Mar 11, 2021, Richard Biener  wrote:
> > I wonder what the effect of this switch is - it's documented as
> > affecting the call site (different call insn sequence by default), so
> > the best representation would be an attribute on actual call stmts
> 
> I don't think that would fulfill its purpose.

That is how it already works though.  The option sets a flag that gives
the default for a function attribute.  The function attribute is used at
the call site:

===
void f(void);
void g(void) __attribute__((longcall));

void h(void)
{
f();
g();
}

static void (*i)(void) = f;
static void (*j)(void) __attribute__((longcall)) = f;

void k(void)
{
i();
j();
}
===

Note that in "k" the call via "i" is done as a direct call, and that via
"j" uses a bctr.

> My understanding is that the longcall attribute is intended to issue
> call sequences that assume the target of the call will be out of range
> for available PC-relative addressing modes in call instructions.

Yes.

> One use I'm familiar with is when different modules are known to be
> going to be linked separately, and dynamically loaded at disparate
> memory locations, so that PC-relative branches across modules won't do,
> and PLTs are not available.

And you need to call via function pointers *anyway*, so you do not need
this attribute at all?  It could matter if you do cross-module IPA
optimisations, but you don't (do you?  That will cause no end of fresh
new problems!)

> Indirect calls are pretty much long calls by definition,

It's the other way around.  Longcalls are implemented as indirect calls.

> but I suppose
> the goal of retaining the long-call requests has to do with the
> possibility of sometimes resolving an indirect call back to a direct
> call.

You will save yourself a lot of problems by simply making that
impossible.  Makes such function pointers explicit in your code.

> If it wasn't associated with the type,

It isn't.  It is a function attribute.

> > but it does look better than just a global flag.

The flag is just for setting a default, which often is handy (esp. for
code that is too humonguous to compile without it: it is a simple flag
to add, and your program was probably slow to begin with, being huge and
all).


Segher


Re: [PATCH 4/4] libstdc++: Add fallback 128-bit integer class type and use it

2021-03-11 Thread Jonathan Wakely via Gcc-patches

On 11/03/21 19:09 +0100, Daniel Krügler via Libstdc++ wrote:

It seems to me that basically all uint128_t operations should be
unconditionally noexcept. Currently none of them is so (The constexpr
keyword is a red herring in this regard).


None of this code is exposed to users (so nobody is using the noexcept
operator to heck if it throws), and is all inline, so the compiler can
see that it doesn't throw. We could make it noexcept, but I don't
think it makes any difference to anything. That could be done as a
follow-up patch.


Is it worth considering to add (conditional) support for operator<=>
and corresponding simplifications of comparison operators?


This is (currently) compiled with -std=gnu++17 so that's not
available. We could of course change that, but enabling experimental
C++20 features in the implementation of our less-experimental C++17
features has non-zero risk.

I definitely don't want conditional support for operator<=>
complicating the code. Either we compile it with -std=gnu++17 or we
compile it with -std=gnu++20, but we don't need to check which it is
with #if.




Re: [PATCH 4/4] libstdc++: Add fallback 128-bit integer class type and use it

2021-03-11 Thread Jonathan Wakely via Gcc-patches

On 11/03/21 12:16 -0500, Patrick Palka via Libstdc++ wrote:

--- /dev/null
+++ b/libstdc++-v3/src/c++17/uint128_t.h
@@ -0,0 +1,297 @@
+// A relatiely minimal unsigned 128-bit integer class type, used by the


"relatively"

Apart from that, all four patches in the series are OK for trunk.




Re: [PATCH 4/4] libstdc++: Add fallback 128-bit integer class type and use it

2021-03-11 Thread Daniel Krügler via Gcc-patches
Am Do., 11. März 2021 um 18:17 Uhr schrieb Patrick Palka via Libstdc++
:
>
> This implements a minimal integer class type that emulates 128-bit
> unsigned arithmetic using a pair of 64-bit integers, which the
> floating-point std::to_chars implementation then uses as a drop-in
> replacement for unsigned __int128 on targets that lack the latter.
> This allows us to fully support formatting of large long double types
> on targets that lack __int128.
>
> Since Ryu performs 128-bit division/modulus only by 2, 5 and 10, the
> integer class type supports only these divisors rather than supporting
> general division/modulus.
>
> Tested on x86, x86_64, ppc64le, ppc64be and aarch64, with and without
> performing the equivalent of -U__SIZEOF_INT128__ in floating_to_chars.cc
> (so that we also test using the class type on targets when __int128 is
> available).
>
> libstdc++-v3/ChangeLog:
>
> * src/c++17/floating_to_chars.cc: Simplify the file as if
> __SIZEOF_INT128__ is always defined.
> [!defined __SIZEOF_INT128__]: Include "uint128_t.h".  Define
> a to_chars overload for the uint128_t class type.
> * src/c++17/uint128_t.h: New file.
> * testsuite/20_util/to_chars/long_double.cc: No longer expect an
> execution FAIL on targets that have a large long double type
> but lack __int128.
> ---
>  libstdc++-v3/src/c++17/floating_to_chars.cc   |  58 ++--
>  libstdc++-v3/src/c++17/uint128_t.h| 297 ++
>  .../testsuite/20_util/to_chars/long_double.cc |   1 -
>  3 files changed, 332 insertions(+), 24 deletions(-)
>  create mode 100644 libstdc++-v3/src/c++17/uint128_t.h
>
> diff --git a/libstdc++-v3/src/c++17/floating_to_chars.cc 
> b/libstdc++-v3/src/c++17/floating_to_chars.cc
> index da3fbaa1ed1..86f4401e134 100644
> --- a/libstdc++-v3/src/c++17/floating_to_chars.cc
> +++ b/libstdc++-v3/src/c++17/floating_to_chars.cc
> @@ -64,25 +64,19 @@ extern "C" int __sprintfieee128(char*, const char*, ...);
>
>  #if __LDBL_MANT_DIG__ == __DBL_MANT_DIG__
>  # define LONG_DOUBLE_KIND LDK_BINARY64
> -#elif defined(__SIZEOF_INT128__)
> -// The Ryu routines need a 128-bit integer type in order to do shortest
> -// formatting of types larger than 64-bit double, so without __int128 we 
> can't
> -// support any large long double format.  This is the case for e.g. i386.
> -# if __LDBL_MANT_DIG__ == 64
> +#elif __LDBL_MANT_DIG__ == 64
>  #  define LONG_DOUBLE_KIND LDK_FLOAT80
> -# elif __LDBL_MANT_DIG__ == 113
> -#  define LONG_DOUBLE_KIND LDK_BINARY128
> -# elif __LDBL_MANT_DIG__ == 106
> -#  define LONG_DOUBLE_KIND LDK_IBM128
> -# endif
> -# if defined _GLIBCXX_USE_FLOAT128 && __FLT128_MANT_DIG__ == 113
> -// Define overloads of std::to_chars for __float128.
> -#  define FLOAT128_TO_CHARS 1
> -# endif
> +#elif __LDBL_MANT_DIG__ == 113
> +# define LONG_DOUBLE_KIND LDK_BINARY128
> +#elif __LDBL_MANT_DIG__ == 106
> +# define LONG_DOUBLE_KIND LDK_IBM128
> +#else
> +# define LONG_DOUBLE_KIND LDK_UNSUPPORTED
>  #endif
>
> -#if !defined(LONG_DOUBLE_KIND)
> -# define LONG_DOUBLE_KIND LDK_UNSUPPORTED
> +#if defined _GLIBCXX_USE_FLOAT128 && __FLT128_MANT_DIG__ == 113
> +// Define overloads of std::to_chars for __float128.
> +# define FLOAT128_TO_CHARS 1
>  #endif
>
>  // For now we only support __float128 when it's the powerpc64 __ieee128 type.
> @@ -100,6 +94,8 @@ namespace
>  {
>  #if defined __SIZEOF_INT128__
>using uint128_t = unsigned __int128;
> +#else
> +# include "uint128_t.h"
>  #endif
>
>namespace ryu
> @@ -114,7 +110,6 @@ namespace
>  #include "ryu/d2fixed.c"
>  #include "ryu/f2s.c"
>
> -#ifdef __SIZEOF_INT128__
>  namespace generic128
>  {
>// Put the generic Ryu bits in their own namespace to avoid name 
> conflicts.
> @@ -129,7 +124,6 @@ namespace
>  int
>  to_chars(const floating_decimal_128 v, char* const result)
>  { return generic128::generic_to_chars(v, result); }
> -#endif
>} // namespace ryu
>
>// A traits class that contains pertinent information about the binary
> @@ -407,10 +401,8 @@ namespace
>   return uint32_t{};
> else if constexpr (total_bits <= 64)
>   return uint64_t{};
> -#ifdef __SIZEOF_INT128__
> else if constexpr (total_bits <= 128)
>   return uint128_t{};
> -#endif
>};
>using uint_t = decltype(get_uint_t());
>uint_t value_bits = 0;
> @@ -503,7 +495,6 @@ namespace
> return ryu::floating_to_fd32(value);
>else if constexpr (std::is_same_v)
> return ryu::floating_to_fd64(value);
> -#ifdef __SIZEOF_INT128__
>else if constexpr (std::is_same_v
>  || std::is_same_v)
> {
> @@ -519,7 +510,6 @@ namespace
> mantissa_bits, exponent_bits,
> !has_implicit_leading_bit);
> }
> -#endif
>  }
>
>// This subroutine returns true if the shortest scientific form fd is a
> 

[PATCH] tighten up checking for virtual bases (PR 99502)

2021-03-11 Thread Martin Sebor via Gcc-patches

More extensive testing of the patch I just committed in r11-7563 to
avoid the false positive -Warray-bounds on accesses to members of
virtual bases has exposed a couple of problems that cause many false
negatives for even basic bugs like:

   typedef struct A { int i; } A;

   void* g (void)
   {
 A *p = (A*)malloc (3);
 p->i = 0;// missing warning in C++
 return p;
   }

as well as a number of others, including more complex ones involving
virtual base classes that can, with some additional work, be reliably
detected.  Most of them were successfully diagnosed before the fix
for PR 98266.  Unfortunately, the tests that exercise this are all
C so the C++ only regressions due to the divergence went unnoticed.

The root cause is twofold: a) the simplistic check for TYPE_BINFO
in the new inbounds_vbase_memaccess_p() function means we exclude
from checking any member accesses whose leading offset is in bounds,
and b) the check for the leading offset misses cases where just
the ending offset of an access is out of bounds.

The attached patch adds code to restrict the original solution
to just the narrow subset of accesses to members of classes with
virtual bases, and also adds a check for the ending offset.

Is the patch okay for trunk?

(The code for has_virtual_base() is adapted from
polymorphic_type_binfo_p() in ipa-utils.h.  It seems like a handy
utility that could go in tree.h.)

Martin

PS There is another regression I discovered while working on this.
It's a false positive but unrelated to the r11-7563 fix so I'll
post a patch for separately.
PR middle-end/99502 - missing -Warray-bounds on partial out of bounds

gcc/ChangeLog:

	PR middle-end/99502
	* gimple-array-bounds.cc (array_bounds_checker::check_mem_ref):
	Handle void* type.  Print the size of even declared objects.
	(has_virtual_base_r): New function.
	(has_virtual_base): New function.
	(inbounds_vbase_memaccess_p): Call has_virtual_base.  Also check
	the ending offset of the accessed member.

gcc/testsuite/ChangeLog:

	PR middle-end/99502
	* g++.dg/warn/Warray-bounds-22.C: New test.
	* g++.dg/warn/Warray-bounds-23.C: New test.
	
diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-bounds.cc
index 54f32051199..34552b7c812 100644
@@ -890,9 +902,46 @@ array_bounds_checker::check_addr_expr (location_t location, tree t)
 }
 }
 
+/* Return true if BINFO corresponds to a class with a virtual base class.  */
+
+static bool
+has_virtual_base_r (tree binfo)
+{
+  for (unsigned i = 0; i < BINFO_N_BASE_BINFOS (binfo); i++)
+{
+  tree base = BINFO_BASE_BINFO (binfo, i);
+  if (BINFO_VIRTUAL_P (base) || has_virtual_base_r (base))
+	return true;
+}
+
+  return false;
+}
+
+/* Return true if TYPE is a class type with virtual functions or with
+   a virtual base class.  */
+
+static bool
+has_virtual_base (tree type)
+{
+  tree binfo = TYPE_BINFO (type);
+  if (!binfo)
+return false;
+  tree btype = BINFO_TYPE (binfo);
+  if (!btype)
+return false;
+  tree tbtype = TYPE_BINFO (btype);
+  if (!tbtype)
+return false;
+  if (!BINFO_VTABLE (tbtype))
+return false;
+
+  return has_virtual_base_r (binfo);
+}
+
 /* Return true if T is a reference to a member of a base class that's within
-   the bounds of the enclosing complete object.  The function "hacks" around
-   problems discussed in pr98266 and pr97595.  */
+   the bounds of the enclosing complete object.  Considers only classes with
+   virtual bases.  The function "hacks" around problems discussed in pr98266
+   and pr97595.  */
 
 static bool
 inbounds_vbase_memaccess_p (tree t)
@@ -904,10 +953,12 @@ inbounds_vbase_memaccess_p (tree t)
   if (TREE_CODE (mref) != MEM_REF)
 return false;
 
-  /* Consider the access if its type is a derived class.  */
+  /* Consider the access only if its type is a derived polymorphic class
+ with a virtual base.  */
   tree mreftype = TREE_TYPE (mref);
-  if (!RECORD_OR_UNION_TYPE_P (mreftype)
-  || !TYPE_BINFO (mreftype))
+  if (!RECORD_OR_UNION_TYPE_P (mreftype))
+return false;
+  if (!has_virtual_base (mreftype))
 return false;
 
   /* Compute the size of the referenced object (it could be dynamically
@@ -929,9 +980,19 @@ inbounds_vbase_memaccess_p (tree t)
   tree refoff = TREE_OPERAND (mref, 1);
   tree fldoff = int_const_binop (PLUS_EXPR, fldpos, refoff);
 
-  /* Return true if the member offset is less than the size of the complete
- object.  */
-  return tree_int_cst_lt (fldoff, refsize);
+  /* Return false if the member offset is greater or equal to the size
+ of the complete object.  */
+  if (!tree_int_cst_lt (fldoff, refsize))
+return false;
+
+  tree fldsiz = DECL_SIZE_UNIT (fld);
+  if (!fldsiz || TREE_CODE (fldsiz) != INTEGER_CST)
+return false;
+
+  /* Return true if the offset just past the end of the member is less
+ than or equal to the size of the complete object.  */
+  tree fldend = int_const_binop (PLUS_EXPR, fldoff, fldsiz);
+  return tree_int_cst_le 

Re: printers.py issue

2021-03-11 Thread Jonathan Wakely via Gcc-patches

On 10/03/21 16:57 +, Hoyer, David wrote:

I wanted to finally follow up on these issues I reported.

On issue 1, the patch you provided did fix the problem


Thanks for confirming. It's committed to the master branch now (as
attached to this mail). I also plan to backport it to the active
release branches later.


On issue 2, I changed our python script to convert it to a string and it now 
works!


Great. I think it's correct that those constructor methods expect
strings for the 'typename' parameter.



commit 9dacc828bf8b087ddfaf797ed7238da45609082c
Author: Jonathan Wakely 
Date:   Thu Mar 11 16:48:32 2021

libstdc++: Fix find_type helper to work consistently

The find_type helper function sometimes results in "class X::name" and
lookup for that fails. For more details see "Problem 1" in
https://gcc.gnu.org/pipermail/libstdc++/2021-March/052132.html and the
example at https://sourceware.org/bugzilla/show_bug.cgi?id=27510#c2

This patch replaces typ.unqualified() with typ.tag, which is never
qualified, and will never include the 'class' or 'struct' keywords.
Using the .tag attribute should be safe here because we know we are
looking at a class type and we've already used strip_typedefs().

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py (find_type): Use tag attribute
instead of unqualified() method.

diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py b/libstdc++-v3/python/libstdcxx/v6/printers.py
index 00db6cb4e82..550e0ecdd22 100644
--- a/libstdc++-v3/python/libstdcxx/v6/printers.py
+++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
@@ -85,8 +85,8 @@ except ImportError:
 def find_type(orig, name):
 typ = orig.strip_typedefs()
 while True:
-# Strip cv-qualifiers.  PR 67440.
-search = '%s::%s' % (typ.unqualified(), name)
+# Use Type.tag to ignore cv-qualifiers.  PR 67440.
+search = '%s::%s' % (typ.tag, name)
 try:
 return gdb.lookup_type(search)
 except RuntimeError:


[committed] libstdc++: Use acq_rel memory ordering [PR 99537]

2021-03-11 Thread Jonathan Wakely via Gcc-patches
As Lewis Baker wrote in the PR:

> The 'fetch_sub()' operation in _M_release_ownership() should be using
> memory_order::acq_rel instead of memory_order::release. The use of
> 'release' only is insufficient as it does not synchronise with any
> corresponding 'acquire' operation.

> With the current implementation, it's possible that a prior write to
> one of the _M_value or _M_head data-members by a thread releasing the
> second-to-last reference might not be visible to another thread that
> releases the last reference and frees the memory, resulting in
> potential write to freed memory.

This simply changes the memory order to acq_rel as suggested.

libstdc++-v3/ChangeLog:

PR libstdc++/99537
* include/std/stop_token (_Stop_state_t::_M_release_ownership):
Use acq_rel memory ordering.

Tested powerpc64le-linux. Committed to trunk.

commit 15825b17cf3fbf28181c51fe94a2898f448f915c
Author: Jonathan Wakely 
Date:   Thu Mar 11 16:44:57 2021

libstdc++: Use acq_rel memory ordering [PR 99537]

As Lewis Baker wrote in the PR:

> The 'fetch_sub()' operation in _M_release_ownership() should be using
> memory_order::acq_rel instead of memory_order::release. The use of
> 'release' only is insufficient as it does not synchronise with any
> corresponding 'acquire' operation.

> With the current implementation, it's possible that a prior write to
> one of the _M_value or _M_head data-members by a thread releasing the
> second-to-last reference might not be visible to another thread that
> releases the last reference and frees the memory, resulting in
> potential write to freed memory.

This simply changes the memory order to acq_rel as suggested.

libstdc++-v3/ChangeLog:

PR libstdc++/99537
* include/std/stop_token (_Stop_state_t::_M_release_ownership):
Use acq_rel memory ordering.

diff --git a/libstdc++-v3/include/std/stop_token 
b/libstdc++-v3/include/std/stop_token
index 83905f6525f..fffc215d2a8 100644
--- a/libstdc++-v3/include/std/stop_token
+++ b/libstdc++-v3/include/std/stop_token
@@ -185,7 +185,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   void
   _M_release_ownership() noexcept
   {
-   if (_M_owners.fetch_sub(1, memory_order::release) == 1)
+   if (_M_owners.fetch_sub(1, memory_order::acq_rel) == 1)
  delete this;
   }
 


[committed] libstdc++: Make barrier::arrival_token a move-only class type

2021-03-11 Thread Jonathan Wakely via Gcc-patches
The standard only specifies that barrier::arrival_token is a move
constructible and move assignable type. We originally used a scoped enum
type, but that means we do not diagnose non-portable code that makes
copies of arrival tokens (or compares them for equality, or uses them as
keys in map!) This wraps the enum in a move-only class type, so that
users are forced to pass it correctly.

The move constructor and move assignment operator of the new class do
not zero out the moved-from token, as that would add additional
instructions. That means that passing a moved-from token will work with
our implementation, despite being a bug in the user code. We could
consider doing that zeroing out in debug mode.

libstdc++-v3/ChangeLog:

* include/std/barrier (barrier::arrival_token): New move-only
class that encapsulates the underlying token value.

Tested powerpc64le-linux. Committed to trunk.

commit 5643f6f396ef7f60d317aef07dd98978cec6afd0
Author: Jonathan Wakely 
Date:   Thu Mar 11 16:57:20 2021

libstdc++: Make barrier::arrival_token a move-only class type

The standard only specifies that barrier::arrival_token is a move
constructible and move assignable type. We originally used a scoped enum
type, but that means we do not diagnose non-portable code that makes
copies of arrival tokens (or compares them for equality, or uses them as
keys in map!) This wraps the enum in a move-only class type, so that
users are forced to pass it correctly.

The move constructor and move assignment operator of the new class do
not zero out the moved-from token, as that would add additional
instructions. That means that passing a moved-from token will work with
our implementation, despite being a bug in the user code. We could
consider doing that zeroing out in debug mode.

libstdc++-v3/ChangeLog:

* include/std/barrier (barrier::arrival_token): New move-only
class that encapsulates the underlying token value.

diff --git a/libstdc++-v3/include/std/barrier b/libstdc++-v3/include/std/barrier
index e09212dfcb9..6f2b9873500 100644
--- a/libstdc++-v3/include/std/barrier
+++ b/libstdc++-v3/include/std/barrier
@@ -209,15 +209,27 @@ It looks different from literature pseudocode for two 
main reasons:
   __algorithm_t _M_b;
 
 public:
-  using arrival_token = typename 
__tree_barrier<_CompletionF>::arrival_token;
+  class arrival_token final
+  {
+  public:
+   arrival_token(arrival_token&&) = default;
+   arrival_token& operator=(arrival_token&&) = default;
+   ~arrival_token() = default;
+
+  private:
+   friend class barrier;
+   using __token = typename __algorithm_t::arrival_token;
+   explicit arrival_token(__token __tok) noexcept : _M_tok(__tok) { }
+   __token _M_tok;
+  };
 
   static constexpr ptrdiff_t
   max() noexcept
   { return __algorithm_t::max(); }
 
-  explicit barrier(ptrdiff_t __count,
-  _CompletionF __completion = _CompletionF())
- : _M_b(__count, std::move(__completion))
+  explicit
+  barrier(ptrdiff_t __count, _CompletionF __completion = _CompletionF())
+  : _M_b(__count, std::move(__completion))
   { }
 
   barrier(barrier const&) = delete;
@@ -225,11 +237,11 @@ It looks different from literature pseudocode for two 
main reasons:
 
   [[nodiscard]] arrival_token
   arrive(ptrdiff_t __update = 1)
-  { return _M_b.arrive(__update); }
+  { return arrival_token{_M_b.arrive(__update)}; }
 
   void
   wait(arrival_token&& __phase) const
-  { _M_b.wait(std::move(__phase)); }
+  { _M_b.wait(std::move(__phase._M_tok)); }
 
   void
   arrive_and_wait()


[committed] libstdc++: Handle EPERM for filesystem access errors on MacOS [PR 99537]

2021-03-11 Thread Jonathan Wakely via Gcc-patches
Contrary to what POSIX says, some directory operations on MacOS can fail
with EPERM instead of EACCES, so we need to handle both.

libstdc++-v3/ChangeLog:

PR libstdc++/99537
* src/c++17/fs_dir.cc (recursive_directory_iterator): Use new
helper function to check for permission denied errors.
* src/filesystem/dir.cc (recursive_directory_iterator):
Likewise.
* src/filesystem/dir-common.h (is_permission_denied_error): New
helper function.

Tested powerpc64le-linux. Committed to trunk.

commit 8cfb387388a90730ab36ac24d9049677db633a11
Author: Jonathan Wakely 
Date:   Thu Mar 11 16:43:51 2021

libstdc++: Handle EPERM for filesystem access errors on MacOS [PR 99537]

Contrary to what POSIX says, some directory operations on MacOS can fail
with EPERM instead of EACCES, so we need to handle both.

libstdc++-v3/ChangeLog:

PR libstdc++/99537
* src/c++17/fs_dir.cc (recursive_directory_iterator): Use new
helper function to check for permission denied errors.
* src/filesystem/dir.cc (recursive_directory_iterator):
Likewise.
* src/filesystem/dir-common.h (is_permission_denied_error): New
helper function.

diff --git a/libstdc++-v3/src/c++17/fs_dir.cc b/libstdc++-v3/src/c++17/fs_dir.cc
index 994368c25c4..4de0f798d05 100644
--- a/libstdc++-v3/src/c++17/fs_dir.cc
+++ b/libstdc++-v3/src/c++17/fs_dir.cc
@@ -207,7 +207,7 @@ recursive_directory_iterator(const path& p, 
directory_options options,
   else
 {
   const int err = errno;
-  if (err == EACCES
+  if (fs::is_permission_denied_error(err)
  && is_set(options, fs::directory_options::skip_permission_denied))
{
  if (ecptr)
diff --git a/libstdc++-v3/src/filesystem/dir-common.h 
b/libstdc++-v3/src/filesystem/dir-common.h
index 56e279230f4..a49b8304a29 100644
--- a/libstdc++-v3/src/filesystem/dir-common.h
+++ b/libstdc++-v3/src/filesystem/dir-common.h
@@ -141,6 +141,18 @@ struct _Dir_base
   posix::DIR*  dirp;
 };
 
+inline bool
+is_permission_denied_error(int e)
+{
+  if (e == EACCES)
+return true;
+#ifdef __APPLE__
+  if (e == EPERM) // See PR 99533
+return true;
+#endif
+  return false;
+}
+
 } // namespace filesystem
 
 // BEGIN/END macros must be defined before including this file.
diff --git a/libstdc++-v3/src/filesystem/dir.cc 
b/libstdc++-v3/src/filesystem/dir.cc
index acc62986c68..215a9533f1b 100644
--- a/libstdc++-v3/src/filesystem/dir.cc
+++ b/libstdc++-v3/src/filesystem/dir.cc
@@ -202,7 +202,7 @@ recursive_directory_iterator(const path& p, 
directory_options options,
   else
 {
   const int err = errno;
-  if (err == EACCES
+  if (std::filesystem::is_permission_denied_error(err)
  && is_set(options, fs::directory_options::skip_permission_denied))
{
  if (ecptr)


[committed] libstdc++: Initialize std::normal_distribution::_M_saved [PR 99536]

2021-03-11 Thread Jonathan Wakely via Gcc-patches
This avoids a false positive -Wmaybe-uninitialized warning, by
initializing _M_saved on construction.

libstdc++-v3/ChangeLog:

PR libstdc++/99536
* include/bits/random.h (normal_distribution): Use
default-initializer for _M_saved and _M_saved_available.

Tested powerpc64le-linux. Committed to trunk.

commit 67e397660611990efd98f9e4106c1ee81f6803a4
Author: Jonathan Wakely 
Date:   Thu Mar 11 16:43:51 2021

libstdc++: Initialize std::normal_distribution::_M_saved [PR 99536]

This avoids a false positive -Wmaybe-uninitialized warning, by
initializing _M_saved on construction.

libstdc++-v3/ChangeLog:

PR libstdc++/99536
* include/bits/random.h (normal_distribution): Use
default-initializer for _M_saved and _M_saved_available.

diff --git a/libstdc++-v3/include/bits/random.h 
b/libstdc++-v3/include/bits/random.h
index b74e9cfd504..877ac3406b6 100644
--- a/libstdc++-v3/include/bits/random.h
+++ b/libstdc++-v3/include/bits/random.h
@@ -2024,12 +2024,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   explicit
   normal_distribution(result_type __mean,
  result_type __stddev = result_type(1))
-  : _M_param(__mean, __stddev), _M_saved_available(false)
+  : _M_param(__mean, __stddev)
   { }
 
   explicit
   normal_distribution(const param_type& __p)
-  : _M_param(__p), _M_saved_available(false)
+  : _M_param(__p)
   { }
 
   /**
@@ -2166,8 +2166,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
const param_type& __p);
 
   param_type  _M_param;
-  result_type _M_saved;
-  bool_M_saved_available;
+  result_type _M_saved = 0;
+  bool_M_saved_available = false;
 };
 
   /**


Re: [Bug libstdc++/99402] [10/11 Regression] std::copy creates _GLIBCXX_DEBUG false positive for attempt to subscript a dereferenceable (start-of-sequence) iterator

2021-03-11 Thread François Dumont via Gcc-patches

I eventually prefer to propose this version.

Compared to the previous one I have the _M_can_advance calling the 
former one with correct number of elements to check for advance. And the 
former _M_can_advance is also properly making use of the 
__dp_sign_max_size precision.


Here is the revisited git log:

    libstdc++: [_GLIBCXX_DEBUG] Fix management of __dp_sign_max_size 
[PR 99402]


    __dp_sign precision indicates that we found out what iterator comes 
first or
    last in the range. __dp_sign_max_size is the same plus it gives the 
information
    of the max size of the range that is to say the max_size value such 
that

    distance(lhs, rhs) < max_size.
    Thanks to this additional information we are able to tell when a 
copy of n elements

    to that range will fail even if we do not know exactly how large it is.

    This patch makes sure that we are properly using this information.

    libstdc++-v3/ChangeLog:

    PR libstdc++/99402
    * include/debug/helper_functions.h 
(__can_advance(_InputIterator,

    const std::pair<_Diff, _Distance_precision>&, int)): New.
    (__can_advance(const _Safe_iterator<>&,
    const std::pair<_Diff, _Distance_precision>&, int)): New.
    * include/debug/macros.h 
(__glibcxx_check_can_increment_dist): New,

    use latter.
    (__glibcxx_check_can_increment_range): Adapt to use latter.
    (__glibcxx_check_can_decrement_range): Likewise.
    * include/debug/safe_iterator.h
    (_Safe_iterator<>::_M_can_advance(const std::pair<_Diff, 
_Distance_precision>&,

    int)): New.
    (__can_advance(const _Safe_iterator<>&,
    const std::pair<_Diff, _Distance_precision>&, int)): New.
    * include/debug/safe_iterator.tcc
    (_Safe_iterator<>::_M_can_advance(const std::pair<_Diff, 
_Distance_precision>&,

    int)): New.
    (_Safe_iterator<>::_M_valid_range(const _Safe_iterator<>&,
    std::pair&, bool)): 
Adapt for

    __dp_sign_max_size.
    (__copy_move_a): Adapt to use 
__glibcxx_check_can_increment_dist.

    (__copy_move_backward_a): Likewise.
    (__equal_aux): Likewise.
    * include/debug/stl_iterator.h (__can_advance(const 
std::reverse_iterator<>&,

    const std::pair<_Diff, _Distance_precision>&, int)): New.
    (__can_advance(const std::move_iterator<>&,
    const std::pair<_Diff, _Distance_precision>&, int)): New.
    * testsuite/25_algorithms/copy/debug/99402.cc: New test.

Ok to commit if tests are all PASS ?

François

On 07/03/21 10:30 pm, François Dumont wrote:

Here is the patch to correctly deal with the new __dp_sign_max_size.

I prefer to introduce new __can_advance overloads for this to 
correctly deal with the _Distance_precision in it. _M_valid_range was 
also ignoring __dp_sign_max_size.


    libstdc++: [_GLIBCXX_DEBUG] Fix management of __dp_sign_max_size 
[PR 99402]


    libstdc++-v3/ChangeLog:

    PR libstdc++/99402
    * include/debug/helper_functions.h 
(__can_advance(_InputIterator,

    const std::pair<_Diff, _Distance_precision>&, int)): New.
    (__can_advance(const _Safe_iterator<>&,
    const std::pair<_Diff, _Distance_precision>&, int)): New.
    * include/debug/macros.h 
(__glibcxx_check_can_increment_dist): New,

    use latter.
    (__glibcxx_check_can_increment_range): Adapt to use latter.
    (__glibcxx_check_can_decrement_range): Likewise.
    * include/debug/safe_iterator.h
    (_Safe_iterator<>::_M_can_advance(const std::pair<_Diff, 
_Distance_precision>&,

    int)): New.
    (__can_advance(const _Safe_iterator<>&,
    const std::pair<_Diff, _Distance_precision>&, int)): New.
    * include/debug/safe_iterator.tcc
    (_Safe_iterator<>::_M_can_advance(const std::pair<_Diff, 
_Distance_precision>&,

    int)): New.
    (_Safe_iterator<>::_M_valid_range(const _Safe_iterator<>&,
    std::pair&, bool)): 
Adapt for

    __dp_sign_max_size.
    (__copy_move_a): Adapt to use 
__glibcxx_check_can_increment_dist.

    (__copy_move_backward_a): Likewise.
    (__equal_aux): Likewise.
    * include/debug/stl_iterator.h (__can_advance(const 
std::reverse_iterator<>&,

    const std::pair<_Diff, _Distance_precision>&, int)): New.
    (__can_advance(const std::move_iterator<>&,
    const std::pair<_Diff, _Distance_precision>&, int)): New.
    * testsuite/25_algorithms/copy/debug/99402.cc: New test.

Tested under Linux x86_64.

Ok to commit ?

François



diff --git a/libstdc++-v3/include/debug/helper_functions.h b/libstdc++-v3/include/debug/helper_functions.h
index 513e5460eba..c0144ced979 100644
--- a/libstdc++-v3/include/debug/helper_functions.h
+++ 

c++: template partial instantiation mismatch [PR 99528]

2021-03-11 Thread Nathan Sidwell
This turned out	to be an existing problem, which had been hidden by 
other bugs.  Templated members of templated classes can end up 
instantiating the template itself, and we were not handling the 
mergeableness of that correctly.


PR c++/99528
gcc/cp/
* module.cc (enum merge_kind): Delete MK_type_tmpl_spec,
MK_decl_tmpl_spec.
(trees_in::decl_value): Adjust add_mergeable_specialization call.
(trees_out::get_merge_kind): Adjust detecting a partial template
instantiation.
(trees_out::key_mergeable): Adjust handling same.
(trees_in::key_mergeabvle): Likewise.
gcc/testsuite/
* g++.dg/modules/pr99528.h: New.
* g++.dg/modules/pr99528_a.H: New.
* g++.dg/modules/pr99528_b.H: New.
* g++.dg/modules/pr99528_c.C: New.

--
Nathan Sidwell
diff --git c/gcc/cp/module.cc w/gcc/cp/module.cc
index db5fa9076c3..03359db28e1 100644
--- c/gcc/cp/module.cc
+++ w/gcc/cp/module.cc
@@ -2783,11 +2783,7 @@ enum merge_kind
   MK_tmpl_tmpl_mask = 0x1, /* We want TEMPLATE_DECL.  */
 
   MK_type_spec = MK_template_mask,
-  MK_type_tmpl_spec = MK_type_spec | MK_tmpl_tmpl_mask,
-
   MK_decl_spec = MK_template_mask | MK_tmpl_decl_mask,
-  MK_decl_tmpl_spec = MK_decl_spec | MK_tmpl_tmpl_mask,
-
   MK_alias_spec = MK_decl_spec | MK_tmpl_alias_mask,
 
   MK_hwm = 0x20
@@ -8062,10 +8058,10 @@ trees_in::decl_value ()
 	{
 	  /* Add to specialization tables now that constraints etc are
 	 added.  */
-	  bool is_decl = (mk & MK_template_mask) && (mk & MK_tmpl_decl_mask);
+	  bool is_type = mk == MK_partial || !(mk & MK_tmpl_decl_mask);
 
-	  spec.spec = is_decl ? inner : type;
-	  add_mergeable_specialization (is_decl, , decl, spec_flags);
+	  spec.spec = is_type ? type : mk & MK_tmpl_tmpl_mask ? inner : decl;
+	  add_mergeable_specialization (!is_type, , decl, spec_flags);
 	}
 
   if (TREE_CODE (decl) == INTEGER_CST && !TREE_OVERFLOW (decl))
@@ -10229,7 +10225,6 @@ trees_out::get_merge_kind (tree decl, depset *dep)
 case depset::EK_SPECIALIZATION:
   {
 	gcc_checking_assert (dep->is_special ());
-	spec_entry *entry = reinterpret_cast  (dep->deps[0]);
 
 	if (TREE_CODE (DECL_CONTEXT (decl)) == FUNCTION_DECL)
 	  /* An block-scope classes of templates are themselves
@@ -10247,13 +10242,8 @@ trees_out::get_merge_kind (tree decl, depset *dep)
 
 	if (TREE_CODE (decl) == TEMPLATE_DECL)
 	  {
-	tree res = DECL_TEMPLATE_RESULT (decl);
-	if (!(mk & MK_tmpl_decl_mask))
-	  res = TREE_TYPE (res);
-
-	if (res == entry->spec)
-	  /* We check we can get back to the template during
-		 streaming.  */
+	spec_entry *entry = reinterpret_cast  (dep->deps[0]);
+	if (TREE_CODE (entry->spec) != TEMPLATE_DECL)
 	  mk = merge_kind (mk | MK_tmpl_tmpl_mask);
 	  }
   }
@@ -10362,15 +10352,14 @@ trees_out::key_mergeable (int tag, merge_kind mk, tree decl, tree inner,
 		gcc_assert (match_mergeable_specialization (false, entry)
 			== TREE_TYPE (existing));
 	  else if (mk & MK_tmpl_tmpl_mask)
-		if (tree ti = DECL_TEMPLATE_INFO (existing))
-		  existing = TI_TEMPLATE (ti);
+		existing = DECL_TI_TEMPLATE (existing);
 	}
 	  else
 	{
-	  if (!(mk & MK_tmpl_tmpl_mask))
+	  if (mk & MK_tmpl_tmpl_mask)
+		existing = CLASSTYPE_TI_TEMPLATE (existing);
+	  else
 		existing = TYPE_NAME (existing);
-	  else if (tree ti = CLASSTYPE_TEMPLATE_INFO (existing))
-		existing = TI_TEMPLATE (ti);
 	}
 
 	  /* The walkabout should have found ourselves.  */
@@ -10677,13 +10666,6 @@ trees_in::key_mergeable (int tag, merge_kind mk, tree decl, tree inner,
   DECL_NAME (inner) = DECL_NAME (decl);
   DECL_CONTEXT (inner) = DECL_CONTEXT (decl);
 
-  spec.spec = decl;
-  if (mk & MK_tmpl_tmpl_mask)
-	{
-	  if (inner == decl)
-	return error_mark_node;
-	  spec.spec = inner;
-	}
   tree constr = NULL_TREE;
   bool is_decl = mk & MK_tmpl_decl_mask;
   if (is_decl)
@@ -10694,13 +10676,10 @@ trees_in::key_mergeable (int tag, merge_kind mk, tree decl, tree inner,
 	  if (constr)
 		set_constraints (inner, constr);
 	}
+	  spec.spec = (mk & MK_tmpl_tmpl_mask) ? inner : decl;
 	}
   else
-	{
-	  if (mk == MK_type_spec && inner != decl)
-	return error_mark_node;
-	  spec.spec = type;
-	}
+	spec.spec = type;
   existing = match_mergeable_specialization (is_decl, );
   if (constr)
 	/* We'll add these back later, if this is the new decl.  */
@@ -10712,24 +10691,15 @@ trees_in::key_mergeable (int tag, merge_kind mk, tree decl, tree inner,
 	{
 	  /* A declaration specialization.  */
 	  if (mk & MK_tmpl_tmpl_mask)
-	if (tree ti = DECL_TEMPLATE_INFO (existing))
-	  {
-		tree tmpl = TI_TEMPLATE (ti);
-		if (DECL_TEMPLATE_RESULT (tmpl) == existing)
-		  existing = tmpl;
-	  }
+	existing = DECL_TI_TEMPLATE (existing);
 	}
   else
 	{
 	  /* A type specialization.  */
-	  if (!(mk & MK_tmpl_tmpl_mask))
+	  if (mk & MK_tmpl_tmpl_mask)
+	

[PATCH 4/4] libstdc++: Add fallback 128-bit integer class type and use it

2021-03-11 Thread Patrick Palka via Gcc-patches
This implements a minimal integer class type that emulates 128-bit
unsigned arithmetic using a pair of 64-bit integers, which the
floating-point std::to_chars implementation then uses as a drop-in
replacement for unsigned __int128 on targets that lack the latter.
This allows us to fully support formatting of large long double types
on targets that lack __int128.

Since Ryu performs 128-bit division/modulus only by 2, 5 and 10, the
integer class type supports only these divisors rather than supporting
general division/modulus.

Tested on x86, x86_64, ppc64le, ppc64be and aarch64, with and without
performing the equivalent of -U__SIZEOF_INT128__ in floating_to_chars.cc
(so that we also test using the class type on targets when __int128 is
available).

libstdc++-v3/ChangeLog:

* src/c++17/floating_to_chars.cc: Simplify the file as if
__SIZEOF_INT128__ is always defined.
[!defined __SIZEOF_INT128__]: Include "uint128_t.h".  Define
a to_chars overload for the uint128_t class type.
* src/c++17/uint128_t.h: New file.
* testsuite/20_util/to_chars/long_double.cc: No longer expect an
execution FAIL on targets that have a large long double type
but lack __int128.
---
 libstdc++-v3/src/c++17/floating_to_chars.cc   |  58 ++--
 libstdc++-v3/src/c++17/uint128_t.h| 297 ++
 .../testsuite/20_util/to_chars/long_double.cc |   1 -
 3 files changed, 332 insertions(+), 24 deletions(-)
 create mode 100644 libstdc++-v3/src/c++17/uint128_t.h

diff --git a/libstdc++-v3/src/c++17/floating_to_chars.cc 
b/libstdc++-v3/src/c++17/floating_to_chars.cc
index da3fbaa1ed1..86f4401e134 100644
--- a/libstdc++-v3/src/c++17/floating_to_chars.cc
+++ b/libstdc++-v3/src/c++17/floating_to_chars.cc
@@ -64,25 +64,19 @@ extern "C" int __sprintfieee128(char*, const char*, ...);
 
 #if __LDBL_MANT_DIG__ == __DBL_MANT_DIG__
 # define LONG_DOUBLE_KIND LDK_BINARY64
-#elif defined(__SIZEOF_INT128__)
-// The Ryu routines need a 128-bit integer type in order to do shortest
-// formatting of types larger than 64-bit double, so without __int128 we can't
-// support any large long double format.  This is the case for e.g. i386.
-# if __LDBL_MANT_DIG__ == 64
+#elif __LDBL_MANT_DIG__ == 64
 #  define LONG_DOUBLE_KIND LDK_FLOAT80
-# elif __LDBL_MANT_DIG__ == 113
-#  define LONG_DOUBLE_KIND LDK_BINARY128
-# elif __LDBL_MANT_DIG__ == 106
-#  define LONG_DOUBLE_KIND LDK_IBM128
-# endif
-# if defined _GLIBCXX_USE_FLOAT128 && __FLT128_MANT_DIG__ == 113
-// Define overloads of std::to_chars for __float128.
-#  define FLOAT128_TO_CHARS 1
-# endif
+#elif __LDBL_MANT_DIG__ == 113
+# define LONG_DOUBLE_KIND LDK_BINARY128
+#elif __LDBL_MANT_DIG__ == 106
+# define LONG_DOUBLE_KIND LDK_IBM128
+#else
+# define LONG_DOUBLE_KIND LDK_UNSUPPORTED
 #endif
 
-#if !defined(LONG_DOUBLE_KIND)
-# define LONG_DOUBLE_KIND LDK_UNSUPPORTED
+#if defined _GLIBCXX_USE_FLOAT128 && __FLT128_MANT_DIG__ == 113
+// Define overloads of std::to_chars for __float128.
+# define FLOAT128_TO_CHARS 1
 #endif
 
 // For now we only support __float128 when it's the powerpc64 __ieee128 type.
@@ -100,6 +94,8 @@ namespace
 {
 #if defined __SIZEOF_INT128__
   using uint128_t = unsigned __int128;
+#else
+# include "uint128_t.h"
 #endif
 
   namespace ryu
@@ -114,7 +110,6 @@ namespace
 #include "ryu/d2fixed.c"
 #include "ryu/f2s.c"
 
-#ifdef __SIZEOF_INT128__
 namespace generic128
 {
   // Put the generic Ryu bits in their own namespace to avoid name 
conflicts.
@@ -129,7 +124,6 @@ namespace
 int
 to_chars(const floating_decimal_128 v, char* const result)
 { return generic128::generic_to_chars(v, result); }
-#endif
   } // namespace ryu
 
   // A traits class that contains pertinent information about the binary
@@ -407,10 +401,8 @@ namespace
  return uint32_t{};
else if constexpr (total_bits <= 64)
  return uint64_t{};
-#ifdef __SIZEOF_INT128__
else if constexpr (total_bits <= 128)
  return uint128_t{};
-#endif
   };
   using uint_t = decltype(get_uint_t());
   uint_t value_bits = 0;
@@ -503,7 +495,6 @@ namespace
return ryu::floating_to_fd32(value);
   else if constexpr (std::is_same_v)
return ryu::floating_to_fd64(value);
-#ifdef __SIZEOF_INT128__
   else if constexpr (std::is_same_v
 || std::is_same_v)
{
@@ -519,7 +510,6 @@ namespace
mantissa_bits, exponent_bits,
!has_implicit_leading_bit);
}
-#endif
 }
 
   // This subroutine returns true if the shortest scientific form fd is a
@@ -558,10 +548,32 @@ namespace
   get_mantissa_length(const ryu::floating_decimal_64 fd)
   { return ryu::decimalLength17(fd.mantissa); }
 
-#ifdef __SIZEOF_INT128__
   int
   get_mantissa_length(const ryu::floating_decimal_128 fd)
   { return ryu::generic128::decimalLength(fd.mantissa); }
+
+#if !defined 

[PATCH 3/4] libstdc++: Remove Ryu's uint128_t aliases

2021-03-11 Thread Patrick Palka via Gcc-patches
This makes Ryu consistently use the uint128_t alias defined in
floating_to_chars.cc.

libstdc++-v3/ChangeLog:

* src/c++17/ryu/LOCAL_PATCHES: Update.
* src/c++17/ryu/d2s_intrinsics.h: Don't define uint128_t.
* src/c++17/ryu/generic_128.h: Likewise.
* src/c++17/ryu/ryu_generic_128.h (struct floating_decimal_128):
Use uint128_t instead of __uint128_t.
(generic_binary_to_decimal): Likewise.
---
 libstdc++-v3/src/c++17/ryu/LOCAL_PATCHES | 1 +
 libstdc++-v3/src/c++17/ryu/d2s_intrinsics.h  | 4 
 libstdc++-v3/src/c++17/ryu/generic_128.h | 3 ---
 libstdc++-v3/src/c++17/ryu/ryu_generic_128.h | 4 ++--
 4 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/libstdc++-v3/src/c++17/ryu/LOCAL_PATCHES 
b/libstdc++-v3/src/c++17/ryu/LOCAL_PATCHES
index c26633ee6c2..a72168c1f71 100644
--- a/libstdc++-v3/src/c++17/ryu/LOCAL_PATCHES
+++ b/libstdc++-v3/src/c++17/ryu/LOCAL_PATCHES
@@ -1 +1,2 @@
 r11-6248
+r11-
diff --git a/libstdc++-v3/src/c++17/ryu/d2s_intrinsics.h 
b/libstdc++-v3/src/c++17/ryu/d2s_intrinsics.h
index fa993e6fad6..bbac4dfd48f 100644
--- a/libstdc++-v3/src/c++17/ryu/d2s_intrinsics.h
+++ b/libstdc++-v3/src/c++17/ryu/d2s_intrinsics.h
@@ -28,10 +28,6 @@
 #define HAS_64_BIT_INTRINSICS
 #endif
 
-#if defined(HAS_UINT128)
-typedef __uint128_t uint128_t;
-#endif
-
 #if defined(HAS_64_BIT_INTRINSICS)
 
 
diff --git a/libstdc++-v3/src/c++17/ryu/generic_128.h 
b/libstdc++-v3/src/c++17/ryu/generic_128.h
index 88e96776664..d3ca398fbdb 100644
--- a/libstdc++-v3/src/c++17/ryu/generic_128.h
+++ b/libstdc++-v3/src/c++17/ryu/generic_128.h
@@ -17,9 +17,6 @@
 #ifndef RYU_GENERIC128_H
 #define RYU_GENERIC128_H
 
-
-typedef __uint128_t uint128_t;
-
 #define FLOAT_128_POW5_INV_BITCOUNT 249
 #define FLOAT_128_POW5_BITCOUNT 249
 #define POW5_TABLE_SIZE 56
diff --git a/libstdc++-v3/src/c++17/ryu/ryu_generic_128.h 
b/libstdc++-v3/src/c++17/ryu/ryu_generic_128.h
index f5d476343b6..2afbf274e11 100644
--- a/libstdc++-v3/src/c++17/ryu/ryu_generic_128.h
+++ b/libstdc++-v3/src/c++17/ryu/ryu_generic_128.h
@@ -37,7 +37,7 @@ extern "C" {
 
 // A floating decimal representing (-1)^s * m * 10^e.
 struct floating_decimal_128 {
-  __uint128_t mantissa;
+  uint128_t mantissa;
   int32_t exponent;
   bool sign;
 };
@@ -52,7 +52,7 @@ struct floating_decimal_128 long_double_to_fd128(long double 
d);
 // Converts the given binary floating point number to the shortest decimal 
floating point number
 // that still accurately represents it.
 struct floating_decimal_128 generic_binary_to_decimal(
-const __uint128_t bits, const uint32_t mantissaBits, const uint32_t 
exponentBits, const bool explicitLeadingBit);
+const uint128_t bits, const uint32_t mantissaBits, const uint32_t 
exponentBits, const bool explicitLeadingBit);
 
 // Converts the given decimal floating point number to a string, writing to 
result, and returning
 // the number characters written. Does not terminate the buffer with a 0. In 
the worst case, this
-- 
2.31.0.rc2



[PATCH 2/4] libstdc++: Add LOCAL_PATCHES file to Ryu sources

2021-03-11 Thread Patrick Palka via Gcc-patches
This file keeps track of the local modifications we've made to our copy
of Ryu.

libstdc++-v3/ChangeLog:

* src/c++17/ryu/LOCAL_PATCHES: New file.
---
 libstdc++-v3/src/c++17/ryu/LOCAL_PATCHES | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 libstdc++-v3/src/c++17/ryu/LOCAL_PATCHES

diff --git a/libstdc++-v3/src/c++17/ryu/LOCAL_PATCHES 
b/libstdc++-v3/src/c++17/ryu/LOCAL_PATCHES
new file mode 100644
index 000..c26633ee6c2
--- /dev/null
+++ b/libstdc++-v3/src/c++17/ryu/LOCAL_PATCHES
@@ -0,0 +1 @@
+r11-6248
-- 
2.31.0.rc2



[PATCH 1/4] libstdc++: Factor out uses of __int128 into a type alias

2021-03-11 Thread Patrick Palka via Gcc-patches
Since Ryu has an alias uint128_t for this same purpose, it seems best
for us to use this name as well, so as to minimize the amount of local
modifications we'd need to make to our copy of Ryu.  (In a subsequent
patch, we're going to remove Ryu's aliases so that it uses the one
defined in floating_to_chars.cc.)

libstdc++-v3/ChangeLog:

* src/c++17/floating_to_chars.cc (uint128_t): New conditionally
defined alias of unsigned __int128.
(floating_type_traits_binary128::mantissa_t): Use uint128_t
instead of unsigned __int128.
[LONG_DOUBLE_KIND == LDK_IBM128]
(floating_type_traits::mantissa_t): Likewise.
(get_ieee_repr): Likewise.  Make casts from uint_t to mantissa_t
and uint32_t explicit.  Simplify the extraction of mantissa,
exponent and sign bit.
---
 libstdc++-v3/src/c++17/floating_to_chars.cc | 25 +
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/libstdc++-v3/src/c++17/floating_to_chars.cc 
b/libstdc++-v3/src/c++17/floating_to_chars.cc
index 611747bb99e..da3fbaa1ed1 100644
--- a/libstdc++-v3/src/c++17/floating_to_chars.cc
+++ b/libstdc++-v3/src/c++17/floating_to_chars.cc
@@ -98,6 +98,10 @@ using F128_type = void;
 
 namespace
 {
+#if defined __SIZEOF_INT128__
+  using uint128_t = unsigned __int128;
+#endif
+
   namespace ryu
   {
 #include "ryu/common.h"
@@ -171,7 +175,7 @@ namespace
 static constexpr int mantissa_bits = 112;
 static constexpr int exponent_bits = 15;
 static constexpr bool has_implicit_leading_bit = true;
-using mantissa_t = unsigned __int128;
+using mantissa_t = uint128_t;
 using shortest_scientific_t = ryu::floating_decimal_128;
 
 static constexpr uint64_t pow10_adjustment_tab[]
@@ -367,7 +371,7 @@ namespace
   static constexpr int mantissa_bits = 105;
   static constexpr int exponent_bits = 11;
   static constexpr bool has_implicit_leading_bit = true;
-  using mantissa_t = unsigned __int128;
+  using mantissa_t = uint128_t;
   using shortest_scientific_t = ryu::floating_decimal_128;
 
   static constexpr uint64_t pow10_adjustment_tab[]
@@ -393,6 +397,7 @@ namespace
 ieee_t
 get_ieee_repr(const T value)
 {
+  using mantissa_t = typename floating_type_traits::mantissa_t;
   constexpr int mantissa_bits = floating_type_traits::mantissa_bits;
   constexpr int exponent_bits = floating_type_traits::exponent_bits;
   constexpr int total_bits = mantissa_bits + exponent_bits + 1;
@@ -404,7 +409,7 @@ namespace
  return uint64_t{};
 #ifdef __SIZEOF_INT128__
else if constexpr (total_bits <= 128)
- return (unsigned __int128){};
+ return uint128_t{};
 #endif
   };
   using uint_t = decltype(get_uint_t());
@@ -412,10 +417,13 @@ namespace
   memcpy(_bits, , sizeof(value));
 
   ieee_t ieee_repr;
-  ieee_repr.mantissa = value_bits & ((uint_t{1} << mantissa_bits) - 1u);
+  ieee_repr.mantissa
+   = static_cast(value_bits & ((uint_t{1} << mantissa_bits) - 
1u));
+  value_bits >>= mantissa_bits;
   ieee_repr.biased_exponent
-   = (value_bits >> mantissa_bits) & ((uint_t{1} << exponent_bits) - 1u);
-  ieee_repr.sign = (value_bits >> (mantissa_bits + exponent_bits)) & 1;
+   = static_cast(value_bits & ((uint_t{1} << exponent_bits) - 
1u));
+  value_bits >>= exponent_bits;
+  ieee_repr.sign = (value_bits & 1) != 0;
   return ieee_repr;
 }
 
@@ -430,7 +438,6 @@ namespace
   // mantissa (plus an implicit leading bit).  We use the exponent and sign
   // of the high part, and we merge the mantissa of the high part with the
   // mantissa (and the implicit leading bit) of the low part.
-  using uint_t = unsigned __int128;
   uint64_t value_bits[2] = {};
   memcpy(value_bits, , sizeof(value_bits));
 
@@ -478,8 +485,8 @@ namespace
}
 
   ieee_t ieee_repr;
-  ieee_repr.mantissa = ((uint_t{mantissa_hi} << 64)
-   | (uint_t{mantissa_lo} << 4)) >> 11;
+  ieee_repr.mantissa = ((uint128_t{mantissa_hi} << 64)
+   | (uint128_t{mantissa_lo} << 4)) >> 11;
   ieee_repr.biased_exponent = exponent_hi;
   ieee_repr.sign = sign_hi;
   return ieee_repr;
-- 
2.31.0.rc2



[Patch] Fortran: Fix func decl mismatch [PR93660]

2021-03-11 Thread Tobias Burnus

This fixes an ICE with OpenMP 'omp decare simd' but is a generic bug.

Namely TREE_TYPE(fndecl) has a mismatch to the arglist chain,
missing some hidden arguments with -fcoarray=lib.

OK for mainline and GCC 10?

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf
Fortran: Fix func decl mismatch [PR93660]

gcc/fortran/ChangeLog:

	PR fortran/93660
	* trans-decl.c (build_function_decl): Add comment;
	increment hidden_typelist for caf_token/caf_offset.
	* trans-types.c (gfc_get_function_type): Add comment;
	add missing caf_token/caf_offset args.

gcc/testsuite/ChangeLog:

	PR fortran/93660
	* gfortran.dg/gomp/declare-simd-coarray-lib.f90: New test.

 gcc/fortran/trans-decl.c|  6 +-
 gcc/fortran/trans-types.c   | 21 -
 .../gfortran.dg/gomp/declare-simd-coarray-lib.f90   | 12 
 3 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index 6a4ed9bbfdd..34a0d49bae7 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -2488,7 +2488,9 @@ build_function_decl (gfc_symbol * sym, bool global)
 }
 
 
-/* Create the DECL_ARGUMENTS for a procedure.  */
+/* Create the DECL_ARGUMENTS for a procedure.
+   NOTE: The arguments added here must match the argument type created by
+   gfc_get_function_type ().  */
 
 static void
 create_function_arglist (gfc_symbol * sym)
@@ -2807,6 +2809,7 @@ create_function_arglist (gfc_symbol * sym)
 	  DECL_ARG_TYPE (token) = TREE_VALUE (typelist);
 	  TREE_READONLY (token) = 1;
 	  hidden_arglist = chainon (hidden_arglist, token);
+	  hidden_typelist = TREE_CHAIN (hidden_typelist);
 	  gfc_finish_decl (token);
 
 	  offset = build_decl (input_location, PARM_DECL,
@@ -2832,6 +2835,7 @@ create_function_arglist (gfc_symbol * sym)
 	  DECL_ARG_TYPE (offset) = TREE_VALUE (typelist);
 	  TREE_READONLY (offset) = 1;
 	  hidden_arglist = chainon (hidden_arglist, offset);
+	  hidden_typelist = TREE_CHAIN (hidden_typelist);
 	  gfc_finish_decl (offset);
 	}
 
diff --git a/gcc/fortran/trans-types.c b/gcc/fortran/trans-types.c
index ccdc4687c39..bc7aac1f3d9 100644
--- a/gcc/fortran/trans-types.c
+++ b/gcc/fortran/trans-types.c
@@ -3011,6 +3011,10 @@ create_fn_spec (gfc_symbol *sym, tree fntype)
   return build_type_attribute_variant (fntype, tmp);
 }
 
+
+/* NOTE: The returned function type must match the argument list created by
+   create_function_arglist.  */
+
 tree
 gfc_get_function_type (gfc_symbol * sym, gfc_actual_arglist *actual_args,
 		   const char *fnspec)
@@ -3119,10 +3123,11 @@ gfc_get_function_type (gfc_symbol * sym, gfc_actual_arglist *actual_args,
 }
 }
 
-  /* Add hidden string length parameters.  */
+  /* Add hidden arguments.  */
   for (f = gfc_sym_get_dummy_args (sym); f; f = f->next)
 {
   arg = f->sym;
+  /* Add hidden string length parameters.  */
   if (arg && arg->ts.type == BT_CHARACTER && !sym->attr.is_bind_c)
 	{
 	  if (!arg->ts.deferred)
@@ -3145,6 +3150,20 @@ gfc_get_function_type (gfc_symbol * sym, gfc_actual_arglist *actual_args,
 	   && arg->ts.type != BT_CLASS
 	   && !gfc_bt_struct (arg->ts.type))
 	vec_safe_push (typelist, boolean_type_node);
+  /* Coarrays which are descriptorless or assumed-shape pass with
+	 -fcoarray=lib the token and the offset as hidden arguments.  */
+  else if (arg
+	   && flag_coarray == GFC_FCOARRAY_LIB
+	   && ((arg->ts.type != BT_CLASS
+		&& arg->attr.codimension
+		&& !arg->attr.allocatable)
+		   || (arg->ts.type == BT_CLASS
+		   && CLASS_DATA (arg)->attr.codimension
+		   && !CLASS_DATA (arg)->attr.allocatable)))
+	{
+	  vec_safe_push (typelist, pvoid_type_node);  /* caf_token.  */
+	  vec_safe_push (typelist, gfc_array_index_type);  /* caf_offset.  */
+	}
 }
 
   if (!vec_safe_is_empty (typelist)
diff --git a/gcc/testsuite/gfortran.dg/gomp/declare-simd-coarray-lib.f90 b/gcc/testsuite/gfortran.dg/gomp/declare-simd-coarray-lib.f90
new file mode 100644
index 000..1f74da76ffe
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/declare-simd-coarray-lib.f90
@@ -0,0 +1,12 @@
+! { dg-additional-options "-fcoarray=lib" }
+!
+! PR fortran/93660
+!
+! Failed as TREE_TYPE(fndecl) did not include the
+! hidden caf_token/caf_offset arguments.
+!
+integer function f(x)
+   integer :: x[*]
+   !$omp declare simd
+   f = x[1]
+end


Re: [PATCH v1] libstdc++-v3: Update VTV vars for libtool link commands [PR99172]

2021-03-11 Thread Jonathan Wakely via Gcc-patches

On 11/03/21 17:46 +0100, Jakub Jelinek via Libstdc++ wrote:

On Thu, Mar 11, 2021 at 04:31:51PM +, Jonathan Wakely via Gcc-patches wrote:

On 11/03/21 16:27 +, Jonathan Wakely wrote:
> That seems cleaner to me, rather than adding another variable with
> minor differences from the existing AM_CXXFLAGS.

My specific concern is that AM_CXXFLAGS and AM_CXXFLAGS_LT will get
out of sync, i.e. we'll add something to the former and forget to add
it to the latter.

If we keep using AM_CXXFLAGS but cancel out the -fvtable-verify=std
option, then there aren't two separate variables that can diverge.

But I think it's too late in the gcc-11 process for that kind of
refactoring.


I think $(filter-out -fvtable-verify=std,$(AM_CXXFLAGS)) should be fairly
simple thing if that is all that needs to be done.


Yes, we could do that now, and then in stage 1 look at the other
changes (like moving -Wl,-u options to the link flags not cxxflags).

Using filter-out does assume that no target is going to add anything
different that should also be filtered out, but that's true as of
today.



Re: [WIP] Re: [PATCH] openmp: Fix intermittent hanging of task-detach-6 libgomp tests [PR98738]

2021-03-11 Thread Thomas Schwinge
Hi!

On 2021-02-23T22:52:38+0100, Jakub Jelinek via Gcc-patches 
 wrote:
> On Tue, Feb 23, 2021 at 09:43:51PM +, Kwok Cheung Yeung wrote:
>> On 19/02/2021 7:12 pm, Kwok Cheung Yeung wrote:
>> > I have included the current state of my patch. All task-detach-* tests
>> > pass when executed without offloading or with offloading to GCN, but
>> > with offloading to Nvidia, task-detach-6.* hangs consistently but
>> > everything else passes (probably because of the missing
>> > gomp_team_barrier_done?).
>>
>> It looks like the hang has nothing to do with the detach patch - this hangs
>> consistently for me when offloaded to NVPTX:
>>
>> #include 
>>
>> int main (void)
>> {
>> #pragma omp target
>>   #pragma omp parallel
>> #pragma omp task
>>   ;
>> }
>>
>> This doesn't hang when offloaded to GCN or the host device, or if
>> num_threads(1) is specified on the omp parallel.

So, I reproduced this the hard way;
 :-/

Please always file issues when you run into such things.  I've now filed
PR99555 "[OpenMP/nvptx] Execution-time hang for simple nested OpenMP
'target'/'parallel'/'task' constructs".

> Then it can be solved separately, I'll try to have a look if I see something
> bad from the dumps, but I admit I don't have much experience with debugging
> NVPTX offloaded code...

Any luck?


Until this gets resolved properly, OK to push something like the attached
(currently testing) "Avoid OpenMP/nvptx execution-time hangs for simple
nested OpenMP 'target'/'parallel'/'task' constructs [PR99555]"?


Grüße
 Thomas


-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf
>From 61cb5c237ec3a402696797e5459f181d501cd0fb Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Thu, 11 Mar 2021 17:01:22 +0100
Subject: [PATCH] Avoid OpenMP/nvptx execution-time hangs for simple nested
 OpenMP 'target'/'parallel'/'task' constructs [PR99555]

... awaiting proper resolution, of course.

	libgomp/
	PR99555
	* testsuite/lib/on_device_arch.c: New file.
	* testsuite/libgomp.c/pr99555-1.c: Likewise.
	* testsuite/libgomp.c-c++-common/task-detach-6.c: Until resolved,
	skip for nvptx offloading, with error status.
	* testsuite/libgomp.fortran/task-detach-6.f90: Likewise.
---
 libgomp/testsuite/lib/on_device_arch.c| 30 +++
 .../libgomp.c-c++-common/task-detach-6.c  |  7 +
 libgomp/testsuite/libgomp.c/pr99555-1.c   | 19 
 .../libgomp.fortran/task-detach-6.f90 | 13 
 4 files changed, 69 insertions(+)
 create mode 100644 libgomp/testsuite/lib/on_device_arch.c
 create mode 100644 libgomp/testsuite/libgomp.c/pr99555-1.c

diff --git a/libgomp/testsuite/lib/on_device_arch.c b/libgomp/testsuite/lib/on_device_arch.c
new file mode 100644
index 000..1c0753c3181
--- /dev/null
+++ b/libgomp/testsuite/lib/on_device_arch.c
@@ -0,0 +1,30 @@
+#include 
+
+/* static */ int
+device_arch_nvptx (void)
+{
+  return GOMP_DEVICE_NVIDIA_PTX;
+}
+
+#pragma omp declare variant (device_arch_nvptx) match(construct={target},device={arch(nvptx)})
+/* static */ int
+device_arch (void)
+{
+  return GOMP_DEVICE_DEFAULT;
+}
+
+static int
+on_device_arch (int d)
+{
+  int d_cur;
+  #pragma omp target map(from:d_cur)
+  d_cur = device_arch ();
+
+  return d_cur == d;
+}
+
+int
+on_device_arch_nvptx ()
+{
+  return on_device_arch (GOMP_DEVICE_NVIDIA_PTX);
+}
diff --git a/libgomp/testsuite/libgomp.c-c++-common/task-detach-6.c b/libgomp/testsuite/libgomp.c-c++-common/task-detach-6.c
index e5c2291e6ff..4a3e4a2a3d2 100644
--- a/libgomp/testsuite/libgomp.c-c++-common/task-detach-6.c
+++ b/libgomp/testsuite/libgomp.c-c++-common/task-detach-6.c
@@ -1,5 +1,8 @@
 /* { dg-do run } */
 
+/* { dg-additional-sources "../lib/on_device_arch.c" } */
+extern int on_device_arch_nvptx ();
+
 #include 
 #include 
 
@@ -9,6 +12,10 @@
 
 int main (void)
 {
+  //TODO See '../libgomp.c/pr99555-1.c'.
+  if (on_device_arch_nvptx ())
+__builtin_abort (); //TODO Until resolved, skip, with error status.
+
   int x = 0, y = 0, z = 0;
   int thread_count;
   omp_event_handle_t detach_event1, detach_event2;
diff --git a/libgomp/testsuite/libgomp.c/pr99555-1.c b/libgomp/testsuite/libgomp.c/pr99555-1.c
new file mode 100644
index 000..9ba330959d8
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/pr99555-1.c
@@ -0,0 +1,19 @@
+// PR99555 "[OpenMP/nvptx] Execution-time hang for simple nested OpenMP 'target'/'parallel'/'task' constructs"
+
+// { dg-additional-options "-O0" }
+
+// { dg-additional-sources "../lib/on_device_arch.c" }
+extern int on_device_arch_nvptx ();
+
+int main (void)
+{
+  if (on_device_arch_nvptx ())
+__builtin_abort (); //TODO Until resolved, skip, with error status.
+
+#pragma omp target
+#pragma omp parallel // num_threads(1)
+#pragma omp task
+  ;
+
+  return 0;
+}
diff --git 

Re: [PATCH v1] libstdc++-v3: Update VTV vars for libtool link commands [PR99172]

2021-03-11 Thread Jakub Jelinek via Gcc-patches
On Thu, Mar 11, 2021 at 04:31:51PM +, Jonathan Wakely via Gcc-patches wrote:
> On 11/03/21 16:27 +, Jonathan Wakely wrote:
> > That seems cleaner to me, rather than adding another variable with
> > minor differences from the existing AM_CXXFLAGS.
> 
> My specific concern is that AM_CXXFLAGS and AM_CXXFLAGS_LT will get
> out of sync, i.e. we'll add something to the former and forget to add
> it to the latter.
> 
> If we keep using AM_CXXFLAGS but cancel out the -fvtable-verify=std
> option, then there aren't two separate variables that can diverge.
> 
> But I think it's too late in the gcc-11 process for that kind of
> refactoring.

I think $(filter-out -fvtable-verify=std,$(AM_CXXFLAGS)) should be fairly
simple thing if that is all that needs to be done.

Jakub



Re: [PATCH v1] libstdc++-v3: Update VTV vars for libtool link commands [PR99172]

2021-03-11 Thread Jonathan Wakely via Gcc-patches

On 11/03/21 16:27 +, Jonathan Wakely wrote:

That seems cleaner to me, rather than adding another variable with
minor differences from the existing AM_CXXFLAGS.


My specific concern is that AM_CXXFLAGS and AM_CXXFLAGS_LT will get
out of sync, i.e. we'll add something to the former and forget to add
it to the latter.

If we keep using AM_CXXFLAGS but cancel out the -fvtable-verify=std
option, then there aren't two separate variables that can diverge.

But I think it's too late in the gcc-11 process for that kind of
refactoring.



MAINTAINERS update for a few ex-ImgTec employees

2021-03-11 Thread Jeff Law via Gcc-patches

These updates came out of a private conversation with Robert.

Pushed to the trunk.

Jeff
commit 52db241345a81207681733c5316b6fbf44596ac0
Author: Jeff Law 
Date:   Thu Mar 11 09:28:50 2021 -0700

MAINTAINERS updates for ex-ImgTec employees

/
* MAINTAINERS: Update entries for a few ex-ImgTec employees

diff --git a/MAINTAINERS b/MAINTAINERS
index fd9eb81d395..f95640fb08b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -365,7 +365,7 @@ Lawrence Crowl  

 Ian Dall   
 David Daney
 Robin Dapp 
-Simon Dardis   
+Simon Dardis   
 Sudakshina Das 
 Bud Davis  
 Chris Demetriou
@@ -614,10 +614,10 @@ Basile Starynkevitch  

 Jakub Staszak  
 Graham Stott   
 Jeff Sturm 
-Robert Suchanek

+Robert Suchanek
 Andrew Sutton  
 Gabriele Svelto
-Toma Tabacu
+Toma Tabacu
 Omar Tahir 
 Sriraman Tallam
 Samuel Tardieu 


Re: [PATCH v1] libstdc++-v3: Update VTV vars for libtool link commands [PR99172]

2021-03-11 Thread Jonathan Wakely via Gcc-patches

On 11/03/21 07:31 -0800, Caroline Tice via Libstdc++ wrote:

Adding the libstdc++ mailing list to the patch.


Thanks.


-- Caroline
cmt...@google.com

On Wed, Mar 10, 2021 at 8:50 PM Caroline Tice  wrote:


This patch is to fix PR 99172.

Currently when GCC is configured with --enable-vtable-verify, the
libstdc++-v3 Makefiles add "-fvtable-verify=std
-Wl,-u_vtable_map_vars_start,-u_vtable_map_vars_end" to libtool link
commands. The "-fvtable-verify=std" piece causes alternate versions of
libtool (such as slibtool) to fail, unable to find "-lvtv" (GNU
libtool just removes that piece).

This patch updates the libstdc++-v3 Makefiles to not pass
"-fvtable-verify=std" to the libtool link commands, while continuing
to pass the rest of the VTV  flags (which are necessary for VTV to
work).

I tested this by configuring with --enable-vtable-verify, boostrapping
the compiler, and running all the regression testsuites (including
libvtv & libstdc++) without any regressions.  I only ran it on a linux
system, on an x86_64 machine.

I also gave a copy of the patch to the person who reported the bug,
and they verified that the patch fixes their issue.

Is this ok to commit?


Should the same change be made to CXXLINK in src/*/Makefile.am, for
consistency if nothing else?

The patch is OK for gcc-11 now, but for stage 1 I'm wondering about
simplifying it.

Why do we ever add -Wl,-u options to CXXFLAGS when those are linker
options?

Could we move the -Wl,-u options to VTV_CXXLINKFLAGS instead, so
they're only used when actually linking?

Then VTV_CXXFLAGS would just be -fvtable-verify=std (and so
VTV_PCH_CXXFLAGS would be redundant), and instead of having
AM_CXXFLAGS and AM_CXXFLAGS_LT we could just filter out the
-fvtable-verify=std option from the CXXLINK options:

CXXLINK = \
$(LIBTOOL) --tag CXX \
$(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) \
--mode=link $(CXX) \
$(VTV_CXXLINKFLAGS) \
$(OPT_LDFLAGS) $(SECTION_LDFLAGS) \
$(filter-out,-fvtable-verify=std,$(AM_CXXFLAGS)) \
$(LTLDFLAGS) -o $@

Would that make sense?

Or just add  -fvtable-verify=none to the CXXLINK command to cancel out
the  -fvtable-verify=std option from AM_CXXFLAGS:

CXXLINK = \
$(LIBTOOL) --tag CXX \
$(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) \
--mode=link $(CXX) \
$(VTV_CXXLINKFLAGS) \
$(OPT_LDFLAGS) $(SECTION_LDFLAGS) \
$(AM_CXXFLAGS) -fvtable-verify=none \
$(LTLDFLAGS) -o $@

That seems cleaner to me, rather than adding another variable with
minor differences from the existing AM_CXXFLAGS.



Re: enable sqrt insns for cdce3.c

2021-03-11 Thread Alexandre Oliva
On Mar 10, 2021, Hans-Peter Nilsson  wrote:

> On Wed, 10 Mar 2021, Alexandre Oliva wrote:
>> 
>> The test expects shrink-wrapping of the fsqrt call, but that will only
>> occur when there is a usable sqrt insn.
>> 
>> Arrange for dejagnu to add the options that enable the sqrt insn, if
>> one is available, and to skip the test otherwise.
>> 
>> 
>> H-P, this *should* obviate the mmix-specific dg-skip-if.

> Unfortunately it doesn't.

Uhh :-(

>> Would it be
>> easy for you to confirm that this is the case and, if so, drop it?

> About as easy as for anyone (this is a compile-test),

I figured you'd have a recent toolchain around ;-)  Thanks!

> FAIL: gcc.dg/cdce3.c scan-tree-dump cdce "cdce3.c:12: [^\n\r]*
> function call is shrink-wrapped into error conditions."

How surprising!  My understanding was that dg-add-options 
implicitly implies dg-require-effective-target .  That was how
I'd read et-dg-runtest.

Now I see the _runtime after the check_effective_target_${target} there,
and my understanding is updated, so some pending changes may need
revisiting.  Oh my...


> The dump files and assembly file show no obvious clues to me as
> to what is supposed to happen; attached.

The test just should keep on not running on mmix.  Without an insn for
sqrt, there's no point in shrink-wrapping, and the code that performs
the optimization is smart enough to realize that, so it just leaves the
code alone.


-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar


Re: [PATCH v1] libstdc++-v3: Update VTV vars for libtool link commands [PR99172]

2021-03-11 Thread Caroline Tice via Gcc-patches
Adding the libstdc++ mailing list to the patch.

-- Caroline
cmt...@google.com

On Wed, Mar 10, 2021 at 8:50 PM Caroline Tice  wrote:
>
> This patch is to fix PR 99172.
>
> Currently when GCC is configured with --enable-vtable-verify, the
> libstdc++-v3 Makefiles add "-fvtable-verify=std
> -Wl,-u_vtable_map_vars_start,-u_vtable_map_vars_end" to libtool link
> commands. The "-fvtable-verify=std" piece causes alternate versions of
> libtool (such as slibtool) to fail, unable to find "-lvtv" (GNU
> libtool just removes that piece).
>
> This patch updates the libstdc++-v3 Makefiles to not pass
> "-fvtable-verify=std" to the libtool link commands, while continuing
> to pass the rest of the VTV  flags (which are necessary for VTV to
> work).
>
> I tested this by configuring with --enable-vtable-verify, boostrapping
> the compiler, and running all the regression testsuites (including
> libvtv & libstdc++) without any regressions.  I only ran it on a linux
> system, on an x86_64 machine.
>
> I also gave a copy of the patch to the person who reported the bug,
> and they verified that the patch fixes their issue.
>
> Is this ok to commit?
>
> -- Caroline Tice
> cmt...@google.com
>
> libstdc++-v3/ChangeLog
>
> 2021-03-10  Caroline Tice  
>
> PR libstdc++/99172
> * Makefile.in: Regenerate.
> * acinclude.m4: Add definitions for VTV_CXXFLAGS_LT.
> * configure: Regenerate.
> * doc/Makefile.in: Regenerate.
> * include/Maefile.in: Regenerate.
> * libsupc++/Makefile.in: Regenerate.
> * po/Makefile.in: Regenerate.
> * python/Makefile.in: Regenerate.
> * src/Makefile.am (AM_CXXFLAGS_LT): New definition.
> (CXXLINK): Update to use AM_CXXFLAGS_LT instead of AM_CXXFLAGS.
> * src/Makefile.in: Regenerate.
> * src/c++11/Makefile.in: Regenerate.
> * src/c++17/Makefile.in: Regenerate.
> * src/c++20/Makefile.in: Regenerate.
> * src/c++98/Makefile.in: Regenerate.
> * src/filesystem/Makefile.in: Regenerate.
> * testsuite/Makefile.in: Regenerate.


add sqrt options and test for sqrt support in ppc tests

2021-03-11 Thread Alexandre Oliva


Some powerpc tests that require the fsqrt insn to be enabled
explicitly use the -mpowerpc-gpopt option.  This fails if the fsqrt
opcode is not available on the target machine.

Switch to dg-add-options sqrt_insn for compile tests, that adds the
option for the feature (pending approval of another patch for the same
PR), and to dg-require-effective-target sqrt_insn for execution tests.


This patch depends on the recent patch for PR99352, and on the proposed
patch for PR99371.

This was regstrapped on x86_64-linux-gnu, and tested with a cross to a
ppc64-vxworks7r2 that has no -mpowerpc-gpopt disabled by default, and
that raises Illegal Instruction exceptions upon encountering a fsqrt.
I'm now regstrapping it on ppc64-linux-gnu, just to be sure.
Ok to install?


for  gcc/testsuite/ChangeLog

PR testsuite/99371
* gcc.target/powerpc/pr46728-10.c: Drop explicit
-mpowerpc-gpopt in favor of dg-require-effective-target sqrt_insn.
* gcc.target/powerpc/pr46728-11.c: Likewise.
* gcc.target/powerpc/pr46728-13.c: Likewise.
* gcc.target/powerpc/pr46728-14.c: Likewise.
* gcc.target/powerpc/pr46728-15.c: Likewise.
* gcc.target/powerpc/recip-7.c: Likewise.
* gcc.target/powerpc/pr46728-1.c: Drop explicit
-mpowerpc-gpopt in favor of dg-add-options sqrt_insn.
* gcc.target/powerpc/pr46728-2.c: Likewise.
* gcc.target/powerpc/pr46728-3.c: Likewise.
* gcc.target/powerpc/pr46728-4.c: Likewise.
* gcc.target/powerpc/pr46728-5.c: Likewise.
* gcc.target/powerpc/pr46728-7.c: Likewise.
* gcc.target/powerpc/pr46728-8.c: Likewise.

TN: U302-010
---
 gcc/testsuite/gcc.target/powerpc/pr46728-1.c  |3 ++-
 gcc/testsuite/gcc.target/powerpc/pr46728-10.c |3 ++-
 gcc/testsuite/gcc.target/powerpc/pr46728-11.c |3 ++-
 gcc/testsuite/gcc.target/powerpc/pr46728-13.c |3 ++-
 gcc/testsuite/gcc.target/powerpc/pr46728-14.c |3 ++-
 gcc/testsuite/gcc.target/powerpc/pr46728-15.c |3 ++-
 gcc/testsuite/gcc.target/powerpc/pr46728-2.c  |3 ++-
 gcc/testsuite/gcc.target/powerpc/pr46728-3.c  |3 ++-
 gcc/testsuite/gcc.target/powerpc/pr46728-4.c  |3 ++-
 gcc/testsuite/gcc.target/powerpc/pr46728-5.c  |3 ++-
 gcc/testsuite/gcc.target/powerpc/pr46728-7.c  |3 ++-
 gcc/testsuite/gcc.target/powerpc/pr46728-8.c  |3 ++-
 gcc/testsuite/gcc.target/powerpc/recip-7.c|3 ++-
 13 files changed, 26 insertions(+), 13 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/pr46728-1.c 
b/gcc/testsuite/gcc.target/powerpc/pr46728-1.c
index fc2cd7d7c9c56..b561d8b6e42f9 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr46728-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr46728-1.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -ffast-math -fno-inline -fno-unroll-loops -lm 
-mpowerpc-gpopt -fno-ident" } */
+/* { dg-options "-O2 -ffast-math -fno-inline -fno-unroll-loops -lm -fno-ident" 
} */
+/* { dg-add-options sqrt_insn } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr46728-10.c 
b/gcc/testsuite/gcc.target/powerpc/pr46728-10.c
index 3be4728d333a4..cdde53b8de037 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr46728-10.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr46728-10.c
@@ -1,6 +1,7 @@
 /* { dg-do run } */
 /* { dg-skip-if "-mpowerpc-gpopt not supported" { powerpc*-*-darwin* } } */
-/* { dg-options "-O2 -ffast-math -fno-inline -fno-unroll-loops -lm 
-mpowerpc-gpopt" } */
+/* { dg-options "-O2 -ffast-math -fno-inline -fno-unroll-loops -lm" } */
+/* { dg-require-effective-target sqrt_insn } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr46728-11.c 
b/gcc/testsuite/gcc.target/powerpc/pr46728-11.c
index 43b6728a4b812..62b49b1479345 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr46728-11.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr46728-11.c
@@ -1,6 +1,7 @@
 /* { dg-do run } */
 /* { dg-skip-if "-mpowerpc-gpopt not supported" { powerpc*-*-darwin* } } */
-/* { dg-options "-O2 -ffast-math -fno-inline -fno-unroll-loops -lm 
-mpowerpc-gpopt" } */
+/* { dg-options "-O2 -ffast-math -fno-inline -fno-unroll-loops -lm" } */
+/* { dg-require-effective-target sqrt_insn } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr46728-13.c 
b/gcc/testsuite/gcc.target/powerpc/pr46728-13.c
index b9fd63973b728..3e55fa86e0b7c 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr46728-13.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr46728-13.c
@@ -1,6 +1,7 @@
 /* { dg-do run } */
 /* { dg-skip-if "-mpowerpc-gpopt not supported" { powerpc*-*-darwin* } } */
-/* { dg-options "-O2 -ffast-math -fno-inline -fno-unroll-loops -lm 
-mpowerpc-gpopt" } */
+/* { dg-options "-O2 -ffast-math -fno-inline -fno-unroll-loops -lm" } */
+/* { dg-require-effective-target sqrt_insn } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr46728-14.c 
b/gcc/testsuite/gcc.target/powerpc/pr46728-14.c
index 5a13bdb6c..472b9d5b60120 100644
--- 

[PATCH][pushed] Add -fprofile-reproducible=parallel-runs to STAGEfeedback_CFLAGS to Makefile.tpl.

2021-03-11 Thread Martin Liška

Pushed as obvious, the original change was done
in g:e05a117dc4b98f3ac60851608f532ba7cee7343a.

Martin

ChangeLog:

* Makefile.tpl: The change was done Makefile.in which
is generated file.
---
 Makefile.tpl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile.tpl b/Makefile.tpl
index 3b88f351d5b..6e0337fb48f 100644
--- a/Makefile.tpl
+++ b/Makefile.tpl
@@ -488,7 +488,7 @@ STAGEprofile_TFLAGS = $(STAGE2_TFLAGS)
 STAGEtrain_CFLAGS = $(filter-out -fchecking=1,$(STAGE3_CFLAGS))
 STAGEtrain_TFLAGS = $(filter-out -fchecking=1,$(STAGE3_TFLAGS))
 
-STAGEfeedback_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use

+STAGEfeedback_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use 
-fprofile-reproducible=parallel-runs
 STAGEfeedback_TFLAGS = $(STAGE4_TFLAGS)
 
 STAGEautoprofile_CFLAGS = $(STAGE2_CFLAGS) -g

--
2.30.1



Re: add -mfloat128 to __float128-using test missing it

2021-03-11 Thread Iain Sandoe via Gcc-patches

Alexandre Oliva  wrote:


On Mar 11, 2021, Iain Sandoe  wrote:


Alexandre Oliva  wrote:

On Mar 10, 2021, Alexandre Oliva  wrote:


* gcc.target/powerpc/prefix-ds-dq.c: Enable __float128.


I've been reminded that this is not enough for the scan-assembler tests
to pass, at least in our configurations.  Nearly all of the asm
expectations are unmet.  I'm yet to identify the root cause.



I have the following patch that I was intending to apply/post later (since
the test makes unconditional use of __float128 which is not available on
all platforms).



does this solve your issue?


I suppose it would silence the fail by skipping the test.  I also
*think* it would defeat the purpose of the test, since the requirement
test would take effect before -mcpu=power10.

My understanding is that this test, being of the "check output asm"
kind, rather than "check that it compiles" or "check that it runs
without crashing", we'd get better (or at least no-worse) coverage out
of any single test run by attempting to perform the test regardless of
the powerpc target in use.


I’m all in favour of improved test coverage.


E.g., I found that, besides -mfloat128, I'd get the expect asm by just
cancelling out the -mstrict-align flag that the toolchain I'm testing
enables by default.  So here's my updated patch, that I'm nearly done
retesting.  Ok to install?


add -mfloat128 to __float128-using test missing it

Most (all?) powerpc tests that use the __float128 type either enable
it with -mfloat128, or use effective target requirements to check for
its presence.

prefix-ds-dq.c is failing in some of our configurations because it
uses the __float128 type without checking for it, or enabling it
explicitly.

Since it's a compile test, I'm enabling it explicitly.


The flag causes a warning to be printed when float128 may not be
entirely supported; I've arranged for the warning to be ignored in
this test.


In order for the expected asm to be generated, I found the need for
-mno-strict-align, on toolchains that enable -mstrict-align by
default.


for  gcc/testsuite/ChangeLog

* gcc.target/powerpc/prefix-ds-dq.c: Enable __float128,
disable -mstrict-align.
---
gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c |3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c  
b/gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c

index 554cd0c1beac0..ddf563521889c 100644
--- a/gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c
+++ b/gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c
@@ -1,7 +1,8 @@
/* { dg-do compile } */
/* { dg-require-effective-target powerpc_prefixed_addr } */
/* { dg-require-effective-target lp64 } */
-/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mfloat128 -mno-strict-align"  
} */


"-mno-strict-align” is not accepted everywhere, I think.
Iain


+/* { dg-prune-output ".-mfloat128. option may not be fully supported" } */

/* Tests whether we generate a prefixed load/store operation for  
addresses that

   don't meet DS/DQ offset constraints.  64-bit is needed for testing the use



--
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
  Free Software Activist GNU Toolchain Engineer
   Vim, Vi, Voltei pro Emacs -- GNUlius Caesar





Re: xfail fetestexcept test - ppc always uses fcmpu

2021-03-11 Thread Alexandre Oliva
On Mar 10, 2021, Joseph Myers  wrote:

> In my view, such an XFAIL (for a GCC bug as opposed to an environmental 
> issue) should have a comment pointing to a corresponding open bug in GCC 
> Bugzilla.  In this case, that's bug 58684.

Thanks, yeah, that's valuable information to add.

So far, I added a link to the PR in the ChangeLog entry, and noticed and
fixed a typo in the filename there.

I kind of like the notion of adding a comment to the test itself, but I
wasn't sure that's what you meant.

I kind of like it because I'd have appreciated if the other tests, that
I had identified, had comments linking back to the PR.  OTOH, I realize
if I had searched the ChangeLog or their commit history, I'd have found
my way to the PR.  Instead, I just realized that that was the way ppc
was expected to behave, and quickly double-checked the code to make sure
there was no implemented alternative in hiding, entirely missing the PR.


xfail fetestexcept test - ppc always uses fcmpu

From: Alexandre Oliva 

gcc.dg/torture/pr91323.c tests that a compare with NaNf doesn't set an
exception using builtin compare intrinsics, and that it does when
using regular compare operators.

That doesn't seem to be expected to work on powerpc targets.  It fails
on GNU/Linux, it's marked to be skipped on AIX, and a similar test,
gcc.dg/torture/pr93133.c, has the execution test xfailed for all of
powerpc*-*-*.

In this test, the functions that use intrinsics for the compare end up
with the same code as the one that uses compare operators, using
fcmpu, a floating compare that, unlike fcmpo, does not set the invalid
operand exception for quiet NaN.  I couldn't find any evidence that
the rs6000 backend ever outputs fcmpo.  Therefore, I'm adding the same
execution xfail marker to this test.


for  gcc/testsuite/ChangeLog

PR target/58684
* gcc.dg/torture/pr91323.c: Expect execution fail on
powerpc*-*-*.
---
 gcc/testsuite/gcc.dg/torture/pr91323.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/torture/pr91323.c 
b/gcc/testsuite/gcc.dg/torture/pr91323.c
index 1411fcaa3966c..f97dcc12cac9d 100644
--- a/gcc/testsuite/gcc.dg/torture/pr91323.c
+++ b/gcc/testsuite/gcc.dg/torture/pr91323.c
@@ -1,4 +1,4 @@
-/* { dg-do run } */
+/* { dg-do run { xfail powerpc*-*-* } } */
 /* { dg-add-options ieee } */
 /* { dg-require-effective-target fenv_exceptions } */
 /* { dg-skip-if "fenv" { powerpc-ibm-aix* } } */




-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar


Re: add -mfloat128 to __float128-using test missing it

2021-03-11 Thread Alexandre Oliva
On Mar 11, 2021, Iain Sandoe  wrote:

> Alexandre Oliva  wrote:
>> On Mar 10, 2021, Alexandre Oliva  wrote:
>> 
>>> * gcc.target/powerpc/prefix-ds-dq.c: Enable __float128.
>> 
>> I've been reminded that this is not enough for the scan-assembler tests
>> to pass, at least in our configurations.  Nearly all of the asm
>> expectations are unmet.  I'm yet to identify the root cause.

> I have the following patch that I was intending to apply/post later (since
> the test makes unconditional use of __float128 which is not available on
> all platforms).

> does this solve your issue?

I suppose it would silence the fail by skipping the test.  I also
*think* it would defeat the purpose of the test, since the requirement
test would take effect before -mcpu=power10.

My understanding is that this test, being of the "check output asm"
kind, rather than "check that it compiles" or "check that it runs
without crashing", we'd get better (or at least no-worse) coverage out
of any single test run by attempting to perform the test regardless of
the powerpc target in use.

E.g., I found that, besides -mfloat128, I'd get the expect asm by just
cancelling out the -mstrict-align flag that the toolchain I'm testing
enables by default.  So here's my updated patch, that I'm nearly done
retesting.  Ok to install?


add -mfloat128 to __float128-using test missing it

Most (all?) powerpc tests that use the __float128 type either enable
it with -mfloat128, or use effective target requirements to check for
its presence.

prefix-ds-dq.c is failing in some of our configurations because it
uses the __float128 type without checking for it, or enabling it
explicitly.

Since it's a compile test, I'm enabling it explicitly.


The flag causes a warning to be printed when float128 may not be
entirely supported; I've arranged for the warning to be ignored in
this test.


In order for the expected asm to be generated, I found the need for
-mno-strict-align, on toolchains that enable -mstrict-align by
default.


for  gcc/testsuite/ChangeLog

* gcc.target/powerpc/prefix-ds-dq.c: Enable __float128,
disable -mstrict-align.
---
 gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c 
b/gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c
index 554cd0c1beac0..ddf563521889c 100644
--- a/gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c
+++ b/gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c
@@ -1,7 +1,8 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target powerpc_prefixed_addr } */
 /* { dg-require-effective-target lp64 } */
-/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -mfloat128 -mno-strict-align" } */
+/* { dg-prune-output ".-mfloat128. option may not be fully supported" } */
 
 /* Tests whether we generate a prefixed load/store operation for addresses that
don't meet DS/DQ offset constraints.  64-bit is needed for testing the use



-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar


Re: [RFC] decay vect tests from run to link for pr95401

2021-03-11 Thread Alexandre Oliva
On Mar 11, 2021, Richard Biener  wrote:

> I think that's OK.

Cool, here's the patch I'm nearly done regstrapping on x86_64-linux-gnu
and x-ppc64-vx7r2.  Ok to install?

> It's probably difficult to make the test UNSUPPORTED
> when dg-additional-sources is discovered with a dg-do compile test?

Well, first of all, I really don't like the idea of skipping a test if
we can still get some useful information out of it.

That said, I suppose we could test for additional_sources_used in
${langdriver}_target_compile proces in gcc.exp, g++.exp et al, between
their calling dg-additional-files-options and target_compile, to
conditionally skip the latter, or pass $type to the former in all
$langdriver.exp, so that the extra files can be flagged and/or discarded
in unsupported modes.  I believe such changes would also require
adjustments to library test infrastructures.


decay vect tests from run to link for pr95401

When vect.exp finds our configuration disables altivec by default, it
disables the execution of vectorization tests, assuming the test
hardware doesn't support it.

Tests become just compile tests, but compile tests won't work
correctly when additional sources are named, e.g. pr95401.cc, because
GCC refuses to compile multiple files into the same asm output.

With this patch, the default for when execution is not possible
becomes link.


for  gcc/testsuite/ChangeLog

* lib/target-supports.exp (check_vect_support_and_set_flags):
Decay to link rather than compile.
---
 gcc/testsuite/lib/target-supports.exp |   14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index 52d3d036d3c5c..f5b9b0578de37 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -9632,7 +9632,7 @@ proc check_vect_support_and_set_flags { } {
 if [check_750cl_hw_available] {
 set dg-do-what-default run
 } else {
-set dg-do-what-default compile
+set dg-do-what-default link
 }
 } elseif [istarget powerpc*-*-*] {
 # Skip targets not supporting -maltivec.
@@ -9656,14 +9656,14 @@ proc check_vect_support_and_set_flags { } {
 # Specify a cpu that supports VMX for compile-only tests.
 lappend DEFAULT_VECTCFLAGS "-mcpu=970"
 }
-set dg-do-what-default compile
+set dg-do-what-default link
 }
 } elseif { [istarget i?86-*-*] || [istarget x86_64-*-*] } {
 lappend DEFAULT_VECTCFLAGS "-msse2"
 if { [check_effective_target_sse2_runtime] } {
 set dg-do-what-default run
 } else {
-set dg-do-what-default compile
+set dg-do-what-default link
 }
 } elseif { [istarget mips*-*-*]
   && [check_effective_target_nomips16] } {
@@ -9682,7 +9682,7 @@ proc check_vect_support_and_set_flags { } {
 if [check_effective_target_ultrasparc_hw] {
 set dg-do-what-default run
 } else {
-set dg-do-what-default compile
+set dg-do-what-default link
 }
 } elseif [istarget alpha*-*-*] {
 # Alpha's vectorization capabilities are extremely limited.
@@ -9695,7 +9695,7 @@ proc check_vect_support_and_set_flags { } {
 if [check_alpha_max_hw_available] {
 set dg-do-what-default run
 } else {
-set dg-do-what-default compile
+set dg-do-what-default link
 }
 } elseif [istarget ia64-*-*] {
 set dg-do-what-default run
@@ -9708,7 +9708,7 @@ proc check_vect_support_and_set_flags { } {
 if [is-effective-target arm_neon_hw] {
 set dg-do-what-default run
 } else {
-set dg-do-what-default compile
+set dg-do-what-default link
 }
 } elseif [istarget "aarch64*-*-*"] {
 set dg-do-what-default run
@@ -9729,7 +9729,7 @@ proc check_vect_support_and_set_flags { } {
 set dg-do-what-default run
 } else {
lappend DEFAULT_VECTCFLAGS "-march=z14" "-mzarch"
-set dg-do-what-default compile
+set dg-do-what-default link
 }
 } elseif [istarget amdgcn-*-*] {
 set dg-do-what-default run


-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar


Re: ipa-sra is mostly disabled by -mlongcall

2021-03-11 Thread Alexandre Oliva
On Mar 11, 2021, Richard Biener  wrote:

> I wonder what the effect of this switch is - it's documented as
> affecting the call site (different call insn sequence by default), so
> the best representation would be an attribute on actual call stmts

I don't think that would fulfill its purpose.

My understanding is that the longcall attribute is intended to issue
call sequences that assume the target of the call will be out of range
for available PC-relative addressing modes in call instructions.

It is also often the case that the callee address ends up in a
contiguous address, rather than as multiple fragments of insns, which
simplifies relocation processing.

One use I'm familiar with is when different modules are known to be
going to be linked separately, and dynamically loaded at disparate
memory locations, so that PC-relative branches across modules won't do,
and PLTs are not available.  When building one module, you declare as
requiring long call sequences any functions expected to be defined in
other modules.  Or you just use long calls throughout, taking a slight
performance penalty out of intra-module call sequences, that could have
used PC-relative calls.

> I agree that a type attribute (it's a type attribute likely to also
> affect indirect calls) is a bit awkward

Indirect calls are pretty much long calls by definition, but I suppose
the goal of retaining the long-call requests has to do with the
possibility of sometimes resolving an indirect call back to a direct
call.  If it wasn't associated with the type, the information would be
lost at the point of the function decay to pointer, but I gather if we
succeed in resolving a pointer back to a unique declaration, we would be
able to take a longcall annotation from that declaration.  I'm probably
missing something.

> but it does look better than just a global flag.

That possibility is not lost in the ARM implementation.  What the
ARM command-line flag does is change the default call sequence for that
translation unit.  It can still be overridden either way.

What the PPC command-line flag does is to add the attribute to every
single function decl's type.

The representations are isomorphic, equivalent in terms of
representation power, but the PPC attributes-all-over choice is wasteful
memory-wise, and it sacrifices optimizations that throw their hands up
when they find function type attributes.  OTOH, it brings some
simplification to the function type system.  ARM's representation has 4
states (two attributes that can each be present or absent): one
nonsensical (both short_ and long_call present), and one (both absent)
mapping to either of the two others, depending on the default.  On PPC,
there are only two states: longcall present or absent, and AFAIK it
offers no possibility of overriding a longcall default on a per-function
basis, only globally for a translation unit, which sort of renders all
the attributes it adds redundant.

-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar


RE: [PATCH] Ada: hashed container Cursor type predefined equality non-conformance

2021-03-11 Thread Richard Wai
Here is the amended commit log:

--

Ada: Ensure correct predefined equality behavior for Cursor objects of
hashed containers.

2021-03-11  Richard Wai  

gcc/ada/
* libgnat/a-cohase.ads (Cursor): Synchronize comments for the Cursor
type definition to be consistent with identical definitions in other
container packages. Add additional comments regarding the importance
of
maintaining the "Position" component for predefined equality.
* libgnat/a-cohama.ads (Cursor): Likewise.
* libgnat/a-cihama.ads (Cursor): Likewise.
* libgnat/a-cohase.adb (Find, Insert): Ensure that Cursor objects
always have their "Position" component set to ensure predefined
equality works as required.
* libgnat/a-cohama.adb (Find, Insert): Likewise.
* libgnat/a-cihama.adb (Find, Insert): Likewise.

gcc/testsuite/
* gnat.dg/containers2.adb: New test.

--

Thanks!

Richard

> -Original Message-
> From: Arnaud Charlet 
> Sent: March 10, 2021 11:27 AM
> To: Richard Wai 
> Cc: gcc-patches@gcc.gnu.org; 'Bob Duff' ; Arnaud
> Charlet 
> Subject: Re: [PATCH] Ada: hashed container Cursor type predefined equality
> non-conformance
> 
> > I'm not sure I correctly understand you here, but my interpretation is
> > that I should no longer submit Changelog entries, rather just the
> > patch, and then
> 
> Right.
> 
> > a commit message (a-la git), and then presumably the Changelong
> > entries will be generated automatically. From what I can see, gcc'
> > website does not talk about that, so I'm guessing this format based on
> > what I see from git-log, generally.
> >
> > So assuming that, attached is the "correct" patch, and here is the
> > commit
> > message:
> >
> > ---
> >
> > Author: Richard Wai 
> >
> > Ada: Ensure correct Cursor predefined equality behavior for hashed
> > containers.
> >
> > --
> >
> > And for the record, the change log entries I've come up with as per
> > the previous email:
> 
> And the commit log will look like:
> 
> 2021-03-09  Richard Wai  
> 
> gcc/ada/
>   * libgnat/...
> 
> gcc/testsuite/
>   * gnat.dg/containers2.adb: ...
> 
> Your patch is OK with these changes, thanks.



[Patch] Fortran/OpenMP: Fix use_device_{ptr,addr} with assumed-size array [PR98858]

2021-03-11 Thread Tobias Burnus

The ICE occurs because the 't' == NULL for assumed-size arrays.
For them, the size is not known – but as this should only occur
for use_device_{addr,ptr}, the size is not needed.

OK for mainline?

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf
Fortran/OpenMP: Fix use_device_{ptr,addr} with assumed-size array [PR98858]

gcc/ChangeLog:

	PR fortran/98858
	* gimplify.c (omp_add_variable): Handle NULL_TREE as size
	occuring for assumed-size arrays in use_device_{ptr,addr}.

libgomp/ChangeLog:

	PR fortran/98858
	* testsuite/libgomp.fortran/use_device_ptr-3.f90: New test.

 gcc/gimplify.c |  2 +-
 .../testsuite/libgomp.fortran/use_device_ptr-3.f90 | 91 ++
 2 files changed, 92 insertions(+), 1 deletion(-)

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index caf25ccdd5c..6da66985ad6 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -7078,7 +7078,7 @@ omp_add_variable (struct gimplify_omp_ctx *ctx, tree decl, unsigned int flags)
   if ((flags & GOVD_SHARED) == 0)
 	{
 	  t = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (decl)));
-	  if (DECL_P (t))
+	  if (t && DECL_P (t))
 	omp_notice_variable (ctx, t, true);
 	}
 }
diff --git a/libgomp/testsuite/libgomp.fortran/use_device_ptr-3.f90 b/libgomp/testsuite/libgomp.fortran/use_device_ptr-3.f90
new file mode 100644
index 000..f2b33cd5e89
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/use_device_ptr-3.f90
@@ -0,0 +1,91 @@
+! PR fortran/98858
+!
+! Assumed-size array with use_device_ptr()
+!
+program test_use_device_ptr
+  use iso_c_binding, only: c_ptr, c_loc, c_f_pointer
+  implicit none
+  double precision :: alpha
+  integer, parameter :: lda = 10
+  integer, allocatable :: mat(:, :)
+   integer :: i, j
+
+  allocate(mat(lda, lda))
+  do i = 1, lda
+do j = 1, lda
+  mat(j,i) = i*100 + j
+end do
+  end do
+
+  !$omp target enter data map(to:mat)
+  call dgemm(lda, mat)
+  !$omp target exit data map(from:mat)
+
+  do i = 1, lda
+do j = 1, lda
+  if (mat(j,i) /= -(i*100 + j)) stop 1
+end do
+  end do
+
+  !$omp target enter data map(to:mat)
+  call dgemm2(lda, mat)
+  !$omp target exit data map(from:mat)
+
+  do i = 1, lda
+do j = 1, lda
+  if (mat(j,i) /= (i*100 + j)) stop 1
+end do
+  end do
+
+  contains
+
+subroutine dgemm(lda, a)
+  implicit none
+  integer :: lda
+  integer, target:: a(lda,*) ! need target attribute to use c_loc
+  !$omp target data use_device_ptr(a)
+call negate_it(c_loc(a), lda)
+  !$omp end target data
+end subroutine
+
+subroutine dgemm2(lda, a)
+  implicit none
+  integer :: lda
+  integer, target:: a(lda,*) ! need target attribute to use c_loc
+  !$omp target data use_device_addr(a)
+call negate_it(c_loc(a), lda)
+  !$omp end target data
+end subroutine
+
+subroutine negate_it(a, n)
+  type(c_ptr), value :: a
+  integer, value :: n
+  integer, pointer :: array(:,:)
+
+  ! detour due to OpenMP 5.0 oddness
+  call c_f_pointer(a, array, [n,n])
+  call do_offload(array, n)
+end
+
+subroutine do_offload(aptr, n)
+  integer, target :: aptr(:,:)
+  integer, value :: n
+  !$omp target is_device_ptr(aptr)
+  call negate_it_tgt(aptr, n)
+  !$omp end target
+end subroutine do_offload
+
+subroutine negate_it_tgt(array, n)
+  !$omp declare target
+   integer, value :: n
+   integer :: array(n,n)
+   integer :: i, j
+   !$omp parallel do collapse(2)
+   do i = 1, n
+ do j = 1, n
+   array(j,i) = - array(j,i)
+ end do
+   end do
+   !$omp end parallel do
+  end subroutine
+end program


Re: [PATCH] c++: Fix up calls to immediate functions returning reference [PR99507]

2021-03-11 Thread Nathan Sidwell

On 3/11/21 3:52 AM, Jakub Jelinek wrote:

Hi!

build_cxx_call calls convert_from_reference at the end, so if an immediate
function returns a reference, we were constant evaluating not just that
call, but that call wrapped in an INDIRECT_REF.  That unfortunately means
it can constant evaluate to something non-addressable, so if code later
needs to take its address it will fail.

The following patch fixes that by undoing the convert_from_reference
wrapping for the cxx_constant_value evaluation and readdding it ad the end.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


ok.  Maybe a comment about '// unwrap the convert from reference'?



2021-03-11  Jakub Jelinek  

PR c++/99507
* call.c (build_over_call): For immediate evaluation of functions
that return references, undo convert_from_reference effects before
calling cxx_constant_value and call convert_from_reference
afterwards.

* g++.dg/cpp2a/consteval19.C: New test.

--- gcc/cp/call.c.jj2021-03-06 10:17:01.687304578 +0100
+++ gcc/cp/call.c   2021-03-10 13:51:29.121445195 +0100
@@ -9504,6 +9504,8 @@ build_over_call (struct z_candidate *can
if (immediate_invocation_p (fndecl, nargs))
{
  tree obj_arg = NULL_TREE;
+ if (REFERENCE_REF_P (call))
+   call = TREE_OPERAND (call, 0);
  if (DECL_CONSTRUCTOR_P (fndecl))
obj_arg = cand->first_arg ? cand->first_arg : (*args)[0];
  if (obj_arg && is_dummy_object (obj_arg))
@@ -9527,6 +9529,7 @@ build_over_call (struct z_candidate *can
  call = cxx_constant_value (call, obj_arg);
  if (obj_arg && !error_operand_p (call))
call = build2 (INIT_EXPR, void_type_node, obj_arg, call);
+ call = convert_from_reference (call);
}
  }
return call;
--- gcc/testsuite/g++.dg/cpp2a/consteval19.C.jj 2021-03-10 14:20:58.018835190 
+0100
+++ gcc/testsuite/g++.dg/cpp2a/consteval19.C2021-03-10 14:20:16.642294167 
+0100
@@ -0,0 +1,6 @@
+// PR c++/99507
+// { dg-do compile { target c++20 } }
+
+constexpr int i{0};
+consteval const int  () { return i; }
+const int *a{ ()};

Jakub




--
Nathan Sidwell


Re: [PATCH] c++: Only reject reinterpret casts from pointers to integers for manifestly_const_eval evaluation [PR99456]

2021-03-11 Thread Nathan Sidwell

On 3/9/21 10:31 AM, Jakub Jelinek wrote:

Hi!

My PR82304/PR95307 fix moved reinterpret cast from pointer to integer
diagnostics from cxx_eval_outermost_constant_expr where it caught
invalid code only at the outermost level down into
cxx_eval_constant_expression.
Unfortunately, it regressed following testcase, we emit worse code
including dynamic initialization of some vars.
While the initializers are not constant expressions due to the
reinterpret_cast in there, there is no reason not to fold them as an
optimization.

I've tried to make this dependent on !ctx->quiet, but that regressed
two further tests, so this patch bases that on manifestly_const_eval.

The new testcase is now optimized as much as it used to be in GCC 10
and the only regression it causes is an extra -Wnarrowing warning
on vla22.C test on invalid code (which the patch adjusts).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


good for me, but I think Jason should also comment.  Weaving both 
consteval semantics and the need for sym+addend folding is tricky.


for avoidance of doubt, as mentioned in the PR, the inline variable case 
is an ABI issue, not 'just nice'.  I guess the ABI document should 
specify. It doesn't appear to. 
https://github.com/itanium-cxx-abi/cxx-abi/issues/121 created.




2021-03-09  Jakub Jelinek  

PR c++/99456
* constexpr.c (cxx_eval_constant_expression): For CONVERT_EXPR from
INDIRECT_TYPE_P to ARITHMETIC_TYPE_P, when !ctx->manifestly_const_eval
don't diagnose it, set *non_constant_p nor return t.

* g++.dg/opt/pr99456.C: New test.
* g++.dg/ext/vla22.C: Expect a -Wnarrowing warning for c++11 and
later.

--- gcc/cp/constexpr.c.jj   2021-03-08 23:40:28.334509562 +0100
+++ gcc/cp/constexpr.c  2021-03-09 11:50:08.721716460 +0100
@@ -6656,7 +6656,8 @@ cxx_eval_constant_expression (const cons
  
  	if (TREE_CODE (t) == CONVERT_EXPR

&& ARITHMETIC_TYPE_P (type)
-   && INDIRECT_TYPE_P (TREE_TYPE (op)))
+   && INDIRECT_TYPE_P (TREE_TYPE (op))
+   && ctx->manifestly_const_eval)
  {
if (!ctx->quiet)
  error_at (loc,
--- gcc/testsuite/g++.dg/opt/pr99456.C.jj   2021-03-09 11:43:56.452862770 
+0100
+++ gcc/testsuite/g++.dg/opt/pr99456.C  2021-03-09 11:43:56.452862770 +0100
@@ -0,0 +1,33 @@
+// PR c++/99456
+// { dg-do compile { target c++17 } }
+// { dg-options "-g0" }
+// { dg-final { scan-assembler-not "PR99456Var0\[1234]" } }
+// { dg-final { scan-assembler-not "__static_initialization_and_destruction" } 
}
+// { dg-final { scan-assembler-not "_GLOBAL__sub_I" } }
+// { dg-final { scan-assembler-not "_ZGV12PR99456Var1\[1234]" } }
+
+typedef __UINTPTR_TYPE__ uintptr_t;
+
+class Container
+{
+public:
+  uintptr_t m;
+};
+
+extern unsigned desc;
+static constexpr unsigned  = desc;
+
+inline Container PR99456Var01 {reinterpret_cast ()};
+inline Container PR99456Var02 {reinterpret_cast ()};
+inline uintptr_t PR99456Var03 {reinterpret_cast ()};
+inline uintptr_t PR99456Var04 {reinterpret_cast ()};
+
+inline Container PR99456Var11 {reinterpret_cast ()};
+inline Container PR99456Var12 {reinterpret_cast ()};
+inline uintptr_t PR99456Var13 {reinterpret_cast ()};
+inline uintptr_t PR99456Var14 {reinterpret_cast ()};
+
+auto *PR99456Ref11 = 
+auto *PR99456Ref12 = 
+auto *PR99456Ref13 = 
+auto *PR99456Ref14 = 
--- gcc/testsuite/g++.dg/ext/vla22.C.jj 2020-02-27 09:28:46.396956140 +0100
+++ gcc/testsuite/g++.dg/ext/vla22.C2021-03-09 12:00:58.275482884 +0100
@@ -6,4 +6,4 @@ void
  f ()
  {
const int tbl[(long) "h"] = { 12 }; // { dg-error "size of array .tbl. is not an 
integral constant-expression" }
-}
+}// { dg-warning "narrowing conversion" "" 
{ target c++11 } .-1 }

Jakub




--
Nathan Sidwell


[PATCH] testsuite/98245 - adjust dump scanning of gcc.dg/vect/bb-slp-46.c

2021-03-11 Thread Richard Biener
Checking the number of pluses is unreliable since the vector size
isn't known.  Instead see that the unwanted scalar compute is not
there.

Pushed.

2021-03-11  Richard Biener  

PR testsuite/98245
* gcc.dg/vect/bb-slp-46.c: Scan for the scalar compute
instead of verifying the total number of adds.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-46.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-46.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-46.c
index 8daa5c1cfce..98b29062a19 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-46.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-46.c
@@ -24,5 +24,5 @@ int foo ()
 /* { dg-final { scan-tree-dump "optimized: basic block" "slp2" } } */
 /* { dg-final { scan-tree-dump "extracting lane for live stmt" "slp2" } } */
 /* { dg-final { scan-tree-dump-times "extracting lane for live stmt" 2 "slp2" 
{ xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump-times " \\+ " 3 "optimized" } } */
-/* { dg-final { scan-tree-dump-times " \\+ " 2 "optimized" { xfail *-*-* } } } 
*/
+/* { dg-final { scan-tree-dump-not "tem3_\[0-9\]\+ = " "optimized" } } */
+/* { dg-final { scan-tree-dump-not "tem0_\[0-9\]\+ = " "optimized" { xfail 
*-*-* } } } */
-- 
2.26.2


[PATCH] testsuite/97494 - XFAIL gcc.dg/vect/pr97428.c on !vect_hw_misalign

2021-03-11 Thread Richard Biener
While we could at least vectorize it on targets which support
re-alignment tokens we fail to do this because of imperfections in
alignment analysis.  XFAIL when the HW cannot deal with misaligned
vector accesses for now.

Pushed.

2021-03-11  Richard Biener  

PR testsuite/97494
* gcc.dg/vect/pr97428.c: XFAIL on !vect_hw_misalign.
---
 gcc/testsuite/gcc.dg/vect/pr97428.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/pr97428.c 
b/gcc/testsuite/gcc.dg/vect/pr97428.c
index 49d53738256..bbd743a76c4 100644
--- a/gcc/testsuite/gcc.dg/vect/pr97428.c
+++ b/gcc/testsuite/gcc.dg/vect/pr97428.c
@@ -40,5 +40,7 @@ void foo_i2(dcmlx4_t dst[], const dcmlx_t src[], int n)
load and store groups.  */
 /* { dg-final { scan-tree-dump "Detected interleaving load of size 8" "vect" } 
} */
 /* { dg-final { scan-tree-dump "Detected interleaving store of size 16" "vect" 
} } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" } 
} */
+/* We're not able to peel & apply re-aligning to make accesses well-aligned 
for !vect_hw_misalign,
+   but we could by peeling the stores for alignment and applying re-aligning 
loads.  */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { 
xfail { ! vect_hw_misalign } } } } */
 /* { dg-final { scan-tree-dump-not "gap of 6 elements" "vect" } } */
-- 
2.26.2


[PATCH] testsuite/97494 - XFAIL gcc.dg/vect/vect-complex-5.c on !vect_hw_misalign

2021-03-11 Thread Richard Biener
This is a missed optimization due to bogus alignment analysis.

Pushed.

2021-03-11  Richard Biener  

PR testsuite/97494
* gcc.dg/vect/vect-complex-5.c: XFAIL on !vect_hw_misalign.
---
 gcc/testsuite/gcc.dg/vect/vect-complex-5.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/vect-complex-5.c 
b/gcc/testsuite/gcc.dg/vect/vect-complex-5.c
index 06486375449..81fdb67ce81 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-complex-5.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-complex-5.c
@@ -40,4 +40,4 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" } 
} */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { 
xfail { ! vect_hw_misalign } } } } */
-- 
2.26.2


[PATCH] testsuite/97494 - amend gcc.dg/vect/slp-21.c

2021-03-11 Thread Richard Biener
As reported in the PR all powerpc64 targets fail

FAIL: gcc.dg/vect/slp-21.c scan-tree-dump-times vect "vectorizing stmts using 
SLP" 2

because like on arm we now vectorize 4 opportunities.  This adjusts
the testcase to follow the arm example.

compile-tested on powerpc64le-linux-gnu, pushed.

2021-03-11  Richard Biener  

PR testsuite/97494
* gcc.dg/vect/slp-21.c: Adjust for powerpc64*-*-*.
---
 gcc/testsuite/gcc.dg/vect/slp-21.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/slp-21.c 
b/gcc/testsuite/gcc.dg/vect/slp-21.c
index 117d65c5ddb..bf8f434dd50 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-21.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-21.c
@@ -210,7 +210,7 @@ int main (void)
 
Not all vect_perm targets support that, and it's a bit too specific to have
its own effective-target selector, so we just test targets directly.  */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" { 
target { aarch64*-*-* arm*-*-* } } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { 
target { vect_strided4 && { ! { aarch64*-*-* arm*-*-* } } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" { 
target { aarch64*-*-* arm*-*-* powerpc64*-*-* } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { 
target { vect_strided4 && { ! { aarch64*-*-* arm*-*-* powerpc64*-*-* } } } } } 
} */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect"  { 
target { ! { vect_strided4 } } } } } */
   
-- 
2.26.2


Re: [RFC] decay vect tests from run to link for pr95401

2021-03-11 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches  writes:
> On Thu, Mar 11, 2021 at 9:03 AM Alexandre Oliva  wrote:
>>
>>
>> When vect.exp finds our configuration disables altivec by default, it
>> disables the execution of vectorization tests, assuming the test
>> hardware doesn't support it.
>>
>> Tests become just compile tests, but compile tests won't work
>> correctly when additional sources are named, e.g. pr95401.cc, because
>> GCC refuses to compile multiple files into the same asm output.
>>
>> With this patch, the default for when execution is not possible
>> becomes link.
>>
>>
>> This was regstrapped on x86_64-linux-gnu and ppc64-linux-gnu, and tested
>> with a cross to a ppc64-vxworks7r2 with altivec disabled by default.  I
>> found fixing the handling of additional sources to e.g. compile each one
>> separately, or perhaps just discard or reject additional sources for
>> compile tests, to be a little too involved.
>>
>> So I'm leaning towards this proposed change, just extended to other
>> platforms that also decay from run to compile rather than link, and thus
>> run into this problem in g++.dg/vect/pr95401.cc.  Would this be
>> acceptable?
>
> I think that's OK.

+1 FWIW.  It seems like an improvement anyway, since it makes it
harder to forget an explicit dg-do compile in cases where it's needed.

Richard




[PATCH] tree-optimization/99523 - missing SSA decls in dumps

2021-03-11 Thread Richard Biener
This makes sure to dump SSA names without identifier in the
declaration part of a function dump.  While we dump the
anonymous variable decls the SSA names referencing them appear
without a clear reference as to what anonymous variable is used
(_3 vs. D.1234).

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-03-11  Richard Biener  

PR tree-optimization/99523
* tree-cfg.c (dump_function_to_file): Dump SSA names
w/o identifier to the decls section as well, not only those
without a VAR_DECL.
---
 gcc/tree-cfg.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index d04ce212561..7e3aae5f9c2 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -8155,7 +8155,12 @@ dump_function_to_file (tree fndecl, FILE *file, 
dump_flags_t flags)
   if (gimple_in_ssa_p (cfun))
FOR_EACH_SSA_NAME (ix, name, cfun)
  {
-   if (!SSA_NAME_VAR (name))
+   if (!SSA_NAME_VAR (name)
+   /* SSA name with decls without a name still get
+  dumped as _N, list those explicitely as well even
+  though we've dumped the decl declaration as D.xxx
+  above.  */
+   || !SSA_NAME_IDENTIFIER (name))
  {
fprintf (file, "  ");
print_generic_expr (file, TREE_TYPE (name), flags);
-- 
2.26.2


Re: [Patch] Fortran: Fix libgfortran I/O race with newunit_free [PR99529]

2021-03-11 Thread Tobias Burnus
Revised version – the previous had a lock inversion, which could lead to 
a deadlock, found by -fsanitize=thread. See transfer.c for the changes.


OK?

Tobias

On 11.03.21 10:42, Tobias Burnus wrote:

Hi all,

as found by Martin (thanks!) there is a race for newunit_free.
While that call is within the unitlock for the calls in io/unit.c,
the call in transfer.c did not use locks.

Additionally,
  unit = get_gfc_unit (dtp->common.unit, do_create);
  set_internal_unit (dtp, unit, kind);
gets first the unit (with proper locking when using the unit number
dtp->common.unit) but then in set_internal_unit it re-sets the
unit number to the same number without locking. That causes
race warnings and if the assignment is not atomic it is a true race.

OK for mainline? What about GCC 10?

As Martin notes in the email thread and in the PR there are more
race warnings (and likely true race issues).

Tobias

Fortran: Fix libgfortran I/O race with newunit_free [PR99529]

libgfortran/ChangeLog:

	* io/transfer.c (st_read_done_worker, st_write_done_worker):
	Call unlock_unit here, add unit_lock lock around newunit_free call.
	(st_read_done, st_write_done): Only call unlock_unit when not
	calling the worker function.
	* io/unit.c (set_internal_unit): Don't reset the unit_number
	to the same number as this cause race warnings.

diff --git a/libgfortran/io/transfer.c b/libgfortran/io/transfer.c
index 27bee9d4e01..71a935652e3 100644
--- a/libgfortran/io/transfer.c
+++ b/libgfortran/io/transfer.c
@@ -4339,6 +4339,7 @@ export_proto(st_read_done);
 void
 st_read_done_worker (st_parameter_dt *dtp)
 {
+  bool free_newunit = false;
   finalize_transfer (dtp);
 
   free_ionml (dtp);
@@ -4358,7 +4359,7 @@ st_read_done_worker (st_parameter_dt *dtp)
 		free (dtp->u.p.current_unit->ls);
 	  dtp->u.p.current_unit->ls = NULL;
 	}
-	  newunit_free (dtp->common.unit);
+	  free_newunit = true;
 	}
   if (dtp->u.p.unit_is_internal || dtp->u.p.format_not_saved)
 	{
@@ -4366,6 +4367,14 @@ st_read_done_worker (st_parameter_dt *dtp)
 	  free_format (dtp);
 	}
 }
+   unlock_unit (dtp->u.p.current_unit);
+   if (free_newunit)
+ {
+   /* Avoid inverse lock issues by placing after unlock_unit.  */
+   LOCK (_lock);
+   newunit_free (dtp->common.unit);
+   UNLOCK (_lock);
+ }
 }
 
 void
@@ -4382,11 +4391,10 @@ st_read_done (st_parameter_dt *dtp)
 	  if (dtp->u.p.async)
 		enqueue_done (dtp->u.p.current_unit->au, AIO_READ_DONE);
 	}
+	  unlock_unit (dtp->u.p.current_unit);
 	}
   else
-	st_read_done_worker (dtp);
-
-  unlock_unit (dtp->u.p.current_unit);
+	st_read_done_worker (dtp);  /* Calls unlock_unit.  */
 }
 
   library_end ();
@@ -4406,6 +4414,7 @@ st_write (st_parameter_dt *dtp)
 void
 st_write_done_worker (st_parameter_dt *dtp)
 {
+  bool free_newunit = false;
   finalize_transfer (dtp);
 
   if (dtp->u.p.current_unit != NULL
@@ -4446,7 +4455,7 @@ st_write_done_worker (st_parameter_dt *dtp)
 		free (dtp->u.p.current_unit->ls);
 	  dtp->u.p.current_unit->ls = NULL;
 	}
-	  newunit_free (dtp->common.unit);
+	  free_newunit = true;
 	}
   if (dtp->u.p.unit_is_internal || dtp->u.p.format_not_saved)
 	{
@@ -4454,6 +4463,14 @@ st_write_done_worker (st_parameter_dt *dtp)
 	  free_format (dtp);
 	}
 }
+   unlock_unit (dtp->u.p.current_unit);
+   if (free_newunit)
+ {
+   /* Avoid inverse lock issues by placing after unlock_unit.  */
+   LOCK (_lock);
+   newunit_free (dtp->common.unit);
+   UNLOCK (_lock);
+ }
 }
 
 extern void st_write_done (st_parameter_dt *);
@@ -4476,11 +4493,10 @@ st_write_done (st_parameter_dt *dtp)
 	  if (dtp->u.p.async)
 		enqueue_done (dtp->u.p.current_unit->au, AIO_WRITE_DONE);
 	}
+	  unlock_unit (dtp->u.p.current_unit);
 	}
   else
-	st_write_done_worker (dtp);
-
-  unlock_unit (dtp->u.p.current_unit);
+	st_write_done_worker (dtp);  /* Calls unlock_unit.  */
 }
 
   library_end ();
diff --git a/libgfortran/io/unit.c b/libgfortran/io/unit.c
index 2ea664331bc..b0cc6ab2301 100644
--- a/libgfortran/io/unit.c
+++ b/libgfortran/io/unit.c
@@ -456,7 +456,6 @@ set_internal_unit (st_parameter_dt *dtp, gfc_unit *iunit, int kind)
 {
   gfc_offset start_record = 0;
 
-  iunit->unit_number = dtp->common.unit;
   iunit->recl = dtp->internal_unit_len;
   iunit->internal_unit = dtp->internal_unit;
   iunit->internal_unit_len = dtp->internal_unit_len;


Re: ipa-sra is mostly disabled by -mlongcall

2021-03-11 Thread Richard Biener via Gcc-patches
On Thu, Mar 11, 2021 at 9:19 AM Alexandre Oliva  wrote:
>
>
> Several ipa-sra tests fail because -mlongcall on powerpc is a
> TYPE_ATTRIBUTE, and those disable ipa-sra.
>
> This local workaround disables -mlongcall for the failing tests on
> powerpc*-*-vxworks*, so they get a chance to run even in kernel mode.
>
> (AdaCore has -mlongcall enabled for powerpc*-wrs-vxworks* kernel-mode
> targets.)
>
> This was regstrapped on x86_64-linux-gnu and ppc64-linux-gnu, and tested
> with a cross to a ppc64-vxworks7r2 with -mlongcall enabled by default.
>
> I was leaning towards maintaing this change internal, but I thought it
> wouldn't hurt to ask whether it's ok to install.
>
> I wonder whether it wouldn't be more desirable to have an alternate
> implementation of -mlongcall on ppc that didn't conflict with IPA-SRA,
> and IIRC with some scenarios involving C++ templates.
>
> E.g., on ARM, -mlong-calls doesn't add an attribute to every function,
> it just changes a default, that attributes short_call and long_call may
> then override for specific functions.  Would a similar implementation be
> acceptable for ppc?

I wonder what the effect of this switch is - it's documented as
affecting the call site (different call insn sequence by default), so
the best representation would be an attribute on actual call stmts so
inlining a non -mlongcall fn into a -mlongcall fn will properly
inherit the setting of calls in the inlined function?

So in C++ I should be able to write

[[gnu::arm::longcall]] foo ();

to get a single call forced as long-call (whatever this is used for
in practice).

I agree that a type attribute (it's a type attribute likely to also
affect indirect calls) is a bit awkward but it does look better
than just a global flag.

Note that the long-standing issue in IPA is that IPA would need
to know the semantics of attributes since it might need to modify
them.  One idea (partly implemented already I think) is to have
a whitelist, aka, "this attribute doesn't need to be adjusted when
parameters are split, removed or added or when the function is
split or merged" - where for target specific attributes this would
need a new target hook.

Adjusting the testsuite just hides this missed optimization issue,
so at least a bugreport with xfail keyword refering to the altered
tests should be opened.

Richard.

>
>
> for  gcc/testsuite/ChangeLog
>
> * gcc.dg/ipa/ipa-sra-12.c: Add -mno-longcall on ppc*-vxworks*.
> * gcc.dg/ipa/ipa-sra-13.c: Likewise.
> * gcc.dg/ipa/ipa-sra-15.c: Likewise.
> * gcc.dg/ipa/ipa-sra-16.c: Likewise.
> * gcc.dg/ipa/ipa-sra-17.c: Likewise.
> * gcc.dg/ipa/ipa-sra-18.c: Likewise.
> ---
>  gcc/testsuite/gcc.dg/ipa/ipa-sra-12.c |1 +
>  gcc/testsuite/gcc.dg/ipa/ipa-sra-13.c |1 +
>  gcc/testsuite/gcc.dg/ipa/ipa-sra-15.c |1 +
>  gcc/testsuite/gcc.dg/ipa/ipa-sra-16.c |1 +
>  gcc/testsuite/gcc.dg/ipa/ipa-sra-17.c |1 +
>  gcc/testsuite/gcc.dg/ipa/ipa-sra-18.c |1 +
>  6 files changed, 6 insertions(+)
>
> diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-12.c 
> b/gcc/testsuite/gcc.dg/ipa/ipa-sra-12.c
> index 0cc76bde319c7..0d74c8f52eee2 100644
> --- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-12.c
> +++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-12.c
> @@ -1,5 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-options "-O2 -fipa-sra -fdump-ipa-sra"  } */
> +/* { dg-additional-options "-mno-longcall" { target powerpc*-*-vxworks* } } 
> */
>
>  /* Check of a simple and transitive structure split. */
>
> diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-13.c 
> b/gcc/testsuite/gcc.dg/ipa/ipa-sra-13.c
> index e8751dad67fcc..66083f040bad9 100644
> --- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-13.c
> +++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-13.c
> @@ -1,5 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-options "-O2 -fipa-sra -fdump-ipa-sra"  } */
> +/* { dg-additional-options "-mno-longcall" { target powerpc*-*-vxworks* } } 
> */
>
>  /* Check of a by-reference structure split. */
>
> diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-15.c 
> b/gcc/testsuite/gcc.dg/ipa/ipa-sra-15.c
> index aa13a94c7c064..98b7ae5d4f11b 100644
> --- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-15.c
> +++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-15.c
> @@ -1,5 +1,6 @@
>  /* { dg-do run } */
>  /* { dg-options "-O2 -fipa-sra -fdump-ipa-sra"  } */
> +/* { dg-additional-options "-mno-longcall" { target powerpc*-*-vxworks* } } 
> */
>
>  /* Check of a recursive by-reference structure split.  The recursive 
> functions
> have to be pure right from the start, otherwise the current AA would 
> detect
> diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-16.c 
> b/gcc/testsuite/gcc.dg/ipa/ipa-sra-16.c
> index 2bffe297c7456..25f65fe1fc05a 100644
> --- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-16.c
> +++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-16.c
> @@ -1,5 +1,6 @@
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -fipa-sra -fdump-ipa-sra -fdump-tree-optimized"  } */
> +/* { dg-additional-options "-mno-longcall" { target powerpc*-*-vxworks* } } 
> 

Re: ipa-sra is mostly disabled by -mlongcall

2021-03-11 Thread Martin Jambor
Hi,

On Thu, Mar 11 2021, Alexandre Oliva wrote:
> Several ipa-sra tests fail because -mlongcall on powerpc is a
> TYPE_ATTRIBUTE, and those disable ipa-sra.
>
> This local workaround disables -mlongcall for the failing tests on
> powerpc*-*-vxworks*, so they get a chance to run even in kernel mode.
>
> (AdaCore has -mlongcall enabled for powerpc*-wrs-vxworks* kernel-mode
> targets.)
>
> This was regstrapped on x86_64-linux-gnu and ppc64-linux-gnu, and tested
> with a cross to a ppc64-vxworks7r2 with -mlongcall enabled by default.
>
> I was leaning towards maintaing this change internal, but I thought it
> wouldn't hurt to ask whether it's ok to install.
>
> I wonder whether it wouldn't be more desirable to have an alternate
> implementation of -mlongcall on ppc that didn't conflict with IPA-SRA,
> and IIRC with some scenarios involving C++ templates.

when it comes to attributes, often IPA-SRA is over-eager to switch
itself off, even though it could just carry on, which might also be this
case.  Unfortunately, I noticed that this "to be removed later" check is
still there only at the end of November and did not manage to remove it
before stage 4.  I'll fix it in spring.

Thanks for taking care of the testcases, I will use them for testing.

Martin


>
> E.g., on ARM, -mlong-calls doesn't add an attribute to every function,
> it just changes a default, that attributes short_call and long_call may
> then override for specific functions.  Would a similar implementation be
> acceptable for ppc?
>
>
> for  gcc/testsuite/ChangeLog
>
>   * gcc.dg/ipa/ipa-sra-12.c: Add -mno-longcall on ppc*-vxworks*.
>   * gcc.dg/ipa/ipa-sra-13.c: Likewise.
>   * gcc.dg/ipa/ipa-sra-15.c: Likewise.
>   * gcc.dg/ipa/ipa-sra-16.c: Likewise.
>   * gcc.dg/ipa/ipa-sra-17.c: Likewise.
>   * gcc.dg/ipa/ipa-sra-18.c: Likewise.


Re: [RFC] decay vect tests from run to link for pr95401

2021-03-11 Thread Richard Biener via Gcc-patches
On Thu, Mar 11, 2021 at 9:03 AM Alexandre Oliva  wrote:
>
>
> When vect.exp finds our configuration disables altivec by default, it
> disables the execution of vectorization tests, assuming the test
> hardware doesn't support it.
>
> Tests become just compile tests, but compile tests won't work
> correctly when additional sources are named, e.g. pr95401.cc, because
> GCC refuses to compile multiple files into the same asm output.
>
> With this patch, the default for when execution is not possible
> becomes link.
>
>
> This was regstrapped on x86_64-linux-gnu and ppc64-linux-gnu, and tested
> with a cross to a ppc64-vxworks7r2 with altivec disabled by default.  I
> found fixing the handling of additional sources to e.g. compile each one
> separately, or perhaps just discard or reject additional sources for
> compile tests, to be a little too involved.
>
> So I'm leaning towards this proposed change, just extended to other
> platforms that also decay from run to compile rather than link, and thus
> run into this problem in g++.dg/vect/pr95401.cc.  Would this be
> acceptable?

I think that's OK.  It's probably difficult to make the test UNSUPPORTED
when dg-additional-sources is discovered with a dg-do compile test?

Richard.

>
> for  gcc/testsuite/ChangeLog
>
> * lib/target-supports.exp (check_vect_support_and_set_flags):
> Decay to link rather than compile.
> ---
>  gcc/testsuite/lib/target-supports.exp |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/lib/target-supports.exp 
> b/gcc/testsuite/lib/target-supports.exp
> index 52d3d036d3c5c..23d532fb2b963 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -9656,7 +9656,7 @@ proc check_vect_support_and_set_flags { } {
>  # Specify a cpu that supports VMX for compile-only tests.
>  lappend DEFAULT_VECTCFLAGS "-mcpu=970"
>  }
> -set dg-do-what-default compile
> +set dg-do-what-default link
>  }
>  } elseif { [istarget i?86-*-*] || [istarget x86_64-*-*] } {
>  lappend DEFAULT_VECTCFLAGS "-msse2"
>
> --
> Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
>Free Software Activist GNU Toolchain Engineer
> Vim, Vi, Voltei pro Emacs -- GNUlius Caesar


[Patch] Fortran: Fix libgfortran I/O race with newunit_free [PR99529]

2021-03-11 Thread Tobias Burnus

Hi all,

as found by Martin (thanks!) there is a race for newunit_free.
While that call is within the unitlock for the calls in io/unit.c,
the call in transfer.c did not use locks.

Additionally,
  unit = get_gfc_unit (dtp->common.unit, do_create);
  set_internal_unit (dtp, unit, kind);
gets first the unit (with proper locking when using the unit number
dtp->common.unit) but then in set_internal_unit it re-sets the
unit number to the same number without locking. That causes
race warnings and if the assignment is not atomic it is a true race.

OK for mainline? What about GCC 10?

As Martin notes in the email thread and in the PR there are more
race warnings (and likely true race issues).

Tobias

Fortran: Fix libgfortran I/O race with newunit_free [PR99529]

libgfortran/ChangeLog:

	* io/transfer.c (st_read_done_worker, st_write_done_worker):
	Add unit_lock lock/unlock around newunit_free call.
	* io/unit.c (set_internal_unit): Don't reset the unit_number
	to the same number as this cause race warnings.

diff --git a/libgfortran/io/transfer.c b/libgfortran/io/transfer.c
index 27bee9d4e01..a6f115d5803 100644
--- a/libgfortran/io/transfer.c
+++ b/libgfortran/io/transfer.c
@@ -4358,7 +4358,9 @@ st_read_done_worker (st_parameter_dt *dtp)
 		free (dtp->u.p.current_unit->ls);
 	  dtp->u.p.current_unit->ls = NULL;
 	}
+	  LOCK (_lock);
 	  newunit_free (dtp->common.unit);
+	  UNLOCK (_lock);
 	}
   if (dtp->u.p.unit_is_internal || dtp->u.p.format_not_saved)
 	{
@@ -4446,7 +4448,9 @@ st_write_done_worker (st_parameter_dt *dtp)
 		free (dtp->u.p.current_unit->ls);
 	  dtp->u.p.current_unit->ls = NULL;
 	}
+	  LOCK (_lock);
 	  newunit_free (dtp->common.unit);
+	  UNLOCK (_lock);
 	}
   if (dtp->u.p.unit_is_internal || dtp->u.p.format_not_saved)
 	{
diff --git a/libgfortran/io/unit.c b/libgfortran/io/unit.c
index 2ea664331bc..b0cc6ab2301 100644
--- a/libgfortran/io/unit.c
+++ b/libgfortran/io/unit.c
@@ -456,7 +456,6 @@ set_internal_unit (st_parameter_dt *dtp, gfc_unit *iunit, int kind)
 {
   gfc_offset start_record = 0;
 
-  iunit->unit_number = dtp->common.unit;
   iunit->recl = dtp->internal_unit_len;
   iunit->internal_unit = dtp->internal_unit;
   iunit->internal_unit_len = dtp->internal_unit_len;


Re: [PATCH] icf: Check return type of internal fn calls [PR99517]

2021-03-11 Thread Richard Biener
On Thu, 11 Mar 2021, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase is miscompiled, because IPA-ICF considers the two
> functions identical.  They aren't, the types of the .VEC_CONVERT call
> lhs is different.  But for calls to internal functions, there is no
> fntype nor callee with a function type to compare, so all we compare
> is just the ifn, arguments and some call flags.
> 
> The following patch fixes it by checking the internal fn calls like e.g. 
> gimple
> assignments where the type of the lhs is checked too.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2021-03-11  Jakub Jelinek  
> 
>   PR ipa/99517
>   * ipa-icf-gimple.c (func_checker::compare_gimple_call): For internal
>   function calls with lhs fail if the lhs don't have compatible types.
> 
>   * gcc.target/i386/avx2-pr99517-1.c: New test.
>   * gcc.target/i386/avx2-pr99517-2.c: New test.
> 
> --- gcc/ipa-icf-gimple.c.jj   2021-01-04 10:25:38.752234741 +0100
> +++ gcc/ipa-icf-gimple.c  2021-03-10 15:02:06.287502784 +0100
> @@ -667,7 +667,7 @@ func_checker::compare_gimple_call (gcall
>tree fntype1 = gimple_call_fntype (s1);
>tree fntype2 = gimple_call_fntype (s2);
>  
> -  /* For direct calls we verify that types are comopatible so if we matced
> +  /* For direct calls we verify that types are compatible so if we matched
>   callees, callers must match, too.  For indirect calls however verify
>   function type.  */
>if (!gimple_call_fndecl (s1))
> @@ -703,6 +703,14 @@ func_checker::compare_gimple_call (gcall
>t1 = gimple_get_lhs (s1);
>t2 = gimple_get_lhs (s2);
>  
> +  /* For internal calls, lhs types need to be verified, as neither fntype nor
> + callee comparisons can catch that.  */
> +  if (gimple_call_internal_p (s1)
> +  && t1
> +  && t2
> +  && !compatible_types_p (TREE_TYPE (t1), TREE_TYPE (t2)))
> +return return_false_with_msg ("GIMPLE internal call LHS type mismatch");
> +
>return compare_operand (t1, t2, get_operand_access_type (, t1));
>  }
>  
> --- gcc/testsuite/gcc.target/i386/avx2-pr99517-1.c.jj 2021-03-10 
> 15:28:33.175959797 +0100
> +++ gcc/testsuite/gcc.target/i386/avx2-pr99517-1.c2021-03-10 
> 15:28:57.117695186 +0100
> @@ -0,0 +1,25 @@
> +/* PR ipa/99517 */
> +/* { dg-do run { target avx2 } } */
> +/* { dg-additional-sources "avx2-pr99517-2.c" } */
> +/* { dg-options "-O2 -mavx2" } */
> +
> +#include "avx2-check.h"
> +
> +typedef signed char v32qi __attribute__((vector_size(32)));
> +typedef int v4si __attribute__((vector_size(16)));
> +typedef long long int v4di __attribute__((vector_size(32)));
> +typedef double v4df __attribute__((vector_size(32)));
> +extern v32qi foo (v4si);
> +extern v32qi bar (v4si);
> +
> +static void
> +avx2_test (void)
> +{
> +  v4si a = { 1, -2, 3, -4 };
> +  __asm ("" : "+x" (a));
> +  v4di b = (v4di) bar (a);
> +  v4df c = (v4df) foo (a);
> +  if (b[0] != 1 || c[0] != 1.0 || b[1] != -2 || c[1] != -2.0
> +  || b[2] != 3 || c[2] != 3.0 || b[3] != -4 || c[3] != -4.0)
> +__builtin_abort ();
> +}
> --- gcc/testsuite/gcc.target/i386/avx2-pr99517-2.c.jj 2021-03-10 
> 15:30:33.974624677 +0100
> +++ gcc/testsuite/gcc.target/i386/avx2-pr99517-2.c2021-03-10 
> 15:30:28.529684856 +0100
> @@ -0,0 +1,20 @@
> +/* PR ipa/99517 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx2" } */
> +
> +typedef signed char v32qi __attribute__((vector_size(32)));
> +typedef int v4si __attribute__((vector_size(16)));
> +typedef long long int v4di __attribute__((vector_size(32)));
> +typedef double v4df __attribute__((vector_size(32)));
> +
> +v32qi
> +foo (v4si x)
> +{
> +  return (v32qi) __builtin_convertvector (x, v4df);
> +}
> +
> +v32qi
> +bar (v4si x)
> +{
> +  return (v32qi) __builtin_convertvector (x, v4di);
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: add -mfloat128 to __float128-using test missing it

2021-03-11 Thread Iain Sandoe via Gcc-patches

Alexandre Oliva  wrote:


On Mar 10, 2021, Alexandre Oliva  wrote:


* gcc.target/powerpc/prefix-ds-dq.c: Enable __float128.


I've been reminded that this is not enough for the scan-assembler tests
to pass, at least in our configurations.  Nearly all of the asm
expectations are unmet.  I'm yet to identify the root cause.


I have the following patch that I was intending to apply/post later (since
the test makes unconditional use of __float128 which is not available on
all platforms).

does this solve your issue?
Iain

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/prefix-ds-dq.c: Require float128 support.

diff --git a/gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c  
b/gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c

index 554cd0c..7ab7201 100644
--- a/gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c
+++ b/gcc/testsuite/gcc.target/powerpc/prefix-ds-dq.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-require-effective-target ppc_float128_sw } */
 /* { dg-require-effective-target lp64 } */
 /* { dg-options "-O2 -mdejagnu-cpu=power10" } */





[PATCH] c++: Fix up calls to immediate functions returning reference [PR99507]

2021-03-11 Thread Jakub Jelinek via Gcc-patches
Hi!

build_cxx_call calls convert_from_reference at the end, so if an immediate
function returns a reference, we were constant evaluating not just that
call, but that call wrapped in an INDIRECT_REF.  That unfortunately means
it can constant evaluate to something non-addressable, so if code later
needs to take its address it will fail.

The following patch fixes that by undoing the convert_from_reference
wrapping for the cxx_constant_value evaluation and readdding it ad the end.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-03-11  Jakub Jelinek  

PR c++/99507
* call.c (build_over_call): For immediate evaluation of functions
that return references, undo convert_from_reference effects before
calling cxx_constant_value and call convert_from_reference
afterwards.

* g++.dg/cpp2a/consteval19.C: New test.

--- gcc/cp/call.c.jj2021-03-06 10:17:01.687304578 +0100
+++ gcc/cp/call.c   2021-03-10 13:51:29.121445195 +0100
@@ -9504,6 +9504,8 @@ build_over_call (struct z_candidate *can
   if (immediate_invocation_p (fndecl, nargs))
{
  tree obj_arg = NULL_TREE;
+ if (REFERENCE_REF_P (call))
+   call = TREE_OPERAND (call, 0);
  if (DECL_CONSTRUCTOR_P (fndecl))
obj_arg = cand->first_arg ? cand->first_arg : (*args)[0];
  if (obj_arg && is_dummy_object (obj_arg))
@@ -9527,6 +9529,7 @@ build_over_call (struct z_candidate *can
  call = cxx_constant_value (call, obj_arg);
  if (obj_arg && !error_operand_p (call))
call = build2 (INIT_EXPR, void_type_node, obj_arg, call);
+ call = convert_from_reference (call);
}
 }
   return call;
--- gcc/testsuite/g++.dg/cpp2a/consteval19.C.jj 2021-03-10 14:20:58.018835190 
+0100
+++ gcc/testsuite/g++.dg/cpp2a/consteval19.C2021-03-10 14:20:16.642294167 
+0100
@@ -0,0 +1,6 @@
+// PR c++/99507
+// { dg-do compile { target c++20 } }
+
+constexpr int i{0};
+consteval const int  () { return i; }
+const int *a{ ()};

Jakub



[PATCH, OG10, C++, committed] Fix non-static member mapping in templates

2021-03-11 Thread Chung-Lin Tang

There was a case of the implicit non-static pointer member mapping
not working properly with templates.

What happened was that the code in finish_omp_target() created the
map clauses (which normally runs after finish_omp_clauses), but being
a template class it was put through all the tsubst_* stuff and at the
end thrown into finish_omp_clauses a 2nd time. And because finish_omp_clauses
didn't handle some of the implicitly created map clauses, things didn't
work...

This patch slightly fixes many handled cases in these parts, plus some
adjustments in gimplify.c.

Tested without regressions, and pushed to devel/omp/gcc-10.

Chung-Lin
From 4e714eaad985f68533f267b8df2026e5c14d084a Mon Sep 17 00:00:00 2001
From: Chung-Lin Tang 
Date: Thu, 11 Mar 2021 00:31:08 -0800
Subject: [PATCH] Fix template case of non-static member access inside member
 functions

Prior patches for C++ non-static member access had problems under template
classes, due to re-calling of finish_omp_clauses after finish_omp_target
created the implicit maps required, but not of allowed form in 
finish_omp_clauses.

This patch solves this by slightly relaxing the allowed expressions in
finish_omp_clauses.

2021-03-11  Chung-Lin Tang  

gcc/cp/ChangeLog:

* semantics.c (finish_omp_clauses): Adjustments to allow '*ptr' and
'ptr->member' cases in map clausess.
(finish_omp_target): Use INDIRECT_REF instead of MEM_REF in created
clauses, add processing_template_decl handling.

gcc/ChangeLog:

* gimplify.c (gimplify_scan_omp_clauses): Under !DECL_P case of
GOMP_CLAUSE_MAP handling, add STRIP_NOPS for indir_p case, add to
struct_deref_set for map(*ptr_to_struct) cases.

gcc/testsuite/ChangeLog:

* g++.dg/gomp/target-this-3.C: Adjust scan test.
* g++.dg/gomp/target-this-4.C: Likewise.
* g++.dg/gomp/target-this-5.C: New test.

libgomp/ChangeLog:

* testsuite/libgomp.c++/target-this-5.C: New test.
---
 gcc/cp/semantics.c| 45 +--
 gcc/gimplify.c| 19 +++
 gcc/testsuite/g++.dg/gomp/target-this-3.C |  2 +-
 gcc/testsuite/g++.dg/gomp/target-this-4.C |  2 +-
 gcc/testsuite/g++.dg/gomp/target-this-5.C | 34 
 libgomp/testsuite/libgomp.c++/target-this-5.C | 30 ++
 6 files changed, 120 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/gomp/target-this-5.C
 create mode 100644 libgomp/testsuite/libgomp.c++/target-this-5.C

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 55a5983..5b62fa3 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -6407,6 +6407,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type 
ort)
   bool order_seen = false;
   bool schedule_seen = false;
   bool oacc_async = false;
+  bool indirect_ref_p = false;
   bool indir_component_ref_p = false;
   tree last_iterators = NULL_TREE;
   bool last_iterators_remove = false;
@@ -7516,6 +7517,14 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type 
ort)
  indir_component_ref_p = true;
  STRIP_NOPS (t);
}
+ indirect_ref_p = false;
+ if ((ort == C_ORT_ACC || ort == C_ORT_OMP)
+ && INDIRECT_REF_P (t))
+   {
+ t = TREE_OPERAND (t, 0);
+ indirect_ref_p = true;
+ STRIP_NOPS (t);
+   }
  if (TREE_CODE (t) == COMPONENT_REF
  && ((ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP
  || ort == C_ORT_ACC)
@@ -7551,6 +7560,12 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type 
ort)
  break;
}
  t = TREE_OPERAND (t, 0);
+ if (INDIRECT_REF_P (t))
+   {
+ t = TREE_OPERAND (t, 0);
+ indir_component_ref_p = true;
+ STRIP_NOPS (t);
+   }
}
  if (remove)
break;
@@ -7614,6 +7629,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type 
ort)
   || (OMP_CLAUSE_MAP_KIND (c)
   != GOMP_MAP_FIRSTPRIVATE_POINTER))
   && !indir_component_ref_p
+  && !indirect_ref_p
   && !cxx_mark_addressable (t))
remove = true;
  else if (!(OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
@@ -7698,7 +7714,8 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type 
ort)
}
  else
{
- bitmap_set_bit (_head, DECL_UID (t));
+ if (!indirect_ref_p && !indir_component_ref_p)
+   bitmap_set_bit (_head, DECL_UID (t));
  if (t != OMP_CLAUSE_DECL (c)
  && TREE_CODE (OMP_CLAUSE_DECL (c)) == COMPONENT_REF)
bitmap_set_bit (_field_head, DECL_UID (t));
@@ -8702,9 +8719,12 @@ finish_omp_target (location_t loc, 

[PATCH] icf: Check return type of internal fn calls [PR99517]

2021-03-11 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase is miscompiled, because IPA-ICF considers the two
functions identical.  They aren't, the types of the .VEC_CONVERT call
lhs is different.  But for calls to internal functions, there is no
fntype nor callee with a function type to compare, so all we compare
is just the ifn, arguments and some call flags.

The following patch fixes it by checking the internal fn calls like e.g. gimple
assignments where the type of the lhs is checked too.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-03-11  Jakub Jelinek  

PR ipa/99517
* ipa-icf-gimple.c (func_checker::compare_gimple_call): For internal
function calls with lhs fail if the lhs don't have compatible types.

* gcc.target/i386/avx2-pr99517-1.c: New test.
* gcc.target/i386/avx2-pr99517-2.c: New test.

--- gcc/ipa-icf-gimple.c.jj 2021-01-04 10:25:38.752234741 +0100
+++ gcc/ipa-icf-gimple.c2021-03-10 15:02:06.287502784 +0100
@@ -667,7 +667,7 @@ func_checker::compare_gimple_call (gcall
   tree fntype1 = gimple_call_fntype (s1);
   tree fntype2 = gimple_call_fntype (s2);
 
-  /* For direct calls we verify that types are comopatible so if we matced
+  /* For direct calls we verify that types are compatible so if we matched
  callees, callers must match, too.  For indirect calls however verify
  function type.  */
   if (!gimple_call_fndecl (s1))
@@ -703,6 +703,14 @@ func_checker::compare_gimple_call (gcall
   t1 = gimple_get_lhs (s1);
   t2 = gimple_get_lhs (s2);
 
+  /* For internal calls, lhs types need to be verified, as neither fntype nor
+ callee comparisons can catch that.  */
+  if (gimple_call_internal_p (s1)
+  && t1
+  && t2
+  && !compatible_types_p (TREE_TYPE (t1), TREE_TYPE (t2)))
+return return_false_with_msg ("GIMPLE internal call LHS type mismatch");
+
   return compare_operand (t1, t2, get_operand_access_type (, t1));
 }
 
--- gcc/testsuite/gcc.target/i386/avx2-pr99517-1.c.jj   2021-03-10 
15:28:33.175959797 +0100
+++ gcc/testsuite/gcc.target/i386/avx2-pr99517-1.c  2021-03-10 
15:28:57.117695186 +0100
@@ -0,0 +1,25 @@
+/* PR ipa/99517 */
+/* { dg-do run { target avx2 } } */
+/* { dg-additional-sources "avx2-pr99517-2.c" } */
+/* { dg-options "-O2 -mavx2" } */
+
+#include "avx2-check.h"
+
+typedef signed char v32qi __attribute__((vector_size(32)));
+typedef int v4si __attribute__((vector_size(16)));
+typedef long long int v4di __attribute__((vector_size(32)));
+typedef double v4df __attribute__((vector_size(32)));
+extern v32qi foo (v4si);
+extern v32qi bar (v4si);
+
+static void
+avx2_test (void)
+{
+  v4si a = { 1, -2, 3, -4 };
+  __asm ("" : "+x" (a));
+  v4di b = (v4di) bar (a);
+  v4df c = (v4df) foo (a);
+  if (b[0] != 1 || c[0] != 1.0 || b[1] != -2 || c[1] != -2.0
+  || b[2] != 3 || c[2] != 3.0 || b[3] != -4 || c[3] != -4.0)
+__builtin_abort ();
+}
--- gcc/testsuite/gcc.target/i386/avx2-pr99517-2.c.jj   2021-03-10 
15:30:33.974624677 +0100
+++ gcc/testsuite/gcc.target/i386/avx2-pr99517-2.c  2021-03-10 
15:30:28.529684856 +0100
@@ -0,0 +1,20 @@
+/* PR ipa/99517 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx2" } */
+
+typedef signed char v32qi __attribute__((vector_size(32)));
+typedef int v4si __attribute__((vector_size(16)));
+typedef long long int v4di __attribute__((vector_size(32)));
+typedef double v4df __attribute__((vector_size(32)));
+
+v32qi
+foo (v4si x)
+{
+  return (v32qi) __builtin_convertvector (x, v4df);
+}
+
+v32qi
+bar (v4si x)
+{
+  return (v32qi) __builtin_convertvector (x, v4di);
+}

Jakub



Re: add -mfloat128 to __float128-using test missing it

2021-03-11 Thread Alexandre Oliva
On Mar 10, 2021, Alexandre Oliva  wrote:

>   * gcc.target/powerpc/prefix-ds-dq.c: Enable __float128.

I've been reminded that this is not enough for the scan-assembler tests
to pass, at least in our configurations.  Nearly all of the asm
expectations are unmet.  I'm yet to identify the root cause.

-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar