date:20220901

Re: [PATCH v2, rs6000] Put dg-options before effective target checks

2022-09-01 Thread Kewen.Lin via Gcc-patches

on 2022/9/2 11:23, HAO CHEN GUI wrote:
> Hi Kewen,
> 
> On 1/9/2022 5:34 PM, Kewen.Lin wrote:
>> Thanks for the updated patch!
>>
>> I just found that it seems all the three test cases suffer the empty
>> TU error issue from those has_arch* effective target checks?
>>
>> If yes, it looks we don't need to bother this once patch [1] gets
>> landed?
>>
>> Sorry, I didn't notice and ask when reviewing the previous version.
>>
>> [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598748.html
> 
> Yes, those 3 test cases all suffer from the "empty translation unit" problem.
> My patch just has a side effect which avoids the "empty translation unit"
> problem. But the real problem is still there.

OK, thanks for the information!  If so, I would prefer to leave them
alone for now, the issues should be fixed once [1] gets landed.

> 
> pr92398.p9+.c has another problem. It's a compile-only test case and it should be
> compiled on any platform when "-mdejagnu-cpu=power9" is set in dg-options
> or RUNTESTFLAGS. Putting dg-options before the "has_arch_pwr9" check achieves
> this goal.

OK, then go ahead to enhance it separately.  :)

BR,
Kewen

Re: [PATCH 1/2] Using pli(paddi) and rotate to build 64bit constants

2022-09-01 Thread Kewen.Lin via Gcc-patches

Hi Jeff,

Thanks for the patch, some comments on nits are inline.

on 2022/9/1 11:24, Jiufu Guo wrote:
> Hi,
> 
> As mentioned in PR106550, since pli can take a 34-bit immediate, we could
> use fewer instructions (3 insns would be enough) to build a 64-bit constant with pli.
> 
> For example, for constant 0x020805006106003, we could generate it with:
> asm code1:
> pli 9,101736451 (0x6106003)
> sldi 9,9,32
> paddi 9,9, 213 (0x0208050)
> 
> or asm code2:
> pli 10, 213
> pli 9, 101736451
> rldimi 9, 10, 32, 0
> 
> Testing with simple cases as below, run them a lot of times:
> f1.c
> long __attribute__ ((noinline)) foo (long *arg,long *,long*)
> {
>   *arg = 0x2351847027482577;
> }
> 5insns: base
> pli+sldi+paddi: similar -0.08%
> pli+pli+rldimi: faster +0.66%
> 
> f2.c
> long __attribute__ ((noinline)) foo (long *arg, long *arg2, long *arg3)
> {
>   *arg = 0x2351847027482577;
>   *arg2 = 0x3257845024384680;
>   *arg3 = 0x1245abcef9240dec;
> }
> 5insns: base
> pli+sldi+paddi: faster +1.35%
> pli+pli+rldimi: faster +5.49%
> 
> f2.c would be more meaningful, because the 'sched' passes are effective for
> f2.c but do little for f1.c.
> 
> Compare with previous patch:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599525.html
> This one updates the code slightly and extracts the changes to rs6000.md into a
> separate patch.
> 
> This patch passes bootstrap and regtest on ppc64le (including p10).
> Is it ok for trunk?
> 
> BR,
> Jeff(Jiufu)
> 
> 
>   PR target/106550
> 
> gcc/ChangeLog:
> 
>   * config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Add 'pli' for
>   constant building.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/powerpc/pr106550.c: New test.
> 
> ---
>  gcc/config/rs6000/rs6000.cc | 39 +
>  gcc/testsuite/gcc.target/powerpc/pr106550.c | 14 
>  2 files changed, 53 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr106550.c
> 
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index df491bee2ea..1ccb2ff30a1 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -10181,6 +10181,45 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT 
> c)
>   gen_rtx_IOR (DImode, copy_rtx (temp),
>GEN_INT (ud1)));
>  }
> +  else if (TARGET_PREFIXED)
> +{
> +  /* pli 9,high32 + pli 10,low32 + rldimi 9,10,32,0.  */
> +  if (can_create_pseudo_p ())
> + {
> +   temp = gen_reg_rtx (DImode);
> +   rtx temp1 = gen_reg_rtx (DImode);
> +   emit_move_insn (copy_rtx (temp), GEN_INT ((ud4 << 16) | ud3));
> +   emit_move_insn (copy_rtx (temp1), GEN_INT ((ud2 << 16) | ud1));
> +

Nit: copy_rtx here seems not necessary, as both temp and temp1 are with
CODE REG.  The function copy_rtx returns the given rtx for code REG.

> +   emit_insn (gen_rotldi3_insert_3 (dest, temp, GEN_INT (32), temp1,
> +GEN_INT (0x)));
> + }
> +
> +  /* pli 9,high32 + sldi 9,32 + paddi 9,9,low32.  */
> +  else
> + {
> +   emit_move_insn (copy_rtx (dest), GEN_INT ((ud4 << 16) | ud3));
> +
> +   emit_move_insn (copy_rtx (dest),
> +   gen_rtx_ASHIFT (DImode, copy_rtx (dest),
> +   GEN_INT (32)));
> +
> +   bool can_use_paddi = REGNO (dest) != FIRST_GPR_REGNO;
> +

The REGNO usage has asserted dest is with CODE REG.  If that's always true,
I don't see why we need copy_rtx around it.  Or am I missing something?

> +   /* Use paddi for the low32 bits.  */
> +   if (ud2 != 0 && ud1 != 0 && can_use_paddi)
> + emit_move_insn (dest, gen_rtx_PLUS (DImode, copy_rtx (dest),
> + GEN_INT ((ud2 << 16) | ud1)));
> +   /* Use oris, ori for low32 bits.  */
> +   if (ud2 != 0 && (ud1 == 0 || !can_use_paddi))
> + emit_move_insn (ud1 != 0 ? copy_rtx (dest) : dest,
> + gen_rtx_IOR (DImode, copy_rtx (dest),
> +  GEN_INT (ud2 << 16)));
> +   if (ud1 != 0 && (ud2 == 0 || !can_use_paddi))
> + emit_move_insn (dest, gen_rtx_IOR (DImode, copy_rtx (dest),
> +GEN_INT (ud1)));
> + }
> +}
>else
>  {
>temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr106550.c 
> b/gcc/testsuite/gcc.target/powerpc/pr106550.c
> new file mode 100644
> index 000..c6f4116bb9a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr106550.c
> @@ -0,0 +1,14 @@
> +/* PR target/106550 */
> +/* { dg-options "-O2 -std=c99 -mdejagnu-cpu=power10" } */
> +

Need to check power10_ok, like:
/* { dg-require-effective-target power10_ok } */

Nit: -std=c99 is not needed?

BR,
Kewen
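
For readers following the pli+pli+rldimi sequence discussed above, here is a minimal C sketch of the arithmetic it relies on.  It is only a model under stated assumptions: the helper names (split_const, rldimi_32_0, build_with_two_pli) are hypothetical and are not GCC functions; only the ud1..ud4 naming and the (ud4 << 16) | ud3 and (ud2 << 16) | ud1 expressions come from the quoted patch.

```c
#include <stdint.h>

/* Four 16-bit pieces of a 64-bit constant, ud1 being the lowest.
   The names mirror the ud1..ud4 variables in the quoted patch.  */
struct pieces { uint64_t ud1, ud2, ud3, ud4; };

/* Hypothetical helper: split C into its 16-bit pieces.  */
static struct pieces
split_const (uint64_t c)
{
  struct pieces p;
  p.ud1 = c & 0xffff;
  p.ud2 = (c >> 16) & 0xffff;
  p.ud3 = (c >> 32) & 0xffff;
  p.ud4 = (c >> 48) & 0xffff;
  return p;
}

/* Model of "rldimi ra,rs,32,0": rotate RS left by 32 bits and insert
   the result into the high 32 bits of RA, keeping RA's low 32 bits.  */
static uint64_t
rldimi_32_0 (uint64_t ra, uint64_t rs)
{
  uint64_t rotated = (rs << 32) | (rs >> 32);
  uint64_t mask = 0xffffffff00000000ULL;
  return (rotated & mask) | (ra & ~mask);
}

/* Model of the three-insn sequence: two independent loads of 32-bit
   halves (the two pli insns) followed by one rotate-and-insert.  */
static uint64_t
build_with_two_pli (uint64_t c)
{
  struct pieces p = split_const (c);
  uint64_t high32 = (p.ud4 << 16) | p.ud3;
  uint64_t low32 = (p.ud2 << 16) | p.ud1;
  uint64_t r10 = high32;	/* pli 10,high32 */
  uint64_t r9 = low32;		/* pli 9,low32 */
  return rldimi_32_0 (r9, r10);	/* rldimi 9,10,32,0 */
}
```

The two loads do not depend on each other, which is presumably why this form schedules better than pli+sldi+paddi, where every insn depends on the previous one.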

Re: [PATCH v2, rs6000] Put dg-options before effective target checks

2022-09-01 Thread HAO CHEN GUI via Gcc-patches

Hi Segher,
  Thanks for your review comments. I will refine it according to
your comments.

On 2/9/2022 12:07 AM, Segher Boessenkool wrote:
>> +/* { dg-do compile { target { ! has_arch_pwr9 } } } */
> Please keep dg-do first thing in the file.
Could you let me know whether dg-do must be on the first line?
I hit a problem here: "! has_arch_pwr9" cannot be put into
dg-require-effective-target because it contains a NOT. So I put dg-options
on the first line, ahead of dg-do.

> 
>> --- a/gcc/testsuite/gcc.target/powerpc/pr93453-1.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr93453-1.c
>> @@ -1,5 +1,6 @@
>> -/* { dg-do compile { target has_arch_ppc64 } } */
>> +/* { dg-do compile } */
>>  /* { dg-options "-mdejagnu-cpu=power6 -O2" } */
>> +/* { dg-require-effective-target has_arch_ppc64 } */
> This is fine, but it doesn't change anything, unless we have a bug.

This case suffers from the "empty translation unit" problem and ends up
unsupported on all platforms. Putting dg-options before the check avoids
the problem.

Thanks
Gui Haochen

Re: [PATCH 2/2] allow constant splitter run in split1 pass

2022-09-01 Thread Jiufu Guo via Gcc-patches

Segher Boessenkool  writes:

> Hi!
>
> On Thu, Sep 01, 2022 at 11:24:07AM +0800, Jiufu Guo wrote:
>> Currently, these two splitters (touched in this patch) are using predicate
>> `int_reg_operand_not_pseudo`, then they work in split2 pass after RA in
>> most times, and can not run before RA.
>> 
>> It would not be a bad idea to allow these splitters before RA.  Then more
>> passes (between split1 and split2) could optimize the emitted instructions.
>
> The splitters can be used earlier even.  For example, often combine will
> use them.
Oh, yes! 
>
>> And if splitting before RA, for these constant splitters, we may have more
>> freedom to create pseudos to generate more parallel instructions.
>> 
>> For the example in the leading patch [PATCH 1/2]: pli+pli+rldimi would be
>> better than pli+sldi+paddi.
>
> Yes.  If you split after reload you have to do all local optimisations
> (that would have been done in earlier passes) manually.  And all more
> global ones (involving just one or two more insns already) are
> essentially impossible to do.
>
> Splitting after reload is necessary in some cases.  For example, all the
> integer "dot" insns split to the base insn and an explicit compare, if
> for some reason RA did not get cr0 here.  Importantly, this happens very
> seldomly: RA knows it is two insns instead of one, and it chooses
> accordingly.  Also it *has* to be after reload, it directly depends on
> what RA chose to do.
>
> Splitting dependent on if a VSR or a GPR (pair) was used is a losing
> proposition.  It usually costs much more than it can gain.
>
Thanks for your detailed explanation, which makes this much clearer!

>> gcc/ChangeLog:
>> 
>>  * config/rs6000/rs6000.md (const splitter): Update predicate.
>
>   * config/rs6000/rs6000.md (splitter for set to and_mask constants):
>   Use int_reg_operand (instead of int_reg_operand_not_pseudo).
>   (splitter for multi-insn constant loads): Ditto.
>
> You should mention the changes to *both* splitters.  For nameless
> splitters it helps if you can describe it a bit.  This is hard, yes :-/
>
Thanks for your always helpful comments!

BR,
Jeff(Jiufu)

> Okay for trunk like that.  Thanks!
>
>
> Segher

Re: [PATCH] LoongArch: add -mdirect-extern-access option

2022-09-01 Thread Xi Ruoyao via Gcc-patches

On Fri, 2022-09-02 at 11:12 +0800, Huacai Chen wrote:
> On Thu, Sep 1, 2022 at 6:56 PM Xi Ruoyao  wrote:
> > 
> > We'd like to introduce a new codegen option to align with the old
> > "-Wa,-mla-global-with-pcrel" and avoid a performance & size
> > regression
> > building the Linux kernel with new-reloc toolchain.  And it should
> > be
> > also useful for building statically linked executables, firmwares
> > (EDK2
> > for example), and other OS kernels.
> > 
> > OK for trunk?
> This seems to drop your (1)(2)(3) approach and do something similar to "a
> new code model" discussed in another thread?

Step (1) is rejected because they've found an issue with copy
relocation that is impossible to solve (there is a plan to drop copy
relocation from other architectures as well).  This is a modification of
(2): we add -mdirect-extern-access, but without copy relocation it can
only be used by the kernel, static executables, etc., so it's not the default.

In kernel we just use KBUILD_CFLAGS_KERNEL += -mdirect-extern-access. 
For modules we use GOT until we can get rid of XKPRANGE.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University
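
To make the effect of the option concrete, here is a small C sketch of the kind of access it targets.  The names shared_counter and read_counter are hypothetical, and the definition is placed in the same file only so that the sketch links on its own; in a real kernel or static executable the object would normally live in another translation unit.

```c
/* An object with default visibility, declared extern.  Per the quoted
   patch, loongarch_classify_symbol returns SYMBOL_GOT_DISP for such a
   symbol when it does not bind locally, so the access goes through
   the GOT.  With -mdirect-extern-access that classification is
   skipped and the GOT is not used.  */
extern int shared_counter;

/* The access whose code generation the option is meant to change.  */
int
read_counter (void)
{
  return shared_counter;
}

/* Defined here only to keep the sketch self-contained (assumption:
   normally this definition is in a different translation unit).  */
int shared_counter = 41;
```

Since the patch rejects the option when compiling a shared library, a file like this would be built without -fPIC/-fpic when -mdirect-extern-access is given.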

Re: [PATCH v2, rs6000] Put dg-options before effective target checks

2022-09-01 Thread HAO CHEN GUI via Gcc-patches

Hi Kewen,

On 1/9/2022 5:34 PM, Kewen.Lin wrote:
> Thanks for the updated patch!
> 
> I just found that it seems all the three test cases suffer the empty
> TU error issue from those has_arch* effective target checks?
> 
> If yes, it looks we don't need to bother this once patch [1] gets
> landed?
> 
> Sorry, I didn't notice and ask when reviewing the previous version.
> 
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598748.html

Yes, those 3 test cases all suffer from the "empty translation unit" problem.
My patch just has a side effect which avoids the "empty translation unit"
problem. But the real problem is still there.

pr92398.p9+.c has another problem. It's a compile-only test case and it should be
compiled on any platform when "-mdejagnu-cpu=power9" is set in dg-options
or RUNTESTFLAGS. Putting dg-options before the "has_arch_pwr9" check achieves
this goal.

Thanks
Gui Haochen

Re: [PATCH] LoongArch: add -mdirect-extern-access option

2022-09-01 Thread Huacai Chen via Gcc-patches

Hi, Ruoyao,

On Thu, Sep 1, 2022 at 6:56 PM Xi Ruoyao  wrote:
>
> We'd like to introduce a new codegen option to align with the old
> "-Wa,-mla-global-with-pcrel" and avoid a performance & size regression
> building the Linux kernel with new-reloc toolchain.  And it should be
> also useful for building statically linked executables, firmwares (EDK2
> for example), and other OS kernels.
>
> OK for trunk?
This seems to drop your (1)(2)(3) approach and do something similar to "a
new code model" discussed in another thread?

Huacai
>
> -- >8 --
>
> As a new target, LoongArch does not use copy relocation as it's
> problematic in some circumstances.  One bad consequence is we are
> emitting GOT for all accesses to all extern objects with default
> visibility.  The use of GOT is not needed in statically linked
> executables, OS kernels etc.  The GOT entry just wastes space, and the
> GOT access just slows down the execution in those environments.
>
> Before -mexplicit-relocs, we used "-Wa,-mla-global-with-pcrel" to tell
> the assembler not to use GOT for extern access.  But with
> -mexplicit-relocs, we have to implement the logic in GCC.
>
> The name "-mdirect-extern-access" is borrowed from the x86 port.
>
> gcc/ChangeLog:
>
> * config/loongarch/genopts/loongarch.opt.in: Add
> -mdirect-extern-access option.
> * config/loongarch/loongarch.opt: Regenerate.
> * config/loongarch/loongarch.cc (loongarch_classify_symbol):
> Don't use SYMBOL_GOT_DISP if TARGET_DIRECT_EXTERN_ACCESS.
> (loongarch_option_override_internal): Complain if
> -mdirect-extern-access is used with -fPIC or -fpic.
> * doc/invoke.texi: Document -mdirect-extern-access for
> LoongArch.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/loongarch/direct-extern-1.c: New test.
> * gcc.target/loongarch/direct-extern-2.c: New test.
> ---
>  gcc/config/loongarch/genopts/loongarch.opt.in |  4 
>  gcc/config/loongarch/loongarch.cc |  5 -
>  gcc/config/loongarch/loongarch.opt|  4 
>  gcc/doc/invoke.texi   | 15 +++
>  .../gcc.target/loongarch/direct-extern-1.c|  6 ++
>  .../gcc.target/loongarch/direct-extern-2.c|  6 ++
>  6 files changed, 39 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/loongarch/direct-extern-1.c
>  create mode 100644 gcc/testsuite/gcc.target/loongarch/direct-extern-2.c
>
> diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in 
> b/gcc/config/loongarch/genopts/loongarch.opt.in
> index ebdd9538d48..e10618777b2 100644
> --- a/gcc/config/loongarch/genopts/loongarch.opt.in
> +++ b/gcc/config/loongarch/genopts/loongarch.opt.in
> @@ -184,3 +184,7 @@ Enum(cmodel) String(@@STR_CMODEL_EXTREME@@) 
> Value(CMODEL_EXTREME)
>  mcmodel=
>  Target RejectNegative Joined Enum(cmodel) Var(la_opt_cmodel) 
> Init(CMODEL_NORMAL)
>  Specify the code model.
> +
> +mdirect-extern-access
> +Target Var(TARGET_DIRECT_EXTERN_ACCESS) Init(0)
> +Avoid using the GOT to access external symbols.
> diff --git a/gcc/config/loongarch/loongarch.cc 
> b/gcc/config/loongarch/loongarch.cc
> index 77e3a105390..2875fa5b0f3 100644
> --- a/gcc/config/loongarch/loongarch.cc
> +++ b/gcc/config/loongarch/loongarch.cc
> @@ -1642,7 +1642,7 @@ loongarch_classify_symbol (const_rtx x)
>if (SYMBOL_REF_TLS_MODEL (x))
>  return SYMBOL_TLS;
>
> -  if (!loongarch_symbol_binds_local_p (x))
> +  if (!TARGET_DIRECT_EXTERN_ACCESS && !loongarch_symbol_binds_local_p (x))
>  return SYMBOL_GOT_DISP;
>
>tree t = SYMBOL_REF_DECL (x);
> @@ -6093,6 +6093,9 @@ loongarch_option_override_internal (struct gcc_options 
> *opts)
>if (loongarch_branch_cost == 0)
>  loongarch_branch_cost = loongarch_cost->branch_cost;
>
> +  if (TARGET_DIRECT_EXTERN_ACCESS && flag_shlib)
> +error ("%qs cannot be used for compiling a shared library",
> +  "-mdirect-extern-access");
>
>switch (la_target.cmodel)
>  {
> diff --git a/gcc/config/loongarch/loongarch.opt 
> b/gcc/config/loongarch/loongarch.opt
> index 6395234218b..96c811c850b 100644
> --- a/gcc/config/loongarch/loongarch.opt
> +++ b/gcc/config/loongarch/loongarch.opt
> @@ -191,3 +191,7 @@ Enum(cmodel) String(extreme) Value(CMODEL_EXTREME)
>  mcmodel=
>  Target RejectNegative Joined Enum(cmodel) Var(la_opt_cmodel) 
> Init(CMODEL_NORMAL)
>  Specify the code model.
> +
> +mdirect-extern-access
> +Target Var(TARGET_DIRECT_EXTERN_ACCESS) Init(0)
> +Avoid using the GOT to access external symbols.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index e5eb525a2c1..d4e86682827 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -1016,6 +1016,7 @@ Objective-C and Objective-C++ Dialects}.
>  -memcpy  -mno-memcpy -mstrict-align -mno-strict-align @gol
>  -mmax-inline-memcpy-size=@var{n} @gol
>  -mexplicit-relocs -mno-explicit-relocs @gol
> +-mdirect-extern-access -mno-direct-extern-access @gol
>

Re: [PATCH] rs6000/test: Fix bswap64-4.c with has_arch_ppc64 [PR106680]

2022-09-01 Thread Kewen.Lin via Gcc-patches

Hi Segher,

on 2022/9/1 23:04, Segher Boessenkool wrote:
> On Thu, Sep 01, 2022 at 05:05:44PM +0800, Kewen.Lin wrote:
>>> On Wed, Aug 31, 2022 at 05:33:28PM +0800, Kewen.Lin wrote:
>>> *Should* -mpowerpc64  be disabled by -m32?  
>>
>> I think the reason to disable -mpowerpc64 at -m32 is that we have
>> -mpowerpc64 explicitly specified at -m64 (equivalent behavior).
> 
> *Im*plicitly.  Explicit means the user has it on the command line.
> 

aha, let me reword it. :)  ... is that when -m64 is specified we make
it act like -mpowerpc64 is specified explicitly too even if user doesn't
actually specify -mpowerpc64.

>> In the current implementation, when -m64 is specified, we set the
>> bit OPTION_MASK_POWERPC64 in both opts and opts_set.  Since we
>> set OPTION_MASK_POWERPC64 in opts_set for -m64, when we find the
>> OPTION_MASK_POWERPC64 is ON in opts_set, we don't know if there
>> is one actual cmd-line option -mpowerpc64 or just -m64.
> 
> Yes.  That is what _explicit is for :-)
> 
>> Without any explicit -mpowerpc64 (and -mno-), I think we all agree
>> that -m64 should set OPTION_MASK_POWERPC64 in opts, conversely -m32
>> should unset OPTION_MASK_POWERPC64 in opts.
> 
> The latter only for OSes that do not handle -mpowerpc64 correctly.

I think it's the same for the OSes that handle -mpowerpc64 correctly.

Note that it's for the context without any explicit -mpowerpc64 (and
-mno-), assuming we don't "unset OPTION_MASK_POWERPC64 in opts" for
-m32, then the command line "-m64 -m32" would not be the same as
"-m32", since the previous "-m64" sets OPTION_MASK_POWERPC64 in opts
and it's still kept, it's unexpected.

> 
>> To make -m32/-m64 and -mpowerpc64 orthogonal, IMHO we should not
>> set bit OPTION_MASK_POWERPC64 in opts_set for -m64.
> 
> No.  Instead, we should not touch it if the user has explicitly set it
> or unset it.  Just like with all other flags :-)

I may be missing something, but I think what we said here is consistent.
"should not set bit OPTION_MASK_POWERPC64 in opts_set" means we should
not make it act as if -mpowerpc64 is specified explicitly (once we don't
do the "unexpected" thing for -m64, there is no reason to unset it for -m32
conversely, so explicitly setting/unsetting -mpowerpc64 is independent of -m32/-m64).

BR,
Kewen
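
The behavior argued for in this message can be modeled with a toy C sketch.  This is not rs6000 code: the struct, the MASK_POWERPC64 constant and the apply_* helpers are hypothetical stand-ins for opts, opts_set and OPTION_MASK_POWERPC64, and the sketch assumes the rule proposed above (-m64/-m32 only change the bit when the user has not given -mpowerpc64 or -mno-powerpc64 explicitly).

```c
/* Stand-in for OPTION_MASK_POWERPC64.  */
enum { MASK_POWERPC64 = 1u << 0 };

/* "flags" models opts, "flags_explicit" models opts_set.  */
struct option_state
{
  unsigned flags;
  unsigned flags_explicit;
};

/* -mpowerpc64 / -mno-powerpc64 on the command line: record both the
   value and the fact that the user chose it.  */
static void
apply_mpowerpc64 (struct option_state *s, int on)
{
  s->flags_explicit |= MASK_POWERPC64;
  if (on)
    s->flags |= MASK_POWERPC64;
  else
    s->flags &= ~MASK_POWERPC64;
}

/* -m64: turn the bit on by default, but never mark it as explicit.  */
static void
apply_m64 (struct option_state *s)
{
  if (!(s->flags_explicit & MASK_POWERPC64))
    s->flags |= MASK_POWERPC64;
}

/* -m32: turn the default back off, again leaving explicit choices alone.  */
static void
apply_m32 (struct option_state *s)
{
  if (!(s->flags_explicit & MASK_POWERPC64))
    s->flags &= ~MASK_POWERPC64;
}
```

Under this model "-m64 -m32" ends in the same state as "-m32" alone, while "-mpowerpc64 -m32" keeps the bit set, which is the orthogonality described above.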

Re: [PATCH] rs6000/test: Fix bswap64-4.c with has_arch_ppc64 [PR106680]

2022-09-01 Thread Kewen.Lin via Gcc-patches

Hi Segher,

on 2022/9/1 22:57, Segher Boessenkool wrote:
> On Thu, Sep 01, 2022 at 04:57:59PM +0800, Kewen.Lin wrote:
>> on 2022/8/31 22:13, Peter Bergner wrote:
>>> On 8/31/22 4:33 AM, Kewen.Lin wrote:
>>>> @@ -1,7 +1,8 @@
>>>>  /* { dg-do compile { target { powerpc*-*-* } } } */
>>>>  /* { dg-skip-if "" { powerpc*-*-aix* } } */
>>>> -/* { dg-options "-O2 -mpowerpc64" } */
>>>>  /* { dg-require-effective-target ilp32 } */
>>>> +/* { dg-options "-O2 -mpowerpc64" } */
>>>> +/* { dg-require-effective-target has_arch_ppc64 } */
>>>
>>> With many of our recent patches moving the dg-options before any
>>> dg-require-effective-target so it affects the results of the
>>> dg-require-effective-target test, this looks like it's backwards
>>> from that process.  I understand why, so I think an explicit comment
>>> here in the test case explaining why it's after in this case would help.
>>> Just so in a few years when we come back to this test case, we
>>> won't accidentally undo this change.
>>
>> Oops, the diff shows it's like "after", but it's actually still "before". :)
>> The dg-options is meant to be placed before the succeeding has_arch_ppc64
>> effective target which is supposed to use dg-options to compile.  I felt
>> good to let ilp32 checking go first then has_arch_ppc64, so moved dg-option
>> downward.
> 
> These two are independent, but apparently we have a bug here, which will
> make what you did malfunction in some cases -- the test will not run for
> ilp32 if you have RUNTESTFLAGS {-m32,-m64}.

Yeah, because of the bug (or call it surprising behavior), the test case can
fail for some DejaGnu versions like 1.5.1 (how it places the dg-options matters).
What I proposed is to detect this kind of test environment by has_arch_ppc64,
then turn the failure into unsupported.  Then the test case can survive with
any DejaGnu version.  But based on the discussions, I'd like to try to fix
the bug and abandon this testing fix first.

> 
> It should not make a difference, -mpowerpc64 and -m32 should be wholly
> independent, and their order should not matter.  So the order of the
>   /* { dg-require-effective-target ilp32 } */
>   /* { dg-options "-O2 -mpowerpc64" } */
> lines should not make a difference either.  But it does :-(
> 

I agree with the point that the order of lines should not make a difference.  :)
But to clarify, the order of 

  /* { dg-options "-O2 -mpowerpc64" } */

and 

  /* { dg-require-effective-target has_arch_ppc64 } */

matters in this proposed fix, not for the line with ilp32.

has_arch_ppc64 uses current_compiler_flags, which only incorporates dg-options
placed before the dg-require-effective-target.  I guess it's related
to how DejaGnu parses lines and sets global variables; for this kind of case,
we have to keep the expected order for now.

BR,
Kewen
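
As a sketch of the ordering described above, a test header following the proposed fix could look like the file below.  The directive lines are taken from the quoted diff; the explanatory comment and the swap64 function body are only illustrative assumptions, not the contents of bswap64-4.c.

```c
/* { dg-do compile { target { powerpc*-*-* } } } */
/* { dg-skip-if "" { powerpc*-*-aix* } } */
/* { dg-require-effective-target ilp32 } */
/* The options line must stay above the has_arch_ppc64 check below:
   that check compiles with the options seen so far, so moving it up
   would test the wrong flags.  */
/* { dg-options "-O2 -mpowerpc64" } */
/* { dg-require-effective-target has_arch_ppc64 } */

/* Illustrative body only: a 64-bit byte swap.  */
unsigned long long
swap64 (unsigned long long x)
{
  return __builtin_bswap64 (x);
}
```

The comment inside the header is the kind of note Peter asked for, so that the ordering is not undone by accident later.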

Re: [PATCH]rs6000: remove unused splitter on const_scalar_int_operand

2022-09-01 Thread Segher Boessenkool

Hi!

On Tue, Aug 30, 2022 at 05:44:26PM +0800, Jiufu Guo wrote:
> There are two splitters, both are calling rs6000_emit_set_const to emit
> instructions for constant building.
> One splitter checks `const_int_operand`, this splitter is always used.
> Another splitter checks `const_scalar_int_operand`, this one is never
> used now.

> Now, HOST_BITS_PER_WIDE_INT is forced to 64, this splitter is safe
> to remove.

Okay for trunk.  Thanks!


Segher

Re: Patch ping (was Re: [PATCH] libstdc++: Clear padding bits in atomic compare_exchange)

2022-09-01 Thread Thomas Rodgers via Gcc-patches

Sorry for the delay in getting to this.

I am currently working on moving the bulk of the atomic wait implementation
into the .so. I'd like to get that work to a stable state before revisiting
this patch, but obviously if we want this to make it into GCC13, it needs
to happen sooner rather than later.
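
For context on why padding matters for compare-exchange, here is a minimal C sketch of the underlying problem.  It is not the libstdc++ implementation: struct padded and equal_ignoring_padding are hypothetical names, and the sketch assumes a compiler that provides __has_builtin and, where available, __builtin_clear_padding (the builtin named in the quoted patch); otherwise it falls back to a member-wise comparison.

```c
#include <string.h>

/* On most ABIs there are padding bytes between "tag" and "value".  */
struct padded
{
  char tag;
  int value;
};

#if defined(__has_builtin)
# if __has_builtin(__builtin_clear_padding)
#  define HAVE_CLEAR_PADDING 1
# endif
#endif

/* Compare two objects by representation after zeroing their padding,
   so that only the value bits take part in the comparison.  */
static int
equal_ignoring_padding (const struct padded *a, const struct padded *b)
{
  struct padded x, y;
  memcpy (&x, a, sizeof x);	/* copy including padding bytes */
  memcpy (&y, b, sizeof y);
#ifdef HAVE_CLEAR_PADDING
  __builtin_clear_padding (&x);
  __builtin_clear_padding (&y);
  return memcmp (&x, &y, sizeof x) == 0;
#else
  /* Fallback when the builtin is missing: compare member by member.  */
  return x.tag == y.tag && x.value == y.value;
#endif
}
```

A bytewise compare-exchange without this sanitizing step could fail spuriously when two objects hold the same members but different padding bytes, which is the situation P0528 addresses.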

On Thu, Aug 25, 2022 at 3:11 AM Jakub Jelinek  wrote:

> On Tue, Jan 18, 2022 at 09:48:19PM +, Jonathan Wakely via Gcc-patches
> wrote:
> > On Tue, 2 Nov 2021 at 01:26, Thomas Rodgers  wrote:
> >
> > > This should address Jonathan's feedback and adds support for
> atomic_ref
> > >
> >
> >
> > >This change implements P0528 which requires that padding bits not
> > >participate in atomic compare exchange operations. All arguments to the
> > >generic template are 'sanitized' by the __builtin_clearpadding intrisic
> >
> > The name of the intrinsic and the word "instrinsic" have typos.
>
> I'd like to ping this patch.
> To make some progress, I've tried to incorporate some of Jonathan's
> review comments below, but it is incomplete.
>
> ChangeLog + wording above it fixed.
>
> > >
> > >   explicit
> > >   __atomic_ref(_Tp& __t) : _M_ptr(std::__addressof(__t))
> > >-  { __glibcxx_assert(((uintptr_t)_M_ptr % required_alignment) ==
> 0); }
> > >+  {
> > >+ __glibcxx_assert(((uintptr_t)_M_ptr % required_alignment) == 0);
> > >+#if __cplusplus > 201402L && __has_builtin(__builtin_clear_padding)
> > >+ __builtin_clear_padding(_M_ptr);
> > >+#endif
> > >+  }
> >
> > Is this safe to do?
> >
> > What if multiple threads all create a std::atomic_ref round the same
> object
> > at once, they'll all try to clear padding, and so race, won't they?
> > I don't think we can clear padding on atomic_ref construction, only on
> > store and RMW operations.
>
> Didn't touch the above.
> >
> >
> > >--- a/libstdc++-v3/include/std/atomic
> > >+++ b/libstdc++-v3/include/std/atomic
>
> The patch against this file doesn't apply it all.
>
> > >--- /dev/null
> > >+++
> >
> b/libstdc++-v3/testsuite/29_atomics/atomic_ref/compare_exchange_padding.cc
> > >@@ -0,0 +1,43 @@
> > >+// { dg-options "-std=gnu++2a" }
> > >+// { dg-do run { target c++2a } }
> >
> > This new test is using "2a" not "20".
>
> Fixed thus, but the other testcase wasn't in the patch at all.
>
> Here it is:
>
> libstdc++: Clear padding bits in atomic compare_exchange
>
> This change implements P0528 which requires that padding bits not
> participate in atomic compare exchange operations. All arguments to the
> generic template are 'sanitized' by the __builtin_clear_padding intrinsic
> before they are used in comparisons. This requires that any stores
> also sanitize the incoming value.
>
> Signed-off-by: Thomas Rodgers 
>
> libstdc++-v3/ChangeLog:
>
> * include/std/atomic (atomic::atomic(_Tp)): Clear padding for
> __cplusplus > 201703L.
> (atomic::store()): Clear padding.
> (atomic::exchange()): Likewise.
> (atomic::compare_exchange_weak()): Likewise.
> (atomic::compare_exchange_strong()): Likewise.
> * include/bits/atomic_base.h (__atomic_impl::__clear_padding()):
> New function.
> (__atomic_impl::__maybe_has_padding()): Likewise.
> (__atomic_impl::__compare_exchange()): Likewise.
> (__atomic_impl::compare_exchange_weak()): Delegate to
> __compare_exchange().
> (__atomic_impl::compare_exchange_strong()): Likewise.
> * testsuite/29_atomics/atomic/compare_exchange_padding.cc: New
> test.
> * testsuite/28_atomics/atomic_ref/compare_exchange_padding.cc:
> Likewise.
>
> --- a/libstdc++-v3/include/bits/atomic_base.h.jj2022-05-16
> 09:46:02.361059682 +0200
> +++ b/libstdc++-v3/include/bits/atomic_base.h   2022-08-25
> 12:06:13.758883172 +0200
> @@ -954,6 +954,87 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>/// @endcond
>
> +  // Implementation details of atomic padding handling
> +  namespace __atomic_impl
> +  {
> +template
> +  _GLIBCXX_ALWAYS_INLINE _Tp*
> +  __clear_padding(_Tp& __val) noexcept
> +  {
> +   auto* __ptr = std::__addressof(__val);
> +#if __has_builtin(__builtin_clear_padding)
> +   __builtin_clear_padding(std::__addressof(__val));
> +#endif
> +   return __ptr;
> +  }
> +
> +template
> +  constexpr bool
> +  __maybe_has_padding()
> +  {
> +#if ! __has_builtin(__builtin_clear_padding)
> +   return false;
> +#elif __has_builtin(__has_unique_object_representations)
> +   return !__has_unique_object_representations(_Tp)
> + && !is_floating_point<_Tp>::value;
> +#else
> +   return true;
> +#endif
> +  }
> +
> +template
> +  _GLIBCXX_ALWAYS_INLINE bool
> +  __compare_exchange(_Tp& __val, _Tp& __e, _Tp& __i, bool __weak,
> +memory_order __s, memory_order __f) noexcept
> +  {
> +   __glibcxx_assert(__is_valid_cmpexch_failure_order(__f));
> +
> +   if _GLIBCXX17_CONSTEXPR

Re: [PATCH 2/2] allow constant splitter run in split1 pass

2022-09-01 Thread Segher Boessenkool

Hi!

On Thu, Sep 01, 2022 at 11:24:07AM +0800, Jiufu Guo wrote:
> Currently, these two splitters (touched in this patch) are using predicate
> `int_reg_operand_not_pseudo`, then they work in split2 pass after RA in
> most times, and can not run before RA.
> 
> It would not be a bad idea to allow these splitters before RA.  Then more
> passes (between split1 and split2) could optimize the emitted instructions.

The splitters can be used earlier even.  For example, often combine will
use them.

> And if splitting before RA, for these constant splitters, we may have more
> freedom to create pseudos to generate more parallel instructions.
> 
> For the example in the leading patch [PATCH 1/2]: pli+pli+rldimi would be
> better than pli+sldi+paddi.

Yes.  If you split after reload you have to do all local optimisations
(that would have been done in earlier passes) manually.  And all more
global ones (involving just one or two more insns already) are
essentially impossible to do.

Splitting after reload is necessary in some cases.  For example, all the
integer "dot" insns split to the base insn and an explicit compare, if
for some reason RA did not get cr0 here.  Importantly, this happens very
seldomly: RA knows it is two insns instead of one, and it chooses
accordingly.  Also it *has* to be after reload, it directly depends on
what RA chose to do.

Splitting dependent on if a VSR or a GPR (pair) was used is a losing
proposition.  It usually costs much more than it can gain.

> gcc/ChangeLog:
> 
>   * config/rs6000/rs6000.md (const splitter): Update predicate.

* config/rs6000/rs6000.md (splitter for set to and_mask constants):
Use int_reg_operand (instead of int_reg_operand_not_pseudo).
(splitter for multi-insn constant loads): Ditto.

You should mention the changes to *both* splitters.  For nameless
splitters it helps if you can describe it a bit.  This is hard, yes :-/

Okay for trunk like that.  Thanks!

Segher

RE: [EXTERNAL] Re: [PING][PATCH] Add instruction level discriminator support.

2022-09-01 Thread Eugene Rozenfeld via Gcc-patches

Jason,

I made another small change in addressing your feedback (attached).

Thanks,

Eugene

-----Original Message-----
From: Gcc-patches On Behalf Of Eugene Rozenfeld via Gcc-patches
Sent: Thursday, September 01, 2022 1:49 PM
To: Jason Merrill ; gcc-patches@gcc.gnu.org
Cc: Andi Kleen ; Jan Hubicka 
Subject: RE: [EXTERNAL] Re: [PING][PATCH] Add instruction level discriminator 
support.

Jason,

Thank you for your review. You are correct, we only need to check 
has_discriminator for the statement that's on the same line.
I updated the patch (attached).

Thanks,

Eugene

-----Original Message-----
From: Jason Merrill 
Sent: Thursday, August 18, 2022 6:55 PM
To: Eugene Rozenfeld ; gcc-patches@gcc.gnu.org
Cc: Andi Kleen ; Jan Hubicka 
Subject: [EXTERNAL] Re: [PING][PATCH] Add instruction level discriminator 
support.

On 8/3/22 17:25, Eugene Rozenfeld wrote:
> One more ping for this patch
> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596065.html
> 
> CC Jason since this changes discriminators emitted in dwarf.
> 
> Thanks,
> 
> Eugene
> 
> -----Original Message-----
> From: Eugene Rozenfeld
> Sent: Monday, June 27, 2022 12:45 PM
> To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan 
> Hubicka 
> Subject: RE: [PING][PATCH] Add instruction level discriminator support.
> 
> Another ping for
> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596065.html .
> 
> I got a review from Andi
> (https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596549.html)
> but I also need a review from someone who can approve the changes.
> 
> Thanks,
> 
> Eugene
> 
> -----Original Message-----
> From: Eugene Rozenfeld
> Sent: Friday, June 10, 2022 12:03 PM
> To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan 
> Hubicka 
> Subject: [PING][PATCH] Add instruction level discriminator support.
> 
> Hello,
> 
> I'd like to ping this patch: 
> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596065.html
> 
> Thanks,
> 
> Eugene
> 
> -Original Message-
> From: Gcc-patches
>  On Behalf Of 
> Eugene Rozenfeld via Gcc-patches
> Sent: Thursday, June 02, 2022 12:22 AM
> To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan 
> Hubicka 
> Subject: [EXTERNAL] [PATCH] Add instruction level discriminator support.
> 
> This is the first in a series of patches to enable discriminator support in 
> AutoFDO.
> 
> This patch switches to tracking discriminators per statement/instruction 
> instead of per basic block. Tracking per basic block was problematic since 
> not all statements in a basic block needed a discriminator and, also, later 
> optimizations could move statements between basic blocks making correlation 
> during AutoFDO compilation unreliable. Tracking per statement also allows us 
> to assign different discriminators to multiple function calls in the same 
> basic block. A subsequent patch will add that support.
> 
> The idea of this patch is based on commit 
> 4c311d95cf6d9519c3c20f641cc77af7df491fdf
> by Dehao Chen in vendors/google/heads/gcc-4_8 but uses a slightly different 
> approach. In Dehao's work special (normally unused) location ids and side 
> tables were used to keep track of locations with discriminators. Things have 
> changed since then and I don't think we have unused location ids anymore. 
> Instead, I made discriminators a part of ad-hoc locations.
> 
> The difference from Dehao's work also includes support for discriminator 
> reading/writing in lto

Re: [PATCH] i386: Fix conversion of move to/from AX_REG into xchg [PR106707]

2022-09-01 Thread H.J. Lu via Gcc-patches

On Thu, Sep 1, 2022 at 11:23 AM Uros Bizjak via Gcc-patches
 wrote:
>
> The conversion of a move pattern where both operands are AX_REG
> should be prevented.
>
> 2022-09-01  Uroš Bizjak  
>
> gcc/ChangeLog:
>
> PR target/106707
> * config/i386/i386.md (moves to/from AX_REG into xchg peephole2):
> Do not convert a move pattern where both operands are AX_REG.
>
> gcc/testsuite/ChangeLog:
>
> PR target/106707
> * gcc.target/i386/pr106707.c: New test.
>
> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>
> Pushed to master.
>
> Uros.

I am checking this in to replace long with long long for the 64-bit integer.

-- 
H.J.
From 01ca233f7a8ab683968d1ae2eb6e9f1049e86ad2 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Thu, 1 Sep 2022 15:18:18 -0700
Subject: [PATCH] i386: Replace long with long long for 64-bit integer

Replace long with long long for 64-bit integer since long may be 32
bits.

	PR target/106707
	* gcc.target/i386/pr106707.c (foo): Replace long with long long.
---
 gcc/testsuite/gcc.target/i386/pr106707.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/i386/pr106707.c b/gcc/testsuite/gcc.target/i386/pr106707.c
index a127ccd4679..2e8ebaafb33 100644
--- a/gcc/testsuite/gcc.target/i386/pr106707.c
+++ b/gcc/testsuite/gcc.target/i386/pr106707.c
@@ -10,9 +10,9 @@ unsigned x, y;
 V v;
 
 void
-foo (long a)
+foo (long long a)
 {
-  long l = a != x;
+  long long l = a != x;
   int i = __builtin_add_overflow_p (y * ii, 0, 0);
   V u = ii < x | v, w = x <= u < i & y <= x / ii;
   v = __builtin_shufflevector (v, w, 1, 2) + (V) l;
-- 
2.37.2
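
As a side note on why the type matters: the C and C++ standards only
guarantee that long is at least 32 bits wide, while long long is at least
64. With -m32 on x86 (and on LLP64 targets) long is 32 bits. A minimal
C++17 sketch of that guarantee follows; width_in_bits is a hypothetical
helper, not something from the test case.

```cpp
#include <climits>

// Hypothetical helper: width of an integer type in bits on this target.
template<typename T>
constexpr int width_in_bits()
{ return static_cast<int>(sizeof(T) * CHAR_BIT); }

// Guaranteed by the language on every conforming target.
static_assert(width_in_bits<long long>() >= 64,
              "long long is always at least 64 bits");

// Only 32 bits are guaranteed for long; it is 64 on LP64 targets but
// 32 with -m32 on x86, which is why the test uses long long.
static_assert(width_in_bits<long>() >= 32,
              "long is only guaranteed to be at least 32 bits");
```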

Re: [PATCH 1/2] Using pli(paddi) and rotate to build 64bit constants

2022-09-01 Thread Segher Boessenkool

Hi!

This patch is a clear improvement :-)

On Thu, Sep 01, 2022 at 11:24:00AM +0800, Jiufu Guo wrote:
> As mentioned in PR106550, since pli could support 34bits immediate, we could
> use less instructions(3insn would be ok) to build 64bits constant with pli.

> For example, for constant 0x020805006106003, we could generate it with:
> asm code1:
> pli 9,101736451 (0x6106003)
> sldi 9,9,32
> paddi 9,9, 213 (0x0208050)

3 insns, 2 insns dependent on the previous, each.

> or asm code2:
> pli 10, 213
> pli 9, 101736451
> rldimi 9, 10, 32, 0

3 insns, 1 insn dependent on both others.
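
To make the difference between the two sequences concrete, here is a
minimal C++14 sketch of the arithmetic each one performs. The function
names are made up for illustration, and it assumes the serial sequence
loads the upper 32 bits first, as the comment in the patch describes. It
models only the data flow, not what rs6000_emit_set_long_const emits.

```cpp
#include <cstdint>

// Serial form (pli + sldi + paddi): each step needs the previous result.
constexpr std::uint64_t shift_then_add(std::uint32_t high, std::uint32_t low)
{
  std::uint64_t r = high;   // pli   r, high32
  r <<= 32;                 // sldi  r, r, 32
  r += low;                 // paddi r, r, low32
  return r;
}

// Parallel form (pli + pli + rldimi): the two loads are independent of
// each other; only the final insert depends on both.
constexpr std::uint64_t combine_halves(std::uint32_t high, std::uint32_t low)
{
  std::uint64_t h = high;   // pli    h, high32
  std::uint64_t l = low;    // pli    l, low32
  return (h << 32) | l;     // rldimi l, h, 32, 0
}

// One of the constants from the benchmark in the quoted mail.
static_assert(shift_then_add(0x23518470u, 0x27482577u)
              == 0x2351847027482577ull, "serial form");
static_assert(combine_halves(0x23518470u, 0x27482577u)
              == 0x2351847027482577ull, "parallel form");
```

Both forms compute the same value; the second simply has a shorter
dependency chain, which is what the scheduler can exploit.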

> Testing with simple cases as below, run them a lot of times:
> f1.c
> long __attribute__ ((noinline)) foo (long *arg,long *,long*)
> {
>   *arg = 0x2351847027482577;
> }
> 5insns: base
> pli+sldi+paddi: similar -0.08%
> pli+pli+rldimi: faster +0.66%

This mostly tests how well this micro-benchmark is scheduled.  More time
is spent in the looping and function calls (not shown)!

> f2.c
> long __attribute__ ((noinline)) foo (long *arg, long *arg2, long *arg3)
> {
>   *arg = 0x2351847027482577;
>   *arg2 = 0x3257845024384680;
>   *arg3 = 0x1245abcef9240dec;
> }
> 5insns: base
> pli+sldi+paddi: faster +1.35%
> pli+pli+rldimi: faster +5.49%
> 
> f2.c would be more meaningful.  Because 'sched passes' are effective for
> f2.c, but 'scheds' do less thing for f1.c.

It is still too small an example to mean much without looking at a
pipeview, or at the very least perf.  But the results show a solid
improvement as expected ;-)

> gcc/ChangeLog:
>   PR target/106550
>   * config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Add 'pli' for
>   constant building.

"Use pli." ?

> gcc/testsuite/ChangeLog:
>   PR target/106550
>   * gcc.target/powerpc/pr106550.c: New test.

> +  else if (TARGET_PREFIXED)
> +{
> +  /* pli 9,high32 + pli 10,low32 + rldimi 9,10,32,0.  */

But not just 9 and 10.  Use A and B or X and Y or H and L or something
like that?

The comment goes...

> +  if (can_create_pseudo_p ())
> + {

... here.

> +   temp = gen_reg_rtx (DImode);
> +   rtx temp1 = gen_reg_rtx (DImode);
> +   emit_move_insn (copy_rtx (temp), GEN_INT ((ud4 << 16) | ud3));
> +   emit_move_insn (copy_rtx (temp1), GEN_INT ((ud2 << 16) | ud1));
> +
> +   emit_insn (gen_rotldi3_insert_3 (dest, temp, GEN_INT (32), temp1,
> +GEN_INT (0x)));
> + }
> +

No blank line here please.

> +  /* pli 9,high32 + sldi 9,32 + paddi 9,9,low32.  */
> +  else
> + {

The comment goes here, in the block it refers to.  Comments for a block
are the first thing *in* the block.

> +   emit_move_insn (copy_rtx (dest), GEN_INT ((ud4 << 16) | ud3));
> +
> +   emit_move_insn (copy_rtx (dest),
> +   gen_rtx_ASHIFT (DImode, copy_rtx (dest),
> +   GEN_INT (32)));
> +
> +   bool can_use_paddi = REGNO (dest) != FIRST_GPR_REGNO;

There should be a test that we do the right thing (or *a* right thing,
anyway; a working thing; but hopefully a reasonably fast thing) for
!can_use_paddi.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr106550.c
> @@ -0,0 +1,14 @@
> +/* PR target/106550 */
> +/* { dg-options "-O2 -std=c99 -mdejagnu-cpu=power10" } */
> +
> +void
> +foo (unsigned long long *a)
> +{
> +  *a++ = 0x020805006106003;
> +  *a++ = 0x2351847027482577;  
> +}
> +
> +/* 3 insns for each constant: pli+sldi+paddi or pli+pli+rldimi.
> +   And 3 additional insns: std+std+blr: 9 insns totally.  */
> +/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 9 } } */

Also test the expected insns separately please?  The std's (just with
\mstd so it will catch all variations as well), the blr, the pli's and
the rldimi etc.?

We should also test all special cases, especially those that do not
happen all over the place.  For the common ones we will notice easily
enough when something is broken, but unlikely cases can take years to
show up.

Okay for trunk with the formatting fixed.  Thank you!

Segher

Re: [committed] c: C2x removal of unprototyped functions

2022-09-01 Thread Jeff Law via Gcc-patches





On 9/1/2022 1:12 PM, Joseph Myers wrote:

C2x has completely removed unprototyped functions, so that () now
means the same as (void) in both function declarations and
definitions, where previously that change had been made for
definitions only.  Implement this accordingly.

This is a change where GNU/Linux distribution builders might wish to
try builds with a -std=gnu2x default to start early on getting old
code fixed that still has () declarations for functions taking
arguments, in advance of GCC moving to -std=gnu2x as default maybe in
GCC 14 or 15; I don't know how much such code is likely to be in
current use.

Happy to see this happen (dropping unprototyped functions).  IIRC older 
versions of autoconf are going to generate code that runs afoul of this 
problem as well.
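
For illustration: C++ has long had the rule that C2x now adopts, namely
that an empty parameter list declares a function taking no arguments. A
minimal C++17 sketch of that equivalence (the function names are
hypothetical):

```cpp
#include <type_traits>

int no_params();            // empty parameter list
int no_params_void(void);   // explicit (void)

// Both declarations have exactly the same function type, so a call such
// as no_params(1) is rejected rather than silently accepted.
static_assert(std::is_same_v<decltype(no_params), decltype(no_params_void)>,
              "() and (void) declare the same type");
static_assert(std::is_same_v<decltype(no_params), int(void)>,
              "the type is 'function of no arguments returning int'");
```

Old C code that declares int f(); and then calls f(1) is the kind of code
that would need fixing under -std=gnu2x.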


jeff

[PATCH][committed]AArch64 Fix bootstrap failure due to dump_printf_loc format attribute uses [PR106782]

2022-09-01 Thread Tamar Christina via Gcc-patches

Hi All,

This fixes the bootstrap failure on AArch64 following -Werror=format by
correcting the print format modifiers in the backend.
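
For readers outside the GCC tree, the same kind of fix can be sketched
with the standard PRIu64 macro, which plays a role analogous to
HOST_WIDE_INT_PRINT_UNSIGNED: it expands to a string literal that is
concatenated into the format string so the conversion always matches the
argument type. format_insn_count and demo are hypothetical names, and the
64-bit argument type is an assumption made for the example.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Format a 64-bit unsigned count.  Writing "%d" here would be a format
// mismatch (and an error under -Werror=format); the PRIu64 macro supplies
// the right length modifier for std::uint64_t on every host.
std::string format_insn_count(std::uint64_t count)
{
  char buf[96];
  std::snprintf(buf, sizeof buf,
                "Number of insns in unrolled loop = %" PRIu64 "\n", count);
  return buf;
}

// Small usage check.
void demo()
{
  assert(format_insn_count(42) == "Number of insns in unrolled loop = 42\n");
}
```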

Bootstrapped on aarch64-none-linux-gnu and no issues.

Committed as obvious.

Thanks,
Tamar

gcc/ChangeLog:

PR other/106782
* config/aarch64/aarch64.cc
(aarch64_vector_costs::prefer_unrolled_loop): Replace %u with
HOST_WIDE_INT_PRINT_UNSIGNED.

--- inline copy of patch -- 
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 
4b486aeea90ea2afb9cdd96a4dbe15c5bb2abd7a..f199e77cd4296cd3556641051072dabc9f5e51fa
 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -16671,7 +16671,8 @@ aarch64_vector_costs::prefer_unrolled_loop () const
 
   if (dump_enabled_p ())
 dump_printf_loc (MSG_NOTE, vect_location, "Number of insns in"
-" unrolled Advanced SIMD loop = %d\n",
+" unrolled Advanced SIMD loop = "
+HOST_WIDE_INT_PRINT_UNSIGNED "\n",
 m_unrolled_advsimd_stmts);
 
   /* The balance here is tricky.  On the one hand, we can't be sure whether


[PATCH] Add __builtin_iseqsig()

2022-09-01 Thread FX via Gcc-patches

Attached patch adds __builtin_iseqsig() to the middle-end and C family 
front-ends.
Testing does not currently check whether the signaling part works, because with 
optimisation it actually does not (preexisting compiler bug: 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106805)
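
For readers unfamiliar with iseqsig: it compares for equality like ==, but
is meant to raise the "invalid" floating-point exception when either
operand is a NaN. A rough sketch of the intended result, assuming IEEE 754
semantics in which <= and >= are signaling comparisons; iseqsig_sketch is
a hypothetical name and not how the built-in is implemented:

```cpp
// Equality expressed through the two signaling relational comparisons.
// For non-NaN operands this is true exactly when x == y.  With a NaN
// operand it is false, and on an IEEE 754 implementation the comparison
// is expected to raise FE_INVALID (subject to the optimisation bug
// mentioned above).
constexpr bool iseqsig_sketch(double x, double y)
{
  return (x <= y) && (x >= y);
}

static_assert(iseqsig_sketch(1.0, 1.0), "equal values compare equal");
static_assert(!iseqsig_sketch(1.0, 2.0), "different values do not");
```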

Bootstrapped and regtested on x86_64-linux.
OK to commit?

(I’m not very skilled at middle-end hacking, so I’m sure there will be 
modifications to make.)

FX


0001-Add-__builtin_iseqsig.patch
Description: Binary data

RE: [EXTERNAL] Re: [PING][PATCH] Add instruction level discriminator support.

2022-09-01 Thread Eugene Rozenfeld via Gcc-patches

Jason,

Thank you for your review. You are correct, we only need to check 
has_discriminator for the statement that's on the same line.
I updated the patch (attached).

Thanks,

Eugene

-Original Message-
From: Jason Merrill  
Sent: Thursday, August 18, 2022 6:55 PM
To: Eugene Rozenfeld ; gcc-patches@gcc.gnu.org
Cc: Andi Kleen ; Jan Hubicka 
Subject: [EXTERNAL] Re: [PING][PATCH] Add instruction level discriminator 
support.

On 8/3/22 17:25, Eugene Rozenfeld wrote:
> One more ping for this patch 
> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596065.html
> 
> CC Jason since this changes discriminators emitted in dwarf.
> 
> Thanks,
> 
> Eugene
> 
> -Original Message-
> From: Eugene Rozenfeld
> Sent: Monday, June 27, 2022 12:45 PM
> To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan 
> Hubicka 
> Subject: RE: [PING][PATCH] Add instruction level discriminator support.
> 
> Another ping for 
> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596065.html
>  .
> 
> I got a review from Andi 
> (https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596549.html)
>  but I also need a review from someone who can approve the changes.
> 
> Thanks,
> 
> Eugene
> 
> -Original Message-
> From: Eugene Rozenfeld
> Sent: Friday, June 10, 2022 12:03 PM
> To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan 
> Hubicka 
> Subject: [PING][PATCH] Add instruction level discriminator support.
> 
> Hello,
> 
> I'd like to ping this patch: 
> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596065.html
> 
> Thanks,
> 
> Eugene
> 
> -Original Message-
> From: Gcc-patches 
>  On Behalf Of 
> Eugene Rozenfeld via Gcc-patches
> Sent: Thursday, June 02, 2022 12:22 AM
> To: gcc-patches@gcc.gnu.org; Andi Kleen ; Jan 
> Hubicka 
> Subject: [EXTERNAL] [PATCH] Add instruction level discriminator support.
> 
> This is the first in a series of patches to enable discriminator support in 
> AutoFDO.
> 
> This patch switches to tracking discriminators per statement/instruction 
> instead of per basic block. Tracking per basic block was problematic since 
> not all statements in a basic block needed a discriminator and, also, later 
> optimizations could move statements between basic blocks making correlation 
> during AutoFDO compilation unreliable. Tracking per statement also allows us 
> to assign different discriminators to multiple function calls in the same 
> basic block. A subsequent patch will add that support.
> 
> The idea of this patch is based on commit 
> 4c311d95cf6d9519c3c20f641cc77af7df491fdf
> by Dehao Chen in vendors/google/heads/gcc-4_8 but uses a slightly different 
> approach. In Dehao's work special (normally unused) location ids and side 
> tables were used to keep track of locations with discriminators. Things have 
> changed since then and I don't think we have unused location ids anymore. 
> Instead, I made discriminators a part of ad-hoc locations.
> 
> The difference from Dehao's work also includes support for discriminator 
> reading/writing in lto streaming and in modules.
> 
> Tested on x86_64-pc-linux-gnu.

> @@ -1190,12 +1217,12 @@ assign_discriminators (void)
> || (last && same_line_p (locus, _e,
>  gimple_location (last
>   {
> -   if (e->dest->discriminator != 0 && bb->discriminator == 0)
> - bb->discriminator
> -   =

Re: [PATCH] c++, v2: Implement C++23 P2071R2 - Named universal character escapes [PR106648]

2022-09-01 Thread Jakub Jelinek via Gcc-patches

On Thu, Sep 01, 2022 at 03:00:28PM -0400, Jason Merrill wrote:
> > Apparently clang uses -Wunicode option to cover these, but unfortunately
> > they don't bother to document it (nor almost any other warning option),
> > so it is unclear what else exactly it covers.  Plus a question is how
> > we should document that option for GCC...
> 
> We might as well use the same flag name, and document it to mean what it
> currently means for GCC.

Ok, will work on that tomorrow.

> > @@ -1489,8 +1507,16 @@ _cpp_valid_ucn (cpp_reader *pfile, const
> >   if (str < limit && *str == '}')
> > {
> > - if (name == str && identifier_pos)
> > + if (identifier_pos && (name == str || !strict))
> > {
> > + if (name == str)
> > +   cpp_warning (pfile, CPP_W_NONE,
> > +"empty named universal character escape "
> > +"sequence; treating it as separate tokens");
> > + else
> > +   cpp_warning (pfile, CPP_W_NONE,
> > +"incomplete named universal character escape "
> > +"sequence; treating it as separate tokens");
> 
> It looks like this is handling \N{abc}, for which "incomplete" seems like
> the wrong description; it's complete, just wrong, and the diagnostic doesn't
> help correct it.

The point is to make it more consistent with the \N{X.1} handling.
The grammar is clear that only upper case letters + digits + space + hyphen
can appear in between \N{ and }.  So, both of those cases IMHO should be
handled the same.  The !strict case is if there is at least one lower case
letter or underscore but no other characters than letters + digits + space +
hyphen + underscore, we then find the terminating } and inside of
string/character literals want to do the UAX44LM2 algorithm suggestions.
But for X.1 in literals we don't even look for }, we just emit the
  cpp_error (pfile, CPP_DL_ERROR,
 "'\\N{' not terminated with '}' after %.*s",
 (int) (str - base), base);
diagnostic, which prints "after X".
For the identifier_pos case, both the !strict and *str != '}' cases
are the same reason why it is treated as separate tokens, not because
the name is not valid, but because it contains invalid characters.
So perhaps for the identifier_pos !strict and *str != '}' cases
we could emit a warning with the same wording as above (but, so that
we stop for !strict on the first lowercase or _ char, just break instead
of setting strict = true if identifier_pos).
Or we could emit such a warning and a note that would clarify that only
upper case letters, digits, space or hyphen are allowed there?
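
To spell out the three cases being discussed, here is a small C++17
sketch that classifies the text between \N{ and }. It is only an
illustration of the character classes named above, with hypothetical
names throughout, and is not the logic in _cpp_valid_ucn.

```cpp
#include <string_view>

enum class NameForm { strict, loose, invalid };

// Characters the grammar allows: upper case letters, digits, space, hyphen.
constexpr bool is_strict_char(char c)
{
  return (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')
         || c == ' ' || c == '-';
}

// Loose matching additionally tolerates lower case letters and underscore.
constexpr bool is_loose_char(char c)
{
  return is_strict_char(c) || (c >= 'a' && c <= 'z') || c == '_';
}

constexpr NameForm classify_name(std::string_view name)
{
  if (name.empty())
    return NameForm::invalid;        // \N{} has no name at all
  NameForm form = NameForm::strict;
  for (char c : name)
    {
      if (is_strict_char(c))
        continue;
      if (!is_loose_char(c))
        return NameForm::invalid;    // e.g. the '.' in X.1
      form = NameForm::loose;        // saw a lower case letter or '_'
    }
  return form;
}

static_assert(classify_name("LATIN SMALL LETTER A") == NameForm::strict, "");
static_assert(classify_name("latin_small letter a") == NameForm::loose, "");
static_assert(classify_name("X.1") == NameForm::invalid, "");
```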

Jakub

[pushed] c++: Remove unused declaration

2022-09-01 Thread Marek Polacek via Gcc-patches

This declaration was added in r260905 but the function was never
defined.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* cp-tree.h (maybe_strip_ref_conversion): Remove.
---
 gcc/cp/cp-tree.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index c897da204fe..c45f843825e 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7598,7 +7598,6 @@ extern tree force_paren_expr  (tree, 
bool = false);
 inline tree force_paren_expr_uneval(tree t)
 { return force_paren_expr (t, true); }
 extern tree maybe_undo_parenthesized_ref   (tree);
-extern tree maybe_strip_ref_conversion (tree);
 extern tree finish_non_static_data_member   (tree, tree, tree,
 tsubst_flags_t = 
tf_warning_or_error);
 extern tree begin_stmt_expr(void);

base-commit: 42e489088bf53845c648e512449b72dbd3c7169b
-- 
2.37.2

[PATCH] btf: do not skip emitting void variables [PR106773]

2022-09-01 Thread David Faust via Gcc-patches

The eBPF loader expects to find BTF_KIND_VAR records for references to
extern const void symbols. We were mistakenly identifying these as
unsupported types, and as a result skipping emitting VAR records for
them.

Tested on bpf-unknown-none and x86_64, no known regressions.
OK?

Thanks.

gcc/ChangeLog:

PR target/106773
* btfout.cc (btf_dvd_emit_preprocess_cb): Do not skip emitting
variables which refer to void types.

gcc/testsuite/ChangeLog:

PR target/106773
* gcc.dg/debug/btf/btf-pr106773.c: New test.
---
 gcc/btfout.cc |  2 +-
 gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c | 21 +++
 2 files changed, 22 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c

diff --git a/gcc/btfout.cc b/gcc/btfout.cc
index 997a33fa089..37ec662c190 100644
--- a/gcc/btfout.cc
+++ b/gcc/btfout.cc
@@ -430,7 +430,7 @@ btf_dvd_emit_preprocess_cb (ctf_dvdef_ref *slot, 
ctf_container_ref arg_ctfc)
   ctf_dvdef_ref var = (ctf_dvdef_ref) * slot;
 
   /* Do not add variables which refer to unsupported types.  */
-  if (btf_removed_type_p (var->dvd_type))
+  if (!voids.contains (var->dvd_type) && btf_removed_type_p (var->dvd_type))
 return 1;
 
   arg_ctfc->ctfc_vars_list[num_vars_added] = var;
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c 
b/gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c
new file mode 100644
index 000..4de15f76546
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-pr106773.c
@@ -0,0 +1,21 @@
+/* Test BTF generation for extern const void symbols.
+   BTF_KIND_VAR records should be emitted for such symbols if they are used.  
*/
+
+/* { dg-do compile } */
+/* { dg-options "-O0 -gbtf -dA" } */
+
+/* Expect 1 variable record only for foo.  */
+/* { dg-final { scan-assembler-times "\[\t \]0xe00\[\t 
\]+\[^\n\]*btv_info" 1 } } */
+/* { dg-final { scan-assembler-times "\[\t \]0x1\[\t \]+\[^\n\]*btv_linkage" 1 
} } */
+
+/* { dg-final { scan-assembler-times "ascii \"foo.0\"\[\t 
\]+\[^\n\]*btf_string" 1 } } */
+
+extern const void foo;
+extern const void bar;
+
+unsigned long func () {
+  unsigned long x = (unsigned long) &foo;
+
+  return x;
+}
+
-- 
2.36.1

[committed] libstdc++: Add 'typename' for Clang compatibility

2022-09-01 Thread Jonathan Wakely via Gcc-patches

Tested x86_64-linux, pushed to trunk.

-- >8 --

Clang doesn't yet implement the C++20 change that makes 'typename'
optional here.
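
As a reminder of the rule in question: a qualified name that depends on a
template parameter and names a type traditionally needs a leading
'typename'. C++20 (P0634R3) made the keyword optional in several contexts,
including alias-declarations, but writing it stays valid everywhere. A
minimal C++17 sketch with a hypothetical trait:

```cpp
#include <iterator>
#include <type_traits>

template<typename It>
struct category_of
{
  // 'typename' tells the compiler that the dependent qualified-id names a
  // type.  C++20 allows omitting it in this position, but compilers that
  // have not implemented that change reject the shorter form.
  using type = typename std::iterator_traits<It>::iterator_category;
};

static_assert(std::is_same_v<category_of<int*>::type,
                             std::random_access_iterator_tag>,
              "pointers are random access iterators");
```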

libstdc++-v3/ChangeLog:

* include/std/ranges (adjacent_transform_view::_Iterator): Add
typename keyword before dependent qualified-id.
---
 libstdc++-v3/include/std/ranges | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index dad1e4c9f93..2b5cb0531f0 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -,7 +,7 @@ namespace views::__adaptor
   using __detail::__unarize;
   using _Res = invoke_result_t<__unarize<__maybe_const_t<_Const, _Fp>&, 
_Nm>,
   range_reference_t<_Base>>;
-  using _Cat = iterator_traits>::iterator_category;
+  using _Cat = typename 
iterator_traits>::iterator_category;
   if constexpr (!is_lvalue_reference_v<_Res>)
return input_iterator_tag{};
   else if constexpr (derived_from<_Cat, random_access_iterator_tag>)
-- 
2.37.2

[committed] libstdc++: Optimize std::decay

2022-09-01 Thread Jonathan Wakely via Gcc-patches

Tested powerpc64le-linux, pushed to trunk.

-- >8 --

Define partial specializations of std::decay and its __decay_selector
helper so that remove_reference, is_array and is_function are not
instantiated for every type, and remove_extent is not instantiated for
arrays.
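
The shape of this change can be sketched outside the library as follows.
This is a simplified C++17 illustration in a hypothetical namespace
sketch, using the public std:: traits rather than the library's internal
helpers, so it should be read as an approximation of the approach rather
than the committed code.

```cpp
#include <cstddef>
#include <type_traits>

namespace sketch
{
  // Non-array case: adding const to a function type has no effect, so
  // is_const<const U> is false exactly for functions.
  template<typename U>
    struct decay_selector
    : std::conditional_t<std::is_const<const U>::value,
                         std::remove_cv<U>,      // objects: drop cv
                         std::add_pointer<U>>    // functions: decay to pointer
    { };

  // Arrays are matched directly, so no remove_extent is needed.
  template<typename U, std::size_t N>
    struct decay_selector<U[N]>
    { using type = U*; };

  template<typename U>
    struct decay_selector<U[]>
    { using type = U*; };

  // References are stripped by partial specialization, not remove_reference.
  template<typename T>
    struct decay
    { using type = typename decay_selector<T>::type; };

  template<typename T>
    struct decay<T&>
    { using type = typename decay_selector<T>::type; };

  template<typename T>
    struct decay<T&&>
    { using type = typename decay_selector<T>::type; };
}

static_assert(std::is_same_v<sketch::decay<const int&>::type, int>);
static_assert(std::is_same_v<sketch::decay<int[3]>::type, int*>);
static_assert(std::is_same_v<sketch::decay<int(&)(char)>::type, int(*)(char)>);
```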

libstdc++-v3/ChangeLog:

* include/std/type_traits (__decay_selector): Add partial
specializations for array types. Only check for function types
when not dealing with an array.
(decay): Add partial specializations for reference types.
---
 libstdc++-v3/include/std/type_traits | 39 ++--
 1 file changed, 20 insertions(+), 19 deletions(-)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index e4b9b59ce08..639c351df8a 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -2203,34 +2203,35 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   // Decay trait for arrays and functions, used for perfect forwarding
   // in make_pair, make_tuple, etc.
-  template::value,
-  bool _IsFunction = is_function<_Up>::value>
-struct __decay_selector;
-
-  // NB: DR 705.
   template
-struct __decay_selector<_Up, false, false>
-{ typedef __remove_cv_t<_Up> __type; };
+struct __decay_selector
+: __conditional_t::value, // false for functions
+ remove_cv<_Up>, // N.B. DR 705.
+ add_pointer<_Up>>   // function decays to pointer
+{ };
+
+  template
+struct __decay_selector<_Up[_Nm]>
+{ using type = _Up*; };
 
   template
-struct __decay_selector<_Up, true, false>
-{ typedef typename remove_extent<_Up>::type* __type; };
+struct __decay_selector<_Up[]>
+{ using type = _Up*; };
 
-  template
-struct __decay_selector<_Up, false, true>
-{ typedef typename add_pointer<_Up>::type __type; };
   /// @endcond
 
   /// decay
   template
-class decay
-{
-  typedef typename remove_reference<_Tp>::type __remove_type;
+struct decay
+{ using type = typename __decay_selector<_Tp>::type; };
 
-public:
-  typedef typename __decay_selector<__remove_type>::__type type;
-};
+  template
+struct decay<_Tp&>
+{ using type = typename __decay_selector<_Tp>::type; };
+
+  template
+struct decay<_Tp&&>
+{ using type = typename __decay_selector<_Tp>::type; };
 
   /// @cond undocumented
 
-- 
2.37.2

[committed] libstdc++: Remove __is_referenceable helper

2022-09-01 Thread Jonathan Wakely via Gcc-patches

Tested powerpc64le-linux, pushed to trunk.

-- >8 --

We only use the __is_referenceable helper in three places now:
add_pointer, add_lvalue_reference, and add_rvalue_reference. But lots of
other traits depend on add_[lr]value_reference, and decay depends on
add_pointer, so removing the instantiation of __is_referenceable helps
compile all those other traits slightly faster.

We can just use void_t to check for a referenceable type in the
add_[lr]value_reference traits.

Then we can specialize add_pointer for reference types, so that we don't
need to use remove_reference, and then use void_t for all
non-reference types to detect when we can form a pointer to the type.
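
A standalone sketch of the void_t technique described above, written
against std::void_t from C++17 and placed in a hypothetical namespace
sketch. It mirrors the idea of the patch but is not the library code
itself.

```cpp
#include <type_traits>

namespace sketch
{
  // Primary template: used when T& cannot be formed (void, or a function
  // type with cv- or ref-qualifiers).
  template<typename T, typename = void>
    struct add_lvalue_reference_helper
    { using type = T; };

  // Chosen whenever T& is a valid type.
  template<typename T>
    struct add_lvalue_reference_helper<T, std::void_t<T&>>
    { using type = T&; };

  // Same detection idiom for pointers.
  template<typename T, typename = void>
    struct add_pointer_helper
    { using type = T; };

  template<typename T>
    struct add_pointer_helper<T, std::void_t<T*>>
    { using type = T*; };

  template<typename T>
    struct add_pointer : add_pointer_helper<T> { };

  // Reference types are handled directly, so remove_reference is not needed.
  template<typename T>
    struct add_pointer<T&> { using type = T*; };

  template<typename T>
    struct add_pointer<T&&> { using type = T*; };
}

static_assert(std::is_same_v<sketch::add_lvalue_reference_helper<int>::type, int&>);
static_assert(std::is_same_v<sketch::add_lvalue_reference_helper<void>::type, void>);
static_assert(std::is_same_v<sketch::add_pointer<int&>::type, int*>);
static_assert(std::is_same_v<sketch::add_pointer<void>::type, void*>);
```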

libstdc++-v3/ChangeLog:

* include/std/type_traits (__is_referenceable): Remove.
(__add_lvalue_reference_helper, __add_rvalue_reference_helper):
Use __void_t instead of __is_referenceable.
(__add_pointer_helper): Likewise.
(add_pointer): Add partial specializations for reference types.
---
 libstdc++-v3/include/std/type_traits | 37 
 1 file changed, 16 insertions(+), 21 deletions(-)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 3041ac3c941..8b11f31741b 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -712,18 +712,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   // __void_t (std::void_t for C++11)
   template using __void_t = void;
-
-  // Utility to detect referenceable types ([defns.referenceable]).
-
-  template
-struct __is_referenceable
-: public false_type
-{ };
-
-  template
-struct __is_referenceable<_Tp, __void_t<_Tp&>>
-: public true_type
-{ };
   /// @endcond
 
   // Type properties.
@@ -1024,12 +1012,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 };
 
   /// @cond undocumented
-  template::value>
+  template
 struct __add_lvalue_reference_helper
 { using type = _Tp; };
 
   template
-struct __add_lvalue_reference_helper<_Tp, true>
+struct __add_lvalue_reference_helper<_Tp, __void_t<_Tp&>>
 { using type = _Tp&; };
 
   template
@@ -1046,12 +1034,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 };
 
   /// @cond undocumented
-  template::value>
+  template
 struct __add_rvalue_reference_helper
 { using type = _Tp; };
 
   template
-struct __add_rvalue_reference_helper<_Tp, true>
+struct __add_rvalue_reference_helper<_Tp, __void_t<_Tp&&>>
 { using type = _Tp&&; };
 
   template
@@ -1971,14 +1959,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : public __remove_pointer_helper<_Tp, __remove_cv_t<_Tp>>
 { };
 
-  template,
- is_void<_Tp>>::value>
+  template
 struct __add_pointer_helper
-{ typedef _Tp type; };
+{ using type = _Tp; };
 
   template
-struct __add_pointer_helper<_Tp, true>
-{ typedef typename remove_reference<_Tp>::type* type; };
+struct __add_pointer_helper<_Tp, __void_t<_Tp*>>
+{ using type = _Tp*; };
 
   /// add_pointer
   template
@@ -1986,6 +1973,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : public __add_pointer_helper<_Tp>
 { };
 
+  template
+struct add_pointer<_Tp&>
+{ using type = _Tp*; };
+
+  template
+struct add_pointer<_Tp&&>
+{ using type = _Tp*; };
+
 #if __cplusplus > 201103L
   /// Alias template for remove_pointer
   template
-- 
2.37.2

[committed] libstdc++: Add specializations for some variable templates

2022-09-01 Thread Jonathan Wakely via Gcc-patches

Tested powerpc64le-linux, pushed to trunk.

-- >8 --

This avoids having to instantiate a class template when we can detect
the true cases easily with a partial specialization.
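
A minimal C++17 sketch of the technique, in a hypothetical namespace
sketch: the primary variable template is false and a partial
specialization matches the true case, so no class template has to be
instantiated to answer the query.

```cpp
namespace sketch
{
  template<typename T>
    inline constexpr bool is_lvalue_reference_v = false;
  template<typename T>
    inline constexpr bool is_lvalue_reference_v<T&> = true;

  template<typename T>
    inline constexpr bool is_const_v = false;
  template<typename T>
    inline constexpr bool is_const_v<const T> = true;
}

static_assert(sketch::is_lvalue_reference_v<int&>);
static_assert(!sketch::is_lvalue_reference_v<int&&>);
static_assert(sketch::is_const_v<int* const>);   // the pointer itself is const
static_assert(!sketch::is_const_v<const int*>);  // only the pointee is const
```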

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_lvalue_reference_v)
(is_rvalue_reference_v, is_reference_v, is_const_v)
(is_volatile_v): Define using partial specializations instead
of instantiating class templates.
---
 libstdc++-v3/include/std/type_traits | 24 +---
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 52cca8bf3af..e4b9b59ce08 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -3153,11 +3153,13 @@ template 
 template 
   inline constexpr bool is_pointer_v = is_pointer<_Tp>::value;
 template 
-  inline constexpr bool is_lvalue_reference_v =
-is_lvalue_reference<_Tp>::value;
+  inline constexpr bool is_lvalue_reference_v = false;
 template 
-  inline constexpr bool is_rvalue_reference_v =
-is_rvalue_reference<_Tp>::value;
+  inline constexpr bool is_lvalue_reference_v<_Tp&> = true;
+template 
+  inline constexpr bool is_rvalue_reference_v = false;
+template 
+  inline constexpr bool is_rvalue_reference_v<_Tp&&> = true;
 template 
   inline constexpr bool is_member_object_pointer_v =
 is_member_object_pointer<_Tp>::value;
@@ -3173,7 +3175,11 @@ template 
 template 
   inline constexpr bool is_function_v = is_function<_Tp>::value;
 template 
-  inline constexpr bool is_reference_v = is_reference<_Tp>::value;
+  inline constexpr bool is_reference_v = false;
+template 
+  inline constexpr bool is_reference_v<_Tp&> = true;
+template 
+  inline constexpr bool is_reference_v<_Tp&&> = true;
 template 
   inline constexpr bool is_arithmetic_v = is_arithmetic<_Tp>::value;
 template 
@@ -3187,9 +3193,13 @@ template 
 template 
   inline constexpr bool is_member_pointer_v = is_member_pointer<_Tp>::value;
 template 
-  inline constexpr bool is_const_v = is_const<_Tp>::value;
+  inline constexpr bool is_const_v = false;
 template 
-  inline constexpr bool is_volatile_v = is_volatile<_Tp>::value;
+  inline constexpr bool is_const_v = true;
+template 
+  inline constexpr bool is_volatile_v = false;
+template 
+  inline constexpr bool is_volatile_v = true;
 template 
   inline constexpr bool is_trivial_v = is_trivial<_Tp>::value;
 template 
-- 
2.37.2

[committed] libstdc++: Optimize is_constructible traits

2022-09-01 Thread Jonathan Wakely via Gcc-patches

Tested powerpc64le-linux, pushed to trunk.

-- >8 --

We can replace some class template helpers with alias templates, which
are cheaper to instantiate.

For example, replace the __is_copy_constructible_impl class template
with an alias template that just evaluates the __is_constructible
built-in, using add_lvalue_reference to get the argument type
in a way that works for non-referenceable types. For a given
specialization of is_copy_constructible this results in the same number
of class templates being instantiated (for the common case of non-void,
non-function types), but the add_lvalue_reference instantiations are not
specific to the is_copy_constructible specialization and so can be
reused by other traits. Previously __is_copy_constructible_impl was a
distinct class template and its specializations were never used for
anything except is_copy_constructible.

With the new definitions of these traits that don't depend on helper
classes, it becomes more practical to optimize the
is_xxx_constructible_v variable templates to avoid instantiations.
Previously doing so would have meant two entirely separate
implementation strategies for these traits.
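
The difference between the two helper styles can be sketched outside the
library. This is a minimal illustration, not the libstdc++ code: the
namespace `sketch` and the names `is_constructible_impl_class`,
`is_constructible_impl_alias` and `NoCopy` are made up for the example, and it
assumes a compiler that provides the `__is_constructible` built-in (GCC and
Clang do) and C++17.

```cpp
#include <type_traits>

namespace sketch {

// Old style: a distinct class template. Every specialization is a new class
// type that exists only to derive from bool_constant.
template<typename T, typename... Args>
  struct is_constructible_impl_class
  : std::bool_constant<__is_constructible(T, Args...)>
  { };

// New style: an alias template. Each use names std::true_type or
// std::false_type directly, so no extra helper class is instantiated.
template<typename T, typename... Args>
  using is_constructible_impl_alias
    = std::bool_constant<__is_constructible(T, Args...)>;

// The public trait derives from the alias. add_lvalue_reference_t yields an
// argument type even when T is not referenceable (for example void), and its
// instantiations are not tied to this particular trait.
template<typename T>
  struct is_copy_constructible
  : is_constructible_impl_alias<T, std::add_lvalue_reference_t<const T>>
  { };

// Hypothetical test type with a deleted copy constructor.
struct NoCopy
{
  NoCopy() = default;
  NoCopy(const NoCopy&) = delete;
};

} // namespace sketch
```

With the alias, `sketch::is_constructible_impl_alias<int>` is the same type as
`std::true_type`, whereas the class-template helper is a separate type for
every argument list.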

libstdc++-v3/ChangeLog:

* include/std/type_traits (__is_constructible_impl): Replace
class template with alias template.
(is_default_constructible, is_nothrow_constructible)
(is_nothrow_constructible): Simplify base-specifier.
(__is_copy_constructible_impl, __is_move_constructible_impl)
(__is_nothrow_copy_constructible_impl)
(__is_nothrow_move_constructible_impl): Remove class templates.
(is_copy_constructible, is_move_constructible)
(is_nothrow_constructible, is_nothrow_default_constructible)
(is_nothrow_copy_constructible, is_nothrow_move_constructible):
Adjust base-specifiers to use __is_constructible_impl.
(__is_copy_assignable_impl, __is_move_assignable_impl)
(__is_nt_copy_assignable_impl, __is_nt_move_assignable_impl):
Remove class templates.
(__is_assignable_impl): New alias template.
(is_assignable, is_copy_assignable, is_move_assignable):
Adjust base-specifiers to use new alias template.
(is_nothrow_copy_assignable, is_nothrow_move_assignable):
Adjust base-specifiers to use existing alias template.
(__is_trivially_constructible_impl): New alias template.
(is_trivially_constructible, is_trivially_default_constructible)
(is_trivially_copy_constructible)
(is_trivially_move_constructible): Adjust base-specifiers to use
new alias template.
(__is_trivially_assignable_impl): New alias template.
(is_trivially_assignable, is_trivially_copy_assignable)
(is_trivially_move_assignable): Adjust base-specifier to use
new alias template.
(__add_lval_ref_t, __add_rval_ref_t): New alias templates.
(add_lvalue_reference, add_rvalue_reference): Use new alias
templates.
---
 libstdc++-v3/include/std/type_traits | 249 +++
 1 file changed, 62 insertions(+), 187 deletions(-)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 639c351df8a..3041ac3c941 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -1001,9 +1001,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   /// @cond undocumented
   template<typename _Tp, typename... _Args>
-struct __is_constructible_impl
-: public __bool_constant<__is_constructible(_Tp, _Args...)>
-{ };
+using __is_constructible_impl
+  = __bool_constant<__is_constructible(_Tp, _Args...)>;
   /// @endcond
 
   /// is_constructible
@@ -1018,7 +1017,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   /// is_default_constructible
   template<typename _Tp>
 struct is_default_constructible
-: public __is_constructible_impl<_Tp>::type
+: public __is_constructible_impl<_Tp>
 {
   static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
"template argument must be a complete class or an unbounded array");
@@ -1026,22 +1025,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   /// @cond undocumented
   template<typename _Tp, bool = __is_referenceable<_Tp>::value>
-struct __is_copy_constructible_impl;
+struct __add_lvalue_reference_helper
+{ using type = _Tp; };
 
   template<typename _Tp>
-struct __is_copy_constructible_impl<_Tp, false>
-: public false_type { };
+struct __add_lvalue_reference_helper<_Tp, true>
+{ using type = _Tp&; };
 
   template<typename _Tp>
-struct __is_copy_constructible_impl<_Tp, true>
-: public __is_constructible_impl<_Tp, const _Tp&>
-{ };
+using __add_lval_ref_t = typename __add_lvalue_reference_helper<_Tp>::type;
   /// @endcond
 
   /// is_copy_constructible
   template<typename _Tp>
 struct is_copy_constructible
-: public __is_copy_constructible_impl<_Tp>
+: public __is_constructible_impl<_Tp, __add_lval_ref_t<const _Tp>>
 {
   static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
"template

[committed] libstdc++: Use built-ins for some variable templates

2022-09-01 Thread Jonathan Wakely via Gcc-patches

Tested powerpc64le-linux, pushed to trunk.

-- >8 --

This avoids having to instantiate a class template that just uses the
same built-in anyway.

None of the corresponding class templates have any type-completeness
static assertions, so we're not losing any diagnostics by using the
built-ins directly.
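
A short sketch of the idea, written outside namespace std. The namespace
`sketch` and the types `Color`, `Base` and `Derived` are invented for the
example; the `__is_enum`, `__is_class` and `__is_base_of` built-ins are the
ones the patch itself uses, and the code assumes C++17 for inline variables.

```cpp
namespace sketch {

// Variable templates defined straight from the compiler built-ins, so that
// using is_enum_v<T> does not need to instantiate a class template first.
template<typename T>
  inline constexpr bool is_enum_v = __is_enum(T);
template<typename T>
  inline constexpr bool is_class_v = __is_class(T);
template<typename Base, typename Derived>
  inline constexpr bool is_base_of_v = __is_base_of(Base, Derived);

// Hypothetical types used only to exercise the traits.
enum class Color { red, green };
struct Base { };
struct Derived : Base { };

} // namespace sketch
```

The class templates stay available for code that needs a type, such as tag
dispatching; only the `_v` shortcuts bypass them.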

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_enum_v, is_class_v, is_union_v)
(is_empty_v, is_polymorphic_v, is_abstract_v, is_final_v)
(is_base_of_v, is_aggregate_v): Use built-in directly instead of
instantiating class template.
---
 libstdc++-v3/include/std/type_traits | 27 +++
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 5b8314f24fd..52cca8bf3af 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -3165,11 +3165,11 @@ template 
   inline constexpr bool is_member_function_pointer_v =
 is_member_function_pointer<_Tp>::value;
 template <typename _Tp>
-  inline constexpr bool is_enum_v = is_enum<_Tp>::value;
+  inline constexpr bool is_enum_v = __is_enum(_Tp);
 template <typename _Tp>
-  inline constexpr bool is_union_v = is_union<_Tp>::value;
+  inline constexpr bool is_union_v = __is_union(_Tp);
 template <typename _Tp>
-  inline constexpr bool is_class_v = is_class<_Tp>::value;
+  inline constexpr bool is_class_v = __is_class(_Tp);
 template <typename _Tp>
   inline constexpr bool is_function_v = is_function<_Tp>::value;
 template <typename _Tp>
@@ -3206,14 +3206,14 @@ template 
   _GLIBCXX17_DEPRECATED
   inline constexpr bool is_literal_type_v = is_literal_type<_Tp>::value;
 #pragma GCC diagnostic pop
- template <typename _Tp>
-  inline constexpr bool is_empty_v = is_empty<_Tp>::value;
 template <typename _Tp>
-  inline constexpr bool is_polymorphic_v = is_polymorphic<_Tp>::value;
+  inline constexpr bool is_empty_v = __is_empty(_Tp);
 template <typename _Tp>
-  inline constexpr bool is_abstract_v = is_abstract<_Tp>::value;
+  inline constexpr bool is_polymorphic_v = __is_polymorphic(_Tp);
 template <typename _Tp>
-  inline constexpr bool is_final_v = is_final<_Tp>::value;
+  inline constexpr bool is_abstract_v = __is_abstract(_Tp);
+template <typename _Tp>
+  inline constexpr bool is_final_v = __is_final(_Tp);
 template <typename _Tp>
   inline constexpr bool is_signed_v = is_signed<_Tp>::value;
 template <typename _Tp>
@@ -3318,7 +3318,7 @@ template 
   inline constexpr bool is_same_v = std::is_same<_Tp, _Up>::value;
 #endif
 template <typename _Base, typename _Derived>
-  inline constexpr bool is_base_of_v = is_base_of<_Base, _Derived>::value;
+  inline constexpr bool is_base_of_v = __is_base_of(_Base, _Derived);
 template <typename _From, typename _To>
   inline constexpr bool is_convertible_v = is_convertible<_From, _To>::value;
 template
@@ -3356,16 +3356,19 @@ template
 
 #ifdef _GLIBCXX_HAVE_BUILTIN_IS_AGGREGATE
 # define __cpp_lib_is_aggregate 201703L
-  /// is_aggregate
+  /// is_aggregate - true if the type is an aggregate.
   /// @since C++17
   template<typename _Tp>
 struct is_aggregate
 : bool_constant<__is_aggregate(remove_cv_t<_Tp>)>
 { };
 
-  /// @ingroup variable_templates
+  /** is_aggregate_v - true if the type is an aggregate.
+   *  @ingroup variable_templates
+   *  @since C++17
+   */
   template<typename _Tp>
-inline constexpr bool is_aggregate_v = is_aggregate<_Tp>::value;
+inline constexpr bool is_aggregate_v = __is_aggregate(remove_cv_t<_Tp>);
 #endif
 #endif // C++17
 
-- 
2.37.2

[committed] c: C2x removal of unprototyped functions

2022-09-01 Thread Joseph Myers

C2x has completely removed unprototyped functions, so that () now
means the same as (void) in both function declarations and
definitions, where previously that change had been made for
definitions only.  Implement this accordingly.
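
For comparison, C++ has always given an empty parameter list the meaning
that C2x now adopts for C. The sketch below is C++, not C, and only
illustrates that rule; the namespace `sketch`, the function `f` and the
helper `callable_with` are hypothetical names, and C++17 is assumed.

```cpp
#include <type_traits>

namespace sketch {

// Declared with an empty parameter list; never defined, only inspected.
void f();

// Hypothetical helper: true if F can be called with arguments of types Args.
template<typename F, typename... Args>
  inline constexpr bool callable_with = std::is_invocable_v<F, Args...>;

// () and (void) name the same function type.
static_assert(std::is_same_v<decltype(f), void(void)>,
              "empty parameter list means no parameters");

} // namespace sketch
```

Under this rule a call such as `f (1)` is rejected, which is the same effect
the updated c2x-old-style-definition-6.c test expects from "too many
arguments" errors.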

This is a change where GNU/Linux distribution builders might wish to
try builds with a -std=gnu2x default to start early on getting old
code fixed that still has () declarations for functions taking
arguments, in advance of GCC moving to -std=gnu2x as default maybe in
GCC 14 or 15; I don't know how much such code is likely to be in
current use.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

gcc/c/
* c-decl.cc (grokparms): Handle () in a function declaration the
same as (void) for C2X.

gcc/testsuite/
* gcc.dg/c11-unproto-3.c, gcc.dg/c2x-unproto-3.c,
gcc.dg/c2x-unproto-4.c: New tests.
* gcc.dg/c2x-old-style-definition-6.c, gcc.dg/c2x-unproto-1.c,
gcc.dg/c2x-unproto-2.c: Update for removal of unprototyped
functions.

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 804314dd0f2..34f8feda897 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -7868,7 +7868,7 @@ grokparms (struct c_arg_info *arg_info, bool funcdef_flag)
   error ("%<[*]%> not allowed in other than function prototype scope");
 }
 
-  if (arg_types == NULL_TREE && !funcdef_flag
+  if (arg_types == NULL_TREE && !funcdef_flag && !flag_isoc2x
   && !in_system_header_at (input_location))
 warning (OPT_Wstrict_prototypes,
 "function declaration isn%'t a prototype");
@@ -7896,9 +7896,8 @@ grokparms (struct c_arg_info *arg_info, bool funcdef_flag)
   tree parm, type, typelt;
   unsigned int parmno;
 
-  /* In C2X, convert () in a function definition to (void).  */
+  /* In C2X, convert () to (void).  */
   if (flag_isoc2x
- && funcdef_flag
  && !arg_types
  && !arg_info->parms)
arg_types = arg_info->types = void_list_node;
diff --git a/gcc/testsuite/gcc.dg/c11-unproto-3.c 
b/gcc/testsuite/gcc.dg/c11-unproto-3.c
new file mode 100644
index 000..b0e4bf3d5b1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c11-unproto-3.c
@@ -0,0 +1,19 @@
+/* Test function declarations without prototypes for C11.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c11 -pedantic-errors" } */
+
+void f1 ();
+void
+f1a (void)
+{
+  f1 (1, 2);
+}
+
+void f2 ();
+void f2 (int);
+
+void f3 ();
+
+_Static_assert (_Generic (f3,
+ void (*) (int) : 1,
+ default : 3) == 1, "unprototyped test");
diff --git a/gcc/testsuite/gcc.dg/c2x-old-style-definition-6.c 
b/gcc/testsuite/gcc.dg/c2x-old-style-definition-6.c
index fc0e778446d..72bfd56f00a 100644
--- a/gcc/testsuite/gcc.dg/c2x-old-style-definition-6.c
+++ b/gcc/testsuite/gcc.dg/c2x-old-style-definition-6.c
@@ -1,16 +1,16 @@
-/* Test old-style function definitions not in C2x: () does not give
-   type with a prototype except for function definitions.  */
+/* Test old-style function definitions not in C2x: () gives a type with
+   a prototype for all declarations.  */
 /* { dg-do compile } */
 /* { dg-options "-std=c2x" } */
 
-void f1 ();
+void f1 (); /* { dg-message "declared here" } */
 
-/* Prototyped function returning a pointer to unprototyped function.  */
+/* Prototyped function returning a pointer to a function with no arguments.  */
 void (*f2 (void))() { return f1; }
 
 void
 g (void)
 {
-  f1 (1);
-  f2 () (1);
+  f1 (1); /* { dg-error "too many arguments" } */
+  f2 () (1); /* { dg-error "too many arguments" } */
 }
diff --git a/gcc/testsuite/gcc.dg/c2x-unproto-1.c 
b/gcc/testsuite/gcc.dg/c2x-unproto-1.c
index aa87d78610e..d21c6a712fb 100644
--- a/gcc/testsuite/gcc.dg/c2x-unproto-1.c
+++ b/gcc/testsuite/gcc.dg/c2x-unproto-1.c
@@ -1,25 +1,21 @@
-/* Test compatibility of unprototyped and prototyped function types (C2x made
-   the case of types affected by default argument promotions compatible, before
-   removing unprototyped functions completely).  Test affected usages are not
-   accepted for C2x.  */
+/* Test compatibility of prototyped function types with and without arguments
+   (C2x made the case of types affected by default argument promotions
+   compatible, before removing unprototyped functions completely).  Test
+   affected usages are not accepted for C2x.  */
 /* { dg-do compile } */
 /* { dg-options "-std=c2x -pedantic-errors" } */
 
 void f1 (); /* { dg-message "previous declaration" } */
 void f1 (float); /* { dg-error "conflicting types" } */
-/* { dg-message "default promotion" "" { target *-*-* } .-1 } */
 
 void f2 (float); /* { dg-message "previous declaration" } */
 void f2 (); /* { dg-error "conflicting types" } */
-/* { dg-message "default promotion" "" { target *-*-* } .-1 } */
 
 void f3 (); /* { dg-message "previous declaration" } */
 void f3 (char); /* { dg-error "conflicting types" } */
-/* { dg-message "default promotion" "" { target *-*-* } .-1 } */
 
 void f4

Re: [PATCH] c++, v2: Implement C++23 P2071R2 - Named universal character escapes [PR106648]

2022-09-01 Thread Jason Merrill via Gcc-patches


On 9/1/22 07:14, Jakub Jelinek wrote:

On Wed, Aug 31, 2022 at 12:14:22PM -0400, Jason Merrill wrote:

On 8/31/22 11:07, Jakub Jelinek wrote:

On Wed, Aug 31, 2022 at 10:52:49AM -0400, Jason Merrill wrote:

It could be more explicit, but I think we can assume that from the existing
wording; it says it designates the named character.  If there is no such
character, that cannot be satisfied, so it must be ill-formed.


Ok.


So, we could reject the int h case above and accept silently the others?


Why not warn on the others?


We were always silent for the cases like \u123X or \U12345X.
Do you think we should emit some warnings (but never pedwarns/errors in that
case) that it looks like a universal character name but is not a complete one?


I think that would be helpful, at least for \u{ and \N{.


Ok.


Given what you said above, I think that is what we want for the last 2
for C++23, the question is if it is ok also for C++20/C17 etc. and whether
it should depend on -pedantic or -pedantic-errors or GNU vs. ISO mode
or not in that case.  We could handle those 2 also differently, just
warn instead of error for the \N{ABC} case if not in C++23 mode when
identifier_pos.


That sounds right.


Here is an incremental version of the patch which will make valid
\u{123} and \N{LATIN SMALL LETTER A WITH ACUTE} an extension in GNU
modes before C++23 and split it as separate tokens in ISO modes.


Looks good.


Here is a patch which implements that.
I just wonder if we shouldn't have some warning option that would cover
these warnings, currently one needs to use -w to disable those warnings.

Apparently clang uses -Wunicode option to cover these, but unfortunately
they don't bother to document it (nor almost any other warning option),
so it is unclear what else exactly it covers.  Plus a question is how
we should document that option for GCC...


We might as well use the same flag name, and document it to mean what it 
currently means for GCC.



2022-09-01  Jakub Jelinek  

* charset.cc (_cpp_valid_ucn): In possible identifier contexts, don't
handle \u{ or \N{ specially in -std=c* modes except -std=c++2{3,b}.
In possible identifier contexts, don't emit an error and punt
if \N isn't followed by {, or if \N{} surrounds some lower case
letters or _.  In possible identifier contexts when not C++23, don't
emit an error but warning about unknown character names and treat as
separate tokens.  When treating as separate tokens \u{ or \N{, emit
warnings.

* c-c++-common/cpp/delimited-escape-seq-4.c: New test.
* c-c++-common/cpp/delimited-escape-seq-5.c: New test.
* c-c++-common/cpp/named-universal-char-escape-5.c: New test.
* c-c++-common/cpp/named-universal-char-escape-6.c: New test.
* g++.dg/cpp23/named-universal-char-escape1.C: New test.
* g++.dg/cpp23/named-universal-char-escape2.C: New test.

--- libcpp/charset.cc.jj2022-09-01 09:47:24.146886929 +0200
+++ libcpp/charset.cc   2022-09-01 12:52:28.424034208 +0200
@@ -1448,7 +1448,11 @@ _cpp_valid_ucn (cpp_reader *pfile, const
if (str[-1] == 'u')
  {
length = 4;
-  if (str < limit && *str == '{')
+  if (str < limit
+ && *str == '{'
+ && (!identifier_pos
+ || CPP_OPTION (pfile, delimited_escape_seqs)
+ || !CPP_OPTION (pfile, std)))
{
  str++;
  /* Magic value to indicate no digits seen.  */
@@ -1462,8 +1466,22 @@ _cpp_valid_ucn (cpp_reader *pfile, const
else if (str[-1] == 'N')
  {
length = 4;
+  if (identifier_pos
+ && !CPP_OPTION (pfile, delimited_escape_seqs)
+ && CPP_OPTION (pfile, std))
+   {
+ *cp = 0;
+ return false;
+   }
if (str == limit || *str != '{')
-   cpp_error (pfile, CPP_DL_ERROR, "'\\N' not followed by '{'");
+   {
+ if (identifier_pos)
+   {
+ *cp = 0;
+ return false;
+   }
+ cpp_error (pfile, CPP_DL_ERROR, "'\\N' not followed by '{'");
+   }
else
{
  str++;
@@ -1489,8 +1507,16 @@ _cpp_valid_ucn (cpp_reader *pfile, const
  
  	  if (str < limit && *str == '}')

{
- if (name == str && identifier_pos)
+ if (identifier_pos && (name == str || !strict))
{
+ if (name == str)
+   cpp_warning (pfile, CPP_W_NONE,
+"empty named universal character escape "
+"sequence; treating it as separate tokens");
+ else
+   cpp_warning (pfile, CPP_W_NONE,
+"incomplete named universal character escape "
+"sequence; treating it as separate tokens");


It looks like this is handling \N{abc}, for which "incomplete" seems 
like the wrong description; it's complete, just wrong, and the

Re: [PATCH] c++: Micro-optimize most_specialized_partial_spec

2022-09-01 Thread Jason Merrill via Gcc-patches


On 8/31/22 17:15, Patrick Palka wrote:

This introduces an early exit test to most_specialized_partial_spec for
the common case where we have no partial specializations, which allows
us to avoid some unnecessary work.  In passing, clean the function up a
bit.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?


OK.


gcc/cp/ChangeLog:

* pt.cc (most_specialized_partial_spec): Exit early when
DECL_TEMPLATE_SPECIALIZATIONS is empty.  Move local variable
declarations closer to their first use.  Remove redundant
flag_concepts test.  Remove redundant forward declaration.
---
  gcc/cp/pt.cc | 45 +++--
  1 file changed, 19 insertions(+), 26 deletions(-)

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index fe7e809fc2d..497a18ef728 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -187,7 +187,6 @@ static int unify_pack_expansion (tree, tree, tree,
  static tree copy_template_args (tree);
  static tree tsubst_template_parms (tree, tree, tsubst_flags_t);
  static void tsubst_each_template_parm_constraints (tree, tree, 
tsubst_flags_t);
-tree most_specialized_partial_spec (tree, tsubst_flags_t);
  static tree tsubst_aggr_type (tree, tree, tsubst_flags_t, tree, int);
  static tree tsubst_arg_types (tree, tree, tree, tsubst_flags_t, tree);
  static tree tsubst_function_type (tree, tree, tsubst_flags_t, tree);
@@ -25756,15 +25755,7 @@ most_general_template (tree decl)
  tree
  most_specialized_partial_spec (tree target, tsubst_flags_t complain)
  {
-  tree list = NULL_TREE;
-  tree t;
-  tree champ;
-  int fate;
-  bool ambiguous_p;
-  tree outer_args = NULL_TREE;
-  tree tmpl, args;
-
-  tree decl;
+  tree tmpl, args, decl;
if (TYPE_P (target))
  {
tree tinfo = CLASSTYPE_TEMPLATE_INFO (target);
@@ -25788,13 +25779,18 @@ most_specialized_partial_spec (tree target, 
tsubst_flags_t complain)
else
  gcc_unreachable ();
  
+  tree main_tmpl = most_general_template (tmpl);

+  tree specs = DECL_TEMPLATE_SPECIALIZATIONS (main_tmpl);
+  if (!specs)
+/* There are no partial specializations of this template.  */
+return NULL_TREE;
+
push_access_scope_guard pas (decl);
deferring_access_check_sentinel acs (dk_no_deferred);
  
-  tree main_tmpl = most_general_template (tmpl);

-
/* For determining which partial specialization to use, only the
   innermost args are interesting.  */
+  tree outer_args = NULL_TREE;
if (TMPL_ARGS_HAVE_MULTIPLE_LEVELS (args))
  {
outer_args = strip_innermost_template_args (args, 1);
@@ -25806,7 +25802,8 @@ most_specialized_partial_spec (tree target, 
tsubst_flags_t complain)
   fully resolve everything.  */
processing_template_decl_sentinel ptds;
  
-  for (t = DECL_TEMPLATE_SPECIALIZATIONS (main_tmpl); t; t = TREE_CHAIN (t))

+  tree list = NULL_TREE;
+  for (tree t = specs; t; t = TREE_CHAIN (t))
  {
const tree ospec_tmpl = TREE_VALUE (t);
  
@@ -25829,10 +25826,8 @@ most_specialized_partial_spec (tree target, tsubst_flags_t complain)

  if (outer_args)
spec_args = add_to_template_args (outer_args, spec_args);
  
-  /* Keep the candidate only if the constraints are satisfied,

- or if we're not compiling with concepts.  */
-  if (!flag_concepts
- || constraints_satisfied_p (ospec_tmpl, spec_args))
+ /* Keep the candidate only if the constraints are satisfied.  */
+ if (constraints_satisfied_p (ospec_tmpl, spec_args))
  {
  list = tree_cons (spec_args, ospec_tmpl, list);
TREE_TYPE (list) = TREE_TYPE (t);
@@ -25843,13 +25838,11 @@ most_specialized_partial_spec (tree target, 
tsubst_flags_t complain)
if (! list)
  return NULL_TREE;
  
-  ambiguous_p = false;

-  t = list;
-  champ = t;
-  t = TREE_CHAIN (t);
-  for (; t; t = TREE_CHAIN (t))
+  tree champ = list;
+  bool ambiguous_p = false;
+  for (tree t = TREE_CHAIN (list); t; t = TREE_CHAIN (t))
  {
-  fate = more_specialized_partial_spec (tmpl, champ, t);
+  int fate = more_specialized_partial_spec (tmpl, champ, t);
if (fate == 1)
;
else
@@ -25868,9 +25861,9 @@ most_specialized_partial_spec (tree target, 
tsubst_flags_t complain)
  }
  
if (!ambiguous_p)

-for (t = list; t && t != champ; t = TREE_CHAIN (t))
+for (tree t = list; t && t != champ; t = TREE_CHAIN (t))
{
-   fate = more_specialized_partial_spec (tmpl, champ, t);
+   int fate = more_specialized_partial_spec (tmpl, champ, t);
if (fate != 1)
  {
ambiguous_p = true;
@@ -25889,7 +25882,7 @@ most_specialized_partial_spec (tree target, 
tsubst_flags_t complain)
else
error ("ambiguous template instantiation for %q#D", target);
str = ngettext ("candidate is:", "candidates are:", list_length (list));
-  for (t = list; t; t = TREE_CHAIN (t))
+  for (tree t = list; t; t =

Re: [COMMITTED] Implement ranger folder for __builtin_signbit.

2022-09-01 Thread Aldy Hernandez via Gcc-patches

On Thu, Sep 1, 2022 at 6:41 PM Joseph Myers  wrote:
>
> On Thu, 1 Sep 2022, Aldy Hernandez via Gcc-patches wrote:
>
> > Now that we keep track of the signbit, we can use it to fold 
> > __builtin_signbit.
> >
> > I am assuming I don't have to try too hard to get the actual signbit
> > number and 1 will do.  Especially, since we're inconsistent in trunk whether
> > we fold the builtin or whether we calculate it at runtime.
>
> The main thing to watch out for is cases where, in the abstract machine,
> there is a single call executed to __builtin_signbit, but after code
> transformations, some uses of the result of that call are optimized to use
> a 0 or 1 value and other uses end up using a runtime value - inconsistency
> between different calls is fine, inconsistency where only a single call is
> executed in the abstract machine isn't.  (Cf. bugs 102930, 85957, 93681,
> 93806, 93682, for example; the test in bug 93806 comment 27 is the sort of
> thing to try.)

Can't we just be consistent with the runtime?  I'm happy to return
whatever value is appropriate for the architecture.  It doesn't have
to be 1.  Though ISTM that if we know the sign on one side of a
conditional, we should know the sign on the other side of the
conditional, so we should fold all uses of __builtin_signbit in that
case.  So maybe we're ok.

On the other hand, we could narrow the range to nonzero, which we can
model perfectly for integers (and __builtin_signbit returns one).  If
we take this approach it means we can't fold:

  if (x < -5.0)
num = __builtin_signbit (x);

but we could fold:

  if (x > 5.0)
num = __builtin_signbit (x);

since that's always 0.

And it also means we could fold conditional checks against zero/nonzero:

void func(float x)
{
  if (x < -5.0 && !__builtin_signbit (x))
link_error ();
}

Whatever works for y'all.
Aldy

p.s. My head exploded after reading half of PR93806.  I should just go
back to integers.
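
The "narrow to nonzero" approach discussed above can be modelled in a few
lines. This is only a toy model under stated assumptions, not ranger code:
`Range`, `Tri` and `fold_signbit` are hypothetical names, NaNs are ignored,
and a range that touches zero is treated as unknown because +0.0 and -0.0
compare equal but differ in sign bit.

```cpp
namespace sketch {

// Three-valued answer: sign bit known clear, known set, or not known.
enum class Tri { no, yes, unknown };

// Closed interval [lo, hi] of values a floating-point variable may take.
struct Range { double lo; double hi; };

// Decide the sign bit only when every value in the range agrees on it.
// "yes" means the result is nonzero, not that it is any particular value.
constexpr Tri fold_signbit (Range r)
{
  if (r.hi < 0.0)
    return Tri::yes;      // e.g. the range implied by x < -5.0
  if (r.lo > 0.0)
    return Tri::no;       // e.g. the range implied by x > 5.0
  return Tri::unknown;    // range contains or touches zero
}

} // namespace sketch
```

In this model the `x > 5.0` case folds to the constant 0, while the
`x < -5.0` case only records that the result is nonzero, which is enough to
fold a test such as `!__builtin_signbit (x)`.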

[PATCH] i386: Fix conversion of move to/from AX_REG into xchg [PR106707]

2022-09-01 Thread Uros Bizjak via Gcc-patches

The conversion of a move pattern where both operands are AX_REG
should be prevented.

2022-09-01  Uroš Bizjak  

gcc/ChangeLog:

PR target/106707
* config/i386/i386.md (moves to/from AX_REG into xchg peephole2):
Do not convert a move pattern where both operands are AX_REG.

gcc/testsuite/ChangeLog:

PR target/106707
* gcc.target/i386/pr106707.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index a4a18cf89f5..1aef1af594d 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -3043,8 +3043,8 @@ (define_peephole2
   [(set (match_operand:SWI48 0 "general_reg_operand")
(match_operand:SWI48 1 "general_reg_operand"))]
  "optimize_size > 1
-  && (REGNO (operands[0]) == AX_REG
-  || REGNO (operands[1]) == AX_REG)
+  && ((REGNO (operands[0]) == AX_REG)
+  != (REGNO (operands[1]) == AX_REG))
   && optimize_insn_for_size_p ()
   && peep2_reg_dead_p (1, operands[1])"
   [(parallel [(set (match_dup 0) (match_dup 1))
diff --git a/gcc/testsuite/gcc.target/i386/pr106707.c 
b/gcc/testsuite/gcc.target/i386/pr106707.c
new file mode 100644
index 000..a127ccd4679
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr106707.c
@@ -0,0 +1,19 @@
+/* PR target/106707 */
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-Oz -g -fno-cprop-registers -fno-dce" } */
+
+typedef unsigned __attribute__((__vector_size__ (8))) V;
+
+unsigned __int128 ii;
+unsigned x, y;
+
+V v;
+
+void
+foo (long a)
+{
+  long l = a != x;
+  int i = __builtin_add_overflow_p (y * ii, 0, 0);
+  V u = ii < x | v, w = x <= u < i & y <= x / ii;
+  v = __builtin_shufflevector (v, w, 1, 2) + (V) l;
+}
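
The condition change in the peephole2 above replaces "either operand is
AX_REG" with "exactly one operand is AX_REG". A small sketch of the two
predicates, using made-up register numbers (`ax_reg`, `dx_reg` and `cx_reg`
are stand-ins chosen for the example, not values taken from the i386
backend):

```cpp
namespace sketch {

// Stand-in register numbers; the real values live in the i386 backend.
constexpr unsigned ax_reg = 0;
constexpr unsigned dx_reg = 1;
constexpr unsigned cx_reg = 2;

// Old condition: fires when either operand is AX, including AX -> AX.
constexpr bool old_cond (unsigned op0, unsigned op1)
{ return op0 == ax_reg || op1 == ax_reg; }

// New condition: comparing the two tests with != acts as exclusive or,
// so it fires only when exactly one operand is AX.
constexpr bool new_cond (unsigned op0, unsigned op1)
{ return (op0 == ax_reg) != (op1 == ax_reg); }

} // namespace sketch
```

The only input on which the two predicates differ is the one where both
operands are AX, which is the case the patch excludes.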

[PATCH] Convert rest of compiler to dconst[n]inf.

2022-09-01 Thread Aldy Hernandez via Gcc-patches

This is kinda obvious.

OK?

gcc/ChangeLog:

* builtins.cc (fold_builtin_inf): Convert use of real_inf to dconstinf.
(fold_builtin_fpclassify): Same.
* fold-const-call.cc (fold_const_call_cc): Same.
* match.pd: Same.
* omp-low.cc (omp_reduction_init_op): Same.
* realmpfr.cc (real_from_mpfr): Same.
* tree.cc (build_complex_inf): Same.
---
 gcc/builtins.cc| 8 ++--
 gcc/fold-const-call.cc | 2 +-
 gcc/match.pd   | 2 +-
 gcc/omp-low.cc | 9 +++--
 gcc/realmpfr.cc| 2 +-
 gcc/tree.cc| 5 ++---
 6 files changed, 10 insertions(+), 18 deletions(-)

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index f1f7c0ce337..5f319b28030 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -8696,8 +8696,6 @@ fold_builtin_strlen (location_t loc, tree expr, tree 
type, tree arg)
 static tree
 fold_builtin_inf (location_t loc, tree type, int warn)
 {
-  REAL_VALUE_TYPE real;
-
   /* __builtin_inff is intended to be usable to define INFINITY on all
  targets.  If an infinity is not available, INFINITY expands "to a
  positive constant of type float that overflows at translation
@@ -8708,8 +8706,7 @@ fold_builtin_inf (location_t loc, tree type, int warn)
   if (!MODE_HAS_INFINITIES (TYPE_MODE (type)) && warn)
 pedwarn (loc, 0, "target format does not support infinity");
 
-  real_inf (&real);
-  return build_real (type, real);
+  return build_real (type, dconstinf);
 }
 
 /* Fold function call to builtin sincos, sincosf, or sincosl.  Return
@@ -9336,9 +9333,8 @@ fold_builtin_fpclassify (location_t loc, tree *args, int 
nargs)
 
   if (tree_expr_maybe_infinite_p (arg))
 {
-  real_inf (&r);
   tmp = fold_build2_loc (loc, EQ_EXPR, integer_type_node, arg,
-build_real (type, r));
+build_real (type, dconstinf));
   res = fold_build3_loc (loc, COND_EXPR, integer_type_node, tmp,
 fp_infinite, res);
 }
diff --git a/gcc/fold-const-call.cc b/gcc/fold-const-call.cc
index c18256825af..72953875414 100644
--- a/gcc/fold-const-call.cc
+++ b/gcc/fold-const-call.cc
@@ -1116,7 +1116,7 @@ fold_const_call_cc (real_value *result_real, real_value 
*result_imag,
 CASE_CFN_CPROJ:
   if (real_isinf (arg_real) || real_isinf (arg_imag))
{
- real_inf (result_real);
+ *result_real = dconstinf;
  *result_imag = dconst0;
  result_imag->sign = arg_imag->sign;
}
diff --git a/gcc/match.pd b/gcc/match.pd
index f5fec634279..17318f523fb 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5300,7 +5300,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 if (cmp == LT_EXPR || cmp == GE_EXPR)
   tow = dconst0;
 else
-  real_inf (&tow);
+  tow = dconstinf;
 real_nextafter (&c2alt, fmt, &c2, &tow);
 real_convert (&c2alt, fmt, &c2alt);
 if (REAL_VALUE_ISINF (c2alt))
diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index f54dea923bf..e9e4bd05d72 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -4524,12 +4524,9 @@ omp_reduction_init_op (location_t loc, enum tree_code 
op, tree type)
 case MAX_EXPR:
   if (SCALAR_FLOAT_TYPE_P (type))
{
- REAL_VALUE_TYPE max, min;
+ REAL_VALUE_TYPE min;
  if (HONOR_INFINITIES (type))
-   {
- real_inf (&max);
- real_arithmetic (&min, NEGATE_EXPR, &max, NULL);
-   }
+   real_arithmetic (&min, NEGATE_EXPR, &dconstinf, NULL);
  else
 real_maxval (&min, 1, TYPE_MODE (type));
  return build_real (type, min);
@@ -4551,7 +4548,7 @@ omp_reduction_init_op (location_t loc, enum tree_code op, 
tree type)
{
  REAL_VALUE_TYPE max;
  if (HONOR_INFINITIES (type))
-   real_inf (&max);
+   max = dconstinf;
  else
 real_maxval (&max, 0, TYPE_MODE (type));
  return build_real (type, max);
diff --git a/gcc/realmpfr.cc b/gcc/realmpfr.cc
index 54d097f5965..f7f096330ce 100644
--- a/gcc/realmpfr.cc
+++ b/gcc/realmpfr.cc
@@ -68,7 +68,7 @@ real_from_mpfr (REAL_VALUE_TYPE *r, mpfr_srcptr m, const 
real_format *format,
   /* Take care of Infinity and NaN.  */
   if (mpfr_inf_p (m))
 {
-  real_inf (r);
+  *r = dconstinf;
   if (mpfr_sgn (m) < 0)
*r = real_value_negate (r);
   return;
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 007c9325b17..0179c0fdc9d 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -2535,11 +2535,10 @@ build_complex (tree type, tree real, tree imag)
 tree
 build_complex_inf (tree type, bool neg)
 {
-  REAL_VALUE_TYPE rinf, rzero = dconst0;
+  REAL_VALUE_TYPE rzero = dconst0;
 
-  real_inf (&rinf);
   rzero.sign = neg;
-  return build_complex (type, build_real (TREE_TYPE (type), rinf),
+  return build_complex (type, build_real (TREE_TYPE (type), dconstinf),
build_real (TREE_TYPE (type), rzero));
 }
 
-- 
2.37.1

[COMMITTED] Convert ranger uses of real_inf to dconst[n]inf.

2022-09-01 Thread Aldy Hernandez via Gcc-patches

gcc/ChangeLog:

* range-op-float.cc (build_le): Convert to dconst*inf.
(build_ge): Same.
* value-range.cc (frange::set_signbit): Same.
(frange::normalize_kind): Same.
(range_tests_floats): Same.
* value-range.h (vrp_val_max): Same.
(vrp_val_min): Same.
(frange::set_varying): Same.
---
 gcc/range-op-float.cc | 16 ++--
 gcc/value-range.cc| 23 ---
 gcc/value-range.h | 16 
 3 files changed, 18 insertions(+), 37 deletions(-)

diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index 2f1af4055c3..7301e5a060b 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -232,17 +232,15 @@ frange_drop_ninf (frange , tree type)
   r.intersect (tmp);
 }
 
-// (X <= VAL) produces the range of [MIN, VAL].
+// (X <= VAL) produces the range of [-INF, VAL].
 
 static void
 build_le (frange &r, tree type, const REAL_VALUE_TYPE &val)
 {
-  REAL_VALUE_TYPE min;
-  real_inf (&min, 1);
-  r.set (type, min, val);
+  r.set (type, dconstninf, val);
 }
 
-// (X < VAL) produces the range of [MIN, VAL).
+// (X < VAL) produces the range of [-INF, VAL).
 
 static void
 build_lt (frange &r, tree type, const REAL_VALUE_TYPE &val)
@@ -251,17 +249,15 @@ build_lt (frange , tree type, const REAL_VALUE_TYPE 
)
   build_le (r, type, val);
 }
 
-// (X >= VAL) produces the range of [VAL, MAX].
+// (X >= VAL) produces the range of [VAL, +INF].
 
 static void
 build_ge (frange &r, tree type, const REAL_VALUE_TYPE &val)
 {
-  REAL_VALUE_TYPE max;
-  real_inf (&max, 0);
-  r.set (type, val, max);
+  r.set (type, val, dconstinf);
 }
 
-// (X > VAL) produces the range of (VAL, MAX].
+// (X > VAL) produces the range of (VAL, +INF].
 
 static void
 build_gt (frange &r, tree type, const REAL_VALUE_TYPE &val)
diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 71581b2c54d..6fd6e3b745c 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -320,18 +320,14 @@ frange::set_signbit (fp_prop::kind k)
   if (k == fp_prop::YES)
 {
   // Crop the range to [-INF, 0].
-  REAL_VALUE_TYPE min;
-  real_inf (&min, 1);
-  frange crop (m_type, min, dconst0);
+  frange crop (m_type, dconstninf, dconst0);
   intersect (crop);
   m_props.set_signbit (fp_prop::YES);
 }
   else if (k == fp_prop::NO)
 {
   // Crop the range to [0, +INF].
-  REAL_VALUE_TYPE max;
-  real_inf (&max, 0);
-  frange crop (m_type, dconst0, max);
+  frange crop (m_type, dconst0, dconstinf);
   intersect (crop);
   m_props.set_signbit (fp_prop::NO);
 }
@@ -440,8 +436,8 @@ frange::normalize_kind ()
   if (!m_props.varying_p ())
{
  m_kind = VR_RANGE;
- real_inf (&m_min, 1);
- real_inf (&m_max, 0);
+ m_min = dconstninf;
+ m_max = dconstinf;
  return true;
}
 }
@@ -3785,12 +3781,9 @@ range_tests_floats ()
   ASSERT_FALSE (r0.varying_p ());
 
   // The endpoints of a VARYING are +-INF.
-  REAL_VALUE_TYPE inf, ninf;
-  real_inf (&inf, 0);
-  real_inf (&ninf, 1);
   r0.set_varying (float_type_node);
-  ASSERT_TRUE (real_identical (&r0.lower_bound (), &ninf));
-  ASSERT_TRUE (real_identical (&r0.upper_bound (), &inf));
+  ASSERT_TRUE (real_identical (&r0.lower_bound (), &dconstninf));
+  ASSERT_TRUE (real_identical (&r0.upper_bound (), &dconstinf));
 
   // The maximum representable range for a type is still a subset of VARYING.
   REAL_VALUE_TYPE q, r;
@@ -3800,9 +3793,9 @@ range_tests_floats ()
   // r0 is not a varying, because it does not include -INF/+INF.
   ASSERT_FALSE (r0.varying_p ());
   // The upper bound of r0 must be less than +INF.
-  ASSERT_TRUE (real_less (&r0.upper_bound (), &inf));
+  ASSERT_TRUE (real_less (&r0.upper_bound (), &dconstinf));
   // The lower bound of r0 must be greater than -INF.
-  ASSERT_TRUE (real_less (&ninf, &r0.lower_bound ()));
+  ASSERT_TRUE (real_less (&dconstninf, &r0.lower_bound ()));
 
   // For most architectures, where float and double are different
   // sizes, having the same endpoints does not necessarily mean the
diff --git a/gcc/value-range.h b/gcc/value-range.h
index 3767bd17314..bc00f3d5b08 100644
--- a/gcc/value-range.h
+++ b/gcc/value-range.h
@@ -1050,11 +1050,7 @@ vrp_val_max (const_tree type)
   return wide_int_to_tree (const_cast<tree> (type), max);
 }
   if (frange::supports_p (type))
-{
-  REAL_VALUE_TYPE real;
-  real_inf (&real);
-  return build_real (const_cast <tree> (type), real);
-}
+return build_real (const_cast <tree> (type), dconstinf);
   return NULL_TREE;
 }
 
@@ -1068,11 +1064,7 @@ vrp_val_min (const_tree type)
   if (POINTER_TYPE_P (type))
 return build_zero_cst (const_cast<tree> (type));
   if (frange::supports_p (type))
-{
-  REAL_VALUE_TYPE ninf;
-  real_inf (&ninf, 1);
-  return build_real (const_cast <tree> (type), ninf);
-}
+return build_real (const_cast <tree> (type), dconstninf);
   return NULL_TREE;
 }
 
@@ -1145,8 +1137,8 @@ frange::set_varying (tree type)
 {
   m_kind = VR_VARYING;
   m_type = type;
-  real_inf (&m_min, 1);
-  real_inf (&m_max, 0);
+  m_min = dconstninf;
+  m_max = dconstinf;

Re: [PATCH] ipa: Fix throw in multi-versioned functions [PR106627]

2022-09-01 Thread Simon Rainer

Hi,

Thanks for taking a look at my patch. I tested some combinations with
pure/noreturn attributes. GCC seems to ignore those attributes on multiversioned
functions and generates sub-optimal assembly.
But I wasn't able to fix this by simply copying members like DECL_PURE_P. It's
pretty hard for me to tell which members of a tree are relevant for a function
declaration and should be copied, and which should not.

Anyway, I think the TREE_NOTHROW change is the most important one, because it
leads to correctness problems (and is what broke my original program :D ), so
could you please commit my patch, as I don't have write access myself?
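
As an illustration of the failure mode only, here is a toy C++ model. It is not GCC code: `Decl`, `make_dispatcher_buggy`, `make_dispatcher_fixed` and `caller_may_drop_eh` are made-up names, and the `nothrow` member merely stands in for TREE_NOTHROW. The sketch assumes a caller drops its exception-handling path whenever the callee is marked nothrow.

```cpp
#include <cassert>

// Toy model only: Decl and the helpers below are hypothetical, not GCC internals.
struct Decl {
  const char *name;
  bool nothrow;  // stands in for TREE_NOTHROW
};

// Mimics the old behaviour: the dispatcher is always marked nothrow.
Decl make_dispatcher_buggy(const Decl &fn) {
  return Decl{fn.name, true};
}

// Mimics the fix: the dispatcher inherits the flag from the original decl.
Decl make_dispatcher_fixed(const Decl &fn) {
  return Decl{fn.name, fn.nothrow};
}

// In this model a caller keeps its exception-handling path
// only when the callee is not marked nothrow.
bool caller_may_drop_eh(const Decl &callee) {
  return callee.nothrow;
}

// Usage: a throwing function must not look nothrow through its dispatcher.
void dispatcher_example() {
  Decl throwing{"f", false};
  assert(caller_may_drop_eh(make_dispatcher_buggy(throwing)));   // wrong: EH dropped
  assert(!caller_may_drop_eh(make_dispatcher_fixed(throwing)));  // right: EH kept
}
```

The same copy-from-the-original pattern would presumably apply to pure/const/noreturn, but as noted above that is untested.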

Should I open a new bug on bugzilla for the pure/noreturn issue?

Thanks
Simon Rainer


On Thu, Sep 1, 2022, at 08:37, Richard Biener wrote:
> On Wed, Aug 31, 2022 at 11:00 PM Simon Rainer  wrote:
> >
> > Hi,
> >
> > This patch fixes PR106627. I ran the i386.exp tests on my x86_64-linux-gnu 
> > machine with a fully bootstrapped checkout. I also tested manually that no 
> > exception handling code is generated if none of the function versions 
> > throws an exception.
> > I don't have access to a machine to test the change to  rs6000.cc, but the 
> > code seems like an exact copy and I don't see a reason why it shouldn't 
> > work there the same way.
> >
> > Regards
> > Simon Rainer
> >
> > From 6fcb1c742fa1d61048f7d63243225a8d1931af4a Mon Sep 17 00:00:00 2001
> > From: Simon Rainer 
> > Date: Wed, 31 Aug 2022 20:56:04 +0200
> > Subject: [PATCH] ipa: Fix throw in multi-versioned functions [PR106627]
> >
> > Any multi-versioned function was implicitly declared as noexcept, which
> > leads to an abort if an exception is thrown inside the function.
> > The reason for this is that the function declaration is replaced by a
> > newly created dispatcher declaration, which has TREE_NOTHROW always set
> > to 1. Instead we need to set TREE_NOTHROW to the value of the original
> > declaration.
> 
> Looks quite obvious.  The middle-end to target interface is a bit iffy
> since we have
> to duplicate this everywhere.  There's also other flags like
> pure/const and noreturn
> that do not impose correctness issues but may cause irritations if the IL gets
> a call to the dispatcher not marked noreturn but there's no code following.
> 
> That said, the fix looks good to me.
> 
> Thanks,
> Richard.
> 
> > PR ipa/106627
> >
> > gcc/ChangeLog:
> >
> > * config/i386/i386-features.cc 
> > (ix86_get_function_versions_dispatcher): Set TREE_NOTHROW
> > correctly for dispatcher declaration
> > * config/rs6000/rs6000.cc 
> > (rs6000_get_function_versions_dispatcher): Likewise
> >
> > gcc/testsuite/ChangeLog:
> >
> > * g++.target/i386/pr106627.C: New test.
> > ---
> >  gcc/config/i386/i386-features.cc |  1 +
> >  gcc/config/rs6000/rs6000.cc  |  1 +
> >  gcc/testsuite/g++.target/i386/pr106627.C | 30 
> >  3 files changed, 32 insertions(+)
> >  create mode 100644 gcc/testsuite/g++.target/i386/pr106627.C
> >
> > diff --git a/gcc/config/i386/i386-features.cc 
> > b/gcc/config/i386/i386-features.cc
> > index d6bb66cbe01..5b3b1aeff28 100644
> > --- a/gcc/config/i386/i386-features.cc
> > +++ b/gcc/config/i386/i386-features.cc
> > @@ -3268,6 +3268,7 @@ ix86_get_function_versions_dispatcher (void *decl)
> >
> >/* Right now, the dispatching is done via ifunc.  */
> >dispatch_decl = make_dispatcher_decl (default_node->decl);
> > +  TREE_NOTHROW(dispatch_decl) = TREE_NOTHROW(fn);
> >
> >dispatcher_node = cgraph_node::get_create (dispatch_decl);
> >gcc_assert (dispatcher_node != NULL);
> > diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> > index 2f3146e56f8..9280da8a5c8 100644
> > --- a/gcc/config/rs6000/rs6000.cc
> > +++ b/gcc/config/rs6000/rs6000.cc
> > @@ -24861,6 +24861,7 @@ rs6000_get_function_versions_dispatcher (void *decl)
> >
> >/* Right now, the dispatching is done via ifunc.  */
> >dispatch_decl = make_dispatcher_decl (default_node->decl);
> > +  TREE_NOTHROW(dispatch_decl) = TREE_NOTHROW(fn);
> >
> >dispatcher_node = cgraph_node::get_create (dispatch_decl);
> >gcc_assert (dispatcher_node != NULL);
> > diff --git a/gcc/testsuite/g++.target/i386/pr106627.C 
> > b/gcc/testsuite/g++.target/i386/pr106627.C
> > new file mode 100644
> > index 000..a67f5ae4813
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.target/i386/pr106627.C
> > @@ -0,0 +1,30 @@
> > +/* PR c++/103012 Exception handling with multiversioned functions */
> > +/* { dg-do run } */
> > +/* { dg-require-ifunc "" }  */
> > +
> > +#include 
> > +
> > +__attribute__((target("default")))
> > +void f() {
> > +throw 1;
> > +}
> > +
> > +__attribute__((target("sse4.2,bmi")))
> > +void f() {
> > +throw 2;
> > +}
> > +
> > +int main()
> > +{
> > +try {
> > +f();
> > +}
> > +catch(...)
> > +{
> > +return 0;
> > +

Re: [COMMITTED] Implement ranger folder for __builtin_signbit.

2022-09-01 Thread Joseph Myers

On Thu, 1 Sep 2022, Aldy Hernandez via Gcc-patches wrote:

> Now that we keep track of the signbit, we can use it to fold 
> __builtin_signbit.
> 
> I am assuming I don't have to try too hard to get the actual signbit
> number and 1 will do.  Especially, since we're inconsistent in trunk whether
> we fold the builtin or whether we calculate it at runtime.

The main thing to watch out for is cases where, in the abstract machine, 
there is a single call executed to __builtin_signbit, but after code 
transformations, some uses of the result of that call are optimized to use 
a 0 or 1 value and other uses end up using a runtime value - inconsistency 
between different calls is fine, inconsistency where only a single call is 
executed in the abstract machine isn't.  (Cf. bugs 102930, 85957, 93681, 
93806, 93682, for example; the test in bug 93806 comment 27 is the sort of 
thing to try.)
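
A minimal sketch of the shape of code this concerns, assuming a GCC-compatible compiler that provides `__builtin_signbit`; the struct and function names are made up for illustration. There is exactly one call in the abstract machine and its result is used twice, so whatever value the compiler picks (a folded 1 or some other nonzero runtime value), both uses have to see the same one.

```cpp
#include <cassert>

// Both fields come from one evaluation of __builtin_signbit.
struct SignbitUses {
  int raw;      // whatever zero/nonzero value the builtin produced
  bool is_set;  // the same result, reduced to a truth value
};

SignbitUses read_signbit_once(double x) {
  const int s = __builtin_signbit(x);  // the single call in the abstract machine
  SignbitUses u{s, s != 0};
  // However the call is lowered, the two uses must agree.
  assert((u.raw != 0) == u.is_set);
  return u;
}
```

This is not a reproducer for any of the bugs listed; it only shows the single-call, multiple-use pattern that must stay consistent.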

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH v2, rs6000] Put dg-options before effective target checks

2022-09-01 Thread Segher Boessenkool

On Thu, Sep 01, 2022 at 01:30:18PM +0800, HAO CHEN GUI wrote:
> --- a/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c
> @@ -1,6 +1,10 @@
> -/* { dg-do compile { target { lp64 && has_arch_pwr9 } } } */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power9 -mvsx" } */

-mcpu=power9 already implies -mvsx.  If you would keep -mvsx, that
belongs *after* testing powerpc_vsx_ok.

> +/* { dg-require-effective-target has_arch_ppc64 } */
> +/* { dg-require-effective-target int128 } */
>  /* { dg-require-effective-target powerpc_vsx_ok } */
> -/* { dg-options "-O2 -mvsx" } */
> +/* The test case can be compiled on all platforms with compiling option
> +   -mdejagnu-cpu=power9.  */

Please don't put in comments like this: that is what the code already
*does*, after all :-)

> --- a/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c
> @@ -1,6 +1,8 @@
> -/* { dg-do compile { target { lp64 && {! has_arch_pwr9} } } } */
> -/* { dg-require-effective-target powerpc_vsx_ok } */
>  /* { dg-options "-O2 -mvsx" } */

You cannot add -mvsx without first testing powerpc_vsx_ok (unless it is
guaranteed some other way of course; here, it isn't).

> +/* { dg-do compile { target { ! has_arch_pwr9 } } } */

Please keep dg-do first thing in the file.

> --- a/gcc/testsuite/gcc.target/powerpc/pr93453-1.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr93453-1.c
> @@ -1,5 +1,6 @@
> -/* { dg-do compile { target has_arch_ppc64 } } */
> +/* { dg-do compile } */
>  /* { dg-options "-mdejagnu-cpu=power6 -O2" } */
> +/* { dg-require-effective-target has_arch_ppc64 } */

This is fine, but it doesn't change anything, unless we have a bug.


Segher

Re: [OpenMP, nvptx] Use bar.sync/arrive for barriers when tasking is not used

2022-09-01 Thread Jakub Jelinek via Gcc-patches

On Thu, Sep 01, 2022 at 11:39:42PM +0800, Chung-Lin Tang wrote:
> our work on SPEChpc2021 benchmarks show that, after the fix for PR99555 was 
> committed:
> [libgomp, nvptx] Fix hang in gomp_team_barrier_wait_end
> https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=5ed77fb3ed1ee0289a0ec9499ef52b99b39421f1
> 
> while that patch fixed the hang, there were quite severe performance 
> regressions caused
> by this new barrier code. Under OpenMP target offload mode, Minisweep 
> regressed by about 350%,
> while HPGMG-FV was about 2x slower.
> 
> So the problem was presumably the new barriers, which replaced erroneous but 
> fast bar.sync
> instructions, with correct but really heavy-weight futex_wait/wake operations 
> on the GPU.
> This is probably required for preserving correct task vs. barrier behavior.
> 
> However, the observation is that: when tasks-related functionality are not 
> used at all by
> the team inside an OpenMP target region, and a barrier is just a place to 
> wait for all
> threads to rejoin (no problem of invoking waiting tasks to re-start) a 
> barrier can in that
> case be implemented by simple bar.sync and bar.arrive PTX instructions. That 
> should be
> able to recover most performance the cases that usually matter, e.g. 'omp 
> parallel for' inside
> 'omp target'.
> 
> So the plan is to mark cases where 'tasks are never used'. This patch adds a 
> 'task_never_used'
> flag inside struct gomp_team, initialized to true, and set to false when 
> tasks are added to
> the team. The nvptx specific gomp_team_barrier_wait_end routines can then use 
> simple barrier
> when team->task_never_used remains true on the barrier.

I'll defer the nvptx specific changes to Tom because I'm not familiar enough
with NVPTX.  But I'll certainly object against any changes for this outside
of nvptx.  We don't need or want the task_never_used field and its
maintenance nor GOMP_task_set_used entrypoint in host libgomp.so nor for
NVPTX.
As you use it for many other constructs (master/masked/critical/single -
does omp_set_lock etc. count too?  only one thread acquires the lock, others
don't), it looks very much misnamed, perhaps better talk about thread
divergence or what is the PTX term for it.
Anyway, there is no point to track this all on the host or for amdgcn or
xeon phi offloading, nothing will use that info ever, so it is just wasted
memory and CPU cycles.
I don't understand how it can safely work, because if it needs to fall back
to the fixed behavior for master or single, why isn't user just using
  if (omp_get_thread_num () == 0)
{
  // whatever
}
etc. problematic too?
If it can for some reason work safely, then instead of adding
GOMP_task_set_used calls add some ifn call that will be after IPA folded to
nothing everywhere but on NVPTX and only have those calls on NVPTX, on the
library add some macros for the team->task_never_used tweaks, defined to
nothing except for config/nvptx/*.h and limit the changes to PTX libgomp.a
then.
But I'm afraid a lot of code creates some asymmetric loads, even just a
work-sharing loop, if number of iterations isn't divisible by number of
threads, some threads could do less work, or with dynamic etc. schedules,
...
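
To make the last point concrete, here is a small sketch of a contiguous block partition of a work-sharing loop. It is a generic illustration and not claimed to be libgomp's exact scheduling code; `static_share` and `share_imbalance` are hypothetical names.

```cpp
#include <cassert>

// Iterations given to thread `tid` when `n` iterations are split into
// contiguous blocks over `nthreads` threads; the first n % nthreads
// threads get one extra iteration.
long static_share(long n, int nthreads, int tid) {
  assert(n >= 0 && nthreads > 0 && tid >= 0 && tid < nthreads);
  const long q = n / nthreads;
  const long r = n % nthreads;
  return q + (tid < r ? 1 : 0);
}

// Largest minus smallest share: nonzero means some threads arrive at the
// barrier having done less work than others.
long share_imbalance(long n, int nthreads) {
  return static_share(n, nthreads, 0) - static_share(n, nthreads, nthreads - 1);
}
```

With 10 iterations on 4 threads the shares are 3, 3, 2, 2, so the load is already asymmetric without any master/single construct or tasking involved.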

Jakub

[OpenMP, nvptx] Use bar.sync/arrive for barriers when tasking is not used

2022-09-01 Thread Chung-Lin Tang

Hi, 
our work on the SPEChpc2021 benchmarks shows that, after the fix for PR99555
was committed:
[libgomp, nvptx] Fix hang in gomp_team_barrier_wait_end
https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=5ed77fb3ed1ee0289a0ec9499ef52b99b39421f1

while that patch fixed the hang, there were quite severe performance
regressions caused by this new barrier code. Under OpenMP target offload
mode, Minisweep regressed by about 350%, while HPGMG-FV was about 2x slower.

So the problem was presumably the new barriers, which replaced erroneous but
fast bar.sync instructions with correct but really heavy-weight
futex_wait/wake operations on the GPU. This is probably required for
preserving correct task vs. barrier behavior.

However, the observation is that when task-related functionality is not used
at all by the team inside an OpenMP target region, and a barrier is just a
place to wait for all threads to rejoin (no problem of invoking waiting tasks
to re-start), a barrier can in that case be implemented by simple bar.sync
and bar.arrive PTX instructions. That should be able to recover most of the
performance in the cases that usually matter, e.g. 'omp parallel for' inside
'omp target'.

So the plan is to mark cases where 'tasks are never used'. This patch adds a
'task_never_used' flag inside struct gomp_team, initialized to true, and set
to false when tasks are added to the team. The nvptx specific
gomp_team_barrier_wait_end routines can then use a simple barrier when
team->task_never_used remains true at the barrier.

Some other cases, like the master/masked construct and the single construct,
also need to have task_never_used set to false: because these constructs
inherently create asymmetric loads where only a subset of threads run through
the region (which may or may not use tasking), there may be the case where
different threads wait at the end assuming different task_never_used cases.
For correctness, these constructs must have team->task_never_used
conservatively marked false at the start of the construct.
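
The flag protocol described above can be sketched roughly as follows. This is a simplified model, not the patch itself: `Team` is a stand-in for struct gomp_team, and `note_task_added`, `note_asymmetric_construct` and `choose_barrier` are hypothetical helper names; only the field name task_never_used comes from the description.

```cpp
#include <cassert>

// Hypothetical stand-in for struct gomp_team; only the flag is modelled.
struct Team {
  bool task_never_used = true;  // initialized to true, as described above
};

enum class BarrierKind { Simple, Futex };

// Called when a task is added to the team.
void note_task_added(Team &team) { team.task_never_used = false; }

// Called at the start of master/masked/single, where only a subset of
// threads runs the region: conservatively give up the fast path.
void note_asymmetric_construct(Team &team) { team.task_never_used = false; }

// Barrier-time decision: the bar.sync/bar.arrive style barrier is used only
// while the flag is still true, otherwise the futex-based one.
BarrierKind choose_barrier(const Team &team) {
  return team.task_never_used ? BarrierKind::Simple : BarrierKind::Futex;
}

// Usage: a team that hits a single construct loses the fast path.
void barrier_choice_example() {
  Team t;
  assert(choose_barrier(t) == BarrierKind::Simple);
  note_asymmetric_construct(t);
  assert(choose_barrier(t) == BarrierKind::Futex);
}
```

The flag only ever moves from true to false within a team, so all threads that reach a barrier after the flag was cleared take the same path.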

This patch has been divided into two: the first is the inlining of the
contents of config/linux/bar.c into config/nvptx/bar.c (instead of an
include). This is needed now because some parts of
gomp_team_barrier_wait_[cancel_]end now need nvptx specific adjustments. The
second contains the above described changes.

Tested on powerpc64le-linux and x86_64-linux with nvptx offloading, seeking
approval for trunk.

Thanks,
Chung-Lin

From c2fdc31880d2d040822e8abece015c29a6d7b472 Mon Sep 17 00:00:00 2001
From: Chung-Lin Tang 
Date: Thu, 1 Sep 2022 05:53:49 -0700
Subject: [PATCH 1/2] libgomp: inline config/linux/bar.c into
 config/nvptx/bar.c

Preparing to add nvptx specific modifications to gomp_team_barrier_wait_end,
et al., so change from using an #include of config/linux/bar.c
in config/nvptx/bar.c, to a full copy of the implementation.

2022-09-01  Chung-Lin Tang  

libgomp/ChangeLog:

* config/nvptx/bar.c: Adjust include of "../linux/bar.c" into an
inlining of contents of config/linux/bar.c.
---
 libgomp/config/nvptx/bar.c | 183 -
 1 file changed, 180 insertions(+), 3 deletions(-)

diff --git a/libgomp/config/nvptx/bar.c b/libgomp/config/nvptx/bar.c
index eee2107..a850c22 100644
--- a/libgomp/config/nvptx/bar.c
+++ b/libgomp/config/nvptx/bar.c
@@ -161,6 +161,183 @@ static inline void do_wait (int *addr, int val)
 futex_wait (addr, val);
 }
 
-/* Reuse the linux implementation.  */
-#define GOMP_WAIT_H 1
-#include "../linux/bar.c"
+/* Below is based on the linux implementation.  */
+
+void
+gomp_barrier_wait_end (gomp_barrier_t *bar, gomp_barrier_state_t state)
+{
+  if (__builtin_expect (state & BAR_WAS_LAST, 0))
+{
+  /* Next time we'll be awaiting TOTAL threads again.  */
+  bar->awaited = bar->total;
+  __atomic_store_n (&bar->generation, bar->generation + BAR_INCR,
+   MEMMODEL_RELEASE);
+  futex_wake ((int *) &bar->generation, INT_MAX);
+}
+  else
+{
+  do
+   do_wait ((int *) &bar->generation, state);
+  while (__atomic_load_n (&bar->generation, MEMMODEL_ACQUIRE) == state);
+}
+}
+
+void
+gomp_barrier_wait (gomp_barrier_t *bar)
+{
+  gomp_barrier_wait_end (bar, gomp_barrier_wait_start (bar));
+}
+
+/* Like gomp_barrier_wait, except that if the encountering thread
+   is not the last one to hit the barrier, it returns immediately.
+   The intended usage is that a thread which intends to gomp_barrier_destroy
+   this barrier calls gomp_barrier_wait, while all other threads
+   call gomp_barrier_wait_last.  When gomp_barrier_wait returns,
+   the barrier can be safely destroyed.  */
+
+void
+gomp_barrier_wait_last (gomp_barrier_t *bar)
+{
+  gomp_barrier_state_t state = gomp_barrier_wait_start (bar);
+  if (state & BAR_WAS_LAST)
+gomp_barrier_wait_end (bar, state);
+}
+
+void
+gomp_team_barrier_wake (gomp_barrier_t *bar, int count)
+{
+  futex_wake ((int *) &bar->generation, count == 0 ? INT_MAX :

Re: [PATCH] rs6000/test: Fix bswap64-4.c with has_arch_ppc64 [PR106680]

2022-09-01 Thread Segher Boessenkool

On Thu, Sep 01, 2022 at 05:05:44PM +0800, Kewen.Lin wrote:
> > On Wed, Aug 31, 2022 at 05:33:28PM +0800, Kewen.Lin wrote:
> > *Should* -mpowerpc64  be disabled by -m32?  
> 
> I think the reason to disable -mpowerpc64 at -m32 is that we have
> -mpowerpc64 explicitly specified at -m64 (equivalent behavior).

*Im*plicitly.  Explicit means the user has it on the command line.

> In the current implementation, when -m64 is specified, we set the
> bit OPTION_MASK_POWERPC64 in both opts and opts_set.  Since we
> set OPTION_MASK_POWERPC64 in opts_set for -m64, when we find the
> OPTION_MASK_POWERPC64 is ON in opts_set, we don't know if there
> is one actual cmd-line option -mpowerpc64 or just -m64.

Yes.  That is what _explicit is for :-)

> Without any explicit -mpowerpc64 (and -mno-), I think we all agree
> that -m64 should set OPTION_MASK_POWERPC64 in opts, conversely -m32
> should unset OPTION_MASK_POWERPC64 in opts.

The latter only for OSes that do not handle -mpowerpc64 correctly.

> To make -m32/-m64 and -mpowerpc64 orthogonal, IMHO we should not
> set bit OPTION_MASK_POWERPC64 in opts_set for -m64.

No.  Instead, we should not touch it if the user has explicitly set it
or unset it.  Just like with all other flags :-)
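
A rough sketch of that rule, under simplifying assumptions: `Options`, `MASK_POWERPC64`, `user_sets_powerpc64` and `apply_abi_default` are hypothetical names that only echo opts, opts_set and OPTION_MASK_POWERPC64, and the sketch ignores both the per-OS condition for -m32 and the question of whether "-mno-powerpc64 -m64" should be an error.

```cpp
#include <cassert>

// Hypothetical bit standing in for OPTION_MASK_POWERPC64.
constexpr unsigned MASK_POWERPC64 = 1u << 0;

struct Options {
  unsigned flags = 0;        // current values (the role of "opts")
  unsigned set_by_user = 0;  // bits given on the command line ("opts_set")
};

// The user wrote -mpowerpc64 (on == true) or -mno-powerpc64 (on == false).
void user_sets_powerpc64(Options &o, bool on) {
  o.set_by_user |= MASK_POWERPC64;
  if (on)
    o.flags |= MASK_POWERPC64;
  else
    o.flags &= ~MASK_POWERPC64;
}

// -m64 / -m32 supply a default only when the user made no explicit choice,
// and never record the bit as explicitly set.
void apply_abi_default(Options &o, bool is_64bit) {
  if (o.set_by_user & MASK_POWERPC64)
    return;  // leave an explicit choice alone
  if (is_64bit)
    o.flags |= MASK_POWERPC64;
  else
    o.flags &= ~MASK_POWERPC64;
}

// Usage: an explicit -mpowerpc64 survives a later -m32 default.
void explicit_flag_example() {
  Options o;
  user_sets_powerpc64(o, true);
  apply_abi_default(o, false);
  assert(o.flags & MASK_POWERPC64);
}
```

Keeping the explicit mask untouched by -m32/-m64 is what lets later code tell an actual -mpowerpc64 on the command line apart from a value implied by the ABI switch.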

> I'm not sure
> if there is some particular reason why we set OPTION_MASK_POWERPC64
> in opts_set, I hope no. :)  One possible reason I can imagine is
> that we want to get the cmd-line options "-mno-powerpc64 -m64" not
> raise error, but I think having it error makes more sense.

Agreed.

> btw, I guess the option compatibility isn't a blocking issue
> here, right?

We have survived for years with the status quo, nothing changed recently
that makes it more urgent to fix this.


Segher

Re: [PATCH] rs6000/test: Fix bswap64-4.c with has_arch_ppc64 [PR106680]

2022-09-01 Thread Segher Boessenkool

On Thu, Sep 01, 2022 at 04:57:59PM +0800, Kewen.Lin wrote:
> on 2022/8/31 22:13, Peter Bergner wrote:
> > On 8/31/22 4:33 AM, Kewen.Lin wrote:
> >> @@ -1,7 +1,8 @@
> >>  /* { dg-do compile { target { powerpc*-*-* } } } */
> >>  /* { dg-skip-if "" { powerpc*-*-aix* } } */
> >> -/* { dg-options "-O2 -mpowerpc64" } */
> >>  /* { dg-require-effective-target ilp32 } */
> >> +/* { dg-options "-O2 -mpowerpc64" } */
> >> +/* { dg-require-effective-target has_arch_ppc64 } */
> > 
> > > With many of our recent patches moving the dg-options before any
> > > dg-require-effective-target so it affects the results of the
> > > dg-require-effective-target test, this looks like it's backwards
> > > from that process.  I understand why, so I think we want an explicit
> > > comment here in the test case explaining why it's after in this case.
> > Just so in a few years when we come back to this test case, we
> > won't accidentally undo this change.
> 
> Oops, the diff shows it's like "after", but it's actually still "before". :)
> The dg-options is meant to be placed before the succeeding has_arch_ppc64
> effective target which is supposed to use dg-options to compile.  I felt
> good to let ilp32 checking go first then has_arch_ppc64, so moved dg-option
> downward.

These two are independent, but apparently we have a bug here, which will
make what you did malfunction in some cases -- the test will not run for
ilp32 if you have RUNTESTFLAGS {-m32,-m64}.

It should not make a difference, -mpowerpc64 and -m32 should be wholly
independent, and their order should not matter.  So the order of the
  /* { dg-require-effective-target ilp32 } */
  /* { dg-options "-O2 -mpowerpc64" } */
lines should not make a difference either.  But it does :-(


Segher

[committed] libstdc++: Remove FIXME for ICE with remove_cvref_t in requires-expression

2022-09-01 Thread Jonathan Wakely via Gcc-patches

Tested powerpc64le-linux, pushed to trunk.

-- >8 --

PR c++/99968 is fixed since GCC 12.1 so we can remove the workaround.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_scoped_enum): Remove workaround.
---
 libstdc++-v3/include/std/type_traits | 11 +--
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index 5984442c0aa..5b8314f24fd 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -3534,20 +3534,11 @@ template
 
   template
 requires __is_enum(_Tp)
-&& requires(_Tp __t) { __t = __t; } // fails if incomplete
+&& requires(remove_cv_t<_Tp> __t) { __t = __t; } // fails if incomplete
 struct is_scoped_enum<_Tp>
 : bool_constant
 { };
 
-  // FIXME remove this partial specialization and use remove_cv_t<_Tp> above
-  // when PR c++/99968 is fixed.
-  template
-requires __is_enum(_Tp)
-&& requires(_Tp __t) { __t = __t; } // fails if incomplete
-struct is_scoped_enum
-: bool_constant
-{ };
-
   /// @ingroup variable_templates
   /// @since C++23
   template
-- 
2.37.2

Re: [PATCH] rs6000: Don't ICE when we disassemble an MMA variable [PR101322]

2022-09-01 Thread Segher Boessenkool

On Thu, Sep 01, 2022 at 04:28:56PM +0800, Kewen.Lin wrote:
> tree.def has some note about VIEW_CONVERT_EXPR, it quite matches what Segher 
> replied.
> In my experience, VIEW_CONVERT_EXPR are used a lot for vector type conversion.

It is needed whenever vector types are not compatible otherwise.
V4SI <-> V4SF, V4SI <-> V2DI, etc.  In such cases you effectively do a
bit_cast equivalent.

Segher

Re: [PATCH] Add global REAL_VALUE_TYPE infinities to real.*.

2022-09-01 Thread Jeff Law via Gcc-patches





On 9/1/2022 8:13 AM, Aldy Hernandez via Gcc-patches wrote:
> We're starting to abuse the infinity endpoints in the frange code and
> the associated range operators.  Building infinities is rather cheap,
> and we could even inline them, but I think it's best to just not
> recalculate them all the time.
>
> I see about 20 uses of real_inf in the source code, not including the
> backends.  And I'm about to add more :).
>
> OK pending tests?
>
> gcc/ChangeLog:
>
> * emit-rtl.cc (init_emit_once): Initialize dconstinf and
> dconstninf.
> * real.h: Add dconstinf and dconstninf.

OK
jeff

Re: [PATCH] rs6000: Don't ICE when we disassemble an MMA variable [PR101322]

2022-09-01 Thread Peter Bergner via Gcc-patches

On 9/1/22 3:29 AM, Kewen.Lin wrote:
>> I have no idea why ptr_vector_*_type would behave differently here than
>> build_pointer_type (vector_*_type_node).  Using the build_pointer_type()
>> fixed it for me, so that's why I went with it. :-)  Maybe this is a bug
>> in lto???
> 
> Thanks for your time to reproduce this!
> 
> The only difference is that ptr_vector_*_type are built from the
> qualified_type based on vector_*_type_node, instead of directly from
> vector_*_type_node.  I'm interested to have a further look at this later.

If you look into this, please let me know.  I'd like to know what you
find out.

Peter

[PATCH] Add global REAL_VALUE_TYPE infinities to real.*.

2022-09-01 Thread Aldy Hernandez via Gcc-patches

We're starting to abuse the infinity endpoints in the frange code and
the associated range operators.  Building infinities is rather cheap,
and we could even inline them, but I think it's best to just not
recalculate them all the time.

I see about 20 uses of real_inf in the source code, not including the
backends.  And I'm about to add more :).
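
The idea, in a standalone sketch that uses plain doubles rather than REAL_VALUE_TYPE: the two infinities are built once and every range helper refers to the shared constants. `kPosInf`, `kNegInf`, `Range` and the helpers are illustrative names that only echo dconstinf, dconstninf and the frange builders.

```cpp
#include <cassert>
#include <limits>

// Shared endpoints, built once; the names only echo dconstinf/dconstninf.
const double kPosInf = std::numeric_limits<double>::infinity();
const double kNegInf = -std::numeric_limits<double>::infinity();

struct Range {
  double lo, hi;
};

// (X <= val) produces [-INF, val].
Range build_le(double val) { return Range{kNegInf, val}; }

// (X >= val) produces [val, +INF].
Range build_ge(double val) { return Range{val, kPosInf}; }

// Intersection of two ranges; the caller must ensure they overlap.
Range intersect(const Range &a, const Range &b) {
  Range r{a.lo > b.lo ? a.lo : b.lo, a.hi < b.hi ? a.hi : b.hi};
  assert(r.lo <= r.hi);
  return r;
}
```

Callers no longer need a local temporary plus an initialization call at each use site, which is the kind of boilerplate the constants are meant to remove.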

OK pending tests?

gcc/ChangeLog:

* emit-rtl.cc (init_emit_once): Initialize dconstinf and
dconstninf.
* real.h: Add dconstinf and dconstninf.
---
 gcc/emit-rtl.cc | 5 +
 gcc/real.h  | 2 ++
 2 files changed, 7 insertions(+)

diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
index 3929ee08986..f25fb70ab97 100644
--- a/gcc/emit-rtl.cc
+++ b/gcc/emit-rtl.cc
@@ -107,6 +107,8 @@ REAL_VALUE_TYPE dconst1;
 REAL_VALUE_TYPE dconst2;
 REAL_VALUE_TYPE dconstm1;
 REAL_VALUE_TYPE dconsthalf;
+REAL_VALUE_TYPE dconstinf;
+REAL_VALUE_TYPE dconstninf;
 
 /* Record fixed-point constant 0 and 1.  */
 FIXED_VALUE_TYPE fconst0[MAX_FCONST0];
@@ -6210,6 +6212,9 @@ init_emit_once (void)
   dconsthalf = dconst1;
   SET_REAL_EXP (&dconsthalf, REAL_EXP (&dconsthalf) - 1);
 
+  real_inf (&dconstinf);
+  real_inf (&dconstninf, true);
+
   for (i = 0; i < 3; i++)
 {
   const REAL_VALUE_TYPE *const r =
diff --git a/gcc/real.h b/gcc/real.h
index ec78e8a8932..2f490ef9b72 100644
--- a/gcc/real.h
+++ b/gcc/real.h
@@ -462,6 +462,8 @@ extern REAL_VALUE_TYPE dconst1;
 extern REAL_VALUE_TYPE dconst2;
 extern REAL_VALUE_TYPE dconstm1;
 extern REAL_VALUE_TYPE dconsthalf;
+extern REAL_VALUE_TYPE dconstinf;
+extern REAL_VALUE_TYPE dconstninf;
 
 #define dconst_e() (*dconst_e_ptr ())
 #define dconst_third() (*dconst_third_ptr ())
-- 
2.37.1

Re: [PATCH][V3] 32-bit PA-RISC with HP-UX: remove deprecated ports

2022-09-01 Thread John David Anglin


On 2022-08-31 3:21 a.m., Martin Liška wrote:
> Sending v3 of the patch that includes John's comments.
>
> Ready to be installed?

Okay.

Dave

--
John David Anglin  dave.ang...@bell.net

[PATCH] Remove cycle checking from compute_control_dep_chain

2022-09-01 Thread Richard Biener via Gcc-patches

Now that we have DFS_BACK_EDGE marks we can simply avoid walking
those instead of repeatedly looking for a cycle on the current chain.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

* gimple-predicate-analysis.cc (compute_control_dep_chain):
Remove cycle detection, instead avoid walking backedges.
---
 gcc/gimple-predicate-analysis.cc | 20 +++-
 1 file changed, 7 insertions(+), 13 deletions(-)

diff --git a/gcc/gimple-predicate-analysis.cc b/gcc/gimple-predicate-analysis.cc
index 2982268fdfd..a754ff0a029 100644
--- a/gcc/gimple-predicate-analysis.cc
+++ b/gcc/gimple-predicate-analysis.cc
@@ -1035,18 +1035,6 @@ compute_control_dep_chain (basic_block dom_bb, 
const_basic_block dep_bb,
fprintf (dump_file, "chain length exceeds 5: %u\n", cur_chain_len);
 }
 
-  for (unsigned i = 0; i < cur_chain_len; i++)
-{
-  edge e = cur_cd_chain[i];
-  /* Cycle detected.  */
-  if (e->src == dom_bb)
-   {
- if (dump_file)
-   fprintf (dump_file, "cycle detected\n");
- return false;
-   }
-}
-
   if (DEBUG_PREDICATE_ANALYZER && dump_file)
 fprintf (dump_file,
 "%*s%s (dom_bb = %u, dep_bb = %u, cd_chains = { %s }, ...)\n",
@@ -1061,7 +1049,7 @@ compute_control_dep_chain (basic_block dom_bb, 
const_basic_block dep_bb,
   FOR_EACH_EDGE (e, ei, dom_bb->succs)
 {
   int post_dom_check = 0;
-  if (e->flags & (EDGE_FAKE | EDGE_ABNORMAL))
+  if (e->flags & (EDGE_FAKE | EDGE_ABNORMAL | EDGE_DFS_BACK))
continue;
 
   basic_block cd_bb = e->dest;
@@ -1110,6 +1098,12 @@ compute_control_dep_chain (basic_block dom_bb, 
const_basic_block dep_bb,
  break;
}
 
+ /* The post-dominator walk will reach a backedge only
+from a forwarder, otherwise it should choose to exit
+the SCC.  */
+ if (single_succ_p (cd_bb)
+ && single_succ_edge (cd_bb)->flags & EDGE_DFS_BACK)
+   break;
  cd_bb = get_immediate_dominator (CDI_POST_DOMINATORS, cd_bb);
  post_dom_check++;
  if (cd_bb == EXIT_BLOCK_PTR_FOR_FN (cfun)
-- 
2.35.3

[PATCH] Some predicate analysis TLC

2022-09-01 Thread Richard Biener via Gcc-patches

The following hides some internal details of compute_control_dep_chain.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

* gimple-predicate-analysis.cc (compute_control_dep_chain):
New wrapping overload.
(uninit_analysis::init_use_preds): Simplify.
(uninit_analysis::init_from_phi_def): Likewise.
---
 gcc/gimple-predicate-analysis.cc | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/gcc/gimple-predicate-analysis.cc b/gcc/gimple-predicate-analysis.cc
index eb1e11cead8..2982268fdfd 100644
--- a/gcc/gimple-predicate-analysis.cc
+++ b/gcc/gimple-predicate-analysis.cc
@@ -1124,6 +1124,18 @@ compute_control_dep_chain (basic_block dom_bb, 
const_basic_block dep_bb,
   return found_cd_chain;
 }
 
+static bool
+compute_control_dep_chain (basic_block dom_bb, const_basic_block dep_bb,
+  vec cd_chains[], unsigned *num_chains,
+  unsigned in_region = 0)
+{
+  auto_vec cur_cd_chain;
+  unsigned num_calls = 0;
+  unsigned depth = 0;
+  return compute_control_dep_chain (dom_bb, dep_bb, cd_chains, num_chains,
+   cur_cd_chain, &num_calls, in_region, depth);
+}
+
 /* Implemented simplifications:
 
1) ((x IOR y) != 0) AND (x != 0) is equivalent to (x != 0);
@@ -1919,13 +1931,10 @@ uninit_analysis::init_use_preds (predicate _preds, 
basic_block def_bb,
  Each DEP_CHAINS element is a series of edges whose conditions
  are logical conjunctions.  Together, the DEP_CHAINS vector is
  used below to initialize an OR expression of the conjunctions.  */
-  unsigned num_calls = 0;
   unsigned num_chains = 0;
   auto_vec dep_chains[MAX_NUM_CHAINS];
-  auto_vec cur_chain;
 
-  if (!compute_control_dep_chain (cd_root, use_bb, dep_chains, &num_chains,
- cur_chain, &num_calls))
+  if (!compute_control_dep_chain (cd_root, use_bb, dep_chains, &num_chains))
 {
   gcc_assert (num_chains == 0);
   simple_control_dep_chain (dep_chains[0], cd_root, use_bb);
@@ -2023,14 +2032,12 @@ uninit_analysis::init_from_phi_def (gphi *phi)
 
   unsigned num_chains = 0;
   auto_vec dep_chains[MAX_NUM_CHAINS];
-  auto_vec cur_chain;
   for (unsigned i = 0; i < nedges; i++)
 {
   edge e = def_edges[i];
-  unsigned num_calls = 0;
   unsigned prev_nc = num_chains;
   compute_control_dep_chain (cd_root, e->src, dep_chains,
-&num_chains, cur_chain, &num_calls, in_region);
+&num_chains, in_region);
 
   /* Update the newly added chains with the phi operand edge.  */
   if (EDGE_COUNT (e->src->succs) > 1)
-- 
2.35.3

[pushed] c++: set TYPE_STRING_FLAG for char8_t

2022-09-01 Thread Jason Merrill via Gcc-patches

While looking at the DWARF handling of char8_t I wondered why we weren't
setting TYPE_STRING_FLAG on it.  I hoped that setting that flag would be an
easy fix for PR102958, but it doesn't seem to be sufficient.  It still
seems correct, though.

I also tried setting the flag on char16_t and char32_t, but that broke
because braced_list_to_string assumes char-sized elements.  Since we don't
set the flag on wchar_t, I abandoned that idea.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/c-family/ChangeLog:

* c-common.cc (c_common_nodes_and_builtins): Set TYPE_STRING_FLAG on
char8_t.
(braced_list_to_string): Check for char-sized elements.
---
 gcc/c-family/c-common.cc | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 71fe7305369..9b07a1cbae3 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -4550,6 +4550,7 @@ c_common_nodes_and_builtins (void)
   if (c_dialect_cxx ())
 {
   char8_type_node = make_unsigned_type (char8_type_size);
+  TYPE_STRING_FLAG (char8_type_node) = true;
 
   if (flag_char8_t)
 record_builtin_type (RID_CHAR8, "char8_t", char8_type_node);
@@ -9343,12 +9344,15 @@ braced_list_to_string (tree type, tree ctor, bool 
member)
   if (!member && !tree_fits_uhwi_p (typesize))
 return ctor;
 
-  /* If the target char size differes from the host char size, we'd risk
+  /* If the target char size differs from the host char size, we'd risk
  loosing data and getting object sizes wrong by converting to
  host chars.  */
   if (TYPE_PRECISION (char_type_node) != CHAR_BIT)
 return ctor;
 
+  /* STRING_CST doesn't support wide characters.  */
+  gcc_checking_assert (TYPE_PRECISION (TREE_TYPE (type)) == CHAR_BIT);
+
   /* If the array has an explicit bound, use it to constrain the size
  of the string.  If it doesn't, be sure to create a string that's
  as long as implied by the index of the last zero specified via

base-commit: 25dd2768afdb8fad7b11d511eb5f739958f9870d
-- 
2.31.1

Re: [PATCH 2/3] rename DBX_REGISTER_NUMBER to DEBUGGER_REGISTER_NUMBER

2022-09-01 Thread Michael Matz via Gcc-patches

Hello,

okay, I'll bite :)  DBG_REGISTER_NUMBER?  DEBUGGER_REGNO?


Ciao,
Michael.

Re: Modula-2: merge followup (brief update on the progress of the new linking implementation)

2022-09-01 Thread Gaius Mulley via Gcc-patches

Martin Liška  writes:

> So please fix the crash I reported and I can convert GM2 texi manual.

will do,

regards,
Gaius

Re: [PATCH 3/3] pdp11: no debugging info

2022-09-01 Thread Richard Biener via Gcc-patches

On Thu, Sep 1, 2022 at 12:06 PM Martin Liška  wrote:
>

OK.

> gcc/ChangeLog:
>
> * config/pdp11/pdp11.h (PREFERRED_DEBUGGING_TYPE): Disable
> debugging format.
> ---
>  gcc/config/pdp11/pdp11.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/pdp11/pdp11.h b/gcc/config/pdp11/pdp11.h
> index 55e0625e6ea..d783b36b652 100644
> --- a/gcc/config/pdp11/pdp11.h
> +++ b/gcc/config/pdp11/pdp11.h
> @@ -49,8 +49,9 @@ along with GCC; see the file COPYING3.  If not see
>  }  \
>while (0)
>
> +#undef PREFERRED_DEBUGGING_TYPE
> +#define PREFERRED_DEBUGGING_TYPE NO_DEBUG
>
> -/* Generate debugger debugging information.  */
>  #define TARGET_40_PLUS (TARGET_40 || TARGET_45)
>  #define TARGET_10  (! TARGET_40_PLUS)
>
> --
> 2.37.2
>

[COMMITTED] Add signbit property to frange to better model signed zeros.

2022-09-01 Thread Aldy Hernandez via Gcc-patches

As discussed here:

https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600656.html

This adds an frange property to keep track of the sign bit.  We keep
it updated at all times, but we don't use it to make any decisions when
!HONOR_SIGNED_ZEROS.

With this property we can now query the range for the appropriate sign
with frange::get_signbit ().  Possible values are yes, no, and unknown.

We had some old notes in PR24021 that indicated what was important in
the FP world: finite, signed zeros, normalized, in the range [-1,1] for
trig functions, etc.  I think frange now has enough to model everything
we care about.
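
As a rough illustration of the yes/no/unknown idea (a standalone sketch,
not the code in value-range.cc; the names Tri, tri_union, tri_intersect
and signbit_from_bounds are made up for this example), a tri-state
property might combine like this:

```cpp
#include <cassert>

// Hypothetical tri-state property, loosely mirroring the yes/no/unknown
// values described above.  This is not GCC's fp_prop class.
enum class Tri { Unknown, Yes, No };

// Union of two ranges: the property stays known only if both sides agree.
Tri tri_union (Tri a, Tri b)
{
  return a == b ? a : Tri::Unknown;
}

// Intersection: a known value refines an unknown one.  Returns false when
// the sides contradict each other (Yes vs. No), i.e. nothing satisfies both.
bool tri_intersect (Tri a, Tri b, Tri &out)
{
  if (a == Tri::Unknown) { out = b; return true; }
  if (b == Tri::Unknown || a == b) { out = a; return true; }
  return false;
}

// Sign bit implied by the bounds of a NaN-free range [lo, hi].  A range
// that touches zero is left Unknown, since -0.0 and +0.0 compare equal
// but differ in sign.
Tri signbit_from_bounds (double lo, double hi)
{
  assert (lo <= hi);
  if (hi < 0.0) return Tri::Yes;
  if (lo > 0.0) return Tri::No;
  return Tri::Unknown;
}
```

Under these assumptions a range such as [5.0, 10.0] yields Tri::No,
while any range that may contain a zero stays Tri::Unknown, which is the
conservative choice when signed zeros are honored.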

gcc/ChangeLog:

* range-op-float.cc (foperator_equal::op1_range): Do not copy sign
bit.
(foperator_not_equal::op1_range): Same.
* value-query.cc (range_query::get_tree_range): Set sign bit.
* value-range-pretty-print.cc (vrange_printer::visit): Dump sign bit.
* value-range.cc (frange::set_signbit): New.
(frange::set): Adjust for sign bit.
(frange::normalize_kind): Same.
(frange::union_): Remove useless comment.
(frange::intersect): Same.
(frange::contains_p): Adjust for sign bit.
(frange::singleton_p): Same.
(frange::verify_range): Same.
(range_tests_signbit): New tests.
(range_tests_floats): Call range_tests_signbit.
* value-range.h (class frange_props): Add signbit
(class frange): Same.
---
 gcc/range-op-float.cc   |   6 +
 gcc/value-query.cc  |  27 +++--
 gcc/value-range-pretty-print.cc |   1 +
 gcc/value-range.cc  | 200 +++-
 gcc/value-range.h   |   4 +
 5 files changed, 204 insertions(+), 34 deletions(-)

diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index c30f2af391c..2f1af4055c3 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -361,6 +361,9 @@ foperator_equal::op1_range (frange , tree type,
 case BRS_TRUE:
   // If it's true, the result is the same as OP2.
   r = op2;
+  // Make sure we don't copy the sign bit if we may have a zero.
+  if (HONOR_SIGNED_ZEROS (type) && r.contains_p (build_zero_cst (type)))
+   r.set_signbit (fp_prop::VARYING);
   // The TRUE side of op1 == op2 implies op1 is !NAN.
   r.set_nan (fp_prop::NO);
   break;
@@ -462,6 +465,9 @@ foperator_not_equal::op1_range (frange , tree type,
 case BRS_FALSE:
   // If it's false, the result is the same as OP2.
   r = op2;
+  // Make sure we don't copy the sign bit if we may have a zero.
+  if (HONOR_SIGNED_ZEROS (type) && r.contains_p (build_zero_cst (type)))
+   r.set_signbit (fp_prop::VARYING);
   // The FALSE side of op1 != op2 implies op1 is !NAN.
   r.set_nan (fp_prop::NO);
   break;
diff --git a/gcc/value-query.cc b/gcc/value-query.cc
index 4637fb409e5..201f679a36e 100644
--- a/gcc/value-query.cc
+++ b/gcc/value-query.cc
@@ -217,14 +217,25 @@ range_query::get_tree_range (vrange , tree expr, gimple 
*stmt)
   return true;
 
 case REAL_CST:
-  if (TREE_OVERFLOW_P (expr))
-   expr = drop_tree_overflow (expr);
-  r.set (expr, expr);
-  if (real_isnan (TREE_REAL_CST_PTR (expr)))
-   as_a <frange> (r).set_nan (fp_prop::YES);
-  else
-   as_a <frange> (r).set_nan (fp_prop::NO);
-  return true;
+  {
+   if (TREE_OVERFLOW_P (expr))
+ expr = drop_tree_overflow (expr);
+
+   frange &f = as_a <frange> (r);
+   f.set (expr, expr);
+
+   // Singletons from the tree world have known properties.
+   REAL_VALUE_TYPE *rv = TREE_REAL_CST_PTR (expr);
+   if (real_isnan (rv))
+ f.set_nan (fp_prop::YES);
+   else
+ f.set_nan (fp_prop::NO);
+   if (real_isneg (rv))
+ f.set_signbit (fp_prop::YES);
+   else
+ f.set_signbit (fp_prop::NO);
+   return true;
+  }
 
 case SSA_NAME:
   gimple_range_global (r, expr);
diff --git a/gcc/value-range-pretty-print.cc b/gcc/value-range-pretty-print.cc
index e66d56da29d..93e18d3c1b6 100644
--- a/gcc/value-range-pretty-print.cc
+++ b/gcc/value-range-pretty-print.cc
@@ -146,6 +146,7 @@ vrange_printer::visit (const frange ) const
   pp_string (pp, "] ");
 
   print_frange_prop ("NAN", r.get_nan ());
+  print_frange_prop ("SIGN", r.get_signbit ());
 }
 
 // Print the FP properties in an frange.
diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 3c7d4cb84b9..71581b2c54d 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -288,6 +288,58 @@ frange::set_nan (fp_prop::kind k)
 
   m_props.set_nan (k);
   normalize_kind ();
+  if (flag_checking)
+verify_range ();
+}
+
+// Set the SIGNBIT property.  Adjust the range if appropriate.
+
+void
+frange::set_signbit (fp_prop::kind k)
+{
+  gcc_checking_assert (m_type);
+
+  // No additional adjustments are needed for a NAN.
+  if (get_nan ().yes_p ())
+{
+  m_props.set_signbit (k);
+  return;
+}
+  // Ignore sign changes

[COMMITTED] Implement ranger folder for __builtin_signbit.

2022-09-01 Thread Aldy Hernandez via Gcc-patches

Now that we keep track of the signbit, we can use it to fold __builtin_signbit.

I am assuming I don't have to try too hard to get the actual signbit
number and 1 will do, especially since trunk is inconsistent about whether
we fold the builtin or calculate it at runtime.

abulafia:~$ cat a.c
float nzero = -0.0;

main(){
printf("0x%x\n", __builtin_signbit(-0.0));
printf("0x%x\n", __builtin_signbit(nzero));
}
abulafia:~$ gcc a.c -w && ./a.out
0x1
0x8000

It is amazing that we've been failing to fold something as simple as
this:

if (x > 5.0)
  num = __builtin_signbit (x);

It does the right thing now :-P.
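
For readers who want to reproduce the idea outside the testsuite, here
is a small standalone sketch (the helper names signbit_flag and
signbit_when_above_five are invented for this example).  It only relies
on __builtin_signbit returning nonzero when the sign bit is set, so it
collapses the result to 0 or 1 rather than depending on the exact value:

```c
#include <assert.h>

/* Collapse the builtin's "nonzero means set" result to exactly 0 or 1.  */
int signbit_flag (float x)
{
  return __builtin_signbit (x) != 0;
}

/* Mirrors the guarded use above: once x > 5.0 holds, the sign bit must
   be clear, which is the fact a range-based fold can exploit.  Returns
   -1 when the guard is not taken.  */
int signbit_when_above_five (float x)
{
  if (x > 5.0f)
    {
      int s = signbit_flag (x);
      assert (s == 0);
      return s;
    }
  return -1;
}
```

Calling signbit_flag (-0.0f) gives 1 and signbit_flag (0.0f) gives 0,
regardless of whether the call was folded or evaluated at runtime.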

gcc/ChangeLog:

* gimple-range-fold.cc
(fold_using_range::range_of_builtin_int_call): Add case for
CFN_BUILT_IN_SIGNBIT.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/vrp-float-signbit-1.c: New test.
---
 gcc/gimple-range-fold.cc  | 20 +++
 .../gcc.dg/tree-ssa/vrp-float-signbit-1.c | 12 +++
 2 files changed, 32 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-signbit-1.c

diff --git a/gcc/gimple-range-fold.cc b/gcc/gimple-range-fold.cc
index b0b22106320..d8497fc9be7 100644
--- a/gcc/gimple-range-fold.cc
+++ b/gcc/gimple-range-fold.cc
@@ -1023,6 +1023,26 @@ fold_using_range::range_of_builtin_int_call (irange , 
gcall *call,
break;
   }
 
+case CFN_BUILT_IN_SIGNBIT:
+  {
+   arg = gimple_call_arg (call, 0);
+   frange tmp;
+   if (src.get_operand (tmp, arg))
+ {
+   if (tmp.get_signbit ().varying_p ())
+ return false;
+   if (tmp.get_signbit ().yes_p ())
+ {
+   tree one = build_one_cst (type);
+   r.set (one, one);
+ }
+   else
+ r.set_zero (type);
+   return true;
+ }
+   break;
+  }
+
 case CFN_BUILT_IN_TOUPPER:
   {
arg = gimple_call_arg (call, 0);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-signbit-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-signbit-1.c
new file mode 100644
index 000..3fa783ec460
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-signbit-1.c
@@ -0,0 +1,12 @@
+// { dg-do compile }
+// { dg-options "-O2 -fdump-tree-evrp" }
+
+int num;
+
+void func(float x)
+{
+  if (x > 5.0)
+num = __builtin_signbit (x);
+}
+
+// { dg-final { scan-tree-dump-times "num = 0;" 1 "evrp" } }
-- 
2.37.1

Re: [PATCH] c++, v2: Implement C++23 P2071R2 - Named universal character escapes [PR106648]

2022-09-01 Thread Jakub Jelinek via Gcc-patches

On Wed, Aug 31, 2022 at 12:14:22PM -0400, Jason Merrill wrote:
> On 8/31/22 11:07, Jakub Jelinek wrote:
> > On Wed, Aug 31, 2022 at 10:52:49AM -0400, Jason Merrill wrote:
> > > It could be more explicit, but I think we can assume that from the 
> > > existing
> > > wording; it says it designates the named character.  If there is no such
> > > character, that cannot be satisfied, so it must be ill-formed.
> > 
> > Ok.
> > 
> > > > So, we could reject the int h case above and accept silently the others?
> > > 
> > > Why not warn on the others?
> > 
> > We were always silent for the cases like \u123X or \U12345X.
> > Do you think we should emit some warnings (but never pedwarns/errors in that
> > case) that it is universal character name like but not completely?
> 
> I think that would be helpful, at least for \u{ and \N{.

Ok.

> > Given what you said above, I think that is what we want for the last 2
> > for C++23, the question is if it is ok also for C++20/C17 etc. and whether
> > it should depend on -pedantic or -pedantic-errors or GNU vs. ISO mode
> > or not in that case.  We could handle those 2 also differently, just
> > warn instead of error for the \N{ABC} case if not in C++23 mode when
> > identifier_pos.
> 
> That sounds right.
> 
> > Here is an incremental version of the patch which will make valid
> > \u{123} and \N{LATIN SMALL LETTER A WITH ACUTE} an extension in GNU
> > modes before C++23 and split it as separate tokens in ISO modes.
> 
> Looks good.

Here is a patch which implements that.
I just wonder if we shouldn't have some warning option that would cover
these warnings; currently one needs to use -w to disable them.

Apparently clang uses the -Wunicode option to cover these, but unfortunately
they don't bother to document it (nor almost any other warning option),
so it is unclear what else exactly it covers.  There is also the question
of how we should document that option for GCC...

2022-09-01  Jakub Jelinek  

* charset.cc (_cpp_valid_ucn): In possible identifier contexts, don't
handle \u{ or \N{ specially in -std=c* modes except -std=c++2{3,b}.
In possible identifier contexts, don't emit an error and punt
if \N isn't followed by {, or if \N{} surrounds some lower case
letters or _.  In possible identifier contexts when not C++23, don't
emit an error but warning about unknown character names and treat as
separate tokens.  When treating as separate tokens \u{ or \N{, emit
warnings.

* c-c++-common/cpp/delimited-escape-seq-4.c: New test.
* c-c++-common/cpp/delimited-escape-seq-5.c: New test.
* c-c++-common/cpp/named-universal-char-escape-5.c: New test.
* c-c++-common/cpp/named-universal-char-escape-6.c: New test.
* g++.dg/cpp23/named-universal-char-escape1.C: New test.
* g++.dg/cpp23/named-universal-char-escape2.C: New test.

--- libcpp/charset.cc.jj   2022-09-01 09:47:24.146886929 +0200
+++ libcpp/charset.cc   2022-09-01 12:52:28.424034208 +0200
@@ -1448,7 +1448,11 @@ _cpp_valid_ucn (cpp_reader *pfile, const
   if (str[-1] == 'u')
 {
   length = 4;
-  if (str < limit && *str == '{')
+  if (str < limit
+ && *str == '{'
+ && (!identifier_pos
+ || CPP_OPTION (pfile, delimited_escape_seqs)
+ || !CPP_OPTION (pfile, std)))
{
  str++;
  /* Magic value to indicate no digits seen.  */
@@ -1462,8 +1466,22 @@ _cpp_valid_ucn (cpp_reader *pfile, const
   else if (str[-1] == 'N')
 {
   length = 4;
+  if (identifier_pos
+ && !CPP_OPTION (pfile, delimited_escape_seqs)
+ && CPP_OPTION (pfile, std))
+   {
+ *cp = 0;
+ return false;
+   }
   if (str == limit || *str != '{')
-   cpp_error (pfile, CPP_DL_ERROR, "'\\N' not followed by '{'");
+   {
+ if (identifier_pos)
+   {
+ *cp = 0;
+ return false;
+   }
+ cpp_error (pfile, CPP_DL_ERROR, "'\\N' not followed by '{'");
+   }
   else
{
  str++;
@@ -1489,8 +1507,16 @@ _cpp_valid_ucn (cpp_reader *pfile, const
 
  if (str < limit && *str == '}')
{
- if (name == str && identifier_pos)
+ if (identifier_pos && (name == str || !strict))
{
+ if (name == str)
+   cpp_warning (pfile, CPP_W_NONE,
+"empty named universal character escape "
+"sequence; treating it as separate tokens");
+ else
+   cpp_warning (pfile, CPP_W_NONE,
+"incomplete named universal character escape "
+"sequence; treating it as separate tokens");
  *cp = 0;
  return false;
}
@@ -1515,27 +1541,48 @@ _cpp_valid_ucn (cpp_reader *pfile, const

Re: [PATCH 2/2] libstdc++: Implement ranges::adjacent_transform_view from P2321R2

2022-09-01 Thread Jonathan Wakely via Gcc-patches

On Tue, 30 Aug 2022 at 18:14, Patrick Palka via Libstdc++
 wrote:
>
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

OK


>
> libstdc++-v3/ChangeLog:
>
> * include/std/ranges (__detail::__unarize): Define.
> (adjacent_view::_Iterator): Befriend adjacent_transform_view.
> (adjacent_transform_view): Define.
> (adjacent_transform_view::_Iterator): Define.
> (adjacent_transform_view::_Sentinel): Define.
> (views::__detail::__can_adjacent_transform_view): Define.
> (views::_AdjacentTransform): Define.
> (views::adjacent_transform): Define.
> (views::pairwise): Define.
> * testsuite/std/ranges/adaptors/adjacent_transform/1.cc: New test.
> ---
>  libstdc++-v3/include/std/ranges   | 342 ++
>  .../ranges/adaptors/adjacent_transform/1.cc   | 106 ++
>  2 files changed, 448 insertions(+)
>  create mode 100644 
> libstdc++-v3/testsuite/std/ranges/adaptors/adjacent_transform/1.cc
>
> diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
> index 4fb879a088c..3a7f0545030 100644
> --- a/libstdc++-v3/include/std/ranges
> +++ b/libstdc++-v3/include/std/ranges
> @@ -5158,6 +5158,20 @@ namespace views::__adaptor
>  // Yields tuple<_Tp, ..., _Tp> with _Nm elements.
>  template
>using __repeated_tuple = 
> decltype(std::tuple_cat(std::declval>()));
> +
> +// For a functor F that takes N arguments, the expression 
> declval<__unarize>(x)
> +// is equivalent to declval(x, ..., x).
> +template
> +  struct __unarize
> +  {
> +   template
> + static invoke_result_t<_Fp, _Ts...>
> + __tuple_apply(const tuple<_Ts...>&); // not defined
> +
> +   template
> + decltype(__tuple_apply(std::declval<__repeated_tuple<_Tp, _Nm>>()))
> + operator()(_Tp&&); // not defined
> +  };
>}
>
>template
> @@ -5205,6 +5219,13 @@ namespace views::__adaptor
>
>  friend class adjacent_view;
>
> +template
> +  requires view<_Wp> && (_Mm > 0) && is_object_v<_Fp>
> +&& regular_invocable<__detail::__unarize<_Fp&, _Mm>, 
> range_reference_t<_Wp>>
> +&& 
> std::__detail::__can_reference,
> +
> range_reference_t<_Wp>>>
> +  friend class adjacent_transform_view;
> +
>public:
>  using iterator_category = input_iterator_tag;
>  using iterator_concept = decltype(_S_iter_concept());
> @@ -5440,6 +5461,327 @@ namespace views::__adaptor
>
>  inline constexpr auto pairwise = adjacent<2>;
>}
> +
> +  template
> +   requires view<_Vp> && (_Nm > 0) && is_object_v<_Fp>
> + && regular_invocable<__detail::__unarize<_Fp&, _Nm>, 
> range_reference_t<_Vp>>
> + && 
> std::__detail::__can_reference,
> +  
> range_reference_t<_Vp>>>
> +  class adjacent_transform_view : public 
> view_interface>
> +  {
> +[[no_unique_address]] __detail::__box<_Fp> _M_fun;
> +adjacent_view<_Vp, _Nm> _M_inner;
> +
> +using _InnerView = adjacent_view<_Vp, _Nm>;
> +
> +template
> +  using _InnerIter = iterator_t<__detail::__maybe_const_t<_Const, 
> _InnerView>>;
> +
> +template
> +  using _InnerSent = sentinel_t<__detail::__maybe_const_t<_Const, 
> _InnerView>>;
> +
> +template class _Iterator;
> +template class _Sentinel;
> +
> +  public:
> +adjacent_transform_view() = default;
> +
> +constexpr explicit
> +adjacent_transform_view(_Vp __base, _Fp __fun)
> +  : _M_fun(std::move(__fun)), _M_inner(std::move(__base))
> +{ }
> +
> +constexpr auto
> +begin()
> +{ return _Iterator(*this, _M_inner.begin()); }
> +
> +constexpr auto
> +begin() const
> +  requires range
> +   && regular_invocable<__detail::__unarize,
> +range_reference_t>
> +{ return _Iterator(*this, _M_inner.begin()); }
> +
> +constexpr auto
> +end()
> +{
> +  if constexpr (common_range<_InnerView>)
> +return _Iterator(*this, _M_inner.end());
> +  else
> +return _Sentinel(_M_inner.end());
> +}
> +
> +constexpr auto
> +end() const
> +  requires range
> +   && regular_invocable<__detail::__unarize,
> +range_reference_t>
> +{
> +  if constexpr (common_range)
> +return _Iterator(*this, _M_inner.end());
> +  else
> +return _Sentinel(_M_inner.end());
> +}
> +
> +constexpr auto
> +size() requires sized_range<_InnerView>
> +{ return _M_inner.size(); }
> +
> +constexpr auto
> +size() const requires sized_range
> +{ return _M_inner.size(); }
> +  };
> +
> +  template
> +   requires view<_Vp> && (_Nm > 0) && is_object_v<_Fp>
> + && regular_invocable<__detail::__unarize<_Fp&, _Nm>, 
> range_reference_t<_Vp>>
> + && 
> std::__detail::__can_reference,
> +

[wwwdocs] Re: unreachable intro to gcc page linked to on readings page

2022-09-01 Thread Jonathan Wakely via Gcc-patches

On Mon, 29 Aug 2022 at 11:31, Gerald Pfeifer wrote:
>
> On Wed, 24 Aug 2022, Jonathan Wakely wrote:
> >> a broken link points to
> >>
> >>   An introduction to GCC by Brian J. Gough.
> >>   . http://www.network-theory.co.uk/gcc/intro/
> > There are much more recent archived copies like
> > https://web.archive.org/web/20181113021321/http://www.network-theory.co.uk/gcc/intro/
> > I'm not sure it's worth updating the link to an archived copy of that
> > page, because all the links for buying a PDF or printed copy are
> > probably dead now anyway.
> >
> > We could link to https://archive.org/details/B-001-002-835 instead, or
> > to the archived HTML version. The newest capture of the HTML version
> > seems to be this, although I didn't check that all pages are archived:
> > https://web.archive.org/web/20181206025406/http://www.network-theory.co.uk/docs/gccintro/
> > My preference would be to link to the latter. Although it's quite
> > dated, the sections on basic compilation and compiler flags are still
> > relevant for beginners.
> >
> > Gerald?
>
> I searched around a bit myself (since indeed the original and printed
> versions seem to be gone) and landed at
>
>https://archive.org/details/B-001-002-835
>
> as well. I probably would have gone for that, though the
> web.archive.org/web link you found works equally if you want to point
> there instead.

Why not both?  I've pushed the attached patch to [wwwdocs].
commit 0e4c9a39789b6dbcd44b2e0d4a42b5885d3bddb2
Author: Jonathan Wakely 
Date:   Thu Sep 1 10:57:46 2022 +0100

Replace stale link to the "An Introduction to GCC" book

diff --git a/htdocs/readings.html b/htdocs/readings.html
index 4269e9f0..5d1b78e8 100644
--- a/htdocs/readings.html
+++ b/htdocs/readings.html
@@ -22,8 +22,9 @@
 
 
 
-  http://www.network-theory.co.uk/gcc/intro/;>An Introduction
-  to GCC by Brian J. Gough.
+  https://web.archive.org/web/20060106062936/http://www.network-theory.co.uk/gcc/intro/;>An
 Introduction
+  to GCC by Brian J. Gough
+  (https://archive.org/details/B-001-002-835/;>e-book).
 
   https://en.wikibooks.org/wiki/GNU_C_Compiler_Internals;>GNU C 
Compiler Internals (Wikibook), numerous contributors.

[PATCH] LoongArch: add -mdirect-extern-access option

2022-09-01 Thread Xi Ruoyao via Gcc-patches

We'd like to introduce a new codegen option that matches the behavior of
the old "-Wa,-mla-global-with-pcrel" and avoids a performance and size
regression when building the Linux kernel with a new-reloc toolchain.  It
should also be useful for building statically linked executables, firmware
(EDK2, for example), and other OS kernels.

OK for trunk?

-- >8 --

As a new target, LoongArch does not use copy relocations, as they are
problematic in some circumstances.  One bad consequence is that we emit
GOT accesses for all extern objects with default visibility.  The GOT is
not needed in statically linked executables, OS kernels, etc.  In those
environments the GOT entry just wastes space, and the GOT access just
slows down execution.

Before -mexplicit-relocs, we used "-Wa,-mla-global-with-pcrel" to tell
the assembler not to use the GOT for extern accesses.  But with
-mexplicit-relocs, we have to implement that logic in GCC itself.

The name "-mdirect-extern-access" is borrowed from the x86 port.
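
To make the affected pattern concrete, here is a minimal sketch of the
kind of access this option is about (the names counter and read_counter
are made up for illustration, and this is not the content of the new
direct-extern tests).  On LoongArch one would compile it with
"-mexplicit-relocs -mdirect-extern-access" and inspect the assembly for
GOT references:

```c
#include <assert.h>

/* Declared extern: in a real build the definition would live in another
   translation unit, so the compiler cannot assume from this declaration
   alone that the symbol binds locally.  */
extern int counter;

/* The load of an extern object with default visibility is the access
   that would otherwise go through the GOT.  */
int read_counter (void)
{
  return counter;
}

/* Defined here only so that the sketch links as a single file; this is
   an assumption of the example, not part of the scenario described.  */
int counter = 42;
```

Because the option changes how extern symbols are addressed, it is only
appropriate where the final link resolves them directly, which is why
the patch rejects it when compiling a shared library.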

gcc/ChangeLog:

* config/loongarch/genopts/loongarch.opt.in: Add
-mdirect-extern-access option.
* config/loongarch/loongarch.opt: Regenerate.
* config/loongarch/loongarch.cc (loongarch_classify_symbol):
Don't use SYMBOL_GOT_DISP if TARGET_DIRECT_EXTERN_ACCESS.
(loongarch_option_override_internal): Complain if
-mdirect-extern-access is used with -fPIC or -fpic.
* doc/invoke.texi: Document -mdirect-extern-access for
LoongArch.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/direct-extern-1.c: New test.
* gcc.target/loongarch/direct-extern-2.c: New test.
---
 gcc/config/loongarch/genopts/loongarch.opt.in |  4 
 gcc/config/loongarch/loongarch.cc |  5 -
 gcc/config/loongarch/loongarch.opt|  4 
 gcc/doc/invoke.texi   | 15 +++
 .../gcc.target/loongarch/direct-extern-1.c|  6 ++
 .../gcc.target/loongarch/direct-extern-2.c|  6 ++
 6 files changed, 39 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/direct-extern-1.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/direct-extern-2.c

diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in 
b/gcc/config/loongarch/genopts/loongarch.opt.in
index ebdd9538d48..e10618777b2 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -184,3 +184,7 @@ Enum(cmodel) String(@@STR_CMODEL_EXTREME@@) 
Value(CMODEL_EXTREME)
 mcmodel=
 Target RejectNegative Joined Enum(cmodel) Var(la_opt_cmodel) 
Init(CMODEL_NORMAL)
 Specify the code model.
+
+mdirect-extern-access
+Target Var(TARGET_DIRECT_EXTERN_ACCESS) Init(0)
+Avoid using the GOT to access external symbols.
diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 77e3a105390..2875fa5b0f3 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -1642,7 +1642,7 @@ loongarch_classify_symbol (const_rtx x)
   if (SYMBOL_REF_TLS_MODEL (x))
 return SYMBOL_TLS;
 
-  if (!loongarch_symbol_binds_local_p (x))
+  if (!TARGET_DIRECT_EXTERN_ACCESS && !loongarch_symbol_binds_local_p (x))
 return SYMBOL_GOT_DISP;
 
   tree t = SYMBOL_REF_DECL (x);
@@ -6093,6 +6093,9 @@ loongarch_option_override_internal (struct gcc_options 
*opts)
   if (loongarch_branch_cost == 0)
 loongarch_branch_cost = loongarch_cost->branch_cost;
 
+  if (TARGET_DIRECT_EXTERN_ACCESS && flag_shlib)
+error ("%qs cannot be used for compiling a shared library",
+  "-mdirect-extern-access");
 
   switch (la_target.cmodel)
 {
diff --git a/gcc/config/loongarch/loongarch.opt 
b/gcc/config/loongarch/loongarch.opt
index 6395234218b..96c811c850b 100644
--- a/gcc/config/loongarch/loongarch.opt
+++ b/gcc/config/loongarch/loongarch.opt
@@ -191,3 +191,7 @@ Enum(cmodel) String(extreme) Value(CMODEL_EXTREME)
 mcmodel=
 Target RejectNegative Joined Enum(cmodel) Var(la_opt_cmodel) 
Init(CMODEL_NORMAL)
 Specify the code model.
+
+mdirect-extern-access
+Target Var(TARGET_DIRECT_EXTERN_ACCESS) Init(0)
+Avoid using the GOT to access external symbols.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e5eb525a2c1..d4e86682827 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1016,6 +1016,7 @@ Objective-C and Objective-C++ Dialects}.
 -memcpy  -mno-memcpy -mstrict-align -mno-strict-align @gol
 -mmax-inline-memcpy-size=@var{n} @gol
 -mexplicit-relocs -mno-explicit-relocs @gol
+-mdirect-extern-access -mno-direct-extern-access @gol
 -mcmodel=@var{code-model}}
 
 @emph{M32R/D Options}
@@ -25100,6 +25101,20 @@ GCC build-time by detecting corresponding assembler 
support:
 @code{-mno-explicit-relocs} otherwise.  This option is mostly useful for
 debugging, or interoperation with assemblers different from the build-time
 one.
+
+@item -mdirect-extern-access
+@itemx -mno-direct-extern-access
+@opindex mdirect-extern-access
+Do not use or

Re: [PATCH] vect: Try to remove single-vector permutes from SLP graph

2022-09-01 Thread Richard Biener via Gcc-patches

On Thu, 1 Sep 2022, Richard Sandiford wrote:

> This patch extends the SLP layout optimisation pass so that it
> tries to remove layout changes that are brought about by permutes
> of existing vectors.  This fixes the bb-slp-pr54400.c regression on
> x86_64 and also means that we can remove the permutes in cases like:
> 
> typedef float v4sf __attribute__((vector_size(sizeof(float)*4)));
> 
> float __attribute__((noipa))
> f(v4sf v0, v4sf v1)
> {
>   return v0[0]*v1[0]+v0[1]*v1[1]+v0[2]*v1[2]+v0[3]*v1[3];
> }
> 
> The new test is a simple adaption of bb-slp-pr54400.c, with the
> same style of markup.
> 
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

OK.

Thanks,
Richard.

> 
> Richard
> 
> PS. Sorry for missing the regression during testing.  The initial
> version of the new bb-slp-layout* tests had markup that led
> to a lot of FAILs on x86_64.  I fixed the markup to avoid those,
> but didn't notice this extra (unrelated) failure at the end.
> 
> 
> gcc/
>   * tree-vect-slp.cc (vect_build_slp_tree_2): When building a
>   VEC_PERM_EXPR of an existing vector, set the SLP_TREE_LANES
>   to the number of vector elements, if that's a known constant.
>   (vect_optimize_slp_pass::is_compatible_layout): Remove associated
>   comment about zero SLP_TREE_LANES.
>   (vect_optimize_slp_pass::start_choosing_layouts): Iterate over
>   all partition members when looking for potential layouts.
>   Handle existing permutes of fixed-length vectors.
> 
> gcc/testsuite/
>   * gcc.dg/vect/bb-slp-pr54400.c: Extend to aarch64.
>   * gcc.dg/vect/bb-slp-layout-18.c: New test.
> ---
>  gcc/testsuite/gcc.dg/vect/bb-slp-layout-18.c | 15 +
>  gcc/testsuite/gcc.dg/vect/bb-slp-pr54400.c   |  4 +-
>  gcc/tree-vect-slp.cc | 67 +---
>  3 files changed, 61 insertions(+), 25 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-layout-18.c
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-layout-18.c 
> b/gcc/testsuite/gcc.dg/vect/bb-slp-layout-18.c
> new file mode 100644
> index 000..ff462722507
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-layout-18.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target vect_float} */
> +/* { dg-additional-options "-w -Wno-psabi -ffast-math" } */
> +
> +typedef float v4sf __attribute__((vector_size(sizeof(float)*4)));
> +
> +float __attribute__((noipa))
> +f(v4sf v0, v4sf v1)
> +{
> +  return v0[0]*v1[0]+v0[1]*v1[1]+v0[2]*v1[2]+v0[3]*v1[3];
> +}
> +
> +/* We are lacking an effective target for .REDUC_PLUS support.  */
> +/* { dg-final { scan-tree-dump-times "basic block part vectorized" 1 "slp2" 
> { target x86_64-*-* aarch64*-*-* } } } */
> +/* { dg-final { scan-tree-dump-not " = VEC_PERM_EXPR" "slp2" { target 
> x86_64-*-* aarch64*-*-* } } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr54400.c 
> b/gcc/testsuite/gcc.dg/vect/bb-slp-pr54400.c
> index 6b427aac774..8aec2092f4d 100644
> --- a/gcc/testsuite/gcc.dg/vect/bb-slp-pr54400.c
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr54400.c
> @@ -39,5 +39,5 @@ main ()
>  }
>  
>  /* We are lacking an effective target for .REDUC_PLUS support.  */
> -/* { dg-final { scan-tree-dump-times "basic block part vectorized" 3 "slp2" 
> { target x86_64-*-* } } } */
> -/* { dg-final { scan-tree-dump-not " = VEC_PERM_EXPR" "slp2" { target 
> x86_64-*-* } } } */
> +/* { dg-final { scan-tree-dump-times "basic block part vectorized" 3 "slp2" 
> { target x86_64-*-* aarch64*-*-* } } } */
> +/* { dg-final { scan-tree-dump-not " = VEC_PERM_EXPR" "slp2" { target 
> x86_64-*-* aarch64*-*-* } } } */
> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
> index 226550635cc..cc04ef350a6 100644
> --- a/gcc/tree-vect-slp.cc
> +++ b/gcc/tree-vect-slp.cc
> @@ -1840,6 +1840,10 @@ vect_build_slp_tree_2 (vec_info *vinfo, slp_tree node,
>TREE_TYPE (TREE_TYPE (vec;
> SLP_TREE_VECTYPE (vnode) = TREE_TYPE (vec);
>   }
> +  auto nunits = TYPE_VECTOR_SUBPARTS (SLP_TREE_VECTYPE (vnode));
> +  unsigned HOST_WIDE_INT const_nunits;
> +  if (nunits.is_constant (_nunits))
> + SLP_TREE_LANES (vnode) = const_nunits;
>SLP_TREE_VEC_DEFS (vnode).safe_push (vec);
>/* We are always building a permutation node even if it is an identity
>permute to shield the rest of the vectorizer from the odd node
> @@ -4325,8 +4329,6 @@ vect_optimize_slp_pass::is_compatible_layout (slp_tree 
> node,
>if (layout_i == 0)
>  return true;
>  
> -  /* SLP_TREE_LANES is zero for existing vectors, but those only support
> - layout 0 anyway.  */
>if (SLP_TREE_LANES (node) != m_perms[layout_i].length ())
>  return false;
>  
> @@ -4521,38 +4523,57 @@ vect_optimize_slp_pass::start_choosing_layouts ()
>m_perms.safe_push (vNULL);
>  
>/* Create layouts from existing permutations.  */
> -  for (unsigned int node_i : m_leafs)

[PATCH] vect: Try to remove single-vector permutes from SLP graph

2022-09-01 Thread Richard Sandiford via Gcc-patches

This patch extends the SLP layout optimisation pass so that it
tries to remove layout changes that are brought about by permutes
of existing vectors.  This fixes the bb-slp-pr54400.c regression on
x86_64 and also means that we can remove the permutes in cases like:

typedef float v4sf __attribute__((vector_size(sizeof(float)*4)));

float __attribute__((noipa))
f(v4sf v0, v4sf v1)
{
  return v0[0]*v1[0]+v0[1]*v1[1]+v0[2]*v1[2]+v0[3]*v1[3];
}

The new test is a simple adaption of bb-slp-pr54400.c, with the
same style of markup.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard

PS. Sorry for missing the regression during testing.  The initial
version of the new bb-slp-layout* tests had markup that led
to a lot of FAILs on x86_64.  I fixed the markup to avoid those,
but didn't notice this extra (unrelated) failure at the end.


gcc/
* tree-vect-slp.cc (vect_build_slp_tree_2): When building a
VEC_PERM_EXPR of an existing vector, set the SLP_TREE_LANES
to the number of vector elements, if that's a known constant.
(vect_optimize_slp_pass::is_compatible_layout): Remove associated
comment about zero SLP_TREE_LANES.
(vect_optimize_slp_pass::start_choosing_layouts): Iterate over
all partition members when looking for potential layouts.
Handle existing permutes of fixed-length vectors.

gcc/testsuite/
* gcc.dg/vect/bb-slp-pr54400.c: Extend to aarch64.
* gcc.dg/vect/bb-slp-layout-18.c: New test.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-layout-18.c | 15 +
 gcc/testsuite/gcc.dg/vect/bb-slp-pr54400.c   |  4 +-
 gcc/tree-vect-slp.cc | 67 +---
 3 files changed, 61 insertions(+), 25 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-layout-18.c

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-layout-18.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-layout-18.c
new file mode 100644
index 000..ff462722507
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-layout-18.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_float} */
+/* { dg-additional-options "-w -Wno-psabi -ffast-math" } */
+
+typedef float v4sf __attribute__((vector_size(sizeof(float)*4)));
+
+float __attribute__((noipa))
+f(v4sf v0, v4sf v1)
+{
+  return v0[0]*v1[0]+v0[1]*v1[1]+v0[2]*v1[2]+v0[3]*v1[3];
+}
+
+/* We are lacking an effective target for .REDUC_PLUS support.  */
+/* { dg-final { scan-tree-dump-times "basic block part vectorized" 1 "slp2" { 
target x86_64-*-* aarch64*-*-* } } } */
+/* { dg-final { scan-tree-dump-not " = VEC_PERM_EXPR" "slp2" { target 
x86_64-*-* aarch64*-*-* } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr54400.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-pr54400.c
index 6b427aac774..8aec2092f4d 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-pr54400.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr54400.c
@@ -39,5 +39,5 @@ main ()
 }
 
 /* We are lacking an effective target for .REDUC_PLUS support.  */
-/* { dg-final { scan-tree-dump-times "basic block part vectorized" 3 "slp2" { 
target x86_64-*-* } } } */
-/* { dg-final { scan-tree-dump-not " = VEC_PERM_EXPR" "slp2" { target 
x86_64-*-* } } } */
+/* { dg-final { scan-tree-dump-times "basic block part vectorized" 3 "slp2" { 
target x86_64-*-* aarch64*-*-* } } } */
+/* { dg-final { scan-tree-dump-not " = VEC_PERM_EXPR" "slp2" { target 
x86_64-*-* aarch64*-*-* } } } */
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 226550635cc..cc04ef350a6 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -1840,6 +1840,10 @@ vect_build_slp_tree_2 (vec_info *vinfo, slp_tree node,
 TREE_TYPE (TREE_TYPE (vec))));
  SLP_TREE_VECTYPE (vnode) = TREE_TYPE (vec);
}
+  auto nunits = TYPE_VECTOR_SUBPARTS (SLP_TREE_VECTYPE (vnode));
+  unsigned HOST_WIDE_INT const_nunits;
+  if (nunits.is_constant (&const_nunits))
+   SLP_TREE_LANES (vnode) = const_nunits;
   SLP_TREE_VEC_DEFS (vnode).safe_push (vec);
   /* We are always building a permutation node even if it is an identity
 permute to shield the rest of the vectorizer from the odd node
@@ -4325,8 +4329,6 @@ vect_optimize_slp_pass::is_compatible_layout (slp_tree 
node,
   if (layout_i == 0)
 return true;
 
-  /* SLP_TREE_LANES is zero for existing vectors, but those only support
- layout 0 anyway.  */
   if (SLP_TREE_LANES (node) != m_perms[layout_i].length ())
 return false;
 
@@ -4521,38 +4523,57 @@ vect_optimize_slp_pass::start_choosing_layouts ()
   m_perms.safe_push (vNULL);
 
   /* Create layouts from existing permutations.  */
-  for (unsigned int node_i : m_leafs)
+  auto_load_permutation_t tmp_perm;
+  for (unsigned int node_i : m_partitioned_nodes)
 {
-  auto &vertex = m_vertices[node_i];
-  if (vertex.partition < 0)
-   continue;
-
   /* Leafs also double as entries to the reverse graph.  Allow the
 layout of

Re: [committed] Fix more problems with new linker warnings

2022-09-01 Thread Martin Liška

On 8/31/22 17:49, Jeff Law wrote:
> 
> 
> On 8/22/2022 3:39 AM, Martin Liška wrote:
>> On 4/28/22 18:10, Jeff Law via Gcc-patches wrote:
>>> As I mentioned in the original thread, my change to pr94157_0 was an 
>>> attempt to avoid these warnings by passing a magic flag to the linker.  Of 
>>> course we may not be using GNU ld.  Or we may be on a non-elf target where 
>>> the flag I used doesn't exist.  Or we may even be on a ELF target where 
>>> those bits weren't added to the linker (frv).  Furthermore, we need fixes 
>>> to all the nested function tests as well.
>>>
>>> So even though I initially resisted pruning the warning, that seems like 
>>> the best course of action.  So this patch removes my recent change to 
>>> pr94157_0 and instead uses our pruning facilities.
>>>
>>> I'll be pushing this to the trunk and gcc-12 branch.
>>>
>>> Jeff
>> Hello.
>>
>> I noticed this patch during my GCC test-suite run with mold linker. As you 
>> likely know, the linker defaults
>> to non-executable stack and so one sees test-suite crashes (not only 
>> warnings) [1].
>>
>> So the question is if we want to explicitly fix all tests that rely on 
>> execstack? Or is it something
>> we can assume as it's what GNU linkers do?
>>
>> List of affected tests:
>> https://gist.githubusercontent.com/marxin/aadb75408a5a7867bf9fb26e879ce4c4/raw/aff2a0e4559e2dba8ea358520ca836eda6e7dc70/gistfile1.txt
> The problem I ran into was that there wasn't a good way to determine what to 
> do, even if we know the test was going to need execstack. We can't just 
> blindly pass the magic flag to the linker -- at the least that would need to 
> be conditional on the linker being used as well as the target as some of the 
> ELF targets don't have the linker infrastructure.  And given that the linker 
> can vary across gnu-ld, gold, mold, it's a rat's nest.

Makes sense. So far, the simplest approach seems to me to be modifying mold
to allow execstack. Unfortunately,
the author does not want to introduce a new configure option.

Martin

> 
> jeff
>

[PATCH 2/3] rename DBX_REGISTER_NUMBER to DEBUGGER_REGISTER_NUMBER

2022-09-01 Thread Martin Liška

gcc/ada/ChangeLog:

* sigtramp-vxworks-target.h: Rename DBX_REGISTER_NUMBER to
  DEBUGGER_REGISTER_NUMBER.

gcc/ChangeLog:

* config/aarch64/aarch64-protos.h (aarch64_dbx_register_number):
Rename DBX_REGISTER_NUMBER to DEBUGGER_REGISTER_NUMBER.
(aarch64_debugger_register_number): Likewise.
* config/aarch64/aarch64.cc (aarch64_dbx_register_number): Likewise.
(aarch64_debugger_register_number): Likewise.
* config/aarch64/aarch64.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
(DWARF_FRAME_REGNUM): Likewise.
* config/alpha/alpha.h (DWARF_FRAME_REGNUM): Likewise.
* config/arc/arc.cc (arc_init_reg_tables): Likewise.
* config/arc/arc.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/arm/arm-protos.h (arm_dbx_register_number): Likewise.
(arm_debugger_register_number): Likewise.
* config/arm/arm.cc (arm_dbx_register_number): Likewise.
(arm_debugger_register_number): Likewise.
* config/arm/arm.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/bfin/bfin.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/c6x/c6x.cc: Likewise.
* config/c6x/c6x.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/cris/cris.h (enum reg_class): Likewise.
(DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/csky/csky.cc (enum reg_class): Likewise.
* config/csky/csky.h (DWARF_FRAME_REGNUM): Likewise.
(DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/frv/frv.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/gcn/gcn-hsa.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/gcn/gcn.cc (print_operand): Likewise.
* config/i386/bsd.h (ASM_QUAD): Likewise.
* config/i386/cygming.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
(DWARF_FRAME_REGNUM): Likewise.
* config/i386/darwin.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/i386/djgpp.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/i386/dragonfly.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/i386/freebsd.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/i386/gas.h: Likewise.
* config/i386/gnu-user.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/i386/i386.cc (enum reg_class): Likewise.
* config/i386/i386.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/i386/i386elf.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/i386/iamcu.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/i386/lynx.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/i386/netbsd-elf.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/i386/nto.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/i386/openbsdelf.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/i386/sysv4.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/i386/vxworks.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/i386/x86-64.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/ia64/ia64-protos.h (ia64_dbx_register_number): Likewise.
(ia64_debugger_register_number): Likewise.
* config/ia64/ia64.cc (ia64_output_function_prologue): Likewise.
(ia64_dbx_register_number): Likewise.
(ia64_debugger_register_number): Likewise.
(process_cfa_adjust_cfa): Likewise.
(process_cfa_register): Likewise.
(ia64_asm_unwind_emit): Likewise.
* config/ia64/ia64.h: Likewise.
* config/ia64/sysv4.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/lm32/lm32.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/m32c/m32c.cc (m32c_eh_return_stackadj_rtx): Likewise.
* config/m32c/m32c.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
* config/m68k/linux.h (DBX_REGISTER_NUMBER): Likewise.
(DEBUGGER_REGISTER_NUMBER): Likewise.
*

[PATCH 3/3] pdp11: no debugging info

2022-09-01 Thread Martin Liška

gcc/ChangeLog:

* config/pdp11/pdp11.h (PREFERRED_DEBUGGING_TYPE): Disable
debugging format.
---
 gcc/config/pdp11/pdp11.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/config/pdp11/pdp11.h b/gcc/config/pdp11/pdp11.h
index 55e0625e6ea..d783b36b652 100644
--- a/gcc/config/pdp11/pdp11.h
+++ b/gcc/config/pdp11/pdp11.h
@@ -49,8 +49,9 @@ along with GCC; see the file COPYING3.  If not see
 }  \
   while (0)
 
+#undef PREFERRED_DEBUGGING_TYPE
+#define PREFERRED_DEBUGGING_TYPE NO_DEBUG
 
-/* Generate debugger debugging information.  */
 #define TARGET_40_PLUS (TARGET_40 || TARGET_45)
 #define TARGET_10  (! TARGET_40_PLUS)
 
-- 
2.37.2

Re: -Wformat-overflow handling for %b and %B directives in C2X standard

2022-09-01 Thread Даниил Александрович Фролов via Gcc-patches

From eb9e8241d99145020ec5c050c918c1ad3abc2701 Mon Sep 17 00:00:00 2001
From: Frolov Daniil 
Date: Thu, 1 Sep 2022 10:55:01 +0300
Subject: [PATCH] Support %b, %B for -Wformat-overflow (sprintf, snprintf)

gcc/ChangeLog:

	* gimple-ssa-sprintf.cc (fmtresult::type_max_digits): Handle
	base == 2.
	(tree_digits): Likewise.
	(format_integer): Likewise.
	(parse_directive): Add cases for %b and %B directives.

gcc/testsuite/ChangeLog:

	* gcc.dg/Wformat-overflow1.c: New test.
---
 gcc/gimple-ssa-sprintf.cc| 41 +++-
 gcc/testsuite/gcc.dg/Wformat-overflow1.c | 28 
 2 files changed, 53 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/Wformat-overflow1.c
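
For readers following the length arithmetic in the patch below, here is a
small stand-alone sketch of the digit and prefix counting it relies on.
It is not code from gimple-ssa-sprintf.cc: count_digits and
alt_form_length are hypothetical helpers, and the model ignores width and
precision entirely.

```cpp
#include <cstdint>

// Digits needed to print VALUE in BASE; zero still takes one digit.
constexpr unsigned count_digits(std::uint64_t value, unsigned base)
{
  unsigned n = 1;
  while (value >= base)
    {
      value /= base;
      ++n;
    }
  return n;
}

// Length of VALUE printed with the '#' flag and no width or precision:
// nonzero octal gains a leading "0", nonzero hex or binary gains a
// two-character "0x" or "0b" prefix.
constexpr unsigned alt_form_length(std::uint64_t value, unsigned base)
{
  unsigned len = count_digits(value, base);
  if (value != 0)
    {
      if (base == 8)
        len += 1;
      else if (base == 16 || base == 2)
        len += 2;
    }
  return len;
}

// A 32-bit value needs at most 32 binary digits, i.e. its precision,
// and at most (32 + 2) / 3 == 11 octal digits.
static_assert(count_digits(0xffffffffu, 2) == 32, "binary digits");
static_assert(count_digits(0xffffffffu, 8) == 11, "octal digits");
static_assert(alt_form_length(5, 2) == 5, "0b101 is five bytes");
```

The binary case is the simplest of the bases: the worst-case digit count
equals the type's precision, which matches the new "case 2" returning prec.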

diff --git a/gcc/gimple-ssa-sprintf.cc b/gcc/gimple-ssa-sprintf.cc
index a888b5ac7d5..1dd9b0dc46b 100644
--- a/gcc/gimple-ssa-sprintf.cc
+++ b/gcc/gimple-ssa-sprintf.cc
@@ -535,6 +535,8 @@ fmtresult::type_max_digits (tree type, int base)
   unsigned prec = TYPE_PRECISION (type);
   switch (base)
 {
+case 2:
+  return prec;
 case 8:
   return (prec + 2) / 3;
 case 10:
@@ -804,9 +806,9 @@ ilog (unsigned HOST_WIDE_INT x, int base)
 /* Return the number of bytes resulting from converting into a string
the INTEGER_CST tree node X in BASE with a minimum of PREC digits.
PLUS indicates whether 1 for a plus sign should be added for positive
-   numbers, and PREFIX whether the length of an octal ('O') or hexadecimal
-   ('0x') prefix should be added for nonzero numbers.  Return -1 if X cannot
-   be represented.  */
+   numbers, and PREFIX whether the length of an octal ('0') or hexadecimal
+   ('0x') or binary ('0b') prefix should be added for nonzero numbers.
+   Return -1 if X cannot be represented.  */
 
 static HOST_WIDE_INT
 tree_digits (tree x, int base, HOST_WIDE_INT prec, bool plus, bool prefix)
@@ -857,11 +859,11 @@ tree_digits (tree x, int base, HOST_WIDE_INT prec, bool plus, bool prefix)
 
   /* Adjust a non-zero value for the base prefix, either hexadecimal,
  or, unless precision has resulted in a leading zero, also octal.  */
-  if (prefix && absval && (base == 16 || prec <= ndigs))
+  if (prefix && absval)
 {
-  if (base == 8)
+  if (base == 8 && prec <= ndigs)
 	res += 1;
-  else if (base == 16)
+  else if (base == 16 || base == 2) /* 0x...(0X...) or 0b...(0B...).  */
 	res += 2;
 }
 
@@ -1209,7 +1211,7 @@ format_integer (const directive &dir, tree arg, pointer_query &ptr_qry)
 
   /* True when a conversion is preceded by a prefix indicating the base
  of the argument (octal or hexadecimal).  */
-  bool maybebase = dir.get_flag ('#');
+  const bool maybebase = dir.get_flag ('#');
 
   /* True when a signed conversion is preceded by a sign or space.  */
   bool maybesign = false;
@@ -1229,6 +1231,10 @@ format_integer (const directive &dir, tree arg, pointer_query &ptr_qry)
 case 'u':
   base = 10;
   break;
+case 'b':
+case 'B':
+  base = 2;
+  break;
 case 'o':
   base = 8;
   break;
@@ -1240,6 +1246,8 @@ format_integer (const directive &dir, tree arg, pointer_query &ptr_qry)
   gcc_unreachable ();
 }
 
+  const unsigned adj = (sign | maybebase) + (base == 2 || base == 16);
+
   /* The type of the "formal" argument expected by the directive.  */
   tree dirtype = NULL_TREE;
 
@@ -1350,11 +1358,9 @@ format_integer (const directive &dir, tree arg, pointer_query &ptr_qry)
   res.range.unlikely = res.range.max;
 
   /* Bump up the counters if WIDTH is greater than LEN.  */
-  res.adjust_for_width_or_precision (dir.width, dirtype, base,
-	 (sign | maybebase) + (base == 16));
+  res.adjust_for_width_or_precision (dir.width, dirtype, base, adj);
   /* Bump up the counters again if PRECision is greater still.  */
-  res.adjust_for_width_or_precision (dir.prec, dirtype, base,
-	 (sign | maybebase) + (base == 16));
+  res.adjust_for_width_or_precision (dir.prec, dirtype, base, adj);
 
   return res;
 }
@@ -1503,17 +1509,15 @@ format_integer (const directive &dir, tree arg, pointer_query &ptr_qry)
 	  if (res.range.min == 1)
 	res.range.likely += base == 8 ? 1 : 2;
 	  else if (res.range.min == 2
-		   && base == 16
+		   && (base == 16 || base == 2)
 		   && (dir.width[0] == 2 || dir.prec[0] == 2))
 	++res.range.likely;
 	}
 }
 
   res.range.unlikely = res.range.max;
-  res.adjust_for_width_or_precision (dir.width, dirtype, base,
- (sign | maybebase) + (base == 16));
-  res.adjust_for_width_or_precision (dir.prec, dirtype, base,
- (sign | maybebase) + (base == 16));
+  res.adjust_for_width_or_precision (dir.width, dirtype, base, adj);
+  res.adjust_for_width_or_precision (dir.prec, dirtype, base, adj);
 
   return res;
 }
@@ -3725,6 +3729,11 @@ parse_directive (call_info &info,
   dir.fmtfunc = format_integer;
   break;
 
+case 'b':
+case 'B':
+  dir.fmtfunc = format_integer;
+  break;
+
 case 'p':
   /* The %p output is

Re: [PATCH v2, rs6000] Put dg-options before effective target checks

2022-09-01 Thread Kewen.Lin via Gcc-patches

Hi Haochen,

on 2022/9/1 13:30, HAO CHEN GUI wrote:
> Hi,
>   This patch changes the sequence of test directives for 3 test cases.
> Originally, these 3 cases failed or were unsupported on some platforms, as
> their effective target checks depend on compiling options.
> 

Thanks for the updated patch!

I just noticed that all three test cases seem to suffer from the empty
TU error in those has_arch* effective target checks?

If so, it looks like we don't need to bother with this once patch [1]
lands?

Sorry, I didn't notice and ask when reviewing the previous version.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598748.html

BR,
Kewen

>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
> Is this okay for trunk? Any recommendations? Thanks a lot.
> 
> Thanks
> Gui Haochen
> 
> ChangeLog
> 2022-08-31  Haochen Gui  
> 
> rs6000: Change the sequence of test directives for some test cases.  Put
> dg-options before effective target checks as those has_arch_* adopt
> current_compiler_flags in their checks and rely on compiling options to get an
> accurate check.  dg-options setting before dg-require-effective-target are
> added into current_compiler_flags, but not added if they're after.  So
> adjusting the location of dg-options makes the check more robust.
> 
> gcc/testsuite/
>   * gcc.target/powerpc/pr92398.p9+.c: Put dg-options before effective
>   target check.  Replace lp64 check with has_arch_ppc64 and int128.
>   * gcc.target/powerpc/pr92398.p9-.c: Likewise.
>   * gcc.target/powerpc/pr93453-1.c: Put dg-options before effective
>   target check.
> 
> 
> patch.diff
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c 
> b/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c
> index 72dd1d9a274..b4f5c7f4b82 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c
> @@ -1,6 +1,10 @@
> -/* { dg-do compile { target { lp64 && has_arch_pwr9 } } } */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power9 -mvsx" } */
> +/* { dg-require-effective-target has_arch_ppc64 } */
> +/* { dg-require-effective-target int128 } */
>  /* { dg-require-effective-target powerpc_vsx_ok } */
> -/* { dg-options "-O2 -mvsx" } */
> +/* The test case can be compiled on all platforms with compiling option
> +   -mdejagnu-cpu=power9.  */
> 
>  /* { dg-final { scan-assembler-times {\mmtvsrdd\M} 1 } } */
>  /* { dg-final { scan-assembler-times {\mxxlnor\M} 1 } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c 
> b/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c
> index bd7fa98af51..4e6a8c8cb8e 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c
> @@ -1,6 +1,8 @@
> -/* { dg-do compile { target { lp64 && {! has_arch_pwr9} } } } */
> -/* { dg-require-effective-target powerpc_vsx_ok } */
>  /* { dg-options "-O2 -mvsx" } */
> +/* { dg-do compile { target { ! has_arch_pwr9 } } } */
> +/* { dg-require-effective-target int128 } */
> +/* { dg-require-effective-target has_arch_ppc64 } */
> +/* { dg-require-effective-target powerpc_vsx_ok } */
> 
>  /* { dg-final { scan-assembler-times {\mnot\M} 2 { xfail be } } } */
>  /* { dg-final { scan-assembler-times {\mstd\M} 2 { xfail { { {! 
> has_arch_pwr9} && has_arch_pwr8 } && be } } } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr93453-1.c 
> b/gcc/testsuite/gcc.target/powerpc/pr93453-1.c
> index b396458ba12..6f4d899c114 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr93453-1.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr93453-1.c
> @@ -1,5 +1,6 @@
> -/* { dg-do compile { target has_arch_ppc64 } } */
> +/* { dg-do compile } */
>  /* { dg-options "-mdejagnu-cpu=power6 -O2" } */
> +/* { dg-require-effective-target has_arch_ppc64 } */
> 
>  unsigned long load_byte_reverse (unsigned long *in)
>  {

[committed] libstdc++: Optimize array traits

2022-09-01 Thread Jonathan Wakely via Gcc-patches

Tested powerpc64le-linux, pushed to trunk.

This is the first in a series of patches to optimize compile time for
the contents of <type_traits>.

-- >8 --

Improve compile times by avoiding unnecessary class template
instantiations.

__is_array_known_bounds and __is_array_unknown_bounds can be defined
without instantiating extent, by providing partial specializations for
the true cases.

std::extent can avoid recursing down through a multidimensional array,
so it stops after providing the result. Previously, asking for the extent
of an outer dimension would also instantiate extent for the element types
beneath it.

std::is_array_v can use partial specializations to avoid instantiating
std::is_array, and similarly for std::rank_v and std::extent_v.

std::is_bounded_array_v and std::is_unbounded_array_v can also use
partial specializations, and then the class templates can be defined in
terms of the variable templates. This makes sense for these traits,
because they are new in C++20 and so the variable templates are always
available, which isn't true in general for C++11 and C++14 traits.
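
The partial-specialization technique described above can be sketched
outside the library as follows. The my_ names are hypothetical stand-ins
chosen for this note, not the libstdc++ definitions, and the sketch
assumes C++17 for inline variables.

```cpp
#include <cstddef>

// Primary templates give the "not an array" answers; the partial
// specializations pattern-match array types directly, so no class
// template has to be instantiated to get the value.
template <typename T>
inline constexpr bool my_is_array_v = false;
template <typename T>
inline constexpr bool my_is_array_v<T[]> = true;
template <typename T, std::size_t N>
inline constexpr bool my_is_array_v<T[N]> = true;

template <typename T>
inline constexpr std::size_t my_rank_v = 0;
template <typename T, std::size_t N>
inline constexpr std::size_t my_rank_v<T[N]> = 1 + my_rank_v<T>;
template <typename T>
inline constexpr std::size_t my_rank_v<T[]> = 1 + my_rank_v<T>;

// The "index 0" specializations stop the recursion as soon as the
// requested dimension is reached.
template <typename T, unsigned I = 0>
inline constexpr std::size_t my_extent_v = 0;
template <typename T, std::size_t N>
inline constexpr std::size_t my_extent_v<T[N], 0> = N;
template <typename T, unsigned I, std::size_t N>
inline constexpr std::size_t my_extent_v<T[N], I> = my_extent_v<T, I - 1>;
template <typename T>
inline constexpr std::size_t my_extent_v<T[], 0> = 0;
template <typename T, unsigned I>
inline constexpr std::size_t my_extent_v<T[], I> = my_extent_v<T, I - 1>;

static_assert(my_is_array_v<int[3]> && !my_is_array_v<int>, "is_array");
static_assert(my_rank_v<int[2][3]> == 2, "rank");
static_assert(my_extent_v<int[2][3], 1> == 3, "extent of inner dimension");
```

Asking for dimension 0 of int[2][3] here matches the <T[N], 0>
specialization immediately and never looks at int[3].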

libstdc++-v3/ChangeLog:

* include/std/type_traits (__is_array_known_bounds): Add partial
specialization instead of using std::extent.
(__is_array_unknown_bounds): Likewise.
(extent): Add partial specializations to stop recursion after
the result is found.
(is_array_v): Add partial specializations instead of
instantiating the class template.
(rank_v, extent_v): Likewise.
(is_bounded_array_v, is_unbounded_array_v): Likewise.
(is_bounded_array, is_unbounded_array): Define in terms of the
variable templates.
---
 libstdc++-v3/include/std/type_traits | 102 ++-
 1 file changed, 69 insertions(+), 33 deletions(-)

diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index c2f5cb9c806..5984442c0aa 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -867,21 +867,28 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 auto declval() noexcept -> decltype(__declval<_Tp>(0));
 
-  template
-struct extent;
-
   template
 struct remove_all_extents;
 
   /// @cond undocumented
   template
 struct __is_array_known_bounds
-: public integral_constant::value > 0)>
+: public false_type
+{ };
+
+  template
+struct __is_array_known_bounds<_Tp[_Size]>
+: public true_type
 { };
 
   template
 struct __is_array_unknown_bounds
-: public __and_, __not_>>
+: public false_type
+{ };
+
+  template
+struct __is_array_unknown_bounds<_Tp[]>
+: public true_type
 { };
 
   // Destructible and constructible type properties.
@@ -1430,23 +1437,25 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : public integral_constant::value> { };
 
   /// extent
-  template
+  template
 struct extent
-: public integral_constant { };
+: public integral_constant { };
 
-  template
+  template
+struct extent<_Tp[_Size], 0>
+: public integral_constant { };
+
+  template
 struct extent<_Tp[_Size], _Uint>
-: public integral_constant::value>
-{ };
+: public extent<_Tp, _Uint - 1>::type { };
+
+  template
+struct extent<_Tp[], 0>
+: public integral_constant { };
 
   template
 struct extent<_Tp[], _Uint>
-: public integral_constant::value>
-{ };
+: public extent<_Tp, _Uint - 1>::type { };
 
 
   // Type relations.
@@ -3133,8 +3142,14 @@ template 
   inline constexpr bool is_integral_v = is_integral<_Tp>::value;
 template 
   inline constexpr bool is_floating_point_v = is_floating_point<_Tp>::value;
+
 template 
-  inline constexpr bool is_array_v = is_array<_Tp>::value;
+  inline constexpr bool is_array_v = false;
+template 
+  inline constexpr bool is_array_v<_Tp[]> = true;
+template 
+  inline constexpr bool is_array_v<_Tp[_Num]> = true;
+
 template 
   inline constexpr bool is_pointer_v = is_pointer<_Tp>::value;
 template 
@@ -3276,10 +3291,25 @@ template 
 has_virtual_destructor<_Tp>::value;
 template 
   inline constexpr size_t alignment_of_v = alignment_of<_Tp>::value;
+
 template 
-  inline constexpr size_t rank_v = rank<_Tp>::value;
+  inline constexpr size_t rank_v = 0;
+template 
+  inline constexpr size_t rank_v<_Tp[_Size]> = 1 + rank_v<_Tp>;
+template 
+  inline constexpr size_t rank_v<_Tp[]> = 1 + rank_v<_Tp>;
+
 template 
-  inline constexpr size_t extent_v = extent<_Tp, _Idx>::value;
+  inline constexpr size_t extent_v = 0;
+template 
+  inline constexpr size_t extent_v<_Tp[_Size], 0> = _Size;
+template 
+  inline constexpr size_t extent_v<_Tp[_Size], _Idx> = extent_v<_Tp, _Idx - 1>;
+template 
+  inline constexpr size_t extent_v<_Tp[], 0> = 0;
+template 
+  inline constexpr size_t extent_v<_Tp[], _Idx> = extent_v<_Tp, _Idx - 1>;
+
 #ifdef _GLIBCXX_HAVE_BUILTIN_IS_SAME
 template 
   inline constexpr bool is_same_v = __is_same(_Tp, _Up);
@@ -3407,32 +3437,38 @@ template
 
 #define __cpp_lib_bounded_array_traits 201902L
 
+  ///

Re: [PATCH] Fix up dump_printf_loc format attribute and adjust uses [PR106782]

2022-09-01 Thread Jakub Jelinek via Gcc-patches

On Thu, Sep 01, 2022 at 09:05:41AM +, Richard Biener wrote:
> > As discussed on IRC, the r13-2299-g68c61c2daa1f bug only got missed
> > because dump_printf_loc had incorrect format attribute and therefore
> > almost no -Wformat=* checking was performed on it.
> > 3, 0 are suitable for function with (whatever, whatever, const char *, 
> > va_list)
> > arguments, not for (whatever, whatever, const char *, ...), that one should
> > use 3, 4.
> > 
> > The following patch fixes that and adjusts all spots to fix warnings.
> > In many cases it is just through an ugly cast (for %G casts to gimple *
> > from gassign */gphi * and the like and for %p casts to void * from slp_node
> > etc.).
> > There are 3 spots where the mismatch was worse though, two using %u or %d
> > for unsigned HOST_WIDE_INT argument and one %T for enum argument (promoted
> > to int).
> 
> Those 3 spots might be worth backporting?  With -fopt-info-* they might
> run into crashes.

Yes, I'll do it soon.

> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> OK.

Thanks.

Jakub

Re: [PATCH] rs6000/test: Fix bswap64-4.c with has_arch_ppc64 [PR106680]

2022-09-01 Thread Kewen.Lin via Gcc-patches

Hi Segher and Peter,

Thanks a lot for your insightful comments on this.

I just read through all the discussions and plan to give it a
try, as described below.

on 2022/8/31 23:24, Segher Boessenkool wrote:
> On Wed, Aug 31, 2022 at 05:33:28PM +0800, Kewen.Lin wrote:
>> Test case bswap64-4.c suffers the issue as its comments:
>>
>> /* On some versions of dejagnu this test will fail when
>>biarch testing with RUNTESTFLAGS="--target_board=unix
>>'{-m64,-m32}'" due to -m32 being added on the command
>>line after the dg-options -mpowerpc64.
>>common/config/rs6000/rs6000-common.c:
>>rs6000_handle_option disables -mpowerpc64 for -m32.  */
>>
>> As tested, on test machine with dejaGnu 1.6.2, the compilation
>> option order looks like: -m32 ... -mpowerpc64, option
>> -mpowerpc64 still takes effect;  While on test machine with
>> dejaGnu 1.5.1, the option order looks like: -mpowerpc64 ... -m32,
>> option -mpowerpc64 is disabled by -m32, then the case fails.
> 
> *Should* -mpowerpc64  be disabled by -m32?  

I think the reason to disable -mpowerpc64 at -m32 is that we have
-mpowerpc64 explicitly specified at -m64 (equivalent behavior).

In the current implementation, when -m64 is specified, we set the
bit OPTION_MASK_POWERPC64 in both opts and opts_set.  Since we
set OPTION_MASK_POWERPC64 in opts_set for -m64, when we find the
OPTION_MASK_POWERPC64 is ON in opts_set, we don't know if there
is one actual cmd-line option -mpowerpc64 or just -m64.

Assuming -m32 is given after -m64 on the command line, it's
also unclear how OPTION_MASK_POWERPC64 in opts_set was set, so
to stay conservative it has to disable -mpowerpc64, ensuring
that options like "-m64 -m32" do not leave OPTION_MASK_POWERPC64
ON, just like what we have when just specifying "-m32".

Without any explicit -mpowerpc64 (and -mno-), I think we all agree
that -m64 should set OPTION_MASK_POWERPC64 in opts, conversely -m32
should unset OPTION_MASK_POWERPC64 in opts.

To make -m32/-m64 and -mpowerpc64 orthogonal, IMHO we should not
set bit OPTION_MASK_POWERPC64 in opts_set for -m64.  I'm not sure
if there is some particular reason why we set OPTION_MASK_POWERPC64
in opts_set, I hope no. :)  One possible reason I can imagine is
that we want the cmd-line options "-mno-powerpc64 -m64" not to
raise an error, but I think making it an error makes more sense.

So if no objections I'm going to give it a shot like:

```
Iff -mpowerpc64 (or -mno-powerpc64) is specified, the bit
OPTION_MASK_POWERPC64 in opts_set is set.  In that case both -m64
and -m32 will leave OPTION_MASK_POWERPC64 in opts alone, honoring
only the specified option, and we will raise an error for "-m64" +
"-mno-powerpc64" (either order).

When no explicit -mpowerpc64 (or -mno-powerpc64) is provided,
for -m64, set bit OPTION_MASK_POWERPC64 in opts; while for -m32,
unset bit OPTION_MASK_POWERPC64 in opts.  Both will not touch
OPTION_MASK_POWERPC64 in opts_set.
```
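
The proposal above can be modelled in a few lines. This is only a sketch
of the described semantics, not rs6000 code: Opt, Result and resolve are
hypothetical names, the default word size when neither -m32 nor -m64 is
given is passed in as an assumption, and treating the last -m32/-m64 as
the one that counts is also an assumption.

```cpp
#include <initializer_list>

// Hypothetical stand-ins for the command-line options discussed above.
enum class Opt { M32, M64, Powerpc64, NoPowerpc64 };

struct Result
{
  bool powerpc64;    // models OPTION_MASK_POWERPC64 in opts
  bool is_explicit;  // models OPTION_MASK_POWERPC64 in opts_set
  bool error;        // "-m64" combined with "-mno-powerpc64"
};

constexpr Result resolve(std::initializer_list<Opt> args,
                         bool default_is_64bit = false)
{
  bool is_64bit = default_is_64bit;
  bool is_explicit = false;
  bool explicit_value = false;
  for (Opt o : args)
    {
      switch (o)
        {
        case Opt::M32: is_64bit = false; break;
        case Opt::M64: is_64bit = true; break;
        case Opt::Powerpc64: is_explicit = true; explicit_value = true; break;
        case Opt::NoPowerpc64: is_explicit = true; explicit_value = false; break;
        }
    }
  // An explicit -m[no-]powerpc64 wins regardless of position; otherwise
  // the bit simply follows -m32/-m64.
  bool powerpc64 = is_explicit ? explicit_value : is_64bit;
  bool error = is_64bit && is_explicit && !explicit_value;
  return Result{powerpc64, is_explicit, error};
}

// Order no longer matters for an explicit -mpowerpc64 with -m32.
static_assert(resolve({Opt::Powerpc64, Opt::M32}).powerpc64, "explicit first");
static_assert(resolve({Opt::M32, Opt::Powerpc64}).powerpc64, "explicit last");
```

Under this model the bswap64-4.c situation would behave the same whichever
order DejaGnu places -m32 and -mpowerpc64 in.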

btw, I guess the option compatibility isn't a blocking issue
here, right?

BR,
Kewen

Re: [PATCH] Fix up dump_printf_loc format attribute and adjust uses [PR106782]

2022-09-01 Thread Richard Biener via Gcc-patches

On Thu, 1 Sep 2022, Jakub Jelinek wrote:

> Hi!
> 
> As discussed on IRC, the r13-2299-g68c61c2daa1f bug only got missed
> because dump_printf_loc had incorrect format attribute and therefore
> almost no -Wformat=* checking was performed on it.
> 3, 0 are suitable for function with (whatever, whatever, const char *, 
> va_list)
> arguments, not for (whatever, whatever, const char *, ...), that one should
> use 3, 4.
> 
> The following patch fixes that and adjusts all spots to fix warnings.
> In many cases it is just through an ugly cast (for %G casts to gimple *
> from gassign */gphi * and the like and for %p casts to void * from slp_node
> etc.).
> There are 3 spots where the mismatch was worse though, two using %u or %d
> for unsigned HOST_WIDE_INT argument and one %T for enum argument (promoted
> to int).

Those 3 spots might be worth backporting?  With -fopt-info-* they might
run into crashes.

> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2022-09-01  Jakub Jelinek  
> 
>   PR other/106782
>   * dumpfile.h (dump_printf_loc): Use ATTRIBUTE_GCC_DUMP_PRINTF (3, 4)
>   instead of ATTRIBUTE_GCC_DUMP_PRINTF (3, 0).
>   * tree-parloops.cc (parloops_is_slp_reduction): Cast pointers to
>   derived types of gimple to gimple * to avoid -Wformat warnings.
>   * tree-vect-loop-manip.cc (vect_set_loop_condition,
>   vect_update_ivs_after_vectorizer): Likewise.
>   * tree-vect-stmts.cc (vectorizable_load): Likewise.
>   * tree-vect-patterns.cc (vect_split_statement,
>   vect_recog_mulhs_pattern, vect_recog_average_pattern,
>   vect_determine_precisions_from_range,
>   vect_determine_precisions_from_users): Likewise.
>   * gimple-loop-versioning.cc
>   (loop_versioning::analyze_term_using_scevs): Likewise.
>   * tree-vect-slp.cc (vect_build_slp_tree_1): Likewise.
>   (vect_build_slp_tree): Cast slp_tree to void * to avoid
>   -Wformat warnings.
>   (optimize_load_redistribution_1, vect_match_slp_patterns,
>   vect_build_slp_instance, vect_optimize_slp_pass::materialize,
>   vect_optimize_slp_pass::dump, vect_slp_convert_to_external,
>   vect_slp_analyze_node_operations, vect_bb_partition_graph): Likewise.
>   (vect_print_slp_tree): Likewise.  Also use
>   HOST_WIDE_INT_PRINT_UNSIGNED instead of %u.
>   * tree-vect-loop.cc (vect_determine_vectorization_factor,
>   vect_analyze_scalar_cycles_1, vect_analyze_loop_operations,
>   vectorizable_induction, vect_transform_loop): Cast pointers to derived
>   types of gimple to gimple * to avoid -Wformat warnings.
>   (vect_analyze_loop_2): Cast slp_tree to void * to avoid
>   -Wformat warnings.
>   (vect_estimate_min_profitable_iters): Use HOST_WIDE_INT_PRINT_UNSIGNED
>   instead of %d.
>   * tree-vect-slp-patterns.cc (vect_pattern_validate_optab): Use %G
>   instead of %T and STMT_VINFO_STMT (SLP_TREE_REPRESENTATIVE (node))
>   instead of SLP_TREE_DEF_TYPE (node).
> 
> --- gcc/dumpfile.h.jj 2022-08-31 12:11:48.346349456 +0200
> +++ gcc/dumpfile.h 2022-08-31 18:04:51.157309877 +0200
> @@ -574,7 +574,7 @@ extern void dump_printf (const dump_meta
>  
>  extern void dump_printf_loc (const dump_metadata_t &, const dump_user_location_t &,
>const char *, ...)
> -  ATTRIBUTE_GCC_DUMP_PRINTF (3, 0);
> +  ATTRIBUTE_GCC_DUMP_PRINTF (3, 4);
>  extern void dump_function (int phase, tree fn);
>  extern void dump_basic_block (dump_flags_t, basic_block, int);
>  extern void dump_generic_expr_loc (const dump_metadata_t &,
> --- gcc/tree-parloops.cc.jj   2022-08-25 11:54:42.406877138 +0200
> +++ gcc/tree-parloops.cc  2022-08-31 17:43:23.779652735 +0200
> @@ -338,8 +338,8 @@ parloops_is_slp_reduction (loop_vec_info
> && parloops_valid_reduction_input_p (def_stmt_info))
>   {
> if (dump_enabled_p ())
> - dump_printf_loc (MSG_NOTE, vect_location, "swapping oprnds: %G",
> -  next_stmt);
> + dump_printf_loc (MSG_NOTE, vect_location,
> +  "swapping oprnds: %G", (gimple *) next_stmt);
>  
> swap_ssa_operands (next_stmt,
>gimple_assign_rhs1_ptr (next_stmt),
> --- gcc/tree-vect-loop-manip.cc.jj 2022-08-31 10:20:20.498973136 +0200
> +++ gcc/tree-vect-loop-manip.cc   2022-08-31 17:51:29.502109340 +0200
> @@ -992,7 +992,7 @@ vect_set_loop_condition (class loop *loo
>  
>if (dump_enabled_p ())
>  dump_printf_loc (MSG_NOTE, vect_location, "New loop exit condition: %G",
> -  cond_stmt);
> +  (gimple *) cond_stmt);
>  }
>  
>  /* Helper routine of slpeel_tree_duplicate_loop_to_edge_cfg.
> @@ -1539,7 +1539,8 @@ vect_update_ivs_after_vectorizer (loop_v
>stmt_vec_info phi_info = loop_vinfo->lookup_stmt (phi);
>if (dump_enabled_p ())
>   dump_printf_loc (MSG_NOTE,

Re: [PATCH] rs6000/test: Fix bswap64-4.c with has_arch_ppc64 [PR106680]

2022-09-01 Thread Kewen.Lin via Gcc-patches

on 2022/8/31 22:13, Peter Bergner wrote:
> On 8/31/22 4:33 AM, Kewen.Lin wrote:
>> @@ -1,7 +1,8 @@
>>  /* { dg-do compile { target { powerpc*-*-* } } } */
>>  /* { dg-skip-if "" { powerpc*-*-aix* } } */
>> -/* { dg-options "-O2 -mpowerpc64" } */
>>  /* { dg-require-effective-target ilp32 } */
>> +/* { dg-options "-O2 -mpowerpc64" } */
>> +/* { dg-require-effective-target has_arch_ppc64 } */
> 
> With many of our recent patches moving the dg-options before any
> dg-require-effective-target so it affects the results of the
> dg-require-effective-target test, this looks like it's backwards
> from that process.  I understand why, so I think an explicit comment
> here in the test case explaining why it's after in this case would help.
> Just so in a few years when we come back to this test case, we
> won't accidentally undo this change.

Oops, the diff makes it look like "after", but it's actually still "before". :)
The dg-options is meant to be placed before the succeeding has_arch_ppc64
effective target check, which is supposed to use dg-options when compiling.
I thought it better to let the ilp32 check go first, then has_arch_ppc64, so I
moved dg-options downward.

Sorry for the confusion, I should have placed the has_arch_ppc64
effective target check just after the dg-options.  Anyway, it's a good idea
to add more comments in the test case source!  Thanks!
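
To make the ordering concrete, here is a minimal sketch of how the header of
such a test could look, with the has_arch_ppc64 check right after dg-options
and a comment recording why.  The directives are the ones from the diff above;
the comment wording and the function swap64 are illustrative assumptions, not
the actual contents of bswap64-4.c.

```cpp
/* { dg-do compile { target { powerpc*-*-* } } } */
/* { dg-skip-if "" { powerpc*-*-aix* } } */
/* { dg-require-effective-target ilp32 } */
/* Keep dg-options ahead of the has_arch_ppc64 check below: that check is
   meant to be compiled with these options, so -mpowerpc64 is in effect.  */
/* { dg-options "-O2 -mpowerpc64" } */
/* { dg-require-effective-target has_arch_ppc64 } */

/* Hypothetical body, only here to give the directives something to guard.  */
unsigned long long
swap64 (unsigned long long x)
{
  return __builtin_bswap64 (x);
}
```

The ilp32 check does not depend on the options, so it can stay above
dg-options without changing its result.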

BR,
Kewen

Re: [PATCH] rs6000: Don't ICE when we disassemble an MMA variable [PR101322]

2022-09-01 Thread Kewen.Lin via Gcc-patches

>>> ...and of course, now I can't recreate that issue at all and the
>>> ptr_vector_*_type use work fine now.  Strange! ...so ok, changed.
>>> Maybe the behavior changed since my PR106017 fix went in???
>>
>> That is my best guess as well.  But, how did that help this test?
> 
> It didn't. :-)   During my bootstrap, I hit the gimple verification issue
> I mentioned seeing earlier.  My problem was I thought I hit it with the
> test case, but it was exposed on a different test case in the testsuite.
> Here's what I'm seeing, which only happens when using -O0 -flto:
> 
> rain6p1% gcc -O0 -mcpu=power10 -flto pr102347.c 
> lto1: internal compiler error: in gimple_canonical_types_compatible_p, at tree.cc:13677
> 0x11930a97 gimple_canonical_types_compatible_p(tree_node const*, tree_node const*, bool)
>   /home/bergner/gcc/gcc-fsf-mainline-pr101322/gcc/tree.cc:13677
> 0x1192f1ab verify_type_variant
>   /home/bergner/gcc/gcc-fsf-mainline-pr101322/gcc/tree.cc:13377
> 0x11930beb verify_type(tree_node const*)
>   /home/bergner/gcc/gcc-fsf-mainline-pr101322/gcc/tree.cc:13700
> 0x106bbd37 lto_fixup_state
>   /home/bergner/gcc/gcc-fsf-mainline-pr101322/gcc/lto/lto-common.cc:2629
> 0x106bbff3 lto_fixup_decls
>   /home/bergner/gcc/gcc-fsf-mainline-pr101322/gcc/lto/lto-common.cc:2660
> 0x106bce13 read_cgraph_and_symbols(unsigned int, char const**)
>   /home/bergner/gcc/gcc-fsf-mainline-pr101322/gcc/lto/lto-common.cc:2901
> 0x1067bcbf lto_main()
>   /home/bergner/gcc/gcc-fsf-mainline-pr101322/gcc/lto/lto.cc:656
> Please submit a full bug report, with preprocessed source (by using -freport-bug).
> Please include the complete backtrace with any bug report.
> See  for instructions.
> lto-wrapper: fatal error: /home/bergner/gcc/build/gcc-fsf-mainline-pr101322-debug/gcc/xgcc returned 1 exit status
> compilation terminated.
> /home/bergner/binutils/install/binutils-power10/bin/ld: error: lto-wrapper failed
> collect2: error: ld returned 1 exit status
> 
> The problem goes away if I use -O1 or above, drop -flto, or use
> the code I originally posted without the ptr_vector_*_type.
>
> The assert in gimple_canonical_types_compatible_p() we're hitting is:
> 13673 default:
> 13674   /* Consider all types with language specific trees in them mutually
> 13675  compatible.  This is executed only from verify_type and false
> 13676  positives can be tolerated.  */
> 13677   gcc_assert (!in_lto_p);
> 13678   return true;
> 
> I have no idea why ptr_vector_*_type would behave differently here than
> build_pointer_type (vector_*_type_node).  Using the build_pointer_type()
> fixed it for me, so that's why I went with it. :-)  Maybe this is a bug
> in lto???

Thanks for taking the time to reproduce this!

The only difference is that ptr_vector_*_type are built from the
qualified_type based on vector_*_type_node, instead of directly from
vector_*_type_node.  I'm interested to have a further look at this later.

BR,
Kewen

Re: [PATCH] rs6000: Don't ICE when we disassemble an MMA variable [PR101322]

2022-09-01 Thread Kewen.Lin via Gcc-patches

>>>> +  if (TREE_TYPE (TREE_TYPE (src_ptr)) != src_type)
>>>
>>> This line looks unexpected, the former is type char while the latter is 
>>> type __vector_pair *.
>>>
>>> I guess you meant to compare the type of pointer type like: 
>>>
>>>TREE_TYPE (TREE_TYPE (src_ptr)) != TREE_TYPE (src_type)
>>
>> Maybe?  However, if that is the case, how can it be working for me?
>> Let me throw this in the debugger and verify the types and I'll report
>> back with what I find.
> 
> Ok, you are correct.  Thanks for catching that!  I don't think we need
> those matching outer TREE_TYPE() uses.  I think just a simple:
> 
>   if (TREE_TYPE (src_ptr) != src_type)
> 
> ...should suffice.
> 

Yeah, it's enough for the associated test case.  :)

> 
>>> or even with mode like:
>>>
>>>    TYPE_MODE (TREE_TYPE (TREE_TYPE (src_ptr))) != TYPE_MODE (TREE_TYPE (src_type))
> 
> I'd rather not look at the mode here, since OOmode/XOmode doesn't necessarily
> mean __vector_{pair,quad}, so I'll go with the modified test above.

Good point.  I thought the cv-qualifier could affect the type equality check,
and assumed that for a test case like:

void
foo (char *resp, const __vector_pair *vpp)
{
  __builtin_vsx_disassemble_pair (resp, (__vector_pair *) vpp);
}

we would not want the conversion there.  The ICE also seemed related to the
underlying mode, so I thought maybe you wanted to use TYPE_MODE.
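
The concern about cv-qualifiers can be illustrated with a source-level analogy.
This is only a sketch: it uses the C++ type system as a stand-in for GCC's tree
types, and vector_pair_model plus both helper names are hypothetical.  It shows
that an exact type comparison treats a pointer to a const-qualified type as
different from a pointer to the unqualified type, while a comparison that
strips cv-qualifiers from the pointee does not.

```cpp
#include <type_traits>

// Hypothetical stand-in for __vector_pair; only the type identity matters.
struct vector_pair_model { unsigned char bytes[32]; };

// Exact comparison: any difference in qualifiers makes the types differ.
template <typename A, typename B>
constexpr bool same_type_exact ()
{
  return std::is_same<A, B>::value;
}

// Comparison of the pointed-to types with const/volatile stripped.
template <typename A, typename B>
constexpr bool same_pointee_ignoring_cv ()
{
  return std::is_same<
    typename std::remove_cv<typename std::remove_pointer<A>::type>::type,
    typename std::remove_cv<typename std::remove_pointer<B>::type>::type>::value;
}

static_assert (!same_type_exact<const vector_pair_model *, vector_pair_model *> (),
               "pointer types differ when only the pointee is const");
static_assert (same_pointee_ignoring_cv<const vector_pair_model *, vector_pair_model *> (),
               "pointees match once cv-qualifiers are stripped");
```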


>>>> +  src_ptr = build1 (VIEW_CONVERT_EXPR, src_type, src_ptr);
>>>
>>> Nit: NOP_EXPR seems to be better suited here for pointer conversion.
> 
> Ok, this works too, so code changed to use it.  Thanks!
> 
> Question for my own education, when would you use VIEW_CONVERT_EXPR over 
> NOP_EXPR?

tree.def has a note about VIEW_CONVERT_EXPR that closely matches what Segher
replied.
In my experience, VIEW_CONVERT_EXPR is used a lot for vector type conversion.
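
As a rough source-level analogy (my own sketch, not a statement about what GCC
generates for either tree code): viewing the bits of an object as another type
of the same size is the kind of thing VIEW_CONVERT_EXPR models, while a
pointer-to-pointer conversion keeps the value and needs no code, which is the
NOP_EXPR case.  Both function names are hypothetical, and view_bits assumes a
32-bit float.

```cpp
#include <cstdint>
#include <cstring>

static_assert (sizeof (float) == sizeof (std::uint32_t),
               "this sketch assumes a 32-bit float");

// Same bits, different type: the object is reinterpreted, not converted.
std::uint32_t
view_bits (float f)
{
  std::uint32_t u;
  std::memcpy (&u, &f, sizeof u);
  return u;
}

// Same value, different pointer type: nothing about the address changes.
const void *
as_void_pointer (const float *p)
{
  return p;
}
```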

BR,
Kewen

Re: [PATCH] rs6000/test: Fix typo in pr86731-fwrapv-longlong.c [PR106682]

2022-09-01 Thread Kewen.Lin via Gcc-patches

Hi Segher & Peter,

Thanks for your reviews!

on 2022/8/31 23:12, Segher Boessenkool wrote:
> On Wed, Aug 31, 2022 at 05:33:21PM +0800, Kewen.Lin wrote:
>> It's meant to update "lxv" to "p?lxv" and should leave the
>> "lvx" unchanged.  So this is to fix the typo accordingly.
>>
>> I'll push this soon if no objections.
> 
> Please go ahead.  Out of interest, did you see failures from this, was
> it just by visual inspection, or something else?
> 

I did reproduce the failure for this test case on a ppc64 P8 machine. :)
For the other test cases updated by commit r12-2266, I only did a quick visual
inspection instead of actually testing them; there were some other
typos, but they have been fixed by r12-2889-g8464894c86b03e.

To make sure none slipped through, I just tested the other cases on ppc64 P8 and
ppc64le P9 and P10; no failures were found.

So committed as r13-2332-g023c5b36e47697.  Thanks!

BR,
Kewen

[PATCH] Fix up dump_printf_loc format attribute and adjust uses [PR106782]

2022-09-01 Thread Jakub Jelinek via Gcc-patches

Hi!

As discussed on IRC, the r13-2299-g68c61c2daa1f bug only got missed
because dump_printf_loc had an incorrect format attribute and therefore
almost no -Wformat=* checking was performed on it.
3, 0 is suitable for a function with (whatever, whatever, const char *, va_list)
arguments, not for (whatever, whatever, const char *, ...); that one should
use 3, 4.

The following patch fixes that and adjusts all spots to fix warnings.
In many cases it is just through an ugly cast (for %G casts to gimple *
from gassign */gphi * and the like and for %p casts to void * from slp_node
etc.).
There are 3 spots where the mismatch was worse though, two using %u or %d
for unsigned HOST_WIDE_INT argument and one %T for enum argument (promoted
to int).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-09-01  Jakub Jelinek  

PR other/106782
* dumpfile.h (dump_printf_loc): Use ATTRIBUTE_GCC_DUMP_PRINTF (3, 4)
instead of ATTRIBUTE_GCC_DUMP_PRINTF (3, 0).
* tree-parloops.cc (parloops_is_slp_reduction): Cast pointers to
derived types of gimple to gimple * to avoid -Wformat warnings.
* tree-vect-loop-manip.cc (vect_set_loop_condition,
vect_update_ivs_after_vectorizer): Likewise.
* tree-vect-stmts.cc (vectorizable_load): Likewise.
* tree-vect-patterns.cc (vect_split_statement,
vect_recog_mulhs_pattern, vect_recog_average_pattern,
vect_determine_precisions_from_range,
vect_determine_precisions_from_users): Likewise.
* gimple-loop-versioning.cc
(loop_versioning::analyze_term_using_scevs): Likewise.
* tree-vect-slp.cc (vect_build_slp_tree_1): Likewise.
(vect_build_slp_tree): Cast slp_tree to void * to avoid
-Wformat warnings.
(optimize_load_redistribution_1, vect_match_slp_patterns,
vect_build_slp_instance, vect_optimize_slp_pass::materialize,
vect_optimize_slp_pass::dump, vect_slp_convert_to_external,
vect_slp_analyze_node_operations, vect_bb_partition_graph): Likewise.
(vect_print_slp_tree): Likewise.  Also use
HOST_WIDE_INT_PRINT_UNSIGNED instead of %u.
* tree-vect-loop.cc (vect_determine_vectorization_factor,
vect_analyze_scalar_cycles_1, vect_analyze_loop_operations,
vectorizable_induction, vect_transform_loop): Cast pointers to derived
types of gimple to gimple * to avoid -Wformat warnings.
(vect_analyze_loop_2): Cast slp_tree to void * to avoid
-Wformat warnings.
(vect_estimate_min_profitable_iters): Use HOST_WIDE_INT_PRINT_UNSIGNED
instead of %d.
* tree-vect-slp-patterns.cc (vect_pattern_validate_optab): Use %G
instead of %T and STMT_VINFO_STMT (SLP_TREE_REPRESENTATIVE (node))
instead of SLP_TREE_DEF_TYPE (node).

--- gcc/dumpfile.h.jj   2022-08-31 12:11:48.346349456 +0200
+++ gcc/dumpfile.h  2022-08-31 18:04:51.157309877 +0200
@@ -574,7 +574,7 @@ extern void dump_printf (const dump_meta
 
 extern void dump_printf_loc (const dump_metadata_t &, const dump_user_location_t &,
 const char *, ...)
-  ATTRIBUTE_GCC_DUMP_PRINTF (3, 0);
+  ATTRIBUTE_GCC_DUMP_PRINTF (3, 4);
 extern void dump_function (int phase, tree fn);
 extern void dump_basic_block (dump_flags_t, basic_block, int);
 extern void dump_generic_expr_loc (const dump_metadata_t &,
--- gcc/tree-parloops.cc.jj 2022-08-25 11:54:42.406877138 +0200
+++ gcc/tree-parloops.cc 2022-08-31 17:43:23.779652735 +0200
@@ -338,8 +338,8 @@ parloops_is_slp_reduction (loop_vec_info
  && parloops_valid_reduction_input_p (def_stmt_info))
{
  if (dump_enabled_p ())
-   dump_printf_loc (MSG_NOTE, vect_location, "swapping oprnds: %G",
-next_stmt);
+   dump_printf_loc (MSG_NOTE, vect_location,
+"swapping oprnds: %G", (gimple *) next_stmt);
 
  swap_ssa_operands (next_stmt,
 gimple_assign_rhs1_ptr (next_stmt),
--- gcc/tree-vect-loop-manip.cc.jj  2022-08-31 10:20:20.498973136 +0200
+++ gcc/tree-vect-loop-manip.cc 2022-08-31 17:51:29.502109340 +0200
@@ -992,7 +992,7 @@ vect_set_loop_condition (class loop *loo
 
   if (dump_enabled_p ())
 dump_printf_loc (MSG_NOTE, vect_location, "New loop exit condition: %G",
-cond_stmt);
+(gimple *) cond_stmt);
 }
 
 /* Helper routine of slpeel_tree_duplicate_loop_to_edge_cfg.
@@ -1539,7 +1539,8 @@ vect_update_ivs_after_vectorizer (loop_v
   stmt_vec_info phi_info = loop_vinfo->lookup_stmt (phi);
   if (dump_enabled_p ())
dump_printf_loc (MSG_NOTE, vect_location,
-"vect_update_ivs_after_vectorizer: phi: %G", phi);
+"vect_update_ivs_after_vectorizer: phi: %G",
+(gimple *) phi);
 
   /* Skip reduction and virtual phis.  */

[PATCH/gcc] RTEMS: Add -mvrsave multilibs

2022-09-01 Thread Sebastian Huber

gcc/ChangeLog:

* config/rs6000/rtems.h (CPP_OS_DEFAULT_SPEC): Define __PPC_VRSAVE__ if
-mvrsave is present.
* config/rs6000/t-rtems: Add -mvrsave multilib variants for
-mcpu=e6500.
---
 gcc/config/rs6000/rtems.h | 3 ++-
 gcc/config/rs6000/t-rtems | 5 +
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rtems.h b/gcc/config/rs6000/rtems.h
index 7ea9ebdb77b..683004eb07c 100644
--- a/gcc/config/rs6000/rtems.h
+++ b/gcc/config/rs6000/rtems.h
@@ -252,7 +252,8 @@
 %{mcpu=821:  %{!Dppc*: %{!Dmpc*: -Dmpc821}  } } \
 %{mcpu=860:  %{!Dppc*: %{!Dmpc*: -Dmpc860}  } } \
 %{mcpu=8540: %{!Dppc*: %{!Dmpc*: -Dppc8540}  } } \
-%{mcpu=e6500: -D__PPC_CPU_E6500__}"
+%{mcpu=e6500: -D__PPC_CPU_E6500__} \
+%{mvrsave: -D__PPC_VRSAVE__}"
 
 #undef ASM_DEFAULT_SPEC
 #define ASM_DEFAULT_SPEC "-mppc%{m64:64}"
diff --git a/gcc/config/rs6000/t-rtems b/gcc/config/rs6000/t-rtems
index 66c20aadea5..278ebb69e60 100644
--- a/gcc/config/rs6000/t-rtems
+++ b/gcc/config/rs6000/t-rtems
@@ -36,6 +36,9 @@ MULTILIB_DIRNAMES += nof gprsdouble
 MULTILIB_OPTIONS += mno-spe/mno-altivec
 MULTILIB_DIRNAMES += nospe noaltivec
 
+MULTILIB_OPTIONS += mvrsave
+MULTILIB_DIRNAMES += vrsave
+
 MULTILIB_MATCHES   += ${MULTILIB_MATCHES_ENDIAN}
 MULTILIB_MATCHES   += ${MULTILIB_MATCHES_SYSV}
 # Map 405 to 403
@@ -76,5 +79,7 @@ MULTILIB_REQUIRED += mcpu=8540/msoft-float/mno-spe
 MULTILIB_REQUIRED += mcpu=8540/mfloat-gprs=double
 MULTILIB_REQUIRED += mcpu=860
 MULTILIB_REQUIRED += mcpu=e6500/m32
+MULTILIB_REQUIRED += mcpu=e6500/m32/mvrsave
 MULTILIB_REQUIRED += mcpu=e6500/m32/msoft-float/mno-altivec
 MULTILIB_REQUIRED += mcpu=e6500/m64
+MULTILIB_REQUIRED += mcpu=e6500/m64/mvrsave
-- 
2.26.2

[COMMITTED] Make frange selftests work on !HONOR_NANS systems.

2022-09-01 Thread Aldy Hernandez via Gcc-patches

I'm just shuffling the FP self tests here, with no change to existing
functionality.

If we agree that explicit NANs in the source code with !HONOR_NANS
should behave any differently, I'm happy to address whatever needs
fixing, but for now I'd like to unblock the !HONOR_NANS build systems.

I have added an adaptation of a test Jakub suggested we handle in the PR:

void funk(int cond)
{
  float x;

  if (cond)
    x = __builtin_nan ("");
  else
    x = 1.24;

  bar(x);
}

For !HONOR_NANS, the range for the PHI of x_1 is the union of 1.24 and
NAN which is really 1.24 with a maybe NAN.  This reflects the IL-- the
presence of the actual NAN.  However, VRP will propagate this because
it sees the 1.24 and ignores the possibility of a NAN, per
!HONOR_NANS.  IMO, this is correct.  OTOH, for HONOR_NANS the unknown
NAN property keeps us from propagating the value.
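
The reasoning in the paragraph above can be sketched with a toy model.  This is
not GCC's frange: toy_range and all the helper names are hypothetical, and the
model only tracks bounds plus a "maybe NAN" flag.

```cpp
#include <algorithm>

// Toy stand-in for a floating-point range (not GCC's frange).
struct toy_range
{
  bool has_bounds;  // false for a range that holds only NAN
  double lo, hi;    // meaningful only when has_bounds is true
  bool maybe_nan;   // the value might be a NAN
};

toy_range toy_singleton (double v) { return { true, v, v, false }; }
toy_range toy_nan () { return { false, 0.0, 0.0, true }; }

// Union as needed at a PHI: merge the bounds and remember a possible NAN.
toy_range
toy_union (const toy_range &a, const toy_range &b)
{
  toy_range r = { a.has_bounds || b.has_bounds, 0.0, 0.0,
                  a.maybe_nan || b.maybe_nan };
  if (a.has_bounds && b.has_bounds)
    {
      r.lo = std::min (a.lo, b.lo);
      r.hi = std::max (a.hi, b.hi);
    }
  else if (a.has_bounds)
    {
      r.lo = a.lo;
      r.hi = a.hi;
    }
  else if (b.has_bounds)
    {
      r.lo = b.lo;
      r.hi = b.hi;
    }
  return r;
}

// A single constant may be propagated when the bounds collapse to one
// value and a possible NAN is either absent or not honored.
bool
toy_propagates_constant (const toy_range &r, bool honor_nans)
{
  if (!r.has_bounds || r.lo != r.hi)
    return false;
  return !honor_nans || !r.maybe_nan;
}
```

In this model the union of toy_nan () and toy_singleton (1.24) is 1.24 with a
maybe NAN; it propagates as a constant when honor_nans is false and does not
when honor_nans is true.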

Is there a reason we don't warn for calls to __builtin_nan when
!HONOR_NANS?  That makes no sense to me.

PR tree-optimization/106785

gcc/ChangeLog:

* value-range.cc (range_tests_nan): Adjust tests for !HONOR_NANS.
(range_tests_floats): Same.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/vrp-float-nan-1.c: New test.
---
 .../gcc.dg/tree-ssa/vrp-float-nan-1.c | 18 +++
 gcc/value-range.cc                    | 23 +++
 2 files changed, 32 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/vrp-float-nan-1.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-nan-1.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-nan-1.c
new file mode 100644
index 000..126949b2b4c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp-float-nan-1.c
@@ -0,0 +1,18 @@
+// { dg-do compile }
+// { dg-options "-O2 -ffinite-math-only -fdump-tree-evrp" }
+
+void bar(float);
+
+void funk(int cond)
+{
+  float x;
+
+  if (cond)
+    x = __builtin_nan ("");
+  else
+    x = 1.24;
+
+  bar(x);
+}
+
+// { dg-final { scan-tree-dump-times "bar \\(1.24" 1 "evrp" } }
diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 473139c6dbd..3c7d4cb84b9 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -3535,13 +3535,16 @@ range_tests_nan ()
   REAL_VALUE_TYPE q, r;
 
   // Equal ranges but with differing NAN bits are not equal.
-  r1 = frange_float ("10", "12");
-  r0 = r1;
-  ASSERT_EQ (r0, r1);
-  r0.set_nan (fp_prop::NO);
-  ASSERT_NE (r0, r1);
-  r0.set_nan (fp_prop::YES);
-  ASSERT_NE (r0, r1);
+  if (HONOR_NANS (float_type_node))
+    {
+      r1 = frange_float ("10", "12");
+      r0 = r1;
+      ASSERT_EQ (r0, r1);
+      r0.set_nan (fp_prop::NO);
+      ASSERT_NE (r0, r1);
+      r0.set_nan (fp_prop::YES);
+      ASSERT_NE (r0, r1);
+    }
 
   // NAN ranges are not equal to each other.
   r0 = frange_nan (float_type_node);
@@ -3624,9 +3627,11 @@ range_tests_floats ()
   if (HONOR_SIGNED_ZEROS (float_type_node))
 range_tests_signed_zeros ();
 
-  // A range of [-INF,+INF] is actually VARYING...
+  // A range of [-INF,+INF] is actually VARYING if no other properties
+  // are set.
   r0 = frange_float ("-Inf", "+Inf");
-  ASSERT_TRUE (r0.varying_p ());
+  if (r0.get_nan ().varying_p ())
+    ASSERT_TRUE (r0.varying_p ());
   // ...unless it has some special property...
   r0.set_nan (fp_prop::NO);
   ASSERT_FALSE (r0.varying_p ());
-- 
2.37.1

Re: [PATCH] ipa: Fix throw in multi-versioned functions [PR106627]

2022-09-01 Thread Richard Biener via Gcc-patches

On Wed, Aug 31, 2022 at 11:00 PM Simon Rainer  wrote:
>
> Hi,
>
> This patch fixes PR106627. I ran the i386.exp tests on my x86_64-linux-gnu 
> machine with a fully bootstrapped checkout. I also tested manually that no 
> exception handling code is generated if none of the function versions throws 
> an exception.
> I don't have access to a machine to test the change to  rs6000.cc, but the 
> code seems like an exact copy and I don't see a reason why it shouldn't work 
> there the same way.
>
> Regards
> Simon Rainer
>
> From 6fcb1c742fa1d61048f7d63243225a8d1931af4a Mon Sep 17 00:00:00 2001
> From: Simon Rainer 
> Date: Wed, 31 Aug 2022 20:56:04 +0200
> Subject: [PATCH] ipa: Fix throw in multi-versioned functions [PR106627]
>
> Any multi-versioned function was implicitly declared as noexcept, which
> leads to an abort if an exception is thrown inside the function.
> The reason for this is that the function declaration is replaced by a
> newly created dispatcher declaration, which has TREE_NOTHROW always set
> to 1. Instead we need to set TREE_NOTHROW to the value of the original
> declaration.
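
A source-level analogy for the problem described above, as a sketch only: it
does not use function multiversioning or the dispatcher machinery, and all
names are hypothetical.  A wrapper that inherits "may throw" from the function
it calls lets the exception reach the caller, while a wrapper forced to be
noexcept would end in std::terminate if the callee threw.

```cpp
// Stand-in for one of the function versions.
void may_throw () { throw 1; }

// No exception specification: callers must assume it can throw.
void dispatcher_inheriting () { may_throw (); }

// Forced nothrow: an exception escaping from here calls std::terminate.
// It is defined for comparison only and never called in this sketch.
void dispatcher_forced_nothrow () noexcept { may_throw (); }

static_assert (!noexcept (dispatcher_inheriting ()),
               "the inheriting wrapper is treated as potentially throwing");
static_assert (noexcept (dispatcher_forced_nothrow ()),
               "the forced wrapper is treated as never throwing");

// The exception thrown by may_throw reaches the handler here.
int
call_and_catch ()
{
  try
    {
      dispatcher_inheriting ();
    }
  catch (int value)
    {
      return value;
    }
  return 0;
}
```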

Looks quite obvious.  The middle-end to target interface is a bit iffy
since we have to duplicate this everywhere.  There are also other flags
like pure/const and noreturn that do not pose correctness issues but may
cause irritation if the IL gets a call to the dispatcher that is not marked
noreturn but has no code following it.

That said, the fix looks good to me.

Thanks,
Richard.

> PR ipa/106627
>
> gcc/ChangeLog:
>
> * config/i386/i386-features.cc 
> (ix86_get_function_versions_dispatcher): Set TREE_NOTHROW
> correctly for dispatcher declaration
> * config/rs6000/rs6000.cc (rs6000_get_function_versions_dispatcher): 
> Likewise
>
> gcc/testsuite/ChangeLog:
>
> * g++.target/i386/pr106627.C: New test.
> ---
>  gcc/config/i386/i386-features.cc |  1 +
>  gcc/config/rs6000/rs6000.cc  |  1 +
>  gcc/testsuite/g++.target/i386/pr106627.C | 30 
>  3 files changed, 32 insertions(+)
>  create mode 100644 gcc/testsuite/g++.target/i386/pr106627.C
>
> diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
> index d6bb66cbe01..5b3b1aeff28 100644
> --- a/gcc/config/i386/i386-features.cc
> +++ b/gcc/config/i386/i386-features.cc
> @@ -3268,6 +3268,7 @@ ix86_get_function_versions_dispatcher (void *decl)
>
>/* Right now, the dispatching is done via ifunc.  */
>dispatch_decl = make_dispatcher_decl (default_node->decl);
> +  TREE_NOTHROW(dispatch_decl) = TREE_NOTHROW(fn);
>
>dispatcher_node = cgraph_node::get_create (dispatch_decl);
>gcc_assert (dispatcher_node != NULL);
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 2f3146e56f8..9280da8a5c8 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -24861,6 +24861,7 @@ rs6000_get_function_versions_dispatcher (void *decl)
>
>/* Right now, the dispatching is done via ifunc.  */
>dispatch_decl = make_dispatcher_decl (default_node->decl);
> +  TREE_NOTHROW(dispatch_decl) = TREE_NOTHROW(fn);
>
>dispatcher_node = cgraph_node::get_create (dispatch_decl);
>gcc_assert (dispatcher_node != NULL);
> diff --git a/gcc/testsuite/g++.target/i386/pr106627.C b/gcc/testsuite/g++.target/i386/pr106627.C
> new file mode 100644
> index 000..a67f5ae4813
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/i386/pr106627.C
> @@ -0,0 +1,30 @@
> +/* PR c++/103012 Exception handling with multiversioned functions */
> +/* { dg-do run } */
> +/* { dg-require-ifunc "" }  */
> +
> +#include <assert.h>
> +
> +__attribute__((target("default")))
> +void f() {
> +    throw 1;
> +}
> +
> +__attribute__((target("sse4.2,bmi")))
> +void f() {
> +    throw 2;
> +}
> +
> +int main()
> +{
> +    try {
> +        f();
> +    }
> +    catch(...)
> +    {
> +        return 0;
> +    }
> +
> +    assert (false);
> +    return 1;
> +}
> +
> --
> 2.34.1
>

Re: [PATCH] d: Fix #error You must define PREFERRED_DEBUGGING_TYPE if DWARF is not supported (PR105659)

2022-09-01 Thread Richard Biener via Gcc-patches

On Wed, Aug 31, 2022 at 9:21 PM Iain Buclaw  wrote:
>
> Excerpts from Joseph Myers's message of August 31, 2022 7:16 pm:
> > On Wed, 31 Aug 2022, Iain Buclaw via Gcc-patches wrote:
> >
> >> Excerpts from Joseph Myers's message of August 30, 2022 11:53 pm:
> >> > On Fri, 26 Aug 2022, Richard Biener via Gcc-patches wrote:
> >> >
> >> >> I was hoping Joseph would chime in here - I recollect debugging this 
> >> >> kind
> >> >> of thing and a thread about this a while back but unfortunately I do not
> >> >> remember the details here (IIRC some things get included where they
> >> >> better should not be).
> >> >
> >> > See .
> >> > Is there some reason it's problematic to avoid having defaults.h or
> >> > ${cpu_type}/${cpu_type}.h included in tm_d.h, and instead have tm_d.h 
> >> > only
> >> > include D-specific headers?
> >> >
> >>
> >> In targets such as arm-elf, we still need to pull in definitions from
> >> ${cpu_type}/${cpu_type}-d.cc into default-d.cc.
> >>
> >> All I can think that might suffice is having D-specific prototype
> >> headers in all targets as ${cpu_type}/${cpu_type}-d.h.
> >
> > As long as those prototypes don't involve any types that depend on an
> > inclusion of tm.h, that should be fine.
> >
>
> Updated patch that does what I described.

Ah yes - I think, even if a bit verbose, this is exactly how it was supposed
to be?

OK from my side.

Thanks,
Richard.

> Bootstrapped on x86_64-linux-gnu and built an aarch64-rtems
> cross-compiler without any errors, will kick off config-list.mk as well for
> sanity checking a big list of targets in a while.
>
> Iain.
> ---
> PR d/105659
>
> gcc/ChangeLog:
>
> * config.gcc: Set tm_d_file to ${cpu_type}/${cpu_type}-d.h.
> * config/aarch64/aarch64-d.cc: Include tm_d.h.
> * config/aarch64/aarch64-protos.h (aarch64_d_target_versions): Move to
> config/aarch64/aarch64-d.h.
> (aarch64_d_register_target_info): Likewise.
> * config/aarch64/aarch64.h (TARGET_D_CPU_VERSIONS): Likewise.
> (TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
> * config/arm/arm-d.cc: Include tm_d.h instead of tm_p.h.
> * config/arm/arm-protos.h (arm_d_target_versions): Move to
> config/arm/arm-d.h.
> (arm_d_register_target_info): Likewise.
> * config/arm/arm.h (TARGET_D_CPU_VERSIONS): Likewise.
> (TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
> * config/default-d.cc: Remove memmodel.h include.
> * config/freebsd-d.cc: Include tm_d.h instead of tm_p.h.
> * config/glibc-d.cc: Likewise.
> * config/i386/i386-d.cc: Include tm_d.h.
> * config/i386/i386-protos.h (ix86_d_target_versions): Move to
> config/i386/i386-d.h.
> (ix86_d_register_target_info): Likewise.
> (ix86_d_has_stdcall_convention): Likewise.
> * config/i386/i386.h (TARGET_D_CPU_VERSIONS): Likewise.
> (TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
> (TARGET_D_HAS_STDCALL_CONVENTION): Likewise.
> * config/i386/winnt-d.cc: Include tm_d.h instead of tm_p.h.
> * config/mips/mips-d.cc: Include tm_d.h.
> * config/mips/mips-protos.h (mips_d_target_versions): Move to
> config/mips/mips-d.h.
> (mips_d_register_target_info): Likewise.
> * config/mips/mips.h (TARGET_D_CPU_VERSIONS): Likewise.
> (TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
> * config/netbsd-d.cc: Include tm_d.h instead of tm.h and memmodel.h.
> * config/openbsd-d.cc: Likewise.
> * config/pa/pa-d.cc: Include tm_d.h.
> * config/pa/pa-protos.h (pa_d_target_versions): Move to
> config/pa/pa-d.h.
> (pa_d_register_target_info): Likewise.
> * config/pa/pa.h (TARGET_D_CPU_VERSIONS): Likewise.
> (TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
> * config/riscv/riscv-d.cc: Include tm_d.h.
> * config/riscv/riscv-protos.h (riscv_d_target_versions): Move to
> config/riscv/riscv-d.h.
> (riscv_d_register_target_info): Likewise.
> * config/riscv/riscv.h (TARGET_D_CPU_VERSIONS): Likewise.
> (TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
> * config/rs6000/rs6000-d.cc: Include tm_d.h.
> * config/rs6000/rs6000-protos.h (rs6000_d_target_versions): Move to
> config/rs6000/rs6000-d.h.
> (rs6000_d_register_target_info): Likewise.
> * config/rs6000/rs6000.h (TARGET_D_CPU_VERSIONS): Likewise.
> (TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
> * config/s390/s390-d.cc: Include tm_d.h.
> * config/s390/s390-protos.h (s390_d_target_versions): Move to
> config/s390/s390-d.h.
> (s390_d_register_target_info): Likewise.
> * config/s390/s390.h (TARGET_D_CPU_VERSIONS): Likewise.
> (TARGET_D_REGISTER_CPU_TARGET_INFO): Likewise.
> * config/sol2-d.cc: Include tm_d.h instead of tm.h

Re: [[GCC13][Patch][V3] 1/2] Add a new option -fstrict-flex-array[=n] and new attribute strict_flex_array

2022-09-01 Thread Richard Biener via Gcc-patches

On Wed, 31 Aug 2022, Kees Cook wrote:

> On Wed, Aug 31, 2022 at 08:35:12PM +, Qing Zhao wrote:
> > One of the major purposes of the new option -fstrict-flex-array is to
> > encourage a standard-conforming programming style.
> > 
> > So, it might be reasonable to treat -fstrict-flex-array similarly to
> > -pedantic (but only for flexible array members)?
> > If so, then issuing warnings when the standard doesn't support it is
> > reasonable and desirable.
> 
> I guess the point is that "-std=c89 -fstrict-flex-arrays=3" leaves "[]"
> available for use still? I think this doesn't matter. If someone wants
> it to be really strict, they'd just add -Wpedantic.

Yes, I think that makes sense.

Richard.
