Re: [PATCH] Fix PR 106560: Another ICE after conflicting types of redeclaration

2022-11-20 Thread Richard Biener via Gcc-patches
On Sun, Nov 20, 2022 at 2:26 AM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> This is another one of these ICE-after-error issues with the
> gimplifier and a fallout from r12-3278-g823685221de986af.
> The problem here is gimplify_modify_expr does not
> check if either from or to was an error operand.
> This adds the check and fixes the ICE.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

OK.

> gcc/ChangeLog:
>
> * gimplify.cc (gimplify_modify_expr): If
> either *from_p or *to_p were error_operand
> return early.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/redecl-23.c: New test.
> * gcc.dg/redecl-24.c: New test.
> * gcc.dg/redecl-25.c: New test.
> ---
>  gcc/gimplify.cc  | 3 +++
>  gcc/testsuite/gcc.dg/redecl-23.c | 6 ++
>  gcc/testsuite/gcc.dg/redecl-24.c | 6 ++
>  gcc/testsuite/gcc.dg/redecl-25.c | 9 +
>  4 files changed, 24 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/redecl-23.c
>  create mode 100644 gcc/testsuite/gcc.dg/redecl-24.c
>  create mode 100644 gcc/testsuite/gcc.dg/redecl-25.c
>
> diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
> index c62a966e918..02415cb1b5c 100644
> --- a/gcc/gimplify.cc
> +++ b/gcc/gimplify.cc
> @@ -6054,6 +6054,9 @@ gimplify_modify_expr (tree *expr_p, gimple_seq *pre_p, 
> gimple_seq *post_p,
>location_t loc = EXPR_LOCATION (*expr_p);
>gimple_stmt_iterator gsi;
>
> +  if (error_operand_p (*from_p) || error_operand_p (*to_p))
> +return GS_ERROR;
> +
>gcc_assert (TREE_CODE (*expr_p) == MODIFY_EXPR
>   || TREE_CODE (*expr_p) == INIT_EXPR);
>
> diff --git a/gcc/testsuite/gcc.dg/redecl-23.c 
> b/gcc/testsuite/gcc.dg/redecl-23.c
> new file mode 100644
> index 000..425721df2ff
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/redecl-23.c
> @@ -0,0 +1,6 @@
> +/* We used to ICE in the gimplifier, PR 106560. */
> +/* { dg-do compile } */
> +/* { dg-options "-w" } */
> +void **a; /* { dg-note "" } */
> +void b() { void **c = a; }
> +a; /* { dg-error "" } */
> diff --git a/gcc/testsuite/gcc.dg/redecl-24.c 
> b/gcc/testsuite/gcc.dg/redecl-24.c
> new file mode 100644
> index 000..f0f7a723ab8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/redecl-24.c
> @@ -0,0 +1,6 @@
> +/* We used to ICE in the gimplifier, PR 106560 */
> +/* { dg-do compile } */
> +/* { dg-options "-w" } */
> +void **a, **b; /* { dg-note "" } */
> +c(){b = a;}
> +a = /* { dg-error "" } */
> diff --git a/gcc/testsuite/gcc.dg/redecl-25.c 
> b/gcc/testsuite/gcc.dg/redecl-25.c
> new file mode 100644
> index 000..4232e19d9a7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/redecl-25.c
> @@ -0,0 +1,9 @@
> +/* We used to ICE in the gimplifier, PR 106560 */
> +/* { dg-do compile } */
> +/* { dg-options "-w" } */
> +void **a; /* { dg-note "" } */
> +void b() {
> +  void **c;
> +c = a /* { dg-error "" } */
> +}
> +a; /* { dg-error "" } */
> --
> 2.27.0
>
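
As an illustration only (not part of the patch), the early-return guard described above can be modelled with toy types. Everything here is a hypothetical stand-in: `toy_node`, `toy_error_operand_p`, and `toy_gimplify_modify` are not GCC APIs, they only mirror the shape of the `error_operand_p` check that the patch adds to `gimplify_modify_expr`.

```c
#include <stdbool.h>

/* Toy stand-ins for GCC's gimplify_status values (hypothetical names).  */
enum toy_status { TOY_GS_ERROR = -1, TOY_GS_OK = 0 };

/* Toy stand-in for a tree operand.  is_error models an operand that is
   error_mark_node or has an erroneous type.  */
struct toy_node {
  bool is_error;
  int value;
};

/* Models error_operand_p for the toy node.  */
bool toy_error_operand_p(const struct toy_node *n)
{
  return n->is_error;
}

/* Models the patched entry of gimplify_modify_expr: bail out before any
   assertion or lowering work if either side is already erroneous.  */
enum toy_status toy_gimplify_modify(struct toy_node *to,
                                    const struct toy_node *from)
{
  if (toy_error_operand_p(from) || toy_error_operand_p(to))
    return TOY_GS_ERROR;

  to->value = from->value;   /* stands in for the real lowering */
  return TOY_GS_OK;
}
```

The point of the sketch is only the ordering: the error check runs first, so later code never sees an operand whose type information is unusable.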


Re: [PATCH 1/2] Allow subtarget customization of CC1_SPEC

2022-11-20 Thread Sebastian Huber

On 20/11/2022 17:57, Jeff Law wrote:


On 10/26/22 03:34, Sebastian Huber wrote:

On 04/10/2022 11:47, Sebastian Huber wrote:

On 08/09/2022 07:33, Sebastian Huber wrote:

On 04/08/2022 15:02, Sebastian Huber wrote:

On 22/07/2022 15:02, Sebastian Huber wrote:

gcc/ChangeLog:

* gcc.cc (SUBTARGET_CC1_SPEC): Define if not defined.
(CC1_SPEC): Define to SUBTARGET_CC1_SPEC.
* config/arm/arm.h (CC1_SPEC): Remove.
* config/arc/arc.h (CC1_SPEC): Append SUBTARGET_CC1_SPEC.
* config/cris/cris.h (CC1_SPEC): Likewise.
* config/frv/frv.h (CC1_SPEC): Likewise.
* config/i386/i386.h (CC1_SPEC): Likewise.
* config/ia64/ia64.h (CC1_SPEC): Likewise.
* config/lm32/lm32.h (CC1_SPEC): Likewise.
* config/m32r/m32r.h (CC1_SPEC): Likewise.
* config/mcore/mcore.h (CC1_SPEC): Likewise.
* config/microblaze/microblaze.h: Likewise.
* config/nds32/nds32.h (CC1_SPEC): Likewise.
* config/nios2/nios2.h (CC1_SPEC): Likewise.
* config/pa/pa.h (CC1_SPEC): Likewise.
* config/rs6000/sysv4.h (CC1_SPEC): Likewise.
* config/rx/rx.h (CC1_SPEC): Likewise.
* config/sparc/sparc.h (CC1_SPEC): Likewise.


Could someone please have a look at this patch set?


Ping


Would someone mind having a look at this patch set? If there is a 
better approach to customize the default TLS model, then please let 
me know.


It would be nice if someone could review the patch before Stage 1 
ends on November 13th.


Just a reminder.  The guidelines are that a patch needs to be posted before 
the end of stage1 to make the deadline.  Review & integration can happen 
after the deadline.


I realize the idea here is to allow RTEMS to change the default TLS 
model.  But does it also happen to make it possible to solve Keith 
Packard's issues with picolibc?  See the Aug/Sep gcc-patches archives.


It looks sensible.  I assume you did a "find" to identify all the 
CC1_SPECs to change.



OK for the trunk,


Thanks for having a look at the patch. After looking at the patch again, 
I think it can be simplified to:


https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606841.html



Jeff


Anyway, does this also solve some of the issues Keith Packard was trying to 
nail down for picolibc?


I had a look at this, however, I think this is a slightly different 
problem since LIB_SPEC needs to be replaced and not extended.


--
embedded brains GmbH
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Court of registration: Amtsgericht München
Registration number: HRB 157899
Authorized managing directors: Peter Rasmussen, Thomas Dörfler
Our privacy policy can be found here:
https://embedded-brains.de/datenschutzerklaerung/


[PATCH v2 2/2] RTEMS: Use local-exec TLS model by default

2022-11-20 Thread Sebastian Huber
gcc/ChangeLog:

* config/rtems.h (SUBTARGET_CC1_SPEC): Undef and define.
---
 gcc/config/rtems.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/config/rtems.h b/gcc/config/rtems.h
index 95bcdc41b2f..4742b1f3722 100644
--- a/gcc/config/rtems.h
+++ b/gcc/config/rtems.h
@@ -56,3 +56,7 @@
 /* Prefer int for int32_t (see stdint-newlib.h).  */
 #undef STDINT_LONG32
 #define STDINT_LONG32 (INT_TYPE_SIZE != 32 && LONG_TYPE_SIZE == 32)
+
+/* Default to local-exec TLS model.  */
+#undef SUBTARGET_CC1_SPEC
+#define SUBTARGET_CC1_SPEC " %{!ftls-model=*:-ftls-model=local-exec}"
-- 
2.35.3



[PATCH v2 1/2] Allow subtarget customization of CC1_SPEC

2022-11-20 Thread Sebastian Huber
gcc/ChangeLog:

* gcc.cc (SUBTARGET_CC1_SPEC): Define if not defined.
(cc1_spec): Append SUBTARGET_CC1_SPEC.
---
v2: Append SUBTARGET_CC1_SPEC directly to cc1_spec and not through CC1_SPEC.
This avoids having to modify all the CC1_SPEC definitions in the targets.

 gcc/gcc.cc | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/gcc/gcc.cc b/gcc/gcc.cc
index 830ab88701f..4e1574a4df1 100644
--- a/gcc/gcc.cc
+++ b/gcc/gcc.cc
@@ -706,6 +706,13 @@ proper position among the other output files.  */
 #define CPP_SPEC ""
 #endif
 
+/* Subtargets can define SUBTARGET_CC1_SPEC to provide extra args to cc1 and
+   cc1plus or extra switch-translations.  The SUBTARGET_CC1_SPEC is appended
+   to CC1_SPEC.  */
+#ifndef SUBTARGET_CC1_SPEC
+#define SUBTARGET_CC1_SPEC ""
+#endif
+
 /* config.h can define CC1_SPEC to provide extra args to cc1 and cc1plus
or extra switch-translations.  */
 #ifndef CC1_SPEC
@@ -1174,7 +1181,7 @@ proper position among the other output files.  */
 static const char *asm_debug = ASM_DEBUG_SPEC;
 static const char *asm_debug_option = ASM_DEBUG_OPTION_SPEC;
 static const char *cpp_spec = CPP_SPEC;
-static const char *cc1_spec = CC1_SPEC;
+static const char *cc1_spec = CC1_SPEC SUBTARGET_CC1_SPEC;
 static const char *cc1plus_spec = CC1PLUS_SPEC;
 static const char *link_gcc_c_sequence_spec = LINK_GCC_C_SEQUENCE_SPEC;
 static const char *link_ssp_spec = LINK_SSP_SPEC;
-- 
2.35.3
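
As an illustration only (not part of the patch), the mechanism in v2 relies on adjacent string literals being concatenated by the C compiler, so a subtarget's spec fragment is appended to whatever the target already defines. In the sketch below the value of `CC1_SPEC` is an assumed example, not taken from any real port, and `get_cc1_spec` is a hypothetical accessor; the `SUBTARGET_CC1_SPEC` value is the one from the RTEMS patch.

```c
/* Subtarget header (models gcc/config/rtems.h).  The spec says: if no
   -ftls-model=... switch was given, pass -ftls-model=local-exec.  */
#undef SUBTARGET_CC1_SPEC
#define SUBTARGET_CC1_SPEC " %{!ftls-model=*:-ftls-model=local-exec}"

/* Target header: an assumed example value for illustration only.  */
#define CC1_SPEC "%{profile:-p}"

/* Driver (models gcc/gcc.cc): default to an empty fragment when the
   subtarget does not define one.  */
#ifndef SUBTARGET_CC1_SPEC
#define SUBTARGET_CC1_SPEC ""
#endif

/* Adjacent string literals are concatenated at compile time.  */
static const char *cc1_spec = CC1_SPEC SUBTARGET_CC1_SPEC;

/* Hypothetical accessor so the result can be inspected.  */
const char *get_cc1_spec(void)
{
  return cc1_spec;
}
```

Because the fragment is appended at the use site, targets that already define `CC1_SPEC` need no change, which is the simplification over v1.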



Re: [PATCH] [x86] Some tidy up for RA related hooks.

2022-11-20 Thread Uros Bizjak via Gcc-patches
On Mon, Nov 21, 2022 at 6:24 AM Hongtao Liu  wrote:
>
> On Mon, Nov 21, 2022 at 10:13 AM liuhongt  wrote:
> >
> > While working on [1] for ix86_can_change_mode_class,
> > I noticed some incorrectness/misoptimization in the current RA-related
> > hooks.
> > This patch tries to fix and tidy them up:
> >
> > 1. We also need to guard against the size of TO being
> > less than (TARGET_SSE2 ? 2 : 4) in ix86_can_change_mode_class.
> > 2. Merge VALID_AVX512FP16_SCALAR_MODE plus BFmode
> > into VALID_AVX512F_SCALAR_MODE, since 16-bit data moves are supported
> > with SSE2 and above, so the AVX512FP16 condition is not needed for
> > those EVEX SSE registers.
> > 3. Allow DI/HImode to be allocated to SSE registers with SSE2 and above,
> > just like SImode, since 16-bit data moves between SSE registers and GPRs
> > are supported with SSE2 and above. This helps RA handle cases like
> > (subreg:HI (reg:V8HI) 0); otherwise RA will spill it. This enables the
> > optimization for pieces-memset-{3,37,39}.c.
> > 4. Guard 64/32-bit vector move patterns with ix86_hard_reg_move_ok.
> >
> > [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606373.html
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> > * config/i386/i386.cc (ix86_can_change_mode_class): Also guard
> > size of TO.
> > (ix86_hard_regno_mode_ok): Remove VALID_AVX512FP16_SCALAR_MODE
> > * config/i386/i386.h (VALID_AVX512FP16_SCALAR_MODE): Merged to
> > ..
> > (VALID_AVX512F_SCALAR_MODE): .. this, also add HImode.
> > (VALID_SSE_REG_MODE): Add DI/HImode.
> > * config/i386/mmx.md (*mov_internal): Add
> > ix86_hard_reg_move_ok to condition.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/i386/pieces-memset-3.c: Remove xfail.
> > * gcc.target/i386/pieces-memset-37.c: Remove xfail.
> > * gcc.target/i386/pieces-memset-39.c: Remove xfail.

OK.

This is a somewhat tricky part of the compiler, so it would be nice if
the patch could be split into a couple of patches to ease bisecting if
something goes wrong. OTOH, there have recently been a couple of similar
changes in this area, and there were no problems.

Thanks,
Uros.

> > ---
> >  gcc/config/i386/i386.cc  |  9 ++---
> >  gcc/config/i386/i386.h   | 16 
> >  gcc/config/i386/mmx.md   |  6 --
> >  gcc/testsuite/gcc.target/i386/pieces-memset-3.c  |  4 ++--
> >  gcc/testsuite/gcc.target/i386/pieces-memset-37.c |  4 ++--
> >  gcc/testsuite/gcc.target/i386/pieces-memset-39.c |  4 ++--
> >  6 files changed, 20 insertions(+), 23 deletions(-)
> >
> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > index 292b32c5e99..030c26965ab 100644
> > --- a/gcc/config/i386/i386.cc
> > +++ b/gcc/config/i386/i386.cc
> > @@ -19725,7 +19725,8 @@ ix86_can_change_mode_class (machine_mode from, 
> > machine_mode to,
> >  the vec_dupv4hi pattern.
> >  NB: SSE2 can load 16bit data to sse register via pinsrw.  */
> >int mov_size = MAYBE_SSE_CLASS_P (regclass) && TARGET_SSE2 ? 2 : 4;
> > -  if (GET_MODE_SIZE (from) < mov_size)
> > +  if (GET_MODE_SIZE (from) < mov_size
> > + || GET_MODE_SIZE (to) < mov_size)
> > return false;
> >  }
> >
> > @@ -20089,12 +20090,6 @@ ix86_hard_regno_mode_ok (unsigned int regno, 
> > machine_mode mode)
> >   || VALID_AVX512F_SCALAR_MODE (mode)))
> > return true;
> >
> > -  /* For AVX512FP16, vmovw supports movement of HImode
> > -and HFmode between GPR and SSE registers.  */
> > -  if (TARGET_AVX512FP16
> > - && VALID_AVX512FP16_SCALAR_MODE (mode))
> > -   return true;
> > -
> >/* For AVX-5124FMAPS or AVX-5124VNNIW
> >  allow V64SF and V64SI modes for special regnos.  */
> >if ((TARGET_AVX5124FMAPS || TARGET_AVX5124VNNIW)
> > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> > index 3869db8f2d3..d9a1fb0e420 100644
> > --- a/gcc/config/i386/i386.h
> > +++ b/gcc/config/i386/i386.h
> > @@ -1017,11 +1017,9 @@ extern const char *host_detect_local_cpu (int argc, 
> > const char **argv);
> >(VALID_AVX256_REG_MODE (MODE) || (MODE) == OImode)
> >
> >  #define VALID_AVX512F_SCALAR_MODE(MODE)
> > \
> > -  ((MODE) == DImode || (MODE) == DFmode || (MODE) == SImode\
> > -   || (MODE) == SFmode)
> > -
> > -#define VALID_AVX512FP16_SCALAR_MODE(MODE) \
> > -  ((MODE) == HImode || (MODE) == HFmode)
> > +  ((MODE) == DImode || (MODE) == DFmode
> > \
> > +   || (MODE) == SImode || (MODE) == SFmode \
> > +   || (MODE) == HImode || (MODE) == HFmode || (MODE) == BFmode)
> >
> >  #define VALID_AVX512F_REG_MODE(MODE)   \
> >((MODE) == V8DImode || (MODE) == V8DFmode || (MODE) == V64QImode \
> > @@ -1045,13 +1043,15 @@ 

Re: [PATCH 3/3] Add '--oslib=' option when default C library is picolibc

2022-11-20 Thread Sebastian Huber

On 03/09/2022 08:07, Keith Packard via Gcc-patches wrote:

diff --git a/gcc/config/arm/elf.h b/gcc/config/arm/elf.h
index 3d111433ede..dc5b9374814 100644
--- a/gcc/config/arm/elf.h
+++ b/gcc/config/arm/elf.h
@@ -150,3 +150,8 @@
  #undef L_floatundisf
  #endif
  
+#if DEFAULT_LIBC == LIBC_PICOLIBC

+#undef  LIB_SPEC
+#define LIB_SPEC "--start-group %(libgcc)  -lc %{-oslib=*:-l%*} --end-group"
+#endif
+


Can't you add a gcc/config/picolibc.h (similar to gcc/config/rtems.h) 
that is added to tm_file in gcc/config.gcc with something like this:


picolibc)
default_libc=LIBC_PICOLIBC
tm_file="${tm_file} picolibc.h"
;;

In this header place:

#undef  LIB_SPEC
#define LIB_SPEC "--start-group %(libgcc)  -lc %{-oslib=*:-l%*} --end-group"

This avoids having to modify all the ELF-specific files.



Re: [PATCHv2, rs6000] Enable have_cbranchcc4 on rs6000

2022-11-20 Thread HAO CHEN GUI via Gcc-patches
Hi Segher,

On 2022/11/18 20:18, Segher Boessenkool wrote:
> I don't think we should pretend we have any conditional jumps the
> machine does not actually have, in cbranchcc4.  When would this ever be
> useful?  cror;beq can be quite expensive, compared to the code it would
> replace anyway.
> 
> If something generates those here (which then ICEs later), that is
> wrong, fix *that*?  Is it ifcvt doing it?

"*cbranch_2insn" is a valid insn for rs6000, so such insns are generated
at the expand pass. The "prepare_cmp_insn" called by ifcvt just wants to verify
that the following comparison rtx is valid.

(unlt (reg:CCFP 156)
(const_int 0 [0]))

It should be valid, as it is extracted from an existing insn. The ICE is hit only
when the comparison rtx fails the predicate check of "cbranchcc4". So
"cbranchcc4" should include "extra_insn_branch_comparison_operator".

Then ifcvt tries to call emit_conditional_move_1 to generate a conditional
move for FP mode. It definitely fails, as there is no conditional move insn for
FP mode on rs6000. The behavior of ifcvt is correct: it tries the conversion
and fails. It won't hit ICEs once cbranchcc4 is correctly defined.

Actually, "*cbranch_2insn" follows the same logic as the float "*cbranch" in ifcvt.
Both of them get a final false return from "rs6000_emit_int_cmove", as rs6000
doesn't have a conditional move for FP mode.

So I think "cbranchcc4" should include "extra_insn_branch_comparison_operator",
as "*cbranch_2insn" is a valid insn. Just let ifcvt decide whether a conditional
move is valid or not.

Thanks
Gui Haochen


Re: [PATCH] i386: Only enable small loop unrolling in backend [PR 107602]

2022-11-20 Thread Hongyu Wang via Gcc-patches
> It's not necessarily right. unroll_factor will be set to 1 with
> -fno-unroll-loops, which is exactly what -fno-unroll-loops means.

Not exactly: previously, -fno-unroll-loops prevented the pass
from running, whereas on current trunk the pass still runs.
Actually, I think the implementation on trunk is a bit tricky, since we
use a target hook without a return value to execute the pass. Also, the
!OPTION_SET_P logic looks quite like a workaround. IMHO this is also
not good for maintenance.

On Mon, Nov 21, 2022 at 09:33, Hongtao Liu via Gcc-patches wrote:
>
> On Mon, Nov 21, 2022 at 9:01 AM Liu, Hongtao via Gcc-patches
>  wrote:
> >
> >
> >
> > > -Original Message-
> > > From: Wang, Hongyu 
> > > Sent: Saturday, November 19, 2022 2:26 PM
> > > To: gcc-patches@gcc.gnu.org
> > > Cc: richard.guent...@gmail.com; ubiz...@gmail.com; Liu, Hongtao
> > > 
> > > Subject: [PATCH] i386: Only enable small loop unrolling in backend [PR 
> > > 107602]
> > >
> > > Hi,
> > >
> > > Followed by the discussion in pr107602, -munroll-only-small-loops Does not
> > PR107692?
> > > turns on/off -funroll-loops, and current check in 
> > > pass_rtl_unroll_loops::gate
> > > would cause -funroll-loops do not take effect. Revert the change about
> It's not necessarily right. unroll_factor will be set to 1 with
> -fno-unroll-loops, which is exactly what -fno-unroll-loops means.
> > > targetm.loop_unroll_adjust and apply the backend option change to strictly
> > > follow the rule that -funroll-loops takes full control of loop unrolling, 
> > > and
> > > munroll-only-small-loops just change its behavior to unroll small size 
> > > loops.
> > >
> > > Bootstrapped and regtested on x86-64-pc-linux-gnu.
> > >
> > > Ok for trunk?
> > >
> > > gcc/ChangeLog:
> > >
> > >   PR target/107602
> > >   * common/config/i386/i386-common.cc (ix86_optimization_table):
> > >   Enable loop unroll O2, disable -fweb and -frename-registers
> > >   by default.
> > >   * config/i386/i386-options.cc
> > >   (ix86_override_options_after_change):
> > >   Disable small loop unroll when funroll-loops enabled, reset
> > >   cunroll_grow_size when it is not explicitly enabled.
> > >   (ix86_option_override_internal): Call
> > >   ix86_override_options_after_change instead of calling
> > >   ix86_recompute_optlev_based_flags and ix86_default_align
> > >   separately.
> > >   * config/i386/i386.cc (ix86_loop_unroll_adjust): Adjust unroll
> > >   factor if -munroll-only-small-loops enabled.
> > >   * loop-init.cc (pass_rtl_unroll_loops::gate): Do not enable
> > >   loop unrolling for -O2-speed.
> > >   (pass_rtl_unroll_loops::execute): Remove
> > >   targetm.loop_unroll_adjust check.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >   PR target/107602
> > >   * gcc.target/i386/pr86270.c: Add -fno-unroll-loops.
> > >   * gcc.target/i386/pr93002.c: Likewise.
> > > ---
> > >  gcc/common/config/i386/i386-common.cc   |  8 ++
> > >  gcc/config/i386/i386-options.cc | 34 ++---
> > >  gcc/config/i386/i386.cc | 18 -
> > >  gcc/loop-init.cc| 11 +++-
> > >  gcc/testsuite/gcc.target/i386/pr86270.c |  2 +-
> > > gcc/testsuite/gcc.target/i386/pr93002.c |  2 +-
> > >  6 files changed, 49 insertions(+), 26 deletions(-)
> > >
> > > diff --git a/gcc/common/config/i386/i386-common.cc
> > > b/gcc/common/config/i386/i386-common.cc
> > > index 6ce2a588adc..660a977b68b 100644
> > > --- a/gcc/common/config/i386/i386-common.cc
> > > +++ b/gcc/common/config/i386/i386-common.cc
> > > @@ -1808,7 +1808,15 @@ static const struct default_options
> > > ix86_option_optimization_table[] =
> > >  /* The STC algorithm produces the smallest code at -Os, for x86.  */
> > >  { OPT_LEVELS_2_PLUS, OPT_freorder_blocks_algorithm_, NULL,
> > >REORDER_BLOCKS_ALGORITHM_STC },
> > > +
> > > +/* Turn on -funroll-loops with -munroll-only-small-loops to enable 
> > > small
> > > +   loop unrolling at -O2.  */
> > > +{ OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_funroll_loops, NULL, 1 },
> > >  { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_munroll_only_small_loops, NULL,
> > > 1 },
> > > +/* Turns off -frename-registers and -fweb which are enabled by
> > > +   funroll-loops.  */
> > > +{ OPT_LEVELS_ALL, OPT_frename_registers, NULL, 0 },
> > > +{ OPT_LEVELS_ALL, OPT_fweb, NULL, 0 },
> > >  /* Turn off -fschedule-insns by default.  It tends to make the
> > > problem with not enough registers even worse.  */
> > >  { OPT_LEVELS_ALL, OPT_fschedule_insns, NULL, 0 }, diff --git
> > > a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc index
> > > e5c77f3a84d..bc1d36e36a8 100644
> > > --- a/gcc/config/i386/i386-options.cc
> > > +++ b/gcc/config/i386/i386-options.cc
> > > @@ -1838,8 +1838,37 @@ ix86_recompute_optlev_based_flags (struct
> > > gcc_options *opts,  void  ix86_override_options_after_change (void)  {
> > 

Re: [PATCH] [x86] Some tidy up for RA related hooks.

2022-11-20 Thread Hongtao Liu via Gcc-patches
On Mon, Nov 21, 2022 at 10:13 AM liuhongt  wrote:
>
> While working on [1] for ix86_can_change_mode_class,
> I noticed some incorrectness/misoptimization in the current RA-related
> hooks.
> This patch tries to fix and tidy them up:
>
> 1. We also need to guard against the size of TO being
> less than (TARGET_SSE2 ? 2 : 4) in ix86_can_change_mode_class.
> 2. Merge VALID_AVX512FP16_SCALAR_MODE plus BFmode
> into VALID_AVX512F_SCALAR_MODE, since 16-bit data moves are supported
> with SSE2 and above, so the AVX512FP16 condition is not needed for
> those EVEX SSE registers.
> 3. Allow DI/HImode to be allocated to SSE registers with SSE2 and above,
> just like SImode, since 16-bit data moves between SSE registers and GPRs
> are supported with SSE2 and above. This helps RA handle cases like
> (subreg:HI (reg:V8HI) 0); otherwise RA will spill it. This enables the
> optimization for pieces-memset-{3,37,39}.c.
> 4. Guard 64/32-bit vector move patterns with ix86_hard_reg_move_ok.
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606373.html
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> * config/i386/i386.cc (ix86_can_change_mode_class): Also guard
> size of TO.
> (ix86_hard_regno_mode_ok): Remove VALID_AVX512FP16_SCALAR_MODE
> * config/i386/i386.h (VALID_AVX512FP16_SCALAR_MODE): Merged to
> ..
> (VALID_AVX512F_SCALAR_MODE): .. this, also add HImode.
> (VALID_SSE_REG_MODE): Add DI/HImode.
> * config/i386/mmx.md (*mov_internal): Add
> ix86_hard_reg_move_ok to condition.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pieces-memset-3.c: Remove xfail.
> * gcc.target/i386/pieces-memset-37.c: Remove xfail.
> * gcc.target/i386/pieces-memset-39.c: Remove xfail.
> ---
>  gcc/config/i386/i386.cc  |  9 ++---
>  gcc/config/i386/i386.h   | 16 
>  gcc/config/i386/mmx.md   |  6 --
>  gcc/testsuite/gcc.target/i386/pieces-memset-3.c  |  4 ++--
>  gcc/testsuite/gcc.target/i386/pieces-memset-37.c |  4 ++--
>  gcc/testsuite/gcc.target/i386/pieces-memset-39.c |  4 ++--
>  6 files changed, 20 insertions(+), 23 deletions(-)
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index 292b32c5e99..030c26965ab 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -19725,7 +19725,8 @@ ix86_can_change_mode_class (machine_mode from, 
> machine_mode to,
>  the vec_dupv4hi pattern.
>  NB: SSE2 can load 16bit data to sse register via pinsrw.  */
>int mov_size = MAYBE_SSE_CLASS_P (regclass) && TARGET_SSE2 ? 2 : 4;
> -  if (GET_MODE_SIZE (from) < mov_size)
> +  if (GET_MODE_SIZE (from) < mov_size
> + || GET_MODE_SIZE (to) < mov_size)
> return false;
>  }
>
> @@ -20089,12 +20090,6 @@ ix86_hard_regno_mode_ok (unsigned int regno, 
> machine_mode mode)
>   || VALID_AVX512F_SCALAR_MODE (mode)))
> return true;
>
> -  /* For AVX512FP16, vmovw supports movement of HImode
> -and HFmode between GPR and SSE registers.  */
> -  if (TARGET_AVX512FP16
> - && VALID_AVX512FP16_SCALAR_MODE (mode))
> -   return true;
> -
>/* For AVX-5124FMAPS or AVX-5124VNNIW
>  allow V64SF and V64SI modes for special regnos.  */
>if ((TARGET_AVX5124FMAPS || TARGET_AVX5124VNNIW)
> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> index 3869db8f2d3..d9a1fb0e420 100644
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -1017,11 +1017,9 @@ extern const char *host_detect_local_cpu (int argc, 
> const char **argv);
>(VALID_AVX256_REG_MODE (MODE) || (MODE) == OImode)
>
>  #define VALID_AVX512F_SCALAR_MODE(MODE)  
>   \
> -  ((MODE) == DImode || (MODE) == DFmode || (MODE) == SImode\
> -   || (MODE) == SFmode)
> -
> -#define VALID_AVX512FP16_SCALAR_MODE(MODE) \
> -  ((MODE) == HImode || (MODE) == HFmode)
> +  ((MODE) == DImode || (MODE) == DFmode  
>   \
> +   || (MODE) == SImode || (MODE) == SFmode \
> +   || (MODE) == HImode || (MODE) == HFmode || (MODE) == BFmode)
>
>  #define VALID_AVX512F_REG_MODE(MODE)   \
>((MODE) == V8DImode || (MODE) == V8DFmode || (MODE) == V64QImode \
> @@ -1045,13 +1043,15 @@ extern const char *host_detect_local_cpu (int argc, 
> const char **argv);
> || (MODE) == V8HFmode || (MODE) == V4HFmode || (MODE) == V2HFmode   \
> || (MODE) == V8BFmode || (MODE) == V4BFmode || (MODE) == V2BFmode   \
> || (MODE) == V4QImode || (MODE) == V2HImode || (MODE) == V1SImode   \
> -   || (MODE) == V2DImode || (MODE) == V2QImode || (MODE) == DFmode \
> -   || (MODE) == HFmode || (MODE) == BFmode)
> +   || (MODE) == V2DImode || (MODE) == V2QImode \
> +   || (MODE) == DFmode || (MODE) == 
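
As an illustration only (not part of the patch), the size guard from item 1 can be modelled in isolation. The function below is a hypothetical stand-in that covers just the fragment shown in the ix86_can_change_mode_class hunk: `sse_class` models MAYBE_SSE_CLASS_P (regclass), `have_sse2` models TARGET_SSE2, and sizes are in bytes. The surrounding conditions of the real hook are not modelled.

```c
#include <stdbool.h>

/* Returns true when neither mode is narrower than the smallest move the
   register class can perform: 2 bytes for SSE classes with SSE2 (via
   pinsrw, per the comment in the hunk), otherwise 4 bytes.  */
bool toy_mode_change_size_ok(int from_size, int to_size,
                             bool sse_class, bool have_sse2)
{
  int mov_size = (sse_class && have_sse2) ? 2 : 4;

  /* Before the patch only from_size was checked; the patch adds the
     symmetric check on to_size.  */
  if (from_size < mov_size || to_size < mov_size)
    return false;

  return true;
}
```

With this model, a 4-byte to 2-byte change is accepted for an SSE class with SSE2 but rejected without SSE2, and a change to a 1-byte mode is rejected in both cases.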

Re: [PATCH 7/7] riscv: Add support for str(n)cmp inline expansion

2022-11-20 Thread Kito Cheng via Gcc-patches
> > I would like to have a unified option interface,
> > maybe -m[no-]inline-str[n]cmp and -minline-str[n]cmp-limit.
>
> For the basic option (-m[no-]inline-str[n]cmp), I would punt to
> -fno-builtin-str[n]cmp.

-fno-builtin-* will also suppress middle-end optimization for those builtins.

see:
https://github.com/gcc-mirror/gcc/blob/master/gcc/tree-ssa-strlen.cc#L5372

and
https://github.com/gcc-mirror/gcc/blob/master/gcc/tree-loop-distribution.cc
for -fno-builtin-memcpy/memset/memmove

> The limit-one sounds more like a --param?

Using --param=inline-*-limit= sounds like a good idea; aarch64 and x86 have a few
options like that.

>
> > And add some option like this:
> > -minline-str[n]cmp=[bitmanip|vector|auto] in future,
>
> If we want to follow the lead of others, then x86 has a 
> -mstringop-strategy=alg

Using the same option name as x86 SGTM.

> > since I assume we'll have different versions of those things.
> >
> > >
> > > gcc/ChangeLog:
> > >
> > > * config/riscv/riscv-protos.h (riscv_expand_strn_compare): New
> > >   prototype.
> > > * config/riscv/riscv-string.cc (GEN_EMIT_HELPER3): New helper
> > >   macros.
> > > (GEN_EMIT_HELPER2): New helper macros.
> > > (expand_strncmp_zbb_sequence): New function.
> > > (riscv_emit_str_compare_zbb): New function.
> > > (riscv_expand_strn_compare): New function.
> > > * config/riscv/riscv.md (cmpstrnsi): Invoke expansion functions
> > >   for strn_compare.
> > > (cmpstrsi): Invoke expansion functions for strn_compare.
> > > * config/riscv/riscv.opt: Add new parameter
> > >   '-mstring-compare-inline-limit'.
> >
> > We need to document this option.


Re: [PATCH] RISC-V: Optimise adding a (larger than simm12) constant

2022-11-20 Thread Kito Cheng via Gcc-patches
> @@ -464,6 +464,60 @@
>[(set_attr "type" "arith")
> (set_attr "mode" "DI")])
>
> +(define_expand "add3"
> +  [(set (match_operand:GPR   0 "register_operand"  "=r,r")
> +   (plus:GPR (match_operand:GPR 1 "register_operand"  " r,r")
> + (match_operand:GPR 2 "addi_operand"  " r,I")))]

Is it possible to just define a predicate that accepts
register_operand and CONST_INT_P,
and then handle all cases in the add3 pattern?

My point is to put all the checks in one place:

e.g.
check TARGET_ZBA && const_arith_shifted123_operand (operands[2],
mode) in add3
rather than checking TARGET_ZBA in addi_operand and using sh[123]add in
add3 without a check.

Otherwise we also need to keep addi_operand and add3 in sync
once we have an extension XX that could improve addi codegen.


> +  ""
> +{
> +  if (arith_operand (operands[2], mode))
> +emit_insn (gen_riscv_add3 (operands[0], operands[1], operands[2]));
> +  else if (const_arith_2simm12_operand (operands[2], mode))

const_arith_2simm12_operand is only used once; could you inline the condition here?

> +{
> +  /* Split into two immediates that add up to the desired value:
> +   * e.g., break up "a + 2445" into:
> +   * addi  a0,a0,2047
> +   *addi   a0,a0,398
> +   */
> +
> +  HOST_WIDE_INT val = INTVAL (operands[2]);
> +  HOST_WIDE_INT saturated = HOST_WIDE_INT_M1U << (IMM_BITS - 1);
> +
> +  if (val >= 0)
> +saturated = ~saturated;
> +
> +  val -= saturated;
> +
> +  rtx tmp = gen_reg_rtx (mode);
> +  emit_insn (gen_riscv_add3 (tmp, operands[1], GEN_INT 
> (saturated)));
> +  emit_insn (gen_riscv_add3 (operands[0], tmp, GEN_INT (val)));
> +}
> +  else if (mode == word_mode
> +  && const_arith_shifted123_operand (operands[2], mode))

Same for const_arith_shifted123_operand.

> +{
> +  /* Use a sh[123]add and an immediate shifted down by 1, 2, or 3. */
> +
> +  HOST_WIDE_INT val = INTVAL (operands[2]);
> +  int shamt = ctz_hwi (val);
> +
> +  if (shamt > 3)
> +   shamt = 3;
> +
> +  rtx tmp = gen_reg_rtx (mode);
> +  emit_insn (gen_rtx_SET (tmp, GEN_INT (val >> shamt)));
> +
> +  /* We don't use gen_riscv_shNadd here, as it will only exist for
> +.  Instead we build up its canonical form directly.  */
> +  rtx shifted_imm = gen_rtx_ASHIFT (mode, tmp, GEN_INT (shamt));
> +  rtx shNadd = gen_rtx_PLUS (mode, shifted_imm, operands[1]);
> +  emit_insn (gen_rtx_SET (operands[0], shNadd));
> +}
> +  else
> +FAIL;

It seems a FAIL in add3 will cause problems; we need to either add something like:

  operands[2] = force_reg (mode, operands[2]);
  emit_insn (gen_rtx_SET (operands[0],
 gen_rtx_PLUS (mode,
   operands[1], operands[2])));

Or just use gcc_unreachable () if we keep using addi_operand to guard this pattern.
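
As an illustration only (not part of the patch), the two constant decompositions in the quoted expander can be checked with plain integer arithmetic. The names `split_2simm12` and `split_shifted123` are hypothetical; the logic follows the quoted code, with IMM_BITS assumed to be 12 as in the RISC-V backend. The first splits a constant into two signed 12-bit immediates, the second factors out a shift of at most 3 for use with sh[123]add.

```c
#include <assert.h>
#include <stdint.h>

#define IMM_BITS 12

struct imm_pair { int64_t first; int64_t second; };

/* Split val into two simm12 immediates that add up to val.  Only values
   in [-4096, 4094] can be expressed this way.  */
struct imm_pair split_2simm12(int64_t val)
{
  assert(val >= -4096 && val <= 4094);

  int64_t saturated = -((int64_t)1 << (IMM_BITS - 1));   /* -2048 */
  if (val >= 0)
    saturated = ~saturated;                              /* 2047 */

  struct imm_pair p = { saturated, val - saturated };
  return p;
}

struct shifted_imm { int shamt; int64_t imm; };

/* Factor val as imm << shamt with shamt = min (ctz (val), 3).  The sketch
   is restricted to positive val so the right shift is fully defined.  */
struct shifted_imm split_shifted123(int64_t val)
{
  assert(val > 0);

  int shamt = 0;
  while (shamt < 3 && ((val >> shamt) & 1) == 0)
    shamt++;

  struct shifted_imm s = { shamt, val >> shamt };
  return s;
}
```

For the example in the quoted comment, `split_2simm12 (2445)` yields 2047 and 398, matching the two addi instructions shown there.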


Re: [PATCH V2] Use subscalar mode to move struct block for parameter

2022-11-20 Thread Jiufu Guo via Gcc-patches
Jiufu Guo  writes:

> Hi,
>
> As mentioned in the previous version patch:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604646.html
> Suboptimal code is generated for "assigning from parameter" or
> "assigning to return value".
> This patch enhances the assignment from parameters for cases like the
> following:
> case1.c
> typedef struct SA {double a[3];long l; } A;
> A ret_arg (A a) {return a;}
> void st_arg (A a, A *p) {*p = a;}
>
> case2.c
> typedef struct SA {double a[3];} A;
> A ret_arg (A a) {return a;}
> void st_arg (A a, A *p) {*p = a;}
>
> For this patch, bootstrap and regtest pass on ppc64{,le}
> and x86_64.
> * Besides asking for help reviewing this patch, I would like to
> ask for comments on enhancing "assigning to returns".

I updated the patch to fix the issue for returns.  This patch
adds a flag DECL_USEDBY_RETURN_P to indicate whether a var is used
by a return stmt.  This patch fixes the issue in the expand pass only,
so we will try to update the patch to avoid this flag.
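
As an illustration only (not part of the patch), the size heuristic and the chunked copy in the expr.cc hunk of this patch can be modelled on plain memory. The names below are hypothetical: `toy_use_chunked_move` mirrors the condition that decides between store_expr and the per-chunk loop, `TOY_MAX_CHUNKS` mirrors `hurstc_num`, and `toy_chunked_move` stands in for the emit_move_insn pairs by going through a temporary for each chunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { TOY_MAX_CHUNKS = 8, TOY_MAX_MODE_SIZE = 16 };

/* True when a block of `size` bytes should be moved in chunks of
   `mode_size` bytes: at least one chunk, a whole number of chunks,
   and no more than TOY_MAX_CHUNKS of them.  */
bool toy_use_chunked_move(size_t size, size_t mode_size)
{
  if (mode_size == 0)
    return false;
  return size >= mode_size
         && size % mode_size == 0
         && size <= mode_size * TOY_MAX_CHUNKS;
}

/* Copy the block chunk by chunk through a temporary, the way the patch
   moves each piece through a fresh pseudo register.  */
void toy_chunked_move(void *dest, const void *src,
                      size_t size, size_t mode_size)
{
  assert(mode_size <= TOY_MAX_MODE_SIZE);
  assert(toy_use_chunked_move(size, mode_size));

  unsigned char temp[TOY_MAX_MODE_SIZE];
  for (size_t i = 0; i < size / mode_size; i++)
    {
      memcpy(temp, (const unsigned char *)src + i * mode_size, mode_size);
      memcpy((unsigned char *)dest + i * mode_size, temp, mode_size);
    }
}
```

Under this model, assuming 8-byte double and long, the 32-byte struct from case1.c and the 24-byte struct from case2.c both qualify with an 8-byte chunk, while a 20-byte or 72-byte block would fall back to the ordinary block move.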

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index dd29c03..09b8ec64cea 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -2158,6 +2158,20 @@ expand_used_vars (bitmap forced_stack_vars)
 frame_phase = off ? align - off : 0;
   }
 
+  /* Collect VARs on returns.  */
+  if (DECL_RESULT (current_function_decl))
+{
+  edge_iterator ei;
+  edge e;
+  FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds)
+   if (greturn *ret = safe_dyn_cast <greturn *> (last_stmt (e->src)))
+ {
+   tree val = gimple_return_retval (ret);
+   if (val && VAR_P (val))
+ DECL_USEDBY_RETURN_P (val) = 1;
+ }
+}
+
   /* Set TREE_USED on all variables in the local_decls.  */
   FOR_EACH_LOCAL_DECL (cfun, i, var)
 TREE_USED (var) = 1;
diff --git a/gcc/expr.cc b/gcc/expr.cc
index d9407432ea5..20973649963 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -6045,6 +6045,52 @@ expand_assignment (tree to, tree from, bool nontemporal)
   return;
 }
 
+  if ((TREE_CODE (from) == PARM_DECL && DECL_INCOMING_RTL (from)
+   && TYPE_MODE (TREE_TYPE (from)) == BLKmode
+   && (GET_CODE (DECL_INCOMING_RTL (from)) == PARALLEL
+  || REG_P (DECL_INCOMING_RTL (from))))
+  || (VAR_P (to) && DECL_USEDBY_RETURN_P (to)
+ && TYPE_MODE (TREE_TYPE (to)) == BLKmode
+ && GET_CODE (DECL_RTL (DECL_RESULT (current_function_decl)))
+  == PARALLEL))
+{
+  push_temp_slots ();
+  rtx par_ret;
+  machine_mode mode;
+  par_ret = TREE_CODE (from) == PARM_DECL
+ ? DECL_INCOMING_RTL (from)
+ : DECL_RTL (DECL_RESULT (current_function_decl));
+  mode = GET_CODE (par_ret) == PARALLEL
+  ? GET_MODE (XEXP (XVECEXP (par_ret, 0, 0), 0))
+  : word_mode;
+  int mode_size = GET_MODE_SIZE (mode).to_constant ();
+  int size = INTVAL (expr_size (from));
+
+  /* Whether and how the parameter uses a submode depends on the size and
+position of the parameter.  Here we use a heuristic number.  */
+  int hurstc_num = 8;
+  if (size < mode_size || (size % mode_size) != 0
+ || size > (mode_size * hurstc_num))
+   result = store_expr (from, to_rtx, 0, nontemporal, false);
+  else
+   {
+ rtx from_rtx
+   = expand_expr (from, NULL_RTX, GET_MODE (to_rtx), EXPAND_NORMAL);
+ for (int i = 0; i < size / mode_size; i++)
+   {
+ rtx temp = gen_reg_rtx (mode);
+ rtx src = adjust_address (from_rtx, mode, mode_size * i);
+ rtx dest = adjust_address (to_rtx, mode, mode_size * i);
+ emit_move_insn (temp, src);
+ emit_move_insn (dest, temp);
+   }
+ result = to_rtx;
+   }
+  preserve_temp_slots (result);
+  pop_temp_slots ();
+  return;
+}
+
   /* Compute FROM and store the value in the rtx we got.  */
 
   push_temp_slots ();
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index af75522504f..be42e1464de 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -1808,7 +1808,8 @@ struct GTY(()) tree_decl_common {
  In VAR_DECL, PARM_DECL and RESULT_DECL, this is
  DECL_HAS_VALUE_EXPR_P.  */
   unsigned decl_flag_2 : 1;
-  /* In FIELD_DECL, this is DECL_PADDING_P.  */
+  /* In FIELD_DECL, this is DECL_PADDING_P
+ In VAR_DECL, this is DECL_USEDBY_RETURN_P.  */
   unsigned decl_flag_3 : 1;
   /* Logically, these two would go in a theoretical base shared by var and
  parm decl. */
diff --git a/gcc/tree.h b/gcc/tree.h
index a863d2e50e5..73c0314dac1 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -3011,6 +3011,10 @@ extern void decl_value_expr_insert (tree, tree);
 #define DECL_PADDING_P(NODE) \
   (FIELD_DECL_CHECK (NODE)->decl_common.decl_flag_3)
 
+/* Used in a VAR_DECL to indicate that it is used by a return stmt.  */
+#define DECL_USEDBY_RETURN_P(NODE) \
+  (VAR_DECL_CHECK (NODE)->decl_common.decl_flag_3)
+
 /* Used in a 

[PATCH] [x86] Some tidy up for RA related hooks.

2022-11-20 Thread liuhongt via Gcc-patches
While working on [1] for ix86_can_change_mode_class, I noticed some
incorrectness/misoptimization in the current RA-related hooks.
This patch tries to fix and tidy them up:

1. ix86_can_change_mode_class also needs to reject the change when the
size of TO is less than TARGET_SSE2 ? 2 : 4.
2. Merge VALID_AVX512FP16_SCALAR_MODE plus BFmode
into VALID_AVX512F_SCALAR_MODE: 16-bit data moves are supported from
SSE2 onwards, so the AVX512FP16 condition is not needed for those EVEX
SSE registers.
3. Allow DI/HImode in SSE registers from SSE2 onwards, just like
SImode, since 16-bit data moves between SSE registers and GPRs are
supported from SSE2 onwards.  This helps RA handle cases like
(subreg:HI (reg:V8HI) 0); otherwise RA will spill it.  This enables the
optimization for pieces-memset-{3,37,39}.c.
4. Guard the 64/32-bit vector move patterns with ix86_hard_reg_move_ok.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606373.html
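
As a rough illustration of item 1, the size guard can be modelled as a
small standalone C function.  This is only a sketch: the function name
mode_change_size_ok and its int parameters are hypothetical stand-ins for
GET_MODE_SIZE, MAYBE_SSE_CLASS_P and TARGET_SSE2; the logic follows the
ix86_can_change_mode_class hunk in this patch.

```c
/* Illustrative model only; not the real hook.  */
static int
mode_change_size_ok (int from_size, int to_size,
		     int maybe_sse_class, int target_sse2)
{
  /* SSE2 can load 16-bit data into an SSE register via pinsrw, so the
     minimum movable size is 2 bytes there and 4 bytes otherwise.  */
  int mov_size = (maybe_sse_class && target_sse2) ? 2 : 4;

  /* Before the patch only FROM was checked; now TO is checked as well.  */
  if (from_size < mov_size || to_size < mov_size)
    return 0;
  return 1;
}
```

For example, changing a 16-byte vector mode to a 1-byte mode is rejected
in this model because TO is smaller than the minimum move size.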

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

* config/i386/i386.cc (ix86_can_change_mode_class): Also guard
size of TO.
(ix86_hard_regno_mode_ok): Remove VALID_AVX512FP16_SCALAR_MODE
* config/i386/i386.h (VALID_AVX512FP16_SCALAR_MODE): Merged to
..
(VALID_AVX512F_SCALAR_MODE): .. this, also add HImode.
(VALID_SSE_REG_MODE): Add DI/HImode.
* config/i386/mmx.md (*mov_internal): Add
ix86_hard_reg_move_ok to condition.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pieces-memset-3.c: Remove xfail.
* gcc.target/i386/pieces-memset-37.c: Remove xfail.
* gcc.target/i386/pieces-memset-39.c: Remove xfail.
---
 gcc/config/i386/i386.cc  |  9 ++---
 gcc/config/i386/i386.h   | 16 
 gcc/config/i386/mmx.md   |  6 --
 gcc/testsuite/gcc.target/i386/pieces-memset-3.c  |  4 ++--
 gcc/testsuite/gcc.target/i386/pieces-memset-37.c |  4 ++--
 gcc/testsuite/gcc.target/i386/pieces-memset-39.c |  4 ++--
 6 files changed, 20 insertions(+), 23 deletions(-)

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 292b32c5e99..030c26965ab 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -19725,7 +19725,8 @@ ix86_can_change_mode_class (machine_mode from, 
machine_mode to,
 the vec_dupv4hi pattern.
 NB: SSE2 can load 16bit data to sse register via pinsrw.  */
   int mov_size = MAYBE_SSE_CLASS_P (regclass) && TARGET_SSE2 ? 2 : 4;
-  if (GET_MODE_SIZE (from) < mov_size)
+  if (GET_MODE_SIZE (from) < mov_size
+ || GET_MODE_SIZE (to) < mov_size)
return false;
 }
 
@@ -20089,12 +20090,6 @@ ix86_hard_regno_mode_ok (unsigned int regno, 
machine_mode mode)
  || VALID_AVX512F_SCALAR_MODE (mode)))
return true;
 
-  /* For AVX512FP16, vmovw supports movement of HImode
-and HFmode between GPR and SSE registers.  */
-  if (TARGET_AVX512FP16
- && VALID_AVX512FP16_SCALAR_MODE (mode))
-   return true;
-
   /* For AVX-5124FMAPS or AVX-5124VNNIW
 allow V64SF and V64SI modes for special regnos.  */
   if ((TARGET_AVX5124FMAPS || TARGET_AVX5124VNNIW)
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 3869db8f2d3..d9a1fb0e420 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -1017,11 +1017,9 @@ extern const char *host_detect_local_cpu (int argc, 
const char **argv);
   (VALID_AVX256_REG_MODE (MODE) || (MODE) == OImode)
 
 #define VALID_AVX512F_SCALAR_MODE(MODE)
\
-  ((MODE) == DImode || (MODE) == DFmode || (MODE) == SImode\
-   || (MODE) == SFmode)
-
-#define VALID_AVX512FP16_SCALAR_MODE(MODE) \
-  ((MODE) == HImode || (MODE) == HFmode)
+  ((MODE) == DImode || (MODE) == DFmode
\
+   || (MODE) == SImode || (MODE) == SFmode \
+   || (MODE) == HImode || (MODE) == HFmode || (MODE) == BFmode)
 
 #define VALID_AVX512F_REG_MODE(MODE)   \
   ((MODE) == V8DImode || (MODE) == V8DFmode || (MODE) == V64QImode \
@@ -1045,13 +1043,15 @@ extern const char *host_detect_local_cpu (int argc, 
const char **argv);
|| (MODE) == V8HFmode || (MODE) == V4HFmode || (MODE) == V2HFmode   \
|| (MODE) == V8BFmode || (MODE) == V4BFmode || (MODE) == V2BFmode   \
|| (MODE) == V4QImode || (MODE) == V2HImode || (MODE) == V1SImode   \
-   || (MODE) == V2DImode || (MODE) == V2QImode || (MODE) == DFmode \
-   || (MODE) == HFmode || (MODE) == BFmode)
+   || (MODE) == V2DImode || (MODE) == V2QImode \
+   || (MODE) == DFmode || (MODE) == DImode \
+   || (MODE) == HFmode || (MODE) == BFmode || (MODE) == HImode)
 
 #define VALID_SSE_REG_MODE(MODE)   \
   ((MODE) == V1TImode || (MODE) == TImode  \
   

Re: [PATCH 1/2] rs6000: Emit vector fp comparison directly in rs6000_emit_vector_compare

2022-11-20 Thread Kewen.Lin via Gcc-patches
Hi Segher,

on 2022/11/18 23:10, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Nov 17, 2022 at 02:59:00PM +0800, Kewen.Lin wrote:
>> on 2022/11/17 02:44, Segher Boessenkool wrote:
>>> On Wed, Nov 16, 2022 at 02:48:25PM +0800, Kewen.Lin wrote:
* config/rs6000/rs6000.cc (rs6000_emit_vector_compare_inner): Remove
float only comparison operators.
>>>
>>> Why?  Is that correct?  Your mail says nothing about this :-(
>>>
>>> Is there any testcase that covers this, and that shows things still
>>> generate the same code?
>>>
>>
>> Sorry for the unclear description, I thought mistakenly that it's
>> probably straightforward.
>>
>> With the change in this patch, all 14 vector float comparison operators
>> (unordered/ordered/eq/ne/gt/lt/ge/le/ungt/unge/unlt/unle/uneq/ltgt)
>> would be handled early in rs6000_emit_vector_compare.
>>
>> For unordered/ordered/ltgt/uneq, the new way is exactly the same
>> as what we do in rs6000_emit_vector_compare_inner, it means there is
>> no chance to get into rs6000_emit_vector_compare_inner with any of them.
> 
> Ah!  In that case, please add an assert there.  It helps catch problems,
> but much more importantly even, it helps the reader understand what is
> going on :-)

Good idea, will do.

> 
>> For eq/ge/gt, it's the same too, but they are shared with vector integer
>> comparison, I just left them alone here.  Just noticed we can remove ge
>> safely too as it's guarded with !MODE_VECTOR_INT.
> 
> ge is nasty for float, it means something different with and without
> -ffast-math (with fast-math ge means not lt, le means not gt; both can
> be done with a simple single condition, no cror needed.  (Compare to ne
> which is the same with and without -ffast-math, that is because it has a
> "not" in its definition!)
> 

It's true for scalar float comparison, but the context here is for vector
comparison, the result of comparison is still vector (of boolean), and we
have the corresponding vector comparison instruction for ge, so I think it
should be fine here.

>> For ne/ungt/unlt/unge/unle, rs6000_emit_vector_compare changes the code
>> with reverse_condition_maybe_unordered and invert the result, it's the
>> same as what we have in vector.md.
>>
>> ; unge(a,b) = ~lt(a,b)
>> ; unle(a,b) = ~gt(a,b)
>> ; ne(a,b)   = ~eq(a,b)
>> ; ungt(a,b) = ~le(a,b)
>> ; unlt(a,b) = ~ge(a,b)
> 
> But for these last two do we generate identical code still?  Since
> forever we have only use cror here (with CCEQ), not crnor etc. (and will
> CCEQ still do the correct thing always then?)

For ge (~ge), yes; for le (~le), no, as explained in the earlier text quoted below.
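
The five identities quoted above can be checked on scalars with a small C
sketch.  This is only an illustration under IEEE semantics (no
-ffast-math); the helper names unge/unle/ungt/unlt and identities_hold are
made up here and are not GCC functions.

```c
#include <math.h>

/* "Unordered or ..." comparisons, spelled out with isunordered.  */
static int unge (double a, double b) { return isunordered (a, b) || a >= b; }
static int unle (double a, double b) { return isunordered (a, b) || a <= b; }
static int ungt (double a, double b) { return isunordered (a, b) || a > b; }
static int unlt (double a, double b) { return isunordered (a, b) || a < b; }

/* Return 1 if every identity from the list holds for this pair.  */
static int
identities_hold (double a, double b)
{
  return unge (a, b) == !(a < b)       /* unge(a,b) = ~lt(a,b) */
	 && unle (a, b) == !(a > b)    /* unle(a,b) = ~gt(a,b) */
	 && (a != b) == !(a == b)      /* ne(a,b)   = ~eq(a,b) */
	 && ungt (a, b) == !(a <= b)   /* ungt(a,b) = ~le(a,b) */
	 && unlt (a, b) == !(a >= b);  /* unlt(a,b) = ~ge(a,b) */
}
```

The interesting inputs are the ones with a NaN operand: every ordered
comparison is then false, so each inverted form is true, matching the
"unordered or ..." meaning.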

> 
>> Then eq/ge/gt on the right side would match the cases that were mentioned
>> above.  So we just need to focus on lt and le then.
>>
>> For lt, rs6000_emit_vector_compare swaps operands and the operator to gt,
>> it's the same as what we have in vector.md:
>>
>> ; lt(a,b)   = gt(b,a)
>>
>> , and further matches the case mentioned above.
>>
>> As to le, rs6000_emit_vector_compare tries to split it into lt IOR eq,
>> and further handle lt recursively, that is:
>>le = lt(a,b) || eq(a,b)
>>   = gt(b,a) || eq(a,b)
>>
>> actually this is worse than what vector.md supports:
>>
>> ; le(a,b)   = ge(b,a)
>>
> >> In short, the function rs6000_emit_vector_compare_inner is only called
> >> twice in rs6000_emit_vector_compare, there is no chance to enter
>> rs6000_emit_vector_compare_inner with codes unordered/ordered/ltgt/uneq
>> any more, I think it's safe to make the change in function
>> rs6000_emit_vector_compare_inner.  Besides, the proposed way to handle
>> vector float comparison can improve slightly for UNGT and LE handlings.
> 
> Thanks for the explanation!
> 
> Can you do this in multiple steps, which will make it much easier to
> review, and to spot the problem if some unexpected problem shows up?

Sure, I'll try my best to separate it into some steps and show how it
evolves gradually.
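
To make the le case concrete, here is a lane-wise C sketch of the two
forms discussed in the quoted text: the split form gt(b,a) || eq(a,b) and
the direct form ge(b,a).  It only models the per-lane boolean result as an
all-ones or all-zeros mask; the names le_split, le_direct and
le_forms_agree and the 4-lane layout are assumptions for illustration, not
rs6000 code.

```c
#include <math.h>

#define LANES 4

/* le(a,b) as lt(a,b) || eq(a,b), with lt(a,b) rewritten as gt(b,a):
   two compares plus an IOR per lane.  */
static void
le_split (const float *a, const float *b, unsigned int *mask)
{
  for (int i = 0; i < LANES; i++)
    mask[i] = ((b[i] > a[i]) || (a[i] == b[i])) ? ~0u : 0u;
}

/* le(a,b) as ge(b,a): a single compare per lane.  */
static void
le_direct (const float *a, const float *b, unsigned int *mask)
{
  for (int i = 0; i < LANES; i++)
    mask[i] = (b[i] >= a[i]) ? ~0u : 0u;
}

/* Return 1 if both forms produce the same mask for these inputs.  */
static int
le_forms_agree (const float *a, const float *b)
{
  unsigned int m1[LANES], m2[LANES];
  le_split (a, b, m1);
  le_direct (a, b, m2);
  for (int i = 0; i < LANES; i++)
    if (m1[i] != m2[i])
      return 0;
  return 1;
}
```

Both forms give the same mask, including for lanes holding a NaN, so the
difference is only in the number of compare instructions needed.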

> 
>> I constructed a test case, compiled with option -O2 -ftree-vectorize
>> -fno-vect-cost-model on ppc64le, which goes into this function
>> rs6000_emit_vector_compare with all 14 vector float comparison codes,
>> the assembly of most functions doesn't change after this patch,
>> excepting for test_UNGT_{float,double} and test_LE_{float,double}.
> 
> This is a separate change, so it should be a separate patch; the other
> patches will then show no changes in generated code at all.

Good point, will separate it.

> 
>> Maybe it's good to add one test case with function 
>> test_{UNGT,LE}_{float,double}
>> and scan not xvcmp{gt,eq}[sd]p.
> 
> In the patch that changes code gen for those, sure :-)
> 

Thanks for all the comments again.

BR,
Kewen


Re: [PATCH] RISC-V: Add the Zihpm and Zicntr extensions

2022-11-20 Thread Kito Cheng via Gcc-patches
> So the idea here is just to define the extension so that it gets defined
> in the ISA strings and passed through to the assembler, right?

That will also define the arch test macro:

https://github.com/riscv-non-isa/riscv-c-api-doc/blob/master/riscv-c-api.md#architecture-extension-test-macro
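
A minimal sketch of how user code might test for these extensions.  The
macro names __riscv_zicntr and __riscv_zihpm are an assumption based on
the __riscv_<ext> naming convention in the document linked above; whether
a given compiler and -march string define them should be checked rather
than taken from this example.  The helper names are made up.

```c
/* Return 1 if the compiler advertises Zicntr for this translation unit.  */
static int
have_zicntr (void)
{
#ifdef __riscv_zicntr
  return 1;
#else
  return 0;
#endif
}

/* Return 1 if the compiler advertises Zihpm for this translation unit.  */
static int
have_zihpm (void)
{
#ifdef __riscv_zihpm
  return 1;
#else
  return 0;
#endif
}
```

On a non-RISC-V target, or with an -march string that lacks the
extensions, both helpers simply return 0.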

On Mon, Nov 21, 2022 at 12:20 AM Jeff Law via Gcc-patches
 wrote:
>
>
> On 11/8/22 20:00, Palmer Dabbelt wrote:
> > These extensions were recently frozen [1].  As per Andrew's post [2]
> > we're meant to ignore these in software, this just adds them to the list
> > of allowed extensions and otherwise ignores them.  I added these under
> > SPEC_CLASS_NONE even though the PDF lists them as 20190614 because it
> > seems pointless to add another spec class just to accept two extensions
> > we then ignore.
> >
> > 1: 
> > https://groups.google.com/a/groups.riscv.org/g/isa-dev/c/HZGoqP1eyps/m/GTNKRLJoAQAJ
> > 2: 
> > https://groups.google.com/a/groups.riscv.org/g/sw-dev/c/QKjQhChrq9Q/m/7gqdkctgAgAJ
> >
> > gcc/ChangeLog
> >
> >   * common/config/riscv/riscv-common.cc: Add Zihpm and Zicntr
> >   extensions.
>
> So the idea here is just to define the extension so that it gets defined
> in the ISA strings and passed through to the assembler, right?
>
> Jeff


Re: [PATCH] i386: Only enable small loop unrolling in backend [PR 107602]

2022-11-20 Thread Hongtao Liu via Gcc-patches
On Mon, Nov 21, 2022 at 9:01 AM Liu, Hongtao via Gcc-patches
 wrote:
>
>
>
> > -Original Message-
> > From: Wang, Hongyu 
> > Sent: Saturday, November 19, 2022 2:26 PM
> > To: gcc-patches@gcc.gnu.org
> > Cc: richard.guent...@gmail.com; ubiz...@gmail.com; Liu, Hongtao
> > 
> > Subject: [PATCH] i386: Only enable small loop unrolling in backend [PR 
> > 107602]
> >
> > Hi,
> >
> > Followed by the discussion in pr107602, -munroll-only-small-loops Does not
> PR107692?
> > turns on/off -funroll-loops, and current check in 
> > pass_rtl_unroll_loops::gate
> > would cause -funroll-loops do not take effect. Revert the change about
That's not necessarily right: unroll_factor will be set to 1 with
-fno-unroll-loops, which is exactly what -fno-unroll-loops means.
> > targetm.loop_unroll_adjust and apply the backend option change to strictly
> > follow the rule that -funroll-loops takes full control of loop unrolling, 
> > and
> > munroll-only-small-loops just change its behavior to unroll small size 
> > loops.
> >
> > Bootstrapped and regtested on x86-64-pc-linux-gnu.
> >
> > Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> >   PR target/107602
> >   * common/config/i386/i386-common.cc (ix86_optimization_table):
> >   Enable loop unroll O2, disable -fweb and -frename-registers
> >   by default.
> >   * config/i386/i386-options.cc
> >   (ix86_override_options_after_change):
> >   Disable small loop unroll when funroll-loops enabled, reset
> >   cunroll_grow_size when it is not explicitly enabled.
> >   (ix86_option_override_internal): Call
> >   ix86_override_options_after_change instead of calling
> >   ix86_recompute_optlev_based_flags and ix86_default_align
> >   separately.
> >   * config/i386/i386.cc (ix86_loop_unroll_adjust): Adjust unroll
> >   factor if -munroll-only-small-loops enabled.
> >   * loop-init.cc (pass_rtl_unroll_loops::gate): Do not enable
> >   loop unrolling for -O2-speed.
> >   (pass_rtl_unroll_loops::execute): Remove
> >   targetm.loop_unroll_adjust check.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   PR target/107602
> >   * gcc.target/i386/pr86270.c: Add -fno-unroll-loops.
> >   * gcc.target/i386/pr93002.c: Likewise.
> > ---
> >  gcc/common/config/i386/i386-common.cc   |  8 ++
> >  gcc/config/i386/i386-options.cc | 34 ++---
> >  gcc/config/i386/i386.cc | 18 -
> >  gcc/loop-init.cc| 11 +++-
> >  gcc/testsuite/gcc.target/i386/pr86270.c |  2 +-
> > gcc/testsuite/gcc.target/i386/pr93002.c |  2 +-
> >  6 files changed, 49 insertions(+), 26 deletions(-)
> >
> > diff --git a/gcc/common/config/i386/i386-common.cc
> > b/gcc/common/config/i386/i386-common.cc
> > index 6ce2a588adc..660a977b68b 100644
> > --- a/gcc/common/config/i386/i386-common.cc
> > +++ b/gcc/common/config/i386/i386-common.cc
> > @@ -1808,7 +1808,15 @@ static const struct default_options
> > ix86_option_optimization_table[] =
> >  /* The STC algorithm produces the smallest code at -Os, for x86.  */
> >  { OPT_LEVELS_2_PLUS, OPT_freorder_blocks_algorithm_, NULL,
> >REORDER_BLOCKS_ALGORITHM_STC },
> > +
> > +/* Turn on -funroll-loops with -munroll-only-small-loops to enable 
> > small
> > +   loop unrolling at -O2.  */
> > +{ OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_funroll_loops, NULL, 1 },
> >  { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_munroll_only_small_loops, NULL,
> > 1 },
> > +/* Turns off -frename-registers and -fweb which are enabled by
> > +   funroll-loops.  */
> > +{ OPT_LEVELS_ALL, OPT_frename_registers, NULL, 0 },
> > +{ OPT_LEVELS_ALL, OPT_fweb, NULL, 0 },
> >  /* Turn off -fschedule-insns by default.  It tends to make the
> > problem with not enough registers even worse.  */
> >  { OPT_LEVELS_ALL, OPT_fschedule_insns, NULL, 0 }, diff --git
> > a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc index
> > e5c77f3a84d..bc1d36e36a8 100644
> > --- a/gcc/config/i386/i386-options.cc
> > +++ b/gcc/config/i386/i386-options.cc
> > @@ -1838,8 +1838,37 @@ ix86_recompute_optlev_based_flags (struct
> > gcc_options *opts,  void  ix86_override_options_after_change (void)  {
> > +  /* Default align_* from the processor table.  */
> >ix86_default_align (&global_options);
> > +
> >ix86_recompute_optlev_based_flags (&global_options, &global_options_set);
> > +
> > +  /* Disable unrolling small loops when there's explicit
> > + -f{,no}unroll-loop.  */
> > +  if ((OPTION_SET_P (flag_unroll_loops))
> > + || (OPTION_SET_P (flag_unroll_all_loops)
> > +  && flag_unroll_all_loops))
> > +{
> > +  if (!OPTION_SET_P (ix86_unroll_only_small_loops))
> > + ix86_unroll_only_small_loops = 0;
> > +  /* Re-enable -frename-registers and -fweb if funroll-loops
> > +  enabled.  */
> > +  if (!OPTION_SET_P (flag_web))
> > + flag_web = flag_unroll_loops;
> > +  if (!OPTION_SET_P (flag_rename_registers))

Re: [PATCH] i386: Uglify some local identifiers in *intrin.h [PR107748]

2022-11-20 Thread Hongtao Liu via Gcc-patches
On Sat, Nov 19, 2022 at 4:38 PM Jakub Jelinek  wrote:
>
> Hi!
>
> While reporting PR107748 (where is a problem with non-uglified names,
> but I've left it out because it needs fixing anyway), I've noticed
> various spots where identifiers in *intrin.h headers weren't uglified.
> The following patch fixed those that are related to unions (I've grepped
> for [a-zA-Z]\.[a-zA-Z] spots).
> The reason we need those to be uglified is the same as why the arguments
> of the inlines are __ prefixed and most of automatic vars in the inlines
> - say a, v or u aren't part of implementation namespace and so users could
> #define u whatever->something
> #include 
> and it should still work, as long as u is not e.g. one of the names
> of the functions/macros the header provides (_mm* etc.).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
Ok, thanks.
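
To illustrate the point about user macros, here is a small C sketch.  The
struct, the pointer named whatever and the helper bits_of_float are
hypothetical; bits_of_float only mimics the shape of the header inlines
and is not one of the intrinsics.  Names starting with a double underscore
are reserved for the implementation, so they appear here purely to imitate
a header.  The expected bit pattern assumes IEEE binary32 float and a
32-bit unsigned int.

```c
/* User code: `u` is not a reserved name, so this macro is legal.  */
struct user_state { int something; };
static struct user_state state_obj = { 42 };
static struct user_state *whatever = &state_obj;
#define u whatever->something

/* Header-style helper written with uglified names.  Had the union
   variable been called `u`, the macro above would have rewritten it
   and the helper would no longer compile.  */
static inline unsigned int
bits_of_float (float __A)
{
  union
  {
    float __f;
    unsigned int __i;
  } __u = { .__f = __A };
  return __u.__i;
}

/* The user macro keeps working as the user intended.  */
static int
user_value (void)
{
  return u;
}
```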
>
> 2022-11-19  Jakub Jelinek  
>
> PR target/107748
> * config/i386/avx512fp16intrin.h (_mm512_castph512_ph128,
> _mm512_castph512_ph256, _mm512_castph128_ph512,
> _mm512_castph256_ph512, _mm512_set1_pch): Uglify names of local
> variables and union members.
> * config/i386/avx512fp16vlintrin.h (_mm256_castph256_ph128,
> _mm256_castph128_ph256, _mm256_set1_pch, _mm_set1_pch): Likewise.
> * config/i386/smmintrin.h (_mm_extract_ps): Likewise.
>
> --- gcc/config/i386/avx512fp16intrin.h.jj   2022-09-27 08:03:26.974984702 
> +0200
> +++ gcc/config/i386/avx512fp16intrin.h  2022-11-18 12:51:10.668957336 +0100
> @@ -272,10 +272,10 @@ _mm512_castph512_ph128 (__m512h __A)
>  {
>union
>{
> -__m128h a[4];
> -__m512h v;
> -  } u = { .v = __A };
> -  return u.a[0];
> +__m128h __a[4];
> +__m512h __v;
> +  } __u = { .__v = __A };
> +  return __u.__a[0];
>  }
>
>  extern __inline __m256h
> @@ -284,10 +284,10 @@ _mm512_castph512_ph256 (__m512h __A)
>  {
>union
>{
> -__m256h a[2];
> -__m512h v;
> -  } u = { .v = __A };
> -  return u.a[0];
> +__m256h __a[2];
> +__m512h __v;
> +  } __u = { .__v = __A };
> +  return __u.__a[0];
>  }
>
>  extern __inline __m512h
> @@ -296,11 +296,11 @@ _mm512_castph128_ph512 (__m128h __A)
>  {
>union
>{
> -__m128h a[4];
> -__m512h v;
> -  } u;
> -  u.a[0] = __A;
> -  return u.v;
> +__m128h __a[4];
> +__m512h __v;
> +  } __u;
> +  __u.__a[0] = __A;
> +  return __u.__v;
>  }
>
>  extern __inline __m512h
> @@ -309,11 +309,11 @@ _mm512_castph256_ph512 (__m256h __A)
>  {
>union
>{
> -__m256h a[2];
> -__m512h v;
> -  } u;
> -  u.a[0] = __A;
> -  return u.v;
> +__m256h __a[2];
> +__m512h __v;
> +  } __u;
> +  __u.__a[0] = __A;
> +  return __u.__v;
>  }
>
>  extern __inline __m512h
> @@ -7156,11 +7156,11 @@ _mm512_set1_pch (_Float16 _Complex __A)
>  {
>union
>{
> -_Float16 _Complex a;
> -float b;
> -  } u = { .a = __A};
> +_Float16 _Complex __a;
> +float __b;
> +  } __u = { .__a = __A};
>
> -  return (__m512h) _mm512_set1_ps (u.b);
> +  return (__m512h) _mm512_set1_ps (__u.__b);
>  }
>
>  // intrinsics below are alias for f*mul_*ch
> --- gcc/config/i386/avx512fp16vlintrin.h.jj 2022-01-11 23:11:21.760299007 
> +0100
> +++ gcc/config/i386/avx512fp16vlintrin.h2022-11-18 12:52:23.242951737 
> +0100
> @@ -124,10 +124,10 @@ _mm256_castph256_ph128 (__m256h __A)
>  {
>union
>{
> -__m128h a[2];
> -__m256h v;
> -  } u = { .v = __A };
> -  return u.a[0];
> +__m128h __a[2];
> +__m256h __v;
> +  } __u = { .__v = __A };
> +  return __u.__a[0];
>  }
>
>  extern __inline __m256h
> @@ -136,11 +136,11 @@ _mm256_castph128_ph256 (__m128h __A)
>  {
>union
>{
> -__m128h a[2];
> -__m256h v;
> -  } u;
> -  u.a[0] = __A;
> -  return u.v;
> +__m128h __a[2];
> +__m256h __v;
> +  } __u;
> +  __u.__a[0] = __A;
> +  return __u.__v;
>  }
>
>  extern __inline __m256h
> @@ -3317,11 +3317,11 @@ _mm256_set1_pch (_Float16 _Complex __A)
>  {
>union
>{
> -_Float16 _Complex a;
> -float b;
> -  } u = { .a = __A };
> +_Float16 _Complex __a;
> +float __b;
> +  } __u = { .__a = __A };
>
> -  return (__m256h) _mm256_set1_ps (u.b);
> +  return (__m256h) _mm256_set1_ps (__u.__b);
>  }
>
>  extern __inline __m128h
> @@ -3330,11 +3330,11 @@ _mm_set1_pch (_Float16 _Complex __A)
>  {
>union
>{
> -_Float16 _Complex a;
> -float b;
> -  } u = { .a = __A };
> +_Float16 _Complex __a;
> +float __b;
> +  } __u = { .__a = __A };
>
> -  return (__m128h) _mm_set1_ps (u.b);
> +  return (__m128h) _mm_set1_ps (__u.__b);
>  }
>
>  // intrinsics below are alias for f*mul_*ch
> --- gcc/config/i386/smmintrin.h.jj  2022-04-19 07:20:56.429171229 +0200
> +++ gcc/config/i386/smmintrin.h 2022-11-18 12:53:26.226079037 +0100
> @@ -365,17 +365,18 @@ _mm_insert_ps (__m128 __D, __m128 __S, c
>  extern __inline int __attribute__((__gnu_inline__, __always_inline__, 
> __artificial__))
>  _mm_extract_ps (__m128 __X, const int __N)

RE: [PATCH] i386: Only enable small loop unrolling in backend [PR 107602]

2022-11-20 Thread Liu, Hongtao via Gcc-patches



> -Original Message-
> From: Wang, Hongyu 
> Sent: Saturday, November 19, 2022 2:26 PM
> To: gcc-patches@gcc.gnu.org
> Cc: richard.guent...@gmail.com; ubiz...@gmail.com; Liu, Hongtao
> 
> Subject: [PATCH] i386: Only enable small loop unrolling in backend [PR 107602]
> 
> Hi,
> 
> Followed by the discussion in pr107602, -munroll-only-small-loops Does not
PR107692?
> turns on/off -funroll-loops, and current check in pass_rtl_unroll_loops::gate
> would cause -funroll-loops do not take effect. Revert the change about
> targetm.loop_unroll_adjust and apply the backend option change to strictly
> follow the rule that -funroll-loops takes full control of loop unrolling, and
> munroll-only-small-loops just change its behavior to unroll small size loops.
> 
> Bootstrapped and regtested on x86-64-pc-linux-gnu.
> 
> Ok for trunk?
> 
> gcc/ChangeLog:
> 
>   PR target/107602
>   * common/config/i386/i386-common.cc (ix86_optimization_table):
>   Enable loop unroll O2, disable -fweb and -frename-registers
>   by default.
>   * config/i386/i386-options.cc
>   (ix86_override_options_after_change):
>   Disable small loop unroll when funroll-loops enabled, reset
>   cunroll_grow_size when it is not explicitly enabled.
>   (ix86_option_override_internal): Call
>   ix86_override_options_after_change instead of calling
>   ix86_recompute_optlev_based_flags and ix86_default_align
>   separately.
>   * config/i386/i386.cc (ix86_loop_unroll_adjust): Adjust unroll
>   factor if -munroll-only-small-loops enabled.
>   * loop-init.cc (pass_rtl_unroll_loops::gate): Do not enable
>   loop unrolling for -O2-speed.
>   (pass_rtl_unroll_loops::execute): Remove
>   targetm.loop_unroll_adjust check.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR target/107602
>   * gcc.target/i386/pr86270.c: Add -fno-unroll-loops.
>   * gcc.target/i386/pr93002.c: Likewise.
> ---
>  gcc/common/config/i386/i386-common.cc   |  8 ++
>  gcc/config/i386/i386-options.cc | 34 ++---
>  gcc/config/i386/i386.cc | 18 -
>  gcc/loop-init.cc| 11 +++-
>  gcc/testsuite/gcc.target/i386/pr86270.c |  2 +-
> gcc/testsuite/gcc.target/i386/pr93002.c |  2 +-
>  6 files changed, 49 insertions(+), 26 deletions(-)
> 
> diff --git a/gcc/common/config/i386/i386-common.cc
> b/gcc/common/config/i386/i386-common.cc
> index 6ce2a588adc..660a977b68b 100644
> --- a/gcc/common/config/i386/i386-common.cc
> +++ b/gcc/common/config/i386/i386-common.cc
> @@ -1808,7 +1808,15 @@ static const struct default_options
> ix86_option_optimization_table[] =
>  /* The STC algorithm produces the smallest code at -Os, for x86.  */
>  { OPT_LEVELS_2_PLUS, OPT_freorder_blocks_algorithm_, NULL,
>REORDER_BLOCKS_ALGORITHM_STC },
> +
> +/* Turn on -funroll-loops with -munroll-only-small-loops to enable small
> +   loop unrolling at -O2.  */
> +{ OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_funroll_loops, NULL, 1 },
>  { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_munroll_only_small_loops, NULL,
> 1 },
> +/* Turns off -frename-registers and -fweb which are enabled by
> +   funroll-loops.  */
> +{ OPT_LEVELS_ALL, OPT_frename_registers, NULL, 0 },
> +{ OPT_LEVELS_ALL, OPT_fweb, NULL, 0 },
>  /* Turn off -fschedule-insns by default.  It tends to make the
> problem with not enough registers even worse.  */
>  { OPT_LEVELS_ALL, OPT_fschedule_insns, NULL, 0 }, diff --git
> a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc index
> e5c77f3a84d..bc1d36e36a8 100644
> --- a/gcc/config/i386/i386-options.cc
> +++ b/gcc/config/i386/i386-options.cc
> @@ -1838,8 +1838,37 @@ ix86_recompute_optlev_based_flags (struct
> gcc_options *opts,  void  ix86_override_options_after_change (void)  {
> +  /* Default align_* from the processor table.  */
>ix86_default_align (&global_options);
> +
>ix86_recompute_optlev_based_flags (&global_options, &global_options_set);
> +
> +  /* Disable unrolling small loops when there's explicit
> + -f{,no}unroll-loop.  */
> +  if ((OPTION_SET_P (flag_unroll_loops))
> + || (OPTION_SET_P (flag_unroll_all_loops)
> +  && flag_unroll_all_loops))
> +{
> +  if (!OPTION_SET_P (ix86_unroll_only_small_loops))
> + ix86_unroll_only_small_loops = 0;
> +  /* Re-enable -frename-registers and -fweb if funroll-loops
> +  enabled.  */
> +  if (!OPTION_SET_P (flag_web))
> + flag_web = flag_unroll_loops;
> +  if (!OPTION_SET_P (flag_rename_registers))
> + flag_rename_registers = flag_unroll_loops;
> +  /* -fcunroll-grow-size default follows -f[no]-unroll-loops.  */
> +  if (!OPTION_SET_P (flag_cunroll_grow_size))
> + flag_cunroll_grow_size = flag_unroll_loops
> +  || flag_peel_loops
> +  || optimize >= 3;
> +}
> +  else
> +{
> +  if (!OPTION_SET_P (flag_cunroll_grow_size))
> + 

Re: [PATCH] Fix in _GLIBCXX_INLINE_VERSION mode

2022-11-20 Thread Jonathan Wakely via Gcc-patches
On Sun, 20 Nov 2022, 20:45 François Dumont,  wrote:

> On 19/11/22 14:11, Jonathan Wakely wrote:
> > On Sat, 19 Nov 2022 at 13:03, François Dumont via Libstdc++
> >  wrote:
> >> Without this qualification I have this in _GLIBCXX_INLINE_VERSION mode:
> >>
> >>
> /home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/locale_facets.h:2649:
> >> note: candidate: 'template bool std::__9::isxdigit(_CharT,
> >> const locale&)'
> >>
> /home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/locale_facets.h:2649:
> >> note:   template argument deduction/substitution failed:
> >>
> /home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/format:1540:
> >> note:   candidate expects 2 arguments, 1 provided
> >>
> /home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/format:1630:
> >> error: no matching function for call to 'isxdigit(const
> >> std::__9::basic_string_view::value_type&)'
> >>
> /home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/locale_facets.h:2649:
> >> note: candidate: 'template bool std::__9::isxdigit(_CharT,
> >> const locale&)'
> >>
> /home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/locale_facets.h:2649:
> >> note:   template argument deduction/substitution failed:
> >>
> /home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/format:1630:
> >> note:   candidate expects 2 arguments, 1 provided
> >> compiler exited with status 1
> >> FAIL: 17_intro/headers/c++2020/all_attributes.cc (test for excess
> errors)
> >> Excess errors:
> >>
> /home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/format:1540:
> >> error: no matching function for call to 'isxdigit(const
> >> std::__9::basic_string_view::value_type&)'
> >>
> /home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/format:1630:
> >> error: no matching function for call to 'isxdigit(const
> >> std::__9::basic_string_view::value_type&)'
> >>
> >> It sounds like the most reasonable fix as this is how toupper is being
> >> called.
> > I think the real problem is that include/c_global/cctype is missing
> > the NAMESPACE_VERSION macros.
> >
> > All declarations of std::isxdigit etc should be in the same namespace,
> > precisely so we don't need to do this.
>
> Didn't you want to fix it this way then ?
>
> To be honest I was a little bit lost by this code:
>
> #if !__has_builtin(__builtin_toupper)
> # include <cctype>
> #endif
>
> Looks like cctype is included only for toupper, why not for isxdigit ?
>


The idea was to only include it when needed for clang. But of course it's
already included by  and so that check is pointless.

I think we might want to use the built-in for isxdigit as well, because the
built-in ignores the locale which is what we want here. Or we should just
replace toupper and isxdigit with locale-independent equivalents.
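
As a rough idea of what locale-independent equivalents could look like,
here is a sketch in plain C.  The names ascii_isxdigit and ascii_toupper
are made up, this is not what libstdc++ does, and it assumes an
ASCII-compatible execution character set.

```c
/* Return 1 for '0'-'9', 'a'-'f' and 'A'-'F', regardless of locale.  */
static int
ascii_isxdigit (unsigned char c)
{
  return (c >= '0' && c <= '9')
	 || (c >= 'a' && c <= 'f')
	 || (c >= 'A' && c <= 'F');
}

/* Map 'a'-'z' to 'A'-'Z' and leave every other value unchanged.  */
static unsigned char
ascii_toupper (unsigned char c)
{
  if (c >= 'a' && c <= 'z')
    return (unsigned char) (c - ('a' - 'A'));
  return c;
}
```

Neither helper consults the global locale, so the result is the same
whatever setlocale call the program has made.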



> >
> >>   libstdc++: Add missing std qualification on isxdigit calls
> >>
> >>   libstdc++-v3/ChangeLog
> >>
> >>   * include/std/format: Add std qualification on isxdigit
> calls.
> >>
> >> Ok to commit ?
> > Yes, OK.
>
> Committed.
>

Thanks.



>


[Patch Arm] Add neon_fcmla and neon_fcadd as neon_type instructions.

2022-11-20 Thread Ramana Radhakrishnan via Gcc-patches
[AArch64 folks CC'd fyi as this is common between both backends.]

Hi,

The design in the backend has been that Advanced SIMD types are
generally added to is_neon_type. It appears that neon_fcmla and
neon_fcadd weren't added as neon_type instructions.

Applying this to the tree later this week after having built armhf and
a bootstrap and test run on aarch64-linux-gnu.

Thanks,
Ramana
commit 7dd15fae0ac1455f5818a1fc0078e35d85e1e250
Author: Ramana Radhakrishnan 
Date:   Wed Nov 16 10:32:04 2022 +

[Patch Arm] Add neon_fcadd and neon_fcmla to is_neon_type.

Appears to have been an oversight.

gcc/
* config/arm/types.md: Update comment.
(is_neon_type): Add neon_fcmla, neon_fcadd.

Signed-off-by: Ramana Radhakrishnan 

diff --git a/gcc/config/arm/types.md b/gcc/config/arm/types.md
index 7d0504bdd94..d0d9997efd2 100644
--- a/gcc/config/arm/types.md
+++ b/gcc/config/arm/types.md
@@ -248,7 +248,8 @@ (define_attr "autodetect_type"
 ; wmmx_wunpckil
 ; wmmx_wxor
 ;
-; The classification below is for NEON instructions.
+; The classification below is for NEON instructions. If a new neon type is
+; added, please ensure this is added to the is_neon_type attribute below too.
 ;
 ; neon_add
 ; neon_add_q
@@ -1281,6 +1282,7 @@ (define_attr "is_neon_type" "yes,no"
   neon_fp_mla_d_q, neon_fp_mla_d_scalar_q, neon_fp_sqrt_s,\
   neon_fp_sqrt_s_q, neon_fp_sqrt_d, neon_fp_sqrt_d_q,\
   neon_fp_div_s, neon_fp_div_s_q, neon_fp_div_d, neon_fp_div_d_q, 
crypto_aese,\
+  neon_fcadd, neon_fcmla, \
   crypto_aesmc, crypto_sha1_xor, crypto_sha1_fast, crypto_sha1_slow,\
   crypto_sha256_fast, crypto_sha256_slow")
 (const_string "yes")


Re: [PATCH 15/35] arm: Explicitly specify other float types for _Generic overloading [PR107515]

2022-11-20 Thread Ramana Radhakrishnan via Gcc-patches
On Fri, Nov 18, 2022 at 4:59 PM Kyrylo Tkachov via Gcc-patches
 wrote:
>
>
>
> > -Original Message-
> > From: Andrea Corallo 
> > Sent: Thursday, November 17, 2022 4:38 PM
> > To: gcc-patches@gcc.gnu.org
> > Cc: Kyrylo Tkachov ; Richard Earnshaw
> > ; Stam Markianos-Wright  > wri...@arm.com>
> > Subject: [PATCH 15/35] arm: Explicitly specify other float types for 
> > _Generic
> > overloading [PR107515]
> >
> > From: Stam Markianos-Wright 
> >
> > This patch adds explicit references to other float types
> > to __ARM_mve_typeid in arm_mve.h.  Resolves PR 107515:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107515
> >
> > gcc/ChangeLog:
> > PR 107515
> > * config/arm/arm_mve.h (__ARM_mve_typeid): Add float types.
>
> Argh, I'm looking forward to when we move away from this _Generic business, 
> but for now ok.
> The ChangeLog should say "PR target/107515" for the git hook to recognize it 
> IIRC.

and the PR is against 11.x - is there a plan to back port this and
dependent patches to relevant branches ?

Ramana

> Thanks,
> Kyrill
>
> > ---
> >  gcc/config/arm/arm_mve.h | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
> > index fd1876b57a0..f6b42dc3fab 100644
> > --- a/gcc/config/arm/arm_mve.h
> > +++ b/gcc/config/arm/arm_mve.h
> > @@ -35582,6 +35582,9 @@ enum {
> >   short: __ARM_mve_type_int_n, \
> >   int: __ARM_mve_type_int_n, \
> >   long: __ARM_mve_type_int_n, \
> > + _Float16: __ARM_mve_type_fp_n, \
> > + __fp16: __ARM_mve_type_fp_n, \
> > + float: __ARM_mve_type_fp_n, \
> >   double: __ARM_mve_type_fp_n, \
> >   long long: __ARM_mve_type_int_n, \
> >   unsigned char: __ARM_mve_type_int_n, \
> > --
> > 2.25.1
>
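
For readers unfamiliar with the _Generic issue, here is a minimal C11 sketch
of why the explicit associations matter: _Generic picks an association by
exact type, so a float argument does not match a "double:" entry. The enum
values, the macro names and the "default:" entry are all made up for this
example; they are not taken from arm_mve.h.

```c
/* Sketch only: hypothetical type ids, not the values from arm_mve.h.  */
enum { TYPE_UNKNOWN = 0, TYPE_INT_N = 1, TYPE_FP_N = 2 };

/* Selection list without an explicit "float" association.  The default
   entry is an assumption, added so the mismatch is observable.  */
#define TYPEID_WITHOUT_FLOAT(x) _Generic ((x), \
	int: TYPE_INT_N, \
	long: TYPE_INT_N, \
	double: TYPE_FP_N, \
	default: TYPE_UNKNOWN)

/* Same list with "float" listed explicitly.  */
#define TYPEID_WITH_FLOAT(x) _Generic ((x), \
	int: TYPE_INT_N, \
	long: TYPE_INT_N, \
	float: TYPE_FP_N, \
	double: TYPE_FP_N, \
	default: TYPE_UNKNOWN)
```

With these definitions TYPEID_WITHOUT_FLOAT (1.0f) selects the default entry,
while TYPEID_WITH_FLOAT (1.0f) selects the floating-point id.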


Re: [PATCH][GCC] arm: Add support for new frame unwinding instruction "0xb5".

2022-11-20 Thread Ramana Radhakrishnan via Gcc-patches
On Fri, Nov 18, 2022 at 9:33 AM Srinath Parvathaneni
 wrote:
>
> Hi,
>
> > -Original Message-
> > From: Ramana Radhakrishnan 
> > Sent: Thursday, November 17, 2022 8:27 PM
> > To: Srinath Parvathaneni 
> > Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw
> > ; Kyrylo Tkachov 
> > Subject: Re: [PATCH][GCC] arm: Add support for new frame unwinding
> > instruction "0xb5".
> >
> > On Thu, Nov 10, 2022 at 10:38 AM Srinath Parvathaneni via Gcc-patches  > patc...@gcc.gnu.org> wrote:
> > >
> > > Hi,
> > >
> > > This patch adds support for Arm frame unwinding instruction "0xb5"
> > > [1]. When an exception is taken and the "0xb5" instruction is
> > > encountered during runtime stack unwinding, we use the effective vsp as
> > > the modifier in pointer authentication. If stack unwinding completes
> > > without encountering "0xb5", the CFA is used as the modifier in pointer
> > > authentication.
> > >
> > > [1]
> > > https://github.com/ARM-software/abi-aa/releases/download/2022Q3/ehabi32.pdf
> > >
> > > Regression tested on arm-none-eabi target and found no regressions.
> > >
> > > Ok for master?
> > >
> >
> > No, not yet.
> >
> > Presumably the logic to produce 0xb5 is in the source base and this was
> > tested with suitable options that produce said opcode ? I see no logic in 
> > place
> > to produce the said opcode in the backend in a quick read as the pacbti
> > patches still seem to be in review. ?
> >
> > So what was the test suite run actually testing ?
>
> Sorry for the late response, the patch supporting the said opcode (directive 
> ".pacspval") is here:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605524.html (still 
> under upstream review)
>
> and the patch to encode ".pacspval" with the mentioned opcode "0xb5" in 
> binutils is here:
> https://sourceware.org/pipermail/binutils/2022-November/124328.html (approved 
> and committed to binutils).

Thanks for the answer, but perhaps I should make my question more
explicit: are you saying that this patch was tested in combination
with those and other dependent patches, on a suitable simulator with
suitable multilibs and with C++ enabled, so that frame unwinding was
actually exercised?

For the future, it would certainly be worth being explicit about this
in your patch submission :)

regards
Ramana

>
> Regards,
> Srinath.
>
> > regards
> > Ramana
> >
> >
> > > Regards,
> > > Srinath.
> > >
> > > gcc/ChangeLog:
> > >
> > > 2022-11-09  Srinath Parvathaneni  
> > >
> > > * libgcc/config/arm/pr-support.c (__gnu_unwind_execute): Decode
> > opcode
> > > "0xb5".
> > >
> > >
> > > ### Attachment also inlined for ease of reply
> > ###
> > >
> > >
> > > diff --git a/libgcc/config/arm/pr-support.c
> > > b/libgcc/config/arm/pr-support.c index
> > >
> > e48854587c667a959aa66ccc4982231f6ecc..73e4942a39b34a83c2da85de
> > f6b1
> > > 3e82ec501552 100644
> > > --- a/libgcc/config/arm/pr-support.c
> > > +++ b/libgcc/config/arm/pr-support.c
> > > @@ -107,7 +107,9 @@ __gnu_unwind_execute (_Unwind_Context *
> > context, __gnu_unwind_state * uws)
> > >_uw op;
> > >int set_pc;
> > >int set_pac = 0;
> > > +  int set_pac_sp = 0;
> > >_uw reg;
> > > +  _uw sp;
> > >
> > >set_pc = 0;
> > >for (;;)
> > > @@ -124,10 +126,11 @@ __gnu_unwind_execute (_Unwind_Context *
> > context,
> > > __gnu_unwind_state * uws)  #if defined(TARGET_HAVE_PACBTI)
> > >   if (set_pac)
> > > {
> > > - _uw sp;
> > >   _uw lr;
> > >   _uw pac;
> > > - _Unwind_VRS_Get (context, _UVRSC_CORE, R_SP,
> > _UVRSD_UINT32, );
> > > + if (!set_pac_sp)
> > > +   _Unwind_VRS_Get (context, _UVRSC_CORE, R_SP,
> > _UVRSD_UINT32,
> > > +);
> > >   _Unwind_VRS_Get (context, _UVRSC_CORE, R_LR, _UVRSD_UINT32,
> > );
> > >   _Unwind_VRS_Get (context, _UVRSC_PAC, R_IP,
> > >_UVRSD_UINT32, ); @@ -259,7 +262,19
> > > @@ __gnu_unwind_execute (_Unwind_Context * context,
> > __gnu_unwind_state * uws)
> > >   continue;
> > > }
> > >
> > > - if ((op & 0xfc) == 0xb4)  /* Obsolete FPA.  */
> > > + /* Use current VSP as modifier in PAC validation.  */
> > > + if (op == 0xb5)
> > > +   {
> > > + if (set_pac)
> > > +   _Unwind_VRS_Get (context, _UVRSC_CORE, R_SP,
> > _UVRSD_UINT32,
> > > +);
> > > + else
> > > +   return _URC_FAILURE;
> > > + set_pac_sp = 1;
> > > + continue;
> > > +   }
> > > +
> > > + if ((op & 0xfd) == 0xb6)  /* Obsolete FPA.  */
> > > return _URC_FAILURE;
> > >
> > >   /* op & 0xf8 == 0xb8.  */
> > >
> > >
> > >
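
A small stand-alone sketch for checking which opcode bytes a mask/value test
of the form (op & MASK) == VALUE accepts. The helper name
count_matching_opcodes is made up for this example and is not part of libgcc.

```c
/* Sketch only: count the 8-bit opcodes OP for which
   (OP & MASK) == VALUE holds.  Hypothetical helper, not part of libgcc.  */
int
count_matching_opcodes (unsigned mask, unsigned value)
{
  int count = 0;
  for (unsigned op = 0; op <= 0xff; op++)
    if ((op & mask) == value)
      count++;
  return count;
}
```

count_matching_opcodes (0xfc, 0xb4) gives 4, i.e. 0xb4 to 0xb7.
count_matching_opcodes (0xfd, 0xb6) gives 0, because the mask clears bit 1
while 0xb6 has that bit set; a mask of 0xfe would accept 0xb6 and 0xb7.
Whether the test as it appears in the quoted diff is what was intended cannot
be told from the quoted text alone.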


Re: [PATCH] Fix in _GLIBCXX_INLINE_VERSION mode

2022-11-20 Thread François Dumont via Gcc-patches

On 19/11/22 14:11, Jonathan Wakely wrote:

On Sat, 19 Nov 2022 at 13:03, François Dumont via Libstdc++
 wrote:

Without this qualification I have this in _GLIBCXX_INLINE_VERSION mode:

/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/locale_facets.h:2649:
note: candidate: 'template bool std::__9::isxdigit(_CharT,
const locale&)'
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/locale_facets.h:2649:
note:   template argument deduction/substitution failed:
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/format:1540:
note:   candidate expects 2 arguments, 1 provided
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/format:1630:
error: no matching function for call to 'isxdigit(const
std::__9::basic_string_view::value_type&)'
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/locale_facets.h:2649:
note: candidate: 'template bool std::__9::isxdigit(_CharT,
const locale&)'
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/bits/locale_facets.h:2649:
note:   template argument deduction/substitution failed:
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/format:1630:
note:   candidate expects 2 arguments, 1 provided
compiler exited with status 1
FAIL: 17_intro/headers/c++2020/all_attributes.cc (test for excess errors)
Excess errors:
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/format:1540:
error: no matching function for call to 'isxdigit(const
std::__9::basic_string_view::value_type&)'
/home/fdt/dev/gcc/build_versioned_ns/x86_64-pc-linux-gnu/libstdc++-v3/include/format:1630:
error: no matching function for call to 'isxdigit(const
std::__9::basic_string_view::value_type&)'

It sounds like the most reasonable fix as this is how toupper is being
called.

I think the real problem is that include/c_global/cctype is missing
the NAMESPACE_VERSION macros.

All declarations of std::isxdigit etc should be in the same namespace,
precisely so we don't need to do this.


Didn't you want to fix it this way then ?

To be honest I was a little bit lost by this code:

#if !__has_builtin(__builtin_toupper)
# include <cctype>
#endif

Looks like cctype is included only for toupper, why not for isxdigit ?




  libstdc++: Add missing std qualification on isxdigit calls

  libstdc++-v3/ChangeLog

  * include/std/format: Add std qualification on isxdigit calls.

Ok to commit ?

Yes, OK.


Committed.



Re: PING^1 [PATCH] cpp/remap: Only override if string matched

2022-11-20 Thread Jeff Law via Gcc-patches



On 11/2/22 12:21, Torbjorn SVENSSON via Gcc-patches wrote:

Hi,

Ping, https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604062.html

Ok for trunk?


OK.  Sorry for the delay.

jeff




[PATCH] libgccjit: Fix float vector comparison

2022-11-20 Thread Antoni Boucher via Gcc-patches
Hi.
This fixes bug 107770.
Thanks for the review.
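
For context, a sketch of the C-level behaviour that the vector path in the
patch below mirrors, using the GCC vector extensions (also accepted by Clang):
comparing two float vectors yields a vector of same-sized signed integers
whose lanes are -1 where the comparison holds and 0 elsewhere. The typedef and
function names are made up for this example, and it assumes 32-bit float and
int.

```c
/* Sketch only: 4 x float and 4 x int vectors via the vector_size
   attribute.  Names are hypothetical.  */
typedef float v4sf __attribute__ ((vector_size (16)));
typedef int v4si __attribute__ ((vector_size (16)));

/* Lane-wise a < b: each result lane is -1 (all bits set) if the
   comparison is true for that lane, else 0.  */
v4si
vec_less_than (v4sf a, v4sf b)
{
  return a < b;
}
```

This is why a comparison of float vectors needs an integer vector as its
result type rather than a scalar boolean.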
From 1112e92624d41ec96c366fdb60101e1040462522 Mon Sep 17 00:00:00 2001
From: Antoni Boucher 
Date: Sun, 20 Nov 2022 10:22:53 -0500
Subject: [PATCH] libgccjit: Fix float vector comparison

Fix float vector comparison and extend the comparison tests, which didn't
include floats and vectors.

gcc/testsuite:
	PR jit/107770
	* jit.dg/harness.h: Add new macro to perform vector
	comparisons
	* jit.dg/test-expressions.c: Extend comparison tests to add float
	types and vectors

gcc/jit:
	PR jit/107770
	* jit-playback.cc: Fix vector float comparison
	* jit-playback.h: Update comparison function signature
	* jit-recording.cc: Update call for "new_comparison" function
	* jit-recording.h: Fix vector float comparison

Co-authored-by: Guillaume Gomez 
---
 gcc/jit/jit-playback.cc |  27 ++-
 gcc/jit/jit-playback.h  |   2 +-
 gcc/jit/jit-recording.cc|   3 +-
 gcc/jit/jit-recording.h |  18 +-
 gcc/testsuite/jit.dg/harness.h  |  15 ++
 gcc/testsuite/jit.dg/test-expressions.c | 234 +++-
 6 files changed, 246 insertions(+), 53 deletions(-)

diff --git a/gcc/jit/jit-playback.cc b/gcc/jit/jit-playback.cc
index d227d36283a..2888da16ebf 100644
--- a/gcc/jit/jit-playback.cc
+++ b/gcc/jit/jit-playback.cc
@@ -1213,7 +1213,7 @@ playback::rvalue *
 playback::context::
 new_comparison (location *loc,
 		enum gcc_jit_comparison op,
-		rvalue *a, rvalue *b)
+		rvalue *a, rvalue *b, type *vec_result_type)
 {
   // FIXME: type-checking, or coercion?
   enum tree_code inner_op;
@@ -1252,10 +1252,27 @@ new_comparison (location *loc,
   tree node_b = b->as_tree ();
   node_b = fold_const_var (node_b);
 
-  tree inner_expr = build2 (inner_op,
-			boolean_type_node,
-			node_a,
-			node_b);
+  tree inner_expr;
+  tree a_type = TREE_TYPE (node_a);
+  if (VECTOR_TYPE_P (a_type))
+  {
+/* Build a vector comparison.  See build_vec_cmp in c-typeck.cc for
+   reference.  */
+tree t_vec_result_type = vec_result_type->as_tree ();
+tree zero_vec = build_zero_cst (t_vec_result_type);
+tree minus_one_vec = build_minus_one_cst (t_vec_result_type);
+tree cmp_type = truth_type_for (a_type);
+tree cmp = build2 (inner_op, cmp_type, node_a, node_b);
+inner_expr = build3 (VEC_COND_EXPR, t_vec_result_type, cmp, minus_one_vec,
+			 zero_vec);
+  }
+  else
+  {
+inner_expr = build2 (inner_op,
+			 boolean_type_node,
+			 node_a,
+			 node_b);
+  }
 
   /* Try to fold.  */
   inner_expr = fold (inner_expr);
diff --git a/gcc/jit/jit-playback.h b/gcc/jit/jit-playback.h
index 3ba02a0451a..056e5231514 100644
--- a/gcc/jit/jit-playback.h
+++ b/gcc/jit/jit-playback.h
@@ -162,7 +162,7 @@ public:
   rvalue *
   new_comparison (location *loc,
 		  enum gcc_jit_comparison op,
-		  rvalue *a, rvalue *b);
+		  rvalue *a, rvalue *b, type *vec_result_type);
 
   rvalue *
   new_call (location *loc,
diff --git a/gcc/jit/jit-recording.cc b/gcc/jit/jit-recording.cc
index f78daed2d71..b5eb648ad24 100644
--- a/gcc/jit/jit-recording.cc
+++ b/gcc/jit/jit-recording.cc
@@ -5837,7 +5837,8 @@ recording::comparison::replay_into (replayer *r)
   set_playback_obj (r->new_comparison (playback_location (r, m_loc),
    m_op,
    m_a->playback_rvalue (),
-   m_b->playback_rvalue ()));
+   m_b->playback_rvalue (),
+   m_type->playback_type ()));
 }
 
 /* Implementation of pure virtual hook recording::rvalue::visit_children
diff --git a/gcc/jit/jit-recording.h b/gcc/jit/jit-recording.h
index 8610ea988bd..5d7c7177cc3 100644
--- a/gcc/jit/jit-recording.h
+++ b/gcc/jit/jit-recording.h
@@ -1683,7 +1683,23 @@ public:
 m_op (op),
 m_a (a),
 m_b (b)
-  {}
+  {
+type *a_type = a->get_type ();
+vector_type *vec_type = a_type->dyn_cast_vector_type ();
+if (vec_type != NULL)
+{
+  type *element_type = vec_type->get_element_type ();
+  type *inner_type;
+  /* Vectors of floating-point values return a vector of integers of the
+ same size.  */
+  if (element_type->is_float ())
+	inner_type = ctxt->get_int_type (element_type->get_size (), false);
+  else
+	inner_type = element_type;
+  m_type = new vector_type (inner_type, vec_type->get_num_units ());
+  ctxt->record (m_type);
+}
+  }
 
   void replay_into (replayer *r) final override;
 
diff --git a/gcc/testsuite/jit.dg/harness.h b/gcc/testsuite/jit.dg/harness.h
index 7b70ce73dd5..e423abe9ee1 100644
--- a/gcc/testsuite/jit.dg/harness.h
+++ b/gcc/testsuite/jit.dg/harness.h
@@ -68,6 +68,21 @@ static char test[1024];
 }\
   } while (0)
 
+#define CHECK_VECTOR_VALUE(LEN, ACTUAL, EXPECTED) \
+  do {   \
+for (int __check_vector_it = 0; __check_vector_it < LEN; ++__check_vector_it) { \
+  if ((ACTUAL)[__check_vector_it] != (EXPECTED)[__check_vector_it]) { \
+  fail ("%s: %s: actual: %s != 

Re: Making gcc toolchain installs relocatable

2022-11-20 Thread Jeff Law via Gcc-patches



On 9/23/22 12:40, Keith Packard via Gcc-patches wrote:

I submitted the referenced patch to extend the 'getenv' .specs function
back in August and didn't see any response, so I wanted to provide a bit
more context to see if that would help people understand why I wrote
this.


I think most folks generally loathe getting into specs and such. So as 
one of the reviewers of last resort, I'll see what I can do here.



So we already have a goodly amount of infrastructure for relocatable 
toolchains and relocatable sysroots.  So the natural question that 
arises is what is it about your environment that is different and 
prevents those existing mechanisms from working.






Here's a link to that message:

 https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600452.html

I'm working with embedded toolchains where I want to distribute binary
versions of binutils, gcc and a suite of libraries in a tar file which
the user can unpack anywhere on their system. To make this work, I need
to create .spec file fragments that can locate the correct libraries
relative to the location where the toolchain was unpacked.


So the first half of that paragraph describes what I do all the time.
I've got a cross toolchain (gcc, binutils, various libraries & headers).
Those all go into a sysroot and I can relocate the toolchain to anywhere
in the filesystem and it "just works". I do this for both bare metal
toolchains using newlib as well as linux-gnu toolchains using glibc.



Now you mention you need to create .spec file fragments to locate the 
correct libraries.  But if the libraries are in the sysroot, then you 
shouldn't really need to do anything special.  Drop them into the right 
place and they should relocate just like glibc, newlib, etc.



So maybe the problem is you're not using sysroots?






An easy way to do this, which doesn't depend on a default sysroot value,
is to use the GCC_EXEC_PREFIX environment variable in the .specs


Are you not using sysroots at all?  If so, why not?




file. Gcc sets that whenever it discovers that it hasn't been run from
the defined installation path. However, if the user does end up
installing gcc in the defined installation path, then that variable
isn't set at all. If a .specs file attempts to reference the variable,
gcc will emit a fatal error and exit.
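
The failure mode described above is the usual unset-variable problem. In
plain C the fallback idea looks like the sketch below. This only illustrates
the concept; it is not the code from the referenced patch, and
getenv_or_default is a made-up name.

```c
#include <stdlib.h>

/* Sketch only: return the value of environment variable NAME, or
   FALLBACK when it is unset.  Hypothetical helper, not GCC's spec
   function.  */
const char *
getenv_or_default (const char *name, const char *fallback)
{
  const char *value = getenv (name);
  return value != NULL ? value : fallback;
}
```

A caller would pass the configured installation prefix as the fallback, so
the lookup succeeds whether or not the variable has been set.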


This is a good hint.  Have you considered building with a sysroot like 
/wontexist, installing everything into there, then moving the whole 
thing to a more sensible directory before tarring it up?



What's useful about that is you always have a sysroot defined, but it 
generally won't exist at runtime.  So GCC_EXEC_PREFIX should always be 
set and everything should "just work" from that point onward.



Jeff





Re: [PATCH] Add __builtin_iseqsig()

2022-11-20 Thread FX via Gcc-patches
Hi,

> Joseph's reply earlier in this thread has indicated a desire for a test that 
> verifies FE_INVALID is raised when appropriate and not raised when 
> inappropriate when the arguments come from volatile variables rather than 
> directly from constants.
> 
> The patch itself looks pretty reasonable.  So let's get the testing coverage 
> Joseph wanted so we can move forward.

Sadly, while I had some time to deal with it when the patch was originally 
submitted, once the review came back my hands were full and right now I cannot 
find time. I hope someone can pick it up and finish, otherwise it will have to 
wait for the next version.

FX



Re: [PATCH] Add __builtin_iseqsig()

2022-11-20 Thread Jeff Law via Gcc-patches



On 10/31/22 13:15, FX wrote:

Hi,

Just adding, from the Fortran 2018 perspective, things we will need to 
implement for which I think support from the middle-end might be necessary:

- rounded conversions: converting, from an integer or floating point type, into 
another floating point type, with specific rounding mode passed as argument
- conversion to integer: converting, from a floating point type, into an 
integer type, with specific rounding mode passed as argument
- IEEE operations corresponding to nextDown and nextUp (or are those already 
available? I have not checked the fine print)

I would like to add them all for GCC 13.


If you want them all for GCC 13, then you're going to need to make a 
case for a policy exception to add them after stage1 has closed.



Joseph's reply earlier in this thread has indicated a desire for a test 
that verifies FE_INVALID is raised when appropriate and not raised when 
inappropriate when the arguments come from volatile variables rather 
than directly from constants.


The patch itself looks pretty reasonable.  So let's get the testing 
coverage Joseph wanted so we can move forward.



Jeff


Re: [PATCH 2/5] c++: Set the locus of the function result decl

2022-11-20 Thread Bernhard Reutner-Fischer via Gcc-patches
Hi Jason!

The "meh" of result-decl-plugin-test-2.C should likely be omitted,
grokdeclarator would need some changes to add richloc hints and we would not
be able to make a reliable guess what to remove precisely.
C.f. /* Check all other uses of type modifiers.  */
Furthermore it is unrelated to DECL_RESULT so not of direct interest
here. The other tests in test-2.C, f() and huh() should work though.

I don't know if it's acceptable to change ipa-pure-const to make the
missing noreturn warning more precise and emit a fixit-hint. At least it
would be a real test for the DECL_RESULT and would spare us the plugin.

HTH,

gcc/cp/ChangeLog:

* decl.cc (start_preparsed_function): Set the result decl source
location to the location of the typespec.
(start_function): Likewise.

gcc/ChangeLog:

* ipa-pure-const.cc (suggest_attribute): Add fixit-hint for the
noreturn attribute.

gcc/testsuite/ChangeLog:

* c-c++-common/pr68833-1.c: Adjust noreturn warning line number.
* gcc.dg/noreturn-1.c: Likewise.
* g++.dg/plugin/plugin.exp: Add new plugin test.
* g++.dg/other/resultdecl-1.C: New test.
* g++.dg/plugin/result-decl-plugin-test-1.C: New test.
* g++.dg/plugin/result-decl-plugin-test-2.C: New test.
* g++.dg/plugin/result_decl_plugin.C: New test.
---
 gcc/cp/decl.cc| 26 +++-
 gcc/ipa-pure-const.cc | 14 -
 gcc/testsuite/c-c++-common/pr68833-1.c|  2 +-
 gcc/testsuite/g++.dg/other/resultdecl-1.C | 32 ++
 gcc/testsuite/g++.dg/plugin/plugin.exp|  3 +
 .../g++.dg/plugin/result-decl-plugin-test-1.C | 31 ++
 .../g++.dg/plugin/result-decl-plugin-test-2.C | 59 +++
 .../g++.dg/plugin/result_decl_plugin.C| 53 +
 gcc/testsuite/gcc.dg/noreturn-1.c |  2 +-
 9 files changed, 218 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/other/resultdecl-1.C
 create mode 100644 gcc/testsuite/g++.dg/plugin/result-decl-plugin-test-1.C
 create mode 100644 gcc/testsuite/g++.dg/plugin/result-decl-plugin-test-2.C
 create mode 100644 gcc/testsuite/g++.dg/plugin/result_decl_plugin.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index d28889ed865..0c053b6d19f 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -17235,6 +17235,17 @@ start_preparsed_function (tree decl1, tree attrs, int 
flags)
   cp_apply_type_quals_to_decl (cp_type_quals (restype), resdecl);
 }
 
+  /* Set the result decl source location to the location of the typespec.  */
+  if (DECL_RESULT (decl1)
+  && DECL_TEMPLATE_INSTANTIATION (decl1)
+  && DECL_TEMPLATE_INFO (decl1)
+  && DECL_TI_TEMPLATE (decl1)
+  && DECL_TEMPLATE_RESULT (DECL_TI_TEMPLATE (decl1))
+  && DECL_RESULT (DECL_TEMPLATE_RESULT (DECL_TI_TEMPLATE (decl1
+  DECL_SOURCE_LOCATION (DECL_RESULT (decl1))
+   = DECL_SOURCE_LOCATION (
+   DECL_RESULT (DECL_TEMPLATE_RESULT (DECL_TI_TEMPLATE (decl1;
+
   /* Record the decl so that the function name is defined.
  If we already have a decl for this name, and it is a FUNCTION_DECL,
  use the old decl.  */
@@ -17532,7 +17543,20 @@ start_function (cp_decl_specifier_seq *declspecs,
 gcc_assert (same_type_p (TREE_TYPE (TREE_TYPE (decl1)),
 integer_type_node));
 
-  return start_preparsed_function (decl1, attrs, /*flags=*/SF_DEFAULT);
+  bool ret = start_preparsed_function (decl1, attrs, /*flags=*/SF_DEFAULT);
+
+  /* decl1 might be ggc_freed here.  */
+  decl1 = current_function_decl;
+
+  tree result;
+  /* Set the result decl source location to the location of the typespec.  */
+  if (ret
+  && TREE_CODE (decl1) == FUNCTION_DECL
+  && declspecs->locations[ds_type_spec] != UNKNOWN_LOCATION
+  && (result = DECL_RESULT (decl1)) != NULL_TREE
+  && DECL_SOURCE_LOCATION (result) == input_location)
+DECL_SOURCE_LOCATION (result) = declspecs->locations[ds_type_spec];
+  return ret;
 }
 
 /* Returns true iff an EH_SPEC_BLOCK should be created in the body of
diff --git a/gcc/ipa-pure-const.cc b/gcc/ipa-pure-const.cc
index 572a6da274f..1c80034f38d 100644
--- a/gcc/ipa-pure-const.cc
+++ b/gcc/ipa-pure-const.cc
@@ -63,6 +63,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-fnsummary.h"
 #include "symtab-thunks.h"
 #include "dbgcnt.h"
+#include "gcc-rich-location.h"
 
 /* Lattice values for const and pure functions.  Everything starts out
being const, then may drop to pure and then neither depending on
@@ -212,7 +213,18 @@ suggest_attribute (int option, tree decl, bool 
known_finite,
   if (warned_about->contains (decl))
 return warned_about;
   warned_about->add (decl);
-  warning_at (DECL_SOURCE_LOCATION (decl),
+
+  gcc_rich_location richloc (option == OPT_Wsuggest_attribute_noreturn
+? DECL_SOURCE_LOCATION (DECL_RESULT (decl))
+: 

Re: [PATCH 1/2] Allow subtarget customization of CC1_SPEC

2022-11-20 Thread Jeff Law via Gcc-patches



On 10/26/22 03:34, Sebastian Huber wrote:

On 04/10/2022 11:47, Sebastian Huber wrote:

On 08/09/2022 07:33, Sebastian Huber wrote:

On 04/08/2022 15:02, Sebastian Huber wrote:

On 22/07/2022 15:02, Sebastian Huber wrote:

gcc/ChangeLog:

* gcc.cc (SUBTARGET_CC1_SPEC): Define if not defined.
(CC1_SPEC): Define to SUBTARGET_CC1_SPEC.
* config/arm/arm.h (CC1_SPEC): Remove.
* config/arc/arc.h (CC1_SPEC): Append SUBTARGET_CC1_SPEC.
* config/cris/cris.h (CC1_SPEC): Likewise.
* config/frv/frv.h (CC1_SPEC): Likewise.
* config/i386/i386.h (CC1_SPEC): Likewise.
* config/ia64/ia64.h (CC1_SPEC): Likewise.
* config/lm32/lm32.h (CC1_SPEC): Likewise.
* config/m32r/m32r.h (CC1_SPEC): Likewise.
* config/mcore/mcore.h (CC1_SPEC): Likewise.
* config/microblaze/microblaze.h: Likewise.
* config/nds32/nds32.h (CC1_SPEC): Likewise.
* config/nios2/nios2.h (CC1_SPEC): Likewise.
* config/pa/pa.h (CC1_SPEC): Likewise.
* config/rs6000/sysv4.h (CC1_SPEC): Likewise.
* config/rx/rx.h (CC1_SPEC): Likewise.
* config/sparc/sparc.h (CC1_SPEC): Likewise.
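
The define-if-not-defined scheme in the ChangeLog above can be sketched with
plain C preprocessor code. The spec string used for SUBTARGET_CC1_SPEC and the
function effective_cc1_spec are made up for this example; they are
assumptions, not text from the patch.

```c
/* Sketch only.  A subtarget header could provide its own fragment; the
   spec string below is a hypothetical example, not taken from the patch.  */
#define SUBTARGET_CC1_SPEC "%{!ftls-model=*:-ftls-model=local-exec}"

/* Generic fallbacks, in the spirit of the gcc.cc ChangeLog entry.  */
#ifndef SUBTARGET_CC1_SPEC
#define SUBTARGET_CC1_SPEC ""
#endif

#ifndef CC1_SPEC
#define CC1_SPEC SUBTARGET_CC1_SPEC
#endif

/* A target with its own CC1_SPEC would instead append the fragment via
   string-literal concatenation, e.g.
     #define CC1_SPEC "<target options> " SUBTARGET_CC1_SPEC  */

/* Return the spec string that results from the definitions above.  */
const char *
effective_cc1_spec (void)
{
  return CC1_SPEC;
}
```

With no target-specific CC1_SPEC in play, the effective spec is just the
subtarget fragment; without a subtarget definition it would be the empty
string.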


Could someone please have a look at this patch set?


Ping


Would someone mind having a look at this patch set? If there is a 
better approach to customize the default TLS model, then please let 
me know.


It would be nice if someone could review the patch before Stage 1 
ends on November 13th.


Just a reminder.  The guidelines are a patch needs to be posted before 
the end of stage1 to make the deadline.  Review & integration can happen 
after the deadline.


I realize the idea here is to allow RTEMS to change the default TLS 
model.  But does it also happen to make it possible to solve Keith 
Packard's issues with picolibc?  See the Aug/Sep gcc-patches archives.


It looks sensible.  I assume you did a "find" to identify all the 
CC1_SPECs to change.



OK for the trunk,

Jeff


Anyway, does this also solve some of the issues Keith Packard was trying to 
nail down for picolibc?




Re: [PATCH v2] Add -gcodeview option

2022-11-20 Thread Mark Harmstone

On 20/11/22 16:43, Jeff Law wrote:


On 10/26/22 21:38, Mark Harmstone wrote:

Changed to double dashes as per
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604287.html.


What value is there in providing this option now?  IIUC we don't have any of 
the bits yet to actually produce PDB records.   It seems to me like this ought 
to be patch 1/n of a patch to produce PDB debug symbols.


This isn't useless, as ld will create symbols for the mangled names even 
without the .debug$S and .debug$T sections being present.



Re: [PATCH v2] Add -gcodeview option

2022-11-20 Thread Jeff Law via Gcc-patches



On 10/26/22 21:38, Mark Harmstone wrote:

Changed to double dashes as per
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604287.html.


What value is there in providing this option now?  IIUC we don't have 
any of the bits yet to actually produce PDB records.   It seems to me 
like this ought to be patch 1/n of a patch to produce PDB debug symbols.



Or am I missing something?


jeff




Re: [RFC] postreload cse'ing vector constants

2022-11-20 Thread Jeff Law via Gcc-patches



On 11/3/22 06:38, Robin Dapp wrote:

Should we go ahead with this, i.e. push the change and wait for fallout?
  I guess we're still early enough in the cycle for that.  There are no
regressions anymore on s390, Power9, x86 and aarch64 (at least on the
farm machines I checked).


That would be my recommendation (go forward asap so that there's more 
time to find any fallout).



jeff



Re: [PATCH v3] RISC-V: Replace zero_extendsidi2_shifted with generalized split

2022-11-20 Thread Jeff Law via Gcc-patches



On 11/9/22 16:10, Philipp Tomsich wrote:

The current method of treating shifts of extended values on RISC-V
frequently causes sequences of 3 shifts, despite the presence of the
'zero_extendsidi2_shifted' pattern.

Consider:
 unsigned long f(unsigned int a, unsigned long b)
 {
 a = a << 1;
 unsigned long c = (unsigned long) a;
 c = b + (c<<4);
 return c;
 }
which will present at combine-time as:
 Trying 7, 8 -> 9:
 7: r78:SI=r81:DI#0<<0x1
   REG_DEAD r81:DI
 8: r79:DI=zero_extend(r78:SI)
   REG_DEAD r78:SI
 9: r72:DI=r79:DI<<0x4
   REG_DEAD r79:DI
 Failed to match this instruction:
 (set (reg:DI 72 [ _1 ])
 (and:DI (ashift:DI (reg:DI 81)
 (const_int 5 [0x5]))
(const_int 68719476704 [0xfffffffe0])))
and produce the following (optimized) assembly:
 f:
    slliw   a5,a0,1
    slli    a5,a5,32
    srli    a5,a5,28
    add     a0,a5,a1
    ret

The current way of handling this (in 'zero_extendsidi2_shifted')
doesn't apply for two reasons:
- this is seen before reload, and
- (more importantly) the constant mask is not 0xul.

To address this, we introduce a generalized version of shifting
zero-extended values that supports any mask of consecutive ones as
long as the number of trailing zeros is the inner shift-amount.

With this new split, we generate the following assembly for the
aforementioned function:
 f:
    slli    a0,a0,33
    srli    a0,a0,28
    add     a0,a0,a1
    ret

Unfortunately, all of this causes some fallout (especially in how it
interacts with Zb* extensions and zero_extract expressions formed
during combine): this is addressed through additional instruction
splitting and handling of zero_extract.
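
As a sanity check of the equivalence, here is a small C sketch comparing the
example function with what the two-shift sequence computes. The mask
68719476704 in the dump above is 31 consecutive ones above 5 trailing zeros,
and in general (x << s) & mask, with mask being n consecutive ones above s
trailing zeros, equals (x << (64 - n)) >> (64 - n - s); n = 31 and s = 5 give
the shifts by 33 and 28. The function names are made up for this example, and
fixed-width types stand in for RV64's unsigned int and unsigned long.

```c
#include <stdint.h>

/* The example function from the commit message, with fixed-width types
   standing in for RV64's unsigned int / unsigned long.  */
uint64_t
f_reference (uint32_t a, uint64_t b)
{
  a = a << 1;
  uint64_t c = (uint64_t) a;
  c = b + (c << 4);
  return c;
}

/* What the two-shift sequence computes.  A_REG is the full 64-bit
   register holding a; its upper 32 bits may contain anything.  */
uint64_t
f_two_shifts (uint64_t a_reg, uint64_t b)
{
  uint64_t t = a_reg << 33;	/* slli a0,a0,33 */
  t = t >> 28;			/* srli a0,a0,28 */
  return t + b;			/* add  a0,a0,a1 */
}
```

Both functions agree for any input, including when the upper half of the
register holding a is not zero.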

gcc/ChangeLog:

* config/riscv/bitmanip.md (*zext.w): Match a zext.w expressed
 as an and:DI.
(*andi_add.uw): New pattern.
(*slli_slli_uw): New pattern.
(*shift_then_shNadd.uw): New pattern.
(*slliuw): Rename to riscv_slli_uw.
(riscv_slli_uw): Renamed from *slliuw.
(*zeroextract2_highbits): New pattern.
(*zero_extract): New pattern, which will be split to
shift-left + shift-right.
* config/riscv/predicates.md (dimode_shift_operand):
* config/riscv/riscv.md (*zero_extract_lowbits):
(zero_extendsidi2_shifted): Rename.
(*zero_extendsidi2_shifted): Generalize.
(*shift_truthvalue): New pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/shift-shift-6.c: New test.
* gcc.target/riscv/shift-shift-7.c: New test.
* gcc.target/riscv/shift-shift-8.c: New test.
* gcc.target/riscv/shift-shift-9.c: New test.
* gcc.target/riscv/snez.c: New test.

Commit notes:
- Depends on a predicate posted in "RISC-V: Optimize branches testing
   a bit-range or a shifted immediate".  Depending on the order of
   applying these, I'll take care to pull that part out of the other
   patch if needed.

Version-changes: 2
- refactor
- optimise for additional corner cases and deal with fallout

Version-changes: 3
- removed the [WIP] from the commit message (no other changes)

Signed-off-by: Philipp Tomsich 
---

(no changes since v1)

  gcc/config/riscv/bitmanip.md  | 142 ++
  gcc/config/riscv/predicates.md|   5 +
  gcc/config/riscv/riscv.md |  75 +++--
  .../gcc.target/riscv/shift-shift-6.c  |  14 ++
  .../gcc.target/riscv/shift-shift-7.c  |  16 ++
  .../gcc.target/riscv/shift-shift-8.c  |  20 +++
  .../gcc.target/riscv/shift-shift-9.c  |  15 ++
  gcc/testsuite/gcc.target/riscv/snez.c |  14 ++
  8 files changed, 261 insertions(+), 40 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/riscv/shift-shift-6.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/shift-shift-7.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/shift-shift-8.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/shift-shift-9.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/snez.c

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 78fdf02c2ec..06126ac4819 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -29,7 +29,20 @@
[(set_attr "type" "bitmanip,load")
 (set_attr "mode" "DI")])
  
-(define_insn "riscv_shNadd"

+;; We may end up forming a slli.uw with an immediate of 0 (while
+;; splitting through "*slli_slli_uw", below).
+;; Match this back to a zext.w
+(define_insn "*zext.w"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
+  (const_int 0))
+   (const_int 4294967295)))]
+  "TARGET_64BIT && TARGET_ZBA"
+  "zext.w\t%0,%1"
+  [(set_attr "type" "bitmanip")
+   (set_attr "mode" "DI")])


Would it be 

Re: [PATCH] testsuite: Add filter for target socket support

2022-11-20 Thread Jeff Law via Gcc-patches



On 11/20/22 03:02, Dimitar Dimitrov wrote:

The new analyzer tests for sockets are failing on embedded targets.
The newlib and avr-libc C libraries do not support sockets.

At first I considered a coarse filtering on the existing
effective_target_freestanding check.  But seeing how lib/target-supports.exp
is slowly turning into a copy of autotools, I kept the tradition and added
a new fine grained "socket" filter.

I also considered adding effective_target_posix, but could not
figure out a reliable C code to perform the check.

Testing done:
   - No changes in gcc.sum for x86_64-pc-linux-gnu, with or without this
 patch.
   - Filtered cases are now UNSUPPORTED instead of failing on AVR and PRU
 backends.

Ok for trunk?

gcc/ChangeLog:

* doc/sourcebuild.texi (sockets): Document new check.

gcc/testsuite/ChangeLog:

* gcc.dg/analyzer/fd-accept.c: Require sockets.
* gcc.dg/analyzer/fd-bind.c: Ditto.
* gcc.dg/analyzer/fd-connect.c: Ditto.
* gcc.dg/analyzer/fd-datagram-socket.c: Ditto.
* gcc.dg/analyzer/fd-glibc-byte-stream-connection-server.c:
Ditto.
* gcc.dg/analyzer/fd-glibc-byte-stream-socket.c: Ditto.
* gcc.dg/analyzer/fd-glibc-datagram-client.c: Ditto.
* gcc.dg/analyzer/fd-glibc-datagram-socket.c: Ditto.
* gcc.dg/analyzer/fd-listen.c: Ditto.
* gcc.dg/analyzer/fd-manpage-getaddrinfo-client.c: Ditto.
* gcc.dg/analyzer/fd-mappage-getaddrinfo-server.c: Ditto.
* gcc.dg/analyzer/fd-meaning.c: Ditto.
* gcc.dg/analyzer/fd-socket-meaning.c: Ditto.
* gcc.dg/analyzer/fd-socket-misuse.c: Ditto.
* gcc.dg/analyzer/fd-stream-socket-active-open.c: Ditto.
* gcc.dg/analyzer/fd-stream-socket-passive-open.c: Ditto.
* gcc.dg/analyzer/fd-stream-socket.c: Ditto.
* gcc.dg/analyzer/fd-symbolic-socket.c: Ditto.
* lib/target-supports.exp (check_effective_target_sockets): New
check.


OK

jeff




Re: [PATCH] RISC-V: Add the Zihpm and Zicntr extensions

2022-11-20 Thread Jeff Law via Gcc-patches



On 11/8/22 20:00, Palmer Dabbelt wrote:

These extensions were recently frozen [1].  As per Andrew's post [2]
we're meant to ignore these in software, this just adds them to the list
of allowed extensions and otherwise ignores them.  I added these under
SPEC_CLASS_NONE even though the PDF lists them as 20190614 because it
seems pointless to add another spec class just to accept two extensions
we then ignore.

1: 
https://groups.google.com/a/groups.riscv.org/g/isa-dev/c/HZGoqP1eyps/m/GTNKRLJoAQAJ
2: 
https://groups.google.com/a/groups.riscv.org/g/sw-dev/c/QKjQhChrq9Q/m/7gqdkctgAgAJ

gcc/ChangeLog

* common/config/riscv/riscv-common.cc: Add Zihpm and Zicntr
extensions.


So the idea here is just to define the extension so that it gets defined 
in the ISA strings and passed through to the assembler, right?


Jeff


Re: [PATCH] riscv: implement TARGET_MODE_REP_EXTENDED

2022-11-20 Thread Jeff Law via Gcc-patches



On 11/4/22 17:00, Philipp Tomsich wrote:

Alexander,

I had missed your comment until now.

On Tue, 6 Sept 2022 at 13:39, Alexander Monakov  wrote:

On Mon, 5 Sep 2022, Philipp Tomsich wrote:


+riscv_mode_rep_extended (scalar_int_mode mode, scalar_int_mode mode_rep)
+{
+  /* On 64-bit targets, SImode register values are sign-extended to DImode.  */
+  if (TARGET_64BIT && mode == SImode && mode_rep == DImode)
+    return SIGN_EXTEND;

I think this leads to a counter-intuitive requirement that a hand-written
inline asm must sign-extend its output operands that are bound to either
signed or unsigned 32-bit lvalues. Will compiler users be aware of that?

I am not sure if I fully understand your concern, as the mode of the
asm-output will be derived from the variable type.
So "asm (... : "=r" (a))" will take DI/SI/HI/QImode depending on the type
of a.


Correct.




The concern, as far as I understand it, would be the case where the
assembly sequence leaves an incompatible extension in the register.


Right.  The question in my mind is whether or not the responsibility 
should be on the compiler or on the developer to ensure the ASM output 
is properly extended.  If someone's writing ASMs, then to a large 
degree, I consider it their responsibility to make sure things are 
properly extended -- even more so if having the compiler do it results 
in slower code independent of ASMs.
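
To make the invariant under discussion concrete, here is a minimal C model.
It is illustrative only: the helper names are hypothetical, and the
conversion to int32_t relies on the usual two's-complement wraparound that
GCC documents. It shows what "SImode values are kept sign-extended in a
DImode register" means, and what an improperly extended asm output would
look like:

```c
#include <stdint.h>

/* Model of the invariant: a 32-bit value held in a 64-bit register is
   kept sign-extended, whether the C type is signed or unsigned.
   The uint32_t -> int32_t conversion assumes two's-complement wraparound.  */
static uint64_t
model_si_in_di_reg (uint32_t value)
{
  return (uint64_t) (int64_t) (int32_t) value;
}

/* Return nonzero if REG is a valid representation of its low 32 bits,
   i.e. bits 63..32 are all copies of bit 31.  */
static int
model_is_properly_extended (uint64_t reg)
{
  return reg == model_si_in_di_reg ((uint32_t) reg);
}
```

Under this model an unsigned 32-bit value 0x80000000 lives in the register
as 0xffffffff80000000.  An asm that leaves 0x0000000080000000 behind for a
32-bit output operand would violate the invariant, which is the case the
question above is about.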



Jeff



Re: [PATCH v2] tree-object-size: Support strndup and strdup

2022-11-20 Thread Jeff Law via Gcc-patches



On 11/4/22 06:48, Siddhesh Poyarekar wrote:

Use string length of input to strdup to determine the usable size of the
resulting object.  Avoid doing the same for strndup since there's a
chance that the input may be too large, resulting in an unnecessary
overhead or, worse, the input may not be NUL-terminated, resulting in a
crash where there would otherwise have been none.
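
The size reasoning above can be sketched in plain C.  This is not the code
from tree-object-size.cc; the helper names are hypothetical, and the
strndup bound of n + 1 is an assumption about what "avoid doing the same"
leaves available (a bound that depends on n only, so the source string is
never read):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Size of the object strdup (SRC) returns: the string plus its NUL.
   SRC must be NUL-terminated, as strdup itself requires.  */
static size_t
model_strdup_size (const char *src)
{
  return strlen (src) + 1;
}

/* Upper bound for the object strndup (SRC, N) returns, computed from N
   alone.  SRC is never inspected, so an unterminated input is harmless.  */
static size_t
model_strndup_max_size (size_t n)
{
  return n == SIZE_MAX ? SIZE_MAX : n + 1;
}
```

The asymmetry is the point: strdup already has to walk the whole string, so
an extra strlen cannot introduce a new fault, whereas for strndup a strlen
on the source could read past a buffer that the call itself would never
have overrun.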

gcc/ChangeLog:

* tree-object-size.cc (todo): New variable.
(object_sizes_execute): Use it.
(strdup_object_size): New function.
(call_object_size): Use it.

gcc/testsuite/ChangeLog:

* gcc.dg/builtin-dynamic-object-size-0.c (test_strdup,
test_strndup, test_strdup_min, test_strndup_min): New tests.
(main): Call them.
* gcc.dg/builtin-dynamic-object-size-1.c: Silence overread
warnings.
* gcc.dg/builtin-dynamic-object-size-2.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-3.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-4.c: Likewise.
* gcc.dg/builtin-object-size-1.c: Silence overread warnings.
Declare free, strdup and strndup.
(test11): New test.
(main): Call it.
* gcc.dg/builtin-object-size-2.c: Silence overread warnings.
Declare free, strdup and strndup.
(test9): New test.
(main): Call it.
* gcc.dg/builtin-object-size-3.c: Silence overread warnings.
Declare free, strdup and strndup.
(test11): New test.
(main): Call it.
* gcc.dg/builtin-object-size-4.c: Silence overread warnings.
Declare free, strdup and strndup.
(test9): New test.
(main): Call it.


I'm struggling to see how the SSA updating is correct.  Yes we need to 
update the virtuals due to the introduction of the call to strlen, 
particularly when SRC is not a string constant.  But do we need to do more?


Don't we end up gimplifying the 1 + strlenfn (src) expression? Can that 
possibly create new SSA_NAMEs?  Do those need to be put into SSA form?  
I feel like I'm missing something here...



jeff



Re: [PATCH 1/3] Compute a table of DWARF register sizes at compile

2022-11-20 Thread Jeff Law via Gcc-patches



On 11/8/22 11:05, Florian Weimer via Gcc-patches wrote:

The sizes are compile-time constants.  Create a vector with them,
so that they can be inspected at compile time.

* gcc/dwarf2cfi.cc (init_return_column_size): Remove.
(init_one_dwarf_reg_size): Adjust.
(generate_dwarf_reg_sizes): New function.  Extracted
from expand_builtin_init_dwarf_reg_sizes.
(expand_builtin_init_dwarf_reg_sizes): Call
generate_dwarf_reg_sizes.
* gcc/target.def (init_dwarf_reg_sizes_extra): Adjust
hook signature.
* gcc/config/msp430/msp430.cc
(msp430_init_dwarf_reg_sizes_extra): Adjust.
* gcc/config/rs6000.cc (rs6000_init_dwarf_reg_sizes_extra):
Likewise.
* gcc/doc/tm.texi: Update.


This series of 3 patches is fine.

Jeff




Re: [PATCH] configure: Implement --enable-host-bind-now

2022-11-20 Thread Jeff Law via Gcc-patches



On 11/10/22 19:53, Marek Polacek via Gcc-patches wrote:

This is a rebased version of the patch I posted in February:
.

Fortunately it is much simpler than the patch implementing --enable-host-pie.
I've converted the install.texi part into configuration.rst, otherwise
there are no changes to the original version.

With --enable-host-bind-now --enable-host-pie:
$ readelf -Wd ./gcc/cc1 ./gcc/cc1plus | grep FLAGS
  0x001e (FLAGS)  BIND_NOW
  0x6ffb (FLAGS_1)Flags: NOW PIE
  0x001e (FLAGS)  BIND_NOW
  0x6ffb (FLAGS_1)Flags: NOW PIE

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --

As promised in the --enable-host-pie patch, this patch adds another
configure option, --enable-host-bind-now, which adds -z now when linking
the compiler executables in order to extend hardening.  BIND_NOW with RELRO
allows the GOT to be marked RO; this prevents GOT modification attacks.

This option does not affect linking of target libraries; you can use
LDFLAGS_FOR_TARGET=-Wl,-z,relro,-z,now to enable RELRO/BIND_NOW.

c++tools/ChangeLog:

* configure.ac (--enable-host-bind-now): New check.
* configure: Regenerate.

gcc/ChangeLog:

* configure.ac (--enable-host-bind-now): New check.  Add
-Wl,-z,now to LD_PICFLAG if --enable-host-bind-now.
* configure: Regenerate.
* doc/install/configuration.rst: Document --enable-host-bind-now.

lto-plugin/ChangeLog:

* configure.ac (--enable-host-bind-now): New check.  Link with
-z,now.
* configure: Regenerate.
---


OK.  Glad to see this finally get to resolution.  While I'm largely in 
agreement with Jakub that PIE doesn't provide a major security benefit 
for the compiler, it seems better to not have the compiler be special 
WRT security options.



Jeff



Re: [PATCH] configure: Implement --enable-host-pie

2022-11-20 Thread Jeff Law via Gcc-patches



On 11/10/22 19:52, Marek Polacek via Gcc-patches wrote:

This is a rebased version of the patch I posted in March:

which Alex sort of approved here:

but it was too late to commit the patch in GCC 12.

There are no changes except that I've converted the documentation
part into the ReST format, and of course regenerated configure.

With --enable-host-pie enabled:
$ file ./gcc/cc1 ./gcc/cc1plus
./gcc/cc1: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), 
dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 
3.2.0, with debug_info, not stripped
./gcc/cc1plus: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), 
dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 
3.2.0, with debug_info, not stripped

Bootstrapped/regtested on x86_64-pc-linux-gnu w/ and w/o --enable-host-pie,
ok for trunk?

-- >8 --

This patch implements the --enable-host-pie configure option which
makes the compiler executables PIE.  This can be used to enhance
protection against ROP attacks, and can be viewed as part of a wider
trend to harden binaries.

It is similar to the option --enable-host-shared, except that --e-h-s
won't add -shared to the linker flags whereas --e-h-p will add -pie.
It is different from --enable-default-pie because that option just
adds an implicit -fPIE/-pie when the compiler is invoked, but the
compiler itself isn't PIE.

Since r12-5768-gfe7c3ecf, PCH works well with PIE, so there are no PCH
regressions.

When building the compiler, the build process may use various in-tree
libraries; these need to be built with -fPIE so that it's possible to
use them when building a PIE.  For instance, when --with-included-gettext
is in effect, intl object files must be compiled with -fPIE.  Similarly,
when building in-tree gmp, isl, mpfr and mpc, they must be compiled with
-fPIE.

I plan to add an option to link with -Wl,-z,now.

ChangeLog:

* Makefile.def: Pass $(PICFLAG) to AM_CFLAGS for gmp, mpfr, mpc, and
isl.
* Makefile.in: Regenerate.
* Makefile.tpl: Set PICFLAG.
* configure.ac (--enable-host-pie): New check.  Set PICFLAG after this
check.
* configure: Regenerate.

c++tools/ChangeLog:

* Makefile.in: Rename PIEFLAG to PICFLAG.  Set LD_PICFLAG.  Use it.
Use pic/libiberty.a if PICFLAG is set.
* configure.ac (--enable-default-pie): Set PICFLAG instead of PIEFLAG.
(--enable-host-pie): New check.
* configure: Regenerate.

fixincludes/ChangeLog:

* Makefile.in: Set and use PICFLAG and LD_PICFLAG.  Use the "pic"
build of libiberty if PICFLAG is set.
* configure.ac:
* configure: Regenerate.

gcc/ChangeLog:

* Makefile.in: Set LD_PICFLAG.  Use it.  Set enable_host_pie.
Remove NO_PIE_CFLAGS and NO_PIE_FLAG.  Pass LD_PICFLAG to
ALL_LINKERFLAGS.  Use the "pic" build of libiberty if --enable-host-pie.
* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG and LD_PICFLAG after this
check.
* configure: Regenerate.
* doc/install/configuration.rst: Document --enable-host-pie.

gcc/d/ChangeLog:

* Make-lang.in: Remove NO_PIE_CFLAGS.

intl/ChangeLog:

* Makefile.in: Use @PICFLAG@ in COMPILE as well.
* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG after this check.
* configure: Regenerate.

libcody/ChangeLog:

* Makefile.in: Pass LD_PICFLAG to LDFLAGS.
* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG and LD_PICFLAG after this
check.
* configure: Regenerate.

libcpp/ChangeLog:

* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG after this check.
* configure: Regenerate.

libdecnumber/ChangeLog:

* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG after this check.
* configure: Regenerate.

libiberty/ChangeLog:

* configure.ac: Also set shared when enable_host_pie.
* configure: Regenerate.

zlib/ChangeLog:

* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG after this check.
* configure: Regenerate.


OK.


Jeff




Re: [PATCH] reg-stack: Fix a -fcompare-debug bug in reg-stack [PR107183]

2022-11-20 Thread Jeff Law via Gcc-patches



On 11/19/22 02:15, Jakub Jelinek wrote:

Hi!

As the following testcase shows, the swap_rtx_condition function
in reg-stack can result in different code generation between -g and -g0.
The function makes its changes as it goes, doing analysis and
modification together, which makes it harder to deal with DEBUG_INSNs;
normally the analysis phase ignores them and the later phase
doesn't.
swap_rtx_condition walks instructions two different ways, one is
using next_flags_user function which stops on non-call instructions
that mention the flags register, and the other is a loop on fnstsw
where it stops on instructions mentioning it and tries to find
sahf instruction that uses it (in both cases calls stop it and so
does end of basic block).
Now both of these currently stop on DEBUG_INSNs that mention
the flags register resp. the fnstsw result register.
On success the function recurses on next flags user instruction
if still live and if the recursion failed, reverts the changes
it did too and fails.
If it were just for the next_flags_user case, the fix could be
just not doing
   INSN_CODE (insn) = -1;
   if (recog_memoized (insn) == -1)
fail = 1;
on DEBUG_INSNs (assuming all changes to those are fine),
swap_rtx_condition_1 just changes one comparison to a different
one.  But due to the possibility of fnstsw result being used
in theory before sahf in some DEBUG_INSNs, this patch takes
a different approach.  swap_rtx_condition has now a new argument
and two modes.  The first mode is when debug_seen is >= 0, in this
case both next_flags_user and the loop for fnstsw -> sahf will
ignore but note DEBUG_INSNs (that mention flags register or fnstsw
result).  If no such DEBUG_INSN is found during the whole call
including recursive invocations (so e.g. for -g0 but probably most
often for -g as well), it behaves as before: if it returns true,
all the changes are done and nothing further needs to be done later.
If any DEBUG_INSNs are seen along the way, even when returning success
all the changes are reverted, so it just reports that the function
would be successful if DEBUG_INSNs were ignored.
In this case, compare_for_stack_reg needs to call it again in
debug_seen = -1 mode, which tells the function to update everything
including DEBUG_INSNs.  For the fnstsw -> sahf case which I hope
will be very rare I just reset the DEBUG_INSNs, I don't really
know how to express it easily otherwise.  For the rest
swap_rtx_condition_1 is done even on the DEBUG_INSNs.
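
The two-mode protocol described above can be modelled with a small toy in
C.  This is only a sketch of the control flow, not the reg-stack code: the
types and the toy_* names are invented for illustration, the failure path
is left out (the toy swap always succeeds), and "swapping a comparison" is
reduced to toggling a flag:

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_kind { TOY_INSN, TOY_DEBUG_INSN };

struct toy_insn
{
  enum toy_kind kind;
  bool uses_flags;   /* Mentions the flags register.  */
  bool swapped;      /* Its comparison has been swapped.  */
};

/* Toggle the comparison in every flags user, optionally including
   debug insns.  Calling it twice with the same arguments is a revert.  */
static void
toy_toggle (struct toy_insn *insns, size_t n, bool include_debug)
{
  for (size_t i = 0; i < n; i++)
    if (insns[i].uses_flags
        && (include_debug || insns[i].kind == TOY_INSN))
      insns[i].swapped = !insns[i].swapped;
}

/* *DEBUG_SEEN >= 0: ignore debug insns that use the flags but note them
   by setting *DEBUG_SEEN to 1; if any was noted, revert every change, so
   the caller only learns that the swap would succeed.
   *DEBUG_SEEN < 0: update everything, debug insns included.  */
static bool
toy_swap_condition (struct toy_insn *insns, size_t n, int *debug_seen)
{
  bool update_debug = *debug_seen < 0;

  if (!update_debug)
    for (size_t i = 0; i < n; i++)
      if (insns[i].uses_flags && insns[i].kind == TOY_DEBUG_INSN)
        *debug_seen = 1;

  toy_toggle (insns, n, update_debug);
  if (!update_debug && *debug_seen == 1)
    toy_toggle (insns, n, false);   /* Revert the tentative changes.  */
  return true;
}

/* Caller side: retry in "update everything" mode only when needed.  */
static bool
toy_compare (struct toy_insn *insns, size_t n)
{
  int debug_seen = 0;
  if (!toy_swap_condition (insns, n, &debug_seen))
    return false;
  if (debug_seen == 1)
    {
      debug_seen = -1;
      return toy_swap_condition (insns, n, &debug_seen);
    }
  return true;
}
```

The property this is meant to illustrate is that the decision to swap never
depends on whether debug insns are present: the first pass decides as if
they were absent, and the second pass only exists to bring them along.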

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
And after some time for release branches too?

2022-11-19  Jakub Jelinek  

PR target/107183
* reg-stack.cc (next_flags_user): Add DEBUG_SEEN argument.
If >= 0 and a DEBUG_INSN would be otherwise returned, set
DEBUG_SEEN to 1 and ignore it.
(swap_rtx_condition): Add DEBUG_SEEN argument.  In >= 0
mode only set DEBUG_SEEN to 1 if problematic DEBUG_INSNs
were seen and revert all changes on success in that case.
Don't try to recog_memoized DEBUG_INSNs.
(compare_for_stack_reg): Adjust swap_rtx_condition caller.
If it returns true and debug_seen is 1, call swap_rtx_condition
again with debug_seen -1.

* gcc.dg/ubsan/pr107183.c: New test.


OK

jeff




[PATCH] testsuite: Add filter for target socket support

2022-11-20 Thread Dimitar Dimitrov
The new analyzer tests for sockets are failing on embedded targets.
The newlib and avr-libc C libraries do not support sockets.

At first I considered a coarse filtering on the existing
effective_target_freestanding check.  But seeing how lib/target-supports.exp
is slowly turning into a copy of autotools, I kept the tradition and added
a new fine grained "socket" filter.

I also considered adding effective_target_posix, but could not
figure out a reliable C code to perform the check.

Testing done:
  - No changes in gcc.sum for x86_64-pc-linux-gnu, with or without this
patch.
  - Filtered cases are now UNSUPPORTED instead of failing on AVR and PRU
backends.
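
For readers unfamiliar with effective-target checks: the filter boils down
to asking whether a small probe program builds for the target.  The body of
check_effective_target_sockets is not reproduced in this excerpt, so the
following is only an assumed shape of such a probe, with a hypothetical
function name.  It uses nothing beyond the POSIX socket() and close()
calls:

```c
#include <sys/socket.h>
#include <unistd.h>

/* Return 1 if a stream socket could be created, 0 otherwise.
   For an effective-target check only the fact that this compiles
   matters; the runtime result is shown for completeness.  */
static int
probe_socket (void)
{
  int fd = socket (AF_INET, SOCK_STREAM, 0);
  if (fd < 0)
    return 0;
  close (fd);
  return 1;
}
```

On a C library without sys/socket.h, such as newlib or avr-libc, the
include itself fails, and tests carrying the "sockets" requirement are
reported as UNSUPPORTED rather than FAIL.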

Ok for trunk?

gcc/ChangeLog:

* doc/sourcebuild.texi (sockets): Document new check.

gcc/testsuite/ChangeLog:

* gcc.dg/analyzer/fd-accept.c: Require sockets.
* gcc.dg/analyzer/fd-bind.c: Ditto.
* gcc.dg/analyzer/fd-connect.c: Ditto.
* gcc.dg/analyzer/fd-datagram-socket.c: Ditto.
* gcc.dg/analyzer/fd-glibc-byte-stream-connection-server.c:
Ditto.
* gcc.dg/analyzer/fd-glibc-byte-stream-socket.c: Ditto.
* gcc.dg/analyzer/fd-glibc-datagram-client.c: Ditto.
* gcc.dg/analyzer/fd-glibc-datagram-socket.c: Ditto.
* gcc.dg/analyzer/fd-listen.c: Ditto.
* gcc.dg/analyzer/fd-manpage-getaddrinfo-client.c: Ditto.
* gcc.dg/analyzer/fd-mappage-getaddrinfo-server.c: Ditto.
* gcc.dg/analyzer/fd-meaning.c: Ditto.
* gcc.dg/analyzer/fd-socket-meaning.c: Ditto.
* gcc.dg/analyzer/fd-socket-misuse.c: Ditto.
* gcc.dg/analyzer/fd-stream-socket-active-open.c: Ditto.
* gcc.dg/analyzer/fd-stream-socket-passive-open.c: Ditto.
* gcc.dg/analyzer/fd-stream-socket.c: Ditto.
* gcc.dg/analyzer/fd-symbolic-socket.c: Ditto.
* lib/target-supports.exp (check_effective_target_sockets): New
check.

Signed-off-by: Dimitar Dimitrov 
---
 gcc/doc/sourcebuild.texi   |  3 +++
 gcc/testsuite/gcc.dg/analyzer/fd-accept.c  |  2 ++
 gcc/testsuite/gcc.dg/analyzer/fd-bind.c|  2 ++
 gcc/testsuite/gcc.dg/analyzer/fd-connect.c |  2 ++
 gcc/testsuite/gcc.dg/analyzer/fd-datagram-socket.c |  2 ++
 .../fd-glibc-byte-stream-connection-server.c   |  1 +
 .../gcc.dg/analyzer/fd-glibc-byte-stream-socket.c  |  1 +
 .../gcc.dg/analyzer/fd-glibc-datagram-client.c |  1 +
 .../gcc.dg/analyzer/fd-glibc-datagram-socket.c |  1 +
 gcc/testsuite/gcc.dg/analyzer/fd-listen.c  |  2 ++
 .../analyzer/fd-manpage-getaddrinfo-client.c   |  1 +
 .../analyzer/fd-mappage-getaddrinfo-server.c   |  2 ++
 gcc/testsuite/gcc.dg/analyzer/fd-meaning.c |  2 +-
 gcc/testsuite/gcc.dg/analyzer/fd-socket-meaning.c  |  1 +
 gcc/testsuite/gcc.dg/analyzer/fd-socket-misuse.c   |  2 ++
 .../gcc.dg/analyzer/fd-stream-socket-active-open.c |  2 ++
 .../analyzer/fd-stream-socket-passive-open.c   |  2 ++
 gcc/testsuite/gcc.dg/analyzer/fd-stream-socket.c   |  2 ++
 gcc/testsuite/gcc.dg/analyzer/fd-symbolic-socket.c |  2 ++
 gcc/testsuite/lib/target-supports.exp  | 14 ++
 20 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 766266942f9..ffe69d6fcb9 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -2666,6 +2666,9 @@ Target can compile using @code{pthread.h} with no errors 
or warnings.
 @item pthread_h
 Target has @code{pthread.h}.
 
+@item sockets
+Target can compile using @code{sys/socket.h} with no errors or warnings.
+
 @item run_expensive_tests
 Expensive testcases (usually those that consume excessive amounts of CPU
 time) should be run on this target.  This can be enabled by setting the
diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-accept.c 
b/gcc/testsuite/gcc.dg/analyzer/fd-accept.c
index 36cc7af7184..5426063f31d 100644
--- a/gcc/testsuite/gcc.dg/analyzer/fd-accept.c
+++ b/gcc/testsuite/gcc.dg/analyzer/fd-accept.c
@@ -1,3 +1,5 @@
+/* { dg-require-effective-target sockets } */
+
 #include 
 #include 
 #include 
diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-bind.c 
b/gcc/testsuite/gcc.dg/analyzer/fd-bind.c
index 6f91bc4b794..c34803f1380 100644
--- a/gcc/testsuite/gcc.dg/analyzer/fd-bind.c
+++ b/gcc/testsuite/gcc.dg/analyzer/fd-bind.c
@@ -1,3 +1,5 @@
+/* { dg-require-effective-target sockets } */
+
 #include 
 #include 
 #include 
diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-connect.c 
b/gcc/testsuite/gcc.dg/analyzer/fd-connect.c
index 1ab54d01f36..7bf687e2570 100644
--- a/gcc/testsuite/gcc.dg/analyzer/fd-connect.c
+++ b/gcc/testsuite/gcc.dg/analyzer/fd-connect.c
@@ -1,3 +1,5 @@
+/* { dg-require-effective-target sockets } */
+
 #include 
 #include 
 #include 
diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-datagram-socket.c 
b/gcc/testsuite/gcc.dg/analyzer/fd-datagram-socket.c
index 045bdfa32d3..58508570a25 100644
--- a/gcc/testsuite/gcc.dg/analyzer/fd-datagram-socket.c

Re: [PATCH] RISC-V: Add RVV registers register spilling

2022-11-20 Thread Andreas Schwab
FAIL: gcc.target/riscv/rvv/base/spill-1.c (internal compiler error: in 
to_constant, at poly-int.h:504)
FAIL: gcc.target/riscv/rvv/base/spill-1.c (test for excess errors)
FAIL: gcc.target/riscv/rvv/base/spill-2.c (internal compiler error: in 
to_constant, at poly-int.h:504)
FAIL: gcc.target/riscv/rvv/base/spill-2.c (test for excess errors)
FAIL: gcc.target/riscv/rvv/base/spill-3.c (internal compiler error: in 
to_constant, at poly-int.h:504)
FAIL: gcc.target/riscv/rvv/base/spill-3.c (test for excess errors)
FAIL: gcc.target/riscv/rvv/base/spill-4.c (internal compiler error: in 
to_constant, at poly-int.h:504)
FAIL: gcc.target/riscv/rvv/base/spill-4.c (test for excess errors)
FAIL: gcc.target/riscv/rvv/base/spill-5.c (internal compiler error: in 
to_constant, at poly-int.h:504)
FAIL: gcc.target/riscv/rvv/base/spill-5.c (test for excess errors)
FAIL: gcc.target/riscv/rvv/base/spill-6.c (internal compiler error: in 
to_constant, at poly-int.h:504)
FAIL: gcc.target/riscv/rvv/base/spill-6.c (test for excess errors)
FAIL: gcc.target/riscv/rvv/base/spill-7.c (internal compiler error: in 
to_constant, at poly-int.h:504)
FAIL: gcc.target/riscv/rvv/base/spill-7.c (test for excess errors)

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


[r13-3931 Regression] FAIL: libgomp.oacc-fortran/optional-data-copyin-by-value.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable -Os (test for excess errors) on Linux/x86_64

2022-11-20 Thread haochen.jiang via Gcc-patches
On Linux/x86_64,

59a63247992eb13153b82c4902aadf111460eac2 is the first bad commit
commit 59a63247992eb13153b82c4902aadf111460eac2
Author: Harald Anlauf 
Date:   Thu Nov 10 22:30:27 2022 +0100

Fortran: fix treatment of character, value, optional dummy arguments 
[PR107444]

caused

FAIL: libgomp.fortran/use_device_ptr-optional-2.f90   -O0  (internal compiler 
error: in gfc_omp_check_optional_argument, at fortran/trans-openmp.cc:137)
FAIL: libgomp.fortran/use_device_ptr-optional-2.f90   -O0  (test for excess 
errors)
FAIL: libgomp.fortran/use_device_ptr-optional-2.f90   -O1  (internal compiler 
error: in gfc_omp_check_optional_argument, at fortran/trans-openmp.cc:137)
FAIL: libgomp.fortran/use_device_ptr-optional-2.f90   -O1  (test for excess 
errors)
FAIL: libgomp.fortran/use_device_ptr-optional-2.f90   -O2  (internal compiler 
error: in gfc_omp_check_optional_argument, at fortran/trans-openmp.cc:137)
FAIL: libgomp.fortran/use_device_ptr-optional-2.f90   -O2  (test for excess 
errors)
FAIL: libgomp.fortran/use_device_ptr-optional-2.f90   -O3 -fomit-frame-pointer 
-funroll-loops -fpeel-loops -ftracer -finline-functions  (internal compiler 
error: in gfc_omp_check_optional_argument, at fortran/trans-openmp.cc:137)
FAIL: libgomp.fortran/use_device_ptr-optional-2.f90   -O3 -fomit-frame-pointer 
-funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess 
errors)
FAIL: libgomp.fortran/use_device_ptr-optional-2.f90   -O3 -g  (internal 
compiler error: in gfc_omp_check_optional_argument, at 
fortran/trans-openmp.cc:137)
FAIL: libgomp.fortran/use_device_ptr-optional-2.f90   -O3 -g  (test for excess 
errors)
FAIL: libgomp.fortran/use_device_ptr-optional-2.f90   -Os  (internal compiler 
error: in gfc_omp_check_optional_argument, at fortran/trans-openmp.cc:137)
FAIL: libgomp.fortran/use_device_ptr-optional-2.f90   -Os  (test for excess 
errors)
FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  (internal compiler 
error: in gfc_omp_check_optional_argument, at fortran/trans-openmp.cc:137)
FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  (test for excess 
errors)
FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  (internal compiler 
error: in gfc_omp_check_optional_argument, at fortran/trans-openmp.cc:137)
FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  (test for excess 
errors)
FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  (internal compiler 
error: in gfc_omp_check_optional_argument, at fortran/trans-openmp.cc:137)
FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  (test for excess 
errors)
FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -fomit-frame-pointer 
-funroll-loops -fpeel-loops -ftracer -finline-functions  (internal compiler 
error: in gfc_omp_check_optional_argument, at fortran/trans-openmp.cc:137)
FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -fomit-frame-pointer 
-funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess 
errors)
FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  (internal 
compiler error: in gfc_omp_check_optional_argument, at 
fortran/trans-openmp.cc:137)
FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  (test for excess 
errors)
FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  (internal compiler 
error: in gfc_omp_check_optional_argument, at fortran/trans-openmp.cc:137)
FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  (test for excess 
errors)
FAIL: libgomp.oacc-fortran/optional-data-copyin-by-value.f90 
-DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O0  (internal 
compiler error: in gfc_omp_check_optional_argument, at 
fortran/trans-openmp.cc:137)
FAIL: libgomp.oacc-fortran/optional-data-copyin-by-value.f90 
-DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O0  (test for 
excess errors)
FAIL: libgomp.oacc-fortran/optional-data-copyin-by-value.f90 
-DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O1  (internal 
compiler error: in gfc_omp_check_optional_argument, at 
fortran/trans-openmp.cc:137)
FAIL: libgomp.oacc-fortran/optional-data-copyin-by-value.f90 
-DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O1  (test for 
excess errors)
FAIL: libgomp.oacc-fortran/optional-data-copyin-by-value.f90 
-DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O2  (internal 
compiler error: in gfc_omp_check_optional_argument, at 
fortran/trans-openmp.cc:137)
FAIL: libgomp.oacc-fortran/optional-data-copyin-by-value.f90 
-DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O2  (test for 
excess errors)
FAIL: libgomp.oacc-fortran/optional-data-copyin-by-value.f90 
-DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O3 
-fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  
(internal compiler error: in gfc_omp_check_optional_argument, at 
fortran/trans-openmp.cc:137)
FAIL: libgomp.oacc-fortran/optional-data-copyin-by-value.f90