Re: [PATCH] Enable GCC support for AMX

2020-07-23 Thread Hongyu Wang via Gcc-patches
PING^2

Hongyu Wang  于2020年7月17日周五 下午1:40写道:
>
> Update for SAPPHIRERAPIDS and PING
>
> Hongyu Wang  于2020年7月7日周二 上午11:24写道:
>
> >
> > Hi Kirill, could you help review this patch?
> >
> > Hongyu Wang  于2020年7月6日周一 上午9:58写道:
> > >
> > > Hi:
> > >
> > > This patch is about to support Intel Advanced Matrix Extensions (AMX)
> > > which will be enabled in GLC.
> > >
> > > AMX is a new 64-bit programming paradigm consisting of two
> > > compo nents: a set of 2-dimensional registers (tiles) representing
> > > sub-arrays from a larger 2-dimensional memory image,
> > > and an accelerator able to operate on tiles
> > >
> > > Supported instructions are
> > >
> > > AMX-TILE:ldtilecfg/sttilecfg/tileloadd/tileloaddt1/tilezero/tilerelease
> > > AMX-INT8:tdpbssd/tdpbsud/tdpbusd/tdpbuud
> > > AMX-BF16:tdpbf16ps
> > >
> > > The intrinsics adopts constant tile register number as its input 
> > > parameters.
> > >
> > > For detailed information, please refer to
> > > https://software.intel.com/content/dam/develop/public/us/en/documents/architecture-instruction-set-extensions-programming-reference.pdf
> > >
> > > Bootstrap ok, regression test on i386/x86 backend is ok.
> > >
> > > OK for master?
> > >
> > > gcc/ChangeLog
> > >
> > > * common/config/i386/i386-common.c (OPTION_MASK_ISA2_AMX_TILE_SET,
> > > OPTION_MASK_ISA2_AMX_INT8_SET, OPTION_MASK_ISA2_AMX_BF16_SET,
> > > OPTION_MASK_ISA2_AMX_TILE_UNSET,
> > > OPTION_MASK_ISA2_AMX_INT8_UNSET, OPTION_MASK_ISA2_AMX_BF16_UNSET):
> > > New marcos.
> > > (ix86_handle_option): Hanlde -mamx-tile, -mamx-int8, -mamx-bf16.
> > > * common/config/i386/i386-cpuinfo.h (processor_types): Add
> > > FEATURE_AMX_TILE, FEATURE_AMX_INT8, FEATURE_AMX_BF16.
> > > * common/config/i386/cpuinfo.h (XSTATE_TILECFG,
> > > XSTATE_TILEDATA, XCR_AMX_ENABLED_MASK): New macro.
> > > (get_available_features): Enable AMX features only if
> > > their states are suoorited by OSXSAVE.
> > > * common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY
> > > for amx-tile, amx-int8, amx-bf16.
> > > * config.gcc: Add amxtileintrin.h, amxint8intrin.h,
> > > amxbf16intrin.h to extra headers.
> > > * config/i386/amxbf16intrin.h: New file.
> > > * config/i386/amxint8intrin.h: Ditto.
> > > * config/i386/amxtileintrin.h: Ditto.
> > > * config/i386/cpuid.h (bit_AMX_BF16, bit_AMX_TILE, bit_AMX_INT8):
> > > New macro.
> > > * config/i386/i386-c.c (ix86_target_macros_internal): Define
> > > __AMX_TILE__, __AMX_INT8__, AMX_BF16__.
> > > * config/i386/i386-options.c (ix86_target_string): Add
> > > -mamx-tile, -mamx-int8, -mamx-bf16.
> > > (ix86_option_override_internal): Handle AMX-TILE,
> > > AMX-INT8, AMX-BF16.
> > > * config/i386/i386.h (TARGET_AMX_TILE, TARGET_AMX_TILE_P,
> > > TARGET_AMX_INT8, TARGET_AMX_INT8_P, TARGET_AMX_BF16_P,
> > > PTA_AMX_TILE, PTA_AMX_INT8, PTA_AMX_BF16): New macros.
> > > * config/i386/i386.opt: Add -mamx-tile, -mamx-int8, -mamx-bf16.
> > > * config/i386/immintrin.h: Include amxtileintrin.h,
> > > amxint8intrin.h, amxbf16intrin.h.
> > > * doc/invoke.texi: Document -mamx-tile, -mamx-int8, -mamx-bf16.
> > > * doc/extend.texi: Document amx-tile, amx-int8, amx-bf16.
> > > * doc/sourcebuild.texi ((Effective-Target Keywords, Other
> > > hardware attributes): Document amx_int8, amx_tile, amx_bf16.
> > >
> > > gcc/testsuite/ChangeLog
> > >
> > > * lib/target-supports.exp (check_effective_target_amx_tile,
> > > check_effective_target_amx_int8,
> > > check_effective_target_amx_bf16): New proc.
> > > * g++.dg/other/i386-2.C: Add -mamx-tile, -mamx-int8, -mamx-bf16.
> > > * g++.dg/other/i386-3.C: Ditto.
> > > * gcc.target/i386/sse-12.c: Ditto.
> > > * gcc.target/i386/sse-13.c: Ditto.
> > > * gcc.target/i386/sse-14.c: Ditto.
> > > * gcc.target/i386/sse-22.c: Ditto.
> > > * gcc.target/i386/sse-23.c: Ditto.
> > > * gcc.target/i386/funcspec-56.inc: Add new target attribute.
> > > * gcc.target/i386/amxbf16-asmatt-1.c: New test.
> > > * gcc.target/i386/amxint8-asmatt-1.c: Ditto.
> > > * gcc.target/i386/amxtile-asmatt-1.c: Ditto.
> > > * gcc.target/i386/amxbf16-asmintel-1.c: Ditto.
> > > * gcc.target/i386/amxint8-asmintel-1.c: Ditto.
> > > * gcc.target/i386/amxtile-asmintel-1.c: Ditto.
> > > * gcc.target/i386/amxbf16-asmatt-2.c: Ditto.
> > > * gcc.target/i386/amxint8-asmatt-2.c: Ditto.
> > > * gcc.target/i386/amxtile-asmatt-2.c: Ditto.
> > > * gcc.target/i386/amxbf16-asmintel-2.c: Ditto.
> > > * gcc.target/i386/amxint8-asmintel-2.c: Ditto.
> > > * gcc.target/i386/amxtile-asmintel-2.c: Ditto.


[PATCH v3] genemit.c (main): split insn-emit.c for compiling parallelly

2020-07-23 Thread Jojo R
gcc/ChangeLog:

* genemit.c (main): Print 'split line'.
* Makefile.in (insn-emit.c): Define split count and file

---
 gcc/Makefile.in | 10 ++
 gcc/genemit.c   | 87 -
 2 files changed, 59 insertions(+), 38 deletions(-)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 2ba76656dbf..75841e49127 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1253,6 +1253,13 @@ ANALYZER_OBJS = \
 # We put the *-match.o and insn-*.o files first so that a parallel make
 # will build them sooner, because they are large and otherwise tend to be
 # the last objects to finish building.
+
+insn-generated-split-num = $(shell grep -c ^processor /proc/cpuinfo)
+
+insn-emit-split-c = $(foreach o, $(shell for i in 
{1..$(insn-generated-split-num)}; do echo $$i; done), insn-emit$(o).c)
+insn-emit-split-obj = $(patsubst %.c,%.o, $(insn-emit-split-c))
+$(insn-emit-split-c): insn-emit.c
+
 OBJS = \
gimple-match.o \
generic-match.o \
@@ -1260,6 +1267,7 @@ OBJS = \
insn-automata.o \
insn-dfatab.o \
insn-emit.o \
+   $(insn-emit-split-obj) \
insn-extract.o \
insn-latencytab.o \
insn-modes.o \
@@ -2367,6 +2375,8 @@ $(simple_generated_c:insn-%.c=s-%): s-%: 
build/gen%$(build_exeext)
$(RUN_GEN) build/gen$*$(build_exeext) $(md_file) \
  $(filter insn-conditions.md,$^) > tmp-$*.c
$(SHELL) $(srcdir)/../move-if-change tmp-$*.c insn-$*.c
+   -csplit insn-$*.c /parallel\ compilation/ -k -s 
{$(insn-generated-split-num)} -f insn-$* -b "%d.c"
+   -( [ ! -s insn-$*0.c ] && for i in {1..$(insn-generated-split-num)}; do 
touch insn-$*$$i.c; done && echo "" > insn-$*.c)
$(STAMP) s-$*
 
 # gencheck doesn't read the machine description, and the file produced
diff --git a/gcc/genemit.c b/gcc/genemit.c
index 84d07d388ee..3aaaeb62b0a 100644
--- a/gcc/genemit.c
+++ b/gcc/genemit.c
@@ -847,6 +847,46 @@ handle_overloaded_gen (overloaded_name *oname)
 }
 }
 
+#define printf_include() do { \
+  printf ("/* Generated automatically by the program `genemit'\n\
+from the machine description file `md'.  */\n\n");  \
+  printf ("#define IN_TARGET_CODE 1\n");\
+  printf ("#include \"config.h\"\n");   \
+  printf ("#include \"system.h\"\n");   \
+  printf ("#include \"coretypes.h\"\n");\
+  printf ("#include \"backend.h\"\n");  \
+  printf ("#include \"predict.h\"\n");  \
+  printf ("#include \"tree.h\"\n"); \
+  printf ("#include \"rtl.h\"\n");  \
+  printf ("#include \"alias.h\"\n");\
+  printf ("#include \"varasm.h\"\n");   \
+  printf ("#include \"stor-layout.h\"\n");  \
+  printf ("#include \"calls.h\"\n");\
+  printf ("#include \"memmodel.h\"\n"); \
+  printf ("#include \"tm_p.h\"\n"); \
+  printf ("#include \"flags.h\"\n");\
+  printf ("#include \"insn-config.h\"\n");  \
+  printf ("#include \"expmed.h\"\n");   \
+  printf ("#include \"dojump.h\"\n");   \
+  printf ("#include \"explow.h\"\n");   \
+  printf ("#include \"emit-rtl.h\"\n"); \
+  printf ("#include \"stmt.h\"\n"); \
+  printf ("#include \"expr.h\"\n"); \
+  printf ("#include \"insn-codes.h\"\n");   \
+  printf ("#include \"optabs.h\"\n");   \
+  printf ("#include \"dfp.h\"\n");  \
+  printf ("#include \"output.h\"\n");   \
+  printf ("#include \"recog.h\"\n");\
+  printf ("#include \"df.h\"\n");   \
+  printf ("#include \"resource.h\"\n"); \
+  printf ("#include \"reload.h\"\n");   \
+  printf ("#include \"diagnostic-core.h\"\n");  \
+  printf ("#include \"regs.h\"\n"); \
+  printf ("#include \"tm-constrs.h\"\n");   \
+  printf ("#include \"ggc.h\"\n");  \
+  printf ("#include \"target.h\"\n\n"); \
+} while (0)
+
 int
 main (int argc, const char **argv)
 {
@@ -862,49 +902,19 @@ main (int argc, const char **argv)
   /* Assign sequential codes to all entries in the machine description
  in parallel with the tables in insn-output.c.  */
 
-  printf ("/* Generated automatically by the program `genemit'\n\
-from the machine description file `md'.  */\n\n");
-
-  printf ("#define IN_TARGET_CODE 1\n");
-  printf ("#include \"config.h\"\n");
-  printf ("#include \"system.h\"\n");
-  printf ("#include \"coretypes.h\"\n");
-  printf ("#include \"backend.h\"\n");
-  printf ("#include \"predict.h\"\n");
-  printf ("#include \"tree.h\"\n");
-  printf ("#include \"rtl.h\"\n");
-  printf ("#include \"alias.h\"\n");
-  printf ("#include 

Re: [PATCH v2] genemit.c (main): split insn-emit.c for compiling parallelly

2020-07-23 Thread Jojo R
在 2020年7月21日 +0800 PM2:55,Bin.Cheng ,写道:
> On Tue, Jul 21, 2020 at 11:14 AM Jojo  wrote:
> >
> > gcc/ChangeLog:
> >
> > * genemit.c (main): Print 'split line'.
> > * Makefile.in (insn-emit.c): Define split count and file
> >
>
> Thanks for working one this, following comments are based on the
> assumption that the approach is feasible after your investigation.
>
> It's great to accelerate compilation time, do you have any number
> showing how much this can achieve, on a typical machine with
> reasonable parallelism?
>

Here are time collections from machine:

20 Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz,

Size of insn-emit.c (du -sh insn-emit.c) : 121M

Not split (same like j1) :  195 second

Split 20, -j1 : 213 second

Split 20, -j4: 76 second

Split 20, -j8: 44 second

Split 20, -j12: 44 second

Split 20, -j16: 44 second

Split 20, -j20: 44 second

Also, litter RAM will be consumed by litter code :)
> > ---
> > gcc/Makefile.in | 10 ++
> > gcc/genemit.c | 86 +++--
> > 2 files changed, 58 insertions(+), 38 deletions(-)
> >
> > diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> > index 2ba76656dbf..f805050a119 100644
> > --- a/gcc/Makefile.in
> > +++ b/gcc/Makefile.in
> > @@ -1253,6 +1253,13 @@ ANALYZER_OBJS = \
> > # We put the *-match.o and insn-*.o files first so that a parallel make
> > # will build them sooner, because they are large and otherwise tend to be
> > # the last objects to finish building.
> > +
> > +insn-generated-split-num = 15
> Hardcode number 15 looks strange here, how shall we know it's enough?
> Or one step further, can we use some kind of general match "*" for
> writing make rules here?

I have researched the detail of Makefile rules, As far as I know, ‘Make’ 
analyze the dependent objects before
some object running it’s shell command, so we need to static define the 
‘insn-emit.c’ counts for creating ‘insn-emit.o’.

We can define split counts from processor numbers from machine, it should be 
nice ?
It’s fixed in v3 patch.
> > +
> > +insn-emit-split-c = $(foreach o, $(shell for i in 
> > {1..$(insn-generated-split-num)}; do echo $$i; done), insn-emit$(o).c)
> > +insn-emit-split-obj = $(patsubst %.c,%.o, $(insn-emit-split-c))
> > +$(insn-emit-split-c): insn-emit.c
> > +
> > OBJS = \
> > gimple-match.o \
> > generic-match.o \
> > @@ -1260,6 +1267,7 @@ OBJS = \
> > insn-automata.o \
> > insn-dfatab.o \
> > insn-emit.o \
> > + $(insn-emit-split-obj) \
> > insn-extract.o \
> > insn-latencytab.o \
> > insn-modes.o \
> > @@ -2367,6 +2375,8 @@ $(simple_generated_c:insn-%.c=s-%): s-%: 
> > build/gen%$(build_exeext)
> > $(RUN_GEN) build/gen$*$(build_exeext) $(md_file) \
> > $(filter insn-conditions.md,$^) > tmp-$*.c
> > $(SHELL) $(srcdir)/../move-if-change tmp-$*.c insn-$*.c
> > + -csplit insn-$*.c /i\ am\ split\ line/ -k -s 
> > {$(insn-generated-split-num)} -f insn-$* -b "%d.c"
> > + -( [ ! -s insn-$*0.c ] && for i in {1..$(insn-generated-split-num)}; do 
> > touch insn-$*$$i.c; done && echo "" > insn-$*.c)
> Not sure if this is the first time that csplit/coreutils is used,
> shall we mention it here
> https://gcc.gnu.org/install/prerequisites.html, even check it in
> configure?

I grep coreutils from gcc source code and much more modules have checked the 
coreutils
like libcc1, libssp and so on.
> > $(STAMP) s-$*
> >
> > # gencheck doesn't read the machine description, and the file produced
> > diff --git a/gcc/genemit.c b/gcc/genemit.c
> > index 84d07d388ee..fd60cdeeb96 100644
> > --- a/gcc/genemit.c
> > +++ b/gcc/genemit.c
> > @@ -847,6 +847,45 @@ handle_overloaded_gen (overloaded_name *oname)
> > }
> > }
> >
> > +#define printf_include() \
> > + printf ("/* Generated automatically by the program `genemit'\n\
> > +from the machine description file `md'. */\n\n"); \
> > + printf ("#define IN_TARGET_CODE 1\n"); \
> > + printf ("#include \"config.h\"\n"); \
> > + printf ("#include \"system.h\"\n"); \
> > + printf ("#include \"coretypes.h\"\n"); \
> > + printf ("#include \"backend.h\"\n"); \
> > + printf ("#include \"predict.h\"\n"); \
> > + printf ("#include \"tree.h\"\n"); \
> > + printf ("#include \"rtl.h\"\n"); \
> > + printf ("#include \"alias.h\"\n"); \
> > + printf ("#include \"varasm.h\"\n"); \
> > + printf ("#include \"stor-layout.h\"\n"); \
> > + printf ("#include \"calls.h\"\n"); \
> > + printf ("#include \"memmodel.h\"\n"); \
> > + printf ("#include \"tm_p.h\"\n"); \
> > + printf ("#include \"flags.h\"\n"); \
> > + printf ("#include \"insn-config.h\"\n"); \
> > + printf ("#include \"expmed.h\"\n"); \
> > + printf ("#include \"dojump.h\"\n"); \
> > + printf ("#include \"explow.h\"\n"); \
> > + printf ("#include \"emit-rtl.h\"\n"); \
> > + printf ("#include \"stmt.h\"\n"); \
> > + printf ("#include \"expr.h\"\n"); \
> > + printf ("#include \"insn-codes.h\"\n"); \
> > + printf ("#include \"optabs.h\"\n"); \
> > + printf ("#include \"dfp.h\"\n"); \
> > + printf ("#include \"output.h\"\n"); \
> > + printf ("#include \"recog.h\"\n"); \
> > 

Re: libgo patch committed: Update to Go 1.14.6 release

2020-07-23 Thread Ian Lance Taylor via Gcc-patches
On Fri, Jul 17, 2020 at 2:28 PM Ian Lance Taylor  wrote:
>
> This patch to libgo updates it to the Go 1.14.6 release.  Bootstrapped
> and ran Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

I have now committed this to the GCC 10 branch as well.

Ian


Re: [PATCH] [AVX512] [PR87767] Optimize memory broadcast for constant vector under AVX512

2020-07-23 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 23, 2020 at 9:53 PM Hongtao Liu  wrote:
>
> On Thu, Jul 23, 2020 at 4:39 PM Jan Hubicka  wrote:
> >
> > Hello,
> > sorry for taking so long to get to this.
> > > diff --git a/gcc/config/i386/i386-features.c 
> > > b/gcc/config/i386/i386-features.c
> > > index 535fc7e981d..8f81d101382 100644
> > > --- a/gcc/config/i386/i386-features.c
> > > +++ b/gcc/config/i386/i386-features.c
> > > @@ -2379,6 +2379,152 @@ make_pass_remove_partial_avx_dependency 
> > > (gcc::context *ctxt)
> > >return new pass_remove_partial_avx_dependency (ctxt);
> > >  }
> > >
> > > +/* Replace all one-value const vector that are referenced by SYMBOL_REFs 
> > > in x
> > > +   with embedded broadcast. i.e.transform
> > > +
> > > + vpaddq .LC0(%rip), %zmm0, %zmm0
> > > + ret
> > > +  .LC0:
> > > +.quad 3
> > > +.quad 3
> > > +.quad 3
> > > +.quad 3
> > > +.quad 3
> > > +.quad 3
> > > +.quad 3
> > > +.quad 3
> > > +
> > > +to
> > > +
> > > + vpaddq .LC0(%rip){1to8}, %zmm0, %zmm0
> >
> > It seems to me that having a special purpose pass for this is bit
> > overzelaous.  It seems to me that you can do same pattern matching via
> > splitter and fit it into the usual insn splitting pass?
> >
>
> From an implementation perspective, there could be lots of work, since
> memory embedding broadcast is available for nearly every instruction
> in AVX512. And for new added AVX512 instructions, we also need to add
> a define_split for them.
>

I'll add more tests to show my point.

> > Honza
> > > + ret
> > > +  .LC0:
> > > +.quad 3  */
> > > +static void
> > > +replace_constant_pool_with_broadcast (rtx_insn* insn)
> > > +{
> > > +  subrtx_ptr_iterator::array_type array;
> > > +  FOR_EACH_SUBRTX_PTR (iter, array,  (insn), ALL)
> > > +{
> > > +  rtx *loc = *iter;
> > > +  rtx x = *loc;
> > > +  rtx broadcast_mem, vec_dup, constant, first;
> > > +  machine_mode mode;
> > > +  if (GET_CODE (x) != MEM
> > > +   || GET_CODE (XEXP (x, 0)) != SYMBOL_REF
> > > +   || !CONSTANT_POOL_ADDRESS_P (XEXP (x, 0)))
> > > + continue;
> > > +
> > > +  mode = GET_MODE (x);
> > > +  if (!VECTOR_MODE_P (mode))
> > > + return;
> > > +
> > > +  constant = get_pool_constant (XEXP (x, 0));
> > > +  first = XVECEXP (constant, 0, 0);
> > > +  /* There could be some rtx like
> > > +  (mem/u/c:V16QI (symbol_ref/u:DI ("*.LC1")))
> > > +  but with "*.LC1" refer to V2DI constant vector.  */
> > > +  if (GET_MODE (constant) != mode)
> > > + return;
> > > +
> > > +  for (int i = 1; i < GET_MODE_NUNITS (mode); ++i)
> > > + {
> > > +   rtx tmp = XVECEXP (constant, 0, i);
> > > +   /* Only handle one-value const vector.  */
> > > +   if (!rtx_equal_p (tmp, first))
> > > + return;
> > > + }
> > > +
> > > +  broadcast_mem = force_const_mem (GET_MODE_INNER (mode), first);
> > > +  vec_dup = gen_rtx_VEC_DUPLICATE (mode, broadcast_mem);
> > > +  *loc = vec_dup;
> > > +  INSN_CODE (insn) = -1;
> > > +  /* Revert change if there's no corresponding pattern.  */
> > > +  if (recog_memoized (insn) < 0)
> > > + {
> > > +   *loc = x;
> > > +   recog_memoized (insn);
> > > + }
> > > +  /* At most 1 memory_operand in an insn.  */
> > > +  return;
> > > +}
> > > +}
> > > +
> > > +/* For const vector having one duplicated value, there's no need to put
> > > +   whole vector in the constant pool when target supports embedded 
> > > broadcast. */
> > > +static unsigned int
> > > +constant_pool_broadcast (void)
> > > +{
> > > +  timevar_push (TV_MACH_DEP);
> > > +  rtx_insn *insn;
> > > +
> > > +  for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
> > > +{
> > > +  if (!INSN_P (insn))
> > > + continue;
> > > +
> > > +  /* Insns may appear inside a SEQUENCE.  Only check the patterns of
> > > +  insns, not any notes that may be attached.  We don't want to mark
> > > +  a constant just because it happens to appear in a REG_EQUIV note.  
> > > */
> > > +  if (rtx_sequence *seq = dyn_cast  (PATTERN (insn)))
> > > + {
> > > +   int i, n = seq->len ();
> > > +   for (i = 0; i < n; ++i)
> > > + {
> > > +   rtx subinsn = seq->element (i);
> > > +   if (INSN_P (subinsn))
> > > + replace_constant_pool_with_broadcast (dyn_cast  
> > > (subinsn));
> > > + }
> > > + }
> > > +  else
> > > + replace_constant_pool_with_broadcast (insn);
> > > +}
> > > +  timevar_pop (TV_MACH_DEP);
> > > +  return 0;
> > > +}
> > > +
> > > +namespace {
> > > +
> > > +const pass_data pass_data_constant_pool_broadcast =
> > > +{
> > > +  RTL_PASS, /* type */
> > > +  "cpb", /* name */
> > > +  OPTGROUP_NONE, /* optinfo_flags */
> > > +  TV_MACH_DEP, /* tv_id */
> > > +  0, /* properties_required */
> > > +  0, /* properties_provided */
> > > +  0, /* properties_destroyed */
> > > 

Re: committed] correct memcmp expansion of constant representations containing embedded nuls (PR 95189)

2020-07-23 Thread H.J. Lu via Gcc-patches
On Thu, Jul 23, 2020 at 4:50 PM Martin Sebor  wrote:
>
> On 7/23/20 2:18 PM, H.J. Lu wrote:
> > On Thu, Jul 23, 2020 at 1:14 PM Martin Sebor via Gcc-patches
> >  wrote:
> >>
> >> On 7/22/20 2:23 AM, Rainer Orth wrote:
> >>> Hi Martin,
> >>>
>  I have committed this change in r11-2231 after Jeff approved it
>  off list last Thursday.
> >>>
> >>> the new gcc.target/i386/memcpy-pr95886.c test FAILs on 32-bit x86
> >>> (i386-pc-solaris2.11):
> >>>
> >>> +FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand 
> >>> "const_int 1976943448883713" 1
> >>> +FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand 
> >>> "const_int 576467370915332609" 1
> >>> +FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand 
> >>> "const_int 578431098682540545" 1
> >>> +FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand 
> >>> "const_int 578437695685198337" 1
> >>> +FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand 
> >>> "const_int 578437695685198337" 1
> >>> +FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand 
> >>> "const_int 578437695752110593" 1
> >>> +FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand 
> >>> "const_int 578437695752306689" 1
> >>> +FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand 
> >>> "const_int 578437695752307200" 1
> >>> +FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand 
> >>> "const_int 578437695752307201" 2
> >>
> >> Thanks for letting me know.  The test looks for patterns that are
> >> apparently LP64-specific so I restricted it to just that data model.
> >>
> >
> > Shouldn't it also work for x32?
>
> I would expect the optimization to work on any target that does
> the piecemeal copy, including x32.  But you probably meant if
> the test should pass as is on x32.  Possibly yes, but I didn't
> check.
>
> A better test would exercise the solution at least on all three
> i386 targets.  It could probably be done fairly simply by reducing
> the sizes of the arrays, or by using the large arrays only on LP64
> or whatever makes the difference and figuring out the right magic
> target selector to use in the dg- directives.

Here is the patch I am checking in.

-- 
H.J.
From 23daf7386b07c78b79223a00a85c8b002a5a67e3 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Thu, 23 Jul 2020 19:14:06 -0700
Subject: [PATCH] Restrict PR middle-end/95886 x86 test to !ia32

Since gcc.target/i386/memcpy-pr95886.c requires 64-bit register, restrict
it to !ia32.

	PR middle-end/95886
	* gcc.target/i386/memcpy-pr95886.c: Restrict test to !ia32.
---
 gcc/testsuite/gcc.target/i386/memcpy-pr95886.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/i386/memcpy-pr95886.c b/gcc/testsuite/gcc.target/i386/memcpy-pr95886.c
index ddfdfd2bc78..0699146296a 100644
--- a/gcc/testsuite/gcc.target/i386/memcpy-pr95886.c
+++ b/gcc/testsuite/gcc.target/i386/memcpy-pr95886.c
@@ -1,7 +1,6 @@
 /* PR middle-end/95886 - suboptimal memcpy with embedded zero bytes
-   { dg-do compile }
-   { dg-options "-O2 -Wall -fdump-rtl-expand" }
-   { dg-require-effective-target lp64 } */
+   { dg-do compile { target { ! ia32 } } }
+   { dg-options "-O2 -Wall -fdump-rtl-expand" } */
 
 const char a1234567890[10] = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
 
-- 
2.26.2



[PATCH] rs6000: ICE in unrecognizable insn when using -mpower10

2020-07-23 Thread Peter Bergner via Gcc-patches
We get an ICE when using -mpower10 and a -mcpu= value that is older
than power10.  The -mpower10 option requires -mcpu=power10 or later.
The following patch enforces that.

This passed bootstrap and regtesting with no errors.  Ok for trunk?

GCC 10 does not ICE on this test case, so I am not asking for a backport.

Peter


gcc/
PR target/95907
* config/rs6000/rs6000.c (rs6000_option_override_internal): Add check
to require -mcpu=power10 if using -mpower10.

gcc/testsuite/
PR target/95907
* g++.target/powerpc/pr95907.C: New test.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 6bea544d26a..96241a9d0a3 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4094,6 +4094,14 @@ rs6000_option_override_internal (bool global_init_p)
   rs6000_isa_flags &= ~OPTION_MASK_FLOAT128_HW;
 }
 
+  /* If the user explicitly uses -mpower10, ensure our ISA flags are
+ compatible with it.  */
+  if (TARGET_POWER10
+  && (rs6000_isa_flags_explicit & OPTION_MASK_POWER10) != 0
+  && (processor_target_table[cpu_index].target_enable
+ & OPTION_MASK_POWER10) == 0)
+error ("%qs requires %qs", "-mpower10", "-mcpu=power10");
+
   /* Enable -mprefixed by default on power10 systems.  */
   if (TARGET_POWER10 && (rs6000_isa_flags_explicit & OPTION_MASK_PREFIXED) == 
0)
 rs6000_isa_flags |= OPTION_MASK_PREFIXED;
diff --git a/gcc/testsuite/g++.target/powerpc/pr95907.C 
b/gcc/testsuite/g++.target/powerpc/pr95907.C
new file mode 100644
index 000..45d276f25a7
--- /dev/null
+++ b/gcc/testsuite/g++.target/powerpc/pr95907.C
@@ -0,0 +1,7 @@
+// PR target/95907
+// { dg-do compile }
+// { dg-require-effective-target power10_ok }
+// { dg-options "-mpower10" }
+
+int foobool_argc;
+bool foobool() { return foobool_argc; }


Re: committed] correct memcmp expansion of constant representations containing embedded nuls (PR 95189)

2020-07-23 Thread Martin Sebor via Gcc-patches

On 7/23/20 2:18 PM, H.J. Lu wrote:

On Thu, Jul 23, 2020 at 1:14 PM Martin Sebor via Gcc-patches
 wrote:


On 7/22/20 2:23 AM, Rainer Orth wrote:

Hi Martin,


I have committed this change in r11-2231 after Jeff approved it
off list last Thursday.


the new gcc.target/i386/memcpy-pr95886.c test FAILs on 32-bit x86
(i386-pc-solaris2.11):

+FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand "const_int 
1976943448883713" 1
+FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand "const_int 
576467370915332609" 1
+FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand "const_int 
578431098682540545" 1
+FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand "const_int 
578437695685198337" 1
+FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand "const_int 
578437695685198337" 1
+FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand "const_int 
578437695752110593" 1
+FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand "const_int 
578437695752306689" 1
+FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand "const_int 
578437695752307200" 1
+FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand "const_int 
578437695752307201" 2


Thanks for letting me know.  The test looks for patterns that are
apparently LP64-specific so I restricted it to just that data model.



Shouldn't it also work for x32?


I would expect the optimization to work on any target that does
the piecemeal copy, including x32.  But you probably meant if
the test should pass as is on x32.  Possibly yes, but I didn't
check.

A better test would exercise the solution at least on all three
i386 targets.  It could probably be done fairly simply by reducing
the sizes of the arrays, or by using the large arrays only on LP64
or whatever makes the difference and figuring out the right magic
target selector to use in the dg- directives.

Martin


Re: [PATCH][RFC] diagnostics: Add support for Unicode drawing characters

2020-07-23 Thread Lewis Hyatt via Gcc-patches
On Thu, Jul 23, 2020 at 05:47:28PM -0400, David Malcolm wrote:
> On Thu, 2020-07-23 at 12:28 -0400, Lewis Hyatt via Gcc-patches wrote:
> > Hello-
> > 
> > The attached patch is complete including docs, but I tagged as RFC
> > because I am not sure if anyone will like it, or if the general
> > reaction may
> > be closer to recoiling in horror :). Would appreciate your thoughts,
> > please...
> 
> Thanks for working on this.  I'm interested in other people's thoughts
> on this.  Various comments inline throughout below.
>

Thanks for the feedback! I made a few replies below as well.

> Currently, if a UTF-8 locale is detected, GCC changes the quote
> > characters
> > it outputs in diagnostics to Unicode directional quotes. I feel like
> > this is
> > a nice touch, so I was wondering whether GCC shouldn't do more along
> > these
> > lines. This patch adds support for using Unicode line drawing
> > characters and
> > similar things when outputting diagnostics. There is a new option
> > -fdiagnostics-unicode-drawing=[auto|never|always] to control it,
> > which
> > defaults to auto. "auto" will enable the feature under the same
> > circumstances that Unicode quotes get output, namely when the locale
> > is
> > determined by gcc_init_libintl() to support UTF-8. (The new option
> > does not
> > affect Unicode quote characters, which currently are not configurable
> > and
> > are determined solely by the locale.)
> 
> FWIW when I first started experimenting with location ranges back in
> 2015 my first patches had box-drawing characters for underlines; you
> can see this in some of the early examples here (and similar URLs from
> around then):
>
> https://dmalcolm.fedorapeople.org/gcc/2015-08-18/plugin.html
>   (this also has a different approach for labeling ranges, which I
> called "captions", putting them in a right margin)
> 
> https://dmalcolm.fedorapeople.org/gcc/2015-08-19/diagnostic-test-string-literals-1.html
> 
> https://dmalcolm.fedorapeople.org/gcc/2015-08-26/tree-expression-ranges.html
> 
> etc; the patch kits were:
> 
> https://gcc.gnu.org/legacy-ml/gcc-patches/2015-03/msg00837.html
> https://gcc.gnu.org/pipermail/gcc-patches/2015-September/428036.html
> https://gcc.gnu.org/legacy-ml/gcc-patches/2015-09/msg01696.html
> 
> In:
>   https://gcc.gnu.org/legacy-ml/gcc-patches/2015-09/msg01700.html
> I wrote:
> > * Eliminated UTF-8/box-drawing and captions.  Captions were cute but
> >   weren't "fully baked".  Without them, box-drawing isn't really
> >   needed, and I think I prefer the ASCII look, with the actual
> >   "caret" character, and '~' makes it easier to count characters
> >   compared to a box-drawing line, in my terminal's font, at least.
> >   Doing so greatly simplifies the new locus-printing code.
> 
> So I dropped the UTF-8 box drawing from that original kit for:
> (a) simplicity (the original patch kit was huge in scope, covering a
> bunch of ideas for diagnostics - ranges, labeling, fix-it hints,
> spelling suggestions, so I wanted to reduce the scope to something
> manageable)
> (b) I found it easier to count characters with "~"
>

Oh interesting, sorry I didn't realize this had already been considered. Well
perhaps it's a good time to revisit it anyway, if people find it appealing.

> 
> The thing I'm most nervous about with this patch is the potential for
> introducing mojibake when people copy and paste GCC output.
> 
> For example, looking at:
> https://gcc.gnu.org/legacy-ml/gcc-patches/2015-03/msg00837.html
> I see mojibake where the unicode line-drawing characters in my email
> are being displayed in the HTML mailing list archive via "" -
> something has gone wrong with encoding somewhere between the copy
> from my terminal, the email, and the list archive.
> 
> That said, looking at your email in the archive here:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550551.html
> I don't see any mojibake.
> 
> What happens if GCC's stderr is piped into "less"?
> What happens if GCC's stderr is saved in a build.log file, uploaded
> somewhere, and then viewed?
> etc.
>

pipe to less should presumably be no problem, unless user goes out of their
way to use different locale settings there. Transporting the data is certainly
a potential area of difficulty, as is also interaction with IDEs and other
tools that want to parse the data, but such tools are at least already
handling the UTF-8 quote characters OK.

By the way, I was wondering separately what you think about adding an option
like -fplain-diagnostics or something, which would achieve basically the same
thing you get in the testsuite right now (-fno-diagnostics-show-caret
-fno-diagnostics-show-line-numbers -fdiagnostics-color=never
-fdiagnostics-urls=never) but would change as necessary whenever diagnostics
evolve. It seems rather involved currently to add a new option like
-fdiagnostics-unicode-drawing but keep the testsuite working, in addition to
adding to prune.exp and to the libstdc++.exp, you also need to update the

Re: [PATCH 2/2] Aarch64: Add branch diluter pass

2020-07-23 Thread Segher Boessenkool
On Wed, Jul 22, 2020 at 09:45:08PM +0200, Andrea Corallo wrote:
> > Should that actually be a sliding window, or should there actually just
> > not be more than N branches per aligned block of machine code?  Like,
> > per fetch group.
> >
> > Can you not use ASM_OUTPUT_ALIGN_WITH_NOP (or ASM_OUTPUT_MAX_SKIP_ALIGN
> > even) then?  GCC has infrastructure for that, already.
> 
> Correct, it's a sliding window only because the real load address is not
> known to the compiler and the algorithm is conservative.  I believe we
> could use ASM_OUTPUT_ALIGN_WITH_NOP if we align each function to (al
> least) the granule size, then we should be able to insert 'nop aligned
> labels' precisely.

Yeah, we have similar issues on Power...  Our "granule" (fetch group
size, in our terminology) is 32 typically, but we align functions to
just 16.  This is causing some problems, but aligning to bigger
boundaries isn't a very happy alternative either.  WIP...

(We don't have this exact same problem, because our non-ancient cores
can just predict *all* branches in the same cycle).

> My main fear is that given new cores tend to have big granules code size
> would blow.  One advantage of the implemented algorithm is that even if
> slightly conservative it's impacting code size only where an high branch
> density shows up.

What is "big granules" for you?


Segher


Re: [PATCH][RFC] diagnostics: Add support for Unicode drawing characters

2020-07-23 Thread David Malcolm via Gcc-patches
On Thu, 2020-07-23 at 12:28 -0400, Lewis Hyatt via Gcc-patches wrote:
> Hello-
> 
> The attached patch is complete including docs, but I tagged as RFC
> because I am not sure if anyone will like it, or if the general
> reaction may
> be closer to recoiling in horror :). Would appreciate your thoughts,
> please...

Thanks for working on this.  I'm interested in other people's thoughts
on this.  Various comments inline throughout below.

Currently, if a UTF-8 locale is detected, GCC changes the quote
> characters
> it outputs in diagnostics to Unicode directional quotes. I feel like
> this is
> a nice touch, so I was wondering whether GCC shouldn't do more along
> these
> lines. This patch adds support for using Unicode line drawing
> characters and
> similar things when outputting diagnostics. There is a new option
> -fdiagnostics-unicode-drawing=[auto|never|always] to control it,
> which
> defaults to auto. "auto" will enable the feature under the same
> circumstances that Unicode quotes get output, namely when the locale
> is
> determined by gcc_init_libintl() to support UTF-8. (The new option
> does not
> affect Unicode quote characters, which currently are not configurable
> and
> are determined solely by the locale.)

FWIW when I first started experimenting with location ranges back in
2015 my first patches had box-drawing characters for underlines; you
can see this in some of the early examples here (and similar URLs from
around then):

https://dmalcolm.fedorapeople.org/gcc/2015-08-18/plugin.html
  (this also has a different approach for labeling ranges, which I
called "captions", putting them in a right margin)

https://dmalcolm.fedorapeople.org/gcc/2015-08-19/diagnostic-test-string-literals-1.html

https://dmalcolm.fedorapeople.org/gcc/2015-08-26/tree-expression-ranges.html

etc; the patch kits were:

https://gcc.gnu.org/legacy-ml/gcc-patches/2015-03/msg00837.html
https://gcc.gnu.org/pipermail/gcc-patches/2015-September/428036.html
https://gcc.gnu.org/legacy-ml/gcc-patches/2015-09/msg01696.html

In:
  https://gcc.gnu.org/legacy-ml/gcc-patches/2015-09/msg01700.html
I wrote:
> * Eliminated UTF-8/box-drawing and captions.  Captions were cute but
>   weren't "fully baked".  Without them, box-drawing isn't really
>   needed, and I think I prefer the ASCII look, with the actual
>   "caret" character, and '~' makes it easier to count characters
>   compared to a box-drawing line, in my terminal's font, at least.
>   Doing so greatly simplifies the new locus-printing code.

So I dropped the UTF-8 box drawing from that original kit for:
(a) simplicity (the original patch kit was huge in scope, covering a
bunch of ideas for diagnostics - ranges, labeling, fix-it hints,
spelling suggestions, so I wanted to reduce the scope to something
manageable)
(b) I found it easier to count characters with "~"


The thing I'm most nervous about with this patch is the potential for
introducing mojibake when people copy and paste GCC output.

For example, looking at:
https://gcc.gnu.org/legacy-ml/gcc-patches/2015-03/msg00837.html
I see mojibake where the unicode line-drawing characters in my email
are being displayed in the HTML mailing list archive via "" -
something has gone wrong with encoding somewhere between the copy
from my terminal, the email, and the list archive.

That said, looking at your email in the archive here:
https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550551.html
I don't see any mojibake.

What happens if GCC's stderr is piped into "less"?
What happens if GCC's stderr is saved in a build.log file, uploaded
somewhere, and then viewed?
etc.


> The elements implemented are:
> 
> * Vertical lines, e.g. those indicating labels and those
> separating the
>   source lines from the line numbers, are changed to line drawing
>   characters.
> 
> * The diagnostic paths output by the static analyzer make use of
> line
>   drawing characters to output smooth corners etc.
> 
> * The squiggly underline ~ used to highlight source locations
> is
>   changed to a double underline ═. The main reason for this
> is that
>   it enables a seamless "tee" character to connect the underline
> to a
>   label line if one exists.
> 
> * Carets (^) are changed to a slightly different character (∧). I
> think
>   the new one is a little nicer looking, although probably not
> worth the
>   trouble on its own. I wanted to implement the support in this
> patch
>   beause carets are harder to change than the rest of the
> elements
>   (front ends have an interface to override them, which currently
>   Fortran makes use of), so I thought it worthwhile to get this
> logic in
>   place, so that it can easily be changed to a more superior
> character
>   in the future if one comes up. It would also be easy enough to
> leave
>   the Unicode support in place for carets, but keep the default
> set to
>   the plain one for now.

Some other ideas:

* fix-it 

libgo patch committed: Add AIX FAT libraries support

2020-07-23 Thread Ian Lance Taylor via Gcc-patches
This patch by Clément Chigot adds AIX FAT library support to libgo.
This follows the same general idea as support in other GCC libraries
like libgcc and libsdc++.  Committed to mainline.

Ian
83cc5e2b2f887d4bb2305658da382a65fdcaab29
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 878df0d233a..d23e7377306 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-2d105e65cca6b536320284273353b7c640b12c5f
+587d4595e446c597efe97ccdc81b2f05cbc04a21
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/Makefile.am b/libgo/Makefile.am
index 5b1807228ad..52a8330ed2a 100644
--- a/libgo/Makefile.am
+++ b/libgo/Makefile.am
@@ -1228,3 +1228,17 @@ distclean-local:
find . -name '*.lo.dep' -print | xargs rm -f
 
 include $(top_srcdir)/../multilib.am
+
+if LIBGO_IS_AIX
+ALL_LOCAL_DEPS = add-aix-fat-library
+else
+ALL_LOCAL_DEPS =
+endif
+all-local: $(ALL_LOCAL_DEPS)
+
+MAJOR=$(firstword $(subst :, ,$(libtool_VERSION)))
+add-aix-fat-library: all-multi
+@if test "$(MULTIBUILDTOP)" = ""; then \
+${AR} -X$(AIX_DEFAULT_ARCH) rc .libs/$(PACKAGE).a 
../ppc$(AIX_DEFAULT_ARCH)/$(PACKAGE)/.libs/$(PACKAGE).so.$(MAJOR); \
+${AR} -X$(AIX_DEFAULT_ARCH) rc 
../pthread/$(PACKAGE)/.libs/$(PACKAGE).a 
../pthread/ppc$(AIX_DEFAULT_ARCH)/$(PACKAGE)/.libs/$(PACKAGE).so.$(MAJOR); \
+fi
diff --git a/libgo/configure.ac b/libgo/configure.ac
index 235d867ccda..841cba1768e 100644
--- a/libgo/configure.ac
+++ b/libgo/configure.ac
@@ -36,6 +36,14 @@ case ${host} in
 # static hash tables crashes on AIX when libgo is built with O2
 CFLAGS="$CFLAGS -fno-section-anchors"
 GOCFLAGS="$GOCFLAGS -fno-section-anchors"
+
+# Check default architecture for FAT library creation
+if test -z "`$(CC) -x c -E /dev/null -g3 -o - | grep 64BIT`" ; then
+AIX_DEFAULT_ARCH='64'
+else
+AIX_DEFAULT_ARCH='32'
+fi
+AC_SUBST(AIX_DEFAULT_ARCH)
 ;;
 esac
 


[PR95720] protect gluefile and wrap_flags with -Wl too (was: Re: drop -aux{dir,base}, revamp -dump{dir,base})

2020-07-23 Thread Alexandre Oliva
The testglue object file gets interpreted as another input file,
changing the dump and aux output names in GCC unless it is protected
by -Wl, like board file-named extra inputs.

Refactor the code that modifies the board settings so that it can be
used to modify regular variables as well, and do so.

Regstrapped on x86_64-linux-gnu, fix confirmed on an affected platform
by the bug reporter.  Ok to install?


for  gcc/testsuite/ChangeLog

PR testsuite/95720
* lib/gcc-defs.exp (gcc_adjust_linker_flags_list): Split out of...
(gcc_adjust_linker_flags): ... this.  Protect gluefile and
wrap_flags.
* gcc.misc-tests/outputs.exp: Use gcc_adjust_linker_flags_list.
---
 gcc/testsuite/gcc.misc-tests/outputs.exp |   14 ++--
 gcc/testsuite/lib/gcc-defs.exp   |   55 --
 2 files changed, 40 insertions(+), 29 deletions(-)

diff --git a/gcc/testsuite/gcc.misc-tests/outputs.exp 
b/gcc/testsuite/gcc.misc-tests/outputs.exp
index 469d94c..0784a8e 100644
--- a/gcc/testsuite/gcc.misc-tests/outputs.exp
+++ b/gcc/testsuite/gcc.misc-tests/outputs.exp
@@ -56,17 +56,9 @@ set link_options ""
 set dest [target_info name]
 foreach i { ldflags libs ldscript } {
 if {[board_info $dest exists $i]} {
-   set skip ""
-   foreach opt [split [board_info $dest $i]] {
-   if { $opt == "" } then {
-   continue
-   } elseif { $skip != "" } then {
-   set skip ""
-   } elseif { $opt == "-Xlinker" } then {
-   set skip $opt
-   } elseif { ![string match "-*" $opt] && [file isfile $opt] } {
-   set opt "-Wl,$opt"
-   }
+   set opts [board_info $dest $i]
+   set nopts [gcc_adjust_linker_flags_list $opts]
+   foreach opt $nopts {
append link_options " additional_flags=$opt"
}
 }
diff --git a/gcc/testsuite/lib/gcc-defs.exp b/gcc/testsuite/lib/gcc-defs.exp
index 87eeb7d..380a18b 100644
--- a/gcc/testsuite/lib/gcc-defs.exp
+++ b/gcc/testsuite/lib/gcc-defs.exp
@@ -287,9 +287,32 @@ proc dg-additional-files { args } {
 
 set gcc_adjusted_linker_flags 0
 
-# Add -Wl, before any file names in ldflags, libs, and ldscript, so
-# that default object files or libraries do not change the names of
-# gcc auxiliary outputs.
+# Add -Wl, before any file names in $opts.  Return the modified list.
+
+proc gcc_adjust_linker_flags_list { args } {
+set opts [lindex $args 0]
+set nopts {}
+set skip ""
+foreach opt [split $opts " "] {
+   if { $opt == "" } then {
+   continue
+   } elseif { $skip != "" } then {
+   set skip ""
+   } elseif { $opt == "-Xlinker" } then {
+   set skip $opt
+   } elseif { ![string match "-*" $opt] \
+  && [file isfile $opt] } {
+   set opt "-Wl,$opt"
+   }
+   lappend nopts $opt
+}
+return $nopts
+}
+
+# Add -Wl, before any file names in the target board's ldflags, libs,
+# and ldscript, as well as in global testglue and wrap_flags, so that
+# default object files or libraries do not change the names of gcc
+# auxiliary outputs.
 
 proc gcc_adjust_linker_flags {} {
 global gcc_adjusted_linker_flags
@@ -303,27 +326,23 @@ proc gcc_adjust_linker_flags {} {
foreach i { ldflags libs ldscript } {
if {[board_info $dest exists $i]} {
set opts [board_info $dest $i]
-   set nopts {}
-   set skip ""
-   foreach opt [split $opts] {
-   if { $opt == "" } then {
-   continue
-   } elseif { $skip != "" } then {
-   set skip ""
-   } elseif { $opt == "-Xlinker" } then {
-   set skip $opt
-   } elseif { ![string match "-*" $opt] \
-   && [file isfile $opt] } {
-   set opt "-Wl,$opt"
-   }
-   lappend nopts $opt
-   }
+   set nopts [gcc_adjust_linker_flags_list $opts]
if { $nopts != $opts } {
unset_currtarget_info $i
set_currtarget_info $i "$nopts"
}
}
}
+   foreach i { gluefile wrap_flags } {
+   global $i
+   if {[info exists $i]} {
+   set opts [set $i]
+   set nopts [gcc_adjust_linker_flags_list $opts]
+   if { $nopts != $opts } {
+   set $i $nopts
+   }
+   }
+   }
 }
 }
 


-- 
Alexandre Oliva, freedom fighterhe/himhttps://FSFLA.org/blogs/lxo/
Free Software Evangelist  Stallman was right, but he's left :(
GNU Toolchain Engineer   Live long and free, and prosper ethically


Re: [PATCH][RFC] diagnostics: Add support for Unicode drawing characters

2020-07-23 Thread Martin Sebor via Gcc-patches

On 7/23/20 10:28 AM, Lewis Hyatt via Gcc-patches wrote:

Hello-

The attached patch is complete including docs, but I tagged as RFC
because I am not sure if anyone will like it, or if the general reaction may
be closer to recoiling in horror :). Would appreciate your thoughts,
please...


I don't have much of an opinion on the proposed changes but they
remind me of an enhancement I have been thinking about for a while.
I think it would be a nice touch to more finely differentiate parts
of the message text from the rest than just by highlighting it in
bold.  Specifically, I'm thinking of terms of a language grammar,
but a similar approach could be used for other elements as well.

For example, a number of diagnostic messages refer to the term
constant-expression.  A common convention used by language
standards is to render these terms in italics.  Doing the same
in GCC output would make it clear when it refers to the term of
the grammar (especially in non-hyphenated terms).  Since some
terminals support italics even without UTF-8, this enhancement
could be made independently.

Other font characteristics could be used to differentiate other
"elements" referenced in the messages, such as numerical constants
from ordinary numbers, highlight especially relevant parts of quoted
text like option arguments (for instance, the 9 in "alignment of
%qD will increase in %<-fabi-version=9%>") or attribute arguments
that cannot be underscored, or even be used in hints (e.g.,
strikethrough to denote deletion and underline for insertion).

Martin



Currently, if a UTF-8 locale is detected, GCC changes the quote characters
it outputs in diagnostics to Unicode directional quotes. I feel like this is
a nice touch, so I was wondering whether GCC shouldn't do more along these
lines. This patch adds support for using Unicode line drawing characters and
similar things when outputting diagnostics. There is a new option
-fdiagnostics-unicode-drawing=[auto|never|always] to control it, which
defaults to auto. "auto" will enable the feature under the same
circumstances that Unicode quotes get output, namely when the locale is
determined by gcc_init_libintl() to support UTF-8. (The new option does not
affect Unicode quote characters, which currently are not configurable and
are determined solely by the locale.)

The elements implemented are:

 * Vertical lines, e.g. those indicating labels and those separating the
   source lines from the line numbers, are changed to line drawing
   characters.

 * The diagnostic paths output by the static analyzer make use of line
   drawing characters to output smooth corners etc.

 * The squiggly underline ~ used to highlight source locations is
   changed to a double underline ═. The main reason for this is that
   it enables a seamless "tee" character to connect the underline to a
   label line if one exists.

 * Carets (^) are changed to a slightly different character (∧). I think
   the new one is a little nicer looking, although probably not worth the
   trouble on its own. I wanted to implement the support in this patch
   beause carets are harder to change than the rest of the elements
   (front ends have an interface to override them, which currently
   Fortran makes use of), so I thought it worthwhile to get this logic in
   place, so that it can easily be changed to a more superior character
   in the future if one comes up. It would also be easy enough to leave
   the Unicode support in place for carets, but keep the default set to
   the plain one for now.

As an example, this diagnostic from gcc.dg/format/diagnostic-ranges.c:

diagnostic-ranges.c:196:28: warning: field width specifier ‘*’ expects argument 
of type ‘int’, but argument 3 has type ‘long int’ [-Wformat=]
   196 |   __builtin_sprintf (d, " %*ld ", foo + bar, foo);
   |   ~^~~~
   ||  |
   |intlong int

would become instead:

diagnostic-ranges.c:196:28: warning: field width specifier ‘*’ expects argument 
of type ‘int’, but argument 3 has type ‘long int’ [-Wformat=]
   196 │   __builtin_sprintf (d, " %*ld ", foo + bar, foo);
   │   ═∧══╤
   ││  │
   │intlong int

Hopefully you are viewing this in a terminal that displays it properly :), in
which case, hopefully you may find it to be an improvement?

Here is a more involved example from the analyzer:

setjmp-5.c: In function ‘outer’:
setjmp-5.c:21:3: warning: ‘longjmp’ called after enclosing function of ‘setjmp’ 
has returned [-Wanalyzer-stale-setjmp-buffer]
21 |   longjmp (env, 42); /* { dg-warning "'longjmp' called after enclosing 
function of 'setjmp' has returned" } */
   |   ^
   ‘outer’: events 1-2
 |
 |   15 | void 

Re: [PATCH PR96230] driver: ICE in process_command, at gcc.c:5095

2020-07-23 Thread Alexandre Oliva
On Jul 23, 2020, Richard Biener  wrote:

> On Thu, 23 Jul 2020, Zhanghaijian (A) wrote:
>> # This option restores naming of aux and dump output files
>> # after input files when multiple input files are named,
>> # instead of getting them combined with the output name.

> Does it actually do what the above comment suggests?

It should.  When there are multiple inputs, it clearly does.  I guess
it's just as clear I haven't tried it with a single input.

> I'd have naiively assumed that for gcc -S t.c -fdump-tree-optimized
> -dumpbase "" I get a ".238t.optimized" file (thus empty dumpbase).

Uhh, I guess that would be a reasonable expectation, more so if combined
with -dumpdir, but the documentation suggests -dumpbase "" is to restore
pre-revamp behavior.  Should we change that?


Here's a patch that adds some more tests, fixes the misbehavior, and
adds comments to conditions that led me down investigations that wasted
a lot of time while looking into this problem and trying to make for
more widespread changes.

Tested on x86_64-linux-gnu.  Ok to install?


[PR96230] some -dumpbase-ext fixes

From: Alexandre Oliva 

The initial bug report was that compiling (-c) with -dumpbase ""
-dumpbase-ext . crashes the driver.

The verification of -dumpbase-ext against -dumpbase doesn't cover the
case in which -dumpbase activates backward-compatibility mode.

I added a test for that, and for -dumpbase-ext without -dumpbase,
trying to make it work in a sensible way, as if applied to the default
-dumpbase for each file.  It turned out that this made for too much
complexity in dealing with suffixes derived from input filenames, so I
gave that up and returned to discarding -dumpbase-ext as documented,
ending up with a change identical to that in the original bug report.

I also thought I caught an off-by-one error in the initial
verification, that caused dumpbase_ext to be discarded if it was
identical to the specified dumpbase, but that turned out to be
intentional as well, so I put in comments and a test to reflect it.

Finally, an earlier version of the newly-added tests used "$var.ext"
in an expected output list, which showed me the handling of string
expansion was incorrect.  Reworked the expr into an eval to make that
work, and, absent any reliance on post-eval adjustments to so-expanded
output names, I arranged for the adjustments to be skipped after eval.


Co-Authored-By: "Zhanghaijian (A)" 
for  gcc/ChangeLog

PR driver/96230
* gcc.c (process_command): Adjust and document conditions to
reset dumpbase_ext.

for  gcc/testsuite/ChangeLog

PR driver/96230
* gcc.misc-tests/outputs.exp: Add tests with -dumpbase-ext,
with identical -dumpbase, with -dumpbase "", and without any
-dumpbase.
(outest): Fix "" expansion in expected outputs, skip
adjustments.
---
 gcc/gcc.c|   15 +--
 gcc/testsuite/gcc.misc-tests/outputs.exp |   23 ++-
 2 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/gcc/gcc.c b/gcc/gcc.c
index c0eb3c1..10bc988 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -4907,6 +4907,9 @@ process_command (unsigned int decoded_options_count,
   int lendb = strlen (dumpbase);
   int lendbx = strlen (dumpbase_ext);
 
+  /* -dumpbase-ext must be a suffix proper; discard it if it
+ matches all of -dumpbase, as that would make for an empty
+ basename.  */
   if (lendbx >= lendb
  || strcmp (dumpbase + lendb - lendbx, dumpbase_ext) != 0)
{
@@ -5083,10 +5086,18 @@ process_command (unsigned int decoded_options_count,
   /* Check that dumpbase_ext, if still present, still matches the end
  of dumpbase, if present, and drop it otherwise.  We only retained
  it above when dumpbase was absent to maybe use it to drop the
- extension from output_name before combining it with dumpdir.  */
+ extension from output_name before combining it with dumpdir.  We
+ won't deal with -dumpbase-ext when -dumpbase is not explicitly
+ given, even if just to activate backward-compatible dumpbase:
+ dropping it on the floor is correct, expected and documented
+ behavior.  Attempting to deal with a -dumpbase-ext that might
+ match the end of some input filename, or of the combination of
+ the output basename with the suffix of the input filename,
+ possible with an intermediate .gk extension for -fcompare-debug,
+ is just calling for trouble.  */
   if (dumpbase_ext)
 {
-  if (!dumpbase)
+  if (!dumpbase || !*dumpbase)
{
  free (dumpbase_ext);
  dumpbase_ext = NULL;
diff --git a/gcc/testsuite/gcc.misc-tests/outputs.exp 
b/gcc/testsuite/gcc.misc-tests/outputs.exp
index 0784a8e..1e3cd41 100644
--- a/gcc/testsuite/gcc.misc-tests/outputs.exp
+++ b/gcc/testsuite/gcc.misc-tests/outputs.exp
@@ -137,11 +137,9 @@ proc outest { test sources opts dirs outputs } {
 
if { [string 

Re: [PATCH] rs6000: Disable -fcaller-saves by default

2020-07-23 Thread Segher Boessenkool
On Thu, Jul 23, 2020 at 02:29:00PM -0600, Jeff Law wrote:
> On Thu, 2020-07-23 at 15:19 -0500, Segher Boessenkool wrote:
> > On Thu, Jul 23, 2020 at 01:42:59PM -0600, Jeff Law wrote:
> > > On Thu, 2020-07-23 at 14:25 -0500, Pat Haugen via Gcc-patches wrote:
> > > > Disable -fcaller-saves by default.
> > > > 
> > > > This patch turns off -fcaller-saves by default so that IRA doesn't try
> > > > to use volatile regs for pseudos that are live across a call, which
> > > > would then require LRA to save/restore the reg around the call.
> > > > 2020-07-23  Pat Haugen  
> > > > 
> > > > gcc/
> > > > * common/config/rs6000/rs6000-common.c
> > > > (rs6000_option_optimization_table): Turn off -fcaller-saves.
> > > > 
> > > > gcc/testsuite/
> > > > * gcc.target/powerpc/caller-saves.c: New.
> > > What's driving this change?
> > 
> > Peter noticed IRA allocates stuff to volatile registers while it is life
> > through a call, and then LRA has to correct that, not optimal.
> I can't recall if IRA or LRA handles the need to save them to the stack, but 
> the
> whole point is you can do much better sometimes by saving into the stack with 
> the
> caller-saves algorithm vs just giving up and spilling.

But the only thing this flag influences is what registers IRA allocates,
nothing else, or what am I missing?

> > > IRA will do a cost/benefit analysis to see using call clobbered registers 
> > > like
> > > this is profitable or not.  You're just turning the whole thing off.
> > 
> > '-fcaller-saves'
> >  Enable allocation of values to registers that are clobbered by
> >  function calls, by emitting extra instructions to save and restore
> >  the registers around such calls.  Such allocation is done only when
> >  it seems to result in better code.
> > 
> >  This option is always enabled by default on certain machines,
> >  usually those which have no call-preserved registers to use
> >  instead.

The documentation does not match reality btw, and even contradicts it
for many targets.

> > So, given what Peter saw, the analysis when to do or not do this isn't
> > as good as could be hoped for.
> Sure.  That obviously happens.  When it does the usual response is to dig 
> deeper
> into why, not turn the entire option off.

That is after digging deeper!  Maybe not deep enough, hrm.  Peter?


Segher


Re: [PATCH] rs6000: Disable -fcaller-saves by default

2020-07-23 Thread Jeff Law via Gcc-patches
On Thu, 2020-07-23 at 15:19 -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Jul 23, 2020 at 01:42:59PM -0600, Jeff Law wrote:
> > On Thu, 2020-07-23 at 14:25 -0500, Pat Haugen via Gcc-patches wrote:
> > > Disable -fcaller-saves by default.
> > > 
> > > This patch turns off -fcaller-saves by default so that IRA doesn't try
> > > to use volatile regs for pseudos that are live across a call, which
> > > would then require LRA to save/restore the reg around the call.
> > > 2020-07-23  Pat Haugen  
> > > 
> > > gcc/
> > >   * common/config/rs6000/rs6000-common.c
> > >   (rs6000_option_optimization_table): Turn off -fcaller-saves.
> > > 
> > > gcc/testsuite/
> > >   * gcc.target/powerpc/caller-saves.c: New.
> > What's driving this change?
> 
> Peter noticed IRA allocates stuff to volatile registers while it is life
> through a call, and then LRA has to correct that, not optimal.
I can't recall if IRA or LRA handles the need to save them to the stack, but the
whole point is you can do much better sometimes by saving into the stack with 
the
caller-saves algorithm vs just giving up and spilling.

> 
> > IRA will do a cost/benefit analysis to see using call clobbered registers 
> > like
> > this is profitable or not.  You're just turning the whole thing off.
> 
> '-fcaller-saves'
>  Enable allocation of values to registers that are clobbered by
>  function calls, by emitting extra instructions to save and restore
>  the registers around such calls.  Such allocation is done only when
>  it seems to result in better code.
> 
>  This option is always enabled by default on certain machines,
>  usually those which have no call-preserved registers to use
>  instead.
> 
> So, given what Peter saw, the analysis when to do or not do this isn't
> as good as could be hoped for.
Sure.  That obviously happens.  When it does the usual response is to dig deeper
into why, not turn the entire option off.

Jeff



Re: committed] correct memcmp expansion of constant representations containing embedded nuls (PR 95189)

2020-07-23 Thread H.J. Lu via Gcc-patches
On Thu, Jul 23, 2020 at 1:14 PM Martin Sebor via Gcc-patches
 wrote:
>
> On 7/22/20 2:23 AM, Rainer Orth wrote:
> > Hi Martin,
> >
> >> I have committed this change in r11-2231 after Jeff approved it
> >> off list last Thursday.
> >
> > the new gcc.target/i386/memcpy-pr95886.c test FAILs on 32-bit x86
> > (i386-pc-solaris2.11):
> >
> > +FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand 
> > "const_int 1976943448883713" 1
> > +FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand 
> > "const_int 576467370915332609" 1
> > +FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand 
> > "const_int 578431098682540545" 1
> > +FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand 
> > "const_int 578437695685198337" 1
> > +FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand 
> > "const_int 578437695685198337" 1
> > +FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand 
> > "const_int 578437695752110593" 1
> > +FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand 
> > "const_int 578437695752306689" 1
> > +FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand 
> > "const_int 578437695752307200" 1
> > +FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand 
> > "const_int 578437695752307201" 2
>
> Thanks for letting me know.  The test looks for patterns that are
> apparently LP64-specific so I restricted it to just that data model.
>

Shouldn't it also work for x32?


-- 
H.J.


Re: [PATCH] rs6000: Disable -fcaller-saves by default

2020-07-23 Thread Segher Boessenkool
Hi!

On Thu, Jul 23, 2020 at 01:42:59PM -0600, Jeff Law wrote:
> On Thu, 2020-07-23 at 14:25 -0500, Pat Haugen via Gcc-patches wrote:
> > Disable -fcaller-saves by default.
> > 
> > This patch turns off -fcaller-saves by default so that IRA doesn't try
> > to use volatile regs for pseudos that are live across a call, which
> > would then require LRA to save/restore the reg around the call.

> > 2020-07-23  Pat Haugen  
> > 
> > gcc/
> > * common/config/rs6000/rs6000-common.c
> > (rs6000_option_optimization_table): Turn off -fcaller-saves.
> > 
> > gcc/testsuite/
> > * gcc.target/powerpc/caller-saves.c: New.

> What's driving this change?

Peter noticed IRA allocates stuff to volatile registers while it is life
through a call, and then LRA has to correct that, not optimal.

> IRA will do a cost/benefit analysis to see using call clobbered registers like
> this is profitable or not.  You're just turning the whole thing off.

'-fcaller-saves'
 Enable allocation of values to registers that are clobbered by
 function calls, by emitting extra instructions to save and restore
 the registers around such calls.  Such allocation is done only when
 it seems to result in better code.

 This option is always enabled by default on certain machines,
 usually those which have no call-preserved registers to use
 instead.

So, given what Peter saw, the analysis when to do or not do this isn't
as good as could be hoped for.

> This can be particularly bad for performance if you have classes with no call
> saved registers.

Do we though?  Well, "d" for vectors, but there shouldn't be insns that
require that?


Segher


Re: committed] correct memcmp expansion of constant representations containing embedded nuls (PR 95189)

2020-07-23 Thread Martin Sebor via Gcc-patches

On 7/22/20 2:23 AM, Rainer Orth wrote:

Hi Martin,


I have committed this change in r11-2231 after Jeff approved it
off list last Thursday.


the new gcc.target/i386/memcpy-pr95886.c test FAILs on 32-bit x86
(i386-pc-solaris2.11):

+FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand "const_int 
1976943448883713" 1
+FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand "const_int 
576467370915332609" 1
+FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand "const_int 
578431098682540545" 1
+FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand "const_int 
578437695685198337" 1
+FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand "const_int 
578437695685198337" 1
+FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand "const_int 
578437695752110593" 1
+FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand "const_int 
578437695752306689" 1
+FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand "const_int 
578437695752307200" 1
+FAIL: gcc.target/i386/memcpy-pr95886.c scan-rtl-dump-times expand "const_int 
578437695752307201" 2


Thanks for letting me know.  The test looks for patterns that are
apparently LP64-specific so I restricted it to just that data model.

Martin


Re: [PATCH] rs6000: Disable -fcaller-saves by default

2020-07-23 Thread Jeff Law via Gcc-patches
On Thu, 2020-07-23 at 14:25 -0500, Pat Haugen via Gcc-patches wrote:
> Disable -fcaller-saves by default.
> 
> This patch turns off -fcaller-saves by default so that IRA doesn't try
> to use volatile regs for pseudos that are live across a call, which
> would then require LRA to save/restore the reg around the call.
> 
> Bootstrap/regtest on powerpc64le with no new regressions. Also ran a
> CPU2017 benchmark comparison with no major differences (a few minor
> improvements and one minor degradation). Ok for trunk?
> 
> -Pat
> 
> 
> 2020-07-23  Pat Haugen  
> 
> gcc/
>   * common/config/rs6000/rs6000-common.c
>   (rs6000_option_optimization_table): Turn off -fcaller-saves.
> 
> gcc/testsuite/
>   * gcc.target/powerpc/caller-saves.c: New.
What's driving this change?

IRA will do a cost/benefit analysis to see using call clobbered registers like
this is profitable or not.  You're just turning the whole thing off.

This can be particularly bad for performance if you have classes with no call
saved registers.

Jeff



[PATCH] rs6000: Disable -fcaller-saves by default

2020-07-23 Thread Pat Haugen via Gcc-patches
Disable -fcaller-saves by default.

This patch turns off -fcaller-saves by default so that IRA doesn't try
to use volatile regs for pseudos that are live across a call, which
would then require LRA to save/restore the reg around the call.

Bootstrap/regtest on powerpc64le with no new regressions. Also ran a
CPU2017 benchmark comparison with no major differences (a few minor
improvements and one minor degradation). Ok for trunk?

-Pat


2020-07-23  Pat Haugen  

gcc/
* common/config/rs6000/rs6000-common.c
(rs6000_option_optimization_table): Turn off -fcaller-saves.

gcc/testsuite/
* gcc.target/powerpc/caller-saves.c: New.



diff --git a/gcc/common/config/rs6000/rs6000-common.c
b/gcc/common/config/rs6000/rs6000-common.c
index ee37b9dc90b..6d68834aed2 100644
--- a/gcc/common/config/rs6000/rs6000-common.c
+++ b/gcc/common/config/rs6000/rs6000-common.c
@@ -43,8 +43,13 @@ static const struct default_options
rs6000_option_optimization_table[] =
on rs6000, turn it off by default.  */
 { OPT_LEVELS_ALL, OPT_frename_registers, NULL, 0 },

+/* Don't allow IRA to use volatile regs across function calls.  */
+{ OPT_LEVELS_ALL, OPT_fcaller_saves, NULL, 0 },
+
 /* Double growth factor to counter reduced min jump length.  */
 { OPT_LEVELS_ALL, OPT__param_max_grow_copy_bb_insns_, NULL, 16 },
+
+/* Following marks the end of the list and needs to remain last.  */
 { OPT_LEVELS_NONE, 0, NULL, 0 }
   };

diff --git a/gcc/testsuite/gcc.target/powerpc/caller-saves.c
b/gcc/testsuite/gcc.target/powerpc/caller-saves.c
new file mode 100644
index 000..846356f16f7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/caller-saves.c
@@ -0,0 +1,9 @@
+/* { dg-do compile */
+/* { dg-options "-O2 -fverbose-asm" } */
+
+void
+foo ()
+{
+}
+
+/* { dg-final { scan-assembler-not "fcaller-saves" } } */


Re: [PATCH] avoid -Wnonnull on synthesized condition in static_cast (PR 96003)

2020-07-23 Thread Martin Sebor via Gcc-patches

Another test case added to the bug since I posted the patch shows
that the no-warning bit set by the C++ front end is easily lost
during early folding, triggering the -Wnonull again despite
the suppression.  The attached revision tweaks the folder in just
this one case to let the suppression take effect in this situation.

In light of how pervasive this potential for stripping looks (none
of the other folders preserves the bit) the tweak feels like a band-
aid rather than a general solution.  But implementing something
better (and mainly developing tests to verify that it works in
general) would probably be quite the project.  So I leave it for
some other time.

Martin


On 7/17/20 1:00 PM, Martin Sebor wrote:

The recent enhancement to treat the implicit this pointer argument
as nonnull in member functions triggers spurious -Wnonnull for
the synthesized conditional expression the C++ front end replaces
the pointer with in some static_cast expressions.  The front end
already sets the no-warning bit for the test but not for the whole
conditional expression, so the attached fix extends the same solution
to it.

The consequence of this fix is that user-written code like this:

   static_cast(p ? p : 0)->f ();
or
   static_cast(p ? p : nullptr)->f ();

don't trigger the warning because they are both transformed into
the same expression as:

   static_cast(p)->f ();

What still does trigger it is this:

   static_cast(p ? p : (T*)0)->f ();

because here it's the inner COND_EXPR's no-warning bit that's set
(the outer one is clear), whereas in the former expressions it's
the other way around.  It would be nice if this worked consistently
but I didn't see an easy way to do that and more than a quick fix
seems outside the scope for this bug.

Another case reported by someone else in the same bug involves
a dynamic_cast.  A simplified test case goes something like this:

   if (dynamic_cast(p))
     dynamic_cast(p)->f ();

The root cause is the same: the front end emitting the COND_EXPR

   ((p != 0) ? ((T*)__dynamic_cast(p, (& _ZTI1B), (& _ZTI1C), 0)) : 0)

I decided not to suppress the warning in this case because doing
so would also suppress it in unconditional calls with the result
of the cast:

   dynamic_cast(p)->f ();

and that doesn't seem helpful.  Instead, I'd suggest to make
the second cast in the if statement to reference to T&:

   if (dynamic_cast(p))
     dynamic_cast(*p).f ();

Martin


PR c++/96003 spurious -Wnonnull calling a member on the result of static_cast

gcc/c-family/ChangeLog:

	PR c++/96003
	* c-common.c (check_function_arguments_recurse): Return early when
	no-warning bit is set.

gcc/cp/ChangeLog:

	PR c++/96003
	* class.c (build_base_path): Set no-warning bit on the synthesized
	conditional expression in static_cast.

gcc/ChangeLog:
	fold-const.c (fold_unary_loc): Set no-warning on a rebuilt COND_EXPR
	if the original had it set.

gcc/testsuite/ChangeLog:

	PR c++/96003
	* g++.dg/warn/Wnonnull7.C: New test.

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 51ecde69f2d..8a2d641f17c 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -5821,6 +5821,9 @@ check_function_arguments_recurse (void (*callback)
   void *ctx, tree param,
   unsigned HOST_WIDE_INT param_num)
 {
+  if (TREE_NO_WARNING (param))
+return;
+
   if (CONVERT_EXPR_P (param)
   && (TYPE_PRECISION (TREE_TYPE (param))
 	  == TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (param, 0)
diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index a3913f4ce0b..e024e8a4a74 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -516,8 +516,14 @@ build_base_path (enum tree_code code,
 
  out:
   if (null_test)
-expr = fold_build3_loc (input_location, COND_EXPR, target_type, null_test, expr,
-			build_zero_cst (target_type));
+{
+  expr = fold_build3_loc (input_location, COND_EXPR, target_type, null_test,
+			  expr, build_zero_cst (target_type));
+  /* Avoid warning for the whole conditional expression (in addition
+	 to NULL_TEST itself -- see above) in case the result is used in
+	 a nonnull context that the front end -Wnonnull checks.  */
+  TREE_NO_WARNING (expr) = 1;
+}
 
   return expr;
 }
diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index 300d959278b..67153e3f484 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -8754,6 +8754,7 @@ fold_unary_loc (location_t loc, enum tree_code code, tree type, tree op0)
 	arg02 = fold_build1_loc (loc, code, type,
  fold_convert_loc (loc,
 		   TREE_TYPE (op0), arg02));
+	  bool nowarn = TREE_NO_WARNING (op0);
 	  tem = fold_build3_loc (loc, COND_EXPR, type, TREE_OPERAND (arg0, 0),
 			 arg01, arg02);
 
@@ -8788,6 +8789,8 @@ fold_unary_loc (location_t loc, enum tree_code code, tree type, tree op0)
   TREE_OPERAND (TREE_OPERAND (tem, 1), 0),
   TREE_OPERAND (TREE_OPERAND (tem, 2),
 		0)));
+	  if (nowarn)
+	TREE_NO_WARNING (tem) = true;
 	  return tem;
 	}
}
diff --git 

[committed] Resolve PR rtl-optimization/96298: XOR doesn't distribute over XOR.

2020-07-23 Thread Roger Sayle

Many thanks to Segher Boessenkool for the speedy review.  The following
patch
has been committed after testing on x86_64-pc-linux-gnu.  In the recent
patch
to simplify-rtx.c  I had incorrectly extrapolated from AND distributing over
AND,
and IOR distributing over IOR, that XOR distributes over XOR.  Possibly a
typo.
The following "obvious" one line change fixes the breakage.  My apologies.

2020-07-23  Roger Sayle  

PR rtl-optimization/96298
* simplify-rtx.c (simplify_binary_operation_1) [XOR]: Xor doesn't
distribute over xor, so (a^b)^(c^b) is not the same as (a^c)^b.

Sorry again,
Roger
--
Roger Sayle
NextMove Software
Cambridge, UK



pr96298.patch
Description: Binary data


Re: [PATCH 4/5] MSP430: Implement TARGET_INSN_COST

2020-07-23 Thread Segher Boessenkool
Hi!

On Thu, Jul 23, 2020 at 04:56:14PM +0100, Jozef Lawrynowicz wrote:
> +static int
> +msp430_insn_cost (rtx_insn *insn, bool speed ATTRIBUTE_UNUSED)
> +{
> +  int cost;
> +
> +  if (recog_memoized (insn) < 0)
> +return 0;
> +
> +  cost = get_attr_length (insn);
> +  if (TARGET_DEBUG_INSN_COSTS)
> +{
> +  fprintf (stderr, "cost %d for insn:\n", cost);
> +  debug_rtx (insn);
> +}
> +
> +  /* The returned cost must be relative to COSTS_N_INSNS (1). An insn with a
> + length of 2 bytes is the smallest possible size and so must be 
> equivalent
> + to COSTS_N_INSNS (1).  */
> +  return COSTS_N_INSNS (cost) / (2 * COSTS_N_INSNS (1));

This is the same as "cost / 2", so "length / 2" here, which doesn't look
right.  The returned value should have the same "unit" as COSTS_N_INSNS
does, so maybe you want  COSTS_N_INSNS (length / 2)  ?

> +  /* FIXME Add more detailed costs when optimizing for speed.
> + For now the length of the instruction is a good approximiation and 
> roughly
> + correlates with cycle cost.  *

COSTS_N_INSNS (1) is 4, so that you can make things cost 5, 6, 7 to be a
cost intermediate to COSTS_N_INSNS (1) and COSTS_N_INSNS (2).  This is
very useful, scaling down the costs destroys that.

> +mdebug-insn-costs
> +Target Report Mask(DEBUG_INSN_COSTS)
> +Print insns and their costs as calculated by TARGET_INSN_COSTS.

It is already printed in the generated asm with -dp?  Not sure if you
want more detail than that.

 '-dp'
  Annotate the assembler output with a comment indicating which
  pattern and alternative is used.  The length and cost of each
  instruction are also printed.


Segher


New German PO file for 'gcc' (version 10.2.0)

2020-07-23 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the German team of translators.  The file is available at:

https://translationproject.org/latest/gcc/de.po

(This file, 'gcc-10.2.0.de.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: [PATCH] contrib/vimrc: detect more C-like files

2020-07-23 Thread Patrick Palka via Gcc-patches
On Thu, 23 Jul 2020, Martin Liška wrote:

> On 7/23/20 4:44 PM, Patrick Palka via Gcc-patches wrote:
> > Here's a rebased patch.  Can another Vim user verify that this does the
> > right thing?
> 
> Yes, it works correctly for me! I tested that before and after your change
> for ./libstdc++-v3/include/std/iomanip.
> 
> Please install the patch.
> Martin

Thanks Martin!  I pushed the patch just now.


New template for 'gcc' made available

2020-07-23 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.  (If you have
any questions, send them to .)

A new POT file for textual domain 'gcc' has been made available
to the language teams for translation.  It is archived as:

https://translationproject.org/POT-files/gcc-10.2.0.pot

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

Below is the URL which has been provided to the translators of your
package.  Please inform the translation coordinator, at the address
at the bottom, if this information is not current:

https://ftp.gnu.org/gnu/gcc/gcc-10.2.0/gcc-10.2.0.tar.xz

Translated PO files will later be automatically e-mailed to you.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




[PATCH][RFC] diagnostics: Add support for Unicode drawing characters

2020-07-23 Thread Lewis Hyatt via Gcc-patches
Hello-

The attached patch is complete including docs, but I tagged as RFC
because I am not sure if anyone will like it, or if the general reaction may
be closer to recoiling in horror :). Would appreciate your thoughts,
please...

Currently, if a UTF-8 locale is detected, GCC changes the quote characters
it outputs in diagnostics to Unicode directional quotes. I feel like this is
a nice touch, so I was wondering whether GCC shouldn't do more along these
lines. This patch adds support for using Unicode line drawing characters and
similar things when outputting diagnostics. There is a new option
-fdiagnostics-unicode-drawing=[auto|never|always] to control it, which
defaults to auto. "auto" will enable the feature under the same
circumstances that Unicode quotes get output, namely when the locale is
determined by gcc_init_libintl() to support UTF-8. (The new option does not
affect Unicode quote characters, which currently are not configurable and
are determined solely by the locale.)

The elements implemented are:

* Vertical lines, e.g. those indicating labels and those separating the
  source lines from the line numbers, are changed to line drawing
  characters.

* The diagnostic paths output by the static analyzer make use of line
  drawing characters to output smooth corners etc.

* The squiggly underline ~ used to highlight source locations is
  changed to a double underline ═. The main reason for this is that
  it enables a seamless "tee" character to connect the underline to a
  label line if one exists.

* Carets (^) are changed to a slightly different character (∧). I think
  the new one is a little nicer looking, although probably not worth the
  trouble on its own. I wanted to implement the support in this patch
  beause carets are harder to change than the rest of the elements
  (front ends have an interface to override them, which currently
  Fortran makes use of), so I thought it worthwhile to get this logic in
  place, so that it can easily be changed to a more superior character
  in the future if one comes up. It would also be easy enough to leave
  the Unicode support in place for carets, but keep the default set to
  the plain one for now.

As an example, this diagnostic from gcc.dg/format/diagnostic-ranges.c:

diagnostic-ranges.c:196:28: warning: field width specifier ‘*’ expects argument 
of type ‘int’, but argument 3 has type ‘long int’ [-Wformat=]
  196 |   __builtin_sprintf (d, " %*ld ", foo + bar, foo);
  |   ~^~~~
  ||  |
  |intlong int

would become instead:

diagnostic-ranges.c:196:28: warning: field width specifier ‘*’ expects argument 
of type ‘int’, but argument 3 has type ‘long int’ [-Wformat=]
  196 │   __builtin_sprintf (d, " %*ld ", foo + bar, foo);
  │   ═∧══╤
  ││  │
  │intlong int

Hopefully you are viewing this in a terminal that displays it properly :), in
which case, hopefully you may find it to be an improvement?

Here is a more involved example from the analyzer:

setjmp-5.c: In function ‘outer’:
setjmp-5.c:21:3: warning: ‘longjmp’ called after enclosing function of ‘setjmp’ 
has returned [-Wanalyzer-stale-setjmp-buffer]
   21 |   longjmp (env, 42); /* { dg-warning "'longjmp' called after enclosing 
function of 'setjmp' has returned" } */
  |   ^
  ‘outer’: events 1-2
|
|   15 | void outer (void)
|  |  ^
|  |  |
|  |  (1) entry to ‘outer’
|..
|   19 |   inner ();
|  |   
|  |   |
|  |   (2) calling ‘inner’ from ‘outer’
|
+--> ‘inner’: event 3
   |
   |   10 | static void inner (void)
   |  | ^
   |  | |
   |  | (3) entry to ‘inner’
   |
 ‘inner’: event 4
   |
   |   12 |   SETJMP (env);
   |  |   ^~
   |  |   |
   |  |   (4) ‘setjmp’ called here
   |
<--+
|
  ‘outer’: events 5-6
|
|   19 |   inner ();
|  |   ^~~~
|  |   |
|  |   (5) returning to ‘outer’ from ‘inner’
|   20 |
|   21 |   longjmp (env, 42); /* { dg-warning "'longjmp' called after 
enclosing function of 'setjmp' has returned" } */
|  |   ~
|  |   |
|  |   (6) here
|

would become instead:

setjmp-5.c: In function ‘outer’:
setjmp-5.c:21:3: warning: ‘longjmp’ called after enclosing function of ‘setjmp’ 
has returned [-Wanalyzer-stale-setjmp-buffer]
   21 │   longjmp (env, 42); /* { dg-warning "'longjmp' called after enclosing 
function of 'setjmp' has returned" } */
  │   ∧
  

[PATCH 2/2] aarch64: add PAC-RET protection to libitm sjlj.S

2020-07-23 Thread Szabolcs Nagy
_ITM_beginTransaction is a 'returns_twice' function that saves x30
on the stack as part of gtm_jmpbuf (that is passed down to
GTM_begin_transaction), but the saved x30 is also used for return.

The return path should be protected so we don't leave an
  ldp x29, x30, [sp]
  ret
gadget in the code, so x30 is signed on function entry. This
exposes the signed address in the gtm_jmpbuf too. The jmpbuf does
not need a signed address since GTM_longjmp uses
  ldp x29, x30, [x1]
  br x30
and with BTI there is a BTI j at the _ITM_beginTransaction call site
where this jump returns. Using PAC does not hurt: the gtm_jmpbuf is
internal to libitm and its layout is only used by sjlj.S so the
signed address does not escape. Saving signed x30 into gtm_jmpbuf
provides a bit of extra protection, but more importantly it allows
adding the PAC-RET support without changing the existing code much.

In theory bti and pac-ret protection can be added unconditionally
since the instructions are in the nop space, in practice they
can cause trouble if some tooling does not understand the gnu
property note (e.g. old binutils) or some unwinder or debugger
does not understand the new dwarf op code used for pac-ret (e.g
old gdb). So the code is written to only support branch-protection
according to the code generation options.

libitm/ChangeLog:

* config/aarch64/sjlj.S: Add conditional pac-ret protection.
---
 libitm/config/aarch64/sjlj.S | 56 ++--
 1 file changed, 53 insertions(+), 3 deletions(-)

diff --git a/libitm/config/aarch64/sjlj.S b/libitm/config/aarch64/sjlj.S
index e2093ca1a97..c84e98aecad 100644
--- a/libitm/config/aarch64/sjlj.S
+++ b/libitm/config/aarch64/sjlj.S
@@ -25,6 +25,35 @@
 #include "asmcfi.h"
 
 #define BTI_C  hint34
+#define PACIASPhint25
+#define AUTIASPhint29
+#define PACIBSPhint27
+#define AUTIBSPhint31
+
+#if defined(HAVE_AS_CFI_PSEUDO_OP) && defined(__GCC_HAVE_DWARF2_CFI_ASM)
+# define cfi_window_save .cfi_window_save
+# define cfi_b_key_frame .cfi_b_key_frame
+#else
+# define cfi_window_save
+# define cfi_b_key_frame
+#endif
+
+#if __ARM_FEATURE_PAC_DEFAULT & 1
+# define CFI_PAC_TOGGLEcfi_window_save
+# define CFI_PAC_KEY
+# define PAC_AND_BTI   PACIASP
+# define AUT   AUTIASP
+#elif __ARM_FEATURE_PAC_DEFAULT & 2
+# define CFI_PAC_TOGGLEcfi_window_save
+# define CFI_PAC_KEY   cfi_b_key_frame
+# define PAC_AND_BTI   PACIBSP
+# define AUT   AUTIBSP
+#else
+# define CFI_PAC_TOGGLE
+# define CFI_PAC_KEY
+# define PAC_AND_BTI   BTI_C
+# define AUT
+#endif
 
.text
.align  2
@@ -33,7 +62,9 @@
 
 _ITM_beginTransaction:
cfi_startproc
-   BTI_C
+   CFI_PAC_KEY
+   PAC_AND_BTI
+   CFI_PAC_TOGGLE
mov x1, sp
stp x29, x30, [sp, -11*16]!
cfi_adjust_cfa_offset(11*16)
@@ -60,6 +91,8 @@ _ITM_beginTransaction:
cfi_adjust_cfa_offset(-11*16)
cfi_restore(x29)
cfi_restore(x30)
+   AUT
+   CFI_PAC_TOGGLE
ret
cfi_endproc
.size   _ITM_beginTransaction, . - _ITM_beginTransaction
@@ -73,6 +106,7 @@ GTM_longjmp:
/* The first parameter becomes the return value (x0).
   The third parameter is ignored for now.  */
cfi_startproc
+   CFI_PAC_KEY
BTI_C
ldp x19, x20, [x1, 1*16]
ldp x21, x22, [x1, 2*16]
@@ -86,7 +120,10 @@ GTM_longjmp:
ldr x3, [x1, 10*16]
ldp x29, x30, [x1]
cfi_def_cfa(x1, 0)
+   CFI_PAC_TOGGLE
mov sp, x3
+   AUT
+   CFI_PAC_TOGGLE
br  x30
cfi_endproc
.size   GTM_longjmp, . - GTM_longjmp
@@ -96,6 +133,19 @@ GTM_longjmp:
 #define FEATURE_1_BTI 1
 #define FEATURE_1_PAC 2
 
+/* Supported features based on the code generation options.  */
+#if defined(__ARM_FEATURE_BTI_DEFAULT)
+# define BTI_FLAG FEATURE_1_BTI
+#else
+# define BTI_FLAG 0
+#endif
+
+#if __ARM_FEATURE_PAC_DEFAULT & 3
+# define PAC_FLAG FEATURE_1_PAC
+#else
+# define PAC_FLAG 0
+#endif
+
 /* Add a NT_GNU_PROPERTY_TYPE_0 note.  */
 #define GNU_PROPERTY(type, value)  \
   .section .note.gnu.property, "a";\
@@ -113,7 +163,7 @@ GTM_longjmp:
 .section .note.GNU-stack, "", %progbits
 
 /* Add GNU property note if built with branch protection.  */
-# ifdef __ARM_FEATURE_BTI_DEFAULT
-GNU_PROPERTY (FEATURE_1_AND, FEATURE_1_BTI)
+# if (BTI_FLAG|PAC_FLAG) != 0
+GNU_PROPERTY (FEATURE_1_AND, BTI_FLAG|PAC_FLAG)
 # endif
 #endif
-- 
2.17.1



[PATCH 1/2] aarch64: add PAC GNU property note to libgcc lse.S

2020-07-23 Thread Szabolcs Nagy
This note is not used anywhere currently but it is supposed to mark
objects if the return address is protected with PAC on the stack.
Since lse.S only has leaf functions the return address is never
saved on the stack so we can add the note.

The note is only added if pac-ret is enabled because it can cause
problems with old linkers and we don't have checks for that. This
can be changed later to be unconditional, for now it is consistent
with how gcc generates the notes.

libgcc/ChangeLog:

* config/aarch64/lse.S: Add PAC property note.
---
 libgcc/config/aarch64/lse.S | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/libgcc/config/aarch64/lse.S b/libgcc/config/aarch64/lse.S
index 64691c601c1..aa3e3dc4f2d 100644
--- a/libgcc/config/aarch64/lse.S
+++ b/libgcc/config/aarch64/lse.S
@@ -283,6 +283,19 @@ ENDFN  NAME(LDNM)
 #define FEATURE_1_BTI 1
 #define FEATURE_1_PAC 2
 
+/* Supported features based on the code generation options.  */
+#if defined(__ARM_FEATURE_BTI_DEFAULT)
+# define BTI_FLAG FEATURE_1_BTI
+#else
+# define BTI_FLAG 0
+#endif
+
+#if __ARM_FEATURE_PAC_DEFAULT & 3
+# define PAC_FLAG FEATURE_1_PAC
+#else
+# define PAC_FLAG 0
+#endif
+
 /* Add a NT_GNU_PROPERTY_TYPE_0 note.  */
 #define GNU_PROPERTY(type, value)  \
   .section .note.gnu.property, "a";\
@@ -300,7 +313,7 @@ ENDFN   NAME(LDNM)
 .section .note.GNU-stack, "", %progbits
 
 /* Add GNU property note if built with branch protection.  */
-# ifdef __ARM_FEATURE_BTI_DEFAULT
-GNU_PROPERTY (FEATURE_1_AND, FEATURE_1_BTI)
+# if (BTI_FLAG|PAC_FLAG) != 0
+GNU_PROPERTY (FEATURE_1_AND, BTI_FLAG|PAC_FLAG)
 # endif
 #endif
-- 
2.17.1



Re: [Patch][OG10] omp-low.c: Avoid offload-target lto1 is-missing error by not-privatizing TREE_READONLY vars

2020-07-23 Thread Julian Brown
On Thu, 16 Jul 2020 15:53:54 +0200
Tobias Burnus  wrote:

> [This is an OpenACC issue but I would not completely surprised if
> something similar could occur for OpenMP offloading as well.
> However, the patch is for an OpenACC-specific function.]
> 
> This issue occurs for libgomp.oacc-fortran/privatized-ref-2.f90, for
> which on the device lto1 complains:
> lto1: fatal error: /tmp/ccEGJTZN.o: section A.13.1.21 is missing
> Here, "A.13" is a TREE_STATIC, TREE_READONLY array generated by the
> Fortran front-end and containing the array-constructor values, i.e.
> RHS of: array = [(-2*i, i = 1, size(array))] That testcase works on
> the trunk or on the OG10 (= devel/omp/gcc-10) branch if one reverts
> the patch "Re-do OpenACC private variable resolution" 
> https://gcc.gnu.org/g:2f4b477223fddb84f66e494eb88d1defbd5e04a2 which
> is scheduled but not yet submitted for mainline inclusion. The
> offloading variable table contains the variable as "A.13.10" (which
> works fine) and the problem-causing patch causes that the code
> .UNIQUE (OACC_PRIVATE, 0, 0, , ); gets inserted (via the
> then-added make_oacc_private_marker in omp-low.c). Here, the decl for
> 'A.13' does not have a varpool_node entry – and it is not streamed
> out as separate entity. (This IFN_ is then later processed by the
> target lto1 via omp-offload.c's execute_oacc_device_lower – where the
> asm_name "A.13.1" appears.)
> 
> [While I do not completely understand why the target LTO does
> not contain the symbol, I think the following still makes sense.
> (I do understand why the offload var table does not contain it.)]
> 
> 
> If the variable is TREE_READONLY, there is no need to pass it
> through the variable-privatization bits.
> 
> The current check is for VAR_P and TREE_ADDRESSABLE. For the fix,
> one could use:
>!TREE_READONLY
> or
>!(TREE_READONLY && TREE_STATIC)
> or
>!(TREE_READONLY && (TREE_STATIC || DECL_EXTERNAL)
> I am not sure what makes more sense. I initially used the first
> version and then moved to the last. Thoughts?

I don't know when we'd have TREE_READONLY and DECL_EXTERNAL in a
situation where it made sense to mark the decl private using this
mechanism -- but I'll defer to your judgement!

> Additional comments?
> Does it look OK for OG10?

Looks OK to me, FWIW.

Thanks,

Julian


[PATCH 5/5] MSP430: Skip index-1.c test

2020-07-23 Thread Jozef Lawrynowicz
To access the "n - 10"th element of "a" in this test, GCC will
generate the following code for msp430-elf with -mcpu=msp430x:

  RLAM.W  #1, R12
  MOV.W a-3392(R12), R12

Since there aren't actually 100,000 elements in a, this means that
"a-3392" offset calculated by the linker can overflow, as the address of
"a" can validly be less than 3392.

The relocations used for -mcpu=msp430 and -mlarge are not as strict and
the calculated value is allowed to wrap around the address space,
avoiding relocation overflows.

>From bfafc633013952c1a5cac2dbb10b774f66be920e Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Thu, 16 Jul 2020 11:35:25 +0100
Subject: [PATCH 5/5] MSP430: Skip index-1.c test

To access the "n - 10"th element of "a" in this test, GCC will
generate the following code for msp430-elf with -mcpu=msp430x:

  RLAM.W  #1, R12
  MOV.W a-3392(R12), R12

Since there aren't actually 100,000 elements in a, this means that
"a-3392" offset calculated by the linker can overflow, as the address of
"a" can validly be less than 3392.

The relocations used for -mcpu=msp430 and -mlarge are not as strict and
the calculated value is allowed to wrap around the address space,
avoiding relocation overflows.

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/index-1.c: Skip for the default MSP430 430X ISA.
---
 gcc/testsuite/gcc.c-torture/execute/index-1.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/testsuite/gcc.c-torture/execute/index-1.c 
b/gcc/testsuite/gcc.c-torture/execute/index-1.c
index b00090d834a..d96be4c77b8 100644
--- a/gcc/testsuite/gcc.c-torture/execute/index-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/index-1.c
@@ -1,3 +1,5 @@
+/* { dg-skip-if "strict reloc overflow checking" { msp430-*-* } { "*" } { 
"-mcpu=msp430" "-mlarge"} } */
+
 int a[] =
 {
   0,  1,  2,  3,  4,  5,  6,  7,  8,  9,
-- 
2.27.0



[PATCH 4/5] MSP430: Implement TARGET_INSN_COST

2020-07-23 Thread Jozef Lawrynowicz
The length of an insn can be used to calculate its cost, when optimizing for
size. When optimizing for speed, this is a good estimate, since the cycle cost
of an MSP430 instruction increases with its length.

>From e4c5f9c3f567489f89b41a0d96e321acb5d18152 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Thu, 16 Jul 2020 11:34:50 +0100
Subject: [PATCH 4/5] MSP430: Implement TARGET_INSN_COST

The length of an insn can be used to calculate its cost, when optimizing for
size. When optimizing for speed, this is a good estimate, since the cycle cost
of an MSP430 instruction increases with its length.

gcc/ChangeLog:

* config/msp430/msp430.c (TARGET_INSN_COST): Define.
(msp430_insn_cost): New function.
* config/msp430/msp430.opt: Add -mdebug-insn-costs option.

gcc/testsuite/ChangeLog:

* gcc.target/msp430/rtx-cost-O3-default.c: New test.
* gcc.target/msp430/rtx-cost-O3-f5series.c: New test.
* gcc.target/msp430/rtx-cost-Os-default.c: New test.
* gcc.target/msp430/rtx-cost-Os-f5series.c: New test.
---
 gcc/config/msp430/msp430.c| 29 +
 gcc/config/msp430/msp430.opt  |  4 ++
 .../gcc.target/msp430/rtx-cost-O3-default.c   | 42 ++
 .../gcc.target/msp430/rtx-cost-O3-f5series.c  | 38 
 .../gcc.target/msp430/rtx-cost-Os-default.c   | 43 +++
 .../gcc.target/msp430/rtx-cost-Os-f5series.c  | 38 
 6 files changed, 194 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/msp430/rtx-cost-O3-default.c
 create mode 100644 gcc/testsuite/gcc.target/msp430/rtx-cost-O3-f5series.c
 create mode 100644 gcc/testsuite/gcc.target/msp430/rtx-cost-Os-default.c
 create mode 100644 gcc/testsuite/gcc.target/msp430/rtx-cost-Os-f5series.c

diff --git a/gcc/config/msp430/msp430.c b/gcc/config/msp430/msp430.c
index b7daafcc11a..7ef651fb324 100644
--- a/gcc/config/msp430/msp430.c
+++ b/gcc/config/msp430/msp430.c
@@ -1711,6 +1711,35 @@ msp430_rtx_costs (rtx x,
   return false;
 }
 }
+
+#undef TARGET_INSN_COST
+#define TARGET_INSN_COST msp430_insn_cost
+
+static int
+msp430_insn_cost (rtx_insn *insn, bool speed ATTRIBUTE_UNUSED)
+{
+  int cost;
+
+  if (recog_memoized (insn) < 0)
+return 0;
+
+  cost = get_attr_length (insn);
+  if (TARGET_DEBUG_INSN_COSTS)
+{
+  fprintf (stderr, "cost %d for insn:\n", cost);
+  debug_rtx (insn);
+}
+
+  /* The returned cost must be relative to COSTS_N_INSNS (1). An insn with a
+ length of 2 bytes is the smallest possible size and so must be equivalent
+ to COSTS_N_INSNS (1).  */
+  return COSTS_N_INSNS (cost) / (2 * COSTS_N_INSNS (1));
+
+  /* FIXME Add more detailed costs when optimizing for speed.
+ For now the length of the instruction is a good approximiation and roughly
+ correlates with cycle cost.  */
+}
+
 
 /* Function Entry and Exit */
 
diff --git a/gcc/config/msp430/msp430.opt b/gcc/config/msp430/msp430.opt
index 8134ca7ac95..448c41aa81c 100644
--- a/gcc/config/msp430/msp430.opt
+++ b/gcc/config/msp430/msp430.opt
@@ -115,3 +115,7 @@ Target RejectNegative Joined UInteger IntegerRange(0,65) 
Var(msp430_max_inline_s
 For shift operations by a constant amount, which require an individual 
instruction to shift by one
 position, set the maximum number of inline shift instructions (maximum value 
64) to emit instead of using the corresponding __mspabi helper function.
 The default value is 4.
+
+mdebug-insn-costs
+Target Report Mask(DEBUG_INSN_COSTS)
+Print insns and their costs as calculated by TARGET_INSN_COSTS.
diff --git a/gcc/testsuite/gcc.target/msp430/rtx-cost-O3-default.c 
b/gcc/testsuite/gcc.target/msp430/rtx-cost-O3-default.c
new file mode 100644
index 000..1bd6a142002
--- /dev/null
+++ b/gcc/testsuite/gcc.target/msp430/rtx-cost-O3-default.c
@@ -0,0 +1,42 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+/* Verify the MSP430 cost model is working as expected for the default ISA
+   (msp430x) and hwmult (none), when compiling at -O3.  */
+
+char arr[2];
+char a;
+char *ptr;
+
+/*
+** foo:
+** ...
+** MOV.B   \, \\+1
+** MOV.*   #arr\+2, \
+** ...
+*/
+
+void
+foo (void)
+{
+  arr[1] = a;
+  ptr = arr + 2;
+}
+
+extern void ext (void);
+
+/*
+** bar:
+** ...
+** CALL.*  #ext
+** CALL.*  #ext
+** ...
+*/
+
+void
+bar (void)
+{
+  ext ();
+  ext ();
+}
diff --git a/gcc/testsuite/gcc.target/msp430/rtx-cost-O3-f5series.c 
b/gcc/testsuite/gcc.target/msp430/rtx-cost-O3-f5series.c
new file mode 100644
index 000..1e48625f2e5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/msp430/rtx-cost-O3-f5series.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -mhwmult=f5series" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+/* Verify the MSP430 cost model is working as expected for the default ISA
+   (msp430x) and f5series hwmult, when compiling at -O3.  */
+
+volatile 

[PATCH 3/5] MSP430: Add defaulting to the insn length attribute

2020-07-23 Thread Jozef Lawrynowicz
The length of MSP430 instructions is mostly just a function of the type and
number of operands. Setting the "type" attribute on all insns describes
the number of operands, and the position of the source and destination
operands.

In most cases, defaulting in the "length" and "extension" attribute definitions
can then be used to calculate the total length of the instruction by using
the value of the "type" attribute to examine the operands.

>From 0e39cc3f13c604df1225d3c1eef6b05e629c184b Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Thu, 16 Jul 2020 11:32:01 +0100
Subject: [PATCH 3/5] MSP430: Add defaulting to the insn length attribute

The length of MSP430 instructions is mostly just a function of the type and
number of operands. Setting the "type" attribute on all insns describes
the number of operands, and the position of the source and destination
operands.

In most cases, defaulting in the "length" and "extension" attribute definitions
can then be used to calculate the total length of the instruction by using
the value of the "type" attribute to examine the operands.

gcc/ChangeLog:

* config/msp430/msp430-protos.h (msp430x_extendhisi): Return int
instead of char *.
(msp430_output_asm_shift_insns): Likewise.
Add new return_length argument.
(msp430x_insn_required): Add prototype.
* config/msp430/msp430.c (msp430_output_asm_shift_insns): Return the
total length, in bytes, of the emitted instructions.
(msp430x_insn_required): New function.
(msp430x_extendhisi): Return the total length, in bytes, of the
emitted instructions.
* config/msp430/msp430.h (ADJUST_INSN_LENGTH): Define.
* config/msp430/msp430.md: New define_attr "type".
New define_attr "extension".
New define_attr "length_multiplier".
New define_attr "extra_length".
Rewrite define_attr "length".
Set type, extension, length, length_multiplier or extra_length insn
attributes on all insns, as appropriate.
(andneghi3): Rewrite using constraints instead of C code to decide
output insns.
* config/msp430/predicates.md (msp430_cheap_operand): New predicate.
(msp430_high_memory_operand): New predicate.
---
 gcc/config/msp430/msp430-protos.h |   5 +-
 gcc/config/msp430/msp430.c| 162 ---
 gcc/config/msp430/msp430.h|  10 +
 gcc/config/msp430/msp430.md   | 439 --
 gcc/config/msp430/predicates.md   |  13 +
 5 files changed, 507 insertions(+), 122 deletions(-)

diff --git a/gcc/config/msp430/msp430-protos.h 
b/gcc/config/msp430/msp430-protos.h
index 0b4d9a42b41..33ad1adc61f 100644
--- a/gcc/config/msp430/msp430-protos.h
+++ b/gcc/config/msp430/msp430-protos.h
@@ -26,7 +26,7 @@ void  msp430_expand_eh_return (rtx);
 void   msp430_expand_epilogue (int);
 void   msp430_expand_helper (rtx *operands, const char *, bool);
 void   msp430_expand_prologue (void);
-const char * msp430x_extendhisi (rtx *);
+int msp430x_extendhisi (rtx *, bool);
 void   msp430_fixup_compare_operands (machine_mode, rtx *);
 intmsp430_hard_regno_nregs_has_padding (int, machine_mode);
 intmsp430_hard_regno_nregs_with_padding (int, machine_mode);
@@ -49,10 +49,11 @@ rtx msp430_subreg (machine_mode, rtx, machine_mode, int);
 boolmsp430_use_f5_series_hwmult (void);
 bool   msp430_has_hwmult (void);
 bool msp430_op_not_in_high_mem (rtx op);
+bool msp430x_insn_required (rtx op);
 
 #ifdef RTX_CODE
 int msp430_expand_shift (enum rtx_code code, machine_mode mode, rtx *operands);
-const char * msp430_output_asm_shift_insns (enum rtx_code code, machine_mode 
mode, rtx *operands);
+int msp430_output_asm_shift_insns (enum rtx_code code, machine_mode mode, rtx 
*operands, bool);
 #endif
 
 #endif /* GCC_MSP430_PROTOS_H */
diff --git a/gcc/config/msp430/msp430.c b/gcc/config/msp430/msp430.c
index 81ee5075a57..b7daafcc11a 100644
--- a/gcc/config/msp430/msp430.c
+++ b/gcc/config/msp430/msp430.c
@@ -3535,18 +3535,22 @@ msp430_expand_shift (enum rtx_code code, machine_mode 
mode, rtx *operands)
For 430X it is inneficient to do so for any modes except SI and DI, since we
can make use of R*M insns or RPT with 430X insns, so this function is only
used for SImode in that case.  */
-const char *
+int
 msp430_output_asm_shift_insns (enum rtx_code code, machine_mode mode,
-  rtx *operands)
+  rtx *operands, bool return_length)
 {
   int i;
   int amt;
   int max_shift = GET_MODE_BITSIZE (mode) - 1;
+  int length = 0;
+
   gcc_assert (CONST_INT_P (operands[2]));
   amt = INTVAL (operands[2]);
 
   if (amt == 0 || amt > max_shift)
 {
+  if (return_length)
+   return 0;
   switch (code)
{
case ASHIFT:
@@ -3564,17 +3568,28 @@ msp430_output_asm_shift_insns (enum rtx_code code, 
machine_mode mode,
default:
  gcc_unreachable ();
}
-  return "";
+

[PATCH 2/5] MSP430: Implement TARGET_RTX_COSTS

2020-07-23 Thread Jozef Lawrynowicz
Costs of MSP430 instructions are mostly just a function of the type and number
of operands. In these cases, TARGET_RTX_COSTS just needs to examine the
operands to calculate the cost of the expression.

For more complicated operations where library helper functions are required,
if the cost cannot be accurately calculated, it is estimated and
disparaged relative to the cost of a single instruction.

>From f5cbd8967d9c64a4ea6eb9fb8846b4361e16e396 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Thu, 16 Jul 2020 11:28:59 +0100
Subject: [PATCH 2/5] MSP430: Implement TARGET_RTX_COSTS

Costs of MSP430 instructions are mostly just a function of the type and number
of operands. In these cases, TARGET_RTX_COSTS just needs to examine the
operands to calculate the cost of the expression.

For more complicated operations where library helper functions are required,
if the cost cannot be accurately calculated, it is estimated and
disparaged relative to the cost of a single instruction.

gcc/ChangeLog:

* config/msp430/msp430.c (use_helper_for_const_shift): Add forward
declaration.
Remove unused argument.
(struct msp430_multlib_costs): New struct.
(msp430_is_mem_indirect): New function.
(msp430_costs): Likewise.
(msp430_shift_costs): Likewise.
(msp430_muldiv_costs): Likewise.
(msp430_get_inner_dest_code): Likewise.
(msp430_single_op_cost): Likewise.
(msp430_rtx_costs): Rewrite from scratch.
(msp430_expand_shift): Adjust use_helper_for_const_shift call.
---
 gcc/config/msp430/msp430.c | 545 -
 1 file changed, 530 insertions(+), 15 deletions(-)

diff --git a/gcc/config/msp430/msp430.c b/gcc/config/msp430/msp430.c
index 9e739233fa0..81ee5075a57 100644
--- a/gcc/config/msp430/msp430.c
+++ b/gcc/config/msp430/msp430.c
@@ -49,6 +49,9 @@
 #include "msp430-devices.h"
 #include "incpath.h"
 #include "prefix.h"
+#include "insn-config.h"
+#include "insn-attr.h"
+#include "recog.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -56,6 +59,7 @@
 
 static void msp430_compute_frame_info (void);
 static bool use_32bit_hwmult (void);
+static bool use_helper_for_const_shift (machine_mode mode, HOST_WIDE_INT amt);
 
 
 
@@ -1132,6 +1136,28 @@ static const struct double_op_cost size_cost_double_op =
   2, 2, 3
 };
 
+struct msp430_multlib_costs
+{
+  const int mulhi;
+  const int mulsi;
+  const int muldi;
+};
+
+/* There is no precise size cost when using libcalls, instead it is disparaged
+   relative to other instructions.
+   The cycle costs are from the CALL to the RET, inclusive.
+   FIXME muldi cost is not accurate.  */
+static const struct msp430_multlib_costs cycle_cost_multlib_32bit =
+{
+  27, 33, 66
+};
+
+/* 32bit multiply takes a few more instructions on 16bit hwmult.  */
+static const struct msp430_multlib_costs cycle_cost_multlib_16bit =
+{
+  27, 42, 66
+};
+
 /* TARGET_REGISTER_MOVE_COST
There is only one class of general-purpose, non-fixed registers, and the
relative cost of moving data between them is always the same.
@@ -1174,29 +1200,516 @@ msp430_memory_move_cost (machine_mode mode 
ATTRIBUTE_UNUSED,
because there are no conditional move insns - when a condition is involved,
the only option is to use a cbranch.  */
 
+/* For X, which must be a MEM RTX, return TRUE if it is an indirect memory
+   reference, @Rn or @Rn+.  */
+static bool
+msp430_is_mem_indirect (rtx x)
+{
+  gcc_assert (GET_CODE (x) == MEM);
+  rtx op0 = XEXP (x, 0);
+  return (GET_CODE (op0) == REG || GET_CODE (op0) == POST_INC);
+}
+
+/* Costs of MSP430 instructions are generally based on the addressing mode
+   combination of the source and destination operands.
+   Given source operand SRC (which may be NULL to indicate a single-operand
+   instruction) and destination operand DST return the cost of this
+   expression.  */
+static int
+msp430_costs (rtx src, rtx dst, bool speed, rtx outer_rtx)
+{
+  enum rtx_code src_code = GET_CODE (src);
+  enum rtx_code dst_code = GET_CODE (dst);
+  enum rtx_code outer_code = GET_CODE (outer_rtx);
+  machine_mode outer_mode = GET_MODE (outer_rtx);
+  const struct double_op_cost *cost_p;
+  cost_p = (speed ? _cost_double_op : _cost_double_op);
+
+  if (outer_code == TRUNCATE
+  && (outer_mode == QImode
+ || outer_mode == HImode
+ || outer_mode == PSImode))
+/* Truncation to these modes is normally free as a side effect of the
+   instructions themselves.  */
+return 0;
+
+  if (dst_code == SYMBOL_REF
+  || dst_code == LABEL_REF
+  || dst_code == CONST_INT)
+/* Catch RTX like (minus (const_int 0) (reg)) but don't add any cost.  */
+return 0;
+
+  if (debug_rtx_costs && dst_code != REG && dst_code != MEM && dst_code != PC)
+{
+  fprintf (stderr, "msp430_costs unhandled dst operand:\n");
+  debug_rtx (dst);
+  fprintf (stderr, "from:\n");
+  debug_rtx (outer_rtx);
+

[PATCH 1/5] MSP430: Implement TARGET_MEMORY_MOVE_COST

2020-07-23 Thread Jozef Lawrynowicz
The cycle and size cost of a MOV instruction in different addressing
modes can be used to calculate the TARGET_MEMORY_MOVE_COST relative to
TARGET_REGISTER_MOVE_COST.
>From c801a2851d47601218578c411854de9540486335 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Thu, 16 Jul 2020 11:28:11 +0100
Subject: [PATCH 1/5] MSP430: Implement TARGET_MEMORY_MOVE_COST

The cycle and size cost of a MOV instruction in different addressing
modes can be used to calculate the TARGET_MEMORY_MOVE_COST relative to
TARGET_REGISTER_MOVE_COST.

gcc/ChangeLog:

* config/msp430/msp430.c (struct single_op_cost): New struct.
(struct double_op_cost): Likewise.
(TARGET_REGISTER_MOVE_COST): Don't define but add comment.
(TARGET_MEMORY_MOVE_COST): Define to...
(msp430_memory_move_cost): New function.
(BRANCH_COST): Don't define but add comment.
---
 gcc/config/msp430/msp430.c | 131 +
 1 file changed, 131 insertions(+)

diff --git a/gcc/config/msp430/msp430.c b/gcc/config/msp430/msp430.c
index c2b24974364..9e739233fa0 100644
--- a/gcc/config/msp430/msp430.c
+++ b/gcc/config/msp430/msp430.c
@@ -1043,6 +1043,137 @@ msp430_legitimate_constant (machine_mode mode, rtx x)
 }
 
 
+/* Describing Relative Costs of Operations
+   To model the cost of an instruction, use the number of cycles when
+   optimizing for speed, and the number of words when optimizing for size.
+   The cheapest instruction will execute in one cycle and cost one word.
+   The cycle and size costs correspond to 430 ISA instructions, not 430X
+   instructions or 430X "address" instructions.  The relative costs of 430X
+   instructions is accurately modeled with the 430 costs.  The relative costs
+   of some "address" instructions can differ, but these are not yet handled.
+   Adding support for this could improve performance/code size.  */
+
+const int debug_rtx_costs = 0;
+
+struct single_op_cost
+{
+  const int reg;
+  /* Indirect register (@Rn) or indirect autoincrement (@Rn+).  */
+  const int ind;
+  const int mem;
+};
+
+static const struct single_op_cost cycle_cost_single_op =
+{
+  1, 3, 4
+};
+
+static const struct single_op_cost size_cost_single_op =
+{
+  1, 1, 2
+};
+
+/* When the destination of an insn is memory, the cost is always the same
+   regardless of whether that memory is accessed using indirect register,
+   indexed or absolute addressing.
+   When the source operand is memory, indirect register and post-increment have
+   the same cost, which is lower than indexed and absolute, which also have
+   the same cost.  */
+struct double_op_cost
+{
+  /* Source operand is a register.  */
+  const int r2r;
+  const int r2pc;
+  const int r2m;
+
+  /* Source operand is memory, using indirect register (@Rn) or indirect
+ autoincrement (@Rn+) addressing modes.  */
+  const int ind2r;
+  const int ind2pc;
+  const int ind2m;
+
+  /* Source operand is an immediate.  */
+  const int imm2r;
+  const int imm2pc;
+  const int imm2m;
+
+  /* Source operand is memory, using indexed (x(Rn)) or absolute ()
+ addressing modes.  */
+  const int mem2r;
+  const int mem2pc;
+  const int mem2m;
+};
+
+/* These structures describe the cost of MOV, BIT and CMP instructions, in 
terms
+   of clock cycles or words.  */
+static const struct double_op_cost cycle_cost_double_op_mov =
+{
+  1, 3, 3,
+  2, 4, 4,
+  2, 3, 4,
+  3, 5, 5
+};
+
+/* Cycle count when memory is the destination operand is one larger than above
+   for instructions that aren't MOV, BIT or CMP.  */
+static const struct double_op_cost cycle_cost_double_op =
+{
+  1, 3, 4,
+  2, 4, 5,
+  2, 3, 5,
+  3, 5, 6
+};
+
+static const struct double_op_cost size_cost_double_op =
+{
+  1, 1, 2,
+  1, 1, 2,
+  2, 2, 3,
+  2, 2, 3
+};
+
+/* TARGET_REGISTER_MOVE_COST
+   There is only one class of general-purpose, non-fixed registers, and the
+   relative cost of moving data between them is always the same.
+   Therefore, the default of 2 is optimal.  */
+
+#undef TARGET_MEMORY_MOVE_COST
+#define TARGET_MEMORY_MOVE_COST msp430_memory_move_cost
+
+/* Return the cost of moving data between registers and memory.
+   The returned cost must be relative to the default TARGET_REGISTER_MOVE_COST
+   of 2.
+   IN is false if the value is to be written to memory.  */
+static int
+msp430_memory_move_cost (machine_mode mode ATTRIBUTE_UNUSED,
+reg_class_t rclass ATTRIBUTE_UNUSED,
+bool in)
+{
+  int cost;
+  const struct double_op_cost *cost_p;
+  /* Optimize with a code size focus by default, unless -O2 or above is
+ specified.  */
+  bool speed = (!optimize_size && optimize >= 2);
+
+  cost_p = (speed ? _cost_double_op_mov : _cost_double_op);
+
+  if (in)
+/* Reading from memory using indirect addressing is assumed to be the more
+   common case.  */
+cost = cost_p->ind2r;
+  else
+cost = cost_p->r2m;
+
+  /* All register to register moves cost 1 cycle or 1 word, so 

[PATCH 0/5] MSP430: Implement macros to describe relative costs of operations

2020-07-23 Thread Jozef Lawrynowicz
The following series of patches for MSP430 implement some of the target
macros used to determine the relative costs of operations.

To give an indication of the overall effect of these changes on
codesize, below are some size statistics collected from all the
executable files from execute.exp that are built at -Os.
There are around 1470 such tests (depending on the configuration).

The percentage change (((new - old)/old) * 100) in text size is calculated
for each test and the given metric is applied to that overall set of data.

Configuration | Mean (%) | Median (%) | Delta < 0 (count) | Delta > 0 (count)
-
-mcpu=msp430  |  -2.4|   -2.7 |  1454 |  17
-mcpu=msp430x |  -2.3|   -2.4 |  1460 |  10
-mlarge   |  -1.7|   -1.9 |  1412 |  37

Successfully regtested on trunk for msp430-elf, ok to apply?

Jozef Lawrynowicz (5):
  MSP430: Implement TARGET_MEMORY_MOVE_COST
  MSP430: Implement TARGET_RTX_COSTS
  MSP430: Add defaulting to the insn length attribute
  MSP430: Implement TARGET_INSN_COST
  MSP430: Skip index-1.c test

 gcc/config/msp430/msp430-protos.h |   5 +-
 gcc/config/msp430/msp430.c| 867 --
 gcc/config/msp430/msp430.h|  13 +
 gcc/config/msp430/msp430.md   | 439 +++--
 gcc/config/msp430/msp430.opt  |   4 +
 gcc/config/msp430/predicates.md   |  13 +
 gcc/testsuite/gcc.c-torture/execute/index-1.c |   2 +
 7 files changed, 1206 insertions(+), 137 deletions(-)

-- 
2.27.0



Re: [nvptx, aarch64] Define TARGET_OFFLOAD_OPTIONS for AArch64

2020-07-23 Thread Thomas Schwinge
Hi Matthias!

On 2020-07-23T15:54:55+0200, Matthias Klose  wrote:
> Trying to build a nvptx offload compiler on aarch64-linux-gnu

I have not yet heard any reports of ARM-host GCC builds with offloading
enabled: 32-bit ARM, or 64-bit aarch64.

In general, we've designed/implemented code offloading support in GCC so
that it's completely independent from (unaware of) the specific host CPU
architecture.  Just indeed, for obvious reasons, ABI compatibility across
the offloading compliation boundary is the one (only) place where there
may be some fine tuning required, in particular if the CPU-side ABI is
mandating (..., and the respective GCC back end thus implementing) any
"unusual" things.  (But I don't know about ARM/aarch64 myself.)

> the libgomp tests
> error out with
>
> unrecognizable argument of option -foffload-abi
>
> Passing that option goes a step further

Thanks for looking into this.

> hitting PR target/96265.

I don't have any input on the 'lto1' errors you're mentioning there --
these seem "strange".  Are we sure it's not a host vs. offloading
compiler (LTO) configuration problem?

Not sure if these are relevant, but just for completeness: I do remember
the years-old unresolved  "OpenACC
compilation on Tegra K1 (ARM)" problem report, and
 "nvptx offloading: hard-coded
64-bit assumptions", which mentions a few minor technical items that need
to be resolved for enabling 32-bit nvptx offloading builds -- which isn't
your problem for aarch64.

I don't have suitable aarch64 hardware/builds available myself.

> Define that
> hook, as it was done for rs6000 in 2015.  Ok for the trunk?

Looks good to me, in principle.

Reviewed-by: Thomas Schwinge 

> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -22739,6 +22739,16 @@ aarch64_stack_protect_guard (void)
>return NULL_TREE;
>  }
>
> +/* Implement the TARGET_OFFLOAD_OPTIONS hook.  */
> +static char *
> +aarch64_offload_options (void)
> +{
> +  if (TARGET_ILP32)
> +return xstrdup ("-foffload-abi=ilp32");
> +  else
> +return xstrdup ("-foffload-abi=lp64");
> +}

I've verified that 'TARGET_ILP32' is the correct conditional to use for
'gcc/config/aarch64/'.  I did wonder if we have to keep the current
("unsupported") behavior for big-endian ('BYTES_BIG_ENDIAN'?), but as
'gcc/config/rs6000/rs6000.c:rs6000_offload_options' isn't doing that, we
also don't here have to consider that case specially either.

> +
>  /* Return the diagnostic message string if conversion from FROMTYPE to
> TOTYPE is not allowed, NULL otherwise.  */
>
> @@ -23079,6 +23089,9 @@ aarch64_libgcc_floating_mode_supported_p
>  #undef TARGET_NARROW_VOLATILE_BITFIELD
>  #define TARGET_NARROW_VOLATILE_BITFIELD hook_bool_void_false
>
> +#undef TARGET_OFFLOAD_OPTIONS
> +#define TARGET_OFFLOAD_OPTIONS aarch64_offload_options
> +
>  #undef  TARGET_OPTION_OVERRIDE
>  #define TARGET_OPTION_OVERRIDE aarch64_override_options

According to 'gcc/target.def', 'offload_options' appears after
'override_options_after_change', so maybe both 'aarch64_offload_options'
and '#define TARGET_OFFLOAD_OPTIONS' should appear after
'aarch64_override_options_after_change' and '#define
TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE' -- but it's not that all the other
ones would be sorted properly...  ;-)


Grüße
 Thomas
-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter


Re: [Patch] OpenMP: Support 'lastprivate (conditional:' in Fortran

2020-07-23 Thread Jakub Jelinek via Gcc-patches
On Thu, Jul 23, 2020 at 05:06:31PM +0200, Tobias Burnus wrote:
> Another simple-to-add feature as the main work was
> done in the middle-end for C/C++.
> 
> (I do note that 'master taskloop [simd]' is not yet
> supported in gfortran.)
> 
> OK?

LGTM, thanks.

Jakub



Re: [PATCH] contrib/vimrc: detect more C-like files

2020-07-23 Thread Martin Liška

On 7/23/20 4:44 PM, Patrick Palka via Gcc-patches wrote:

Here's a rebased patch.  Can another Vim user verify that this does the
right thing?


Yes, it works correctly for me! I tested that before and after your change
for ./libstdc++-v3/include/std/iomanip.

Please install the patch.
Martin


[Patch] OpenMP: Support 'lastprivate (conditional:' in Fortran

2020-07-23 Thread Tobias Burnus

Another simple-to-add feature as the main work was
done in the middle-end for C/C++.

(I do note that 'master taskloop [simd]' is not yet
supported in gfortran.)

OK?

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
OpenMP: Support 'lastprivate (conditional:' in Fortran

gcc/fortran/ChangeLog:

	* gfortran.h (gfc_omp_namelist): Add lastprivate_conditional.
	* openmp.c (gfc_match_omp_clauses): Handle 'conditional:'
	modifier of 'lastprivate'.
	* trans-openmp.c (gfc_omp_clause_default_ctor): Don't assert
	on OMP_CLAUSE__CONDTEMP_ and other OMP_*TEMP_.
	(gfc_trans_omp_variable_list): Handle lastprivate_conditional.

gcc/testsuite/ChangeLog:

	* gfortran.dg/gomp/lastprivate-conditional-1.f90: New test.
	* gfortran.dg/gomp/lastprivate-conditional-2.f90: New test.
	* gfortran.dg/gomp/lastprivate-conditional-3.f90: New test.
	* gfortran.dg/gomp/lastprivate-conditional-4.f90: New test.
	* gfortran.dg/gomp/lastprivate-conditional-5.f90: New test.

 gcc/fortran/gfortran.h |  1 +
 gcc/fortran/openmp.c   | 20 --
 gcc/fortran/trans-openmp.c | 23 --
 .../gfortran.dg/gomp/lastprivate-conditional-1.f90 | 82 ++
 .../gfortran.dg/gomp/lastprivate-conditional-2.f90 | 46 
 .../gfortran.dg/gomp/lastprivate-conditional-3.f90 | 65 +
 .../gfortran.dg/gomp/lastprivate-conditional-4.f90 | 28 
 .../gfortran.dg/gomp/lastprivate-conditional-5.f90 | 47 +
 8 files changed, 304 insertions(+), 8 deletions(-)

diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 1648831736c..5fa86aa4e30 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1242,6 +1242,7 @@ typedef struct gfc_omp_namelist
   gfc_omp_map_op map_op;
   gfc_omp_linear_op linear_op;
   struct gfc_common_head *common;
+  bool lastprivate_conditional;
 } u;
   struct gfc_omp_namelist_udr *udr;
   struct gfc_omp_namelist *next;
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index e89ae295a31..f8f2439b6e4 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -1355,10 +1355,22 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  break;
 	case 'l':
 	  if ((mask & OMP_CLAUSE_LASTPRIVATE)
-	  && gfc_match_omp_variable_list ("lastprivate (",
-	  >lists[OMP_LIST_LASTPRIVATE],
-	  true) == MATCH_YES)
-	continue;
+	  && gfc_match ("lastprivate ( ") == MATCH_YES)
+	{
+	  bool conditional = gfc_match ("conditional : ") == MATCH_YES;
+	  head = NULL;
+	  if (gfc_match_omp_variable_list ("",
+	   >lists[OMP_LIST_LASTPRIVATE],
+	   false, NULL, ) == MATCH_YES)
+		{
+		  gfc_omp_namelist *n;
+		  for (n = *head; n; n = n->next)
+		n->u.lastprivate_conditional = conditional;
+		  continue;
+		}
+	  gfc_current_locus = old_loc;
+	  break;
+	}
 	  end_colon = false;
 	  head = NULL;
 	  if ((mask & OMP_CLAUSE_LINEAR)
diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index 56bc7cd10cc..d12d7fbddac 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -613,10 +613,21 @@ gfc_omp_clause_default_ctor (tree clause, tree decl, tree outer)
   tree type = TREE_TYPE (decl), size, ptr, cond, then_b, else_b;
   stmtblock_t block, cond_block;
 
-  gcc_assert (OMP_CLAUSE_CODE (clause) == OMP_CLAUSE_PRIVATE
-	  || OMP_CLAUSE_CODE (clause) == OMP_CLAUSE_LASTPRIVATE
-	  || OMP_CLAUSE_CODE (clause) == OMP_CLAUSE_LINEAR
-	  || OMP_CLAUSE_CODE (clause) == OMP_CLAUSE_REDUCTION);
+  switch (OMP_CLAUSE_CODE (clause))
+{
+case OMP_CLAUSE__LOOPTEMP_:
+case OMP_CLAUSE__REDUCTEMP_:
+case OMP_CLAUSE__CONDTEMP_:
+case OMP_CLAUSE__SCANTEMP_:
+  return NULL;
+case OMP_CLAUSE_PRIVATE:
+case OMP_CLAUSE_LASTPRIVATE:
+case OMP_CLAUSE_LINEAR:
+case OMP_CLAUSE_REDUCTION:
+  break;
+default:
+  gcc_unreachable ();
+}
 
   if ((! GFC_DESCRIPTOR_TYPE_P (type)
|| GFC_TYPE_ARRAY_AKIND (type) != GFC_ARRAY_ALLOCATABLE)
@@ -1678,6 +1689,10 @@ gfc_trans_omp_variable_list (enum omp_clause_code code,
 	tree node = build_omp_clause (input_location, code);
 	OMP_CLAUSE_DECL (node) = t;
 	list = gfc_trans_add_clause (node, list);
+
+	if (code == OMP_CLAUSE_LASTPRIVATE
+		&& namelist->u.lastprivate_conditional)
+	  OMP_CLAUSE_LASTPRIVATE_CONDITIONAL (node) = 1;
 	  }
   }
   return list;
diff --git a/gcc/testsuite/gfortran.dg/gomp/lastprivate-conditional-1.f90 b/gcc/testsuite/gfortran.dg/gomp/lastprivate-conditional-1.f90
new file mode 100644
index 000..7a024061ec6
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/lastprivate-conditional-1.f90
@@ -0,0 +1,82 @@
+subroutine foo (p)
+  implicit none
+  logical :: p(:)
+  integer a, b, c, d, e, f, 

Re: [stage1][PATCH] Change semantics of -frecord-gcc-switches and add -frecord-gcc-switches-format.

2020-07-23 Thread Qing Zhao via Gcc-patches
Hi,

Thanks a lot for the info.

> On Jul 23, 2020, at 5:07 AM, Martin Liška  wrote:
> 
> On 7/21/20 6:24 PM, Qing Zhao wrote:
>> 4.
>> Our company is waiting for this patch to be committed to upstream.
> 
> Hello.
> 
> Please note that patch review can sometimes take some time.

Yes, I understand. 
However, this patch has been submitted for review since March this year, 4 
months passed.
Just want to make sure that this patch is not ignored and someone will take a 
look at it.

> You can still install the patch to your local branch and note that
> the next major release will be released in about 9 months.

Our production build of GCC needs this patch for our important application. 
So, if this patch will be changed due to some new comments from the reviewers, 
the back porting will
Need to be redone later, it’s really not convenient for us. 

I am hoping the review process of this patch can be done soon, it has been 
waiting for so long time already.

Thanks.

Qing

> 
> Martin



Re: [PATCH] contrib/vimrc: detect more C-like files

2020-07-23 Thread Patrick Palka via Gcc-patches
On Mon, 3 Feb 2020, Patrick Palka wrote:

> Currently this script doesn't set the indentation style for the standard 
> library
> headers under libstdc++/ because they lack a file extension.  But they do
> have a modeline, so the file type is still set appropriately by Vim.  So by
> inspecting , we can also detect these standard library headers as
> C-like files.
> 
> contrib/ChangeLog:
> 
>   * vimrc (SetStyle): Also look at  to determine whether to
>   a file is C-like.
> ---
>  contrib/vimrc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/contrib/vimrc b/contrib/vimrc
> index bbbe1dd449b..e2ccbe0fd10 100644
> --- a/contrib/vimrc
> +++ b/contrib/vimrc
> @@ -39,7 +39,7 @@ function! SetStyle()
>setlocal formatoptions-=ro formatoptions+=cqlt
>let l:ext = fnamemodify(l:fname, ":e")
>let l:c_exts = ['c', 'h', 'cpp', 'cc', 'C', 'H', 'def', 'java']
> -  if index(l:c_exts, l:ext) != -1
> +  if index(l:c_exts, l:ext) != -1 ||  == "cpp" ||  == "c"
>  setlocal cindent
>  setlocal cinoptions=>4,n-2,{2,^-2,:2,=2,g0,f0,h2,p4,t0,+2,(0,u0,w1,m0
>endif
> -- 
> 2.25.0.114.g5b0ca878e0

Here's a rebased patch.  Can another Vim user verify that this does the
right thing?

-- >8 --

Subject: [PATCH] contrib/vimrc: detect more C-like files

Currently this script doesn't set the indentation style for the standard
library headers under libstdc++/ because they lack a file extension.
But they do have a modeline, so the file type is still set appropriately
by Vim.  So by inspecting , we can also detect these standard
library headers as C-like files.

contrib/ChangeLog:

* vimrc (SetStyle): Also inspect  to determine whether
to a file is C-like.
---
---
 contrib/vimrc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/vimrc b/contrib/vimrc
index c207eead2e4..356d455b059 100644
--- a/contrib/vimrc
+++ b/contrib/vimrc
@@ -45,7 +45,7 @@ function! SetStyle()
 setlocal textwidth=80
   endif
   setlocal formatoptions-=ro formatoptions+=cqlt
-  if index(l:c_exts, l:ext) != -1
+  if index(l:c_exts, l:ext) != -1 ||  == "c" ||  == "cpp"
 setlocal cindent
 setlocal cinoptions=>4,n-2,{2,^-2,:2,=2,g0,f0,h2,p4,t0,+2,(0,u0,w1,m0
   endif
-- 
2.28.0.rc1



[Ada] Stub CUDA_Execute and CUDA_Global pragmas

2020-07-23 Thread Arnaud Charlet
This commit adds CUDA_Execute and CUDA_Global to the list of allowed
pragmas. It also implements basic validation of said pragmas.

gcc/ada/

* aspects.ads: Declare CUDA_Global as aspect.
* einfo.ads: Use Flag118 for the Is_CUDA_Kernel flag.
(Set_Is_CUDA_Kernel): New function.
(Is_CUDA_Kernel): New function.
* einfo.adb (Set_Is_CUDA_Kernel): New function.
(Is_CUDA_Kernel): New function.
* par-prag.adb (Prag): Ignore Pragma_CUDA_Execute and
Pragma_CUDA_global.
* rtsfind.ads: Define CUDA.Driver_Types.Stream_T and
CUDA.Vector_Types.Dim3 entities
* rtsfind.adb: Define CUDA_Descendant subtype.
(Get_Unit_Name): Handle CUDA_Descendant packages.
* sem_prag.ads: Mark CUDA_Global as aspect-specifying pragma.
* sem_prag.adb (Analyze_Pragma): Validate Pragma_CUDA_Execute and
Pragma_CUDA_Global.
* snames.ads-tmpl: Define Name_CUDA_Execute and Name_CUDA_Global.

Tested on x86_64-pc-linux-gnu, committed on master

diff --git gcc/ada/aspects.ads gcc/ada/aspects.ads
index 4e517d1..0394106 100644
--- gcc/ada/aspects.ads
+++ gcc/ada/aspects.ads
@@ -189,6 +189,7 @@ package Aspects is
   Aspect_Atomic_Components,
   Aspect_Disable_Controlled,-- GNAT
   Aspect_Discard_Names,
+  Aspect_CUDA_Global,   -- GNAT
   Aspect_Export,
   Aspect_Favor_Top_Level,   -- GNAT
   Aspect_Independent,
@@ -458,6 +459,7 @@ package Aspects is
   Aspect_Contract_Cases   => False,
   Aspect_Convention   => True,
   Aspect_CPU  => False,
+  Aspect_CUDA_Global  => False,
   Aspect_Default_Component_Value  => True,
   Aspect_Default_Initial_Condition=> False,
   Aspect_Default_Iterator => False,
@@ -601,6 +603,7 @@ package Aspects is
   Aspect_Contract_Cases   => Name_Contract_Cases,
   Aspect_Convention   => Name_Convention,
   Aspect_CPU  => Name_CPU,
+  Aspect_CUDA_Global  => Name_CUDA_Global,
   Aspect_Default_Component_Value  => Name_Default_Component_Value,
   Aspect_Default_Initial_Condition=> Name_Default_Initial_Condition,
   Aspect_Default_Iterator => Name_Default_Iterator,
@@ -839,6 +842,7 @@ package Aspects is
   Aspect_Attach_Handler   => Always_Delay,
   Aspect_Constant_Indexing=> Always_Delay,
   Aspect_CPU  => Always_Delay,
+  Aspect_CUDA_Global  => Always_Delay,
   Aspect_Default_Iterator => Always_Delay,
   Aspect_Default_Storage_Pool => Always_Delay,
   Aspect_Default_Value=> Always_Delay,
diff --git gcc/ada/einfo.adb gcc/ada/einfo.adb
index eab06ee..6cdea48 100644
--- gcc/ada/einfo.adb
+++ gcc/ada/einfo.adb
@@ -423,6 +423,7 @@ package body Einfo is
--Never_Set_In_Source Flag115
--Is_Visible_Lib_Unit Flag116
--Is_Unchecked_Union  Flag117
+   --Is_CUDA_Kernel  Flag118
--Has_Convention_Pragma   Flag119
--Has_Primitive_OperationsFlag120
 
@@ -2235,6 +2236,12 @@ package body Einfo is
   return Flag74 (Id);
end Is_CPP_Class;
 
+   function Is_CUDA_Kernel (Id : E) return B is
+   begin
+  pragma Assert (Ekind (Id) in E_Function | E_Procedure);
+  return Flag118 (Id);
+   end Is_CUDA_Kernel;
+
function Is_DIC_Procedure (Id : E) return B is
begin
   pragma Assert (Ekind_In (Id, E_Function, E_Procedure));
@@ -5477,6 +5484,12 @@ package body Einfo is
   Set_Flag74 (Id, V);
end Set_Is_CPP_Class;
 
+   procedure Set_Is_CUDA_Kernel (Id : E; V : B := True) is
+   begin
+  pragma Assert (Ekind (Id) in E_Function | E_Procedure);
+  Set_Flag118 (Id, V);
+   end Set_Is_CUDA_Kernel;
+
procedure Set_Is_DIC_Procedure (Id : E; V : B := True) is
begin
   pragma Assert (Ekind (Id) = E_Procedure);
@@ -9848,6 +9861,7 @@ package body Einfo is
   W ("Is_Atomic",   Flag85  (Id));
   W ("Is_Bit_Packed_Array", Flag122 (Id));
   W ("Is_CPP_Class",Flag74  (Id));
+  W ("Is_CUDA_Kernel",  Flag118  (Id));
   W ("Is_Called",   Flag102 (Id));
   W ("Is_Character_Type",   Flag63  (Id));
   W ("Is_Checked_Ghost_Entity", Flag277 (Id));
diff --git gcc/ada/einfo.ads gcc/ada/einfo.ads
index 758aef5..7932c92 100644
--- gcc/ada/einfo.ads
+++ gcc/ada/einfo.ads
@@ -2508,6 +2508,10 @@ package Einfo is
 --   Defined in all type entities, set only for tagged types to which a
 --   valid pragma Import (CPP, ...) or pragma CPP_Class has been applied.
 
+--Is_CUDA_Kernel (Flag118)
+--   Defined in function and procedure 

[nvptx, aarch64] Define TARGET_OFFLOAD_OPTIONS for AArch64

2020-07-23 Thread Matthias Klose
Trying to build a nvptx offload compiler on aarch64-linux-gnu, the libgomp tests
error out with

unrecognizable argument of option -foffload-abi

Passing that option goes a step further, hitting PR target/96265.  Define that
hook, as it was done for rs6000 in 2015.  Ok for the trunk?

Matthias

	* config/aarch64/aarch64.c (+aarch64_offload_options,
	TARGET_OFFLOAD_OPTIONS): New.

--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -22739,6 +22739,16 @@ aarch64_stack_protect_guard (void)
   return NULL_TREE;
 }
 
+/* Implement the TARGET_OFFLOAD_OPTIONS hook.  */
+static char *
+aarch64_offload_options (void)
+{
+  if (TARGET_ILP32)
+return xstrdup ("-foffload-abi=ilp32");
+  else
+return xstrdup ("-foffload-abi=lp64");
+}
+
 /* Return the diagnostic message string if conversion from FROMTYPE to
TOTYPE is not allowed, NULL otherwise.  */
 
@@ -23079,6 +23089,9 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_NARROW_VOLATILE_BITFIELD
 #define TARGET_NARROW_VOLATILE_BITFIELD hook_bool_void_false
 
+#undef TARGET_OFFLOAD_OPTIONS
+#define TARGET_OFFLOAD_OPTIONS aarch64_offload_options
+
 #undef  TARGET_OPTION_OVERRIDE
 #define TARGET_OPTION_OVERRIDE aarch64_override_options
 


Re: [PATCH] [AVX512] [PR87767] Optimize memory broadcast for constant vector under AVX512

2020-07-23 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 23, 2020 at 4:39 PM Jan Hubicka  wrote:
>
> Hello,
> sorry for taking so long to get to this.
> > diff --git a/gcc/config/i386/i386-features.c 
> > b/gcc/config/i386/i386-features.c
> > index 535fc7e981d..8f81d101382 100644
> > --- a/gcc/config/i386/i386-features.c
> > +++ b/gcc/config/i386/i386-features.c
> > @@ -2379,6 +2379,152 @@ make_pass_remove_partial_avx_dependency 
> > (gcc::context *ctxt)
> >return new pass_remove_partial_avx_dependency (ctxt);
> >  }
> >
> > +/* Replace all one-value const vector that are referenced by SYMBOL_REFs 
> > in x
> > +   with embedded broadcast. i.e.transform
> > +
> > + vpaddq .LC0(%rip), %zmm0, %zmm0
> > + ret
> > +  .LC0:
> > +.quad 3
> > +.quad 3
> > +.quad 3
> > +.quad 3
> > +.quad 3
> > +.quad 3
> > +.quad 3
> > +.quad 3
> > +
> > +to
> > +
> > + vpaddq .LC0(%rip){1to8}, %zmm0, %zmm0
>
> It seems to me that having a special purpose pass for this is bit
> overzelaous.  It seems to me that you can do same pattern matching via
> splitter and fit it into the usual insn splitting pass?
>

>From an implementation perspective, there could be lots of work, since
memory embedding broadcast is available for nearly every instruction
in AVX512. And for new added AVX512 instructions, we also need to add
a define_split for them.

> Honza
> > + ret
> > +  .LC0:
> > +.quad 3  */
> > +static void
> > +replace_constant_pool_with_broadcast (rtx_insn* insn)
> > +{
> > +  subrtx_ptr_iterator::array_type array;
> > +  FOR_EACH_SUBRTX_PTR (iter, array,  (insn), ALL)
> > +{
> > +  rtx *loc = *iter;
> > +  rtx x = *loc;
> > +  rtx broadcast_mem, vec_dup, constant, first;
> > +  machine_mode mode;
> > +  if (GET_CODE (x) != MEM
> > +   || GET_CODE (XEXP (x, 0)) != SYMBOL_REF
> > +   || !CONSTANT_POOL_ADDRESS_P (XEXP (x, 0)))
> > + continue;
> > +
> > +  mode = GET_MODE (x);
> > +  if (!VECTOR_MODE_P (mode))
> > + return;
> > +
> > +  constant = get_pool_constant (XEXP (x, 0));
> > +  first = XVECEXP (constant, 0, 0);
> > +  /* There could be some rtx like
> > +  (mem/u/c:V16QI (symbol_ref/u:DI ("*.LC1")))
> > +  but with "*.LC1" refer to V2DI constant vector.  */
> > +  if (GET_MODE (constant) != mode)
> > + return;
> > +
> > +  for (int i = 1; i < GET_MODE_NUNITS (mode); ++i)
> > + {
> > +   rtx tmp = XVECEXP (constant, 0, i);
> > +   /* Only handle one-value const vector.  */
> > +   if (!rtx_equal_p (tmp, first))
> > + return;
> > + }
> > +
> > +  broadcast_mem = force_const_mem (GET_MODE_INNER (mode), first);
> > +  vec_dup = gen_rtx_VEC_DUPLICATE (mode, broadcast_mem);
> > +  *loc = vec_dup;
> > +  INSN_CODE (insn) = -1;
> > +  /* Revert change if there's no corresponding pattern.  */
> > +  if (recog_memoized (insn) < 0)
> > + {
> > +   *loc = x;
> > +   recog_memoized (insn);
> > + }
> > +  /* At most 1 memory_operand in an insn.  */
> > +  return;
> > +}
> > +}
> > +
> > +/* For const vector having one duplicated value, there's no need to put
> > +   whole vector in the constant pool when target supports embedded 
> > broadcast. */
> > +static unsigned int
> > +constant_pool_broadcast (void)
> > +{
> > +  timevar_push (TV_MACH_DEP);
> > +  rtx_insn *insn;
> > +
> > +  for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
> > +{
> > +  if (!INSN_P (insn))
> > + continue;
> > +
> > +  /* Insns may appear inside a SEQUENCE.  Only check the patterns of
> > +  insns, not any notes that may be attached.  We don't want to mark
> > +  a constant just because it happens to appear in a REG_EQUIV note.  */
> > +  if (rtx_sequence *seq = dyn_cast  (PATTERN (insn)))
> > + {
> > +   int i, n = seq->len ();
> > +   for (i = 0; i < n; ++i)
> > + {
> > +   rtx subinsn = seq->element (i);
> > +   if (INSN_P (subinsn))
> > + replace_constant_pool_with_broadcast (dyn_cast  
> > (subinsn));
> > + }
> > + }
> > +  else
> > + replace_constant_pool_with_broadcast (insn);
> > +}
> > +  timevar_pop (TV_MACH_DEP);
> > +  return 0;
> > +}
> > +
> > +namespace {
> > +
> > +const pass_data pass_data_constant_pool_broadcast =
> > +{
> > +  RTL_PASS, /* type */
> > +  "cpb", /* name */
> > +  OPTGROUP_NONE, /* optinfo_flags */
> > +  TV_MACH_DEP, /* tv_id */
> > +  0, /* properties_required */
> > +  0, /* properties_provided */
> > +  0, /* properties_destroyed */
> > +  0, /* todo_flags_start */
> > +  TODO_df_finish, /* todo_flags_finish */
> > +};
> > +
> > +class pass_constant_pool_broadcast : public rtl_opt_pass
> > +{
> > +public:
> > +  pass_constant_pool_broadcast (gcc::context *ctxt)
> > +: rtl_opt_pass (pass_data_constant_pool_broadcast, ctxt)
> > +  {}
> > +
> > +  /* opt_pass methods: */
> > +  virtual bool gate (function *)
> > +

[Ada] Ada2020: AI12-0027 Access values and unaliased component

2020-07-23 Thread Arnaud Charlet
Access values should never designate unaliased components.
This new feature is documented in AI12-0027-1.

gcc/ada/

* sem_ch13.ads (Same_Representation): Renamed as
Has_Compatible_Representation because now the order of the arguments
are taken into account; its formals are also renamed as Target_Type
and Operand_Type.
* sem_ch13.adb (Same_Representation): Renamed and moved to place the
routine in alphabetic order.
* sem_attr.adb (Prefix_With_Safe_Accessibility_Level): New subprogram.
(Resolve_Attribute): Check that the prefix of attribute Access
does not have a value conversion of an array type.
* sem_res.adb (Resolve_Actuals): Remove restrictive check on view
conversions which required matching value of Has_Aliased_Components of
formals and actuals.
* exp_ch4.adb (Handle_Changed_Representation): Update call to
Same_Representation.
(Expand_N_Type_Conversion): Update call to Same_Representation.
* exp_ch5.adb (Change_Of_Representation): Update call to
Same_Representation.
* exp_ch6.adb (Add_Call_By_Copy_Code): Update call to
Same_Representation.
(Expand_Actuals): Update call to Same_Representation.
(Expand_Call_Helper): Update call to Same_Representation.

Tested on x86_64-pc-linux-gnu, committed on master

diff --git gcc/ada/exp_ch4.adb gcc/ada/exp_ch4.adb
index c35fea3..2f6dc3a 100644
--- gcc/ada/exp_ch4.adb
+++ gcc/ada/exp_ch4.adb
@@ -11436,7 +11436,7 @@ package body Exp_Ch4 is
   begin
  --  Nothing else to do if no change of representation
 
- if Same_Representation (Operand_Type, Target_Type) then
+ if Has_Compatible_Representation (Target_Type, Operand_Type) then
 return;
 
  --  The real change of representation work is done by the assignment
@@ -12454,7 +12454,7 @@ package body Exp_Ch4 is
  --  Special processing is required if there is a change of
  --  representation (from enumeration representation clauses).
 
- if not Same_Representation (Target_Type, Operand_Type)
+ if not Has_Compatible_Representation (Target_Type, Operand_Type)
and then not Conversion_OK (N)
  then
 
diff --git gcc/ada/exp_ch5.adb gcc/ada/exp_ch5.adb
index afd3bca..8482f30 100644
--- gcc/ada/exp_ch5.adb
+++ gcc/ada/exp_ch5.adb
@@ -278,8 +278,9 @@ package body Exp_Ch5 is
begin
   return
 Nkind (Rhs) = N_Type_Conversion
-  and then
-not Same_Representation (Etype (Rhs), Etype (Expression (Rhs)));
+  and then not Has_Compatible_Representation
+ (Target_Type  => Etype (Rhs),
+  Operand_Type => Etype (Expression (Rhs)));
end Change_Of_Representation;
 
--
diff --git gcc/ada/exp_ch6.adb gcc/ada/exp_ch6.adb
index 41ed764..3886493 100644
--- gcc/ada/exp_ch6.adb
+++ gcc/ada/exp_ch6.adb
@@ -1571,8 +1571,9 @@ package body Exp_Ch6 is
 
 Var := Make_Var (Expression (Actual));
 
-Crep := not Same_Representation
-  (F_Typ, Etype (Expression (Actual)));
+Crep := not Has_Compatible_Representation
+  (Target_Type  => F_Typ,
+   Operand_Type => Etype (Expression (Actual)));
 
  else
 V_Typ := Etype (Actual);
@@ -2373,9 +2374,9 @@ package body Exp_Ch6 is
 
   --  Also pass by copy if change of representation
 
-  or else not Same_Representation
-(Etype (Formal),
- Etype (Expression (Actual
+  or else not Has_Compatible_Representation
+(Target_Type  => Etype (Formal),
+ Operand_Type => Etype (Expression (Actual
 then
Add_Call_By_Copy_Code;
 
@@ -4801,7 +4802,10 @@ package body Exp_Ch6 is
   --  If there is a change of representation, then generate a
   --  warning, and do the change of representation.
 
-  elsif not Same_Representation (Formal_Typ, Parent_Typ) then
+  elsif not Has_Compatible_Representation
+  (Target_Type  => Formal_Typ,
+   Operand_Type => Parent_Typ)
+  then
  Error_Msg_N
("??change of representation required", Actual);
  Convert (Actual, Parent_Typ);
diff --git gcc/ada/sem_attr.adb gcc/ada/sem_attr.adb
index 78da069..2439169 100644
--- gcc/ada/sem_attr.adb
+++ gcc/ada/sem_attr.adb
@@ -10556,6 +10556,13 @@ package body Sem_Attr is
   --  Returns True if Declared_Entity is declared within the declarative
   --  region of Generic_Unit; otherwise returns False.
 
+  function 

[PATCH] gcov-tool: Fix merging of different endian coverage data [PR96267]

2020-07-23 Thread Martin Liška

Hi.

There's one obvious patch that I'm going to commmit on behalf of Dong.

Martin

2020-07-21  Dong JianQiang  

gcc/ChangeLog:

PR gcov-profile/96267
* gcov-io.c (gcov_open): enable if IN_GCOV_TOOL.
---
 gcc/gcov-io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/gcov-io.c b/gcc/gcov-io.c
index ac60f9baf47..4db56f8aacf 100644
--- a/gcc/gcov-io.c
+++ b/gcc/gcov-io.c
@@ -144,7 +144,7 @@ gcov_open (const char *name, int mode)
   gcov_var.offset = gcov_var.length = 0;
   gcov_var.overread = -1u;
   gcov_var.error = 0;
-#if !IN_LIBGCOV
+#if !IN_LIBGCOV || defined (IN_GCOV_TOOL)
   gcov_var.endian = 0;
 #endif
 #if GCOV_LOCKED
--
2.27.0



[Ada] Add push/pop capability in Output

2020-07-23 Thread Arnaud Charlet
Add the capability to use the Write_* procedures in an environment where
you want to write debugging info but still use them to write to other
files (such a C source files).

[changelog]

* output.ads (Push_Output, Pop_Output): New procedures.
* output.adb (FD_Array, FD_Stack, FD_Stack_Idx): New type and vars.
(Push_Output, Pop_Output): New procedures.

Tested on x86_64-pc-linux-gnu, commited on trunk

---
 output.adb | 29 +
 output.ads |  9 +
 2 files changed, 38 insertions(+)

diff --git output.adb output.adb
index d36aac8..971819b 100644
--- output.adb
+++ output.adb
@@ -45,6 +45,13 @@ package body Output is
Current_FD : File_Descriptor := Standout;
--  File descriptor for current output
 
+   type FD_Array is array (Nat range 1 .. 3) of File_Descriptor;
+   FD_Stack : FD_Array;
+   FD_Stack_Idx : Nat := FD_Array'First - 1;
+   --  Maintain a small stack for Push_Output and Pop_Output. We'd normally
+   --  use Table for this and allow an unlimited depth, but we're the target
+   --  of a pragma Elaborate_All in Table, so we can't use it here.
+
Special_Output_Proc : Output_Proc := null;
--  Record argument to last call to Set_Special_Output. If this is
--  non-null, then we are in special output mode.
@@ -228,6 +235,28 @@ procedure Outdent is
 (Cur_Indentation - Indentation_Amount) mod Indentation_Limit;
end Outdent;
 
+   
+   -- Pop_Output --
+   
+
+   procedure Pop_Output is
+   begin
+  pragma Assert (FD_Stack_Idx >= FD_Array'First);
+  Current_FD := FD_Stack (FD_Stack_Idx);
+  FD_Stack_Idx := FD_Stack_Idx - 1;
+   end Pop_Output;
+
+   -
+   -- Push_Output --
+   -
+
+   procedure Push_Output is
+   begin
+  pragma Assert (FD_Stack_Idx < FD_Array'Last);
+  FD_Stack_Idx := FD_Stack_Idx + 1;
+  FD_Stack (FD_Stack_Idx) := Current_FD;
+   end Push_Output;
+
---
-- Restore_Output_Buffer --
---
diff --git output.ads output.ads
index 4574cce..55d308a 100644
--- output.ads
+++ output.ads
@@ -95,6 +95,15 @@ procedure Set_Output (FD : File_Descriptor);
--  output will appear on the given file descriptor only after special
--  output has been cancelled.
 
+   procedure Push_Output;
+   --  Saves the current output destination on a stack, but leaves it
+   --  unchanged. This subprogram only supports a small stack and is normally
+   --  used with a depth of one.
+
+   procedure Pop_Output;
+   --  Changes the current output destination to be the last output destination
+   --  popped using Push_Output.
+
procedure Indent;
--  Increases the current indentation level. Whenever a line is written
--  (triggered by Eol), an appropriate amount of whitespace is added to the
-- 
2.1.4



Re: [Patch] libomp: Add omp_depend_kind to omp_lib.{f90,h}

2020-07-23 Thread Jakub Jelinek via Gcc-patches
On Thu, Jul 23, 2020 at 02:42:25PM +0200, Tobias Burnus wrote:
> libomp: Add omp_depend_kind to omp_lib.{f90,h}
> 
> gcc/fortran/ChangeLog:
> 
>   * intrinsic.texi (OMP_LIB_KINDS): Add omp_depend_kind.
> 
> libgomp/ChangeLog:
> 
>   * configure.ac: Add OMP_DEPEND_KIND and OMP_INT128_SIZE.

Do you need
> +AC_SUBST(OMP_INT128_SIZE)
this for anything?  I don't see @OMP_INT128_SIZE@ ever used anywhere.

Otherwise LGTM.

Jakub



Re: [Patch] OpenMP: Update gcc/fortran/*.texi

2020-07-23 Thread Tobias Burnus

Hi Thomas,

On 7/23/20 12:09 PM, Thomas Koenig wrote:

can you also update https://gcc.gnu.org/gcc-11/changes.html ?

do you mind if I put it on the backburner? I think there will be several
more OpenMP changes during GCC 11 and I hope that one can then write a
final status which will be: officially support OpenMP 4.5 support (now
also) for Fortran plus a large chunk of 5.0/5.1 for C/C++/Fortran. –
That seems to be simpler than keeping updating the changes.html.

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter


[Patch] libomp: Add omp_depend_kind to omp_lib.{f90,h}

2020-07-23 Thread Tobias Burnus

As just discussed,* this patch adds omp_depend_kind
to Fortran's omp_lib.{h,f90}.
I fixed a problem in omp_lib.h (which as already fixed
in the module version) and I added a sizeof check,
which I missed when adding the other items previously.

OK?

Tobias
* https://gcc.gnu.org/pipermail/gcc-patches/2020-July/550517.html

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
libomp: Add omp_depend_kind to omp_lib.{f90,h}

gcc/fortran/ChangeLog:

	* intrinsic.texi (OMP_LIB_KINDS): Add omp_depend_kind.

libgomp/ChangeLog:

	* configure.ac: Add OMP_DEPEND_KIND and OMP_INT128_SIZE.
	* libgomp_f.h.in (omp_check_defines): Check whether
	sizeof of determined Fortran kind and C typedef match.
	* omp_lib.f90.in: Add omp_depened_kind.
	* omp_lib.h.in: Likewise; fix omp_alloctrait_key_kind.
	* configure: Regenerate.
	* Makefile.in: Regenerate.
	* testsuite/Makefile.in: Regenerate.

 gcc/fortran/intrinsic.texi|  1 +
 libgomp/Makefile.in   |  2 ++
 libgomp/configure | 31 +--
 libgomp/configure.ac  | 14 ++
 libgomp/libgomp_f.h.in|  7 ++-
 libgomp/omp_lib.f90.in|  1 +
 libgomp/omp_lib.h.in  |  4 +++-
 libgomp/testsuite/Makefile.in |  2 ++
 8 files changed, 58 insertions(+), 4 deletions(-)

diff --git a/gcc/fortran/intrinsic.texi b/gcc/fortran/intrinsic.texi
index 1b93d251509..6268feca70c 100644
--- a/gcc/fortran/intrinsic.texi
+++ b/gcc/fortran/intrinsic.texi
@@ -15317,6 +15317,7 @@ named constants:
 @item @code{omp_allocator_handle_kind}
 @item @code{omp_alloctrait_key_kind}
 @item @code{omp_alloctrait_val_kind}
+@item @code{omp_depend_kind}
 @item @code{omp_lock_kind}
 @item @code{omp_lock_hint_kind}
 @item @code{omp_nest_lock_kind}
diff --git a/libgomp/configure.ac b/libgomp/configure.ac
index d1034dab7f8..dc840754ca1 100644
--- a/libgomp/configure.ac
+++ b/libgomp/configure.ac
@@ -396,6 +396,9 @@ for i in $config_path; do
 done
 
 _AC_COMPUTE_INT([sizeof (__INTPTR_TYPE__)], [INTPTR_T_KIND])
+_AC_COMPUTE_INT([sizeof (__int128)], [OMP_INT128_SIZE],,[OMP_INT128_SIZE=0])
+_AC_COMPUTE_INT([2*sizeof (__INTPTR_TYPE__)], [OMP_DEPEND_KIND],,
+		[OMP_DEPEND_KIND=0])
 _AC_COMPUTE_INT([sizeof (omp_lock_t)], [OMP_LOCK_SIZE],,
   [AC_MSG_ERROR([unsupported system, cannot find sizeof (omp_lock_t)])])
 _AC_COMPUTE_INT([__alignof (omp_lock_t)], [OMP_LOCK_ALIGN])
@@ -428,6 +431,15 @@ fi
 if test $OMP_NEST_LOCK_25_SIZE -gt 8 || test $OMP_NEST_LOCK_25_ALIGN -gt $OMP_NEST_LOCK_25_SIZE; then
   OMP_NEST_LOCK_25_KIND=8
 fi
+if test $OMP_DEPEND_KIND -eq 16; then
+  if $OMP_INT128_SIZE -ne 16; then
+AC_MSG_ERROR([unsupported system, cannot find Fortran int kind=16, needed for omp_depend_kind])
+  fi
+else
+  if test $OMP_DEPEND_KIND -ne 8; then
+AC_MSG_ERROR([unsupported system, cannot find Fortran integer kind for omp_depend_kind])
+  fi
+fi
 
 AC_SUBST(INTPTR_T_KIND)
 AC_SUBST(OMP_LOCK_SIZE)
@@ -442,6 +454,8 @@ AC_SUBST(OMP_NEST_LOCK_25_SIZE)
 AC_SUBST(OMP_NEST_LOCK_25_ALIGN)
 AC_SUBST(OMP_LOCK_25_KIND)
 AC_SUBST(OMP_NEST_LOCK_25_KIND)
+AC_SUBST(OMP_INT128_SIZE)
+AC_SUBST(OMP_DEPEND_KIND)
 CFLAGS="$save_CFLAGS"
 
 # Determine what GCC version number to use in filesystem paths.
diff --git a/libgomp/libgomp_f.h.in b/libgomp/libgomp_f.h.in
index 3359352a0b4..d629503b8da 100644
--- a/libgomp/libgomp_f.h.in
+++ b/libgomp/libgomp_f.h.in
@@ -79,7 +79,12 @@ omp_check_defines (void)
 	 || @OMP_NEST_LOCK_SIZE@ != sizeof (omp_nest_lock_t)
 	 || @OMP_NEST_LOCK_ALIGN@ != __alignof (omp_nest_lock_t)
 	 || @OMP_LOCK_KIND@ != sizeof (*(omp_lock_arg_t) 0)
-	 || @OMP_NEST_LOCK_KIND@ != sizeof (*(omp_nest_lock_arg_t) 0))
+	 || @OMP_NEST_LOCK_KIND@ != sizeof (*(omp_nest_lock_arg_t) 0)
+	 || @INTPTR_T_KIND@ != sizeof (omp_allocator_handle_t)
+	 || 4 != sizeof (omp_alloctrait_key_t)
+	 || @INTPTR_T_KIND@ != sizeof (omp_alloctrait_value_t)
+	 || @INTPTR_T_KIND@ != sizeof (omp_memspace_handle_t)
+	 || @OMP_DEPEND_KIND@ != sizeof (omp_depend_t))
 	? -1 : 1] __attribute__ ((__unused__));
   char test2[(@OMP_LOCK_25_SIZE@ != sizeof (omp_lock_25_t)
 	 || @OMP_LOCK_25_ALIGN@ != __alignof (omp_lock_25_t)
diff --git a/libgomp/omp_lib.f90.in b/libgomp/omp_lib.f90.in
index b22bcbaf770..3ec31ac1f1f 100644
--- a/libgomp/omp_lib.f90.in
+++ b/libgomp/omp_lib.f90.in
@@ -38,6 +38,7 @@
 integer, parameter :: omp_alloctrait_key_kind = c_int
 integer, parameter :: omp_alloctrait_val_kind = c_intptr_t
 integer, parameter :: omp_memspace_handle_kind = c_intptr_t
+integer, parameter :: omp_depend_kind = @OMP_DEPEND_KIND@
 integer (omp_sched_kind), parameter :: omp_sched_static = 1
 integer (omp_sched_kind), parameter :: omp_sched_dynamic = 2
 integer (omp_sched_kind), parameter :: omp_sched_guided = 3
diff --git a/libgomp/omp_lib.h.in 

Re: [PATCH v3] dse: Remove partial load after full store for high part access[PR71309]

2020-07-23 Thread luoxhu via Gcc-patches



On 2020/7/23 04:30, Richard Sandiford wrote:
> 
> I now realise the reason is that the starting mode is too wide.
> I think we should fix that by doing:
> 
>FOR_EACH_MODE_IN_CLASS (new_mode_iter, MODE_INT)
>  {
>…
> 
> and then add:
> 
>if (maybe_lt (GET_MODE_SIZE (new_mode), access_size))
>  continue;
> 
> after your optimisation, so that the shift code proper still only
> considers modes that are wide enough to hold the unshifted value.
> 
> I don't think there are any efficiency concerns with that, since
> smallest_int_mode_for_size does its own similar iteration internally.
> 
> Sorry for not picking up on that first time.
> 

Thanks:), I didn't make it clear that it starts from TImode first...

The updated patch use "FOR_EACH_MODE_IN_CLASS (new_mode_iter, MODE_INT)"
to iterate from QImode now, but still need size "new_mode <= store_mode"
to get legal subreg, otherwise "gfortran.dg/shift-kind.f90" will fail.
Even so, there are still below execute fail when running regression tests:

gfortran.fortran-torture/execute/equiv_2.f90 execution,  -O2
gfortran.fortran-torture/execute/equiv_2.f90 execution,  -O2 -fbounds-check
gfortran.fortran-torture/execute/equiv_2.f90 execution,  -O2 
-fomit-frame-pointer -finline-functions
gfortran.fortran-torture/execute/equiv_2.f90 execution,  -O2 
-fomit-frame-pointer -finline-functions -funroll-loops
gfortran.fortran-torture/execute/equiv_2.f90 execution,  -O3 -g
gfortran.fortran-torture/execute/equiv_2.f90 execution, -O2 -ftree-vectorize 
-maltivec 

Any clue about the execution outputs "STOP 2", please? Still investigating.

PS: if switch the sequence of GET_MODE_BITSIZE (new_mode) and shift in 
multiple_p,
it will generates incorrect ASM for PR71309.


This patch could optimize (works for char/short/int/void*):

6: r119:TI=[r118:DI+0x10]
7: [r118:DI]=r119:TI
8: r121:DI=[r118:DI+0x8]

=>

6: r119:TI=[r118:DI+0x10]
16: r122:DI=r119:TI#8

Final ASM will be as below without partial load after full store(stxv+ld):
  ld 10,16(3)
  mr 9,3
  ld 3,24(3)
  std 10,0(9)
  std 3,8(9)
  blr

It could achieve ~25% performance improvement for typical cases on
Power9.  Bootstrap and regression tested on Power9-LE.

For AArch64, one ldr is replaced by mov with this patch:

ldp x2, x3, [x0, 16]
stp x2, x3, [x0]
ldr x0, [x0, 8]

=>

mov x1, x0
ldp x2, x0, [x0, 16]
stp x2, x0, [x1]

gcc/ChangeLog:

2020-07-23  Xionghu Luo  

PR rtl-optimization/71309
* dse.c (find_shift_sequence): Use subreg of shifted from high part
register to avoid loading from address.

gcc/testsuite/ChangeLog:

2020-07-23  Xionghu Luo  

PR rtl-optimization/71309
* gcc.target/powerpc/pr71309.c: New test.
---
 gcc/dse.c  | 22 +--
 gcc/testsuite/gcc.target/powerpc/pr71309.c | 33 ++
 2 files changed, 53 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr71309.c

diff --git a/gcc/dse.c b/gcc/dse.c
index bbe792e48e8..afc6ad30623 100644
--- a/gcc/dse.c
+++ b/gcc/dse.c
@@ -1728,8 +1728,7 @@ find_shift_sequence (poly_int64 access_size,
  the machine.  */
 
   opt_scalar_int_mode new_mode_iter;
-  FOR_EACH_MODE_FROM (new_mode_iter,
- smallest_int_mode_for_size (access_size * BITS_PER_UNIT))
+  FOR_EACH_MODE_IN_CLASS (new_mode_iter, MODE_INT)
 {
   rtx target, new_reg, new_lhs;
   rtx_insn *shift_seq, *insn;
@@ -1779,6 +1778,25 @@ find_shift_sequence (poly_int64 access_size,
  && !targetm.modes_tieable_p (new_mode, store_mode))
continue;
 
+  if (multiple_p (GET_MODE_BITSIZE (new_mode), shift)
+ && known_le (GET_MODE_SIZE (new_mode), GET_MODE_SIZE (store_mode)))
+   {
+ /* Try to implement the shift using a subreg.  */
+ poly_int64 offset
+   = subreg_offset_from_lsb (new_mode, store_mode, shift);
+ rtx rhs_subreg
+   = simplify_gen_subreg (new_mode, store_info->rhs, store_mode, 
offset);
+ if (rhs_subreg)
+   {
+ read_reg
+   = extract_low_bits (read_mode, new_mode, copy_rtx (rhs_subreg));
+ break;
+   }
+   }
+
+  if (maybe_lt (GET_MODE_SIZE (new_mode), access_size))
+   continue;
+
   new_reg = gen_reg_rtx (new_mode);
 
   start_sequence ();
diff --git a/gcc/testsuite/gcc.target/powerpc/pr71309.c 
b/gcc/testsuite/gcc.target/powerpc/pr71309.c
new file mode 100644
index 000..94d727a8ed9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr71309.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power9" } */
+
+#define TYPE void*
+#define TYPE2 void*
+
+struct path {
+TYPE2 mnt;
+TYPE dentry;
+};
+
+struct nameidata {
+struct path path;
+struct path root;
+};
+
+__attribute__ ((noinline))
+TYPE foo(struct nameidata *nd)
+{
+  TYPE d;
+  

Re: [PATCH] middle-end: Fold popcount(x&4) to (x>>2)&1 and friends.

2020-07-23 Thread Richard Biener via Gcc-patches
On Thu, Jul 23, 2020 at 2:01 PM Roger Sayle  wrote:
>
>
> On Thu, Jul 23, 2020 at 10:02 Richard Biener wrote:
> > Likewise for the existing
> >
> >+/* popcount(X) == 0 is X == 0, and related (in)equalities.  */ (for
> >+(for popcount (POPCOUNT)
> >   (for cmp (le eq ne gt)
> >   rep (eq eq ne ne)
> >   (simplify
> >   (cmp (popcount @0) integer_zerop)
> >   (rep @0 { build_zero_cst (TREE_TYPE (@0)); }
> >
> > you can elide the (for ...)
>
> Unfortunately, I tried that myself when preparing this patch:
>
> Attempt #1:
>
>(for cmp (le eq ne gt)
> rep (eq eq ne ne)
>  (simplify
>(cmp (POPCOUNT @0) integer_zerop)
>(rep @0 { build_zero_cst (TREE_TYPE (@0)); }
>
> results in:
> build/genmatch --gimple ../../patcha3/gcc/match.pd \
> > tmp-gimple-match.c
> ../../patcha3/gcc/match.pd:5977:11 error: operator-list POPCOUNT cannot be 
> expanded inside 'for'
> (cmp (POPCOUNT @0) integer_zerop)
>   ^
> Attempt #2:
>
> (for popcount (POPCOUNT)
>  cmp (le eq ne gt)
>  rep (eq eq ne ne)
>   (simplify
> (cmp (popcount @0) integer_zerop)
> (rep @0 { build_zero_cst (TREE_TYPE (@0)); })))
>
> results in:
> > tmp-gimple-match.c
> ../../patcha3/gcc/match.pd:5975:22 error: All user-defined identifiers must 
> have a multiple number of operator substitutions of the smallest number of 
> substitutions
>  cmp (le eq ne gt)
>  ^

Oh yeah - I failed to spot the outer (for ..), indeed it doesn't work
then.  We're not magically
creating a nest from the first vairant which we could make work - I
erred on the side of
"too much implicitness" caution here.

> I'll leave it the way it is with the nested FORs, and investigate your other 
> suggestions.

Yeah, fine.

> > OK with those changes.
> >
> > Richard.
>
>
> Roger
> --
>
>


RE: [PATCH] middle-end: Fold popcount(x&4) to (x>>2)&1 and friends.

2020-07-23 Thread Roger Sayle


On Thu, Jul 23, 2020 at 10:02 Richard Biener wrote:
> Likewise for the existing
>
>+/* popcount(X) == 0 is X == 0, and related (in)equalities.  */ (for 
>+(for popcount (POPCOUNT)
>   (for cmp (le eq ne gt)
>   rep (eq eq ne ne)
>   (simplify
>   (cmp (popcount @0) integer_zerop)
>   (rep @0 { build_zero_cst (TREE_TYPE (@0)); }
>
> you can elide the (for ...)

Unfortunately, I tried that myself when preparing this patch:

Attempt #1:

   (for cmp (le eq ne gt)
rep (eq eq ne ne)
 (simplify
   (cmp (POPCOUNT @0) integer_zerop)
   (rep @0 { build_zero_cst (TREE_TYPE (@0)); }

results in:
build/genmatch --gimple ../../patcha3/gcc/match.pd \
> tmp-gimple-match.c
../../patcha3/gcc/match.pd:5977:11 error: operator-list POPCOUNT cannot be 
expanded inside 'for'
(cmp (POPCOUNT @0) integer_zerop)
  ^
Attempt #2:

(for popcount (POPCOUNT)
 cmp (le eq ne gt)
 rep (eq eq ne ne)
  (simplify
(cmp (popcount @0) integer_zerop)
(rep @0 { build_zero_cst (TREE_TYPE (@0)); })))

results in:
> tmp-gimple-match.c
../../patcha3/gcc/match.pd:5975:22 error: All user-defined identifiers must 
have a multiple number of operator substitutions of the smallest number of 
substitutions
 cmp (le eq ne gt)
 ^

I'll leave it the way it is with the nested FORs, and investigate your other 
suggestions.

> OK with those changes.
>
> Richard.


Roger
--




Re: [PATCH] Implement no_stack_protect attribute.

2020-07-23 Thread Martin Liška

PING^3

On 6/24/20 11:09 AM, Martin Liška wrote:

PING^2

On 6/10/20 10:12 AM, Martin Liška wrote:

PING^1

On 5/25/20 3:10 PM, Martin Liška wrote:

On 5/21/20 4:53 PM, Martin Sebor wrote:

On 5/21/20 5:28 AM, Martin Liška wrote:

On 5/18/20 10:37 PM, Martin Sebor wrote:

I know there are some somewhat complex cases the attribute exclusion
mechanism isn't general enough to handle but this seems simple enough
that it should work.  Unless I'm missing something that makes it not
feasible I would suggest to use it.


Hi Martin.

Do we have a better place where we check for attribute collision?


If by collision you mean the same thing as the mutual exclusion I was
talking about then that's done by creating an attribute_spec::exclusions
array like for instance attr_cold_hot_exclusions in c-attribs.c and
pointing to it from the attribute_spec entries for each of
the mutually exclusive attributes in the attribute table.  Everything
else is handled automatically by decl_attributes.

Martin


Thanks, I'm sending updated version of the patch that utilizes the conflict
detection.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin








Re: [PATCH] middle-end: Fold popcount(x&4) to (x>>2)&1 and friends.

2020-07-23 Thread Richard Biener via Gcc-patches
On Thu, Jul 23, 2020 at 12:58 PM Richard Sandiford
 wrote:
>
> Richard Biener via Gcc-patches  writes:
> > On Thu, Jul 23, 2020 at 11:11 AM Marc Glisse  wrote:
> >>
> >> On Thu, 23 Jul 2020, Marc Glisse wrote:
> >>
> >> > On Wed, 22 Jul 2020, Roger Sayle wrote:
> >> >
> >> >> Many thanks for the peer review and feedback.  I completely agree that
> >> >> POPCOUNT
> >> >> and PARITY iterators simplifies things and handle the IFN_ variants.
> >> >
> >> > Is there a reason why the iterators cannot be used for this one?
> >> >
> >> > +(for popcount (BUILT_IN_POPCOUNT BUILT_IN_POPCOUNTL BUILT_IN_POPCOUNTLL
> >> > +BUILT_IN_POPCOUNTIMAX)
> >> > + parity (BUILT_IN_PARITY BUILT_IN_PARITYL BUILT_IN_PARITYLL
> >> > +  BUILT_IN_PARITYIMAX)
> >> > +  (simplify
> >> > +(bit_and (popcount @0) integer_onep)
> >> > +(parity @0)))
> >>
> >> Ah, maybe because we may have platforms/modes where the optab for popcount
> >> is supported but not the one for parity, and we are not allowed to create
> >> internal calls if the optab is not supported? I think expand_parity tries
> >> to expand parity as popcount&1, so it should be fine, but I didn't
> >> actually try it...
> >
> > Hmm, that might indeed be a problem.  An explicit
> > direct_internal_fn_supported_p guard would help here, like
> > maybe
> >
> >   (if (PARITY != IFN_PARITY
> >|| direct_internal_fn_supported_p (IFN_PARITY, type, 
> > OPTIMIZE_FOR_BOTH))
> >   (PARITY @0)))
> >
> > magic that eventually could be auto-generated by genmatch ... (but only if
> > iterating, so it would be a bit iffy)
>
> The matching code should already reject folds to IFN_PARITY calls that
> aren't supported by the target (gimple-match-head.c:build_call_internal).

Ah, indeed, so it should work unchanged even.

Richard.

> Thanks,
> Richard


Re: [PATCH] middle-end: Fold popcount(x&4) to (x>>2)&1 and friends.

2020-07-23 Thread Richard Sandiford
Richard Biener via Gcc-patches  writes:
> On Thu, Jul 23, 2020 at 11:11 AM Marc Glisse  wrote:
>>
>> On Thu, 23 Jul 2020, Marc Glisse wrote:
>>
>> > On Wed, 22 Jul 2020, Roger Sayle wrote:
>> >
>> >> Many thanks for the peer review and feedback.  I completely agree that
>> >> POPCOUNT
>> >> and PARITY iterators simplifies things and handle the IFN_ variants.
>> >
>> > Is there a reason why the iterators cannot be used for this one?
>> >
>> > +(for popcount (BUILT_IN_POPCOUNT BUILT_IN_POPCOUNTL BUILT_IN_POPCOUNTLL
>> > +BUILT_IN_POPCOUNTIMAX)
>> > + parity (BUILT_IN_PARITY BUILT_IN_PARITYL BUILT_IN_PARITYLL
>> > +  BUILT_IN_PARITYIMAX)
>> > +  (simplify
>> > +(bit_and (popcount @0) integer_onep)
>> > +(parity @0)))
>>
>> Ah, maybe because we may have platforms/modes where the optab for popcount
>> is supported but not the one for parity, and we are not allowed to create
>> internal calls if the optab is not supported? I think expand_parity tries
>> to expand parity as popcount&1, so it should be fine, but I didn't
>> actually try it...
>
> Hmm, that might indeed be a problem.  An explicit
> direct_internal_fn_supported_p guard would help here, like
> maybe
>
>   (if (PARITY != IFN_PARITY
>|| direct_internal_fn_supported_p (IFN_PARITY, type, 
> OPTIMIZE_FOR_BOTH))
>   (PARITY @0)))
>
> magic that eventually could be auto-generated by genmatch ... (but only if
> iterating, so it would be a bit iffy)

The matching code should already reject folds to IFN_PARITY calls that
aren't supported by the target (gimple-match-head.c:build_call_internal).

Thanks,
Richard


Re: [PATCH v2] Add --ld-path= to specify an arbitrary executable as the linker

2020-07-23 Thread Martin Liška

On 7/21/20 6:07 AM, Fangrui Song wrote:

If the value does not contain any path component separator (e.g. a
slash), the linker will be searched for using COMPILER_PATH followed by
PATH. Otherwise, it is either an absolute path or a path relative to the
current working directory.

--ld-path= complements and overrides -fuse-ld={bfd,gold,lld}. If in the
future, we want to make dfferent linker option decisions we can let
-fuse-ld= represent the linker flavor and --ld-path= the linker path.


Hello.

I have just few nits:

=== ERROR type #3: trailing operator (1 error(s)) ===
gcc/collect2.c:1155:14: ld_file_name =



PR driver/93645
* common.opt (--ld-path=): Add --ld-path=
* opts.c (common_handle_option): Handle OPT__ld_path_.
* gcc.c (driver_handle_option): Likewise.
* collect2.c (main): Likewise.
* doc/invoke.texi: Document --ld-path=.

---
Changes in v2:
* Renamed -fld-path= to --ld-path= (clang 12.0.0 new option).
   The option does not affect code generation and is not a language feature,
   -f* is not suitable. Additionally, clang has other similar --*-path=
   options, e.g. --cuda-path=.
---
  gcc/collect2.c  | 63 +++--
  gcc/common.opt  |  4 +++
  gcc/doc/invoke.texi |  9 +++
  gcc/gcc.c   |  2 +-
  gcc/opts.c  |  1 +
  5 files changed, 64 insertions(+), 15 deletions(-)

diff --git a/gcc/collect2.c b/gcc/collect2.c
index f8a5ce45994..caa1b96ab52 100644
--- a/gcc/collect2.c
+++ b/gcc/collect2.c
@@ -844,6 +844,7 @@ main (int argc, char **argv)
const char **ld1;
bool use_plugin = false;
bool use_collect_ld = false;
+  const char *ld_path = NULL;
  
/* The kinds of symbols we will have to consider when scanning the

   outcome of a first pass link.  This is ALL to start with, then might
@@ -961,12 +962,21 @@ main (int argc, char **argv)
if (selected_linker == USE_DEFAULT_LD)
  selected_linker = USE_PLUGIN_LD;
  }
-   else if (strcmp (argv[i], "-fuse-ld=bfd") == 0)
- selected_linker = USE_BFD_LD;
-   else if (strcmp (argv[i], "-fuse-ld=gold") == 0)
- selected_linker = USE_GOLD_LD;
-   else if (strcmp (argv[i], "-fuse-ld=lld") == 0)
- selected_linker = USE_LLD_LD;
+   else if (strncmp (argv[i], "-fuse-ld=", 9) == 0
+&& selected_linker != USE_LD_MAX)
+ {
+   if (strcmp (argv[i] + 9, "bfd") == 0)
+ selected_linker = USE_BFD_LD;
+   else if (strcmp (argv[i] + 9, "gold") == 0)
+ selected_linker = USE_GOLD_LD;
+   else if (strcmp (argv[i] + 9, "lld") == 0)
+ selected_linker = USE_LLD_LD;
+ }
+   else if (strncmp (argv[i], "--ld-path=", 10) == 0)
+ {
+   ld_path = argv[i] + 10;
+   selected_linker = USE_LD_MAX;
+ }
else if (strncmp (argv[i], "-o", 2) == 0)
  {
/* Parse the output filename if it's given so that we can make
@@ -1117,14 +1127,34 @@ main (int argc, char **argv)
ld_file_name = find_a_file (, collect_ld_suffix, X_OK);
use_collect_ld = ld_file_name != 0;
  }
-  /* Search the compiler directories for `ld'.  We have protection against
- recursive calls in find_a_file.  */
-  if (ld_file_name == 0)
-ld_file_name = find_a_file (, ld_suffixes[selected_linker], X_OK);
-  /* Search the ordinary system bin directories
- for `ld' (if native linking) or `TARGET-ld' (if cross).  */
-  if (ld_file_name == 0)
-ld_file_name = find_a_file (, full_ld_suffixes[selected_linker], 
X_OK);
+  if (selected_linker == USE_LD_MAX)
+{
+  /* If --ld-path= does not contain a path component separator, search for
+the command using cpath, then using path.  Otherwise find the linker
+relative to the current working directory.  */
+  if (lbasename (ld_path) == ld_path)
+   {
+ ld_file_name = find_a_file (, ld_path, X_OK);
+ if (ld_file_name == 0)
+   ld_file_name = find_a_file (, ld_path, X_OK);
+   }
+  else if (file_exists (ld_path))
+   {


^^^ these braces are not needed.


+ ld_file_name = ld_path;
+   }
+}
+  else
+{
+  /* Search the compiler directories for `ld'.  We have protection against
+recursive calls in find_a_file.  */
+  if (ld_file_name == 0)


I would prefer '== NULL'.


+   ld_file_name = find_a_file (, ld_suffixes[selected_linker], X_OK);
+  /* Search the ordinary system bin directories
+for `ld' (if native linking) or `TARGET-ld' (if cross).  */
+  if (ld_file_name == 0)
+   ld_file_name =
+ find_a_file (, full_ld_suffixes[selected_linker], X_OK);
+}
  
  #ifdef REAL_NM_FILE_NAME

nm_file_name = find_a_file (, REAL_NM_FILE_NAME, X_OK);
@@ -1461,6 +1491,11 @@ main (int argc, char **argv)
  ld2--;
  #endif
}
+ else if (strncmp (arg, 

Re: [Patch] OpenMP: Update gcc/fortran/*.texi

2020-07-23 Thread Thomas Koenig via Gcc-patches

Hi Tobias,

can you also update https://gcc.gnu.org/gcc-11/changes.html ?

Best regards

Thomas


Re: [stage1][PATCH] Change semantics of -frecord-gcc-switches and add -frecord-gcc-switches-format.

2020-07-23 Thread Martin Liška

On 7/21/20 6:24 PM, Qing Zhao wrote:

4.

Our company is waiting for this patch to be committed to upstream.


Hello.

Please note that patch review can sometimes take some time.
You can still install the patch to your local branch and note that
the next major release will be released in about 9 months.

Martin


Re: [Patch] OpenMP: Update gcc/fortran/*.texi

2020-07-23 Thread Jakub Jelinek via Gcc-patches
On Thu, Jul 23, 2020 at 11:51:38AM +0200, Tobias Burnus wrote:
> Except for a few known bugs (and a bunch of unknown ones),
> I am not aware of any outstanding issues with OpenMP 4.5.
> Additionally, the openmp_version claims 201511 (= 4.5).
> Hence, state that 4.5 is supported – and OpenMP 5.0 is
> partially.
> 
> I did update the .texi file to match the current content
> of OMP_LIB.
> 
> However, compared to C/C++ I do note that the 'omp_depend_kind'
> is missing, which in C/C++ omp.h is a struct of
> size "char[2*sizeof(*void)]".
> I fear that using 'integer(kind=2*c_intptr_t)' will not
> work everywhere – hence, I did not add it with this patch.
> (→ back to the drawingboard?)

First of all, libgomp isn't supported on all targets, the weirdest ones
hopefully don't have libgomp enabled.
And we can do what we do e.g. for omp_nest_lock_kind, sometimes use
indirection, but try to use large kind if possible.  So unlike
omp_nest_lock_kind, I'd prefer to use kind=16 if supported.
And, unlike omp_nest_lock_kind, it is unfortunately not just the matter of
the library which can have something like OMP_NEST_LOCK_DIRECT, but the
compiler needs to agree on that.

So for now I think I'd suggest a configure test and fail libgomp build if
2*c_intptr_t kind doesn't work, and let's do something only if we discover a
target which has libgomp enabled and doesn't support those kinds.

> Otherwise, C/C++'s omp.h only has some memory allocation
> functions which currently (in OpenMP 5.0.0) do not have
> a Fortran equivalent.
> 
> OK for the trunk?

Ok, thanks.

Jakub



Re: [PATCH] middle-end: Fold popcount(x&4) to (x>>2)&1 and friends.

2020-07-23 Thread Richard Biener via Gcc-patches
On Thu, Jul 23, 2020 at 11:11 AM Marc Glisse  wrote:
>
> On Thu, 23 Jul 2020, Marc Glisse wrote:
>
> > On Wed, 22 Jul 2020, Roger Sayle wrote:
> >
> >> Many thanks for the peer review and feedback.  I completely agree that
> >> POPCOUNT
> >> and PARITY iterators simplifies things and handle the IFN_ variants.
> >
> > Is there a reason why the iterators cannot be used for this one?
> >
> > +(for popcount (BUILT_IN_POPCOUNT BUILT_IN_POPCOUNTL BUILT_IN_POPCOUNTLL
> > +BUILT_IN_POPCOUNTIMAX)
> > + parity (BUILT_IN_PARITY BUILT_IN_PARITYL BUILT_IN_PARITYLL
> > +  BUILT_IN_PARITYIMAX)
> > +  (simplify
> > +(bit_and (popcount @0) integer_onep)
> > +(parity @0)))
>
> Ah, maybe because we may have platforms/modes where the optab for popcount
> is supported but not the one for parity, and we are not allowed to create
> internal calls if the optab is not supported? I think expand_parity tries
> to expand parity as popcount&1, so it should be fine, but I didn't
> actually try it...

Hmm, that might indeed be a problem.  An explicit
direct_internal_fn_supported_p guard would help here, like
maybe

  (if (PARITY != IFN_PARITY
   || direct_internal_fn_supported_p (IFN_PARITY, type, OPTIMIZE_FOR_BOTH))
  (PARITY @0)))

magic that eventually could be auto-generated by genmatch ... (but only if
iterating, so it would be a bit iffy)

Richard.

> --
> Marc Glisse


[Patch] OpenMP: Update gcc/fortran/*.texi

2020-07-23 Thread Tobias Burnus

Except for a few known bugs (and a bunch of unknown ones),
I am not aware of any outstanding issues with OpenMP 4.5.
Additionally, the openmp_version claims 201511 (= 4.5).
Hence, state that 4.5 is supported – and OpenMP 5.0 is
partially.

I did update the .texi file to match the current content
of OMP_LIB.

However, compared to C/C++ I do note that the 'omp_depend_kind'
is missing, which in C/C++ omp.h is a struct of
size "char[2*sizeof(*void)]".
I fear that using 'integer(kind=2*c_intptr_t)' will not
work everywhere – hence, I did not add it with this patch.
(→ back to the drawingboard?)

Otherwise, C/C++'s omp.h only has some memory allocation
functions which currently (in OpenMP 5.0.0) do not have
a Fortran equivalent.

OK for the trunk?

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
OpenMP: Update gcc/fortran/*.texi

gcc/fortran/ChangeLog:

	* gfortran.texi (Standards): Update URL; state that OpenMP 4.5
	is supported and 5.0 is partially.
	* intrinsic.texi (OpenMP Modules): Refer also to OpenMP 5.0;
	(OMP_LIB): Add missing derived type and new named constants.

 gcc/fortran/gfortran.texi  |  4 +-
 gcc/fortran/intrinsic.texi | 96 +++---
 2 files changed, 92 insertions(+), 8 deletions(-)

diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 20fe38534d5..d927ebc4abc 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -541,8 +541,8 @@ can be found in the @ref{Fortran 2003 status}, @ref{Fortran 2008
 status} and @ref{Fortran 2018 status} sections of the documentation.
 
 Additionally, the GNU Fortran compilers supports the OpenMP specification
-(version 4.0 and most of the features of the 4.5 version,
-@url{http://openmp.org/@/wp/@/openmp-specifications/}).
+(version 4.5 and partial support of the features of the 5.0 version,
+@url{http://openmp.org/@/openmp-specifications/}).
 There also is support for the OpenACC specification (targeting
 version 2.6, @uref{http://www.openacc.org/}).  See
 @uref{https://gcc.gnu.org/wiki/OpenACC} for more information.
diff --git a/gcc/fortran/intrinsic.texi b/gcc/fortran/intrinsic.texi
index a1ecf5933ba..1b93d251509 100644
--- a/gcc/fortran/intrinsic.texi
+++ b/gcc/fortran/intrinsic.texi
@@ -15291,12 +15291,12 @@ with the following options: @code{-fno-unsafe-math-optimizations
 @section OpenMP Modules @code{OMP_LIB} and @code{OMP_LIB_KINDS}
 @table @asis
 @item @emph{Standard}:
-OpenMP Application Program Interface v4.5
+OpenMP Application Program Interface v4.5 and
+OpenMP Application Program Interface v5.0 (partially supported).
 @end table
 
-
 The OpenMP Fortran runtime library routines are provided both in
-a form of two Fortran 90 modules, named @code{OMP_LIB} and 
+a form of two Fortran modules, named @code{OMP_LIB} and
 @code{OMP_LIB_KINDS}, and in a form of a Fortran @code{include} file named
 @file{omp_lib.h}. The procedures provided by @code{OMP_LIB} can be found
 in the @ref{Top,,Introduction,libgomp,GNU Offloading and Multi
@@ -15306,19 +15306,25 @@ below.
 
 For details refer to the actual
 @uref{http://www.openmp.org/wp-content/uploads/openmp-4.5.pdf,
-OpenMP Application Program Interface v4.5}.
-And for the @code{pause}-related constants to the OpenMP 5.0 specification.
+OpenMP Application Program Interface v4.5} and
+@uref{https://www.openmp.org/wp-content/uploads/OpenMP-API-Specification-5.0.pdf,
+OpenMP Application Program Interface v5.0}.
 
 @code{OMP_LIB_KINDS} provides the following scalar default-integer
 named constants:
 
 @table @asis
+@item @code{omp_allocator_handle_kind}
+@item @code{omp_alloctrait_key_kind}
+@item @code{omp_alloctrait_val_kind}
 @item @code{omp_lock_kind}
 @item @code{omp_lock_hint_kind}
 @item @code{omp_nest_lock_kind}
 @item @code{omp_pause_resource_kind}
+@item @code{omp_memspace_handle_kind}
 @item @code{omp_proc_bind_kind}
 @item @code{omp_sched_kind}
+@item @code{omp_sync_hint_kind}
 @end table
 
 @code{OMP_LIB} provides the scalar default-integer
@@ -15326,6 +15332,12 @@ named constant @code{openmp_version} with a value of the form
 @var{mm}, where @code{} is the year and @var{mm} the month
 of the OpenMP version; for OpenMP v4.5 the value is @code{201511}.
 
+The following derived type:
+
+@table @asis
+@item @code{omp_alloctrait}
+@end table
+
 The following scalar integer named constants of the
 kind @code{omp_sched_kind}:
 
@@ -15336,7 +15348,7 @@ kind @code{omp_sched_kind}:
 @item @code{omp_sched_auto}
 @end table
 
-And the following scalar integer named constants of the 
+And the following scalar integer named constants of the
 kind @code{omp_proc_bind_kind}:
 
 @table @asis
@@ -15356,6 +15368,11 @@ kind @code{omp_lock_hint_kind}:
 @item @code{omp_lock_hint_contended}
 @item @code{omp_lock_hint_nonspeculative}
 @item @code{omp_lock_hint_speculative}
+@item 

RE: [PATCH] middle-end: Fold popcount(x&4) to (x>>2)&1 and friends.

2020-07-23 Thread Marc Glisse

On Thu, 23 Jul 2020, Marc Glisse wrote:


On Wed, 22 Jul 2020, Roger Sayle wrote:

Many thanks for the peer review and feedback.  I completely agree that 
POPCOUNT

and PARITY iterators simplifies things and handle the IFN_ variants.


Is there a reason why the iterators cannot be used for this one?

+(for popcount (BUILT_IN_POPCOUNT BUILT_IN_POPCOUNTL BUILT_IN_POPCOUNTLL
+  BUILT_IN_POPCOUNTIMAX)
+ parity (BUILT_IN_PARITY BUILT_IN_PARITYL BUILT_IN_PARITYLL
+BUILT_IN_PARITYIMAX)
+  (simplify
+(bit_and (popcount @0) integer_onep)
+(parity @0)))


Ah, maybe because we may have platforms/modes where the optab for popcount 
is supported but not the one for parity, and we are not allowed to create 
internal calls if the optab is not supported? I think expand_parity tries 
to expand parity as popcount&1, so it should be fine, but I didn't 
actually try it...


--
Marc Glisse


Re: [PATCH] middle-end: Fold popcount(x&4) to (x>>2)&1 and friends.

2020-07-23 Thread Richard Biener via Gcc-patches
On Thu, Jul 23, 2020 at 10:15 AM Marc Glisse  wrote:
>
> On Wed, 22 Jul 2020, Roger Sayle wrote:
>
> > Many thanks for the peer review and feedback.  I completely agree that 
> > POPCOUNT
> > and PARITY iterators simplifies things and handle the IFN_ variants.
>
> Is there a reason why the iterators cannot be used for this one?
>
> +(for popcount (BUILT_IN_POPCOUNT BUILT_IN_POPCOUNTL BUILT_IN_POPCOUNTLL
> +  BUILT_IN_POPCOUNTIMAX)
> + parity (BUILT_IN_PARITY BUILT_IN_PARITYL BUILT_IN_PARITYLL
> +BUILT_IN_PARITYIMAX)
> +  (simplify
> +(bit_and (popcount @0) integer_onep)
> +(parity @0)))

I should indeed work fine to even do

 (simplify
   (bit_and (POPCOUNT @0) integer_onep)
   (PARITY @0))

Likewise for the existing

+/* popcount(X) == 0 is X == 0, and related (in)equalities.  */
+(for popcount (POPCOUNT)
   (for cmp (le eq ne gt)
rep (eq eq ne ne)
 (simplify
   (cmp (popcount @0) integer_zerop)
   (rep @0 { build_zero_cst (TREE_TYPE (@0)); }

you can elide the (for ...)

OK with those changes.

Richard.

>
> --
> Marc Glisse


Re: [PATCH PR96230] driver: ICE in process_command, at gcc.c:5095

2020-07-23 Thread Richard Biener
On Thu, 23 Jul 2020, Zhanghaijian (A) wrote:

> Hi,
> 
> 
> 
> This is a simple fix for pr96230.
> 
> When the dumpbase is an empty string, we should add check for !*dumpbase in 
> process_command.
> 
> 
> 
> The option -dumpbase "" is used gcc-defs.exp:
> 
> # This option restores naming of aux and dump output files
> 
> # after input files when multiple input files are named,
> 
> # instead of getting them combined with the output name.
> 
> lappend options "additional_flags=-dumpbase \"\""
> 
> 
> 
> Bootstrap and tested on aarch64 and x86 Linux platform. No new regression 
> witnessed.
> 
> Any suggestion?

Does it actually do what the above comment suggests?  I'd have
naiively assumed that for gcc -S t.c -fdump-tree-optimized -dumpbase ""
I get a ".238t.optimized" file (thus empty dumpbase).

Our docs state

An empty string specified as @var{dumpbase} avoids the influence of the
output basename in the naming of auxiliary and dump outputs during
compilation, computing default values :

@smallexample
gcc -c foo.c -o dir/foobar.o -dumpbase '' ...
@end smallexample

will name aux outputs @file{dir/foo.*} and dump outputs
@file{dir/foo.c.*}.  Note how their basenames are taken from the input
name, but the directory still defaults to that of the output.

Alex?

Thanks,
Richard.

> 
> 
> Thanks,
> 
> Haijian Zhang
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


[PATCH v2] driver: fix a problem with implementation of -falign-foo=0 [PR96247]

2020-07-23 Thread Hu Jiangping
Thanks, Richard!

I think your suggestion is very good, so I made a new patch.

v2: at a high level handles -falign-foo=0 like -falign-foo
v1: at the target level overides the -falign-foo=0 option values

Obviously, v2 is better than v1. In addition, anthor option
to reject 0 that discussed in the email and PR96247
is not as good as the current patch either, I think.

I tested this patch on x86_64, it works well. OK for trunk?

Regards!
Hujp

---
 gcc/opts.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/gcc/opts.c b/gcc/opts.c
index 499eb900643..ed6102cd606 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -2786,18 +2786,38 @@ common_handle_option (struct gcc_options *opts,
 
 case OPT_falign_loops_:
   check_alignment_argument (loc, arg, "loops");
+  // fix PR96247
+  if (0 == atoi(arg)) {
+opts->x_flag_align_loops = true;
+opts->x_str_align_loops = NULL;
+  }
   break;
 
 case OPT_falign_jumps_:
   check_alignment_argument (loc, arg, "jumps");
+  // fix PR96247
+  if (0 == atoi(arg)) {
+opts->x_flag_align_jumps = true;
+opts->x_str_align_jumps = NULL;
+  }
   break;
 
 case OPT_falign_labels_:
   check_alignment_argument (loc, arg, "labels");
+  // fix PR96247
+  if (0 == atoi(arg)) {
+opts->x_flag_align_labels = true;
+opts->x_str_align_labels = NULL;
+  }
   break;
 
 case OPT_falign_functions_:
   check_alignment_argument (loc, arg, "functions");
+  // fix PR96247
+  if (0 == atoi(arg)) {
+opts->x_flag_align_functions = true;
+opts->x_str_align_functions = NULL;
+  }
   break;
 
 case OPT_ftabstop_:
-- 
2.17.1





[PATCH] gcc-changelog: fix when somebody reverts a backport

2020-07-23 Thread Martin Liška

It's going to fix git commit 76641cd8b53128e1a962f1313ba75acf0855fd00,
where we not correctly print:

git gcc-verify 76641cd8b53128e1a962f1313ba75acf0855fd00 -p
Checking 76641cd8b53128e1a962f1313ba75acf0855fd00: OK
-- gcc/ChangeLog --
2020-07-15  Richard Biener  

Revert:
2020-07-14  Matthias Klose  

PR lto/95604
* lto-wrapper.c (merge_and_complain): Add decoded options as parameter,
error on different values for -fcf-protection.
(append_compiler_options): Pass -fcf-protection option.
(find_and_merge_options): Add decoded options as parameter,
pass decoded_options to merge_and_complain.
(run_gcc): Pass decoded options to find_and_merge_options.
* lto-opts.c (lto_write_options): Pass -fcf-protection option.

instead of:

git gcc-verify 76641cd8b53128e1a962f1313ba75acf0855fd00 -p
Checking 76641cd8b53128e1a962f1313ba75acf0855fd00: OK
-- gcc/ChangeLog --
2020-07-14  Matthias Klose  

Backported from master:
2020-07-14  Matthias Klose  

PR lto/95604
* lto-wrapper.c (merge_and_complain): Add decoded options as parameter,
error on different values for -fcf-protection.
(append_compiler_options): Pass -fcf-protection option.
(find_and_merge_options): Add decoded options as parameter,
pass decoded_options to merge_and_complain.
(run_gcc): Pass decoded options to find_and_merge_options.
* lto-opts.c (lto_write_options): Pass -fcf-protection option.

I compared 500 revision on master and gcc-10 branch that's the only difference.
I'm going to install it.

Martin

contrib/ChangeLog:

* gcc-changelog/git_commit.py: When reverting a backport,
we should print only Revert header.
---
 contrib/gcc-changelog/git_commit.py | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/contrib/gcc-changelog/git_commit.py 
b/contrib/gcc-changelog/git_commit.py
index 827976c2f84..5a9cc4c7563 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -628,7 +628,11 @@ class GitCommit:
 for entry in self.changelog_entries:
 output = ''
 timestamp = entry.datetime
-if self.cherry_pick_commit:
+if self.revert_commit:
+timestamp = current_timestamp
+orig_date = self.original_info.date
+current_timestamp = orig_date.strftime(DATE_FORMAT)
+elif self.cherry_pick_commit:
 info = self.commit_to_info_hook(self.cherry_pick_commit)
 # it can happen that it is a cherry-pick for a different
 # repository
@@ -636,10 +640,6 @@ class GitCommit:
 timestamp = info.date.strftime(DATE_FORMAT)
 else:
 timestamp = current_timestamp
-elif self.revert_commit:
-timestamp = current_timestamp
-orig_date = self.original_info.date
-current_timestamp = orig_date.strftime(DATE_FORMAT)
 elif not timestamp or use_commit_ts:
 timestamp = current_timestamp
 authors = entry.authors if entry.authors else [self.info.author]
@@ -649,12 +649,13 @@ class GitCommit:
 authors.append(author)
 
 if self.cherry_pick_commit or self.revert_commit:

-output += self.format_authors_in_changelog([self.info.author],
+original_author = self.original_info.author
+output += self.format_authors_in_changelog([original_author],
current_timestamp)
-if self.cherry_pick_commit:
-output += '\tBackported from master:\n'
-else:
+if self.revert_commit:
 output += '\tRevert:\n'
+else:
+output += '\tBackported from master:\n'
 output += self.format_authors_in_changelog(authors,
timestamp, '\t')
 else:
--
2.27.0



Re: [PATCH] [AVX512] [PR87767] Optimize memory broadcast for constant vector under AVX512

2020-07-23 Thread Jan Hubicka
Hello,
sorry for taking so long to get to this.
> diff --git a/gcc/config/i386/i386-features.c b/gcc/config/i386/i386-features.c
> index 535fc7e981d..8f81d101382 100644
> --- a/gcc/config/i386/i386-features.c
> +++ b/gcc/config/i386/i386-features.c
> @@ -2379,6 +2379,152 @@ make_pass_remove_partial_avx_dependency (gcc::context 
> *ctxt)
>return new pass_remove_partial_avx_dependency (ctxt);
>  }
>  
> +/* Replace all one-value const vector that are referenced by SYMBOL_REFs in x
> +   with embedded broadcast. i.e.transform
> +
> + vpaddq .LC0(%rip), %zmm0, %zmm0
> + ret
> +  .LC0:
> +.quad 3
> +.quad 3
> +.quad 3
> +.quad 3
> +.quad 3
> +.quad 3
> +.quad 3
> +.quad 3
> +
> +to
> +
> + vpaddq .LC0(%rip){1to8}, %zmm0, %zmm0

It seems to me that having a special purpose pass for this is bit 
overzelaous.  It seems to me that you can do same pattern matching via 
splitter and fit it into the usual insn splitting pass?

Honza
> + ret
> +  .LC0:
> +.quad 3  */
> +static void
> +replace_constant_pool_with_broadcast (rtx_insn* insn)
> +{
> +  subrtx_ptr_iterator::array_type array;
> +  FOR_EACH_SUBRTX_PTR (iter, array,  (insn), ALL)
> +{
> +  rtx *loc = *iter;
> +  rtx x = *loc;
> +  rtx broadcast_mem, vec_dup, constant, first;
> +  machine_mode mode;
> +  if (GET_CODE (x) != MEM
> +   || GET_CODE (XEXP (x, 0)) != SYMBOL_REF
> +   || !CONSTANT_POOL_ADDRESS_P (XEXP (x, 0)))
> + continue;
> +
> +  mode = GET_MODE (x);
> +  if (!VECTOR_MODE_P (mode))
> + return;
> +
> +  constant = get_pool_constant (XEXP (x, 0));
> +  first = XVECEXP (constant, 0, 0);
> +  /* There could be some rtx like
> +  (mem/u/c:V16QI (symbol_ref/u:DI ("*.LC1")))
> +  but with "*.LC1" refer to V2DI constant vector.  */
> +  if (GET_MODE (constant) != mode)
> + return;
> +
> +  for (int i = 1; i < GET_MODE_NUNITS (mode); ++i)
> + {
> +   rtx tmp = XVECEXP (constant, 0, i);
> +   /* Only handle one-value const vector.  */
> +   if (!rtx_equal_p (tmp, first))
> + return;
> + }
> +
> +  broadcast_mem = force_const_mem (GET_MODE_INNER (mode), first);
> +  vec_dup = gen_rtx_VEC_DUPLICATE (mode, broadcast_mem);
> +  *loc = vec_dup;
> +  INSN_CODE (insn) = -1;
> +  /* Revert change if there's no corresponding pattern.  */
> +  if (recog_memoized (insn) < 0)
> + {
> +   *loc = x;
> +   recog_memoized (insn);
> + }
> +  /* At most 1 memory_operand in an insn.  */
> +  return;
> +}
> +}
> +
> +/* For const vector having one duplicated value, there's no need to put
> +   whole vector in the constant pool when target supports embedded 
> broadcast. */
> +static unsigned int
> +constant_pool_broadcast (void)
> +{
> +  timevar_push (TV_MACH_DEP);
> +  rtx_insn *insn;
> +
> +  for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
> +{
> +  if (!INSN_P (insn))
> + continue;
> +
> +  /* Insns may appear inside a SEQUENCE.  Only check the patterns of
> +  insns, not any notes that may be attached.  We don't want to mark
> +  a constant just because it happens to appear in a REG_EQUIV note.  */
> +  if (rtx_sequence *seq = dyn_cast  (PATTERN (insn)))
> + {
> +   int i, n = seq->len ();
> +   for (i = 0; i < n; ++i)
> + {
> +   rtx subinsn = seq->element (i);
> +   if (INSN_P (subinsn))
> + replace_constant_pool_with_broadcast (dyn_cast  
> (subinsn));
> + }
> + }
> +  else
> + replace_constant_pool_with_broadcast (insn);
> +}
> +  timevar_pop (TV_MACH_DEP);
> +  return 0;
> +}
> +
> +namespace {
> +
> +const pass_data pass_data_constant_pool_broadcast =
> +{
> +  RTL_PASS, /* type */
> +  "cpb", /* name */
> +  OPTGROUP_NONE, /* optinfo_flags */
> +  TV_MACH_DEP, /* tv_id */
> +  0, /* properties_required */
> +  0, /* properties_provided */
> +  0, /* properties_destroyed */
> +  0, /* todo_flags_start */
> +  TODO_df_finish, /* todo_flags_finish */
> +};
> +
> +class pass_constant_pool_broadcast : public rtl_opt_pass
> +{
> +public:
> +  pass_constant_pool_broadcast (gcc::context *ctxt)
> +: rtl_opt_pass (pass_data_constant_pool_broadcast, ctxt)
> +  {}
> +
> +  /* opt_pass methods: */
> +  virtual bool gate (function *)
> +{
> +  return TARGET_AVX512F;
> +}
> +
> +  virtual unsigned int execute (function *)
> +{
> +  return constant_pool_broadcast ();
> +}
> +}; // class pass_cpb
> +
> +} // anon namespace
> +
> +rtl_opt_pass *
> +make_pass_constant_pool_broadcast (gcc::context *ctxt)
> +{
> +  return new pass_constant_pool_broadcast (ctxt);
> +}
> +
>  /* This compares the priority of target features in function DECL1
> and DECL2.  It returns positive value if DECL1 is higher priority,
> negative value if DECL2 is higher priority and 0 if they are the
> diff --git 

RE: [PATCH] middle-end: Fold popcount(x&4) to (x>>2)&1 and friends.

2020-07-23 Thread Marc Glisse

On Wed, 22 Jul 2020, Roger Sayle wrote:


Many thanks for the peer review and feedback.  I completely agree that POPCOUNT
and PARITY iterators simplifies things and handle the IFN_ variants.


Is there a reason why the iterators cannot be used for this one?

+(for popcount (BUILT_IN_POPCOUNT BUILT_IN_POPCOUNTL BUILT_IN_POPCOUNTLL
+  BUILT_IN_POPCOUNTIMAX)
+ parity (BUILT_IN_PARITY BUILT_IN_PARITYL BUILT_IN_PARITYLL
+BUILT_IN_PARITYIMAX)
+  (simplify
+(bit_and (popcount @0) integer_onep)
+(parity @0)))


--
Marc Glisse


[PATCH PR96230] driver: ICE in process_command, at gcc.c:5095

2020-07-23 Thread Zhanghaijian (A)
Hi,



This is a simple fix for pr96230.

When the dumpbase is an empty string, we should add check for !*dumpbase in 
process_command.



The option -dumpbase "" is used gcc-defs.exp:

# This option restores naming of aux and dump output files

# after input files when multiple input files are named,

# instead of getting them combined with the output name.

lappend options "additional_flags=-dumpbase \"\""



Bootstrap and tested on aarch64 and x86 Linux platform. No new regression 
witnessed.

Any suggestion?



Thanks,

Haijian Zhang



pr96230-v1.diff
Description: pr96230-v1.diff


Re: [PATCH] PR target/96260 - KASAN should work even back-end not porting anything.

2020-07-23 Thread Kito Cheng via Gcc-patches
GCC 10.2 released, committed to GCC 10 branch :)

On Thu, Jul 23, 2020 at 3:21 PM Kito Cheng  wrote:
>
> Hi Jakub:
>
> Thanks for your review, committed with formatting fixes.
>
> Hi Richard:
>
> Thanks, I'll commit that after gcc 10.2 release :)
>
> On Wed, Jul 22, 2020 at 6:14 PM Jakub Jelinek via Gcc-patches
>  wrote:
> >
> > On Wed, Jul 22, 2020 at 04:53:00PM +0800, Kito Cheng wrote:
> > > --- a/gcc/asan.c
> > > +++ b/gcc/asan.c
> > > @@ -344,6 +344,12 @@ asan_shadow_offset ()
> > >return asan_shadow_offset_value;
> > >  }
> > >
> > > +/* Returns Asan shadow offset has been set.  */
> > > +bool asan_shadow_offset_set_p ()
> >
> > Formatting.  Should be
> > bool
> > asan_shadow_offset_set_p ()
> >
> > > +{
> > > +  return asan_shadow_offset_computed;
> > > +}
> > > +
> > >  alias_set_type asan_shadow_set = -1;
> > >
> > >  /* Pointer types to 1, 2 or 4 byte integers in shadow memory.  A separate
> >
> > > -/* { dg-warning ".'-fsanitize=address' and '-fsanitize=kernel-address' 
> > > are not supported for this target" "" { target *-*-* } 0 } */
> > > +/* { dg-warning ".'-fsanitize=kernel-address' with stack protection is 
> > > not supported without '-fasan-shadow-offset=' for this target." "" { 
> > > target *-*-* } 0 } */
> >
> > Please adjust, see below.
> > > index 95eea63380f6..48f13d282c52 100644
> > > --- a/gcc/toplev.c
> > > +++ b/gcc/toplev.c
> > > @@ -1835,7 +1835,7 @@ process_options (void)
> > >/* Address Sanitizer needs porting to each target architecture.  */
> > >
> > >if ((flag_sanitize & SANITIZE_ADDRESS)
> > > -  && (!FRAME_GROWS_DOWNWARD || targetm.asan_shadow_offset == NULL))
> > > +  && !FRAME_GROWS_DOWNWARD)
> > >  {
> > >warning_at (UNKNOWN_LOCATION, 0,
> > > "%<-fsanitize=address%> and %<-fsanitize=kernel-address%> 
> > > "
> > > @@ -1843,6 +1843,25 @@ process_options (void)
> > >flag_sanitize &= ~SANITIZE_ADDRESS;
> > >  }
> > >
> > > +  if ((flag_sanitize & SANITIZE_USER_ADDRESS)
> > > +  && targetm.asan_shadow_offset == NULL)
> > > +{
> > > +  warning_at (UNKNOWN_LOCATION, 0,
> > > +   "%<-fsanitize=address%> not supported for this target");
> > > +  flag_sanitize &= ~SANITIZE_ADDRESS;
> > > +}
> > > +
> > > +  if ((flag_sanitize & SANITIZE_KERNEL_ADDRESS)
> > > +  && (targetm.asan_shadow_offset == NULL && param_asan_stack
> > > +   && !asan_shadow_offset_set_p ()))
> >
> > Formatting.  If there are several & (or ||s) and it doesn't fit on one
> > line, each of them should be on a separate line, so there should be a
> > newline and indentation instead of space before "&& param_asan_stack".
> > > +{
> > > +  warning_at (UNKNOWN_LOCATION, 0,
> > > +   "%<-fsanitize=kernel-address%> with stack protection "
> > > +   "is not supported without %<-fasan-shadow-offset=%> "
> > > +   "for this target.");
> >
> > No full stop at the end of diagnostics (plus adjust testcase for it).
> >
> > Otherwise LGTM for trunk and 10.3 (see Richi's mail).
> >
> > Jakub
> >


Re: [PATCH] PR target/96260 - KASAN should work even back-end not porting anything.

2020-07-23 Thread Kito Cheng via Gcc-patches
Hi Jakub:

Thanks for your review, committed with formatting fixes.

Hi Richard:

Thanks, I'll commit that after gcc 10.2 release :)

On Wed, Jul 22, 2020 at 6:14 PM Jakub Jelinek via Gcc-patches
 wrote:
>
> On Wed, Jul 22, 2020 at 04:53:00PM +0800, Kito Cheng wrote:
> > --- a/gcc/asan.c
> > +++ b/gcc/asan.c
> > @@ -344,6 +344,12 @@ asan_shadow_offset ()
> >return asan_shadow_offset_value;
> >  }
> >
> > +/* Returns Asan shadow offset has been set.  */
> > +bool asan_shadow_offset_set_p ()
>
> Formatting.  Should be
> bool
> asan_shadow_offset_set_p ()
>
> > +{
> > +  return asan_shadow_offset_computed;
> > +}
> > +
> >  alias_set_type asan_shadow_set = -1;
> >
> >  /* Pointer types to 1, 2 or 4 byte integers in shadow memory.  A separate
>
> > -/* { dg-warning ".'-fsanitize=address' and '-fsanitize=kernel-address' are 
> > not supported for this target" "" { target *-*-* } 0 } */
> > +/* { dg-warning ".'-fsanitize=kernel-address' with stack protection is not 
> > supported without '-fasan-shadow-offset=' for this target." "" { target 
> > *-*-* } 0 } */
>
> Please adjust, see below.
> > index 95eea63380f6..48f13d282c52 100644
> > --- a/gcc/toplev.c
> > +++ b/gcc/toplev.c
> > @@ -1835,7 +1835,7 @@ process_options (void)
> >/* Address Sanitizer needs porting to each target architecture.  */
> >
> >if ((flag_sanitize & SANITIZE_ADDRESS)
> > -  && (!FRAME_GROWS_DOWNWARD || targetm.asan_shadow_offset == NULL))
> > +  && !FRAME_GROWS_DOWNWARD)
> >  {
> >warning_at (UNKNOWN_LOCATION, 0,
> > "%<-fsanitize=address%> and %<-fsanitize=kernel-address%> "
> > @@ -1843,6 +1843,25 @@ process_options (void)
> >flag_sanitize &= ~SANITIZE_ADDRESS;
> >  }
> >
> > +  if ((flag_sanitize & SANITIZE_USER_ADDRESS)
> > +  && targetm.asan_shadow_offset == NULL)
> > +{
> > +  warning_at (UNKNOWN_LOCATION, 0,
> > +   "%<-fsanitize=address%> not supported for this target");
> > +  flag_sanitize &= ~SANITIZE_ADDRESS;
> > +}
> > +
> > +  if ((flag_sanitize & SANITIZE_KERNEL_ADDRESS)
> > +  && (targetm.asan_shadow_offset == NULL && param_asan_stack
> > +   && !asan_shadow_offset_set_p ()))
>
> Formatting.  If there are several & (or ||s) and it doesn't fit on one
> line, each of them should be on a separate line, so there should be a
> newline and indentation instead of space before "&& param_asan_stack".
> > +{
> > +  warning_at (UNKNOWN_LOCATION, 0,
> > +   "%<-fsanitize=kernel-address%> with stack protection "
> > +   "is not supported without %<-fasan-shadow-offset=%> "
> > +   "for this target.");
>
> No full stop at the end of diagnostics (plus adjust testcase for it).
>
> Otherwise LGTM for trunk and 10.3 (see Richi's mail).
>
> Jakub
>


GCC 10.2.1 Status Report (2020-07-23)

2020-07-23 Thread Richard Biener


Status
==

The GCC 10.2 release process has been completed and the GCC 10 branch
is now again open for regression and documentation fixes.


Quality Data


Priority  #   Change from last report
---   ---
P1 
P2  219   +   1
P3   57   +   4
P4  176
P5   22
---   ---
Total P1-P3 276   +   5
Total   474   +   5


Previous Report
===

https://gcc.gnu.org/pipermail/gcc/2020-July/233135.html