date:20201209

Re: [PATCH] X86: Fix feature check for HRESET/AVX_VNNI/UINTR

2020-12-09 Thread Uros Bizjak via Gcc-patches

On Thu, Dec 10, 2020 at 4:42 AM Hongyu Wang wrote:
>
> Hi,
>
> This patch is a simple fix: the HRESET/AVX_VNNI/UINTR feature detection was
> wrongly placed under avx512_usable.
>
> Bootstrapped and tested on x86-64-linux. OK for trunk?
>
> gcc/ChangeLog:
> * common/config/i386/cpuinfo.h (get_available_features):
> Move check for HRESET/AVX_VNNI/UINTR out of avx512_usable.
> ---
>  gcc/common/config/i386/cpuinfo.h | 14 +-
>  1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
> index 4f1ab636807..19de63fe7ac 100644
> --- a/gcc/common/config/i386/cpuinfo.h
> +++ b/gcc/common/config/i386/cpuinfo.h
> @@ -686,6 +686,8 @@ get_available_features (struct __processor_model *cpu_model,
>if (edx & bit_AMX_BF16)
>  set_feature (FEATURE_AMX_BF16);
>  }
> +  if (edx & bit_UINTR)
> +set_feature (FEATURE_UINTR);
>if (ecx & bit_KL)
>  has_kl = 1;
>if (avx512_usable)
> @@ -722,17 +724,19 @@ get_available_features (struct __processor_model *cpu_model,
>  set_feature (FEATURE_AVX5124FMAPS);
>if (edx & bit_AVX512VP2INTERSECT)
>  set_feature (FEATURE_AVX512VP2INTERSECT);
> -  if (edx & bit_UINTR)
> -set_feature (FEATURE_UINTR);
>
>__cpuid_count (7, 1, eax, ebx, ecx, edx);
>if (eax & bit_AVX512BF16)
>  set_feature (FEATURE_AVX512BF16);
> -  if (eax & bit_HRESET)
> -set_feature (FEATURE_HRESET);
> +}
> +
> +  __cpuid_count (7, 1, eax, ebx, ecx, edx);

Please move the __cpuid_count (7, 1, ...) call to its own place, as is
done in the attached patch (which also groups a couple of features
together).

OK with the above change.

Thanks,
Uros.

> +  if (eax & bit_HRESET)
> +set_feature (FEATURE_HRESET);
> +  if (avx_usable)
> +{
>if (eax & bit_AVXVNNI)
>  set_feature (FEATURE_AVXVNNI);
> -
>  }
>  }
>
> --
>
> --
> Regards,
>
> Hongyu, Wang
diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
index 4f1ab636807..a3372fc4ecf 100644
--- a/gcc/common/config/i386/cpuinfo.h
+++ b/gcc/common/config/i386/cpuinfo.h
@@ -669,6 +669,8 @@ get_available_features (struct __processor_model *cpu_model,
set_feature (FEATURE_WAITPKG);
   if (ecx & bit_SHSTK)
set_feature (FEATURE_SHSTK);
+  if (ecx & bit_KL)
+   has_kl = 1;
   if (edx & bit_SERIALIZE)
set_feature (FEATURE_SERIALIZE);
   if (edx & bit_TSXLDTRK)
@@ -677,6 +679,8 @@ get_available_features (struct __processor_model *cpu_model,
set_feature (FEATURE_PCONFIG);
   if (edx & bit_IBT)
set_feature (FEATURE_IBT);
+  if (edx & bit_UINTR)
+   set_feature (FEATURE_UINTR);
   if (amx_usable)
{
  if (edx & bit_AMX_TILE)
@@ -686,8 +690,6 @@ get_available_features (struct __processor_model *cpu_model,
  if (edx & bit_AMX_BF16)
set_feature (FEATURE_AMX_BF16);
}
-  if (ecx & bit_KL)
-   has_kl = 1;
   if (avx512_usable)
{
  if (ebx & bit_AVX512F)
@@ -722,17 +724,20 @@ get_available_features (struct __processor_model *cpu_model,
set_feature (FEATURE_AVX5124FMAPS);
  if (edx & bit_AVX512VP2INTERSECT)
set_feature (FEATURE_AVX512VP2INTERSECT);
- if (edx & bit_UINTR)
-   set_feature (FEATURE_UINTR);
+   }
 
- __cpuid_count (7, 1, eax, ebx, ecx, edx);
- if (eax & bit_AVX512BF16)
-   set_feature (FEATURE_AVX512BF16);
- if (eax & bit_HRESET)
-   set_feature (FEATURE_HRESET);
+  __cpuid_count (7, 1, eax, ebx, ecx, edx);
+  if (eax & bit_HRESET)
+   set_feature (FEATURE_HRESET);
+  if (avx_usable)
+   {
  if (eax & bit_AVXVNNI)
set_feature (FEATURE_AVXVNNI);
-
+   }
+  if (avx512_usable)
+   {
+ if (eax & bit_AVX512BF16)
+   set_feature (FEATURE_AVX512BF16);
}
 }
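
The gating that results from the attached patch can be sketched as a small
stand-alone function. This is only an illustration, not the cpuinfo.h code:
the SKETCH_* names and the function are hypothetical, the bit masks are
placeholders (the real ones come from cpuid.h), and the CPUID register values
are passed in as plain arguments instead of being read with __cpuid_count.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit masks for illustration only; GCC takes the real
   definitions from cpuid.h.  */
#define SKETCH_BIT_UINTR      (1u << 5)   /* leaf 7, subleaf 0, EDX */
#define SKETCH_BIT_AVXVNNI    (1u << 4)   /* leaf 7, subleaf 1, EAX */
#define SKETCH_BIT_AVX512BF16 (1u << 5)   /* leaf 7, subleaf 1, EAX */
#define SKETCH_BIT_HRESET     (1u << 22)  /* leaf 7, subleaf 1, EAX */

/* Hypothetical result flags standing in for set_feature (FEATURE_*).  */
#define SKETCH_FEATURE_UINTR      (1u << 0)
#define SKETCH_FEATURE_HRESET     (1u << 1)
#define SKETCH_FEATURE_AVXVNNI    (1u << 2)
#define SKETCH_FEATURE_AVX512BF16 (1u << 3)

unsigned int
sketch_leaf7_features (uint32_t edx_sub0, uint32_t eax_sub1,
                       int avx_usable, int avx512_usable)
{
  unsigned int features = 0;

  /* After the patch, UINTR and HRESET are checked unconditionally.  */
  if (edx_sub0 & SKETCH_BIT_UINTR)
    features |= SKETCH_FEATURE_UINTR;
  if (eax_sub1 & SKETCH_BIT_HRESET)
    features |= SKETCH_FEATURE_HRESET;

  /* AVX-VNNI is gated on avx_usable only.  */
  if (avx_usable && (eax_sub1 & SKETCH_BIT_AVXVNNI))
    features |= SKETCH_FEATURE_AVXVNNI;

  /* AVX512-BF16 stays gated on avx512_usable.  */
  if (avx512_usable && (eax_sub1 & SKETCH_BIT_AVX512BF16))
    features |= SKETCH_FEATURE_AVX512BF16;

  return features;
}

/* Self-check: with AVX usable but AVX-512 not usable, everything except
   AVX512-BF16 is still reported.  */
void
sketch_self_check (void)
{
  uint32_t eax = SKETCH_BIT_HRESET | SKETCH_BIT_AVXVNNI
                 | SKETCH_BIT_AVX512BF16;
  assert (sketch_leaf7_features (SKETCH_BIT_UINTR, eax, 1, 0)
          == (SKETCH_FEATURE_UINTR | SKETCH_FEATURE_HRESET
              | SKETCH_FEATURE_AVXVNNI));
}
```

Before the fix, the equivalent of all four checks sat inside the
avx512_usable branch, so a machine without usable AVX-512 state would not
have reported UINTR, HRESET or AVX-VNNI even when CPUID advertised them.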

Re: [PATCH] Add missing varasm DECL_P check.

2020-12-09 Thread Jim Wilson

On Wed, Dec 9, 2020 at 7:14 PM H.J. Lu wrote:

>  A testcase?
>

A testcase requires the RISC-V select_section target hook, so it isn't
going to be very useful.  I don't see any other linux targets that have
this hook defined.  Just a few embedded targets.  The testcase
is libgfortran/generated/product_c4.c.  I haven't tried to reduce it.  It
fails both for a native build and a cross build.

Jim

Re: [RFC] [avr] Toolchain Integration for Testsuite Execution (avr cc0 to mode_cc0 conversion)

2020-12-09 Thread Dimitar Dimitrov

On Wednesday, 9 December 2020, 15:12:49 EET, abebeos via Gcc-patches wrote:
> Essence:
> 
> I need a confirmation that the testsuite setup as presented in:
> 
> https://github.com/abebeos/avr-gnu
> 
> works fine.
> 
> The problem with the avr target is that the testsuite cannot be run easily,
> mainly because of the need for a special simulated-target setup, which does
> not work for avr as documented. This led developers to a dead-end with
> their non-cc0-avr-backends (the non-cc0 backend is needed so that avr is not
> dropped from gcc11).
> 
> I integrated a toolchain/testsetup to be able to run the gcc testsuite
> against a simulated avr target.
> 
> I then used this toolchain to test 2 different existent
> non-cc0-avr-backends (from pipcet and saaadhu, both github).
> 
> The result is that saaadhu's backend seems to be working 100%. It has
> identical testsuite results with the existing (but deprecated) cc0-backend,
> which means that it can be used "as-is" for inclusion in gcc11.
> 
> Please note that I did this work in the context of a bounty at Bountysource;
> more information is in the issue:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92729#c35

Hi,

I tested the trees you have given with my own AVR test setup [1]. I confirm 
your results:
  - saaadhu's tree does not introduce any regressions.
  - pipcet's tree has 142 gcc and 299 g++ regressions (although many of them
are duplicates, e.g. same test case with different optimization levels).

It's a bit awkward to copy gcc/config/avr into a mainline tree. Looking at
their GitHub history, both authors made some small changes in other areas. I
would have preferred to cherry-pick or apply patches.

=
baseline beb9afcaf1466996a301c778596c5df209e7913c

=== gcc Summary ===

# of expected passes            87504
# of unexpected failures        1105
# of unexpected successes       15
# of expected failures          581
# of unresolved testcases       16786
# of unsupported tests          5370

=== g++ Summary ===

# of expected passes            140663
# of unexpected failures        7932
# of unexpected successes       21
# of expected failures          620
# of unresolved testcases       8603
# of unsupported tests          11305

=
pipcet/avr-ccmode

=== gcc Summary ===

# of expected passes            87463
# of unexpected failures        1221
# of unexpected successes       15
# of expected failures          581
# of unresolved testcases       16799
# of unsupported tests          5359

=== g++ Summary ===

# of expected passes            140529
# of unexpected failures        8205
# of unexpected successes       21
# of expected failures          620
# of unresolved testcases       8607
# of unsupported tests          11301

=
saaadhu/avr-cc0

=== gcc Summary ===

# of expected passes            87504
# of unexpected failures        1105
# of unexpected successes       15
# of expected failures          581
# of unresolved testcases       16786
# of unsupported tests          5370

=== g++ Summary ===

# of expected passes            140663
# of unexpected failures        7932
# of unexpected successes       21
# of expected failures          620
# of unresolved testcases       8603
# of unsupported tests          11305

On a side note, I build and test the AVR backend in mainline every day. If there
is interest from the AVR maintainers, I can post daily results to the
gcc-testresults@ mailing list.

Regards,
Dimitar

[1] https://github.com/dinuxbg/gnupru/blob/master/testing/buildbot-avr.sh

[PATCH] X86: Fix feature check for HRESET/AVX_VNNI/UINTR

2020-12-09 Thread Hongyu Wang via Gcc-patches

Hi,

This patch is a simple fix: the HRESET/AVX_VNNI/UINTR feature detection was
wrongly placed under avx512_usable.

Bootstrapped and tested on x86-64-linux. OK for trunk?

gcc/ChangeLog:
* common/config/i386/cpuinfo.h (get_available_features):
Move check for HRESET/AVX_VNNI/UINTR out of avx512_usable.
---
 gcc/common/config/i386/cpuinfo.h | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
index 4f1ab636807..19de63fe7ac 100644
--- a/gcc/common/config/i386/cpuinfo.h
+++ b/gcc/common/config/i386/cpuinfo.h
@@ -686,6 +686,8 @@ get_available_features (struct __processor_model *cpu_model,
   if (edx & bit_AMX_BF16)
 set_feature (FEATURE_AMX_BF16);
 }
+  if (edx & bit_UINTR)
+set_feature (FEATURE_UINTR);
   if (ecx & bit_KL)
 has_kl = 1;
   if (avx512_usable)
@@ -722,17 +724,19 @@ get_available_features (struct __processor_model *cpu_model,
 set_feature (FEATURE_AVX5124FMAPS);
   if (edx & bit_AVX512VP2INTERSECT)
 set_feature (FEATURE_AVX512VP2INTERSECT);
-  if (edx & bit_UINTR)
-set_feature (FEATURE_UINTR);

   __cpuid_count (7, 1, eax, ebx, ecx, edx);
   if (eax & bit_AVX512BF16)
 set_feature (FEATURE_AVX512BF16);
-  if (eax & bit_HRESET)
-set_feature (FEATURE_HRESET);
+}
+
+  __cpuid_count (7, 1, eax, ebx, ecx, edx);
+  if (eax & bit_HRESET)
+set_feature (FEATURE_HRESET);
+  if (avx_usable)
+{
   if (eax & bit_AVXVNNI)
 set_feature (FEATURE_AVXVNNI);
-
 }
 }

-- 

-- 
Regards,

Hongyu, Wang

Re: [PATCH 3/4] rs6000: Enable vec_insert for P8 with rs6000_expand_vector_set_var_p8

2020-12-09 Thread Xionghu Luo via Gcc-patches


Ping^2. Thanks.

On 2020/12/3 22:16, Xionghu Luo via Gcc-patches wrote:

Ping. Thanks.


On 2020/11/27 09:04, Xionghu Luo via Gcc-patches wrote:

Hi Segher,
Thanks for the approval of [PATCH 1/4] and [PATCH 2/4].  What's your
opinion of this [PATCH 3/4] for P8, please?  xxinsertw has only existed since
v3.0, so we had to implement this another way.


Xionghu


On 2020/10/10 16:08, Xionghu Luo wrote:

gcc/ChangeLog:

2020-10-10  Xionghu Luo  

* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
Generate ARRAY_REF(VIEW_CONVERT_EXPR) for P8 and later
platforms.
* config/rs6000/rs6000.c (rs6000_expand_vector_set_var): Update
to call different path for P8 and P9.
(rs6000_expand_vector_set_var_p9): New function.
(rs6000_expand_vector_set_var_p8): New function.

gcc/testsuite/ChangeLog:

2020-10-10  Xionghu Luo  

* gcc.target/powerpc/pr79251.p8.c: New test.
---
   gcc/config/rs6000/rs6000-c.c  |  27 +++-
   gcc/config/rs6000/rs6000.c    | 117 +-
   gcc/testsuite/gcc.target/powerpc/pr79251.p8.c |  17 +++
   3 files changed, 155 insertions(+), 6 deletions(-)
   create mode 100644 gcc/testsuite/gcc.target/powerpc/pr79251.p8.c

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index 5551a21d738..4bea8001ec6 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -1599,10 +1599,29 @@ altivec_resolve_overloaded_builtin (location_t loc, tree fndecl,
 SET_EXPR_LOCATION (stmt, loc);
 stmt = build1 (COMPOUND_LITERAL_EXPR, arg1_type, stmt);
   }
-  stmt = build_array_ref (loc, stmt, arg2);
-  stmt = fold_build2 (MODIFY_EXPR, TREE_TYPE (arg0), stmt,
-  convert (TREE_TYPE (stmt), arg0));
-  stmt = build2 (COMPOUND_EXPR, arg1_type, stmt, decl);
+
+  if (TARGET_P8_VECTOR)
+    {
+  stmt = build_array_ref (loc, stmt, arg2);
+  stmt = fold_build2 (MODIFY_EXPR, TREE_TYPE (arg0), stmt,
+  convert (TREE_TYPE (stmt), arg0));
+  stmt = build2 (COMPOUND_EXPR, arg1_type, stmt, decl);
+    }
+  else
+    {
+  tree arg1_inner_type;
+  tree innerptrtype;
+  arg1_inner_type = TREE_TYPE (arg1_type);
+  innerptrtype = build_pointer_type (arg1_inner_type);
+
+  stmt = build_unary_op (loc, ADDR_EXPR, stmt, 0);
+  stmt = convert (innerptrtype, stmt);
+  stmt = build_binary_op (loc, PLUS_EXPR, stmt, arg2, 1);
+  stmt = build_indirect_ref (loc, stmt, RO_NULL);
+  stmt = build2 (MODIFY_EXPR, TREE_TYPE (stmt), stmt,
+ convert (TREE_TYPE (stmt), arg0));
+  stmt = build2 (COMPOUND_EXPR, arg1_type, stmt, decl);
+    }
 return stmt;
   }
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 96f76c7a74c..33ca839cb28 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -6806,10 +6806,10 @@ rs6000_expand_vector_set (rtx target, rtx val, rtx elt_rtx)
   }
   /* Insert VAL into IDX of TARGET, VAL size is same of the vector element, IDX
-   is variable and also counts by vector element size.  */
+   is variable and also counts by vector element size for p9 and above.  */

   void
-rs6000_expand_vector_set_var (rtx target, rtx val, rtx idx)
+rs6000_expand_vector_set_var_p9 (rtx target, rtx val, rtx idx)
   {
 machine_mode mode = GET_MODE (target);
@@ -6852,6 +6852,119 @@ rs6000_expand_vector_set_var (rtx target, rtx val, rtx idx)
 emit_insn (perml);
   }
+/* Insert VAL into IDX of TARGET, VAL size is same of the vector element, IDX
+   is variable and also counts by vector element size for p8.  */
+
+void
+rs6000_expand_vector_set_var_p8 (rtx target, rtx val, rtx idx)
+{
+  machine_mode mode = GET_MODE (target);
+
+  gcc_assert (VECTOR_MEM_VSX_P (mode) && !CONST_INT_P (idx));
+
+  gcc_assert (GET_MODE (idx) == E_SImode);
+
+  machine_mode inner_mode = GET_MODE (val);
+  HOST_WIDE_INT mode_mask = GET_MODE_MASK (inner_mode);
+
+  rtx tmp = gen_reg_rtx (GET_MODE (idx));
+  int width = GET_MODE_SIZE (inner_mode);
+
+  gcc_assert (width >= 1 && width <= 4);
+
+  if (!BYTES_BIG_ENDIAN)
+    {
+  /*  idx = idx * width.  */
+  emit_insn (gen_mulsi3 (tmp, idx, GEN_INT (width)));
+  /*  idx = idx + 8.  */
+  emit_insn (gen_addsi3 (tmp, tmp, GEN_INT (8)));
+    }
+  else
+    {
+  emit_insn (gen_mulsi3 (tmp, idx, GEN_INT (width)));
+  emit_insn (gen_subsi3 (tmp, GEN_INT (24 - width), tmp));
+    }
+
+  /*  lxv vs33, mask.
+  DImode: 0x
+  SImode: 0x
+  HImode: 0x.
+  QImode: 0x00ff.  */
+  rtx mask = gen_reg_rtx (V16QImode);
+  rtx mask_v2di = gen_reg_rtx (V2DImode);
+  rtvec v = rtvec_alloc (2);
+  if (!BYTES_BIG_ENDIAN)
+    {
+  RTVEC_ELT (v, 0) = gen_rtx_CONST_INT (DImode, 0);
+  RTVEC_ELT (v, 1) = gen_rtx_CONST_INT (DImode, mode_mask);
+    }
+  else
+    {
+

Re: [PATCH] Add missing varasm DECL_P check.

2020-12-09 Thread H.J. Lu via Gcc-patches

On Wed, Dec 9, 2020 at 7:10 PM Jim Wilson wrote:
>
> This fixes a riscv64-linux bootstrap failure.
>
> get_constant_section calls the select_section target hook, and select_section
> calls get_named_section which calls get_section.  So it is possible for both of
> these functions to be passed a constant rather than a decl.  They already check
> DECL_P everywhere except in the new code HJ recently added.  This adds the
> missing DECL_P check.
>
> Verified with a riscv64-linux bootstrap.
>
> OK?
>
> Jim
> ---
>  gcc/varasm.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/gcc/varasm.c b/gcc/varasm.c
> index 0fac3688828..5b2e123b0da 100644
> --- a/gcc/varasm.c
> +++ b/gcc/varasm.c
> @@ -294,6 +294,7 @@ get_section (const char *name, unsigned int flags, tree decl,
>flags |= SECTION_NAMED;
>if (HAVE_GAS_SHF_GNU_RETAIN
>&& decl != nullptr
> +  && DECL_P (decl)
>&& DECL_PRESERVE_P (decl))
>  flags |= SECTION_RETAIN;
>if (*slot == NULL)
> --
> 2.17.1
>

 A testcase?

-- 
H.J.

Re: V3 [PATCH 0/2] Switch to a new section if the SECTION_RETAIN bit doesn't match

2020-12-09 Thread Jim Wilson

On Wed, Dec 9, 2020 at 6:14 PM H.J. Lu wrote:

> I tested it with a glibc build.  The glibc build issue is the reason I
> didn't combine the 2 patches into one.
> If GCC does issue a warning, which it should, we will change glibc.
>

OK.  Thanks.  Then I won't worry about this glibc issue for now.

Jim

[PATCH] Add missing varasm DECL_P check.

2020-12-09 Thread Jim Wilson

This fixes a riscv64-linux bootstrap failure.

get_constant_section calls the select_section target hook, and select_section
calls get_named_section which calls get_section.  So it is possible for both of
these functions to be passed a constant rather than a decl.  They already check
DECL_P everywhere except in the new code HJ recently added.  This adds the
missing DECL_P check.

Verified with a riscv64-linux bootstrap.

OK?

Jim
---
 gcc/varasm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/varasm.c b/gcc/varasm.c
index 0fac3688828..5b2e123b0da 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -294,6 +294,7 @@ get_section (const char *name, unsigned int flags, tree decl,
   flags |= SECTION_NAMED;
   if (HAVE_GAS_SHF_GNU_RETAIN
   && decl != nullptr
+  && DECL_P (decl)
   && DECL_PRESERVE_P (decl))
 flags |= SECTION_RETAIN;
   if (*slot == NULL)
-- 
2.17.1
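
The ordering that the one-line fix relies on can be shown with a toy model.
This is a sketch under stated assumptions, not varasm.c: the sketch_node type,
the SKETCH_* flags and all function names are hypothetical stand-ins for tree
nodes, SECTION_NAMED/SECTION_RETAIN, DECL_P and DECL_PRESERVE_P. The assert in
sketch_preserve_p models an accessor that is only valid on declarations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a tree node: either a constant or a decl.  */
enum sketch_kind { SKETCH_CONSTANT, SKETCH_DECL };

struct sketch_node
{
  enum sketch_kind kind;
  bool preserve;  /* Only meaningful when kind == SKETCH_DECL.  */
};

/* Hypothetical stand-ins for SECTION_NAMED and SECTION_RETAIN.  */
#define SKETCH_SECTION_NAMED  0x1u
#define SKETCH_SECTION_RETAIN 0x2u

/* Plays the role of DECL_P.  */
bool
sketch_is_decl (const struct sketch_node *node)
{
  return node->kind == SKETCH_DECL;
}

/* Plays the role of DECL_PRESERVE_P: asking this of a non-decl is a bug.  */
bool
sketch_preserve_p (const struct sketch_node *node)
{
  assert (sketch_is_decl (node));
  return node->preserve;
}

/* The && chain short-circuits, so the decl-only accessor is never reached
   for a null pointer or for a constant.  */
unsigned int
sketch_section_flags (unsigned int flags, const struct sketch_node *node)
{
  flags |= SKETCH_SECTION_NAMED;
  if (node != NULL
      && sketch_is_decl (node)
      && sketch_preserve_p (node))
    flags |= SKETCH_SECTION_RETAIN;
  return flags;
}
```

Dropping the middle test from the condition would let a constant reach
sketch_preserve_p and trip its assert, which is the toy equivalent of the
failure described above.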

Re: [PATCH] Correct -fdump-go-spec's handling of incomplete types

2020-12-09 Thread Ian Lance Taylor via Gcc-patches

On Tue, Dec 8, 2020 at 2:57 PM Nikhil Benesch wrote:
>
> This patch corrects -fdump-go-spec's handling of incomplete types.
> To my knowledge the issue fixed here has not been previously
> reported. It was exposed by an in-progress port of gccgo to FreeBSD.
>
> Given the following C code
>
> struct s_fwd v_fwd;
> struct s_fwd { };
>
> -fdump-go-spec currently produces the following Go code
>
> var v_fwd struct {};
> type s_fwd s_fwd;
>
> whereas the correct Go code is:
>
> var v_fwd s_fwd;
> type s_fwd struct {};
>
> (Go is considerably more permissive than C with out-of-order
> declarations, so anywhere an out-of-order declaration is valid in
> C it is valid in Go.)
>
> gcc/:
> * godump.c (go_format_type): Don't consider whether a type has
> been seen when determining whether to output a type by name.
> Consider only the use_type_name parameter.
> (go_output_typedef): When outputting a typedef, format the
> declaration's original type, which contains the name of the
> underlying type rather than the name of the typedef.
> gcc/testsuite:
> * gcc.misc-tests/godump-1.c: Add test case.

Thanks.  I changed function types to use type names, and committed like so.

Ian
73cf5da233b4cd0f140dd997270e88de63e27db7
diff --git a/gcc/godump.c b/gcc/godump.c
index 29a45ce8979..033b2c59f3c 100644
--- a/gcc/godump.c
+++ b/gcc/godump.c
@@ -697,9 +697,8 @@ go_format_type (class godump_container *container, tree type,
   ret = true;
   ob = &container->type_obstack;
 
-  if (TYPE_NAME (type) != NULL_TREE
-  && (container->decls_seen.contains (type)
- || container->decls_seen.contains (TYPE_NAME (type)))
+  if (use_type_name
+  && TYPE_NAME (type) != NULL_TREE
   && (AGGREGATE_TYPE_P (type)
  || POINTER_TYPE_P (type)
  || TREE_CODE (type) == FUNCTION_TYPE))
@@ -707,6 +706,12 @@ go_format_type (class godump_container *container, tree type,
   tree name;
   void **slot;
 
+  /* References to complex builtin types cannot be translated to
+   Go.  */
+  if (DECL_P (TYPE_NAME (type))
+ && DECL_IS_UNDECLARED_BUILTIN (TYPE_NAME (type)))
+   ret = false;
+
   name = TYPE_IDENTIFIER (type);
 
   slot = htab_find_slot (container->invalid_hash, IDENTIFIER_POINTER (name),
@@ -714,13 +719,17 @@ go_format_type (class godump_container *container, tree type,
   if (slot != NULL)
ret = false;
 
+  /* References to incomplete structs are permitted in many
+contexts, like behind a pointer or inside of a typedef. So
+consider any referenced struct a potential dummy type.  */
+  if (RECORD_OR_UNION_TYPE_P (type))
+   container->pot_dummy_types.add (IDENTIFIER_POINTER (name));
+
   obstack_1grow (ob, '_');
   go_append_string (ob, name);
   return ret;
 }
 
-  container->decls_seen.add (type);
-
   switch (TREE_CODE (type))
 {
 case TYPE_DECL:
@@ -821,34 +830,6 @@ go_format_type (class godump_container *container, tree type,
   break;
 
 case POINTER_TYPE:
-  if (use_type_name
-  && TYPE_NAME (TREE_TYPE (type)) != NULL_TREE
-  && (RECORD_OR_UNION_TYPE_P (TREE_TYPE (type))
- || (POINTER_TYPE_P (TREE_TYPE (type))
-  && (TREE_CODE (TREE_TYPE (TREE_TYPE (type)))
- == FUNCTION_TYPE))))
-{
- tree name;
- void **slot;
-
- name = TYPE_IDENTIFIER (TREE_TYPE (type));
-
- slot = htab_find_slot (container->invalid_hash,
-IDENTIFIER_POINTER (name), NO_INSERT);
- if (slot != NULL)
-   ret = false;
-
- obstack_grow (ob, "*_", 2);
- go_append_string (ob, name);
-
- /* The pointer here can be used without the struct or union
-definition.  So this struct or union is a potential dummy
-type.  */
- if (RECORD_OR_UNION_TYPE_P (TREE_TYPE (type)))
-   container->pot_dummy_types.add (IDENTIFIER_POINTER (name));
-
- return ret;
-}
   if (TREE_CODE (TREE_TYPE (type)) == FUNCTION_TYPE)
obstack_grow (ob, "func", 4);
   else
@@ -1107,7 +1088,7 @@ go_output_type (class godump_container *container)
 static void
 go_output_fndecl (class godump_container *container, tree decl)
 {
-  if (!go_format_type (container, TREE_TYPE (decl), false, true, NULL, false))
+  if (!go_format_type (container, TREE_TYPE (decl), true, true, NULL, false))
 fprintf (go_dump_file, "// ");
   fprintf (go_dump_file, "func _%s ",
   IDENTIFIER_POINTER (DECL_NAME (decl)));
@@ -1182,8 +1163,8 @@ go_output_typedef (class godump_container *container, tree decl)
return;
   *slot = CONST_CAST (void *, (const void *) type);
 
-  if (!go_format_type (container, TREE_TYPE (decl), true, false, NULL,
-  false))
+  if (!go_format_type (container, DECL_ORIGINAL_

[PATCH 1/2] libstdc++: Add --enable-stdio=stdio_pure option [v2]

2020-12-09 Thread Keith Packard via Gcc-patches

This option directs the library to only use simple stdio APIs instead
of using fileno to get the file descriptor for use with POSIX APIs.

Aided-by: Jonathan Wakely 
Signed-off-by: Keith Packard 

-

v2:
Switch from --enable-libstdcxx-pure-stdio to
--enable-stdio=stdio_pure based on a patch from Jonathan
Wakely.
---
 libstdc++-v3/acinclude.m4  | 20 ++
 libstdc++-v3/config/io/basic_file_stdio.cc | 46 +++---
 2 files changed, 54 insertions(+), 12 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index fcd9ea3d23a..703962ce2d7 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -2862,24 +2862,30 @@ AC_DEFUN([GLIBCXX_ENABLE_PARALLEL], [
 
 
 dnl
-dnl Check for which I/O library to use:  stdio, or something specific.
+dnl Check for which I/O library to use:  stdio and POSIX, or pure stdio.
 dnl
-dnl Default is stdio.
+dnl Default is stdio_posix.
 dnl
 AC_DEFUN([GLIBCXX_ENABLE_CSTDIO], [
   AC_MSG_CHECKING([for underlying I/O to use])
   GLIBCXX_ENABLE(cstdio,stdio,[[[=PACKAGE]]],
-[use target-specific I/O package], [permit stdio])
+[use target-specific I/O package], [permit stdio|stdio_posix|stdio_pure])
 
-  # Now that libio has been removed, you can have any color you want as long
-  # as it's black.  This is one big no-op until other packages are added, but
-  # showing the framework never hurts.
+  # The only available I/O model is based on stdio, via basic_file_stdio.
+  # The default "stdio" is actually "stdio + POSIX" because it uses fdopen(3)
+  # to get a file descriptor and then uses read(3) and write(3) with it.
+  # The "stdio_pure" model doesn't use fdopen and only uses FILE* for I/O.
   case ${enable_cstdio} in
-stdio)
+stdio*)
   CSTDIO_H=config/io/c_io_stdio.h
   BASIC_FILE_H=config/io/basic_file_stdio.h
   BASIC_FILE_CC=config/io/basic_file_stdio.cc
   AC_MSG_RESULT(stdio)
+
+  if test "x$enable_cstdio" = "xstdio_pure" ; then
+   AC_DEFINE(_GLIBCXX_USE_STDIO_PURE, 1,
+ [Define to restrict std::__basic_file<> to stdio APIs.])
+  fi
   ;;
   esac
 
diff --git a/libstdc++-v3/config/io/basic_file_stdio.cc b/libstdc++-v3/config/io/basic_file_stdio.cc
index ba830fb9e97..eedffb017b6 100644
--- a/libstdc++-v3/config/io/basic_file_stdio.cc
+++ b/libstdc++-v3/config/io/basic_file_stdio.cc
@@ -111,13 +111,21 @@ namespace
 
   // Wrapper handling partial write.
   static std::streamsize
+#ifdef _GLIBCXX_USE_STDIO_PURE
+  xwrite(FILE *__file, const char* __s, std::streamsize __n)
+#else
   xwrite(int __fd, const char* __s, std::streamsize __n)
+#endif
   {
 std::streamsize __nleft = __n;
 
 for (;;)
   {
+#ifdef _GLIBCXX_USE_STDIO_PURE
+   const std::streamsize __ret = fwrite(__s, 1, __nleft, __file);
+#else
const std::streamsize __ret = write(__fd, __s, __nleft);
+#endif
if (__ret == -1L && errno == EINTR)
  continue;
if (__ret == -1L)
@@ -133,7 +141,7 @@ namespace
 return __n - __nleft;
   }
 
-#ifdef _GLIBCXX_HAVE_WRITEV
+#if defined(_GLIBCXX_HAVE_WRITEV) && !defined(_GLIBCXX_USE_STDIO_PURE)
   // Wrapper handling partial writev.
   static std::streamsize
   xwritev(int __fd, const char* __s1, std::streamsize __n1,
@@ -286,9 +294,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   __basic_file<char>::is_open() const throw ()
   { return _M_cfile != 0; }
 
+#ifndef _GLIBCXX_USE_STDIO_PURE
   int
   __basic_file<char>::fd() throw ()
   { return fileno(_M_cfile); }
+#endif
 
   __c_file*
   __basic_file<char>::file() throw ()
@@ -315,28 +325,46 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   {
 streamsize __ret;
 do
+#ifdef _GLIBCXX_USE_STDIO_PURE
+  __ret = fread(__s, 1, __n, this->file());
+#else
   __ret = read(this->fd(), __s, __n);
+#endif
 while (__ret == -1L && errno == EINTR);
 return __ret;
   }
 
   streamsize
   __basic_file<char>::xsputn(const char* __s, streamsize __n)
-  { return xwrite(this->fd(), __s, __n); }
+  {
+#ifdef _GLIBCXX_USE_STDIO_PURE
+return xwrite(this->file(), __s, __n);
+#else
+return xwrite(this->fd(), __s, __n);
+#endif
+  }
 
   streamsize
   __basic_file<char>::xsputn_2(const char* __s1, streamsize __n1,
   const char* __s2, streamsize __n2)
   {
 streamsize __ret = 0;
-#ifdef _GLIBCXX_HAVE_WRITEV
+#if defined(_GLIBCXX_HAVE_WRITEV) && !defined(_GLIBCXX_USE_STDIO_PURE)
 __ret = xwritev(this->fd(), __s1, __n1, __s2, __n2);
 #else
 if (__n1)
+#ifdef _GLIBCXX_USE_STDIO_PURE
+  __ret = xwrite(this->file(), __s1, __n1);
+#else
   __ret = xwrite(this->fd(), __s1, __n1);
+#endif
 
 if (__ret == __n1)
+#ifdef _GLIBCXX_USE_STDIO_PURE
+  __ret += xwrite(this->file(), __s2, __n2);
+#else
   __ret += xwrite(this->fd(), __s2, __n2);
+#endif
 #endif
 return __ret;
   }
@@ -350,7 +378,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     if (__off > numeric_limits<off_t>::max()
        || __off < numeric_limits<off_t>::min())

[PATCH 2/2] Regenerate libstdc++-v3 autoconf files

2020-12-09 Thread Keith Packard via Gcc-patches

These are the changes to autoconf files for the stdio_pure patch

Signed-off-by: Keith Packard 
---
 libstdc++-v3/config.h.in |  3 +++
 libstdc++-v3/configure   | 17 -
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/libstdc++-v3/config.h.in b/libstdc++-v3/config.h.in
index 72faabfb2c1..c0c166715cb 100644
--- a/libstdc++-v3/config.h.in
+++ b/libstdc++-v3/config.h.in
@@ -1022,6 +1022,9 @@
 /* Define if POSIX read/write locks are available in . */
 #undef _GLIBCXX_USE_PTHREAD_RWLOCK_T
 
+/* Define to restrict std::__basic_file<> to stdio APIs. */
+#undef _GLIBCXX_USE_STDIO_PURE
+
 /* Define if /dev/random and /dev/urandom are available for the random_device
of TR1 (Chapter 5.1). */
 #undef _GLIBCXX_USE_RANDOM_TR1
diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
index d128de2f186..36c61a77d07 100755
--- a/libstdc++-v3/configure
+++ b/libstdc++-v3/configure
@@ -16376,7 +16376,7 @@ $as_echo_n "checking for underlying I/O to use... " >&6; }
 if test "${enable_cstdio+set}" = set; then :
   enableval=$enable_cstdio;
   case "$enableval" in
-   stdio) ;;
+   stdio|stdio_posix|stdio_pure) ;;
*) as_fn_error $? "Unknown argument to enable/disable cstdio" "$LINENO" 5 ;;
esac
 
@@ -16386,16 +16386,23 @@ fi
 
 
 
-  # Now that libio has been removed, you can have any color you want as long
-  # as it's black.  This is one big no-op until other packages are added, but
-  # showing the framework never hurts.
+  # The only available I/O model is based on stdio, via basic_file_stdio.
+  # The default "stdio" is actually "stdio + POSIX" because it uses fdopen(3)
+  # to get a file descriptor and then uses read(3) and write(3) with it.
+  # The "stdio_pure" model doesn't use fdopen and only uses FILE* for I/O.
   case ${enable_cstdio} in
-stdio)
+stdio*)
   CSTDIO_H=config/io/c_io_stdio.h
   BASIC_FILE_H=config/io/basic_file_stdio.h
   BASIC_FILE_CC=config/io/basic_file_stdio.cc
   { $as_echo "$as_me:${as_lineno-$LINENO}: result: stdio" >&5
 $as_echo "stdio" >&6; }
+
+  if test "x$enable_cstdio" = "xstdio_pure" ; then
+
+$as_echo "#define _GLIBCXX_USE_STDIO_PURE 1" >>confdefs.h
+
+  fi
   ;;
   esac
 
-- 
2.29.2

[PATCH 0/2] Support libc with stdio-only I/O in libstdc++

2020-12-09 Thread Keith Packard via Gcc-patches

The current libstdc++ basic_file_stdio.cc code assumes a POSIX API
underneath the stdio implementation provided by the host libc. This
means that the host must provide a fairly broad POSIX file API,
including read, write, open, close, lseek and ioctl.

This patch changes basic_file_stdio.cc to only use basic ANSI-C stdio
functions, allowing it to be used with libc implementations like
picolibc which may not have a POSIX operating system underneath.

This is version 2 of the patch. This version uses the existing
--enable-stdio option, extending that to add 'stdio_pure' for this
mode using a patch created by Jonathan Wakely
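
For illustration, a partial-write loop that uses only a FILE * and fwrite
might look like the sketch below. It is plain C, not the libstdc++ code, and
sketch_xwrite is a hypothetical name. One point it makes explicit: fwrite
signals failure with a short item count (to be checked with ferror), not with
a -1 return value as write(2) does.

```c
#include <assert.h>
#include <stdio.h>

/* Write N bytes from S to FILE using stdio only.  Returns the number of
   bytes actually written; a result smaller than N means fwrite stopped
   making progress, and the caller can inspect ferror (FILE).  */
size_t
sketch_xwrite (FILE *file, const char *s, size_t n)
{
  size_t nleft = n;

  assert (file != NULL);
  assert (s != NULL || n == 0);

  while (nleft > 0)
    {
      /* fwrite returns the count of complete items written, never -1.  */
      size_t ret = fwrite (s, 1, nleft, file);
      if (ret == 0)
        break;
      s += ret;
      nleft -= ret;
    }
  return n - nleft;
}
```

A caller would typically compare the return value against the requested
length and treat any shortfall as an I/O error.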

Re: [PATCH V2] RISC-V: Explicitly call python when using multilib generator

2020-12-09 Thread Kito Cheng via Gcc-patches

Hi Simon:

The V2 version LGTM, thanks!


On Thu, Dec 10, 2020 at 4:31 AM Simon Cook wrote:
>
>
> On 09/12/2020 14:57, Matthias Klose wrote:
> >
> > that's again hard-coding 'python'.
> >
>
> I believe this way of invoking Python is better than just hardcoding
> python: it uses the interpreter that was invoked for the first script.
>
> --
> From 304afba63fb851fae461fcd89a7ecdba3e96c313 Mon Sep 17 00:00:00 2001
> From: Simon Cook 
> Date: Wed, 9 Dec 2020 10:39:28 +
> Subject: [PATCH] RISC-V: Explicitly call python when using multilib
> generator
>
> When building GCC for RISC-V with the --with-multilib-generator option,
> it may not be possible to call arch-canonicalize as an executable when
> building on Windows. Instead directly invoke the expected python
> interpreter for this step.
>
> gcc/ChangeLog:
>
> * config/riscv/multilib-generator (arch_canonicalize): Invoke
> python interpreter when calling arch-canonicalize script.
> ---
>  gcc/config/riscv/multilib-generator | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/riscv/multilib-generator b/gcc/config/riscv/multilib-generator
> index 53c51dfa53f..ccfd9ea18ea 100755
> --- a/gcc/config/riscv/multilib-generator
> +++ b/gcc/config/riscv/multilib-generator
> @@ -54,7 +54,8 @@ def arch_canonicalize(arch):
>this_file = os.path.abspath(os.path.join( __file__))
>arch_can_script = \
>  os.path.join(os.path.dirname(this_file), "arch-canonicalize")
> -  proc = subprocess.Popen([arch_can_script, arch], stdout=subprocess.PIPE)
> +  proc = subprocess.Popen([sys.executable, arch_can_script, arch],
> +  stdout=subprocess.PIPE)
>out, err = proc.communicate()
>return out.strip()
>
> --
> 2.24.3
>

Re: V3 [PATCH 0/2] Switch to a new section if the SECTION_RETAIN bit doesn't match

2020-12-09 Thread H.J. Lu via Gcc-patches

On Wed, Dec 9, 2020 at 6:08 PM Jim Wilson wrote:
>
> On Tue, Dec 8, 2020 at 4:51 AM H.J. Lu via Gcc-patches wrote:
>>
>> When SECTION_RETAIN is used, definitions marked with used attribute and
>> unmarked definitions are placed in a section with the same name.  Instead
>> of issuing an error:
>
>
> Have you tested glibc builds with this patch?  I noticed yesterday that your 
> earlier patch had broken glibc builds, and was about to raise a bug report 
> for that.  I'm seeing
>
> In file included from :
> gconv_db.c: In function 'free_mem':
> gconv_db.c:831:18: error: 'free_mem' causes a section type conflict with 
> 'free_derivation'
>   831 | libc_freeres_fn (free_mem)
>   |  ^~~~
> ./../include/libc-symbols.h:316:15: note: in definition of macro 
> 'libc_freeres_fn'
>   316 |   static void name (void)
>   |   ^~~~
> gconv_db.c:174:1: note: 'free_derivation' was declared here
>   174 | free_derivation (void *p)
>   | ^~~
>
> This is because free_derivation and free_mem are in the same section, but 
> free_mem has attribute used and free_derivation does not.
>
> This patch is changing the error to a warning, which I think solves the 
> problem unless --enable-werror is used, which is probably good enough.  We 
> could maybe decide that what glibc is doing is wrong and fix glibc to use 
> different section names or mark both functions as attribute used.
>

I tested it with a glibc build.  The glibc build issue is the reason I
didn't combine the 2 patches into one.
If GCC does issue a warning, which it should, we will change glibc.
If we decide not to issue a warning, there is no need to change glibc.

BTW, I believe glibc should be changed.

-- 
H.J.

Re: V3 [PATCH 0/2] Switch to a new section if the SECTION_RETAIN bit doesn't match

2020-12-09 Thread Jim Wilson

On Tue, Dec 8, 2020 at 4:51 AM H.J. Lu via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> When SECTION_RETAIN is used, definitions marked with used attribute and
> unmarked definitions are placed in a section with the same name.  Instead
> of issuing an error:
>

Have you tested glibc builds with this patch?  I noticed yesterday that
your earlier patch had broken glibc builds, and was about to raise a bug
report for that.  I'm seeing

In file included from :
gconv_db.c: In function 'free_mem':
gconv_db.c:831:18: error: 'free_mem' causes a section type conflict with
'free_derivation'
  831 | libc_freeres_fn (free_mem)
  |  ^~~~
./../include/libc-symbols.h:316:15: note: in definition of macro
'libc_freeres_fn'
  316 |   static void name (void)
  |   ^~~~
gconv_db.c:174:1: note: 'free_derivation' was declared here
  174 | free_derivation (void *p)
  | ^~~

This is because free_derivation and free_mem are in the same section, but
free_mem has attribute used and free_derivation does not.

This patch is changing the error to a warning, which I think solves the
problem unless --enable-werror is used, which is probably good enough.  We
could maybe decide that what glibc is doing is wrong and fix glibc to use
different section names or mark both functions as attribute used.

Jim

Re: [PATCH 00/31] VAX: Bring the port up to date (yes, MODE_CC conversion is included)

2020-12-09 Thread Paul Koning via Gcc-patches

> On Dec 9, 2020, at 9:06 AM, Maciej W. Rozycki  wrote:
> 
> On Sat, 28 Nov 2020, Paul Koning wrote:
> 
>>> Hmm, I gather those systems are able to run some kind of BSD Unix: don't 
>>> they support the r-commands which would allow you to run DejaGNU testing 
>>> with a realistic environment PDP-11 hardware would be usually used with, 
>>> possibly on actual hardware even?  I always feel a bit uneasy about the 
>>> accuracy of any simulation (having suffered from bugs in QEMU causing 
>>> false negatives in software verification).
>> 
>> Fair enough.  But SIMH is a full system emulator with a very large 
>> amount of history and expertise involved in its creation.  It's also 
>> known to run every PDP-11 OS and most diagnostics.  Yes, it certainly 
>> runs BSD 2.x; the reason I didn't use that approach is that I don't know 
>> it well.
> 
> This all sounds great.  Do you happen to know if it is cycle-accurate 
> with respect to individual hardware microarchitectures simulated?  That 
> would be required for performance evaluation of compiler-generated code.

No, it isn't.  I believe it just charges one time unit per instruction, with 
the possible exception of CIS instructions. 

I don't know of any cycle accurate PDP-11 emulators.  It's not even clear if it 
is possible to build one, given the asynchronous operation of the UNIBUS.  It 
certainly would be extremely difficult since even the documented timing is 
amazingly complex, never mind the possibility that the reality is different 
from what is documented.

The pdp11 back end uses a very rough approximation of the documented 11/70 
timing, but GCC doesn't make it easy (or maybe not even possible) to use the 
full timing details.  It's not something I'd expect to refine a whole lot 
further.

More interesting would be to tweak the optimizing machinery to improve parts 
that either have bitrotted or never actually worked. The code generation for 
auto-increment etc. isn't particularly effective and I think that's a known 
limitation.  Ditto indirect addressing, since few other machines have that.  
(VAX does, of course; it might benefit too.)  And with LRA things are more 
limited still, again this seems to be known and is caused by the focus on 
modern machine architectures.

paul

Re: [PATCH V2] RISC-V: Explicitly call python when using multilib generator

2020-12-09 Thread Jim Wilson

On Wed, Dec 9, 2020 at 12:30 PM Simon Cook  wrote:

> I believe this way of invoking python should be better than just
> hardcoding python, instead using the interpreter that was called for the
> first script.
>

I'm not a python expert.  I would suggest asking Kito to review the patch.
Avoiding the explicit python reference is a good idea in case we rewrite the
script in another language.

Jim

Re: Help with PR97872

2020-12-09 Thread Hongtao Liu via Gcc-patches

It seems better with your PR97872 fix on i386.

cat test.c

typedef char v16qi __attribute__ ((vector_size(16)));
v16qi f1(v16qi a, v16qi b) {
return (a & b) != 0;
}

before

f1(char __vector(16), char __vector(16)):
pand %xmm1, %xmm0
pxor %xmm1, %xmm1
pcmpeqb %xmm1, %xmm0
pcmpeqd %xmm1, %xmm1
pandn %xmm1, %xmm0
ret

After the pr97872 fix

f1(char __vector(16), char __vector(16)):
pand xmm0, xmm1
pxor xmm1, xmm1
pcmpeqb xmm0, xmm1
pcmpeqb xmm0, xmm1
ret
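
As an illustration of why the shorter sequence is still correct, here is a
minimal sketch that models the byte lanes as Python lists. The helper names
(pcmpeqb, f1_reference, f1_two_compares) are made up for this example and
are not GCC or intrinsic names. It assumes a true lane is 0xFF and a false
lane is 0x00. Comparing (a & b) with zero and then comparing that mask with
zero again inverts it, which gives the (a & b) != 0 mask without the
all-ones constant and pandn.

```python
def pcmpeqb(x, y):
    """Lane-wise byte equality: 0xFF where the lanes match, 0x00 otherwise."""
    return [0xFF if a == b else 0x00 for a, b in zip(x, y)]


def f1_reference(a, b):
    """Direct model of (a & b) != 0 for each lane."""
    return [0xFF if (x & y) != 0 else 0x00 for x, y in zip(a, b)]


def f1_two_compares(a, b):
    """Model of the shorter sequence: pand, pxor (zero), pcmpeqb, pcmpeqb."""
    anded = [x & y for x, y in zip(a, b)]   # pand
    zero = [0] * len(anded)                 # pxor of a register with itself
    is_zero = pcmpeqb(anded, zero)          # mask of lanes where (a & b) == 0
    return pcmpeqb(is_zero, zero)           # comparing the mask with 0 inverts it


a = [0x0F, 0xF0, 0x00, 0xFF]
b = [0xF0, 0xF0, 0xFF, 0x01]
print(f1_two_compares(a, b) == f1_reference(a, b))
```

Only four lanes are used here to keep the example short; the same reasoning
applies to all 16 lanes of v16qi.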

On Wed, Dec 9, 2020 at 7:47 PM Prathamesh Kulkarni
 wrote:
>
> On Tue, 8 Dec 2020 at 14:36, Prathamesh Kulkarni
>  wrote:
> >
> > On Mon, 7 Dec 2020 at 17:37, Hongtao Liu  wrote:
> > >
> > > On Mon, Dec 7, 2020 at 7:11 PM Prathamesh Kulkarni
> > >  wrote:
> > > >
> > > > On Mon, 7 Dec 2020 at 16:15, Hongtao Liu  wrote:
> > > > >
> > > > > On Mon, Dec 7, 2020 at 5:47 PM Richard Biener  
> > > > > wrote:
> > > > > >
> > > > > > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > >
> > > > > > > On Mon, 7 Dec 2020 at 13:01, Richard Biener  
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > >
> > > > > > > > > On Fri, 4 Dec 2020 at 17:18, Richard Biener 
> > > > > > > > >  wrote:
> > > > > > > > > >
> > > > > > > > > > On Fri, 4 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > > >
> > > > > > > > > > > On Thu, 3 Dec 2020 at 16:35, Richard Biener 
> > > > > > > > > > >  wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, 3 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, 1 Dec 2020 at 16:39, Richard Biener 
> > > > > > > > > > > > >  wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Tue, 1 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > > > For the test mentioned in PR, I was trying to see 
> > > > > > > > > > > > > > > if we could do
> > > > > > > > > > > > > > > specialized expansion for vcond in target when 
> > > > > > > > > > > > > > > operands are -1 and 0.
> > > > > > > > > > > > > > > arm_expand_vcond gets the following operands:
> > > > > > > > > > > > > > > (reg:V8QI 113 [ _2 ])
> > > > > > > > > > > > > > > (reg:V8QI 117)
> > > > > > > > > > > > > > > (reg:V8QI 118)
> > > > > > > > > > > > > > > (lt (reg/v:V8QI 115 [ a ])
> > > > > > > > > > > > > > > (reg/v:V8QI 116 [ b ]))
> > > > > > > > > > > > > > > (reg/v:V8QI 115 [ a ])
> > > > > > > > > > > > > > > (reg/v:V8QI 116 [ b ])
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > where r117 and r118 are set to vector constants 
> > > > > > > > > > > > > > > -1 and 0 respectively.
> > > > > > > > > > > > > > > However, I am not sure if there's a way to check 
> > > > > > > > > > > > > > > if the register is
> > > > > > > > > > > > > > > constant during expansion time (since we don't 
> > > > > > > > > > > > > > > have df analysis yet) ?
> > > > >
> > > > > It seems to me that all you need to do is relax the predicates of op1
> > > > > and op2 in vcondmn to accept const0_rtx and constm1_rtx. I haven't
> > > > > debugged it, but I see that vcondmn in neon.md only accepts
> > > > > s_register_operand.
> > > > >
> > > > > (define_expand "vcond"
> > > > >   [(set (match_operand:VDQW 0 "s_register_operand")
> > > > > (if_then_else:VDQW
> > > > >   (match_operator 3 "comparison_operator"
> > > > > [(match_operand:VDQW 4 "s_register_operand")
> > > > >  (match_operand:VDQW 5 "reg_or_zero_operand")])
> > > > >   (match_operand:VDQW 1 "s_register_operand")
> > > > >   (match_operand:VDQW 2 "s_register_operand")))]
> > > > >   "TARGET_NEON && (! || 
> > > > > flag_unsafe_math_optimizations)"
> > > > > {
> > > > >   arm_expand_vcond (operands, mode);
> > > > >   DONE;
> > > > > })
> > > > >
> > > > > in sse.md it's defined as
> > > > > (define_expand "vcondu"
> > > > >   [(set (match_operand:V_512 0 "register_operand")
> > > > > (if_then_else:V_512
> > > > >   (match_operator 3 ""
> > > > > [(match_operand:VI_AVX512BW 4 "nonimmediate_operand")
> > > > >  (match_operand:VI_AVX512BW 5 "nonimmediate_operand")])
> > > > >   (match_operand:V_512 1 "general_operand")
> > > > >   (match_operand:V_512 2 "general_operand")))]
> > > > >   "TARGET_AVX512F
> > > > >&& (GET_MODE_NUNITS (mode)
> > > > >== GET_MODE_NUNITS (mode))"
> > > > > {
> > > > >   bool ok = ix86_expand_int_vcond (operands);
> > > > >   gcc_assert (ok);
> > > > >   DONE;
> > > > > })
> > > > >
> > > > > then we can get operands[1] and operands[2] as
> > > > >
> > > > > (gdb) p debug_rtx (operands[1])
> > > > >  (const_vector:V16QI [
> > > > > (const_int -1 [0x]) repeated x16
> > > > > ])
> > > > > (gdb) p debug_rtx (operands[2])
> > > > > (reg:V16QI 82 [ _2 ])
> > > > > (const_vector:V16QI [
> > > > > (const_int 0 [0]) repeated x16
> > > > > ])
> > > > Hi Hongtao,

Re: [PATCH] RISC-V: Explicitly call python when using multilib generator

2020-12-09 Thread Jim Wilson

On Wed, Dec 9, 2020 at 7:02 AM Jakub Jelinek via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> On Wed, Dec 09, 2020 at 03:57:51PM +0100, Matthias Klose wrote:
> > On 12/9/20 3:03 PM, Simon Cook wrote:
> > > When building GCC for RISC-V with the --with-multilib-generator option,
> > > it may not be possible to call arch-canonicalize as an executable when
> > > building on Windows. Instead directly invoke the expected python
> > > interpreter for this step.
>
> There is nothing in the two scripts that can't be easily done in awk
> or shell.
> I think it would be best to rewrite those scripts in those languages.
>

multilib-generator is for specifying embedded elf multilibs which get
complicated considering all of the arch extensions and sub-extensions that
we need to support.  This isn't relevant to native compilers, and hence not
a compiler bootstrap issue.

arch-canonicalize is just a very minor optimization to avoid redundant
multilibs, and again, it is really for embedded elf multilibs, and of no
practical use for native compilers, and hence not a bootstrap issue.  The
only problem was that we were calling it always, even if no multilibs and
no python, and Kito fixed that by making it optional, only calling it if
python exists.  We could perhaps further improve this by only calling it
when --enable-multilib was specified.

Anyways, as things currently stand, the RISC-V port has no python
requirement to build a native compiler (after Kito's last fix).  There is
only an optional need for python in an embedded elf --enable-multilib build,
if you want to use the --with-multilib-generator configure option, or if
you want to avoid one duplicate multilib that is possible if you
accidentally specify a non-canonical --with-arch target.

So there is no immediate need to rewrite the scripts.

Jim

Optimize combination of comparisons to dec+compare

2020-12-09 Thread Eugene Rozenfeld via Gcc-patches

This patch adds a pattern for optimizing
x < y || y == XXX_MIN to x <= y-1
if y is an integer with TYPE_OVERFLOW_WRAPS.

This fixes pr96674.

Tested on x86_64-pc-linux-gnu.

For this function

bool f(unsigned a, unsigned b)
{
return (b == 0) | (a < b);
}

the code without the patch is

test   esi,esi
sete   al
cmp    esi,edi
seta   dl
or     eax,edx
ret

the code with the patch is

sub    esi,0x1
cmp    esi,edi
setae  al
ret
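
To see that the rewrite of f is valid, here is a small brute-force check. It
is only a sketch: it assumes an 8-bit unsigned type instead of the 32-bit
unsigned in the example, and models wrapping subtraction with a mask. The
names BITS, MASK, original and transformed are made up for the illustration.
The point is that b - 1 wraps to the maximum value when b == 0, so the
single comparison a <= b - 1 is then always true, matching the b == 0 arm.

```python
BITS = 8                      # assumed width, small enough to enumerate
MASK = (1 << BITS) - 1        # largest value of the unsigned type


def original(a, b):
    """The source form: (b == 0) | (a < b)."""
    return (b == 0) or (a < b)


def transformed(a, b):
    """The optimized form: a <= b - 1, with b - 1 wrapping modulo 2**BITS."""
    return a <= ((b - 1) & MASK)


# Compare both forms on every pair of 8-bit values.
mismatches = [(a, b)
              for a in range(MASK + 1)
              for b in range(MASK + 1)
              if original(a, b) != transformed(a, b)]
print(len(mismatches))
```

The check relies on wrapping arithmetic, which is why the pattern is
restricted to types with TYPE_OVERFLOW_WRAPS.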

Eugene

gcc/
PR tree-optimization/96674
* match.pd: New pattern x < y || y == XXX_MIN --> x <= y - 1

gcc/testsuite
* gcc.dg/pr96674.c: New test.



0001-Optimize-combination-of-comparisons-to-dec-compare.patch
Description: 0001-Optimize-combination-of-comparisons-to-dec-compare.patch

Go testsuite patch committed: Recognize errorcheckdir -n

2020-12-09 Thread Ian Lance Taylor via Gcc-patches

This patch to go-test.exp recognizes errorcheckdir -n, as used by the
updated bug345 test.  The -n option is meaningful for the gc compiler,
but irrelevant for gccgo.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian

* go.test/go-test.exp (go-gc-tests): Recognize errorcheckdir -n,
for bug345.go.
f3f018dd31b3b4324276021f250db8ec6996ae8e
diff --git a/gcc/testsuite/go.test/go-test.exp b/gcc/testsuite/go.test/go-test.exp
index d11a2c2bba4..b03cb16990d 100644
--- a/gcc/testsuite/go.test/go-test.exp
+++ b/gcc/testsuite/go.test/go-test.exp
@@ -618,7 +618,7 @@ proc go-gc-tests { } {
go-execute-xfail $test
} elseif { $test_line == "// errorcheck" } {
errchk $test ""
-   } elseif { $test_line == "// errorcheckdir" } {
+   } elseif { $test_line == "// errorcheckdir" || $test_line == "// errorcheckdir -n" } {
set hold_runtests $runtests
set runtests "go-test.exp"
set dir "[file rootname $test].dir"

Re: Go testsuite patch committed: Don't quote quoted parentheses

2020-12-09 Thread Ian Lance Taylor via Gcc-patches

On Wed, Dec 9, 2020 at 2:39 AM Andreas Schwab  wrote:
>
> This breaks make -C gcc check-go RUNTESTFLAGS="go-test.exp=chan.go":
>
> ERROR: tcl error sourcing 
> /opt/gcc/gcc-20201209/gcc/testsuite/go.test/go-test.exp.
> ERROR: couldn't compile regular expression pattern: parentheses () not 
> balanced
> while executing
> "regsub -all "(^|\n)(\[^\n\]+$line\[^\n\]*($pattern)\[^\n\]*\n?)+" 
> $comp_output "\n" comp_output"
> (procedure "saved-dg-test" line 125)
> invoked from within
> "saved-dg-test chan.go {  -O } {-fno-show-column  -pedantic-errors }"
> ("eval" body line 1)
> invoked from within
> "eval saved-dg-test $args "
> (procedure "dg-test" line 1)
> invoked from within
> "dg-test $test "$flags $flags_t" ${default-extra-flags}"
> (procedure "go-dg-runtest" line 24)
> invoked from within
> "go-dg-runtest $filename "" "-fno-show-column $DEFAULT_GOCFLAGS $opts""
> (procedure "errchk" line 83)
> invoked from within
> "errchk $test """
> (procedure "go-gc-tests" line 309)
> invoked from within
> "go-gc-tests"
> (file "/opt/gcc/gcc-20201209/gcc/testsuite/go.test/go-test.exp" line 1217)
> invoked from within
> "source /opt/gcc/gcc-20201209/gcc/testsuite/go.test/go-test.exp"
> ("uplevel" body line 1)
> invoked from within
> "uplevel #0 source /opt/gcc/gcc-20201209/gcc/testsuite/go.test/go-test.exp"
> invoked from within
> "catch "uplevel #0 source $test_file_name""


Thanks.  I managed to miss that.

I read up on TCL syntax, and rewrote the regexp quoting to use curly
braces instead of double quotes, which makes it much simpler.  This
version fixes that problem and should hopefully work going forward.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.
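
For readers less familiar with Tcl, here is a rough Python analogue of the
first regsub loop in the patch below. It is a sketch, not a translation of
the whole errchk procedure, and the function name merge_quoted_alternatives
is made up. A Python raw string plays roughly the role of the Tcl braces:
the pattern reaches the regexp engine as written, with no extra layer of
backslashes. The loop replaces one match at a time until the line stops
changing, which folds several quoted messages into one alternation.

```python
import re

# Same pattern as the braced Tcl version: a comment start, an opening quote,
# the first message, then the closing quote, a space, and the next opening quote.
MERGE = re.compile(r'(// [^"]*"[^"]*)" "')


def merge_quoted_alternatives(line):
    """Fold  // ERROR "a" "b" "c"  into  // ERROR "a|b|c"."""
    while True:
        # count=1 mirrors regsub without -all: only the first match changes.
        merged = MERGE.sub(r'\1|', line, count=1)
        if merged == line:
            return line
        line = merged


print(merge_quoted_alternatives('x := 1 // ERROR "foo" "bar" "baz"'))
```

A line with a single quoted message has no match and comes back unchanged.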

Ian

* go.test/go-test.exp (errchk): Rewrite regexp quoting to use
curly braces, making it much simpler.
7c1feeb579c4d0cc3c8e633360eb77754bc70fa7
diff --git a/gcc/testsuite/go.test/go-test.exp b/gcc/testsuite/go.test/go-test.exp
index d129e1c65da..d11a2c2bba4 100644
--- a/gcc/testsuite/go.test/go-test.exp
+++ b/gcc/testsuite/go.test/go-test.exp
@@ -101,50 +101,32 @@ proc errchk { test opts } {
set changed ""
while { $changed != $copy_line } {
set changed $copy_line
-   regsub "\(// \[^\"\]*\"\[^\"\]*\)\" \"" $copy_line "\\1|" out_line
+   regsub {(// [^"]*"[^"]*)" "} $copy_line {\1|} out_line
set copy_line $out_line
}
 
-   regsub "// \(GCCGO_\)?ERROR \"\(\[^\"\]*\)\" *\(\\*/\)?$" $copy_line 
"// \{ dg-error \"\\2\" \}\\3" out_line
-   if [string match "*dg-error*\\\[*" $out_line] {
-   set index [string first "dg-error" $out_line]
-   regsub -start $index -all "\\\[" $out_line "\\\[" 
out_line
-   }
-   if [string match "*dg-error*\\\]*" $out_line] {
-   set index [string first "dg-error" $out_line]
-   regsub -start $index -all "\\\]" $out_line "\\\]" 
out_line
-   }
-   if [string match "*dg-error*.\**" $out_line] {
-   # I worked out the right number of backslashes by
-   # experimentation, not analysis.
-   regsub -all "\\.\\*" $out_line "\[ -~\]*" out_line
-   }
-   if [string match "*dg-error*\\\[?\\\]*" $out_line] {
-   set index [string first "dg-error" $out_line]
-   regsub -all "\\\[\(.\)\\\]" $out_line "\[\\1\]" out_line
-   }
-   if [string match "*dg-error*\{*" $out_line] {
-   set index [string first "dg-error" $out_line]
-   regsub -start $index -all "\(\[^]\)\{" $out_line 
"\\1\[\\\{\]" out_line
-   }
-   if [string match "*dg-error*\}*\}" $out_line] {
-   set index [string first "dg-error" $out_line]
-   regsub -start $index -all "\(\[^]\)\}\(.\)" $out_line 
"\\1\[\\\}\]\\2" out_line
-   }
-   if [string match "*dg-error*\\\[^\]\(*" $out_line] {
-   set index [string first "dg-error" $out_line]
-   regsub -start $index -all "\\\(" $out_line "\[\\\(\]" 
out_line
-   }
-   if [string match "*dg-error*\\\[^\]\)*\}" $out_line] {
-   set index [stri

Re: [PATCH 1/8 v4] Dead-field warning in structs at LTO-time

2020-12-09 Thread Eric Gallager via Gcc-patches

On Fri, Dec 4, 2020 at 4:58 AM Erick Ochoa <
erick.oc...@theobroma-systems.com> wrote:

>
> This commit includes the following components:
>
>Type-based escape analysis to determine structs that can be modified at
>link-time.
>Field access analysis to determine which fields are never read.
>
> The type-based escape analysis provides a list of types, that are not
> visible outside of the current linking unit (e.g. parameter types of
> external
> functions).
>
> The field access analysis examines non-escaping structs for fields that
> are not used in the linking unit and thus can be removed.
>
> 2020-11-04  Erick Ochoa  
>
>  * Makefile.in: Add file to list of new sources.
>  * common.opt: Add new flags.
>  * ipa-type-escape-analysis.c: New file.
> ---
>   gcc/Makefile.in|1 +
>   gcc/common.opt |8 +
>   gcc/ipa-type-escape-analysis.c | 3428 
>   gcc/ipa-type-escape-analysis.h | 1152 +++
>   gcc/passes.def |1 +
>   gcc/timevar.def|1 +
>   gcc/tree-pass.h|2 +
>   7 files changed, 4593 insertions(+)
>   create mode 100644 gcc/ipa-type-escape-analysis.c
>   create mode 100644 gcc/ipa-type-escape-analysis.h
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 978a08f7b04..8b18c9217a2 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1415,6 +1415,7 @@ OBJS = \
> incpath.o \
> init-regs.o \
> internal-fn.o \
> +   ipa-type-escape-analysis.o \
> ipa-cp.o \
> ipa-sra.o \
> ipa-devirt.o \
> diff --git a/gcc/common.opt b/gcc/common.opt
> index d4cbb2f86a5..85351738a29 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -3460,4 +3460,12 @@ fipa-ra
>   Common Report Var(flag_ipa_ra) Optimization
>   Use caller save register across calls if possible.
>   +fipa-type-escape-analysis
> +Common Report Var(flag_ipa_type_escape_analysis) Optimization
> +This flag is only used for debugging the type escape analysis
> +
> +Wdfa
> +Common Var(warn_dfa) Init(1) Warning
> +Warn about dead fields at link time.
> +
>

I don't really like the name "-Wdfa" very much; could you maybe come up
with a longer and more descriptive name instead? Say, "-Wunused-field" or
"-Wunused-private-field" depending on the kind of field:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72789
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92801


>   ; This comment is to ensure we retain the blank line above.
> diff --git a/gcc/ipa-type-escape-analysis.c
> b/gcc/ipa-type-escape-analysis.c
> new file mode 100644
> index 000..32c8bf997fb
> --- /dev/null
> +++ b/gcc/ipa-type-escape-analysis.c
> @@ -0,0 +1,3428 @@
> +/* IPA Type Escape Analysis and Dead Field Elimination
> +   Copyright (C) 2019-2020 Free Software Foundation, Inc.
> +
> +  Contributed by Erick Ochoa 
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +.  */
> +
> +/* Interprocedural dead field analysis (IPA-DFA)
> +
> +   The goal of this analysis is to
> +
> +   1) discover RECORD_TYPEs which do not escape the current linking unit.
> +
> +   2) discover fields in RECORD_TYPEs that are never read.
> +
> +   3) merge the results from 1 and 2 to determine which fields are not
> needed.
> +
> +   The algorithm basically consists of the following stages:
> +
> +   1) Partition all TYPE_P trees into two sets: those trees which reach a
> +   tree of RECORD_TYPE.
> +
> +   2.a) Analyze callsites to determine if arguments and return types are
> +   escaping.
> +   2.b) Analyze casts to determine if it would be safe to mark a field
> as dead.
> +   2.c) Analyze for constructors and static initialization and mark this
> as
> +   TYPE_P trees as unable to be modified
> +   2.d) Analyze if FIELD_DECL are accessed via pointer arithmetic and mark
> +   FIELD_DECLs before as unable to be modified.
> +   2.e) Analyze if an address of a FIELD_DECL is taken and mark the whole
> +   RECORD_TYPE as unable to be modified.
> +   2.f) Propagate this information to nested TYPE_P trees.
> +   2.g) Propagate this information across different TYPE_P trees that
> represent
> +   equivalent TYPE_P types.
> +
> +   3.a) Analyze FIELD_DECL to determine whether they are read,
> +   written or neither.
> +   3.b) Unify this information across different RECORD_TYPE trees that
> +   represent equiv

Re: Problem building libstdc++ for the avr target

2020-12-09 Thread Vladimir V via Gcc-patches

Thank you for the quick response.
The patch solves the problem.

On Wed, 9 Dec 2020 at 18:01, Jonathan Wakely wrote:

> On 09/12/20 12:49 +, Jonathan Wakely wrote:
> >On 09/12/20 13:32 +0100, Vladimir V wrote:
> >>Hello.
> >>
> >>While testing with the current upstream I encountered a compilation
> issue.
> >>Although I build with "--disable-threads" flag the following error
> occurs:
> >>
> >>../../../../../libstdc++-v3/src/c++11/thread.cc:39:4: error: #error "No
> >>sleep function known for this target"
> >>
> >>Previously the check was inside the  #ifdef _GLIBCXX_HAS_GTHREADS that
> >>prevented the error from happening (in my case with gcc v10.1),
> >>So I would like to ask if the thread.cc should be involved in the build
> if
> >>the threads support is configured to be disabled?
> >
> >Yes, the file is always built, but which definitions it contains
> >depends on what is configured for the target.
> >
> >The std::this_thread::sleep_for and std::this_thread::sleep_until
> >functions don't actually depend on threads at all. They just sleep.
> >
> >But that still requires target support, just different support from
> >threads.
> >
> >>And if it should, then can the condition be reworked to cover the
> described
> >>case?
> >
> >Yes, I'll do that. Thanks for bringing it to my attention.
> >
> >I assume we can't use avr-libc's delay functions, because they depend
> >on the CPU clock frequency, which isn't known when we compile
> >libstdc++. So I'll just suppress the declarations of those functions
> >and remove the #error.
>
> The attached patch adds a new _GLIBCXX_NO_SLEEP configure macro which
> should get defined for your hosted AVR build. That should mean that
> std::this_thread::sleep_for is not defined, and src/c++11/thread.cc
> will no longer insist on some way to sleep being supported.
>
> I've only tested this on powerpc64le-linux, so please let me know if
> it works for you.
>
> Pushed to master.
>
>
>

RE: [PATCH][GCC] aarch64: Add +pauth to -march

2020-12-09 Thread Przemyslaw Wirkus via Gcc-patches

> > Subject: [PATCH][GCC] aarch64: Add +pauth to -march
> >
> > New +pauth (Pointer Authentication from Armv8.3-A) feature option for
> > -march command line option.
> >
> > Please note that the majority of PAUTH instructions are implemented
> > behind the HINT instruction. PAUTH stays an Armv8.3-A feature but now
> > can be assigned to other architectures or CPUs.
> >
> > Patch includes:
> > - new +pauth command line option.
> > - docs update to +flagm command line option in docs.
> >
> > Regression tested and no issues.
> >
> > OK for master?
> Ok.
> Thanks,
> Kyrill

commit ef33047a8b93d416f08f3f640dd65f3887fb05c1

> >
> > gcc/ChangeLog:
> >
> > * config/aarch64/aarch64-option-extensions.def
> > (AARCH64_OPT_EXTENSION): New +pauth option in -march for AArch64.
> > * config/aarch64/aarch64.h (AARCH64_FL_PAUTH): New pauth extension
> > bitmask.
> > (AARCH64_ISA_PAUTH): New ISA bitmask for PAUTH.
> > (AARCH64_FL_FOR_ARCH8_3): Add PAUTH to Armv8.3-A.
> > (TARGET_PAUTH): New target mask to isolate PAUTH instructions.
> > * config/aarch64/aarch64.md (do_return): Condition set to TARGET_PAUTH.
> > * doc/invoke.texi: Update docs (+flagm, +pauth).

Re: c++: Module-specific error and tree dumping

2020-12-09 Thread David Malcolm via Gcc-patches

On Wed, 2020-12-09 at 16:07 -0500, Nathan Sidwell wrote:
> On 12/9/20 3:41 PM, David Malcolm wrote:
> > On Wed, 2020-12-09 at 15:20 -0500, Nathan Sidwell wrote:
> > > With modules, we need the ability to name 'foos' in different
> > > modules.
> > > The idiom for that is a trailing '@modulename' suffix.
> > 
> > Out of curiosity, is this idiom shared with other compilers? (or in
> > the
> > standard)
> 
> The std says nothing.  I discussed it with clang guys, and probably
> with 
> MSVC & EDG guys.  I've been using it consistently in papers to wg21. 
> Can't recall exactly, it was one of the very early things.

Fair enough, thanks.


> > Unless I'm misreading it, the patch you posted deletes about 2000
> > lines
> > from gcc/cp/error.c (lines 2599 onwards), deleting a whole bunch of
> > stuff.
> > 
> > Did you post the wrong patch?
> 
> hm, wut?  Attached is the diff that was committed.

Thanks; that diff looks far more reasonable :)

Dave

>I think I must have 
> generated the diff during a file write (via sshfs).
> 
> I think that's evidence I need to stop for the day :)
> 
> nathan
> 
>

Re: [PATCH, rs6000] Update "size" attribute for Power10

2020-12-09 Thread will schmidt via Gcc-patches

On Tue, 2020-12-08 at 15:46 -0600, Pat Haugen via Gcc-patches wrote:
> Update size attribute for Power10.
> 
> 
> This patch was broken out from my larger patch to update various
> attributes for
> Power10, in order to make the review process hopefully easier. This
> patch only
> updates the size attribute for various new instructions. There were
> no changes
> requested to this portion of the original patch, so nothing is new
> here.
> 
> Bootstrap/regtest on powerpc64le (Power8/Power10) with no new
> regressions. Ok for trunk?
> 
> -Pat
> 
> 
> 2020-11-08  Pat Haugen  
> 
> gcc/


I think you'll need to specify gcc/ChangeLog at commit time.
Beyond that nit, the Changelog content here looks to match the patch
body OK.
lgtm,
thanks
-Will

>   * config/rs6000/dfp.md (extendddtd2, trunctddd2,
> *cmp_internal1,
>   floatditd2, ftrunc2, fixdi2, dfp_ddedpd_,
>   dfp_denbcd_, dfp_dxex_, dfp_diex_,
>   *dfp_sgnfcnc_, dfp_dscli_, dfp_dscri_):
> Update size
>   attribute for Power10.
>   * config/rs6000/mma.md (*movoo): Likewise.
>   * config/rs6000/rs6000.md (define_attr "size"): Add 256.
>   (define_mode_attr bits): Add DD/TD modes.
>   * config/rs6000/sync.md (load_quadpti, store_quadpti,
> load_lockedpti,
>   store_conditionalpti): Update size attribute for Power10.
>

Re: c++: Module-specific error and tree dumping

2020-12-09 Thread Nathan Sidwell


On 12/9/20 3:41 PM, David Malcolm wrote:
> On Wed, 2020-12-09 at 15:20 -0500, Nathan Sidwell wrote:
>> With modules, we need the ability to name 'foos' in different
>> modules.
>> The idiom for that is a trailing '@modulename' suffix.
>
> Out of curiosity, is this idiom shared with other compilers? (or in the
> standard)

The std says nothing.  I discussed it with clang guys, and probably with
MSVC & EDG guys.  I've been using it consistently in papers to wg21.
Can't recall exactly, it was one of the very early things.

> Unless I'm misreading it, the patch you posted deletes about 2000 lines
> from gcc/cp/error.c (lines 2599 onwards), deleting a whole bunch of
> stuff.
>
> Did you post the wrong patch?

hm, wut?  Attached is the diff that was committed.   I think I must have
generated the diff during a file write (via sshfs).

I think that's evidence I need to stop for the day :)

nathan


--
Nathan Sidwell
diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index d11591d67a0..4572f6e4ae2 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -179,6 +179,38 @@ cxx_initialize_diagnostics (diagnostic_context *context)
   pp->m_format_postprocessor = new cxx_format_postprocessor ();
 }
 
+/* Dump an '@module' name suffix for DECL, if any.  */
+
+static void
+dump_module_suffix (cxx_pretty_printer *pp, tree decl)
+{
+  if (!modules_p ())
+return;
+
+  if (!DECL_CONTEXT (decl))
+return;
+
+  if (TREE_CODE (decl) != CONST_DECL
+  || !UNSCOPED_ENUM_P (DECL_CONTEXT (decl)))
+{
+  if (!DECL_NAMESPACE_SCOPE_P (decl))
+	return;
+
+  if (TREE_CODE (decl) == NAMESPACE_DECL
+	  && !DECL_NAMESPACE_ALIAS (decl)
+	  && (TREE_PUBLIC (decl) || !TREE_PUBLIC (CP_DECL_CONTEXT (decl))))
+	return;
+}
+
+  if (unsigned m = get_originating_module (decl))
+if (const char *n = module_name (m, false))
+  {
+	pp_character (pp, '@');
+	pp->padding = pp_none;
+	pp_string (pp, n);
+  }
+}
+
 /* Dump a scope, if deemed necessary.  */
 
 static void
@@ -771,6 +803,8 @@ dump_aggr_type (cxx_pretty_printer *pp, tree t, int flags)
   else
 pp_cxx_tree_identifier (pp, DECL_NAME (decl));
 
+  dump_module_suffix (pp, decl);
+
   if (tmplate)
 dump_template_parms (pp, TYPE_TEMPLATE_INFO (t),
 			 !CLASSTYPE_USE_TEMPLATE (t),
@@ -1077,6 +,9 @@ dump_simple_decl (cxx_pretty_printer *pp, tree t, tree type, int flags)
 pp_string (pp, M_(""));
   else
 pp_string (pp, M_(""));
+
+  dump_module_suffix (pp, t);
+
   if (flags & TFF_DECL_SPECIFIERS)
 dump_type_suffix (pp, type, flags);
 }
@@ -1894,6 +1931,8 @@ dump_function_name (cxx_pretty_printer *pp, tree t, int flags)
   else
 dump_decl (pp, name, flags);
 
+  dump_module_suffix (pp, t);
+
   if (DECL_TEMPLATE_INFO (t)
   && !DECL_FRIEND_PSEUDO_TEMPLATE_INSTANTIATION (t)
   && (TREE_CODE (DECL_TI_TEMPLATE (t)) != TEMPLATE_DECL
diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 3587dfcc925..176286cdd91 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -74,6 +74,11 @@ get_module (tree, module_state *, bool)
   return nullptr;
 }
 
+const char *
+module_name (unsigned, bool)
+{
+  return nullptr;
+}
 
 void
 mangle_module (int, bool)
@@ -102,6 +107,12 @@ get_originating_module (tree, bool)
   return 0;
 }
 
+unsigned
+get_importing_module (tree, bool)
+{
+  return 0;
+}
+
 bool
 module_may_redeclare (tree)
 {
diff --git a/gcc/cp/ptree.c b/gcc/cp/ptree.c
index f8d22082ba7..1e9fdf82e86 100644
--- a/gcc/cp/ptree.c
+++ b/gcc/cp/ptree.c
@@ -59,6 +59,42 @@ cxx_print_decl (FILE *file, tree node, int indent)
 
   bool need_indent = true;
 
+  if (TREE_CODE (node) == FUNCTION_DECL
+  || TREE_CODE (node) == VAR_DECL
+  || TREE_CODE (node) == TYPE_DECL
+  || TREE_CODE (node) == TEMPLATE_DECL
+  || TREE_CODE (node) == CONCEPT_DECL
+  || TREE_CODE (node) == NAMESPACE_DECL)
+{
+  unsigned m = 0;
+  if (DECL_LANG_SPECIFIC (node) && DECL_MODULE_IMPORT_P (node))
+	m = get_importing_module (node, true);
+
+  if (const char *name = m == ~0u ? "" : module_name (m, true))
+	{
+	  if (need_indent)
+	indent_to (file, indent + 3);
+	  fprintf (file, " module %d:%s", m, name);
+	  need_indent = false;
+	}
+
+  if (DECL_LANG_SPECIFIC (node) && DECL_MODULE_PURVIEW_P (node))
+	{
+	  if (need_indent)
+	indent_to (file, indent + 3);
+	  fprintf (file, " purview");
+	  need_indent = false;
+	}
+}
+
+  if (DECL_MODULE_EXPORT_P (node))
+{
+  if (need_indent)
+	indent_to (file, indent + 3);
+  fprintf (file, " exported");
+  need_indent = false;
+}
+
   if (DECL_EXTERNAL (node) && DECL_NOT_REALLY_EXTERN (node))
 {
   if (need_indent)

Re: c++: Module-specific error and tree dumping

2020-12-09 Thread David Malcolm via Gcc-patches

On Wed, 2020-12-09 at 15:20 -0500, Nathan Sidwell wrote:
> With modules, we need the ability to name 'foos' in different
> modules.
> The idiom for that is a trailing '@modulename' suffix.

Out of curiosity, is this idiom shared with other compilers? (or in the
standard)

>   This adds that
> to the error printing routines.  I also augment the tree dumping
> machinery to show module-specific metadata.
> 
>  gcc/cp/
>  * error.c (dump_module_suffix): New.
>  (dump_aggr_type, dump_simple_decl, dump_function_name): Call
> it.
>  * ptree.c (cxx_print_decl): Print module information.
>  * module.cc (module_name, get_importing_module): Stubs.
> 
> pushing to trunk

Unless I'm misreading it, the patch you posted deletes about 2000 lines
from gcc/cp/error.c (lines 2599 onwards), deleting a whole bunch of
stuff.

Did you post the wrong patch?

Dave

[PATCH V2] RISC-V: Explicitly call python when using multilib generator

2020-12-09 Thread Simon Cook



On 09/12/2020 14:57, Matthias Klose wrote:
> 
> that's again hard-coding 'python'.
> 

I believe this way of invoking python is better than hard-coding
'python': it uses the same interpreter that was invoked for the first
script.

--
From 304afba63fb851fae461fcd89a7ecdba3e96c313 Mon Sep 17 00:00:00 2001
From: Simon Cook 
Date: Wed, 9 Dec 2020 10:39:28 +
Subject: [PATCH] RISC-V: Explicitly call python when using multilib
generator

When building GCC for RISC-V with the --with-multilib-generator option,
it may not be possible to call arch-canonicalize as an executable when
building on Windows. Instead directly invoke the expected python
interpreter for this step.

gcc/ChangeLog:

* config/riscv/multilib-generator (arch_canonicalize): Invoke
python interpreter when calling arch-canonicalize script.
---
 gcc/config/riscv/multilib-generator | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/multilib-generator
b/gcc/config/riscv/multilib-generator
index 53c51dfa53f..ccfd9ea18ea 100755
--- a/gcc/config/riscv/multilib-generator
+++ b/gcc/config/riscv/multilib-generator
@@ -54,7 +54,8 @@ def arch_canonicalize(arch):
   this_file = os.path.abspath(os.path.join( __file__))
   arch_can_script = \
 os.path.join(os.path.dirname(this_file), "arch-canonicalize")
-  proc = subprocess.Popen([arch_can_script, arch], stdout=subprocess.PIPE)
+  proc = subprocess.Popen([sys.executable, arch_can_script, arch],
+  stdout=subprocess.PIPE)
   out, err = proc.communicate()
   return out.strip()

-- 
2.24.3

c++: Module-specific error and tree dumping

2020-12-09 Thread Nathan Sidwell


With modules, we need the ability to name 'foos' in different modules.
The idiom for that is a trailing '@modulename' suffix.  This adds that
to the error printing routines.  I also augment the tree dumping
machinery to show module-specific metadata.
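
The suffix idiom described above can be sketched in isolation as follows. This is only an illustration, not the code in error.c: the helper name and the convention that an empty string means "no originating module" are assumptions made for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not GCC code: append an '@module' suffix to a
// declaration name when the declaration originates in a named module.
// An empty MODULE_NAME stands for "no originating module" (an assumption
// of this sketch).
std::string
name_with_module_suffix (const std::string &decl_name,
                         const std::string &module_name)
{
  if (module_name.empty ())
    return decl_name;
  return decl_name + "@" + module_name;
}
```

With this sketch, a 'foo' from a module 'bar' would be spelled foo@bar, while a 'foo' with no originating module stays foo.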

gcc/cp/
* error.c (dump_module_suffix): New.
(dump_aggr_type, dump_simple_decl, dump_function_name): Call it.
* ptree.c (cxx_print_decl): Print module information.
* module.cc (module_name, get_importing_module): Stubs.

pushing to trunk

--
Nathan Sidwell
diff --git i/gcc/cp/error.c w/gcc/cp/error.c
index d11591d67a0..374d859fe1c 100644
--- i/gcc/cp/error.c
+++ w/gcc/cp/error.c
@@ -179,6 +179,38 @@ cxx_initialize_diagnostics (diagnostic_context *context)
   pp->m_format_postprocessor = new cxx_format_postprocessor ();
 }
 
+/* Dump an '@module' name suffix for DECL, if any.  */
+
+static void
+dump_module_suffix (cxx_pretty_printer *pp, tree decl)
+{
+  if (!modules_p ())
+return;
+
+  if (!DECL_CONTEXT (decl))
+return;
+
+  if (TREE_CODE (decl) != CONST_DECL
+  || !UNSCOPED_ENUM_P (DECL_CONTEXT (decl)))
+{
+  if (!DECL_NAMESPACE_SCOPE_P (decl))
+	return;
+
+  if (TREE_CODE (decl) == NAMESPACE_DECL
+	  && !DECL_NAMESPACE_ALIAS (decl)
+	  && (TREE_PUBLIC (decl) || !TREE_PUBLIC (CP_DECL_CONTEXT (decl))))
+	return;
+}
+
+  if (unsigned m = get_originating_module (decl))
+if (const char *n = module_name (m, false))
+  {
+	pp_character (pp, '@');
+	pp->padding = pp_none;
+	pp_string (pp, n);
+  }
+}
+
 /* Dump a scope, if deemed necessary.  */
 
 static void
@@ -771,6 +803,8 @@ dump_aggr_type (cxx_pretty_printer *pp, tree t, int flags)
   else
 pp_cxx_tree_identifier (pp, DECL_NAME (decl));
 
+  dump_module_suffix (pp, decl);
+
   if (tmplate)
 dump_template_parms (pp, TYPE_TEMPLATE_INFO (t),
 			 !CLASSTYPE_USE_TEMPLATE (t),
@@ -1077,6 +,9 @@ dump_simple_decl (cxx_pretty_printer *pp, tree t, tree type, int flags)
 pp_string (pp, M_(""));
   else
 pp_string (pp, M_(""));
+
+  dump_module_suffix (pp, t);
+
   if (flags & TFF_DECL_SPECIFIERS)
 dump_type_suffix (pp, type, flags);
 }
@@ -1894,6 +1931,8 @@ dump_function_name (cxx_pretty_printer *pp, tree t, int flags)
   else
 dump_decl (pp, name, flags);
 
+  dump_module_suffix (pp, t);
+
   if (DECL_TEMPLATE_INFO (t)
   && !DECL_FRIEND_PSEUDO_TEMPLATE_INSTANTIATION (t)
   && (TREE_CODE (DECL_TI_TEMPLATE (t)) != TEMPLATE_DECL
@@ -2560,2047 +2599,4 @@ dump_expr (cxx_pretty_printer *pp, tree t, int flags)
 	  pp_cxx_left_paren (pp);
 	  pp_cxx_left_paren (pp);
 	  dump_type (pp, TREE_TYPE (t), flags);
-	  pp_cxx_right_paren (pp);
-	  pp_character (pp, '0');
-	  pp_cxx_right_paren (pp);
-	  break;
-	}
-	  else if (tree_fits_shwi_p (idx))
-	{
-	  tree virtuals;
-	  unsigned HOST_WIDE_INT n;
-
-	  t = TREE_TYPE (TYPE_PTRMEMFUNC_FN_TYPE (TREE_TYPE (t)));
-	  t = TYPE_METHOD_BASETYPE (t);
-	  virtuals = BINFO_VIRTUALS (TYPE_BINFO (TYPE_MAIN_VARIANT (t)));
-
-	  n = tree_to_shwi (idx);
-
-	  /* Map vtable index back one, to allow for the null pointer to
-		 member.  */
-	  --n;
-
-	  while (n > 0 && virtuals)
-		{
-		  --n;
-		  virtuals = TREE_CHAIN (virtuals);
-		}
-	  if (virtuals)
-		{
-		  dump_expr (pp, BV_FN (virtuals),
-			 flags | TFF_EXPR_IN_PARENS);
-		  break;
-		}
-	}
-	}
-  if (TREE_TYPE (t) && LAMBDA_TYPE_P (TREE_TYPE (t)))
-	pp_string (pp, "");
-  if (TREE_TYPE (t) && EMPTY_CONSTRUCTOR_P (t))
-	{
-	  dump_type (pp, TREE_TYPE (t), 0);
-	  pp_cxx_left_paren (pp);
-	  pp_cxx_right_paren (pp);
-	}
-  else
-	{
-	  if (!BRACE_ENCLOSED_INITIALIZER_P (t))
-	dump_type (pp, TREE_TYPE (t), 0);
-	  pp_cxx_left_brace (pp);
-	  dump_expr_init_vec (pp, CONSTRUCTOR_ELTS (t), flags);
-	  pp_cxx_right_brace (pp);
-	}
-
-  break;
-
-case OFFSET_REF:
-  {
-	tree ob = TREE_OPERAND (t, 0);
-	if (is_dummy_object (ob))
-	  {
-	t = TREE_OPERAND (t, 1);
-	if (TREE_CODE (t) == FUNCTION_DECL)
-	  /* A::f */
-	  dump_expr (pp, t, flags | TFF_EXPR_IN_PARENS);
-	else if (BASELINK_P (t))
-	  dump_expr (pp, OVL_FIRST (BASELINK_FUNCTIONS (t)),
-			 flags | TFF_EXPR_IN_PARENS);
-	else
-	  dump_decl (pp, t, flags);
-	  }
-	else
-	  {
-	if (INDIRECT_REF_P (ob))
-	  {
-		dump_expr (pp, TREE_OPERAND (ob, 0), flags | TFF_EXPR_IN_PARENS);
-		pp_cxx_arrow (pp);
-		pp_cxx_star (pp);
-	  }
-	else
-	  {
-		dump_expr (pp, ob, flags | TFF_EXPR_IN_PARENS);
-		pp_cxx_dot (pp);
-		pp_cxx_star (pp);
-	  }
-	dump_expr (pp, TREE_OPERAND (t, 1), flags | TFF_EXPR_IN_PARENS);
-	  }
-	break;
-  }
-
-case TEMPLATE_PARM_INDEX:
-  dump_decl (pp, TEMPLATE_PARM_DECL (t), flags & ~TFF_DECL_SPECIFIERS);
-  break;
-
-case CAST_EXPR:
-  if (TREE_OPERAND (t, 0) == NULL_TREE
-	  || TREE_CHAIN (TREE_OPERAND (t, 0)))
-	{
-	  dump_type (pp, TREE_TYPE (t), flags);
-	  pp_c

Re: GCC 10 backports

2020-12-09 Thread Martin Liška


On 10/16/20 10:51 AM, Martin Liška wrote:

On 10/7/20 2:03 PM, Martin Liška wrote:

On 10/1/20 9:18 PM, Martin Liška wrote:

I'm going to install the following 3 tested backports.

Martin


One more patch that I've tested.

Martin


Adding one more.

Martin


Adding one more I've just tested.

Martin
From 5b51c5135f9c6adce273697a7a892df39d7f4b29 Mon Sep 17 00:00:00 2001
From: Kewen Lin 
Date: Tue, 18 Aug 2020 21:37:39 -0500
Subject: [PATCH] options: Make --help= see overridden values

Options "-Q --help=params" don't show the final values after
target option overriding, instead it emits the default values
in params.opt (without any explicit param settings).

This patch makes it see overridden values.
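
The ordering the patch relies on can be sketched roughly as below. This is a simplified model, not the opts-global.c code: the params map, the hook body and the parameter name "example-param" are all invented for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using params = std::map<std::string, int>;

// Hypothetical stand-in for a target hook that overrides default values.
void
target_override (params &p)
{
  p["example-param"] = 42;
}

// Collect the help lines.  The override hook runs first, and only when
// some --help= argument was actually given, so the printed values are
// the overridden ones rather than the defaults.
std::vector<std::string>
collect_help (params &p, const std::vector<std::string> &help_args)
{
  std::vector<std::string> lines;
  if (!help_args.empty ())
    {
      target_override (p);
      for (const auto &arg : help_args)
        for (const auto &kv : p)
          lines.push_back (arg + ": " + kv.first + "="
                           + std::to_string (kv.second));
    }
  return lines;
}
```

In this model, no help argument means the hook is not called from here at all, which mirrors the is_empty () guard in the patch.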

gcc/ChangeLog:

	* opts-global.c (decode_options): Call target_option_override_hook
	before it prints for --help=*.

(cherry picked from commit a7bbb5b1b1eb09db8175130474e8da952f30404b)
---
 gcc/opts-global.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/gcc/opts-global.c b/gcc/opts-global.c
index c658805470e..5e5c3d41dd9 100644
--- a/gcc/opts-global.c
+++ b/gcc/opts-global.c
@@ -327,8 +327,14 @@ decode_options (struct gcc_options *opts, struct gcc_options *opts_set,
   unsigned i;
   const char *arg;
 
-  FOR_EACH_VEC_ELT (help_option_arguments, i, arg)
-print_help (opts, lang_mask, arg);
+  if (!help_option_arguments.is_empty ())
+{
+  /* Make sure --help=* sees the overridden values.  */
+  target_option_override_hook ();
+
+  FOR_EACH_VEC_ELT (help_option_arguments, i, arg)
+	print_help (opts, lang_mask, arg);
+}
 }
 
 /* Hold command-line options associated with stack limitation.  */
-- 
2.29.2

c++: name-lookup cleanups

2020-12-09 Thread Nathan Sidwell



Name-lookup is the most changed piece of the front end for modules.
Here are some preparatory cleanups and API extensions.

gcc/cp/
* name-lookup.h (set_class_bindings): Return vector, take signed
'extra' parm.
* name-lookup.c (maybe_lazily_declare): Break out ...
(get_class_binding): ... of here, call it.
(find_member_slot): Adjust set_class_bindings call.
(set_class_bindings): Allow -ve extra.  Return the vector.
(set_identifier_type_value_with_scope): Remove checking assert.
(lookup_using_decl): Set decl's context.
(do_pushtag): Adjust set_identifier_type_value_with_scope handling.
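
The new set_class_bindings contract (signed 'extra', vector returned) can be modelled outside the compiler as below. This is a sketch under assumptions: std::vector and std::unique_ptr stand in for GCC's own vec type, and the function and type names are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Illustrative stand-in for a class's member vector; not GCC's vec type.
using member_vec = std::vector<int>;

// Create (or extend) the vector when one already exists, when there are
// at least 8 fields, or when EXTRA is negative.  A negative EXTRA forces
// creation but reserves no spare slots.  Returns the vector, or nullptr
// if none exists and none was created.
member_vec *
set_bindings (std::unique_ptr<member_vec> &vec, std::size_t n_fields,
              int extra)
{
  if (vec || n_fields >= 8 || extra < 0)
    {
      if (!vec)
        vec = std::make_unique<member_vec> ();
      std::size_t spare = extra >= 0 ? static_cast<std::size_t> (extra) : 0;
      vec->reserve (vec->size () + n_fields + spare);
    }
  return vec.get ();
}
```

Returning the vector lets a caller such as find_member_slot use the result directly instead of re-reading it from the class afterwards.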


--
Nathan Sidwell
diff --git c/gcc/cp/name-lookup.c w/gcc/cp/name-lookup.c
index c87d151b441..fa372810349 100644
--- c/gcc/cp/name-lookup.c
+++ w/gcc/cp/name-lookup.c
@@ -1338,6 +1338,36 @@ get_class_binding_direct (tree klass, tree name, bool want_type)
   return val;
 }
 
+/* We're about to lookup NAME in KLASS.  Make sure any lazily declared
+   members are now declared.  */
+
+static void
+maybe_lazily_declare (tree klass, tree name)
+{
+  /* Lazily declare functions, if we're going to search these.  */
+  if (IDENTIFIER_CTOR_P (name))
+{
+  if (CLASSTYPE_LAZY_DEFAULT_CTOR (klass))
+	lazily_declare_fn (sfk_constructor, klass);
+  if (CLASSTYPE_LAZY_COPY_CTOR (klass))
+	lazily_declare_fn (sfk_copy_constructor, klass);
+  if (CLASSTYPE_LAZY_MOVE_CTOR (klass))
+	lazily_declare_fn (sfk_move_constructor, klass);
+}
+  else if (IDENTIFIER_DTOR_P (name))
+{
+  if (CLASSTYPE_LAZY_DESTRUCTOR (klass))
+	lazily_declare_fn (sfk_destructor, klass);
+}
+  else if (name == assign_op_identifier)
+{
+  if (CLASSTYPE_LAZY_COPY_ASSIGN (klass))
+	lazily_declare_fn (sfk_copy_assignment, klass);
+  if (CLASSTYPE_LAZY_MOVE_ASSIGN (klass))
+	lazily_declare_fn (sfk_move_assignment, klass);
+}
+}
+
 /* Look for NAME's binding in exactly KLASS.  See
get_class_binding_direct for argument description.  Does lazy
special function creation as necessary.  */
@@ -1348,30 +1378,7 @@ get_class_binding (tree klass, tree name, bool want_type /*=false*/)
   klass = complete_type (klass);
 
   if (COMPLETE_TYPE_P (klass))
-{
-  /* Lazily declare functions, if we're going to search these.  */
-  if (IDENTIFIER_CTOR_P (name))
-	{
-	  if (CLASSTYPE_LAZY_DEFAULT_CTOR (klass))
-	lazily_declare_fn (sfk_constructor, klass);
-	  if (CLASSTYPE_LAZY_COPY_CTOR (klass))
-	lazily_declare_fn (sfk_copy_constructor, klass);
-	  if (CLASSTYPE_LAZY_MOVE_CTOR (klass))
-	lazily_declare_fn (sfk_move_constructor, klass);
-	}
-  else if (IDENTIFIER_DTOR_P (name))
-	{
-	  if (CLASSTYPE_LAZY_DESTRUCTOR (klass))
-	lazily_declare_fn (sfk_destructor, klass);
-	}
-  else if (name == assign_op_identifier)
-	{
-	  if (CLASSTYPE_LAZY_COPY_ASSIGN (klass))
-	lazily_declare_fn (sfk_copy_assignment, klass);
-	  if (CLASSTYPE_LAZY_MOVE_ASSIGN (klass))
-	lazily_declare_fn (sfk_move_assignment, klass);
-	}
-}
+maybe_lazily_declare (klass, name);
 
   return get_class_binding_direct (klass, name, want_type);
 }
@@ -1392,14 +1399,11 @@ find_member_slot (tree klass, tree name)
   vec_alloc (member_vec, 8);
   CLASSTYPE_MEMBER_VEC (klass) = member_vec;
   if (complete_p)
-	{
-	  /* If the class is complete but had no member_vec, we need
-	 to add the TYPE_FIELDS into it.  We're also most likely
-	 to be adding ctors & dtors, so ask for 6 spare slots (the
-	 abstract cdtors and their clones).  */
-	  set_class_bindings (klass, 6);
-	  member_vec = CLASSTYPE_MEMBER_VEC (klass);
-	}
+	/* If the class is complete but had no member_vec, we need to
+	   add the TYPE_FIELDS into it.  We're also most likely to be
+	   adding ctors & dtors, so ask for 6 spare slots (the
+	   abstract cdtors and their clones).  */
+	member_vec = set_class_bindings (klass, 6);
 }
 
   if (IDENTIFIER_CONV_OP_P (name))
@@ -1741,18 +1745,18 @@ member_vec_dedup (vec *member_vec)
no existing MEMBER_VEC and fewer than 8 fields, do nothing.  We
know there must be at least 1 field -- the self-reference
TYPE_DECL, except for anon aggregates, which will have at least
-   one field anyway.  */
+   one field anyway.  If EXTRA < 0, always create the vector.  */
 
-void 
-set_class_bindings (tree klass, unsigned extra)
+vec *
+set_class_bindings (tree klass, int extra)
 {
   unsigned n_fields = count_class_fields (klass);
   vec *member_vec = CLASSTYPE_MEMBER_VEC (klass);
 
-  if (member_vec || n_fields >= 8)
+  if (member_vec || n_fields >= 8 || extra < 0)
 {
   /* Append the new fields.  */
-  vec_safe_reserve_exact (member_vec, extra + n_fields);
+  vec_safe_reserve_exact (member_vec, n_fields + (extra >= 0 ? extra : 0));
   member_vec_append_class_fields (member_vec, klass);
 }
 
@@ -1762,6 +1766,8 @@ set_class_bindings (tree klass, unsigned extra)
   member_vec->qsort (member_name_cmp

i386: Remove REG_ALLOC_ORDER definition

2020-12-09 Thread Uros Bizjak via Gcc-patches

REG_ALLOC_ORDER just spells out what the default is already set to, so
the definition is redundant.

2020-12-09  Uroš Bizjak  

gcc/
* config/i386/i386.h (REG_ALLOC_ORDER): Remove.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index d157d30ec17..e88738ca873 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -1163,22 +1163,6 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
  /* k0,  k1,  k2,  k3,  k4,  k5,  k6,  k7*/\
  1,   1,   1,   1,   1,   1,   1,   1 }
 
-/* Order in which to allocate registers.  Each register must be
-   listed once, even those in FIXED_REGISTERS.  List frame pointer
-   late and fixed registers last.  Note that, in general, we prefer
-   registers listed in CALL_USED_REGISTERS, keeping the others
-   available for storage of persistent values.
-
-   The ADJUST_REG_ALLOC_ORDER actually overwrite the order,
-   so this is just empty initializer for array.  */
-
-#define REG_ALLOC_ORDER                                                \
-{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,                \
-  16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,  \
-  32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,  \
-  48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,  \
-  64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75 }
-
 /* ADJUST_REG_ALLOC_ORDER is a macro which permits reg_alloc_order
to be rearranged based on a particular function.  When using sse math,
we want to allocate SSE before x87 registers and vice versa.  */

Re: [[PATCH] 2/3] aarch64: Add C-function invocation for indirect branch pattern.

2020-12-09 Thread Philipp Tomsich

Richard,

Could you review this series and let us know if this is acceptable for Phase 3?
This is a security-relevant change (a Spectre variant 2 mitigation) for the
Ampere eMAG…

Thanks,
Philipp.

> On 09.12.2020, at 18:21, Christoph Müllner 
>  wrote:
> 
> aarch64 already uses a C-function for indirect calls
> (aarch64_indirect_call_asm()). So let's add the same
> abstraction for indirect branches.
> 
> This patch has no functional consequence.
> 
> gcc/
>* config/aarch64/aarch64.c (aarch64_indirect_branch_asm): Add
>   function to output indirect branch instructions.
>* config/aarch64/aarch64.md (indirect_jump): Invoke
>aarch64_indirect_branch_asm() instead of outputting instructions
>directly.
>* config/aarch64/aarch64.md (sibcall_insn): Likewise.
>* config/aarch64/aarch64.md (sibcall_value_insn): Likewise.
> ---
> gcc/config/aarch64/aarch64-protos.h | 1 +
> gcc/config/aarch64/aarch64.c| 7 +++
> gcc/config/aarch64/aarch64.md   | 6 +++---
> 3 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/config/aarch64/aarch64-protos.h 
> b/gcc/config/aarch64/aarch64-protos.h
> index 2aa3f1fddaa..91ae8b7a0f9 100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -802,6 +802,7 @@ extern const atomic_ool_names aarch64_ool_ldeor_names;
> tree aarch64_resolve_overloaded_builtin_general (location_t, tree, void *);
> 
> const char *aarch64_sls_barrier (int);
> +const char *aarch64_indirect_branch_asm (rtx);
> const char *aarch64_indirect_call_asm (rtx);
> extern bool aarch64_harden_sls_retbr_p (void);
> extern bool aarch64_harden_sls_blr_p (void);
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 264ccb8beb2..4799679f9e5 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -23659,6 +23659,13 @@ aarch64_asm_file_end ()
> #endif
> }
> 
> +const char *
> +aarch64_indirect_branch_asm (rtx addr)
> +{
> +  output_asm_insn ("br\t%0", &addr);
> +  return "";
> +}
> +
> const char *
> aarch64_indirect_call_asm (rtx addr)
> {
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index eed06de3240..5cf660cc19f 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -471,7 +471,7 @@
>   [(set (pc) (match_operand:DI 0 "register_operand" "r"))]
>   ""
>   {
> -output_asm_insn ("br\\t%0", operands);
> +aarch64_indirect_branch_asm (operands[0]);
> return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
>   }
>   [(set_attr "type" "branch")
> @@ -1104,7 +1104,7 @@
>   {
> if (which_alternative == 0)
>   {
> - output_asm_insn ("br\\t%0", operands);
> + aarch64_indirect_branch_asm (operands[0]);
>   return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
>   }
> return "b\\t%c0";
> @@ -1124,7 +1124,7 @@
>   {
> if (which_alternative == 0)
>   {
> - output_asm_insn ("br\\t%1", operands);
> + aarch64_indirect_branch_asm (operands[1]);
>   return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
>   }
> return "b\\t%c1";
> -- 
> 2.29.2
>

Re: [PATCH] C++ : Add the -stdlib= option.

2020-12-09 Thread Jason Merrill via Gcc-patches


On 11/11/20 3:58 PM, Iain Sandoe wrote:
resending - the first & second attempt didn’t seem to make it to gcc-patches.


Hi

This option allows the user to specify alternate C++ runtime libraries,
for example when a platform uses libc++ as the installed C++ runtime.

It uses the same spelling as the clang option that allows clang to use
libstdc++.


I have had this patch for some time now (more than a year) on Darwin
branches.

For Darwin [>=11] (and I expect modern FreeBSD) the fact that the installed
C++ runtime is libc++ means conflicts can and do occur when using G++.

I expect that the facility will also be useful for folks who regularly try to
ensure that GCC and clang stay compatible; it is a credit to that effort that
the replacement is pretty much “drop in”.

Testing:

The patch applies without regression on *darwin* and x86_64-linux-gnu.

That doesn’t say much about whether it does what’s intended, of course,
and testing in-tree is not a viable option (it would need a lot of work, not
to mention the fact that it depends on an external source base).  So I’ve
tested this quite extensively on x86 Darwin and Linux.

It’s a lot easier to use an LLVM branch >= 9 for this since there is a
missing __cxa symbol before that (I originally used LLVM-7 for ‘reasons’).
Since coroutines was committed to GCC we have a  header
where the libc++ implementation is still using the 
version, so that one needs to  account for this.

Here’s an LLVM-9 tree with an added  header (as an example)
https://github.com/iains/llvm-project/tree/9.0.1-gcc-stdlib
(in case someone wants to try this out in the near future; I don’t think that
LLVM-10 will be much different, at least the coroutine header is unchanged
there)

I’ve used this ‘in anger’ on Darwin to build a toolset which includes a number
of C++ heavy applications (e.g. LLVM, cmake, etc) and it allowed some of
these to work effectively where it had not been possible before.

One can also do an “installed test” of g++; for that there are (a relatively
modest number of) test fails.  AFAICT, there is nothing significant there -
some tests fail because the expected output doesn’t account for the libc++
__1 inline namespace, some fail because libc++ (as per current branches)
doesn’t allow use with GCC + std=c++98, some are warning diagnostics etc.

[how compatible libc++ is, is somewhat independent of the patch itself; but
it seems “very compatible” is a starting assessment].

phew… description longer than patch, it seems.

OK for master?
thanks
Iain

—— commit message

This option allows the user to specify alternate C++ runtime libraries,
for example when a platform uses libc++ as the installed C++ runtime.

We introduce the command line option: -stdlib= which is the user-facing
mechanism to select the C++ runtime to be used when compiling and linking
code.  This is the same option spelling as that used by clang to allow the
use of libstdc++.

The availability (and thus function) of the option are a configure-time
choice using the configuration control:
--with-gxx-libcxx-include-dir=

Specifying the path for the libc++ headers enables the -stdlib= option
(using the path as given); default values are set when the path is
unconfigured.

If --with-gxx-libcxx-include-dir is given together with --with-sysroot=,
then we test to see if the include path starts with the sysroot and, if so,
record the sysroot-relative component as the local path.  At runtime, we
prepend the sysroot that is actually active.
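
The sysroot handling described above can be sketched as two small string operations. This is an illustration only: the function names and any paths used with them are hypothetical, and the plain prefix comparison is an assumption of the sketch rather than a statement of what configure.ac does.

```cpp
#include <cassert>
#include <string>

// Hypothetical configure-time step: if DIR starts with SYSROOT, keep only
// the sysroot-relative part; otherwise keep DIR unchanged.  This is a
// plain prefix test and does not check path-component boundaries.
std::string
strip_sysroot (const std::string &dir, const std::string &sysroot)
{
  if (!sysroot.empty () && dir.compare (0, sysroot.size (), sysroot) == 0)
    return dir.substr (sysroot.size ());
  return dir;
}

// Hypothetical run-time step: prepend the sysroot that is actually active.
std::string
apply_sysroot (const std::string &relative_dir,
               const std::string &active_sysroot)
{
  return active_sysroot + relative_dir;
}
```

Storing only the relative component is what lets a compiler configured against one sysroot find the headers under a different sysroot at run time.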

At link time, we use the C++ runtime in force and (if that is libc++) also
append the libc++abi ABI library. As for other cases, if a target sets the
name pointer for the ABI library to NULL the G++ driver will omit it from
the link line.

gcc/ChangeLog:

 * configure.ac: Add gxx-libcxx-include-dir handled
 in the same way as the regular cxx header directory.
 * Makefile.in: Regenerated.
 * config.in: Likewise.
 * configure: Likewise.
 * cppdefault.c: Pick up libc++ headers if the option
 is enabled.
 * incpath.c (add_standard_paths): Allow for multiple
 c++ header include path variants.
 * doc/invoke.texi: Document the -stdlib= option.

gcc/c-family/ChangeLog:

 * c.opt: Add -stdlib= option and enumerations for
 libstdc++ and libc++.

gcc/cp/ChangeLog:

 * g++spec.c (LIBCXX, LIBCXX_PROFILE, LIBCXX_STATIC): New.
 (LIBCXXABI, LIBCXXABI_PROFILE, LIBCXXABI_STATIC): New.
 (lang_specific_driver): Allow selection amongst multiple
 c++ libraries to be added to the link command.
---
gcc/Makefile.in |  6 +
gcc/c-family/c.opt  | 14 +++
gcc/config.in   |  6 +
gcc/configure   | 57 +++--
gcc/configure.ac    | 44 ++
gcc/cp/g++spec.c    | 53 ++---
gcc/cppdefault.c    |  5 
gcc/doc/invoke.texi | 11 +
gcc/incpath.c   |  6 +++--
9 files changed, 195 insertions(+), 7 deletions(-)

dif

Re: [PATCH] Remove misleading debug line entries

2020-12-09 Thread Bernd Edlinger

On 12/8/20 7:57 PM, Bernd Edlinger wrote:
> On 12/8/20 11:35 AM, Richard Biener wrote:
>>
>> + {
>> +   /* Remove a nonbind marker when the outer scope of the
>> +  inline function is completely removed.  */
>> +   if (gimple_debug_nonbind_marker_p (stmt)
>> +   && BLOCK_ABSTRACT_ORIGIN (b))
>> + {
>> +   while (TREE_CODE (b) == BLOCK
>> +  && !inlined_function_outer_scope_p (b))
>> + b = BLOCK_SUPERCONTEXT (b);
>>
>> So given we never remove a inlined_function_outer_scope_p BLOCK from
>> the block tree can we assert that we find such a BLOCK?  If we never
>> elide those BLOCKs how can it happen that we elide it in the end?

We can remove an inlined function's outer scope when it has no subblocks
any more, or only unused subblocks.  There is an exception to that rule
when no debug info is generated, which is due to this:

>else if (!flag_auto_profile && debug_info_level == DINFO_LEVEL_NONE
> && !optinfo_wants_inlining_info_p ())
>  {
>/* Even for -g0 don't prune outer scopes from artificial
>   functions, otherwise diagnostics using tree_nonartificial_location
>   will not be emitted properly.  */
>if (inlined_function_outer_scope_p (scope))
>  {
>tree ao = BLOCK_ORIGIN (scope);
>if (ao
>&& TREE_CODE (ao) == FUNCTION_DECL
>&& DECL_DECLARED_INLINE_P (ao)
>&& lookup_attribute ("artificial", DECL_ATTRIBUTES (ao)))
>  unused = false;
>  }
>  }
> 

I instrumented the remove_unused_scope_block_p now as follows,
to better understand what happens here:

diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index 9ea24a1..3dd859c 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -525,9 +525,15 @@ remove_unused_scope_block_p (tree scope, bool in_ctor_dtor_block)
*t = BLOCK_SUBBLOCKS (*t);
while (BLOCK_CHAIN (*t))
  {
+   gcc_assert (TREE_USED (*t));
+   if (debug_info_level != DINFO_LEVEL_NONE)
+ gcc_assert (!inlined_function_outer_scope_p (BLOCK_SUPERCONTEXT (*t)));
BLOCK_SUPERCONTEXT (*t) = supercontext;
t = &BLOCK_CHAIN (*t);
  }
+   gcc_assert (TREE_USED (*t));
+   if (debug_info_level != DINFO_LEVEL_NONE)
+ gcc_assert (!inlined_function_outer_scope_p (BLOCK_SUPERCONTEXT (*t)));
BLOCK_CHAIN (*t) = next;
BLOCK_SUPERCONTEXT (*t) = supercontext;
t = &BLOCK_CHAIN (*t);

This survives a bootstrap, but I consider that just as an experiment...

This means that the BLOCK_SUPERCONTEXT pointers never skip an
inlined_function_outer_scope_p block, *except* when no debug info is
generated.  That case is fine, because then either there are no
debug_nonbind_marker_p statements, or it does not matter if an outer
scope is missed.
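
The walk that depends on this property (following the supercontext links until an inlined function's outer scope is reached) can be sketched with a toy block type. The struct and its field names are illustrative and do not match GCC's tree representation.

```cpp
#include <cassert>

// Minimal stand-in for a lexical block.
struct block
{
  block *supercontext;        // enclosing block, or nullptr at the top
  bool inlined_outer_scope;   // true for the outer scope of an inlined function
};

// Walk outwards from B until a block marking an inlined function's outer
// scope is found.  Returns nullptr if no enclosing block is such a scope.
const block *
enclosing_inlined_scope (const block *b)
{
  while (b && !b->inlined_outer_scope)
    b = b->supercontext;
  return b;
}
```

If the supercontext links could skip an outer scope, this walk would land on the wrong inlined function (or on none), which is why the asserts in the experiment above matter.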

After the above loop runs, the BLOCK_SUBBLOCKS->BLOCK_CHAIN list contains
only blocks with TREE_USED.
Blocks with !TREE_USED are removed from the SUBBLOCKS->CHAIN list, but
still have a valid BLOCK_SUPERCONTEXT.  However, their BLOCK_CHAIN and
BLOCK_SUBBLOCKS are not used any more and could theoretically be misused
for something else, but fortunately that is not necessary.

I think that result suggests that the proposed patch does the right thing,
already as-is.


Do you agree?


Thanks
Bernd.

Re: [PATCH] [WIP] openmp: Add OpenMP 5.0 task detach clause support

2020-12-09 Thread Kwok Cheung Yeung


On 09/12/2020 5:53 pm, Jakub Jelinek wrote:
> On Wed, Dec 09, 2020 at 05:37:24PM +, Kwok Cheung Yeung wrote:
>> I believe this patch is largely complete now. I have done a bootstrap on
>> x86_64 and run the testsuites with no regressions. I have also run the
>> libgomp testsuite with offloading to Nvidia and AMD GCN devices, also with
>> no regressions. Is this patch okay for trunk (or would it be more
>> appropriate to wait until GCC 11 is branched off)?
>
> I think it is desirable for GCC 11, doesn't need to be deferred, and sorry
> it is taking me so long.  I've paged in the standard wording related to this
> yesterday and hoped I'd look at this, but didn't manage, will try to do that
> tomorrow or worst case on Friday.


No problem :-), and thanks for looking at the patch.

Kwok

Re: [PATCH] [WIP] openmp: Add OpenMP 5.0 task detach clause support

2020-12-09 Thread Jakub Jelinek via Gcc-patches

On Wed, Dec 09, 2020 at 05:37:24PM +, Kwok Cheung Yeung wrote:
> I believe this patch is largely complete now. I have done a bootstrap on
> x86_64 and run the testsuites with no regressions. I have also run the
> libgomp testsuite with offloading to Nvidia and AMD GCN devices, also with
> no regressions. Is this patch okay for trunk (or would it be more
> appropriate to wait until GCC 11 is branched off)?

I think it is desirable for GCC 11, doesn't need to be deferred, and sorry
it is taking me so long.  I've paged in the standard wording related to this
yesterday and hoped I'd look at this, but didn't manage, will try to do that
tomorrow or worst case on Friday.

Jakub

Re: [PATCH] [WIP] openmp: Add OpenMP 5.0 task detach clause support

2020-12-09 Thread Kwok Cheung Yeung


Hello

This is a further update of the patch for task detach support.

- The memory for the event is not mapped on the target. This means that if 
omp_fulfill_event is called from an 'omp target' section with a target that 
does not share memory with the host, the event will not be fulfilled (and a 
segfault will probably occur).


I was thinking of something along the lines of:

#pragma omp task detach (event)
{
}

#pragma omp target
{
   omp_fulfill_event (event);
}

Would something like this be expected to work? I cannot find many examples of 
the detach clause online, and none of them use any offloading constructs.


I have asked on the omp-lang mailing list - this is not expected to work.

- The tasks awaiting event fulfillment currently wait until there are no other 
runnable tasks left. A better approach would be to poll (without blocking) the 
waiting tasks whenever any task completes, immediately removing any 
now-complete tasks and requeuing any dependent tasks.


This has now been implemented. On every iteration of the main loop in 
gomp_barrier_handle_tasks, it first checks to see if any tasks in the detach 
queue have a fulfilled completion event, and if so it will remove the task and 
requeue any dependent tasks.
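
The polling step described above can be sketched as a non-blocking pass over a queue. This is a toy model, not the libgomp implementation: the task struct, the use of integer ids for dependents and the container types are assumptions made for the example, and all locking is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Toy model of a detached task waiting for its completion event.
struct task
{
  int id;
  bool event_fulfilled;          // set once the completion event is fulfilled
  std::vector<int> dependents;   // ids of tasks waiting on this one
};

// Non-blocking poll: remove every task whose event has been fulfilled and
// push its dependents onto the run queue.  Returns the number of tasks
// removed from the detach queue.
std::size_t
poll_detach_queue (std::deque<task> &detach_queue, std::deque<int> &run_queue)
{
  std::size_t completed = 0;
  for (auto it = detach_queue.begin (); it != detach_queue.end ();)
    {
      if (it->event_fulfilled)
        {
          for (int dep : it->dependents)
            run_queue.push_back (dep);
          it = detach_queue.erase (it);
          ++completed;
        }
      else
        ++it;
    }
  return completed;
}
```

Because the pass never waits on an unfulfilled event, a thread running it stays free to pick up other runnable tasks, including the one that will eventually fulfil the event.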




I have found another problem with the original blocking approach when the tasks 
are on offload devices. On Nvidia and GCN, a bar.sync/s_barrier instruction is 
issued when gomp_team_barrier_wake is called to synchronise the threads. 
However, if some of the barrier threads are stuck waiting for semaphores 
associated with completion events, and the fulfillment of those events is in 
other tasks waiting to run, then the result is a deadlock as the threads cannot 
synchronise without all the semaphores being released.


I have removed the blocking path on gomp_barrier_handle_tasks altogether, and 
omp_fulfill_event now directly wakes the barrier threads to process any tasks 
that are now complete.


I have also ensured that the event handle specified on the detach clause is 
firstprivate by default on enclosing scopes.


I believe this patch is largely complete now. I have done a bootstrap on x86_64 
and run the testsuites with no regressions. I have also run the libgomp 
testsuite with offloading to Nvidia and AMD GCN devices, also with no 
regressions. Is this patch okay for trunk (or would it be more appropriate to 
wait until GCC 11 is branched off)?


Thanks

Kwok
commit 3d82db0fc3623e9dc241bed4c4cfd266574d45e7
Author: Kwok Cheung Yeung 
Date:   Wed Dec 9 09:33:46 2020 -0800

openmp: Add support for the OpenMP 5.0 task detach clause

2020-12-09  Kwok Cheung Yeung  

gcc/
* builtin-types.def (BT_PTR_SIZED_INT): New primitive type.
(BT_FN_PSINT_VOID): New function type.
(BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT): Rename
to...
(BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT_PSINT):
...this.  Add extra argument.
* gimplify.c (gimplify_scan_omp_clauses): Handle OMP_CLAUSE_DETACH.
(gimplify_adjust_omp_clauses): Likewise.
* omp-builtins.def (BUILT_IN_GOMP_TASK): Change function type to
BT_FN_VOID_OMPFN_PTR_OMPCPYFN_LONG_LONG_BOOL_UINT_PTR_INT_PSINT.
(BUILT_IN_GOMP_NEW_EVENT): New.
* omp-expand.c (expand_task_call): Add detach argument when generating
call to GOMP_task.
* omp-low.c (scan_sharing_clauses): Setup data environment for detach
clause.
(lower_detach_clause): New.
(lower_omp_taskreg): Call lower_detach_clause for detach clause.  Add
Gimple statements generated for detach clause.
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_DETACH.
* tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE_DETACH.
* tree.c (omp_clause_num_ops): Add entry for OMP_CLAUSE_DETACH.
(omp_clause_code_name): Add entry for OMP_CLAUSE_DETACH.
(walk_tree_1): Handle OMP_CLAUSE_DETACH.
* tree.h (OMP_CLAUSE_DETACH_EXPR): New.

gcc/c-family/
* c-pragma.h (pragma_omp_clause): Add PRAGMA_OMP_CLAUSE_DETACH.
Redefine PRAGMA_OACC_CLAUSE_DETACH.

gcc/c/
* c-parser.c (c_parser_omp_clause_detach): New.
(c_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_DETACH clause.
(OMP_TASK_CLAUSE_MASK): Add mask for PRAGMA_OMP_CLAUSE_DETACH.
* c-typeck.c (c_finish_omp_clauses): Handle PRAGMA_OMP_CLAUSE_DETACH
clause.

gcc/cp/
* parser.c (cp_parser_omp_clause_detach): New.
(cp_parser_omp_all_clauses): Handle PRAGMA_OMP_CLAUSE_DETACH.
(OMP_TASK_CLAUSE_MASK): Add mask for PRAGMA_OMP_CLAUSE_DETACH.
* semantics.c (finish_omp_clauses): Handle OMP_CLAUSE_DETACH clause.

gcc/fortran/
* dump-parse-tree.c (show_omp_clauses): Handle detach clause.
* frontend-passes.c (gfc_code_walker): Walk detach expression.
* gfortran.h (struct

[[PATCH] 3/3] aarch64: Retpoline (Spectre-V2 mitigation) for aarch64.

2020-12-09 Thread Christoph Müllner

The compiler option -mindirect-branch= converts indirect
branch-and-link-register and branch-register instructions according to .

The default is ``keep``, which keeps indirect branch-and-link-register and
branch-register instructions unmodified.

``thunk`` converts indirect branch-and-link-register/branch-register
instructions to a branch-and-link/branch to a function containing a retpoline
(to stop speculative execution) followed by a branch-register to the target.

``thunk-inline`` is similar to ``thunk``, but inlines the retpoline
before the branch-and-link-register/branch-register instruction.

``thunk-extern`` is also similar to ``thunk``, but does not insert the
functions containing the retpoline. When using this option, these functions
need to be provided in a separate object file. The retpoline functions exist
for each register and are named ``__aarch64_indirect_thunk_xN`` (N being the
register number).

It is also possible to override the indirect-branch setting for
individual functions using the function attribute ``indirect_branch``.
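
To make the scope of the option concrete, here is a minimal C++ sketch of the
kind of source that typically produces an indirect branch-and-link-register
instruction. It is not part of the patch. The names ``handler_fn``,
``dispatch``, ``dispatch_keep``, ``KEEP_INDIRECT`` and
``USE_INDIRECT_BRANCH_ATTR`` are made up for illustration, and the attribute
spelling is an assumption modelled on the string-argument form accepted by the
x86 back end. The macro expands to nothing unless explicitly enabled, so the
sketch builds with any compiler.

```cpp
#include <cassert>

// ASSUMPTION: the attribute takes a string argument as on x86.  Define
// USE_INDIRECT_BRANCH_ATTR only with a compiler that accepts it.
#ifdef USE_INDIRECT_BRANCH_ATTR
#define KEEP_INDIRECT __attribute__((indirect_branch("keep")))
#else
#define KEEP_INDIRECT
#endif

// Hypothetical handler type: a plain function pointer.
using handler_fn = int (*)(int);

int twice(int x) { return 2 * x; }
int negate(int x) { return -x; }

// The call through 'fn' is an indirect call.  On aarch64 it is normally
// emitted as "blr xN", which is what -mindirect-branch= would rewrite.
int dispatch(handler_fn fn, int arg) { return fn(arg); }

// Same call, but with the per-function override requested (if enabled).
KEEP_INDIRECT int dispatch_keep(handler_fn fn, int arg) { return fn(arg); }
```

Note that an optimising compiler may inline ``dispatch`` and turn the call into
a direct one when the target is known at compile time, in which case there is
no indirect branch left to convert.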

The actual retpoline instruction sequence, which prevents speculative
indirect branches, looks like this::

str x30, [sp, #-16]!
bl  101f
  100: //speculation trap
wfe
b   100b
  101: //do ROP
adr x30, 102f
ret
  102: //non-spec code
ldr x30, [sp], #16

This patch has been tested with the included testcases and various other
source bases (benchmarks, retpoline-patched arm64 kernel, etc.).
---
 gcc/config/aarch64/aarch64-opts.h |   9 +
 gcc/config/aarch64/aarch64.c  | 298 +-
 gcc/config/aarch64/aarch64.h  |   2 +
 gcc/config/aarch64/aarch64.opt|  20 ++
 gcc/doc/invoke.texi   |  20 +-
 .../gcc.target/aarch64/indirect-thunk-1.c |  25 ++
 .../gcc.target/aarch64/indirect-thunk-2.c |  26 ++
 .../gcc.target/aarch64/indirect-thunk-3.c |  26 ++
 .../gcc.target/aarch64/indirect-thunk-4.c |  27 ++
 .../gcc.target/aarch64/indirect-thunk-5.c |  25 ++
 .../gcc.target/aarch64/indirect-thunk-6.c |  26 ++
 .../aarch64/indirect-thunk-attr-1.c   |  28 ++
 .../aarch64/indirect-thunk-attr-2.c   |  27 ++
 .../aarch64/indirect-thunk-attr-3.c   |  29 ++
 .../aarch64/indirect-thunk-attr-4.c   |  28 ++
 .../aarch64/indirect-thunk-attr-5.c   |  24 ++
 .../aarch64/indirect-thunk-attr-6.c   |  24 ++
 .../aarch64/indirect-thunk-extern-1.c |  22 ++
 .../aarch64/indirect-thunk-extern-2.c |  23 ++
 .../aarch64/indirect-thunk-extern-3.c |  23 ++
 .../aarch64/indirect-thunk-extern-4.c |  24 ++
 .../aarch64/indirect-thunk-extern-5.c |  20 ++
 .../aarch64/indirect-thunk-extern-6.c |  21 ++
 .../aarch64/indirect-thunk-inline-1.c |  26 ++
 .../aarch64/indirect-thunk-inline-2.c |  26 ++
 .../aarch64/indirect-thunk-inline-3.c |  26 ++
 .../aarch64/indirect-thunk-inline-4.c |  27 ++
 .../aarch64/indirect-thunk-inline-5.c |  23 ++
 .../aarch64/indirect-thunk-inline-6.c |  24 ++
 29 files changed, 941 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-5.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-6.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-attr-1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-attr-2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-attr-3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-attr-4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-attr-5.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-attr-6.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-extern-1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-extern-2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-extern-3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-extern-4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-extern-5.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-extern-6.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-inline-1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-inline-2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-inline-3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-inline-4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/indirect-thunk-inline-5.c
 create mode

[[PATCH] 2/3] aarch64: Add C-function invocation for indirect branch pattern.

2020-12-09 Thread Christoph Müllner

aarch64 already uses a C-function for indirect calls
(aarch64_indirect_call_asm()). So let's add the same
abstraction for indirect branches.

This patch has no functional consequence.

gcc/
* config/aarch64/aarch64.c (aarch64_indirect_branch_asm): Add
function to output indirect branch instructions.
* config/aarch64/aarch64.md (indirect_jump): Invoke
aarch64_indirect_branch_asm() instead of outputting instructions
directly.
* config/aarch64/aarch64.md (sibcall_insn): Likewise.
* config/aarch64/aarch64.md (sibcall_value_insn): Likewise.
---
 gcc/config/aarch64/aarch64-protos.h | 1 +
 gcc/config/aarch64/aarch64.c| 7 +++
 gcc/config/aarch64/aarch64.md   | 6 +++---
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 2aa3f1fddaa..91ae8b7a0f9 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -802,6 +802,7 @@ extern const atomic_ool_names aarch64_ool_ldeor_names;
 tree aarch64_resolve_overloaded_builtin_general (location_t, tree, void *);
 
 const char *aarch64_sls_barrier (int);
+const char *aarch64_indirect_branch_asm (rtx);
 const char *aarch64_indirect_call_asm (rtx);
 extern bool aarch64_harden_sls_retbr_p (void);
 extern bool aarch64_harden_sls_blr_p (void);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 264ccb8beb2..4799679f9e5 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -23659,6 +23659,13 @@ aarch64_asm_file_end ()
 #endif
 }
 
+const char *
+aarch64_indirect_branch_asm (rtx addr)
+{
+  output_asm_insn ("br\t%0", &addr);
+  return "";
+}
+
 const char *
 aarch64_indirect_call_asm (rtx addr)
 {
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index eed06de3240..5cf660cc19f 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -471,7 +471,7 @@
   [(set (pc) (match_operand:DI 0 "register_operand" "r"))]
   ""
   {
-output_asm_insn ("br\\t%0", operands);
+aarch64_indirect_branch_asm (operands[0]);
 return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
   }
   [(set_attr "type" "branch")
@@ -1104,7 +1104,7 @@
   {
 if (which_alternative == 0)
   {
-   output_asm_insn ("br\\t%0", operands);
+   aarch64_indirect_branch_asm (operands[0]);
return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
   }
 return "b\\t%c0";
@@ -1124,7 +1124,7 @@
   {
 if (which_alternative == 0)
   {
-   output_asm_insn ("br\\t%1", operands);
+   aarch64_indirect_branch_asm (operands[1]);
return aarch64_sls_barrier (aarch64_harden_sls_retbr_p ());
   }
 return "b\\t%c1";
-- 
2.29.2

[[PATCH] 1/3] aarch64: Sanitize access to cfun in aarch64_declare_function_name()

2020-12-09 Thread Christoph Müllner

From: Christoph Muellner 

It is possible to call aarch64_declare_function_name() and
have cfun not set. Let's sanitize the access to this variable.

gcc/

* config/aarch64/aarch64.c (aarch64_declare_function_name):
Sanitize access to cfun.
---
 gcc/config/aarch64/aarch64.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 67ffba02d3e..264ccb8beb2 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -19317,7 +19317,8 @@ aarch64_declare_function_name (FILE *stream, const 
char* name,
   ASM_OUTPUT_TYPE_DIRECTIVE (stream, name, "function");
   ASM_OUTPUT_LABEL (stream, name);
 
-  cfun->machine->label_is_assembled = true;
+  if (cfun)
+cfun->machine->label_is_assembled = true;
 }
 
 /* Implement PRINT_PATCHABLE_FUNCTION_ENTRY.  Check if the patch area is after
-- 
2.29.2

[PATCH,rs6000] Optimize pcrel access of globals [ping]

2020-12-09 Thread acsawdey--- via Gcc-patches

From: Aaron Sawdey 

Ping. I've folded in the changes to comments suggested by Will Schmidt.

This patch implements a RTL pass that looks for pc-relative loads of the
address of an external variable using the PCREL_GOT relocation and a
single load or store that uses that external address.

Produced by a cast of thousands:
 * Michael Meissner
 * Peter Bergner
 * Bill Schmidt
 * Alan Modra
 * Segher Boessenkool
 * Aaron Sawdey

Passes bootstrap/regtest on ppc64le power10. Should have no effect on
other processors. OK for trunk?

Thanks!
   Aaron

gcc/ChangeLog:

* config.gcc: Add pcrel-opt.c and pcrel-opt.o.
* config/rs6000/pcrel-opt.c: New file.
* config/rs6000/pcrel-opt.md: New file.
* config/rs6000/predicates.md: Add d_form_memory predicate.
* config/rs6000/rs6000-cpus.def: Add OPTION_MASK_PCREL_OPT.
* config/rs6000/rs6000-passes.def: Add pass_pcrel_opt.
* config/rs6000/rs6000-protos.h: Add reg_to_non_prefixed(),
offsettable_non_prefixed_memory(), output_pcrel_opt_reloc(),
and make_pass_pcrel_opt().
* config/rs6000/rs6000.c (reg_to_non_prefixed): Make global.
(rs6000_option_override_internal): Add pcrel-opt.
(rs6000_delegitimize_address): Support pcrel-opt.
(rs6000_opt_masks): Add pcrel-opt.
(offsettable_non_prefixed_memory): New function.
(reg_to_non_prefixed): Make global.
(rs6000_asm_output_opcode): Reset next_insn_prefixed_p.
(output_pcrel_opt_reloc): New function.
* config/rs6000/rs6000.md (loads_extern_addr): New attr.
(pcrel_extern_addr): Set loads_extern_addr.
Add include for pcrel-opt.md.
* config/rs6000/rs6000.opt: Add -mpcrel-opt.
* config/rs6000/t-rs6000: Add rules for pcrel-opt.c and
pcrel-opt.md.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pcrel-opt-inc-di.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-df.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-di.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-hi.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-qi.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-sf.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-si.c: New test.
* gcc.target/powerpc/pcrel-opt-ld-vector.c: New test.
* gcc.target/powerpc/pcrel-opt-st-df.c: New test.
* gcc.target/powerpc/pcrel-opt-st-di.c: New test.
* gcc.target/powerpc/pcrel-opt-st-hi.c: New test.
* gcc.target/powerpc/pcrel-opt-st-qi.c: New test.
* gcc.target/powerpc/pcrel-opt-st-sf.c: New test.
* gcc.target/powerpc/pcrel-opt-st-si.c: New test.
* gcc.target/powerpc/pcrel-opt-st-vector.c: New test.
---
 gcc/config.gcc|   6 +-
 gcc/config/rs6000/pcrel-opt.c | 888 ++
 gcc/config/rs6000/pcrel-opt.md| 386 
 gcc/config/rs6000/predicates.md   |  23 +
 gcc/config/rs6000/rs6000-cpus.def |   2 +
 gcc/config/rs6000/rs6000-passes.def   |   8 +
 gcc/config/rs6000/rs6000-protos.h |   4 +
 gcc/config/rs6000/rs6000.c| 116 ++-
 gcc/config/rs6000/rs6000.md   |   8 +-
 gcc/config/rs6000/rs6000.opt  |   4 +
 gcc/config/rs6000/t-rs6000|   7 +-
 .../gcc.target/powerpc/pcrel-opt-inc-di.c |  18 +
 .../gcc.target/powerpc/pcrel-opt-ld-df.c  |  36 +
 .../gcc.target/powerpc/pcrel-opt-ld-di.c  |  43 +
 .../gcc.target/powerpc/pcrel-opt-ld-hi.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-ld-qi.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-ld-sf.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-ld-si.c  |  41 +
 .../gcc.target/powerpc/pcrel-opt-ld-vector.c  |  36 +
 .../gcc.target/powerpc/pcrel-opt-st-df.c  |  36 +
 .../gcc.target/powerpc/pcrel-opt-st-di.c  |  37 +
 .../gcc.target/powerpc/pcrel-opt-st-hi.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-st-qi.c  |  42 +
 .../gcc.target/powerpc/pcrel-opt-st-sf.c  |  36 +
 .../gcc.target/powerpc/pcrel-opt-st-si.c  |  41 +
 .../gcc.target/powerpc/pcrel-opt-st-vector.c  |  36 +
 26 files changed, 2013 insertions(+), 9 deletions(-)
 create mode 100644 gcc/config/rs6000/pcrel-opt.c
 create mode 100644 gcc/config/rs6000/pcrel-opt.md
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-inc-di.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-df.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-di.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-hi.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-qi.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-sf.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-si.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-vector.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-st-df.c
 create

Re: Problem building libstdc++ for the avr target

2020-12-09 Thread Jonathan Wakely via Gcc-patches


On 09/12/20 12:49 +, Jonathan Wakely wrote:

On 09/12/20 13:32 +0100, Vladimir V wrote:

Hello.

While testing with the current upstream I encountered a compilation issue.
Although I build with "--disable-threads" flag the following error occurs:

../../../../../libstdc++-v3/src/c++11/thread.cc:39:4: error: #error "No
sleep function known for this target"

Previously the check was inside the  #ifdef _GLIBCXX_HAS_GTHREADS that
prevented the error from happening (in my case with gcc v10.1),
So I would like to ask if the thread.cc should be involved in the build if
the threads support is configured to be disabled?


Yes, the file is always built, but which definitions it contains
depends on what is configured for the target.

The std::this_thread::sleep_for and std::this_thread::sleep_until
functions don't actually depend on threads at all. They just sleep.

But that still requires target support, just different support from
threads.


And if it should, then can the condition be reworked to cover the described
case?


Yes, I'll do that. Thanks for bringing it to my attention.

I assume we can't use avr-libc's delay functions, because they depend
on the CPU clock frequency, which isn't known when we compile
libstdc++. So I'll just suppress the declarations of those functions
and remove the #error.


The attached patch adds a new _GLIBCXX_NO_SLEEP configure macro which
should get defined for your hosted AVR build. That should mean that
std::this_thread::sleep_for is not defined, and src/c++11/thread.cc
will no longer insist on some way to sleep being supported.
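
As a rough illustration of what this means for user code, the sketch below
sleeps without creating any threads and compiles the call out when the macro
is defined. The helper name ``sleep_and_measure`` is hypothetical, and testing
``_GLIBCXX_NO_SLEEP`` directly in user code is only an assumption about how
one might guard such a call; it is not something the patch asks users to do.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Sleeps for 'd' and returns the elapsed time in milliseconds, or -1 when
// the library was configured without any way to sleep.
long long sleep_and_measure(std::chrono::milliseconds d)
{
#ifndef _GLIBCXX_NO_SLEEP
  auto start = std::chrono::steady_clock::now();
  // No std::thread object is created anywhere: sleep_for only needs a
  // target-provided sleep primitive, not thread support.
  std::this_thread::sleep_for(d);
  auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start)
      .count();
#else
  (void) d;
  return -1;
#endif
}
```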

I've only tested this on powerpc64le-linux, so please let me know if
it works for you.

Pushed to master.


commit 0aa1786d34b891c8e1e219fb11255af5358013c4
Author: Jonathan Wakely 
Date:   Wed Dec 9 16:53:18 2020

libstdc++: Fix build failure for target with no way to sleep

In previous releases the std::this_thread::sleep_for function was only
declared if the target supports multiple threads. I changed that
recently in r11-2649-g5bbb1f3000c57fd4d95969b30fa0e35be6d54ffb so that
sleep_for could be used single-threaded. But that means that targets
using --disable-threads are now required to provide some way to sleep.
This breaks the build for (at least) AVR when trying to build a hosted
library.

This patch adds a new autoconf macro that is defined when no way to
sleep is available, and uses that to suppress the sleeping functions in
std::this_thread.

The #error in src/c++11/thread.cc is retained for the case where there
is no sleep function available but multiple threads are supported. This
is consistent with previous releases, but that #error could probably be
removed without any consequences.

libstdc++-v3/ChangeLog:

* acinclude.m4 (GLIBCXX_ENABLE_LIBSTDCXX_TIME): Define NO_SLEEP
if none of nanosleep, sleep and Sleep is available.
* config.h.in: Regenerate.
* configure: Regenerate.
* include/std/thread [_GLIBCXX_NO_SLEEP] (__sleep_for): Do
not declare.
[_GLIBCXX_NO_SLEEP] (sleep_for, sleep_until): Do not
define.
* src/c++11/thread.cc [_GLIBCXX_NO_SLEEP] (__sleep_for): Do
not define.

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index fcd9ea3d23a..61191812c92 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -1626,16 +1626,22 @@ AC_DEFUN([GLIBCXX_ENABLE_LIBSTDCXX_TIME], [
   fi
 
   if test x"$ac_has_nanosleep$ac_has_sleep" = x"nono"; then
+  ac_no_sleep=yes
   AC_MSG_CHECKING([for Sleep])
   AC_TRY_COMPILE([#include ],
  [Sleep(1)],
  [ac_has_win32_sleep=yes],[ac_has_win32_sleep=no])
   if test x"$ac_has_win32_sleep" = x"yes"; then
 AC_DEFINE(HAVE_WIN32_SLEEP,1, [Defined if Sleep exists.])
+	ac_no_sleep=no
   fi
   AC_MSG_RESULT($ac_has_win32_sleep)
   fi
 
+  if test x"$ac_no_sleep" = x"yes"; then
+AC_DEFINE(NO_SLEEP,1, [Defined if no way to sleep is available.])
+  fi
+
   AC_SUBST(GLIBCXX_LIBS)
 
   CXXFLAGS="$ac_save_CXXFLAGS"
diff --git a/libstdc++-v3/include/std/thread b/libstdc++-v3/include/std/thread
index 6ea8a51c0cf..8d0ede2b6c2 100644
--- a/libstdc++-v3/include/std/thread
+++ b/libstdc++-v3/include/std/thread
@@ -122,8 +122,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*/
   namespace this_thread
   {
+#ifndef _GLIBCXX_NO_SLEEP
+
+#ifndef _GLIBCXX_USE_NANOSLEEP
 void
 __sleep_for(chrono::seconds, chrono::nanoseconds);
+#endif
 
 /// this_thread::sleep_for
 template
@@ -168,7 +172,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	__now = _Clock::now();
 	  }
   }
-  }
+  } // namespace this_thread
+#endif // ! NO_SLEEP
 
 #ifdef __cpp_lib_jthread
 
diff --git a/libstdc++-v3/src/c++11/thread.cc b/libstdc++-v3/src/c++11/thread.cc
index a9c92804959..62f6ddcd802 100644
--- a/libstdc++-v3/src/c++11/thread.cc
+

Re: [PATCH 1/2] libstdc++: Add --enable-pure-stdio-libstdcxx option

2020-12-09 Thread Jonathan Wakely via Gcc-patches


On 09/12/20 08:32 -0800, Keith Packard via Libstdc++ wrote:

Jonathan Wakely  writes:


OK. In principle, changes to avoid using the POSIX APIs are definitely
fine. I would like to combine your new configure switch with the
existing --enable-cstdio one though.

How about the attached change for acinclude.m4 which would allow you
to do --enable-cstdio=stdio_pure? (It also adds "stdio_posix" as a
more accurate alternative spelling of the current "stdio" option.)


Oh, that's very nice. I'll integrate that into my patch and send it
along.


Great. We already have a *lot* of configure options, so piggybacking
onto an existing one that's related seems preferable to adding Yet
Another of them.

Re: RFC: ARM MVE and Neon auto-vectorization

2020-12-09 Thread Richard Sandiford via Gcc-patches

Christophe Lyon via Gcc-patches  writes:
> Hi,
>
> I've been working for a while on enabling auto-vectorization for ARM
> MVE, and I find it a bit awkward to keep things common with Neon as
> much as possible.
>
> I've just sent a few patches for logical operators
> (vand/vorr/veor/vbic), and I have a few more WIP patches where I
> struggle to avoid duplication.
>
> For example, vneg is supported in different modes by MVE and Neon:
> * Neon: VDQ and VH iterators: V8QI V16QI V4HI V8HI V2SI V4SI V4HF V8HF
> V2SF V4SF V2DI  and V8HF V4HF
> * MVE: MVE_2 and MVE_0 iterators: V16QI V8HI V4SI and V8HF V4SF

My hope behind the ARM_HAVE__ macros was that the common
(optab) define_expand could use those, with the most permissive iterator
necessary.  We could stick on a "&& !TARGET_IWMMXT" for things that
aren't implemented for iwMMXt.

The above combination seems like a natural fit for unmodified
VDQ with ARM_HAVE__ARITH.  This would be similar to the
existing add3 pattern.

> My 'vand' patch changes the definition of VDQ so that the relevant
> modes are enabled only when !TARGET_HAVE_MVE (V8QI, ...), and this
> helps writing a simpler expander.
>
> However, vneg is used by vshr (right-shifts by register are
> implemented as left-shift by negation of that register), so the
> expander uses something like:
>
>   emit_insn (gen_neg2 (neg, operands[2]));
>   if (TARGET_NEON)
>   emit_insn (gen_ashl3_signed (operands[0], operands[1], neg));
>   else
>   emit_insn (gen_mve_vshlq_s (operands[0], operands[1], neg));
>
> which does not work if the iterator has conditional members: the
> 'else' part is still generated for  unsupported by MVE.

FWIW, I agree with Andre that it would be good to remove unnecessary
NEON/MVE differences like this.

Another technique that can be used where necessary is to convert:

  gen_foo (args)

to:

  gen_foo (mode, args)

and add a @ to the start of the definition of pattern "foo".

Thanks,
Richard

Re: [PATCH 1/2] libstdc++: Add --enable-pure-stdio-libstdcxx option

2020-12-09 Thread Keith Packard via Gcc-patches

Jonathan Wakely  writes:

> OK. In principle, changes to avoid using the POSIX APIs are definitely
> fine. I would like to combine your new configure switch with the
> existing --enable-cstdio one though.
>
> How about the attached change for acinclude.m4 which would allow you
> to do --enable-cstdio=stdio_pure? (It also adds "stdio_posix" as a
> more accurate alternative spelling of the current "stdio" option.)

Oh, that's very nice. I'll integrate that into my patch and send it
along.

-- 
-keith



[PATCH] tree-optimization/98213 - cache PHI walking result in SM

2020-12-09 Thread Richard Biener

This avoids exponential work when walking PHIs in loop store motion.
Fails are quickly propagated and thus need no caching.
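
The effect of the cache can be shown with a small stand-alone sketch. This is
not the code from tree-ssa-loop-im.c: ``Node``, ``walk`` and
``make_diamond_chain`` are hypothetical names, and the graph is a toy model in
which every node has two incoming edges from the same predecessor, loosely
mimicking a chain of two-argument PHIs. Only successful walks are recorded,
since a failure returns immediately.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for a PHI: the indices of the nodes its arguments come from.
struct Node { std::vector<int> preds; };

// Walks all predecessors of node 'n'.  If 'fully_visited' is non-null it is
// used as a cache of nodes whose walk already succeeded.  'visits' counts
// how many times a node body is actually processed.
bool walk(const std::vector<Node>& g, int n,
          std::vector<bool>* fully_visited, std::uint64_t& visits)
{
  if (fully_visited && (*fully_visited)[n])
    return true;                       // cached success, nothing to redo
  ++visits;
  for (int p : g[n].preds)
    if (!walk(g, p, fully_visited, visits))
      return false;                    // failures propagate, never cached
  if (fully_visited)
    (*fully_visited)[n] = true;
  return true;
}

// Builds nodes 0..depth where node i has two edges from node i-1.
std::vector<Node> make_diamond_chain(int depth)
{
  std::vector<Node> g(depth + 1);
  for (int i = 1; i <= depth; ++i)
    g[i].preds = {i - 1, i - 1};
  return g;
}
```

With a chain of depth 20, the uncached walk processes 2^21 - 1 nodes while the
cached walk processes 21.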

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2020-12-09  Richard Biener  

PR tree-optimization/98213
* tree-ssa-loop-im.c (sm_seq_valid_bb): Cache successfully
processed PHIs.
(hoist_memory_references): Adjust.

* g++.dg/pr98213.C: New testcase.
---
 gcc/testsuite/g++.dg/pr98213.C | 24 
 gcc/tree-ssa-loop-im.c | 20 ++--
 2 files changed, 38 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/pr98213.C

diff --git a/gcc/testsuite/g++.dg/pr98213.C b/gcc/testsuite/g++.dg/pr98213.C
new file mode 100644
index 000..1a744eb2a3e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr98213.C
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+#include 
+
+long var_23;
+int var_24, test_var_8;
+extern bool arr_20[][13];
+char arr_21_0_0_0_0_0;
+int *test_arr_0;
+void test(unsigned long long var_1)
+{
+  int arr_16;
+  for (int i_0 = 0;;)
+for (int i_5; i_5;) {
+  for (int i_6 = 0; i_6 < 19; i_6 += 4)
+for (long i_7(test_var_8); i_7; i_7 += 2) {
+  arr_20[0][i_7] = arr_21_0_0_0_0_0 = 0;
+  var_23 = test_arr_0[0];
+}
+  var_24 = std::max((unsigned long long)arr_16,
+std::min((unsigned long long)5, var_1));
+}
+}
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 92e5a8dd774..fe48d02242d 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -2254,7 +2254,8 @@ sm_seq_push_down (vec &seq, unsigned ptr, 
unsigned *at)
 static int
 sm_seq_valid_bb (class loop *loop, basic_block bb, tree vdef,
 vec &seq, bitmap refs_not_in_seq,
-bitmap refs_not_supported, bool forked)
+bitmap refs_not_supported, bool forked,
+bitmap fully_visited)
 {
   if (!vdef)
 for (gimple_stmt_iterator gsi = gsi_last_bb (bb); !gsi_end_p (gsi);
@@ -2276,7 +2277,7 @@ sm_seq_valid_bb (class loop *loop, basic_block bb, tree 
vdef,
/* This handles the perfect nest case.  */
return sm_seq_valid_bb (loop, single_pred (bb), vdef,
seq, refs_not_in_seq, refs_not_supported,
-   forked);
+   forked, fully_visited);
   return 0;
 }
   do
@@ -2314,7 +2315,10 @@ sm_seq_valid_bb (class loop *loop, basic_block bb, tree 
vdef,
return sm_seq_valid_bb (loop, gimple_phi_arg_edge (phi, 0)->src,
gimple_phi_arg_def (phi, 0), seq,
refs_not_in_seq, refs_not_supported,
-   false);
+   false, fully_visited);
+ if (bitmap_bit_p (fully_visited,
+   SSA_NAME_VERSION (gimple_phi_result (phi
+   return 1;
  auto_vec first_edge_seq;
  auto_bitmap tem_refs_not_in_seq (&lim_bitmap_obstack);
  int eret;
@@ -2323,7 +2327,7 @@ sm_seq_valid_bb (class loop *loop, basic_block bb, tree 
vdef,
  gimple_phi_arg_def (phi, 0),
  first_edge_seq,
  tem_refs_not_in_seq, refs_not_supported,
- true);
+ true, fully_visited);
  if (eret != 1)
return -1;
  /* Simplify our lives by pruning the sequence of !sm_ord.  */
@@ -2338,7 +2342,7 @@ sm_seq_valid_bb (class loop *loop, basic_block bb, tree 
vdef,
  bitmap_copy (tem_refs_not_in_seq, refs_not_in_seq);
  eret = sm_seq_valid_bb (loop, e->src, vuse, edge_seq,
  tem_refs_not_in_seq, refs_not_supported,
- true);
+ true, fully_visited);
  if (eret != 1)
return -1;
  /* Simplify our lives by pruning the sequence of !sm_ord.  */
@@ -2419,6 +2423,8 @@ sm_seq_valid_bb (class loop *loop, basic_block bb, tree 
vdef,
  seq[new_idx].from = NULL_TREE;
}
}
+ bitmap_set_bit (fully_visited,
+ SSA_NAME_VERSION (gimple_phi_result (phi)));
  return 1;
}
   lim_aux_data *data = get_lim_data (def);
@@ -2494,9 +2500,11 @@ hoist_memory_references (class loop *loop, bitmap 
mem_refs,
   seq.create (4);
   auto_bitmap refs_not_in_seq (&lim_bitmap_obstack);
   bitmap_copy (refs_not_in_seq, mem_refs);
+  auto_bitmap fully_visited;
   int res = sm_seq_valid_bb (loop, e->src, NULL_TREE,
 seq, refs_not_in_seq,
-refs_not_supported, false);
+refs_not_supported, false,
+

Re: How to traverse all the local variables that declared in the current routine?

2020-12-09 Thread Qing Zhao via Gcc-patches




> On Dec 9, 2020, at 9:12 AM, Richard Biener  wrote:
> 
> On Wed, Dec 9, 2020 at 4:04 PM Qing Zhao  > wrote:
>> 
>> 
>> 
>> On Dec 9, 2020, at 2:23 AM, Richard Biener  
>> wrote:
>> 
>> On Tue, Dec 8, 2020 at 8:54 PM Qing Zhao  wrote:
>> 
>> 
>> 
>> 
>> On Dec 8, 2020, at 1:40 AM, Richard Biener  
>> wrote:
>> 
>> On Mon, Dec 7, 2020 at 5:20 PM Qing Zhao  wrote:
>> 
>> 
>> 
>> 
>> On Dec 7, 2020, at 1:12 AM, Richard Biener  
>> wrote:
>> 
>> On Fri, Dec 4, 2020 at 5:19 PM Qing Zhao  wrote:
>> 
>> 
>> 
>> 
>> On Dec 4, 2020, at 2:50 AM, Richard Biener  
>> wrote:
>> 
>> On Thu, Dec 3, 2020 at 6:33 PM Richard Sandiford
>>  wrote:
>> 
>> 
>> Richard Biener via Gcc-patches  writes:
>> 
>> On Tue, Nov 24, 2020 at 4:47 PM Qing Zhao  wrote:
>> 
>> Another issue is, in order to check whether an auto-variable has 
>> initializer, I plan to add a new bit in “decl_common” as:
>> /* In a VAR_DECL, this is DECL_IS_INITIALIZED.  */
>> unsigned decl_is_initialized :1;
>> 
>> /* IN VAR_DECL, set when the decl is initialized at the declaration.  */
>> #define DECL_IS_INITIALIZED(NODE) \
>> (DECL_COMMON_CHECK (NODE)->decl_common.decl_is_initialized)
>> 
>> set this bit when setting DECL_INITIAL for the variables in FE. then keep it
>> even though DECL_INITIAL might be NULLed.
>> 
>> 
>> For locals it would be more reliable to set this flag during gimplification.
>> 
>> Do you have any comment and suggestions?
>> 
>> 
>> As said above - do you want to cover registers as well as locals?  I'd do
>> the actual zeroing during RTL expansion instead since otherwise you
>> have to figure out yourself whether a local is actually used (see 
>> expand_stack_vars)
>> 
>> Note that optimization will already have made use of "uninitialized" state
>> of locals so depending on what the actual goal is here "late" may be too 
>> late.
>> 
>> 
>> Haven't thought about this much, so it might be a daft idea, but would a
>> compromise be to use a const internal function:
>> 
>> X1 = .DEFERRED_INIT (X0, INIT)
>> 
>> where the X0 argument is an uninitialised value and the INIT argument
>> describes the initialisation pattern?  So for a decl we'd have:
>> 
>> X = .DEFERRED_INIT (X, INIT)
>> 
>> and for an SSA name we'd have:
>> 
>> X_2 = .DEFERRED_INIT (X_1(D), INIT)
>> 
>> with all other uses of X_1(D) being replaced by X_2.  The idea is that:
>> 
>> * Having the X0 argument would keep the uninitialised use of the
>> variable around for the later warning passes.
>> 
>> * Using a const function should still allow the UB to be deleted as dead
>> if X1 isn't needed.
>> 
>> * Having a function in the way should stop passes from taking advantage
>> of direct uninitialised uses for optimisation.
>> 
>> This means we won't be able to optimise based on the actual init
>> value at the gimple level, but that seems like a fair trade-off.
>> AIUI this is really a security feature or anti-UB hardening feature
>> (in the sense that users are more likely to see predictable behaviour
>> “in the field” even if the program has UB).
>> 
>> 
>> The question is whether it's in line with people's expectation that
>> explicitly zero-initialized code behaves differently from
>> implicitly zero-initialized code with respect to optimization
>> and secondary side-effects (late diagnostics, latent bugs, etc.).
>> 
>> Introducing a new concept like .DEFERRED_INIT is much more
>> heavy-weight than an explicit zero initializer.
>> 
>> 
>> What exactly do you mean by “heavy-weight”? More difficult to implement or much 
>> more run-time overhead or both? Or something else?
>> 
>> The major benefit of the “.DEFERRED_INIT” approach is that it enables us to keep 
>> the current -Wuninitialized analysis untouched and also pass
>> the “uninitialized” info from source code level to “pass_expand”.
>> 
>> 
>> Well, "untouched" is a bit oversimplified.  You do need to handle
>> .DEFERRED_INIT as not
>> being an initialization which will definitely get interesting.
>> 
>> 
>> Yes, during uninitialized variable analysis pass, we should specially handle 
>> the defs with “.DEFERRED_INIT”, to treat them as uninitializations.
>> 
>> If we want to keep the current -Wuninitialized analysis untouched, this is a 
>> quite reasonable approach.
>> 
>> However, if it’s not required to keep the current -Wuninitialized analysis 
>> untouched, adding zero-initializer directly during gimplification should
>> be much easier and simpler, and also smaller run-time overhead.
>> 
>> 
>> As for optimization I fear you'll get a load of redundant zero-init
>> actually emitted if you can just rely on RTL DSE/DCE to remove it.
>> 
>> 
>> Runtime overhead for -fauto-init=zero is one important consideration for the 
>> whole feature, we should minimize the runtime overhead for zero
>> Initialization since it will be used in production build.
>> We can do some run-time performance evaluation when we have an 
>> implementation ready.
>> 
>> 
>> Note there will be other passes "confused" by .DEFERRED_

Re: [PATCH 30/31] PR target/95294: VAX: Convert backend to MODE_CC representation

2020-12-09 Thread Maciej W. Rozycki

On Fri, 20 Nov 2020, Maciej W. Rozycki wrote:

> Outliers:
> 
> old     new     change  %change filename
> 
> 2406    2950    +544    +22.610 20111208-1.exe
> 4314    5329    +1015   +23.528 pr39417.exe
> 2235    3055    +820    +36.689 990404-1.exe
> 2631    4213    +1582   +60.129 pr57521.exe
> 3063    5579    +2516   +82.142 2422-1.exe

 So as a matter of interest I have actually looked into the worst offender 
shown above.  As I have just learnt by chance, GNU `size' in its default 
BSD mode reports code combined with rodata as text, so I have rerun it in 
the recently introduced GNU mode with `2422-1.exe' and the results are 
worse yet:

   text   data    bss  total filename
-  1985   1466     68   3519 ./2422-1.exe
+  4501   1466     68   6035 ./2422-1.exe

However upon actual inspection code produced looks sound and it's just 
that the loops the test case has get unrolled further.  With speed 
optimisation this is not necessarily bad.  Mind that the options used for 
this particular compilation are `-O3 -fomit-frame-pointer -funroll-loops 
-fpeel-loops -ftracer -finline-functions' meaning that, well, ahem, we do 
want loops to get unrolled and we want speed rather than size.  So all is 
well.

 I have tried to run some benchmarking with this test case, by putting all 
the files involved in tmpfs (i.e. RAM) so as to limit any I/O influence 
and looping the executable 1000 times, which yielded elapsed times of 
around 340s, i.e. with a good resolution, but results are inconclusive and 
the execution time oscillates on individual runs around the value shown, 
regardless of whether this change has been applied or not.

 So I have to conclude that either both variants of code are virtually 
equivalent in terms of performance or that the test environment keeps 
execution I/O-bound despite the measures I have taken.  Sadly the VAX 
architecture does not provide a user-accessible cycle counter (that I know 
of) that could be used for more accurate measurements, and I do not feel 
right now like fiddling with the code of the test case any further so as 
to make it more suited for performance evaluation.

 I have to note however on this occasion that this part of the change:

-(define_insn "*cmp"
-  [(set (cc0)
-   (compare (match_operand:VAXint 0 "nonimmediate_operand" "nrmT,nrmT")
-(match_operand:VAXint 1 "general_operand" "I,nrmT")))]
+(define_insn "*cmp_"
+  [(set (reg:VAXcc VAX_PSL_REGNUM)
+   (compare:VAXcc (match_operand:VAXint 0 "general_operand" "nrmT,nrmT")
+  (match_operand:VAXint 1 "general_operand" "I,nrmT")))]

which allowed an immediate with operand #0 has improved code generation a 
little bit with this test case as well, because rather than this:

clrl %r0
[...]
cmpl %r0,%r1
[...]
cmpl %r0,%r3
[...]

this:

cmpl $0,%r1
[...]
cmpl $0,%r3
[...]

is produced, which does not waste a register to hold the value of 0 which 
can be supplied in the literal addressing mode, i.e. with the operand 
specifier byte itself just like with register operands, and therefore does 
not require extra space or execution time.

 I don't know however why the middle end insists on supplying constant 0 
as operand #0 to the comparison operation (or the `cbranch4' insn it has 
originated from).  While we have machine support for such a comparison, 
having constant 0 supplied as operand #1 would permit the use of the TST 
instruction, one byte shorter.  Of course that would require reversing the 
condition of any branches using the output of the comparison, but unlike 
typical RISC ISAs the VAX ISA supports all the conditions as does our MD.

 Oddly enough making constant 0 more expensive in operand #0 than in 
operand #1 for comparison operations or COMPARE does not persuade the 
middle end to try and swap the operands, and making the `cbranch4' insns 
reject an immediate in operand #0 only makes reload put it back in a 
register.  All this despite COMPARE documentation saying:

"If one of the operands is a constant, it should be placed in the
 second operand and the comparison code adjusted as appropriate."

So this looks like a missed optimisation and something to investigate at 
one point.
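
 To illustrate the canonicalisation the COMPARE documentation asks for, 
here is a minimal sketch in Python (not GCC code).  It models a comparison 
as a (code, op0, op1) triple, with integers standing in for constants and 
strings for registers; the names SWAPPED, HOLDS, canonicalize and evaluate 
are made up for this example.  Note that moving the constant calls for the 
swapped condition (lt becomes gt), not the logically negated one.

```python
import operator

# Condition to use once the two operands have traded places.
SWAPPED = {"eq": "eq", "ne": "ne", "lt": "gt", "le": "ge", "gt": "lt", "ge": "le"}

# Plain-Python meaning of each condition code, used only for checking.
HOLDS = {"eq": operator.eq, "ne": operator.ne, "lt": operator.lt,
         "le": operator.le, "gt": operator.gt, "ge": operator.ge}


def canonicalize(code, op0, op1):
    """Put a constant first operand second, adjusting the condition code.

    Hypothetical model: an int is a constant, a str is a register name.
    """
    if isinstance(op0, int) and not isinstance(op1, int):
        return SWAPPED[code], op1, op0
    return code, op0, op1


def evaluate(code, op0, op1, regs):
    """Evaluate a comparison given register contents in the dict REGS."""
    def value(op):
        return op if isinstance(op, int) else regs[op]
    return HOLDS[code](value(op0), value(op1))
```

With constant 0 moved to operand #1 this way the comparison has the shape 
that a TST-style pattern could match, at least in this simplified model.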

 Also, interestingly, we have this comment in our MD:

;; The VAX move instructions have space-time tradeoffs.  On a MicroVAX
;; register-register mov instructions take 3 bytes and 2 CPU cycles.  clrl
;; takes 2 bytes and 3 cycles.  mov from constant to register takes 2 cycles
;; if the constant is smaller than 4 bytes, 3 cycles for a longword
;; constant.  movz, mneg, and mcom are as fast as mov, so movzwl is faster
;; than movl for positive constants that fit in 16 bits but not 6 bits.  cvt
;; instructions take 4 cycles.  inc takes 3 cycles.  The machine description
;; is willing to trade 1 byte for 1 cycle (clrl instead of m

[PATCH] data-ref: Rework integer handling in split_constant_offset [PR98069]

2020-12-09 Thread Richard Sandiford via Gcc-patches

PR98069 is about a case in which split_constant_offset miscategorises
an expression of the form:

  int foo;
  …
  POINTER_PLUS_EXPR

as:

  base: base
  offset: (sizetype) (-foo) * size
  init: INT_MIN * size

“-foo” overflows when “foo” is INT_MIN, whereas the original expression
didn't overflow in that case.

As discussed in the PR trail, we could simply ignore the fact that
int overflow is undefined and treat it as a wrapping type, but that
is likely to pessimise quite a few cases.

This patch instead reworks split_constant_offset so that:

- it treats integer operations as having an implicit cast to sizetype
- for integer operations, the returned VAR has type sizetype

In other words, the problem becomes how to express:

  (sizetype) (OP0 CODE OP1)

as:

  VAR:sizetype + (sizetype) OFF:ssizetype

The top-level integer split_constant_offset will (usually) be a sizetype
POINTER_PLUS operand, so the extra cast to sizetype disappears.  But adding
the cast allows the conversion handling to defer a lot of the difficult
cases to the recursive split_constant_offset call, which can detect
overflow on individual operations.

The net effect is to analyse the access above as:

  base: base
  offset: -(sizetype) foo * size
  init: INT_MIN * size

See the comments in the patch for more details.
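
As a rough illustration of why the position of the cast matters, here is
a small model in Python; it is a sketch, not code from the patch.  It
assumes a 32-bit int and a 64-bit sizetype, and it models the int negation
as wrapping, which is what hardware typically does even though the
overflow is undefined in C.  All helper names are made up for the example.

```python
INT_BITS = 32    # assumed width of "int"
SIZE_BITS = 64   # assumed width of "sizetype"
INT_MIN = -(1 << (INT_BITS - 1))


def wrap_signed(value, bits):
    """Reduce VALUE to a two's-complement signed integer of BITS bits."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):
        value -= 1 << bits
    return value


def to_sizetype(value):
    """Convert VALUE to the unsigned SIZE_BITS-bit type (sign-extending)."""
    return value & ((1 << SIZE_BITS) - 1)


def negate_then_cast(foo):
    """Model (sizetype) (-foo): the negation happens in int."""
    return to_sizetype(wrap_signed(-foo, INT_BITS))


def cast_then_negate(foo):
    """Model -(sizetype) foo: the negation happens in sizetype."""
    return to_sizetype(-to_sizetype(foo))
```

In this model the two forms agree for every int except INT_MIN, where
negate_then_cast gives 0xffffffff80000000 and cast_then_negate gives
0x80000000.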

Tested on aarch64-linux-gnu so far (with and without SVE), but will
test more widely overnight.

Thanks,
Richard


gcc/
PR tree-optimization/98069
* tree-data-ref.c (compute_distributive_range): New function.
(nop_conversion_for_offset_p): Likewise.
(split_constant_offset): In the internal overload, treat integer
expressions as having an implicit cast to sizetype and express
them accordingly.  Pass back the range of the original (uncast)
expression in a new range parameter.
(split_constant_offset_1): Likewise.  Rework the handling of
conversions to account for the implicit sizetype casts.
---
 gcc/testsuite/gcc.dg/vect/pr98069.c |  22 ++
 gcc/tree-data-ref.c | 427 +---
 2 files changed, 352 insertions(+), 97 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr98069.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr98069.c 
b/gcc/testsuite/gcc.dg/vect/pr98069.c
new file mode 100644
index 000..e60549fb30a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr98069.c
@@ -0,0 +1,22 @@
+long long int var_3 = -166416893043554447LL;
+short var_8 = (short)27092;
+unsigned int var_17 = 75036300U;
+short arr_165[23];
+
+static long c(long e, long f) { return f ? e : f; }
+void __attribute((noipa)) test()
+{
+  for (int b = 0; b < 19; b = var_17)
+for (int d = (int)(~c(-2147483647 - 1, var_3)) - 2147483647; d < 22; d++)
+  arr_165[d] = var_8;
+}
+
+int main()
+{
+  for (unsigned i_3 = 0; i_3 < 23; ++i_3)
+arr_165[i_3] = (short)-8885;
+  test();
+  if (arr_165[0] != 27092)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-data-ref.c b/gcc/tree-data-ref.c
index e8308ce8250..926553b5cac 100644
--- a/gcc/tree-data-ref.c
+++ b/gcc/tree-data-ref.c
@@ -97,6 +97,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-eh.h"
 #include "ssa.h"
 #include "internal-fn.h"
+#include "range-op.h"
+#include "vr-values.h"
 
 static struct datadep_stats
 {
@@ -581,26 +583,196 @@ debug_ddrs (vec ddrs)
   dump_ddrs (stderr, ddrs);
 }
 
+/* If RESULT_RANGE is nonnull, set *RESULT_RANGE to the range of
+   OP0 CODE OP1, where:
+
+   - OP0 CODE OP1 has integral type TYPE
+   - the range of OP0 is given by OP0_RANGE and
+   - the range of OP1 is given by OP1_RANGE.
+
+   Independently of RESULT_RANGE, try to compute:
+
+ DELTA = ((sizetype) OP0 CODE (sizetype) OP1)
+- (sizetype) (OP0 CODE OP1)
+
+   as a constant and subtract DELTA from the ssizetype constant in *OFF.
+   Return true on success, or false if DELTA is not known at compile time.
+
+   Truncation and sign changes are known to distribute over CODE, i.e.
+
+ (itype) (A CODE B) == (itype) A CODE (itype) B
+
+   for any integral type ITYPE whose precision is no greater than the
+   precision of A and B.  */
+
+static bool
+compute_distributive_range (tree type, value_range &op0_range,
+   tree_code code, value_range &op1_range,
+   tree *off, value_range *result_range)
+{
+  gcc_assert (INTEGRAL_TYPE_P (type) && !TYPE_OVERFLOW_TRAPS (type));
+  if (result_range)
+{
+  range_operator *op = range_op_handler (code, type);
+  op->fold_range (*result_range, type, op0_range, op1_range);
+}
+
+  /* The distributive property guarantees that if TYPE is no narrower
+ than SIZETYPE,
+
+   (sizetype) (OP0 CODE OP1) == (sizetype) OP0 CODE (sizetype) OP1
+
+ and so we can treat DELTA as zero.  */
+  if (TYPE_PRECISION (type) >= TYPE_PRECISION (sizetype))
+return true;
+
+  /* If overflow is undefined, we can assume that:
+
+   X == (ssizetype) OP0 CODE (ssizety

c++: Module parsing

2020-12-09 Thread Nathan Sidwell



This adds the module-declaration parsing and other logic.  We have two
new kinds of declaration -- module and import.  Plus the ability to
export other declarations.  The module processing can also divide the
TU into several portions -- GMF, Purview and PMF.

There are restrictions that some declarations must or must not appear
in a #include, so I needed to add a bit to indicate whether a token
came from the main source or not.  This seemed the least unpleasant
way of implementing such a check.

gcc/cp/
* parser.h (struct cp_token): Add main_source_p field.
* parser.c (cp_lexer_new_main): Pass through module token filter.
Check macros.
(cp_lexer_get_preprocessor_token): Set main_source_p.
(enum module_parse): New.
(cp_parser_diagnose_invalid_type_name): Deal with unrecognized
module-directives.
(cp_parser_skip_to_closing_parenthesis_1): Skip module-directives.
(cp_parser_skip_to_end_of_statement): Likewise.
(cp_parser_skip_to_end_of_block_or_statement): Likewise.
(cp_parser_translation_unit): Add module parsing calls.
(cp_parser_module_name, cp_parser_module_declaration): New.
(cp_parser_import_declaration, cp_parser_module_export): New.
(cp_parser_declaration): Add module export detection.
(cp_parser_template_declaration): Adjust 'export' error message.
(cp_parser_function_definition_after_declarator): Add
module-specific logic.
* module.cc (import_module, declare_module)
(maybe_check_all_macros): Stubs.

pushing to trunk
--
Nathan Sidwell
diff --git i/gcc/cp/module.cc w/gcc/cp/module.cc
index a961e3bcc92..3587dfcc925 100644
--- i/gcc/cp/module.cc
+++ w/gcc/cp/module.cc
@@ -123,6 +123,16 @@ set_originating_module (tree, bool)
 {
 }
 
+void
+import_module (module_state *, location_t, bool, tree, cpp_reader *)
+{
+}
+
+void
+declare_module (module_state *, location_t, bool, tree, cpp_reader *)
+{
+}
+
 module_state *
 preprocess_module (module_state *, unsigned, bool, bool, bool, cpp_reader *)
 {
@@ -143,6 +153,11 @@ init_modules (cpp_reader *)
 		 "Shtopp! What are you doing? This is not ready yet.");
 }
 
+void
+maybe_check_all_macros (cpp_reader *)
+{
+}
+
 void
 fini_modules ()
 {
diff --git i/gcc/cp/parser.c w/gcc/cp/parser.c
index 0ff000cd053..39957d4b6a9 100644
--- i/gcc/cp/parser.c
+++ w/gcc/cp/parser.c
@@ -646,9 +646,17 @@ cp_lexer_new_main (void)
   /* Put the first token in the buffer.  */
   cp_token *tok = lexer->buffer->quick_push (token);
 
+  uintptr_t filter = 0;
+  if (modules_p ())
+filter = module_token_cdtor (parse_in, filter);
+
   /* Get the remaining tokens from the preprocessor.  */
   while (tok->type != CPP_EOF)
 {
+  if (filter)
+	/* Process the previous token.  */
+	module_token_lang (tok->type, tok->keyword, tok->u.value,
+			   tok->location, filter);
   tok = vec_safe_push (lexer->buffer, cp_token ());
   cp_lexer_get_preprocessor_token (C_LEX_STRING_NO_JOIN, tok);
 }
@@ -658,10 +666,15 @@ cp_lexer_new_main (void)
   + lexer->buffer->length ()
 		  - 1;
 
+  if (filter)
+module_token_cdtor (parse_in, filter);
+
   /* Subsequent preprocessor diagnostics should use compiler
  diagnostic functions to get the compiler source location.  */
   done_lexing = true;
 
+  maybe_check_all_macros (parse_in);
+
   gcc_assert (!lexer->next_token->purged_p);
   return lexer;
 }
@@ -842,6 +855,8 @@ cp_lexer_get_preprocessor_token (unsigned flags, cp_token *token)
   token->purged_p = false;
   token->error_reported = false;
   token->tree_check_p = false;
+  /* Usually never see a zero, but just in case ... */
+  token->main_source_p = line_table->depth <= 1;
 
   /* On some systems, some header files are surrounded by an
  implicit extern "C" block.  Set a flag in the token if it
@@ -2190,6 +2205,28 @@ static tree cp_parser_implicitly_scoped_statement
 static void cp_parser_already_scoped_statement
   (cp_parser *, bool *, const token_indent_info &);
 
+/* State of module-declaration parsing.  */
+enum module_parse
+{
+  MP_NOT_MODULE,	/* Not a module.  */
+
+  _MP_UNUSED,
+
+  MP_FIRST,	/* First declaration of TU.  */
+  MP_GLOBAL,	/* Global Module Fragment.  */
+
+  MP_PURVIEW_IMPORTS,   /* Imports of a module.  */
+  MP_PURVIEW,	/* Purview of a named module.  */
+
+  MP_PRIVATE_IMPORTS, /* Imports of a Private Module Fragment.  */
+  MP_PRIVATE,   /* Private Module Fragment.  */
+};
+
+static module_parse cp_parser_module_declaration
+  (cp_parser *parser, module_parse, bool exporting);
+static void cp_parser_import_declaration
+  (cp_parser *parser, module_parse, bool exporting);
+
 /* Declarations [gram.dcl.dcl] */
 
 static void cp_parser_declaration_seq_opt
@@ -3419,6 +3456,15 @@ cp_parser_diagnose_invalid_type_name (cp_parser *parser, tree id,
   else if (cxx_dialect < cxx11 && id == ridpointers[(int)RID_NOEXCEPT])
 	inform (location, "C++11 % only availabl

Re: How to traverse all the local variables that declared in the current routine?

2020-12-09 Thread Richard Biener via Gcc-patches

On Wed, Dec 9, 2020 at 4:04 PM Qing Zhao  wrote:
>
>
>
> On Dec 9, 2020, at 2:23 AM, Richard Biener  wrote:
>
> On Tue, Dec 8, 2020 at 8:54 PM Qing Zhao  wrote:
>
>
>
>
> On Dec 8, 2020, at 1:40 AM, Richard Biener  wrote:
>
> On Mon, Dec 7, 2020 at 5:20 PM Qing Zhao  wrote:
>
>
>
>
> On Dec 7, 2020, at 1:12 AM, Richard Biener  wrote:
>
> On Fri, Dec 4, 2020 at 5:19 PM Qing Zhao  wrote:
>
>
>
>
> On Dec 4, 2020, at 2:50 AM, Richard Biener  wrote:
>
> On Thu, Dec 3, 2020 at 6:33 PM Richard Sandiford
>  wrote:
>
>
> Richard Biener via Gcc-patches  writes:
>
> On Tue, Nov 24, 2020 at 4:47 PM Qing Zhao  wrote:
>
> Another issue is, in order to check whether an auto-variable has an initializer, 
> I plan to add a new bit in “decl_common” as:
> /* In a VAR_DECL, this is DECL_IS_INITIALIZED.  */
> unsigned decl_is_initialized :1;
>
> /* IN VAR_DECL, set when the decl is initialized at the declaration.  */
> #define DECL_IS_INITIALIZED(NODE) \
> (DECL_COMMON_CHECK (NODE)->decl_common.decl_is_initialized)
>
> Set this bit when setting DECL_INITIAL for the variables in the FE, then keep it
> even though DECL_INITIAL might be NULLed.
>
>
> For locals it would be more reliable to set this flag during gimplification.
>
> Do you have any comment and suggestions?
>
>
> As said above - do you want to cover registers as well as locals?  I'd do
> the actual zeroing during RTL expansion instead since otherwise you
> have to figure out yourself whether a local is actually used (see 
> expand_stack_vars)
>
> Note that optimization will already have made use of the "uninitialized" state
> of locals so depending on what the actual goal is here "late" may be too late.
>
>
> Haven't thought about this much, so it might be a daft idea, but would a
> compromise be to use a const internal function:
>
> X1 = .DEFERRED_INIT (X0, INIT)
>
> where the X0 argument is an uninitialised value and the INIT argument
> describes the initialisation pattern?  So for a decl we'd have:
>
> X = .DEFERRED_INIT (X, INIT)
>
> and for an SSA name we'd have:
>
> X_2 = .DEFERRED_INIT (X_1(D), INIT)
>
> with all other uses of X_1(D) being replaced by X_2.  The idea is that:
>
> * Having the X0 argument would keep the uninitialised use of the
> variable around for the later warning passes.
>
> * Using a const function should still allow the UB to be deleted as dead
> if X1 isn't needed.
>
> * Having a function in the way should stop passes from taking advantage
> of direct uninitialised uses for optimisation.
>
> This means we won't be able to optimise based on the actual init
> value at the gimple level, but that seems like a fair trade-off.
> AIUI this is really a security feature or anti-UB hardening feature
> (in the sense that users are more likely to see predictable behaviour
> “in the field” even if the program has UB).
>
>
> The question is whether it's in line with people's expectations that
> explicitly zero-initialized code behaves differently from
> implicitly zero-initialized code with respect to optimization
> and secondary side-effects (late diagnostics, latent bugs, etc.).
>
> Introducing a new concept like .DEFERRED_INIT is much more
> heavy-weight than an explicit zero initializer.
>
>
> What exactly do you mean by “heavy-weight”? More difficult to implement, much 
> more run-time overhead, or both? Or something else?
>
> The major benefit of the “.DEFERRED_INIT” approach is that it lets us keep 
> the current -Wuninitialized analysis untouched and also pass
> the “uninitialized” info from source code level to “pass_expand”.
>
>
> Well, "untouched" is a bit oversimplified.  You do need to handle
> .DEFERRED_INIT as not
> being an initialization which will definitely get interesting.
>
>
> Yes, during the uninitialized variable analysis pass, we should specially handle 
> the defs with “.DEFERRED_INIT” and treat them as uninitialized.
>
> If we want to keep the current -Wuninitialized analysis untouched, this is 
> quite a reasonable approach.
>
> However, if it’s not required to keep the current -Wuninitialized analysis 
> untouched, adding a zero-initializer directly during gimplification should
> be much easier and simpler, and should also have smaller run-time overhead.
>
>
> As for optimization I fear you'll get a load of redundant zero-init
> actually emitted if you can just rely on RTL DSE/DCE to remove it.
>
>
> Runtime overhead for -fauto-init=zero is one important consideration for the 
> whole feature. We should minimize the runtime overhead of zero
> initialization since it will be used in production builds.
> We can do some run-time performance evaluation when we have an implementation 
> ready.
>
>
> Note there will be other passes "confused" by .DEFERRED_INIT.  Note
> that there's going to be other
> considerations - namely where to emit the .DEFERRED_INIT - when
> emitting it during gimplification
> you can emit it at the start of the block of block-scope variables.
> When emitting after gimplification
> you have to emit at function start which will probably

Re: How to traverse all the local variables that declared in the current routine?

2020-12-09 Thread Qing Zhao via Gcc-patches




> On Dec 9, 2020, at 2:23 AM, Richard Biener  wrote:
> 
> On Tue, Dec 8, 2020 at 8:54 PM Qing Zhao  wrote:
>> 
>> 
>> 
>> On Dec 8, 2020, at 1:40 AM, Richard Biener  wrote:
>> 
>> On Mon, Dec 7, 2020 at 5:20 PM Qing Zhao  wrote:
>> 
>> 
>> 
>> 
>> On Dec 7, 2020, at 1:12 AM, Richard Biener  wrote:
>> 
>> On Fri, Dec 4, 2020 at 5:19 PM Qing Zhao  wrote:
>> 
>> 
>> 
>> 
>> On Dec 4, 2020, at 2:50 AM, Richard Biener  wrote:
>> 
>> On Thu, Dec 3, 2020 at 6:33 PM Richard Sandiford
>>  wrote:
>> 
>> 
>> Richard Biener via Gcc-patches  writes:
>> 
>> On Tue, Nov 24, 2020 at 4:47 PM Qing Zhao  wrote:
>> 
>> Another issue is, in order to check whether an auto-variable has an 
>> initializer, I plan to add a new bit in “decl_common” as:
>> /* In a VAR_DECL, this is DECL_IS_INITIALIZED.  */
>> unsigned decl_is_initialized :1;
>> 
>> /* IN VAR_DECL, set when the decl is initialized at the declaration.  */
>> #define DECL_IS_INITIALIZED(NODE) \
>> (DECL_COMMON_CHECK (NODE)->decl_common.decl_is_initialized)
>> 
>> Set this bit when setting DECL_INITIAL for the variables in the FE, then keep it
>> even though DECL_INITIAL might be NULLed.
>> 
>> 
>> For locals it would be more reliable to set this flag during gimplification.
>> 
>> Do you have any comment and suggestions?
>> 
>> 
>> As said above - do you want to cover registers as well as locals?  I'd do
>> the actual zeroing during RTL expansion instead since otherwise you
>> have to figure out yourself whether a local is actually used (see 
>> expand_stack_vars)
>> 
>> Note that optimization will already have made use of the "uninitialized" state
>> of locals so depending on what the actual goal is here "late" may be too 
>> late.
>> 
>> 
>> Haven't thought about this much, so it might be a daft idea, but would a
>> compromise be to use a const internal function:
>> 
>> X1 = .DEFERRED_INIT (X0, INIT)
>> 
>> where the X0 argument is an uninitialised value and the INIT argument
>> describes the initialisation pattern?  So for a decl we'd have:
>> 
>> X = .DEFERRED_INIT (X, INIT)
>> 
>> and for an SSA name we'd have:
>> 
>> X_2 = .DEFERRED_INIT (X_1(D), INIT)
>> 
>> with all other uses of X_1(D) being replaced by X_2.  The idea is that:
>> 
>> * Having the X0 argument would keep the uninitialised use of the
>> variable around for the later warning passes.
>> 
>> * Using a const function should still allow the UB to be deleted as dead
>> if X1 isn't needed.
>> 
>> * Having a function in the way should stop passes from taking advantage
>> of direct uninitialised uses for optimisation.
>> 
>> This means we won't be able to optimise based on the actual init
>> value at the gimple level, but that seems like a fair trade-off.
>> AIUI this is really a security feature or anti-UB hardening feature
>> (in the sense that users are more likely to see predictable behaviour
>> “in the field” even if the program has UB).
>> 
>> 
>> The question is whether it's in line with people's expectations that
>> explicitly zero-initialized code behaves differently from
>> implicitly zero-initialized code with respect to optimization
>> and secondary side-effects (late diagnostics, latent bugs, etc.).
>> 
>> Introducing a new concept like .DEFERRED_INIT is much more
>> heavy-weight than an explicit zero initializer.
>> 
>> 
>> What exactly do you mean by “heavy-weight”? More difficult to implement, much 
>> more run-time overhead, or both? Or something else?
>> 
>> The major benefit of the “.DEFERRED_INIT” approach is that it lets us keep 
>> the current -Wuninitialized analysis untouched and also pass
>> the “uninitialized” info from source code level to “pass_expand”.
>> 
>> 
>> Well, "untouched" is a bit oversimplified.  You do need to handle
>> .DEFERRED_INIT as not
>> being an initialization which will definitely get interesting.
>> 
>> 
>> Yes, during the uninitialized variable analysis pass, we should specially handle 
>> the defs with “.DEFERRED_INIT” and treat them as uninitialized.
>> 
>> If we want to keep the current -Wuninitialized analysis untouched, this is 
>> quite a reasonable approach.
>> 
>> However, if it’s not required to keep the current -Wuninitialized analysis 
>> untouched, adding a zero-initializer directly during gimplification should
>> be much easier and simpler, and should also have smaller run-time overhead.
>> 
>> 
>> As for optimization I fear you'll get a load of redundant zero-init
>> actually emitted if you can just rely on RTL DSE/DCE to remove it.
>> 
>> 
>> Runtime overhead for -fauto-init=zero is one important consideration for the 
>> whole feature. We should minimize the runtime overhead of zero
>> initialization since it will be used in production builds.
>> We can do some run-time performance evaluation when we have an 
>> implementation ready.
>> 
>> 
>> Note there will be other passes "confused" by .DEFERRED_INIT.  Note
>> that there's going to be other
>> considerations

Re: [PATCH] RISC-V: Explicitly call python when using multilib generator

2020-12-09 Thread Jakub Jelinek via Gcc-patches

On Wed, Dec 09, 2020 at 03:57:51PM +0100, Matthias Klose wrote:
> On 12/9/20 3:03 PM, Simon Cook wrote:
> > When building GCC for RISC-V with the --with-multilib-generator option,
> > it may not be possible to call arch-canonicalize as an executable when
> > building on Windows. Instead directly invoke the expected python
> > interpreter for this step.

There is nothing in the two scripts that can't be easily done in awk
or shell.
I think it would be best to rewrite those scripts in those languages.

Jakub

Re: [PATCH] RISC-V: Explicitly call python when using multilib generator

2020-12-09 Thread Matthias Klose

On 12/9/20 3:03 PM, Simon Cook wrote:
> When building GCC for RISC-V with the --with-multilib-generator option,
> it may not be possible to call arch-canonicalize as an executable when
> building on Windows. Instead directly invoke the expected python
> interpreter for this step.
> 
> gcc/ChangeLog:
> 
>   * config/riscv/multilib-generator (arch_canonicalize): Invoke
>   python interpreter when calling arch-canonicalize script.
> ---
>  gcc/config/riscv/multilib-generator | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/config/riscv/multilib-generator
> b/gcc/config/riscv/multilib-generator
> index 53c51dfa53f..79948518118 100755
> --- a/gcc/config/riscv/multilib-generator
> +++ b/gcc/config/riscv/multilib-generator
> @@ -54,7 +54,8 @@ def arch_canonicalize(arch):
>this_file = os.path.abspath(os.path.join( __file__))
>arch_can_script = \
>  os.path.join(os.path.dirname(this_file), "arch-canonicalize")
> -  proc = subprocess.Popen([arch_can_script, arch], stdout=subprocess.PIPE)
> +  proc = subprocess.Popen(['python', arch_can_script, arch],
> +  stdout=subprocess.PIPE)
>out, err = proc.communicate()
>return out.strip()
> 

that's again hard-coding 'python'.
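
One possible way to avoid the hard-coded name, sketched here purely as an
assumption about what might be acceptable, is to reuse the interpreter
that is already running the generator via sys.executable.  The extra
script_dir parameter is hypothetical and only there to keep the sketch
self-contained; the decode() call assumes Python 3.

```python
import os
import subprocess
import sys


def arch_canonicalize(arch, script_dir):
    """Run the arch-canonicalize script found in SCRIPT_DIR on ARCH.

    The script is started with the interpreter running this code
    (sys.executable), so no interpreter name is spelled out and the
    script does not need to be directly executable.
    """
    arch_can_script = os.path.join(script_dir, "arch-canonicalize")
    proc = subprocess.Popen([sys.executable, arch_can_script, arch],
                            stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    return out.decode().strip()
```

Whether this fits the build setup on Windows has not been checked here.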

Re: RFC: ARM MVE and Neon auto-vectorization

2020-12-09 Thread Andre Vieira (lists) via Gcc-patches




On 08/12/2020 13:50, Christophe Lyon via Gcc-patches wrote:

Hi,


My 'vand' patch changes the definition of VDQ so that the relevant
modes are enabled only when !TARGET_HAVE_MVE (V8QI, ...), and this
helps writing a simpler expander.

However, vneg is used by vshr (right-shifts by register are
implemented as left-shift by negation of that register), so the
expander uses something like:

   emit_insn (gen_neg2 (neg, operands[2]));
   if (TARGET_NEON)
   emit_insn (gen_ashl3_signed (operands[0], operands[1], neg));
   else
   emit_insn (gen_mve_vshlq_s (operands[0], operands[1], neg));

which does not work if the iterator has conditional members: the
'else' part is still generated for modes unsupported by MVE.

So I guess my question is:  do we want to enforce implementation
of Neon / MVE common parts? There are already lots of partly
overlapping/duplicate iterators. I have tried to split iterators into
eg VDQ_COMMON_TO_NEON_AND_MVE and VDQ_NEON_ONLY but this means we have
to basically duplicate the expanders which defeats the point...

Ideally I think we'd want a minimal number of iterators and defines, which 
was the idea behind the conditional iterators disabling 64-bit modes for 
MVE.


Obviously that then breaks the code above. For this specific case I 
would suggest unifying define_insns ashl3_{signed,unsigned} and 
mve_vshlq_; they are very much the same patterns. I also 
don't understand why ashl's signed and unsigned are separate. For 
instance create a 'ashl3__' or something like that, and make 
sure the calls to gen_ashl3_{unsigned,signed} now call 
gen_ashl3__ and that arm_mve_builtins.def uses 
ashl3__ instead of this,  needs to be at the end of 
the name for the builtin construct. Whether this 'form' would work 
everywhere, I don't know. And I suspect you might find more issues like 
this. If there are more than you are willing to change right now then 
maybe the easier step forward is to try to tackle them one at a time, 
and use a new conditional iterator where you've been able to merge NEON 
and MVE patterns.


As a general strategy I think we should try to clean the mess up, but I 
don't think we should try to clean it all up in one go as that will 
probably lead to it not getting done at all. I'm not the maintainer, so 
I'd be curious to see how Kyrill feels about this, but in my opinion we 
should take patches that don't make it less maintainable, so if you can 
clean it up as much as possible, great! Otherwise if it's not making the 
mess bigger and it's enabling auto-vec then I personally don't see why it 
shouldn't be accepted.

Or we can keep different expanders for Neon and MVE? But we have
already quite a few in vec-common.md.

We can't keep different expanders if they expand the same optab with the 
same modes in the same backend. So we will always have to make NEON and 
MVE work together.

[PATCH][pushed] testsuite: fix 2 tests on aarch64

2020-12-09 Thread Martin Liška


gcc/testsuite/ChangeLog:

PR tree-optimization/98182
* gcc.dg/tree-ssa/if-to-switch-1.c: Add case-values-threshold in
order to fix them for aarch64.
* gcc.dg/tree-ssa/if-to-switch-10.c: Likewise.
---
 gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-1.c  | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-10.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-1.c
index e66fa736e10..e5da00b62d6 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-iftoswitch-optimized" } */
+/* { dg-options "-O2 -fdump-tree-iftoswitch-optimized --param 
case-values-threshold=5" } */
 
 int global;

 int foo ();
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-10.c 
b/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-10.c
index 7b8da1c9f3c..7b21c99313c 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-10.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/if-to-switch-10.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-iftoswitch-optimized" } */
+/* { dg-options "-O2 -fdump-tree-iftoswitch-optimized --param 
case-values-threshold=5" } */
 
 int global;

 int foo ();
--
2.29.2

[PATCH] aarch64: Add CPU-specific SVE vector costs struct

2020-12-09 Thread Kyrylo Tkachov via Gcc-patches

Hi all,

This patch extends the backend vector costs structures to allow for separate
Advanced SIMD and SVE costs. The fields in the current cpu_vector_cost that
would vary between the ISAs are moved into a simd_vec_cost struct and we have
two typedefs of it: advsimd_vec_cost and sve_vec_cost.
If, in the future, SVE needs some extra fields it could inherit from
simd_vec_cost.
The CPU vector cost tables in aarch64.c are updated for the struct changes.
aarch64_builtin_vectorization_cost is updated to select either the Advanced
SIMD or SVE costs field depending on the mode and field availability.
No change in codegen is intended with this patch.
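
For readers who want the shape of the change without opening the
attachment, here is a rough model in Python rather than the actual C++
from the patch.  The field names (int_stmt_cost, fp_stmt_cost,
scalar_int_stmt_cost) and the helper select_simd_costs are illustrative
assumptions, not the names used in aarch64-protos.h.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SimdVecCost:
    """Costs that may differ between Advanced SIMD and SVE (hypothetical fields)."""
    int_stmt_cost: int
    fp_stmt_cost: int


@dataclass(frozen=True)
class CpuVectorCost:
    """Per-CPU costs: ISA-independent fields plus one table per vector ISA."""
    scalar_int_stmt_cost: int
    advsimd: SimdVecCost
    sve: Optional[SimdVecCost] = None  # None models a CPU without an SVE table


def select_simd_costs(costs, is_sve_mode):
    """Pick the SVE table for SVE modes when present, else Advanced SIMD."""
    if is_sve_mode and costs.sve is not None:
        return costs.sve
    return costs.advsimd
```

In this model a CPU table that leaves the SVE entry empty simply falls
back to its Advanced SIMD costs, which is one reading of "depending on
the mode and field availability".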

Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing to the mainline.
Thanks,
Kyrill

gcc/
* config/aarch64/aarch64-protos.h (cpu_vector_cost): Move simd fields to...
(simd_vec_cost): ... Here.  Define.
(advsimd_vec_cost): Define.
(sve_vec_cost): Define.
* config/aarch64/aarch64.c (generic_advsimd_vector_cost): Define.
(generic_sve_vector_cost): Likewise.
(generic_vector_cost): Update.
(qdf24xx_advsimd_vector_cost): Define.
(qdf24xx_vector_cost): Update.
(thunderx_advsimd_vector_cost): Define.
(thunderx_vector_cost): Update.
(tsv110_advsimd_vector_cost): Define.
(tsv110_vector_cost): Likewise.
(cortexa57_advsimd_vector_cost): Define.
(cortexa57_vector_cost): Update.
(exynosm1_advsimd_vector_cost): Define.
(exynosm1_vector_cost): Update.
(xgene1_advsimd_vector_cost): Define.
(xgene1_vector_cost): Update.
(thunderx2t99_advsimd_vector_cost): Define.
(thunderx2t99_vector_cost): Update.
(thunderx3t110_advsimd_vector_cost): Define.
(thunderx3t110_vector_cost): Update.
(aarch64_builtin_vectorization_cost): Handle sve and advsimd vector cost
fields.


sve-cost.patch
Description: sve-cost.patch

Re: [PATCH 00/31] VAX: Bring the port up to date (yes, MODE_CC conversion is included)

2020-12-09 Thread Maciej W. Rozycki

On Sat, 28 Nov 2020, Paul Koning wrote:

> > Hmm, I gather those systems are able to run some kind of BSD Unix: don't 
> > they support the r-commands which would allow you to run DejaGNU testing 
> > with a realistic environment PDP-11 hardware would be usually used with, 
> > possibly on actual hardware even?  I always feel a bit uneasy about the 
> > accuracy of any simulation (having suffered from bugs in QEMU causing 
> > false negatives in software verification).
> 
> Fair enough.  But SIMH is a full system emulator with a very large 
> amount of history and expertise involved in its creation.  It's also 
> known to run every PDP-11 OS and most diagnostics.  Yes, it certainly 
> runs BSD 2.x; the reason I didn't use that approach is that I don't know 
> it well.

 This all sounds great.  Do you happen to know if it is cycle-accurate 
with respect to individual hardware microarchitectures simulated?  That 
would be required for performance evaluation of compiler-generated code.

  Maciej

[PATCH] RISC-V: Explicitly call python when using multilib generator

2020-12-09 Thread Simon Cook

When building GCC for RISC-V with the --with-multilib-generator option,
it may not be possible to call arch-canonicalize as an executable when
building on Windows. Instead directly invoke the expected python
interpreter for this step.

gcc/ChangeLog:

* config/riscv/multilib-generator (arch_canonicalize): Invoke
python interpreter when calling arch-canonicalize script.
---
 gcc/config/riscv/multilib-generator | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/multilib-generator b/gcc/config/riscv/multilib-generator
index 53c51dfa53f..79948518118 100755
--- a/gcc/config/riscv/multilib-generator
+++ b/gcc/config/riscv/multilib-generator
@@ -54,7 +54,8 @@ def arch_canonicalize(arch):
   this_file = os.path.abspath(os.path.join( __file__))
   arch_can_script = \
 os.path.join(os.path.dirname(this_file), "arch-canonicalize")
-  proc = subprocess.Popen([arch_can_script, arch], stdout=subprocess.PIPE)
+  proc = subprocess.Popen(['python', arch_can_script, arch],
+  stdout=subprocess.PIPE)
   out, err = proc.communicate()
   return out.strip()

-- 
2.24.3

c++: Decl module-specific semantic processing

2020-12-09 Thread Nathan Sidwell



This adds the module-specific logic to the various declaration
processing routines in decl.c and semantics.c.  I also adjust the rtti
type creation, as those are all in the global module, so we need to
temporarily clear the module_kind when they are being created.
Finally, I added init and fini module processing with the initializer
giving a fatal error if you try to turn it on (so don't do that yet).

gcc/cp/
* decl.c (duplicate_decls): Add module-specific redeclaration
logic.
(cxx_init_decl_processing): Export the global namespace, maybe
initialize modules.
(start_decl): Reject local-extern in a module, adjust linkage of
template var.
(xref_tag_1): Add module-specific redeclaration logic.
(start_enum): Likewise.
(finish_enum_value_list): Export unscoped members of an exported
enum.
(grokmethod): Implement p1779 linkage of in-class defined
functions.
* decl2.c (no_linkage_error): Imports are ok.
(c_parse_final_cleanups): Call fini_modules.
* lex.c (cxx_dup_lang_specific): Clear some module flags in the
copy.
* module.cc (module_kind): Define.
(module_may_redeclare, set_defining_module): Stubs.
(init_modules): Error on modules.
(fini_modules): Stub.
* rtti.c (push_abi_namespace): Save and reset module_kind.
(pop_abi_namespace): Restore module kind.
(build_dynamic_cast_1, tinfo_base_init): Adjust.
* semantics.c (begin_class_definition): Add module-specific logic.
(expand_or_defer_fn_1): Keep bodies of more fns when modules_p.


pushing to trunk
--
Nathan Sidwell
diff --git i/gcc/cp/decl.c w/gcc/cp/decl.c
index bb5bb2f1a18..ae93fe1d7f0 100644
--- i/gcc/cp/decl.c
+++ w/gcc/cp/decl.c
@@ -2008,6 +2008,39 @@ duplicate_decls (tree newdecl, tree olddecl, bool hiding, bool was_hidden)
   if (!validate_constexpr_redeclaration (olddecl, newdecl))
 return error_mark_node;
 
+  if (modules_p ()
+  && TREE_CODE (CP_DECL_CONTEXT (olddecl)) == NAMESPACE_DECL
+  && TREE_CODE (olddecl) != NAMESPACE_DECL
+  && !hiding)
+{
+  if (DECL_ARTIFICIAL (olddecl))
+	{
+	  gcc_checking_assert (!(DECL_LANG_SPECIFIC (olddecl)
+ && DECL_MODULE_IMPORT_P (olddecl)));
+	  if (!(global_purview_p () || not_module_p ()))
+	error ("declaration %qD conflicts with builtin", newdecl);
+	  else
+	DECL_MODULE_EXPORT_P (olddecl) = DECL_MODULE_EXPORT_P (newdecl);
+	}
+  else
+	{
+	  if (!module_may_redeclare (olddecl))
+	{
+	  error ("declaration %qD conflicts with import", newdecl);
+	  inform (olddecl_loc, "import declared %q#D here", olddecl);
+
+	  return error_mark_node;
+	}
+
+	  if (DECL_MODULE_EXPORT_P (newdecl)
+	  && !DECL_MODULE_EXPORT_P (olddecl))
+	{
+	  error ("conflicting exporting declaration %qD", newdecl);
+	  inform (olddecl_loc, "previous declaration %q#D here", olddecl);
+	}
+	}
+}
+
   /* We have committed to returning OLDDECL at this point.  */
 
   /* If new decl is `static' and an `extern' was seen previously,
@@ -2218,6 +2251,10 @@ duplicate_decls (tree newdecl, tree olddecl, bool hiding, bool was_hidden)
 	}
 	}
 
+  DECL_MODULE_IMPORT_P (olddecl)
+	= DECL_MODULE_IMPORT_P (old_result)
+	= DECL_MODULE_IMPORT_P (newdecl);
+
   return olddecl;
 }
 
@@ -2836,6 +2873,20 @@ duplicate_decls (tree newdecl, tree olddecl, bool hiding, bool was_hidden)
   memcpy ((char *) olddecl + sizeof (struct tree_common),
 	  (char *) newdecl + sizeof (struct tree_common),
 	  sizeof (struct tree_decl_common) - sizeof (struct tree_common));
+
+  if (DECL_LANG_SPECIFIC (olddecl) && DECL_TEMPLATE_INFO (olddecl))
+	{
+	  /* Repropagate the module information to the template.  */
+	  tree tmpl = DECL_TI_TEMPLATE (olddecl);
+
+	  if (DECL_TEMPLATE_RESULT (tmpl) == olddecl)
+	{
+	  DECL_MODULE_PURVIEW_P (tmpl) = DECL_MODULE_PURVIEW_P (olddecl);
+	  gcc_checking_assert (!DECL_MODULE_IMPORT_P (olddecl));
+	  DECL_MODULE_IMPORT_P (tmpl) = false;
+	}
+	}
+
   switch (TREE_CODE (newdecl))
 	{
 	case LABEL_DECL:
@@ -4330,7 +4381,8 @@ cxx_init_decl_processing (void)
   gcc_assert (global_namespace == NULL_TREE);
   global_namespace = build_lang_decl (NAMESPACE_DECL, global_identifier,
   void_type_node);
-  TREE_PUBLIC (global_namespace) = 1;
+  TREE_PUBLIC (global_namespace) = true;
+  DECL_MODULE_EXPORT_P (global_namespace) = true;
   DECL_CONTEXT (global_namespace)
 = build_translation_unit_decl (get_identifier (main_input_filename));
   /* Remember whether we want the empty class passing ABI change warning
@@ -4629,6 +4681,9 @@ cxx_init_decl_processing (void)
   if (! supports_one_only ())
 flag_weak = 0;
 
+  if (modules_p ())
+init_modules (parse_in);
+
   make_fname_decl = cp_make_fname_decl;
   start_fname_decls ();
 
@@ -5453,8 +5508,14 @@ start_decl (const cp_declarator *declarator,

RE: [PATCH][GCC] aarch64: Add +pauth to -march

2020-12-09 Thread Kyrylo Tkachov via Gcc-patches




> -Original Message-
> From: Przemyslaw Wirkus 
> Sent: 07 December 2020 21:20
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw ; Richard Sandiford
> ; Kyrylo Tkachov
> ; Marcus Shawcroft
> 
> Subject: [PATCH][GCC] aarch64: Add +pauth to -march
> 
> New +pauth (Pointer Authentication from Armv8.3-A) feature option for
> -march command line option.
> 
> Please note that the majority of PAUTH instructions are implemented behind
> the HINT instruction. PAUTH stays an Armv8.3-A feature but can now be
> assigned to other architectures or CPUs.
> 
> Patch includes:
> - new +pauth command line option.
> - docs update to +flagm command line option in docs.
> 
> Regression tested and no issues.
> 
> OK for master?

Ok.
Thanks,
Kyrill

> 
> gcc/ChangeLog:
> 
> * config/aarch64/aarch64-option-extensions.def
> (AARCH64_OPT_EXTENSION): New +pauth option in -march for AArch64.
> * config/aarch64/aarch64.h (AARCH64_FL_PAUTH): New pauth extension
> bitmask.
> (AARCH64_ISA_PUATH): New ISA bitmask for PAUTH.
> (AARCH64_FL_FOR_ARCH8_3): Add PAUTH to Armv8.3-A.
> (TARGET_PAUTH): New target mask to isolate PAUTH instructions.
> * config/aarch64/aarch64.md (do_return): Condition set to TARGET_PAUTH.
> * doc/invoke.texi: Update docs (+flagm, +pauth).

[RFC] [avr] Toolchain Integration for Testsuite Execution (avr cc0 to mode_cc0 conversion)

2020-12-09 Thread abebeos via Gcc-patches

Essence:

I need a confirmation that the testsuite setup as presented in:

https://github.com/abebeos/avr-gnu

works fine.

The problem with the avr target is that the testsuite cannot be run easily,
mainly because of the need for a special simulated-target setup, which does
not work for avr as documented. This led developers to a dead end with
their non-cc0 avr backends (a non-cc0 backend is needed so that avr is not
dropped from gcc11).

I integrated a toolchain/testsetup to be able to run the gcc testsuite
against a simulated avr target.

I then used this toolchain to test 2 different existing
non-cc0 avr backends (from pipcet and saaadhu, both on GitHub).

The result is that saaadhu's backend seems to be working 100%. Its
testsuite results are identical to those of the existing (but deprecated)
cc0 backend, which means that it can be used "as-is" for inclusion in gcc11.

Please note that I did this work in the context of a bounty at Bountysource;
more information is in the issue:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92729#c35

[PATCH] Limit perf data buffer during feature checking

2020-12-09 Thread Ilya Leoshkevich via Gcc-patches

Bootstrapped and regtested on x86_64-redhat-linux.  Ok for master?

Commit 2ead1ab91123 ("Limit perf data buffer during profiling") added
-m8 to perf invocations while running tests, but the same problem
exists when checking whether perf works in the first place.

gcc/testsuite/ChangeLog:

2020-12-08  Ilya Leoshkevich  

* lib/target-supports.exp (check_profiling_available): Limit
perf data buffer.
---
 gcc/testsuite/lib/target-supports.exp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 89c4f67554f..75b4f5d0e85 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -654,7 +654,7 @@ proc check_profiling_available { test_what } {
return 0
}
 global srcdir
-   set status [remote_exec host "$srcdir/../config/i386/gcc-auto-profile" "true -v >/dev/null"]
+   set status [remote_exec host "$srcdir/../config/i386/gcc-auto-profile" "-m8 true -v >/dev/null"]
if { [lindex $status 0] != 0 } {
verbose "autofdo not supported because perf does not work"
return 0
-- 
2.25.4

Re: [Patch] Fortran: Add 'omp scan' support of OpenMP 5.0

2020-12-09 Thread Tobias Burnus


On 09.12.20 12:36, Thomas Schwinge wrote:

I'm confirming that it seems to work (that is, doesn't seem to cause any
obvious interference); OK to verify/document that as in the attached
"Add 'gfortran.dg/goacc-gomp/omp-scan-1-if_present.f90'"?


I don't think the testcase is useful, but I wouldn't veto it.

However, I think the comment change is completely misleading.
Additionally, the testcase misses the point in terms of what
happens internally.

Namely:
  !$acc update ... if_present
gets translated into an
  gfc_code->op == EXEC_OACC_UPDATE
  gfc_code->code->ext.omp_clauses->if_present = true

While
  !$omp scan ...
gets translated into
  gfc_code->op == EXEC_OMP_SCAN
which for a short time sets:
  gfc_code->code->ext.omp_clauses->if_present = true

(And those are different gfc_code variables.)
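
The mark-then-check use of the flag can be sketched in C as follows. This is only an illustration under stated assumptions: the types struct code and struct clauses and the helpers mark_scan_checked and resolve_scan are hypothetical stand-ins, not the gfortran data structures or functions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical statement kinds, loosely named after the EXEC_* codes.  */
enum exec_op { EXEC_OACC_UPDATE, EXEC_OMP_SCAN };

struct clauses
{
  bool if_present;   /* A real clause for OpenACC, a scratch mark for scan.  */
};

struct code
{
  enum exec_op op;
  struct clauses *clauses;   /* Each statement owns its own clause set.  */
};

/* Loop resolution: record that this scan sits inside a valid inscan loop.  */
static void
mark_scan_checked (struct code *scan)
{
  scan->clauses->if_present = true;
}

/* Directive resolution: return false for a scan that was never marked.
   The mark is cleared either way so it cannot leak into later passes.  */
static bool
resolve_scan (struct code *scan)
{
  bool ok = scan->clauses->if_present;
  scan->clauses->if_present = false;
  return ok;
}
```

Because the mark is written to and read from the scan statement's own clause set, an if_present on a separate OpenACC statement is never touched in this model.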

If we worry about this, we should also add a testcase for
 !$acc update host(a)
 !$acc update self(b) if_present
that checks that the first 'acc' does not have if_present set.


That it is also almost impossible to generate a compilable
testcase – due to the restrictions of 'omp scan' but also
because '!$acc update' cannot appear in an 'omp do' loop
— comes on top of this but focusing on the testcase is
really a red herring.

Tobias

PS: Unsetting 'if_present' for OMP_SCAN makes sense as otherwise
trans-openmp.c creates an OMP_CLAUSE_IF_PRESENT clause,
which may cause problems in the middle end. But that's
completely independent of -fopenacc and !$acc.

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter

Re: [Patch] Fortran: Add 'omp scan' support of OpenMP 5.0

2020-12-09 Thread Thomas Schwinge

Hi!

On 2020-12-09T12:51:57+0100, Jakub Jelinek  wrote:
> On Wed, Dec 09, 2020 at 12:36:26PM +0100, Thomas Schwinge wrote:
>> Yeah, that re-purposing of 'if_present' made me raise an eyebrow, too.
>
> I've missed yesterday that the if_present is on the EXEC_OMP_SCAN, not on
> some outer EXEC that could be arbitrary

Indeed that's not obvious when seeing the first occurrence:
'c->block->next->next->ext.omp_clauses->if_present = true'.  From the
second occurrence:

 case EXEC_OMP_SCAN:
   /* Flag is only used to checking, hence, it is unset afterwards.  */
   if (!code->ext.omp_clauses->if_present)
     gfc_error ("Unexpected !$OMP SCAN at %L outside loop construct with "
                "% REDUCTION clause", &code->loc);

... it should've been clear -- but at that point I already had raised an
eyebrow.  ;-)

> and as !$omp scan can have only
> exclusive and inclusive clauses and nothing else, we can use pretty much
> all bool or unsigned :1 flags for that purpose as long as we document it,
> and the testcase with if_present on some other construct probably doesn't
> buy us much.

ACK.

Grüße
 Thomas

Re: [Patch] Fortran: Add 'omp scan' support of OpenMP 5.0

2020-12-09 Thread Jakub Jelinek via Gcc-patches

On Wed, Dec 09, 2020 at 12:36:26PM +0100, Thomas Schwinge wrote:
> Yeah, that re-purposing of 'if_present' made me raise an eyebrow, too.

I've missed yesterday that the if_present is on the EXEC_OMP_SCAN, not on
some outer EXEC that could be arbitrary and as !$omp scan can have only
exclusive and inclusive clauses and nothing else, we can use pretty much
all bool or unsigned :1 flags for that purpose as long as we document it,
and the testcase with if_present on some other construct probably doesn't
buy us much.

OT, perhaps we should turn those:
  bool nowait, ordered, untied, mergeable;
  bool inbranch, notinbranch, defaultmap, nogroup;
  bool sched_simd, sched_monotonic, sched_nonmonotonic;
  bool simd, threads, depend_source, order_concurrent, capture;
into unsigned nowait : 1, ordered : 1, untied : 1, mergeable : 1;
etc. to save some memory, single bit bitfields should be pretty fast.
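
A small C sketch of the memory effect, using the sixteen flag names listed above. The struct names flags_as_bool and flags_as_bits are invented for the example, and exact sizes depend on the ABI; on common targets one would expect roughly 16 bytes versus 4.

```c
#include <stdbool.h>

/* One byte (typically) per flag.  */
struct flags_as_bool
{
  bool nowait, ordered, untied, mergeable;
  bool inbranch, notinbranch, defaultmap, nogroup;
  bool sched_simd, sched_monotonic, sched_nonmonotonic;
  bool simd, threads, depend_source, order_concurrent, capture;
};

/* The same sixteen flags packed into single-bit bitfields.  */
struct flags_as_bits
{
  unsigned nowait : 1, ordered : 1, untied : 1, mergeable : 1;
  unsigned inbranch : 1, notinbranch : 1, defaultmap : 1, nogroup : 1;
  unsigned sched_simd : 1, sched_monotonic : 1, sched_nonmonotonic : 1;
  unsigned simd : 1, threads : 1, depend_source : 1, order_concurrent : 1;
  unsigned capture : 1;
};
```

Reads and writes keep the same syntax (for example f.untied = 1), so callers would not need to change; only taking the address of a flag stops working.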

Jakub

Re: Help with PR97872

2020-12-09 Thread Prathamesh Kulkarni via Gcc-patches

On Tue, 8 Dec 2020 at 14:36, Prathamesh Kulkarni
 wrote:
>
> On Mon, 7 Dec 2020 at 17:37, Hongtao Liu  wrote:
> >
> > On Mon, Dec 7, 2020 at 7:11 PM Prathamesh Kulkarni
> >  wrote:
> > >
> > > On Mon, 7 Dec 2020 at 16:15, Hongtao Liu  wrote:
> > > >
> > > > On Mon, Dec 7, 2020 at 5:47 PM Richard Biener  wrote:
> > > > >
> > > > > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
> > > > >
> > > > > > On Mon, 7 Dec 2020 at 13:01, Richard Biener  
> > > > > > wrote:
> > > > > > >
> > > > > > > On Mon, 7 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > >
> > > > > > > > On Fri, 4 Dec 2020 at 17:18, Richard Biener  
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > On Fri, 4 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > >
> > > > > > > > > > On Thu, 3 Dec 2020 at 16:35, Richard Biener 
> > > > > > > > > >  wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Thu, 3 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > > > >
> > > > > > > > > > > > On Tue, 1 Dec 2020 at 16:39, Richard Biener 
> > > > > > > > > > > >  wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, 1 Dec 2020, Prathamesh Kulkarni wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi,
> > > > > > > > > > > > > > For the test mentioned in PR, I was trying to see 
> > > > > > > > > > > > > > if we could do
> > > > > > > > > > > > > > specialized expansion for vcond in target when 
> > > > > > > > > > > > > > operands are -1 and 0.
> > > > > > > > > > > > > > arm_expand_vcond gets the following operands:
> > > > > > > > > > > > > > (reg:V8QI 113 [ _2 ])
> > > > > > > > > > > > > > (reg:V8QI 117)
> > > > > > > > > > > > > > (reg:V8QI 118)
> > > > > > > > > > > > > > (lt (reg/v:V8QI 115 [ a ])
> > > > > > > > > > > > > > (reg/v:V8QI 116 [ b ]))
> > > > > > > > > > > > > > (reg/v:V8QI 115 [ a ])
> > > > > > > > > > > > > > (reg/v:V8QI 116 [ b ])
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > where r117 and r118 are set to vector constants -1 
> > > > > > > > > > > > > > and 0 respectively.
> > > > > > > > > > > > > > However, I am not sure if there's a way to check if 
> > > > > > > > > > > > > > the register is
> > > > > > > > > > > > > > constant during expansion time (since we don't have 
> > > > > > > > > > > > > > df analysis yet) ?
> > > >
> > > > It seems to me that all you need to do is relax the predicates of op1
> > > > and op2 in vcondmn to accept const0_rtx and constm1_rtx. I haven't
> > > > debugged it, but I see that vcondmn in neon.md only accepts
> > > > s_register_operand.
> > > >
> > > > (define_expand "vcond"
> > > >   [(set (match_operand:VDQW 0 "s_register_operand")
> > > > (if_then_else:VDQW
> > > >   (match_operator 3 "comparison_operator"
> > > > [(match_operand:VDQW 4 "s_register_operand")
> > > >  (match_operand:VDQW 5 "reg_or_zero_operand")])
> > > >   (match_operand:VDQW 1 "s_register_operand")
> > > >   (match_operand:VDQW 2 "s_register_operand")))]
> > > >   "TARGET_NEON && (! || flag_unsafe_math_optimizations)"
> > > > {
> > > >   arm_expand_vcond (operands, mode);
> > > >   DONE;
> > > > })
> > > >
> > > > in sse.md it's defined as
> > > > (define_expand "vcondu"
> > > >   [(set (match_operand:V_512 0 "register_operand")
> > > > (if_then_else:V_512
> > > >   (match_operator 3 ""
> > > > [(match_operand:VI_AVX512BW 4 "nonimmediate_operand")
> > > >  (match_operand:VI_AVX512BW 5 "nonimmediate_operand")])
> > > >   (match_operand:V_512 1 "general_operand")
> > > >   (match_operand:V_512 2 "general_operand")))]
> > > >   "TARGET_AVX512F
> > > >&& (GET_MODE_NUNITS (mode)
> > > >== GET_MODE_NUNITS (mode))"
> > > > {
> > > >   bool ok = ix86_expand_int_vcond (operands);
> > > >   gcc_assert (ok);
> > > >   DONE;
> > > > })
> > > >
> > > > then we can get operands[1] and operands[2] as
> > > >
> > > > (gdb) p debug_rtx (operands[1])
> > > >  (const_vector:V16QI [
> > > > (const_int -1 [0x]) repeated x16
> > > > ])
> > > > (gdb) p debug_rtx (operands[2])
> > > > (reg:V16QI 82 [ _2 ])
> > > > (const_vector:V16QI [
> > > > (const_int 0 [0]) repeated x16
> > > > ])
> > > Hi Hongtao,
> > > Thanks for the suggestions!
> > > However IIUC from vector extensions doc page, the result of vector
> > > comparison is defined to be 0
> > > or -1, so would it be better to canonicalize
> > > x cmp y ? -1 : 0 to x cmp y, on GIMPLE itself during gimple-isel and
> > > adjust targets if required ?
> >
> > Yes, it would be more straightforward to handle it in gimple isel, I
> > would adjust the backend and testcase after you check in the patch.
> Thanks! I have committed the attached patch in
> 3a6e3ad38a17a03ee0139b49a0946e7b9ded1eb1.
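
For reference, the 0 / -1 result of a vector comparison mentioned in the quoted discussion can be seen with a minimal C sketch. It relies on the GNU C vector extension (accepted by GCC and Clang, not ISO C); the type name v8qi and the helper lanes_less_than are chosen for the example only.

```c
/* GNU C vector extension: eight signed 8-bit lanes.  */
typedef signed char v8qi __attribute__ ((vector_size (8)));

/* Each lane of the result is -1 where a < b and 0 elsewhere, so
   selecting between -1 and 0 on top of the comparison is redundant.  */
static v8qi
lanes_less_than (v8qi a, v8qi b)
{
  return a < b;
}
```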
Hi,
I was looking at similar issue in PR97903 and wondering what will be the
right approach to lower (a & b) != 0 to vtst ?
For test-case:
#include 

int8x8_t f1(int8x8_t a, int8x8_t b) {
  return (a & b

Re: [Patch] Fortran: Add 'omp scan' support of OpenMP 5.0

2020-12-09 Thread Thomas Schwinge

Hi!

On 2020-12-09T12:06:21+0100, Tobias Burnus  wrote:
> On 08.12.20 13:30, Jakub Jelinek wrote:
>> On Tue, Dec 08, 2020 at 01:13:07PM +0100, Tobias Burnus wrote:
>>> +case EXEC_OMP_SCAN:
>>> +  /* Flag is only used to checking, hence, it is unset afterwards.  */
>>> +  if (!code->ext.omp_clauses->if_present)
>> Isn't if_present used also for OpenACC?  Then can't it with -fopenmp
>> -fopenacc allow
>> !$acc ... if_present...
>> !$omp scan inclusive(...)
>> !$acc end ...
>> ?

Yeah, that re-purposing of 'if_present' made me raise an eyebrow, too.

> !$acc ends up in a different ST_OMP_/EXEC_OMP_; additionally, due to the
> tight restrictions imposed by 'inscan'/'omp scan' adding something
> inbetween is difficult. (It can be added in 'block ... end block' but it
> still does not make much sense for 'omp scan' and it still ends up in a
> different statement.)

I'm confirming that it seems to work (that is, doesn't seem to cause any
obvious interference); OK to verify/document that as in the attached
"Add 'gfortran.dg/goacc-gomp/omp-scan-1-if_present.f90'"?

Regarding my comment "As long as '!$omp scan' inside '!$acc host_data'
(generally, all constructs using 'if_present') reliably results in a
compile-time error, [...]": do we need a more elaborate testcase for
that, like not directly nesting '!$omp scan' inside '!$acc host_data'?

... or, well, just implement this '!$omp scan' checking differently?
;-)


Grüße
 Thomas


>From ceff1cdbd0fae3884efe9558704644add4ac9257 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Wed, 9 Dec 2020 12:33:12 +0100
Subject: [PATCH] Add 'gfortran.dg/goacc-gomp/omp-scan-1-if_present.f90'

... as a "Variant of '../gomp/scan-1.f90', checking
'code->ext.omp_clauses->if_present' implementation detail".

	gcc/testsuite/
	* gfortran.dg/goacc-gomp/omp-scan-1-if_present.f90: New.
	* gfortran.dg/gomp/scan-1.f90: Refer.
	gcc/fortran/
	* openmp.c (gfc_resolve_omp_do_blocks, gfc_resolve_omp_directive):
	Refer 'gfortran.dg/goacc-gomp/omp-scan-1-if_present.f90'.
---
 gcc/fortran/openmp.c  |  6 ++-
 .../goacc-gomp/omp-scan-1-if_present.f90  | 38 +++
 gcc/testsuite/gfortran.dg/gomp/scan-1.f90 |  2 +
 3 files changed, 44 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/goacc-gomp/omp-scan-1-if_present.f90

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index b1f009785e3..8010ce91478 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -6222,7 +6222,8 @@ gfc_resolve_omp_do_blocks (gfc_code *code, gfc_namespace *ns)
 	gfc_error ("With INSCAN at %L, expected loop body with !$OMP SCAN "
 		   "between two structured-block-sequences", loc);
 	  else
-	/* Mark as checked; flag will be unset later.  */
+	/* Mark as checked; flag will be unset later.
+	   (See 'gfortran.dg/goacc-gomp/omp-scan-1-if_present.f90'.)  */
 	c->block->next->next->ext.omp_clauses->if_present = true;
 	}
 }
@@ -7123,7 +7124,8 @@ gfc_resolve_omp_directive (gfc_code *code, gfc_namespace *ns)
 		   "except when omp_sync_hint_none is used", &code->loc);
   break;
 case EXEC_OMP_SCAN:
-  /* Flag is only used to checking, hence, it is unset afterwards.  */
+  /* Flag is only used to checking, hence, it is unset afterwards.
+	 See 'gfortran.dg/goacc-gomp/omp-scan-1-if_present.f90'.  */
   if (!code->ext.omp_clauses->if_present)
 	gfc_error ("Unexpected !$OMP SCAN at %L outside loop construct with "
 		   "% REDUCTION clause", &code->loc);
diff --git a/gcc/testsuite/gfortran.dg/goacc-gomp/omp-scan-1-if_present.f90 b/gcc/testsuite/gfortran.dg/goacc-gomp/omp-scan-1-if_present.f90
new file mode 100644
index 000..08b21e0cbf5
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc-gomp/omp-scan-1-if_present.f90
@@ -0,0 +1,38 @@
+! Variant of '../gomp/scan-1.f90', checking 'code->ext.omp_clauses->if_present' implementation detail.
+
+! As long as '!$omp scan' inside '!$acc host_data' (generally, all constructs
+! using 'if_present') reliably results in a compile-time error, we don't have
+! to worry about the case where the checking done for an '!$omp scan' might
+! inadvertently alter 'if_present' for an outer '!$acc host_data'.
+
+module m
+  integer a, b
+end module m
+
+subroutine f1
+  use m
+  !$acc host_data if_present
+  !$omp scan inclusive (a)  ! { dg-error "Unexpected ..OMP SCAN at .1. outside loop construct with 'inscan' REDUCTION clause" }
+  ! { dg-error "The ..OMP SCAN directive cannot be specified within a ..ACC HOST_DATA region" "" { target *-*-* } .-1 }
+  !$omp scan exclusive (b)  ! { dg-error "Unexpected ..OMP SCAN at .1. outside loop construct with 'inscan' REDUCTION clause" }
+  ! { dg-error "The ..OMP SCAN directive cannot be specified within a ..ACC HOST_DATA region" "" { t

Re: [Patch] OpenMP: C/C++ parse 'omp allocate'

2020-12-09 Thread Tobias Burnus


On 08.12.20 18:56, Jakub Jelinek wrote:

On Mon, Nov 23, 2020 at 03:50:33PM +0100, Tobias Burnus wrote:

Given that (at least for C/C++) there is some initial support for
OpenMP 5.0's allocators, it is likely that users will try it.

Sadly at least the current implementation doesn't offer much benefit;
I meant to add e.g. HBM support through dlopening of the memkind library,
but I haven't found a box with hw I could test it on.


I am not sure that there is a big benefit for most of the allocation
options – well, maybe in theory or for cluster OpenMP solutions with
slow interconnect or maybe for those SGI Altix systems with lots of
threads.

The largest effect should be there for offloading and unified-shared
memory systems, where pinning seems to be essential for good performance.


I guess your patch is ok, but I should find time to implement at least
the rest of the restrictions; in particular e.g.:
...
While the patch tests for C that the allocator has the right type, for C++
(for obvious reasons) it isn't checked, so we need the checking there later
from the attributes or so, at least if it is dependent.


For the 'allocate' attribute, it is checked in cp/semantics.c's
finish_omp_clauses; I think it could be done there as well, once the
'sorry' is either removed or moved to cp/semantics.c.


For automatic variables, we likely need to handle it during gimplification,

That's outside of this patch, but I fear what we will need to do for
Fortran with allocatable components.

Ok, thanks


Thanks for the review. Now committed as
r11-5879-gaa0432005f36f6ac51dc9dcecb717fe739d39b88.

Tobias


Re: [Patch] Fortran: Add 'omp scan' support of OpenMP 5.0

2020-12-09 Thread Tobias Burnus


On 08.12.20 13:30, Jakub Jelinek wrote:

On Tue, Dec 08, 2020 at 01:13:07PM +0100, Tobias Burnus wrote:

+if (list == OMP_LIST_REDUCTION)
+  has_inscan = true;

This looks weird, I would have expected
if (list == OMP_LIST_REDUCTION_INSCAN)


That's not only weird, that was plainly wrong. Now fixed and committed
as r11-5856-g005cff4e2ecbd5c4e2ef978fe4842fa3c8c79f47; follow-up fix for
reduction4.f90 committed as
r11-5876-g1cb2d1d5ce178cb68f0bd475299d2e0b25a4a756.


you initially accept !$omp scan everywhere and only later complain if it
is misplaced?  I think e.g. for !$omp section I used to hardcode it in
parse_omp_structured_block - allow it only there and nowhere else:


Hmm, also a good method; I am not sure which one is better – hence, I
did not rewrite this patch. But good to know for the future.


+case EXEC_OMP_SCAN:
+  /* Flag is only used to checking, hence, it is unset afterwards.  */
+  if (!code->ext.omp_clauses->if_present)

Isn't if_present used also for OpenACC?  Then can't it with -fopenmp
-fopenacc allow
!$acc ... if_present...
!$omp scan inclusive(...)
!$acc end ...
?


!$acc ends up in a different ST_OMP_/EXEC_OMP_; additionally, due to the
tight restrictions imposed by 'inscan'/'omp scan' adding something
inbetween is difficult. (It can be added in 'block ... end block' but it
still does not make much sense for 'omp scan' and it still ends up in a
different statement.)


Otherwise LGTM.


Thanks for the review.

Tobias

commit 1cb2d1d5ce178cb68f0bd475299d2e0b25a4a756
Author: Tobias Burnus 
Date:   Wed Dec 9 10:42:49 2020 +0100

gfortran.dg/gomp/reduction4.f90: Fix testcase

Fix to 'omp scan' commit 005cff4e2ecbd5c4e2ef978fe4842fa3c8c79f47

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/reduction4.f90: Update scan-trees, add
lost testcase; move test with FE error to ...
* gfortran.dg/gomp/reduction5.f90: ... here.
---
 gcc/testsuite/gfortran.dg/gomp/reduction4.f90 | 23 +++
 gcc/testsuite/gfortran.dg/gomp/reduction5.f90 | 14 ++
 2 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/gcc/testsuite/gfortran.dg/gomp/reduction4.f90 b/gcc/testsuite/gfortran.dg/gomp/reduction4.f90
index 812be323b2e..2e8aaa2d54c 100644
--- a/gcc/testsuite/gfortran.dg/gomp/reduction4.f90
+++ b/gcc/testsuite/gfortran.dg/gomp/reduction4.f90
@@ -28,11 +28,6 @@ do i=1,10
 end do
 !$omp end parallel
 
-!$omp parallel reduction(inscan,+:a)  ! { dg-error "'inscan' REDUCTION clause on construct other than DO, SIMD, DO SIMD, PARALLEL DO, PARALLEL DO SIMD" }
-do i=1,10
-  a = a + 1
-end do
-!$omp end parallel
 
 !  simd 
 !$omp simd reduction(+:a)
@@ -45,6 +40,11 @@ do i=1,10
   a = a + 1
 end do
 
+!$omp simd reduction(task,+:a)  ! { dg-error "invalid 'task' reduction modifier on construct other than 'parallel', 'do' or 'sections'" }
+do i=1,10
+  a = a + 1
+end do
+
 !  do 
 !$omp parallel
 !$omp do reduction(+:a)
@@ -89,13 +89,6 @@ end do
 !$omp end sections
 !$omp end parallel
 
-!$omp parallel
-!$omp sections reduction(inscan,+:a)   ! { dg-error "'inscan' REDUCTION clause on construct other than DO, SIMD, DO SIMD, PARALLEL DO, PARALLEL DO SIMD" }
-  !$omp section
-  a = a + 1
-!$omp end sections
-!$omp end parallel
-
 !  task 
 !$omp task in_reduction(+:a)
   a = a + 1
@@ -136,13 +129,11 @@ end
 
 ! { dg-final { scan-tree-dump-times "#pragma omp for reduction\\(\\\+:a\\)" 2 "original" } }
 ! { dg-final { scan-tree-dump-times "#pragma omp for reduction\\(task,\\\+:a\\)" 1 "original" } }
-! { dg-final { scan-tree-dump-times "#pragma omp parallel\[\n\r\]" 7 "original" } }
+! { dg-final { scan-tree-dump-times "#pragma omp parallel\[\n\r\]" 6 "original" } }
 ! { dg-final { scan-tree-dump-times "#pragma omp parallel private\\(i\\) reduction\\(\\\+:a\\)" 2 "original" } }
-! { dg-final { scan-tree-dump-times "#pragma omp parallel private\\(i\\) reduction\\(inscan,\\\+:a\\)" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "#pragma omp parallel private\\(i\\) reduction\\(task,\\\+:a\\)" 1 "original" } }
-! { dg-final { scan-tree-dump-times "#pragma omp section\[\n\r\]" 4 "original" } }
+! { dg-final { scan-tree-dump-times "#pragma omp section\[\n\r\]" 3 "original" } }
 ! { dg-final { scan-tree-dump-times "#pragma omp sections reduction\\(\\\+:a\\)" 2 "original" } }
-! { dg-final { scan-tree-dump-times "#pragma omp sections reduction\\(inscan,\\\+:a\\)" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "#pragma omp sections reduction\\(task,\\\+:a\\)" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "#pragma omp simd linear\\(i:1\\) reduction\\(\\\+:a\\)" 2 "original" } }
 ! { dg-final { scan-tree-dump-times "#pragma omp simd

[PATCH] c/98200 - improve error recovery for GIMPLE FE

2020-12-09 Thread Richard Biener

This avoids ICEing by making sure to propagate error early.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2020-12-09  Richard Biener  

PR c/98200
gcc/c/
* gimple-parser.c (c_parser_gimple_postfix_expression): Return
early on error.

* gcc.dg/gimplefe-error-8.c: New testcase.
---
 gcc/c/gimple-parser.c   | 2 ++
 gcc/testsuite/gcc.dg/gimplefe-error-8.c | 9 +
 2 files changed, 11 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/gimplefe-error-8.c

diff --git a/gcc/c/gimple-parser.c b/gcc/c/gimple-parser.c
index 5c0ed826119..473cb900481 100644
--- a/gcc/c/gimple-parser.c
+++ b/gcc/c/gimple-parser.c
@@ -1700,6 +1700,8 @@ c_parser_gimple_postfix_expression (gimple_parser &parser)
   expr.set_error ();
   break;
 }
+  if (expr.value == error_mark_node)
+return expr;
   return c_parser_gimple_postfix_expression_after_primary
 (parser, EXPR_LOC_OR_LOC (expr.value, loc), expr);
 }
diff --git a/gcc/testsuite/gcc.dg/gimplefe-error-8.c 
b/gcc/testsuite/gcc.dg/gimplefe-error-8.c
new file mode 100644
index 000..59e81eb4b32
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gimplefe-error-8.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-fgimple" } */
+
+int __GIMPLE() f(int x, int y)
+{
+  int a;
+  a = (x < y) ? 1 : 2; /* { dg-error "expected" } */
+  return a;
+}
-- 
2.26.2

Re: Go testsuite patch committed: Don't quote quoted parentheses

2020-12-09 Thread Andreas Schwab

This breaks make -C gcc check-go RUNTESTFLAGS="go-test.exp=chan.go":

ERROR: tcl error sourcing 
/opt/gcc/gcc-20201209/gcc/testsuite/go.test/go-test.exp.
ERROR: couldn't compile regular expression pattern: parentheses () not balanced
while executing
"regsub -all "(^|\n)(\[^\n\]+$line\[^\n\]*($pattern)\[^\n\]*\n?)+" $comp_output 
"\n" comp_output"
(procedure "saved-dg-test" line 125)
invoked from within
"saved-dg-test chan.go {  -O } {-fno-show-column  -pedantic-errors }"
("eval" body line 1)
invoked from within
"eval saved-dg-test $args "
(procedure "dg-test" line 1)
invoked from within
"dg-test $test "$flags $flags_t" ${default-extra-flags}"
(procedure "go-dg-runtest" line 24)
invoked from within
"go-dg-runtest $filename "" "-fno-show-column $DEFAULT_GOCFLAGS $opts""
(procedure "errchk" line 83)
invoked from within
"errchk $test """
(procedure "go-gc-tests" line 309)
invoked from within
"go-gc-tests"
(file "/opt/gcc/gcc-20201209/gcc/testsuite/go.test/go-test.exp" line 1217)
invoked from within
"source /opt/gcc/gcc-20201209/gcc/testsuite/go.test/go-test.exp"
("uplevel" body line 1)
invoked from within
"uplevel #0 source /opt/gcc/gcc-20201209/gcc/testsuite/go.test/go-test.exp"
invoked from within
"catch "uplevel #0 source $test_file_name""

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

RE: [PATCH v2 13/16]Arm: Add support for auto-vectorization using HF mode.

2020-12-09 Thread Kyrylo Tkachov via Gcc-patches



> -----Original Message-----
> From: Tamar Christina 
> Sent: 25 September 2020 15:31
> To: gcc-patches@gcc.gnu.org
> Cc: nd ; Ramana Radhakrishnan
> ; Richard Earnshaw
> ; ni...@redhat.com; Kyrylo Tkachov
> 
> Subject: [PATCH v2 13/16]Arm: Add support for auto-vectorization using HF
> mode.
> 
> Hi All,
> 
> This adds support to the auto-vectorizer to support HFmode vectorization for
> AArch32.  This is supported when +fp16 is used.  I wonder if I should disable
> the returning of the type if the option isn't enabled.
> 
> At the moment it will be returned but the vectorizer will try and fail to use
> it.  It wastes a few compile cycles but doesn't result in bad code.
> 
> Bootstrapped Regtested on arm-none-linux-gnueabihf and no issues.
> 
> Ok for master?

Ok.
Thanks,
Kyrill

> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * config/arm/arm.c (arm_preferred_simd_mode): Add E_HFmode.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/vect-half-floats.c: New test.
> 
> --

[PATCH] Fix up testcase.

2020-12-09 Thread Hongtao Liu via Gcc-patches

On Wed, Dec 9, 2020 at 5:22 PM Prathamesh Kulkarni via Gcc-patches
 wrote:
>
> On Wed, 9 Dec 2020 at 00:29, sunil.k.pandey  wrote:
> >
> > On Linux/x86_64,
> >
> > 3a6e3ad38a17a03ee0139b49a0946e7b9ded1eb1 is the first bad commit
> > commit 3a6e3ad38a17a03ee0139b49a0946e7b9ded1eb1
> > Author: Prathamesh Kulkarni 
> > Date:   Tue Dec 8 14:30:04 2020 +0530
> >
> > gimple-isel: Fold x CMP y ? -1 : 0 to x CMP y [PR97872]
> >
> > caused
> >
> > FAIL: gcc.target/i386/pr78102.c scan-assembler-times pcmpeqq 3
> Hi,
> This is a known issue with the patch, and discussed here:
> https://gcc.gnu.org/pipermail/gcc/2020-December/234438.html
> I guess Hongtao will check in a fix for that soon.
>

According to https://uops.info/table.html,
both pcmpeqq and pcmpeqd use only port 1, so I think there's no
performance difference between

vpcmpeqq %xmm1, %xmm0, %xmm0
vpxor %xmm1, %xmm1, %xmm1
vpcmpeqq %xmm1, %xmm0, %xmm0

and

vpcmpeqq %xmm1, %xmm0, %xmm0
vpcmpeqd %xmm1, %xmm1, %xmm1
vpandn %xmm1, %xmm0, %xmm0

So fix up testcase as below.
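
To see why the two sequences give the same result, here is a scalar C model of a single 64-bit lane. This is only a sketch, not GCC code, and the helper names are made up for illustration. Both variants invert the "a == b" mask: the first compares the mask against zero, the second ANDs the complemented mask with all-ones.

```c
#include <stdint.h>

/* One 64-bit lane of pcmpeqq: all-ones if equal, zero otherwise.  */
uint64_t lane_cmpeq(uint64_t a, uint64_t b)
{
  return a == b ? UINT64_MAX : 0;
}

/* First sequence: pcmpeqq, pxor (builds zero), pcmpeqq against zero.  */
uint64_t lane_ne_via_cmpeq_zero(uint64_t a, uint64_t b)
{
  uint64_t mask = lane_cmpeq(a, b);
  uint64_t zero = 0;              /* models vpxor %xmm1, %xmm1, %xmm1 */
  return lane_cmpeq(mask, zero);
}

/* Second sequence: pcmpeqq, pcmpeqd (builds all-ones), pandn.  */
uint64_t lane_ne_via_andn(uint64_t a, uint64_t b)
{
  uint64_t mask = lane_cmpeq(a, b);
  uint64_t ones = UINT64_MAX;     /* models vpcmpeqd %xmm1, %xmm1, %xmm1 */
  return ~mask & ones;            /* models vpandn */
}
```

For any pair of inputs the two helpers return the same lane value, which is why only the instruction mix in the scan pattern needs to change.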

gcc/testsuite

* gcc.target/i386/pr78102.c: Adjust testcase.

1 file changed, 1 insertion(+), 1 deletion(-)
gcc/testsuite/gcc.target/i386/pr78102.c | 2 +-

modified   gcc/testsuite/gcc.target/i386/pr78102.c
@@ -1,7 +1,7 @@
 /* PR target/78102 */
 /* { dg-do compile } */
 /* { dg-options "-O2 -mno-sse4.2 -msse4.1" } */
-/* { dg-final { scan-assembler-times "pcmpeqq" 3 } } */
+/* { dg-final { scan-assembler-times "pcmpeq" 4 } } */

Ok for trunk?



-- 
BR,
Hongtao

Re: [PATCH 1/2] libstdc++: Add --enable-pure-stdio-libstdcxx option

2020-12-09 Thread Jonathan Wakely via Gcc-patches


On 07/12/20 12:36 -0800, Keith Packard wrote:

Jonathan Wakely  writes:


GCC changelog files are autogenerated now, so patches should not touch
them. Just include the ChangeLog entry in the Git commit log (which
will usually end up being quoted in the patch and/or the email body of
the mail to gcc-patches).


Awesome.


I think the right way to do this (or at least, the way that was
intended when basic_file_stdio.cc was added) is to provide a new file
and change GLIBCXX_ENABLE_CSTDIO in acinclude.m4 to use that new file.

The two biggest downsides of that are that it duplicates a lot of the
file (because the diffs for your changes are small) and that the
correct name for your new file is already taken!


I can definitely see a reason to use a separate file when implementing
the basic_file interface on top of something other than stdio, but
this patch doesn't do that -- it only changes the interaction between
basic_file and stdio in a few places.

I think it makes the best long-term sense to leave everything in
basic_file_stdio.cc and avoid having the two implementations diverge in
the future.


However, it's rather late in the GCC 11 process to make a change like
that (even though it's really just renaming some files). Would you be
OK waiting until after GCC 11 is released (in 4-5 months) to do it
"properly"? Is this blocking something that would require doing it
sooner?


This patch enables the use of C++ with picolibc, a libc designed for 32-
and 64-bit embedded systems.

Right now, I'm working on getting picolibc support integrated into
Zephyr, which uses toolchains built by crosstool-ng. I've gotten
picolibc support merged to crosstool-ng, but the Zephyr developers are
interested in having a single toolchain support three different libc
implementations (newlib, newlib-nano and picolibc), but that's blocked
on having C++ support available in all three libraries.

So, you're at the bottom of my current dependency graph :-)

I don't particularly need this released in gcc, but I would like to
get patches reviewed and the general approach agreed on so that I can
feel more confident in preparing patches to be applied to gcc in
crosstool-ng itself.

Once that's done, I'll also be able to release new Debian packages of
GCC for embedded ARM and RISC-V and have those include suitable patches
so that we can support embedded C++ development there too.


OK. In principle, changes to avoid using the POSIX APIs are definitely
fine. I would like to combine your new configure switch with the
existing --enable-cstdio one though.

How about the attached change for acinclude.m4 which would allow you
to do --enable-cstdio=stdio_pure? (It also adds "stdio_posix" as a
more accurate alternative spelling of the current "stdio" option.)
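
As a rough illustration of the difference between the two models, the C sketch below writes a buffer either through FILE* only or through the underlying file descriptor. It is not the libstdc++ implementation: `DEMO_PURE_STDIO` is a hypothetical stand-in for `_GLIBCXX_USE_PURE_STDIO`, and `demo_file_write` is a made-up helper name.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() and write() in the default path */
#include <stdio.h>
#ifndef DEMO_PURE_STDIO
#include <unistd.h>
#endif

/* Write N bytes from BUF to F and return the number of bytes written.  */
size_t demo_file_write(FILE *f, const char *buf, size_t n)
{
#ifdef DEMO_PURE_STDIO
  /* "stdio_pure": only the C stdio API is used.  */
  return fwrite(buf, 1, n, f);
#else
  /* "stdio_posix": flush stdio's buffer, then bypass it via the descriptor.  */
  fflush(f);
  ssize_t r = write(fileno(f), buf, n);
  return r < 0 ? 0 : (size_t) r;
#endif
}
```

Building with `-DDEMO_PURE_STDIO` selects the first branch, which is the only one that would compile against a libc that lacks the POSIX calls.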



diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index fcd9ea3d23a..535ffd8682f 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -2862,24 +2862,30 @@ AC_DEFUN([GLIBCXX_ENABLE_PARALLEL], [
 
 
 dnl
-dnl Check for which I/O library to use:  stdio, or something specific.
+dnl Check for which I/O library to use:  stdio and POSIX, or pure stdio.
 dnl
-dnl Default is stdio.
+dnl Default is stdio_posix.
 dnl
 AC_DEFUN([GLIBCXX_ENABLE_CSTDIO], [
   AC_MSG_CHECKING([for underlying I/O to use])
   GLIBCXX_ENABLE(cstdio,stdio,[[[=PACKAGE]]],
-[use target-specific I/O package], [permit stdio])
+[use target-specific I/O package], [permit stdio|stdio_posix|stdio_pure])
 
-  # Now that libio has been removed, you can have any color you want as long
-  # as it's black.  This is one big no-op until other packages are added, but
-  # showing the framework never hurts.
+  # The only available I/O model is based on stdio, via basic_file_stdio.
+  # The default "stdio" is actually "stdio + POSIX" because it uses fdopen(3)
+  # to get a file descriptor and then uses read(3) and write(3) with it.
+  # The "stdio_pure" model doesn't use fdopen and only uses FILE* for I/O.
   case ${enable_cstdio} in
-stdio)
+stdio*)
   CSTDIO_H=config/io/c_io_stdio.h
   BASIC_FILE_H=config/io/basic_file_stdio.h
   BASIC_FILE_CC=config/io/basic_file_stdio.cc
   AC_MSG_RESULT(stdio)
+
+  if test "x$enable_cstdio" = "xstdio_pure" ; then
+	AC_DEFINE(_GLIBCXX_USE_PURE_STDIO, 1,
+		  [Define to restrict std::__basic_file<> to stdio APIs.])
+  fi
   ;;
   esac

[PATCH] combine: zeroing cost for new copies

2020-12-09 Thread Kewen.Lin via Gcc-patches

Hi,

This patch treats the new pseudo-to-pseudo copies that follow
hard-reg-to-pseudo copies as zero cost.  The justification
is that these new copies come immediately after the
corresponding hard-reg-to-pseudo copy insns, so register
allocation should be able to coalesce them and eliminate
them.

At present these copies follow the normal costing scheme, and
the dump below shows the resulting unexpected combination:

``` dump

Trying 3, 2 -> 13:
3: r119:DI=r132:DI
  REG_DEAD r132:DI
2: r118:DI=r131:DI
  REG_DEAD r131:DI
   13: r128:DI=r118:DI&0x|r119:DI<<0x20
  REG_DEAD r119:DI
  REG_DEAD r118:DI

Failed to match this instruction:
(set (reg:DI 128)
(ior:DI (ashift:DI (reg:DI 132)
(const_int 32 [0x20]))
(reg:DI 131)))
Successfully matched this instruction:
(set (reg/v:DI 119 [ f2 ])
(ashift:DI (reg:DI 132)
(const_int 32 [0x20])))
Successfully matched this instruction:
(set (reg:DI 128)
(ior:DI (reg/v:DI 119 [ f2 ])
(reg:DI 131)))
allowing combination of insns 2, 3 and 13
original costs 4 + 4 + 4 = 12
replacement costs 4 + 4 = 8
deferring deletion of insn with uid = 2.
modifying insn i2 3: r119:DI=r132:DI<<0x20
  REG_DEAD r132:DI
deferring rescan insn with uid = 3.
modifying insn i3 13: r128:DI=r119:DI|r131:DI
  REG_DEAD r131:DI
  REG_DEAD r119:DI
deferring rescan insn with uid = 13.

``` end dump

The original insn 13 already works well as rotldi3_insert_3,
so the shift/or combination isn't better, but the costing
doesn't reflect that.

With this patch, we get below instead:

rejecting combination of insns 2, 3 and 13
original costs 0 + 0 + 4 = 4
replacement costs 4 + 4 = 8


Bootstrapped/regtested on powerpc64le-linux-gnu P9.

Is it reasonable?  Any comments are highly appreciated!

BR,
Kewen
--
gcc/ChangeLog:

* combine.c (new_copies): New static global variable declare/init.
(combine_validate_cost): Consider zero costs from new_copies.
(combine_instructions): Set zero cost for insns in new_copies.
(make_more_copies): Record new pseudo-to-pseudo copies to new_copies.
(rest_of_handle_combine): Call bitmap alloc/free for new_copies.
diff --git a/gcc/combine.c b/gcc/combine.c
index ed1ad45de83..6fb2fa82c3f 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -419,6 +419,10 @@ static struct undobuf undobuf;
 
 static int n_occurrences;
 
+/* Record the newly introduced pseudo-to-pseudo copies in function
+   make_more_copies.  */
+static bitmap new_copies = NULL;
+
 static rtx reg_nonzero_bits_for_combine (const_rtx, scalar_int_mode,
 scalar_int_mode,
 unsigned HOST_WIDE_INT *);
@@ -856,30 +860,38 @@ combine_validate_cost (rtx_insn *i0, rtx_insn *i1, 
rtx_insn *i2, rtx_insn *i3,
   int i0_cost, i1_cost, i2_cost, i3_cost;
   int new_i2_cost, new_i3_cost;
   int old_cost, new_cost;
+  bool i0_cost_ok, i1_cost_ok, i2_cost_ok, i3_cost_ok;
 
   /* Lookup the original insn_costs.  */
   i2_cost = INSN_COST (i2);
   i3_cost = INSN_COST (i3);
+  i2_cost_ok = (i2_cost > 0) || bitmap_bit_p (new_copies, INSN_UID (i2));
+  i3_cost_ok = (i3_cost > 0) || bitmap_bit_p (new_copies, INSN_UID (i3));
 
   if (i1)
 {
   i1_cost = INSN_COST (i1);
+  i1_cost_ok = (i1_cost > 0) || bitmap_bit_p (new_copies, INSN_UID (i1));
   if (i0)
{
  i0_cost = INSN_COST (i0);
- old_cost = (i0_cost > 0 && i1_cost > 0 && i2_cost > 0 && i3_cost > 0
- ? i0_cost + i1_cost + i2_cost + i3_cost : 0);
+ i0_cost_ok = (i0_cost > 0)
+  || bitmap_bit_p (new_copies, INSN_UID (i0));
+ old_cost = (i0_cost_ok && i1_cost_ok && i2_cost_ok && i3_cost_ok
+   ? i0_cost + i1_cost + i2_cost + i3_cost
+   : 0);
}
   else
{
- old_cost = (i1_cost > 0 && i2_cost > 0 && i3_cost > 0
- ? i1_cost + i2_cost + i3_cost : 0);
+ old_cost = (i1_cost_ok && i2_cost_ok && i3_cost_ok
+   ? i1_cost + i2_cost + i3_cost
+   : 0);
  i0_cost = 0;
}
 }
   else
 {
-  old_cost = (i2_cost > 0 && i3_cost > 0) ? i2_cost + i3_cost : 0;
+  old_cost = (i2_cost_ok && i3_cost_ok) ? i2_cost + i3_cost : 0;
   i1_cost = i0_cost = 0;
 }
 
@@ -1233,7 +1245,12 @@ combine_instructions (rtx_insn *f, unsigned int nregs)
insn);
 
/* Record the current insn_cost of this instruction.  */
-   INSN_COST (insn) = insn_cost (insn, optimize_this_for_speed_p);
+   if (bitmap_bit_p (new_copies, INSN_UID (insn)))
+ /* Newly added pseudo-to-pseudo copies should not take any
+costs since they should be able to be coalesced.  */
+ INSN_COST (insn) = 0;
+   else
+ INSN_COST (insn) = insn_cost (insn, optimize_this_for_speed_p);

[PATCH] Add -Wtsan.

2020-12-09 Thread Martin Liška


Hello.

The newly added warning tells the user
that std::atomic_thread_fence is not supported by TSAN.
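
For illustration, here is the kind of code the warning is aimed at, written as a C11 sketch (the function and variable names are made up). It assumes that C11 atomic_thread_fence reaches the same builtin as std::atomic_thread_fence, so building it with -fsanitize=thread would be expected to trigger the diagnostic at each fence.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;
static atomic_bool ready;

/* Producer side: write the data, then publish it behind a release fence.  */
void publish(int value)
{
  payload = value;
  atomic_thread_fence(memory_order_release);
  atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Consumer side: a relaxed load of the flag followed by an acquire fence.  */
bool try_consume(int *out)
{
  if (!atomic_load_explicit(&ready, memory_order_relaxed))
    return false;
  atomic_thread_fence(memory_order_acquire);
  *out = payload;
  return true;
}
```

The synchronization here comes only from the fences, which is exactly what the sanitizer cannot see, hence the risk of false positives mentioned in the warning text.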

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

PR sanitizer/97868
* common.opt: Add new warning -Wtsan.
* doc/invoke.texi: Likewise.
* tsan.c (instrument_builtin_call): Warn users about unsupported
std::atomic_thread_fence.
---
 gcc/common.opt  | 4 
 gcc/doc/invoke.texi | 8 +++-
 gcc/tsan.c  | 6 ++
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 6645539f5e5..6c24c7bbffb 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -842,6 +842,10 @@ Wvector-operation-performance
 Common Var(warn_vector_operation_performance) Warning
 Warn when a vector operation is compiled outside the SIMD.
 
+Wtsan
+Common Var(warn_tsan) Init(1) Warning
+Warn about unsupported features in the ThreadSanitizer.
+
 Xassembler
 Driver Separate
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index f7e8c8b29b0..5bd18c78e99 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -377,7 +377,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wswitch  -Wno-switch-bool  -Wswitch-default  -Wswitch-enum @gol
 -Wno-switch-outside-range  -Wno-switch-unreachable  -Wsync-nand @gol
 -Wsystem-headers  -Wtautological-compare  -Wtrampolines  -Wtrigraphs @gol
--Wtype-limits  -Wundef @gol
+-Wtsan -Wtype-limits  -Wundef @gol
 -Wuninitialized  -Wunknown-pragmas @gol
 -Wunsuffixed-float-constants  -Wunused @gol
 -Wunused-but-set-parameter  -Wunused-but-set-variable @gol
@@ -7951,6 +7951,12 @@ Note that the code above is invalid in C++11.
 
 This warning is enabled by default.
 
+@item -Wtsan
+@opindex Wtsan
+@opindex Wno-tsan
+Warn about unsupported features in the ThreadSanitizer.
+This warning is enabled by default.
+
 @item -Wtype-limits
 @opindex Wtype-limits
 @opindex Wno-type-limits
diff --git a/gcc/tsan.c b/gcc/tsan.c
index 4d6223454b5..be9fabea62a 100644
--- a/gcc/tsan.c
+++ b/gcc/tsan.c
@@ -45,6 +45,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "asan.h"
 #include "builtins.h"
 #include "target.h"
+#include "diagnostic-core.h"
 
 /* Number of instrumented memory accesses in the current function.  */
 
@@ -500,6 +501,11 @@ instrument_builtin_call (gimple_stmt_iterator *gsi)
   continue;
 else
   {
+   if (fcode == BUILT_IN_ATOMIC_THREAD_FENCE)
+ warning_at (gimple_location (stmt), OPT_Wtsan,
+ "%qs is not supported by ThreadSanitizer and may "
+ "lead to false positives", "atomic_thread_fence");
+
tree decl = builtin_decl_implicit (tsan_atomic_table[i].tsan_fcode);
if (decl == NULL_TREE)
  return;
--
2.29.2

Re: [r11-5839 Regression] FAIL: gcc.target/i386/pr78102.c scan-assembler-times pcmpeqq 3 on Linux/x86_64

2020-12-09 Thread Prathamesh Kulkarni via Gcc-patches

On Wed, 9 Dec 2020 at 00:29, sunil.k.pandey  wrote:
>
> On Linux/x86_64,
>
> 3a6e3ad38a17a03ee0139b49a0946e7b9ded1eb1 is the first bad commit
> commit 3a6e3ad38a17a03ee0139b49a0946e7b9ded1eb1
> Author: Prathamesh Kulkarni 
> Date:   Tue Dec 8 14:30:04 2020 +0530
>
> gimple-isel: Fold x CMP y ? -1 : 0 to x CMP y [PR97872]
>
> caused
>
> FAIL: gcc.target/i386/pr78102.c scan-assembler-times pcmpeqq 3
Hi,
This is a known issue with the patch, and discussed here:
https://gcc.gnu.org/pipermail/gcc/2020-December/234438.html
I guess Hongtao will check in a fix for that soon.

Thanks,
Prathamesh
>
> with GCC configured with
>
> ../../gcc/configure 
> --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-5839/usr
>  --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
> --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
> --enable-libmpx x86_64-linux --disable-bootstrap
>
> To reproduce:
>
> $ cd {build_dir}/gcc && make check 
> RUNTESTFLAGS="i386.exp=gcc.target/i386/pr78102.c --target_board='unix{-m32}'"
> $ cd {build_dir}/gcc && make check 
> RUNTESTFLAGS="i386.exp=gcc.target/i386/pr78102.c --target_board='unix{-m32\ 
> -march=cascadelake}'"
> $ cd {build_dir}/gcc && make check 
> RUNTESTFLAGS="i386.exp=gcc.target/i386/pr78102.c --target_board='unix{-m64}'"
> $ cd {build_dir}/gcc && make check 
> RUNTESTFLAGS="i386.exp=gcc.target/i386/pr78102.c --target_board='unix{-m64\ 
> -march=cascadelake}'"
>
> (Please do not reply to this email, for question about this report, contact 
> me at skpgkp2 at gmail dot com)

Re: [PATCH] phiopt: Fix up two_value_replacement BOOLEAN_TYPE handling for Ada [PR98188]

2020-12-09 Thread Richard Biener

On Wed, 9 Dec 2020, Jakub Jelinek wrote:

> On Wed, Dec 09, 2020 at 09:03:36AM +0100, Richard Biener wrote:
> > So maybe do
> > 
> > >if (TREE_CODE (TREE_TYPE (lhs)) == BOOLEAN_TYPE
> >  && TYPE_PRECISION (TREE_TYPE (lhs)) == 1)
> > 
> > and thus rely on get_range_info for Ada?
> 
> So, like this instead (if it passes bootstrap/regtest)?
> Basically, for non-VR_RANGE just use VARYING min/max manually.
> The min + 1 != max check will then do the rest.

Yep, that looks safest.

OK.

Richard.

> 2020-12-09  Jakub Jelinek  
> 
>   PR bootstrap/98188
>   * tree-ssa-phiopt.c (two_value_replacement): Don't special case
>   BOOLEAN_TYPEs for ranges, instead if get_range_info doesn't return
>   VR_RANGE, set min/max to wi::min/max_value.
> 
> --- gcc/tree-ssa-phiopt.c.jj  2020-12-08 15:43:17.399463613 +0100
> +++ gcc/tree-ssa-phiopt.c 2020-12-09 09:39:18.713046374 +0100
> @@ -658,13 +658,13 @@ two_value_replacement (basic_block cond_
>  return false;
>  
>wide_int min, max;
> -  if (TREE_CODE (TREE_TYPE (lhs)) == BOOLEAN_TYPE)
> +  if (get_range_info (lhs, &min, &max) != VR_RANGE)
>  {
> -  min = wi::to_wide (boolean_false_node);
> -  max = wi::to_wide (boolean_true_node);
> +  int prec = TYPE_PRECISION (TREE_TYPE (lhs));
> +  signop sgn = TYPE_SIGN (TREE_TYPE (lhs));
> +  min = wi::min_value (prec, sgn);
> +  max = wi::max_value (prec, sgn);
>  }
> -  else if (get_range_info (lhs, &min, &max) != VR_RANGE)
> -return false;
>if (min + 1 != max
>|| (wi::to_wide (rhs) != min
> && wi::to_wide (rhs) != max))
> 
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

Re: [PATCH] phiopt: Fix up two_value_replacement BOOLEAN_TYPE handling for Ada [PR98188]

2020-12-09 Thread Jakub Jelinek via Gcc-patches

On Wed, Dec 09, 2020 at 09:03:36AM +0100, Richard Biener wrote:
> So maybe do
> 
> >if (TREE_CODE (TREE_TYPE (lhs)) == BOOLEAN_TYPE
>  && TYPE_PRECISION (TREE_TYPE (lhs)) == 1)
> 
> and thus rely on get_range_info for Ada?

So, like this instead (if it passes bootstrap/regtest)?
Basically, for non-VR_RANGE just use VARYING min/max manually.
The min + 1 != max check will then do the rest.
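
A small C model of that reasoning, as a sketch only (the struct and helper names are hypothetical, and precisions are limited to 1..62 bits to stay within int64_t):

```c
#include <stdbool.h>
#include <stdint.h>

struct value_range {
  int64_t min, max;
};

/* Full ("varying") range of an integer type with PREC bits.  */
struct value_range varying_range(unsigned prec, bool is_signed)
{
  struct value_range r;
  if (is_signed) {
    r.min = -((int64_t) 1 << (prec - 1));
    r.max = ((int64_t) 1 << (prec - 1)) - 1;
  } else {
    r.min = 0;
    r.max = ((int64_t) 1 << prec) - 1;
  }
  return r;
}

/* The transform only applies when the range holds exactly two values.  */
bool has_two_values(struct value_range r)
{
  return r.min + 1 == r.max;
}
```

A 1-bit boolean yields [0, 1] (or [-1, 0] when signed) and passes the check, while an 8-bit Ada boolean without range info yields [0, 255] and is left alone.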

2020-12-09  Jakub Jelinek  

PR bootstrap/98188
* tree-ssa-phiopt.c (two_value_replacement): Don't special case
BOOLEAN_TYPEs for ranges, instead if get_range_info doesn't return
VR_RANGE, set min/max to wi::min/max_value.

--- gcc/tree-ssa-phiopt.c.jj2020-12-08 15:43:17.399463613 +0100
+++ gcc/tree-ssa-phiopt.c   2020-12-09 09:39:18.713046374 +0100
@@ -658,13 +658,13 @@ two_value_replacement (basic_block cond_
 return false;
 
   wide_int min, max;
-  if (TREE_CODE (TREE_TYPE (lhs)) == BOOLEAN_TYPE)
+  if (get_range_info (lhs, &min, &max) != VR_RANGE)
 {
-  min = wi::to_wide (boolean_false_node);
-  max = wi::to_wide (boolean_true_node);
+  int prec = TYPE_PRECISION (TREE_TYPE (lhs));
+  signop sgn = TYPE_SIGN (TREE_TYPE (lhs));
+  min = wi::min_value (prec, sgn);
+  max = wi::max_value (prec, sgn);
 }
-  else if (get_range_info (lhs, &min, &max) != VR_RANGE)
-return false;
   if (min + 1 != max
   || (wi::to_wide (rhs) != min
  && wi::to_wide (rhs) != max))


Jakub

Re: How to traverse all the local variables that declared in the current routine?

2020-12-09 Thread Richard Biener via Gcc-patches

On Tue, Dec 8, 2020 at 8:54 PM Qing Zhao  wrote:
>
>
>
> On Dec 8, 2020, at 1:40 AM, Richard Biener  wrote:
>
> On Mon, Dec 7, 2020 at 5:20 PM Qing Zhao  wrote:
>
>
>
>
> On Dec 7, 2020, at 1:12 AM, Richard Biener  wrote:
>
> On Fri, Dec 4, 2020 at 5:19 PM Qing Zhao  wrote:
>
>
>
>
> On Dec 4, 2020, at 2:50 AM, Richard Biener  wrote:
>
> On Thu, Dec 3, 2020 at 6:33 PM Richard Sandiford
>  wrote:
>
>
> Richard Biener via Gcc-patches  writes:
>
> On Tue, Nov 24, 2020 at 4:47 PM Qing Zhao  wrote:
>
> Another issue is, in order to check whether an auto-variable has initializer, 
> I plan to add a new bit in “decl_common” as:
> /* In a VAR_DECL, this is DECL_IS_INITIALIZED.  */
> unsigned decl_is_initialized :1;
>
> /* IN VAR_DECL, set when the decl is initialized at the declaration.  */
> #define DECL_IS_INITIALIZED(NODE) \
> (DECL_COMMON_CHECK (NODE)->decl_common.decl_is_initialized)
>
> set this bit when setting DECL_INITIAL for the variables in FE. then keep it
> even though DECL_INITIAL might be NULLed.
>
>
> For locals it would be more reliable to set this flag during gimplification.
>
> Do you have any comment and suggestions?
>
>
> As said above - do you want to cover registers as well as locals?  I'd do
> the actual zeroing during RTL expansion instead since otherwise you
> have to figure out yourself whether a local is actually used (see 
> expand_stack_vars)
>
> Note that optimization will already have made use of the "uninitialized" state
> of locals so depending on what the actual goal is here "late" may be too late.
>
>
> Haven't thought about this much, so it might be a daft idea, but would a
> compromise be to use a const internal function:
>
> X1 = .DEFERRED_INIT (X0, INIT)
>
> where the X0 argument is an uninitialised value and the INIT argument
> describes the initialisation pattern?  So for a decl we'd have:
>
> X = .DEFERRED_INIT (X, INIT)
>
> and for an SSA name we'd have:
>
> X_2 = .DEFERRED_INIT (X_1(D), INIT)
>
> with all other uses of X_1(D) being replaced by X_2.  The idea is that:
>
> * Having the X0 argument would keep the uninitialised use of the
> variable around for the later warning passes.
>
> * Using a const function should still allow the UB to be deleted as dead
> if X1 isn't needed.
>
> * Having a function in the way should stop passes from taking advantage
> of direct uninitialised uses for optimisation.
>
> This means we won't be able to optimise based on the actual init
> value at the gimple level, but that seems like a fair trade-off.
> AIUI this is really a security feature or anti-UB hardening feature
> (in the sense that users are more likely to see predictable behaviour
> “in the field” even if the program has UB).
>
>
> The question is whether it's in line with people's expectations that
> explicitly zero-initialized code behaves differently from
> implicitly zero-initialized code with respect to optimization
> and secondary side-effects (late diagnostics, latent bugs, etc.).
>
> Introducing a new concept like .DEFERRED_INIT is much more
> heavy-weight than an explicit zero initializer.
>
>
> What exactly do you mean by “heavy-weight”? More difficult to implement, much
> more run-time overhead, or both? Or something else?
>
> The major benefit of the “.DEFERRED_INIT” approach is that it lets us keep
> the current -Wuninitialized analysis untouched and also pass
> the “uninitialized” info from source code level to “pass_expand”.
>
>
> Well, "untouched" is a bit oversimplified.  You do need to handle
> .DEFERRED_INIT as not
> being an initialization which will definitely get interesting.
>
>
> Yes, during uninitialized variable analysis pass, we should specially handle 
> the defs with “.DEFERRED_INIT”, to treat them as uninitializations.
>
> If we want to keep the current -Wuninitialized analysis untouched, this is a 
> quite reasonable approach.
>
> However, if it’s not required to keep the current -Wuninitialized analysis
> untouched, adding a zero-initializer directly during gimplification should
> be much easier and simpler, and also have smaller run-time overhead.
>
>
> As for optimization I fear you'll get a load of redundant zero-init
> actually emitted if you can just rely on RTL DSE/DCE to remove it.
>
>
> Runtime overhead for -fauto-init=zero is one important consideration for the
> whole feature; we should minimize the runtime overhead of zero
> initialization since it will be used in production builds.
> We can do some run-time performance evaluation when we have an implementation
> ready.
>
>
> Note there will be other passes "confused" by .DEFERRED_INIT.  Note
> that there's going to be other
> considerations - namely where to emit the .DEFERRED_INIT - when
> emitting it during gimplification
> you can emit it at the start of the block of block-scope variables.
> When emitting after gimplification
> you have to emit at function start which will probably make stack slot
> sharing inefficient because
> the deferred init will cause overlapping lifetimes.  With emitting

Re: [PATCH] phiopt: Fix up two_value_replacement BOOLEAN_TYPE handling for Ada [PR98188]

2020-12-09 Thread Eric Botcazou

> I'm sure:
>   /* In Ada, we use an unsigned 8-bit type for the default boolean type.  */
> boolean_type_node = make_unsigned_type (8);
>   TREE_SET_CODE (boolean_type_node, BOOLEAN_TYPE);
>   SET_TYPE_RM_MAX_VALUE (boolean_type_node,
>  build_int_cst (boolean_type_node, 1));

Richard is correct, this is the RM maximum value, not the GCC maximum value.
All Ada integral types have maximum GCC bounds for their precision like in C.

-- 
Eric Botcazou

Re: [PATCH] phiopt: Fix up two_value_replacement BOOLEAN_TYPE handling for Ada [PR98188]

2020-12-09 Thread Richard Biener

On Wed, 9 Dec 2020, Jakub Jelinek wrote:

> On Wed, Dec 09, 2020 at 09:03:36AM +0100, Richard Biener wrote:
> > > For Ada with LTO, boolean_{false,true}_node can be 1-bit precision 
> > > boolean,
> > > while TREE_TYPE (lhs) can be 8-bit precision boolean and thus we can end 
> > > up
> > > with wide_int mismatches.
> > > 
> > > The following patch fixes it by using TYPE_{MIN,MAX}_VALUE instead.
> > 
> > Are you sure the Ada boolean types have 1/0 as MIN/MAX value?  ISTR
> > that's not the case as the middle-end has to support out-of-bound
> > values.  Now, this might mean using MIN/MAX value is even required
> > since the transform cannot assume 0/[-]1 here?
> > 
> > So maybe do
> > 
> > >if (TREE_CODE (TREE_TYPE (lhs)) == BOOLEAN_TYPE
> >  && TYPE_PRECISION (TREE_TYPE (lhs)) == 1)
> > 
> > and thus rely on get_range_info for Ada?
> 
> I'm sure:
>   /* In Ada, we use an unsigned 8-bit type for the default boolean type.  */
>   boolean_type_node = make_unsigned_type (8);
>   TREE_SET_CODE (boolean_type_node, BOOLEAN_TYPE);
>   SET_TYPE_RM_MAX_VALUE (boolean_type_node,
>  build_int_cst (boolean_type_node, 1));
>  ^^^ Here

But

/* For numerical types, this is the RM upper bound of the type.  There is
   again a discrepancy between this upper bound and the GCC upper bound,
   again because of the need to support invalid values.

   These values can be outside the range of values allowed by the RM upper
   bound but they must nevertheless be valid in the GCC type system, otherwise
   the optimizer can pretend that they simply don't exist.  Therefore they
   must be within the range of values allowed by the upper bound in the GCC
   sense, hence the GCC upper bound be set to that of the base type.

   This upper bound is translated directly without the adjustments that may
   be required for type compatibility, so it will generally be necessary to
   convert it to the base type of the numerical type before using it.  */
#define TYPE_RM_MAX_VALUE(NODE) TYPE_RM_VALUE ((NODE), 2)
#define SET_TYPE_RM_MAX_VALUE(NODE, X) SET_TYPE_RM_VALUE ((NODE), 2, (X))

and

#define SET_TYPE_RM_VALUE(NODE, N, X)  \
do {   \
  tree tmp = (X);  \
  if (!TYPE_RM_VALUES (NODE))  \
TYPE_RM_VALUES (NODE) = make_tree_vec (3); \
  /* ??? The field is not visited by the generic   \
 code so we need to mark it manually.  */  \
  MARK_VISITED (tmp);  \
  TREE_VEC_ELT (TYPE_RM_VALUES (NODE), (N)) = tmp; \
} while (0)

>   SET_TYPE_RM_SIZE (boolean_type_node, bitsize_int (1));
>   boolean_true_node = TYPE_MAX_VALUE (boolean_type_node);
>   boolean_false_node = TYPE_MIN_VALUE (boolean_type_node);
> 
> For build_nonstandard_boolean_type that is not the case and
> TYPE_MAX_VALUE is -1 rather than 0, but I've checked all uses
> of that function and it always just creates an element type
> for VECTOR_TYPEs, so I think it shouldn't affect code that
> looks at BOOLEAN_TYPE non-VECTOR_TYPEs.
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

Re: [PATCH] fold-const: Fix native_encode_initializer bitfield handling [PR98199]

2020-12-09 Thread Richard Biener

On Wed, 9 Dec 2020, Jakub Jelinek wrote:

> Hi!
> 
> With the bit_cast changes, I have added support for bitfields which don't
> have scalar representatives.  For bit_cast it works fine, as when mask
> is non-NULL, off is asserted to be 0.  But when native_encode_initializer
> is called e.g. from sccvn with off > 0 (i.e. we are interested in encoding
> just a few bytes out of it somewhere from the middle or at the end), the
> following computations are incorrect.
> pos is a byte position from the start of the constructor, repr_size is the
> size in bytes of the bit-field representative and len is the length
> of the buffer.  If the buffer is offset by a positive off, those numbers
> are not comparable though; we need to add off to len to make both
> count bytes from the start of the constructor, and o is a utility temporary
> set to off != -1 ? off : 0 (because off -1 also means start at offset 0
> and just force special behavior).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2020-12-09  Jakub Jelinek  
> 
>   PR tree-optimization/98199
>   * fold-const.c (native_encode_initializer): Fix handling bit-fields
>   when off > 0.
> 
>   * gcc.c-torture/compile/pr98199.c: New test.
> 
> --- gcc/fold-const.c.jj   2020-12-08 12:42:53.0 +0100
> +++ gcc/fold-const.c  2020-12-08 15:26:43.055445045 +0100
> @@ -8320,11 +8320,11 @@ native_encode_initializer (tree init, un
>                   return 0;
>                 HOST_WIDE_INT repr_size = int_size_in_bytes (repr_type);
>                 gcc_assert (repr_size > 0 && repr_size <= len);
> -               if (pos + repr_size <= len)
> +               if (pos + repr_size <= o + len)
>                   rpos = pos;
>                 else
>                   {
> -                   rpos = len - repr_size;
> +                   rpos = o + len - repr_size;
>                     gcc_assert (rpos <= pos);
>                   }
>               }
> --- gcc/testsuite/gcc.c-torture/compile/pr98199.c.jj  2020-12-08 15:37:15.082465022 +0100
> +++ gcc/testsuite/gcc.c-torture/compile/pr98199.c 2020-12-08 15:36:51.032730617 +0100
> @@ -0,0 +1,7 @@
> +/* PR tree-optimization/98199 */
> +
> +struct A { long a; short d; int c, f, e, g; };
> +struct B { int a, i; short j; struct A k; signed : 20; int e, g; } __attribute__((packed));
> +struct C { short a; unsigned i, k; struct B d; const int : 30; signed e : 20; signed : 18; };
> +const struct C l = { 1, 6, 0, {}, 0 };
> +int foo (void) { return l.e || 0; }
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

Re: [PATCH] phiopt: Fix up two_value_replacement BOOLEAN_TYPE handling for Ada [PR98188]

2020-12-09 Thread Jakub Jelinek via Gcc-patches

On Wed, Dec 09, 2020 at 09:03:36AM +0100, Richard Biener wrote:
> > For Ada with LTO, boolean_{false,true}_node can be 1-bit precision boolean,
> > while TREE_TYPE (lhs) can be 8-bit precision boolean and thus we can end up
> > with wide_int mismatches.
> > 
> > The following patch fixes it by using TYPE_{MIN,MAX}_VALUE instead.
> 
> Are you sure the Ada boolean types have 1/0 as MIN/MAX value?  ISTR
> that's not the case as the middle-end has to support out-of-bound
> values.  Now, this might mean using MIN/MAX value is even required
> since the transform cannot assume 0/[-]1 here?
> 
> So maybe do
> 
> >    if (TREE_CODE (TREE_TYPE (lhs)) == BOOLEAN_TYPE
>        && TYPE_PRECISION (TREE_TYPE (lhs)) == 1)
> 
> and thus rely on get_range_info for Ada?

I'm sure:
  /* In Ada, we use an unsigned 8-bit type for the default boolean type.  */
  boolean_type_node = make_unsigned_type (8);
  TREE_SET_CODE (boolean_type_node, BOOLEAN_TYPE);
  SET_TYPE_RM_MAX_VALUE (boolean_type_node,
                         build_int_cst (boolean_type_node, 1));
                                                           ^^^ Here
  SET_TYPE_RM_SIZE (boolean_type_node, bitsize_int (1));
  boolean_true_node = TYPE_MAX_VALUE (boolean_type_node);
  boolean_false_node = TYPE_MIN_VALUE (boolean_type_node);

For build_nonstandard_boolean_type that is not the case and
TYPE_MAX_VALUE is -1 rather than 0, but I've checked all uses
of that function and it always just creates an element type
for VECTOR_TYPEs, so I think it shouldn't affect code that
looks at BOOLEAN_TYPE non-VECTOR_TYPEs.
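
As a rough illustration of why taking the bounds from TREE_TYPE (lhs)
avoids the mismatch, here is a small self-contained model.  struct wint,
struct bool_type, make_bool_type and same_precision are hypothetical
stand-ins, not GCC APIs; the only facts taken from this thread are the
8-bit Ada boolean with maximum 1 and the 1-bit precision that
boolean_{false,true}_node can have with LTO.

```c
#include <assert.h>

/* Hypothetical stand-in for a wide_int: a value tagged with its precision.  */
struct wint { unsigned precision; long value; };

/* Hypothetical stand-in for a BOOLEAN_TYPE with its min/max constants.  */
struct bool_type { unsigned precision; struct wint min, max; };

/* Build a boolean type whose bounds are 0 and 1 in the type's own
   precision (an assumption matching the Ada setup quoted above).  */
static struct bool_type
make_bool_type (unsigned precision)
{
  struct bool_type t;
  t.precision = precision;
  t.min.precision = precision;
  t.min.value = 0;
  t.max.precision = precision;
  t.max.value = 1;
  return t;
}

/* Return nonzero if two values share a precision and can therefore be
   compared without a mismatch in this model.  */
static int
same_precision (struct wint a, struct wint b)
{
  return a.precision == b.precision;
}
```

In this model, a value of the 8-bit type does not share a precision with
the bounds of a 1-bit boolean, whereas the bounds of its own type always
match, which is the property the patch relies on.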

Jakub

Re: [PATCH] phiopt: Fix up two_value_replacement BOOLEAN_TYPE handling for Ada [PR98188]

2020-12-09 Thread Richard Biener

On Wed, 9 Dec 2020, Jakub Jelinek wrote:

> Hi!
> 
> For Ada with LTO, boolean_{false,true}_node can be 1-bit precision boolean,
> while TREE_TYPE (lhs) can be 8-bit precision boolean and thus we can end up
> with wide_int mismatches.
> 
> The following patch fixes it by using TYPE_{MIN,MAX}_VALUE instead.

Are you sure the Ada boolean types have 1/0 as MIN/MAX value?  ISTR
that's not the case as the middle-end has to support out-of-bound
values.  Now, this might mean using MIN/MAX value is even required
since the transform cannot assume 0/[-]1 here?

So maybe do

>    if (TREE_CODE (TREE_TYPE (lhs)) == BOOLEAN_TYPE
       && TYPE_PRECISION (TREE_TYPE (lhs)) == 1)

and thus rely on get_range_info for Ada?

Eric?

Thanks,
Richard.

> Bootstrapped/regtested on x86_64-linux and i686-linux (the former including
> Ada as usual), ok for trunk?
> Or do you prefer your version with wi::zero/wi::one?
> 
> 2020-12-09  Jakub Jelinek  
> 
>   PR bootstrap/98188
>   * tree-ssa-phiopt.c (two_value_replacement): For boolean, set
>   min and max from minimum and maximum of the type.
> 
> --- gcc/tree-ssa-phiopt.c.jj  2020-12-06 10:57:00.142847537 +0100
> +++ gcc/tree-ssa-phiopt.c 2020-12-08 15:00:09.091063392 +0100
> @@ -660,8 +660,8 @@ two_value_replacement (basic_block cond_
>wide_int min, max;
>if (TREE_CODE (TREE_TYPE (lhs)) == BOOLEAN_TYPE)
>  {
> -  min = wi::to_wide (boolean_false_node);
> -  max = wi::to_wide (boolean_true_node);
> +  min = wi::to_wide (TYPE_MIN_VALUE (TREE_TYPE (lhs)));
> +  max = wi::to_wide (TYPE_MAX_VALUE (TREE_TYPE (lhs)));
>  }
>else if (get_range_info (lhs, &min, &max) != VR_RANGE)
>  return false;
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

[PATCH] fold-const: Fix native_encode_initializer bitfield handling [PR98199]

2020-12-09 Thread Jakub Jelinek via Gcc-patches

Hi!

With the bit_cast changes, I have added support for bitfields which don't
have scalar representatives.  For bit_cast it works fine, as when mask
is non-NULL, off is asserted to be 0.  But when native_encode_initializer
is called e.g. from sccvn with off > 0 (i.e. we are interested in encoding
just a few bytes out of it somewhere from the middle or at the end), the
following computations are incorrect.
pos is a byte position from the start of the constructor, repr_size is the
size in bytes of the bit-field representative and len is the length
of the buffer.  If the buffer is offset by a positive off, those numbers
are not comparable though; we need to add off to len to make both
count bytes from the start of the constructor, and o is a utility temporary
set to off != -1 ? off : 0 (because off -1 also means start at offset 0
and just force special behavior).
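
To make the offset arithmetic concrete, here is a minimal stand-alone
sketch of the rpos selection; it is not the GCC code itself.  The names
choose_rpos_old and choose_rpos_new and the use of plain long instead of
HOST_WIDE_INT are assumptions for illustration; only the comparisons
mirror the patch.

```c
#include <assert.h>

/* Model of the comparison before the patch: len is treated as if it
   counted bytes from the start of the constructor (hypothetical helper).  */
static long
choose_rpos_old (long pos, long repr_size, long len)
{
  if (pos + repr_size <= len)
    return pos;
  return len - repr_size;
}

/* Model of the comparison after the patch: o + len is the end of the
   buffer measured from the start of the constructor (hypothetical helper).  */
static long
choose_rpos_new (long pos, long repr_size, long len, long off)
{
  /* off == -1 also means "start at offset 0".  */
  long o = off != -1 ? off : 0;
  long rpos;

  assert (repr_size > 0 && repr_size <= len);
  if (pos + repr_size <= o + len)
    rpos = pos;
  else
    {
      rpos = o + len - repr_size;
      assert (rpos <= pos);
    }
  return rpos;
}
```

With pos = 10, repr_size = 4, len = 8 and off = 8, the buffer covers
constructor bytes 8 to 15.  In this model the old comparison yields
rpos = 4, which lies before the buffer, while the new one keeps
rpos = 10.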

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2020-12-09  Jakub Jelinek  

PR tree-optimization/98199
* fold-const.c (native_encode_initializer): Fix handling bit-fields
when off > 0.

* gcc.c-torture/compile/pr98199.c: New test.

--- gcc/fold-const.c.jj 2020-12-08 12:42:53.0 +0100
+++ gcc/fold-const.c  2020-12-08 15:26:43.055445045 +0100
@@ -8320,11 +8320,11 @@ native_encode_initializer (tree init, un
                  return 0;
                HOST_WIDE_INT repr_size = int_size_in_bytes (repr_type);
                gcc_assert (repr_size > 0 && repr_size <= len);
-               if (pos + repr_size <= len)
+               if (pos + repr_size <= o + len)
                  rpos = pos;
                else
                  {
-                   rpos = len - repr_size;
+                   rpos = o + len - repr_size;
                    gcc_assert (rpos <= pos);
                  }
              }
--- gcc/testsuite/gcc.c-torture/compile/pr98199.c.jj  2020-12-08 15:37:15.082465022 +0100
+++ gcc/testsuite/gcc.c-torture/compile/pr98199.c 2020-12-08 15:36:51.032730617 +0100
@@ -0,0 +1,7 @@
+/* PR tree-optimization/98199 */
+
+struct A { long a; short d; int c, f, e, g; };
+struct B { int a, i; short j; struct A k; signed : 20; int e, g; } __attribute__((packed));
+struct C { short a; unsigned i, k; struct B d; const int : 30; signed e : 20; signed : 18; };
+const struct C l = { 1, 6, 0, {}, 0 };
+int foo (void) { return l.e || 0; }

Jakub
