[PATCH v3] c++: Fix cp_tree_equal for template value args using dependent sizeof/alignof/noexcept expressions

2021-09-13 Thread Barrett Adair via Gcc-patches
I reworked the fix today based on feedback from Jason and Jakub (thank
you), and the subject line is now outdated. I added another test for a
closely related bug that's also fixed here (dependent-expr11.C -- this one
would even fail without the second declaration). All the new tests in the
patch succeed with the change (only two of them succeed with trunk). On my
box, the bootstrap succeeds, the g++ test suite passes (matching today's
posted results anyway), and the libstdc++ test suite is looking good but is
still running after a long time. I'll leave the full "make check" running
overnight.

Some potentially controversial changes here:

1. Adding new bool member to cp_parser. I'd like to avoid this, any tips?
2. Relaxing an assert in tsubst_copy. This change feels correct to me, but
maybe I'm missing something.
3. Pushing a function scope in PARM_DECL case in tsubst_copy_and_build to
make process_outer_var_ref happy for trailing return types. I don't yet
fully appreciate the consequences of these changes, so this needs some eyes.
4. Traversing each template arg's tree in
any_template_arguments_need_structural_equality_p to identify dependent
expressions in trailing return types. This could probably be done better. I
check current_function_decl here as an optimization (since it's NULL in the
only place that "needs" this), but that seems brittle. Also, the new
find_dependent_parm_decl_r callback implementation may have the unintended
consequence of forcing structural comparison on member function trailing
return types that depend on class template parameters. I think I really
only want to force structural comparison for "arg tree has a dependent parm
decl and we're in a trailing return type" -- is there a better way to do
this?

Also note that I found another related bug which I have not yet solved:

template <int i>
struct foo {
  constexpr operator int() { return i; }
};
void bar() {
  [](auto i) -> foo<i> {
    return {};
  }(foo<1>{});
}

With the attached patch, failure occurs at invocation, while trunk fails to
parse the return type. This seems like a step in the right direction, but
we should consider whether such an incomplete fix introduces more issues
than it solves (e.g. unfriendlier error messages, or perhaps something more
sinister).

Thanks,
Barrett
From 0470bdc5b2b4ddff2d2ee9db11a8c5895abda50f Mon Sep 17 00:00:00 2001
From: Barrett Adair 
Date: Fri, 20 Aug 2021 15:37:36 -0500
Subject: [PATCH] Fix trailing return type bugs

---
 gcc/cp/cp-tree.h  |  2 +-
 gcc/cp/parser.c   | 13 -
 gcc/cp/parser.h   |  3 +
 gcc/cp/pt.c   | 58 +--
 gcc/cp/semantics.c|  9 ++-
 gcc/testsuite/g++.dg/template/canon-type-15.C |  7 +++
 gcc/testsuite/g++.dg/template/canon-type-16.C |  6 ++
 gcc/testsuite/g++.dg/template/canon-type-17.C |  5 ++
 gcc/testsuite/g++.dg/template/canon-type-18.C |  6 ++
 .../g++.dg/template/dependent-expr11.C|  6 ++
 .../g++.dg/template/dependent-name15.C| 18 ++
 .../g++.dg/template/dependent-name16.C| 14 +
 12 files changed, 136 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/template/canon-type-15.C
 create mode 100644 gcc/testsuite/g++.dg/template/canon-type-16.C
 create mode 100644 gcc/testsuite/g++.dg/template/canon-type-17.C
 create mode 100644 gcc/testsuite/g++.dg/template/canon-type-18.C
 create mode 100644 gcc/testsuite/g++.dg/template/dependent-expr11.C
 create mode 100644 gcc/testsuite/g++.dg/template/dependent-name15.C
 create mode 100644 gcc/testsuite/g++.dg/template/dependent-name16.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index a82747ca627..b93455aebff 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7537,7 +7537,7 @@ extern tree process_outer_var_ref		(tree, tsubst_flags_t, bool force_use = false
 extern cp_expr finish_id_expression		(tree, tree, tree,
 		 cp_id_kind *,
 		 bool, bool, bool *,
-		 bool, bool, bool, bool,
+		 bool, bool, bool, bool, bool,
 		 const char **,
  location_t);
 extern tree finish_typeof			(tree);
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index e44c5c6b57c..4b95103eb2b 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -6011,7 +6011,8 @@ cp_parser_primary_expression (cp_parser *parser,
 		 parser->integral_constant_expression_p,
 		 parser->allow_non_integral_constant_expression_p,
 		 &parser->non_integral_constant_expression_p,
-		 template_p, done, address_p,
+		 template_p, parser->in_trailing_return_type_p,
+		 done, address_p,
 		 template_arg_p,
 		 &error_msg,
 		 id_expression.get_location ()));
@@ -11256,6 +11257,7 @@ cp_parser_lambda_introducer (cp_parser* parser, tree lambda_expr)
  /*allow_non_integral_constant_expression_p=*/false,
  /*non_integral_constant_expression_p=*/NULL,
  /*template_p=*/false,

Re: [PATCH 16/62] AVX512FP16: Add vsqrtph/vrsqrtph/vsqrtsh/vrsqrtsh.

2021-09-13 Thread Hongtao Liu via Gcc-patches
I'm going to commit 8 patches:

[PATCH 16/62] AVX512FP16: Add vsqrtph/vrsqrtph/vsqrtsh/vrsqrtsh.
[PATCH 17/62] AVX512FP16: Add testcase for vsqrtph/vsqrtsh/vrsqrtph/vrsqrtsh.
[PATCH 18/62] AVX512FP16: Add vrcpph/vrcpsh/vscalefph/vscalefsh.
[PATCH 19/62] AVX512FP16: Add testcase for vrcpph/vrcpsh/vscalefph/vscalefsh.
[PATCH 20/62] AVX512FP16: Add vreduceph/vreducesh/vrndscaleph/vrndscalesh.
[PATCH 21/62] AVX512FP16: Add testcase for vreduceph/vreducesh/vrndscaleph/vrndscalesh.
[PATCH 22/62] AVX512FP16: Add fpclass/getexp/getmant instructions.
[PATCH 23/62] AVX512FP16: Add testcase for fpclass/getmant/getexp instructions.

 Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
 Newly added tests passed on SPR.

On Thu, Jul 1, 2021 at 2:17 PM liuhongt  wrote:
>
> gcc/ChangeLog:
>
> * config/i386/avx512fp16intrin.h: (_mm512_sqrt_ph):
> New intrinsic.
> (_mm512_mask_sqrt_ph): Likewise.
> (_mm512_maskz_sqrt_ph): Likewise.
> (_mm512_sqrt_round_ph): Likewise.
> (_mm512_mask_sqrt_round_ph): Likewise.
> (_mm512_maskz_sqrt_round_ph): Likewise.
> (_mm512_rsqrt_ph): Likewise.
> (_mm512_mask_rsqrt_ph): Likewise.
> (_mm512_maskz_rsqrt_ph): Likewise.
> (_mm_rsqrt_sh): Likewise.
> (_mm_mask_rsqrt_sh): Likewise.
> (_mm_maskz_rsqrt_sh): Likewise.
> (_mm_sqrt_sh): Likewise.
> (_mm_mask_sqrt_sh): Likewise.
> (_mm_maskz_sqrt_sh): Likewise.
> (_mm_sqrt_round_sh): Likewise.
> (_mm_mask_sqrt_round_sh): Likewise.
> (_mm_maskz_sqrt_round_sh): Likewise.
> * config/i386/avx512fp16vlintrin.h (_mm_sqrt_ph): New intrinsic.
> (_mm256_sqrt_ph): Likewise.
> (_mm_mask_sqrt_ph): Likewise.
> (_mm256_mask_sqrt_ph): Likewise.
> (_mm_maskz_sqrt_ph): Likewise.
> (_mm256_maskz_sqrt_ph): Likewise.
> (_mm_rsqrt_ph): Likewise.
> (_mm256_rsqrt_ph): Likewise.
> (_mm_mask_rsqrt_ph): Likewise.
> (_mm256_mask_rsqrt_ph): Likewise.
> (_mm_maskz_rsqrt_ph): Likewise.
> (_mm256_maskz_rsqrt_ph): Likewise.
> * config/i386/i386-builtin-types.def: Add corresponding builtin types.
> * config/i386/i386-builtin.def: Add corresponding new builtins.
> * config/i386/i386-expand.c
> (ix86_expand_args_builtin): Handle new builtins.
> (ix86_expand_round_builtin): Ditto.
> * config/i386/sse.md (VF_AVX512FP16VL): New.
> (sqrt2): Adjust for HF vector modes.
> (_sqrt2): Likewise.
> (_vmsqrt2):
> Likewise.
> (_rsqrt2): New.
> (avx512fp16_vmrsqrtv8hf2): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/avx-1.c: Add test for new builtins.
> * gcc.target/i386/sse-13.c: Ditto.
> * gcc.target/i386/sse-23.c: Ditto.
> * gcc.target/i386/sse-14.c: Add test for new intrinsics.
> * gcc.target/i386/sse-22.c: Ditto.
> ---
>  gcc/config/i386/avx512fp16intrin.h | 193 +
>  gcc/config/i386/avx512fp16vlintrin.h   |  93 
>  gcc/config/i386/i386-builtin-types.def |   4 +
>  gcc/config/i386/i386-builtin.def   |   8 +
>  gcc/config/i386/i386-expand.c  |   4 +
>  gcc/config/i386/sse.md |  44 --
>  gcc/testsuite/gcc.target/i386/avx-1.c  |   2 +
>  gcc/testsuite/gcc.target/i386/sse-13.c |   2 +
>  gcc/testsuite/gcc.target/i386/sse-14.c |   6 +
>  gcc/testsuite/gcc.target/i386/sse-22.c |   6 +
>  gcc/testsuite/gcc.target/i386/sse-23.c |   2 +
>  11 files changed, 355 insertions(+), 9 deletions(-)
>
> diff --git a/gcc/config/i386/avx512fp16intrin.h 
> b/gcc/config/i386/avx512fp16intrin.h
> index ed8ad84a105..50db5d12140 100644
> --- a/gcc/config/i386/avx512fp16intrin.h
> +++ b/gcc/config/i386/avx512fp16intrin.h
> @@ -1235,6 +1235,199 @@ _mm_comi_round_sh (__m128h __A, __m128h __B, const 
> int __P, const int __R)
>
>  #endif /* __OPTIMIZE__  */
>
> +/* Intrinsics vsqrtph.  */
> +extern __inline __m512h
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_sqrt_ph (__m512h __A)
> +{
> +  return __builtin_ia32_vsqrtph_v32hf_mask_round (__A,
> + _mm512_setzero_ph(),
> + (__mmask32) -1,
> + _MM_FROUND_CUR_DIRECTION);
> +}
> +
> +extern __inline __m512h
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_mask_sqrt_ph (__m512h __A, __mmask32 __B, __m512h __C)
> +{
> +  return __builtin_ia32_vsqrtph_v32hf_mask_round (__C, __A, __B,
> + _MM_FROUND_CUR_DIRECTION);
> +}
> +
> +extern __inline __m512h
> +__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
> +_mm512_maskz_sqrt_ph (__mmask32 __A, __m512h __B)
> +{
> +  return __builtin_ia32_vsqrtph_v32hf_mask_round (__B,
> +  

Re: [PATCH, Fortran] Revert to non-multilib-specific ISO_Fortran_binding.h

2021-09-13 Thread Sandra Loosemore

On 9/13/21 11:07 AM, Tobias Burnus wrote:
> On 13.09.21 18:59, Sandra Loosemore wrote:
>> On 9/13/21 10:51 AM, Jakub Jelinek wrote:
>>> Wouldn't it be better to use the __LDBL_* macros anyway and not rely on
>>> float.h?  The header doesn't want to test what float.h tells about the
>>> long double type, but what the compiler knows about it.
>> I originally wrote the code to use the internal GCC __LDBL_* macros as
>> you suggest, but Tobias complained that then the gfortran-provided .h
>> file could not be used to compile the C parts of the program with some
>> other C compiler.
> For instance, clang does not seem to provide those - and in some cases,
> it can be useful to mix gfortran code with code compiled by other
> compilers (icc, clang, ...).
>> Maybe it needs to first check the internal macros and then look for
>> the float.h versions if it can't find them?
>
> I think that makes sense. (Adding a comment that #include <float.h> is
> for non-GCC compilers, only.)


Here's a patch.  Gerald, can you check that this fixes your bootstrap 
problem on i586-unknown-freebsd11?
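
The fallback order discussed above can be sketched outside the header as follows. This is only a minimal illustration, not the patch itself: the DEMO_* names and both helper functions are hypothetical, and only the mantissa-width comparison is shown (the header also compares the exponent ranges).

```c
/* Prefer the compiler's predefined macros (GCC and some other compilers
   provide them); otherwise fall back to the standard float.h names.
   All DEMO_* names are hypothetical and exist only for this sketch.  */
#if defined (__LDBL_MANT_DIG__) && defined (__DBL_MANT_DIG__)
#define DEMO_LDBL_MANT_DIG __LDBL_MANT_DIG__
#define DEMO_DBL_MANT_DIG __DBL_MANT_DIG__
#define DEMO_SOURCE "compiler predefines"
#else
#include <float.h>
#define DEMO_LDBL_MANT_DIG LDBL_MANT_DIG
#define DEMO_DBL_MANT_DIG DBL_MANT_DIG
#define DEMO_SOURCE "float.h"
#endif

/* Nonzero when long double has the same mantissa width as double.
   This is one of the three comparisons the header performs.  */
int
demo_long_double_is_double (void)
{
  return DEMO_LDBL_MANT_DIG == DEMO_DBL_MANT_DIG;
}

/* Report which branch of the preprocessor conditional was taken.  */
const char *
demo_macro_source (void)
{
  return DEMO_SOURCE;
}
```

Calling demo_macro_source () from a small test program built with each C compiler of interest would show which branch that compiler takes.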


-Sandra
commit b8b19bca743ed678ef1b59f1a363c7fa7d155c43
Author: Sandra Loosemore 
Date:   Mon Sep 13 19:48:16 2021 -0700

Fortran: Prefer GCC internal macros to float.h in ISO_Fortran_binding.h.

2021-09-13  Sandra Loosemore  

	libgfortran/
	* ISO_Fortran_binding.h: Only include float.h if the C compiler
	doesn't have predefined __LDBL_* and __DBL_* macros.

diff --git a/libgfortran/ISO_Fortran_binding.h b/libgfortran/ISO_Fortran_binding.h
index 9c42464..a3c6f80 100644
--- a/libgfortran/ISO_Fortran_binding.h
+++ b/libgfortran/ISO_Fortran_binding.h
@@ -32,7 +32,6 @@ extern "C" {
 
 #include <stddef.h>  /* Standard ptrdiff_t tand size_t. */
 #include <stdint.h>  /* Integer types. */
-#include <float.h>   /* Macros for floating-point type characteristics.  */
 
 /* Constants, defined as macros. */
 #define CFI_VERSION 1
@@ -217,40 +216,82 @@ extern int CFI_setpointer (CFI_cdesc_t *, CFI_cdesc_t *, const CFI_index_t []);
 #endif
 
 /* The situation with long double support is more complicated; we need to
-   examine the type in more detail to figure out its kind.  */
+   examine the type in more detail to figure out its kind.
+   GCC and some other compilers predefine the __LDBL* macros; otherwise
+   get the parameters we need from float.h.  */
+
+#if (defined (__LDBL_MANT_DIG__) \
+ && defined (__LDBL_MIN_EXP__) \
+ && defined (__LDBL_MAX_EXP__) \
+ && defined (__DBL_MANT_DIG__) \
+ && defined (__DBL_MIN_EXP__) \
+ && defined (__DBL_MAX_EXP__))
+#define __CFI_LDBL_MANT_DIG__ __LDBL_MANT_DIG__
+#define __CFI_LDBL_MIN_EXP__ __LDBL_MIN_EXP__
+#define __CFI_LDBL_MAX_EXP__ __LDBL_MAX_EXP__
+#define __CFI_DBL_MANT_DIG__ __DBL_MANT_DIG__
+#define __CFI_DBL_MIN_EXP__ __DBL_MIN_EXP__
+#define __CFI_DBL_MAX_EXP__ __DBL_MAX_EXP__
+
+#else
+#include <float.h>
+
+#if (defined (LDBL_MANT_DIG) \
+ && defined (LDBL_MIN_EXP) \
+ && defined (LDBL_MAX_EXP) \
+ && defined (DBL_MANT_DIG) \
+ && defined (DBL_MIN_EXP) \
+ && defined (DBL_MAX_EXP))
+#define __CFI_LDBL_MANT_DIG__ LDBL_MANT_DIG
+#define __CFI_LDBL_MIN_EXP__ LDBL_MIN_EXP
+#define __CFI_LDBL_MAX_EXP__ LDBL_MAX_EXP
+#define __CFI_DBL_MANT_DIG__ DBL_MANT_DIG
+#define __CFI_DBL_MIN_EXP__ DBL_MIN_EXP
+#define __CFI_DBL_MAX_EXP__ DBL_MAX_EXP
+
+#else
+#define CFI_no_long_double 1
+
+#endif  /* Definitions from float.h.  */
+#endif  /* Definitions from compiler builtins.  */
+
+/* Can't determine anything about long double support?  */
+#if (defined (CFI_no_long_double))
+#define CFI_type_long_double -2
+#define CFI_type_long_double_Complex -2
 
 /* Long double is the same kind as double.  */
-#if (LDBL_MANT_DIG == DBL_MANT_DIG \
- && LDBL_MIN_EXP == DBL_MIN_EXP \
- && LDBL_MAX_EXP == DBL_MAX_EXP)
+#elif (__CFI_LDBL_MANT_DIG__ == __CFI_DBL_MANT_DIG__ \
+ && __CFI_LDBL_MIN_EXP__ == __CFI_DBL_MIN_EXP__ \
+ && __CFI_LDBL_MAX_EXP__ == __CFI_DBL_MAX_EXP__)
 #define CFI_type_long_double CFI_type_double
 #define CFI_type_long_double_Complex CFI_type_double_Complex
 
 /* This is the 80-bit encoding on x86; Fortran assigns it kind 10.  */
-#elif (LDBL_MANT_DIG == 64 \
-   && LDBL_MIN_EXP == -16381 \
-   && LDBL_MAX_EXP == 16384)
+#elif (__CFI_LDBL_MANT_DIG__ == 64 \
+   && __CFI_LDBL_MIN_EXP__ == -16381 \
+   && __CFI_LDBL_MAX_EXP__ == 16384)
 #define CFI_type_long_double (CFI_type_Real + (10 << CFI_type_kind_shift))
 #define CFI_type_long_double_Complex (CFI_type_Complex + (10 << CFI_type_kind_shift))
 
 /* This is the 96-bit encoding on m68k; Fortran assigns it kind 10.  */
-#elif (LDBL_MANT_DIG == 64 \
-   && LDBL_MIN_EXP == -16382 \
-   && LDBL_MAX_EXP == 16384)
+#elif (__CFI_LDBL_MANT_DIG__ == 64 \
+   && __CFI_LDBL_MIN_EXP__ == -16382 \
+   && __CFI_LDBL_MAX_EXP__ == 16384)
 #define CFI_type_long_double (CFI_type_Real + (10 << CFI_type_kind_shift))
 #define CFI_type_long_double_Complex (CFI_type_Complex + (10 << CFI_type_kind_shift))
 
 /* This is the IEEE 128-bit 

Re: [PATCH] i386: Fix up @xorsign3_1 [PR102224]

2021-09-13 Thread Hongtao Liu via Gcc-patches
On Tue, Sep 14, 2021 at 10:06 AM Hongtao Liu  wrote:
>
> On Tue, Sep 14, 2021 at 8:58 AM Andrew Pinski  wrote:
> >
> > On Wed, Sep 8, 2021 at 2:55 AM Hongtao Liu via Gcc-patches
> >  wrote:
> > >
> > > On Wed, Sep 8, 2021 at 5:33 PM Jakub Jelinek  wrote:
> > > >
> > > > On Wed, Sep 08, 2021 at 05:23:40PM +0800, Hongtao Liu wrote:
> > > > > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > > > > >
> > > > > Patch LGTM.
> > > >
> > > > Thanks, committed.
> > > >
> > > > > PS:
> > > > >   I'm curious why we need the  post_reload splitter @xorsign3_1
> > > > > for scalar mode, can't we just expand them into and/xor operations in
> > > > > the expander, just like vector modes did.
> > > > > Let me do some experiments to see whether it is ok to remove the 
> > > > > splitter.
> > > >
> > > > I bet it is the question how should the code look like before RA.
> > > > stv is somewhat related, but as that replaces whole chains, it can do:
> > > > (insn 14 5 6 2 (set (subreg:V2DI (reg:DI 92) 0)
> > > > (vec_concat:V2DI (mem/c:DI (symbol_ref:SI ("c") [flags 0x2]  
> > > > ) [1 c+0 S8 A64])
> > > > (const_int 0 [0]))) "hohohou.c":6:9 -1
> > > >  (nil))
> > > > on loads of memory.
> > > > But it stv still does use paradoxical subregs:
> > > > (insn 10 16 11 2 (set (subreg:V2DI (reg:DI 91) 0)
> > > > (minus:V2DI (subreg:V2DI (reg:DI 87) 0)
> > > > (subreg:V2DI (reg:DI 94) 0))) "hohohou.c":6:13 5003 
> > > > {*subv2di3}
> > > >  (expr_list:REG_DEAD (reg:DI 87)
> > > > (expr_list:REG_UNUSED (reg:CC 17 flags)
> > > > (nil
> > > > (insn 11 10 0 2 (set (mem/c:DI (symbol_ref:SI ("a") [flags 0x2]  
> > > > ) [1 a+0 S8 A64])
> > > > (reg:DI 91)) "hohohou.c":6:5 76 {*movdi_internal}
> > > >  (expr_list:REG_DEAD (reg:DI 91)
> > > > (nil)))
> > > > so perhaps just using paradoxical subregs everywhere would be ok?
> > > Yes, I think so.
> > > And I find paradoxical subreg like (subreg:V4SF (reg:SF)) are not
> > > allowed by validate_subreg until r11-621.
> > > That's why post_reload splitter is needed here.
> >
> > That is not exactly true.  It has been around since before 2005.  See
> > https://gcc.gnu.org/PR24436 which is referencing the fixme comment in
> > validate_subreg.
> We also have things like (subreg:V4SF (reg:V2SF) 0).  The problem with
> defining a post_reload splitter for V2SF is that movv2sf is only defined
> under TARGET_64BIT if there's no MMX (so should we also enable 64-bit
> vectors for 32-bit mode?).
> And for xorsign without the post_reload splitter, the code is cleaner
> and even more optimal.
And if we allow something like subreg:V4SF (reg:TI), it seems we could
have something like
mov reg:SI, subreg:SI (reg:SF)
mov reg:TI, subreg:TI (reg:SI)
mov reg:V4SF, subreg:V4SF (reg:TI)

> >
> > Thanks,
> > Andrew Pinski
> >
> >
> > > > Jakub
> > > >
> > >
> > >
> > >
> > > --
> > > BR,
> > > Hongtao
>
>
>
> --
> BR,
> Hongtao



-- 
BR,
Hongtao


Re: [PATCH] i386: Fix up @xorsign3_1 [PR102224]

2021-09-13 Thread Hongtao Liu via Gcc-patches
On Tue, Sep 14, 2021 at 8:58 AM Andrew Pinski  wrote:
>
> On Wed, Sep 8, 2021 at 2:55 AM Hongtao Liu via Gcc-patches
>  wrote:
> >
> > On Wed, Sep 8, 2021 at 5:33 PM Jakub Jelinek  wrote:
> > >
> > > On Wed, Sep 08, 2021 at 05:23:40PM +0800, Hongtao Liu wrote:
> > > > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > > > >
> > > > Patch LGTM.
> > >
> > > Thanks, committed.
> > >
> > > > PS:
> > > >   I'm curious why we need the  post_reload splitter @xorsign3_1
> > > > for scalar mode, can't we just expand them into and/xor operations in
> > > > the expander, just like vector modes did.
> > > > Let me do some experiments to see whether it is ok to remove the 
> > > > splitter.
> > >
> > > I bet it is the question how should the code look like before RA.
> > > stv is somewhat related, but as that replaces whole chains, it can do:
> > > (insn 14 5 6 2 (set (subreg:V2DI (reg:DI 92) 0)
> > > (vec_concat:V2DI (mem/c:DI (symbol_ref:SI ("c") [flags 0x2]  
> > > ) [1 c+0 S8 A64])
> > > (const_int 0 [0]))) "hohohou.c":6:9 -1
> > >  (nil))
> > > on loads of memory.
> > > But it stv still does use paradoxical subregs:
> > > (insn 10 16 11 2 (set (subreg:V2DI (reg:DI 91) 0)
> > > (minus:V2DI (subreg:V2DI (reg:DI 87) 0)
> > > (subreg:V2DI (reg:DI 94) 0))) "hohohou.c":6:13 5003 
> > > {*subv2di3}
> > >  (expr_list:REG_DEAD (reg:DI 87)
> > > (expr_list:REG_UNUSED (reg:CC 17 flags)
> > > (nil
> > > (insn 11 10 0 2 (set (mem/c:DI (symbol_ref:SI ("a") [flags 0x2]  
> > > ) [1 a+0 S8 A64])
> > > (reg:DI 91)) "hohohou.c":6:5 76 {*movdi_internal}
> > >  (expr_list:REG_DEAD (reg:DI 91)
> > > (nil)))
> > > so perhaps just using paradoxical subregs everywhere would be ok?
> > Yes, I think so.
> > And I find paradoxical subreg like (subreg:V4SF (reg:SF)) are not
> > allowed by validate_subreg until r11-621.
> > That's why post_reload splitter is needed here.
>
> That is not exactly true.  It has been around since before 2005.  See
> https://gcc.gnu.org/PR24436 which is referencing the fixme comment in
> validate_subreg.
We also have things like (subreg:V4SF (reg:V2SF) 0).  The problem with
defining a post_reload splitter for V2SF is that movv2sf is only defined
under TARGET_64BIT if there's no MMX (so should we also enable 64-bit
vectors for 32-bit mode?).
And for xorsign without the post_reload splitter, the code is cleaner
and even more optimal.
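
For readers following along, the and/xor expansion mentioned in the quoted PS can be sketched in scalar C. xorsign (x, y) is x * copysign (1.0, y), which on the bit pattern is an AND to isolate y's sign bit followed by an XOR into x. This is only an illustration under the assumption that float is IEEE binary32; demo_xorsignf is a hypothetical name and nothing here is taken from the i386 backend.

```c
#include <stdint.h>
#include <string.h>

/* Scalar sketch of xorsign: return X with its sign flipped when Y's
   sign bit is set, i.e. X * copysign (1.0f, Y).  Assumes a 32-bit
   IEEE binary32 float; demo_xorsignf is a hypothetical name.  */
float
demo_xorsignf (float x, float y)
{
  uint32_t xb, yb;

  /* Reinterpret the floats as bit patterns without aliasing problems.  */
  memcpy (&xb, &x, sizeof xb);
  memcpy (&yb, &y, sizeof yb);

  /* AND isolates Y's sign bit, XOR applies it to X.  */
  xb ^= yb & UINT32_C (0x80000000);

  memcpy (&x, &xb, sizeof x);
  return x;
}
```

The vector expanders do the same thing with a sign-bit mask in a vector register; the question in this thread is only how to express that for scalar modes before register allocation.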
>
> Thanks,
> Andrew Pinski
>
>
> > > Jakub
> > >
> >
> >
> >
> > --
> > BR,
> > Hongtao



-- 
BR,
Hongtao


Re: [PATCH] tree-optimization/102155 - fix LIM fill_always_executed_in CFG walk

2021-09-13 Thread Xionghu Luo via Gcc-patches




On 2021/9/13 16:17, Richard Biener wrote:

On Mon, 13 Sep 2021, Xionghu Luo wrote:




On 2021/9/10 21:54, Xionghu Luo via Gcc-patches wrote:



On 2021/9/9 18:55, Richard Biener wrote:

diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 5d6845478e7..4b187c2cdaf 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -3074,15 +3074,13 @@ fill_always_executed_in_1 (class loop *loop,
sbitmap contains_call)
   break;
     if (bb->loop_father->header == bb)
-    {
-  if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb))
-    break;
-
-  /* In a loop that is always entered we may proceed anyway.
- But record that we entered it and stop once we leave it
- since it might not be finite.  */
-  inn_loop = bb->loop_father;
-    }
+    /* Record that we enter into a subloop since it might not
+   be finite.  */
+    /* ???  Entering into a not always executed subloop makes
+   fill_always_executed_in quadratic in loop depth since
+   we walk those loops N times.  This is not a problem
+   in practice though, see PR102253 for a worst-case testcase.  */
+    inn_loop = bb->loop_father;



Yes, your two patches extracted get_loop_body_in_dom_order out and removed
the inn_loop break logic when it doesn't dominate the outer loop.  Confirmed
the replacement could save ~10% build time due to not using the full DOM
walker, and it marked the previously ignored ALWAYS_EXECUTED bbs.
But if we don't break for the inner loop again, why still keep the
*inn_loop* variable?  It seems unnecessary and confusing; could we just
remove it and restore the original infinite loop check in bb->succs for
better understanding?



What's more, the refinement of this fix is incorrect for PR78185.


commit 483e400870601f650c80f867ec781cd5f83507d6
Author: Richard Biener 
Date:   Thu Sep 2 10:47:35 2021 +0200

    Refine fix for PR78185, improve LIM for code after inner loops

    This refines the fix for PR78185 after understanding that the code
    regarding to the comment 'In a loop that is always entered we may
    proceed anyway.  But record that we entered it and stop once we leave
    it.' was supposed to protect us from leaving possibly infinite inner
    loops.  The simpler fix of moving the misplaced stopping code
    can then be refined to continue processing when the exited inner
    loop is finite, improving invariant motion for cases like in the
    added testcase.

    2021-09-02  Richard Biener  

            * tree-ssa-loop-im.c (fill_always_executed_in_1): Refine
            fix for PR78185 and continue processing when leaving
            finite inner loops.

            * gcc.dg/tree-ssa/ssa-lim-16.c: New testcase.



    3<-------
    |        |
    6<---    |
    | \  |   |
    |  \ |   |
    4    8   |
    |---     |
    |  |     |
    5  7------
    |
    1

loop 2 is an infinite loop, it is only ALWAYS_EXECUTED for loop 2,
but r12-3313-g483e40087 sets it ALWAYS_EXECUTED for loop 1.
We need to restore it like this:
https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579195.html


I don't understand - BB6 is the header block of loop 2 which is
always entered and thus BB6 is always executed at least once.

The important part is that BB4 which follows the inner loop is
_not_ always executed because we don't know if we will exit the
inner loop.

What am I missing?
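
The distinction drawn here can be made concrete with a small source-level sketch. It is a hypothetical example, not one of the testcases from the patch: demo_accumulate and its parameters are made-up names, and the volatile read is only there so the inner loop cannot be assumed to terminate. Whether GCC actually hoists anything in this function is not something the sketch verifies.

```c
/* Hypothetical example; not taken from the patch or its testcases.  */
long
demo_accumulate (int n, volatile int *ready, long a, long b)
{
  long total = 0;

  for (int i = 0; i < n; i++)	/* outer loop */
    {
      long spins = 0;

      /* Inner loop: its header (the test of *ready) runs at least once
	 in every outer iteration, so it is always executed with respect
	 to the outer loop.  The loop itself may never exit.  */
      while (!*ready)
	spins++;

      /* Code after the inner loop: reached only if the inner loop
	 exits, so it is not always executed with respect to the
	 outer loop.  */
      total += a * b + spins;
    }

  return total;
}
```

In the terms used above, the inner loop test plays the role of BB6 and the statement after the inner loop plays the role of BB4.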


Oh, I see.  I only noticed that the patch changed behavior on this case
without any check failing, and misread it as a regression rather than an
improvement that also hoists invariants from the infinite loop; sorry
about that.

So, finally, fill_always_executed_in_1 can mark all ALWAYS_EXECUTED bbs,
both inside and after subloops, while still breaking after exiting from
infinite subloops, and with better performance; thanks.  My only
remaining worry is that replacing get_loop_body_in_dom_order makes the
code a bit more complicated for later readers, as the loop depth and DOM
order are not a problem here any more? ;)



Richard.



--
Thanks,
Xionghu


Re: [PATCH] i386: Fix up @xorsign3_1 [PR102224]

2021-09-13 Thread Andrew Pinski via Gcc-patches
On Wed, Sep 8, 2021 at 2:55 AM Hongtao Liu via Gcc-patches
 wrote:
>
> On Wed, Sep 8, 2021 at 5:33 PM Jakub Jelinek  wrote:
> >
> > On Wed, Sep 08, 2021 at 05:23:40PM +0800, Hongtao Liu wrote:
> > > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > > >
> > > Patch LGTM.
> >
> > Thanks, committed.
> >
> > > PS:
> > >   I'm curious why we need the  post_reload splitter @xorsign3_1
> > > for scalar mode, can't we just expand them into and/xor operations in
> > > the expander, just like vector modes did.
> > > Let me do some experiments to see whether it is ok to remove the splitter.
> >
> > I bet it is the question how should the code look like before RA.
> > stv is somewhat related, but as that replaces whole chains, it can do:
> > (insn 14 5 6 2 (set (subreg:V2DI (reg:DI 92) 0)
> > (vec_concat:V2DI (mem/c:DI (symbol_ref:SI ("c") [flags 0x2]  
> > ) [1 c+0 S8 A64])
> > (const_int 0 [0]))) "hohohou.c":6:9 -1
> >  (nil))
> > on loads of memory.
> > But it stv still does use paradoxical subregs:
> > (insn 10 16 11 2 (set (subreg:V2DI (reg:DI 91) 0)
> > (minus:V2DI (subreg:V2DI (reg:DI 87) 0)
> > (subreg:V2DI (reg:DI 94) 0))) "hohohou.c":6:13 5003 {*subv2di3}
> >  (expr_list:REG_DEAD (reg:DI 87)
> > (expr_list:REG_UNUSED (reg:CC 17 flags)
> > (nil
> > (insn 11 10 0 2 (set (mem/c:DI (symbol_ref:SI ("a") [flags 0x2]   > 0x7f65a131fc60 a>) [1 a+0 S8 A64])
> > (reg:DI 91)) "hohohou.c":6:5 76 {*movdi_internal}
> >  (expr_list:REG_DEAD (reg:DI 91)
> > (nil)))
> > so perhaps just using paradoxical subregs everywhere would be ok?
> Yes, I think so.
> And I find paradoxical subreg like (subreg:V4SF (reg:SF)) are not
> allowed by validate_subreg until r11-621.
> That's why post_reload splitter is needed here.

That is not exactly true.  It has been around since before 2005.  See
https://gcc.gnu.org/PR24436 which is referencing the fixme comment in
validate_subreg.

Thanks,
Andrew Pinski


> > Jakub
> >
>
>
>
> --
> BR,
> Hongtao


Re: [PATCH] rs6000: Disable optimizing multiple xxsetaccz instructions into one xxsetaccz

2021-09-13 Thread Segher Boessenkool
On Mon, Sep 13, 2021 at 05:10:42PM -0500, Peter Bergner wrote:
> >>* config/rs6000/mma.md (unspec): Delete UNSPEC_MMA_XXSETACCZ.
> >>(unspecv): Add UNSPECV_MMA_XXSETACCZ.
> > 
> > Unrelated to this patch, but I have been wondering this for years:
> > should we have an unspecv enum at all?  It causes some churn, and you
> > can name the volatile ones UNSPECV_ in either case.
> 
> I assumed it was needed, but if not, yeah, one enum would seem to be
> better than two.

These enums are not so very old (from 2010 only).  Before that we used
define_constant for all.  Some backends used separate numbering for
unspec and for unspec_volatile, some didn't.  The enum scheme supported
both the existing practices.

It is easier and prettier to have only one namespace for this (when you
want to look something up for example, or just reading stuff even).  I
did however think it quite useful to have the "V" in the name still.
But there is nothing forcing us to have any particular naming scheme for
the enumeration constants, so that is no blocker :-)

Since it is perfectly fine to have multiple define_enum's for the same
enumeration, too, converting to this will be pretty easy :-)

> >>(mma_xxsetaccz): Change to define_insn.  Remove match_operand.
> >>Use UNSPECV_MMA_XXSETACCZ.
> > 
> > It still has the match_operand.
> 
> -(define_insn_and_split "*mma_xxsetaccz"
> +(define_insn "mma_xxsetaccz"
>[(set (match_operand:XO 0 "fpr_reg_operand" "=d")
> -   (unspec:XO [(match_operand 1 "const_0_to_1_operand" "O")]
> -UNSPEC_MMA_XXSETACCZ))]
> +   (unspec_volatile:XO [(const_int 0)]
> +   UNSPECV_MMA_XXSETACCZ))]
> 
> 
> It still has "a" match_operand...for operand 0.  The match_operand
> for operand 1 was what was removed.  Want me to reword that as
> "Remove source match_operand." or "Remove match_operand 1." or ???

Ah I see, I looked with my eyes closed apparently.  "Remove operand 1"?

> >>  ;; We can't have integer constants in XOmode so we wrap this in an UNSPEC.
> > 
> > Does the comment need updating?  It may help to point out here that it
> >  needs to be volatile.
> 
> I think the comment was referring to the unneeded operand which I have
> now removed.  I could either remove the comment altogether or change it
> to:
> 
> ;; We can't have integer constants in XOmode so we wrap this in an
> ;; UNSPEC_VOLATILE.
> 
> ...to refer to the dummy zero for the source.  Let me know what you want.

No strong opinion, the existing comment looked out of place, that's all.

The latter option adds information, so if you think that is useful to
have here, let's go with that?

Cheers,


Segher


[PATCH] Remove unused function make_unique_name.

2021-09-13 Thread Benjamin Peterson
Signed-off-by: Benjamin Peterson 

gcc/
* attribs.c (make_unique_name): Delete.
* attribs.h (make_unique_name): Delete.
---
 gcc/attribs.c | 34 --
 gcc/attribs.h |  1 -
 2 files changed, 35 deletions(-)

diff --git a/gcc/attribs.c b/gcc/attribs.c
index 0d22c20a35e..83fafc98b7d 100644
--- a/gcc/attribs.c
+++ b/gcc/attribs.c
@@ -1022,40 +1022,6 @@ common_function_versions (tree fn1, tree fn2)
   return result;
 }
 
-/* Return a new name by appending SUFFIX to the DECL name.  If make_unique
-   is true, append the full path name of the source file.  */
-
-char *
-make_unique_name (tree decl, const char *suffix, bool make_unique)
-{
-  char *global_var_name;
-  int name_len;
-  const char *name;
-  const char *unique_name = NULL;
-
-  name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
-
-  /* Get a unique name that can be used globally without any chances
- of collision at link time.  */
-  if (make_unique)
-unique_name = IDENTIFIER_POINTER (get_file_function_name ("\0"));
-
-  name_len = strlen (name) + strlen (suffix) + 2;
-
-  if (make_unique)
-name_len += strlen (unique_name) + 1;
-  global_var_name = XNEWVEC (char, name_len);
-
-  /* Use '.' to concatenate names as it is demangler friendly.  */
-  if (make_unique)
-snprintf (global_var_name, name_len, "%s.%s.%s", name, unique_name,
- suffix);
-  else
-snprintf (global_var_name, name_len, "%s.%s", name, suffix);
-
-  return global_var_name;
-}
-
 /* Make a dispatcher declaration for the multi-versioned function DECL.
Calls to DECL function will be replaced with calls to the dispatcher
by the front-end.  Return the decl created.  */
diff --git a/gcc/attribs.h b/gcc/attribs.h
index 87231b954c6..138c509bce1 100644
--- a/gcc/attribs.h
+++ b/gcc/attribs.h
@@ -44,7 +44,6 @@ extern struct scoped_attributes* register_scoped_attributes 
(const struct attrib
 
 extern char *sorted_attr_string (tree);
 extern bool common_function_versions (tree, tree);
-extern char *make_unique_name (tree, const char *, bool);
 extern tree make_dispatcher_decl (const tree);
 extern bool is_function_default_version (const tree);
 
-- 
2.30.2



Re: [PATCH 01/18] rs6000: Handle overloads during program parsing

2021-09-13 Thread Segher Boessenkool
On Wed, Sep 01, 2021 at 11:13:37AM -0500, Bill Schmidt wrote:
> Although this patch looks quite large, the changes are fairly minimal.
> Most of it is duplicating the large function that does the overload
> resolution using the automatically generated data structures instead of
> the old hand-generated ones.  This doesn't make the patch terribly easy to
> review, unfortunately.  Just be aware that generally we aren't changing
> the logic and functionality of overload handling.

>   (altivec_build_new_resolved_builtin): New function.
>   (altivec_resolve_new_overloaded_builtin): Likewise.

A new function of 973 lines (plus the function comment).  Please factor
that (can be in a later patch, but please do, you know what it all means
and does currently, now is the time :-) ).

> +static bool
> +rs6000_new_builtin_type_compatible (tree t, tree u)

This needs a function comment.  Are t and u used symmetrically at all?

> +{
> +  if (t == error_mark_node)
> +return false;

(not here)

> +  if (POINTER_TYPE_P (t) && POINTER_TYPE_P (u))
> +{
> +  t = TREE_TYPE (t);
> +  u = TREE_TYPE (u);
> +  if (TYPE_READONLY (u))
> + t = build_qualified_type (t, TYPE_QUAL_CONST);
> +}

Esp. here.  And it still creates junk trees where those are not needed
afaics, and that is not a great idea for functions that are called so
often.

> +static tree
> +altivec_build_new_resolved_builtin (tree *args, int n, tree fntype,
> + tree ret_type,
> + rs6000_gen_builtins bif_id,
> + rs6000_gen_builtins ovld_id)
> +{
> +  tree argtypes = TYPE_ARG_TYPES (fntype);
> +  tree arg_type[MAX_OVLD_ARGS];
> +  tree fndecl = rs6000_builtin_decls_x[bif_id];
> +  tree call;

Don't declare things so far ahead please.  Declare them right before
they are assigned to, ideally.

> +  for (int i = 0; i < n; i++)
> +arg_type[i] = TREE_VALUE (argtypes), argtypes = TREE_CHAIN (argtypes);

Please do not use comma operators where you could use separate
statements.

> +  /* The AltiVec overloading implementation is overall gross, but this

Ooh you spell "AltiVec" correctly here ;-)

You can do
  for (int j = 0; j < n; j++)
args[j] = fully_fold_convert (arg_type[j], args[j]);
here and then the rest becomes simpler.

> +  switch (n)
> +{
> +case 0:
> +  call = build_call_expr (fndecl, 0);
> +  break;
> +case 1:
> +  call = build_call_expr (fndecl, 1,
> +   fully_fold_convert (arg_type[0], args[0]));
> +  break;
> +case 2:
> +  call = build_call_expr (fndecl, 2,
> +   fully_fold_convert (arg_type[0], args[0]),
> +   fully_fold_convert (arg_type[1], args[1]));
> +  break;
> +case 3:
> +  call = build_call_expr (fndecl, 3,
> +   fully_fold_convert (arg_type[0], args[0]),
> +   fully_fold_convert (arg_type[1], args[1]),
> +   fully_fold_convert (arg_type[2], args[2]));
> +  break;
> +case 4:
> +  call = build_call_expr (fndecl, 4,
> +   fully_fold_convert (arg_type[0], args[0]),
> +   fully_fold_convert (arg_type[1], args[1]),
> +   fully_fold_convert (arg_type[2], args[2]),
> +   fully_fold_convert (arg_type[3], args[3]));
> +  break;
> +default:
> +  gcc_unreachable ();
> +}
> +  return fold_convert (ret_type, call);
> +}
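
As a rough illustration of the two review comments above (separate statements instead of the comma operator, and converting all arguments in one loop before building the call), here is a self-contained sketch.  `Value`, `TypeNode`, `convert_to`, `collect_arg_types` and `convert_args` are hypothetical stand-ins for `tree`, the `TYPE_ARG_TYPES` chain and `fully_fold_convert`; none of this is the patch's actual code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a GCC `tree` holding a typed value.
struct Value { std::string type; int payload; };

// Hypothetical stand-in for the TREE_VALUE/TREE_CHAIN list of argument types.
struct TypeNode { std::string value; const TypeNode *chain; };

// Hypothetical stand-in for fully_fold_convert: retag the value with TYPE.
static Value convert_to(const std::string &type, const Value &v)
{
  return Value{type, v.payload};
}

// Walk the type chain with two separate statements per iteration
// instead of joining them with the comma operator.
static std::vector<std::string>
collect_arg_types(const TypeNode *argtypes, std::size_t n)
{
  std::vector<std::string> arg_type(n);
  for (std::size_t i = 0; i < n; i++)
    {
      arg_type[i] = argtypes->value;
      argtypes = argtypes->chain;
    }
  return arg_type;
}

// Convert every argument once, up front, so later code can pass
// args[j] along without repeating the conversion per argument count.
static void convert_args(const std::vector<std::string> &arg_type,
                         std::vector<Value> &args)
{
  for (std::size_t j = 0; j < args.size(); j++)
    args[j] = convert_to(arg_type[j], args[j]);
}
```

With the arguments converted up front, each case of the quoted switch would only need to pass the already-converted `args[0]` through `args[n-1]`.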

> +static tree
> +altivec_resolve_new_overloaded_builtin (location_t loc, tree fndecl,
> + void *passed_arglist)
> +{
> +  vec<tree, va_gc> *arglist = static_cast<vec<tree, va_gc> *> (passed_arglist);
> +  unsigned int nargs = vec_safe_length (arglist);
> +  enum rs6000_gen_builtins fcode
> += (enum rs6000_gen_builtins) DECL_MD_FUNCTION_CODE (fndecl);
> +  tree fnargs = TYPE_ARG_TYPES (TREE_TYPE (fndecl));
> +  tree types[MAX_OVLD_ARGS], args[MAX_OVLD_ARGS];

Two separate lines please, they are very different things, and very
important things, too.

> +  unsigned int n;

You use this var first 792 lines later.  Please don't.

Oh well, this will become much better once this is more properly
factored.  Who knows, some of it may become readable / understandable
even!  :-)

> +  arg = (*arglist)[0];
> +  type = TREE_TYPE (arg);
> +  if (!SCALAR_FLOAT_TYPE_P (type)
> +   && !INTEGRAL_TYPE_P (type))
> + goto bad;

And all gotos still scream "FACTOR ME".
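
One possible shape for such factoring, sketched with made-up names: the argument check becomes a small predicate and the shared `bad` label becomes an early return.  `Kind`, `Resolution`, `scalar_arith_kind_p` and `resolve_one_case` are hypothetical and only model the quoted check on the first argument; this is not the patch's code.

```cpp
// Hypothetical classification of an argument's type.
enum class Kind { Integral, ScalarFloat, Pointer, Vector };

// Hypothetical result of resolving one overload case.
enum class Resolution { Ok, BadArgument };

// Models the quoted test: the argument must be a scalar float
// or an integral type.
static bool scalar_arith_kind_p(Kind k)
{
  return k == Kind::Integral || k == Kind::ScalarFloat;
}

// One case lifted out of the big resolver.  The early return plays
// the role that `goto bad` plays in the unfactored function.
static Resolution resolve_one_case(Kind first_arg)
{
  if (!scalar_arith_kind_p(first_arg))
    return Resolution::BadArgument;
  return Resolution::Ok;
}
```

The caller can then map `Resolution::BadArgument` to the common error path in one place.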

> +   case E_TImode:
> + type = (unsigned_p ? unsigned_V1TI_type_node : V1TI_type_node);
> + size = 1;
> + break;

  type = signed_or_unsigned_type_for (unsigned_p, V1TI_type_node);
etc.

> + arg2 = build_binary_op (loc, BIT_AND_EXPR, arg2,
> + build_int_cst (TREE_TYPE (arg2),
> +TYPE_VECTOR_SUBPARTS (arg1_type)
> + 

Re: [PATCH] c++: fix wrong fixit hints for misspelled typedef [PR77565]

2021-09-13 Thread David Malcolm via Gcc-patches
On Tue, 2021-09-14 at 03:35 +0900, Michel Morin via Gcc-patches wrote:
> Hi,
> 
> PR77565 reports that, with the code `typdef int Int;`, GCC emits
> "did you mean 'typeof'?" instead of "did you mean 'typedef'?".
> 
> This happens because the typo corrector determines that `typeof` is a
> candidate for suggestion (through
> `cp_keyword_starts_decl_specifier_p`),
> but `typedef` is not.
> 
> This patch fixes the issue by adding `typedef` as a candidate. The
> patch
> additionally adds the `inline` specifier and cv-specifiers as a
> candidate.
> Here is a patch (tests `make check-gcc` pass on darwin):

Thanks for this patch (and for reporting the bug in the first place).

I notice that, as well as being used for fix-it hints by
lookup_name_fuzzy (indirectly via suggest_rid_p),
cp_keyword_starts_decl_specifier_p is also used by
cp_lexer_next_token_is_decl_specifier_keyword, which is used by
cp_parser_lambda_declarator_opt and cp_parser_constructor_declarator_p.

So I'm not sure if this fix is exactly correct - hopefully one of the
C++ frontend maintainers can chime in.  If
cp_keyword_starts_decl_specifier_p isn't quite the right place for
this, the fix could probably go in suggest_rid_p instead, which *is*
specific to spelling corrections.

Hope this is constructive; thanks again for the patch
Dave
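
To see why the candidate set matters for the reported typo, here is a small sketch of nearest-candidate selection by edit distance.  It uses plain Levenshtein distance with no cutoff, which is an assumption made for brevity; GCC's spelling corrector has its own distance function and thresholds, which are not reproduced here.  `edit_distance` and `best_suggestion` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// Classic two-row Levenshtein distance (insert, delete, substitute).
static std::size_t edit_distance(const std::string &a, const std::string &b)
{
  std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
  for (std::size_t j = 0; j <= b.size(); j++)
    prev[j] = j;
  for (std::size_t i = 1; i <= a.size(); i++)
    {
      cur[0] = i;
      for (std::size_t j = 1; j <= b.size(); j++)
        {
          std::size_t subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
          cur[j] = std::min({subst, prev[j] + 1, cur[j - 1] + 1});
        }
      std::swap(prev, cur);
    }
  return prev[b.size()];
}

// Return the candidate closest to TYPO, or an empty string if there
// are no candidates.  Ties go to the earlier candidate.
static std::string best_suggestion(const std::string &typo,
                                   const std::vector<std::string> &candidates)
{
  std::string best;
  std::size_t best_dist = std::numeric_limits<std::size_t>::max();
  for (const std::string &c : candidates)
    {
      std::size_t d = edit_distance(typo, c);
      if (d < best_dist)
        {
          best_dist = d;
          best = c;
        }
    }
  return best;
}
```

Under this model `typdef` is two edits away from `typeof` but only one from `typedef`, so `typeof` can only win when `typedef` is absent from the candidate list.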



> 
> 
> c++: add typo corrections for typedef/inline/cv-qual [PR77565]
> 
> PR c++/77565
> 
> gcc/cp/ChangeLog:
> 
> * parser.c (cp_keyword_starts_decl_specifier_p): Handle
> typedef/inline specifiers and cv-qualifiers.
> 
> gcc/testsuite/ChangeLog:
> 
> * g++.dg/spellcheck-typenames.C: Add tests for decl-specs.
> 
> --- a/gcc/cp/parser.c
> +++ b/gcc/cp/parser.c
> @@ -1051,6 +1051,12 @@ cp_keyword_starts_decl_specifier_p (enum rid
> keyword)
>  case RID_FLOAT:
>  case RID_DOUBLE:
>  case RID_VOID:
> +  /* CV qualifiers.  */
> +    case RID_CONST:
> +    case RID_VOLATILE:
> +  /* typedef/inline specifiers.  */
> +    case RID_TYPEDEF:
> +    case RID_INLINE:
>    /* GNU extensions.  */
>  case RID_ATTRIBUTE:
>  case RID_TYPEOF:
> --- a/gcc/testsuite/g++.dg/spellcheck-typenames.C
> +++ b/gcc/testsuite/g++.dg/spellcheck-typenames.C
> @@ -76,3 +76,38 @@ singed char ch; // { dg-error "1: 'singed' does
> not
> name a type; did you mean 's
>   ^~
>   signed
>     { dg-end-multiline-output "" } */
> +
> +typdef int my_int; // { dg-error "1: 'typdef' does not name a type;
> did you mean 'typedef'?" }
> +/* { dg-begin-multiline-output "" }
> + typdef int my_int;
> + ^~
> + typedef
> +   { dg-end-multiline-output "" } */
> +
> +inlien int inline_func(); // { dg-error "1: 'inlien' does not name a
> type; did you mean 'inline'?" }
> +/* { dg-begin-multiline-output "" }
> + inlien int inline_func();
> + ^~
> + inline
> +   { dg-end-multiline-output "" } */
> +
> +coonst int ci = 0; // { dg-error "1: 'coonst' does not name a type;
> did you mean 'const'?" }
> +/* { dg-begin-multiline-output "" }
> + coonst int ci = 0;
> + ^~
> + const
> +   { dg-end-multiline-output "" } */
> +
> +voltil int vi; // { dg-error "1: 'voltil' does not name a type; did
> you mean 'volatile'?" }
> +/* { dg-begin-multiline-output "" }
> + voltil int vi;
> + ^~
> + volatile
> +   { dg-end-multiline-output "" } */
> +
> +statik int si; // { dg-error "1: 'statik' does not name a type; did
> you mean 'static'?" }
> +/* { dg-begin-multiline-output "" }
> + statik int si;
> + ^~
> + static
> +   { dg-end-multiline-output "" } */
> 
> 
> --
> Regards,
> Michel




Re: [PATCH] rs6000: Disable optimizing multiple xxsetaccz instructions into one xxsetaccz

2021-09-13 Thread Peter Bergner via Gcc-patches
On 9/12/21 2:26 PM, Segher Boessenkool wrote:
>> I also removed the mma_xxsetaccz define_expand and
>> define_insn_and_split and replaced it with a simple define_insn.
> 
> In the future please do that in a separate patch.  That makes it *much*
> easier to read and review this.

Will do.



>>  * config/rs6000/mma.md (unspec): Delete UNSPEC_MMA_XXSETACCZ.
>>  (unspecv): Add UNSPECV_MMA_XXSETACCZ.
> 
> Unrelated to this patch, but I have been wondering this for years:
> should we have an unspecv enum at all?  It causes some churn, and you
> can name the volatile ones UNSPECV_ in either case.

I assumed it was needed, but if not, yeah, one enum would seem to be
better than two.




>>  (mma_xxsetaccz): Change to define_insn.  Remove match_operand.
>>  Use UNSPECV_MMA_XXSETACCZ.
> 
> It still has the match_operand.

-(define_insn_and_split "*mma_xxsetaccz"
+(define_insn "mma_xxsetaccz"
   [(set (match_operand:XO 0 "fpr_reg_operand" "=d")
-   (unspec:XO [(match_operand 1 "const_0_to_1_operand" "O")]
-UNSPEC_MMA_XXSETACCZ))]
+   (unspec_volatile:XO [(const_int 0)]
+   UNSPECV_MMA_XXSETACCZ))]


It still has "a" match_operand...for operand 0.  The match_operand
for operand 1 was what was removed.  Want me to reword that as
"Remove source match_operand." or "Remove match_operand 1." or ???



>>  ;; We can't have integer constants in XOmode so we wrap this in an UNSPEC.
> 
> Does the comment need updating?  It may help to point out here that it
> needs to be volatile.

I think the comment was referring to the unneeded operand which I have
now removed.  I could either remove the comment altogether or change it
to:

;; We can't have integer constants in XOmode so we wrap this in an
;; UNSPEC_VOLATILE.

...to refer to the dummy zero for the source.  Let me know what you want.



>> (set_attr "length" "4")])
> 
> Not new of course: the default length is 4, most insns have that, it
> helps to be less verbose.

I'll remove that before pushing, thanks!


Peter



[r12-3495 Regression] FAIL: 29_atomics/atomic_flag/test_and_set/explicit-hle.cc (test for excess errors) on Linux/x86_64

2021-09-13 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

76b75018b3d053a890ebe155e47814de14b3c9fb is the first bad commit
commit 76b75018b3d053a890ebe155e47814de14b3c9fb
Author: Jason Merrill 
Date:   Thu Jul 15 15:30:17 2021 -0400

c++: implement C++17 hardware interference size

caused

FAIL: 29_atomics/atomic_flag/test_and_set/explicit-hle.cc (test for excess 
errors)
FAIL: g++.dg/ext/sync-4.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/ext/sync-4.C  -std=gnu++17 (test for excess errors)
FAIL: g++.dg/ext/sync-4.C  -std=gnu++2a (test for excess errors)
FAIL: g++.dg/ext/sync-4.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/opt/inline9.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/opt/inline9.C  -std=gnu++17 (test for excess errors)
FAIL: g++.dg/opt/inline9.C  -std=gnu++2a (test for excess errors)
FAIL: g++.dg/opt/inline9.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/opt/longbranch2.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/opt/longbranch2.C  -std=gnu++17 (test for excess errors)
FAIL: g++.dg/opt/longbranch2.C  -std=gnu++2a (test for excess errors)
FAIL: g++.dg/opt/longbranch2.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/opt/pr52727.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/opt/pr52727.C  -std=gnu++17 (test for excess errors)
FAIL: g++.dg/opt/pr52727.C  -std=gnu++2a (test for excess errors)
FAIL: g++.dg/opt/pr52727.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/opt/pr58864.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/opt/pr58864.C  -std=gnu++17 (test for excess errors)
FAIL: g++.dg/opt/pr58864.C  -std=gnu++2a (test for excess errors)
FAIL: g++.dg/opt/pr58864.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/opt/pr69570.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/opt/pr69570.C  -std=gnu++17 (test for excess errors)
FAIL: g++.dg/opt/pr69570.C  -std=gnu++2a (test for excess errors)
FAIL: g++.dg/opt/pr69570.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/opt/reg-stack4.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/opt/reg-stack4.C  -std=gnu++17 (test for excess errors)
FAIL: g++.dg/opt/reg-stack4.C  -std=gnu++2a (test for excess errors)
FAIL: g++.dg/opt/reg-stack4.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/other/pr39496.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/other/pr39496.C  -std=gnu++17 (test for excess errors)
FAIL: g++.dg/other/pr39496.C  -std=gnu++2a (test for excess errors)
FAIL: g++.dg/other/pr39496.C  -std=gnu++98 (test for excess errors)
FAIL: libitm.c++/dropref.C (test for excess errors)
FAIL: libitm.c++/eh-1.C (test for excess errors)
FAIL: libitm.c++/eh-2.C (test for excess errors)
FAIL: libitm.c++/eh-3.C (test for excess errors)
FAIL: libitm.c++/eh-4.C (test for excess errors)
FAIL: libitm.c++/eh-5.C (test for excess errors)
FAIL: libitm.c++/libstdc++-pr91488.C (test for excess errors)
FAIL: libitm.c++/libstdc++-safeexc.C (test for excess errors)
FAIL: libitm.c++/newdelete.C (test for excess errors)
FAIL: libitm.c++/throwdown.C (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-3495/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/x86_64-linux/libstdc++-v3/testsuite && make check 
RUNTESTFLAGS="conformance.exp=29_atomics/atomic_flag/test_and_set/explicit-hle.cc
 --target_board='unix{-m32}'"
$ cd {build_dir}/x86_64-linux/libstdc++-v3/testsuite && make check 
RUNTESTFLAGS="conformance.exp=29_atomics/atomic_flag/test_and_set/explicit-hle.cc
 --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/ext/sync-4.C 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/ext/sync-4.C 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/opt/inline9.C 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/opt/inline9.C 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=g++.dg/opt/longbranch2.C --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=g++.dg/opt/longbranch2.C --target_board='unix{-m32\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/opt/pr52727.C 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/opt/pr52727.C 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/opt/pr58864.C 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/opt/pr58864.C 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/opt/pr69570.C 

[PATCH] c++: empty union member activation during constexpr [PR102163]

2021-09-13 Thread Patrick Palka via Gcc-patches
Here, the union's constructor is defined to activate its empty data
member _M_rest, but during constexpr evaluation of this constructor the
subobject constructor call to O::O(&_M_rest, 42) produces no side
effect that actually activates the member, so the union still appears
uninitialized after the fact.  This patch fixes this by faking up a
dummy MODIFY_EXPR in this situation, whose evaluation ensures the member
gets activated.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
for trunk?

PR c++/102163

gcc/cp/ChangeLog:

* constexpr.c (cxx_eval_call_expression): After evaluating a
constructor call for an empty union member, produce a side
effect that makes sure the member is activated.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-empty17.C: New test.
---
 gcc/cp/constexpr.c| 34 +++
 .../g++.dg/cpp0x/constexpr-empty17.C  | 21 
 2 files changed, 49 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-empty17.C

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 7772fe62d95..40b0b80b438 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -2787,12 +2787,34 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, 
tree t,
_target);
 
  if (DECL_CONSTRUCTOR_P (fun))
-   /* This can be null for a subobject constructor call, in
-  which case what we care about is the initialization
-  side-effects rather than the value.  We could get at the
-  value by evaluating *this, but we don't bother; there's
-  no need to put such a call in the hash table.  */
-   result = lval ? ctx->object : ctx->ctor;
+   {
+ /* This can be null for a subobject constructor call, in
+which case what we care about is the initialization
+side-effects rather than the value.  We could get at the
+value by evaluating *this, but we don't bother; there's
+no need to put such a call in the hash table.  */
+ result = lval ? ctx->object : ctx->ctor;
+
+ if (!result && new_obj
+ && TREE_CODE (new_obj) == COMPONENT_REF
+ && TREE_CODE (TREE_TYPE
+   (TREE_OPERAND (new_obj, 0))) == UNION_TYPE
+ && is_really_empty_class (TREE_TYPE (new_obj),
+   /*ignore_vptr*/false))
+   {
+ /* This constructor call for an empty union member might not
+have produced a side effect that actually activated the
+member.  So produce such a side effect now to ensure the
+union appears initialized.  */
+ tree activate = build2 (MODIFY_EXPR, TREE_TYPE (new_obj),
+ new_obj,
+ build_constructor (TREE_TYPE 
(new_obj),
+NULL));
+ cxx_eval_constant_expression (ctx, activate, lval,
+   non_constant_p, overflow_p);
+ ggc_free (activate);
+   }
+   }
  else if (VOID_TYPE_P (TREE_TYPE (res)))
result = void_node;
  else
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-empty17.C 
b/gcc/testsuite/g++.dg/cpp0x/constexpr-empty17.C
new file mode 100644
index 000..9d753a3bb69
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-empty17.C
@@ -0,0 +1,21 @@
+// PR c++/102163
+// { dg-do compile { target c++11 } }
+
+struct O {
+  constexpr O(int) { }
+};
+
+union _Variadic_union {
+  constexpr _Variadic_union(int __arg) : _M_rest(__arg) { }
+
+  int _M_first;
+  O _M_rest;
+};
+
+
+struct _Variant_storage {
+  constexpr _Variant_storage() : _M_u(42) {}
+  _Variadic_union _M_u;
+};
+
+constexpr _Variant_storage w;
-- 
2.33.0.328.g8b7c11b866



Merge from trunk to gccgo branch

2021-09-13 Thread Ian Lance Taylor via Gcc-patches
I merged trunk revision 104c05c5284b7822d770ee51a7d91946c7e56d50 to
the gccgo branch.

Ian


Re: [PATCH 05/18] rs6000: Support for vectorizing built-in functions

2021-09-13 Thread will schmidt via Gcc-patches
On Wed, 2021-09-01 at 11:13 -0500, Bill Schmidt via Gcc-patches wrote:
> This patch just duplicates a couple of functions and adjusts them to use the
> new builtin names.  There's no logical change otherwise.
> 
> 2021-08-31  Bill Schmidt  
> 
> gcc/
>   * config/rs6000/rs6000.c (rs6000-builtins.h): New include.
>   (rs6000_new_builtin_vectorized_function): New function.
>   (rs6000_new_builtin_md_vectorized_function): Likewise.
>   (rs6000_builtin_vectorized_function): Call
>   rs6000_new_builtin_vectorized_function.
>   (rs6000_builtin_md_vectorized_function): Call
>   rs6000_new_builtin_md_vectorized_function.

ok

> ---
>  gcc/config/rs6000/rs6000.c | 253 +
>  1 file changed, 253 insertions(+)
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index b7ea1483da5..52c78c7500c 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -78,6 +78,7 @@
>  #include "case-cfn-macros.h"
>  #include "ppc-auxv.h"
>  #include "rs6000-internal.h"
> +#include "rs6000-builtins.h"
>  #include "opts.h"
> 
>  /* This file should be included last.  */
> @@ -5501,6 +5502,251 @@ rs6000_loop_unroll_adjust (unsigned nunroll, struct 
> loop *loop)
>return nunroll;
>  }
> 
> +/* Returns a function decl for a vectorized version of the builtin function
> +   with builtin function code FN and the result vector type TYPE, or 
> NULL_TREE
> +   if it is not available.  */
> +
> +static tree
> +rs6000_new_builtin_vectorized_function (unsigned int fn, tree type_out,
> + tree type_in)
> +{
> +  machine_mode in_mode, out_mode;
> +  int in_n, out_n;
> +
> +  if (TARGET_DEBUG_BUILTIN)
> +fprintf (stderr, "rs6000_new_builtin_vectorized_function (%s, %s, %s)\n",
> +  combined_fn_name (combined_fn (fn)),
> +  GET_MODE_NAME (TYPE_MODE (type_out)),
> +  GET_MODE_NAME (TYPE_MODE (type_in)));
> +
> +  if (TREE_CODE (type_out) != VECTOR_TYPE
> +  || TREE_CODE (type_in) != VECTOR_TYPE)
> +return NULL_TREE;
> +
> +  out_mode = TYPE_MODE (TREE_TYPE (type_out));
> +  out_n = TYPE_VECTOR_SUBPARTS (type_out);
> +  in_mode = TYPE_MODE (TREE_TYPE (type_in));
> +  in_n = TYPE_VECTOR_SUBPARTS (type_in);
> +
> +  switch (fn)
> +{
> +CASE_CFN_COPYSIGN:
> +  if (VECTOR_UNIT_VSX_P (V2DFmode)
> +   && out_mode == DFmode && out_n == 2
> +   && in_mode == DFmode && in_n == 2)
> + return rs6000_builtin_decls_x[RS6000_BIF_CPSGNDP];
> +  if (VECTOR_UNIT_VSX_P (V4SFmode)
> +   && out_mode == SFmode && out_n == 4
> +   && in_mode == SFmode && in_n == 4)
> + return rs6000_builtin_decls_x[RS6000_BIF_CPSGNSP];
> +  if (VECTOR_UNIT_ALTIVEC_P (V4SFmode)
> +   && out_mode == SFmode && out_n == 4
> +   && in_mode == SFmode && in_n == 4)
> + return rs6000_builtin_decls_x[RS6000_BIF_COPYSIGN_V4SF];
> +  break;
> +CASE_CFN_CEIL:
> +  if (VECTOR_UNIT_VSX_P (V2DFmode)
> +   && out_mode == DFmode && out_n == 2
> +   && in_mode == DFmode && in_n == 2)
> + return rs6000_builtin_decls_x[RS6000_BIF_XVRDPIP];
> +  if (VECTOR_UNIT_VSX_P (V4SFmode)
> +   && out_mode == SFmode && out_n == 4
> +   && in_mode == SFmode && in_n == 4)
> + return rs6000_builtin_decls_x[RS6000_BIF_XVRSPIP];
> +  if (VECTOR_UNIT_ALTIVEC_P (V4SFmode)
> +   && out_mode == SFmode && out_n == 4
> +   && in_mode == SFmode && in_n == 4)
> + return rs6000_builtin_decls_x[RS6000_BIF_VRFIP];
> +  break;
> +CASE_CFN_FLOOR:
> +  if (VECTOR_UNIT_VSX_P (V2DFmode)
> +   && out_mode == DFmode && out_n == 2
> +   && in_mode == DFmode && in_n == 2)
> + return rs6000_builtin_decls_x[RS6000_BIF_XVRDPIM];
> +  if (VECTOR_UNIT_VSX_P (V4SFmode)
> +   && out_mode == SFmode && out_n == 4
> +   && in_mode == SFmode && in_n == 4)
> + return rs6000_builtin_decls_x[RS6000_BIF_XVRSPIM];
> +  if (VECTOR_UNIT_ALTIVEC_P (V4SFmode)
> +   && out_mode == SFmode && out_n == 4
> +   && in_mode == SFmode && in_n == 4)
> + return rs6000_builtin_decls_x[RS6000_BIF_VRFIM];
> +  break;
> +CASE_CFN_FMA:
> +  if (VECTOR_UNIT_VSX_P (V2DFmode)
> +   && out_mode == DFmode && out_n == 2
> +   && in_mode == DFmode && in_n == 2)
> + return rs6000_builtin_decls_x[RS6000_BIF_XVMADDDP];
> +  if (VECTOR_UNIT_VSX_P (V4SFmode)
> +   && out_mode == SFmode && out_n == 4
> +   && in_mode == SFmode && in_n == 4)
> + return rs6000_builtin_decls_x[RS6000_BIF_XVMADDSP];
> +  if (VECTOR_UNIT_ALTIVEC_P (V4SFmode)
> +   && out_mode == SFmode && out_n == 4
> +   && in_mode == SFmode && in_n == 4)
> + return rs6000_builtin_decls_x[RS6000_BIF_VMADDFP];
> +  break;
> +CASE_CFN_TRUNC:
> +  if (VECTOR_UNIT_VSX_P (V2DFmode)
> +   && out_mode == DFmode && out_n == 2
> +   && in_mode == DFmode && in_n == 2)
> + return 

Re: [PATCH][v2] Always default to DWARF2_DEBUG if not specified, warn about deprecated STABS

2021-09-13 Thread Jeff Law via Gcc-patches




On 9/13/2021 10:52 AM, Koning, Paul wrote:
>> On Sep 13, 2021, at 3:31 AM, Richard Biener wrote:
>>
>> This makes defaults.h choose DWARF2_DEBUG if PREFERRED_DEBUGGING_TYPE
>> is not specified by the target and NO_DEBUG if DWARF is not supported.
>>
>> It also makes us warn when STABS is enabled and removes the corresponding
>> diagnostic from the Ada frontend.  The warnings are pruned from the
>> testsuite output via prune_gcc_output.
>>
>> This leaves the following targets without debug support:
>>
>> pdp11-*-*   pdp11 is a.out, dwarf support is difficult
>
> I'll admit that I don't know much about debug formats.  It is definitely
> the case that pdp11 output is a.out (it may be BSD 2.x style a.out -- which
> I think is somewhat different though it's been many years since I looked at
> that, and then only briefly).  I guess that constrains which debug formats
> can be used, but I don't know any details.

My recollection of aout stabs is mostly lost.  IIRC we'd emit .stabs 
directives to the assembler which would turn into symbol table entries.


Embedded stabs used the same underlying stab strings, but instead put 
the information into a special section.  That requires an object format 
that supports named sections, so it's a non-starter for a.out.


dwarf also requires named sections.  While in theory one could probably
do something hackish like dwarf embedded in .stab directives, that just
seems awful.




> pdp11-elf was done as an experiment by someone else, in binutils.  I'll
> ask about the status of that.  If it's possible to deliver that, it would
> presumably enable DWARF support.  Is that all common code so it's a matter
> of enabling it, or would "DWARF machine details for pdp11" have to be
> defined?

That's going to be the best path forward.  Get the pdp11-elf bits 
working and the dwarf2 debugging stuff should come along for free.


Jeff
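
The defaulting rule from Richard's patch description quoted above (use the target's preference if it has one, otherwise DWARF2 if supported, otherwise no debug info, and warn on STABS) can be modelled roughly as below.  This is a sketch under assumptions: the real logic lives in preprocessor conditionals in defaults.h, and `DebugType`, `TargetConfig`, `default_debug_type` and `warn_deprecated` are hypothetical names.

```cpp
// Hypothetical model of the debug formats discussed in the thread.
enum class DebugType { None, Dwarf2, Stabs };

// Hypothetical summary of what a target's headers provide.
struct TargetConfig
{
  bool has_preferred;    // target specified PREFERRED_DEBUGGING_TYPE
  DebugType preferred;   // meaningful only when has_preferred is true
  bool supports_dwarf;   // target is able to emit DWARF
};

// Pick the default: explicit preference first, then DWARF2 when it is
// supported, otherwise no debug info at all.
static DebugType default_debug_type(const TargetConfig &t)
{
  if (t.has_preferred)
    return t.preferred;
  return t.supports_dwarf ? DebugType::Dwarf2 : DebugType::None;
}

// STABS is the format the patch starts warning about.
static bool warn_deprecated(DebugType selected)
{
  return selected == DebugType::Stabs;
}
```

In this model an a.out-only target such as the pdp11 case discussed above, with no preference and no DWARF support, ends up with `DebugType::None`.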



Re: [PATCH] Remove DARWIN_PREFER_DWARF and dead code

2021-09-13 Thread Iain Sandoe
Hi Folks

> On 10 Sep 2021, at 16:16, Jeff Law  wrote:
> On 9/10/2021 1:19 AM, Richard Biener via Gcc-patches wrote:
>> This removes the always defined DARWIN_PREFER_DWARF and the code
>> guarded by it being not defined, removing the possibility to
>> default some i386 darwin configurations to STABS when it would
>> not be defined.
>> 
>> OK for trunk?
>> 
>> Thanks,
>> Richard.
>> 
>> 2021-09-10  Richard Biener  
>> 
>>  * config/darwin.h (DARWIN_PREFER_DWARF): Do not define.
>>  * config/i386/darwin.h (PREFERRED_DEBUGGING_TYPE): Do not
>>  change based on DARWIN_PREFER_DWARF not being defined.

Sorry, was OOO and without sensible connection.

As you saw I was part way through ripping out stabs for Darwin (I disabled it 
on the last cycle, but with an easy route to re-enabling should there be any 
fallout) - no one has complained - so the patch ought to be fine.

> OK.  I'm not too worried about supporting 32bit darwin 8 and earlier.  That's 
> got to be at least a decade out of service at this point

The Darwin maintainers have had a policy of not breaking older versions on 
purpose (and I have gone along with that for now) - but if you want to build on 
anything older than Darwin8 expect to have to build a whole bunch of supporting 
tools first to support the bootstrap…

For the record, Darwin8 works fine for GCC11 - and I’d not expect a problem 
with GCC12 (so far) - there will come a point that we want to default to using 
embedded rpaths, which it doesn’t support so we shall see what options there 
are then.

Iain



Re: [PATCH 04/18] rs6000: Handle some recent MMA builtin changes

2021-09-13 Thread will schmidt via Gcc-patches
On Wed, 2021-09-01 at 11:13 -0500, Bill Schmidt via Gcc-patches wrote:
> Peter Bergner recently added two new builtins __builtin_vsx_lxvp and
> __builtin_vsx_stxvp.  These happened to break a pattern in MMA builtins that
> I had been using to automate gimple folding of MMA builtins.  Previously,
> every MMA function that could be folded had an associated internal function
> that it was folded into.  The LXVP/STXVP builtins are just folded directly
> into memory operations.
> 
> Instead of relying on this pattern, this patch adds a new attribute to
> builtins called "mmaint," which is set for all MMA builtins that have an
> associated internal builtin.  The naming convention that adds _INTERNAL to
> the builtin index name remains.
> 
> The rest of the patch is just duplicating Peter's patch, using the new
> builtin infrastructure.
> 
> 2021-08-23  Bill Schmidt  
> 
> gcc/
>   * config/rs6000/rs6000-builtin-new.def (ASSEMBLE_ACC): Add mmaint flag.
>   (ASSEMBLE_PAIR): Likewise.
>   (BUILD_ACC): Likewise.
>   (DISASSEMBLE_ACC): Likewise.
>   (DISASSEMBLE_PAIR): Likewise.
>   (PMXVBF16GER2): Likewise.
>   (PMXVBF16GER2NN): Likewise.
>   (PMXVBF16GER2NP): Likewise.
>   (PMXVBF16GER2PN): Likewise.
>   (PMXVBF16GER2PP): Likewise.
>   (PMXVF16GER2): Likewise.
>   (PMXVF16GER2NN): Likewise.
>   (PMXVF16GER2NP): Likewise.
>   (PMXVF16GER2PN): Likewise.
>   (PMXVF16GER2PP): Likewise.
>   (PMXVF32GER): Likewise.
>   (PMXVF32GERNN): Likewise.
>   (PMXVF32GERNP): Likewise.
>   (PMXVF32GERPN): Likewise.
>   (PMXVF32GERPP): Likewise.
>   (PMXVF64GER): Likewise.
>   (PMXVF64GERNN): Likewise.
>   (PMXVF64GERNP): Likewise.
>   (PMXVF64GERPN): Likewise.
>   (PMXVF64GERPP): Likewise.
>   (PMXVI16GER2): Likewise.
>   (PMXVI16GER2PP): Likewise.
>   (PMXVI16GER2S): Likewise.
>   (PMXVI16GER2SPP): Likewise.
>   (PMXVI4GER8): Likewise.
>   (PMXVI4GER8PP): Likewise.
>   (PMXVI8GER4): Likewise.
>   (PMXVI8GER4PP): Likewise.
>   (PMXVI8GER4SPP): Likewise.
>   (XVBF16GER2): Likewise.
>   (XVBF16GER2NN): Likewise.
>   (XVBF16GER2NP): Likewise.
>   (XVBF16GER2PN): Likewise.
>   (XVBF16GER2PP): Likewise.
>   (XVF16GER2): Likewise.
>   (XVF16GER2NN): Likewise.
>   (XVF16GER2NP): Likewise.
>   (XVF16GER2PN): Likewise.
>   (XVF16GER2PP): Likewise.
>   (XVF32GER): Likewise.
>   (XVF32GERNN): Likewise.
>   (XVF32GERNP): Likewise.
>   (XVF32GERPN): Likewise.
>   (XVF32GERPP): Likewise.
>   (XVF64GER): Likewise.
>   (XVF64GERNN): Likewise.
>   (XVF64GERNP): Likewise.
>   (XVF64GERPN): Likewise.
>   (XVF64GERPP): Likewise.
>   (XVI16GER2): Likewise.
>   (XVI16GER2PP): Likewise.
>   (XVI16GER2S): Likewise.
>   (XVI16GER2SPP): Likewise.
>   (XVI4GER8): Likewise.
>   (XVI4GER8PP): Likewise.
>   (XVI8GER4): Likewise.
>   (XVI8GER4PP): Likewise.
>   (XVI8GER4SPP): Likewise.
>   (XXMFACC): Likewise.
>   (XXMTACC): Likewise.
>   (XXSETACCZ): Likewise.
>   (ASSEMBLE_PAIR_V): Likewise.
>   (BUILD_PAIR): Likewise.
>   (DISASSEMBLE_PAIR_V): Likewise.
>   (LXVP): New.
>   (STXVP): New.

ok

>   * config/rs6000/rs6000-call.c
>   (rs6000_gimple_fold_new_mma_builtin): Handle RS6000_BIF_LXVP and
>   RS6000_BIF_STXVP.
>   * config/rs6000/rs6000-gen-builtins.c (attrinfo): Add ismmaint.
>   (parse_bif_attrs): Handle ismmaint.
>   (write_decls): Add bif_mmaint_bit and bif_is_mmaint.
>   (write_bif_static_init): Handle ismmaint.

ok
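
For readers unfamiliar with the attribute scheme, a flag like this typically boils down to one bit in a per-builtin attribute word plus a predicate.  The sketch below borrows the names `bif_mmaint_bit` and `bif_is_mmaint` from the ChangeLog, but the bit positions, the `bifdata` layout, the `bifattrs` field and the table entries are assumptions made for illustration, not the generated code.

```cpp
#include <cstdint>

// Bit positions are made up for this sketch.
constexpr std::uint32_t bif_mma_bit = 1u << 0;
constexpr std::uint32_t bif_mmaint_bit = 1u << 1;

// Hypothetical per-builtin record.
struct bifdata
{
  const char *name;
  std::uint32_t bifattrs;
};

// True when the builtin has an associated _INTERNAL builtin that
// gimple folding should expand it into.
static bool bif_is_mmaint(const bifdata &b)
{
  return (b.bifattrs & bif_mmaint_bit) != 0;
}

// ASSEMBLE_ACC carries {mma,mmaint} in the patch.  The attribute set
// shown for LXVP is assumed; the point is only that it lacks mmaint,
// since it is folded directly into a memory operation.
static const bifdata sketch_assemble_acc = {"ASSEMBLE_ACC",
                                            bif_mma_bit | bif_mmaint_bit};
static const bifdata sketch_lxvp = {"LXVP", bif_mma_bit};
```

The folding code can then test the flag instead of relying on every foldable MMA builtin having an internal counterpart.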

> ---
>  gcc/config/rs6000/rs6000-builtin-new.def | 145 ---
>  gcc/config/rs6000/rs6000-call.c  |  38 +-
>  gcc/config/rs6000/rs6000-gen-builtins.c  |  38 +++---
>  3 files changed, 135 insertions(+), 86 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000-builtin-new.def 
> b/gcc/config/rs6000/rs6000-builtin-new.def
> index a8c6b9e988f..1966516551e 100644
> --- a/gcc/config/rs6000/rs6000-builtin-new.def
> +++ b/gcc/config/rs6000/rs6000-builtin-new.def
> @@ -129,6 +129,7 @@
>  ;   mma  Needs special handling for MMA
>  ;   quad MMA instruction using a register quad as an input operand
>  ;   pair MMA instruction using a register pair as an input operand
> +;   mmaint   MMA instruction expanding to internal call at GIMPLE time
>  ;   no32bit  Not valid for TARGET_32BIT
>  ;   32bitRequires different handling for TARGET_32BIT
>  ;   cpu  This is a "cpu_is" or "cpu_supports" builtin
> @@ -3584,415 +3585,421 @@
> 
>  [mma]
>void __builtin_mma_assemble_acc (v512 *, vuc, vuc, vuc, vuc);
> -ASSEMBLE_ACC nothing {mma}
> +ASSEMBLE_ACC nothing {mma,mmaint}
> 
>v512 __builtin_mma_assemble_acc_internal (vuc, vuc, vuc, vuc);
>  ASSEMBLE_ACC_INTERNAL mma_assemble_acc {mma}
> 
>void __builtin_mma_assemble_pair (v256 *, vuc, vuc);
> -ASSEMBLE_PAIR nothing {mma}
> + 

Re: [PATCH 03/18] rs6000: Handle gimple folding of target built-ins

2021-09-13 Thread will schmidt via Gcc-patches
On Wed, 2021-09-01 at 11:13 -0500, Bill Schmidt via Gcc-patches wrote:
> This is another patch that looks bigger than it really is.  Because we
> have a new namespace for the builtins, allowing us to have both the old
> and new builtin infrastructure supported at once, we need versions of
> these functions that use the new builtin namespace.  Otherwise the code is
> unchanged.
> 
> 2021-08-31  Bill Schmidt  
> 
> gcc/
>   * config/rs6000/rs6000-call.c (rs6000_gimple_fold_new_builtin):
>   New forward decl.
>   (rs6000_gimple_fold_builtin): Call rs6000_gimple_fold_new_builtin.
>   (rs6000_new_builtin_valid_without_lhs): New function.
>   (rs6000_gimple_fold_new_mma_builtin): Likewise.
>   (rs6000_gimple_fold_new_builtin): Likewise.
> ---
>  gcc/config/rs6000/rs6000-call.c | 1165 +++
>  1 file changed, 1165 insertions(+)
> 
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index 2c68aa3580c..eae4e15df1e 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -190,6 +190,7 @@ static tree builtin_function_type (machine_mode, 
> machine_mode,
>  static void rs6000_common_init_builtins (void);
>  static void htm_init_builtins (void);
>  static void mma_init_builtins (void);
> +static bool rs6000_gimple_fold_new_builtin (gimple_stmt_iterator *gsi);
> 
> 
>  /* Hash table to keep track of the argument types for builtin functions.  */
> @@ -12024,6 +12025,9 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator 
> *gsi)
>  bool
>  rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
>  {
> +  if (new_builtins_are_live)
> +return rs6000_gimple_fold_new_builtin (gsi);
> +
>gimple *stmt = gsi_stmt (*gsi);
>tree fndecl = gimple_call_fndecl (stmt);
>gcc_checking_assert (fndecl && DECL_BUILT_IN_CLASS (fndecl) == 
> BUILT_IN_MD);

ok

> @@ -12971,6 +12975,35 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator 
> *gsi)
>return false;
>  }
> 
> +/*  Helper function to sort out which built-ins may be valid without having
> +a LHS.  */
> +static bool
> +rs6000_new_builtin_valid_without_lhs (enum rs6000_gen_builtins fn_code,
> +   tree fndecl)
> +{
> +  if (TREE_TYPE (TREE_TYPE (fndecl)) == void_type_node)
> +return true;

Is that a better or improved version of the code as seen in
rs6000_builtin_valid_without_lhs ? 
That is
>  if (rs6000_builtin_info[fn_code].attr & RS6000_BTC_VOID)
>return true;

ok either way.


> +
> +  switch (fn_code)
> +{
> +case RS6000_BIF_STVX_V16QI:
> +case RS6000_BIF_STVX_V8HI:
> +case RS6000_BIF_STVX_V4SI:
> +case RS6000_BIF_STVX_V4SF:
> +case RS6000_BIF_STVX_V2DI:
> +case RS6000_BIF_STVX_V2DF:
> +case RS6000_BIF_STXVW4X_V16QI:
> +case RS6000_BIF_STXVW4X_V8HI:
> +case RS6000_BIF_STXVW4X_V4SF:
> +case RS6000_BIF_STXVW4X_V4SI:
> +case RS6000_BIF_STXVD2X_V2DF:
> +case RS6000_BIF_STXVD2X_V2DI:
> +  return true;
> +default:
> +  return false;
> +}
> +}
> +
>  /* Check whether a builtin function is supported in this target
> configuration.  */
>  bool
> @@ -13024,6 +13057,1138 @@ rs6000_new_builtin_is_supported (enum 
> rs6000_gen_builtins fncode)
>gcc_unreachable ();
>  }
> 
> +/* Expand the MMA built-ins early, so that we can convert the 
> pass-by-reference
> +   __vector_quad arguments into pass-by-value arguments, leading to more
> +   efficient code generation.  */
> +static bool
> +rs6000_gimple_fold_new_mma_builtin (gimple_stmt_iterator *gsi,
> + rs6000_gen_builtins fn_code)
> +{
> +  gimple *stmt = gsi_stmt (*gsi);
> +  size_t fncode = (size_t) fn_code;
> +
> +  if (!bif_is_mma (rs6000_builtin_info_x[fncode]))
> +return false;
> +
> +  /* Each call that can be gimple-expanded has an associated built-in
> + function that it will expand into.  If this one doesn't, we have
> + already expanded it!  */
> +  if (rs6000_builtin_info_x[fncode].assoc_bif == RS6000_BIF_NONE)
> +return false;
> +
> +  bifdata *bd = &rs6000_builtin_info_x[fncode];
> +  unsigned nopnds = bd->nargs;
> +  gimple_seq new_seq = NULL;
> +  gimple *new_call;
> +  tree new_decl;
> +
> +  /* Compatibility built-ins; we used to call these
> + __builtin_mma_{dis,}assemble_pair, but now we call them
> + __builtin_vsx_{dis,}assemble_pair.  Handle the old versions.  */
> +  if (fncode == RS6000_BIF_ASSEMBLE_PAIR)
> +fncode = RS6000_BIF_ASSEMBLE_PAIR_V;
> +  else if (fncode == RS6000_BIF_DISASSEMBLE_PAIR)
> +fncode = RS6000_BIF_DISASSEMBLE_PAIR_V;
> +
> +  if (fncode == RS6000_BIF_DISASSEMBLE_ACC
> +  || fncode == RS6000_BIF_DISASSEMBLE_PAIR_V)
> +{
> +  /* This is an MMA disassemble built-in function.  */
> +  push_gimplify_context (true);
> +  unsigned nvec = (fncode == RS6000_BIF_DISASSEMBLE_ACC) ? 4 : 2;
> +  tree dst_ptr = gimple_call_arg (stmt, 0);
> +  tree src_ptr = gimple_call_arg 

[PATCH] c++: fix wrong fixit hints for misspelled typedef [PR77565]

2021-09-13 Thread Michel Morin via Gcc-patches
Hi,

PR77565 reports that, with the code `typdef int Int;`, GCC emits
"did you mean 'typeof'?" instead of "did you mean 'typedef'?".

This happens because the typo corrector determines that `typeof` is a
candidate for suggestion (through `cp_keyword_starts_decl_specifier_p`),
but `typedef` is not.
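
To illustrate why the candidate set matters, here is a minimal sketch in C. It is not GCC's spell checker: it uses a plain Levenshtein distance as a stand-in (GCC has its own metric and cutoffs), and every `sketch_*` name and both keyword lists are hypothetical, chosen only to mirror the situation described above.

```c
#include <stddef.h>
#include <string.h>

/* Plain Levenshtein distance (insert, delete, substitute; cost 1 each).
   Returns (size_t) -1 if B does not fit the fixed-size row buffer.  */
static size_t
sketch_edit_distance (const char *a, const char *b)
{
  size_t la = strlen (a), lb = strlen (b);
  size_t row[32];

  if (lb >= 32)
    return (size_t) -1;

  for (size_t j = 0; j <= lb; j++)
    row[j] = j;

  for (size_t i = 1; i <= la; i++)
    {
      size_t diag = row[0];	/* Previous row, column j - 1.  */
      row[0] = i;
      for (size_t j = 1; j <= lb; j++)
	{
	  size_t up = row[j];	/* Previous row, column j.  */
	  size_t best = diag + (a[i - 1] != b[j - 1]);
	  if (up + 1 < best)
	    best = up + 1;
	  if (row[j - 1] + 1 < best)
	    best = row[j - 1] + 1;
	  row[j] = best;
	  diag = up;
	}
    }
  return row[lb];
}

/* Return the candidate closest to TYPO; the first one wins on a tie.  */
const char *
sketch_closest_keyword (const char *typo, const char *const *cands, size_t n)
{
  const char *best = NULL;
  size_t best_dist = (size_t) -1;

  for (size_t i = 0; i < n; i++)
    {
      size_t d = sketch_edit_distance (typo, cands[i]);
      if (d < best_dist)
	{
	  best_dist = d;
	  best = cands[i];
	}
    }
  return best;
}

/* Hypothetical candidate sets: without and with the extra keywords.  */
const char *const sketch_before[] = { "float", "double", "void", "typeof" };
const char *const sketch_after[] = { "float", "double", "void", "const",
				     "volatile", "typedef", "inline",
				     "typeof" };
```

With `sketch_before`, the closest match for "typdef" is "typeof" (two substitutions); once "typedef" is in the list, it wins with a single insertion.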

This patch fixes the issue by adding `typedef` as a candidate. The patch
additionally adds the `inline` specifier and the cv-qualifiers as candidates.
Here is a patch (tests `make check-gcc` pass on darwin):


c++: add typo corrections for typedef/inline/cv-qual [PR77565]

PR c++/77565

gcc/cp/ChangeLog:

* parser.c (cp_keyword_starts_decl_specifier_p): Handle
typedef/inline specifiers and cv-qualifiers.

gcc/testsuite/ChangeLog:

* g++.dg/spellcheck-typenames.C: Add tests for decl-specs.

--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -1051,6 +1051,12 @@ cp_keyword_starts_decl_specifier_p (enum rid keyword)
 case RID_FLOAT:
 case RID_DOUBLE:
 case RID_VOID:
+  /* CV qualifiers.  */
+case RID_CONST:
+case RID_VOLATILE:
+  /* typedef/inline specifiers.  */
+case RID_TYPEDEF:
+case RID_INLINE:
   /* GNU extensions.  */
 case RID_ATTRIBUTE:
 case RID_TYPEOF:
--- a/gcc/testsuite/g++.dg/spellcheck-typenames.C
+++ b/gcc/testsuite/g++.dg/spellcheck-typenames.C
@@ -76,3 +76,38 @@ singed char ch; // { dg-error "1: 'singed' does not
name a type; did you mean 's
  ^~
  signed
{ dg-end-multiline-output "" } */
+
+typdef int my_int; // { dg-error "1: 'typdef' does not name a type;
did you mean 'typedef'?" }
+/* { dg-begin-multiline-output "" }
+ typdef int my_int;
+ ^~
+ typedef
+   { dg-end-multiline-output "" } */
+
+inlien int inline_func(); // { dg-error "1: 'inlien' does not name a
type; did you mean 'inline'?" }
+/* { dg-begin-multiline-output "" }
+ inlien int inline_func();
+ ^~
+ inline
+   { dg-end-multiline-output "" } */
+
+coonst int ci = 0; // { dg-error "1: 'coonst' does not name a type;
did you mean 'const'?" }
+/* { dg-begin-multiline-output "" }
+ coonst int ci = 0;
+ ^~
+ const
+   { dg-end-multiline-output "" } */
+
+voltil int vi; // { dg-error "1: 'voltil' does not name a type; did
you mean 'volatile'?" }
+/* { dg-begin-multiline-output "" }
+ voltil int vi;
+ ^~
+ volatile
+   { dg-end-multiline-output "" } */
+
+statik int si; // { dg-error "1: 'statik' does not name a type; did
you mean 'static'?" }
+/* { dg-begin-multiline-output "" }
+ statik int si;
+ ^~
+ static
+   { dg-end-multiline-output "" } */


--
Regards,
Michel
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index f9c2c8ac3a7..5295911eb82 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -1051,6 +1051,12 @@ cp_keyword_starts_decl_specifier_p (enum rid keyword)
 case RID_FLOAT:
 case RID_DOUBLE:
 case RID_VOID:
+  /* CV qualifiers.  */
+case RID_CONST:
+case RID_VOLATILE:
+  /* typedef/inline specifiers.  */
+case RID_TYPEDEF:
+case RID_INLINE:
   /* GNU extensions.  */
 case RID_ATTRIBUTE:
 case RID_TYPEOF:
diff --git a/gcc/testsuite/g++.dg/spellcheck-typenames.C 
b/gcc/testsuite/g++.dg/spellcheck-typenames.C
index ff53ecc6303..75f80480e16 100644
--- a/gcc/testsuite/g++.dg/spellcheck-typenames.C
+++ b/gcc/testsuite/g++.dg/spellcheck-typenames.C
@@ -76,3 +76,38 @@ singed char ch; // { dg-error "1: 'singed' does not name a 
type; did you mean 's
  ^~
  signed
{ dg-end-multiline-output "" } */
+
+typdef int my_int; // { dg-error "1: 'typdef' does not name a type; did you 
mean 'typedef'?" }
+/* { dg-begin-multiline-output "" }
+ typdef int my_int;
+ ^~
+ typedef
+   { dg-end-multiline-output "" } */
+
+inlien int inline_func(); // { dg-error "1: 'inlien' does not name a type; did 
you mean 'inline'?" }
+/* { dg-begin-multiline-output "" }
+ inlien int inline_func();
+ ^~
+ inline
+   { dg-end-multiline-output "" } */
+
+coonst int ci = 0; // { dg-error "1: 'coonst' does not name a type; did you 
mean 'const'?" }
+/* { dg-begin-multiline-output "" }
+ coonst int ci = 0;
+ ^~
+ const
+   { dg-end-multiline-output "" } */
+
+voltil int vi; // { dg-error "1: 'voltil' does not name a type; did you mean 
'volatile'?" }
+/* { dg-begin-multiline-output "" }
+ voltil int vi;
+ ^~
+ volatile
+   { dg-end-multiline-output "" } */
+
+statik int si; // { dg-error "1: 'statik' does not name a type; did you mean 
'static'?" }
+/* { dg-begin-multiline-output "" }
+ statik int si;
+ ^~
+ static
+   { dg-end-multiline-output "" } */


Re: [PATCH, Fortran] Revert to non-multilib-specific ISO_Fortran_binding.h

2021-09-13 Thread Jakub Jelinek via Gcc-patches
On Mon, Sep 13, 2021 at 05:56:53PM +0200, Gerald Pfeifer wrote:
> % egrep -r '#define.*LDBL_(MANT_DIG|MIN_EXP|MAX_EXP)' /usr/include/
> /usr/include/x86/float.h:#define LDBL_MANT_DIG  64
> /usr/include/x86/float.h:#define LDBL_MIN_EXP   (-16381)
> /usr/include/x86/float.h:#define LDBL_MAX_EXP   16384
> 
> This looks like it matches existing Linux case already in place?

Those are indeed the same.  But perhaps the FreeBSD float.h header
guards those defines with some preprocessor condition?

Jakub



Re: [PATCH 02/18] rs6000: Move __builtin_mffsl to the [always] stanza

2021-09-13 Thread will schmidt via Gcc-patches
On Wed, 2021-09-01 at 11:13 -0500, Bill Schmidt via Gcc-patches wrote:
> I over-restricted use of __builtin_mffsl, since I was unaware that it
> automatically uses mffs when mffsl is not available.  Paul Clarke
> pointed
> this out in discussion of his SSE 4.1 compatibility patches.
> 
> 2021-08-31  Bill Schmidt  
> 
> gcc/
>   * config/rs6000/rs6000-call.c (__builtin_mffsl): Move from
> [power9]
>   to [always].
> ---
>  gcc/config/rs6000/rs6000-builtin-new.def | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/config/rs6000/rs6000-builtin-new.def
> b/gcc/config/rs6000/rs6000-builtin-new.def
> index 6a28d5189f8..a8c6b9e988f 100644
> --- a/gcc/config/rs6000/rs6000-builtin-new.def
> +++ b/gcc/config/rs6000/rs6000-builtin-new.def
> @@ -208,6 +208,12 @@
>double __builtin_mffs ();
>  MFFS rs6000_mffs {}
> 
> +; Although the mffsl instruction is only available on POWER9 and
> later
> +; processors, this builtin automatically falls back to mffs on older
> +; platforms.  Thus it appears here in the [always] stanza.
> +  double __builtin_mffsl ();
> +MFFSL rs6000_mffsl {}
> +
>  ; This thing really assumes long double == __ibm128, and I'm told it
> has
>  ; been used as such within libgcc.  Given that __builtin_pack_ibm128
>  ; exists for the same purpose, this should really not be used at
> all.
> @@ -2784,9 +2790,6 @@
>signed long long __builtin_darn_raw ();
>  DARN_RAW darn_raw {}
> 
> -  double __builtin_mffsl ();
> -MFFSL rs6000_mffsl {}
> -
>const signed int __builtin_dtstsfi_eq_dd (const int<6>,
> _Decimal64);
>  TSTSFI_EQ_DD dfptstsfi_eq_dd {}
> 


Looks reasonable,
Thanks
-Will



Re: [PATCH, Fortran] Revert to non-multilib-specific ISO_Fortran_binding.h

2021-09-13 Thread Andreas Schwab
On Sep 13 2021, Gerald Pfeifer wrote:

> % egrep -r '#define.*LDBL_(MANT_DIG|MIN_EXP|MAX_EXP)' /usr/include/
> /usr/include/x86/float.h:#define LDBL_MANT_DIG  64
> /usr/include/x86/float.h:#define LDBL_MIN_EXP   (-16381)
> /usr/include/x86/float.h:#define LDBL_MAX_EXP   16384
>
> This looks like it matches existing Linux case already in place?

gcc has its own <float.h>, see gcc/include/float.h in the build
directory.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH, Fortran] Revert to non-multilib-specific ISO_Fortran_binding.h

2021-09-13 Thread Jakub Jelinek via Gcc-patches
On Mon, Sep 13, 2021 at 07:07:01PM +0200, Tobias Burnus wrote:
> Regarding FreeBSD: Does this output different values? – If yes, we know
> what to do, otherwise – hmm.
> 
> [...]
> 
> > > Wouldn't it be better to use the __LDBL_* macros anyway and not rely on
> > > float.h?  The header doesn't want to test what float.h tells about the
> > > long double type, but what the compiler knows about it.
> > I originally wrote the code to use the internal GCC __LDBL_* macros as
> > you suggest, but Tobias complained that then the gfortran-provided .h
> > file could not be used to compile the C parts of the program with some
> > other C compiler.
> For instance, clang does not seem to provide those - and in some cases,
> it can be useful to mix gfortran code with code compiled by other
> compilers (icc, clang, ...).
> > Maybe it needs to first check the internal macros and then look for
> > the float.h versions if it can't find them?
> 
> I think that makes sense. (Adding a comment that #include <float.h> is
> for non-GCC compilers, only.)

At least according to godbolt and my tests, both clang and icc predefine
those macros too.  But there are other C compilers, sure.

So we'd need
#if defined (__LDBL_MAX__) && defined (__LDBL_*_) // whatever we need
#else
#include <float.h>
...
#endif
or so.

Jakub
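
The check-then-fall-back order suggested here could look roughly like the sketch below. It is only an illustration, not the code in ISO_Fortran_binding.h: the `SKETCH_*` macros and `sketch_long_double_kind` are hypothetical names, the kind numbers are an assumption based on gfortran's usual convention, and only three long double formats are recognized.

```c
/* Sketch only: prefer the compiler's predefined __LDBL_*__ macros and
   fall back to <float.h> for compilers that do not predefine them.  */
#if defined (__LDBL_MANT_DIG__) && defined (__LDBL_MIN_EXP__) \
    && defined (__LDBL_MAX_EXP__)
# define SKETCH_LDBL_MANT_DIG __LDBL_MANT_DIG__
# define SKETCH_LDBL_MIN_EXP  __LDBL_MIN_EXP__
# define SKETCH_LDBL_MAX_EXP  __LDBL_MAX_EXP__
#else
# include <float.h>	/* Only reached by compilers without the macros.  */
# define SKETCH_LDBL_MANT_DIG LDBL_MANT_DIG
# define SKETCH_LDBL_MIN_EXP  LDBL_MIN_EXP
# define SKETCH_LDBL_MAX_EXP  LDBL_MAX_EXP
#endif

/* Return an assumed Fortran kind for long double, or 0 if the format is
   not recognized (the real header issues #error in that situation).  */
int
sketch_long_double_kind (void)
{
#if (SKETCH_LDBL_MANT_DIG == 53 && SKETCH_LDBL_MIN_EXP == -1021 \
     && SKETCH_LDBL_MAX_EXP == 1024)
  return 8;	/* Same format as double.  */
#elif (SKETCH_LDBL_MANT_DIG == 64 && SKETCH_LDBL_MIN_EXP == -16381 \
       && SKETCH_LDBL_MAX_EXP == 16384)
  return 10;	/* x87 extended precision.  */
#elif (SKETCH_LDBL_MANT_DIG == 113 && SKETCH_LDBL_MIN_EXP == -16381 \
       && SKETCH_LDBL_MAX_EXP == 16384)
  return 16;	/* IEEE binary128.  */
#else
  return 0;
#endif
}
```

Because the predefined macros are tested first, a system float.h that disagrees with the compiler is never consulted when GCC (or another compiler that predefines them) is in use.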



Re: [PATCH 01/18] rs6000: Handle overloads during program parsing

2021-09-13 Thread will schmidt via Gcc-patches
On Wed, 2021-09-01 at 11:13 -0500, Bill Schmidt via Gcc-patches wrote:

Hi, 
  Just a couple of cosmetic nits noted below, the majority of which are also in
the original code this is based on.
Thanks
-Will


> Although this patch looks quite large, the changes are fairly minimal.
> Most of it is duplicating the large function that does the overload
> resolution using the automatically generated data structures instead of
> the old hand-generated ones.  This doesn't make the patch terribly easy to
> review, unfortunately.  Just be aware that generally we aren't changing
> the logic and functionality of overload handling.

ok


> 
> 2021-08-31  Bill Schmidt  
> 
> gcc/
>   * config/rs6000/rs6000-c.c (rs6000-builtins.h): New include.
>   (altivec_resolve_new_overloaded_builtin): New forward decl.
>   (rs6000_new_builtin_type_compatible): New function.
>   (altivec_resolve_overloaded_builtin): Call
>   altivec_resolve_new_overloaded_builtin.
>   (altivec_build_new_resolved_builtin): New function.
>   (altivec_resolve_new_overloaded_builtin): Likewise.
>   * config/rs6000/rs6000-call.c (rs6000_new_builtin_is_supported):
>   Likewise.
>   * config/rs6000/rs6000-gen-builtins.c (write_decls): Remove _p from
>   name of rs6000_new_builtin_is_supported.


ok

> ---
>  gcc/config/rs6000/rs6000-c.c| 1088 +++
>  gcc/config/rs6000/rs6000-call.c |   53 ++
>  gcc/config/rs6000/rs6000-gen-builtins.c |2 +-
>  3 files changed, 1142 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
> index afcb5bb6e39..aafb4e6a98f 100644
> --- a/gcc/config/rs6000/rs6000-c.c
> +++ b/gcc/config/rs6000/rs6000-c.c
> @@ -35,6 +35,9 @@
>  #include "langhooks.h"
>  #include "c/c-tree.h"
> 
> +#include "rs6000-builtins.h"
> +
> +static tree altivec_resolve_new_overloaded_builtin (location_t, tree, void 
> *);
> 
> 
>  /* Handle the machine specific pragma longcall.  Its syntax is
> @@ -811,6 +814,30 @@ is_float128_p (tree t)
> && t == long_double_type_node));
>  }
> 
> +static bool
> +rs6000_new_builtin_type_compatible (tree t, tree u)
> +{
> +  if (t == error_mark_node)
> +return false;
> +
> +  if (INTEGRAL_TYPE_P (t) && INTEGRAL_TYPE_P (u))
> +return true;
> +
> +  if (TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
> +  && is_float128_p (t) && is_float128_p (u))
> +return true;
> +
> +  if (POINTER_TYPE_P (t) && POINTER_TYPE_P (u))
> +{
> +  t = TREE_TYPE (t);
> +  u = TREE_TYPE (u);
> +  if (TYPE_READONLY (u))
> + t = build_qualified_type (t, TYPE_QUAL_CONST);
> +}
> +
> +  return lang_hooks.types_compatible_p (t, u);
> +}
> +

ok

>  static inline bool
>  rs6000_builtin_type_compatible (tree t, int id)
>  {
> @@ -927,6 +954,10 @@ tree
>  altivec_resolve_overloaded_builtin (location_t loc, tree fndecl,
>   void *passed_arglist)
>  {
> +  if (new_builtins_are_live)
> +return altivec_resolve_new_overloaded_builtin (loc, fndecl,
> +passed_arglist);
> +
>vec<tree, va_gc> *arglist = static_cast<vec<tree, va_gc> *> 
> (passed_arglist);
>unsigned int nargs = vec_safe_length (arglist);
>enum rs6000_builtins fcode

ok

> @@ -1930,3 +1961,1060 @@ altivec_resolve_overloaded_builtin (location_t loc, 
> tree fndecl,
>  return error_mark_node;
>}
>  }
> +
> +/* Build a tree for a function call to an Altivec non-overloaded builtin.
> +   The overloaded builtin that matched the types and args is described
> +   by DESC.  The N arguments are given in ARGS, respectively.
> +
> +   Actually the only thing it does is calling fold_convert on ARGS, with
> +   a small exception for vec_{all,any}_{ge,le} predicates. */
> +
> +static tree
> +altivec_build_new_resolved_builtin (tree *args, int n, tree fntype,
> + tree ret_type,
> + rs6000_gen_builtins bif_id,
> + rs6000_gen_builtins ovld_id)
> +{
> +  tree argtypes = TYPE_ARG_TYPES (fntype);
> +  tree arg_type[MAX_OVLD_ARGS];
> +  tree fndecl = rs6000_builtin_decls_x[bif_id];
> +  tree call;
> +
> +  for (int i = 0; i < n; i++)
> +arg_type[i] = TREE_VALUE (argtypes), argtypes = TREE_CHAIN (argtypes);
> +
> +  /* The AltiVec overloading implementation is overall gross, but this
> + is particularly disgusting.  The vec_{all,any}_{ge,le} builtins
> + are completely different for floating-point vs. integer vector
> + types, because the former has vcmpgefp, but the latter should use
> + vcmpgtXX.
> +
> + In practice, the second and third arguments are swapped, and the
> + condition (LT vs. EQ, which is recognizable by bit 1 of the first
> + argument) is reversed.  Patch the arguments here before building
> + the resolved CALL_EXPR.  */
> +  if (n == 3
> +  && ovld_id == RS6000_OVLD_VEC_CMPGE_P
> +  && bif_id != 

Re: [PATCH, Fortran] Revert to non-multilib-specific ISO_Fortran_binding.h

2021-09-13 Thread Tobias Burnus

On 13.09.21 18:59, Sandra Loosemore wrote:

On 9/13/21 10:51 AM, Jakub Jelinek wrote:

On Mon, Sep 13, 2021 at 06:32:56PM +0200, Tobias Burnus wrote:

On 13.09.21 17:56, Gerald Pfeifer wrote:

This broke bootstrap on i586-unknown-freebsd11:

% egrep -r '#define.*LDBL_(MANT_DIG|MIN_EXP|MAX_EXP)' /usr/include/
/usr/include/x86/float.h:#define LDBL_MANT_DIG  64
/usr/include/x86/float.h:#define LDBL_MIN_EXP   (-16381)
/usr/include/x86/float.h:#define LDBL_MAX_EXP   16384

This looks like it matches existing Linux case already in place?


Can you run 'echo | cpp -E -g3|grep DBL' to (or in the build dir:
echo |
./gcc/cc1 -E -g3 -dD|grep DBL) to check what's the output?


Regarding FreeBSD: Does this output different values? – If yes, we know
what to do, otherwise – hmm.

[...]


Wouldn't it be better to use the __LDBL_* macros anyway and not rely on
float.h?  The header doesn't want to test what float.h tells about the
long double type, but what the compiler knows about it.

I originally wrote the code to use the internal GCC __LDBL_* macros as
you suggest, but Tobias complained that then the gfortran-provided .h
file could not be used to compile the C parts of the program with some
other C compiler.

For instance, clang does not seem to provide those - and in some cases,
it can be useful to mix gfortran code with code compiled by other
compilers (icc, clang, ...).

Maybe it needs to first check the internal macros and then look for
the float.h versions if it can't find them?


I think that makes sense. (Adding a comment that #include <float.h> is
for non-GCC compilers, only.)

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


Re: Regression with recent change

2021-09-13 Thread Aldy Hernandez via Gcc-patches
On 9/13/21 4:18 PM, Michael Matz wrote:
> Hello,
>
> On Mon, 13 Sep 2021, Jeff Law via Gcc-patches wrote:
>
>>> So it looks like there's some undefined behavior going on, even before
>>> my patch.  I'd like to get some feedback, because this is usually the
>>> type of problems I see in the presence of a smarter threader... things
>>> get shuffled around, problematic code gets isolated, and warning
>>> passes have an easier time (or sometimes harder time) diagnosing
>>> things.
>> The original issue was PRE hanging, so I'd lean towards keeping the test
>> as-is
>> and instead twiddling any warning flags we can to make the diagnostics go
>> away.
>
> Or use this changed test avoiding the issues that I see with -W -Wall on
> this testcase.  I've verified that it still hangs before r194358 and is
> fixed by that revision.
>
> Generally I think, our testsuite, even for ICEs or these kinds of hangs,
> should make an effort to try to write conforming code; if at all possible.
> Here it is possible.
>
> (I don't know if the new threader causes additional warnings, of course,
> but at least the problems with sequence points and uninitialized use of
> 'j' aren't necessary to reproduce the bug)
>
>
> Ciao,
> Michael.
>
> /* { dg-do compile } */
> /* { dg-additional-options "-fno-split-loops" } */
>
> typedef unsigned short uint16_t;
>
> uint16_t a, b;
>
> int *j_global;
> uint16_t f(void)
> {
>int c, **p;
>short d = 2, e = 4;
>
>for (;; b++)
>  {
>int *j = j_global, k = 0;
>
>for (; *j; j++)
>   {
> for(; c; c++)
>   for(; k < 1; k++)
> {
>   short *f = 
>
>   if(b)
> return *f;
> }
>   }
>
>if(!c)
>   d *= e;
>
>a = d;
>if ((a ? b = 0 : (**p ? : 1) != (d != 1 ? 1 : (b = 0))) != ((k ? a
: 0)
> < (a * (c = k
>   **p = 0;
>  }
> }
>

Thanks for getting rid of the noise here.

I've simplified the above to show what's going on in the warning on
nds32-elf:

int george, *global;
int stuff(), readme();

int
f (void)
{
   int store;

   for (;;)
 {
   int k = 0;

   while (global)
{
  for (; store; ++store)
{
  for (; k < 1; k++)
{
  if (readme())
return 0;
}
}
}

   store = k;
   if (george)
stuff();
 }
}

The -Waggressive-loop-optimizations pass is complaining because of an
undefined iteration in the for(;store;++store) loop.  But this looks
like it's getting confused by threader having isolated an undefined path.

At the warning, the IL looks like this on entry:

[local count: 55807730]:
   goto ; [100.00%]

[local count: 57254340]:
   # store_4 = PHI 
   global.0_25 = global;
   if (global.0_25 != 0B)
 goto ; [94.50%]
   else
 goto ; [5.50%]

...
...

   [local count: 54105352]:
   if (store_4 != 0)
 goto ; [99.64%]
   else
 goto ; [0.36%]

If global.0_25 was true on entry, the read from store_4 would be
undefined.  Presumably the warning pass is assuming this path always
gets executed.

This looks like a latent bug.  For that matter, the above snippet warns
with -fdisable-tree-thread2, even on x86-64 (and before my
patch).

Aldy


Re: [PATCH, Fortran] Revert to non-multilib-specific ISO_Fortran_binding.h

2021-09-13 Thread Sandra Loosemore

On 9/13/21 10:51 AM, Jakub Jelinek wrote:

On Mon, Sep 13, 2021 at 06:32:56PM +0200, Tobias Burnus wrote:

On 13.09.21 17:56, Gerald Pfeifer wrote:

This broke bootstrap on i586-unknown-freebsd11:

In file included from 
.../GCC-HEAD/libgfortran/runtime/ISO_Fortran_binding.c:30:
.../GCC-HEAD/libgfortran/ISO_Fortran_binding.h:255:2:
error: #error "Can't determine kind of long double"
255 | #error "Can't determine kind of long double"
|  ^

Does this work on i586-*-linux?


% egrep -r '#define.*LDBL_(MANT_DIG|MIN_EXP|MAX_EXP)' /usr/include/
/usr/include/x86/float.h:#define LDBL_MANT_DIG  64
/usr/include/x86/float.h:#define LDBL_MIN_EXP   (-16381)
/usr/include/x86/float.h:#define LDBL_MAX_EXP   16384

This looks like it matches existing Linux case already in place?


Can you run 'echo | cpp -E -g3|grep DBL' to (or in the build dir: echo |
./gcc/cc1 -E -g3 -dD|grep DBL) to check what's the output?

It might be that /usr/include/x86/float.h is not used; e.g. there is
$(gcc-src)/ginclude/float.h which undef's the LDBL_MAX_EXP to replace it
by a #define using __LDBL_MAX_EXP. Thus, if those are different from the
values under /usr/include, it might be the reason for the fail?

I think it works under Linux, at least the "x86-64 -m32"
libgfortran.{so,a}  build and the -m32 testsuite runs do work.


Wouldn't it be better to use the __LDBL_* macros anyway and not rely on
float.h?  The header doesn't want to test what float.h tells about the
long double type, but what the compiler knows about it.


I originally wrote the code to use the internal GCC __LDBL_* macros as 
you suggest, but Tobias complained that then the gfortran-provided .h 
file could not be used to compile the C parts of the program with some 
other C compiler.  (I guess there are people out in the real world who 
want to mash up clang-compiled C code with gfortran programs).  Maybe it 
needs to first check the internal macros and then look for the float.h 
versions if it can't find them?


-Sandra


Re: [PATCH][v2] Always default to DWARF2_DEBUG if not specified, warn about deprecated STABS

2021-09-13 Thread Koning, Paul via Gcc-patches



> On Sep 13, 2021, at 3:31 AM, Richard Biener  wrote:
> 
> This makes defaults.h choose DWARF2_DEBUG if PREFERRED_DEBUGGING_TYPE
> is not specified by the target and NO_DEBUG if DWARF is not supported.
> 
> It also makes us warn when STABS is enabled and removes the corresponding
> diagnostic from the Ada frontend.  The warnings are pruned from the
> testsuite output via prune_gcc_output.
> 
> This leaves the following targets without debug support:
> 
> pdp11-*-*   pdp11 is a.out, dwarf support is difficult

I'll admit that I don't know much about debug formats.  It is definitely the 
case that pdp11 output is a.out (it may be BSD 2.x style a.out -- which I think 
is somewhat different though it's been many years since I looked at that, and 
then only briefly).  I guess that constrains which debug formats can be used, 
but I don't know any details.

pdp11-elf was done as an experiment by someone else, in binutils.  I'll ask 
about the status of that.  If it's possible to deliver that, it would 
presumably enable DWARF support.  Is that all common code so it's a matter of 
enabling it, or would "DWARF machine details for pdp11" have to be defined?

paul
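
For readers unfamiliar with how such a default is picked, the selection rule quoted above can be sketched as a preprocessor fallback. This is a hedged illustration, not the contents of defaults.h: the `SKETCH_*` names and the enum are stand-ins modelled on the macro names mentioned in the patch description.

```c
/* Stand-in for the compiler's debug-format enumeration (illustrative).  */
enum sketch_debug_type
{
  SKETCH_NO_DEBUG,
  SKETCH_DBX_DEBUG,	/* STABS, the format being deprecated.  */
  SKETCH_DWARF2_DEBUG
};

/* Pretend this target supports DWARF.  An a.out-only target would leave
   this undefined and end up with SKETCH_NO_DEBUG below.  */
#define SKETCH_DWARF2_DEBUGGING_INFO 1

/* A target may define its preference itself; otherwise default to DWARF
   when it is supported and to no debug info when it is not.  */
#ifndef SKETCH_PREFERRED_DEBUGGING_TYPE
# ifdef SKETCH_DWARF2_DEBUGGING_INFO
#  define SKETCH_PREFERRED_DEBUGGING_TYPE SKETCH_DWARF2_DEBUG
# else
#  define SKETCH_PREFERRED_DEBUGGING_TYPE SKETCH_NO_DEBUG
# endif
#endif

enum sketch_debug_type
sketch_default_debug_type (void)
{
  return SKETCH_PREFERRED_DEBUGGING_TYPE;
}
```

Under this rule, a port only gets debug output by default if it either states a preference or supports DWARF, which is why the a.out targets listed in the patch end up without it.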




Re: [PATCH, Fortran] Revert to non-multilib-specific ISO_Fortran_binding.h

2021-09-13 Thread Jakub Jelinek via Gcc-patches
On Mon, Sep 13, 2021 at 06:32:56PM +0200, Tobias Burnus wrote:
> On 13.09.21 17:56, Gerald Pfeifer wrote:
> > This broke bootstrap on i586-unknown-freebsd11:
> > 
> >In file included from 
> > .../GCC-HEAD/libgfortran/runtime/ISO_Fortran_binding.c:30:
> >.../GCC-HEAD/libgfortran/ISO_Fortran_binding.h:255:2:
> >error: #error "Can't determine kind of long double"
> >255 | #error "Can't determine kind of long double"
> >|  ^
> > 
> > Does this work on i586-*-linux?
> > 
> > 
> > % egrep -r '#define.*LDBL_(MANT_DIG|MIN_EXP|MAX_EXP)' /usr/include/
> > /usr/include/x86/float.h:#define LDBL_MANT_DIG  64
> > /usr/include/x86/float.h:#define LDBL_MIN_EXP   (-16381)
> > /usr/include/x86/float.h:#define LDBL_MAX_EXP   16384
> > 
> > This looks like it matches existing Linux case already in place?
> 
> Can you run 'echo | cpp -E -g3|grep DBL' to (or in the build dir: echo |
> ./gcc/cc1 -E -g3 -dD|grep DBL) to check what's the output?
> 
> It might be that /usr/include/x86/float.h is not used; e.g. there is
> $(gcc-src)/ginclude/float.h which undef's the LDBL_MAX_EXP to replace it
> by a #define using __LDBL_MAX_EXP. Thus, if those are different from the
> values under /usr/include, it might be the reason for the fail?
> 
> I think it works under Linux, at least the "x86-64 -m32"
> libgfortran.{so,a}  build and the -m32 testsuite runs do work.

Wouldn't it be better to use the __LDBL_* macros anyway and not rely on
float.h?  The header doesn't want to test what float.h tells about the
long double type, but what the compiler knows about it.

Jakub



Re: [PATCH, Fortran] Revert to non-multilib-specific ISO_Fortran_binding.h

2021-09-13 Thread Tobias Burnus

Hi Gerald,

On 13.09.21 17:56, Gerald Pfeifer wrote:

This broke bootstrap on i586-unknown-freebsd11:

   In file included from 
.../GCC-HEAD/libgfortran/runtime/ISO_Fortran_binding.c:30:
   .../GCC-HEAD/libgfortran/ISO_Fortran_binding.h:255:2:
   error: #error "Can't determine kind of long double"
   255 | #error "Can't determine kind of long double"
   |  ^

Does this work on i586-*-linux?


% egrep -r '#define.*LDBL_(MANT_DIG|MIN_EXP|MAX_EXP)' /usr/include/
/usr/include/x86/float.h:#define LDBL_MANT_DIG  64
/usr/include/x86/float.h:#define LDBL_MIN_EXP   (-16381)
/usr/include/x86/float.h:#define LDBL_MAX_EXP   16384

This looks like it matches existing Linux case already in place?


Can you run 'echo | cpp -E -g3 | grep DBL' (or, in the build dir, 'echo |
./gcc/cc1 -E -g3 -dD | grep DBL') to check what the output is?

It might be that /usr/include/x86/float.h is not used; e.g. there is
$(gcc-src)/ginclude/float.h which undef's the LDBL_MAX_EXP to replace it
by a #define using __LDBL_MAX_EXP. Thus, if those are different from the
values under /usr/include, it might be the reason for the fail?

I think it works under Linux, at least the "x86-64 -m32"
libgfortran.{so,a}  build and the -m32 testsuite runs do work.
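
One way to test this hypothesis is to compare what float.h reports with what the compiler itself predefines. The sketch below is an assumption-laden illustration (the function name is hypothetical); it relies only on the standard LDBL_* macros and on the __LDBL_*__ macros that GCC predefines.

```c
#include <float.h>

/* Returns 1 if the LDBL_* values seen through <float.h> agree with the
   compiler's predefined __LDBL_*__ macros, 0 if they disagree, and -1
   if the compiler does not predefine those macros at all.  */
int
sketch_float_h_matches_compiler (void)
{
#if defined (__LDBL_MANT_DIG__) && defined (__LDBL_MIN_EXP__) \
    && defined (__LDBL_MAX_EXP__)
  return (LDBL_MANT_DIG == __LDBL_MANT_DIG__
	  && LDBL_MIN_EXP == __LDBL_MIN_EXP__
	  && LDBL_MAX_EXP == __LDBL_MAX_EXP__);
#else
  return -1;
#endif
}
```

A result of 0 on the failing target would point at a mismatch between the float.h actually picked up and the compiler's own view of long double.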

Tobias



Re: [PATCH, Fortran] Revert to non-multilib-specific ISO_Fortran_binding.h

2021-09-13 Thread Gerald Pfeifer
On Wed, 18 Aug 2021, Sandra Loosemore wrote:
> I realized last week that having multilib-specific versions of
> ISO_Fortran_binding.h (generated by running the compiler to ask what kinds it
> supports) was still broken outside of the test support; the directory where
> it's being installed isn't on GCC's normal search path. It seemed to me that
> it was better to try to find some other solution for this problem than to
> venture down what appears to be a rat hole.
> 
> I've come up with this patch to return to a single ISO_Fortran_binding.h file
> that uses preprocessor magic to identify the Fortran kind corresponding to the
> standard C long double type and the GCC extension types __float128 and
> int128_t.
:
>  2021-08-18  Sandra Loosemore  
>
>   libgfortran/
>   * ISO_Fortran_binding-1-tmpl.h: Deleted.
>   * ISO_Fortran_binding-2-tmpl.h: Deleted.
>   * ISO_Fortran_binding-3-tmpl.h: Deleted.
>   * ISO_Fortran_binding.h: New file to replace the above.
>   * Makefile.am (gfor_cdir): Remove MULTISUBDIR.
>   (ISO_Fortran_binding.h): Simplify to just copy the file.
>   * Makefile.in: Regenerated.
>   * mk-kinds-h.sh: Revert pieces no longer needed for
>   ISO_Fortran_binding.h.

This broke bootstrap on i586-unknown-freebsd11:

  In file included from 
.../GCC-HEAD/libgfortran/runtime/ISO_Fortran_binding.c:30:
  .../GCC-HEAD/libgfortran/ISO_Fortran_binding.h:255:2: 
  error: #error "Can't determine kind of long double"
  255 | #error "Can't determine kind of long double"
  |  ^

Does this work on i586-*-linux?


% egrep -r '#define.*LDBL_(MANT_DIG|MIN_EXP|MAX_EXP)' /usr/include/
/usr/include/x86/float.h:#define LDBL_MANT_DIG  64
/usr/include/x86/float.h:#define LDBL_MIN_EXP   (-16381)
/usr/include/x86/float.h:#define LDBL_MAX_EXP   16384

This looks like it matches existing Linux case already in place?


Hmm, I wonder whether this may be related to the bootstrap compiler,
which is clang 10.0.1 on FreeBSD 11 and 12.  Apparently not, since 
the same issue occurs even with CC and CXX set to recent GCC builds.

(Note this happens after stage 3, so in hindsight not too surprising
that it's independent of the bootstrap compiler.)

Gerald


Re: [PATCH][v2] Always default to DWARF2_DEBUG if not specified, warn about deprecated STABS

2021-09-13 Thread Jeff Law via Gcc-patches




On 9/13/2021 9:44 AM, John David Anglin wrote:

On 2021-09-13 11:05 a.m., Jeff Law wrote:


On 9/13/2021 8:58 AM, John David Anglin wrote:

On 2021-09-13 9:53 a.m., Jeff Law wrote:

It is in fact also hpux11*, thus all 32bit pa configs that do not support
DWARF (for whatever reasons).

We used embedded stabs for SOM (the native format for 32bit PA). SOM is a 
variant of COFF and could easily support dwarf I would think since
it had support for fairly arbitrary sections.  Hell, it was already supporting 
embedded stabs as well as HP's proprietary debugging format.

But I'd consider 32bit SOM on hpux11 dead too :-)

I don't disagree but 32bit SOM still builds on hpux11:
https://gcc.gnu.org/pipermail/gcc-testresults/2021-August/718130.html

Suspect the change will cause a lot of warnings.

It might, but with stabs going away something needs to be done with these 
legacy systems.  Either they need to move into the modern world,
deal with the diagnostic  or get dropped.

I believe the 32-bit SOM target should be deprecated.  I'm the only one 
maintaining it and I had some health issues earlier this year.
The current versions should suffice for several years.

Seems quite reasonable.



My main interest is the Debian parisc-linux target.  It's fully up to date and 
thousands of packages are available.  Most kernels are 64-bit.
Since there's no 64-bit runtime for Linux, we still need the 64-bit hpux target 
for 64-bit compile testing.
Agreed.  Given that the 32bit linux and 64bit hpux targets both use ELF 
+ dwarf, they're not in danger of significant fallout from the stabs 
removal effort.



DWARF isn't supported because we lack named sections.  That could be worked 
around
but probably the gdb versions that work on 32-bit hpux11 wouldn't support DWARF.

I'd be a bit surprised if that were true.  dwarf support has been around a long 
long time in GDB.  Hell, it was around when I did the original
64bit PA work back in the 90s.

There's a chance it might work with the right section names.  However dwarf 5 
wouldn't be supported.  That's an issue that I noticed recently.
Yea, without a modern gdb, 32bit SOM would be stuck back in the dwarf2 
era.  But even that's better than embedded stabs.


jeff


Re: [PATCH][v2] Always default to DWARF2_DEBUG if not specified, warn about deprecated STABS

2021-09-13 Thread John David Anglin
On 2021-09-13 11:05 a.m., Jeff Law wrote:
>
>
> On 9/13/2021 8:58 AM, John David Anglin wrote:
>> On 2021-09-13 9:53 a.m., Jeff Law wrote:
>>>> It is in fact also hpux11*, thus all 32bit pa configs that do not support
>>>> DWARF (for whatever reasons).
>>> We used embedded stabs for SOM (the native format for 32bit PA). SOM is a 
>>> variant of COFF and could easily support dwarf I would think since
>>> it had support for fairly arbitrary sections.  Hell, it was already 
>>> supporting embedded stabs as well as HP's proprietary debugging format.
>>>
>>> But I'd consider 32bit SOM on hpux11 dead too :-)
>> I don't disagree but 32bit SOM still builds on hpux11:
>> https://gcc.gnu.org/pipermail/gcc-testresults/2021-August/718130.html
>>
>> Suspect the change will cause a lot of warnings.
> It might, but with stabs going away something needs to be done with these 
> legacy systems.  Either they need to move into the modern world,
> deal with the diagnostic  or get dropped.
I believe the 32-bit SOM target should be deprecated.  I'm the only one 
maintaining it and I had some health issues earlier this year.
The current versions should suffice for several years.

My main interest is the Debian parisc-linux target.  It's fully up to date and 
thousands of packages are available.  Most kernels are 64-bit.
Since there's no 64-bit runtime for Linux, we still need the 64-bit hpux target 
for 64-bit compile testing.
>
>>
>> There is some support for hpux10/11 in qemu but it takes a lot of work to 
>> provide the build infrastructure needed for gcc.
> I would think so.
Recently had to move my build infrastructure to a "new" machine, so I'm fully 
aware that it's not easy.
>
>>
>> DWARF isn't supported because we lack named sections.  That could be worked 
>> around
>> but probably the gdb versions that work on 32-bit hpux11 wouldn't support 
>> DWARF.
> I'd be a bit surprised if that were true.  dwarf support has been around a 
> long long time in GDB.  Hell, it was around when I did the original
> 64bit PA work back in the 90s.
There's a chance it might work with the right section names.  However, dwarf 5
wouldn't be supported.  That's an issue that I noticed recently.

Dave

-- 
John David Anglin  dave.ang...@bell.net




Re: [PATCH] Relax condition of (vec_concat:M(vec_select op0 idx0)(vec_select op0 idx1)) to allow different modes between op0 and M, but have same inner mode.

2021-09-13 Thread Hongtao Liu via Gcc-patches
On Mon, Sep 13, 2021 at 10:10 PM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 9/9/2021 10:36 PM, liuhongt via Gcc-patches wrote:
> > >Currently, for (vec_concat:M (vec_select op0 idx1)(vec_select op0 idx2)),
> > > the optimizer won't simplify if op0's mode differs from M, but that's too
> > > restrictive and prevents the optimization below; the condition can be relaxed
> > > to require only that op0 has the same inner mode as M.
> >
> > (set (reg:V2DF 87 [ xx ])
> >  (vec_concat:V2DF (vec_select:DF (reg:V4DF 92)
> >  (parallel [
> >  (const_int 2 [0x2])
> >  ]))
> >  (vec_select:DF (reg:V4DF 92)
> >  (parallel [
> >  (const_int 3 [0x3])
> >  ]
> >
> > >Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> >Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> >   * simplify-rtx.c
> >   (simplify_context::simplify_binary_operation_1): Relax
> >   condition of simplifying (vec_concat:M (vec_select op0
> >   index0)(vec_select op1 index1)) to allow different modes
> >   between op0 and M, but have same inner mode.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.target/i386/vect-rebuild.c:
> >   * gcc.target/i386/avx512f-vect-rebuild.c: New test.
> Funny, I was looking at something rather similar recently, but never
> pushed on it because we were going to need too many entries in the
> parallel selector.
>
> I'm not convinced that we need the inner mode to match anything.  As
> long as the vec_concat's mode is twice the size of the vec_select modes
> and the vec_select mode is <= the mode of its operands ISTM this is
> fine.   We  might want the modes of the vec_select to match, but I don't
> think that's strictly necessary either, they just need to be the same
> size.  i.e., we could have something like
What if they're different sizes, i.e., should something like the below also be legal?
(vec_concat:V8SF (vec_select:V2SF (reg:V16SF)) (vec_select:V6SF (reg:V16SF)))
>
> (vec_concat:V2DF (vec_select:DF (reg:V4DF)) (vec_select:DI (reg:V4DI)))
>
> I'm not sure if that level of generality is useful though.  If we want
> the modes of the vec_selects to match I think we could still support
>
> (vec_concat:V2DF (vec_select:DF (reg:V4DF)) (vec_select:DF (reg:V8DF)))
>
> Thoughts?
>
> jeff
>
> Jeff
>
>
> > ---
> >   gcc/simplify-rtx.c|  3 ++-
> >   .../gcc.target/i386/avx512f-vect-rebuild.c| 21 +++
> >   gcc/testsuite/gcc.target/i386/vect-rebuild.c  |  2 +-
> >   3 files changed, 24 insertions(+), 2 deletions(-)
> >   create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
> >
> > diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
> > index ebad5cb5a79..16286befd79 100644
> > --- a/gcc/simplify-rtx.c
> > +++ b/gcc/simplify-rtx.c
> > @@ -4587,7 +4587,8 @@ simplify_context::simplify_binary_operation_1 
> > (rtx_code code,
> >   if (GET_CODE (trueop0) == VEC_SELECT
> >   && GET_CODE (trueop1) == VEC_SELECT
> >   && rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop1, 0))
> > - && GET_MODE (XEXP (trueop0, 0)) == mode)
> > + && GET_MODE_INNER (GET_MODE (XEXP (trueop0, 0)))
> > +== GET_MODE_INNER(mode))
> > {
> >   rtx par0 = XEXP (trueop0, 1);
> >   rtx par1 = XEXP (trueop1, 1);
> > diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c 
> > b/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
> > new file mode 100644
> > index 000..aef6855aa46
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
> > @@ -0,0 +1,21 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -mavx512vl -mavx512dq -fno-tree-forwprop" } */
> > +
> > +typedef double v2df __attribute__ ((__vector_size__ (16)));
> > +typedef double v4df __attribute__ ((__vector_size__ (32)));
> > +
> > +v2df h (v4df x)
> > +{
> > +  v2df xx = { x[2], x[3] };
> > +  return xx;
> > +}
> > +
> > +v4df f2 (v4df x)
> > +{
> > +  v4df xx = { x[0], x[1], x[2], x[3] };
> > +  return xx;
> > +}
> > +
> > +/* { dg-final { scan-assembler-not "unpck" } } */
> > +/* { dg-final { scan-assembler-not "valign" } } */
> > +/* { dg-final { scan-assembler-times "\tv?extract(?:f128|f64x2)\[ \t\]" 1 
> > } } */
> > diff --git a/gcc/testsuite/gcc.target/i386/vect-rebuild.c 
> > b/gcc/testsuite/gcc.target/i386/vect-rebuild.c
> > index 570967f6b5c..8e85b98bf1d 100644
> > --- a/gcc/testsuite/gcc.target/i386/vect-rebuild.c
> > +++ b/gcc/testsuite/gcc.target/i386/vect-rebuild.c
> > @@ -30,4 +30,4 @@ v2df h (v4df x)
> >
> >   /* { dg-final { scan-assembler-not "unpck" } } */
> >   /* { dg-final { scan-assembler-times "\tv?permilpd\[ \t\]" 1 } } */
> > -/* { dg-final { scan-assembler-times "\tv?extractf128\[ \t\]" 1 } } */
> > +/* { dg-final { scan-assembler-times "\tv?extract(?:f128|f64x2)\[ \t\]" 1 
> > } } */
>
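
To make the difference between the old and the relaxed check concrete, here
is a small standalone sketch.  It is not GCC code: struct vec_mode and both
predicate names are hypothetical stand-ins for machine_mode, GET_MODE and
GET_MODE_INNER, and it models only the single condition changed in the
simplify-rtx.c hunk above, not the rest of the simplification.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a GCC vector machine_mode.  */
struct vec_mode
{
  char inner_class;     /* 'F' for float elements, 'I' for integer.  */
  unsigned inner_bits;  /* Width of one element in bits.  */
  unsigned nunits;      /* Number of elements.  */
};

/* Models GET_MODE_INNER (a) == GET_MODE_INNER (b).  */
static bool
same_inner_mode_p (struct vec_mode a, struct vec_mode b)
{
  return a.inner_class == b.inner_class && a.inner_bits == b.inner_bits;
}

/* Old condition: op0 must have exactly the mode of the vec_concat.  */
bool
old_condition_p (struct vec_mode op0, struct vec_mode concat)
{
  assert (op0.nunits > 0 && concat.nunits > 0);
  return same_inner_mode_p (op0, concat) && op0.nunits == concat.nunits;
}

/* Relaxed condition from the patch: only the inner modes must match.  */
bool
relaxed_condition_p (struct vec_mode op0, struct vec_mode concat)
{
  assert (op0.nunits > 0 && concat.nunits > 0);
  return same_inner_mode_p (op0, concat);
}
```

For the RTL quoted in the patch description (op0 in V4DF, vec_concat in
V2DF) the old predicate rejects the pattern and the relaxed one accepts it.
A V4DI source feeding a V2DF vec_concat is still rejected by both.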


-- 
BR,
Hongtao


Re: [PATCH] Relax condition of (vec_concat:M(vec_select op0 idx0)(vec_select op0 idx1)) to allow different modes between op0 and M, but have same inner mode.

2021-09-13 Thread Hongtao Liu via Gcc-patches
On Mon, Sep 13, 2021 at 10:10 PM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 9/9/2021 10:36 PM, liuhongt via Gcc-patches wrote:
> > >Currently, for (vec_concat:M (vec_select op0 idx1)(vec_select op0 idx2)),
> > > the optimizer won't simplify if op0's mode differs from M, but that's too
> > > restrictive and prevents the optimization below; the condition can be relaxed
> > > to require only that op0 has the same inner mode as M.
> >
> > (set (reg:V2DF 87 [ xx ])
> >  (vec_concat:V2DF (vec_select:DF (reg:V4DF 92)
> >  (parallel [
> >  (const_int 2 [0x2])
> >  ]))
> >  (vec_select:DF (reg:V4DF 92)
> >  (parallel [
> >  (const_int 3 [0x3])
> >  ]
> >
> > >Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> >Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> >   * simplify-rtx.c
> >   (simplify_context::simplify_binary_operation_1): Relax
> >   condition of simplifying (vec_concat:M (vec_select op0
> >   index0)(vec_select op1 index1)) to allow different modes
> >   between op0 and M, but have same inner mode.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.target/i386/vect-rebuild.c:
> >   * gcc.target/i386/avx512f-vect-rebuild.c: New test.
> Funny, I was looking at something rather similar recently, but never
> pushed on it because we were going to need too many entries in the
> parallel selector.
>
> I'm not convinced that we need the inner mode to match anything.  As
> long as the vec_concat's mode is twice the size of the vec_select modes
> and the vec_select mode is <= the mode of its operands ISTM this is
> fine.   We  might want the modes of the vec_select to match, but I don't
> think that's strictly necessary either, they just need to be the same
> size.  i.e., we could have something like
>
> (vec_concat:V2DF (vec_select:DF (reg:V4DF)) (vec_select:DI (reg:V4DI)))
>
> I'm not sure if that level of generality is useful though.  If we want
> the modes of the vec_selects to match I think we could still support
>
> (vec_concat:V2DF (vec_select:DF (reg:V4DF)) (vec_select:DF (reg:V8DF)))
The first operand of the vec_selects is required to be the same here, and I
guess (vec_select:DF (subreg:V4DF (reg:V8DF) 0)) will be simplified to
(vec_select:DF (reg:V8DF))?
>
> Thoughts?
>
> jeff
>
> Jeff
>
>
> > ---
> >   gcc/simplify-rtx.c|  3 ++-
> >   .../gcc.target/i386/avx512f-vect-rebuild.c| 21 +++
> >   gcc/testsuite/gcc.target/i386/vect-rebuild.c  |  2 +-
> >   3 files changed, 24 insertions(+), 2 deletions(-)
> >   create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
> >
> > diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
> > index ebad5cb5a79..16286befd79 100644
> > --- a/gcc/simplify-rtx.c
> > +++ b/gcc/simplify-rtx.c
> > @@ -4587,7 +4587,8 @@ simplify_context::simplify_binary_operation_1 
> > (rtx_code code,
> >   if (GET_CODE (trueop0) == VEC_SELECT
> >   && GET_CODE (trueop1) == VEC_SELECT
> >   && rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop1, 0))
> > - && GET_MODE (XEXP (trueop0, 0)) == mode)
> > + && GET_MODE_INNER (GET_MODE (XEXP (trueop0, 0)))
> > +== GET_MODE_INNER(mode))
> > {
> >   rtx par0 = XEXP (trueop0, 1);
> >   rtx par1 = XEXP (trueop1, 1);
> > diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c 
> > b/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
> > new file mode 100644
> > index 000..aef6855aa46
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
> > @@ -0,0 +1,21 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -mavx512vl -mavx512dq -fno-tree-forwprop" } */
> > +
> > +typedef double v2df __attribute__ ((__vector_size__ (16)));
> > +typedef double v4df __attribute__ ((__vector_size__ (32)));
> > +
> > +v2df h (v4df x)
> > +{
> > +  v2df xx = { x[2], x[3] };
> > +  return xx;
> > +}
> > +
> > +v4df f2 (v4df x)
> > +{
> > +  v4df xx = { x[0], x[1], x[2], x[3] };
> > +  return xx;
> > +}
> > +
> > +/* { dg-final { scan-assembler-not "unpck" } } */
> > +/* { dg-final { scan-assembler-not "valign" } } */
> > +/* { dg-final { scan-assembler-times "\tv?extract(?:f128|f64x2)\[ \t\]" 1 
> > } } */
> > diff --git a/gcc/testsuite/gcc.target/i386/vect-rebuild.c 
> > b/gcc/testsuite/gcc.target/i386/vect-rebuild.c
> > index 570967f6b5c..8e85b98bf1d 100644
> > --- a/gcc/testsuite/gcc.target/i386/vect-rebuild.c
> > +++ b/gcc/testsuite/gcc.target/i386/vect-rebuild.c
> > @@ -30,4 +30,4 @@ v2df h (v4df x)
> >
> >   /* { dg-final { scan-assembler-not "unpck" } } */
> >   /* { dg-final { scan-assembler-times "\tv?permilpd\[ \t\]" 1 } } */
> > -/* { dg-final { scan-assembler-times "\tv?extractf128\[ \t\]" 1 } } */
> > +/* { dg-final { scan-assembler-times "\tv?extract(?:f128|f64x2)\[ \t\]" 1 
> > } } */
>


-- 
BR,
Hongtao


Re: [PATCH][v2] Always default to DWARF2_DEBUG if not specified, warn about deprecated STABS

2021-09-13 Thread Jeff Law via Gcc-patches




On 9/13/2021 8:58 AM, John David Anglin wrote:

On 2021-09-13 9:53 a.m., Jeff Law wrote:

It is in fact also hpux11*, thus all 32bit pa configs that do not support
DWARF (for whatever reasons).

We used embedded stabs for SOM (the native format for 32bit PA). SOM is a 
variant of COFF and could easily support dwarf I would think since
it had support for fairly arbitrary sections.  Hell, it was already supporting 
embedded stabs as well as HP's proprietary debugging format.

But I'd consider 32bit SOM on hpux11 dead too :-)

I don't disagree but 32bit SOM still builds on hpux11:
https://gcc.gnu.org/pipermail/gcc-testresults/2021-August/718130.html

Suspect the change will cause a lot of warnings.
It might, but with stabs going away something needs to be done with 
these legacy systems.  Either they need to move into the modern world, 
deal with the diagnostic  or get dropped.




There is some support for hpux10/11 in qemu but it takes a lot of work to 
provide the build infrastructure needed for gcc.

I would think so.



DWARF isn't supported because we lack named sections.  That could be worked 
around
but probably the gdb versions that work on 32-bit hpux11 wouldn't support DWARF.
I'd be a bit surprised if that were true.  dwarf support has been around 
a long long time in GDB.  Hell, it was around when I did the original 
64bit PA work back in the 90s.


Jeff


Re: [PATCH][v2] Always default to DWARF2_DEBUG if not specified, warn about deprecated STABS

2021-09-13 Thread John David Anglin
On 2021-09-13 9:53 a.m., Jeff Law wrote:
>> It is in fact also hpux11*, thus all 32bit pa configs that do not support
>> DWARF (for whatever reasons).
> We used embedded stabs for SOM (the native format for 32bit PA). SOM is a 
> variant of COFF and could easily support dwarf I would think since
> it had support for fairly arbitrary sections.  Hell, it was already 
> supporting embedded stabs as well as HP's proprietary debugging format.
>
> But I'd consider 32bit SOM on hpux11 dead too :-)
I don't disagree but 32bit SOM still builds on hpux11:
https://gcc.gnu.org/pipermail/gcc-testresults/2021-August/718130.html

Suspect the change will cause a lot of warnings.

There is some support for hpux10/11 in qemu but it takes a lot of work to 
provide the build infrastructure
needed for gcc.

DWARF isn't supported because we lack named sections.  That could be worked 
around
but probably the gdb versions that work on 32-bit hpux11 wouldn't support DWARF.

Dave

-- 
John David Anglin  dave.ang...@bell.net




Re: openmp: Implement OpenMP 5.1 atomics, so far for C only

2021-09-13 Thread Christophe Lyon via Gcc-patches
On Mon, Sep 13, 2021 at 4:40 PM Jakub Jelinek  wrote:

> On Mon, Sep 13, 2021 at 01:57:52PM +0200, Christophe Lyon wrote:
> > > --- gcc/testsuite/c-c++-common/gomp/atomic-29.c.jj  2021-09-10
> > > 11:47:17.093164041 +0200
> > > +++ gcc/testsuite/c-c++-common/gomp/atomic-29.c 2021-09-10
> > > 11:52:33.428722747 +0200
> > > @@ -0,0 +1,43 @@
> > > +/* { dg-do compile { target c } } */
> > > +/* { dg-additional-options "-O2 -fdump-tree-ompexp" } */
> > > +/* { dg-final { scan-tree-dump-times "\.ATOMIC_COMPARE_EXCHANGE
> > > \\\(\[^\n\r]*, 8, 5, 5\\\);" 1 "ompexp" { target sync_int_long } } } */
> > > +/* { dg-final { scan-tree-dump-times "\.ATOMIC_COMPARE_EXCHANGE
> > > \\\(\[^\n\r]*, 8, 4, 2\\\);" 1 "ompexp" { target sync_int_long } } } */
> > > +/* { dg-final { scan-tree-dump-times "\.ATOMIC_COMPARE_EXCHANGE
> > > \\\(\[^\n\r]*, 264, 5, 0\\\);" 1 "ompexp" { target sync_int_long } } }
> */
> > > +/* { dg-final { scan-tree-dump-times "\.ATOMIC_COMPARE_EXCHANGE
> > > \\\(\[^\n\r]*, 8, 0, 0\\\);" 1 "ompexp" { target sync_int_long } } } */
> > > +/* { dg-final { scan-tree-dump-not "__atomic_load_8 \\\(" "ompexp" {
> > > target sync_int_long } } } */
> > > +
> > >
> >
> > This test fails on arm*linux when forcing old CPU/ARCH (eg
> -march=armv5t).
> > Not sure how easy it is to fix?
> > sync_int_long returns true for arm*-*-linux-*,
> > but for other arm targets, it depends on the result
> > of check_effective_target_arm_acq_rel.
> >
> > Is it just a matter of removing arm*-*-linux-* from sync_int_long and
> > always rely on arm_acq_rel?
>
> So, atomic-28.c passes and just atomic-29.c fails?
>
Yes.


> atomic-29.c tests for double atomics aka 64-bit, so probably I should use
> sync_long_long effective target instead of sync_int_long.
> But it is unclear how that would help for arm, because sync_long_long is
> enabled for all arm targets...
>
Indeed :-(


> There is also sync_long_long_runtime effective target, but this is a
> compile
> test, so it would be weird to rely on that when it ought to test the
> runtime
> behavior of that.
>
Agreed



>
> Jakub
>
>


Re: [RFC] ldist: Recognize rawmemchr loop patterns

2021-09-13 Thread Stefan Schulze Frielinghaus via Gcc-patches
On Mon, Sep 06, 2021 at 11:56:21AM +0200, Richard Biener wrote:
> On Fri, Sep 3, 2021 at 10:01 AM Stefan Schulze Frielinghaus
>  wrote:
> >
> > On Fri, Aug 20, 2021 at 12:35:58PM +0200, Richard Biener wrote:
> > [...]
> > > > >
> > > > > +  /* Handle strlen like loops.  */
> > > > > +  if (store_dr == NULL
> > > > > +  && integer_zerop (pattern)
> > > > > +  && TREE_CODE (reduction_iv.base) == INTEGER_CST
> > > > > +  && TREE_CODE (reduction_iv.step) == INTEGER_CST
> > > > > +  && integer_onep (reduction_iv.step)
> > > > > +  && (types_compatible_p (TREE_TYPE (reduction_var), 
> > > > > size_type_node)
> > > > > > + || TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (reduction_var))))
> > > > > +{
> > > > >
> > > > > I wonder what goes wrong with a larger or smaller wrapping IV type?
> > > > > The iteration
> > > > > only stops when you load a NUL and the increments just wrap along 
> > > > > (you're
> > > > > using the pointer IVs to compute the strlen result).  Can't you 
> > > > > simply truncate?
> > > >
> > > > I think truncation is enough as long as no overflow occurs in strlen or
> > > > strlen_using_rawmemchr.
> > > >
> > > > > For larger than size_type_node (actually larger than ptr_type_node 
> > > > > would matter
> > > > > I guess), the argument is that since pointer wrapping would be 
> > > > > undefined anyway
> > > > > the IV cannot wrap either.  Now, the correct check here would IMHO be
> > > > >
> > > > >   TYPE_PRECISION (TREE_TYPE (reduction_var)) < TYPE_PRECISION
> > > > > (ptr_type_node)
> > > > >|| TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (pointer-iv-var))
> > > > >
> > > > > ?
> > > >
> > > > Regarding the implementation which makes use of rawmemchr:
> > > >
> > > > We can count at most PTRDIFF_MAX many bytes without an overflow.  Thus,
> > > > the maximal length we can determine of a string where each character has
> > > > size S is PTRDIFF_MAX / S without an overflow.  Since an overflow for
> > > > ptrdiff type is undefined we have to make sure that if an overflow
> > > > occurs, then an overflow occurs for reduction variable, too, and that
> > > > this is undefined, too.  However, I'm not sure anymore whether we want
> > > > to respect overflows in all cases.  If TYPE_PRECISION (ptr_type_node)
> > > > equals TYPE_PRECISION (ptrdiff_type_node) and an overflow occurs, then
> > > > this would mean that a single string consumes more than half of the
> > > > virtual addressable memory.  At least for architectures where
> > > > TYPE_PRECISION (ptrdiff_type_node) == 64 holds, I think it is reasonable
> > > > to neglect the case where computing pointer difference may overflow.
> > > > > Otherwise we are talking about strings with lengths of multiple
> > > > pebibytes.  For other architectures we might have to be more precise
> > > > and make sure that reduction variable overflows first and that this is
> > > > undefined.
> > > >
> > > > > Thus a conservative condition would be (I assumed that the size of any
> > > > > integral type is a power of two, though I'm not sure this really holds;
> > > > > IIRC the C standard requires only that the alignment is a power of two
> > > > > but not necessarily the size, so I might need to change this):
> > > >
> > > > /* Compute precision (reduction_var) < (precision (ptrdiff_type) - 1 - 
> > > > log2 (sizeof (load_type))
> > > >or in other words return true if reduction variable overflows first
> > > >and false otherwise.  */
> > > >
> > > > static bool
> > > > reduction_var_overflows_first (tree reduction_var, tree load_type)
> > > > {
> > > >   unsigned precision_ptrdiff = TYPE_PRECISION (ptrdiff_type_node);
> > > >   unsigned precision_reduction_var = TYPE_PRECISION (TREE_TYPE 
> > > > (reduction_var));
> > > >   unsigned size_exponent = wi::exact_log2 (wi::to_wide (TYPE_SIZE_UNIT 
> > > > (load_type)));
> > > >   return wi::ltu_p (precision_reduction_var, precision_ptrdiff - 1 - 
> > > > size_exponent);
> > > > }
> > > >
> > > > TYPE_PRECISION (ptrdiff_type_node) == 64
> > > > || (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (reduction_var))
> > > > > && reduction_var_overflows_first (reduction_var, load_type))
> > > >
> > > > Regarding the implementation which makes use of strlen:
> > > >
> > > > I'm not sure what it means if strlen is called for a string with a
> > > > length greater than SIZE_MAX.  Therefore, similar to the implementation
> > > > using rawmemchr where we neglect the case of an overflow for 64bit
> > > > architectures, a conservative condition would be:
> > > >
> > > > TYPE_PRECISION (size_type_node) == 64
> > > > || (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (reduction_var))
> > > > && TYPE_PRECISION (reduction_var) <= TYPE_PRECISION 
> > > > (size_type_node))
> > > >
> > > > I still included the overflow undefined check for reduction variable in
> > > > order to rule out situations where the reduction variable is unsigned
> > > > and overflows as many times until strlen(,_using_rawmemchr) overflows,
> > > > too.  
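
The precision comparison in reduction_var_overflows_first above can be tried
outside the compiler.  The sketch below is a plain-C restatement under stated
assumptions: the tree and wide-int machinery is replaced by ordinary unsigned
integers, exact_log2_u is a hypothetical replacement for wi::exact_log2, and
returning false for a non-power-of-two element size is a conservative choice
made here, not something the quoted code does.

```c
#include <assert.h>
#include <stdbool.h>

/* Return log2 (x) if x is a power of two, and -1 otherwise.  */
int
exact_log2_u (unsigned long long x)
{
  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  int n = 0;
  while (x > 1)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Return true if
     prec_reduction_var < prec_ptrdiff - 1 - log2 (elem_size),
   i.e. if the reduction variable would overflow before the pointer
   difference does.  Precisions are in bits, elem_size in bytes.  */
bool
reduction_var_overflows_first (unsigned prec_reduction_var,
                               unsigned prec_ptrdiff,
                               unsigned long long elem_size)
{
  int size_exponent = exact_log2_u (elem_size);

  assert (prec_ptrdiff > 0);
  if (size_exponent < 0)
    return false;  /* Not a power of two: answer conservatively.  */
  if (prec_ptrdiff < 1u + (unsigned) size_exponent)
    return false;  /* Avoid unsigned wrap-around in the subtraction.  */
  return prec_reduction_var < prec_ptrdiff - 1 - (unsigned) size_exponent;
}
```

For example, a 32-bit reduction variable counting single bytes against a
64-bit ptrdiff_t satisfies the condition, while a 64-bit reduction variable
does not.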

Re: [PATCH] c++: parameter pack inside constexpr if [PR101764]

2021-09-13 Thread Patrick Palka via Gcc-patches
On Sun, Sep 12, 2021 at 10:29 PM Jason Merrill  wrote:
>
> On 9/12/21 7:48 PM, Patrick Palka wrote:
> > On Thu, 2 Sep 2021, Jason Merrill wrote:
> >
> >> On 8/30/21 10:05 PM, Patrick Palka wrote:
> >>> Here when partially substituting into the pack expansion, substitution
> >>> into the constexpr if yields a still-dependent tree, so tsubst_expr
> >>> returns an IF_STMT with an unsubstituted IF_COND and with
> >>> IF_STMT_EXTRA_ARGS added to.  Hence after partial substitution
> >>> the pack expansion pattern still refers to the parameter pack 'ts' of
> >>> level 2 (and it's thus represented in the new
> >>> PACK_EXPANSION_PARAMETER_PACKS)
> >>> even though the partially instantiated generic lambda admits only one
> >>> level of template arguments.
> >>
> >>> This causes us to crash during the
> >>> subsequent instantiation with the lambda's template arguments because of
> >>> the level mismatch.  (Likewise when the constexpr if is replaced by a
> >>> requires-expr, which too uses the extra args mechanism for delaying
> >>> partial instantiation.)
> >>
> >>> So essentially, a pack expansion pattern that contains a parameter pack
> >>> inside an "extra args" tree doesn't play well with partial substitution
> >>> thereof.  This patch fixes this by forcing such pack expansions to use
> >>> the extra args mechanism as well.
> >>
> >> Why is this specific to parameter packs?  Won't non-pack template 
> >> parameters
> >> also suffer from the level mismatch?
> >
> > IIUC it's specific to parameter packs because each parameter pack in the
> > pattern is also recorded in PACK_EXPANSION_PARAMETER_PACKS, which
> > tsubst_pack_expansion looks at to extract all argument packs from 'args'
> > that are relevant to the pattern.
> >
> > I should clarify it's during the loop over PACK_EXPANSION_PARAMETER_PACKS
> > that we crash, because we fail to find an argument pack for 'ts' (which
> > still has the unlowered level 2), and we trip over the assert:
> >
> >   {
> > /* We can't substitute for this parameter pack.  We use a flag as
> >well as the missing_level counter because function parameter
> >packs don't have a level.  */
> > gcc_assert (processing_template_decl || is_auto (parm_pack));
> > unsubstituted_packs = true;
> >   }
> >
> > For non-pack template parameters (even those within extra args trees),
> > ordinary substitution is sufficient and does the right thing.
> >
> >> I'd think it would be simpler to just
> >> note when the pattern contains a constexpr if or requires-expression, for
> >> which we can't substitute into the pattern like a pack expansion, and know 
> >> we
> >> need to use extra args in that case.
> >
> > Sounds good.  We'd force the extra args mechanism more than is strictly
> > necessary, but IIUC that should be harmless.
>
> Agreed.
>
> >>> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> >>> trunk?
> >>>
> >>> PR c++/101764
> >>>
> >>> gcc/cp/ChangeLog:
> >>>
> >>> * cp-tree.h (PACK_EXPANSION_FORCE_EXTRA_ARGS_P): New accessor
> >>> macro.
> >>> * pt.c (uses_extra_args_mechanism_p): New function.
> >>> (find_parameter_pack_data::found_pack_within_extra_args_tree_p):
> >>> New data member.
> >>> (find_parameter_pack_data::inside_extra_args_tree_p): Likewise.
> >>> (find_parameter_packs_r): Detect parameter packs within "extra
> >>> args" trees and set found_pack_within_extra_args_tree_p
> >>> appropriately.
> >>> (make_pack_expansion): Set PACK_EXPANSION_FORCE_EXTRA_ARGS_P if
> >>> found_pack_within_extra_args_tree_p.
> >>> (use_pack_expansion_extra_args_p): Return true if there were
> >>> unsubstituted packs and PACK_EXPANSION_FORCE_EXTRA_ARGS_P.
> >>> (tsubst_pack_expansion): Pass the pack expansion to
> >>> use_pack_expansion_extra_args_p.
> >>>
> >>> gcc/testsuite/ChangeLog:
> >>>
> >>> * g++.dg/cpp1z/constexpr-if35.C: New test.
> >>> ---
> >>>gcc/cp/cp-tree.h|  5 ++
> >>>gcc/cp/pt.c | 69 -
> >>>gcc/testsuite/g++.dg/cpp1z/constexpr-if35.C | 18 ++
> >>>3 files changed, 90 insertions(+), 2 deletions(-)
> >>>create mode 100644 gcc/testsuite/g++.dg/cpp1z/constexpr-if35.C
> >>>
> >>> diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
> >>> index ce7ca53a113..06dec495428 100644
> >>> --- a/gcc/cp/cp-tree.h
> >>> +++ b/gcc/cp/cp-tree.h
> >>> @@ -493,6 +493,7 @@ extern GTY(()) tree cp_global_trees[CPTI_MAX];
> >>>  CONSTRUCTOR_C99_COMPOUND_LITERAL (in CONSTRUCTOR)
> >>>  OVL_NESTED_P (in OVERLOAD)
> >>>  DECL_MODULE_EXPORT_P (in _DECL)
> >>> +  PACK_EXPANSION_FORCE_EXTRA_ARGS_P (in *_PACK_EXPANSION)
> >>>   4: IDENTIFIER_MARKED (IDENTIFIER_NODEs)
> >>>  TREE_HAS_CONSTRUCTOR (in INDIRECT_REF, SAVE_EXPR, CONSTRUCTOR,
> >>>   CALL_EXPR, or FIELD_DECL).
> >>> @@ -3902,6 +3903,10 @@ struct GTY(()) lang_decl {
> >>>  

RE: [PATCH] aarch64: PR target/102252 Invalid addressing mode for SVE load predicate

2021-09-13 Thread Kyrylo Tkachov via Gcc-patches
Hi Richard,

> -Original Message-
> From: Richard Sandiford 
> Sent: 13 September 2021 12:09
> To: Kyrylo Tkachov 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] aarch64: PR target/102252 Invalid addressing mode for
> SVE load predicate
> 
> Kyrylo Tkachov  writes:
> > Hi all,
> >
> > In the testcase we generate invalid assembly for an SVE load predicate
> instruction.
> > The RTL for the insn is:
> > (insn 9 8 10 (set (reg:VNx16BI 68 p0)
> > (mem:VNx16BI (plus:DI (mult:DI (reg:DI 1 x1 [93])
> > (const_int 8 [0x8]))
> > (reg/f:DI 0 x0 [92])) [2 work_3(D)->array[offset_4(D)]+0 S8 
> > A16]))
> >
> > That addressing mode is not valid for the instruction [1] as it only accepts
> the addressing mode:
> > [{, #, MUL VL}]
> >
> > This patch rejects the register index form for SVE predicate modes.
> >
> > Bootstrapped and tested on aarch64-none-linux-gnu.
> >
> > Ok for trunk?
> > Thanks,
> > Kyrill
> >
> > [1] https://developer.arm.com/documentation/ddi0602/2021-06/SVE-
> Instructions/LDR--predicate---Load-predicate-register-
> >
> > gcc/ChangeLog:
> >
> > PR target/102252
> > * config/aarch64/aarch64.c (aarch64_classify_address): Don't allow
> > register index for SVE predicate modes.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR target/102252
> > * g++.target/aarch64/sve/pr102252.C: New test.
> >
> > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> > index
> e37922db0007e3b4b559cda65f135247f4fb1b9f..e6253edeb55cdcc3dbc7303
> e03bad26dd519c4b1 100644
> > --- a/gcc/config/aarch64/aarch64.c
> > +++ b/gcc/config/aarch64/aarch64.c
> > @@ -9770,7 +9770,7 @@ aarch64_classify_address (struct
> aarch64_address_info *info,
> > || mode == TImode
> > || mode == TFmode
> > || (BYTES_BIG_ENDIAN && advsimd_struct_p));
> > -
> > +  bool sve_pred_p = (vec_flags & VEC_SVE_PRED) != 0;
> >/* If we are dealing with ADDR_QUERY_LDP_STP_N that means the
> incoming mode
> >   corresponds to the actual size of the memory being loaded/stored and
> the
> >   mode of the corresponding addressing mode is half of that.  */
> > @@ -9779,12 +9779,14 @@ aarch64_classify_address (struct
> aarch64_address_info *info,
> >  mode = DFmode;
> >
> >bool allow_reg_index_p = (!load_store_pair_p
> > +   && !sve_pred_p
> > && (known_lt (GET_MODE_SIZE (mode), 16)
> > || vec_flags == VEC_ADVSIMD
> > || vec_flags & VEC_SVE_DATA));
> 
> I think the known_lt (GET_MODE_SIZE (mode), 16) is really there for
> non-vector cases, with the ||s enumerating the valid vector cases.
> So how about:
> 
>   bool allow_reg_index_p = (!load_store_pair_p
>   && ((vec_flags == 0
>&& known_lt (GET_MODE_SIZE (mode), 16))
>   || vec_flags == VEC_ADVSIMD
>   || vec_flags & VEC_SVE_DATA));
> 
> instead?  OK with that change from my POV.

Yeah, that works.
Thanks, here's what I've committed. I'll wait a bit before backporting to the 
branches.
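
As a rough illustration of why Richard's formulation needs no separate
sve_pred_p term, the two conditions can be compared in isolation.  This is a
sketch under assumptions: the flag values are illustrative rather than the
real definitions in aarch64.c, known_lt on a poly_int mode size is reduced to
a plain integer comparison, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values; the real definitions live in aarch64.c.  */
enum
{
  VEC_ADVSIMD = 1,
  VEC_SVE_DATA = 2,
  VEC_SVE_PRED = 4
};

/* Condition as it stood before the fix: any mode smaller than 16 bytes
   is allowed a register index, whatever its vector flags.  */
bool
allow_reg_index_before (bool load_store_pair_p, unsigned vec_flags,
                        unsigned mode_size)
{
  assert (mode_size > 0);
  return (!load_store_pair_p
          && (mode_size < 16
              || vec_flags == VEC_ADVSIMD
              || (vec_flags & VEC_SVE_DATA) != 0));
}

/* Richard's formulation: the size test applies to non-vector modes only,
   and the valid vector cases are listed explicitly.  */
bool
allow_reg_index_after (bool load_store_pair_p, unsigned vec_flags,
                       unsigned mode_size)
{
  assert (mode_size > 0);
  return (!load_store_pair_p
          && ((vec_flags == 0 && mode_size < 16)
              || vec_flags == VEC_ADVSIMD
              || (vec_flags & VEC_SVE_DATA) != 0));
}
```

An SVE predicate mode smaller than 16 bytes passes the first predicate only
because of the size test.  The second rejects it, since VEC_SVE_PRED is not
among the listed vector cases.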

Kyrill

> 
> Thanks,
> Richard
> 
> >
> > -  /* For SVE, only accept [Rn], [Rn, Rm, LSL #shift] and
> > - [Rn, #offset, MUL VL].  */
> > +  /* For SVE, only accept [Rn], [Rn, #offset, MUL VL] and [Rn, Rm, LSL
> #shift].
> > + The latter is not valid for SVE predicates, and that's rejected 
> > through
> > + allow_reg_index_p above.  */
> >if ((vec_flags & (VEC_SVE_DATA | VEC_SVE_PRED)) != 0
> >&& (code != REG && code != PLUS))
> >  return false;
> > diff --git a/gcc/testsuite/g++.target/aarch64/sve/pr102252.C
> b/gcc/testsuite/g++.target/aarch64/sve/pr102252.C
> > new file mode 100644
> > index
> ..f90f1218555f0dfdb0253fe
> 83c656ba03b1aac43
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.target/aarch64/sve/pr102252.C
> > @@ -0,0 +1,37 @@
> > +/* PR target/102252.  */
> > +/* { dg-do assemble { target aarch64_asm_sve_ok } } */
> > +/* { dg-options "-march=armv8.2-a+sve -msve-vector-bits=512" } */
> > +
> > +/* We used to generate invalid assembly for SVE predicate loads.  */
> > +
> > +#include <arm_sve.h>
> > +
> > +class SimdBool
> > +{
> > +private:
> > +typedef svbool_t simdInternalType_
> __attribute__((arm_sve_vector_bits(512)));
> > +
> > +public:
> > +SimdBool() {}
> > +
> > +simdInternalType_ simdInternal_;
> > +
> > +};
> > +
> > +static svfloat32_t selectByMask(svfloat32_t a, SimdBool m) {
> > +return svsel_f32(m.simdInternal_, a, svdup_f32(0.0));
> > +}
> > +
> > +struct s {
> > +SimdBool array[1];
> > +};
> > +
> > +
> > +
> > +void foo(struct s* const work, int offset)
> > +{
> > +svfloat32_t tz_S0;
> > +
> > +tz_S0 = selectByMask(tz_S0, work->array[offset]);
> > +}
> > +


pred-addr.patch
Description: pred-addr.patch


Re: openmp: Implement OpenMP 5.1 atomics, so far for C only

2021-09-13 Thread Jakub Jelinek via Gcc-patches
On Mon, Sep 13, 2021 at 01:57:52PM +0200, Christophe Lyon wrote:
> > --- gcc/testsuite/c-c++-common/gomp/atomic-29.c.jj  2021-09-10
> > 11:47:17.093164041 +0200
> > +++ gcc/testsuite/c-c++-common/gomp/atomic-29.c 2021-09-10
> > 11:52:33.428722747 +0200
> > @@ -0,0 +1,43 @@
> > +/* { dg-do compile { target c } } */
> > +/* { dg-additional-options "-O2 -fdump-tree-ompexp" } */
> > +/* { dg-final { scan-tree-dump-times "\.ATOMIC_COMPARE_EXCHANGE
> > \\\(\[^\n\r]*, 8, 5, 5\\\);" 1 "ompexp" { target sync_int_long } } } */
> > +/* { dg-final { scan-tree-dump-times "\.ATOMIC_COMPARE_EXCHANGE
> > \\\(\[^\n\r]*, 8, 4, 2\\\);" 1 "ompexp" { target sync_int_long } } } */
> > +/* { dg-final { scan-tree-dump-times "\.ATOMIC_COMPARE_EXCHANGE
> > \\\(\[^\n\r]*, 264, 5, 0\\\);" 1 "ompexp" { target sync_int_long } } } */
> > +/* { dg-final { scan-tree-dump-times "\.ATOMIC_COMPARE_EXCHANGE
> > \\\(\[^\n\r]*, 8, 0, 0\\\);" 1 "ompexp" { target sync_int_long } } } */
> > +/* { dg-final { scan-tree-dump-not "__atomic_load_8 \\\(" "ompexp" {
> > target sync_int_long } } } */
> > +
> >
> 
> This test fails on arm*linux when forcing old CPU/ARCH (eg  -march=armv5t).
> Not sure how easy it is to fix?
> sync_int_long returns true for arm*-*-linux-*,
> but for other arm targets, it depends on the result
> of check_effective_target_arm_acq_rel.
> 
> Is it just a matter of removing arm*-*-linux-* from sync_int_long and
> always rely on arm_acq_rel?

So, atomic-28.c passes and just atomic-29.c fails?
atomic-29.c tests for double atomics aka 64-bit, so probably I should use
sync_long_long effective target instead of sync_int_long.
But it is unclear how that would help for arm, because sync_long_long is
enabled for all arm targets...
There is also the sync_long_long_runtime effective target, but this is a
compile test, so it would be weird to rely on an effective target that is
meant to check runtime behavior.

Jakub



Re: [COMMITTED][patch][version 9]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-09-13 Thread Jose E. Marchesi via Gcc-patches


> On Fri, Sep 10, 2021 at 3:47 PM Jose E. Marchesi via Gcc-patches
>  wrote:
>>
>>
>> Hi Richard.
>>
>> > On Thu, 9 Sep 2021, Kees Cook wrote:
>> >
>> >> On Thu, Sep 09, 2021 at 10:49:11PM +0000, Qing Zhao wrote:
>> >> > Hi, FYI
>> >> >
>> >> > I just committed the following patch to gcc upstream:
>> >> >
>> >> >
>> >> > https://gcc.gnu.org/pipermail/gcc-cvs/2021-September/353195.html
>> >>
>> >> Hurray! Thank you so much for working on this, and thanks also to the
>> >> reviewers and everyone else poking at it.
>> >>
>> >> I will go update my Linux Plumbers slides to say "supported" instead of
>> >> "proposed". :)
>> >
>> > Can you two work on wording to add to gcc-12/changes.html for this
>> > feature?  I think it deserves a release note.  Likewise the CTF/BTF
>> > support btw.
>>
>> What about something like this for the BPF, CTF and BTF changes..
>
> Looks good to me!

Installed.  Thanks!

> Thanks,
> Richard.
>
>> commit 3826495d1a2c265954d5da13ca71925eea390060 (HEAD -> master)
>> Author: Jose E. Marchesi 
>> Date:   Fri Sep 10 15:44:30 2021 +0200
>>
>> gcc-12/changes.html: BPF, CTF and BTF update
>>
>> * htdocs/gcc-12/changes.html (BPF): Item about the CO-RE support.
>> (Debugging formats): New section with items about the support for
>> CTF and BTF.
>>
>> diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
>> index 946faa49..936af979 100644
>> --- a/htdocs/gcc-12/changes.html
>> +++ b/htdocs/gcc-12/changes.html
>> @@ -143,6 +143,15 @@ a work-in-progress.
>>
>>  
>>
>> +BPF
>> +
>> +  Support for CO-RE (compile-once, run-everywhere) has been added
>> +  to the BPF backend.  CO-RE allows to compile portable BPF
>> +  programs that are able to run among different versions of the
>> +  Linux kernel.
>> +  
>> +
>> +
>>  
>>
>>  
>> @@ -210,7 +219,25 @@ a work-in-progress.
>>  
>>
>>  
>> -
>> +Other significant improvements
>> +
>> +Debugging formats
>> +
>> +
>> +  GCC can now generate debugging information
>> +  in <a href="https://ctfstd.org">CTF</a>, a lightweight debugging
>> +  format that provides information about C types and the
>> +  association between functions and data symbols and types.  This
>> +  format is designed to be embedded in ELF files and to be very
>> +  compact and simple.  A new command-line
>> +  option -gctf enables the generation of CTF.
>> +  
>> +  GCC can now generate debugging information in BTF.  This is a
>> +  debugging format mainly used in BPF programs and the Linux
>> +  kernel.  The compiler can generate BTF for any target, when
>> +  enabled with the command-line option -gbtf
>> +  
>> +
>>
>>
>>  


[PATCH] Maintain (mis-)alignment info in the first element of a group

2021-09-13 Thread Richard Biener via Gcc-patches
This changes us to maintain and compute (mis-)alignment info for
the first element of a group only rather than for each DR when
doing interleaving and for the earliest, first, or first in the SLP
node (or any pair or all three of those) when SLP vectorizing.

To make this work in the easiest way, I have changed the accessors
DR_MISALIGNMENT and DR_TARGET_ALIGNMENT to do the indirection to
the first element rather than adjusting all callers.
dr_misalignment is moved out of line and I'm not too fond of the
poly-int dances there (any hints?), but basically we are now
adjusting the first element's misalignment based on the DR_INIT
difference.
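
Stripped of the poly-int handling, the adjustment amounts to the arithmetic sketched below. This is an illustration only, assuming constant byte offsets and a constant target alignment; group_member_misalignment and MISALIGN_UNKNOWN are hypothetical names, not what the patch uses.

```c
#include <assert.h>

/* Hypothetical marker for "misalignment not known".  */
#define MISALIGN_UNKNOWN (-1)

/* Derive a group member's misalignment from the one recorded for the
   first element of its group.  FIRST_INIT and MEMBER_INIT play the
   role of the constant DR_INIT byte offsets; TARGET_ALIGN is the
   target alignment in bytes and must be nonzero.  */
static int
group_member_misalignment (int first_misalign, long first_init,
                           long member_init, long target_align)
{
  if (first_misalign == MISALIGN_UNKNOWN)
    return MISALIGN_UNKNOWN;

  /* Shift by the offset difference, then reduce modulo the alignment.  */
  long m = (first_misalign + (member_init - first_init)) % target_align;
  if (m < 0)
    m += target_align;   /* C's % can yield a negative remainder.  */
  return (int) m;
}
```

For example, a first element misaligned by 4 bytes against a 16-byte target alignment puts a member 8 bytes further on at a misalignment of 12.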

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Richard.

2021-09-13  Richard Biener  

* tree-vectorizer.h (dr_misalignment): Move out of line.
(dr_target_alignment): New.
(DR_TARGET_ALIGNMENT): Wrap dr_target_alignment.
(set_dr_target_alignment): New.
(SET_DR_TARGET_ALIGNMENT): Wrap set_dr_target_alignment.
* tree-vect-data-refs.c (dr_misalignment): Compute and
return the group member's misalignment.
(vect_compute_data_ref_alignment): Use SET_DR_TARGET_ALIGNMENT.
(vect_analyze_data_refs_alignment): Compute alignment only
for the first element of a DR group.
(vect_slp_analyze_node_alignment): Likewise.
---
 gcc/tree-vect-data-refs.c | 65 ---
 gcc/tree-vectorizer.h | 24 ++-
 2 files changed, 57 insertions(+), 32 deletions(-)

diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 66e76132d14..b53d6a0b3f1 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -887,6 +887,36 @@ vect_slp_analyze_instance_dependence (vec_info *vinfo, 
slp_instance instance)
   return res;
 }
 
+/* Return the misalignment of DR_INFO.  */
+
+int
+dr_misalignment (dr_vec_info *dr_info)
+{
+  if (STMT_VINFO_GROUPED_ACCESS (dr_info->stmt))
+{
+  dr_vec_info *first_dr
+   = STMT_VINFO_DR_INFO (DR_GROUP_FIRST_ELEMENT (dr_info->stmt));
+  int misalign = first_dr->misalignment;
+  gcc_assert (misalign != DR_MISALIGNMENT_UNINITIALIZED);
+  if (misalign == DR_MISALIGNMENT_UNKNOWN)
+   return misalign;
+  poly_offset_int diff = (wi::to_poly_offset (DR_INIT (dr_info->dr))
+ - wi::to_poly_offset (DR_INIT (first_dr->dr)));
+  poly_int64 mispoly = misalign + diff.to_constant ().to_shwi ();
+  bool res = known_misalignment (mispoly,
+first_dr->target_alignment.to_constant (),
+                                &misalign);
+  gcc_assert (res);
+  return misalign;
+}
+  else
+{
+  int misalign = dr_info->misalignment;
+  gcc_assert (misalign != DR_MISALIGNMENT_UNINITIALIZED);
+  return misalign;
+}
+}
+
 /* Record the base alignment guarantee given by DRB, which occurs
in STMT_INFO.  */
 
@@ -992,7 +1022,7 @@ vect_compute_data_ref_alignment (vec_info *vinfo, 
dr_vec_info *dr_info)
 
   poly_uint64 vector_alignment
 = exact_div (vect_calculate_target_alignment (dr_info), BITS_PER_UNIT);
-  DR_TARGET_ALIGNMENT (dr_info) = vector_alignment;
+  SET_DR_TARGET_ALIGNMENT (dr_info, vector_alignment);
 
   /* If the main loop has peeled for alignment we have no way of knowing
  whether the data accesses in the epilogues are aligned.  We can't at
@@ -2408,7 +2438,12 @@ vect_analyze_data_refs_alignment (loop_vec_info 
loop_vinfo)
 {
   dr_vec_info *dr_info = loop_vinfo->lookup_dr (dr);
   if (STMT_VINFO_VECTORIZABLE (dr_info->stmt))
-   vect_compute_data_ref_alignment (loop_vinfo, dr_info);
+   {
+ if (STMT_VINFO_GROUPED_ACCESS (dr_info->stmt)
+ && DR_GROUP_FIRST_ELEMENT (dr_info->stmt) != dr_info->stmt)
+   continue;
+ vect_compute_data_ref_alignment (loop_vinfo, dr_info);
+   }
 }
 
   return opt_result::success ();
@@ -2420,13 +2455,9 @@ vect_analyze_data_refs_alignment (loop_vec_info 
loop_vinfo)
 static bool
 vect_slp_analyze_node_alignment (vec_info *vinfo, slp_tree node)
 {
-  /* We vectorize from the first scalar stmt in the node unless
- the node is permuted in which case we start from the first
- element in the group.  */
+  /* Alignment is maintained in the first element of the group.  */
   stmt_vec_info first_stmt_info = SLP_TREE_SCALAR_STMTS (node)[0];
-  dr_vec_info *first_dr_info = STMT_VINFO_DR_INFO (first_stmt_info);
-  if (SLP_TREE_LOAD_PERMUTATION (node).exists ())
-first_stmt_info = DR_GROUP_FIRST_ELEMENT (first_stmt_info);
+  first_stmt_info = DR_GROUP_FIRST_ELEMENT (first_stmt_info);
 
   /* We need to commit to a vector type for the group now.  */
   if (is_a <bb_vec_info> (vinfo)
@@ -2440,22 +2471,8 @@ vect_slp_analyze_node_alignment (vec_info *vinfo, 
slp_tree node)
 }
 
   dr_vec_info *dr_info = STMT_VINFO_DR_INFO (first_stmt_info);
-  vect_compute_data_ref_alignment (vinfo, dr_info);
-  /* In several places 

Re: [PATCH] i386: support micro-levels in target{, _clone} attrs [PR101696]

2021-09-13 Thread Uros Bizjak via Gcc-patches
On Thu, Aug 12, 2021 at 5:32 PM Martin Liška  wrote:
>
> On 8/12/21 5:26 PM, H.J. Lu wrote:
> > Will it hurt if they have proper feature_priorities you added?
>
> No. They are unused, but we should use the proper priorities.

gcc/ChangeLog:

* common/config/i386/cpuinfo.h (cpu_indicator_init): Add support
for x86-64 micro levels for __builtin_cpu_supports.
* common/config/i386/i386-cpuinfo.h (enum feature_priority):
Add priorities for the micro-arch levels.
(enum processor_features): Add new features.
* common/config/i386/i386-isas.h: Add micro-arch features.
* config/i386/i386-builtins.c (get_builtin_code_for_version):
Support the micro-arch levels by calling
__builtin_cpu_supports.
* doc/extend.texi: Document that the levels are supported by
 __builtin_cpu_supports.

gcc/testsuite/ChangeLog:

* g++.target/i386/mv30.C: New test.
* gcc.target/i386/mvc16.c: New test.
* gcc.target/i386/builtin_target.c (CHECK___builtin_cpu_supports):
New.

OK.

Thanks,
Uros.


Re: Regression with recent change

2021-09-13 Thread Jeff Law via Gcc-patches




On 9/13/2021 8:18 AM, Michael Matz wrote:
> Hello,
>
> On Mon, 13 Sep 2021, Jeff Law via Gcc-patches wrote:
>
>>> So it looks like there's some undefined behavior going on, even before
>>> my patch.  I'd like to get some feedback, because this is usually the
>>> type of problems I see in the presence of a smarter threader... things
>>> get shuffled around, problematic code gets isolated, and warning
>>> passes have an easier time (or sometimes harder time) diagnosing
>>> things.
>> The original issue was PRE hanging, so I'd lean towards keeping the test as-is
>> and instead twiddling any warning flags we can to make the diagnostics go
>> away.
>
> Or use this changed test avoiding the issues that I see with -W -Wall on
> this testcase.  I've verified that it still hangs before r194358 and is
> fixed by that revision.
>
> Generally I think, our testsuite, even for ICEs or these kinds of hangs,
> should make an effort to try to write conforming code; if at all possible.
> Here it is possible.
>
> (I don't know if the new threader causes additional warnings, of course,
> but at least the problems with sequence points and uninitialized use of
> 'j' aren't necessary to reproduce the bug)

Well, if we can twiddle the test to remove the undefined behavior
without compromising its original intent, then that's obviously better :-)


jeff



Re: [PATCH] Relax condition of (vec_concat:M(vec_select op0 idx0)(vec_select op0 idx1)) to allow different modes between op0 and M, but have same inner mode.

2021-09-13 Thread Richard Biener via Gcc-patches
On Mon, Sep 13, 2021 at 4:10 PM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 9/9/2021 10:36 PM, liuhongt via Gcc-patches wrote:
> >Currently for (vec_concat:M (vec_select op0 idx1)(vec_select op0 idx2)),
> > the optimizer won't simplify if op0 has a different mode from M, but that's
> > too restrictive and prevents the optimization below; the condition can be
> > relaxed so that op0 only needs to have the same inner mode as M.
> >
> > (set (reg:V2DF 87 [ xx ])
> >  (vec_concat:V2DF (vec_select:DF (reg:V4DF 92)
> >  (parallel [
> >  (const_int 2 [0x2])
> >  ]))
> >  (vec_select:DF (reg:V4DF 92)
> >  (parallel [
> >  (const_int 3 [0x3])
> >  ]
> >
> >Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> >Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> >   * simplify-rtx.c
> >   (simplify_context::simplify_binary_operation_1): Relax
> >   condition of simplifying (vec_concat:M (vec_select op0
> >   index0)(vec_select op1 index1)) to allow different modes
> >   between op0 and M, but have same inner mode.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.target/i386/vect-rebuild.c:
> >   * gcc.target/i386/avx512f-vect-rebuild.c: New test.
> Funny, I was looking at something rather similar recently, but never
> pushed on it because we were going to need too many entries in the
> parallel selector.
>
> I'm not convinced that we need the inner mode to match anything.  As
> long as the vec_concat's mode is twice the size of the vec_select modes
> and the vec_select mode is <= the mode of its operands ISTM this is
> fine.   We  might want the modes of the vec_select to match, but I don't
> think that's strictly necessary either, they just need to be the same
> size.  ie, we could have something like
>
> (vec_concat:V2DF (vec_select:DF (reg:V4DF)) (vec_select:DI (reg:V4DI)))
>
> I'm not sure if that level of generality is useful though.  If we want
> the modes of the vec_selects to match I think we could still support
>
> (vec_concat:V2DF (vec_select:DF (reg:V4DF)) (vec_select:DF (reg:V8DF)))
>
> Thoughts?

I think the component or scalar modes of the elements to concat need to match
the component mode of the result.  I don't think your example involving
a concat of DF and DI is too useful, but you could use a subreg around the DI
value ;)
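
To make the constraint concrete, here is a toy model of the check, not the simplify-rtx.c code. The mode representation and the combine_selects name are made up for illustration, and it only covers the two-element case from the patch description.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical scalar (component) modes.  */
enum scalar_mode { SM_DF, SM_DI };

/* Hypothetical vector mode: a component mode plus an element count.  */
struct vec_mode
{
  enum scalar_mode inner;
  unsigned nunits;
};

/* Model of folding a concat of two single-element selects from the
   same OP0 into one two-element select.  Succeeds only when the
   component mode of OP0 matches that of RESULT; on success SEL holds
   the combined selector.  */
static bool
combine_selects (struct vec_mode op0, struct vec_mode result,
                 unsigned idx0, unsigned idx1, unsigned sel[2])
{
  if (op0.inner != result.inner || result.nunits != 2)
    return false;
  if (idx0 >= op0.nunits || idx1 >= op0.nunits)
    return false;
  sel[0] = idx0;
  sel[1] = idx1;
  return true;
}
```

In this model a V4DF source feeding a V2DF result is accepted with selector {2, 3}, while a V4DI source is rejected because the component modes differ.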

Richard.

>
> jeff
>
> Jeff
>
>
> > ---
> >   gcc/simplify-rtx.c|  3 ++-
> >   .../gcc.target/i386/avx512f-vect-rebuild.c| 21 +++
> >   gcc/testsuite/gcc.target/i386/vect-rebuild.c  |  2 +-
> >   3 files changed, 24 insertions(+), 2 deletions(-)
> >   create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
> >
> > diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
> > index ebad5cb5a79..16286befd79 100644
> > --- a/gcc/simplify-rtx.c
> > +++ b/gcc/simplify-rtx.c
> > @@ -4587,7 +4587,8 @@ simplify_context::simplify_binary_operation_1 
> > (rtx_code code,
> >   if (GET_CODE (trueop0) == VEC_SELECT
> >   && GET_CODE (trueop1) == VEC_SELECT
> >   && rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop1, 0))
> > - && GET_MODE (XEXP (trueop0, 0)) == mode)
> > + && GET_MODE_INNER (GET_MODE (XEXP (trueop0, 0)))
> > +== GET_MODE_INNER(mode))
> > {
> >   rtx par0 = XEXP (trueop0, 1);
> >   rtx par1 = XEXP (trueop1, 1);
> > diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c 
> > b/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
> > new file mode 100644
> > index 000..aef6855aa46
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
> > @@ -0,0 +1,21 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O -mavx512vl -mavx512dq -fno-tree-forwprop" } */
> > +
> > +typedef double v2df __attribute__ ((__vector_size__ (16)));
> > +typedef double v4df __attribute__ ((__vector_size__ (32)));
> > +
> > +v2df h (v4df x)
> > +{
> > +  v2df xx = { x[2], x[3] };
> > +  return xx;
> > +}
> > +
> > +v4df f2 (v4df x)
> > +{
> > +  v4df xx = { x[0], x[1], x[2], x[3] };
> > +  return xx;
> > +}
> > +
> > +/* { dg-final { scan-assembler-not "unpck" } } */
> > +/* { dg-final { scan-assembler-not "valign" } } */
> > +/* { dg-final { scan-assembler-times "\tv?extract(?:f128|f64x2)\[ \t\]" 1 
> > } } */
> > diff --git a/gcc/testsuite/gcc.target/i386/vect-rebuild.c 
> > b/gcc/testsuite/gcc.target/i386/vect-rebuild.c
> > index 570967f6b5c..8e85b98bf1d 100644
> > --- a/gcc/testsuite/gcc.target/i386/vect-rebuild.c
> > +++ b/gcc/testsuite/gcc.target/i386/vect-rebuild.c
> > @@ -30,4 +30,4 @@ v2df h (v4df x)
> >
> >   /* { dg-final { scan-assembler-not "unpck" } } */
> >   /* { dg-final { scan-assembler-times "\tv?permilpd\[ \t\]" 1 } } */
> > -/* { dg-final { scan-assembler-times "\tv?extractf128\[ \t\]" 1 } } */
> > +/* { dg-final { scan-assembler-times "\tv?extract(?:f128|f64x2)\[ \t\]" 1 
> > } } */
>


Re: Regression with recent change

2021-09-13 Thread Michael Matz via Gcc-patches
Hello,

On Mon, 13 Sep 2021, Jeff Law via Gcc-patches wrote:

> > So it looks like there's some undefined behavior going on, even before
> > my patch.  I'd like to get some feedback, because this is usually the
> > type of problems I see in the presence of a smarter threader... things
> > get shuffled around, problematic code gets isolated, and warning
> > passes have an easier time (or sometimes harder time) diagnosing
> > things.
> The original issue was PRE hanging, so I'd lean towards keeping the test as-is
> and instead twiddling any warning flags we can to make the diagnostics go
> away.

Or use this changed test avoiding the issues that I see with -W -Wall on 
this testcase.  I've verified that it still hangs before r194358 and is 
fixed by that revision.

Generally I think, our testsuite, even for ICEs or these kinds of hangs, 
should make an effort to try to write conforming code; if at all possible.  
Here it is possible.

(I don't know if the new threader causes additional warnings, of course, 
but at least the problems with sequence points and uninitialized use of 
'j' aren't necessary to reproduce the bug)


Ciao,
Michael.

/* { dg-do compile } */
/* { dg-additional-options "-fno-split-loops" } */

typedef unsigned short uint16_t;

uint16_t a, b;

int *j_global;
uint16_t f(void)
{
  int c, **p;
  short d = 2, e = 4;

  for (;; b++)
{
  int *j = j_global, k = 0;

  for (; *j; j++)
{
  for(; c; c++)
for(; k < 1; k++)
  {
short *f = 

if(b)
  return *f;
  }
}

  if(!c)
d *= e;

  a = d;
  if ((a ? b = 0 : (**p ? : 1) != (d != 1 ? 1 : (b = 0))) != ((k ? a : 0)
  < (a * (c = k
**p = 0;
}
}


Re: [PATCH] Relax condition of (vec_concat:M(vec_select op0 idx0)(vec_select op0 idx1)) to allow different modes between op0 and M, but have same inner mode.

2021-09-13 Thread Jeff Law via Gcc-patches




On 9/9/2021 10:36 PM, liuhongt via Gcc-patches wrote:

   Currently for (vec_concat:M (vec_select op0 idx1)(vec_select op0 idx2)),
the optimizer won't simplify if op0 has a different mode from M, but that's
too restrictive and prevents the optimization below; the condition can be
relaxed so that op0 only needs to have the same inner mode as M.

(set (reg:V2DF 87 [ xx ])
 (vec_concat:V2DF (vec_select:DF (reg:V4DF 92)
 (parallel [
 (const_int 2 [0x2])
 ]))
 (vec_select:DF (reg:V4DF 92)
 (parallel [
 (const_int 3 [0x3])
 ]

   Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
   Ok for trunk?

gcc/ChangeLog:

* simplify-rtx.c
(simplify_context::simplify_binary_operation_1): Relax
condition of simplifying (vec_concat:M (vec_select op0
index0)(vec_select op1 index1)) to allow different modes
between op0 and M, but have same inner mode.

gcc/testsuite/ChangeLog:

* gcc.target/i386/vect-rebuild.c:
* gcc.target/i386/avx512f-vect-rebuild.c: New test.

Funny, I was looking at something rather similar recently, but never 
pushed on it because we were going to need too many entries in the 
parallel selector.


I'm not convinced that we need the inner mode to match anything.  As 
long as the vec_concat's mode is twice the size of the vec_select modes 
and the vec_select mode is <= the mode of its operands ISTM this is 
fine.   We  might want the modes of the vec_select to match, but I don't 
think that's strictly necessary either, they just need to be the same 
size.  ie, we could have something like


(vec_concat:V2DF (vec_select:DF (reg:V4DF)) (vec_select:DI (reg:V4DI)))

I'm not sure if that level of generality is useful though.  If we want 
the modes of the vec_selects to match I think we could still support


(vec_concat:V2DF (vec_select:DF (reg:V4DF)) (vec_select:DF (reg:V8DF)))

Thoughts?

jeff

Jeff



---
  gcc/simplify-rtx.c|  3 ++-
  .../gcc.target/i386/avx512f-vect-rebuild.c| 21 +++
  gcc/testsuite/gcc.target/i386/vect-rebuild.c  |  2 +-
  3 files changed, 24 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index ebad5cb5a79..16286befd79 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -4587,7 +4587,8 @@ simplify_context::simplify_binary_operation_1 (rtx_code 
code,
if (GET_CODE (trueop0) == VEC_SELECT
&& GET_CODE (trueop1) == VEC_SELECT
&& rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop1, 0))
-   && GET_MODE (XEXP (trueop0, 0)) == mode)
+   && GET_MODE_INNER (GET_MODE (XEXP (trueop0, 0)))
+  == GET_MODE_INNER(mode))
  {
rtx par0 = XEXP (trueop0, 1);
rtx par1 = XEXP (trueop1, 1);
diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c 
b/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
new file mode 100644
index 000..aef6855aa46
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O -mavx512vl -mavx512dq -fno-tree-forwprop" } */
+
+typedef double v2df __attribute__ ((__vector_size__ (16)));
+typedef double v4df __attribute__ ((__vector_size__ (32)));
+
+v2df h (v4df x)
+{
+  v2df xx = { x[2], x[3] };
+  return xx;
+}
+
+v4df f2 (v4df x)
+{
+  v4df xx = { x[0], x[1], x[2], x[3] };
+  return xx;
+}
+
+/* { dg-final { scan-assembler-not "unpck" } } */
+/* { dg-final { scan-assembler-not "valign" } } */
+/* { dg-final { scan-assembler-times "\tv?extract(?:f128|f64x2)\[ \t\]" 1 } } 
*/
diff --git a/gcc/testsuite/gcc.target/i386/vect-rebuild.c 
b/gcc/testsuite/gcc.target/i386/vect-rebuild.c
index 570967f6b5c..8e85b98bf1d 100644
--- a/gcc/testsuite/gcc.target/i386/vect-rebuild.c
+++ b/gcc/testsuite/gcc.target/i386/vect-rebuild.c
@@ -30,4 +30,4 @@ v2df h (v4df x)
  
  /* { dg-final { scan-assembler-not "unpck" } } */

  /* { dg-final { scan-assembler-times "\tv?permilpd\[ \t\]" 1 } } */
-/* { dg-final { scan-assembler-times "\tv?extractf128\[ \t\]" 1 } } */
+/* { dg-final { scan-assembler-times "\tv?extract(?:f128|f64x2)\[ \t\]" 1 } } 
*/




Re: [PING] Don't maintain a warning spec for 'UNKNOWN_LOCATION'/'BUILTINS_LOCATION' [PR101574] (was: [PATCH 1/13] v2 [PATCH 1/13] Add support for per-location warning groups (PR 74765))

2021-09-13 Thread Jeff Law via Gcc-patches




On 9/10/2021 1:45 AM, Thomas Schwinge wrote:


0001-Simplify-gcc-diagnostic-spec.h-nowarn_map-setup.patch

 From 095c16ead5d432726f2b6de5ce12fd367600076d Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Wed, 1 Sep 2021 16:48:55 +0200
Subject: [PATCH 1/3] Simplify 'gcc/diagnostic-spec.h:nowarn_map' setup

If we've just read something from the map, we can be sure that it exists.

gcc/
* warning-control.cc (copy_warning): Remove 'nowarn_map' setup.
OK




0002-Clarify-key_type_t-to-location_t-as-used-for-gcc-dia.patch

 From 23d9b93401349fca03efaef0fef0960933f4c316 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 31 Aug 2021 22:01:23 +0200
Subject: [PATCH 2/3] Clarify 'key_type_t' to 'location_t' as used for
  'gcc/diagnostic-spec.h:nowarn_map'

To make it obvious what exactly the key type is.  No change in behavior.

gcc/
* diagnostic-spec.h (typedef xint_hash_t): Use 'location_t' instead 
of...
(typedef key_type_t): ... this.  Remove.
(nowarn_map): Document.
* diagnostic-spec.c (nowarn_map): Likewise.
* warning-control.cc (convert_to_key): Evolve functions into...
(get_location): ... these.  Adjust all users.

OK



0003-Don-t-maintain-a-warning-spec-for-UNKNOWN_LOCATION-B.patch

 From 51c9a8ac2caa0432730c78d00989fd01f3ac6fe5 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Mon, 30 Aug 2021 22:36:47 +0200
Subject: [PATCH 3/3] Don't maintain a warning spec for
  'UNKNOWN_LOCATION'/'BUILTINS_LOCATION' [PR101574]
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This resolves PR101574 "gcc/sparseset.h:215:20: error: suggest parentheses
around assignment used as truth value [-Werror=parentheses]", as (bogusly)
reported at commit a61f6afbee370785cf091fe46e2e022748528307:

 In file included from [...]/source-gcc/gcc/lra-lives.c:43:
 [...]/source-gcc/gcc/lra-lives.c: In function ‘void 
make_hard_regno_dead(int)’:
 [...]/source-gcc/gcc/sparseset.h:215:20: error: suggest parentheses around 
assignment used as truth value [-Werror=parentheses]
   215 |&& (((ITER) = sparseset_iter_elm (SPARSESET)) || 1);
 \
   |^
 [...]/source-gcc/gcc/lra-lives.c:304:3: note: in expansion of macro 
‘EXECUTE_IF_SET_IN_SPARSESET’
   304 |   EXECUTE_IF_SET_IN_SPARSESET (pseudos_live, i)
   |   ^~~

gcc/
PR bootstrap/101574
* diagnostic-spec.c (warning_suppressed_at, copy_warning): Handle
'RESERVED_LOCATION_P' locations.
* warning-control.cc (get_nowarn_spec, suppress_warning)
(copy_warning): Likewise.

OK.

Jeff



Re: [PATCH][v2] Always default to DWARF2_DEBUG if not specified, warn about deprecated STABS

2021-09-13 Thread Jeff Law via Gcc-patches




On 9/13/2021 7:47 AM, Richard Biener wrote:
> On Mon, 13 Sep 2021, Jeff Law wrote:
>
>> On 9/13/2021 1:31 AM, Richard Biener wrote:
>>> This makes defaults.h choose DWARF2_DEBUG if PREFERRED_DEBUGGING_TYPE
>>> is not specified by the target and NO_DEBUG if DWARF is not supported.
>>>
>>> It also makes us warn when STABS is enabled and removes the corresponding
>>> diagnostic from the Ada frontend.  The warnings are pruned from the
>>> testsuite output via prune_gcc_output.
>>>
>>> This leaves the following targets without debug support:
>>>
>>>   pdp11-*-*   pdp11 is a.out, dwarf support is difficult
>>>   m68k*-*-openbsd*  it looks like this is a.out as well, at least it does
>>> not pretend to support DWARF
>>>   hppa[12]*-*-hpux10*  does seem to not support DWARF
>> I would probably argue that hpux10 should just be removed, along with hpux 7-9
>> if they haven't been already.  It's the epitome of a dead platform.
>
> It is in fact also hpux11*, thus all 32bit pa configs that do not support
> DWARF (for whatever reasons).

We used embedded stabs for SOM (the native format for 32bit PA). SOM is
a variant of COFF and could easily support dwarf I would think since it
had support for fairly arbitrary sections.  Hell, it was already
supporting embedded stabs as well as HP's proprietary debugging format.

But I'd consider 32bit SOM on hpux11 dead too :-)  I nearly asked why
you restricted your original comment to hpux10...


Jeff


Re: [PATCH] Optimize macro: make it more predictable

2021-09-13 Thread Martin Liška

On 8/27/21 11:05, Richard Biener wrote:
> So with ignoring darktable which seems completely insane the cases
> will likely continue
> to work as intended if we change from the current scheme to appending
> as proposed.

All right, I'm addressing the flag_complex_method in a separate sub-thread.

There's a slightly updated version of the patch where I modified the
documentation bits.
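
The appending scheme itself is simple; a stand-alone sketch of the idea follows. This is not the code in the patch: merge_options is a hypothetical helper and plain strings stand in for cl_decoded_option entries.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated array holding the
   command-line options followed by the attribute/pragma options, so
   that later entries override earlier ones when they are applied in
   order.  The caller frees the result.  Returns NULL on allocation
   failure.  */
static const char **
merge_options (const char **cmdline, size_t n_cmdline,
               const char **attr, size_t n_attr)
{
  size_t total = n_cmdline + n_attr;
  const char **merged = malloc ((total ? total : 1) * sizeof *merged);
  if (merged == NULL)
    return NULL;

  for (size_t i = 0; i < n_cmdline; ++i)
    merged[i] = cmdline[i];
  for (size_t i = 0; i < n_attr; ++i)
    merged[n_cmdline + i] = attr[i];
  return merged;
}
```

With a command line of "-O1 -ftree-slp-vectorize" and an attribute of "-O2", the merged list is "-O1 -ftree-slp-vectorize -O2", which is the ordering the new attr-optimize.c test relies on.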

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

From e13e3ec56acfb62543bc1912f1310d00eefba5c3 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Wed, 2 Jun 2021 08:44:37 +0200
Subject: [PATCH] Append target/optimize attr to the current cmdline.

gcc/c-family/ChangeLog:

	* c-common.c (parse_optimize_options): Combine optimize
	options with what was provided on the command line.

gcc/ChangeLog:

	* toplev.c (toplev::main): Save decoded optimization options.
	* toplev.h (save_opt_decoded_options): New.
	* doc/extend.texi: Be more clear about optimize and target
	attributes.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512er-vrsqrt28ps-3.c: Disable fast math.
	* gcc.target/i386/avx512er-vrsqrt28ps-5.c: Likewise.
	* gcc.target/i386/attr-optimize.c: New test.
---
 gcc/c-family/c-common.c   | 17 +++--
 gcc/doc/extend.texi   |  8 +--
 gcc/testsuite/gcc.target/i386/attr-optimize.c | 24 +++
 .../gcc.target/i386/avx512er-vrsqrt28ps-3.c   |  2 +-
 .../gcc.target/i386/avx512er-vrsqrt28ps-5.c   |  2 +-
 gcc/toplev.c  |  8 +++
 gcc/toplev.h  |  1 +
 7 files changed, 56 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/attr-optimize.c

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 017e41537ac..09038e3175f 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -5904,9 +5904,22 @@ parse_optimize_options (tree args, bool attr_p)
   j++;
 }
   decoded_options_count = j;
-  /* And apply them.  */
+
+  /* Merge the decoded options with save_decoded_options.  */
+  unsigned save_opt_count = save_opt_decoded_options.length ();
+  unsigned merged_decoded_options_count
+= save_opt_count + decoded_options_count;
+  cl_decoded_option *merged_decoded_options
+= XNEWVEC (cl_decoded_option, merged_decoded_options_count);
+
+  for (unsigned i = 0; i < save_opt_count; ++i)
+merged_decoded_options[i] = save_opt_decoded_options[i];
+  for (unsigned i = 0; i < decoded_options_count; ++i)
+merged_decoded_options[save_opt_count + i] = decoded_options[i];
+
+   /* And apply them.  */
   decode_options (&global_options, &global_options_set,
-		  decoded_options, decoded_options_count,
+		  merged_decoded_options, merged_decoded_options_count,
 		  input_location, global_dc, NULL);
   free (decoded_options);
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 7fb22ed8063..1cb7e33ca29 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3639,7 +3639,10 @@ take function pointer arguments.
 @cindex @code{optimize} function attribute
 The @code{optimize} attribute is used to specify that a function is to
 be compiled with different optimization options than specified on the
-command line.  Valid arguments are constant non-negative integers and
+command line.  The optimize attribute arguments of a function behave
+as if they were appended to the command line.
+
+Valid arguments are constant non-negative integers and
 strings.  Each numeric argument specifies an optimization @var{level}.
 Each @var{string} argument consists of one or more comma-separated
 substrings.  Each substring that begins with the letter @code{O} refers
@@ -3843,7 +3846,8 @@ This attribute prevents stack protection code for the function.
 Multiple target back ends implement the @code{target} attribute
 to specify that a function is to
 be compiled with different target options than specified on the
-command line.  One or more strings can be provided as arguments.
+command line.  The original target command-line options are ignored.
+One or more strings can be provided as arguments.
 Each string consists of one or more comma-separated suffixes to
 the @code{-m} prefix jointly forming the name of a machine-dependent
 option.  @xref{Submodel Options,,Machine-Dependent Options}.
diff --git a/gcc/testsuite/gcc.target/i386/attr-optimize.c b/gcc/testsuite/gcc.target/i386/attr-optimize.c
new file mode 100644
index 000..f5db028f1fd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/attr-optimize.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O1 -ftree-slp-vectorize -march=znver1 -fdump-tree-optimized" } */
+
+/* Use -O2, but -ftree-slp-vectorize option should be preserved and used.  */
+#pragma GCC optimize "-O2"
+
+typedef struct {
+  long n[4];
+} secp256k1_fe;
+
+void *a;
+int c;
+static void
+fn1(secp256k1_fe *p1, int p2)
+{
+  p1->n[0] = p1->n[1] = p2;
+}
+void
+fn2()
+{
+  fn1(a, 
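
The merge step in the parse_optimize_options hunk above places the saved command-line options first and the attribute's options after them. A minimal standalone sketch of that ordering follows. It assumes a hypothetical `struct option` in place of GCC's cl_decoded_option, and it assumes that a later occurrence of an option overrides an earlier one, as on an ordinary command line. The helper names `merge_options` and `lookup_option` are illustrative only and are not part of the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for cl_decoded_option: a name and a value.  */
struct option
{
  const char *name;
  int value;
};

/* Return a newly allocated array holding SAVED followed by ATTR.
   The caller frees the result.  */
static struct option *
merge_options (const struct option *saved, size_t n_saved,
	       const struct option *attr, size_t n_attr)
{
  size_t n = n_saved + n_attr;
  /* Allocate at least one element so an empty merge is still valid.  */
  struct option *merged = malloc ((n ? n : 1) * sizeof *merged);
  assert (merged != NULL);

  for (size_t i = 0; i < n_saved; ++i)
    merged[i] = saved[i];
  for (size_t i = 0; i < n_attr; ++i)
    merged[n_saved + i] = attr[i];
  return merged;
}

/* Look up NAME in OPTS, letting the last occurrence win.
   Return FALLBACK if NAME does not occur.  */
static int
lookup_option (const struct option *opts, size_t n,
	       const char *name, int fallback)
{
  int result = fallback;
  for (size_t i = 0; i < n; ++i)
    if (strcmp (opts[i].name, name) == 0)
      result = opts[i].value;
  return result;
}
```

With a saved command line of -O1 -ftree-slp-vectorize and an attribute of -O2, this model reports optimization level 2 while keeping the vectorizer flag, which is the behavior the new test expects.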

Re: [PATCH][v2] Always default to DWARF2_DEBUG if not specified, warn about deprecated STABS

2021-09-13 Thread Richard Biener via Gcc-patches
On Mon, 13 Sep 2021, Jeff Law wrote:

> 
> 
> On 9/13/2021 1:31 AM, Richard Biener wrote:
> > This makes defaults.h choose DWARF2_DEBUG if PREFERRED_DEBUGGING_TYPE
> > is not specified by the target and NO_DEBUG if DWARF is not supported.
> >
> > It also makes us warn when STABS is enabled and removes the corresponding
> > diagnostic from the Ada frontend.  The warnings are pruned from the
> > testsuite output via prune_gcc_output.
> >
> > This leaves the following targets without debug support:
> >
> >   pdp11-*-*   pdp11 is a.out, dwarf support is difficult
> >   m68k*-*-openbsd*  it looks like this is a.out as well, at least it does
> > not pretend to support DWARF
> >   hppa[12]*-*-hpux10*  does seem to not support DWARF
> I would probably argue that hpux10 should just be removed, along with hpux 7-9
> if they haven't been already.  It's the epitome of a dead platform.

It is in fact also hpux11*, thus all 32bit pa configs that do not support
DWARF (for whatever reasons).

Richard.

> 
> >   vax-*-openbsd*  seems to be a.out as well, does not support DWARF
> >
> > behavior will be like
> >
> >> ./cc1 -quiet t.c -g
> > cc1: warning: target system does not support debug output
> >> ./cc1 -quiet t.c -gstabs
> > t.c: warning: STABS debugging information is obsolete and not supported
> > anymore
> >
> > that is, -g is unsupported but -gstabs will generate STABS (the above
> > is for pdp11).  It would be nice if maintainers could confirm the above
> > listed configurations do not support DWARF and weigh in on whether
> > (apart from pdp11) the specific configurations can be obsoleted or
> > adjusted.  It looks like we do not have any openbsd maintainer.
> > I've discussed the situation for pdp11 with Paul already at some point
> > but we didn't reach any conclusion besides that it would be nice to
> > move pdp11 to ELF.
> >
> >
> > 2021-09-10  Richard Biener  
> >
> > gcc/
> >  * defaults.h (PREFERRED_DEBUGGING_TYPE): Choose DWARF2_DEBUG
> >  or NO_DEBUG.
> >  * toplev.c (process_options): Warn when STABS debugging is
> >  enabled.
> >
> > gcc/ada/
> >  * gcc-interface/misc.c (gnat_post_options): Do not warn
> >  about DBX_DEBUG use here.
> >
> > gcc/testsuite/
> >  * lib/prune.exp: Prune STABS obsoletion message.
> OK
> jeff
> 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: Regression with recent change

2021-09-13 Thread Jeff Law via Gcc-patches




On 9/13/2021 7:29 AM, Aldy Hernandez wrote:

Jeff has pointed out that after my change adding global ranges to the
path solver, torture/pr55107.c is failing.  Before I start digging
deep into the IL, I'd like to make sure this is not either expected or
a bogus test.

Compiling this test on x86 with -Wall yields:

$ gcc -c -O2 pr55107.c -Wall
pr55107.c: In function ‘f’:
pr55107.c:32:51: warning: the omitted middle operand in ‘?:’ will
always be ‘true’, suggest explicit middle operand [-Wparentheses]
32 |   ((a = d) ? b = 0 : (**p ? : 1) != (d != 1 ? : (a = 0)))
!= (k ? a : 0)
   |   ^
pr55107.c:33:11: warning: suggest parentheses around comparison in
operand of ‘!=’ [-Wparentheses]
32 |   ((a = d) ? b = 0 : (**p ? : 1) != (d != 1 ? : (a = 0)))
!= (k ? a : 0)
   |
   ~~~
33 |   < (a *= c = k) && (**p = 0);
   |   ^~
pr55107.c:33:16: warning: operation on ‘a’ may be undefined [-Wsequence-point]
33 |   < (a *= c = k) && (**p = 0);
   |^~
pr55107.c:33:16: warning: operation on ‘a’ may be undefined [-Wsequence-point]
pr55107.c:33:26: warning: value computed is not used [-Wunused-value]
33 |   < (a *= c = k) && (**p = 0);
   |  ^~
pr55107.c:17:14: warning: ‘j’ is used uninitialized [-Wuninitialized]
17 |   for (; *j; j++)
   |  ^~

So it looks like there's some undefined behavior going on, even before
my patch.  I'd like to get some feedback, because this is usually the
type of problems I see in the presence of a smarter threader... things
get shuffled around, problematic code gets isolated, and warning
passes have an easier time (or sometimes harder time) diagnosing
things.

The original issue was PRE hanging, so I'd lean towards keeping the test
as-is and instead twiddling any warning flags we can to make the 
diagnostics go away.


jeff


Re: PING^2 [PATCH] x86: Update memcpy/memset inline strategies for -mtune=generic

2021-09-13 Thread H.J. Lu via Gcc-patches
On Tue, Sep 7, 2021 at 8:01 PM H.J. Lu  wrote:
>
> On Sun, Aug 22, 2021 at 8:28 AM H.J. Lu  wrote:
> >
> > On Tue, Mar 23, 2021 at 09:19:38AM +0100, Richard Biener wrote:
> > > On Tue, Mar 23, 2021 at 3:41 AM Hongyu Wang  
> > > wrote:
> > > >
> > > > > Hongyu, please collect code size differences on SPEC CPU 2017 and
> > > > > eembc.
> > > >
> > > > Here is code size difference for this patch
> > >
> > > Thanks, nothing too bad although slightly larger impacts than envisioned.
> > >
> >
> > PING.
> >
> > OK for master branch?
> >
> > Thanks.
> >
> > H.J.
> >  ---
> > Simplify memcpy and memset inline strategies to avoid branches for
> > -mtune=generic:
> >
> > 1. With MOVE_RATIO and CLEAR_RATIO == 17, GCC will use integer/vector
> >    load and store for up to 16 * 16 (256) bytes when the data size is
> >    fixed and known.
> > 2. Inline only if data size is known to be <= 256.
> >    a. Use "rep movsb/stosb" with simple code sequence if the data size
> >       is a constant.
> >    b. Use loop if data size is not a constant.
> > 3. Use memcpy/memset library function if data size is unknown or > 256.
> >
>
> PING:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577889.html
>

PING.  This should fix:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102294

-- 
H.J.


Re: [PATCH] Refactor jump_thread_path_registry.

2021-09-13 Thread Jeff Law via Gcc-patches




On 9/11/2021 12:01 PM, Aldy Hernandez wrote:


So another thing to consider is that the threaders initially record 
their paths in different directions. Forward threading records 
starting at the first block, backward from the final block.  At some 
point (I no longer remember where) we invert the backwards threader's 
path to fit the model that the registry wanted and I think we then 
convert the path into the structure that the generic copier wants at 
some later point.


Yeah, that's in back_threader_registry::register_path, and then we go 
back and massage things again in back_jt_path_registry::update_cfg().
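
The direction mismatch described above can be pictured with a reduced model. This is only a sketch under the assumption that a path is a plain array of basic-block indices, which is not how GCC represents jump thread paths; `reverse_path` is a hypothetical helper, not a function in the threader.

```c
#include <stddef.h>

/* Reverse PATH in place.  Each element is a hypothetical basic-block
   index; a path recorded from the final block backwards becomes a
   path that starts at the first block.  */
static void
reverse_path (int *path, size_t len)
{
  if (len < 2)
    return;

  for (size_t lo = 0, hi = len - 1; lo < hi; ++lo, --hi)
    {
      int tmp = path[lo];
      path[lo] = path[hi];
      path[hi] = tmp;
    }
}
```

If both registries consumed paths in the order their threader records them, a conversion step like this one would not be needed, which is the simplification being discussed.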




In theory with the two better separated we might not need to do the 
conversions of the backwards threader's paths anymore.


Good point.  I'll think about it, as we need some coherent way of
keeping track of paths that we can share, as I'm about to add yet
another one in the jt_state business in tree-ssa-threadedge.h.


ps.  Don't forget about gcc.dg/torture/pr55107 regression I sent you 
yesterday.  It's showing up on a wide range of targets.


Yeah, thanks for mentioning it.  I'm putting it aside until Monday 
when I'm actually getting paid to chase tricky bugs :).

Fair enough :-)  Works for me.  I've got a way to disable the test
trivially until we sort out whether or not it's valid and/or useful.


jeff



Re: [PATCH] Come up with section_flag enum.

2021-09-13 Thread Martin Liška

PING^1

On 9/7/21 11:43, Martin Liška wrote:

Hi.

I'm planning some refactoring related to 'section *' and I noticed we have
quite ugly mask definitions (of the form 1UL << N), where SECTION_FORGET is
unused and

#define SECTION_STYLE_MASK 0x600000    /* bits used for SECTION_STYLE */

is actually the OR of two other values. What about making that a standard enum
with 1UL << N values?
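
As a rough illustration of the idea, here is a reduced sketch in which the style mask is derived from the two style values instead of being spelled as a separate constant. The `DEMO_` names and the `demo_section_style` helper are hypothetical; only the bit positions are taken from the proposed enum.

```c
/* Hypothetical reduced flag set modeled on the proposed enum.  */
enum demo_section_flag
{
  DEMO_UNNAMED = 0,
  DEMO_CODE = 1UL << 8,
  DEMO_WRITE = 1UL << 9,
  DEMO_NAMED = 1UL << 20,
  DEMO_NOSWITCH = 1UL << 21,

  /* Derived from the style values, so it cannot drift out of sync.  */
  DEMO_STYLE_MASK = DEMO_NAMED | DEMO_NOSWITCH
};

/* Extract the style bits from FLAGS.  */
static unsigned long
demo_section_style (unsigned long flags)
{
  return flags & DEMO_STYLE_MASK;
}
```

Because the mask is computed from its parts, moving a style bit (as happens here once SECTION_FORGET is dropped) updates the mask automatically.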

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

 * output.h (enum section_flag): New.
 (SECTION_FORGET): Remove.
 (SECTION_ENTSIZE): Make it (1UL << 8) - 1.
 (SECTION_STYLE_MASK): Define it based on other enum
 values.
 * varasm.c (switch_to_section): Remove unused handling of
 SECTION_FORGET.
---
  gcc/output.h | 85 +---
  gcc/varasm.c |  5 +---
  2 files changed, 48 insertions(+), 42 deletions(-)

diff --git a/gcc/output.h b/gcc/output.h
index 73ca4545f4f..8f6f15308f4 100644
--- a/gcc/output.h
+++ b/gcc/output.h
@@ -365,44 +365,53 @@ extern void default_function_switched_text_sections (FILE *, tree, bool);
  extern void no_asm_to_stream (FILE *);

  /* Flags controlling properties of a section.  */
-#define SECTION_ENTSIZE 0x000ff    /* entity size in section */
-#define SECTION_CODE 0x00100    /* contains code */
-#define SECTION_WRITE 0x00200    /* data is writable */
-#define SECTION_DEBUG 0x00400    /* contains debug data */
-#define SECTION_LINKONCE 0x00800    /* is linkonce */
-#define SECTION_SMALL 0x01000    /* contains "small data" */
-#define SECTION_BSS 0x02000    /* contains zeros only */
-#define SECTION_FORGET 0x04000    /* forget that we've entered the section */
-#define SECTION_MERGE 0x08000    /* contains mergeable data */
-#define SECTION_STRINGS  0x10000    /* contains zero terminated strings without
-   embedded zeros */
-#define SECTION_OVERRIDE 0x20000    /* allow override of default flags */
-#define SECTION_TLS 0x40000    /* contains thread-local storage */
-#define SECTION_NOTYPE 0x80000    /* don't output @progbits */
-#define SECTION_DECLARED 0x100000    /* section has been used */
-#define SECTION_STYLE_MASK 0x600000    /* bits used for SECTION_STYLE */
-#define SECTION_COMMON   0x800000    /* contains common data */
-#define SECTION_RELRO 0x1000000    /* data is readonly after relocation processing */
-#define SECTION_EXCLUDE  0x2000000    /* discarded by the linker */
-#define SECTION_RETAIN 0x4000000    /* retained by the linker.  */
-#define SECTION_LINK_ORDER 0x8000000    /* section needs link-order.  */
-
-/* NB: The maximum SECTION_MACH_DEP is 0x10000000 since AVR needs 4 bits
-   in SECTION_MACH_DEP.  */
-#define SECTION_MACH_DEP 0x10000000    /* subsequent bits reserved for target */
-
-/* This SECTION_STYLE is used for unnamed sections that we can switch
-   to using a special assembler directive.  */
-#define SECTION_UNNAMED 0x000000
-
-/* This SECTION_STYLE is used for named sections that we can switch
-   to using a general section directive.  */
-#define SECTION_NAMED 0x200000
-
-/* This SECTION_STYLE is used for sections that we cannot switch to at
-   all.  The choice of section is implied by the directive that we use
-   to declare the object.  */
-#define SECTION_NOSWITCH 0x400000
+enum section_flag
+{
+  /* This SECTION_STYLE is used for unnamed sections that we can switch
+ to using a special assembler directive.  */
+  SECTION_UNNAMED = 0,
+
+  SECTION_ENTSIZE = (1UL << 8) - 1,    /* entity size in section */
+  SECTION_CODE = 1UL << 8,    /* contains code */
+  SECTION_WRITE = 1UL << 9,    /* data is writable */
+
+  SECTION_DEBUG = 1UL << 10,    /* contains debug data */
+  SECTION_LINKONCE = 1UL << 11,    /* is linkonce */
+  SECTION_SMALL = 1UL << 12,    /* contains "small data" */
+  SECTION_BSS = 1UL << 13,    /* contains zeros only */
+  SECTION_MERGE = 1UL << 14,    /* contains mergeable data */
+  SECTION_STRINGS = 1UL << 15,    /* contains zero terminated strings
+   without embedded zeros */
+  SECTION_OVERRIDE = 1UL << 16,    /* allow override of default flags */
+  SECTION_TLS = 1UL << 17,    /* contains thread-local storage */
+  SECTION_NOTYPE = 1UL << 18,    /* don't output @progbits */
+  SECTION_DECLARED = 1UL << 19,    /* section has been used */
+
+  /* This SECTION_STYLE is used for named sections that we can switch
+ to using a general section directive.  */
+  SECTION_NAMED = 1UL << 20,
+
+  /* This SECTION_STYLE is used for sections that we cannot switch to at
+ all.  The choice of section is implied by the directive that we use
+ to declare the object.  */
+  SECTION_NOSWITCH = 1UL << 21,
+
+  /* bits used for SECTION_STYLE */
+  SECTION_STYLE_MASK = SECTION_NAMED | SECTION_NOSWITCH,
+
+  SECTION_COMMON = 1UL << 22,    /* contains 

Re: [PATCH] Remove references to FSM threads.

2021-09-13 Thread Jeff Law via Gcc-patches




On 9/13/2021 3:54 AM, Aldy Hernandez wrote:

Now that the jump thread back registry has been split into the generic
copier and the custom (old) copier, it becomes trivial to remove the
FSM bits from the jump threaders.

First, there's no need for an EDGE_FSM_THREAD type.  The only reason
we were looking at the threading type was to determine what type of
copier to use, and now that the copier has been split, there's no need
to even look.  However, there is one check in register_jump_thread
where we verify that only the generic copier can thread through
back-edges.  I've removed that check in favor of a flag passed to the
constructor.

I've also removed all the FSM references from the code and tests.
Interestingly, some tests weren't even testing the right thing.  They
were testing for "FSM" which would catch jump thread paths as well as
the backward threader *failing* on registering a path.  *big eye roll*

The only remaining code that was actually checking for EDGE_FSM_THREAD
was adjust_paths_after_duplication, and the checks could be written
without looking at the edge type at all.  For the record, the code
there is horrible: it's convoluted, hard to read, and doesn't have any
tests.  I'd smack myself if I could go back in time.

All that remains are the FSM references in the --param's themselves.
I think we should s/fsm/threader/, since I envision a day when we can
share the cost basis code between the threaders.  However, I don't
know what the proper procedure is for renaming existing compiler
options.

By the way, param_fsm_maximum_phi_arguments is no longer relevant
after the rewrite.  We can nuke that one right away.

Tested on x86-64 Linux.

OK?

gcc/ChangeLog:

* tree-ssa-threadbackward.c
(back_threader_profitability::profitable_path_p): Remove FSM
references.
(back_threader_registry::register_path): Same.
* tree-ssa-threadedge.c
(jump_threader::simplify_control_stmt_condition): Same.
* tree-ssa-threadupdate.c (jt_path_registry::jt_path_registry):
Add backedge_threads argument.
(fwd_jt_path_registry::fwd_jt_path_registry): Pass
backedge_threads argument.
(back_jt_path_registry::back_jt_path_registry):  Same.
(dump_jump_thread_path): Adjust for FSM removal.
(back_jt_path_registry::rewire_first_differing_edge): Same.
(back_jt_path_registry::adjust_paths_after_duplication): Same.
(back_jt_path_registry::update_cfg): Same.
(jt_path_registry::register_jump_thread): Same.
* tree-ssa-threadupdate.h (enum jump_thread_edge_type): Remove
EDGE_FSM_THREAD.
(class back_jt_path_registry): Add backedge_threads to
constructor.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr21417.c: Adjust for FSM removal.
* gcc.dg/tree-ssa/pr66752-3.c: Same.
* gcc.dg/tree-ssa/pr68198.c: Same.
* gcc.dg/tree-ssa/pr69196-1.c: Same.
* gcc.dg/tree-ssa/pr70232.c: Same.
* gcc.dg/tree-ssa/pr77445.c: Same.
* gcc.dg/tree-ssa/ranger-threader-4.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-18.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-6.c: Same.
* gcc.dg/tree-ssa/ssa-thread-12.c: Same.
* gcc.dg/tree-ssa/ssa-thread-13.c: Same.

OK
jeff



Re: [PATCHv5 00/18] Replace the Power target-specific builtin machinery

2021-09-13 Thread Bill Schmidt via Gcc-patches

Ping.

Message-Id: 

Thanks!
Bill

On 9/1/21 11:13 AM, Bill Schmidt via Gcc-patches wrote:

Hi!

Original patch series here:
https://gcc.gnu.org/pipermail/gcc-patches/2021-April/568840.html

V2 patch series here:
https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572231.html

V3 patch series here:
https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573020.html

V4 patch series here:
https://gcc.gnu.org/pipermail/gcc-patches/2021-July/576284.html

Thanks for all the reviews so far!  We're into the home stretch.  I needed
to rebase this series again in order to pick up some changes from upstream.

Patch 01/18 is a reposting of V4 patch 19/34, addressing some of the
comments.  Full refactoring of this stuff will be done later, after this
patch series can burn in a little.  This wasn't yet formally approved.

Patch 02/18 is new, and is a minor bug fix.

Patches 03/18 through 17/18 correspond to V4 patches 20/34 through 34/34.
These were adjusted for upstream changes, and I did some formatting
cleanups.  I also provided better descriptions for some of the patches.

Patch 18/18 is new, and improves the parser to handle escape-newline
input.  With that in place, it cleans up all the long lines in the
input files.

Bootstrapped and tested on powerpc64le-linux-gnu (P10) and
powerpc64-linux-gnu (32- and 64-bit, P8).  There are no regressions for
little endian.  There are a small handful of big-endian regressions that
have crept in, and I'll post patches for those after I work through them.
But no need to hold up reviews on the rest of this in the meantime.

Thanks again for all of the helpful reviews so far!

Bill


Bill Schmidt (18):
   rs6000: Handle overloads during program parsing
   rs6000: Move __builtin_mffsl to the [always] stanza
   rs6000: Handle gimple folding of target built-ins
   rs6000: Handle some recent MMA builtin changes
   rs6000: Support for vectorizing built-in functions
   rs6000: Builtin expansion, part 1
   rs6000: Builtin expansion, part 2
   rs6000: Builtin expansion, part 3
   rs6000: Builtin expansion, part 4
   rs6000: Builtin expansion, part 5
   rs6000: Builtin expansion, part 6
   rs6000: Update rs6000_builtin_decl
   rs6000: Miscellaneous uses of rs6000_builtins_decl_x
   rs6000: Debug support
   rs6000: Update altivec.h for automated interfaces
   rs6000: Test case adjustments
   rs6000: Enable the new builtin support
   rs6000: Add escape-newline support for builtins files

  gcc/config/rs6000/altivec.h   |  519 +--
  gcc/config/rs6000/rs6000-builtin-new.def  |  442 ++-
  gcc/config/rs6000/rs6000-c.c  | 1088 ++
  gcc/config/rs6000/rs6000-call.c   | 3132 +++--
  gcc/config/rs6000/rs6000-gen-builtins.c   |  312 +-
  gcc/config/rs6000/rs6000.c|  272 +-
  .../powerpc/bfp/scalar-extract-exp-2.c|2 +-
  .../powerpc/bfp/scalar-extract-sig-2.c|2 +-
  .../powerpc/bfp/scalar-insert-exp-2.c |2 +-
  .../powerpc/bfp/scalar-insert-exp-5.c |2 +-
  .../powerpc/bfp/scalar-insert-exp-8.c |2 +-
  .../powerpc/bfp/scalar-test-neg-2.c   |2 +-
  .../powerpc/bfp/scalar-test-neg-3.c   |2 +-
  .../powerpc/bfp/scalar-test-neg-5.c   |2 +-
  .../gcc.target/powerpc/byte-in-set-2.c|2 +-
  gcc/testsuite/gcc.target/powerpc/cmpb-2.c |2 +-
  gcc/testsuite/gcc.target/powerpc/cmpb32-2.c   |2 +-
  .../gcc.target/powerpc/crypto-builtin-2.c |   14 +-
  .../powerpc/fold-vec-splat-floatdouble.c  |4 +-
  .../powerpc/fold-vec-splat-longlong.c |   10 +-
  .../powerpc/fold-vec-splat-misc-invalid.c |8 +-
  .../gcc.target/powerpc/int_128bit-runnable.c  |6 +-
  .../gcc.target/powerpc/p8vector-builtin-8.c   |1 +
  gcc/testsuite/gcc.target/powerpc/pr80315-1.c  |2 +-
  gcc/testsuite/gcc.target/powerpc/pr80315-2.c  |2 +-
  gcc/testsuite/gcc.target/powerpc/pr80315-3.c  |2 +-
  gcc/testsuite/gcc.target/powerpc/pr80315-4.c  |2 +-
  gcc/testsuite/gcc.target/powerpc/pr88100.c|   12 +-
  .../gcc.target/powerpc/pragma_misc9.c |2 +-
  .../gcc.target/powerpc/pragma_power8.c|2 +
  .../gcc.target/powerpc/pragma_power9.c|3 +
  .../powerpc/test_fpscr_drn_builtin_error.c|4 +-
  .../powerpc/test_fpscr_rn_builtin_error.c |   12 +-
  gcc/testsuite/gcc.target/powerpc/test_mffsl.c |3 +-
  gcc/testsuite/gcc.target/powerpc/vec-gnb-2.c  |2 +-
  .../gcc.target/powerpc/vsu/vec-all-nez-7.c|2 +-
  .../gcc.target/powerpc/vsu/vec-any-eqz-7.c|2 +-
  .../gcc.target/powerpc/vsu/vec-cmpnez-7.c |2 +-
  .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-2.c |2 +-
  .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-2.c |2 +-
  .../gcc.target/powerpc/vsu/vec-xl-len-13.c|2 +-
  .../gcc.target/powerpc/vsu/vec-xst-len-12.c   |2 +-
  42 files changed, 4803 insertions(+), 1089 deletions(-)





Re: [PATCH] flag_complex_method: support optimize attribute

2021-09-13 Thread Martin Liška

PING^1

On 9/7/21 11:42, Martin Liška wrote:

On 9/6/21 14:16, Richard Biener wrote:

On Mon, Sep 6, 2021 at 1:46 PM Jakub Jelinek  wrote:


On Mon, Sep 06, 2021 at 01:37:46PM +0200, Martin Liška wrote:

--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -1323,6 +1323,14 @@ finish_options (struct gcc_options *opts, struct 
gcc_options *opts_set,
    = (opts->x_flag_unroll_loops
   || opts->x_flag_peel_loops
   || opts->x_optimize >= 3);
+
+  /* With -fcx-limited-range, we do cheap and quick complex arithmetic.  */
+  if (opts->x_flag_cx_limited_range)
+    flag_complex_method = 0;
+
+  /* With -fcx-fortran-rules, we do something in-between cheap and C99.  */
+  if (opts->x_flag_cx_fortran_rules)
+    flag_complex_method = 1;


That should then be opts->x_flag_complex_method instead of flag_complex_method.

Ok with that change.
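
To make the review comment concrete, here is a reduced sketch of the quoted hunk with the requested change applied, so the result is stored in the options structure passed in and not in a global. `struct demo_options` and `demo_finish_options` are hypothetical stand-ins for gcc_options and finish_options; the incoming value 2 stands for the language default mentioned later in the thread.

```c
/* Hypothetical reduced model of the option fields involved.  */
struct demo_options
{
  int flag_cx_limited_range;
  int flag_cx_fortran_rules;
  int flag_complex_method;
};

/* Derive the complex arithmetic method from the two flags, writing the
   result into OPTS itself.  The checks run in the same order as in the
   quoted hunk, so the Fortran rules win if both flags are set.  */
static void
demo_finish_options (struct demo_options *opts)
{
  /* Cheap and quick complex arithmetic.  */
  if (opts->flag_cx_limited_range)
    opts->flag_complex_method = 0;

  /* Something in between cheap and C99.  */
  if (opts->flag_cx_fortran_rules)
    opts->flag_complex_method = 1;
}
```

Writing through `opts` matters because the same routine is run for per-function option sets, where the global variable would be the wrong target. This sketch does not address the follow-up concern about the language hooks resetting the method.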


But the C/C++ langhooks also set flag_complex_method so I fail to see how
this helps?  As said I was referring to -fcx-limited-range on the command-line
and -fno-cx-limited-range in the optimize node to undo this which should
get you the langhook setting of flag_complex_method = 2.


You are right, it's even more complicated as -fno-cx-limited-range is target 
specific.
Option handling has been introducing surprises every time ...

The following tested patch should handle it.

Ready to be installed?
Thanks,
Martin




Note, I think we want to do much more in finish_options and less in
process_options, anything that is about Optimization options rather than
just the global ones.  Though one needs to be careful with the cases where
the code diagnoses something.

 Jakub





Re: Regression with recent change

2021-09-13 Thread Aldy Hernandez via Gcc-patches
Jeff has pointed out that after my change adding global ranges to the
path solver, torture/pr55107.c is failing.  Before I start digging
deep into the IL, I'd like to make sure this is not either expected or
a bogus test.

Compiling this test on x86 with -Wall yields:

$ gcc -c -O2 pr55107.c -Wall
pr55107.c: In function ‘f’:
pr55107.c:32:51: warning: the omitted middle operand in ‘?:’ will
always be ‘true’, suggest explicit middle operand [-Wparentheses]
   32 |   ((a = d) ? b = 0 : (**p ? : 1) != (d != 1 ? : (a = 0)))
!= (k ? a : 0)
  |   ^
pr55107.c:33:11: warning: suggest parentheses around comparison in
operand of ‘!=’ [-Wparentheses]
   32 |   ((a = d) ? b = 0 : (**p ? : 1) != (d != 1 ? : (a = 0)))
!= (k ? a : 0)
  |
  ~~~
   33 |   < (a *= c = k) && (**p = 0);
  |   ^~
pr55107.c:33:16: warning: operation on ‘a’ may be undefined [-Wsequence-point]
   33 |   < (a *= c = k) && (**p = 0);
  |^~
pr55107.c:33:16: warning: operation on ‘a’ may be undefined [-Wsequence-point]
pr55107.c:33:26: warning: value computed is not used [-Wunused-value]
   33 |   < (a *= c = k) && (**p = 0);
  |  ^~
pr55107.c:17:14: warning: ‘j’ is used uninitialized [-Wuninitialized]
   17 |   for (; *j; j++)
  |  ^~

So it looks like there's some undefined behavior going on, even before
my patch.  I'd like to get some feedback, because this is usually the
type of problems I see in the presence of a smarter threader... things
get shuffled around, problematic code gets isolated, and warning
passes have an easier time (or sometimes harder time) diagnosing
things.

Thanks.
Aldy

On Fri, Sep 10, 2021 at 10:31 PM Jeff Law  wrote:
>
> This change:
> 01b5038718056b024b370b74a874fbd92c5bbab3 is the first bad commit
> commit 01b5038718056b024b370b74a874fbd92c5bbab3
> Author: Aldy Hernandez 
> Date:   Thu Sep 9 20:30:28 2021 +0200
>
>  Disable threading through latches until after loop optimizations.
>
>  The motivation for this patch was enabling the use of global ranges in
>  the path solver, but this caused certain properties of loops to be
>  destroyed, which made subsequent loop optimizations fail.
>  Consequently, this patch's main goal is to disable jump threading
>  involving the latch until after loop optimizations have run.
> [ ... ]
>
>
> Is causing a regression on nds32le-elf -- perhaps others as well, this
> just happened to pop first in my tester.
>
> Tests that now fail, but worked before (4 tests):
>
> nds32-sim: gcc.dg/torture/pr55107.c   -O2  (test for excess errors)
> nds32-sim: gcc.dg/torture/pr55107.c   -O2  (test for excess errors)
> nds32-sim: gcc.dg/torture/pr55107.c   -O2 -flto -fno-use-linker-plugin
> -flto-partition=none  (test for excess errors)
> nds32-sim: gcc.dg/torture/pr55107.c   -O2 -flto -fno-use-linker-plugin
> -flto-partition=none  (test for excess errors)
>
> I suspect the underlying cause has
> a reasonable chance of causing problems on other ports.
>
>  From the log file:
>
> /home/jlaw/jenkins/workspace/nds32le-elf/gcc/gcc/testsuite/gcc.dg/torture/pr55107.c:
> In function 'f':
> /home/jlaw/jenkins/workspace/nds32le-elf/gcc/gcc/testsuite/gcc.dg/torture/pr55107.c:19:21:
> warning: iteration 2147483646 invokes undefined behavior
> [-Waggressive-loop-optimizations]
> /home/jlaw/jenkins/workspace/nds32le-elf/gcc/gcc/testsuite/gcc.dg/torture/pr55107.c:19:17:
> note: within this loop
>



Re: [PATCH] i386: support micro-levels in target{,_clone} attrs [PR101696]

2021-09-13 Thread Martin Liška

PING^1

On 8/13/21 15:41, H.J. Lu wrote:

On Fri, Aug 13, 2021 at 1:10 AM Martin Liška  wrote:


On 8/12/21 7:35 PM, H.J. Lu wrote:

What happens for arch=x86-64-v5?


pr101696.c:5:55: error: bad value (‘x86-64-v5’) for ‘target("arch=")’ attribute

  5 | __attribute__ ((target ("arch=x86-64-v5"))) void foo () {  __builtin_printf 
("arch=x86-64-v4\n"); }

|   ^

pr101696.c:5:55: note: valid arguments to ‘target("arch=")’ attribute are: 
nocona core2 nehalem corei7 westmere sandybridge corei7-avx ivybridge core-avx-i haswell 
core-avx2 broadwell skylake skylake-avx512 cannonlake icelake-client rocketlake 
icelake-server cascadelake tigerlake cooperlake sapphirerapids alderlake bonnell atom 
silvermont slm goldmont goldmont-plus tremont knl knm x86-64 x86-64-v2 x86-64-v3 
x86-64-v4 eden-x2 nano nano-1000 nano-2000 nano-3000 nano-x2 eden-x4 nano-x4 k8 k8-sse3 
opteron opteron-sse3 athlon64 athlon64-sse3 athlon-fx amdfam10 barcelona bdver1 bdver2 
bdver3 bdver4 znver1 znver2 znver3 btver1 btver2 native; did you mean ‘x86-64-v2’?


Which seems to me a reasonable error message.



The patch looks good to me.

Thanks.





Re: [PATCH][v2] Always default to DWARF2_DEBUG if not specified, warn about deprecated STABS

2021-09-13 Thread Jeff Law via Gcc-patches




On 9/13/2021 1:31 AM, Richard Biener wrote:

This makes defaults.h choose DWARF2_DEBUG if PREFERRED_DEBUGGING_TYPE
is not specified by the target and NO_DEBUG if DWARF is not supported.
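
The selection rule described above can be written down as a small decision function. This is a sketch only: the real logic lives in preprocessor conditionals in defaults.h, and the `demo_` names and the boolean parameters are hypothetical stand-ins for whether the target defines PREFERRED_DEBUGGING_TYPE and whether it supports DWARF.

```c
#include <stdbool.h>

/* Hypothetical reduced set of debug formats.  */
enum demo_debug_type
{
  DEMO_NO_DEBUG,
  DEMO_DWARF2_DEBUG,
  DEMO_DBX_DEBUG
};

/* Return the default debug format.  A target preference, when present,
   is used as is; otherwise DWARF is chosen if supported, and no debug
   output at all if it is not.  */
static enum demo_debug_type
demo_default_debug_type (bool target_has_preference,
			 enum demo_debug_type target_preference,
			 bool dwarf_supported)
{
  if (target_has_preference)
    return target_preference;
  return dwarf_supported ? DEMO_DWARF2_DEBUG : DEMO_NO_DEBUG;
}
```

Under this rule STABS is never picked implicitly, which matches the pdp11 example further down where plain -g reports that debug output is unsupported.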

It also makes us warn when STABS is enabled and removes the corresponding
diagnostic from the Ada frontend.  The warnings are pruned from the
testsuite output via prune_gcc_output.

This leaves the following targets without debug support:

  pdp11-*-*   pdp11 is a.out, dwarf support is difficult
  m68k*-*-openbsd*  it looks like this is a.out as well, at least it does
not pretend to support DWARF
  hppa[12]*-*-hpux10*  does seem to not support DWARF

I would probably argue that hpux10 should just be removed, along with
hpux 7-9 if they haven't been already.  It's the epitome of a dead platform.




  vax-*-openbsd*  seems to be a.out as well, does not support DWARF

behavior will be like


./cc1 -quiet t.c -g

cc1: warning: target system does not support debug output

./cc1 -quiet t.c -gstabs

t.c: warning: STABS debugging information is obsolete and not supported anymore

that is, -g is unsupported but -gstabs will generate STABS (the above
is for pdp11).  It would be nice if maintainers could confirm the above
listed configurations do not support DWARF and weigh in on whether
(apart from pdp11) the specific configurations can be obsoleted or
adjusted.  It looks like we do not have any openbsd maintainer.
I've discussed the situation for pdp11 with Paul already at some point
but we didn't reach any conclusion besides that it would be nice to
move pdp11 to ELF.


2021-09-10  Richard Biener  

gcc/
* defaults.h (PREFERRED_DEBUGGING_TYPE): Choose DWARF2_DEBUG
or NO_DEBUG.
* toplev.c (process_options): Warn when STABS debugging is
enabled.

gcc/ada/
* gcc-interface/misc.c (gnat_post_options): Do not warn
about DBX_DEBUG use here.

gcc/testsuite/
* lib/prune.exp: Prune STABS obsoletion message.

OK
jeff



Re: [PATCH] Remove m32r{,le}-*-linux* support from GCC

2021-09-13 Thread Jeff Law via Gcc-patches




On 9/13/2021 1:11 AM, apinski--- via Gcc-patches wrote:

From: Andrew Pinski 

m32r support never made it to glibc and the support for the Linux kernel
was removed with 4.18. This does not remove much, but there is no reason to
keep around a port which never worked or whose support in other
projects is gone.

OK? Checked to make sure m32r-linux and m32rle-linux were rejected
when building.

contrib/ChangeLog:

* config-list.mk: Remove m32r-linux and m32rle-linux
from the list.

gcc/ChangeLog:

* config.gcc: Add m32r-*-linux* and m32rle-*-linux*
to the Unsupported targets list.
Remove support for m32r-*-linux* and m32rle-*-linux*.
* config/m32r/linux.h: Removed.
* config/m32r/t-linux: Removed.

libgcc/ChangeLog:

* config.host: Remove m32r-*-linux* and m32rle-*-linux*.
* config/m32r/libgcc-glibc.ver: Removed.
* config/m32r/t-linux: Removed.

OK.
jeff



Re: [PATCH] Remove m68k-openbsd support

2021-09-13 Thread Jeff Law via Gcc-patches




On 9/13/2021 5:36 AM, Richard Biener wrote:

This removes m68k-openbsd as a valid configuration; according
to openbsd.org, m68k-openbsd [on the mac] was discontinued after
the 5.1 release.  The configuration is also not (or no longer)
supported by gas and GNU ld, so I could not figure out whether it is still
a.out (I suspect it is).  But first and foremost the target only supports
STABS as a debugging format.

OK for trunk?

Thanks,
Richard.

2021-09-13  Richard Biener  

* config.gcc: Remove m68k-openbsd.

contrib/
* config-list.mk: Remove m68k-openbsd.

OK.

Jeff



Re: [PATCH] Remove support for vax-openbsd

2021-09-13 Thread Jeff Law via Gcc-patches




On 9/13/2021 5:41 AM, Richard Biener via Gcc-patches wrote:

This removes the support for vax-openbsd which has been discontinued
after the OpenBSD 5.9 release and which has no supported gas or GNU ld
configuration [anymore].  In particular, this target only supports
STABS debuginfo generation.

OK for trunk?

Thanks,
Richard.

2021-09-13  Richard Biener  

* config.gcc: Remove vax-*-openbsd* configuration.

openbsd5.9 isn't crazy old.  But combined with "vax" and "stabs
removal", it's enough to get me to OK.


jeff



Re: More aggressive threading causing loop-interchange-9.c regression

2021-09-13 Thread Aldy Hernandez via Gcc-patches
On Mon, Sep 13, 2021 at 1:49 PM Christophe Lyon
 wrote:

> This last test now fails on aarch64:
> FAIL:  gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump thread3 "Jumps 
> threaded: 8"
>
> Can you check?

These rather large tests checking for some random number of jump
threads are very annoying.  In all my work in the past 2 years,
they've never found anything that the smaller threading tests (for
instance ssa-thread-14.c) couldn't find.

The thing is that the new backward threader is not a threader per se,
but a framework for solving paths.  It can find virtually any path in
the IL, depending on how good the solver or the path discovery bits
are.  So any change to a great number of components can alter the
number of threads.  For example, changes to at *least* the following
components could alter the number of threads:

* the path solver
* the path discovery bits in tree-ssa-threadbackward
* the profitability code in tree-ssa-threadbackward
* the post-registration bits in the low-level path registry
* range-ops
* ranger
* any pass altering global SSA_NAME_RANGE_INFO since the path solver
can pull that info (evrp, VRP, sprintf, PRE, strlen, loop-manip, etc.).

It is unreasonable to have to go through every assumed jump thread
candidate in these tests for every minor change.  Heck, are we even
sure there are supposed to be exactly 18 jump threads in the first
backward jump thread pass for this test?

I suggest we remove these tests.  They add a maintenance burden for
very little return.  And if we _really_ must test something in this
area, perhaps a unit test in the appropriate component would be more
useful.

For this particular test the IL is sufficiently different on aarch64,
such that the number of threads is different.  I'm just going to
disable this.  We already have to disable the dom3 and vrp2 jump
threading tests on this same test on aarch64.  This is more of the
same.  Committed to trunk.

I'd be happy to hear if anyone has other solutions.
Aldy
commit a7f59856ea8ea02d6f09d2e9e793ce2800ebbe4b
Author: Aldy Hernandez 
Date:   Mon Sep 13 14:25:15 2021 +0200

Adjust ssa-dom-thread-7.c on aarch64.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust for aarch64.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
index ba07942f9dd..e3d4b311c03 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
@@ -2,7 +2,7 @@
 /* { dg-options "-O2 -fdump-tree-thread1-stats -fdump-tree-thread2-stats -fdump-tree-dom2-stats -fdump-tree-thread3-stats -fdump-tree-dom3-stats -fdump-tree-vrp2-stats -fno-guess-branch-probability" } */
 
 /* { dg-final { scan-tree-dump "Jumps threaded: 18"  "thread1" } } */
-/* { dg-final { scan-tree-dump "Jumps threaded: 8" "thread3" } } */
+/* { dg-final { scan-tree-dump "Jumps threaded: 8" "thread3" { target { ! aarch64*-*-* } } } } */
 /* { dg-final { scan-tree-dump-not "Jumps threaded"  "dom2" } } */
 
 /* aarch64 has the highest CASE_VALUES_THRESHOLD in GCC.  It's high enough


[PATCH] libstdc++-v3: Optimize 'to_string' with numeric_limits instead of __to_chars_len

2021-09-13 Thread 刘可 via Gcc-patches
Hi!
GCC 5 implemented SSO (the small-string optimization).  The small-string
local buffer holds 15 characters, which is enough to store an integer.
So we can use 'numeric_limits::digits10 + 1' to get the maximum length of
an int instead of dynamically obtaining the length of the integer through
__to_chars_len.  With this change I get a performance improvement of
about 15%.
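
To make the size argument above concrete, here is a minimal standalone
sketch (not part of the patch) that compares the compile-time bound with
the actual digit count.  The helper count_digits is a hypothetical
stand-in for __to_chars_len, the names max_unsigned_digits and
unused_chars are made up for illustration, and the <unsigned> template
argument is an assumption, since the argument is not visible in the text
above.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for __detail::__to_chars_len: the number of
// decimal digits needed to print VALUE.
inline unsigned count_digits(unsigned value)
{
  unsigned n = 1;
  while (value >= 10)
    {
      value /= 10;
      ++n;
    }
  return n;
}

// Compile-time upper bound on the digit count of any unsigned value.
// digits10 counts the digits that always round-trip, so the largest
// value can need one more.
constexpr unsigned max_unsigned_digits
  = std::numeric_limits<unsigned>::digits10 + 1;

// Number of characters left unused when VALUE is printed into a buffer
// sized by the compile-time bound.
inline unsigned unused_chars(unsigned value)
{
  assert(count_digits(value) <= max_unsigned_digits);
  return max_unsigned_digits - count_digits(value);
}
```

The bound is exact only for the largest values; unused_chars(42), for
instance, is max_unsigned_digits - 2, so a string sized by the bound is
longer than the printed number unless it is trimmed afterwards.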

Before optimization:

Benchmark                Time          CPU   Iterations
-------------------------------------------------------
# to_string
Int2String          191785 ns    191780 ns         3645
# to_string
Unsigned2String     159605 ns    159599 ns         4367

After optimization:

Benchmark                Time          CPU   Iterations
-------------------------------------------------------
# to_string
Int2String          159382 ns    159381 ns         4354
# to_string
Unsigned2String     136744 ns    136742 ns         5144

2020-09-13 Liuke 

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h: Use
std::numeric_limits::digits10 instead of __to_chars_len.

Diff:
diff --git a/libstdc++-v3/include/bits/basic_string.h
b/libstdc++-v3/include/bits/basic_string.h
index b61fe05efcf..5cbec537b2f 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -39,6 +39,7 @@
 #include 
 #include 
 #include 
+#include 

 #if __cplusplus >= 201103L
 #include 
@@ -3721,7 +3722,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   {
 const bool __neg = __val < 0;
 const unsigned __uval = __neg ? (unsigned)~__val + 1u : __val;
-const auto __len = __detail::__to_chars_len(__uval);
+const auto __len = std::numeric_limits::digits10 + 1;
 string __str(__neg + __len, '-');
 __detail::__to_chars_10_impl(&__str[__neg], __len, __uval);
 return __str;
@@ -3730,7 +3731,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   inline string
   to_string(unsigned __val)
   {
-string __str(__detail::__to_chars_len(__val), '\0');
+string __str(std::numeric_limits::digits10 + 1;, '\0');
 __detail::__to_chars_10_impl(&__str[0], __str.size(), __val);
 return __str;
   }


Re: openmp: Implement OpenMP 5.1 atomics, so far for C only

2021-09-13 Thread Christophe Lyon via Gcc-patches
On Fri, Sep 10, 2021 at 8:47 PM Jakub Jelinek via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> Hi!
>
> This patch implements OpenMP 5.1 atomics (with clarifications from
> upcoming 5.2).
> The most important changes are that it is now possible to write (for C/C++,
> for Fortran it was possible before already) min/max atomics and more
> importantly
> compare and exchange in various forms.
> Also, acq_rel is now allowed on read/write and acq_rel/acquire are allowed
> on
> update, and there are new compare, weak and fail clauses.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.
>
> C++ support will follow next week hopefully.  Various new tests in
> c-c++-common are now with { target c }, that is temporary until the C++
> support is there.
>
> 2021-09-10  Jakub Jelinek  
>
> gcc/
> * tree-core.h (enum omp_memory_order): Add OMP_MEMORY_ORDER_MASK,
> OMP_FAIL_MEMORY_ORDER_UNSPECIFIED, OMP_FAIL_MEMORY_ORDER_RELAXED,
> OMP_FAIL_MEMORY_ORDER_ACQUIRE, OMP_FAIL_MEMORY_ORDER_RELEASE,
> OMP_FAIL_MEMORY_ORDER_ACQ_REL, OMP_FAIL_MEMORY_ORDER_SEQ_CST and
> OMP_FAIL_MEMORY_ORDER_MASK enumerators.
> (OMP_FAIL_MEMORY_ORDER_SHIFT): Define.
> * gimple-pretty-print.c (dump_gimple_omp_atomic_load,
> dump_gimple_omp_atomic_store): Print [weak] for weak atomic
> load/store.
> * gimple.h (enum gf_mask): Change GF_OMP_ATOMIC_MEMORY_ORDER
> to 6-bit mask, adjust GF_OMP_ATOMIC_NEED_VALUE value and add
> GF_OMP_ATOMIC_WEAK.
> (gimple_omp_atomic_weak_p, gimple_omp_atomic_set_weak): New inline
> functions.
> * tree.h (OMP_ATOMIC_WEAK): Define.
> * tree-pretty-print.c (dump_omp_atomic_memory_order): Adjust for
> fail memory order being encoded in the same enum and also print
> fail clause if present.
> (dump_generic_node): Print weak clause if OMP_ATOMIC_WEAK.
> * gimplify.c (goa_stabilize_expr): Add target_expr and rhs
> arguments,
> handle pre_p == NULL case as a test mode that only returns value
> but doesn't change gimplify nor change anything otherwise, adjust
> recursive calls, add MODIFY_EXPR, ADDR_EXPR, COND_EXPR, TARGET_EXPR
> and CALL_EXPR handling, adjust COMPOUND_EXPR handling for
> __builtin_clear_padding calls, for !rhs gimplify as lvalue rather
> than rvalue.
> (gimplify_omp_atomic): Adjust goa_stabilize_expr caller.  Handle
> COND_EXPR rhs.  Set weak flag on gimple load/store for
> OMP_ATOMIC_WEAK.
> * omp-expand.c (omp_memory_order_to_fail_memmodel): New function.
> (omp_memory_order_to_memmodel): Adjust for fail clause encoded
> in the same enum.
> (expand_omp_atomic_cas): New function.
> (expand_omp_atomic_pipeline): Use omp_memory_order_to_fail_memmodel
> function.
> (expand_omp_atomic): Attempt to optimize atomic compare and
> exchange
> using expand_omp_atomic_cas.
> gcc/c-family/
> * c-common.h (c_finish_omp_atomic): Add r and weak arguments.
> * c-omp.c: Include gimple-fold.h.
> (c_finish_omp_atomic): Add r and weak arguments.  Add support for
> OpenMP 5.1 atomics.
> gcc/c/
> * c-parser.c (c_parser_conditional_expression): If omp_atomic_lhs
> and
> cond.value is >, < or == with omp_atomic_lhs as one of the
> operands,
> don't call build_conditional_expr, instead build a COND_EXPR
> directly.
> (c_parser_binary_expression): Avoid calling parser_build_binary_op
> if omp_atomic_lhs even in more cases for >, < or ==.
> (c_parser_omp_atomic): Update function comment for OpenMP 5.1
> atomics,
> parse OpenMP 5.1 atomics and fail, compare and weak clauses, allow
> acq_rel on atomic read/write and acq_rel/acquire clauses on update.
> * c-typeck.c (build_binary_op): For flag_openmp only handle
> MIN_EXPR/MAX_EXPR.
> gcc/cp/
> * parser.c (cp_parser_omp_atomic): Allow acq_rel on atomic
> read/write
> and acq_rel/acquire clauses on update.
> * semantics.c (finish_omp_atomic): Adjust c_finish_omp_atomic
> caller.
> gcc/testsuite/
> * c-c++-common/gomp/atomic-17.c (foo): Add tests for atomic read,
> write or update with acq_rel clause and atomic update with acquire
> clause.
> * c-c++-common/gomp/atomic-18.c (foo): Adjust expected diagnostics
> wording, remove tests moved to atomic-17.c.
> * c-c++-common/gomp/atomic-21.c: Expect only 2 omp atomic release
> and
> 2 omp atomic acq_rel directives instead of 4 omp atomic release.
> * c-c++-common/gomp/atomic-25.c: New test.
> * c-c++-common/gomp/atomic-26.c: New test.
> * c-c++-common/gomp/atomic-27.c: New test.
> * c-c++-common/gomp/atomic-28.c: New test.
> * c-c++-common/gomp/atomic-29.c: New test.
> * 

Re: More aggressive threading causing loop-interchange-9.c regression

2021-09-13 Thread Christophe Lyon via Gcc-patches
On Fri, Sep 10, 2021 at 6:32 PM Jeff Law via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

>
>
> On 9/10/2021 7:53 AM, Aldy Hernandez via Gcc-patches wrote:
> >
> >
> > On 9/10/21 3:16 PM, Michael Matz wrote:
> >> Hi,
> >>
> >> On Fri, 10 Sep 2021, Aldy Hernandez via Gcc-patches wrote:
> >>
> >>>   }
> >>> +
> >>> +  /* Threading through a non-empty latch would cause code to be added
> >>
> >> "through an *empty* latch".  The test in code is correct, though.
> >
> > Whoops.
> >
> >>
> >> And for the before/after loops flag you added: we have a
> >> cfun->curr_properties field which can be used.  We even already have a
> >> PROP_loops flag but that is set throughout compilation from CFG
> >> construction until the RTL loop optimizers, so can't be re-used for what
> >> is needed here.  But you still could invent another PROP_ value
> >> instead of
> >> adding a new field in struct function.
> >
> > Oooo, even better.  No inline functions.
> >
> > Like this?
> > Aldy
> >
> > 0001-Disable-threading-through-latches-until-after-loop-o.patch
> >
> >  From ff25faa8dd8721da9bb4715706c662fc09fd4e8c Mon Sep 17 00:00:00 2001
> > From: Aldy Hernandez 
> > Date: Thu, 9 Sep 2021 20:30:28 +0200
> > Subject: [PATCH] Disable threading through latches until after loop
> >   optimizations.
> >
> > The motivation for this patch was enabling the use of global ranges in
> > the path solver, but this destroyed certain properties of loops,
> > which made subsequent loop optimizations fail.
> > Consequently, this patch's main goal is to disable jump threading
> > involving the latch until after loop optimizations have run.
> >
> > As can be seen in the test adjustments, we mostly shift the threading
> > from the early threaders (ethread, thread[12]) to the late threaders
> > (thread[34]).  I have nuked some of the early notes in the testcases
> > that came as part of the jump threader rewrite.  They're mostly noise
> > now.
> >
> > Note that we could probably relax some other restrictions in
> > profitable_path_p when loop optimizations have completed, but it would
> > require more testing, and I'm hesitant to touch more things than needed
> > at this point.  I have added a reminder to the function to keep this
> > in mind.
> >
> > Finally, perhaps as a follow-up, we should apply the same restrictions to
> > the forward threader.  At some point I'd like to combine the cost models.
> >
> > Tested on x86-64 Linux.
> >
> > p.s. There is a thorough discussion involving the limitations of jump
> > threading involving loops here:
> >
> >   https://gcc.gnu.org/pipermail/gcc/2021-September/237247.html
> >
> > gcc/ChangeLog:
> >
> >   * tree-pass.h (PROP_loop_opts_done): New.
> >   * gimple-range-path.cc (path_range_query::internal_range_of_expr):
> >   Intersect with global range.
> >   * tree-ssa-loop.c (tree_ssa_loop_done): Set PROP_loop_opts_done.
> >   * tree-ssa-threadbackward.c
> >   (back_threader_profitability::profitable_path_p): Disable
> >   threading through latches until after loop optimizations have run.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.dg/tree-ssa/ssa-dom-thread-2b.c: Adjust for disabling of
> >   threading through latches.
> >   * gcc.dg/tree-ssa/ssa-dom-thread-6.c: Same.
> >   * gcc.dg/tree-ssa/ssa-dom-thread-7.c: Same.
>

Hi,

This last test now fails on aarch64:
FAIL:  gcc.dg/tree-ssa/ssa-dom-thread-7.c scan-tree-dump thread3 "Jumps
threaded: 8"

Can you check?

Thanks,

Christophe

> OK
> jeff


Re: [PATCH 2/2] validate_subreg before call gen_lowpart to avoid ICE.

2021-09-13 Thread Richard Biener via Gcc-patches
On Mon, Sep 13, 2021 at 1:14 PM Hongtao Liu  wrote:
>
> On Mon, Sep 13, 2021 at 5:15 PM Richard Biener
>  wrote:
> >
> > On Mon, Sep 13, 2021 at 8:26 AM Hongtao Liu  wrote:
> > >
> > > On Mon, Sep 13, 2021 at 2:11 PM Richard Biener via Gcc-patches
> > >  wrote:
> > > >
> > > > On Fri, Sep 10, 2021 at 2:58 PM liuhongt  wrote:
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > > * expmed.c (extract_bit_field_using_extv): validate_subreg
> > > > > before call gen_lowpart.
> > > > > ---
> > > > >  gcc/expmed.c | 6 +-
> > > > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/gcc/expmed.c b/gcc/expmed.c
> > > > > index 3143f38e057..10d62d857a8 100644
> > > > > --- a/gcc/expmed.c
> > > > > +++ b/gcc/expmed.c
> > > > > @@ -1571,12 +1571,16 @@ extract_bit_field_using_extv (const 
> > > > > extraction_insn *extv, rtx op0,
> > > > >
> > > > >if (GET_MODE (target) != ext_mode)
> > > > >  {
> > > > > +  machine_mode tmode = GET_MODE (target);
> > > > >/* Don't use LHS paradoxical subreg if explicit truncation is 
> > > > > needed
> > > > >  between the mode of the extraction (word_mode) and the target
> > > > >  mode.  Instead, create a temporary and use convert_move to 
> > > > > set
> > > > >  the target.  */
> > > > >if (REG_P (target)
> > > > > - && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), 
> > > > > ext_mode))
> > > > > + && TRULY_NOOP_TRUNCATION_MODES_P (tmode, ext_mode)
> > > > > + && validate_subreg (ext_mode, tmode,
> > > > > + target,
> > > > > + subreg_lowpart_offset (ext_mode, 
> > > > > tmode)))
> > > > > {
> > > > >   target = gen_lowpart (ext_mode, target);
> > > >
> > > > That would be equivalent to use gen_lowpart_if_possible?
> > > No, target will be changed to NULL_RTX.
> > > But it does avoid ICE since maybe_expand_insn can legitimate operands,
> > > but I doubt it will introduce other bugs since the target has been
> > > changed here.
> > >
> > > I think the validate_subreg solution is plain and straightforward,
> > > just like it's done in
> > > r11-7515-g0ad6de3883a1641f7ec0bd9cf56d41fa5b313dae.
> >
> > That guards an explicit gen_rtx_SUBREG, here we're using gen_lowpart.
> > It's not an obvious match to validate gen_lowpart with validate_subreg,
> > I thought that gen_lowpart_if_possible would be prefered.  You obviously
> > have to adjust the code, like
> >
> >   rtx tem;
> >   if (...
> >   && (tem = gen_lowpart_if_possible (ext_mode, target))
> > {
> Yes, update patch
>
>   bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
>   Ok for trunk?

OK.

Thanks,
Richard.

> gcc/ChangeLog:
>
> * expmed.c (extract_bit_field_using_extv): Use
> gen_lowpart_if_possible instead of gen_lowpart to avoid ICE.
> ---
>  gcc/expmed.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/expmed.c b/gcc/expmed.c
> index 3143f38e057..59734d4841c 100644
> --- a/gcc/expmed.c
> +++ b/gcc/expmed.c
> @@ -1571,14 +1571,16 @@ extract_bit_field_using_extv (const
> extraction_insn *extv, rtx op0,
>
>if (GET_MODE (target) != ext_mode)
>  {
> +  rtx temp;
>/* Don't use LHS paradoxical subreg if explicit truncation is needed
>  between the mode of the extraction (word_mode) and the target
>  mode.  Instead, create a temporary and use convert_move to set
>  the target.  */
>if (REG_P (target)
> - && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode))
> + && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode)
> + && (temp = gen_lowpart_if_possible (ext_mode, target)))
> {
> - target = gen_lowpart (ext_mode, target);
> + target = temp;
>   if (partial_subreg_p (GET_MODE (spec_target), ext_mode))
> spec_target_subreg = target;
> }
> --
> 2.27.0
>
> >target = tem;
> > ...
> >
> > Richard.
> >
> > >
> > > >
> > > > >   if (partial_subreg_p (GET_MODE (spec_target), ext_mode))
> > > > > --
> > > > > 2.27.0
> > > > >
> > >
> > >
> > >
> > > --
> > > BR,
> > > Hongtao
>
>
>
> --
> BR,
> Hongtao


Re: [PATCH 2/2] validate_subreg before call gen_lowpart to avoid ICE.

2021-09-13 Thread Tobias Burnus

On 13.09.21 13:14, Hongtao Liu via Gcc-patches wrote:

Yes, update patch
   bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
   Ok for trunk?

If that patch gets approved, can you add PR bootstrap/102302 to the
commit changelog ?

cf. https://gcc.gnu.org/PR102302

Thanks,

Tobias


gcc/ChangeLog:

 * expmed.c (extract_bit_field_using_extv): Use
 gen_lowpart_if_possible instead of gen_lowpart to avoid ICE.
---
  gcc/expmed.c | 6 --
  1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/expmed.c b/gcc/expmed.c
index 3143f38e057..59734d4841c 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -1571,14 +1571,16 @@ extract_bit_field_using_extv (const
extraction_insn *extv, rtx op0,

if (GET_MODE (target) != ext_mode)
  {
+  rtx temp;
/* Don't use LHS paradoxical subreg if explicit truncation is needed
  between the mode of the extraction (word_mode) and the target
  mode.  Instead, create a temporary and use convert_move to set
  the target.  */
if (REG_P (target)
- && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode))
+ && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode)
+ && (temp = gen_lowpart_if_possible (ext_mode, target)))
 {
- target = gen_lowpart (ext_mode, target);
+ target = temp;
   if (partial_subreg_p (GET_MODE (spec_target), ext_mode))
 spec_target_subreg = target;
 }
--
2.27.0





Re: [PATCHv2] [aarch64] Fix target/95969: __builtin_aarch64_im_lane_boundsi interferes with gimple

2021-09-13 Thread Richard Sandiford via Gcc-patches
apinski--- via Gcc-patches  writes:
> From: Andrew Pinski 
>
> This patch adds simple folding of __builtin_aarch64_im_lane_boundsi where
> we are not going to error out. It fixes the problem by the removal
> of the function from the IR.
>
> OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.
>
> gcc/ChangeLog:
>
>   PR target/95969
>   * config/aarch64/aarch64-builtins.c (aarch64_fold_builtin_lane_check):
>   New function.
>   (aarch64_general_fold_builtin): Handle AARCH64_SIMD_BUILTIN_LANE_CHECK.
>   (aarch64_general_gimple_fold_builtin): Likewise.
>
> gcc/testsuite/ChangeLog:
>
>   PR target/95969
>   * gcc.target/aarch64/lane-bound-1.c: New test.
>   * gcc.target/aarch64/lane-bound-2.c: New test.

OK, thanks.  Sorry for the slow reply, was away last week.

Richard

> ---
>  gcc/config/aarch64/aarch64-builtins.c | 35 +++
>  .../gcc.target/aarch64/lane-bound-1.c | 14 
>  .../gcc.target/aarch64/lane-bound-2.c | 10 ++
>  3 files changed, 59 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/lane-bound-1.c
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/lane-bound-2.c
>
> diff --git a/gcc/config/aarch64/aarch64-builtins.c 
> b/gcc/config/aarch64/aarch64-builtins.c
> index eef9fc0f444..119f67d4e4c 100644
> --- a/gcc/config/aarch64/aarch64-builtins.c
> +++ b/gcc/config/aarch64/aarch64-builtins.c
> @@ -29,6 +29,7 @@
>  #include "rtl.h"
>  #include "tree.h"
>  #include "gimple.h"
> +#include "ssa.h"
>  #include "memmodel.h"
>  #include "tm_p.h"
>  #include "expmed.h"
> @@ -2333,6 +2334,27 @@ aarch64_general_builtin_rsqrt (unsigned int fn)
>return NULL_TREE;
>  }
>  
> +/* Return true if the lane check can be removed as there is no
> +   error going to be emitted.  */
> +static bool
> +aarch64_fold_builtin_lane_check (tree arg0, tree arg1, tree arg2)
> +{
> +  if (TREE_CODE (arg0) != INTEGER_CST)
> +return false;
> +  if (TREE_CODE (arg1) != INTEGER_CST)
> +return false;
> +  if (TREE_CODE (arg2) != INTEGER_CST)
> +return false;
> +
> +  auto totalsize = wi::to_widest (arg0);
> +  auto elementsize = wi::to_widest (arg1);
> +  if (totalsize == 0 || elementsize == 0)
> +return false;
> +  auto lane = wi::to_widest (arg2);
> +  auto high = wi::udiv_trunc (totalsize, elementsize);
> +  return wi::ltu_p (lane, high);
> +}
> +
>  #undef VAR1
>  #define VAR1(T, N, MAP, FLAG, A) \
>case AARCH64_SIMD_BUILTIN_##T##_##N##A:
> @@ -2353,6 +2375,11 @@ aarch64_general_fold_builtin (unsigned int fcode, tree 
> type,
>VAR1 (UNOP, floatv4si, 2, ALL, v4sf)
>VAR1 (UNOP, floatv2di, 2, ALL, v2df)
>   return fold_build1 (FLOAT_EXPR, type, args[0]);
> +  case AARCH64_SIMD_BUILTIN_LANE_CHECK:
> + gcc_assert (n_args == 3);
> + if (aarch64_fold_builtin_lane_check (args[0], args[1], args[2]))
> +   return void_node;
> + break;
>default:
>   break;
>  }
> @@ -2440,6 +2467,14 @@ aarch64_general_gimple_fold_builtin (unsigned int 
> fcode, gcall *stmt)
>   }
> break;
>   }
> +case AARCH64_SIMD_BUILTIN_LANE_CHECK:
> +  if (aarch64_fold_builtin_lane_check (args[0], args[1], args[2]))
> + {
> +   unlink_stmt_vdef (stmt);
> +   release_defs (stmt);
> +   new_stmt = gimple_build_nop ();
> + }
> +  break;
>  default:
>break;
>  }
> diff --git a/gcc/testsuite/gcc.target/aarch64/lane-bound-1.c 
> b/gcc/testsuite/gcc.target/aarch64/lane-bound-1.c
> new file mode 100644
> index 000..bbbe679fd80
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/lane-bound-1.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +#include 
> +
> +void
> +f (float32x4_t **ptr)
> +{
> +  float32x4_t res = vsetq_lane_f32 (0.0f, **ptr, 0);
> +  **ptr = res;
> +}
> +/* GCC should be able to remove the call to 
> "__builtin_aarch64_im_lane_boundsi"
> +   and optimize out the second load from *ptr.  */
> +/* { dg-final { scan-tree-dump-times "__builtin_aarch64_im_lane_boundsi" 0 
> "optimized" } } */
> +/* { dg-final { scan-tree-dump-times " = \\\*ptr_" 1 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/lane-bound-2.c 
> b/gcc/testsuite/gcc.target/aarch64/lane-bound-2.c
> new file mode 100644
> index 000..923c94687c6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/lane-bound-2.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-original" } */
> +void
> +f (void)
> +{
> +  __builtin_aarch64_im_lane_boundsi (16, 4, 0);
> +  __builtin_aarch64_im_lane_boundsi (8, 8, 0);
> +}
> +/* GCC should be able to optimize these out before gimplification. */
> +/* { dg-final { scan-tree-dump-times "__builtin_aarch64_im_lane_boundsi" 0 
> "original" } } */

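The folding condition in aarch64_fold_builtin_lane_check above can be
sketched outside GCC as follows.  This is a simplified illustration, not
the GCC code: it uses plain 64-bit integers instead of trees and wide
integers, std::optional models "this argument is not an INTEGER_CST", and
the names maybe_const and lane_check_foldable are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Models an argument that may or may not be a compile-time constant
// (an INTEGER_CST in the patch).
using maybe_const = std::optional<std::uint64_t>;

// Returns true when the lane check is known to pass, i.e. the call can
// be dropped because no error would be emitted.
inline bool lane_check_foldable(maybe_const total_size,
                                maybe_const element_size,
                                maybe_const lane)
{
  // All three arguments must be constants.
  if (!total_size || !element_size || !lane)
    return false;
  // Zero sizes are left for the normal error path.
  if (*total_size == 0 || *element_size == 0)
    return false;
  // The lane index must be below the number of lanes.
  std::uint64_t lanes = *total_size / *element_size;
  return *lane < lanes;
}
```

With the arguments used in lane-bound-2.c, lane_check_foldable(16, 4, 0)
and lane_check_foldable(8, 8, 0) both return true, while an out-of-range
lane such as (16, 4, 4) returns false and the call is kept.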

[PATCH] Remove support for vax-openbsd

2021-09-13 Thread Richard Biener via Gcc-patches
This removes the support for vax-openbsd, which has been discontinued
after the OpenBSD 5.9 release and which has no supported gas or GNU ld
configuration [anymore].  In particular, this target only supports
STABS debuginfo generation.

OK for trunk?

Thanks,
Richard.

2021-09-13  Richard Biener  

* config.gcc: Remove vax-*-openbsd* configuration.

contrib/
* config-list.mk: Remove vax-openbsd.
---
 contrib/config-list.mk | 2 +-
 gcc/config.gcc | 5 -
 2 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/contrib/config-list.mk b/contrib/config-list.mk
index 8b1a2c67b60..d1f290307ba 100644
--- a/contrib/config-list.mk
+++ b/contrib/config-list.mk
@@ -96,7 +96,7 @@ LIST = aarch64-elf aarch64-linux-gnu aarch64-rtems \
   tilegx-linux-gnuOPT-enable-obsolete tilegxbe-linux-gnuOPT-enable-obsolete \
   tilepro-linux-gnuOPT-enable-obsolete \
   v850e1-elf v850e-elf v850-elf v850-rtems vax-linux-gnu \
-  vax-netbsdelf vax-openbsd visium-elf x86_64-apple-darwin \
+  vax-netbsdelf visium-elf x86_64-apple-darwin \
   x86_64-pc-linux-gnuOPT-with-fpmath=avx \
   x86_64-elfOPT-with-fpmath=sse x86_64-freebsd6 x86_64-netbsd \
   x86_64-w64-mingw32 \
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 5e38803d275..856c6071176 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -3525,11 +3525,6 @@ vax-*-netbsdelf*)
extra_options="${extra_options} netbsd.opt netbsd-elf.opt vax/elf.opt"
tm_defines="${tm_defines} CHAR_FAST8=1 SHORT_FAST16=1"
;;
-vax-*-openbsd*)
-   tm_file="vax/vax.h vax/openbsd1.h openbsd.h openbsd-stdint.h 
openbsd-libpthread.h vax/openbsd.h"
-   extra_options="${extra_options} openbsd.opt"
-   use_collect2=yes
-   ;;
 visium-*-elf*)
tm_file="dbxelf.h elfos.h ${tm_file} visium/elf.h newlib-stdint.h"
tmake_file="visium/t-visium visium/t-crtstuff"
-- 
2.31.1


[PATCH] Remove m68k-openbsd support

2021-09-13 Thread Richard Biener via Gcc-patches
This removes m68k-openbsd as a valid configuration.  According
to openbsd.org, m68k-openbsd [on the mac] was discontinued after
the 5.1 release.  The configuration is also not (or no longer)
supported by gas and GNU ld, so I could not figure out whether it is
still a.out (I suspect it is).  But first and foremost the target only
supports STABS as a debugging format.

OK for trunk?

Thanks,
Richard.

2021-09-13  Richard Biener  

* config.gcc: Remove m68k-openbsd.

contrib/
* config-list.mk: Remove m68k-openbsd.
---
 contrib/config-list.mk |  2 +-
 gcc/config.gcc | 12 +---
 2 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/contrib/config-list.mk b/contrib/config-list.mk
index b493e69f5d6..8b1a2c67b60 100644
--- a/contrib/config-list.mk
+++ b/contrib/config-list.mk
@@ -58,7 +58,7 @@ LIST = aarch64-elf aarch64-linux-gnu aarch64-rtems \
   ia64-freebsd6 ia64-linux ia64-hpux ia64-hp-vms iq2000-elf lm32-elf \
   lm32-rtems lm32-uclinux m32c-rtems m32c-elf m32r-elf m32rle-elf \
   m32r-linux m32rle-linux m68k-elf m68k-netbsdelf \
-  m68k-openbsd m68k-uclinux m68k-linux m68k-rtems \
+  m68k-uclinux m68k-linux m68k-rtems \
   mcore-elf microblaze-linux microblaze-elf \
   mips-netbsd \
   mips64el-st-linux-gnu mips64octeon-linux mipsisa64r2-linux \
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 84de1a3f691..5e38803d275 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -277,6 +277,7 @@ case ${target} in
  | crisv32-*-* \
  | i[34567]86-go32-*   \
  | i[34567]86-*-go32*  \
+ | m68k*-*-openbsd*\
  | m68k-*-uclinuxoldabi*   \
  | mips64orion*-*-rtems*   \
  | pdp11-*-bsd \
@@ -2356,17 +2357,6 @@ m68k*-*-netbsdelf*)
extra_options="${extra_options} netbsd.opt netbsd-elf.opt"
tm_defines="${tm_defines} MOTOROLA=1 CHAR_FAST8=1 SHORT_FAST16=1"
;;
-m68k*-*-openbsd*)
-   default_m68k_cpu=68020
-   default_cf_cpu=5475
-   # needed to unconfuse gdb
-   tm_defines="${tm_defines} OBSD_OLD_GAS"
-   tm_file="${tm_file} openbsd.h openbsd-stdint.h openbsd-libpthread.h 
m68k/openbsd.h"
-   extra_options="${extra_options} openbsd.opt"
-   tmake_file="t-openbsd m68k/t-openbsd"
-   # we need collect2 until our bug is fixed...
-   use_collect2=yes
-   ;;
 m68k-*-uclinux*)   # Motorola m68k/ColdFire running uClinux
# with uClibc, using the new GNU/Linux-style
# ABI.
-- 
2.31.1


[COMMITTED] Move pointer_equiv_analyzer to new file.

2021-09-13 Thread Aldy Hernandez via Gcc-patches
We need to use the pointer equivalence tracking from evrp in the jump
threader.  Instead of moving it to some *evrp.h header, it's cleaner for
it to live in its own file, since it's completely independent and not
evrp specific.

Tested on x86-64 Linux.

gcc/ChangeLog:

* Makefile.in (OBJS): Add value-pointer-equiv.o.
* gimple-ssa-evrp.c (class ssa_equiv_stack): Move to
value-pointer-equiv.*.
(ssa_equiv_stack::ssa_equiv_stack): Same.
(ssa_equiv_stack::enter): Same.
(ssa_equiv_stack::leave): Same.
(ssa_equiv_stack::push_replacement): Same.
(ssa_equiv_stack::get_replacement): Same.
(is_pointer_ssa): Same.
(class pointer_equiv_analyzer): Same.
(pointer_equiv_analyzer::pointer_equiv_analyzer): Same.
(pointer_equiv_analyzer::~pointer_equiv_analyzer): Same.
(pointer_equiv_analyzer::set_global_equiv): Same.
(pointer_equiv_analyzer::set_cond_equiv): Same.
(pointer_equiv_analyzer::get_equiv): Same.
(pointer_equiv_analyzer::enter): Same.
(pointer_equiv_analyzer::leave): Same.
(pointer_equiv_analyzer::get_equiv_expr): Same.
(pta_valueize): Same.
(pointer_equiv_analyzer::visit_stmt): Same.
(pointer_equiv_analyzer::visit_edge): Same.
(hybrid_folder::value_of_expr): Same.
(hybrid_folder::value_on_edge): Same.
* value-pointer-equiv.cc: New file.
* value-pointer-equiv.h: New file.
---
 gcc/Makefile.in|   1 +
 gcc/gimple-ssa-evrp.c  | 302 +
 gcc/value-pointer-equiv.cc | 302 +
 gcc/value-pointer-equiv.h  |  62 
 4 files changed, 370 insertions(+), 297 deletions(-)
 create mode 100644 gcc/value-pointer-equiv.cc
 create mode 100644 gcc/value-pointer-equiv.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index f0c560fe45b..f3877128524 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1698,6 +1698,7 @@ OBJS = \
typed-splay-tree.o \
unique-ptr-tests.o \
valtrack.o \
+   value-pointer-equiv.o \
value-query.o \
value-range.o \
value-range-equiv.o \
diff --git a/gcc/gimple-ssa-evrp.c b/gcc/gimple-ssa-evrp.c
index 61de5013d6d..254542ef4cc 100644
--- a/gcc/gimple-ssa-evrp.c
+++ b/gcc/gimple-ssa-evrp.c
@@ -43,299 +43,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-ssa-evrp-analyze.h"
 #include "gimple-range.h"
 #include "fold-const.h"
-
-// Unwindable SSA equivalence table for pointers.
-//
-// The main query point is get_replacement() which returns what a
-// given SSA can be replaced with in the current scope.
-
-class ssa_equiv_stack
-{
-public:
-  ssa_equiv_stack ();
-  void enter (basic_block);
-  void leave (basic_block);
-  void push_replacement (tree name, tree replacement);
-  tree get_replacement (tree name) const;
-
-private:
-  auto_vec> m_stack;
-  auto_vec m_replacements;
-  const std::pair  m_marker = std::make_pair (NULL, NULL);
-};
-
-ssa_equiv_stack::ssa_equiv_stack ()
-{
-  m_replacements.safe_grow_cleared (num_ssa_names);
-}
-
-// Pushes a marker at the given point.
-
-void
-ssa_equiv_stack::enter (basic_block)
-{
-  m_stack.safe_push (m_marker);
-}
-
-// Pops the stack to the last marker, while performing replacements
-// along the way.
-
-void
-ssa_equiv_stack::leave (basic_block)
-{
-  gcc_checking_assert (!m_stack.is_empty ());
-  while (m_stack.last () != m_marker)
-{
-  std::pair e = m_stack.pop ();
-  m_replacements[SSA_NAME_VERSION (e.first)] = e.second;
-}
-  m_stack.pop ();
-}
-
-// Set the equivalence of NAME to REPLACEMENT.
-
-void
-ssa_equiv_stack::push_replacement (tree name, tree replacement)
-{
-  tree old = m_replacements[SSA_NAME_VERSION (name)];
-  m_replacements[SSA_NAME_VERSION (name)] = replacement;
-  m_stack.safe_push (std::make_pair (name, old));
-}
-
-// Return the equivalence of NAME.
-
-tree
-ssa_equiv_stack::get_replacement (tree name) const
-{
-  return m_replacements[SSA_NAME_VERSION (name)];
-}
-
-// Return TRUE if EXPR is an SSA holding a pointer.
-
-static bool inline
-is_pointer_ssa (tree expr)
-{
-  return TREE_CODE (expr) == SSA_NAME && POINTER_TYPE_P (TREE_TYPE (expr));
-}
-
-// Simple context-aware pointer equivalency analyzer that returns what
-// a pointer SSA name is equivalent to at a given point during a walk
-// of the IL.
-//
-// Note that global equivalency take priority over conditional
-// equivalency.  That is, p =  takes priority over a later p == 
-//
-// This class is meant to be called during a DOM walk.
-
-class pointer_equiv_analyzer
-{
-public:
-  pointer_equiv_analyzer (gimple_ranger *r);
-  ~pointer_equiv_analyzer ();
-  void enter (basic_block);
-  void leave (basic_block);
-  void visit_stmt (gimple *stmt);
-  tree get_equiv (tree ssa) const;
-
-private:
-  void visit_edge (edge e);
-  tree get_equiv_expr (tree_code code, tree expr) const;
-  void set_global_equiv 

Re: [PATCH 2/2] validate_subreg before call gen_lowpart to avoid ICE.

2021-09-13 Thread Hongtao Liu via Gcc-patches
On Mon, Sep 13, 2021 at 5:15 PM Richard Biener wrote:
>
> On Mon, Sep 13, 2021 at 8:26 AM Hongtao Liu  wrote:
> >
> > On Mon, Sep 13, 2021 at 2:11 PM Richard Biener via Gcc-patches
> >  wrote:
> > >
> > > On Fri, Sep 10, 2021 at 2:58 PM liuhongt  wrote:
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > * expmed.c (extract_bit_field_using_extv): validate_subreg
> > > > before call gen_lowpart.
> > > > ---
> > > >  gcc/expmed.c | 6 +-
> > > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/gcc/expmed.c b/gcc/expmed.c
> > > > index 3143f38e057..10d62d857a8 100644
> > > > --- a/gcc/expmed.c
> > > > +++ b/gcc/expmed.c
> > > > @@ -1571,12 +1571,16 @@ extract_bit_field_using_extv (const 
> > > > extraction_insn *extv, rtx op0,
> > > >
> > > >if (GET_MODE (target) != ext_mode)
> > > >  {
> > > > +  machine_mode tmode = GET_MODE (target);
> > > >/* Don't use LHS paradoxical subreg if explicit truncation is 
> > > > needed
> > > >  between the mode of the extraction (word_mode) and the target
> > > >  mode.  Instead, create a temporary and use convert_move to set
> > > >  the target.  */
> > > >if (REG_P (target)
> > > > - && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), 
> > > > ext_mode))
> > > > + && TRULY_NOOP_TRUNCATION_MODES_P (tmode, ext_mode)
> > > > + && validate_subreg (ext_mode, tmode,
> > > > + target,
> > > > + subreg_lowpart_offset (ext_mode, tmode)))
> > > > {
> > > >   target = gen_lowpart (ext_mode, target);
> > >
> > > That would be equivalent to use gen_lowpart_if_possible?
> > No, target will be changed to NULL_RTX.
> > But it does avoid ICE since maybe_expand_insn can legitimate operands,
> > but I doubt it will introduce other bugs since the target has been
> > changed here.
> >
> > I think the validate_subreg solution is plain and straightforward,
> > just like it's done in
> > r11-7515-g0ad6de3883a1641f7ec0bd9cf56d41fa5b313dae.
>
> That guards an explicit gen_rtx_SUBREG, here we're using gen_lowpart.
> It's not an obvious match to validate gen_lowpart with validate_subreg,
> I thought that gen_lowpart_if_possible would be preferred.  You obviously
> have to adjust the code, like
>
>   rtx tem;
>   if (...
>   && (tem = gen_lowpart_if_possible (ext_mode, target))
> {
Yes, here is the updated patch.

Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

* expmed.c (extract_bit_field_using_extv): Use
gen_lowpart_if_possible instead of gen_lowpart to avoid ICE.
---
 gcc/expmed.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/expmed.c b/gcc/expmed.c
index 3143f38e057..59734d4841c 100644
--- a/gcc/expmed.c
+++ b/gcc/expmed.c
@@ -1571,14 +1571,16 @@ extract_bit_field_using_extv (const
extraction_insn *extv, rtx op0,

   if (GET_MODE (target) != ext_mode)
 {
+  rtx temp;
   /* Don't use LHS paradoxical subreg if explicit truncation is needed
 between the mode of the extraction (word_mode) and the target
 mode.  Instead, create a temporary and use convert_move to set
 the target.  */
   if (REG_P (target)
- && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode))
+ && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode)
+ && (temp = gen_lowpart_if_possible (ext_mode, target)))
{
- target = gen_lowpart (ext_mode, target);
+ target = temp;
  if (partial_subreg_p (GET_MODE (spec_target), ext_mode))
spec_target_subreg = target;
}
--
2.27.0

>target = tem;
> ...
>
> Richard.
>
> >
> > >
> > > >   if (partial_subreg_p (GET_MODE (spec_target), ext_mode))
> > > > --
> > > > 2.27.0
> > > >
> >
> >
> >
> > --
> > BR,
> > Hongtao



-- 
BR,
Hongtao


Re: [PATCH] aarch64: PR target/102252 Invalid addressing mode for SVE load predicate

2021-09-13 Thread Richard Sandiford via Gcc-patches
Kyrylo Tkachov writes:
> Hi all,
>
> In the testcase we generate invalid assembly for an SVE load predicate 
> instruction.
> The RTL for the insn is:
> (insn 9 8 10 (set (reg:VNx16BI 68 p0)
> (mem:VNx16BI (plus:DI (mult:DI (reg:DI 1 x1 [93])
> (const_int 8 [0x8]))
> (reg/f:DI 0 x0 [92])) [2 work_3(D)->array[offset_4(D)]+0 S8 
> A16]))
>
> That addressing mode is not valid for the instruction [1] as it only accepts 
> the addressing mode:
> [<Xn|SP>{, #<imm>, MUL VL}]
>
> This patch rejects the register index form for SVE predicate modes.
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
>
> Ok for trunk?
> Thanks,
> Kyrill
>
> [1] 
> https://developer.arm.com/documentation/ddi0602/2021-06/SVE-Instructions/LDR--predicate---Load-predicate-register-
>
> gcc/ChangeLog:
>
> PR target/102252
> * config/aarch64/aarch64.c (aarch64_classify_address): Don't allow
> register index for SVE predicate modes.
>
> gcc/testsuite/ChangeLog:
>
> PR target/102252
> * g++.target/aarch64/sve/pr102252.C: New test.
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 
> e37922db0007e3b4b559cda65f135247f4fb1b9f..e6253edeb55cdcc3dbc7303e03bad26dd519c4b1
>  100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -9770,7 +9770,7 @@ aarch64_classify_address (struct aarch64_address_info 
> *info,
>   || mode == TImode
>   || mode == TFmode
>   || (BYTES_BIG_ENDIAN && advsimd_struct_p));
> -
> +  bool sve_pred_p = (vec_flags & VEC_SVE_PRED) != 0;
>/* If we are dealing with ADDR_QUERY_LDP_STP_N that means the incoming mode
>   corresponds to the actual size of the memory being loaded/stored and the
>   mode of the corresponding addressing mode is half of that.  */
> @@ -9779,12 +9779,14 @@ aarch64_classify_address (struct aarch64_address_info 
> *info,
>  mode = DFmode;
>  
>bool allow_reg_index_p = (!load_store_pair_p
> + && !sve_pred_p
>   && (known_lt (GET_MODE_SIZE (mode), 16)
>   || vec_flags == VEC_ADVSIMD
>   || vec_flags & VEC_SVE_DATA));

I think the known_lt (GET_MODE_SIZE (mode), 16) is really there for
non-vector cases, with the ||s enumerating the valid vector cases.
So how about:

  bool allow_reg_index_p = (!load_store_pair_p
&& ((vec_flags == 0
 && known_lt (GET_MODE_SIZE (mode), 16))
|| vec_flags == VEC_ADVSIMD
|| vec_flags & VEC_SVE_DATA));

instead?  OK with that change from my POV.
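
A minimal standalone sketch of why restricting the size test to non-vector modes is enough to reject predicate modes, with no separate sve_pred_p flag. The flag values, the plain integer comparison standing in for known_lt on GET_MODE_SIZE, and the function names are assumptions for illustration only; they are not the definitions in aarch64.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the aarch64 vector classification flags.
   The real values are defined in aarch64.c and may differ.  */
enum { VEC_ADVSIMD = 1, VEC_SVE_DATA = 2, VEC_SVE_PRED = 4 };

/* Condition before the fix: any mode smaller than 16 bytes qualifies,
   which also lets small SVE predicate modes through.  */
bool
allow_reg_index_old (bool load_store_pair_p, unsigned vec_flags,
                     unsigned size_in_bytes)
{
  return (!load_store_pair_p
          && (size_in_bytes < 16
              || vec_flags == VEC_ADVSIMD
              || (vec_flags & VEC_SVE_DATA) != 0));
}

/* Suggested condition: the size test applies to non-vector modes only,
   and the valid vector cases are enumerated explicitly.  */
bool
allow_reg_index_new (bool load_store_pair_p, unsigned vec_flags,
                     unsigned size_in_bytes)
{
  return (!load_store_pair_p
          && ((vec_flags == 0 && size_in_bytes < 16)
              || vec_flags == VEC_ADVSIMD
              || (vec_flags & VEC_SVE_DATA) != 0));
}

/* Usage example covering the cases discussed in the thread.  */
void
check_reg_index_examples (void)
{
  /* An 8-byte SVE predicate access (cf. the S8 access in the RTL above):
     accepted by the old condition, rejected by the new one.  */
  assert (allow_reg_index_old (false, VEC_SVE_PRED, 8));
  assert (!allow_reg_index_new (false, VEC_SVE_PRED, 8));

  /* Small scalars, Advanced SIMD and SVE data modes still qualify.  */
  assert (allow_reg_index_new (false, 0, 8));
  assert (allow_reg_index_new (false, VEC_ADVSIMD, 16));
  assert (allow_reg_index_new (false, VEC_SVE_DATA, 64));

  /* Load/store pairs never allow a register index.  */
  assert (!allow_reg_index_new (true, 0, 8));
}
```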

Thanks,
Richard

>  
> -  /* For SVE, only accept [Rn], [Rn, Rm, LSL #shift] and
> - [Rn, #offset, MUL VL].  */
> +  /* For SVE, only accept [Rn], [Rn, #offset, MUL VL] and [Rn, Rm, LSL 
> #shift].
> + The latter is not valid for SVE predicates, and that's rejected through
> + allow_reg_index_p above.  */
>if ((vec_flags & (VEC_SVE_DATA | VEC_SVE_PRED)) != 0
>&& (code != REG && code != PLUS))
>  return false;
> diff --git a/gcc/testsuite/g++.target/aarch64/sve/pr102252.C 
> b/gcc/testsuite/g++.target/aarch64/sve/pr102252.C
> new file mode 100644
> index 
> ..f90f1218555f0dfdb0253fe83c656ba03b1aac43
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/aarch64/sve/pr102252.C
> @@ -0,0 +1,37 @@
> +/* PR target/102252.  */
> +/* { dg-do assemble { target aarch64_asm_sve_ok } } */
> +/* { dg-options "-march=armv8.2-a+sve -msve-vector-bits=512" } */
> +
> +/* We used to generate invalid assembly for SVE predicate loads.  */
> +
> +#include <arm_sve.h>
> +
> +class SimdBool
> +{
> +private:
> +typedef svbool_t simdInternalType_ 
> __attribute__((arm_sve_vector_bits(512)));
> +
> +public:
> +SimdBool() {}
> +
> +simdInternalType_ simdInternal_;
> +
> +};
> +
> +static svfloat32_t selectByMask(svfloat32_t a, SimdBool m) {
> +return svsel_f32(m.simdInternal_, a, svdup_f32(0.0));
> +}
> +
> +struct s {
> +SimdBool array[1];
> +};
> +
> +
> +
> +void foo(struct s* const work, int offset)
> +{
> +svfloat32_t tz_S0;
> +
> +tz_S0 = selectByMask(tz_S0, work->array[offset]);
> +}
> +


Re: [PATCH] PR c/102245: Don't warn that ((_Bool)x<<0) isn't a truthvalue.

2021-09-13 Thread Jakub Jelinek via Gcc-patches
On Mon, Sep 13, 2021 at 11:42:08AM +0100, Roger Sayle wrote:
> gcc/c-family/ChangeLog
>   PR c/102245
>   * c-common.c (c_common_truthvalue_conversion) [LSHIFT_EXPR]:
>   Special case (optimize) shifts by zero.
> 
> gcc/testsuite/ChangeLog
>   PR c/102245
>   * gcc.dg/Wint-in-bool-context-4.c: New test case.
> 
> Roger
> --
> 

> diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
> index 017e415..44b5fcc 100644
> --- a/gcc/c-family/c-common.c
> +++ b/gcc/c-family/c-common.c
> @@ -3541,6 +3541,10 @@ c_common_truthvalue_conversion (location_t location, 
> tree expr)
>break;
>  
>  case LSHIFT_EXPR:
> +  /* Treat shifts by zero as a special case.  */
> +  if (integer_zerop (TREE_OPERAND (expr, 1)))
> + return c_common_truthvalue_conversion (location,
> +TREE_OPERAND (expr, 0));
>/* We will only warn on signed shifts here, because the majority of
>false positive warnings happen in code where unsigned arithmetic
>was used in anticipation of a possible overflow.

> /* PR c/102245 */
> /* { dg-options "-Wint-in-bool-context" } */
> /* { dg-do compile } */
> 
> _Bool test1(_Bool x)
> {
>   return !(x << 0);  /* { dg-bogus "boolean context" } */
> }

While this exact case is unlikely to be a misspelling of !(x < 0), as
no _Bool is less than zero and hopefully we get a warning for
!(x < 0), what about
_Bool test1a(int x)
{
  return !(x << 0);
}
?  I think there is a non-zero chance this was meant to be !(x < 0)
and the current
pr102245.c: In function ‘test1a’:
pr102245.c:3:14: warning: ‘<<’ in boolean context, did you mean ‘<’? [-Wint-in-bool-context]
3 |   return !(x << 0);
  |   ~~~^
warning seems to be useful.
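
To make the concern concrete, here is a small sketch (the function names are hypothetical) showing that for an int operand the shifted form and the comparison it may have been meant to be give different answers. The shift goes through a temporary only so that the sketch itself compiles without the warning under discussion.

```c
#include <assert.h>
#include <stdbool.h>

/* What was written: !(x << 0).  A shift by zero leaves x unchanged,
   so this is just !x.  */
bool
not_shifted (int x)
{
  int shifted = x << 0;
  return !shifted;
}

/* What may have been intended: !(x < 0).  */
bool
not_negative (int x)
{
  return !(x < 0);
}

/* Usage example: the two agree for zero but differ for positive x.  */
void
check_shift_versus_compare (void)
{
  assert (not_shifted (0) && not_negative (0));
  assert (!not_shifted (5));
  assert (not_negative (5));
}
```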

Jakub



[PATCH] PR c/102245: Don't warn that ((_Bool)x<<0) isn't a truthvalue.

2021-09-13 Thread Roger Sayle

If the tree expression X is a truthvalue, then X << 0 is a truthvalue.
In fact, because _Bool (truthvalue_type) has 1 bit precision, and shifts
are only well defined for bit counts less than the precision, the only
reasonable(?) left shift of a _Bool is by zero [where "reasonable"
overlooks that shifts by zero should be optimized away as no-ops].

Now consider a language front-end that doesn't fold binary expressions,
hence retains (x<<0), but does fold type conversions, and can therefore
see that ((_Bool)x<<0) can be shortened to _Bool, but then warns that
any LSHIFT_EXPR in a boolean context is suspicious.

The answer is that shifts by zero are special, and that all other
shifts are indeed suspicious.  The most suspicious thing about a
(BImode) shift by zero is why it hasn't already been optimized away.
Indeed, in Bernd Edlinger's original 2016 patch submission to warn
about LSHIFT_EXPR with -Wint-in-bool-context, he included exceptions
for shifts (of truthvalues) by zero,
https://gcc.gnu.org/pipermail/gcc-patches/2016-September/457716.html
but was talked out of this during the review process, and unconditionally
warned of all LSHIFT_EXPRs by
https://gcc.gnu.org/pipermail/gcc-patches/2016-September/458263.html

This patch teaches c_common_truthvalue_conversion that a left shift
by zero is special/a no-op, and to apply the conversion to the first
operand, which both fixes the bogus warning and generates more sensible
expression trees.  [Some part of me thinks increasing the amount of
folding in the front-ends is bad, but another part thinks that calling
fold on trees that haven't had their operands folded/canonicalized
(then complaining about suspicious looking but perfectly valid results)
is sometimes worse].
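
As a rough illustration of the idea, the sketch below models the special case on a toy expression tree. Everything in it (the toy_ types and functions) is hypothetical, and it simplifies the real logic: c_common_truthvalue_conversion only warns on signed shifts and does much more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature expression tree; GCC's trees are far richer.  */
enum toy_code { TOY_VAR, TOY_CONST, TOY_LSHIFT };

struct toy_expr
{
  enum toy_code code;
  long value;                   /* Used by TOY_CONST.  */
  const struct toy_expr *op0;   /* Shifted operand of TOY_LSHIFT.  */
  const struct toy_expr *op1;   /* Shift count of TOY_LSHIFT.  */
};

static bool
toy_integer_zerop (const struct toy_expr *e)
{
  return e->code == TOY_CONST && e->value == 0;
}

/* Return the expression whose truth value should be tested, and count in
   *WARNINGS every shift that would still be diagnosed.  */
const struct toy_expr *
toy_truthvalue_operand (const struct toy_expr *e, int *warnings)
{
  assert (e != NULL && warnings != NULL);
  if (e->code == TOY_LSHIFT)
    {
      /* A shift by zero is a no-op: look through it without warning.  */
      if (toy_integer_zerop (e->op1))
        return toy_truthvalue_operand (e->op0, warnings);
      /* Any other shift in a boolean context stays suspicious.  */
      ++*warnings;
    }
  return e;
}

/* Usage example mirroring a few of the new test cases.  */
void
check_toy_truthvalue (void)
{
  const struct toy_expr x = { TOY_VAR, 0, NULL, NULL };
  const struct toy_expr y = { TOY_VAR, 0, NULL, NULL };
  const struct toy_expr zero = { TOY_CONST, 0, NULL, NULL };
  const struct toy_expr x_shl_0 = { TOY_LSHIFT, 0, &x, &zero };
  const struct toy_expr x_shl_y = { TOY_LSHIFT, 0, &x, &y };
  const struct toy_expr nested = { TOY_LSHIFT, 0, &x_shl_y, &zero };
  int warnings = 0;

  /* x << 0 collapses to x, with no diagnostic.  */
  assert (toy_truthvalue_operand (&x_shl_0, &warnings) == &x);
  assert (warnings == 0);

  /* x << y is kept and diagnosed.  */
  assert (toy_truthvalue_operand (&x_shl_y, &warnings) == &x_shl_y);
  assert (warnings == 1);

  /* (x << y) << 0 looks through the outer shift but still flags the
     inner one.  */
  assert (toy_truthvalue_operand (&nested, &warnings) == &x_shl_y);
  assert (warnings == 2);
}
```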

This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap"
and "make -k check" with no new failures.  Ok for mainline?


2021-09-13  Roger Sayle  

gcc/c-family/ChangeLog
PR c/102245
* c-common.c (c_common_truthvalue_conversion) [LSHIFT_EXPR]:
Special case (optimize) shifts by zero.

gcc/testsuite/ChangeLog
PR c/102245
* gcc.dg/Wint-in-bool-context-4.c: New test case.

Roger
--

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 017e415..44b5fcc 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -3541,6 +3541,10 @@ c_common_truthvalue_conversion (location_t location, 
tree expr)
   break;
 
 case LSHIFT_EXPR:
+  /* Treat shifts by zero as a special case.  */
+  if (integer_zerop (TREE_OPERAND (expr, 1)))
+   return c_common_truthvalue_conversion (location,
+  TREE_OPERAND (expr, 0));
   /* We will only warn on signed shifts here, because the majority of
 false positive warnings happen in code where unsigned arithmetic
 was used in anticipation of a possible overflow.
/* PR c/102245 */
/* { dg-options "-Wint-in-bool-context" } */
/* { dg-do compile } */

_Bool test1(_Bool x)
{
  return !(x << 0);  /* { dg-bogus "boolean context" } */
}

_Bool test2(_Bool x)
{
  return !(x << 1);  /* { dg-warning "boolean context" } */
}

_Bool test3(_Bool x, int y)
{
  return !(x << y);  /* { dg-warning "boolean context" } */
}

_Bool test4(int x, int y)
{
  return !(x << y);  /* { dg-warning "boolean context" } */
}

_Bool test5(int x, int y)
{
  return !((x << y) << 0);  /* { dg-warning "boolean context" } */
}

int test6(_Bool x)
{
  int v = 0;
  return (v & ~1L) | (1L & (x << 0));  /* { dg-bogus "boolean context" } */
}



Re: [PATCH] Add cr16-*-* to the list of obsoleted targets

2021-09-13 Thread Jan-Benedict Glaw
Hi!

On Mon, 2021-09-13 11:58:59 +0200, Richard Biener wrote:
> On Mon, 13 Sep 2021, Jan-Benedict Glaw wrote:
> > contrib/ChangeLog:
> > 
> > * config-list.mk (LIST): --enable-obsolete for cr16-elf.
> > 
[...]
> OK.

Committed, thanks!

MfG, JBG



Re: [PATCH] Fix multi-statement define for alpha-dec-vms

2021-09-13 Thread Jan-Benedict Glaw
Hi!

On Mon, 2021-09-13 11:11:30 +0200, Richard Biener wrote:
> On Sun, Sep 12, 2021 at 8:12 PM Jan-Benedict Glaw  wrote:
> > gcc/ChangeLog:
> >
> > * config/alpha/vms.h (INIT_CUMULATIVE_ARGS): Wrap multi-statement
> > define into a block.
> OK.

Committed, thanks!

MfG, JBG



Re: [PATCH] Add cr16-*-* to the list of obsoleted targets

2021-09-13 Thread Richard Biener via Gcc-patches
On Mon, 13 Sep 2021, Jan-Benedict Glaw wrote:

> Hi Richard,
> 
> On Mon, 2021-09-13 11:24:53 +0200, Richard Biener via Gcc-patches wrote:
> > This adds cr16-*-* to the list of obsoleted targets in config.gcc
> > 
> > Approved by Jeff in another thread, pushed.  cr16 has no maintainer and
> > it's still cc0.
> > 
> > 2021-09-13  Richard Biener  
> > 
> > * config.gcc: Add cr16-*-* to the list of obsoleted targets.
> 
> for the time being, please update ./contrib/config-list.mk to list
> "cr16-elf" as "cr16-elfOPT-enable-obsolete". Or anybody to ACK this
> for trunk?

OK.

Thanks and sorry for missing this,
Richard.

> contrib/ChangeLog:
> 
>   * config-list.mk (LIST): --enable-obsolete for cr16-elf.
> 
> diff --git a/contrib/config-list.mk b/contrib/config-list.mk
> index b9e9dd0b34b..b493e69f5d6 100644
> --- a/contrib/config-list.mk
> +++ b/contrib/config-list.mk
> @@ -40,7 +40,7 @@ LIST = aarch64-elf aarch64-linux-gnu aarch64-rtems \
>arm-symbianelf avr-elf \
>bfin-elf bfin-uclinux bfin-linux-uclibc bfin-rtems bfin-openbsd \
>bpf-unknown-none \
> -  c6x-elf c6x-uclinux cr16-elf cris-elf \
> +  c6x-elf c6x-uclinux cr16-elfOPT-enable-obsolete cris-elf \
>csky-elf csky-linux-gnu \
>epiphany-elf epiphany-elfOPT-with-stack-offset=16 fido-elf \
>fr30-elf frv-elf frv-linux ft32-elf h8300-elf hppa-linux-gnu \
> 
> 
> Thanks,
>   Jan-Benedict
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: [PATCH] Add cr16-*-* to the list of obsoleted targets

2021-09-13 Thread Jan-Benedict Glaw
Hi Richard,

On Mon, 2021-09-13 11:24:53 +0200, Richard Biener via Gcc-patches wrote:
> This adds cr16-*-* to the list of obsoleted targets in config.gcc
> 
> Approved by Jeff in another thread, pushed.  cr16 has no maintainer and
> it's still cc0.
> 
> 2021-09-13  Richard Biener  
> 
>   * config.gcc: Add cr16-*-* to the list of obsoleted targets.

For the time being, please update ./contrib/config-list.mk to list
"cr16-elf" as "cr16-elfOPT-enable-obsolete".  Or could anybody ACK this
for trunk?

contrib/ChangeLog:

* config-list.mk (LIST): --enable-obsolete for cr16-elf.

diff --git a/contrib/config-list.mk b/contrib/config-list.mk
index b9e9dd0b34b..b493e69f5d6 100644
--- a/contrib/config-list.mk
+++ b/contrib/config-list.mk
@@ -40,7 +40,7 @@ LIST = aarch64-elf aarch64-linux-gnu aarch64-rtems \
   arm-symbianelf avr-elf \
   bfin-elf bfin-uclinux bfin-linux-uclibc bfin-rtems bfin-openbsd \
   bpf-unknown-none \
-  c6x-elf c6x-uclinux cr16-elf cris-elf \
+  c6x-elf c6x-uclinux cr16-elfOPT-enable-obsolete cris-elf \
   csky-elf csky-linux-gnu \
   epiphany-elf epiphany-elfOPT-with-stack-offset=16 fido-elf \
   fr30-elf frv-elf frv-linux ft32-elf h8300-elf hppa-linux-gnu \


Thanks,
  Jan-Benedict


Re: [PATCH v3 1/3] rtl: directly handle MEM in gen_highpart [PR102125]

2021-09-13 Thread Richard Sandiford via Gcc-patches
Richard Earnshaw writes:
> On 13/09/2021 10:38, Richard Sandiford via Gcc-patches wrote:
>> Richard Earnshaw via Gcc-patches  writes:
>>> gen_lowpart_general handles forming a lowpart of a MEM by using
>>> adjust_address to rework and validate a new version of the MEM.
>>> Do the same for gen_highpart rather than calling simplify_gen_subreg
>>> for this case.
>> 
>> Looks OK, but what went wrong with the existing code?  Did
>> simplify_gen_subreg refuse to handle a MEM that you wanted
>> it to handle, or did the validize_mem go wrong for some reason?
>
> It refused to handle it and simply returned (subreg (mem)) - see the 
> discussion on version 1 of the patch series.

OK, that's good then.  The patch is OK from my POV too FWIW.
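
For a MEM, taking the high part amounts to addressing the same memory at a byte offset in a narrower mode, which is what the adjust_address call in the patch does with the offset from subreg_highpart_offset. A minimal sketch of that offset arithmetic follows. The helpers are hypothetical, use plain integers rather than poly_int64, and assume a target whose byte and word endianness agree; they are not GCC's implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Byte offset of the most significant OUTER_SIZE bytes within a value of
   INNER_SIZE bytes held in memory.  Hypothetical helper, assuming byte
   and word endianness agree.  */
unsigned
toy_highpart_offset (unsigned outer_size, unsigned inner_size,
                     bool big_endian)
{
  assert (outer_size <= inner_size);
  return big_endian ? 0 : inner_size - outer_size;
}

/* The low part is the mirror image.  */
unsigned
toy_lowpart_offset (unsigned outer_size, unsigned inner_size,
                    bool big_endian)
{
  assert (outer_size <= inner_size);
  return big_endian ? inner_size - outer_size : 0;
}

/* Usage example: the 4-byte halves of an 8-byte value.  */
void
check_part_offsets (void)
{
  assert (toy_highpart_offset (4, 8, false) == 4);
  assert (toy_highpart_offset (4, 8, true) == 0);
  assert (toy_lowpart_offset (4, 8, false) == 0);
  assert (toy_lowpart_offset (4, 8, true) == 4);
}
```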

Richard

>>> gcc/ChangeLog:
>>>
>>> PR target/102125
>>> * emit-rtl.c (gen_highpart): Use adjust_address to handle
>>> MEM rather than calling simplify_gen_subreg.
>>> ---
>>>   gcc/emit-rtl.c | 23 +--
>>>   1 file changed, 13 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
>>> index 77ea8948ee8..0ba110879aa 100644
>>> --- a/gcc/emit-rtl.c
>>> +++ b/gcc/emit-rtl.c
>>> @@ -1585,19 +1585,22 @@ gen_highpart (machine_mode mode, rtx x)
>>> gcc_assert (known_le (msize, (unsigned int) UNITS_PER_WORD)
>>>   || known_eq (msize, GET_MODE_UNIT_SIZE (GET_MODE (x))));
>>>   
>>> -  result = simplify_gen_subreg (mode, x, GET_MODE (x),
>>> -   subreg_highpart_offset (mode, GET_MODE (x)));
>>> -  gcc_assert (result);
>>> -
>>> -  /* simplify_gen_subreg is not guaranteed to return a valid operand for
>>> - the target if we have a MEM.  gen_highpart must return a valid 
>>> operand,
>>> - emitting code if necessary to do so.  */
>>> -  if (MEM_P (result))
>>> +  /* gen_lowpart_common handles a lot of special cases due to needing to 
>>> handle
>>> + paradoxical subregs; it only calls simplify_gen_subreg when certain 
>>> that
>>> + it will produce something meaningful.  The only case we need to handle
>>> + specially here is MEM.  */
>>> +  if (MEM_P (x))
>>>   {
>>> -  result = validize_mem (result);
>>> -  gcc_assert (result);
>>> +  poly_int64 offset = subreg_highpart_offset (mode, GET_MODE (x));
>>> +  return adjust_address (x, mode, offset);
>>>   }
>>>   
>>> +  result = simplify_gen_subreg (mode, x, GET_MODE (x),
>>> +   subreg_highpart_offset (mode, GET_MODE (x)));
>>> +  /* Since we handle MEM directly above, we should never get a MEM back
>>> + from simplify_gen_subreg.  */
>>> +  gcc_assert (result && !MEM_P (result));
>>> +
>>> return result;
>>>   }
>>>   


[PATCH] Remove references to FSM threads.

2021-09-13 Thread Aldy Hernandez via Gcc-patches
Now that the jump thread back registry has been split into the generic
copier and the custom (old) copier, it becomes trivial to remove the
FSM bits from the jump threaders.

First, there's no need for an EDGE_FSM_THREAD type.  The only reason
we were looking at the threading type was to determine what type of
copier to use, and now that the copier has been split, there's no need
to even look.  However, there is one check in register_jump_thread
where we verify that only the generic copier can thread through
back-edges.  I've removed that check in favor of a flag passed to the
constructor.

I've also removed all the FSM references from the code and tests.
Interestingly, some tests weren't even testing the right thing.  They
were testing for "FSM" which would catch jump thread paths as well as
the backward threader *failing* on registering a path.  *big eye roll*

The only remaining code that was actually checking for EDGE_FSM_THREAD
was adjust_paths_after_duplication, and the checks could be written
without looking at the edge type at all.  For the record, the code
there is horrible: it's convoluted, hard to read, and doesn't have any
tests.  I'd smack myself if I could go back in time.

All that remains are the FSM references in the --param's themselves.
I think we should s/fsm/threader/, since I envision a day when we can
share the cost basis code between the threaders.  However, I don't
know what the proper procedure is for renaming existing compiler
options.

By the way, param_fsm_maximum_phi_arguments is no longer relevant
after the rewrite.  We can nuke that one right away.

Tested on x86-64 Linux.

OK?

gcc/ChangeLog:

* tree-ssa-threadbackward.c
(back_threader_profitability::profitable_path_p): Remove FSM
references.
(back_threader_registry::register_path): Same.
* tree-ssa-threadedge.c
(jump_threader::simplify_control_stmt_condition): Same.
* tree-ssa-threadupdate.c (jt_path_registry::jt_path_registry):
Add backedge_threads argument.
(fwd_jt_path_registry::fwd_jt_path_registry): Pass
backedge_threads argument.
(back_jt_path_registry::back_jt_path_registry):  Same.
(dump_jump_thread_path): Adjust for FSM removal.
(back_jt_path_registry::rewire_first_differing_edge): Same.
(back_jt_path_registry::adjust_paths_after_duplication): Same.
(back_jt_path_registry::update_cfg): Same.
(jt_path_registry::register_jump_thread): Same.
* tree-ssa-threadupdate.h (enum jump_thread_edge_type): Remove
EDGE_FSM_THREAD.
(class back_jt_path_registry): Add backedge_threads to
constructor.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr21417.c: Adjust for FSM removal.
* gcc.dg/tree-ssa/pr66752-3.c: Same.
* gcc.dg/tree-ssa/pr68198.c: Same.
* gcc.dg/tree-ssa/pr69196-1.c: Same.
* gcc.dg/tree-ssa/pr70232.c: Same.
* gcc.dg/tree-ssa/pr77445.c: Same.
* gcc.dg/tree-ssa/ranger-threader-4.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-18.c: Same.
* gcc.dg/tree-ssa/ssa-dom-thread-6.c: Same.
* gcc.dg/tree-ssa/ssa-thread-12.c: Same.
* gcc.dg/tree-ssa/ssa-thread-13.c: Same.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr21417.c   |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c |  4 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr68198.c   |  4 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr69196-1.c |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr70232.c   | 12 +--
 gcc/testsuite/gcc.dg/tree-ssa/pr77445.c   |  2 +-
 .../gcc.dg/tree-ssa/ranger-threader-4.c   |  2 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-18.c   |  2 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-6.c|  4 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-12.c |  7 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-13.c |  2 +-
 gcc/tree-ssa-threadbackward.c | 37 +-
 gcc/tree-ssa-threadedge.c | 10 +--
 gcc/tree-ssa-threadupdate.c   | 73 +--
 gcc/tree-ssa-threadupdate.h   |  8 +-
 15 files changed, 85 insertions(+), 86 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c
index fc14af4e662..b934c9c73d5 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr21417.c
@@ -49,5 +49,5 @@ L23:
 /* We should thread the backedge to the top of the loop; ie we only
execute the if (expr->common.code != 142) test once per loop
iteration.  */
-/* { dg-final { scan-tree-dump-times "FSM jump thread" 1 "thread4" } } */
+/* { dg-final { scan-tree-dump-times "jump thread" 1 "thread4" } } */
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c
index 896c8bf7edc..e1464e21170 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c
@@ -32,9 +32,9 @@ foo (int N, int c, int b, int *a)

Re: [PATCH v3 1/3] rtl: directly handle MEM in gen_highpart [PR102125]

2021-09-13 Thread Richard Earnshaw via Gcc-patches




On 13/09/2021 10:38, Richard Sandiford via Gcc-patches wrote:
> Richard Earnshaw via Gcc-patches writes:
>> gen_lowpart_general handles forming a lowpart of a MEM by using
>> adjust_address to rework and validate a new version of the MEM.
>> Do the same for gen_highpart rather than calling simplify_gen_subreg
>> for this case.
>
> Looks OK, but what went wrong with the existing code?  Did
> simplify_gen_subreg refuse to handle a MEM that you wanted
> it to handle, or did the validize_mem go wrong for some reason?

It refused to handle it and simply returned (subreg (mem)) - see the
discussion on version 1 of the patch series.

R.

> Thanks,
> Richard
>
>> gcc/ChangeLog:
>>
>> PR target/102125
>> * emit-rtl.c (gen_highpart): Use adjust_address to handle
>> MEM rather than calling simplify_gen_subreg.
>> ---
>>   gcc/emit-rtl.c | 23 +--
>>   1 file changed, 13 insertions(+), 10 deletions(-)
>>
>> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
>> index 77ea8948ee8..0ba110879aa 100644
>> --- a/gcc/emit-rtl.c
>> +++ b/gcc/emit-rtl.c
>> @@ -1585,19 +1585,22 @@ gen_highpart (machine_mode mode, rtx x)
>> gcc_assert (known_le (msize, (unsigned int) UNITS_PER_WORD)
>>   || known_eq (msize, GET_MODE_UNIT_SIZE (GET_MODE (x))));
>>
>> -  result = simplify_gen_subreg (mode, x, GET_MODE (x),
>> -   subreg_highpart_offset (mode, GET_MODE (x)));
>> -  gcc_assert (result);
>> -
>> -  /* simplify_gen_subreg is not guaranteed to return a valid operand for
>> - the target if we have a MEM.  gen_highpart must return a valid operand,
>> - emitting code if necessary to do so.  */
>> -  if (MEM_P (result))
>> +  /* gen_lowpart_common handles a lot of special cases due to needing to handle
>> + paradoxical subregs; it only calls simplify_gen_subreg when certain that
>> + it will produce something meaningful.  The only case we need to handle
>> + specially here is MEM.  */
>> +  if (MEM_P (x))
>>   {
>> -  result = validize_mem (result);
>> -  gcc_assert (result);
>> +  poly_int64 offset = subreg_highpart_offset (mode, GET_MODE (x));
>> +  return adjust_address (x, mode, offset);
>>   }
>>
>> +  result = simplify_gen_subreg (mode, x, GET_MODE (x),
>> +   subreg_highpart_offset (mode, GET_MODE (x)));
>> +  /* Since we handle MEM directly above, we should never get a MEM back
>> + from simplify_gen_subreg.  */
>> +  gcc_assert (result && !MEM_P (result));
>> +
>> return result;
>>   }
  


Re: [PATCH v3 1/3] rtl: directly handle MEM in gen_highpart [PR102125]

2021-09-13 Thread Richard Sandiford via Gcc-patches
Richard Earnshaw via Gcc-patches writes:
> gen_lowpart_general handles forming a lowpart of a MEM by using
> adjust_address to rework and validate a new version of the MEM.
> Do the same for gen_highpart rather than calling simplify_gen_subreg
> for this case.

Looks OK, but what went wrong with the existing code?  Did
simplify_gen_subreg refuse to handle a MEM that you wanted
it to handle, or did the validize_mem go wrong for some reason?

Thanks,
Richard

> gcc/ChangeLog:
>
>   PR target/102125
>   * emit-rtl.c (gen_highpart): Use adjust_address to handle
>   MEM rather than calling simplify_gen_subreg.
> ---
>  gcc/emit-rtl.c | 23 +--
>  1 file changed, 13 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> index 77ea8948ee8..0ba110879aa 100644
> --- a/gcc/emit-rtl.c
> +++ b/gcc/emit-rtl.c
> @@ -1585,19 +1585,22 @@ gen_highpart (machine_mode mode, rtx x)
>gcc_assert (known_le (msize, (unsigned int) UNITS_PER_WORD)
> || known_eq (msize, GET_MODE_UNIT_SIZE (GET_MODE (x))));
>  
> -  result = simplify_gen_subreg (mode, x, GET_MODE (x),
> - subreg_highpart_offset (mode, GET_MODE (x)));
> -  gcc_assert (result);
> -
> -  /* simplify_gen_subreg is not guaranteed to return a valid operand for
> - the target if we have a MEM.  gen_highpart must return a valid operand,
> - emitting code if necessary to do so.  */
> -  if (MEM_P (result))
> +  /* gen_lowpart_common handles a lot of special cases due to needing to 
> handle
> + paradoxical subregs; it only calls simplify_gen_subreg when certain that
> + it will produce something meaningful.  The only case we need to handle
> + specially here is MEM.  */
> +  if (MEM_P (x))
>  {
> -  result = validize_mem (result);
> -  gcc_assert (result);
> +  poly_int64 offset = subreg_highpart_offset (mode, GET_MODE (x));
> +  return adjust_address (x, mode, offset);
>  }
>  
> +  result = simplify_gen_subreg (mode, x, GET_MODE (x),
> + subreg_highpart_offset (mode, GET_MODE (x)));
> +  /* Since we handle MEM directly above, we should never get a MEM back
> + from simplify_gen_subreg.  */
> +  gcc_assert (result && !MEM_P (result));
> +
>return result;
>  }
>  


Re: [PATCH v3 1/3] rtl: directly handle MEM in gen_highpart [PR102125]

2021-09-13 Thread Richard Biener via Gcc-patches
On Fri, Sep 10, 2021 at 4:48 PM Richard Earnshaw wrote:
>
>
> gen_lowpart_general handles forming a lowpart of a MEM by using
> adjust_address to rework and validate a new version of the MEM.
> Do the same for gen_highpart rather than calling simplify_gen_subreg
> for this case.

OK from my side.

Thanks,
Richard.

> gcc/ChangeLog:
>
> PR target/102125
> * emit-rtl.c (gen_highpart): Use adjust_address to handle
> MEM rather than calling simplify_gen_subreg.
> ---
>  gcc/emit-rtl.c | 23 +--
>  1 file changed, 13 insertions(+), 10 deletions(-)
>


Re: [COMMITTED][patch][version 9]add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-09-13 Thread Richard Biener via Gcc-patches
On Fri, Sep 10, 2021 at 3:47 PM Jose E. Marchesi via Gcc-patches wrote:
>
>
> Hi Richard.
>
> > On Thu, 9 Sep 2021, Kees Cook wrote:
> >
> >> On Thu, Sep 09, 2021 at 10:49:11PM +, Qing Zhao wrote:
> >> > Hi, FYI
> >> >
> >> > I just committed the following patch to gcc upstream:
> >> >
> >> >
> >> > https://gcc.gnu.org/pipermail/gcc-cvs/2021-September/353195.html
> >>
> >> Hurray! Thank you so much for working on this, and thanks also to the
> >> reviewers and everyone else poking at it.
> >>
> >> I will go update my Linux Plumbers slides to say "supported" instead of
> >> "proposed". :)
> >
> > Can you two work on wording to add to gcc-12/changes.html for this
> > feature?  I think it deserves a release note.  Likewise the CTF/BTF
> > support btw.
>
> What about something like this for the BPF, CTF and BTF changes..

Looks good to me!

Thanks,
Richard.

> commit 3826495d1a2c265954d5da13ca71925eea390060 (HEAD -> master)
> Author: Jose E. Marchesi 
> Date:   Fri Sep 10 15:44:30 2021 +0200
>
> gcc-12/changes.html: BPF, CTF and BTF update
>
> * htdocs/gcc-12/changes.html (BPF): Item about the CO-RE support.
> (Debugging formats): New section with items about the support for
> CTF and BTF.
>
> diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
> index 946faa49..936af979 100644
> --- a/htdocs/gcc-12/changes.html
> +++ b/htdocs/gcc-12/changes.html
> @@ -143,6 +143,15 @@ a work-in-progress.
>
>  
>
> +BPF
> +
> +  Support for CO-RE (compile-once, run-everywhere) has been added
> +  to the BPF backend.  CO-RE allows to compile portable BPF
> +  programs that are able to run among different versions of the
> +  Linux kernel.
> +  
> +
> +
>  
>
>  
> @@ -210,7 +219,25 @@ a work-in-progress.
>  
>
>  
> -
> +Other significant improvements
> +
> +Debugging formats
> +
> +
> +  GCC can now generate debugging information
> +  in <a href="https://ctfstd.org">CTF</a>, a lightweight debugging
> +  format that provides information about C types and the
> +  association between functions and data symbols and types.  This
> +  format is designed to be embedded in ELF files and to be very
> +  compact and simple.  A new command-line
> +  option -gctf enables the generation of CTF.
> +  
> +  GCC can now generate debugging information in BTF.  This is a
> +  debugging format mainly used in BPF programs and the Linux
> +  kernel.  The compiler can generate BTF for any target, when
> +  enabled with the command-line option -gbtf
> +  
> +
>
>
>  


[PATCH] Fix i686-lynx build breakage

2021-09-13 Thread Richard Biener via Gcc-patches
With the last adjustment I failed to remove a stray
#undef PREFERRED_DEBUGGING_TYPE from lynx.h.

Pushed.

2021-09-13  Richard Biener  

* config/i386/lynx.h: Remove undef of PREFERRED_DEBUGGING_TYPE
to inherit from elfos.h
---
 gcc/config/i386/lynx.h | 4 
 1 file changed, 4 deletions(-)

diff --git a/gcc/config/i386/lynx.h b/gcc/config/i386/lynx.h
index 70b2587e6cb..65fc6a7468d 100644
--- a/gcc/config/i386/lynx.h
+++ b/gcc/config/i386/lynx.h
@@ -60,10 +60,6 @@ along with GCC; see the file COPYING3.  If not see
 
 #undef ASM_OUTPUT_ALIGN
 
-/* Undefine the definition from elfos.h to enable our default.  */
-
-#undef PREFERRED_DEBUGGING_TYPE
-
 /* The file i386.c defines TARGET_HAVE_TLS unconditionally if
HAVE_AS_TLS is defined.  HAVE_AS_TLS is defined as gas support for
TLS is detected by configure.  We undefine it here.  */
-- 
2.31.1


Re: [PATCH] Remove dbx.h, do not set PREFERRED_DEBUGGING_TYPE from dbxcoff.h, lynx.h

2021-09-13 Thread Richard Biener via Gcc-patches
On Mon, 13 Sep 2021, Jan-Benedict Glaw wrote:

> Hi Richard,
> 
> On Fri, 2021-09-10 08:02:00 +0200, Richard Biener via Gcc-patches 
>  wrote:
> > > On 9/9/2021 7:19 AM, Richard Biener via Gcc-patches wrote:
> > > > The patch also removes the PREFERRED_DEBUGGING_TYPE define from
> > > > lynx.h which always follows elfos.h already defaulting to DWARF,
> > > > so the comment about STABS being the default is misleading and
> > > > outdated.  There's no listed maintainer for Lynx OS.
> > > >
> > > > I have not tested this in any ways but I also have no idea how
> > > > to meaningfully do so.
> 
> I'm not actually running such a configuration and cannot properly test
> it, but automated mass-building broke for --target=i686-lynxos:

Ah, I didn't spot

/* Undefine the definition from elfos.h to enable our default.  */

#undef PREFERRED_DEBUGGING_TYPE

will fix.

Richard.


[PATCH] Add cr16-*-* to the list of obsoleted targets

2021-09-13 Thread Richard Biener via Gcc-patches
This adds cr16-*-* to the list of obsoleted targets in config.gcc

Approved by Jeff in another thread, pushed.  cr16 has no maintainer and
it's still cc0.

2021-09-13  Richard Biener  

* config.gcc: Add cr16-*-* to the list of obsoleted targets.
---
 gcc/config.gcc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index ccf41f66e42..84de1a3f691 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -249,6 +249,7 @@ md_file=
 # Obsolete configurations.
 case ${target} in
   tile*-*-*\
+ | cr16-*-*\
  )
 if test "x$enable_obsolete" != xyes; then
   echo "*** Configuration ${target} is obsolete." >&2
-- 
2.31.1


Re: [PATCH 2/2] validate_subreg before call gen_lowpart to avoid ICE.

2021-09-13 Thread Richard Biener via Gcc-patches
On Mon, Sep 13, 2021 at 8:26 AM Hongtao Liu  wrote:
>
> On Mon, Sep 13, 2021 at 2:11 PM Richard Biener via Gcc-patches
>  wrote:
> >
> > On Fri, Sep 10, 2021 at 2:58 PM liuhongt  wrote:
> > >
> > > gcc/ChangeLog:
> > >
> > > * expmed.c (extract_bit_field_using_extv): validate_subreg
> > > before call gen_lowpart.
> > > ---
> > >  gcc/expmed.c | 6 +-
> > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/gcc/expmed.c b/gcc/expmed.c
> > > index 3143f38e057..10d62d857a8 100644
> > > --- a/gcc/expmed.c
> > > +++ b/gcc/expmed.c
> > > @@ -1571,12 +1571,16 @@ extract_bit_field_using_extv (const 
> > > extraction_insn *extv, rtx op0,
> > >
> > >if (GET_MODE (target) != ext_mode)
> > >  {
> > > +  machine_mode tmode = GET_MODE (target);
> > >/* Don't use LHS paradoxical subreg if explicit truncation is 
> > > needed
> > >  between the mode of the extraction (word_mode) and the target
> > >  mode.  Instead, create a temporary and use convert_move to set
> > >  the target.  */
> > >if (REG_P (target)
> > > - && TRULY_NOOP_TRUNCATION_MODES_P (GET_MODE (target), ext_mode))
> > > + && TRULY_NOOP_TRUNCATION_MODES_P (tmode, ext_mode)
> > > + && validate_subreg (ext_mode, tmode,
> > > + target,
> > > + subreg_lowpart_offset (ext_mode, tmode)))
> > > {
> > >   target = gen_lowpart (ext_mode, target);
> >
> > That would be equivalent to use gen_lowpart_if_possible?
> No, target will be changed to NULL_RTX.
> But it does avoid ICE since maybe_expand_insn can legitimate operands,
> but I doubt it will introduce other bugs since the target has been
> changed here.
>
> I think the validate_subreg solution is plain and straightforward,
> just like it's done in
> r11-7515-g0ad6de3883a1641f7ec0bd9cf56d41fa5b313dae.

That guards an explicit gen_rtx_SUBREG, here we're using gen_lowpart.
It's not an obvious match to validate gen_lowpart with validate_subreg,
I thought that gen_lowpart_if_possible would be preferred.  You obviously
have to adjust the code, like

  rtx tem;
  if (...
      && (tem = gen_lowpart_if_possible (ext_mode, target)))
    {
      target = tem;
      ...

Richard.

>
> >
> > >   if (partial_subreg_p (GET_MODE (spec_target), ext_mode))
> > > --
> > > 2.27.0
> > >
>
>
>
> --
> BR,
> Hongtao
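
The shape suggested above (try the conversion, keep the result in a
temporary, and commit only when it is non-null) can be sketched outside
of GCC as follows.  Everything here is a hypothetical stand-in and not
the real RTL API: toy_lowpart_if_possible, pick_target_size and the
size rule are invented for illustration.  The only point is that the
original target is left untouched when the conversion fails.

```c
#include <stddef.h>

/* Hypothetical stand-in for an rtx: just a size in bytes.  */
struct toy_rtx { int size; };

/* Toy analogue of a "*_if_possible" helper: fills OUT and returns it on
   success, returns NULL when the request cannot be honored.  The rule
   used here (the requested size must not exceed the source size) is
   invented for this sketch.  */
static struct toy_rtx *
toy_lowpart_if_possible (int size, const struct toy_rtx *x,
                         struct toy_rtx *out)
{
  if (size > x->size)
    return NULL;
  out->size = size;
  return out;
}

/* Returns the size of the target that ends up being used.  */
int
pick_target_size (int want, int target_size)
{
  struct toy_rtx orig = { target_size };
  struct toy_rtx view;
  struct toy_rtx *target = &orig;
  struct toy_rtx *tem;

  /* Assign to a temporary first so TARGET is not clobbered with NULL
     when the conversion is not possible.  */
  if (target->size != want
      && (tem = toy_lowpart_if_possible (want, target, &view)) != NULL)
    target = tem;

  return target->size;
}
```

With these toy rules, pick_target_size (4, 8) switches to the narrowed
view, while pick_target_size (16, 8) keeps the original target.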


Re: [PATCH] Fix multi-statement define for alpha-dec-vms

2021-09-13 Thread Richard Biener via Gcc-patches
On Sun, Sep 12, 2021 at 8:12 PM Jan-Benedict Glaw  wrote:
>
> Hi!
>
> While mass-building a cross-gcc, I noticed that for
> alpha-dec-vms/alpha64-dec-vms, recent GCC versions correctly throw a warning
> due to a multi-statement define that gets ripped apart in an if/else case:
>
> [all 2021-09-12 15:51:55] /usr/lib/gcc-snapshot/bin/g++  -fno-PIE -c   -g -O2 
> -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE   -fno-exceptions -fno-rtti 
> -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
> -Wcast-qual -Wno-error=format-diag -Wmissing-format-attribute 
> -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros 
> -Wno-overlength-strings -Werror -fno-common  -DHAVE_CONFIG_H -I. -I. 
> -I../../gcc/gcc -I../../gcc/gcc/. -I../../gcc/gcc/../include 
> -I../../gcc/gcc/../libcpp/include -I../../gcc/gcc/../libcody  
> -I../../gcc/gcc/../libdecnumber -I../../gcc/gcc/../libdecnumber/dpd 
> -I../libdecnumber -I../../gcc/gcc/../libbacktrace   -o value-prof.o -MT 
> value-prof.o -MMD -MP -MF ./.deps/value-prof.TPo ../../gcc/gcc/value-prof.c
> [all 2021-09-12 15:52:01] /usr/lib/gcc-snapshot/bin/g++  -fno-PIE -c   -g -O2 
> -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE   -fno-exceptions -fno-rtti 
> -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
> -Wcast-qual -Wno-error=format-diag -Wmissing-format-attribute 
> -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros 
> -Wno-overlength-strings -Werror -fno-common  -DHAVE_CONFIG_H -I. -I. 
> -I../../gcc/gcc -I../../gcc/gcc/. -I../../gcc/gcc/../include 
> -I../../gcc/gcc/../libcpp/include -I../../gcc/gcc/../libcody  
> -I../../gcc/gcc/../libdecnumber -I../../gcc/gcc/../libdecnumber/dpd 
> -I../libdecnumber -I../../gcc/gcc/../libbacktrace   -o var-tracking.o -MT 
> var-tracking.o -MMD -MP -MF ./.deps/var-tracking.TPo 
> ../../gcc/gcc/var-tracking.c
> [all 2021-09-12 15:52:03] In file included from ./tm.h:21,
> [all 2021-09-12 15:52:03]  from ../../gcc/gcc/backend.h:28,
> [all 2021-09-12 15:52:03]  from 
> ../../gcc/gcc/var-tracking.c:91:
> [all 2021-09-12 15:52:03] ../../gcc/gcc/var-tracking.c: In function 'void 
> prepare_call_arguments(basic_block, rtx_insn*)':
> [all 2021-09-12 15:52:03] ../../gcc/gcc/config/alpha/vms.h:148:3: error: 
> macro expands to multiple statements [-Werror=multistatement-macros]
> [all 2021-09-12 15:52:03]   148 |   (CUM).num_args = 0;   
> \
> [all 2021-09-12 15:52:03]   |   ^
> [all 2021-09-12 15:52:03] ../../gcc/gcc/var-tracking.c:6334:17: note: in 
> expansion of macro 'INIT_CUMULATIVE_ARGS'
> [all 2021-09-12 15:52:03]  6334 | INIT_CUMULATIVE_ARGS 
> (args_so_far_v, type, NULL_RTX, fndecl,
> [all 2021-09-12 15:52:03]   | ^~~~
> [all 2021-09-12 15:52:03] ../../gcc/gcc/var-tracking.c:6332:15: note: some 
> parts of macro expansion are not guarded by this 'else' clause
> [all 2021-09-12 15:52:03]  6332 |   else
> [all 2021-09-12 15:52:03]   |   ^~~~
> [all 2021-09-12 15:52:20] cc1plus: all warnings being treated as errors
> [all 2021-09-12 15:52:20] make[1]: *** [Makefile:1143: var-tracking.o] Error 1
> [all 2021-09-12 15:52:20] make[1]: Leaving directory 
> '/var/lib/laminar/run/gcc-alpha64-dec-vms/8/toolchain-build/gcc'
> [all 2021-09-12 15:52:20] make: *** [Makefile:4425: all-gcc] Error 2
>
>
>
>
> gcc/ChangeLog:
>
> * config/alpha/vms.h (INIT_CUMULATIVE_ARGS): Wrap multi-statement
> define into a block.
>
>
> diff --git a/gcc/config/alpha/vms.h b/gcc/config/alpha/vms.h
> index b8673b6b6fb..e979aef10c7 100644
> --- a/gcc/config/alpha/vms.h
> +++ b/gcc/config/alpha/vms.h
> @@ -145,9 +145,13 @@ typedef struct {int num_args; enum avms_arg_type 
> atypes[6];} avms_arg_info;
>
>  #undef INIT_CUMULATIVE_ARGS
>  #define INIT_CUMULATIVE_ARGS(CUM, FNTYPE, LIBNAME, INDIRECT, N_NAMED_ARGS) \
> -  (CUM).num_args = 0;  \
> -  (CUM).atypes[0] = (CUM).atypes[1] = (CUM).atypes[2] = I64;   \
> -  (CUM).atypes[3] = (CUM).atypes[4] = (CUM).atypes[5] = I64;
> +  do   \
> +{  \
> +  (CUM).num_args = 0;  \
> +  (CUM).atypes[0] = (CUM).atypes[1] = (CUM).atypes[2] = I64;   \
> +  (CUM).atypes[3] = (CUM).atypes[4] = (CUM).atypes[5] = I64;   \
> +}  \
> +  while (0)
>
>  #define DEFAULT_PCC_STRUCT_RETURN 0
>
>
>
>
> Okay for trunk?

OK.

Thanks,
Richard.

> Thanks,
>   Jan-Benedict
>
> --
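
For readers who have not run into -Wmultistatement-macros before, here
is a minimal sketch of the problem, independent of GCC's own sources.
The names toy_args, INIT_BAD, INIT_GOOD, run_bad and run_good are made
up for illustration.  Under an unbraced if, only the first statement of
INIT_BAD is guarded; wrapping the body in do { } while (0) turns the
expansion into a single statement.

```c
/* Hypothetical stand-in for avms_arg_info; not the real GCC type.  */
struct toy_args { int num_args; int atypes[2]; };

/* Expands to two statements: an unbraced if/else controls only the
   first one.  Recent GCC versions warn about the use below when
   -Wall is given, which is the point of the example.  */
#define INIT_BAD(CUM)                         \
  (CUM).num_args = 0;                         \
  (CUM).atypes[0] = (CUM).atypes[1] = 64;

/* Expands to a single statement, so if/else guards all of it.  */
#define INIT_GOOD(CUM)                        \
  do                                          \
    {                                         \
      (CUM).num_args = 0;                     \
      (CUM).atypes[0] = (CUM).atypes[1] = 64; \
    }                                         \
  while (0)

/* Returns atypes[0] after a conditional INIT_BAD.  */
int
run_bad (int cond)
{
  struct toy_args a = { 7, { 1, 1 } };
  if (cond)
    INIT_BAD (a);   /* The atypes assignment runs unconditionally.  */
  return a.atypes[0];
}

/* Returns atypes[0] after a conditional INIT_GOOD.  */
int
run_good (int cond)
{
  struct toy_args a = { 7, { 1, 1 } };
  if (cond)
    INIT_GOOD (a);
  return a.atypes[0];
}
```

Calling run_bad (0) still overwrites atypes even though the condition
is false, whereas run_good (0) leaves the struct alone.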


Re: [PATCH] Fix PR lto/49664: liblto_plugin.so exports too many symbols

2021-09-13 Thread Richard Biener via Gcc-patches
On Sun, Sep 12, 2021 at 6:12 PM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> So right now liblto_plugin.so exports many libiberty symbols and
> simple_object file symbols but really it just needs to export onload.
>
> This fixes the problem by using "-export-symbols-regex onload" on
> the libtool link line.

OK.

> lto-plugin/ChangeLog:
>
> * Makefile.am: Export only onload.
> * Makefile.in: Regenerate.
> ---
>  lto-plugin/Makefile.am | 3 ++-
>  lto-plugin/Makefile.in | 7 ---
>  2 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/lto-plugin/Makefile.am b/lto-plugin/Makefile.am
> index 8b20e1d1d87..988d7a78294 100644
> --- a/lto-plugin/Makefile.am
> +++ b/lto-plugin/Makefile.am
> @@ -21,7 +21,8 @@ in_gcc_libs = $(foreach lib, $(libexecsub_LTLIBRARIES), 
> $(gcc_build_dir)/$(lib))
>  liblto_plugin_la_SOURCES = lto-plugin.c
>  # Note that we intentionally override the bindir supplied by 
> ACX_LT_HOST_FLAGS.
>  liblto_plugin_la_LDFLAGS = $(AM_LDFLAGS) \
> -   $(lt_host_flags) -module -avoid-version -bindir $(libexecsubdir)
> +   $(lt_host_flags) -module -avoid-version -bindir $(libexecsubdir) \
> +   -export-symbols-regex onload
>  # Can be simplified when libiberty becomes a normal convenience library.
>  libiberty = $(with_libiberty)/libiberty.a
>  libiberty_noasan = $(with_libiberty)/noasan/libiberty.a
> diff --git a/lto-plugin/Makefile.in b/lto-plugin/Makefile.in
> index 20611c6b1e6..f8df31bb1e8 100644
> --- a/lto-plugin/Makefile.in
> +++ b/lto-plugin/Makefile.in
> @@ -323,6 +323,7 @@ prefix = @prefix@
>  program_transform_name = @program_transform_name@
>  psdir = @psdir@
>  real_target_noncanonical = @real_target_noncanonical@
> +runstatedir = @runstatedir@
>  sbindir = @sbindir@
>  sharedstatedir = @sharedstatedir@
>  srcdir = @srcdir@
> @@ -350,9 +351,9 @@ libexecsub_LTLIBRARIES = liblto_plugin.la
>  in_gcc_libs = $(foreach lib, $(libexecsub_LTLIBRARIES), 
> $(gcc_build_dir)/$(lib))
>  liblto_plugin_la_SOURCES = lto-plugin.c
>  # Note that we intentionally override the bindir supplied by 
> ACX_LT_HOST_FLAGS.
> -liblto_plugin_la_LDFLAGS = $(AM_LDFLAGS) $(lt_host_flags) -module 
> -avoid-version \
> -   -bindir $(libexecsubdir) $(if $(wildcard \
> -   $(libiberty_noasan)),, $(if $(wildcard \
> +liblto_plugin_la_LDFLAGS = $(AM_LDFLAGS) $(lt_host_flags) -module \
> +   -avoid-version -bindir $(libexecsubdir) -export-symbols-regex \
> +   onload $(if $(wildcard $(libiberty_noasan)),, $(if $(wildcard \
> $(libiberty_pic)),,-Wc,$(libiberty)))
>  # Can be simplified when libiberty becomes a normal convenience library.
>  libiberty = $(with_libiberty)/libiberty.a
> --
> 2.17.1
>


Re: [PATCH] Fix SFmode subreg of DImode and TImode

2021-09-13 Thread Richard Biener via Gcc-patches
On Fri, Sep 10, 2021 at 5:07 PM Segher Boessenkool
 wrote:
>
> On Fri, Sep 10, 2021 at 12:53:37PM +0200, Richard Biener wrote:
> > On Fri, Sep 10, 2021 at 1:50 AM Segher Boessenkool
> >  wrote:
> > > And many targets have strange rules for bit-strings in which modes can
> > > be used as bit-strings in which other modes, and at what offsets in
> > > which registers.  Now perhaps none of that is optimal (I bet it isn't),
> > > but changing this without a transition plan simply does not work.
> >
> > But we _do_ already allow some of them :/  Like
>
> Yes.  And all of this is old and ingrained, and targets depend on the
> status quo, so changing this will need more care and planning and
> cooperation.  It certainly is a worthwhile thing to improve, but it is
> not a small project, and it requires a plan.
>
> >   /* ??? Similarly, e.g. with (subreg:DF (reg:TI)).  Though store_bit_field
> >  is the culprit here, and not the backends.  */
> >   else if (known_ge (osize, regsize) && known_ge (isize, osize))
> > ;
> >
> > so for the special case where 'regsize' matches osize it would be
> > a bit-cast of a full register from int to float.  But as written it also
> > allows (subreg:XF (reg:TI))  which will likely wreak havoc?
>
> That does not pass the isize >= osize test?  Or maybe I don't know
> what XFmode is well enough :-)  Hey I can read, I have source, and it
> is Friday...

TImode is 16 bytes but XFmode is 12.  I meant to construct a case
that passes all the tests but where it is definitely very odd for such
a subreg to be "allowed" by validate_subreg - a target may of course
have means to make sense of it (I don't see how x87 would though).

> Ah.  So XF has different size on 32-bit and on 64-bit, but that doesn't
> even matter here.
>
> > Similar for the omode == word_mode check which allows
> > (subreg:DI (reg:TF ..)).  That is, the existing special-cases look
> > too broad to me - and they probably exist because when validate_subreg
> > rejects sth then we can't put it together later when expand split it
> > into two subregs and a pseudo ...
>
> I said it before, and I'll say it again, it is a very important point:
> expand should not try to optimise this, *at all*.  And not just this,
> not *anything*.  Expand's job in the current compiler is only to
> translate Gimple to RTL, and nothing more.

+1 (or +10!), but unfortunately expand _is_ an important part
of optimization - in particular when it gets to avoiding stack usage
since we can never get rid of all effects of allocating a stack slot.

Splitting to multiple insns with pseudos should be fine though, but
it at least seems that splitting (subreg:FLOAT (reg:INT)) into
(subreg:INT (reg:INT)) (subreg:FLOAT (reg:INT)) isn't always valid.

Richard.
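
To make the XFmode case concrete, the size condition quoted earlier in
this message can be evaluated by hand.  The sketch below restates only
that one condition with plain integers in place of the poly_int sizes;
it is not the real validate_subreg, and the function name as well as
the register size of 4 bytes (a 32-bit word) are assumptions for
illustration.

```c
/* Toy restatement of
     known_ge (osize, regsize) && known_ge (isize, osize)
   with all sizes given in bytes.  Returns nonzero when the
   condition holds.  */
int
toy_size_check (int osize, int isize, int regsize)
{
  return osize >= regsize && isize >= osize;
}
```

With a 12-byte XFmode, a 16-byte TImode and an assumed 4-byte register
size, toy_size_check (12, 16, 4) holds, so (subreg:XF (reg:TI)) gets
through this particular test; swapping the two modes, as in
toy_size_check (16, 12, 4), does not.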


Re: [PATCH 00/13] ARM/MVE use vectors of boolean for predicates

2021-09-13 Thread Christophe LYON via Gcc-patches

ping?


On 07/09/2021 11:15, Christophe Lyon wrote:

This patch series addresses PR 100757 and 101325 by representing
vectors of predicates (MVE VPR.P0 register) as vectors of booleans
rather than using HImode.
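
As a rough picture of how the two views relate, the sketch below packs
four per-lane booleans (32-bit lanes) into a 16-bit mask with one bit
per byte of the 128-bit vector, which is how VPR.P0 is laid out.  It is
only an illustration and not code from this series; the function name
toy_pack_pred4 is made up.

```c
#include <stdint.h>

/* Pack four per-lane booleans into a 16-bit predicate mask.  Each
   32-bit lane covers four bytes, so a true lane sets four adjacent
   bits; lane 0 occupies the least significant bits.  */
uint16_t
toy_pack_pred4 (int lane0, int lane1, int lane2, int lane3)
{
  const int lanes[4] = { lane0, lane1, lane2, lane3 };
  uint16_t mask = 0;

  for (int i = 0; i < 4; i++)
    if (lanes[i])
      mask |= (uint16_t) (0xFu << (4 * i));

  return mask;
}
```

For example, lanes 0 and 3 being true gives the mask 0xF00F.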

As this implies a lot of mostly mechanical changes, I have tried to
split the patches in a way that should help reviewers, but the split
is a bit artificial.

Patches 1-3 add new tests.

Patches 4-6 are small independent improvements.

Patch 7 implements the predicate qualifier, but does not change any
builtin yet.

Patch 8 is the first of the two main patches, and uses the new
qualifier to describe the vcmp and vpsel builtins that are useful for
auto-vectorization of comparisons.

Patch 9 is the second main patch, which fixes the vcond_mask expander.

Patches 10-13 convert almost all the remaining builtins with HI
operands to use the predicate qualifier.  After these, there are still
a few builtins with HI operands left, about which I am not sure: vctp,
vpnot, load-gather and store-scatter with v2di operands.  In fact,
patches 11/12 update some STR/LDR qualifiers in a way that breaks
these v2di builtins although existing tests still pass.

Christophe Lyon (13):
   arm: Add new tests for comparison vectorization with Neon and MVE
   arm: Add tests for PR target/100757
   arm: Add test for PR target/101325
   arm: Add GENERAL_AND_VPR_REGS regclass
   arm: Add support for VPR_REG in arm_class_likely_spilled_p
   arm: Fix mve_vmvnq_n_ argument mode
   arm: Implement MVE predicates as vectors of booleans
   arm: Implement auto-vectorized MVE comparisons with vectors of boolean
 predicates
   arm: Fix vcond_mask expander for MVE (PR target/100757)
   arm: Convert remaining MVE vcmp builtins to predicate qualifiers
   arm: Convert more MVE builtins to predicate qualifiers
   arm: Convert more load/store MVE builtins to predicate qualifiers
   arm: Convert more MVE/CDE builtins to predicate qualifiers

  gcc/config/arm/arm-builtins.c | 228 +++--
  gcc/config/arm/arm-modes.def  |   5 +
  gcc/config/arm/arm-protos.h   |   3 +-
  gcc/config/arm/arm-simd-builtin-types.def |   4 +
  gcc/config/arm/arm.c  | 128 ++-
  gcc/config/arm/arm.h  |   5 +-
  gcc/config/arm/arm_mve_builtins.def   | 746 
  gcc/config/arm/iterators.md   |   5 +
  gcc/config/arm/mve.md | 823 ++
  gcc/config/arm/neon.md|  39 +
  gcc/config/arm/vec-common.md  |  52 --
  gcc/simplify-rtx.c|   7 +
  .../arm/acle/cde-mve-full-assembly.c  | 264 +++---
  .../gcc.target/arm/simd/mve-vcmp-f32-2.c  |  32 +
  .../gcc.target/arm/simd/neon-compare-1.c  |  78 ++
  .../gcc.target/arm/simd/neon-compare-2.c  |  13 +
  .../gcc.target/arm/simd/neon-compare-3.c  |  14 +
  .../arm/simd/neon-compare-scalar-1.c  |  57 ++
  .../gcc.target/arm/simd/neon-vcmp-f16.c   |  12 +
  .../gcc.target/arm/simd/neon-vcmp-f32-2.c |  15 +
  .../gcc.target/arm/simd/neon-vcmp-f32-3.c |  12 +
  .../gcc.target/arm/simd/neon-vcmp-f32.c   |  12 +
  gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c |  22 +
  .../gcc.target/arm/simd/pr100757-2.c  |  20 +
  .../gcc.target/arm/simd/pr100757-3.c  |  20 +
  .../gcc.target/arm/simd/pr100757-4.c  |  19 +
  gcc/testsuite/gcc.target/arm/simd/pr100757.c  |  19 +
  gcc/testsuite/gcc.target/arm/simd/pr101325.c  |  14 +
  28 files changed, 1581 insertions(+), 1087 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c
  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-1.c
  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-2.c
  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-3.c
  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-scalar-1.c
  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f16.c
  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-2.c
  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-3.c
  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32.c
  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c
  create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
  create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
  create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-4.c
  create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757.c
  create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr101325.c


