date:20211101

Re: [PATCH 1/2] add -Wuse-after-free

2021-11-01 Thread Eric Gallager via Gcc-patches

On Mon, Nov 1, 2021 at 6:18 PM Martin Sebor via Gcc-patches
 wrote:
>
> Patch 1 in the series detects a small subset of uses of pointers
> made indeterminate by calls to deallocation functions like free
> or C++ operator delete.  To control the conditions the warnings
> are issued under the new -Wuse-after-free= option provides three
> levels.  At the lowest level the warning triggers only for
> unconditional uses of freed pointers and doesn't warn for uses
> in equality expressions.  Level 2 also warns for some conditional
> uses, and level 3 also for uses in equality expressions.
>
> I debated whether to make level 2 or 3 the default included in
> -Wall.  I decided on 3 for two reasons: 1) to raise awareness
> of both the problem and GCC's new ability to detect it: using
> a pointer after it's been freed, even only in principle, by
> a successful call to realloc, is undefined, and 2) because
> it's trivial to lower the level either globally, or locally
> by suppressing the warning around such misuses.
>
> I've tested the patch on x86_64-linux and by building Glibc
> and Binutils/GDB.  It triggers a number of times in each, all
> due to comparing invalidated pointers for equality (i.e., level
> 3).  I have suppressed these in GCC (libiberty) by a #pragma,
> and will see how the Glibc folks want to deal with theirs (I
> track them in BZ #28521).
>
> The tests contain a number of xfails due to limitations I'm
> aware of.  I marked them pr?? until the patch is approved.
> I will open bugs for them before committing if I don't resolve
> them in a followup.
>
> Martin

Hi, I'm just wondering how this fares compared to the static
analyzer's -Wanalyzer-use-after-free; could you compare and contrast
them for us?
Thanks,
Eric
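
To make the realloc case in Martin's description concrete, here is a minimal C sketch. The helper name grow_keep_cursor is hypothetical and not part of the patch. It rewrites the common "did the buffer move?" idiom so that the old pointer is never read after a successful realloc. The comment marks the equality comparison that level 3 is described as covering. Whether GCC actually diagnoses a given line depends on the implementation, which this sketch does not try to model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: grow *BUF to NEW_SIZE bytes and return the new
   position of CUR, a cursor that points into the old buffer.  */
static char *
grow_keep_cursor (char **buf, char *cur, size_t new_size)
{
  assert (buf && *buf && cur);

  /* Record the offset while *buf is still a valid pointer.  */
  size_t off = (size_t) (cur - *buf);

  char *p = realloc (*buf, new_size);
  if (p == NULL)
    return NULL;		/* On failure *buf is untouched and still valid.  */

  /* The idiom this replaces is
       if (p != *buf) cur = p + (cur - *buf);
     which reads the old pointer after a successful realloc, i.e. the
     kind of equality use the description assigns to level 3.  */
  *buf = p;
  return p + off;
}
```

A caller passes the address of its buffer pointer plus the cursor, and afterwards uses only the returned cursor and the updated buffer pointer.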

Re: [PATCH] Add a simulate_record_decl lang hook

2021-11-01 Thread Jason Merrill via Gcc-patches

On Sat, Oct 30, 2021 at 2:29 PM Richard Sandiford 
wrote:

> Jason Merrill  writes:
> > On 10/18/21 16:35, Richard Sandiford wrote:
> >> Jason Merrill  writes:
> >>> On 9/24/21 13:53, Richard Sandiford wrote:
>  +  if (type == error_mark_node)
>  +return lhd_simulate_record_decl (loc, name, fields);
> >>>
> >>> Why fall back to the language-independent function on error?  Is there a
> >>> case where that gives better error recovery than just returning
> >>> error_mark_node?
> >>
> >> I don't think falling back necessarily improves future error messages
> >> (or makes them worse).  The reason was more that the code to handle
> >> target builtins generally expects to be able to create whatever types
> >> and functions it wants.  If we return something unexpected, even if it's
> >> error_mark_node, then there's a higher risk of ICEs later on.
> >>
> >> I guess that's a bit defeatist.  But in practice, the first code
> >> that uses the hook will be code that previously ran at start-up
> >> and so didn't have to worry about these errors.
> >>
> >> In practice I think errors will be extremely rare.
> >>
>  +  xref_basetypes (type, NULL_TREE);
>  +  type = begin_class_definition (type);
>  +  if (type == error_mark_node)
>  +return lhd_simulate_record_decl (loc, name, fields);
>  +
>  +  for (tree field : fields)
>  +finish_member_declaration (field);
>  +
>  +  type = finish_struct (type, NULL_TREE);
>  +
>  +  tree decl = build_decl (loc, TYPE_DECL, ident, type);
>  +  TYPE_NAME (type) = decl;
>  +  TYPE_STUB_DECL (type) = decl;
> >>>
> >>> Setting TYPE_NAME and TYPE_STUB_DECL to the typedef is wrong; it should
> >>> work to just remove these two lines.  I expect they're also wrong for C.
> >>>
> >>> For C++ only, I wonder if you need this typedef at all.
> >>>
> >>> If you do want it, you need to use set_underlying_type to create a real
> >>> typedef.  I expect that's also true for C.
> >>
> >> Ah, yeah, thanks for the pointer.  Fixed in the patch below.
> >>
> >> I wanted the hook to simulate the typedef even for C++ because its
> >> first user will be arm_neon.h.  The spec for arm_neon.h says that the
> >> types must be declared as:
> >>
> >>typedef struct int32x2x4_t { … } int32x2x4_t;
> >>
> >> etc.  So, although it's a silly edge case, code that tries to take
> >> advantage of the struct stat hack, such as:
> >>
> >>#include <arm_neon.h>
> >>struct int32x2x4_t int32x2x4_t = {};
> >>
> >> should continue to be rejected for C++ as well as C.
> >>
> >> Maybe in future we could add a flag to suppress the typedef if some
> >> callers prefer that behaviour.
> >>
> >> Tested as before.
> >
> > Can the C++ hook go in cp-lang.c (which already includes
> > langhooks-def.h) instead of decl.c?  With that change, the patch is OK
> > in a week if nobody else has feedback.
>
> Thanks.  I just tried with that change, but it breaks Objective C++ builds,
> since cp-lang.o isn't linked there.
>

Then it's OK without that change.

Jason
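
The "struct stat hack" discussed above relies on struct tags and ordinary identifiers living in separate name spaces in C. A small C sketch of the two declaration forms follows. The type names pair and quad are made up for illustration and are not taken from arm_neon.h.

```c
#include <assert.h>

/* Tag only: "pair" remains free for use as an ordinary identifier,
   so a variable may share the tag's name.  */
struct pair { int a, b; };
struct pair pair = { 1, 2 };

/* Tag plus typedef of the same name, the form quoted from the
   arm_neon.h spec.  */
typedef struct quad { int v[4]; } quad;

/* With the typedef in scope, the following would be rejected, because
   "quad" would name both a typedef and a variable in the same scope:

     struct quad quad = { { 0 } };  */

static int
pair_sum (void)
{
  return pair.a + pair.b;
}
```

This is why simulating the typedef matters: without it, the commented-out declaration would be accepted, which differs from what the header's documented form allows.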

Re: [PATCH, rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set

2021-11-01 Thread HAO CHEN GUI via Gcc-patches

David,

    My patch file was broken. I am sorry for it.  Here is the correct one. 
Thanks a lot.
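
The patch keys on HONOR_NANS because min/max stop being simple, order-independent operations once NaNs have to be respected. A minimal C sketch of that effect follows, using a hypothetical compare-and-select helper. It only illustrates why the distinction matters. It does not model what xv[min|max][s|d]p or xs[min|max]cdp return for NaN inputs.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: minimum written as a plain compare-and-select.
   Every ordered comparison involving a NaN is false, so the result
   depends on which operand holds the NaN.  */
static double
min_by_compare (double a, double b)
{
  return a < b ? a : b;
}
```

With ordinary numbers the operand order is irrelevant. With one NaN operand, min_by_compare (NAN, 1.0) yields 1.0 while min_by_compare (1.0, NAN) yields NaN, so rewriting a NaN-aware vector operation into such a comparison is only safe when NaNs need not be honored.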

ChangeLog

2021-11-01 Haochen Gui 

gcc/
    * config/rs6000/rs6000-call.c (rs6000_gimple_fold_builtin): Disable
    gimple fold for VSX_BUILTIN_XVMINDP, ALTIVEC_BUILTIN_VMINFP,
    VSX_BUILTIN_XVMAXDP, ALTIVEC_BUILTIN_VMAXFP when fast-math is not
    set.

gcc/testsuite/
    * gcc.target/powerpc/vec-minmax-1.c: New test.
    * gcc.target/powerpc/vec-minmax-2.c: Likewise.


patch.diff

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 7d485480225..a8e193a0089 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -12333,6 +12333,14 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
   return true;
 /* flavors of vec_min.  */
 case VSX_BUILTIN_XVMINDP:
+    case ALTIVEC_BUILTIN_VMINFP:
+  {
+   lhs = gimple_call_lhs (stmt);
+   tree type = TREE_TYPE (lhs);
+   if (HONOR_NANS (type))
+ return false;
+   gcc_fallthrough ();
+  }
 case P8V_BUILTIN_VMINSD:
 case P8V_BUILTIN_VMINUD:
 case ALTIVEC_BUILTIN_VMINSB:
@@ -12341,7 +12349,6 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
 case ALTIVEC_BUILTIN_VMINUB:
 case ALTIVEC_BUILTIN_VMINUH:
 case ALTIVEC_BUILTIN_VMINUW:
-    case ALTIVEC_BUILTIN_VMINFP:
   arg0 = gimple_call_arg (stmt, 0);
   arg1 = gimple_call_arg (stmt, 1);
   lhs = gimple_call_lhs (stmt);
@@ -12351,6 +12358,14 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
   return true;
 /* flavors of vec_max.  */
 case VSX_BUILTIN_XVMAXDP:
+    case ALTIVEC_BUILTIN_VMAXFP:
+  {
+   lhs = gimple_call_lhs (stmt);
+   tree type = TREE_TYPE (lhs);
+   if (HONOR_NANS (type))
+ return false;
+   gcc_fallthrough ();
+  }
 case P8V_BUILTIN_VMAXSD:
 case P8V_BUILTIN_VMAXUD:
 case ALTIVEC_BUILTIN_VMAXSB:
@@ -12359,7 +12374,6 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
 case ALTIVEC_BUILTIN_VMAXUB:
 case ALTIVEC_BUILTIN_VMAXUH:
 case ALTIVEC_BUILTIN_VMAXUW:
-    case ALTIVEC_BUILTIN_VMAXFP:
   arg0 = gimple_call_arg (stmt, 0);
   arg1 = gimple_call_arg (stmt, 1);
   lhs = gimple_call_lhs (stmt);
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c b/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c
new file mode 100644
index 000..e238659c9be
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c
@@ -0,0 +1,52 @@
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power9" } */
+/* { dg-final { scan-assembler-times {\mxvmaxdp\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxvmaxsp\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxvmindp\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxvminsp\M} 1 } } */
+
+/* This test verifies that float or double vec_min/max are bound to
+   xv[min|max][d|s]p instructions when fast-math is not set.  */
+
+
+#include <altivec.h>
+
+#ifdef _BIG_ENDIAN
+   const int PREF_D = 0;
+#else
+   const int PREF_D = 1;
+#endif
+
+double vmaxd (double a, double b)
+{
+  vector double va = vec_promote (a, PREF_D);
+  vector double vb = vec_promote (b, PREF_D);
+  return vec_extract (vec_max (va, vb), PREF_D);
+}
+
+double vmind (double a, double b)
+{
+  vector double va = vec_promote (a, PREF_D);
+  vector double vb = vec_promote (b, PREF_D);
+  return vec_extract (vec_min (va, vb), PREF_D);
+}
+
+#ifdef _BIG_ENDIAN
+   const int PREF_F = 0;
+#else
+   const int PREF_F = 3;
+#endif
+
+float vmaxf (float a, float b)
+{
+  vector float va = vec_promote (a, PREF_F);
+  vector float vb = vec_promote (b, PREF_F);
+  return vec_extract (vec_max (va, vb), PREF_F);
+}
+
+float vminf (float a, float b)
+{
+  vector float va = vec_promote (a, PREF_F);
+  vector float vb = vec_promote (b, PREF_F);
+  return vec_extract (vec_min (va, vb), PREF_F);
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c
new file mode 100644
index 000..149275d8709
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c
@@ -0,0 +1,50 @@
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power9 -ffast-math" } */
+/* { dg-final { scan-assembler-times {\mxsmaxcdp\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mxsmincdp\M} 2 } } */
+
+/* This test verifies that float or double vec_min/max can be converted
+   to scalar comparison when fast-math is set.  */
+
+
+#include <altivec.h>
+
+#ifdef _BIG_ENDIAN
+   const int PREF_D = 0;
+#else
+   const int PREF_D = 1;
+#endif
+
+double vmaxd (double a, double b)
+{
+  vector double va = vec_promote (a, PREF_D);
+  vector double vb = vec_promote (b, PREF_D);
+  return vec_extract (vec_max (va, vb), PREF_D);
+}
+
+double vmind (double a, double b)
+{
+  vector double va = vec_promote (a, PREF_D);
+  vector double vb =

Re: [PATCH] libstdc++: Clear padding bits in atomic compare_exchange

2021-11-01 Thread Thomas Rodgers via Gcc-patches

This should address Jonathan's feedback and adds support for atomic_ref

On Wed, Sep 29, 2021 at 5:14 AM Jonathan Wakely  wrote:

> On Mon, 27 Sept 2021 at 15:11, Thomas Rodgers 
> wrote:
> >
> > From: Thomas Rodgers 
> >
> > Now with checks for __has_builtin(__builtin_clear_padding)
> >
> > This change implements P0528 which requires that padding bits not
> > participate in atomic compare exchange operations. All arguments to the
> > generic template are 'sanitized' by the __builtin_clear_padding intrinsic
> > before they are used in comparisons. This also requires that any stores
> > sanitize the incoming value.
> >
> > Signed-off-by: Thomas Rodgers 
> >
> > libstdc++-v3/ChangeLog:
> >
> > * include/std/atomic (atomic::atomic(_Tp)): Clear padding for
> > __cplusplus > 201703L.
> > (atomic::store()): Clear padding.
> > (atomic::exchange()): Likewise.
> > (atomic::compare_exchange_weak()): Likewise.
> > (atomic::compare_exchange_strong()): Likewise.
>
> Don't we also need this for std::atomic_ref, i.e. for the
> __atomic_impl free functions in ?
>
> There we don't have any distinction between atomic_ref
> and atomic_ref, they both use the same
> implementations. But I think that's OK, as I think the built-in is
> smart enough to be a no-op for types with no padding.
>
> > * testsuite/29_atomics/atomic/compare_exchange_padding.cc: New
> > test.
> > ---
> >  libstdc++-v3/include/std/atomic   | 41 +-
> >  .../atomic/compare_exchange_padding.cc| 42 +++
> >  2 files changed, 81 insertions(+), 2 deletions(-)
> >  create mode 100644
> libstdc++-v3/testsuite/29_atomics/atomic/compare_exchange_padding.cc
> >
> > diff --git a/libstdc++-v3/include/std/atomic
> b/libstdc++-v3/include/std/atomic
> > index 936dd50ba1c..4ac9ccdc1ab 100644
> > --- a/libstdc++-v3/include/std/atomic
> > +++ b/libstdc++-v3/include/std/atomic
> > @@ -228,7 +228,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >atomic& operator=(const atomic&) = delete;
> >atomic& operator=(const atomic&) volatile = delete;
> >
> > -  constexpr atomic(_Tp __i) noexcept : _M_i(__i) { }
> > +#if __cplusplus > 201703L && __has_builtin(__builtin_clear_padding)
> > +  constexpr atomic(_Tp __i) noexcept : _M_i(__i)
> > +  { __builtin_clear_padding(std::__addressof(_M_i)); }
> > +#else
> > +  constexpr atomic(_Tp __i) noexcept : _M_i(__i)
> > +  { }
> > +#endif
>
> Please write this as a single function with the preprocessor
> conditions in the body:
>
>   constexpr atomic(_Tp __i) noexcept : _M_i(__i)
>   {
> #if __cplusplus > 201703L && __has_builtin(__builtin_clear_padding)
>     __builtin_clear_padding(std::__addressof(_M_i));
> #endif
>   }
>
> This not only avoids duplication of the identical parts, but it avoids
> warnings from ld.gold if you use --detect-odr-violations. Otherwise,
> the linker can see a definition of that constructor on two different
> lines (233 and 236), and so warns about possible ODR violations,
> something like "warning: while linking foo: symbol
> 'std::atomic<int>::atomic(int)' defined in multiple places (possible
> ODR violation): ...atomic:233 ... atomic:236"
>
> Can't we clear the padding for >= 201402L instead of only C++20? Only
> C++11 has a problem with the built-in in a constexpr function, right?
> So we can DTRT for C++14 upwards.
>
>
> >
> >operator _Tp() const noexcept
> >{ return load(); }
> > @@ -268,12 +274,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >void
> >store(_Tp __i, memory_order __m = memory_order_seq_cst) noexcept
> >{
> > +#if __has_builtin(__builtin_clear_padding)
> > +   __builtin_clear_padding(std::__addressof(__i));
> > +#endif
>
> We repeat this *a lot*. When I started work on this I defined a
> non-member function in the __atomic_impl namespace:
>
> template<typename _Tp>
>   _GLIBCXX_ALWAYS_INLINE void
>   __clear_padding(_Tp& __val) noexcept
>   {
> #if __has_builtin(__builtin_clear_padding)
>__builtin_clear_padding(std::__addressof(__val));
> #endif
>   }
>
> Then you can just use that everywhere (except the constexpr
> constructor), without all the #if checks.
>
>
>
> > __atomic_store(std::__addressof(_M_i), std::__addressof(__i),
> int(__m));
> >}
> >
> >void
> >store(_Tp __i, memory_order __m = memory_order_seq_cst) volatile
> noexcept
> >{
> > +#if __has_builtin(__builtin_clear_padding)
> > +   __builtin_clear_padding(std::__addressof(__i));
> > +#endif
> > __atomic_store(std::__addressof(_M_i), std::__addressof(__i),
> int(__m));
> >}
> >
> > @@ -300,6 +312,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >{
> >  alignas(_Tp) unsigned char __buf[sizeof(_Tp)];
> > _Tp* __ptr = reinterpret_cast<_Tp*>(__buf);
> > +#if __has_builtin(__builtin_clear_padding)
> > +   __builtin_clear_padding(std::__addressof(__i));
> >
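
The reason padding matters for compare-exchange can be shown at the byte level. The following C sketch is an illustration only. It works on raw object representations and uses hypothetical helpers (store_members, clear_gap, same_bytes) in place of __builtin_clear_padding, which is what the patch actually relies on. It assumes a typical ABI where int is more strictly aligned than char, so that struct padded has padding between its members.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* On typical ABIs there are 3 padding bytes between c and i.  */
struct padded { char c; int i; };

enum { REP_SIZE = sizeof (struct padded) };

/* Write the member values into a raw representation, leaving all
   other bytes (the padding) as they were.  */
static void
store_members (unsigned char *rep, char c, int i)
{
  memcpy (rep + offsetof (struct padded, c), &c, sizeof c);
  memcpy (rep + offsetof (struct padded, i), &i, sizeof i);
}

/* Zero the gap between c and i; a hand-rolled stand-in for the
   sanitizing step described in the patch.  */
static void
clear_gap (unsigned char *rep)
{
  size_t begin = offsetof (struct padded, c) + sizeof (char);
  size_t end = offsetof (struct padded, i);
  memset (rep + begin, 0, end - begin);
}

/* Bytewise equality, which is what a compare-exchange over the whole
   object effectively tests.  */
static int
same_bytes (const unsigned char *a, const unsigned char *b)
{
  return memcmp (a, b, REP_SIZE) == 0;
}
```

Two representations holding equal member values but different padding compare unequal bytewise, and compare equal once both have been passed through clear_gap. That is the situation the patch addresses by sanitizing both the stored value and the compare-exchange arguments.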

Re: [PATCH, rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set

2021-11-01 Thread David Edelsohn via Gcc-patches

Hi, Hao

Neither the inlined patch nor the attached patch seems to contain the
change to rs6000-call.c.  I only see the new testcases.

Please resend the complete patch.

Thanks David

On Mon, Nov 1, 2021 at 2:48 AM HAO CHEN GUI  wrote:
>
> Hi,
>
>   This patch disables gimple folding for VSX_BUILTIN_XVMINDP,
> VSX_BUILTIN_XVMAXDP, ALTIVEC_BUILTIN_VMINFP and ALTIVEC_BUILTIN_VMAXFP when
> fast-math is not set.  With gimple folding enabled, the four built-ins
> are implemented by c-type instructions - xs[min|max]cdp on P9 and P10 - if
> they can be converted to scalar comparisons, whereas on P8 and P7 they are
> implemented by xv[min|max][s|d]p, because P8 and P7 don't have the
> corresponding scalar comparison instructions.  The patch binds these four
> built-ins to xv[min|max][s|d]p when fast-math is not set.  The two new
> test cases illustrate it.
>
>   ALTIVEC_BUILTIN_VMINFP and  ALTIVEC_BUILTIN_VMAXFP are not implemented by 
> vminfp or vmaxfp.
>
> rs6000-builtin.def:BU_ALTIVEC_2 (VMAXFP,  "vmaxfp", CONST, smaxv4sf3)
>
> rs6000-builtin.def:BU_ALTIVEC_2 (VMINFP,  "vminfp", CONST, sminv4sf3)
>
> Bootstrapped and tested on powerpc64le-linux with no regressions. Is this 
> okay for trunk? Any recommendations? Thanks a lot.
>
>
> ChangeLog
>
> 2021-11-01 Haochen Gui 
>
> gcc/
> * config/rs6000/rs6000-call.c (rs6000_gimple_fold_builtin): Disable
> gimple fold for VSX_BUILTIN_XVMINDP, ALTIVEC_BUILTIN_VMINFP,
> VSX_BUILTIN_XVMAXDP, ALTIVEC_BUILTIN_VMAXFP when fast-math is not
> set.
>
> gcc/testsuite/
> * gcc.target/powerpc/vec-minmax-1.c: New test.
> * gcc.target/powerpc/vec-minmax-2.c: Likewise.
>
>
> patch.diff
>
> diff --git a/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c b/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c
> new file mode 100644
> index 000..e238659c9be
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c
> @@ -0,0 +1,52 @@
> +/* { dg-require-effective-target powerpc_p9vector_ok } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power9" } */
> +/* { dg-final { scan-assembler-times {\mxvmaxdp\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxvmaxsp\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxvmindp\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxvminsp\M} 1 } } */
> +
> +/* This test verifies that float or double vec_min/max are bound to
> +   xv[min|max][d|s]p instructions when fast-math is not set.  */
> +
> +
> +#include <altivec.h>
> +
> +#ifdef _BIG_ENDIAN
> +   const int PREF_D = 0;
> +#else
> +   const int PREF_D = 1;
> +#endif
> +
> +double vmaxd (double a, double b)
> +{
> +  vector double va = vec_promote (a, PREF_D);
> +  vector double vb = vec_promote (b, PREF_D);
> +  return vec_extract (vec_max (va, vb), PREF_D);
> +}
> +
> +double vmind (double a, double b)
> +{
> +  vector double va = vec_promote (a, PREF_D);
> +  vector double vb = vec_promote (b, PREF_D);
> +  return vec_extract (vec_min (va, vb), PREF_D);
> +}
> +
> +#ifdef _BIG_ENDIAN
> +   const int PREF_F = 0;
> +#else
> +   const int PREF_F = 3;
> +#endif
> +
> +float vmaxf (float a, float b)
> +{
> +  vector float va = vec_promote (a, PREF_F);
> +  vector float vb = vec_promote (b, PREF_F);
> +  return vec_extract (vec_max (va, vb), PREF_F);
> +}
> +
> +float vminf (float a, float b)
> +{
> +  vector float va = vec_promote (a, PREF_F);
> +  vector float vb = vec_promote (b, PREF_F);
> +  return vec_extract (vec_min (va, vb), PREF_F);
> +}
> diff --git a/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c
> new file mode 100644
> index 000..149275d8709
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c
> @@ -0,0 +1,50 @@
> +/* { dg-require-effective-target powerpc_p9vector_ok } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power9 -ffast-math" } */
> +/* { dg-final { scan-assembler-times {\mxsmaxcdp\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mxsmincdp\M} 2 } } */
> +
> +/* This test verifies that float or double vec_min/max can be converted
> +   to scalar comparison when fast-math is set.  */
> +
> +
> +#include <altivec.h>
> +
> +#ifdef _BIG_ENDIAN
> +   const int PREF_D = 0;
> +#else
> +   const int PREF_D = 1;
> +#endif
> +
> +double vmaxd (double a, double b)
> +{
> +  vector double va = vec_promote (a, PREF_D);
> +  vector double vb = vec_promote (b, PREF_D);
> +  return vec_extract (vec_max (va, vb), PREF_D);
> +}
> +
> +double vmind (double a, double b)
> +{
> +  vector double va = vec_promote (a, PREF_D);
> +  vector double vb = vec_promote (b, PREF_D);
> +  return vec_extract (vec_min (va, vb), PREF_D);
> +}
> +
> +#ifdef _BIG_ENDIAN
> +   const int PREF_F = 0;
> +#else
> +   const int PREF_F = 3;
> +#endif
> +
> +float vmaxf (float a, float b)
> +{
> +  vector float va = vec_promote (a, PREF_F);
> +  vector float vb = vec_promote (b, PREF_F);
> +  return vec_extract (vec_max (va, vb), PREF_F);
> +}
> +

[PATCH 5/5] Fortran manual: Remove old docs for never-implemented extensions.

2021-11-01 Thread Sandra Loosemore

2021-11-01  Sandra Loosemore  

gcc/fortran/
* gfortran.texi (Projects): Add bullet for helping with
incomplete standards compliance.
(Proposed Extensions): Delete section.
---
 gcc/fortran/gfortran.texi | 92 ---
 1 file changed, 7 insertions(+), 85 deletions(-)

diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index e231e74..f3a961e 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -5397,7 +5397,6 @@ but they are also things doable by someone who is willing and able.
 @menu
 * Contributors::
 * Projects::
-* Proposed Extensions::
 @end menu
 
 
@@ -5491,91 +5490,14 @@ isolating them.  Going through the bugzilla database at
 add more information (for example, for which version does the testcase
 work, for which versions does it fail?) is also very helpful.
 
-@end table
-
-
-@node Proposed Extensions
-@section Proposed Extensions
-
-Here's a list of proposed extensions for the GNU Fortran compiler, in no particular
-order.  Most of these are necessary to be fully compatible with
-existing Fortran compilers, but they are not part of the official
-J3 Fortran 95 standard.
-
-@subsection Compiler extensions:
-@itemize @bullet
-@item
-User-specified alignment rules for structures.
-
-@item
-Automatically extend single precision constants to double.
-
-@item
-Compile code that conserves memory by dynamically allocating common and
-module storage either on stack or heap.
-
-@item
-Compile flag to generate code for array conformance checking (suggest -CC).
-
-@item
-User control of symbol names (underscores, etc).
-
-@item
-Compile setting for maximum size of stack frame size before spilling
-parts to static or heap.
-
-@item
-Flag to force local variables into static space.
-
-@item
-Flag to force local variables onto stack.
-@end itemize
-
-
-@subsection Environment Options
-@itemize @bullet
-@item
-Pluggable library modules for random numbers, linear algebra.
-LA should use BLAS calling conventions.
+@item Missing features
+For a larger project, consider working on the missing features required for
+Fortran language standards compliance (@pxref{Standards}), or contributing
+to the implementation of extensions such as OpenMP (@pxref{OpenMP}) or
+OpenACC (@pxref{OpenACC}) that are under active development.  Again,
+contributing test cases for these features is useful too!
 
-@item
-Environment variables controlling actions on arithmetic exceptions like
-overflow, underflow, precision loss---Generate NaN, abort, default.
-action.
-
-@item
-Set precision for fp units that support it (i387).
-
-@item
-Variable for setting fp rounding mode.
-
-@item
-Variable to fill uninitialized variables with a user-defined bit
-pattern.
-
-@item
-Environment variable controlling filename that is opened for that unit
-number.
-
-@item
-Environment variable to clear/trash memory being freed.
-
-@item
-Environment variable to control tracing of allocations and frees.
-
-@item
-Environment variable to display allocated memory at normal program end.
-
-@item
-Environment variable for filename for * IO-unit.
-
-@item
-Environment variable for temporary file directory.
-
-@item
-Environment variable forcing standard output to be line buffered (Unix).
-
-@end itemize
+@end table
 
 
 @c -
-- 
2.8.1

[PATCH 4/5] Fortran manual: Update miscellaneous references to old standard versions.

2021-11-01 Thread Sandra Loosemore

2021-11-01  Sandra Loosemore  

gcc/fortran/
* intrinsic.texi (Introduction to Intrinsics): Genericize
references to standard versions.
* invoke.texi (-fall-intrinsics): Likewise.
(-fmax-identifier-length=): Likewise.
---
 gcc/fortran/intrinsic.texi | 15 ++-
 gcc/fortran/invoke.texi|  4 ++--
 2 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/gcc/fortran/intrinsic.texi b/gcc/fortran/intrinsic.texi
index 6f7008a..9201c38 100644
--- a/gcc/fortran/intrinsic.texi
+++ b/gcc/fortran/intrinsic.texi
@@ -329,14 +329,11 @@ Some basic guidelines for editing this document:
 @node Introduction to Intrinsics
 @section Introduction to intrinsic procedures
 
-The intrinsic procedures provided by GNU Fortran include all of the
-intrinsic procedures required by the Fortran 95 standard, a set of
-intrinsic procedures for backwards compatibility with G77, and a
-selection of intrinsic procedures from the Fortran 2003 and Fortran 2008
-standards.  Any conflict between a description here and a description in
-either the Fortran 95 standard, the Fortran 2003 standard or the Fortran
-2008 standard is unintentional, and the standard(s) should be considered
-authoritative.
+The intrinsic procedures provided by GNU Fortran include procedures required
+by the Fortran 95 and later supported standards, and a set of intrinsic
+procedures for backwards compatibility with G77.  Any conflict between
+a description here and a description in the Fortran standards is
+unintentional, and the standard(s) should be considered authoritative.
 
 The enumeration of the @code{KIND} type parameter is processor defined in
 the Fortran 95 standard.  GNU Fortran defines the default integer type and
@@ -355,7 +352,7 @@ Many of the intrinsic procedures take one or more optional arguments.
 This document follows the convention used in the Fortran 95 standard,
 and denotes such arguments by square brackets.
 
-GNU Fortran offers the @option{-std=f95} and @option{-std=gnu} options,
+GNU Fortran offers the @option{-std=} command-line option,
 which can be used to restrict the set of intrinsic procedures to a 
 given standard.  By default, @command{gfortran} sets the @option{-std=gnu}
 option, and so all intrinsic procedures described here are accepted.  There
diff --git a/gcc/fortran/invoke.texi b/gcc/fortran/invoke.texi
index 3533e86..e9fb792 100644
--- a/gcc/fortran/invoke.texi
+++ b/gcc/fortran/invoke.texi
@@ -227,7 +227,7 @@ form is determined by the file extension.
 @item -fall-intrinsics
 @opindex @code{fall-intrinsics}
 This option causes all intrinsic procedures (including the GNU-specific
-extensions) to be accepted.  This can be useful with @option{-std=f95} to
+extensions) to be accepted.  This can be useful with @option{-std=} to
 force standard-compliance but get access to the full range of intrinsics
 available with @command{gfortran}.  As a consequence, @option{-Wintrinsics-std}
 will be ignored and no user-defined procedure with the same name as any
@@ -397,7 +397,7 @@ lines in the source file. The default value is 132.
 @item -fmax-identifier-length=@var{n}
 @opindex @code{fmax-identifier-length=}@var{n}
 Specify the maximum allowed identifier length. Typical values are
-31 (Fortran 95) and 63 (Fortran 2003 and Fortran 2008).
+31 (Fortran 95) and 63 (Fortran 2003 and later).
 
 @item -fimplicit-none
 @opindex @code{fimplicit-none}
-- 
2.8.1

[PATCH 3/5] Fortran manual: Update section on Interoperability with C

2021-11-01 Thread Sandra Loosemore

2021-11-01  Sandra Loosemore  

gcc/fortran/
* gfortran.texi (Interoperability with C): Copy-editing.  Add
more index entries.
(Intrinsic Types): Likewise.
(Derived Types and struct): Likewise.
(Interoperable Global Variables): Likewise.
(Interoperable Subroutines and Functions): Likewise.
(Working with C Pointers): Likewise.
(Further Interoperability of Fortran with C): Likewise.  Rewrite
to reflect that this is now fully supported by gfortran.
---
 gcc/fortran/gfortran.texi | 170 +++---
 1 file changed, 69 insertions(+), 101 deletions(-)

diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index ba5db57..e231e74 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -2726,20 +2726,22 @@ and their use is highly recommended.
 
 @node Interoperability with C
 @section Interoperability with C
+@cindex interoperability with C
+@cindex C interoperability
 
 @menu
 * Intrinsic Types::
 * Derived Types and struct::
 * Interoperable Global Variables::
 * Interoperable Subroutines and Functions::
-* Working with Pointers::
+* Working with C Pointers::
 * Further Interoperability of Fortran with C::
 @end menu
 
 Since Fortran 2003 (ISO/IEC 1539-1:2004(E)) there is a
 standardized way to generate procedure and derived-type
-declarations and global variables which are interoperable with C
-(ISO/IEC 9899:1999).  The @code{bind(C)} attribute has been added
+declarations and global variables that are interoperable with C
+(ISO/IEC 9899:1999).  The @code{BIND(C)} attribute has been added
 to inform the compiler that a symbol shall be interoperable with C;
 also, some constraints are added.  Note, however, that not
 all C features have a Fortran equivalent or vice versa.  For instance,
@@ -2755,12 +2757,16 @@ assuming @math{i < n}) in memory is @code{A(i+1,j)} (C: @code{A[j-1][i]}).
 
 @node Intrinsic Types
 @subsection Intrinsic Types
+@cindex C intrinsic type interoperability
+@cindex intrinsic type interoperability with C
+@cindex interoperability, intrinsic type
 
 In order to ensure that exactly the same variable type and kind is used
-in C and Fortran, the named constants shall be used which are defined in the
-@code{ISO_C_BINDING} intrinsic module.  That module contains named constants
-for kind parameters and character named constants for the escape sequences
-in C.  For a list of the constants, see @ref{ISO_C_BINDING}.
+in C and Fortran, you should use the named constants for kind parameters
+that are defined in the @code{ISO_C_BINDING} intrinsic module.
+That module contains named constants of character type representing
+the escaped special characters in C, such as newline.
+For a list of the constants, see @ref{ISO_C_BINDING}.
 
 For logical types, please note that the Fortran standard only guarantees
 interoperability between C99's @code{_Bool} and Fortran's @code{C_Bool}-kind
@@ -2770,12 +2776,13 @@ the value 0.  Using any other integer value with GNU Fortran's @code{LOGICAL}
 values than 0 and 1 to GCC's @code{_Bool} is also undefined, unless the
 integer is explicitly or implicitly casted to @code{_Bool}.)
 
-
-
 @node Derived Types and struct
 @subsection Derived Types and struct
+@cindex C derived type and struct interoperability
+@cindex derived type interoperability with C
+@cindex interoperability, derived type and struct
 
-For compatibility of derived types with @code{struct}, one needs to use
+For compatibility of derived types with @code{struct}, use
 the @code{BIND(C)} attribute in the type declaration.  For instance, the
 following type declaration
 
@@ -2790,6 +2797,7 @@ following type declaration
  END TYPE
 @end smallexample
 
+@noindent
 matches the following @code{struct} declaration in C
 
 @smallexample
@@ -2814,6 +2822,9 @@ with bit field or variable-length array members are interoperable.
 
 @node Interoperable Global Variables
 @subsection Interoperable Global Variables
+@cindex C variable interoperability
+@cindex variable interoperability with C
+@cindex interoperability, variable
 
 Variables can be made accessible from C using the C binding attribute,
 optionally together with specifying a binding name.  Those variables
@@ -2841,17 +2852,18 @@ a macro.  Use the @code{IERRNO} intrinsic (GNU extension) instead.
 
 @node Interoperable Subroutines and Functions
 @subsection Interoperable Subroutines and Functions
+@cindex C procedure interoperability
+@cindex procedure interoperability with C
+@cindex function interoperability with C
+@cindex subroutine interoperability with C
+@cindex interoperability, subroutine and function
 
 Subroutines and functions have to have the @code{BIND(C)} attribute to
 be compatible with C.  The dummy argument declaration is relatively
 straightforward.  However, one needs to be careful because C uses
 call-by-value by default while Fortran behaves usually similar to
 call-by-reference.

[PATCH 2/5] Fortran manual: Revise introductory chapter.

2021-11-01 Thread Sandra Loosemore

Fix various bit-rot in the discussion of standards conformance, remove
material that is only of historical interest, copy-editing.  Also move
discussion of preprocessing out of the introductory chapter.

2021-11-01  Sandra Loosemore  

gcc/fortran/
* gfortran.texi (About GNU Fortran): Consolidate material
formerly in other sections.  Copy-editing.
(Preprocessing and conditional compilation): Delete, moving
most material to invoke.texi.
(GNU Fortran and G77): Delete.
(Project Status): Delete.
(Standards): Update.
(Fortran 95 status): Mention conditional compilation here.
(Fortran 2003 status): Rewrite to mention the 1 missing feature
instead of all the ones implemented.
(Fortran 2008 status): Similarly for the 2 missing features.
(Fortran 2018 status): Rewrite to reflect completion of TS29113
feature support.
* invoke.texi (Preprocessing Options): Move material formerly
in introductory chapter here.
---
 gcc/fortran/gfortran.texi | 627 +-
 gcc/fortran/invoke.texi   |  44 +++-
 2 files changed, 160 insertions(+), 511 deletions(-)

diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 26cf44f..ba5db57 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -219,17 +219,9 @@ compiler.
 @end ifset
 @end iftex
 
-The GNU Fortran compiler front end was
-designed initially as a free replacement for,
-or alternative to, the Unix @command{f95} command;
-@command{gfortran} is the command you will use to invoke the compiler.
-
 @menu
 * About GNU Fortran::What you should know about the GNU Fortran compiler.
 * GNU Fortran and GCC::  You can compile Fortran, C, or other programs.
-* Preprocessing and conditional compilation:: The Fortran preprocessor
-* GNU Fortran and G77::  Why we chose to start from scratch.
-* Project Status::   Status of GNU Fortran, roadmap, proposed extensions.
 * Standards::Standards supported by GNU Fortran.
 @end menu
 
@@ -241,46 +233,67 @@ or alternative to, the Unix @command{f95} command;
 @node About GNU Fortran
 @section About GNU Fortran
 
-The GNU Fortran compiler supports the Fortran 77, 90 and 95 standards
-completely, parts of the Fortran 2003, 2008 and 2018 standards, and
-several vendor extensions.  The development goal is to provide the
-following features:
+The GNU Fortran compiler is the successor to @command{g77}, the
+Fortran 77 front end included in GCC prior to version 4 (released in
+2005).  While it is backward-compatible with most @command{g77}
+extensions and command-line options, @command{gfortran} is a completely new
+implementation designed to support more modern dialects of Fortran.
+GNU Fortran implements the Fortran 77, 90 and 95 standards
+completely, most of the Fortran 2003 and 2008 standards, and some
+features from the 2018 standard.  It also implements several extensions
+including OpenMP and OpenACC support for parallel programming.
+
+The GNU Fortran compiler passes the
+@uref{http://www.fortran-2000.com/ArnaudRecipes/fcvs21_f95.html,
+NIST Fortran 77 Test Suite}, and produces acceptable results on the
+@uref{http://www.netlib.org/lapack/faq.html#1.21, LAPACK Test Suite}.
+It also provides respectable performance on
+the @uref{https://polyhedron.com/?page_id=175,
+Polyhedron Fortran compiler benchmarks} and the
+@uref{http://www.netlib.org/benchmark/livermore,
+Livermore Fortran Kernels test}.  It has been used to compile a number of
+large real-world programs, including
+@uref{http://hirlam.org/, the HARMONIE and HIRLAM weather forecasting code} and
+@uref{https://github.com/dylan-jayatilaka/tonto,
+the Tonto quantum chemistry package}; see
+@url{https://gcc.gnu.org/@/wiki/@/GfortranApps} for an extended list.
+
+GNU Fortran provides the following functionality:
 
 @itemize @bullet
 @item
-Read a user's program, stored in a file and containing instructions
-written in Fortran 77, Fortran 90, Fortran 95, Fortran 2003, Fortran
-2008 or Fortran 2018.  This file contains @dfn{source code}.
+Read a program, stored in a file and containing @dfn{source code}
+instructions written in Fortran 77.
 
 @item
-Translate the user's program into instructions a computer
+Translate the program into instructions a computer
 can carry out more quickly than it takes to translate the
-instructions in the first
-place.  The result after compilation of a program is
+original Fortran instructions.
+The result after compilation of a program is
 @dfn{machine code},
-code designed to be efficiently translated and processed
+which is efficiently translated and processed
 by a machine such as your computer.
 Humans usually are not as good writing machine code
 as they are at writing Fortran (or C++, Ada, or Java),
 because it is easy to make tiny mistakes writing machine code.
 
 @item
-Provide the user with information about the reasons why
-the compiler is

[PATCH 1/5] Fortran manual: Combine standard conformance docs in one place.

2021-11-01 Thread Sandra Loosemore

Discussion of conformance with various revisions of the
Fortran standard was split between two separate parts of the
manual.  This patch moves it all to the introductory chapter.

2021-11-01  Sandra Loosemore  

gcc/fortran/
* gfortran.texi (Standards): Move discussion of specific
standard versions here
(Fortran standards status): ...from here, and delete this node.
---
 gcc/fortran/gfortran.texi | 508 +++---
 1 file changed, 250 insertions(+), 258 deletions(-)

diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 0ace382..26cf44f 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -180,7 +180,6 @@ Part I: Invoking GNU Fortran
 * Runtime::  Influencing runtime behavior with environment variables.
 
 Part II: Language Reference
-* Fortran standards status::  Fortran 2003, 2008 and 2018 features supported by GNU Fortran.
 * Compiler Characteristics::  User-visible implementation details.
 * Extensions::Language extensions implemented by GNU Fortran.
 * Mixed-Language Programming::Interoperability with C
@@ -524,7 +523,10 @@ Fortran 2008 and Fortran 2018.
 @cindex Standards
 
 @menu
-* Varying Length Character Strings::
+* Fortran 95 status::
+* Fortran 2003 status::
+* Fortran 2008 status::
+* Fortran 2018 status::
 @end menu
 
 The GNU Fortran compiler implements
@@ -547,8 +549,8 @@ There also is support for the OpenACC specification (targeting
 version 2.6, @uref{http://www.openacc.org/}).  See
 @uref{https://gcc.gnu.org/wiki/OpenACC} for more information.
 
-@node Varying Length Character Strings
-@subsection Varying Length Character Strings
+@node Fortran 95 status
+@subsection Fortran 95 status
 @cindex Varying length character strings
 @cindex Varying length strings
 @cindex strings, varying length
@@ -565,257 +567,8 @@ the features of @code{ISO_VARYING_STRING} and should be considered as
 replacement. (Namely, allocatable or pointers of the type
 @code{character(len=:)}.)
 
-
-@c =
-@c PART I: INVOCATION REFERENCE
-@c =
-
-@tex
-\part{I}{Invoking GNU Fortran}
-@end tex
-
-@c -
-@c Compiler Options
-@c -
-
-@include invoke.texi
-
-
-@c -
-@c Runtime
-@c -
-
-@node Runtime
-@chapter Runtime:  Influencing runtime behavior with environment variables
-@cindex environment variable
-
-The behavior of the @command{gfortran} can be influenced by
-environment variables.
-
-Malformed environment variables are silently ignored.
-
-@menu
-* TMPDIR:: Directory for scratch files
-* GFORTRAN_STDIN_UNIT:: Unit number for standard input
-* GFORTRAN_STDOUT_UNIT:: Unit number for standard output
-* GFORTRAN_STDERR_UNIT:: Unit number for standard error
-* GFORTRAN_UNBUFFERED_ALL:: Do not buffer I/O for all units
-* GFORTRAN_UNBUFFERED_PRECONNECTED:: Do not buffer I/O for preconnected units.
-* GFORTRAN_SHOW_LOCUS::  Show location for runtime errors
-* GFORTRAN_OPTIONAL_PLUS:: Print leading + where permitted
-* GFORTRAN_LIST_SEPARATOR::  Separator for list output
-* GFORTRAN_CONVERT_UNIT::  Set endianness for unformatted I/O
-* GFORTRAN_ERROR_BACKTRACE:: Show backtrace on run-time errors
-* GFORTRAN_FORMATTED_BUFFER_SIZE:: Buffer size for formatted files
-* GFORTRAN_UNFORMATTED_BUFFER_SIZE:: Buffer size for unformatted files
-@end menu
-
-@node TMPDIR
-@section @env{TMPDIR}---Directory for scratch files
-
-When opening a file with @code{STATUS='SCRATCH'}, GNU Fortran tries to
-create the file in one of the potential directories by testing each
-directory in the order below.
-
-@enumerate
-@item
-The environment variable @env{TMPDIR}, if it exists.
-
-@item
-On the MinGW target, the directory returned by the @code{GetTempPath}
-function. Alternatively, on the Cygwin target, the @env{TMP} and
-@env{TEMP} environment variables, if they exist, in that order.
-
-@item
-The @code{P_tmpdir} macro if it is defined, otherwise the directory
-@file{/tmp}.
-@end enumerate
-
-@node GFORTRAN_STDIN_UNIT
-@section @env{GFORTRAN_STDIN_UNIT}---Unit number for standard input
-
-This environment variable can be used to select the unit number
-preconnected to standard input.  This must be a positive integer.
-The default value is 5.
-
-@node GFORTRAN_STDOUT_UNIT
-@section @env{GFORTRAN_STDOUT_UNIT}---Unit number for standard output
-
-This environment variable can be used to select the unit number
-preconnected to standard output.  This must be a positive integer.
-The default value is 6.
-
-@node GFORTRAN_STDERR_UNIT
-@section @env{GFORTRAN_STDERR_UNIT}---Unit number for

[PATCH 0/5] Fortran manual updates

2021-11-01 Thread Sandra Loosemore

This series of patches addresses some areas of bit-rot in the GNU
Fortran manual, mainly relating to the state of standard compliance
and the recently-completed TS29113 work.  I also removed some material
that is primarily of historical interest; given that gfortran replaced
g77 almost 17 years ago now, relatively few users will be interested
in why or how they differ any more, for instance.  And I don't see the
point in the manual documenting extensions that aren't actually
implemented.  I also fixed some organization and copy-editing issues I
noticed while working on the relevant sections, although that wasn't
my primary purpose in this patch set.  The whole manual needs a lot
more of that, but with documentation it's fine to do incremental
improvements.

I'll wait a couple days before committing these patches, in case
anybody wants to give some feedback, especially on technical issues.

-Sandra


Sandra Loosemore (5):
  Fortran manual: Combine standard conformance docs in one place.
  Fortran manual: Revise introductory chapter.
  Fortran manual: Update section on Interoperability with C
  Fortran manual: Update miscellaneous references to old standard
versions.
  Fortran manual: Remove old docs for never-implemented extensions.

 gcc/fortran/gfortran.texi  | 985 +++--
 gcc/fortran/intrinsic.texi |  15 +-
 gcc/fortran/invoke.texi|  48 ++-
 3 files changed, 288 insertions(+), 760 deletions(-)

-- 
2.8.1

[committed] libstdc++: Missing constexpr for __gnu_debug::__valid_range etc

2021-11-01 Thread Jonathan Wakely via Gcc-patches

Tested powerpc64le-linux, pushed to trunk.


The new 25_algorithms/move/constexpr.cc test fails in debug mode,
because the debug assertions use the non-constexpr overloads in
<debug/stl_iterator.h>.

libstdc++-v3/ChangeLog:

* include/debug/stl_iterator.h (__valid_range): Add constexpr
for C++20. Qualify call to avoid ADL.
(__get_distance, __can_advance, __unsafe, __base): Likewise.
* testsuite/25_algorithms/move/constexpr.cc: Also check with
std::reverse_iterator arguments.
---
 libstdc++-v3/include/debug/stl_iterator.h | 32 ++-
 .../testsuite/25_algorithms/move/constexpr.cc | 11 +++
 2 files changed, 35 insertions(+), 8 deletions(-)

diff --git a/libstdc++-v3/include/debug/stl_iterator.h b/libstdc++-v3/include/debug/stl_iterator.h
index edeb42ebe98..54f7d42b074 100644
--- a/libstdc++-v3/include/debug/stl_iterator.h
+++ b/libstdc++-v3/include/debug/stl_iterator.h
@@ -35,31 +35,38 @@ namespace __gnu_debug
 {
   // Help Debug mode to see through reverse_iterator.
   template
+_GLIBCXX20_CONSTEXPR
 inline bool
 __valid_range(const std::reverse_iterator<_Iterator>& __first,
  const std::reverse_iterator<_Iterator>& __last,
  typename _Distance_traits<_Iterator>::__type& __dist)
-{ return __valid_range(__last.base(), __first.base(), __dist); }
+{
+  return __gnu_debug::__valid_range(__last.base(), __first.base(), __dist);
+}
 
   template
+_GLIBCXX20_CONSTEXPR
 inline typename _Distance_traits<_Iterator>::__type
 __get_distance(const std::reverse_iterator<_Iterator>& __first,
   const std::reverse_iterator<_Iterator>& __last)
-{ return __get_distance(__last.base(), __first.base()); }
+{ return __gnu_debug::__get_distance(__last.base(), __first.base()); }
 
   template
+_GLIBCXX20_CONSTEXPR
 inline bool
 __can_advance(const std::reverse_iterator<_Iterator>& __it, _Size __n)
-{ return __can_advance(__it.base(), -__n); }
+{ return __gnu_debug::__can_advance(__it.base(), -__n); }
 
   template
+_GLIBCXX20_CONSTEXPR
 inline bool
 __can_advance(const std::reverse_iterator<_Iterator>& __it,
  const std::pair<_Diff, _Distance_precision>& __dist,
  int __way)
-{ return __can_advance(__it.base(), __dist, -__way); }
+{ return __gnu_debug::__can_advance(__it.base(), __dist, -__way); }
 
   template
+_GLIBCXX20_CONSTEXPR
 inline std::reverse_iterator<_Iterator>
 __base(const std::reverse_iterator<_Safe_iterator<
 _Iterator, _Sequence, std::random_access_iterator_tag> >& __it)
@@ -82,6 +89,7 @@ namespace __gnu_debug
 }
 #else
   template
+_GLIBCXX20_CONSTEXPR
 inline auto
 __unsafe(const std::reverse_iterator<_Iterator>& __it)
 -> decltype(std::__make_reverse_iterator(__unsafe(__it.base(
@@ -91,37 +99,45 @@ namespace __gnu_debug
 #if __cplusplus >= 201103L
   // Help Debug mode to see through move_iterator.
   template
+_GLIBCXX20_CONSTEXPR
 inline bool
 __valid_range(const std::move_iterator<_Iterator>& __first,
  const std::move_iterator<_Iterator>& __last,
  typename _Distance_traits<_Iterator>::__type& __dist)
-{ return __valid_range(__first.base(), __last.base(), __dist); }
+{
+  return __gnu_debug::__valid_range(__first.base(), __last.base(), __dist);
+}
 
   template
+_GLIBCXX20_CONSTEXPR
 inline typename _Distance_traits<_Iterator>::__type
 __get_distance(const std::move_iterator<_Iterator>& __first,
   const std::move_iterator<_Iterator>& __last)
-{ return __get_distance(__first.base(), __last.base()); }
+{ return __gnu_debug::__get_distance(__first.base(), __last.base()); }
 
   template
+_GLIBCXX20_CONSTEXPR
 inline bool
 __can_advance(const std::move_iterator<_Iterator>& __it, _Size __n)
-{ return __can_advance(__it.base(), __n); }
+{ return __gnu_debug::__can_advance(__it.base(), __n); }
 
   template
+_GLIBCXX20_CONSTEXPR
 inline bool
 __can_advance(const std::move_iterator<_Iterator>& __it,
  const std::pair<_Diff, _Distance_precision>& __dist,
  int __way)
-{ return __can_advance(__it.base(), __dist, __way); }
+{ return __gnu_debug::__can_advance(__it.base(), __dist, __way); }
 
   template
+_GLIBCXX20_CONSTEXPR
 inline auto
 __unsafe(const std::move_iterator<_Iterator>& __it)
 -> decltype(std::make_move_iterator(__unsafe(__it.base(
 { return std::make_move_iterator(__unsafe(__it.base())); }
 
   template
+_GLIBCXX20_CONSTEXPR
 inline auto
 __base(const std::move_iterator<_Iterator>& __it)
 -> decltype(std::make_move_iterator(__base(__it.base(
diff --git a/libstdc++-v3/testsuite/25_algorithms/move/constexpr.cc b/libstdc++-v3/testsuite/25_algorithms/move/constexpr.cc
index 773c55cfb50..eb1f3b17e72 100644
---

[committed] libstdc++: Reorder constraints on std::span::span(Range&&) constructor.

2021-11-01 Thread Jonathan Wakely via Gcc-patches

Tested powerpc64le-linux, pushed to trunk.


In PR libstdc++/103013 Tim Song pointed out that we could reorder the
constraints of this constructor. That's worth doing just to reduce the
work the compiler has to do during overload resolution, even if it isn't
needed to make the code in the PR work.

libstdc++-v3/ChangeLog:

* include/std/span (span(Range&&)): Reorder constraints.
---
 libstdc++-v3/include/std/span | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/std/span b/libstdc++-v3/include/std/span
index 61824dee845..0898ea85c50 100644
--- a/libstdc++-v3/include/std/span
+++ b/libstdc++-v3/include/std/span
@@ -201,11 +201,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
{ }
 
   template
-   requires ranges::contiguous_range<_Range> && ranges::sized_range<_Range>
- && (ranges::borrowed_range<_Range> || is_const_v)
- && (!__detail::__is_span>)
+   requires (!__detail::__is_span>)
  && (!__detail::__is_std_array>)
  && (!is_array_v>)
+ && ranges::contiguous_range<_Range> && ranges::sized_range<_Range>
+ && (ranges::borrowed_range<_Range> || is_const_v)
  && __is_compatible_ref>::value
constexpr explicit(extent != dynamic_extent)
span(_Range&& __range)
-- 
2.31.1

Re: [PATCH] attribs: Allow optional second arg for attr deprecated [PR102049]

2021-11-01 Thread Martin Sebor via Gcc-patches


On 10/11/21 9:17 AM, Marek Polacek via Gcc-patches wrote:

Any thoughts?


I'm a little unsure.  Clang just uses the replacement string
as the text of the fix-it note as is, so it does nothing to
help programmers make sure the replacement is in sync with
what it's supposed to replace.  E.g., here is a test case
followed by Clang's output for it:

__attribute__ ((deprecated ("foo is bad", "use bar instead")))
void foo (void);
void baz (void) { foo (); }

int bar;

:2:19: warning: 'foo' is deprecated: foo is bad [-Wdeprecated-declarations]

void baz (void) { foo (); }
  ^~~
  use bar instead

Since bar is a variable it's hard to see how it might be used
instead of the function foo().  Fix-its, as I understand them,
are meant not just as a visual clue but also to let IDEs and
other tools automatically apply the fixes.  With buggy fix-its
this obviously wouldn't work.

I think the replacement would be useful if it had to reference
an existing symbol of the same kind, and if the compiler helped
enforce it.  Otherwise it seems like a recipe for bit rot and
for things/tools not working well together.

Martin



On Thu, Sep 23, 2021 at 12:16:36PM -0400, Marek Polacek via Gcc-patches wrote:

Clang implements something we don't have:

__attribute__((deprecated("message", "replacement")));

which seems pretty neat so I wrote this patch to add it to gcc.

It doesn't allow the optional second argument in the standard [[]]
form so as not to clash with possible future standard additions.
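
To make the difference between the two forms concrete, here is a minimal
sketch.  It uses only the existing one-argument GNU form, which GCC and
Clang both accept today; the two-argument form from this patch appears
only in a comment because it is a proposal under review.  The function
names old_sum, new_sum and check_sums are hypothetical and exist only
for illustration.

```c
#include <assert.h>

/* Existing GNU form: a single message argument.  Calling old_sum
   elsewhere would trigger -Wdeprecated-declarations, so this sketch
   deliberately never calls it.  */
__attribute__ ((deprecated ("old_sum is deprecated")))
int old_sum (int a, int b);

/* Proposed form from this thread (not compiled here, since it needs
   the patch or Clang's two-argument attribute):

     __attribute__ ((deprecated ("old_sum is deprecated", "new_sum")))
     int old_sum (int a, int b);

   The second string names the suggested replacement.  */

/* The replacement that callers are expected to migrate to.  */
int new_sum (int a, int b)
{
  return a + b;
}

/* Small self-check so the sketch does something when run.  */
void check_sums (void)
{
  assert (new_sum (2, 3) == 5);
  assert (new_sum (-1, 1) == 0);
}
```

Note that nothing in either form ties the replacement string to a real
declaration; the string is free text as far as the attribute goes.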

I had hoped we could print a nice fix-it replacement hint, but that
won't be possible until warn_deprecated_use gets something better than
input_location.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/102049

gcc/c-family/ChangeLog:

* c-attribs.c (c_common_attribute_table): Increase max_len for
deprecated.
(handle_deprecated_attribute): Allow an optional second argument
in the GNU form of attribute deprecated.

gcc/c/ChangeLog:

* c-parser.c (c_parser_std_attribute): Give a diagnostic when
the standard form of an attribute deprecated has a second argument.

gcc/ChangeLog:

* doc/extend.texi: Document attribute deprecated with an
optional second argument.
* tree.c (warn_deprecated_use): Print the replacement argument,
if any.

gcc/testsuite/ChangeLog:

* gcc.dg/c2x-attr-deprecated-3.c: Adjust dg-error.
* c-c++-common/Wdeprecated-arg-1.c: New test.
---
  gcc/c-family/c-attribs.c  | 17 -
  gcc/c/c-parser.c  |  8 ++
  gcc/doc/extend.texi   | 24 ++
  .../c-c++-common/Wdeprecated-arg-1.c  | 21 
  gcc/testsuite/gcc.dg/c2x-attr-deprecated-3.c  |  2 +-
  gcc/tree.c| 25 +++
  6 files changed, 90 insertions(+), 7 deletions(-)
  create mode 100644 gcc/testsuite/c-c++-common/Wdeprecated-arg-1.c

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 007b928c54b..ef857a9ae2c 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -409,7 +409,7 @@ const struct attribute_spec c_common_attribute_table[] =
   to prevent its usage in source code.  */
{ "no vops",0, 0, true,  false, false, false,
  handle_novops_attribute, NULL },
-  { "deprecated", 0, 1, false, false, false, false,
+  { "deprecated", 0, 2, false, false, false, false,
  handle_deprecated_attribute, NULL },
{ "unavailable",0, 1, false, false, false, false,
  handle_unavailable_attribute, NULL },
@@ -4107,6 +4107,21 @@ handle_deprecated_attribute (tree *node, tree name,
error ("deprecated message is not a string");
*no_add_attrs = true;
  }
+  else if (TREE_CHAIN (args) != NULL_TREE)
+{
+  /* We allow an optional second argument in the GNU form of
+attribute deprecated, which specifies the replacement.  */
+  if (flags & ATTR_FLAG_CXX11)
+   {
+ error ("replacement argument only allowed in GNU attributes");
+ *no_add_attrs = true;
+   }
+  else if (TREE_CODE (TREE_VALUE (TREE_CHAIN (args))) != STRING_CST)
+   {
+ error ("replacement argument is not a string");
+ *no_add_attrs = true;
+   }
+}
  
if (DECL_P (*node))

  {
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index fa29d2c15fc..2b47f01d166 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -4952,6 +4952,14 @@ c_parser_std_attribute (c_parser *parser, bool for_tm)
TREE_VALUE (attribute)
  = c_parser_attribute_arguments (parser, takes_identifier,
  require_string, false);
+   if (c_parser_next_token_is (parser, CPP_COMMA)
+   && strcmp (IDENTIFIER_POINTER (name),

[PATCH 2/2] add -Wdangling-pointer [PR #63272]

2021-11-01 Thread Martin Sebor via Gcc-patches


Patch 2 in this series adds support for detecting the uses of
dangling pointers: those to auto objects that have gone out of
scope.  Like patch 1, to minimize false positives this detection
is very simplistic.  However, thanks to the more deterministic
nature of the problem (all local objects go out of scope) it is able
to detect more instances of it.  The approach I used is to simply
search the IL for clobbers that dominate uses of pointers to
the clobbered objects.  If such a use is found that's not
followed by a clobber of the same object the warning triggers.
Similar to -Wuse-after-free, the new -Wdangling-pointer option
has multiple levels: level 1 to detect unconditional uses and
level 2 to flag conditional ones.  Unlike with -Wuse-after-free
there is no use case for testing dangling pointers for
equality, so there is no level 3.
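
As an illustration of the kind of code the two levels are described as
covering, here is a minimal sketch.  The function names and the
SHOW_DANGLING_USE macro are hypothetical, and the comments reflect the
level descriptions above rather than verified compiler output.  The
problematic functions are compiled only when the macro is defined,
because executing them is undefined.

```c
#include <assert.h>

#ifdef SHOW_DANGLING_USE
/* Unconditional use (the level 1 case as described): the pointer
   always refers to an object whose lifetime has ended.  */
int first_dangling (void)
{
  int *p;
  {
    int a[4] = { 1, 2, 3, 4 };
    p = a;
  }                     /* a's lifetime ends here */
  return *p;
}

/* Conditional use (the level 2 case as described): the pointer
   dangles only on one path.  */
int pick_dangling (int use_default)
{
  int fallback = 42;
  int *p = &fallback;
  if (!use_default)
    {
      int local = 7;
      p = &local;
    }                   /* local's lifetime ends here */
  return *p;
}
#endif

/* One way to fix the conditional case: declare the object in a scope
   that encloses every use of the pointer.  */
int pick_ok (int use_default)
{
  int fallback = 42;
  int local = 7;
  int *p = use_default ? &fallback : &local;
  return *p;
}

/* Small self-check so the sketch does something when run.  */
void check_pick (void)
{
  assert (pick_ok (1) == 42);
  assert (pick_ok (0) == 7);
}
```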

Tested on x86_64-linux and by building Glibc and Binutils/GDB.
It found no problems outside of the GCC test suite.

As with the first patch in this series, the tests contain a number
of xfails due to known limitations marked with pr??.  I'll
open bugs for them before committing the patch if I don't resolve
them first in a followup.

Martin
Add -Wdangling-pointer [PR63272].
Resolves:

PR c/63272 - GCC should warn when using pointer to dead scoped variable within the same function

gcc/c-family/ChangeLog:

	PR c/63272
	* c.opt:

gcc/ChangeLog:

	PR c/63272
	* diagnostic-spec.c (nowarn_spec_t::nowarn_spec_t): Handle
	-Wdangling-pointer.
	* doc/invoke.texi (-Wdangling-pointer): Document new option.
	* gimple-ssa-isolate-paths.c (diag_returned_locals): Suppress
	warning after issuing it.
	* gimple-ssa-warn-access.cc (pass_waccess::clone): Set new member.
	(pass_waccess::check_pointer_uses): New function.
	(pass_waccess::gimple_call_return_arg): New function.
	(pass_waccess::gimple_call_return_arg_ref): New function.
	(pass_waccess::check_call_dangling): New function.
	(pass_waccess::check_dangling_uses): New function overloads.
	(pass_waccess::check_dangling_stores): New function.
	(pass_waccess::check_dangling_stores): New function.
	(pass_waccess::m_clobbers): New data member.
	(pass_waccess::m_func): New data member.
	(pass_waccess::m_run_number): New data member.
	(pass_waccess::m_check_dangling_p): New data member.
	(pass_waccess::check_alloca): Check m_early_checks_p.
	(pass_waccess::check_alloc_size_call): Same.
	(pass_waccess::check_strcat): Same.
	(pass_waccess::check_strncat): Same.
	(pass_waccess::check_stxcpy): Same.
	(pass_waccess::check_stxncpy): Same.
	(pass_waccess::check_strncmp): Same.
	(pass_waccess::check_memop_access): Same.
	(pass_waccess::check_read_access): Same.
	(pass_waccess::check_builtin): Call check_pointer_uses.
	(pass_waccess::warn_invalid_pointer): Add arguments.
	(is_auto_decl): New function.
	(pass_waccess::check_stmt): New function.
	(pass_waccess::check_block): Call check_stmt.
	(pass_waccess::execute): Call check_dangling_uses,
	check_dangling_stores.  Empty m_clobbers.
	* passes.def (pass_warn_access): Invoke pass two more times.

gcc/testsuite/ChangeLog:

	PR c/63272
	* g++.dg/warn/Wfree-nonheap-object-6.C: Disable valid warnings.
	* gcc.dg/uninit-pr50476.c: Expect a new warning.
	* c-c++-common/Wdangling-pointer-2.c: New test.
	* c-c++-common/Wdangling-pointer-3.c: New test.
	* c-c++-common/Wdangling-pointer-4.c: New test.
	* c-c++-common/Wdangling-pointer-5.c: New test.
	* c-c++-common/Wdangling-pointer.c: New test.
	* gcc.dg/Wdangling-pointer-2.c: New test.
	* gcc.dg/Wdangling-pointer.c: New test.


diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index a5fe00ed195..6aa04721075 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -524,6 +524,14 @@ Wdangling-else
 C ObjC C++ ObjC++ Var(warn_dangling_else) Warning LangEnabledBy(C ObjC C++ ObjC++,Wparentheses)
 Warn about dangling else.
 
+Wdangling-pointer
+C ObjC C++ LTO ObjC++ Alias(Wdangling-pointer=, 2, 0) Warning
+Warn for uses of pointers to auto variables whose lifetime has ended.
+
+Wdangling-pointer=
+C ObjC C++ ObjC++ Joined RejectNegative UInteger Var(warn_dangling_pointer) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall, 2, 0) IntegerRange(0, 2)
+Warn for uses of pointers to auto variables whose lifetime has ended.
+
 Wdate-time
 C ObjC C++ ObjC++ CPP(warn_date_time) CppReason(CPP_W_DATE_TIME) Var(cpp_warn_date_time) Init(0) Warning
 Warn about __TIME__, __DATE__ and __TIMESTAMP__ usage.
diff --git a/gcc/diagnostic-spec.c b/gcc/diagnostic-spec.c
index 0d68af4d91e..ac2ec0c13ce 100644
--- a/gcc/diagnostic-spec.c
+++ b/gcc/diagnostic-spec.c
@@ -99,6 +99,7 @@ nowarn_spec_t::nowarn_spec_t (opt_code opt)
 	m_bits = NW_UNINIT;
   break;
 
+case OPT_Wdangling_pointer_:
 case OPT_Wreturn_local_addr:
 case OPT_Wuse_after_free_:
   m_bits = NW_DANGLING;
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index eb4ecb56dcc..cdbd9991da2 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -339,7 +339,8 @@ Objective-C and Objective-C++ Dialects}.

[PATCH 1/2] add -Wuse-after-free

2021-11-01 Thread Martin Sebor via Gcc-patches


Patch 1 in the series detects a small subset of uses of pointers
made indeterminate by calls to deallocation functions like free
or C++ operator delete.  To control the conditions the warnings
are issued under the new -Wuse-after-free= option provides three
levels.  At the lowest level the warning triggers only for
unconditional uses of freed pointers and doesn't warn for uses
in equality expressions.  Level 2 warns also for some conditional
uses, and level 3 also for uses in equality expressions.
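
To illustrate the equality case, here is a minimal sketch of a common
realloc idiom and a rewrite that avoids touching the old pointer.  The
function names and the SHOW_USE_AFTER_FREE macro are hypothetical, and
the comments follow the level descriptions above rather than verified
compiler output.

```c
#include <assert.h>
#include <stdlib.h>

#ifdef SHOW_USE_AFTER_FREE
/* Idiom of the kind the level 3 description covers: after a successful
   realloc, buf is invalid, yet it is compared for equality and then
   used in pointer arithmetic.  */
char *grow_bad (char *buf, char **cursor, size_t new_size)
{
  char *q = realloc (buf, new_size);
  if (q && q != buf)                   /* equality use of buf */
    *cursor = q + (*cursor - buf);     /* conditional use of buf */
  return q;
}
#endif

/* Rewrite: compute the offset while buf is still valid, then rebase
   the cursor on the new block without reading buf again.  */
char *grow_keep_cursor (char *buf, char **cursor, size_t new_size)
{
  assert (buf && cursor && *cursor >= buf);
  size_t offset = (size_t) (*cursor - buf);
  char *q = realloc (buf, new_size);
  if (q)
    *cursor = q + offset;
  return q;                            /* on failure buf is still valid */
}
```

The rewrite never compares or dereferences the old pointer after the
call, so it should not depend on which level is in effect.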

I debated whether to make level 2 or 3 the default included in
-Wall.  I decided on 3 for two reasons: 1) to raise awareness
of both the problem and GCC's new ability to detect it: using
a pointer after it's been freed, even only in principle, by
a successful call to realloc, is undefined, and 2) because
it's trivial to lower the level either globally, or locally
by suppressing the warning around such misuses.

I've tested the patch on x86_64-linux and by building Glibc
and Binutils/GDB.  It triggers a number of times in each, all
due to comparing invalidated pointers for equality (i.e., level
3).  I have suppressed these in GCC (libiberty) by a #pragma,
and will see how the Glibc folks want to deal with theirs (I
track them in BZ #28521).

The tests contain a number of xfails due to limitations I'm
aware of.  I marked them pr?? until the patch is approved.
I will open bugs for them before committing if I don't resolve
them in a followup.

Martin
Add -Wuse-after-free.

gcc/c-family/ChangeLog

	* c.opt (-Wuse-after-free): New options.

gcc/ChangeLog:

	* diagnostic-spec.c (nowarn_spec_t::nowarn_spec_t): Handle
	OPT_Wreturn_local_addr and OPT_Wuse_after_free_.
	* diagnostic-spec.h (NW_DANGLING): New enumerator.
	* doc/invoke.texi (-Wuse-after-free): Document new option.
	* gimple-ssa-warn-access.cc (pass_waccess::check_call): Rename...
	(pass_waccess::check_call_access): ...to this.
	(pass_waccess::check): Rename...
	(pass_waccess::check_block): ...to this.
	(pass_waccess::check_pointer_uses): New function.
	(pass_waccess::gimple_call_return_arg): New function.
	(pass_waccess::warn_invalid_pointer): New function.
	(pass_waccess::check_builtin): Handle free and realloc.
	(gimple_use_after_inval_p): New function.
	(get_realloc_lhs): New function.
	(maybe_warn_mismatched_realloc): New function.
	(pointers_related_p): New function.
	(pass_waccess::check_call): Call check_pointer_uses.
	(pass_waccess::execute): Compute and free dominance info.

libcpp/ChangeLog:

	* files.c (_cpp_find_file): Substitute a valid pointer for
	an invalid one to avoid -Wuse-after-free.

libiberty/ChangeLog:

	* regex.c: Suppress -Wuse-after-free.

gcc/testsuite/ChangeLog:

	* gcc.dg/Wmismatched-dealloc-2.c: Avoid -Wuse-after-free.
	* gcc.dg/Wmismatched-dealloc-3.c: Same.
	* gcc.dg/attr-alloc_size-6.c: Disable -Wuse-after-free.
	* gcc.dg/attr-alloc_size-7.c: Same.
	* c-c++-common/Wuse-after-free-2.c: New test.
	* c-c++-common/Wuse-after-free-3.c: New test.
	* c-c++-common/Wuse-after-free-4.c: New test.
	* c-c++-common/Wuse-after-free-5.c: New test.
	* c-c++-common/Wuse-after-free-6.c: New test.
	* c-c++-common/Wuse-after-free-7.c: New test.
	* c-c++-common/Wuse-after-free.c: New test.
	* g++.dg/warn/Wdangling-pointer.C: New test.
	* g++.dg/warn/Wmismatched-dealloc-3.C: New test.
	* g++.dg/warn/Wuse-after-free.C: New test.

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 06457ac739e..a5fe00ed195 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1334,6 +1334,14 @@ Wunused-const-variable=
 C ObjC C++ ObjC++ Joined RejectNegative UInteger Var(warn_unused_const_variable) Warning LangEnabledBy(C ObjC,Wunused-variable, 1, 0) IntegerRange(0, 2)
 Warn when a const variable is unused.
 
+Wuse-after-free
+C ObjC C++ LTO ObjC++ Alias(Wuse-after-free=, 2, 0) Warning
+Warn for uses of pointers to deallocated storage.
+
+Wuse-after-free=
+C ObjC C++ ObjC++ Joined RejectNegative UInteger Var(warn_use_after_free) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall, 3, 0) IntegerRange(0, 3)
+Warn for uses of pointers to deallocated storage.
+
 Wvariadic-macros
 C ObjC C++ ObjC++ CPP(warn_variadic_macros) CppReason(CPP_W_VARIADIC_MACROS) Var(cpp_warn_variadic_macros) Init(0) Warning LangEnabledBy(C ObjC C++ ObjC++,Wpedantic || Wtraditional)
 Warn about using variadic macros.
diff --git a/gcc/diagnostic-spec.c b/gcc/diagnostic-spec.c
index 85ffb725c02..0d68af4d91e 100644
--- a/gcc/diagnostic-spec.c
+++ b/gcc/diagnostic-spec.c
@@ -99,6 +99,11 @@ nowarn_spec_t::nowarn_spec_t (opt_code opt)
 	m_bits = NW_UNINIT;
   break;
 
+case OPT_Wreturn_local_addr:
+case OPT_Wuse_after_free_:
+  m_bits = NW_DANGLING;
+  break;
+
 default:
   /* A catchall group for everything else.  */
   m_bits = NW_OTHER;
diff --git a/gcc/diagnostic-spec.h b/gcc/diagnostic-spec.h
index 9b33ce6..164c9fc675f 100644
--- a/gcc/diagnostic-spec.h
+++ b/gcc/diagnostic-spec.h
@@ -41,11 +41,13 @@ public:
  NW_UNINIT = 1 << 3,
  /*

[PATCH 0/2] provide simple detection of indeterminate pointers

2021-11-01 Thread Martin Sebor via Gcc-patches


This two-patch series adds support for the detection of uses
of pointers invalidated as a result of the lifetime of
the objects they point to having ended: either explicitly,
after a call to a dynamic deallocation function, or implicitly,
by virtue of an object with automatic storage duration having
gone out of scope.

To minimize false positives the initial logic is very simple
(even simplistic): the code only checks uses in basic blocks
dominated by the invalidating calls (either calls to
deallocation functions or GCC's clobbers).
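
The dominance restriction can be illustrated with a toy control-flow graph.
The sketch below is not GCC's implementation (GCC computes its own dominance
information); the block numbering, NBLOCKS, and the helper names are made up
for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { NBLOCKS = 5 };

/* preds[b] is a bit mask of the predecessors of block b.
   Toy CFG: 0 -> 1, 1 -> 2, 1 -> 3, 2 -> 4, 3 -> 4.
   Blocks 2 and 3 are the two arms of a branch; block 4 is the join.  */
static const uint32_t preds[NBLOCKS] = {
  0,                     /* block 0: entry */
  1u << 0,               /* block 1 */
  1u << 1,               /* block 2 */
  1u << 1,               /* block 3 */
  (1u << 2) | (1u << 3)  /* block 4 */
};

/* Compute dom[b], the set of blocks dominating b, by iterating
   dom[b] = {b} | intersection of dom[p] over all predecessors p.  */
static void
compute_dominators (uint32_t dom[NBLOCKS])
{
  const uint32_t all = (1u << NBLOCKS) - 1;
  dom[0] = 1u << 0;
  for (int b = 1; b < NBLOCKS; b++)
    dom[b] = all;

  bool changed = true;
  while (changed)
    {
      changed = false;
      for (int b = 1; b < NBLOCKS; b++)
        {
          uint32_t meet = all;
          for (int p = 0; p < NBLOCKS; p++)
            if (preds[b] & (1u << p))
              meet &= dom[p];
          uint32_t next = meet | (1u << b);
          if (next != dom[b])
            {
              dom[b] = next;
              changed = true;
            }
        }
    }
}

/* Return true if every path from the entry to USE_BB passes through
   FREE_BB, i.e. a use in USE_BB is always reached after the free.  */
bool
use_dominated_by_free (int free_bb, int use_bb)
{
  assert (free_bb >= 0 && free_bb < NBLOCKS);
  assert (use_bb >= 0 && use_bb < NBLOCKS);
  uint32_t dom[NBLOCKS];
  compute_dominators (dom);
  return (dom[use_bb] & (1u << free_bb)) != 0;
}
```

In this model a free in block 1 dominates a use in block 4, so the use would
be checked, while a free in only one arm (block 2) does not dominate the join
and would be skipped by a dominance-only check.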

A more thorough checker is certainly possible and I'd say most
desirable but will require a more sophisticated implementation
and a better predicate analyzer than is available, and so will
need to wait for GCC 13.

Martin

Re: [PATCH] libcpp: Implement -Wbidirectional for CVE-2021-42574 [PR103026]

2021-11-01 Thread Joseph Myers

On Mon, 1 Nov 2021, Marek Polacek via Gcc-patches wrote:

> +  /* We've read a bidi char, update the current vector as necessary.  */
> +  void on_char (kind k, bool ucn_p)
> +  {
> +switch (k)
> +  {
> +  case kind::LRE:
> +  case kind::RLE:
> +  case kind::LRO:
> +  case kind::RLO:
> + vec.push (ucn_p ? 3u : 1u);
> + break;
> +  case kind::LRI:
> +  case kind::RLI:
> +  case kind::FSI:
> + vec.push (ucn_p ? 2u : 0u);
> + break;
> +  case kind::PDF:
> + if (current_ctx () == kind::PDF)
> +   pop ();
> + break;
> +  case kind::PDI:
> + if (current_ctx () == kind::PDI)
> +   pop ();

My understanding is that PDI should pop all intermediate PDF contexts 
outward to a PDI context, which it also pops.  (But if it's embedded only 
in PDF contexts, with no PDI context containing it, it doesn't pop 
anything.)

I think failing to handle that only means libcpp sometimes models there 
as being more bidirectional contexts open than there should be, so it 
might give spurious warnings when in fact all such contexts had been 
closed by end of string or comment.
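
A minimal sketch of the popping behaviour described above, as a stand-alone
toy model rather than libcpp code.  The type and function names are
hypothetical, and contexts are named here by what opened them (embedding or
isolate) rather than by the closing character as in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Kinds of open contexts in this toy model: an embedding or override
   (LRE, RLE, LRO, RLO) is closed by PDF; an isolate (LRI, RLI, FSI)
   is closed by PDI.  */
enum ctx { CTX_EMBEDDING, CTX_ISOLATE };

enum { MAX_DEPTH = 64 };

struct bidi_stack
{
  enum ctx items[MAX_DEPTH];
  size_t depth;
};

void
bidi_push (struct bidi_stack *s, enum ctx c)
{
  assert (s->depth < MAX_DEPTH);
  s->items[s->depth++] = c;
}

/* PDF closes the innermost context only if it is an embedding.  */
void
bidi_on_pdf (struct bidi_stack *s)
{
  if (s->depth > 0 && s->items[s->depth - 1] == CTX_EMBEDDING)
    s->depth--;
}

/* PDI closes the innermost isolate together with any embeddings opened
   inside it.  If no isolate is open, it closes nothing.  */
void
bidi_on_pdi (struct bidi_stack *s)
{
  size_t i = s->depth;
  while (i > 0 && s->items[i - 1] != CTX_ISOLATE)
    i--;
  if (i == 0)
    return;             /* Only embeddings (or nothing) are open.  */
  s->depth = i - 1;     /* Drop the isolate and everything above it.  */
}
```

With this model, the sequence isolate, embedding, embedding followed by one
PDI leaves the stack empty, whereas a PDI seen with only embeddings open
leaves them untouched.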

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH] PR fortran/102817 - [12 Regression] ICE in gfc_clear_shape, at fortran/expr.c:422

2021-11-01 Thread Harald Anlauf via Gcc-patches

Dear Fortranners,

a recent patch uncovered a latent issue with simplification of
array-valued expressions where the resulting shape was not set
from the referenced subobject.  Once found, the fix looks obvious.

Regtested on x86_64-pc-linux-gnu.  OK?

Thanks,
Harald

Fortran: fix simplification of array-valued parameter expressions

gcc/fortran/ChangeLog:

	PR fortran/102817
	* expr.c (simplify_parameter_variable): Copy shape of referenced
	subobject when simplifying.

gcc/testsuite/ChangeLog:

	PR fortran/102817
	* gfortran.dg/pr102817.f90: New test.

diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c
index 4dea840e348..c5360dfaede 100644
--- a/gcc/fortran/expr.c
+++ b/gcc/fortran/expr.c
@@ -2129,6 +2129,7 @@ simplify_parameter_variable (gfc_expr *p, int type)
 	return false;

   e->rank = p->rank;
+  e->shape = gfc_copy_shape (p->shape, p->rank);

   if (e->ts.type == BT_CHARACTER && p->ts.u.cl)
 	e->ts = p->ts;
diff --git a/gcc/testsuite/gfortran.dg/pr102817.f90 b/gcc/testsuite/gfortran.dg/pr102817.f90
new file mode 100644
index 000..c081a69f0ea
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr102817.f90
@@ -0,0 +1,17 @@
+! { dg-do compile }
+! PR fortran/102817 - ICE in gfc_clear_shape
+
+program test
+  type t
+ integer :: a(1,2) = 3
+  end type t
+  type(t), parameter :: u= t(4)
+  type(t), parameter :: x(1) = t(4)
+  integer, parameter :: p(1,2) = (x(1)%a)
+  integer:: z(1,2) = (x(1)%a)
+  integer:: y(1,2), v(1,2), w(1,2)
+  v = (u   %a)
+  w =  x(1)%a
+  y = (x(1)%a)
+  print *, v, w, y, z, p
+end

Add EAF_NOT_RETURNED_DIRECTLY

2021-11-01 Thread Jan Hubicka via Gcc-patches

Hi,
this patch adds EAF_NOT_RETURNED_DIRECTLY, which works similarly to
EAF_NODIRECTESCAPE.  Values pointed to by a given argument may be returned but
not the argument itself.  This helps PTA quite noticeably because we mostly
care about tracking the points to which a given memory location can escape.
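
To make the distinction concrete, here is a small sketch with two
hypothetical functions.  Whether GCC's analysis actually assigns the flag to
exactly these functions is an assumption; the point is only the difference
between returning an argument and returning something loaded through it.

```c
#include <assert.h>
#include <stddef.h>

/* SLOT itself is returned, so the argument is returned directly.  */
int **
return_arg_directly (int **slot)
{
  return slot;
}

/* Only a value loaded through SLOT is returned, never SLOT itself.
   This is the situation the description above is about.  */
int *
return_pointee_only (int **slot)
{
  assert (slot != NULL);
  return *slot;
}
```

In the second function the result can alias what SLOT points to, but never
the memory location of SLOT's own pointee slot.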

I think this is about the last reasonable improvement we can make to the EAF
flags.

cc1plus disambiguation counts change from:

Alias oracle query stats:
  refs_may_alias_p: 77976088 disambiguations, 98744590 queries
  ref_maybe_used_by_call_p: 572845 disambiguations, 79014622 queries
  call_may_clobber_ref_p: 340823 disambiguations, 344823 queries
  nonoverlapping_component_refs_p: 0 disambiguations, 26590 queries
  nonoverlapping_refs_since_match_p: 31626 disambiguations, 65379 must overlaps, 97963 queries
  aliasing_component_refs_p: 57414 disambiguations, 11434878 queries
  TBAA oracle: 27749649 disambiguations 91624184 queries
   14733408 are in alias set 0
   8847139 queries asked about the same object
   139 queries asked about the same alias set
   0 access volatile
   38412201 are dependent in the DAG
   1881648 are aritificially in conflict with void *

Modref stats:
  modref use: 23785 disambiguations, 702425 queries
  modref clobber: 2296391 disambiguations, 22690531 queries
  5260226 tbaa queries (0.231825 per modref query)
  731741 base compares (0.032249 per modref query)

PTA query stats:
  pt_solution_includes: 12580233 disambiguations, 35854408 queries
  pt_solutions_intersect: 1409041 disambiguations, 13496899 queries

To:

Alias oracle query stats:
  refs_may_alias_p: 78304485 disambiguations, 98830913 queries
  ref_maybe_used_by_call_p: 630360 disambiguations, 79308222 queries
  call_may_clobber_ref_p: 381549 disambiguations, 384627 queries
  nonoverlapping_component_refs_p: 0 disambiguations, 26299 queries
  nonoverlapping_refs_since_match_p: 29919 disambiguations, 64917 must overlaps, 95781 queries
  aliasing_component_refs_p: 57250 disambiguations, 11336880 queries
  TBAA oracle: 27835747 disambiguations 91534430 queries
   14884868 are in alias set 0
   8933627 queries asked about the same object
   123 queries asked about the same alias set
   0 access volatile
   37974723 are dependent in the DAG
   1905342 are aritificially in conflict with void *

Modref stats:
  modref use: 24929 disambiguations, 756294 queries
  modref clobber: 2334910 disambiguations, 23414495 queries
  5359212 tbaa queries (0.228884 per modref query)
  754642 base compares (0.032230 per modref query)

PTA query stats:
  pt_solution_includes: 13262256 disambiguations, 36306509 queries
  pt_solutions_intersect: 1574672 disambiguations, 13638933 queries

So about 5.4% more pt_solution_includes and 11.8% more pt_solutions_intersect
disambiguations.
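
The relative change can be recomputed from the PTA query stats quoted above.
The helper name percent_increase is hypothetical; the inputs in the note
below are the disambiguation counts from the two dumps.

```c
#include <assert.h>

/* Relative increase from BEFORE to AFTER, in percent.  */
double
percent_increase (unsigned long before, unsigned long after)
{
  assert (before != 0 && after >= before);
  return 100.0 * (double) (after - before) / (double) before;
}
```

Calling it with 12580233 and 13262256 (pt_solution_includes) gives a little
over 5.4, and with 1409041 and 1574672 (pt_solutions_intersect) a little
under 11.8.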

Bootstrapped/regtested x86_64-linux, OK?

Honza

gcc/ChangeLog:

	* tree-core.h (EAF_NOT_RETURNED_DIRECTLY): New flag.
	(EAF_NOREAD): Renumber.
	* ipa-modref.c (dump_eaf_flags): Dump EAF_NOT_RETURNED_DIRECTLY.
	(remove_useless_eaf_flags): Handle EAF_NOT_RETURNED_DIRECTLY.
	(deref_flags): Likewise.
	(modref_lattice::init): Likewise.
	(modref_lattice::merge): Likewise.
	(merge_call_lhs_flags): Likewise.
	(analyze_ssa_name_flags): Likewise.
	(modref_merge_call_site_flags): Likewise.
	* tree-ssa-structalias.c (handle_call_arg): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.dg/ipa/modref-3.c: New test.
	* gcc.dg/tree-ssa/modref-10.c: New test.

diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index d866d9ed6b3..c0aae084dbd 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -160,6 +160,8 @@ dump_eaf_flags (FILE *out, int flags, bool newline = true)
 fprintf (out, " unused");
   if (flags & EAF_NOT_RETURNED)
 fprintf (out, " not_returned");
+  if (flags & EAF_NOT_RETURNED_DIRECTLY)
+fprintf (out, " not_returned_directly");
   if (flags & EAF_NOREAD)
 fprintf (out, " noread");
   if (newline)
@@ -295,7 +297,7 @@ remove_useless_eaf_flags (int eaf_flags, int ecf_flags, bool returns_void)
   else if (ecf_flags & ECF_PURE)
 eaf_flags &= ~implicit_pure_eaf_flags;
   else if ((ecf_flags & ECF_NORETURN) || returns_void)
-eaf_flags &= ~EAF_NOT_RETURNED;
+eaf_flags &= ~(EAF_NOT_RETURNED | EAF_NOT_RETURNED_DIRECTLY);
   return eaf_flags;
 }
 
@@ -1373,7 +1375,7 @@ memory_access_to (tree op, tree ssa_name)
 static int
 deref_flags (int flags, bool ignore_stores)
 {
-  int ret = EAF_NODIRECTESCAPE;
+  int ret = EAF_NODIRECTESCAPE | EAF_NOT_RETURNED_DIRECTLY;
   /* If argument is unused just account for
  the read involved in dereference.  */
   if (flags & EAF_UNUSED)
@@ -1458,7 +1460,8 @@ modref_lattice::init ()
 {
   /* All flags we track.  */
   int f = EAF_DIRECT | EAF_NOCLOBBER | EAF_NOESCAPE | EAF_UNUSED
- |

Re: redundant bitmap_bit_p followed by bitmap_clear_bit [was: Re: [COMMITTED] Kill second order relations in the path solver.]

2021-11-01 Thread Bernhard Reutner-Fischer via Gcc-patches

On Mon, 1 Nov 2021 15:21:03 +0100
Aldy Hernandez  wrote:

> I'm not convinced this makes the code clearer to read, especially if
> it's not on a critical path.  But if you feel strongly, please submit
> a patch ;-).

No, I don't feel strongly about it.
Compiling e.g. -O2 ira.o
# Overhead  Samples  Command  Shared Object  Symbol
# ........  .......  .......  .............  ........................
#
   100.00%     4197  cc1plus  cc1plus        [.] mark_reachable_blocks
   100.00%    22000  cc1plus  cc1plus        [.] path_oracle::killing_def
and the mark_elimination is reload.
So it's not just a handful of calls saved, but a fair number.  It would also
be smaller code, as it saves a call.  Well, maybe another day.
thanks,
> 
> Aldy
> 
> On Mon, Nov 1, 2021 at 3:10 PM Bernhard Reutner-Fischer
>  wrote:
> >
> > On Thu, 28 Oct 2021 01:55:30 +0200
> > Bernhard Reutner-Fischer  wrote:
> >  
> > > On Wed, 27 Oct 2021 20:13:21 +0200
> > > Aldy Hernandez via Gcc-patches  wrote:  
> >  
> > > > @@ -1306,6 +1307,24 @@ path_oracle::killing_def (tree ssa)
> > > >ptr->m_next = m_equiv.m_next;
> > > >m_equiv.m_next = ptr;
> > > >bitmap_ior_into (m_equiv.m_names, b);
> > > > +
> > > > +  // Walk the relation list an remove SSA from any relations.  
> > >
> > > s/an /and /
> > >  
> > > > +  if (!bitmap_bit_p (m_relations.m_names, v))
> > > > +return;
> > > > +
> > > > +  bitmap_clear_bit (m_relations.m_names, v);  
> > >
> > > IIRC bitmap_clear_bit returns true if the bit was set, false otherwise,
> > > so should be used as if(!bitmap_clear_bit) above.  
> >  
> > > > +  relation_chain **prev = &(m_relations.m_head);  
> > >
> > > s/[()]//
> > > thanks,  
> >
> > There seems to be two other spots where a redundant bitmap_bit_p checks
> > if we want to bitmap_clear_bit. In dse and ira.
> > Like:
> > $ cat ~/coccinelle/gcc_bitmap_bit_p-0.cocci ; echo EOF
> > // replace redundant bitmap_bit_p() bitmap_clear_bit with the latter
> > @ rule1 @
> > identifier fn;
> > expression bitmap, bit;
> > @@
> >
> > fn(...) {
> > <...
> > (
> > -if (bitmap_bit_p (bitmap, bit))
> > +if (bitmap_clear_bit (bitmap, bit))
> > {
> >   ...
> > -  bitmap_clear_bit (bitmap, bit);
> >   ...
> > }
> > |
> > -if (bitmap_bit_p (bitmap, bit))
> > +if (bitmap_clear_bit (bitmap, bit))
> > {
> >   ...
> > }
> > ...
> > -bitmap_clear_bit (bitmap, bit);
> > )  
> > ...>  
> > }
> > EOF
> > $ find gcc/ -type f -a \( -name "*.c" -o -name "*.cc" \) -a \( ! -path 
> > "gcc/testsuite/*" -a ! -path "gcc/contrib/*" \) -exec spatch -sp_file 
> > ~/coccinelle/gcc_bitmap_bit_p-0.cocci --show-diff {} \;
> > diff =
> > --- gcc/dse.c
> > +++ /tmp/cocci-output-1104419-443759-dse.c
> > @@ -3238,9 +3238,8 @@ mark_reachable_blocks (sbitmap unreachab
> >edge e;
> >edge_iterator ei;
> >
> > -  if (bitmap_bit_p (unreachable_blocks, bb->index))
> > +  if (bitmap_clear_bit(unreachable_blocks, bb->index))
> >  {
> > -  bitmap_clear_bit (unreachable_blocks, bb->index);
> >FOR_EACH_EDGE (e, ei, bb->preds)
> > {
> >   mark_reachable_blocks (unreachable_blocks, e->src);
> > diff =
> > --- gcc/ira.c
> > +++ /tmp/cocci-output-1104678-d8679a-ira.c
> > @@ -2944,17 +2944,15 @@ mark_elimination (int from, int to)
> >FOR_EACH_BB_FN (bb, cfun)
> >  {
> >r = DF_LR_IN (bb);
> > -  if (bitmap_bit_p (r, from))
> > +  if (bitmap_clear_bit(r, from))
> > {
> > - bitmap_clear_bit (r, from);
> >   bitmap_set_bit (r, to);
> > }
> >if (! df_live)
> >  continue;
> >r = DF_LIVE_IN (bb);
> > -  if (bitmap_bit_p (r, from))
> > +  if (bitmap_clear_bit(r, from))
> > {
> > - bitmap_clear_bit (r, from);
> >   bitmap_set_bit (r, to);
> > }
> >  }
> > # in ira.c one would have to fixup the curly braces manually
> > PS: coccinelle seems to ruin the spaces before braces in the '+' even
> > though i have written them correctly according to GNU style..
> >  
>
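
The idea behind the rewrite can be shown with a toy bitmap.  This sketch does
not use GCC's bitmap API; toy_bitmap, toy_clear_bit and toy_rename_bit are
hypothetical names, and the return-value convention simply mirrors what the
thread recalls about bitmap_clear_bit (true if the bit was set).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy 64-bit set standing in for a real bitmap type.  */
typedef uint64_t toy_bitmap;

/* Clear BIT in *MAP and report whether it was set beforehand, so a
   separate membership test before the call is unnecessary.  */
bool
toy_clear_bit (toy_bitmap *map, unsigned bit)
{
  assert (bit < 64);
  toy_bitmap mask = (toy_bitmap) 1 << bit;
  bool was_set = (*map & mask) != 0;
  *map &= ~mask;
  return was_set;
}

/* Replace FROM with TO in *MAP, in the style of the ira.c hunk: the
   test and the clear are a single call.  */
void
toy_rename_bit (toy_bitmap *map, unsigned from, unsigned to)
{
  assert (to < 64);
  if (toy_clear_bit (map, from))
    *map |= (toy_bitmap) 1 << to;
}
```

When FROM is not in the set, the clear is a no-op and TO is not added, which
is the same behaviour as the original test-then-clear form.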

Re: [PATCH] Fix negative integer range for UInteger.

2021-11-01 Thread Martin Liška


On 10/27/21 08:33, Richard Biener wrote:

> OK.  (btw, can we use opts_set.param_... now instead of -1?)


Yes, we can. Let me try cooking a follow-up patch.

Cheers,
Martin

Re: [PATCH] c++: quadratic constexpr behavior for left-assoc logical exprs [PR102780]

2021-11-01 Thread Patrick Palka via Gcc-patches

On Mon, 1 Nov 2021, Patrick Palka wrote:

> On Fri, 29 Oct 2021, Jakub Jelinek wrote:
> 
> > On Thu, Oct 28, 2021 at 03:35:20PM -0400, Patrick Palka wrote:
> > > > Is there a reason to turn this mode of evaluating everything twice if an
> > > > error should be diagnosed all the time, rather than only if we actually
> > > > see a TRUTH_*_EXPR we want to handle this way?
> > > > If we don't see any TRUTH_*_EXPR, or if processing_template_decl, or if
> > > > the first operand is already a constant, that seems like a waste of
> > > > time.
> > > 
> > > Hmm yeah, at the very least it wouldn't hurt to check
> > > processing_template_decl before doing the tf_error shenanigans.  I'm not
> > > sure if we would gain anything by first looking for TRUTH_*_EXPR since
> > > that'd involve walking the entire expression anyway IIUC.
> > 
> > I meant actually something like:
> > --- gcc/cp/constexpr.c.jj   2021-10-28 20:07:48.566193259 +0200
> > +++ gcc/cp/constexpr.c  2021-10-29 13:47:48.824026156 +0200
> > @@ -8789,7 +8789,7 @@ potential_constant_expression_1 (tree t,
> >   return false;
> > }
> >   else if (!check_for_uninitialized_const_var
> > -  (tmp, /*constexpr_context_p=*/true, flags))
> > +  (tmp, /*constexpr_context_p=*/true, flags & ~(1 << 30)))
> > return false;
> > }
> >return RECUR (tmp, want_rval);
> > @@ -8896,14 +8896,36 @@ potential_constant_expression_1 (tree t,
> > tree op1 = TREE_OPERAND (t, 1);
> > if (!RECUR (op0, rval))
> >   return false;
> > -   if (!(flags & tf_error) && RECUR (op1, rval))
> > - /* When quiet, try to avoid expensive trial evaluation by first
> > -checking potentiality of the second operand.  */
> > - return true;
> > -   if (!processing_template_decl)
> > - op0 = cxx_eval_outermost_constant_expr (op0, true);
> > +   if (TREE_CODE (op0) != INTEGER_CST && !processing_template_decl)
> > + {
> > +   /* If op0 is not a constant, we can either
> > +  cxx_eval_outermost_constant_expr first, or RECUR (op1, rval)
> > +  first.  If quiet, perform the latter first, if not quiet
> > +  and it is the outermost such TRUTH_*_EXPR, perform the
> > +  latter first in quiet mode, followed by the former and
> > +  retry with the latter in non-quiet mode.  */
> > +   if ((flags & (1 << 30)) != 0)
> > + op0 = cxx_eval_outermost_constant_expr (op0, true);
> > +   else if ((flags & tf_error) != 0)
> > + {
> > +   flags &= ~tf_error;
> > +   if (RECUR (op1, rval))
> > + return true;
> > +   op0 = cxx_eval_outermost_constant_expr (op0, true);
> > +   flags |= tf_error | (1 << 30);
> > + }
> > +   else
> > + {
> > +   if (RECUR (op1, rval))
> > + return true;
> > +   op0 = cxx_eval_outermost_constant_expr (op0, true);
> > +   if (tree_int_cst_equal (op0, tmp))
> > + return false;
> > +   return true;
> > + }
> > + }
> > if (tree_int_cst_equal (op0, tmp))
> > - return (flags & tf_error) ? RECUR (op1, rval) : false;
> > + return RECUR (op1, rval);
> > else
> >   return true;
> >}
> > @@ -9112,17 +9134,6 @@ bool
> >  potential_constant_expression_1 (tree t, bool want_rval, bool strict, bool now,
> >  tsubst_flags_t flags)
> >  {
> > -  if (flags & tf_error)
> > -{
> > -  /* Check potentiality quietly first, as that could be performed more
> > -efficiently in some cases (currently only for TRUTH_*_EXPR).  If
> > -that fails, replay the check noisily to give errors.  */
> > -  flags &= ~tf_error;
> > -  if (potential_constant_expression_1 (t, want_rval, strict, now, flags))
> > -   return true;
> > -  flags |= tf_error;
> > -}
> > -
> >tree target = NULL_TREE;
> >return potential_constant_expression_1 (t, want_rval, strict, now,
> >   flags, );
> > 
> > (perhaps with naming the 1 << 30 as tf_something or using different bit for
> > that).  So no doubling of potential_constant_expression_1 evaluation
> > for tf_error unless a TRUTH_*_EXPR is seen outside of template with
> > potentially constant first operand other than INTEGER_CST, but similarly to
> > what you did, make sure that there are at most two calls and not more.
> 
> That makes sense, though it's somewhat unfortunate that we'd have to
> use/add an ad hoc tsubst_flags_t flag with this approach :/
> 
> > 
> > > > As I said, another possibility is something like:
> > > > /* Try to quietly evaluate T to constant, but don't try too hard.  */
> > > > 
> > > > static tree
> > > > potential_constant_expression_eval (tree t)
> > > > {
> > > >   auto o = make_temp_override (constexpr_ops_limit,
> > > >MIN (constexpr_ops_limit, 100));
> > > >   return

Re: [PATCH] x86_64: Improved implementation of TImode rotations.

2021-11-01 Thread Uros Bizjak via Gcc-patches

On Mon, Nov 1, 2021 at 5:45 PM Roger Sayle  wrote:
>
>
> This simple patch improves the implementation of 128-bit (TImode)
> rotations on x86_64 (a missed optimization opportunity spotted
> during the recent V1TImode improvements).
>
> Currently, the function:
>
> unsigned __int128 rotrti3(unsigned __int128 x, unsigned int i) {
>   return (x >> i) | (x << (128-i));
> }
>
> produces:
>
> rotrti3:
>         movq    %rsi, %r8
>         movq    %rdi, %r9
>         movl    %edx, %ecx
>         movq    %rdi, %rsi
>         movq    %r9, %rax
>         movq    %r8, %rdx
>         movq    %r8, %rdi
>         shrdq   %r8, %rax
>         shrq    %cl, %rdx
>         xorl    %r8d, %r8d
>         testb   $64, %cl
>         cmovne  %rdx, %rax
>         cmovne  %r8, %rdx
>         negl    %ecx
>         andl    $127, %ecx
>         shldq   %r9, %rdi
>         salq    %cl, %rsi
>         xorl    %r9d, %r9d
>         testb   $64, %cl
>         cmovne  %rsi, %rdi
>         cmovne  %r9, %rsi
>         orq     %rdi, %rdx
>         orq     %rsi, %rax
>         ret
>
> With this patch, GCC will now generate the much nicer:
> rotrti3:
>         movl    %edx, %ecx
>         movq    %rdi, %rdx
>         shrdq   %rsi, %rdx
>         shrdq   %rdi, %rsi
>         andl    $64, %ecx
>         movq    %rdx, %rax
>         cmove   %rsi, %rdx
>         cmovne  %rsi, %rax
>         ret
>
> Even I wasn't expecting the optimizer's choice of the final three
> instructions; a thing of beauty.  For rotations larger than 64,
> the lowpart and the highpart (%rax and %rdx) are transposed, and
> it would be nice to have a conditional swap/exchange.  The inspired
> solution the compiler comes up with is to store/duplicate the same
> value in both %rax/%rdx, and then use complementary conditional moves
> to either update the lowpart or highpart, which cleverly avoids the
> potential decode-stage pipeline stall (on some microarchitectures)
> from having multiple instructions conditional on the same condition.
> See X86_TUNE_ONE_IF_CONV_INSN, and notice there are two such stalls
> in the original expansion of rot[rl]ti3.
>
> One quick question, does TARGET_64BIT (always) imply TARGET_CMOVE?

Yes, from i386-options.c:

  /* X86_ARCH_CMOV: Conditional move was added for pentiumpro.  */
  ~(m_386 | m_486 | m_PENT | m_LAKEMONT | m_K6),

> This patch has been tested on x86_64-pc-linux-gnu with a make bootstrap
> and make -k check with no new failures.  Interestingly the correct
> behaviour is already tested by (amongst other tests) sse2-v1ti-shift-3.c
> that confirms V1TImode rotates by constants match rotlti3/rotrti3.
>
> Ok for mainline?
>
>
> 2021-11-01  Roger Sayle  
>
> * config/i386/i386.md (ti3): Provide expansion for
> rotations by non-constant amounts on TARGET_CMOVE architectures.

OK with a nit below.

Thanks,
Uros.

+  if ( == ROTATE)
+ {
+  emit_insn (gen_x86_64_shld (tmp_lo, src_hi, amount));
+  emit_insn (gen_x86_64_shld (tmp_hi, src_lo, amount));
+ }
+  else
+ {
+  emit_insn (gen_x86_64_shrd (tmp_lo, src_hi, amount));
+  emit_insn (gen_x86_64_shrd (tmp_hi, src_lo, amount));
+ }

  rtx (*shift) (rtx, rtx, rtx)
= ( == ROTATE) ? gen_x86_64_shld : gen_x86_64_shrd;
  emit_insn (shift (tmp_lo, src_hi, amount));
  emit_insn (shift (tmp_hi, src_lo, amount));



>
> Thanks in advance,
> Roger
> --
>

Re: [PATCH] c++: quadratic constexpr behavior for left-assoc logical exprs [PR102780]

2021-11-01 Thread Patrick Palka via Gcc-patches

On Fri, 29 Oct 2021, Jakub Jelinek wrote:

> On Thu, Oct 28, 2021 at 03:35:20PM -0400, Patrick Palka wrote:
> > > Is there a reason to turn this mode of evaluating everything twice if an
> > > error should be diagnosed all the time, rather than only if we actually
> > > see a TRUTH_*_EXPR we want to handle this way?
> > > If we don't see any TRUTH_*_EXPR, or if processing_template_decl, or if
> > > the first operand is already a constant, that seems like a waste of time.
> > 
> > Hmm yeah, at the very least it wouldn't hurt to check
> > processing_template_decl before doing the tf_error shenanigans.  I'm not
> > sure if we would gain anything by first looking for TRUTH_*_EXPR since
> > that'd involve walking the entire expression anyway IIUC.
> 
> I meant actually something like:
> --- gcc/cp/constexpr.c.jj 2021-10-28 20:07:48.566193259 +0200
> +++ gcc/cp/constexpr.c2021-10-29 13:47:48.824026156 +0200
> @@ -8789,7 +8789,7 @@ potential_constant_expression_1 (tree t,
> return false;
>   }
> else if (!check_for_uninitialized_const_var
> -(tmp, /*constexpr_context_p=*/true, flags))
> +(tmp, /*constexpr_context_p=*/true, flags & ~(1 << 30)))
>   return false;
>   }
>return RECUR (tmp, want_rval);
> @@ -8896,14 +8896,36 @@ potential_constant_expression_1 (tree t,
>   tree op1 = TREE_OPERAND (t, 1);
>   if (!RECUR (op0, rval))
> return false;
> - if (!(flags & tf_error) && RECUR (op1, rval))
> -   /* When quiet, try to avoid expensive trial evaluation by first
> -  checking potentiality of the second operand.  */
> -   return true;
> - if (!processing_template_decl)
> -   op0 = cxx_eval_outermost_constant_expr (op0, true);
> + if (TREE_CODE (op0) != INTEGER_CST && !processing_template_decl)
> +   {
> + /* If op0 is not a constant, we can either
> +cxx_eval_outermost_constant_expr first, or RECUR (op1, rval)
> +first.  If quiet, perform the latter first, if not quiet
> +and it is the outermost such TRUTH_*_EXPR, perform the
> +latter first in quiet mode, followed by the former and
> +retry with the latter in non-quiet mode.  */
> + if ((flags & (1 << 30)) != 0)
> +   op0 = cxx_eval_outermost_constant_expr (op0, true);
> + else if ((flags & tf_error) != 0)
> +   {
> + flags &= ~tf_error;
> + if (RECUR (op1, rval))
> +   return true;
> + op0 = cxx_eval_outermost_constant_expr (op0, true);
> + flags |= tf_error | (1 << 30);
> +   }
> + else
> +   {
> + if (RECUR (op1, rval))
> +   return true;
> + op0 = cxx_eval_outermost_constant_expr (op0, true);
> + if (tree_int_cst_equal (op0, tmp))
> +   return false;
> + return true;
> +   }
> +   }
>   if (tree_int_cst_equal (op0, tmp))
> -   return (flags & tf_error) ? RECUR (op1, rval) : false;
> +   return RECUR (op1, rval);
>   else
> return true;
>}
> @@ -9112,17 +9134,6 @@ bool
>  potential_constant_expression_1 (tree t, bool want_rval, bool strict, bool now,
>tsubst_flags_t flags)
>  {
> -  if (flags & tf_error)
> -{
> -  /* Check potentiality quietly first, as that could be performed more
> -  efficiently in some cases (currently only for TRUTH_*_EXPR).  If
> -  that fails, replay the check noisily to give errors.  */
> -  flags &= ~tf_error;
> -  if (potential_constant_expression_1 (t, want_rval, strict, now, flags))
> - return true;
> -  flags |= tf_error;
> -}
> -
>tree target = NULL_TREE;
>return potential_constant_expression_1 (t, want_rval, strict, now,
> flags, );
> 
> (perhaps with naming the 1 << 30 as tf_something or using different bit for
> that).  So no doubling of potential_constant_expression_1 evaluation
> for tf_error unless a TRUTH_*_EXPR is seen outside of template with
> potentially constant first operand other than INTEGER_CST, but similarly to
> what you did, make sure that there are at most two calls and not more.

That makes sense, though it's somewhat unfortunate that we'd have to
use/add an ad hoc tsubst_flags_t flag with this approach :/

> 
> > > As I said, another possibility is something like:
> > > /* Try to quietly evaluate T to constant, but don't try too hard.  */
> > > 
> > > static tree
> > > potential_constant_expression_eval (tree t)
> > > {
> > >   auto o = make_temp_override (constexpr_ops_limit,
> > >  MIN (constexpr_ops_limit, 100));
> > >   return cxx_eval_outermost_constant_expr (t, true);
> > > }
> > > and using this new function instead of
> > > cxx_eval_outermost_constant_expr (op, true);
> > > everywhere in potential_constant_expression_1

Re: [PATCH] ipa-sra: Improve debug info for removed parameters (PR 93385)

2021-11-01 Thread Martin Jambor

Hello,

I'd like to ping this patch.

Thanks,

Martin


On Wed, Oct 13 2021, Martin Jambor wrote:
> Hi,
>
> in spring I added code eliminating any statements using parameters
> removed by IPA passes (to fix PR 93385).  That patch fixed issues such
> as divisions by zero that such code could perform but it only reset
> all affected debug bind statements, this one updates them with
> expressions which can allow the debugger to print the removed value -
> see the added test-case for an example.
>
> Even though I originally did not want to create DEBUG_EXPR_DECLs for
> intermediate values, I ended up doing so, because otherwise the code
> started creating statements like
>
># DEBUG __aD.198693 => [(const struct _Alloc_nodeD.171110 
> *)D#195]._M_tD.184726->_M_implD.171154
>
> which not only is a bit scary but also gimple-fold ICEs on
> it. Therefore I decided they are probably quite necessary.
>
> The patch simply notes each removed SSA name present in a debug
> statement and then works from it backwards, looking if it can
> reconstruct the expression it represents (which can fail if a
> non-degenerate PHI node is in the way).  If it can, it populates two
> hash maps with those expressions so that 1) removed assignments are
> replaced with a debug bind defining a new intermediate debug_decl_expr
> and 2) existing debug binds that refer to SSA names that are being
> removed now refer to corresponding debug_decl_exprs.
>
> If a removed parameter is passed to another function, the debugging
> information still cannot describe its value there - see the xfailed
> test in the testcase.  I sort of know what needs to be done but that
> needs a little bit more of IPA infrastructure on top of this patch and
> so I would like to get this patch reviewed first.
>
> Bootstrapped and tested on x86_64-linux, i686-linux and (long time
> ago) on aarch64-linux.  Also LTO-bootstrapped on x86_64-linux.
>
> Perhaps it is good to go to trunk?
>
> Thanks,
>
> Martin
>
> gcc/ChangeLog:
>
> 2021-03-29  Martin Jambor  
>
>   PR ipa/93385
>   * ipa-param-manipulation.h (class ipa_param_body_adjustments): New
>   members remap_with_debug_expressions, m_dead_ssa_debug_equiv,
>   m_dead_stmt_debug_equiv and prepare_debug_expressions.  Added
>   parameter to mark_dead_statements.
>   * ipa-param-manipulation.c: Include tree-phinodes.h and cfgexpand.h.
>   (ipa_param_body_adjustments::mark_dead_statements): New parameter
>   debugstack, push into it all SSA names used in debug statements,
>   produce m_dead_ssa_debug_equiv mapping for the removed param.
>   (replace_with_mapped_expr): New function.
>   (ipa_param_body_adjustments::remap_with_debug_expressions): Likewise.
>   (ipa_param_body_adjustments::prepare_debug_expressions): Likewise.
>   (ipa_param_body_adjustments::common_initialization): Gather and
>   process SSA which will be removed but are in debug statements. Simplify.
>   (ipa_param_body_adjustments::ipa_param_body_adjustments): Initialize
>   new members.
>   * tree-inline.c (remap_gimple_stmt): Create a debug bind when possible
>   when avoiding a copy of an unnecessary statement.  Remap removed SSA
>   names in existing debug statements.
>   (tree_function_versioning): Do not create DEBUG_EXPR_DECL for removed
>   parameters if we have already done so.
>
> gcc/testsuite/ChangeLog:
>
> 2021-03-29  Martin Jambor  
>
>   PR ipa/93385
>   * gcc.dg/guality/ipa-sra-1.c: New test.
> ---
>  gcc/ipa-param-manipulation.c | 280 ++-
>  gcc/ipa-param-manipulation.h |  12 +-
>  gcc/testsuite/gcc.dg/guality/ipa-sra-1.c |  45 
>  gcc/tree-inline.c|  45 ++--
>  4 files changed, 305 insertions(+), 77 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/guality/ipa-sra-1.c
>
> diff --git a/gcc/ipa-param-manipulation.c b/gcc/ipa-param-manipulation.c
> index 26b02d7aa95..c84d669521c 100644
> --- a/gcc/ipa-param-manipulation.c
> +++ b/gcc/ipa-param-manipulation.c
> @@ -43,6 +43,8 @@ along with GCC; see the file COPYING3.  If not see
>  #include "alloc-pool.h"
>  #include "symbol-summary.h"
>  #include "symtab-clones.h"
> +#include "tree-phinodes.h"
> +#include "cfgexpand.h"
>  
>  
>  /* Actual prefixes of different newly synthetized parameters.  Keep in sync
> @@ -972,10 +974,12 @@ ipa_param_body_adjustments::carry_over_param (tree t)
>  
>  /* Populate m_dead_stmts given that DEAD_PARAM is going to be removed without
> any replacement or splitting.  REPL is the replacement VAR_SECL to base 
> any
> -   remaining uses of a removed parameter on.  */
> +   remaining uses of a removed parameter on.  Push all removed SSA names that
> +   are used within debug statements to DEBUGSTACK.  */
>  
>  void
> -ipa_param_body_adjustments::mark_dead_statements (tree dead_param)
> +ipa_param_body_adjustments::mark_dead_statements (tree dead_param,
> +

[PATCH] x86_64: Improved implementation of TImode rotations.

2021-11-01 Thread Roger Sayle


This simple patch improves the implementation of 128-bit (TImode)
rotations on x86_64 (a missed optimization opportunity spotted
during the recent V1TImode improvements).

Currently, the function:

unsigned __int128 rotrti3(unsigned __int128 x, unsigned int i) {
  return (x >> i) | (x << (128-i));
}

produces:

rotrti3:
        movq    %rsi, %r8
        movq    %rdi, %r9
        movl    %edx, %ecx
        movq    %rdi, %rsi
        movq    %r9, %rax
        movq    %r8, %rdx
        movq    %r8, %rdi
        shrdq   %r8, %rax
        shrq    %cl, %rdx
        xorl    %r8d, %r8d
        testb   $64, %cl
        cmovne  %rdx, %rax
        cmovne  %r8, %rdx
        negl    %ecx
        andl    $127, %ecx
        shldq   %r9, %rdi
        salq    %cl, %rsi
        xorl    %r9d, %r9d
        testb   $64, %cl
        cmovne  %rsi, %rdi
        cmovne  %r9, %rsi
        orq     %rdi, %rdx
        orq     %rsi, %rax
        ret

with this patch, GCC will now generate the much nicer:
rotrti3:
        movl    %edx, %ecx
        movq    %rdi, %rdx
        shrdq   %rsi, %rdx
        shrdq   %rdi, %rsi
        andl    $64, %ecx
        movq    %rdx, %rax
        cmove   %rsi, %rdx
        cmovne  %rsi, %rax
        ret

Even I wasn't expecting the optimizer's choice of the final three
instructions; a thing of beauty.  For rotations larger than 64,
the lowpart and the highpart (%rax and %rdx) are transposed, and
it would be nice to have a conditional swap/exchange.  The inspired
solution the compiler comes up with is to store/duplicate the same
value in both %rax/%rdx, and then use complementary conditional moves
to either update the lowpart or highpart, which cleverly avoids the
potential decode-stage pipeline stall (on some microarchitectures)
from having multiple instructions conditional on the same condition.
See X86_TUNE_ONE_IF_CONV_INSN, and notice there are two such stalls
in the original expansion of rot[rl]ti3.
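
The new sequence can be modelled outside the compiler to check the reasoning. The following is only a sketch in Python, not part of the patch: `shrd`, `rotr128` and `rotr128_reference` are hypothetical helper names, and the model assumes SHRD uses only the low 6 bits of the count, as it does for 64-bit operands.

```python
MASK64 = (1 << 64) - 1


def shrd(dst, src, count):
    """Model x86 SHRD: shift dst right, filling from the low bits of src.

    Only the low 6 bits of the count are used, as for 64-bit operands.
    """
    count &= 63
    if count == 0:
        return dst
    return ((dst >> count) | (src << (64 - count))) & MASK64


def rotr128(x, i):
    """Rotate a 128-bit value right by i, mirroring the new sequence."""
    lo, hi = x & MASK64, x >> 64
    new_lo = shrd(lo, hi, i)   # shrdq %rsi, %rdx
    new_hi = shrd(hi, lo, i)   # shrdq %rdi, %rsi
    # Duplicate one value, then apply two complementary conditional moves.
    rax = rdx = new_lo         # movq %rdx, %rax
    if i & 64:
        rax = new_hi           # cmovne %rsi, %rax
    else:
        rdx = new_hi           # cmove %rsi, %rdx
    return (rdx << 64) | rax


def rotr128_reference(x, i):
    """Straightforward 128-bit rotate, used only to check the model."""
    i %= 128
    return ((x >> i) | (x << (128 - i))) & ((1 << 128) - 1)
```

Comparing `rotr128` against `rotr128_reference` for every count from 0 to 127 exercises both the "bit 6 clear" and the "halves swapped" paths.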

One quick question, does TARGET_64BIT (always) imply TARGET_CMOVE?

This patch has been tested on x86_64-pc-linux-gnu with a make bootstrap
and make -k check with no new failures.  Interestingly the correct
behaviour is already tested by (amongst other tests) sse2-v1ti-shift-3.c
that confirms V1TImode rotates by constants match rotlti3/rotrti3.

Ok for mainline?


2021-11-01  Roger Sayle  

* config/i386/i386.md (ti3): Provide expansion for
rotations by non-constant amounts on TARGET_CMOVE architectures.


Thanks in advance,
Roger
--

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index e733a40..2285c6c 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -12572,6 +12572,31 @@
   if (const_1_to_63_operand (operands[2], VOIDmode))
 emit_insn (gen_ix86_ti3_doubleword
(operands[0], operands[1], operands[2]));
+  else if (TARGET_CMOVE)
+{
+  rtx amount = force_reg (QImode, operands[2]);
+  rtx src_lo = gen_lowpart (DImode, operands[1]);
+  rtx src_hi = gen_highpart (DImode, operands[1]);
+  rtx tmp_lo = gen_reg_rtx (DImode);
+  rtx tmp_hi = gen_reg_rtx (DImode);
+  emit_move_insn (tmp_lo, src_lo);
+  emit_move_insn (tmp_hi, src_hi);
+  if ( == ROTATE)
+   {
+ emit_insn (gen_x86_64_shld (tmp_lo, src_hi, amount));
+ emit_insn (gen_x86_64_shld (tmp_hi, src_lo, amount));
+   }
+  else
+   {
+ emit_insn (gen_x86_64_shrd (tmp_lo, src_hi, amount));
+ emit_insn (gen_x86_64_shrd (tmp_hi, src_lo, amount));
+   }
+  rtx dst_lo = gen_lowpart (DImode, operands[0]);
+  rtx dst_hi = gen_highpart (DImode, operands[0]);
+  emit_move_insn (dst_lo, tmp_lo);
+  emit_move_insn (dst_hi, tmp_hi);
+  emit_insn (gen_x86_shiftdi_adj_1 (dst_lo, dst_hi, amount, tmp_lo));
+}
   else
 FAIL;

[PATCH v1] aarch64: enable Ampere-1 CPU

2021-11-01 Thread Philipp Tomsich

This adds support and a basic tuning model for the Ampere Computing
"Ampere-1" CPU.

The Ampere-1 implements the ARMv8.6 architecture in A64 mode and is
modelled as a 4-wide issue (as with all modern micro-architectures,
the chosen issue rate is a compromise between the maximum dispatch
rate and the maximum rate of uops issued to the scheduler).

This adds the -mcpu=ampere1 command-line option and the relevant cost
information/tuning tables for the Ampere-1.

gcc/ChangeLog:

* config/aarch64/aarch64-cores.def (AARCH64_CORE): New Ampere-1
core.
* config/aarch64/aarch64-tune.md: Regenerate.
* config/aarch64/aarch64-cost-tables.h: Add extra costs for
Ampere-1.
* config/aarch64/aarch64.c: Add tuning structures for Ampere-1.

---

 gcc/config/aarch64/aarch64-cores.def |   3 +-
 gcc/config/aarch64/aarch64-cost-tables.h | 107 +++
 gcc/config/aarch64/aarch64-tune.md   |   2 +-
 gcc/config/aarch64/aarch64.c |  78 +
 gcc/doc/invoke.texi  |   2 +-
 5 files changed, 189 insertions(+), 3 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-cores.def 
b/gcc/config/aarch64/aarch64-cores.def
index 77da31084de..617cde42fba 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -68,7 +68,8 @@ AARCH64_CORE("octeontx83",octeontxt83,   thunderx,  8A,  
AARCH64_FL_FOR_ARCH
 AARCH64_CORE("thunderxt81",   thunderxt81,   thunderx,  8A,  
AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
0x0a2, -1)
 AARCH64_CORE("thunderxt83",   thunderxt83,   thunderx,  8A,  
AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
0x0a3, -1)
 
-/* Ampere Computing cores. */
+/* Ampere Computing ('\xC0') cores. */
+AARCH64_CORE("ampere1", ampere1, cortexa57, 8_6A, AARCH64_FL_FOR_ARCH8_6, 
ampere1, 0xC0, 0xac3, -1)
 /* Do not swap around "emag" and "xgene1",
this order is required to handle variant correctly. */
 AARCH64_CORE("emag",emag,  xgene1,8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC | AARCH64_FL_CRYPTO, emag, 0x50, 0x000, 3)
diff --git a/gcc/config/aarch64/aarch64-cost-tables.h 
b/gcc/config/aarch64/aarch64-cost-tables.h
index bb499a1eae6..e6ded65b67d 100644
--- a/gcc/config/aarch64/aarch64-cost-tables.h
+++ b/gcc/config/aarch64/aarch64-cost-tables.h
@@ -668,4 +668,111 @@ const struct cpu_cost_table a64fx_extra_costs =
   }
 };
 
+const struct cpu_cost_table ampere1_extra_costs =
+{
+  /* ALU */
+  {
+0, /* arith.  */
+0, /* logical.  */
+0, /* shift.  */
+COSTS_N_INSNS (1), /* shift_reg.  */
+0, /* arith_shift.  */
+COSTS_N_INSNS (1), /* arith_shift_reg.  */
+0, /* log_shift.  */
+COSTS_N_INSNS (1), /* log_shift_reg.  */
+0, /* extend.  */
+COSTS_N_INSNS (1), /* extend_arith.  */
+0, /* bfi.  */
+0, /* bfx.  */
+0, /* clz.  */
+0, /* rev.  */
+0, /* non_exec.  */
+true   /* non_exec_costs_exec.  */
+  },
+  {
+/* MULT SImode */
+{
+  COSTS_N_INSNS (3),   /* simple.  */
+  COSTS_N_INSNS (3),   /* flag_setting.  */
+  COSTS_N_INSNS (3),   /* extend.  */
+  COSTS_N_INSNS (4),   /* add.  */
+  COSTS_N_INSNS (4),   /* extend_add.  */
+  COSTS_N_INSNS (18)   /* idiv.  */
+},
+/* MULT DImode */
+{
+  COSTS_N_INSNS (3),   /* simple.  */
+  0,   /* flag_setting (N/A).  */
+  COSTS_N_INSNS (3),   /* extend.  */
+  COSTS_N_INSNS (4),   /* add.  */
+  COSTS_N_INSNS (4),   /* extend_add.  */
+  COSTS_N_INSNS (34)   /* idiv.  */
+}
+  },
+  /* LD/ST */
+  {
+COSTS_N_INSNS (4), /* load.  */
+COSTS_N_INSNS (4), /* load_sign_extend.  */
+0, /* ldrd (n/a).  */
+0, /* ldm_1st.  */
+0, /* ldm_regs_per_insn_1st.  */
+0, /* ldm_regs_per_insn_subsequent.  */
+COSTS_N_INSNS (5), /* loadf.  */
+COSTS_N_INSNS (5), /* loadd.  */
+COSTS_N_INSNS (5), /* load_unaligned.  */
+0, /* store.  */
+0, /* strd.  */
+0, /* stm_1st.  */
+0, /* stm_regs_per_insn_1st.  */
+0, /* stm_regs_per_insn_subsequent.  */
+COSTS_N_INSNS (2), /* storef.  */
+COSTS_N_INSNS (2), /* stored.  */
+COSTS_N_INSNS (2), /* store_unaligned.  */
+COSTS_N_INSNS (3), /* loadv.  */
+COSTS_N_INSNS (3)  /* storev.  */
+  },
+  {
+/* FP SFmode */
+{
+  COSTS_N_INSNS (25),  /* div.  */
+  COSTS_N_INSNS (4),

[PATCH] Fix test-suite pattern scanning.

2021-11-01 Thread Martin Liška


Pushed as obvious.

Martin

Fixes:

UNRESOLVED: g++.dg/ipa/modref-1.C   scan-ipa-dump local-pure-const1 "Function found 
to be const: {anonymous}::B::genB"
UNRESOLVED: g++.dg/ipa/modref-1.C   scan-ipa-dump modref1 "Retslot flags: direct 
noescape nodirectescape not_returned noread"

gcc/testsuite/ChangeLog:

* g++.dg/ipa/modref-1.C: Fix test-suite pattern scanning.
---
 gcc/testsuite/g++.dg/ipa/modref-1.C | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/g++.dg/ipa/modref-1.C 
b/gcc/testsuite/g++.dg/ipa/modref-1.C
index 1de9e1d06a5..0acae2114dd 100644
--- a/gcc/testsuite/g++.dg/ipa/modref-1.C
+++ b/gcc/testsuite/g++.dg/ipa/modref-1.C
@@ -30,6 +30,6 @@ int main()
linker_error ();
return 0;
 }
-/* { dg-final { scan-ipa-dump "Function found to be const: {anonymous}::B::genB" 
"local-pure-const1"  } } */
-/* { dg-final { scan-ipa-dump "Retslot flags: direct noescape nodirectescape not_returned 
noread" "modref1" } } */
+/* { dg-final { scan-tree-dump "Function found to be const: {anonymous}::B::genB" 
"local-pure-const1"  } } */
+/* { dg-final { scan-tree-dump "Retslot flags: direct noescape nodirectescape not_returned 
noread" "modref1" } } */
   
--

2.33.1

Re: [PATCH] contrib: add unicode/utf8-dump.py

2021-11-01 Thread Martin Liška


On 11/1/21 16:32, David Malcolm wrote:

> Thanks.  Here's an updated version of the script that fixes the above
> issues.


Thanks.  Please install it (there are no strict approval rules for the
contrib folder).

Cheers,
Martin

[PATCH] contrib: add unicode/utf8-dump.py

2021-11-01 Thread David Malcolm via Gcc-patches

On Mon, 2021-11-01 at 15:36 +0100, Martin Liška wrote:
> On 11/1/21 15:14, David Malcolm via Gcc-patches wrote:
> > > This script may be useful when debugging issues relating to
> > > Unicode encoding (e.g. when investigating source files with
> > > bidirectional control characters).
> 
> I like the script except the following flake8 issues:
> 
> $ flake8 contrib/unicode/utf8-dump.py
> contrib/unicode/utf8-dump.py:35:1: E302 expected 2 blank lines, found
> 1
> contrib/unicode/utf8-dump.py:43:1: E302 expected 2 blank lines, found
> 1
> contrib/unicode/utf8-dump.py:53:1: E302 expected 2 blank lines, found
> 1
> contrib/unicode/utf8-dump.py:64:1: E305 expected 2 blank lines after
> class or function definition, found 1

Thanks.  Here's an updated version of the script that fixes the
above issues.

contrib/ChangeLog:
* unicode/utf8-dump.py: New file.

Signed-off-by: David Malcolm 
---
 contrib/unicode/utf8-dump.py | 69 
 1 file changed, 69 insertions(+)
 create mode 100755 contrib/unicode/utf8-dump.py

diff --git a/contrib/unicode/utf8-dump.py b/contrib/unicode/utf8-dump.py
new file mode 100755
index 000..f12ee79f9f2
--- /dev/null
+++ b/contrib/unicode/utf8-dump.py
@@ -0,0 +1,69 @@
+#!/usr/bin/env python3
+#
+# Script to dump a UTF-8 file as a list of numbered lines (mimicking GCC's
+# diagnostic output format), interleaved with lines per character showing
+# the Unicode codepoints, the UTF-8 encoding bytes, the name of the
+# character, and, where printable, the characters themselves.
+# The lines are printed in logical order, which may help the reader to grok
+# the relationship between visual and logical ordering in bi-di files.
+#
+# SPDX-License-Identifier: MIT
+#
+# Copyright (C) 2021 David Malcolm .
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the "Software"),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice shall be included
+# in all copies or substantial portions of the Software.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT
+# OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE
+# OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+
+import sys
+import unicodedata
+
+
+def get_name(ch):
+    try:
+        return unicodedata.name(ch)
+    except ValueError:
+        if ch == '\n':
+            return 'LINE FEED (LF)'
+    return '(unknown)'
+
+
+def get_printable(ch):
+    cat = unicodedata.category(ch)
+    if cat == 'Cc':
+        return '(control character)'
+    elif cat == 'Cf':
+        return '(format control)'
+    elif cat[0] == 'Z':
+        return '(separator)'
+    return ch
+
+
+def dump_file(f_in):
+    line_num = 1
+    for line in f_in:
+        print('%4i | %s' % (line_num, line.rstrip()))
+        for ch in line:
+            utf8_desc = '%15s' % (' '.join(['0x%02x' % b
+                                            for b in ch.encode('utf-8')]))
+            print('%4s |   U+%04X %s %40s %s'
+                  % ('', ord(ch), utf8_desc, get_name(ch), get_printable(ch)))
+        line_num += 1
+
+
+with open(sys.argv[1], mode='r') as f_in:
+    dump_file(f_in)
-- 
2.26.3

Re: [PATCH] contrib: add unicode/utf8-dump.py

2021-11-01 Thread Martin Liška


On 11/1/21 15:14, David Malcolm via Gcc-patches wrote:

> This script may be useful when debugging issues relating to Unicode
> encoding (e.g. when investigating source files with bidirectional control
> characters).


I like the script except the following flake8 issues:

$ flake8 contrib/unicode/utf8-dump.py
contrib/unicode/utf8-dump.py:35:1: E302 expected 2 blank lines, found 1
contrib/unicode/utf8-dump.py:43:1: E302 expected 2 blank lines, found 1
contrib/unicode/utf8-dump.py:53:1: E302 expected 2 blank lines, found 1
contrib/unicode/utf8-dump.py:64:1: E305 expected 2 blank lines after class or 
function definition, found 1

Martin

Re: redundant bitmap_bit_p followed by bitmap_clear_bit [was: Re: [COMMITTED] Kill second order relations in the path solver.]

2021-11-01 Thread Aldy Hernandez via Gcc-patches

I'm not convinced this makes the code clearer to read, especially if
it's not on a critical path.  But if you feel strongly, please submit
a patch ;-).

Aldy

On Mon, Nov 1, 2021 at 3:10 PM Bernhard Reutner-Fischer
 wrote:
>
> On Thu, 28 Oct 2021 01:55:30 +0200
> Bernhard Reutner-Fischer  wrote:
>
> > On Wed, 27 Oct 2021 20:13:21 +0200
> > Aldy Hernandez via Gcc-patches  wrote:
>
> > > @@ -1306,6 +1307,24 @@ path_oracle::killing_def (tree ssa)
> > >ptr->m_next = m_equiv.m_next;
> > >m_equiv.m_next = ptr;
> > >bitmap_ior_into (m_equiv.m_names, b);
> > > +
> > > +  // Walk the relation list an remove SSA from any relations.
> >
> > s/an /and /
> >
> > > +  if (!bitmap_bit_p (m_relations.m_names, v))
> > > +return;
> > > +
> > > +  bitmap_clear_bit (m_relations.m_names, v);
> >
> > IIRC bitmap_clear_bit returns true if the bit was set, false otherwise,
> > so should be used as if(!bitmap_clear_bit) above.
>
> > > +  relation_chain **prev = &(m_relations.m_head);
> >
> > s/[()]//
> > thanks,
>
> There seem to be two other spots, in dse and ira, where a redundant
> bitmap_bit_p check precedes a bitmap_clear_bit.
> Like:
> $ cat ~/coccinelle/gcc_bitmap_bit_p-0.cocci ; echo EOF
> // replace redundant bitmap_bit_p() bitmap_clear_bit with the latter
> @ rule1 @
> identifier fn;
> expression bitmap, bit;
> @@
>
> fn(...) {
> <...
> (
> -if (bitmap_bit_p (bitmap, bit))
> +if (bitmap_clear_bit (bitmap, bit))
> {
>   ...
> -  bitmap_clear_bit (bitmap, bit);
>   ...
> }
> |
> -if (bitmap_bit_p (bitmap, bit))
> +if (bitmap_clear_bit (bitmap, bit))
> {
>   ...
> }
> ...
> -bitmap_clear_bit (bitmap, bit);
> )
> ...>
> }
> EOF
> $ find gcc/ -type f -a \( -name "*.c" -o -name "*.cc" \) -a \( ! -path 
> "gcc/testsuite/*" -a ! -path "gcc/contrib/*" \) -exec spatch -sp_file 
> ~/coccinelle/gcc_bitmap_bit_p-0.cocci --show-diff {} \;
> diff =
> --- gcc/dse.c
> +++ /tmp/cocci-output-1104419-443759-dse.c
> @@ -3238,9 +3238,8 @@ mark_reachable_blocks (sbitmap unreachab
>edge e;
>edge_iterator ei;
>
> -  if (bitmap_bit_p (unreachable_blocks, bb->index))
> +  if (bitmap_clear_bit(unreachable_blocks, bb->index))
>  {
> -  bitmap_clear_bit (unreachable_blocks, bb->index);
>FOR_EACH_EDGE (e, ei, bb->preds)
> {
>   mark_reachable_blocks (unreachable_blocks, e->src);
> diff =
> --- gcc/ira.c
> +++ /tmp/cocci-output-1104678-d8679a-ira.c
> @@ -2944,17 +2944,15 @@ mark_elimination (int from, int to)
>FOR_EACH_BB_FN (bb, cfun)
>  {
>r = DF_LR_IN (bb);
> -  if (bitmap_bit_p (r, from))
> +  if (bitmap_clear_bit(r, from))
> {
> - bitmap_clear_bit (r, from);
>   bitmap_set_bit (r, to);
> }
>if (! df_live)
>  continue;
>r = DF_LIVE_IN (bb);
> -  if (bitmap_bit_p (r, from))
> +  if (bitmap_clear_bit(r, from))
> {
> - bitmap_clear_bit (r, from);
>   bitmap_set_bit (r, to);
> }
>  }
> # in ira.c one would have to fixup the curly braces manually
> PS: coccinelle seems to ruin the spaces before braces in the '+' even
> though i have written them correctly according to GNU style..
>

[PATCH] contrib: add unicode/utf8-dump.py

2021-11-01 Thread David Malcolm via Gcc-patches

This script may be useful when debugging issues relating to Unicode
encoding (e.g. when investigating source files with bidirectional control
characters).

It dumps a UTF-8 file as a list of numbered lines (mimicking GCC's
diagnostic output format), interleaved with lines per character showing
the Unicode codepoints, the UTF-8 encoding bytes, the name of the
character, and, where printable, the characters themselves.
The lines are printed in logical order, which may help the reader to grok
the relationship between visual and logical ordering in bi-di files.

For example:

$ cat test.c
int གྷ;
const char *אבג = "ALEF-BET-GIMEL";

$ ./contrib/unicode/utf8-dump.py test.c
   1 | int གྷ;
     |   U+0069            0x69                     LATIN SMALL LETTER I i
     |   U+006E            0x6e                     LATIN SMALL LETTER N n
     |   U+0074            0x74                     LATIN SMALL LETTER T t
     |   U+0020            0x20                                    SPACE (separator)
     |   U+0F43  0xe0 0xbd 0x83                       TIBETAN LETTER GHA གྷ
     |   U+003B            0x3b                                SEMICOLON ;
     |   U+000A            0x0a                           LINE FEED (LF) (control character)
   2 | const char *אבג = "ALEF-BET-GIMEL";
     |   U+0063            0x63                     LATIN SMALL LETTER C c
     |   U+006F            0x6f                     LATIN SMALL LETTER O o
     |   U+006E            0x6e                     LATIN SMALL LETTER N n
     |   U+0073            0x73                     LATIN SMALL LETTER S s
     |   U+0074            0x74                     LATIN SMALL LETTER T t
     |   U+0020            0x20                                    SPACE (separator)
     |   U+0063            0x63                     LATIN SMALL LETTER C c
     |   U+0068            0x68                     LATIN SMALL LETTER H h
     |   U+0061            0x61                     LATIN SMALL LETTER A a
     |   U+0072            0x72                     LATIN SMALL LETTER R r
     |   U+0020            0x20                                    SPACE (separator)
     |   U+002A            0x2a                                 ASTERISK *
     |   U+05D0       0xd7 0x90                       HEBREW LETTER ALEF א
     |   U+05D1       0xd7 0x91                        HEBREW LETTER BET ב
     |   U+05D2       0xd7 0x92                      HEBREW LETTER GIMEL ג
     |   U+0020            0x20                                    SPACE (separator)
     |   U+003D            0x3d                              EQUALS SIGN =
     |   U+0020            0x20                                    SPACE (separator)
     |   U+0022            0x22                           QUOTATION MARK "
     |   U+0041            0x41                   LATIN CAPITAL LETTER A A
     |   U+004C            0x4c                   LATIN CAPITAL LETTER L L
     |   U+0045            0x45                   LATIN CAPITAL LETTER E E
     |   U+0046            0x46                   LATIN CAPITAL LETTER F F
     |   U+002D            0x2d                             HYPHEN-MINUS -
     |   U+0042            0x42                   LATIN CAPITAL LETTER B B
     |   U+0045            0x45                   LATIN CAPITAL LETTER E E
     |   U+0054            0x54                   LATIN CAPITAL LETTER T T
     |   U+002D            0x2d                             HYPHEN-MINUS -
     |   U+0047            0x47                   LATIN CAPITAL LETTER G G
     |   U+0049            0x49                   LATIN CAPITAL LETTER I I
     |   U+004D            0x4d                   LATIN CAPITAL LETTER M M
     |   U+0045            0x45                   LATIN CAPITAL LETTER E E
     |   U+004C            0x4c                   LATIN CAPITAL LETTER L L
     |   U+0022            0x22                           QUOTATION MARK "
     |   U+003B            0x3b                                SEMICOLON ;
     |   U+000A            0x0a                           LINE FEED (LF) (control character)
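
For the bidirectional-control use case specifically, the same `unicodedata` category test can be used to list only the format-control characters in a string. This is a minimal sketch rather than part of the script; `find_format_controls` is a hypothetical helper name, and it assumes that flagging every character of category 'Cf' is an acceptable approximation (that category includes the bidi controls but also other format characters).

```python
import unicodedata


def find_format_controls(text):
    """Return (index, codepoint, name) for each 'Cf' character in text."""
    hits = []
    for index, ch in enumerate(text):
        if unicodedata.category(ch) == 'Cf':
            hits.append((index, 'U+%04X' % ord(ch),
                         unicodedata.name(ch, '(unknown)')))
    return hits
```

Calling it on a string containing U+202E reports that character's index and its name, RIGHT-TO-LEFT OVERRIDE; plain ASCII source yields an empty list.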

Tested with Python 3.8

OK for trunk and to backport?

contrib/ChangeLog:
* unicode/utf8-dump.py: New file.

Signed-off-by: David Malcolm 
---
 contrib/unicode/utf8-dump.py | 65 
 1 file changed, 65 insertions(+)
 create mode 100755 contrib/unicode/utf8-dump.py

diff --git a/contrib/unicode/utf8-dump.py b/contrib/unicode/utf8-dump.py
new file mode 100755
index 000..21885a85bdc
--- /dev/null
+++ b/contrib/unicode/utf8-dump.py
@@ -0,0 +1,65 @@
+#!/usr/bin/env python3
+#
+# Script to dump a UTF-8 file as a list of numbered lines (mimicking GCC's
+# diagnostic output format), interleaved with lines per character showing
+# the Unicode codepoints, the UTF-8 encoding bytes, the name of the
+# character, and, where printable, the characters themselves.
+# The lines are printed in logical order, which may help the reader to grok
+# the relationship between visual and logical ordering in

redundant bitmap_bit_p followed by bitmap_clear_bit [was: Re: [COMMITTED] Kill second order relations in the path solver.]

2021-11-01 Thread Bernhard Reutner-Fischer via Gcc-patches

On Thu, 28 Oct 2021 01:55:30 +0200
Bernhard Reutner-Fischer  wrote:

> On Wed, 27 Oct 2021 20:13:21 +0200
> Aldy Hernandez via Gcc-patches  wrote:

> > @@ -1306,6 +1307,24 @@ path_oracle::killing_def (tree ssa)
> >ptr->m_next = m_equiv.m_next;
> >m_equiv.m_next = ptr;
> >bitmap_ior_into (m_equiv.m_names, b);
> > +
> > +  // Walk the relation list an remove SSA from any relations.  
> 
> s/an /and /
> 
> > +  if (!bitmap_bit_p (m_relations.m_names, v))
> > +return;
> > +
> > +  bitmap_clear_bit (m_relations.m_names, v);  
> 
> IIRC bitmap_clear_bit returns true if the bit was set, false otherwise,
> so should be used as if(!bitmap_clear_bit) above.

> > +  relation_chain **prev = &(m_relations.m_head);  
> 
> s/[()]//
> thanks,

There seem to be two other spots, in dse and ira, where a redundant
bitmap_bit_p check precedes a bitmap_clear_bit.
Like:
$ cat ~/coccinelle/gcc_bitmap_bit_p-0.cocci ; echo EOF
// replace redundant bitmap_bit_p() bitmap_clear_bit with the latter
@ rule1 @
identifier fn;
expression bitmap, bit;
@@

fn(...) {
<...
(
-if (bitmap_bit_p (bitmap, bit))
+if (bitmap_clear_bit (bitmap, bit))
{
  ...
-  bitmap_clear_bit (bitmap, bit);
  ...
}
|
-if (bitmap_bit_p (bitmap, bit))
+if (bitmap_clear_bit (bitmap, bit))
{
  ...
}
...
-bitmap_clear_bit (bitmap, bit);
)
...>
}
EOF
$ find gcc/ -type f -a \( -name "*.c" -o -name "*.cc" \) -a \( ! -path 
"gcc/testsuite/*" -a ! -path "gcc/contrib/*" \) -exec spatch -sp_file 
~/coccinelle/gcc_bitmap_bit_p-0.cocci --show-diff {} \;
diff = 
--- gcc/dse.c
+++ /tmp/cocci-output-1104419-443759-dse.c
@@ -3238,9 +3238,8 @@ mark_reachable_blocks (sbitmap unreachab
   edge e;
   edge_iterator ei;
 
-  if (bitmap_bit_p (unreachable_blocks, bb->index))
+  if (bitmap_clear_bit(unreachable_blocks, bb->index))
 {
-  bitmap_clear_bit (unreachable_blocks, bb->index);
   FOR_EACH_EDGE (e, ei, bb->preds)
{
  mark_reachable_blocks (unreachable_blocks, e->src);
diff = 
--- gcc/ira.c
+++ /tmp/cocci-output-1104678-d8679a-ira.c
@@ -2944,17 +2944,15 @@ mark_elimination (int from, int to)
   FOR_EACH_BB_FN (bb, cfun)
 {
   r = DF_LR_IN (bb);
-  if (bitmap_bit_p (r, from))
+  if (bitmap_clear_bit(r, from))
{
- bitmap_clear_bit (r, from);
  bitmap_set_bit (r, to);
}
   if (! df_live)
 continue;
   r = DF_LIVE_IN (bb);
-  if (bitmap_bit_p (r, from))
+  if (bitmap_clear_bit(r, from))
{
- bitmap_clear_bit (r, from);
  bitmap_set_bit (r, to);
}
 }
# in ira.c one would have to fix up the curly braces manually
PS: coccinelle seems to ruin the spaces before braces in the '+' lines even
though I have written them correctly according to GNU style.
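
The idiom behind the semantic patch can be shown in isolation. The sketch below is not GCC code: `Bitmap` is a hypothetical stand-in class, written on the assumption (as stated earlier in the thread) that clearing a bit reports whether the bit was previously set, which is what makes the separate membership test redundant.

```python
class Bitmap:
    """Tiny stand-in for a bitmap; not GCC's actual data structure."""

    def __init__(self):
        self._bits = set()

    def set_bit(self, bit):
        """Set BIT; return True if it was previously clear."""
        changed = bit not in self._bits
        self._bits.add(bit)
        return changed

    def bit_p(self, bit):
        """Return True if BIT is set."""
        return bit in self._bits

    def clear_bit(self, bit):
        """Clear BIT; return True if it was previously set."""
        was_set = bit in self._bits
        self._bits.discard(bit)
        return was_set


def rename_two_step(live, old, new):
    """Test first, then clear: two lookups of the same bit."""
    if live.bit_p(old):
        live.clear_bit(old)
        live.set_bit(new)


def rename_one_step(live, old, new):
    """Clear and test in one call, as the semantic patch proposes."""
    if live.clear_bit(old):
        live.set_bit(new)
```

Both helpers leave the bitmap in the same state; the one-step form simply avoids the second lookup.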

PING^2 [PATCH v4 0/2] Implement indirect external access

2021-11-01 Thread H.J. Lu via Gcc-patches

On Thu, Oct 21, 2021 at 12:56 PM H.J. Lu  wrote:
>
> On Wed, Sep 22, 2021 at 7:02 PM H.J. Lu  wrote:
> >
> > Changes in the v4 patch.
> >
> > 1. Add nodirect_extern_access attribute.
> >
> > Changes in the v3 patch.
> >
> > 1. GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS support has been added to
> > GNU binutils 2.38.  But the -z indirect-extern-access linker option is
> > only available for Linux/x86.  However, the --max-cache-size=SIZE linker
> > option was also added within a day.  --max-cache-size=SIZE is used to
> > check for GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS support.
> >
> > Changes in the v2 patch.
> >
> > 1. Rename the option to -fdirect-extern-access.
> >
> > ---
> > On systems with copy relocation:
> > * A copy in executable is created for the definition in a shared library
> > at run-time by ld.so.
> > * The copy is referenced by executable and shared libraries.
> > * Executable can access the copy directly.
> >
> > Issues are:
> > * Overhead of a copy, time and space, may be visible at run-time.
> > * Read-only data in the shared library becomes read-write copy in
> > executable at run-time.
> > * Local access to data with the STV_PROTECTED visibility in the shared
> > library must use GOT.
> >
> > On systems without function descriptor, function pointers vary depending
> > on where and how the functions are defined.
> > * If the function is defined in executable, it can be the address of
> > function body.
> > * If the function, including the function with STV_PROTECTED visibility,
> > is defined in the shared library, it can be the address of the PLT entry
> > in executable or shared library.
> >
> > Issues are:
> > * The address of function body may not be used as its function pointer.
> > * ld.so needs to search loaded shared libraries for the function pointer
> > of the function with STV_PROTECTED visibility.
> >
> > Here is a proposal to remove copy relocation and use canonical function
> > pointer:
> >
> > 1. Accesses, including in PIE and non-PIE, to undefined symbols must
> > use GOT.
> >   a. Linker may optimize out GOT access if the data is defined in PIE or
> >   non-PIE.
> > 2. Read-only data in the shared library remain read-only at run-time
> > 3. Address of global data with the STV_PROTECTED visibility in the shared
> > library is the address of data body.
> >   a. Can use IP-relative access.
> >   b. May need GOT without IP-relative access.
> > 4. For systems without function descriptor,
> >   a. All global function pointers of undefined functions in PIE and
> >   non-PIE must use GOT.  Linker may optimize out GOT access if the
> >   function is defined in PIE or non-PIE.
> >   b. Function pointer of functions with the STV_PROTECTED visibility in
> >   executable and shared library is the address of function body.
> >i. Can use IP-relative access.
> >ii. May need GOT without IP-relative access.
> >iii. Branches to undefined functions may use PLT.
> > 5. Single global definition marker:
> >
> > Add GNU_PROPERTY_1_NEEDED:
> >
> > #define GNU_PROPERTY_1_NEEDED GNU_PROPERTY_UINT32_OR_LO
> >
> > to indicate the needed properties by the object file.
> >
> > Add GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS:
> >
> > #define GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS (1U << 0)
> >
> > to indicate that the object file requires canonical function pointers and
> > cannot be used with copy relocation.  This bit should be cleared in
> > executable when there are non-GOT or non-PLT relocations in relocatable
> > input files without this bit set.
> >
> >   a. Protected symbol access within the shared library can be treated as
> >   local.
> >   b. Copy relocation should be disallowed at link-time and run-time.
> >   c. GOT function pointer reference is required at link-time and run-time.
> >
> > The indirect external access marker can be used in the following ways:
> >
> > 1. Linker can decide the best way to resolve a relocation against a
> > protected symbol before seeing all relocations against the symbol.
> > 2. Dynamic linker can decide if it is an error to have a copy relocation
> > in executable against the protected symbol in a shared library by checking
> > if the shared library is built with -fno-direct-extern-access.
> >
> > Add a compiler option, -fdirect-extern-access. -fdirect-extern-access is
> > the default.  With -fno-direct-extern-access:
> >
> > 1. Always use GOT to access undefined symbols, including in PIE and
> > non-PIE.  This is safe to do and does not break the ABI.
> > 2. In executable and shared library, for symbols with the STV_PROTECTED
> > visibility:
> >   a. The address of data symbol is the address of data body.
> >   b. For systems without function descriptor, the function pointer is
> >   the address of function body.
> > These break the ABI and resulting shared libraries may not be compatible
> > with executables which are not compiled with -fno-direct-extern-access.
> > 3. Generate an indirect external access marker in

Re: [PATCH]middle-end testsuite: fix failing complex add testcases PR103000

2021-11-01 Thread Jeff Law via Gcc-patches





On 11/1/2021 3:54 AM, Tamar Christina via Gcc-patches wrote:

Hi All,

Some targets have overridden the default unroll factor and so do not have enough
data to succeed for SLP vectorization if loop vect is turned off.

To fix this just always unroll in these testcases.

Bootstrapped and regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
with no issues.

Ok for master?

Thanks,
Tamar

gcc/testsuite/ChangeLog:

PR testsuite/103000
* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-double.c:
Force unroll.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-float.c: Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-float.c:
Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-half-float.c:
Likewise.

OK
jeff

[committed] diagnostics: escape non-ASCII source bytes for certain diagnostics

2021-11-01 Thread David Malcolm via Gcc-patches

This patch adds support to GCC's diagnostic subsystem for escaping certain
bytes and Unicode characters when quoting source code.

Specifically, this patch adds a new flag rich_location::m_escape_on_output
which is a hint from a diagnostic that non-ASCII bytes in the pertinent
lines of the user's source code should be escaped when printed.

The patch sets this for the following diagnostics:
- when complaining about stray bytes in the program (when these
are non-printable);
- when complaining about "null character(s) ignored";
- for -Wnormalized= (and generate source ranges for such warnings).

The escaping is controlled by a new option:
  -fdiagnostics-escape-format=[unicode|bytes]

For example, consider a diagnostic involving a source line containing the
string "before" followed by the Unicode character U+03C0 ("GREEK SMALL
LETTER PI", with UTF-8 encoding 0xCF 0x80) followed by the byte 0xBF
(a stray UTF-8 trailing byte), followed by the string "after", where the
diagnostic highlights the U+03C0 character.

By default, this line will be printed verbatim to the user when
reporting a diagnostic at it, as:

 beforeπXafter
       ^

(using X for the stray byte to avoid putting invalid UTF-8 in this
commit message)

If the diagnostic sets the "escape" flag, it will be printed as:

 before<U+03C0><BF>after
       ^~~~~~~~

with -fdiagnostics-escape-format=unicode (the default), or as:

  before<CF><80><BF>after
        ^~~~~~~~

if the user supplies -fdiagnostics-escape-format=bytes.

This only affects how the source is printed; it does not affect
how column numbers are printed (as per -fdiagnostics-column-unit=
and -fdiagnostics-column-origin=).
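
The two escaping styles can be illustrated outside GCC. The following Python sketch is not the implementation in diagnostic-show-locus.c; `escape_source_line` is a hypothetical helper, and it assumes the forms <U+XXXX> for valid non-ASCII characters and <XX> for individual bytes, applied to the example line described above.

```python
def escape_source_line(raw, fmt='unicode'):
    """Escape the non-ASCII content of a source line given as bytes.

    fmt='unicode': valid non-ASCII characters become <U+XXXX>; bytes
                   that are not valid UTF-8 become <XX>.
    fmt='bytes':   every non-ASCII byte becomes <XX>.
    """
    if fmt not in ('unicode', 'bytes'):
        raise ValueError('unknown escape format: %r' % fmt)
    # surrogateescape maps each undecodable byte 0xNN to U+DCNN, which
    # lets stray bytes be told apart from validly encoded characters.
    text = raw.decode('utf-8', errors='surrogateescape')
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)                        # ASCII is left alone
        elif 0xDC80 <= cp <= 0xDCFF:
            out.append('<%02X>' % (cp - 0xDC00))  # stray byte
        elif fmt == 'unicode':
            out.append('<U+%04X>' % cp)
        else:
            out.extend('<%02X>' % b for b in ch.encode('utf-8'))
    return ''.join(out)
```

Feeding it the bytes of "before", 0xCF 0x80, 0xBF, "after" shows the difference between the two formats: only the valid two-byte character is rendered differently, while the stray 0xBF byte is escaped as a byte in both.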

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

Also tested by attempting to compile a .png file as if it were C
source code.

Pushed to trunk as bd5e882cf6e0def3dd1bc106075d59a303fe0d1e.


gcc/c-family/ChangeLog:
* c-lex.c (c_lex_with_flags): When complaining about non-printable
CPP_OTHER tokens, set the "escape on output" flag.

gcc/ChangeLog:
* common.opt (fdiagnostics-escape-format=): New.
(diagnostics_escape_format): New enum.
(DIAGNOSTICS_ESCAPE_FORMAT_UNICODE): New enum value.
(DIAGNOSTICS_ESCAPE_FORMAT_BYTES): Likewise.
* diagnostic-format-json.cc (json_end_diagnostic): Add
"escape-source" attribute.
* diagnostic-show-locus.c
(exploc_with_display_col::exploc_with_display_col): Replace
"tabstop" param with a cpp_char_column_policy and add an "aspect"
param.  Use these to compute m_display_col accordingly.
(struct char_display_policy): New struct.
(layout::m_policy): New field.
(layout::m_escape_on_output): New field.
(def_policy): New function.
(make_range): Update for changes to exploc_with_display_col ctor.
(default_print_decoded_ch): New.
(width_per_escaped_byte): New.
(escape_as_bytes_width): New.
(escape_as_bytes_print): New.
(escape_as_unicode_width): New.
(escape_as_unicode_print): New.
(make_policy): New.
(layout::layout): Initialize new fields.  Update m_exploc ctor
call for above change to ctor.
(layout::maybe_add_location_range): Update for changes to
exploc_with_display_col ctor.
(layout::calculate_x_offset_display): Update for change to
cpp_display_width.
(layout::print_source_line): Pass policy
to cpp_display_width_computation. Capture cpp_decoded_char when
calling process_next_codepoint.  Move printing of source code to
m_policy.m_print_cb.
(line_label::line_label): Pass in policy rather than context.
(layout::print_any_labels): Update for change to line_label ctor.
(get_affected_range): Pass in policy rather than context, updating
calls to location_compute_display_column accordingly.
(get_printed_columns): Likewise, also for cpp_display_width.
(correction::correction): Pass in policy rather than tabstop.
(correction::compute_display_cols): Pass m_policy rather than
m_tabstop to cpp_display_width.
(correction::m_tabstop): Replace with...
(correction::m_policy): ...this.
(line_corrections::line_corrections): Pass in policy rather than
context.
(line_corrections::m_context): Replace with...
(line_corrections::m_policy): ...this.
(line_corrections::add_hint): Update to use m_policy rather than
m_context.
(line_corrections::add_hint): Likewise.
(layout::print_trailing_fixits): Likewise.
(selftest::test_display_widths): New.
(selftest::test_layout_x_offset_display_utf8): Update to use
policy rather than tabstop.
(selftest::test_one_liner_labels_utf8): Add test of escaping
source lines.
(selftest::test_diagnostic_show_locus_one_liner_utf8): Update to
use policy rather than tabstop.

[COMMITTED] PR tree-optimization/103003 - Don't register nonsensical relations.

2021-11-01 Thread Andrew MacLeod via Gcc-patches


The testcase in question has a stmt of the form:
   _10 = _4 <= _4;

We know this resolves to true, but when processing outgoing edges on
the following branch, we try to register the relations

_4 <= _4  and _4 > _4 on the 2 outgoing edges...  which is nonsense of
course.

This patch simply changes the registry to bail if the 2 ssa_names are
the same.


If the relation is ==, then equality is already implied, and any other 
relation just doesn't make much sense.
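
As a rough illustration of that early exit, here is a toy registry in C++.
It is only a sketch under my own assumptions: ToyRelationRegistry, Rel and
Name are invented stand-ins and do not correspond to the classes in
value-relation.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>

// Toy stand-ins; none of these names are GCC's.
enum class Rel { LT, LE, GT, GE, EQ, NE };
using Name = int;  // stands in for an SSA name

class ToyRelationRegistry {
public:
  // Record "a REL b".  Returns false when nothing was recorded.
  bool register_relation(Rel k, Name a, Name b) {
    // Same name on both sides: equality is already implied and any other
    // relation is meaningless, so bail out before touching the table.
    if (a == b)
      return false;
    table_[std::make_pair(a, b)] = k;
    return true;
  }
  std::size_t size() const { return table_.size(); }

private:
  std::map<std::pair<Name, Name>, Rel> table_;
};

void check_registry() {
  ToyRelationRegistry r;
  assert(!r.register_relation(Rel::LE, 4, 4));  // _4 <= _4: ignored
  assert(!r.register_relation(Rel::GT, 4, 4));  // _4 >  _4: ignored
  assert(r.register_relation(Rel::LT, 4, 10));  // distinct names: kept
  assert(r.size() == 1);
}
```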


Bootstrapped on x86_64-pc-linux-gnu with no regressions.  Pushed.

Andrew

commit 0187c03be31a58ba561d535687dc00c94f0ff1aa
Author: Andrew MacLeod 
Date:   Sat Oct 30 11:00:49 2021 -0400

Don't register nonsensical relations.

gcc/
PR tree-optimization/103003
* value-relation.cc (dom_oracle::register_relation): If the 2
ssa names are the same, don't register any relation.

gcc/testsuite/
* gcc.dg/pr103003.c: New.

diff --git a/gcc/testsuite/gcc.dg/pr103003.c b/gcc/testsuite/gcc.dg/pr103003.c
new file mode 100644
index 000..d3d65f8b6a6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr103003.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+typedef char int8_t;
+int8_t c_4, uli_5;
+unsigned short us_6;
+void func_1() {
+  int uli_9;
+  short ptr_16ptr_11 = &uli_9; /* { dg-warning "initialization of*" } */
+  for (; us_6 <= 6;)
+if ((us_6 *= uli_9) < (uli_5 || 0) ?: ((c_4 = us_6) >= us_6) - uli_5)
+  uli_9 = 9;
+}
diff --git a/gcc/value-relation.cc b/gcc/value-relation.cc
index f572bcd4dc2..f1e46d38de1 100644
--- a/gcc/value-relation.cc
+++ b/gcc/value-relation.cc
@@ -877,7 +877,13 @@ relation_oracle::register_edge (edge e, relation_kind k, tree op1, tree op2)
 void
 dom_oracle::register_relation (basic_block bb, relation_kind k, tree op1,
 			   tree op2)
-{  // Equivalencies are handled by the equivalence oracle.
+{
+  // If the 2 ssa_names are the same, do nothing.  An equivalence is implied,
+  // and no other relation makes sense.
+  if (op1 == op2)
+return;
+
+  // Equivalencies are handled by the equivalence oracle.
   if (k == EQ_EXPR)
 equiv_oracle::register_relation (bb, k, op1, op2);
   else

[committed] libstdc++: Fix range access for empty std::valarray [PR103022]

2021-11-01 Thread Jonathan Wakely via Gcc-patches

Tested powerpc64le-linux, pushed to trunk.


The std::begin and std::end overloads for std::valarray are defined in
terms of std::addressof(v[0]) which is undefined for an empty valarray.
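
For readers who want to see the shape of the fix in isolation, here is a
small self-contained sketch.  The helper names guarded_begin and guarded_end
are hypothetical; they only mirror the size check added in the patch and are
not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <valarray>

// Hypothetical free functions mirroring the guarded logic: never form
// va[0] unless the valarray has at least one element.
template <class T>
T* guarded_begin(std::valarray<T>& va) noexcept {
  return va.size() ? std::addressof(va[0]) : nullptr;
}

template <class T>
T* guarded_end(std::valarray<T>& va) noexcept {
  if (std::size_t n = va.size())
    return std::addressof(va[0]) + n;
  return nullptr;
}

void check_guarded_range() {
  // Empty: both ends are null, so [begin, end) is a valid empty range.
  std::valarray<double> empty;
  assert(guarded_begin(empty) == guarded_end(empty));

  // Non-empty: the range spans exactly size() elements.
  std::valarray<double> va{1.0, 2.0, 3.0};
  assert(guarded_end(va) - guarded_begin(va) == 3);
  assert(*guarded_begin(va) == 1.0);
}
```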

libstdc++-v3/ChangeLog:

PR libstdc++/103022
* include/std/valarray (begin, end): Do not dereference an empty
valarray. Add noexcept and [[nodiscard]].
* testsuite/26_numerics/valarray/range_access.cc: Check empty
valarray. Check iterator properties. Run as well as compiling.
* testsuite/26_numerics/valarray/range_access2.cc: Likewise.
* testsuite/26_numerics/valarray/103022.cc: New test.
---
 libstdc++-v3/include/std/valarray | 30 +---
 .../testsuite/26_numerics/valarray/103022.cc  | 15 ++
 .../26_numerics/valarray/range_access.cc  | 49 ---
 .../26_numerics/valarray/range_access2.cc | 22 -
 4 files changed, 100 insertions(+), 16 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/26_numerics/valarray/103022.cc

diff --git a/libstdc++-v3/include/std/valarray b/libstdc++-v3/include/std/valarray
index 5adc94282ee..c6242eb4db9 100644
--- a/libstdc++-v3/include/std/valarray
+++ b/libstdc++-v3/include/std/valarray
@@ -1210,9 +1210,10 @@ _DEFINE_BINARY_OPERATOR(>=, __greater_equal)
*  @param  __va  valarray.
*/
  template<class _Tp>
+[[__nodiscard__]]
 inline _Tp*
-begin(valarray<_Tp>& __va)
-{ return std::__addressof(__va[0]); }
+begin(valarray<_Tp>& __va) noexcept
+{ return __va.size() ? std::__addressof(__va[0]) : nullptr; }
 
   /**
*  @brief  Return an iterator pointing to the first element of
@@ -1220,9 +1221,10 @@ _DEFINE_BINARY_OPERATOR(>=, __greater_equal)
*  @param  __va  valarray.
*/
  template<class _Tp>
+[[__nodiscard__]]
 inline const _Tp*
-begin(const valarray<_Tp>& __va)
-{ return std::__addressof(__va[0]); }
+begin(const valarray<_Tp>& __va) noexcept
+{ return __va.size() ? std::__addressof(__va[0]) : nullptr; }
 
   /**
*  @brief  Return an iterator pointing to one past the last element of
@@ -1230,9 +1232,15 @@ _DEFINE_BINARY_OPERATOR(>=, __greater_equal)
*  @param  __va  valarray.
*/
  template<class _Tp>
+[[__nodiscard__]]
 inline _Tp*
-end(valarray<_Tp>& __va)
-{ return std::__addressof(__va[0]) + __va.size(); }
+end(valarray<_Tp>& __va) noexcept
+{
+  if (auto __n = __va.size())
+   return std::__addressof(__va[0]) + __n;
+  else
+   return nullptr;
+}
 
   /**
*  @brief  Return an iterator pointing to one past the last element of
@@ -1240,9 +1248,15 @@ _DEFINE_BINARY_OPERATOR(>=, __greater_equal)
*  @param  __va  valarray.
*/
  template<class _Tp>
+[[__nodiscard__]]
 inline const _Tp*
-end(const valarray<_Tp>& __va)
-{ return std::__addressof(__va[0]) + __va.size(); }
+end(const valarray<_Tp>& __va) noexcept
+{
+  if (auto __n = __va.size())
+   return std::__addressof(__va[0]) + __n;
+  else
+   return nullptr;
+}
 #endif // C++11
 
   /// @} group numeric_arrays
diff --git a/libstdc++-v3/testsuite/26_numerics/valarray/103022.cc b/libstdc++-v3/testsuite/26_numerics/valarray/103022.cc
new file mode 100644
index 000..d2e346760dd
--- /dev/null
+++ b/libstdc++-v3/testsuite/26_numerics/valarray/103022.cc
@@ -0,0 +1,15 @@
+// { dg-options "-D_GLIBCXX_DEBUG" }
+// { dg-do compile { target c++11 } }
+
+#include <valarray>
+
+int main()
+{
+  // PR libstdc++/103022
+  std::valarray<double> va;
+  (void) std::begin(va);
+  (void) std::end(va);
+  const auto& cva = va;
+  (void) std::begin(cva);
+  (void) std::end(cva);
+}
diff --git a/libstdc++-v3/testsuite/26_numerics/valarray/range_access.cc b/libstdc++-v3/testsuite/26_numerics/valarray/range_access.cc
index c015d1890b6..c49c2c52f47 100644
--- a/libstdc++-v3/testsuite/26_numerics/valarray/range_access.cc
+++ b/libstdc++-v3/testsuite/26_numerics/valarray/range_access.cc
@@ -1,4 +1,4 @@
-// { dg-do compile { target c++11 } }
+// { dg-do run { target c++11 } }
 
 // Copyright (C) 2010-2021 Free Software Foundation, Inc.
 //
@@ -17,7 +17,7 @@
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
-// 26.6.10 valarray range access: [valarray.range]
+// C++11 26.6.10 valarray range access: [valarray.range]
 
 #include <valarray>
 
@@ -25,9 +25,46 @@ void
 test01()
 {
   std::valarray<double> va{1.0, 2.0, 3.0};
-  std::begin(va);
-  std::end(va);
+  (void) std::begin(va);
+  (void) std::end(va);
   const auto& cva = va;
-  std::begin(cva);
-  std::end(cva);
+  (void) std::begin(cva);
+  (void) std::end(cva);
+
+  using Iter = decltype(std::begin(va));
+  using IterTraits = std::iterator_traits<Iter>;
+  static_assert( std::is_same::value, "" );
+  static_assert( std::is_same::value, "" );
+  static_assert( std::is_same::value, "" );
+  static_assert( std::is_same::value, "" );
+  using CIter = decltype(std::begin(cva));
+  using CIterTraits = std::iterator_traits<CIter>;
+  static_assert( std::is_same::value, ""

Re: [PATCH] Move statics to threader pass class.

2021-11-01 Thread Aldy Hernandez via Gcc-patches

On Mon, Nov 1, 2021 at 2:03 PM Jeff Law  wrote:
>
>
>
> On 11/1/2021 3:53 AM, Aldy Hernandez wrote:
> > This patch moves all the static functions into the pass class, and
> > cleans up things a little.  The goal is to shuffle things around such
> > that we can add debug counters that depend on different threading
> > passes, but it's a clean-up on its own right.
> >
> > Tested on x86-64 Linux.
> >
> > OK?
> >
> > gcc/ChangeLog:
> >
> >   * tree-ssa-threadbackward.c (BT_NONE): New.
> >   (BT_SPEED): New.
> >   (BT_RESOLVE): New.
> >   (back_threader::back_threader): Add flags.
> >   Move loop initialization here.
> >   (back_threader::~back_threader): New.
> >   (back_threader::find_taken_edge_switch): Change solver and ranger
> >   to pointers.
> >   (back_threader::find_taken_edge_cond): Same.
> >   (back_threader::find_paths_to_names): Same.
> >   (back_threader::find_paths): Same.
> >   (back_threader::dump): Same.
> >   (try_thread_blocks): Merge into thread_blocks.
> >   (back_threader::thread_blocks): New.
> >   (do_early_thread_jumps): Merge into thread_blocks.
> >   (do_thread_jumps): Merge into thread_blocks.
> >   (back_threader::thread_through_all_blocks): Remove.
> OK.  Presumably this is a prereq for the counter patch.

Indeed.  As promised, I'm not adding any functionality, just enhancing
the developer debug experience.  Debugging threading problems is hard
:-/.

Aldy

Re: [PATCH] Move statics to threader pass class.

2021-11-01 Thread Jeff Law via Gcc-patches





On 11/1/2021 3:53 AM, Aldy Hernandez wrote:

This patch moves all the static functions into the pass class, and
cleans up things a little.  The goal is to shuffle things around such
that we can add debug counters that depend on different threading
passes, but it's a clean-up on its own right.

Tested on x86-64 Linux.

OK?

gcc/ChangeLog:

* tree-ssa-threadbackward.c (BT_NONE): New.
(BT_SPEED): New.
(BT_RESOLVE): New.
(back_threader::back_threader): Add flags.
Move loop initialization here.
(back_threader::~back_threader): New.
(back_threader::find_taken_edge_switch): Change solver and ranger
to pointers.
(back_threader::find_taken_edge_cond): Same.
(back_threader::find_paths_to_names): Same.
(back_threader::find_paths): Same.
(back_threader::dump): Same.
(try_thread_blocks): Merge into thread_blocks.
(back_threader::thread_blocks): New.
(do_early_thread_jumps): Merge into thread_blocks.
(do_thread_jumps): Merge into thread_blocks.
(back_threader::thread_through_all_blocks): Remove.

OK.  Presumably this is a prereq for the counter patch.

jeff

Re: [PATCH] Add debug counters to back threader.

2021-11-01 Thread Jeff Law via Gcc-patches





On 11/1/2021 3:54 AM, Aldy Hernandez wrote:

Chasing down stage3 miscomparisons is never fun, and having no way to
distinguish between jump threads registered by a particular
pass, is even harder.  This patch adds debug counters for the individual
back threading passes.  I've left the ethread pass alone, as that one is
usually benign, but we could easily add it if needed.

The fact that we can only pass one boolean argument to the passes
infrastructure has us do all sorts of gymnastics to differentiate
between the various back threading passes.

Tested on x86-64 Linux.

OK?

gcc/ChangeLog:

* dbgcnt.def: Add debug counter for back_thread[12] and
back_threadfull[12].
* passes.def: Pass "first" argument to each back threading pass.
* tree-ssa-threadbackward.c (back_threader::back_threader): Add
first argument.
(back_threader::debug_counter): New.
(back_threader::maybe_register_path): Call debug_counter.

OK
jeff

[PATCH] RISC-V: Fix build errors with shNadd/shNadd.uw patterns in zba cost model

2021-11-01 Thread Maciej W. Rozycki

Fix a build regression from commit 04a9b554ba1a ("RISC-V: Cost model 
for zba extension."):

.../gcc/config/riscv/riscv.c: In function 'bool riscv_rtx_costs(rtx, machine_mode, int, int, int*, bool)':
.../gcc/config/riscv/riscv.c:2018:11: error: 'and' of mutually exclusive equal-tests is always 0 [-Werror]
 2018 |   && IN_RANGE (INTVAL (XEXP (XEXP (x, 0), 0)), 1, 3))
  |   ^~
.../gcc/config/riscv/riscv.c:2047:17: error: unused variable 'ashift_lhs' [-Werror=unused-variable]
 2047 | rtx ashift_lhs = XEXP (and_lhs, 0);
  | ^~


by removing an incorrect REG_P check applied to a constant expression 
and getting rid of the unused variable.
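
To show why the first diagnostic fires, here is a stripped-down model in
C++.  It is a sketch only: Node, Code, is_reg and is_const_int are invented
for illustration and are not GCC's rtx accessors.  The point is that one
node has exactly one code, so testing the same node for two different codes
can never succeed.

```cpp
#include <cassert>

// Hypothetical miniature of an expression node; not GCC's rtx.
enum class Code { REG, CONST_INT, ASHIFT };
struct Node {
  Code code;
  long value;  // only meaningful for CONST_INT in this toy
};

inline bool is_reg(const Node &n) { return n.code == Code::REG; }
inline bool is_const_int(const Node &n) { return n.code == Code::CONST_INT; }

// Shape of the rejected condition: both tests applied to the same node.
inline bool old_check(const Node &n) {
  return is_reg(n) && is_const_int(n) && n.value >= 1 && n.value <= 3;
}

// The same condition with the register test dropped.
inline bool new_check(const Node &n) {
  return is_const_int(n) && n.value >= 1 && n.value <= 3;
}

void check_exclusive() {
  const Node nodes[] = {
      {Code::REG, 2}, {Code::CONST_INT, 2}, {Code::CONST_INT, 7},
      {Code::ASHIFT, 2}};
  for (const Node &n : nodes)
    assert(!old_check(n));                // can never match
  assert(new_check(nodes[1]));            // constant 2 is in [1, 3]
  assert(!new_check(nodes[2]));           // constant 7 is out of range
}
```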

gcc/
* config/riscv/riscv.c (riscv_rtx_costs): Remove a REG_P check 
and an unused local variable with shNadd/shNadd.uw pattern 
handling.
---
Hi,

 As described above, and I guess almost obvious: I gather the code was
only verified with a `-Wno-error' build, and the handling of the shNadd
pattern has not actually been covered, as this bug made the condition
impossible to match.

 OK to apply then?

  Maciej
---
 gcc/config/riscv/riscv.c |2 --
 1 file changed, 2 deletions(-)

gcc-riscv-rtx-costs-zba-shnadd.diff
Index: gcc/gcc/config/riscv/riscv.c
===
--- gcc.orig/gcc/config/riscv/riscv.c
+++ gcc/gcc/config/riscv/riscv.c
@@ -2013,7 +2013,6 @@ riscv_rtx_costs (rtx x, machine_mode mod
  && ((!TARGET_64BIT && (mode == SImode)) ||
  (TARGET_64BIT && (mode == DImode)))
  && (GET_CODE (XEXP (x, 0)) == ASHIFT)
- && REG_P (XEXP (XEXP (x, 0), 0))
  && CONST_INT_P (XEXP (XEXP (x, 0), 0))
  && IN_RANGE (INTVAL (XEXP (XEXP (x, 0), 0)), 1, 3))
{
@@ -2044,7 +2043,6 @@ riscv_rtx_costs (rtx x, machine_mode mod
if (!CONST_INT_P (and_rhs))
  break;
 
-   rtx ashift_lhs = XEXP (and_lhs, 0);
rtx ashift_rhs = XEXP (and_lhs, 1);
 
if (!CONST_INT_P (ashift_rhs)

Re: [PATCH 07/18] rs6000: Builtin expansion, part 2

2021-11-01 Thread Segher Boessenkool

Hi!

On Wed, Sep 01, 2021 at 11:13:43AM -0500, Bill Schmidt wrote:
>   * config/rs6000/rs6000-call.c (rs6000_invalid_new_builtin):
>   Implement.

That fits on one line.  Don't wrap early, esp. not if that leaves a
colon without anything following it on that line: it looks like
something is missing.

>   (rs6000_expand_ldst_mask): Likewise.
>   (rs6000_init_builtins): Initialize altivec_builtin_mask_for_load.

>  static void
>  rs6000_invalid_new_builtin (enum rs6000_gen_builtins fncode)
>  {
> +  size_t uns_fncode = (size_t) fncode;

Like in the previous patch, the "uns_*" name made me think "you do not
need an explicit cast, the assignment will do that automatically".  But
of course it does not matter this is unsigned at all: the cast is
casting an enum to a number, which in C++ does require a cast.

So maybe you can think of some better name?  Something like "j" is fine
with me as well btw, it's nice and short, and it is clear you do not
want more meaning ;-)

> +  switch (rs6000_builtin_info_x[uns_fncode].enable)

> +case ENB_P6:
> +  error ("%qs requires the %qs option", name, "-mcpu=power6");
> +  break;

> +case ENB_CELL:
> +  error ("%qs is only valid for the cell processor", name);
> +  break;

Maybe  "%qs requires the %qs option", name, "-mcpu=cell"  ?  Boring is
good ;-)

> +};

(This is  switch (...) { ... };  )
Stray semi.  Was there no warning?

>  rtx
>  rs6000_expand_ldst_mask (rtx target, tree arg0)
>   {
> +  int icode2 = BYTES_BIG_ENDIAN

You do not need a line break here.

> +? (int) CODE_FOR_altivec_lvsr_direct
> +: (int) CODE_FOR_altivec_lvsl_direct;

You can align the ? and : just fine without it.

> +  rtx op, addr, pat;

Don't declare such things early.

Okay for trunk with those things fixed.  Thanks!

Segher

Re: [PATCH] x86: Document -fcf-protection requires i686 or newer

2021-11-01 Thread H.J. Lu via Gcc-patches

On Fri, Oct 29, 2021 at 3:04 PM Eric Gallager  wrote:
>
> On Thu, Oct 21, 2021 at 12:49 PM H.J. Lu via Gcc-patches
>  wrote:
> >
> > PR target/98667
> > * doc/invoke.texi: Document -fcf-protection requires i686 or
> > new.
> > ---
> >  gcc/doc/invoke.texi | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index c66a25fcd69..71992b8c597 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -15542,7 +15542,8 @@ which functions and calls should be skipped from 
> > instrumentation
> >  (@pxref{Function Attributes}).
> >
> >  Currently the x86 GNU/Linux target provides an implementation based
> > -on Intel Control-flow Enforcement Technology (CET).
> > +on Intel Control-flow Enforcement Technology (CET) which works for
> > +i686 processor or newer.
>
> I think "processor" should be pluralized to "processors"? Also,
> possibly a missing comma after "(CET)"?
>

Can you submit a patch?

Thanks.

-- 
H.J.

Re: [PATCH] [PR103017] aarch64:fix redundant check in aut insn generation

2021-11-01 Thread Richard Sandiford via Gcc-patches

Dan Li via Gcc-patches  writes:
> During the generation of the epilogue of aarch64(aarch64_expand_epilogue),
> the value of crtl->calls_eh_return does not need to be checked again.
> This value has been checked during aarch64_return_address_signing_enabled.

Ah, yeah, looks like this code became dead with
2bc95be3bb8c8138e2e87c1c11c84bfede989d61.
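
The redundancy can be checked mechanically with a small truth-table sketch
in C++.  This is a toy model resting on the premise stated in the patch
description, namely that signing is never reported as enabled for a function
that calls eh_return; signing_requested and the function names are my own
stand-ins, not the aarch64 back end's.

```cpp
#include <cassert>

// Toy model: signing is never enabled when the function calls eh_return.
// signing_requested is a hypothetical stand-in for the option state.
inline bool signing_enabled(bool signing_requested, bool calls_eh_return) {
  return signing_requested && !calls_eh_return;
}

// Condition before the patch.
inline bool old_cond(bool req, bool eh, bool sibcall, bool armv8_3) {
  return signing_enabled(req, eh) && (sibcall || !armv8_3 || eh);
}

// Condition after the patch, with the eh_return test removed.
inline bool new_cond(bool req, bool eh, bool sibcall, bool armv8_3) {
  return signing_enabled(req, eh) && (sibcall || !armv8_3);
}

// Enumerate all 16 input combinations and compare the two conditions.
void check_equivalent() {
  for (int bits = 0; bits < 16; ++bits) {
    bool req = bits & 1, eh = bits & 2, sib = bits & 4, v83 = bits & 8;
    assert(old_cond(req, eh, sib, v83) == new_cond(req, eh, sib, v83));
  }
}
```

Whenever eh is true the first operand is already false, so the extra
disjunct can never change the result.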

>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64.c (aarch64_expand_epilogue):
>   * config/aarch64/aarch64.md:
>
> Signed-off-by: Dan Li 

Thanks, applied with the changelog:

* config/aarch64/aarch64.c (aarch64_expand_epilogue): Remove
redundant check for calls_eh_return.
* config/aarch64/aarch64.md (*do_return): Likewise.
> ---
>  gcc/config/aarch64/aarch64.c  | 6 +-
>  gcc/config/aarch64/aarch64.md | 3 +--
>  2 files changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 699c105a42a..8448e56443c 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -9076,13 +9076,9 @@ aarch64_expand_epilogue (bool for_sibcall)
>   2) The RETAA instruction is not available before ARMv8.3-A, so if we are
>  generating code for !TARGET_ARMV8_3 we can't use it and must
>  explicitly authenticate.
> -
> - 3) On an eh_return path we make extra stack adjustments to update the
> -canonical frame address to be the exception handler's CFA.  We want
> -to authenticate using the CFA of the function which calls eh_return.
>  */
>if (aarch64_return_address_signing_enabled ()
> -  && (for_sibcall || !TARGET_ARMV8_3 || crtl->calls_eh_return))
> +  && (for_sibcall || !TARGET_ARMV8_3))
>  {
>switch (aarch64_ra_sign_key)
>   {
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 1a39470a1fe..65ee6159d73 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -879,8 +879,7 @@ (define_insn "*do_return"
>{
>  const char *ret = NULL;
>  if (aarch64_return_address_signing_enabled ()
> - && (TARGET_PAUTH)
> - && !crtl->calls_eh_return)
> + && (TARGET_PAUTH))
>{
>   if (aarch64_ra_sign_key == AARCH64_KEY_B)
> ret = "retab";

[PATCH] Add debug counters to back threader.

2021-11-01 Thread Aldy Hernandez via Gcc-patches

Chasing down stage3 miscomparisons is never fun, and having no way to
distinguish between jump threads registered by a particular
pass, is even harder.  This patch adds debug counters for the individual
back threading passes.  I've left the ethread pass alone, as that one is
usually benign, but we could easily add it if needed.

The fact that we can only pass one boolean argument to the passes
infrastructure has us do all sorts of gymnastics to differentiate
between the various back threading passes.
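
Roughly, the selection looks like the sketch below.  This is not the patch
body: ToyBackThreader, TOY_BT_RESOLVE and counter_name are invented for
illustration, and I am assuming that a "resolve" flag is what separates the
full threader from the plain one.  Only the four counter names come from
the ChangeLog.

```cpp
#include <cassert>
#include <string>

// Hypothetical flag values; the real ones live in the threader sources.
constexpr unsigned TOY_BT_NONE = 0;
constexpr unsigned TOY_BT_RESOLVE = 1u << 0;  // assumed to mark the "full" pass

class ToyBackThreader {
public:
  ToyBackThreader(unsigned flags, bool first)
      : flags_(flags), first_(first) {}

  // Name of the debug counter that would gate paths registered by this
  // instance: the flags pick the family, "first" picks the 1/2 suffix.
  std::string counter_name() const {
    std::string name =
        (flags_ & TOY_BT_RESOLVE) ? "back_threadfull" : "back_thread";
    name += first_ ? "1" : "2";
    return name;
  }

private:
  unsigned flags_;
  bool first_;
};

void check_counter_names() {
  assert(ToyBackThreader(TOY_BT_NONE, true).counter_name() == "back_thread1");
  assert(ToyBackThreader(TOY_BT_NONE, false).counter_name() == "back_thread2");
  assert(ToyBackThreader(TOY_BT_RESOLVE, true).counter_name()
         == "back_threadfull1");
  assert(ToyBackThreader(TOY_BT_RESOLVE, false).counter_name()
         == "back_threadfull2");
}
```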

Tested on x86-64 Linux.

OK?

gcc/ChangeLog:

* dbgcnt.def: Add debug counter for back_thread[12] and
back_threadfull[12].
* passes.def: Pass "first" argument to each back threading pass.
* tree-ssa-threadbackward.c (back_threader::back_threader): Add
first argument.
(back_threader::debug_counter): New.
(back_threader::maybe_register_path): Call debug_counter.
---
 gcc/dbgcnt.def|  4 ++
 gcc/passes.def| 10 ++---
 gcc/tree-ssa-threadbackward.c | 71 ---
 3 files changed, 74 insertions(+), 11 deletions(-)

diff --git a/gcc/dbgcnt.def b/gcc/dbgcnt.def
index c2bcc4eef5e..3a85665a1d7 100644
--- a/gcc/dbgcnt.def
+++ b/gcc/dbgcnt.def
@@ -144,6 +144,10 @@ echo ubound: $ub
Please keep the list sorted in alphabetic order.  */
 DEBUG_COUNTER (asan_use_after_scope)
 DEBUG_COUNTER (auto_inc_dec)
+DEBUG_COUNTER (back_thread1)
+DEBUG_COUNTER (back_thread2)
+DEBUG_COUNTER (back_threadfull1)
+DEBUG_COUNTER (back_threadfull2)
 DEBUG_COUNTER (ccp)
 DEBUG_COUNTER (cfg_cleanup)
 DEBUG_COUNTER (cprop)
diff --git a/gcc/passes.def b/gcc/passes.def
index 29921f80ed9..0f541454e7f 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -81,7 +81,7 @@ along with GCC; see the file COPYING3.  If not see
  /* After CCP we rewrite no longer addressed locals into SSA
 form if possible.  */
  NEXT_PASS (pass_forwprop);
-  NEXT_PASS (pass_early_thread_jumps);
+  NEXT_PASS (pass_early_thread_jumps, /*first=*/true);
  NEXT_PASS (pass_sra_early);
  /* pass_build_ealias is a dummy pass that ensures that we
 execute TODO_rebuild_alias at this point.  */
@@ -210,7 +210,7 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_return_slot);
   NEXT_PASS (pass_fre, true /* may_iterate */);
   NEXT_PASS (pass_merge_phi);
-  NEXT_PASS (pass_thread_jumps_full);
+  NEXT_PASS (pass_thread_jumps_full, /*first=*/true);
   NEXT_PASS (pass_vrp, true /* warn_array_bounds_p */);
   NEXT_PASS (pass_dse);
   NEXT_PASS (pass_dce);
@@ -233,7 +233,7 @@ along with GCC; see the file COPYING3.  If not see
 propagations have already run, but before some more dead code
 is removed, and this place fits nicely.  Remember this when
 trying to move or duplicate pass_dominator somewhere earlier.  */
-  NEXT_PASS (pass_thread_jumps);
+  NEXT_PASS (pass_thread_jumps, /*first=*/true);
   NEXT_PASS (pass_dominator, true /* may_peel_loop_headers_p */);
   /* Threading can leave many const/copy propagations in the IL.
 Clean them up.  Failure to do so well can lead to false
@@ -332,10 +332,10 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_fre, false /* may_iterate */);
   /* After late FRE we rewrite no longer addressed locals into SSA
  form if possible.  */
-  NEXT_PASS (pass_thread_jumps);
+  NEXT_PASS (pass_thread_jumps, /*first=*/false);
   NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
   NEXT_PASS (pass_strlen);
-  NEXT_PASS (pass_thread_jumps_full);
+  NEXT_PASS (pass_thread_jumps_full, /*first=*/false);
   NEXT_PASS (pass_vrp, false /* warn_array_bounds_p */);
   /* Run CCP to compute alignment and nonzero bits.  */
   NEXT_PASS (pass_ccp, true /* nonzero_p */);
diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index c66e74d159a..8e4a59744c5 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -44,6 +44,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-cfgcleanup.h"
 #include "tree-pretty-print.h"
 #include "cfghooks.h"
+#include "dbgcnt.h"
 
 // Path registry for the backwards threader.  After all paths have been
 // registered with register_path(), thread_through_all_blocks() is called
@@ -80,12 +81,13 @@ private:
 class back_threader
 {
 public:
-  back_threader (function *fun, unsigned flags);
+  back_threader (function *fun, unsigned flags, bool first);
   ~back_threader ();
   unsigned thread_blocks ();
 private:
   void maybe_thread_block (basic_block bb);
   void find_paths (basic_block bb, tree name);
+  bool debug_counter ();
   edge maybe_register_path ();
   bool find_paths_to_names (basic_block bb, bitmap imports);
   bool resolve_def (tree name, bitmap interesting, vec );
@@ -120,14 +122,19 @@ private:
   //

[PATCH]middle-end testsuite: fix failing complex add testcases PR103000

2021-11-01 Thread Tamar Christina via Gcc-patches

Hi All,

Some targets have overridden the default unroll factor and so do not have enough
data to succeed for SLP vectorization if loop vect is turned off.

To fix this just always unroll in these testcases.

Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
and no issues.

Ok for master?

Thanks,
Tamar

gcc/testsuite/ChangeLog:

PR testsuite/103000
* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-double.c:
Force unroll.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-float.c: Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-float.c:
Likewise.
* gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-half-float.c:
Likewise.

--- inline copy of patch -- 
diff --git a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-add-double.c b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-add-double.c
index 4445119fc9d2c7dafe6abb5f7fb741c7794144a2..23f179a55dcf77c7cfa8f55f748c9973b5e9c646 100644
--- a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-add-double.c
+++ b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-add-double.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target vect_double } */
 /* { dg-add-options arm_v8_3a_complex_neon } */
-/* { dg-additional-options "-fno-tree-loop-vectorize" } */
+/* { dg-additional-options "-fno-tree-loop-vectorize -funroll-loops" } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 
 #define TYPE double
diff --git a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-add-float.c b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-add-float.c
index ff53719d1a895a7161ebcc6fba4903fc3de9095f..cc7715160981274605b4ab21e7db33fdb373e04d 100644
--- a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-add-float.c
+++ b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-add-float.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target vect_float } */
 /* { dg-add-options arm_v8_3a_complex_neon } */
-/* { dg-additional-options "-fno-tree-loop-vectorize" } */
+/* { dg-additional-options "-fno-tree-loop-vectorize -funroll-loops" } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 
 #define TYPE float
diff --git a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-float.c b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-float.c
index 8bc7117565e79a0e93a22d2b28a32e9c5ddfe4d3..fb6a1676fb4b7a766088dcec42a3a2465c3e11f9 100644
--- a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-float.c
+++ b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-float.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target vect_float } */
 /* { dg-add-options arm_v8_3a_complex_neon } */
-/* { dg-additional-options "-fno-tree-loop-vectorize" } */
+/* { dg-additional-options "-fno-tree-loop-vectorize -funroll-loops" } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 
 #define TYPE float
diff --git a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-half-float.c b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-half-float.c
index 80e0f5d5412318d05883813a81dc4a2d9a62f234..4bb106a3d520c6ab2a322cc463f6a7f5c5238f95 100644
--- a/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-half-float.c
+++ b/gcc/testsuite/gcc.dg/vect/complex/fast-math-bb-slp-complex-add-pattern-half-float.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target vect_complex_add_half } */
 /* { dg-add-options arm_v8_3a_fp16_complex_neon } */
-/* { dg-additional-options "-fno-tree-loop-vectorize" } */
+/* { dg-additional-options "-fno-tree-loop-vectorize -funroll-loops" } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 
 #define TYPE _Float16
@@ -8,6 +9,6 @@
 #include "complex-add-pattern-template.c"
 
 /* { dg-final { scan-tree-dump-times "stmt.*COMPLEX_ADD_ROT90" 1 "slp1" { target { vect_complex_add_half } } } } */
-/* { dg-final { scan-tree-dump-times "stmt.*COMPLEX_ADD_ROT270" 1 "slp1" { target { vect_complex_add_byte } && ! target { arm*-*-* } } } } */
+/* { dg-final { scan-tree-dump-times "stmt.*COMPLEX_ADD_ROT270" 1 "slp1" { target { vect_complex_add_half } && ! target { arm*-*-* } } } } */
 /* { dg-final { scan-tree-dump "Found COMPLEX_ADD_ROT270" "slp1" { xfail *-*-* } } } */
 /* { dg-final { scan-tree-dump "Found COMPLEX_ADD_ROT90" "slp1" } } */



[PATCH]middle-end Fix PR103007, add missing check on complex fms detection.

2021-11-01 Thread Tamar Christina via Gcc-patches

Hi All,

The complex FMS detection is missing a check that the nodes of the VEC_PERM
have the number of children we expect before it recurses.

This check is there for MUL and FMA but was missing for FMS.  Because of this the
compiler goes on further than it should and hits an assert.
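
The pattern of the fix, reduced to a toy in C++, is to validate the operand
count before indexing into the children.  This is only a sketch with
invented names (ToyNode, Match, match_fms); it is not the SLP pattern
matcher, and the final "match" test is a placeholder for the real analysis.

```cpp
#include <cassert>
#include <vector>

// Hypothetical tree node; not GCC's slp_tree.
struct ToyNode {
  std::vector<ToyNode*> children;
};

enum class Match { NONE, FMS };

Match match_fms(const ToyNode &left, const ToyNode &right) {
  // Guard first: a node without exactly two operands cannot be the
  // multiply we are looking for, and indexing [1] below would be invalid.
  if (left.children.size() != 2 || right.children.size() != 2)
    return Match::NONE;

  const ToyNode *l1 = left.children[1];
  const ToyNode *r1 = right.children[1];
  // Placeholder for the real matching logic.
  return (l1 && r1) ? Match::FMS : Match::NONE;
}

void check_match_fms() {
  ToyNode leaf_a, leaf_b;
  ToyNode binary{{&leaf_a, &leaf_b}};
  ToyNode unary{{&leaf_a}};
  ToyNode leaf;

  assert(match_fms(binary, binary) == Match::FMS);
  assert(match_fms(unary, binary) == Match::NONE);  // rejected, no crash
  assert(match_fms(binary, leaf) == Match::NONE);   // rejected, no crash
}
```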

Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

PR tree-optimizations/103007
* tree-vect-slp-patterns.c (complex_fms_pattern::matches): Add elem
check.

gcc/testsuite/ChangeLog:

PR tree-optimizations/103007
* g++.dg/pr103007.C: New test.

--- inline copy of patch -- 
diff --git a/gcc/testsuite/g++.dg/pr103007.C b/gcc/testsuite/g++.dg/pr103007.C
new file mode 100644
index ..1631a85080039f29b83c97d2f62c66be9eac109f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr103007.C
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O3" } */
+
+typedef float MushMeshVector[4];
+struct MushMeshQuaternionPair {
+  void VectorRotate(MushMeshVector &);
+  MushMeshVector m_first;
+  MushMeshVector m_second;
+};
+void 
+MushMeshQuaternionPair::
+VectorRotate(MushMeshVector &ioVec)  {
+  ioVec[2] = (2 - m_first[1] + m_first[3] * 0);
+  ioVec[3] = (m_first[3] + m_first[1] - m_first[2] * 0);
+  float c = ioVec[2], d = ioVec[3];
+  ioVec[2] = (0 - d * m_second[1]);
+  ioVec[3] = (2 - c * m_second[1]);
+}
+
diff --git a/gcc/tree-vect-slp-patterns.c b/gcc/tree-vect-slp-patterns.c
index 6b37e9bac6f3f86a51d1a532a4c570a37af76eac..5e64a9bc417ab6b855e8791fd482dba23287f467 100644
--- a/gcc/tree-vect-slp-patterns.c
+++ b/gcc/tree-vect-slp-patterns.c
@@ -1250,13 +1250,17 @@ complex_fms_pattern::matches (complex_operation_t op,
 
   auto childs = SLP_TREE_CHILDREN (nodes[0]);
   auto l0node = SLP_TREE_CHILDREN (childs[0]);
-  auto l1node = SLP_TREE_CHILDREN (childs[1]);
 
   /* Now operand2+4 may lead to another expression.  */
   auto_vec<slp_tree> left_op, right_op;
   left_op.safe_splice (SLP_TREE_CHILDREN (l0node[1]));
   right_op.safe_splice (SLP_TREE_CHILDREN (nodes[1]));
 
+  /* If these nodes don't have any children then they're
+ not ones we're interested in.  */
+  if (left_op.length () != 2 || right_op.length () != 2)
+return IFN_LAST;
+
   bool is_neg = vect_normalize_conj_loc (left_op);
 
   bool conj_first_operand = false;


[PATCH] Move statics to threader pass class.

2021-11-01 Thread Aldy Hernandez via Gcc-patches

This patch moves all the static functions into the pass class, and
cleans up things a little.  The goal is to shuffle things around such
that we can add debug counters that depend on different threading
passes, but it's a clean-up in its own right.
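
A minimal sketch of the shape this gives the class, assuming nothing beyond what the ChangeLog says: a flags word replaces the separate booleans, and helper objects are created in the constructor and released in the destructor.  All names here (pass_state, solver, F_SPEED, F_RESOLVE, live_passes) are hypothetical and only model the idea.

```cpp
#include <cassert>

// Hypothetical flag bits, modelled on BT_NONE / BT_SPEED / BT_RESOLVE.
constexpr unsigned F_NONE = 0;
constexpr unsigned F_SPEED = 1;
constexpr unsigned F_RESOLVE = 2;

// Stand-in for a helper that is expensive to set up.
struct solver
{
  explicit solver (bool resolve) : resolve_p (resolve) {}
  bool resolve_p;
};

// Counts live pass_state objects, to make setup/teardown observable.
static int live_passes = 0;

class pass_state
{
public:
  // Setup that used to live in free functions happens here.
  explicit pass_state (unsigned flags)
    : m_flags (flags),
      m_solver (new solver ((flags & F_RESOLVE) != 0))
  {
    ++live_passes;
  }

  // Teardown mirrors the constructor.
  ~pass_state ()
  {
    delete m_solver;
    --live_passes;
  }

  pass_state (const pass_state &) = delete;
  pass_state &operator= (const pass_state &) = delete;

  bool speed_p () const { return (m_flags & F_SPEED) != 0; }
  bool resolve_p () const { return m_solver->resolve_p; }

private:
  unsigned m_flags;
  solver *m_solver;
};
```

With this shape a caller only needs to construct the object with the right flags; initialization and finalization can no longer get out of step.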

Tested on x86-64 Linux.

OK?

gcc/ChangeLog:

* tree-ssa-threadbackward.c (BT_NONE): New.
(BT_SPEED): New.
(BT_RESOLVE): New.
(back_threader::back_threader): Add flags.
Move loop initialization here.
(back_threader::~back_threader): New.
(back_threader::find_taken_edge_switch): Change solver and ranger
to pointers.
(back_threader::find_taken_edge_cond): Same.
(back_threader::find_paths_to_names): Same.
(back_threader::find_paths): Same.
(back_threader::dump): Same.
(try_thread_blocks): Merge into thread_blocks.
(back_threader::thread_blocks): New.
(do_early_thread_jumps): Merge into thread_blocks.
(do_thread_jumps): Merge into thread_blocks.
(back_threader::thread_through_all_blocks): Remove.
---
 gcc/tree-ssa-threadbackward.c | 116 +-
 1 file changed, 58 insertions(+), 58 deletions(-)

diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index 9979bfdedf4..c66e74d159a 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -69,13 +69,22 @@ private:
   const bool m_speed_p;
 };
 
+// Back threader flags.
+#define BT_NONE 0
+// Generate fast code at the expense of code size.
+#define BT_SPEED 1
+// Resolve unknown SSAs on entry to a threading path.  If set, use the
+// ranger.  If not, assume all ranges on entry to a path are VARYING.
+#define BT_RESOLVE 2
+
 class back_threader
 {
 public:
-  back_threader (bool speed_p, bool resolve);
-  void maybe_thread_block (basic_block bb);
-  bool thread_through_all_blocks (bool may_peel_loop_headers);
+  back_threader (function *fun, unsigned flags);
+  ~back_threader ();
+  unsigned thread_blocks ();
 private:
+  void maybe_thread_block (basic_block bb);
   void find_paths (basic_block bb, tree name);
   edge maybe_register_path ();
   bool find_paths_to_names (basic_block bb, bitmap imports);
@@ -89,8 +98,8 @@ private:
 
   back_threader_registry m_registry;
   back_threader_profitability m_profit;
-  gimple_ranger m_ranger;
-  path_range_query m_solver;
+  gimple_ranger *m_ranger;
+  path_range_query *m_solver;
 
   // Current path being analyzed.
   auto_vec m_path;
@@ -109,19 +118,35 @@ private:
   // Set to TRUE if unknown SSA names along a path should be resolved
   // with the ranger.  Otherwise, unknown SSA names are assumed to be
   // VARYING.  Setting to true is more precise but slower.
-  bool m_resolve;
+  function *m_fun;
+  unsigned m_flags;
 };
 
 // Used to differentiate unreachable edges, so we may stop the search
 // in a the given direction.
 const edge back_threader::UNREACHABLE_EDGE = (edge) -1;
 
-back_threader::back_threader (bool speed_p, bool resolve)
-  : m_profit (speed_p),
-m_solver (m_ranger, resolve)
+back_threader::back_threader (function *fun, unsigned flags)
+  : m_profit (flags & BT_SPEED)
 {
+  if (flags & BT_SPEED)
+loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES);
+  else
+loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
+
+  m_fun = fun;
+  m_flags = flags;
+  m_ranger = new gimple_ranger;
+  m_solver = new path_range_query (*m_ranger, flags & BT_RESOLVE);
   m_last_stmt = NULL;
-  m_resolve = resolve;
+}
+
+back_threader::~back_threader ()
+{
+  delete m_solver;
+  delete m_ranger;
+
+  loop_optimizer_finalize ();
 }
 
 // Register the current path for jump threading if it's profitable to
@@ -186,8 +211,8 @@ back_threader::find_taken_edge_switch (const vec<basic_block> &path,
   tree name = gimple_switch_index (sw);
   int_range_max r;
 
-  m_solver.compute_ranges (path, m_imports);
-  m_solver.range_of_expr (r, name, sw);
+  m_solver->compute_ranges (path, m_imports);
+  m_solver->range_of_expr (r, name, sw);
 
   if (r.undefined_p ())
 return UNREACHABLE_EDGE;
@@ -210,10 +235,10 @@ back_threader::find_taken_edge_cond (const vec<basic_block> &path,
 {
   int_range_max r;
 
-  m_solver.compute_ranges (path, m_imports);
-  m_solver.range_of_stmt (r, cond);
+  m_solver->compute_ranges (path, m_imports);
+  m_solver->range_of_stmt (r, cond);
 
-  if (m_solver.unreachable_path_p ())
+  if (m_solver->unreachable_path_p ())
 return UNREACHABLE_EDGE;
 
   int_range<2> true_range (boolean_true_node, boolean_true_node);
@@ -381,7 +406,7 @@ back_threader::find_paths_to_names (basic_block bb, bitmap interesting)
   // Examine blocks that define or export an interesting SSA,
   // since they may compute a range which resolve this path.
   if ((def_bb == bb
-  || bitmap_bit_p (m_ranger.gori ().exports (bb), i))
+  || bitmap_bit_p (m_ranger->gori ().exports (bb), i))
  && m_path.length () > 1)
{
  if (maybe_register_path

[PATCH 4/4] libgcc: vxcrtstuff.c: add a few undefs

2021-11-01 Thread Rasmus Villemoes

When vxcrtstuff.c was created, the set of #includes was copied from
crtstuff.c. But crtstuff.c also has a bunch of #undefs after the first
#include, because, as the comment says, including auto-host.h when
building objects that are meant for target is technically not
correct.

This manifests when I try to do a Canadian cross, with build=linux,
host=windows and target=vxworks, in that we pick up a

  #define caddr_t char *

from auto-host.h, which then of course creates a problem when we later
include a target header that has

  typedef char * caddr_t;

I assume that the #undefs in crtstuff.c have been added for similar
reasons.

These potentially problematic #defines all seem to be guarded by
#ifndef USED_FOR_TARGET, which tconfig.h defines before including
auto-host.h. So at first, it seems that one could avoid the problem
by simply removing the initial include of auto-host.h. Unfortunately,
we do need some of the things defined in auto-host.h within such an
ifndef USED_FOR_TARGET, namely the define of
HAVE_INITFINI_ARRAY_SUPPORT, which is what later causes
initfini-array.h to define USE_INITFINI_ARRAY. So as the next best
fix, just copy the #undefs from crtstuff.c.
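
To make the clash concrete, here is a small self-contained sketch.  The #define stands in for what auto-host.h provides and the typedef for what a target header provides; neither is copied from the real headers beyond the two lines quoted above, and first_byte is a hypothetical helper.

```cpp
#include <cassert>

// Stand-in for the host configuration header.
#define caddr_t char *

// Without this #undef the typedef below would expand to
// "typedef char * char *;" and fail to compile.
#undef caddr_t

// Stand-in for the target system header.
typedef char *caddr_t;

// Return the first byte addressed by P, to show the typedef is usable.
char
first_byte (caddr_t p)
{
  return p[0];
}
```

Removing the #undef line is enough to reproduce the kind of error described above.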

libgcc/
* config/vxcrtstuff.c: Undefine caddr_t, pid_t, rlim_t,
ssize_t and vfork after including auto-host.h.
---
 libgcc/config/vxcrtstuff.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/libgcc/config/vxcrtstuff.c b/libgcc/config/vxcrtstuff.c
index 652a65364b0..c15e15e54e9 100644
--- a/libgcc/config/vxcrtstuff.c
+++ b/libgcc/config/vxcrtstuff.c
@@ -26,7 +26,15 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 
 #define IN_LIBGCC2
 
+/* FIXME: Including auto-host is incorrect, but until we have
+   identified the set of defines that need to go into auto-target.h,
+   this will have to do.  */
 #include "auto-host.h"
+#undef caddr_t
+#undef pid_t
+#undef rlim_t
+#undef ssize_t
+#undef vfork
 #include "tconfig.h"
 #include "tsystem.h"
 #include "coretypes.h"
-- 
2.31.1

[PATCH 3/4] libgcc: vxcrtstuff.c: make ctor/dtor functions static

2021-11-01 Thread Rasmus Villemoes

When the translation unit itself creates pointers to the ctors/dtors
in a specific section handled by the linker (whether .init_array or
.ctors.*), there's no reason for the functions to have external
linkage. That ends up polluting the symbol table in the running
kernel.

This makes vxcrtstuff.c on par with the generic crtstuff.c which also
defines e.g. frame_dummy and __do_global_dtors_aux static.
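
A minimal sketch of the idea, assuming a GCC-compatible compiler: a function marked with the constructor attribute is still run at startup when it has internal linkage, because the reference to it is emitted from the same translation unit.  The names are hypothetical.

```cpp
#include <cassert>

// Set by the constructor below; constant-initialized, so it is
// already zero before any constructor runs.
static int frame_registered = 0;

// GNU extension: run before main.  "static" keeps the symbol out of
// the external symbol table.
static __attribute__ ((constructor)) void
register_frame_sketch (void)
{
  frame_registered = 1;
}
```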

libgcc/
* config/vxcrtstuff.c: Make constructor and destructor
functions static when possible.
---
 libgcc/config/vxcrtstuff.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/libgcc/config/vxcrtstuff.c b/libgcc/config/vxcrtstuff.c
index c146e1be3be..652a65364b0 100644
--- a/libgcc/config/vxcrtstuff.c
+++ b/libgcc/config/vxcrtstuff.c
@@ -58,14 +58,18 @@ __attribute__((section(__LIBGCC_EH_FRAME_SECTION_NAME__), aligned(4)))
 
 #define EH_CTOR_NAME _crtbe_register_frame
 #define EH_DTOR_NAME _ctrbe_deregister_frame
+#define EH_LINKAGE static
 
 #else
 
 /* No specific sections for constructors or destructors: we thus use a
symbol naming convention so that the constructors are then recognized
-   by munch or whatever tool is used for the final link phase.  */
+   by munch or whatever tool is used for the final link phase.  Since the
+   pointers to the constructor/destructor functions are not created in this
+   translation unit, they must have external linkage.  */
 #define EH_CTOR_NAME _GLOBAL__I_00101_0__crtbe_register_frame
 #define EH_DTOR_NAME _GLOBAL__D_00101_1__crtbe_deregister_frame
+#define EH_LINKAGE
 
 #endif
 
@@ -88,13 +92,13 @@ __attribute__((section(__LIBGCC_EH_FRAME_SECTION_NAME__), aligned(4)))
 
 #endif /* USE_INITFINI_ARRAY  */
 
-EH_CTOR_ATTRIBUTE void EH_CTOR_NAME (void)
+EH_LINKAGE EH_CTOR_ATTRIBUTE void EH_CTOR_NAME (void)
 {
   static struct object object;
   __register_frame_info (__EH_FRAME_BEGIN__, &object);
 }
 
-EH_DTOR_ATTRIBUTE void EH_DTOR_NAME (void)
+EH_LINKAGE EH_DTOR_ATTRIBUTE void EH_DTOR_NAME (void)
 {
   __deregister_frame_info (__EH_FRAME_BEGIN__);
 }
-- 
2.31.1

[PATCH 2/4] libgcc: vxcrtstuff.c: remove ctor/dtor declarations

2021-11-01 Thread Rasmus Villemoes

These declarations prevent the priority given in the
constructor/destructor attributes from taking effect, thus emitting
the function pointers in the ordinary (lowest-priority)
.init_array/.fini_array sections.
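
For reference, a small sketch of what a priority is expected to do when it does take effect, assuming a GCC-compatible compiler on a target that honours constructor priorities: the constructor with the smaller number runs first regardless of source order.  The names and the priorities 101 and 102 are chosen only for illustration (values up to 100 are reserved for the implementation).

```cpp
#include <cassert>

// Records the order in which the constructors ran.
static int run_order[2];
static int n_run = 0;

// Defined first in the source, but has the larger priority number.
static __attribute__ ((constructor (102))) void
later_ctor (void)
{
  run_order[n_run++] = 102;
}

// Defined second, but has the smaller priority number.
static __attribute__ ((constructor (101))) void
earlier_ctor (void)
{
  run_order[n_run++] = 101;
}
```

This does not reproduce the problem with the extra declarations; it only shows the ordering the attributes are meant to give.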

libgcc/
* config/vxcrtstuff.c: Remove constructor/destructor
declarations.
---
 libgcc/config/vxcrtstuff.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/libgcc/config/vxcrtstuff.c b/libgcc/config/vxcrtstuff.c
index 87fadda9ac5..c146e1be3be 100644
--- a/libgcc/config/vxcrtstuff.c
+++ b/libgcc/config/vxcrtstuff.c
@@ -88,9 +88,6 @@ __attribute__((section(__LIBGCC_EH_FRAME_SECTION_NAME__), aligned(4)))
 
 #endif /* USE_INITFINI_ARRAY  */
 
-void EH_CTOR_NAME (void);
-void EH_DTOR_NAME (void);
-
 EH_CTOR_ATTRIBUTE void EH_CTOR_NAME (void)
 {
   static struct object object;
-- 
2.31.1

[PATCH 1/4] libgcc: remove crt{begin, end}.o from powerpc-wrs-vxworks target

2021-11-01 Thread Rasmus Villemoes

Since commit 78e49fb1bc (Introduce vxworks specific crtstuff support),
the generic crtbegin.o/crtend.o have been unnecessary to build. So
remove them from extra_parts.

This is effectively a revert of commit 9a5b8df70 (libgcc: add
crt{begin,end} for powerpc-wrs-vxworks target).

libgcc/
* config.host (powerpc-wrs-vxworks): Do not add crtbegin.o and
crtend.o to extra_parts.
---
 libgcc/config.host | 1 -
 1 file changed, 1 deletion(-)

diff --git a/libgcc/config.host b/libgcc/config.host
index 85de83da766..651e63adb23 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -1243,7 +1243,6 @@ powerpc*-wrs-vxworks7*)
 ;;
 powerpc-wrs-vxworks*)
tmake_file="$tmake_file rs6000/t-ppccomm rs6000/t-savresfgpr t-fdpbit"
-   extra_parts="$extra_parts crtbegin.o crtend.o"
;;
 powerpc-*-lynxos*)
tmake_file="$tmake_file rs6000/t-lynx t-fdpbit"
-- 
2.31.1

[PATCH 0/4] some vxworks crtstuff

2021-11-01 Thread Rasmus Villemoes

From: Rasmus Villemoes 

A few things I hit when trying to upgrade our VxWorks5 toolchain. I
don't think these can break anything for VxWorks 6+, and patch 2
should be an improvement for all in that the current code doesn't get
compiled as it was clearly intended - though the real bug is likely in
gcc itself, it's easier to work around here by just removing the
declarations.

OK for master?


Rasmus Villemoes (4):
  libgcc: remove crt{begin,end}.o from powerpc-wrs-vxworks target
  libgcc: vxcrtstuff.c: remove ctor/dtor declarations
  libgcc: vxcrtstuff.c: make ctor/dtor functions static
  libgcc: vxcrtstuff.c: add a few undefs

 libgcc/config.host |  1 -
 libgcc/config/vxcrtstuff.c | 21 +++--
 2 files changed, 15 insertions(+), 7 deletions(-)

-- 
2.31.1

Re: [PATCH Take #2] x86_64: Expand ashrv1ti (and PR target/102986)

2021-11-01 Thread Uros Bizjak via Gcc-patches

On Mon, Nov 1, 2021 at 9:43 AM Jakub Jelinek  wrote:
>
> On Mon, Nov 01, 2021 at 08:27:12AM +0100, Uros Bizjak wrote:
> > > Also, I wonder for all these patterns (previously and now added),
> > > shouldn't they have && TARGET_64BIT in conditions?  I mean, we don't
> > > really support scalar TImode for ia32, but VALID_SSE_REG_MODE includes
> > > V1TImode and while the constant shifts can be done, I think the
> > > variable shifts can't, there are no TImode shift patterns...
> >
> > - (match_operand:SI 2 "const_int_operand")))]
> > -  "TARGET_SSE2"
> > + (match_operand:QI 2 "general_operand")))]
> > +  "TARGET_SSE2 && TARGET_64BIT"
> >
> > I wonder if this change is too restrictive, as it disables V1TI shifts
> > by constant on 32bit targets. Perhaps we can introduce a conditional
> > predicate, like:
> >
> > (define_predicate "shiftv1ti_input_operand"
> >   (if_then_else (match_test "TARGET_64BIT")
> > (match_operand 0 "general_operand")
> > (match_operand 0 "const_int_operand")))
> >
> > However, I'm not familiar with how the middle-end behaves with the
> > above approach - will it try to put the constant in a register under
> > some circumstances and consequently fail the expansion?
>
> That would run again into the assertions that shift expanders must never
> fail.
> The question is if a V1TImode shift can ever appear in 32-bit x86, because
> typedef __int128 V __attribute__((vector_size (16)));
> is rejected with
> error: ‘__int128’ is not supported on this target
> when -m32 is in use, no matter what ISA flags are used.

We can do:

typedef int __v1ti __attribute__((mode (V1TI)));

__v1ti foo (__v1ti a)
{
  return a << 11;
}

gcc -O2 -msse2 -m32:

v1ti.c:1:1: warning: specifying vector types with ‘__attribute__
((mode))’ is deprecated [-Wattributes]
   1 | typedef int __v1ti __attribute__((mode (V1TI)));
 | ^~~
v1ti.c:1:1: note: use ‘__attribute__ ((vector_size))’ instead
during RTL pass: expand
v1ti.c: In function ‘foo’:
v1ti.c:5:12: internal compiler error: in expand_shift_1, at expmed.c:2668
   5 |   return a << 11;
 |  ~~^

which looks like an oversight of some kind, since TI (and V2TI) mode
errors out with:

v1ti.c:1:1: error: unable to emulate ‘TI’

and

v1ti.c:1:1: error: unable to emulate ‘V2TI’

I will submit a PR with the above issue.

But I agree, V1TI is x86_64 specific, so the added insn constraint is OK.

Thanks,
Uros.

> Jakub
>

Re: [PATCH]AArch64 Make use of FADDP in simple reductions.

2021-11-01 Thread Richard Sandiford via Gcc-patches

Tamar Christina  writes:
>> -Original Message-
>> From: Richard Sandiford 
>> Sent: Friday, October 8, 2021 5:24 PM
>> To: Tamar Christina 
>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>> ; Marcus Shawcroft
>> ; Kyrylo Tkachov 
>> Subject: Re: [PATCH]AArch64 Make use of FADDP in simple reductions.
>> 
>> Tamar Christina  writes:
>> > Hi All,
>> >
>> > This is a respin of an older patch which never got upstream reviewed
>> > by a maintainer.  It's been updated to fit the current GCC codegen.
>> >
>> > This patch adds a pattern to support the (F)ADDP (scalar) instruction.
>> >
>> > Before the patch, the C code
>> >
>> > typedef float v4sf __attribute__((vector_size (16)));
>> >
>> > float
>> > foo1 (v4sf x)
>> > {
>> >   return x[0] + x[1];
>> > }
>> >
>> > generated:
>> >
>> > foo1:
>> >         dup     s1, v0.s[1]
>> >         fadd    s0, s1, s0
>> >         ret
>> >
>> > After patch:
>> > foo1:
>> >         faddp   s0, v0.2s
>> >         ret
>> >
>> > The double case is now handled by SLP but the remaining cases still
>> > need help from combine.  I have kept the integer and floating point
>> > separate because the integer one only supports V2DI and sharing it
>> > with the float would have required definition of a few new iterators
>> > for just a single use.
>> >
>> > I provide support for when both elements are subregs as a different
>> > pattern as there's no way to tell reload that the two registers must
>> > be equal with just constraints.
>> >
>> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>> >
>> > Ok for master?
>> >
>> > Thanks,
>> > Tamar
>> >
>> > gcc/ChangeLog:
>> >
>> >* config/aarch64/aarch64-simd.md (*aarch64_faddp_scalar,
>> >*aarch64_addp_scalarv2di, *aarch64_faddp_scalar2,
>> >*aarch64_addp_scalar2v2di): New.
>> >
>> > gcc/testsuite/ChangeLog:
>> >
>> >* gcc.target/aarch64/simd/scalar_faddp.c: New test.
>> >* gcc.target/aarch64/simd/scalar_faddp2.c: New test.
>> >* gcc.target/aarch64/simd/scalar_addp.c: New test.
>> >
>> > Co-authored-by: Tamar Christina 
>> >
>> > --- inline copy of patch --
>> > diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
>> > index 6814dae079c9ff40aaa2bb625432bf9eb8906b73..b49f8b79b11cbb1888c503d9a9384424f44bde05 100644
>> > --- a/gcc/config/aarch64/aarch64-simd.md
>> > +++ b/gcc/config/aarch64/aarch64-simd.md
>> > @@ -3414,6 +3414,70 @@ (define_insn "aarch64_faddp"
>> >[(set_attr "type" "neon_fp_reduc_add_")]
>> >  )
>> >
>> > +;; For the case where both operands are a subreg we need to use a
>> > +;; match_dup since reload cannot enforce that the registers are
>> > +;; the same with a constraint in this case.
>> > +(define_insn "*aarch64_faddp_scalar2"
>> > +  [(set (match_operand: 0 "register_operand" "=w")
>> > +  (plus:
>> > +(vec_select:
>> > +  (match_operator: 1 "subreg_lowpart_operator"
>> > +[(match_operand:VHSDF 2 "register_operand" "w")])
>> > +  (parallel [(match_operand 3 "const_int_operand" "n")]))
>> > +(match_dup: 2)))]
>> > +  "TARGET_SIMD
>> > +   && ENDIAN_LANE_N (, INTVAL (operands[3])) == 1"
>> > +  "faddp\t%0, %2.2"
>> > +  [(set_attr "type" "neon_fp_reduc_add_")]
>> > +)
>> 
>> The difficulty with using match_dup here is that the first vec_select operand
>> ought to fold to a REG after reload, rather than stay as a subreg.  From that
>> POV we're forcing the generation of non-canonical rtl.
>> 
>> Also…
>> 
>> > +(define_insn "*aarch64_faddp_scalar"
>> > +  [(set (match_operand: 0 "register_operand" "=w")
>> > +  (plus:
>> > +(vec_select:
>> > +  (match_operand:VHSDF 1 "register_operand" "w")
>> > +  (parallel [(match_operand 2 "const_int_operand" "n")]))
>> > +(match_operand: 3 "register_operand" "1")))]
>> > +  "TARGET_SIMD
>> > +   && ENDIAN_LANE_N (, INTVAL (operands[2])) == 1
>> > +   && SUBREG_P (operands[3]) && !SUBREG_P (operands[1])
>> > +   && subreg_lowpart_p (operands[3])"
>> > +  "faddp\t%0, %1.2"
>> > +  [(set_attr "type" "neon_fp_reduc_add_")]
>> > +)

Also:

It'd probably be better to use V2F for the iterator, since it excludes
V4 and V8 modes.

I think we can use vect_par_cnst_hi_half for operand 2.

>> …matching constraints don't work reliably between two inputs:
>> the RA doesn't know how to combine two different inputs into one input in
>> order to make them match.
>> 
>> Have you tried doing this as a define_peephole2 instead?
>> That fits this kind of situation better (from an rtl representation point of
>> view), but peephole2s are admittedly less powerful than combine.
>> 
>> If peephole2s don't work then I think we'll have to provide a pattern that
>> accepts two distinct inputs and then split the instruction if the inputs
>> aren't in the same register.  That sounds a bit ugly though, so it'd be good
>> news if the peephole thing works out.
>
> Unfortunately peepholes don't work very well here because e.g. addp can be
> assigned by the regalloc to the integer

RE: [PATCH 1/2] middle-end Teach CSE to be able to do vector extracts.

2021-11-01 Thread Tamar Christina via Gcc-patches

The mailing list got dropped somewhere along the way; archiving the OK.

> -Original Message-
> From: Richard Sandiford 
> Sent: Friday, October 29, 2021 4:52 PM
> To: Tamar Christina 
> Cc: jeffreya...@gmail.com; rguent...@suse.de; nd 
> Subject: Re: [PATCH 1/2] middle-end Teach CSE to be able to do vector
> extracts.
> 
> Sorry for the slow review.
> 
> Tamar Christina  writes:
> > [this time with patch]
> >
> > Hi all,
> >
> > This is a new version which has the rewrite Richard S requested and
> > also handles when lowpart_subreg fails.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
> > and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > * cse.c (add_to_set): New.
> > (find_sets_in_insn): Register constants in sets.
> > (canonicalize_insn): Use auto_vec instead.
> > (cse_insn): Try materializing using vec_dup.
> > * rtl.h (simplify_context::simplify_gen_vec_select,
> > simplify_gen_vec_select): New.
> > * simplify-rtx.c (simplify_context::simplify_gen_vec_select): New.
> >
> > --- inline copy of patch ---
> >
> > diff --git a/gcc/cse.c b/gcc/cse.c
> > index 4c3988ee430e99cff74c32cdf9b6382505edd415..2c0442484117317e553c92f48fac24a0b55063bd 100644
> > --- a/gcc/cse.c
> > +++ b/gcc/cse.c
> > @@ -44,6 +44,7 @@ along with GCC; see the file COPYING3.  If not see
> > #include "regs.h"
> >  #include "function-abi.h"
> >  #include "rtlanal.h"
> > +#include "expr.h"
> >
> >  /* The basic idea of common subexpression elimination is to go
> >     through the code, keeping a record of expressions that would
> > @@ -4240,13 +4241,23 @@ try_back_substitute_reg (rtx set, rtx_insn *insn)
> >  }
> >  }
> >
> >
> > +
> 
> Seems like excessive whitespace.
> 
> > +/* Add an entry containing RTL X into SETS.  */
> > +static inline void
> > +add_to_set (vec *sets, rtx x)
> > +{
> > +  struct set entry = {};
> > +  entry.rtl = x;
> > +  sets->safe_push (entry);
> > +}
> > +
> >  /* Record all the SETs in this instruction into SETS_PTR,
> >     and return the number of recorded sets.  */
> >  static int
> > -find_sets_in_insn (rtx_insn *insn, struct set **psets)
> > +find_sets_in_insn (rtx_insn *insn, vec *psets)
> >  {
> > -  struct set *sets = *psets;
> > -  int n_sets = 0;
> > +  vec sets = *psets;
> 
> Is this needed?  It looks like you convert all uses to pset (which is good).
> 
> > +
> >rtx x = PATTERN (insn);
> >
> >if (GET_CODE (x) == SET)
> > @@ -4266,8 +4277,25 @@ find_sets_in_insn (rtx_insn *insn, struct set **psets)
> >  someplace else, so it isn't worth cse'ing.  */
> >else if (GET_CODE (SET_SRC (x)) == CALL)
> > ;
> > +  else if (GET_CODE (SET_SRC (x)) == CONST_VECTOR
> > +  && GET_MODE_CLASS (GET_MODE (SET_SRC (x))) != MODE_VECTOR_BOOL)
> > +   {
> > + /* First register the vector itself.  */
> > + add_to_set (psets, x);
> > + rtx src = SET_SRC (x);
> > + /* Go over the constants of the CONST_VECTOR in forward order, to
> > +put them in the same order in the SETS array.  */
> > + for (unsigned i = 0; i < const_vector_encoded_nelts (src) ; i++)
> > +   {
> > + /* These are templates and don't actually get emitted but are
> > +used to tell CSE how to get to a particular constant.  */
> > + rtx y = simplify_gen_vec_select (SET_DEST (x), i);
> > + gcc_assert (y);
> > + add_to_set (psets, gen_rtx_SET (y, CONST_VECTOR_ELT
> > + (src, i)));
> 
> For the record: it looks like everything that uses set::rtl only really cares
> about the SET_DEST & SET_SRC individually, so in principle we could save
> creating some garbage SETs by splitting it into dest and src fields.  I don't
> think that's important enough to be a requirement though.
> 
> > +   }
> > +   }
> >else
> > -   sets[n_sets++].rtl = x;
> > +   add_to_set (psets, x);
> >  }
> >else if (GET_CODE (x) == PARALLEL)
> >  {
> > @@ -4288,12 +4316,12 @@ find_sets_in_insn (rtx_insn *insn, struct set **psets)
> >   else if (GET_CODE (SET_SRC (y)) == CALL)
> > ;
> >   else
> > -   sets[n_sets++].rtl = y;
> > +   add_to_set (psets, y);
> > }
> > }
> >  }
> >
> > -  return n_sets;
> > +  return sets.length ();
> >  }
> >
> >
> >  /* Subroutine of canonicalize_insn.  X is an ASM_OPERANDS in INSN.  */
> > @@ -4341,9 +4369,10 @@ canon_asm_operands (rtx x, rtx_insn *insn)
> >     see canon_reg.  */
> >
> >  static void
> > -canonicalize_insn (rtx_insn *insn, struct set **psets, int n_sets)
> > +canonicalize_insn (rtx_insn *insn, vec *psets)
> >  {
> > -  struct set *sets = *psets;
> > +  vec sets = *psets;
> > +  int n_sets = sets.length ();
> >rtx tem;
> >rtx x = PATTERN (insn);
> >int i;
> > @@ -4502,13 +4531,6 @@ cse_insn (rtx_insn *insn)
>

Re: [PATCH Take #2] x86_64: Expand ashrv1ti (and PR target/102986)

2021-11-01 Thread Jakub Jelinek via Gcc-patches

On Mon, Nov 01, 2021 at 08:27:12AM +0100, Uros Bizjak wrote:
> > Also, I wonder for all these patterns (previously and now added), shouldn't
> > they have && TARGET_64BIT in conditions?  I mean, we don't really support
> > scalar TImode for ia32, but VALID_SSE_REG_MODE includes V1TImode and while
> > the constant shifts can be done, I think the variable shifts can't, there
> > are no TImode shift patterns...
> 
> - (match_operand:SI 2 "const_int_operand")))]
> -  "TARGET_SSE2"
> + (match_operand:QI 2 "general_operand")))]
> +  "TARGET_SSE2 && TARGET_64BIT"
> 
> I wonder if this change is too restrictive, as it disables V1TI shifts
> by constant on 32bit targets. Perhaps we can introduce a conditional
> predicate, like:
> 
> (define_predicate "shiftv1ti_input_operand"
>   (if_then_else (match_test "TARGET_64BIT")
> (match_operand 0 "general_operand")
> (match_operand 0 "const_int_operand")))
> 
> However, I'm not familiar with how the middle-end behaves with the
> above approach - will it try to put the constant in a register under
> some circumstances and consequently fail the expansion?

That would run again into the assertions that shift expanders must never
fail.
The question is if a V1TImode shift can ever appear in 32-bit x86, because
typedef __int128 V __attribute__((vector_size (16)));
is rejected with
error: ‘__int128’ is not supported on this target
when -m32 is in use, no matter what ISA flags are used.

Jakub

[PATCH] [PR103017] aarch64:fix redundant check in aut insn generation

2021-11-01 Thread Dan Li via Gcc-patches

During generation of the AArch64 epilogue (aarch64_expand_epilogue), the
value of crtl->calls_eh_return does not need to be checked again: it has
already been checked in aarch64_return_address_signing_enabled.
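
The reasoning can be sketched outside the compiler.  The code below assumes, as stated above, that the signing predicate already returns false whenever calls_eh_return is set; fn_state and the function names are hypothetical models, not the aarch64.c code.

```cpp
#include <cassert>

// Hypothetical model of the per-function state involved.
struct fn_state
{
  bool signing_requested;
  bool calls_eh_return;
};

// Modelled on the description: never enabled for eh_return functions.
bool
signing_enabled (const fn_state &s)
{
  if (s.calls_eh_return)
    return false;
  return s.signing_requested;
}

// Condition before the patch.
bool
old_condition (const fn_state &s, bool for_sibcall, bool armv8_3)
{
  return signing_enabled (s)
	 && (for_sibcall || !armv8_3 || s.calls_eh_return);
}

// Condition after the patch.
bool
new_condition (const fn_state &s, bool for_sibcall, bool armv8_3)
{
  return signing_enabled (s) && (for_sibcall || !armv8_3);
}

// Compare both conditions over all 16 input combinations.
bool
conditions_agree ()
{
  for (int bits = 0; bits < 16; ++bits)
    {
      fn_state s = { (bits & 1) != 0, (bits & 2) != 0 };
      bool for_sibcall = (bits & 4) != 0;
      bool armv8_3 = (bits & 8) != 0;
      if (old_condition (s, for_sibcall, armv8_3)
	  != new_condition (s, for_sibcall, armv8_3))
	return false;
    }
  return true;
}
```

Under that assumption the extra disjunct can only be true when the first conjunct is already false, so dropping it does not change the result.  The same argument applies to the "!crtl->calls_eh_return" conjunct in the .md hunk.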

gcc/ChangeLog:

* config/aarch64/aarch64.c (aarch64_expand_epilogue):
* config/aarch64/aarch64.md:

Signed-off-by: Dan Li 
---
 gcc/config/aarch64/aarch64.c  | 6 +-
 gcc/config/aarch64/aarch64.md | 3 +--
 2 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 699c105a42a..8448e56443c 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9076,13 +9076,9 @@ aarch64_expand_epilogue (bool for_sibcall)
2) The RETAA instruction is not available before ARMv8.3-A, so if we are
   generating code for !TARGET_ARMV8_3 we can't use it and must
   explicitly authenticate.
-
-   3) On an eh_return path we make extra stack adjustments to update the
-  canonical frame address to be the exception handler's CFA.  We want
-  to authenticate using the CFA of the function which calls eh_return.
 */
   if (aarch64_return_address_signing_enabled ()
-  && (for_sibcall || !TARGET_ARMV8_3 || crtl->calls_eh_return))
+  && (for_sibcall || !TARGET_ARMV8_3))
 {
   switch (aarch64_ra_sign_key)
{
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 1a39470a1fe..65ee6159d73 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -879,8 +879,7 @@ (define_insn "*do_return"
   {
 const char *ret = NULL;
 if (aarch64_return_address_signing_enabled ()
-   && (TARGET_PAUTH)
-   && !crtl->calls_eh_return)
+   && (TARGET_PAUTH))
   {
if (aarch64_ra_sign_key == AARCH64_KEY_B)
  ret = "retab";
-- 
2.17.1

Add static_chain support to ipa-modref

2021-11-01 Thread Jan Hubicka via Gcc-patches

Hi,
this patch teaches ipa-modref about the static chain, which is, like
retslot, a hidden argument.  The patch is pretty much symmetric to what
was done for retslot handling, and I verified that it does the intended job
for Ada LTO bootstrap.
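
Roughly, the new query follows the same pattern as the retslot one: start from no flags and only add what a known summary of the callee provides.  A stand-alone sketch of that pattern, with a hypothetical summary table in place of the real modref summaries:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical per-function summary; only the field of interest.
struct summary
{
  int static_chain_flags;
};

using summary_table = std::map<std::string, summary>;

// Return the flags known for CALLEE's static chain, or 0 (nothing
// known, the conservative answer) when there is no summary.
int
static_chain_flags_for (const summary_table &table,
			const std::string &callee)
{
  int flags = 0;
  auto it = table.find (callee);
  if (it != table.end ())
    flags |= it->second.static_chain_flags;
  return flags;
}
```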

Bootstrapped/regtested x86_64-linux, OK?

Honza

gcc/ChangeLog:

* gimple.c (gimple_call_static_chain_flags): New function.
* gimple.h (gimple_call_static_chain_flags): Declare.
* ipa-modref.c (modref_summary::modref_summary): Initialize
static_chain_flags.
(modref_summary_lto::modref_summary_lto): Likewise.
(modref_summary::useful_p): Test static_chain_flags.
(modref_summary_lto::useful_p): Likewise.
(struct modref_summary_lto): Add static_chain_flags.
(modref_summary::dump): Dump static_chain_flags.
(modref_summary_lto::dump): Likewise.
(struct escape_point): Add static_chain_arg.
(analyze_ssa_name_flags): Use gimple_call_static_chain_flags.
(analyze_parms): Handle static chains.
(modref_summaries::duplicate): Duplicate static_chain_flags.
(modref_summaries_lto::duplicate): Likewise.
(modref_write): Stream static_chain_flags.
(read_section): Likewise.
(modref_merge_call_site_flags): Handle static_chain_flags.
* ipa-modref.h (struct modref_summary): Add static_chain_flags.
* tree-ssa-structalias.c (handle_rhs_call): Use
gimple_call_static_chain_flags.

gcc/testsuite/ChangeLog:

* gcc.dg/ipa/modref-3.c: New test.

diff --git a/gcc/gimple.c b/gcc/gimple.c
index 22dd6417d19..ef07d9385c5 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -1647,6 +1647,33 @@ gimple_call_retslot_flags (const gcall *stmt)
   return flags;
 }
 
+/* Detects argument flags for static chain on call STMT.  */
+
+int
+gimple_call_static_chain_flags (const gcall *stmt)
+{
+  int flags = 0;
+
+  tree callee = gimple_call_fndecl (stmt);
+  if (callee)
+{
+  cgraph_node *node = cgraph_node::get (callee);
+  modref_summary *summary = node ? get_modref_function_summary (node)
+   : NULL;
+
+  if (summary)
+   {
+ int modref_flags = summary->static_chain_flags;
+
+ /* We have possibly optimized out load.  Be conservative here.  */
+ gcc_checking_assert (node->binds_to_current_def_p ());
+ if (dbg_cnt (ipa_mod_ref_pta))
+   flags |= modref_flags;
+   }
+}
+  return flags;
+}
+
 /* Detects return flags for the call STMT.  */
 
 int
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 23a124ec769..3cde3cde7fe 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -1590,6 +1590,7 @@ bool gimple_call_same_target_p (const gimple *, const gimple *);
 int gimple_call_flags (const gimple *);
 int gimple_call_arg_flags (const gcall *, unsigned);
 int gimple_call_retslot_flags (const gcall *);
+int gimple_call_static_chain_flags (const gcall *);
 int gimple_call_return_flags (const gcall *);
 bool gimple_call_nonnull_result_p (gcall *);
 tree gimple_call_nonnull_arg (gcall *);
diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index d866d9ed6b3..ae8ed53b396 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -270,7 +270,8 @@ static GTY(()) fast_function_summary 
 /* Summary for a single function which this pass produces.  */
 
 modref_summary::modref_summary ()
-  : loads (NULL), stores (NULL), retslot_flags (0), writes_errno (false)
+  : loads (NULL), stores (NULL), retslot_flags (0), static_chain_flags (0),
+writes_errno (false)
 {
 }
 
@@ -325,6 +326,9 @@ modref_summary::useful_p (int ecf_flags, bool check_flags)
   arg_flags.release ();
   if (check_flags && remove_useless_eaf_flags (retslot_flags, ecf_flags, false))
 return true;
+  if (check_flags
+  && remove_useless_eaf_flags (static_chain_flags, ecf_flags, false))
+return true;
   if (ecf_flags & ECF_CONST)
 return false;
   if (loads && !loads->every_base)
@@ -367,6 +371,7 @@ struct GTY(()) modref_summary_lto
   modref_records_lto *stores;
   auto_vec GTY((skip)) arg_flags;
   eaf_flags_t retslot_flags;
+  eaf_flags_t static_chain_flags;
   bool writes_errno;
 
   modref_summary_lto ();
@@ -378,7 +383,8 @@ struct GTY(()) modref_summary_lto
 /* Summary for a single function which this pass produces.  */
 
 modref_summary_lto::modref_summary_lto ()
-  : loads (NULL), stores (NULL), retslot_flags (0), writes_errno (false)
+  : loads (NULL), stores (NULL), retslot_flags (0), static_chain_flags (0),
+writes_errno (false)
 {
 }
 
@@ -406,6 +412,9 @@ modref_summary_lto::useful_p (int ecf_flags, bool check_flags)
   arg_flags.release ();
   if (check_flags && remove_useless_eaf_flags (retslot_flags, ecf_flags, false))
 return true;
+  if (check_flags
+  && remove_useless_eaf_flags (static_chain_flags, ecf_flags, false))
+return true;
   if (ecf_flags & ECF_CONST)
 return false;
   if (loads && !loads->every_base)
@@ -619,6 +628,11 @@ modref_summary::dump (FILE *out)

RE: [PATCH]AArch64 Make use of FADDP in simple reductions.

2021-11-01 Thread Tamar Christina via Gcc-patches



> -----Original Message-----
> From: Richard Sandiford 
> Sent: Friday, October 8, 2021 5:24 PM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; Marcus Shawcroft
> ; Kyrylo Tkachov 
> Subject: Re: [PATCH]AArch64 Make use of FADDP in simple reductions.
> 
> Tamar Christina  writes:
> > Hi All,
> >
> > This is a respin of an older patch which never got upstream reviewed
> > by a maintainer.  It's been updated to fit the current GCC codegen.
> >
> > This patch adds a pattern to support the (F)ADDP (scalar) instruction.
> >
> > Before the patch, the C code
> >
> > typedef float v4sf __attribute__((vector_size (16)));
> >
> > float
> > foo1 (v4sf x)
> > {
> >   return x[0] + x[1];
> > }
> >
> > generated:
> >
> > foo1:
> > dup s1, v0.s[1]
> > fadd    s0, s1, s0
> > ret
> >
> > After patch:
> > foo1:
> > faddp   s0, v0.2s
> > ret
> >
> > The double case is now handled by SLP but the remaining cases still
> > need help from combine.  I have kept the integer and floating point
> > separate because of the integer one only supports V2DI and sharing it
> > with the float would have required definition of a few new iterators for
> > just a single use.
> >
> > I provide support for when both elements are subregs as a different
> > pattern as there's no way to tell reload that the two registers must
> > be equal with just constraints.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > * config/aarch64/aarch64-simd.md (*aarch64_faddp_scalar,
> > *aarch64_addp_scalarv2di, *aarch64_faddp_scalar2,
> > *aarch64_addp_scalar2v2di): New.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/aarch64/simd/scalar_faddp.c: New test.
> > * gcc.target/aarch64/simd/scalar_faddp2.c: New test.
> > * gcc.target/aarch64/simd/scalar_addp.c: New test.
> >
> > Co-authored-by: Tamar Christina 
> >
> > --- inline copy of patch --
> > diff --git a/gcc/config/aarch64/aarch64-simd.md
> > b/gcc/config/aarch64/aarch64-simd.md
> > index 6814dae079c9ff40aaa2bb625432bf9eb8906b73..b49f8b79b11cbb1888c503d9a9384424f44bde05 100644
> > --- a/gcc/config/aarch64/aarch64-simd.md
> > +++ b/gcc/config/aarch64/aarch64-simd.md
> > @@ -3414,6 +3414,70 @@ (define_insn "aarch64_faddp"
> >[(set_attr "type" "neon_fp_reduc_add_")]
> >  )
> >
> > +;; For the case where both operands are a subreg we need to use a
> > +;; match_dup since reload cannot enforce that the registers are
> > +;; the same with a constraint in this case.
> > +(define_insn "*aarch64_faddp_scalar2"
> > +  [(set (match_operand: 0 "register_operand" "=w")
> > +   (plus:
> > + (vec_select:
> > +   (match_operator: 1 "subreg_lowpart_operator"
> > + [(match_operand:VHSDF 2 "register_operand" "w")])
> > +   (parallel [(match_operand 3 "const_int_operand" "n")]))
> > + (match_dup: 2)))]
> > +  "TARGET_SIMD
> > +   && ENDIAN_LANE_N (, INTVAL (operands[3])) == 1"
> > +  "faddp\t%0, %2.2"
> > +  [(set_attr "type" "neon_fp_reduc_add_")]
> > +)
> 
> The difficulty with using match_dup here is that the first vec_select operand
> ought to fold to a REG after reload, rather than stay as a subreg.  From that
> POV we're forcing the generation of non-canonical rtl.
> 
> Also…
> 
> > +(define_insn "*aarch64_faddp_scalar"
> > +  [(set (match_operand: 0 "register_operand" "=w")
> > +   (plus:
> > + (vec_select:
> > +   (match_operand:VHSDF 1 "register_operand" "w")
> > +   (parallel [(match_operand 2 "const_int_operand" "n")]))
> > + (match_operand: 3 "register_operand" "1")))]
> > +  "TARGET_SIMD
> > +   && ENDIAN_LANE_N (, INTVAL (operands[2])) == 1
> > +   && SUBREG_P (operands[3]) && !SUBREG_P (operands[1])
> > +   && subreg_lowpart_p (operands[3])"
> > +  "faddp\t%0, %1.2"
> > +  [(set_attr "type" "neon_fp_reduc_add_")]
> > +)
> 
> …matching constraints don't work reliably between two inputs:
> the RA doesn't know how to combine two different inputs into one input in
> order to make them match.
> 
> Have you tried doing this as a define_peephole2 instead?
> That fits this kind of situation better (from an rtl representation point of
> view), but peephole2s are admittedly less powerful than combine.
> 
> If peephole2s don't work then I think we'll have to provide a pattern that
> accepts two distinct inputs and then split the instruction if the inputs
> aren't in the same register.  That sounds a bit ugly though, so it'd be good
> news if the peephole thing works out.

Unfortunately peepholes don't work very well here because e.g. addp can be
assigned by the regalloc to the integer side instead of simd, in which case you
can't use the instruction anymore.

The peepholes seem to only detect the simple FP cases.
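
For reference, the source-level shapes under discussion can be sketched as follows.  foo1 is the float example from the original posting; foo2 and the v2di typedef are assumptions modelled on it to show the integer (V2DI) analogue, where the addition may end up in general registers.  Whether a given GCC emits faddp/addp for either function depends on the patterns being discussed in this thread.

```c
/* Illustrative only: foo2 and the typedef v2di are assumptions
   modelled on the float example quoted earlier in the thread.  */
typedef float v4sf __attribute__ ((vector_size (16)));
typedef long long v2di __attribute__ ((vector_size (16)));

/* Sum of the two low float lanes; the candidate for FADDP (scalar).  */
float
foo1 (v4sf x)
{
  return x[0] + x[1];
}

/* Sum of both 64-bit integer lanes; the candidate for ADDP (scalar).  */
long long
foo2 (v2di x)
{
  return x[0] + x[1];
}
```

Compiling both with -O2 and comparing the assembly is a quick way to see which of the two cases a given pattern or peephole actually catches.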

I tried adding something like a post-reload split

+  "&& reload_completed && REGNO (operands[1]) != REGNO (operands[3])"
+  [(clobber (match_scratch: 4 "=w"))
+

Re: [PATCH Take #2] x86_64: Expand ashrv1ti (and PR target/102986)

2021-11-01 Thread Uros Bizjak via Gcc-patches

On Sun, Oct 31, 2021 at 11:02 AM Roger Sayle  wrote:
>
>
> Very many thanks to Jakub for proof-reading my patch, catching my silly
> GNU-style mistakes and making excellent suggestions.  This revised patch
> incorporates all of his feedback, and has been tested on x86_64-pc-linux-gnu
> with make bootstrap and make -k check with no new failures.
>
> 2021-10-31  Roger Sayle  
> Jakub Jelinek  
>
> gcc/ChangeLog
> PR target/102986
> * config/i386/i386-expand.c (ix86_expand_v1ti_to_ti,
> ix86_expand_ti_to_v1ti): New helper functions.
> (ix86_expand_v1ti_shift): Check if the amount operand is an
> integer constant, and expand as a TImode shift if it isn't.
> (ix86_expand_v1ti_rotate): Check if the amount operand is an
> integer constant, and expand as a TImode rotate if it isn't.
> (ix86_expand_v1ti_ashiftrt): New function to expand arithmetic
> right shifts of V1TImode quantities.
> * config/i386/i386-protos.h (ix86_expand_v1ti_ashift): Prototype.
> * config/i386/sse.md (ashlv1ti3, lshrv1ti3): Change constraints
> to QImode general_operand, and let the helper functions lower
> shifts by non-constant operands, as TImode shifts.  Make
> conditional on TARGET_64BIT.
> (ashrv1ti3): New expander calling ix86_expand_v1ti_ashiftrt.
> (rotlv1ti3, rotrv1ti3): Change shift operand to QImode.
> Make conditional on TARGET_64BIT.
>
> gcc/testsuite/ChangeLog
> PR target/102986
> * gcc.target/i386/sse2-v1ti-ashiftrt-1.c: New test case.
> * gcc.target/i386/sse2-v1ti-ashiftrt-2.c: New test case.
> * gcc.target/i386/sse2-v1ti-ashiftrt-3.c: New test case.
> * gcc.target/i386/sse2-v1ti-shift-2.c: New test case.
> * gcc.target/i386/sse2-v1ti-shift-3.c: New test case.
>
> Thanks.
> Roger
> --
>
> -----Original Message-----
> From: Jakub Jelinek 
> Sent: 30 October 2021 11:30
> To: Roger Sayle 
> Cc: 'GCC Patches' ; 'Uros Bizjak'
> 
> Subject: Re: [PATCH] x86_64: Expand ashrv1ti (and PR target/102986)
>
> On Sat, Oct 30, 2021 at 11:16:41AM +0100, Roger Sayle wrote:
> > 2021-10-30  Roger Sayle  
> >
> > gcc/ChangeLog
> >   PR target/102986
> >   * config/i386/i386-expand.c (ix86_expand_v1ti_to_ti,
> >   ix86_expand_ti_to_v1ti): New helper functions.
> >   (ix86_expand_v1ti_shift): Check if the amount operand is an
> >   integer constant, and expand as a TImode shift if it isn't.
> >   (ix86_expand_v1ti_rotate): Check if the amount operand is an
> >   integer constant, and expand as a TImode rotate if it isn't.
> >   (ix86_expand_v1ti_ashiftrt): New function to expand arithmetic
> >   right shifts of V1TImode quantities.
> >   * config/i386/i386-protos.h (ix86_expand_v1ti_ashift): Prototype.
> >   * config/i386/sse.md (ashlv1ti3, lshrv1ti3): Change constraints
> >   to QImode general_operand, and let the helper functions lower
> >   shifts by non-constant operands, as TImode shifts.
> >   (ashrv1ti3): New expander calling ix86_expand_v1ti_ashiftrt.
> >   (rotlv1ti3, rotrv1ti3): Change shift operand to QImode.
> >
> > gcc/testsuite/ChangeLog
> >   PR target/102986
> >   * gcc.target/i386/sse2-v1ti-ashiftrt-1.c: New test case.
> >   * gcc.target/i386/sse2-v1ti-ashiftrt-2.c: New test case.
> >   * gcc.target/i386/sse2-v1ti-ashiftrt-3.c: New test case.
> >   * gcc.target/i386/sse2-v1ti-shift-2.c: New test case.
> >   * gcc.target/i386/sse2-v1ti-shift-3.c: New test case.
> >
> > Sorry again for the breakage in my last patch.   I wasn't testing things
> > that shouldn't have been affected/changed.
>
> Not a review, will defer that to Uros, but just nits:
>
> > +/* Expand move of V1TI mode register X to a new TI mode register.  */
> > +static rtx ix86_expand_v1ti_to_ti (rtx x)
>
> ix86_expand_v1ti_to_ti should be at the start of next line, so
> static rtx
> ix86_expand_v1ti_to_ti (rtx x)
>
> Ditto for other functions and also in functions you've added by the previous
> patch.
> > +  emit_insn (code == ASHIFT ? gen_ashlti3(tmp2, tmp1, operands[2])
> > + : gen_lshrti3(tmp2, tmp1, operands[2]));
>
> Space before ( twice.
>
> > +  emit_insn (code == ROTATE ? gen_rotlti3(tmp2, tmp1, operands[2])
> > + : gen_rotrti3(tmp2, tmp1, operands[2]));
>
> Likewise.
>
> > +  emit_insn (gen_ashrti3(tmp2, tmp1, operands[2]));
>
> Similarly.
>
> Also, I wonder for all these patterns (previously and now added), shouldn't
> they have && TARGET_64BIT in conditions?  I mean, we don't really support
> scalar TImode for ia32, but VALID_SSE_REG_MODE includes V1TImode and while
> the constant shifts can be done, I think the variable shifts can't, there
> are no TImode shift patterns...
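
To make the constant versus variable shift distinction concrete, here is a plain-C sketch of a 128-bit arithmetic right shift written over two 64-bit halves.  The struct ti128 and the function ashr128 are hypothetical names for illustration; this is not the instruction sequence the patch emits.  With a variable amount the code has to choose between the two branches at run time, which is roughly what a scalar TImode shift provides.

```c
#include <stdint.h>

/* Hypothetical 128-bit value split into two 64-bit halves.  */
struct ti128
{
  uint64_t lo;
  uint64_t hi;   /* bit 63 of HI is the sign bit */
};

/* Arithmetic right shift of X by N bits (N is taken modulo 128).  */
struct ti128
ashr128 (struct ti128 x, unsigned int n)
{
  /* All ones if X is negative, all zeros otherwise.  */
  uint64_t sign = (x.hi >> 63) ? ~(uint64_t) 0 : 0;
  struct ti128 r = x;

  n &= 127;
  if (n == 0)
    return r;

  if (n < 64)
    {
      /* Bits move from HI into LO; HI is refilled with sign bits.  */
      r.lo = (x.lo >> n) | (x.hi << (64 - n));
      r.hi = (x.hi >> n) | (sign << (64 - n));
    }
  else
    {
      /* LO is taken entirely from HI; HI becomes pure sign.  */
      unsigned int m = n - 64;
      r.lo = m ? ((x.hi >> m) | (sign << (64 - m))) : x.hi;
      r.hi = sign;
    }
  return r;
}
```

When N is a compile-time constant only one of the two branches survives, which is why constant shifts are easier to expand without TImode support.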

- (match_operand:SI 2 "const_int_operand")))]
-  "TARGET_SSE2"
+ (match_operand:QI 2 "general_operand")))]
+  "TARGET_SSE2 &&

[PATCH, rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set

2021-11-01 Thread HAO CHEN GUI via Gcc-patches

Hi,

  This patch disables gimple folding for VSX_BUILTIN_XVMINDP,
VSX_BUILTIN_XVMAXDP, ALTIVEC_BUILTIN_VMINFP and ALTIVEC_BUILTIN_VMAXFP when
fast-math is not set.  With gimple folding enabled, the four built-ins
are implemented by c-type instructions (xs[min|max]cdp) on P9 and P10 if
they can be converted to scalar comparisons, whereas on P8 and P7 they are
implemented by xv[min|max][s|d]p because P8 and P7 don't have corresponding
scalar comparison instructions.  The patch binds these four built-ins to
xv[min|max][s|d]p when fast-math is not set.  The two new test cases
illustrate this.
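
Why fast-math matters here can be sketched in plain C.  This is only a sketch under the assumption that the comparison-style instructions behave like the C expression a < b ? a : b while xv[min|max][s|d]p prefer the non-NaN operand; the Power ISA has the exact rules.  The function names are illustrative and do not come from the patch.

```c
#include <math.h>   /* NAN */

/* Comparison-style minimum: any NaN makes the comparison false,
   so the second operand is returned.  */
double
min_compare (double a, double b)
{
  return a < b ? a : b;
}

/* NaN-aware minimum: if one operand is a NaN, return the other.  */
double
min_nan_aware (double a, double b)
{
  if (a != a)   /* a is NaN */
    return b;
  if (b != b)   /* b is NaN */
    return a;
  return a < b ? a : b;
}

/* Returns 1 when the two flavours disagree for the inputs (1.0, NaN):
   the comparison form yields NaN, the NaN-aware form yields 1.0.  */
int
flavours_disagree (void)
{
  double c = min_compare (1.0, NAN);
  double m = min_nan_aware (1.0, NAN);
  return (c != c) && m == 1.0;
}
```

Under -ffast-math the compiler is allowed to assume NaNs do not occur, so the two flavours become interchangeable; without it, substituting one for the other can change results.  The sketch itself should be built without -ffast-math, since its NaN checks rely on IEEE comparisons.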

  ALTIVEC_BUILTIN_VMINFP and ALTIVEC_BUILTIN_VMAXFP are not implemented by
vminfp or vmaxfp.

rs6000-builtin.def:BU_ALTIVEC_2 (VMAXFP,  "vmaxfp", CONST, smaxv4sf3)

rs6000-builtin.def:BU_ALTIVEC_2 (VMINFP,  "vminfp", CONST, sminv4sf3)

Bootstrapped and tested on powerpc64le-linux with no regressions. Is this okay 
for trunk? Any recommendations? Thanks a lot.


ChangeLog

2021-11-01 Haochen Gui 

gcc/
    * config/rs6000/rs6000-call.c (rs6000_gimple_fold_builtin): Disable
    gimple fold for VSX_BUILTIN_XVMINDP, ALTIVEC_BUILTIN_VMINFP,
    VSX_BUILTIN_XVMAXDP, ALTIVEC_BUILTIN_VMAXFP when fast-math is not
    set.

gcc/testsuite/
    * gcc.target/powerpc/vec-minmax-1.c: New test.
    * gcc.target/powerpc/vec-minmax-2.c: Likewise.


patch.diff

diff --git a/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c b/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c
new file mode 100644
index 000..e238659c9be
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c
@@ -0,0 +1,52 @@
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power9" } */
+/* { dg-final { scan-assembler-times {\mxvmaxdp\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxvmaxsp\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxvmindp\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxvminsp\M} 1 } } */
+
+/* This test verifies that float or double vec_min/max are bound to
+   xv[min|max][d|s]p instructions when fast-math is not set.  */
+
+
+#include <altivec.h>
+
+#ifdef _BIG_ENDIAN
+   const int PREF_D = 0;
+#else
+   const int PREF_D = 1;
+#endif
+
+double vmaxd (double a, double b)
+{
+  vector double va = vec_promote (a, PREF_D);
+  vector double vb = vec_promote (b, PREF_D);
+  return vec_extract (vec_max (va, vb), PREF_D);
+}
+
+double vmind (double a, double b)
+{
+  vector double va = vec_promote (a, PREF_D);
+  vector double vb = vec_promote (b, PREF_D);
+  return vec_extract (vec_min (va, vb), PREF_D);
+}
+
+#ifdef _BIG_ENDIAN
+   const int PREF_F = 0;
+#else
+   const int PREF_F = 3;
+#endif
+
+float vmaxf (float a, float b)
+{
+  vector float va = vec_promote (a, PREF_F);
+  vector float vb = vec_promote (b, PREF_F);
+  return vec_extract (vec_max (va, vb), PREF_F);
+}
+
+float vminf (float a, float b)
+{
+  vector float va = vec_promote (a, PREF_F);
+  vector float vb = vec_promote (b, PREF_F);
+  return vec_extract (vec_min (va, vb), PREF_F);
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c
new file mode 100644
index 000..149275d8709
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c
@@ -0,0 +1,50 @@
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power9 -ffast-math" } */
+/* { dg-final { scan-assembler-times {\mxsmaxcdp\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mxsmincdp\M} 2 } } */
+
+/* This test verifies that float or double vec_min/max can be converted
+   to scalar comparison when fast-math is set.  */
+
+
+#include <altivec.h>
+
+#ifdef _BIG_ENDIAN
+   const int PREF_D = 0;
+#else
+   const int PREF_D = 1;
+#endif
+
+double vmaxd (double a, double b)
+{
+  vector double va = vec_promote (a, PREF_D);
+  vector double vb = vec_promote (b, PREF_D);
+  return vec_extract (vec_max (va, vb), PREF_D);
+}
+
+double vmind (double a, double b)
+{
+  vector double va = vec_promote (a, PREF_D);
+  vector double vb = vec_promote (b, PREF_D);
+  return vec_extract (vec_min (va, vb), PREF_D);
+}
+
+#ifdef _BIG_ENDIAN
+   const int PREF_F = 0;
+#else
+   const int PREF_F = 3;
+#endif
+
+float vmaxf (float a, float b)
+{
+  vector float va = vec_promote (a, PREF_F);
+  vector float vb = vec_promote (b, PREF_F);
+  return vec_extract (vec_max (va, vb), PREF_F);
+}
+
+float vminf (float a, float b)
+{
+  vector float va = vec_promote (a, PREF_F);
+  vector float vb = vec_promote (b, PREF_F);
+  return vec_extract (vec_min (va, vb), PREF_F);
+}
