[PATCH], PR 104253, Fix __ibm128 conversions on IEEE 128-bit system

2022-01-28 Thread Michael Meissner via Gcc-patches
Use correct names for __ibm128 if long double is IEEE 128-bit.

If you are on a PowerPC system where the default long double is IEEE
128-bit, GCC will use the wrong names for some of the conversion functions
for the __ibm128 type.

What is happening is that when the default long double is IEEE 128-bit, the
various convert, truncation, and extend functions did not specify a
default name for the conversion.  The machine independent portions of the
compiler would construct a call with an 'if' name (IFmode being the mode
for IBM 128-bit floating point).  This patch specifies to use the
traditional 'tf' name for these conversion functions.
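
To make the naming issue concrete, here is a minimal sketch.  It assumes
the default libcall name is formed as "__" + operation + source-mode suffix
+ destination-mode suffix; the helper conv_libfunc_name is hypothetical and
is not GCC's actual code.

```cpp
#include <string>

// Hypothetical helper: builds a conversion libcall name from an operation
// name and two lower-case mode suffixes (e.g. "if" for IFmode, "di" for
// DImode).  It only mimics the naming scheme described above.
std::string conv_libfunc_name (const std::string &op,
                               const std::string &from,
                               const std::string &to)
{
  return "__" + op + from + to;
}
```

Under that assumption, conv_libfunc_name ("fix", "if", "di") gives
"__fixifdi", whereas the patch registers "__fixtfdi" explicitly.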

I have tested this on the following systems with no regressions.  Can I check
this into the trunk?  This will need to be back ported to the GCC 11 branch
also.

*   Power10, little endian, with IBM 128-bit long double default
*   Power10, little endian, with IEEE 128-bit long double default
*   Power9, little endian, with IBM 128-bit long double default
*   Power8, big endian, both 64/32-bit targets.

Note that the original test case uncovered another bug: the built-in handling
for the __builtin_pack_ibm128 and __builtin_unpack_ibm128 built-in functions
needs a tweak.  This patch fixes the more general problem that the wrong
conversion functions between DImode and IFmode were called.

2022-01-28  Michael Meissner  

gcc/
PR target/104253
* config/rs6000/rs6000.cc (init_float128_ibm): Use the TF names
for builtin conversions between __ibm128 and DImode when long
double uses the IEEE 128-bit format.

gcc/testsuite/
PR target/104253
* gcc.target/powerpc/pr104253.c: New test.
---
 gcc/config/rs6000/rs6000.cc |   6 +
 gcc/testsuite/gcc.target/powerpc/pr104253.c | 154 
 2 files changed, 160 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr104253.c

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index f5c089a2a26..3b4c496df56 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3,6 +3,12 @@ init_float128_ibm (machine_mode mode)
   set_conv_libfunc (trunc_optab, DDmode, mode, "__dpd_trunctfdd");
   set_conv_libfunc (sext_optab, TDmode, mode, "__dpd_extendtftd");
 
+  set_conv_libfunc (sfix_optab, DImode, mode, "__fixtfdi");
+  set_conv_libfunc (ufix_optab, DImode, mode, "__fixunstfdi");
+
+  set_conv_libfunc (sfloat_optab, mode, DImode, "__floatditf");
+  set_conv_libfunc (ufloat_optab, mode, DImode, "__floatunditf");
+
   if (TARGET_POWERPC64)
{
  set_conv_libfunc (sfix_optab, TImode, mode, "__fixtfti");
diff --git a/gcc/testsuite/gcc.target/powerpc/pr104253.c 
b/gcc/testsuite/gcc.target/powerpc/pr104253.c
new file mode 100644
index 000..a1e73fe012c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr104253.c
@@ -0,0 +1,154 @@
+/*
+ * Require float128 support because __ibm128 currently is not enabled unless we
+ * also have __float128 support.
+ */
+
+/* { dg-do run } */
+/* { require-effective-target ppc_float128_sw } */
+/* { dg-options "-O2 -mvsx -mfloat128" } */
+
+/*
+ * PR target/104253
+ *
+ * Verify that the various conversions to and from __ibm128 work.  When the
+ *  default for long double is changed to IEEE 128-bit, originally GCC would
+ *  call the functions using an 'if' name instead of 'tf.
+ */
+
+#include 
+
+extern float  ibm128_to_sf  (__ibm128) __attribute__((noinline));
+extern double ibm128_to_df  (__ibm128) __attribute__((noinline));
+extern int ibm128_to_si  (__ibm128) __attribute__((noinline));
+extern long long  ibm128_to_di  (__ibm128) __attribute__((noinline));
+extern unsigned int   ibm128_to_usi (__ibm128) __attribute__((noinline));
+extern unsigned long long ibm128_to_udi (__ibm128) __attribute__((noinline));
+
+extern __ibm128 sf_to_ibm128  (float)  __attribute__((noinline));
+extern __ibm128 df_to_ibm128  (double) __attribute__((noinline));
+extern __ibm128 si_to_ibm128  (int)__attribute__((noinline));
+extern __ibm128 di_to_ibm128  (long long)  __attribute__((noinline));
+extern __ibm128 usi_to_ibm128 (unsigned int)   __attribute__((noinline));
+extern __ibm128 udi_to_ibm128 (unsigned long long) __attribute__((noinline));
+
+float
+ibm128_to_sf  (__ibm128 x)
+{
+  return x;
+}
+
+double
+ibm128_to_df  (__ibm128 x)
+{
+  return x;
+}
+
+int
+ibm128_to_si  (__ibm128 x)
+{
+  return x;
+}
+
+long long
+ibm128_to_di  (__ibm128 x)
+{
+  return x;
+}
+
+unsigned int
+ibm128_to_usi (__ibm128 x)
+{
+  return x;
+}
+
+unsigned long long
+ibm128_to_udi (__ibm128 x)
+{
+  return x;
+}
+
+__ibm128
+sf_to_ibm128  (float x)
+{
+  return x;
+}
+
+__ibm128
+df_to_ibm128  (double x)
+{
+  return x;
+}
+
+__ibm128
+si_to_ibm128  (int x)
+{
+  return x;
+}
+
+__ibm128
+di_to_ibm128  (long long x)
+{
+  return x;
+}
+

[PATCH] c++: Reject union std::initializer_list [PR102434]

2022-01-28 Thread Marek Polacek via Gcc-patches
Weird things are going to happen if you define your std::initializer_list
as a union.  In this case, we crash in output_constructor_regular_field.

Let's not allow such a definition in the first place.
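
As an illustration only, the distinction the new check relies on can be
expressed in user code with standard type traits.  This is a sketch, not
the front-end implementation; the names non_union_class_p, as_struct and
as_union are made up for the example.

```cpp
#include <initializer_list>
#include <type_traits>

// std::is_class is true only for non-union class types, which is the same
// distinction the patch adds for std::initializer_list.
template <typename T>
constexpr bool non_union_class_p ()
{
  return std::is_class<T>::value;
}

// Illustrative types (not taken from the patch).
struct as_struct { const int *ptr; decltype (sizeof 0) n; };
union as_union { const int *ptr; decltype (sizeof 0) n; };
```

A conforming std::initializer_list satisfies the predicate, while a union
with the same members does not.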

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/102434

gcc/cp/ChangeLog:

* class.cc (finish_struct): Don't allow union initializer_list.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist128.C: New test.
---
 gcc/cp/class.cc  |  2 ++
 gcc/testsuite/g++.dg/cpp0x/initlist128.C | 16 
 2 files changed, 18 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/initlist128.C

diff --git a/gcc/cp/class.cc b/gcc/cp/class.cc
index 5db3722ae46..696155714e8 100644
--- a/gcc/cp/class.cc
+++ b/gcc/cp/class.cc
@@ -7766,6 +7766,8 @@ finish_struct (tree t, tree attributes)
ok = true;
}
}
+  /* It also cannot be a union.  */
+  ok &= NON_UNION_CLASS_TYPE_P (t);
   if (!ok)
fatal_error (input_location, "definition of %qD does not match "
 "%<#include %>", TYPE_NAME (t));
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist128.C 
b/gcc/testsuite/g++.dg/cpp0x/initlist128.C
new file mode 100644
index 000..22246860f4e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist128.C
@@ -0,0 +1,16 @@
+// PR c++/102434
+// { dg-do compile { target c++11 } }
+
+using size_t = decltype(sizeof 0);
+
+namespace std {
+  template union initializer_list { // { dg-error "definition of 
.*std::initializer_list.* does not match" }
+const T *ptr;
+size_t n;
+  };
+}
+template
+void Task() {}
+auto b = {  };
+
+// { dg-prune-output "compilation terminated" }

base-commit: f6f2d6cfec1c2fe9570b98211be58329d8d7749b
-- 
2.34.1



[r12-6924 Regression] FAIL: libgomp.c++/pr86291.C execution test on Linux/x86_64

2022-01-28 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

c2b610e7c6c89fd422c5c31f01023bcddf3cf4a5 is the first bad commit
commit c2b610e7c6c89fd422c5c31f01023bcddf3cf4a5
Author: Zhao Wei Liew 
Date:   Fri Jan 28 13:36:39 2022 -0500

match.pd: Simplify 1 / X for integer X [PR95424]

caused

FAIL: gcc.dg/tree-ssa/divide-7.c scan-tree-dump optimized ".. <= 2 ? x_..D. : 
0;"
FAIL: libgomp.c++/pr86291.C execution test
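
For context, a sketch of what the bisected commit appears to do, inferred
only from its title and the scan-tree-dump pattern above: for signed X,
1 / X is nonzero only when X is -1 or 1, so it can be written as a range
check.  The function names are hypothetical.

```cpp
// Reference: plain division; the caller must not pass 0.
constexpr int one_over_ref (int x)
{
  return 1 / x;
}

// Assumed shape of the simplified form for signed x:
// x is in {-1, 0, 1} exactly when (unsigned) x + 1 <= 2.
constexpr int one_over_simplified (int x)
{
  return static_cast<unsigned> (x) + 1u <= 2u ? x : 0;
}
```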

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-6924/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/divide-7.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/divide-7.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/divide-7.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/divide-7.c 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
RUNTESTFLAGS="c++.exp=libgomp.c++/pr86291.C --target_board='unix{-m32}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
RUNTESTFLAGS="c++.exp=libgomp.c++/pr86291.C --target_board='unix{-m32\ 
-march=cascadelake}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
RUNTESTFLAGS="c++.exp=libgomp.c++/pr86291.C --target_board='unix{-m64}'"
$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
RUNTESTFLAGS="c++.exp=libgomp.c++/pr86291.C --target_board='unix{-m64\ 
-march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me 
at skpgkp2 at gmail dot com)


Re: [PATCH 3/8] rs6000: Convert built-in constraints to form

2022-01-28 Thread Segher Boessenkool
On Fri, Jan 28, 2022 at 11:50:21AM -0600, Bill Schmidt wrote:
> When introducing the new built-in support, I tried to match as many
> existing error messages as possible.  One common form was "argument X must
> be a Y-bit unsigned literal".  Another was "argument X must be a literal
> between X' and  Y', inclusive".  During reviews, Segher requested that I
> eventually convert all messages of the first form into the second form for
> consistency.  That's what this patch does, replacing all -form
> constraints (first form) with -form constraints (second form).

Well, I asked for the error messages to be clearer and more consistent
like that.  I don't think changing our source code like this is an
improvement (*we* know what a 5-bit signed number is).  Do you think
after your patch it is clearer and we will make fewer errors?


Segher


[PATCH] AutoFDO: don't set param_early_inliner_max_iterations to 10.

2022-01-28 Thread Eugene Rozenfeld via Gcc-patches
param_early_inliner_max_iterations specifies the maximum number
of nested indirect inlining iterations performed by early inliner.
Normally, the default value is 1.

For AutoFDO this parameter was also used as the number of iterations for
its indirect call promotion loop and the default value was set to 10.
While it makes sense to have 10 in the indirect call promotion loop
(we want to make the IR match the profiled binary before actual annotation)
there is no reason to have a special default value for the
regular early inliner.

This change removes the special AutoFDO default value setting for
param_early_inliner_max_iterations while keeping 10 as the number of
iterations for the AutoFDO indirect call promotion loop.

This change improves a simple fibonacci benchmark in AutoFDO mode
by 15% on x86_64-pc-linux-gnu.

Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:
* auto-profile.cc (auto_profile): Hard-code the number of iterations 
(10).

gcc/ChangeLog:
* opts.cc (common_handle_option): Don't set 
param_early_inliner_max_iterations to 10 for AutoFDO.
---
 gcc/auto-profile.cc | 3 +--
 gcc/opts.cc | 2 --
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc
index 0bfaae7b091..c7cee639c85 100644
--- a/gcc/auto-profile.cc
+++ b/gcc/auto-profile.cc
@@ -1644,8 +1644,7 @@ auto_profile (void)
function before annotation, so the profile inside bar@loc_foo2
will be useful.  */
 autofdo::stmt_set promoted_stmts;
-for (int i = 0; i < opt_for_fn (node->decl,
-   param_early_inliner_max_iterations); i++)
+for (int i = 0; i < 10; i++)
   {
 if (!flag_value_profile_transformations
 || !autofdo::afdo_vpt_for_early_inline (&promoted_stmts))
diff --git a/gcc/opts.cc b/gcc/opts.cc
index 17e1884f0e4..f6f6a8e1709 100644
--- a/gcc/opts.cc
+++ b/gcc/opts.cc
@@ -2899,8 +2899,6 @@ common_handle_option (struct gcc_options *opts,
 case OPT_fauto_profile:
   enable_fdo_optimizations (opts, opts_set, value);
   SET_OPTION_IF_UNSET (opts, opts_set, flag_profile_correction, value);
-  SET_OPTION_IF_UNSET (opts, opts_set,
-  param_early_inliner_max_iterations, 10);
   break;

 case OPT_fprofile_generate_:
--
2.25.1


Re: [PATCH 1/8] rs6000: More factoring of overload processing

2022-01-28 Thread Segher Boessenkool
On Fri, Jan 28, 2022 at 03:19:48PM -0600, Bill Schmidt wrote:
> On 1/28/22 1:11 PM, Segher Boessenkool wrote:
> > On Fri, Jan 28, 2022 at 11:50:19AM -0600, Bill Schmidt wrote:
> >> + and the generic code will issue the appropriate error message.  Skip
> >> + this test for functions where we don't fully describe all the 
> >> possible
> >> + overload signatures in rs6000-overload.def (because they aren't 
> >> relevant
> >> + to the expansion here).  If we don't, we get confusing error 
> >> messages.  */
> >> +  if (fcode != RS6000_OVLD_VEC_PROMOTE
> >> +  && fcode != RS6000_OVLD_VEC_SPLATS
> >> +  && fcode != RS6000_OVLD_VEC_EXTRACT
> >> +  && fcode != RS6000_OVLD_VEC_INSERT
> >> +  && fcode != RS6000_OVLD_VEC_STEP
> >> +  && (!VOID_TYPE_P (TREE_VALUE (fnargs)) || n < nargs))
> >>  return NULL;
> > Can you expand a bit on this, give an example for example?  It is very
> > hard to understand this code, the way it depends on code following many
> > lines later.
> 
> Sure, sorry.
> 
> This check gives up if the number of arguments doesn't match the prototype.
> It gives a fairly generic error message.  That part of it has always been
> in here.
> 
> Now, I moved this check forward relative to the big switch statement on
> fcode, because there are redundant checks for the number of arguments
> in each of the resolve_vec_* helper functions.  This allowed me to simplify
> those a bit.
> 
> Now, it turns out that this doesn't work so well for functions that aren't
> fully described in rs6000-overload.def.  For example, for vec_splats we
> have:
> 
> ; There are no actual builtins for vec_splats.  There is special handling for
> ; this in altivec_resolve_overloaded_builtin in rs6000-c.cc, where the call
> ; is replaced by a constructor.  The single overload here causes
> ; __builtin_vec_splats to be registered with the front end so that can happen.
> [VEC_SPLATS, vec_splats, __builtin_vec_splats]
>   vsi __builtin_vec_splats (vsi);
> ABS_V4SI SPLATS_FAKERY
> 
> So even though __builtin_vec_splats accepts all vector types, the
> infrastructure cheats and just records one prototype.  We end up getting
> an error message that refers to this specific prototype even when we are
> handling a different argument type.  That is completely confusing to the
> user.  So I felt I was starting to get too deep for a simple refactoring
> patch, and gave up on early number-of-arguments checking for the special
> cases that use the _FAKERY technique.
> 
> That's probably still not clear, but maybe clearer?

Much better, thanks!

So put a comment before the code handling the arg checking for
vec_splats etc. saying just that?  Or the much condensed form "these
codes should be handled separately because " :-)  (And the larger
explanation in the commit message -- there you can talk about the old
code / old situation as well :-) )

> >> +default:
> >> +  ;
> > Don't.
> >
> > I like this better than a BS break statement, but it is just as stupid.
> >
> > If you need this, you don't want a switch statement, but some number of
> > if statements.  You cannot use a switch as a shorthand for this because
> > we have a silly warning and -Werror for this use.
> >
> > You probably get easier to understand code that way, too, you can get
> > rid of the above (just do some early returns), etc.
> 
> If I understand correctly, you'd like me to resubmit this in if-then-else
> form.  That's fine, just want to be sure that's what you want.

Yes please.  This is new code, so let's please keep it as readable as
possible.  Since you need to redo some of it anyway as well...


Segher


Re: [PATCH] sh-linux fix target cpu

2022-01-28 Thread Jeff Law via Gcc-patches




On 1/12/2022 2:02 AM, Yoshinori Sato wrote:

sh-linux does not support SH1 or SH2a little-endian.
Add exceptions for them.

gcc/ChangeLog:

* config/sh/t-linux (MULTILIB_EXCEPTIONS): Add m1, mb/m1 and m2a.

Thanks.  Technically this is probably too late to make gcc-12 as we're 
in stage4 (regression fixes only).  But it was posted during stage3 
(general bugfixing) and is very very low risk.


I went ahead and committed it for you.

Thanks, and sorry for the delays.

jeff


Re: [PATCH] tree-optimization/103514 Missing XOR-EQ-AND Optimization

2022-01-28 Thread Jeff Law via Gcc-patches




On 1/5/2022 1:12 PM, Navid Rahimi via Gcc-patches wrote:

Hi GCC community,

This patch will add the missed pattern described in bug 103514 [1] to the 
match.pd. [1] includes proof of correctness for the patch too.

PR tree-optimization/103514
* match.pd (a & b) ^ (a == b) -> !(a | b): New optimization.
* match.pd (a & b) == (a ^ b) -> !(a | b): New optimization.
* gcc.dg/tree-ssa/pr103514.c: Testcase for this optimization.

1) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103514
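
The two identities can be spot-checked with a small sketch.  It assumes the
first pattern is meant for boolean (single-bit) operands, since it does not
hold for wider values such as a == b == 2; the second holds for full-width
integers.  All function names are hypothetical.

```cpp
// First pattern, assumed to apply to boolean operands only.
constexpr bool xor_eq_lhs (bool a, bool b)
{
  return (a & b) ^ (a == b);
}

constexpr bool neither (bool a, bool b)
{
  return !(a | b);
}

// Second pattern on full-width unsigned integers: a & b and a ^ b never
// share a set bit, so they are equal only when both are zero.
constexpr bool and_eq_xor_lhs (unsigned a, unsigned b)
{
  return (a & b) == (a ^ b);
}

constexpr bool neither_u (unsigned a, unsigned b)
{
  return !(a | b);
}
```
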
Note the bug was filed and fixed during stage3, review just didn't happen 
in a reasonable timeframe.


I'm going to ACK this for the trunk and go ahead and commit it for you.

Thanks for your patience,
jeff



Re: [PATCH][V2] Add mold detection for libs.

2022-01-28 Thread Jonathan Wakely via Gcc-patches
On Fri, 28 Jan 2022 at 18:17, Jeff Law wrote:
>
>
>
> On 1/24/2022 4:11 AM, Martin Liška wrote:
> > On 1/21/22 17:54, Jonathan Wakely wrote:
> >> Yes, OK (but please CC the libstdc++ list, not just me).
> >
> > Hello.
> >
> > Sorry for that. Anyway, I would like to install the extended version
> > of the patch
> > that touches all libraries.
> >
> > Ready to be installed?
> It looks to me like Jon ack'd in his original reply.  "Yes, OK ..."

Yes the libstdc++ part is still OK. I can't approve the equivalent
changes for the other libs.


Re: [PATCH][GCC13] Don't force side effects for hardware vector element broadcast

2022-01-28 Thread Maciej W. Rozycki
On Fri, 28 Jan 2022, Richard Biener wrote:

> > that's not what it does.  It treats it like
> >
> >   float tem = f;
> >   return x + { tem, tem, tem, tem };
> >
> > avoiding, like for x + (1.0f + f) creating
> >
> >   return x + { 1.0+f, 1.0+f, 1.0+f ...}
> >
> > it's more CSE than volatile qualifying.

 I see, thanks for taking the time to explain this to me.  I got this confused.

> > Because the IL from the frontends should not depend on target capabilities
> > and whether we have to preserve side-effects properly doesn't depend on
> > the cheapness of the operation itself.  Consider
> >
> >   return x + bar (f);
> >
> > you definitely want bar(f) to be only evaluated once, even when the
> > target can cheaply do the splat.

 Indeed.
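
 A minimal sketch of the single-evaluation point, using std::array as a
stand-in for a hardware vector type (an assumption made so the example does
not depend on vector extensions); bar, bar_calls and add_broadcast are
illustrative names only.

```cpp
#include <array>
#include <cstddef>

// Counts how often the scalar operand is evaluated.
static int bar_calls = 0;

float bar (float f)
{
  ++bar_calls;
  return f * 2.0f;
}

// Evaluate the scalar once into a temporary, then replicate the temporary,
// mirroring the "float tem = f; { tem, tem, tem, tem }" form quoted above.
std::array<float, 4> add_broadcast (const std::array<float, 4> &x, float f)
{
  const float tem = bar (f);
  std::array<float, 4> r{};
  for (std::size_t i = 0; i < r.size (); ++i)
    r[i] = x[i] + tem;
  return r;
}
```

 However cheap the splat itself is, bar is called once per add_broadcast
call in this form.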

> Btw,
> 
> diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
> index efd10332c53..c0f7d98931d 100644
> --- a/gcc/tree-cfg.cc
> +++ b/gcc/tree-cfg.cc
> @@ -4703,6 +4703,12 @@ verify_gimple_assign_single (gassign *stmt)
>   debug_generic_stmt (rhs1);
>   return true;
> }
> +  if (TREE_SIDE_EFFECTS (rhs1) && !gimple_clobber_p (stmt))
> +   {
> + error ("%qs with side-effects", code_name);
> + debug_generic_stmt (rhs1);
> + return true;
> +   }
>return res;
> 
>  case ASSERT_EXPR:
> 
> does not cause ICEs on the two testcases (on trunk).

 Right, so it has turned out I had the wrong binary run under GDB, sigh.  
I have re-verified the current trunk and indeed the side-effect annotation 
has gone:

(gdb) frame
#0  store_constructor (exp=,
target=0x75b388e0, cleared=0, size=..., reverse=false)
at .../gcc/expr.cc:7169
7169if (!TREE_SIDE_EFFECTS (exp)
(gdb) pt exp
 
unit-size 
align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
0x75c41260 precision:32
pointer_to_this >
sizes-gimplified V4SF
size 
unit-size 
align:128 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
0x75c6d740 nunits:4 context >
length:4
val 
visited var 
def_stmt GIMPLE_NOP
version:2>
val 
visited var 
def_stmt GIMPLE_NOP
version:2>
val 
visited var 
def_stmt GIMPLE_NOP
version:2>
val 
visited var 
def_stmt GIMPLE_NOP
version:2>>
(gdb)

and I have tracked down commit 512429a885e8 ("tree-optimization/99863 - 
clear vector CTOR TREE_SIDE_EFFECTS") of yours to be the change required.

 Thank you for your assistance!

  Maciej


Re: [PATCH 2/8] rs6000: Don't #ifdef "short" built-in names

2022-01-28 Thread Bill Schmidt via Gcc-patches


On 1/28/22 2:32 PM, Segher Boessenkool wrote:
> On Fri, Jan 28, 2022 at 11:50:20AM -0600, Bill Schmidt wrote:
>> It was recently pointed out that we get anomalous behavior when using
>> __attribute__((target)) to select a CPU.  As an example, when building for
>> -mcpu=power8 but using __attribute__((target("mcpu=power10")), it is legal
>> to call __builtin_vec_mod, but not vec_mod, even though these are
>> equivalent.  This is because the equivalence is established with a #define
>> that is guarded by #ifdef _ARCH_PWR10.
> Yeah that is bad.
>
>> This goofy behavior occurs with both the old builtins support and the
>> new.  One of the goals of the new builtins support was to make sure all
>> appropriate interfaces are available using __attribute__((target)), so I
>> failed in this respect.  This patch corrects the problem by removing the
>> apply.  For example, #ifdef __PPU__ is still appropriate.
> "By removing the apply"...  What does that mean?

Er, wow.  Meant to say "by removing the #define."  Strange error... will fix.

Thanks for catching that!
Bill

>
> Nice cleanup (and nice bugfix of course).  Okay for trunk (with that
> comment improved a bit perhaps).  Thanks!
>
>
> Segher


Re: [PATCH 1/8] rs6000: More factoring of overload processing

2022-01-28 Thread Bill Schmidt via Gcc-patches


On 1/28/22 1:11 PM, Segher Boessenkool wrote:
> On Fri, Jan 28, 2022 at 11:50:19AM -0600, Bill Schmidt wrote:
>> This patch continues the refactoring started with r12-6014.
> ab3f5b71dc6e
>
>> + and the generic code will issue the appropriate error message.  Skip
>> + this test for functions where we don't fully describe all the possible
>> + overload signatures in rs6000-overload.def (because they aren't 
>> relevant
>> + to the expansion here).  If we don't, we get confusing error messages. 
>>  */
>> +  if (fcode != RS6000_OVLD_VEC_PROMOTE
>> +  && fcode != RS6000_OVLD_VEC_SPLATS
>> +  && fcode != RS6000_OVLD_VEC_EXTRACT
>> +  && fcode != RS6000_OVLD_VEC_INSERT
>> +  && fcode != RS6000_OVLD_VEC_STEP
>> +  && (!VOID_TYPE_P (TREE_VALUE (fnargs)) || n < nargs))
>>  return NULL;
> Can you expand a bit on this, give an example for example?  It is very
> hard to understand this code, the way it depends on code following many
> lines later.

Sure, sorry.

This check gives up if the number of arguments doesn't match the prototype.
It gives a fairly generic error message.  That part of it has always been
in here.

Now, I moved this check forward relative to the big switch statement on
fcode, because there are redundant checks for the number of arguments
in each of the resolve_vec_* helper functions.  This allowed me to simplify
those a bit.

Now, it turns out that this doesn't work so well for functions that aren't
fully described in rs6000-overload.def.  For example, for vec_splats we
have:

; There are no actual builtins for vec_splats.  There is special handling for
; this in altivec_resolve_overloaded_builtin in rs6000-c.cc, where the call
; is replaced by a constructor.  The single overload here causes
; __builtin_vec_splats to be registered with the front end so that can happen.
[VEC_SPLATS, vec_splats, __builtin_vec_splats]
  vsi __builtin_vec_splats (vsi);
ABS_V4SI SPLATS_FAKERY

So even though __builtin_vec_splats accepts all vector types, the
infrastructure cheats and just records one prototype.  We end up getting
an error message that refers to this specific prototype even when we are
handling a different argument type.  That is completely confusing to the
user.  So I felt I was starting to get too deep for a simple refactoring
patch, and gave up on early number-of-arguments checking for the special
cases that use the _FAKERY technique.

That's probably still not clear, but maybe clearer?
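
To sketch the intent in isolation (hypothetical names throughout, not the
code in rs6000-c.cc): the early argument-count check is applied to every
overload except the ones that only record a single placeholder prototype.

```cpp
// Hypothetical overload codes; stand-ins for the RS6000_OVLD_* values.
enum class ovld
{
  vec_promote, vec_splats, vec_extract, vec_insert, vec_step, vec_add
};

// True for the codes excluded from the early check in the condition quoted
// above, i.e. those whose recorded prototype is only a placeholder.
constexpr bool placeholder_prototype_p (ovld code)
{
  return code == ovld::vec_promote
         || code == ovld::vec_splats
         || code == ovld::vec_extract
         || code == ovld::vec_insert
         || code == ovld::vec_step;
}

// Give up early on an argument-count mismatch, except for the placeholder
// cases, where the recorded prototype would produce a misleading message.
constexpr bool reject_early_p (ovld code, int expected_args, int actual_args)
{
  return !placeholder_prototype_p (code) && expected_args != actual_args;
}
```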

>
>> +default:
>> +  ;
> Don't.
>
> I like this better than a BS break statement, but it is just as stupid.
>
> If you need this, you don't want a switch statement, but some number of
> if statements.  You cannot use a switch as a shorthand for this because
> we have a silly warning and -Werror for this use.
>
> You probably get easier to understand code that way, too, you can get
> rid of the above (just do some early returns), etc.

If I understand correctly, you'd like me to resubmit this in if-then-else
form.  That's fine, just want to be sure that's what you want.

Thanks for the review!
Bill

>
>
> Segher


Re: [PATCH v2] preprocessor: -Wbidi-chars and UCNs [PR104030]

2022-01-28 Thread Marek Polacek via Gcc-patches
On Fri, Jan 28, 2022 at 05:12:41PM +0100, Martin Liška wrote:
> On 1/28/22 16:26, Marek Polacek wrote:
> > On Fri, Jan 28, 2022 at 04:08:18PM +0100, Martin Liška wrote:
> > > On 1/28/22 15:59, Marek Polacek wrote:
> > > > On Fri, Jan 28, 2022 at 02:53:16PM +0100, Martin Liška wrote:
> > > > > On 1/24/22 23:36, Marek Polacek via Gcc-patches wrote:
> > > > > > |@@ -7820,6 +7820,10 @@ bidi contexts. @option{-Wbidi-chars=none} 
> > > > > > turns the warning off. @option{-Wbidi-chars=any} warns about any 
> > > > > > use of bidirectional control characters. +By default, this warning 
> > > > > > does not warn about UCNs. It is, however, possible +to turn on such 
> > > > > > checking by using @option{-Wbidi-chars=unpaired,ucn} or 
> > > > > > +@option{-Wbidi-chars=any,ucn}.|
> > > > > 
> > > > > Hello.
> > > > > 
> > > > > Can you please extend the documentation entry and explain what 'ucn' 
> > > > > actually means?
> > > > > 
> > > > > '''
> > > > > There are three levels of warning supported by GCC@.  The default is
> > > > > @option{-Wbidi-chars=unpaired}, which warns about improperly 
> > > > > terminated
> > > > > bidi contexts.  @option{-Wbidi-chars=none} turns the warning off.
> > > > > @option{-Wbidi-chars=any} warns about any use of bidirectional control
> > > > > characters.
> > > > > '''
> > > > > 
> > > > > Right now we have 4 levels and 'ucn' is not defined in the paragraph.
> > > > 
> > > > The following paragraph says
> > > > 
> > > > By default, this warning does not warn about UCNs.  It is, however, 
> > > > possible
> > > > to turn on such checking by using @option{-Wbidi-chars=unpaired,ucn} or
> > > > @option{-Wbidi-chars=any,ucn}.
> > > > 
> > > > Is that not enough?
> > > 
> > > Yeah, makes sense. Do I understand it correctly that one can't use 
> > > -Wbidi-chars=ucn?
> > 
> > You could, it just means use the default (=unpaired) with UCN checking 
> > enabled.
> > Do you want me to make a note about that in the manual?
> 
> Yes, please do so.

Done:

doc: Update -Wbidi-chars documentation

gcc/ChangeLog:

* doc/invoke.texi: Update -Wbidi-chars documentation.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9e588db4fce..cfd415110cd 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -7822,7 +7822,9 @@ characters.

 By default, this warning does not warn about UCNs.  It is, however, possible
 to turn on such checking by using @option{-Wbidi-chars=unpaired,ucn} or
-@option{-Wbidi-chars=any,ucn}.
+@option{-Wbidi-chars=any,ucn}.  Using @option{-Wbidi-chars=ucn} is valid,
+and is equivalent to @option{-Wbidi-chars=unpaired,ucn}, if no previous
+@option{-Wbidi-chars=any} was specified.

 @item -Wbool-compare
 @opindex Wno-bool-compare
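
The documented behavior can be modeled with a small sketch.  This is not
GCC's option-handling code: the types and apply_bidi_arg are hypothetical,
and the model assumes that a keyword which is not mentioned leaves the
corresponding part of the earlier setting untouched.

```cpp
#include <sstream>
#include <string>

enum class bidi_level { none, unpaired, any };

struct bidi_setting
{
  bidi_level level = bidi_level::unpaired;  // documented default
  bool ucn = false;                         // UCN checking is off by default
};

// Hypothetical parser for the argument of one -Wbidi-chars= option.
// Unknown keywords are silently ignored in this sketch.
bidi_setting apply_bidi_arg (bidi_setting s, const std::string &arg)
{
  std::istringstream in (arg);
  std::string word;
  while (std::getline (in, word, ','))
    {
      if (word == "none")
        s.level = bidi_level::none;
      else if (word == "unpaired")
        s.level = bidi_level::unpaired;
      else if (word == "any")
        s.level = bidi_level::any;
      else if (word == "ucn")
        s.ucn = true;
    }
  return s;
}
```

In this model "ucn" on its own keeps the default level (unpaired) and
enables UCN checking, and after an earlier "any" it keeps "any".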


Marek



[PATCH v2] c++: ICE with auto[] and VLA [PR102414]

2022-01-28 Thread Marek Polacek via Gcc-patches
On Thu, Jan 27, 2022 at 10:17:00PM -0500, Jason Merrill wrote:
> On 1/27/22 20:02, Marek Polacek wrote:
> > @@ -11159,8 +11159,16 @@ create_array_type_for_decl (tree name, tree type, 
> > tree size, location_t loc)
> > /* Figure out the index type for the array.  */
> > if (size)
> > -itype = compute_array_index_type_loc (loc, name, size,
> > - tf_warning_or_error);
> > +{
> > +  itype = compute_array_index_type_loc (loc, name, size,
> > +   tf_warning_or_error);
> > +  if (type_uses_auto (type)
> > + && !TREE_CONSTANT (maybe_constant_value (size)))
> 
> Maybe variably_modified_type_p (itype)?

Ah yes, I missed that v_m_t_p actually looks at TYPE_MAX_VALUE...
Fixed.
 
> > +   {
> > + sorry_at (loc, "variable-length array of %");
> > + return error_mark_node;
> > +   }
> > +}
> > return build_cplus_array_type (type, itype);
> >   }
> > diff --git a/gcc/testsuite/g++.dg/cpp23/auto-array3.C 
> > b/gcc/testsuite/g++.dg/cpp23/auto-array3.C
> > new file mode 100644
> > index 000..e383a17d0ee
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/cpp23/auto-array3.C
> > @@ -0,0 +1,17 @@
> > +// PR c++/102414
> > +// { dg-do compile { target c++11 } }
> > +// { dg-options "" }
> > +
> > +constexpr int sz () { return 3; }
> > +
> > +void f ()
> > +{
> > +  int a[3];
> > +  auto (*a1)[0/0] =  // { dg-message "variable-length array of .auto." }
> > +// { dg-warning "division by zero" "" { target *-*-* } .-1 }
> 
> Let's move the error into the other testcase?

Done.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Here we ICE in unify_array_domain when we're trying to deduce the type
of an array, as in

  auto(*p)[i] = (int(*)[i])0;

but unify_array_domain doesn't handle arbitrarily complex bounds.  Another
test is, e.g.,

  auto (*b)[0/0] = 

where the type of the array is

  <<< Unknown tree: template_type_parm >>>[0:(sizetype) ((ssizetype) (0 / 0) - 
1)]

It seems to me that we need not handle these.

PR c++/102414
PR c++/101874

gcc/cp/ChangeLog:

* decl.cc (create_array_type_for_decl): Use template_placeholder_p.
Sorry on a variable-length array of auto.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/auto-array3.C: New test.
* g++.dg/cpp23/auto-array4.C: New test.
---
 gcc/cp/decl.cc   | 14 +++---
 gcc/testsuite/g++.dg/cpp23/auto-array3.C | 16 
 gcc/testsuite/g++.dg/cpp23/auto-array4.C | 14 ++
 3 files changed, 41 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-array3.C
 create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-array4.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 26ce9bfefec..09eed9ceba6 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -11099,7 +11099,7 @@ create_array_type_for_decl (tree name, tree type, tree 
size, location_t loc)
 
   /* [dcl.type.class.deduct] prohibits forming an array of placeholder
  for a deduced class type.  */
-  if (is_auto (type) && CLASS_PLACEHOLDER_TEMPLATE (type))
+  if (template_placeholder_p (type))
 {
   if (name)
error_at (loc, "%qD declared as array of template placeholder "
@@ -11169,8 +11169,16 @@ create_array_type_for_decl (tree name, tree type, tree 
size, location_t loc)
 
   /* Figure out the index type for the array.  */
   if (size)
-itype = compute_array_index_type_loc (loc, name, size,
- tf_warning_or_error);
+{
+  itype = compute_array_index_type_loc (loc, name, size,
+   tf_warning_or_error);
+  if (type_uses_auto (type)
+ && variably_modified_type_p (itype, /*fn=*/NULL_TREE))
+   {
+ sorry_at (loc, "variable-length array of %");
+ return error_mark_node;
+   }
+}
 
   return build_cplus_array_type (type, itype);
 }
diff --git a/gcc/testsuite/g++.dg/cpp23/auto-array3.C 
b/gcc/testsuite/g++.dg/cpp23/auto-array3.C
new file mode 100644
index 000..fdf25470fc2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/auto-array3.C
@@ -0,0 +1,16 @@
+// PR c++/102414
+// PR c++/101874
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+constexpr int sz () { return 3; }
+
+void f ()
+{
+  int a[3];
+  const int N = 3;
+  auto (*a2)[N] = 
+  constexpr int M = 3;
+  auto (*a3)[M] = 
+  auto (*a4)[sz()] = 
+}
diff --git a/gcc/testsuite/g++.dg/cpp23/auto-array4.C 
b/gcc/testsuite/g++.dg/cpp23/auto-array4.C
new file mode 100644
index 000..4385c4a6bf9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/auto-array4.C
@@ -0,0 +1,14 @@
+// PR c++/102414
+// PR c++/101874
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+void
+f (int i)
+{
+  auto x[i] = { 0 }; // { dg-message "variable-length array of .auto." }
+  auto(*p)[i] = (int(*)[i])0; // { dg-message "variable-length array of 
.auto." }
+  int a[3];
+  auto 

Re: [PATCH 2/8] rs6000: Don't #ifdef "short" built-in names

2022-01-28 Thread Segher Boessenkool
On Fri, Jan 28, 2022 at 11:50:20AM -0600, Bill Schmidt wrote:
> It was recently pointed out that we get anomalous behavior when using
> __attribute__((target)) to select a CPU.  As an example, when building for
> -mcpu=power8 but using __attribute__((target("mcpu=power10"))), it is legal
> to call __builtin_vec_mod, but not vec_mod, even though these are
> equivalent.  This is because the equivalence is established with a #define
> that is guarded by #ifdef _ARCH_PWR10.

Yeah that is bad.

> This goofy behavior occurs with both the old builtins support and the
> new.  One of the goals of the new builtins support was to make sure all
> appropriate interfaces are available using __attribute__((target)), so I
> failed in this respect.  This patch corrects the problem by removing the
> apply.  For example, #ifdef __PPU__ is still appropriate.

"By removing the apply"...  What does that mean?

Nice cleanup (and nice bugfix of course).  Okay for trunk (with that
comment improved a bit perhaps).  Thanks!


Segher


Re: [PATCH] Disable -fsplit-stack support on non-glibc targets

2022-01-28 Thread Iain Sandoe



> On 25 Jan 2022, at 15:27, David Edelsohn  wrote:
> 
> This patch broke bootstrap on AIX.  It may have broken Darwin. 

As you anticipated, it did: fixed as below (tested on powerpc-darwin9,
pushed to master).

thanks
Iain

[pushed] Darwin, PPC: Fix bootstrap after GLIBC version changes.

A recent patch added tests for OPTION_GLIBC, which is defined in
linux.h and linux64.h.  This broke bootstrap for powerpc Darwin.
Fixed by defining OPTION_GLIBC to 0.

Signed-off-by: Iain Sandoe 

gcc/ChangeLog:

* config/rs6000/darwin.h (OPTION_GLIBC): Define to 0.
---
 gcc/config/rs6000/darwin.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/config/rs6000/darwin.h b/gcc/config/rs6000/darwin.h
index b5cef42610f..210c60694cd 100644
--- a/gcc/config/rs6000/darwin.h
+++ b/gcc/config/rs6000/darwin.h
@@ -34,6 +34,8 @@
 #endif
 #endif
 
+#define OPTION_GLIBC 0
+
 /* The object file format is Mach-O.  */
 
 #define TARGET_OBJECT_FORMAT OBJECT_MACHO
-- 
2.24.1
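
The reason a plain "define to 0" repairs the build can be sketched as follows.
This is only an illustration under stated assumptions: split_stack_allowed is a
hypothetical stand-in for target-independent code that tests OPTION_GLIBC as an
ordinary expression, and the #ifndef guard plays the role of the definition
added to darwin.h.

```cpp
#include <cassert>

// Stand-in for a target header such as darwin.h: a target that never uses
// glibc defines the macro to 0 so that shared code can still test it.
#ifndef OPTION_GLIBC
#define OPTION_GLIBC 0
#endif

// Hypothetical shared code.  OPTION_GLIBC is used as a C++ expression here,
// not inside #if, so it must expand to something on every target; without
// the definition above this function would fail to compile.
constexpr bool split_stack_allowed(bool requested) {
  return requested && OPTION_GLIBC;
}

// With the macro defined to 0 the feature is simply reported as unavailable.
static_assert(!split_stack_allowed(true), "no glibc, no split stack");

inline void self_check() {
  assert(!split_stack_allowed(false));
}
```

An undefined identifier inside #if silently evaluates to 0, which is why this
kind of breakage tends to show up only when the macro is used in regular code.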





Re: [PATCH 1/8] rs6000: More factoring of overload processing

2022-01-28 Thread Segher Boessenkool
On Fri, Jan 28, 2022 at 11:50:19AM -0600, Bill Schmidt wrote:
> This patch continues the refactoring started with r12-6014.

ab3f5b71dc6e

> + and the generic code will issue the appropriate error message.  Skip
> + this test for functions where we don't fully describe all the possible
> + overload signatures in rs6000-overload.def (because they aren't relevant
> + to the expansion here).  If we don't, we get confusing error messages.  */
> +  if (fcode != RS6000_OVLD_VEC_PROMOTE
> +  && fcode != RS6000_OVLD_VEC_SPLATS
> +  && fcode != RS6000_OVLD_VEC_EXTRACT
> +  && fcode != RS6000_OVLD_VEC_INSERT
> +  && fcode != RS6000_OVLD_VEC_STEP
> +  && (!VOID_TYPE_P (TREE_VALUE (fnargs)) || n < nargs))
>  return NULL;

Can you expand a bit on this, and give an example?  It is very
hard to understand this code, the way it depends on code following many
lines later.

> +default:
> +  ;

Don't.

I like this better than a BS break statement, but it is just as stupid.

If you need this, you don't want a switch statement, but some number of
if statements.  You cannot use a switch as a shorthand for this because
we have a silly warning and -Werror for this use.

You probably get easier to understand code that way, too, you can get
rid of the above (just do some early returns), etc.


Segher
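
A minimal sketch of the restructuring suggested above, not taken from the
patch: the enum Ovld and both functions are hypothetical, and the warning
being silenced by the empty default is presumably -Wswitch.

```cpp
#include <cassert>

// Hypothetical overload codes; the real enumerators live in the rs6000
// back end and are not reproduced here.
enum class Ovld { Promote, Splats, Extract, Other };

// Switch form.  The empty "default: ;" exists only so that enumerators
// without special handling do not trigger a warning.
inline int resolve_with_switch(Ovld code) {
  switch (code) {
    case Ovld::Promote:
      return 1;
    case Ovld::Splats:
      return 2;
    default:
      ;
  }
  return 0;
}

// If form with early returns: no default label is needed, and each
// special case reads on its own.
inline int resolve_with_ifs(Ovld code) {
  if (code == Ovld::Promote)
    return 1;
  if (code == Ovld::Splats)
    return 2;
  return 0;
}

inline void self_check() {
  assert(resolve_with_switch(Ovld::Extract) == resolve_with_ifs(Ovld::Extract));
}
```

Both functions return the same value for every enumerator; the difference is
purely in how the "nothing to do" cases are spelled.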


Re: [PATCH] PR101260 regcprop: Add mode change check for copy reg

2022-01-28 Thread Jeff Law via Gcc-patches




On 1/21/2022 9:52 AM, Andreas Krebbel via Gcc-patches wrote:

When propagating a multi-word register into an access with a smaller
mode the can_change_mode backend hook is already consulted for the
original register.  This however is also required for the intermediate
copy in copy_regno which might use a different register class.

Bootstrapped on x86_64 and s390x. No testsuite regressions.

Ok for mainline?

gcc/ChangeLog:

PR rtl-optimization/101260
* regcprop.cc (maybe_mode_change): Invoke mode_change_ok also for
copy_regno.

I'd been sitting on Stefan's patch because it just didn't feel right, 
but I also knew I'd need to dig further into it to either resolve my 
concerns or be able to express them clearly to Stefan.


I like this patch much better :-)  OK for the trunk.  Bonus points if 
you pick up Stefan's testcase from way back in October.


Jeff


Re: [PATCH v3] match.pd: Simplify 1 / X for integer X [PR95424]

2022-01-28 Thread Jeff Law via Gcc-patches




On 1/19/2022 11:42 AM, Zhao Wei Liew via Gcc-patches wrote:

This patch implements an optimization for the following C++ code:

int f(int x) {
 return 1 / x;
}

int f(unsigned int x) {
 return 1 / x;
}

Before this patch, x86-64 gcc -std=c++20 -O3 produces the following assembly:

f(int):
 xor edx, edx
 mov eax, 1
 idiv edi
 ret
f(unsigned int):
 xor edx, edx
 mov eax, 1
 div edi
 ret

In comparison, clang++ -std=c++20 -O3 produces the following assembly:

f(int):
 lea ecx, [rdi + 1]
 xor eax, eax
 cmp ecx, 3
 cmovb eax, edi
 ret
f(unsigned int):
 xor eax, eax
 cmp edi, 1
 sete al
 ret

Clang's output is more efficient as it avoids expensive div operations.

With this patch, GCC now produces the following assembly:

f(int):
 lea eax, [rdi + 1]
 cmp eax, 2
 mov eax, 0
 cmovbe eax, edi
 ret
f(unsigned int):
 xor eax, eax
 cmp edi, 1
 sete al
 ret

which is virtually identical to Clang's assembly output. Any slight differences
in the output for f(int) are possibly related to a different missed optimization.
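
The branch-free forms in the assembly above can be written back as source.
This is a sketch that mirrors the generated code, not the match.pd pattern
itself; the function names are made up for the illustration.

```cpp
#include <cassert>

// Signed case: 1 / x is x itself for x == -1 or x == 1, and 0 for every
// other nonzero x.  (unsigned) x + 1 <= 2 holds exactly for x in {-1, 0, 1};
// x == 0 is undefined for the division, so any result is acceptable there.
constexpr int one_over_signed(int x) {
  return (static_cast<unsigned>(x) + 1u <= 2u) ? x : 0;
}

// Unsigned case: 1 / x is 1 only when x == 1.
constexpr unsigned one_over_unsigned(unsigned x) {
  return x == 1u ? 1u : 0u;
}

static_assert(one_over_signed(1) == 1 / 1, "x = 1");
static_assert(one_over_signed(-1) == 1 / -1, "x = -1");
static_assert(one_over_signed(7) == 1 / 7, "x = 7");
static_assert(one_over_signed(-7) == 1 / -7, "x = -7");
static_assert(one_over_unsigned(1u) == 1u / 1u, "x = 1u");
static_assert(one_over_unsigned(5u) == 1u / 5u, "x = 5u");

inline void self_check() {
  assert(one_over_signed(2) == 0);
}
```

The signed comparison corresponds to the lea/cmp/cmovbe sequence, and the
unsigned one to cmp/sete.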

v2: https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587751.html
Changes from v2:
1. Refactor from using a switch statement to using the built-in
if-else statement.

v1: https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587634.html
Changes from v1:
1. Refactor common if conditions.
2. Use build_[minus_]one_cst (type) to get -1/1 of the correct type.
3. Match only for TRUNC_DIV_EXPR and TYPE_PRECISION (type) > 1.

gcc/ChangeLog:

* match.pd: Simplify 1 / X where X is an integer.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/divide-6.c: New test.
* gcc.dg/tree-ssa/divide-7.c: New test.

Thanks.  Given the original submission and most of the review work was 
done prior to stage3 closing, I went ahead and installed this on the trunk.

jeff



Re: [PATCH][V2] Add mold detection for libs.

2022-01-28 Thread Jeff Law via Gcc-patches




On 1/24/2022 4:11 AM, Martin Liška wrote:

On 1/21/22 17:54, Jonathan Wakely wrote:

Yes, OK (but please CC the libstdc++ list, not just me).


Hello.

Sorry for that. Anyway, I would like to install the extended version of the
patch that touches all libraries.

Ready to be installed?

It looks to me like Jon ack'd in his original reply.  "Yes, OK ..."

jeff



Re: [PATCH] c++: bogus warning with value init of const pmf [PR92752]

2022-01-28 Thread Jason Merrill via Gcc-patches

On 1/28/22 12:54, Patrick Palka wrote:

Here we're emitting a -Wignored-qualifiers warning for an
intermediate compiler-generated cast of 0 to 'method-type* const'
as part of value initialization of a const pmf.  This patch
suppresses the warning by stripping cv-quals from this pointer
type before performing the cast.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/92752

gcc/cp/ChangeLog:

* typeck.cc (build_ptrmemfunc): Strip cv-quals from the pointer
type after building the pmf type.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wignored-qualifiers2.C: New test.
---
  gcc/cp/typeck.cc|  1 +
  .../g++.dg/warn/Wignored-qualifiers2.C  | 17 +
  2 files changed, 18 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wignored-qualifiers2.C

diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index 11c9d8aff3e..0ce1cd3b6c8 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -9593,6 +9593,7 @@ build_ptrmemfunc (tree type, tree pfn, int force, bool c_cast_p,
/* Handle null pointer to member function conversions.  */
if (null_ptr_cst_p (pfn))
  {
+  type = cv_unqualified (type);
pfn = cp_build_c_cast (input_location, type, pfn, complain);


Maybe we want to cast to TYPE_PTRMEMFUNC_FN_TYPE_RAW (to_type) instead 
of "type" here?  OK with that change.



return build_ptrmemfunc1 (to_type,
integer_zero_node,
diff --git a/gcc/testsuite/g++.dg/warn/Wignored-qualifiers2.C b/gcc/testsuite/g++.dg/warn/Wignored-qualifiers2.C
new file mode 100644
index 000..c4c37545c02
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wignored-qualifiers2.C
@@ -0,0 +1,17 @@
+// PR c++/92752
+// { dg-do compile }
+// { dg-additional-options "-Wignored-qualifiers" }
+
+struct X;
+
+template
+struct Wrap {
+  T data;
+  Wrap() : data() {}
+};
+
+typedef int (X::*type)();
+Wrap x;
+#if __cpp_initializer_lists
+const type t{};
+#endif




Re: [PATCH] Fix multiple_of_p behavior with NOP_EXPR

2022-01-28 Thread Jeff Law via Gcc-patches




On 1/24/2022 7:46 AM, Richard Biener via Gcc-patches wrote:

We were passing down the original type to recursive invocations
of multiple_of_p for say (int)(unsigned * unsigned).

Bootstrap and regtest pending on x86_64-unknown-linux-gnu.

2022-01-24  Richard Biener  

* fold-const.cc (multiple_of_p): Pass the correct type of
the expression to the recursive invocation of multiple_of_p
for conversions and use CASE_CONVERT.

OK by me.  Your call on gcc-12 vs gcc-13 timing.
jeff



Re: [PATCH] Reduce multiple_of_p uses

2022-01-28 Thread Jeff Law via Gcc-patches




On 1/24/2022 7:45 AM, Richard Biener via Gcc-patches wrote:

There are a few cases where we know we're dealing with (poly-)integer
constants, so remove the use of multiple_of_p in those cases to make
the PR100499 fix less impactful.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK?

Thanks,
Richard.

2022-01-24  Richard Biener  

PR tree-optimization/100499
* tree-cfg.cc (verify_gimple_assign_ternary): Use multiple_p
on poly-ints instead of multiple_of_p.
* tree-ssa.cc (maybe_rewrite_mem_ref_base): Likewise.
(non_rewritable_mem_ref_base): Likewise.
(non_rewritable_lvalue_p): Likewise.
(execute_update_addresses_taken): Likewise.

So it's not a full fix for this class of problems, but removes some of 
the cases where we could potentially overflow and give the wrong 
result.  I'd be happier if we had a concrete testcase for these 
instances you're fixing, but I think it's a move in the right direction 
and I obviously trust your judgment on whether or not to install it now 
or wait for gcc-13.


So your call.  Both on whether or not to try and construct testcases to 
trigger these instances and whether or not to install now or wait for 
gcc-13.


jeff



Re: [PATCH] store-merging: Fix up a -fcompare-debug bug in get_status_for_store_merging [PR104263]

2022-01-28 Thread Richard Biener via Gcc-patches



> On 28.01.2022 at 18:36, Jeff Law wrote:
> 
> 
> 
>> On 1/28/2022 9:31 AM, Jakub Jelinek via Gcc-patches wrote:
>> Hi!
>> 
>> As mentioned in the PR, the following testcase fails, because the last
>> stmt of a bb with -g is a debug stmt and get_status_for_store_merging
>> uses gimple_seq_last_stmt (bb_seq (bb)) when testing if it is valid
>> for store merging.  The debug stmt isn't valid, while a stmt at that
>> position with -g0 is valid and so the divergence.
>> 
>> As we walk the whole bb already, this patch just remembers the last
>> non-debug stmt, so that we don't need to skip backwards debug stmts at the
>> end of the bb to find last real stmt.
>> 
>> Bootstrapped/regtested on powerpc64le-linux, ok for trunk

Ok
Thanks,
Richard 
>> 2022-01-28  Jakub Jelinek  
>> 
>>PR tree-optimization/104263
>>* gimple-ssa-store-merging.cc (get_status_for_store_merging): For
>>cfun->can_throw_non_call_exceptions && cfun->eh test whether
>>last non-debug stmt in the bb is store_valid_for_store_merging_p
>>rather than last stmt.
>> 
>>* gcc.dg/pr104263.c: New test.
> OK
> jeff
> 
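
The idea in the quoted patch can be modeled in a few lines.  This is a
simplified sketch, not the GCC code: Stmt and both functions are hypothetical,
and a basic block is reduced to a vector of statements.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, heavily simplified statement: only the two properties
// that matter for the -fcompare-debug divergence are modeled.
struct Stmt {
  bool is_debug;
  bool valid_for_store_merging;
};

// Looks at the literal last statement.  With -g that can be a debug
// statement, so the answer may differ from the -g0 build.
inline bool last_stmt_valid(const std::vector<Stmt>& bb) {
  return !bb.empty() && bb.back().valid_for_store_merging;
}

// Remembers the last non-debug statement while walking the block once,
// so trailing debug statements never influence the result.
inline bool last_nondebug_stmt_valid(const std::vector<Stmt>& bb) {
  const Stmt* last = nullptr;
  for (const Stmt& s : bb)
    if (!s.is_debug)
      last = &s;
  return last != nullptr && last->valid_for_store_merging;
}

inline void self_check() {
  std::vector<Stmt> empty_bb;
  assert(!last_nondebug_stmt_valid(empty_bb));
}
```

For a block that ends in a valid store followed by a debug statement, the
first function answers differently with and without the debug statement,
while the second gives the same answer in both cases.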


[PATCH, rs6000] Clean up Power10 fusion options

2022-01-28 Thread Pat Haugen via Gcc-patches
Mark Power10 fusion option undocumented and remove sub-options.

Bootstrapped and regression tested on powerpc64le(Power10).
Ok for master?

-Pat


2022-01-28  Pat Haugen  

gcc/
* config/rs6000/rs6000.opt (mpower10-fusion): Mark Undocumented.
(mpower10-fusion-ld-cmpi, mpower10-fusion-2logical,
mpower10-fusion-logical-add, mpower10-fusion-add-logical,
mpower10-fusion-2add, mpower10-fusion-2store): Remove.
* config/rs6000/rs6000-cpus.def (ISA_3_1_MASKS_SERVER,
OTHER_P9_VECTOR_MASKS): Remove Power10 fusion sub-options.
* config/rs6000/rs6000.cc (rs6000_option_override_internal,
power10_sched_reorder): Likewise.
* config/rs6000/genfusion.pl (gen_ld_cmpi_p10, gen_logical_addsubf,
gen_addadd): Likewise.
* config/rs6000/fusion.md: Regenerate.
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index c2a77182a9e..b4e69e9fefd 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -491,33 +491,9 @@ Target Mask(P8_VECTOR) Var(rs6000_isa_flags)
 Use vector and scalar instructions added in ISA 2.07.
 
 mpower10-fusion
-Target Mask(P10_FUSION) Var(rs6000_isa_flags)
+Target Undocumented Mask(P10_FUSION) Var(rs6000_isa_flags)
 Fuse certain integer operations together for better performance on power10.
 
-mpower10-fusion-ld-cmpi
-Target Undocumented Mask(P10_FUSION_LD_CMPI) Var(rs6000_isa_flags)
-Fuse certain integer operations together for better performance on power10.
-
-mpower10-fusion-2logical
-Target Undocumented Mask(P10_FUSION_2LOGICAL) Var(rs6000_isa_flags)
-Fuse pairs of scalar or vector logical operations together for better performance on power10.
-
-mpower10-fusion-logical-add
-Target Undocumented Mask(P10_FUSION_LOGADD) Var(rs6000_isa_flags)
-Fuse scalar logical op with add/subf for better performance on power10.
-
-mpower10-fusion-add-logical
-Target Undocumented Mask(P10_FUSION_ADDLOG) Var(rs6000_isa_flags)
-Fuse scalar add/subf with logical op for better performance on power10.
-
-mpower10-fusion-2add
-Target Undocumented Mask(P10_FUSION_2ADD) Var(rs6000_isa_flags)
-Fuse dependent pairs of add or vaddudm instructions for better performance on power10.
-
-mpower10-fusion-2store
-Target Undocumented Mask(P10_FUSION_2STORE) Var(rs6000_isa_flags)
-Fuse certain store operations together for better performance on power10.
-
 mcrypto
 Target Mask(CRYPTO) Var(rs6000_isa_flags)
 Use ISA 2.07 Category:Vector.AES and Category:Vector.SHA2 instructions.
diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index 325b21967c4..963947f6939 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -85,13 +85,7 @@
 #define ISA_3_1_MASKS_SERVER   (ISA_3_0_MASKS_SERVER   \
 | OPTION_MASK_POWER10  \
 | OTHER_POWER10_MASKS  \
-| OPTION_MASK_P10_FUSION   \
-| OPTION_MASK_P10_FUSION_LD_CMPI   \
-| OPTION_MASK_P10_FUSION_2LOGICAL  \
-| OPTION_MASK_P10_FUSION_LOGADD\
-| OPTION_MASK_P10_FUSION_ADDLOG\
-| OPTION_MASK_P10_FUSION_2ADD  \
-| OPTION_MASK_P10_FUSION_2STORE)
+| OPTION_MASK_P10_FUSION)
 
 /* Flags that need to be turned off if -mno-power9-vector.  */
 #define OTHER_P9_VECTOR_MASKS  (OPTION_MASK_FLOAT128_HW\
@@ -139,12 +133,6 @@
 | OPTION_MASK_FPRND\
 | OPTION_MASK_POWER10  \
 | OPTION_MASK_P10_FUSION   \
-| OPTION_MASK_P10_FUSION_LD_CMPI   \
-| OPTION_MASK_P10_FUSION_2LOGICAL  \
-| OPTION_MASK_P10_FUSION_LOGADD\
-| OPTION_MASK_P10_FUSION_ADDLOG\
-| OPTION_MASK_P10_FUSION_2ADD  \
-| OPTION_MASK_P10_FUSION_2STORE\
 | OPTION_MASK_HTM  \
 | OPTION_MASK_ISEL \
 | OPTION_MASK_MFCRF\
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index a5fd36b72d9..548366abada 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -4458,30 +4458,6 @@ rs6000_option_override_internal (bool global_init_p)
   && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION) == 0)
 rs6000_isa_flags |= OPTION_MASK_P10_FUSION;
 
-  if (TARGET_POWER10 &&
-  

[Patch]middle-end: updating the reg use in exit block for -fzero-call-used-regs [PR100775]

2022-01-28 Thread Qing Zhao via Gcc-patches
Hi,

PR 100775 (ICE: in df_exit_block_bitmap_verify, at df-scan.c:4164 with -mthumb
-fzero-call-used-regs=used)
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100775

Although the ICE only happens on arm, this is a bug in the middle end, so I
think it has higher priority: it needs to be included in gcc12 and also
backported to gcc11.

In the pass_zero_call_used_regs, when updating dataflow info after adding
the register zeroing sequence in the epilogue of the function, we should
call "df_update_exit_block_uses" to update the register use information in
the exit block to include all the registers that have been zeroed.

The change has been bootstrapped and regression-tested on both x86 and aarch64
(with -enable-checking=yes,rtl,df).
Since I could not find an arm machine, it has not been bootstrapped or
regression-tested on arm yet.

For the arm failure, I just tested it with a cross build, and it has no issue
with the fix.

(One question here:
Previously, I thought that “df_set_bb_dirty (EXIT_BLOCK_PTR_FOR_FN (cfun))” and
a later “df_analyze()” would rescan the changed exit block of the function and
update all the df info automatically.  That is apparently not the case: the
register use info at the exit block is not automatically updated, so we have to
add an explicit call to “df_update_exit_block_uses”.
I checked pass_thread_prologue_and_epilogue; it looks like it also explicitly
calls “df_update_entry_exit_and_calls” to update the register use info.
Shouldn't “df_set_bb_dirty” + “df_analyze” automatically update the reg use
info of the dirty block?)

Please let me know whether there is any issue with the fix.

Thanks

Qing

===

From e1cca5659c85e7c536f5016a2c75c615e65dba75 Mon Sep 17 00:00:00 2001
From: Qing Zhao 
Date: Fri, 28 Jan 2022 16:29:51 +
Subject: [PATCH] middle-end: updating the reg use in exit block for
 -fzero-call-used-regs [PR100775]

In the pass_zero_call_used_regs, when updating dataflow info after adding
the register zeroing sequence in the epilogue of the function, we should
call "df_update_exit_block_uses" to update the register use information in
the exit block to include all the registers that have been zeroed.

2022-01-27  Qing Zhao  

gcc/ChangeLog:

* function.cc (gen_call_used_regs_seq): Call
df_update_exit_block_uses when updating df.

gcc/testsuite/ChangeLog:

* gcc.target/arm/pr100775.c: New test.
---
 gcc/function.cc | 1 +
 gcc/testsuite/gcc.target/arm/pr100775.c | 8 
 2 files changed, 9 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/arm/pr100775.c

diff --git a/gcc/function.cc b/gcc/function.cc
index e1d2565f8d92..c8a77c9a6246 100644
--- a/gcc/function.cc
+++ b/gcc/function.cc
@@ -5942,6 +5942,7 @@ gen_call_used_regs_seq (rtx_insn *ret, unsigned int zero_regs_type)
   /* Update the data flow information.  */
   crtl->must_be_zero_on_return |= zeroed_hardregs;
   df_set_bb_dirty (EXIT_BLOCK_PTR_FOR_FN (cfun));
+  df_update_exit_block_uses ();
 }
 }
 
diff --git a/gcc/testsuite/gcc.target/arm/pr100775.c b/gcc/testsuite/gcc.target/arm/pr100775.c
new file mode 100644
index ..dd2255a95492
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr100775.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-mthumb -fzero-call-used-regs=used" } */
+
+int
+foo (int x)
+{
+  return x;
+}
-- 
2.27.0







Re: [PATCH] [rtl/cprop_hardreg] Don't propagate for a more expensive reg-reg move.

2022-01-28 Thread Jeff Law via Gcc-patches




On 1/24/2022 6:31 PM, liuhongt via Gcc-patches wrote:

For i386, it enables optimization like:

 vmovd   %xmm0, %edx
-   vmovd   %xmm0, %eax
+   movl%edx, %eax

Bootstrapped and regtested on CLX for both
x86_64-pc-linux-gnu{-m32,} and
x86_64-pc-linux-gnu{-m32\ -march=native,\ -march=native}

Ok for trunk?

gcc/ChangeLog:

PR rtl-optimization/104059
* regcprop.cc (copyprop_hardreg_forward_1): Don't propagate
for a more expensive reg-reg move.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr104059.c: New test.

OK
jeff



[PATCH] c++: bogus warning with value init of const pmf [PR92752]

2022-01-28 Thread Patrick Palka via Gcc-patches
Here we're emitting a -Wignored-qualifiers warning for an
intermediate compiler-generated cast of 0 to 'method-type* const'
as part of value initialization of a const pmf.  This patch
suppresses the warning by stripping cv-quals from this pointer
type before performing the cast.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/92752

gcc/cp/ChangeLog:

* typeck.cc (build_ptrmemfunc): Strip cv-quals from the pointer
type after building the pmf type.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wignored-qualifiers2.C: New test.
---
 gcc/cp/typeck.cc|  1 +
 .../g++.dg/warn/Wignored-qualifiers2.C  | 17 +
 2 files changed, 18 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wignored-qualifiers2.C

diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index 11c9d8aff3e..0ce1cd3b6c8 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -9593,6 +9593,7 @@ build_ptrmemfunc (tree type, tree pfn, int force, bool c_cast_p,
   /* Handle null pointer to member function conversions.  */
   if (null_ptr_cst_p (pfn))
 {
+  type = cv_unqualified (type);
   pfn = cp_build_c_cast (input_location, type, pfn, complain);
   return build_ptrmemfunc1 (to_type,
integer_zero_node,
diff --git a/gcc/testsuite/g++.dg/warn/Wignored-qualifiers2.C b/gcc/testsuite/g++.dg/warn/Wignored-qualifiers2.C
new file mode 100644
index 000..c4c37545c02
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wignored-qualifiers2.C
@@ -0,0 +1,17 @@
+// PR c++/92752
+// { dg-do compile }
+// { dg-additional-options "-Wignored-qualifiers" }
+
+struct X;
+
+template
+struct Wrap {
+  T data;
+  Wrap() : data() {}
+};
+
+typedef int (X::*type)();
+Wrap x;
+#if __cpp_initializer_lists
+const type t{};
+#endif
-- 
2.35.0



[PATCH 8/8] rs6000: Fix some missing built-in attributes [PR104004]

2022-01-28 Thread Bill Schmidt via Gcc-patches
PR104004 caught some misses on my part in converting to the new built-in
function infrastructure.  In particular, I forgot to mark all of the "nosoft"
built-ins, and one of those should also have been marked "no32bit".

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this okay for trunk?

Thanks,
Bill


2022-01-27  Bill Schmidt  

gcc/
* config/rs6000/rs6000-builtins.def (MFFSL): Mark nosoft.
(MTFSB0): Likewise.
(MTFSB1): Likewise.
(SET_FPSCR_RN): Likewise.
(SET_FPSCR_DRN): Mark nosoft and no32bit.
---
 gcc/config/rs6000/rs6000-builtins.def | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index c8f0cf332eb..98619a649e3 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -215,7 +215,7 @@
 ; processors, this builtin automatically falls back to mffs on older
 ; platforms.  Thus it appears here in the [always] stanza.
   double __builtin_mffsl ();
-MFFSL rs6000_mffsl {}
+MFFSL rs6000_mffsl {nosoft}
 
 ; This is redundant with __builtin_pack_ibm128, as it requires long
 ; double to be __ibm128.  Should probably be deprecated.
@@ -226,10 +226,10 @@
 MFTB rs6000_mftb_di {32bit}
 
   void __builtin_mtfsb0 (const int<0,31>);
-MTFSB0 rs6000_mtfsb0 {}
+MTFSB0 rs6000_mtfsb0 {nosoft}
 
   void __builtin_mtfsb1 (const int<0,31>);
-MTFSB1 rs6000_mtfsb1 {}
+MTFSB1 rs6000_mtfsb1 {nosoft}
 
   void __builtin_mtfsf (const int<0,255>, double);
 MTFSF rs6000_mtfsf {}
@@ -238,7 +238,7 @@
 PACK_IF packif {}
 
   void __builtin_set_fpscr_rn (const int[0,3]);
-SET_FPSCR_RN rs6000_set_fpscr_rn {}
+SET_FPSCR_RN rs6000_set_fpscr_rn {nosoft}
 
   const double __builtin_unpack_ibm128 (__ibm128, const int<0,1>);
 UNPACK_IF unpackif {}
@@ -2969,7 +2969,7 @@
 PACK_TD packtd {}
 
   void __builtin_set_fpscr_drn (const int[0,7]);
-SET_FPSCR_DRN rs6000_set_fpscr_drn {}
+SET_FPSCR_DRN rs6000_set_fpscr_drn {nosoft,no32bit}
 
   const unsigned long long __builtin_unpack_dec128 (_Decimal128, \
 const int<0,1>);
-- 
2.27.0



[PATCH 7/8] rs6000: vec_neg built-ins wrongly require POWER8

2022-01-28 Thread Bill Schmidt via Gcc-patches
As the subject states.  Fixing this is accomplished by moving the built-ins
to the correct stanzas, [altivec] and [vsx].

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this okay for trunk?

Thanks,
Bill


2022-01-27  Bill Schmidt  

gcc/
* config/rs6000/rs6000-builtins.def (NEG_V16QI): Move to [altivec]
stanza.
(NEG_V4SF): Likewise.
(NEG_V4SI): Likewise.
(NEG_V8HI): Likewise.
(NEG_V2DF): Move to [vsx] stanza.
(NEG_V2DI): Likewise.
---
 gcc/config/rs6000/rs6000-builtins.def | 36 +--
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 2bb997a5279..c8f0cf332eb 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -410,6 +410,18 @@
   const vss __builtin_altivec_nabs_v8hi (vss);
 NABS_V8HI nabsv8hi2 {}
 
+  const vsc __builtin_altivec_neg_v16qi (vsc);
+NEG_V16QI negv16qi2 {}
+
+  const vf __builtin_altivec_neg_v4sf (vf);
+NEG_V4SF negv4sf2 {}
+
+  const vsi __builtin_altivec_neg_v4si (vsi);
+NEG_V4SI negv4si2 {}
+
+  const vss __builtin_altivec_neg_v8hi (vss);
+NEG_V8HI negv8hi2 {}
+
   void __builtin_altivec_stvebx (vsc, signed long, void *);
 STVEBX altivec_stvebx {stvec}
 
@@ -1175,6 +1187,12 @@
   const vsll __builtin_altivec_nabs_v2di (vsll);
 NABS_V2DI nabsv2di2 {}
 
+  const vd __builtin_altivec_neg_v2df (vd);
+NEG_V2DF negv2df2 {}
+
+  const vsll __builtin_altivec_neg_v2di (vsll);
+NEG_V2DI negv2di2 {}
+
   void __builtin_altivec_stvx_v2df (vd, signed long, void *);
 STVX_V2DF altivec_stvx_v2df {stvec}
 
@@ -2118,24 +2136,6 @@
   const vus __builtin_altivec_nand_v8hi_uns (vus, vus);
 NAND_V8HI_UNS nandv8hi3 {}
 
-  const vsc __builtin_altivec_neg_v16qi (vsc);
-NEG_V16QI negv16qi2 {}
-
-  const vd __builtin_altivec_neg_v2df (vd);
-NEG_V2DF negv2df2 {}
-
-  const vsll __builtin_altivec_neg_v2di (vsll);
-NEG_V2DI negv2di2 {}
-
-  const vf __builtin_altivec_neg_v4sf (vf);
-NEG_V4SF negv4sf2 {}
-
-  const vsi __builtin_altivec_neg_v4si (vsi);
-NEG_V4SI negv4si2 {}
-
-  const vss __builtin_altivec_neg_v8hi (vss);
-NEG_V8HI negv8hi2 {}
-
   const vsc __builtin_altivec_orc_v16qi (vsc, vsc);
 ORC_V16QI orcv16qi3 {}
 
-- 
2.27.0



[PATCH 6/8] rs6000: Remove -m[no-]fold-gimple flag [PR103686]

2022-01-28 Thread Bill Schmidt via Gcc-patches
The -m[no-]fold-gimple flag was really intended primarily for internal
testing while implementing GIMPLE folding for rs6000 vector built-in
functions.  It ended up leaking into other places, causing problems such
as PR103686 identifies.  Let's remove it.

There are a number of tests in the testsuite that require adjustment.
Some specify -mfold-gimple directly, which is the default, so that is
handled by removing the option.  Others unnecessarily specify
-mno-fold-gimple, as the tests work fine without this.  Again that is
handled by removing the option.  There are a couple of extra variants of
tests specifically for -mno-fold-gimple; for those, we can just remove the
whole test.

gcc.target/powerpc/builtins-1.c was more problematic.  It was written in
such a way as to be extremely fragile.  For this one, I rewrote the whole
test in a different style, using individual functions to test each
built-in function.  These same tests are also largely covered by
builtins-1-be-folded.c and builtins-1-le-folded.c, so I chose to
explicitly make this test -mbig for simplicity, and use -O2 for clean code
generation.  I made some slight modifications to the expected instruction
counts as a result, and tested on both 32- and 64-bit.  Most instruction
count tests now use the {\m ... \M} style, but I wasn't able to figure out
how to get this right for vcmpequd. and vcmpgtud.  Using \. didn't do the
trick, and I got tired of messing with it.  I can change those if you
suggest the proper incantation for an opcode ending with a period.

Bootstrapped and tested on powerpc64le-linux-gnu and on
powerpc64-linux-gnu (32- and 64-bit) with no regressions.
Is this okay for trunk?

Thanks,
Bill


2022-01-27  Bill Schmidt  

gcc/
PR target/103686
* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin):
Remove test for !rs6000_fold_gimple.
* config/rs6000/rs6000.cc (rs6000_option_override_internal):
Likewise.
* config/rs6000/rs6000.opt (mfold-gimple): Remove.

gcc/testsuite/
PR target/103686
* gcc.target/powerpc/builtins-1-be-folded.c: Remove -mfold-gimple
option.
* gcc.target/powerpc/builtins-1-le-folded.c: Likewise.
* gcc.target/powerpc/builtins-1.c: Rewrite to use small functions
and restrict to -O2 -mbig for predictability.  Adjust instruction
counts.
* gcc.target/powerpc/builtins-5.c: Remove -mno-fold-gimple
option.
* gcc.target/powerpc/p8-vec-xl-xst.c: Likewise.
* gcc.target/powerpc/pr83926.c: Likewise.
* gcc.target/powerpc/pr86731-nogimplefold-longlong.c: Delete.
* gcc.target/powerpc/pr86731-nogimplefold.c: Delete.
* gcc.target/powerpc/swaps-p8-17.c: Remove -mno-fold-gimple
option.
---
 gcc/config/rs6000/rs6000-builtin.cc   |3 -
 gcc/config/rs6000/rs6000.cc   |4 -
 gcc/config/rs6000/rs6000.opt  |4 -
 .../gcc.target/powerpc/builtins-1-be-folded.c |2 +-
 .../gcc.target/powerpc/builtins-1-le-folded.c |2 +-
 gcc/testsuite/gcc.target/powerpc/builtins-1.c | 1210 +
 gcc/testsuite/gcc.target/powerpc/builtins-5.c |3 +-
 .../gcc.target/powerpc/p8-vec-xl-xst.c|3 +-
 gcc/testsuite/gcc.target/powerpc/pr83926.c|3 +-
 .../powerpc/pr86731-nogimplefold-longlong.c   |   32 -
 .../gcc.target/powerpc/pr86731-nogimplefold.c |   63 -
 .../gcc.target/powerpc/swaps-p8-17.c  |3 +-
 12 files changed, 951 insertions(+), 381 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.target/powerpc/pr86731-nogimplefold-longlong.c
 delete mode 100644 gcc/testsuite/gcc.target/powerpc/pr86731-nogimplefold.c

diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index 163287f2b67..dc9e3a4df1d 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -1299,9 +1299,6 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
   fprintf (stderr, "rs6000_gimple_fold_builtin %d %s %s\n",
   fn_code, fn_name1, fn_name2);
 
-  if (!rs6000_fold_gimple)
-return false;
-
   /* Prevent gimple folding for code that does not have a LHS, unless it is
  allowed per the rs6000_builtin_valid_without_lhs helper function.  */
   if (!gimple_call_lhs (stmt)
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index d27e1ec4a60..a4acb5d1f43 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3851,10 +3851,6 @@ rs6000_option_override_internal (bool global_init_p)
   & OPTION_MASK_DIRECT_MOVE))
 rs6000_isa_flags |= ~rs6000_isa_flags_explicit & OPTION_MASK_STRICT_ALIGN;
 
-  if (!rs6000_fold_gimple)
- fprintf (stderr,
- "gimple folding of rs6000 builtins has been disabled.\n");
-
   /* Add some warnings for VSX.  */
   if (TARGET_VSX)
 {
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index c2a77182a9e..68c0cae6e63 

[PATCH 5/8] rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]

2022-01-28 Thread Bill Schmidt via Gcc-patches
These built-ins were misimplemented as always having big-endian semantics.
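
Why endianness matters here can be seen in a scalar model.  This sketch rests
on an assumption that is not spelled out in the message: that on little-endian
targets the element order seen by the programmer is the reverse of the byte
order in which the underlying instructions count.  The type and function names
are invented for the illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

using Bytes = std::array<std::uint8_t, 16>;

// Number of byte elements, starting at index 0, whose least-significant
// bit is zero.
inline int count_leading_zero_lsb(const Bytes& v) {
  int n = 0;
  while (n < 16 && (v[n] & 1) == 0)
    ++n;
  return n;
}

// The same count, starting at index 15 and moving downwards.
inline int count_trailing_zero_lsb(const Bytes& v) {
  int n = 0;
  while (n < 16 && (v[15 - n] & 1) == 0)
    ++n;
  return n;
}

// Models viewing the same 16 bytes in the opposite element order.
inline Bytes reversed(Bytes v) {
  std::reverse(v.begin(), v.end());
  return v;
}

inline void self_check() {
  Bytes zeros{};
  assert(count_leading_zero_lsb(zeros) == 16);
}
```

Under that assumption, a "leading" count in one element order is a "trailing"
count in the other, so an implementation that always counts in big-endian
order returns the wrong one of the two on little-endian targets.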

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this okay for trunk?

Thanks,
Bill


2022-01-18  Bill Schmidt  

gcc/
PR target/95082
* config/rs6000/rs6000-builtin.cc (rs6000_expand_builtin): Handle
endianness for vclzlsbb and vctzlsbb.
* config/rs6000/rs6000-builtins.def (VCLZLSBB_V16QI): Change
default pattern and indicate a different pattern will be used for
big endian.
(VCLZLSBB_V4SI): Likewise.
(VCLZLSBB_V8HI): Likewise.
(VCTZLSBB_V16QI): Likewise.
(VCTZLSBB_V4SI): Likewise.
(VCTZLSBB_V8HI): Likewise.

gcc/testsuite/
PR target/95082
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c: Restrict to -mbig.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-1.c: Likewise.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c: New.
* gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c: New.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-0.c: Restrict to -mbig.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-1.c: Likewise.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c: New.
* gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c: New.
---
 gcc/config/rs6000/rs6000-builtin.cc   | 12 
 gcc/config/rs6000/rs6000-builtins.def | 12 ++--
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c |  2 +-
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-1.c |  2 +-
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c | 15 +++
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c | 15 +++
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-0.c |  2 +-
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-1.c |  2 +-
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c | 15 +++
 .../gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c | 15 +++
 10 files changed, 82 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-4.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vsu/vec-cnttz-lsbb-4.c

diff --git a/gcc/config/rs6000/rs6000-builtin.cc 
b/gcc/config/rs6000/rs6000-builtin.cc
index 191a6108a5e..163287f2b67 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -3485,6 +3485,18 @@ rs6000_expand_builtin (tree exp, rtx target, rtx /* 
subtarget */,
icode = CODE_FOR_vsx_store_v8hi;
   else if (fcode == RS6000_BIF_ST_ELEMREV_V16QI)
icode = CODE_FOR_vsx_store_v16qi;
+  else if (fcode == RS6000_BIF_VCLZLSBB_V16QI)
+   icode = CODE_FOR_vclzlsbb_v16qi;
+  else if (fcode == RS6000_BIF_VCLZLSBB_V4SI)
+   icode = CODE_FOR_vclzlsbb_v4si;
+  else if (fcode == RS6000_BIF_VCLZLSBB_V8HI)
+   icode = CODE_FOR_vclzlsbb_v8hi;
+  else if (fcode == RS6000_BIF_VCTZLSBB_V16QI)
+   icode = CODE_FOR_vctzlsbb_v16qi;
+  else if (fcode == RS6000_BIF_VCTZLSBB_V4SI)
+   icode = CODE_FOR_vctzlsbb_v4si;
+  else if (fcode == RS6000_BIF_VCTZLSBB_V8HI)
+   icode = CODE_FOR_vctzlsbb_v8hi;
   else
gcc_unreachable ();
 }
diff --git a/gcc/config/rs6000/rs6000-builtins.def 
b/gcc/config/rs6000/rs6000-builtins.def
index cfe31c2e7de..2bb997a5279 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2551,13 +2551,13 @@
 VBPERMD altivec_vbpermd {}
 
   const signed int __builtin_altivec_vclzlsbb_v16qi (vsc);
-VCLZLSBB_V16QI vclzlsbb_v16qi {}
+VCLZLSBB_V16QI vctzlsbb_v16qi {endian}
 
   const signed int __builtin_altivec_vclzlsbb_v4si (vsi);
-VCLZLSBB_V4SI vclzlsbb_v4si {}
+VCLZLSBB_V4SI vctzlsbb_v4si {endian}
 
   const signed int __builtin_altivec_vclzlsbb_v8hi (vss);
-VCLZLSBB_V8HI vclzlsbb_v8hi {}
+VCLZLSBB_V8HI vctzlsbb_v8hi {endian}
 
   const vsc __builtin_altivec_vctzb (vsc);
 VCTZB ctzv16qi2 {}
@@ -2572,13 +2572,13 @@
 VCTZW ctzv4si2 {}
 
   const signed int __builtin_altivec_vctzlsbb_v16qi (vsc);
-VCTZLSBB_V16QI vctzlsbb_v16qi {}
+VCTZLSBB_V16QI vclzlsbb_v16qi {endian}
 
   const signed int __builtin_altivec_vctzlsbb_v4si (vsi);
-VCTZLSBB_V4SI vctzlsbb_v4si {}
+VCTZLSBB_V4SI vclzlsbb_v4si {endian}
 
   const signed int __builtin_altivec_vctzlsbb_v8hi (vss);
-VCTZLSBB_V8HI vctzlsbb_v8hi {}
+VCTZLSBB_V8HI vclzlsbb_v8hi {endian}
 
   const signed int __builtin_altivec_vcmpaeb_p (vsc, vsc);
 VCMPAEB_P vector_ae_v16qi_p {}
diff --git a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c 
b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
index 0faf233425e..dc92d6fdd65 100644
--- a/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c
@@ -1,6 +1,6 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
 /* 

[PATCH 3/8] rs6000: Convert <x> built-in constraints to <x,y> form

2022-01-28 Thread Bill Schmidt via Gcc-patches
When introducing the new built-in support, I tried to match as many
existing error messages as possible.  One common form was "argument X must
be a Y-bit unsigned literal".  Another was "argument X must be a literal
between X' and Y', inclusive".  During reviews, Segher requested that I
eventually convert all messages of the first form into the second form for
consistency.  That's what this patch does, replacing all <x>-form
constraints (first form) with <x,y>-form constraints (second form).
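
As a worked illustration of why the two message forms are interchangeable, the sketch below maps "an n-bit unsigned literal" onto the inclusive range 0 .. 2^n - 1 and checks a value against it.  The struct and function names are hypothetical and are not taken from the built-ins parser; this only shows the arithmetic.

```c
#include <assert.h>

/* Inclusive range for a literal argument (hypothetical helper type).  */
struct literal_range
{
  long lo;
  long hi;
};

/* Range equivalent to "an n-bit unsigned literal": 0 .. 2^n - 1.  */
static struct literal_range range_for_unsigned_bits(unsigned bits)
{
  struct literal_range r;
  assert(bits > 0 && bits < 31);   /* keep the shift within a 32-bit long */
  r.lo = 0;
  r.hi = (1L << bits) - 1;
  return r;
}

/* Nonzero if VALUE lies between r.lo and r.hi, inclusive.  */
static int literal_in_range(long value, struct literal_range r)
{
  return value >= r.lo && value <= r.hi;
}
```

For example, a 5-bit unsigned literal becomes "a literal between 0 and 31, inclusive".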

For the moment, the parser will still accept <x> arguments, but I've added
a note in rs6000-builtins.def that this form is deprecated in favor of
<x,y>.  I think it's harmless to leave it in, in case a desire for the
distinction comes up in the future.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this okay for trunk?

Thanks,
Bill


2022-01-12  Bill Schmidt  

gcc/
* config/rs6000/rs6000-builtins.def (MTFSB0): Replace <x>-form
constraints with <x,y>-form constraints.
(MTFSB1): Likewise.
(MTFSF): Likewise.
(UNPACK_IF): Likewise.
(UNPACK_TF): Likewise.
(DSS): Likewise.
(DST): Likewise.
(DSTST): Likewise.
(DSTSTT): Likewise.
(DSTT): Likewise.
(VCFSX): Likewise.
(VCFUX): Likewise.
(VCTSXS): Likewise.
(VCTUXS): Likewise.
(VSLDOI_16QI): Likewise.
(VSLDOI_4SF): Likewise.
(VSLDOI_4SI): Likewise.
(VSLDOI_8HI): Likewise.
(VSPLTB): Likewise.
(VSPLTH): Likewise.
(VSPLTW): Likewise.
(VEC_SET_V16QI): Likewise.
(VEC_SET_V4SF): Likewise.
(VEC_SET_V4SI): Likewise.
(VEC_SET_V8HI): Likewise.
(VSLDOI_2DF): Likewise.
(VSLDOI_2DI): Likewise.
(VEC_SET_V2DF): Likewise.
(VEC_SET_V2DI): Likewise.
(XVCVSXDDP_SCALE): Likewise.
(XVCVUXDDP_SCALE): Likewise.
(XXPERMDI_16QI): Likewise.
(XXPERMDI_1TI): Likewise.
(XXPERMDI_2DF): Likewise.
(XXPERMDI_2DI): Likewise.
(XXPERMDI_4SF): Likewise.
(XXPERMDI_4SI): Likewise.
(XXPERMDI_8HI): Likewise.
(XXSLDWI_16QI): Likewise.
(XXSLDWI_2DF): Likewise.
(XXSLDWI_2DI): Likewise.
(XXSLDWI_4SF): Likewise.
(XXSLDWI_4SI): Likewise.
(XXSLDWI_8HI): Likewise.
(XXSPLTD_V2DF): Likewise.
(XXSPLTD_V2DI): Likewise.
(UNPACK_V1TI): Likewise.
(BCDADD_V1TI): Likewise.
(BCDADD_V16QI): Likewise.
(BCDADD_EQ_V1TI): Likewise.
(BCDADD_EQ_V16QI): Likewise.
(BCDADD_GT_V1TI): Likewise.
(BCDADD_GT_V16QI): Likewise.
(BCDADD_LT_V1TI): Likewise.
(BCDADD_LT_V16QI): Likewise.
(BCDADD_OV_V1TI): Likewise.
(BCDADD_OV_V16QI): Likewise.
(BCDSUB_V1TI): Likewise.
(BCDSUB_V16QI): Likewise.
(BCDSUB_EQ_V1TI): Likewise.
(BCDSUB_EQ_V16QI): Likewise.
(BCDSUB_GT_V1TI): Likewise.
(BCDSUB_GT_V16QI): Likewise.
(BCDSUB_LT_V1TI): Likewise.
(BCDSUB_LT_V16QI): Likewise.
(BCDSUB_OV_V1TI): Likewise.
(BCDSUB_OV_V16QI): Likewise.
(VSTDCDP): Likewise.
(VSTDCSP): Likewise.
(VTDCDP): Likewise.
(VTDCSP): Likewise.
(TSTSFI_EQ_DD): Likewise.
(TSTSFI_EQ_TD): Likewise.
(TSTSFI_GT_DD): Likewise.
(TSTSFI_GT_TD): Likewise.
(TSTSFI_LT_DD): Likewise.
(TSTSFI_LT_TD): Likewise.
(TSTSFI_OV_DD): Likewise.
(TSTSFI_OV_TD): Likewise.
(VSTDCQP): Likewise.
(DDEDPD): Likewise.
(DDEDPDQ): Likewise.
(DENBCD): Likewise.
(DENBCDQ): Likewise.
(DSCLI): Likewise.
(DSCLIQ): Likewise.
(DSCRI): Likewise.
(DSCRIQ): Likewise.
(UNPACK_TD): Likewise.
(VSHASIGMAD): Likewise.
(VSHASIGMAW): Likewise.
(VCNTMBB): Likewise.
(VCNTMBD): Likewise.
(VCNTMBH): Likewise.
(VCNTMBW): Likewise.
(VREPLACE_UN_UV2DI): Likewise.
(VREPLACE_UN_UV4SI): Likewise.
(VREPLACE_UN_V2DF): Likewise.
(VREPLACE_UN_V2DI): Likewise.
(VREPLACE_UN_V4SF): Likewise.
(VREPLACE_UN_V4SI): Likewise.
(VREPLACE_ELT_UV2DI): Likewise.
(VREPLACE_ELT_UV4SI): Likewise.
(VREPLACE_ELT_V2DF): Likewise.
(VREPLACE_ELT_V2DI): Likewise.
(VREPLACE_ELT_V4SF): Likewise.
(VREPLACE_ELT_V4SI): Likewise.
(VSLDB_V16QI): Likewise.
(VSLDB_V2DI): Likewise.
(VSLDB_V4SI): Likewise.
(VSLDB_V8HI): Likewise.
(VSRDB_V16QI): Likewise.
(VSRDB_V2DI): Likewise.
(VSRDB_V4SI): Likewise.
(VSRDB_V8HI): Likewise.
(VXXSPLTI32DX_V4SF): Likewise.
(VXXSPLTI32DX_V4SI): Likewise.
(XXEVAL): Likewise.
(XXGENPCVM_V16QI): Likewise.
(XXGENPCVM_V2DI): Likewise.
(XXGENPCVM_V4SI): Likewise.
(XXGENPCVM_V8HI): Likewise.

[PATCH 2/8] rs6000: Don't #ifdef "short" built-in names

2022-01-28 Thread Bill Schmidt via Gcc-patches
It was recently pointed out that we get anomalous behavior when using
__attribute__((target)) to select a CPU.  As an example, when building for
-mcpu=power8 but using __attribute__((target("mcpu=power10"))), it is legal
to call __builtin_vec_mod, but not vec_mod, even though these are
equivalent.  This is because the equivalence is established with a #define
that is guarded by #ifdef _ARCH_PWR10.
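
The mechanism can be shown with a plain-C analogy.  This is only a sketch: the macro and function names below are hypothetical stand-ins and no real rs6000 built-in is used.  The point is that a guarded alias is decided once, when the preprocessor runs with the command-line options, so it cannot react to a per-function target attribute.

```c
#include <assert.h>

/* Stand-in for a "long" built-in name that is always declared.  */
static int hypothetical_builtin_mod(int a, int b)
{
  return a % b;
}

/* Guarded alias: it exists only if the macro was already defined when
   the preprocessor ran, i.e. based on the command-line options.  */
#ifdef HYPOTHETICAL_ARCH_MACRO
#define short_mod hypothetical_builtin_mod
#endif

/* Unguarded alias: always available, whatever the command line says.  */
#define short_mod_unguarded hypothetical_builtin_mod

static int use_alias(int a, int b)
{
  /* Calling short_mod here would fail to compile unless
     HYPOTHETICAL_ARCH_MACRO is defined on the command line, even though
     the function it names is perfectly callable.  */
  return short_mod_unguarded(a, b);
}
```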

This goofy behavior occurs with both the old builtins support and the
new.  One of the goals of the new builtins support was to make sure all
appropriate interfaces are available using __attribute__((target)), so I
failed in this respect.  This patch corrects the problem by removing the
apply.  For example, #ifdef __PPU__ is still appropriate.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this okay for trunk?

Thanks,
Bill


2022-01-06  Bill Schmidt  

gcc/
* config/rs6000/rs6000-overload.def (VEC_ABSD): Remove #ifdef token.
(VEC_BLENDV): Likewise.
(VEC_BPERM): Likewise.
(VEC_CFUGE): Likewise.
(VEC_CIPHER_BE): Likewise.
(VEC_CIPHERLAST_BE): Likewise.
(VEC_CLRL): Likewise.
(VEC_CLRR): Likewise.
(VEC_CMPNEZ): Likewise.
(VEC_CNTLZ): Likewise.
(VEC_CNTLZM): Likewise.
(VEC_CNTTZM): Likewise.
(VEC_CNTLZ_LSBB): Likewise.
(VEC_CNTM): Likewise.
(VEC_CNTTZ): Likewise.
(VEC_CNTTZ_LSBB): Likewise.
(VEC_CONVERT_4F32_8F16): Likewise.
(VEC_DIV): Likewise.
(VEC_DIVE): Likewise.
(VEC_EQV): Likewise.
(VEC_EXPANDM): Likewise.
(VEC_EXTRACT_FP_FROM_SHORTH): Likewise.
(VEC_EXTRACT_FP_FROM_SHORTL): Likewise.
(VEC_EXTRACTH): Likewise.
(VEC_EXTRACTL): Likewise.
(VEC_EXTRACTM): Likewise.
(VEC_EXTRACT4B): Likewise.
(VEC_EXTULX): Likewise.
(VEC_EXTURX): Likewise.
(VEC_FIRSTMATCHINDEX): Likewise.
(VEC_FIRSTMATCHOREOSINDEX): Likewise.
(VEC_FIRSTMISMATCHINDEX): Likewise.
(VEC_FIRSTMISMATCHOREOSINDEX): Likewise.
(VEC_GB): Likewise.
(VEC_GENBM): Likewise.
(VEC_GENHM): Likewise.
(VEC_GENWM): Likewise.
(VEC_GENDM): Likewise.
(VEC_GENQM): Likewise.
(VEC_GENPCVM): Likewise.
(VEC_GNB): Likewise.
(VEC_INSERTH): Likewise.
(VEC_INSERTL): Likewise.
(VEC_INSERT4B): Likewise.
(VEC_LXVL): Likewise.
(VEC_MERGEE): Likewise.
(VEC_MERGEO): Likewise.
(VEC_MOD): Likewise.
(VEC_MSUB): Likewise.
(VEC_MULH): Likewise.
(VEC_NAND): Likewise.
(VEC_NCIPHER_BE): Likewise.
(VEC_NCIPHERLAST_BE): Likewise.
(VEC_NEARBYINT): Likewise.
(VEC_NMADD): Likewise.
(VEC_ORC): Likewise.
(VEC_PDEP): Likewise.
(VEC_PERMX): Likewise.
(VEC_PEXT): Likewise.
(VEC_POPCNT): Likewise.
(VEC_PARITY_LSBB): Likewise.
(VEC_REPLACE_ELT): Likewise.
(VEC_REPLACE_UN): Likewise.
(VEC_REVB): Likewise.
(VEC_RINT): Likewise.
(VEC_RLMI): Likewise.
(VEC_RLNM): Likewise.
(VEC_SBOX_BE): Likewise.
(VEC_SIGNEXTI): Likewise.
(VEC_SIGNEXTLL): Likewise.
(VEC_SIGNEXTQ): Likewise.
(VEC_SLDB): Likewise.
(VEC_SLV): Likewise.
(VEC_SPLATI): Likewise.
(VEC_SPLATID): Likewise.
(VEC_SPLATI_INS): Likewise.
(VEC_SQRT): Likewise.
(VEC_SRDB): Likewise.
(VEC_SRV): Likewise.
(VEC_STRIL): Likewise.
(VEC_STRIL_P): Likewise.
(VEC_STRIR): Likewise.
(VEC_STRIR_P): Likewise.
(VEC_STXVL): Likewise.
(VEC_TERNARYLOGIC): Likewise.
(VEC_TEST_LSBB_ALL_ONES): Likewise.
(VEC_TEST_LSBB_ALL_ZEROS): Likewise.
(VEC_VEE): Likewise.
(VEC_VES): Likewise.
(VEC_VIE): Likewise.
(VEC_VPRTYB): Likewise.
(VEC_VSCEEQ): Likewise.
(VEC_VSCEGT): Likewise.
(VEC_VSCELT): Likewise.
(VEC_VSCEUO): Likewise.
(VEC_VSEE): Likewise.
(VEC_VSES): Likewise.
(VEC_VSIE): Likewise.
(VEC_VSTDC): Likewise.
(VEC_VSTDCN): Likewise.
(VEC_VTDC): Likewise.
(VEC_XL): Likewise.
(VEC_XL_BE): Likewise.
(VEC_XL_LEN_R): Likewise.
(VEC_XL_SEXT): Likewise.
(VEC_XL_ZEXT): Likewise.
(VEC_XST): Likewise.
(VEC_XST_BE): Likewise.
(VEC_XST_LEN_R): Likewise.
(VEC_XST_TRUNC): Likewise.
(VEC_XXPERMDI): Likewise.
(VEC_XXSLDWI): Likewise.
(VEC_TSTSFI_EQ_DD): Likewise.
(VEC_TSTSFI_EQ_TD): Likewise.
(VEC_TSTSFI_GT_DD): Likewise.
(VEC_TSTSFI_GT_TD): Likewise.
(VEC_TSTSFI_LT_DD): Likewise.
(VEC_TSTSFI_LT_TD): Likewise.
(VEC_TSTSFI_OV_DD): Likewise.
(VEC_TSTSFI_OV_TD): Likewise.
(VEC_VADDCUQ): Likewise.
(VEC_VADDECUQ): 

Re: RFA: libiberty: Fix infinite recursion in rust demangler (PRs 98886 and 99935)

2022-01-28 Thread Jeff Law via Gcc-patches




On 1/26/2022 8:29 AM, Nick Clifton via Gcc-patches wrote:

Hi Guys,

   I would like to propose the patch below to fix a couple of sources
   of infinite recursion in libiberty's rust demangling code.  This patch
   is based upon the one submitted for PR 99935, but extended to cope
   with the case presented in PR 98886 and also fixed so that the "uint"
   type is not used.

   Tested with a patched version of the binutils sources on an
   x86-pc-linux-gnu target.

Cheers
   Nick

2022-01-26  Nick Clifton  

* rust-demangle.c (struct rust_demangler): Add a recursion
counter.
(demangle_path): Increment/decrement the recursion counter upon
entry and exit.  Fail if the counter exceeds a fixed limit.
(demangle_type): Likewise.
(rust_demangle_callback): Initialise the recursion counter,
disabling if requested by the option flags.

OK.  I'm both amazed and disappointed at how all the demanglers seem to
be prone to infinite recursion.


jeff
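
For readers unfamiliar with the technique in the ChangeLog above, here is a minimal sketch of a recursion counter guarding a recursive-descent parser.  It is not the libiberty code: the toy grammar (nested parentheses), the struct, the function names and the limit of 8 are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_RECURSION_LIMIT 8   /* assumed value, kept tiny for the demo */

struct sketch_parser
{
  const char *input;
  size_t pos;
  unsigned recursion;   /* current nesting of parse_group calls */
  int errored;
};

/* Parse either "(" group ")" or nothing.  Return the nesting depth
   parsed, or -1 on a syntax error or when the limit is exceeded.  */
static int parse_group(struct sketch_parser *p)
{
  int depth = 0;

  /* Fail instead of recursing once the fixed limit is reached.  */
  if (p->recursion >= SKETCH_RECURSION_LIMIT)
    {
      p->errored = 1;
      return -1;
    }
  p->recursion++;

  if (p->input[p->pos] == '(')
    {
      p->pos++;
      depth = parse_group(p);
      if (depth >= 0 && p->input[p->pos] == ')')
        {
          p->pos++;
          depth++;
        }
      else
        {
          p->errored = 1;
          depth = -1;
        }
    }

  /* Decrement on exit so sibling calls start from the right count.  */
  assert(p->recursion > 0);
  p->recursion--;
  return depth;
}

/* Initialise the counter and run the parser on INPUT.  */
static int nesting_depth(const char *input)
{
  struct sketch_parser p = { input, 0, 0, 0 };
  int depth = parse_group(&p);
  return p.errored ? -1 : depth;
}
```

With this limit, seven levels of nesting parse successfully and deeper input is rejected rather than recursing further.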



[PATCH 1/8] rs6000: More factoring of overload processing

2022-01-28 Thread Bill Schmidt via Gcc-patches
This patch continues the refactoring started with r12-6014.  I had previously
noted that the resolve_vec* routines can be further simplified by processing
the argument list earlier, so that all routines can use the arrays of arguments
and types.  I found that this was useful for some of the routines, but not for
all of them.

For several of the special-cased overloads, we don't specify all of the
possible type combinations in rs6000-overload.def, because the types don't
matter for the expansion we do.  For these, we can't use generic error message
handling when the number of arguments is incorrect, because the result is
misleading error messages that indicate argument types are wrong.

So this patch goes halfway and improves the factoring on the remaining special
cases, but leaves vec_splats, vec_promote, vec_extract, vec_insert, and
vec_step alone.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this okay for trunk?

Thanks,
Bill


2022-01-18  Bill Schmidt  

gcc/
* config/rs6000/rs6000-c.cc (resolve_vec_mul): Accept args and types
parameters instead of arglist and nargs.  Simplify accordingly.  Remove
unnecessary test for argument count mismatch.
(resolve_vec_cmpne): Likewise.
(resolve_vec_adde_sube): Likewise.
(resolve_vec_addec_subec): Likewise.
(altivec_resolve_overloaded_builtin): Move overload special handling
after the gathering of arguments into args[] and types[] and the test
for correct number of arguments.  Don't perform the test for correct
number of arguments for certain special cases.  Call the other special
cases with args and types instead of arglist and nargs.
---
 gcc/config/rs6000/rs6000-c.cc | 304 ++
 1 file changed, 127 insertions(+), 177 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index 145421ab8f2..35c1383f059 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -939,37 +939,25 @@ altivec_build_resolved_builtin (tree *args, int n, tree 
fntype, tree ret_type,
 enum resolution { unresolved, resolved, resolved_bad };
 
 /* Resolve an overloaded vec_mul call and return a tree expression for the
-   resolved call if successful.  NARGS is the number of arguments to the call.
-   ARGLIST contains the arguments.  RES must be set to indicate the status of
+   resolved call if successful.  ARGS contains the arguments to the call.
+   TYPES contains their types.  RES must be set to indicate the status of
the resolution attempt.  LOC contains statement location information.  */
 
 static tree
-resolve_vec_mul (resolution *res, vec *arglist, unsigned nargs,
-location_t loc)
+resolve_vec_mul (resolution *res, tree *args, tree *types, location_t loc)
 {
   /* vec_mul needs to be special cased because there are no instructions for it
  for the {un}signed char, {un}signed short, and {un}signed int types.  */
-  if (nargs != 2)
-{
-  error ("builtin %qs only accepts 2 arguments", "vec_mul");
-  *res = resolved;
-  return error_mark_node;
-}
-
-  tree arg0 = (*arglist)[0];
-  tree arg0_type = TREE_TYPE (arg0);
-  tree arg1 = (*arglist)[1];
-  tree arg1_type = TREE_TYPE (arg1);
 
   /* Both arguments must be vectors and the types must be compatible.  */
-  if (TREE_CODE (arg0_type) != VECTOR_TYPE
-  || !lang_hooks.types_compatible_p (arg0_type, arg1_type))
+  if (TREE_CODE (types[0]) != VECTOR_TYPE
+  || !lang_hooks.types_compatible_p (types[0], types[1]))
 {
   *res = resolved_bad;
   return error_mark_node;
 }
 
-  switch (TYPE_MODE (TREE_TYPE (arg0_type)))
+  switch (TYPE_MODE (TREE_TYPE (types[0])))
 {
 case E_QImode:
 case E_HImode:
@@ -978,21 +966,21 @@ resolve_vec_mul (resolution *res, vec 
*arglist, unsigned nargs,
 case E_TImode:
   /* For scalar types just use a multiply expression.  */
   *res = resolved;
-  return fold_build2_loc (loc, MULT_EXPR, TREE_TYPE (arg0), arg0,
- fold_convert (TREE_TYPE (arg0), arg1));
+  return fold_build2_loc (loc, MULT_EXPR, types[0], args[0],
+ fold_convert (types[0], args[1]));
 case E_SFmode:
   {
/* For floats use the xvmulsp instruction directly.  */
*res = resolved;
tree call = rs6000_builtin_decls[RS6000_BIF_XVMULSP];
-   return build_call_expr (call, 2, arg0, arg1);
+   return build_call_expr (call, 2, args[0], args[1]);
   }
 case E_DFmode:
   {
/* For doubles use the xvmuldp instruction directly.  */
*res = resolved;
tree call = rs6000_builtin_decls[RS6000_BIF_XVMULDP];
-   return build_call_expr (call, 2, arg0, arg1);
+   return build_call_expr (call, 2, args[0], args[1]);
   }
 /* Other types are errors.  */
 default:
@@ -1002,37 +990,25 @@ resolve_vec_mul (resolution *res, vec 

[PATCH 0/8] rs6000: Built-in function cleanups and bug fixes

2022-01-28 Thread Bill Schmidt via Gcc-patches
Hi!

This is a resubmission of some patches and a new submission of others.
Patches 1, 3, and 4 finish up the pending clean-up work for the new built-in
infrastructure support.  Patches 2 and 5-8 fix a variety of bugs not specific
to the new infrastructure.  I'm submitting these as a group primarily because
5-8 are dependent on the previous patches, particularly patch 4, which
consolidates much of the built-in code in a new file.

Thanks for your consideration!

Bill


Bill Schmidt (8):
  rs6000: More factoring of overload processing
  rs6000: Don't #ifdef "short" built-in names
  rs6000: Convert <x> built-in constraints to <x,y> form
  rs6000: Consolidate target built-ins code
  rs6000: Fix LE code gen for vec_cnt[lt]z_lsbb [PR95082]
  rs6000: Remove -m[no-]fold-gimple flag [PR103686]
  rs6000: vec_neg built-ins wrongly require POWER8
  rs6000: Fix some missing built-in attributes [PR104004]

 gcc/config.gcc|2 +-
 gcc/config/rs6000/rs6000-builtin.cc   | 3721 +
 gcc/config/rs6000/rs6000-builtins.def |  578 +--
 gcc/config/rs6000/rs6000-c.cc |  304 +-
 gcc/config/rs6000/rs6000-call.cc  | 3524 
 gcc/config/rs6000/rs6000-overload.def |  344 +-
 gcc/config/rs6000/rs6000.cc   |  167 +-
 gcc/config/rs6000/rs6000.h|1 -
 gcc/config/rs6000/rs6000.opt  |4 -
 gcc/config/rs6000/t-rs6000|4 +
 .../powerpc/bfp/scalar-test-data-class-10.c   |2 +-
 .../powerpc/bfp/scalar-test-data-class-2.c|2 +-
 .../powerpc/bfp/scalar-test-data-class-3.c|2 +-
 .../powerpc/bfp/scalar-test-data-class-4.c|2 +-
 .../powerpc/bfp/scalar-test-data-class-5.c|2 +-
 .../powerpc/bfp/scalar-test-data-class-9.c|2 +-
 .../powerpc/bfp/vec-test-data-class-4.c   |2 +-
 .../powerpc/bfp/vec-test-data-class-5.c   |2 +-
 .../powerpc/bfp/vec-test-data-class-6.c   |2 +-
 .../powerpc/bfp/vec-test-data-class-7.c   |2 +-
 .../gcc.target/powerpc/builtins-1-be-folded.c |2 +-
 .../gcc.target/powerpc/builtins-1-le-folded.c |2 +-
 gcc/testsuite/gcc.target/powerpc/builtins-1.c | 1210 --
 gcc/testsuite/gcc.target/powerpc/builtins-5.c |3 +-
 .../gcc.target/powerpc/dfp/dtstsfi-12.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-14.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-17.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-19.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-2.c|2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-22.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-24.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-27.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-29.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-32.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-34.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-37.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-39.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-4.c|2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-42.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-44.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-47.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-49.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-52.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-54.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-57.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-59.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-62.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-64.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-67.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-69.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-7.c|2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-72.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-74.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-77.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-79.c   |2 +-
 .../gcc.target/powerpc/dfp/dtstsfi-9.c|2 +-
 .../gcc.target/powerpc/p8-vec-xl-xst.c|3 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-1.c  |2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-2.c  |2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-3.c  |2 +-
 gcc/testsuite/gcc.target/powerpc/pr80315-4.c  |2 +-
 gcc/testsuite/gcc.target/powerpc/pr82015.c|4 +-
 gcc/testsuite/gcc.target/powerpc/pr83926.c|3 +-
 .../powerpc/pr86731-nogimplefold-longlong.c   |   32 -
 .../gcc.target/powerpc/pr86731-nogimplefold.c |   63 -
 gcc/testsuite/gcc.target/powerpc/pr91903.c|   60 +-
 .../gcc.target/powerpc/swaps-p8-17.c  |3 +-
 .../powerpc/test_fpscr_rn_builtin_error.c |8 +-
 .../gcc.target/powerpc/vec-ternarylogic-10.c  |6 +-
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-0.c |2 +-
 .../gcc.target/powerpc/vsu/vec-cntlz-lsbb-1.c |2 +-
 

Re: [PATCH] testsuite/70230 - fix failures with default SSP

2022-01-28 Thread Jeff Law via Gcc-patches




On 1/26/2022 6:42 PM, Allan McRae via Gcc-patches wrote:

Configuring with --enable-default-ssp triggers various testsuite
failures.  These contain asm statements that are not compatible with
-fstack-protector.  Adding -fno-stack-protector to dg-options to
work around this issue.

Tested on x86_64-linux.

2022-01-26  Allan McRae  

PR testsuite/70230
* gcc.dg/asan/use-after-scope-4.c (dg-options): Add
-fno-stack-protector.
* gcc.dg/stack-usage-1.c: Likewise
* gcc.dg/superblock.c: Likewise
* gcc.target/i386/avx-vzeroupper-17.c: Likewise
* gcc.target/i386/cleanup-1.c: Likewise
* gcc.target/i386/cleanup-2.c: Likewise
* gcc.target/i386/interrupt-redzone-1.c: Likewise
* gcc.target/i386/interrupt-redzone-2.c: Likewise
* gcc.target/i386/pr79793-1.c: Likewise
* gcc.target/i386/pr79793-2.c: Likewise
* gcc.target/i386/shrink_wrap_1.c: Likewise
* gcc.target/i386/stack-check-11.c: Likewise
* gcc.target/i386/stack-check-18.c: Likewise
* gcc.target/i386/stack-check-19.c: Likewise
* gcc.target/i386/stackalign/pr88483-1.c: Likewise
* gcc.target/i386/stackalign/pr88483-2.c: Likewise
* gcc.target/i386/sw-1.c: Likewise

We're actually in stage4, which means regression fixes only.  But since
this only modifies the testsuite, it can't change code generation or the
like.  I'll go out on a limb, ack and commit.


Jeff



Re: [PATCH] store-merging: Fix up a -fcompare-debug bug in get_status_for_store_merging [PR104263]

2022-01-28 Thread Jeff Law via Gcc-patches




On 1/28/2022 9:31 AM, Jakub Jelinek via Gcc-patches wrote:

Hi!

As mentioned in the PR, the following testcase fails, because the last
stmt of a bb with -g is a debug stmt and get_status_for_store_merging
uses gimple_seq_last_stmt (bb_seq (bb)) when testing if it is valid
for store merging.  The debug stmt isn't valid, while a stmt at that
position with -g0 is valid and so the divergence.

As we walk the whole bb already, this patch just remembers the last
non-debug stmt, so that we don't need to skip backwards debug stmts at the
end of the bb to find last real stmt.
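
The idea can be sketched in plain C as follows.  This is a simplified model with hypothetical types and names, not the GIMPLE API: a single forward walk remembers the last non-debug statement, so the answer is the same whether or not debug statements trail the block.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a statement in a basic block.  */
struct stmt
{
  int is_debug;            /* nonzero for a debug statement (-g only) */
  int valid_for_merging;   /* nonzero if a store valid for merging */
};

/* Walk the block once, remembering the last statement that is not a
   debug statement.  Returns NULL if there is none.  */
static const struct stmt *last_nondebug_stmt(const struct stmt *bb, size_t n)
{
  const struct stmt *last = NULL;
  for (size_t i = 0; i < n; i++)
    if (!bb[i].is_debug)
      last = &bb[i];
  return last;
}

/* The decision looks only at the last real statement, so it does not
   change when debug statements are appended to the block.  */
static int block_ends_in_valid_store(const struct stmt *bb, size_t n)
{
  const struct stmt *last = last_nondebug_stmt(bb, n);
  return last != NULL && last->valid_for_merging;
}
```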

Bootstrapped/regtested on powerpc64le-linux, ok for trunk?

2022-01-28  Jakub Jelinek  

PR tree-optimization/104263
* gimple-ssa-store-merging.cc (get_status_for_store_merging): For
cfun->can_throw_non_call_exceptions && cfun->eh test whether
last non-debug stmt in the bb is store_valid_for_store_merging_p
rather than last stmt.

* gcc.dg/pr104263.c: New test.

OK
jeff



[PATCH] ipa: Create LOAD references when necessary during inlining (PR 103171)

2022-01-28 Thread Martin Jambor
Hi,

in r12-2523-g13586172d0b70c ipa-prop tracking of jump functions during
inlining got the ability to remove ADDR references when inlining
discovered that they were not necessary or turn them into LOAD
references when we know that what was a function call argument passed
by reference will end up as a load (one or more).

Unfortunately, the code only creates the LOAD references when
replacing removed ADDR references and PR 103171 showed that with some
ordering of inlining, we need to add the LOAD reference before we know
we can remove the ADDR one - or the reference will be lost, leading to
link errors or even ICEs.

Specifically in testcase gcc.dg/lto/pr103171_1.c added in this patch,
if foo() is inlined to entry(), we need to create the LOAD reference
so that when later bar() is inlined into foo() and we discover that
the parameter is unused, we can remove the ADDR reference and still
keep the variable around for the load.
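
For illustration, source code of roughly the following shape produces the situation described.  This is a hypothetical reduction written for this note, not the test added by the patch; only the function names foo, bar and entry follow the description above, and the variable name and value are made up.

```c
#include <assert.h>

/* Hypothetical variable whose address is passed around.  */
static int var = 42;

/* The parameter is unused, so once bar is inlined the address
   (ADDR) reference it required is no longer needed.  */
static void bar(int *p)
{
  (void) p;
}

/* foo both forwards the pointer and loads through it, so after foo is
   inlined into entry the caller needs a LOAD reference to var.  */
static int foo(int *p)
{
  bar(p);
  return *p;
}

static int entry(void)
{
  return foo(&var);
}
```

Whatever the inlining order, var has to stay alive because entry ends up loading from it.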

Bootstrapped, LTO bootstrapped and tested on x86_64-linux.  OK for
trunk?

Thanks,

Martin



gcc/ChangeLog:

2022-01-28  Martin Jambor  

PR ipa/103171
* ipa-prop.cc (propagate_controlled_uses): Add a LOAD reference
always when an ADDR_EXPR constant is known to reach a load because
of inlining, not just when removing an ADDR reference.

gcc/testsuite/ChangeLog:

2022-01-28  Martin Jambor  

PR ipa/103171
* gcc.dg/ipa/remref-6.c: Adjust dump scan string.
* gcc.dg/ipa/remref-7.c: New test.
* gcc.dg/lto/pr103171_0.c: New test.
* gcc.dg/lto/pr103171_1.c: Likewise.
---
 gcc/ipa-prop.cc   | 30 ---
 gcc/testsuite/gcc.dg/ipa/remref-6.c   |  2 +-
 gcc/testsuite/gcc.dg/ipa/remref-7.c   | 33 +
 gcc/testsuite/gcc.dg/lto/pr103171_0.c | 11 +
 gcc/testsuite/gcc.dg/lto/pr103171_1.c | 35 +++
 5 files changed, 96 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/remref-7.c
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr103171_0.c
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr103171_1.c

diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
index e55fe2776f2..72aa3e2f60d 100644
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
@@ -4181,6 +4181,20 @@ propagate_controlled_uses (struct cgraph_edge *cs)
  int d = ipa_get_controlled_uses (old_root_info, i);
  int c = rdesc->refcount;
  rdesc->refcount = combine_controlled_uses_counters (c, d);
+ if (rdesc->refcount != IPA_UNDESCRIBED_USE
+ && ipa_get_param_load_dereferenced (old_root_info, i))
+   {
+ tree cst = ipa_get_jf_constant (jf);
+ gcc_checking_assert (TREE_CODE (cst) == ADDR_EXPR
+  && (TREE_CODE (TREE_OPERAND (cst, 0))
+  == VAR_DECL));
+ symtab_node *n = symtab_node::get (TREE_OPERAND (cst, 0));
+ new_root->create_reference (n, IPA_REF_LOAD, NULL);
+ if (dump_file)
+   fprintf (dump_file, "ipa-prop: Address IPA constant will reach "
+"a load so adding LOAD reference from %s to %s.\n",
+new_root->dump_name (), n->dump_name ());
+   }
  if (rdesc->refcount == 0)
{
  tree cst = ipa_get_jf_constant (jf);
@@ -4193,20 +4207,8 @@ propagate_controlled_uses (struct cgraph_edge *cs)
  symtab_node *n = symtab_node::get (TREE_OPERAND (cst, 0));
  if (n)
{
- struct cgraph_node *clone;
- bool removed = remove_described_reference (n, rdesc);
- /* The reference might have been removed by IPA-CP.  */
- if (removed
- && ipa_get_param_load_dereferenced (old_root_info, i))
-   {
- new_root->create_reference (n, IPA_REF_LOAD, NULL);
- if (dump_file)
-   fprintf (dump_file, "ipa-prop: ...replaced it with "
-"LOAD one from %s to %s.\n",
-new_root->dump_name (), n->dump_name ());
-   }
-
- clone = cs->caller;
+ remove_described_reference (n, rdesc);
+ cgraph_node *clone = cs->caller;
  while (clone->inlined_to
 && clone->ipcp_clone
 && clone != rdesc->cs->caller)
diff --git a/gcc/testsuite/gcc.dg/ipa/remref-6.c 
b/gcc/testsuite/gcc.dg/ipa/remref-6.c
index 7deae3114a4..f31f4c14319 100644
--- a/gcc/testsuite/gcc.dg/ipa/remref-6.c
+++ b/gcc/testsuite/gcc.dg/ipa/remref-6.c
@@ -20,5 +20,5 @@ void entry()
 }
 
 /* { dg-final { scan-ipa-dump "Removed a reference"  "inline" } }  */
-/* { dg-final { scan-ipa-dump "replaced it with LOAD"  "inline"  } } */
+/* { dg-final { scan-ipa-dump "adding LOAD reference"  "inline"  } } */
 /* { dg-final 

[OG11][committed] openmp: Add warning when functions containing metadirectives with 'construct={target}' called directly

2022-01-28 Thread Kwok Cheung Yeung

Hello

I have backported and committed the patch 'openmp: Add warning when 
functions containing metadirectives with 'construct={target}' called 
directly' to the devel/omp/gcc-11 development branch:


d6d82af7918 openmp: Add warning when functions containing metadirectives 
with 'construct={target}' called directly


Thanks

Kwok


[PATCH] openmp: Add warning when functions containing metadirectives with 'construct={target}' called directly

2022-01-28 Thread Kwok Cheung Yeung

Hello

Regarding this issue which we discussed previously - I have created a 
patch that adds a warning when this situation is detected.


When a metadirective in an explicitly marked target function is
gimplified, it is checked to see if it contains a 'construct={target}' 
selector - if it does, then the containing function is marked with 'omp 
metadirective construct target'.


In the omp-low pass, when function calls are processed, the target 
function is checked to see if it contains the marker. If it does and the 
call is not made in a target context, a warning is emitted.


This will obviously not catch every possible occurrence (e.g. if the
function containing the metadirective is called from another target 
function which is then called locally, or if the call is made via a 
function pointer), but it might still be useful? Okay for mainline (once 
the metadirective patches are done)?


Thanks

Kwok

On 26/07/2021 10:23 pm, Jakub Jelinek wrote:

On Mon, Jul 26, 2021 at 10:19:35PM +0100, Kwok Cheung Yeung wrote:

Yes, that is a target variant, but I'm pretty sure we've decided that
the target construct added for declare target is actually not a dynamic
property.  So basically mostly return to the 5.0 wording with clarifications
for Fortran.  See
https://github.com/OpenMP/spec/issues/2612#issuecomment-849742988
for details.
Making the target in construct dynamic would pretty much force all the
scoring to be dynamic as well.


In that comment, Deepak says:

So, we decided to keep the target trait static, requiring that the declare
target directive must be explicit and that the function version must be
different from the version of the function that may be called outside of a
target region (with the additional clarification that whether it differs or
not will be implementation defined).

"the function version must be different from the version of the function
that may be called outside of a target region": This is what we do not have
in GCC at the moment - the function versions called within and outside
target regions are the same on the host.

"whether it differs or not will be implementation defined": So whether a
function with 'declare target' and a metadirective involving a 'target'
construct behaves the same or not when called from both inside and outside
of a target region is implementation defined?

I will leave the treatment of target constructs in the selector as it is
then, with both calls going to the same function with the metadirective
resolving to the 'target' variant. I will try to address your other concerns
later.


I think you're right, it should differ in the host vs. target version iff
it is in explicit declare target block, my memory is weak, but let's implement
the 5.0 wording for now (and ignore the 5.1 wording later on) and only when
we'll be doing 5.2 change this (and change for both metadirective and
declare variant at that point).
Ok?

Jakub
From 741b037a8cd6b85d43a6273ab305ce07705dfa23 Mon Sep 17 00:00:00 2001
From: Kwok Cheung Yeung 
Date: Fri, 28 Jan 2022 13:56:33 +
Subject: [PATCH] openmp: Add warning when functions containing metadirectives
 with 'construct={target}' called directly

void f(void)
{
  #pragma omp metadirective \
    when (construct={target}: A) \
    default (B)
  ...
}
...
{
  #pragma omp target
    f(); // Target call

  f(); // Local call
}

With the OpenMP 5.0/5.1 specifications, we would expect A to be selected in
the metadirective when the target call is made, but B when f is called
directly outside of a target context.  However, since GCC does not have
separate copies of f for local and target calls, and the construct selector
is static, it must be resolved one way or the other at compile-time (currently
in favour of selecting A), which may be unexpected behaviour.

This patch attempts to detect the above situation, and will emit a warning
if found.

2022-01-28  Kwok Cheung Yeung  

gcc/
* gimplify.cc (gimplify_omp_metadirective): Mark offloadable functions
containing metadirectives with 'construct={target}' in the selector.
* omp-general.cc (omp_has_target_constructor_p): New.
* omp-general.h (omp_has_target_constructor_p): New prototype.
* omp-low.cc (lower_omp_1): Emit warning if marked functions called
outside of a target context.

gcc/testsuite/
* c-c++-common/gomp/metadirective-4.c (main): Add expected warning.
* gfortran.dg/gomp/metadirective-4.f90 (test): Likewise.

libgomp/
* testsuite/libgomp.c-c++-common/metadirective-2.c (main): Add
expected warning.
* testsuite/libgomp.fortran/metadirective-2.f90 (test): Likewise.
---
 gcc/gimplify.cc   | 21 +++
 gcc/omp-general.cc| 21 +++
 gcc/omp-general.h |  1 +
 gcc/omp-low.cc| 18 
 

[PATCH] store-merging: Fix up a -fcompare-debug bug in get_status_for_store_merging [PR104263]

2022-01-28 Thread Jakub Jelinek via Gcc-patches
Hi!

As mentioned in the PR, the following testcase fails because the last
stmt of a bb with -g is a debug stmt, and get_status_for_store_merging
uses gimple_seq_last_stmt (bb_seq (bb)) when testing whether it is valid
for store merging.  The debug stmt isn't valid, while the stmt at that
position with -g0 is valid, hence the divergence.

As we walk the whole bb already, this patch just remembers the last
non-debug stmt, so that we don't need to skip backwards over debug stmts at
the end of the bb to find the last real stmt.
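
As a rough illustration of the idea (not the actual GCC code), the forward walk can be modelled like this. The `stmt` struct and both helper functions are hypothetical simplifications; the real function works on gimple statements and does more than is shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified statement; not GCC's gimple type.  */
struct stmt
{
  bool is_debug;     /* Present only when compiling with -g.  */
  bool valid_store;  /* Models store_valid_for_store_merging_p.  */
};

/* Walk the block once, front to back, and return the last non-debug
   statement, or NULL if there is none.  */
static const struct stmt *
last_nondebug_stmt (const struct stmt *stmts, size_t n)
{
  const struct stmt *last = NULL;
  assert (stmts != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      if (stmts[i].is_debug)
        continue;
      last = &stmts[i];
    }
  return last;
}

/* The answer depends only on non-debug statements, so it is the same
   with and without debug statements in the block.  */
static bool
ends_in_valid_store (const struct stmt *stmts, size_t n)
{
  const struct stmt *last = last_nondebug_stmt (stmts, n);
  return last != NULL && last->valid_store;
}
```

A block that ends in a valid store gives the same answer whether or not a trailing debug statement follows it, which is the property -fcompare-debug checks.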

Bootstrapped/regtested on powerpc64le-linux, ok for trunk?

2022-01-28  Jakub Jelinek  

PR tree-optimization/104263
* gimple-ssa-store-merging.cc (get_status_for_store_merging): For
cfun->can_throw_non_call_exceptions && cfun->eh test whether
last non-debug stmt in the bb is store_valid_for_store_merging_p
rather than last stmt.

* gcc.dg/pr104263.c: New test.

--- gcc/gimple-ssa-store-merging.cc.jj  2022-01-20 11:30:45.521578942 +0100
+++ gcc/gimple-ssa-store-merging.cc 2022-01-28 11:27:25.437947561 +0100
@@ -5364,6 +5364,7 @@ get_status_for_store_merging (basic_bloc
   unsigned int num_constructors = 0;
   gimple_stmt_iterator gsi;
   edge e;
+  gimple *last_stmt = NULL;
 
   for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next (&gsi))
 {
@@ -5372,6 +5373,8 @@ get_status_for_store_merging (basic_bloc
   if (is_gimple_debug (stmt))
continue;
 
+  last_stmt = stmt;
+
   if (store_valid_for_store_merging_p (stmt) && ++num_statements >= 2)
break;
 
@@ -5398,7 +5401,7 @@ get_status_for_store_merging (basic_bloc
 return BB_INVALID;
 
   if (cfun->can_throw_non_call_exceptions && cfun->eh
-  && store_valid_for_store_merging_p (gimple_seq_last_stmt (bb_seq (bb)))
+  && store_valid_for_store_merging_p (last_stmt)
   && (e = find_fallthru_edge (bb->succs))
   && e->dest == bb->next_bb)
 return BB_EXTENDED_VALID;
--- gcc/testsuite/gcc.dg/pr104263.c.jj  2022-01-28 11:32:26.718619588 +0100
+++ gcc/testsuite/gcc.dg/pr104263.c 2022-01-28 11:32:04.111944459 +0100
@@ -0,0 +1,25 @@
+/* PR tree-optimization/104263 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fcompare-debug -fnon-call-exceptions -fno-inline-small-functions" } */
+
+int n;
+
+int
+bar (void)
+{
+  int a;
+
+  n = 0;
+  a = 0;
+
+  return n;
+}
+
+__attribute__ ((pure, returns_twice)) int
+foo (void)
+{
+  n = bar () + 1;
+  foo ();
+
+  return 0;
+}

Jakub



Re: [PATCH v2] preprocessor: -Wbidi-chars and UCNs [PR104030]

2022-01-28 Thread Martin Liška

On 1/28/22 16:26, Marek Polacek wrote:

On Fri, Jan 28, 2022 at 04:08:18PM +0100, Martin Liška wrote:

On 1/28/22 15:59, Marek Polacek wrote:

On Fri, Jan 28, 2022 at 02:53:16PM +0100, Martin Liška wrote:

On 1/24/22 23:36, Marek Polacek via Gcc-patches wrote:

|@@ -7820,6 +7820,10 @@ bidi contexts. @option{-Wbidi-chars=none} turns the 
warning off. @option{-Wbidi-chars=any} warns about any use of bidirectional 
control characters. +By default, this warning does not warn about UCNs. It is, 
however, possible +to turn on such checking by using 
@option{-Wbidi-chars=unpaired,ucn} or +@option{-Wbidi-chars=any,ucn}.|


Hello.

Can you please extend the documentation entry and explain what 'ucn' actually 
means?

'''
There are three levels of warning supported by GCC@.  The default is
@option{-Wbidi-chars=unpaired}, which warns about improperly terminated
bidi contexts.  @option{-Wbidi-chars=none} turns the warning off.
@option{-Wbidi-chars=any} warns about any use of bidirectional control
characters.
'''

Right now we have 4 levels and 'ucn' is not defined in the paragraph.


The following paragraph says

By default, this warning does not warn about UCNs.  It is, however, possible
to turn on such checking by using @option{-Wbidi-chars=unpaired,ucn} or
@option{-Wbidi-chars=any,ucn}.

Is that not enough?


Yeah, makes sense. Do I understand it correctly that one can't use 
-Wbidi-chars=ucn?


You could, it just means use the default (=unpaired) with UCN checking enabled.
Do you want me to make a note about that in the manual?


Yes, please do so.

Martin



Marek





Re: [PATCH v2] preprocessor: -Wbidi-chars and UCNs [PR104030]

2022-01-28 Thread Marek Polacek via Gcc-patches
On Fri, Jan 28, 2022 at 04:08:18PM +0100, Martin Liška wrote:
> On 1/28/22 15:59, Marek Polacek wrote:
> > On Fri, Jan 28, 2022 at 02:53:16PM +0100, Martin Liška wrote:
> > > On 1/24/22 23:36, Marek Polacek via Gcc-patches wrote:
> > > > |@@ -7820,6 +7820,10 @@ bidi contexts. @option{-Wbidi-chars=none} turns 
> > > > the warning off. @option{-Wbidi-chars=any} warns about any use of 
> > > > bidirectional control characters. +By default, this warning does not 
> > > > warn about UCNs. It is, however, possible +to turn on such checking by 
> > > > using @option{-Wbidi-chars=unpaired,ucn} or 
> > > > +@option{-Wbidi-chars=any,ucn}.|
> > > 
> > > Hello.
> > > 
> > > Can you please extend the documentation entry and explain what 'ucn' 
> > > actually means?
> > > 
> > > '''
> > > There are three levels of warning supported by GCC@.  The default is
> > > @option{-Wbidi-chars=unpaired}, which warns about improperly terminated
> > > bidi contexts.  @option{-Wbidi-chars=none} turns the warning off.
> > > @option{-Wbidi-chars=any} warns about any use of bidirectional control
> > > characters.
> > > '''
> > > 
> > > Right now we have 4 levels and 'ucn' is not defined in the paragraph.
> > 
> > The following paragraph says
> > 
> > By default, this warning does not warn about UCNs.  It is, however, possible
> > to turn on such checking by using @option{-Wbidi-chars=unpaired,ucn} or
> > @option{-Wbidi-chars=any,ucn}.
> > 
> > Is that not enough?
> 
> Yeah, makes sense. Do I understand it correctly that one can't use 
> -Wbidi-chars=ucn?

You could, it just means use the default (=unpaired) with UCN checking enabled.
Do you want me to make a note about that in the manual?

Marek
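
To make the answer above concrete, here is a small sketch of how such an option argument could be interpreted so that a bare `ucn` means the default level (unpaired) with UCN checking enabled. The names `bidi_level`, `bidi_setting` and `parse_bidi_arg` are hypothetical and this is not GCC's option-handling code; rejecting `none,ucn` is a choice made only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical representation of the warning level.  */
enum bidi_level { BIDI_NONE, BIDI_UNPAIRED, BIDI_ANY };

struct bidi_setting
{
  enum bidi_level level;
  bool ucn;  /* Also check UCNs.  */
};

/* Parse ARG, the text after "-Wbidi-chars=".  Return false if ARG is
   not recognized by this sketch.  */
static bool
parse_bidi_arg (const char *arg, struct bidi_setting *out)
{
  assert (arg != NULL && out != NULL);

  /* Start from the default: unpaired, no UCN checking.  */
  out->level = BIDI_UNPAIRED;
  out->ucn = false;

  /* A bare "ucn" keeps the default level and enables UCN checking.  */
  if (strcmp (arg, "ucn") == 0)
    {
      out->ucn = true;
      return true;
    }

  size_t len = strcspn (arg, ",");
  if (len == 4 && strncmp (arg, "none", 4) == 0)
    out->level = BIDI_NONE;
  else if (len == 8 && strncmp (arg, "unpaired", 8) == 0)
    out->level = BIDI_UNPAIRED;
  else if (len == 3 && strncmp (arg, "any", 3) == 0)
    out->level = BIDI_ANY;
  else
    return false;

  if (arg[len] == '\0')
    return true;

  /* Only "unpaired,ucn" and "any,ucn" are accepted with a suffix here.  */
  if (out->level != BIDI_NONE && strcmp (arg + len, ",ucn") == 0)
    {
      out->ucn = true;
      return true;
    }
  return false;
}
```

With this model, `ucn` and `unpaired,ucn` produce the same setting.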



[PATCH][pushed] Remove extra newline in ICE report.

2022-01-28 Thread Martin Liška

Revert partially what I did in g:76ef38e3178a11e76a66b4d4c0e10e85fe186a45.

Pushed based on IRC discussion with Jakub.

Martin

gcc/ChangeLog:

* diagnostic.cc (diagnostic_action_after_output): Remove extra
newline.
---
 gcc/diagnostic.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/diagnostic.cc b/gcc/diagnostic.cc
index 10472046bdc..87eb473d2f3 100644
--- a/gcc/diagnostic.cc
+++ b/gcc/diagnostic.cc
@@ -668,10 +668,10 @@ diagnostic_action_after_output (diagnostic_context *context,
  real_abort ();
 
 	if (context->report_bug)

- fnotice (stderr, "\nPlease submit a full bug report, "
+ fnotice (stderr, "Please submit a full bug report, "
   "with preprocessed source.\n");
else
- fnotice (stderr, "\nPlease submit a full bug report, "
+ fnotice (stderr, "Please submit a full bug report, "
   "with preprocessed source (by using -freport-bug).\n");
 
 	if (count > 0)

--
2.34.1



Re: [PATCH v2] preprocessor: -Wbidi-chars and UCNs [PR104030]

2022-01-28 Thread Martin Liška

On 1/28/22 15:59, Marek Polacek wrote:

On Fri, Jan 28, 2022 at 02:53:16PM +0100, Martin Liška wrote:

On 1/24/22 23:36, Marek Polacek via Gcc-patches wrote:

|@@ -7820,6 +7820,10 @@ bidi contexts. @option{-Wbidi-chars=none} turns the 
warning off. @option{-Wbidi-chars=any} warns about any use of bidirectional 
control characters. +By default, this warning does not warn about UCNs. It is, 
however, possible +to turn on such checking by using 
@option{-Wbidi-chars=unpaired,ucn} or +@option{-Wbidi-chars=any,ucn}.|


Hello.

Can you please extend the documentation entry and explain what 'ucn' actually 
means?

'''
There are three levels of warning supported by GCC@.  The default is
@option{-Wbidi-chars=unpaired}, which warns about improperly terminated
bidi contexts.  @option{-Wbidi-chars=none} turns the warning off.
@option{-Wbidi-chars=any} warns about any use of bidirectional control
characters.
'''

Right now we have 4 levels and 'ucn' is not defined in the paragraph.


The following paragraph says

By default, this warning does not warn about UCNs.  It is, however, possible
to turn on such checking by using @option{-Wbidi-chars=unpaired,ucn} or
@option{-Wbidi-chars=any,ucn}.

Is that not enough?


Yeah, makes sense. Do I understand it correctly that one can't use 
-Wbidi-chars=ucn?

Thanks,
Martin



Marek





Re: [PATCH] internal_error - do not use leading capital letter

2022-01-28 Thread Jakub Jelinek via Gcc-patches
On Thu, Jan 27, 2022 at 01:40:15PM +0100, Martin Liška wrote:
> That's follow up patch based on the discussion with Jakub.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
>   * config/rs6000/host-darwin.cc (segv_crash_handler):
>   Do not use leading capital letter.
>   (segv_handler): Likewise.
>   * ipa-sra.cc (verify_splitting_accesses): Likewise.
>   * varasm.cc (get_section): Likewise.
> 
> gcc/d/ChangeLog:
> 
>   * decl.cc (d_finish_decl): Do not use leading capital letter.
> ---
>  gcc/config/rs6000/host-darwin.cc | 4 ++--
>  gcc/d/decl.cc| 2 +-
>  gcc/ipa-sra.cc   | 4 ++--
>  gcc/varasm.cc| 2 +-
>  4 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/gcc/config/rs6000/host-darwin.cc b/gcc/config/rs6000/host-darwin.cc
> index 541f7e1c81c..efb1965004e 100644
> --- a/gcc/config/rs6000/host-darwin.cc
> +++ b/gcc/config/rs6000/host-darwin.cc
> @@ -58,7 +58,7 @@ extern int sigaltstack(const struct sigaltstack *, struct sigaltstack *);
>  static void
>  segv_crash_handler (int sig ATTRIBUTE_UNUSED)
>  {
> -  internal_error ("Segmentation Fault (code)");
> +  internal_error ("segmentation fault (code)");
>  }
>  static void
> @@ -128,7 +128,7 @@ segv_handler (int sig ATTRIBUTE_UNUSED,
>fprintf (stderr, "[address=%08lx pc=%08x]\n",
>  uc->uc_mcontext->MC_FLD(es).MC_FLD(dar),
>  uc->uc_mcontext->MC_FLD(ss).MC_FLD(srr0));
> -  internal_error ("Segmentation Fault");
> +  internal_error ("egmentation fault");

The leading 's' went missing here.

Otherwise LGTM.

Jakub



Re: [PATCH v2] preprocessor: -Wbidi-chars and UCNs [PR104030]

2022-01-28 Thread Marek Polacek via Gcc-patches
On Fri, Jan 28, 2022 at 02:53:16PM +0100, Martin Liška wrote:
> On 1/24/22 23:36, Marek Polacek via Gcc-patches wrote:
> > |@@ -7820,6 +7820,10 @@ bidi contexts. @option{-Wbidi-chars=none} turns the 
> > warning off. @option{-Wbidi-chars=any} warns about any use of bidirectional 
> > control characters. +By default, this warning does not warn about UCNs. It 
> > is, however, possible +to turn on such checking by using 
> > @option{-Wbidi-chars=unpaired,ucn} or +@option{-Wbidi-chars=any,ucn}.|
> 
> Hello.
> 
> Can you please extend the documentation entry and explain what 'ucn' actually 
> means?
> 
> '''
> There are three levels of warning supported by GCC@.  The default is
> @option{-Wbidi-chars=unpaired}, which warns about improperly terminated
> bidi contexts.  @option{-Wbidi-chars=none} turns the warning off.
> @option{-Wbidi-chars=any} warns about any use of bidirectional control
> characters.
> '''
> 
> Right now we have 4 levels and 'ucn' is not defined in the paragraph.

The following paragraph says

By default, this warning does not warn about UCNs.  It is, however, possible
to turn on such checking by using @option{-Wbidi-chars=unpaired,ucn} or
@option{-Wbidi-chars=any,ucn}.

Is that not enough?

Marek



Re: [PATCH v2] preprocessor: -Wbidi-chars and UCNs [PR104030]

2022-01-28 Thread Martin Liška

On 1/24/22 23:36, Marek Polacek via Gcc-patches wrote:

|@@ -7820,6 +7820,10 @@ bidi contexts. @option{-Wbidi-chars=none} turns the 
warning off. @option{-Wbidi-chars=any} warns about any use of bidirectional 
control characters. +By default, this warning does not warn about UCNs. It is, 
however, possible +to turn on such checking by using 
@option{-Wbidi-chars=unpaired,ucn} or +@option{-Wbidi-chars=any,ucn}.|


Hello.

Can you please extend the documentation entry and explain what 'ucn' actually 
means?

'''
There are three levels of warning supported by GCC@.  The default is
@option{-Wbidi-chars=unpaired}, which warns about improperly terminated
bidi contexts.  @option{-Wbidi-chars=none} turns the warning off.
@option{-Wbidi-chars=any} warns about any use of bidirectional control
characters.
'''

Right now we have 4 levels and 'ucn' is not defined in the paragraph.

Thanks,
Martin



[PATCH v6 12/12] LoongArch Port: Add doc.

2022-01-28 Thread chenglulu
* contrib/config-list.mk: Add LoongArch triplet.
* gcc/doc/install.texi: Add LoongArch options section.
* gcc/doc/invoke.texi: Add LoongArch options section.
* gcc/doc/md.texi: Add LoongArch options section.
---
 contrib/config-list.mk |   5 +-
 gcc/doc/install.texi   |  47 +-
 gcc/doc/invoke.texi| 201 +
 gcc/doc/md.texi|  55 +++
 4 files changed, 302 insertions(+), 6 deletions(-)

diff --git a/contrib/config-list.mk b/contrib/config-list.mk
index 3e1d1321861..ba6f12e4693 100644
--- a/contrib/config-list.mk
+++ b/contrib/config-list.mk
@@ -57,7 +57,10 @@ LIST = aarch64-elf aarch64-linux-gnu aarch64-rtems \
   i686-wrs-vxworksae \
   i686-cygwinOPT-enable-threads=yes i686-mingw32crt ia64-elf \
   ia64-freebsd6 ia64-linux ia64-hpux ia64-hp-vms iq2000-elf lm32-elf \
-  lm32-rtems lm32-uclinux m32c-rtems m32c-elf m32r-elf m32rle-elf \
+  lm32-rtems lm32-uclinux \
+  loongarch64-linux-gnu loongarch64-linux-gnuf64 \
+  loongarch64-linux-gnuf32 loongarch64-linux-gnusf \
+  m32c-rtems m32c-elf m32r-elf m32rle-elf \
   m68k-elf m68k-netbsdelf \
   m68k-uclinux m68k-linux m68k-rtems \
   mcore-elf microblaze-linux microblaze-elf \
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 96b4dfc871a..337f752192f 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -752,9 +752,9 @@ Here are the possible CPU types:
 @quotation
 aarch64, aarch64_be, alpha, alpha64, amdgcn, arc, arceb, arm, armeb, avr, bfin,
 bpf, cr16, cris, csky, epiphany, fido, fr30, frv, ft32, h8300, hppa, hppa2.0,
-hppa64, i486, i686, ia64, iq2000, lm32, m32c, m32r, m32rle, m68k, mcore,
-microblaze, microblazeel, mips, mips64, mips64el, mips64octeon, mips64orion,
-mips64vr, mipsel, mipsisa32, mipsisa32r2, mipsisa64, mipsisa64r2,
+hppa64, i486, i686, ia64, iq2000, lm32, loongarch64, m32c, m32r, m32rle, m68k,
+mcore, microblaze, microblazeel, mips, mips64, mips64el, mips64octeon,
+mips64orion, mips64vr, mipsel, mipsisa32, mipsisa32r2, mipsisa64, mipsisa64r2,
 mipsisa64r2el, mipsisa64sb1, mipsisa64sr71k, mipstx39, mmix, mn10300, moxie,
 msp430, nds32be, nds32le, nios2, nvptx, or1k, pdp11, powerpc, powerpc64,
 powerpc64le, powerpcle, pru, riscv32, riscv32be, riscv64, riscv64be, rl78, rx,
@@ -1171,8 +1171,9 @@ sysv, aix.
 @itemx --without-multilib-list
 Specify what multilibs to build.  @var{list} is a comma separated list of
 values, possibly consisting of a single value.  Currently only implemented
-for aarch64*-*-*, arm*-*-*, riscv*-*-*, sh*-*-* and x86-64-*-linux*.  The
-accepted values and meaning for each target is given below.
+for aarch64*-*-*, arm*-*-*, loongarch64-*-*, riscv*-*-*, sh*-*-* and
+x86-64-*-linux*.  The accepted values and meaning for each target is given
+below.
 
 @table @code
 @item aarch64*-*-*
@@ -1259,6 +1260,14 @@ profile.  The union of these options is considered when specifying both
 @code{-mfloat-abi=hard}
 @end multitable
 
+@item loongarch*-*-*
+@var{list} is a comma-separated list of the following ABI identifiers:
+@code{lp64d[/base]} @code{lp64f[/base]} @code{lp64d[/base]}, where the
+@code{/base} suffix may be omitted, to enable their respective run-time
+libraries.  If @var{list} is empty, @code{default}
+or @option{--with-multilib-list} is not specified, then the default ABI
+as specified by @option{--with-abi} or implied by @option{--target} is selected.
+
 @item riscv*-*-*
 @var{list} is a single ABI name.  The target architecture must be either
 @code{rv32gc} or @code{rv64gc}.  This will build a single multilib for the
@@ -4430,6 +4439,34 @@ This configuration is intended for embedded systems.
 Lattice Mico32 processor.
 This configuration is intended for embedded systems running uClinux.
 
+@html
+
+@end html
+@anchor{loongarch}
+@heading LoongArch
+LoongArch processor.
+The following LoongArch targets are available:
+@table @code
+@item loongarch64-linux-gnu*
+LoongArch processor running GNU/Linux.  This target triplet may be coupled
+with a small set of possible suffixes to identify their default ABI type:
+@table @code
+@item f64
+Uses @code{lp64d/base} ABI by default.
+@item f32
+Uses @code{lp64f/base} ABI by default.
+@item sf
+Uses @code{lp64s/base} ABI by default.
+@end table
+
+@item loongarch64-linux-gnu
+Same as @code{loongarch64-linux-gnuf64}, but may be used with
+@option{--with-abi=*} to configure the default ABI type.
+@end table
+
+More information about LoongArch can be found at
+@uref{https://github.com/loongson/LoongArch-Documentation}.
+
 @html
 
 @end html
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9e588db4fce..40f113b14eb 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -995,6 +995,16 @@ Objective-C and Objective-C++ Dialects}.
 @gccoptlist{-mbarrel-shift-enabled  -mdivide-enabled  -mmultiply-enabled @gol
 -msign-extend-enabled  -muser-enabled}
 
+@emph{LoongArch Options}
+@gccoptlist{-march=@var{cpu-type}  -mtune=@var{cpu-type} 

[PATCH v6 11/12] LoongArch Port: gcc/testsuite

2022-01-28 Thread chenglulu
gcc/testsuite/

* g++.dg/cpp0x/constexpr-rom.C: Add build options for LoongArch.
* g++.old-deja/g++.abi/ptrmem.C: Add LoongArch support.
* g++.old-deja/g++.pt/ptrmem6.C: xfail for LoongArch.
* gcc.dg/20020312-2.c: Add LoongArch support.
* gcc.dg/loop-8.c: Skip on LoongArch.
* gcc.dg/torture/stackalign/builtin-apply-2.c: Likewise.
* gcc.dg/tree-ssa/ssa-fre-3.c: Likewise.
* go.test/go-test.exp: Define the LoongArch target.
* lib/target-supports.exp: Likewise.
* gcc.target/loongarch/loongarch.exp: New file.
* gcc.target/loongarch/tst-asm-const.c: Likewise.
---
 gcc/testsuite/g++.dg/cpp0x/constexpr-rom.C|  2 +-
 gcc/testsuite/g++.old-deja/g++.abi/ptrmem.C   |  2 +-
 gcc/testsuite/g++.old-deja/g++.pt/ptrmem6.C   |  2 +-
 gcc/testsuite/gcc.dg/20020312-2.c |  2 +
 gcc/testsuite/gcc.dg/loop-8.c |  2 +-
 .../torture/stackalign/builtin-apply-2.c  |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c |  2 +-
 .../gcc.target/loongarch/loongarch.exp| 40 +++
 .../gcc.target/loongarch/tst-asm-const.c  | 16 
 gcc/testsuite/go.test/go-test.exp |  3 ++
 gcc/testsuite/lib/target-supports.exp | 14 +++
 11 files changed, 81 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/loongarch.exp
 create mode 100644 gcc/testsuite/gcc.target/loongarch/tst-asm-const.c

diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-rom.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-rom.C
index 2e0ef685f36..424979a604b 100644
--- a/gcc/testsuite/g++.dg/cpp0x/constexpr-rom.C
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-rom.C
@@ -1,6 +1,6 @@
 // PR c++/49673: check that test_data goes into .rodata
 // { dg-do compile { target c++11 } }
-// { dg-additional-options -G0 { target { { alpha*-*-* frv*-*-* ia64-*-* lm32*-*-* m32r*-*-* microblaze*-*-* mips*-*-* nios2-*-* powerpc*-*-* rs6000*-*-* } && { ! { *-*-darwin* *-*-aix* alpha*-*-*vms* } } } } }
+// { dg-additional-options -G0 { target { { alpha*-*-* frv*-*-* ia64-*-* lm32*-*-* m32r*-*-* microblaze*-*-* mips*-*-* loongarch*-*-* nios2-*-* powerpc*-*-* rs6000*-*-* } && { ! { *-*-darwin* *-*-aix* alpha*-*-*vms* } } } } }
 // { dg-final { scan-assembler "\\.rdata" { target mips*-*-* } } }
 // { dg-final { scan-assembler "rodata" { target { { *-*-linux-gnu *-*-gnu* *-*-elf } && { ! { mips*-*-* riscv*-*-* } } } } } }
 
diff --git a/gcc/testsuite/g++.old-deja/g++.abi/ptrmem.C b/gcc/testsuite/g++.old-deja/g++.abi/ptrmem.C
index bda7960d8a2..f69000e9081 100644
--- a/gcc/testsuite/g++.old-deja/g++.abi/ptrmem.C
+++ b/gcc/testsuite/g++.old-deja/g++.abi/ptrmem.C
@@ -7,7 +7,7 @@
function.  However, some platforms use all bits to encode a
function pointer.  Such platforms use the lowest bit of the delta,
that is shifted left by one bit.  */
-#if defined __MN10300__ || defined __SH5__ || defined __arm__ || defined __thumb__ || defined __mips__ || defined __aarch64__ || defined __PRU__
+#if defined __MN10300__ || defined __SH5__ || defined __arm__ || defined __thumb__ || defined __mips__ || defined __aarch64__ || defined __PRU__ || defined __loongarch__
 #define ADJUST_PTRFN(func, virt) ((void (*)())(func))
 #define ADJUST_DELTA(delta, virt) (((delta) << 1) + !!(virt))
 #else
diff --git a/gcc/testsuite/g++.old-deja/g++.pt/ptrmem6.C b/gcc/testsuite/g++.old-deja/g++.pt/ptrmem6.C
index 9f4bbe43f89..8f8f7017ab7 100644
--- a/gcc/testsuite/g++.old-deja/g++.pt/ptrmem6.C
+++ b/gcc/testsuite/g++.old-deja/g++.pt/ptrmem6.C
@@ -25,7 +25,7 @@ int main() {
   h<::j>(); // { dg-error "" } 
   g<(void (A::*)()) ::f>(); // { dg-error "" "" { xfail c++11 } }
   h<(int A::*) ::i>(); // { dg-error "" "" { xfail c++11 } }
-  g<(void (A::*)()) ::f>(); // { dg-error "" "" { xfail { c++11 && { aarch64*-*-* arm*-*-* mips*-*-* } } } }
+  g<(void (A::*)()) ::f>(); // { dg-error "" "" { xfail { c++11 && { aarch64*-*-* arm*-*-* mips*-*-* loongarch*-*-* } } } }
   h<(int A::*) ::j>(); // { dg-error "" } 
   g<(void (A::*)()) 0>(); // { dg-error "" "" { target { ! c++11 } } }
   h<(int A::*) 0>(); // { dg-error "" "" { target { ! c++11 } } }
diff --git a/gcc/testsuite/gcc.dg/20020312-2.c b/gcc/testsuite/gcc.dg/20020312-2.c
index 52c33d09b90..92bc150df0f 100644
--- a/gcc/testsuite/gcc.dg/20020312-2.c
+++ b/gcc/testsuite/gcc.dg/20020312-2.c
@@ -37,6 +37,8 @@ extern void abort (void);
 /* PIC register is r1, but is used even without -fpic.  */
 #elif defined(__lm32__)
 /* No pic register.  */
+#elif defined(__loongarch__)
+/* No pic register.  */
 #elif defined(__M32R__)
 /* No pic register.  */
 #elif defined(__m68k__)
diff --git a/gcc/testsuite/gcc.dg/loop-8.c b/gcc/testsuite/gcc.dg/loop-8.c
index a685fc25056..8e5f2087831 100644
--- a/gcc/testsuite/gcc.dg/loop-8.c
+++ b/gcc/testsuite/gcc.dg/loop-8.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O1 -fdump-rtl-loop2_invariant" } */
-/* { dg-skip-if 

[PATCH v6 06/12] LoongArch Port: Builtin functions.

2022-01-28 Thread chenglulu
gcc/

* config/loongarch/larchintrin.h: New file.
* config/loongarch/loongarch-builtins.cc: New file.
---
 gcc/config/loongarch/larchintrin.h | 413 +
 gcc/config/loongarch/loongarch-builtins.cc | 511 +
 2 files changed, 924 insertions(+)
 create mode 100644 gcc/config/loongarch/larchintrin.h
 create mode 100644 gcc/config/loongarch/loongarch-builtins.cc

diff --git a/gcc/config/loongarch/larchintrin.h b/gcc/config/loongarch/larchintrin.h
new file mode 100644
index 000..d8e2a743ae5
--- /dev/null
+++ b/gcc/config/loongarch/larchintrin.h
@@ -0,0 +1,413 @@
+/* Intrinsics for LoongArch BASE operations.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published
+by the Free Software Foundation; either version 3, or (at your
+option) any later version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef _GCC_LOONGARCH_BASE_INTRIN_H
+#define _GCC_LOONGARCH_BASE_INTRIN_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+typedef struct drdtime
+{
+  unsigned long dvalue;
+  unsigned long dtimeid;
+} __drdtime_t;
+
+typedef struct rdtime
+{
+  unsigned int value;
+  unsigned int timeid;
+} __rdtime_t;
+
+#ifdef __loongarch64
+extern __inline __drdtime_t
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__builtin_loongarch_rdtime_d (void)
+{
+  __drdtime_t drdtime;
+  __asm__ volatile (
+"rdtime.d\t%[val],%[tid]\n\t"
+: [val]"=&r"(drdtime.dvalue),[tid]"=&r"(drdtime.dtimeid)
+:);
+  return drdtime;
+}
+#define __rdtime_d __builtin_loongarch_rdtime_d
+#endif
+
+extern __inline __rdtime_t
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__builtin_loongarch_rdtimeh_w (void)
+{
+  __rdtime_t rdtime;
+  __asm__ volatile (
+"rdtimeh.w\t%[val],%[tid]\n\t"
+: [val]"=&r"(rdtime.value),[tid]"=&r"(rdtime.timeid)
+:);
+  return rdtime;
+}
+#define __rdtimel_w __builtin_loongarch_rdtimel_w
+
+extern __inline __rdtime_t
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__builtin_loongarch_rdtimel_w (void)
+{
+  __rdtime_t rdtime;
+  __asm__ volatile (
+"rdtimel.w\t%[val],%[tid]\n\t"
+: [val]"=&r"(rdtime.value),[tid]"=&r"(rdtime.timeid)
+:);
+  return rdtime;
+}
+#define __rdtimeh_w __builtin_loongarch_rdtimeh_w
+
+/* Assembly instruction format:rj, fcsr.  */
+/* Data types in instruction templates:  USI, UQI.  */
+#define __movfcsr2gr(/*ui5*/ _1) __builtin_loongarch_movfcsr2gr ((_1));
+
+/* Assembly instruction format:0, fcsr, rj.  */
+/* Data types in instruction templates:  VOID, UQI, USI.  */
+#define __movgr2fcsr(/*ui5*/ _1, _2) \
+  __builtin_loongarch_movgr2fcsr ((unsigned short) _1, (unsigned int) _2);
+
+#if defined __loongarch64
+/* Assembly instruction format:ui5, rj, si12.  */
+/* Data types in instruction templates:  VOID, USI, UDI, SI.  */
+#define __dcacop(/*ui5*/ _1, /*unsigned long int*/ _2, /*si12*/ _3) \
+  ((void) __builtin_loongarch_dcacop ((_1), (unsigned long int) (_2), (_3)))
+#else
+#error "Don't support this ABI."
+#endif
+
+/* Assembly instruction format:rd, rj.  */
+/* Data types in instruction templates:  USI, USI.  */
+extern __inline unsigned int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__cpucfg (unsigned int _1)
+{
+  return (unsigned int) __builtin_loongarch_cpucfg ((unsigned int) _1);
+}
+
+#ifdef __loongarch64
+/* Assembly instruction format:rd, rj.  */
+/* Data types in instruction templates:  DI, DI.  */
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__asrtle_d (long int _1, long int _2)
+{
+  __builtin_loongarch_asrtle_d ((long int) _1, (long int) _2);
+}
+
+/* Assembly instruction format:rd, rj.  */
+/* Data types in instruction templates:  DI, DI.  */
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__asrtgt_d (long int _1, long int _2)
+{
+  __builtin_loongarch_asrtgt_d ((long int) _1, (long int) _2);
+}
+#endif
+
+#if defined __loongarch64
+/* Assembly instruction format:rd, rj, ui5.  */
+/* Data types in instruction templates:  DI, DI, UQI.  */
+#define __dlddir(/*long 

[PATCH v6 02/12] LoongArch Port: gcc build

2022-01-28 Thread chenglulu
gcc/

* common/config/loongarch/loongarch-common.cc: New file.
* config/loongarch/genopts/genstr.sh: New file.
* config/loongarch/genopts/loongarch-strings: New file.
* config/loongarch/genopts/loongarch.opt.in: New file.
* config/loongarch/loongarch-str.h: New file.
* config/loongarch/gnu-user.h: New file.
* config/loongarch/linux.h: New file.
* config/loongarch/loongarch-cpu.cc: New file.
* config/loongarch/loongarch-cpu.h: New file.
* config/loongarch/loongarch-def.c: New file.
* config/loongarch/loongarch-def.h: New file.
* config/loongarch/loongarch-driver.cc: New file.
* config/loongarch/loongarch-driver.h: New file.
* config/loongarch/loongarch-opts.cc: New file.
* config/loongarch/loongarch-opts.h: New file.
* config/loongarch/loongarch.opt: New file.
* config/loongarch/t-linux: New file.
* config/loongarch/t-loongarch: New file.
* gcc_update (files_and_dependencies): Add
config/loongarch/loongarch.opt and config/loongarch/loongarch-str.h.
* config.gcc: Add LoongArch support.
* configure.ac: Add LoongArch support.
---
 contrib/gcc_update|   2 +
 .../config/loongarch/loongarch-common.cc  |  73 +++
 gcc/config.gcc| 410 -
 gcc/config/loongarch/genopts/genstr.sh|  91 +++
 .../loongarch/genopts/loongarch-strings   |  58 ++
 gcc/config/loongarch/genopts/loongarch.opt.in | 189 ++
 gcc/config/loongarch/gnu-user.h   |  84 +++
 gcc/config/loongarch/linux.h  |  50 ++
 gcc/config/loongarch/loongarch-cpu.cc | 206 +++
 gcc/config/loongarch/loongarch-cpu.h  |  30 +
 gcc/config/loongarch/loongarch-def.c  | 164 +
 gcc/config/loongarch/loongarch-def.h  | 151 +
 gcc/config/loongarch/loongarch-driver.cc  | 187 ++
 gcc/config/loongarch/loongarch-driver.h   |  69 +++
 gcc/config/loongarch/loongarch-opts.cc| 580 ++
 gcc/config/loongarch/loongarch-opts.h |  86 +++
 gcc/config/loongarch/loongarch-str.h  |  57 ++
 gcc/config/loongarch/loongarch.opt| 189 ++
 gcc/config/loongarch/t-linux  |  53 ++
 gcc/config/loongarch/t-loongarch  |  68 ++
 gcc/configure.ac  |  33 +-
 21 files changed, 2825 insertions(+), 5 deletions(-)
 create mode 100644 gcc/common/config/loongarch/loongarch-common.cc
 create mode 100755 gcc/config/loongarch/genopts/genstr.sh
 create mode 100644 gcc/config/loongarch/genopts/loongarch-strings
 create mode 100644 gcc/config/loongarch/genopts/loongarch.opt.in
 create mode 100644 gcc/config/loongarch/gnu-user.h
 create mode 100644 gcc/config/loongarch/linux.h
 create mode 100644 gcc/config/loongarch/loongarch-cpu.cc
 create mode 100644 gcc/config/loongarch/loongarch-cpu.h
 create mode 100644 gcc/config/loongarch/loongarch-def.c
 create mode 100644 gcc/config/loongarch/loongarch-def.h
 create mode 100644 gcc/config/loongarch/loongarch-driver.cc
 create mode 100644 gcc/config/loongarch/loongarch-driver.h
 create mode 100644 gcc/config/loongarch/loongarch-opts.cc
 create mode 100644 gcc/config/loongarch/loongarch-opts.h
 create mode 100644 gcc/config/loongarch/loongarch-str.h
 create mode 100644 gcc/config/loongarch/loongarch.opt
 create mode 100644 gcc/config/loongarch/t-linux
 create mode 100644 gcc/config/loongarch/t-loongarch

diff --git a/contrib/gcc_update b/contrib/gcc_update
index 1cf15f9b3c2..641ce164775 100755
--- a/contrib/gcc_update
+++ b/contrib/gcc_update
@@ -86,6 +86,8 @@ gcc/config/arm/arm-tables.opt: gcc/config/arm/arm-cpus.in 
gcc/config/arm/parsecp
 gcc/config/c6x/c6x-tables.opt: gcc/config/c6x/c6x-isas.def 
gcc/config/c6x/genopt.sh
 gcc/config/c6x/c6x-sched.md: gcc/config/c6x/c6x-sched.md.in 
gcc/config/c6x/gensched.sh
 gcc/config/c6x/c6x-mult.md: gcc/config/c6x/c6x-mult.md.in 
gcc/config/c6x/genmult.sh
+gcc/config/loongarch/loongarch-str.h: gcc/config/loongarch/genopts/genstr.sh 
gcc/config/loongarch/genopts/loongarch-string
+gcc/config/loongarch/loongarch.opt: gcc/config/loongarch/genopts/genstr.sh 
gcc/config/loongarch/genopts/loongarch.opt.in
 gcc/config/m68k/m68k-tables.opt: gcc/config/m68k/m68k-devices.def 
gcc/config/m68k/m68k-isas.def gcc/config/m68k/m68k-microarchs.def 
gcc/config/m68k/genopt.sh
 gcc/config/mips/mips-tables.opt: gcc/config/mips/mips-cpus.def 
gcc/config/mips/genopt.sh
 gcc/config/rs6000/rs6000-tables.opt: gcc/config/rs6000/rs6000-cpus.def 
gcc/config/rs6000/genopt.sh
diff --git a/gcc/common/config/loongarch/loongarch-common.cc 
b/gcc/common/config/loongarch/loongarch-common.cc
new file mode 100644
index 000..5bdfd2a30e1
--- /dev/null
+++ b/gcc/common/config/loongarch/loongarch-common.cc
@@ -0,0 +1,73 @@
+/* Common hooks for LoongArch.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+
+This 

[PATCH v6 08/12] LoongArch Port: libgcc

2022-01-28 Thread chenglulu
libgcc/

* config/loongarch/crtfastmath.c: New file.
* config/loongarch/crti.S: Likewise.
* config/loongarch/crtn.S: Likewise.
* config/loongarch/linux-unwind.h: Likewise.
* config/loongarch/sfp-machine.h: Likewise.
* config/loongarch/t-crtstuff: Likewise.
* config/loongarch/t-loongarch: Likewise.
* config/loongarch/t-loongarch64: Likewise.
* config/loongarch/t-softfp-tf: Likewise.
* config.host: Add LoongArch tuples.
* configure.ac: Add LoongArch support.
---
 libgcc/config.host |  28 -
 libgcc/config/loongarch/crtfastmath.c  |  52 +
 libgcc/config/loongarch/crti.S |  43 +++
 libgcc/config/loongarch/crtn.S |  39 +++
 libgcc/config/loongarch/linux-unwind.h |  80 +
 libgcc/config/loongarch/sfp-machine.h  | 152 +
 libgcc/config/loongarch/t-crtstuff |   5 +
 libgcc/config/loongarch/t-loongarch|   7 ++
 libgcc/config/loongarch/t-loongarch64  |   1 +
 libgcc/config/loongarch/t-softfp-tf|   3 +
 libgcc/configure.ac|   2 +-
 11 files changed, 410 insertions(+), 2 deletions(-)
 create mode 100644 libgcc/config/loongarch/crtfastmath.c
 create mode 100644 libgcc/config/loongarch/crti.S
 create mode 100644 libgcc/config/loongarch/crtn.S
 create mode 100644 libgcc/config/loongarch/linux-unwind.h
 create mode 100644 libgcc/config/loongarch/sfp-machine.h
 create mode 100644 libgcc/config/loongarch/t-crtstuff
 create mode 100644 libgcc/config/loongarch/t-loongarch
 create mode 100644 libgcc/config/loongarch/t-loongarch64
 create mode 100644 libgcc/config/loongarch/t-softfp-tf

diff --git a/libgcc/config.host b/libgcc/config.host
index 094fd3ad254..8c56fcae5d2 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -138,6 +138,22 @@ hppa*-*-*)
 lm32*-*-*)
cpu_type=lm32
;;
+loongarch*-*-*)
+   cpu_type=loongarch
+   tmake_file="loongarch/t-loongarch"
+   if test "${libgcc_cv_loongarch_hard_float}" = yes; then
+   tmake_file="${tmake_file} t-hardfp-sfdf t-hardfp"
+   else
+   tmake_file="${tmake_file} t-softfp-sfdf"
+   fi
+   if test "${ac_cv_sizeof_long_double}" = 16; then
+   tmake_file="${tmake_file} loongarch/t-softfp-tf"
+   fi
+   if test "${host_address}" = 64; then
+   tmake_file="${tmake_file} loongarch/t-loongarch64"
+   fi
+   tmake_file="${tmake_file} t-softfp"
+   ;;
 m32r*-*-*)
 cpu_type=m32r
 ;;
@@ -925,7 +941,17 @@ lm32-*-rtems*)
 lm32-*-uclinux*)
 extra_parts="$extra_parts crtbegin.o crtendS.o crtbeginT.o"
 tmake_file="lm32/t-lm32 lm32/t-uclinux t-libgcc-pic t-softfp-sfdf 
t-softfp"
-   ;;  
+   ;;
+loongarch*-*-linux*)
+   extra_parts="$extra_parts crtfastmath.o"
+   tmake_file="${tmake_file} t-crtfm loongarch/t-crtstuff"
+   case ${host} in
+ *)
+   tmake_file="${tmake_file} t-slibgcc-libgcc"
+   ;;
+   esac
+   md_unwind_header=loongarch/linux-unwind.h
+   ;;
 m32r-*-elf*)
tmake_file="$tmake_file m32r/t-m32r t-fdpbit"
extra_parts="$extra_parts crtinit.o crtfini.o"
diff --git a/libgcc/config/loongarch/crtfastmath.c 
b/libgcc/config/loongarch/crtfastmath.c
new file mode 100644
index 000..52b0d6da087
--- /dev/null
+++ b/libgcc/config/loongarch/crtfastmath.c
@@ -0,0 +1,52 @@
+/* Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+   Based on MIPS target for GNU compiler.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License
+and a copy of the GCC Runtime Library Exception along with this
+program; see the files COPYING3 and COPYING.RUNTIME respectively.
+If not, see <http://www.gnu.org/licenses/>.  */
+
+#ifdef __loongarch_hard_float
+
+/* Rounding control.  */
+#define _FPU_RC_NEAREST 0x000 /* RECOMMENDED.  */
+#define _FPU_RC_ZERO0x100
+#define _FPU_RC_UP  0x200
+#define _FPU_RC_DOWN0x300
+
+/* Enable interrupts for IEEE exceptions.  */
+#define _FPU_IEEE 0x001F
+
+/* Macros for accessing the hardware control word.  */
+#define _FPU_GETCW(cw) __asm__ volatile ("movfcsr2gr %0,$r0" : "=r" (cw))
+#define _FPU_SETCW(cw) __asm__ volatile 

[PATCH v6 10/12] LoongArch Port: libgomp

2022-01-28 Thread chenglulu
libgomp/

* configure.tgt: Add LoongArch triplet.
---
 libgomp/configure.tgt | 4 
 1 file changed, 4 insertions(+)

diff --git a/libgomp/configure.tgt b/libgomp/configure.tgt
index d4f1e741b5a..2cd7272fcd8 100644
--- a/libgomp/configure.tgt
+++ b/libgomp/configure.tgt
@@ -56,6 +56,10 @@ if test x$enable_linux_futex = xyes; then
config_path="linux/ia64 linux posix"
;;
 
+loongarch*-*-linux*)
+   config_path="linux posix"
+   ;;
+
 mips*-*-linux*)
config_path="linux/mips linux posix"
;;
-- 
2.27.0



[PATCH v6 09/12] LoongArch Port: Regenerate libgcc/configure.

2022-01-28 Thread chenglulu
---
 libgcc/configure | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/libgcc/configure b/libgcc/configure
index 4919a56f518..ce04c4f529f 100755
--- a/libgcc/configure
+++ b/libgcc/configure
@@ -2412,6 +2412,9 @@ case "${host}" in
# sets the default TLS model and affects inlining.
PICFLAG=-fPIC
;;
+loongarch*-*-*)
+   PICFLAG=-fpic
+   ;;
 mips-sgi-irix6*)
# PIC is the default.
;;
@@ -5066,7 +5069,7 @@ $as_echo "$libgcc_cv_cfi" >&6; }
 # word size rather than the address size.
 cat > conftest.c <

[PATCH v6 07/12] LoongArch Port: Builtin macros.

2022-01-28 Thread chenglulu
gcc/

* config/loongarch/loongarch-c.cc: New file.
---
 gcc/config/loongarch/loongarch-c.cc | 109 
 1 file changed, 109 insertions(+)
 create mode 100644 gcc/config/loongarch/loongarch-c.cc

diff --git a/gcc/config/loongarch/loongarch-c.cc 
b/gcc/config/loongarch/loongarch-c.cc
new file mode 100644
index 000..e914bf306d5
--- /dev/null
+++ b/gcc/config/loongarch/loongarch-c.cc
@@ -0,0 +1,109 @@
+/* LoongArch-specific code for C family languages.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Loongson Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#define IN_TARGET_CODE 1
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "c-family/c-common.h"
+#include "cpplib.h"
+
+#define preprocessing_asm_p() (cpp_get_options (pfile)->lang == CLK_ASM)
+#define builtin_define(TXT) cpp_define (pfile, TXT)
+#define builtin_assert(TXT) cpp_assert (pfile, TXT)
+
+/* Define preprocessor macros for the -march and -mtune options.
+   PREFIX is either _LOONGARCH_ARCH or _LOONGARCH_TUNE, INFO is
+   the selected processor.  If INFO's canonical name is "foo",
+   define PREFIX to be "foo", and define an additional macro
+   PREFIX_FOO.  */
+#define LARCH_CPP_SET_PROCESSOR(PREFIX, CPU_TYPE)  \
+  do   \
+{  \
+  char *macro, *p; \
+  int cpu_type = (CPU_TYPE);   \
+   \
+  macro = concat ((PREFIX), "_",   \
+ loongarch_cpu_strings[cpu_type], NULL);   \
+  for (p = macro; *p != 0; p++)\
+   *p = TOUPPER (*p);  \
+   \
+  builtin_define (macro);  \
+  builtin_define_with_value ((PREFIX), \
+loongarch_cpu_strings[cpu_type], 1);   \
+  free (macro);\
+}  \
+  while (0)
+
+void
+loongarch_cpu_cpp_builtins (cpp_reader *pfile)
+{
+  builtin_assert ("machine=loongarch");
+  builtin_assert ("cpu=loongarch");
+  builtin_define ("__loongarch__");
+
+  LARCH_CPP_SET_PROCESSOR ("_LOONGARCH_ARCH", __ACTUAL_ARCH);
+  LARCH_CPP_SET_PROCESSOR ("_LOONGARCH_TUNE", __ACTUAL_TUNE);
+
+  /* Base architecture / ABI.  */
+  if (TARGET_64BIT)
+{
+  builtin_define ("__loongarch_grlen=64");
+  builtin_define ("__loongarch64");
+}
+
+  if (TARGET_ABI_LP64)
+{
+  builtin_define ("_ABILP64=3");
+  builtin_define ("_LOONGARCH_SIM=_ABILP64");
+  builtin_define ("__loongarch_lp64");
+}
+
+  /* These defines reflect the ABI in use, not whether the
+ FPU is directly accessible.  */
+  if (TARGET_DOUBLE_FLOAT_ABI)
+builtin_define ("__loongarch_double_float=1");
+  else if (TARGET_SINGLE_FLOAT_ABI)
+builtin_define ("__loongarch_single_float=1");
+
+  if (TARGET_DOUBLE_FLOAT_ABI || TARGET_SINGLE_FLOAT_ABI)
+builtin_define ("__loongarch_hard_float=1");
+  else
+builtin_define ("__loongarch_soft_float=1");
+
+
+  /* ISA Extensions.  */
+  if (TARGET_DOUBLE_FLOAT)
+builtin_define ("__loongarch_frlen=64");
+  else if (TARGET_SINGLE_FLOAT)
+builtin_define ("__loongarch_frlen=32");
+  else
+builtin_define ("__loongarch_frlen=0");
+
+  /* Native Data Sizes.  */
+  builtin_define_with_int_value ("_LOONGARCH_SZINT", INT_TYPE_SIZE);
+  builtin_define_with_int_value ("_LOONGARCH_SZLONG", LONG_TYPE_SIZE);
+  builtin_define_with_int_value ("_LOONGARCH_SZPTR", POINTER_SIZE);
+  builtin_define_with_int_value ("_LOONGARCH_FPSET", 32 / MAX_FPRS_PER_FMT);
+  builtin_define_with_int_value ("_LOONGARCH_SPFPSET", 32);
+
+}
-- 
2.27.0
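
For readers who want to see how user code might consume the macros defined by loongarch_cpu_cpp_builtins above, here is a minimal sketch.  The macro names (__loongarch__, __loongarch_grlen, __loongarch_double_float, __loongarch_single_float, __loongarch_soft_float) are taken from the patch; the helper function names are hypothetical and only serve the illustration.  On any other target every check falls through to its default branch.

```c
#include <stdio.h>

/* Returns 1 when compiled for LoongArch, 0 otherwise.  */
static int
is_loongarch (void)
{
#ifdef __loongarch__
  return 1;
#else
  return 0;
#endif
}

/* General-purpose register width in bits.  The patch defines
   __loongarch_grlen only for 64-bit targets, so report 0 when the
   macro is absent.  */
static int
gpr_width (void)
{
#ifdef __loongarch_grlen
  return __loongarch_grlen;
#else
  return 0;
#endif
}

/* Name of the floating-point ABI in use, derived from the ABI macros.
   These describe the calling convention, not whether an FPU exists.  */
static const char *
float_abi_name (void)
{
#if defined (__loongarch_double_float)
  return "double-float";
#elif defined (__loongarch_single_float)
  return "single-float";
#elif defined (__loongarch_soft_float)
  return "soft-float";
#else
  return "unknown";
#endif
}

/* Prints a one-line summary and returns the number of characters
   written (hypothetical helper for this sketch).  */
static int
print_target_summary (void)
{
  return printf ("loongarch=%d grlen=%d float-abi=%s\n",
		 is_loongarch (), gpr_width (), float_abi_name ());
}
```

Calling print_target_summary () from main shows which branch each check took for the compiler in use.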



[PATCH v6 03/12] LoongArch Port: Regenerate gcc/configure.

2022-01-28 Thread chenglulu
---
 gcc/configure | 66 ++-
 1 file changed, 60 insertions(+), 6 deletions(-)

diff --git a/gcc/configure b/gcc/configure
index bd4d4721868..3823bc4e783 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -5442,6 +5442,9 @@ case "${target}" in
# sets the default TLS model and affects inlining.
PICFLAG_FOR_TARGET=-fPIC
;;
+loongarch*-*-*)
+   PICFLAG_FOR_TARGET=-fpic
+   ;;
 mips-sgi-irix6*)
# PIC is the default.
;;
@@ -7963,6 +7966,9 @@ else
 mips*-*-*)
   enable_fixed_point=yes
   ;;
+loongarch*-*-*)
+  enable_fixed_point=yes
+  ;;
 *)
   { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING: fixed-point is not 
supported for this target, ignored" >&5
 $as_echo "$as_me: WARNING: fixed-point is not supported for this target, 
ignored" >&2;}
@@ -19659,7 +19665,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19662 "configure"
+#line 19668 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -19765,7 +19771,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19768 "configure"
+#line 19774 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -25548,6 +25554,17 @@ foo:   data8   25
movlr24 = @tprel(foo#)'
tls_as_opt=--fatal-warnings
;;
+  loongarch*-*-*)
+conftest_s='
+   .section .tdata,"awT",@progbits
+x: .word 2
+   .text
+   la.tls.gd $a0,x
+   bl __tls_get_addr'
+   tls_first_major=0
+   tls_first_minor=0
+   tls_as_opt='--fatal-warnings'
+   ;;
   microblaze*-*-*)
 conftest_s='
.section .tdata,"awT",@progbits
@@ -28770,6 +28787,43 @@ $as_echo "#define HAVE_AS_MARCH_ZIFENCEI 1" 
>>confdefs.h
 fi
 
 ;;
+  loongarch*-*-*)
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for 
.dtprelword support" >&5
+$as_echo_n "checking assembler for .dtprelword support... " >&6; }
+if ${gcc_cv_as_loongarch_dtprelword+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  gcc_cv_as_loongarch_dtprelword=no
+  if test x$gcc_cv_as != x; then
+$as_echo '' > conftest.s
+if { ac_try='$gcc_cv_as $gcc_cv_as_flags 2,18,0 -o conftest.o conftest.s 
>&5'
+  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
+  (eval $ac_try) 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; }
+then
+   .section .tdata,"awT",@progbits
+x:
+   .word 2
+   .text
+   .dtprelword x+0x8000
+else
+  echo "configure: failed program was" >&5
+  cat conftest.s >&5
+fi
+rm -f conftest.o conftest.s
+  fi
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: 
$gcc_cv_as_loongarch_dtprelword" >&5
+$as_echo "$gcc_cv_as_loongarch_dtprelword" >&6; }
+
+if test $gcc_cv_as_loongarch_dtprelword != yes; then
+
+$as_echo "#define HAVE_AS_DTPRELWORD 1" >>confdefs.h
+
+fi
+;;
 s390*-*-*)
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for 
.gnu_attribute support" >&5
 $as_echo_n "checking assembler for .gnu_attribute support... " >&6; }
@@ -28933,11 +28987,11 @@ fi
 ;;
 esac
 
-# Mips and HP-UX need the GNU assembler.
+# Mips, LoongArch and HP-UX need the GNU assembler.
 # Linux on IA64 might be able to use the Intel assembler.
 
 case "$target" in
-  mips*-*-* | *-*-hpux* )
+  mips*-*-* | loongarch*-*-* | *-*-hpux* )
 if test x$gas_flag = xyes \
|| test x"$host" != x"$build" \
|| test ! -x "$gcc_cv_as" \
@@ -29374,8 +29428,8 @@ esac
 # ??? Once 2.11 is released, probably need to add first known working
 # version to the per-target configury.
 case "$cpu_type" in
-  aarch64 | alpha | arc | arm | avr | bfin | cris | csky | i386 | m32c | m68k \
-  | microblaze | mips | nds32 | nios2 | pa | riscv | rs6000 | score | sparc \
+  aarch64 | alpha | arc | arm | avr | bfin | cris | csky | i386 | loongarch | 
m32c \
+  | m68k | microblaze | mips | nds32 | nios2 | pa | riscv | rs6000 | score | 
sparc \
   | tilegx | tilepro | visium | xstormy16 | xtensa)
 insn="nop"
 ;;
-- 
2.27.0



[PATCH v6 00/12] Add LoongArch support.

2022-01-28 Thread chenglulu
LoongArch is an Instruction Set Architecture (ISA) in the Reduced
Instruction Set Computer (RISC) style.
The documentation is at:
https://loongson.github.io/LoongArch-Documentation/README-EN.html

The ELF ABI documents are at:
https://loongson.github.io/LoongArch-Documentation/LoongArch-ELF-ABI-EN.html

The binutils port has been merged into trunk:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=560b3fe208255ae909b4b1c88ba9c28b09043307

Note: We split -mabi= into -mabi=lp64d/f/s.  The new options are not yet
supported by upstream binutils, so building this GCC port requires the
following patch to be applied to binutils:
https://github.com/loongson/binutils-gdb/commit/aacb0bf860f02aa5a7dcb76dd0e392bf871c7586
(it will be submitted upstream once the GCC side is confirmed)

We have compiled more than 300 CLFS packages with this compiler.
CLFS is currently used on the Cfarm machines gcc400 and gcc401.


changelog:

v1 -> v2
1. Split patch set.
2. Change some code style.
3. Add -mabi=lp64d/f/s options.
4. Change GLIBC_DYNAMIC_LINKER_LP64 name.

v2 -> v3
1. Change some code style.
2. Bug fix.

v3 -> v4
1. Change some code style.
2. Bug fix.
3. Delete some builtin macros.

v4 -> v5
1. Delete the incorrect insn zero_extendsidi2_internal.
2. Adjust some build options.
3. Change some .c files to .cc.

v5 -> v6
1. Fix compilation issues. The generated *.opt and *.h files
   are now written to $(objdir).


chenglulu (12):
  LoongArch Port: Regenerate configure
  LoongArch Port: gcc build
  LoongArch Port: Regenerate gcc/configure.
  LoongArch Port: Machine Description files.
  LoongArch Port: Machine description C files and .h files.
  LoongArch Port: Builtin functions.
  LoongArch Port: Builtin macros.
  LoongArch Port: libgcc
  LoongArch Port: Regenerate libgcc/configure.
  LoongArch Port: libgomp
  LoongArch Port: gcc/testsuite
  LoongArch Port: Add doc.

 config/picflag.m4 |3 +
 configure |   10 +-
 configure.ac  |   10 +-
 contrib/config-list.mk|5 +-
 contrib/gcc_update|2 +
 .../config/loongarch/loongarch-common.cc  |   73 +
 gcc/config.gcc|  410 +-
 gcc/config/host-linux.cc  |2 +
 gcc/config/loongarch/constraints.md   |  212 +
 gcc/config/loongarch/generic.md   |  132 +
 gcc/config/loongarch/genopts/genstr.sh|   91 +
 .../loongarch/genopts/loongarch-strings   |   58 +
 gcc/config/loongarch/genopts/loongarch.opt.in |  189 +
 gcc/config/loongarch/gnu-user.h   |   84 +
 gcc/config/loongarch/la464.md |  132 +
 gcc/config/loongarch/larchintrin.h|  413 ++
 gcc/config/loongarch/linux.h  |   50 +
 gcc/config/loongarch/loongarch-builtins.cc|  511 ++
 gcc/config/loongarch/loongarch-c.cc   |  109 +
 gcc/config/loongarch/loongarch-cpu.cc |  206 +
 gcc/config/loongarch/loongarch-cpu.h  |   30 +
 gcc/config/loongarch/loongarch-def.c  |  164 +
 gcc/config/loongarch/loongarch-def.h  |  151 +
 gcc/config/loongarch/loongarch-driver.cc  |  187 +
 gcc/config/loongarch/loongarch-driver.h   |   69 +
 gcc/config/loongarch/loongarch-ftypes.def |  106 +
 gcc/config/loongarch/loongarch-modes.def  |   29 +
 gcc/config/loongarch/loongarch-opts.cc|  580 ++
 gcc/config/loongarch/loongarch-opts.h |   86 +
 gcc/config/loongarch/loongarch-protos.h   |  241 +
 gcc/config/loongarch/loongarch-str.h  |   57 +
 gcc/config/loongarch/loongarch-tune.h |   72 +
 gcc/config/loongarch/loongarch.cc | 6330 +
 gcc/config/loongarch/loongarch.h  | 1271 
 gcc/config/loongarch/loongarch.md | 3702 ++
 gcc/config/loongarch/loongarch.opt|  189 +
 gcc/config/loongarch/predicates.md|  553 ++
 gcc/config/loongarch/sync.md  |  574 ++
 gcc/config/loongarch/t-linux  |   53 +
 gcc/config/loongarch/t-loongarch  |   68 +
 gcc/configure |   66 +-
 gcc/configure.ac  |   33 +-
 gcc/doc/install.texi  |   47 +-
 gcc/doc/invoke.texi   |  201 +
 gcc/doc/md.texi   |   55 +
 gcc/testsuite/g++.dg/cpp0x/constexpr-rom.C|2 +-
 gcc/testsuite/g++.old-deja/g++.abi/ptrmem.C   |2 +-
 gcc/testsuite/g++.old-deja/g++.pt/ptrmem6.C   |2 +-
 gcc/testsuite/gcc.dg/20020312-2.c |2 +
 gcc/testsuite/gcc.dg/loop-8.c |2 +-
 .../torture/stackalign/builtin-apply-2.c  |2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c |2 +-
 .../gcc.target/loongarch/loongarch.exp|   40 +
 .../gcc.target/loongarch/tst-asm-const.c  |   16 +
 gcc/testsuite/go.test/go-test.exp 

[PATCH v6 01/12] LoongArch Port: Regenerate configure

2022-01-28 Thread chenglulu
* config/picflag.m4: Add build option '-fpic' by default for LoongArch.
* configure: Add LoongArch tuples.
* configure.ac: Likewise.
---
 config/picflag.m4 |  3 +++
 configure | 10 +-
 configure.ac  | 10 +-
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/config/picflag.m4 b/config/picflag.m4
index 8b106f9af88..0aefcf619bf 100644
--- a/config/picflag.m4
+++ b/config/picflag.m4
@@ -44,6 +44,9 @@ case "${$2}" in
# sets the default TLS model and affects inlining.
$1=-fPIC
;;
+loongarch*-*-*)
+   $1=-fpic
+   ;;
 mips-sgi-irix6*)
# PIC is the default.
;;
diff --git a/configure b/configure
index 9c2d7df1bb2..87548f0da96 100755
--- a/configure
+++ b/configure
@@ -3060,7 +3060,7 @@ case "${ENABLE_GOLD}" in
   # Check for target supported by gold.
   case "${target}" in
 i?86-*-* | x86_64-*-* | sparc*-*-* | powerpc*-*-* | arm*-*-* \
-| aarch64*-*-* | tilegx*-*-* | mips*-*-* | s390*-*-*)
+| aarch64*-*-* | tilegx*-*-* | mips*-*-* | s390*-*-* | loongarch*-*-*)
  configdirs="$configdirs gold"
  if test x${ENABLE_GOLD} = xdefault; then
default_ld=gold
@@ -3646,6 +3646,9 @@ case "${target}" in
   i[3456789]86-*-*)
 libgloss_dir=i386
 ;;
+  loongarch*-*-*)
+libgloss_dir=loongarch
+;;
   m68hc11-*-*|m6811-*-*|m68hc12-*-*|m6812-*-*)
 libgloss_dir=m68hc11
 ;;
@@ -4030,6 +4033,11 @@ case "${target}" in
   wasm32-*-*)
 noconfigdirs="$noconfigdirs ld"
 ;;
+  loongarch*-*-linux*)
+;;
+  loongarch*-*-*)
+noconfigdirs="$noconfigdirs gprof"
+;;
 esac
 
 # If we aren't building newlib, then don't build libgloss, since libgloss
diff --git a/configure.ac b/configure.ac
index 68cc5cc31fe..55362afeeae 100644
--- a/configure.ac
+++ b/configure.ac
@@ -353,7 +353,7 @@ case "${ENABLE_GOLD}" in
   # Check for target supported by gold.
   case "${target}" in
 i?86-*-* | x86_64-*-* | sparc*-*-* | powerpc*-*-* | arm*-*-* \
-| aarch64*-*-* | tilegx*-*-* | mips*-*-* | s390*-*-*)
+| aarch64*-*-* | tilegx*-*-* | mips*-*-* | s390*-*-* | loongarch*-*-*)
  configdirs="$configdirs gold"
  if test x${ENABLE_GOLD} = xdefault; then
default_ld=gold
@@ -899,6 +899,9 @@ case "${target}" in
   i[[3456789]]86-*-*)
 libgloss_dir=i386
 ;;
+  loongarch*-*-*)
+libgloss_dir=loongarch
+;;
   m68hc11-*-*|m6811-*-*|m68hc12-*-*|m6812-*-*)
 libgloss_dir=m68hc11
 ;;
@@ -1283,6 +1286,11 @@ case "${target}" in
   wasm32-*-*)
 noconfigdirs="$noconfigdirs ld"
 ;;
+  loongarch*-*-linux*)
+;;
+  loongarch*-*-*)
+noconfigdirs="$noconfigdirs gprof"
+;;
 esac
 
 # If we aren't building newlib, then don't build libgloss, since libgloss
-- 
2.27.0



[PATCH] tree-optimization/104267 - fix external def vector type for call args

2022-01-28 Thread Richard Biener via Gcc-patches
The following fixes the vector type registered for external defs
in call arguments when vectorizing with SLP.  We assumed uniform
vectype_in types here but with calls like .COND_MUL we also have
mask arguments which, when invariant or external, need to have
a proper mask vector type.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2022-01-28  Richard Biener  

PR tree-optimization/104267
* tree-vect-stmts.cc (vectorizable_call): Properly use the
per-argument determined vector type for externals and
invariants.
---
 gcc/tree-vect-stmts.cc | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 824ebb6354b..000a0f4b47e 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -3444,7 +3444,9 @@ vectorizable_call (vec_info *vinfo,
 {
   if (slp_node)
for (i = 0; i < nargs; ++i)
- if (!vect_maybe_update_slp_op_vectype (slp_op[i], vectype_in))
+ if (!vect_maybe_update_slp_op_vectype (slp_op[i],
+vectypes[i]
+? vectypes[i] : vectype_in))
{
  if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-- 
2.31.1
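
The selection rule this patch introduces can be sketched outside of GCC as follows.  This is only an illustration under stated assumptions: vectype_t is a hypothetical stand-in for GCC's tree vector types, and effective_vectype is a made-up helper, not a function in tree-vect-stmts.cc.  It shows the fallback only: use the per-argument vector type when one was determined (for example a mask type for a .COND_MUL mask operand), otherwise fall back to the uniform vectype_in.

```c
#include <stddef.h>

/* Hypothetical stand-in for a vector type; GCC uses `tree' here.  */
typedef const char *vectype_t;

/* Pick the vector type to register for argument I: the per-argument
   type if one was determined, else the common input vector type.  */
static vectype_t
effective_vectype (const vectype_t *vectypes, size_t i,
		   vectype_t vectype_in)
{
  return vectypes[i] ? vectypes[i] : vectype_in;
}
```

With vectypes = { "mask", NULL, NULL } and vectype_in = "V4SF", argument 0 gets "mask" while arguments 1 and 2 get "V4SF".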


Re: [PATCH][GCC13] Don't force side effects for hardware vector element broadcast

2022-01-28 Thread Richard Biener via Gcc-patches
On Fri, Jan 28, 2022 at 1:22 PM Richard Biener
 wrote:
>
> On Thu, Jan 27, 2022 at 8:14 PM Maciej W. Rozycki  wrote:
> >
> > On Thu, 27 Jan 2022, Richard Biener wrote:
> >
> > > > > > Index: gcc/gcc/c/c-typeck.cc
> > > > > > ===
> > > > > > --- gcc.orig/gcc/c/c-typeck.cc
> > > > > > +++ gcc/gcc/c/c-typeck.cc
> > > > > > @@ -49,6 +49,7 @@ along with GCC; see the file COPYING3.
> > > > > >  #include "gomp-constants.h"
> > > > > >  #include "spellcheck-tree.h"
> > > > > >  #include "gcc-rich-location.h"
> > > > > > +#include "optabs-query.h"
> > > > > >  #include "stringpool.h"
> > > > > >  #include "attribs.h"
> > > > > >  #include "asan.h"
> > > > > > @@ -11923,7 +11924,9 @@ build_binary_op (location_t location, en
> > > > > >bool maybe_const = true;
> > > > > >tree sc;
> > > > > >sc = c_fully_fold (op0, false, &maybe_const);
> > > > > > -  sc = save_expr (sc);
> > > > > > + if (optab_handler (vec_duplicate_optab,
> > > > > > +TYPE_MODE (type1)) == 
> > > > > > CODE_FOR_nothing)
> > > > > > +   sc = save_expr (sc);
> > > > >
> > > > > This doesn't make much sense - I suppose the CONSTRUCTOR retains
> > > > > TREE_SIDE_EFFECTS but such flag has no meaning on GIMPLE
> > > > > and thus should have been cleared during gimplification or in the end
> > > > > ignored by RTL expansion.
> > > >
> > > >  This is how the expression built here eventually looks in
> > > > `store_constructor':
> > > >
> > > > (gdb) print exp
> > > > $41 = 
> > > > (gdb) pt
> > > >   > > > type  > > > type  > > > size 
> > > > unit-size 
> > > > align:32 warn_if_not_align:0 symtab:0 alias-set -1 
> > > > canonical-type 0x75cf1260 precision:32
> > > > pointer_to_this >
> > > > sizes-gimplified V4SF
> > > > size 
> > > > unit-size 
> > > > align:128 warn_if_not_align:0 symtab:0 alias-set -1 
> > > > canonical-type 0x75d19648 nunits:4 context  > > > 0x75ec0bb8 v4sf-dup.c>>
> > > > side-effects length:4
> > > > val 
> > > > visited var 
> > > > def_stmt GIMPLE_NOP
> > > > version:2>
> > > > val 
> > > > visited var 
> > > > def_stmt GIMPLE_NOP
> > > > version:2>
> > > > val 
> > > > visited var 
> > > > def_stmt GIMPLE_NOP
> > > > version:2>
> > > > val 
> > > > visited var 
> > > > def_stmt GIMPLE_NOP
> > > > version:2>>
> > > > (gdb)
> > > >
> > > > The `side-effects' flag prevents this conditional from executing:
> > > >
> > > > /* Try using vec_duplicate_optab for uniform vectors.  */
> > > > if (!TREE_SIDE_EFFECTS (exp)
> > > > && VECTOR_MODE_P (mode)
> > > > && eltmode == GET_MODE_INNER (mode)
> > > > && ((icode = optab_handler (vec_duplicate_optab, mode))
> > > > != CODE_FOR_nothing)
> > > > && (elt = uniform_vector_p (exp))
> > > > && !VECTOR_TYPE_P (TREE_TYPE (elt)))
> > > >   {
> > > > class expand_operand ops[2];
> > > > create_output_operand (&ops[0], target, mode);
> > > > create_input_operand (&ops[1], expand_normal (elt), 
> > > > eltmode);
> > > > expand_insn (icode, 2, ops);
> > > > if (!rtx_equal_p (target, ops[0].value))
> > > >   emit_move_insn (target, ops[0].value);
> > > > break;
> > > >   }
> > > >
> > > > I don't know what's supposed to clear the flag (and what the purpose of
> > > > setting it in the first place would be then).
> > >
> > > It's probably safe to remove the !TREE_SIDE_EFFECTS check above
> > > but already gimplification should have made sure all side-effects are
> > > pushed to separate stmts.  Gimplification usually calls 
> > > recompute_side_effects
> > > but that doesn't seem to touch CONSTRUCTORs.  But I do remember fixing
> > > some spurious TREE_SIDE_EFFECTS on CTORs before.
> > >
> > > Might be worth verifying in verify_gimple_assign_single that CTORs
> > > do not have TREE_SIDE_EFFECTS set (unless this is a clobber).
> >
> >  OK, so maybe there's another bug somewhere that causes the side-effects
> > flag not to be cleared where expected; however, I am unconvinced that I
> > should withdraw my original point.  That is, why treat code like:
> >
> > v4sf
> > odd_even (v4sf x, float f)
> > {
> >   return x + f;
> > }
> >
> > effectively like:
> >
> > v4sf
> > odd_even (v4sf x, volatile float f)
> > {
> >   return x + f;
> > }
>
> that's not what it does.  It treats it like
>
>   float tem = f;
>   return x + { tem, tem, tem, tem };
>
> avoiding, like for x + (1.0f + f) creating
>
>   return x + { 1.0+f, 1.0+f, 1.0+f ...}
>
> it's more CSE than volatile qualifying.
>
> > which I infer from the terse justification in the discussions 

Re: [PATCH][GCC13] Don't force side effects for hardware vector element broadcast

2022-01-28 Thread Richard Biener via Gcc-patches
On Thu, Jan 27, 2022 at 8:14 PM Maciej W. Rozycki  wrote:
>
> On Thu, 27 Jan 2022, Richard Biener wrote:
>
> > > > > Index: gcc/gcc/c/c-typeck.cc
> > > > > ===
> > > > > --- gcc.orig/gcc/c/c-typeck.cc
> > > > > +++ gcc/gcc/c/c-typeck.cc
> > > > > @@ -49,6 +49,7 @@ along with GCC; see the file COPYING3.
> > > > >  #include "gomp-constants.h"
> > > > >  #include "spellcheck-tree.h"
> > > > >  #include "gcc-rich-location.h"
> > > > > +#include "optabs-query.h"
> > > > >  #include "stringpool.h"
> > > > >  #include "attribs.h"
> > > > >  #include "asan.h"
> > > > > @@ -11923,7 +11924,9 @@ build_binary_op (location_t location, en
> > > > >bool maybe_const = true;
> > > > >tree sc;
> > > > > >sc = c_fully_fold (op0, false, &maybe_const);
> > > > > -  sc = save_expr (sc);
> > > > > + if (optab_handler (vec_duplicate_optab,
> > > > > +TYPE_MODE (type1)) == 
> > > > > CODE_FOR_nothing)
> > > > > +   sc = save_expr (sc);
> > > >
> > > > This doesn't make much sense - I suppose the CONSTRUCTOR retains
> > > > TREE_SIDE_EFFECTS but such flag has no meaning on GIMPLE
> > > > and thus should have been cleared during gimplification or in the end
> > > > ignored by RTL expansion.
> > >
> > >  This is how the expression built here eventually looks in
> > > `store_constructor':
> > >
> > > (gdb) print exp
> > > $41 = 
> > > (gdb) pt
> > >   > > type  > > type  > > size 
> > > unit-size 
> > > align:32 warn_if_not_align:0 symtab:0 alias-set -1 
> > > canonical-type 0x75cf1260 precision:32
> > > pointer_to_this >
> > > sizes-gimplified V4SF
> > > size 
> > > unit-size 
> > > align:128 warn_if_not_align:0 symtab:0 alias-set -1 
> > > canonical-type 0x75d19648 nunits:4 context  > > 0x75ec0bb8 v4sf-dup.c>>
> > > side-effects length:4
> > > val 
> > > visited var 
> > > def_stmt GIMPLE_NOP
> > > version:2>
> > > val 
> > > visited var 
> > > def_stmt GIMPLE_NOP
> > > version:2>
> > > val 
> > > visited var 
> > > def_stmt GIMPLE_NOP
> > > version:2>
> > > val 
> > > visited var 
> > > def_stmt GIMPLE_NOP
> > > version:2>>
> > > (gdb)
> > >
> > > The `side-effects' flag prevents this conditional from executing:
> > >
> > > /* Try using vec_duplicate_optab for uniform vectors.  */
> > > if (!TREE_SIDE_EFFECTS (exp)
> > > && VECTOR_MODE_P (mode)
> > > && eltmode == GET_MODE_INNER (mode)
> > > && ((icode = optab_handler (vec_duplicate_optab, mode))
> > > != CODE_FOR_nothing)
> > > && (elt = uniform_vector_p (exp))
> > > && !VECTOR_TYPE_P (TREE_TYPE (elt)))
> > >   {
> > > class expand_operand ops[2];
> > > create_output_operand (&ops[0], target, mode);
> > > create_input_operand (&ops[1], expand_normal (elt), eltmode);
> > > expand_insn (icode, 2, ops);
> > > if (!rtx_equal_p (target, ops[0].value))
> > >   emit_move_insn (target, ops[0].value);
> > > break;
> > >   }
> > >
> > > I don't know what's supposed to clear the flag (and what the purpose of
> > > setting it in the first place would be then).
> >
> > It's probably safe to remove the !TREE_SIDE_EFFECTS check above
> > but already gimplification should have made sure all side-effects are
> > pushed to separate stmts.  Gimplification usually calls 
> > recompute_side_effects
> > but that doesn't seem to touch CONSTRUCTORs.  But I do remember fixing
> > some spurious TREE_SIDE_EFFECTS on CTORs before.
> >
> > Might be worth verifying in verify_gimple_assign_single that CTORs
> > do not have TREE_SIDE_EFFECTS set (unless this is a clobber).
>
>  OK, so maybe there's another bug somewhere that causes the side-effects
> flag not to be cleared where expected, however I am unconvinced as to
> withdrawing my original point.  That is, why treat code like:
>
> v4sf
> odd_even (v4sf x, float y)
> {
>   return x + f;
> }
>
> effectively like:
>
> v4sf
> odd_even (v4sf x, volatile float y)
> {
>   return x + f;
> }

that's not what it does.  It treats it like

  float tem = f;
  return x + { tem, tem, tem, tem };

avoiding, like for x + (1.0f + f) creating

  return x + { 1.0+f, 1.0+f, 1.0+f ...}

it's more CSE than volatile qualifying.
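
To make the distinction concrete, here is a minimal sketch in plain C++ (not GCC internals).  The names `V4`, `next_value`, `splat_once` and `splat_each` are hypothetical: `splat_once` mimics the "evaluate once into a temporary, then duplicate" form described above, while `splat_each` shows the naive expansion that re-evaluates the operand for every lane.

```cpp
#include <cassert>

// Hypothetical 4-lane vector standing in for v4sf.
struct V4 { float lane[4]; };

// Counts how often the scalar operand is evaluated.
static int evaluations = 0;

// An operand whose evaluation is observable through the counter.
static float next_value(float f) {
  ++evaluations;
  return 1.0f + f;
}

// Evaluate the scalar once into a temporary, then duplicate it.
// This is the "float tem = f; { tem, tem, tem, tem }" form.
V4 splat_once(float f) {
  float tem = next_value(f);
  return V4{{tem, tem, tem, tem}};
}

// Naive expansion: the operand is re-evaluated for every lane.
V4 splat_each(float f) {
  return V4{{next_value(f), next_value(f), next_value(f), next_value(f)}};
}

// Run one variant and report how many evaluations it needed.
int count_evaluations(bool once) {
  evaluations = 0;
  V4 v = once ? splat_once(2.0f) : splat_each(2.0f);
  assert(v.lane[0] == 3.0f && v.lane[3] == 3.0f);
  return evaluations;
}
```

Both variants produce the same vector; only the number of evaluations of the operand differs, which is why this is closer to CSE than to volatile qualification.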

> which I infer from the terse justification in the discussions referred is
> the sole purpose of making use of `save_expr' here, also for targets that
> have a cheap (or free if combined with another operation) `vec_duplicateM'
> machine operation?

Because the IL from the frontends should not depend on target capabilities
and whether we have to preserve side-effects properly doesn't depend 

Re: [PATCH] constrain PHI handling in -Wuse-after-free (PR104232)

2022-01-28 Thread Richard Biener via Gcc-patches
On Thu, Jan 27, 2022 at 7:58 PM Martin Sebor via Gcc-patches
 wrote:
>
> The indiscriminate PHI handling by -Wuse-after-free has caused
> the false positive reported in PR 104232.  The attached patch
> refines the handling to only consider PHIs all of whose operands
> refer to the same object and disregard the rest.
>
> Tested on x86_64-linux and by compiling a few toolchain projects,
> including Glibc and Binutils/GDB, to verify the absence of false
> positives.

OK.

> Martin


Re: [PATCH] RISC-V: Document `auipc' and `bitmanip' `type' attributes

2022-01-28 Thread Maciej W. Rozycki
On Thu, 27 Jan 2022, Andrew Waterman wrote:

> LGTM, thanks for correcting this oversight in my patch.

 Committed, thanks for your review!

  Maciej


Re: [Submitted, PR103970, Fortran, Coarray] Multi-image co_broadcast of derived type with allocatable components fails^

2022-01-28 Thread Andre Vehreschild via Gcc-patches
Hi Tobias,

I don't know why that bootstrapped initially. I fixed the patch by adding the
obvious
```
else
  /* Prevent warning.  */
  cdesc = NULL_TREE;
```
and reran bootstrap, making sure to purge everything beforehand. It did
not break bootstrap on x86_64-linux/F35. I hope commit 26e237fb5b8 doesn't
break it elsewhere.
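
The shape of that fix, reduced to a self-contained sketch (the function and parameter names are hypothetical and do not come from trans-array.cc): a variable assigned on only one branch and used later under a different condition can trigger -Wmaybe-uninitialized, and giving it a value on the other branch silences the warning.

```cpp
#include <cassert>

// Hypothetical reduction of the pattern, not the actual Fortran front-end code.
int first_or_default(bool have_desc, int kind, int value) {
  const int* desc;
  if (have_desc)
    desc = &value;
  else
    /* Prevent warning.  */
    desc = nullptr;

  // The use is guarded by a different condition than the assignment, so
  // without the else branch the compiler could not prove DESC is set here.
  if (kind == 1 && desc != nullptr)
    return *desc;
  return -1;
}
```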

Thanks for pointing this out.

Regards,
Andre

On Fri, 28 Jan 2022 10:36:23 +0100
Andre Vehreschild via Fortran  wrote:

> Hi Tobias,
> 
> ups, sorry, reverted immediately.
> 
> Regards,
>   Andre
> 
> On Fri, 28 Jan 2022 10:27:26 +0100
> Tobias Burnus  wrote:
> 
> > Hi Andre,
> > 
> > your patch breaks bootstrapping:
> > 
> > > ../../repos/gcc/gcc/fortran/trans-array.cc: In function ‘tree_node*
> > > structure_alloc_comps(gfc_symbol*, tree, tree, int, int, int,
> > > gfc_co_subroutines_args*)’:
> > > ../../repos/gcc/gcc/fortran/trans-array.cc:9200:42: error: ‘cdesc’ may be
> > > used uninitialized [-Werror=maybe-uninitialized]
> > >  9200 | gfc_conv_descriptor_data_set (, cdesc, comp);
> > >       | ~^~~~
> > > ../../repos/gcc/gcc/fortran/trans-array.cc:9082:16: note: ‘cdesc’ was
> > > declared here
> > >  9082 |   tree cdesc;
> > >       |^
> > > cc1plus: all warnings being treated as errors
> > > make[3]: *** [Makefile:1143: fortran/trans-array.o] Error 1
> > 
> > Tobias
> > 
> > On 28.01.22 10:07, Andre Vehreschild via Fortran wrote:  
> > > Hi Harald,
> > >
> > > thanks for the fast review. I have submitted as c9c48ab7bad.
> > >
> > > Will wait for two weeks (reminder set :-)) before backporting to gcc-11.
> > >
> > > Thank you and regards,
> > >   Andre
> > >
> > > On Tue, 25 Jan 2022 22:30:22 +0100
> > > Harald Anlauf via Fortran  wrote:
> > >
> > >> Hi Andre',
> > >>
> > > >> On 25.01.22 at 17:32, Andre Vehreschild via Fortran wrote:
> > >>> Hi all,
> > >>>
> > >>> attached patch fixes wrong code generation when broadcasting a derived
> > >>> type containing allocatable and non-allocatable scalars. Furthermore
> > >>> does it prevent broadcasting of coarray-tokens, which are always local
> > >>> this_image. Thus having them on a different image makes no sense.
> > >>>
> > >>> Bootstrapped and regtested ok on x86_64-linux/F35.
> > >>>
> > >>> Ok, for trunk and backport to 12 and 11-branch after decent time?
> > >>>
> > >>> I perceived that 12 is closed for this kind of bugfix, therefore asking
> > >>> ok for 13.
> > >> I do not think that 12 is closed for bugfixing, especially not for
> > >> fortran.  And if my cursory reading of the patch is not misleading,
> > >> the impact of the patch is really limited to coarrays.
> > >>
> > >> You may want to wait for another 1-2 days for additional comments.
> > >> If not, it is OK from my side.
> > >>
> > >> Thanks for the patch!
> > >>
> > >> Harald
> > >>
> > >>> Regards,
> > >>> Andre
> > >>> --
> > >>> Andre Vehreschild * Email: vehre ad gmx dot de
> > >>
> > >
> > > --
> > > Andre Vehreschild * Email: vehre ad gmx dot de
> > -
> > > Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201,
> > > 80634 München; limited liability company; Managing Directors:
> > > Thomas Heurung, Frank Thürauf; Registered office: München;
> > > Registration court: München, HRB 106955
> 
> 


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


Re: [PATCH] tree-optimization/104263 - avoid retaining abnormal edges for non-call/goto stmts

2022-01-28 Thread Richard Biener via Gcc-patches
On Fri, 28 Jan 2022, Jakub Jelinek wrote:

> On Fri, Jan 28, 2022 at 12:04:00PM +0100, Richard Biener wrote:
> > Still I think assuming there are no abnormal edges when neither
> > of the flags is set is premature (as can be seen here).  I also
> > don't think what we do in the function is very timing critical,
> > but sure, we walk all successor edges.
> 
> Ok then.

Just to add - gimple_purge_dead_abnormal_call_edges should only
be called if the caller determined a possible change.  I've checked
and only fixup_cfg calls it unconditionally (I guess on purpose).

Richard.


Re: [PATCH] tree-optimization/104263 - avoid retaining abnormal edges for non-call/goto stmts

2022-01-28 Thread Jakub Jelinek via Gcc-patches
On Fri, Jan 28, 2022 at 12:04:00PM +0100, Richard Biener wrote:
> Still I think assuming there are no abnormal edges when neither
> of the flags is set is premature (as can be seen here).  I also
> don't think what we do in the function is very timing critical,
> but sure, we walk all successor edges.

Ok then.

Jakub



Re: [PATCH] enable -Winvalid-memory-order for C++ [PR99612]

2022-01-28 Thread Jonathan Wakely via Gcc-patches
On Fri, 28 Jan 2022 at 00:59, Martin Sebor wrote:
> In practice, I'd expect most calls to atomic functions to be made
> with constant memory models, and code like in the test case above
> to be uncommon, so I think the choice of warning at -O0 was
> the right one.

Some of us consider it a misfeature that the C++ functions use
function parameters for the memory model at all. They could have been
template arguments, so that only constants would be allowed:

atomic.fetch();
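
As a sketch of that idea (not an existing standard-library interface): a wrapper that takes the memory order as a template argument forces it to be a compile-time constant and can reject invalid orders with static_assert.  The helper names `load_with` and `demo_load` are hypothetical; `std::atomic` and `std::memory_order` are the real standard facilities.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: the memory order is a template argument, so it must
// be a constant expression and can be validated at compile time.
template <std::memory_order Order>
int load_with(const std::atomic<int>& a) {
  static_assert(Order != std::memory_order_release
                    && Order != std::memory_order_acq_rel,
                "release and acq_rel are not valid for a load");
  return a.load(Order);
}

int demo_load() {
  std::atomic<int> counter{5};
  // load_with<std::memory_order_release>(counter) would fail to compile.
  int v = load_with<std::memory_order_acquire>(counter);
  assert(v == 5);
  return v;
}
```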



Re: [PATCH] tree-optimization/104263 - avoid retaining abnormal edges for non-call/goto stmts

2022-01-28 Thread Richard Biener via Gcc-patches
On Fri, 28 Jan 2022, Jakub Jelinek wrote:

> On Fri, Jan 28, 2022 at 11:29:38AM +0100, Richard Biener wrote:
> > This removes a premature optimization from
> > gimple_purge_dead_abnormal_call_edges which, after eliding the
> > last setjmp (or computed goto) statement from a function and
> > thus clearing cfun->calls_setjmp, leaves us with the abnormal
> > edges from other calls that are elided for example via inlining
> > or DCE.  That's a CFG / IL combination that should be impossible
> > (not addressing the fact that with cfun->calls_setjmp and
> > cfun->has_nonlocal_label cleared we should not have any abnormal
> > edge at all).
> > 
> > For the testcase in the PR this means that IPA inlining will
> > > remove the abnormal edges from the block after inlining the call
> > the edge was coming from.
> 
> Couldn't DCE when it clears calls_setjmp and doesn't set it again
> (I think we never clear has_nonlocal_label) temporarily set
> calls_setjmp and gimple_purge_all_dead_abnormal_call_edges
> with it?
> Or have next to calls_setjmp a maybe_calls_setjmp flag that
> would be sticky like has_nonlocal_labels and would be never cleared?

I suppose we could do things like this.  Note in CFG cleanup
we call gimple_purge_dead_eh_edges on each block but not
gimple_purge_dead_abnormal_call_edges.  DCE, when resetting the
flag could also manually axe all abnormal edges in the function.

Still I think assuming there are no abnormal edges when neither
of the flags is set is premature (as can be seen here).  I also
don't think what we do in the function is very timing critical,
but sure, we walk all successor edges.

uninit has interesting code checking ->calls_setjmp conditionally
ignoring abnormal SSA names but only then ... (it should be
able to ignore them when _none_ of the flags is set instead).

That said, gimple_purge_dead_abnormal_call_edges wants to check
"are there possibly any abnormal edges in the function" and clearly
testing just the flags doesn't do it but resetting the flag was
important enough to cut out (sometimes bogus?) checks before
optimizations.

Richard.


Re: [PATCH] tree-optimization/104263 - avoid retaining abnormal edges for non-call/goto stmts

2022-01-28 Thread Jakub Jelinek via Gcc-patches
On Fri, Jan 28, 2022 at 11:29:38AM +0100, Richard Biener wrote:
> This removes a premature optimization from
> gimple_purge_dead_abnormal_call_edges which, after eliding the
> last setjmp (or computed goto) statement from a function and
> thus clearing cfun->calls_setjmp, leaves us with the abnormal
> edges from other calls that are elided for example via inlining
> or DCE.  That's a CFG / IL combination that should be impossible
> (not addressing the fact that with cfun->calls_setjmp and
> cfun->has_nonlocal_label cleared we should not have any abnormal
> edge at all).
> 
> For the testcase in the PR this means that IPA inlining will
> remove the abnormal edges from the block after inlining the call
> the edge was coming from.

Couldn't DCE when it clears calls_setjmp and doesn't set it again
(I think we never clear has_nonlocal_label) temporarily set
calls_setjmp and gimple_purge_all_dead_abnormal_call_edges
with it?
Or have next to calls_setjmp a maybe_calls_setjmp flag that
would be sticky like has_nonlocal_labels and would be never cleared?

Jakub



[PATCH] tree-optimization/104263 - avoid retaining abnormal edges for non-call/goto stmts

2022-01-28 Thread Richard Biener via Gcc-patches
This removes a premature optimization from
gimple_purge_dead_abnormal_call_edges which, after eliding the
last setjmp (or computed goto) statement from a function and
thus clearing cfun->calls_setjmp, leaves us with the abnormal
edges from other calls that are elided for example via inlining
or DCE.  That's a CFG / IL combination that should be impossible
(not addressing the fact that with cfun->calls_setjmp and
cfun->has_nonlocal_label cleared we should not have any abnormal
edge at all).

For the testcase in the PR this means that IPA inlining will
remove the abnormal edges from the block after inlining the call
the edge was coming from.
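
The problem can be modelled outside of GCC.  The following is only a toy sketch: none of these types or function names are GCC's, and the `trust_flags` parameter exists purely to contrast the old early exit with the behaviour after the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy model; none of these types are GCC's.
struct Edge { int dest; bool abnormal; };
struct Block {
  bool last_stmt_can_goto_abnormally;  // e.g. ends in a returns_twice call
  std::vector<Edge> succs;
};
struct Function {
  bool calls_setjmp = false;
  bool has_nonlocal_label = false;
};

// Removes abnormal successor edges of BB when its last statement can no
// longer make an abnormal goto.  With TRUST_FLAGS set this mimics the old
// early exit that skipped the walk when both function flags were clear.
bool purge_dead_abnormal_edges(const Function& fn, Block& bb, bool trust_flags) {
  if (trust_flags && !fn.has_nonlocal_label && !fn.calls_setjmp)
    return false;  // the premature optimization
  if (bb.last_stmt_can_goto_abnormally)
    return false;  // the edges are still needed
  auto old_size = bb.succs.size();
  bb.succs.erase(std::remove_if(bb.succs.begin(), bb.succs.end(),
                                [](const Edge& e) { return e.abnormal; }),
                 bb.succs.end());
  return bb.succs.size() != old_size;
}

// Scenario: the call that produced the abnormal edge was inlined away and
// the function flags were already cleared.  Returns the number of abnormal
// edges left on the block after the purge attempt.
int stale_edges_after_purge(bool trust_flags) {
  Function fn;  // both flags clear
  Block bb{false, {{1, false}, {2, true}}};
  purge_dead_abnormal_edges(fn, bb, trust_flags);
  assert(!bb.succs.empty());  // the normal edge always survives
  return static_cast<int>(std::count_if(
      bb.succs.begin(), bb.succs.end(),
      [](const Edge& e) { return e.abnormal; }));
}
```

With the early exit the stale abnormal edge survives; without it the edge is removed.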

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

2022-01-28  Richard Biener  

PR tree-optimization/104263
* tree-cfg.cc (gimple_purge_dead_abnormal_call_edges):
Purge edges also when !cfun->has_nonlocal_label
and !cfun->calls_setjmp.

* gcc.dg/tree-ssa/inline-13.c: New testcase.
---
 gcc/testsuite/gcc.dg/tree-ssa/inline-13.c | 27 +++
 gcc/tree-cfg.cc   |  4 
 2 files changed, 27 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/inline-13.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/inline-13.c b/gcc/testsuite/gcc.dg/tree-ssa/inline-13.c
new file mode 100644
index 000..94d8a9c709e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/inline-13.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-early-inlining -fdump-tree-fixup_cfg3" } */
+
+int n;
+
+static int
+bar (void)
+{
+  int a;
+
+  n = 0;
+  a = 0;
+
+  return n;
+}
+
+__attribute__ ((pure, returns_twice)) int
+foo (void)
+{
+  n = bar () + 1;
+  foo ();
+
+  return 0;
+}
+
+/* Abnormal edges should be properly elided after IPA inlining of bar.  */
+/* { dg-final { scan-tree-dump-times "bb" 1 "fixup_cfg3" } } */
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index 2340cd7cef0..260a7fb97c6 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -8916,10 +8916,6 @@ gimple_purge_dead_abnormal_call_edges (basic_block bb)
   edge_iterator ei;
   gimple *stmt = last_stmt (bb);
 
-  if (!cfun->has_nonlocal_label
-  && !cfun->calls_setjmp)
-return false;
-
   if (stmt && stmt_can_make_abnormal_goto (stmt))
 return false;
 
-- 
2.31.1


Re: [PATCH] cfgrtl: Fix up locus comparison in unique_locus_on_edge_between_p [PR104237]

2022-01-28 Thread Richard Biener via Gcc-patches
On Fri, 28 Jan 2022, Jakub Jelinek wrote:

> Hi!
> 
> The testcase in the PR (not included for the testsuite because we don't
> have an (easy) way to -fcompare-debug LTO, we'd need 2 compilations/linking,
> one with -g and one with -g0 and -fdump-rtl-final= at the end of lto1
> and compare that) has different code generation for -g vs. -g0.
> 
> The difference appears during expansion, where we have a goto_locus
> that is at -O0 compared to the INSN_LOCATION of the previous and next insn
> across an edge.  With -g0 the locations are equal and so no nop is added.
> With -g the locations aren't equal and so a nop is added holding that
> location.
> 
> The reason for the different location is in the way how we stream in
> locations by lto1.
> We have lto_location_cache::apply_location_cache that is called with some
> set of expanded locations, qsorts them, creates location_t's for those
> and remembers the last expanded location.
> lto_location_cache::input_location_and_block when read in expanded_location
> is equal to the last expanded location just reuses the last location_t
> (or adds/changes/removes LOCATION_BLOCK in it), when it is not queues
> it for next apply_location_cache.  Now, when streaming in -g input, we can
> see extra locations that don't appear with -g0, and if we are unlucky
> enough, those can be sorted last during apply_location_cache and affect
> what locations are used from the single entry cache next.
> In particular, second apply_location_cache with non-empty loc_cache in
> the testcase has 14 locations with -g0 and 16 with -g and those 2 extra
> ones sort both last (they are the same).  The last one from -g0 then
> appears to be input_location_and_block sourced again, for -g0 triggers
> the single entry cache, while for -g it doesn't and so apply_location_cache
> will create for it another location_t with the same content.
> 
> The following patch fixes it by comparing everything we care about the
> location instead (well, better in addition) to a simple location_t ==
> location_t check.  I think we don't care about the sysp flag for debug
> info...
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2022-01-28  Jakub Jelinek  
> 
>   PR lto/104237
>   * cfgrtl.cc (loc_equal): New function.
>   (unique_locus_on_edge_between_p): Use it.
> 
> --- gcc/cfgrtl.cc.jj  2022-01-18 11:58:58.947991128 +0100
> +++ gcc/cfgrtl.cc 2022-01-27 19:32:13.949937750 +0100
> @@ -778,6 +778,29 @@ rtl_split_block (basic_block bb, void *i
>return new_bb;
>  }
>  
> +/* Return true if LOC1 and LOC2 are equivalent for
> +   unique_locus_on_edge_between_p purposes.  */
> +
> +static bool
> +loc_equal (location_t loc1, location_t loc2)
> +{
> +  if (loc1 == loc2)
> +return true;
> +
> +  expanded_location loce1 = expand_location (loc1);
> +  expanded_location loce2 = expand_location (loc2);
> +
> +  if (loce1.line != loce2.line
> +  || loce1.column != loce2.column
> +  || loce1.data != loce2.data)
> +return false;
> +  if (loce1.file == loce2.file)
> +return true;
> +  return (loce1.file != NULL
> +   && loce2.file != NULL
> +   && filename_cmp (loce1.file, loce2.file) == 0);
> +}
> +
>  /* Return true if the single edge between blocks A and B is the only place
> in RTL which holds some unique locus.  */
>  
> @@ -796,7 +819,7 @@ unique_locus_on_edge_between_p (basic_bl
>while (insn != end && (!NONDEBUG_INSN_P (insn) || !INSN_HAS_LOCATION 
> (insn)))
>  insn = PREV_INSN (insn);
>  
> -  if (insn != end && INSN_LOCATION (insn) == goto_locus)
> +  if (insn != end && loc_equal (INSN_LOCATION (insn), goto_locus))
>  return false;
>  
>/* Then scan block B forward.  */
> @@ -808,7 +831,7 @@ unique_locus_on_edge_between_p (basic_bl
>   insn = NEXT_INSN (insn);
>  
>if (insn != end && INSN_HAS_LOCATION (insn)
> -   && INSN_LOCATION (insn) == goto_locus)
> +   && loc_equal (INSN_LOCATION (insn), goto_locus))
>   return false;
>  }
>  
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)


[Ada] Fix PR ada/104258

2022-01-28 Thread Eric Botcazou via Gcc-patches
This is a regression present on mainline and 11 branch: the new transformation 
applied during expansion by Narrow_Large_Operation would incorrectly perform 
name resolution for the operator again.

Tested on x86_64-suse-linux, applied on mainline and 11 branch.


2022-01-28  Eric Botcazou  

PR ada/104258
* exp_ch4.adb (Narrow_Large_Operation): Also copy the entity, if
any, when rewriting the operator node.


2022-01-28  Eric Botcazou  

* gnat.dg/generic_comp.adb: New test.

-- 
Eric Botcazou
diff --git a/gcc/ada/exp_ch4.adb b/gcc/ada/exp_ch4.adb
index c31f5bb36ba..2506c67e936 100644
--- a/gcc/ada/exp_ch4.adb
+++ b/gcc/ada/exp_ch4.adb
@@ -14314,9 +14314,13 @@ package body Exp_Ch4 is
  return;
   end if;
 
-  --  Finally, rewrite the operation in the narrower type
+  --  Finally, rewrite the operation in the narrower type, but make sure
+  --  not to perform name resolution for the operator again.
 
   Nop := New_Op_Node (Kind, Sloc (N));
+  if Nkind (N) in N_Has_Entity then
+ Set_Entity (Nop, Entity (N));
+  end if;
 
   if Binary then
  Set_Left_Opnd (Nop, Convert_To (Ntyp, L));
-- { dg-do run }

procedure Generic_Comp is

   generic
  type Element_Type is private;
  type Index_Type is (<>);
  type Array_Type is array (Index_Type range <>) of Element_Type;
  with function ">" (Left, Right : Element_Type) return Boolean is <>;
   procedure Gen (Data: in out Array_Type);

   procedure Gen (Data: in out Array_Type) is
   begin
  if not (Data'Length > 1)
or else not (Integer'(Data'Length) > 1)
or else not Standard.">" (Data'Length, 1)
or else not Standard.">" (Integer'(Data'Length), 1)
  then
 raise Program_Error;
  end if;
   end;

   type My_Array is array (Positive range <>) of Integer;

   function Less_Than (L, R : Integer) return Boolean is
   begin
  return L < R;
   end;

   procedure Chk_Down is new Gen (Element_Type => Integer,
  Index_Type   => Positive,
  Array_Type   => My_Array,
  ">"  => Less_Than);

   Data : My_Array (1 .. 2);

begin
   Chk_Down (Data);
end;


[PATCH] Make graph dumping work for fn != cfun

2022-01-28 Thread Richard Biener via Gcc-patches
The following makes dumping of a function as graph work as intended
when specifying a function other than cfun.  Unfortunately the loop
and the dominance APIs are not set up to work for other functions
than cfun so you won't get any fancy loop dumps but the non-loop
dump works up to reaching mark_dfs_back_edges which I trivially made
function aware and adjusted current callers with a wrapper.
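
The wrapper pattern can be sketched with a toy CFG.  Everything here is a stand-in: `Function`, the recursive `dfs` helper and the global `cfun` pointer only imitate the shape of the real interfaces, and the real implementation uses an explicit stack and edge flags rather than recursion and a result vector.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Toy CFG: block 0 is the entry; succs[i] lists the successors of block i.
struct Function {
  std::vector<std::vector<int>> succs;
  std::vector<std::pair<int, int>> back_edges;  // filled in by the walk
};

// Stand-in for the global "current function" pointer.
static Function* cfun = nullptr;

static void dfs(Function& fn, int bb, std::vector<int>& state) {
  state[bb] = 1;  // on the DFS stack
  for (int dest : fn.succs[bb]) {
    if (state[dest] == 1)
      fn.back_edges.push_back({bb, dest});  // target is an ancestor
    else if (state[dest] == 0)
      dfs(fn, dest, state);
  }
  state[bb] = 2;  // finished
}

// Function-aware version: touches only the function it is given.
bool mark_dfs_back_edges(Function* fn) {
  fn->back_edges.clear();
  if (fn->succs.empty())
    return false;
  std::vector<int> state(fn->succs.size(), 0);
  dfs(*fn, 0, state);
  return !fn->back_edges.empty();
}

// Wrapper keeping the old zero-argument interface for existing callers.
bool mark_dfs_back_edges() { return mark_dfs_back_edges(cfun); }

bool demo_other_function() {
  Function current;
  current.succs = {{1}, {}};        // 0 -> 1, no loop
  Function source;
  source.succs = {{1}, {2}, {1}};   // 0 -> 1 -> 2 -> 1
  cfun = &current;
  bool in_current = mark_dfs_back_edges();
  bool in_source = mark_dfs_back_edges(&source);
  assert(source.back_edges.size() == 1);
  return !in_current && in_source;
}
```

Existing callers keep using the zero-argument form, while a caller that wants to look at another function passes it explicitly and never touches the global.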

With all this, doing dot-fn id->src_cfun from the debugger when
debugging inlining works.  Previously you got a strange mix of
the src and dest functions visualized ;)

Bootstrap & regtest running on x86_64-unknown-linux-gnu, will push
if it succeeds.

Richard.

2022-01-28  Richard Biener  

* cfganal.h (mark_dfs_back_edges): Provide API with struct
function argument.
* cfganal.cc (mark_dfs_back_edges): Take a struct function
to work on, add a wrapper passing cfun.
* graph.cc (draw_cfg_nodes_no_loops): Replace stray cfun
uses with fun which is already passed.
(draw_cfg_edges): Likewise.
(draw_cfg_nodes_for_loop): Do not use draw_cfg_nodes_for_loop
for fun != cfun.
---
 gcc/cfganal.cc | 28 +---
 gcc/cfganal.h  |  1 +
 gcc/graph.cc   | 15 ---
 3 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/gcc/cfganal.cc b/gcc/cfganal.cc
index e570d27768b..79c627a1716 100644
--- a/gcc/cfganal.cc
+++ b/gcc/cfganal.cc
@@ -58,7 +58,7 @@ private:
and heavily borrowed from pre_and_rev_post_order_compute.  */
 
 bool
-mark_dfs_back_edges (void)
+mark_dfs_back_edges (struct function *fun)
 {
   int *pre;
   int *post;
@@ -67,20 +67,20 @@ mark_dfs_back_edges (void)
   bool found = false;
 
   /* Allocate the preorder and postorder number arrays.  */
-  pre = XCNEWVEC (int, last_basic_block_for_fn (cfun));
-  post = XCNEWVEC (int, last_basic_block_for_fn (cfun));
+  pre = XCNEWVEC (int, last_basic_block_for_fn (fun));
+  post = XCNEWVEC (int, last_basic_block_for_fn (fun));
 
   /* Allocate stack for back-tracking up CFG.  */
-  auto_vec stack (n_basic_blocks_for_fn (cfun) + 1);
+  auto_vec stack (n_basic_blocks_for_fn (fun) + 1);
 
   /* Allocate bitmap to track nodes that have been visited.  */
-  auto_sbitmap visited (last_basic_block_for_fn (cfun));
+  auto_sbitmap visited (last_basic_block_for_fn (fun));
 
   /* None of the nodes in the CFG have been visited yet.  */
   bitmap_clear (visited);
 
   /* Push the first edge on to the stack.  */
-  stack.quick_push (ei_start (ENTRY_BLOCK_PTR_FOR_FN (cfun)->succs));
+  stack.quick_push (ei_start (ENTRY_BLOCK_PTR_FOR_FN (fun)->succs));
 
   while (!stack.is_empty ())
 {
@@ -94,8 +94,8 @@ mark_dfs_back_edges (void)
   ei_edge (ei)->flags &= ~EDGE_DFS_BACK;
 
   /* Check if the edge destination has been visited yet.  */
-  if (dest != EXIT_BLOCK_PTR_FOR_FN (cfun) && ! bitmap_bit_p (visited,
- dest->index))
+  if (dest != EXIT_BLOCK_PTR_FOR_FN (fun) && ! bitmap_bit_p (visited,
+dest->index))
{
  /* Mark that we have visited the destination.  */
  bitmap_set_bit (visited, dest->index);
@@ -112,14 +112,14 @@ mark_dfs_back_edges (void)
}
   else
{
- if (dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
- && src != ENTRY_BLOCK_PTR_FOR_FN (cfun)
+ if (dest != EXIT_BLOCK_PTR_FOR_FN (fun)
+ && src != ENTRY_BLOCK_PTR_FOR_FN (fun)
  && pre[src->index] >= pre[dest->index]
  && post[dest->index] == 0)
ei_edge (ei)->flags |= EDGE_DFS_BACK, found = true;
 
  if (ei_one_before_end_p (ei)
- && src != ENTRY_BLOCK_PTR_FOR_FN (cfun))
+ && src != ENTRY_BLOCK_PTR_FOR_FN (fun))
post[src->index] = postnum++;
 
  if (!ei_one_before_end_p (ei))
@@ -135,6 +135,12 @@ mark_dfs_back_edges (void)
   return found;
 }
 
+bool
+mark_dfs_back_edges (void)
+{
+  return mark_dfs_back_edges (cfun);
+}
+
 /* Find unreachable blocks.  An unreachable block will have 0 in
the reachable bit in block->flags.  A nonzero value indicates the
block is reachable.  */
diff --git a/gcc/cfganal.h b/gcc/cfganal.h
index 386cfbf211f..ac637de2b5a 100644
--- a/gcc/cfganal.h
+++ b/gcc/cfganal.h
@@ -49,6 +49,7 @@ private:
   bitmap_obstack m_bitmaps;
 };
 
+extern bool mark_dfs_back_edges (struct function *);
 extern bool mark_dfs_back_edges (void);
 extern void find_unreachable_blocks (void);
 extern void verify_no_unreachable_blocks (void);
diff --git a/gcc/graph.cc b/gcc/graph.cc
index 9990c8eccfd..bc29862fcad 100644
--- a/gcc/graph.cc
+++ b/gcc/graph.cc
@@ -169,14 +169,14 @@ draw_cfg_nodes_no_loops (pretty_printer *pp, struct function *fun)
   int *rpo = XNEWVEC (int, n_basic_blocks_for_fn (fun));
   int i, n;
 
-  auto_sbitmap visited (last_basic_block_for_fn (cfun));
+  auto_sbitmap visited 

Re: [Submitted, PR103970, Fortran, Coarray] Multi-image co_broadcast of derived type with allocatable components fails^

2022-01-28 Thread Andre Vehreschild via Gcc-patches
Hi Tobias,

oops, sorry, reverted immediately.

Regards,
Andre

On Fri, 28 Jan 2022 10:27:26 +0100
Tobias Burnus  wrote:

> Hi Andre,
> 
> your patch breaks bootstrapping:
> 
> ../../repos/gcc/gcc/fortran/trans-array.cc: In function ‘tree_node*
> structure_alloc_comps(gfc_symbol*, tree, tree, int, int, int,
> gfc_co_subroutines_args*)’:
> ../../repos/gcc/gcc/fortran/trans-array.cc:9200:42: error: ‘cdesc’ may be
> used uninitialized [-Werror=maybe-uninitialized]
>  9200 | gfc_conv_descriptor_data_set (, cdesc, comp);
>       | ~^~~~
> ../../repos/gcc/gcc/fortran/trans-array.cc:9082:16: note: ‘cdesc’ was
> declared here
>  9082 |   tree cdesc;
>       |^
> cc1plus: all warnings being treated as errors
> make[3]: *** [Makefile:1143: fortran/trans-array.o] Error 1
> 
> Tobias
> 
> On 28.01.22 10:07, Andre Vehreschild via Fortran wrote:
> > Hi Harald,
> >
> > thanks for the fast review. I have submitted as c9c48ab7bad.
> >
> > Will wait for two weeks (reminder set :-)) before backporting to gcc-11.
> >
> > Thank you and regards,
> >   Andre
> >
> > On Tue, 25 Jan 2022 22:30:22 +0100
> > Harald Anlauf via Fortran  wrote:
> >  
> >> Hi Andre',
> >>
> >> On 25.01.22 at 17:32, Andre Vehreschild via Fortran wrote:
> >>> Hi all,
> >>>
> >>> attached patch fixes wrong code generation when broadcasting a derived
> >>> type containing allocatable and non-allocatable scalars. Furthermore does
> >>> it prevent broadcasting of coarray-tokens, which are always local
> >>> this_image. Thus having them on a different image makes no sense.
> >>>
> >>> Bootstrapped and regtested ok on x86_64-linux/F35.
> >>>
> >>> Ok, for trunk and backport to 12 and 11-branch after decent time?
> >>>
> >>> I perceived that 12 is closed for this kind of bugfix, therefore asking ok
> >>> for 13.  
> >> I do not think that 12 is closed for bugfixing, especially not for
> >> fortran.  And if my cursory reading of the patch is not misleading,
> >> the impact of the patch is really limited to coarrays.
> >>
> >> You may want to wait for another 1-2 days for additional comments.
> >> If not, it is OK from my side.
> >>
> >> Thanks for the patch!
> >>
> >> Harald
> >>  
> >>> Regards,
> >>> Andre
> >>> --
> >>> Andre Vehreschild * Email: vehre ad gmx dot de  
> >>  
> >
> > --
> > Andre Vehreschild * Email: vehre ad gmx dot de  
> -
> Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
> München; limited liability company; Managing Directors: Thomas
> Heurung, Frank Thürauf; Registered office: München; Registration court:
> München, HRB 106955


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


Re: [Submitted, PR103970, Fortran, Coarray] Multi-image co_broadcast of derived type with allocatable components fails^

2022-01-28 Thread Tobias Burnus

Hi Andre,

your patch breaks bootstrapping:

../../repos/gcc/gcc/fortran/trans-array.cc: In function ‘tree_node* 
structure_alloc_comps(gfc_symbol*, tree, tree, int, int, int, 
gfc_co_subroutines_args*)’:
../../repos/gcc/gcc/fortran/trans-array.cc:9200:42: error: ‘cdesc’ may be used 
uninitialized [-Werror=maybe-uninitialized]
 9200 | gfc_conv_descriptor_data_set (, cdesc, comp);
  | ~^~~~
../../repos/gcc/gcc/fortran/trans-array.cc:9082:16: note: ‘cdesc’ was declared 
here
 9082 |   tree cdesc;
  |^
cc1plus: all warnings being treated as errors
make[3]: *** [Makefile:1143: fortran/trans-array.o] Error 1

Tobias

On 28.01.22 10:07, Andre Vehreschild via Fortran wrote:

> Hi Harald,
>
> thanks for the fast review. I have submitted as c9c48ab7bad.
>
> Will wait for two weeks (reminder set :-)) before backporting to gcc-11.
>
> Thank you and regards,
>   Andre
>
> On Tue, 25 Jan 2022 22:30:22 +0100
> Harald Anlauf via Fortran  wrote:
>
>> Hi Andre',
>>
>> On 25.01.22 at 17:32, Andre Vehreschild via Fortran wrote:
>>> Hi all,
>>>
>>> attached patch fixes wrong code generation when broadcasting a derived type
>>> containing allocatable and non-allocatable scalars. Furthermore does it
>>> prevent broadcasting of coarray-tokens, which are always local this_image.
>>> Thus having them on a different image makes no sense.
>>>
>>> Bootstrapped and regtested ok on x86_64-linux/F35.
>>>
>>> Ok, for trunk and backport to 12 and 11-branch after decent time?
>>>
>>> I perceived that 12 is closed for this kind of bugfix, therefore asking ok
>>> for 13.
>> I do not think that 12 is closed for bugfixing, especially not for
>> fortran.  And if my cursory reading of the patch is not misleading,
>> the impact of the patch is really limited to coarrays.
>>
>> You may want to wait for another 1-2 days for additional comments.
>> If not, it is OK from my side.
>>
>> Thanks for the patch!
>>
>> Harald
>>
>>> Regards,
>>> Andre
>>> --
>>> Andre Vehreschild * Email: vehre ad gmx dot de
>>
>
> --
> Andre Vehreschild * Email: vehre ad gmx dot de

-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
München; limited liability company; Managing Directors: Thomas
Heurung, Frank Thürauf; Registered office: München; Registration court:
München, HRB 106955


[PATCH] cfgrtl: Fix up locus comparison in unique_locus_on_edge_between_p [PR104237]

2022-01-28 Thread Jakub Jelinek via Gcc-patches
Hi!

The testcase in the PR (not included for the testsuite because we don't
have an (easy) way to -fcompare-debug LTO, we'd need 2 compilations/linking,
one with -g and one with -g0 and -fdump-rtl-final= at the end of lto1
and compare that) has different code generation for -g vs. -g0.

The difference appears during expansion, where we have a goto_locus
that is at -O0 compared to the INSN_LOCATION of the previous and next insn
across an edge.  With -g0 the locations are equal and so no nop is added.
With -g the locations aren't equal and so a nop is added holding that
location.

The reason for the different location lies in how lto1 streams in
locations.
We have lto_location_cache::apply_location_cache that is called with some
set of expanded locations, qsorts them, creates location_t's for those
and remembers the last expanded location.
When the expanded_location it reads in is equal to the last expanded
location, lto_location_cache::input_location_and_block just reuses the last
location_t (or adds/changes/removes LOCATION_BLOCK in it); when it is not,
it queues it for the next apply_location_cache.  Now, when streaming in -g
input, we can
see extra locations that don't appear with -g0, and if we are unlucky
enough, those can be sorted last during apply_location_cache and affect
what locations are used from the single entry cache next.
In particular, second apply_location_cache with non-empty loc_cache in
the testcase has 14 locations with -g0 and 16 with -g and those 2 extra
ones sort both last (they are the same).  The last one from -g0 then
appears to be input_location_and_block sourced again, for -g0 triggers
the single entry cache, while for -g it doesn't and so apply_location_cache
will create for it another location_t with the same content.

The following patch fixes it by comparing everything we care about in the
location, in addition to the simple location_t == location_t check, rather
than relying on that check alone.  I think we don't care about the sysp flag
for debug info...
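
The underlying idea can be shown with a toy model.  This is only an illustrative sketch: `Expanded`, `Handle` and `LocationTable` are hypothetical stand-ins, not GCC's location_t machinery, and the real loc_equal in the patch below additionally compares the `data` field and uses filename_cmp for file names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins; not GCC's location_t / expanded_location.
struct Expanded { std::string file; int line; int column; };
using Handle = unsigned;

// A table that hands out a fresh handle on every insertion, so the same
// file/line/column can end up behind two different handles.
struct LocationTable {
  std::vector<Expanded> entries;
  Handle add(const Expanded& e) {
    entries.push_back(e);
    return static_cast<Handle>(entries.size() - 1);
  }
  const Expanded& expand(Handle h) const { return entries[h]; }
};

// Fast path on identical handles, otherwise compare the expanded content.
bool loc_equal(const LocationTable& t, Handle a, Handle b) {
  if (a == b)
    return true;
  const Expanded& ea = t.expand(a);
  const Expanded& eb = t.expand(b);
  return ea.line == eb.line && ea.column == eb.column && ea.file == eb.file;
}

bool demo_duplicate_handles_compare_equal() {
  LocationTable t;
  Handle first = t.add({"a.c", 10, 3});
  Handle second = t.add({"a.c", 10, 3});  // same content, new handle
  Handle other = t.add({"a.c", 11, 3});
  assert(first != second);
  return loc_equal(t, first, second) && !loc_equal(t, first, other);
}
```

Comparing handles alone would treat `first` and `second` as different locations even though they describe the same source position.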

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-01-28  Jakub Jelinek  

PR lto/104237
* cfgrtl.cc (loc_equal): New function.
(unique_locus_on_edge_between_p): Use it.

--- gcc/cfgrtl.cc.jj2022-01-18 11:58:58.947991128 +0100
+++ gcc/cfgrtl.cc   2022-01-27 19:32:13.949937750 +0100
@@ -778,6 +778,29 @@ rtl_split_block (basic_block bb, void *i
   return new_bb;
 }
 
+/* Return true if LOC1 and LOC2 are equivalent for
+   unique_locus_on_edge_between_p purposes.  */
+
+static bool
+loc_equal (location_t loc1, location_t loc2)
+{
+  if (loc1 == loc2)
+    return true;
+
+  expanded_location loce1 = expand_location (loc1);
+  expanded_location loce2 = expand_location (loc2);
+
+  if (loce1.line != loce2.line
+      || loce1.column != loce2.column
+      || loce1.data != loce2.data)
+    return false;
+  if (loce1.file == loce2.file)
+    return true;
+  return (loce1.file != NULL
+          && loce2.file != NULL
+          && filename_cmp (loce1.file, loce2.file) == 0);
+}
+
 /* Return true if the single edge between blocks A and B is the only place
in RTL which holds some unique locus.  */
 
@@ -796,7 +819,7 @@ unique_locus_on_edge_between_p (basic_bl
   while (insn != end && (!NONDEBUG_INSN_P (insn) || !INSN_HAS_LOCATION (insn)))
     insn = PREV_INSN (insn);
 
-  if (insn != end && INSN_LOCATION (insn) == goto_locus)
+  if (insn != end && loc_equal (INSN_LOCATION (insn), goto_locus))
     return false;
 
   /* Then scan block B forward.  */
@@ -808,7 +831,7 @@ unique_locus_on_edge_between_p (basic_bl
         insn = NEXT_INSN (insn);
 
       if (insn != end && INSN_HAS_LOCATION (insn)
-          && INSN_LOCATION (insn) == goto_locus)
+          && loc_equal (INSN_LOCATION (insn), goto_locus))
         return false;
     }
 

Jakub



[Submitted, PR103970, Fortran, Coarray] Multi-image co_broadcast of derived type with allocatable components fails

2022-01-28 Thread Andre Vehreschild via Gcc-patches
Hi Harald,

thanks for the fast review. I have submitted the patch as c9c48ab7bad.

I will wait for two weeks (reminder set :-)) before backporting to gcc-11.

Thank you and regards,
Andre

On Tue, 25 Jan 2022 22:30:22 +0100
Harald Anlauf via Fortran  wrote:

> Hi Andre',
>
> Am 25.01.22 um 17:32 schrieb Andre Vehreschild via Fortran:
> > Hi all,
> >
> > the attached patch fixes wrong code generation when broadcasting a derived
> > type containing allocatable and non-allocatable scalars. Furthermore, it
> > prevents broadcasting of coarray-tokens, which are always local to
> > this_image. Thus having them on a different image makes no sense.
> >
> > Bootstrapped and regtested ok on x86_64-linux/F35.
> >
> > OK for trunk and backport to 12 and 11-branch after a decent time?
> >
> > I perceived that 12 is closed for this kind of bugfix, therefore I am
> > asking for OK for 13.
>
> I do not think that 12 is closed for bugfixing, especially not for
> fortran.  And if my cursory reading of the patch is not misleading,
> the impact of the patch is really limited to coarrays.
>
> You may want to wait for another 1-2 days for additional comments.
> If not, it is OK from my side.
>
> Thanks for the patch!
>
> Harald
>
> > Regards,
> > Andre
> > --
> > Andre Vehreschild * Email: vehre ad gmx dot de
>
>


--
Andre Vehreschild * Email: vehre ad gmx dot de