RE: [PATCH 5/21]middle-end: update vectorizer's control update to support picking an exit other than loop latch

2023-11-13 Thread Richard Biener
On Mon, 13 Nov 2023, Tamar Christina wrote:

> > -----Original Message-----
> > From: Richard Biener 
> > Sent: Tuesday, November 7, 2023 3:04 PM
> > To: Tamar Christina 
> > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com
> > Subject: Re: [PATCH 5/21]middle-end: update vectorizer's control update to
> > support picking an exit other than loop latch
> > 
> > On Mon, 6 Nov 2023, Tamar Christina wrote:
> > 
> > > Hi All,
> > >
> > > As requested, the vectorizer is now free to pick its own exit which
> > > can be different than what the loop CFG infrastructure uses.  The
> > > vectorizer makes use of this to vectorize loops that it previously could
> > > not.
> > >
> > > But this means that loop control must be materialized in the block
> > > that needs it lest we corrupt the SSA chain.  This makes it so we use
> > > the vectorizer's main IV block instead of the loop infra.
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> > >
> > > Ok for master?
> > >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > >   * tree-ssa-loop-manip.cc (standard_iv_increment_position):
> > Conditionally
> > >   take dest BB.
> > >   * tree-ssa-loop-manip.h (standard_iv_increment_position): Likewise.
> > >   * tree-vect-loop-manip.cc (vect_set_loop_controls_directly): Use it.
> > >   (vect_set_loop_condition_partial_vectors_avx512): Likewise.
> > >   (vect_set_loop_condition_normal): Likewise.
> > >
> > > --- inline copy of patch --
> > > diff --git a/gcc/tree-ssa-loop-manip.h b/gcc/tree-ssa-loop-manip.h
> > > index bda09f51d5619420331c513a9906831c779fd2b4..5938588c8882d842b00301423df111cbe7bf7ba8 100644
> > > --- a/gcc/tree-ssa-loop-manip.h
> > > +++ b/gcc/tree-ssa-loop-manip.h
> > > @@ -38,7 +38,8 @@ extern basic_block split_loop_exit_edge (edge, bool = false);
> > >  extern basic_block ip_end_pos (class loop *);
> > >  extern basic_block ip_normal_pos (class loop *);
> > >  extern void standard_iv_increment_position (class loop *,
> > > -                                            gimple_stmt_iterator *, bool *);
> > > +                                            gimple_stmt_iterator *, bool *,
> > > +                                            basic_block = NULL);
> > >  extern bool
> > >  gimple_duplicate_loop_body_to_header_edge (class loop *, edge, unsigned int,
> > >                                             sbitmap, edge, vec<edge> *, int);
> > > diff --git a/gcc/tree-ssa-loop-manip.cc b/gcc/tree-ssa-loop-manip.cc
> > > index e7436915e01297e7af2a3bcf1afd01e014de6f32..bdc7a3d74a788f450ca5dde6c29492ce4d4e4550 100644
> > > --- a/gcc/tree-ssa-loop-manip.cc
> > > +++ b/gcc/tree-ssa-loop-manip.cc
> > > @@ -792,14 +792,19 @@ ip_normal_pos (class loop *loop)
> > >
> > >  /* Stores the standard position for induction variable increment in LOOP
> > >     (just before the exit condition if it is available and latch block is empty,
> > > -   end of the latch block otherwise) to BSI.  INSERT_AFTER is set to true if
> > > -   the increment should be inserted after *BSI.  */
> > > +   end of the latch block otherwise) to BSI.  If DEST_BB is specified then that
> > > +   basic block is used as the destination instead of the loop latch source
> > > +   block.  INSERT_AFTER is set to true if the increment should be inserted after
> > > +   *BSI.  */
> > >
> > >  void
> > >  standard_iv_increment_position (class loop *loop, gimple_stmt_iterator *bsi,
> > > -                                bool *insert_after)
> > > +                                bool *insert_after, basic_block dest_bb)
> > >  {
> > > -  basic_block bb = ip_normal_pos (loop), latch = ip_end_pos (loop);
> > > +  basic_block bb = dest_bb;
> > > +  if (!bb)
> > > +    bb = ip_normal_pos (loop);
> > > +  basic_block latch = ip_end_pos (loop);
> > 
> > I don't think that's a good API extension.  Given that we don't support an
> > early exit after the main IV exit doesn't this code already work fine as-is?
> > It chooses the last exit.  The position is also not semantically relevant, we
> > just try to keep the latch empty here (that is, it's a bit of a "bad" API).
> > 
> > So, do you really need this change?
> 
> Yes I do.  If you look at these kinds of loops
> https://gist.github.com/Mistuke/66f14fe5c1be32b91ce149bd9b8bb35f
>
> you'll see that the main exit, i.e. the one attached to the latch block, is
> the early break.  Because SCEV can't analyze it, it picks the main exit to be
> the one in BB4.
>
> This means that the loop control must be placed in BB4.  If we place
> ivtmp_10 = ivtmp_9 - 1 in BB3 then that's broken SSA.  If we use `ivtmp_9`
> in BB4 then we'll have an off-by-one issue.

OK, but then I think the fix is to not use standard_iv_increment_position
(it's a weird API anyway).  Instead insert before the main exit condition.

Btw, I assumed this order of main / early exit cannot happen.  But I
didn't re-review the main exit identification code yet.


Re: [PATCH v4] gcc: Introduce -fhardened

2023-11-13 Thread Richard Biener
On Fri, Nov 3, 2023 at 11:51 PM Marek Polacek  wrote:
>
> On Thu, Oct 26, 2023 at 05:55:56PM +0200, Richard Biener wrote:
> >
> >
> > > On 24.10.2023 at 21:09, Marek Polacek wrote:
> > >
> > > On Tue, Oct 24, 2023 at 09:22:25AM +0200, Richard Biener wrote:
> > >>> On Mon, Oct 23, 2023 at 9:26 PM Marek Polacek wrote:
> > >>>
> > >>> On Thu, Oct 19, 2023 at 02:24:11PM +0200, Richard Biener wrote:
> > >>>> Can you see how our
> > >>>> primary and secondary targets (+ host OS) behave here?
> > >>>
> > >>> That's very reasonable.  I tried to build gcc on Compile Farm 119 (AIX)
> > >>> but that fails with:
> > >>>
> > >>> ar  -X64 x ../ppc64/libgcc/libgcc_s.a shr.o
> > >>> ar: 0707-100 ../ppc64/libgcc/libgcc_s.a does not exist.
> > >>> make[2]: *** [/home/polacek/gcc/libgcc/config/rs6000/t-slibgcc-aix:98: all] Error 1
> > >>> make[2]: Leaving directory '/home/polacek/x/trunk/powerpc-ibm-aix7.3.1.0/libgcc'
> > >>>
> > >>> and I tried Darwin (104) and that fails with
> > >>>
> > >>> *** Configuration aarch64-apple-darwin21.6.0 not supported
> > >>>
> > >>> Is anyone else able to build gcc on those machines, or test the attached
> > >>> patch?
> > >>>
> > >>>> I think the
> > >>>> documentation should elaborate a bit on expectations for non-Linux/GNU
> > >>>> targets, specifically I think the default configuration for a target
> > >>>> should with -fhardened _not_ have any -Whardened diagnostics.  Maybe we can
> > >>>> have a testcase for this?
> > >>>
> > >>> Sorry, I'm not sure how to test that.  I suppose if -fhardened enables
> > >>> something not supported on those systems, and it's something for which
> > >>> we have a configure test, then we shouldn't warn.  This is already the
> > >>> case for -pie, -z relro, and -z now.
> > >>
> > >> I was thinking of
> > >>
> > >> /* { dg-do compile } */
> > >> /* { dg-additional-options "-fhardened -Whardened" } */
> > >>
> > >> int main () {}
> > >>
> > >> and excess errors should catch "misconfigurations"?
> > >
> > > I see.  fhardened-3.c is basically just like this (-Whardened is on by 
> > > default).
> > >
> > >>> Should the docs say something like the following for features without
> > >>> configure checks?
> > >>>
> > >>> @option{-fhardened} can, on certain systems, attempt to enable features
> > >>> not supported on that particular system.  In that case, it's possible to
> > >>> prevent the warning using the @option{-Wno-hardened} option.
> > >>
> > >> Yeah, but ideally
> > >>
> > >> @option{-fhardened} can, on certain systems, not enable features not
> > >> available on those systems and @option{-Whardened} will not diagnose
> > >> those as missing.
> > >>
> > >> But I understand it doesn't work like that?
> > >
> > > Right.  It will not diagnose missing features if they have a configure
> > > check, otherwise it will.  And I don't know if we want a configure check
> > > for every feature.  Maybe we can add them in the future if the current
> > > patch turns out to be problematical in practice?
> >
> > Maybe we can have a switch on known target triples and statically configure
> > based on that, eventually even not support -fhardened for targets not listed.
> > That’s certainly easier than detecting the target system features (think of
> > cross compilers).
>
> You mean like the following?  The only difference is the addition of
> HAVE_FHARDENED_SUPPORT and updating the tests to only run on gnu/linux
> targets.  If other OSs want to use -fhardened, they need to update the
> configure test.  Thanks,

Yes, something like this.  IMHO we should aim to at least support all
our primary platforms (and maybe secondary if they have a relevant
host OS part).

> Bootstrapped/regtested on x86_64-pc-linux-gnu and
> powerpc64le-unknown-linux-gnu.
> -- >8 --
> In 
> I proposed -fhardened, a new umbrella option that enables a reasonable set
> of hardening flags.  The read of the room seems to be that the option
> would be useful.  So here's a patch implementing that option.
>
> Currently, -fhardened enables:
>
>   -D_FORTIFY_SOURCE=3 (or =2 for older glibcs)
>   -D_GLIBCXX_ASSERTIONS
>   -ftrivial-auto-var-init=zero
>   -fPIE  -pie  -Wl,-z,relro,-z,now
>   -fstack-protector-strong
>   -fstack-clash-protection
>   -fcf-protection=full (x86 GNU/Linux only)
>
> -fhardened will not override options that were specified on the command line
> (before or after -fhardened).  For example,
>
>  -D_FORTIFY_SOURCE=1 -fhardened
>
> means that _FORTIFY_SOURCE=1 will be used.  Similarly,
>
>   -fhardened -fstack-protector
>
> will not enable -fstack-protector-strong.
>
> Currently, -fhardened is only supported on GNU/Linux.
>
> In DW_AT_producer it is reflected only as -fhardened; it doesn't expand
> to anything.  This patch provides -Whardened, enabled by default, which
> warns when -fhardened couldn't enable a particular option.  I think most
> often it will 

Re: [PATCH 1/6] c-family: Introduce pedpermerror

2023-11-13 Thread Florian Weimer
* Florian Weimer:

> It turns out that permerror_opt is not directly usable for
> -fpermissive in the C front end.  The front end uses pedwarn
> extensively, and pedwarns are not overridable by -Wno-* options,
> yet permerrors are.  Add new pedpermerror helpers that turn into
> pedwarns if -pedantic-errors is active.
>
> Due to the dependency on flag_pedantic_errors, the new helpers
> are specific to the C-family front ends.  For implementing the
> rich location variant, export emit_diagnostic_valist from
> gcc/diagnostic.cc in parallel to its location_t variant.
>
> gcc/
>
>   * diagnostic-core.h (emit_diagnostic_valist): Declare function.
>   * diagnostic.cc (emit_diagnostic_valist): Define it.
>
> gcc/c-family/
>
>   * c-common.h (pedpermerror): Declare functions.
>   * c-warn.cc (pedpermerror): Define them.

Jason suggested off-list that this shouldn't be necessary, and the
description of -pedantic-errors is wrong (it is possible to undo
-pedantic-errors effects with -Wno-error=…).  The permerror_opt
interface should already do what I need.

It turns out that I was very unlucky and picked -Wreturn-type for my
tests.

This:

long i = "abc";
volatile j;
int f (void) { return; }

Gives, with GCC 13:

$ gcc -pedantic-errors -Wno-error=implicit-int -Wno-error=int-conversion -Wno-error=return-type test.c
test.c:1:10: warning: initialization of ‘long int’ from ‘char *’ makes integer from pointer without a cast [-Wint-conversion]
    1 | long i = "abc";
      |          ^
test.c:2:10: warning: type defaults to ‘int’ in declaration of ‘j’ [-Wimplicit-int]
    2 | volatile j;
      |          ^
test.c: In function ‘f’:
test.c:3:16: error: ‘return’ with no value, in function returning non-void
    3 | int f (void) { return; }
      |                ^~
test.c:3:5: note: declared here
    3 | int f (void) { return; }
      |     ^
This happens because we drop the OPT_Wreturn_type in some cases:

  if (flag_isoc99)
    warned_here = pedwarn
      (loc, warn_return_type >= 0 ? OPT_Wreturn_type : 0,
       "%<return%> with no value, in function returning non-void");
  else
    warned_here = warning_at
      (loc, OPT_Wreturn_type,
       "%<return%> with no value, in function returning non-void");

And for the other direction:

  if (TREE_CODE (TREE_TYPE (retval)) != VOID_TYPE)
    warned_here = pedwarn
      (xloc, warn_return_type >= 0 ? OPT_Wreturn_type : 0,
       "%<return%> with a value, in function returning void");

I think with the -Wreturn-mismatch split, we can drop the
warn_return_type >= 0 condition, and then permerror_opt should indeed
do the right thing.
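
A minimal sketch of why the option index matters, under a deliberately
simplified model: a pedwarn becomes an error with -pedantic-errors unless
a -Wno-error= for the option it was issued with is in effect.  All names
ending in _sketch are hypothetical and not part of GCC, and treating
warn_return_type as negative unless the option is given explicitly is an
assumption of this sketch, not something taken from the sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not GCC's real option indices or
   diagnostic kinds.  */
enum opt_sketch { OPT_SK_NONE = 0, OPT_SK_RETURN_TYPE = 1 };
enum kind_sketch { KIND_SK_WARNING, KIND_SK_ERROR };

struct flags_sketch
{
  bool pedantic_errors;       /* -pedantic-errors was given.  */
  bool no_error_return_type;  /* -Wno-error=return-type was given.  */
  int warn_return_type;       /* Assumed -1 unless set explicitly.  */
};

/* Modelled pedwarn: an error under -pedantic-errors unless a matching
   -Wno-error= names the option the diagnostic was issued with.  */
enum kind_sketch
pedwarn_kind_sketch (const struct flags_sketch *f, enum opt_sketch opt)
{
  assert (f != NULL);
  if (!f->pedantic_errors)
    return KIND_SK_WARNING;
  if (opt == OPT_SK_RETURN_TYPE && f->no_error_return_type)
    return KIND_SK_WARNING;
  return KIND_SK_ERROR;
}

/* Option index as passed by the code quoted above.  */
enum opt_sketch
current_opt_sketch (const struct flags_sketch *f)
{
  return f->warn_return_type >= 0 ? OPT_SK_RETURN_TYPE : OPT_SK_NONE;
}

/* Option index once the warn_return_type >= 0 condition is dropped.  */
enum opt_sketch
proposed_opt_sketch (const struct flags_sketch *f)
{
  (void) f;
  return OPT_SK_RETURN_TYPE;
}
```

In this model, with -pedantic-errors and -Wno-error=return-type in effect
and warn_return_type left negative, the current variant still yields an
error (option index 0 cannot be matched), while the proposed variant
yields a warning.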

I'll write the kitchen sink test now, use that to verify this theory,
and repost as appropriate.

Thanks,
Florian



[PATCH] tree: Handle BITINT_TYPE in type_contains_placeholder_1 [PR112511]

2023-11-13 Thread Jakub Jelinek
Hi!

The following testcase ICEs because BITINT_TYPE isn't handled in
type_contains_placeholder_1.  Given that Ada doesn't emit it, it doesn't
matter that much where exactly we handle it as right now it should never
contain a placeholder; I've picked the same spot as INTEGER_TYPE, but if
you prefer e.g. the one with OFFSET_TYPE above, I can move it there too.
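
A minimal sketch of the shape of the fix, assuming a much reduced stand-in
for the real function: a switch over type codes in which an unlisted code
falls into a default that the real code would treat as unreachable.  The
enum, struct and function names below are hypothetical and only mirror
the structure of the hunk; returning -1 stands in for the ICE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of type codes.  */
enum type_code_sketch
{
  SK_VOID_TYPE,
  SK_INTEGER_TYPE,
  SK_BITINT_TYPE,
  SK_REAL_TYPE,
  SK_FIXED_POINT_TYPE,
  SK_UNLISTED_TYPE
};

struct type_sketch
{
  enum type_code_sketch code;
  bool min_has_placeholder;   /* Lower bound contains a placeholder.  */
  bool max_has_placeholder;   /* Upper bound contains a placeholder.  */
};

/* Returns 1 or 0 for handled codes and -1 for an unhandled one.  */
int
contains_placeholder_sketch (const struct type_sketch *t)
{
  assert (t != NULL);
  switch (t->code)
    {
    case SK_VOID_TYPE:
      return 0;

    case SK_INTEGER_TYPE:
    case SK_BITINT_TYPE:   /* The added case: shares the bounds check.  */
    case SK_REAL_TYPE:
    case SK_FIXED_POINT_TYPE:
      /* Here we just check the bounds.  */
      return t->min_has_placeholder || t->max_has_placeholder;

    default:
      return -1;
    }
}
```

Without the SK_BITINT_TYPE label the bit-precise type would reach the
default branch, which is the situation the testcase triggers.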

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2023-11-14  Jakub Jelinek  

PR middle-end/112511
* tree.cc (type_contains_placeholder_1): Handle BITINT_TYPE like
INTEGER_TYPE.

* gcc.dg/pr112511.c: New test.

--- gcc/tree.cc.jj  2023-11-09 09:04:20.357518740 +0100
+++ gcc/tree.cc 2023-11-13 14:57:33.001715799 +0100
@@ -4210,6 +4210,7 @@ type_contains_placeholder_1 (const_tree
   return false;
 
 case INTEGER_TYPE:
+case BITINT_TYPE:
 case REAL_TYPE:
 case FIXED_POINT_TYPE:
   /* Here we just check the bounds.  */
--- gcc/testsuite/gcc.dg/pr112511.c.jj  2023-11-13 15:00:13.872457749 +0100
+++ gcc/testsuite/gcc.dg/pr112511.c 2023-11-13 14:59:59.933653398 +0100
@@ -0,0 +1,17 @@
+/* PR middle-end/112511 */
+/* { dg-do compile { target bitint } } */
+/* { dg-options "-O2" } */
+
+struct T { _BitInt(22) a; };
+
+void
+bar (struct T t)
+{
+}
+
+void
+foo (void)
+{
+  struct T t;
+  bar (t);
+}

Jakub



RE: [RFC] Intel AVX10.1 Compiler Design and Support

2023-11-13 Thread Jiang, Haochen
> > > > I wonder whether adoption could be made easier by also providing a
> > > > -mavx10[.0] level that removes some of the more obscure sub-ISA
> > > > requirements to cover more existing implementations (I'd not add 
> > > > -mavx10.0-512 here).
> > > > I'd require only skylake-AVX512 features here, basically all
> > > > non-KNL AVX512 CPUs should have a "virtual" AVX10 level that
> > > > allows to use that feature set,
> > >
> > > We have -mno-evex512 can cover those cases, so what you want is like
> > > a simple alias of "-march=skylake-avx512 -mno-evex512"?
> >
> > For the AVX512 enabled sub-isas of skylake-avx512 yes I guess.
> >
> > > > restricted to 256bits so future AVX10-256 implementations can
> > > > handle it as well as all existing (and relevant, which excludes
> > > > KNL) AVX512 implementations.
> > > >
> > > > Otherwise AVX10 is really a hard sell (as AVX512 was originally).
> > >
> > > It's a rebranding of the existing AVX512 to AVX10, AVX10.0  just
> > > complicated things further(considering we already have x86-64-v4
> > > which is different from skylake-avx512).
> >
> > Well, the cut-off for "AVX512" is quite arbitrary.  Introducing a
> > "new" ISA that's only available in HW available in the future and
> > suggesting users to embrace that already (like Intel did with AVX512
> > without offering client SKU support) is a hard sell.
> >
> > I realize Intel thinks client SKU support for AVX10 (restricted to
> > 256bit) will be "easier".  But then don't expect anybody to adopt that in 
> > the next 10 years.
> >
> > Just to add - we were suggesting to use x86_64-v3 for the "next"
> > enterprise product but got downvoted to x86_64-v2 for compatibility reasons.
> >
> > If it were possible I'd axe x86_64-v4.  Maybe we should add a
> > x86_64-v3.5 that sits inbetween v3 and v4, offering AVX512 but
> > restricted to 256bit (and obviously not requiring more of the AVX512 
> > features that v4 requires).
>
> The arch level is indeed a problem, especially since the default size of
> avx10 is 256.
> +Florian Weimer for more inputs.

IMO, the AVX10.1 options should be there, and the arch level issue should not
affect the existence of this series of options.

The issue we are currently facing is really about the arch level, since
we have already defined x86-64-v4.  "-march=skylake-server -mno-evex512" is
much like an x86-64-v4-256.

Thx,
Haochen


[PATCH] c++/modules: Restrict partitions when parsing import declarations [PR110808]

2023-11-13 Thread Nathaniel Shead
I'll also note that the comments above the parsing functions here no
longer exactly match with the grammar in the standard, should they be
updated as well?

Bootstrapped and regtested on x86_64-pc-linux-gnu. I don't have write
access.

-- >8 --

Currently we allow declarations like 'import A:B', even from an
unrelated source file (not part of module A), which causes errors in
merging declarations. However, the syntax in [module.import] doesn't
even allow this form of import, so this patch prevents this from
parsing at all and avoids the error that way.
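
A minimal sketch of the intended behaviour, assuming a toy parser that
works on a plain string instead of the real token stream.  The function
names are hypothetical, identifiers are simplified to runs of letters,
digits and underscores, and the handling of a leading ':' is an assumption
of the sketch rather than a copy of the parser.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns the number of characters of S consumed as a module name, or 0
   if an identifier was expected but not found.  When IMPORT_P is true a
   ':' following the module name is left unconsumed.  */
size_t
parse_module_name_sketch (const char *s, bool import_p)
{
  assert (s != NULL);
  size_t pos = 0;
  bool partitioned = false;

  /* A leading ':' names a partition of the current module, as in
     "import :tres;".  */
  if (s[pos] == ':')
    {
      partitioned = true;
      pos++;
    }

  for (;;)
    {
      size_t start = pos;
      while (isalnum ((unsigned char) s[pos]) || s[pos] == '_')
        pos++;
      if (pos == start)
        return 0;

      if (!partitioned && !import_p && s[pos] == ':')
        partitioned = true;
      else if (s[pos] != '.')
        break;
      pos++;
    }
  return pos;
}

/* An import declaration is accepted only if ';' follows the name.  */
bool
import_accepted_sketch (const char *s)
{
  size_t n = parse_module_name_sketch (s, true);
  return n > 0 && s[n] == ';';
}
```

With this shape "import group:tres;" stops at the ':' and then fails the
check for ';', while "import :tres;" and "import a.b;" still go through.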

PR c++/110808

gcc/cp/ChangeLog:

* parser.cc (cp_parser_module_name): Add param import_p.
Disallow partitions after module-name for import decls.
(cp_parser_module_declaration): Pass import_p = false.
(cp_parser_import_declaration): Pass import_p = true.

gcc/testsuite/ChangeLog:

* g++.dg/modules/part-hdr-1_c.C: Fix syntax.
* g++.dg/modules/part-mac-1_c.C: Likewise.
* g++.dg/modules/part-8_a.C: New test.
* g++.dg/modules/part-8_b.C: New test.
* g++.dg/modules/part-8_c.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/parser.cc                            | 11 ++++++-----
 gcc/testsuite/g++.dg/modules/part-8_a.C     |  6 ++++++
 gcc/testsuite/g++.dg/modules/part-8_b.C     |  6 ++++++
 gcc/testsuite/g++.dg/modules/part-8_c.C     |  8 ++++++++
 gcc/testsuite/g++.dg/modules/part-hdr-1_c.C |  2 +-
 gcc/testsuite/g++.dg/modules/part-mac-1_c.C |  2 +-
 6 files changed, 28 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/part-8_a.C
 create mode 100644 gcc/testsuite/g++.dg/modules/part-8_b.C
 create mode 100644 gcc/testsuite/g++.dg/modules/part-8_c.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 5116bcb78f6..72a5c52313d 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -14858,10 +14858,11 @@ cp_parser_already_scoped_statement (cp_parser* parser, bool *if_p,
module-name . identifier
header-name
 
-   Returns a pointer to module object, NULL.   */
+   Returns a pointer to module object, or NULL.  Also parses an optional
+   module-partition after the module-name unless IMPORT_P is true.  */
 
 static module_state *
-cp_parser_module_name (cp_parser *parser)
+cp_parser_module_name (cp_parser *parser, bool import_p)
 {
   cp_token *token = cp_lexer_peek_token (parser->lexer);
   if (token->type == CPP_HEADER_NAME)
@@ -14890,7 +14891,7 @@ cp_parser_module_name (cp_parser *parser)
   tree name = cp_lexer_consume_token (parser->lexer)->u.value;
   parent = get_module (name, parent, partitioned);
   token = cp_lexer_peek_token (parser->lexer);
-  if (!partitioned && token->type == CPP_COLON)
+  if (!partitioned && !import_p && token->type == CPP_COLON)
partitioned = true;
   else if (token->type != CPP_DOT)
break;
@@ -14961,7 +14962,7 @@ cp_parser_module_declaration (cp_parser *parser, module_parse mp_state,
 }
   else
 {
-  module_state *mod = cp_parser_module_name (parser);
+  module_state *mod = cp_parser_module_name (parser, /*import_p=*/false);
   tree attrs = cp_parser_attributes_opt (parser);
 
   mp_state = MP_PURVIEW_IMPORTS;
@@ -15004,7 +15005,7 @@ cp_parser_import_declaration (cp_parser *parser, module_parse mp_state,
 }
   else
 {
-  module_state *mod = cp_parser_module_name (parser);
+  module_state *mod = cp_parser_module_name (parser, /*import_p=*/true);
   tree attrs = cp_parser_attributes_opt (parser);
 
   if (!mod || !cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON))
diff --git a/gcc/testsuite/g++.dg/modules/part-8_a.C b/gcc/testsuite/g++.dg/modules/part-8_a.C
new file mode 100644
index 00000000000..09f956ff36f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/part-8_a.C
@@ -0,0 +1,6 @@
+// PR c++/110808
+// { dg-additional-options "-fmodules-ts" }
+// { dg-module-cmi group:tres }
+
+export module group:tres;
+int mul() { return 0; }
diff --git a/gcc/testsuite/g++.dg/modules/part-8_b.C b/gcc/testsuite/g++.dg/modules/part-8_b.C
new file mode 100644
index 00000000000..1ade029495c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/part-8_b.C
@@ -0,0 +1,6 @@
+// PR c++/110808
+// { dg-additional-options "-fmodules-ts" }
+// { dg-module-cmi group }
+
+export module group;
+export import :tres;
diff --git a/gcc/testsuite/g++.dg/modules/part-8_c.C b/gcc/testsuite/g++.dg/modules/part-8_c.C
new file mode 100644
index 00000000000..2351f28f909
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/part-8_c.C
@@ -0,0 +1,8 @@
+// PR c++/110808
+// { dg-additional-options "-fmodules-ts" }
+
+import group:tres;  // { dg-error "expected .;." }
+
+int main() {
+  return mul();  // { dg-error "not declared" }
+}
diff --git a/gcc/testsuite/g++.dg/modules/part-hdr-1_c.C b/gcc/testsuite/g++.dg/modules/part-hdr-1_c.C
index 78a53d2fda3..db57adcef44 100644
--- a/gcc/testsuite/g++.dg/modules/part-hdr-1_c.C
+++ 

Re: [PATCH V3 0/7] ira/lra: Support subreg coalesce

2023-11-13 Thread Lehua Ding

Hi Vladimir,

On 2023/11/14 3:37, Vladimir Makarov wrote:
> On 11/12/23 07:08, Lehua Ding wrote:
> > V3 Changes:
> >    1. fix three ICE.
> >    2. rebase
> >
> > Hi,
> >
> > These patches try to support subreg coalesce feature in
> > register allocation passes (ira and lra).
>
> I've started review of v3 patches and here is my initial general
> criticism of your patches:
>
>    * Absence of comments for some functions, e.g. for `HARD_REG_SET
>      operator>> (unsigned int shift_amount) const`.
>
>    * Adding significant functionality to existing functions is not
>      reflected in the function comment, e.g. in ira_set_allocno_class.
>
>    * A lot of typos, e.g. `pesudo` or `reprensent`.  I think you need to
>      check spelling of your comments (I myself do spell checking in emacs
>      by ispell-region command).
>
>    * Grammar mistakes, e.g. `Flag means need track subreg live range for
>      the allocno`.  I understand English is not your native language (as
>      for me).  In case of some doubts I'd recommend to check grammar in
>      ChatGPT (Proofread:  text).
>
>    * Some local variables use upper case letters (e.g. `int A`) which
>      should be used for macros or enums according to GNU coding standard
>      (https://www.gnu.org/prep/standards/standards.html).
>
>    * Sometimes you put one space at the end of sentence.  Please see GNU
>      coding standard and GCC coding conventions
>      (https://gcc.gnu.org/codingconventions.html)
>
>    * There is no uniformity in your code, e.g. sometimes you use `i++`,
>      sometimes `++i` or `i += 1`.  Although the uniformity is not necessary,
>      it makes a better impression about the patches.

Sorry for these issues, I'll address all those comments.

> I also did not find what targets did you use for testing.  I am asking
> this because I see new testsuite failures (apx-spill_to_egprs-1.c) even
> on x86-64.  It might be nothing as the test expects a specific code
> generation.

I tested on x86, aarch64 and riscv not long ago, but it looks like
I'm missing something.  I just tested locally with the latest code and
also reproduced the failure you mentioned, along with a C++ failure
(pr106877.C).  I'll have a look at the cause.

> Also besides testing major targets I'd recommend testing at least one
> big endian target (I'd recommend ppc64be.  gcc110.fsfrance.org could be
> used for this).  Plenty RA issues occur because BE targets are not tested.

The address you gave looks slightly wrong; it should be
gcc110.fsffrance.org, right?  I looked for it and it looks like you have
to go to portal.cfarm.net first to apply for an account on this site.
I'll try that, thanks a lot.


--
Best,
Lehua (RiVAI)
lehua.d...@rivai.ai


Re: [PATCH v3 1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-11-13 Thread waffl3x
On Monday, November 13th, 2023 at 8:48 PM, Jason Merrill wrote:


> 
> 
> On 11/11/23 05:43, waffl3x wrote:
> 
> > > [combined reply to all three threads]
> > > 
> > > On 11/9/23 23:24, waffl3x wrote:
> > > 
> > > > > > There are a few known issues still present in this patch. Most
> > > > > > importantly, the implicit object argument fails to convert when passed
> > > > > > to by-value xobj parameters. This occurs both for xobj parameters that
> > > > > > match the argument type and xobj parameters that are unrelated to the
> > > > > > object type, but have valid conversions available. This behavior can be
> > > > > > observed in the explicit-obj-by-value[1-3].C tests. The implicit object
> > > > > > argument appears to be simply reinterpreted instead of any conversion
> > > > > > applied. This is elaborated on in the test cases.
> > > > > 
> > > > > Yes, that's because of:
> > > > > 
> > > > > > @@ -9949,7 +9951,8 @@ build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain)
> > > > > > }
> > > > > > }
> > > > > > /* Bypass access control for 'this' parameter. */
> > > > > > - else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE)
> > > > > > + else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE
> > > > > > + || DECL_XOBJ_MEMBER_FUNC_P (fn))
> > > > > 
> > > > > We don't want to take this path for xob fns. Instead I think we need
> > > > > to change the existing:
> > > > > 
> > > > > > gcc_assert (first_arg == NULL_TREE);
> > > > > 
> > > > > to assert that if first_arg is non-null, we're dealing with an xob fn,
> > > > > and then go ahead and do the same conversion as the loop body on
> > > > > first_arg.
> > > > > 
> > > > > > Despite this, calls where there is no valid conversion
> > > > > > available are correctly rejected, which I find surprising. The
> > > > > > explicit-obj-by-value4.C testcase demonstrates this odd but correct 
> > > > > > behavior.
> > > > > 
> > > > > Yes, because checking for conversions is handled elsewhere.
> > > > 
> > > > Yeah, as I noted above I realized that just handling it the same way as
> > > > iobj member functions is fundamentally broken. I was staring at it last
> > > > night and eventually realized that I could just copy the loop body. I
> > > > ended up asserting in the body handling the implicit object argument
> > > > for xobj member functions that first_arg != NULL_TREE, which I wasn't
> > > > sure of, but it seems to work.
> > > 
> > > That sounds like it might cause trouble with
> > > 
> > > struct A {
> > > void f(this A);
> > > };
> > > 
> > > int main()
> > > {
> > > (&A::f) (A());
> > > }
> > 
> > I will check to see what the behavior with this is. This sounds related
> > to the next question I asked as well.
> > 
> > > > I tried asking in IRC if there are any circumstances where first_arg
> > > > would be null for a non-static member function and I didn't get an
> > > > answer. The code above seemed to indicate that it could be. It just
> > > > looks like old code that is no longer valid and never got removed.
> > > > Consequently this function has made it on my list of things to refactor
> > > > :^).
> > > 
> > > Right, first_arg is only actually used for the implicit object argument,
> > > it's just easier to store it separately from the arguments in (). I'm
> > > not sure which code you mean is no longer valid?
> > 
> > Yeah I agree that it's easier to store it separately.
> > 
> > -- call.cc:build_over_call
> > `else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE) { tree arg = build_this (first_arg != NULL_TREE ? first_arg : (*args)[arg_index]);`
> > 
> > The trouble is, the code (shown above) does not assume that this holds
> > true. It handles the case where the implicit object argument was passed
> > in with the rest of the arguments. As far as I've observed, it seems
> > like it's always passed in through the first_arg member of cand, which
> > is what I was referring to here.
> > 
> > > > ended up asserting in the body handling the implicit object argument
> > > > for xobj member functions that first_arg != NULL_TREE, which I wasn't
> > > > sure of, but it seems to work.
> > 
> > Since it wasn't clear what I was referring to, here is the code that I
> > wrote (copied from the loop really) handling the case. In case it isn't
> > obvious, I didn't snip the code in the METHOD_TYPE block, it's just
> > snipped here as it's not code I've modified. I'm hopeful that the case
> > you mentioned above is not problematic, but like I said I will be sure
> > to test it.
> > 
> > -- call.cc:build_over_call
> > ```
> > else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE)
> > {
> > /* SNIP */
> > if (first_arg != NULL_TREE)
> > first_arg = NULL_TREE;
> > else
> > ++arg_index;
> > ++i;
> > is_method = 1;
> > }
> > else if (DECL_XOBJ_MEMBER_FUNC_P (fn))
> > {
> > gcc_assert (cand->first_arg);
> > gcc_assert (cand->num_convs > 0);
> > tree 

Re: Re: [PATCH V2] VECT: Support mask_len_strided_load/mask_len_strided_store in loop vectorize

2023-11-13 Thread juzhe.zh...@rivai.ai
Hi, Richard. Thanks for review.

I have split the patch into 2 patches for easier review:
1. Add mask_len_strided_load/mask_len_strided_store optab/ifn:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636416.html
2. Apply mask_len_strided load/store in the loop vectorizer:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636417.html

It turns out these 2 patches are much simpler than before, with far fewer code
changes.

>> vertical space missing
I am not sure I get it. I have tried to address it by running clang-format with
the GNU style from contrib/clang-format.
Feel free to comment again if I got the formatting wrong.

>> Are strided_p loads
>> not also supported when gathers are as they can be emulated that
>> way? 
RVV has both indexed load/store and strided load/store instructions (they
always exist together).
A strided load/store can be considered a special form of indexed load/store.
Any strided load/store can always be lowered as a gather/scatter for RVV.
Strided load/store is just an optimization for invariant strides.
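
A minimal scalar sketch of that relationship, assuming element (not byte)
strides and a fixed upper bound on the number of lanes.  The function
names are hypothetical; nothing here corresponds to an actual optab or
internal function.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_LANES_SKETCH = 8 };

/* Strided load: lane I reads BASE[I * STRIDE].  */
void
strided_load_sketch (int *dst, const int *base, ptrdiff_t stride,
                     size_t lanes)
{
  for (size_t i = 0; i < lanes; i++)
    dst[i] = base[(ptrdiff_t) i * stride];
}

/* Gather (indexed) load: lane I reads BASE[OFFSETS[I]].  */
void
gather_load_sketch (int *dst, const int *base, const ptrdiff_t *offsets,
                    size_t lanes)
{
  for (size_t i = 0; i < lanes; i++)
    dst[i] = base[offsets[i]];
}

/* The same strided access expressed as a gather whose offset vector is
   { 0, STRIDE, 2 * STRIDE, ... }.  */
void
strided_as_gather_sketch (int *dst, const int *base, ptrdiff_t stride,
                          size_t lanes)
{
  ptrdiff_t offsets[MAX_LANES_SKETCH];
  assert (lanes <= MAX_LANES_SKETCH);
  for (size_t i = 0; i < lanes; i++)
    offsets[i] = (ptrdiff_t) i * stride;
  gather_load_sketch (dst, base, offsets, lanes);
}
```

Both forms read the same elements; the strided form only avoids
materializing the offset vector when the stride is loop invariant.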

>> In
>> fact I wonder why vect_gather_scatter_fn_p needs to be adjusted at all,
>> don't we have STMT_VINFO_STRIDED_P already?
Thanks. I think STMT_VINFO_STRIDED_P is enough.

>> Why would we go vect_check_gather_scatter for strided loads/stores?!
>> That said, currently STMT_VINFO_STRIDED_P handling only handles
>> invariant strides I think (do we even set STMT_VINFO_STRIDED_P for
>> not invariant strides?).  So we somehow need to better distinguish
>> that from the case of an uniform gather offset that's defined
>> by a stmt inside of the loop from a naming perspective.
>> How do we identify uniformity?  If it's really only when invariant
>> then I fail to see why we handle this via VMAT_GATHER_SCATTER
>> and not VMAT_STRIDED (OK, there's only VMAT_STRIDED_SLP right now,
>> but in the end non-SLP will go away).  VMAT_STRIDED_SLP would
>> also benefit from this (IIRC there's even a FIXME comment).
Addressed.  The following patch doesn't change vect_check_gather_scatter.
Also, I think RVV only needs to lower invariant strides into strided load/store.




juzhe.zh...@rivai.ai
 
From: Richard Biener
Date: 2023-11-13 20:17
To: juzhe.zh...@rivai.ai
CC: gcc-patches; richard.sandiford
Subject: Re: [PATCH V2] VECT: Support 
mask_len_strided_load/mask_len_strided_store in loop vectorize
On Mon, 13 Nov 2023, juzhe.zh...@rivai.ai wrote:
 
> Hi. Ping this patch which is last optab pattern for RVV support.
> 
> The mask_len_strided_load/mask_len_strided_store document has been approved:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635103.html 
> 
> Bootstrap on X86 and regtest no regression.
> Tested on aarch64 no regression.
> Tested on RISC-V no regression.
> 
> 
> juzhe.zh...@rivai.ai
>  
> From: Juzhe-Zhong
> Date: 2023-11-06 14:55
> To: gcc-patches
> CC: richard.sandiford; rguenther; Juzhe-Zhong
> Subject: [PATCH V2] VECT: Support 
> mask_len_strided_load/mask_len_strided_store in loop vectorize
> This patch adds strided load/store support on loop vectorizer depending on 
> STMT_VINFO_STRIDED_P.
>  
> Bootstrap and regression on X86 passed.
>  
> Ok for trunk ?
>  
> gcc/ChangeLog:
>  
> * internal-fn.cc (strided_load_direct): New function.
> (strided_store_direct): Ditto.
> (expand_strided_store_optab_fn): Ditto.
> (expand_scatter_store_optab_fn): Add strided store.
> (expand_strided_load_optab_fn): New function.
> (expand_gather_load_optab_fn): Add strided load.
> (direct_strided_load_optab_supported_p): New function.
> (direct_strided_store_optab_supported_p): Ditto.
> (internal_load_fn_p): Add strided load.
> (internal_strided_fn_p): New function.
> (internal_fn_len_index): Add strided load/store.
> (internal_fn_mask_index): Ditto.
> (internal_fn_stored_value_index): Add strided store.
> (internal_strided_fn_supported_p): New function.
> * internal-fn.def (MASK_LEN_STRIDED_LOAD): New IFN.
> (MASK_LEN_STRIDED_STORE): Ditto.
> * internal-fn.h (internal_strided_fn_p): New function.
> (internal_strided_fn_supported_p): Ditto.
> * optabs-query.cc (supports_vec_gather_load_p): Add strided load.
> (supports_vec_scatter_store_p): Add strided store.
> * optabs-query.h (supports_vec_gather_load_p): Add strided load.
> (supports_vec_scatter_store_p): Add strided store.
> * tree-vect-data-refs.cc (vect_prune_runtime_alias_test_list): Add strided 
> load/store.
> (vect_gather_scatter_fn_p): Ditto.
> (vect_check_gather_scatter): Ditto.
> * tree-vect-stmts.cc (check_load_store_for_partial_vectors): Ditto.
> (vect_truncate_gather_scatter_offset): Ditto.
> (vect_use_strided_gather_scatters_p): Ditto.
> (vect_get_strided_load_store_ops): Ditto.
> (vectorizable_store): Ditto.
> (vectorizable_load): Ditto.
> * tree-vectorizer.h (vect_gather_scatter_fn_p): Ditto.
>  
> ---
> gcc/internal-fn.cc | 101 -
> gcc/internal-fn.def|   4 ++
> gcc/internal-fn.h  |   2 +
> gcc/optabs-query.cc|  25 

Re: [PATCH] c++: Implement C++26 P2864R2 - Remove Deprecated Arithmetic Conversion on Enumerations From C++26

2023-11-13 Thread Jason Merrill

On 11/13/23 06:50, Jakub Jelinek wrote:

The following patch implements C++26 P2864R2 by emitting pedwarn enabled by
the same options as the C++20 and later warnings (i.e. -Wenum-compare,
-Wdeprecated-enum-enum-conversion and -Wdeprecated-enum-float-conversion
which are all enabled by default).  I think we still want to allow users
some option workaround, so am not using directly error, but if that is
what you want instead, I can change it.


I agree, but we also need to return error_mark_node for these cases when 
SFINAE, i.e. !(complain & tf_warning_or_error)


enum A { a };
enum B { b };

template <auto X, auto Y> decltype (true ? X : Y) f1();
template <auto X, auto Y> decltype (X + Y) f2();
template <auto X, auto Y> decltype (X | Y) f3();

int main()
{
  f1<a, b>();   // OK in C++23, no match in C++26
  f2<a, b>();   // same
  f3<a, b>();   // same
}

Jason
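
A minimal sketch of the SFINAE point above, not taken from the patch: if an expression inside decltype is ill-formed, substitution must fail quietly so that another candidate can be chosen, rather than only producing a diagnostic. To stay independent of C++26 support, the sketch uses scoped enums, for which `+` is ill-formed in every standard, as a stand-in for the mixed-enum case. The trait name `can_add` is hypothetical.

```cpp
#include <type_traits>
#include <utility>

enum class A { a };   // scoped enums: no implicit arithmetic conversions
enum class B { b };

// Primary template: selected when the expression T + U is ill-formed.
template <class T, class U, class = void>
struct can_add : std::false_type {};

// Partial specialization: viable only if decltype(T + U) substitutes cleanly.
template <class T, class U>
struct can_add<T, U,
               std::void_t<decltype(std::declval<T>() + std::declval<U>())>>
  : std::true_type {};

// Checked at compile time: the ill-formed addition is a soft failure.
static_assert(can_add<int, long>::value, "int + long is well-formed");
static_assert(!can_add<A, B>::value, "A + B is a substitution failure");
```

If the compiler only warned about `A + B` in this context instead of treating it as a substitution failure, the specialization would be selected and `can_add<A, B>` would wrongly report true.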



Re: [PATCH v3 1/2] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-11-13 Thread Jason Merrill

On 11/11/23 05:43, waffl3x wrote:

[combined reply to all three threads]

On 11/9/23 23:24, waffl3x wrote:


There are a few known issues still present in this patch. Most importantly,
the implicit object argument fails to convert when passed to by-value xobj
parameters. This occurs both for xobj parameters that match the argument type
and xobj parameters that are unrelated to the object type, but have valid
conversions available. This behavior can be observed in the
explicit-obj-by-value[1-3].C tests. The implicit object argument appears to be
simply reinterpreted instead of any conversion applied. This is elaborated on
in the test cases.


Yes, that's because of:


@@ -9949,7 +9951,8 @@ build_over_call (struct z_candidate *cand, int flags, 
tsubst_flags_t complain)
}
}
/* Bypass access control for 'this' parameter. */
- else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE)
+ else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE
+ || DECL_XOBJ_MEMBER_FUNC_P (fn))


We don't want to take this path for xob fns. Instead I think we need to
change the existing:


gcc_assert (first_arg == NULL_TREE);


to assert that if first_arg is non-null, we're dealing with an xob fn,
and then go ahead and do the same conversion as the loop body on first_arg.


Despite this, calls where there is no valid conversion
available are correctly rejected, which I find surprising. The
explicit-obj-by-value4.C testcase demonstrates this odd but correct behavior.


Yes, because checking for conversions is handled elsewhere.


Yeah, as I noted above I realized that just handling it the same way as
iobj member functions is fundamentally broken. I was staring at it last
night and eventually realized that I could just copy the loop body. I
ended up asserting in the body handling the implicit object argument
for xobj member functions that first_arg != NULL_TREE, which I wasn't
sure of, but it seems to work.



That sounds like it might cause trouble with

struct A {
void f(this A);
};

int main()
{
(&A::f) (A());
}


I will check to see what the behavior with this is. This sounds related
to the next question I asked as well.


I tried asking in IRC if there are any circumstances where first_arg
would be null for a non-static member function and I didn't get an
answer. The code above seemed to indicate that it could be. It just
looks like old code that is no longer valid and never got removed.
Consequently this function has made it on my list of things to refactor
:^).



Right, first_arg is only actually used for the implicit object argument,
it's just easier to store it separately from the arguments in (). I'm
not sure which code you mean is no longer valid?


Yeah I agree that it's easier to store it separately.

-- call.cc:build_over_call
```
   else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE)
 {
   tree arg = build_this (first_arg != NULL_TREE
  ? first_arg
  : (*args)[arg_index]);
```

The trouble is, the code (shown above) does not assume that this holds
true. It handles the case where the implicit object argument was passed
in with the rest of the arguments. As far as I've observed, it seems
like it's always passed in through the first_arg member of cand, which
is what I was referring to here.


ended up asserting in the body handling the implicit object argument
for xobj member functions that first_arg != NULL_TREE, which I wasn't
sure of, but it seems to work.


Since it wasn't clear what I was referring to, here is the code that I
wrote (copied from the loop really) handling the case. In case it isn't
obvious, I didn't snip the code in the METHOD_TYPE block, it's just
snipped here as it's not code I've modified. I'm hopeful that the case
you mentioned above is not problematic, but like I said I will be sure
to test it.

-- call.cc:build_over_call
```
   else if (TREE_CODE (TREE_TYPE (fn)) == METHOD_TYPE)
 {
  /* SNIP */
  if (first_arg != NULL_TREE)
first_arg = NULL_TREE;
  else
++arg_index;
  ++i;
  is_method = 1;
 }
   else if (DECL_XOBJ_MEMBER_FUNC_P (fn))
 {
   gcc_assert (cand->first_arg);
   gcc_assert (cand->num_convs > 0);
   tree type = TREE_VALUE (parm);
   tree arg = cand->first_arg;
   bool conversion_warning = true;

   conv = convs[0];

   /* Set user_conv_p on the argument conversions, so rvalue/base handling
  knows not to allow any more UDCs.  This needs to happen after we
  process cand->warnings.  */
   if (flags & LOOKUP_NO_CONVERSION)
 conv->user_conv_p = true;

   tsubst_flags_t arg_complain = complain;
   if (!conversion_warning)
 arg_complain &= ~tf_warning;

   if (arg_complain & tf_warning)
 maybe_warn_pessimizing_move (arg, type, /*return_p*/false);

   val = convert_like_with_context (conv, arg, fn, 0,
arg_complain);
   val = convert_for_arg_passing (type, val, 

[PATCH] VECT: Add MASK_LEN_STRIDED_LOAD/MASK_LEN_STRIDED_STORE into loop vectorizer

2023-11-13 Thread Juzhe-Zhong
This patch supports generating MASK_LEN_STRIDED_LOAD/MASK_LEN_STRIDED_STORE IR
for memory accesses with an invariant stride.

It is a special optimization for targets like RVV.

RVV has both indexed load/store and strided load/store instructions.

On RVV, the gather/scatter and strided optabs are always available at the same time.

E.g. 
void foo (int *__restrict a,
int * __restrict b, int n, int *__restrict indice)
{
for (int i = 0; i < n; i++)
  a[indice[i]] = b[indice[i]] + a[i];
}

For such code, RVV uses indexed load/store for the gather/scatter.

E.g.

void foo (int *__restrict a,
int * __restrict b, int n, int m)
{
for (int i = 0; i < n; i++)
  a[i] = b[i * m] + a[i];
}

For such code, RVV uses strided load/store instructions.

We only need to support the direct mask_len_strided_xxx optabs for an invariant stride.
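
A minimal scalar sketch of the relationship described above, not code from the patch: a strided access is a gather whose offsets form the series 0, stride, 2*stride, and so on. The function names are hypothetical, and the sketch counts the stride in elements for simplicity.

```cpp
#include <cstddef>
#include <vector>

// Gather (indexed load): element i is read from base[offsets[i]].
std::vector<int> gather_load(const int *base,
                             const std::vector<std::ptrdiff_t> &offsets)
{
  std::vector<int> result;
  result.reserve(offsets.size());
  for (std::ptrdiff_t off : offsets)
    result.push_back(base[off]);
  return result;
}

// Strided load: element i is read from base[i * stride], where the stride
// is a single loop-invariant scalar.
std::vector<int> strided_load(const int *base, std::ptrdiff_t stride,
                              std::size_t n)
{
  std::vector<int> result;
  result.reserve(n);
  for (std::size_t i = 0; i < n; ++i)
    result.push_back(base[static_cast<std::ptrdiff_t>(i) * stride]);
  return result;
}

// Lowering a strided access to a gather: materialize the offset series.
std::vector<std::ptrdiff_t> stride_to_offsets(std::ptrdiff_t stride,
                                              std::size_t n)
{
  std::vector<std::ptrdiff_t> offsets(n);
  for (std::size_t i = 0; i < n; ++i)
    offsets[i] = static_cast<std::ptrdiff_t>(i) * stride;
  return offsets;
}
```

For any array b, `strided_load(b, m, n)` and `gather_load(b, stride_to_offsets(m, n))` return the same elements. The strided form avoids building the offset vector, which is why it is only an optimization.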

gcc/ChangeLog:

* tree-vect-stmts.cc (vect_get_strided_load_store_ops): Add 
MASK_LEN_STRIDED_LOAD/MASK_LEN_STRIDED_STORE.
(vectorizable_store): Ditto.
(vectorizable_load): Ditto.

---
 gcc/tree-vect-stmts.cc | 47 ++
 1 file changed, 38 insertions(+), 9 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index ee89f47c468..9c65b688510 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -2863,6 +2863,17 @@ vect_get_strided_load_store_ops (stmt_vec_info stmt_info,
   *dataref_bump = cse_and_gimplify_to_preheader (loop_vinfo, bump);
 }
 
+  /* Target supports strided load/store use DR_STEP as stride for VEC_OFFSET
+ directly instead of build VEC_OFFSET with VEC_SERIES.  */
+  internal_fn ifn
+= DR_IS_READ (dr) ? IFN_MASK_LEN_STRIDED_LOAD : IFN_MASK_LEN_STRIDED_STORE;
+  if (direct_internal_fn_supported_p (ifn, vectype, OPTIMIZE_FOR_SPEED))
+{
+  *vec_offset = cse_and_gimplify_to_preheader (loop_vinfo,
+  unshare_expr (DR_STEP (dr)));
+  return;
+}
+
   /* The offset given in GS_INFO can have pointer type, so use the element
  type of the vector instead.  */
   tree offset_type = TREE_TYPE (gs_info->offset_vectype);
@@ -9012,10 +9023,20 @@ vectorizable_store (vec_info *vinfo,
 
  gcall *call;
  if (final_len && final_mask)
-   call = gimple_build_call_internal
-(IFN_MASK_LEN_SCATTER_STORE, 7, dataref_ptr,
- vec_offset, scale, vec_oprnd, final_mask,
- final_len, bias);
+   {
+ if (VECTOR_TYPE_P (TREE_TYPE (vec_offset)))
+   call = gimple_build_call_internal (
+ IFN_MASK_LEN_SCATTER_STORE, 7, dataref_ptr,
+ vec_offset, scale, vec_oprnd, final_mask, final_len,
+ bias);
+ else
+   /* non-vector type offset means that target prefers to
+  use MASK_LEN_STRIDED_STORE instead of
+  MASK_LEN_SCATTER_STORE with direct stride argument. */
+   call = gimple_build_call_internal (
+ IFN_MASK_LEN_STRIDED_STORE, 6, dataref_ptr,
+ vec_offset, vec_oprnd, final_mask, final_len, bias);
+   }
  else if (final_mask)
call = gimple_build_call_internal
 (IFN_MASK_SCATTER_STORE, 5, dataref_ptr,
@@ -10956,11 +10977,19 @@ vectorizable_load (vec_info *vinfo,
 
  gcall *call;
  if (final_len && final_mask)
-   call
- = gimple_build_call_internal (IFN_MASK_LEN_GATHER_LOAD, 7,
-   dataref_ptr, vec_offset,
-   scale, zero, final_mask,
-   final_len, bias);
+   {
+ if (VECTOR_TYPE_P (TREE_TYPE (vec_offset)))
+   call = gimple_build_call_internal (
+ IFN_MASK_LEN_GATHER_LOAD, 7, dataref_ptr, vec_offset,
+ scale, zero, final_mask, final_len, bias);
+ else
+   /* non-vector type offset means that target prefers to
+  use MASK_LEN_STRIDED_LOAD instead of
+  MASK_LEN_GATHER_LOAD with direct stride argument.  */
+   call = gimple_build_call_internal (
+ IFN_MASK_LEN_STRIDED_LOAD, 6, dataref_ptr, vec_offset,
+ zero, final_mask, final_len, bias);
+   }
  else if (final_mask)
call = gimple_build_call_internal (IFN_MASK_GATHER_LOAD, 5,
   dataref_ptr, vec_offset,
-- 
2.36.3



[PATCH] DOC/IFN/OPTAB: Add mask_len_strided_load/mask_len_strided_store DOC/OPTAB/IFN

2023-11-13 Thread Juzhe-Zhong
This patch adds mask_len_strided_load/mask_len_strided_store.

The documentation has already been reviewed.

This patch adds OPTAB/IFN support as follows:

1. strided load

GIMPLE level:

v = MASK_LEN_STRIDED_LOAD (ptr, stride, mask, len, bias)

is expanded (by internal-fn.cc) into:

v = mask_len_strided_load (ptr, stride, mask, len, bias)

2. strided store

GIMPLE level:

MASK_LEN_STRIDED_STORE (ptr, stride, v, mask, len, bias)

is expanded (by internal-fn.cc) into:

mask_len_strided_store (ptr, stride, v, mask, len, bias)
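
A scalar reference model of the two operations above may help when reading the operand lists. It is a sketch under stated assumptions, not the expander from this patch. It assumes 32-bit elements and a stride in bytes, and it treats lane i as active only when i < len + bias and mask[i] is set. It fills inactive lanes of a load with zero, following the md.texi wording in this patch. The helper name `lane_active` is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Lane i is active when it lies below len + bias and its mask bit is set.
static bool lane_active(const std::vector<bool> &mask, std::size_t i,
                        std::ptrdiff_t len, std::ptrdiff_t bias)
{
  return static_cast<std::ptrdiff_t>(i) < len + bias && mask[i];
}

// Model of: v = MASK_LEN_STRIDED_LOAD (ptr, stride, mask, len, bias).
std::vector<std::int32_t>
mask_len_strided_load(const unsigned char *ptr, std::ptrdiff_t stride,
                      const std::vector<bool> &mask,
                      std::ptrdiff_t len, std::ptrdiff_t bias)
{
  std::vector<std::int32_t> v(mask.size(), 0);  // inactive lanes stay zero
  for (std::size_t i = 0; i < mask.size(); ++i)
    if (lane_active(mask, i, len, bias))
      std::memcpy(&v[i], ptr + static_cast<std::ptrdiff_t>(i) * stride,
                  sizeof v[i]);
  return v;
}

// Model of: MASK_LEN_STRIDED_STORE (ptr, stride, v, mask, len, bias).
void
mask_len_strided_store(unsigned char *ptr, std::ptrdiff_t stride,
                       const std::vector<std::int32_t> &v,
                       const std::vector<bool> &mask,
                       std::ptrdiff_t len, std::ptrdiff_t bias)
{
  for (std::size_t i = 0; i < v.size() && i < mask.size(); ++i)
    if (lane_active(mask, i, len, bias))  // inactive lanes leave memory alone
      std::memcpy(ptr + static_cast<std::ptrdiff_t>(i) * stride, &v[i],
                  sizeof v[i]);
}
```

With a stride of two elements, mask {1, 0, 1, 1}, len 3 and bias 0, only lanes 0 and 2 touch memory: lane 1 is masked off and lane 3 is beyond the length.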

Bootstrapped and regression-tested on X86 with no regressions.

Ok for trunk ?
 
gcc/ChangeLog:

* doc/md.texi: Add mask_len_strided_load/mask_len_strided_store.
* internal-fn.cc (strided_load_direct): Ditto.
(strided_store_direct): Ditto.
(expand_strided_store_optab_fn): Ditto.
(expand_strided_load_optab_fn): Ditto.
(direct_strided_load_optab_supported_p): Ditto.
(direct_strided_store_optab_supported_p): Ditto.
(internal_fn_len_index): Ditto.
(internal_fn_mask_index): Ditto.
(internal_fn_stored_value_index): Ditto.
* internal-fn.def (MASK_LEN_STRIDED_LOAD): Ditto.
(MASK_LEN_STRIDED_STORE): Ditto.
* optabs.def (OPTAB_D): Ditto.

---
 gcc/doc/md.texi | 27 +++
 gcc/internal-fn.cc  | 63 +
 gcc/internal-fn.def |  6 +
 gcc/optabs.def  |  2 ++
 4 files changed, 98 insertions(+)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 5d86152e5dd..5dc76a1183c 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5094,6 +5094,20 @@ Bit @var{i} of the mask is set if element @var{i} of the 
result should
 be loaded from memory and clear if element @var{i} of the result should be 
undefined.
 Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are ignored.
 
+@cindex @code{mask_len_strided_load@var{m}} instruction pattern
+@item @samp{mask_len_strided_load@var{m}}
+Load several separate memory locations into a destination vector of mode 
@var{m}.
+Operand 0 is a destination vector of mode @var{m}.
+Operand 1 is a scalar base address and operand 2 is a scalar stride of Pmode.
+operand 3 is mask operand, operand 4 is length operand and operand 5 is bias 
operand.
+The instruction can be seen as a special case of 
@code{mask_len_gather_load@var{m}@var{n}}
+with an offset vector that is a @code{vec_series} with operand 1 as base and 
operand 2 as step.
+For each element index i load address is operand 1 + @var{i} * operand 2.
+Similar to mask_len_load, the instruction loads at most (operand 4 + operand 
5) elements from memory.
+Element @var{i} of the mask (operand 3) is set if element @var{i} of the 
result should
+be loaded from memory and clear if element @var{i} of the result should be 
zero.
+Mask elements @var{i} with @var{i} > (operand 4 + operand 5) are ignored.
+
 @cindex @code{scatter_store@var{m}@var{n}} instruction pattern
 @item @samp{scatter_store@var{m}@var{n}}
 Store a vector of mode @var{m} into several distinct memory locations.
@@ -5131,6 +5145,19 @@ at most (operand 6 + operand 7) elements of (operand 4) 
to memory.
 Bit @var{i} of the mask is set if element @var{i} of (operand 4) should be 
stored.
 Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are ignored.
 
+@cindex @code{mask_len_strided_store@var{m}} instruction pattern
+@item @samp{mask_len_strided_store@var{m}}
+Store a vector of mode m into several distinct memory locations.
+Operand 0 is a scalar base address and operand 1 is scalar stride of Pmode.
+Operand 2 is the vector of values that should be stored, which is of mode 
@var{m}.
+operand 3 is mask operand, operand 4 is length operand and operand 5 is bias 
operand.
+The instruction can be seen as a special case of 
@code{mask_len_scatter_store@var{m}@var{n}}
+with an offset vector that is a @code{vec_series} with operand 0 as base and 
operand 1 as step.
+For each element index i store address is operand 0 + @var{i} * operand 1.
+Similar to mask_len_store, the instruction stores at most (operand 4 + operand 
5) elements of (operand 2) to memory.
+Element @var{i} of the mask (operand 3) is set if element @var{i} of (operand 2) should be 
stored.
+Mask elements @var{i} with @var{i} > (operand 4 + operand 5) are ignored.
+
 @cindex @code{vec_set@var{m}} instruction pattern
 @item @samp{vec_set@var{m}}
 Set given field in the vector value.  Operand 0 is the vector to modify,
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 5a998e794ad..bfb307684a9 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -164,6 +164,7 @@ init_internal_fns ()
 #define load_lanes_direct { -1, -1, false }
 #define mask_load_lanes_direct { -1, -1, false }
 #define gather_load_direct { 3, 1, false }
+#define strided_load_direct { -1, -1, false }
 #define len_load_direct { -1, -1, false }
 #define mask_len_load_direct { -1, 4, false }
 #define mask_store_direct { 3, 2, false }
@@ -173,6 +174,7 @@ init_internal_fns ()
 #define 

[Commit QUEUE V3] RISC-V: Support strided load/store

2023-11-13 Thread Juzhe-Zhong
Strided load/store has been approved.

Rebased on V3 and adapted to the middle-end IR change.

Will commit after the middle-end patch is approved.

gcc/ChangeLog:

* config/riscv/autovec.md (mask_len_strided_load_<mode>): New pattern.
(mask_len_strided_store_<mode>): Ditto.
* config/riscv/riscv-protos.h (expand_strided_load_store): New function.
* config/riscv/riscv-v.cc (expand_strided_load_store): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c: Adapt 
test.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_strided_load-1.c: 
New test.
* 
gcc.target/riscv/rvv/autovec/gather-scatter/mask_strided_load_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_strided_store-1.c: 
New test.
* 
gcc.target/riscv/rvv/autovec/gather-scatter/mask_strided_store_run-1.c: New 
test.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-3.c: New 
test.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-3.c: New 
test.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-3.c: New 
test.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-3.c: 
New test.

---
 gcc/config/riscv/autovec.md   | 30 ++
 gcc/config/riscv/riscv-protos.h   |  1 +
 gcc/config/riscv/riscv-v.cc   | 56 +++
 .../gather-scatter/mask_strided_load-1.c  | 47 +
 .../gather-scatter/mask_strided_load_run-1.c  | 97 +++
 .../gather-scatter/mask_strided_store-1.c | 48 +
 .../gather-scatter/mask_strided_store_run-1.c | 89 +
 .../autovec/gather-scatter/strided_load-1.c   |  2 +-
 .../autovec/gather-scatter/strided_load-2.c   |  2 +-
 .../autovec/gather-scatter/strided_load-3.c   | 45 +
 .../gather-scatter/strided_load_run-3.c   | 84 
 .../autovec/gather-scatter/strided_store-1.c  |  2 +-
 .../autovec/gather-scatter/strided_store-2.c  |  2 +-
 .../autovec/gather-scatter/strided_store-3.c  | 45 +
 .../gather-scatter/strided_store_run-3.c  | 82 
 15 files changed, 628 insertions(+), 4 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_strided_load-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_strided_load_run-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_strided_store-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_strided_store_run-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-3.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-3.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-3.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-3.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 80e41af6334..e0c294ffd10 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -272,6 +272,36 @@
   DONE;
 })
 
+;; =
+;; == Strided Load/Store
+;; =
+
+(define_expand "mask_len_strided_load_<mode>"
+  [(match_operand:V 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand 2 "pmode_reg_or_0_operand")
+   (match_operand:<VM> 3 "vector_mask_operand")
+   (match_operand 4 "autovec_length_operand")
+   (match_operand 5 "const_0_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_strided_load_store (<MODE>mode, operands, true);
+  DONE;
+})
+
+(define_expand "mask_len_strided_store_<mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:V 2 "register_operand")
+   (match_operand:<VM> 3 "vector_mask_operand")
+   (match_operand 4 "autovec_length_operand")
+   (match_operand 5 "const_0_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_strided_load_store (<MODE>mode, operands, false);
+  DONE;
+})
+
 ;; =
 ;; == Array Load/Store
 ;; =
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 8cdfadbcf10..3ef5740cf5b 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -546,6 +546,7 @@ void expand_vec_perm (rtx, rtx, rtx, rtx);
 void expand_select_vl (rtx *);
 void expand_load_store (rtx *, 

[Committed] RISC-V: Fix init-2.c assembly check

2023-11-13 Thread Juzhe-Zhong
I noticed that the assembly check in init-2.c is wrong.

Committed.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/init-2.c: Fix vid.v check.

---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-2.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-2.c
index f27c395441b..ae31e227ad1 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-2.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/init-2.c
@@ -45,4 +45,4 @@ DEF_INIT (v128uhi, uint16_t, 128, 0, 1, 2, 3, 4, 5, 6, 7, 8, 
9, 10, 11, 12, 13,
  113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126,
  127)
 
-/* { dg-final { scan-assembler-times {vid\.vx} 494 } } */
+/* { dg-final { scan-assembler-times {vid\.v} 14 } } */
-- 
2.36.3



Re: [RFC] Intel AVX10.1 Compiler Design and Support

2023-11-13 Thread Hongtao Liu
On Mon, Nov 13, 2023 at 7:25 PM Richard Biener
 wrote:
>
> On Mon, Nov 13, 2023 at 7:58 AM Hongtao Liu  wrote:
> >
> > On Fri, Nov 10, 2023 at 6:15 PM Richard Biener
> >  wrote:
> > >
> > > On Fri, Nov 10, 2023 at 2:42 AM Haochen Jiang  
> > > wrote:
> > > >
> > > > Hi all,
> > > >
> > > > > This RFC patch aims to add AVX10.1 options. After we added 
> > > > > -m[no-]evex512
> > > > > support, it is a lot easier to add them compared to the August 
> > > > > version.
> > > > > Details for AVX10 are shown below:
> > > >
> > > > Intel Advanced Vector Extensions 10 (Intel AVX10) Architecture 
> > > > Specification
> > > > It describes the Intel Advanced Vector Extensions 10 Instruction Set
> > > > Architecture.
> > > > https://cdrdv2.intel.com/v1/dl/getContent/784267
> > > >
> > > > The Converged Vector ISA: Intel Advanced Vector Extensions 10 Technical 
> > > > Paper
> > > > It provides introductory information regarding the converged vector 
> > > > ISA: Intel
> > > > Advanced Vector Extensions 10.
> > > > https://cdrdv2.intel.com/v1/dl/getContent/784343
> > > >
> > > > Our proposal is to take AVX10.1-256 and AVX10.1-512 as two "virtual" 
> > > > ISAs in
> > > > the compiler. AVX10.1-512 will imply AVX10.1-256. They will not enable
> > > > anything at first. At the end of the option handling, we will check 
> > > > whether
> > > > the two bits are set. If AVX10.1-256 is set, we will set the AVX512 
> > > > related
> > > > ISA bits. AVX10.1-512 will further set EVEX512 ISA bit.
> > > >
> > > > It means that AVX10 options will be separated from the existing AVX512 
> > > > and the
> > > > newly added -m[no-]evex512 options. AVX10 and AVX512 options will 
> > > > control
> > > > (enable/disable/set vector size) the AVX512 features underneath 
> > > > independently.
> > > > If there’s potential overlap or conflict between AVX10 and AVX512 
> > > > options,
> > > > some rules are provided to define the behavior, which will be described 
> > > > below.
> > > >
> > > > avx10.1 option will be provided as an alias of avx10.1-256.
> > > >
> > > > In the future, the AVX10 options will imply like this:
> > > >
> > > > AVX10.1-256 < AVX10.1-512
> > > >  ^ ^
> > > >  | |
> > > >
> > > > AVX10.2-256 < AVX10.2-512
> > > >  ^ ^
> > > >  | |
> > > >
> > > > AVX10.3-256 < AVX10.3-512
> > > >  ^ ^
> > > >  | |
> > > >
> > > > > Each of them will have its own option to enable/disable corresponding
> > > > features. The alias avx10.x will also be provided.
> > > >
> > > > As mentioned in August version RFC, since we lean towards the adoption 
> > > > of
> > > > AVX10 instead of AVX512 from now on, we don’t recommend users to 
> > > > combine the
> > > > AVX10 and legacy AVX512 options.
> > >
> > > I wonder whether adoption could be made easier by also providing a
> > > -mavx10[.0] level that removes some of the more obscure sub-ISA 
> > > requirements
> > > to cover more existing implementations (I'd not add -mavx10.0-512 here).
> > > I'd require only skylake-AVX512 features here, basically all non-KNL 
> > > AVX512
> > > CPUs should have a "virtual" AVX10 level that allows to use that feature 
> > > set,
> > We have -mno-evex512 can cover those cases, so what you want is like a
> > simple alias of "-march=skylake-avx512 -mno-evex512"?
>
> For the AVX512 enabled sub-isas of skylake-avx512 yes I guess.
>
> > > restricted to 256bits so future AVX10-256 implementations can handle it
> > > as well as all existing (and relevant, which excludes KNL) AVX512
> > > implementations.
> > >
> > > Otherwise AVX10 is really a hard sell (as AVX512 was originally).
> > It's a rebranding of the existing AVX512 to AVX10, AVX10.0  just
> > complicated things further(considering we already have x86-64-v4 which
> > is different from skylake-avx512).
>
> Well, the cut-off for "AVX512" is quite arbitrary.  Introducing a
> "new" ISA that's
> only available in HW available in the future and suggesting users to embrace
> that already (like Intel did with AVX512 without offering client SKU support)
> is a hard sell.
>
> I realize Intel thinks client SKU support for AVX10 (restricted to 256bit) 
> will
> be "easier".  But then don't expect anybody to adopt that in the next 10 
> years.
>
> Just to add - we were suggesting to use x86_64-v3 for the "next" enterprise
> product but got downvoted to x86_64-v2 for compatibility reasons.
>
> If it were possible I'd axe x86_64-v4.  Maybe we should add a x86_64-v3.5
> that sits inbetween v3 and v4, offering AVX512 but restricted to 256bit
> (and obviously not requiring more of the AVX512 features that v4 requires).
The arch level is indeed a problem, especially since the default
size of avx10 is 256.
+Florian Weimer for more inputs.
>
> Richard.
>
> > >
> > > > However, we would like to introduce some
> > > > simple rules for user when it comes to combination.
> > > >
> > > > 1. Enabling 

Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-11-13 Thread Kewen.Lin
Hi Peter,

on 2023/11/11 07:51, Peter Bergner wrote:
> On 8/27/23 9:06 PM, Kewen.Lin wrote:
>> Assuming we only have ELFv2_ABI_CHECK in PCREL_SUPPORTED_BY_OS, we
>> can have either TARGET_PCREL or !TARGET_PCREL after the checking.
>> For the latter, it's fine and don't need any checks. For the former,
>> if it's implicit, for !TARGET_PREFIXED we will clean it silently;
>> while if it's explicit, for !TARGET_PREFIXED we will emit an error.
>> TARGET_PREFIXED checking has considered Power10, so it's also
>> concerned accordingly.
> [snip]
>> Yeah, looking forward to their opinions.  IMHO, with the current proposed
>> change, pcrel doesn't look like a pure Power10 hardware feature, it also
>> quite relies on ABIs, that's why I thought it seems good not to turn it
>> on by default for Power10.
> 
> Ok, how about the patch below?  This removes OPTION_MASK_PCREL from the
> power10 flags, so instead of our options override code needing to disable
> PCREL on the systems that don't support it, we now enable it only on those
> systems that do support it.

Thanks for the patch!  I like the simplification of replacing function
rs6000_pcrel_p with TARGET_PCREL!  Given the concern about function
rs6000_machine_from_flags and ISA_3_1_MASKS_SERVER that you pointed out in
another mail, I agree it's not a good idea to remove OPTION_MASK_PCREL
from OTHER_POWER10_MASKS.  Since we have the code:

> if (PCREL_SUPPORTED_BY_OS)

> +  else
> +{
> +  if (TARGET_PCREL
> +   && (rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
> + error ("use of %qs is invalid for this target", "-mpcrel");
>rs6000_isa_flags &= ~OPTION_MASK_PCREL;
>  }

now, I think it should be fine not to explicitly mask it off from the Power10
default flags?  Leaving the Power10 default flags unchanged seems more
consistent.  The others look good to me!  Thanks again.
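
A minimal sketch of the else branch quoted above, assuming hypothetical names. `os_supports_pcrel` stands in for PCREL_SUPPORTED_BY_OS, `pcrel_set` for TARGET_PCREL, and `pcrel_explicit` for the OPTION_MASK_PCREL bit in rs6000_isa_flags_explicit. The supported-target branch is not shown in this mail, so the sketch leaves the flag untouched there.

```cpp
#include <string>

struct PcrelResult {
  bool pcrel_enabled;
  std::string error;   // empty when no diagnostic is issued
};

// Model of the quoted logic: on an unsupported target an explicit -mpcrel
// is an error, an implicit one is cleared silently, and the flag always
// ends up off.
PcrelResult resolve_pcrel(bool os_supports_pcrel, bool pcrel_set,
                          bool pcrel_explicit)
{
  PcrelResult r{pcrel_set, ""};
  if (os_supports_pcrel)
    return r;   // branch body elided in the mail; flag left as given
  if (pcrel_set && pcrel_explicit)
    r.error = "use of '-mpcrel' is invalid for this target";
  r.pcrel_enabled = false;
  return r;
}
```

Because the flag is cleared on every unsupported target, a Power10 default that includes the PCREL bit still ends up with PCREL off there.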

BR,
Kewen

> 
> Jeevitha, can you test this patch to see whether it fixes the testsuite
> issue caused by your earlier patch that was approved, but not yet pushed?
> That was the use GPR2 for register allocation, correct?  Note, you'll need
> to update the patch to replace the rs6000_pcrel_p() usage with just
> TARGET_PCREL, since this patch removes rs6000_pcrel_p().
> 
> If testing is clean and everyone is OK with the patch, I'll officially
> submit it for review with git log entry, etc.
> 
> Peter
> 
> 
> gcc/
>   * config/rs6000/linux64.h (PCREL_SUPPORTED_BY_OS): Only test the ABI.
>   * config/rs6000/rs6000-cpus.def (RS6000_CPU): Remove OPTION_MASK_PCREL
>   from power10.
>   * config/rs6000/predicates.md: Use TARGET_PCREL.
>   * config/rs6000/rs6000-logue.cc (rs6000_decl_ok_for_sibcall): Likewise.
>   (rs6000_global_entry_point_prologue_needed_p): Likewise.
>   (rs6000_output_function_prologue): Likewise.
>   * config/rs6000/rs6000.md: Likewise.
>   * config/rs6000/rs6000.cc (rs6000_option_override_internal): Rework
>   the logic for enabling PCREL by default.
>   (rs6000_legitimize_tls_address): Use TARGET_PCREL.
>   (rs6000_call_template_1): Likewise.
>   (rs6000_indirect_call_template_1): Likewise.
>   (rs6000_longcall_ref): Likewise.
>   (rs6000_call_aix): Likewise.
>   (rs6000_sibcall_aix): Likewise.
>   (rs6000_pcrel_p): Remove.
>   * config/rs6000/rs6000-protos.h (rs6000_pcrel_p): Likewise.
> 
> diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h
> index 98b7255c95f..5b77bd7fd51 100644
> --- a/gcc/config/rs6000/linux64.h
> +++ b/gcc/config/rs6000/linux64.h
> @@ -563,8 +563,5 @@ extern int dot_symbols;
>  #define TARGET_FLOAT128_ENABLE_TYPE 1
>  
>  /* Enable using prefixed PC-relative addressing on POWER10 if the ABI
> -   supports it.  The ELF v2 ABI only supports PC-relative relocations for
> -   the medium code model.  */
> -#define PCREL_SUPPORTED_BY_OS (TARGET_POWER10 && TARGET_PREFIXED \
> -  && ELFv2_ABI_CHECK \
> -  && TARGET_CMODEL == CMODEL_MEDIUM)
> +   supports it.  */
> +#define PCREL_SUPPORTED_BY_OS (ELFv2_ABI_CHECK)
> diff --git a/gcc/config/rs6000/rs6000-cpus.def 
> b/gcc/config/rs6000/rs6000-cpus.def
> index 4f350da378c..fe01a2312ae 100644
> --- a/gcc/config/rs6000/rs6000-cpus.def
> +++ b/gcc/config/rs6000/rs6000-cpus.def
> @@ -256,7 +256,8 @@ RS6000_CPU ("power8", PROCESSOR_POWER8, MASK_POWERPC64 | 
> ISA_2_7_MASKS_SERVER
>   | OPTION_MASK_HTM)
>  RS6000_CPU ("power9", PROCESSOR_POWER9, MASK_POWERPC64 | ISA_3_0_MASKS_SERVER
>   | OPTION_MASK_HTM)
> -RS6000_CPU ("power10", PROCESSOR_POWER10, MASK_POWERPC64 | 
> ISA_3_1_MASKS_SERVER)
> +RS6000_CPU ("power10", PROCESSOR_POWER10, MASK_POWERPC64
> + | (ISA_3_1_MASKS_SERVER & ~OPTION_MASK_PCREL))
>  RS6000_CPU ("powerpc", PROCESSOR_POWERPC, 0)
>  RS6000_CPU ("powerpc64", PROCESSOR_POWERPC64, OPTION_MASK_PPC_GFXOPT
>   | MASK_POWERPC64)
> diff --git a/gcc/config/rs6000/predicates.md 

Re: [PATCH-3v4, rs6000] Fix regression cases caused 16-byte by pieces move [PR111449]

2023-11-13 Thread Kewen.Lin
Hi,

on 2023/11/10 17:22, HAO CHEN GUI wrote:
> Hi,
>   Originally 16-byte memory to memory is expanded via pattern.
> expand_block_move does an optimization on P8 LE to leverage V2DI reversed
> load/store for memory to memory move. Now it's done by 16-byte by pieces
> move and the optimization is lost. This patch adds an insn_and_split
> pattern to retake the optimization.
> 
>   Compared to the previous version, the main change is to remove the volatile
> memory operand check from the insn condition, as it is not needed.
> 
>   Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no
> regressions. Is this OK for trunk?

Okay for trunk, thanks!

BR,
Kewen

> 
> Thanks
> Gui Haochen
> 
> ChangeLog
> rs6000: Fix regression cases caused 16-byte by pieces move
> 
> The previous patch enables 16-byte by pieces move. Originally 16-byte
> move is implemented via pattern.  expand_block_move does an optimization
> on P8 LE to leverage V2DI reversed load/store for memory to memory move.
> Now 16-byte move is implemented via by pieces move and finally split to
> two DI load/store.  This patch creates an insn_and_split pattern to
> retake the optimization.
> 
> gcc/
>   PR target/111449
>   * config/rs6000/vsx.md (*vsx_le_mem_to_mem_mov_ti): New.
> 
> gcc/testsuite/
>   PR target/111449
>   * gcc.target/powerpc/pr111449-2.c: New.
> 
> patch.diff
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index f3b40229094..26fa32829af 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -414,6 +414,27 @@ (define_mode_attr VM3_char [(V2DI "d")
> 
>  ;; VSX moves
> 
> +;; TImode memory to memory move optimization on LE with p8vector
> +(define_insn_and_split "*vsx_le_mem_to_mem_mov_ti"
> +  [(set (match_operand:TI 0 "indexed_or_indirect_operand" "=Z")
> + (match_operand:TI 1 "indexed_or_indirect_operand" "Z"))]
> +  "!BYTES_BIG_ENDIAN
> +   && TARGET_VSX
> +   && !TARGET_P9_VECTOR
> +   && can_create_pseudo_p ()"
> +  "#"
> +  "&& 1"
> +  [(const_int 0)]
> +{
> +  rtx tmp = gen_reg_rtx (V2DImode);
> +  rtx src =  adjust_address (operands[1], V2DImode, 0);
> +  emit_insn (gen_vsx_ld_elemrev_v2di (tmp, src));
> +  rtx dest = adjust_address (operands[0], V2DImode, 0);
> +  emit_insn (gen_vsx_st_elemrev_v2di (dest, tmp));
> +  DONE;
> +}
> +  [(set_attr "length" "16")])
> +
>  ;; The patterns for LE permuted loads and stores come before the general
>  ;; VSX moves so they match first.
>  (define_insn_and_split "*vsx_le_perm_load_"
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr111449-2.c 
> b/gcc/testsuite/gcc.target/powerpc/pr111449-2.c
> new file mode 100644
> index 000..7003bdc0208
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr111449-2.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile { target { has_arch_pwr8 } } } */
> +/* { dg-require-effective-target powerpc_p8vector_ok } */
> +/* { dg-options "-mvsx -O2" } */
> +
> +/* Ensure 16-byte by pieces move is enabled.  */
> +
> +void move1 (void *s1, void *s2)
> +{
> +  __builtin_memcpy (s1, s2, 16);
> +}
> +
> +void move2 (void *s1)
> +{
> +  __builtin_memcpy (s1, "0123456789012345", 16);
> +}
> +
> +/* { dg-final { scan-assembler-times {\mlxvd2x\M|\mp?lxv\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mstxvd2x\M|\mstxv\M} 2 } } */


Re: [PATCH v2] c++: fix parsing with auto(x) [PR112410]

2023-11-13 Thread Jason Merrill

On 11/10/23 20:13, Marek Polacek wrote:

On Thu, Nov 09, 2023 at 07:07:03PM -0500, Jason Merrill wrote:

On 11/9/23 14:58, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Here we are wrongly parsing

int y(auto(42));

which uses the C++23 cast-to-prvalue feature, and initializes y to 42.
However, we were treating the auto as an implicit template parameter.

Fixing the auto{42} case is easy, but when auto is followed by a (,
I found the fix to be much more involved.  For instance, we cannot
use cp_parser_expression, because that can give hard errors.  It's
also necessary to disambiguate 'auto(i)' as 'auto i', not a cast.
auto(), auto(int), auto(f)(int), auto(*), auto(i[]), auto(...), etc.
are all function declarations.  We have to look at more than one
token to decide.


Yeah, this is a most vexing parse problem.  The code is synthesizing
template parameters before we've resolved whether the auto is a
decl-specifier or not.


In this fix, I'm (ab)using cp_parser_declarator, with member_p=false
so that it doesn't commit.  But it handles even more complicated
cases as

int fn (auto (*const **)(int) -> char);


But it doesn't seem to handle the extremely vexing

struct A {
   A(int,int);
};

int main()
{
   int a;
   A b(auto(a), 42);
}


Argh.  This test should indeed be accepted and is currently rejected,
but it's a different problem: 'b' is at block scope and you can't
have a template there.  But when I put it into a namespace scope,
it shows that my patch doesn't work correctly.  I've added auto-fncast14.C
for the latter and opened c++/112482 for the block-scope problem.
  

I think we need to stop synthesizing immediately when we see RID_AUTO, and
instead go back after we successfully parse a declaration and synthesize for
any autos we saw along the way.  :/


That seems very complicated :(.  I had a different idea though; how
about the following patch?  The idea is that if we see that parsing
the parameter-declaration-list didn't work, we undo what synthesize_
did, and let cp_parser_initializer parse "(auto(42))", which should
succeed.  I checked that after cp_finish_decl y is initialized to 42.


Nice, that's much simpler.  Do you also still need the changes to 
cp_parser_simple_type_specifier?



Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Here we are wrongly parsing

   int y(auto(42));

which uses the C++23 cast-to-prvalue feature, and initializes y to 42.
However, we were treating the auto as an implicit template parameter.

Fixing the auto{42} case is easy, but when auto is followed by a (,
I found the fix to be much more involved.  For instance, we cannot
use cp_parser_expression, because that can give hard errors.  It's
also necessary to disambiguate 'auto(i)' as 'auto i', not a cast.
auto(), auto(int), auto(f)(int), auto(*), auto(i[]), auto(...), etc.
are all function declarations.

This patch rectifies that by undoing the implicit function template
modification.  In the test above, we should notice that the parameter
list is ill-formed, and since we've synthesized an implicit template
parameter, we undo it by calling abort_fully_implicit_template.  Then,
we'll parse the "(auto(42))" as an initializer.

PR c++/112410

gcc/cp/ChangeLog:

* parser.cc (cp_parser_simple_type_specifier): Disambiguate
between a variable and function declaration with auto.
(cp_parser_parameter_declaration_clause): Maybe call
abort_fully_implicit_template if it turned out the parameter list was
ill-formed.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/auto-fncast13.C: New test.
* g++.dg/cpp23/auto-fncast14.C: New test.
---
  gcc/cp/parser.cc   | 27 +-
  gcc/testsuite/g++.dg/cpp23/auto-fncast13.C | 61 ++
  gcc/testsuite/g++.dg/cpp23/auto-fncast14.C |  9 
  3 files changed, 96 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast13.C
  create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast14.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 5116bcb78f6..947351b09b8 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -19991,6 +19991,8 @@ cp_parser_simple_type_specifier (cp_parser* parser,
  /* The 'auto' might be the placeholder return type for a function decl
 with trailing return type.  */
  bool have_trailing_return_fn_decl = false;
+ /* Or it might be auto(x) or auto {x}.  */
+ bool decay_copy = false;
  
  	  cp_parser_parse_tentatively (parser);

  cp_lexer_consume_token (parser->lexer);
@@ -20008,6 +20010,11 @@ cp_parser_simple_type_specifier (cp_parser* parser,
 /*consume_paren*/true);
  continue;
}
+ else if (cp_lexer_next_token_is (parser->lexer, CPP_OPEN_BRACE))
+   {
+ decay_copy = true;
+   

Re: [PATCH] LoongArch: Use finer-grained DBAR hints

2023-11-13 Thread chenglulu


On 2023/11/14 7:18 AM, Xi Ruoyao wrote:
/* snip */

  (define_insn "mem_thread_fence_1"
[(set (match_operand:BLK 0 "" "")
(unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))
 (match_operand:SI 1 "const_int_operand" "")] ;; model
""
-  "dbar\t0")
+  {
+enum memmodel model = memmodel_base (INTVAL (operands[1]));
+
+switch (model)
+  {
+  case MEMMODEL_ACQUIRE:
+  case MEMMODEL_CONSUME:


Hi,

Before calling this template, the function get_memmodel is called to
process memmodel, which has a piece of code:

  /* Workaround for Bugzilla 59448.  GCC doesn't track consume
     properly, so be conservative and promote consume to acquire.  */
  if (val == MEMMODEL_CONSUME)
    val = MEMMODEL_ACQUIRE;

So I think MEMMODEL_CONSUME doesn't need to be processed here either.

Otherwise is OK.

Thanks.


+   /* Consume is implemented using the stronger acquire memory order
+  because of a deficiency in C++11's semantics.  */
+   return "dbar\t0b10100";
+  case MEMMODEL_RELEASE:
+   return "dbar\t0b10010";
+  case MEMMODEL_ACQ_REL:
+  case MEMMODEL_SEQ_CST:
+   return "dbar\t0b1";
+  default:
+   /* GCC internal: "For the '__ATOMIC_RELAXED' model no instructions
+  need to be issued and this expansion is not invoked."
+  Other values should not be returned by memmodel_base.  */
+   gcc_unreachable ();
+  }
+  })
  
  ;; Atomic memory operations.
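
The review point above (consume is already promoted to acquire before the fence template runs, so the template never sees it) can be sketched outside of GCC as follows. This is only an illustration under stated assumptions: `Model`, `Fence`, `promote_consume` and `fence_kind` are hypothetical names, not GCC's `memmodel` API, and the sketch returns `Fence::none` for relaxed where the patch uses gcc_unreachable ().

```cpp
// Hypothetical stand-ins for the memory models and barrier kinds; these
// are not GCC identifiers.
enum class Model { relaxed, consume, acquire, release, acq_rel, seq_cst };
enum class Fence { none, acquire, release, full };

// Mirrors the promotion described for get_memmodel: consume is treated
// as the stronger acquire.
constexpr Model promote_consume(Model m)
{
  return m == Model::consume ? Model::acquire : m;
}

// Chooses a barrier kind.  Because of the promotion above, no explicit
// consume case is needed in the switch.
constexpr Fence fence_kind(Model m)
{
  switch (promote_consume(m))
    {
    case Model::acquire:
      return Fence::acquire;
    case Model::release:
      return Fence::release;
    case Model::acq_rel:
    case Model::seq_cst:
      return Fence::full;
    default:
      // Only relaxed reaches this point in the sketch.
      return Fence::none;
    }
}

static_assert(fence_kind(Model::consume) == fence_kind(Model::acquire),
              "consume behaves like acquire");
static_assert(fence_kind(Model::seq_cst) == Fence::full,
              "seq_cst needs the full barrier");
```

Doing the promotion in one place keeps every later switch over the model free of a separate consume case, which is the simplification suggested in the review.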
  


Re: [PATCH] c++: Link extended FP conversion pedwarns to -Wnarrowing [PR111842]

2023-11-13 Thread Jason Merrill

On 11/13/23 14:01, Jonathan Wakely wrote:

Tested x86_64-linux. Does this make sense for trunk?


OK, thanks.


These forbidden conversions are defined as narrowing conversions, and we
already use -Wnarrowing to control other "ill-formed, but only warn by
default" diagnostics required by the standard.

-- >8 --

Several users have been confused by the status of these warnings,
which can be misunderstood as "this might not be what you want",
rather than diagnostics required by the C++ standard. Add the text "ISO
C++ does not allow" to make this clear.

Also link them to -Wnarrowing so that they can be disabled or promoted
to errors independently of other pedwarns.

gcc/cp/ChangeLog:

PR c++/111842
PR c++/112498
* call.cc (convert_like_internal): Use OPT_Wnarrowing for
pedwarns about ill-formed conversions involving extended
floating-point types. Clarify that ISO C++ requires these
diagnostics.
* g++.dg/cpp23/ext-floating16.C: New test.
* g++.dg/cpp23/ext-floating17.C: New test.
---
  gcc/cp/call.cc  | 10 +++--
  gcc/testsuite/g++.dg/cpp23/ext-floating16.C | 40 
  gcc/testsuite/g++.dg/cpp23/ext-floating17.C | 42 +
  3 files changed, 88 insertions(+), 4 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp23/ext-floating16.C
  create mode 100644 gcc/testsuite/g++.dg/cpp23/ext-floating17.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 4516677bcab..4b0c6c42f01 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -8303,15 +8303,17 @@ convert_like_internal (conversion *convs, tree expr, 
tree fn, int argnum,
totype))
  {
  case 2:
-   if (pedwarn (loc, 0, "converting to %qH from %qI with greater "
-"conversion rank", totype, TREE_TYPE (expr)))
+   if (pedwarn (loc, OPT_Wnarrowing, "ISO C++ does not allow "
+"converting to %qH from %qI with greater "
+"conversion rank", totype, TREE_TYPE (expr)))
  complained = 1;
else if (!complained)
  complained = -1;
break;
  case 3:
-   if (pedwarn (loc, 0, "converting to %qH from %qI with unordered "
-"conversion ranks", totype, TREE_TYPE (expr)))
+   if (pedwarn (loc, OPT_Wnarrowing, "ISO C++ does not allow "
+"converting to %qH from %qI with unordered "
+"conversion rank", totype, TREE_TYPE (expr)))
  complained = 1;
else if (!complained)
  complained = -1;
diff --git a/gcc/testsuite/g++.dg/cpp23/ext-floating16.C 
b/gcc/testsuite/g++.dg/cpp23/ext-floating16.C
new file mode 100644
index 000..d6a562d5cb0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/ext-floating16.C
@@ -0,0 +1,40 @@
+// P1467R9 - Extended floating-point types and standard names.
+// { dg-do compile { target c++23 } }
+// { dg-options "-pedantic-errors -Wno-narrowing" }
+// { dg-add-options float16 }
+// { dg-add-options float32 }
+// { dg-add-options float64 }
+// { dg-add-options float128 }
+
+#include "ext-floating.h"
+
+#ifdef __STRICT_ANSI__
+#undef __SIZEOF_FLOAT128__
+#endif
+
+using namespace std;
+
+#ifdef __STDCPP_FLOAT16_T__
+#ifdef __STDCPP_FLOAT32_T__
+float16_t f16c = 1.0F32;   // { dg-bogus "converting to 'std::float16_t' \\\{aka 
'_Float16'\\\} from '_Float32' with greater conversion rank" "" { target { float16 
&& float32 } } }
+#endif
+#ifdef __STDCPP_FLOAT64_T__
+float16_t f16e = 1.0F64;   // { dg-bogus "converting to 'std::float16_t' \\\{aka 
'_Float16'\\\} from '_Float64' with greater conversion rank" "" { target { float16 
&& float64 } } }
+#endif
+#ifdef __STDCPP_FLOAT128_T__
+float16_t f16g = 1.0F128;  // { dg-bogus "converting to 'std::float16_t' \\\{aka 
'_Float16'\\\} from '_Float128' with greater conversion rank" "" { target { float16 
&& float128 } } }
+#endif
+#endif
+#ifdef __STDCPP_FLOAT32_T__
+#ifdef __STDCPP_FLOAT64_T__
+float32_t f32e = 1.0F64;   // { dg-bogus "converting to 'std::float32_t' \\\{aka 
'_Float32'\\\} from '_Float64' with greater conversion rank" "" { target { float32 
&& float64 } } }
+#endif
+#ifdef __STDCPP_FLOAT128_T__
+float32_t f32g = 1.0F128;  // { dg-bogus "converting to 'std::float32_t' \\\{aka 
'_Float32'\\\} from '_Float128' with greater conversion rank" "" { target { float32 
&& float128 } } }
+#endif
+#endif
+#ifdef __STDCPP_FLOAT64_T__
+#ifdef __STDCPP_FLOAT128_T__
+float64_t f64g = 1.0F128;  // { dg-bogus "converting to 'std::float64_t' \\\{aka 
'_Float64'\\\} from '_Float128' with greater conversion rank" "" { target { float64 
&& float128 } } }
+#endif
+#endif
diff --git a/gcc/testsuite/g++.dg/cpp23/ext-floating17.C 
b/gcc/testsuite/g++.dg/cpp23/ext-floating17.C
new file 

Re: [PATCH v4] c++: implement P2564, consteval needs to propagate up [PR107687]

2023-11-13 Thread Jason Merrill

On 11/6/23 17:34, Marek Polacek wrote:

On Fri, Nov 03, 2023 at 01:51:07PM -0400, Jason Merrill wrote:

On 11/2/23 11:28, Marek Polacek wrote:

On Sat, Oct 14, 2023 at 12:56:11AM -0400, Jason Merrill wrote:

On 10/10/23 13:20, Marek Polacek wrote:

I suppose some
functions cannot possibly be promoted because they don't contain
any CALL_EXPRs.  So we may be able to rule them out while doing
cp_fold_r early.


Yes.  Or, the only immediate-escalating functions referenced have already
been checked.


It looks like you haven't pursued this yet?  One implementation thought:


Oops, I'd forgotten to address that.


maybe_store_cfun... could stop skipping immediate_escalating_function_p
(current_function_decl), and after we're done folding if the current
function isn't in the hash_set we can go ahead and set
DECL_ESCALATION_CHECKED_P?


Clever, I see what you mean.  IOW, we store c_f_d iff the function contains
an i-e expr.  If not, it can't possibly become consteval.  I've added that
into cp_fold_function, and it seems to work well...

...except it revealed a different problem: cp_fold_r -> cp_fold will, since
https://gcc.gnu.org/pipermail/gcc-patches/2016-March/443993.html, remove
UNARY_PLUS_EXPR, leading us into this problem:

   // stmt = +id(i)
   cp_fold (...);
   // stmt = id(i)

and the subsequent tree walk walks the CALL_EXPR's operands, so
cp_fold_immediate_r will never see the CALL_EXPR, so we miss an i-e expr.

Perhaps a better solution than the kludge I added would be to only call
cp_fold_immediate_r after cp_fold.  Or potentially before /and/ after if
cp_fold changes the expression?


Or walk everything with cp_fold_immediate_r before walking again with 
cp_fold_r?



It also seems odd that the ADDR_EXPR case calls vec_safe_push
(deferred_escalating_exprs, while the CALL_EXPR case calls
maybe_store_cfun_for_late_checking, why the different handling?


maybe_store_cfun_for_late_checking saves current_function_decl
so that we can check:

void g (int i) {
fn (i); // error if fn promotes to consteval
}


Yes, but why don't we want the same handling for ADDR_EXPR?


The handling can't be exactly the same due to global vars like

   auto p1 = ;

...but it's wrong to only save the ADDR_EXPR if it's enclosed in
a function, because the ADDR_EXPR could be inside a consteval if
block, in which case I think we're not supposed to error.  Tested
in consteval-prop20.C.  Thanks,


And we don't need the !current_function_decl handling for CALL_EXPR?

The only significant difference I see between  and f() for escalation 
is that the latter might be an immediate invocation.  Once we've 
determined that it's not, so we are in fact looking at an 
immediate-escalating expression, I'd expect the promotion handling to be 
identical.



+  /* Whether cp_fold_immediate_r is looking for immediate-escalating
+ expressions.  */


Isn't that always what it's doing?

The uses of ff_escalating in maybe_explain_promoted_consteval and 
maybe_escalate_decl_and_cfun seem to have different purposes that I'm 
having trouble following.


For the former, it seems to control returning the offending expression 
rather than error_mark_node.  Why don't we always do that?


For the latter, it seems to control recursion, which seems redundant 
with the recursion in that latter function itself.  And the use of the 
flag seems redundant with at_eof.



+/* Remember that the current function declaration contains a call to
+   a function that might be promoted to consteval later.  */
+
+static void
+maybe_store_cfun_for_late_checking ()


This name could say more about escalation?  Maybe 
...for_escalation_checking?


Or, better, merge this with maybe_store_immediate_escalating_fn?


+/* Figure out if DECL should be promoted to consteval and if so, maybe also
+   promote the function we are in currently.  CALL is the CALL_EXPR of DECL.
+   EVALP is where we may store the result of cxx_constant_value so that we
+   don't have to evaluate the same tree again in cp_fold_immediate_r.  */
+
+static void
+maybe_escalate_decl_and_cfun (tree decl, tree call, tree *evalp)
+{
+  if (cp_unevaluated_operand)
+return;
+
+  /* What we're calling is not a consteval function but it may become
+ one.  This requires recursing; DECL may be promoted to consteval
+ because it contains an escalating expression E, but E itself may
+ have to be promoted first, etc.  */
+  if (unchecked_immediate_escalating_function_p (decl))
+{
+  cp_fold_data data (ff_escalating, decl);
+  cp_walk_tree (_SAVED_TREE (decl), cp_fold_immediate_r,
+   , nullptr);
+  DECL_ESCALATION_CHECKED_P (decl) = true;


Why recurse both here and in cp_fold_immediate_r?


+}
+
+  /* In turn, maybe promote the function we find ourselves in...  */
+  if (DECL_IMMEDIATE_FUNCTION_P (decl)
+  /* ...but not if the call to DECL was constant; that is the
+"an immediate invocation that is not a constant expression"
+case.  We do this 

[PATCH] Fix ICE in vectorizable_nonlinear_induction with bitfield.

2023-11-13 Thread liuhongt
 if (TREE_CODE (init_expr) == INTEGER_CST)
init_expr = fold_convert (TREE_TYPE (vectype), init_expr);
  else
gcc_assert (tree_nop_conversion_p (TREE_TYPE (vectype),
   TREE_TYPE (init_expr)));

and init_expr is a 24-bit integer type while vectype has 32-bit components.

The "fix" is to bail out instead of asserting.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk? (and backport to GCC-13)

gcc/ChangeLog:

PR tree-optimization/112496
* tree-vect-loop.cc (vectorizable_nonlinear_induction): Return
false when !tree_nop_conversion_p (TREE_TYPE (vectype),
TREE_TYPE (init_expr)).

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr112496.c: New test.
---
 gcc/testsuite/gcc.target/i386/pr112496.c |  7 +++
 gcc/tree-vect-loop.cc| 13 ++---
 2 files changed, 17 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr112496.c

diff --git a/gcc/testsuite/gcc.target/i386/pr112496.c 
b/gcc/testsuite/gcc.target/i386/pr112496.c
new file mode 100644
index 000..c478fda9cce
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr112496.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+struct T { int x : 24; } v;
+void f1(int x) {
+  while (v.x - ((v.x <<= 1) - v.x)) ;
+}
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 1304b7ece79..75d69bbc30e 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -9429,9 +9429,16 @@ vectorizable_nonlinear_induction (loop_vec_info 
loop_vinfo,
 
   if (TREE_CODE (init_expr) == INTEGER_CST)
 init_expr = fold_convert (TREE_TYPE (vectype), init_expr);
-  else
-gcc_assert (tree_nop_conversion_p (TREE_TYPE (vectype),
-  TREE_TYPE (init_expr)));
+  else if (!tree_nop_conversion_p (TREE_TYPE (vectype), TREE_TYPE (init_expr)))
+{
+  /* INIT_EXPR could be a bit_field, bail out for such case.  */
+  if (dump_enabled_p ())
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+"nonlinear induction vectorization failed:"
+" component type of vectype is not a nop conversion"
+" from type of init_expr.\n");
+  return false;
+}
 
   switch (induction_type)
 {
-- 
2.31.1
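
As a side illustration of why a 24-bit induction value is not interchangeable with a 32-bit vector element, the sketch below shifts the same value in both widths. It is not taken from the patch: `Narrow`, `shl_in_24_bits` and `shl_in_32_bits` are hypothetical names, and an unsigned bit-field is used (the test case uses a signed one) so that the wrap-around is well defined.

```cpp
#include <cstdint>

// Hypothetical 24-bit container, loosely modelled on the test case's
// "int x : 24" but unsigned so that truncation is well defined.
struct Narrow { unsigned x : 24; };

// Shift left by one with the result stored back into 24 bits.
constexpr std::uint32_t shl_in_24_bits(std::uint32_t v)
{
  Narrow n{0};
  n.x = v;     // keeps only the low 24 bits
  n.x <<= 1;   // the result is reduced modulo 2^24 when stored
  return n.x;
}

// The same shift performed in a full 32-bit element.
constexpr std::uint32_t shl_in_32_bits(std::uint32_t v)
{
  return v << 1;
}

// Bit 23 is shifted out of the bit-field but survives in 32 bits, so the
// two induction sequences diverge.
static_assert(shl_in_24_bits(0x800000u) == 0u, "wraps in 24 bits");
static_assert(shl_in_32_bits(0x800000u) == 0x1000000u, "kept in 32 bits");
```

Because the two widths disagree once the top bit is reached, treating the conversion as a no-op would be wrong, which is consistent with bailing out of vectorization rather than asserting.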



Re: [PATCH 2/4] c-family, C: handle clang attributes [PR109877].

2023-11-13 Thread Joseph Myers
On Sun, 12 Nov 2023, Iain Sandoe wrote:

> This adds the ability to defer the validation of numeric attribute
> arguments until the sequence is parsed if the attribute being
> handled is one known to be 'clang form'.

This is only for __attribute__ and not [[]]-style attributes, is that as 
intended?  (Doing it for [[]] might be harder because of how the tokens 
after [[ have to be lexed early for Objective-C to see whether there is a 
matching ]] and thus whether it's an attribute at all.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] c-family: Let libcpp know when the compilation is for a PCH [PR9471]

2023-11-13 Thread Joseph Myers
On Fri, 10 Nov 2023, Lewis Hyatt wrote:

> Hello-
> 
> The PR may be 20 years old, but by now it only needs a one-line fix :). Is
> it OK please? Bootstrapped + regtested all languages on x86-64 Linux.
> Thanks!

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] libstdc++: Improve operator-(weekday x, weekday y).

2023-11-13 Thread Cassio Neri
The current implementation calls __detail::__modulo which is relatively
expensive.

A better implementation is possible if we assume that x.ok() && y.ok() == true,
so that n = x.c_encoding() - y.c_encoding() is in [-6, 6]. In this case, it
suffices to return n >= 0 ? n : n + 7.

The above is allowed by [time.cal.wd.nonmembers]/5: the returned value is
unspecified unless x.ok() && y.ok() == true.

The assembly emitted for x86-64 and ARM can be seen in:
https://godbolt.org/z/nMdc5vv9n.

libstdc++-v3/ChangeLog:

* include/std/chrono:
---

OK for trunk?

 libstdc++-v3/include/std/chrono | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/std/chrono b/libstdc++-v3/include/std/chrono
index 10e868e5a03..6131e7e97b3 100644
--- a/libstdc++-v3/include/std/chrono
+++ b/libstdc++-v3/include/std/chrono
@@ -1036,8 +1036,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   friend constexpr days
   operator-(const weekday& __x, const weekday& __y) noexcept
   {
-   auto __n = static_cast(__x._M_wd) - __y._M_wd;
-   return days{__detail::__modulo(__n, 7)};
+   const auto __n = __x.c_encoding() - __y.c_encoding();
+   return static_cast(__n) >= 0 ? days{__n} : days{__n + 7};
   }
 };

--
2.41.0
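
The arithmetic in the description above can be checked in isolation with a small sketch. The names `weekday_diff` and `weekday_diff_ref` are hypothetical and plain integers stand in for `weekday` and `days`; this is not the libstdc++ code, only the same reasoning under the assumption that both encodings lie in [0, 6].

```cpp
// Branch-based difference: with both encodings in [0, 6], n lies in
// [-6, 6], so adding 7 to a negative n is enough.
constexpr int weekday_diff(unsigned x, unsigned y)
{
  const int n = static_cast<int>(x) - static_cast<int>(y);
  return n >= 0 ? n : n + 7;
}

// Reference version using a general non-negative modulo.
constexpr int weekday_diff_ref(unsigned x, unsigned y)
{
  const int n = static_cast<int>(x) - static_cast<int>(y);
  return ((n % 7) + 7) % 7;
}

// Compares both versions over all 49 valid pairs at compile time.
constexpr bool all_pairs_agree()
{
  for (unsigned x = 0; x < 7; ++x)
    for (unsigned y = 0; y < 7; ++y)
      if (weekday_diff(x, y) != weekday_diff_ref(x, y))
        return false;
  return true;
}

static_assert(all_pairs_agree(), "both versions agree on valid input");
static_assert(weekday_diff(0, 6) == 1, "Sunday - Saturday is one day");
```

For encodings outside [0, 6] the two versions can differ, which is acceptable only because the result is unspecified in that case.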



Re: [PATCH 4/4] maintainer-scripts/gcc_release: cleanup whitespace

2023-11-13 Thread Joseph Myers
On Fri, 10 Nov 2023, Sam James wrote:

> Joseph Myers  writes:
> 
> > On Thu, 2 Nov 2023, Sam James wrote:
> >
> >> maintainer-scripts/
> >>* gcc_release: Cleanup whitespace.
> >
> > OK.
> 
> Thanks. Would you mind pushing the two you approved?

Done.

-- 
Joseph S. Myers
jos...@codesourcery.com


[committed] libcpp: Regenerate config.in

2023-11-13 Thread Arsen Arsenović
The previous commit did not include regenerating files maintained by
autoheader.

libcpp/ChangeLog:

* config.in: Regenerate.
---
Pushed as obvious in response to the quite neat buildbot:
https://builder.sourceware.org/buildbot/#/builders/269/builds/56

 libcpp/config.in | 17 -
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/libcpp/config.in b/libcpp/config.in
index 32397b821667..df4fd44c9ef6 100644
--- a/libcpp/config.in
+++ b/libcpp/config.in
@@ -38,9 +38,21 @@
*/
 #undef HAVE_ALLOCA_H
 
+/* Define to 1 if you have the Mac OS X function
+   CFLocaleCopyPreferredLanguages in the CoreFoundation framework. */
+#undef HAVE_CFLOCALECOPYPREFERREDLANGUAGES
+
+/* Define to 1 if you have the Mac OS X function CFPreferencesCopyAppValue in
+   the CoreFoundation framework. */
+#undef HAVE_CFPREFERENCESCOPYAPPVALUE
+
 /* Define to 1 if you have the `clearerr_unlocked' function. */
 #undef HAVE_CLEARERR_UNLOCKED
 
+/* Define if the GNU dcgettext() function is already present or preinstalled.
+   */
+#undef HAVE_DCGETTEXT
+
 /* Define to 1 if you have the declaration of `abort', and to 0 if you don't.
*/
 #undef HAVE_DECL_ABORT
@@ -171,7 +183,10 @@
 /* Define to 1 if you have the `getc_unlocked' function. */
 #undef HAVE_GETC_UNLOCKED
 
-/* Define if you have the iconv() function. */
+/* Define if the GNU gettext() function is already present or preinstalled. */
+#undef HAVE_GETTEXT
+
+/* Define if you have the iconv() function and it works. */
 #undef HAVE_ICONV
 
 /* Define to 1 if you have the  header file. */
-- 
2.42.1



Re: [PATCH 0/3] Option handling: add documentation URLs

2023-11-13 Thread Joseph Myers
On Fri, 10 Nov 2023, David Malcolm wrote:

> The .opt.urls files it generates become part of the source tree, and
> would be regenerated by maintainers whenever new options are added.
> Forgetting to update the files (or not having Python 3 handy) merely
> means that URLs might be missing or out of date until someone else
> regenerates them.

Do I understand correctly that there are no makefile targets to regenerate 
these files; it's up to maintainers to regenerate them manually?

Advantages:

* No need to update contrib/gcc_update to handle timestamps for the files.

* No modifications unexpectedly appearing in source trees, if the checked 
in files are out of date and you run a build with the timestamps such that 
the file gets regenerated.

Disadvantages:

* You need to know how to do the regeneration manually; "make" is the 
uniform way for generating any file the build system can generate, without 
needing more specific knowledge about that file.

Given the recent discussion starting at 
 of 
post-commit CI to detect auto*-generated files that aren't fully up to 
date, maybe it would be appropriate to add a check for .opt.urls files 
being up to date (including making sure that each .opt file does have a 
corresponding .opt.urls file checked in) to that CI?

Since the Python script has hardcoded information about .opt files and 
corresponding URLs for target options documentation, the patch series 
should update sourcebuild.texi, section "Back End", to identify that 
script as one of the places to update when adding a new target back end.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 0/5] Add support for operand-specific alignment requirements

2023-11-13 Thread Vladimir Makarov



On 11/12/23 09:52, Richard Sandiford wrote:

SME has various instructions that require aligned register tuples.
However, the associated tuple modes are already widely used and do
not need to be aligned in other contexts.  It therefore isn't
appropriate to force alignment in TARGET_HARD_REGNO_MODE_OK.

There are also strided loads and stores that require:

- (regno & 0x8) == 0 for 2-register tuples
- (regno & 0xc) == 0 for 4-register tuples

Although the requirements for strided loads and stores could be
enforced by C++ conditions on the insn, it's convenient to handle
them in the same way as alignment.
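
As a rough illustration of the two kinds of start-register restriction described above, the conditions can be written as simple predicates. The function names are hypothetical and this is not the interface added by the series; the alignment rule (start register a multiple of the tuple size) is an assumption made for the sketch.

```cpp
// Strided loads and stores: the condition quoted above, keyed on the
// number of registers in the tuple.  Other sizes are left unrestricted
// in this sketch.
constexpr bool strided_start_ok(unsigned regno, unsigned nregs)
{
  switch (nregs)
    {
    case 2:
      return (regno & 0x8) == 0;
    case 4:
      return (regno & 0xc) == 0;
    default:
      return true;
    }
}

// Aligned tuples: assumed here to mean that the start register is a
// multiple of the tuple size.
constexpr bool aligned_start_ok(unsigned regno, unsigned nregs)
{
  return nregs != 0 && regno % nregs == 0;
}

static_assert(strided_start_ok(0, 2), "bit 3 clear");
static_assert(!strided_start_ok(8, 2), "bit 3 set");
static_assert(!strided_start_ok(4, 4), "bit 2 set");
static_assert(aligned_start_ok(4, 4) && !aligned_start_ok(2, 4),
              "alignment is a different test");
```

Both predicates depend only on the start register and the tuple size, which is why it is convenient to express them through the same per-constraint filter rather than through separate C++ conditions on each insn.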

This series of patches therefore adds a way for register constraints
to specify which start registers are valid and which aren't.  Most of
the details are in the covering note to the first patch.

This is clearly changing a performance-sensitive part of the compiler.
I've tried to ensure that the overhead is only small for targets that
use the new feature.  Almost all of the new code gets optimised away
on targets that don't use the feature.

Richard Sandiford (5):
   Add register filter operand to define_register_constraint
   recog: Handle register filters
   lra: Handle register filters
   ira: Handle register filters
   Add an aligned_register_operand predicate

  gcc/common.md  |  28 
  gcc/doc/md.texi|  41 +++-
  gcc/doc/tm.texi|   3 +-
  gcc/doc/tm.texi.in |   3 +-
  gcc/genconfig.cc   |   2 +
  gcc/genpreds.cc| 146 -
  gcc/gensupport.cc  |  48 +-
  gcc/gensupport.h   |   3 +
  gcc/ira-build.cc   |   8 +++
  gcc/ira-color.cc   |  10 +++
  gcc/ira-int.h  |  14 
  gcc/ira-lives.cc   |  61 +
  gcc/lra-constraints.cc |  13 +++-
  gcc/recog.cc   |  14 +++-
  gcc/recog.h|  24 ++-
  gcc/reginfo.cc |   5 ++
  gcc/rtl.def|   6 +-
  gcc/target-globals.cc  |   6 +-
  gcc/target-globals.h   |   3 +
  19 files changed, 421 insertions(+), 17 deletions(-)

Collecting all occurrence constraints for IRA might result in worse
allocation (when a pseudo is spilled because of this) compared with
using a wider hard reg set and generating reload insns for the pseudo
occurrences that require stricter constraints.  Regional RA mitigates
this issue.  In any case, the IRA changes are an improvement over using
only hard_regno_mode_ok.  Using smaller constraints in certain cases for
pseudos spilled after using the biggest constraint is just an idea for
further RA improvement on targets using the filters.  The only question
is whether it is worth implementing.


All IRA/LRA/reginfo patches are OK for me.  IMHO the other changes are
straightforward enough that nobody else needs to be asked to review them.


Thank you, Richard.




Re: [PATCH 2/4] c-family, C: handle clang attributes [PR109877].

2023-11-13 Thread Marek Polacek
On Sun, Nov 12, 2023 at 08:02:42PM -1000, Iain Sandoe wrote:
> This adds the ability to defer the validation of numeric attribute
> arguments until the sequence is parsed if the attribute being
> handled is one known to be 'clang form'.
> 
> We do this by considering the arguments to be strings regardless
> of content and defer the interpretation of those strings until the
> argument processing.

I don't see any tests here nor in the C++ part of the patch.  Is it
possible to add some (I suppose for now only attribute availability)?

FWIW, for chaining attributes it's best to use attr_chainon since that
handles error_mark_node.  Unfortunately that's currently only in cp/.
 
>   PR c++/109877
> 
> gcc/c-family/ChangeLog:
> 
>   * c-lex.cc (c_lex_with_flags): Allow for the case where
>   we wish to defer interpretation of numeric values until
>   parse time.
>   * c-pragma.h (C_LEX_NUMBER_AS_STRING): New.
> 
> gcc/c/ChangeLog:
> 
>   * c-parser.cc (struct c_parser): Provide a flag to notify
> that argument parsing should return attribute arguments
> as string constants.
>   (c_lex_one_token): Act to defer numeric value validation.
>   (c_parser_clang_attribute_arguments): New.
>   (c_parser_gnu_attribute): Allow for clang-form GNU-style
>   attributes.
> 
> Signed-off-by: Iain Sandoe 
> ---
>  gcc/c-family/c-lex.cc   |  15 ++
>  gcc/c-family/c-pragma.h |   3 ++
>  gcc/c/c-parser.cc   | 109 ++--
>  3 files changed, 122 insertions(+), 5 deletions(-)
> 
> diff --git a/gcc/c-family/c-lex.cc b/gcc/c-family/c-lex.cc
> index 06c2453c89a..d535f5b460c 100644
> --- a/gcc/c-family/c-lex.cc
> +++ b/gcc/c-family/c-lex.cc
> @@ -533,6 +533,21 @@ c_lex_with_flags (tree *value, location_t *loc, unsigned 
> char *cpp_flags,
>  
>  case CPP_NUMBER:
>{
> + /* If the user wants number-like entities to be returned as a raw
> +string, then don't try to classify them, which emits unwanted
> +diagnostics.  */
> + if (lex_flags & C_LEX_NUMBER_AS_STRING)
> +   {
> + /* build_string adds a trailing NUL at [len].  */
> + tree num_string = build_string (tok->val.str.len + 1,
> + (const char *) tok->val.str.text);
> + TREE_TYPE (num_string) = char_array_type_node;
> + *value = num_string;
> + /* We will effectively note this as CPP_N_INVALID, because we
> +made no checks here.  */
> + break;
> +   }
> +
>   const char *suffix = NULL;
>   unsigned int flags = cpp_classify_number (parse_in, tok, , *loc);
>  
> diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
> index 98177913053..11cde74f9f0 100644
> --- a/gcc/c-family/c-pragma.h
> +++ b/gcc/c-family/c-pragma.h
> @@ -276,6 +276,9 @@ extern void pragma_lex_discard_to_eol ();
>  #define C_LEX_STRING_NO_JOIN   2 /* Do not concatenate strings
>  nor translate them into execution
>  character set.  */
> +#define C_LEX_NUMBER_AS_STRING 4 /* Do not classify a number, but
> +instead return it as a raw
> +string.  */
>  
>  /* This is not actually available to pragma parsers.  It's merely a
> convenient location to declare this function for c-lex, after
> diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
> index 703f9570dbc..16cc05d 100644
> --- a/gcc/c/c-parser.cc
> +++ b/gcc/c/c-parser.cc
> @@ -217,6 +217,9 @@ struct GTY(()) c_parser {
>   should translate them to the execution character set (false
>   inside attributes).  */
>BOOL_BITFIELD translate_strings_p : 1;
> +  /* True if we want to lex arbitrary number-like sequences as their
> + string representation.  */
> +  BOOL_BITFIELD lex_number_as_string : 1;
>  
>/* Objective-C specific parser/lexer information.  */
>  
> @@ -308,10 +311,10 @@ c_lex_one_token (c_parser *parser, c_token *token, bool 
> raw = false)
>  
>if (raw || vec_safe_length (parser->raw_tokens) == 0)
>  {
> +  int lex_flags = parser->lex_joined_string ? 0 : C_LEX_STRING_NO_JOIN;
> +  lex_flags |= parser->lex_number_as_string ? C_LEX_NUMBER_AS_STRING : 0;
>token->type = c_lex_with_flags (>value, >location,
> -   >flags,
> -   (parser->lex_joined_string
> -? 0 : C_LEX_STRING_NO_JOIN));
> +   >flags, lex_flags);
>token->id_kind = C_ID_NONE;
>token->keyword = RID_MAX;
>token->pragma_kind = PRAGMA_NONE;
> @@ -5210,6 +5213,98 @@ c_parser_gnu_attribute_any_word (c_parser *parser)
>return attr_name;
>  }
>  
> +/* Handle parsing clang-form attribute arguments, where we need to adjust
> +   the parsing rules to relate to a specific attribute.  */
> +
> +static tree

Re: [PATCH] Add type-generic clz/ctz/clrsb/ffs/parity/popcount builtins [PR111309]

2023-11-13 Thread Joseph Myers
On Fri, 10 Nov 2023, Jakub Jelinek wrote:

> > Besides the above question I'd say OK (I assume Josephs reply is a
> > general ack from his side).
> 
> Joseph, what are your thoughts on the above?

It's correct not to promote, since that matches the semantics of the 
standard type-generic macros.  (I did suggest in WG14 that the 
type-generic macros might make more sense in the cases of functions that 
are genuinely just functions of their integer argument and not of its 
type, such as population count, than for functions where the result for a 
given integer argument depends on the width of its type and not just the 
integer value, or that passing an explicit width argument might be 
appropriate for type-generic macros in cases where the width matters, but 
WG14 wanted all the type-generic macros as-is.)
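
To make the distinction above concrete, here is a minimal sketch.  It does
not use the builtins from the patch; leading_zeros and population_count are
hypothetical helpers written only for this illustration.  The point is that
a leading-zero count depends on the width of the argument's type (so
promoting the argument would change the answer), while a population count
depends only on the value.

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// Hypothetical helpers for illustration only; these are not the builtins
// added by the patch.

// Leading zero bits of X, counted within the width of T itself, i.e. the
// argument is deliberately not promoted to int first.
template <typename T>
int leading_zeros (T x)
{
  static_assert (std::is_unsigned<T>::value, "unsigned types only");
  int n = 0;
  for (int bit = std::numeric_limits<T>::digits - 1; bit >= 0; --bit)
    {
      if ((x >> bit) & 1)
	break;
      ++n;
    }
  // The result can never exceed the width of T.
  assert (n <= std::numeric_limits<T>::digits);
  return n;
}

// Number of set bits; depends only on the value, not on the width of T.
template <typename T>
int population_count (T x)
{
  static_assert (std::is_unsigned<T>::value, "unsigned types only");
  int n = 0;
  while (x != 0)
    {
      x = static_cast<T> (x & (x - 1));  // clear the lowest set bit
      ++n;
    }
  return n;
}
```

With these helpers, the value 1 has a different leading-zero count as an
unsigned char than as an unsigned int, but the same population count in
both types.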

-- 
Joseph S. Myers
jos...@codesourcery.com


[committed] libstdc++: Add dg-timeout-factor to remaining IO tests

2023-11-13 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk.

-- >8 --

I meant to add these changes as part of r14-4959-g7d06b29f814580 but
missed these files out.

libstdc++-v3/ChangeLog:

* testsuite/std/time/clock/file/io.cc: Double timeout using
dg-timeout-factor.
* testsuite/std/time/clock/gps/io.cc: Likewise.
* testsuite/std/time/clock/local/io.cc: Likewise.
* testsuite/std/time/clock/system/io.cc: Likewise.
* testsuite/std/time/clock/tai/io.cc: Likewise.
* testsuite/std/time/clock/utc/io.cc: Likewise.
---
 libstdc++-v3/testsuite/std/time/clock/file/io.cc   | 1 +
 libstdc++-v3/testsuite/std/time/clock/gps/io.cc| 1 +
 libstdc++-v3/testsuite/std/time/clock/local/io.cc  | 1 +
 libstdc++-v3/testsuite/std/time/clock/system/io.cc | 1 +
 libstdc++-v3/testsuite/std/time/clock/tai/io.cc| 1 +
 libstdc++-v3/testsuite/std/time/clock/utc/io.cc| 1 +
 6 files changed, 6 insertions(+)

diff --git a/libstdc++-v3/testsuite/std/time/clock/file/io.cc 
b/libstdc++-v3/testsuite/std/time/clock/file/io.cc
index c3ac6cf4d6d..39eb0dbcf51 100644
--- a/libstdc++-v3/testsuite/std/time/clock/file/io.cc
+++ b/libstdc++-v3/testsuite/std/time/clock/file/io.cc
@@ -1,4 +1,5 @@
 // { dg-do run { target c++20 } }
+// { dg-timeout-factor 2 }
 
 #include 
 #include 
diff --git a/libstdc++-v3/testsuite/std/time/clock/gps/io.cc 
b/libstdc++-v3/testsuite/std/time/clock/gps/io.cc
index eac1c13c63a..6f4544fcd97 100644
--- a/libstdc++-v3/testsuite/std/time/clock/gps/io.cc
+++ b/libstdc++-v3/testsuite/std/time/clock/gps/io.cc
@@ -1,4 +1,5 @@
 // { dg-do run { target c++20 } }
+// { dg-timeout-factor 2 }
 
 #include 
 #include 
diff --git a/libstdc++-v3/testsuite/std/time/clock/local/io.cc 
b/libstdc++-v3/testsuite/std/time/clock/local/io.cc
index b16ddbab90b..bb682cd40cf 100644
--- a/libstdc++-v3/testsuite/std/time/clock/local/io.cc
+++ b/libstdc++-v3/testsuite/std/time/clock/local/io.cc
@@ -1,4 +1,5 @@
 // { dg-do run { target c++20 } }
+// { dg-timeout-factor 2 }
 
 #include 
 #include 
diff --git a/libstdc++-v3/testsuite/std/time/clock/system/io.cc 
b/libstdc++-v3/testsuite/std/time/clock/system/io.cc
index d0962353c29..2cc116156f2 100644
--- a/libstdc++-v3/testsuite/std/time/clock/system/io.cc
+++ b/libstdc++-v3/testsuite/std/time/clock/system/io.cc
@@ -1,4 +1,5 @@
 // { dg-do run { target c++20 } }
+// { dg-timeout-factor 2 }
 
 #include 
 #include 
diff --git a/libstdc++-v3/testsuite/std/time/clock/tai/io.cc 
b/libstdc++-v3/testsuite/std/time/clock/tai/io.cc
index 0c5b5872d0c..0fd61c0e612 100644
--- a/libstdc++-v3/testsuite/std/time/clock/tai/io.cc
+++ b/libstdc++-v3/testsuite/std/time/clock/tai/io.cc
@@ -1,4 +1,5 @@
 // { dg-do run { target c++20 } }
+// { dg-timeout-factor 2 }
 
 #include 
 #include 
diff --git a/libstdc++-v3/testsuite/std/time/clock/utc/io.cc 
b/libstdc++-v3/testsuite/std/time/clock/utc/io.cc
index f39eb3f3e8e..58e358f3dbf 100644
--- a/libstdc++-v3/testsuite/std/time/clock/utc/io.cc
+++ b/libstdc++-v3/testsuite/std/time/clock/utc/io.cc
@@ -1,4 +1,5 @@
 // { dg-do run { target c++20 } }
+// { dg-timeout-factor 2 }
 
 #include 
 #include 
-- 
2.41.0



[committed] libstdc++: Micro-optimization for std::optional [PR112480]

2023-11-13 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk.

-- >8 --

This small change removes a branch when clearing a std::optional for
types with no-op destructors. For types where the destructor can be
optimized away (e.g. because it's trivial, or empty and can be inlined)
the _M_destroy() function does nothing but set _M_engaged to false.
Setting _M_engaged=false unconditionally is cheaper than only doing it
when initially true, because it allows the compiler to remove a branch.

The compiler thinks it would be incorrect to unconditionally introduce a
store there, because it could conflict with reads in other threads, so
it won't do that optimization itself. We know it's safe to do because
we're in a non-const member function, so the standard forbids any
potentially concurrent calls to other member functions of the same
object. Making the store unconditional can't create a data race that
isn't already present in the program.
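
As a rough sketch of the idea (a simplified, hypothetical payload type, not
the actual libstdc++ code), the two forms of the reset function look like
this.  The names payload, reset_conditional and reset_unconditional are
made up for the example.

```cpp
#include <cassert>

// Simplified, hypothetical stand-in for the optional payload of a type
// whose destructor is a no-op.  Not the libstdc++ implementation.
struct payload
{
  bool engaged = false;

  // With a no-op destructor, destroying the value only clears the flag.
  void destroy () { engaged = false; }

  // Old form: the flag is stored only on the engaged path, so the
  // compiler has to keep the branch.
  void reset_conditional ()
  {
    if (engaged)
      destroy ();
  }

  // New form: both paths store false, so the compiler is free to fold
  // the branch into a single unconditional store.
  void reset_unconditional ()
  {
    if (engaged)
      destroy ();
    else
      engaged = false;
  }
};

// Both forms leave the payload disengaged whatever the initial state.
inline void reset_demo ()
{
  payload a, b;
  a.engaged = b.engaged = true;
  a.reset_conditional ();
  b.reset_unconditional ();
  assert (!a.engaged && !b.engaged);
}
```

The observable behaviour is identical; only the set of paths that perform a
store differs, which is what lets the optimizer drop the comparison.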

libstdc++-v3/ChangeLog:

PR libstdc++/112480
* include/std/optional (_Optional_payload_base::_M_reset): Set
_M_engaged to false unconditionally.
---
 libstdc++-v3/include/std/optional | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libstdc++-v3/include/std/optional 
b/libstdc++-v3/include/std/optional
index 53450c760d9..a8c97717b72 100644
--- a/libstdc++-v3/include/std/optional
+++ b/libstdc++-v3/include/std/optional
@@ -311,6 +311,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   {
if (this->_M_engaged)
  _M_destroy();
+   else // This seems redundant but improves codegen, see PR 112480.
+ this->_M_engaged = false;
   }
 };
 
-- 
2.41.0



[PATCH] LoongArch: Use finer-grained DBAR hints

2023-11-13 Thread Xi Ruoyao
LA664 defines DBAR hints 0x1 - 0x1f (except 0xf and 0x1f) as follows [1-2]:

- Bit 4: kind of constraint (0: completion, 1: ordering)
- Bit 3: barrier for previous read (0: true, 1: false)
- Bit 2: barrier for previous write (0: true, 1: false)
- Bit 1: barrier for succeeding read (0: true, 1: false)
- Bit 0: barrier for succeeding write (0: true, 1: false)

LLVM has already utilized them for different memory orders [3]:

- Bit 4 is always set to one because it's only intended to be zero for
  things like MMIO devices, which are out of the scope of memory orders.
- An acquire barrier is used to implement acquire loads like

ld.d $a1, $t0, 0
dbar acquire_hint

  where the load operation (ld.d) should not be reordered with any load
  or store operation after the acquire load.  To accomplish this
  constraint, we need to prevent the load operation from being reordered
  after the barrier, and also prevent any following load/store operation
  from being reordered before the barrier.  Thus bits 0, 1, and 3 must
  be zero, and bit 2 can be one, so acquire_hint should be 0b10100.
- A release barrier is used to implement release stores like

dbar release_hint
st.d $a1, $t0, 0

  where the store operation (st.d) should not be reordered with any load
  or store operation before the release store.  So we need to prevent
  the store operation from being reordered before the barrier, and also
  prevent any preceding load/store operation from being reordered after
  the barrier.  So bits 0, 2, 3 must be zero, and bit 1 can be one.  So
  release_hint should be 0b10010.
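
The bit layout above can be checked with a small sketch.  dbar_hint is a
hypothetical helper written for this illustration (it is not part of the
patch); it only encodes the rule that a cleared bit 0-3 means "this barrier
is required" and that bit 4 selects an ordering constraint.

```cpp
// Hypothetical helper: build a DBAR hint from the constraints we want.
// For bits 0-3 a *cleared* bit means the barrier is required; bit 4 set
// means "ordering" rather than "completion".
constexpr unsigned
dbar_hint (bool ordering, bool prev_read, bool prev_write,
	   bool succ_read, bool succ_write)
{
  return (ordering ? 1u << 4 : 0u)
	 | (prev_read ? 0u : 1u << 3)
	 | (prev_write ? 0u : 1u << 2)
	 | (succ_read ? 0u : 1u << 1)
	 | (succ_write ? 0u : 1u << 0);
}

// Acquire: order the previous read against all succeeding accesses;
// previous writes need not be ordered.
constexpr unsigned acquire_hint = dbar_hint (true, true, false, true, true);

// Release: order all previous accesses against the succeeding write;
// succeeding reads need not be ordered.
constexpr unsigned release_hint = dbar_hint (true, true, true, false, true);

static_assert (acquire_hint == 0x14, "acquire hint is 0b10100");
static_assert (release_hint == 0x12, "release hint is 0b10010");
```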

A similar mapping has been utilized for RISC-V GCC [4], LoongArch Linux
kernel [1], and LoongArch LLVM [3].  So the mapping should be correct.
And I've also bootstrapped & regtested GCC on a LA664 with this patch.

The LoongArch CPUs should treat "unknown" hints as dbar 0, so we can
unconditionally emit the new hints without a compiler switch.

[1]: https://git.kernel.org/torvalds/c/e031a5f3f1ed
[2]: https://github.com/loongson-community/docs/pull/12
[3]: https://github.com/llvm/llvm-project/pull/68787
[4]: https://gcc.gnu.org/r14-406

gcc/ChangeLog:

* config/loongarch/sync.md (mem_thread_fence): Remove redundant
check.
(mem_thread_fence_1): Emit finer-grained DBAR hints for
different memory models, instead of 0.
---

Bootstrapped and regtested on loongarch64-linux-gnu (running on a
LA664).  Ok for trunk?

 gcc/config/loongarch/sync.md | 49 +---
 1 file changed, 40 insertions(+), 9 deletions(-)

diff --git a/gcc/config/loongarch/sync.md b/gcc/config/loongarch/sync.md
index db3a21690b8..511aba5ffa6 100644
--- a/gcc/config/loongarch/sync.md
+++ b/gcc/config/loongarch/sync.md
@@ -50,23 +50,54 @@ (define_expand "mem_thread_fence"
   [(match_operand:SI 0 "const_int_operand" "")] ;; model
   ""
 {
-  if (INTVAL (operands[0]) != MEMMODEL_RELAXED)
-{
-  rtx mem = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode));
-  MEM_VOLATILE_P (mem) = 1;
-  emit_insn (gen_mem_thread_fence_1 (mem, operands[0]));
-}
+  rtx mem = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode));
+  MEM_VOLATILE_P (mem) = 1;
+  emit_insn (gen_mem_thread_fence_1 (mem, operands[0]));
+
   DONE;
 })
 
-;; Until the LoongArch memory model (hence its mapping from C++) is finalized,
-;; conservatively emit a full FENCE.
+;; DBAR hint encoding for LA664 and later micro-architectures, paraphrased from
+;; the Linux patch revealing it [1]:
+;;
+;; - Bit 4: kind of constraint (0: completion, 1: ordering)
+;; - Bit 3: barrier for previous read (0: true, 1: false)
+;; - Bit 2: barrier for previous write (0: true, 1: false)
+;; - Bit 1: barrier for succeeding read (0: true, 1: false)
+;; - Bit 0: barrier for succeeding write (0: true, 1: false)
+;;
+;; [1]: https://git.kernel.org/torvalds/c/e031a5f3f1ed
+;;
+;; Implementations without support for the finer-granularity hints simply treat
+;; all as the full barrier (DBAR 0), so we can unconditionally start emiting 
the
+;; more precise hints right away.
 (define_insn "mem_thread_fence_1"
   [(set (match_operand:BLK 0 "" "")
(unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))
(match_operand:SI 1 "const_int_operand" "")] ;; model
   ""
-  "dbar\t0")
+  {
+enum memmodel model = memmodel_base (INTVAL (operands[1]));
+
+switch (model)
+  {
+  case MEMMODEL_ACQUIRE:
+  case MEMMODEL_CONSUME:
+   /* Consume is implemented using the stronger acquire memory order
+  because of a deficiency in C++11's semantics.  */
+   return "dbar\t0b10100";
+  case MEMMODEL_RELEASE:
+   return "dbar\t0b10010";
+  case MEMMODEL_ACQ_REL:
+  case MEMMODEL_SEQ_CST:
+   return "dbar\t0b1";
+  default:
+   /* GCC internal: "For the '__ATOMIC_RELAXED' model no instructions
+  need to be issued and this expansion is not invoked."
+  Other values should not be returned by memmodel_base.  */
+   

[committed] i386: Rewrite pushfl2 and popfl1 as unspecs

2023-11-13 Thread Uros Bizjak
Flags reg is valid only with CC mode.

gcc/ChangeLog:

* config/i386/i386-expand.h (gen_pushfl): New prototype.
(gen_popfl): Ditto.
* config/i386/i386-expand.cc (ix86_expand_builtin)
[case IX86_BUILTIN_READ_FLAGS]: Use gen_pushfl.
[case IX86_BUILTIN_WRITE_FLAGS]: Use gen_popfl.
* config/i386/i386.cc (gen_pushfl): New function.
(gen_popfl): Ditto.
* config/i386/i386.md (unspec): Add UNSPEC_PUSHFL and UNSPEC_POPFL.
(@pushfl2): Rename from *pushfl2.
Rewrite as unspec using UNSPEC_PUSHFL.
(@popfl1): Rename from *popfl1.
Rewrite as unspec using UNSPEC_POPFL.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index b52ec51fbe4..a8d871d321e 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -14265,7 +14265,7 @@ rdseed_step:
   if (ignore)
return const0_rtx;
 
-  emit_insn (gen_push (gen_rtx_REG (word_mode, FLAGS_REG)));
+  emit_insn (gen_pushfl ());
 
   if (optimize
  || target == NULL_RTX
@@ -14284,7 +14284,7 @@ rdseed_step:
op0 = copy_to_mode_reg (word_mode, op0);
 
   emit_insn (gen_push (op0));
-  emit_insn (gen_pop (gen_rtx_REG (word_mode, FLAGS_REG)));
+  emit_insn (gen_popfl ());
   return 0;
 
 case IX86_BUILTIN_KTESTC8:
diff --git a/gcc/config/i386/i386-expand.h b/gcc/config/i386/i386-expand.h
index 8e65f7dee45..1ea789c4c3a 100644
--- a/gcc/config/i386/i386-expand.h
+++ b/gcc/config/i386/i386-expand.h
@@ -45,7 +45,9 @@ enum calling_abi ix86_function_abi (const_tree fndecl);
 bool ix86_function_ms_hook_prologue (const_tree fn);
 void warn_once_call_ms2sysv_xlogues (const char *feature);
 rtx gen_push (rtx arg);
+rtx gen_pushfl (void);
 rtx gen_pop (rtx arg);
+rtx gen_popfl (void);
 rtx ix86_expand_builtin (tree exp, rtx target, rtx subtarget,
 machine_mode mode, int ignore);
 bool ix86_vectorize_vec_perm_const (machine_mode vmode, machine_mode op_mode,
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 176ca650aa2..683ac643bc8 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -6465,6 +6465,24 @@ gen_push (rtx arg)
  arg);
 }
 
+rtx
+gen_pushfl (void)
+{
+  struct machine_function *m = cfun->machine;
+  rtx flags, mem;
+
+  if (m->fs.cfa_reg == stack_pointer_rtx)
+m->fs.cfa_offset += UNITS_PER_WORD;
+  m->fs.sp_offset += UNITS_PER_WORD;
+
+  flags = gen_rtx_REG (CCmode, FLAGS_REG);
+
+  mem = gen_rtx_MEM (word_mode,
+gen_rtx_PRE_DEC (Pmode, stack_pointer_rtx));
+
+  return gen_pushfl2 (word_mode, mem, flags);
+}
+
 /* Generate an "pop" pattern for input ARG.  */
 
 rtx
@@ -6479,6 +6497,19 @@ gen_pop (rtx arg)
 stack_pointer_rtx)));
 }
 
+rtx
+gen_popfl (void)
+{
+  rtx flags, mem;
+
+  flags = gen_rtx_REG (CCmode, FLAGS_REG);
+
+  mem = gen_rtx_MEM (word_mode,
+gen_rtx_POST_INC (Pmode, stack_pointer_rtx));
+
+  return gen_popfl1 (word_mode, flags, mem);
+}
+
 /* Generate a "push2" pattern for input ARG.  */
 rtx
 gen_push2 (rtx mem, rtx reg1, rtx reg2)
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 47d21f12c5b..29289f48e9c 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -115,6 +115,8 @@ (define_c_enum "unspec" [
   UNSPEC_SBB
   UNSPEC_CC_NE
   UNSPEC_STC
+  UNSPEC_PUSHFL
+  UNSPEC_POPFL
 
   ;; For SSE/MMX support:
   UNSPEC_FIX_NOTRUNC
@@ -2205,17 +2207,19 @@ (define_insn "*pop1_epilogue"
   [(set_attr "type" "pop")
(set_attr "mode" "")])
 
-(define_insn "*pushfl2"
+(define_insn "@pushfl2"
   [(set (match_operand:W 0 "push_operand" "=<")
-   (match_operand:W 1 "flags_reg_operand"))]
+   (unspec:W [(match_operand:CC 1 "flags_reg_operand")]
+ UNSPEC_PUSHFL))]
   ""
   "pushf{}"
   [(set_attr "type" "push")
(set_attr "mode" "")])
 
-(define_insn "*popfl1"
-  [(set (match_operand:W 0 "flags_reg_operand")
-   (match_operand:W 1 "pop_operand" ">"))]
+(define_insn "@popfl1"
+  [(set (match_operand:CC 0 "flags_reg_operand")
+   (unspec:CC [(match_operand:W 1 "pop_operand" ">")]
+  UNSPEC_POPFL))]
   ""
   "popf{}"
   [(set_attr "type" "pop")


Re: [PATCH V3 1/7] df: Add DF_LIVE_SUBREG problem

2023-11-13 Thread Vladimir Makarov



On 11/12/23 07:08, Lehua Ding wrote:

This patch adds a live_subreg problem to extend the original live_reg to
track the liveness of subreg. We will only try to trace speudo registers
who's mode size is a multiple of nature size and eventually a small portion
of the inside will appear to use subreg. With live_reg problem, live_subreg
prbolem will have the following output. full_in/out mean the entire pesudo
live in/out, partial_in/out mean the subregs of the pesudo are live in/out,
and range_in/out indicates which part of the pesudo is live. all_in/out is
the union of full_in/out and partial_in/out:

I am not a maintainer or reviewer of the data-flow analysis framework and 
cannot approve this patch except for the changes in regs.h.  Richard 
Sandiford or Jeff Law, as global reviewers, can probably do this.


As for the regs.h changes, they are OK for me after fixing the general 
issues I mentioned in my previous email (two spaces after the end of a 
sentence in the comments).


I think all this code is a major consumer of compile time and memory in 
the whole patch set.  DF analysis is slow by itself even when only 
efficient data structures such as bitmaps are used, but you are 
introducing an even slower data structure, maps (I believe a better 
performing data structure can be used instead).  In the very first version 
of LRA I used DFA, but it made LRA so slow that I had to introduce my own 
data structures, which are faster in the case of massive RTL changes in 
LRA.  The same problem exists when using generic C++ standard library 
containers such as vectors and maps in critical code.  It is hard to get 
the needed performance when the exact implementation can vary or not be 
what you need, e.g. vector initial capacity, growth, etc.  But again, the 
performance issues can be addressed later.





[PATCH] Fixed problem with BTF defining smaller enums.

2023-11-13 Thread Cupertino Miranda
This patch fixes BTF output, which would become invalid when enums were
defined with a size smaller than 4 bytes.
For example, when using the __attribute__((mode(byte))) in the enum
definition.

Two problems were identified:
 - it would incorrectly create an entry for enum64 when the size of the
   enum was different than 4.
 - it would allocate less than 4 bytes for the value entry in BTF, in
   case the type was smaller.

BTF generated was validated against clang.
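
A minimal sketch of the corrected size logic follows.  The struct and
function names are hypothetical stand-ins written for this illustration
(they are not the definitions used in btfout.cc); the record layouts are
assumed to be one 32-bit name offset plus either one 32-bit value or two
32-bit value halves.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the two BTF enumerator record layouts.
struct enum_entry
{
  std::uint32_t name_off;
  std::int32_t val;
};

struct enum64_entry
{
  std::uint32_t name_off;
  std::uint32_t val_lo32;
  std::uint32_t val_hi32;
};

// Only enums wider than 4 bytes need the 64-bit record form.
constexpr bool
needs_enum64 (unsigned size)
{
  return size > 4;
}

// Bytes emitted for the value of one enumerator: never fewer than 4,
// even when the enum type itself is 1 or 2 bytes wide.
constexpr unsigned
value_bytes (unsigned size)
{
  return needs_enum64 (size) ? 8u : 4u;
}

// Bytes occupied by VLEN enumerator records for an enum of SIZE bytes.
constexpr std::size_t
enum_vbytes (unsigned size, unsigned vlen)
{
  return vlen * (needs_enum64 (size) ? sizeof (enum64_entry)
				     : sizeof (enum_entry));
}
```

Under these assumptions a 1-byte enum with three enumerators still takes
three 8-byte records, exactly like a 4-byte enum.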

gcc/ChangeLog:
* btfout.cc (btf_calc_num_vbytes): Fixed logic for enum64.
(btf_asm_enum_const): Corrected logic for enum64 and smaller
than 4 bytes values.
---
 gcc/btfout.cc | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/btfout.cc b/gcc/btfout.cc
index e07fed302c24..d2263ec6eec3 100644
--- a/gcc/btfout.cc
+++ b/gcc/btfout.cc
@@ -299,7 +299,7 @@ btf_calc_num_vbytes (ctf_dtdef_ref dtd)
   break;
 
 case BTF_KIND_ENUM:
-  vlen_bytes += (dtd->dtd_data.ctti_size == 0x8)
+  vlen_bytes += (dtd->dtd_data.ctti_size > 4)
? vlen * sizeof (struct btf_enum64)
: vlen * sizeof (struct btf_enum);
   break;
@@ -914,13 +914,13 @@ btf_asm_enum_const (unsigned int size, ctf_dmdef_t * dmd, 
unsigned int idx)
 {
   dw2_asm_output_data (4, dmd->dmd_name_offset, "ENUM_CONST '%s' idx=%u",
   dmd->dmd_name, idx);
-  if (size == 4)
-dw2_asm_output_data (size, dmd->dmd_value, "bte_value");
-  else
+  if (size > 4)
 {
-  dw2_asm_output_data (4, dmd->dmd_value & 0x, "bte_value_lo32");
+  dw2_asm_output_data (4, dmd->dmd_value & 0xfffe, "bte_value_lo32");
   dw2_asm_output_data (4, (dmd->dmd_value >> 32) & 0x, 
"bte_value_hi32");
 }
+  else
+dw2_asm_output_data (size < 4 ? 4 : size, dmd->dmd_value, "bte_value");
 }
 
 /* Asm'out a function parameter description following a BTF_KIND_FUNC_PROTO.  
*/
-- 
2.30.2



[PATCH] bpf: Forces __buildin_memcmp not to generate a call upto 1024 bytes.

2023-11-13 Thread Cupertino Miranda
This patch forces __builtin_memcmp calls with data sizes of up to 1024
bytes to be expanded inline in the caller.
This is a requirement of BPF and it mimics the default behaviour of the
clang BPF implementation.

gcc/ChangeLog:
* config/bpf/bpf.cc (bpf_use_by_pieces_infrastructure_p): Added
function to bypass default behaviour.
* config/bpf/bpf.h (COMPARE_MAX_PIECES): Defined to 1024 bytes.
---
 gcc/config/bpf/bpf.cc | 16 
 gcc/config/bpf/bpf.h  |  5 +
 2 files changed, 21 insertions(+)

diff --git a/gcc/config/bpf/bpf.cc b/gcc/config/bpf/bpf.cc
index a0956a069729..764a3e487cb6 100644
--- a/gcc/config/bpf/bpf.cc
+++ b/gcc/config/bpf/bpf.cc
@@ -1115,6 +1115,22 @@ bpf_small_register_classes_for_mode_p (machine_mode mode)
 #define TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P \
   bpf_small_register_classes_for_mode_p
 
+static bool
+bpf_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT size,
+   unsigned int align ATTRIBUTE_UNUSED,
+   enum by_pieces_operation op,
+   bool speed_p)
+{
+  if (op != COMPARE_BY_PIECES)
+return default_use_by_pieces_infrastructure_p (size, align, op, speed_p);
+
+  return size <= COMPARE_MAX_PIECES;
+}
+
+#undef TARGET_USE_BY_PIECES_INFRASTRUCTURE_P
+#define TARGET_USE_BY_PIECES_INFRASTRUCTURE_P \
+  bpf_use_by_pieces_infrastructure_p
+
 /* Finally, build the GCC target.  */
 
 struct gcc_target targetm = TARGET_INITIALIZER;
diff --git a/gcc/config/bpf/bpf.h b/gcc/config/bpf/bpf.h
index 82702aa7b6ba..1f177ec4c4ef 100644
--- a/gcc/config/bpf/bpf.h
+++ b/gcc/config/bpf/bpf.h
@@ -489,6 +489,11 @@ enum reg_class
locations.  */
 #define MOVE_MAX 8
 
+/* Allow upto 1024 bytes moves to occur using by_pieces
+   infrastructure.  This mimics clang behaviour when using
+   __builtin_memcmp.  */
+#define COMPARE_MAX_PIECES 1024
+
 /* An alias for the machine mode for pointers.  */
 #define Pmode DImode
 
-- 
2.30.2



[PATCH] bpf: Corrected condition in core_mark_as_access_index.

2023-11-13 Thread Cupertino Miranda
gcc/ChangeLog:
* config/bpf/core-builtins.cc (core_mark_as_access_index):
Corrected check.
---
 gcc/config/bpf/core-builtins.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/bpf/core-builtins.cc b/gcc/config/bpf/core-builtins.cc
index 2ba78d7aed2e..1376c9309035 100644
--- a/gcc/config/bpf/core-builtins.cc
+++ b/gcc/config/bpf/core-builtins.cc
@@ -1611,7 +1611,7 @@ core_mark_as_access_index (tree expr)
   || TREE_CODE (expr) == INDIRECT_REF)
 expr = TREE_OPERAND (expr, 0);
 
-  if (bpf_enum_mappings->get (expr) == NULL)
+  if (core_access_index_map->get (expr) == NULL)
 core_access_index_map->put (expr, NULL_TREE);
 }
 
-- 
2.30.2



[PATCH] bpf: Delayed the removal of the parser enum plugin handler.

2023-11-13 Thread Cupertino Miranda
The parser plugin handler that is responsible for collecting enum values
information was being removed way too early: the removal happened in
bpf_resolve_overloaded_core_builtin, which is called by the parser.
The removal was moved to the function execute_lower_bpf_core.

gcc/ChangeLog:
* config/bpf/core-builtins.cc
(bpf_resolve_overloaded_core_builtin): Removed call.
(execute_lower_bpf_core): Added call to remove_parser_plugin.
---
 gcc/config/bpf/core-builtins.cc | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gcc/config/bpf/core-builtins.cc b/gcc/config/bpf/core-builtins.cc
index a224847d5d91..2ba78d7aed2e 100644
--- a/gcc/config/bpf/core-builtins.cc
+++ b/gcc/config/bpf/core-builtins.cc
@@ -1473,8 +1473,6 @@ tree
 bpf_resolve_overloaded_core_builtin (location_t loc, tree fndecl,
 void *arglist)
 {
-  remove_parser_plugin ();
-
   if (!bpf_require_core_support ())
 return error_mark_node;
 
@@ -1688,6 +1686,7 @@ make_gimple_core_safe_access_index (tree *tp,
 static unsigned int
 execute_lower_bpf_core (void)
 {
+  remove_parser_plugin ();
   if (!TARGET_BPF_CORE)
 return 0;
 
-- 
2.30.2



Re: [PATCH 1/4] c-family: Add handling for clang-style attributes [PR109877].

2023-11-13 Thread Jeff Law




On 11/12/23 23:02, Iain Sandoe wrote:

This patch set is not actually particularly new, I have been maintaining
it locally on Darwin branches and it has been tested on several versions
of Darwin both with and without Alex's __has_{feature, extension} patch.

This is one of the three most significant blockers to importing the macOS
SDKs properly, and cannot currently be fixincludes-ed (in fact it cannot
ever really be, since the attribute is user-facing and so can be in end-user
code that we cannot fix).

OK for trunk?
thanks
Iain

--- 8< ---


The clang compiler supports essentially arbitrary, per-attribute, syntax and
token forms for attribute arguments.  This extends to the case where token
forms are required to be accepted that are not part of the valid set for
standard C or C++.

A motivating example (in the initial attribute of this form implemented
in this patch set) is version-style (i.e. x.y.z) numeric values.  At present
the c-family cannot handle this, since invalid numeric tokens are rejected
by both C and C++ frontends before we have a chance to decide to accept them
in custom attribute argument parsing.

The solution proposed in this patch series is to allow for a certain set of
attributes names that are known to be 'clang-form' and to defer argument
token validation until the parse of those arguments.
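
As an illustration of what "deferring validation" can look like once the
argument has been kept as a raw string, here is a minimal sketch.
parse_version is a hypothetical helper written for this example; it is not
the parser code from the patch series, and it ignores overflow of the
individual components.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: interpret a deferred "x", "x.y" or "x.y.z" argument
// string as up to three numeric components.  Missing components are zero.
// Returns false if the text is not a well-formed version.
inline bool
parse_version (const std::string &text, std::array<unsigned, 3> &out)
{
  out = {{0, 0, 0}};
  std::size_t field = 0;
  bool have_digit = false;
  for (char c : text)
    {
      assert (field < out.size ());
      if (c >= '0' && c <= '9')
	{
	  out[field] = out[field] * 10 + static_cast<unsigned> (c - '0');
	  have_digit = true;
	}
      else if (c == '.' && have_digit && field < 2)
	{
	  // Start the next component; it needs at least one digit.
	  ++field;
	  have_digit = false;
	}
      else
	return false;
    }
  return have_digit;
}
```

A lexer that classified "10.4.3" as a number would reject it before a
routine like this ever saw it, which is why the interpretation has to be
postponed until the attribute is known.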

This does not apparently represent any loss of generality - since the
specific attribute names are already claimed by clang and re-using them with
different semantics in GCC would be a highly unfortunate experience for end-
users.

The first patch here adds a mechanism to check attribute identifiers against
a list known to be in clang form.  The 'availability' attribute is added as a
first example.

The acceptance of non-standard tokens is constrained to the interval enclosing
the attribute arguments of cases notified as 'clang-form'.

PR c++/109877

gcc/c-family/ChangeLog:

* c-attribs.cc (attribute_clang_form_p): New.
* c-common.h (attribute_clang_form_p): New.
Patches #1-#3 are fine if nobody has objected within say 48hrs. 
Basically I agree that we have to do something along the lines of what 
you're suggesting and I just want to give folks the opportunity to raise 
any implementation issues they may see.


Patch #4 is obviously darwin specific and I think you can self-approve.


Jeff


[committed] i386: Return CCmode from ix86_cc_mode for unknown RTX code [PR112494]

2023-11-13 Thread Uros Bizjak
Combine wants to combine the following instructions into an insn that can
perform both an (arithmetic) operation and set the condition code.  During
the conversion a new RTX is created, and combine passes the RTX code of the
innermost RTX expression of the CC use insn in which CC reg is used to
SELECT_CC_MODE, to determine the new mode of the comparison:

Trying 5 -> 8:
5: r98:DI=0xd7
8: flags:CCZ=cmp(r98:DI,0)
  REG_EQUAL cmp(0xd7,0)
Failed to match this instruction:
(parallel [
(set (reg:CC 17 flags)
(compare:CC (const_int 215 [0xd7])
(const_int 0 [0])))
(set (reg/v:DI 98 [ flags ])
(const_int 215 [0xd7]))
])

where:

(insn 5 2 6 2 (set (reg/v:DI 98 [ flags ])
(const_int 215 [0xd7])) "pr112494.c":8:8 84 {*movdi_internal}
 (nil))

(insn 8 7 11 2 (set (reg:CCZ 17 flags)
(compare:CCZ (reg/v:DI 98 [ flags ])
(const_int 0 [0]))) "pr112494.c":11:9 8 {*cmpdi_ccno_1}
 (expr_list:REG_EQUAL (compare:CCZ (const_int 215 [0xd7])
(const_int 0 [0]))
(nil)))

ix86_cc_mode (AKA SELECT_CC_MODE) is not prepared to handle random RTX
codes and triggers gcc_unreachable() when the SET RTX code is passed to it.
The patch removes gcc_unreachable() and returns CCmode for unknown
RTX codes, so combine can try various combinations involving CC reg
without triggering ICE.

Please note that x86 MOV instructions do not set flags, so the above
combination is not recognized as a valid x86 instruction.

PR target/112494

gcc/ChangeLog:

* config/i386/i386.cc (ix86_cc_mode) [default]: Return CCmode.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr112494.c: New test.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 2c80fd8ebf3..176ca650aa2 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -16469,12 +16469,9 @@ ix86_cc_mode (enum rtx_code code, rtx op0, rtx op1)
return CCNOmode;
   else
return CCGCmode;
-  /* strcmp pattern do (use flags) and combine may ask us for proper
-mode.  */
-case USE:
-  return CCmode;
 default:
-  gcc_unreachable ();
+  /* CCmode should be used in all other cases.  */
+  return CCmode;
 }
 }
 
diff --git a/gcc/testsuite/gcc.target/i386/pr112494.c 
b/gcc/testsuite/gcc.target/i386/pr112494.c
new file mode 100644
index 000..e9482f5d075
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr112494.c
@@ -0,0 +1,17 @@
+/* PR target/112494 */
+/* { dg-options "-Og -fno-tree-copy-prop -fno-tree-fre -fno-tree-ccp 
-fno-tree-forwprop" } */
+
+#include 
+
+int main()
+{
+  long flags = 0xD7;
+
+  __writeeflags(0xD7);
+  flags && !__readeflags();
+
+  if ((flags && (!__readeflags())) != 0xD7)
+;
+
+  return 0;
+}


Re: [PATCH] RISC-V: Optimize combine sequence by merge approach

2023-11-13 Thread Jeff Law




On 11/12/23 20:37, Juzhe-Zhong wrote:

This patch optimizes the combination of two duplicated vectors whose
elements are not equal.

typedef char v16qi __attribute__ ((vector_size (16)));

void
foo (char a, char b, char *out)
{
   v16qi v = {a, a, a, a, a, b, b, b, b, b, b, b, b, b, b, b};
   *(v16qi *) out = v;
}

Before this patch:

 vsetivli        zero,16,e8,m1,ta,ma
 vmv.v.x v1,a0
 vslide1down.vx  v1,v1,a1
 vslide1down.vx  v1,v1,a1
 vslide1down.vx  v1,v1,a1
 vslide1down.vx  v1,v1,a1
 vslide1down.vx  v1,v1,a1
 vslide1down.vx  v1,v1,a1
 vslide1down.vx  v1,v1,a1
 vslide1down.vx  v1,v1,a1
 vslide1down.vx  v1,v1,a1
 vslide1down.vx  v1,v1,a1
 vslide1down.vx  v1,v1,a1
 vse8.v  v1,0(a2)
 ret

After this patch:

vsetivli        zero,16,e8,m1,ta,ma
vmv.v.x v1,a0
vid.v   v0
vmsgtu.vi   v0,v0,4
vmerge.vxm  v1,v1,a1,v0
vse8.v  v1,0(a2)
ret
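
For readers less familiar with RVV, the merge sequence above can be modelled
in scalar code roughly as follows.  This is only a sketch of the semantics;
combine_by_merge is a hypothetical name and is not a function from the
patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Scalar model of the merge approach: broadcast A, build a mask from the
// element indices, then overwrite the masked elements with B.
template <std::size_t N>
std::array<char, N>
combine_by_merge (char a, char b, std::size_t last_a_index)
{
  assert (last_a_index < N);
  std::array<char, N> v;
  v.fill (a);				// vmv.v.x    v1,a0
  for (std::size_t i = 0; i < N; ++i)	// vid.v      v0
    if (i > last_a_index)		// vmsgtu.vi  v0,v0,4
      v[i] = b;				// vmerge.vxm v1,v1,a1,v0
  return v;
}
```

Calling combine_by_merge<16> (a, b, 4) gives five copies of a followed by
eleven copies of b, matching the initializer in foo above, with a fixed
number of steps instead of one slide per trailing element.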

gcc/ChangeLog:

* config/riscv/riscv-v.cc 
(rvv_builder::combine_sequence_use_merge_profitable_p): New function.
(expand_vector_init_merge_combine_sequence): Ditto.
(expand_vec_init): Add optimization.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls-vlmax/combine-merge-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/combine-merge-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/combine-merge_run-1.c: New 
test.
* gcc.target/riscv/rvv/autovec/vls-vlmax/combine-merge_run-2.c: New 
test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-1.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-10.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-11.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-12.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-13.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-14.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-2.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-3.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-4.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-5.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-6.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-7.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-8.c: New test.
* gcc.target/riscv/rvv/autovec/vls/combine-merge-9.c: New test.

OK
jeff


Re: [PING^2][PATCH RFA] PR target/111815: VAX: Only accept the index scaler as the RHS operand to ASHIFT

2023-11-13 Thread Jeff Law




On 11/13/23 13:22, Maciej W. Rozycki wrote:

On Mon, 16 Oct 2023, Maciej W. Rozycki wrote:


  The testcase is generic enough I thought it wouldn't hurt to place it in
a generic part of the testsuite, where it has been verified to pass with
the `powerpc64le-linux-gnu', `riscv64-linux-gnu', and `vax-netbsdelf'
targets.  I'm fine to move it to the VAX part of the testsuite though if
there's disagreement as to my choice.  Otherwise OK to apply for this
part?


  Ping for:
.

OK.  I'm fully supportive of generic tests :-)


Jeff


Re: [PATCH] testsuite: xfail scev-[35].c on ia32

2023-11-13 Thread Alexandre Oliva
On Nov 11, 2023, Thomas Schwinge  wrote:

> ACK -- but why not likewise "fix" the 'gcc.dg/tree-ssa/scev-4.c' FAIL?

I have evidence from earlier compiler version bumps that there's some
correlation and that scev-4.c also failed in the past, but it wasn't
failing for me this time.

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive


[PING^2][PATCH RFA] PR target/111815: VAX: Only accept the index scaler as the RHS operand to ASHIFT

2023-11-13 Thread Maciej W. Rozycki
On Mon, 16 Oct 2023, Maciej W. Rozycki wrote:

>  The testcase is generic enough I thought it wouldn't hurt to place it in 
> a generic part of the testsuite, where it has been verified to pass with 
> the `powerpc64le-linux-gnu', `riscv64-linux-gnu', and `vax-netbsdelf' 
> targets.  I'm fine to move it to the VAX part of the testsuite though if 
> there's disagreement as to my choice.  Otherwise OK to apply for this 
> part?

 Ping for:
.

  Maciej


RE: [PATCH 5/21]middle-end: update vectorizer's control update to support picking an exit other than loop latch

2023-11-13 Thread Tamar Christina
> -Original Message-
> From: Richard Biener 
> Sent: Tuesday, November 7, 2023 3:04 PM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com
> Subject: Re: [PATCH 5/21]middle-end: update vectorizer's control update to
> support picking an exit other than loop latch
> 
> On Mon, 6 Nov 2023, Tamar Christina wrote:
> 
> > Hi All,
> >
> > > As requested, the vectorizer is now free to pick its own exit which
> > > can be different than what the loop CFG infrastructure uses.  The
> > > vectorizer makes use of this to vectorize loops that it previously could
> > > not.
> > >
> > > But this means that loop control must be materialized in the block
> > > that needs it lest we corrupt the SSA chain.  This makes it so we use
> > > the vectorizer's main IV block instead of the loop infra.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > * tree-ssa-loop-manip.cc (standard_iv_increment_position):
> Conditionally
> > take dest BB.
> > * tree-ssa-loop-manip.h (standard_iv_increment_position): Likewise.
> > * tree-vect-loop-manip.cc (vect_set_loop_controls_directly): Use it.
> > (vect_set_loop_condition_partial_vectors_avx512): Likewise.
> > (vect_set_loop_condition_normal): Likewise.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/tree-ssa-loop-manip.h b/gcc/tree-ssa-loop-manip.h
> > index
> >
> bda09f51d5619420331c513a9906831c779fd2b4..5938588c8882d842b00
> 301423df1
> > 11cbe7bf7ba8 100644
> > --- a/gcc/tree-ssa-loop-manip.h
> > +++ b/gcc/tree-ssa-loop-manip.h
> > @@ -38,7 +38,8 @@ extern basic_block split_loop_exit_edge (edge, bool
> > = false);  extern basic_block ip_end_pos (class loop *);  extern
> > basic_block ip_normal_pos (class loop *);  extern void
> > standard_iv_increment_position (class loop *,
> > -   gimple_stmt_iterator *, bool *);
> > +   gimple_stmt_iterator *, bool *,
> > +   basic_block = NULL);
> >  extern bool
> >  gimple_duplicate_loop_body_to_header_edge (class loop *, edge, unsigned
> int,
> >sbitmap, edge, vec *, int); 
> > diff
> --git
> > a/gcc/tree-ssa-loop-manip.cc b/gcc/tree-ssa-loop-manip.cc index
> >
> e7436915e01297e7af2a3bcf1afd01e014de6f32..bdc7a3d74a788f450ca5d
> de6c294
> > 92ce4d4e4550 100644
> > --- a/gcc/tree-ssa-loop-manip.cc
> > +++ b/gcc/tree-ssa-loop-manip.cc
> > @@ -792,14 +792,19 @@ ip_normal_pos (class loop *loop)
> >
> >  /* Stores the standard position for induction variable increment in LOOP
> > (just before the exit condition if it is available and latch block is 
> > empty,
> > -   end of the latch block otherwise) to BSI.  INSERT_AFTER is set to true 
> > if
> > -   the increment should be inserted after *BSI.  */
> > +   end of the latch block otherwise) to BSI.  If DEST_BB is specified then 
> > that
> > +   basic block is used as the destination instead of the loop latch source
> > +   block.  INSERT_AFTER is set to true if the increment should be inserted
> after
> > +   *BSI.  */
> >
> >  void
> >  standard_iv_increment_position (class loop *loop, gimple_stmt_iterator
> *bsi,
> > -   bool *insert_after)
> > +   bool *insert_after, basic_block dest_bb)
> >  {
> > -  basic_block bb = ip_normal_pos (loop), latch = ip_end_pos (loop);
> > +  basic_block bb = dest_bb;
> > +  if (!bb)
> > +bb = ip_normal_pos (loop);
> > +  basic_block latch = ip_end_pos (loop);
> 
> I don't think that's a good API extension.  Given that we don't support an
> early exit after the main IV exit doesn't this code already work fine as-is?
> It chooses the last exit.  The position is also not semantically relevant,
> we just try to keep the latch empty here (that is, it's a bit of a "bad" API).
> 
> So, do you really need this change?

Yes I do.  If you look at these kinds of loops
https://gist.github.com/Mistuke/66f14fe5c1be32b91ce149bd9b8bb35f

you'll see that the main exit, i.e. the one attached to the latch block, is
the early break.  Because SCEV can't analyze it, it picks the main exit to be
the one in BB4.

This means that the loop control must be placed in BB4.  If we place
ivtmp_10 = ivtmp_9 - 1 in BB3 then that's broken SSA.  If we use `ivtmp_9` in
BB4 then we'll have an off-by-one issue.

You could have reached the end of the valid range for the loop when you
re-enter BB4; since loads are still allowed, you'll then read out of bounds
before checking that you exit.

This is also annoyingly hard to get correct, which is what took me a long
time.  Such loops mean you need to restart the scalar loop at i_7 if you take
the main exit.

Regards,
Tamar

> 
> Maybe we're really using standard_iv_increment_position wrong here, the
> result is supposed to _only_ feed the PHI latch argument.
> Richard.
> 
> >gimple *last = 

Re: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store

2023-11-13 Thread Jeff Law




On 11/12/23 20:22, pan2...@intel.com wrote:

From: Pan Li 

Update in v4:
* Merge upstream and removed some independent changes.

Update in v3:
* Take known_le instead of known_lt for vector size.
* Return NULL_RTX when gap is not equal 0 and not constant.

Update in v2:
* Move vector type support to get_stored_val.

Original log:

This patch allows vector modes in get_stored_val in DSE.
This is valid for the read rtx only when the read bitsize
is less than or equal to the stored bitsize.

Given the example code below, compiled with
--param=riscv-autovec-preference=fixed-vlmax.

vuint8m1_t test () {
   uint8_t arr[32] = {
 1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
 1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
   };

   return __riscv_vle8_v_u8m1(arr, 32);
}

Before this patch:
test:
   lui     a5,%hi(.LANCHOR0)
   addi    sp,sp,-32
   addi    a5,a5,%lo(.LANCHOR0)
   li      a3,32
   vl2re64.v   v2,0(a5)
   vsetvli zero,a3,e8,m1,ta,ma
   vs2r.v  v2,0(sp) <== Unnecessary store to stack
   vle8.v  v1,0(sp) <== Ditto
   vs1r.v  v1,0(a0)
   addi    sp,sp,32
   jr      ra

After this patch:
test:
   lui     a5,%hi(.LANCHOR0)
   addi    a5,a5,%lo(.LANCHOR0)
   li      a4,32
   addi    sp,sp,-32
   vsetvli zero,a4,e8,m1,ta,ma
   vle8.v  v1,0(a5)
   vs1r.v  v1,0(a0)
   addi    sp,sp,32
   jr      ra

The following tests pass with this patch:
* The risc-v regression test.
* The x86 bootstrap and regression test.
* The aarch64 regression test.

PR target/111720

gcc/ChangeLog:

* dse.cc (get_stored_val): Allow vector mode if read size is
less than or equal to stored size.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr111720-0.c: New test.
* gcc.target/riscv/rvv/base/pr111720-1.c: New test.
* gcc.target/riscv/rvv/base/pr111720-10.c: New test.
* gcc.target/riscv/rvv/base/pr111720-2.c: New test.
* gcc.target/riscv/rvv/base/pr111720-3.c: New test.
* gcc.target/riscv/rvv/base/pr111720-4.c: New test.
* gcc.target/riscv/rvv/base/pr111720-5.c: New test.
* gcc.target/riscv/rvv/base/pr111720-6.c: New test.
* gcc.target/riscv/rvv/base/pr111720-7.c: New test.
* gcc.target/riscv/rvv/base/pr111720-8.c: New test.
* gcc.target/riscv/rvv/base/pr111720-9.c: New test.

OK for the trunk.







+  else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode)
+	     && known_le (GET_MODE_BITSIZE (read_mode), GET_MODE_BITSIZE (store_mode))
+	     && targetm.modes_tieable_p (read_mode, store_mode))
+    read_reg = gen_lowpart (read_mode, copy_rtx (store_info->rhs));
   else
     read_reg = extract_low_bits (read_mode, store_mode,
				  copy_rtx (store_info->rhs));

It may not matter, especially for RV, but we could possibly have a 
mixture of scalar and vector modes in the RTL.  Say a vector store 
followed by a scalar read or vice-versa.


I wouldn't try to handle that case unless we had actual evidence it was 
useful to do so.  Just wanted to point out that unlike pseudos we can 
have multiple modes referencing the same memory location.


Jeff


[PATCH] LoongArch: Handle vectorized copysign (x, -1) expansion efficiently

2023-11-13 Thread Xi Ruoyao
With LSX or LASX, copysign (x[i], -1) (or any negative constant) can be
vectorized using [x]vbitseti.{w/d} instructions to directly set the
signbits.

Inspired by Tamar Christina's "AArch64: Handle copysign (x, -1) expansion
efficiently" (r14-5289).
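
As a scalar illustration of why a bit-set instruction is enough here: copysign (x, c) with any negative c only forces the sign bit of x to 1 and leaves the other bits alone.  The sketch below assumes double is a 64-bit IEEE 754 binary64 value; set_sign_bit and check_set_sign_bit are hypothetical helper names, not part of the patch, and the single OR plays the role that [x]vbitseti.d with bit index 63 plays per vector lane.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: force the sign bit of a double to 1.
   Assumes a 64-bit IEEE 754 binary64 representation.  */
double
set_sign_bit (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);   /* reinterpret without aliasing issues */
  bits |= UINT64_C (1) << 63;        /* bit 63 is the sign bit */
  memcpy (&x, &bits, sizeof x);
  return x;
}

/* copysign (1.2, -1.0) is -1.2 and copysign (-3.4, -1.0) is -3.4;
   setting the sign bit gives the same values.  */
void
check_set_sign_bit (void)
{
  assert (set_sign_bit (1.2) == -1.2);
  assert (set_sign_bit (-3.4) == -3.4);
}
```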

gcc/ChangeLog:

* config/loongarch/lsx.md (copysign3): Allow operand[2] to
be a reg_or_vector_same_val_operand.  If it's a const vector
with same negative elements, expand the copysign with a bitset
instruction.  Otherwise, force it into a register.
* config/loongarch/lasx.md (copysign3): Likewise.

gcc/testsuite/ChangeLog:

* g++.target/loongarch/vect-copysign-negconst.C: New test.
* g++.target/loongarch/vect-copysign-negconst-run.C: New test.
---

Bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk?

 gcc/config/loongarch/lasx.md  | 22 -
 gcc/config/loongarch/lsx.md   | 22 -
 .../loongarch/vect-copysign-negconst-run.C| 47 +++
 .../loongarch/vect-copysign-negconst.C| 27 +++
 4 files changed, 116 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/loongarch/vect-copysign-negconst-run.C
 create mode 100644 gcc/testsuite/g++.target/loongarch/vect-copysign-negconst.C

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index f0f2dd08dd8..2e11f061202 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -3136,11 +3136,31 @@ (define_expand "copysign3"
  (match_operand:FLASX 1 "register_operand")))
(set (match_dup 5)
(and:FLASX (match_dup 3)
-  (match_operand:FLASX 2 "register_operand")))
+  (match_operand:FLASX 2 "reg_or_vector_same_val_operand")))
(set (match_operand:FLASX 0 "register_operand")
(ior:FLASX (match_dup 4) (match_dup 5)))]
   "ISA_HAS_LASX"
 {
+  /* copysign (x, -1) should instead be expanded as setting the sign
+ bit.  */
+  if (!REG_P (operands[2]))
+{
+  rtx op2_elt = unwrap_const_vec_duplicate (operands[2]);
+  if (GET_CODE (op2_elt) == CONST_DOUBLE
+ && real_isneg (CONST_DOUBLE_REAL_VALUE (op2_elt)))
+   {
+ rtx n = GEN_INT (8 * GET_MODE_SIZE (mode) - 1);
+ operands[0] = lowpart_subreg (mode, operands[0],
+   mode);
+ operands[1] = lowpart_subreg (mode, operands[1],
+   mode);
+ emit_insn (gen_lasx_xvbitseti_ (operands[0],
+  operands[1], n));
+ DONE;
+   }
+}
+
+  operands[2] = force_reg (mode, operands[2]);
   operands[3] = loongarch_build_signbit_mask (mode, 1, 0);
 
   operands[4] = gen_reg_rtx (mode);
diff --git a/gcc/config/loongarch/lsx.md b/gcc/config/loongarch/lsx.md
index 55c7d79a030..8ea41c85b01 100644
--- a/gcc/config/loongarch/lsx.md
+++ b/gcc/config/loongarch/lsx.md
@@ -2873,11 +2873,31 @@ (define_expand "copysign3"
  (match_operand:FLSX 1 "register_operand")))
(set (match_dup 5)
(and:FLSX (match_dup 3)
- (match_operand:FLSX 2 "register_operand")))
+ (match_operand:FLSX 2 "reg_or_vector_same_val_operand")))
(set (match_operand:FLSX 0 "register_operand")
(ior:FLSX (match_dup 4) (match_dup 5)))]
   "ISA_HAS_LSX"
 {
+  /* copysign (x, -1) should instead be expanded as setting the sign
+ bit.  */
+  if (!REG_P (operands[2]))
+{
+  rtx op2_elt = unwrap_const_vec_duplicate (operands[2]);
+  if (GET_CODE (op2_elt) == CONST_DOUBLE
+ && real_isneg (CONST_DOUBLE_REAL_VALUE (op2_elt)))
+   {
+ rtx n = GEN_INT (8 * GET_MODE_SIZE (mode) - 1);
+ operands[0] = lowpart_subreg (mode, operands[0],
+   mode);
+ operands[1] = lowpart_subreg (mode, operands[1],
+   mode);
+ emit_insn (gen_lsx_vbitseti_ (operands[0], operands[1],
+   n));
+ DONE;
+   }
+}
+
+  operands[2] = force_reg (mode, operands[2]);
   operands[3] = loongarch_build_signbit_mask (mode, 1, 0);
 
   operands[4] = gen_reg_rtx (mode);
diff --git a/gcc/testsuite/g++.target/loongarch/vect-copysign-negconst-run.C b/gcc/testsuite/g++.target/loongarch/vect-copysign-negconst-run.C
new file mode 100644
index 000..d2d5d15c933
--- /dev/null
+++ b/gcc/testsuite/g++.target/loongarch/vect-copysign-negconst-run.C
@@ -0,0 +1,47 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -march=loongarch64 -mlasx -mno-strict-align" } */
+/* { dg-require-effective-target loongarch_asx_hw } */
+
+#include "vect-copysign-negconst.C"
+
+double d[] = {1.2, -3.4, -5.6, 7.8};
+float f[] = {1.2, -3.4, -5.6, 7.8, -9.0, -11.4, 51.4, 1919.810};
+
+double _abs(double x) { return __builtin_fabs (x); }
+float _abs(float x) { return __builtin_fabsf (x); }
+
+template 
+void
+check (T *arr, T 

Re: [PATCH 4/6] c: Turn -Wimplicit-int into a pedpermerror [PR91093]

2023-11-13 Thread Sam James


Florian Weimer  writes:

> There is a missed opportunity here to issue spelling diagnostics
> in prototype declarations (e.g., for “extern int foo (int32t);”).
>
> gcc/
>
>   * doc/invoke.texi (Warning Options): Document changes.
>
> gcc/c/
>
>   * c-decl.cc (warn_defaults_to): Call emit_diagnostic_valist
>   instead of reimplementing it. Issue a pedpermerror for C99
>   and later.
>   (store_parm_decls_oldstyle): Call pedpermerror for
>   OPT_Wimplicit_int.
>
> gcc/testsuite/
>
>   * gcc.dg/Wimplicit-int-1.c: Compile with -fpermissive due to
>   expected warning.
>   * gcc.dg/Wimplicit-int-4.c: Likewise.
>   * gcc.dg/Wimplicit-int-1a.c: New test.  Copied from
>   gcc.dg/Wimplicit-int-1.c, but expect errors.
>   * gcc.dg/Wimplicit-int-4a.c: New test.  Copied from
>   gcc.dg/Wimplicit-int-4.c, but expect errors.
>   * gcc.dg/gnu23-attr-syntax-2.c: Compile with -fpermissive
>   due to expected implicit-int error.
>   * gcc.dg/gnu23-attr-syntax-3.c: New test.  Copied from
>   gcc.dg/gnu23-attr-syntax-2.c, but expect an error.
>   * gcc.dg/pr105635.c: Build with -fpermissive due to implicit
>   int.
>   * gcc.dg/pr105635-2.c: New test.  Copied from
>   gcc.dg/pr105635.c.  Expect implicit int error.
>   * gcc.dg/noncompile/pr79758.c: Build with -fpermissive due to
>   implicit int.
>   * gcc.dg/noncompile/pr79758-2.c: New test.  Copied from
>   gcc.dg/noncompile/pr79758.c.  Expect implicit int error.

This is PR91093.
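
To make the affected construct concrete, here is a small sketch of code that relied on implicit int, shown only in a comment, next to the spelling that keeps compiling without -fpermissive.  The names counter, add and check_add are made up for this illustration and do not come from the patch or its tests.

```c
#include <assert.h>

/* Old style that relies on implicit int, kept as a comment because this is
   the kind of code the patch rejects by default in C99 and later modes:

     static counter;
     add (a, b) { return a + b; }

   The declaration has no type specifier, the function has no return type,
   and the old-style parameters a and b have no declared type.  */

static int counter;   /* explicit type instead of implicit int */

int
add (int a, int b)    /* explicit return and parameter types */
{
  counter++;
  return a + b;
}

void
check_add (void)
{
  assert (add (2, 3) == 5);
}
```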


Re: [PATCH 2/6] c: Turn int-conversion warnings into permerrors [PR106416]

2023-11-13 Thread Sam James


Florian Weimer  writes:

> gcc/
>
>   * doc/invoke.texi (Warning Options): Document changes.
>
> gcc/c/
>
>   * c-typeck.cc (build_conditional_expr): Use pedpermerror for
>   pointer/integer type mismatches, based on -Wint-conversion.
>   (pedwarn_pedpermerror_init, permerror_init): New function.
>   (pedwarn_init): Call pedwarn_pedpermerror_init.
>   (convert_for_assignment): Use pedpermerror and
>   pedpermerror_init for -Wint-conversion  warnings.
>
> gcc/testsuite/
>
>   * c-c++-common/pr77624-1.c (foo, bar): Expect
>   error instead of warning.
>   * gcc.dg/Wint-conversion-2.c: Compile with -fpermissive due
>   to expected int-conversion warning.
>   * gcc.dg/Wint-conversion-3.c: Likewise.
>   * gcc.dg/Wint-conversion-4.c: New test.  Based on
>   gcc.dg/Wint-conversion-3.c.  Expect int-conversion errors.
>   * gcc.dg/assign-warn-1.c: Compile with -fpermissive.
>   * gcc.dg/assign-warn-4.c: New file.  Extracted from
>   assign-warn-1.c.  Expect int-conversion errors.
>   * gcc.dg/diagnostic-types-1.c: compile with -fpermissive.
>   * gcc.dg/diagnostic-types-2.c: New file.  Extracted from
>   gcc.dg/diagnostic-types-1.c.  Expect some errors instead of
>   warnings.
>   * gcc.dg/gomp/pr35738.c: Compile with -fpermissive due to
>   expected int-conversion error.
>   * gcc.dg/gomp/pr35738-2.c: New test.  Based on
>   gcc.dg/gomp/pr35738.c.  Expect int-conversion errors.
>   * gcc.dg/init-excess-3.c: Expect int-conversion errors.
>   * gcc.dg/overflow-warn-1.c: Likewise.
>   * gcc.dg/overflow-warn-3.c: Likewise.
>   * gcc.dg/param-type-mismatch.c: Compile with -fpermissive.
>   * gcc.dg/param-type-mismatch-2.c: New test.  Copied from
>   gcc.dg/param-type-mismatch.c.  Expect errors.
>   * gcc.dg/pr61162-2.c: Compile with -fpermissive.
>   * gcc.dg/pr61162-3.c: New test. Extracted from
>   gcc.dg/pr61162-2.c.  Expect int-conversion errors.
>   * gcc.dg/spec-barrier-3.c: Use -fpermissive due to expected
>   int-conversion error.
>   * gcc.dg/spec-barrier-3a.c: New test.  Based on
>   gcc.dg/spec-barrier-3.c.  Expect int-conversion errors.
>   * gcc.target/aarch64/acle/memtag_2.c: Use -fpermissive due to expected
>   int-conversion error.
>   * gcc.target/aarch64/acle/memtag_2a.c: New test.  Copied from
>   gcc.target/aarch64/acle/memtag_2.c.  Expect error.
>   * gcc.target/aarch64/sve/acle/general-c/load_3.c (f1): Expect
>   error.
>   * gcc.target/aarch64/sve/acle/general-c/store_2.c (f1): Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/store_scatter_index_1.c
>   (f1): Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/store_scatter_index_restricted_1.c
>   (f1): Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/store_scatter_offset_2.c
>   (f1): Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/store_scatter_offset_restricted_1.c
>   (f1): Likewise.

This is PR106416.
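
For context, a minimal sketch of what -Wint-conversion is about: the diagnostic fires when an integer is converted to a pointer, or a pointer to an integer, without a cast.  The offending forms are shown only in a comment; the helpers below (pointer_to_integer, integer_to_pointer, check_round_trip, all hypothetical names) spell out the conversion explicitly, which is the usual way to keep such code building once the warning is an error.

```c
#include <assert.h>
#include <stdint.h>

/* Conversions that trigger -Wint-conversion, kept as a comment:

     int *p = 0x1000;     integer to pointer without a cast
     int n = p;           pointer to integer without a cast  */

/* Explicit pointer-to-integer conversion through uintptr_t.  */
uintptr_t
pointer_to_integer (const void *p)
{
  return (uintptr_t) p;
}

/* Explicit integer-to-pointer conversion.  */
void *
integer_to_pointer (uintptr_t v)
{
  return (void *) v;
}

/* A void pointer converted to uintptr_t and back compares equal to the
   original pointer.  */
void
check_round_trip (void)
{
  int object = 0;
  void *p = &object;
  assert (integer_to_pointer (pointer_to_integer (p)) == p);
}
```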

> ---


Re: [PATCH 0/6] Turn some C warnings into errors by default

2023-11-13 Thread Sam James


Florian Weimer  writes:

> This patch series converts the following warnings into errors by
> default:
>
>   -Wint-conversion
>   -Wimplicit-function-declaration
>   -Wimplicit-int
>   -Wreturn-mismatch
>   -Wincompatible-pointer-types
>
> As explained in the first commit, I decided not to use permerror_opt
> because it does not exhibit the existing behavior for -pedantic-errors.
>
> The impact on existing sources of the last commit is not really known to
> me at this point.  I plan to start a Fedora build later this week with
> an instrumented compiler, to see how much of a compatibility impact it will
> have.  The first conversion pass through Fedora only covered
> -Wimplicit-function-declaration, -Wimplicit-int.  I started looking at
> -Wint-conversion, and it did not seem to be too bad, so I think
> including it should be fine.  I'm more worried about
> -Wincompatible-pointer-types.
>
> I have not yet added a new overview test for -fpermissive.  Such a test
> should trigger all the dozen or so places where I introduced
> pedpermerror, and see what happens under multiple dialects, each with
> -fpermissive and without, and maybe also with and without
> -pedantic-errors in -std=gnu89 and default modes.  I plan to do this
> once I get some initial feedback on the direction of this series
> because this test would likely be obsolete fairly quickly if changes to
> the diagnostics are required.  I did copy some existing tests to test
> both the error and warning (-fpermissive) diagnostics, and adjusted
> others to expect errors, so there is already quite a bit of coverage
> without that overview test.
>
> Right now, this series breaks the build on aarch64-linux-gnu due to an
> incompatible pointer assignment in libgcc:
>
>   [PATCH] aarch64: Avoid -Wincompatible-pointer-types warning in Linux unwinder
>   
> 
>
> Other targets had the same issue previously, but I've already fixed most
> of them (I hope).  There could of course be similar issues lurking in
> target-specific code, or even in system headers.
>
> With the recent testsuite fixes, the testsuite should be fairly clean
> despite these changes.  I verified that on i686-linux-gnu,
> powerpc64-linux-gnu, and x86_64-linux-gnu.  There is one
> aarch64-linux-gnu testsuite change I'd like the AArch64 maintainers to
> review:

I'll incrementally work on the other targets once stuff lands.

>
>   [PATCH] aarch64: Call named function in gcc.target/aarch64/aapcs64/ice_1.c
>   
> 
>
> Recently, I also found a problem in the gm2 testsuite:
>
>   [PATCH] gm2: Add missing declaration of m2pim_M2RTS_Terminate to test
>   
> 
>

Also, please tag PR96284 for all of these.

> Thanks,
> Florian
>
>
> Florian Weimer (6):
>   c-family: Introduce pedpermerror
>   c: Turn int-conversion warnings into permerrors
>   c: Turn -Wimplicit-function-declaration into a pedpermerror
>   c: Turn -Wimplicit-int into a pedpermerror
>   c: Turn -Wreturn-mismatch into a pedpermerror
>   c: Turn -Wincompatible-pointer-types into a pedpermerror
>
>  gcc/c-family/c-common.h   |   4 +
>  gcc/c-family/c-warn.cc|  34 
>  gcc/c/c-decl.cc   |  40 ++--
>  gcc/c/c-typeck.cc | 164 +--
>  gcc/diagnostic-core.h |   3 +
>  gcc/diagnostic.cc |   7 +
>  gcc/doc/invoke.texi   |  33 +++-
>  gcc/testsuite/c-c++-common/pr77624-1.c|   4 +-
>  .../c-c++-common/spellcheck-reserved.c|   4 +-
>  gcc/testsuite/gcc.dg/20030906-1.c |   2 +-
>  gcc/testsuite/gcc.dg/20030906-1a.c|  21 ++
>  gcc/testsuite/gcc.dg/20030906-2.c |   2 +-
>  gcc/testsuite/gcc.dg/20030906-2a.c|  21 ++
>  .../Wimplicit-function-declaration-c99-2.c|   7 +
>  .../Wimplicit-function-declaration-c99.c  |   2 +-
>  gcc/testsuite/gcc.dg/Wimplicit-int-1.c|   2 +-
>  gcc/testsuite/gcc.dg/Wimplicit-int-1a.c   |  11 ++
>  gcc/testsuite/gcc.dg/Wimplicit-int-4.c|   2 +-
>  gcc/testsuite/gcc.dg/Wimplicit-int-4a.c   |  11 ++
>  .../gcc.dg/Wincompatible-pointer-types-2.c|   2 +-
>  .../gcc.dg/Wincompatible-pointer-types-4.c|   2 +-
>  .../gcc.dg/Wincompatible-pointer-types-5.c|  10 +
>  .../gcc.dg/Wincompatible-pointer-types-6.c|  10 +
>  gcc/testsuite/gcc.dg/Wint-conversion-2.c  |   2 +-
>  gcc/testsuite/gcc.dg/Wint-conversion-3.c  |   2 +-
>  gcc/testsuite/gcc.dg/Wint-conversion-4.c  |  14 ++
>  gcc/testsuite/gcc.dg/Wreturn-mismatch-1.c |   2 +-
>  gcc/testsuite/gcc.dg/Wreturn-mismatch-1a.c|  40 
>  gcc/testsuite/gcc.dg/Wreturn-mismatch-2.c |   2 +-
>  

Re: [PATCH 3/6] c: Turn -Wimplicit-function-declaration into a pedpermerror [PR91092]

2023-11-13 Thread Sam James


Florian Weimer  writes:

> In the future, it may make sense to avoid cascading errors from
> the implicit declaration, especially its assumed int return type.
> This change here only changes the kind of the diagnostic, not
> its wording or consequences.
>
> gcc/
>
>   * doc/invoke.texi (Warning Options): Document changes.
>
> gcc/c/
>
>   * c-decl.cc (implicit_decl_pedpermerror): Rename from
>   implicit_decl_warning.  Call pedpermerror instead of
>   pedwarn and warning_at.
>   (implicitly_declare): Adjust callers.
>
> gcc/testsuite/
>
>   * c-c++-common/spellcheck-reserved.c (test, test_2): Expect
>   error instead of warning.
>   (f): Expect error instead of warning.
>   * gcc.dg/Wimplicit-function-declaration-c99.c: Compile with
>   -fpermissive due to expected warning.
>   * gcc.dg/Wimplicit-function-declaration-c99-2.c: New test.
>   Copied from gcc.dg/Wimplicit-function-declaration-c99.c.
>   Expect error.
>   * gcc.dg/missing-header-fixit-1.c: Compile with -fpermissive
>   due to expected error.
>   * gcc.dg/missing-header-fixit-1a.c: New test.  Copied from
>   gcc.dg/missing-header-fixit-1.c, but expect error.
>   * gcc.dg/missing-header-fixit-2.c: Compile with -fpermissive
>   due to expected error.
>   * gcc.dg/missing-header-fixit-2a.c: New test.  Copied from
>   gcc.dg/missing-header-fixit-2.c, but expect error.
>   * gcc.dg/missing-header-fixit-4.c: Compile with -fpermissive
>   due to expected error.
>   * gcc.dg/missing-header-fixit-4a.c: New test.  Copied from
>   gcc.dg/missing-header-fixit-4.c, but expect error.
>   * gcc.dg/missing-header-fixit-5.c: Compile with -fpermissive
>   due to expected error.
>   * gcc.dg/missing-header-fixit-5a.c: New test.  Copied from
>   gcc.dg/missing-header-fixit-5.c, but expect error.
>   * gcc.dg/pr61852.c: Expect implicit-function-declaration
>   error instead of warning.
>   * gcc.dg/spellcheck-identifiers-2.c: Compile with
>   -fpermissive due to expected warnings.
>   * gcc.dg/spellcheck-identifiers-2a.c: New test.  Copied
>   from gcc.dg/spellcheck-identifiers-2.c.  Expect errors.
>   * gcc.dg/spellcheck-identifiers-3.c: Compile with
>   -fpermissive due to expected warnings.
>   * gcc.dg/spellcheck-identifiers-3a.c: New test.  Copied
>   from gcc.dg/spellcheck-identifiers-3.c.  Expect errors.
>   * gcc.dg/spellcheck-identifiers-4.c: Compile with
>   -fpermissive due to expected warnings.
>   * gcc.dg/spellcheck-identifiers-4a.c: New test.  Copied
>   from gcc.dg/spellcheck-identifiers-4.c.  Expect error.
>   * gcc.dg/spellcheck-identifiers.c: Compile with
>   -fpermissive due to expected warnings.
>   * gcc.dg/spellcheck-identifiers-1a.c: New test.  Copied
>   from gcc.dg/spellcheck-identifiers.c.  Expect errors.
>   * gcc.target/aarch64/sve/acle/general-c/ld1sh_gather_1.c (f1):
>   Expect error.
>   * gcc.target/aarch64/sve/acle/general-c/load_ext_gather_index_1.c:
>   (f1): Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/load_ext_gather_index_restricted_1.c:
>   (f1): Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_1.c:
>   (f1): Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_2.c:
>   (f1): Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_3.c:
>   (f1): Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_4.c:
>   (f1): Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_5.c:
>   (f1): Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_restricted_1.c:
>   (f1): Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_restricted_2.c:
>   (f1): Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_restricted_3.c:
>   (f1): Likewise.
>   * gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_restricted_4.c:

This is PR91092. In due course, we'll also have to update the porting to
GCC 14 page.
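
As a rough illustration of the kind of fix that such a porting note would describe: a call to an undeclared function is what becomes an error, and the remedy is a declaration (or the right header) before the first use.  The sketch below is an assumption-laden example, with twice, length_doubled and check_length_doubled invented for illustration.

```c
#include <assert.h>
#include <string.h>   /* declares strlen; forgetting the header is a common trigger */

/* Without this prototype, the call in length_doubled would be an implicit
   function declaration: C89 treated it as if "int twice ();" had been
   written, which is exactly what the patch stops accepting by default.  */
int twice (int x);

int
length_doubled (const char *s)
{
  return twice ((int) strlen (s));
}

int
twice (int x)
{
  return 2 * x;
}

void
check_length_doubled (void)
{
  assert (length_doubled ("abc") == 6);
}
```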


Re: [PATCH V3 0/7] ira/lra: Support subreg coalesce

2023-11-13 Thread Vladimir Makarov



On 11/12/23 07:08, Lehua Ding wrote:

V3 Changes:
   1. fix three ICE.
   2. rebase

Hi,

These patches try to support subreg coalesce feature in
register allocation passes (ira and lra).

I've started review of v3 patches and here is my initial general 
criticism of your patches:


  * Absence of comments for some functions, e.g. for `HARD_REG_SET 
operator>> (unsigned int shift_amount) const`.


  * Adding significant functionality to existing functions is not 
reflected in the function comment, e.g. in ira_set_allocno_class.


  * A lot of typos, e.g. `pesudo` or `reprensent`.  I think you need to 
check the spelling of your comments (I myself do spell checking in emacs with 
the ispell-region command).


  * Grammar mistakes, e.g. `Flag means need track subreg live range for 
the allocno`.  I understand English is not your native language (nor is it 
mine).  In case of doubt I'd recommend checking grammar in ChatGPT 
(Proofread:  text).


  * Some local variables use upper case letters (e.g. `int A`) which 
should be used for macros or enums according to GNU coding standard 
(https://www.gnu.org/prep/standards/standards.html) .


  * Sometimes you put one space instead of two at the end of a sentence.  
Please see the GNU coding standard and GCC coding conventions 
(https://gcc.gnu.org/codingconventions.html)


  * There is no uniformity in your code, e.g. sometimes you use 'i++', 
sometimes `++i` or `i += 1`.  Although the uniformity is not necessary, 
it makes a better impression about the patches.



I also did not find which targets you used for testing.  I am asking 
this because I see new testsuite failures (apx-spill_to_egprs-1.c) even 
on x86-64.  It might be nothing, as the test expects a specific code 
generation.


Also, besides testing major targets, I'd recommend testing at least one 
big-endian target (I'd recommend ppc64be; gcc110.fsfrance.org could be 
used for this).  Plenty of RA issues occur because BE targets are not tested.





Re: [PATCH 6/6] c: Turn -Wincompatible-pointer-types into a pedpermerror

2023-11-13 Thread Jeff Law




On 11/13/23 06:11, Florian Weimer wrote:

gcc/

* doc/invoke.texi (Warning Options): Document changes.

gcc/c/

* c-typeck.cc (build_conditional_expr): Use pedpermerror-
equivalent for pointer type mismatches in conditional
expression.
(convert_for_assignment): Use pedpermerror and
pedpermerror_init for OPT_Wincompatible_pointer_types
warnings.

gcc/testsuite/

* gcc.dg/Wincompatible-pointer-types-2.c: Compile with
-fpermissive due to expected errors.
* gcc.dg/Wincompatible-pointer-types-4.c: Likewise.
* gcc.dg/Wincompatible-pointer-types-5.c: New test.  Copied
from gcc.dg/Wincompatible-pointer-types-2.c.  Expect errors.
* gcc.dg/Wincompatible-pointer-types-6.c: New test.  Copied
from gcc.dg/Wincompatible-pointer-types-4.c.  Expect errors.
* gcc.dg/anon-struct-11.c: Compile with -fpermissive
due to expected errors.
* gcc.dg/anon-struct-11a.c: New test.  Copied from
gcc.dg/anon-struct-11.c.  Expect errors.
* gcc.dg/anon-struct-13.c: Compile with -fpermissive
due to expected errors.
* gcc.dg/anon-struct-13a.c: New test.  Copied from
gcc.dg/anon-struct-13.c.  Expect errors.
* gcc.dg/builtin-arith-overflow-4.c: Compile with -fpermissive
due to expected errors.
* gcc.dg/builtin-arith-overflow-4a.c: New test.  Copied from
gcc.dg/builtin-arith-overflow-4.c.  Expect errors.
* gcc.dg/c23-qual-4.c: Expect -Wincompatible-pointer-types errors.
* gcc.dg/dfp/composite-type.c: Compile with -fpermissive
due to expected errors.
* gcc.dg/dfp/composite-type-2.c: New test.  Copied from
gcc.dg/dfp/composite-type.c.  Expect errors.
* gcc.dg/diag-aka-1.c: Compile with -fpermissive
due to expected errors.
* gcc.dg/diag-aka-1a.c: New test.  Copied from gcc.dg/diag-aka-1.c.
Expect errors.
* gcc.dg/enum-compat-1.c: Compile with -fpermissive
due to expected errors.
* gcc.dg/enum-compat-2.c: New test.  Copied from
gcc.dg/enum-compat-1.c.  Expect errors.
* gcc.dg/func-ptr-conv-1.c: Compile with -fpermissive
due to expected errors.
* gcc.dg/func-ptr-conv-2.c: New test.  Copied from
gcc.dg/func-ptr-conv-1.c.  Expect errors.
* gcc.dg/init-bad-7.c: Compile with -fpermissive
due to expected errors.
* gcc.dg/init-bad-7a.c: New test.  Copied from gcc.dg/init-bad-7.c.
Expect errors.
* gcc.dg/noncompile/incomplete-3.c (foo): Expect
-Wincompatible-pointer-types error.
* gcc.dg/param-type-mismatch-2.c (test8): Likewise.
* gcc.dg/pointer-array-atomic.c: Compile with -fpermissive
due to expected errors.
* gcc.dg/pointer-array-atomic-2.c: New test.  Copied from
gcc.dg/pointer-array-atomic.c.  Expect errors.
* gcc.dg/pointer-array-quals-1.c (test): Expect
-Wincompatible-pointer-types errors.
* gcc.dg/transparent-union-1.c: Compile with -fpermissive
due to expected errors.
* gcc.dg/transparent-union-1a.c: New test.  Copied from
gcc.dg/transparent-union-1.c.  Expect errors.
* gcc.target/aarch64/acle/memtag_2a.c
(test_memtag_warning_return_qualifier): Expect additional
errors.
* gcc.target/aarch64/sve/acle/general-c/load_2.c (f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_2.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_3.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_4.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_5.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_restricted_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_restricted_2.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_restricted_3.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_restricted_4.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/sizeless-1.c (f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/sizeless-2.c (f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/store_1.c (f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/store_2.c (f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/store_scatter_index_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/store_scatter_index_restricted_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/store_scatter_offset_2.c
(f1): 

Re: [PATCH 5/6] c: Turn -Wreturn-mismatch into a pedpermerror

2023-11-13 Thread Jeff Law




On 11/13/23 06:11, Florian Weimer wrote:

gcc/

* doc/invoke.texi (Warning Options): Document changes.

gcc/c/

* c-typeck.cc (c_finish_return): Issue a permerror
for mismatching pointers to builtins.  For mismatching
other pointers, issue a pedpermerror.

gcc/testsuite/

* gcc.dg/20030906-1.c: Compile with -fpermissive due to
expected -Wreturn-mismatch error.
* gcc.dg/20030906-1a.c: New test.  Copied from
gcc.dg/20030906-1.c.  Expect the error.
* gcc.dg/20030906-2.c: Compile with -fpermissive due to
expected -Wreturn-mismatch error.
* gcc.dg/20030906-2a.c: New test.  Copied from
gcc.dg/20030906-2.c.  Expect the error.
* gcc.dg/Wreturn-mismatch-1.c: Compile with -fpermissive due to
expected -Wreturn-mismatch error.
* gcc.dg/Wreturn-mismatch-1a.c: New test.  Copied from
gcc.dg/Wreturn-mismatch-1.c.  Expect the error.
* gcc.dg/Wreturn-mismatch-2.c: Compile with -fpermissive due to
expected -Wreturn-mismatch error.
* gcc.dg/Wreturn-mismatch-2a.c: New test.  Copied from
gcc.dg/Wreturn-mismatch-2.c.  Expect the error.
* gcc.dg/diagnostic-range-bad-return.c: Compile with
-fpermissive due to expected -Wreturn-mismatch error.
* gcc.dg/diagnostic-range-bad-return-2.c: New test.
Copied from gcc.dg/diagnostic-range-bad-return.c.  Expect the
error.
* gcc.dg/pr105635-2.c: Expect -Wreturn-mismatch error.
* gcc.dg/pr23075.c: Build with -fpermissive due to
expected -Wreturn-mismatch error.
* gcc.dg/pr23075-2.c: New test.  Copied from gcc.dg/pr23075.c.
Expect the error.
* gcc.dg/pr29521.c: Compile with -fpermissive due to expected
-Wreturn-mismatch error.
* gcc.dg/pr29521-a.c: New test. Copied from gcc.dg/pr29521.c.
Expect error.
* gcc.dg/pr67730.c: Compile with -fpermissive due to expected
-Wreturn-mismatch error.
* gcc.dg/pr67730-a.c: New test.  Copied from
gcc.dg/pr67730.c.  Expect error.
* gcc.target/powerpc/conditional-return.c: Compile with
-fpermissive due to expected -Wreturn-mismatch error.
---

OK
jeff


Re: [PATCH 4/6] c: Turn -Wimplicit-int into a pedpermerror

2023-11-13 Thread Jeff Law




On 11/13/23 06:11, Florian Weimer wrote:

There is a missed opportunity here to issue spelling diagnostics
in prototype declarations (e.g., for “extern int foo (int32t);”).

gcc/

* doc/invoke.texi (Warning Options): Document changes.

gcc/c/

* c-decl.cc (warn_defaults_to): Call emit_diagnostic_valist
instead of reimplementing it. Issue a pedpermerror for C99
and later.
(store_parm_decls_oldstyle): Call pedpermerror for
OPT_Wimplicit_int.

gcc/testsuite/

* gcc.dg/Wimplicit-int-1.c: Compile with -fpermissive due to
expected warning.
* gcc.dg/Wimplicit-int-4.c: Likewise.
* gcc.dg/Wimplicit-int-1a.c: New test.  Copied from
gcc.dg/Wimplicit-int-1.c, but expect errors.
* gcc.dg/Wimplicit-int-4a.c: New test.  Copied from
gcc.dg/Wimplicit-int-4.c, but expect errors.
* gcc.dg/gnu23-attr-syntax-2.c: Compile with -fpermissive
due to expected implicit-int error.
* gcc.dg/gnu23-attr-syntax-3.c: New test.  Copied from
gcc.dg/gnu23-attr-syntax-2.c, but expect an error.
* gcc.dg/pr105635.c: Build with -fpermissive due to implicit
int.
* gcc.dg/pr105635-2.c: New test.  Copied from
gcc.dg/pr105635.c.  Expect implicit int error.
* gcc.dg/noncompile/pr79758.c: Build with -fpermissive due to
implicit int.
* gcc.dg/noncompile/pr79758-2.c: New test.  Copied from
gcc.dg/noncompile/pr79758.c.  Expect implicit int error.
---

OK
jeff


Re: [PATCH 0/7] ira/lra: Support subreg coalesce

2023-11-13 Thread Vladimir Makarov



On 11/12/23 07:01, Lehua Ding wrote:
Thanks for the specint performance data. I'll do my best to get the 
compile time and memory issues fixed. I'm very curious to know if the 
way used to solve the subreg coalesce problem makes sense to you?


If it works, it is ok for me.  There is always room for any 
optimization even if it decreases compilation speed considerably. We 
just need to keep the same speed for optimization level <= 2.  We can 
put really expensive optimizations into -O3 or -Ofast.


Although the first thing I would try myself is to do subreg liveness 
analysis only locally (inside BBs).  The majority of cases I saw where 
subreg RA could be improved were local (inside a BB).  For such an 
approach, we would probably have only a minor compiler slowdown and 
could enable the optimization by default.




Re: [PATCH 3/6] c: Turn -Wimplicit-function-declaration into a pedpermerror

2023-11-13 Thread Jeff Law




On 11/13/23 06:10, Florian Weimer wrote:

In the future, it may make sense to avoid cascading errors from
the implicit declaration, especially its assumed int return type.
This change here only changes the kind of the diagnostic, not
its wording or consequences.

gcc/

* doc/invoke.texi (Warning Options): Document changes.

gcc/c/

* c-decl.cc (implicit_decl_pedpermerror): Rename from
implicit_decl_warning.  Call pedpermerror instead of
pedwarn and warning_at.
(implicitly_declare): Adjust callers.

gcc/testsuite/

* c-c++-common/spellcheck-reserved.c (test, test_2): Expect
error instead of warning.
(f): Expect error instead of warning.
* gcc.dg/Wimplicit-function-declaration-c99.c: Compile with
-fpermissive due to expected warning.
* gcc.dg/Wimplicit-function-declaration-c99-2.c: New test.
Copied from gcc.dg/Wimplicit-function-declaration-c99.c.
Expect error.
* gcc.dg/missing-header-fixit-1.c: Compile with -fpermissive
due to expected error.
* gcc.dg/missing-header-fixit-1a.c: New test.  Copied from
gcc.dg/missing-header-fixit-1.c, but expect error.
* gcc.dg/missing-header-fixit-2.c: Compile with -fpermissive
due to expected error.
* gcc.dg/missing-header-fixit-2a.c: New test.  Copied from
gcc.dg/missing-header-fixit-2.c, but expect error.
* gcc.dg/missing-header-fixit-4.c: Compile with -fpermissive
due to expected error.
* gcc.dg/missing-header-fixit-4a.c: New test.  Copied from
gcc.dg/missing-header-fixit-4.c, but expect error.
* gcc.dg/missing-header-fixit-5.c: Compile with -fpermissive
due to expected error.
* gcc.dg/missing-header-fixit-5a.c: New test.  Copied from
gcc.dg/missing-header-fixit-5.c, but expect error.
* gcc.dg/pr61852.c: Expect implicit-function-declaration
error instead of warning.
* gcc.dg/spellcheck-identifiers-2.c: Compile with
-fpermissive due to expected warnings.
* gcc.dg/spellcheck-identifiers-2a.c: New test.  Copied
from gcc.dg/spellcheck-identifiers-2.c.  Expect errors.
* gcc.dg/spellcheck-identifiers-3.c: Compile with
-fpermissive due to expected warnings.
* gcc.dg/spellcheck-identifiers-3a.c: New test.  Copied
from gcc.dg/spellcheck-identifiers-3.c.  Expect errors.
* gcc.dg/spellcheck-identifiers-4.c: Compile with
-fpermissive due to expected warnings.
* gcc.dg/spellcheck-identifiers-4a.c: New test.  Copied
from gcc.dg/spellcheck-identifiers-4.c.  Expect error.
* gcc.dg/spellcheck-identifiers.c: Compile with
-fpermissive due to expected warnings.
* gcc.dg/spellcheck-identifiers-1a.c: New test.  Copied
from gcc.dg/spellcheck-identifiers.c.  Expect errors.
* gcc.target/aarch64/sve/acle/general-c/ld1sh_gather_1.c (f1):
Expect error.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_index_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_index_restricted_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_2.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_3.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_4.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_5.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_restricted_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_restricted_2.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_restricted_3.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/load_ext_gather_offset_restricted_4.c
(f1): Likewise.
---

OK
jeff


Re: [PATCH 2/6] c: Turn int-conversion warnings into permerrors

2023-11-13 Thread Jeff Law




On 11/13/23 06:10, Florian Weimer wrote:

gcc/

* doc/invoke.texi (Warning Options): Document changes.

gcc/c/

* c-typeck.cc (build_conditional_expr): Use pedpermerror for
pointer/integer type mismatches, based on -Wint-conversion.
(pedwarn_pedpermerror_init, permerror_init): New functions.
(pedwarn_init): Call pedwarn_pedpermerror_init.
(convert_for_assignment): Use pedpermerror and
pedpermerror_init for -Wint-conversion warnings.

gcc/testsuite/

* c-c++-common/pr77624-1.c (foo, bar): Expect
error instead of warning.
* gcc.dg/Wint-conversion-2.c: Compile with -fpermissive due
to expected int-conversion warning.
* gcc.dg/Wint-conversion-3.c: Likewise.
* gcc.dg/Wint-conversion-4.c: New test.  Based on
gcc.dg/Wint-conversion-3.c.  Expect int-conversion errors.
* gcc.dg/assign-warn-1.c: Compile with -fpermissive.
* gcc.dg/assign-warn-4.c: New file.  Extracted from
assign-warn-1.c.  Expect int-conversion errors.
* gcc.dg/diagnostic-types-1.c: Compile with -fpermissive.
* gcc.dg/diagnostic-types-2.c: New file.  Extracted from
gcc.dg/diagnostic-types-1.c.  Expect some errors instead of
warnings.
* gcc.dg/gomp/pr35738.c: Compile with -fpermissive due to
expected int-conversion error.
* gcc.dg/gomp/pr35738-2.c: New test.  Based on
gcc.dg/gomp/pr35738.c.  Expect int-conversion errors.
* gcc.dg/init-excess-3.c: Expect int-conversion errors.
* gcc.dg/overflow-warn-1.c: Likewise.
* gcc.dg/overflow-warn-3.c: Likewise.
* gcc.dg/param-type-mismatch.c: Compile with -fpermissive.
* gcc.dg/param-type-mismatch-2.c: New test.  Copied from
gcc.dg/param-type-mismatch.c.  Expect errors.
* gcc.dg/pr61162-2.c: Compile with -fpermissive.
* gcc.dg/pr61162-3.c: New test. Extracted from
gcc.dg/pr61162-2.c.  Expect int-conversion errors.
* gcc.dg/spec-barrier-3.c: Use -fpermissive due to expected
int-conversion error.
* gcc.dg/spec-barrier-3a.c: New test.  Based on
gcc.dg/spec-barrier-3.c.  Expect int-conversion errors.
* gcc.target/aarch64/acle/memtag_2.c: Use -fpermissive due to expected
int-conversion error.
* gcc.target/aarch64/acle/memtag_2a.c: New test.  Copied from
gcc.target/aarch64/acle/memtag_2.c.  Expect error.
* gcc.target/aarch64/sve/acle/general-c/load_3.c (f1): Expect
error.
* gcc.target/aarch64/sve/acle/general-c/store_2.c (f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/store_scatter_index_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/store_scatter_index_restricted_1.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/store_scatter_offset_2.c
(f1): Likewise.
* gcc.target/aarch64/sve/acle/general-c/store_scatter_offset_restricted_1.c
(f1): Likewise.

OK
jeff


Re: [PATCH 1/6] c-family: Introduce pedpermerror

2023-11-13 Thread Jeff Law




On 11/13/23 06:09, Florian Weimer wrote:

It turns out that permerror_opt is not directly usable for
-fpermissive in the C front end.  The front end uses pedwarn
extensively, and pedwarns are not overridable by -Wno-* options,
yet permerrors are.  Add new pedpermerror helpers that turn into
pedwarns if -pedantic-errors is active.

Due to the dependency on flag_pedantic_errors, the new helpers
are specific to the C-family front ends.  For implementing the
rich location variant, export emit_diagnostic_valist from
gcc/diagnostic.cc in parallel to its location_t variant.
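
As a rough illustration of the behaviour described above, the following is a simplified model, not GCC's implementation. All names (Severity, Flags, model_pedwarn, model_permerror, model_pedpermerror) are hypothetical, and the model assumes only what this message states: pedwarns ignore -Wno-* while permerrors honour it, and the new helper behaves like a pedwarn when -pedantic-errors is active. That -fpermissive downgrades a permerror to a warning is the usual GCC behaviour.

```cpp
// Simplified, hypothetical model of the diagnostic kinds discussed in this
// message.  This is NOT GCC code; it only encodes the rules stated above.
enum class Severity { Ignored, Warning, Error };

struct Flags {
  bool pedantic_errors;  // models -pedantic-errors
  bool permissive;       // models -fpermissive
  bool option_enabled;   // false models an explicit -Wno-<option>
};

// pedwarn, as described above: not overridable by -Wno-*; it becomes an
// error only under -pedantic-errors.
constexpr Severity model_pedwarn(const Flags &f) {
  return f.pedantic_errors ? Severity::Error : Severity::Warning;
}

// permerror: an error by default, a warning under -fpermissive, and
// suppressed entirely by -Wno-<option>.
constexpr Severity model_permerror(const Flags &f) {
  return !f.option_enabled
             ? Severity::Ignored
             : (f.permissive ? Severity::Warning : Severity::Error);
}

// pedpermerror: turns into a pedwarn if -pedantic-errors is active,
// otherwise behaves like a permerror.
constexpr Severity model_pedpermerror(const Flags &f) {
  return f.pedantic_errors ? model_pedwarn(f) : model_permerror(f);
}
```

In this model, -Wno-<option> silences the diagnostic only while -pedantic-errors is off, which is the distinction the new helpers are meant to capture.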

gcc/

* diagnostic-core.h (emit_diagnostic_valist): Declare function.
* diagnostic.cc (emit_diagnostic_valist): Define it.

gcc/c-family/

* c-common.h (pedpermerror): Declare functions.
* c-warn.cc (pedpermerror): Define them.

OK
jeff


Re: [PATCH v2 3/3] RISC-V: Add support for XCVbi extension in CV32E40P

2023-11-13 Thread Patrick O'Neill

Hi Mary,

GCC tip-of-tree with this patch series applied fails to build glibc with:
--with-arch=rv32imac --with-abi=ilp32
--with-arch=rv32imafdc --with-abi=ilp32d

Failing command:
./bin/riscv32-unknown-elf-gcc 
-B/scratch/tc-testing/tc-nov-13-zcvbi-series/build/build-newlib/riscv32-unknown-elf/newlib/ 
-isystem 
/scratch/tc-testing/tc-nov-13-zcvbi-series/build/build-newlib/riscv32-unknown-elf/newlib/targ-include 
-isystem 
/scratch/tc-testing/tc-nov-13-zcvbi-series/newlib/newlib/libc/include 
-B/scratch/tc-testing/tc-nov-13-zcvbi-series/build/build-newlib/riscv32-unknown-elf/libgloss/riscv32 
-L/scratch/tc-testing/tc-nov-13-zcvbi-series/build/build-newlib/riscv32-unknown-elf/libgloss/libnosys 
-L/scratch/tc-testing/tc-nov-13-zcvbi-series/newlib/libgloss/riscv32 
-DPACKAGE_NAME=\"newlib\" -DPACKAGE_TARNAME=\"newlib\" 
-DPACKAGE_VERSION=\"4.1.0\" -DPACKAGE_STRING=\"newlib\ 4.1.0\" 
-DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -I. 
-I/scratch/tc-testing/tc-nov-13-zcvbi-series/build/../newlib/newlib/libc/time 
-fno-builtin -DHAVE_NANOSLEEP  -O2 -D_POSIX_MODE -ffunction-sections 
-fdata-sections    -mcmodel=medlow -c -o lib_a-mktime.o `test -f 
'mktime.c' || echo 
'/scratch/tc-testing/tc-nov-13-zcvbi-series/build/../newlib/newlib/libc/time/'`mktime.c

during RTL pass: combine
/scratch/tc-testing/tc-nov-13-zcvbi-series/build/../newlib/newlib/libc/time/mktime.c: 
In function 'validate_structure':
/scratch/tc-testing/tc-nov-13-zcvbi-series/build/../newlib/newlib/libc/time/mktime.c:148:1: 
internal compiler error: Segmentation fault

  148 | }
  | ^
0x1276843 crash_signal
    ../../../gcc/gcc/toplev.cc:316
0x2136c93 recog_1349
    ../../../gcc/gcc/config/riscv/thead.md:438
0x21dcab5 recog_for_combine_1
    ../../../gcc/gcc/combine.cc:11382
0x21e12ae recog_for_combine
    ../../../gcc/gcc/combine.cc:11652
0x21f594a try_combine
    ../../../gcc/gcc/combine.cc:3867
0x21f8159 combine_instructions
    ../../../gcc/gcc/combine.cc:1285
0x21f8159 rest_of_handle_combine
    ../../../gcc/gcc/combine.cc:15080
0x21f8159 execute
    ../../../gcc/gcc/combine.cc:15124
Please submit a full bug report, with preprocessed source (by using 
-freport-bug).

Please include the complete backtrace with any bug report.
See  for instructions.

Creduced testcase:
./bin/riscv32-unknown-elf-gcc -O1 mktime.c -freport-bug
int a, b;
void c() { a = b % 7; }

during RTL pass: combine
mktime.c: In function 'c':
mktime.c:2:23: internal compiler error: Segmentation fault
    2 | void c() { a = b % 7; }
  |   ^
0x1276843 crash_signal
    ../../../gcc/gcc/toplev.cc:316
0x2136c93 recog_1349
    ../../../gcc/gcc/config/riscv/thead.md:438
0x21dcab5 recog_for_combine_1
    ../../../gcc/gcc/combine.cc:11382
0x21e12ae recog_for_combine
    ../../../gcc/gcc/combine.cc:11652
0x21f594a try_combine
    ../../../gcc/gcc/combine.cc:3867
0x21f8159 combine_instructions
    ../../../gcc/gcc/combine.cc:1285
0x21f8159 rest_of_handle_combine
    ../../../gcc/gcc/combine.cc:15080
0x21f8159 execute
    ../../../gcc/gcc/combine.cc:15124
Please submit a full bug report, with preprocessed source.
Please include the complete backtrace with any bug report.
See  for instructions.
Preprocessed source stored into /tmp/ccJL816S.out file, please attach 
this to your bugreport.


I've attached the -freport-bug output to this email.

Thanks,
Patrick

glibc: a704fd9a133bfb10510e18702f48a6a9c88dbbd5
gcc base hash: 2020bce38cf2e02cbd1097faa8f1fd6586364a7e

On 11/13/23 05:35, Mary Bennett wrote:

Spec: 
github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md

Contributors:
   Mary Bennett 
   Nandni Jamnadas 
   Pietra Ferreira 
   Charlie Keaney
   Jessica Mills
   Craig Blackmore 
   Simon Cook 
   Jeremy Bennett 
   Helene Chelin 

gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Create XCVbi extension
  support.
* config/riscv/riscv.opt: Likewise.
* config/riscv/corev.md: Implement cv_branch pattern
  for cv.beqimm and cv.bneimm.
* config/riscv/riscv.md: Change pattern priority so corev.md
  patterns run before riscv.md patterns.
* config/riscv/constraints.md: Implement constraints
  cv_bi_s5 - signed 5-bit immediate.
* config/riscv/predicates.md: Implement predicate
  const_int5s_operand - signed 5 bit immediate.
* doc/sourcebuild.texi: Add XCVbi documentation.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/cv-bi-beqimm-compile-1.c: New test.
* gcc.target/riscv/cv-bi-beqimm-compile-2.c: New test.
* gcc.target/riscv/cv-bi-bneimm-compile-1.c: New test.
* gcc.target/riscv/cv-bi-bneimm-compile-2.c: New test.
* lib/target-supports.exp: Add proc for XCVbi.
---
  gcc/common/config/riscv/riscv-common.cc   |  2 +
  gcc/config/riscv/constraints.md   |  6 +++
  

[PATCH] c++: Link extended FP conversion pedwarns to -Wnarrowing [PR111842]

2023-11-13 Thread Jonathan Wakely
Tested x86_64-linux. Does this make sense for trunk?

These forbidden conversions are defined as narrowing conversions, and we
already use -Wnarrowing to control other "ill-formed, but only warn by
default" diagnostics required by the standard.

-- >8 --

Several users have been confused by the status of these warnings,
which can be misunderstood as "this might not be what you want",
rather than diagnostics required by the C++ standard. Add the text "ISO
C++ does not allow" to make this clear.

Also link them to -Wnarrowing so that they can be disabled or promoted
to errors independently of other pedwarns.
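
For readers unfamiliar with the rule behind these diagnostics, here is a hedged sketch of how floating-point conversion ranks can be compared. It is a simplification and not the code in call.cc: it assumes IEEE-like formats described only by significand and exponent widths, ignores subranks, and ignores the exemption for conversions among float, double and long double. All names (FpFormat, RankOrder, covers, compare_rank, implicit_conversion_ok) are made up for illustration.

```cpp
// Hypothetical model of floating-point conversion ranks.  A type has a
// greater rank than another if its value set is a proper superset; if
// neither value set contains the other, the ranks are unordered.
struct FpFormat {
  int significand_bits;  // including the implicit leading bit
  int exponent_bits;
};

enum class RankOrder { Equal, Less, Greater, Unordered };

// Approximation: A can represent every value of B if it has at least as
// many significand bits and at least as many exponent bits.
constexpr bool covers(const FpFormat &a, const FpFormat &b) {
  return a.significand_bits >= b.significand_bits &&
         a.exponent_bits >= b.exponent_bits;
}

constexpr RankOrder compare_rank(const FpFormat &a, const FpFormat &b) {
  return covers(a, b)
             ? (covers(b, a) ? RankOrder::Equal : RankOrder::Greater)
             : (covers(b, a) ? RankOrder::Less : RankOrder::Unordered);
}

// An implicit conversion is allowed only when the destination rank is
// greater than or equal to the source rank.
constexpr bool implicit_conversion_ok(const FpFormat &from,
                                      const FpFormat &to) {
  return covers(to, from);
}

constexpr FpFormat kFloat16{11, 5};   // std::float16_t
constexpr FpFormat kBFloat16{8, 8};   // std::bfloat16_t
constexpr FpFormat kFloat32{24, 8};   // std::float32_t
constexpr FpFormat kFloat64{53, 11};  // std::float64_t
```

The "greater conversion rank" and "unordered conversion rank" messages in the diff below correspond to the Greater and Unordered outcomes for the source type in this model.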

gcc/cp/ChangeLog:

PR c++/111842
PR c++/112498
* call.cc (convert_like_internal): Use OPT_Wnarrowing for
pedwarns about ill-formed conversions involving extended
floating-point types. Clarify that ISO C++ requires these
diagnostics.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/ext-floating16.C: New test.
* g++.dg/cpp23/ext-floating17.C: New test.
---
 gcc/cp/call.cc  | 10 +++--
 gcc/testsuite/g++.dg/cpp23/ext-floating16.C | 40 
 gcc/testsuite/g++.dg/cpp23/ext-floating17.C | 42 +
 3 files changed, 88 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp23/ext-floating16.C
 create mode 100644 gcc/testsuite/g++.dg/cpp23/ext-floating17.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 4516677bcab..4b0c6c42f01 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -8303,15 +8303,17 @@ convert_like_internal (conversion *convs, tree expr, tree fn, int argnum,
totype))
  {
  case 2:
-   if (pedwarn (loc, 0, "converting to %qH from %qI with greater "
-"conversion rank", totype, TREE_TYPE (expr)))
+   if (pedwarn (loc, OPT_Wnarrowing, "ISO C++ does not allow "
+"converting to %qH from %qI with greater "
+"conversion rank", totype, TREE_TYPE (expr)))
  complained = 1;
else if (!complained)
  complained = -1;
break;
  case 3:
-   if (pedwarn (loc, 0, "converting to %qH from %qI with unordered "
-"conversion ranks", totype, TREE_TYPE (expr)))
+   if (pedwarn (loc, OPT_Wnarrowing, "ISO C++ does not allow "
+"converting to %qH from %qI with unordered "
+"conversion rank", totype, TREE_TYPE (expr)))
  complained = 1;
else if (!complained)
  complained = -1;
diff --git a/gcc/testsuite/g++.dg/cpp23/ext-floating16.C b/gcc/testsuite/g++.dg/cpp23/ext-floating16.C
new file mode 100644
index 00000000000..d6a562d5cb0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/ext-floating16.C
@@ -0,0 +1,40 @@
+// P1467R9 - Extended floating-point types and standard names.
+// { dg-do compile { target c++23 } }
+// { dg-options "-pedantic-errors -Wno-narrowing" }
+// { dg-add-options float16 }
+// { dg-add-options float32 }
+// { dg-add-options float64 }
+// { dg-add-options float128 }
+
+#include "ext-floating.h"
+
+#ifdef __STRICT_ANSI__
+#undef __SIZEOF_FLOAT128__
+#endif
+
+using namespace std;
+
+#ifdef __STDCPP_FLOAT16_T__
+#ifdef __STDCPP_FLOAT32_T__
+float16_t f16c = 1.0F32;   // { dg-bogus "converting to 'std::float16_t' \\\{aka '_Float16'\\\} from '_Float32' with greater conversion rank" "" { target { float16 && float32 } } }
+#endif
+#ifdef __STDCPP_FLOAT64_T__
+float16_t f16e = 1.0F64;   // { dg-bogus "converting to 'std::float16_t' \\\{aka '_Float16'\\\} from '_Float64' with greater conversion rank" "" { target { float16 && float64 } } }
+#endif
+#ifdef __STDCPP_FLOAT128_T__
+float16_t f16g = 1.0F128;  // { dg-bogus "converting to 'std::float16_t' \\\{aka '_Float16'\\\} from '_Float128' with greater conversion rank" "" { target { float16 && float128 } } }
+#endif
+#endif
+#ifdef __STDCPP_FLOAT32_T__
+#ifdef __STDCPP_FLOAT64_T__
+float32_t f32e = 1.0F64;   // { dg-bogus "converting to 'std::float32_t' \\\{aka '_Float32'\\\} from '_Float64' with greater conversion rank" "" { target { float32 && float64 } } }
+#endif
+#ifdef __STDCPP_FLOAT128_T__
+float32_t f32g = 1.0F128;  // { dg-bogus "converting to 'std::float32_t' \\\{aka '_Float32'\\\} from '_Float128' with greater conversion rank" "" { target { float32 && float128 } } }
+#endif
+#endif
+#ifdef __STDCPP_FLOAT64_T__
+#ifdef __STDCPP_FLOAT128_T__
+float64_t f64g = 1.0F128;  // { dg-bogus "converting to 'std::float64_t' \\\{aka '_Float64'\\\} from '_Float128' with greater conversion rank" "" { target { float64 && float128 } } }
+#endif
+#endif
diff --git a/gcc/testsuite/g++.dg/cpp23/ext-floating17.C b/gcc/testsuite/g++.dg/cpp23/ext-floating17.C
new file mode 100644
index 00000000000..796e045537a
--- /dev/null
+++ 

Re: [PATCH] testsuite: Fix bad-mapper-1.C test failures with posix_spawn

2023-11-13 Thread Jeff Law




On 11/12/23 23:53, Richard Biener wrote:

On Sun, Nov 12, 2023 at 12:12 AM Brendan Shanks  wrote:


bad-mapper-1.C has been failing since the posix_spawn codepath was added
to libiberty, adjust the check to accept the changed error message.

Patch has been verified on x86_64 Linux.


OK

And pushed to the trunk.  I'd also checked it on ppc64le :-)

jeff


Re: [PATCH] rs6000: Disable PCREL for unsupported targets [PR111045]

2023-11-13 Thread jeevitha



On 11/11/23 5:21 am, Peter Bergner wrote:

> Jeevitha, can you test this patch to see whether it fixes the testsuite
> issue caused by your earlier patch that was approved, but not yet pushed?
> That was the "use GPR2 for register allocation" patch, correct?  Note, you'll need
> to update the patch to replace the rs6000_pcrel_p() usage with just
> TARGET_PCREL, since this patch removes rs6000_pcrel_p().

Yeah Peter. This patch fixes the testsuite issue for GPR2 register allocation
patch [PR110320].


Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

2023-11-13 Thread Jeff Law




On 11/13/23 07:47, Robin Dapp wrote:

As per Jeff's remark I'm going to push the attached.

Regards
  Robin

Subject: [PATCH v4] RISC-V: vsetvl: Refine REG_EQUAL equality.

This patch enhances the equality check for REG_EQUAL notes in the vsetvl
pass by using the == operator instead of rtx_equal_p.  With that, in
situations like the following, a5 and a7 are not considered equal
anymore.
One final note.  The register allocator tries to promote REG_EQUAL notes 
to REG_EQUIV notes when it's provably safe.  I don't think that code is 
terribly aggressive and I doubt it'd kick in for the forms shown below.




(insn 62 60 63 4 (set (reg:DI 17 a7 [orig:154 loop_len_54 ] [154])
 (umin:DI (reg:DI 15 a5 [orig:174 _100 ] [174])
 (reg:DI 30 t5 [219]))) 442 {umindi3}
  (expr_list:REG_EQUAL (umin:DI (reg:DI 15 a5 [orig:174 _100 ] [174])
 (const_int 8 [0x8]))
 (nil)))
(insn 63 62 65 4 (set (reg:DI 15 a5 [orig:175 _103 ] [175])
 (minus:DI (reg:DI 15 a5 [orig:174 _100 ] [174])
 (reg:DI 17 a7 [orig:154 loop_len_54 ] [154]))) 11 {subdi3}
  (nil))
(insn 65 63 66 4 (set (reg:DI 16 a6 [orig:153 loop_len_53 ] [153])
 (umin:DI (reg:DI 15 a5 [orig:175 _103 ] [175])
 (reg:DI 30 t5 [219]))) 442 {umindi3}
  (expr_list:REG_EQUAL (umin:DI (reg:DI 15 a5 [orig:175 _103 ] [175])
 (const_int 8 [0x8]))
 (nil)))
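
To make the difference between the two comparisons concrete, here is a toy sketch. It is not GCC's rtx representation; ToyExpr, TOY_UMIN and the helper names are invented for illustration. It only shows why two notes with the same shape compare equal structurally, yet are distinct objects under ==.

```cpp
// Toy stand-in for an RTL expression such as (umin:DI (reg a5) (const_int 8)).
// GCC's real rtx is far richer; every name here is hypothetical.
struct ToyExpr {
  int code;   // operation, e.g. TOY_UMIN
  int regno;  // hard register number of the first operand
  long imm;   // constant second operand
};

constexpr int TOY_UMIN = 1;

// Structural comparison, in the spirit of rtx_equal_p: same operation and
// same operands means "equal", regardless of where the note is attached.
constexpr bool toy_structurally_equal(const ToyExpr &a, const ToyExpr &b) {
  return a.code == b.code && a.regno == b.regno && a.imm == b.imm;
}

// Identity comparison, in the spirit of comparing two rtx pointers with ==.
constexpr bool toy_same_object(const ToyExpr *a, const ToyExpr *b) {
  return a == b;
}

// The REG_EQUAL notes of insn 62 and insn 65 above both read
// umin (a5, 8), but a5 (register 15) is redefined by insn 63 in between,
// so the two notes describe different values.
constexpr ToyExpr note_insn62{TOY_UMIN, 15, 8};
constexpr ToyExpr note_insn65{TOY_UMIN, 15, 8};
```

Under the structural check the two notes look identical, which would make a7 and a6 appear to hold the same value; under the identity check they do not.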



Jeff


Re: [PATCH] aarch64: New RTL optimization pass avoid-store-forwarding.

2023-11-13 Thread Manos Anagnostakis
Hi Richard,

thank you for reviewing the patch.

On Sat, Nov 11, 2023 at 6:57 PM Richard Sandiford 
wrote:

> Thanks for the patch.
>
> Manos Anagnostakis  writes:
> > This is an RTL pass that detects store forwarding from stores to larger
> loads (load pairs).
> >
> > This optimization is SPEC2017-driven and was found to be beneficial for
> some benchmarks,
> > through testing on ampere1/ampere1a machines.
> >
> > For example, it can transform cases like
> >
> > str  d5, [sp, #320]
> > fmul d5, d31, d29
> > ldp  d31, d17, [sp, #312] # Large load from small store
> >
> > to
> >
> > str  d5, [sp, #320]
> > fmul d5, d31, d29
> > ldr  d31, [sp, #312]
> > ldr  d17, [sp, #320]
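
The condition behind this example can be sketched as a simple range check. This is an illustrative model only, not the code in aarch64-store-forwarding.cc: MemAccess, ranges_overlap and is_split_candidate are hypothetical names, offsets are taken relative to the same base register, and the distance threshold controlled by the new param is not modelled.

```cpp
#include <cstdint>

// Hypothetical description of a memory access relative to one base register.
struct MemAccess {
  std::int64_t offset;  // byte offset from the base, e.g. 320 for [sp, #320]
  std::int64_t size;    // access width in bytes
};

// Two half-open byte ranges [offset, offset + size) overlap if each one
// starts before the other ends.
constexpr bool ranges_overlap(const MemAccess &a, const MemAccess &b) {
  return a.offset < b.offset + b.size && b.offset < a.offset + a.size;
}

// A load is a candidate for splitting when it is wider than a preceding
// store and reads at least one byte that the store wrote.
constexpr bool is_split_candidate(const MemAccess &store,
                                  const MemAccess &load) {
  return load.size > store.size && ranges_overlap(store, load);
}
```

For the sequence above, the 8-byte store at offset 320 and the 16-byte ldp at offset 312 satisfy the check, while each of the two 8-byte ldr instructions produced by the split does not.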
>
> For this particular case, it would be good to have a spill forwarding
> pass that inserts:
>
>   mov d17, d5
>
> before the fmul (possibly with further clean-up beyond that).
>
> But I realise the patch handles the general case where that isn't possible.
> The patch can also afford to be approximate.
>
> Some of the insn recognition code will probably need to be updated for
> Alex's pass that forms LDP/STP, since that's going to have writeback
> support.  We'll also need to arrange a pass ordering that works for both
> passes.
>
> The patch generally looks good, and it's nicely compact.  Some specific
> comments below:
>
> > Currently, the pass is disabled by default on all architectures and
> enabled by a target-specific option.
> >
> > If deemed beneficial enough for a default, it will be enabled on
> ampere1/ampere1a,
> > or other architectures as well, without needing to be turned on by this
> option.
> >
> > Bootstrapped and regtested on aarch64-linux.
> >
> > gcc/ChangeLog:
> >
> > * alias.cc (memrefs_conflict_p): Expose static function.
> > * alias.h (memrefs_conflict_p): Expose static function.
> > * config.gcc: Add aarch64-store-forwarding.o to extra_objs.
> > * config/aarch64/aarch64-passes.def (INSERT_PASS_AFTER): New
> pass.
> > * config/aarch64/aarch64-protos.h
> (make_pass_avoid_store_forwarding): Declare.
> > * config/aarch64/aarch64.opt (mavoid-store-forwarding): New
> option.
> >   (aarch64-store-forwarding-threshold): New param.
> > * config/aarch64/t-aarch64: Add aarch64-store-forwarding.o
> > * doc/invoke.texi: Document new option and new param.
> > * config/aarch64/aarch64-store-forwarding.cc: New file.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/aarch64/ldp_ssll_no_overlap_address.c: New test.
> > * gcc.target/aarch64/ldp_ssll_no_overlap_offset.c: New test.
> > * gcc.target/aarch64/ldp_ssll_overlap.c: New test.
> >
> > Signed-off-by: Manos Anagnostakis 
> > Co-Authored-By: Manolis Tsamis 
> > Co-Authored-By: Philipp Tomsich 
> > ---
> >  gcc/alias.cc  |   2 +-
> >  gcc/alias.h   |   1 +
> >  gcc/config.gcc|   1 +
> >  gcc/config/aarch64/aarch64-passes.def |   1 +
> >  gcc/config/aarch64/aarch64-protos.h   |   1 +
> >  .../aarch64/aarch64-store-forwarding.cc   | 347 ++
> >  gcc/config/aarch64/aarch64.opt|   9 +
> >  gcc/config/aarch64/t-aarch64  |  10 +
> >  gcc/doc/invoke.texi   |  12 +-
> >  .../aarch64/ldp_ssll_no_overlap_address.c |  33 ++
> >  .../aarch64/ldp_ssll_no_overlap_offset.c  |  33 ++
> >  .../gcc.target/aarch64/ldp_ssll_overlap.c |  33 ++
> >  12 files changed, 481 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/config/aarch64/aarch64-store-forwarding.cc
> >  create mode 100644
> gcc/testsuite/gcc.target/aarch64/ldp_ssll_no_overlap_address.c
> >  create mode 100644
> gcc/testsuite/gcc.target/aarch64/ldp_ssll_no_overlap_offset.c
> >  create mode 100644 gcc/testsuite/gcc.target/aarch64/ldp_ssll_overlap.c
> >
> > diff --git a/gcc/alias.cc b/gcc/alias.cc
> > index 86d8f7104ad..303683f85e3 100644
> > --- a/gcc/alias.cc
> > +++ b/gcc/alias.cc
> > @@ -2488,7 +2488,7 @@ offset_overlap_p (poly_int64 c, poly_int64 xsize,
> poly_int64 ysize)
> > one for X + non-constant and Y + non-constant when X and Y are equal.
> > If that is fixed the TBAA hack for union type-punning can be
> removed.  */
> >
> > -static int
> > +int
> >  memrefs_conflict_p (poly_int64 xsize, rtx x, poly_int64 ysize, rtx y,
> >   poly_int64 c)
> >  {
> > diff --git a/gcc/alias.h b/gcc/alias.h
> > index ab06ac9055f..49836f7d808 100644
> > --- a/gcc/alias.h
> > +++ b/gcc/alias.h
> > @@ -41,6 +41,7 @@ bool alias_ptr_types_compatible_p (tree, tree);
> >  int compare_base_decls (tree, tree);
> >  bool refs_same_for_tbaa_p (tree, tree);
> >  bool mems_same_for_tbaa_p (rtx, rtx);
> > +int memrefs_conflict_p (poly_int64, rtx, poly_int64, rtx, poly_int64);
> >
> >  /* This alias set can be used to force a memory to conflict with all
> > other memories, creating a barrier across which no memory reference
> > 

Re: [PATCH V3 0/7] ira/lra: Support subreg coalesce

2023-11-13 Thread Dimitar Dimitrov
On Sun, Nov 12, 2023 at 08:08:10PM +0800, Lehua Ding wrote:
> V3 Changes:
>   1. fix three ICE.
>   2. rebase
> 
> Hi,
> 
> These patches try to support the subreg coalesce feature in
> register allocation passes (ira and lra).
> 

Hi Lehua,

V3 indeed fixes the arm-none-eabi build. It's also confirmed by Linaro CI:
  
https://patchwork.sourceware.org/project/gcc/patch/20231112120817.2635864-8-lehua.d...@rivai.ai/

But the avr and pru backends are still broken, albeit with different crash
signatures. Both targets are peculiar because they have
UNITS_PER_WORD=1. I'll try building some 16-bit target like msp430.

AVR fails when building libgcc:
/mnt/nvme/dinux/local-workspace/gcc/libgcc/config/avr/lib2funcs.c: In function 
'__roundlr':
/mnt/nvme/dinux/local-workspace/gcc/libgcc/config/avr/lib2funcs.c:115:3: 
internal compiler error: in check_allocation, at ira.cc:2673
  115 |   }
  |   ^
/mnt/nvme/dinux/local-workspace/gcc/libgcc/config/avr/lib2funcs.c:106:3: note: 
in expansion of macro 'ROUND2'
  106 |   ROUND2 (FX)
  |   ^~
/mnt/nvme/dinux/local-workspace/gcc/libgcc/config/avr/lib2funcs.c:117:1: note: 
in expansion of macro 'ROUND1'
  117 | ROUND1(L_LABEL)
  | ^~
0xc80b8d check_allocation
/mnt/nvme/dinux/local-workspace/gcc/gcc/ira.cc:2673
0xc89451 ira
/mnt/nvme/dinux/local-workspace/gcc/gcc/ira.cc:5873
0xc89451 execute
/mnt/nvme/dinux/local-workspace/gcc/gcc/ira.cc:6104

Script I'm using to build avr: 
https://github.com/dinuxbg/gnupru/blob/master/testing/manual-build-avr.sh



PRU fails building newlib:
/mnt/nvme/dinux/local-workspace/newlib/newlib/libc/stdlib/gdtoa-gdtoa.c:835:9: internal compiler error: in lra_create_live_ranges, at lra-lives.cc:1933
  835 | }
  | ^
0x6b951c lra_create_live_ranges(bool, bool)
/mnt/nvme/dinux/local-workspace/gcc/gcc/lra-lives.cc:1933
0xd9320c lra(_IO_FILE*)
/mnt/nvme/dinux/local-workspace/gcc/gcc/lra.cc:2638
0xd3e519 do_reload
/mnt/nvme/dinux/local-workspace/gcc/gcc/ira.cc:5960
0xd3e519 execute
/mnt/nvme/dinux/local-workspace/gcc/gcc/ira.cc:6148

Script I'm using to build pru: 
https://github.com/dinuxbg/gnupru/blob/master/testing/manual-build-pru.sh

Regards,
Dimitar


Re: [RFC PATCH] Detecting lifetime-dse issues via Valgrind

2023-11-13 Thread Richard Biener



> On 13.11.2023 at 15:52, Alexander Monakov wrote:
> 
> 
>> On Mon, 13 Nov 2023, Richard Biener wrote:
>> 
>> Another generic comment - placing a built-in call probably pessimizes code
>> generation unless we handle it specially during alias analysis (or in
>> builtin_fnspec).
> 
> But considering the resulting code is intended to be run under Valgrind,
> isn't a bit worse quality acceptable? Note that we don't want loads
> following the built-in to be optimized out, they are necessary as they
> will be flagged by Valgrind as attempts to read uninitialized memory.
> 
> I suspect positioning the pass immediately after build_ssa as we do now
> is quite imperfect because we will then instrument 'x' in 
> 
>  void f()
>  {
>int x, *p;
>p = &x;
>  }
> 
> Ideally we'd position it such that more locals are put in SSA form,
> but not too late to miss some UB, right? Perhaps after first pass_ccp?

I guess it’s worth experimenting.  Even doing it right before RTL expansion 
might work.  Note that if you pick ccp, you have to use a separate place for -O0.

>> I also don't like having another pass for this - did you
>> investigate to do the instrumentation at the point the CLOBBERs are
>> introduced?
> 
> I don't see a better approach, some CLOBBERs are emitted by the C++
> front-end via build_clobber_this, some by omp-expand, some during
> gimplification. I'm not a fan of useless IR rescans either, but
> this pass is supposed to run very rarely, not by default.
> 
>> Another possibility would be to make this more generic
>> and emit the instrumentation when we lower GIMPLE_BIND during
>> the GIMPLE lowering pass, you wouldn't then rely on the CLOBBERs
>> some of which only appear when -fstack-reuse=none is not used.
> 
> The CLOBBERs that trigger on Firefox and LLVM are emitted not during
> gimplification, but via build_clobber_this in the front-end.
> 
> Alexander
> 


Re: gfortran.dg/dg.exp debug messages pollute test output

2023-11-13 Thread Iain Sandoe
Hi Rainer,

> On 13 Nov 2023, at 05:27, Rainer Orth  wrote:
> 
> I happened to notice that make check output is cluttered with the likes
> of
> 
> dg set al  
> -B/var/gcc/regression/master/11.4-gcc/build/i386-pc-solaris2.11/./libatomic/  
> -B/var/gcc/regression/master/11.4-gcc/build/i386-pc-solaris2.11/./libatomic/.libs
>  ml /var/gcc/regression/master/11.4-gcc/build/i386-pc-solaris2.11/.
> revised FFLAGS -pedantic-errors 
> -B/var/gcc/regression/master/11.4-gcc/build/i386-pc-solaris2.11/./libatomic/ 
> -B/var/gcc/regression/master/11.4-gcc/build/i386-pc-solaris2.11/./libatomic/.libs
> 
> (quite a number of instances).  Those messages are from
> gfortran.dg/dg.exp, introduced by your patch
> 
> commit a0673ec5f9236dca6ada23f28343c591ccd575e4
> Author: Iain Sandoe 
> Date:   Fri Apr 16 20:01:40 2021 +0100
> 
>Testsuite: allow non-installed testing on darwin
> 
> I couldn't find any submission for that patch, unfortunately.  

FX submitted the patch series, I can find the reference if you need it.

> Besides,
> it's unclear if those messages can just be removed (they are pretty
> cryptic as is) or at least changed to use verbose instead of puts.
> Please fix.

Will do.
Thanks,
Iain

> 
>   Rainer
> 
> -- 
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University



Re: [PATCH] Fix PR ada/111909 On Darwin, determine filesystem case sensitivity at runtime

2023-11-13 Thread Arnaud Charlet
> >>> OK, I thought there would be some defines that we could use for that, too 
> >>> bad if there isn't
> >>> and indeed we might need to perform another runtime check then as 
> >>> suggested by Iain.
> >> 
> >> I can see a possible interface, operatingSystemVersion in NSProcessInfo.h 
> >> - Objective C
> >> needed, I think
> > 
> > Some of the NS interfaces are available to regular C (e.g. stuff in 
> > CoreFoundation), and I am
> > fairly/very sure that we will be able to find a mechanism that does not 
> > involve introducing an
> > ObjC dep.  [I am obviously not in any way against ObjC - since I’m the 
> > maintainer ;) .. but it
> > seems heavyweight for solving this issue].
> 
> It certainly would be heavyweight, since TargetConditionals.h includes 
> TARGET_OS_OSX, 
> which is 1 if we’re compiling for macOS and 0 otherwise (there’s a useful 
> chart at :83 in the 
> MacOSX13.1 SDK).
> 
> Two ways ahead here:
> (1) just replace the current __arm__, __arm64__ test with this

That would be fine here (replace refs to *arm* by TARGET_OS_OSX), since this 
was my original
suggestion (copied at the top of this email).

> (2) as 1, but implement the runtime test for case sensitivity only for macOS
> 
> Whether (2) is acceptable depends, I suppose, on what issues Iain encountered 
> on Darwin 9 
> & Darwin 17. I’ll be content to go with (1).


Re: [PATCH 2/3] Add generated .opt.urls files

2023-11-13 Thread Marc Poulhiès


David Malcolm  writes:
>
> The new regenerate-opt-urls.py script only parsed
>   buildir/gcc/HTML/gcc-14.0.0/gcc/Option-Index.html
> looking for anchors for options via a regex.
>
> Looking at my build, I don't see any generated Ada HTML docs, so maybe
> I messed this up?  Does the generated HTML from the generated Ada
> texinfo go somewhere else?  (and, in particular, does it have its own
> index?)
>
> Perhaps this script could also deal directly with Sphinx-generated
> HTML?

I investigated a bit... The Ada part doesn't handle the html target, so
it's expected you don't have anything to parse. The online docs are
generated using a different script, not using these makefiles.

I'll see if I can fix the html target for ada, so that your script
doesn't need to be changed :)

>>  We are generating the texinfo files
>> from sphinx, so maybe we could adjust the script to also match what
>> the
>> sphinx generator produces?
>
> It *might* be as simple as pointing it at the option index for the
> generated HTML for Ada.

That's worth a try, when/if I can fix the HTML target.

Marc


Re: [PATCH] Fix PR ada/111909 On Darwin, determine filesystem case sensitivity at runtime

2023-11-13 Thread Simon Wright
On 11 Nov 2023, at 18:10, Iain Sandoe  wrote:
> 
>> On 11 Nov 2023, at 07:47, Simon Wright  wrote:
>> 
>> On 6 Nov 2023, at 08:36, Arnaud Charlet  wrote:
>>> 
> So without changing fundamentally the model, you can't decide dynamically 
> for the whole
> system. Making the choice based on the current directory is pretty 
> random, since the current
> directory isn't well defined at program's start up and could be pretty 
> much any filesystem.
 
 I’d imagine that projects spread over more than one 
 differently-case-sensitive filesystem would
 be rare. As to the current directory at compiler startup, with GPRbuild 
 it’s the object directory, so
 likely to be somewhere near the project’s source tree.
>>> 
>>> I am not talking about the current directory when the compiler runs, I am 
>>> talking about the
>>> current directory where the target program runs, which can be pretty much 
>>> anywhere.
>>> 
>>> In other words, you are modifying a runtime file (adaint.c) which is used 
>>> both by the host compiler
>>> and by the target applications. My comment worries about the target 
>>> applications while yours
>>> applies to the host compiler only.
>> 
>> I don’t understand?
>> 
>> The change works out whether the filesystem of the current working directory 
>> is CS, whether
>> it’s the compiler or some user program that’s running it (it looks like that 
>> would have to be via 
>> some higher-level compiler package, I found only GNAT.Command_Line and 
>> GNAT.Directory_Operations).
>> 
>> I can see that might not be what the user program wants, but if they 
>> actually care the current
>> situation isn’t great anyway; the compiler definitely makes the wrong choice 
>> for new Macs.
>> 
> Note that the current setting on arm is actually for iOS, which we did 
> support at AdaCore
> at some point (and could revive in the future, who knows).
 
 Wouldn’t it be more natural to go via LLVM? I understand from Iain that 
 iOS isn’t currently
 supported by GCC.
>>> 
>>> That's another option. We'd like to keep both options on the table, since 
>>> both options have
>>> pros and cons.
>>> 
> So it would be fine to refine the test to differentiate between macOS and 
> embedded iOS and co,
> that would be a better change here.
 
 There didn’t seem to be a way to do that.
>>> 
>>> OK, I thought there would be some defines that we could use for that, too 
>>> bad if there isn't
>>> and indeed we might need to perform another runtime check then as suggested 
>>> by Iain.
>> 
>> I can see a possible interface, operatingSystemVersion in NSProcessInfo.h - 
>> Objective C
>> needed, I think
> 
> Some of the NS interfaces are available to regular C (e.g. stuff in 
> CoreFoundation), and I am
> fairly/very sure that we will be able to find a mechanism that does not 
> involve introducing an
> ObjC dep.  [I am obviously not in any way against ObjC - since I’m the 
> maintainer ;) .. but it
> seems heavyweight for solving this issue].

It certainly would be heavyweight, since TargetConditionals.h includes 
TARGET_OS_OSX, 
which is 1 if we’re compiling for macOS and 0 otherwise (there’s a useful chart 
at :83 in the 
MacOSX13.1 SDK).

Two ways ahead here:
(1) just replace the current __arm__, __arm64__ test with this
(2) as 1, but implement the runtime test for case sensitivity only for macOS

Whether (2) is acceptable depends, I suppose, on what issues Iain encountered 
on Darwin 9 
& Darwin 17. I’ll be content to go with (1).
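
For readers following the thread, option (2) amounts to probing a directory at run time. Here is a minimal Python sketch of such a probe, under stated assumptions: it is not the code from the patch to adaint.c, the function name `directory_is_case_sensitive` is hypothetical, and the create-a-file-and-look-it-up technique is just one portable way to ask the question.

```python
import os
import tempfile


def directory_is_case_sensitive(path):
    """Probe whether the filesystem holding `path` distinguishes case.

    Hypothetical helper for illustration: it creates a temporary file with a
    mixed-case name and checks whether the same name with the case of every
    letter swapped resolves to an existing file.
    """
    with tempfile.NamedTemporaryFile(prefix="CaseProbe", dir=path) as probe:
        directory, name = os.path.split(probe.name)
        swapped = os.path.join(directory, name.swapcase())
        # On a case-insensitive filesystem the swapped name finds the probe.
        return not os.path.exists(swapped)


if __name__ == "__main__":
    # The answer depends on whichever directory is probed, which is the
    # concern raised earlier in the thread about using the current directory.
    print(directory_is_case_sensitive(os.getcwd()))
```

The result is only valid for the directory that was probed, so a program whose files span several filesystems would need one probe per location.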

Re: [PATCH v4] gcc: Introduce -fhardened

2023-11-13 Thread Marek Polacek
Ping.

On Fri, Nov 03, 2023 at 06:51:16PM -0400, Marek Polacek wrote:
> On Thu, Oct 26, 2023 at 05:55:56PM +0200, Richard Biener wrote:
> > 
> > 
> > > On 24.10.2023 at 21:09, Marek Polacek wrote:
> > > 
> > > On Tue, Oct 24, 2023 at 09:22:25AM +0200, Richard Biener wrote:
> > >>> On Mon, Oct 23, 2023 at 9:26 PM Marek Polacek  
> > >>> wrote:
> > >>> 
> > >>> On Thu, Oct 19, 2023 at 02:24:11PM +0200, Richard Biener wrote:
> >  Can you see how our
> >  primary and secondary targets (+ host OS) behave here?
> > >>> 
> > >>> That's very reasonable.  I tried to build gcc on Compile Farm 119 (AIX) 
> > >>> but
> > >>> that fails with:
> > >>> 
> > >>> ar  -X64 x ../ppc64/libgcc/libgcc_s.a shr.o
> > >>> ar: 0707-100 ../ppc64/libgcc/libgcc_s.a does not exist.
> > >>> make[2]: *** [/home/polacek/gcc/libgcc/config/rs6000/t-slibgcc-aix:98: 
> > >>> all] Error 1
> > >>> make[2]: Leaving directory 
> > >>> '/home/polacek/x/trunk/powerpc-ibm-aix7.3.1.0/libgcc'
> > >>> 
> > >>> and I tried Darwin (104) and that fails with
> > >>> 
> > >>> *** Configuration aarch64-apple-darwin21.6.0 not supported
> > >>> 
> > >>> Is anyone else able to build gcc on those machines, or test the attached
> > >>> patch?
> > >>> 
> >  I think the
> >  documentation should elaborate a bit on expectations for non-Linux/GNU
> >  targets, specifically I think the default configuration for a target 
> >  should
> >  with -fhardened _not_ have any -Whardened diagnostics.  Maybe we can
> >  have a testcase for this?
> > >>> 
> > >>> Sorry, I'm not sure how to test that.  I suppose if -fhardened enables
> > >>> something not supported on those systems, and it's something for which
> > >>> we have a configure test, then we shouldn't warn.  This is already the
> > >>> case for -pie, -z relro, and -z now.
> > >> 
> > >> I was thinking of
> > >> 
> > >> /* { dg-do compile } */
> > >> /* { dg-additional-options "-fhardened -Whardened" } */
> > >> 
> > >> int main () {}
> > >> 
> > >> and excess errors should catch "misconfigurations"?
> > > 
> > > I see.  fhardened-3.c is basically just like this (-Whardened is on by 
> > > default).
> > > 
> > >>> Should the docs say something like the following for features without
> > >>> configure checks?
> > >>> 
> > >>> @option{-fhardened} can, on certain systems, attempt to enable features
> > >>> not supported on that particular system.  In that case, it's possible to
> > >>> prevent the warning using the @option{-Wno-hardened} option.
> > >> 
> > >> Yeah, but ideally
> > >> 
> > >> @option{-fhardened} can, on certain systems, not enable features not
> > >> available on those systems and @option{-Whardened} will not diagnose
> > >> those as missing.
> > >> 
> > >> But I understand it doesn't work like that?
> > > 
> > > Right.  It will not diagnose missing features if they have a configure
> > > check, otherwise it will.  And I don't know if we want a configure check
> > > for every feature.  Maybe we can add them in the future if the current
> > > patch turns out to be problematical in practice?
> > 
> > Maybe we can have a switch on known target triples and statically configure 
> > based
> > On that, eventually even not support -fhardened for targets not listed.  
> > That’s certainly easier than detecting the target system features (think of 
> > cross compilers)
> 
> You mean like the following?  The only difference is the addition of
> HAVE_FHARDENED_SUPPORT and updating the tests to only run on gnu/linux
> targets.  If other OSs want to use -fhardened, they need to update the
> configure test.  Thanks,
> 
> Bootstrapped/regtested on x86_64-pc-linux-gnu and
> powerpc64le-unknown-linux-gnu.
> 
> -- >8 --
> In 
> I proposed -fhardened, a new umbrella option that enables a reasonable set
> of hardening flags.  The read of the room seems to be that the option
> would be useful.  So here's a patch implementing that option.
> 
> Currently, -fhardened enables:
> 
>   -D_FORTIFY_SOURCE=3 (or =2 for older glibcs)
>   -D_GLIBCXX_ASSERTIONS
>   -ftrivial-auto-var-init=zero
>   -fPIE  -pie  -Wl,-z,relro,-z,now
>   -fstack-protector-strong
>   -fstack-clash-protection
>   -fcf-protection=full (x86 GNU/Linux only)
> 
> -fhardened will not override options that were specified on the command line
> (before or after -fhardened).  For example,
> 
>  -D_FORTIFY_SOURCE=1 -fhardened
> 
> means that _FORTIFY_SOURCE=1 will be used.  Similarly,
> 
>   -fhardened -fstack-protector
> 
> will not enable -fstack-protector-strong.
> 
> Currently, -fhardened is only supported on GNU/Linux.
> 
> In DW_AT_producer it is reflected only as -fhardened; it doesn't expand
> to anything.  This patch provides -Whardened, enabled by default, which
> warns when -fhardened couldn't enable a particular option.  I think most
> often it will say that _FORTIFY_SOURCE wasn't enabled because optimizations
> were not enabled.
> 
> 
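
The precedence rule in the description above (explicit options win, whether they appear before or after -fhardened) can be sketched as a toy Python model. This is an illustration only and an assumption about how to picture the rule, not how the GCC driver is implemented; the dictionary keys and the name `effective_settings` are hypothetical.

```python
# Toy model only; GCC's driver does not literally work like this.
# Keys and values are illustrative labels, not real GCC data structures.
HARDENED_DEFAULTS = {
    "_FORTIFY_SOURCE": "3",
    "stack-protector": "strong",
    "trivial-auto-var-init": "zero",
}


def effective_settings(explicit):
    """Overlay explicit command-line choices on the -fhardened defaults.

    `explicit` maps a setting name to the value the user asked for.
    Position on the command line is deliberately not modelled, because the
    description says explicit options win before or after -fhardened.
    """
    settings = dict(HARDENED_DEFAULTS)
    settings.update(explicit)
    return settings


# Models: -D_FORTIFY_SOURCE=1 -fhardened
print(effective_settings({"_FORTIFY_SOURCE": "1"}))
# Models: -fhardened -fstack-protector
print(effective_settings({"stack-protector": "default"}))
```

In both modelled command lines the user's value survives, and the settings the user did not mention keep their hardened defaults.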

Re: [PATCH] LoongArch: Remove redundant barrier instructions before LL-SC loops

2023-11-13 Thread Xi Ruoyao
On Wed, 2023-11-08 at 16:27 +0800, Xi Ruoyao wrote:
> On Wed, 2023-11-08 at 09:49 +0800, chenglulu wrote:
> > 
> > On 2023/11/6 at 7:36 PM, Xi Ruoyao wrote:
> > > This is isomorphic to the LLVM changes [1-2].
> > > 
> > > On LoongArch, the LL and SC instructions have memory barrier semantics:
> > > 
> > > - LL:  + 
> > > - SC:  + 
> > > 
> > > But the compare and swap operation is allowed to fail, and if it fails
> > > the SC instruction is not executed, thus the guarantee of acquiring
> > > semantics cannot be ensured. Therefore, an acquire barrier needs to be
> > > generated when failure_memorder includes an acquire operation.
> > > 
> > > On CPUs implementing LoongArch v1.10 or later, "dbar 0b10100" is an
> > > acquire barrier; on CPUs implementing LoongArch v1.00, it is a full
> > > barrier.  So it's always enough for acquire semantics.  OTOH if an
> > > acquire semantic is not needed, we still need the "dbar 0x700" as the
> > > load-load barrier like all LL-SC loops.
> > 
> > I don't think there's a problem with the logic. I'm also working on 
> > correcting the content of the atomic functions now, and I'm doing a 
> > correctness test, including this modification, and I'll email you back
> > after the correctness test is completed.
> 
> Ok.  I'd like to note that we now have only 10 days before GCC 14 stage
> 1 ends, so we'd better hurry.

Update: I've bootstrapped and regtested it on a LA664 and there is no
regression.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [committed] libstdc++: Reformat Python code

2023-11-13 Thread Romain GEISSLER
> On 13 Nov 2023, at 16:24, Jonathan Wakely wrote:
> 
> Yes, I'll do that backport (and most of the other Python improvements
> too, at least for gcc-13).
> 
> Thanks for raising it.
> 

Cool thanks ! ;)

In the meantime, in my own toolchains I have silenced the warnings (without
fixing them) with this simple patch, in case anyone else wants to get rid of
them quickly. It's not really a good long-term workaround, though.

--- libstdc++-v3/python/libstdcxx/v6/__init__.py
+++ libstdc++-v3/python/libstdcxx/v6/__init__.py
@@ -1 +1,2 @@
-
+import warnings
+warnings.filterwarnings("ignore", category=SyntaxWarning)

gfortran.dg/dg.exp debug messages pollute test output

2023-11-13 Thread Rainer Orth
Hi Iain,

I happened to notice that make check output is cluttered with the likes
of

dg set al  
-B/var/gcc/regression/master/11.4-gcc/build/i386-pc-solaris2.11/./libatomic/  
-B/var/gcc/regression/master/11.4-gcc/build/i386-pc-solaris2.11/./libatomic/.libs
 ml /var/gcc/regression/master/11.4-gcc/build/i386-pc-solaris2.11/.
revised FFLAGS -pedantic-errors 
-B/var/gcc/regression/master/11.4-gcc/build/i386-pc-solaris2.11/./libatomic/ 
-B/var/gcc/regression/master/11.4-gcc/build/i386-pc-solaris2.11/./libatomic/.libs

(quite a number of instances).  Those messages are from
gfortran.dg/dg.exp, introduced by your patch

commit a0673ec5f9236dca6ada23f28343c591ccd575e4
Author: Iain Sandoe 
Date:   Fri Apr 16 20:01:40 2021 +0100

Testsuite: allow non-installed testing on darwin

I couldn't find any submission for that patch, unfortunately.  Besides,
it's unclear if those messages can just be removed (they are pretty
cryptic as is) or at least changed to use verbose instead of puts.

Please fix.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [committed] libstdc++: Reformat Python code

2023-11-13 Thread Jonathan Wakely
On Mon, 13 Nov 2023 at 14:41, Romain GEISSLER
 wrote:
>
> > On 28 Sep 2023, at 22:21, Jonathan Wakely wrote:
> >
> > Tested x86_64-linux (GDB 13.2, Python 3.11). Pushed to trunk.
> >
> > -- >8 --
> >
> > Some of these changes were suggested by autopep8's --aggressive
> > option, others are for readability.
> >
> > Break long lines by splitting strings across multiple lines, or
> > introducing local variables to hold results.
> >
> > Use raw strings for regular expressions, so that backslashes don't need
> > to be escaped.
>
> Hi Jonathan,
>
> FYI, with Python 3.12, the "Use raw strings for regular expressions" part
> seems to fix the following new Python warnings:
>
> /opt/1A/toolchain/x86_64-v23.0.19/lib/../share/gcc-13.2.1/python/libstdcxx/v6/printers.py:1273:
>  SyntaxWarning: invalid escape sequence '\d'
>   self.typename = re.sub('^std::experimental::fundamentals_v\d::', 
> 'std::experimental::', self.typename, 1)
> /opt/1A/toolchain/x86_64-v23.0.19/lib/../share/gcc-13.2.1/python/libstdcxx/v6/printers.py:1302:
>  SyntaxWarning: invalid escape sequence '\w'
>   x = re.sub("std::string(?!\w)", s, m.group(1))
>  … (snipped, there are a few more than that in total).
>
> Would it be OK to backport the "raw string" fix to the branches that are
> still maintained, in order to avoid deprecation warnings as soon as people
> use gdb with Python >= 3.12?

Yes, I'll do that backport (and most of the other Python improvements
too, at least for gcc-13).

Thanks for raising it.



[PING ^1][PATCH] rs6000: Fix issue in specifying PTImode as an attribute [PR106895]

2023-11-13 Thread jeevitha
Ping!

Please review.

Thanks & Regards
Jeevitha

On 25/08/23 7:49 am, Peter Bergner wrote:
> On 8/24/23 12:35 PM, Michael Meissner wrote:
>> On Thu, Jul 20, 2023 at 10:05:28AM +0530, jeevitha wrote:
>>> gcc/
>>> PR target/110411
>>> * config/rs6000/rs6000.h (enum rs6000_builtin_type_index): Add fields
>>> to hold PTImode type.
>>> * config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Add node
>>> for PTImode type.
>>
>> It is good as far as it goes, but I suspect we will eventually need to extend
>> it.  In particular, the reason people need PTImode is they need the even/odd
>> register layout.  What you've done enables users to declare this value.
> 
> Sure, it could be extended, but that is not what this patch is about.
> It's purely to allow the kernel team access to the guaranteed even/odd
> register layout for some inline asm code.  Any extension would be a
> follow-on patch to this.
> 
> 
> 
> On 8/9/23 3:48 AM, Kewen.Lin wrote:
>> IIUC, registering this builtin type exposes it to users, so
>> I wonder if we want to actually expose this type for users' uses.
>> If yes, we need to update the documentation (and not sure if the current
>> name is good enough); otherwise, I wonder if there is some existing
>> practice to declare a builtin type with a name which users can't actually
>> use and is just for shadowing a mode.
> 
> Segher, Mike, Jeevitha and I talked about the patch and Segher mentioned
> that under some conditions, it's fine to keep the type undocumented.
> Hopefully he'll weigh in on whether this particular patch is one of
> those cases or not.  
> 
> 
> Peter


Re: [PATCH] RISC-V: vsetvl: Refine REG_EQUAL equality.

2023-11-13 Thread Robin Dapp
As per Jeff's remark I'm going to push the attached.

Regards
 Robin

Subject: [PATCH v4] RISC-V: vsetvl: Refine REG_EQUAL equality.

This patch enhances the equality check for REG_EQUAL notes in the vsetvl
pass by using the == operator instead of rtx_equal_p.  With that, in
situations like the following, a5 and a7 are not considered equal
anymore.

(insn 62 60 63 4 (set (reg:DI 17 a7 [orig:154 loop_len_54 ] [154])
(umin:DI (reg:DI 15 a5 [orig:174 _100 ] [174])
(reg:DI 30 t5 [219]))) 442 {umindi3}
 (expr_list:REG_EQUAL (umin:DI (reg:DI 15 a5 [orig:174 _100 ] [174])
(const_int 8 [0x8]))
(nil)))
(insn 63 62 65 4 (set (reg:DI 15 a5 [orig:175 _103 ] [175])
(minus:DI (reg:DI 15 a5 [orig:174 _100 ] [174])
(reg:DI 17 a7 [orig:154 loop_len_54 ] [154]))) 11 {subdi3}
 (nil))
(insn 65 63 66 4 (set (reg:DI 16 a6 [orig:153 loop_len_53 ] [153])
(umin:DI (reg:DI 15 a5 [orig:175 _103 ] [175])
(reg:DI 30 t5 [219]))) 442 {umindi3}
 (expr_list:REG_EQUAL (umin:DI (reg:DI 15 a5 [orig:175 _103 ] [175])
(const_int 8 [0x8]))
(nil)))

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (source_equal_p): Use pointer
equality for REG_EQUAL.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb.c: New test.
---
 gcc/config/riscv/riscv-vsetvl.cc  |  6 -
 .../rvv/autovec/partial/multiple_rgroup_zbb.c | 23 +++
 2 files changed, 28 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb.c

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 3fa25a6404d..8466b5d019e 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -560,7 +560,11 @@ source_equal_p (insn_info *insn1, insn_info *insn2)
 
   rtx note1 = find_reg_equal_equiv_note (rinsn1);
   rtx note2 = find_reg_equal_equiv_note (rinsn2);
-  if (note1 && note2 && rtx_equal_p (note1, note2))
+  /* We could handle the case of similar-looking REG_EQUALs as well but
+ would need to verify that no insn in between modifies any of the source
+ operands.  */
+  if (note1 && note2 && rtx_equal_p (note1, note2)
+  && REG_NOTE_KIND (note1) == REG_EQUIV)
 return true;
   return false;
 }
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb.c
new file mode 100644
index 000..15178a2c848
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/multiple_rgroup_zbb.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zbb -mabi=lp64d -O2 --param riscv-autovec-preference=fixed-vlmax -fno-schedule-insns -fno-schedule-insns2" } */
+
+#include 
+
+void __attribute__ ((noipa))
+test (uint16_t *__restrict f, uint32_t *__restrict d, uint64_t *__restrict e,
+  uint16_t x, uint16_t x2, uint16_t x3, uint16_t x4, uint32_t y,
+  uint32_t y2, uint64_t z, int n)
+{
+  for (int i = 0; i < n; ++i)
+{
+  f[i * 4 + 0] = x;
+  f[i * 4 + 1] = x2;
+  f[i * 4 + 2] = x3;
+  f[i * 4 + 3] = x4;
+  d[i * 2 + 0] = y;
+  d[i * 2 + 1] = y2;
+  e[i] = z;
+}
+}
+
+/* { dg-final { scan-assembler-times "vsetvli\tzero,\s*\[a-z0-9\]+,\s*e16,\s*m1,\s*ta,\s*ma" 4 } } */
-- 
2.41.0



Re: [PATCH v2 3/3] RISC-V: Add support for XCVbi extension in CV32E40P

2023-11-13 Thread Kito Cheng
I would prefer you suppress those patterns in riscv.cc rather than
adjust the including order, T-Head extension goes that way too, so I
would prefer to keep it using a consistent way.

You can search for !TARGET_XTHEAD in riscv.md.


On Mon, Nov 13, 2023 at 9:36 PM Mary Bennett  wrote:
>
> Spec: 
> github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md
>
> Contributors:
>   Mary Bennett 
>   Nandni Jamnadas 
>   Pietra Ferreira 
>   Charlie Keaney
>   Jessica Mills
>   Craig Blackmore 
>   Simon Cook 
>   Jeremy Bennett 
>   Helene Chelin 
>
> gcc/ChangeLog:
> * common/config/riscv/riscv-common.cc: Create XCVbi extension
>   support.
> * config/riscv/riscv.opt: Likewise.
> * config/riscv/corev.md: Implement cv_branch pattern
>   for cv.beqimm and cv.bneimm.
> * config/riscv/riscv.md: Change pattern priority so corev.md
>   patterns run before riscv.md patterns.
> * config/riscv/constraints.md: Implement constraints
>   cv_bi_s5 - signed 5-bit immediate.
> * config/riscv/predicates.md: Implement predicate
>   const_int5s_operand - signed 5 bit immediate.
> * doc/sourcebuild.texi: Add XCVbi documentation.
>
> gcc/testsuite/ChangeLog:
> * gcc.target/riscv/cv-bi-beqimm-compile-1.c: New test.
> * gcc.target/riscv/cv-bi-beqimm-compile-2.c: New test.
> * gcc.target/riscv/cv-bi-bneimm-compile-1.c: New test.
> * gcc.target/riscv/cv-bi-bneimm-compile-2.c: New test.
> * lib/target-supports.exp: Add proc for XCVbi.
> ---
>  gcc/common/config/riscv/riscv-common.cc   |  2 +
>  gcc/config/riscv/constraints.md   |  6 +++
>  gcc/config/riscv/corev.md | 14 ++
>  gcc/config/riscv/predicates.md|  4 ++
>  gcc/config/riscv/riscv.md | 11 -
>  gcc/config/riscv/riscv.opt|  2 +
>  gcc/doc/sourcebuild.texi  |  3 ++
>  .../gcc.target/riscv/cv-bi-beqimm-compile-1.c | 17 +++
>  .../gcc.target/riscv/cv-bi-beqimm-compile-2.c | 48 +++
>  .../gcc.target/riscv/cv-bi-bneimm-compile-1.c | 17 +++
>  .../gcc.target/riscv/cv-bi-bneimm-compile-2.c | 48 +++
>  gcc/testsuite/lib/target-supports.exp | 13 +
>  12 files changed, 184 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bi-beqimm-compile-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bi-beqimm-compile-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bi-bneimm-compile-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/cv-bi-bneimm-compile-2.c
>
> diff --git a/gcc/common/config/riscv/riscv-common.cc 
> b/gcc/common/config/riscv/riscv-common.cc
> index 6a1978bd0e4..04631e007f0 100644
> --- a/gcc/common/config/riscv/riscv-common.cc
> +++ b/gcc/common/config/riscv/riscv-common.cc
> @@ -313,6 +313,7 @@ static const struct riscv_ext_version 
> riscv_ext_version_table[] =
>{"xcvmac", ISA_SPEC_CLASS_NONE, 1, 0},
>{"xcvalu", ISA_SPEC_CLASS_NONE, 1, 0},
>{"xcvelw", ISA_SPEC_CLASS_NONE, 1, 0},
> +  {"xcvbi", ISA_SPEC_CLASS_NONE, 1, 0},
>
>{"xtheadba", ISA_SPEC_CLASS_NONE, 1, 0},
>{"xtheadbb", ISA_SPEC_CLASS_NONE, 1, 0},
> @@ -1669,6 +1670,7 @@ static const riscv_ext_flag_table_t 
> riscv_ext_flag_table[] =
>{"xcvmac",_options::x_riscv_xcv_subext, MASK_XCVMAC},
>{"xcvalu",_options::x_riscv_xcv_subext, MASK_XCVALU},
>{"xcvelw",_options::x_riscv_xcv_subext, MASK_XCVELW},
> +  {"xcvbi", _options::x_riscv_xcv_subext, MASK_XCVBI},
>
>{"xtheadba",  _options::x_riscv_xthead_subext, MASK_XTHEADBA},
>{"xtheadbb",  _options::x_riscv_xthead_subext, MASK_XTHEADBB},
> diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
> index 2711efe68c5..718b4bd77df 100644
> --- a/gcc/config/riscv/constraints.md
> +++ b/gcc/config/riscv/constraints.md
> @@ -247,3 +247,9 @@
>(and (match_code "const_int")
> (and (match_test "IN_RANGE (ival, 0, 1073741823)")
>  (match_test "exact_log2 (ival + 1) != -1"
> +
> +(define_constraint "CV_bi_sign5"
> +  "@internal
> +   A 5-bit signed immediate for CORE-V Immediate Branch."
> +  (and (match_code "const_int")
> +   (match_test "IN_RANGE (ival, -16, 15)")))
> diff --git a/gcc/config/riscv/corev.md b/gcc/config/riscv/corev.md
> index 92bf0b5d6a6..f6a1f916d7e 100644
> --- a/gcc/config/riscv/corev.md
> +++ b/gcc/config/riscv/corev.md
> @@ -706,3 +706,17 @@
>
>[(set_attr "type" "load")
>(set_attr "mode" "SI")])
> +
> +;; XCVBI Builtins
> +(define_insn "cv_branch"
> +  [(set (pc)
> +   (if_then_else
> +(match_operator 1 "equality_operator"
> +[(match_operand:X 2 "register_operand" "r")
> + (match_operand:X 3 "const_int5s_operand" 
> "CV_bi_sign5")])
> +(label_ref (match_operand 0 "" ""))
> +   

Re: [committed] libstdc++: Reformat Python code

2023-11-13 Thread Romain GEISSLER
> On 28 Sep 2023, at 22:21, Jonathan Wakely wrote:
> 
> Tested x86_64-linux (GDB 13.2, Python 3.11). Pushed to trunk.
> 
> -- >8 --
> 
> Some of these changes were suggested by autopep8's --aggressive
> option, others are for readability.
> 
> Break long lines by splitting strings across multiple lines, or
> introducing local variables to hold results.
> 
> Use raw strings for regular expressions, so that backslashes don't need
> to be escaped.

Hi Jonathan,

FYI, with Python 3.12, the "Use raw strings for regular expressions" part
seems to fix the following new Python warnings:

/opt/1A/toolchain/x86_64-v23.0.19/lib/../share/gcc-13.2.1/python/libstdcxx/v6/printers.py:1273:
 SyntaxWarning: invalid escape sequence '\d'
  self.typename = re.sub('^std::experimental::fundamentals_v\d::', 
'std::experimental::', self.typename, 1)
/opt/1A/toolchain/x86_64-v23.0.19/lib/../share/gcc-13.2.1/python/libstdcxx/v6/printers.py:1302:
 SyntaxWarning: invalid escape sequence '\w'
  x = re.sub("std::string(?!\w)", s, m.group(1))
 … (snipped, there are a few more than that in total).

Would it be OK to backport the "raw string" fix to the branches that are
still maintained, in order to avoid deprecation warnings as soon as people
use gdb with Python >= 3.12?

Thanks,
Romain
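
For reference, a minimal sketch of the raw-string form of the first call quoted in the warnings above. The pattern is the one from the warning; the sample type name is made up for illustration, and the surrounding printer code is not reproduced here.

```python
import re

# Pattern taken from the quoted warning, written as a raw string so that
# \d reaches the regex engine without passing through string escapes.
pattern = r'^std::experimental::fundamentals_v\d::'

# Hypothetical type name, used only to exercise the pattern.
typename = 'std::experimental::fundamentals_v1::optional<int>'

result = re.sub(pattern, 'std::experimental::', typename, count=1)
print(result)
```

The non-raw spelling produces the same pattern string today, because Python keeps unrecognised escapes such as `\d` as written; the raw string simply avoids the SyntaxWarning reported above.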

Re: [RFC PATCH] Detecting lifetime-dse issues via Valgrind

2023-11-13 Thread Alexander Monakov


On Mon, 13 Nov 2023, Richard Biener wrote:

> Another generic comment - placing a built-in call probably pessimizes code
> generation unless we handle it specially during alias analysis (or in
> builtin_fnspec).

But considering the resulting code is intended to be run under Valgrind,
isn't a bit worse quality acceptable? Note that we don't want loads
following the built-in to be optimized out, they are necessary as they
will be flagged by Valgrind as attempts to read uninitialized memory.

I suspect positioning the pass immediately after build_ssa as we do now
is quite imperfect because we will then instrument 'x' in 

  void f()
  {
    int x, *p;
    p = &x;
  }

Ideally we'd position it such that more locals are put in SSA form,
but not so late that we miss some UB, right? Perhaps after the first pass_ccp?

> I also don't like having another pass for this - did you
> investigate to do the instrumentation at the point the CLOBBERs are
> introduced?

I don't see a better approach: some CLOBBERs are emitted by the C++
front-end via build_clobber_this, some by omp-expand, some during
gimplification. I'm not a fan of useless IR rescans either, but
this pass is supposed to run very rarely, not by default.

> Another possibility would be to make this more generic
> and emit the instrumentation when we lower GIMPLE_BIND during
> the GIMPLE lowering pass, you wouldn't then rely on the CLOBBERs
> some of which only appear when -fstack-reuse=none is not used.

The CLOBBERs that trigger on Firefox and LLVM are emitted not during
gimplification, but via build_clobber_this in the front-end.

Alexander


[committed 21/22] arm: testsuite: fix some more architecture tests

2023-11-13 Thread Richard Earnshaw

This fixes a bunch more tests that try to override the default architecture;
some partially used the framework for doing this, others just blindly
added a -march option, which was doomed to cause problems.  In most cases
we can now run these tests regardless of the user's testing options and
the base compiler configuration.

gcc/testsuite:

* lib/target-supports.exp (check_effective_target_arm_arch_FUNC_ok):
Add test for v7a_arm.
* gcc.target/arm/pr60650-2.c: Use require-effective-target and
add-options.
* gcc.target/arm/pr60657.c: Likewise.
* gcc.target/arm/pr60663.c: Likewise.
* gcc.target/arm/pr81863.c: Likewise.
* gcc.target/arm/pr97969.c: Likewise.
* gcc.target/arm/pr98931.c: Likewise.
* gcc.target/arm/tail-long-call.c: Likewise.
---
 gcc/testsuite/gcc.target/arm/pr60650-2.c  | 4 +++-
 gcc/testsuite/gcc.target/arm/pr60657.c| 4 +++-
 gcc/testsuite/gcc.target/arm/pr60663.c| 4 +++-
 gcc/testsuite/gcc.target/arm/pr81863.c| 4 +++-
 gcc/testsuite/gcc.target/arm/pr97969.c| 4 +++-
 gcc/testsuite/gcc.target/arm/pr98931.c| 5 +++--
 gcc/testsuite/gcc.target/arm/tail-long-call.c | 6 --
 gcc/testsuite/lib/target-supports.exp | 1 +
 8 files changed, 23 insertions(+), 9 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/pr60650-2.c b/gcc/testsuite/gcc.target/arm/pr60650-2.c
index c44d7b5f9d2..36d57f08676 100644
--- a/gcc/testsuite/gcc.target/arm/pr60650-2.c
+++ b/gcc/testsuite/gcc.target/arm/pr60650-2.c
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v7a_ok } */
 /* { dg-skip-if "-mpure-code supports M-profile only" { *-*-* } { "-mpure-code" } } */
-/* { dg-options "-O2 -fno-omit-frame-pointer -march=armv7-a" } */
+/* { dg-options "-O2 -fno-omit-frame-pointer" } */
+/* { dg-add-options arm_arch_v7a } */
 
 int a, h, j;
 long long d, e, i;
diff --git a/gcc/testsuite/gcc.target/arm/pr60657.c b/gcc/testsuite/gcc.target/arm/pr60657.c
index ca121526bb3..e608dd39355 100644
--- a/gcc/testsuite/gcc.target/arm/pr60657.c
+++ b/gcc/testsuite/gcc.target/arm/pr60657.c
@@ -1,6 +1,8 @@
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v7a_ok } */
 /* { dg-skip-if "-mpure-code supports M-profile only" { *-*-* } { "-mpure-code" } } */
-/* { dg-options "-O2 -march=armv7-a" } */
+/* { dg-options "-O2" } */
+/* { dg-add-options arm_arch_v7a } */
 
 
 void foo (void);
diff --git a/gcc/testsuite/gcc.target/arm/pr60663.c b/gcc/testsuite/gcc.target/arm/pr60663.c
index 8d76d79301a..bb48d7fdae8 100644
--- a/gcc/testsuite/gcc.target/arm/pr60663.c
+++ b/gcc/testsuite/gcc.target/arm/pr60663.c
@@ -1,7 +1,9 @@
 /* PR rtl-optimization/60663 */
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v7a_ok } */
 /* { dg-skip-if "-mpure-code supports M-profile only" { *-*-* } { "-mpure-code" } } */
-/* { dg-options "-O2 -march=armv7-a" } */
+/* { dg-options "-O2" } */
+/* { dg-add-options arm_arch_v7a } */
 
 int
 foo (void)
diff --git a/gcc/testsuite/gcc.target/arm/pr81863.c b/gcc/testsuite/gcc.target/arm/pr81863.c
index 85bfab144c0..a96f3b58411 100644
--- a/gcc/testsuite/gcc.target/arm/pr81863.c
+++ b/gcc/testsuite/gcc.target/arm/pr81863.c
@@ -1,8 +1,10 @@
 /* testsuite/gcc.target/arm/pr48183.c */
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v7a_arm_ok } */
 /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mslow-flash-data" } } */
 /* { dg-skip-if "-mpure-code and -mword-relocations incompatible" { *-*-* } { "-mpure-code" } } */
-/* { dg-options "-O2 -mword-relocations -march=armv7-a -marm" } */
+/* { dg-options "-O2 -mword-relocations" } */
+/* { dg-add-options arm_arch_v7a_arm } */
 /* { dg-final { scan-assembler-not "\[\\t \]+movw" } } */
 
 int a, d, f;
diff --git a/gcc/testsuite/gcc.target/arm/pr97969.c b/gcc/testsuite/gcc.target/arm/pr97969.c
index 714a1d18870..b8c3a23676a 100644
--- a/gcc/testsuite/gcc.target/arm/pr97969.c
+++ b/gcc/testsuite/gcc.target/arm/pr97969.c
@@ -1,5 +1,7 @@
 /* { dg-do compile } */
-/* { dg-options "-std=c99 -fno-omit-frame-pointer -mthumb -w -Os" } */
+/* { dg-require-effective-target arm_arch_v6m_ok } */
+/* { dg-options "-std=c99 -fno-omit-frame-pointer -w -Os" } */
+/* { dg-add-options arm_arch_v6m } */
 
 typedef a[23];
 enum { b };
diff --git a/gcc/testsuite/gcc.target/arm/pr98931.c b/gcc/testsuite/gcc.target/arm/pr98931.c
index 66070ad7298..56f66a3b82f 100644
--- a/gcc/testsuite/gcc.target/arm/pr98931.c
+++ b/gcc/testsuite/gcc.target/arm/pr98931.c
@@ -1,6 +1,7 @@
 /* { dg-do assemble } */
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" "-mcpu=*" } } */
-/* { dg-options "-march=armv8.1-m.main -O3 --param=max-completely-peeled-insns=1300 --save-temps -mthumb" } */
+/* { dg-require-effective-target arm_arch_v8_1m_main_ok } */
+/* { dg-options "-O3 --param=max-completely-peeled-insns=1300 --save-temps" } */
+/* { 

[committed 22/22] arm: testsuite: improve compatibility of gcc.dg/debug/pr57351.c

2023-11-13 Thread Richard Earnshaw

This test is arm specific and requires neon.  To improve compatibility
add a new test for armv7-a with neon and use that.

gcc/testsuite:

* lib/target-supports.exp (v7a_neon): New feature-test target.
* gcc.dg/debug/pr57351.c: Use it.
---
 gcc/testsuite/gcc.dg/debug/pr57351.c  | 7 +++
 gcc/testsuite/lib/target-supports.exp | 1 +
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/debug/pr57351.c b/gcc/testsuite/gcc.dg/debug/pr57351.c
index 236d74ddedb..50861a4bf88 100644
--- a/gcc/testsuite/gcc.dg/debug/pr57351.c
+++ b/gcc/testsuite/gcc.dg/debug/pr57351.c
@@ -1,8 +1,7 @@
 /* { dg-do compile } */
-/* { dg-require-effective-target arm_neon }  */
-/* { dg-require-effective-target arm_arch_v7a_ok }  */
-/* { dg-options "-std=c99 -Os -g -march=armv7-a" } */
-/* { dg-add-options arm_neon } */
+/* { dg-require-effective-target arm_arch_v7a_neon_ok }  */
+/* { dg-options "-std=c99 -Os -g" } */
+/* { dg-add-options arm_arch_v7a_neon } */
 
 typedef unsigned int size_t;
 typedef int ptrdiff_t;
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index ae43dc97872..43a040e135c 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -5425,6 +5425,7 @@ foreach { armfunc armflag armdefs } {
 	v6m "-march=armv6-m -mthumb -mfloat-abi=soft" __ARM_ARCH_6M__
 	v7a "-march=armv7-a+fp" __ARM_ARCH_7A__
 	v7a_arm "-march=armv7-a+fp -marm" "__ARM_ARCH_7A__ && !__thumb__"
+	v7a_neon "-march=armv7-a+simd -mfpu=auto -mfloat-abi=softfp" "__ARM_ARCH_7A__ && __ARM_NEON__"
 	v7r "-march=armv7-r+fp" __ARM_ARCH_7R__
 	v7m "-march=armv7-m -mthumb -mfloat-abi=soft" __ARM_ARCH_7M__
 	v7em "-march=armv7e-m+fp -mthumb" __ARM_ARCH_7EM__


[committed 18/22] arm: testsuite: improve compatibility of gcc.target/arm/pr19599.c

2023-11-13 Thread Richard Earnshaw

Add +fp to the architecture specification, so that -mfpu=auto works
with the hard-float ABI.

gcc/testsuite:

* gcc.target/arm/pr19599.c: Add +fp to the architecture.
---
 gcc/testsuite/gcc.target/arm/pr19599.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/pr19599.c b/gcc/testsuite/gcc.target/arm/pr19599.c
index a536548442f..d2f15ae4499 100644
--- a/gcc/testsuite/gcc.target/arm/pr19599.c
+++ b/gcc/testsuite/gcc.target/arm/pr19599.c
@@ -1,6 +1,6 @@
 /* { dg-skip-if "need at least armv5te" { *-*-* } { "-march=armv[234]*" "-mthumb" } { "" } } */
 /* { dg-skip-if "FDPIC does not support armv5te" { arm*-*-uclinuxfdpiceabi } "*" "" } */
-/* { dg-options "-O2 -march=armv5te -marm" }  */
+/* { dg-options "-O2 -march=armv5te+fp -marm" }  */
 /* { dg-final { scan-assembler "bx" } } */
 
 int (*indirect_func)();


[committed 15/22] arm: testsuite: improve compatibility of ftest-armv7m-thumb.c

2023-11-13 Thread Richard Earnshaw

This test is specific to armv7m cores which do not support hardware
floating-point.  We can improve its compatibility by having the default
options for this core specify -mfloat-abi=soft.

gcc/testsuite:

* lib/target-supports.exp (check_effective_target_arm_arch_FUNC_ok):
Use soft-float ABI for armv7m.
* gcc.target/arm/ftest-armv7m-thumb.c: Use dg-require-effective-target
to check flag compatibility.
---
 gcc/testsuite/gcc.target/arm/ftest-armv7m-thumb.c | 3 +--
 gcc/testsuite/lib/target-supports.exp | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/ftest-armv7m-thumb.c b/gcc/testsuite/gcc.target/arm/ftest-armv7m-thumb.c
index 363b48b7516..ba1985f5b0d 100644
--- a/gcc/testsuite/gcc.target/arm/ftest-armv7m-thumb.c
+++ b/gcc/testsuite/gcc.target/arm/ftest-armv7m-thumb.c
@@ -1,6 +1,5 @@
 /* { dg-do compile } */
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-march=*" } { "-march=arm7-m" } } */
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" } { "" } } */
+/* { dg-require-effective-target arm_arch_v7m_ok }
 /* { dg-options "-mthumb" } */
 /* { dg-add-options arm_arch_v7m } */
 
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 9d2958626ad..316e34a34be 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -5425,7 +5425,7 @@ foreach { armfunc armflag armdefs } {
 	v6m "-march=armv6-m -mthumb -mfloat-abi=soft" __ARM_ARCH_6M__
 	v7a "-march=armv7-a+fp" __ARM_ARCH_7A__
 	v7r "-march=armv7-r+fp" __ARM_ARCH_7R__
-	v7m "-march=armv7-m -mthumb" __ARM_ARCH_7M__
+	v7m "-march=armv7-m -mthumb -mfloat-abi=soft" __ARM_ARCH_7M__
 	v7em "-march=armv7e-m+fp -mthumb" __ARM_ARCH_7EM__
 	v7ve "-march=armv7ve+fp -marm"
 		"__ARM_ARCH_7A__ && __ARM_FEATURE_IDIV"


[committed 16/22] arm: testsuite: improve compatibility of gcc.target/arm/macro_defs*.c

2023-11-13 Thread Richard Earnshaw

Convert these tests to use dg-add-options for increased compatibility.
Since they also result in an empty translation unit, override the
default testsuite options.

gcc/testsuite:

* gcc.target/arm/macro_defs0.c: Use dg-require-effective-target and
dg-add-options.
* gcc.target/arm/macro_defs1.c: Likewise.
* gcc.target/arm/macro_defs2.c: Likewise.
---
 gcc/testsuite/gcc.target/arm/macro_defs0.c | 7 +++
 gcc/testsuite/gcc.target/arm/macro_defs1.c | 6 ++
 gcc/testsuite/gcc.target/arm/macro_defs2.c | 6 ++
 3 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/macro_defs0.c b/gcc/testsuite/gcc.target/arm/macro_defs0.c
index 684d49ffafa..17fd157452e 100644
--- a/gcc/testsuite/gcc.target/arm/macro_defs0.c
+++ b/gcc/testsuite/gcc.target/arm/macro_defs0.c
@@ -1,8 +1,7 @@
 /* { dg-do compile } */
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-march=*" } { "-march=armv7-m" } } */
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=soft" } } */
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" } { "" } } */
-/* { dg-options "-march=armv7-m -mcpu=cortex-m3 -mfloat-abi=soft -mthumb" } */
+/* { dg-require-effective-target arm_arch_v7m_ok } */
+/* { dg-options "" } */
+/* { dg-add-options arm_arch_v7m } */
 
 #ifdef __ARM_FP
 #error __ARM_FP should not be defined
diff --git a/gcc/testsuite/gcc.target/arm/macro_defs1.c b/gcc/testsuite/gcc.target/arm/macro_defs1.c
index 655ba9334f3..bd22154321e 100644
--- a/gcc/testsuite/gcc.target/arm/macro_defs1.c
+++ b/gcc/testsuite/gcc.target/arm/macro_defs1.c
@@ -1,10 +1,8 @@
 /* { dg-do compile } */
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-march=*" } { "-march=armv6-m" } } */
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" } { "" } } */
 /* { dg-require-effective-target arm_arch_v6m_ok } */
-/* { dg-options "-march=armv6-m -mthumb" } */
+/* { dg-options "" } */
+/* { dg-add-options arm_arch_v6m } */
 
 #ifdef __ARM_NEON_FP
 #error __ARM_NEON_FP should not be defined
 #endif
-
diff --git a/gcc/testsuite/gcc.target/arm/macro_defs2.c b/gcc/testsuite/gcc.target/arm/macro_defs2.c
index 9a960423562..a26fc237611 100644
--- a/gcc/testsuite/gcc.target/arm/macro_defs2.c
+++ b/gcc/testsuite/gcc.target/arm/macro_defs2.c
@@ -1,7 +1,7 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv7ve -mcpu=cortex-a15 -mfpu=neon-vfpv4" } */
-/* { dg-add-options arm_neon } */
 /* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "" } */
+/* { dg-add-options arm_neon } */
 
 #ifndef __ARM_NEON_FP
 #error  __ARM_NEON_FP is not defined but should be
@@ -10,5 +10,3 @@
 #ifndef __ARM_FP
 #error  __ARM_FP is not defined but should be
 #endif
-
-


[committed 20/22] testsuite: arm: tighten up mode-specific ISA tests

2023-11-13 Thread Richard Earnshaw

Some of the standard Arm architecture tests require the test to use a
specific instruction set (arm or thumb).  But although the framework
was checking that the flag was accepted, it wasn't checking that the
flag wasn't somehow being overridden (e.g. by run-specific options).  We
can improve these tests easily by checking whether or not __thumb__ is
defined.

gcc/testsuite:

* lib/target-supports.exp (check_effective_target_arm_arch_FUNC_ok):
For instruction-set specific tests, check that __thumb__ is, or
isn't, defined as appropriate.
---
 gcc/testsuite/lib/target-supports.exp | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 316e34a34be..3d504d26164 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -5403,25 +5403,25 @@ proc check_effective_target_arm_fp16_hw { } {
 foreach { armfunc armflag armdefs } {
 	v4 "-march=armv4 -marm" __ARM_ARCH_4__
 	v4t "-march=armv4t -mfloat-abi=softfp" __ARM_ARCH_4T__
-	v4t_arm "-march=armv4t -marm" __ARM_ARCH_4T__
-	v4t_thumb "-march=armv4t -mthumb -mfloat-abi=softfp" __ARM_ARCH_4T__
+	v4t_arm "-march=armv4t -marm" "__ARM_ARCH_4T__ && !__thumb__"
+	v4t_thumb "-march=armv4t -mthumb -mfloat-abi=softfp" "__ARM_ARCH_4T__ && __thumb__"
 	v5t "-march=armv5t -mfloat-abi=softfp" __ARM_ARCH_5T__
-	v5t_arm "-march=armv5t -marm" __ARM_ARCH_5T__
-	v5t_thumb "-march=armv5t -mthumb -mfloat-abi=softfp" __ARM_ARCH_5T__
+	v5t_arm "-march=armv5t -marm" "__ARM_ARCH_5T__ && !__thumb__"
+	v5t_thumb "-march=armv5t -mthumb -mfloat-abi=softfp" "__ARM_ARCH_5T__ && __thumb__"
 	v5te "-march=armv5te+fp -mfloat-abi=softfp" __ARM_ARCH_5TE__
-	v5te_arm "-march=armv5te+fp -marm" __ARM_ARCH_5TE__
-	v5te_thumb "-march=armv5te+fp -mthumb -mfloat-abi=softfp" __ARM_ARCH_5TE__
-	xscale_arm "-mcpu=xscale -mfloat-abi=soft -marm" __XSCALE__
+	v5te_arm "-march=armv5te+fp -marm" "__ARM_ARCH_5TE__ && !__thumb__"
+	v5te_thumb "-march=armv5te+fp -mthumb -mfloat-abi=softfp" "__ARM_ARCH_5TE__ && __thumb__"
+	xscale_arm "-mcpu=xscale -mfloat-abi=soft -marm" "__XSCALE__ && !__thumb__"
 	v6 "-march=armv6+fp -mfloat-abi=softfp" __ARM_ARCH_6__
-	v6_arm "-march=armv6+fp -marm" __ARM_ARCH_6__
-	v6_thumb "-march=armv6+fp -mthumb -mfloat-abi=softfp" __ARM_ARCH_6__
+	v6_arm "-march=armv6+fp -marm" "__ARM_ARCH_6__ && !__thumb__"
+	v6_thumb "-march=armv6+fp -mthumb -mfloat-abi=softfp" "__ARM_ARCH_6__ && __thumb__"
 	v6k "-march=armv6k+fp -mfloat-abi=softfp" __ARM_ARCH_6K__
-	v6k_arm "-march=armv6k+fp -marm" __ARM_ARCH_6K__
-	v6k_thumb "-march=armv6k+fp -mthumb -mfloat-abi=softfp" __ARM_ARCH_6K__
+	v6k_arm "-march=armv6k+fp -marm" "__ARM_ARCH_6K__ && !__thumb__"
+	v6k_thumb "-march=armv6k+fp -mthumb -mfloat-abi=softfp" "__ARM_ARCH_6K__ && __thumb__"
 	v6t2 "-march=armv6t2+fp" __ARM_ARCH_6T2__
 	v6z "-march=armv6z+fp -mfloat-abi=softfp" __ARM_ARCH_6Z__
-	v6z_arm "-march=armv6z+fp -marm" __ARM_ARCH_6Z__
-	v6z_thumb "-march=armv6z+fp -mthumb -mfloat-abi=softfp" __ARM_ARCH_6Z__
+	v6z_arm "-march=armv6z+fp -marm" "__ARM_ARCH_6Z__ && !__thumb__"
+	v6z_thumb "-march=armv6z+fp -mthumb -mfloat-abi=softfp" "__ARM_ARCH_6Z__ && __thumb__"
 	v6m "-march=armv6-m -mthumb -mfloat-abi=soft" __ARM_ARCH_6M__
 	v7a "-march=armv7-a+fp" __ARM_ARCH_7A__
 	v7r "-march=armv7-r+fp" __ARM_ARCH_7R__
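
The effect of the extra __thumb__ terms above can be modelled roughly as
follows. This is only a sketch in Python: the real logic is Tcl in
target-supports.exp and is evaluated by the compiler's preprocessor. The
helper names, the shape of the generated probe, and the simplified condition
evaluator (macro names joined by `&&`, optionally negated with `!`) are all
assumptions made for illustration.

```python
def probe_source(defs):
    """Build a C translation unit that only compiles when `defs` holds.

    Assumed shape of the probe; the real one lives in target-supports.exp.
    """
    return (
        f"#if !({defs})\n"
        f"#error !({defs})\n"
        "#endif\n"
        "int main (void) { return 0; }\n"
    )


def condition_holds(defs, defined_macros):
    """Evaluate a condition made only of macro names, `&&` and a leading `!`.

    A macro in `defined_macros` counts as true and any other name as false,
    which approximates how `#if` treats identifiers in these table entries.
    """
    for term in defs.split("&&"):
        term = term.strip()
        negated = term.startswith("!")
        name = term.lstrip("!")
        if (name in defined_macros) == negated:
            return False
    return True


# Suppose run-specific options override -marm, so __thumb__ ends up defined.
macros = {"__ARM_ARCH_4T__", "__thumb__"}

old_defs = "__ARM_ARCH_4T__"                # v4t_arm before the patch
new_defs = "__ARM_ARCH_4T__ && !__thumb__"  # v4t_arm after the patch

print(condition_holds(old_defs, macros))  # old check still passes
print(condition_holds(new_defs, macros))  # new check rejects the override
print(probe_source(new_defs))
```

With the old condition the override goes unnoticed and the test runs in the
wrong instruction set; with the new one the effective-target check fails and
the test is skipped instead.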


[committed 13/22] arm: testsuite: improve compatibility of pragma_arch_switch_2.c

2023-11-13 Thread Richard Earnshaw

This test was explicitly setting the architecture on the command-line and
in the body of the test.  In both cases this causes problems with the auto
FPU setting.  Fix by using the testsuite infrastructure correctly and by
adding +fp to the pragma.

gcc/testsuite:

* gcc.target/arm/pragma_arch_switch_2.c: Use testsuite infrastructure
to set the architecture flags.  Add +fp to the pragma that changes the
architecture.
---
 gcc/testsuite/gcc.target/arm/pragma_arch_switch_2.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/pragma_arch_switch_2.c b/gcc/testsuite/gcc.target/arm/pragma_arch_switch_2.c
index 5080d2c7a91..567943bd8ed 100644
--- a/gcc/testsuite/gcc.target/arm/pragma_arch_switch_2.c
+++ b/gcc/testsuite/gcc.target/arm/pragma_arch_switch_2.c
@@ -3,9 +3,10 @@
 /* { dg-do assemble } */
 /* { dg-require-effective-target arm_arm_ok } */
 /* { dg-require-effective-target arm_arch_v5te_arm_ok } */
-/* { dg-additional-options "-Wall -O2 -march=armv5te -std=gnu99 -marm" } */
+/* { dg-additional-options "-Wall -O2 -std=gnu99" } */
+/* { dg-add-options arm_arch_v5te_arm } */
 
-#pragma GCC target ("arch=armv6")
+#pragma GCC target ("arch=armv6+fp")
 int test_assembly (int hi, int lo)
 {
int res;


[committed 09/22] arm: testsuite: tidy up pr65647-2.c pre-checks.

2023-11-13 Thread Richard Earnshaw

Another case where we can make better use of the infrastructure to
improve the compatibility of this test.

gcc/testsuite:

* gcc.target/arm/pr65647-2.c: Use dg-add-options to manage target
flags.
---
 gcc/testsuite/gcc.target/arm/pr65647-2.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/pr65647-2.c b/gcc/testsuite/gcc.target/arm/pr65647-2.c
index e3978e512ea..79637bfd9d7 100644
--- a/gcc/testsuite/gcc.target/arm/pr65647-2.c
+++ b/gcc/testsuite/gcc.target/arm/pr65647-2.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_arch_v6_arm_ok } */
-/* { dg-options "-O3 -marm -march=armv6 -std=c99" } */
+/* { dg-options "-O3 -std=c99" } */
+/* { dg-add-options arm_arch_v6_arm } */
 
 typedef struct {
   int i;


[committed 19/22] arm: testsuite: improve compatibility of gcc.target/arm/pr59575.c

2023-11-13 Thread Richard Earnshaw

Use dg-require-effective-target/dg-add-options to improve
compatibility of this test with various compiler configurations.

gcc/testsuite:

* gcc.target/arm/pr59575.c: Use dg-require-effective-target and
dg-add-options.
---
 gcc/testsuite/gcc.target/arm/pr59575.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/pr59575.c b/gcc/testsuite/gcc.target/arm/pr59575.c
index cc49be3d61f..27d7d40526e 100644
--- a/gcc/testsuite/gcc.target/arm/pr59575.c
+++ b/gcc/testsuite/gcc.target/arm/pr59575.c
@@ -1,7 +1,9 @@
 /* PR target/59575 */
 /* { dg-do compile } */
+/* { dg-require-effective-target arm_arch_v7a_ok } */
 /* { dg-skip-if "-mpure-code supports M-profile only" { *-*-* } { "-mpure-code" } } */
-/* { dg-options "-Os -g -march=armv7-a" } */
+/* { dg-options "-Os -g" } */
+/* { dg-add-options arm_arch_v7a } */
 
 void foo (int *);
 int *bar (int, long long, int);


[committed 14/22] arm: testsuite: modernize framework usage for arm/scd42-2.c

2023-11-13 Thread Richard Earnshaw

Make this test more useful by using dg-require-effective-target/
dg-add-options.

gcc/testsuite:

* gcc.target/arm/scd42-2.c: Use modern dg- flags.
---
 gcc/testsuite/gcc.target/arm/scd42-2.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/scd42-2.c b/gcc/testsuite/gcc.target/arm/scd42-2.c
index 3c9768d22d9..cd416885a80 100644
--- a/gcc/testsuite/gcc.target/arm/scd42-2.c
+++ b/gcc/testsuite/gcc.target/arm/scd42-2.c
@@ -1,11 +1,8 @@
 /* Verify that mov is preferred on XScale for loading a 2 byte constant. */
 /* { dg-do compile } */
-/* { dg-skip-if "Test is specific to the Xscale" { arm*-*-* } { "-march=*" } { "-march=xscale" } } */
-/* { dg-skip-if "Test is specific to the Xscale" { arm*-*-* } { "-mcpu=*" } { "-mcpu=xscale" } } */
-/* { dg-skip-if "Test is specific to ARM mode" { arm*-*-* } { "-mthumb" } { "" } } */
-/* { dg-require-effective-target arm32 } */
-/* { dg-require-effective-target arm_arch_v5te_arm_ok } */
-/* { dg-options "-mcpu=xscale -O -marm" } */
+/* { dg-require-effective-target arm_arch_xscale_arm_ok } */
+/* { dg-options "-O" } */
+/* { dg-add-options arm_arch_xscale_arm } */
 
 unsigned load2(void) __attribute__ ((naked));
 unsigned load2(void)


[committed 12/22] arm: testsuite: improve compatibility of pragma_arch_attribute*.c

2023-11-13 Thread Richard Earnshaw

These tests use pragmas and attributes to change the architecture.
Sometimes they simply add a feature using "+crc", but other times they
try to completely reset the architecture using "arch=armv8-a+crc".
The latter fails on a hard-float ABI with -mfpu=auto because it also
clears the FP capability.  Fix by adding +simd when the full
architecture is specified.

gcc/testsuite:

* gcc.target/arm/pragma_arch_attribute.c: Add +simd to pragmas that
set an explicit architecture.
* gcc.target/arm/pragma_arch_attribute_2.c: Likewise.
* gcc.target/arm/pragma_arch_attribute_3.c: Likewise.
---
 gcc/testsuite/gcc.target/arm/pragma_arch_attribute.c   | 6 +++---
 gcc/testsuite/gcc.target/arm/pragma_arch_attribute_2.c | 2 +-
 gcc/testsuite/gcc.target/arm/pragma_arch_attribute_3.c | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/pragma_arch_attribute.c b/gcc/testsuite/gcc.target/arm/pragma_arch_attribute.c
index a06dbf04037..a5e1edad3a4 100644
--- a/gcc/testsuite/gcc.target/arm/pragma_arch_attribute.c
+++ b/gcc/testsuite/gcc.target/arm/pragma_arch_attribute.c
@@ -10,7 +10,7 @@
 #endif
 
 #pragma GCC push_options
-#pragma GCC target ("arch=armv8-a+crc")
+#pragma GCC target ("arch=armv8-a+simd+crc")
 #ifndef __ARM_FEATURE_CRC32
 # error "__ARM_FEATURE_CRC32 is not defined in push 1."
 #endif
@@ -41,7 +41,7 @@ void test_crc_unknown_ok_attr_1 ()
 # error "__ARM_FEATURE_CRC32 is defined after attribute set 1."
 #endif
 
-__attribute__((target("arch=armv8-a+crc")))
+__attribute__((target("arch=armv8-a+simd+crc")))
 void test_crc_unknown_ok_attr_2 ()
 {
 	__crc32b (0, 0);
@@ -51,4 +51,4 @@ void test_crc_unknown_ok_attr_2 ()
 # error "__ARM_FEATURE_CRC32 is defined after attribute set 2."
 #endif
 
-#pragma GCC reset_options
\ No newline at end of file
+#pragma GCC reset_options
diff --git a/gcc/testsuite/gcc.target/arm/pragma_arch_attribute_2.c b/gcc/testsuite/gcc.target/arm/pragma_arch_attribute_2.c
index 2e8e385774b..189af170096 100644
--- a/gcc/testsuite/gcc.target/arm/pragma_arch_attribute_2.c
+++ b/gcc/testsuite/gcc.target/arm/pragma_arch_attribute_2.c
@@ -8,7 +8,7 @@
 
 extern uint32_t bar();
 
-__attribute__((target("arch=armv8-a+crc"))) uint32_t crc32cw(uint32_t crc, uint32_t val)
+__attribute__((target("arch=armv8-a+simd+crc"))) uint32_t crc32cw(uint32_t crc, uint32_t val)
 {
 uint32_t res;
 asm("crc32cw %0, %1, %2" : "=r"(res) : "r"(crc), "r"(val));
diff --git a/gcc/testsuite/gcc.target/arm/pragma_arch_attribute_3.c b/gcc/testsuite/gcc.target/arm/pragma_arch_attribute_3.c
index 3714812cf26..eb7f990477b 100644
--- a/gcc/testsuite/gcc.target/arm/pragma_arch_attribute_3.c
+++ b/gcc/testsuite/gcc.target/arm/pragma_arch_attribute_3.c
@@ -9,7 +9,7 @@
 extern uint32_t bar();
 
 #pragma GCC push_options
-#pragma GCC target("arch=armv8-a+crc")
+#pragma GCC target("arch=armv8-a+simd+crc")
 uint32_t crc32cw(uint32_t crc, uint32_t val)
 {
 uint32_t res;


[committed 08/22] arm: testsuite: improve compatibility of arm/lto/pr96939_1.c

2023-11-13 Thread Richard Earnshaw

This test overrides the architecture, but fails to specify the
floating point architecture.  This causes problems if -mfpu=auto is
used.

gcc/testsuite:

* gcc.target/arm/lto/pr96939_1.c: Add +simd to the architecture
specification.
---
 gcc/testsuite/gcc.target/arm/lto/pr96939_1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/lto/pr96939_1.c b/gcc/testsuite/gcc.target/arm/lto/pr96939_1.c
index 53c6093e803..4afdbdaf5ad 100644
--- a/gcc/testsuite/gcc.target/arm/lto/pr96939_1.c
+++ b/gcc/testsuite/gcc.target/arm/lto/pr96939_1.c
@@ -1,5 +1,5 @@
 /* PR target/96939 */
-/* { dg-options "-march=armv8-a+crc" } */
+/* { dg-options "-march=armv8-a+simd+crc" } */
 
 #include 
 


[committed 05/22] arm: testsuite: avoid problems with -mfpu=auto in attr-crypto.c

2023-11-13 Thread Richard Earnshaw

This test overrides the architecture, but fails to describe which
floating-point features are needed.  This causes problems if the ABI
requires FP for parameter passing and -mfpu=auto is selected, so ensure
that one is specified.

gcc/testsuite:

* gcc.target/arm/attr-crypto.c: Add +simd to the -march
specification.
---
 gcc/testsuite/gcc.target/arm/attr-crypto.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/attr-crypto.c b/gcc/testsuite/gcc.target/arm/attr-crypto.c
index 05e458f36b6..3959d0b67e7 100644
--- a/gcc/testsuite/gcc.target/arm/attr-crypto.c
+++ b/gcc/testsuite/gcc.target/arm/attr-crypto.c
@@ -3,7 +3,7 @@
pragma.  */
 /* { dg-skip-if "-mpure-code supports M-profile only" { *-*-* } { "-mpure-code" } } */
 /* { dg-require-effective-target arm_fp_ok } */
-/* { dg-options "-O2 -march=armv8-a" } */
+/* { dg-options "-O2 -march=armv8-a+simd" } */
 /* { dg-add-options arm_fp } */
 
 /* Reset fpu to a value compatible with the next pragmas.  */


[committed 17/22] arm: testsuite: improve compatibility of gcc.target/arm/optional_thumb-*.c

2023-11-13 Thread Richard Earnshaw

These tests deliberately pass invalid option combinations to check
that the compiler is generating the correct diagnostic.  Nevertheless,
we can improve their compatibility with other testsuite options.  For
optional_thumb-1.c we use a soft-float ABI, while for
optional_thumb-3.c we use arm_arch_v7em as the target architecture,
then set the architecture manually.

gcc/testsuite:

* gcc.target/arm/optional_thumb-1.c: Force a soft-float ABI.
* gcc.target/arm/optional_thumb-3.c: Check for armv7e-m compatibility,
then set the architecture explicitly.
---
 gcc/testsuite/gcc.target/arm/optional_thumb-1.c | 2 +-
 gcc/testsuite/gcc.target/arm/optional_thumb-3.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/optional_thumb-1.c b/gcc/testsuite/gcc.target/arm/optional_thumb-1.c
index 99cb0c3f33b..90d9ada6ade 100644
--- a/gcc/testsuite/gcc.target/arm/optional_thumb-1.c
+++ b/gcc/testsuite/gcc.target/arm/optional_thumb-1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile { target { ! default_mode } } } */
 /* { dg-skip-if "-marm/-mthumb/-march/-mcpu given" { *-*-* } { "-marm" "-mthumb" "-march=*" "-mcpu=*" } } */
-/* { dg-options "-march=armv6-m" } */
+/* { dg-options "-march=armv6-m -mfloat-abi=soft" } */
 
 /* Check that -mthumb is not needed when compiling for a Thumb-only target.  */
 
diff --git a/gcc/testsuite/gcc.target/arm/optional_thumb-3.c b/gcc/testsuite/gcc.target/arm/optional_thumb-3.c
index d9150e09e47..a6c661ac031 100644
--- a/gcc/testsuite/gcc.target/arm/optional_thumb-3.c
+++ b/gcc/testsuite/gcc.target/arm/optional_thumb-3.c
@@ -1,7 +1,7 @@
 /* { dg-do compile } */
-/* { dg-require-effective-target arm_cortex_m } */
+/* { dg-require-effective-target arm_arch_v7em_ok } */
 /* { dg-skip-if "-mthumb given" { *-*-* } { "-mthumb" } } */
-/* { dg-options "-marm" } */
+/* { dg-options "-march=armv7e-m+fp -marm" } */
 /* { dg-error "target CPU does not support ARM mode" "missing error with -marm on Thumb-only targets" { target *-*-* } 0 } */
 
 /* Check that -marm gives an error when compiling for a Thumb-only target.  */


[committed 11/22] arm: testsuite: improve compatibility of pr88648-asm-syntax-unified.c

2023-11-13 Thread Richard Earnshaw

Fix another test that was trying to set the architecture directly
rather than using the infrastructure as intended.

gcc/testsuite:

* gcc.target/arm/pr88648-asm-syntax-unified.c: It isn't necessary
to try to override the architecture flags specified by arm_arch_v7a.
---
 gcc/testsuite/gcc.target/arm/pr88648-asm-syntax-unified.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/pr88648-asm-syntax-unified.c b/gcc/testsuite/gcc.target/arm/pr88648-asm-syntax-unified.c
index 251b4d5bc9d..53d0bb053fc 100644
--- a/gcc/testsuite/gcc.target/arm/pr88648-asm-syntax-unified.c
+++ b/gcc/testsuite/gcc.target/arm/pr88648-asm-syntax-unified.c
@@ -1,8 +1,8 @@
 /* Test for unified syntax assembly generation.  */
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_arch_v7a_ok } */
+/* { dg-options "-marm -masm-syntax-unified" } */
 /* { dg-add-options arm_arch_v7a } */
-/* { dg-options "-marm -march=armv7-a -masm-syntax-unified" } */
 
 void test ()
 {


[committed 06/22] arm: testsuite: avoid problems with -mfpu=auto in attr_thumb-static2.c

2023-11-13 Thread Richard Earnshaw

This test overrides the architecture, but fails to describe which
floating-point features are needed.  This causes problems if the ABI
requires FP for parameter passing and -mfpu=auto is selected, so ensure
that one is specified.

gcc/testsuite:

* gcc.target/arm/attr_thumb-static2.c: Add +fp to the -march
specification.
---
 gcc/testsuite/gcc.target/arm/attr_thumb-static2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/attr_thumb-static2.c b/gcc/testsuite/gcc.target/arm/attr_thumb-static2.c
index 77454343b23..a38f9a95607 100644
--- a/gcc/testsuite/gcc.target/arm/attr_thumb-static2.c
+++ b/gcc/testsuite/gcc.target/arm/attr_thumb-static2.c
@@ -2,7 +2,7 @@
 
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_arch_v7a_ok } */
-/* { dg-options "-O0 -march=armv7-a" } */
+/* { dg-options "-O0 -march=armv7-a+fp" } */
 
 struct _NSPoint
 {

