Re: Repost [PATCH 1/6] Add -mcpu=future

2024-02-05 Thread Michael Meissner
On Tue, Jan 23, 2024 at 04:44:32PM +0800, Kewen.Lin wrote:
> > --- a/gcc/config/rs6000/rs6000-cpus.def
> > +++ b/gcc/config/rs6000/rs6000-cpus.def
> > @@ -88,6 +88,10 @@
> >  | OPTION_MASK_POWER10  \
> >  | OTHER_POWER10_MASKS)
> >  
> > +/* Flags for a potential future processor that may or may not be 
> > delivered.  */
> > +#define ISA_FUTURE_MASKS   (ISA_3_1_MASKS_SERVER   \
> > +| OPTION_MASK_FUTURE)
> > +
> 
> Nit: Named as "ISA_FUTURE_MASKS_SERVER" seems more accurate as it's 
> constituted
> with ISA_3_1_MASKS_**SERVER** ...

Well the _SERVER stuff was due to the power7 days when we still had to support
the E500 in the main rs6000 tree.  But I will change it to be more consistent
in the future patches.

> ..., then this need to be updated accordingly.
> 
> > diff --git a/gcc/config/rs6000/rs6000-opts.h 
> > b/gcc/config/rs6000/rs6000-opts.h
> > index 33fd0efc936..25890ae3034 100644
> > --- a/gcc/config/rs6000/rs6000-opts.h
> > +++ b/gcc/config/rs6000/rs6000-opts.h
> > @@ -67,7 +67,9 @@ enum processor_type
> > PROCESSOR_MPCCORE,
> > PROCESSOR_CELL,
> > PROCESSOR_PPCA2,
> > -   PROCESSOR_TITAN
> > +   PROCESSOR_TITAN,
> > +
> 
> Nit: unintentional empty line?
> 
> > +   PROCESSOR_FUTURE
> >  };

It was more as a separation.  The MPCCORE, CELL, PPCA2, and TITAN are rather
old processors.  I don't recall why we kept them after the POWER.

Logically we should re-order the list and move MPCCORE, etc. earlier, but I
will delete the blank line in future patches.

> > +static int
> > +rs600_cpu_index_lookup (enum processor_type processor)
> 
> s/rs600_cpu_index_lookup/rs6000_cpu_index_lookup/

I'm going to redo it, and eliminate rs600_cpu_index_lookup.  Thanks for
catching the spelling of rs600 instead of rs6000.

> > +{
> > +  for (size_t i = 0; i < ARRAY_SIZE (processor_target_table); i++)
> > +if (processor_target_table[i].processor == processor)
> > +  return i;
> > +
> > +  return -1;
> > +}
> 
> Nit: Since this is given with a valid enum processor_type, I think it should
> never return -1?  If so, may be more clear with gcc_unreachable () or adjust
> with initial -1, break when hits and assert it's not -1.

As I said, in looking at it, I think I will rewrite the code that uses it to
call rs6000_cpu_name_lookup instead.
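
For reference, the reviewer's alternative (start at -1, break on a hit, assert the result) could look roughly like the sketch below.  The enum, table and function names here are hypothetical stand-ins and not the rs6000 definitions; plain assert takes the place of gcc_assert.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the backend's processor enum and target table.
enum processor_type { PROCESSOR_POWER9, PROCESSOR_POWER10, PROCESSOR_FUTURE };

struct cpu_entry
{
  const char *name;
  processor_type processor;
};

static const cpu_entry cpu_table[] = {
  { "power9", PROCESSOR_POWER9 },
  { "power10", PROCESSOR_POWER10 },
  { "future", PROCESSOR_FUTURE },
};

// Return the table index for PROCESSOR.  Every valid enumerator is expected
// to have an entry, so a miss is treated as an internal error instead of
// being handed back to the caller as -1.
static int
cpu_index_lookup (processor_type processor)
{
  int index = -1;
  for (std::size_t i = 0; i < sizeof cpu_table / sizeof cpu_table[0]; i++)
    if (cpu_table[i].processor == processor)
      {
        index = static_cast<int> (i);
        break;
      }
  assert (index != -1);  // stands in for gcc_assert in this sketch
  return index;
}
```

With this shape callers never have to handle a negative index, which is the point of the review comment.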

> > +
> >  
> >  /* Return number of consecutive hard regs needed starting at reg REGNO
> > to hold something of mode MODE.
> > @@ -3756,23 +3768,45 @@ rs6000_option_override_internal (bool global_init_p)
> >  rs6000_isa_flags &= ~OPTION_MASK_POWERPC64;
> >  #endif
> >  
> > +  /* At the moment, we don't have explict -mtune=future support.  If the 
> > user
> 
> Nit: s/explict/explicit/

Thanks.

> 
> > + explicitly tried to use -mtune=future, give a warning.  If not, use 
> > the
> 
> Nit: s/tried/tries/?

Thanks.  I will reword the comment.

> > + power10 tuning until future tuning is added.  */
> >if (rs6000_tune_index >= 0)
> > -tune_index = rs6000_tune_index;
> > +{
> > +  enum processor_type cur_proc
> > +   = processor_target_table[rs6000_tune_index].processor;
> > +
> > +  if (cur_proc == PROCESSOR_FUTURE)
> > +   {
> > + static bool issued_future_tune_warning = false;
> > + if (!issued_future_tune_warning)
> > +   {
> > + issued_future_tune_warning = true;
> 
> This seems to ensure we only warn this once, but I noticed that in rs6000/
> only some OPT_Wpsabi related warnings adopt this way, I wonder if we don't
> restrict it like this, for a tiny simple case, how many times it would warn?

In a simple case, you would only get the warning once.  But if you use
__attribute__((__target__(...))) or #pragma target ... you might see it more
than once.
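
A minimal sketch of the warn-once guard under discussion, assuming the option-override code can run several times per translation unit (as it may with target attributes or pragmas).  The counter and function names are hypothetical and exist only so the behaviour can be observed.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical counter used only to observe how many warnings were printed.
static int warnings_issued = 0;

// Models one pass over the options.  The function-local static flag makes
// sure the diagnostic is printed at most once however often it is called.
static void
handle_tune_future ()
{
  static bool issued_future_tune_warning = false;
  if (!issued_future_tune_warning)
    {
      issued_future_tune_warning = true;
      ++warnings_issued;
      std::fprintf (stderr,
                    "warning: '-mtune=future' is not currently supported\n");
    }
}

// Run N passes and report the total number of warnings printed so far.
static int
simulate_passes (int n)
{
  assert (n >= 0);
  for (int i = 0; i < n; i++)
    handle_tune_future ();
  return warnings_issued;
}
```

Without the static flag, each simulated pass would print its own copy of the warning.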

> > + warning (0, "%qs is not currently supported", "-mtune=future");
> > +   }
> > +  rs6000_tune_index = rs600_cpu_index_lookup 
> > (PROCESSOR_POWER10);
> > +   }
> > +  tune_index = rs6000_tune_index;
> > +}
> >else if (cpu_index >= 0)
> > -rs6000_tune_index = tune_index = cpu_index;
> > +{
> > +  enum processor_type cur_cpu
> > +   = processor_target_table[cpu_index].processor;
> > +
> > +  rs6000_tune_index = tune_index
> > +   = (cur_cpu == PROCESSOR_FUTURE
> > +  ? rs600_cpu_index_lookup (PROCESSOR_POWER10)
> 
> s/rs600_cpu_index_lookup/rs6000_cpu_index_lookup/

See above.

> > +  : cpu_index);
> > +}
> >else
> >  {
> > -  size_t i;
> >enum processor_type tune_proc
> > = (TARGET_POWERPC64 ? PROCESSOR_DEFAULT64 : PROCESSOR_DEFAULT);
> >  
> > -  tune_index = -1;
> > -  for (i = 0; i < ARRAY_SIZE (processor_target_table); i++)
> > -   if (processor_target_table[i].processor == tune_proc)
> > - {
> > -   tune_index = i;
> > -   break;
> > - }
> > +  tune_index = rs600_cpu_index_lookup (tune_proc == PROCESSOR_FUTURE
> > +

Re: [PATCH] AArch64: aarch64_class_max_nregs mishandles 64-bit structure modes [PR112577]

2024-02-05 Thread Tejas Belagod

On 1/24/24 5:09 PM, Richard Sandiford wrote:

Tejas Belagod  writes:

The target hook aarch64_class_max_nregs returns the incorrect result for 64-bit
structure modes like V31DImode or V41DFmode etc.  The calculation of the nregs
is based on the size of AdvSIMD vector register for 64-bit modes which ought to
be UNITS_PER_VREG / 2.  This patch fixes the register size.

Existing tests like gcc.target/aarch64/advsimd-intrinsics/vld1x3.c cover this 
change.

Regression tested on aarch64-linux. Bootstrapped on aarch64-linux.

OK for trunk?

gcc/ChangeLog:

PR target/112577
* config/aarch64/aarch64.cc (aarch64_class_max_nregs): Handle 64-bit
vector structure modes correctly.
---
  gcc/config/aarch64/aarch64.cc | 10 ++
  1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index a5a6b52730d..b9f00bdce3b 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -12914,10 +12914,12 @@ aarch64_class_max_nregs (reg_class_t regclass, 
machine_mode mode)
  && constant_multiple_p (GET_MODE_SIZE (mode),
  aarch64_vl_bytes (mode, vec_flags), &nregs))
return nregs;
-  return (vec_flags & VEC_ADVSIMD
- ? CEIL (lowest_size, UNITS_PER_VREG)
- : CEIL (lowest_size, UNITS_PER_WORD));
-
+  if (vec_flags == (VEC_ADVSIMD | VEC_STRUCT | VEC_PARTIAL))
+   return GET_MODE_SIZE (mode).to_constant () / 8;
+  else
+   return (vec_flags & VEC_ADVSIMD
+   ? CEIL (lowest_size, UNITS_PER_VREG)
+   : CEIL (lowest_size, UNITS_PER_WORD));


Very minor, sorry, but I think it would be more usual style to add the
new condition as an early-out and so not add an "else", especially since
there's already an early-out for SVE above:

   if (vec_flags == (VEC_ADVSIMD | VEC_STRUCT | VEC_PARTIAL))
return GET_MODE_SIZE (mode).to_constant () / 8;
   return (vec_flags & VEC_ADVSIMD
  ? CEIL (lowest_size, UNITS_PER_VREG)
  : CEIL (lowest_size, UNITS_PER_WORD));

I think it's also worth keeping the blank line between this and the
following block of cases.

OK with that change, thanks.

Richard


Thanks for the review, Richard. Re-spin attached. Will apply.

Thanks,
Tejas.
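
To make the arithmetic concrete, here is a small sketch of the register count before and after the fix.  It assumes a 16-byte full vector register (UNITS_PER_VREG) and 8-byte 64-bit vectors; the helper names are hypothetical and the mode is reduced to its size in bytes.

```cpp
#include <cassert>

// Assumed sizes: a full Advanced SIMD register is 16 bytes, a 64-bit
// (partial) vector occupies 8 bytes of one register.
constexpr unsigned units_per_vreg = 16;
constexpr unsigned bytes_per_64bit_vector = 8;

// Round-up division, like GCC's CEIL macro.
constexpr unsigned
ceil_div (unsigned a, unsigned b)
{
  return (a + b - 1) / b;
}

// Registers needed for a structure mode of MODE_SIZE bytes.  Structures of
// 64-bit vectors take one register per member; other Advanced SIMD modes
// are counted in full 16-byte registers.
constexpr unsigned
max_nregs (unsigned mode_size, bool partial_struct)
{
  return partial_struct ? mode_size / bytes_per_64bit_vector
                        : ceil_div (mode_size, units_per_vreg);
}

// A structure of three 64-bit vectors is 24 bytes: the old formula gives
// CEIL (24, 16) == 2 registers, while three are actually required.
static_assert (ceil_div (24, units_per_vreg) == 2, "old result");
static_assert (max_nregs (24, true) == 3, "fixed result");
```

The early-out form suggested in the review computes the same value; only the control flow differs.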





  case PR_REGS:
  case PR_LO_REGS:
  case PR_HI_REGS:
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 
a5a6b52730d6c5013346d128e89915883f1707ae..a7c624f8b7327ae8c1324959c3ab5dfb4e7ebc6c
 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -12914,6 +12914,8 @@ aarch64_class_max_nregs (reg_class_t regclass, 
machine_mode mode)
  && constant_multiple_p (GET_MODE_SIZE (mode),
  aarch64_vl_bytes (mode, vec_flags), &nregs))
return nregs;
+  if (vec_flags == (VEC_ADVSIMD | VEC_STRUCT | VEC_PARTIAL))
+   return GET_MODE_SIZE (mode).to_constant () / 8;
   return (vec_flags & VEC_ADVSIMD
  ? CEIL (lowest_size, UNITS_PER_VREG)
  : CEIL (lowest_size, UNITS_PER_WORD));


Re: [PATCH] gcc/Makefile.in: Fix install-info target if BUILD_INFO is empty

2024-02-05 Thread Alexandre Oliva
Hello, Christophe,

Thanks for the patch.

On Feb  5, 2024, Christophe Lyon  wrote:

> In order to save build time, our CI overrides BUILD_INFO="", which
> works when invoking 'make all' but not for 'make install' in case some
> info files need an update.

Hmm, I don't think this would be desirable.  We ship updated info files
in release tarballs, and it would be desirable to install them even if
makeinfo is not available in the build environment.

> I noticed this when testing a patch posted on the gcc-patches list,
> leading to an error at 'make install' time after updating tm.texi (the
> build reported 'new text' in tm.texi and stopped).  This is because
> 'install' depends on 'install-info', which depends on
> $(DESTDIR)$(infodir)/gccint.info (among others).

Ideally, we'd detect and report info files that are out-of-date WRT
their ultimate sources, especially to catch tm.texi.in changes, but
doing so only at install time is clearly suboptimal.

I mean, if we don't have the tools to build info files, it's fine if we
skip their building, and even refrain from installing info files that
are missing or outdated, but we should install prebuilt ones if they're
available, and we should probably *not* refrain from trying to satisfy
the dependencies for info files at build time, even if it turns out that
we can't build the info files themselves.

This suggests to me that, rather than setting BUILD_INFO to the empty
string, we should set it to e.g. no-info, so that $(MAKEINFO) will not
be run because x$(BUILD_INFO) != xinfo, but so that we still get the
dependencies resolved, e.g. by making no-info depend on info.  Or maybe
make it info-check-deps, and insert that between info and its current
deps.  WDYT?

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice but
very few check the facts.  Think Assange & Stallman.  The empires strike back


Re: Ping: Re: [PATCH] libgcc: fix SEH C++ rethrow semantics [PR113337]

2024-02-05 Thread NightStrike
On Mon, Feb 5, 2024, 06:53 Matteo Italia  wrote:

> On 31/01/24 04:24, LIU Hao wrote:
> > On 2024-01-31 08:08, Jonathan Yong wrote:
> >> On 1/24/24 15:17, Matteo Italia wrote:
> >>> Ping! That's a one-line fix, and you can find all the details in the
> >>> bugzilla entry. Also, I can provide executables built with the
> >>> affected toolchains, demonstrating the problem and the fix.
> >>>
> >>> Thanks,
> >>> Matteo
> >>>
> >>
> >> I was away last week. LH, care to comment? Changes look fine to me.
> >>
> >
> > The change looks good to me, too.
> >
> > I haven't tested it though. According to a similar construction around
> > 'libgcc/unwind.inc:265' it should be that way.
>
> Hello,
>
> thank you for the replies, is there anything else I can do to help push
> this forward?
>

Remember to mention the PR with the right syntax in the ChangeLog so the
bot adds a comment field. I didn't see it in yours, but I might have missed
it.

>


Re: [PATCH] contrib: Fill in HOST{CC,CFLAGS,CXX,CXXFLAGS} in test_installed

2024-02-05 Thread Alexandre Oliva
On Feb  5, 2024, Jakub Jelinek  wrote:

>   * test_installed: Fill in HOSTCC, HOSTCXX, HOSTCFLAGS and
>   HOSTCXXFLAGS.

LGTM, thanks,

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice but
very few check the facts.  Think Assange & Stallman.  The empires strike back


Re: [pushed] c++: defaulted op== for incomplete class [PR107291]

2024-02-05 Thread Jason Merrill

On 2/5/24 21:55, Marek Polacek wrote:

On Mon, Feb 05, 2024 at 09:29:08PM -0500, Jason Merrill wrote:

Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

After complaining about lack of friendship, we should not try to go on and
define the defaulted comparison operator anyway.

PR c++/107291

gcc/cp/ChangeLog:

* method.cc (early_check_defaulted_comparison): Fail if not friend.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/spaceship-eq17.C: New test.
---
  gcc/cp/method.cc| 6 +-
  gcc/testsuite/g++.dg/cpp2a/spaceship-eq17.C | 5 +
  2 files changed, 10 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/spaceship-eq17.C

diff --git a/gcc/cp/method.cc b/gcc/cp/method.cc
index d49e5a565e8..3b8dc75d198 100644
--- a/gcc/cp/method.cc
+++ b/gcc/cp/method.cc
@@ -1228,7 +1228,11 @@ early_check_defaulted_comparison (tree fn)
  /* Defaulted outside the class body.  */
  ctx = TYPE_MAIN_VARIANT (parmtype);
  if (!is_friend (ctx, fn))
-   error_at (loc, "defaulted %qD is not a friend of %qT", fn, ctx);
+   {
+ error_at (loc, "defaulted %qD is not a friend of %qT", fn, ctx);
+ inform (location_of (ctx), "declared here");
+ ok = false;


Can I push this?

gcc/cp/ChangeLog:

* method.cc (early_check_defaulted_comparison): Add
auto_diagnostic_group.


Oops, yes, please.  In the future, adding missing auto_diagnostic_group 
can be pushed as obvious.



---
  gcc/cp/method.cc | 1 +
  1 file changed, 1 insertion(+)

diff --git a/gcc/cp/method.cc b/gcc/cp/method.cc
index 3b8dc75d198..957496d3e18 100644
--- a/gcc/cp/method.cc
+++ b/gcc/cp/method.cc
@@ -1229,6 +1229,7 @@ early_check_defaulted_comparison (tree fn)
  ctx = TYPE_MAIN_VARIANT (parmtype);
  if (!is_friend (ctx, fn))
{
+ auto_diagnostic_group d;
  error_at (loc, "defaulted %qD is not a friend of %qT", fn, ctx);
  inform (location_of (ctx), "declared here");
  ok = false;

base-commit: c5d34912ad576be1ef19be92f7eabde54b9089eb




[PATCH] x86: Update constraints for APX NDD instructions

2024-02-05 Thread H.J. Lu
1. The only supported TLS code sequence with ADD is

addq foo@gottpoff(%rip),%reg

Change je constraint to a memory operand in APX NDD ADD pattern with
register source operand.

2. The instruction length of APX NDD instructions with immediate operand:

op imm, mem, reg

may exceed the size limit of 15 bytes when a non-default address space,
segment register or address size prefix is used.

Add jM constraint which is a memory operand valid for APX NDD instructions
with immediate operand and add jO constraint which is an offsetable memory
operand valid for APX NDD instructions with immediate operand.  Update
APX NDD patterns with jM and jO constraints.
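
As a rough illustration of the length problem in item 2, the sketch below adds up a worst-case encoding.  The byte counts (4-byte EVEX prefix, 1-byte opcode, ModRM, optional SIB, displacement, immediate, 1-byte override prefixes) are assumptions taken from the general x86 encoding scheme, not from this patch, and all names are hypothetical.

```cpp
#include <cassert>

constexpr int max_insn_length = 15;  // architectural limit on x86

// Hypothetical description of one NDD instruction with a memory source
// operand and an immediate.
struct ndd_encoding
{
  bool has_sib;         // base + index addressing needs a SIB byte
  int disp_bytes;       // 0, 1 or 4
  int imm_bytes;        // 1, 2 or 4
  bool segment_prefix;  // segment override, e.g. for a non-default
                        // address space
  bool addr32_prefix;   // 0x67 address-size override
};

// Sum the assumed size of each encoding component.
constexpr int
ndd_length (const ndd_encoding &e)
{
  return 4    // EVEX prefix
         + 1  // opcode
         + 1  // ModRM
         + (e.has_sib ? 1 : 0)
         + e.disp_bytes
         + e.imm_bytes
         + (e.segment_prefix ? 1 : 0)
         + (e.addr32_prefix ? 1 : 0);
}

// base + index + disp32 with imm32 just fits ...
constexpr ndd_encoding plain_case = { true, 4, 4, false, false };
// ... but one extra prefix byte pushes it over the limit.
constexpr ndd_encoding segment_case = { true, 4, 4, true, false };
constexpr ndd_encoding addr32_case = { true, 4, 4, false, true };

static_assert (ndd_length (plain_case) == max_insn_length, "fits exactly");
static_assert (ndd_length (segment_case) > max_insn_length, "too long");
```

Under these assumptions the form with immediate and memory operands has no slack left, which is why the new constraints have to reject operands that would need an extra prefix.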

gcc/

PR target/113711
PR target/113733
* config/i386/constraints.md: List all constraints with j prefix.
(j>): Change auto-dec to auto-inc in documentation.
(je): Changed to a memory constraint with APX NDD TLS operand
check.
(jM): New memory constraint for APX NDD instructions.
(jO): Likewise.
* config/i386/i386-protos.h (x86_poff_operand_p): Removed.
* config/i386/i386.cc (x86_poff_operand_p): Likewise.
* config/i386/i386.md (*add3_doubleword): Use rjO.
(*add_1[SWI48]): Use je and jM.
(addsi_1_zext): Use jM.
(*addv4_doubleword_1[DWI]): Likewise.
(*sub_1[SWI]): Use jM.
(@add3_cc_overflow_1[SWI]): Likewise.
(*add3_doubleword_cc_overflow_1): Use rjO.
(*and3_doubleword): Likewise.
(*anddi_1): Use jM.
(*andsi_1_zext): Likewise.
(*and_1[SWI24]): Likewise.
(*3_doubleword[any_or]): Use rjO.
(*code_1[any_or SWI248]): Use jM.
(*si_1_zext[zero_extend + any_or]): Likewise.
* config/i386/predicates.md (apx_ndd_memory_operand): New.
(apx_ndd_add_memory_operand): Likewise.

gcc/testsuite/

PR target/113711
PR target/113733
* gcc.target/i386/apx-ndd-2.c: New test.
* gcc.target/i386/apx-ndd-base-index-1.c: Likewise.
* gcc.target/i386/apx-ndd-no-seg-global-1.c: Likewise.
* gcc.target/i386/apx-ndd-seg-1.c: Likewise.
* gcc.target/i386/apx-ndd-seg-2.c: Likewise.
* gcc.target/i386/apx-ndd-seg-3.c: Likewise.
* gcc.target/i386/apx-ndd-seg-4.c: Likewise.
* gcc.target/i386/apx-ndd-seg-5.c: Likewise.
* gcc.target/i386/apx-ndd-tls-1a.c: Likewise.
* gcc.target/i386/apx-ndd-tls-2.c: Likewise.
* gcc.target/i386/apx-ndd-tls-3.c: Likewise.
* gcc.target/i386/apx-ndd-tls-4.c: Likewise.
* gcc.target/i386/apx-ndd-x32-1.c: Likewise.
---
 gcc/config/i386/constraints.md|  36 -
 gcc/config/i386/i386-protos.h |   1 -
 gcc/config/i386/i386.cc   |  25 
 gcc/config/i386/i386.md   | 129 +-
 gcc/config/i386/predicates.md |  65 +
 gcc/testsuite/gcc.target/i386/apx-ndd-2.c |  17 +++
 .../gcc.target/i386/apx-ndd-base-index-1.c|  50 +++
 .../gcc.target/i386/apx-ndd-no-seg-global-1.c |  74 ++
 gcc/testsuite/gcc.target/i386/apx-ndd-seg-1.c |  98 +
 gcc/testsuite/gcc.target/i386/apx-ndd-seg-2.c |  98 +
 gcc/testsuite/gcc.target/i386/apx-ndd-seg-3.c |  14 ++
 gcc/testsuite/gcc.target/i386/apx-ndd-seg-4.c |   9 ++
 gcc/testsuite/gcc.target/i386/apx-ndd-seg-5.c |  13 ++
 .../gcc.target/i386/apx-ndd-tls-1a.c  |  41 ++
 gcc/testsuite/gcc.target/i386/apx-ndd-tls-2.c |  38 ++
 gcc/testsuite/gcc.target/i386/apx-ndd-tls-3.c |  16 +++
 gcc/testsuite/gcc.target/i386/apx-ndd-tls-4.c |  31 +
 gcc/testsuite/gcc.target/i386/apx-ndd-x32-1.c |  49 +++
 18 files changed, 712 insertions(+), 92 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-base-index-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-no-seg-global-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-seg-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-seg-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-seg-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-seg-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-seg-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-tls-1a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-tls-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-tls-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-tls-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-ndd-x32-1.c

diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md
index 280e4c8e36c..64702d9c0a8 100644
--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -372,6 +372,24 @@ (define_address_constraint "Ts"
   "Address operand without segment register"
   (match_operand 0 "address_no_seg_operand"))
 
+;; j prefix is used for 

[PATCH] c++: further DR 2237 fix [PR97202]

2024-02-05 Thread Marek Polacek
Technically, not a regression.  But it's such a simple fix for such
rare code that I think we should put it in now and be done with
DR 2237.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
With a redundant inline specifier like this:

  template struct S : Base {
inline S() {}
  };

we don't detect the simple-template-id as the declarator-id of the
constructor.  The problem is that I check for CPP_TEMPLATE_ID too early,
at a point at which cp_parser_template_id may not have been called yet.
So let's check for it at the end of the function, after the tentative
parse and rollback.

PR c++/97202

gcc/cp/ChangeLog:

* parser.cc (cp_parser_constructor_declarator_p): Check CPP_TEMPLATE_ID
at the end of the function.

gcc/testsuite/ChangeLog:

* g++.dg/DRs/dr2237-5.C: New test.
---
 gcc/cp/parser.cc| 4 +---
 gcc/testsuite/g++.dg/DRs/dr2237-5.C | 7 +++
 2 files changed, 8 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/DRs/dr2237-5.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 473eaf4f1f7..8befd26feab 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -32337,8 +32337,6 @@ cp_parser_constructor_declarator_p (cp_parser *parser, 
cp_parser_flags flags,
   && next_token->type != CPP_TEMPLATE_ID)
 return false;
 
-  const bool saw_template_id = (next_token->type == CPP_TEMPLATE_ID);
-
   /* Parse tentatively; we are going to roll back all of the tokens
  consumed here.  */
   cp_parser_parse_tentatively (parser);
@@ -32558,7 +32556,7 @@ cp_parser_constructor_declarator_p (cp_parser *parser, 
cp_parser_flags flags,
   /* DR 2237 (C++20 only): A simple-template-id is no longer valid as the
  declarator-id of a constructor or destructor.  */
   if (constructor_p
-  && saw_template_id
+  && cp_lexer_peek_token (parser->lexer)->type == CPP_TEMPLATE_ID
   && !cp_parser_uncommitted_to_tentative_parse_p (parser))
 {
   auto_diagnostic_group d;
diff --git a/gcc/testsuite/g++.dg/DRs/dr2237-5.C 
b/gcc/testsuite/g++.dg/DRs/dr2237-5.C
new file mode 100644
index 000..fd51968f7e1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/DRs/dr2237-5.C
@@ -0,0 +1,7 @@
+// PR c++/97202
+// { dg-options "" }
+
+template struct S : Base {
+  inline S() {} // { dg-warning "template-id not allowed for 
constructor" "" { target c++20 } }
+  inline ~S() {} // { dg-warning "template-id not allowed for 
destructor" "" { target c++20 } }
+};

base-commit: c5d34912ad576be1ef19be92f7eabde54b9089eb
prerequisite-patch-id: 2987a013eda8bc2ca9d8373eedac82067a654a5c
-- 
2.43.0



[PATCH v2] c++: DR2237, cdtor and template-id tweaks [PR107126]

2024-02-05 Thread Marek Polacek
On Mon, Feb 05, 2024 at 10:14:34AM -0500, Jason Merrill wrote:
> On 2/3/24 10:24, Marek Polacek wrote:
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > 
> > I'm not certain OPT_Wc__20_extensions is the best thing for something
> > from [diff.cpp17]; would you prefer something else?
> 
> I think it wants its own flag, that is enabled in C++20 or by
> -Wc++20-compat.

That seems best.  I called it -Wdeprecated-template-id-cdtor.
 
> > +   if (cxx_dialect >= cxx20)
> > + {
> > +   if (!cp_parser_simulate_error (parser))
> > + pedwarn (tilde_loc, OPT_Wc__20_extensions,
> > +  "template-id not allowed for destructor");
> > +   return error_mark_node;
> > + }
> > +   warning_at (tilde_loc, OPT_Wc__20_compat,
> > +   "template-id not allowed for destructor in C++20");
> 
> After a pedwarn we should accept the code, not return error_mark_node.

/facepalm, yes.
 
> I'm also concerned about pedwarn/warnings not guarded by
> !cp_parser_uncommitted_to_tentative_parse_p; that often leads to warning about
> a tentative parse as a declaration that is eventually abandoned in favor of
> a perfectly fine parse as an expression.

Done.
 
> It would be good for cp_parser_context to add a vec of warnings to emit at
> cp_parser_parse_definitely time, and then
> cp_parser_pedwarn/cp_parser_warning to fill it...

That would be nice; I don't think we can fix bugs like PR61259 otherwise.
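
The idea of queuing warnings until the tentative parse is committed could be sketched as follows.  This is a toy model only: none of these types or member names exist in the C++ front end, and the real cp_parser_parse_definitely has more to do than shown here.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of a parser context: warnings raised during a
// tentative parse are parked here instead of being printed.
struct tentative_context
{
  std::vector<std::string> pending;
};

struct mini_parser
{
  std::vector<tentative_context> contexts;  // stack of tentative parses
  std::vector<std::string> emitted;         // what the user would see

  void parse_tentatively () { contexts.emplace_back (); }

  // Queue the warning while tentative, emit it immediately otherwise.
  void warn (std::string msg)
  {
    if (contexts.empty ())
      emitted.push_back (std::move (msg));
    else
      contexts.back ().pending.push_back (std::move (msg));
  }

  // Commit: hand queued warnings to the enclosing context, or emit them
  // if this was the outermost tentative parse.
  void parse_definitely ()
  {
    assert (!contexts.empty ());
    tentative_context done = std::move (contexts.back ());
    contexts.pop_back ();
    for (std::string &msg : done.pending)
      warn (std::move (msg));
  }

  // Roll back: queued warnings disappear with the abandoned parse.
  void abort_tentative_parse ()
  {
    assert (!contexts.empty ());
    contexts.pop_back ();
  }
};
```

A warning raised while trying the declaration parse is then only seen if that parse is the one that is kept.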

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Since my r11-532 changes to implement DR2237, for this test:

  template
  struct S {
S();
  };

in C++20 we emit the ugly:

q.C:3:8: error: expected unqualified-id before ')' token
3 |   S();

which doesn't explain what the problem is.  This patch improves that
diagnostic, reduces the error to a pedwarn, and adds a -Wc++20-compat
diagnostic.  We now say:

q.C:3:7: warning: template-id not allowed for constructor in C++20 
[-Wdeprecated-template-id-cdtor]
3 |   S();
q.C:3:7: note: remove the '< >'
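
For readers unfamiliar with DR 2237, a small illustration of the spelling that stays valid and the one that the new warning targets.  The class name is hypothetical and the affected declarations are shown only in comments, so the snippet builds cleanly in any standard mode.

```cpp
#include <cassert>

template<typename T>
struct holder
{
  T value;

  // Valid in every standard mode: the injected-class-name is used as the
  // declarator-id of the constructor and destructor.
  holder () : value () {}
  ~holder () {}

  // The spelling that DR 2237 makes invalid in C++20 uses a
  // simple-template-id instead:
  //   holder<T> () : value () {}
  //   ~holder<T> () {}
  // The fix the note recommends is to drop the template argument list.
};
```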

This patch does *not* fix

where the C++20 diagnostic is missing altogether.

-Wc++20-compat triggered in libitm/; I sent a patch for that.

DR 2237
PR c++/107126
PR c++/97202

gcc/c-family/ChangeLog:

* c-opts.cc (c_common_post_options): In C++20 or with -Wc++20-compat,
turn on -Wdeprecated-template-id-cdtor.
* c.opt (Wdeprecated-template-id-cdtor): New.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_unqualified_id): Downgrade the DR2237 error to
a pedwarn.
(cp_parser_constructor_declarator_p): Likewise.

gcc/ChangeLog:

* doc/invoke.texi: Document -Wdeprecated-template-id-cdtor.

gcc/testsuite/ChangeLog:

* g++.dg/DRs/dr2237.C: Adjust dg-error.
* g++.dg/parse/constructor2.C: Likewise.
* g++.dg/template/error34.C: Likewise.
* g++.old-deja/g++.pt/ctor2.C: Likewise.
* g++.dg/DRs/dr2237-2.C: New test.
* g++.dg/DRs/dr2237-3.C: New test.
* g++.dg/DRs/dr2237-4.C: New test.
* g++.dg/warn/Wdeprecated-template-id-cdtor-1.C: New test.
* g++.dg/warn/Wdeprecated-template-id-cdtor-2.C: New test.
* g++.dg/warn/Wdeprecated-template-id-cdtor-3.C: New test.
* g++.dg/warn/Wdeprecated-template-id-cdtor-4.C: New test.
---
 gcc/c-family/c-opts.cc|  5 +++
 gcc/c-family/c.opt|  4 +++
 gcc/cp/parser.cc  | 34 ++-
 gcc/doc/invoke.texi   | 18 ++
 gcc/testsuite/g++.dg/DRs/dr2237-2.C   |  9 +
 gcc/testsuite/g++.dg/DRs/dr2237-3.C   | 16 +
 gcc/testsuite/g++.dg/DRs/dr2237-4.C   | 11 ++
 gcc/testsuite/g++.dg/DRs/dr2237.C |  2 +-
 gcc/testsuite/g++.dg/parse/constructor2.C | 16 -
 gcc/testsuite/g++.dg/template/error34.C   | 10 +++---
 .../warn/Wdeprecated-template-id-cdtor-1.C|  9 +
 .../warn/Wdeprecated-template-id-cdtor-2.C|  9 +
 .../warn/Wdeprecated-template-id-cdtor-3.C|  9 +
 .../warn/Wdeprecated-template-id-cdtor-4.C|  9 +
 gcc/testsuite/g++.old-deja/g++.pt/ctor2.C |  2 +-
 15 files changed, 140 insertions(+), 23 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/DRs/dr2237-2.C
 create mode 100644 gcc/testsuite/g++.dg/DRs/dr2237-3.C
 create mode 100644 gcc/testsuite/g++.dg/DRs/dr2237-4.C
 create mode 100644 gcc/testsuite/g++.dg/warn/Wdeprecated-template-id-cdtor-1.C
 create mode 100644 gcc/testsuite/g++.dg/warn/Wdeprecated-template-id-cdtor-2.C
 create mode 100644 gcc/testsuite/g++.dg/warn/Wdeprecated-template-id-cdtor-3.C
 create mode 100644 gcc/testsuite/g++.dg/warn/Wdeprecated-template-id-cdtor-4.C

diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
index b845aff2226..d2358e2009a 100644
--- a/gcc/c-family/c-opts.cc
+++ 

Re: [pushed] c++: defaulted op== for incomplete class [PR107291]

2024-02-05 Thread Marek Polacek
On Mon, Feb 05, 2024 at 09:29:08PM -0500, Jason Merrill wrote:
> Tested x86_64-pc-linux-gnu, applying to trunk.
> 
> -- 8< --
> 
> After complaining about lack of friendship, we should not try to go on and
> define the defaulted comparison operator anyway.
> 
>   PR c++/107291
> 
> gcc/cp/ChangeLog:
> 
>   * method.cc (early_check_defaulted_comparison): Fail if not friend.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/cpp2a/spaceship-eq17.C: New test.
> ---
>  gcc/cp/method.cc| 6 +-
>  gcc/testsuite/g++.dg/cpp2a/spaceship-eq17.C | 5 +
>  2 files changed, 10 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp2a/spaceship-eq17.C
> 
> diff --git a/gcc/cp/method.cc b/gcc/cp/method.cc
> index d49e5a565e8..3b8dc75d198 100644
> --- a/gcc/cp/method.cc
> +++ b/gcc/cp/method.cc
> @@ -1228,7 +1228,11 @@ early_check_defaulted_comparison (tree fn)
> /* Defaulted outside the class body.  */
> ctx = TYPE_MAIN_VARIANT (parmtype);
> if (!is_friend (ctx, fn))
> - error_at (loc, "defaulted %qD is not a friend of %qT", fn, ctx);
> + {
> +   error_at (loc, "defaulted %qD is not a friend of %qT", fn, ctx);
> +   inform (location_of (ctx), "declared here");
> +   ok = false;

Can I push this?

gcc/cp/ChangeLog:

* method.cc (early_check_defaulted_comparison): Add
auto_diagnostic_group.
---
 gcc/cp/method.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/cp/method.cc b/gcc/cp/method.cc
index 3b8dc75d198..957496d3e18 100644
--- a/gcc/cp/method.cc
+++ b/gcc/cp/method.cc
@@ -1229,6 +1229,7 @@ early_check_defaulted_comparison (tree fn)
  ctx = TYPE_MAIN_VARIANT (parmtype);
  if (!is_friend (ctx, fn))
{
+ auto_diagnostic_group d;
  error_at (loc, "defaulted %qD is not a friend of %qT", fn, ctx);
  inform (location_of (ctx), "declared here");
  ok = false;

base-commit: c5d34912ad576be1ef19be92f7eabde54b9089eb
-- 
2.43.0



[pushed] c++: defaulted op== for incomplete class [PR107291]

2024-02-05 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

After complaining about lack of friendship, we should not try to go on and
define the defaulted comparison operator anyway.

PR c++/107291

gcc/cp/ChangeLog:

* method.cc (early_check_defaulted_comparison): Fail if not friend.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/spaceship-eq17.C: New test.
---
 gcc/cp/method.cc| 6 +-
 gcc/testsuite/g++.dg/cpp2a/spaceship-eq17.C | 5 +
 2 files changed, 10 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/spaceship-eq17.C

diff --git a/gcc/cp/method.cc b/gcc/cp/method.cc
index d49e5a565e8..3b8dc75d198 100644
--- a/gcc/cp/method.cc
+++ b/gcc/cp/method.cc
@@ -1228,7 +1228,11 @@ early_check_defaulted_comparison (tree fn)
  /* Defaulted outside the class body.  */
  ctx = TYPE_MAIN_VARIANT (parmtype);
  if (!is_friend (ctx, fn))
-   error_at (loc, "defaulted %qD is not a friend of %qT", fn, ctx);
+   {
+ error_at (loc, "defaulted %qD is not a friend of %qT", fn, ctx);
+ inform (location_of (ctx), "declared here");
+ ok = false;
+   }
}
   else if (!same_type_ignoring_top_level_qualifiers_p (parmtype, ctx))
saw_bad = true;
diff --git a/gcc/testsuite/g++.dg/cpp2a/spaceship-eq17.C 
b/gcc/testsuite/g++.dg/cpp2a/spaceship-eq17.C
new file mode 100644
index 000..039bfac387c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/spaceship-eq17.C
@@ -0,0 +1,5 @@
+// PR c++/107291
+// { dg-do compile { target c++20 } }
+
+struct S4;// { dg-message "declared 
here" }
+bool operator==(S4 const &, S4 const &) = default; // { dg-error "not a 
friend" }

base-commit: d49780c08aade447953bfe4e877d386f5757f165
-- 
2.43.0



Re: Pushed: [PATCH] LoongArch: Avoid out-of-bounds access in loongarch_symbol_insns

2024-02-05 Thread chenglulu



On 2024/2/5 at 1:01 AM, Xi Ruoyao wrote:

I have a question. I see that you often add compilation options in
BOOT_CFLAGS.  I also want to test it. Do you have a recommended set of
compilation options?

When I build a compiler for my system I use
{BOOT_{C,CXX,LD}FLAGS,{C,CXX,LD}FLAGS_FOR_TARGET}="-O3 -march=la664
-mtune=la664 -pipe -fgraphite-identity -floop-nest-optimize -fipa-pta
-fdevirtualize-at-ltrans -fno-semantic-interposition -Wl,-O1
-Wl,--as-needed"

and enable PGO (make profiledbootstrap) and LTO
(--with-build-config=bootstrap-lto).

All of them but GRAPHITE (-fgraphite-identity -floop-nest-optimize)
seem "pretty safe" on the architectures I have hardware for.  GRAPHITE
is causing bootstrap failure on AArch64 with GCC 13 (PR109929) if
combined with PGO and the real cause has not been found yet.

But when I do a test build I normally only enable the flags which may
help to catch some issues, for example when a change only affects LTO I
add --with-build-config=bootstrap-lto, when changing something related
to LASX I use -O3 -mlasx (or -O3 -march=la664) as BOOT_CFLAGS.



Thank you so much. I will try to add optimization options.



[PATCH 2/2] LoongArch: Remove redundant symbol type conversions in larchintrin.h.

2024-02-05 Thread Lulu Cheng
gcc/ChangeLog:

* config/loongarch/larchintrin.h (__movgr2fcsr): Remove redundant
symbol type conversions.
(__cacop_d): Likewise.
(__cpucfg): Likewise.
(__asrtle_d): Likewise.
(__asrtgt_d): Likewise.
(__lddir_d): Likewise.
(__ldpte_d): Likewise.
(__crc_w_b_w): Likewise.
(__crc_w_h_w): Likewise.
(__crc_w_w_w): Likewise.
(__crc_w_d_w): Likewise.
(__crcc_w_b_w): Likewise.
(__crcc_w_h_w): Likewise.
(__crcc_w_w_w): Likewise.
(__crcc_w_d_w): Likewise.
(__csrrd_w): Likewise.
(__csrwr_w): Likewise.
(__csrxchg_w): Likewise.
(__csrrd_d): Likewise.
(__csrwr_d): Likewise.
(__csrxchg_d): Likewise.
(__iocsrrd_b): Likewise.
(__iocsrrd_h): Likewise.
(__iocsrrd_w): Likewise.
(__iocsrrd_d): Likewise.
(__iocsrwr_b): Likewise.
(__iocsrwr_h): Likewise.
(__iocsrwr_w): Likewise.
(__iocsrwr_d): Likewise.
(__frecipe_s): Likewise.
(__frecipe_d): Likewise.
(__frsqrte_s): Likewise.
(__frsqrte_d): Likewise.
---
 gcc/config/loongarch/larchintrin.h | 69 ++
 1 file changed, 33 insertions(+), 36 deletions(-)
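
Note: a minimal sketch of why such casts are redundant, assuming the builtin's prototype already uses the same types as the wrapper.  fake_cpucfg is a hypothetical stand-in, not the real __builtin_loongarch_cpucfg.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a builtin declared as
// 'unsigned int (unsigned int)'.
static unsigned int
fake_cpucfg (unsigned int word)
{
  return word ^ 0x5a5au;
}

// Old style: cast the argument and the result to the types they already have.
inline unsigned int
cpucfg_with_casts (unsigned int word)
{
  return (unsigned int) fake_cpucfg ((unsigned int) word);
}

// New style: rely on the declared parameter and return types.
inline unsigned int
cpucfg_plain (unsigned int word)
{
  return fake_cpucfg (word);
}

// The call expression already has the wrapper's return type, so the outer
// cast converts a value to its own type.
static_assert (std::is_same<decltype (fake_cpucfg (0u)), unsigned int>::value,
               "return type already matches");
```

Both wrappers return the same value for every input.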

diff --git a/gcc/config/loongarch/larchintrin.h 
b/gcc/config/loongarch/larchintrin.h
index 04672e71728..0f55bdae838 100644
--- a/gcc/config/loongarch/larchintrin.h
+++ b/gcc/config/loongarch/larchintrin.h
@@ -87,13 +87,13 @@ __rdtimel_w (void)
 /* Assembly instruction format:fcsr, rj.  */
 /* Data types in instruction templates:  VOID, UQI, USI.  */
 #define __movgr2fcsr(/*ui5*/ _1, _2) \
-  __builtin_loongarch_movgr2fcsr ((_1), (unsigned int) _2);
+  __builtin_loongarch_movgr2fcsr ((_1), _2);
 
 #if defined __loongarch64
 /* Assembly instruction format:ui5, rj, si12.  */
 /* Data types in instruction templates:  VOID, USI, UDI, SI.  */
 #define __cacop_d(/*ui5*/ _1, /*unsigned long int*/ _2, /*si12*/ _3) \
-  ((void) __builtin_loongarch_cacop_d ((_1), (unsigned long int) (_2), (_3)))
+  __builtin_loongarch_cacop_d ((_1), (_2), (_3))
 #else
 #error "Unsupported ABI."
 #endif
@@ -104,7 +104,7 @@ extern __inline unsigned int
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 __cpucfg (unsigned int _1)
 {
-  return (unsigned int) __builtin_loongarch_cpucfg ((unsigned int) _1);
+  return __builtin_loongarch_cpucfg (_1);
 }
 
 #ifdef __loongarch64
@@ -114,7 +114,7 @@ extern __inline void
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 __asrtle_d (long int _1, long int _2)
 {
-  __builtin_loongarch_asrtle_d ((long int) _1, (long int) _2);
+  __builtin_loongarch_asrtle_d (_1, _2);
 }
 
 /* Assembly instruction format:rj, rk.  */
@@ -123,7 +123,7 @@ extern __inline void
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 __asrtgt_d (long int _1, long int _2)
 {
-  __builtin_loongarch_asrtgt_d ((long int) _1, (long int) _2);
+  __builtin_loongarch_asrtgt_d (_1, _2);
 }
 #endif
 
@@ -131,7 +131,7 @@ __asrtgt_d (long int _1, long int _2)
 /* Assembly instruction format:rd, rj, ui5.  */
 /* Data types in instruction templates:  DI, DI, UQI.  */
 #define __lddir_d(/*long int*/ _1, /*ui5*/ _2) \
-  ((long int) __builtin_loongarch_lddir_d ((long int) (_1), (_2)))
+  __builtin_loongarch_lddir_d ((_1), (_2))
 #else
 #error "Unsupported ABI."
 #endif
@@ -140,7 +140,7 @@ __asrtgt_d (long int _1, long int _2)
 /* Assembly instruction format:rj, ui5.  */
 /* Data types in instruction templates:  VOID, DI, UQI.  */
 #define __ldpte_d(/*long int*/ _1, /*ui5*/ _2) \
-  ((void) __builtin_loongarch_ldpte_d ((long int) (_1), (_2)))
+  __builtin_loongarch_ldpte_d ((_1), (_2))
 #else
 #error "Unsupported ABI."
 #endif
@@ -151,7 +151,7 @@ extern __inline int
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 __crc_w_b_w (char _1, int _2)
 {
-  return (int) __builtin_loongarch_crc_w_b_w ((char) _1, (int) _2);
+  return __builtin_loongarch_crc_w_b_w (_1, _2);
 }
 
 /* Assembly instruction format:rd, rj, rk.  */
@@ -160,7 +160,7 @@ extern __inline int
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 __crc_w_h_w (short _1, int _2)
 {
-  return (int) __builtin_loongarch_crc_w_h_w ((short) _1, (int) _2);
+  return __builtin_loongarch_crc_w_h_w (_1, _2);
 }
 
 /* Assembly instruction format:rd, rj, rk.  */
@@ -169,7 +169,7 @@ extern __inline int
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 __crc_w_w_w (int _1, int _2)
 {
-  return (int) __builtin_loongarch_crc_w_w_w ((int) _1, (int) _2);
+  return __builtin_loongarch_crc_w_w_w (_1, _2);
 }
 
 #ifdef __loongarch64
@@ -179,7 +179,7 @@ extern __inline int
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 __crc_w_d_w (long int _1, int _2)
 {
-  return (int) __builtin_loongarch_crc_w_d_w ((long int) _1, (int) _2);
+  return 

[PATCH 1/2] LoongArch: Fix wrong return value type of __iocsrrd_h.

2024-02-05 Thread Lulu Cheng
gcc/ChangeLog:

* config/loongarch/larchintrin.h (__iocsrrd_h): Modify the
function return value type to unsigned short.
---
 gcc/config/loongarch/larchintrin.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/loongarch/larchintrin.h 
b/gcc/config/loongarch/larchintrin.h
index ff2c9f460ac..04672e71728 100644
--- a/gcc/config/loongarch/larchintrin.h
+++ b/gcc/config/loongarch/larchintrin.h
@@ -268,7 +268,7 @@ __iocsrrd_b (unsigned int _1)
 
 /* Assembly instruction format:rd, rj.  */
 /* Data types in instruction templates:  UHI, USI.  */
-extern __inline unsigned char
+extern __inline unsigned short
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 __iocsrrd_h (unsigned int _1)
 {
-- 
2.39.3



Re: [PATCH 0/4] Add DF_LIVE_SUBREG data and apply to IRA and LRA

2024-02-05 Thread Lehua Ding




On 2024/2/6 2:17, Joseph Myers wrote:

This series appears to be missing documentation for the new option in
invoke.texi.



OK, I'll add that. Thanks.

--
Best,
Lehua (RiVAI)


Re: [PATCH 0/4] Add DF_LIVE_SUBREG data and apply to IRA and LRA

2024-02-05 Thread Lehua Ding



On 2024/2/6 0:10, Jeff Law wrote:
Just a note.  I doubt this will get much traction from a review 
standpoint until gcc-14 is basically out the door.


My recommendation is to continue development, bugfixing, cleanup, etc 
between now and then.  Consider creating a branch for the work in the 
upstream repo.


OK, thanks for the guidance.

--
Best,
Lehua (RiVAI)



Re: [PATCH] LoongArch: libsanitizer: Enable build lsan and tsan for loongarch64.

2024-02-05 Thread chenglulu



On 2024/2/2 at 6:01 PM, Jakub Jelinek wrote:

On Tue, Jan 30, 2024 at 10:09:51AM +0800, Lulu Cheng wrote:

From: chenguoqi 

libsanitizer/ChangeLog:

* configure.tgt: Enable tsan and lsan for loongarch64.
* tsan/Makefile.am: Add tsan_rtl_loongarch64.S to 
EXTRA_libtsan_la_SOURCES.

This line is too long and should read
* tsan/Makefile.am (EXTRA_libtsan_la_SOURCES): Add
tsan_rtl_loongarch64.S.


I have fixed the description here and committed the patch as r14-8816.

Thanks!




* tsan/Makefile.in: Regenerate.

Otherwise LGTM.

Jakub




Re: [PATCH v2] RISC-V: THEAD: Fix improper immediate value for MODIFY_DISP instruction on 32-bit systems.

2024-02-05 Thread Christoph Müllner
On Mon, Feb 5, 2024 at 3:56 PM Jeff Law  wrote:
>
>
>
> On 2/5/24 05:00, Christoph Müllner wrote:
> > On Sat, Feb 3, 2024 at 2:11 PM Andreas Schwab 
> > wrote:
> >>
> >> On Jan 30 2024, Christoph Müllner wrote:
> >>
> >>> retested
> >>
> >> Nope.
> >
> > Sorry for this. I tested for no regressions in the test suite with a
> > cross-build and QEMU and did not do a Werror bootstrap build. I'll
> > provide a fix for this later today (also breaking the line as it is
> > longer than needed).
> Right.  And that's pretty standard given the state of the RISC-V
> platforms.  We've got a platform here that can bootstrap in a reasonable
> amount of time, but I haven't set that up in the CI system yet.
>
> Until such systems are common, these niggling issues are bound to show up.
>
> It's just whitespace around the HOST_WIDE_INT_PRINT_DEC and wrapping the
> long line, right?  I've got that in my tree that's bootstrapping now.  I
> don't mind committing it later today.  But if you get to it before my
> bootstrap is done, feel free to commit as pre-approved.

Pushed:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=184978cd74f962712e813030d58edc109ad9a92d

>
> jeff


[PATCH] RISC-V: Fix infinite compilation of VSETVL PASS

2024-02-05 Thread Juzhe-Zhong
This patch fixes the issue reported by Jeff.

Testing is in progress.  OK for trunk if it passes with no regressions?

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pre_vsetvl::emit_vsetvl): Fix infinite
compilation.
(pre_vsetvl::remove_vsetvl_pre_insns): Ditto.

---
 gcc/config/riscv/riscv-vsetvl.cc | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 2c0dcdf18c5..32f262de199 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -2281,9 +2281,8 @@ private:
   }
   }
 
-  void remove_vsetvl_insn (const vsetvl_info &info)
+  void remove_vsetvl_insn (rtx_insn *rinsn)
   {
-rtx_insn *rinsn = info.get_insn ()->rtl ();
 if (dump_file)
   {
fprintf (dump_file, "  Eliminate insn %d:\n", INSN_UID (rinsn));
@@ -3231,7 +3230,7 @@ pre_vsetvl::emit_vsetvl ()
  if (curr_info.delete_p ())
{
  if (vsetvl_insn_p (insn->rtl ()))
-   remove_vsetvl_insn (curr_info);
+   remove_vsetvl_insn (curr_info.get_insn ()->rtl ());
  continue;
}
  else if (curr_info.valid_p ())
@@ -3269,7 +3268,7 @@ pre_vsetvl::emit_vsetvl ()
   for (const vsetvl_info &item : m_delete_list)
 {
   gcc_assert (vsetvl_insn_p (item.get_insn ()->rtl ()));
-  remove_vsetvl_insn (item);
+  remove_vsetvl_insn (item.get_insn ()->rtl ());
 }
 
   /* Insert vsetvl info that was not deleted after lift up.  */
@@ -3434,7 +3433,7 @@ pre_vsetvl::remove_vsetvl_pre_insns ()
   INSN_UID (rinsn));
  print_rtl_single (dump_file, rinsn);
}
- remove_insn (rinsn);
+ remove_vsetvl_insn (rinsn);
}
 }
 
-- 
2.36.3



[PATCH v2] openmp, fortran: Add Fortran support for indirect clause on the declare target directive

2024-02-05 Thread Kwok Cheung Yeung

Hi

As previously discussed, this version of the patch adds code to emit a 
warning when a directive like this:


!$omp declare target indirect(.true.)

is encountered (i.e. a target directive containing at least one clause, 
but no to/enter clause, which appears to violate the OpenMP standard). A 
test is also added to gfortran.dg/gomp/declare-target-indirect-1.f90 to 
test for this.


I have also added a declare-target-indirect-3.f90 test to libgomp to 
check that procedures passed via a dummy argument work properly when 
used in an indirect call.


Okay for mainline?

Thanks

Kwok
From f6662a7bc76d400fecb5013ad6d6ab3b00b8a6e7 Mon Sep 17 00:00:00 2001
From: Kwok Cheung Yeung 
Date: Mon, 5 Feb 2024 20:31:49 +
Subject: [PATCH] openmp, fortran: Add Fortran support for indirect clause on
 the declare target directive

2024-02-05  Kwok Cheung Yeung  

gcc/fortran/
* dump-parse-tree.cc (show_attr): Handle omp_declare_target_indirect
attribute.
* f95-lang.cc (gfc_gnu_attributes): Add entry for 'omp declare
target indirect'.
* gfortran.h (symbol_attribute): Add omp_declare_target_indirect
field.
(struct gfc_omp_clauses): Add indirect field.
* openmp.cc (omp_mask2): Add OMP_CLAUSE_INDIRECT.
(gfc_match_omp_clauses): Match indirect clause.
(OMP_DECLARE_TARGET_CLAUSES): Add OMP_CLAUSE_INDIRECT.
(gfc_match_omp_declare_target): Check omp_device_type and apply
omp_declare_target_indirect attribute to symbol if indirect clause
active.  Show warning if there are only device_type and/or indirect
clauses on the directive.
* trans-decl.cc (add_attributes_to_decl): Add 'omp declare target
indirect' attribute if symbol has indirect attribute set.

gcc/testsuite/
* gfortran.dg/gomp/declare-target-4.f90 (f1): Update expected warning.
* gfortran.dg/gomp/declare-target-indirect-1.f90: New.
* gfortran.dg/gomp/declare-target-indirect-2.f90: New.

libgomp/
* testsuite/libgomp.fortran/declare-target-indirect-1.f90: New.
* testsuite/libgomp.fortran/declare-target-indirect-2.f90: New.
* testsuite/libgomp.fortran/declare-target-indirect-3.f90: New.
---
 gcc/fortran/dump-parse-tree.cc|  2 +
 gcc/fortran/f95-lang.cc   |  2 +
 gcc/fortran/gfortran.h|  3 +-
 gcc/fortran/openmp.cc | 50 ++-
 gcc/fortran/trans-decl.cc |  4 ++
 .../gfortran.dg/gomp/declare-target-4.f90 |  2 +-
 .../gomp/declare-target-indirect-1.f90| 62 +++
 .../gomp/declare-target-indirect-2.f90| 25 
 .../declare-target-indirect-1.f90 | 39 
 .../declare-target-indirect-2.f90 | 53 
 .../declare-target-indirect-3.f90 | 25 
 11 files changed, 262 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/declare-target-indirect-1.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/declare-target-indirect-2.f90
 create mode 100644 
libgomp/testsuite/libgomp.fortran/declare-target-indirect-1.f90
 create mode 100644 
libgomp/testsuite/libgomp.fortran/declare-target-indirect-2.f90
 create mode 100644 
libgomp/testsuite/libgomp.fortran/declare-target-indirect-3.f90

diff --git a/gcc/fortran/dump-parse-tree.cc b/gcc/fortran/dump-parse-tree.cc
index 1563b810b98..7b154eb3ca7 100644
--- a/gcc/fortran/dump-parse-tree.cc
+++ b/gcc/fortran/dump-parse-tree.cc
@@ -914,6 +914,8 @@ show_attr (symbol_attribute *attr, const char * module)
 fputs (" OMP-DECLARE-TARGET", dumpfile);
   if (attr->omp_declare_target_link)
 fputs (" OMP-DECLARE-TARGET-LINK", dumpfile);
+  if (attr->omp_declare_target_indirect)
+fputs (" OMP-DECLARE-TARGET-INDIRECT", dumpfile);
   if (attr->elemental)
 fputs (" ELEMENTAL", dumpfile);
   if (attr->pure)
diff --git a/gcc/fortran/f95-lang.cc b/gcc/fortran/f95-lang.cc
index 358cb17fce2..67fda27aa3e 100644
--- a/gcc/fortran/f95-lang.cc
+++ b/gcc/fortran/f95-lang.cc
@@ -96,6 +96,8 @@ static const attribute_spec gfc_gnu_attributes[] =
 gfc_handle_omp_declare_target_attribute, NULL },
   { "omp declare target link", 0, 0, true,  false, false, false,
 gfc_handle_omp_declare_target_attribute, NULL },
+  { "omp declare target indirect", 0, 0, true,  false, false, false,
+gfc_handle_omp_declare_target_attribute, NULL },
   { "oacc function", 0, -1, true,  false, false, false,
 gfc_handle_omp_declare_target_attribute, NULL },
 };
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index fd73e4ce431..fd843a3241d 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -999,6 +999,7 @@ typedef struct
   /* Mentioned in OMP DECLARE TARGET.  */
   unsigned omp_declare_target:1;
   unsigned omp_declare_target_link:1;
+  unsigned omp_declare_target_indirect:1;
   ENUM_BITFIELD 

[pushed] c++: -frounding-math test [PR109359]

2024-02-05 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

This test was fixed by the patch for PR95226, but that patch had no
testcase so let's add this one.

PR c++/109359

gcc/testsuite/ChangeLog:

* g++.dg/ext/frounding-math1.C: New test.
---
 gcc/testsuite/g++.dg/ext/frounding-math1.C | 8 
 1 file changed, 8 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/frounding-math1.C

diff --git a/gcc/testsuite/g++.dg/ext/frounding-math1.C 
b/gcc/testsuite/g++.dg/ext/frounding-math1.C
new file mode 100644
index 000..ecc46fd6017
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/frounding-math1.C
@@ -0,0 +1,8 @@
+// PR c++/109359
+// { dg-additional-options -frounding-math }
+
+// For a while we were emitting two doubles (4 .long directives) as the value
+// of a float array; it should only be two .longs.
+
+// { dg-final { scan-assembler-times "long" 2 { target x86_64-*-* } } }
+float xs[] = {0.001914, 0.630539};

base-commit: c7e8381748f78335e9fef23f363b6a9e4463ce7e
-- 
2.43.0



[pushed] c++: prvalue of array type [PR111286]

2024-02-05 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

Here we want to build a prvalue array to bind to the T reference, but we
were wrongly trying to strip cv-quals from the array prvalue, which should
be treated the same as a class prvalue.

PR c++/111286

gcc/cp/ChangeLog:

* tree.cc (rvalue): Don't drop cv-quals from an array.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist-array22.C: New test.
---
 gcc/cp/tree.cc|  9 +
 gcc/testsuite/g++.dg/cpp0x/initlist-array22.C | 12 
 2 files changed, 17 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/initlist-array22.C

diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
index 5c8c05dc168..50dc345ec9a 100644
--- a/gcc/cp/tree.cc
+++ b/gcc/cp/tree.cc
@@ -977,11 +977,12 @@ rvalue (tree expr)
 
   expr = mark_rvalue_use (expr);
 
-  /* [basic.lval]
-
- Non-class rvalues always have cv-unqualified types.  */
+  /* [expr.type]: "If a prvalue initially has the type "cv T", where T is a
+ cv-unqualified non-class, non-array type, the type of the expression is
+ adjusted to T prior to any further analysis.  */
   type = TREE_TYPE (expr);
-  if (!CLASS_TYPE_P (type) && cv_qualified_p (type))
+  if (!CLASS_TYPE_P (type) && TREE_CODE (type) != ARRAY_TYPE
+  && cv_qualified_p (type))
 type = cv_unqualified (type);
 
   /* We need to do this for rvalue refs as well to get the right answer
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist-array22.C 
b/gcc/testsuite/g++.dg/cpp0x/initlist-array22.C
new file mode 100644
index 000..8629e4be239
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist-array22.C
@@ -0,0 +1,12 @@
+// PR c++/111286
+// { dg-do compile { target c++11 } }
+// { dg-additional-options -Wno-unused }
+
+struct A {
+  A() noexcept {}
+};
+
+void foo() {
+  using T = const A (&)[1];
+  T{};
+}

base-commit: f1412546ac8999b7f6e8cf967ce3f31794c2
-- 
2.43.0



Re: [PATCH] aarch64, acle header: Cast uint64_t pointers to DIMode.

2024-02-05 Thread Iain Sandoe



> On 5 Feb 2024, at 14:56, Iain Sandoe  wrote:
> 
> Tested on aarch64-linux,darwin and a cross from aarch64-darwin to linux,
> OK for trunk, or some alternative is needed?

Hmm.. apparently, this fails the linaro pre-commit CI for g++ with:
error: invalid conversion from 'long int*' to 'long unsigned int*' 
[-fpermissive]

So, I guess some alternative is needed, advice welcome,
Iain

> thanks
> Iain
> 
> --- 8< ---
> 
> Currently, most of the acle tests fail on the Darwin port because
> DI mode is "long" and uint64 is "long long".  The fix for this used
> in other headers is to cast the pointers using __builtin_aarch64_simd_di
> and that is what this patch does.
> 
> gcc/ChangeLog:
> 
>   * config/aarch64/arm_acle.h (__rndr): Cast uint64 pointer to DI
>   mode to avoid typedef mismatches.
>   (__rndrrs): Likewise.
> ---
> gcc/config/aarch64/arm_acle.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/config/aarch64/arm_acle.h b/gcc/config/aarch64/arm_acle.h
> index 2aa681090fa..823f87187b1 100644
> --- a/gcc/config/aarch64/arm_acle.h
> +++ b/gcc/config/aarch64/arm_acle.h
> @@ -309,14 +309,14 @@ __extension__ extern __inline int
> __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> __rndr (uint64_t *__res)
> {
> -  return __builtin_aarch64_rndr (__res);
> +  return __builtin_aarch64_rndr ((__builtin_aarch64_simd_di *) __res);
> }
> 
> __extension__ extern __inline int
> __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> __rndrrs (uint64_t *__res)
> {
> -  return __builtin_aarch64_rndrrs (__res);
> +  return __builtin_aarch64_rndrrs ((__builtin_aarch64_simd_di *) __res);
> }
> 
> #pragma GCC pop_options
> -- 
> 2.39.2 (Apple Git-143)
> 



New Chinese (simplified) PO file for 'gcc' (version 13.2.0)

2024-02-05 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Chinese (simplified) team of translators.  The file is available at:

https://translationproject.org/latest/gcc/zh_CN.po

(This file, 'gcc-13.2.0.zh_CN.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: [PATCH] libstdc++: /dev/null is not accessible on Windows

2024-02-05 Thread Jonathan Wakely
On Mon, 5 Feb 2024, 19:07 Torbjörn SVENSSON, 
wrote:

> Ok for trunk and releases/gcc-13?
>

OK, thanks


> ---
>
> When running the DejaGNU testsuite on a toolchain built for native
> Windows, the path /dev/null can't be used to open a stream to void.
> On native Windows, the resource is instead named "nul".
>
> In 17_intro/tag_type_explicit_ctor.cc, the following statement would
> fail to match when the DejaGNU testsuite is running in cygwin with a
> native toolchain.
> // dg-error 53 "explicit" "" { target hosted }
>
> The "target hosted"-check is using cpp to verify if _GLIBCXX_HOSTED is
> defined and discards the output by simply redirecting it to /dev/null.
> In v3_target_compile, it's overridden to "nul" for MinGW targets, but
> the same rule applies when host is cygwin, so replace the condition
> with a check for Windows.
>
> The error in the log would look like this for the "target hosted" check:
> cc1plus.exe: fatal error: opening output file /dev/null: No such file or
> directory
>
> The tag_type_explicit_ctor.cc test fails with this on Windows:
> .../tag_type_explicit_ctor.cc:53: error: converting to 'std::defer_lock_t'
> from initializer list would use explicit constructor 'constexpr
> std::defer_lock_t::defer_lock_t()'
> .../tag_type_explicit_ctor.cc:54: error: converting to
> 'std::try_to_lock_t' from initializer list would use explicit constructor
> 'constexpr std::try_to_lock_t::try_to_lock_t()'
> .../tag_type_explicit_ctor.cc:55: error: converting to
> 'std::try_to_lock_t' from initializer list would use explicit constructor
> 'constexpr std::try_to_lock_t::try_to_lock_t()'
> .../tag_type_explicit_ctor.cc:67: error: converting to 'std::defer_lock_t'
> from initializer list would use explicit constructor 'constexpr
> std::defer_lock_t::defer_lock_t()'
> .../tag_type_explicit_ctor.cc:68: error: converting to
> 'std::try_to_lock_t' from initializer list would use explicit constructor
> 'constexpr std::try_to_lock_t::try_to_lock_t()'
> .../tag_type_explicit_ctor.cc:69: error: converting to 'std::adopt_lock_t'
> from initializer list would use explicit constructor 'constexpr
> std::adopt_lock_t::adopt_lock_t()'
>
> Patch has been verified on Windows and Linux.
>
> gcc/testsuite:
>
> * testsuite/lib/libstdc++.exp: Use "nul" for Windows,
>   "/dev/null" for other environments.
>
> Signed-off-by: Torbjörn SVENSSON 
> ---
>  libstdc++-v3/testsuite/lib/libstdc++.exp | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/libstdc++-v3/testsuite/lib/libstdc++.exp
> b/libstdc++-v3/testsuite/lib/libstdc++.exp
> index 24d1b43f11b..58804ecab26 100644
> --- a/libstdc++-v3/testsuite/lib/libstdc++.exp
> +++ b/libstdc++-v3/testsuite/lib/libstdc++.exp
> @@ -615,11 +615,14 @@ proc v3_target_compile { source dest type options } {
> }
>  }
>
> -# Small adjustment for MinGW hosts.
> -if { $dest == "/dev/null" && [ishost "*-*-mingw*"] } {
> +# Small adjustment for Windows hosts.
> +if { $dest == "/dev/null"
> + && [info exists ::env(OS)] && [string match "Windows*"
> $::env(OS)] } {
> if { $type == "executable" } {
> set dest "x.exe"
> } else {
> +   # Windows uses special file named "nul" as a substitute for
> +   # /dev/null
> set dest "nul"
> }
>  }
> --
> 2.25.1
>
>


libgo patch committed: Bump version number

2024-02-05 Thread Ian Lance Taylor
This libgo patch bumps the version number for the GCC 14 release.
This is for GCC PR 113668.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
7b0597eba6b29387b56b8d6a4b38f3586e6b49a5
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index ec7e2ab1acf..73cb095322c 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-1cb83a415e86ab4de0d436d277377d8fc060cb61
+e15a14e410b8fc5d28012d5b313cb6c8476c7df9
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/configure.ac b/libgo/configure.ac
index 22158ac7f5d..898091276f7 100644
--- a/libgo/configure.ac
+++ b/libgo/configure.ac
@@ -10,7 +10,7 @@ AC_INIT(package-unused, version-unused,, libgo)
 AC_CONFIG_SRCDIR(Makefile.am)
 AC_CONFIG_HEADER(config.h)
 
-libtool_VERSION=22:0:0
+libtool_VERSION=23:0:0
 AC_SUBST(libtool_VERSION)
 
 AM_ENABLE_MULTILIB(, ..)


Go frontend patch committed: print types in a more readable way

2024-02-05 Thread Ian Lance Taylor
This patch to the Go frontend adds Type::message_name to print types
in ways that make sense to users.  As we move toward generics, the
error messages need to be able to refer to types in a readable manner.
Today we use this new feature in AST dumps.  Bootstrapped and ran Go
testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
3818237cd5111fdd089f9c9470d384eebbe6ee1e
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 429904a2b8f..ec7e2ab1acf 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-8c056e335cecec67d1d223a329b7ba4dac778a65
+1cb83a415e86ab4de0d436d277377d8fc060cb61
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/ast-dump.cc b/gcc/go/gofrontend/ast-dump.cc
index eca0bf1fad2..12f49e68700 100644
--- a/gcc/go/gofrontend/ast-dump.cc
+++ b/gcc/go/gofrontend/ast-dump.cc
@@ -223,14 +223,7 @@ Ast_dump_context::dump_type(const Type* t)
   if (t == NULL)
 this->ostream() << "(nil type)";
   else
-// FIXME: write a type pretty printer instead of
-// using mangled names.
-if (this->gogo_ != NULL)
-  {
-   Backend_name bname;
-   t->backend_name(this->gogo_, &bname);
-   this->ostream() << "(" << bname.name() << ")";
-  }
+this->ostream() << "(" << t->message_name() << ")";
 }
 
 // Dump a textual representation of a block to the
diff --git a/gcc/go/gofrontend/types.cc b/gcc/go/gofrontend/types.cc
index b349ad10d6f..a39cfbf7679 100644
--- a/gcc/go/gofrontend/types.cc
+++ b/gcc/go/gofrontend/types.cc
@@ -270,6 +270,16 @@ Type::set_is_error()
   this->classification_ = TYPE_ERROR;
 }
 
+// Return a string version of this type to use in an error message.
+
+std::string
+Type::message_name() const
+{
+  std::string ret;
+  this->do_message_name(&ret);
+  return ret;
+}
+
 // If this is a pointer type, return the type to which it points.
 // Otherwise, return NULL.
 
@@ -742,16 +752,14 @@ Type::are_assignable(const Type* lhs, const Type* rhs, 
std::string* reason)
 {
   if (rhs->interface_type() != NULL)
reason->assign(_("need explicit conversion"));
-  else if (lhs_orig->named_type() != NULL
-  && rhs_orig->named_type() != NULL)
+  else
{
- size_t len = (lhs_orig->named_type()->name().length()
-   + rhs_orig->named_type()->name().length()
-   + 100);
+ const std::string& lhs_name(lhs_orig->message_name());
+ const std::string& rhs_name(rhs_orig->message_name());
+ size_t len = lhs_name.length() + rhs_name.length() + 100;
  char* buf = new char[len];
  snprintf(buf, len, _("cannot use type %s as type %s"),
-  rhs_orig->named_type()->message_name().c_str(),
-  lhs_orig->named_type()->message_name().c_str());
+  rhs_name.c_str(), lhs_name.c_str());
  reason->assign(buf);
  delete[] buf;
}
@@ -4244,6 +4252,33 @@ Integer_type::is_identical(const Integer_type* t) const
   return this->is_abstract_ == t->is_abstract_;
 }
 
+// Message name.
+
+void
+Integer_type::do_message_name(std::string* ret) const
+{
+  ret->append("is_byte_)
+ret->append("byte");
+  else if (this->is_rune_)
+ret->append("rune");
+  else
+{
+  if (this->is_unsigned_)
+   ret->push_back('u');
+  if (this->is_abstract_)
+   ret->append("int");
+  else
+   {
+ ret->append("int");
+ char buf[10];
+ snprintf(buf, sizeof buf, "%d", this->bits_);
+ ret->append(buf);
+   }
+}
+  ret->push_back('>');
+}
+
 // Hash code.
 
 unsigned int
@@ -4382,6 +4417,21 @@ Float_type::is_identical(const Float_type* t) const
   return this->is_abstract_ == t->is_abstract_;
 }
 
+// Message name.
+
+void
+Float_type::do_message_name(std::string* ret) const
+{
+  ret->append("is_abstract_)
+{
+  char buf[10];
+  snprintf(buf, sizeof buf, "%d", this->bits_);
+  ret->append(buf);
+}
+  ret->push_back('>');
+}
+
 // Hash code.
 
 unsigned int
@@ -4496,6 +4546,21 @@ Complex_type::is_identical(const Complex_type *t) const
   return this->is_abstract_ == t->is_abstract_;
 }
 
+// Message name.
+
+void
+Complex_type::do_message_name(std::string* ret) const
+{
+  ret->append("is_abstract_)
+{
+  char buf[10];
+  snprintf(buf, sizeof buf, "%d", this->bits_);
+  ret->append(buf);
+}
+  ret->push_back('>');
+}
+
 // Hash code.
 
 unsigned int
@@ -4661,6 +4726,10 @@ class Sink_type : public Type
   { }
 
  protected:
+  void
+  do_message_name(std::string* ret) const
+  { ret->append(""); }
+
   bool
   do_compare_is_identity(Gogo*)
   { return false; }
@@ -4696,6 +4765,70 @@ Type::make_sink_type()
 
 // Class Function_type.
 
+// Message name.
+
+void
+Function_type::do_message_name(std::string* ret) const
+{
+  ret->append("func");
+  if (this->receiver_ != NULL)
+{
+  

Re: [PATCH] RISC-V: Expand VLMAX scalar move in reduction

2024-02-05 Thread Jeff Law




On 2/4/24 23:37, juzhe.zh...@rivai.ai wrote:

I think it just triggers a latent bug that we hadn't encountered before.

Hi, Robin. Would you mind giving me a preprocessed file to reproduce the issue?

I suspect it triggers a latent bug in the VSETVL pass.
So it looks like vsetvl has made a transformation that makes DCE go into 
a loop.  At least that's my first impression after attaching to a hung 
build.  The good news is I was able to trigger it without LTO.  I'll 
send the relevant info separately so as not to spam everyone with the 
testcase :-)


jeff


Re: [PATCH v6] x86-64: Find a scratch register for large model profiling

2024-02-05 Thread H.J. Lu
On Mon, Feb 5, 2024 at 10:01 AM Uros Bizjak  wrote:
>
> On Mon, Feb 5, 2024 at 5:43 PM H.J. Lu  wrote:
> >
> > Changes in v6:
> >
> > 1. Use ix86_save_reg and accessible_reg_set in
> > x86_64_select_profile_regnum.
> > 2. Construct a complete reg name in x86_function_profiler.
> >
> > Changes in v5:
> >
> > 1. Add pr113689-3.c.
> > 2. Use %r10 if ix86_profile_before_prologue () return true.
> > 3. Try a callee-saved register which has been saved on stack in the
> > prologue.
> >
> > Changes in v4:
> >
> > 1. Remove pr113689-3.c.
> > 2. Use df_get_live_out.
> >
> > Changes in v3:
> >
> > 1. Remove r10_ok.
> >
> > Changes in v2:
> >
> > 1. Add int_parameter_registers to machine_function to track integer
> > registers used for parameter passing.
> > 2. Update x86_64_select_profile_regnum to try %r10 first and use an
> > caller-saved register, which isn't used for parameter passing.
> >
> > ---
> > 2 scratch registers, %r10 and %r11, are available at function entry for
> > large model profiling.  But %r10 may be used by stack realignment and we
> > can't use %r10 in this case.  Add x86_64_select_profile_regnum to find
> > a caller-saved register which isn't live or a callee-saved register
> > which has been saved on stack in the prologue at entry for large model
> > profiling and sorry if we can't find one.
> >
> > gcc/
> >
> > PR target/113689
> > * config/i386/i386.cc (x86_64_select_profile_regnum): New.
> > (x86_function_profiler): Call x86_64_select_profile_regnum to
> > get a scratch register for large model profiling.
> >
> > gcc/testsuite/
> >
> > PR target/113689
> > * gcc.target/i386/pr113689-1.c: New file.
> > * gcc.target/i386/pr113689-2.c: Likewise.
> > * gcc.target/i386/pr113689-3.c: Likewise.
> > ---
> >  gcc/config/i386/i386.cc| 91 ++
> >  gcc/testsuite/gcc.target/i386/pr113689-1.c | 49 
> >  gcc/testsuite/gcc.target/i386/pr113689-2.c | 41 ++
> >  gcc/testsuite/gcc.target/i386/pr113689-3.c | 48 
> >  4 files changed, 214 insertions(+), 15 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr113689-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr113689-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr113689-3.c
> >
> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > index b3e7c74846e..08aad32af85 100644
> > --- a/gcc/config/i386/i386.cc
> > +++ b/gcc/config/i386/i386.cc
> > @@ -22749,6 +22749,48 @@ current_fentry_section (const char **name)
> >return true;
> >  }
> >
> > +/* Return a caller-saved register which isn't live or a callee-saved
> > +   register which has been saved on stack in the prologue at entry for
> > +   profile.  */
> > +
> > +static int
> > +x86_64_select_profile_regnum (bool r11_ok ATTRIBUTE_UNUSED)
> > +{
> > +  /* Use %r10 if the profiler is emitted before the prologue or it isn't
> > + used by DRAP.  */
> > +  if (ix86_profile_before_prologue ()
> > +  || !crtl->drap_reg
> > +  || REGNO (crtl->drap_reg) != R10_REG)
> > +return R10_REG;
> > +
> > +  /* The profiler is emitted after the prologue.  If there is a
> > + caller-saved register which isn't live or a callee-saved
> > + register saved on stack in the prologue, use it.  */
> > +
> > +  bitmap reg_live = df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun));
> > +
> > +  int i;
> > +  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
> > +if (GENERAL_REGNO_P (i)
> > +   && i != R10_REG
> > +#ifdef NO_PROFILE_COUNTERS
> > +   && (r11_ok || i != R11_REG)
> > +#else
> > +   && i != R11_REG
> > +#endif
> > +   && TEST_HARD_REG_BIT (accessible_reg_set, i)
> > +   && !fixed_regs[i]
> > +   && (ix86_save_reg (i, true, true)
> > +   || (call_used_regs[i]
> > +   && !REGNO_REG_SET_P (reg_live, i))))
> > +  return i;
>
> ix86_save_reg will never save fixed regs, so the above can be optimized a bit:
>
>    && TEST_HARD_REG_BIT (accessible_reg_set, i)
>    && (ix86_save_reg (i, true, true)
>        || (call_used_regs[i] && !fixed_regs[i]
>            && !REGNO_REG_SET_P (reg_live, i)))
>
> OK with the above change.

Fixed.  This is the patch I am checking in.

Thanks.

> Thanks,
> Uros.
>


-- 
H.J.
From dc3fe511a13bccf510793c22e6ba7a0cc0b9c1f6 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Thu, 1 Feb 2024 08:02:27 -0800
Subject: [PATCH] x86-64: Find a scratch register for large model profiling

2 scratch registers, %r10 and %r11, are available at function entry for
large model profiling.  But %r10 may be used by stack realignment and we
can't use %r10 in this case.  Add x86_64_select_profile_regnum to find
a caller-saved register which isn't live or a callee-saved register
which has been saved on stack in the prologue at entry for large model
profiling and sorry if we can't find one.

gcc/

	PR target/113689
	* config/i386/i386.cc 

[PATCH] libstdc++: /dev/null is not accessible on Windows

2024-02-05 Thread Torbjörn SVENSSON
Ok for trunk and releases/gcc-13?

---

When running the DejaGNU testsuite on a toolchain built for native
Windows, the path /dev/null can't be used to discard output.
On native Windows, the null device is instead named "nul".

In 17_intro/tag_type_explicit_ctor.cc, the following directive would
fail to match when the DejaGNU testsuite is run under Cygwin with a
native toolchain:
// dg-error 53 "explicit" "" { target hosted }

The "target hosted" check uses cpp to verify whether _GLIBCXX_HOSTED is
defined and discards the output by redirecting it to /dev/null.
In v3_target_compile, the destination is overridden to "nul" for MinGW
hosts, but the same rule applies when the host is Cygwin, so replace the
condition with a check for Windows.
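
The host check described above can be sketched in C++ as follows. This is
only an illustration of the rule, not the testsuite code (which is Tcl).
The helper name null_device_for is hypothetical, and the sketch assumes, as
the patch does, that Windows sets the OS environment variable to a value
starting with "Windows".

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: pick the destination used to discard compiler
// output.  os_env is the value of the OS environment variable, or
// nullptr when the variable is unset.
std::string null_device_for(const char *os_env, bool executable)
{
  // Approximates the Tcl test [string match "Windows*" $::env(OS)].
  bool windows = os_env != nullptr
                 && std::strncmp(os_env, "Windows", 7) == 0;
  if (!windows)
    return "/dev/null";
  // Executables are linked to a real file name; other outputs go to "nul".
  return executable ? "x.exe" : "nul";
}
```

Keying on the environment rather than on the host triplet is what lets the
same branch cover both MinGW and Cygwin hosts driving a native toolchain.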

The error in the log would look like this for the "target hosted" check:
cc1plus.exe: fatal error: opening output file /dev/null: No such file or directory

The tag_type_explicit_ctor.cc test fails with this on Windows:
.../tag_type_explicit_ctor.cc:53: error: converting to 'std::defer_lock_t' from initializer list would use explicit constructor 'constexpr std::defer_lock_t::defer_lock_t()'
.../tag_type_explicit_ctor.cc:54: error: converting to 'std::try_to_lock_t' from initializer list would use explicit constructor 'constexpr std::try_to_lock_t::try_to_lock_t()'
.../tag_type_explicit_ctor.cc:55: error: converting to 'std::try_to_lock_t' from initializer list would use explicit constructor 'constexpr std::try_to_lock_t::try_to_lock_t()'
.../tag_type_explicit_ctor.cc:67: error: converting to 'std::defer_lock_t' from initializer list would use explicit constructor 'constexpr std::defer_lock_t::defer_lock_t()'
.../tag_type_explicit_ctor.cc:68: error: converting to 'std::try_to_lock_t' from initializer list would use explicit constructor 'constexpr std::try_to_lock_t::try_to_lock_t()'
.../tag_type_explicit_ctor.cc:69: error: converting to 'std::adopt_lock_t' from initializer list would use explicit constructor 'constexpr std::adopt_lock_t::adopt_lock_t()'

Patch has been verified on Windows and Linux.

gcc/testsuite:

* testsuite/lib/libstdc++.exp: Use "nul" for Windows,
  "/dev/null" for other environments.

Signed-off-by: Torbjörn SVENSSON 
---
 libstdc++-v3/testsuite/lib/libstdc++.exp | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/testsuite/lib/libstdc++.exp b/libstdc++-v3/testsuite/lib/libstdc++.exp
index 24d1b43f11b..58804ecab26 100644
--- a/libstdc++-v3/testsuite/lib/libstdc++.exp
+++ b/libstdc++-v3/testsuite/lib/libstdc++.exp
@@ -615,11 +615,14 @@ proc v3_target_compile { source dest type options } {
}
 }
 
-# Small adjustment for MinGW hosts.
-if { $dest == "/dev/null" && [ishost "*-*-mingw*"] } {
+# Small adjustment for Windows hosts.
+if { $dest == "/dev/null"
+ && [info exists ::env(OS)] && [string match "Windows*" $::env(OS)] } {
if { $type == "executable" } {
set dest "x.exe"
} else {
+   # Windows uses special file named "nul" as a substitute for
+   # /dev/null
set dest "nul"
}
 }
-- 
2.25.1



Re: [PATCH] Fix disabling of year 2038 support on 32-bit hosts by default

2024-02-05 Thread Andrew Pinski
On Mon, Feb 5, 2024 at 10:40 AM Thiago Jung Bauermann
 wrote:
>
>
> Thiago Jung Bauermann  writes:
>
> > Hello Luis,
> >
> > Luis Machado  writes:
> >>
> >> Approved-By: Luis Machado 
> >
> > Thanks! Since this is a patch for the repository top-level, is your
> > approval sufficient to commit the patch, or should I have approval from
> > a binutils maintainer as well?
>
> Answering my own question: binutils/MAINTAINERS says:
>
>   GDB global maintainers also have permission to commit and approve
>   patches to the top level files and to those parts of bfd files
>   primarily used by GDB.
>
> So pushed as commit 9c0aa4c53104.


Please also submit/commit this to the gcc trunk, since the toplevel
configure should be in sync between the two repos.

Thanks,
Andrew

>
> --
> Thiago


Re: [COMMITTED V3 1/4] RISC-V: Add non-vector types to dfa pipelines

2024-02-05 Thread Edwin Lu

On 2/2/2024 11:10 PM, Li, Pan2 wrote:

Hi Edwin


I believe the only problematic failures are the 5 vls calling convention
ones where only 24 ld\\s+a[0-1],\\s*[0-9]+\\(sp\\) are found.


Does this "only 24" come from calling-convention-1.c?


Oops sorry about that. I said I would include all the 7 failures and 
ended up not doing that. The failures are here
FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-1.c -O3 
-ftree-vectorize --param riscv-autovec-preference=scalable 
scan-assembler-times ld\\s+a[0-1],\\s*[0-9]+\\(sp\\) 35
FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-2.c -O3 
-ftree-vectorize --param riscv-autovec-preference=scalable 
scan-assembler-times ld\\s+a[0-1],\\s*[0-9]+\\(sp\\) 33
FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-3.c -O3 
-ftree-vectorize --param riscv-autovec-preference=scalable 
scan-assembler-times ld\\s+a[0-1],\\s*[0-9]+\\(sp\\) 31
FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-4.c -O3 
-ftree-vectorize --param riscv-autovec-preference=scalable 
scan-assembler-times ld\\s+a[0-1],\\s*[0-9]+\\(sp\\) 29
FAIL: gcc.target/riscv/rvv/autovec/vls/calling-convention-7.c -O3 
-ftree-vectorize --param riscv-autovec-preference=scalable 
scan-assembler-times ld\\s+a[0-1],\\s*[0-9]+\\(sp\\) 29


These all have the problem of only 24 ld\\s+a[0-1],\\s*[0-9]+\\(sp\\) 
being found. So that is calling-conventions 1, 2, 3, 4, 7 with only 24 
matching RE.


FAIL: gcc.target/riscv/rvv/base/vcreate.c scan-assembler-times 
vmv1r.v\\s+v[0-9]+,\\s*v[0-9]+ 24 <-- found 36 times
FAIL: gcc.target/riscv/rvv/base/vcreate.c scan-assembler-times 
vmv2r.v\\s+v[0-9]+,\\s*v[0-9]+ 12 <-- found 28 times
FAIL: gcc.target/riscv/rvv/base/vcreate.c scan-assembler-times 
vmv4r.v\\s+v[0-9]+,\\s*v[0-9]+ 16 <-- found 19 times


These find more vmv's than expected.

FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-107.c   -O2 
scan-assembler-times vsetvli\\tzero,zero,e32,m1,t[au],m[au] 1 <-- found 
0 times
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-107.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none   scan-assembler-times 
vsetvli\\tzero,zero,e32,m1,t[au],m[au] 1 <-- found 0 times
FAIL: gcc.target/riscv/rvv/vsetvl/avl_single-107.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects   scan-assembler-times 
vsetvli\\tzero,zero,e32,m1,t[au],m[au] 1 <-- found 0 times


These failures are from vsetvli zero,a0,e2,m1,ta,ma being found instead. 
I believe these should be fine.





This is what I'm getting locally (first instance of wrong match):
v32qi_RET1_ARG8:
.LFB109:


V32qi passes its args by reference instead of in GPR(s), so this is expected. I 
think we need to diff the asm code before and after the patch for the whole 
test file.
The RE "ld\\s+a[0-1],\\s*[0-9]+\\(sp\\)" is meant to check that VLS-mode values 
are returned in a[0-1].



I've been using this https://godbolt.org/z/vdxTY3rc7 (calling convention 
1) as my comparison to what I have compiled locally (included as 
attachment). From what I see, the differences, aside from reordering due 
to latency, are that the ld insns use a5 (for 32-512) or t4 (for 
1024-2048) or t5 (for 4096) for ARG8 and ARG9. Is there something else 
that I might be missing?
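
To make the counting concrete, here is a minimal sketch that applies the same
pattern to a few lines of assembly. It uses C++ std::regex (ECMAScript
syntax) as an approximation of the Tcl regexp engine that DejaGnu uses; the
function name and the sample input in the note below are hypothetical.

```cpp
#include <iterator>
#include <regex>
#include <string>

// Count non-overlapping matches of the scan-assembler-times pattern
// (written here without the extra Tcl-level backslashes).
int count_ld_a0_a1(const std::string &assembly)
{
  static const std::regex pattern(R"(ld\s+a[0-1],\s*[0-9]+\(sp\))");
  auto first = std::sregex_iterator(assembly.begin(), assembly.end(), pattern);
  auto last = std::sregex_iterator();
  return static_cast<int>(std::distance(first, last));
}
```

Fed the four lines "ld a0,8(sp)", "ld a5,16(sp)", "ld a1,0(sp)" and
"ld t4,24(sp)", only the loads into a0 and a1 are counted, so a load that
moves to a5, t4 or t5 lowers the total without any change in the number of
ld instructions.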


Edwin

.file   "calling-convention-1.c"
.option nopic
.attribute arch, 
"rv64i2p1_m2p0_a2p1_f2p2_d2p2_c2p0_v1p0_zicsr2p0_zifencei2p0_zve32f1p0_zve32x1p0_zve64d1p0_zve64f1p0_zve64x1p0_zvl1024b1p0_zvl128b1p0_zvl2048b1p0_zvl256b1p0_zvl32b1p0_zvl4096b1p0_zvl512b1p0_zvl64b1p0"
.attribute unaligned_access, 0
.attribute stack_align, 16
.text
.align  1
.globl  v1qi_RET1_ARG0
.type   v1qi_RET1_ARG0, @function
v1qi_RET1_ARG0:
.LFB0:
.cfi_startproc
li  a0,0
ret
.cfi_endproc
.LFE0:
.size   v1qi_RET1_ARG0, .-v1qi_RET1_ARG0
.align  1
.globl  v2qi_RET1_ARG0
.type   v2qi_RET1_ARG0, @function
v2qi_RET1_ARG0:
.LFB1:
.cfi_startproc
li  a0,0
ret
.cfi_endproc
.LFE1:
.size   v2qi_RET1_ARG0, .-v2qi_RET1_ARG0
.align  1
.globl  v4qi_RET1_ARG0
.type   v4qi_RET1_ARG0, @function
v4qi_RET1_ARG0:
.LFB2:
.cfi_startproc
li  a0,0
ret
.cfi_endproc
.LFE2:
.size   v4qi_RET1_ARG0, .-v4qi_RET1_ARG0
.align  1
.globl  v8qi_RET1_ARG0
.type   v8qi_RET1_ARG0, @function
v8qi_RET1_ARG0:
.LFB3:
.cfi_startproc
li  a0,0
ret
.cfi_endproc
.LFE3:
.size   v8qi_RET1_ARG0, .-v8qi_RET1_ARG0
.align  1
.globl  v16qi_RET1_ARG0
.type   v16qi_RET1_ARG0, @function
v16qi_RET1_ARG0:
.LFB4:
.cfi_startproc
li  a0,0
li  a1,0
ret
.cfi_endproc
.LFE4:
.size   v16qi_RET1_ARG0, .-v16qi_RET1_ARG0
.align  1
.globl  v32qi_RET1_ARG0
.type   v32qi_RET1_ARG0, @function
v32qi_RET1_ARG0:
.LFB5:
.cfi_startproc

Re: [PATCH 0/4] Add DF_LIVE_SUBREG data and apply to IRA and LRA

2024-02-05 Thread Joseph Myers
This series appears to be missing documentation for the new option in 
invoke.texi.

-- 
Joseph S. Myers
josmy...@redhat.com



Re: [PATCH v6] x86-64: Find a scratch register for large model profiling

2024-02-05 Thread Uros Bizjak
On Mon, Feb 5, 2024 at 5:43 PM H.J. Lu  wrote:
>
> Changes in v6:
>
> 1. Use ix86_save_reg and accessible_reg_set in
> x86_64_select_profile_regnum.
> 2. Construct a complete reg name in x86_function_profiler.
>
> Changes in v5:
>
> 1. Add pr113689-3.c.
> 2. Use %r10 if ix86_profile_before_prologue () return true.
> 3. Try a callee-saved register which has been saved on stack in the
> prologue.
>
> Changes in v4:
>
> 1. Remove pr113689-3.c.
> 2. Use df_get_live_out.
>
> Changes in v3:
>
> 1. Remove r10_ok.
>
> Changes in v2:
>
> 1. Add int_parameter_registers to machine_function to track integer
> registers used for parameter passing.
> 2. Update x86_64_select_profile_regnum to try %r10 first and use an
> caller-saved register, which isn't used for parameter passing.
>
> ---
> 2 scratch registers, %r10 and %r11, are available at function entry for
> large model profiling.  But %r10 may be used by stack realignment and we
> can't use %r10 in this case.  Add x86_64_select_profile_regnum to find
> a caller-saved register which isn't live or a callee-saved register
> which has been saved on stack in the prologue at entry for large model
> profiling and sorry if we can't find one.
>
> gcc/
>
> PR target/113689
> * config/i386/i386.cc (x86_64_select_profile_regnum): New.
> (x86_function_profiler): Call x86_64_select_profile_regnum to
> get a scratch register for large model profiling.
>
> gcc/testsuite/
>
> PR target/113689
> * gcc.target/i386/pr113689-1.c: New file.
> * gcc.target/i386/pr113689-2.c: Likewise.
> * gcc.target/i386/pr113689-3.c: Likewise.
> ---
>  gcc/config/i386/i386.cc| 91 ++
>  gcc/testsuite/gcc.target/i386/pr113689-1.c | 49 
>  gcc/testsuite/gcc.target/i386/pr113689-2.c | 41 ++
>  gcc/testsuite/gcc.target/i386/pr113689-3.c | 48 
>  4 files changed, 214 insertions(+), 15 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr113689-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr113689-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr113689-3.c
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index b3e7c74846e..08aad32af85 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -22749,6 +22749,48 @@ current_fentry_section (const char **name)
>return true;
>  }
>
> +/* Return a caller-saved register which isn't live or a callee-saved
> +   register which has been saved on stack in the prologue at entry for
> +   profile.  */
> +
> +static int
> +x86_64_select_profile_regnum (bool r11_ok ATTRIBUTE_UNUSED)
> +{
> +  /* Use %r10 if the profiler is emitted before the prologue or it isn't
> + used by DRAP.  */
> +  if (ix86_profile_before_prologue ()
> +  || !crtl->drap_reg
> +  || REGNO (crtl->drap_reg) != R10_REG)
> +return R10_REG;
> +
> +  /* The profiler is emitted after the prologue.  If there is a
> + caller-saved register which isn't live or a callee-saved
> + register saved on stack in the prologue, use it.  */
> +
> +  bitmap reg_live = df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun));
> +
> +  int i;
> +  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
> +if (GENERAL_REGNO_P (i)
> +   && i != R10_REG
> +#ifdef NO_PROFILE_COUNTERS
> +   && (r11_ok || i != R11_REG)
> +#else
> +   && i != R11_REG
> +#endif
> +   && TEST_HARD_REG_BIT (accessible_reg_set, i)
> +   && !fixed_regs[i]
> +   && (ix86_save_reg (i, true, true)
> +   || (call_used_regs[i]
> +   && !REGNO_REG_SET_P (reg_live, i))))
> +  return i;

ix86_save_reg will never save fixed regs, so the above can be optimized a bit:

   && TEST_HARD_REG_BIT (accessible_reg_set, i)
   && (ix86_save_reg (i, true, true)
       || (call_used_regs[i] && !fixed_regs[i]
           && !REGNO_REG_SET_P (reg_live, i)))

OK with the above change.
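
The suggestion rests on one implication: whenever ix86_save_reg returns true
for a register, that register is not fixed. A small sketch, with plain
booleans standing in for the GCC predicates (all function names here are
hypothetical), checks that the two conditions agree for every combination
consistent with that assumption.

```cpp
// save: ix86_save_reg (i, true, true); fixed: fixed_regs[i];
// call_used: call_used_regs[i]; live: REGNO_REG_SET_P (reg_live, i).
bool original_test(bool save, bool fixed, bool call_used, bool live)
{
  return !fixed && (save || (call_used && !live));
}

bool simplified_test(bool save, bool fixed, bool call_used, bool live)
{
  return save || (call_used && !fixed && !live);
}

// Returns true if both forms agree whenever "save implies !fixed" holds.
bool forms_agree()
{
  for (int bits = 0; bits < 16; ++bits)
    {
      bool save = bits & 1, fixed = bits & 2;
      bool call_used = bits & 4, live = bits & 8;
      if (save && fixed)
        continue;  // Excluded by the stated assumption.
      if (original_test(save, fixed, call_used, live)
          != simplified_test(save, fixed, call_used, live))
        return false;
    }
  return true;
}
```

The two forms differ only for the excluded case (a saved register that is
also fixed), which is why the rewrite is safe only under that assumption.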

Thanks,
Uros.

> +  sorry ("no register available for profiling %<-mcmodel=large%s%>",
> +ix86_cmodel == CM_LARGE_PIC ? " -fPIC" : "");
> +
> +  return INVALID_REGNUM;
> +}
> +
>  /* Output assembler code to FILE to increment profiler label # LABELNO
> for profiling a function entry.  */
>  void
> @@ -22783,42 +22825,61 @@ x86_function_profiler (FILE *file, int labelno 
> ATTRIBUTE_UNUSED)
> fprintf (file, "\tleaq\t%sP%d(%%rip), %%r11\n", LPREFIX, labelno);
>  #endif
>
> +  int scratch;
> +  const char *reg;
> +  char legacy_reg[4] = { 0 };
> +
>if (!TARGET_PECOFF)
> {
>   switch (ix86_cmodel)
> {
> case CM_LARGE:
> - /* NB: R10 is caller-saved.  Although it can be used as a
> -static chain register, it is preserved when calling
> -mcount for nested functions.  */
> + scratch = x86_64_select_profile_regnum (true);
> + reg = hi_reg_name[scratch];
> + if 

Re: [PATCH v5] x86-64: Find a scratch register for large model profiling

2024-02-05 Thread H.J. Lu
On Mon, Feb 5, 2024 at 2:56 AM Uros Bizjak  wrote:
>
> On Fri, Feb 2, 2024 at 11:47 PM H.J. Lu  wrote:
> >
> > Changes in v5:
> >
> > 1. Add pr113689-3.c.
> > 2. Use %r10 if ix86_profile_before_prologue () return true.
> > 3. Try a callee-saved register which has been saved on stack in the
> > prologue.
> >
> > Changes in v4:
> >
> > 1. Remove pr113689-3.c.
> > 2. Use df_get_live_out.
> >
> > Changes in v3:
> >
> > 1. Remove r10_ok.
> >
> > Changes in v2:
> >
> > 1. Add int_parameter_registers to machine_function to track integer
> > registers used for parameter passing.
> > 2. Update x86_64_select_profile_regnum to try %r10 first and use an
> > caller-saved register, which isn't used for parameter passing.
> >
> > ---
> > 2 scratch registers, %r10 and %r11, are available at function entry for
> > large model profiling.  But %r10 may be used by stack realignment and we
> > can't use %r10 in this case.  Add x86_64_select_profile_regnum to find
> > a caller-saved register which isn't live or a callee-saved register
> > which has been saved on stack in the prologue at entry for large model
> > profiling and sorry if we can't find one.
> >
> > gcc/
> >
> > PR target/113689
> > * config/i386/i386.cc (set_saved_int_registers_bit): New.
> > (test_saved_int_registers_bit): Likewise.
> > (ix86_emit_save_regs): Call set_saved_int_registers_bit on
> > saved register.
> > (ix86_emit_save_regs_using_mov): Likewise.
> > (x86_64_select_profile_regnum): New.
> > (x86_function_profiler): Call x86_64_select_profile_regnum to
> > get a scratch register for large model profiling.
> > * config/i386/i386.h (machine_function): Add
> > saved_int_registers.
> >
> > gcc/testsuite/
> >
> > PR target/113689
> > * gcc.target/i386/pr113689-1.c: New file.
> > * gcc.target/i386/pr113689-2.c: Likewise.
> > * gcc.target/i386/pr113689-3.c: Likewise.
> > ---
> >  gcc/config/i386/i386.cc| 119 ++---
> >  gcc/config/i386/i386.h |   5 +
> >  gcc/testsuite/gcc.target/i386/pr113689-1.c |  49 +
> >  gcc/testsuite/gcc.target/i386/pr113689-2.c |  41 +++
> >  gcc/testsuite/gcc.target/i386/pr113689-3.c |  48 +
> >  5 files changed, 247 insertions(+), 15 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr113689-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr113689-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr113689-3.c
> >
> > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > index b3e7c74846e..1c7aaa4535e 100644
> > --- a/gcc/config/i386/i386.cc
> > +++ b/gcc/config/i386/i386.cc
> > @@ -7387,6 +7387,32 @@ choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned 
> > int *align,
> >return plus_constant (Pmode, base_reg, base_offset);
> >  }
> >
> > +/* Set the integer register REGNO bit in saved_int_registers.  */
> > +
> > +static void
> > +set_saved_int_registers_bit (int regno)
> > +{
> > +  if (LEGACY_INT_REGNO_P (regno))
> > +cfun->machine->saved_int_registers |= 1 << regno;
> > +  else
> > +cfun->machine->saved_int_registers
> > +  |= 1 << (regno - FIRST_REX_INT_REG + 8);
> > +}
> > +
> > +/* Return true if the integer register REGNO bit in saved_int_registers
> > +   is set.  */
> > +
> > +static bool
> > +test_saved_int_registers_bit (int regno)
> > +{
> > +  if (LEGACY_INT_REGNO_P (regno))
> > +return (cfun->machine->saved_int_registers
> > +   & (1 << regno)) != 0;
> > +  else
> > +return (cfun->machine->saved_int_registers
> > +   & (1 << (regno - FIRST_REX_INT_REG + 8))) != 0;
> > +}
> > +
> >  /* Emit code to save registers in the prologue.  */
> >
> >  static void
> > @@ -7403,6 +7429,7 @@ ix86_emit_save_regs (void)
> > insn = emit_insn (gen_push (gen_rtx_REG (word_mode, regno),
> > TARGET_APX_PPX));
> > RTX_FRAME_RELATED_P (insn) = 1;
> > +   set_saved_int_registers_bit (regno);
> >   }
> >  }
> >else
> > @@ -7415,6 +7442,7 @@ ix86_emit_save_regs (void)
> >for (regno = FIRST_PSEUDO_REGISTER - 1; regno >= 0; regno--)
> > if (GENERAL_REGNO_P (regno) && ix86_save_reg (regno, true, true))
> >   {
> > +   set_saved_int_registers_bit (regno);
> > if (aligned)
> >   {
> > regno_list[loaded_regnum++] = regno;
> > @@ -7567,6 +7595,7 @@ ix86_emit_save_regs_using_mov (HOST_WIDE_INT 
> > cfa_offset)
> >{
> >  ix86_emit_save_reg_using_mov (word_mode, regno, cfa_offset);
> > cfa_offset -= UNITS_PER_WORD;
> > +   set_saved_int_registers_bit (regno);
> >}
> >  }
>
> Do we really need the above handling? I think that we can use
> ix86_save_reg directly in x86_64_select_profile_regnum below.

Fixed in v6.

> > @@ -22749,6 +22778,48 @@ current_fentry_section (const char 

[PATCH v6] x86-64: Find a scratch register for large model profiling

2024-02-05 Thread H.J. Lu
Changes in v6:

1. Use ix86_save_reg and accessible_reg_set in
x86_64_select_profile_regnum.
2. Construct a complete reg name in x86_function_profiler.
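
As a rough illustration of item 2, the sketch below shows the name
construction in isolation. It assumes a register-name table that stores
"ax", "bx", ... for the legacy registers and "r8" ... "r15" for the REX
registers; the helper full_reg_name is hypothetical and is not part of the
patch.

```cpp
#include <string>

// Hypothetical stand-in for the patch's logic: short_name is the table
// entry for the chosen scratch register, and legacy says whether it is
// one of the eight legacy integer registers.
std::string full_reg_name(const std::string &short_name, bool legacy)
{
  // Legacy entries lack the 64-bit 'r' prefix, so prepend it.
  return legacy ? "r" + short_name : short_name;
}
```

With that assumption, "ax" becomes "rax" while "r11" is used unchanged.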

Changes in v5:

1. Add pr113689-3.c.
2. Use %r10 if ix86_profile_before_prologue () return true.
3. Try a callee-saved register which has been saved on stack in the
prologue.

Changes in v4:

1. Remove pr113689-3.c.
2. Use df_get_live_out.

Changes in v3:

1. Remove r10_ok.

Changes in v2:

1. Add int_parameter_registers to machine_function to track integer
registers used for parameter passing.
2. Update x86_64_select_profile_regnum to try %r10 first and use an
caller-saved register, which isn't used for parameter passing.

---
2 scratch registers, %r10 and %r11, are available at function entry for
large model profiling.  But %r10 may be used by stack realignment and we
can't use %r10 in this case.  Add x86_64_select_profile_regnum to find
a caller-saved register which isn't live or a callee-saved register
which has been saved on stack in the prologue at entry for large model
profiling and sorry if we can't find one.

gcc/

PR target/113689
* config/i386/i386.cc (x86_64_select_profile_regnum): New.
(x86_function_profiler): Call x86_64_select_profile_regnum to
get a scratch register for large model profiling.

gcc/testsuite/

PR target/113689
* gcc.target/i386/pr113689-1.c: New file.
* gcc.target/i386/pr113689-2.c: Likewise.
* gcc.target/i386/pr113689-3.c: Likewise.
---
 gcc/config/i386/i386.cc| 91 ++
 gcc/testsuite/gcc.target/i386/pr113689-1.c | 49 
 gcc/testsuite/gcc.target/i386/pr113689-2.c | 41 ++
 gcc/testsuite/gcc.target/i386/pr113689-3.c | 48 
 4 files changed, 214 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr113689-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr113689-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr113689-3.c

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index b3e7c74846e..08aad32af85 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -22749,6 +22749,48 @@ current_fentry_section (const char **name)
   return true;
 }
 
+/* Return a caller-saved register which isn't live or a callee-saved
+   register which has been saved on stack in the prologue at entry for
+   profile.  */
+
+static int
+x86_64_select_profile_regnum (bool r11_ok ATTRIBUTE_UNUSED)
+{
+  /* Use %r10 if the profiler is emitted before the prologue or it isn't
+ used by DRAP.  */
+  if (ix86_profile_before_prologue ()
+  || !crtl->drap_reg
+  || REGNO (crtl->drap_reg) != R10_REG)
+return R10_REG;
+
+  /* The profiler is emitted after the prologue.  If there is a
+ caller-saved register which isn't live or a callee-saved
+ register saved on stack in the prologue, use it.  */
+
+  bitmap reg_live = df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+
+  int i;
+  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
+if (GENERAL_REGNO_P (i)
+   && i != R10_REG
+#ifdef NO_PROFILE_COUNTERS
+   && (r11_ok || i != R11_REG)
+#else
+   && i != R11_REG
+#endif
+   && TEST_HARD_REG_BIT (accessible_reg_set, i)
+   && !fixed_regs[i]
+   && (ix86_save_reg (i, true, true)
+   || (call_used_regs[i]
+   && !REGNO_REG_SET_P (reg_live, i))))
+  return i;
+
+  sorry ("no register available for profiling %<-mcmodel=large%s%>",
+ix86_cmodel == CM_LARGE_PIC ? " -fPIC" : "");
+
+  return INVALID_REGNUM;
+}
+
 /* Output assembler code to FILE to increment profiler label # LABELNO
for profiling a function entry.  */
 void
@@ -22783,42 +22825,61 @@ x86_function_profiler (FILE *file, int labelno 
ATTRIBUTE_UNUSED)
fprintf (file, "\tleaq\t%sP%d(%%rip), %%r11\n", LPREFIX, labelno);
 #endif
 
+  int scratch;
+  const char *reg;
+  char legacy_reg[4] = { 0 };
+
   if (!TARGET_PECOFF)
{
  switch (ix86_cmodel)
{
case CM_LARGE:
- /* NB: R10 is caller-saved.  Although it can be used as a
-static chain register, it is preserved when calling
-mcount for nested functions.  */
+ scratch = x86_64_select_profile_regnum (true);
+ reg = hi_reg_name[scratch];
+ if (LEGACY_INT_REGNO_P (scratch))
+   {
+ legacy_reg[0] = 'r';
+ legacy_reg[1] = reg[0];
+ legacy_reg[2] = reg[1];
+ reg = legacy_reg;
+   }
  if (ASSEMBLER_DIALECT == ASM_INTEL)
-   fprintf (file, "1:\tmovabs\tr10, OFFSET FLAT:%s\n"
-  "\tcall\tr10\n", mcount_name);
+   fprintf (file, "1:\tmovabs\t%s, OFFSET FLAT:%s\n"
+  "\tcall\t%s\n", reg, mcount_name, reg);
  else
-   fprintf (file, "1:\tmovabsq\t$%s, 

Re: [PATCH 0/4] Add DF_LIVE_SUBREG data and apply to IRA and LRA

2024-02-05 Thread Jeff Law




On 2/5/24 00:01, Lehua Ding wrote:
For SPEC INT 2017, when using upstream GCC (without these patches), I get a
coredump when training the peak case, so no data yet. The cause of the core
dump still needs to be investigated.

Typo, SPEC INT 2017 -> SPEC FP 2017.
Also, there is some bad news: the SPEC INT 2017 score (with these patches)
has dropped, which is a bit strange, and I need to locate the cause.

Just a note.  I doubt this will get much traction from a review 
standpoint until gcc-14 is basically out the door.


My recommendation is to continue development, bugfixing, cleanup, etc 
between now and then.  Consider creating a branch for the work in the 
upstream repo.


Jeff


Re: [PATCH] contrib: Fill in HOST{CC,CFLAGS,CXX,CXXFLAGS} in test_installed

2024-02-05 Thread Jeff Law




On 2/5/24 06:50, Jakub Jelinek wrote:

Hi!

gcc/Makefile.in since my r0-60234 change fills in HOSTCC and HOSTCFLAGS
in site.exp and since r8-671 also HOSTCXX and HOSTCXXFLAGS.
If those variables aren't set, we get errors like:
/usr/src/gcc/contrib/test_installed --without-g++ --without-gfortran 
--without-objc struct-layout-1.exp
...
ERROR: tcl error sourcing 
/usr/src/gcc/gcc/testsuite/gcc.dg/compat/struct-layout-1.exp.
ERROR: tcl error code TCL LOOKUP VARNAME HOSTCC
ERROR: can't read "HOSTCC": no such variable
 while executing
"remote_exec build "$HOSTCC $HOSTCFLAGS $generator_cmd""
 (file "/usr/src/gcc/gcc/testsuite/gcc.dg/compat/struct-layout-1.exp" line 
96)
 invoked from within
"source /usr/src/gcc/gcc/testsuite/gcc.dg/compat/struct-layout-1.exp"
 ("uplevel" body line 1)
 invoked from within
"uplevel #0 source /usr/src/gcc/gcc/testsuite/gcc.dg/compat/struct-layout-1.exp"
 invoked from within
"catch "uplevel #0 source $test_file_name" msg"

(similarly in g++ or gfortran) struct-layout-1.exp.  One doesn't need to
test specially for just struct-layout-1.exp alone, just not using any arg
will trigger it as well, just later.

The following patch fills it in as cc and c++ with empty flags to compile
those, I believe that is what e.g. make uses by default, so it should be a
reasonable default.  We IMHO shouldn't default to GCC_UNDER_TEST because
that might be a cross-compiler etc.

Ok for trunk?

2024-02-05  Jakub Jelinek  

* test_installed: Fill in HOSTCC, HOSTCXX, HOSTCFLAGS and
HOSTCXXFLAGS.

Ugh.  test_installed :(  Probably a necessary evil, though I suspect few 
people are using it.  So if it works for the scenarios you're testing, 
then OK by me.



jeff


Re: [PATCH 2/2] rtl-optimization/113255 - avoid re-associating REG_POINTER MINUS

2024-02-05 Thread Jeff Law




On 2/5/24 01:15, Richard Biener wrote:



  PR rtl-optimization/113255
  * simplify-rtx.cc (simplify_context::simplify_binary_operation_1):
  Do not re-associate a MINUS with a REG_POINTER op0.

Nasty little set of problems.  I don't think we ever pondered that we could
have multiple REGNO_POINTER_FLAG objects in the same expression, but clearly
that can happen once you introduce a 3rd term in the expression.

I don't mind avoiding the reassociation, but it feels like we're papering over
problems in alias.cc.  Conceptually it seems like if we have two objects with
REG_POINTER set, then we can't know which one is the real base.  So your patch
in the PR wasn't that bad.


It wasn't bad, it's the only correct fix.  The question is what we do
for branches (or whether we do anything there) and whether we just accept
that that fix causes some optimization regressions.

For the branches, I'd go with whatever you feel the safest change is.  While 
it looks like some of this is fundamentally broken, it can't be *that* 
bad since it's just rearing its ugly head now.


I could even make a case that going with the patch from the PR for the 
branches is reasonable.  It's attacking at least part of the root problem.





Alternately, just stop using REG_POINTER for alias analysis?   It looks
fundamentally flawed to me in that context.  In fact, one might argue that the
only legitimate use would be to indicate to the target that we know a pointer
points into an object.  Some targets (the PA) need this because x + y is not
the same as y + x when used as a memory address.

If we wanted to be a bit more surgical, drop REG_POINTER from just the MINUS
handling in alias.cc?


The problem is that REG_POINTER is just used as a heuristic
(and compile-time optimization) as to which of a binary operator's
operands we (preferably) use as the base.  find_base_{term,value}
happily look at operands that are not REG_POINTER (that are
not REG_P), since for the case in question, even w/o re-assoc
there would be no way to say the inner MINUS is not a pointer
(it's a REG flag).

The heuristics don't help much when passes like DSE use CSELIB
and combine operations like above, we then get to see that
the way find_base_{term,value} perform pointer analysis is
fundamentally flawed.  Any tweaking there has the chance to
make other cases run into wrong base discoveries.

Exactly.  So maybe I'm missing something -- it sounds like we both agree 
that using REG_POINTER in the aliasing code is just fundamentally broken 
in the modern world (and perhaps has been for a long time).  So we 
"just" need to excise that code from alias.cc.






I'll take it that we need to live with the regressions for GCC 14
and the wrong-code bug in GCC 13 and earlier.

I'm not sure I agree with this statement.  Or maybe I thought the patch 
in the PR was more effective than it really is.  At some level we ought 
to be able to cut out the short-cuts enabled by REG_POINTER.  That runs 
the risk of perturbing more code, but it seems to me that's a risk we 
might need to take.


jeff


Re: [PATCH v2] RISC-V: THEAD: Fix improper immediate value for MODIFY_DISP instruction on 32-bit systems.

2024-02-05 Thread Andreas Schwab
On Feb 05 2024, Jeff Law wrote:

> We're all aware you *can* do that.  But it's never been a requirement to
> commit a patch.

It has always been a requirement that a patch does not break bootstrap.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH v2] RISC-V: THEAD: Fix improper immediate value for MODIFY_DISP instruction on 32-bit systems.

2024-02-05 Thread Jeff Law




On 2/5/24 08:08, Andreas Schwab wrote:

On Feb 05 2024, Jeff Law wrote:


Until such systems are common, these niggling issues are bound to show up.


It won't if you do it properly: build with a cross compiler that was
built from the same source and enable -Werror.

We're all aware you *can* do that.  But it's never been a requirement to 
commit a patch.


jeff


Re: [PATCH] RISC-V: Expand VLMAX scalar move in reduction

2024-02-05 Thread Jeff Law




On 2/4/24 23:37, juzhe.zh...@rivai.ai wrote:

I think it just triggers a latent bug that we hadn't encountered.

Hi, Robin. Would you mind giving me the preprocessed file to reproduce the issue?

I suspect it triggers a latent bug in the VSETVL pass.

I've got a few minutes this morning before meetings start.  I'm going to 
try and trigger this with LTO off, which would help dramatically with 
our ability to provide a testcase.


jeff


Re: [PATCH] libitm: small update for C++20

2024-02-05 Thread Jason Merrill

On 2/3/24 10:14, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


-- >8 --
C++20 DR 2237 disallows simple-template-id in cdtors, so you
can't write

  template<typename T>
  struct S {
    S<T>(); // should be S();
  };

This hasn't been a problem until now but I'm adding a warning about it
to -Wc++20-compat which libitm apparently uses.
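
For reference, a minimal compilable sketch of the spelling that C++20
accepts. The struct below is a hypothetical example, not libitm code: the
constructors and destructor are declared with the injected-class-name S,
while S<T> remains fine as an ordinary type elsewhere.

```cpp
#include <cstddef>

// Hypothetical class for illustration only.
template<typename T>
struct S
{
  S() : count(0) {}                             // not S<T>()
  S(const S<T> &other) : count(other.count) {}  // S<T> is fine as a parameter type
  ~S() {}                                       // not ~S<T>()
  std::size_t count;
};
```

Only the declarator name of the constructor and destructor changes, which
matches the shape of the containers.h change described in the ChangeLog.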

libitm/ChangeLog:

* containers.h (vector): Remove the template-id in constructors.
---
  libitm/containers.h | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/libitm/containers.h b/libitm/containers.h
index 2842fa038ed..4160b16d569 100644
--- a/libitm/containers.h
+++ b/libitm/containers.h
@@ -48,7 +48,7 @@ class vector
static const size_t default_resize_min = 32;
  
// Don't try to copy this vector.
-  vector<T>(const vector& x);
+  vector(const vector& x);
  
   public:

typedef T datatype;
@@ -59,7 +59,7 @@ class vector
T& operator[] (size_t pos) { return entries[pos]; }
const T& operator[] (size_t pos) const  { return entries[pos]; }
  
-  vector<T>(size_t initial_size = default_initial_capacity)
+  vector(size_t initial_size = default_initial_capacity)
  : m_capacity(initial_size),
m_size(0)
{
@@ -68,7 +68,7 @@ class vector
  else
entries = 0;
}
-  ~vector<T>() { if (m_capacity) free(entries); }
+  ~vector() { if (m_capacity) free(entries); }
  
void resize(size_t additional_capacity)

{

base-commit: 78005c648921899a674d1e561b49b05ccabedfe0
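
For readers unfamiliar with DR 2237, here is a minimal sketch of the spelling
it rules out.  The class small_vec is made up for illustration and is not
libitm's vector; only the constructor and destructor names matter.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical container for illustration only; not the libitm class.
template<typename T>
class small_vec
{
  std::size_t m_size;

public:
  // Before C++20 these could also be declared as
  //   small_vec<T>(std::size_t n = 0)   and   ~small_vec<T>()
  // DR 2237 disallows that spelling, so the injected-class-name is used.
  explicit small_vec (std::size_t n = 0) : m_size (n) {}
  ~small_vec () {}

  // A template-id in a parameter type is unaffected by the DR.
  small_vec (const small_vec<T> &other) : m_size (other.m_size) {}

  std::size_t size () const { return m_size; }
};
```

Only the declarator name of the constructor and destructor changes; uses of
the template-id elsewhere in the class stay valid.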




Re: [PATCH] c++: DR2237, cdtor and template-id tweaks [PR107126]

2024-02-05 Thread Jason Merrill

On 2/3/24 10:24, Marek Polacek wrote:
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
>
> I'm not certain OPT_Wc__20_extensions is the best thing for something
> from [diff.cpp17]; would you prefer something else?

I think it wants its own flag, that is enabled in C++20 or by
-Wc++20-compat.

> +   if (cxx_dialect >= cxx20)
> + {
> +   if (!cp_parser_simulate_error (parser))
> + pedwarn (tilde_loc, OPT_Wc__20_extensions,
> +  "template-id not allowed for destructor");
> +   return error_mark_node;
> + }
> +   warning_at (tilde_loc, OPT_Wc__20_compat,
> +   "template-id not allowed for destructor in C++20");


After a pedwarn we should accept the code, not return error_mark_node.

I'm also concerned about pedwarn/warnings not guarded by
!cp_parser_uncommitted_to_tentative_parse_p; that often leads to warning
about a tentative parse as a declaration that is eventually abandoned in
favor of a perfectly fine parse as an expression.


It would be good for cp_parser_context to add a vec of warnings to emit 
at cp_parser_parse_definitely time, and then 
cp_parser_pedwarn/cp_parser_warning to fill it...


Jason
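
The deferred-diagnostics idea above can be modeled outside the compiler.  The
following is only a toy sketch: toy_parser, tentative_context and all member
names are invented here and do not correspond to GCC's cp_parser code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: none of these names are GCC's.
struct tentative_context
{
  // Diagnostics queued while this tentative parse is in progress.
  std::vector<std::string> pending;
};

struct toy_parser
{
  std::vector<tentative_context> contexts;  // stack of active tentative parses
  std::vector<std::string> emitted;         // what the user would actually see

  void begin_tentative () { contexts.push_back (tentative_context ()); }

  // Queue the message while parsing tentatively, emit it right away otherwise.
  void warn (const std::string &msg)
  {
    if (contexts.empty ())
      emitted.push_back (msg);
    else
      contexts.back ().pending.push_back (msg);
  }

  // The tentative parse succeeded: pass its queued diagnostics on to the
  // enclosing context, or emit them if there is none.
  void parse_definitely ()
  {
    assert (!contexts.empty ());
    tentative_context done = contexts.back ();
    contexts.pop_back ();
    for (const std::string &msg : done.pending)
      warn (msg);
  }

  // The tentative parse was abandoned: its diagnostics are dropped.
  void abort_tentative ()
  {
    assert (!contexts.empty ());
    contexts.pop_back ();
  }
};
```

In this model a warning raised during a parse that is later abandoned never
reaches the user, which is the behavior the suggestion is after.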



Re: [PATCH v2] RISC-V: THEAD: Fix improper immediate value for MODIFY_DISP instruction on 32-bit systems.

2024-02-05 Thread Andreas Schwab
On Feb 05 2024, Jeff Law wrote:

> Until such systems are common, these niggling issues are bound to show up.

It won't if you do it properly: build with a cross compiler that was
built from the same source and enable -Werror.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


[PATCH] aarch64, acle header: Cast uint64_t pointers to DIMode.

2024-02-05 Thread Iain Sandoe
Tested on aarch64-linux, aarch64-darwin and a cross from aarch64-darwin to linux.
OK for trunk, or is some alternative needed?
thanks
Iain

--- 8< ---

Currently, most of the acle tests fail on the Darwin port because
DI mode is "long" and uint64 is "long long".  The fix for this used
in other headers is to cast the pointers using __builtin_aarch64_simd_di
and that is what this patch does.

gcc/ChangeLog:

* config/aarch64/arm_acle.h (__rndr): Cast uint64 pointer to DI
mode to avoid typedef mismatches.
(__rndrrs): Likewise.
---
 gcc/config/aarch64/arm_acle.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/aarch64/arm_acle.h b/gcc/config/aarch64/arm_acle.h
index 2aa681090fa..823f87187b1 100644
--- a/gcc/config/aarch64/arm_acle.h
+++ b/gcc/config/aarch64/arm_acle.h
@@ -309,14 +309,14 @@ __extension__ extern __inline int
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __rndr (uint64_t *__res)
 {
-  return __builtin_aarch64_rndr (__res);
+  return __builtin_aarch64_rndr ((__builtin_aarch64_simd_di *) __res);
 }
 
 __extension__ extern __inline int
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __rndrrs (uint64_t *__res)
 {
-  return __builtin_aarch64_rndrrs (__res);
+  return __builtin_aarch64_rndrrs ((__builtin_aarch64_simd_di *) __res);
 }
 
 #pragma GCC pop_options
-- 
2.39.2 (Apple Git-143)
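
As background for the typedef mismatch described above, here is a small sketch
of why two 64-bit integer types still need a cast between their pointers.  The
helper names are hypothetical and the code says nothing about how the aarch64
builtins are declared.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Distinct types even where both are 64 bits wide, so a pointer to one
// does not convert implicitly to a pointer to the other.
static_assert (!std::is_same<unsigned long, unsigned long long>::value,
	       "unsigned long and unsigned long long are distinct types");

// Report which underlying type this platform picked for uint64_t.
inline bool
uint64_is_unsigned_long ()
{
  return std::is_same<std::uint64_t, unsigned long>::value;
}

inline bool
uint64_is_unsigned_long_long ()
{
  return std::is_same<std::uint64_t, unsigned long long>::value;
}
```

Which of the two helpers returns true depends on the platform's choice for
uint64_t, and that choice is what differs between the Linux and Darwin ports.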



Re: [PATCH v2] RISC-V: THEAD: Fix improper immediate value for MODIFY_DISP instruction on 32-bit systems.

2024-02-05 Thread Jeff Law




On 2/5/24 05:00, Christoph Müllner wrote:
> On Sat, Feb 3, 2024 at 2:11 PM Andreas Schwab wrote:
>>
>> On Jan 30 2024, Christoph Müllner wrote:
>>
>>> retested
>>
>> Nope.
>
> Sorry for this. I tested for no regressions in the test suite with a
> cross-build and QEMU and did not do a Werror bootstrap build. I'll
> provide a fix for this later today (also breaking the line as it is
> longer than needed).

Right.  And that's pretty standard given the state of the RISC-V
platforms.  We've got a platform here that can bootstrap in a reasonable
amount of time, but I haven't set that up in the CI system yet.


Until such systems are common, these niggling issues are bound to show up.

It's just whitespace around the HOST_WIDE_INT_PRINT_DEC and wrapping the 
long line, right?  I've got that in my tree that's bootstrapping now.  I 
don't mind committing it later today.  But if you get to it before my 
bootstrap is done, feel free to commit as pre-approved.


jeff
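
For context, a sketch of why whitespace next to a format macro can matter in a
-Werror bootstrap.  It assumes the failure was the C++11 literal-suffix
diagnostic, which this thread does not spell out; TOY_WIDE_INT_PRINT_DEC is a
made-up stand-in for HOST_WIDE_INT_PRINT_DEC and format_disp is hypothetical.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for GCC's HOST_WIDE_INT_PRINT_DEC.
#define TOY_WIDE_INT_PRINT_DEC "%" PRId64

// Format a displacement using string-literal concatenation with the macro.
std::string
format_disp (std::int64_t imm)
{
  char buf[64];
  // The spaces around the macro matter: since C++11, "disp="TOY_... with no
  // space is lexed as a string literal followed by a literal suffix, which
  // compilers diagnose.
  std::snprintf (buf, sizeof buf, "disp=" TOY_WIDE_INT_PRINT_DEC ";", imm);
  return std::string (buf);
}
```

A cross build without -Werror would let such a diagnostic pass as a warning,
while a bootstrap turns it into a hard failure.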


RE: [PATCH]middle-end: fix ICE when moving statements to empty BB [PR113731]

2024-02-05 Thread Richard Biener
On Mon, 5 Feb 2024, Tamar Christina wrote:

> > It looks like LOOP_VINFO_EARLY_BRK_STORES is "reverse"?  Is that
> > why you are doing gsi_move_before + gsi_prev?  Why do gsi_prev
> > at all?
> > 
> 
> As discussed on IRC, how about this one?
> Incremental building passed all tests and bootstrap is running.
> 
> Ok for master if bootstrap and regtesting are clean?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/113731
>   * gimple-iterator.cc (gsi_move_before): Take new parameter for update
>   method.
>   * gimple-iterator.h (gsi_move_before): Default new param to
>   GSI_SAME_STMT.
>   * tree-vect-loop.cc (move_early_exit_stmts): Call gsi_move_before with
>   GSI_NEW_STMT.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/113731
>   * gcc.dg/vect/vect-early-break_111-pr113731.c: New test.
> 
> --- inline copy of patch ---
> 
> diff --git a/gcc/gimple-iterator.cc b/gcc/gimple-iterator.cc
> index 
> 517c53376f0511af59e124f52ec7be566a6c4789..f67bcfbfdfdd7c6cb0ad0130972f5b1dc4429bcf
>  100644
> --- a/gcc/gimple-iterator.cc
> +++ b/gcc/gimple-iterator.cc
> @@ -666,10 +666,11 @@ gsi_move_after (gimple_stmt_iterator *from, 
> gimple_stmt_iterator *to)
>  
>  
>  /* Move the statement at FROM so it comes right before the statement
> -   at TO.  */
> +   at TO using method M.  */
>  
>  void
> -gsi_move_before (gimple_stmt_iterator *from, gimple_stmt_iterator *to)
> +gsi_move_before (gimple_stmt_iterator *from, gimple_stmt_iterator *to,
> +  gsi_iterator_update m = GSI_SAME_STMT)

Looks like the wrong patch attached?  This should be like in the
ChangeLog.

OK with that change

Richard.

>  {
>gimple *stmt = gsi_stmt (*from);
>gsi_remove (from, false);
> @@ -677,7 +678,7 @@ gsi_move_before (gimple_stmt_iterator *from, 
> gimple_stmt_iterator *to)
>/* For consistency with gsi_move_after, it might be better to have
>   GSI_NEW_STMT here; however, that breaks several places that expect
>   that TO does not change.  */
> -  gsi_insert_before (to, stmt, GSI_SAME_STMT);
> +  gsi_insert_before (to, stmt, m);
>  }
>  
>  
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_111-pr113731.c 
> b/gcc/testsuite/gcc.dg/vect/vect-early-break_111-pr113731.c
> new file mode 100644
> index 
> ..2d6db91df97625a7f11609d034e89af0461129b2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_111-pr113731.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-add-options vect_early_break } */
> +/* { dg-require-effective-target vect_early_break } */
> +/* { dg-require-effective-target vect_int } */
> +
> +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> +
> +char* inet_net_pton_ipv4_bits;
> +char inet_net_pton_ipv4_odst;
> +void __errno_location();
> +void inet_net_pton_ipv4();
> +void inet_net_pton() { inet_net_pton_ipv4(); }
> +void inet_net_pton_ipv4(char *dst, int size) {
> +  while ((inet_net_pton_ipv4_bits > dst) & inet_net_pton_ipv4_odst) {
> +if (size-- <= 0)
> +  goto emsgsize;
> +*dst++ = '\0';
> +  }
> +emsgsize:
> +  __errno_location();
> +}
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 
> 30b90d99925bea74caf14833d8ab1695607d0fe9..9aba94bd6ca2061a19487ac4a2735a16d03bcbee
>  100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -11800,8 +11800,7 @@ move_early_exit_stmts (loop_vec_info loop_vinfo)
>   dump_printf_loc (MSG_NOTE, vect_location, "moving stmt %G", stmt);
>  
>gimple_stmt_iterator stmt_gsi = gsi_for_stmt (stmt);
> -  gsi_move_before (&stmt_gsi, &dest_gsi);
> -  gsi_prev (&dest_gsi);
> +  gsi_move_before (&stmt_gsi, &dest_gsi, GSI_NEW_STMT);
>  }
>  
>/* Update all the stmts with their new reaching VUSES.  */
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


RE: [PATCH]middle-end: add additional runtime test for [PR113467]

2024-02-05 Thread Richard Biener
On Mon, 5 Feb 2024, Tamar Christina wrote:

> > > Ok for master?
> > 
> > I think you need a lp64 target check for the large constants or
> > alternatively use uint64_t?
> > 
> 
> Ok, how about this one.
> 
> Regtested on x86_64-pc-linux-gnu with -m32,-m64 and no issues.
> 
> Ok for master?

OK

> Thanks,
> Tamar
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/113467
>   * gcc.dg/vect/vect-early-break_110-pr113467.c: New test.
> 
> --- inline copy of patch ---
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_110-pr113467.c 
> b/gcc/testsuite/gcc.dg/vect/vect-early-break_110-pr113467.c
> new file mode 100644
> index 
> ..1e2c47be5fdf1e1fed88e4b5f45d7eda6c3b85d1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_110-pr113467.c
> @@ -0,0 +1,52 @@
> +/* { dg-add-options vect_early_break } */
> +/* { dg-require-effective-target vect_early_break } */
> +/* { dg-require-effective-target vect_long_long } */
> +
> +/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
> +
> +#include "tree-vect.h"
> +#include <stdint.h>
> +
> +typedef struct gcry_mpi *gcry_mpi_t;
> +struct gcry_mpi {
> +  int nlimbs;
> +  unsigned long *d;
> +};
> +
> +long gcry_mpi_add_ui_up;
> +void gcry_mpi_add_ui(gcry_mpi_t w, gcry_mpi_t u, unsigned v) {
> +  gcry_mpi_add_ui_up = *w->d;
> +  if (u) {
> +uint64_t *res_ptr = w->d, *s1_ptr = w->d;
> +int s1_size = u->nlimbs;
> +unsigned s2_limb = v, x = *s1_ptr++;
> +s2_limb += x;
> +*res_ptr++ = s2_limb;
> +if (x)
> +  while (--s1_size) {
> +x = *s1_ptr++ + 1;
> +*res_ptr++ = x;
> +if (x) {
> +  break;
> +}
> +  }
> +  }
> +}
> +
> +int main()
> +{
> +  check_vect ();
> +
> +  static struct gcry_mpi sv;
> +  static uint64_t vals[] = {4294967288ULL, 191ULL,4160749568ULL, 4294963263ULL,
> +127ULL,4294950912ULL, 255ULL, 4294901760ULL,
> +534781951ULL,  33546240ULL,   4294967292ULL, 4294960127ULL,
> +4292872191ULL, 4294967295ULL, 4294443007ULL, 3ULL};
> +  gcry_mpi_t v = &sv;
> +  v->nlimbs = 16;
> +  v->d = vals;
> +
> +  gcry_mpi_add_ui(v, v, 8);
> +  if (v->d[1] != 192)
> +__builtin_abort();
> +}
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


RE: [PATCH]middle-end: fix ICE when moving statements to empty BB [PR113731]

2024-02-05 Thread Tamar Christina
> It looks like LOOP_VINFO_EARLY_BRK_STORES is "reverse"?  Is that
> why you are doing gsi_move_before + gsi_prev?  Why do gsi_prev
> at all?
> 

As discussed on IRC, how about this one?
Incremental building passed all tests and bootstrap is running.

Ok for master if bootstrap and regtesting are clean?

Thanks,
Tamar

gcc/ChangeLog:

PR tree-optimization/113731
* gimple-iterator.cc (gsi_move_before): Take new parameter for update
method.
* gimple-iterator.h (gsi_move_before): Default new param to
GSI_SAME_STMT.
* tree-vect-loop.cc (move_early_exit_stmts): Call gsi_move_before with
GSI_NEW_STMT.

gcc/testsuite/ChangeLog:

PR tree-optimization/113731
* gcc.dg/vect/vect-early-break_111-pr113731.c: New test.

--- inline copy of patch ---

diff --git a/gcc/gimple-iterator.cc b/gcc/gimple-iterator.cc
index 
517c53376f0511af59e124f52ec7be566a6c4789..f67bcfbfdfdd7c6cb0ad0130972f5b1dc4429bcf
 100644
--- a/gcc/gimple-iterator.cc
+++ b/gcc/gimple-iterator.cc
@@ -666,10 +666,11 @@ gsi_move_after (gimple_stmt_iterator *from, 
gimple_stmt_iterator *to)
 
 
 /* Move the statement at FROM so it comes right before the statement
-   at TO.  */
+   at TO using method M.  */
 
 void
-gsi_move_before (gimple_stmt_iterator *from, gimple_stmt_iterator *to)
+gsi_move_before (gimple_stmt_iterator *from, gimple_stmt_iterator *to,
+gsi_iterator_update m = GSI_SAME_STMT)
 {
   gimple *stmt = gsi_stmt (*from);
   gsi_remove (from, false);
@@ -677,7 +678,7 @@ gsi_move_before (gimple_stmt_iterator *from, 
gimple_stmt_iterator *to)
   /* For consistency with gsi_move_after, it might be better to have
  GSI_NEW_STMT here; however, that breaks several places that expect
  that TO does not change.  */
-  gsi_insert_before (to, stmt, GSI_SAME_STMT);
+  gsi_insert_before (to, stmt, m);
 }
 
 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_111-pr113731.c 
b/gcc/testsuite/gcc.dg/vect/vect-early-break_111-pr113731.c
new file mode 100644
index 
..2d6db91df97625a7f11609d034e89af0461129b2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_111-pr113731.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+
+char* inet_net_pton_ipv4_bits;
+char inet_net_pton_ipv4_odst;
+void __errno_location();
+void inet_net_pton_ipv4();
+void inet_net_pton() { inet_net_pton_ipv4(); }
+void inet_net_pton_ipv4(char *dst, int size) {
+  while ((inet_net_pton_ipv4_bits > dst) & inet_net_pton_ipv4_odst) {
+if (size-- <= 0)
+  goto emsgsize;
+*dst++ = '\0';
+  }
+emsgsize:
+  __errno_location();
+}
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 
30b90d99925bea74caf14833d8ab1695607d0fe9..9aba94bd6ca2061a19487ac4a2735a16d03bcbee
 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -11800,8 +11800,7 @@ move_early_exit_stmts (loop_vec_info loop_vinfo)
dump_printf_loc (MSG_NOTE, vect_location, "moving stmt %G", stmt);
 
   gimple_stmt_iterator stmt_gsi = gsi_for_stmt (stmt);
-  gsi_move_before (&stmt_gsi, &dest_gsi);
-  gsi_prev (&dest_gsi);
+  gsi_move_before (&stmt_gsi, &dest_gsi, GSI_NEW_STMT);
 }
 
   /* Update all the stmts with their new reaching VUSES.  */


rb18247.patch
Description: rb18247.patch
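
The difference between the two update modes can be illustrated with a toy model
built on std::list.  This is only an analogy under stated assumptions: toy_seq,
place_same_stmt and place_new_stmt are invented names, and std::list iterators
stand in for gimple_stmt_iterator without sharing its implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <vector>

typedef std::list<std::string> toy_seq;

// Analogue of GSI_SAME_STMT: the cursor keeps denoting its old position,
// so every statement is inserted immediately before that same position.
toy_seq
place_same_stmt (toy_seq dest, const std::vector<std::string> &reversed)
{
  toy_seq::iterator cur = dest.begin ();
  for (std::size_t i = 0; i < reversed.size (); i++)
    dest.insert (cur, reversed[i]);
  return dest;
}

// Analogue of GSI_NEW_STMT: the cursor moves to the statement just
// inserted, so the next one lands in front of it.
toy_seq
place_new_stmt (toy_seq dest, const std::vector<std::string> &reversed)
{
  toy_seq::iterator cur = dest.begin ();
  for (std::size_t i = 0; i < reversed.size (); i++)
    cur = dest.insert (cur, reversed[i]);
  return dest;
}
```

With statements recorded in reverse order, the second variant restores program
order without a separate step back, including when the destination starts out
empty and the cursor is the end position.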


[PATCH] contrib: Fill in HOST{CC,CFLAGS,CXX,CXXFLAGS} in test_installed

2024-02-05 Thread Jakub Jelinek
Hi!

gcc/Makefile.in since my r0-60234 change fills in HOSTCC and HOSTCFLAGS
in site.exp and since r8-671 also HOSTCXX and HOSTCXXFLAGS.
If those variables aren't set, we get errors like:
/usr/src/gcc/contrib/test_installed --without-g++ --without-gfortran 
--without-objc struct-layout-1.exp
...
ERROR: tcl error sourcing 
/usr/src/gcc/gcc/testsuite/gcc.dg/compat/struct-layout-1.exp.
ERROR: tcl error code TCL LOOKUP VARNAME HOSTCC
ERROR: can't read "HOSTCC": no such variable
while executing
"remote_exec build "$HOSTCC $HOSTCFLAGS $generator_cmd""
(file "/usr/src/gcc/gcc/testsuite/gcc.dg/compat/struct-layout-1.exp" line 
96)
invoked from within
"source /usr/src/gcc/gcc/testsuite/gcc.dg/compat/struct-layout-1.exp"
("uplevel" body line 1)
invoked from within
"uplevel #0 source /usr/src/gcc/gcc/testsuite/gcc.dg/compat/struct-layout-1.exp"
invoked from within
"catch "uplevel #0 source $test_file_name" msg"

The same happens with struct-layout-1.exp in g++ and gfortran.  One doesn't
need to run struct-layout-1.exp on its own to see it; running with no
arguments at all triggers it as well, just later.

The following patch fills in HOSTCC and HOSTCXX as cc and c++ and leaves the
flags empty.  I believe that is what e.g. make uses by default, so it should
be a reasonable default.  We IMHO shouldn't default to GCC_UNDER_TEST because
that might be a cross-compiler etc.

Ok for trunk?

2024-02-05  Jakub Jelinek  

* test_installed: Fill in HOSTCC, HOSTCXX, HOSTCFLAGS and
HOSTCXXFLAGS.

--- contrib/test_installed.jj   2024-01-03 11:51:20.865879222 +0100
+++ contrib/test_installed  2024-02-05 14:36:03.625047250 +0100
@@ -114,6 +114,10 @@ set GCC_UNDER_TEST "${GCC_UNDER_TEST-${p
 set GXX_UNDER_TEST "${GXX_UNDER_TEST-${prefix}${prefix+/bin/}g++}"
 set GFORTRAN_UNDER_TEST "${GFORTRAN_UNDER_TEST-${prefix}${prefix+/bin/}gfortran}"
 set OBJC_UNDER_TEST "${OBJC_UNDER_TEST-${prefix}${prefix+/bin/}gcc}"
+set HOSTCC "${HOSTCC-cc}"
+set HOSTCXX "${HOSTCXX-c++}"
+set HOSTCFLAGS ""
+set HOSTCXXFLAGS ""
 EOF
 if test x${target} != x; then
   echo "set target_triplet $target" >> site.exp

Jakub
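
The ${HOSTCC-cc} form in the patch substitutes the fallback only when the
variable is unset, not when it is set to an empty string.  A rough C++ model of
that rule follows; default_if_unset and host_tool are hypothetical helpers
written for this note and are not part of test_installed.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Mirrors the shell's ${VAR-fallback}: the fallback is used only when the
// variable is unset (a null pointer here); an empty value is kept as is.
std::string
default_if_unset (const char *value, const std::string &fallback)
{
  return value ? std::string (value) : fallback;
}

// Hypothetical helper: look the variable up in the process environment.
std::string
host_tool (const char *var_name, const std::string &fallback)
{
  return default_if_unset (std::getenv (var_name), fallback);
}
```

Under this rule an explicit HOSTCC from the environment is still honored, and
cc or c++ only apply when nothing was provided.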



RE: [PATCH]middle-end: add additional runtime test for [PR113467]

2024-02-05 Thread Tamar Christina
> > Ok for master?
> 
> I think you need a lp64 target check for the large constants or
> alternatively use uint64_t?
> 

Ok, how about this one.

Regtested on x86_64-pc-linux-gnu with -m32,-m64 and no issues.

Ok for master?

Thanks,
Tamar

gcc/testsuite/ChangeLog:

PR tree-optimization/113467
* gcc.dg/vect/vect-early-break_110-pr113467.c: New test.

--- inline copy of patch ---

diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_110-pr113467.c 
b/gcc/testsuite/gcc.dg/vect/vect-early-break_110-pr113467.c
new file mode 100644
index 
..1e2c47be5fdf1e1fed88e4b5f45d7eda6c3b85d1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_110-pr113467.c
@@ -0,0 +1,52 @@
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_long_long } */
+
+/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
+
+#include "tree-vect.h"
+#include <stdint.h>
+
+typedef struct gcry_mpi *gcry_mpi_t;
+struct gcry_mpi {
+  int nlimbs;
+  unsigned long *d;
+};
+
+long gcry_mpi_add_ui_up;
+void gcry_mpi_add_ui(gcry_mpi_t w, gcry_mpi_t u, unsigned v) {
+  gcry_mpi_add_ui_up = *w->d;
+  if (u) {
+uint64_t *res_ptr = w->d, *s1_ptr = w->d;
+int s1_size = u->nlimbs;
+unsigned s2_limb = v, x = *s1_ptr++;
+s2_limb += x;
+*res_ptr++ = s2_limb;
+if (x)
+  while (--s1_size) {
+x = *s1_ptr++ + 1;
+*res_ptr++ = x;
+if (x) {
+  break;
+}
+  }
+  }
+}
+
+int main()
+{
+  check_vect ();
+
+  static struct gcry_mpi sv;
+  static uint64_t vals[] = {4294967288ULL, 191ULL,4160749568ULL, 4294963263ULL,
+127ULL,4294950912ULL, 255ULL, 4294901760ULL,
+534781951ULL,  33546240ULL,   4294967292ULL, 4294960127ULL,
+4292872191ULL, 4294967295ULL, 4294443007ULL, 3ULL};
+  gcry_mpi_t v = &sv;
+  v->nlimbs = 16;
+  v->d = vals;
+
+  gcry_mpi_add_ui(v, v, 8);
+  if (v->d[1] != 192)
+__builtin_abort();
+}


rb18246.patch
Description: rb18246.patch


Re: [PATCH v4 5/5] Add documentation for musttail attribute

2024-02-05 Thread Andi Kleen
On Sat, Feb 03, 2024 at 09:35:43PM -0700, Sandra Loosemore wrote:
> On 2/2/24 02:09, Andi Kleen wrote:
> > gcc/ChangeLog:
> > 
> > * doc/extend.texi: Document [[musttail]]
> > ---
> >   gcc/doc/extend.texi | 16 
> >   1 file changed, 16 insertions(+)
> > 
> > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> > index 142e41ab8fbf..866f6c4a9fed 100644
> > --- a/gcc/doc/extend.texi
> > +++ b/gcc/doc/extend.texi
> > @@ -9875,6 +9875,22 @@ foo (int x, int y)
> >   @code{y} is not actually incremented and the compiler can but does not
> >   have to optimize it to just @code{return 42 + 42;}.
> > +@cindex @code{musttail} statement attribute
> > +@item musttail
> > +
> > +The @code{gnu::musttail} or @code{clang::musttail} attribute
> > +can be applied to a return statement that returns the value
> > +of a call to indicate that the call must be a tail call
> > +that does not allocate extra stack space.
> 
> It took me about 3 attempts to parse this.  :-S  I think this might be a
> little better:
> 
> ...can be applied to a @code{return} statement with a return-value
> expression that is a function call.  It asserts that the call must be a tail
> call that does not allocate extra stack space.
> 
> > +
> > +@smallexample
> > +[[gnu::musttail]] return foo();
> > +@end smallexample
> > +
> > +If the compiler cannot generate a tail call it will generate
> 
> s/will generate/generates/
> 
> I'm a big fan of writing in the present tense.  ;-)
> 
> > +an error. Tail calls generally require enabling optimization.
> > +On some targets they may not be supported.
> > +
> >   @end table
> >   @node Attribute Syntax
> 
> In addition to these changes, at the beginning of this section we have
> 
> @node Statement Attributes
> @section Statement Attributes
> @cindex Statement Attributes
> 
> GCC allows attributes to be set on null statements.  @xref{Attribute
> Syntax},
> for details of the exact syntax for using attributes. [...]
> 
> Well, we now have an attribute that goes on a non-null statement, so we have
> to fix this.  The documentation for the other statement attributes is

FWIW we always had, they just were ignored (with a warning)


Thanks Sandra. I applied the changes. Diff appended for reference.


diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 866f6c4a9fed..fe1ee245ed69 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -9818,7 +9818,7 @@ same manner as the @code{deprecated} attribute.
 @section Statement Attributes
 @cindex Statement Attributes
 
-GCC allows attributes to be set on null statements.  @xref{Attribute Syntax},
+GCC allows attributes to be set on statements.  @xref{Attribute Syntax},
 for details of the exact syntax for using attributes.  Other attributes are
 available for functions (@pxref{Function Attributes}), variables
 (@pxref{Variable Attributes}), labels (@pxref{Label Attributes}), enumerators
@@ -9879,15 +9879,15 @@ have to optimize it to just @code{return 42 + 42;}.
 @item musttail
 
 The @code{gnu::musttail} or @code{clang::musttail} attribute
-can be applied to a return statement that returns the value
-of a call to indicate that the call must be a tail call
-that does not allocate extra stack space.
+can be applied to a @code{return} statement with a return-value expression
+that is a function call.  It asserts that the call must be a tail call that
+does not allocate extra stack space.
 
 @smallexample
 [[gnu::musttail]] return foo();
 @end smallexample
 
-If the compiler cannot generate a tail call it will generate
+If the compiler cannot generate a tail call it generates
 an error. Tail calls generally require enabling optimization.
 On some targets they may not be supported.
 
@@ -10014,7 +10014,9 @@ the constant expression, if present.
 
 @subsubheading Statement Attributes
 In GNU C, an attribute specifier list may appear as part of a null
-statement.  The attribute goes before the semicolon.
+statement. The attribute goes before the semicolon.
+Some attributes in new style syntax are also supported
+on non-null statements.
 
 @subsubheading Type Attributes
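
A slightly fuller usage sketch for the attribute documented above.  The macro
TOY_MUSTTAIL and the function sum_to are invented for illustration; the
feature test is an assumption about how one might guard the attribute so the
code still builds on compilers without it, where it degrades to a plain call.

```cpp
#include <cassert>

// Use the attribute only where the compiler advertises it; elsewhere the
// macro expands to nothing and the call is an ordinary one.
#if defined(__has_cpp_attribute)
# if __has_cpp_attribute(gnu::musttail)
#  define TOY_MUSTTAIL [[gnu::musttail]]
# elif __has_cpp_attribute(clang::musttail)
#  define TOY_MUSTTAIL [[clang::musttail]]
# endif
#endif
#ifndef TOY_MUSTTAIL
# define TOY_MUSTTAIL
#endif

// Sum 1..n with an accumulator.  The recursive call is the whole
// return-value expression, which is the position the attribute requires.
long long
sum_to (long long n, long long acc)
{
  if (n == 0)
    return acc;
  TOY_MUSTTAIL return sum_to (n - 1, acc + n);
}
```

The caller and callee have the same signature here, which keeps the call
eligible for tail-call treatment on the targets that support it.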
 


RE: [PATCH]middle-end: fix ICE when moving statements to empty BB [PR113731]

2024-02-05 Thread Tamar Christina
> -Original Message-
> From: Richard Biener 
> Sent: Monday, February 5, 2024 1:22 PM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com
> Subject: Re: [PATCH]middle-end: fix ICE when moving statements to empty BB
> [PR113731]
> 
> On Mon, 5 Feb 2024, Tamar Christina wrote:
> 
> > Hi All,
> >
> > We use gsi_move_before (&stmt_gsi, &dest_gsi); to request that the new
> > statement be placed before any other statement.  Typically this then moves
> > the current pointer to be after the statement we just inserted.
> >
> > However it looks like when the BB is empty, this does not happen and the CUR
> > pointer stays NULL.   There's a comment in the source of gsi_insert_before
> > that explains:
> >
> > /* If CUR is NULL, we link at the end of the sequence (this case happens
> >
> > so it adds it to the end instead of start like you asked.  This means that
> > in this case there's nothing to move and so we shouldn't move the pointer if
> > we're already at the HEAD.
> 
> The issue is that a gsi_end_p () is ambiguous, it could be the start
> or the end.  gsi_insert_before treats it as "end" while gsi_insert_after
> treats it as "start" since you can't really insert "after" the "end".
> 
> gsi_move_before doesn't update the insertion pointer (using
> GSI_SAME_STMT), so with a gsi_end_p () you get what you ask for.
> 
> Btw,
> 
>   /* Move all stmts that need moving.  */
>   basic_block dest_bb = LOOP_VINFO_EARLY_BRK_DEST_BB (loop_vinfo);
>   gimple_stmt_iterator dest_gsi = gsi_start_bb (dest_bb);
> 
> should probably use gsi_after_labels (dest_bb) just in case.

See next patch.

> 
> It looks like LOOP_VINFO_EARLY_BRK_STORES is "reverse"?  Is that
> why you are doing gsi_move_before + gsi_prev?  Why do gsi_prev
> at all?
> 

Yes, it stores them in reverse because we record them from the latch on up.
So we either have to iterate backwards, insert them at the front, or move the gsi.

I guess I could remove it by removing the for-each loop and iterating in
reverse.  Is that preferred?

Tamar.

> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > PR tree-optimization/113731
> > * tree-vect-loop.cc (move_early_exit_stmts): Conditionally move pointer.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR tree-optimization/113731
> > * gcc.dg/vect/vect-early-break_111-pr113731.c: New test.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_111-pr113731.c
> b/gcc/testsuite/gcc.dg/vect/vect-early-break_111-pr113731.c
> > new file mode 100644
> > index
> ..2d6db91df97625a7f1160
> 9d034e89af0461129b2
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_111-pr113731.c
> > @@ -0,0 +1,21 @@
> > +/* { dg-do compile } */
> > +/* { dg-add-options vect_early_break } */
> > +/* { dg-require-effective-target vect_early_break } */
> > +/* { dg-require-effective-target vect_int } */
> > +
> > +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> > +
> > +char* inet_net_pton_ipv4_bits;
> > +char inet_net_pton_ipv4_odst;
> > +void __errno_location();
> > +void inet_net_pton_ipv4();
> > +void inet_net_pton() { inet_net_pton_ipv4(); }
> > +void inet_net_pton_ipv4(char *dst, int size) {
> > +  while ((inet_net_pton_ipv4_bits > dst) & inet_net_pton_ipv4_odst) {
> > +if (size-- <= 0)
> > +  goto emsgsize;
> > +*dst++ = '\0';
> > +  }
> > +emsgsize:
> > +  __errno_location();
> > +}
> > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> > index
> 30b90d99925bea74caf14833d8ab1695607d0fe9..e2587315020a35a7d4ebd3e
> 7a9842caa36bb5d3c 100644
> > --- a/gcc/tree-vect-loop.cc
> > +++ b/gcc/tree-vect-loop.cc
> > @@ -11801,7 +11801,8 @@ move_early_exit_stmts (loop_vec_info loop_vinfo)
> >
> >gimple_stmt_iterator stmt_gsi = gsi_for_stmt (stmt);
> >gsi_move_before (&stmt_gsi, &dest_gsi);
> > -  gsi_prev (&dest_gsi);
> > +  if (!gsi_end_p (dest_gsi))
> > +   gsi_prev (&dest_gsi);
> >  }
> >
> >/* Update all the stmts with their new reaching VUSES.  */
> >
> >
> >
> >
> >
> 
> --
> Richard Biener 
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Re: [PATCH]middle-end: fix ICE when destination BB for stores starts with a label [PR113750]

2024-02-05 Thread Richard Biener
On Mon, 5 Feb 2024, Tamar Christina wrote:

> Hi All,
> 
> The report shows that if the FE leaves a label as the first thing in the dest
> BB then we ICE because we move the stores before the label.
> 
> This is easy to fix if we know that there's still only one way into the BB.
> We would have already rejected the loop if there were multiple paths into the
> BB; however, I added an additional check just for early break, with an
> explanation, in case the other constraints are relaxed later.
> 
> After that we fix the issue just by getting the GSI after the labels and I add
> a bunch of testcases for different positions the label can be added.  Only the
> vect-early-break_112-pr113750.c one results in the label being kept.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?

OK.  I'll note the extra check is likely redundant with the one
for in-loop diamonds.

Thanks,
Richard.
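
The "start of the block, but after labels" placement described in the quoted
patch can be sketched with a toy block.  toy_stmt, toy_bb, after_labels and
sink_store are invented for this note; they only mimic the idea behind
gsi_after_labels and are not GCC code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy statement: either a label or an ordinary statement.
struct toy_stmt
{
  bool is_label;
  std::string text;
};

typedef std::vector<toy_stmt> toy_bb;

// Index of the first statement that is not a leading label.
std::size_t
after_labels (const toy_bb &bb)
{
  std::size_t i = 0;
  while (i < bb.size () && bb[i].is_label)
    i++;
  return i;
}

// Place a moved store at the start of the block, but never ahead of the
// labels that must stay first.
toy_bb
sink_store (toy_bb bb, const toy_stmt &store)
{
  toy_bb::difference_type pos
    = static_cast<toy_bb::difference_type> (after_labels (bb));
  bb.insert (bb.begin () + pos, store);
  return bb;
}
```

Inserting at index 0 instead would put the store ahead of the label, which is
the situation the report describes.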

> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/113750
>   * tree-vect-data-refs.cc (vect_analyze_early_break_dependences): Check
>   for single predecessor when doing early break vect.
>   * tree-vect-loop.cc (move_early_exit_stmts): Get gsi at the start but
>   after labels.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/113750
>   * gcc.dg/vect/vect-early-break_112-pr113750.c: New test.
>   * gcc.dg/vect/vect-early-break_113-pr113750.c: New test.
>   * gcc.dg/vect/vect-early-break_114-pr113750.c: New test.
>   * gcc.dg/vect/vect-early-break_115-pr113750.c: New test.
>   * gcc.dg/vect/vect-early-break_116-pr113750.c: New test.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_112-pr113750.c 
> b/gcc/testsuite/gcc.dg/vect/vect-early-break_112-pr113750.c
> new file mode 100644
> index 
> ..559ebd84d5c39881e694e7c8c31be29d846866ed
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_112-pr113750.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-add-options vect_early_break } */
> +/* { dg-require-effective-target vect_early_break } */
> +/* { dg-require-effective-target vect_int } */
> +
> +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> +
> +#ifndef N
> +#define N 800
> +#endif
> +unsigned vect_a[N];
> +unsigned vect_b[N];
> +
> +unsigned test4(unsigned x)
> +{
> + unsigned ret = 0;
> + for (int i = 0; i < N; i++)
> + {
> +   vect_b[i] = x + i;
> +   if (vect_a[i] != x)
> + break;
> +foo:
> +   vect_a[i] = x;
> + }
> + return ret;
> +}
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_113-pr113750.c 
> b/gcc/testsuite/gcc.dg/vect/vect-early-break_113-pr113750.c
> new file mode 100644
> index 
> ..ba85780a46b1378aaec238ff9eb5f906be9a44dd
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_113-pr113750.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-add-options vect_early_break } */
> +/* { dg-require-effective-target vect_early_break } */
> +/* { dg-require-effective-target vect_int } */
> +
> +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> +
> +#ifndef N
> +#define N 800
> +#endif
> +unsigned vect_a[N];
> +unsigned vect_b[N];
> +
> +unsigned test4(unsigned x)
> +{
> + unsigned ret = 0;
> + for (int i = 0; i < N; i++)
> + {
> +   vect_b[i] = x + i;
> +   if (vect_a[i] != x)
> + break;
> +   vect_a[i] = x;
> +foo:
> + }
> + return ret;
> +}
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_114-pr113750.c 
> b/gcc/testsuite/gcc.dg/vect/vect-early-break_114-pr113750.c
> new file mode 100644
> index 
> ..37af2998688f5d60e2cdb372ab43afcaa52a3146
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_114-pr113750.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-add-options vect_early_break } */
> +/* { dg-require-effective-target vect_early_break } */
> +/* { dg-require-effective-target vect_int } */
> +
> +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> +
> +#ifndef N
> +#define N 800
> +#endif
> +unsigned vect_a[N];
> +unsigned vect_b[N];
> +
> +unsigned test4(unsigned x)
> +{
> + unsigned ret = 0;
> + for (int i = 0; i < N; i++)
> + {
> +   vect_b[i] = x + i;
> +foo:
> +   if (vect_a[i] != x)
> + break;
> +   vect_a[i] = x;
> + }
> + return ret;
> +}
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_115-pr113750.c 
> b/gcc/testsuite/gcc.dg/vect/vect-early-break_115-pr113750.c
> new file mode 100644
> index 
> ..502686d308e298cd84e9e3b74d7b4ad1979602a9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_115-pr113750.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-add-options vect_early_break } */
> +/* { dg-require-effective-target vect_early_break } */
> +/* { dg-require-effective-target vect_int } */
> +
> +/* { dg-final { scan-tree-dump "LOOP 

Re: [PATCH]middle-end: fix ICE when moving statements to empty BB [PR113731]

2024-02-05 Thread Richard Biener
On Mon, 5 Feb 2024, Tamar Christina wrote:

> Hi All,
> 
> We use gsi_move_before (&stmt_gsi, &dest_gsi); to request that the new
> statement
> be placed before any other statement.  Typically this then moves the current
> pointer to be after the statement we just inserted.
> 
> However it looks like when the BB is empty, this does not happen and the CUR
> pointer stays NULL.  There's a comment in the source of gsi_insert_before
> that explains:
> 
> /* If CUR is NULL, we link at the end of the sequence (this case happens
> 
> so it adds it to the end instead of start like you asked.  This means that in
> this case there's nothing to move and so we shouldn't move the pointer if
> we're already at the HEAD.

The issue is that a gsi_end_p () is ambiguous, it could be the start
or the end.  gsi_insert_before treats it as "end" while gsi_insert_after
treats it as "start" since you can't really insert "after" the "end".

gsi_move_before doesn't update the insertion pointer (using 
GSI_SAME_STMT), so with a gsi_end_p () you get what you ask for.
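
To make the ambiguity concrete, here is a minimal sketch in plain C.  It is a
toy model only: struct node, struct seq, struct iter and the helper functions
are hypothetical names invented for illustration, not GCC's
gimple_stmt_iterator API.  A NULL cursor is read as "end" by insert_before,
and the placement loop steps back only when the cursor points at a real node.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for a statement sequence and its iterator.  All names
   here are hypothetical; this is not GCC's gimple_stmt_iterator API.  */
struct node { int id; struct node *prev, *next; };
struct seq { struct node *first, *last; };
struct iter { struct seq *s; struct node *cur; };  /* cur == NULL: "at end" */

static int
iter_end_p (const struct iter *it)
{
  return it->cur == NULL;
}

/* Step back one node.  Only valid when the iterator points at a node.  */
static void
iter_prev (struct iter *it)
{
  assert (it->cur != NULL);
  it->cur = it->cur->prev;
}

/* Link N in front of the iterator position without moving the iterator.
   A NULL position is read as "end", so N is appended.  */
static void
insert_before (struct iter *it, struct node *n)
{
  struct seq *s = it->s;
  struct node *at = it->cur;

  n->next = at;
  n->prev = at ? at->prev : s->last;
  if (n->prev)
    n->prev->next = n;
  else
    s->first = n;
  if (at)
    at->prev = n;
  else
    s->last = n;
}

/* Place NODES[0..COUNT-1] one after another, stepping back only when
   the iterator points at a real node.  */
static void
place_all (struct seq *s, struct node *nodes, size_t count)
{
  struct iter it = { s, s->first };
  for (size_t i = 0; i < count; i++)
    {
      insert_before (&it, &nodes[i]);
      if (!iter_end_p (&it))
        iter_prev (&it);
    }
}
```

In this toy model, placing two nodes into an empty sequence leaves them in
placement order, while placing them into a non-empty sequence leaves them in
reverse placement order in front of the existing node.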

Btw,

  /* Move all stmts that need moving.  */
  basic_block dest_bb = LOOP_VINFO_EARLY_BRK_DEST_BB (loop_vinfo);
  gimple_stmt_iterator dest_gsi = gsi_start_bb (dest_bb);

should probably use gsi_after_labels (dest_bb) just in case.

It looks like LOOP_VINFO_EARLY_BRK_STORES is "reverse"?  Is that
why you are doing gsi_move_before + gsi_prev?  Why do gsi_prev
at all?

> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/113731
>   * tree-vect-loop.cc (move_early_exit_stmts): Conditionally move pointer.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/113731
>   * gcc.dg/vect/vect-early-break_111-pr113731.c: New test.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_111-pr113731.c 
> b/gcc/testsuite/gcc.dg/vect/vect-early-break_111-pr113731.c
> new file mode 100644
> index 
> ..2d6db91df97625a7f11609d034e89af0461129b2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_111-pr113731.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-add-options vect_early_break } */
> +/* { dg-require-effective-target vect_early_break } */
> +/* { dg-require-effective-target vect_int } */
> +
> +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> +
> +char* inet_net_pton_ipv4_bits;
> +char inet_net_pton_ipv4_odst;
> +void __errno_location();
> +void inet_net_pton_ipv4();
> +void inet_net_pton() { inet_net_pton_ipv4(); }
> +void inet_net_pton_ipv4(char *dst, int size) {
> +  while ((inet_net_pton_ipv4_bits > dst) & inet_net_pton_ipv4_odst) {
> +if (size-- <= 0)
> +  goto emsgsize;
> +*dst++ = '\0';
> +  }
> +emsgsize:
> +  __errno_location();
> +}
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 
> 30b90d99925bea74caf14833d8ab1695607d0fe9..e2587315020a35a7d4ebd3e7a9842caa36bb5d3c
>  100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -11801,7 +11801,8 @@ move_early_exit_stmts (loop_vec_info loop_vinfo)
>  
>gimple_stmt_iterator stmt_gsi = gsi_for_stmt (stmt);
>gsi_move_before (&stmt_gsi, &dest_gsi);
> -  gsi_prev (&dest_gsi);
> +  if (!gsi_end_p (dest_gsi))
> + gsi_prev (&dest_gsi);
>  }
>  
>/* Update all the stmts with their new reaching VUSES.  */
> 
> 
> 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Re: [PATCH] c: Avoid ICE with _BitInt(N) : 0 bitfield [PR113740]

2024-02-05 Thread Marek Polacek
On Mon, Feb 05, 2024 at 08:57:18AM +0100, Jakub Jelinek wrote:
> Hi!
> 
> finish_struct already made sure not to call build_bitint_type for
> signed _BitInt(2) : 1;
> or
> signed _BitInt(2) : 0;
> bitfields (but instead build a zero precision integral type,
> we remove it later), this patch makes sure we do it also for
> unsigned _BitInt(1) : 0;
> because of the build_bitint_type assertion that precision is
> >= (unsigned ? 1 : 2).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK, thanks.
 
> 2024-02-05  Jakub Jelinek  
> 
>   PR c/113740
>   * c-decl.cc (finish_struct): Only use build_bitint_type if
>   bit-field has width larger or equal to minimum _BitInt
>   precision.
> 
>   * gcc.dg/bitint-85.c: New test.
> 
> --- gcc/c/c-decl.cc.jj	2024-02-01 09:14:16.474551596 +0100
> +++ gcc/c/c-decl.cc   2024-02-03 13:03:35.272479105 +0100
> @@ -9555,7 +9555,7 @@ finish_struct (location_t loc, tree t, t
> if (width != TYPE_PRECISION (type))
>   {
> if (TREE_CODE (type) == BITINT_TYPE
> -   && (width > 1 || TYPE_UNSIGNED (type)))
> +   && width >= (TYPE_UNSIGNED (type) ? 1 : 2))
>   TREE_TYPE (field)
> = build_bitint_type (width, TYPE_UNSIGNED (type));
> else
> --- gcc/testsuite/gcc.dg/bitint-85.c.jj   2024-02-03 13:05:49.162639344 +0100
> +++ gcc/testsuite/gcc.dg/bitint-85.c  2024-02-03 13:05:39.489772259 +0100
> @@ -0,0 +1,5 @@
> +/* PR c/113740 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-std=c23" } */
> +
> +struct S { unsigned _BitInt(32) : 0; };
> 
>   Jakub
> 

Marek
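
For readers comparing the two conditions in the c-decl.cc hunk above, the
following is a small sketch in plain C that restates them as predicates on a
bit-field width and a signedness flag.  The function names are hypothetical
and only the two boolean expressions come from the patch; the rest is
illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Condition before the patch.  */
static bool
old_uses_bitint (unsigned width, bool is_unsigned)
{
  return width > 1 || is_unsigned;
}

/* Condition after the patch: at least the minimum _BitInt precision.  */
static bool
new_uses_bitint (unsigned width, bool is_unsigned)
{
  return width >= (is_unsigned ? 1u : 2u);
}

/* Count the (width, signedness) pairs up to MAX_WIDTH on which the two
   conditions disagree.  */
static unsigned
count_disagreements (unsigned max_width)
{
  unsigned count = 0;
  for (unsigned width = 0; width <= max_width; width++)
    for (int u = 0; u <= 1; u++)
      if (old_uses_bitint (width, u != 0) != new_uses_bitint (width, u != 0))
        count++;
  return count;
}

/* Self-check: the only disagreement is an unsigned zero-width bit-field.  */
static void
check_conditions (void)
{
  assert (old_uses_bitint (0, true) && !new_uses_bitint (0, true));
  assert (count_disagreements (128) == 1);
}
```

The two predicates differ only for an unsigned bit-field of width 0, which is
the case the new test exercises.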



Re: [PATCH]middle-end: add additional runtime test for [PR113467]

2024-02-05 Thread Richard Biener
On Mon, 5 Feb 2024, Tamar Christina wrote:

> Hi All,
> 
> This just adds an additional runtime testcase for the fixed issue.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?

I think you need a lp64 target check for the large constants or
alternatively use uint64_t?

> Thanks,
> Tamar
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/113467
>   * gcc.dg/vect/vect-early-break_110-pr113467.c: New test.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_110-pr113467.c 
> b/gcc/testsuite/gcc.dg/vect/vect-early-break_110-pr113467.c
> new file mode 100644
> index 
> ..2d8a071c0e922ccfd5fa8c7b2704852dbd95
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_110-pr113467.c
> @@ -0,0 +1,51 @@
> +/* { dg-add-options vect_early_break } */
> +/* { dg-require-effective-target vect_early_break } */
> +/* { dg-require-effective-target vect_int } */
> +
> +/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
> +
> +#include "tree-vect.h"
> +
> +typedef struct gcry_mpi *gcry_mpi_t;
> +struct gcry_mpi {
> +  int nlimbs;
> +  unsigned long *d;
> +};
> +
> +long gcry_mpi_add_ui_up;
> +void gcry_mpi_add_ui(gcry_mpi_t w, gcry_mpi_t u, unsigned v) {
> +  gcry_mpi_add_ui_up = *w->d;
> +  if (u) {
> +unsigned long *res_ptr = w->d, *s1_ptr = w->d;
> +int s1_size = u->nlimbs;
> +unsigned s2_limb = v, x = *s1_ptr++;
> +s2_limb += x;
> +*res_ptr++ = s2_limb;
> +if (x)
> +  while (--s1_size) {
> +x = *s1_ptr++ + 1;
> +*res_ptr++ = x;
> +if (x) {
> +  break;
> +}
> +  }
> +  }
> +}
> +
> +int main()
> +{
> +  check_vect ();
> +
> +  static struct gcry_mpi sv;
> +  static unsigned long vals[] = {4294967288, 191,        4160749568, 4294963263,
> +                                 127,        4294950912, 255,        4294901760,
> +                                 534781951,  33546240,   4294967292, 4294960127,
> +                                 4292872191, 4294967295, 4294443007, 3};
> +  gcry_mpi_t v = &sv;
> +  v->nlimbs = 16;
> +  v->d = vals;
> +
> +  gcry_mpi_add_ui(v, v, 8);
> +  if (v->d[1] != 192)
> +__builtin_abort();
> +}
> 
> 
> 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


[PATCH]middle-end: fix ICE when destination BB for stores starts with a label [PR113750]

2024-02-05 Thread Tamar Christina
Hi All,

The report shows that if the FE leaves a label as the first thing in the dest
BB then we ICE because we move the stores before the label.

This is easy to fix if we know that there's still only one way into the BB.
We would have already rejected the loop if there were multiple paths into the
BB; however, I added an additional check (with an explanation) just for early
break, in case the other constraints are relaxed later.

After that we fix the issue just by getting the GSI after the labels and I add
a bunch of testcases for different positions the label can be added.  Only the
vect-early-break_112-pr113750.c one results in the label being kept.
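
As an illustration of "getting the insertion point after the labels", here is
a minimal sketch in plain C.  It models a basic block as an array of statement
kinds; enum stmt_kind and both functions are hypothetical names made up for
this example and are not GCC APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical statement kinds for a toy basic block.  */
enum stmt_kind { STMT_LABEL, STMT_STORE, STMT_OTHER };

/* Return the index of the first statement that is not a leading label.  */
static size_t
index_after_labels (const enum stmt_kind *stmts, size_t n)
{
  size_t i = 0;
  while (i < n && stmts[i] == STMT_LABEL)
    i++;
  return i;
}

/* Insert S after any leading labels; return the new statement count.
   STMTS must have room for at least N + 1 entries.  */
static size_t
insert_after_labels (enum stmt_kind *stmts, size_t n, size_t cap,
                     enum stmt_kind s)
{
  assert (n < cap);
  size_t pos = index_after_labels (stmts, n);
  memmove (&stmts[pos + 1], &stmts[pos], (n - pos) * sizeof stmts[0]);
  stmts[pos] = s;
  return n + 1;
}
```

With a block that starts with a label, the moved store lands after the label
rather than in front of it.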

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

PR tree-optimization/113750
* tree-vect-data-refs.cc (vect_analyze_early_break_dependences): Check
for single predecessor when doing early break vect.
* tree-vect-loop.cc (move_early_exit_stmts): Get gsi at the start but
after labels.

gcc/testsuite/ChangeLog:

PR tree-optimization/113750
* gcc.dg/vect/vect-early-break_112-pr113750.c: New test.
* gcc.dg/vect/vect-early-break_113-pr113750.c: New test.
* gcc.dg/vect/vect-early-break_114-pr113750.c: New test.
* gcc.dg/vect/vect-early-break_115-pr113750.c: New test.
* gcc.dg/vect/vect-early-break_116-pr113750.c: New test.

--- inline copy of patch -- 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_112-pr113750.c 
b/gcc/testsuite/gcc.dg/vect/vect-early-break_112-pr113750.c
new file mode 100644
index 
..559ebd84d5c39881e694e7c8c31be29d846866ed
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_112-pr113750.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+
+#ifndef N
+#define N 800
+#endif
+unsigned vect_a[N];
+unsigned vect_b[N];
+
+unsigned test4(unsigned x)
+{
+ unsigned ret = 0;
+ for (int i = 0; i < N; i++)
+ {
+   vect_b[i] = x + i;
+   if (vect_a[i] != x)
+ break;
+foo:
+   vect_a[i] = x;
+ }
+ return ret;
+}
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_113-pr113750.c 
b/gcc/testsuite/gcc.dg/vect/vect-early-break_113-pr113750.c
new file mode 100644
index 
..ba85780a46b1378aaec238ff9eb5f906be9a44dd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_113-pr113750.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+
+#ifndef N
+#define N 800
+#endif
+unsigned vect_a[N];
+unsigned vect_b[N];
+
+unsigned test4(unsigned x)
+{
+ unsigned ret = 0;
+ for (int i = 0; i < N; i++)
+ {
+   vect_b[i] = x + i;
+   if (vect_a[i] != x)
+ break;
+   vect_a[i] = x;
+foo:
+ }
+ return ret;
+}
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_114-pr113750.c 
b/gcc/testsuite/gcc.dg/vect/vect-early-break_114-pr113750.c
new file mode 100644
index 
..37af2998688f5d60e2cdb372ab43afcaa52a3146
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_114-pr113750.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+
+#ifndef N
+#define N 800
+#endif
+unsigned vect_a[N];
+unsigned vect_b[N];
+
+unsigned test4(unsigned x)
+{
+ unsigned ret = 0;
+ for (int i = 0; i < N; i++)
+ {
+   vect_b[i] = x + i;
+foo:
+   if (vect_a[i] != x)
+ break;
+   vect_a[i] = x;
+ }
+ return ret;
+}
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_115-pr113750.c 
b/gcc/testsuite/gcc.dg/vect/vect-early-break_115-pr113750.c
new file mode 100644
index 
..502686d308e298cd84e9e3b74d7b4ad1979602a9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_115-pr113750.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+
+#ifndef N
+#define N 800
+#endif
+unsigned vect_a[N];
+unsigned vect_b[N];
+
+unsigned test4(unsigned x)
+{
+ unsigned ret = 0;
+ for (int i = 0; i < N; i++)
+ {
+foo:
+   vect_b[i] = x + i;
+   if (vect_a[i] != x)
+ break;
+   vect_a[i] = x;
+ }
+ return ret;
+}
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_116-pr113750.c 
b/gcc/testsuite/gcc.dg/vect/vect-early-break_116-pr113750.c
new file mode 

[PATCH]middle-end: fix ICE when moving statements to empty BB [PR113731]

2024-02-05 Thread Tamar Christina
Hi All,

We use gsi_move_before (&stmt_gsi, &dest_gsi); to request that the new statement
be placed before any other statement.  Typically this then moves the current
pointer to be after the statement we just inserted.

However it looks like when the BB is empty, this does not happen and the CUR
pointer stays NULL.   There's a comment in the source of gsi_insert_before that
explains:

/* If CUR is NULL, we link at the end of the sequence (this case happens

so it adds it to the end instead of start like you asked.  This means that in
this case there's nothing to move and so we shouldn't move the pointer if we're
already at the HEAD.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

PR tree-optimization/113731
* tree-vect-loop.cc (move_early_exit_stmts): Conditionally move pointer.

gcc/testsuite/ChangeLog:

PR tree-optimization/113731
* gcc.dg/vect/vect-early-break_111-pr113731.c: New test.

--- inline copy of patch -- 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_111-pr113731.c 
b/gcc/testsuite/gcc.dg/vect/vect-early-break_111-pr113731.c
new file mode 100644
index 
..2d6db91df97625a7f11609d034e89af0461129b2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_111-pr113731.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+
+char* inet_net_pton_ipv4_bits;
+char inet_net_pton_ipv4_odst;
+void __errno_location();
+void inet_net_pton_ipv4();
+void inet_net_pton() { inet_net_pton_ipv4(); }
+void inet_net_pton_ipv4(char *dst, int size) {
+  while ((inet_net_pton_ipv4_bits > dst) & inet_net_pton_ipv4_odst) {
+if (size-- <= 0)
+  goto emsgsize;
+*dst++ = '\0';
+  }
+emsgsize:
+  __errno_location();
+}
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 
30b90d99925bea74caf14833d8ab1695607d0fe9..e2587315020a35a7d4ebd3e7a9842caa36bb5d3c
 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -11801,7 +11801,8 @@ move_early_exit_stmts (loop_vec_info loop_vinfo)
 
   gimple_stmt_iterator stmt_gsi = gsi_for_stmt (stmt);
   gsi_move_before (&stmt_gsi, &dest_gsi);
-  gsi_prev (&dest_gsi);
+  if (!gsi_end_p (dest_gsi))
+   gsi_prev (&dest_gsi);
 }
 
   /* Update all the stmts with their new reaching VUSES.  */









[PATCH]middle-end: add additional runtime test for [PR113467]

2024-02-05 Thread Tamar Christina
Hi All,

This just adds an additional runtime testcase for the fixed issue.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for master?

Thanks,
Tamar

gcc/testsuite/ChangeLog:

PR tree-optimization/113467
* gcc.dg/vect/vect-early-break_110-pr113467.c: New test.

--- inline copy of patch -- 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-early-break_110-pr113467.c 
b/gcc/testsuite/gcc.dg/vect/vect-early-break_110-pr113467.c
new file mode 100644
index 
..2d8a071c0e922ccfd5fa8c7b2704852dbd95
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-early-break_110-pr113467.c
@@ -0,0 +1,51 @@
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
+
+#include "tree-vect.h"
+
+typedef struct gcry_mpi *gcry_mpi_t;
+struct gcry_mpi {
+  int nlimbs;
+  unsigned long *d;
+};
+
+long gcry_mpi_add_ui_up;
+void gcry_mpi_add_ui(gcry_mpi_t w, gcry_mpi_t u, unsigned v) {
+  gcry_mpi_add_ui_up = *w->d;
+  if (u) {
+unsigned long *res_ptr = w->d, *s1_ptr = w->d;
+int s1_size = u->nlimbs;
+unsigned s2_limb = v, x = *s1_ptr++;
+s2_limb += x;
+*res_ptr++ = s2_limb;
+if (x)
+  while (--s1_size) {
+x = *s1_ptr++ + 1;
+*res_ptr++ = x;
+if (x) {
+  break;
+}
+  }
+  }
+}
+
+int main()
+{
+  check_vect ();
+
+  static struct gcry_mpi sv;
+  static unsigned long vals[] = {4294967288, 191,        4160749568, 4294963263,
+                                 127,        4294950912, 255,        4294901760,
+                                 534781951,  33546240,   4294967292, 4294960127,
+                                 4292872191, 4294967295, 4294443007, 3};
+  gcry_mpi_t v = &sv;
+  v->nlimbs = 16;
+  v->d = vals;
+
+  gcry_mpi_add_ui(v, v, 8);
+  if (v->d[1] != 192)
+__builtin_abort();
+}









[PATCH] libgomp: testsuite: Don't XPASS libgomp.c/alloc-pinned-1.c etc. on non-Linux targets [PR113448]

2024-02-05 Thread Rainer Orth
Two libgomp tests XPASS on Solaris (any non-Linux target actually) since
their introduction:

XPASS: libgomp.c/alloc-pinned-1.c execution test
XPASS: libgomp.c/alloc-pinned-2.c execution test

The problem is that the test just prints

OS unsupported

and exits successfully, while the test is XFAILed:

/* { dg-xfail-run-if "Pinning not implemented on this host" { ! *-*-linux-gnu } } */

Fixed by aborting immediately after the message above in the non-Linux
case.

Tested on i386-pc-solaris2.11 and i686-pc-linux-gnu.

Ok for trunk?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2024-02-02  Rainer Orth  

libgomp:
PR testsuite/113448
* testsuite/libgomp.c/alloc-pinned-1.c [!__linux__] (CHECK_SIZE):
Call abort.
* testsuite/libgomp.c/alloc-pinned-2.c [!__linux__] (CHECK_SIZE):
Likewise.

# HG changeset patch
# Parent  b7015efde7d6a48dd520698b470fcaf824758f21
libgomp: testsuite: Fix libgomp.c/alloc-pinned-1.c etc. on non-Linux targets [PR113085]

diff --git a/libgomp/testsuite/libgomp.c/alloc-pinned-1.c b/libgomp/testsuite/libgomp.c/alloc-pinned-1.c
--- a/libgomp/testsuite/libgomp.c/alloc-pinned-1.c
+++ b/libgomp/testsuite/libgomp.c/alloc-pinned-1.c
@@ -45,7 +45,10 @@ get_pinned_mem ()
 }
 #else
 #define PAGE_SIZE 1024 /* unknown */
-#define CHECK_SIZE(SIZE) fprintf (stderr, "OS unsupported\n");
+#define CHECK_SIZE(SIZE) { \
+  fprintf (stderr, "OS unsupported\n"); \
+  abort (); \
+  }
 #define EXPECT_OMP_NULL_ALLOCATOR
 
 int
diff --git a/libgomp/testsuite/libgomp.c/alloc-pinned-2.c b/libgomp/testsuite/libgomp.c/alloc-pinned-2.c
--- a/libgomp/testsuite/libgomp.c/alloc-pinned-2.c
+++ b/libgomp/testsuite/libgomp.c/alloc-pinned-2.c
@@ -45,12 +45,16 @@ get_pinned_mem ()
 }
 #else
 #define PAGE_SIZE 1024 /* unknown */
-#define CHECK_SIZE(SIZE) fprintf (stderr, "OS unsupported\n");
+#define CHECK_SIZE(SIZE) { \
+  fprintf (stderr, "OS unsupported\n"); \
+  abort (); \
+  }
 #define EXPECT_OMP_NULL_ALLOCATOR
 
 int
 get_pinned_mem ()
 {
+  abort ();
   return 0;
 }
 #endif


[PATCH] middle-end/109559 - warning in system header not suppressed

2024-02-05 Thread Richard Biener
set_inlining_locations looks at a possible macro expansion location
when the location is in a system header but it fails to update its
counter when there's no macro involved.  The following fixes that.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

This doesn't fix the observed diagnostic in the PR which I think is
now given by design since we diagnose inlined code from a system
header into a function not in a system header.  But I think it still
fixes a bug.

The whole set_inlining_locations is a bit pointless since all that
matters should be the location of the call (and its system header
status) then.  I'll also note that -Wno-system-headers doesn't
help and we don't have any flag to disable diagnosing 
inlined-from-system-header code either.

Unfortunately this is all from changes done by Martin Sebor so
it's difficult to tell the true intention.  The code in
set_inlining_locations doesn't really do what it documents,
but fixing it (I'll attach the "failed" patch in the PR) would
break testcases that check that we diagnose inline copies.

Anyway, OK for the change below, for which I don't have a testcase?

Thanks,
Richard.

PR middle-end/109559
* tree-diagnostic.cc (set_inlining_locations): Always
increment nsyslocs when loc is in a system header.
---
 gcc/tree-diagnostic.cc | 24 +++-
 1 file changed, 7 insertions(+), 17 deletions(-)

diff --git a/gcc/tree-diagnostic.cc b/gcc/tree-diagnostic.cc
index a660c7d0785..e050a6eccf6 100644
--- a/gcc/tree-diagnostic.cc
+++ b/gcc/tree-diagnostic.cc
@@ -339,24 +339,14 @@ set_inlining_locations (diagnostic_context *,
   block = BLOCK_SUPERCONTEXT (block);
 }
 
+  if (in_system_header_at (loc))
+++nsyslocs;
+
+  /* When there is an inlining context use the macro expansion
+ location for the original location and bump up NSYSLOCS if
+ it's in a system header since it's not counted above.  */
   if (ilocs.length ())
-{
-  /* When there is an inlining context use the macro expansion
-location for the original location and bump up NSYSLOCS if
-it's in a system header since it's not counted above.  */
-  location_t sysloc = expansion_point_location_if_in_system_header (loc);
-  if (sysloc != loc)
-   {
- loc = sysloc;
- ++nsyslocs;
-   }
-}
-  else
-{
-  /* When there's no inlining context use the original location
-and set NSYSLOCS accordingly.  */
-  nsyslocs = in_system_header_at (loc) != 0;
-}
+loc = expansion_point_location_if_in_system_header (loc);
 
   ilocs.safe_push (loc);
 
-- 
2.35.3


Re: [PATCH] x86-64: Update gcc.target/i386/apx-ndd.c

2024-02-05 Thread H.J. Lu
On Mon, Feb 5, 2024 at 3:53 AM H.J. Lu wrote:
>
> Fix the following issues:
>
> 1. Replace long with int64_t to support x32.
> 2. Replace \\(%rdi\\) with \\(%(?:r|e)di\\) for memory operand since x32
> uses (%edi).
> 3. Replace %(?:|r|e)al with %al in negb scan.
>
> * gcc.target/i386/apx-ndd.c: Updated.
> ---
>  gcc/testsuite/gcc.target/i386/apx-ndd.c | 68 -
>  1 file changed, 34 insertions(+), 34 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c 
> b/gcc/testsuite/gcc.target/i386/apx-ndd.c
> index b215f66d3e2..0eb751ad225 100644
> --- a/gcc/testsuite/gcc.target/i386/apx-ndd.c
> +++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c
> @@ -75,9 +75,9 @@ FOO2 (short, add, +)
>  FOO (int, add, +)
>  FOO1 (int, add, +)
>  FOO2 (int, add, +)
> -FOO (long, add, +)
> -FOO1 (long, add, +)
> -FOO2 (long, add, +)
> +FOO (int64_t, add, +)
> +FOO1 (int64_t, add, +)
> +FOO2 (int64_t, add, +)
>
>  FOO (char, sub, -)
>  FOO1 (char, sub, -)
> @@ -85,8 +85,8 @@ FOO (short, sub, -)
>  FOO1 (short, sub, -)
>  FOO (int, sub, -)
>  FOO1 (int, sub, -)
> -FOO (long, sub, -)
> -FOO1 (long, sub, -)
> +FOO (int64_t, sub, -)
> +FOO1 (int64_t, sub, -)
>
>  F (char, neg, -)
>  F1 (char, neg, -)
> @@ -94,8 +94,8 @@ F (short, neg, -)
>  F1 (short, neg, -)
>  F (int, neg, -)
>  F1 (int, neg, -)
> -F (long, neg, -)
> -F1 (long, neg, -)
> +F (int64_t, neg, -)
> +F1 (int64_t, neg, -)
>
>  F (char, not, ~)
>  F1 (char, not, ~)
> @@ -103,8 +103,8 @@ F (short, not, ~)
>  F1 (short, not, ~)
>  F (int, not, ~)
>  F1 (int, not, ~)
> -F (long, not, ~)
> -F1 (long, not, ~)
> +F (int64_t, not, ~)
> +F1 (int64_t, not, ~)
>
>  FOO (char, and, &)
>  FOO1 (char, and, &)
> @@ -112,8 +112,8 @@ FOO (short, and, &)
>  FOO1 (short, and, &)
>  FOO (int, and, &)
>  FOO1 (int, and, &)
> -FOO (long, and, &)
> -FOO1 (long, and, &)
> +FOO (int64_t, and, &)
> +FOO1 (int64_t, and, &)
>
>  FOO (char, or, |)
>  FOO1 (char, or, |)
> @@ -121,8 +121,8 @@ FOO (short, or, |)
>  FOO1 (short, or, |)
>  FOO (int, or, |)
>  FOO1 (int, or, |)
> -FOO (long, or, |)
> -FOO1 (long, or, |)
> +FOO (int64_t, or, |)
> +FOO1 (int64_t, or, |)
>
>  FOO (char, xor, ^)
>  FOO1 (char, xor, ^)
> @@ -130,8 +130,8 @@ FOO (short, xor, ^)
>  FOO1 (short, xor, ^)
>  FOO (int, xor, ^)
>  FOO1 (int, xor, ^)
> -FOO (long, xor, ^)
> -FOO1 (long, xor, ^)
> +FOO (int64_t, xor, ^)
> +FOO1 (int64_t, xor, ^)
>
>  FOO (char, shl, <<)
>  FOO3 (char, shl, <<, 7)
> @@ -139,8 +139,8 @@ FOO (short, shl, <<)
>  FOO3 (short, shl, <<, 7)
>  FOO (int, shl, <<)
>  FOO3 (int, shl, <<, 7)
> -FOO (long, shl, <<)
> -FOO3 (long, shl, <<, 7)
> +FOO (int64_t, shl, <<)
> +FOO3 (int64_t, shl, <<, 7)
>
>  FOO (char, sar, >>)
>  FOO3 (char, sar, >>, 7)
> @@ -148,8 +148,8 @@ FOO (short, sar, >>)
>  FOO3 (short, sar, >>, 7)
>  FOO (int, sar, >>)
>  FOO3 (int, sar, >>, 7)
> -FOO (long, sar, >>)
> -FOO3 (long, sar, >>, 7)
> +FOO (int64_t, sar, >>)
> +FOO3 (int64_t, sar, >>, 7)
>
>  FOO (uint8_t, shr, >>)
>  FOO3 (uint8_t, shr, >>, 7)
> @@ -170,33 +170,33 @@ FOO4 (uint16_t, rol, <<, >>, 1)
>  FOO4 (uint32_t, rol, <<, >>, 1)
>  FOO4 (uint64_t, rol, <<, >>, 1)
>
> -/* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), 
> %(?:|r|e)a(?:x|l)" 4 } } */
> +/* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]*1, 
> \\(%(?:r|e)di\\), %(?:|r|e)a(?:x|l)" 4 } } */
>  /* { dg-final { scan-assembler-times 
> "lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */
> -/* { dg-final { scan-assembler-times 
> "add(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), \\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } 
> } */
> -/* { dg-final { scan-assembler-times "sub(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), 
> %(?:|r|e)a(?:x|l)" 4 } } */
> +/* { dg-final { scan-assembler-times 
> "add(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), \\(%(?:r|e)di\\), 
> %(?:|r|e)a(?:x|l)" 4 } } */
> +/* { dg-final { scan-assembler-times "sub(?:b|l|w|q)\[^\n\r]*1, 
> \\(%(?:r|e)di\\), %(?:|r|e)a(?:x|l)" 4 } } */
>  /* { dg-final { scan-assembler-times 
> "sub(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), %(?:|r|e)di, %(?:|r|e)a(?:x|l)" 4 } 
> } */
> -/* { dg-final { scan-assembler-times "negb\[^\n\r]\\(%rdi\\), %(?:|r|e)al" 1 
> } } */
> -/* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]\\(%rdi\\), 
> %(?:|r|e)ax" 3 } } */
> +/* { dg-final { scan-assembler-times "negb\[^\n\r]\\(%(?:r|e)di\\), %al" 1 } 
> } */
> +/* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]\\(%(?:r|e)di\\), 
> %(?:|r|e)ax" 3 } } */
>  /* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]%(?:|r|e)di, 
> %(?:|r|e)ax" 4 } } */
> -/* { dg-final { scan-assembler-times "not(?:b|l|w|q)\[^\n\r]\\(%rdi\\), 
> %(?:|r|e)a(?:x|l)" 4 } } */
> +/* { dg-final { scan-assembler-times 
> "not(?:b|l|w|q)\[^\n\r]\\(%(?:r|e)di\\), %(?:|r|e)a(?:x|l)" 4 } } */
>  /* { dg-final { scan-assembler-times "not(?:l|w|q)\[^\n\r]%(?:|r|e)di, 
> %(?:|r|e)ax" 4 } } */
> -/* { dg-final { scan-assembler-times "andb\[^\n\r]*1, \\(%rdi\\), %al" 1 } } 
> */
> -/* { dg-final { 

Re: [PATCH] gcc/configure: Re-introduce INSTALL_INFO

2024-02-05 Thread rep . dot . nop
On 5 February 2024 12:30:23 CET, Christophe Lyon wrote:
>On Fri, 2 Feb 2024 at 11:40, Christophe Lyon wrote:
>>
>> On Fri, 2 Feb 2024 at 11:10,  wrote:
>> >
>> > On 1 February 2024 18:15:34 CET, Christophe Lyon wrote:
>> > >BUILD_INFO is currently a byproduct of checking makeinfo
>> > >presence/version.  INSTALL_INFO used to be defined similarly, but was
>> > >removed in 2000 (!) by commit 17db658241d18cf6db59d31bc2d6eac96e9257df
>> > >(svn r38141).
>> > >
>> > >In order to save build time, our CI overrides BUILD_INFO="", which
>> > >works when invoking 'make all' but not for 'make install' in case some
>> > >info files need an update.
>> >
>> > Instead of resurrecting INSTALL_INFO maybe you could do something along the
>> > lines of
>> >
>> > https://gcc.gnu.org/bugzilla/attachment.cgi?id=15038=edit
>>
>> Ha indeed something along these lines would work too.
>> Thanks for the archaeology :-)
>>
>> >
>> > not sure which approach would be considered cleaner..
>> Not sure either.
>>
>> What do maintainers prefer?
>>
>
>Actually that leads to a small patch:
>https://gcc.gnu.org/pipermail/gcc-patches/2024-February/644957.html

That's even better.
thanks

>
>> >
>> > HTH



Pushed: [PATCH] MIPS: Fix wrong MSA FP vector negation

2024-02-05 Thread Xi Ruoyao
On Mon, 2024-02-05 at 09:56 +0800, YunQiang Su wrote:
> Xi Ruoyao wrote on Mon, Feb 5, 2024 at 02:01:
> > 
> > We expanded (neg x) to (minus const0 x) for MSA FP vectors; this is
> > wrong because -0.0 is not 0 - 0.0.  This causes some Python tests to
> > fail when Python is built with MSA enabled.
> > 
> > Use the bnegi.df instructions to simply reverse the sign bit instead.
> > 
> > gcc/ChangeLog:
> > 
> >  * config/mips/mips-msa.md (elmsgnbit): New define_mode_attr.
> >  (neg<mode>2): Change the mode iterator from MSA to IMSA because
> >  in FP arithmetic we cannot use (0 - x) for -x.
> >  (neg<mode>2): New define_insn to implement FP vector negation,
> >  using a bnegi instruction to negate the sign bit.
> > ---
> > 
> > Bootstrapped and regtested on mips64el-linux-gnuabi64.  Ok for trunk
> > and/or release branches?
> > 
> >   gcc/config/mips/mips-msa.md | 18 +++---
> >   1 file changed, 15 insertions(+), 3 deletions(-)
> > 
> 
> LGTM, while I guess that we also need a test case.

Pushed to trunk and release branches, with a following obvious fix:

diff --git a/gcc/config/mips/mips-msa.md b/gcc/config/mips/mips-msa.md
index 920161ed1d8..779157f2a0c 100644
--- a/gcc/config/mips/mips-msa.md
+++ b/gcc/config/mips/mips-msa.md
@@ -613,7 +613,7 @@ (define_expand "neg<mode>2"
 
 (define_insn "neg<mode>2"
   [(set (match_operand:FMSA 0 "register_operand" "=f")
-   (neg (match_operand:FMSA 1 "register_operand" "f")))]
+   (neg:FMSA (match_operand:FMSA 1 "register_operand" "f")))]
   "ISA_HAS_MSA"
   "bnegi.<msafmt>\t%w0,%w1,<elmsgnbit>"
   [(set_attr "type" "simd_bit")

I'll write a test case for gcc.dg/vect later (now I have to do
$SOME_REAL_LIFE_THING...)

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University
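
Editorial note on the thread above: the reason (0 - x) is not a valid FP negation can be reproduced in scalar C. The sketch below is only an illustration, not the MSA pattern itself; it assumes float is a 32-bit IEEE 754 type and the default rounding mode, and the function names are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Return 1 if X has its sign bit set.  This distinguishes -0.0f from
   +0.0f, which compare equal with ==.  Assumes a 32-bit IEEE 754 float.  */
int
sign_bit_set (float x)
{
  uint32_t bits;
  memcpy (&bits, &x, sizeof bits);
  return (bits >> 31) != 0;
}

/* Negation written as 0 - x, the form the old expansion used.
   For x == +0.0f this yields +0.0f, not -0.0f.  */
float
neg_by_subtraction (float x)
{
  return 0.0f - x;
}

/* Negation by toggling the top bit, which is what flipping the sign
   bit of each element amounts to.  */
float
neg_by_sign_flip (float x)
{
  uint32_t bits;
  memcpy (&bits, &x, sizeof bits);
  bits ^= UINT32_C (1) << 31;
  memcpy (&x, &bits, sizeof bits);
  return x;
}
```

For nonzero inputs both helpers agree; they differ only for zero, where the subtraction loses the sign. That is the kind of difference a language runtime's float tests can observe.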


Re: [PATCH v2] RISC-V: THEAD: Fix improper immediate value for MODIFY_DISP instruction on 32-bit systems.

2024-02-05 Thread Christoph Müllner
On Sat, Feb 3, 2024 at 2:11 PM Andreas Schwab  wrote:
>
> On Jan 30 2024, Christoph Müllner wrote:
>
> > retested
>
> Nope.

Sorry for this.
I tested for no regressions in the test suite with a cross-build and
QEMU and did not do a Werror bootstrap build.
I'll provide a fix for this later today (also breaking the line as it
is longer than needed).


>
> ../../gcc/config/riscv/thead.cc:1144:22: error: invalid suffix on literal; 
> C++11 requires a space between literal and string macro 
> [-Werror=literal-suffix]
>  1144 |   fprintf (file, "(%s),"HOST_WIDE_INT_PRINT_DEC",%u", 
> reg_names[REGNO (addr.reg)],
>   |  ^
> cc1plus: all warnings being treated as errors
> make[3]: *** [../../gcc/config/riscv/t-riscv:127: thead.o] Error 1
>
> --
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> "And now for something completely different."
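
Editorial note on the build failure quoted above: in C++11 a string literal immediately followed by an identifier is parsed as a user-defined literal suffix, so a format macro has to be separated from the neighbouring literals by whitespace. The sketch below shows the spaced spelling in plain C, where it is equally valid. EXAMPLE_WIDE_INT_PRINT_DEC and format_modify_disp are hypothetical stand-ins for this illustration, not GCC's HOST_WIDE_INT_PRINT_DEC or the code in thead.cc.

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical stand-in for a format macro such as
   HOST_WIDE_INT_PRINT_DEC; this is not GCC's definition.  */
#define EXAMPLE_WIDE_INT_PRINT_DEC "%" PRId64

/* Format "(reg),disp,shift" into BUF and return the length snprintf
   reports.  The spaces around the macro keep the adjacent string
   literals valid in both C and C++11.  */
int
format_modify_disp (char *buf, size_t size, const char *reg,
                    int64_t disp, unsigned shift)
{
  return snprintf (buf, size, "(%s)," EXAMPLE_WIDE_INT_PRINT_DEC ",%u",
                   reg, disp, shift);
}
```

Writing the same call as "(%s),"EXAMPLE_WIDE_INT_PRINT_DEC",%u" compiles as C but is what -Werror=literal-suffix rejects when the file is built as C++.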


Re: Ping: Re: [PATCH] libgcc: fix SEH C++ rethrow semantics [PR113337]

2024-02-05 Thread Matteo Italia

On 31/01/24 04:24, LIU Hao wrote:

On 2024-01-31 08:08, Jonathan Yong wrote:

On 1/24/24 15:17, Matteo Italia wrote:
Ping! That's a one-line fix, and you can find all the details in the 
bugzilla entry. Also, I can provide executables built with the 
affected toolchains, demonstrating the problem and the fix.


Thanks,
Matteo



I was away last week. LH, care to comment? Changes look fine to me.



The change looks good to me, too.

I haven't tested it though. According to a similar construction around 
'libgcc/unwind.inc:265' it should be that way.


Hello,

thank you for the replies, is there anything else I can do to help push 
this forward?




[PATCH] x86-64: Update gcc.target/i386/apx-ndd.c

2024-02-05 Thread H.J. Lu
Fix the following issues:

1. Replace long with int64_t to support x32.
2. Replace \\(%rdi\\) with \\(%(?:r|e)di\\) for memory operand since x32
uses (%edi).
3. Replace %(?:|r|e)al with %al in negb scan.
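
Editorial note on item 1: x32 is an ILP32 ABI on x86-64, so long is 32 bits wide there, while int64_t is 64 bits everywhere it exists. A test that expects 64-bit instructions therefore has to use the fixed-width type. The helpers below are a hypothetical illustration and are not part of apx-ndd.c.

```c
#include <limits.h>
#include <stdint.h>

/* Width in bits of long on the current target: 64 on LP64 x86-64,
   32 on x32 and on ia32.  */
int
width_of_long (void)
{
  return (int) (sizeof (long) * CHAR_BIT);
}

/* Width in bits of int64_t: 64 by definition.  */
int
width_of_int64 (void)
{
  return (int) (sizeof (int64_t) * CHAR_BIT);
}

/* A 64-bit add in the spirit of the test's FOO (int64_t, add, +); it
   stays a 64-bit operation regardless of the width of long.  */
int64_t
add64 (int64_t a, int64_t b)
{
  return a + b;
}
```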

* gcc.target/i386/apx-ndd.c: Updated.
---
 gcc/testsuite/gcc.target/i386/apx-ndd.c | 68 -
 1 file changed, 34 insertions(+), 34 deletions(-)

diff --git a/gcc/testsuite/gcc.target/i386/apx-ndd.c 
b/gcc/testsuite/gcc.target/i386/apx-ndd.c
index b215f66d3e2..0eb751ad225 100644
--- a/gcc/testsuite/gcc.target/i386/apx-ndd.c
+++ b/gcc/testsuite/gcc.target/i386/apx-ndd.c
@@ -75,9 +75,9 @@ FOO2 (short, add, +)
 FOO (int, add, +)
 FOO1 (int, add, +)
 FOO2 (int, add, +)
-FOO (long, add, +)
-FOO1 (long, add, +)
-FOO2 (long, add, +)
+FOO (int64_t, add, +)
+FOO1 (int64_t, add, +)
+FOO2 (int64_t, add, +)
 
 FOO (char, sub, -)
 FOO1 (char, sub, -)
@@ -85,8 +85,8 @@ FOO (short, sub, -)
 FOO1 (short, sub, -)
 FOO (int, sub, -)
 FOO1 (int, sub, -)
-FOO (long, sub, -)
-FOO1 (long, sub, -)
+FOO (int64_t, sub, -)
+FOO1 (int64_t, sub, -)
 
 F (char, neg, -)
 F1 (char, neg, -)
@@ -94,8 +94,8 @@ F (short, neg, -)
 F1 (short, neg, -)
 F (int, neg, -)
 F1 (int, neg, -)
-F (long, neg, -)
-F1 (long, neg, -)
+F (int64_t, neg, -)
+F1 (int64_t, neg, -)
 
 F (char, not, ~)
 F1 (char, not, ~)
@@ -103,8 +103,8 @@ F (short, not, ~)
 F1 (short, not, ~)
 F (int, not, ~)
 F1 (int, not, ~)
-F (long, not, ~)
-F1 (long, not, ~)
+F (int64_t, not, ~)
+F1 (int64_t, not, ~)
 
 FOO (char, and, &)
 FOO1 (char, and, &)
@@ -112,8 +112,8 @@ FOO (short, and, &)
 FOO1 (short, and, &)
 FOO (int, and, &)
 FOO1 (int, and, &)
-FOO (long, and, &)
-FOO1 (long, and, &)
+FOO (int64_t, and, &)
+FOO1 (int64_t, and, &)
 
 FOO (char, or, |)
 FOO1 (char, or, |)
@@ -121,8 +121,8 @@ FOO (short, or, |)
 FOO1 (short, or, |)
 FOO (int, or, |)
 FOO1 (int, or, |)
-FOO (long, or, |)
-FOO1 (long, or, |)
+FOO (int64_t, or, |)
+FOO1 (int64_t, or, |)
 
 FOO (char, xor, ^)
 FOO1 (char, xor, ^)
@@ -130,8 +130,8 @@ FOO (short, xor, ^)
 FOO1 (short, xor, ^)
 FOO (int, xor, ^)
 FOO1 (int, xor, ^)
-FOO (long, xor, ^)
-FOO1 (long, xor, ^)
+FOO (int64_t, xor, ^)
+FOO1 (int64_t, xor, ^)
 
 FOO (char, shl, <<)
 FOO3 (char, shl, <<, 7)
@@ -139,8 +139,8 @@ FOO (short, shl, <<)
 FOO3 (short, shl, <<, 7)
 FOO (int, shl, <<)
 FOO3 (int, shl, <<, 7)
-FOO (long, shl, <<)
-FOO3 (long, shl, <<, 7)
+FOO (int64_t, shl, <<)
+FOO3 (int64_t, shl, <<, 7)
 
 FOO (char, sar, >>)
 FOO3 (char, sar, >>, 7)
@@ -148,8 +148,8 @@ FOO (short, sar, >>)
 FOO3 (short, sar, >>, 7)
 FOO (int, sar, >>)
 FOO3 (int, sar, >>, 7)
-FOO (long, sar, >>)
-FOO3 (long, sar, >>, 7)
+FOO (int64_t, sar, >>)
+FOO3 (int64_t, sar, >>, 7)
 
 FOO (uint8_t, shr, >>)
 FOO3 (uint8_t, shr, >>, 7)
@@ -170,33 +170,33 @@ FOO4 (uint16_t, rol, <<, >>, 1)
 FOO4 (uint32_t, rol, <<, >>, 1)
 FOO4 (uint64_t, rol, <<, >>, 1)
 
-/* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), 
%(?:|r|e)a(?:x|l)" 4 } } */
+/* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]*1, 
\\(%(?:r|e)di\\), %(?:|r|e)a(?:x|l)" 4 } } */
 /* { dg-final { scan-assembler-times 
"lea(?:l|q)\[^\n\r]\\(%r(?:d|s)i,%r(?:d|s)i\\), %(?:|r|e)ax" 4 } } */
-/* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), 
\\(%rdi\\), %(?:|r|e)a(?:x|l)" 4 } } */
-/* { dg-final { scan-assembler-times "sub(?:b|l|w|q)\[^\n\r]*1, \\(%rdi\\), 
%(?:|r|e)a(?:x|l)" 4 } } */
+/* { dg-final { scan-assembler-times "add(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), 
\\(%(?:r|e)di\\), %(?:|r|e)a(?:x|l)" 4 } } */
+/* { dg-final { scan-assembler-times "sub(?:b|l|w|q)\[^\n\r]*1, 
\\(%(?:r|e)di\\), %(?:|r|e)a(?:x|l)" 4 } } */
 /* { dg-final { scan-assembler-times "sub(?:b|l|w|q)\[^\n\r]%(?:|r|e)si(?:|l), 
%(?:|r|e)di, %(?:|r|e)a(?:x|l)" 4 } } */
-/* { dg-final { scan-assembler-times "negb\[^\n\r]\\(%rdi\\), %(?:|r|e)al" 1 } 
} */
-/* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]\\(%rdi\\), 
%(?:|r|e)ax" 3 } } */
+/* { dg-final { scan-assembler-times "negb\[^\n\r]\\(%(?:r|e)di\\), %al" 1 } } 
*/
+/* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]\\(%(?:r|e)di\\), 
%(?:|r|e)ax" 3 } } */
 /* { dg-final { scan-assembler-times "neg(?:l|w|q)\[^\n\r]%(?:|r|e)di, 
%(?:|r|e)ax" 4 } } */
-/* { dg-final { scan-assembler-times "not(?:b|l|w|q)\[^\n\r]\\(%rdi\\), 
%(?:|r|e)a(?:x|l)" 4 } } */
+/* { dg-final { scan-assembler-times "not(?:b|l|w|q)\[^\n\r]\\(%(?:r|e)di\\), 
%(?:|r|e)a(?:x|l)" 4 } } */
 /* { dg-final { scan-assembler-times "not(?:l|w|q)\[^\n\r]%(?:|r|e)di, 
%(?:|r|e)ax" 4 } } */
-/* { dg-final { scan-assembler-times "andb\[^\n\r]*1, \\(%rdi\\), %al" 1 } } */
-/* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]*1, \\(%rdi\\), 
%(?:|r|e)ax" 3 } } */
+/* { dg-final { scan-assembler-times "andb\[^\n\r]*1, \\(%(?:r|e)di\\), %al" 1 
} } */
+/* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]*1, 
\\(%(?:r|e)di\\), %(?:|r|e)ax" 3 } } */
 /* { dg-final { scan-assembler-times "and(?:l|w|q)\[^\n\r]%(?:|r|e)di, 

Re: [PATCH v2] rs6000: Rework option -mpowerpc64 handling [PR106680]

2024-02-05 Thread Kewen.Lin
Hi Sebastian,

on 2024/2/5 18:38, Sebastian Huber wrote:
> Hello,
> 
> On 27.12.22 11:16, Kewen.Lin via Gcc-patches wrote:
>> Hi Segher,
>>
>> on 2022/12/24 04:26, Segher Boessenkool wrote:
>>> Hi!
>>>
>>> On Wed, Oct 12, 2022 at 04:12:21PM +0800, Kewen.Lin wrote:
>>>> PR106680 shows that -m32 -mpowerpc64 is different from
>>>> -mpowerpc64 -m32, this is determined by the way how we
>>>> handle option powerpc64 in rs6000_handle_option.
>>>>
>>>> Segher pointed out this difference should be taken as
>>>> a bug and we should ensure that option powerpc64 is
>>>> independent of -m32/-m64.  So this patch removes the
>>>> handlings in rs6000_handle_option and add some necessary
>>>> supports in rs6000_option_override_internal instead.
>>>
>>> Sorry for the late review.
>>>
>>>> +  /* Don't expect powerpc64 enabled on those OSes with OS_MISSING_POWERPC64,
>>>> + since they don't support saving the high part of 64-bit registers on
>>>> + context switch.  If the user explicitly specifies it, we won't interfere
>>>> + with the user's specification.  */
>>>
>>> It depends on the OS, and what you call "context switch".  For example
>>> on Linux the context switches done by the kernel are fine, only things
>>> done by setjmp/longjmp and getcontext/setcontext are not.  So just be a
>>> bit more vague here?  "Since they do not save and restore the high half
>>> of the GPRs correctly in all cases", something like that?
>>>
>>> Okay for trunk like that.  Thanks!
>>>
>>
>> Thanks!  Adjusted as you suggested and committed in r13-4894-gacc727cf02a144.
> 
> I am a bit late, however, this broke the 32-bit support for -mcpu=e6500. For 
> RTEMS, I have the following multilibs:
> 
> MULTILIB_REQUIRED += mcpu=e6500/m32
> MULTILIB_REQUIRED += mcpu=e6500/m32/mvrsave
> MULTILIB_REQUIRED += mcpu=e6500/m32/msoft-float/mno-altivec
> MULTILIB_REQUIRED += mcpu=e6500/m64
> MULTILIB_REQUIRED += mcpu=e6500/m64/mvrsave
> 
> I configured GCC as a bi-arch compiler (32-bit and 64-bit). It seems you 
> removed the -m32 handling, so I am not sure how to approach this issue. I 
> added a test case to the PR:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106680

Thanks for reporting, I'll have a look at it (but I'm about to go on
vacation, so my responses may be slow).

I'm not sure what has happened in bugzilla recently, but I didn't receive
any mail notifications for your comments #c5 and #c6 (sorry for the late
response).  Since PR106680 is in state resolved, maybe it's good to file a
new one for further tracking. :)

BR,
Kewen
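
Editorial note on the PR106680 background quoted above: the difference between handling an option as it is parsed and deriving it after all options are known can be shown with a toy model. Everything below is hypothetical (the struct, the field names, and both parsers); it is a sketch of the general idea under the assumption that the per-option handler rewrote the powerpc64 setting on -m32/-m64, and it is not the rs6000 code.

```c
#include <stdbool.h>
#include <string.h>

/* Toy option state; the field names are hypothetical.  */
struct toy_opts
{
  bool m64;                /* Last of -m32/-m64 wins.  */
  bool powerpc64;          /* Effective powerpc64 setting.  */
  bool powerpc64_explicit; /* User wrote -mpowerpc64.  */
};

/* Order-dependent model: -m32/-m64 rewrite powerpc64 as they are seen,
   so a later -m32 silently undoes an earlier -mpowerpc64.  */
struct toy_opts
parse_eager (const char *const *argv, int argc)
{
  struct toy_opts o = { false, false, false };
  for (int i = 0; i < argc; i++)
    {
      if (strcmp (argv[i], "-m64") == 0)
        o.m64 = o.powerpc64 = true;
      else if (strcmp (argv[i], "-m32") == 0)
        o.m64 = o.powerpc64 = false;
      else if (strcmp (argv[i], "-mpowerpc64") == 0)
        o.powerpc64 = true;
    }
  return o;
}

/* Order-independent model: only record what was requested, then derive
   the default once, after all options have been read.  */
struct toy_opts
parse_deferred (const char *const *argv, int argc)
{
  struct toy_opts o = { false, false, false };
  for (int i = 0; i < argc; i++)
    {
      if (strcmp (argv[i], "-m64") == 0)
        o.m64 = true;
      else if (strcmp (argv[i], "-m32") == 0)
        o.m64 = false;
      else if (strcmp (argv[i], "-mpowerpc64") == 0)
        o.powerpc64 = o.powerpc64_explicit = true;
    }
  if (!o.powerpc64_explicit)
    o.powerpc64 = o.m64;
  return o;
}
```

In this model parse_eager gives different answers for "-m32 -mpowerpc64" and "-mpowerpc64 -m32", while parse_deferred gives the same answer for both orders. A multilib such as mcpu=e6500/m32 depends on what the deferred step chooses as the default, which is where a regression of the kind reported here could plausibly arise.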



Re: [PATCH] gcc/configure: Re-introduce INSTALL_INFO

2024-02-05 Thread Christophe Lyon
On Fri, 2 Feb 2024 at 11:40, Christophe Lyon  wrote:
>
> On Fri, 2 Feb 2024 at 11:10,  wrote:
> >
> > On 1 February 2024 18:15:34 CET, Christophe Lyon 
> >  wrote:
> > >BUILD_INFO is currently a byproduct of checking makeinfo
> > >presence/version.  INSTALL_INFO used to be defined similarly, but was
> > >removed in 2000 (!) by commit 17db658241d18cf6db59d31bc2d6eac96e9257df
> > >(svn r38141).
> > >
> > >In order to save build time, our CI overrides BUILD_INFO="", which
> > >works when invoking 'make all' but not for 'make install' in case some
> > >info files need an update.
> >
> > Instead of resurrecting INSTALL_INFO maybe you could do something along the 
> > lines of
> >
> > https://gcc.gnu.org/bugzilla/attachment.cgi?id=15038&action=edit
>
> Ha indeed something along these lines would work too.
> Thanks for the archaeology :-)
>
> >
> > not sure which approach would be considered cleaner..
> Not sure either.
>
> What do maintainers prefer?
>

Actually that leads to a small patch:
https://gcc.gnu.org/pipermail/gcc-patches/2024-February/644957.html

> >
> > HTH


[PATCH] gcc/Makefile.in: Fix install-info target if BUILD_INFO is empty

2024-02-05 Thread Christophe Lyon
BUILD_INFO is currently a byproduct of checking makeinfo
presence/version.  INSTALL_INFO used to be defined similarly, but was
removed in 2000 (!) by commit 17db658241d18cf6db59d31bc2d6eac96e9257df
(svn r38141).

In order to save build time, our CI overrides BUILD_INFO="", which
works when invoking 'make all' but not for 'make install' in case some
info files need an update.

I noticed this when testing a patch posted on the gcc-patches list,
leading to an error at 'make install' time after updating tm.texi (the
build reported 'new text' in tm.texi and stopped).  This is because
'install' depends on 'install-info', which depends on
$(DESTDIR)$(infodir)/gccint.info (among others).

This patch makes the 'install-info' dependency in 'install'
conditioned by BUILD_INFO.

2024-02-05  Christophe Lyon  

gcc/
* Makefile.in: Use install-info only if BUILD_INFO is not empty.
---
 gcc/Makefile.in | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 4d38b162307..6cb564cfd35 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -817,7 +817,6 @@ INSTALL_HEADERS=install-headers install-mkheaders
 
 # Control whether Info documentation is built and installed.
 BUILD_INFO = @BUILD_INFO@
-INSTALL_INFO = @INSTALL_INFO@
 
 # Control flags for @contents placement in HTML output
 MAKEINFO_TOC_INLINE_FLAG = @MAKEINFO_TOC_INLINE_FLAG@
@@ -3786,9 +3785,13 @@ maintainer-clean:
 # Install the driver last so that the window when things are
 # broken is small.
 install: install-common $(INSTALL_HEADERS) \
-install-cpp install-man $(INSTALL_INFO) install-@POSUB@ \
+install-cpp install-man install-@POSUB@ \
 install-driver install-lto-wrapper install-gcc-ar
 
+ifneq ($(BUILD_INFO),)
+install: install-info
+endif
+
 ifeq ($(enable_plugin),yes)
 install: install-plugin
 endif
-- 
2.34.1



Re: [PATCH v5] x86-64: Find a scratch register for large model profiling

2024-02-05 Thread Uros Bizjak
On Fri, Feb 2, 2024 at 11:47 PM H.J. Lu  wrote:
>
> Changes in v5:
>
> 1. Add pr113689-3.c.
> 2. Use %r10 if ix86_profile_before_prologue () returns true.
> 3. Try a callee-saved register which has been saved on stack in the
> prologue.
>
> Changes in v4:
>
> 1. Remove pr113689-3.c.
> 2. Use df_get_live_out.
>
> Changes in v3:
>
> 1. Remove r10_ok.
>
> Changes in v2:
>
> 1. Add int_parameter_registers to machine_function to track integer
> registers used for parameter passing.
> 2. Update x86_64_select_profile_regnum to try %r10 first and use a
> caller-saved register, which isn't used for parameter passing.
>
> ---
> 2 scratch registers, %r10 and %r11, are available at function entry for
> large model profiling.  But %r10 may be used by stack realignment and we
> can't use %r10 in this case.  Add x86_64_select_profile_regnum to find
> a caller-saved register which isn't live or a callee-saved register
> which has been saved on stack in the prologue at entry for large model
> profiling and sorry if we can't find one.
>
> gcc/
>
> PR target/113689
> * config/i386/i386.cc (set_saved_int_registers_bit): New.
> (test_saved_int_registers_bit): Likewise.
> (ix86_emit_save_regs): Call set_saved_int_registers_bit on
> saved register.
> (ix86_emit_save_regs_using_mov): Likewise.
> (x86_64_select_profile_regnum): New.
> (x86_function_profiler): Call x86_64_select_profile_regnum to
> get a scratch register for large model profiling.
> * config/i386/i386.h (machine_function): Add
> saved_int_registers.
>
> gcc/testsuite/
>
> PR target/113689
> * gcc.target/i386/pr113689-1.c: New file.
> * gcc.target/i386/pr113689-2.c: Likewise.
> * gcc.target/i386/pr113689-3.c: Likewise.
> ---
>  gcc/config/i386/i386.cc| 119 ++---
>  gcc/config/i386/i386.h |   5 +
>  gcc/testsuite/gcc.target/i386/pr113689-1.c |  49 +
>  gcc/testsuite/gcc.target/i386/pr113689-2.c |  41 +++
>  gcc/testsuite/gcc.target/i386/pr113689-3.c |  48 +
>  5 files changed, 247 insertions(+), 15 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr113689-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr113689-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr113689-3.c
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index b3e7c74846e..1c7aaa4535e 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -7387,6 +7387,32 @@ choose_baseaddr (HOST_WIDE_INT cfa_offset, unsigned 
> int *align,
>return plus_constant (Pmode, base_reg, base_offset);
>  }
>
> +/* Set the integer register REGNO bit in saved_int_registers.  */
> +
> +static void
> +set_saved_int_registers_bit (int regno)
> +{
> +  if (LEGACY_INT_REGNO_P (regno))
> +cfun->machine->saved_int_registers |= 1 << regno;
> +  else
> +cfun->machine->saved_int_registers
> +  |= 1 << (regno - FIRST_REX_INT_REG + 8);
> +}
> +
> +/* Return true if the integer register REGNO bit in saved_int_registers
> +   is set.  */
> +
> +static bool
> +test_saved_int_registers_bit (int regno)
> +{
> +  if (LEGACY_INT_REGNO_P (regno))
> +return (cfun->machine->saved_int_registers
> +   & (1 << regno)) != 0;
> +  else
> +return (cfun->machine->saved_int_registers
> +   & (1 << (regno - FIRST_REX_INT_REG + 8))) != 0;
> +}
> +
>  /* Emit code to save registers in the prologue.  */
>
>  static void
> @@ -7403,6 +7429,7 @@ ix86_emit_save_regs (void)
> insn = emit_insn (gen_push (gen_rtx_REG (word_mode, regno),
> TARGET_APX_PPX));
> RTX_FRAME_RELATED_P (insn) = 1;
> +   set_saved_int_registers_bit (regno);
>   }
>  }
>else
> @@ -7415,6 +7442,7 @@ ix86_emit_save_regs (void)
>for (regno = FIRST_PSEUDO_REGISTER - 1; regno >= 0; regno--)
> if (GENERAL_REGNO_P (regno) && ix86_save_reg (regno, true, true))
>   {
> +   set_saved_int_registers_bit (regno);
> if (aligned)
>   {
> regno_list[loaded_regnum++] = regno;
> @@ -7567,6 +7595,7 @@ ix86_emit_save_regs_using_mov (HOST_WIDE_INT cfa_offset)
>{
>  ix86_emit_save_reg_using_mov (word_mode, regno, cfa_offset);
> cfa_offset -= UNITS_PER_WORD;
> +   set_saved_int_registers_bit (regno);
>}
>  }

Do we really need the above handling? I think that we can use
ix86_save_reg directly in x86_64_select_profile_regnum below.

> @@ -22749,6 +22778,48 @@ current_fentry_section (const char **name)
>return true;
>  }
>
> +/* Return a caller-saved register which isn't live or a callee-saved
> +   register which has been saved on stack in the prologue at entry for
> +   profile.  */
> +
> +static int
> +x86_64_select_profile_regnum (bool r11_ok ATTRIBUTE_UNUSED)
> +{
> +  /* Use %r10 if the profiler is emitted before 

Re: [PATCH v2] rs6000: Rework option -mpowerpc64 handling [PR106680]

2024-02-05 Thread Sebastian Huber

Hello,

On 27.12.22 11:16, Kewen.Lin via Gcc-patches wrote:

Hi Segher,

on 2022/12/24 04:26, Segher Boessenkool wrote:

Hi!

On Wed, Oct 12, 2022 at 04:12:21PM +0800, Kewen.Lin wrote:

PR106680 shows that -m32 -mpowerpc64 is different from
-mpowerpc64 -m32, this is determined by the way how we
handle option powerpc64 in rs6000_handle_option.

Segher pointed out this difference should be taken as
a bug and we should ensure that option powerpc64 is
independent of -m32/-m64.  So this patch removes the
handlings in rs6000_handle_option and add some necessary
supports in rs6000_option_override_internal instead.


Sorry for the late review.


+  /* Don't expect powerpc64 enabled on those OSes with OS_MISSING_POWERPC64,
+ since they don't support saving the high part of 64-bit registers on
+ context switch.  If the user explicitly specifies it, we won't interfere
+ with the user's specification.  */


It depends on the OS, and what you call "context switch".  For example
on Linux the context switches done by the kernel are fine, only things
done by setjmp/longjmp and getcontext/setcontext are not.  So just be a
bit more vague here?  "Since they do not save and restore the high half
of the GPRs correctly in all cases", something like that?

Okay for trunk like that.  Thanks!



Thanks!  Adjusted as you suggested and committed in r13-4894-gacc727cf02a144.


I am a bit late, however, this broke the 32-bit support for -mcpu=e6500. 
For RTEMS, I have the following multilibs:


MULTILIB_REQUIRED += mcpu=e6500/m32
MULTILIB_REQUIRED += mcpu=e6500/m32/mvrsave
MULTILIB_REQUIRED += mcpu=e6500/m32/msoft-float/mno-altivec
MULTILIB_REQUIRED += mcpu=e6500/m64
MULTILIB_REQUIRED += mcpu=e6500/m64/mvrsave

I configured GCC as a bi-arch compiler (32-bit and 64-bit). It seems you 
removed the -m32 handling, so I am not sure how to approach this issue. 
I added a test case to the PR:


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106680

--
embedded brains GmbH & Co. KG
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Registration court: Amtsgericht München
Registration number: HRB 157899
Managing directors authorized to represent: Peter Rasmussen, Thomas Dörfler
Our privacy policy can be found here:
https://embedded-brains.de/datenschutzerklaerung/


[PATCH] Vectorizer and address-spaces

2024-02-05 Thread Richard Biener
The following makes sure to use the correct pointer mode when
building pointer types to a non-default address-space.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

* tree-vect-data-refs.cc (vect_create_data_ref_ptr): Use
the default mode when building a pointer.
---
 gcc/tree-vect-data-refs.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index 2ca5a1b131b..f79ade9509b 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -5323,7 +5323,7 @@ vect_create_data_ref_ptr (vec_info *vinfo, stmt_vec_info 
stmt_info,
}
   while (sinfo);
 }
-  aggr_ptr_type = build_pointer_type_for_mode (aggr_type, ptr_mode,
+  aggr_ptr_type = build_pointer_type_for_mode (aggr_type, VOIDmode,
   need_ref_all);
   aggr_ptr = vect_get_new_vect_var (aggr_ptr_type, vect_pointer_var, 
base_name);
 
-- 
2.35.3


[PATCH] tree-optimization/113707 - ICE with VN elimination

2024-02-05 Thread Richard Biener
The following avoids different avail answers depending on how the
iteration progressed.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/113707
* tree-ssa-sccvn.cc (rpo_elim::eliminate_avail): After
checking the avail set treat out-of-region defines as
available.

* gcc.dg/torture/pr113707-1.c: New testcase.
* gcc.dg/torture/pr113707-2.c: Likewise.
---
 gcc/testsuite/gcc.dg/torture/pr113707-1.c | 45 +++
 gcc/testsuite/gcc.dg/torture/pr113707-2.c | 26 +
 gcc/tree-ssa-sccvn.cc |  5 +++
 3 files changed, 76 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr113707-1.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr113707-2.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr113707-1.c 
b/gcc/testsuite/gcc.dg/torture/pr113707-1.c
new file mode 100644
index 000..c1a50b31025
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr113707-1.c
@@ -0,0 +1,45 @@
+/* { dg-do compile } */
+
+int printf(const char *, ...);
+struct a {
+  int b;
+} n;
+int a, c, d, e, f = 1, g, h, j = 1, k, l, m, o;
+int main() {
+  struct a p;
+  int i;
+  p.b = 1;
+  if (!j)
+goto q;
+  p.b = i = 0;
+  for (; i < 1; i++)
+if (k)
+  while (m)
+  r:
+  q:
+if (p.b)
+  g = 1;
+  while (1) {
+i = 0;
+for (; i < 5; i++)
+  ;
+if (l) {
+  while (h)
+;
+  if (o) {
+d = 0;
+for (; d < 8; d++)
+  ;
+  }
+}
+for (; e; e--)
+  while (a)
+p = n;
+if (c)
+  goto r;
+printf("0");
+if (f)
+  break;
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr113707-2.c 
b/gcc/testsuite/gcc.dg/torture/pr113707-2.c
new file mode 100644
index 000..957e6f1b534
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr113707-2.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+
+int a, b, c, d, e, f, g, h, j, k, l;
+void n() {
+  while (c)
+if (1) {
+  for (h = 5; h; h--) {
+int m = e % 2;
+d = ~g || h ^ m / -1;
+if (h > 5)
+  e = k;
+  }
+  return;
+}
+}
+int main() {
+  if (a)
+for (int i = 0; i < 2; i++) {
+  for (f = 1; f < 6; f++)
+for (c = 7; c >= 0; c--)
+  if (l)
+b = 0;
+  n();
+}
+  return 0;
+}
diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index bbcf86588f9..8792cd07901 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -7776,6 +7776,11 @@ rpo_elim::eliminate_avail (basic_block bb, tree op)
  av = av->next;
}
   while (av);
+  /* While we prefer avail we have to fallback to using the value
+directly if defined outside of the region when none of the
+available defs suit.  */
+  if (!valnum_info->visited)
+   return valnum;
 }
   else if (valnum != VN_TOP)
 /* valnum is is_gimple_min_invariant.  */
-- 
2.35.3


Re: [PATCH 1/3] vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE

2024-02-05 Thread Richard Biener
On Thu, 1 Feb 2024, Andre Vieira (lists) wrote:

> 
> 
> On 01/02/2024 07:19, Richard Biener wrote:
> > On Wed, 31 Jan 2024, Andre Vieira (lists) wrote:
> > 
> > 
> > The patch didn't come with a testcase so it's really hard to tell
> > what goes wrong now and how it is fixed ...
> 
> My bad! I had a testcase locally but never added it...
> 
> However... now I look at it and ran it past Richard S, the codegen isn't
> 'wrong', but it does have the potential to lead to some pretty slow codegen,
> especially for inbranch simdclones where it transforms the SVE predicate into
> an Advanced SIMD vector by inserting the elements one at a time...
> 
> An example of which can be seen if you do:
> 
> gcc -O3 -march=armv8-a+sve -msve-vector-bits=128  -fopenmp-simd t.c -S
> 
> with the following t.c:
> #pragma omp declare simd simdlen(4) inbranch
> int __attribute__ ((const)) fn5(int);
> 
> void fn4 (int *a, int *b, int n)
> {
> for (int i = 0; i < n; ++i)
> b[i] = fn5(a[i]);
> }
> 
> Now I do have to say, for our main usecase of libmvec we won't have any
> 'inbranch' Advanced SIMD clones, so we avoid that issue... But of course that
> doesn't mean user-code will.

It seems to use SVE masks with vector(4)  and the
ABI says the mask is vector(4) int.  You say that's because we choose
an Adv SIMD clone for the SVE VLS vector code (it calls _ZGVnM4v_fn5).

The vectorizer creates

  _44 = VEC_COND_EXPR ;

and then vector lowering decomposes this.  That means the vectorizer
lacks a check that the target handles this VEC_COND_EXPR.

Of course I would expect that SVE with VLS vectors is able to
code generate this operation, so it's missing patterns in the end.
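
Editorial note: what the mask conversion discussed above has to compute can be modelled in scalar C. This is only a sketch under simplifying assumptions: the predicate is represented as one bit per lane in an unsigned value, the clone's mask argument is four 32-bit integers, and expand_mask4 is a made-up name. It is not GCC code and says nothing about how the target should emit it.

```c
#include <stdint.h>

/* Expand a 4-lane predicate, held as one bit per lane in BITS, into an
   integer mask vector: all-ones for an active lane, zero for an
   inactive one.  The lane-by-lane loop mirrors the slow
   element-at-a-time insertion described in the thread.  */
void
expand_mask4 (unsigned bits, int32_t out[4])
{
  for (int lane = 0; lane < 4; lane++)
    out[lane] = ((bits >> lane) & 1u) ? -1 : 0;
}
```

A single vector select between an all-ones and an all-zeros constant computes the same result in one operation, which is why a missing pattern for it shows up as poor code rather than wrong code.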

Richard.

> I'm gonna remove this patch and run another test regression to see if it
> catches anything weird, but if not then I guess we do have the option to not
> use this patch and aim to solve the costing or codegen issue in GCC-15. We
> don't currently do any simdclone costing and I don't have a clear suggestion
> for how given openmp has no mechanism that I know of to expose the speedup of
> a simdclone over its scalar variant, so how would we 'compare' a simdclone
> call with extra overhead of argument preparation vs scalar, though at least we
> could prefer a call to a different simdclone with less argument preparation.
> Anyways I digress.
> 
> Other tests, these require aarch64-autovec-preference=2 so that also has me
> worried less...
> 
> gcc -O3 -march=armv8-a+sve -msve-vector-bits=128 --param
> aarch64-autovec-preference=2 -fopenmp-simd t.c -S
> 
> t.c:
> #pragma omp declare simd simdlen(2) notinbranch
> float __attribute__ ((const)) fn1(double);
> 
> void fn0 (float *a, float *b, int n)
> {
> for (int i = 0; i < n; ++i)
> b[i] = fn1((double) a[i]);
> }
> 
> #pragma omp declare simd simdlen(2) notinbranch
> float __attribute__ ((const)) fn3(float);
> 
> void fn2 (float *a, double *b, int n)
> {
> for (int i = 0; i < n; ++i)
> b[i] = (double) fn3(a[i]);
> }
> 
> > Richard.
> > 
> >>>
> >>> That said, I wonder how we end up mixing things up in the first place.
> >>>
> >>> Richard.
> >>
> > 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Re: [PATCH] lower-bitint: Remove single label _BitInt switches [PR113737]

2024-02-05 Thread Richard Biener
On Mon, 5 Feb 2024, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase ICEs, because group_case_labels_stmt optimizes
>   switch (a.0_7)  [50.00%], case 0:  [50.00%], case 2:  
> [50.00%]>
> where L7 block starts with __builtin_unreachable (); to
>   switch (a.0_7)  [50.00%]>
> and single label GIMPLE_SWITCH is something the switch expansion refuses to
> lower:
>   if (gimple_switch_num_labels (m_switch) == 1
>   || range_check_type (index_type) == NULL_TREE)
> return false;
> (range_check_type never returns NULL for BITINT_TYPE), but the gimple
> lowering pass relies on all large/huge _BitInt switches to be lowered
> by that pass.
> 
> The following patch just removes those after making the single successor
> edge EDGE_FALLTHRU.  I've done it even if !optimize, just in case
> we'd end up with a single case label from earlier passes.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

> 2024-02-05  Jakub Jelinek  
> 
>   PR tree-optimization/113737
>   * gimple-lower-bitint.cc (gimple_lower_bitint): If GIMPLE_SWITCH
>   has just a single label, remove it and make single successor edge
>   EDGE_FALLTHRU.
> 
>   * gcc.dg/bitint-84.c: New test.
> 
> --- gcc/gimple-lower-bitint.cc.jj 2024-02-02 11:30:05.801776658 +0100
> +++ gcc/gimple-lower-bitint.cc2024-02-03 12:49:52.99574 +0100
> @@ -5832,7 +5832,14 @@ gimple_lower_bitint (void)
>  
> if (optimize)
>   group_case_labels_stmt (swtch);
> -   switch_statements.safe_push (swtch);
> +   if (gimple_switch_num_labels (swtch) == 1)
> + {
> +   single_succ_edge (bb)->flags |= EDGE_FALLTHRU;
> +   gimple_stmt_iterator gsi = gsi_for_stmt (swtch);
> +   gsi_remove (&gsi, true);
> + }
> +   else
> + switch_statements.safe_push (swtch);
>   }
>  }
>  
> --- gcc/testsuite/gcc.dg/bitint-84.c.jj   2024-02-03 12:56:08.153622744 
> +0100
> +++ gcc/testsuite/gcc.dg/bitint-84.c  2024-02-03 12:57:05.425835789 +0100
> @@ -0,0 +1,32 @@
> +/* PR tree-optimization/113737 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-O2 -std=c23" } */
> +
> +#if __BITINT_MAXWIDTH__ >= 129
> +_BitInt(129) a;
> +#else
> +_BitInt(63) a;
> +#endif
> +
> +int b[1], c;
> +
> +int
> +foo (void)
> +{
> +  switch (a)
> +  case 0:
> +  case 2:
> +return 1;
> +  return 0;
> +}
> +
> +void
> +bar (int i)
> +{
> +  for (;; ++i)
> +{
> +  c = b[i];
> +  if (!foo ())
> + __asm__ ("");
> +}
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Re: [x86_64 PATCH] PR target/113690: Fix-up MULT REG_EQUAL notes in STV.

2024-02-05 Thread Uros Bizjak
On Mon, Feb 5, 2024 at 9:06 AM Uros Bizjak  wrote:
>
> On Mon, Feb 5, 2024 at 1:24 AM Roger Sayle  wrote:
> >
> >
> > This patch fixes PR target/113690, an ICE-on-valid regression on x86_64
> > that exhibits with a specific combination of command line options.  The
> > cause is that x86's scalar-to-vector pass converts a chain of instructions
> > from TImode to V1TImode, but fails to appropriately update the attached
> > REG_EQUAL note.  Given that multiplication isn't supported in V1TImode,
> > the REG_NOTE handling code wasn't expecting to see a MULT.  Easily solved
> > with additional handling for other binary operators that may potentially
> > (in future) have an immediate constant as the second operand that needs
> > handling.  For convenience, this code (re)factors the logic to convert
> > a TImode constant into a V1TImode constant vector into a subroutine and
> > reuses it.
> >
> > For the record, STV is actually doing something useful in this strange
> > testcase,  GCC with -O2 -fno-dce -fno-forward-propagate
> > -fno-split-wide-types
> > -funroll-loops generates:
> >
> > > foo:    movl    $v, %eax
> > >         pxor    %xmm0, %xmm0
> > >         movaps  %xmm0, 48(%rax)
> > >         movaps  %xmm0, (%rax)
> > >         movaps  %xmm0, 16(%rax)
> > >         movaps  %xmm0, 32(%rax)
> > >         ret
> >
> > With the addition of -mno-stv (to disable the patched code) it gives:
> >
> > > foo:    movl    $v, %eax
> > >         movq    $0, 48(%rax)
> > >         movq    $0, 56(%rax)
> > >         movq    $0, (%rax)
> > >         movq    $0, 8(%rax)
> > >         movq    $0, 16(%rax)
> > >         movq    $0, 24(%rax)
> > >         movq    $0, 32(%rax)
> > >         movq    $0, 40(%rax)
> > >         ret
> >
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check, both with and without --target_board=unix{-m32}
> > with no new failures.  Ok for mainline?
> >
> >
> > 2024-02-05  Roger Sayle  
> >
> > gcc/ChangeLog
> > PR target/113690
> > * config/i386/i386-features.cc (timode_convert_cst): New helper
> > function to convert a TImode CONST_SCALAR_INT_P to a V1TImode
> > CONST_VECTOR.
> > (timode_scalar_chain::convert_op): Use timode_convert_cst.
> > (timode_scalar_chain::convert_insn): If a REG_EQUAL note contains
> > a binary operator where the second operand is an immediate integer
> > constant, convert it to V1TImode using timode_convert_cst.
> > Use timode_convert_cst.
> >
> > gcc/testsuite/ChangeLog
> > PR target/113690
> > * gcc.target/i386/pr113690.c: New test case.
>
> OK.

OTOH, how about we follow the approach from
general_scalar_chain::convert_insn and just kill the note?

Uros.


Re: [PATCH 2/2] rtl-optimization/113255 - avoid re-associating REG_POINTER MINUS

2024-02-05 Thread Richard Biener
On Fri, 2 Feb 2024, Jeff Law wrote:

> 
> 
> On 2/1/24 07:20, Richard Biener wrote:
> > The following avoids re-associating
> > 
> >   (minus:DI (reg/f:DI 119)
> >  (minus:DI (reg/f:DI 120)
> >  (reg/f:DI 114)))
> > 
> > into
> > 
> >   (minus:DI (plus:DI (reg/f:DI 114)
> >  (reg/f:DI 119))
> >  (reg/f:DI 120))
> > 
> > as that possibly confuses the REG_POINTER heuristics of RTL
> > alias analysis.  This happens to miscompile the PRs testcase
> > during DSE which expands addresses via CSELIB which eventually
> > simplifies what it substituted to.  The original code does
> > the innocent ptr - (ptr2 - ptr2'), bias a pointer by the
> > difference of two other pointers.
> > 
> > --
> > 
> > This is what I propose for the PR for branches, I have not made much
> > progress with fixing the fallout on the RTL alias analysis change
> > on trunk, so this is the alternative if we decide to revert that.
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu on the gcc-13
> > branch, bootstrapped after reverting of the previous fix on
> > x86_64-unknown-linux-gnu on trunk, testing is still ongoing there.
> > 
> > OK?  Any preference for trunk?
> > 
> > Thanks,
> > Richard.
> > 
> >  PR rtl-optimization/113255
> >  * simplify-rtx.cc (simplify_context::simplify_binary_operation_1):
> >  Do not re-associate a MINUS with a REG_POINTER op0.
> Nasty little set of problems.  I don't think we ever pondered that we could
> have multiple REGNO_POINTER_FLAG objects in the same expression, but clearly
> that can happen once you introduce a 3rd term in the expression.
> 
> I don't mind avoiding the reassociation, but it feels like we're papering over
> problems in alias.cc.  Conceptually it seems like if we have two objects with
> REG_POINTER set, then we can't know which one is the real base.  So your patch
> in the PR wasn't that bad.

It wasn't bad, it's the only correct fix.  The question is what we do
for branches (or whether we do anything there) and whether we just accept
that that fix causes some optimization regressions.

> Alternately, just stop using REG_POINTER for alias analysis?   It looks
> fundamentally flawed to me in that context.  In fact, one might argue that the
> only legitimate use would be to indicate to the target that we know a pointer
> points into an object.  Some targets (the PA) need this because x + y is not
> the same as y + x when used as a memory address.
> 
> If we wanted to be a bit more surgical, drop REG_POINTER from just the MINUS
> handling in alias.cc?

The problem is that REG_POINTER is just used as a heuristic
(and compile-time optimization) as to which of a binary operator's
operands we (preferably) use as a base.  find_base_{term,value}
happily look at operands that are not REG_POINTER (that are
not REG_P), since for the case in question, even w/o re-assoc
there would be no way to say the inner MINUS is not a pointer
(it's a REG flag).

The heuristics don't help much when passes like DSE use CSELIB
and combine operations like above, we then get to see that
the way find_base_{term,value} perform pointer analysis is
fundamentally flawed.  Any tweaking there has the chance to
make other cases run into wrong base discoveries.

I'll take it that we need to live with the regressions for GCC 14
and the wrong-code bug in GCC 13 and earlier.

Thanks,
Richard.
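
For readers following the thread, the re-association discussed above can be sketched in plain C.  This is only an illustration under stated assumptions: the function and parameter names are hypothetical, pointers are modelled as uintptr_t values, and nothing here is GCC code.  It shows that ptr - (ptr2 - ptr2') and (ptr2' + ptr) - ptr2 compute the same value, which is why the simplification is arithmetically valid even though the second form places two pointer-like operands under one PLUS.

```c
#include <stdint.h>

/* Original form: ptr - (ptr2 - ptr2p), i.e. bias a pointer by the
   difference of two other pointers.  All names are hypothetical.  */
static uintptr_t
biased_original (uintptr_t ptr, uintptr_t ptr2, uintptr_t ptr2p)
{
  return ptr - (ptr2 - ptr2p);
}

/* Re-associated form: (ptr2p + ptr) - ptr2.  The value is identical
   (unsigned arithmetic wraps), but the inner PLUS now has two
   pointer-like operands, so a "pick the pointer operand" heuristic
   no longer has an obvious base to choose.  */
static uintptr_t
biased_reassociated (uintptr_t ptr, uintptr_t ptr2, uintptr_t ptr2p)
{
  return (ptr2p + ptr) - ptr2;
}
```

Both functions return the same result for any inputs; the difference is purely in which operands end up grouped together.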


Re: [PATCH] i386: Clear REG_UNUSED and REG_DEAD notes from the IL at the end of vzeroupper pass [PR113059]

2024-02-05 Thread Uros Bizjak
On Wed, Jan 31, 2024 at 9:23 AM Jakub Jelinek  wrote:
>
> Hi!
>
> The move of the vzeroupper pass from after reload pass to after
> postreload_cse helped only partially, CSE-like passes can still invalidate
> those notes (especially REG_UNUSED) if they use some earlier register
> holding some value later on in the IL.
>
> So, either we could try to move it one pass further after gcse2 and hope
> no later pass invalidates the notes, or the following patch attempts to
> restore the REG_DEAD/REG_UNUSED state from GCC 13 and earlier, where
> the LRA or reload passes remove all REG_DEAD/REG_UNUSED notes and the notes
> reappear only at the start of dse2 pass when it calls
>   df_note_add_problem ();
>   df_analyze ();
> So, effectively
>   NEXT_PASS (pass_postreload_cse);
>   NEXT_PASS (pass_gcse2);
>   NEXT_PASS (pass_split_after_reload);
>   NEXT_PASS (pass_ree);
>   NEXT_PASS (pass_compare_elim_after_reload);
>   NEXT_PASS (pass_thread_prologue_and_epilogue);
> passes operate without those notes in the IL.
> While in GCC 14 mode switching computes the notes problem at the start of
> vzeroupper, the patch below removes them at the end of the pass again, so
> that the above passes continue to operate without them.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2024-01-31  Jakub Jelinek  
>
> PR target/113059
> * config/i386/i386-features.cc (rest_of_handle_insert_vzeroupper):
> Remove REG_DEAD/REG_UNUSED notes at the end of the pass before
> df_analyze call.

Not really a review, but let's rubber stamp this workaround OK.

Thanks,
Uros.

>
> --- gcc/config/i386/i386-features.cc.jj 2024-01-08 12:15:13.611477047 +0100
> +++ gcc/config/i386/i386-features.cc    2024-01-30 12:36:27.834515803 +0100
> @@ -2664,6 +2664,32 @@ rest_of_handle_insert_vzeroupper (void)
>    /* Call optimize_mode_switching.  */
>    g->get_passes ()->execute_pass_mode_switching ();
>
> +  /* LRA removes all REG_DEAD/REG_UNUSED notes and normally they
> +     reappear in the IL only at the start of pass_rtl_dse2, which does
> +     df_note_add_problem (); df_analyze ();
> +     The vzeroupper is scheduled after postreload_cse pass and mode
> +     switching computes the notes as well, the problem is that e.g.
> +     pass_gcse2 doesn't maintain the notes, see PR113059 and
> +     PR112760.  Remove the notes now to restore status quo ante
> +     until we figure out how to maintain the notes or what else
> +     to do.  */
> +  basic_block bb;
> +  rtx_insn *insn;
> +  FOR_EACH_BB_FN (bb, cfun)
> +    FOR_BB_INSNS (bb, insn)
> +      if (NONDEBUG_INSN_P (insn))
> +        {
> +          rtx *pnote = &REG_NOTES (insn);
> +          while (*pnote != 0)
> +            {
> +              if (REG_NOTE_KIND (*pnote) == REG_DEAD
> +                  || REG_NOTE_KIND (*pnote) == REG_UNUSED)
> +                *pnote = XEXP (*pnote, 1);
> +              else
> +                pnote = &XEXP (*pnote, 1);
> +            }
> +        }
> +
>    df_analyze ();
>    return 0;
>  }
>
> Jakub
>
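
The note-removal loop in the patch above uses the pointer-to-pointer idiom for unlinking nodes from a singly linked list.  A minimal standalone sketch of that idiom follows, assuming a hypothetical `struct note` and `enum note_kind` as stand-ins for GCC's REG_NOTES chain; these types and the function name are illustrative only and are not GCC's API.

```c
#include <stddef.h>

/* Hypothetical stand-ins for REG_NOTE kinds and the note chain.  */
enum note_kind { NOTE_DEAD, NOTE_UNUSED, NOTE_EQUAL };

struct note
{
  enum note_kind kind;
  struct note *next;
};

/* Unlink every NOTE_DEAD/NOTE_UNUSED node from the list at *head and
   return how many nodes were unlinked.  PNOTE always points at the
   link slot that refers to the current node, so no special case is
   needed for removing the first element.  */
static int
drop_dead_unused (struct note **head)
{
  int removed = 0;
  struct note **pnote = head;
  while (*pnote != NULL)
    {
      if ((*pnote)->kind == NOTE_DEAD || (*pnote)->kind == NOTE_UNUSED)
        {
          /* Unlink: the slot now refers to the successor; do not
             advance, the successor still needs to be checked.  */
          *pnote = (*pnote)->next;
          removed++;
        }
      else
        /* Keep this node and move on to its next-link slot.  */
        pnote = &(*pnote)->next;
    }
  return removed;
}
```

Given a chain DEAD -> EQUAL -> UNUSED, calling `drop_dead_unused (&head)` leaves only the EQUAL node reachable from `head`.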


Re: [x86_64 PATCH] PR target/113690: Fix-up MULT REG_EQUAL notes in STV.

2024-02-05 Thread Uros Bizjak
On Mon, Feb 5, 2024 at 1:24 AM Roger Sayle  wrote:
>
>
> This patch fixes PR target/113690, an ICE-on-valid regression on x86_64
> that exhibits with a specific combination of command line options.  The
> cause is that x86's scalar-to-vector pass converts a chain of instructions
> from TImode to V1TImode, but fails to appropriately update the attached
> REG_EQUAL note.  Given that multiplication isn't supported in V1TImode,
> the REG_NOTE handling code wasn't expecting to see a MULT.  Easily solved
> with additional handling for other binary operators that may potentially
> (in future) have an immediate constant as the second operand that needs
> handling.  For convenience, this code (re)factors the logic to convert
> a TImode constant into a V1TImode constant vector into a subroutine and
> reuses it.
>
> For the record, STV is actually doing something useful in this strange
> testcase.  GCC with -O2 -fno-dce -fno-forward-propagate
> -fno-split-wide-types -funroll-loops generates:
>
> foo:    movl    $v, %eax
>         pxor    %xmm0, %xmm0
>         movaps  %xmm0, 48(%rax)
>         movaps  %xmm0, (%rax)
>         movaps  %xmm0, 16(%rax)
>         movaps  %xmm0, 32(%rax)
>         ret
>
> With the addition of -mno-stv (to disable the patched code) it gives:
>
> foo:    movl    $v, %eax
>         movq    $0, 48(%rax)
>         movq    $0, 56(%rax)
>         movq    $0, (%rax)
>         movq    $0, 8(%rax)
>         movq    $0, 16(%rax)
>         movq    $0, 24(%rax)
>         movq    $0, 32(%rax)
>         movq    $0, 40(%rax)
>         ret
>
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures.  Ok for mainline?
>
>
> 2024-02-05  Roger Sayle  
>
> gcc/ChangeLog
> PR target/113690
> * config/i386/i386-features.cc (timode_convert_cst): New helper
> function to convert a TImode CONST_SCALAR_INT_P to a V1TImode
> CONST_VECTOR.
> (timode_scalar_chain::convert_op): Use timode_convert_cst.
> (timode_scalar_chain::convert_insn): If a REG_EQUAL note contains
> a binary operator where the second operand is an immediate integer
> constant, convert it to V1TImode using timode_convert_cst.
> Use timode_convert_cst.
>
> gcc/testsuite/ChangeLog
> PR target/113690
> * gcc.target/i386/pr113690.c: New test case.

OK.

Thanks,
Uros.

>
>
> Thanks in advance,
> Roger
> --
>