[Bug testsuite/115262] [15 regression] gcc.target/powerpc/pr66144-3.c fails after r15-831-g05daf617ea22e1

2024-06-11 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115262

--- Comment #3 from Kewen Lin  ---
(In reply to Peter Bergner from comment #2)
> (In reply to Jeffrey A. Law from comment #1)
> > It looks like the test wants to see xxsel, but after that change we get
> > xxlor and  what looks like a slight difference in register allocation.  I
> > can't really judge if the new code is better, worse, or equivalent.
> 
> xxsel XT,XA,XB,XC computes XT = (XA & ~XC) | (XB & XC).  Using De Morgan's
> law given XB == XC, that seems to simplify to XT = XA | XB which is what
> you're producing and an xxlor (a simple logical or) is not going to be
> slower than a xxsel and is probably faster.  I agree with Bill that this
> looks like an example of needing to update the expected results of the test
> case.  I'll let Segher and/or Ke Wen comment though.

I agree they are equivalent here.  According to the scheduling descriptions,
xxsel and xxlor are handled by the same unit.
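
The equivalence is easy to check with a small scalar model.  This is only a sketch: xxsel_model is a hypothetical helper that applies the formula quoted above to 64-bit values instead of VSX registers.  With XB == XC the expression becomes (XA & ~XB) | XB, which reduces to XA | XB (strictly speaking by absorption rather than De Morgan).

```c
#include <stdint.h>

/* Scalar model of xxsel XT,XA,XB,XC as described above:
   XT = (XA & ~XC) | (XB & XC).  Hypothetical helper, not GCC code.  */
static uint64_t
xxsel_model (uint64_t xa, uint64_t xb, uint64_t xc)
{
  return (xa & ~xc) | (xb & xc);
}

/* Return 1 if the select with XB == XC matches a plain OR for every
   pair of 8-bit patterns replicated across the 64-bit value.  */
static int
xxsel_equals_or_when_b_is_c (void)
{
  for (uint64_t a = 0; a < 256; a++)
    for (uint64_t b = 0; b < 256; b++)
      {
        uint64_t xa = a * 0x0101010101010101ULL;
        uint64_t xb = b * 0x0101010101010101ULL;
        if (xxsel_model (xa, xb, xb) != (xa | xb))
          return 0;
      }
  return 1;
}
```

Since the operation is bitwise, checking every combination of byte patterns covers every combination of bit values.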

[Bug tree-optimization/115427] fallback for interclass mathfn bifs like isinf, isfinite, isnormal

2024-06-11 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115427

--- Comment #5 from Kewen Lin  ---
(In reply to rguent...@suse.de from comment #4)
> On Tue, 11 Jun 2024, linkw at gcc dot gnu.org wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115427
> > 
> > --- Comment #3 from Kewen Lin  ---
> > (In reply to Richard Biener from comment #2)
> > > The canonical way would be to handle these in the ISEL pass and remove
> > > the (fallback) expansion.  But then we can see whether the expander FAILs
> > > (ideally expanders would never be allowed to FAIL, and for FAILing
> > > expanders we'd have a way to query the target like we have the
> > > vec_perm_const hook).
> > > 
> > > But I'll note that currently the expanders may FAIL but then we expand to
> > > a call rather than the inline-expansion (and for example AVR relies on
> > > this now to avoid early folding of isnan).
> > > 
> > > So - for the cases of isfinite and friends without a fallback call I
> > > would suggest to expand from ISEL to see if it FAILs and throw away
> > > the result (similar as how IVOPTs probes things).  Or make those _not_
> > > allowed to FAIL?  Why would they fail to expand anyway?
> > 
> > Thanks for the suggestion! IIUC, considering the AVR example, we still want
> > *isinf* to fall back to the library call (so not to the inline expansion
> > in that case).  Currently, at least for the rs6000 port, there is no case
> > where we want it to FAIL, but I'm not sure whether some other targets will
> > have such a need in the future.  Following the review comment[1], since we
> > don't document that it's not allowed to FAIL, we probably need to ensure
> > there is some handling for FAIL, in case a future FAIL causes an unexpected
> > failure.  Do you prefer not allowing it to FAIL, and then re-opening this
> > and going with ISEL if some port wants it to FAIL?
> 
> I think it would be cleaner to not allow it to FAIL since there's no library
> fallback.  

Fair enough!

> FAILing patterns are a hassle when it comes to GIMPLE
> optimizations.

Yeah, in some cases a port isn't able to express a requirement as part of the
HAVE_* condition (such as further checks on special operand values etc.), so
FAIL has to be used.

> 
> As said, there should be a good reason why patterns FAIL - what's
> the idea behind this feature anyway?

No solid input for this.  As the proposed documentation implicitly indicates
that FAIL can be used (like in some other existing expanders), I didn't
consider carefully whether there is a good reason for it, but just assumed it
can happen. :(
It's a really good question whether there will be a need for it.

[Bug tree-optimization/115427] fallback for interclass mathfn bifs like isinf, isfinite, isnormal

2024-06-11 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115427

--- Comment #3 from Kewen Lin  ---
(In reply to Richard Biener from comment #2)
> The canonical way would be to handle these in the ISEL pass and remove
> the (fallback) expansion.  But then we can see whether the expander FAILs
> (ideally expanders would never be allowed to FAIL, and for FAILing expanders
> we'd have a way to query the target like we have the vec_perm_const hook).
> 
> But I'll note that currently the expanders may FAIL but then we expand to
> a call rather than the inline-expansion (and for example AVR relies on this
> now to avoid early folding of isnan).
> 
> So - for the cases of isfinite and friends without a fallback call I
> would suggest to expand from ISEL to see if it FAILs and throw away
> the result (similar as how IVOPTs probes things).  Or make those _not_
> allowed to FAIL?  Why would they fail to expand anyway?

Thanks for the suggestion! IIUC, considering the AVR example, we still want
*isinf* to fall back to the library call (so not to the inline expansion in
that case).  Currently, at least for the rs6000 port, there is no case where we
want it to FAIL, but I'm not sure whether some other targets will have such a
need in the future.  Following the review comment[1], since we don't document
that it's not allowed to FAIL, we probably need to ensure there is some
handling for FAIL, in case a future FAIL causes an unexpected failure.  Do you
prefer not allowing it to FAIL, and then re-opening this and going with ISEL if
some port wants it to FAIL?

[1]
https://inbox.sourceware.org/gcc-patches/CAFiYyc3wE=xdkrzuvf1kttdrkvaaw-dyw+ztryc1p6+6nmt...@mail.gmail.com/

[Bug tree-optimization/115427] fallback for interclass mathfn bifs like isinf, isfinite, isnormal

2024-06-11 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115427

Kewen Lin  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
   Keywords||internal-improvement
 CC||bergner at gcc dot gnu.org,
   ||guihaoc at gcc dot gnu.org,
   ||rguenth at gcc dot gnu.org,
   ||rsandifo at gcc dot gnu.org,
   ||segher at gcc dot gnu.org

--- Comment #1 from Kewen Lin  ---
Now we have expand_builtin_interclass_mathfn to expand these functions if they
don't have an optab defined; it seems fine to generate RTL there equivalent to
that of fold_builtin_interclass_mathfn.  However, considering maintainability,
IMHO it's better to reuse the tree expression from
fold_builtin_interclass_mathfn, so that we only have one place for such folding.
It would be something like:

@@ -2534,6 +2536,20 @@ expand_builtin_interclass_mathfn (tree exp, rtx target)
          && maybe_emit_unop_insn (icode, ops[0].value, op0, UNKNOWN))
        return ops[0].value;

+      location_t loc = EXPR_LOCATION (exp);
+      tree fold_res
+        = fold_builtin_interclass_mathfn (loc, fndecl, orig_arg, false);
+
+      if (fold_res)
+        {
+          op0 = expand_expr (fold_res, NULL_RTX, VOIDmode, EXPAND_NORMAL);
+          tree rtype = TREE_TYPE (TREE_TYPE (fndecl));
+          machine_mode rmode = TYPE_MODE (rtype);
+          if (rmode != GET_MODE (op0))
+            op0 = convert_to_mode (rmode, op0, 0);
+          return op0;
+        }
+
       delete_insns_since (last);
       CALL_EXPR_ARG (exp, 0) = orig_arg;

But unfortunately, since fold_builtin_interclass_mathfn is used by both the
front end and the middle end, it produces some tree codes like TRUTH_NOT_EXPR,
which isn't supported in expand_expr.  To make it work, we can replace
TRUTH_NOT_EXPR with BIT_NOT_EXPR (as in fold_builtin_unordered_cmp), but there
are some other codes like TRUTH_ANDIF_EXPR and TRUTH_ORIF_EXPR (for
ibmlongdouble) which can't be replaced with BIT_AND_EXPR and BIT_OR_EXPR because
of the short-circuit semantics, so I tried to use COND_EXPR for them instead.
But testing a case with ibmlongdouble shows there are still some gaps from the
original folding code.
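
To illustrate why the short-circuit codes cannot simply become bitwise ones, here is a plain C sketch.  It is not GCC internals; probe and calls are made-up names used only to observe evaluation.  The second operand of a short-circuit AND must not be evaluated when the first is false, the bitwise form evaluates it unconditionally, and a conditional expression keeps the original ordering, which is the idea behind trying COND_EXPR.

```c
static int calls;   /* counts evaluations of the second operand */

static int
probe (int v)
{
  calls++;
  return v;
}

/* Short-circuit form, analogous to TRUTH_ANDIF_EXPR.  */
static int
and_shortcircuit (int a, int b)
{
  return a && probe (b);
}

/* Bitwise form, analogous to BIT_AND_EXPR on boolean operands:
   the second operand is always evaluated.  */
static int
and_bitwise (int a, int b)
{
  return (a != 0) & (probe (b) != 0);
}

/* Conditional form, analogous to COND_EXPR: the second operand is
   evaluated only when the first one is true.  */
static int
and_cond (int a, int b)
{
  return a ? (probe (b) != 0) : 0;
}
```

All three return the same value; they differ only in whether the second operand is evaluated when the first is false.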

I also tried a hackish way, which is to force the tree expression into gimple
stmts and try to expand these stmts one by one, but it adds more SSA names than
before and ICEs in the SSA-to-RTX handling; I'm not sure if it's a direction
worth digging into.

I'm looking for suggestions here: is there some existing practice to follow?
Which is preferred, expanding from the folded tree expression or generating
equivalent RTX directly?  If the former, should we allow some difference from
the original folding (FAIL can be rare), or experiment with some other ways?

[Bug tree-optimization/115427] New: fallback for interclass mathfn bifs like isinf, isfinite, isnormal

2024-06-11 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115427

Bug ID: 115427
   Summary: fallback for interclass mathfn bifs like isinf,
isfinite, isnormal
   Product: gcc
   Version: 15.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: linkw at gcc dot gnu.org
  Target Milestone: ---

This is filed as a follow-up to the discussion in [1].

The optabs for isfinite and isnormal will land soon.  The documentation
allows the optab expansion to fail (as it doesn't say that it's not allowed to),
but with an artificial FAIL in the define_expand for these optabs, there are
two cases:
  1) for isinf, it results in a call to isinf, although in fact
fold_builtin_interclass_mathfn is able to fold it if there is no
target-specific implementation.
  2) for isfinite and isnormal, since there is no library call registered, it
results in a call to __builtin_{isfinite, isnormal}, which is completely
wrong.

So following Richi's suggestion, this PR is to follow up on the fallback
approach.

[1]
https://inbox.sourceware.org/gcc-patches/17c9ab5d-f1d4-9447-fccf-d9aa0ad56...@linux.ibm.com/
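
For readers unfamiliar with the generic folding, the following C sketch shows the kind of comparisons it amounts to for double, as far as I understand fold_builtin_interclass_mathfn.  The helper names are hypothetical and this only illustrates the idea; it is not the code GCC emits.

```c
#include <float.h>
#include <math.h>

/* isinf (x): |x| is greater than the largest finite value.  */
static int
my_isinf (double x)
{
  return isgreater (__builtin_fabs (x), DBL_MAX) != 0;
}

/* isfinite (x): |x| is at most the largest finite value.
   The quiet comparison is false for NaN.  */
static int
my_isfinite (double x)
{
  return islessequal (__builtin_fabs (x), DBL_MAX) != 0;
}

/* isnormal (x): |x| lies between the smallest normal value and the
   largest finite value.  */
static int
my_isnormal (double x)
{
  double ax = __builtin_fabs (x);
  return isgreaterequal (ax, DBL_MIN) && islessequal (ax, DBL_MAX);
}
```

The quiet comparison macros are used so that a NaN argument yields false without raising an exception.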

[Bug target/115355] [12/13/14/15 Regression] vectorization exposes wrong code on P9 LE starting from r12-4496

2024-06-07 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115355

--- Comment #11 from Kewen Lin  ---
(In reply to Jens Seifert from comment #10)
> Does this affect loop vectorize and slp vectorize ?
> 
> -fno-tree-loop-vectorize avoids loop vectorization to be performed and
> workarounds this issue. Does the same problems also affect SLP
> vectorization, which does not take place in this sample.
> 
> In other words, do I need
> -fno-tree-loop-vectorize
> or
> -fno-tree-vectorize
> to workaround this bug ?

Since it's an issue in the vector merge insn patterns in target code and
vectorization just exposes it, it's hard to work around this bug completely just
by disabling both loop and SLP vectorization.  As the related bug PR106069
shows, even without vectorization it's still possible to hit this issue when
using some vec merge built-ins.  But I'd expect that disabling both loop and
SLP vectorization (-fno-tree-vectorize) greatly reduces the possibility of
encountering it.

[Bug target/115355] [12/13/14/15 Regression] vectorization exposes wrong code on P9 LE starting from r12-4496

2024-06-06 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115355

--- Comment #9 from Kewen Lin  ---
(In reply to Peter Bergner from comment #7)
> The test fails when setToIdentityBAD's index var is unsigned int.  It passes
> when using unsigned long long, unsigned long, unsigned short and unsigned
> char.  When using unsigned long long/unsigned long, we do not vectorize the

unsigned {long ,}long fails to vectorize due to cost modeling:

  missed:  cost model: the vector iteration cost = 2 divided by the scalar
iteration cost = 1 is greater or equal to the vectorization factor = 2.
  missed:  not vectorized: vectorization not profitable.

Vectorization can be forced with -fno-vect-cost-model.
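
The check quoted in the dump boils down to a ratio test.  A rough C sketch, with a hypothetical function name and assuming positive integer costs (the real cost model does considerably more than this):

```c
/* Return 1 if vectorization passes the check quoted in the dump, which
   rejects the loop when vector iteration cost / scalar iteration cost is
   greater than or equal to the vectorization factor.  Assumes
   scalar_iter_cost > 0.  Hypothetical helper, not GCC code.  */
static int
vect_profitable_p (int vec_iter_cost, int scalar_iter_cost, int vf)
{
  return vec_iter_cost / scalar_iter_cost < vf;
}
```

With the numbers from the dump (vector cost 2, scalar cost 1, VF 2) the ratio equals the vectorization factor, so the loop is rejected.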

> loop.  We vectorize the loop when using unsigned int/short/char.  The
> vectorized code is a little strange, in that the smaller the integer type we
> use for the index var, the more code we generate.  
> 
> The vectorized code for unsigned char is truly huge!  ...although it does
> seem to work correctly.  I'm attaching the "unsigned char i" code gen for
> setToIdentityBAD for people to examine.  Even though it gives "correct"
> results, it can't really be the code we want to generate, correct???

It's due to aggressive unrolling: there is an early check that the loop bound
is between 16 and 255, and then cunroll completely unrolls the loop for each
multiple of 16 (15 loops in total).  A compact version of the code can be
generated with -fdisable-tree-cunroll.

[Bug target/115355] [12/13/14/15 Regression] vectorization exposes wrong code on P9 LE starting from r12-4496

2024-06-05 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115355

--- Comment #8 from Kewen Lin  ---
(In reply to Peter Bergner from comment #5)
> FYI, fails for me with gcc 12 and later and works with gcc 11.  It also
> fails with -O3 -mcpu=power10.

Thanks for the information.  Bisection shows r12-4496 is the culprit commit; I
just tested and confirmed that Xionghu's latest patch for PR106069 also fixes
this one.

  - latest rev. for his fix:
https://inbox.sourceware.org/gcc-patches/20230210025952.1887696-1-xionghu...@tencent.com/,
which was resent from
https://inbox.sourceware.org/gcc-patches/37b57a54-f98e-96a3-edff-866c8aae4...@gmail.com/

  - original thread and some discussions:
https://inbox.sourceware.org/gcc-patches/20220808034247.2618809-1-xionghu...@tencent.com/

The latest rev. looked OK to me (see
https://inbox.sourceware.org/gcc-patches/e8e69f0c-7f36-e671-6c3b-74401e4d8...@linux.ibm.com/);
I'm still looking forward to Segher's review and approval on this.

[Bug target/115355] PPCLE: Auto-vectorization creates wrong code for Power9

2024-06-05 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115355

Kewen Lin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Last reconfirmed||2024-06-05
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org

--- Comment #4 from Kewen Lin  ---
Thanks for reporting, I'll have a look first.

[Bug target/115282] [15 regression] gcc.dg/vect/costmodel/ppc/costmodel-slp-12.c fails after r15-812-gc71886f2ca2e46

2024-05-31 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115282

Kewen Lin  changed:

   What|Removed |Added

   Last reconfirmed||2024-05-31
 Status|UNCONFIRMED |NEW
 CC||linkw at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #2 from Kewen Lin  ---
(In reply to Richard Biener from comment #1)
> I don't see a good reason why, but I don't have a BE cross around to check
> myself.  Does BE vect maybe not have unsigned integer vector multiplication
> support?

BE should have int vector mult too; I noticed it's guarded with TARGET_ALTIVEC.

The first loop (line 17) causes the difference.  Previously it did the
splitting like:

test.c:16:17: note:   Splitting SLP group at stmt 6
test.c:16:17: note:   Split group into 6 and 2

but now it doesn't, and it then seems to fail because of that:

test.c:16:17: note:   ==> examining statement: _14 = in[_13];
test.c:16:17: missed:   permutation requires at least three vectors _2 = in[_1];
test.c:16:17: missed:   unsupported load permutation
test.c:25:14: missed:   not vectorized: relevant stmt not supported: _14 = in[_13];
test.c:16:17: note:   Cannot vectorize all-constant op node 0x140dd450
test.c:16:17: note:   removing SLP instance operations starting from: out[_1] = _17;
test.c:16:17: missed:  unsupported SLP instances
test.c:16:17: note:  re-trying with SLP disabled
test.c:16:17: note:  vectorization_factor = 4, niters = 8

I couldn't figure out why it would pass on LE, so I did a test on LE and found
it fails on LE too!?

[Bug target/114846] powerpc: epilogue in _Unwind_RaiseException corrupts return value due to __builtin_eh_return

2024-05-29 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114846

--- Comment #10 from Kewen Lin  ---
(In reply to Peter Bergner from comment #9)
> (In reply to Kewen Lin from comment #8)
> > Should be fixed on trunk.  It's not a regression, but we probably want to
> > backport this?
> 
> For code correctness bugs, yes, we want them backported.

Thanks for confirming!  Will do backporting after burn-in time.

[Bug target/114846] powerpc: epilogue in _Unwind_RaiseException corrupts return value due to __builtin_eh_return

2024-05-28 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114846

--- Comment #8 from Kewen Lin  ---
Should be fixed on trunk.  It's not a regression, but we probably want to
backport this?

[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point

2024-05-28 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

Kewen Lin  changed:

   What|Removed |Added

URL||https://gcc.gnu.org/pipermail/gcc-patches/2024-May/651025.html

--- Comment #18 from Kewen Lin  ---
A formal patch has been sent out, as the URL field shows; it's still waiting
for review.

[Bug target/114402] rs6000: ICE when long double is ieee128 format by default but without vsx

2024-05-20 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114402

Kewen Lin  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #3 from Kewen Lin  ---
Not a regression; it should be rare to adopt IEEE long double while disabling
VSX, so not backported.  Should be fixed on trunk.

[Bug target/114846] powerpc: epilogue in _Unwind_RaiseException corrupts return value due to __builtin_eh_return

2024-05-14 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114846

Kewen Lin  changed:

   What|Removed |Added

  Attachment #58067 is obsolete|0   |1

--- Comment #6 from Kewen Lin  ---
Created attachment 58201
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58201&action=edit
tested patch

[Bug target/114846] powerpc: epilogue in _Unwind_RaiseException corrupts return value due to __builtin_eh_return

2024-04-29 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114846

--- Comment #5 from Kewen Lin  ---
Created attachment 58067
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58067&action=edit
untested patch

[Bug testsuite/113535] rs6000, testsuite: Re-visit the current vect_* for Power

2024-04-29 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113535

--- Comment #1 from Kewen Lin  ---
One issue: https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650171.html

[Bug target/114846] powerpc: epilogue in _Unwind_RaiseException corrupts return value due to __builtin_eh_return

2024-04-28 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114846

Kewen Lin  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #4 from Kewen Lin  ---
(In reply to Andrew Pinski from comment #3)
> (In reply to Kewen Lin from comment #2)
> > As in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114843#c8, we may need
> > handling similar to r14-6440-g4b421728289e6f.
> 
> Note rs6000_emit_epilogue mostly handles eh_returns so it might not be as
> hard as other targets.

Yes, making a patch.

[Bug target/44793] [11/12/13/14/15 Regression] libgcc does not include t-ppccomm on rtems

2024-04-28 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44793

Kewen Lin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WORKSFORME
 CC||linkw at gcc dot gnu.org

--- Comment #26 from Kewen Lin  ---
libgcc/config.host on gcc-11 has:

powerpc-*-rtems*)
  tmake_file="${tmake_file} rs6000/t-ppccomm rs6000/t-savresfgpr rs6000/t-crtstuff t-crtstuff-pic t-fdpbit"
  extra_parts="$extra_parts crtbeginS.o crtendS.o crtbeginT.o ecrti.o ecrtn.o ncrti.o ncrtn.o"
  ;;

I think this has already been fixed by r0-119741-g6f28886030623a.

Please feel free to reopen it if it still occurs on active releases. Thanks!

[Bug target/114846] powerpc: epilogue in _Unwind_RaiseException corrupts return value due to __builtin_eh_return

2024-04-25 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114846

--- Comment #2 from Kewen Lin  ---
As in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114843#c8, we may need
handling similar to r14-6440-g4b421728289e6f.

[Bug target/114846] powerpc: epilogue in _Unwind_RaiseException corrupts return value due to __builtin_eh_return

2024-04-25 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114846

Kewen Lin  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2024-04-25
 Status|UNCONFIRMED |NEW
 CC||bergner at gcc dot gnu.org,
   ||linkw at gcc dot gnu.org,
   ||segher at gcc dot gnu.org
 Target|powerpc64-linux-gnu |powerpc64*-linux-gnu
   |powerpc-linux-gnu   |powerpc-linux-gnu

--- Comment #1 from Kewen Lin  ---
Thanks for reporting, confirmed, it also fails on LE (ppc64le-linux).

[Bug testsuite/114842] rs6000: Adjust some test cases with powerpc_vsx_ok

2024-04-25 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114842

--- Comment #1 from Kewen Lin  ---
We can extend powerpc_vsx to consider current_compiler_flags.  That means that
if a test case has an explicit -mvsx, then even if users specify -mno-vsx it
can still be tested as long as the powerpc_vsx check concludes VSX is enabled,
which keeps some of the previous testing coverage.

[Bug testsuite/114842] rs6000: Adjust some test cases with powerpc_vsx_ok

2024-04-25 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114842

Kewen Lin  changed:

   What|Removed |Added

 Target||powerpc*-linux-gnu
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
   Last reconfirmed||2024-04-25
   Target Milestone|--- |15.0
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED

[Bug testsuite/114842] New: rs6000: Adjust some test cases with powerpc_vsx_ok

2024-04-25 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114842

Bug ID: 114842
   Summary: rs6000: Adjust some test cases with powerpc_vsx_ok
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: testsuite
  Assignee: unassigned at gcc dot gnu.org
  Reporter: linkw at gcc dot gnu.org
  Target Milestone: ---

The current effective target powerpc_vsx_ok mainly checks whether it's fine to
specify -mvsx (without any warnings etc.) and whether that finally results in an
object file (meaning the underlying environment, such as the assembler,
supports VSX insns).  But most of the test cases guarded by this check actually
want to check whether the VSX feature is enabled, for example because the
wanted behavior only happens with VSX enabled.  When users specify -mno-vsx in
RUNTESTFLAGS, it can disable the VSX feature (with some old runtest, -mno-vsx
comes after -mvsx), but the powerpc_vsx_ok check still passes as it's fine to
specify -mvsx.  So if the test case doesn't have an explicit -mvsx, the given
-mno-vsx disables the VSX feature and makes that test case fail; meanwhile,
even if the test case specifies -mvsx explicitly, it fails with old runtest
as -mno-vsx comes last.  We already have another effective target, powerpc_vsx,
which effectively checks that VSX is enabled, so we should update most of the
test cases to adopt it instead.
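
The difference between "the compiler accepts -mvsx" and "VSX is actually enabled for this compilation" can be observed from C through the predefined __VSX__ macro.  The sketch below is only an illustration of that distinction; vsx_enabled_p is a hypothetical name, and I'm not claiming this is how the powerpc_vsx check is written.

```c
/* Return 1 when this translation unit is compiled with VSX enabled,
   i.e. after all -mvsx/-mno-vsx options have been taken into account,
   and 0 otherwise (including on non-Power targets).  */
static int
vsx_enabled_p (void)
{
#ifdef __VSX__
  return 1;
#else
  return 0;
#endif
}
```

With "-mvsx -mno-vsx" on the command line this returns 0, even though a check that only asks whether -mvsx is accepted would still pass.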

[Bug target/88309] [11/12/13/14 Regression] ICE: Floating point exception (in is_miss_rate_acceptable), target assigning alignent of 4 bits(!) to vector

2024-04-24 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88309

Kewen Lin  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #11 from Kewen Lin  ---
Should be fixed on trunk and active release branches.

[Bug target/105359] _Float128 expanders and builtins disabled on ppc targets with 64-bit long double

2024-04-23 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105359

Kewen Lin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2024-04-23
   Keywords||missed-optimization
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
 CC||linkw at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #2 from Kewen Lin  ---
Thanks for reporting, I'll have a look.

[Bug testsuite/114744] test case gcc.target/powerpc/builtins-6-p9-runnable.c fails

2024-04-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114744

Kewen Lin  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Kewen Lin  ---
Should be fixed on trunk; since it's a test issue, no backporting is needed.

[Bug testsuite/114744] test case gcc.target/powerpc/builtins-6-p9-runnable.c fails

2024-04-16 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114744

Kewen Lin  changed:

   What|Removed |Added

 CC||linkw at gcc dot gnu.org
 Ever confirmed|0   |1
   Last reconfirmed||2024-04-17
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED

--- Comment #2 from Kewen Lin  ---
This is very likely a test issue, due to endianness, which the vector load
should take into account.  I'll have a look.

[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point

2024-04-11 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #17 from Kewen Lin  ---
(In reply to Michael Matz from comment #16)
> (In reply to Kewen Lin from comment #15)
> > I agree, thanks for the comments!  BTW, I'm not fighting for the current
> > implementation; I just want to know in more detail why users are unable to
> > make use of it: is it just due to its inefficiency (like the above
> > sequence) or is it unusable at all?  From your comments, I think it's due
> > to the former (inefficiency)?!
> 
> Okay.  So, yeah, I _think_ that other way (with NOPs between GEP and LEP,
> plus a jump around them) could be made to work with userspace live patching.
> It would just be inefficient.  But do note that that jump around was _not_
> part of the original way of -fpatchable-function-entry, so a change to
> codegen would have to have happened anyway to make that other way usable.
> And it has the (perhaps theoretical, who knows :) ) problem of not using
> the normal 8-byte difference between GEP and LEP.
> 

Thanks again for confirming this understanding!

> I think your current proposal from comment #10 is the better from all
> perspectives.

Yeah, I agree.  When reworking this support previously, an implementation like
the one in comment #10 was considered the better one, but it wasn't adopted in
the end due to the concern that it can break the assumption that NOPs should be
consecutive.  Based on all the inputs here, I think it's time to "fix" it by
just underscoring these special non-consecutive NOPs in the documentation.
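
For context, the NOP area under discussion is requested with -fpatchable-function-entry=N[,M], or per function with the patchable_function_entry attribute: N NOPs in total, M of them before the function entry label.  A minimal sketch follows; the function name and the numbers are arbitrary, and where the NOPs end up relative to the GEP and LEP on ELFv2 is exactly what this PR is about, so nothing here asserts the final layout.

```c
/* Ask for 4 NOPs in total, 2 of them placed before the function entry
   label.  The counts are arbitrary example values.  */
__attribute__ ((patchable_function_entry (4, 2)))
int
patched_add (int a, int b)
{
  return a + b;
}
```

The function behaves normally until something patches the NOP area; inspecting the output of -S shows where the NOPs were placed on a given target.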

[Bug target/114567] rs6000: explicit _Float128 doesn't generate optimal code

2024-04-10 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114567

--- Comment #1 from Kewen Lin  ---
This is power8 LE specific.  For KFmode the mov expander calls
rs6000_emit_le_vsx_move, so the move uses a V1TI subreg; then the rs6000
specific swaps pass generates a MEM with AND -16, which makes combine unable to
optimize it with the *signbit2_dm_mem pattern, due to mode_dependent_address_p
returning false always for AND.  Although it looks to me like we can extend
mode_dependent_address_p to consider the to-mode in that context, it's still
sub-optimal due to the existence of AND -16, which results in an explicit
"and".
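
As a reminder of what the AND -16 in such an address means: it clears the low four bits, i.e. it rounds the address down to a 16-byte boundary.  A small C sketch with a hypothetical helper name:

```c
#include <stdint.h>

/* Round ADDR down to a multiple of 16, which is the effect of the
   "AND -16" applied to the address.  Hypothetical helper, not GCC code.  */
static uintptr_t
align_down_16 (uintptr_t addr)
{
  return addr & (uintptr_t) -16;
}
```

When this masking has to be done by a separate instruction, it shows up as the explicit "and" mentioned above.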

[Bug testsuite/114662] [14 regression] new test case c_lto_pr113359-2 from r14-9841-g1e3312a25a7b34 fails

2024-04-10 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114662

Kewen Lin  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #4 from Kewen Lin  ---
Should be fixed on latest trunk.

[Bug testsuite/114662] [14 regression] new test case c_lto_pr113359-2 from r14-9841-g1e3312a25a7b34 fails

2024-04-10 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114662

Kewen Lin  changed:

   What|Removed |Added

  Component|lto |testsuite
   Target Milestone|--- |14.0
   Keywords||testsuite-fail

[Bug lto/114662] [14 regression] new test case c_lto_pr113359-2 from r14-9841-g1e3312a25a7b34 fails

2024-04-09 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114662

Kewen Lin  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
 CC||linkw at gcc dot gnu.org
   Last reconfirmed||2024-04-10
 Status|UNCONFIRMED |ASSIGNED

--- Comment #2 from Kewen Lin  ---
I think this is a test issue: with -m32, unsigned long is 4 bytes while CL1 and
CL2 are 8-byte constants, so the compiler concludes that some checks always
fail and the abort always happens.  Since the optimization aggressively
optimizes away the call to getb, there is no chance to further check "semantic
equality".  The IR for main at *.015t.cfg looks like:

int main (int argc, char * * argv)
{
  struct SB b;
  struct SA a;
  int D.3983;

   :
  init ();
  geta (, );
  _1 = a.ax;
  if (_1 != 3735928559)
goto ; [INV]
  else
goto ; [INV]

   :
  __builtin_abort ();

   :
  __builtin_abort ();

}
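
The effect can be reproduced in isolation.  In the sketch below the constant is made up for illustration (it is not the value used by the test case), and uint32_t stands in for a 4-byte unsigned long under -m32.

```c
#include <stdint.h>

/* Hypothetical 8-byte constant, standing in for CL1/CL2.  */
#define WIDE_CONST 0xdeadbeefdeadbeefULL

/* AX is zero-extended to 64 bits before the comparison, so it can never
   equal a constant whose upper 32 bits are nonzero: the result is 1 for
   every possible AX.  */
static int
always_mismatch (uint32_t ax)
{
  return ax != WIDE_CONST;
}
```

Because the result does not depend on ax, the compiler is free to fold the check and everything that depends on it.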

[Bug rtl-optimization/114664] -fno-omit-frame-pointer causes an ICE during the build of the greenlet package

2024-04-09 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114664

--- Comment #8 from Kewen Lin  ---
(In reply to Peter Bergner from comment #7)
> (In reply to Andrew Pinski from comment #6)
> > Pre-IRA fix was done to specifically reject this:
> > https://inbox.sourceware.org/gcc-patches/ab3a61990702021658w4dc049cap53de8010a7d86...@mail.gmail.com/
> 
> Then that would seem to indicate that mentioning the frame pointer reg in
> the asm clobber list is an error, but how are users supposed to know whether
> -fno-omit-frame-pointer is in effect or not?  I've looked and there is no
> pre-defined macro a user could check.

I noticed that even without -fno-omit-frame-pointer, the test case still fails
with the same symptom (with an error message rather than an ICE); did I miss
something?

[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point

2024-04-09 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #15 from Kewen Lin  ---
(In reply to Michael Matz from comment #14)
> Hmm?  But this is not how the global-to-local hand-off is implemented (and
> expected by tooling): a fall-through.  The global entry sets up the GOT
> register, there simply is no '[b localentry]'.
> 
> If you mean to imply that also the '[b localentry]' should be patched in at
> live-patch application time (and hence the GOT setup would need to be moved
> to still somewhere else), then you have the problem that (in the
> not-yet-patched case) as long as the L1-nops sit between global and local
> entry they will always be executed when the global entry is called.

Sorry for the confusion, I meant a sequence like:

global entry:
  [TOC base setup] // always here
  [b localentry] // which is added when patching
L1:
  [patched code] // from patching
  localentry: 
  [b L1] // from patching

> That's wasteful.

I agree, nops are not zero cost on Power8/Power9.

> 
> Additionally tooling will be surprised if the address difference between
> global and local entry isn't exactly 8 (i.e. two instructions).  The psABI
> allows for different values, of course.  But I'm willing to bet that there
> are bugs in the wild when different values would be actually used.
> 

It's possible that some tooling doesn't conform to the ABI document well, but I
think the tooling should be fixed if that is the case. :)

> So, the nops-between-gep-and-lep could probably be somehow made to work with
> userspace live patching, but your most recent patch here makes this all moot.
> It generates exactly the sequence we want: a single nop at the LEP, and
> a configurable patching area outside of, but near to, the function (here: in
> front of the GEP).

I agree, thanks for the comments! By the way, I'm not fighting for the current
implementation, I just want to know in more detail why users are unable to make
use of it: is it just due to its inefficiency (like the above sequence) or
because it is unusable altogether? From your comments, I think it's due to the
former (inefficiency).

[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point

2024-04-08 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #13 from Kewen Lin  ---
(In reply to Giuliano Belinassi from comment #12)
> With your patch we have:
> 
> > .LPFE0:
> > ...
> Which seems what is expected.

Hi Giuliano, thanks for taking the time to test it!  Could you explain a bit
why, with the current implementation, this space can't be used to place a
trampoline to the new function? Is it due to inefficient code, like needing
more branches?

global entry:
  [b localentry]
L1:
  [patched code]

localentry:
  [b L1]

Or is there some other reason that makes it unusable altogether?

[Bug testsuite/114614] New test case gcc.misc-tests/gcov-20.c from r14-9789-g08a52331803f66 fails

2024-04-08 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114614

Kewen Lin  changed:

           What    |Removed                        |Added
------------------------------------------------------------------------------
   Target Milestone|---                            |14.0
             Status|ASSIGNED                       |RESOLVED
         Resolution|---                            |FIXED

--- Comment #3 from Kewen Lin  ---
Should be fixed on latest trunk.

[Bug testsuite/114642] new test case gcc.dg/debug/btf/btf-datasec-3.c from r14-6195-gb8cf266f4ca4ff fails for 32 bits

2024-04-08 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114642

Kewen Lin  changed:

           What    |Removed                        |Added
------------------------------------------------------------------------------
                URL|                               |https://gcc.gnu.org/pipermail/gcc-patches/2024-April/648994.html
                 CC|                               |linkw at gcc dot gnu.org
             Status|NEW                            |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org  |david.faust at oracle dot com

--- Comment #2 from Kewen Lin  ---
David posted a fix (see URL).

[Bug testsuite/114614] New test case gcc.misc-tests/gcov-20.c from r14-9789-g08a52331803f66 fails

2024-04-07 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114614

Kewen Lin  changed:

           What    |Removed                        |Added
------------------------------------------------------------------------------
             Status|UNCONFIRMED                    |ASSIGNED
     Ever confirmed|0                              |1
           Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
   Last reconfirmed|                               |2024-04-08
                 CC|                               |linkw at gcc dot gnu.org

--- Comment #1 from Kewen Lin  ---
It requires effective target profile_update_atomic.

[Bug target/114567] rs6000: explicit _Float128 doesn't generate optimal code

2024-04-02 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114567

Kewen Lin  changed:

           What    |Removed                        |Added
------------------------------------------------------------------------------
   Target Milestone|---                            |15.0
           Keywords|                               |missed-optimization
             Target|                               |powerpc64*-linux-gnu
   Last reconfirmed|                               |2024-04-03
             Status|UNCONFIRMED                    |ASSIGNED
     Ever confirmed|0                              |1
           Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org

[Bug target/114567] New: rs6000: explicit _Float128 doesn't generate optimal code

2024-04-02 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114567

            Bug ID: 114567
           Summary: rs6000: explicit _Float128 doesn't generate optimal code
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: linkw at gcc dot gnu.org
  Target Milestone: ---

This is an issue I happened to spot while working on patches for PR112993.

=== test case ===

#define TYPE _Float128

#ifdef LD
#undef TYPE
#define TYPE long double
#endif

int sbm (TYPE *a) { return __builtin_signbit (*a); }

==

/opt/gcc-nightly/trunk/bin/gcc -mcpu=power8 -mvsx -O2 -mabi=ieeelongdouble
-Wno-psabi test.c -DLD -S -o ref.s
/opt/gcc-nightly/trunk/bin/gcc -mcpu=power8 -mvsx -O2 -mabi=ibmlongdouble
-Wno-psabi test.c -S -o float128.s

diff -Nur ref.s float128.s
--- ref.s   2024-03-18 05:41:00.302208975 -0400
+++ float128.s  2024-03-18 05:41:00.392205513 -0400
@@ -9,7 +9,10 @@
 sbm:
 .LFB0:
        .cfi_startproc
-       ld 3,8(3)
+       rldicr 3,3,0,59
+       lxvd2x 0,0,3
+       xxpermdi 0,0,0,2
+       mfvsrd 3,0
        srdi 3,3,63
        blr
        .long 0
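
For reference, the two-instruction sequence in ref.s amounts to loading the doubleword that holds the sign bit and shifting everything else out. Below is a minimal C sketch of that idea, assuming the 16 bytes hold an IEEE binary128 value in little-endian order on a little-endian host; the helper name signbit_from_image is made up for illustration and is not GCC code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: IMAGE points at the 16 bytes of an IEEE binary128
   value stored little-endian, so the doubleword holding the sign bit
   lives at byte offset 8 (compare "ld 3,8(3)" in ref.s).  */
static int
signbit_from_image (const unsigned char *image)
{
  uint64_t high;

  assert (image != NULL);
  memcpy (&high, image + 8, sizeof high);  /* one 8-byte load */
  return (int) (high >> 63);               /* compare "srdi 3,3,63" */
}
```

The point of the sketch is only that no vector load or permute is needed to answer the sign-bit question, which is why the float128.s sequence looks suboptimal.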

[Bug target/88309] [11/12/13/14 Regression] ICE: Floating point exception (in is_miss_rate_acceptable), target assigning alignent of 4 bits(!) to vector

2024-04-02 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88309

Kewen Lin  changed:

           What    |Removed                        |Added
------------------------------------------------------------------------------
             Status|NEW                            |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org

--- Comment #6 from Kewen Lin  ---
(In reply to Andrew Pinski from comment #5)
> (In reply to Kewen Lin from comment #4)
> > Hi Andrew, thanks for digging into this!  William no longer works on the
> > GCC project; will you make a patch for this?
> 
> I don't have time to test it really.

No problem, I'll work on this.

[Bug target/88309] [11/12/13/14 Regression] ICE: Floating point exception (in is_miss_rate_acceptable), target assigning alignent of 4 bits(!) to vector

2024-04-02 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88309

Kewen Lin  changed:

           What    |Removed                        |Added
------------------------------------------------------------------------------
                 CC|                               |linkw at gcc dot gnu.org

--- Comment #4 from Kewen Lin  ---
(In reply to Andrew Pinski from comment #3)
> Found it:
>   /* In GIMPLE the type of the MEM_REF specifies the alignment.  The
>      required alignment (power) is 4 bytes regardless of data type.  */
>   tree align_ltype = build_aligned_type (lhs_type, 4);
> 
> That should be 4*8 instead of just 4.
> 
> There are 2 build_aligned_type calls in rs6000-builtins.cc which use the
> wrong alignment, assuming the alignment argument is in bytes rather than bits.
> 
> Introduced by r9-2375-g3f7a77cd20d07c which means this is a regression.

Hi Andrew, thanks for digging into this!  William no longer works on the GCC
project; will you make a patch for this?
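
To make the units mismatch concrete: the alignment argument is in bits, so a 4-byte alignment has to be passed as 4*8. Below is a small standalone sketch of the conversion. The helper name is hypothetical and this is not GCC code; inside GCC the multiplier would be BITS_PER_UNIT rather than CHAR_BIT.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper, not part of GCC: convert an alignment given in
   bytes into the bit count that a bits-based interface expects.  */
static unsigned int
align_bytes_to_bits (unsigned int align_bytes)
{
  /* Alignments are nonzero powers of two.  */
  assert (align_bytes != 0 && (align_bytes & (align_bytes - 1)) == 0);
  return align_bytes * CHAR_BIT;
}
```

With CHAR_BIT == 8, align_bytes_to_bits (4) yields 32, whereas passing the raw 4 requests the 4-bit alignment mentioned in the bug title.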

[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point

2024-04-02 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #11 from Kewen Lin  ---
(In reply to Giuliano Belinassi from comment #9)
> Yes, this is for userspace livepatching.
> 
> Assume the following example:
> https://godbolt.org/z/b9M8nMbo1
> 
> As one can see, the sequence of 14 nops is generated after the global
> function entry point. This way we can't use this space to place a
> trampoline to the new function. We need this sequence of nops to be placed
> *before* the global function entry point.
> 

Hi Giuliano, thanks for the inputs!
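
For readers without the godbolt link at hand, the kind of source that produces such a NOP area is a function carrying GCC's patchable_function_entry attribute. This is a minimal sketch only: the function name and the (14, 13) arguments are assumptions for illustration and not necessarily what the linked example uses.

```c
#include <assert.h>
#include <limits.h>

/* Ask for 14 NOPs in total, 13 of them placed before the function entry
   label and one after it.  The values are illustrative only.  */
__attribute__ ((patchable_function_entry (14, 13)))
int
add_one (int x)
{
  assert (x < INT_MAX);  /* avoid signed overflow */
  return x + 1;
}
```

Where exactly the "before" NOPs end up relative to the global and local entry points on ELFv2 is the subject of this PR, so the sketch makes no claim about the generated layout.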

[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point

2024-04-02 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #10 from Kewen Lin  ---
Created attachment 57844
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57844&action=edit
patch changing the current implementation

Considering that the current implementation is not useful for either kernel or
userspace use, I'm inclined to change it instead of introducing another option,
and to update the documentation to emphasize that the NOPs may not be
consecutive in this case.

[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point

2024-04-01 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #8 from Kewen Lin  ---
Hi @Michael, @Martin, could you confirm/clarify what makes you interested in
this feature? Is it for some userspace usage or not?

[Bug target/114402] rs6000: ICE when long double is ieee128 format by default but without vsx

2024-03-20 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114402

--- Comment #1 from Kewen Lin  ---
Currently the only pattern to match IEEE128 comparison is:

;; IEEE 128-bit comparisons
(define_insn "*cmp<mode>_hw"
  [(set (match_operand:CCFP 0 "cc_reg_operand" "=y")
        (compare:CCFP (match_operand:IEEE128 1 "altivec_register_operand" "v")
                      (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
   "xscmpuqp %0,%1,%2"
  [(set_attr "type" "veccmp")
   (set_attr "size" "128")])

It requires TARGET_FLOAT128_HW, so without hardware support nothing matches.

The patch below fixes this ICE: it makes IEEE128 without VSX also go through
the libfunc call, as is already done for !TARGET_FLOAT128_HW &&
FLOAT128_VECTOR_P (mode).

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 5d975dab921..237d138faec 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -15329,7 +15329,7 @@ rs6000_generate_compare (rtx cmp, machine_mode mode)
   rtx op0 = XEXP (cmp, 0);
   rtx op1 = XEXP (cmp, 1);

-  if (!TARGET_FLOAT128_HW && FLOAT128_VECTOR_P (mode))
+  if (!TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode))
     comp_mode = CCmode;
   else if (FLOAT_MODE_P (mode))
     comp_mode = CCFPmode;
@@ -15361,7 +15361,7 @@ rs6000_generate_compare (rtx cmp, machine_mode mode)

   /* IEEE 128-bit support in VSX registers when we do not have hardware
      support.  */
-  if (!TARGET_FLOAT128_HW && FLOAT128_VECTOR_P (mode))
+  if (!TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode))
     {
       rtx libfunc = NULL_RTX;
       bool check_nan = false;

[Bug target/114402] rs6000: ICE when long double is ieee128 format by default but without vsx

2024-03-20 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114402

Kewen Lin  changed:

           What    |Removed                        |Added
------------------------------------------------------------------------------
                 CC|                               |bergner at gcc dot gnu.org,
                   |                               |g...@the-meissners.org,
                   |                               |segher at gcc dot gnu.org
   Last reconfirmed|                               |2024-03-21
     Ever confirmed|0                              |1
             Status|UNCONFIRMED                    |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org

[Bug target/114402] rs6000: ICE when long double is ieee128 format by default but without vsx

2024-03-20 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114402

Kewen Lin  changed:

           What    |Removed                        |Added
------------------------------------------------------------------------------
             Target|                               |powerpc64*-linux-gnu
           Keywords|                               |ice-on-valid-code
   Target Milestone|---                            |15.0
      Known to fail|                               |12.3.1, 13.2.1

[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point

2024-03-20 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #6 from Kewen Lin  ---
(In reply to Martin Jambor from comment #5)
> I'd like to ping this, are there plans to implement this in the near-ish
> term?

Some weeks ago, Naveen was doing some experiments to see if there is a better
way to support the function tracer. If the idea works and the results are
promising, he may request something different, so we are still waiting for
that. @Naveen, feel free to correct me if I misunderstood anything.

[Bug target/114402] New: rs6000: ICE when long double is ieee128 format by default but without vsx

2024-03-20 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114402

            Bug ID: 114402
           Summary: rs6000: ICE when long double is ieee128 format by default
                    but without vsx
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: linkw at gcc dot gnu.org
  Target Milestone: ---

When I was working on a patch to make us have only two 128-bit fp on rs6000, I
found that we can have long double with ieee128 format by default even without
vsx support, but a simple test case with a comparison triggers an ICE as below:

long double a;
long double b;

int foo() {
  if (a > b)
    return 0;
  else
    return 1;
}

/opt/gcc-nightly/trunk/bin/gcc test.c -mno-vsx

test.c: In function ‘foo’:
test.c:9:1: error: unrecognizable insn:
9 | }
  | ^
(insn 9 8 10 2 (set (reg:CCFP 123)
(compare:CCFP (reg:TF 117 [ a.0_1 ])
(reg:TF 118 [ b.1_2 ]))) "test.c":5:6 -1
 (nil))
during RTL pass: vregs
test.c:9:1: internal compiler error: in extract_insn, at recog.cc:2812
0x102b7353 _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
/home/gccbuild/gcc_trunk_git/gcc/gcc/rtl-error.cc:108
0x102b73a7 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
/home/gccbuild/gcc_trunk_git/gcc/gcc/rtl-error.cc:116
0x10c6636b extract_insn(rtx_insn*)
/home/gccbuild/gcc_trunk_git/gcc/gcc/recog.cc:2812
0x107ef797 instantiate_virtual_regs_in_insn
/home/gccbuild/gcc_trunk_git/gcc/gcc/function.cc:1611
0x107ef797 instantiate_virtual_regs
/home/gccbuild/gcc_trunk_git/gcc/gcc/function.cc:1994
0x107ef797 execute
/home/gccbuild/gcc_trunk_git/gcc/gcc/function.cc:2041
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

Note that GCC should be configured with --with-long-double-format=ieee, since
specifying -mabi=ieeelongdouble requires vsx to be enabled.

[Bug testsuite/114320] New test case in r14-9439-g4aa87b856067d4 fails

2024-03-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114320

--- Comment #3 from Kewen Lin  ---
(In reply to Nathaniel Shead from comment #2)
> Sorry about that. I've not been able to work out what configure flags I need
> to pass to cause this to error in the first place (I don't normally develop
> for powerpc and the machine I'm using doesn't seem to fail no matter what

I guess the machine you are using (the one you were referring to) doesn't have
a powerpc chip. cfarm provides some powerpc machines
(https://portal.cfarm.net/machines/list/), both ppc64le (LE -m64) and ppc64
(BE -m32/-m64); it's recommended to use them for building/testing. :)

> flags I try), but am I correct in understanding that just adding
> "-Wno-psabi" to the tests should stop them from failing? If so I'm happy to
> push a patch to that effect.

I think so. For now we don't have an effective target dedicated to the __ibm128
type, but it's guarded the same way as the __float128 type (that may be relaxed
in future, though even then using ppc_float128_sw would just be stricter than
needed).  Ideally we could add one effective target powerpc_vsx_ok (it should be
powerpc_vsx) to ensure VSX is enabled, but considering we are going to rework
it in the next release and we don't normally disable vsx explicitly, this can
be optional.

[Bug testsuite/114320] New test case in r14-9439-g4aa87b856067d4 fails

2024-03-12 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114320

Kewen Lin  changed:

           What    |Removed                        |Added
------------------------------------------------------------------------------
             Status|UNCONFIRMED                    |NEW
   Last reconfirmed|                               |2024-03-13
     Ever confirmed|0                              |1
                 CC|                               |linkw at gcc dot gnu.org

--- Comment #1 from Kewen Lin  ---
These new test cases require "-Wno-psabi" to suppress the warning.

[Bug testsuite/101461] [12/13/14 regression] gcc.target/powerpc/fold-vec-load-builtin_vec_xl test cases fail after r12-2266

2024-03-05 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101461

Kewen Lin  changed:

           What    |Removed                        |Added
------------------------------------------------------------------------------
             Status|UNCONFIRMED                    |RESOLVED
         Resolution|---                            |FIXED
                 CC|                               |linkw at gcc dot gnu.org

--- Comment #4 from Kewen Lin  ---
Already fixed by r12-2889-g8464894c86b03e.

[Bug target/113507] can't build a cross compiler to rs6000-ibm-aix7.2

2024-02-26 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113507

Kewen Lin  changed:

           What    |Removed                        |Added
------------------------------------------------------------------------------
             Status|NEW                            |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org  |segher at gcc dot gnu.org

--- Comment #6 from Kewen Lin  ---
Segher will clean up this rs6000-*-* thing in the next release; please use
powerpc*-*-* instead.

[Bug testsuite/106680] Test gcc.target/powerpc/bswap64-4.c fails on 32-bit BE

2024-02-05 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106680

--- Comment #12 from Kewen Lin  ---
(In reply to Sebastian Huber from comment #10)
> (In reply to Kewen Lin from comment #9)
> > Note that now we only disable implicit powerpc64 for -m32 when the
> > OS_MISSING_POWERPC64 is set.
> > 
> >   /* Don't expect powerpc64 enabled on those OSes with OS_MISSING_POWERPC64,
> >      since they do not save and restore the high half of the GPRs correctly
> >      in all cases.  If the user explicitly specifies it, we won't interfere
> >      with the user's specification.  */
> > #ifdef OS_MISSING_POWERPC64
> >   if (OS_MISSING_POWERPC64
> >       && TARGET_32BIT
> >       && TARGET_POWERPC64
> >       && !(rs6000_isa_flags_explicit & OPTION_MASK_POWERPC64))
> >     rs6000_isa_flags &= ~OPTION_MASK_POWERPC64;
> > #endif
> > 
> > But rtems.h doesn't define OS_MISSING_POWERPC64
> 
> RTEMS supports the 64-bit PowerPC for the 64-bit multilibs.
> 

A 64-bit kernel should support 64-bit PowerPC, but does a 32-bit kernel support
saving and restoring 64-bit regs?

The current rtems.h says yes. If the answer is no, we should fix rtems.h, and
then you won't need the explicit -mno-powerpc64.

By the way, take the comment in freebsd64.h as an example:

/* FreeBSD doesn't support saving and restoring 64-bit regs with a 32-bit
   kernel. This is supported when running on a 64-bit kernel with
   COMPAT_FREEBSD32, but tell GCC it isn't so that our 32-bit binaries
   are compatible. */
#define OS_MISSING_POWERPC64 !TARGET_64BIT

[Bug testsuite/106680] Test gcc.target/powerpc/bswap64-4.c fails on 32-bit BE

2024-02-05 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106680

--- Comment #11 from Kewen Lin  ---
(In reply to Sebastian Huber from comment #8)
> Yes, it seems that -mcpu=e6500 -mno-powerpc64 yields the right code for the
> attached test case (with or without the -m32).

The default is -m32 I guess? :)

> 
> I am now a bit confused what the purpose of the -m32 and -m64 options is.

For -m32/-m64, the manual says:

Generate code for 32-bit or 64-bit environments of Darwin and SVR4 targets
(including GNU/Linux). The 32-bit environment sets int, long and pointer to 32
bits and generates code that runs on any PowerPC variant. The 64-bit
environment sets int to 32 bits and long and pointer to 64 bits, and generates
code for PowerPC64, as for -mpowerpc64.

But it can interact with the powerpc64 option. Take cpu e6500, which supports
powerpc64 by default: if the target OS is able to support the necessary context
switches, we want -mpowerpc64 kept so that more efficient code can be generated
(leveraging insns guarded with the powerpc64 flag).
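
The type sizes quoted from the manual can be checked from C itself. This is a rough sketch with made-up helper names; note that it only reports the data model selected by -m32/-m64 and says nothing about whether powerpc64 instructions are enabled, which is exactly the independent part.

```c
#include <assert.h>
#include <stddef.h>

/* Classify a data model from the sizes of int, long and pointers, using
   the sizes quoted from the manual: ILP32 for -m32, LP64 for -m64.  */
static const char *
data_model (size_t int_size, size_t long_size, size_t pointer_size)
{
  assert (int_size != 0 && long_size != 0 && pointer_size != 0);
  if (int_size == 4 && long_size == 4 && pointer_size == 4)
    return "ILP32";
  if (int_size == 4 && long_size == 8 && pointer_size == 8)
    return "LP64";
  return "other";
}

/* Data model of the current compilation.  */
static const char *
current_data_model (void)
{
  return data_model (sizeof (int), sizeof (long), sizeof (void *));
}
```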

[Bug testsuite/106680] Test gcc.target/powerpc/bswap64-4.c fails on 32-bit BE

2024-02-05 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106680

--- Comment #9 from Kewen Lin  ---
Note that now we only disable implicit powerpc64 for -m32 when the
OS_MISSING_POWERPC64 is set.

  /* Don't expect powerpc64 enabled on those OSes with OS_MISSING_POWERPC64,
     since they do not save and restore the high half of the GPRs correctly
     in all cases.  If the user explicitly specifies it, we won't interfere
     with the user's specification.  */
#ifdef OS_MISSING_POWERPC64
  if (OS_MISSING_POWERPC64
      && TARGET_32BIT
      && TARGET_POWERPC64
      && !(rs6000_isa_flags_explicit & OPTION_MASK_POWERPC64))
    rs6000_isa_flags &= ~OPTION_MASK_POWERPC64;
#endif

But rtems.h doesn't define OS_MISSING_POWERPC64

gcc/config/rs6000/linux.h:#define OS_MISSING_POWERPC64 1
gcc/config/rs6000/freebsd64.h:#define OS_MISSING_POWERPC64 !TARGET_64BIT
gcc/config/rs6000/aix.h:#define OS_MISSING_POWERPC64 1
gcc/config/rs6000/linux64.h:#define OS_MISSING_POWERPC64 !TARGET_64BIT

Meanwhile, cpu "e6500" has MASK_POWERPC64 set by default (it's a 64-bit core).

That's why you still have the powerpc64 flag set when you specify -m32 on rtems.
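
The decision above can be modelled in isolation. The following is a toy sketch, not GCC code: the mask value and the function name are stand-ins for illustration, and only the single powerpc64 bit is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in bit for illustration; the real OPTION_MASK_POWERPC64 value is
   defined by GCC and is not reproduced here.  */
#define DEMO_MASK_POWERPC64 UINT64_C (1)

/* Toy model of the quoted check.  OS_MISSING says whether the OS header
   defines OS_MISSING_POWERPC64 to a nonzero value.  */
static uint64_t
adjust_isa_flags (uint64_t isa_flags, uint64_t isa_flags_explicit,
                  bool os_missing, bool target_32bit)
{
  /* Only the demo bit is modelled.  */
  assert ((isa_flags_explicit & ~DEMO_MASK_POWERPC64) == 0);
  if (os_missing
      && target_32bit
      && (isa_flags & DEMO_MASK_POWERPC64)
      && !(isa_flags_explicit & DEMO_MASK_POWERPC64))
    isa_flags &= ~DEMO_MASK_POWERPC64;
  return isa_flags;
}
```

Calling it with os_missing set to false models the rtems case: the implicit powerpc64 bit coming from the cpu default survives -m32. With os_missing set to true and no explicit flag, the bit is cleared, as on 32-bit Linux.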

[Bug testsuite/106680] Test gcc.target/powerpc/bswap64-4.c fails on 32-bit BE

2024-02-05 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106680

--- Comment #7 from Kewen Lin  ---
(In reply to Sebastian Huber from comment #6)
> It seems that the change
> 
> commit acc727cf02a1446dc00f8772f3f479fa3a508f8e
> Author: Kewen Lin 
> Date:   Tue Dec 27 04:13:07 2022 -0600
> 
> rs6000: Rework option -mpowerpc64 handling [PR106680]
> 
> causes a regression for -mcpu=e6500 -m32, for example:
> 
> gcc -fpreprocessed -O2 -S -mcpu=e6500 -m32 -S imfs_add_node.c.67.s
> imfs_add_node.c.67.i
> 
> diff -u imfs_add_node.c.67.s.good.e2acff49fb2962b921bf8b73984b89878b61492c
> imfs_add_node.c.67.s.bad.acc727cf02a1446dc00f8772f3f479fa3a508f8e
> --- imfs_add_node.c.67.s.good.e2acff49fb2962b921bf8b73984b89878b61492c 
> 2024-01-20 12:15:15.143182571 +0100
> +++ imfs_add_node.c.67.s.bad.acc727cf02a1446dc00f8772f3f479fa3a508f8e  
> 2024-01-20 12:11:46.804204927 +0100
> @@ -52,8 +52,8 @@
> bne- 0,.L4
>  .L2:
> mr 4,29
> -   addi 3,1,8
> li 5,24
> +   addi 3,1,8
> bl rtems_filesystem_eval_path_start
> lis 9,IMFS_node_clone@ha
> lwz 10,20(3)
> @@ -63,12 +63,12 @@
> cmpw 0,10,9
> beq- 0,.L24
> li 4,134
> -   addi 3,1,8
> +   li 3,0
> bl rtems_filesystem_eval_path_error
>  .L9:
> li 31,-1
>  .L10:
> -   addi 3,1,8
> +   li 3,0
> bl rtems_filesystem_eval_path_cleanup
>  .L1:
> lwz 0,116(1)
> @@ -93,7 +93,7 @@
> lwz 9,12(31)
> li 8,96
> lhz 10,16(31)
> -   addi 3,1,8
> +   li 3,0
> stw 8,24(1)
> stw 9,8(1)
> stw 10,12(1)
> @@ -105,7 +105,7 @@
> cmpwi 0,9,0
> beq- 0,.L9
> li 4,22
> -   addi 3,1,8
> +   li 3,0
> bl rtems_filesystem_eval_path_error
> b .L9
> .p2align 4,,15
> @@ -129,12 +129,9 @@
> stw 9,0(10)
> stw 10,4(9)
> bl _Timecounter_Getbintime
> -   lwz 10,64(1)
> -   lwz 11,68(1)
> -   stw 10,40(30)
> -   stw 11,44(30)
> -   stw 10,48(30)
> -   stw 11,52(30)
> +   ld 9,64(1)
> +   std 9,40(30)
> +   std 9,48(30)
> b .L10
> .cfi_endproc
>  .LFE351:
> 
> For the call to rtems_filesystem_eval_path_cleanup() the register 3 should
> point to a structure on the stack. Correct is:
> 
> -   addi 3,1,8
> 
> Wrong is:
> 
> +   li 3,0
> 
> It seems that for the -mcpu=e6500 the -m32 option has not the right effect
> and some 64-bit instructions are generated, for example ld and std plus the

As the commit log says, the previous behavior where -m32 also disables
-mpowerpc64 is wrong; -m{no,}powerpc64 should be independent of -m32/-m64.

> wrong function parameters.

I suppose the behavior you want with -m32 is not to enable powerpc64 (since
previously -m32 also disabled -mpowerpc64), so I think you can get the previous
behavior by specifying an explicit -mno-powerpc64 together with -m32.

[Bug target/113652] [14 regression] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450

2024-01-30 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

Kewen Lin  changed:

           What    |Removed                        |Added
------------------------------------------------------------------------------
             Status|UNCONFIRMED                    |NEW
     Ever confirmed|0                              |1
   Last reconfirmed|                               |2024-01-30

--- Comment #13 from Kewen Lin  ---
One more finding: without an explicit cpu type but with -mvsx, gcc already
passes -mpower7 to the assembler, but if a cpu type is specified explicitly, it
won't do that. I think the reason it doesn't always do so is that only the last
cpu type wins, so passing it could unexpectedly override some higher cpu type.

The candidate fixes seem to be:

diff --git a/libgcc/config/rs6000/t-float128 b/libgcc/config/rs6000/t-float128
index b09b5664af0..47b06d3c30d 100644
--- a/libgcc/config/rs6000/t-float128
+++ b/libgcc/config/rs6000/t-float128
@@ -74,7 +74,7 @@ fp128_includes = $(srcdir)/soft-fp/double.h \
   $(srcdir)/soft-fp/soft-fp.h

 # Build the emulator without ISA 3.0 hardware support.
-FP128_CFLAGS_SW  = -Wno-type-limits -mvsx -mfloat128 \
+FP128_CFLAGS_SW  = -Wno-type-limits -mvsx -mfloat128 -mcpu=power7 \
-mno-float128-hardware -mno-gnu-attribute \
-I$(srcdir)/soft-fp \
-I$(srcdir)/config/rs6000 \

Or

diff --git a/libgcc/config/rs6000/t-float128 b/libgcc/config/rs6000/t-float128
index b09b5664af0..bf4a5e6aaf0 100644
--- a/libgcc/config/rs6000/t-float128
+++ b/libgcc/config/rs6000/t-float128
@@ -74,7 +74,7 @@ fp128_includes = $(srcdir)/soft-fp/double.h \
   $(srcdir)/soft-fp/soft-fp.h

 # Build the emulator without ISA 3.0 hardware support.
-FP128_CFLAGS_SW  = -Wno-type-limits -mvsx -mfloat128 \
+FP128_CFLAGS_SW  = -Wno-type-limits -mvsx -mfloat128 -Wa,-many \
-mno-float128-hardware -mno-gnu-attribute \
-I$(srcdir)/soft-fp \
-I$(srcdir)/config/rs6000 \

This is because gcc considers -mvsx to imply -mcpu=power7 (added on top of the
currently specified cpu type if there is one), while the assembler doesn't.

[Bug target/113652] [14 regression] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450

2024-01-29 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

Kewen Lin  changed:

           What    |Removed                        |Added
------------------------------------------------------------------------------
            Summary|Failed bootstrap on ppc        |[14 regression] Failed
                   |unrecognized opcode:           |bootstrap on ppc
                   |`lfiwzx' with -mcpu=7450       |unrecognized opcode:
                   |                               |`lfiwzx' with -mcpu=7450

--- Comment #12 from Kewen Lin  ---
(In reply to Sam James from comment #10)
> (In reality, I think it is a regression, given:
> a) it regresses non-release checking (which we sometimes use even for
> released versions, it's opt-in though);

But I assumed that non-release checking on old releases would also fail, so
comparing non-release vs. non-release, the behavior doesn't change.

> b) it blocks further testing with GCC 14
> 

Sorry about that, feel free to put it back. :)

> but I understand the argument that if a release were made with it, it
> wouldn't be the end of the world by itself and it only affects a specific
> configuration.)

[Bug target/113652] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450

2024-01-29 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

--- Comment #11 from Kewen Lin  ---
In gcc, lfiwzx is guarded with TARGET_LFIWZX => TARGET_POPCNTD (ISA 2.06), while
-mvsx guarantees that TARGET_POPCNTD (ISA_2_6_MASKS_SERVER) is set, so gcc
considers lfiwzx supported. IMHO the underlying philosophy is that with vsx
capability the supported ISA level is at least 2.06, and lfiwzx is available
from 2.06 on, so it's supported.

But binutils doesn't seem to follow it:
{"xvadddp", XX3(60,96), XX3_MASK, PPCVSX, PPCVLE, {XT6, XA6, XB6}},
{"lfiwzx",  X(31,887),  X_MASK,   POWER7|PPCA2, 0, {FRT, RA0, RB}},
The two are guarded with different masks and apparently PPCVSX doesn't enable
POWER7.
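
A toy model of why one mnemonic is accepted and the other rejected: each table entry lists the CPU flags that enable it, and an opcode is accepted when any of those flags is among the enabled ones. This is a simplified sketch; the flag values, struct and function names are made up here and are not the binutils definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Made-up flag bits; binutils defines the real values.  */
#define DEMO_VSX    0x1u
#define DEMO_POWER7 0x2u
#define DEMO_A2     0x4u

struct demo_opcode
{
  const char *name;
  unsigned int flags;   /* any of these CPU flags enables the opcode */
};

static const struct demo_opcode demo_table[] = {
  { "xvadddp", DEMO_VSX },
  { "lfiwzx",  DEMO_POWER7 | DEMO_A2 },
};

/* Return 1 if NAME is accepted when the CPU flags in ENABLED are on.  */
static int
opcode_accepted (const char *name, unsigned int enabled)
{
  assert (name != NULL);
  for (size_t i = 0; i < sizeof demo_table / sizeof demo_table[0]; i++)
    if (strcmp (demo_table[i].name, name) == 0)
      return (demo_table[i].flags & enabled) != 0;
  return 0;
}
```

In this model, enabling only DEMO_VSX accepts "xvadddp" but rejects "lfiwzx"; having the VSX flag also turn on DEMO_POWER7 would accept both.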

Hi Alan and Peter,

I wonder if the assembler could enable POWER7 when PPCVSX gets enabled, as gcc
does now?

[Bug target/113652] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450

2024-01-29 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

Kewen Lin  changed:

           What    |Removed                        |Added
------------------------------------------------------------------------------
            Summary|[14 regression] Failed         |Failed bootstrap on ppc
                   |bootstrap on ppc               |unrecognized opcode:
                   |unrecognized opcode:           |`lfiwzx' with -mcpu=7450
                   |`lfiwzx' with -mcpu=7450       |

--- Comment #9 from Kewen Lin  ---
(In reply to Andrew Pinski from comment #8)
> So t-float128 has this line:
> # Build the emulator without ISA 3.0 hardware support.
> FP128_CFLAGS_SW  = -Wno-type-limits -mvsx -mfloat128 \
> ...
> 
> Which gets added to some of the libgcc object files while compiling:
> $(fp128_softfp_obj)  : INTERNAL_CFLAGS += $(FP128_CFLAGS_SW)
> $(fp128_ppc_obj) : INTERNAL_CFLAGS += $(FP128_CFLAGS_SW)
> 
> 
> The problem is CFLAGS gets added also. It seems like passing -mvsx enables
> some other instructions in GCC's code generation BUT does not enable it for
> the assembler ...

Ah, I just noticed that this is bootstrapping gcc. I'm stripping the regression
tag since, as per the comments above, I don't think it's actually a regression.

I found that the libgcc_cv_powerpc_float128 check passes with -mcpu=7450
-mabi=altivec -mvsx -mfloat128; the assembler options "-a32 -mppc -mvsx
-maltivec -mbig" are actually the same as those used for the compilation in
#c5. So it looks like -mvsx is supposed to tell the assembler to recognize vsx
instructions, but somehow "lfiwzx" is not counted as a vsx instruction.

More specifically, "xvadddp" is recognized by the assembler with -mvsx while
"lfiwzx" isn't.

$ cat t1.s
.machine "7450"
lfiwzx 1,0,9

$ cat t2.s
.machine "7450"
xvadddp 34,34,35

$ as -a32 -mppc -mvsx t1.s -o t1.o
t1.s: Assembler messages:
t1.s:2: Error: unrecognized opcode: `lfiwzx'
$ as -a32 -mppc -mvsx t2.s -o t2.o
$ echo $?
0

[Bug target/113652] [14 regression] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450

2024-01-29 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

--- Comment #7 from Kewen Lin  ---
oops, I meant --enable-checking rather than --checking.

[Bug target/113652] [14 regression] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450

2024-01-29 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

--- Comment #6 from Kewen Lin  ---
I think this is related to r10-580-ge154242724b084, and this failure is expected
and a usage error.

With it applied, we no longer always pass -many to the assembler when
CHECKING_P is enabled. Actually the compilers (gcc-13, gcc-12, gcc-11 or trunk)
generate the same assembly, but gcc-11/gcc-12/gcc-13 are built with
--checking=release by default, which doesn't set CHECKING_P, while trunk is
built with --checking=yes,extra by default, which sets CHECKING_P. That causes
the different behavior, which was then mistaken for a regression.

The issue should be gone once trunk gets released as gcc-14, or if it's built
with --checking=release. IMO Alan's commit aims to help expose more such
unexpected use cases so that users can fix them in place. As #c3 says, "PowerPC
7450 (aka PowerPC G4) is only capable of -maltivec but not -mvsx", so it's
unexpected to have -mcpu=7450 together with -mvsx. Could you check where the
-mvsx comes from and fix it instead?  Thanks!

By the way, a workaround is to add -Wa,-many, which restores the previous
behavior of passing -many to the assembler.

[Bug target/113507] can't build a cross compiler to rs6000-ibm-aix7.2

2024-01-22 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113507

Kewen Lin  changed:

           What    |Removed                        |Added
------------------------------------------------------------------------------
             Status|UNCONFIRMED                    |NEW
                 CC|                               |segher at gcc dot gnu.org
   Last reconfirmed|                               |2024-01-23
     Ever confirmed|0                              |1

--- Comment #5 from Kewen Lin  ---
(In reply to H.J. Lu from comment #3)
> (In reply to Kewen Lin from comment #2)
> > Guessing /usr/local/bin/ld is a gnu ld? Based on what I heard before, gnu ld
> > has some problems on aix, people pass object files to aix system and use aix
> > ld there. Not sure if the understanding still holds.
> 
> I am building a cross compiler.  No AIX tools are involved.

Thanks for clarifying, I was dull and misunderstood it.

Confirmed. Some symbols come from rs6000-builtin.cc (which is not generated),
and it in turn requires some symbols from rs6000-builtins.cc (which is
generated). Neither object file is included in linking. The diff below fixes
it:

diff --git a/gcc/config.gcc b/gcc/config.gcc
index b2d7d7dd475..6b62e4fe56c 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -557,8 +557,10 @@ rs6000*-*-*)
 extra_options="${extra_options} g.opt fused-madd.opt rs6000/rs6000-tables.opt"
 extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-logue.o"
 extra_objs="${extra_objs} rs6000-call.o rs6000-pcrel-opt.o"
+extra_objs="${extra_objs} rs6000-builtin.o rs6000-builtins.o"
 target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-logue.cc \$(srcdir)/config/rs6000/rs6000-call.cc"
 target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-pcrel-opt.cc"
+target_gtfiles="$target_gtfiles ./rs6000-builtins.h"
 ;;
 sparc*-*-*)
 cpu_type=sparc

According to David's comment, "rs6000-ibm-aix doesn't exist any more", and I
vaguely remember Segher also mentioning that rs6000*-*-*) has become stale, so
maybe we can aggressively drop the whole rs6000*-*-*) case handling?

[Bug target/113507] can't build a cross compiler to rs6000-ibm-aix7.2

2024-01-21 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113507

Kewen Lin  changed:

   What|Removed |Added

 CC||bergner at gcc dot gnu.org,
   ||dje at gcc dot gnu.org,
   ||linkw at gcc dot gnu.org

--- Comment #2 from Kewen Lin  ---
I'm guessing /usr/local/bin/ld is GNU ld? From what I've heard before, GNU ld
has some problems on AIX, so people pass object files to an AIX system and use
the AIX ld there. Not sure if that understanding still holds.

[Bug testsuite/109705] [14 regression] gcc.dg/vect/pr25413a.c fails after r14-333-g6d4b59a9356ac4

2024-01-21 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109705

Kewen Lin  changed:

   What|Removed |Added

 CC||linkw at gcc dot gnu.org

--- Comment #7 from Kewen Lin  ---
(In reply to Peter Bergner from comment #6)
> (In reply to GCC Commits from comment #5)
> > commit r14-7270-g39fa71a0882928a25bd170580e3e9e89a69dce36
> > Author: Kewen Lin 
> > Date:   Mon Jan 15 20:55:40 2024 -0600
> > 
> > testsuite: Fix vect_long_mult on Power [PR109705]
> > 
> > As pointed out by the discussion in PR109705, the current
> > vect_long_mult effective target check on Power is broken.
> > This patch is to fix it accordingly.
> 
> Does this need backporting?

I guess not: the only use of vect_long_mult in release branches is
gcc/testsuite/gcc.dg/vect/pr60656.c, which also has another check,
vect_widen_mult_si_to_di_pattern, that is unsupported on Power.

[Bug testsuite/113535] rs6000, testsuite: Re-visit the current vect_* for Power

2024-01-21 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113535

Kewen Lin  changed:

   What|Removed |Added

   Last reconfirmed||2024-01-22
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
 CC||bergner at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1

[Bug testsuite/113535] New: rs6000, testsuite: Re-visit the current vect_* for Power

2024-01-21 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113535

Bug ID: 113535
   Summary: rs6000, testsuite: Re-visit the current vect_* for Power
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: testsuite
  Assignee: unassigned at gcc dot gnu.org
  Reporter: linkw at gcc dot gnu.org
  Target Milestone: ---

Inspired by PR109705, I'm opening this to track revisiting the vect_* checks for
Power and fixing them where needed.

[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point

2024-01-18 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #4 from Kewen Lin  ---
(In reply to Naveen N Rao from comment #2)
> I don't really have a preference, though I tend to agree that nops before
> the local entry point aren't that useful. Even with the current approach,
> not all functions have instructions at the GEP and for those, the nops are
> being generated outside the function. We also won't have a separate GEP/LEP
> with pcrel, so we won't need a separate option eventually.

Thanks for the input! Looking forward to the comments from the others,
especially Segher, David and Peter.

(In reply to Michael Matz from comment #3)
> (In reply to Kewen Lin from comment #1)
> > 
> > As Segher's review comments in [2], to support "before NOPs" before global
> > entry and "after NOPs" after global entry,
> 
> Just to be perfectly clear here: the "after NOPs" need to come after local
> entry
> (which strictly speaking is of course after the global one as well, but I'm
> being anal :) ).

Oops, good catch, I meant to type "after local entry", thanks for the
correction making it perfectly clear. :)

[Bug testsuite/111850] [14 regression] gcc.target/powerpc/fold-vec-extract-char.p7.c fails after r14-4664-g04c9cf5c786b94

2024-01-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111850

Kewen Lin  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #6 from Kewen Lin  ---
Should be fixed on trunk.

[Bug target/99888] Add powerpc ELFv2 support for -fpatchable-function-entry*

2024-01-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99888

--- Comment #16 from Kewen Lin  ---
(In reply to Michael Matz from comment #15)
> Umm.  I just noticed this one as we now try to implement userspace live
> patching
> for ppc64le.  The point of the "before" NOPs is (and always was) that they
> are
> completely out of the way of patchable but as-of-yet unpatched functions.
> 
> For ppc that means the "before" and "after" NOPs cannot be consecutive.  The
> two
> NOP sets being consecutive was never a design criteria or requirement.
> 
> So, while the original bug is fixed by what was committed (local entry was
> skipping the patching-nops), the chosen solution is exactly the wrong one :-/

Thanks for the input! Sigh, sorry that we picked the wrong one :(. You may
have noticed that the main reason for choosing the current approach was to keep
it aligned with the consecutive NOPs described in the documentation; anything
else needs a separate command line option, as per Segher's review comment in
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600239.html. Now that we
have PR112980 filed for the requested behavior, let's discuss how to support it
there.

[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point

2024-01-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

Kewen Lin  changed:

   What|Removed |Added

 CC||matz at gcc dot gnu.org
   Last reconfirmed||2024-01-18
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
 Ever confirmed|0   |1
   See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99888
 Status|UNCONFIRMED |ASSIGNED

--- Comment #1 from Kewen Lin  ---
[1] made me realize that I forgot to post some comments here. (I thought I did
but actually didn't).

As Segher's review comments in [2] point out, to support "before NOPs" before
global entry and "after NOPs" after global entry, we need to introduce a separate
command line option. I think it can be a target-specific option that is
enabled by default, and we should mention its default behavior and impact in the
current documentation for -fpatchable-function-entry. I don't have a good name
candidate; any suggestions?

Considering that the current behavior, which keeps the NOPs consecutive, looks
useless (per this request and [1]), an alternative is to aggressively change the
current behavior to "before NOPs" before global entry and "after NOPs" after
global entry.

Any preference or other ideas?  Any comments are highly appreciated.

I think with either proposal it's inevitable that the current behavior of
-fpatchable-function-entry for "before NOPs" changes, so we should also document
this change in releases/changes.html.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99888#c15
[2] https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600239.html
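
For reference, the generic "before"/"after" split under discussion can also be requested per function with the `patchable_function_entry` attribute. The sketch below only shows that generic interface; `add_one` is an illustrative name, and how "function entry" maps onto the global versus local entry point on ELFv2 is exactly the open question here, which this example does not settle.

```c
/* Request 5 NOPs in total: 2 placed before the function entry label and
   the remaining 3 after it.  */
__attribute__ ((patchable_function_entry (5, 2)))
int
add_one (int x)
{
  return x + 1;
}
```

The command line form -fpatchable-function-entry=5,2 asks for the same layout for every function in the translation unit.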

[Bug other/113317] New test case libgomp.c++/ind-base-2.C fails with ICE

2024-01-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113317

Kewen Lin  changed:

   What|Removed |Added

 CC||linkw at gcc dot gnu.org

--- Comment #3 from Kewen Lin  ---
I can't reproduce this either; I tried at least one machine each of P8 LE, P9 LE,
P10 LE and P9 BE. I wonder which internal host was used for testing.

[Bug testsuite/113418] Use of vect_* target selectors in tests out of vect directories

2024-01-16 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113418

Kewen Lin  changed:

   What|Removed |Added

 CC||linkw at gcc dot gnu.org

--- Comment #4 from Kewen Lin  ---
Thanks for filing this, I just realized that it's unexpected to use vect_*
effective target checks outside */vect/ in generic test suites.

> 
> I just found them with a simple grep command so there might be false
> positives or false negatives.  There are also a dozen matches in gcc.target
> but I consider them fine as the target maintainers should know exactly what
> they are doing.

Yes, I think those in gcc.target should be fine. Although they could be replaced
with corresponding target-specific check(s), sometimes the vect_* form is more
readable.

[Bug testsuite/111850] [14 regression] gcc.target/powerpc/fold-vec-extract-char.p7.c fails after r14-4664-g04c9cf5c786b94

2024-01-15 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111850

Kewen Lin  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org

--- Comment #4 from Kewen Lin  ---
Just realized that we also escalated this test issue to P1, so I'm going to make
a patch to update the test case.

[Bug target/113341] Using GCC as the bootstrap compiler breaks LLVM on 32-bit PowerPC

2024-01-12 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113341

Kewen Lin  changed:

   What|Removed |Added

 CC||linkw at gcc dot gnu.org

--- Comment #9 from Kewen Lin  ---
Since the breakage happens during stage2, some of the stage1 build output must
behave unexpectedly.  You could try running the regression tests with just the
stage1 compiler to see if any regression shows up.

If that doesn't help, you probably have to isolate the problem to one or several
object files. Since you have objects from both the "good" and the "bad" stage1
compiler, you can mix them to narrow it down further. Once you have isolated
some, you can probably get hints on whether it's a bug in the LLVM source or in gcc.

It seems you are using gcc 13.2.1, as the version field shows. You can also try
previous versions such as gcc 12 and gcc 11 to see if they work, which would
make this a regression.
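
The object-file isolation suggested above is essentially a bisection over a mix of "good" and "bad" objects. A minimal sketch follows, assuming a single culprit object. All names are hypothetical, and `demo_mix_fails` merely simulates the real step of relinking a mixed set of objects and rerunning the failing test.

```c
/* Predicate supplied by the user: returns nonzero if a build that takes
   objects [0, n_bad) from the "bad" compiler and the rest from the "good"
   one still shows the failure.  */
typedef int (*mix_fails_fn) (int n_bad, void *ctx);

/* Return the index of the object file whose "bad" version triggers the
   failure.  Assumes a single culprit, i.e. the mix with 0 bad objects
   passes and the mix with all n_objects bad objects fails.  */
int
find_culprit_object (int n_objects, mix_fails_fn mix_fails, void *ctx)
{
  int lo = 0;          /* Known to pass with lo bad objects.  */
  int hi = n_objects;  /* Known to fail with hi bad objects.  */
  while (hi - lo > 1)
    {
      int mid = lo + (hi - lo) / 2;
      if (mix_fails (mid, ctx))
        hi = mid;
      else
        lo = mid;
    }
  return hi - 1;
}

/* Simulated predicate for demonstration only: *ctx holds the index of the
   culprit, so the mix fails once that object comes from the "bad" set.  */
int
demo_mix_fails (int n_bad, void *ctx)
{
  int culprit = *(const int *) ctx;
  return n_bad > culprit;
}
```

With N objects this needs about log2(N) relink-and-test rounds. If more than one object is involved, the result is only one of the culprits and the procedure has to be repeated.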

[Bug target/109987] ICE in in rs6000_emit_le_vsx_store on ppc64le with -Ofast -mno-power8-vector

2024-01-10 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109987

Kewen Lin  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org

--- Comment #3 from Kewen Lin  ---
As discussed in PR113115, I'm going to give option power{8,9}-vector removal a
shot.

[Bug target/113115] [14 Regression] ICE In extract_constrain_insn_cached recog.cc with ppc64le-linux-gnu crosscompiler from r14-3592-g9ea1248604d7b6

2024-01-10 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113115

--- Comment #7 from Kewen Lin  ---
(In reply to Peter Bergner from comment #5)
> I really dislike the -mpower{8,9}-vector options, but maybe it's too late to
> remove them for this release?  I'm not sure how involved/invasive that patch
> would be.  Segher, do you have a preference on remove them now or use the
> workaround above and remove in the next release?

(In reply to Segher Boessenkool from comment #6)
> Using -mpower9-vector while not having -mcpu=power9 (or later) is wrong, and
> should
> not work.  Using -mno-power9-vector is just weird.
> 
> If we can neuter the -mpower9-vector (etc.) options now, that would be good.
> But
> there are some complications with the testsuite at least?

OK, it sounds like it's still acceptable to adjust this at this point, so
I'm working on a patch to evaluate its impact and will post it after full testing.

[Bug middle-end/113100] [14 regression] many strub tests fail after r14-6737-g4e0a467302fea5

2024-01-09 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113100

Kewen Lin  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Kewen Lin  ---
Should be fixed on trunk.

[Bug target/111480] new test case g++.target/powerpc/altivec-19.C fails

2024-01-09 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111480

Kewen Lin  changed:

   What|Removed |Added

  Component|testsuite   |target
   Keywords|testsuite-fail  |missed-optimization
 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #2 from Kewen Lin  ---
Should be fixed.

[Bug testsuite/112751] [14 regression] gcc.target/powerpc/pcrel-sibcall-1.c fails after r14-5628-g53ba8d669550d3

2024-01-09 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112751

Kewen Lin  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #4 from Kewen Lin  ---
Should be fixed.

[Bug target/112606] [14 Regression] powerpc64le-linux-gnu: 'FAIL: gcc.target/powerpc/p8vector-fp.c scan-assembler xsnabsdp'

2024-01-09 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112606

Kewen Lin  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #7 from Kewen Lin  ---
Should be fixed on trunk now.

[Bug target/113115] [14 Regression] ICE In extract_constrain_insn_cached recog.cc with ppc64le-linux-gnu crosscompiler from r14-3592-g9ea1248604d7b6

2024-01-08 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113115

Kewen Lin  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #4 from Kewen Lin  ---
(In reply to Peter Bergner from comment #3)
> Ke Wen, is this just a duplicate of PR109987 and PR103627?  I know it was
> bisected to Jeevitha's commit, but it seems more like her commit exposed the
> same latent issue as those other PRs, rather than causing it.  Your thoughts?

Yes, I agree it's a duplicate of PR109987; Jeevitha's commit just exposed this
known issue. Since we are in stage 3, I wonder if we can go with the
power9-vector guarding first
(https://gcc.gnu.org/pipermail/gcc-patches/2021-December/587310.html), since
power9-vector still exists in this release, and then try to remove these
workaround options in the next release. (Sorry that I failed to follow up on the
power{8,9}-vector removal.)

*** This bug has been marked as a duplicate of bug 109987 ***

[Bug target/109987] ICE in in rs6000_emit_le_vsx_store on ppc64le with -Ofast -mno-power8-vector

2024-01-08 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109987

Kewen Lin  changed:

   What|Removed |Added

 CC||fkastl at suse dot cz

--- Comment #2 from Kewen Lin  ---
*** Bug 113115 has been marked as a duplicate of this bug. ***

[Bug testsuite/111480] new test case g++.target/powerpc/altivec-19.C fails

2024-01-07 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111480

Kewen Lin  changed:

   What|Removed |Added

   Last reconfirmed||2024-01-08
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
URL||https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642093.html
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
 CC||linkw at gcc dot gnu.org

[Bug target/112606] [14 Regression] powerpc64le-linux-gnu: 'FAIL: gcc.target/powerpc/p8vector-fp.c scan-assembler xsnabsdp'

2024-01-07 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112606

Kewen Lin  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
   Last reconfirmed||2024-01-08
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1

--- Comment #5 from Kewen Lin  ---
(In reply to seurer from comment #3)
> These tests also fail starting with
> g:9e9279fadbd1c673c875b9d20261d2de0473f63f, r14-5542-g9e9279fadbd1c6
> 
> FAIL: gcc.target/powerpc/float128-hw5.c scan-assembler-not \\mxscpsgnqp\\M
> FAIL: gcc.target/powerpc/float128-hw5.c scan-assembler-times \\mxsnabsqp\\M 1
> FAIL: gcc.target/powerpc/float128-hw7.c scan-assembler-not \\mxscpsgnqp\\M
> FAIL: gcc.target/powerpc/float128-hw7.c scan-assembler-times \\mxsnabsqp\\M 1

These failures are related to ieee128. The patch in #c4 only handles
float/double; a similar patch was posted for ieee128:
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642092.html

[Bug testsuite/112751] [14 regression] gcc.target/powerpc/pcrel-sibcall-1.c fails after r14-5628-g53ba8d669550d3

2024-01-07 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112751

Kewen Lin  changed:

   What|Removed |Added

URL||https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642091.html
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
 Status|NEW |ASSIGNED

[Bug middle-end/113100] [14 regression] many strub tests fail after r14-6737-g4e0a467302fea5

2024-01-07 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113100

Kewen Lin  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
URL||https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642090.html
 Status|NEW |ASSIGNED

[Bug testsuite/60031] dg-require-effective-target powerpc_vsx_ok is not enough

2024-01-04 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60031

Kewen Lin  changed:

   What|Removed |Added

 CC||linkw at gcc dot gnu.org

--- Comment #7 from Kewen Lin  ---
We have the vsx_hw effective target keyword, which uses check_vsx_hw_available:

# Return 1 if the target supports executing VSX instructions, 0
# otherwise.  Cache the result.

Doesn't it satisfy the requirement? Or am I missing something?
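
For illustration, a run test gated on that keyword might look like the sketch below. The test body and the name `sum4` are made up; only the vsx_hw keyword comes from the comment above, and the option string is an assumption about what such a test would typically use.

```c
/* { dg-do run } */
/* { dg-require-effective-target vsx_hw } */
/* { dg-options "-O2 -mvsx" } */

/* Illustrative test body: plain C that the vectorizer may or may not turn
   into VSX instructions.  */
static double
sum4 (const double *p)
{
  double s = 0.0;
  for (int i = 0; i < 4; i++)
    s += p[i];
  return s;
}

int
main (void)
{
  double v[4] = { 1.0, 2.0, 3.0, 4.0 };
  if (sum4 (v) != 10.0)
    __builtin_abort ();
  return 0;
}
```

With the directive in place the test is reported as UNSUPPORTED rather than FAIL on hardware that cannot execute VSX instructions.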

[Bug testsuite/106682] Powerpc test gcc.target/powerpc/pr86731-fwrapv-longlong.c fails on power8, passes on power9/power10

2024-01-03 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106682

Kewen Lin  changed:

   What|Removed |Added

 CC||seurer at gcc dot gnu.org

--- Comment #4 from Kewen Lin  ---
*** Bug 101444 has been marked as a duplicate of this bug. ***

[Bug testsuite/101444] [12/13/14 regression] gcc.target/powerpc/pr86731-fwrapv-longlong.c fails after r12-2266

2024-01-03 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101444

Kewen Lin  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 CC||linkw at gcc dot gnu.org
 Status|UNCONFIRMED |RESOLVED

--- Comment #4 from Kewen Lin  ---
Dup.

*** This bug has been marked as a duplicate of bug 106682 ***

[Bug middle-end/113100] [14 regression] many strub tests fail after r14-6737-g4e0a467302fea5

2023-12-21 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113100

Kewen Lin  changed:

   What|Removed |Added

 CC||linkw at gcc dot gnu.org
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2023-12-21

--- Comment #2 from Kewen Lin  ---
Confirmed, but reproducing it needs an explicit cpu type like -mcpu=power9;
otherwise it could pass on power10, as that can work with pcrel (so no toc base
r2 is needed). The change can extend the end of scrubbing, so it unexpectedly
clears the saved toc base.

I noticed that there is a macro, SPARC_STACK_BOUNDARY_HACK, which aims to
indicate this SPARC64-specific behavior. Could we leverage this macro (guarding
the biasing with it)? For example:

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index 125ea158ebf..9bad1e962b4 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -5450,6 +5450,7 @@ expand_builtin_stack_address ()
   rtx ret = convert_to_mode (ptr_mode, copy_to_reg (stack_pointer_rtx),
  STACK_UNSIGNED);

+#ifdef SPARC_STACK_BOUNDARY_HACK
   /* Unbias the stack pointer, bringing it to the boundary between the
  stack area claimed by the active function calling this builtin,
  and stack ranges that could get clobbered if it called another
@@ -5476,7 +5477,9 @@ expand_builtin_stack_address ()
  (caller) function's active area as well, whereas those pushed or
  allocated temporarily for a call are regarded as part of the
  callee's stack range, rather than the caller's.  */
-  ret = plus_constant (ptr_mode, ret, STACK_POINTER_OFFSET);
+  if (SPARC_STACK_BOUNDARY_HACK)
+ret = plus_constant (ptr_mode, ret, STACK_POINTER_OFFSET);
+#endif

   return force_reg (ptr_mode, ret);
 }
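
The guard in the diff combines a compile-time test (`#ifdef`) with a run-time test (`if`). A standalone sketch of that pattern follows. Every name and the offset value are hypothetical stand-ins rather than GCC's definitions.

```c
#include <stdint.h>

/* Hypothetical stand-ins, not GCC's definitions.  A target that needs the
   bias would define TARGET_BIAS_HACK in its header, as an expression that
   may still evaluate to zero depending on the selected options.  */
#define STACK_BIAS_OFFSET 2047
static int bias_in_effect = 1;
#define TARGET_BIAS_HACK (bias_in_effect)

uintptr_t
unbiased_stack_address (uintptr_t sp)
{
#ifdef TARGET_BIAS_HACK
  /* Compiled only where the macro exists; applied only when it is true.  */
  if (TARGET_BIAS_HACK)
    sp += STACK_BIAS_OFFSET;
#endif
  return sp;
}
```

On targets that never define the macro the adjustment is compiled out entirely, which is the effect the proposed change is after for Power.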

[Bug rtl-optimization/85099] [meta-bug] selective scheduling issues

2023-12-20 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85099
Bug 85099 depends on bug 112995, which changed state.

Bug 112995 Summary: sel-sched2 ICE without checking verify_changes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112995

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug rtl-optimization/112995] sel-sched2 ICE without checking verify_changes

2023-12-20 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112995

Kewen Lin  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED
   Target Milestone|--- |14.0

--- Comment #5 from Kewen Lin  ---
Should be fixed on trunk, guessing we don't want a backport, so closing.

[Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3

2023-12-18 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

Kewen Lin  changed:

   What|Removed |Added

 Resolution|FIXED   |---
 Status|RESOLVED|REOPENED

--- Comment #44 from Kewen Lin  ---
I just checked the test case in comment #43, and I think those Set/Load do
initialize those arrays as expected, so I'm re-opening this.

[Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3

2023-12-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #43 from Kewen Lin  ---
Created attachment 56899
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56899&action=edit
Previously reduced case for comment 10

[Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3

2023-12-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #42 from Kewen Lin  ---
(In reply to Richard Biener from comment #41)
> What's the "other" testcase?  Do we know that doesn't suffer from the same
> uninitialized issue?

By "other" test cases, I guess he was referring to my comment #c31, which points
to comments #c9 and #c10. Previously I further reduced #c10 and didn't detect any
obvious uninitialized issue (but I could be wrong).

[Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3

2023-12-15 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #38 from Kewen Lin  ---
I found that this has been marked as resolved, but it seems that the patch in
comment #34 hasn't been pushed. Is that intended? Or did I miss a commit that
was pushed but wasn't associated with this PR?
