[Bug target/114846] powerpc: epilogue in _Unwind_RaiseException corrupts return value due to __builtin_eh_return

2024-04-29 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114846

--- Comment #5 from Kewen Lin  ---
Created attachment 58067
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58067&action=edit
untested patch

[Bug testsuite/113535] rs6000, testsuite: Re-visit the current vect_* for Power

2024-04-29 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113535

--- Comment #1 from Kewen Lin  ---
One issue: https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650171.html

[Bug target/114846] powerpc: epilogue in _Unwind_RaiseException corrupts return value due to __builtin_eh_return

2024-04-28 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114846

Kewen Lin  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
 Status|NEW |ASSIGNED

--- Comment #4 from Kewen Lin  ---
(In reply to Andrew Pinski from comment #3)
> (In reply to Kewen Lin from comment #2)
> > As noted in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114843#c8, we may
> > need handling similar to r14-6440-g4b421728289e6f.
> 
> Note rs6000_emit_epilogue mostly handles eh_returns so it might not be as
> hard as other targets.

Yes, making a patch.

[Bug target/44793] [11/12/13/14/15 Regression] libgcc does not include t-ppccomm on rtems

2024-04-28 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44793

Kewen Lin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |WORKSFORME
 CC||linkw at gcc dot gnu.org

--- Comment #26 from Kewen Lin  ---
libgcc/config.host on gcc-11 has:

powerpc-*-rtems*)
  tmake_file="${tmake_file} rs6000/t-ppccomm rs6000/t-savresfgpr rs6000/t-crtstuff t-crtstuff-pic t-fdpbit"
  extra_parts="$extra_parts crtbeginS.o crtendS.o crtbeginT.o ecrti.o ecrtn.o ncrti.o ncrtn.o"
  ;;

I think this has already been fixed by r0-119741-g6f28886030623a.

Please feel free to reopen it if it still occurs on active releases. Thanks!

[Bug target/114846] powerpc: epilogue in _Unwind_RaiseException corrupts return value due to __builtin_eh_return

2024-04-25 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114846

--- Comment #2 from Kewen Lin  ---
As noted in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114843#c8, we may need
handling similar to r14-6440-g4b421728289e6f.

[Bug target/114846] powerpc: epilogue in _Unwind_RaiseException corrupts return value due to __builtin_eh_return

2024-04-25 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114846

Kewen Lin  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2024-04-25
 Status|UNCONFIRMED |NEW
 CC||bergner at gcc dot gnu.org,
   ||linkw at gcc dot gnu.org,
   ||segher at gcc dot gnu.org
 Target|powerpc64-linux-gnu |powerpc64*-linux-gnu
   |powerpc-linux-gnu   |powerpc-linux-gnu

--- Comment #1 from Kewen Lin  ---
Thanks for reporting. Confirmed; it also fails on LE (ppc64le-linux).

[Bug testsuite/114842] rs6000: Adjust some test cases with powerpc_vsx_ok

2024-04-25 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114842

--- Comment #1 from Kewen Lin  ---
We can extend powerpc_vsx to consider current_compiler_flags. That way, if a
test case has an explicit -mvsx, it can still be tested even when users specify
-mno-vsx, as long as the powerpc_vsx check concludes that VSX is enabled. This
keeps some of the previous testing coverage.

[Bug testsuite/114842] rs6000: Adjust some test cases with powerpc_vsx_ok

2024-04-25 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114842

Kewen Lin  changed:

   What|Removed |Added

 Target||powerpc*-linux-gnu
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
   Last reconfirmed||2024-04-25
   Target Milestone|--- |15.0
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED

[Bug testsuite/114842] New: rs6000: Adjust some test cases with powerpc_vsx_ok

2024-04-25 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114842

Bug ID: 114842
   Summary: rs6000: Adjust some test cases with powerpc_vsx_ok
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: testsuite
  Assignee: unassigned at gcc dot gnu.org
  Reporter: linkw at gcc dot gnu.org
  Target Milestone: ---

The current effective target powerpc_vsx_ok mainly checks that it's fine to
specify -mvsx (without any warnings etc.) and that this finally results in an
object file (meaning the underlying environment, such as the assembler, supports
VSX insns). But most of the test cases guarded with this check actually want to
check whether the VSX feature is enabled, for example because the wanted
behavior only happens with VSX enabled. When users specify -mno-vsx in
RUNTESTFLAGS, it can disable the VSX feature (with some old runtest, -mno-vsx
comes after -mvsx), but the powerpc_vsx_ok check still passes since it's fine to
specify -mvsx. So if a test case doesn't have an explicit -mvsx, the given
-mno-vsx disables VSX and makes that test case fail; and even if the test case
specifies -mvsx explicitly, it would still fail with old runtest as -mno-vsx
comes last. We already have another effective target, powerpc_vsx, which
effectively checks that VSX is enabled, so we should update most of the test
cases to adopt it instead.
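
To illustrate the difference, a test that really depends on VSX being enabled
could be guarded roughly as below. This is only a sketch: the dg-options line
and the function name vsx_enabled are assumptions made up for illustration,
while powerpc_vsx is the effective target named above and __VSX__ is the macro
GCC predefines on PowerPC when VSX is enabled.

```c
/* { dg-do compile } */
/* { dg-require-effective-target powerpc_vsx } */
/* { dg-options "-O2" } */

/* Returns 1 when the compiler was invoked with VSX enabled, else 0.
   __VSX__ is predefined by GCC on PowerPC only when VSX is on, so an
   extra -mno-vsx coming last on the command line makes this return 0.  */
int
vsx_enabled (void)
{
#ifdef __VSX__
  return 1;
#else
  return 0;
#endif
}
```

With a powerpc_vsx guard the test is simply skipped when vsx_enabled would
return 0, instead of being run and failing.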

[Bug target/88309] [11/12/13/14 Regression] ICE: Floating point exception (in is_miss_rate_acceptable), target assigning alignent of 4 bits(!) to vector

2024-04-24 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88309

Kewen Lin  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #11 from Kewen Lin  ---
Should be fixed on trunk and active release branches.

[Bug target/105359] _Float128 expanders and builtins disabled on ppc targets with 64-bit long double

2024-04-23 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105359

Kewen Lin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2024-04-23
   Keywords||missed-optimization
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
 CC||linkw at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #2 from Kewen Lin  ---
Thanks for reporting, I'll have a look.

[Bug testsuite/114744] test case gcc.target/powerpc/builtins-6-p9-runnable.c fails

2024-04-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114744

Kewen Lin  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Kewen Lin  ---
Should be fixed on trunk; since it's a test issue, no backporting is needed.

[Bug testsuite/114744] test case gcc.target/powerpc/builtins-6-p9-runnable.c fails

2024-04-16 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114744

Kewen Lin  changed:

   What|Removed |Added

 CC||linkw at gcc dot gnu.org
 Ever confirmed|0   |1
   Last reconfirmed||2024-04-17
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED

--- Comment #2 from Kewen Lin  ---
This is very likely a test issue: the vector load should take endianness into
account. I'll have a look.
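
As a rough illustration of why expected values that hard-code one byte order
break on the other, here is a small sketch. It is not the fix for
builtins-6-p9-runnable.c (which is not shown in this thread); the helper names
are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one.  */
int
is_little_endian (void)
{
  uint32_t word = 1;
  unsigned char first_byte;
  memcpy (&first_byte, &word, 1);
  return first_byte == 1;
}

/* Hypothetical helper: index, within a 16-byte memory image of four
   32-bit elements, of the byte holding the least significant byte of
   element ELT (0..3).  */
int
lsb_index_of_element (int elt)
{
  return is_little_endian () ? elt * 4 : elt * 4 + 3;
}
```

A runnable test that compares raw bytes therefore needs either two sets of
expected values or a check like the one above.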

[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point

2024-04-11 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #17 from Kewen Lin  ---
(In reply to Michael Matz from comment #16)
> (In reply to Kewen Lin from comment #15)
> > I agree, thanks for the comments! btw, I'm not fighting for the current
> > implementation; I just want to know in more detail why users are unable to
> > make use of it: is it just due to its inefficiency (like the above
> > sequence) or its unusability (it can't be used at all)? From your comments,
> > I think it's due to the former (inefficiency)?!
> 
> Okay.  So, yeah, I _think_ that other way (with NOPs between GEP and LEP,
> plus a jump around them) could be made to work with userspace live patching.
> It would just be inefficient.  But do note that that jump around was _not_
> part of the original way of -fpatchable-function-entry, so a change to
> codegen would have to have happened anyway to make that other way usable.
> And it has the (perhaps theoretical, who knows :) ) problem of not using the
> normal 8-byte difference between GEP and LEP.
> 

Thanks again for confirming this understanding!

> I think your current proposal from comment #10 is the better from all
> perspectives.

Yeah, I agree. When this support was reworked previously, an implementation like
the one in comment #10 was considered the better one, but it was not adopted
because of the concern that it could break the assumption that NOPs should be
consecutive. Based on all the inputs here, I think it's time to "fix" it and
simply underscore these special non-consecutive NOPs in the documentation.

[Bug target/114567] rs6000: explicit _Float128 doesn't generate optimal code

2024-04-10 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114567

--- Comment #1 from Kewen Lin  ---
This is power8 LE specific. For KFmode, the mov expander calls
rs6000_emit_le_vsx_move, so it's with a V1TI subreg; then the rs6000-specific
swaps pass generates one MEM with AND -16, which makes combine unable to
optimize it with the *signbit<mode>2_dm_mem pattern due to
mode_dependent_address_p always returning false for AND. Although it looks to me
like we can extend mode_dependent_address_p to consider the to-mode in that
context, the result is still sub-optimal due to the existence of AND -16, which
results in an explicit "and".
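
For readers less familiar with the "AND -16" address form: it just clears the
low four bits of the address, which appears to be what the
"rldicr 3,3,0,59" in the diff from the original report does. A minimal C
sketch of the same operation (the function name is made up for illustration):

```c
#include <stdint.h>

/* Clear the low four bits of ADDR, i.e. round it down to a 16-byte
   boundary.  This is the same operation as ADDR & -16.  */
uintptr_t
align_down_16 (uintptr_t addr)
{
  return addr & ~(uintptr_t) 15;
}
```

Since the masking changes the address whenever it is not already 16-byte
aligned, it cannot simply be dropped when narrowing the load.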

[Bug testsuite/114662] [14 regression] new test case c_lto_pr113359-2 from r14-9841-g1e3312a25a7b34 fails

2024-04-10 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114662

Kewen Lin  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #4 from Kewen Lin  ---
Should be fixed on latest trunk.

[Bug testsuite/114662] [14 regression] new test case c_lto_pr113359-2 from r14-9841-g1e3312a25a7b34 fails

2024-04-10 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114662

Kewen Lin  changed:

   What|Removed |Added

  Component|lto |testsuite
   Target Milestone|--- |14.0
   Keywords||testsuite-fail

[Bug lto/114662] [14 regression] new test case c_lto_pr113359-2 from r14-9841-g1e3312a25a7b34 fails

2024-04-09 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114662

Kewen Lin  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
 CC||linkw at gcc dot gnu.org
   Last reconfirmed||2024-04-10
 Status|UNCONFIRMED |ASSIGNED

--- Comment #2 from Kewen Lin  ---
I think this is a test issue. With -m32, unsigned long is 4 bytes while CL1 and
CL2 are 8-byte constants, so the compiler considers that some checks would
always fail and the abort will happen. Since the optimization aggressively
optimizes away the call to getb, there is no chance to further check "semantic
equality". The IR for main at *.015t.cfg looks like:

int main (int argc, char * * argv)
{
  struct SB b;
  struct SA a;
  int D.3983;

   :
  init ();
  geta (, );
  _1 = a.ax;
  if (_1 != 3735928559)
goto ; [INV]
  else
goto ; [INV]

   :
  __builtin_abort ();

   :
  __builtin_abort ();

}
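
The effect can be sketched in plain C as below. This is only a model: the
constant is a hypothetical stand-in for CL1/CL2 (their real values are not
shown here), and uint32_t stands in for unsigned long under -m32.

```c
#include <stdint.h>

/* Model of "unsigned long" under -m32: a 32-bit unsigned type.  */
typedef uint32_t ulong32;

/* Hypothetical 8-byte constant standing in for CL1/CL2.  */
#define WIDE_CONST 0xdeadbeef12345678ULL

/* Store the wide constant into a 32-bit field, as would happen under
   -m32, and return what survives the truncation.  */
ulong32
store_truncated (void)
{
  return (ulong32) WIDE_CONST;
}

/* Returns 1 if the stored field still compares equal to the constant.
   With a 32-bit field this is always 0, so a compiler may fold the
   check and keep only the abort path.  */
int
field_matches_constant (void)
{
  ulong32 field = store_truncated ();
  return (uint64_t) field == WIDE_CONST;
}
```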

[Bug rtl-optimization/114664] -fno-omit-frame-pointer causes an ICE during the build of the greenlet package

2024-04-09 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114664

--- Comment #8 from Kewen Lin  ---
(In reply to Peter Bergner from comment #7)
> (In reply to Andrew Pinski from comment #6)
> > Pre-IRA fix was done to specifically reject this:
> > https://inbox.sourceware.org/gcc-patches/
> > ab3a61990702021658w4dc049cap53de8010a7d86...@mail.gmail.com/
> 
> Then that would seem to indicate that mentioning the frame pointer reg in
> the asm clobber list is an error, but how are users supposed to know whether
> -fno-omit-frame-pointer is in effect or not?  I've looked and there is no
> pre-defined macro a user could check.

I noticed that even without -fno-omit-frame-pointer, the test case still fails
with the same symptom (with an error message rather than an ICE). Did I miss
something?

[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point

2024-04-09 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #15 from Kewen Lin  ---
(In reply to Michael Matz from comment #14)
> Hmm?  But this is not how the global-to-local hand-off is implemented (and
> expected by tooling): a fall-through.  The global entry sets up the GOT
> register, there simply is no '[b localentry]'.
> 
> If you mean to imply that also the '[b localentry]' should be patched in at
> live-patch application time (and hence the GOT setup would need to be moved
> to still somewhere else), then you have the problem that (in the
> not-yet-patched 
> case) as long as the L1-nops sit between global and local entry they will
> always 
> be executed when the global entry is called.

Sorry for the confusion, I meant a sequence like:

global entry:
  [TOC base setup] // always here
  [b localentry] // which is added when patching
L1:
  [patched code] // from patching
localentry:
  [b L1] // from patching

> That's wasteful.

I agree, nops are not zero cost on Power8/Power9.

> 
> Additionally tooling will be surprised if the address difference between
> global and local entry isn't exactly 8 (i.e. two instructions).  The psABI
> allows for different values, of course.  But I'm willing to bet that there
> are
> bugs in the wild when different values would be actually used.
> 

It's possible that some tooling doesn't conform to the ABI doc well, but I think
the tooling should fix itself if that is the case. :)

> So, the nops-between-gep-and-lep could probably be somehow made to work with
> userspace live patching, but your most recent patch here makes this all mood.
> It generates exactly the sequence we want: a single nop at the LEP, and
> a configurable patching area outside of, but near to, the function (here: in
> front of the GEP).

I agree, thanks for the comments! btw, I'm not fighting for the current
implementation; I just want to know in more detail why users are unable to make
use of it: is it just due to its inefficiency (like the above sequence) or its
unusability (it can't be used at all)? From your comments, I think it's due to
the former (inefficiency)?!

[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point

2024-04-08 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #13 from Kewen Lin  ---
(In reply to Giuliano Belinassi from comment #12)
> With your patch we have:
> 
> > .LPFE0:
> > ...
> Which seems what is expected.

Hi Giuliano, thanks for your time on testing it!  Could you kindly explain a bit
why "In such way we can't use the this space to place a trampoline to the new
function"? Is it due to inefficient code, like needing more branches?

global entry:
  [b localentry]
L1:
  [patched code]

localentry:
  [b L1]

Or is there some other reason which makes it unusable at all?

[Bug testsuite/114614] New test case gcc.misc-tests/gcov-20.c from r14-9789-g08a52331803f66 fails

2024-04-08 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114614

Kewen Lin  changed:

   What|Removed |Added

   Target Milestone|--- |14.0
 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Kewen Lin  ---
Should be fixed on latest trunk.

[Bug testsuite/114642] new test case gcc.dg/debug/btf/btf-datasec-3.c from r14-6195-gb8cf266f4ca4ff fails for 32 bits

2024-04-08 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114642

Kewen Lin  changed:

   What|Removed |Added

URL||https://gcc.gnu.org/pipermail/gcc-patches/2024-April/648994.html
 CC||linkw at gcc dot gnu.org
 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |david.faust at oracle dot com

--- Comment #2 from Kewen Lin  ---
David posted a fix (see URL).

[Bug testsuite/114614] New test case gcc.misc-tests/gcov-20.c from r14-9789-g08a52331803f66 fails

2024-04-07 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114614

Kewen Lin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
   Last reconfirmed||2024-04-08
 CC||linkw at gcc dot gnu.org

--- Comment #1 from Kewen Lin  ---
It requires effective target profile_update_atomic.

[Bug target/114567] rs6000: explicit _Float128 doesn't generate optimal code

2024-04-02 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114567

Kewen Lin  changed:

   What|Removed |Added

   Target Milestone|--- |15.0
   Keywords||missed-optimization
 Target||powerpc64*-linux-gnu
   Last reconfirmed||2024-04-03
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org

[Bug target/114567] New: rs6000: explicit _Float128 doesn't generate optimal code

2024-04-02 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114567

Bug ID: 114567
   Summary: rs6000: explicit _Float128 doesn't generate optimal
code
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: linkw at gcc dot gnu.org
  Target Milestone: ---

This is an issue which I happened to spot while working on patches for
PR112993.

=== test case ===

#define TYPE _Float128

#ifdef LD
#undef TYPE
#define TYPE long double
#endif

int sbm (TYPE *a) { return __builtin_signbit (*a); }

==

/opt/gcc-nightly/trunk/bin/gcc -mcpu=power8 -mvsx -O2 -mabi=ieeelongdouble
-Wno-psabi test.c -DLD -S -o ref.s
/opt/gcc-nightly/trunk/bin/gcc -mcpu=power8 -mvsx -O2 -mabi=ibmlongdouble
-Wno-psabi test.c -S -o float128.s

diff -Nur ref.s float128.s
--- ref.s   2024-03-18 05:41:00.302208975 -0400
+++ float128.s  2024-03-18 05:41:00.392205513 -0400
@@ -9,7 +9,10 @@
 sbm:
 .LFB0:
.cfi_startproc
-   ld 3,8(3)
+   rldicr 3,3,0,59
+   lxvd2x 0,0,3
+   xxpermdi 0,0,0,2
+   mfvsrd 3,0
srdi 3,3,63
blr
.long 0

[Bug target/88309] [11/12/13/14 Regression] ICE: Floating point exception (in is_miss_rate_acceptable), target assigning alignent of 4 bits(!) to vector

2024-04-02 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88309

Kewen Lin  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org

--- Comment #6 from Kewen Lin  ---
(In reply to Andrew Pinski from comment #5)
> (In reply to Kewen Lin from comment #4)
> > Hi Andrew, thanks for digging into this!  William no longer works on the
> > GCC project; will you make a patch for this?
> 
> I don't have time to test it really.

No problem, I'll work on this.

[Bug target/88309] [11/12/13/14 Regression] ICE: Floating point exception (in is_miss_rate_acceptable), target assigning alignent of 4 bits(!) to vector

2024-04-02 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88309

Kewen Lin  changed:

   What|Removed |Added

 CC||linkw at gcc dot gnu.org

--- Comment #4 from Kewen Lin  ---
(In reply to Andrew Pinski from comment #3)
> Found it:
>   /* In GIMPLE the type of the MEM_REF specifies the alignment.  The
> required alignment (power) is 4 bytes regardless of data type.  */
>   tree align_ltype = build_aligned_type (lhs_type, 4);
> 
> That should be 4*8 instead of just 4.
> 
> There are 2 build_aligned_type in rs6000-builtins.cc which uses the wrong
> alignment; thinking it was the alignment argument was bytes rather than bits.
> 
> Introduced by r9-2375-g3f7a77cd20d07c which means this is a regression.

Hi Andrew, thanks for digging into this!  William no longer works on the GCC
project; will you make a patch for this?
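
For reference, the bytes-versus-bits mix-up described in the quoted comment can
be sketched as below. The helper names and the BITS_PER_BYTE macro are made up
for illustration (GCC itself uses BITS_PER_UNIT); this is not the actual patch.

```c
/* Number of bits in one addressable unit; 8 on the targets discussed
   here.  */
#define BITS_PER_BYTE 8

/* Convert an alignment given in bytes to the bit count that an
   interface taking bits expects.  */
unsigned int
align_bytes_to_bits (unsigned int align_bytes)
{
  return align_bytes * BITS_PER_BYTE;
}

/* Convert an alignment given in bits back to whole bytes.  Passing 4
   here (the buggy value) yields 0 bytes.  */
unsigned int
align_bits_to_bytes (unsigned int align_bits)
{
  return align_bits / BITS_PER_BYTE;
}
```

So a 4-byte requirement has to be passed as 32, matching the "4*8" in the
quoted comment. A zero byte alignment looks like a plausible source of the
arithmetic fault in the bug title, though that is only a guess.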

[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point

2024-04-02 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #11 from Kewen Lin  ---
(In reply to Giuliano Belinassi from comment #9)
> Yes, this is for userspace livepatching.
> 
> Assume the following example:
> https://godbolt.org/z/b9M8nMbo1
> 
> As one can see, the sequence of 14 nops are generated after the global
> function entry point. In such way we can't use the this space to place a
> trampoline to the new function. We need this sequence of nops to be placed
> *before* the global function entry point.
> 

Hi Giuliano, thanks for the inputs!

[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point

2024-04-02 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #10 from Kewen Lin  ---
Created attachment 57844
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57844&action=edit
patch changing the current implementation

Considering that the current implementation is not useful at all for either
kernel or userspace uses, I'm inclined to change the current implementation
instead of introducing another option, and to update the documentation to
emphasize that the NOPs may not be consecutive in this case.
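
For context, a patchable area can also be requested per function with the
patchable_function_entry attribute, as in the sketch below. The function and
the counts are hypothetical; where exactly the NOPs end up relative to the
global and local entry points on ELFv2 is what this bug is about, so the
sketch makes no claim about that.

```c
/* Hypothetical function carrying a patchable area.  The attribute asks
   for 14 NOPs in total, 13 of them placed before the function entry
   point; these counts are illustrative only.  */
__attribute__ ((patchable_function_entry (14, 13)))
int
add_one (int x)
{
  return x + 1;
}
```

The attribute only affects the padding around the entry point, not what the
function computes.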

[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point

2024-04-01 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #8 from Kewen Lin  ---
Hi @Michael, @Martin, could you help to confirm/clarify what makes you
interested in this feature? Is it for some user space usage or not?

[Bug target/114402] rs6000: ICE when long double is ieee128 format by default but without vsx

2024-03-20 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114402

--- Comment #1 from Kewen Lin  ---
Currently the only pattern to match IEEE128 comparison is:

;; IEEE 128-bit comparisons
(define_insn "*cmp<mode>_hw"
  [(set (match_operand:CCFP 0 "cc_reg_operand" "=y")
        (compare:CCFP (match_operand:IEEE128 1 "altivec_register_operand" "v")
                      (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<mode>mode)"
   "xscmpuqp %0,%1,%2"
  [(set_attr "type" "veccmp")
   (set_attr "size" "128")])

It requires TARGET_FLOAT128_HW, so nothing can be used for matching.

The patch below can fix this ICE; it makes no-vsx IEEE128 also go with a libfunc
call, as !TARGET_FLOAT128_HW && FLOAT128_VECTOR_P (mode) does.

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 5d975dab921..237d138faec 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -15329,7 +15329,7 @@ rs6000_generate_compare (rtx cmp, machine_mode mode)
   rtx op0 = XEXP (cmp, 0);
   rtx op1 = XEXP (cmp, 1);

-  if (!TARGET_FLOAT128_HW && FLOAT128_VECTOR_P (mode))
+  if (!TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode))
 comp_mode = CCmode;
   else if (FLOAT_MODE_P (mode))
 comp_mode = CCFPmode;
@@ -15361,7 +15361,7 @@ rs6000_generate_compare (rtx cmp, machine_mode mode)

   /* IEEE 128-bit support in VSX registers when we do not have hardware
  support.  */
-  if (!TARGET_FLOAT128_HW && FLOAT128_VECTOR_P (mode))
+  if (!TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode))
 {
   rtx libfunc = NULL_RTX;
   bool check_nan = false;

[Bug target/114402] rs6000: ICE when long double is ieee128 format by default but without vsx

2024-03-20 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114402

Kewen Lin  changed:

   What|Removed |Added

 CC||bergner at gcc dot gnu.org,
   ||g...@the-meissners.org,
   ||segher at gcc dot gnu.org
   Last reconfirmed||2024-03-21
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org

[Bug target/114402] rs6000: ICE when long double is ieee128 format by default but without vsx

2024-03-20 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114402

Kewen Lin  changed:

   What|Removed |Added

 Target||powerpc64*-linux-gnu
   Keywords||ice-on-valid-code
   Target Milestone|--- |15.0
  Known to fail||12.3.1, 13.2.1

[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point

2024-03-20 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #6 from Kewen Lin  ---
(In reply to Martin Jambor from comment #5)
> I'd like to ping this, are there plans to implement this in the near-ish
> term?

Some weeks ago, Naveen was doing some experiments to see if there is a better
way to support the function tracer. If the idea works and the experiment result
is promising, he may request something different, so we are still waiting for
that. @Naveen Feel free to correct me if I have misunderstood anything.

[Bug target/114402] New: rs6000: ICE when long double is ieee128 format by default but without vsx

2024-03-20 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114402

Bug ID: 114402
   Summary: rs6000: ICE when long double is ieee128 format by
default but without vsx
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: linkw at gcc dot gnu.org
  Target Milestone: ---

When I was doing a patch to make us only have two 128bit fp on rs6000, I found
that we can have long double with ieee128 format by default even without vsx
support, but a simple test case with a comparison triggers an ICE as below:

long double a;
long double b;

int foo() {
  if (a > b)
    return 0;
  else
    return 1;
}

/opt/gcc-nightly/trunk/bin/gcc test.c -mno-vsx

test.c: In function ‘foo’:
test.c:9:1: error: unrecognizable insn:
9 | }
  | ^
(insn 9 8 10 2 (set (reg:CCFP 123)
(compare:CCFP (reg:TF 117 [ a.0_1 ])
(reg:TF 118 [ b.1_2 ]))) "test.c":5:6 -1
 (nil))
during RTL pass: vregs
test.c:9:1: internal compiler error: in extract_insn, at recog.cc:2812
0x102b7353 _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
/home/gccbuild/gcc_trunk_git/gcc/gcc/rtl-error.cc:108
0x102b73a7 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
/home/gccbuild/gcc_trunk_git/gcc/gcc/rtl-error.cc:116
0x10c6636b extract_insn(rtx_insn*)
/home/gccbuild/gcc_trunk_git/gcc/gcc/recog.cc:2812
0x107ef797 instantiate_virtual_regs_in_insn
/home/gccbuild/gcc_trunk_git/gcc/gcc/function.cc:1611
0x107ef797 instantiate_virtual_regs
/home/gccbuild/gcc_trunk_git/gcc/gcc/function.cc:1994
0x107ef797 execute
/home/gccbuild/gcc_trunk_git/gcc/gcc/function.cc:2041
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

Note that it should be configured with --with-long-double-format=ieee, since
-mabi=ieeelongdouble, if specified, requires vsx to be enabled.

[Bug testsuite/114320] New test case in r14-9439-g4aa87b856067d4 fails

2024-03-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114320

--- Comment #3 from Kewen Lin  ---
(In reply to Nathaniel Shead from comment #2)
> Sorry about that. I've not been able to work out what configure flags I need
> to pass to cause this to error in the first place (I don't normally develop
> for powerpc and the machine I'm using doesn't seem to fail no matter what

I guess the machine you are using (were referring to) doesn't have a powerpc
chip. cfarm provides some powerpc machines
(https://portal.cfarm.net/machines/list/), both ppc64le (LE -m64) and ppc64
(BE -m32/-m64); it's recommended to leverage them for building/testing. :)

> flags I try), but am I correct in understanding that just adding
> "-Wno-psabi" to the tests should stop them from failing? If so I'm happy to
> push a patch to that effect.

I think so. For now we don't have an effective target dedicated to the __ibm128
type, but it's guarded the same way as the __float128 type (this may be relaxed
in future, though even then using ppc_float128_sw would just be stricter).
Ideally we can add one effective target powerpc_vsx_ok (should be powerpc_vsx)
to ensure VSX is enabled, but considering we are going to rework it in the next
release and we don't normally disable vsx explicitly, this can be optional.

[Bug testsuite/114320] New test case in r14-9439-g4aa87b856067d4 fails

2024-03-12 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114320

Kewen Lin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2024-03-13
 Ever confirmed|0   |1
 CC||linkw at gcc dot gnu.org

--- Comment #1 from Kewen Lin  ---
These new test cases require "-Wno-psabi" to suppress the warning.

[Bug testsuite/101461] [12/13/14 regression] gcc.target/powerpc/fold-vec-load-builtin_vec_xl test cases fail after r12-2266

2024-03-05 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101461

Kewen Lin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED
 CC||linkw at gcc dot gnu.org

--- Comment #4 from Kewen Lin  ---
Already fixed by r12-2889-g8464894c86b03e.

[Bug target/113507] can't build a cross compiler to rs6000-ibm-aix7.2

2024-02-26 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113507

Kewen Lin  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |segher at gcc dot gnu.org

--- Comment #6 from Kewen Lin  ---
Segher will clean up this rs6000-*-* thing in the next release, please use
powerpc*-*-* instead.

[Bug testsuite/106680] Test gcc.target/powerpc/bswap64-4.c fails on 32-bit BE

2024-02-05 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106680

--- Comment #12 from Kewen Lin  ---
(In reply to Sebastian Huber from comment #10)
> (In reply to Kewen Lin from comment #9)
> > Note that now we only disable implicit powerpc64 for -m32 when the
> > OS_MISSING_POWERPC64 is set.
> > 
> >   /* Don't expect powerpc64 enabled on those OSes with OS_MISSING_POWERPC64,
> >  since they do not save and restore the high half of the GPRs correctly
> >  in all cases.  If the user explicitly specifies it, we won't interfere
> >  with the user's specification.  */
> > #ifdef OS_MISSING_POWERPC64
> >   if (OS_MISSING_POWERPC64
> >       && TARGET_32BIT
> >       && TARGET_POWERPC64
> >       && !(rs6000_isa_flags_explicit & OPTION_MASK_POWERPC64))
> >     rs6000_isa_flags &= ~OPTION_MASK_POWERPC64;
> > #endif
> > 
> > But rtems.h doesn't define OS_MISSING_POWERPC64
> 
> RTEMS supports the 64-bit PowerPC for the 64-bit multilibs.
> 

A 64-bit kernel should support 64-bit PowerPC, but does a 32-bit kernel support
saving and restoring 64-bit regs?

The current rtems.h is saying yes; if the answer is no, we should fix rtems.h
and then you won't need the explicit -mno-powerpc64.


btw, take the comment in freebsd64.h as an example:

/* FreeBSD doesn't support saving and restoring 64-bit regs with a 32-bit
   kernel. This is supported when running on a 64-bit kernel with
   COMPAT_FREEBSD32, but tell GCC it isn't so that our 32-bit binaries
   are compatible. */
#define OS_MISSING_POWERPC64 !TARGET_64BIT

[Bug testsuite/106680] Test gcc.target/powerpc/bswap64-4.c fails on 32-bit BE

2024-02-05 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106680

--- Comment #11 from Kewen Lin  ---
(In reply to Sebastian Huber from comment #8)
> Yes, it seems that -mcpu=e6500 -mno-powerpc64 yields the right code for the
> attached test case (with or without the -m32).

The default is -m32 I guess? :)

> 
> I am now a bit confused what the purpose of the -m32 and -m64 options is.

For -m32/-m64, the manual says:

Generate code for 32-bit or 64-bit environments of Darwin and SVR4 targets
(including GNU/Linux). The 32-bit environment sets int, long and pointer to 32
bits and generates code that runs on any PowerPC variant. The 64-bit
environment sets int to 32 bits and long and pointer to 64 bits, and generates
code for PowerPC64, as for -mpowerpc64.

But -m32 can interact with the powerpc64 option. For example, cpu e6500
supports powerpc64 by default, and if the target OS is able to support the
necessary context switches, we want -mpowerpc64 kept, since it allows more
efficient code (leveraging insns guarded with the powerpc64 flag).
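
As a rough illustration of the manual text quoted above, the two environments
can be written down as a small lookup table. This is only a sketch: the struct
and the function name lookup_data_model are made up for this example and are
not part of GCC. It models the type widths only and says nothing about
-mpowerpc64, which is a separate question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Type widths in bits for one environment, as described in the manual.  */
struct data_model
{
  const char *option;   /* "-m32" or "-m64" */
  int int_bits;
  int long_bits;
  int pointer_bits;
};

/* Hypothetical table; the values come from the manual text above.  */
static const struct data_model models[] = {
  { "-m32", 32, 32, 32 },
  { "-m64", 32, 64, 64 },
};

/* Look up the model for OPTION; return NULL if the option is unknown.  */
const struct data_model *
lookup_data_model (const char *option)
{
  for (size_t i = 0; i < sizeof models / sizeof models[0]; i++)
    if (strcmp (models[i].option, option) == 0)
      return &models[i];
  return NULL;
}
```

For example, lookup_data_model ("-m64")->long_bits gives 64, while int stays at
32 bits in both environments.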

[Bug testsuite/106680] Test gcc.target/powerpc/bswap64-4.c fails on 32-bit BE

2024-02-05 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106680

--- Comment #9 from Kewen Lin  ---
Note that now we only disable implicit powerpc64 for -m32 when the
OS_MISSING_POWERPC64 is set.

  /* Don't expect powerpc64 enabled on those OSes with OS_MISSING_POWERPC64,
     since they do not save and restore the high half of the GPRs correctly
     in all cases.  If the user explicitly specifies it, we won't interfere
     with the user's specification.  */
#ifdef OS_MISSING_POWERPC64
  if (OS_MISSING_POWERPC64
      && TARGET_32BIT
      && TARGET_POWERPC64
      && !(rs6000_isa_flags_explicit & OPTION_MASK_POWERPC64))
    rs6000_isa_flags &= ~OPTION_MASK_POWERPC64;
#endif

But rtems.h doesn't define OS_MISSING_POWERPC64, unlike the following:

gcc/config/rs6000/linux.h:#define OS_MISSING_POWERPC64 1
gcc/config/rs6000/freebsd64.h:#define OS_MISSING_POWERPC64 !TARGET_64BIT
gcc/config/rs6000/aix.h:#define OS_MISSING_POWERPC64 1
gcc/config/rs6000/linux64.h:#define OS_MISSING_POWERPC64 !TARGET_64BIT

Meanwhile, cpu "e6500" has MASK_POWERPC64 set by default (it's a 64-bit core).

That's why you still have the powerpc64 flag set when you specify -m32 on rtems.
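
To make the decision above easier to follow, here is a minimal standalone model
of it. It is a sketch, not GCC code: struct target_state and powerpc64_enabled
are hypothetical names, the booleans stand in for the macros and flag bits
named in the snippet, and the -m64 case is simplified to "always on".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the state consulted above; not GCC's types.  */
struct target_state
{
  bool os_missing_powerpc64;  /* OS_MISSING_POWERPC64; false if undefined */
  bool is_32bit;              /* TARGET_32BIT, i.e. -m32 */
  bool cpu_has_powerpc64;     /* set by the -mcpu default, e.g. e6500 */
  bool explicit_powerpc64;    /* user wrote -mpowerpc64 or -mno-powerpc64 */
  bool explicit_value;        /* what the user asked for, if explicit */
};

/* Return whether the powerpc64 flag ends up set in this model.  */
bool
powerpc64_enabled (const struct target_state *s)
{
  /* Simplification: the 64-bit environment always generates PowerPC64
     code, so only the -m32 path is modelled in detail.  */
  if (!s->is_32bit)
    return true;

  /* An explicit user choice is never overridden.  */
  if (s->explicit_powerpc64)
    return s->explicit_value;

  /* The implicit (cpu default) flag is dropped only when the OS says it
     cannot save and restore the high halves of the GPRs.  */
  if (s->os_missing_powerpc64 && s->cpu_has_powerpc64)
    return false;

  return s->cpu_has_powerpc64;
}
```

With os_missing_powerpc64 false (the rtems.h case) and the e6500 default, the
function returns true for -m32, which matches the behavior reported here. The
same input with os_missing_powerpc64 true returns false.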

[Bug testsuite/106680] Test gcc.target/powerpc/bswap64-4.c fails on 32-bit BE

2024-02-05 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106680

--- Comment #7 from Kewen Lin  ---
(In reply to Sebastian Huber from comment #6)
> It seems that the change
> 
> commit acc727cf02a1446dc00f8772f3f479fa3a508f8e
> Author: Kewen Lin 
> Date:   Tue Dec 27 04:13:07 2022 -0600
> 
> rs6000: Rework option -mpowerpc64 handling [PR106680]
> 
> causes a regression for -mcpu=e6500 -m32, for example:
> 
> gcc -fpreprocessed -O2 -S -mcpu=e6500 -m32 -S imfs_add_node.c.67.s
> imfs_add_node.c.67.i
> 
> diff -u imfs_add_node.c.67.s.good.e2acff49fb2962b921bf8b73984b89878b61492c
> imfs_add_node.c.67.s.bad.acc727cf02a1446dc00f8772f3f479fa3a508f8e
> --- imfs_add_node.c.67.s.good.e2acff49fb2962b921bf8b73984b89878b61492c 
> 2024-01-20 12:15:15.143182571 +0100
> +++ imfs_add_node.c.67.s.bad.acc727cf02a1446dc00f8772f3f479fa3a508f8e  
> 2024-01-20 12:11:46.804204927 +0100
> @@ -52,8 +52,8 @@
> bne- 0,.L4
>  .L2:
> mr 4,29
> -   addi 3,1,8
> li 5,24
> +   addi 3,1,8
> bl rtems_filesystem_eval_path_start
> lis 9,IMFS_node_clone@ha
> lwz 10,20(3)
> @@ -63,12 +63,12 @@
> cmpw 0,10,9
> beq- 0,.L24
> li 4,134
> -   addi 3,1,8
> +   li 3,0
> bl rtems_filesystem_eval_path_error
>  .L9:
> li 31,-1
>  .L10:
> -   addi 3,1,8
> +   li 3,0
> bl rtems_filesystem_eval_path_cleanup
>  .L1:
> lwz 0,116(1)
> @@ -93,7 +93,7 @@
> lwz 9,12(31)
> li 8,96
> lhz 10,16(31)
> -   addi 3,1,8
> +   li 3,0
> stw 8,24(1)
> stw 9,8(1)
> stw 10,12(1)
> @@ -105,7 +105,7 @@
> cmpwi 0,9,0
> beq- 0,.L9
> li 4,22
> -   addi 3,1,8
> +   li 3,0
> bl rtems_filesystem_eval_path_error
> b .L9
> .p2align 4,,15
> @@ -129,12 +129,9 @@
> stw 9,0(10)
> stw 10,4(9)
> bl _Timecounter_Getbintime
> -   lwz 10,64(1)
> -   lwz 11,68(1)
> -   stw 10,40(30)
> -   stw 11,44(30)
> -   stw 10,48(30)
> -   stw 11,52(30)
> +   ld 9,64(1)
> +   std 9,40(30)
> +   std 9,48(30)
> b .L10
> .cfi_endproc
>  .LFE351:
> 
> For the call to rtems_filesystem_eval_path_cleanup() the register 3 should
> point to a structure on the stack. Correct is:
> 
> -   addi 3,1,8
> 
> Wrong is:
> 
> +   li 3,0
> 
> It seems that for the -mcpu=e6500 the -m32 option has not the right effect
> and some 64-bit instructions are generated, for example ld and std plus the

As the commit log says, the previous behavior, where -m32 also disabled
-mpowerpc64, was wrong: -m{no,}powerpc64 should be independent of -m32/-m64.

> wrong function parameters.

I suppose the behavior you want with -m32 is not to enable powerpc64 (since
previously -m32 disabled -mpowerpc64 as well), so I think you can get the
previous behavior by specifying an explicit -mno-powerpc64 together with -m32.

[Bug target/113652] [14 regression] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450

2024-01-30 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

Kewen Lin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2024-01-30

--- Comment #13 from Kewen Lin  ---
One more finding: without an explicit cpu type but with -mvsx, gcc already
passes -mpower7 to the assembler, but it won't do that when a cpu type is
specified explicitly. I think the reason it doesn't always do so is that only
the last cpu type wins, so the extra option could unexpectedly override a
higher cpu type.

The candidate fixes seem to be:

diff --git a/libgcc/config/rs6000/t-float128 b/libgcc/config/rs6000/t-float128
index b09b5664af0..47b06d3c30d 100644
--- a/libgcc/config/rs6000/t-float128
+++ b/libgcc/config/rs6000/t-float128
@@ -74,7 +74,7 @@ fp128_includes = $(srcdir)/soft-fp/double.h \
   $(srcdir)/soft-fp/soft-fp.h

 # Build the emulator without ISA 3.0 hardware support.
-FP128_CFLAGS_SW  = -Wno-type-limits -mvsx -mfloat128 \
+FP128_CFLAGS_SW  = -Wno-type-limits -mvsx -mfloat128 -mcpu=power7 \
-mno-float128-hardware -mno-gnu-attribute \
-I$(srcdir)/soft-fp \
-I$(srcdir)/config/rs6000 \

Or

diff --git a/libgcc/config/rs6000/t-float128 b/libgcc/config/rs6000/t-float128
index b09b5664af0..bf4a5e6aaf0 100644
--- a/libgcc/config/rs6000/t-float128
+++ b/libgcc/config/rs6000/t-float128
@@ -74,7 +74,7 @@ fp128_includes = $(srcdir)/soft-fp/double.h \
   $(srcdir)/soft-fp/soft-fp.h

 # Build the emulator without ISA 3.0 hardware support.
-FP128_CFLAGS_SW  = -Wno-type-limits -mvsx -mfloat128 \
+FP128_CFLAGS_SW  = -Wno-type-limits -mvsx -mfloat128 -Wa,-many \
-mno-float128-hardware -mno-gnu-attribute \
-I$(srcdir)/soft-fp \
-I$(srcdir)/config/rs6000 \

The reason is that gcc considers -mvsx to imply -mcpu=power7 (on top of the
currently specified cpu type if there is one), while the assembler doesn't.

[Bug target/113652] [14 regression] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450

2024-01-29 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

Kewen Lin  changed:

   What|Removed |Added

Summary|Failed bootstrap on ppc |[14 regression] Failed
   |unrecognized opcode:|bootstrap on ppc
   |`lfiwzx' with -mcpu=7450|unrecognized opcode:
   ||`lfiwzx' with -mcpu=7450

--- Comment #12 from Kewen Lin  ---
(In reply to Sam James from comment #10)
> (In reality, I think it is a regression, given:
> a) it regresses non-release checking (which we sometimes use even for
> released versions, it's opt-in though);

But I assumed that non-release checking on old releases would fail as well, so
comparing non-release with non-release, the behavior doesn't change.

> b) it blocks further testing with GCC 14
> 

Sorry for that, put it back as you like. :)

> but I understand the argument that if a release were made with it, it
> wouldn't be the end of the world by itself and it only affects a specific
> configuration.)

[Bug target/113652] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450

2024-01-29 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

--- Comment #11 from Kewen Lin  ---
In gcc, lfiwzx is guarded with TARGET_LFIWZX => TARGET_POPCNTD (ISA 2.06), while
-mvsx guarantees that TARGET_POPCNTD (ISA_2_6_MASKS_SERVER) is set, so gcc
considers lfiwzx supported. IMHO the underlying philosophy is that with vsx
capability the supported ISA level is at least 2.06, and lfiwzx is available
from 2.06 on, so it's supported.

But binutils doesn't seem to follow this:
{"xvadddp", XX3(60,96), XX3_MASK,PPCVSX,PPCVLE, {XT6, XA6, XB6}},
{"lfiwzx",  X(31,887),  X_MASK,   POWER7|PPCA2, 0,  {FRT, RA0, RB}},
The two are guarded with different masks and apparently PPCVSX doesn't enable
POWER7.
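
The mismatch can be illustrated with a toy version of such an opcode table. This
is a sketch under assumptions: the bit values, struct opcode_entry and
opcode_accepted are invented for illustration, the fifth ("deprecated") column
of the real entries is ignored, and the acceptance rule (an opcode is accepted
if any of its mask bits is enabled) is my reading of how the masks are used,
not a copy of the binutils logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative bit values only; binutils has its own definitions.  */
#define CPU_PPC     (1u << 0)
#define CPU_POWER7  (1u << 1)
#define CPU_PPCA2   (1u << 2)
#define CPU_PPCVSX  (1u << 3)

struct opcode_entry
{
  const char *name;
  unsigned int flags;   /* cpu/feature bits that provide this opcode */
};

/* The two entries quoted above, reduced to name and mask.  */
static const struct opcode_entry opcodes[] = {
  { "xvadddp", CPU_PPCVSX },
  { "lfiwzx",  CPU_POWER7 | CPU_PPCA2 },
};

/* Return true if NAME is known and at least one of its mask bits is set
   in ENABLED (the bits selected on the assembler command line).  */
bool
opcode_accepted (const char *name, unsigned int enabled)
{
  for (size_t i = 0; i < sizeof opcodes / sizeof opcodes[0]; i++)
    if (strcmp (opcodes[i].name, name) == 0)
      return (opcodes[i].flags & enabled) != 0;
  return false;
}
```

With enabled = CPU_PPC | CPU_PPCVSX (roughly "-mppc -mvsx"), xvadddp is accepted
and lfiwzx is rejected. Adding CPU_POWER7 makes both acceptable, which is the
effect the request below asks for.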

Hi Alan and Peter,

I wonder if the assembler can enable POWER7 when PPCVSX gets enabled, like what
gcc does now?

[Bug target/113652] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450

2024-01-29 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

Kewen Lin  changed:

   What|Removed |Added

Summary|[14 regression] Failed  |Failed bootstrap on ppc
   |bootstrap on ppc|unrecognized opcode:
   |unrecognized opcode:|`lfiwzx' with -mcpu=7450
   |`lfiwzx' with -mcpu=7450|

--- Comment #9 from Kewen Lin  ---
(In reply to Andrew Pinski from comment #8)
> So t-float128 has this line:
> # Build the emulator without ISA 3.0 hardware support.
> FP128_CFLAGS_SW  = -Wno-type-limits -mvsx -mfloat128 \
> ...
> 
> Which gets added to some of the libgcc object files while compiling:
> $(fp128_softfp_obj)  : INTERNAL_CFLAGS += $(FP128_CFLAGS_SW)
> $(fp128_ppc_obj) : INTERNAL_CFLAGS += $(FP128_CFLAGS_SW)
> 
> 
> The problem is CFLAGS gets added also. It seems like passing -mvsx enables
> some other instructions in GCC's code generation BUT does not enable it for
> the assembler ...

ah, just noticed that it's bootstrapping gcc. Stripping the regression tag since
I don't think it's actually a regression, as per the comments above.

I found that the libgcc_cv_powerpc_float128 check can pass with -mcpu=7450
-mabi=altivec -mvsx -mfloat128; the assembler options are "-a32 -mppc -mvsx
-maltivec -mbig", which are actually the same as those used for compiling the
case in #c5. So it looks like -mvsx is supposed to tell the assembler to
recognize vsx instructions, but somehow "lfiwzx" is not counted as a vsx
instruction.

More specifically, "xvadddp" is recognized by the assembler with -mvsx while
"lfiwzx" isn't.

$ cat t1.s
.machine "7450"
lfiwzx 1,0,9

$ cat t2.s
.machine "7450"
xvadddp 34,34,35

$ as -a32 -mppc -mvsx t1.s -o t1.o
t1.s: Assembler messages:
t1.s:2: Error: unrecognized opcode: `lfiwzx'
$ as -a32 -mppc -mvsx t2.s -o t2.o
$ echo $?
0

[Bug target/113652] [14 regression] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450

2024-01-29 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

--- Comment #7 from Kewen Lin  ---
oops, I meant --enable-checking rather than --checking.

[Bug target/113652] [14 regression] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450

2024-01-29 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652

--- Comment #6 from Kewen Lin  ---
I think this is related to r10-580-ge154242724b084, and this failure is expected
and a usage error.

With it applied, we no longer pass -many to the assembler when CHECKING_P is
enabled. Actually the compilers (gcc-13, gcc-12, gcc-11 or trunk) generate the
same assembly, but gcc-11/gcc-12/gcc-13 are built with --checking=release by
default, which doesn't set CHECKING_P, while trunk is built with
--checking=yes,extra by default, which sets CHECKING_P. That causes the
different behavior, which was then unexpectedly considered a regression.

The issue should be gone once trunk gets released as gcc-14 or if it's built
with --checking=release. IMO Alan's commit aims to expose more such unexpected
use cases so that users can fix them in place. As #c3 says, "PowerPC 7450 (aka
PowerPC G4) is only capable of -maltivec but not -mvsx", so it's unexpected to
have -mvsx together with -mcpu=7450. Could you check where the -mvsx comes from
and fix it instead?  Thanks!

btw, a workaround is to add -Wa,-many to restore the previous behavior of
passing -many to the assembler.
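
The relationship described above can be summarized in a few lines. This is a
toy model only: checking_p_for and assembler_gets_many are made-up names, and
only the two checking levels mentioned in this comment are handled.

```c
#include <assert.h>
#include <string.h>

/* Map a checking level to CHECKING_P.  Only the two values mentioned in
   this comment are modelled; anything else yields -1.  */
int
checking_p_for (const char *checking)
{
  if (strcmp (checking, "release") == 0)
    return 0;
  if (strcmp (checking, "yes,extra") == 0)
    return 1;
  return -1;
}

/* Nonzero if the assembler ends up running with -many: either the
   compiler adds it itself (CHECKING_P not set), or the user passes
   -Wa,-many explicitly.  */
int
assembler_gets_many (int checking_p, int user_passes_wa_many)
{
  return !checking_p || user_passes_wa_many;
}
```

So a release-style build gets -many implicitly, a trunk-style build does not,
and the -Wa,-many workaround restores it in the latter case.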

[Bug target/113507] can't build a cross compiler to rs6000-ibm-aix7.2

2024-01-22 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113507

Kewen Lin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 CC||segher at gcc dot gnu.org
   Last reconfirmed||2024-01-23
 Ever confirmed|0   |1

--- Comment #5 from Kewen Lin  ---
(In reply to H.J. Lu from comment #3)
> (In reply to Kewen Lin from comment #2)
> > Guessing /usr/local/bin/ld is a gnu ld? Based on what I heard before, gnu ld
> > has some problems on aix, people pass object files to aix system and use aix
> > ld there. Not sure if the understanding still holds.
> 
> I am building a cross compiler.  No AIX tools are involved.

Thanks for clarifying, I was dull and misunderstood it.

Confirmed. Some symbols are from rs6000-builtin.cc (which is not generated),
which in turn requires some symbols in rs6000-builtins.cc (which is generated).
Neither object file is included in linking. The diff below can fix it:

diff --git a/gcc/config.gcc b/gcc/config.gcc
index b2d7d7dd475..6b62e4fe56c 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -557,8 +557,10 @@ rs6000*-*-*)
 extra_options="${extra_options} g.opt fused-madd.opt rs6000/rs6000-tables.opt"
 extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-logue.o"
 extra_objs="${extra_objs} rs6000-call.o rs6000-pcrel-opt.o"
+extra_objs="${extra_objs} rs6000-builtin.o rs6000-builtins.o"
 target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-logue.cc \$(srcdir)/config/rs6000/rs6000-call.cc"
 target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-pcrel-opt.cc"
+target_gtfiles="$target_gtfiles ./rs6000-builtins.h"
 ;;
 sparc*-*-*)
 cpu_type=sparc

According to David's comment "rs6000-ibm-aix doesn't exist any more", and I
vaguely remember Segher also mentioning that rs6000*-*-*) has become stale, so
maybe we can aggressively drop the whole rs6000*-*-*) case handling?

[Bug target/113507] can't build a cross compiler to rs6000-ibm-aix7.2

2024-01-21 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113507

Kewen Lin  changed:

   What|Removed |Added

 CC||bergner at gcc dot gnu.org,
   ||dje at gcc dot gnu.org,
   ||linkw at gcc dot gnu.org

--- Comment #2 from Kewen Lin  ---
Guessing /usr/local/bin/ld is a gnu ld? Based on what I heard before, gnu ld
has some problems on aix, people pass object files to aix system and use aix ld
there. Not sure if the understanding still holds.

[Bug testsuite/109705] [14 regression] gcc.dg/vect/pr25413a.c fails after r14-333-g6d4b59a9356ac4

2024-01-21 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109705

Kewen Lin  changed:

   What|Removed |Added

 CC||linkw at gcc dot gnu.org

--- Comment #7 from Kewen Lin  ---
(In reply to Peter Bergner from comment #6)
> (In reply to GCC Commits from comment #5)
> > commit r14-7270-g39fa71a0882928a25bd170580e3e9e89a69dce36
> > Author: Kewen Lin 
> > Date:   Mon Jan 15 20:55:40 2024 -0600
> > 
> > testsuite: Fix vect_long_mult on Power [PR109705]
> > 
> > As pointed out by the discussion in PR109705, the current
> > vect_long_mult effective target check on Power is broken.
> > This patch is to fix it accordingly.
> 
> Does this need backporting?

I guess not: the only use of vect_long_mult in release branches is
gcc/testsuite/gcc.dg/vect/pr60656.c, which has another check,
vect_widen_mult_si_to_di_pattern, that is unsupported on Power.

[Bug testsuite/113535] rs6000, testsuite: Re-visit the current vect_* for Power

2024-01-21 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113535

Kewen Lin  changed:

   What|Removed |Added

   Last reconfirmed||2024-01-22
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
 CC||bergner at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1

[Bug testsuite/113535] New: rs6000, testsuite: Re-visit the current vect_* for Power

2024-01-21 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113535

Bug ID: 113535
   Summary: rs6000, testsuite: Re-visit the current vect_* for
Power
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: testsuite
  Assignee: unassigned at gcc dot gnu.org
  Reporter: linkw at gcc dot gnu.org
  Target Milestone: ---

Inspired by PR109705, opening this to track revisiting the vect_* checks for
Power and fixing them where needed.

[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point

2024-01-18 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #4 from Kewen Lin  ---
(In reply to Naveen N Rao from comment #2)
> I don't really have a preference, though I tend to agree that nops before
> the local entry point aren't that useful. Even with the current approach,
> not all functions have instructions at the GEP and for those, the nops are
> being generated outside the function. We also won't have a separate GEP/LEP
> with pcrel, so we won't need a separate option eventually.

Thanks for the input! Looking forward to the comments from the others,
especially Segher, David and Peter.

(In reply to Michael Matz from comment #3)
> (In reply to Kewen Lin from comment #1)
> > 
> > As Segher's review comments in [2], to support "before NOPs" before global
> > entry and "after NOPs" after global entry,
> 
> Just to be perfectly clear here: the "after NOPs" need to come after local
> entry
> (which strictly speaking is of course after the global one as well, but I'm
> being anal :) ).

Oops, good catch, I meant to type "after local entry", thanks for the
correction making it perfectly clear. :)

[Bug testsuite/111850] [14 regression] gcc.target/powerpc/fold-vec-extract-char.p7.c fails after r14-4664-g04c9cf5c786b94

2024-01-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111850

Kewen Lin  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #6 from Kewen Lin  ---
Should be fixed on trunk.

[Bug target/99888] Add powerpc ELFv2 support for -fpatchable-function-entry*

2024-01-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99888

--- Comment #16 from Kewen Lin  ---
(In reply to Michael Matz from comment #15)
> Umm.  I just noticed this one as we now try to implement userspace live
> patching
> for ppc64le.  The point of the "before" NOPs is (and always was) that they
> are
> completely out of the way of patchable but as-of-yet unpatched functions.
> 
> For ppc that means the "before" and "after" NOPs cannot be consecutive.  The
> two
> NOP sets being consecutive was never a design criteria or requirement.
> 
> So, while the original bug is fixed by what was committed (local entry was
> skipping the patching-nops), the chosen solution is exactly the wrong one :-/

Thanks for the input! Sigh, sorry that we picked the wrong one :(. You may
have noticed that the main consideration in choosing the current one was to
keep it aligned with the consecutive NOPs described by the documentation; we
need a separate command line option as per Segher's review comment in
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600239.html. Now that we
have PR112980 filed for the requested behavior, let's discuss how to support it
there.

[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point

2024-01-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

Kewen Lin  changed:

   What|Removed |Added

 CC||matz at gcc dot gnu.org
   Last reconfirmed||2024-01-18
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
 Ever confirmed|0   |1
   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=99888
 Status|UNCONFIRMED |ASSIGNED

--- Comment #1 from Kewen Lin  ---
[1] made me realize that I forgot to post some comments here. (I thought I did
but actually didn't).

As Segher's review comments in [2], to support "before NOPs" before global
entry and "after NOPs" after global entry, we need to introduce a separate
command line option, I think it can be a target specific option, which is
enabled by default and we should mention its default behavior and impact in the
current documentation for -fpatchable-function-entry. I don't have a good name
candidate, any suggestions?

Considering that the current behavior aligning with consecutive NOPs looks
useless (this request and [1]), an alternative is to aggressively change the
current behavior to "before NOPs" before global entry and "after NOPs" after
global entry.

Any preference or other ideas?  Any comments are highly appreciated.

I think either proposal inevitably changes the current behavior of
-fpatchable-function-entry for "before NOPs", so we should also document this
change in releases/changes.html.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99888#c15
[2] https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600239.html
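
For readers less familiar with the feature, here is a small example of
requesting patchable NOPs on a single function via the attribute form. It is
only an illustration: patched_add is a made-up function, and nothing is claimed
about where the NOPs land relative to the ELFv2 global and local entry points,
since that placement is exactly what is under discussion here.

```c
#include <assert.h>

/* Ask for 3 NOPs in total, 1 of them placed before the function's entry
   label.  The placement relative to the ELFv2 global/local entry points
   is target specific and is the subject of this PR.  */
__attribute__ ((patchable_function_entry (3, 1)))
int
patched_add (int a, int b)
{
  return a + b;
}
```

Compiling this with -S and inspecting the assembly around patched_add is an
easy way to see what a given compiler version actually emits. The NOPs do not
change what the function computes.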

[Bug other/113317] New test case libgomp.c++/ind-base-2.C fails with ICE

2024-01-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113317

Kewen Lin  changed:

   What|Removed |Added

 CC||linkw at gcc dot gnu.org

--- Comment #3 from Kewen Lin  ---
I can't reproduce this either; I tried on at least one machine each with P8 LE,
P9 LE, P10 LE and P9 BE. I wonder which internal host was used for testing.

[Bug testsuite/113418] Use of vect_* target selectors in tests out of vect directories

2024-01-16 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113418

Kewen Lin  changed:

   What|Removed |Added

 CC||linkw at gcc dot gnu.org

--- Comment #4 from Kewen Lin  ---
Thanks for filing this, I just realized that it's unexpected to use vect_*
effective target checks outside */vect/ in generic test suites.

> 
> I just found them with a simple grep command so there might be false
> positives or false negatives.  There are also a dozen matches in gcc.target
> but I consider them fine as the target maintainers should know exactly what
> they are doing.

Yes, I think those in target should be fine. Although they can be replaced with
some corresponding target specific check(s), sometimes the vect_* one is more
readable.

[Bug testsuite/111850] [14 regression] gcc.target/powerpc/fold-vec-extract-char.p7.c fails after r14-4664-g04c9cf5c786b94

2024-01-15 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111850

Kewen Lin  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org

--- Comment #4 from Kewen Lin  ---
Just realized that we also escalated the test issue to P1, so I'm going to make
a patch to update the test case.

[Bug target/113341] Using GCC as the bootstrap compiler breaks LLVM on 32-bit PowerPC

2024-01-12 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113341

Kewen Lin  changed:

   What|Removed |Added

 CC||linkw at gcc dot gnu.org

--- Comment #9 from Kewen Lin  ---
Since it's a breakage during stage2, we can conclude that some of the stage1
build output behaves unexpectedly.  You can probably try to run regression
testing with just the stage1 compiler to see if any regression is exposed.

If that doesn't help, you probably have to isolate the problem into one or
several object files: since you have "objects" from the "good" and "bad" stage1
compilers, you can mix them to narrow it down further. Once you have some
isolated, you can probably get some hints on whether it's a bug in the LLVM
source or in gcc.
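
The isolation step can be done as a binary search over the object list. The
following is a sketch under a strong assumption, namely that exactly one object
is responsible. The callback type, isolate_bad_object and demo_fails are all
hypothetical names; in practice the callback would relink with the chosen mix
of objects and run the failing test.

```c
#include <assert.h>
#include <stddef.h>

/* Callback: build with objects [lo, hi) taken from the "bad" stage1
   compiler and all others from the "good" one, run the result, and
   return nonzero if the failure shows up.  Supplied by the user.  */
typedef int (*fails_with_bad_range_fn) (size_t lo, size_t hi, void *ctx);

/* Return the index of the single object whose "bad" version triggers
   the failure.  Assumes exactly one such object among N, with N >= 1.  */
size_t
isolate_bad_object (size_t n, fails_with_bad_range_fn fails, void *ctx)
{
  size_t lo = 0, hi = n;   /* invariant: the culprit is in [lo, hi) */
  while (hi - lo > 1)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (fails (lo, mid, ctx))
        hi = mid;          /* culprit is in the lower half */
      else
        lo = mid;          /* otherwise it must be in the upper half */
    }
  return lo;
}

/* Stand-in callback for demonstration: CTX points to the culprit index.  */
int
demo_fails (size_t lo, size_t hi, void *ctx)
{
  size_t culprit = *(size_t *) ctx;
  return lo <= culprit && culprit < hi;
}
```

If several objects are involved, the same idea still helps to shrink the set,
but the single-culprit invariant no longer holds and the result needs manual
checking.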

It seems you are using gcc 13.2.1 as the version field shows; you can also try
some previous versions like gcc 12 and gcc 11 to see if they work and whether
it's a regression.

[Bug target/109987] ICE in in rs6000_emit_le_vsx_store on ppc64le with -Ofast -mno-power8-vector

2024-01-10 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109987

Kewen Lin  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org

--- Comment #3 from Kewen Lin  ---
As discussed in PR113115, I'm going to give option power{8,9}-vector removal a
shot.

[Bug target/113115] [14 Regression] ICE In extract_constrain_insn_cached recog.cc with ppc64le-linux-gnu crosscompiler from r14-3592-g9ea1248604d7b6

2024-01-10 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113115

--- Comment #7 from Kewen Lin  ---
(In reply to Peter Bergner from comment #5)
> I really dislike the -mpower{8,9}-vector options, but maybe it's too late to
> remove them for this release?  I'm not sure how involved/invasive that patch
> would be.  Segher, do you have a preference on remove them now or use the
> workaround above and remove in the next release?

(In reply to Segher Boessenkool from comment #6)
> Using -mpower9-vector while not having -mcpu=power9 (or later) is wrong, and
> should
> not work.  Using -mno-power9-vector is just weird.
> 
> If we can neuter the -mpower9-vector (etc.) options now, that would be good.
> But
> there are some complications with the testsuite at least?

OK, it sounds like it's still acceptable to adjust this at this point, so I'm
working on a patch to evaluate its impact and will post it after full testing.

[Bug middle-end/113100] [14 regression] many strub tests fail after r14-6737-g4e0a467302fea5

2024-01-09 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113100

Kewen Lin  changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #4 from Kewen Lin  ---
Should be fixed on trunk.

[Bug target/111480] new test case g++.target/powerpc/altivec-19.C fails

2024-01-09 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111480

Kewen Lin  changed:

   What|Removed |Added

  Component|testsuite   |target
   Keywords|testsuite-fail  |missed-optimization
 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #2 from Kewen Lin  ---
Should be fixed.

[Bug testsuite/112751] [14 regression] gcc.target/powerpc/pcrel-sibcall-1.c fails after r14-5628-g53ba8d669550d3

2024-01-09 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112751

Kewen Lin  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #4 from Kewen Lin  ---
Should be fixed.

[Bug target/112606] [14 Regression] powerpc64le-linux-gnu: 'FAIL: gcc.target/powerpc/p8vector-fp.c scan-assembler xsnabsdp'

2024-01-09 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112606

Kewen Lin  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #7 from Kewen Lin  ---
Should be fixed on trunk now.

[Bug target/113115] [14 Regression] ICE In extract_constrain_insn_cached recog.cc with ppc64le-linux-gnu crosscompiler from r14-3592-g9ea1248604d7b6

2024-01-08 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113115

Kewen Lin  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #4 from Kewen Lin  ---
(In reply to Peter Bergner from comment #3)
> Ke Wen, is this just a duplicate of PR109987 and PR103627?  I know it was
> bisected to Jeevitha's commit, but it seems more like her commit exposed the
> same latent issue as those other PRs, rather than causing it.  Your thoughts?

Yes, I agree it's a duplicate of PR109987; Jeevitha's commit just exposed this
known issue. Since we are in stage 3, I wonder if we can go with power9-vector
guarding first
(https://gcc.gnu.org/pipermail/gcc-patches/2021-December/587310.html) since
power9-vector still exists in this release, and we can try to remove these
workaround options in the next release. (Sorry that I failed to follow up on the
power{8,9}-vector removal.)

*** This bug has been marked as a duplicate of bug 109987 ***

[Bug target/109987] ICE in in rs6000_emit_le_vsx_store on ppc64le with -Ofast -mno-power8-vector

2024-01-08 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109987

Kewen Lin  changed:

   What|Removed |Added

 CC||fkastl at suse dot cz

--- Comment #2 from Kewen Lin  ---
*** Bug 113115 has been marked as a duplicate of this bug. ***

[Bug testsuite/111480] new test case g++.target/powerpc/altivec-19.C fails

2024-01-07 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111480

Kewen Lin  changed:

   What|Removed |Added

   Last reconfirmed||2024-01-08
 Ever confirmed|0   |1
 Status|UNCONFIRMED |ASSIGNED
URL||https://gcc.gnu.org/piperma
   ||il/gcc-patches/2024-January
   ||/642093.html
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
 CC||linkw at gcc dot gnu.org

[Bug target/112606] [14 Regression] powerpc64le-linux-gnu: 'FAIL: gcc.target/powerpc/p8vector-fp.c scan-assembler xsnabsdp'

2024-01-07 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112606

Kewen Lin  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
   Last reconfirmed||2024-01-08
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1

--- Comment #5 from Kewen Lin  ---
(In reply to seurer from comment #3)
> These tests also fail starting with
> g:9e9279fadbd1c673c875b9d20261d2de0473f63f, r14-5542-g9e9279fadbd1c6
> 
> FAIL: gcc.target/powerpc/float128-hw5.c scan-assembler-not \\mxscpsgnqp\\M
> FAIL: gcc.target/powerpc/float128-hw5.c scan-assembler-times \\mxsnabsqp\\M 1
> FAIL: gcc.target/powerpc/float128-hw7.c scan-assembler-not \\mxscpsgnqp\\M
> FAIL: gcc.target/powerpc/float128-hw7.c scan-assembler-times \\mxsnabsqp\\M 1

These failures are related to ieee128; the patch in #c4 only handles
float/double. A similar patch was posted for ieee128:
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642092.html

[Bug testsuite/112751] [14 regression] gcc.target/powerpc/pcrel-sibcall-1.c fails after r14-5628-g53ba8d669550d3

2024-01-07 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112751

Kewen Lin  changed:

   What|Removed |Added

URL||https://gcc.gnu.org/piperma
   ||il/gcc-patches/2024-January
   ||/642091.html
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
 Status|NEW |ASSIGNED

[Bug middle-end/113100] [14 regression] many strub tests fail after r14-6737-g4e0a467302fea5

2024-01-07 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113100

Kewen Lin  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
URL||https://gcc.gnu.org/piperma
   ||il/gcc-patches/2024-January
   ||/642090.html
 Status|NEW |ASSIGNED

[Bug testsuite/60031] dg-require-effective-target powerpc_vsx_ok is not enough

2024-01-04 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60031

Kewen Lin  changed:

   What|Removed |Added

 CC||linkw at gcc dot gnu.org

--- Comment #7 from Kewen Lin  ---
We have vsx_hw effective target keyword which uses check_vsx_hw_available.

# Return 1 if the target supports executing VSX instructions, 0
# otherwise.  Cache the result.

Doesn't it satisfy the requirement? Or am I missing something?

[Bug testsuite/106682] Powerpc test gcc.target/powerpc/pr86731-fwrapv-longlong.c fails on power8, passes on power9/power10

2024-01-03 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106682

Kewen Lin  changed:

   What|Removed |Added

 CC||seurer at gcc dot gnu.org

--- Comment #4 from Kewen Lin  ---
*** Bug 101444 has been marked as a duplicate of this bug. ***

[Bug testsuite/101444] [12/13/14 regression] gcc.target/powerpc/pr86731-fwrapv-longlong.c fails after r12-2266

2024-01-03 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101444

Kewen Lin  changed:

   What|Removed |Added

 Resolution|--- |DUPLICATE
 CC||linkw at gcc dot gnu.org
 Status|UNCONFIRMED |RESOLVED

--- Comment #4 from Kewen Lin  ---
Dup.

*** This bug has been marked as a duplicate of bug 106682 ***

[Bug middle-end/113100] [14 regression] many strub tests fail after r14-6737-g4e0a467302fea5

2023-12-21 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113100

Kewen Lin  changed:

   What|Removed |Added

 CC||linkw at gcc dot gnu.org
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
   Last reconfirmed||2023-12-21

--- Comment #2 from Kewen Lin  ---
Confirmed, but reproducing it needs an explicit cpu type like -mcpu=power9;
otherwise it could pass on power10, which can work with pcrel (so no toc base
r2 is needed). The change can extend the end of scrubbing, so it unexpectedly
cleans the saved toc base.

I noticed that there is a macro SPARC_STACK_BOUNDARY_HACK, which aims to
indicate this SPARC64-specific behavior. Could we leverage this macro (guarding
the biasing with it)? Like:

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index 125ea158ebf..9bad1e962b4 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -5450,6 +5450,7 @@ expand_builtin_stack_address ()
   rtx ret = convert_to_mode (ptr_mode, copy_to_reg (stack_pointer_rtx),
  STACK_UNSIGNED);

+#ifdef SPARC_STACK_BOUNDARY_HACK
   /* Unbias the stack pointer, bringing it to the boundary between the
  stack area claimed by the active function calling this builtin,
  and stack ranges that could get clobbered if it called another
@@ -5476,7 +5477,9 @@ expand_builtin_stack_address ()
  (caller) function's active area as well, whereas those pushed or
  allocated temporarily for a call are regarded as part of the
  callee's stack range, rather than the caller's.  */
-  ret = plus_constant (ptr_mode, ret, STACK_POINTER_OFFSET);
+  if (SPARC_STACK_BOUNDARY_HACK)
+ret = plus_constant (ptr_mode, ret, STACK_POINTER_OFFSET);
+#endif

   return force_reg (ptr_mode, ret);
 }

[Bug rtl-optimization/85099] [meta-bug] selective scheduling issues

2023-12-20 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85099
Bug 85099 depends on bug 112995, which changed state.

Bug 112995 Summary: sel-sched2 ICE without checking verify_changes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112995

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug rtl-optimization/112995] sel-sched2 ICE without checking verify_changes

2023-12-20 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112995

Kewen Lin  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED
   Target Milestone|--- |14.0

--- Comment #5 from Kewen Lin  ---
Should be fixed on trunk. I'm guessing we don't want a backport, so closing.

[Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3

2023-12-18 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

Kewen Lin  changed:

   What|Removed |Added

 Resolution|FIXED   |---
 Status|RESOLVED|REOPENED

--- Comment #44 from Kewen Lin  ---
I just checked the test case in comment #43. I think those Set/Load are able to
initialize those arrays as expected, so I'm re-opening this.

[Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3

2023-12-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #43 from Kewen Lin  ---
Created attachment 56899
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56899=edit
Previously reduced case for comment 10

[Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3

2023-12-17 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #42 from Kewen Lin  ---
(In reply to Richard Biener from comment #41)
> What's the "other" testcase?  Do we know that doesn't suffer from the same
> uninitialized issue?

For the "other" test cases, I guess he was referring to my comment #c31; these
are the ones in comments #c9 and #c10. I previously reduced #c10 further and
didn't detect an obvious uninitialized issue (but I could be wrong).

[Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3

2023-12-15 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591

--- Comment #38 from Kewen Lin  ---
I found this has been marked as resolved, but it seems that the patch in comment
#34 hasn't been pushed. Is that intended? Or did I miss a commit that was
pushed but wasn't associated with this PR?

[Bug rtl-optimization/113029] sel-sched2 ICE in verify_target_availability

2023-12-14 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113029

Kewen Lin  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=88652

--- Comment #3 from Kewen Lin  ---
(In reply to Andrew Pinski from comment #1)
> Maybe https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84842#c17 patch helps

Unfortunately it doesn't help. I had noticed that patch and tried the following:

diff --git a/gcc/sel-sched.cc b/gcc/sel-sched.cc
index a35b5e16c91..8e3b3bb0467 100644
--- a/gcc/sel-sched.cc
+++ b/gcc/sel-sched.cc
@@ -323,6 +323,10 @@ struct reg_rename

   /* The set of ABIs used by calls that the code motion path crosses.  */
   unsigned int crossed_call_abis : NUM_ABI_IDS;
+
+  /* True if we have merged expressions and one of them had availability
+ bit set.  */
+  unsigned int merged_available_expr : 1;
 };

 /* A global structure that contains the needed information about harg
@@ -388,6 +392,10 @@ struct fur_static_params

   /* The set of ABIs used by calls that the code motion path crosses.  */
   unsigned int crossed_call_abis : NUM_ABI_IDS;
+
+  /* True if we have merged expressions and one of them had availability
+ bit set.  */
+  unsigned int merged_available_expr : 1;
 };

 typedef struct fur_static_params *fur_static_params_p;
@@ -1554,7 +1562,8 @@ verify_target_availability (expr_t expr, regset
used_regs,
|| !hard_available
|| (!reload_completed
&& reg_rename_p->crossed_call_abis
-   && REG_N_CALLS_CROSSED (regno) == 0));
+   && REG_N_CALLS_CROSSED (regno) == 0)
+   || reg_rename_p->merged_available_expr);
 }

 /* Collect unavailable registers due to liveness for EXPR from BNDS
@@ -1654,6 +1663,8 @@ find_best_reg_for_expr (expr_t expr, blist_t bnds, bool
*is_orig_reg_p)
   used_regs = get_clear_regset_from_pool ();
   CLEAR_HARD_REG_SET (reg_rename_data.unavailable_hard_regs);

+  reg_rename_data.crossed_call_abis = false;
+  reg_rename_data.merged_available_expr = false;
   collect_unavailable_regs_from_bnds (expr, bnds, used_regs, _rename_data,
  _insns);

@@ -1861,7 +1872,7 @@ identical_copy_p (rtx_insn *insn)
 /* Undo all transformations on *AV_PTR that were done when
moving through INSN.  */
 static void
-undo_transformations (av_set_t *av_ptr, rtx_insn *insn)
+undo_transformations (av_set_t *av_ptr, rtx_insn *insn, void *static_params)
 {
   av_set_iterator av_iter;
   expr_t expr;
@@ -1940,6 +1951,8 @@ undo_transformations (av_set_t *av_ptr, rtx_insn *insn)
  copy, which was in turn substituted.  The history is
wrong
  in this case.  Do it the hard way.  */
   add = substitute_reg_in_expr (tmp_expr, insn, true);
+if (code_motion_path_driver_info == _hooks)
+  ((fur_static_params_p) static_params)->merged_available_expr
= true;
 if (add)
   av_set_add (_set, tmp_expr);
 clear_expr (tmp_expr);
@@ -3273,6 +3286,7 @@ find_used_regs (insn_t insn, av_set_t orig_ops, regset
used_regs,
   sparams.crossed_call_abis = 0;
   sparams.original_insns = original_insns;
   sparams.used_regs = used_regs;
+  sparams.merged_available_expr = false;

   /* Set the appropriate hooks and data.  */
   code_motion_path_driver_info = _hooks;
@@ -3280,6 +3294,7 @@ find_used_regs (insn_t insn, av_set_t orig_ops, regset
used_regs,
   res = code_motion_path_driver (insn, orig_ops, NULL, , );

   reg_rename_p->crossed_call_abis |= sparams.crossed_call_abis;
+  reg_rename_p->merged_available_expr |= sparams.merged_available_expr;

   gcc_assert (res == 1);
   gcc_assert (original_insns && *original_insns);
@@ -6570,7 +6585,7 @@ code_motion_path_driver (insn_t insn, av_set_t orig_ops,
ilist_t path,
{
  /* Av set ops could have been changed when moving through this
 insn.  To find them below it, we have to un-substitute them. 
*/
- undo_transformations (_ops, insn);
+ undo_transformations (_ops, insn, static_params);
}
  else
{

[Bug rtl-optimization/113029] sel-sched2 ICE in verify_target_availability

2023-12-14 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113029

--- Comment #2 from Kewen Lin  ---
I noticed there are some existing PRs (PR107984, PR99328, PR88652, PR84842) on
the verify_target_availability ICE, and PR84842 has a tentative patch. I tried
to make it fit with the latest trunk, but this still fails, so I filed this
PR.

[Bug rtl-optimization/113029] New: sel-sched2 ICE in verify_target_availability

2023-12-14 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113029

Bug ID: 113029
   Summary: sel-sched2 ICE in verify_target_availability
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: linkw at gcc dot gnu.org
  Target Milestone: ---

Test case:

#include 
#define c(d, g) g, d
#define e(d, g) g, d
vector double f, n;
int m;
int k;
void j (vector double, double, double);
vector double combine (double, double);
vector double i (double, double);
vector double l (vector double, double);
vector double
o (vector double, double)
{
  vector double a;
  vector double b;
  p ("");
  j (f, c (1, 2));
  j (n, c (3, 4));
  b = i (3, 4);
  j (a, e (1, 2));
  j (b, e (3, 4));
  j (l (a, 5.0), e (5, 2));
  j (o (b, 6.0), e (3, 6));
  k = vec_extract (b, 1);
  j (combine (0, k), c (2, 4));
  m = vec_extract (b, 0);
  j (i (0, m), e (1, 3));
  i (0, vec_extract (b, 1));
}

Option: -std=c89 -O2 -mcpu=power10 -fselective-scheduling2

during RTL pass: sched2
test.c: In function ‘o’:
test.c:29:1: internal compiler error: in verify_target_availability, at
sel-sched.cc:1553
   29 | }
  | ^
0x10c54c43 verify_target_availability
/home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:1553
0x10c54c43 find_best_reg_for_expr
/home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:1667
0x10c54c43 fill_vec_av_set
/home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:3784
0x10cb fill_ready_list
/home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:4014
0x10cb find_best_expr
/home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:4374
0x10cb fill_insns
/home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:5535
0x10cb schedule_on_fences
/home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:7353
0x10cb sel_sched_region_2
/home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:7491
0x10c57b8b sel_sched_region_1
/home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:7533
0x10c59723 sel_sched_region(int)
/home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:7634
0x10c59723 sel_sched_region(int)
/home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:7619
0x10c59beb run_selective_scheduling()
/home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:7720
0x10c2e6ef rest_of_handle_sched2
/home/gccbuild/gcc_trunk_git/gcc/gcc/sched-rgn.cc:3748
0x10c2e6ef execute
/home/gccbuild/gcc_trunk_git/gcc/gcc/sched-rgn.cc:3895

[Bug rtl-optimization/112995] sel-sched2 ICE without checking verify_changes

2023-12-12 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112995

--- Comment #3 from Kewen Lin  ---
(In reply to Andrew Pinski from comment #2)
> fselective-scheduling has so many issues.

Ah, thanks a lot for pointing this out.

I was testing the impact of my proposed scheduling change and found this
feature didn't work well on Power (with it turned on by default, GCC failed to
build even without bootstrap). Since these selective-scheduling related options
can be specified on Power, I thought we might need to ensure some quality
there. Now I know Power is not alone ;-). Scanning the PRs under the meta-bug,
I noticed more than three had the same or similar ICE traces as what I found in
those exposed failures. Given
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110526#c4, I wonder if they have
become low priority?

[Bug rtl-optimization/112995] sel-sched2 ICE without checking verify_changes

2023-12-12 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112995

--- Comment #1 from Kewen Lin  ---
Initially we have:

(insn 31 6 10 2 (set (reg/v:SI 9 9 [orig:119 c ] [119])
(reg/v:SI 64 0 [orig:119 c ] [119])) "test.i":5:5 555
{*movsi_internal1}
 (expr_list:REG_DEAD (reg/v:SI 64 0 [orig:119 c ] [119])
(nil)))
(insn 10 31 25 2 (set (reg:DI 10 10 [128])
(ashift:DI (sign_extend:DI (reg/v:SI 9 9 [orig:119 c ] [119]))
(const_int 2 [0x2]))) "test.i":7:8 278 {ashdi3_extswsli}
 (nil))
(insn 25 10 27 2 (set (reg:DI 64 0 [135])
(sign_extend:DI (reg/v:SI 9 9 [orig:119 c ] [119]))) "test.i":6:5 31
{extendsidi2}
 (expr_list:REG_DEAD (reg/v:SI 9 9 [orig:119 c ] [119])
(nil)))

with moving up, we have:

(insn 46 0 0 (set (reg:DI 64 0 [135])
(sign_extend:DI (reg/v:SI 64 0 [orig:119 c ] [119]))) 31 {extendsidi2}
 (expr_list:REG_DEAD (reg/v:SI 9 9 [orig:119 c ] [119])
(nil)))

in try_replace_dest_reg, we updated the above EXPR_INSN_RTX to:

(insn 48 0 0 (set (reg:DI 32 0)
(sign_extend:DI (reg/v:SI 64 0 [orig:119 c ] [119]))) 31 {extendsidi2}
 (nil))

This doesn't match any constraint, so it's an unexpected modification.

Unfortunately, function try_replace_dest_reg only checks the original insn,
with:

  if (REGNO (best_reg) != REGNO (INSN_LHS (orig_insn))
  && (! replace_src_with_reg_ok_p (orig_insn, best_reg)
  || ! replace_dest_with_reg_ok_p (orig_insn, best_reg)))

But it doesn't check EXPR_INSN_RTX. I think it works under the assumption that
if the original insn can take the replacement, then the same change on
EXPR_INSN_RTX is fine, but this isn't true, as the given test case shows.
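
To make the gap concrete, here is a minimal sketch in C. It is not GCC code:
the register classes, the constraint table and every toy_* name are
assumptions invented for illustration (the real extendsidi2 alternatives
differ). It only shows how validating the original insn alone can accept a
destination that the substituted EXPR_INSN_RTX cannot take.

```c
#include <stdbool.h>

/* Toy register classes, loosely inspired by the dumps above.
   These are NOT rs6000's real register classes.  */
enum toy_class { TOY_GPR, TOY_FPR, TOY_VR };

/* A toy "set dest = extend (src)" insn, reduced to operand classes.  */
struct toy_insn
{
  enum toy_class dest;
  enum toy_class src;
};

/* Hypothetical constraint table: a GPR source may feed any destination,
   any other source requires a destination of the same class.  */
bool
toy_constraints_ok (enum toy_class dest, enum toy_class src)
{
  if (src == TOY_GPR)
    return true;
  return dest == src;
}

/* Validate the new destination against the original insn only,
   ignoring the expression copy (the behaviour described above).  */
bool
toy_check_orig_only (struct toy_insn orig, struct toy_insn expr,
                     enum toy_class best)
{
  (void) expr;
  return toy_constraints_ok (best, orig.src);
}

/* Validate the new destination against both the original insn and the
   expression copy whose source was substituted while moving up.  */
bool
toy_check_orig_and_expr (struct toy_insn orig, struct toy_insn expr,
                         enum toy_class best)
{
  return toy_constraints_ok (best, orig.src)
         && toy_constraints_ok (best, expr.src);
}
```

With an original insn whose source is a GPR and an expression copy whose
source was substituted with a vector register, toy_check_orig_only accepts an
FPR destination while toy_check_orig_and_expr rejects it.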

[Bug rtl-optimization/112995] sel-sched2 ICE without checking verify_changes

2023-12-12 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112995

Kewen Lin  changed:

   What|Removed |Added

  Known to fail||11.4.0
   Last reconfirmed||2023-12-13
 Status|UNCONFIRMED |ASSIGNED
   Keywords||ice-on-valid-code
 CC||amonakov at gcc dot gnu.org,
   ||bergner at gcc dot gnu.org,
   ||segher at gcc dot gnu.org
 Target||powerpc64le-linux-gnu
 Ever confirmed|0   |1
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org

[Bug rtl-optimization/112995] New: sel-sched2 ICE without checking verify_changes

2023-12-12 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112995

Bug ID: 112995
   Summary: sel-sched2 ICE without checking verify_changes
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: linkw at gcc dot gnu.org
  Target Milestone: ---

With selective scheduling 2 enabled by default, I failed to build GCC
(non-bootstrap) on Power10. One reduced test case is listed below:

int a[];
int b(__ieee128 e) {
  int c;
  __ieee128 d;
  c = e;
  d = c;
  d = a[c] + d;
  return d;
}

option: -O2 -S -fselective-scheduling2 -mcpu=power10 (or -mcpu=power9)

ICE reason:

test.c:9:1: error: insn does not satisfy its constraints:
9 | }
  | ^

(insn 48 0 0 (set (reg:DI 32 0)
(sign_extend:DI (reg/v:SI 64 0 [orig:119 c ] [119]))) 31 {extendsidi2}
 (nil))

[Bug target/112993] rs6000: Rework precision for 128bit float types and modes

2023-12-12 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112993
Bug 112993 depends on bug 112788, which changed state.

Bug 112788 Summary: [14 regression] ICEs in fold_range, at range-op.cc:206 
after r14-5972-gea19de921b01a6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112788

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

[Bug tree-optimization/112788] [14 regression] ICEs in fold_range, at range-op.cc:206 after r14-5972-gea19de921b01a6

2023-12-12 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112788

Kewen Lin  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|ASSIGNED|RESOLVED

--- Comment #7 from Kewen Lin  ---
Should be fixed on latest trunk. We should get rid of this workaround in the
next release; that will be tracked in PR112993.

[Bug target/112993] rs6000: Rework precision for 128bit float types and modes

2023-12-12 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112993

Kewen Lin  changed:

   What|Removed |Added

   Last reconfirmed||2023-12-13
 Ever confirmed|0   |1
   Keywords|build, ice-checking,|internal-improvement
   |ice-on-valid-code   |
   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
 Status|UNCONFIRMED |ASSIGNED

[Bug target/112993] New: rs6000: Rework precision for 128bit float types and modes

2023-12-12 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112993

Bug ID: 112993
   Summary: rs6000: Rework precision for 128bit float types and
modes
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: build, ice-checking, ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: linkw at gcc dot gnu.org
CC: amacleod at redhat dot com, andy at gwentswordclub dot 
co.uk,
bergner at gcc dot gnu.org, linkw at gcc dot gnu.org,
meissner at gcc dot gnu.org, segher at gcc dot gnu.org,
seurer at gcc dot gnu.org, tschwinge at gcc dot gnu.org
Depends on: 112788
  Target Milestone: ---
  Host: powerpc64le-linux-gnu
Target: powerpc64le-linux-gnu
 Build: powerpc64le-linux-gnu

+++ This bug was initially created as a clone of Bug #112788 +++

As PR112788 and the review comments from Andrew and Jakub at
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640342.html show, we
should get rid of the workaround for PR112788 from GCC 15+.

This PR is filed to track this. We would expect the precision of those types
and modes to all be 128 bits, and TFmode to become a macro conditionally
defined as IFmode or KFmode.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112788
[Bug 112788] [14 regression] ICEs in fold_range, at range-op.cc:206 after
r14-5972-gea19de921b01a6

[Bug tree-optimization/112788] [14 regression] ICEs in fold_range, at range-op.cc:206 after r14-5972-gea19de921b01a6

2023-12-07 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112788

--- Comment #5 from Kewen Lin  ---
One workaround patch was posted at
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639140.html.

We also found that, with the default long double format being ieee128, the
culprit commit makes the libquadmath library fail to build on a system with
ieee128 libs; consequently there are a lot of Fortran test failures.

The workaround also fixed some failures which existed there previously:

UNRESOLVED->NA: 20_util/from_chars/8.cc  -std=gnu++23 compilation failed to
produce executable
NA->PASS: 20_util/from_chars/8.cc  -std=gnu++23 execution test
FAIL->PASS: 20_util/from_chars/8.cc  -std=gnu++23 (test for excess errors)
UNRESOLVED->NA: 20_util/from_chars/8.cc  -std=gnu++26 compilation failed to
produce executable
NA->PASS: 20_util/from_chars/8.cc  -std=gnu++26 execution test
FAIL->PASS: 20_util/from_chars/8.cc  -std=gnu++26 (test for excess errors)
UNRESOLVED->NA: 20_util/to_chars/float128_c++23.cc  -std=gnu++23 compilation
failed to produce executable
NA->PASS: 20_util/to_chars/float128_c++23.cc  -std=gnu++23 execution test
FAIL->PASS: 20_util/to_chars/float128_c++23.cc  -std=gnu++23 (test for excess
errors)
UNRESOLVED->NA: 20_util/to_chars/float128_c++23.cc  -std=gnu++26 compilation
failed to produce executable
NA->PASS: 20_util/to_chars/float128_c++23.cc  -std=gnu++26 execution test
FAIL->PASS: 20_util/to_chars/float128_c++23.cc  -std=gnu++26 (test for excess
errors)

[Bug tree-optimization/112788] [14 regression] ICEs in fold_range, at range-op.cc:206 after r14-5972-gea19de921b01a6

2023-12-03 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112788

Kewen Lin  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |linkw at gcc dot gnu.org
 Status|NEW |ASSIGNED
   Last reconfirmed|2023-12-03 00:00:00 |2023-12-01 0:00

--- Comment #4 from Kewen Lin  ---
(In reply to Andrew Macleod from comment #2)
> (In reply to Kewen Lin from comment #1)
> 
> > 
> > ranger makes use of type precision directly instead of something like
> > types_compatible_p. I wonder if we can introduce a target hook (or hookpod)
> > to make ranger unrestrict this check a bit, the justification is that for
> > float type its precision information is encoded in its underlying
> > real_format, if two float types underlying modes are the same, the precision
> > are actually the same. 
> > 
> > btw, the operand_check_ps seems able to call range_compatible_p?
> 
> It could, but just a precision check seemed enough at the time.
> The patch also went thru many iterations and it was only the final version
> that operand_check_p() ended up with types as the parameter rather than
> ranges.
> 
> You bring up a good point tho. I just switched those routines to call
> range_compatible_p() and checked it in.  Now it is all centralized in the
> one routine going forward. 

Nice! Thanks a lot for your prompt enhancement!

>  
> It does seem wrong that the float precision don't match, and weird that its
> hard to fix :-)   

Yes, I dislike it too. I thought it wasn't sensible and tried to fix it, but as
the discussion in the thread mentioned above showed, there were some historical
reasons and practical issues that make it hard to fix. At the time, Segher
mentioned he had some patches to avoid different modes having the same format,
but he had run into some issues and would retry. Now that stage 1 has passed
again, I guess we have to stay with it in this release.

> It should now be possible to have range_compatible_p check
> the underlying mode for floats rather than the precision...  If there's a
> good argument for it, and you want to give that a shot...

I have to admit this idea is just a workaround. Even though the actual float
precision is encoded in the format associated with the underlying mode, it's
still unexpected to have two types with the same underlying mode but different
type precision. I'm going to make and test a workaround patch, since this has
affected the build again as reported. :(

[Bug tree-optimization/112788] [14 regression] ICEs in fold_range, at range-op.cc:206 after r14-5972-gea19de921b01a6

2023-12-01 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112788

Kewen Lin  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1
 CC||linkw at gcc dot gnu.org,
   ||meissner at gcc dot gnu.org,
   ||segher at gcc dot gnu.org
   Last reconfirmed||2023-12-01

--- Comment #1 from Kewen Lin  ---
Confirmed.

A reduced test case:

long double a, b, c;
long double d() { return -__builtin_fmaf128_round_to_odd(c, b, a); }

c.0_1 = c;
b.1_2 = b;
a.2_3 = a;
_4 = __builtin_fmaf128_round_to_odd (c.0_1, b.1_2, a.2_3);
_6 = -_4;
return _6;

 206├───> gcc_assert (m_operator->operand_check_p (type, lh.type (), rh.type
()));

stmt: _6 = -_4;

(gdb) pge lh.type()
_Float128
(gdb) pge rh.type()
long double

The root cause is the same as in PR107299: TYPE_PRECISION of rh.type is
127 while that of lh.type is 128. Some attempts were made to fix this
precision difference before, but they failed, like:
https://inbox.sourceware.org/gcc-patches/718677e7-614d-7977-312d-05a75e1fd...@linux.ibm.com/.

ranger makes use of type precision directly instead of something like
types_compatible_p. I wonder if we can introduce a target hook (or hookpod) to
let ranger relax this check a bit. The justification is that for a float
type the precision information is encoded in its underlying real_format, so if
two float types have the same underlying mode, their precisions are actually
the same.

btw, it seems operand_check_p could call range_compatible_p?
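
The precision-versus-mode argument above can be illustrated with a small
sketch in C. It is not ranger code: the struct, the mode enumerator and the
toy_* functions are hypothetical, and it assumes (as the argument does) that
both types share one underlying mode. Only the precision values 127 and 128
come from the debug session above.

```c
#include <stdbool.h>

/* Hypothetical mode identifier; stands for "some 128-bit IEEE mode".  */
enum toy_mode { TOY_MODE_IEEE128, TOY_MODE_OTHER };

/* A toy float type: just the two properties being compared.  */
struct toy_float_type
{
  const char *name;
  unsigned precision;   /* what TYPE_PRECISION would report */
  enum toy_mode mode;   /* underlying mode, which fixes the real format */
};

/* Strict check: the types must report the same precision.  */
bool
toy_compatible_by_precision (const struct toy_float_type *a,
                             const struct toy_float_type *b)
{
  return a->precision == b->precision;
}

/* Relaxed check: the types only need to share the underlying mode,
   since the format (and so the real precision) hangs off the mode.  */
bool
toy_compatible_by_mode (const struct toy_float_type *a,
                        const struct toy_float_type *b)
{
  return a->mode == b->mode;
}
```

For a pair like _Float128 (precision 128) and long double (precision 127) on
the same assumed mode, the strict check rejects the operands and the relaxed
one accepts them.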

[Bug target/112778] ICE in ppc64-linux-gnu crosscompiler in store_by_pieces since r14-5946-g1ff6d9f7428b06

2023-11-30 Thread linkw at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112778

Kewen Lin  changed:

   What|Removed |Added

 CC||aoliva at gcc dot gnu.org,
   ||bergner at gcc dot gnu.org,
   ||linkw at gcc dot gnu.org,
   ||segher at gcc dot gnu.org
 Ever confirmed|0   |1
 Status|UNCONFIRMED |NEW
Summary|ICE in ppc64-linux-gnu  |ICE in ppc64-linux-gnu
   |crosscompiler in|crosscompiler in
   |store_by_pieces, at |store_by_pieces since
   |expr.cc:1820|r14-5946-g1ff6d9f7428b06
   Keywords|needs-bisection |
   Last reconfirmed||2023-12-01

--- Comment #1 from Kewen Lin  ---
Confirmed, thanks for reporting. It starts from r14-5946-g1ff6d9f7428b06.

It looks like function try_store_by_multiple_pieces makes a wrong assumption.
For the code "memset (buf, 'v', 3)", it checks

+  if (max_bits < orig_max_bits
+  && xlenest + blksize >= xlenest
+  && can_store_by_pieces (xlenest + blksize,
+  builtin_memset_read_str,
+  , align, true))

This check succeeds and it breaks. Later it goes with blksize:

  to = store_by_pieces (to, blksize,
constfun, constfundata,
align, true,
max_len != 0 ? RETURN_END : RETURN_BEGIN);

and fails at the targetm.use_by_pieces_infrastructure_p assertion.

The conclusion is that can_store_by_pieces (xlenest + blksize, ...) succeeding
doesn't necessarily mean can_store_by_pieces (blksize, ...) succeeds.
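
The non-implication can be shown with a toy predicate in C. This is only a
sketch: toy_can_store and its "multiple of 4" rule are invented stand-ins, not
the real can_store_by_pieces logic, and the lengths used with it are chosen
purely for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for can_store_by_pieces: accept only non-zero
   lengths that are a multiple of 4.  This is NOT GCC's actual rule.  */
bool
toy_can_store (size_t len)
{
  return len != 0 && len % 4 == 0;
}

/* Mirrors the shape of the check quoted above: only the combined
   length is validated (including the overflow guard).  */
bool
toy_check_sum_only (size_t xlenest, size_t blksize)
{
  return xlenest + blksize >= xlenest
         && toy_can_store (xlenest + blksize);
}

/* Stricter variant: also validate the length that is actually stored
   afterwards, so a later store of BLKSIZE bytes cannot be rejected.  */
bool
toy_check_sum_and_block (size_t xlenest, size_t blksize)
{
  return toy_check_sum_only (xlenest, blksize)
         && toy_can_store (blksize);
}
```

For example, with xlenest = 1 and blksize = 3 the combined length 4 passes
toy_check_sum_only, yet a store of blksize bytes alone is not acceptable to the
toy predicate, which is what toy_check_sum_and_block catches.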
