[Bug target/114846] powerpc: epilogue in _Unwind_RaiseException corrupts return value due to __builtin_eh_return
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114846

--- Comment #5 from Kewen Lin ---
Created attachment 58067
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58067&action=edit
untested patch
[Bug testsuite/113535] rs6000, testsuite: Re-visit the current vect_* for Power
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113535 --- Comment #1 from Kewen Lin --- One issue: https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650171.html
[Bug target/114846] powerpc: epilogue in _Unwind_RaiseException corrupts return value due to __builtin_eh_return
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114846

Kewen Lin changed:

What|Removed|Added
Assignee|unassigned at gcc dot gnu.org|linkw at gcc dot gnu.org
Status|NEW|ASSIGNED

--- Comment #4 from Kewen Lin ---
(In reply to Andrew Pinski from comment #3)
> (In reply to Kewen Lin from comment #2)
> > As https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114843#c8, we may need some
> > similar handling like r14-6440-g4b421728289e6f.
>
> Note rs6000_emit_epilogue mostly handles eh_returns so it might not be as
> hard as other targets.

Yes, making a patch.
[Bug target/44793] [11/12/13/14/15 Regression] libgcc does not include t-ppccomm on rtems
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=44793

Kewen Lin changed:

What|Removed|Added
Status|UNCONFIRMED|RESOLVED
Resolution|---|WORKSFORME
CC||linkw at gcc dot gnu.org

--- Comment #26 from Kewen Lin ---
libgcc/config.host on gcc-11 has:

powerpc-*-rtems*)
    tmake_file="${tmake_file} rs6000/t-ppccomm rs6000/t-savresfgpr rs6000/t-crtstuff t-crtstuff-pic t-fdpbit"
    extra_parts="$extra_parts crtbeginS.o crtendS.o crtbeginT.o ecrti.o ecrtn.o ncrti.o ncrtn.o"
    ;;

I think this had been fixed already by r0-119741-g6f28886030623a.

Please feel free to reopen it if it still occurs on active releases. Thanks!
[Bug target/114846] powerpc: epilogue in _Unwind_RaiseException corrupts return value due to __builtin_eh_return
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114846 --- Comment #2 from Kewen Lin --- As https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114843#c8, we may need some similar handling like r14-6440-g4b421728289e6f.
[Bug target/114846] powerpc: epilogue in _Unwind_RaiseException corrupts return value due to __builtin_eh_return
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114846

Kewen Lin changed:

What|Removed|Added
Ever confirmed|0|1
Last reconfirmed||2024-04-25
Status|UNCONFIRMED|NEW
CC||bergner at gcc dot gnu.org, linkw at gcc dot gnu.org, segher at gcc dot gnu.org
Target|powerpc64-linux-gnu powerpc-linux-gnu|powerpc64*-linux-gnu powerpc-linux-gnu

--- Comment #1 from Kewen Lin ---
Thanks for reporting. Confirmed; it also fails on LE (ppc64le-linux).
[Bug testsuite/114842] rs6000: Adjust some test cases with powerpc_vsx_ok
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114842

--- Comment #1 from Kewen Lin ---
We can extend powerpc_vsx to consider current_compiler_flags. That way, if a test case has an explicit -mvsx, it can still be tested even when users specify -mno-vsx, as long as the powerpc_vsx check concludes that VSX is enabled. This keeps some of the previous testing coverage.
[Bug testsuite/114842] rs6000: Adjust some test cases with powerpc_vsx_ok
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114842

Kewen Lin changed:

What|Removed|Added
Target||powerpc*-linux-gnu
Assignee|unassigned at gcc dot gnu.org|linkw at gcc dot gnu.org
Last reconfirmed||2024-04-25
Target Milestone|---|15.0
Ever confirmed|0|1
Status|UNCONFIRMED|ASSIGNED
[Bug testsuite/114842] New: rs6000: Adjust some test cases with powerpc_vsx_ok
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114842

Bug ID: 114842
Summary: rs6000: Adjust some test cases with powerpc_vsx_ok
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: testsuite
Assignee: unassigned at gcc dot gnu.org
Reporter: linkw at gcc dot gnu.org
Target Milestone: ---

The current effective target powerpc_vsx_ok mainly checks whether it's fine to specify -mvsx (without any warnings etc.) and whether that finally results in an object file (meaning the underlying environment, such as the assembler, supports VSX insns). But most of the test cases guarded with this check actually want to check whether the VSX feature is enabled, for example because the wanted behavior only happens with VSX enabled.

When users specify -mno-vsx in RUNTESTFLAGS, it can disable the VSX feature (with some old runtest, -mno-vsx comes after -mvsx), but the powerpc_vsx_ok check will still pass since it's fine to specify -mvsx. So if the test case doesn't have an explicit -mvsx, the given -mno-vsx disables VSX and makes that test case fail; meanwhile, even if the test case specifies -mvsx explicitly, it would fail with old runtest because -mno-vsx comes last.

We already have another effective target, powerpc_vsx, which effectively checks whether VSX is enabled, so we should update most of the test cases to adopt it instead.
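The difference between the two checks can be illustrated with a small model. This is only a sketch under assumptions: the function names `vsx_enabled` and `vsx_ok` are hypothetical and are not DejaGnu or GCC code, and the model assumes that the last of -mvsx / -mno-vsx on the command line wins.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical model, not testsuite code: among -mvsx and -mno-vsx,
   the option that appears last in the flag list wins.  */
bool vsx_enabled(const char *const *flags, int nflags, bool on_by_default)
{
    bool enabled = on_by_default;
    for (int i = 0; i < nflags; i++) {
        if (strcmp(flags[i], "-mvsx") == 0)
            enabled = true;
        else if (strcmp(flags[i], "-mno-vsx") == 0)
            enabled = false;
    }
    return enabled;
}

/* Model of a powerpc_vsx_ok-style check: it only asks whether the
   toolchain accepts -mvsx, so the user's flags play no role.  */
bool vsx_ok(bool assembler_supports_vsx)
{
    return assembler_supports_vsx;
}
```

With the flag order `-mvsx -mno-vsx` (the old runtest case), `vsx_ok` still reports success while `vsx_enabled` reports that VSX is off, which is the mismatch the report describes.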
[Bug target/88309] [11/12/13/14 Regression] ICE: Floating point exception (in is_miss_rate_acceptable), target assigning alignent of 4 bits(!) to vector
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88309

Kewen Lin changed:

What|Removed|Added
Status|ASSIGNED|RESOLVED
Resolution|---|FIXED

--- Comment #11 from Kewen Lin ---
Should be fixed on trunk and active release branches.
[Bug target/105359] _Float128 expanders and builtins disabled on ppc targets with 64-bit long double
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105359

Kewen Lin changed:

What|Removed|Added
Status|UNCONFIRMED|ASSIGNED
Last reconfirmed||2024-04-23
Keywords||missed-optimization
Assignee|unassigned at gcc dot gnu.org|linkw at gcc dot gnu.org
CC||linkw at gcc dot gnu.org
Ever confirmed|0|1

--- Comment #2 from Kewen Lin ---
Thanks for reporting, I'll have a look.
[Bug testsuite/114744] test case gcc.target/powerpc/builtins-6-p9-runnable.c fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114744

Kewen Lin changed:

What|Removed|Added
Target Milestone|---|14.0
Status|ASSIGNED|RESOLVED
Resolution|---|FIXED

--- Comment #4 from Kewen Lin ---
Should be fixed on trunk. Since it's a test issue, no backporting is needed.
[Bug testsuite/114744] test case gcc.target/powerpc/builtins-6-p9-runnable.c fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114744

Kewen Lin changed:

What|Removed|Added
CC||linkw at gcc dot gnu.org
Ever confirmed|0|1
Last reconfirmed||2024-04-17
Assignee|unassigned at gcc dot gnu.org|linkw at gcc dot gnu.org
Status|UNCONFIRMED|ASSIGNED

--- Comment #2 from Kewen Lin ---
This is very likely a test issue: the vector load should take endianness into account. I'll have a look.
[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #17 from Kewen Lin ---
(In reply to Michael Matz from comment #16)
> (In reply to Kewen Lin from comment #15)
> > I agree, thanks for the comments! btw, I'm not fighting for the current
> > implementation, just want to know more details why users are unable to make
> > use of the current implementation, is it just due to its inefficiency (like
> > the above sequence) or un-usability (unused at all). As your comments, I
> > think it's due to the former (inefficiency)?!
>
> Okay. So, yeah, I _think_ that other way (with NOPs between GEP and LEP,
> plus a jump around them) could be made to work with userspace live patching.
> It would just be inefficient. But do note that that jump around was _not_
> part of the original way of -fpatchable-function-entry, so a change to codegen
> would have to have happened anyway to make that other way usable. And it has the
> (perhaps theoretical, who knows :) ) problem of not using the normal 8-byte
> difference between GEP and LEP.
>

Thanks again for confirming this understanding!

> I think your current proposal from comment #10 is the better from all
> perspectives.

Yeah, I agree. When reworking this support previously, an implementation like the one in comment #10 was considered the better one, but it was not adopted due to the concern that it could break the assumption that NOPs should be consecutive. Based on all the inputs here, I think it's time to "fix" it by just underscoring these special non-consecutive NOPs in the documentation.
[Bug target/114567] rs6000: explicit _Float128 doesn't generate optimal code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114567

--- Comment #1 from Kewen Lin ---
This is power8 LE specific. For KFmode, the mov expander calls rs6000_emit_le_vsx_move, so it's with a V1TI subreg; then the rs6000 specific swaps pass generates one MEM with AND -16, which makes combine unable to optimize it with that *signbit2_dm_mem pattern due to mode_dependent_address_p returning false always for AND. Although it looks to me that we can extend mode_dependent_address_p to consider the to-mode in that context, it's still sub-optimal due to the existence of AND -16, which then results in an explicit "and".
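For readers unfamiliar with the "AND -16" address form, the following sketch shows what the masking does to an address value. The helper name `mask_to_16` is hypothetical; the code only illustrates the arithmetic and is not taken from GCC.

```c
#include <stdint.h>

/* ANDing an address with -16 clears its low four bits, i.e. rounds it
   down to a 16-byte boundary.  In the generated code this appears as an
   explicit masking instruction before the vector load.  */
uintptr_t mask_to_16(uintptr_t addr)
{
    return addr & (uintptr_t)-16;
}
```

An address that is already 16-byte aligned is unchanged, which is why the extra instruction is pure overhead when only the high doubleword of the value is needed.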
[Bug testsuite/114662] [14 regression] new test case c_lto_pr113359-2 from r14-9841-g1e3312a25a7b34 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114662

Kewen Lin changed:

What|Removed|Added
Resolution|---|FIXED
Status|ASSIGNED|RESOLVED

--- Comment #4 from Kewen Lin ---
Should be fixed on latest trunk.
[Bug testsuite/114662] [14 regression] new test case c_lto_pr113359-2 from r14-9841-g1e3312a25a7b34 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114662

Kewen Lin changed:

What|Removed|Added
Component|lto|testsuite
Target Milestone|---|14.0
Keywords||testsuite-fail
[Bug lto/114662] [14 regression] new test case c_lto_pr113359-2 from r14-9841-g1e3312a25a7b34 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114662

Kewen Lin changed:

What|Removed|Added
Ever confirmed|0|1
Assignee|unassigned at gcc dot gnu.org|linkw at gcc dot gnu.org
CC||linkw at gcc dot gnu.org
Last reconfirmed||2024-04-10
Status|UNCONFIRMED|ASSIGNED

--- Comment #2 from Kewen Lin ---
I think this is a test issue: with -m32, unsigned long is 4 bytes while CL1 and CL2 are 8-byte constants, so the compiler considers that some checks would always fail and the abort would always happen. Since the optimization aggressively optimizes away the call to getb, there is no chance to further check "semantic equality".

The IR for main at *.015t.cfg looks like:

int main (int argc, char * * argv)
{
  struct SB b;
  struct SA a;
  int D.3983;

  :
  init ();
  geta (, );
  _1 = a.ax;
  if (_1 != 3735928559)
    goto ; [INV]
  else
    goto ; [INV]

  :
  __builtin_abort ();

  :
  __builtin_abort ();

}
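The width mismatch can be sketched in isolation. The constant `WIDE_CONST` below is a hypothetical stand-in, since the actual values of CL1 and CL2 are not shown in this comment, and `uint32_t` is used to model a 4-byte unsigned long regardless of the host.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 8-byte constant standing in for CL1/CL2; the real
   values live in the test case and are not reproduced here.  */
#define WIDE_CONST UINT64_C(0x0123456789abcdef)

/* Models a store into a 4-byte unsigned long, as with -m32.  */
uint32_t store_as_ulong32(uint64_t v)
{
    return (uint32_t)v;
}

/* True when the value survives the 4-byte store unchanged.  */
bool roundtrips_32(uint64_t v)
{
    return (uint64_t)store_as_ulong32(v) == v;
}
```

A constant wider than 32 bits never round-trips, so a comparison against it is known to fail at compile time; a value such as 3735928559 (0xDEADBEEF), which appears in the IR, does fit.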
[Bug rtl-optimization/114664] -fno-omit-frame-pointer causes an ICE during the build of the greenlet package
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114664

--- Comment #8 from Kewen Lin ---
(In reply to Peter Bergner from comment #7)
> (In reply to Andrew Pinski from comment #6)
> > Pre-IRA fix was done to specifically reject this:
> > https://inbox.sourceware.org/gcc-patches/
> > ab3a61990702021658w4dc049cap53de8010a7d86...@mail.gmail.com/
>
> Then that would seem to indicate that mentioning the frame pointer reg in
> the asm clobber list is an error, but how are users supposed to know whether
> -fno-omit-frame-pointer is in effect or not? I've looked and there is no
> pre-defined macro a user could check.

I noticed that even without -fno-omit-frame-pointer, the test case still fails with the same symptom (with an error message rather than an ICE). Did I miss something?
[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #15 from Kewen Lin ---
(In reply to Michael Matz from comment #14)
> Hmm? But this is not how the global-to-local hand-off is implemented (and
> expected by tooling): a fall-through. The global entry sets up the GOT
> register, there simply is no '[b localentry]'.
>
> If you mean to imply that also the '[b localentry]' should be patched in at
> live-patch application time (and hence the GOT setup would need to be moved
> to still somewhere else), then you have the problem that (in the not-yet-patched
> case) as long as the L1-nops sit between global and local entry they will always
> be executed when the global entry is called.

Sorry for the confusion, I meant a sequence like:

global entry:
  [TOC base setup]  // always here
  [b localentry]    // which is added when patching
L1:
  [patched code]    // from patching
localentry:
  [b L1]            // from patching

> That's wasteful.

I agree, nops are not zero cost on Power8/Power9.

>
> Additionally tooling will be surprised if the address difference between
> global and local entry isn't exactly 8 (i.e. two instructions). The psABI
> allows for different values, of course. But I'm willing to bet that there are
> bugs in the wild when different values would be actually used.
>

It's possible that some tooling doesn't conform to the ABI doc well, but I think the tooling should fix itself if that is the case. :)

> So, the nops-between-gep-and-lep could probably be somehow made to work with
> userspace live patching, but your most recent patch here makes this all mood.
> It generates exactly the sequence we want: a single nop at the LEP, and
> a configurable patching area outside of, but near to, the function (here: in
> front of the GEP).

I agree, thanks for the comments!

btw, I'm not fighting for the current implementation; I just want to know in more detail why users are unable to make use of it. Is it just due to its inefficiency (like the above sequence) or its un-usability (unused at all)? From your comments, I think it's due to the former (inefficiency)?!
[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #13 from Kewen Lin ---
(In reply to Giuliano Belinassi from comment #12)
> With your patch we have:
>
> > .LPFE0:
> > ...
> Which seems what is expected.

Hi Giuliano, thanks for your time on testing it!

Could you kindly explain a bit why "In such way we can't use the this space to place a trampoline to the new function"? Is it due to inefficient code, such as needing more branches?

global entry:
  [b localentry]
L1:
  [patched code]
localentry:
  [b L1]

Or is there some other reason which makes it unused at all?
[Bug testsuite/114614] New test case gcc.misc-tests/gcov-20.c from r14-9789-g08a52331803f66 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114614

Kewen Lin changed:

What|Removed|Added
Target Milestone|---|14.0
Status|ASSIGNED|RESOLVED
Resolution|---|FIXED

--- Comment #3 from Kewen Lin ---
Should be fixed on latest trunk.
[Bug testsuite/114642] new test case gcc.dg/debug/btf/btf-datasec-3.c from r14-6195-gb8cf266f4ca4ff fails for 32 bits
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114642

Kewen Lin changed:

What|Removed|Added
URL||https://gcc.gnu.org/pipermail/gcc-patches/2024-April/648994.html
CC||linkw at gcc dot gnu.org
Status|NEW|ASSIGNED
Assignee|unassigned at gcc dot gnu.org|david.faust at oracle dot com

--- Comment #2 from Kewen Lin ---
David posted a fix (see URL).
[Bug testsuite/114614] New test case gcc.misc-tests/gcov-20.c from r14-9789-g08a52331803f66 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114614

Kewen Lin changed:

What|Removed|Added
Status|UNCONFIRMED|ASSIGNED
Ever confirmed|0|1
Assignee|unassigned at gcc dot gnu.org|linkw at gcc dot gnu.org
Last reconfirmed||2024-04-08
CC||linkw at gcc dot gnu.org

--- Comment #1 from Kewen Lin ---
It requires effective target profile_update_atomic.
[Bug target/114567] rs6000: explicit _Float128 doesn't generate optimal code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114567

Kewen Lin changed:

What|Removed|Added
Target Milestone|---|15.0
Keywords||missed-optimization
Target||powerpc64*-linux-gnu
Last reconfirmed||2024-04-03
Status|UNCONFIRMED|ASSIGNED
Ever confirmed|0|1
Assignee|unassigned at gcc dot gnu.org|linkw at gcc dot gnu.org
[Bug target/114567] New: rs6000: explicit _Float128 doesn't generate optimal code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114567

Bug ID: 114567
Summary: rs6000: explicit _Float128 doesn't generate optimal code
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: linkw at gcc dot gnu.org
Target Milestone: ---

This is an issue which I happened to spot while working on patches for PR112993.

=== test case ===

#define TYPE _Float128
#ifdef LD
#undef TYPE
#define TYPE long double
#endif

int sbm (TYPE *a)
{
  return __builtin_signbit (*a);
}

==

/opt/gcc-nightly/trunk/bin/gcc -mcpu=power8 -mvsx -O2 -mabi=ieeelongdouble -Wno-psabi test.c -DLD -S -o ref.s

/opt/gcc-nightly/trunk/bin/gcc -mcpu=power8 -mvsx -O2 -mabi=ibmlongdouble -Wno-psabi test.c -S -o float128.s

diff -Nur ref.s float128.s
--- ref.s 2024-03-18 05:41:00.302208975 -0400
+++ float128.s 2024-03-18 05:41:00.392205513 -0400
@@ -9,7 +9,10 @@
 sbm:
 .LFB0:
     .cfi_startproc
-    ld 3,8(3)
+    rldicr 3,3,0,59
+    lxvd2x 0,0,3
+    xxpermdi 0,0,0,2
+    mfvsrd 3,0
     srdi 3,3,63
     blr
     .long 0
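The two-instruction reference sequence (`ld 3,8(3)` followed by `srdi 3,3,63`) can be mirrored in portable C. This is a sketch under assumptions: the 16 input bytes are assumed to hold an IEEE binary128 value in little-endian byte order, as on power8 LE, and the function name `signbit_from_le_bytes` is hypothetical. Working on raw bytes keeps the example independent of whether the host compiler supports `_Float128`.

```c
#include <stdint.h>

/* Assumes bytes[0..15] hold an IEEE binary128 value in little-endian
   byte order, so the sign bit is the top bit of bytes[15].  */
int signbit_from_le_bytes(const unsigned char bytes[16])
{
    uint64_t high = 0;

    /* Assemble the high doubleword (offset 8), like "ld 3,8(3)".  */
    for (int i = 15; i >= 8; i--)
        high = (high << 8) | bytes[i];

    /* Shift the sign bit down to bit 0, like "srdi 3,3,63".  */
    return (int)(high >> 63);
}
```

Only the upper eight bytes are read, which is why a single doubleword load suffices and the full vector load plus permute in float128.s is unnecessary work.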
[Bug target/88309] [11/12/13/14 Regression] ICE: Floating point exception (in is_miss_rate_acceptable), target assigning alignent of 4 bits(!) to vector
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88309

Kewen Lin changed:

What|Removed|Added
Status|NEW|ASSIGNED
Assignee|unassigned at gcc dot gnu.org|linkw at gcc dot gnu.org

--- Comment #6 from Kewen Lin ---
(In reply to Andrew Pinski from comment #5)
> (In reply to Kewen Lin from comment #4)
> > Hi Andrew, thanks for digging into this! William has not worked on GCC
> > project any more, will you make a patch for this?
>
> I don't have time to test it really.

No problem, I'll work on this.
[Bug target/88309] [11/12/13/14 Regression] ICE: Floating point exception (in is_miss_rate_acceptable), target assigning alignent of 4 bits(!) to vector
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88309

Kewen Lin changed:

What|Removed|Added
CC||linkw at gcc dot gnu.org

--- Comment #4 from Kewen Lin ---
(In reply to Andrew Pinski from comment #3)
> Found it:
> /* In GIMPLE the type of the MEM_REF specifies the alignment. The
> required alignment (power) is 4 bytes regardless of data type. */
> tree align_ltype = build_aligned_type (lhs_type, 4);
>
> That should be 4*8 instead of just 4.
>
> There are 2 build_aligned_type in rs6000-builtins.cc which uses the wrong
> alignment; thinking it was the alignment argument was bytes rather than bits.
>
> Introduced by r9-2375-g3f7a77cd20d07c which means this is a regression.

Hi Andrew, thanks for digging into this! William no longer works on the GCC project; will you make a patch for this?
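The bytes-versus-bits mix-up described in the quoted analysis can be shown with plain arithmetic. The helper names below are hypothetical and `BITS_PER_BYTE` is a local stand-in for the 8 bits per byte assumed in "4*8"; nothing here is GCC code.

```c
/* Local stand-in for the number of bits in a byte, as in "4*8".  */
#define BITS_PER_BYTE 8u

/* Converts an alignment given in bytes to the bit count that an
   interface taking bits expects.  */
unsigned align_bits_from_bytes(unsigned bytes)
{
    return bytes * BITS_PER_BYTE;
}

/* Converts an alignment in bits back to whole bytes (rounds down).  */
unsigned align_bytes_from_bits(unsigned bits)
{
    return bits / BITS_PER_BYTE;
}
```

Passing 4 where bits are expected describes a 4-bit alignment, which rounds down to 0 bytes. Code that later divides by such a byte alignment would fault, which may be how the floating point exception in the bug title arises; that link is an assumption here, not something stated in the thread.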
[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #11 from Kewen Lin ---
(In reply to Giuliano Belinassi from comment #9)
> Yes, this is for userspace livepatching.
>
> Assume the following example:
> https://godbolt.org/z/b9M8nMbo1
>
> As one can see, the sequence of 14 nops are generated after the global
> function entry point. In such way we can't use the this space to place a
> trampoline to the new function. We need this sequence of nops to be placed
> *before* the global function entry point.
>

Hi Giuliano, thanks for the inputs!
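For context, the per-function form of this feature is the `patchable_function_entry` attribute, which takes the total number of NOPs and how many of them go before the function entry. The sketch below is illustrative only: the counts 14 and 13, the macro `PATCHABLE`, and the function `patch_target` are assumptions, and the code behind the godbolt link above is not reproduced here.

```c
/* Use the attribute only where the compiler reports support for it;
   otherwise the macro expands to nothing and no padding is emitted.  */
#if defined(__has_attribute)
#  if __has_attribute(patchable_function_entry)
#    define PATCHABLE(total, before) \
        __attribute__((patchable_function_entry(total, before)))
#  endif
#endif
#ifndef PATCHABLE
#  define PATCHABLE(total, before)
#endif

/* Hypothetical counts: request 14 NOPs in total, 13 of them placed
   before the function entry and 1 after it.  */
PATCHABLE(14, 13)
int patch_target(int x)
{
    return x + 1;
}
```

The padding does not change what the function computes; the discussion in this bug is only about where the NOPs end up relative to the global and local entry points on 64-bit ELFv2.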
[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #10 from Kewen Lin ---
Created attachment 57844
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57844&action=edit
patch changing the current implementation

Considering the current implementation is not useful at all for both kernel and userspace uses, I'm inclined to change the current implementation instead of introducing another option, while updating the documentation to emphasize that the NOPs may not be consecutive in this case.
[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #8 from Kewen Lin ---
Hi @Michael, @Martin, could you help confirm/clarify what makes you interested in this feature? Is it for some userspace usage or not?
[Bug target/114402] rs6000: ICE when long double is ieee128 format by default but without vsx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114402

--- Comment #1 from Kewen Lin ---
Currently the only pattern to match IEEE128 comparison is:

;; IEEE 128-bit comparisons
(define_insn "*cmp_hw"
  [(set (match_operand:CCFP 0 "cc_reg_operand" "=y")
        (compare:CCFP (match_operand:IEEE128 1 "altivec_register_operand" "v")
                      (match_operand:IEEE128 2 "altivec_register_operand" "v")))]
  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)"
  "xscmpuqp %0,%1,%2"
  [(set_attr "type" "veccmp")
   (set_attr "size" "128")])

It requires TARGET_FLOAT128_HW, so nothing can be used for matching.

The patch below can fix this ICE: it makes no-VSX IEEE128 also go with a libfunc call, as is done for !TARGET_FLOAT128_HW && FLOAT128_VECTOR_P (mode).

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 5d975dab921..237d138faec 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -15329,7 +15329,7 @@ rs6000_generate_compare (rtx cmp, machine_mode mode)
   rtx op0 = XEXP (cmp, 0);
   rtx op1 = XEXP (cmp, 1);
 
-  if (!TARGET_FLOAT128_HW && FLOAT128_VECTOR_P (mode))
+  if (!TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode))
     comp_mode = CCmode;
   else if (FLOAT_MODE_P (mode))
     comp_mode = CCFPmode;
@@ -15361,7 +15361,7 @@ rs6000_generate_compare (rtx cmp, machine_mode mode)
 
   /* IEEE 128-bit support in VSX registers when we do not have hardware
      support.  */
-  if (!TARGET_FLOAT128_HW && FLOAT128_VECTOR_P (mode))
+  if (!TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode))
     {
       rtx libfunc = NULL_RTX;
       bool check_nan = false;
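The effect of the changed condition can be modeled with three booleans. This is a rough sketch, not GCC code: the struct and function names are hypothetical, and it assumes that FLOAT128_VECTOR_P only holds for IEEE 128-bit modes when VSX is enabled, which is how the comment above describes the gap.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the target state consulted by
   rs6000_generate_compare.  */
struct cmp_target {
    bool float128_hw;   /* models TARGET_FLOAT128_HW */
    bool vsx;           /* models VSX being enabled */
    bool ieee128_mode;  /* models FLOAT128_IEEE_P (mode) */
};

/* Before the patch; assumes FLOAT128_VECTOR_P implies VSX.  */
bool uses_libfunc_before(struct cmp_target t)
{
    return !t.float128_hw && t.vsx && t.ieee128_mode;
}

/* After the patch: any IEEE 128-bit mode without hardware support.  */
bool uses_libfunc_after(struct cmp_target t)
{
    return !t.float128_hw && t.ieee128_mode;
}
```

In this model the no-VSX, IEEE128 case falls through to a hardware compare before the patch, where no pattern matches, and takes the libfunc path after it.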
[Bug target/114402] rs6000: ICE when long double is ieee128 format by default but without vsx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114402

Kewen Lin changed:

What|Removed|Added
CC||bergner at gcc dot gnu.org, g...@the-meissners.org, segher at gcc dot gnu.org
Last reconfirmed||2024-03-21
Ever confirmed|0|1
Status|UNCONFIRMED|ASSIGNED
Assignee|unassigned at gcc dot gnu.org|linkw at gcc dot gnu.org
[Bug target/114402] rs6000: ICE when long double is ieee128 format by default but without vsx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114402

Kewen Lin changed:

What|Removed|Added
Target||powerpc64*-linux-gnu
Keywords||ice-on-valid-code
Target Milestone|---|15.0
Known to fail||12.3.1, 13.2.1
[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980

--- Comment #6 from Kewen Lin ---
(In reply to Martin Jambor from comment #5)
> I'd like to ping this, are there plans to implement this in the near-ish
> term?

Some weeks ago, Naveen had been doing some experiments to see if there is a better way for function tracer support. If the idea works and the experiment result is promising, he may request something different, so we are still waiting for that.

@Naveen Feel free to correct me if there is any misunderstanding.
[Bug target/114402] New: rs6000: ICE when long double is ieee128 format by default but without vsx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114402

Bug ID: 114402
Summary: rs6000: ICE when long double is ieee128 format by default but without vsx
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: linkw at gcc dot gnu.org
Target Milestone: ---

When I was doing a patch to make us only have two 128bit fp on rs6000, I found that we can have long double with ieee128 format by default even without VSX support, but a simple test case with a comparison triggers an ICE as below:

long double a;
long double b;

int foo() {
  if (a > b)
    return 0;
  else
    return 1;
}

/opt/gcc-nightly/trunk/bin/gcc test.c -mno-vsx

test.c: In function ‘foo’:
test.c:9:1: error: unrecognizable insn:
    9 | }
      | ^
(insn 9 8 10 2 (set (reg:CCFP 123)
        (compare:CCFP (reg:TF 117 [ a.0_1 ])
            (reg:TF 118 [ b.1_2 ]))) "test.c":5:6 -1
     (nil))
during RTL pass: vregs
test.c:9:1: internal compiler error: in extract_insn, at recog.cc:2812
0x102b7353 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
        /home/gccbuild/gcc_trunk_git/gcc/gcc/rtl-error.cc:108
0x102b73a7 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
        /home/gccbuild/gcc_trunk_git/gcc/gcc/rtl-error.cc:116
0x10c6636b extract_insn(rtx_insn*)
        /home/gccbuild/gcc_trunk_git/gcc/gcc/recog.cc:2812
0x107ef797 instantiate_virtual_regs_in_insn
        /home/gccbuild/gcc_trunk_git/gcc/gcc/function.cc:1611
0x107ef797 instantiate_virtual_regs
        /home/gccbuild/gcc_trunk_git/gcc/gcc/function.cc:1994
0x107ef797 execute
        /home/gccbuild/gcc_trunk_git/gcc/gcc/function.cc:2041
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

Note that the compiler should be configured with --with-long-double-format=ieee, since specifying -mabi=ieeelongdouble would require VSX to be enabled.
[Bug testsuite/114320] New test case in r14-9439-g4aa87b856067d4 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114320

--- Comment #3 from Kewen Lin ---
(In reply to Nathaniel Shead from comment #2)
> Sorry about that. I've not been able to work out what configure flags I need
> to pass to cause this to error in the first place (I don't normally develop
> for powerpc and the machine I'm using doesn't seem to fail no matter what

I guess the machine you are using (were referring to) doesn't have a powerpc chip. cfarm provides some powerpc machines (https://portal.cfarm.net/machines/list/), both ppc64le (LE -m64) and ppc64 (BE -m32/-m64); it's recommended to leverage them for building/testing. :)

> flags I try), but am I correct in understanding that just adding
> "-Wno-psabi" to the tests should stop them from failing? If so I'm happy to
> push a patch to that effect.

I think so. For now we don't have an effective target dedicated to the __ibm128 type, but it's guarded the same as the __float128 type (it would be relaxed in future though; even then, using ppc_float128_sw should just be more strict). Ideally we can add one effective target powerpc_vsx_ok (should be powerpc_vsx) to ensure VSX is enabled, but considering we are going to rework it in the next release and we don't normally disable VSX explicitly, this can be optional.
[Bug testsuite/114320] New test case in r14-9439-g4aa87b856067d4 fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114320

Kewen Lin changed:

What|Removed|Added
Status|UNCONFIRMED|NEW
Last reconfirmed||2024-03-13
Ever confirmed|0|1
CC||linkw at gcc dot gnu.org

--- Comment #1 from Kewen Lin ---
These new test cases require "-Wno-psabi" to suppress the warning.
[Bug testsuite/101461] [12/13/14 regression] gcc.target/powerpc/fold-vec-load-builtin_vec_xl test cases fail after r12-2266
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101461

Kewen Lin changed:

What|Removed|Added
Status|UNCONFIRMED|RESOLVED
Resolution|---|FIXED
CC||linkw at gcc dot gnu.org

--- Comment #4 from Kewen Lin ---
Already fixed by r12-2889-g8464894c86b03e.
[Bug target/113507] can't build a cross compiler to rs6000-ibm-aix7.2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113507

Kewen Lin changed:

What|Removed|Added
Status|NEW|ASSIGNED
Assignee|unassigned at gcc dot gnu.org|segher at gcc dot gnu.org

--- Comment #6 from Kewen Lin ---
Segher will clean up this rs6000-*-* thing in the next release; please use powerpc*-*-* instead.
[Bug testsuite/106680] Test gcc.target/powerpc/bswap64-4.c fails on 32-bit BE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106680

--- Comment #12 from Kewen Lin ---
(In reply to Sebastian Huber from comment #10)
> (In reply to Kewen Lin from comment #9)
> > Note that now we only disable implicit powerpc64 for -m32 when the
> > OS_MISSING_POWERPC64 is set.
> >
> > /* Don't expect powerpc64 enabled on those OSes with OS_MISSING_POWERPC64,
> > since they do not save and restore the high half of the GPRs correctly
> > in all cases. If the user explicitly specifies it, we won't interfere
> > with the user's specification. */
> > #ifdef OS_MISSING_POWERPC64
> > if (OS_MISSING_POWERPC64
> > && TARGET_32BIT
> > && TARGET_POWERPC64
> > && !(rs6000_isa_flags_explicit & OPTION_MASK_POWERPC64))
> > rs6000_isa_flags &= ~OPTION_MASK_POWERPC64;
> > #endif
> >
> > But rtems.h doesn't define OS_MISSING_POWERPC64
>
> RTEMS supports the 64-bit PowerPC for the 64-bit multilibs.
>

A 64-bit kernel should support 64-bit PowerPC, but does a 32-bit kernel support saving and restoring 64-bit regs? The current rtems.h is saying yes; if the answer is no, we should fix rtems.h, and then you won't need the explicit -mno-powerpc64.

btw, take the comments in freebsd64.h for example.

/* FreeBSD doesn't support saving and restoring 64-bit regs with a 32-bit
   kernel. This is supported when running on a 64-bit kernel with
   COMPAT_FREEBSD32, but tell GCC it isn't so that our 32-bit binaries
   are compatible. */
#define OS_MISSING_POWERPC64 !TARGET_64BIT
[Bug testsuite/106680] Test gcc.target/powerpc/bswap64-4.c fails on 32-bit BE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106680

--- Comment #11 from Kewen Lin ---
(In reply to Sebastian Huber from comment #8)
> Yes, it seems that -mcpu=e6500 -mno-powerpc64 yields the right code for the
> attached test case (with or without the -m32).

The default is -m32 I guess? :)

>
> I am now a bit confused what the purpose of the -m32 and -m64 options is.

For -m32/-m64, the manual says:

  Generate code for 32-bit or 64-bit environments of Darwin and SVR4 targets (including GNU/Linux). The 32-bit environment sets int, long and pointer to 32 bits and generates code that runs on any PowerPC variant. The 64-bit environment sets int to 32 bits and long and pointer to 64 bits, and generates code for PowerPC64, as for -mpowerpc64.

But it's possible for these options to interact with the powerpc64 option: for example, cpu e6500 supports powerpc64 by default, and if the OS in use is able to support the necessary context switches, we want -mpowerpc64 kept so that more efficient code can be generated (leveraging insns guarded with the powerpc64 flag).
[Bug testsuite/106680] Test gcc.target/powerpc/bswap64-4.c fails on 32-bit BE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106680

--- Comment #9 from Kewen Lin ---
Note that now we only disable implicit powerpc64 for -m32 when OS_MISSING_POWERPC64 is set.

/* Don't expect powerpc64 enabled on those OSes with OS_MISSING_POWERPC64,
   since they do not save and restore the high half of the GPRs correctly
   in all cases. If the user explicitly specifies it, we won't interfere
   with the user's specification. */
#ifdef OS_MISSING_POWERPC64
  if (OS_MISSING_POWERPC64
      && TARGET_32BIT
      && TARGET_POWERPC64
      && !(rs6000_isa_flags_explicit & OPTION_MASK_POWERPC64))
    rs6000_isa_flags &= ~OPTION_MASK_POWERPC64;
#endif

But rtems.h doesn't define OS_MISSING_POWERPC64:

gcc/config/rs6000/linux.h:#define OS_MISSING_POWERPC64 1
gcc/config/rs6000/freebsd64.h:#define OS_MISSING_POWERPC64 !TARGET_64BIT
gcc/config/rs6000/aix.h:#define OS_MISSING_POWERPC64 1
gcc/config/rs6000/linux64.h:#define OS_MISSING_POWERPC64 !TARGET_64BIT

Meanwhile, cpu "e6500" has MASK_POWERPC64 set by default (it's a 64-bit core). That's why you still have the powerpc64 flag set when you specify -m32 on rtems.
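The guarded logic above can be exercised as a standalone model. This is a sketch, not the GCC implementation: `MASK_POWERPC64` is given a hypothetical bit value, and the function `adjust_isa_flags` with its boolean parameters stands in for the macros and globals used in the real code.

```c
#include <stdbool.h>

#define MASK_POWERPC64 0x1u  /* hypothetical bit value for this model */

/* Clears the powerpc64 bit only when the OS cannot handle it, the
   target is 32-bit, the bit is currently set, and the user did not
   set the option explicitly.  */
unsigned adjust_isa_flags(unsigned isa_flags, unsigned explicit_flags,
                          bool os_missing_powerpc64, bool target_32bit)
{
    if (os_missing_powerpc64
        && target_32bit
        && (isa_flags & MASK_POWERPC64)
        && !(explicit_flags & MASK_POWERPC64))
        isa_flags &= ~MASK_POWERPC64;
    return isa_flags;
}
```

With `os_missing_powerpc64` false, which models rtems.h not defining the macro, a CPU default such as e6500's powerpc64 bit survives a 32-bit target; with it true, the bit is dropped unless the user asked for it explicitly.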
[Bug testsuite/106680] Test gcc.target/powerpc/bswap64-4.c fails on 32-bit BE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106680 --- Comment #7 from Kewen Lin --- (In reply to Sebastian Huber from comment #6) > It seems that the change > > commit acc727cf02a1446dc00f8772f3f479fa3a508f8e > Author: Kewen Lin > Date: Tue Dec 27 04:13:07 2022 -0600 > > rs6000: Rework option -mpowerpc64 handling [PR106680] > > causes a regression for -mcpu=e6500 -m32, for example: > > gcc -fpreprocessed -O2 -S -mcpu=e6500 -m32 -S imfs_add_node.c.67.s > imfs_add_node.c.67.i > > diff -u imfs_add_node.c.67.s.good.e2acff49fb2962b921bf8b73984b89878b61492c > imfs_add_node.c.67.s.bad.acc727cf02a1446dc00f8772f3f479fa3a508f8e > --- imfs_add_node.c.67.s.good.e2acff49fb2962b921bf8b73984b89878b61492c > 2024-01-20 12:15:15.143182571 +0100 > +++ imfs_add_node.c.67.s.bad.acc727cf02a1446dc00f8772f3f479fa3a508f8e > 2024-01-20 12:11:46.804204927 +0100 > @@ -52,8 +52,8 @@ > bne- 0,.L4 > .L2: > mr 4,29 > - addi 3,1,8 > li 5,24 > + addi 3,1,8 > bl rtems_filesystem_eval_path_start > lis 9,IMFS_node_clone@ha > lwz 10,20(3) > @@ -63,12 +63,12 @@ > cmpw 0,10,9 > beq- 0,.L24 > li 4,134 > - addi 3,1,8 > + li 3,0 > bl rtems_filesystem_eval_path_error > .L9: > li 31,-1 > .L10: > - addi 3,1,8 > + li 3,0 > bl rtems_filesystem_eval_path_cleanup > .L1: > lwz 0,116(1) > @@ -93,7 +93,7 @@ > lwz 9,12(31) > li 8,96 > lhz 10,16(31) > - addi 3,1,8 > + li 3,0 > stw 8,24(1) > stw 9,8(1) > stw 10,12(1) > @@ -105,7 +105,7 @@ > cmpwi 0,9,0 > beq- 0,.L9 > li 4,22 > - addi 3,1,8 > + li 3,0 > bl rtems_filesystem_eval_path_error > b .L9 > .p2align 4,,15 > @@ -129,12 +129,9 @@ > stw 9,0(10) > stw 10,4(9) > bl _Timecounter_Getbintime > - lwz 10,64(1) > - lwz 11,68(1) > - stw 10,40(30) > - stw 11,44(30) > - stw 10,48(30) > - stw 11,52(30) > + ld 9,64(1) > + std 9,40(30) > + std 9,48(30) > b .L10 > .cfi_endproc > .LFE351: > > For the call to rtems_filesystem_eval_path_cleanup() the register 3 should > point to a structure on the stack. 
Correct is: > > - addi 3,1,8 > > Wrong is: > > + li 3,0 > > It seems that for the -mcpu=e6500 the -m32 option has not the right effect > and some 64-bit instructions are generated, for example ld and std plus the > wrong function parameters. As the commit log says, the previous behavior, in which -m32 also disabled -mpowerpc64, was wrong: -m{no,}powerpc64 should be independent of -m32/-m64. I suppose the behavior you want with -m32 is not to enable powerpc64 (since previously -m32 disabled -mpowerpc64 as well), so I think you can get the previous behavior by explicitly specifying -mno-powerpc64 together with -m32.
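The option interaction described in comment #7 can be sketched as a small decision function. This is a toy model in Python, not GCC's option-handling code: the function names and the `cpu_is_64bit` parameter are hypothetical, and the model only encodes the two behaviors stated in the comment.

```python
def powerpc64_before_rework(cpu_is_64bit, m32):
    """Old behavior per the comment: -m32 also disabled -mpowerpc64."""
    if m32:
        return False
    return cpu_is_64bit


def powerpc64_after_rework(cpu_is_64bit, m32, explicit=None):
    """New behavior per the comment: -m32/-m64 do not affect -mpowerpc64.

    `explicit` is None when no -m{no-,}powerpc64 option was given,
    otherwise the boolean the user asked for (hypothetical encoding).
    `m32` is accepted but deliberately unused, which is the point.
    """
    if explicit is not None:
        return explicit
    return cpu_is_64bit
```

With a 64-bit CPU default, as the report suggests for e6500, `powerpc64_after_rework(True, m32=True)` stays true, which matches the `ld`/`std` seen in the 32-bit output. Passing `explicit=False`, the model's stand-in for `-mno-powerpc64`, gives the same answer as `powerpc64_before_rework(True, m32=True)`.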
[Bug target/113652] [14 regression] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652 Kewen Lin changed: What|Removed |Added Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Last reconfirmed||2024-01-30 --- Comment #13 from Kewen Lin --- One more finding: with -mvsx but without an explicit cpu type, gcc already passes -mpower7 to the assembler, but if a cpu type is explicitly specified, it won't do that. I think the reason it doesn't always do so is that only the last cpu type wins, so passing -mpower7 could unexpectedly override a higher cpu type. The candidate fixes seem to be: diff --git a/libgcc/config/rs6000/t-float128 b/libgcc/config/rs6000/t-float128 index b09b5664af0..47b06d3c30d 100644 --- a/libgcc/config/rs6000/t-float128 +++ b/libgcc/config/rs6000/t-float128 @@ -74,7 +74,7 @@ fp128_includes = $(srcdir)/soft-fp/double.h \ $(srcdir)/soft-fp/soft-fp.h # Build the emulator without ISA 3.0 hardware support. -FP128_CFLAGS_SW = -Wno-type-limits -mvsx -mfloat128 \ +FP128_CFLAGS_SW = -Wno-type-limits -mvsx -mfloat128 -mcpu=power7 \ -mno-float128-hardware -mno-gnu-attribute \ -I$(srcdir)/soft-fp \ -I$(srcdir)/config/rs6000 \ Or diff --git a/libgcc/config/rs6000/t-float128 b/libgcc/config/rs6000/t-float128 index b09b5664af0..bf4a5e6aaf0 100644 --- a/libgcc/config/rs6000/t-float128 +++ b/libgcc/config/rs6000/t-float128 @@ -74,7 +74,7 @@ fp128_includes = $(srcdir)/soft-fp/double.h \ $(srcdir)/soft-fp/soft-fp.h # Build the emulator without ISA 3.0 hardware support. -FP128_CFLAGS_SW = -Wno-type-limits -mvsx -mfloat128 \ +FP128_CFLAGS_SW = -Wno-type-limits -mvsx -mfloat128 -Wa,-many \ -mno-float128-hardware -mno-gnu-attribute \ -I$(srcdir)/soft-fp \ -I$(srcdir)/config/rs6000 \ This is because gcc considers -mvsx to imply -mcpu=power7 (appending onto the currently specified cpu type if there is one), while the assembler does not.
[Bug target/113652] [14 regression] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652 Kewen Lin changed: What|Removed |Added Summary|Failed bootstrap on ppc |[14 regression] Failed |unrecognized opcode:|bootstrap on ppc |`lfiwzx' with -mcpu=7450|unrecognized opcode: ||`lfiwzx' with -mcpu=7450 --- Comment #12 from Kewen Lin --- (In reply to Sam James from comment #10) > (In reality, I think it is a regression, given: > a) it regresses non-release checking (which we sometimes use even for > released versions, it's opt-in though); But I assumed that non-release checking on old releases should also fail, from non-release vs. non-release, the behavior doesn't change. > b) it blocks further testing with GCC 14 > Sorry for that, put it back as you like. :) > but I understand the argument that if a release were made with it, it > wouldn't be the end of the world by itself and it only affects a specific > configuration.)
[Bug target/113652] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652 --- Comment #11 from Kewen Lin --- In gcc, lfiwzx is guarded with TARGET_LFIWZX => TARGET_POPCNTD (ISA 2.06), and -mvsx guarantees that TARGET_POPCNTD (ISA_2_6_MASKS_SERVER) is set, so gcc considers lfiwzx supported. IMHO the underlying philosophy is that if vsx is available the supported ISA level is at least 2.06, and lfiwzx is available from 2.06 on, so it is supported. But binutils does not seem to follow this: {"xvadddp", XX3(60,96), XX3_MASK,PPCVSX,PPCVLE, {XT6, XA6, XB6}}, {"lfiwzx", X(31,887), X_MASK, POWER7|PPCA2, 0, {FRT, RA0, RB}}, The two are guarded with different masks, and apparently PPCVSX doesn't enable POWER7. Hi Alan and Peter, I wonder if the assembler could enable POWER7 when PPCVSX gets enabled, as gcc does now?
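The mismatch in comment #11 can be sketched as a flag check. This is a minimal Python model, assuming an opcode is accepted when its flag mask shares a bit with the enabled set. Only the flag names come from the two table entries quoted above; the bit values, the `PPC` stand-in for what `-mppc` enables, and both function names are made up for illustration.

```python
# Illustrative bit values; only the flag names come from the quoted table.
PPC = 1 << 0      # stand-in for what -mppc enables (assumption)
PPCVSX = 1 << 1
POWER7 = 1 << 2
PPCA2 = 1 << 3

# Masks taken from the two opcode-table entries quoted in the comment.
OPCODE_FLAGS = {
    "xvadddp": PPCVSX,
    "lfiwzx": POWER7 | PPCA2,
}


def assembler_accepts(mnemonic, enabled):
    """Accept the opcode if any of its flags is in the enabled set."""
    return (OPCODE_FLAGS[mnemonic] & enabled) != 0


def vsx_implies_power7(enabled):
    """The behavior asked about in the comment: PPCVSX also enables POWER7."""
    if enabled & PPCVSX:
        return enabled | POWER7
    return enabled
```

Under this model `PPC | PPCVSX` admits `xvadddp` but not `lfiwzx`, which mirrors the `t1.s`/`t2.s` experiment in comment #9; applying `vsx_implies_power7` first makes both acceptable.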
[Bug target/113652] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652 Kewen Lin changed: What|Removed |Added Summary|[14 regression] Failed |Failed bootstrap on ppc |bootstrap on ppc|unrecognized opcode: |unrecognized opcode:|`lfiwzx' with -mcpu=7450 |`lfiwzx' with -mcpu=7450| --- Comment #9 from Kewen Lin --- (In reply to Andrew Pinski from comment #8) > So t-float128 has this line: > # Build the emulator without ISA 3.0 hardware support. > FP128_CFLAGS_SW = -Wno-type-limits -mvsx -mfloat128 \ > ... > > Which gets added to some of the libgcc object files while compiling: > $(fp128_softfp_obj) : INTERNAL_CFLAGS += $(FP128_CFLAGS_SW) > $(fp128_ppc_obj) : INTERNAL_CFLAGS += $(FP128_CFLAGS_SW) > > > The problem is CFLAGS gets added also. It seems like passing -mvsx enables > some other instructions in GCC's code generation BUT does not enable it for > the assembler ... Ah, I just noticed that it's bootstrapping gcc. Stripping the regression tag since, per the comments above, I don't think it's actually a regression. I found that the libgcc_cv_powerpc_float128 check can pass with -mcpu=7450 -mabi=altivec -mvsx -mfloat128; the assembler options are "-a32 -mppc -mvsx -maltivec -mbig", which are actually the same as those used when compiling the case in #c5. So it looks like -mvsx is supposed to tell the assembler to recognize vsx instructions, but somehow "lfiwzx" is not counted as a vsx instruction. More specifically, "xvadddp" is recognized by the assembler with -mvsx while "lfiwzx" isn't. $ cat t1.s .machine "7450" lfiwzx 1,0,9 $ cat t2.s .machine "7450" xvadddp 34,34,35 $ as -a32 -mppc -mvsx t1.s -o t1.o t1.s: Assembler messages: t1.s:2: Error: unrecognized opcode: `lfiwzx' $ as -a32 -mppc -mvsx t2.s -o t2.o $ echo $? $ 0
[Bug target/113652] [14 regression] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652 --- Comment #7 from Kewen Lin --- oops, I meant --enable-checking rather than --checking.
[Bug target/113652] [14 regression] Failed bootstrap on ppc unrecognized opcode: `lfiwzx' with -mcpu=7450
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113652 --- Comment #6 from Kewen Lin --- I think this is related to r10-580-ge154242724b084, and this failure is expected and a usage error. With it applied, -many is no longer always passed to the assembler when CHECKING_P is enabled. Actually the compilers (gcc-13, gcc-12, gcc-11 or trunk) generate the same assembly, but gcc-11/gcc-12/gcc-13 are built with --checking=release by default, which doesn't set CHECKING_P, while trunk is built with --checking=yes,extra by default, which sets CHECKING_P. That causes the different behaviors, which were then unexpectedly taken for a regression. The issue should be gone once trunk gets released as gcc-14, or if it's built with --checking=release. IMO Alan's commit aims to help expose more such unexpected use cases so that users can fix them in place. As #c3 says, "PowerPC 7450 (aka PowerPC G4) is only capable of -maltivec but not -mvsx", so it's unexpected to have -mcpu=7450 together with -mvsx; could you check where the -mvsx comes from and fix it instead? Thanks! btw, a workaround is to add -Wa,-many to restore the previous behavior of passing -many to the assembler.
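Why the same assembly builds with released compilers but not with trunk can be sketched as follows. This is a simplified Python model of the behavior described in comment #6, not the actual driver spec: the function name and the `user_asm_flags` parameter (standing in for whatever `-Wa,` forwards) are hypothetical.

```python
def assembler_flags(checking_p, user_asm_flags=()):
    """Return the flags the assembler sees in this simplified model.

    checking_p: True for a compiler built with checking enabled (trunk's
    default per the comment), False for a release-checking build.
    """
    flags = list(user_asm_flags)
    # Per the comment, -many is only added implicitly when CHECKING_P is unset.
    if not checking_p and "-many" not in flags:
        flags.append("-many")
    return flags
```

In the model a release build always ends up with `-many`, so an opcode outside the selected cpu still assembles, whereas a checking build only gets `-many` when the user forwards it, which is what the `-Wa,-many` workaround does.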
[Bug target/113507] can't build a cross compiler to rs6000-ibm-aix7.2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113507 Kewen Lin changed: What|Removed |Added Status|UNCONFIRMED |NEW CC||segher at gcc dot gnu.org Last reconfirmed||2024-01-23 Ever confirmed|0 |1 --- Comment #5 from Kewen Lin --- (In reply to H.J. Lu from comment #3) > (In reply to Kewen Lin from comment #2) > > Guessing /usr/local/bin/ld is a gnu ld? Based on what I heard before, gnu ld > > has some problems on aix, people pass object files to aix system and use aix > > ld there. Not sure if the understanding still holds. > > I am building a cross compiler. No AIX tools are involved. Thanks for clarifying, I was dull and misunderstood it. Confirmed, some symbols are from rs6000-builtin.cc (which is not generated) but it requires some symbols in rs6000-builtins.cc (which is generated). Both object files are not included in linking. The below diff can fix it: diff --git a/gcc/config.gcc b/gcc/config.gcc index b2d7d7dd475..6b62e4fe56c 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -557,8 +557,10 @@ rs6000*-*-*) extra_options="${extra_options} g.opt fused-madd.opt rs6000/rs6000-tables.opt" extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-logue.o" extra_objs="${extra_objs} rs6000-call.o rs6000-pcrel-opt.o" +extra_objs="${extra_objs} rs6000-builtin.o rs6000-builtins.o" target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-logue.cc \$(srcdir)/config/rs6000/rs6000-call.cc" target_gtfiles="$target_gtfiles \$(srcdir)/config/rs6000/rs6000-pcrel-opt.cc" +target_gtfiles="$target_gtfiles ./rs6000-builtins.h" ;; sparc*-*-*) cpu_type=sparc According to David's comments "rs6000-ibm-aix doesn't exist any more" and I vaguely remembered Segher also mentioned rs6000*-*-*) becomes stale, maybe we can aggressively drop the whole rs6000*-*-*) case handling?
[Bug target/113507] can't build a cross compiler to rs6000-ibm-aix7.2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113507 Kewen Lin changed: What|Removed |Added CC||bergner at gcc dot gnu.org, ||dje at gcc dot gnu.org, ||linkw at gcc dot gnu.org --- Comment #2 from Kewen Lin --- Guessing /usr/local/bin/ld is a gnu ld? Based on what I heard before, gnu ld has some problems on aix, people pass object files to aix system and use aix ld there. Not sure if the understanding still holds.
[Bug testsuite/109705] [14 regression] gcc.dg/vect/pr25413a.c fails after r14-333-g6d4b59a9356ac4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109705 Kewen Lin changed: What|Removed |Added CC||linkw at gcc dot gnu.org --- Comment #7 from Kewen Lin --- (In reply to Peter Bergner from comment #6) > (In reply to GCC Commits from comment #5) > > commit r14-7270-g39fa71a0882928a25bd170580e3e9e89a69dce36 > > Author: Kewen Lin > > Date: Mon Jan 15 20:55:40 2024 -0600 > > > > testsuite: Fix vect_long_mult on Power [PR109705] > > > > As pointed out by the discussion in PR109705, the current > > vect_long_mult effective target check on Power is broken. > > This patch is to fix it accordingly. > > Does this need backporting? I guess no, the only use of vect_long_mult in release branches is gcc/testsuite/gcc.dg/vect/pr60656.c which has another check vect_widen_mult_si_to_di_pattern unsupported on Power.
[Bug testsuite/113535] rs6000, testsuite: Re-visit the current vect_* for Power
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113535 Kewen Lin changed: What|Removed |Added Last reconfirmed||2024-01-22 Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org CC||bergner at gcc dot gnu.org Status|UNCONFIRMED |ASSIGNED Ever confirmed|0 |1
[Bug testsuite/113535] New: rs6000, testsuite: Re-visit the current vect_* for Power
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113535 Bug ID: 113535 Summary: rs6000, testsuite: Re-visit the current vect_* for Power Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: testsuite Assignee: unassigned at gcc dot gnu.org Reporter: linkw at gcc dot gnu.org Target Milestone: --- Inspired by PR109705, open this for tracking the revisit of vect_* checking for Power and fix some if needed.
[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980 --- Comment #4 from Kewen Lin --- (In reply to Naveen N Rao from comment #2) > I don't really have a preference, though I tend to agree that nops before > the local entry point aren't that useful. Even with the current approach, > not all functions have instructions at the GEP and for those, the nops are > being generated outside the function. We also won't have a separate GEP/LEP > with pcrel, so we won't need a separate option eventually. Thanks for the input! Looking forward to the comments from the others, especially Segher, David and Peter. (In reply to Michael Matz from comment #3) > (In reply to Kewen Lin from comment #1) > > > > As Segher's review comments in [2], to support "before NOPs" before global > > entry and "after NOPs" after global entry, > > Just to be perfectly clear here: the "after NOPs" need to come after local > entry > (which strictly speaking is of course after the global one as well, but I'm > being anal :) ). Oops, good catch, I meant to type "after local entry", thanks for the correction making it perfectly clear. :)
[Bug testsuite/111850] [14 regression] gcc.target/powerpc/fold-vec-extract-char.p7.c fails after r14-4664-g04c9cf5c786b94
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111850 Kewen Lin changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #6 from Kewen Lin --- Should be fixed on trunk.
[Bug target/99888] Add powerpc ELFv2 support for -fpatchable-function-entry*
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99888 --- Comment #16 from Kewen Lin --- (In reply to Michael Matz from comment #15) > Umm. I just noticed this one as we now try to implement userspace live > patching > for ppc64le. The point of the "before" NOPs is (and always was) that they > are > completely out of the way of patchable but as-of-yet unpatched functions. > > For ppc that means the "before" and "after" NOPs cannot be consecutive. The > two > NOP sets being consecutive was never a design criteria or requirement. > > So, while the original bug is fixed by what was committed (local entry was > skipping the patching-nops), the chosen solution is exactly the wrong one :-/ Thanks for the input! Sigh, sorry that we picked the wrong one :(. You may have noticed that the main consideration in choosing the current one was to keep it aligned with the consecutive NOPs described by the documentation, and that we would need a separate command line option, per Segher's review comment in https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600239.html. Now we have PR112980 filed for the requested behavior, let's discuss how to support it there.
[Bug target/112980] 64-bit powerpc ELFv2 does not allow nops to be generated before function global entry point
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112980 Kewen Lin changed: What|Removed |Added CC||matz at gcc dot gnu.org Last reconfirmed||2024-01-18 Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org Ever confirmed|0 |1 See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=99888 Status|UNCONFIRMED |ASSIGNED --- Comment #1 from Kewen Lin --- [1] made me realize that I forgot to post some comments here. (I thought I did but actually didn't). As Segher's review comments in [2], to support "before NOPs" before global entry and "after NOPs" after global entry, we need to introduce a separate command line option, I think it can be a target specific option, which is enabled by default and we should mention its default behavior and impact in the current documentation for -fpatchable-function-entry. I don't have a good name candidate, any suggestions? Considering that the current behavior aligning with consecutive NOPs looks useless (this request and [1]), an alternative is to aggressively change the current behavior to "before NOPs" before global entry and "after NOPs" after global entry. Any preference or other ideas? Any comments are highly appreciated. I think with either (any) proposal it's inevitable to make the current behavior of -fpatchable-function-entry on "before NOPs" change, we should also document this change in releases/changes.html. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99888#c15 [2] https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600239.html
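The two placements under discussion can be compared with a schematic layout model. This is a Python sketch under assumptions: the labels `GEP:`, `LEP:`, `<TOC setup>` and `<body>`, the function name, and the `style` values are all illustrative, the "consecutive" layout follows the description of the current behavior in this thread, and the "split" layout uses "after local entry" as corrected in comment #4.

```python
def elfv2_entry_layout(before, after, style):
    """Return a schematic instruction stream for a dual-entry ELFv2 function.

    before/after: number of NOPs on each side of the patch point.
    style: "consecutive" (current behavior as described in the thread) or
           "split" (the layout requested in PR112980).
    """
    nops_before = ["nop"] * before
    nops_after = ["nop"] * after
    if style == "consecutive":
        # Both NOP groups sit together around the local entry point.
        return (["GEP:", "<TOC setup>"] + nops_before
                + ["LEP:"] + nops_after + ["<body>"])
    if style == "split":
        # "Before" NOPs precede the global entry; "after" NOPs follow
        # the local entry.
        return (nops_before + ["GEP:", "<TOC setup>", "LEP:"]
                + nops_after + ["<body>"])
    raise ValueError("unknown style: " + style)
```

In the "split" layout the "before" NOPs are never executed on either entry path, which is the property Michael Matz asks for in PR99888 comment #15; in the "consecutive" layout a caller entering at the global entry point runs through them.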
[Bug other/113317] New test case libgomp.c++/ind-base-2.C fails with ICE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113317 Kewen Lin changed: What|Removed |Added CC||linkw at gcc dot gnu.org --- Comment #3 from Kewen Lin --- I can't reproduce this either; I tried on at least one machine each for P8 LE, P9 LE, P10 LE and P9 BE. I wonder which internal host was used for testing.
[Bug testsuite/113418] Use of vect_* target selectors in tests out of vect directories
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113418 Kewen Lin changed: What|Removed |Added CC||linkw at gcc dot gnu.org --- Comment #4 from Kewen Lin --- Thanks for filing this, I just realized that it's unexpected to use vect_* effective target checks outside */vect/ in generic test suites. > > I just found them with a simple grep command so there might be false > positives or false negatives. There are also a dozen matches in gcc.target > but I consider them fine as the target maintainers should know exactly what > they are doing. Yes, I think those in target should be fine, although they can be replaced with some corresponding target specific check(s), sometimes the vect_* is more readable.
[Bug testsuite/111850] [14 regression] gcc.target/powerpc/fold-vec-extract-char.p7.c fails after r14-4664-g04c9cf5c786b94
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111850 Kewen Lin changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org --- Comment #4 from Kewen Lin --- Just realized that we also escalated test issue to P1, I'm going to make a patch for the test case update.
[Bug target/113341] Using GCC as the bootstrap compiler breaks LLVM on 32-bit PowerPC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113341 Kewen Lin changed: What|Removed |Added CC||linkw at gcc dot gnu.org --- Comment #9 from Kewen Lin --- Since the breakage happens during stage2, some of the code built by stage1 must behave unexpectedly. You can try running regression testing with just the stage1 compiler to see if any regression is exposed. If that doesn't help, you will probably have to isolate the problem to one or several object files: since you have objects from both the "good" and the "bad" stage1 compiler, you can narrow it down by mixing them. Once you have isolated some, you can probably get hints as to whether it's a bug in the LLVM source or in gcc. It seems you are using gcc 13.2.1 as the version field shows; you can also try some previous versions like gcc 12 and gcc 11 to see if they work and whether this is a regression.
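The object-file isolation suggested in comment #9 amounts to a bisection over the set of objects. This is a minimal Python sketch, assuming a single culprit object; `build_passes` is a hypothetical callback that links the given objects from the "bad" tree with everything else from the "good" tree and reports whether the result works.

```python
def find_culprit(objects, build_passes):
    """Bisect to the one object that breaks the build when taken from 'bad'.

    objects: list of object file names present in both build trees.
    build_passes(bad_subset) -> bool: hypothetical callback, True when the
    mixed build (bad_subset from 'bad', the rest from 'good') still works.
    Assumes exactly one object is responsible.
    """
    candidates = list(objects)
    if build_passes(candidates):
        raise ValueError("build works even with every object from 'bad'")
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if build_passes(half):
            # The first half is innocent; the culprit is in the remainder.
            candidates = candidates[len(half):]
        else:
            candidates = half
    return candidates[0]
```

Each step halves the candidate set, so a few hundred objects need fewer than ten mixed builds. If several objects only fail in combination, the single-culprit assumption breaks and the result should be treated as a hint only.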
[Bug target/109987] ICE in in rs6000_emit_le_vsx_store on ppc64le with -Ofast -mno-power8-vector
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109987 Kewen Lin changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org --- Comment #3 from Kewen Lin --- As discussed in PR113115, I'm going to give option power{8,9}-vector removal a shot.
[Bug target/113115] [14 Regression] ICE In extract_constrain_insn_cached recog.cc with ppc64le-linux-gnu crosscompiler from r14-3592-g9ea1248604d7b6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113115 --- Comment #7 from Kewen Lin --- (In reply to Peter Bergner from comment #5) > I really dislike the -mpower{8,9}-vector options, but maybe it's too late to > remove them for this release? I'm not sure how involved/invasive that patch > would be. Segher, do you have a preference on remove them now or use the > workaround above and remove in the next release? (In reply to Segher Boessenkool from comment #6) > Using -mpower9-vector while not having -mcpu=power9 (or later) is wrong, and > should > not work. Using -mno-power9-vector is just weird. > > If we can neuter the -mpower9-vector (etc.) options now, that would be good. > But > there are some complications with the testsuite at least? OK, it sounds that it's still acceptable to adjust this at this time point, so I'm working on a patch to evaluate its impact, will post it after full testing.
[Bug middle-end/113100] [14 regression] many strub tests fail after r14-6737-g4e0a467302fea5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113100 Kewen Lin changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #4 from Kewen Lin --- Should be fixed on trunk.
[Bug target/111480] new test case g++.target/powerpc/altivec-19.C fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111480 Kewen Lin changed: What|Removed |Added Component|testsuite |target Keywords|testsuite-fail |missed-optimization Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #2 from Kewen Lin --- Should be fixed.
[Bug testsuite/112751] [14 regression] gcc.target/powerpc/pcrel-sibcall-1.c fails after r14-5628-g53ba8d669550d3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112751 Kewen Lin changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #4 from Kewen Lin --- Should be fixed.
[Bug target/112606] [14 Regression] powerpc64le-linux-gnu: 'FAIL: gcc.target/powerpc/p8vector-fp.c scan-assembler xsnabsdp'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112606 Kewen Lin changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #7 from Kewen Lin --- Should be fixed on trunk now.
[Bug target/113115] [14 Regression] ICE In extract_constrain_insn_cached recog.cc with ppc64le-linux-gnu crosscompiler from r14-3592-g9ea1248604d7b6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113115 Kewen Lin changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |DUPLICATE --- Comment #4 from Kewen Lin --- (In reply to Peter Bergner from comment #3) > Ke Wen, is this just a duplicate of PR109987 and PR103627? I know it was > bisected to Jeevitha's commit, but it seems more like her commit exposed the > same latent issue as those other PRs, rather than causing it. Your thoughts? Yes, I agree it's a duplicate of PR109987; Jeevitha's commit just exposed this known issue. Since we are in stage 3, I wonder if we can go with power9-vector guarding first (https://gcc.gnu.org/pipermail/gcc-patches/2021-December/587310.html) since power9-vector still exists in this release, and we can try to remove these workaround options in the next release. (Sorry that I failed to follow up on the power{8,9}-vector removal) *** This bug has been marked as a duplicate of bug 109987 ***
[Bug target/109987] ICE in in rs6000_emit_le_vsx_store on ppc64le with -Ofast -mno-power8-vector
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109987 Kewen Lin changed: What|Removed |Added CC||fkastl at suse dot cz --- Comment #2 from Kewen Lin --- *** Bug 113115 has been marked as a duplicate of this bug. ***
[Bug testsuite/111480] new test case g++.target/powerpc/altivec-19.C fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111480 Kewen Lin changed: What|Removed |Added Last reconfirmed||2024-01-08 Ever confirmed|0 |1 Status|UNCONFIRMED |ASSIGNED URL||https://gcc.gnu.org/piperma ||il/gcc-patches/2024-January ||/642093.html Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org CC||linkw at gcc dot gnu.org
[Bug target/112606] [14 Regression] powerpc64le-linux-gnu: 'FAIL: gcc.target/powerpc/p8vector-fp.c scan-assembler xsnabsdp'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112606 Kewen Lin changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org Last reconfirmed||2024-01-08 Status|UNCONFIRMED |ASSIGNED Ever confirmed|0 |1 --- Comment #5 from Kewen Lin --- (In reply to seurer from comment #3) > These tests also fail starting with > g:9e9279fadbd1c673c875b9d20261d2de0473f63f, r14-5542-g9e9279fadbd1c6 > > FAIL: gcc.target/powerpc/float128-hw5.c scan-assembler-not \\mxscpsgnqp\\M > FAIL: gcc.target/powerpc/float128-hw5.c scan-assembler-times \\mxsnabsqp\\M 1 > FAIL: gcc.target/powerpc/float128-hw7.c scan-assembler-not \\mxscpsgnqp\\M > FAIL: gcc.target/powerpc/float128-hw7.c scan-assembler-times \\mxsnabsqp\\M 1 These failures are related to ieee128, the #c4 only handles float/double, a similar patch was posted for ieee128: https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642092.html
[Bug testsuite/112751] [14 regression] gcc.target/powerpc/pcrel-sibcall-1.c fails after r14-5628-g53ba8d669550d3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112751 Kewen Lin changed: What|Removed |Added URL||https://gcc.gnu.org/piperma ||il/gcc-patches/2024-January ||/642091.html Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org Status|NEW |ASSIGNED
[Bug middle-end/113100] [14 regression] many strub tests fail after r14-6737-g4e0a467302fea5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113100 Kewen Lin changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |linkw at gcc dot gnu.org URL||https://gcc.gnu.org/piperma ||il/gcc-patches/2024-January ||/642090.html Status|NEW |ASSIGNED
[Bug testsuite/60031] dg-require-effective-target powerpc_vsx_ok is not enough
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60031 Kewen Lin changed: What|Removed |Added CC||linkw at gcc dot gnu.org --- Comment #7 from Kewen Lin --- We have vsx_hw effective target keyword which uses check_vsx_hw_available. # Return 1 if the target supports executing VSX instructions, 0 # otherwise. Cache the result. Doesn't it satisfy the requirement? Or am I missing something?
[Bug testsuite/106682] Powerpc test gcc.target/powerpc/pr86731-fwrapv-longlong.c fails on power8, passes on power9/power10
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106682 Kewen Lin changed: What|Removed |Added CC||seurer at gcc dot gnu.org --- Comment #4 from Kewen Lin --- *** Bug 101444 has been marked as a duplicate of this bug. ***
[Bug testsuite/101444] [12/13/14 regression] gcc.target/powerpc/pr86731-fwrapv-longlong.c fails after r12-2266
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101444 Kewen Lin changed: What|Removed |Added Resolution|--- |DUPLICATE CC||linkw at gcc dot gnu.org Status|UNCONFIRMED |RESOLVED --- Comment #4 from Kewen Lin --- Dup. *** This bug has been marked as a duplicate of bug 106682 ***
[Bug middle-end/113100] [14 regression] many strub tests fail after r14-6737-g4e0a467302fea5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113100 Kewen Lin changed: What|Removed |Added CC||linkw at gcc dot gnu.org Status|UNCONFIRMED |NEW Ever confirmed|0 |1 Last reconfirmed||2023-12-21 --- Comment #2 from Kewen Lin --- Confirmed, but it needs an explicit cpu type like -mcpu=power9 for reproduction, otherwise it could pass on power10 as it can work with pcrel (so no toc base r2 needed). The change can extend the end of scrubbing, it cleans the saved toc base unexpectedly. I noticed that there is one macro SPARC_STACK_BOUNDARY_HACK, which aims to indicate this SPARC64 specific behavior. Could we leverage this macro (guarded the biasing with it)? like: diff --git a/gcc/builtins.cc b/gcc/builtins.cc index 125ea158ebf..9bad1e962b4 100644 --- a/gcc/builtins.cc +++ b/gcc/builtins.cc @@ -5450,6 +5450,7 @@ expand_builtin_stack_address () rtx ret = convert_to_mode (ptr_mode, copy_to_reg (stack_pointer_rtx), STACK_UNSIGNED); +#ifdef SPARC_STACK_BOUNDARY_HACK /* Unbias the stack pointer, bringing it to the boundary between the stack area claimed by the active function calling this builtin, and stack ranges that could get clobbered if it called another @@ -5476,7 +5477,9 @@ expand_builtin_stack_address () (caller) function's active area as well, whereas those pushed or allocated temporarily for a call are regarded as part of the callee's stack range, rather than the caller's. */ - ret = plus_constant (ptr_mode, ret, STACK_POINTER_OFFSET); + if (SPARC_STACK_BOUNDARY_HACK) +ret = plus_constant (ptr_mode, ret, STACK_POINTER_OFFSET); +#endif return force_reg (ptr_mode, ret); }
[Bug rtl-optimization/85099] [meta-bug] selective scheduling issues
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85099 Bug 85099 depends on bug 112995, which changed state. Bug 112995 Summary: sel-sched2 ICE without checking verify_changes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112995 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug rtl-optimization/112995] sel-sched2 ICE without checking verify_changes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112995 Kewen Lin changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED Target Milestone|--- |14.0 --- Comment #5 from Kewen Lin --- Should be fixed on trunk, guessing we don't want a backport, so closing.
[Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591 Kewen Lin changed: What|Removed |Added Resolution|FIXED |--- Status|RESOLVED|REOPENED --- Comment #44 from Kewen Lin --- I just checked test case in comment #43, I think those Set/Load are able to initialize those arrays as expected, so re-opening this.
[Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591 --- Comment #43 from Kewen Lin --- Created attachment 56899 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56899=edit Previously reduced case for comment 10
[Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591 --- Comment #42 from Kewen Lin --- (In reply to Richard Biener from comment #41) > What's the "other" testcase? Do we know that doesn't suffer from the same > uninitialized issue? For "other" test cases, I guessed he referred to my comment #c31, these are comment #c9 and #c10. Previously I further reduced #c10 and I didn't detect obvious uninitialized issue (but I could be wrong).
[Bug target/111591] ppc64be: miscompilation with -mstrict-align / -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111591 --- Comment #38 from Kewen Lin --- I found this has been marked as resolved but it seems that the patch in comment #34 hasn't been pushed, is it intended? or did I miss something that one commit was pushed but wasn't associated to this PR?
[Bug rtl-optimization/113029] sel-sched2 ICE in verify_target_availability
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113029 Kewen Lin changed:
What|Removed|Added
See Also||https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88652
--- Comment #3 from Kewen Lin ---
(In reply to Andrew Pinski from comment #1)
> Maybe https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84842#c17 patch helps

Unfortunately it doesn't help. I had noticed that patch and tried the adaptation below:

diff --git a/gcc/sel-sched.cc b/gcc/sel-sched.cc
index a35b5e16c91..8e3b3bb0467 100644
--- a/gcc/sel-sched.cc
+++ b/gcc/sel-sched.cc
@@ -323,6 +323,10 @@ struct reg_rename
   /* The set of ABIs used by calls that the code motion path crosses. */
   unsigned int crossed_call_abis : NUM_ABI_IDS;
+
+  /* True if we have merged expressions and one of them had availability
+     bit set. */
+  unsigned int merged_available_expr : 1;
 };
 /* A global structure that contains the needed information about harg
@@ -388,6 +392,10 @@ struct fur_static_params
   /* The set of ABIs used by calls that the code motion path crosses. */
   unsigned int crossed_call_abis : NUM_ABI_IDS;
+
+  /* True if we have merged expressions and one of them had availability
+     bit set. */
+  unsigned int merged_available_expr : 1;
 };
 typedef struct fur_static_params *fur_static_params_p;
@@ -1554,7 +1562,8 @@ verify_target_availability (expr_t expr, regset used_regs,
     || !hard_available
     || (!reload_completed
         && reg_rename_p->crossed_call_abis
-        && REG_N_CALLS_CROSSED (regno) == 0));
+        && REG_N_CALLS_CROSSED (regno) == 0)
+    || reg_rename_p->merged_available_expr);
 }
 /* Collect unavailable registers due to liveness for EXPR from BNDS
@@ -1654,6 +1663,8 @@ find_best_reg_for_expr (expr_t expr, blist_t bnds, bool *is_orig_reg_p)
   used_regs = get_clear_regset_from_pool ();
   CLEAR_HARD_REG_SET (reg_rename_data.unavailable_hard_regs);
+  reg_rename_data.crossed_call_abis = false;
+  reg_rename_data.merged_available_expr = false;
   collect_unavailable_regs_from_bnds (expr, bnds, used_regs, _rename_data,
                                       _insns);
@@ -1861,7 +1872,7 @@ identical_copy_p (rtx_insn *insn)
 /* Undo all transformations on *AV_PTR that were done when
    moving through INSN. */
 static void
-undo_transformations (av_set_t *av_ptr, rtx_insn *insn)
+undo_transformations (av_set_t *av_ptr, rtx_insn *insn, void *static_params)
 {
   av_set_iterator av_iter;
   expr_t expr;
@@ -1940,6 +1951,8 @@ undo_transformations (av_set_t *av_ptr, rtx_insn *insn)
      copy, which was in turn substituted. The history is wrong
      in this case. Do it the hard way. */
   add = substitute_reg_in_expr (tmp_expr, insn, true);
+if (code_motion_path_driver_info == _hooks)
+  ((fur_static_params_p) static_params)->merged_available_expr = true;
   if (add)
     av_set_add (_set, tmp_expr);
   clear_expr (tmp_expr);
@@ -3273,6 +3286,7 @@ find_used_regs (insn_t insn, av_set_t orig_ops, regset used_regs,
   sparams.crossed_call_abis = 0;
   sparams.original_insns = original_insns;
   sparams.used_regs = used_regs;
+  sparams.merged_available_expr = false;
   /* Set the appropriate hooks and data. */
   code_motion_path_driver_info = _hooks;
@@ -3280,6 +3294,7 @@ find_used_regs (insn_t insn, av_set_t orig_ops, regset used_regs,
   res = code_motion_path_driver (insn, orig_ops, NULL, , );
   reg_rename_p->crossed_call_abis |= sparams.crossed_call_abis;
+  reg_rename_p->merged_available_expr |= sparams.merged_available_expr;
   gcc_assert (res == 1);
   gcc_assert (original_insns && *original_insns);
@@ -6570,7 +6585,7 @@ code_motion_path_driver (insn_t insn, av_set_t orig_ops, ilist_t path,
   {
     /* Av set ops could have been changed when moving through this
        insn. To find them below it, we have to un-substitute them. */
-    undo_transformations (_ops, insn);
+    undo_transformations (_ops, insn, static_params);
   }
   else
   {
[Bug rtl-optimization/113029] sel-sched2 ICE in verify_target_availability
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113029 --- Comment #2 from Kewen Lin --- There are several existing PRs (PR107984, PR99328, PR88652, PR84842) about ICEs in verify_target_availability, and PR84842 has a tentative patch. I adapted that patch to the latest trunk, but this test case still fails, so I am filing this PR.
[Bug rtl-optimization/113029] New: sel-sched2 ICE in verify_target_availability
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113029
Bug ID: 113029
Summary: sel-sched2 ICE in verify_target_availability
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: linkw at gcc dot gnu.org
Target Milestone: ---

Test case:

#include
#define c(d, g) g, d
#define e(d, g) g, d
vector double f, n;
int m;
int k;
void j (vector double, double, double);
vector double combine (double, double);
vector double i (double, double);
vector double l (vector double, double);
vector double o (vector double, double) {
  vector double a;
  vector double b;
  p ("");
  j (f, c (1, 2));
  j (n, c (3, 4));
  b = i (3, 4);
  j (a, e (1, 2));
  j (b, e (3, 4));
  j (l (a, 5.0), e (5, 2));
  j (o (b, 6.0), e (3, 6));
  k = vec_extract (b, 1);
  j (combine (0, k), c (2, 4));
  m = vec_extract (b, 0);
  j (i (0, m), e (1, 3));
  i (0, vec_extract (b, 1));
}

Option: -std=c89 -O2 -mcpu=power10 -fselective-scheduling2

during RTL pass: sched2
test.c: In function ‘o’:
test.c:29:1: internal compiler error: in verify_target_availability, at sel-sched.cc:1553
   29 | }
      | ^
0x10c54c43 verify_target_availability
        /home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:1553
0x10c54c43 find_best_reg_for_expr
        /home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:1667
0x10c54c43 fill_vec_av_set
        /home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:3784
0x10cb fill_ready_list
        /home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:4014
0x10cb find_best_expr
        /home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:4374
0x10cb fill_insns
        /home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:5535
0x10cb schedule_on_fences
        /home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:7353
0x10cb sel_sched_region_2
        /home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:7491
0x10c57b8b sel_sched_region_1
        /home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:7533
0x10c59723 sel_sched_region(int)
        /home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:7634
0x10c59723 sel_sched_region(int)
        /home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:7619
0x10c59beb run_selective_scheduling()
        /home/gccbuild/gcc_trunk_git/gcc/gcc/sel-sched.cc:7720
0x10c2e6ef rest_of_handle_sched2
        /home/gccbuild/gcc_trunk_git/gcc/gcc/sched-rgn.cc:3748
0x10c2e6ef execute
        /home/gccbuild/gcc_trunk_git/gcc/gcc/sched-rgn.cc:3895
[Bug rtl-optimization/112995] sel-sched2 ICE without checking verify_changes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112995 --- Comment #3 from Kewen Lin --- (In reply to Andrew Pinski from comment #2) > fselective-scheduling has so many issues. Ah, thanks a lot for pointing this out. I was testing the impact of my proposed scheduling change and found that this feature doesn't work well on Power: with it turned on by default, GCC fails to build even without bootstrap. Since these selective-scheduling options can be specified on Power, maybe we need to ensure some level of quality there. I now know that Power is not alone ;-). Scanning the PRs under the meta-bug, I noticed that more than three of them have the same or similar ICE traces as the failures I hit. Given https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110526#c4, I wonder whether they have become low priority?
[Bug rtl-optimization/112995] sel-sched2 ICE without checking verify_changes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112995 --- Comment #1 from Kewen Lin ---
Initially we have:

(insn 31 6 10 2 (set (reg/v:SI 9 9 [orig:119 c ] [119]) (reg/v:SI 64 0 [orig:119 c ] [119])) "test.i":5:5 555 {*movsi_internal1} (expr_list:REG_DEAD (reg/v:SI 64 0 [orig:119 c ] [119]) (nil)))
(insn 10 31 25 2 (set (reg:DI 10 10 [128]) (ashift:DI (sign_extend:DI (reg/v:SI 9 9 [orig:119 c ] [119])) (const_int 2 [0x2]))) "test.i":7:8 278 {ashdi3_extswsli} (nil))
(insn 25 10 27 2 (set (reg:DI 64 0 [135]) (sign_extend:DI (reg/v:SI 9 9 [orig:119 c ] [119]))) "test.i":6:5 31 {extendsidi2} (expr_list:REG_DEAD (reg/v:SI 9 9 [orig:119 c ] [119]) (nil)))

After moving up, we have:

(insn 46 0 0 (set (reg:DI 64 0 [135]) (sign_extend:DI (reg/v:SI 64 0 [orig:119 c ] [119]))) 31 {extendsidi2} (expr_list:REG_DEAD (reg/v:SI 9 9 [orig:119 c ] [119]) (nil)))

In try_replace_dest_reg, the above EXPR_INSN_RTX is updated to:

(insn 48 0 0 (set (reg:DI 32 0) (sign_extend:DI (reg/v:SI 64 0 [orig:119 c ] [119]))) 31 {extendsidi2} (nil))

This doesn't match any constraint, so it is an unexpected modification. Unfortunately, function try_replace_dest_reg only checks the original insn, with:

if (REGNO (best_reg) != REGNO (INSN_LHS (orig_insn)) && (! replace_src_with_reg_ok_p (orig_insn, best_reg) || ! replace_dest_with_reg_ok_p (orig_insn, best_reg)))

It doesn't check EXPR_INSN_RTX. I think it assumes that if the replacement is fine for the original insn then the same change on EXPR_INSN_RTX is fine too, but that isn't true, as the given test case shows.
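The gap described in this comment can be illustrated with a toy model. This is only a sketch under stated assumptions, not GCC code: the register classes, every `toy_` name and the constraint table are invented for illustration, chosen so that the original insn accepts the new destination while the moved-up copy does not. The last function shows one possible guard, which validates both copies before committing to a replacement register.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical register classes; not the real rs6000 classes.  */
enum toy_class { TOY_GPR, TOY_FPR, TOY_VR };

/* Hypothetical insn: one destination and one source register class.  */
struct toy_insn
{
  enum toy_class dest;
  enum toy_class src;
};

/* Invented constraint table: a GPR source may feed any destination,
   a VR source may only feed a VR destination, nothing else matches.  */
static bool
toy_satisfies_constraints (struct toy_insn insn)
{
  if (insn.src == TOY_GPR)
    return true;
  return insn.src == TOY_VR && insn.dest == TOY_VR;
}

/* Would INSN still match after its destination is replaced by BEST?  */
static bool
toy_replace_dest_ok (const struct toy_insn *insn, enum toy_class best)
{
  assert (insn);
  struct toy_insn copy = *insn;
  copy.dest = best;
  return toy_satisfies_constraints (copy);
}

/* One possible guard: accept BEST only if both the original insn and
   the moved-up expression copy remain valid.  */
static bool
toy_try_replace_dest (const struct toy_insn *orig,
                      const struct toy_insn *expr, enum toy_class best)
{
  return toy_replace_dest_ok (orig, best) && toy_replace_dest_ok (expr, best);
}
```

In this model, checking only the original insn would accept a destination that the moved-up copy cannot use, which mirrors the situation the comment reports.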
[Bug rtl-optimization/112995] sel-sched2 ICE without checking verify_changes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112995 Kewen Lin changed:
What|Removed|Added
Known to fail||11.4.0
Last reconfirmed||2023-12-13
Status|UNCONFIRMED|ASSIGNED
Keywords||ice-on-valid-code
CC||amonakov at gcc dot gnu.org, bergner at gcc dot gnu.org, segher at gcc dot gnu.org
Target||powerpc64le-linux-gnu
Ever confirmed|0|1
Assignee|unassigned at gcc dot gnu.org|linkw at gcc dot gnu.org
[Bug rtl-optimization/112995] New: sel-sched2 ICE without checking verify_changes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112995
Bug ID: 112995
Summary: sel-sched2 ICE without checking verify_changes
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: linkw at gcc dot gnu.org
Target Milestone: ---

With selective scheduling 2 enabled by default, a non-bootstrap build of gcc fails on Power10. One reduced test case is listed below:

int a[];
int b(__ieee128 e) {
  int c;
  __ieee128 d;
  c = e;
  d = c;
  d = a[c] + d;
  return d;
}

Option: -O2 -S -fselective-scheduling2 -mcpu=power10 (or -mcpu=power9)

ICE reason:

test.c:9:1: error: insn does not satisfy its constraints:
    9 | }
      | ^
(insn 48 0 0 (set (reg:DI 32 0) (sign_extend:DI (reg/v:SI 64 0 [orig:119 c ] [119]))) 31 {extendsidi2} (nil))
[Bug target/112993] rs6000: Rework precision for 128bit float types and modes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112993 Bug 112993 depends on bug 112788, which changed state.
Bug 112788 Summary: [14 regression] ICEs in fold_range, at range-op.cc:206 after r14-5972-gea19de921b01a6 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112788
What|Removed|Added
Status|ASSIGNED|RESOLVED
Resolution|---|FIXED
[Bug tree-optimization/112788] [14 regression] ICEs in fold_range, at range-op.cc:206 after r14-5972-gea19de921b01a6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112788 Kewen Lin changed:
What|Removed|Added
Resolution|---|FIXED
Status|ASSIGNED|RESOLVED
--- Comment #7 from Kewen Lin --- This should be fixed on latest trunk. We should get rid of this workaround in the next release; that is tracked in PR112993.
[Bug target/112993] rs6000: Rework precision for 128bit float types and modes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112993 Kewen Lin changed:
What|Removed|Added
Last reconfirmed||2023-12-13
Ever confirmed|0|1
Keywords|build, ice-checking, ice-on-valid-code|internal-improvement
Assignee|unassigned at gcc dot gnu.org|linkw at gcc dot gnu.org
Status|UNCONFIRMED|ASSIGNED
[Bug target/112993] New: rs6000: Rework precision for 128bit float types and modes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112993
Bug ID: 112993
Summary: rs6000: Rework precision for 128bit float types and modes
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Keywords: build, ice-checking, ice-on-valid-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: linkw at gcc dot gnu.org
CC: amacleod at redhat dot com, andy at gwentswordclub dot co.uk, bergner at gcc dot gnu.org, linkw at gcc dot gnu.org, meissner at gcc dot gnu.org, segher at gcc dot gnu.org, seurer at gcc dot gnu.org, tschwinge at gcc dot gnu.org
Depends on: 112788
Target Milestone: ---
Host: powerpc64le-linux-gnu
Target: powerpc64le-linux-gnu
Build: powerpc64le-linux-gnu

+++ This bug was initially created as a clone of Bug #112788 +++

As PR112788 and the review comments from Andrew and Jakub at https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640342.html show, we should get rid of the workaround for PR112788 from GCC 15 onwards. This PR is filed to track that. We expect the precision of those types and modes to all be 128 bits, and TFmode to become a macro conditionally defined as IFmode or KFmode.

Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112788 [Bug 112788] [14 regression] ICEs in fold_range, at range-op.cc:206 after r14-5972-gea19de921b01a6
[Bug tree-optimization/112788] [14 regression] ICEs in fold_range, at range-op.cc:206 after r14-5972-gea19de921b01a6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112788 --- Comment #5 from Kewen Lin --- One workaround patch was posted at https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639140.html. We also found that, with ieee128 as the default long double format, the culprit commit prevents the libquadmath library from being built on a system with ieee128 libs, and consequently there are a lot of Fortran test failures. The workaround also fixes some failures that existed previously:

UNRESOLVED->NA: 20_util/from_chars/8.cc -std=gnu++23 compilation failed to produce executable
NA->PASS: 20_util/from_chars/8.cc -std=gnu++23 execution test
FAIL->PASS: 20_util/from_chars/8.cc -std=gnu++23 (test for excess errors)
UNRESOLVED->NA: 20_util/from_chars/8.cc -std=gnu++26 compilation failed to produce executable
NA->PASS: 20_util/from_chars/8.cc -std=gnu++26 execution test
FAIL->PASS: 20_util/from_chars/8.cc -std=gnu++26 (test for excess errors)
UNRESOLVED->NA: 20_util/to_chars/float128_c++23.cc -std=gnu++23 compilation failed to produce executable
NA->PASS: 20_util/to_chars/float128_c++23.cc -std=gnu++23 execution test
FAIL->PASS: 20_util/to_chars/float128_c++23.cc -std=gnu++23 (test for excess errors)
UNRESOLVED->NA: 20_util/to_chars/float128_c++23.cc -std=gnu++26 compilation failed to produce executable
NA->PASS: 20_util/to_chars/float128_c++23.cc -std=gnu++26 execution test
FAIL->PASS: 20_util/to_chars/float128_c++23.cc -std=gnu++26 (test for excess errors)
[Bug tree-optimization/112788] [14 regression] ICEs in fold_range, at range-op.cc:206 after r14-5972-gea19de921b01a6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112788 Kewen Lin changed:
What|Removed|Added
Assignee|unassigned at gcc dot gnu.org|linkw at gcc dot gnu.org
Status|NEW|ASSIGNED
Last reconfirmed|2023-12-03 00:00:00|2023-12-01 0:00
--- Comment #4 from Kewen Lin ---
(In reply to Andrew Macleod from comment #2)
> (In reply to Kewen Lin from comment #1)
> >
> > ranger makes use of type precision directly instead of something like
> > types_compatible_p. I wonder if we can introduce a target hook (or hookpod)
> > to make ranger unrestrict this check a bit, the justification is that for
> > float type its precision information is encoded in its underlying
> > real_format, if two float types underlying modes are the same, the precision
> > are actually the same.
> >
> > btw, the operand_check_ps seems able to call range_compatible_p?
>
> It could, but just a precision check seemed enough at the time.
> The patch also went thru many iterations and it was only the final version
> that operand_check_p() ended up with types as the parameter rather than
> ranges.
>
> You bring up a good point tho. I just switched those routines to call
> range_compatible_p() and checked it in. Now it is all centralized in the
> one routine going forward.

Nice! Thanks a lot for your prompt enhancement!

> It does seem wrong that the float precision don't match, and weird that its
> hard to fix :-)

Yes, I dislike it too. I thought it was not sensible and tried to fix it, but as the discussion in the thread mentioned above shows, there are historical reasons and practical issues that make it hard to fix. At the time, Segher mentioned he had some patches to avoid different modes having the same format, but he had hit some issues and would retry. Now that stage 1 has passed again, I guess we have to live with it in this release.

> It should now be possible to have range_compatible_p check
> the underlying mode for floats rather than the precision... If there's a
> good argument for it, and you want to give that a shot...

I have to admit this idea is just a workaround. Even though the actual float precision is encoded in the format associated with the underlying mode, it is still unexpected to have two types with the same underlying mode but different type precision. I'm going to make and test a workaround patch, since this affected the build again as reported. :(
[Bug tree-optimization/112788] [14 regression] ICEs in fold_range, at range-op.cc:206 after r14-5972-gea19de921b01a6
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112788 Kewen Lin changed:
What|Removed|Added
Status|UNCONFIRMED|NEW
Ever confirmed|0|1
CC||linkw at gcc dot gnu.org, meissner at gcc dot gnu.org, segher at gcc dot gnu.org
Last reconfirmed||2023-12-01
--- Comment #1 from Kewen Lin ---
Confirmed. A reduced test case:

long double a, b, c;
long double d() { return -__builtin_fmaf128_round_to_odd(c, b, a); }

c.0_1 = c;
b.1_2 = b;
a.2_3 = a;
_4 = __builtin_fmaf128_round_to_odd (c.0_1, b.1_2, a.2_3);
_6 = -_4;
return _6;

206├───> gcc_assert (m_operator->operand_check_p (type, lh.type (), rh.type ()));

stmt: _6 = -_4;
(gdb) pge lh.type()
_Float128
(gdb) pge rh.type()
long double

The root cause is the same as in PR107299: TYPE_PRECISION of rh.type is 127 while that of lh.type is 128. There were earlier attempts to fix this precision difference, but they did not succeed, for example: https://inbox.sourceware.org/gcc-patches/718677e7-614d-7977-312d-05a75e1fd...@linux.ibm.com/.

ranger makes use of type precision directly instead of something like types_compatible_p. I wonder if we can introduce a target hook (or hookpod) to let ranger relax this check a bit. The justification is that, for a float type, the precision information is encoded in its underlying real_format, so if the underlying modes of two float types are the same, their precisions are actually the same.

btw, the operand_check_ps seems able to call range_compatible_p?
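The difference between the two compatibility checks discussed in this comment can be sketched with a small model. Everything here is an assumption made for illustration: `struct toy_float_type`, the `mode_id` field and both predicates are hypothetical and are not ranger's or GCC's real data structures. The precision values 127 and 128 are the ones reported above, and the two types are assumed to share one underlying mode, as the comment describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a floating-point type; not GCC's tree type.  */
struct toy_float_type
{
  unsigned precision; /* models the value reported by TYPE_PRECISION */
  int mode_id;        /* models the underlying machine mode (invented id) */
};

/* Strict check: the two types must report the same precision.  */
static bool
toy_compatible_by_precision (const struct toy_float_type *a,
                             const struct toy_float_type *b)
{
  assert (a && b);
  return a->precision == b->precision;
}

/* Relaxed check: the two types only need the same underlying mode,
   on the grounds that the mode's format fixes the real precision.  */
static bool
toy_compatible_by_mode (const struct toy_float_type *a,
                        const struct toy_float_type *b)
{
  assert (a && b);
  return a->mode_id == b->mode_id;
}
```

With a 128-bit and a 127-bit type sharing one mode, the strict check rejects the pair and the relaxed check accepts it, which is the trade-off the comment raises.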
[Bug target/112778] ICE in ppc64-linux-gnu crosscompiler in store_by_pieces since r14-5946-g1ff6d9f7428b06
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112778 Kewen Lin changed:
What|Removed|Added
CC||aoliva at gcc dot gnu.org, bergner at gcc dot gnu.org, linkw at gcc dot gnu.org, segher at gcc dot gnu.org
Ever confirmed|0|1
Status|UNCONFIRMED|NEW
Summary|ICE in ppc64-linux-gnu crosscompiler in store_by_pieces, at expr.cc:1820|ICE in ppc64-linux-gnu crosscompiler in store_by_pieces since r14-5946-g1ff6d9f7428b06
Keywords|needs-bisection|
Last reconfirmed||2023-12-01
--- Comment #1 from Kewen Lin ---
Confirmed, thanks for reporting. It starts from r14-5946-g1ff6d9f7428b06. It looks like function try_store_by_multiple_pieces makes a wrong assumption. For the code "memset (buf, 'v', 3)", it checks

+ if (max_bits < orig_max_bits
+ && xlenest + blksize >= xlenest
+ && can_store_by_pieces (xlenest + blksize,
+ builtin_memset_read_str,
+ , align, true))

The check succeeds and the code breaks out. Later it goes on with blksize alone:

to = store_by_pieces (to, blksize, constfun, constfundata, align, true, max_len != 0 ? RETURN_END : RETURN_BEGIN);

and this fails the targetm.use_by_pieces_infrastructure_p assertion. The conclusion is that can_store_by_pieces (xlenest + blksize, ...) does not necessarily imply can_store_by_pieces (blksize, ...).
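The conclusion above, that a predicate holding for a sum need not hold for one of its parts, can be shown with a toy cost model. This is a sketch under loud assumptions: the `toy_` functions and the "count of power-of-two pieces" budget are invented here and do not reflect how can_store_by_pieces or targetm.use_by_pieces_infrastructure_p actually decide. The last function shows one possible guard that checks the length that is really going to be stored; it is not claimed to be the fix applied in GCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Invented cost model: number of power-of-two sized pieces needed to
   cover LEN bytes, i.e. the number of set bits in LEN.  */
static unsigned
toy_piece_count (size_t len)
{
  unsigned n = 0;
  for (; len != 0; len &= len - 1)
    n++;
  return n;
}

/* Hypothetical predicate: LEN is acceptable if it fits the budget.  */
static bool
toy_can_store_by_pieces (size_t len, unsigned max_pieces)
{
  return toy_piece_count (len) <= max_pieces;
}

/* One possible guard: besides the combined length, also check the
   block size that will later be stored on its own.  */
static bool
toy_safe_to_store (size_t xlenest, size_t blksize, unsigned max_pieces)
{
  assert (xlenest + blksize >= xlenest); /* no wrap-around */
  return toy_can_store_by_pieces (xlenest + blksize, max_pieces)
         && toy_can_store_by_pieces (blksize, max_pieces);
}
```

In this model a length of 4 fits a budget of one piece while a length of 3 needs two, so a check on `xlenest + blksize` equal to 4 passes even though a later store of `blksize` equal to 3 would not.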