[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217 Jakub Jelinek changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #16 from Jakub Jelinek --- Fixed.
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217 --- Comment #15 from CVS Commits --- The releases/gcc-11 branch has been updated by Ilya Leoshkevich : https://gcc.gnu.org/g:445ce3cfb6832ec838caa10f2337c3bd00213517 commit r11-8360-g445ce3cfb6832ec838caa10f2337c3bd00213517 Author: Ilya Leoshkevich Date: Mon Apr 26 14:12:08 2021 +0200 IBM Z: Handle hard registers in s390_md_asm_adjust() gen_fprx2_to_tf() and gen_tf_to_fprx2() cannot handle hard registers, since the subregs they create do not pass validation. Change s390_md_asm_adjust() to manually copy between hard VRs and FPRs instead of using these two functions. gcc/ChangeLog: PR target/100217 * config/s390/s390.c (s390_hard_fp_reg_p): New function. (s390_md_asm_adjust): Handle hard registers. gcc/testsuite/ChangeLog: PR target/100217 * gcc.target/s390/vector/long-double-asm-in-out-hard-fp-reg.c: New test. * gcc.target/s390/vector/long-double-asm-inout-hard-fp-reg.c: New test.
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217 --- Comment #14 from CVS Commits --- The master branch has been updated by Ilya Leoshkevich : https://gcc.gnu.org/g:4f48c335d36674f90046b2823f0ac1c0545dc082 commit r12-379-g4f48c335d36674f90046b2823f0ac1c0545dc082 Author: Ilya Leoshkevich Date: Mon Apr 26 14:12:08 2021 +0200 IBM Z: Handle hard registers in s390_md_asm_adjust() gen_fprx2_to_tf() and gen_tf_to_fprx2() cannot handle hard registers, since the subregs they create do not pass validation. Change s390_md_asm_adjust() to manually copy between hard VRs and FPRs instead of using these two functions. gcc/ChangeLog: PR target/100217 * config/s390/s390.c (s390_hard_fp_reg_p): New function. (s390_md_asm_adjust): Handle hard registers. gcc/testsuite/ChangeLog: PR target/100217 * gcc.target/s390/vector/long-double-asm-in-out-hard-fp-reg.c: New test. * gcc.target/s390/vector/long-double-asm-inout-hard-fp-reg.c: New test.
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217 --- Comment #13 from Mark Wielaard --- (In reply to Jakub Jelinek from comment #12) > For valgrind, the quick workaround would be -march=z13 when compiling the > s390x tests that have register long double variables. Yes, this works, if fpext and pfpo are build with -march=z13 they compile (and the tests pass under valgrind).
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217 --- Comment #12 from Jakub Jelinek --- For valgrind, the quick workaround would be -march=z13 when compiling the s390x tests that have register long double variables.
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217 --- Comment #11 from Mark Wielaard --- BTW. Disabling that test in valgrind produces another crash in another test that looks similar: gcc -DHAVE_CONFIG_H -I. -I../../.. -I../../.. -I../../../include -I../../../coregrind -I../../../include -I../../../VEX/pub -I../../../VEX/pub -DVGA_s390x=1 -DVGO_linux=1 -DVGP_s390x_linux=1 -DVGPV_s390x_linux_vanilla=1 -Winline -Wall -Wshadow -Wno-long-long -g -m64 -c -o pfpo.o pfpo.c during RTL pass: expand pfpo.c: In function 'main': pfpo.c:35:3: internal compiler error: in gen_rtx_SUBREG, at emit-rtl.c:1023 35 | asm volatile(".short 0x010a\n\t" \ | ^~~ pfpo.c:146:15: note: in expansion of macro 'PFPO' 146 | d32 = PFPO(f128_in[j], long double, _Decimal32, PFPO_F128_TO_D32, | ^~~~ The whole PFPO define is: #define PFPO(initial, src_type, dst_type, fn_code, round, ret_code, cc) \ ({ \ register src_type src_reg asm("f4") = initial;\ register dst_type dst_reg asm("f0"); \ register unsigned long fn asm("0") = fn_code | (round & 0xf); \ register unsigned int ret asm("1"); \ asm volatile(".short 0x010a\n\t" \ "ipm %2\n\t" \ "srl %2,28\n\t" \ :"=f"(dst_reg), "=d"(ret), "=d" (cc) \ : "f"(src_reg), "d"(fn));\ ret_code = ret; \ dst_reg; \ })
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217 Jakub Jelinek changed: What|Removed |Added Target Milestone|11.0|11.2 --- Comment #10 from Jakub Jelinek --- GCC 11.1 has been released, retargeting bugs to GCC 11.2.
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217 --- Comment #9 from Ilya Leoshkevich --- Created attachment 50679 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50679=edit regtesting this patch now
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217 --- Comment #8 from Ilya Leoshkevich --- Yeah, inline asm seems to be problematic: /home/iii/gcc/build/gcc/xgcc -B/home/iii/gcc/build/gcc/ /home/iii/gcc/gcc/testsuite/gcc.target/s390/vector/long-double-asm-hardreg.c -fdiagnostics-plain-output -O2 -march=z14 -mzarch -S -o long-double-asm-hardreg.s with the patch from comment 2 produces: foo: .LFB0: .cfi_startproc larl%r5,.L4 vl %v0,.L5-.L4(%r5),3 #APP # 10 "/home/iii/gcc/gcc/testsuite/gcc.target/s390/vector/long-double-asm-hardreg.c" 1 # %v0 # 0 "" 2 #NO_APP br %r14 `vl %v0,.L5-.L4(%r5),3` loads 1.0L into %v0[0:128]. However, it should be loaded into %v0[0:64] . %v2[0:64]. With the patch from comment 3 I get: foo: .LFB0: .cfi_startproc larl%r5,.L4 ld %f0,.L5-.L4(%r5) ld %f2,.L5-.L4+8(%r5) #APP # 10 "/home/iii/gcc/gcc/testsuite/gcc.target/s390/vector/long-double-asm-hardreg.c" 1 # %f0 # 0 "" 2 #NO_APP br %r14 which is correct, but in general case the exact reg that the user requested is not honored.
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217 --- Comment #7 from Jakub Jelinek --- That said, I'm afraid I don't really understand what wrong happens with the patch I've attached. Trying something like: long double foo (void) { register long double f0 asm ("f0"); f0 = 1.0L; f0 += 127.L; f0 *= 32.L; return f0; } with -O0 -march=z14 -mlong-double-128 so that it is not all folded immediately shows in the end the computations are done in vector registers. And another thing to try is intermix that with inline asm expecting those in "+f" so that intermediate results are pushed to the floating point register pair.
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217 --- Comment #6 from Jakub Jelinek --- (In reply to Ilya Leoshkevich from comment #5) > That would be an ideal solution, but I wonder how to implement it? Suppose > we find a way to convince expand to pick FPRX2mode for such a long double. > What if the following comes up? > > register long double x asm ("v0"); /* FPRX2mode */ > long double y; /* TFmode */ > x += y; /* convert? */ > > Would it be feasible to also teach expand to do the mode conversions? It is certainly doable, but perhaps with extra target hooks or something similar. Types have their TYPE_MODE and decls have DECL_MODE, though the question is what breaks if TYPE_MODE != DECL_MODE, at least the comment in tree.h says that they can only differ for FIELD_DECLs. Anyway, in GIMPLE register vars are non-SSA, so apart from inline asm one needs separate loads and stores to them, so if we could expand those as having FPRX2 hard reg and loads from it convert to TFmode and stores into it convert from TFmode, ... > One other alternative might be to detect `register long double asm("fN")` > declarations and go back to using floating point register pairs for > functions that contain them. But this might be actually best short-time solution (for GCC 11.x).
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217 --- Comment #5 from Ilya Leoshkevich --- That would be an ideal solution, but I wonder how to implement it? Suppose we find a way to convince expand to pick FPRX2mode for such a long double. What if the following comes up? register long double x asm ("v0"); /* FPRX2mode */ long double y; /* TFmode */ x += y; /* convert? */ Would it be feasible to also teach expand to do the mode conversions? One other alternative might be to detect `register long double asm("fN")` declarations and go back to using floating point register pairs for functions that contain them.
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217 Jakub Jelinek changed: What|Removed |Added Priority|P3 |P2
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217 --- Comment #4 from Jakub Jelinek --- That seems like quite undesirable API change. Can't the backend when it sees long double register vars for the fN registers change the mode from TFmode to that new FPRX2mode, so that old code keeps working?
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217 Ilya Leoshkevich changed: What|Removed |Added CC||iii at linux dot ibm.com --- Comment #3 from Ilya Leoshkevich --- There main problem here is that `register long double f0 asm ("f0")` does not make sense on z14 anymore. long doubles are stored in vector registers now, not in floating-point register pairs. If we skip the hard reg, the code will end up having the following semantics: vr0[0:128] = 1.0L; asm("/* expect the value in vr0[0:64] . vr2[0:64] */"); and fail during the run time. So I think it's better to use the "best effort" approach and force it into a pseudo, even if this would mean that the user-specified register is not honored: --- a/gcc/config/s390/s390.c +++ b/gcc/config/s390/s390.c @@ -16814,6 +16814,12 @@ s390_md_asm_adjust (vec , vec , gcc_assert (allows_reg); /* Copy input value from a vector register into a FPR pair. */ rtx fprx2 = gen_reg_rtx (FPRX2mode); + if (REG_P (inputs[i]) && HARD_REGISTER_P (inputs[i])) + { + rtx orig_input = inputs[i]; + inputs[i] = gen_reg_rtx (TFmode); + emit_move_insn (inputs[i], orig_input); + } emit_insn (gen_tf_to_fprx2 (fprx2, inputs[i])); inputs[i] = fprx2; input_modes[i] = FPRX2mode; I need to check whether we can keep the output logic as is. Ideally the code should be adapted and use the __LONG_DOUBLE_VX__ macro like this: #ifdef __LONG_DOUBLE_VX__ register long double f0 asm ("v0"); #else register long double f0 asm ("f0"); #endif f0 = 1.0L; #ifdef __LONG_DOUBLE_VX__ asm("" : : "v" (f0)); #else asm("" : : "f" (f0)); #endif Maybe a warning recommending to do this should be printed.
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217 --- Comment #2 from Jakub Jelinek --- Created attachment 50658 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50658=edit gcc11-pr100217.patch Untested fix. IMHO when we have a hard reg in the inline asm, we just need to honor it, trying to force it into a pseudo and then subreg would just mean the user chosen reg is not guaranteed anymore.
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217 --- Comment #1 from Jakub Jelinek --- Bet the new s390_md_asm_adjust code is unprepared to see hard registers there.
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217 Jakub Jelinek changed: What|Removed |Added Target Milestone|--- |11.0 Ever confirmed|0 |1 Status|UNCONFIRMED |NEW CC||iii at gcc dot gnu.org, ||krebbel at gcc dot gnu.org Last reconfirmed||2021-04-22