https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123506
Roger Sayle <roger at nextmovesoftware dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |roger at nextmovesoftware dot com
--- Comment #6 from Roger Sayle <roger at nextmovesoftware dot com> ---
Many thanks to Jakub for bringing this bug to my attention, and recognizing
that it's related to my proposed solution for PR 122454. As an extremely rough
proof of concept, the short patch below:
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 7d84ad9e6fc..0606536c34a 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -3038,6 +3038,7 @@ emit_group_load_1 (rtx *tmps, rtx dst, rtx orig_src, tree type,
   src = orig_src;
   if (!MEM_P (orig_src)
       && (!REG_P (orig_src) || HARD_REGISTER_P (orig_src))
+      && GET_CODE (orig_src) != CONCAT
       && !CONSTANT_P (orig_src))
     {
       gcc_assert (GET_MODE (orig_src) != VOIDmode);
@@ -3139,6 +3140,16 @@ emit_group_load (rtx dst, rtx src, tree type, poly_int64 ssize)
   rtx *tmps;
   int i;
+  // RAS 2026
+  if (REG_P (src) && GET_MODE (src) == TImode) {
+    rtx lo = gen_reg_rtx (DImode);
+    rtx hi = gen_reg_rtx (DImode);
+    rtx lo_src = simplify_gen_subreg (DImode, src, TImode, 0);
+    rtx hi_src = simplify_gen_subreg (DImode, src, TImode, 8);
+    emit_insn (gen_parmovdi4 (lo, lo_src, hi, hi_src));
+    src = gen_rtx_CONCAT (TImode, lo, hi);
+  }
+
   tmps = XALLOCAVEC (rtx, XVECLEN (dst, 0));
   emit_group_load_1 (tmps, dst, src, type, ssize);
resolves the issue (TImode result passing preventing the use of lea), and we
would now produce at -O3:
bar:    movl    %esi, %eax
        movl    $11621, %edx
        testl   %esi, %esi
        jns     .L2
        negl    %eax
        movl    $11109, %edx
.L2:    movw    %dx, (%rdi)
        leaq    2(%rdi), %rdx
        ret
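For anyone reading the listing without the testcase to hand, here is a minimal C++ model of what those nine instructions compute. It is read off the assembly only and is not the testcase from the PR; the names Result and bar_model are made up for illustration. It assumes the x86_64 System V convention, where a 16-byte aggregate of this shape comes back in %rax:%rdx, which is where the TImode value enters the picture.

```cpp
// Hypothetical 16-byte aggregate: first eightbyte in %rax, second in %rdx
// under the x86_64 System V ABI (an assumption, not taken from the PR).
struct Result {
  unsigned value;  // what the listing leaves in %eax
  char *next;      // what the listing leaves in %rdx
};

// Model of the listing: store two bytes at p, return |e| and p + 2.
// 11621 == 0x2D65 stores 'e','-' and 11109 == 0x2B65 stores 'e','+'
// (little-endian 16-bit store via movw).
inline Result bar_model(char *p, int e) {
  unsigned v = static_cast<unsigned>(e);  // movl %esi, %eax
  char sign = '-';                        // movl $11621, %edx
  if (e < 0) {                            // testl / jns .L2
    v = 0u - v;                           // negl %eax
    sign = '+';                           // movl $11109, %edx
  }
  p[0] = 'e';                             // movw %dx, (%rdi)
  p[1] = sign;
  return Result{v, p + 2};                // leaq 2(%rdi), %rdx
}
```

The point of the model is only that the two result halves are computed independently, so nothing forces them through a single 128-bit pseudo on the way out.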
Of course, the expr.cc change above is x86_64-specific and is not suitable for
submission as written, but it proves that a new targetm.emit_group_load target
hook, on top of the proposed parmov<mode>4 patch for PR 122454, could be used
to resolve this issue (once fully tested across multiple platforms).
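As an aside, the lo/hi split that the proof of concept performs with the two subregs at byte offsets 0 and 8 can be modelled outside the compiler. The sketch below is a standalone illustration, not GCC internals: DIPair, split_ti and join_di are hypothetical names, and it assumes a little-endian host and the unsigned __int128 extension provided by GCC and Clang.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a pair of 64-bit (DImode-sized) values.
struct DIPair {
  std::uint64_t lo;  // bytes 0..7 of the 128-bit value
  std::uint64_t hi;  // bytes 8..15 of the 128-bit value
};

// Take the two 8-byte pieces at byte offsets 0 and 8, mirroring the
// offsets used in the patch. On a little-endian host, offset 0 is the
// low half and offset 8 is the high half.
inline DIPair split_ti(unsigned __int128 v) {
  unsigned char bytes[16];
  std::memcpy(bytes, &v, sizeof bytes);
  DIPair p;
  std::memcpy(&p.lo, bytes + 0, 8);
  std::memcpy(&p.hi, bytes + 8, 8);
  return p;
}

// Inverse operation: reassemble the 128-bit value from its halves.
inline unsigned __int128 join_di(DIPair p) {
  return (static_cast<unsigned __int128>(p.hi) << 64) | p.lo;
}
```

Once the value is held as two independent halves, each half can be moved or computed on its own, which is the property the CONCAT in the patch is meant to expose to later passes.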