http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46098
Uros Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |ASSIGNED
         Resolution|FIXED                       |

--- Comment #9 from Uros Bizjak <ubizjak at gmail dot com> 2012-05-14 18:13:50 UTC ---
(In reply to comment #8)
> Fixed.

This fix is not optimal. For the testcase we generate:

    movsd   (%rax), %xmm0
    movhpd  8(%rax), %xmm0
    movupd  %xmm0, -16(%rbp)
    movapd  -16(%rbp), %xmm0

So, we move (unaligned!) memory to a register, and then use movupd to store
to an aligned stack slot. Luckily, gcc figures out that the load is from
unaligned memory and generates the movsd/movhpd combo.

The intention was to generate:

    movupd  (%rax), %xmm0
    movapd  %xmm0, -16(%rbp)
    movapd  -16(%rbp), %xmm0

So, we don't want a fixup in the expander, but we should always load to a
register for the "load" builtin class.

The patch should be reverted and the following patch should be applied
instead:

Index: i386.c
===================================================================
--- i386.c    (revision 187465)
+++ i386.c    (working copy)
@@ -29472,8 +29472,8 @@ ix86_expand_special_args_builtin (const struct bui
       arg_adjust = 0;
       if (optimize
           || target == 0
-          || GET_MODE (target) != tmode
-          || !insn_p->operand[0].predicate (target, tmode))
+          || !register_operand (target, tmode)
+          || GET_MODE (target) != tmode)
         target = gen_reg_rtx (tmode);
     }

I will undo the (arguably small) damage in all release branches.
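
For anyone who wants to look at the code generated for an unaligned vector load, here is a minimal sketch. It is not the testcase from this PR. It assumes an x86 target with SSE2 and uses the standard _mm_loadu_pd intrinsic from <emmintrin.h>; the helper name sum_unaligned_pair is hypothetical and exists only for illustration.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; x86 targets only */

/* Hypothetical helper, not taken from the PR: load two doubles from a
   possibly unaligned address and return the sum of both lanes.  */
double
sum_unaligned_pair (const double *p)
{
  /* Unaligned 128-bit load; p need not be 16-byte aligned.  */
  __m128d v = _mm_loadu_pd (p);
  double lanes[2];

  /* Unaligned store back to ordinary memory so the lanes can be read.  */
  _mm_storeu_pd (lanes, v);
  return lanes[0] + lanes[1];
}
```

Compiling this with something like "gcc -O0 -msse2 -S" should show which load/store sequence the compiler picks for the intrinsic. The exact instructions depend on the GCC version and options, so no particular output is claimed here.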