http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52285
--- Comment #12 from Steven Bosscher <steven at gcc dot gnu.org> 2012-11-13 23:37:52 UTC ---
Created attachment 28678
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28678
Gross hack

(In reply to comment #11)
> If loops are still around at LRA time, perhaps LRA should consider putting
> it before loop if register pressure is low, or LIM could just have extra
> code for this

Unfortunately, loops are destroyed _just_ before LRA, at the end of IRA. IRA
has its own loop tree but that is destroyed before LRA, too.

> I'm not saying it must be LIM, I'm
> just looking for suggestions where to perform this.

LIM may be too early. I've experimented with the attached patch (based off
some other patch for invariant addresses that was bit-rotting on a shelf),
and I had to resort to some crude hacks to make loop-invariant even just
consider moving the bare frame_pointer_rtx, such as manually setting the cost
to something high, because set_src_cost(frame_pointer_rtx)==0.

The result is this code:

foo:
        leaq    -72(%rsp), %rcx
        leaq    -8(%rsp), %rdx  // A Pyrrhic victory...
        .p2align 4,,10
        .p2align 3
.L5:
        movq    %rcx, %rax
        .p2align 4,,10
        .p2align 3
.L3:
        movb    $0, (%rax)
        addq    $1, %rax
        cmpq    %rdx, %rax
        jne     .L3
        subl    $64, %edi
        testl   %edi, %edi
        jg      .L5
        rep ret

Need to think about this a bit more; perhaps postreload-gcse can be used for
this instead of LIM...
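
For readers who prefer C to assembly, here is a rough sketch of the loop structure the listing above implements. It is read back from the instructions only, not taken from the bug's test case, and the names foo_sketch, BUF_SIZE and passes are made up for illustration. The return value is not in the assembly either; it is added only so the outer loop's trip count can be checked.

```c
#include <stddef.h>

/* Hypothetical names: BUF_SIZE mirrors the 64-byte span between
   -72(%rsp) and -8(%rsp) in the listing. */
enum { BUF_SIZE = 64 };

/* Assumed C rendering of the assembly. Returns the number of
   outer-loop passes (an addition for checking, not in the asm). */
int foo_sketch(int bytes)
{
    volatile char buf[BUF_SIZE];
    volatile char *const begin = buf;            /* leaq -72(%rsp), %rcx */
    volatile char *const end = buf + BUF_SIZE;   /* leaq -8(%rsp), %rdx  */
    int passes = 0;

    do {                                         /* .L5 */
        volatile char *p = begin;                /* movq %rcx, %rax */
        do {                                     /* .L3 */
            *p = 0;                              /* movb $0, (%rax) */
            p++;                                 /* addq $1, %rax */
        } while (p != end);                      /* cmpq %rdx, %rax; jne .L3 */
        bytes -= BUF_SIZE;                       /* subl $64, %edi */
        passes++;
    } while (bytes > 0);                         /* testl %edi, %edi; jg .L5 */

    return passes;
}
```

In this reading, both address computations (begin and end) sit outside the two loops, which is what the hack set out to achieve, while each outer pass still needs a register copy to restart the inner pointer.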