https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111231

--- Comment #19 from Richard Earnshaw <rearnsha at gcc dot gnu.org> ---
This is another problem with (I suspect) incorrect aliasing information.  If I
compile with -fno-strict-aliasing, I get

  88:   f4432a1f        vst1.8  {d18-d19}, [r3 :64]     // {>E}   SP+96/16
  8c:   f4420a1f        vst1.8  {d16-d17}, [r2 :64]     // {>A}   SP+32/16
  90:   e893000f        ldm     r3, {r0, r1, r2, r3}    // {<E}   SP+96/16
  94:   e884000f        stm     r4, {r0, r1, r2, r3}    // {>G}   SP+128/16
  98:   eddd0b20        vldr    d16, [sp, #128] ; 0x80  // {<G.l} SP+128/8
  9c:   eddd1b22        vldr    d17, [sp, #136] ; 0x88  // {<G.h} SP+136/8
  a0:   e88c000f        stm     ip, {r0, r1, r2, r3}    // {>B}   SP+48/16
  a4:   e28dc040        add     ip, sp, #64     ; 0x40
  a8:   e885000f        stm     r5, {r0, r1, r2, r3}    // {>F}   SP+112/16
  ac:   f2d80570        vshl.s16        q8, q8, #8
  b0:   f3f503e0        vneg.s16        q8, q8
  b4:   edcd0b20        vstr    d16, [sp, #128] ; 0x80  // {>G.l} SP+128/8
  b8:   edcd1b22        vstr    d17, [sp, #136] ; 0x88  // {>G.h} SP+136/8
  bc:   e894000f        ldm     r4, {r0, r1, r2, r3}    // {<G}   SP+128/16
  c0:   e88c000f        stm     ip, {r0, r1, r2, r3}    // {>C}   SP+64/16
  c4:   e28dc050        add     ip, sp, #80     ; 0x50
  c8:   e88c000f        stm     ip, {r0, r1, r2, r3}    // {>D}   SP+80/16
  cc:   e885000f        stm     r5, {r0, r1, r2, r3}    // {>F}   SP+112/16

I've annotated each memory access with its stack address and labeled each
16-byte slot from A to G.

With -fstrict-aliasing this becomes:

  88:   f4420a1f        vst1.8  {d16-d17}, [r2 :64]     // {>A}   SP+32/16
  8c:   eddd0b20        vldr    d16, [sp, #128] ; 0x80  // {<G.l} SP+128/8     !
  90:   eddd1b22        vldr    d17, [sp, #136] ; 0x88  // {<G.h} SP+136/8     !
  94:   f4432a1f        vst1.8  {d18-d19}, [r3 :64]     // {>E}   SP+96/16
  98:   e893000f        ldm     r3, {r0, r1, r2, r3}    // {<E}   SP+96/16
  9c:   e88c000f        stm     ip, {r0, r1, r2, r3}    // {>B}   SP+48/16
  a0:   e28dc040        add     ip, sp, #64     ; 0x40
  a4:   f2d80570        vshl.s16        q8, q8, #8
  a8:   e884000f        stm     r4, {r0, r1, r2, r3}    // {>G}   SP+128/16    !
  ac:   e885000f        stm     r5, {r0, r1, r2, r3}    // {>F}   SP+112/16
  b0:   f3f503e0        vneg.s16        q8, q8
  b4:   edcd0b20        vstr    d16, [sp, #128] ; 0x80  // {>G.l} SP+128/8
  b8:   edcd1b22        vstr    d17, [sp, #136] ; 0x88  // {>G.h} SP+136/8
  bc:   e894000f        ldm     r4, {r0, r1, r2, r3}    // {<G}   SP+128/16
  c0:   e88c000f        stm     ip, {r0, r1, r2, r3}    // {>C}   SP+64/16
  c4:   e28dc050        add     ip, sp, #80     ; 0x50
  c8:   e88c000f        stm     ip, {r0, r1, r2, r3}    // {>D}   SP+80/16
  cc:   e885000f        stm     r5, {r0, r1, r2, r3}    // {>F}   SP+112/16

And we see that the initial store to G (the stm at a8) has been moved after the
reads from it (the two vldr instructions at 8c and 90); these are the lines
marked with '!'.
I'm still digging, but it may be pertinent that the reads have been split into
two separate instructions; perhaps when the split was done the alias sets
weren't copied correctly.
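
To make the consequence of that reordering concrete, here is a minimal C
sketch.  It is only a model of slots E and G, not the test case from this PR
and not GCC's own code: the struct and all function names are hypothetical, and
memcpy stands in for the ldm/stm and vldr accesses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of two of the 16-byte stack slots in the listing. */
typedef struct {
    uint8_t e[16];   /* slot E, SP+96  */
    uint8_t g[16];   /* slot G, SP+128 */
} slots_t;

/* Program order: copy E into G (the ldm/stm pair), then read G back
   as two 8-byte halves (the two vldr instructions). */
static void read_g_in_order(slots_t *s, uint8_t out[16])
{
    memcpy(s->g, s->e, 16);          /* {>G}   */
    memcpy(out, s->g, 8);            /* {<G.l} */
    memcpy(out + 8, s->g + 8, 8);    /* {<G.h} */
}

/* Bad schedule: both loads hoisted above the store, as would be
   permitted if the loads and the store were believed not to alias. */
static void read_g_hoisted(slots_t *s, uint8_t out[16])
{
    memcpy(out, s->g, 8);            /* {<G.l} reads stale data */
    memcpy(out + 8, s->g + 8, 8);    /* {<G.h} reads stale data */
    memcpy(s->g, s->e, 16);          /* {>G}   arrives too late */
}

/* Returns 1 if the two schedules produce different loaded values. */
static int stale_read_detected(void)
{
    slots_t a, b;
    uint8_t out_a[16], out_b[16];

    memset(a.e, 0x11, sizeof a.e);   /* fresh value that should reach G */
    memset(a.g, 0xAA, sizeof a.g);   /* stale contents of G */
    b = a;

    read_g_in_order(&a, out_a);
    read_g_hoisted(&b, out_b);

    /* Memory ends up identical either way; only the loaded value differs. */
    assert(memcmp(a.g, b.g, sizeof a.g) == 0);
    return memcmp(out_a, out_b, sizeof out_a) != 0;
}
```

Calling stale_read_detected() from a small main() shows the point: slot G holds
the same bytes after either schedule, but the hoisted loads see the old
contents, so everything computed from q8 afterwards is wrong.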
