https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88469

            Bug ID: 88469
           Summary: Unaligned stack access on arm (in particular armv5)
           Product: gcc
           Version: 8.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: stefanrin at gmail dot com
  Target Milestone: ---

GCC generates unaligned stack accesses when compiling its own sources, which
causes the resulting compiler to trap on armv5. The disassembly of the
offending function looks like this:

(configure arguments: --build=armv5tel-unknown-linux-gnueabi
--prefix=$HOME/gcc8 --enable-languages=c,c++ --with-arch=armv5te
--with-mode=arm --disable-nls)

00398e64 <_ZN11cgraph_node11create_edgeEPS_P5gcall13profile_count>:
  398e64:       e24dd008        sub     sp, sp, #8
  398e68:       e92d41f0        push    {r4, r5, r6, r7, r8, lr}
  398e6c:       e24dd018        sub     sp, sp, #24
  398e70:       e58d3034        str     r3, [sp, #52]   ; 0x34
  398e74:       e1cd63d4        ldrd    r6, [sp, #52]   ; 0x34
  398e78:       e28d3018        add     r3, sp, #24
  398e7c:       e1a08000        mov     r8, r0
  398e80:       e1a05001        mov     r5, r1
  398e84:       e59fe068        ldr     lr, [pc, #104]  ; 398ef4 <_ZN11cgraph_node11create_edgeEPS_P5gcall13profile_count+0x90>
  398e88:       e1cd61f0        strd    r6, [sp, #16]
  398e8c:       e9130003        ldmdb   r3, {r0, r1}
  398e90:       e3a0c000        mov     ip, #0
  398e94:       e1a03002        mov     r3, r2
  398e98:       e88d0003        stm     sp, {r0, r1}
  398e9c:       e1a02005        mov     r2, r5
  398ea0:       e59e0000        ldr     r0, [lr]
  398ea4:       e1a01008        mov     r1, r8
  398ea8:       e58dc008        str     ip, [sp, #8]
  398eac:       ebffff3c        bl      398ba4 <_ZN12symbol_table11create_edgeEP11cgraph_nodeS1_P5gcall13profile_countb>
  398eb0:       e1a04000        mov     r4, r0
  398eb4:       eb0709d8        bl      55b61c <_Z24initialize_inline_failedP11cgraph_edge>
  398eb8:       e5953044        ldr     r3, [r5, #68]   ; 0x44
  398ebc:       e5843014        str     r3, [r4, #20]
  398ec0:       e3530000        cmp     r3, #0
  398ec4:       15834010        strne   r4, [r3, #16]
  398ec8:       e5983040        ldr     r3, [r8, #64]   ; 0x40
  398ecc:       e1a00004        mov     r0, r4
  398ed0:       e3530000        cmp     r3, #0
  398ed4:       e584301c        str     r3, [r4, #28]
  398ed8:       15834018        strne   r4, [r3, #24]
  398edc:       e5884040        str     r4, [r8, #64]   ; 0x40
  398ee0:       e5854044        str     r4, [r5, #68]   ; 0x44
  398ee4:       e28dd018        add     sp, sp, #24
  398ee8:       e8bd41f0        pop     {r4, r5, r6, r7, r8, lr}
  398eec:       e28dd008        add     sp, sp, #8
  398ef0:       e12fff1e        bx      lr
  398ef4:       012f41a0        teqeq   pc, r0, lsr #3

The ldrd at 398e74 is the problem. To be honest, I don't fully understand this
code. profile_count seems to be a struct with a 64-bit value as its first
element. From my understanding of AAPCS, this should not be passed in r3,
because r3 is not an even-numbered register. Be that as it may, the code seems
to store the first half of the 64-bit counter (from r3) on the stack so that
it can then be loaded into r6/r7 together with its upper half. The resulting
address, sp+52, can never be properly aligned for ldrd.

For comparison, the same function in an armv7 hardfloat build looks like this:

(configure arguments: --build=arm-linux-gnueabihf --prefix=$HOME/gcc8
--enable-languages=c,c++ --with-arch=armv7-a --with-fpu=vfpv3-d16
--with-mode=arm --with-float=hard --disable-nls --enable-multilib)

003c21e0 <_ZN11cgraph_node11create_edgeEPS_P5gcall13profile_count>:
  3c21e0:       e24dd008        sub     sp, sp, #8
  3c21e4:       e309c130        movw    ip, #37168      ; 0x9130
  3c21e8:       e340c133        movt    ip, #307        ; 0x133
  3c21ec:       e92d4370        push    {r4, r5, r6, r8, r9, lr}
  3c21f0:       e24dd018        sub     sp, sp, #24
  3c21f4:       e1a05001        mov     r5, r1
  3c21f8:       e1a06000        mov     r6, r0
  3c21fc:       e58d3034        str     r3, [sp, #52]   ; 0x34
  3c2200:       e1a03002        mov     r3, r2
  3c2204:       e1cd83d4        ldrd    r8, [sp, #52]   ; 0x34
  3c2208:       e1a02001        mov     r2, r1
  3c220c:       e28d1018        add     r1, sp, #24
  3c2210:       e3a0e000        mov     lr, #0
  3c2214:       e1cd81f0        strd    r8, [sp, #16]
  3c2218:       e9110003        ldmdb   r1, {r0, r1}
  3c221c:       e58de008        str     lr, [sp, #8]
  3c2220:       e88d0003        stm     sp, {r0, r1}
  3c2224:       e1a01006        mov     r1, r6
  3c2228:       e59c0000        ldr     r0, [ip]
  3c222c:       ebffff43        bl      3c1f40 <_ZN12symbol_table11create_edgeEP11cgraph_nodeS1_P5gcall13profile_countb>
  3c2230:       e1a04000        mov     r4, r0
  3c2234:       eb071320        bl      586ebc <_Z24initialize_inline_failedP11cgraph_edge>
  3c2238:       e5953044        ldr     r3, [r5, #68]   ; 0x44
  3c223c:       e1a00004        mov     r0, r4
  3c2240:       e3530000        cmp     r3, #0
  3c2244:       e5843014        str     r3, [r4, #20]
  3c2248:       15834010        strne   r4, [r3, #16]
  3c224c:       e5963040        ldr     r3, [r6, #64]   ; 0x40
  3c2250:       e3530000        cmp     r3, #0
  3c2254:       e584301c        str     r3, [r4, #28]
  3c2258:       15834018        strne   r4, [r3, #24]
  3c225c:       e5864040        str     r4, [r6, #64]   ; 0x40
  3c2260:       e5854044        str     r4, [r5, #68]   ; 0x44
  3c2264:       e28dd018        add     sp, sp, #24
  3c2268:       e8bd4370        pop     {r4, r5, r6, r8, r9, lr}
  3c226c:       e28dd008        add     sp, sp, #8
  3c2270:       e12fff1e        bx      lr

The misaligned ldrd is still there, just arranged in a different way. It may be
ok for armv7 (although this would surprise me), but it definitely is not for
armv5.

This looks very similar to #86555, but it happens in gcc's own code, so it may
not be appropriate to blame the C library in this case.
