Claudio Grasso commented on a discussion on stm32h7/include/lwipbspopts.h:

https://gitlab.rtems.org/rtems/pkg/rtems-lwip/-/merge_requests/37#note_151413

> +#define ETH_PAD_SIZE 2
> +
> +/* Strongly byte-by-byte MEMCPY for STM32H7 to avoid all hardware traps.
> + * Optimized library versions of memcpy() may use word-aligned instructions (LDR/STR)
> + * even on byte-access paths, which can trigger usage faults in certain memory regions.
> + */
> +static inline void stm32h7_byte_memcpy(void *dst, const void *src, size_t len) {
> +  uint8_t *d = (uint8_t *)dst;
> +  const uint8_t *s = (const uint8_t *)src;
> +  while (len--) {
> +    *d++ = *s++;
> +  }
> +}
> +#define MEMCPY(dst, src, len) stm32h7_byte_memcpy(dst, src, len)
> +#define SMEMCPY(dst, src, len) stm32h7_byte_memcpy(dst, src, len)
> +#define MEMMOVE(dst, src, len) stm32h7_byte_memcpy(dst, src, len)

[memcpy_evidence.txt](/uploads/954b7286eda68a32c0e6a469b863a828/memcpy_evidence.txt)

On this BSP the Ethernet Rx/Tx buffers and DMA descriptors live in a non-cacheable MPU region configured as Device memory (for DMA coherency without manual cache maintenance). On Cortex-M7, an **unaligned** word access to Device/Strongly-ordered memory **always faults**, regardless of `CCR.UNALIGN_TRP`.

With `ETH_PAD_SIZE = 2`, the lwIP payload sits 2 bytes off the word-aligned DMA buffer, so any copy between them has mismatched source/destination alignment. The problem is that the optimised library `memcpy()` issues word-wide (32-bit) accesses on its bulk path.

Here is the disassembly straight from the linked image to show this is not hypothetical:

```
Toolchain : arm-rtems6-gcc (GCC) 13.3.0, Newlib, -mcpu=cortex-m7
Binary    : build/arm-rtems6-nucleo-h743zi/stm32h7_test.exe
```

newlib `memcpy()`: the bulk path is 16× unrolled 32-bit `ldr.w/str.w`:

```
0802603c <memcpy>:
 802603e: ea41 0300   orr.w   r3, r1, r0       ; src | dst
 8026042: f013 0303   ands.w  r3, r3, #3       ; test low 2 bits
 8026046: d16d        bne.n   8026124          ; -> "unaligned" entry
 ...
 802604c: f851 3b04   ldr.w   r3, [r1], #4     ; 32-bit load, +4
 8026050: f840 3b04   str.w   r3, [r0], #4     ; 32-bit store, +4 (x16)
```

The "unaligned" entry only realigns the **head** with byte/half-word copies, then branches **back into the same 32-bit word loop**, so when src and dst differ in alignment (our `ETH_PAD_SIZE = 2` case) one operand is still misaligned on the word access:

```
 8026146: f831 3b02   ldrh.w  r3, [r1], #2     ; head realign only
 802614a: f820 3b02   strh.w  r3, [r0], #2
 802614e: e77b        b.n     8026048          ; back to the word loop
```

Our override compiles to byte-only accesses, so the alignment of either operand is irrelevant and no unaligned fault is possible in any region:

```
08001c74 <stm32h7_byte_memcpy>:
 8001c96: 7812        ldrb    r2, [r2, #0]
 8001c98: 701a        strb    r2, [r3, #0]
```

No `ldr.w`/`str.w`, no `ldrh`, no `ldrd`, no `ldm`. That byte-only property is exactly what the STM32H7 Ethernet path needs.

Full annotated objdump attached. Sorry I did not realize before this could help.
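For anyone who wants to check the alignment argument without hardware, here is a minimal host-side sketch. It is not part of the merge request: `byte_copy` mirrors the shape of `stm32h7_byte_memcpy`, while `can_coalign` and `demo_mismatched_copy` are hypothetical names introduced only for this illustration. A host build will not fault on unaligned word accesses, so this only demonstrates the address arithmetic and that the byte copy is correct for mismatched alignments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_PAD_SIZE 2 /* same value as in the patch */

/* Byte-only copy with the same shape as stm32h7_byte_memcpy in the patch. */
static void byte_copy(void *dst, const void *src, size_t len)
{
    uint8_t *d = (uint8_t *)dst;
    const uint8_t *s = (const uint8_t *)src;
    while (len--) {
        *d++ = *s++;
    }
}

/* Hypothetical helper: nonzero if advancing both pointers by the same
 * number of bytes can ever leave both on a 4-byte boundary. */
static int can_coalign(uintptr_t dst, uintptr_t src)
{
    return ((dst ^ src) & 3u) == 0;
}

/* Hypothetical host-side demo: copy from a word-aligned "DMA buffer" into a
 * payload that starts ETH_PAD_SIZE bytes into another word-aligned buffer.
 * Returns 0 if the bytes arrived intact. */
static int demo_mismatched_copy(void)
{
    static uint32_t dma_words[16];  /* word-aligned source */
    static uint32_t pbuf_words[17]; /* word-aligned backing store */
    uint8_t *src = (uint8_t *)dma_words;
    uint8_t *dst = (uint8_t *)pbuf_words + ETH_PAD_SIZE;
    size_t len = sizeof(dma_words);

    for (size_t i = 0; i < len; i++) {
        src[i] = (uint8_t)i;
    }

    /* The addresses differ by 2 modulo 4, so no head fix-up aligns both. */
    assert(!can_coalign((uintptr_t)dst, (uintptr_t)src));

    byte_copy(dst, src, len);
    return memcmp(dst, src, len);
}
```

Because the low two bits of the two addresses differ, any head realignment that fixes one pointer leaves the other misaligned. That is the case where the word loop shown in the disassembly above performs an unaligned 32-bit access.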
