https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108821
Bug ID: 108821
Summary: Extra volatile access with -O2 -ftree-loop-im since GCC-11
Product: gcc
Version: 11.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: sirl at gcc dot gnu.org
Target Milestone: ---

Hi, this small example

extern volatile int *x;
static int gCrc;

static int crc16Add(int crc, int b) __attribute__((noinline));
static int crc16Add(int crc, int b)
{
    return crc + b;
}

void f(int data, int dataSz)
{
    int i;
    for(i=0;i<dataSz;i++)
    {
        gCrc = crc16Add(gCrc, data);
        *x = data;
    }
}

adds an extra volatile access after the loop (ARM assembler, but x64 shows the same problem):

f(int, int):
        mov     r2, r1
        cmp     r2, #0
        ble     .L8
        push    {r3, r4, r5, lr}
        movw    r5, #:lower16:.LANCHOR0
        movt    r5, #:upper16:.LANCHOR0
        movw    r4, #:lower16:x
        movt    r4, #:upper16:x
        mov     r1, r0
        movs    r3, #0
        ldr     r0, [r5]
        ldr     r4, [r4]
.L5:
        adds    r3, r3, #1
        bl      crc16Add(int, int)
        cmp     r2, r3
        str     r1, [r4]        @ <-- the last store here
        bne     .L5
        str     r0, [r5]
        str     r1, [r4]        @ <-- is duplicated here
        pop     {r3, r4, r5, pc}
.L8:
        bx      lr

The tree dumps show that the extra access is added during the lim2 pass. Compiling with -fno-tree-loop-im avoids the invalid extra access to volatile memory.

I'm not enough of a language lawyer to be sure that the extra volatile access is invalid in C/C++, but at the very least it's a bad optimization.
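To make the difference easier to see without reading the assembler, here is a rough hand-written C model of the two behaviours. It is only a sketch: the volatile store to *x is replaced by a counting helper, and the names volatile_store, store_count, f_source, f_as_reported and count_stores are made up for this illustration. f_as_reported is my own transcription of the assembler listing, not output taken from GCC's dumps.

```c
#include <assert.h>

/* Hypothetical stand-in for "*x = data": each call models one volatile store. */
static int store_count;
static void volatile_store(int v) { (void)v; store_count++; }

static int gCrc;
static int crc16Add(int crc, int b) { return crc + b; }

/* What the source asks for: exactly one volatile store per iteration. */
static void f_source(int data, int dataSz)
{
    int i;
    for (i = 0; i < dataSz; i++) {
        gCrc = crc16Add(gCrc, data);
        volatile_store(data);
    }
}

/* Hand-written model of the reported assembler: gCrc is kept in a
   register across the loop and written back afterwards, and the
   volatile store is repeated once after the loop. */
static void f_as_reported(int data, int dataSz)
{
    int crc, i = 0;
    if (dataSz <= 0)
        return;                 /* ble .L8 */
    crc = gCrc;                 /* ldr r0, [r5] */
    do {
        i++;
        crc = crc16Add(crc, data);
        volatile_store(data);   /* str r1, [r4] inside the loop */
    } while (i != dataSz);
    gCrc = crc;                 /* str r0, [r5] */
    volatile_store(data);       /* str r1, [r4] duplicated after the loop */
}

/* Reset the model state, run one variant, return the number of modelled stores. */
static int count_stores(void (*fn)(int, int), int data, int dataSz)
{
    store_count = 0;
    gCrc = 0;
    fn(data, dataSz);
    return store_count;
}
```

Under this model, count_stores(f_source, data, n) gives n stores for n > 0, while count_stores(f_as_reported, data, n) gives n + 1. The final value of gCrc is the same in both variants, so only the number of volatile accesses differs.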