https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77308
--- Comment #29 from wilco at gcc dot gnu.org ---
(In reply to Bernd Edlinger from comment #28)
> With my latest patch I bootstrapped a configuration with
> --with-arch=armv7-a --with-tune=cortex-a9 --with-fpu=vfpv3-d16
> --with-float=hard
>
> I noticed a single regression in gcc.target/arm/pr53447-*.c
>
> That is caused by disabling the adddi3 expansion.
>
> void t0p(long long * p)
> {
>   *p += 0x100000001;
> }
>
> used to get compiled to this at -O2:
>
>         ldrd    r2, [r0]
>         adds    r2, r2, #1
>         adc     r3, r3, #1
>         strd    r2, [r0]
>         bx      lr
>
> but without the adddi3 pattern I have at -O2:
>
>         ldr     r3, [r0]
>         ldr     r1, [r0, #4]
>         cmn     r3, #1
>         add     r3, r3, #1
>         movcc   r2, #0
>         movcs   r2, #1
>         add     r1, r1, #1
>         str     r3, [r0]
>         add     r3, r2, r1
>         str     r3, [r0, #4]
>         bx      lr

That's because your patch disables adddi3 completely, which is not correct. We
want to use the existing integer sequence, just expanded earlier. Instead of
your change, removing the "&& reload_completed" from the arm_adddi3 instruction
means we expand before register allocation:

        ldr     r3, [r0]
        ldr     r2, [r0, #4]
        adds    r3, r3, #1
        str     r3, [r0]
        adc     r2, r2, #1
        str     r2, [r0, #4]
        bx      lr

> Note that also the ldrd instructions are not there.

Yes, that's yet another bug...

> I think this is the effect on the ldrd that you already mentioned,
> and it gets worse when the expansion breaks the di registers up
> into two si registers.

Indeed, splitting early means we end up with 2 loads. However, in most cases we
should be able to gather the loads and emit LDRD/STRD on Thumb-2 (ARM's
LDRD/STRD is far more limited, so not as useful). Combine could help with
merging 2 loads/stores into a single instruction.
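
For reference, here is a rough C model of what the ADDS/ADC pair in the sequences above computes for the t0p test case. This is only a sketch to illustrate the carry handling: add64_split and t0p_model are hypothetical names made up for this example and do not exist in GCC or in the testsuite. The low words are added first, the carry out of that addition is recovered from unsigned wrap-around, and the high words are then added together with the carry.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of GCC): add a 64-bit immediate to x
   using only 32-bit arithmetic, mirroring an ADDS/ADC pair.  */
static uint64_t add64_split(uint64_t x, uint64_t imm)
{
    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)(x >> 32);
    uint32_t imm_lo = (uint32_t)imm;
    uint32_t imm_hi = (uint32_t)(imm >> 32);

    uint32_t sum_lo = lo + imm_lo;         /* adds: low-word add, wraps mod 2^32 */
    uint32_t carry = sum_lo < lo;          /* C flag: 1 if the low add wrapped */
    uint32_t sum_hi = hi + imm_hi + carry; /* adc: high-word add plus carry */

    return ((uint64_t)sum_hi << 32) | sum_lo;
}

/* Hypothetical model of the t0p test case: *p += 0x100000001,
   i.e. both the low and the high immediate word are 1.  */
static void t0p_model(uint64_t *p)
{
    uint64_t expected = *p + 0x100000001ULL; /* plain 64-bit add for comparison */
    *p = add64_split(*p, 0x100000001ULL);
    assert(*p == expected);                  /* the split form must agree */
}
```

The longer cmn/movcc/movcs sequence computes the same carry value, but materialises it in a register instead of consuming it directly from the flags with ADC, which is why it needs more instructions.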