https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46393
--- Comment #1 from Jeffrey A. Law <law at redhat dot com> ---
It appears the problem starts with forwprop turning the pointer accesses into
array/structure memory accesses. This is generally a good thing. However, in
this instance it makes it awfully hard to recover the CSE opportunities that
are needed to get good, compact code.

We have 3 instances of:

  30 003e D3C2     add.l %d2,%a1
  31 0040 D3C9     add.l %a1,%a1
  32 0042 D3C2     add.l %d2,%a1

That's 12 wasted bytes. And we have two instances of:

  24 0032 5200     addq.b #1,%d0
  25 0034 5288     addq.l #1,%a0

Another two wasted bytes. Also related, we end up selecting poor addressing
modes, which probably costs another 10-16 bytes.

But at the core, AFAICT, is the recovery of array/structure accesses from what
were pointer accesses. In theory PRE ought to come along and pull out the
redundant address arithmetic, but it doesn't (not even with -O2).

It's not clear how prevalent this is across other architectures, so I'm keeping
this at P4 for now. If someone can show this causing problems on non-dead
targets, then we might consider bumping this up to P2.
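
For illustration only, here is a minimal C sketch of the kind of source shape being discussed. It is not the test case from this PR: the struct layout and the function names (rec, sum_ptr, sum_idx, self_check) are hypothetical. The first function computes the element address once through a pointer; the second writes every access as base[i].field, which is roughly the form the accesses take once they are expressed as array/structure references. Whether the index arithmetic in the second form is actually duplicated in the generated code depends on the target, the GCC version, and the options used, so no particular output is claimed here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record layout; not taken from the PR's test case. */
struct rec {
    unsigned char tag;
    unsigned char len;
    short val;
};

/* Pointer style: the element address is computed once and reused. */
int sum_ptr(const struct rec *base, size_t i)
{
    const struct rec *p = base + i;
    return p->tag + p->len + p->val;
}

/* Array style: each access names base[i] again, so a later pass has to
   CSE the address computation to avoid repeating it. */
int sum_idx(const struct rec *base, size_t i)
{
    return base[i].tag + base[i].len + base[i].val;
}

/* Both forms are semantically identical; only the code shape differs. */
void self_check(void)
{
    struct rec r[3] = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9} };
    for (size_t i = 0; i < 3; i++)
        assert(sum_ptr(r, i) == sum_idx(r, i));
}
```

Compiling both functions with -S for the target of interest and comparing the assembly is one way to check whether the address arithmetic for the array form is being recomputed.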