http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54110
Bug #: 54110
Summary: lower-subreg related code quality for long long function return
Classification: Unclassified
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: amo...@gmail.com

powerpc-linux-gcc -m32 -O2

long long ll (long long *x)
{
  return *x;
}

Rev 189440 output
	lwz r4,4(r3)
	lwz r3,0(r3)
	blr

Rev 189441 output
	mr r9,r3
	lwz r3,0(r9)
	lwz r4,4(r9)
	blr

Not a big deal, but it gets a little worse for

long long llo (long long *x)
{
  return x[4095];
}

Current mainline
	lwz r11,32764(r3)
	lwz r10,32760(r3)
	mr r4,r11
	mr r3,r10
	blr

Current mainline less r189441 patch
	lwz r4,32764(r3)
	lwz r3,32760(r3)
	blr

Noticed when developing testcases for pr53914. I happened to be working with a copy of mainline from a day before r189441 and saw ideal code being generated with my pr53914 fix.

Some notes
- Obviously we don't want to revert r189441, as without it the lower-subreg pass is effectively disabled on powerpc.
- Without lower-subreg, combine merges
    (set (reg:DI 124) (mem:DI (...)))
    (set (reg:DI 121) (reg:DI 124))
    (set (reg:DI 3) (reg:DI 121))
  into
    (set (reg:DI 3) (mem:DI (...)))
  Combine can't do that when the moves are lowered to SImode.
- For llo.c, lower-subreg leaves the mem as DImode due to offsettable_memref quirks. For smaller offsets the mem is split and you get addi,lwz,lwz,blr.
- If we have a DImode mem, it persists until the split after reload. That is where rs6000_split_multireg_move comes into play; it is responsible for reordering the loads.
- I'm not sure under what conditions loads that might trap can be reordered.
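
To illustrate why the order of the two lwz instructions matters when the base register is also one of the destination registers (as in the Rev 189440 output for ll, where r3 is both), here is a minimal toy model in C. It is only a sketch under stated assumptions, not the code of rs6000_split_multireg_move: the names load_op, order_di_load and run_ops are hypothetical, and the model assumes the big-endian word layout of powerpc-linux -m32 (high word at the lower address, in the lower-numbered register).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy model, not GCC code: 32 general purpose registers. */
enum { NUM_GPRS = 32 };

/* One "lwz dest,offset(base)" instruction. */
struct load_op {
    int dest;
    int32_t offset;
};

/* Choose the order of the two word loads that make up a 64-bit load into
   the register pair dest_hi:dest_hi+1 from offset(base).  Assumes the
   high word lives at 'offset' and the low word at 'offset + 4'.  If the
   high-word destination is also the base register, the low word is
   loaded first so the base is overwritten only by the last load. */
void order_di_load(int dest_hi, int base, int32_t offset,
                   struct load_op out[2])
{
    struct load_op hi = { dest_hi, offset };
    struct load_op lo = { dest_hi + 1, offset + 4 };

    assert(dest_hi >= 0 && dest_hi + 1 < NUM_GPRS);
    assert(base >= 0 && base < NUM_GPRS);

    if (dest_hi == base) {
        out[0] = lo;
        out[1] = hi;
    } else {
        /* Also covers dest_hi + 1 == base: the low word is loaded last. */
        out[0] = hi;
        out[1] = lo;
    }
}

/* Execute the two loads against a word-addressed toy memory.  gpr[base]
   holds a byte address; a load that clobbers the base too early would
   make the second load read from the wrong address. */
void run_ops(uint32_t gpr[NUM_GPRS], const uint32_t *mem, int base,
             const struct load_op ops[2])
{
    for (int i = 0; i < 2; i++) {
        uint32_t addr = gpr[base] + (uint32_t)ops[i].offset;
        gpr[ops[i].dest] = mem[addr / 4];
    }
}
```

With dest_hi = 3, base = 3 and offset = 0, the model picks the same order as the Rev 189440 output (r4 first, then r3). With a destination pair that does not overlap the base, such as r10/r11, it keeps the natural high-then-low order. The model says nothing about the last note, i.e. whether reordering is valid when one of the loads might trap.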