https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80735
Bug ID: 80735
Summary: IPA: SRA inhibits constant propagation of structs across multiple function calls
Product: gcc
Version: 7.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: daniel.santos at pobox dot com
CC: mjambor at suse dot cz
Target Milestone: ---

Created attachment 41350
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41350&action=edit
test_case.c

I've finally managed a simple test case for this long-standing missed optimization. Given the test code built with -O2 -fno-inline:

static const struct foo {
    long a;
    long b;
} f = {8, 8};

static long a (const struct foo *foo) {return foo->b;}
static long b (const struct foo *foo) {return a (foo);}
long c (void) {return b (&f);}

Result:

0000000000000000 <a.isra.0>:
   0:   48 89 f8                mov    %rdi,%rax
   3:   c3                      retq

0000000000000010 <b.constprop.1>:
  10:   bf 08 00 00 00          mov    $0x8,%edi
  15:   eb e9                   jmp    0 <a.isra.0>

0000000000000020 <c>:
  20:   eb ee                   jmp    10 <b.constprop.1>

Although we got isra for foo::b, I had expected a() to consist of only mov $0x8, %eax; retq, and b() just be a jump to a().
But when we disable ipa-sra we get the expected result (-O2 -fno-inline -fno-ipa-sra):

0000000000000000 <a.constprop.1>:
   0:   b8 08 00 00 00          mov    $0x8,%eax
   5:   c3                      retq

0000000000000010 <b.constprop.0>:
  10:   eb ee                   jmp    0 <a.constprop.1>

0000000000000020 <c>:
  20:   eb ee                   jmp    10 <b.constprop.0>

If we replace the struct with an array or a pointer to a long, then SRA does not interfere with the constant propagation (-O2 -fno-inline):

static const long f[2] = {8, 8};

static long a (const long foo[]) {return foo[1];}
static long b (const long foo[]) {return a (foo);}
long c (void) {return b (f);}

Result:

0000000000000000 <a.constprop.1>:
   0:   b8 08 00 00 00          mov    $0x8,%eax
   5:   c3                      retq

0000000000000010 <b.constprop.0>:
  10:   eb ee                   jmp    0 <a.constprop.1>

0000000000000020 <c>:
  20:   eb ee                   jmp    10 <b.constprop.0>

I'm still very new to this part of GCC, but I'm guessing that when we do the SRA, we toss out the original aggregate. If so, then we aren't preserving the possibility that all of the function's callers could get cloned with a constant for the aggregate, which would (probably always?) be better than just plucking the scalar out of the aggregate.

I'll be digesting tree-sra.c and the cgraph to try to figure this one out, but if anybody understands this better, then I would appreciate some hints. :)
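
For reference, the outcome described above can be sketched at the source level. The block below repeats the struct test case and adds a hand-written equivalent of the clones seen with -fno-ipa-sra. The *_expected names are hypothetical and exist only for illustration; they are not symbols GCC generates, and this is a rough picture of the desired result rather than a description of what the IPA passes do internally.

```c
/* Original test case from this report. */
static const struct foo {
    long a;
    long b;
} f = {8, 8};

static long a (const struct foo *foo) { return foo->b; }
static long b (const struct foo *foo) { return a (foo); }
long c (void) { return b (&f); }

/* Hand-written stand-ins for the clones obtained with -fno-ipa-sra.
   The *_expected names are hypothetical, for illustration only. */

/* Corresponds to a.constprop: the load of f.b has folded to 8. */
static long a_expected (void) { return 8; }

/* Corresponds to b.constprop: nothing left to pass, just call a. */
static long b_expected (void) { return a_expected (); }

long c_expected (void) { return b_expected (); }
```

Building this with -O2 -fno-inline -c and disassembling the object with objdump -d should allow a side-by-side comparison of c and c_expected. The exact output will depend on the GCC version and target.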