[Bug tree-optimization/86214] [8/9 Regression] Strongly increased stack usage
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86214

--- Comment #17 from Jakub Jelinek ---
Author: jakub
Date: Fri Jan 18 10:07:27 2019
New Revision: 268067

URL: https://gcc.gnu.org/viewcvs?rev=268067=gcc=rev
Log:
        PR tree-optimization/86214
        * tree-inline.h (struct copy_body_data): Add
        add_clobbers_to_eh_landing_pads member.
        * tree-inline.c (add_clobbers_to_eh_landing_pad): New function.
        (copy_edges_for_bb): Call it if EH edge destination is
        < id->add_clobbers_to_eh_landing_pads.  Fix a comment typo.
        (expand_call_inline): Set id->add_clobbers_to_eh_landing_pads
        if flag_stack_reuse != SR_NONE and clear it afterwards.

        * g++.dg/opt/pr86214-1.C: New test.
        * g++.dg/opt/pr86214-2.C: New test.

Added:
    trunk/gcc/testsuite/g++.dg/opt/pr86214-1.C
    trunk/gcc/testsuite/g++.dg/opt/pr86214-2.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-inline.c
    trunk/gcc/tree-inline.h
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86214

--- Comment #16 from Jakub Jelinek ---
Created attachment 45449
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45449=edit
gcc9-pr86214.patch

Untested patch.  Not 100% sure if there must be just one landing pad, will see
during bootstrap/regtest.  If there must be at most one, the second field can
go and it can just clear the first field.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86214

--- Comment #15 from Jakub Jelinek ---
Ah, the #c13 testcase shows the issue, but in the early inliner, not in the IPA
inliner where the #c7 testcase triggers it.
This modified testcase triggers it in the IPA inliner:

typedef __SIZE_TYPE__ size_t;
struct A { A (); ~A (); int a; void qux (const char *); };
int bar (char *);

static inline A
foo ()
{
  char b[8192];
  int x = bar (b);
  A s;
  if (x > 0 && (size_t) x < sizeof b)
    s.qux (b);
  return s;
}

void
baz ()
{
  A a;
  char c[1024];
  bar (c);
  foo (); foo (); foo (); foo (); foo ();
  foo (); foo (); foo (); foo (); foo ();
  foo (); foo (); foo (); foo (); foo ();
  foo (); foo (); foo (); foo (); foo ();
}
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86214

--- Comment #14 from Jakub Jelinek ---
(In reply to Jakub Jelinek from comment #11)
> The workaround for MySQL, at least for -O2, would be to move logger:msg
> definition out from the class, so it is not inline, then at least gcc trunk
> doesn't want to inline it and you don't run into this.

Or just add -fconserve-stack to the g++ options.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86214

--- Comment #13 from Jakub Jelinek ---
Short testcase that shows what's going on on the #c7 testcase:

typedef __SIZE_TYPE__ size_t;
struct A { A (); ~A (); int a; void qux (const char *); };
int bar (char *);

static inline __attribute__((always_inline)) A
foo ()
{
  char b[8192];
  int x = bar (b);
  A s;
  if (x > 0 && (size_t) x < sizeof b)
    s.qux (b);
  return s;
}

void
baz ()
{
  A a;
  foo (); foo (); foo (); foo (); foo ();
  foo (); foo (); foo (); foo (); foo ();
  foo (); foo (); foo (); foo (); foo ();
  foo (); foo (); foo (); foo (); foo ();
}
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86214

--- Comment #12 from Jakub Jelinek ---
Author: jakub
Date: Thu Jan 17 08:05:12 2019
New Revision: 268009

URL: https://gcc.gnu.org/viewcvs?rev=268009=gcc=rev
Log:
        PR tree-optimization/86214
        * cfgexpand.c (add_stack_var_conflict): Don't add any conflicts
        if x == y.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cfgexpand.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86214

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #11 from Jakub Jelinek ---
Looking at the #c7 testcase, confirming ~29KB of stack usage in one of the
functions.

The problem is that msg has a char buf[8192]; variable in it and is inline, gets
inlined into a function 3 times and can throw.  ehcleanup1 removes the buf (and
str) clobbers that were the only reason to have an EH pad that just rethrows
(and I agree it is a good idea to do that, because otherwise the inliner thinks
the functions are more expensive than they actually are).  But then the function
into which this function is ultimately inlined has some finalization
(destructors) covering the code into which it has been inlined.  So the former
EH region, which had no successor block because it would throw externally, now
becomes an EH edge from the code to a landing block on which everything is
marked as conflicting; there is no clobber at all for the variables.

So, I wonder if we shouldn't in such cases add clobbers to the start of those
landing pads, either for all variables that live in memory, or at least for the
larger ones.  Will play with it tomorrow.

There is another thing: I've noticed add_stack_var_conflict is often called
with x == y, shouldn't we return right away in that case?  We don't need to
record that a var conflicts with itself, since later on we report that no
variable conflicts with itself anyway.

Note, before r255104 we weren't inlining msg into the bigger function and thus
the issue was latent.

The workaround for MySQL, at least for -O2, would be to move logger:msg
definition out from the class, so it is not inline, then at least gcc trunk
doesn't want to inline it and you don't run into this.
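To illustrate the x == y point from comment #11, here is a minimal toy model of a conflict graph with the early return. It is only a sketch: the class name ConflictGraph and its members are hypothetical and are not GCC's data structures; the real change is in add_stack_var_conflict in cfgexpand.c (see comment #12).

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy model of a stack-variable conflict graph.  NOT the GCC implementation;
// all names here are hypothetical and chosen for illustration only.
class ConflictGraph {
public:
  explicit ConflictGraph(std::size_t num_vars) : conflicts_(num_vars) {}

  // Record that variables x and y may be live at the same time.
  // A variable never conflicts with itself, so x == y records nothing.
  void add_conflict(std::size_t x, std::size_t y) {
    assert(x < conflicts_.size() && y < conflicts_.size());
    if (x == y)
      return;
    conflicts_[x].insert(y);
    conflicts_[y].insert(x);
  }

  // Query side: a variable is reported as never conflicting with itself,
  // which is why storing a self-conflict would be wasted work.
  bool conflict_p(std::size_t x, std::size_t y) const {
    assert(x < conflicts_.size() && y < conflicts_.size());
    if (x == y)
      return false;
    return conflicts_[x].count(y) != 0;
  }

  // Number of conflicts stored for variable x.
  std::size_t num_recorded(std::size_t x) const {
    assert(x < conflicts_.size());
    return conflicts_[x].size();
  }

private:
  std::vector<std::set<std::size_t>> conflicts_;
};
```

In this model, calling add_conflict (1, 1) stores nothing, while add_conflict (0, 2) is visible from both sides of the pair.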
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86214

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
                 CC|                            |matz at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org

--- Comment #10 from Richard Biener ---
The issue with Alex's testcase is that inlining, when faced with

ff ()
{
  charD.10 cD.2379[1000];

  :
  f ();

  :
  cD.2379 ={v} {CLOBBER};
  return;

  :
  :
  cD.2379 ={v} {CLOBBER};
  resx 1
}

ends up unifying the resx 1 parts, making the lifetimes of the two instances
overlap:

g (intD.9 nD.2380)
{
  charD.10 cD.2411[1000];
  charD.10 cD.2410[1000];
...
  :
  f ();

  :
  cD.2410 ={v} {CLOBBER};
  f ();

  :
  cD.2411 ={v} {CLOBBER};

... all fine up to here

  :
  :
  sD.2383 ={v} {CLOBBER};
  cD.2399 ={v} {CLOBBER};
  resx 1

oops.

When analyzing the lifetime of variables during RTL expansion we might want to
not consider blocks where only the clobbers mention the vars, thus virtually
moving those clobbers up as far as possible.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86214

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|8.2                         |8.3

--- Comment #9 from Jakub Jelinek ---
GCC 8.2 has been released.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86214

Alexander Monakov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|WAITING                     |NEW

--- Comment #8 from Alexander Monakov ---
Removing the 'waiting' status.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86214

--- Comment #7 from Steinar H. Gunderson ---
Created attachment 44351
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44351=edit
Unreduced test case

There; unreduced, from public MySQL.  This is preprocessed with GCC 7, and
compiles both with 7 and 8 (the latter triggers too big stack).
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86214

--- Comment #6 from Steinar H. Gunderson ---
I wouldn't be surprised if everything is really over-reduced, and that GCC 7
just happens not to hit this by luck.  I'll be making a checkout of the public
git repository and try to reproduce there, so that I can give you an unreduced
version.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86214

--- Comment #5 from Alexander Monakov ---
Sorry, this still seems over-reduced: the 'cmp' variable is uninitialized, and
gcc-7 completely optimizes out the block with large stack usage guarded by
'cmp == 0' test, so gcc-7 vs gcc-8 is not directly comparable.  It's strange
that gcc-7 optimizes that out, but it's a different issue.

Can you attach the unreduced preprocessed source, and if you make another
attempt at reducing, perhaps enable most warnings?

That said, it seems gcc is not very good at re-discovering non-overlapping
stack allocations introduced by inlining.  Looking at your testcase I came up
with the following minimal test:

struct S{~S();};
void f(void *);
inline void ff()
{
    char c[1000];
    f(c);
}
void g(int n)
{
    S s;
    char c[100];
    f(c);
    if (n) ff(), ff();
}

(there's no regression vs. gcc-7 on this example, but gcc-4.6 used to get a
better result by consuming 1100 bytes rather than 2100).
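The 1100 vs. 2100 byte figures in comment #5 can be reproduced with a small model of stack slot sharing. This is a sketch under simplifying assumptions, not GCC's stack partitioning code: each variable is reduced to a size plus a half-open lifetime interval, alignment and padding are ignored, and the names Var, overlaps and frame_size are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: a stack variable with a size in bytes and a
// half-open lifetime [begin, end) measured in arbitrary program points.
struct Var {
  std::size_t size;
  int begin;
  int end;
};

// Two variables conflict when their lifetimes overlap.
static bool overlaps(const Var &a, const Var &b) {
  return a.begin < b.end && b.begin < a.end;
}

// Greedy slot sharing: visit variables largest first and put each one into
// the first slot whose current occupants it does not conflict with.
// Because of the ordering, a slot is as large as its first occupant.
// Alignment and padding are deliberately ignored.
static std::size_t frame_size(std::vector<Var> vars) {
  std::sort(vars.begin(), vars.end(),
            [](const Var &a, const Var &b) { return a.size > b.size; });
  std::vector<std::vector<Var>> slots;
  std::size_t total = 0;
  for (const Var &v : vars) {
    assert(v.begin <= v.end);
    bool placed = false;
    for (std::vector<Var> &slot : slots) {
      bool fits = std::none_of(slot.begin(), slot.end(),
                               [&](const Var &o) { return overlaps(v, o); });
      if (fits) {
        slot.push_back(v);
        placed = true;
        break;
      }
    }
    if (!placed) {
      slots.push_back(std::vector<Var>{v});
      total += v.size;
    }
  }
  return total;
}
```

With the outer c[100] live for the whole function and the two inlined c[1000] arrays given disjoint lifetimes, the model yields 1100 bytes. If both 1000-byte lifetimes are stretched to the same end point, which is roughly what a shared EH block without per-variable clobbers amounts to, they can no longer share a slot and the result is 2100 bytes.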
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86214

--- Comment #4 from Steinar H. Gunderson ---
New test case added.  It's quite a bit larger than the old one, but has no
-Wreturn-type warning.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86214

Steinar H. Gunderson changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #44296|0                           |1
        is obsolete|                            |

--- Comment #3 from Steinar H. Gunderson ---
Created attachment 44349
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44349=edit
Test case #2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86214

--- Comment #2 from sgunderson at bigfoot dot com ---
OK, starting a reduce that also checks for no -Wreturn-type warnings.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86214

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
   Target Milestone|---                         |8.2
            Summary|[8 Regression] Strongly     |[8/9 Regression] Strongly
                   |increased stack usage       |increased stack usage

Alexander Monakov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |WAITING
   Last reconfirmed|                            |2018-06-21
                 CC|                            |amonakov at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #1 from Alexander Monakov ---
Inlining decisions are not so different between 7 and 8; the main difference is
that gcc-8 translates b::x() into __builtin_unreachable and warns accordingly:

warning: no return statement in function returning non-void [-Wreturn-type]
 b x(b) {}
         ^

and with that change gcc-8 no longer manages to prove that big arrays have
non-overlapping lifetimes.  If I change the source to well-formed
'void x(b) {}', it compiles as desired.

So, assuming the original MySQL source is free of that warning, the testcase is
too aggressively reduced and no longer reflects the original issue.  Can you
please re-reduce?