[Bug rtl-optimization/98144] REE needs 6GB DF memory when compiling insn-extract.c with RTL checking enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |10.3.0, 9.3.1
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #16 from Richard Biener ---
Fixed.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

--- Comment #15 from CVS Commits ---
The releases/gcc-9 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:4595028e7212d6870e9e236f1f5a016b50708b7c

commit r9-9508-g4595028e7212d6870e9e236f1f5a016b50708b7c
Author: Richard Biener
Date:   Fri Jan 29 10:23:40 2021 +0100

    rtl-optimization/98144 - tame REE memory usage

    This changes the REE dataflow to change the explicit all-ones
    starting solution to be implicit via a visited flag, removing
    the need to initially start with fully populated bitmaps for
    all basic-blocks.

    That reduces peak memory use when compiling the RTL checking
    enabled insn-extract.c testcase from PR98144 from 6GB to less
    than 2GB.

    2021-01-29  Richard Biener

            PR rtl-optimization/98144
            * df.h (df_mir_bb_info): Add con_visited member.
            * df-problems.c (df_mir_alloc): Initialize con_visited,
            do not fully populate IN and OUT.
            (df_mir_reset): Likewise.
            (df_mir_confluence_0): Set con_visited.
            (df_mir_confluence_n): Properly handle implicitly
            fully populated IN and OUT as designated by con_visited
            and update con_visited accordingly.

    (cherry picked from commit a8c455bafdefdab0a7b8cdbcdb116c0086bae05e)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

--- Comment #14 from Richard Biener ---
(In reply to Franz Sirl from comment #13)
> Some data for the inhouse testcase in Bug 80930 with ASAN+UBSAN:
>
> gcc-9@r9-8944: OOM killed after 15min at ~85 GB
> gcc-10@r10-9345: takes ~25min to compile, max mem ~6.5GB
>
> Thanks for this nice improvement!

Great to hear.  You can also try if cherry-picking
g:a523add327c6cfdd68cf9b788ea808068d0f508c makes a difference.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

--- Comment #13 from Franz Sirl ---
Some data for the inhouse testcase in Bug 80930 with ASAN+UBSAN:

gcc-9@r9-8944: OOM killed after 15min at ~85 GB
gcc-10@r10-9345: takes ~25min to compile, max mem ~6.5GB

Thanks for this nice improvement!
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

--- Comment #12 from CVS Commits ---
The releases/gcc-10 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:0ca62e6abf7676af9450aa3fe0d7479c4d9e9253

commit r10-9337-g0ca62e6abf7676af9450aa3fe0d7479c4d9e9253
Author: Richard Biener
Date:   Fri Jan 29 10:23:40 2021 +0100

    rtl-optimization/98144 - tame REE memory usage

    This changes the REE dataflow to change the explicit all-ones
    starting solution to be implicit via a visited flag, removing
    the need to initially start with fully populated bitmaps for
    all basic-blocks.

    That reduces peak memory use when compiling the RTL checking
    enabled insn-extract.c testcase from PR98144 from 6GB to less
    than 2GB.

    2021-01-29  Richard Biener

            PR rtl-optimization/98144
            * df.h (df_mir_bb_info): Add con_visited member.
            * df-problems.c (df_mir_alloc): Initialize con_visited,
            do not fully populate IN and OUT.
            (df_mir_reset): Likewise.
            (df_mir_confluence_0): Set con_visited.
            (df_mir_confluence_n): Properly handle implicitly
            fully populated IN and OUT as designated by con_visited
            and update con_visited accordingly.

    (cherry picked from commit a8c455bafdefdab0a7b8cdbcdb116c0086bae05e)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |11.0

--- Comment #11 from Richard Biener ---
Fixed on trunk, planning to do backports after a while.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

--- Comment #10 from Richard Biener ---
*** Bug 80930 has been marked as a duplicate of this bug. ***
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

--- Comment #9 from CVS Commits ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:a8c455bafdefdab0a7b8cdbcdb116c0086bae05e

commit r11-6971-ga8c455bafdefdab0a7b8cdbcdb116c0086bae05e
Author: Richard Biener
Date:   Fri Jan 29 10:23:40 2021 +0100

    rtl-optimization/98144 - tame REE memory usage

    This changes the REE dataflow to change the explicit all-ones
    starting solution to be implicit via a visited flag, removing
    the need to initially start with fully populated bitmaps for
    all basic-blocks.

    That reduces peak memory use when compiling the RTL checking
    enabled insn-extract.c testcase from PR98144 from 6GB to less
    than 2GB.

    2021-01-29  Richard Biener

            PR rtl-optimization/98144
            * df.h (df_mir_bb_info): Add con_visited member.
            * df-problems.c (df_mir_alloc): Initialize con_visited,
            do not fully populate IN and OUT.
            (df_mir_reset): Likewise.
            (df_mir_confluence_0): Set con_visited.
            (df_mir_confluence_n): Properly handle implicitly
            fully populated IN and OUT as designated by con_visited
            and update con_visited accordingly.
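The idea in this commit message can be illustrated with a minimal sketch. This is not the code from df-problems.c: RegSet, BlockInfo and confluence_n are hypothetical names chosen for illustration, and only the con_visited flag is taken from the ChangeLog. The sketch assumes an intersection (AND) confluence, where a block that has not been visited yet stands for the all-ones set without storing any bits.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <utility>

// Hypothetical stand-in for a sparse register bitmap.
using RegSet = std::set<unsigned>;

struct BlockInfo {
  RegSet in;                 // registers in the solution on block entry
  RegSet out;                // registers in the solution on block exit
  bool con_visited = false;  // false: IN/OUT mean "all registers", stored as empty
};

// Confluence for one edge src -> dst with intersection semantics.
// Returns true if dst.in (explicit or implicit) may have changed.
bool confluence_n(const BlockInfo &src, BlockInfo &dst) {
  // Unvisited source: its OUT is implicitly all-ones and
  // x AND all-ones == x, so there is nothing to do.
  if (!src.con_visited)
    return false;

  // Unvisited destination: all-ones AND src.out == src.out, so copy.
  if (!dst.con_visited) {
    dst.con_visited = true;
    dst.in = src.out;
    return true;
  }

  // Both sides are explicit: ordinary intersection.
  RegSet meet;
  std::set_intersection(dst.in.begin(), dst.in.end(),
                        src.out.begin(), src.out.end(),
                        std::inserter(meet, meet.end()));
  // The intersection is a subset of dst.in, so a size change means a change.
  bool changed = meet.size() != dst.in.size();
  dst.in = std::move(meet);
  return changed;
}
```

With this scheme no block needs a fully populated set before the solver reaches it, which is where the memory saving described in the commit message comes from.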
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

--- Comment #8 from Richard Biener ---
So testing on recent trunk shows

> /usr/bin/time ~/install/gcc-11.0/usr/local/bin/g++ t.ii -O2 -S -o /dev/null
29.23user 1.21system 0:30.54elapsed 99%CPU (0avgtext+0avgdata 5933552maxresident)k
57448inputs+0outputs (68major+1644983minor)pagefaults 0swaps
> /usr/bin/time ~/install/gcc-11.0/usr/local/bin/g++ t.ii -O2 -S -o /dev/null -fno-ree
32.17user 0.54system 0:33.12elapsed 98%CPU (0avgtext+0avgdata 1587864maxresident)k
0inputs+0outputs (0major+473366minor)pagefaults 0swaps

aka -free ups peak memory use by 4GB.  I'm testing a patch to teach it some
sanity:

> /usr/bin/time ~/install/gcc-test/usr/local/bin/g++ t.ii -O2 -S -o /dev/null
48.63user 0.36system 0:49.01elapsed 99%CPU (0avgtext+0avgdata 1707272maxresident)k
0inputs+0outputs (0major+462159minor)pagefaults 0swaps

with only a 100MB peak RSS penalty of -free.
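The peak memory figures in these runs are the maxresident values printed by GNU time, which are in kilobytes. As a rough check of the 4GB and 100MB numbers, the differences can be computed with a small helper. The function and constant names below are hypothetical; only the three figures come from the runs in this comment.

```cpp
#include <cstdint>

// Difference between two GNU time "maxresident" values, in kilobytes.
constexpr std::int64_t rss_delta_kb(std::int64_t a_kb, std::int64_t b_kb) {
  return a_kb - b_kb;
}

// Convert kilobytes to GiB for easier reading.
constexpr double kb_to_gib(std::int64_t kb) {
  return static_cast<double>(kb) / (1024.0 * 1024.0);
}

// Figures copied from the three runs in this comment.
constexpr std::int64_t trunk_ree_kb = 5933552;     // trunk, REE enabled
constexpr std::int64_t trunk_no_ree_kb = 1587864;  // trunk, -fno-ree
constexpr std::int64_t patched_ree_kb = 1707272;   // patched, REE enabled
```

rss_delta_kb(trunk_ree_kb, trunk_no_ree_kb) is 4345688 KB, a little over 4 GiB, and rss_delta_kb(patched_ree_kb, trunk_no_ree_kb) is 119408 KB, roughly 117 MiB.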
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

--- Comment #7 from Andrew Macleod ---
PR 98174 has that patch applied btw.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek ---
(In reply to Andrew Macleod from comment #5)
> If I then disable tracking pointer values via de-references in the on entry
> cache, memory consumption goes back to normal.  Since hybrid EVRP still has
> EVRP tracking the non-null pointer values, I am considering this as an
> option just within EVRP for this release since we shouldn't miss anything.

Perhaps it could be disabled only if the CFG is large for some definition of
large?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

Andrew Macleod changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2020-12-09
                 CC|                            |amacleod at redhat dot com
     Ever confirmed|0                           |1

--- Comment #5 from Andrew Macleod ---
I am working on this.

The memory being allocated is the on-entry cache.  With a very large CFG, the
on-entry cache propagator is consuming excessive memory.

The first step is to recognize that if an ssa-name is never exported from GORI
in any block, then its only range is its global range, and we don't need to
propagate any on-entry range around for it.  I had planned to eventually get
to this; now it seems more important.  It is also a general speedup to avoid
ever looking at names which are irrelevant.

This is true for all ranges except pointers, which can become non-zero at a
dereference point, and more generally for anything which is the product of a
statement side effect.  Dereferences are currently the only thing ranger
tracks which falls into this category.

Implementing this tweak gets back about half that memory.  The other half is
still tracking the non-zero-ness of pointers across this massive CFG.

I have longer term plans for dealing with statement side effects like this
that are beyond the scope of this release, as well as different plans for
pointer ranges when we get to supporting other kinds of ranges beyond just
integral.

If I then disable tracking pointer values via de-references in the on-entry
cache, memory consumption goes back to normal.  Since hybrid EVRP still has
EVRP tracking the non-null pointer values, I am considering this as an option
just within EVRP for this release since we shouldn't miss anything.

I am continuing to look at it though and will come back with firmer
conclusions.
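The first step described in this comment can be sketched roughly as follows. This is an illustration only and does not reflect ranger's actual data structures or API: Range, OnEntryCache, and the integer block and name ids are all hypothetical. The assumption is that the set of "tracked" names (those exported from some block, plus any that are subject to statement side effects such as pointers) is known up front, and that every other name simply falls back to its global range.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <utility>

// Hypothetical closed integer range [lo, hi].
struct Range {
  long lo;
  long hi;
};

// Hypothetical on-entry cache keyed by (block id, name id).
class OnEntryCache {
public:
  OnEntryCache(std::set<int> tracked, std::map<int, Range> global)
      : tracked_(std::move(tracked)), global_(std::move(global)) {}

  // Store an on-entry range.  Names that are not tracked are skipped,
  // since their only range is the global one.  Returns true if stored.
  bool set(int block, int name, Range r) {
    if (tracked_.count(name) == 0)
      return false;
    entries_[{block, name}] = r;
    return true;
  }

  // Look up the on-entry range, falling back to the global range.
  Range get(int block, int name) const {
    auto it = entries_.find({block, name});
    if (it != entries_.end())
      return it->second;
    assert(global_.count(name) != 0);
    return global_.at(name);
  }

  // Number of per-block entries actually held in memory.
  std::size_t stored() const { return entries_.size(); }

private:
  std::set<int> tracked_;
  std::map<int, Range> global_;
  std::map<std::pair<int, int>, Range> entries_;
};
```

On a very large CFG, skipping untracked names keeps the number of stored entries proportional to the names that can actually differ from their global range, which is the effect the comment describes.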
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #49682|0                           |1
        is obsolete|                            |

--- Comment #4 from Richard Biener ---
Created attachment 49696
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49696&action=edit
testcase that GCC 7 to GCC 11 can grok

Updated testcase produced with a GCC 7 host compiler.  While -fno-ree helps
for GCC 7 to GCC 10:

> /usr/bin/time g++-10 insn-extract.ii -fno-exceptions -fno-rtti -O2 -S -fno-ree
19.90user 0.33system 0:20.25elapsed 99%CPU (0avgtext+0avgdata 1290440maxresident)k
0inputs+0outputs (0major+280821minor)pagefaults 0swaps

it doesn't for trunk which shows ranger allocating 5GB of memory.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

Matthias Klose changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |doko at debian dot org

--- Comment #3 from Matthias Klose ---
8441545d4f2afb9e9342e0dac378eafd03f00462 now builds insn-extract.o without rtl
checking unconditionally.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org

--- Comment #2 from Richard Biener ---
Maybe RTL SSA is a good fit for REE.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

--- Comment #1 from Richard Biener ---
Created attachment 49682
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49682&action=edit
preprocessed insn-extract

This is x86_64 insn-extract when configured with --enable-checking=yes,rtl