[Bug rtl-optimization/98144] REE needs 6GB DF memory when compiling insn-extract.c with RTL checking enabled

2021-05-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

Richard Biener  changed:

         What|Removed |Added
-------------+--------+--------------
Known to work|        |10.3.0, 9.3.1
   Resolution|---     |FIXED
       Status|ASSIGNED|RESOLVED

--- Comment #16 from Richard Biener  ---
Fixed.

[Bug rtl-optimization/98144] REE needs 6GB DF memory when compiling insn-extract.c with RTL checking enabled

2021-05-04 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

--- Comment #15 from CVS Commits  ---
The releases/gcc-9 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:4595028e7212d6870e9e236f1f5a016b50708b7c

commit r9-9508-g4595028e7212d6870e9e236f1f5a016b50708b7c
Author: Richard Biener 
Date:   Fri Jan 29 10:23:40 2021 +0100

rtl-optimization/98144 - tame REE memory usage

This changes the REE dataflow to change the explicit all-ones
starting solution to be implicit via a visited flag, removing
the need to initially start with fully populated bitmaps for
all basic-blocks.  That reduces peak memory use when compiling
the RTL checking enabled insn-extract.c testcase from PR98144
from 6GB to less than 2GB.

2021-01-29  Richard Biener  

PR rtl-optimization/98144
* df.h (df_mir_bb_info): Add con_visited member.
* df-problems.c (df_mir_alloc): Initialize con_visited,
do not fully populate IN and OUT.
(df_mir_reset): Likewise.
(df_mir_confluence_0): Set con_visited.
(df_mir_confluence_n): Properly handle implicitly
fully populated IN and OUT as designated by con_visited
and update con_visited accordingly.

(cherry picked from commit a8c455bafdefdab0a7b8cdbcdb116c0086bae05e)

[Bug rtl-optimization/98144] REE needs 6GB DF memory when compiling insn-extract.c with RTL checking enabled

2021-02-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

--- Comment #14 from Richard Biener  ---
(In reply to Franz Sirl from comment #13)
> Some data for the inhouse testcase in Bug 80930 with ASAN+UBSAN:
> 
>  gcc-9@r9-8944:   OOM killed after 15min at ~85 GB
>  gcc-10@r10-9345: takes ~25min to compile, max mem ~6.5GB
> 
> Thanks for this nice improvement!

Great to hear.  You can also try if cherry-picking
g:a523add327c6cfdd68cf9b788ea808068d0f508c makes a difference.

[Bug rtl-optimization/98144] REE needs 6GB DF memory when compiling insn-extract.c with RTL checking enabled

2021-02-04 Thread sirl at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

--- Comment #13 from Franz Sirl  ---
Some data for the inhouse testcase in Bug 80930 with ASAN+UBSAN:

 gcc-9@r9-8944:   OOM killed after 15min at ~85 GB
 gcc-10@r10-9345: takes ~25min to compile, max mem ~6.5GB

Thanks for this nice improvement!

[Bug rtl-optimization/98144] REE needs 6GB DF memory when compiling insn-extract.c with RTL checking enabled

2021-02-03 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

--- Comment #12 from CVS Commits  ---
The releases/gcc-10 branch has been updated by Richard Biener:

https://gcc.gnu.org/g:0ca62e6abf7676af9450aa3fe0d7479c4d9e9253

commit r10-9337-g0ca62e6abf7676af9450aa3fe0d7479c4d9e9253
Author: Richard Biener 
Date:   Fri Jan 29 10:23:40 2021 +0100

rtl-optimization/98144 - tame REE memory usage

This changes the REE dataflow to change the explicit all-ones
starting solution to be implicit via a visited flag, removing
the need to initially start with fully populated bitmaps for
all basic-blocks.  That reduces peak memory use when compiling
the RTL checking enabled insn-extract.c testcase from PR98144
from 6GB to less than 2GB.

2021-01-29  Richard Biener  

PR rtl-optimization/98144
* df.h (df_mir_bb_info): Add con_visited member.
* df-problems.c (df_mir_alloc): Initialize con_visited,
do not fully populate IN and OUT.
(df_mir_reset): Likewise.
(df_mir_confluence_0): Set con_visited.
(df_mir_confluence_n): Properly handle implicitly
fully populated IN and OUT as designated by con_visited
and update con_visited accordingly.

(cherry picked from commit a8c455bafdefdab0a7b8cdbcdb116c0086bae05e)

[Bug rtl-optimization/98144] REE needs 6GB DF memory when compiling insn-extract.c with RTL checking enabled

2021-01-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

Richard Biener  changed:

         What|Removed |Added
-------------+--------+-----
Known to work|        |11.0

--- Comment #11 from Richard Biener  ---
Fixed on trunk, planning to do backports after a while.

[Bug rtl-optimization/98144] REE needs 6GB DF memory when compiling insn-extract.c with RTL checking enabled

2021-01-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

--- Comment #10 from Richard Biener  ---
*** Bug 80930 has been marked as a duplicate of this bug. ***

[Bug rtl-optimization/98144] REE needs 6GB DF memory when compiling insn-extract.c with RTL checking enabled

2021-01-29 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

--- Comment #9 from CVS Commits  ---
The master branch has been updated by Richard Biener:

https://gcc.gnu.org/g:a8c455bafdefdab0a7b8cdbcdb116c0086bae05e

commit r11-6971-ga8c455bafdefdab0a7b8cdbcdb116c0086bae05e
Author: Richard Biener 
Date:   Fri Jan 29 10:23:40 2021 +0100

rtl-optimization/98144 - tame REE memory usage

This changes the REE dataflow to change the explicit all-ones
starting solution to be implicit via a visited flag, removing
the need to initially start with fully populated bitmaps for
all basic-blocks.  That reduces peak memory use when compiling
the RTL checking enabled insn-extract.c testcase from PR98144
from 6GB to less than 2GB.

2021-01-29  Richard Biener  

PR rtl-optimization/98144
* df.h (df_mir_bb_info): Add con_visited member.
* df-problems.c (df_mir_alloc): Initialize con_visited,
do not fully populate IN and OUT.
(df_mir_reset): Likewise.
(df_mir_confluence_0): Set con_visited.
(df_mir_confluence_n): Properly handle implicitly
fully populated IN and OUT as designated by con_visited
and update con_visited accordingly.
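
As an illustration of the idea in the commit message, here is a minimal C++ sketch. It is not the df-problems.c code. It assumes a "must" dataflow problem whose confluence operator is set intersection, so the natural starting solution is all-ones. BlockInfo, confluence_0, confluence_n and the std::set representation are hypothetical stand-ins; only the role of a per-block visited flag is taken from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical per-block state. While visited is false, IN and OUT are
// treated as implicitly all-ones and the sets stay empty (no memory used).
struct BlockInfo {
    std::set<int> in;      // registers known initialized on block entry
    std::set<int> out;     // registers known initialized on block exit
    bool visited = false;  // plays the role the commit gives to con_visited
};

// Confluence for a block without predecessors: IN is explicitly empty.
void confluence_0(BlockInfo &dst) {
    dst.in.clear();
    dst.visited = true;
}

// Confluence along one edge src -> dst. Returns true if dst.in changed.
bool confluence_n(const BlockInfo &src, BlockInfo &dst) {
    // src.out is still implicitly all-ones: intersecting changes nothing,
    // and dst must not be marked visited because of this edge alone.
    if (!src.visited)
        return false;
    // dst.in is implicitly all-ones: all-ones AND src.out is just src.out.
    if (!dst.visited) {
        dst.in = src.out;
        dst.visited = true;
        return true;
    }
    // Ordinary case: intersect dst.in with src.out in place.
    const std::size_t before = dst.in.size();
    for (auto it = dst.in.begin(); it != dst.in.end();) {
        if (src.out.count(*it) == 0)
            it = dst.in.erase(it);
        else
            ++it;
    }
    return dst.in.size() != before;
}

// Small usage example: a join block with two visited predecessors.
void demo_confluence() {
    BlockInfo entry, a, join;
    confluence_0(entry);
    entry.out = {1, 2};
    a.visited = true;
    a.out = {2, 3};
    assert(confluence_n(entry, join));  // first visit copies {1, 2}
    assert(confluence_n(a, join));      // intersect with {2, 3}
    assert(join.in == (std::set<int>{2}));
}
```

The point of the flag is that no block ever has to hold a fully populated set just to represent "not computed yet"; a set is only materialized once a real predecessor solution reaches the block.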

[Bug rtl-optimization/98144] REE needs 6GB DF memory when compiling insn-extract.c with RTL checking enabled

2021-01-29 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

--- Comment #8 from Richard Biener  ---
So testing on recent trunk shows

> /usr/bin/time ~/install/gcc-11.0/usr/local/bin/g++ t.ii -O2 -S -o /dev/null
29.23user 1.21system 0:30.54elapsed 99%CPU (0avgtext+0avgdata 5933552maxresident)k
57448inputs+0outputs (68major+1644983minor)pagefaults 0swaps
> /usr/bin/time ~/install/gcc-11.0/usr/local/bin/g++ t.ii -O2 -S -o /dev/null -fno-ree
32.17user 0.54system 0:33.12elapsed 98%CPU (0avgtext+0avgdata 1587864maxresident)k
0inputs+0outputs (0major+473366minor)pagefaults 0swaps

aka -free ups peak memory use by 4GB.  I'm testing a patch to teach it some
sanity:

> /usr/bin/time ~/install/gcc-test/usr/local/bin/g++ t.ii -O2 -S -o /dev/null
48.63user 0.36system 0:49.01elapsed 99%CPU (0avgtext+0avgdata 1707272maxresident)k
0inputs+0outputs (0major+462159minor)pagefaults 0swaps

with only a 100MB peak RSS penalty of -free.
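
For reference, the "4GB" and "100MB" figures can be rechecked from the maxresident values quoted in this comment. The sketch below assumes those values are in kilobytes of 1024 bytes, and, as the comment does, it compares the patched run against the unpatched -fno-ree run because no patched -fno-ree run is shown. The constant and function names are made up for this example.

```cpp
#include <cassert>
#include <cstdio>

// Peak RSS values copied from the /usr/bin/time output in this comment.
constexpr long kTrunkWithRee   = 5933552;  // trunk, REE enabled
constexpr long kTrunkNoRee     = 1587864;  // trunk, -fno-ree
constexpr long kPatchedWithRee = 1707272;  // patched compiler, REE enabled

// Difference between two peak RSS readings, in the same unit as the inputs.
constexpr long delta_kb(long with_ree, long without_ree) {
    return with_ree - without_ree;
}

// Convert kilobytes to megabytes, assuming 1024 KB per MB.
constexpr double kb_to_mb(long kb) {
    return kb / 1024.0;
}

// Print the REE peak-memory penalty before and after the patch.
void print_ree_penalty() {
    const long before = delta_kb(kTrunkWithRee, kTrunkNoRee);
    const long after = delta_kb(kPatchedWithRee, kTrunkNoRee);
    assert(before > after);  // the patch is expected to shrink the penalty
    std::printf("REE penalty before patch: %ld KB (%.1f MB)\n",
                before, kb_to_mb(before));
    std::printf("REE penalty after patch:  %ld KB (%.1f MB)\n",
                after, kb_to_mb(after));
}
```

Under those assumptions the differences are 4345688 KB before the patch and 119408 KB after it, which is consistent with the rounded "4GB" and "100MB" quoted above.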

[Bug rtl-optimization/98144] REE needs 6GB DF memory when compiling insn-extract.c with RTL checking enabled

2020-12-17 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

--- Comment #7 from Andrew Macleod  ---
PR 98174 has that patch applied btw.

[Bug rtl-optimization/98144] REE needs 6GB DF memory when compiling insn-extract.c with RTL checking enabled

2020-12-09 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

Jakub Jelinek  changed:

What|Removed |Added
----+--------+------------------------
  CC|        |jakub at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek  ---
(In reply to Andrew Macleod from comment #5)
> If I then disable tracking pointer values via de-references in the on entry
> cache, memory consumption goes back to normal.  Since hybrid EVRP still has
> EVRP tracking the non-null pointer values, I am considering this as an
> option just within EVRP for this release since we shouldn't miss anything.

Perhaps it could be disabled only if the CFG is large for some definition of
large?

[Bug rtl-optimization/98144] REE needs 6GB DF memory when compiling insn-extract.c with RTL checking enabled

2020-12-09 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

Andrew Macleod  changed:

            What|Removed    |Added
----------------+-----------+--------------------------
          Status|UNCONFIRMED|ASSIGNED
Last reconfirmed|           |2020-12-09
              CC|           |amacleod at redhat dot com
  Ever confirmed|0          |1

--- Comment #5 from Andrew Macleod  ---
I am working on this. 
The memory being allocated is the on-entry cache. With a very large CFG, the
on-entry cache propagator is consuming excessive memory.

The first step is to recognize that if an ssa-name is never exported from GORI
in any block, then its only range is its global range, and we don't need to
propagate any on-entry range around for it. I had planned to eventually get to
this; now it seems more important.  It is also a general speedup to avoid ever
looking at names which are irrelevant.

This is true for all ranges except pointers which can become non-zero at a
dereference point.  And more generally, for anything which is the product of a
statement side effect. Dereferences are currently the only thing ranger tracks
that falls into this category.

Implementing this tweak gets back about half that memory.  The other half is
still tracking around the non-zero-ness of pointers across this massive CFG.

I have longer term plans for dealing with statement side effects like this that
are beyond the scope of this release, as well as different plans for pointer
ranges when we get to supporting other kinds of ranges beyond just integral.

If I then disable tracking pointer values via de-references in the on entry
cache, memory consumption goes back to normal.  Since hybrid EVRP still has
EVRP tracking the non-null pointer values, I am considering this as an option
just within EVRP for this release since we shouldn't miss anything.

I am continuing to look at it though and will come back with firmer conclusions.
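
The filtering described in this comment can be pictured with a small C++ sketch. This is an illustration only and does not reflect ranger's actual data structures: NameInfo, needs_on_entry_cache and count_cached_names are hypothetical names, and the "exported from GORI" information is modelled as a plain set of block indices.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical summary of one SSA name.
struct NameInfo {
    std::string name;
    bool is_pointer;              // may become non-null at a dereference
    std::set<int> exporting_bbs;  // blocks where some edge refines its range
};

// True if the range on entry to a block can differ from the global range,
// i.e. if propagating on-entry ranges for this name is worthwhile.
bool needs_on_entry_cache(const NameInfo &n, bool track_pointer_derefs) {
    if (!n.exporting_bbs.empty())
        return true;  // some block exports a refined range
    // Never exported: only a statement side effect (here, a pointer
    // dereference) could still change the range within the CFG.
    return n.is_pointer && track_pointer_derefs;
}

// Count how many names would get on-entry cache entries.
std::size_t count_cached_names(const std::vector<NameInfo> &names,
                               bool track_pointer_derefs) {
    std::size_t count = 0;
    for (const NameInfo &n : names)
        if (needs_on_entry_cache(n, track_pointer_derefs))
            ++count;
    return count;
}

// Usage example with three made-up names.
void demo_on_entry_filter() {
    const std::vector<NameInfo> names = {
        {"i_1", false, {}},   // never exported, not a pointer: skipped
        {"p_2", true, {}},    // pointer: kept only if derefs are tracked
        {"j_3", false, {4}},  // exported in block 4: always kept
    };
    assert(count_cached_names(names, true) == 2);
    assert(count_cached_names(names, false) == 1);
}
```

In this picture, the first tweak corresponds to skipping names like i_1, and disabling dereference tracking corresponds to passing false for track_pointer_derefs.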

[Bug rtl-optimization/98144] REE needs 6GB DF memory when compiling insn-extract.c with RTL checking enabled

2020-12-07 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

Richard Biener  changed:

                         What|Removed |Added
-----------------------------+--------+------
Attachment #49682 is obsolete|0       |1

--- Comment #4 from Richard Biener  ---
Created attachment 49696
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49696&action=edit
testcase that GCC 7 to GCC 11 can grok

Updated testcase produced with a GCC 7 host compiler.  While -fno-ree helps
for GCC 7 to GCC 10:

> /usr/bin/time g++-10 insn-extract.ii -fno-exceptions -fno-rtti -O2 -S -fno-ree
19.90user 0.33system 0:20.25elapsed 99%CPU (0avgtext+0avgdata 1290440maxresident)k
0inputs+0outputs (0major+280821minor)pagefaults 0swaps

it doesn't for trunk which shows ranger allocating 5GB of memory.

[Bug rtl-optimization/98144] REE needs 6GB DF memory when compiling insn-extract.c with RTL checking enabled

2020-12-07 Thread doko at debian dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

Matthias Klose  changed:

What|Removed |Added
----+--------+----------------------
  CC|        |doko at debian dot org

--- Comment #3 from Matthias Klose  ---
8441545d4f2afb9e9342e0dac378eafd03f00462 now builds insn-extract.o without rtl
checking unconditionally.

[Bug rtl-optimization/98144] REE needs 6GB DF memory when compiling insn-extract.c with RTL checking enabled

2020-12-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

Richard Biener  changed:

What|Removed |Added
----+--------+---------------------------
  CC|        |rsandifo at gcc dot gnu.org

--- Comment #2 from Richard Biener  ---
Maybe RTL SSA is a good fit for REE.

[Bug rtl-optimization/98144] REE needs 6GB DF memory when compiling insn-extract.c with RTL checking enabled

2020-12-04 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98144

--- Comment #1 from Richard Biener  ---
Created attachment 49682
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49682&action=edit
preprocessed insn-extract

This is x86_64 insn-extract when configured with --enable-checking=yes,rtl