Hi,

On Mon, Feb 27, 2017 at 10:34:38AM +0100, Richard Biener wrote:
> On Wed, Feb 22, 2017 at 11:11 AM, Martin Jambor <mjam...@suse.cz> wrote:
> > Hello,
> >
> > this is a fix for PR 78140 which is about LTO WPA of Firefox taking
> > 1GB memory more than gcc 6.
> >
> > It works by reusing the ipa_bits and value_range that we previously
> > had directly in jump functions and which are just too big to be
> > allocated for each actual argument in all of Firefox.  Reusing is
> > achieved by two hash table traits derived from ggc_cache_remove which
> > apparently has been created just for this purpose and once I
> > understood them they made my life a lot easier.  In future, I will
> > have a look at applying this method to other parts of jump functions
> > as well.
> >
> > According to my measurements, the patch saves about 1.2 GB of memory.
> > The problem is that some change last week (between revision 245382 and
> > 245595) has more than invalidated this:
> >
> > | compiler            | WPA mem (GB) |
> > |---------------------+--------------|
> > | gcc 6 branch        |         3.86 |
> > | trunk rev. 245382   |         5.21 |
> > | patched rev. 245382 |         4.06 |
> > | trunk rev. 245595   |         6.59 |
> > | patched rev. 245595 |         5.25 |
> >
> > (I have verified this by Martin's way of measuring things.)  I will
> > try to bisect what commit has caused the increase.  Still, the patch
> > helps a lot.
> >
> > There is one thing in the patch that intrigues me, I do not understand
> > why I had to mark value_range with GTY((for_user)) - as opposed to
> > just GTY(()) that was there before - whereas ipa_bits does not need
> > it.  If anyone could enlighten me, that would be great.  But I suppose
> > this is not an indication of anything being wrong under the hood.
> >
> > I have bootstrapped and LTO-bootstrapped the patch on x86_64-linux and
> > also bootstrapped (C, C++ and Fortran) on an aarch64 and i686 cfarm
> > machine.  I have also LTO-built Firefox with the patch and used it to
> > browse for a while and it seemed fine.
> >
> > OK for trunk?
>
> The idea looks good to me.  I wonder what a statistic over ranges
> would look like (do they mostly look useful?).
>

So, at the jump function level (on trunk from last week), we have:

  no. of callsites:         1064109
  no. of actual arguments:  2465511 (of all types)
  no. of unknown VRs:       1628727 (not too bad, considering that we
                                     only track them for integers and
                                     non-NULL for pointers)
  no. of known VRs:          836784
  no. of distinct VRs:         1746

the 20 most popular VRs with their frequencies are:

  706245 VR  ~[0, 0]
   59691 VR  [0, 1]
   32660 VR  [0, -1]
   14039 VR  [0, 4294967295]
    1607 VR  [0, 255]
    1351 VR  [0, 2147483647]
    1350 VR  ~[2147483648, -2147483649]
    1285 VR  [0, 65535]
    1259 VR  [1, 4294967296]
    1241 VR  [0, 31]
     903 VR  [-2147483648, 2147483647]
     853 VR  [-32768, 32767]
     827 VR  [1, -1]
     806 VR  [0, -2]
     794 VR  [1, -2]
     696 VR  [-128, 127]
     662 VR  [0, 7]
     654 VR  [0, 4294967294]
     601 VR  [0, 15]
     475 VR  [0, 4611686018427387903]

At the other end of the propagation we store value ranges of 165010
formal parameters out of the total of 678762 (but again, of all
types).  The 20 most popular ones are:

  119319 Storing VR  ~[0, 0]
   13169 Storing VR  [0, -1]
    8781 Storing VR  [0, 0]
    3181 Storing VR  [1, 1]
    3081 Storing VR  [0, 4294967295]
    2089 Storing VR  [0, 1]
     918 Storing VR  [-1, -1]
     870 Storing VR  [2147483647, 2147483647]
     697 Storing VR  [2, 2]
     554 Storing VR  [0, 2]
     527 Storing VR  [1, -1]
     491 Storing VR  [0, 3]
     361 Storing VR  [1, 2]
     350 Storing VR  [0, 255]
     323 Storing VR  [0, 31]
     300 Storing VR  [-32768, 32767]
     285 Storing VR  [0, 2147483647]
     260 Storing VR  [0, 65535]
     240 Storing VR  [5, 5]
     220 Storing VR  [8, 8]

I haven't had a look at how this translated to the final code, but it
is safe to say that the propagation itself does something.

Martin
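
P.S. With only 1746 distinct VRs among 836784 known ones, sharing one
copy of each distinct range is what makes the memory saving possible.
Below is a minimal, standalone sketch of that idea.  It is not the
code from the patch: it uses std::unordered_set rather than GCC's
GC-aware hash tables built on ggc_cache_remove, and the names Range,
RangeHash and RangePool are hypothetical, chosen only for this
illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_set>

// Hypothetical stand-in for a value range; NOT GCC's value_range.
struct Range {
    bool anti;         // true for an anti-range such as ~[0, 0]
    std::int64_t min;
    std::int64_t max;

    bool operator==(const Range &o) const {
        return anti == o.anti && min == o.min && max == o.max;
    }
};

// Hash over all three fields so that equal ranges hash equally.
struct RangeHash {
    std::size_t operator()(const Range &r) const {
        std::size_t h = std::hash<std::int64_t>{}(r.min);
        h = h * 31 + std::hash<std::int64_t>{}(r.max);
        return h * 31 + (r.anti ? 1 : 0);
    }
};

// Pool holding one copy of every distinct range (hypothetical name).
class RangePool {
public:
    // Return a pointer to the single shared copy equal to r,
    // inserting it first if it has not been seen before.
    // Pointers to std::unordered_set elements stay valid across
    // later insertions, so callers may keep them.
    const Range *intern(const Range &r) {
        return &*pool_.insert(r).first;
    }

    std::size_t distinct() const { return pool_.size(); }

private:
    std::unordered_set<Range, RangeHash> pool_;
};
```

In this sketch each argument would store just the pointer returned by
intern() instead of a full Range, so two arguments known to be
~[0, 0] end up pointing at the same object.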