[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2017-11-16 Thread egallager at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375
Bug 45375 depends on bug 48724, which changed state.

Bug 48724 Summary: Lto build of mozilla dies at lto-wrapper: error trying to 
exec 'make -j1': execvp: No such file or directory
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48724

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |WORKSFORME

[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2016-01-18 Thread hubicka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #219 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
The devirtualization issue is now fixed, so we are down to -fno-lifetime-dse.

[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2016-01-08 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #218 from Martin Liška <marxin at gcc dot gnu.org> ---
Hi.

Building Firefox revision:
commit a704d34fb1f9e0f5dbf4113298d885cdb650906c
Author: Matthew Noorenberghe 
Date:   Thu Dec 3 17:33:35 2015 -0800

Bug 1230391 - Disable password visibility toggling in the capture
doorhanger outside Nightly. rs=bnicholson, a=lizzard on a CLOSED TREE

--HG--
extra : source : aea828e2cdf767a358ebc6ea661dd3b9b4160321
extra : intermediate-source : 366dd290472633b06f0942d7737c34e942e0916a

This is a minimal set of LTO options for which the built binary can run:
MYFLAGS="$OPT -march=native -flto=9 -fno-lifetime-dse -fno-devirtualize"

For more details:
# MYFLAGS="$OPT -march=native -flto=9" FAILED
# MYFLAGS="$OPT -march=native -flto=9 -fno-lifetime-dse -fno-delete-null-pointer-checks -fno-devirtualize -fno-strict-aliasing" OK
# MYFLAGS="$OPT -march=native -flto=9 -fno-lifetime-dse -fno-delete-null-pointer-checks" FAILED
# MYFLAGS="$OPT -march=native -flto=9 -fno-lifetime-dse -fno-delete-null-pointer-checks -fno-devirtualize" OK
# MYFLAGS="$OPT -march=native -flto=9 -fno-devirtualize" FAILED
# MYFLAGS="$OPT -march=native -flto=9 -fno-lifetime-dse -fno-devirtualize" OK
# MYFLAGS="$OPT -march=native -flto=9 -fno-lifetime-dse" FAILED
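
The reasoning behind "minimal set" can be sketched in Python. This is only an illustration: the RESULTS table is transcribed from the list above, while the helper names minimal_working_sets and required_flags are hypothetical and not part of any build script from this thread.

```python
# Extra flags added on top of "$OPT -march=native -flto=9", paired with
# whether the resulting binary could run (transcribed from the list above).
RESULTS = [
    (frozenset(), False),
    (frozenset({"-fno-lifetime-dse", "-fno-delete-null-pointer-checks",
                "-fno-devirtualize", "-fno-strict-aliasing"}), True),
    (frozenset({"-fno-lifetime-dse", "-fno-delete-null-pointer-checks"}), False),
    (frozenset({"-fno-lifetime-dse", "-fno-delete-null-pointer-checks",
                "-fno-devirtualize"}), True),
    (frozenset({"-fno-devirtualize"}), False),
    (frozenset({"-fno-lifetime-dse", "-fno-devirtualize"}), True),
    (frozenset({"-fno-lifetime-dse"}), False),
]


def minimal_working_sets(results):
    """Return the working flag sets that have no working proper subset."""
    working = [flags for flags, ok in results if ok]
    return [w for w in working if not any(other < w for other in working)]


def required_flags(results):
    """Return the flags present in every working configuration."""
    working = [flags for flags, ok in results if ok]
    return frozenset.intersection(*working)


if __name__ == "__main__":
    for flags in minimal_working_sets(RESULTS):
        print("minimal working set:", " ".join(sorted(flags)))
    print("needed in every working run:", " ".join(sorted(required_flags(RESULTS))))
```

Both helpers only summarize the configurations that were actually tested; they say nothing about untested flag combinations.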

Martin

[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2015-01-20 Thread hubicka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #217 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
Author: hubicka
Date: Tue Jan 20 19:48:59 2015
New Revision: 219909

URL: https://gcc.gnu.org/viewcvs?rev=219909&root=gcc&view=rev
Log:

PR lto/45375
* ipa-inline.c: Include lto-streamer.h
(report_inline_failed_reason): Output source file differences and
flags on optimization/target node mismatch.
(can_inline_edge_p): Consider caller to be the outer inline function;
be less restrictive about matching opimize and optimize_size attributes.
(inline_account_function_p): Break out from ...
(inline_small_functions): ... here.
* ipa-inline-transform.c (clone_inlined_nodes): Use
inline_account_function_p.
(inline_call): Use optimize attribution; use inline_account_function_p.
(inline_transform): Use opt_for_fn.
* ipa-inline.h (inline_account_function_p): Declare.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/ipa-inline-transform.c
trunk/gcc/ipa-inline.c
trunk/gcc/ipa-inline.h


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2015-01-19 Thread hubicka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #216 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
Author: hubicka
Date: Tue Jan 20 04:39:45 2015
New Revision: 219878

URL: https://gcc.gnu.org/viewcvs?rev=219878&root=gcc&view=rev
Log:

PR lto/45375
* i386.c (ix86_option_override_internal): Use ix86_tune_cost
to set branch cost.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2015-01-19 Thread hubicka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #215 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
Author: hubicka
Date: Mon Jan 19 23:58:19 2015
New Revision: 219871

URL: https://gcc.gnu.org/viewcvs?rev=219871&root=gcc&view=rev
Log:

PR lto/45375
* i386.c (gate): Check flag_expensive_optimizations and
optimize_size.
(ix86_option_override_internal): Drop optimize_size condition
on MASK_ACCUMULATE_OUTGOING_ARGS, MASK_VZEROUPPER,
MASK_AVX256_SPLIT_UNALIGNED_LOAD, MASK_AVX256_SPLIT_UNALIGNED_STORE,
MASK_PREFER_AVX128.
(ix86_avx256_split_vector_move_misalign,
ix86_avx256_split_vector_move_misalign): Check optimize_insn_for_speed.
* sse.md (all uses of TARGET_PREFER_AVX128): Add
optimize_insn_for_speed_p check.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/i386.c
trunk/gcc/config/i386/sse.md


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2014-11-13 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marxin at gcc dot gnu.org

--- Comment #214 from Martin Liška <marxin at gcc dot gnu.org> ---
I've just found an ICE for r217480 with LTO and -O2:

lto1: internal compiler error: in lto_output_node, at lto-cgraph.c:462
0x7ce411 lto_output_node
../../gcc/lto-cgraph.c:462
0x7ce411 output_symtab()
../../gcc/lto-cgraph.c:974
0x7db276 lto_output()
../../gcc/lto-streamer-out.c:2309
0x814671 write_lto
../../gcc/passes.c:2346
0x8177c1 ipa_write_optimization_summaries(lto_symtab_encoder_d*)
../../gcc/passes.c:2545
0x59512a do_stream_out
../../gcc/lto/lto.c:2475
0x59a41f stream_out
../../gcc/lto/lto.c:2538
0x59a41f lto_wpa_write_files
../../gcc/lto/lto.c:2655
0x59a41f do_whole_program_analysis
../../gcc/lto/lto.c:3323
0x59a41f lto_main()
../../gcc/lto/lto.c:3443

  if (tag == LTO_symtab_analyzed_node)
    gcc_assert (clone_of || !node->clone_of);
    ^
  if (!clone_of)
    streamer_write_hwi_stream (ob->main_stream, LCC_NOT_FOUND);
  else
    streamer_write_hwi_stream (ob->main_stream, ref);

If needed, I will try to reduce the objects that are part of the WPA phase.

Martin

[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2014-08-26 Thread steffen at hauihau dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #213 from Steffen Hau <steffen at hauihau dot de> ---
Hi Jan,

just a short update: Firefox since version 30 and Thunderbird since version 31
both compile fine with LTO enabled, without needing any additional patches.
The package size was reduced by 51% (firefox ~420MB -> ~207MB) and 59%
(thunderbird ~480MB -> ~200MB). Both programs work as intended, with no
crashes or unexpected behaviour so far.

Best regards,
Steffen


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2014-05-26 Thread steffen at hauihau dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #212 from Steffen Hau <steffen at hauihau dot de> ---
Hi Jan,

I have binutils version 2.24 with the patch from Markus Trippelsdorf for early
plugin loading, so I have no wrappers for ar, nm and ranlib. I've also
symlinked liblto_plugin.so into the binutils bfd-plugins directory. I'll try
to apply the 3 patches you mentioned in your blog post and see whether they
help, but I think they are not relevant for the elfhack portion, which is what
fails on my system.

Which firefox version did you successfully compile?


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2014-05-24 Thread hubicka at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #211 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
Elfhack is rather sensitive to LTO, but it works for me, so this seems like a
binutils issue or some elfhack change that happened recently.
I wrote instructions for building firefox with LTO here:
http://hubicka.blogspot.ca/2014/04/linktime-optimization-in-gcc-2-firefox.html

Here I am attaching the -ftime-report after the symtab hashtable was removed:
Execution times (seconds)
 phase setup             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall    1536 kB ( 0%) ggc
 phase opt and generate  :  54.29 (58%) usr   1.28 (18%) sys  55.58 (50%) wall  720779 kB (18%) ggc
 phase stream in         :  33.54 (36%) usr   1.84 (26%) sys  35.39 (32%) wall 3389310 kB (82%) ggc
 phase stream out        :   6.00 ( 6%) usr   4.02 (56%) sys  19.99 (18%) wall       0 kB ( 0%) ggc
 garbage collection      :   1.86 ( 2%) usr   0.00 ( 0%) sys   1.86 ( 2%) wall       0 kB ( 0%) ggc
 callgraph optimization  :   0.23 ( 0%) usr   0.00 ( 0%) sys   0.24 ( 0%) wall       9 kB ( 0%) ggc
 ipa dead code removal   :   5.70 ( 6%) usr   0.18 ( 3%) sys   6.15 ( 6%) wall      92 kB ( 0%) ggc
 ipa inheritance graph   :   0.09 ( 0%) usr   0.00 ( 0%) sys   0.09 ( 0%) wall     883 kB ( 0%) ggc
 ipa virtual call target :   5.58 ( 6%) usr   0.06 ( 1%) sys   5.32 ( 5%) wall       0 kB ( 0%) ggc
 ipa devirtualization    :   0.13 ( 0%) usr   0.00 ( 0%) sys   0.20 ( 0%) wall    9201 kB ( 0%) ggc
 ipa cp                  :   2.34 ( 2%) usr   0.21 ( 3%) sys   2.55 ( 2%) wall  223628 kB ( 5%) ggc
 ipa inlining heuristics :  26.97 (29%) usr   0.67 ( 9%) sys  27.66 (25%) wall  865791 kB (21%) ggc
 ipa comdats             :   0.21 ( 0%) usr   0.00 ( 0%) sys   0.21 ( 0%) wall       0 kB ( 0%) ggc
 ipa lto gimple in       :   0.07 ( 0%) usr   0.11 ( 2%) sys   0.21 ( 0%) wall       0 kB ( 0%) ggc
 ipa lto gimple out      :   0.46 ( 0%) usr   0.19 ( 3%) sys   0.65 ( 1%) wall       0 kB ( 0%) ggc
 ipa lto decl in         :  24.76 (26%) usr   1.28 (18%) sys  26.08 (23%) wall 2571773 kB (63%) ggc
 ipa lto decl out        :   5.45 ( 6%) usr   0.28 ( 4%) sys   5.75 ( 5%) wall       0 kB ( 0%) ggc
 ipa lto cgraph I/O      :   1.13 ( 1%) usr   0.24 ( 3%) sys   1.38 ( 1%) wall  414551 kB (10%) ggc
 ipa lto decl merge      :   2.57 ( 3%) usr   0.01 ( 0%) sys   2.58 ( 2%) wall    8227 kB ( 0%) ggc
 ipa lto cgraph merge    :   1.72 ( 2%) usr   0.00 ( 0%) sys   1.72 ( 2%) wall   12166 kB ( 0%) ggc
 whopr wpa               :   1.04 ( 1%) usr   0.00 ( 0%) sys   1.04 ( 1%) wall       2 kB ( 0%) ggc
 whopr wpa I/O           :   0.03 ( 0%) usr   3.55 (50%) sys  13.51 (12%) wall       0 kB ( 0%) ggc
 whopr partitioning      :   4.97 ( 5%) usr   0.06 ( 1%) sys   5.02 ( 5%) wall    3738 kB ( 0%) ggc
 ipa reference           :   3.62 ( 4%) usr   0.12 ( 2%) sys   3.75 ( 3%) wall       0 kB ( 0%) ggc
 ipa profile             :   0.33 ( 0%) usr   0.01 ( 0%) sys   0.33 ( 0%) wall       0 kB ( 0%) ggc
 ipa pure const          :   3.86 ( 4%) usr   0.01 ( 0%) sys   3.88 ( 3%) wall       0 kB ( 0%) ggc
 tree eh                 :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree CFG cleanup        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 varconst                :   0.05 ( 0%) usr   0.16 ( 2%) sys   0.13 ( 0%) wall       0 kB ( 0%) ggc
 unaccounted todo        :   0.65 ( 1%) usr   0.00 ( 0%) sys   0.64 ( 1%) wall       0 kB ( 0%) ggc
 TOTAL                   :  93.84             7.14           110.98            4111626 kB

There are some improvements in devirtualization performance, which used quite
a few decl-symbol lookups (about 20%).


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2014-05-23 Thread steffen at hauihau dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

Steffen Hau <steffen at hauihau dot de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |steffen at hauihau dot de

--- Comment #210 from Steffen Hau <steffen at hauihau dot de> ---
The latest firefox 29.0.1 does not compile with LTO enabled (Gentoo/GCC 4.9.0).
It fails in elfhack:

make[5]: Entering directory
'/home/misc/gentoo/tmp/portage/www-client/firefox-29.0.1/work/mozilla-release/obj-x86_64-pc-linux-gnu/build/unix/elfhack'
elfhack
/home/misc/gentoo/tmp/portage/www-client/firefox-29.0.1/work/mozilla-release/obj-x86_64-pc-linux-gnu/_virtualenv/bin/python
/home/misc/gentoo/tmp/portage/www-client/firefox-29.0.1/work/mozilla-release/config/expandlibs_exec.py
--depend .deps/elfhack.pp --target elfhack -- x86_64-pc-linux-gnu-g++ -o
elfhack -march=native -pipe -ggdb -flto=5 -fuse-linker-plugin -mno-avx
-std=gnu++0x -MD -MP -MF .deps/elfhack.pp -Wl,-O1 -Wl,--as-needed -march=native
-pipe -ggdb -flto=5 -fuse-linker-plugin -Wl,-znow -Wl,--sort-common
-Wl,--hash-style=gnu -Wl,--enable-new-dtags host_elf.o host_elfhack.o  
x86_64-pc-linux-gnu-gcc -o dummy dummy.o -lpthread -Wl,-O1 -Wl,--as-needed
-march=native -pipe -ggdb -flto=5 -fuse-linker-plugin -Wl,-znow
-Wl,--sort-common -Wl,--hash-style=gnu -Wl,--enable-new-dtags
-Wl,-z,noexecstack -Wl,-z,text 
-Wl,-rpath-link,/home/misc/gentoo/tmp/portage/www-client/firefox-29.0.1/work/mozilla-release/obj-x86_64-pc-linux-gnu/dist/bin
-Wl,-rpath-link,/usr/lib 
x86_64-pc-linux-gnu-g++  -Wall -Wpointer-arith -Woverloaded-virtual
-Werror=return-type -Werror=int-to-pointer-cast -Wtype-limits -Wempty-body
-Wsign-compare -Wno-invalid-offsetof -Wcast-align -march=native -pipe -ggdb
-flto=5 -fuse-linker-plugin -mno-avx -fno-strict-aliasing -fno-rtti
-fno-math-errno -std=gnu++0x -pthread -pipe -fexceptions  -DNDEBUG -DTRIMMED
-O2 -fomit-frame-pointer -fPIC -shared -Wl,-z,defs -Wl,-h,test-array.so -o
test-array.so -lpthread -Wl,-O1 -Wl,--as-needed -march=native -pipe -ggdb
-flto=5 -fuse-linker-plugin -Wl,-znow -Wl,--sort-common -Wl,--hash-style=gnu
-Wl,--enable-new-dtags -Wl,-z,noexecstack -Wl,-z,text 
-Wl,-rpath-link,/home/misc/gentoo/tmp/portage/www-client/firefox-29.0.1/work/mozilla-release/obj-x86_64-pc-linux-gnu/dist/bin
-Wl,-rpath-link,/usr/lib  test-array.o -nostartfiles
x86_64-pc-linux-gnu-g++  -Wall -Wpointer-arith -Woverloaded-virtual
-Werror=return-type -Werror=int-to-pointer-cast -Wtype-limits -Wempty-body
-Wsign-compare -Wno-invalid-offsetof -Wcast-align -march=native -pipe -ggdb
-flto=5 -fuse-linker-plugin -mno-avx -fno-strict-aliasing -fno-rtti
-fno-math-errno -std=gnu++0x -pthread -pipe -fexceptions  -DNDEBUG -DTRIMMED
-O2 -fomit-frame-pointer -fPIC -shared -Wl,-z,defs -Wl,-h,test-ctors.so -o
test-ctors.so -lpthread -Wl,-O1 -Wl,--as-needed -march=native -pipe -ggdb
-flto=5 -fuse-linker-plugin -Wl,-znow -Wl,--sort-common -Wl,--hash-style=gnu
-Wl,--enable-new-dtags -Wl,-z,noexecstack -Wl,-z,text 
-Wl,-rpath-link,/home/misc/gentoo/tmp/portage/www-client/firefox-29.0.1/work/mozilla-release/obj-x86_64-pc-linux-gnu/dist/bin
-Wl,-rpath-link,/usr/lib  test-ctors.o -nostartfiles
===
=== If you get failures below, please file a bug describing the error
=== and your environment (compiler and linker versions), and use
=== --disable-elf-hack until this is fixed.
===
# Fail if the library doesn't have INIT .dynamic info
readelf -d test-ctors.so | grep '(INIT)'
 0x000c (INIT)   0x0
/home/misc/gentoo/tmp/portage/www-client/firefox-29.0.1/work/mozilla-release/obj-x86_64-pc-linux-gnu/build/unix/elfhack/elfhack
-b -f test-ctors.so
===
=== If you get failures below, please file a bug describing the error
=== and your environment (compiler and linker versions), and use
=== --disable-elf-hack until this is fixed.
===
# Fail if the library doesn't have INIT_ARRAY .dynamic info
test-ctors.so: Reduced by 12096 bytes
readelf -d test-array.so | grep '(INIT_ARRAY)'
# Fail if the backup file doesn't exist
[ -f 'test-ctors.so.bak' ]
 0x0019 (INIT_ARRAY) 0x9790
# Fail if the new library doesn't contain less relocations
/home/misc/gentoo/tmp/portage/www-client/firefox-29.0.1/work/mozilla-release/obj-x86_64-pc-linux-gnu/build/unix/elfhack/elfhack
-b -f test-array.so
test-array.so: [ $(objdump -R test-ctors.so.bak | wc -l) -gt $(objdump -R
test-ctors.so | wc -l) ]
Reduced by 12088 bytes
# Fail if the backup file doesn't exist
[ -f 'test-array.so.bak' ]
# Fail if the new library doesn't contain less relocations
[ $(objdump -R test-array.so.bak | wc -l) -gt $(objdump -R test-array.so | wc
-l) ]
# Will either crash or return exit code 1 if elfhack is broken
LD_PRELOAD=/home/misc/gentoo/tmp/portage/www-client/firefox-29.0.1/work/mozilla-release/obj-x86_64-pc-linux-gnu/build/unix/elfhack/test-array.so

[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2014-04-09 Thread trippels at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #209 from Markus Trippelsdorf <trippels at gcc dot gnu.org> ---
(In reply to Markus Trippelsdorf from comment #208)
> Both issues from Comment 201 were fixed by:
> http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00338.html

No, only the first issue is fixed. The second one (LTO/PGO build)
still happens unfortunately.


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2014-04-08 Thread trippels at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #208 from Markus Trippelsdorf <trippels at gcc dot gnu.org> ---
Both issues from Comment 201 were fixed by:
http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00338.html


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2014-04-02 Thread mliska at suse dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #206 from Martin Liška <mliska at suse dot cz> ---
Firefox (and chromium) memory reports with -flto=9 and -O2; the archive also
contains a memory usage graph:

https://docs.google.com/file/d/0B0pisUJ80pO1bnV5V0RtWXJkaVU/edit

[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2014-04-02 Thread mliska at suse dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #207 from Martin Liška <mliska at suse dot cz> ---
Created attachment 32525
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=32525&action=edit
Memory usage graphs for -flto=9, -flto=4, -flto=1 with -O2

[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2014-03-30 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #205 from Jan Hubicka <hubicka at ucw dot cz> ---
I was looking into this recently, too.  Curiously enough, for me clang+LTO was
winning, but comparing the symbols it seemed that the configure scripts picked
a few more features on the GCC side.  I looked briefly at the differences: we
can optimize out more vtables, for which I have a patch pending for the next
stage1, and optimize out write-only global vars.
Still, the differences may be worth further investigation - clang seems to
produce noticeably fewer external relocations, too. This seems like an ABI bug
on the clang side, though.

What I use for my firefox builds is --param inline-unit-growth=5.  Our -O3
seems a bit of overkill for an application of the size of Firefox...

Honza


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2014-03-29 Thread trippels at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #204 from Markus Trippelsdorf <trippels at gcc dot gnu.org> ---
Here is a comparison of libxul sizes (in bytes, unstripped) for different
compiler options:

gcc (trunk):
-O3 90213016
-O3 -flto   79682648
-O3 -flto / PGO 77250512
-Os 70431584
-Os -flto   62474008

clang (trunk):
-O3 80574784
-O3 -flto   79394992
-Os 72452776
-Os -flto   65111640
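
To make the comparison easier to read, the relative effect of -flto can be computed from the numbers above. This is a minimal sketch: the SIZES table is copied from the listing, and reduction_percent is a hypothetical helper name introduced only for illustration.

```python
# Unstripped libxul sizes in bytes, copied from the table above.
SIZES = {
    "gcc": {
        "-O3": 90213016,
        "-O3 -flto": 79682648,
        "-O3 -flto / PGO": 77250512,
        "-Os": 70431584,
        "-Os -flto": 62474008,
    },
    "clang": {
        "-O3": 80574784,
        "-O3 -flto": 79394992,
        "-Os": 72452776,
        "-Os -flto": 65111640,
    },
}


def reduction_percent(before, after):
    """Size reduction going from `before` to `after`, in percent of `before`."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    for compiler, sizes in SIZES.items():
        for level in ("-O3", "-Os"):
            pct = reduction_percent(sizes[level], sizes[level + " -flto"])
            print(f"{compiler} {level}: -flto shrinks libxul by {pct:.1f}%")
```

The percentages describe only this one build of libxul; they are not a general statement about either compiler.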


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2014-03-06 Thread jamborm at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #200 from Martin Jambor <jamborm at gcc dot gnu.org> ---
I currently cannot build Firefox with LTO due to PR 60449 (yeah, I
know, using gcc configured with checking makes life hard, sometimes
unnecessarily).

I get errors like
 /home/mjambor/mozilla/mzc2/media/libvpx/vp8/encoder/onyx_if.c:4884:5: error:
control flow  in the middle of basic block 7


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2014-03-06 Thread trippels at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #201 from Markus Trippelsdorf <trippels at gcc dot gnu.org> ---
With current gcc trunk and mozilla-central trunk, Firefox crashes on startup
when built with -flto (--enable-optimize=-O3):

0x75ce5d8f in nsCOMPtr_base::assign_with_AddRef(nsISupports*) [clone
.constprop.13162] () from /var/tmp/moz-build-dir/dist/bin/libxul.so
(gdb) bt
#0  0x75ce5d8f in nsCOMPtr_base::assign_with_AddRef(nsISupports*)
[clone .constprop.13162] () from /var/tmp/moz-build-dir/dist/bin/libxul.so
#1  0x73fe60eb in nsSocketTransport::OnSocketDetached(PRFileDesc*) ()
from /var/tmp/moz-build-dir/dist/bin/libxul.so
#2  0x73eb74ac in
nsSocketTransportService::DetachSocket(nsSocketTransportService::SocketContext*,
nsSocketTransportService::SocketContext*) ()
   from /var/tmp/moz-build-dir/dist/bin/libxul.so
#3  0x73fff28f in nsSocketTransportService::Run() () from
/var/tmp/moz-build-dir/dist/bin/libxul.so
#4  0x74059c6a in nsThread::ProcessNextEvent(bool, bool*) () from
/var/tmp/moz-build-dir/dist/bin/libxul.so
#5  0x75ce5b39 in NS_ProcessNextEvent(nsIThread*, bool) [clone
.constprop.13167] () from /var/tmp/moz-build-dir/dist/bin/libxul.so
#6  0x745af7a0 in
mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*)
() from /var/tmp/moz-build-dir/dist/bin/libxul.so
#7  0x73ec649d in MessageLoop::Run() () from
/var/tmp/moz-build-dir/dist/bin/libxul.so
#8  0x73fe7a56 in nsThread::ThreadFunc(void*) () from
/var/tmp/moz-build-dir/dist/bin/libxul.so
#9  0x77e7757c in _pt_root () from
/var/tmp/moz-build-dir/dist/bin/libnspr4.so
#10 0x77bc41e2 in start_thread () from /lib/libpthread.so.0
#11 0x774932ad in clone () from /lib/libc.so.6

When I build with PGO/LTO Firefox crashes later (when I close a
tab with e.g.: https://github.com/JuliaLang/julia/pull/6018 ):

Program received signal SIGSEGV, Segmentation fault.
0x751645ed in PL_DHashTableEnumerate(PLDHashTable*, PLDHashOperator
(*)(PLDHashTable*, PLDHashEntryHdr*, unsigned int, void*), void*) ()
   from /var/tmp/firefox-destdir/usr/lib/firefox-30.0a1/libxul.so
(gdb) bt
#0  0x751645ed in PL_DHashTableEnumerate(PLDHashTable*, PLDHashOperator
(*)(PLDHashTable*, PLDHashEntryHdr*, unsigned int, void*), void*) ()
   from /var/tmp/firefox-destdir/usr/lib/firefox-30.0a1/libxul.so
#1  0x75754d32 in PresShell::Destroy() () from
/var/tmp/firefox-destdir/usr/lib/firefox-30.0a1/libxul.so
#2  0x75754831 in nsDocumentViewer::DestroyPresShell() () from
/var/tmp/firefox-destdir/usr/lib/firefox-30.0a1/libxul.so
#3  0x755ee5c4 in nsDocumentViewer::Hide() () from
/var/tmp/firefox-destdir/usr/lib/firefox-30.0a1/libxul.so
#4  0x757b72eb in nsDocShell::SetVisibility(bool) () from
/var/tmp/firefox-destdir/usr/lib/firefox-30.0a1/libxul.so
#5  0x75a589a4 in nsFrameLoader::Hide() () from
/var/tmp/firefox-destdir/usr/lib/firefox-30.0a1/libxul.so
#6  0x75a588f6 in nsHideViewer::Run() () from
/var/tmp/firefox-destdir/usr/lib/firefox-30.0a1/libxul.so
#7  0x753b97de in nsContentUtils::RemoveScriptBlocker() () from
/var/tmp/firefox-destdir/usr/lib/firefox-30.0a1/libxul.so
#8  0x753cc954 in nsDocument::EndUpdate(unsigned int) () from
/var/tmp/firefox-destdir/usr/lib/firefox-30.0a1/libxul.so
#9  0x75651dd6 in mozilla::dom::XULDocument::EndUpdate(unsigned int) ()
from /var/tmp/firefox-destdir/usr/lib/firefox-30.0a1/libxul.so
#10 0x7549673b in nsINode::doRemoveChildAt(unsigned int, bool,
nsIContent*, nsAttrAndChildArray) () from
/var/tmp/firefox-destdir/usr/lib/firefox-30.0a1/libxul.so
#11 0x75496085 in nsXULElement::RemoveChildAt(unsigned int, bool) ()
from /var/tmp/firefox-destdir/usr/lib/firefox-30.0a1/libxul.so
#12 0x75494df9 in nsINode::RemoveChild(nsINode, mozilla::ErrorResult)
() from /var/tmp/firefox-destdir/usr/lib/firefox-30.0a1/libxul.so
#13 0x75494a00 in mozilla::dom::NodeBinding::removeChild(JSContext*,
JS::HandleJSObject*, nsINode*, JSJitMethodCallArgs const) [clone
.lto_priv.13709] ()
   from /var/tmp/firefox-destdir/usr/lib/firefox-30.0a1/libxul.so
#14 0x753b01e7 in mozilla::dom::GenericBindingMethod(JSContext*,
unsigned int, JS::Value*) () from
/var/tmp/firefox-destdir/usr/lib/firefox-30.0a1/libxul.so
#15 0x75262744 in js::Invoke(JSContext*, JS::CallArgs,
js::MaybeConstruct) () from
/var/tmp/firefox-destdir/usr/lib/firefox-30.0a1/libxul.so
#16 0x7524a14c in Interpret(JSContext*, js::RunState) () from
/var/tmp/firefox-destdir/usr/lib/firefox-30.0a1/libxul.so
#17 0x75249801 in js::RunScript(JSContext*, js::RunState) () from
/var/tmp/firefox-destdir/usr/lib/firefox-30.0a1/libxul.so
#18 0x752627ec in js::Invoke(JSContext*, JS::CallArgs,
js::MaybeConstruct) () from
/var/tmp/firefox-destdir/usr/lib/firefox-30.0a1/libxul.so
#19 0x752a574c in js::Invoke(JSContext*, JS::Value 

[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2014-03-06 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #202 from H.J. Lu <hjl.tools at gmail dot com> ---
LTO miscompiles 435.gromacs in SPEC CPU 2006 on x32 with

-mx32 -O3 -funroll-loops -ffast-math

since r208165 (PR 60418).  Can you try r208163?


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2014-03-06 Thread trippels at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #203 from Markus Trippelsdorf <trippels at gcc dot gnu.org> ---
(In reply to H.J. Lu from comment #202)
> LTO miscompiles 435.gromacs in SPEC CPU 2006 on x32 with
> 
> -mx32 -O3 -funroll-loops -ffast-math
> 
> since r208165 (PR 60418).  Can you try r208163?

Yes. Unfortunately with r208163 Firefox still crashes on startup.


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2014-01-17 Thread trippels at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

Markus Trippelsdorf <trippels at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |trippels at gcc dot gnu.org

--- Comment #197 from Markus Trippelsdorf <trippels at gcc dot gnu.org> ---
Created attachment 31876
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31876&action=edit
mozilla-central patch


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2014-01-17 Thread trippels at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #198 from Markus Trippelsdorf <trippels at gcc dot gnu.org> ---
Created attachment 31877
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31877&action=edit
My local PGO/LTO script


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2014-01-17 Thread trippels at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #199 from Markus Trippelsdorf <trippels at gcc dot gnu.org> ---
Created attachment 31878
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31878&action=edit
.mozconfig_profile_gen


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-09-06 Thread markus at trippelsdorf dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #196 from Markus Trippelsdorf <markus at trippelsdorf dot de> ---
(In reply to Jan Hubicka from comment #195)
> Today there were two fixes for bugs that produce undefined symbols like the
> one you see.
> Does the problem still exist on current mainline?  Are you using profile
> feedback?

The problem is gone on current mainline. (And yes I'm using profile feedback.)


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-09-05 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #195 from Jan Hubicka <hubicka at ucw dot cz> ---
Today there were two fixes for bugs that produce undefined symbols like the
one you see.
Does the problem still exist on current mainline?  Are you using profile
feedback?


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-09-03 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #193 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
I am building firefox with -O3 and get no undefined symbols.  Can you please
relink with -Wl,--no-demangle --save-temps -fdump-ipa-all and try to look up
the missing symbol in the -lm.res file?  If it is not UNDEF there, please make
the dumps available somewhere.
If it is undefined there, it may be a firefox bug.
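
The lookup described above could be scripted roughly as follows. This is a sketch under a stated assumption: each symbol entry in the resolution file is taken to be a whitespace-separated line ending in the (mangled) symbol name, with the resolution keyword in the third field. The function name find_resolutions and the sample data are hypothetical, so check the layout against the .res file actually written by --save-temps.

```python
def find_resolutions(res_text, symbol):
    """Return the resolution keywords recorded for `symbol`.

    Assumed (not verified here) entry layout, one symbol per line:
        <index> <id> <RESOLUTION> <mangled-name>
    Lines with fewer than four fields (file headers, counts) are skipped.
    """
    hits = []
    for line in res_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[-1] == symbol:
            hits.append(fields[2])
    return hits


if __name__ == "__main__":
    # Made-up sample content, only to show the intended usage.
    sample = "\n".join([
        "1",
        "example.o 2",
        "10 0123abcd PREVAILING_DEF_IRONLY _ZN7Example3fooEv",
        "11 0123abcd UNDEF _ZN7Example3barEv",
    ])
    print(find_resolutions(sample, "_ZN7Example3fooEv"))
    print(find_resolutions(sample, "_ZN7Example6absentEv"))
```

An empty result corresponds to the case where the symbol does not appear in the file at all; since the relink uses -Wl,--no-demangle, the symbol should be passed in its mangled form.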


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-09-03 Thread markus at trippelsdorf dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #194 from Markus Trippelsdorf <markus at trippelsdorf dot de> ---
(In reply to Jan Hubicka from comment #193)
> I am building firefox with -O3 and get no undefined symbols.  Can you please
> relink with -Wl,--no-demangle --save-temps -fdump-ipa-all and try to look up
> the missing symbol in the -lm.res file?  If it is not UNDEF there, please
> make the dumps available somewhere.
> If it is undefined there, it may be a firefox bug.

Hmm, it's strange, because there are five undefined references;
one of them does not appear in lm.res at all and the other four 
are all PREVAILING_DEF_IRONLY.
(The whole dump is huge. Please tell me which part you need and
I will try to upload it somewhere.)


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-08-29 Thread markus at trippelsdorf dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #191 from Markus Trippelsdorf <markus at trippelsdorf dot de> ---
First of all, many thanks for your work on reducing memory usage.
Peak memory usage is now lower (~3GB) than clang's (~4GB).

However, with --enable-optimize=-O3 on rev202079 I get the errors below.
(A default (-Os) build on rev202053 went fine this morning.)

/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.0/../../../../x86_64-pc-linux-gnu/bin/ld: error: /tmp/ccd3grW1.ltrans0.ltrans.o: requires dynamic R_X86_64_PC32 reloc against '_ZN17nsHttpTransaction18ReadRequestSegmentEP14nsIInputStreamPvPKcjjPj' which may overflow at runtime; recompile with -fPIC
/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.0/../../../../x86_64-pc-linux-gnu/bin/ld: error: /tmp/ccd3grW1.ltrans0.ltrans.o: requires dynamic R_X86_64_PC32 reloc against '_ZN17nsHttpTransaction18ReadRequestSegmentEP14nsIInputStreamPvPKcjjPj' which may overflow at runtime; recompile with -fPIC
/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.0/../../../../x86_64-pc-linux-gnu/bin/ld: error: /tmp/ccd3grW1.ltrans1.ltrans.o: requires dynamic R_X86_64_PC32 reloc against '_ZN16nsInputStreamTee15WriteSegmentFunEP14nsIInputStreamPvPKcjjPj' which may overflow at runtime; recompile with -fPIC
/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.0/../../../../x86_64-pc-linux-gnu/bin/ld: error: /tmp/ccd3grW1.ltrans24.ltrans.o: requires dynamic R_X86_64_PC32 reloc against '_ZN16nsInputStreamTee15WriteSegmentFunEP14nsIInputStreamPvPKcjjPj' which may overflow at runtime; recompile with -fPIC
/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.0/../../../../x86_64-pc-linux-gnu/bin/ld: error: read-only segment has dynamic relocations
/tmp/ccd3grW1.ltrans0.ltrans.o:ccd3grW1.ltrans0.o:function nsHttpTransaction::ReadSegments(nsAHttpSegmentReader*, unsigned int, unsigned int*): error: undefined reference to 'nsHttpTransaction::ReadRequestSegment(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*)'
/tmp/ccd3grW1.ltrans0.ltrans.o:ccd3grW1.ltrans0.o:function nsHttpConnection::OnSocketWritable(): error: undefined reference to 'nsHttpTransaction::ReadRequestSegment(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*)'
/tmp/ccd3grW1.ltrans0.ltrans.o:ccd3grW1.ltrans0.o:function nsHttpPipeline::ReadSegments(nsAHttpSegmentReader*, unsigned int, unsigned int*): error: undefined reference to 'nsHttpPipeline::ReadFromPipe(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*)'
/tmp/ccd3grW1.ltrans1.ltrans.o:ccd3grW1.ltrans1.o:function imgRequest::OnDataAvailable(nsIRequest*, nsISupports*, nsIInputStream*, unsigned long, unsigned int): error: undefined reference to 'nsInputStreamTee::WriteSegmentFun(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*)'
/tmp/ccd3grW1.ltrans24.ltrans.o:ccd3grW1.ltrans24.o:function nsInputStreamTee::ReadSegments(tag_nsresult (*)(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*), void*, unsigned int, unsigned int*): error: undefined reference to 'nsInputStreamTee::WriteSegmentFun(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*)'

Not sure if -O3 or rev202079 is to blame.


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-08-29 Thread markus at trippelsdorf dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #192 from Markus Trippelsdorf <markus at trippelsdorf dot de> ---
It turned out that --enable-optimize=-O3 is the cause.
Rev202079 with -Os links fine.


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-08-21 Thread marxin.liska at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

Martin Liška marxin.liska at gmail dot com changed:

   What|Removed |Added

 CC||marxin.liska at gmail dot com

--- Comment #189 from Martin Liška marxin.liska at gmail dot com ---
I've encountered problems connected with PGO:

gcc revision: 201894
firefox changeset:  143205:1d6bf2bd4003 (Aug 20, 2013)

I build an instrumented binary without LTO and then use the profile for the
LTO build:
MYFLAGS=-flto=9 -fno-fat-lto-objects -ftoplevel-reorder -fprofile-use
-Wno-error=coverage-mismatch

Some gcda files are already mentioned in this thread as problematic; I removed
these:

jemalloc.gcda (makes sense)
ptsynch.gcda (likewise)

HashFunctions.gcda (?)
sqlite3.gcda (?)

After linking sqlite3, many profiles are reported as corrupted, for example:
/ssd/firefox/js/src/gc/Marking.cpp
/ssd/firefox/js/src/frontend/BytecodeEmitter.cpp
/ssd/firefox/js/src/frontend/Interpreter.cpp
...

Example of an error:
/ssd/firefox/js/src/gc/Marking.cpp: In function
‘js::gc::IsAboutToBeFinalizedJSAtom(JSAtom**)bool [clone .isra.65]’:
/ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
profile data is not flow-consistent
 }
 ^
/ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
number of executions for edge 3-6 thought to be -81
/ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
number of executions for edge 3-4 thought to be 39667
/ssd/firefox/js/src/gc/Marking.cpp: In function
‘js::gc::IsAboutToBeFinalizedjs::UnownedBaseShape(js::UnownedBaseShape**)bool
[clone .isra.52]’:
/ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
profile data is not flow-consistent
/ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
number of executions for edge 3-6 thought to be -1
/ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
number of executions for edge 3-4 thought to be 41156
/ssd/firefox/js/src/gc/Marking.cpp: In function
‘MarkInternalJSAtom(JSTracer*, JSAtom**)void’:
/ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
profile data is not flow-consistent
/ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
number of executions for edge 9-14 thought to be -39
/ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
number of executions for edge 9-10 thought to be 180119
/ssd/firefox/js/src/gc/Marking.cpp: In function
‘MarkInternalJSObject(JSTracer*, JSObject**)void’:
/ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
profile data is not flow-consistent
/ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
number of executions for edge 11-18 thought to be -1
/ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
number of executions for edge 11-12 thought to be 49007
/ssd/firefox/js/src/gc/Marking.cpp: In member function ‘js::MarkStackunsigned
long::push(unsigned long)’:
/ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
profile data is not flow-consistent
/ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
number of executions for edge 4-6 thought to be -1
/ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
number of executions for edge 4-5 thought to be 1
/ssd/firefox/js/src/gc/Marking.cpp: In member function
‘js::GCMarker::drainMarkStack(js::SliceBudget)’:
/ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
profile data is not flow-consistent
/ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
number of executions for edge 3-4 thought to be -7
/ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
number of executions for edge 3-1 thought to be 7
/ssd/firefox/js/src/gc/Marking.cpp: In member function
‘js::ObjectImpl::slotSpan() const’:
/ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
profile data is not flow-consistent
/ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
number of executions for edge 5-7 thought to be -1
/ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
number of executions for edge 5-6 thought to be 15965

Thank you,
Martin

[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-08-21 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #190 from Jan Hubicka hubicka at ucw dot cz ---
> /ssd/firefox/js/src/gc/Marking.cpp: In function
> ‘js::gc::IsAboutToBeFinalizedJSAtom(JSAtom**)bool [clone .isra.65]’:
> /ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
> profile data is not flow-consistent
>  }
>  ^
> /ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
> number of executions for edge 3-6 thought to be -81

This actually looks like corruption from concurrent updates (profiling is not
thread safe).  Do you get many more of these?
I can imagine that the garbage collector runs in parallel, and often.

> /ssd/firefox/js/src/gc/Marking.cpp:1713:1: error: corrupted profile info:
> number of executions for edge 3-4 thought to be 39667

Perhaps we should fix dumping to dump the full 64-bit value. :)

Honza
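
A small sketch of why a lost counter update can surface as a negative edge count. It assumes, and this is not stated in the report, that only some edges carry their own counter while the remaining ones are derived by flow conservation (executions entering a block equal executions leaving it). The function names and the block count 39586 are hypothetical; the count is chosen only so that the arithmetic reproduces the "edge 3-4 ... 39667" and "edge 3-6 ... -81" messages quoted above.

```python
def derive_uncounted_edge(block_count, counted_out_edges):
    """Return the count of the one outgoing edge that has no counter.

    Flow conservation: executions entering a block must equal the sum of
    the executions of its outgoing edges.
    """
    return block_count - sum(counted_out_edges)


def is_flow_consistent(block_count, counted_out_edges):
    """A profile is flow-consistent only if no derived count is negative."""
    return derive_uncounted_edge(block_count, counted_out_edges) >= 0


# Hypothetical block 3: racing threads lost some increments, so the block
# count ends up smaller than the count of one of its outgoing edges.
block_3_count = 39586   # assumed value, not taken from the log
edge_3_4 = 39667        # from the quoted error message
edge_3_6 = derive_uncounted_edge(block_3_count, [edge_3_4])
print(edge_3_6, is_flow_consistent(block_3_count, [edge_3_4]))
```

Later GCC releases offer -fprofile-update=atomic for the instrumented build, which trades some run-time cost for counters that do not lose increments under threads; whether it would have helped in this particular build is not something this thread establishes.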


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-08-03 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #187 from Jan Hubicka hubicka at gcc dot gnu.org ---
WPA time report
Execution times (seconds)
 phase setup             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    1398 kB ( 0%) ggc
 phase opt and generate  :  80.79 (13%) usr   1.01 ( 3%) sys  81.96 (12%) wall  315727 kB (25%) ggc
 phase stream in         : 283.33 (45%) usr   7.82 (24%) sys 292.12 (44%) wall  940315 kB (74%) ggc
 phase stream out        : 261.66 (42%) usr  23.14 (72%) sys 287.88 (43%) wall    7534 kB ( 1%) ggc
 garbage collection      :  14.45 ( 2%) usr   0.02 ( 0%) sys  14.48 ( 2%) wall       0 kB ( 0%) ggc
 callgraph optimization  :   2.55 ( 0%) usr   0.00 ( 0%) sys   2.55 ( 0%) wall      33 kB ( 0%) ggc
 ipa cp                  :  10.45 ( 2%) usr   0.36 ( 1%) sys  10.81 ( 2%) wall  456287 kB (36%) ggc
 ipa inlining heuristics :  42.12 ( 7%) usr   1.06 ( 3%) sys  43.27 ( 7%) wall 1485346 kB (117%) ggc
 ipa lto gimple in       :   0.56 ( 0%) usr   0.25 ( 1%) sys   0.87 ( 0%) wall       0 kB ( 0%) ggc
 ipa lto gimple out      :  21.77 ( 3%) usr   1.72 ( 5%) sys  23.53 ( 4%) wall       0 kB ( 0%) ggc
 ipa lto decl in         : 183.90 (29%) usr   4.77 (15%) sys 189.46 (29%) wall  959299 kB (76%) ggc
 ipa lto decl out        : 231.70 (37%) usr  10.78 (34%) sys 242.73 (37%) wall       0 kB ( 0%) ggc
 ipa lto cgraph I/O      :  14.38 ( 2%) usr   1.57 ( 5%) sys  15.99 ( 2%) wall 2405760 kB (190%) ggc
 ipa lto decl merge      :  32.16 ( 5%) usr   0.00 ( 0%) sys  32.24 ( 5%) wall    8268 kB ( 1%) ggc
 ipa lto cgraph merge    :  28.72 ( 5%) usr   0.06 ( 0%) sys  28.81 ( 4%) wall  135235 kB (11%) ggc
 whopr wpa               :   9.57 ( 2%) usr   0.05 ( 0%) sys   9.62 ( 1%) wall    7537 kB ( 1%) ggc
 whopr wpa I/O           :   2.07 ( 0%) usr  10.62 (33%) sys  15.49 ( 2%) wall       0 kB ( 0%) ggc
 whopr partitioning      :   3.26 ( 1%) usr   0.03 ( 0%) sys   3.29 ( 0%) wall       0 kB ( 0%) ggc
 ipa reference           :   5.55 ( 1%) usr   0.05 ( 0%) sys   5.62 ( 1%) wall       0 kB ( 0%) ggc
 ipa profile             :   2.82 ( 0%) usr   0.05 ( 0%) sys   2.88 ( 0%) wall       0 kB ( 0%) ggc
 ipa pure const          :   6.25 ( 1%) usr   0.13 ( 0%) sys   6.38 ( 1%) wall       0 kB ( 0%) ggc
 unaccounted todo        :  13.25 ( 2%) usr   0.28 ( 1%) sys  13.58 ( 2%) wall       0 kB ( 0%) ggc
 TOTAL                   : 625.79           31.97          661.97            1264976 kB


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-08-02 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #185 from Jan Hubicka hubicka at gcc dot gnu.org ---
I merged in some patches intended to reduce the memory use of Firefox LTO and
also updated the Firefox tree. Some more involved patches are on the way, so
here is a summary of where we stand now.

WPA memory usage as reported by top is 10GB now.

1) After streaming in trees, the GGC usage is now 5.1GB
   - 2.5GB are trees,
   - 1GB are linemaps
   - 0.8GB are decl maps (decl states)

tree_list      12561507
integer_type    1511296
pointer_type    4610735
record_type     8139077
method_type     2401664
integer_cst     6677946
string_cst      2127890
function_decl   6069299
label_decl       504859
field_decl      5104957
var_decl         596020
const_decl      5401253
parm_decl       9002744
type_decl      10150100
result_decl     2181250
addr_expr       4173661
tree_binfo      4780477


I have a cache that cuts down the linemaps, plus a patch to not stream PARM_DECLs
and RESULT_DECLs.  With these the usage goes below 3GB.

2) Cgraph streaming now becomes an important factor.
   GGC usage goes up to 7.7GB
   GGC use:
 - cgraph nodes themselves are 1.5GB
 - inline summaries are 0.5GB
 - cgraph edges are 3.7GB
 - IPA references 2.3GB
 - IPA-prop 0.7GB
   Off GGC
 - IPA-prop 0.6GB
 - Inline summary 0.5GB
 - symtab encoder 0.17GB

   Here one can easily
 - compress the vectors recording definitions
 - pull off parts of cgraph nodes that are not really needed by WPA (nested
info, etc.)
 - perhaps implement streaming of the merged cgraph.

So the good news is that we now have a lot of interesting low-hanging fruit. The
bad news is that tree streaming still feels slow.  I suppose we need to dig more
into what trees really need to go into WPA...


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-08-02 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #186 from Jan Hubicka hubicka at gcc dot gnu.org ---
oprofile of merging
67647 13.0501  lto1 inflate_fast
38682 7.4624  lto1 compare_tree_sccs_1(tree_node*,
tree_node*, tree_node***)
32365 6.2437  lto1 streamer_read_uhwi(lto_input_block*)
31198 6.0186  lto1
streamer_read_tree_bitfields(lto_input_block*, data_in*, tree_node*)
21155 4.0811  libc-2.11.1.so   msort_with_tmp
19581 3.7775  lto1 ht_lookup_with_hash(ht*, unsigned
char const*, unsigned long, unsigned int, ht_lookup_option)
16584 3.1993  lto1 lto_input_tree(lto_input_block*,
data_in*)
15203 2.9329  lto1 lto_input_tree_1(lto_input_block*,
data_in*, LTO_tags, unsigned int)
15194 2.9312  libc-2.11.1.so   memcpy
14823 2.8596  lto1 htab_find_slot_with_hash
12860 2.4809  lto1
streamer_read_tree_body(lto_input_block*, data_in*, tree_node*)
12705 2.4510  lto1 hash_table<tree_scc_hasher,
xcallocator>::find_slot_with_hash(tree_scc const*, unsigned int, insert_option)
11773 2.2712  lto1 adler32
11504 2.2193  libc-2.11.1.so   _IO_vfscanf
11401 2.1994  lto1 unify_scc(streamer_tree_cache_d*,
unsigned int, unsigned int, unsigned int, unsigned int)
9548  1.8420  lto1
streamer_get_pickled_tree(lto_input_block*, data_in*)
9315  1.7970  lto1 inflate

IPA
18799 6.2862  lto1
symtab_remove_unreachable_nodes(bool, _IO_FILE*)
11878 3.9719  lto1
cgraph_redirect_edge_callee(cgraph_edge*, cgraph_node*)
11223 3.7528  lto1 do_per_function(void (*)(void*),
void*)
10813 3.6157  lto1 pointer_set_lookup(pointer_set_t
const*, void const*, unsigned long*)
8415  2.8139  lto1 ipa_reverse_postorder(cgraph_node**)
7689  2.5711  lto1 htab_find_slot_with_hash
7677  2.5671  lto1 do_estimate_growth_1(cgraph_node*,
void*)
7477  2.5002  libc-2.11.1.so   free
7035  2.3524  libc-2.11.1.so   malloc_consolidate

Stream out
9440 16.1663  lto1 linemap_lookup(line_maps*, unsigned
int)
7663 13.1231  lto1 DFS_write_tree(output_block*, sccs*,
tree_node*, bool, bool)
6052 10.3643  lto1
streamer_write_uhwi_stream(lto_output_stream*, unsigned long)
5831  9.9858  lto1 pointer_set_lookup(pointer_set_t
const*, void const*, unsigned long*)
3342  5.7233  lto1
streamer_tree_cache_lookup(streamer_tree_cache_d*, tree_node*, unsigned int*)
2229  3.8172  lto1 pointer_map_insert(pointer_map_t*,
void const*)
2196  3.7607  lto1
streamer_pack_tree_bitfields(output_block*, bitpack_d*, tree_node*)
2054  3.5175  lto1 lto_output_tree(output_block*,
tree_node*, bool, bool)
1656  2.8360  lto1 inflate_fast
1655  2.8342  lto1 pointer_map<unsigned
int>::insert(void const*, bool*)


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-06-19 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #184 from Jan Hubicka hubicka at gcc dot gnu.org ---
New profiles after Richard's changes to remove pointer maps from streaming in.

Stream in:
samples  %        app name         symbol name
36599 12.3464  lto1 inflate_fast
27382 9.2371  lto1 streamer_read_uhwi(lto_input_block*)
19282 6.5047  lto1
streamer_read_tree_bitfields(lto_input_block*, data_in*, tree_node*)
15807 5.3324  lto1 compare_tree_sccs_1(tree_node*,
tree_node*, tree_node***)
11385 3.8407  libc-2.11.1.so   msort_with_tmp
9054  3.0543  libc-2.11.1.so   memcpy
8701  2.9352  lto1 htab_find_slot_with_hash
8506  2.8694  lto1 lto_input_tree(lto_input_block*,
data_in*)
8405  2.8354  lto1 lto_input_tree_1(lto_input_block*,
data_in*, LTO_tags, unsigned int)
8055  2.7173  lto1 ht_lookup_with_hash(ht*, unsigned
char const*, unsigned long, unsigned int, ht_lookup_option)
6436  2.1711  lto1
streamer_read_tree_body(lto_input_block*, data_in*, tree_node*)
6287  2.1209  lto1 adler32
5891  1.9873  lto1
streamer_get_pickled_tree(lto_input_block*, data_in*)


Stream out:
samples  %        app name         symbol name
19885 14.6837  lto1 DFS_write_tree(output_block*, sccs*,
tree_node*, bool, bool)
19285 14.2407  lto1 linemap_lookup(line_maps*, unsigned
int)
16192 11.9567  lto1
streamer_write_uhwi_stream(lto_output_stream*, unsigned long)
15926 11.7603  lto1 pointer_map_insert(pointer_map_t*,
void const*)
10285 7.5948  lto1 pointer_map_contains(pointer_map_t
const*, void const*)
7324  5.4083  lto1
streamer_tree_cache_lookup(streamer_tree_cache_d*, tree_node*, unsigned int*)
5897  4.3545  lto1
streamer_pack_tree_bitfields(output_block*, bitpack_d*, tree_node*)
5374  3.9683  lto1 lto_output_tree(output_block*,
tree_node*, bool, bool)
4896  3.6154  lto1
streamer_tree_cache_insert_1(streamer_tree_cache_d*, tree_node*, unsigned int,
unsigned int*, bool)
3285  2.4258  libc-2.11.1.so   memset
2669  1.9709  lto1
streamer_write_tree_body(output_block*, tree_node*, bool)
2520  1.8608  libc-2.11.1.so   memcpy
2383  1.7597  lto1
streamer_tree_cache_add_to_node_array(streamer_tree_cache_d*, unsigned int,
tree_node*, unsigned int)

linemap_lookup is an easy target, obviously.

Execution times (seconds)
 phase setup             :   0.02 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall    1399 kB ( 0%) ggc
 phase opt and generate  :  69.29 (14%) usr   0.82 ( 3%) sys  70.62 (13%) wall  270269 kB (11%) ggc
 phase stream in         : 224.95 (44%) usr   6.23 (22%) sys 236.02 (43%) wall 2174294 kB (89%) ggc
 phase stream out        : 213.26 (42%) usr  21.35 (75%) sys 236.87 (44%) wall    7157 kB ( 0%) ggc
 garbage collection      :   9.92 ( 2%) usr   0.00 ( 0%) sys   9.99 ( 2%) wall       0 kB ( 0%) ggc
 callgraph optimization  :   1.36 ( 0%) usr   0.00 ( 0%) sys   1.34 ( 0%) wall      32 kB ( 0%) ggc
 ipa cp                  :   7.65 ( 2%) usr   0.32 ( 1%) sys   8.01 ( 1%) wall  418436 kB (17%) ggc
 ipa inlining heuristics :  38.83 ( 8%) usr   0.83 ( 3%) sys  39.99 ( 7%) wall 1352530 kB (55%) ggc
 ipa lto gimple in       :   0.39 ( 0%) usr   0.05 ( 0%) sys   0.53 ( 0%) wall       0 kB ( 0%) ggc
 ipa lto gimple out      :  16.46 ( 3%) usr   1.39 ( 5%) sys  17.93 ( 3%) wall       0 kB ( 0%) ggc
 ipa lto decl in         : 158.55 (31%) usr   3.99 (14%) sys 166.99 (31%) wall 2583106 kB (105%) ggc
 ipa lto decl out        : 191.10 (38%) usr  11.48 (40%) sys 203.47 (37%) wall       0 kB ( 0%) ggc
 ipa lto cgraph I/O      :   7.07 ( 1%) usr   1.17 ( 4%) sys   8.27 ( 2%) wall 2134131 kB (87%) ggc
 ipa lto decl merge      :  29.94 ( 6%) usr   0.01 ( 0%) sys  30.06 ( 6%) wall    8270 kB ( 0%) ggc
 ipa lto cgraph merge    :  12.02 ( 2%) usr   0.04 ( 0%) sys  12.13 ( 2%) wall  142240 kB ( 6%) ggc
 whopr wpa               :   7.30 ( 1%) usr   0.03 ( 0%) sys   7.39 ( 1%) wall    7160 kB ( 0%) ggc
 whopr wpa I/O           :   1.40 ( 0%) usr   8.46 (30%) sys  11.14 ( 2%) wall       0 kB ( 0%) ggc
 whopr partitioning      :   2.33 ( 0%) usr   0.01 ( 0%) sys   2.36 ( 0%) wall       0 kB ( 0%) ggc
 ipa reference           :   5.44 ( 1%) usr   0.04 ( 0%) sys   5.53 ( 1%) wall       0 kB ( 0%) ggc
 ipa profile             :   1.26 ( 0%) usr   0.04 ( 0%) sys   1.32 ( 0%) wall       0 kB ( 0%) ggc
 ipa pure const          :   5.87 ( 1%) usr   0.13 ( 0%) sys   6.03 ( 1%) wall       0 kB ( 0%) ggc
 inline parameters       :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
 

[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-06-17 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #182 from Jan Hubicka hubicka at gcc dot gnu.org ---
OK, after a while I should update the stats here.  Richard's new tree merging
patch makes libxul linking a lot faster and less memory-consuming.
Peak memory usage (in top) is now just below 10GB; with a bit of incremental
improvement I hope to get below 8GB again soon.

Build time is
real    19m0.355s
user    56m20.459s
sys     2m17.533s

GGC memory usage after stream in 4938399k

Execution times (seconds)
 phase setup             :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall    1399 kB ( 0%) ggc
 phase opt and generate  :  72.86 (12%) usr   0.90 ( 3%) sys  75.25 (11%) wall  270952 kB ( 7%) ggc
 phase stream in         : 274.88 (44%) usr   9.01 (26%) sys 294.99 (43%) wall 3478515 kB (93%) ggc
 phase stream out        : 282.18 (45%) usr  24.40 (71%) sys 308.42 (45%) wall    7162 kB ( 0%) ggc
 garbage collection      :  12.99 ( 2%) usr   0.01 ( 0%) sys  13.00 ( 2%) wall       0 kB ( 0%) ggc
 callgraph optimization  :   1.95 ( 0%) usr   0.00 ( 0%) sys   1.95 ( 0%) wall      32 kB ( 0%) ggc
 ipa cp                  :   9.82 ( 2%) usr   0.39 ( 1%) sys  10.26 ( 2%) wall  418482 kB (11%) ggc
 ipa inlining heuristics :  39.30 ( 6%) usr   1.12 ( 3%) sys  41.52 ( 6%) wall 1353294 kB (36%) ggc
 ipa lto gimple in       :   0.45 ( 0%) usr   0.15 ( 0%) sys   0.62 ( 0%) wall       0 kB ( 0%) ggc
 ipa lto gimple out      :  18.24 ( 3%) usr   1.50 ( 4%) sys  19.86 ( 3%) wall       0 kB ( 0%) ggc
 ipa lto decl in         : 200.68 (32%) usr   5.85 (17%) sys 216.44 (32%) wall 3887175 kB (103%) ggc
 ipa lto decl out        : 256.24 (41%) usr  13.44 (39%) sys 271.24 (40%) wall       0 kB ( 0%) ggc
 ipa lto cgraph I/O      :   7.20 ( 1%) usr   1.61 ( 5%) sys   8.83 ( 1%) wall 2134157 kB (57%) ggc
 ipa lto decl merge      :  27.71 ( 4%) usr   0.01 ( 0%) sys  27.72 ( 4%) wall    8270 kB ( 0%) ggc
 ipa lto cgraph merge    :  17.31 ( 3%) usr   0.07 ( 0%) sys  17.39 ( 3%) wall  142240 kB ( 4%) ggc
 whopr wpa               :   8.82 ( 1%) usr   0.04 ( 0%) sys   8.89 ( 1%) wall    7165 kB ( 0%) ggc
 whopr wpa I/O           :   1.63 ( 0%) usr   9.43 (27%) sys  11.19 ( 2%) wall       0 kB ( 0%) ggc
 whopr partitioning      :   3.21 ( 1%) usr   0.04 ( 0%) sys   3.25 ( 0%) wall       0 kB ( 0%) ggc
 ipa reference           :   5.56 ( 1%) usr   0.04 ( 0%) sys   5.81 ( 1%) wall       0 kB ( 0%) ggc
 ipa profile             :   1.83 ( 0%) usr   0.02 ( 0%) sys   1.86 ( 0%) wall       0 kB ( 0%) ggc
 ipa pure const          :   6.07 ( 1%) usr   0.18 ( 1%) sys   6.26 ( 1%) wall       0 kB ( 0%) ggc
 inline parameters       :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      14 kB ( 0%) ggc
 tree copy propagation   :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree PTA                :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 tree SSA rewrite        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      27 kB ( 0%) ggc
 tree SSA other          :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 tree CCP                :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 dominance computation   :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 varconst                :   0.14 ( 0%) usr   0.12 ( 0%) sys   0.24 ( 0%) wall       0 kB ( 0%) ggc
 unaccounted todo        :  10.69 ( 2%) usr   0.29 ( 1%) sys  11.10 ( 2%) wall       0 kB ( 0%) ggc
 TOTAL                   : 629.93           34.31          678.67            3758029 kB

Memory usage seems about the same with -g.
Honza


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-06-17 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #183 from Jan Hubicka hubicka at gcc dot gnu.org ---
type merging stats
[WPA] read 43156894 SCCs of average size 2.270660
[WPA] 97994652 tree bodies read in total
[WPA] tree SCC table: size 8388593, 3830511 elements, collision ratio: 0.684487
[WPA] tree SCC max chain length 88 (size 1)
[WPA] Compared 19139975 SCCs, 344923 collisions (0.018021)
[WPA] Merged 19067050 SCCs
[WPA] Merged 58757829 tree bodies
[WPA] Merged 11951381 types
[WPA] 4357267 types prevailed (13278034 associated trees)
[WPA] Old merging code merges an additional 2026163 types of which 140937 are
in the same SCC with their prevailing variant (12389865 and 6362266 associated
trees)
[WPA] GIMPLE canonical type table: size 131071, 77910 elements, 4357402
searches, 1095104 collisions (ratio: 0.251320)
[WPA] GIMPLE canonical type hash table: size 8388593, 4357346 elements,
15252531 searches, 11817317 collisions (ratio: 0.774777)
[WPA] # of input files: 4918
[WPA] # of input cgraph nodes: 0
[WPA] # of function bodies: 0
[WPA] # of output files: 0
[WPA] # of output symtab nodes: 0
[WPA] # of output tree pickle references: 0
[WPA] # of output tree bodies: 0
[WPA] # callgraph partitions: 0
[WPA] Compression: 1311851796 input bytes, 4153897270 uncompressed bytes
(ratio: 3.166438)
[WPA] Size of mmap'd section decls: 1311851796 bytes
[LTRANS] read 314277 SCCs of average size 6.082532
[LTRANS] 1911600 tree bodies read in total
[LTRANS] GIMPLE canonical type table: size 16381, 9653 elements, 453967
searches, 24697 collisions (ratio: 0.054403)
[LTRANS] GIMPLE canonical type hash table: size 1048573, 453913 elements,
1562009 searches, 1517260 collisions (ratio: 0.971352)
[LTRANS] # of input files: 1
[LTRANS] # of input cgraph nodes: 0
[LTRANS] # of function bodies: 0


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-03-08 Thread jamborm at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

Martin Jambor jamborm at gcc dot gnu.org changed:

   What|Removed |Added

 Depends on||56570

--- Comment #181 from Martin Jambor jamborm at gcc dot gnu.org 2013-03-08 10:41:54 UTC ---
The bug described in comment #179 is now PR 56570.


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-03-07 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #180 from Richard Biener rguenth at gcc dot gnu.org 2013-03-07 16:08:29 UTC ---
Try

Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c   (revision 196520)
+++ gcc/tree-inline.c   (working copy)
@@ -3929,7 +3929,7 @@ expand_call_inline (basic_block bb, gimp
 {
   id->block = make_node (BLOCK);
   BLOCK_ABSTRACT_ORIGIN (id->block) = fn;
-  BLOCK_SOURCE_LOCATION (id->block) = input_location;
+  BLOCK_SOURCE_LOCATION (id->block) = LOCATION_LOCUS (input_location);
   prepend_lexical_block (gimple_block (stmt), id->block);
 }


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-03-06 Thread jamborm at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #179 from Martin Jambor jamborm at gcc dot gnu.org 2013-03-06 15:14:35 UTC ---
I'm currently (gcc revision 196427, FF changeset 123831:c95439870e05)
facing a few ICEs during the compilation phase with the following
backtrace:

#0  0x00f89a73 in get_location_from_adhoc_loc (set=0x77ff2000,
    loc=2947526575) at /home/mjambor/gcc/trunk/src/libcpp/line-map.c:165
#1  0x00c247fe in inlined_function_outer_scope_p (block=0x7fffee4bcb28)
    at /home/mjambor/gcc/trunk/src/gcc/tree.h:5561
#2  pack_ts_block_value_fields (expr=0x7fffee4bcb28, bp=0x7fffd1a0,
    ob=0x1c73210)
    at /home/mjambor/gcc/trunk/src/gcc/tree-streamer-out.c:319
#3  streamer_pack_tree_bitfields (ob=0x1c73210, bp=0x7fffd1a0,
    expr=0x7fffee4bcb28)
    at /home/mjambor/gcc/trunk/src/gcc/tree-streamer-out.c:417
#4  0x009c3bc9 in lto_write_tree (ref_p=true, expr=0x7fffee4bcb28,
    ob=0x1c73210)
    at /home/mjambor/gcc/trunk/src/gcc/lto-streamer-out.c:317
#5  lto_output_tree (ob=0x1c73210, expr=0x7fffee4bcb28, ref_p=true,
    this_ref_p=<optimized out>) at
    /home/mjambor/gcc/trunk/src/gcc/lto-streamer-out.c:410
#6  0x00c26617 in write_ts_common_tree_pointers (ref_p=true,
    expr=0x73f6bc80, ob=0x1c73210)
    at /home/mjambor/gcc/trunk/src/gcc/tree-streamer-out.c:514
#7  streamer_write_tree_body (ob=0x1c73210, expr=0x73f6bc80,
    ref_p=<optimized out>)
    at /home/mjambor/gcc/trunk/src/gcc/tree-streamer-out.c:845
#8  0x009c3bf7 in lto_write_tree (ref_p=true, expr=0x73f6bc80,
    ob=0x1c73210)
    at /home/mjambor/gcc/trunk/src/gcc/lto-streamer-out.c:321
#9  lto_output_tree (ob=ob@entry=0x1c73210, expr=0x73f6bc80,
    ref_p=ref_p@entry=true,
    this_ref_p=this_ref_p@entry=true)
    at /home/mjambor/gcc/trunk/src/gcc/lto-streamer-out.c:410
#10 0x00c26e62 in write_ts_exp_tree_pointers (ref_p=<optimized out>,
    expr=<optimized out>, ob=<optimized out>)
    at /home/mjambor/gcc/trunk/src/gcc/tree-streamer-out.c:747
#11 streamer_write_tree_body (ob=0x1c73210, expr=0x7fffecc63dc0,
    ref_p=<optimized out>)
    at /home/mjambor/gcc/trunk/src/gcc/tree-streamer-out.c:884
#12 0x009c3bf7 in lto_write_tree (ref_p=true, expr=0x7fffecc63dc0,
    ob=0x1c73210)
    at /home/mjambor/gcc/trunk/src/gcc/lto-streamer-out.c:321
#13 lto_output_tree (ob=0x1c73210, expr=0x7fffecc63dc0, ref_p=true,
    this_ref_p=<optimized out>) at
    /home/mjambor/gcc/trunk/src/gcc/lto-streamer-out.c:410
#14 0x00c26df8 in write_ts_exp_tree_pointers (ref_p=<optimized out>,
    expr=<optimized out>, ob=<optimized out>)
    at /home/mjambor/gcc/trunk/src/gcc/tree-streamer-out.c:746
#15 streamer_write_tree_body (ob=0x1c73210, expr=0x7fffecc70078,
    ref_p=<optimized out>)
    at /home/mjambor/gcc/trunk/src/gcc/tree-streamer-out.c:884
#16 0x009c3bf7 in lto_write_tree (ref_p=true, expr=0x7fffecc70078,
    ob=0x1c73210)
    at /home/mjambor/gcc/trunk/src/gcc/lto-streamer-out.c:321
#17 lto_output_tree (ob=ob@entry=0x1c73210, expr=0x7fffecc70078,
    ref_p=ref_p@entry=true,
    this_ref_p=this_ref_p@entry=true)
    at /home/mjambor/gcc/trunk/src/gcc/lto-streamer-out.c:410
#18 0x00c2681d in write_ts_decl_common_tree_pointers (ref_p=true,
    expr=0x7fffecc6d720, ob=0x1c73210)
    at /home/mjambor/gcc/trunk/src/gcc/tree-streamer-out.c:584
#19 streamer_write_tree_body (ob=0x1c73210, expr=0x7fffecc6d720,
    ref_p=<optimized out>)
    at /home/mjambor/gcc/trunk/src/gcc/tree-streamer-out.c:857
#20 0x009c3bf7 in lto_write_tree (ref_p=true, expr=0x7fffecc6d720,
    ob=0x1c73210)
    at /home/mjambor/gcc/trunk/src/gcc/lto-streamer-out.c:321
#21 lto_output_tree (ob=0x1c73210, expr=0x7fffecc6d720, ref_p=true,
    this_ref_p=<optimized out>) at
    /home/mjambor/gcc/trunk/src/gcc/lto-streamer-out.c:410
#22 0x00ecd118 in output_gimple_stmt (stmt=0x7fffec6206c0,
    ob=0x1c73210)
    at /home/mjambor/gcc/trunk/src/gcc/gimple-streamer-out.c:143
#23 output_bb (ob=0x1c73210, bb=0x7fffed130f08, fn=0x7fffef8603f0)
    at /home/mjambor/gcc/trunk/src/gcc/gimple-streamer-out.c:199
#24 0x009c4f26 in output_function (node=0x7fffef8614a0)
    at /home/mjambor/gcc/trunk/src/gcc/lto-streamer-out.c:823
#25 lto_output () at /home/mjambor/gcc/trunk/src/gcc/lto-streamer-out.c:987
#26 0x009fa971 in ipa_write_summaries_2 (
    pass=0x1618f00 <pass_ipa_lto_gimple_out>, state=0x1ad8c00)
    at /home/mjambor/gcc/trunk/src/gcc/passes.c:2408

The statement being written is:
(gdb) call debug_gimple_stmt ((gimple)0x7fffec6206c0)
# DEBUG v = 18444633011384221696

This happens for example during compilation of
js/src/ion/shared/CodeGenerator-shared.cpp


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-01-17 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #172 from Richard Biener rguenth at gcc dot gnu.org 2013-01-17 10:53:29 UTC ---
(In reply to comment #171)
> Created attachment 29182 [details]
> Patch to compress line info
>
> This patch removes column information from LTO (so we lose caret diagnostics
> in warnings/errors output at LTO time, which seems a reasonable thing to do)
> and avoids entering duplicate locators into the linemap.  The patch reduces
> linemap usage from 23% to 5% of GGC memory, saving 1-2GB on Mozilla (and also
> reducing LTO file size).

Patch looks incomplete?  What does dropping columns only do to memory use?
Please disable flag_diagnostics_show_caret unconditionally in lto1 if you
do that.


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-01-17 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #173 from Jan Hubicka hubicka at ucw dot cz 2013-01-17 12:30:30 UTC ---
> Patch looks incomplete?  What does dropping columns only do to memory use?

I will check.  I remember that prior to columns there were also some savings
from the cache.
Just saving 20% out of 23% is cooler than saving 20% out of 5% of memory.
Note that we are still over 8GB for Mozilla LTO after the latest Mozilla
checkout.

> Please disable flag_diagnostics_show_caret unconditionally in lto1 if you
> do that.

Yeah, I wanted to, but I am not sure where in lto.c is the proper place to do so?


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-01-17 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #174 from Jakub Jelinek jakub at gcc dot gnu.org 2013-01-17 12:42:06 UTC ---
lto_post_options ?


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-01-17 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #175 from Jan Hubicka hubicka at gcc dot gnu.org 2013-01-17 14:40:04 UTC ---
Created attachment 29191
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29191
alternative patch without the compression.

This is an alternative patch that just skips columns but does not do the
compression.
It seems that compression is actually quite effective.
Non-compressing w/o column info is 1073872920 bytes,
compression + no column is 268566544 bytes
compression + column is 1073872920 bytes

Perhaps I messed up the caching with column info?  It strikes me as wrong that
the numbers are precisely the same. But perhaps it is just the reallocation
strategy. I will also generate fresh numbers for unpatched GCC.


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-01-17 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #176 from Richard Biener rguenth at gcc dot gnu.org 2013-01-17 14:54:22 UTC ---
(In reply to comment #175)
> Created attachment 29191 [details]
> alternative patch without the compression.
>
> This is an alternative patch that just skips columns but does not do the
> compression.
> It seems that compression is actually quite effective.
> Non-compressing w/o column info is 1073872920 bytes,
> compression + no column is 268566544 bytes
> compression + column is 1073872920 bytes
>
> Perhaps I messed up the caching with column info?  It strikes me as wrong that
> the numbers are precisely the same. But perhaps it is just the reallocation
> strategy. I will also generate fresh numbers for unpatched GCC.

+linemap_line_start (line_table, data_in->current_line, 0);

-  return linemap_position_for_column (line_table, data_in->current_col);
+  return linemap_position_for_column (line_table, 0);

linemap_line_start will already return a location for column 0.

So I'd say we want

  if (file_change)
    {
      ...
    }

  return linemap_line_start (line_table, data_in->current_line, 0);

instead.  Which hopefully does nothing if nothing changed.

I don't know how you implement caching - you didn't attach a patch to do so.


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-01-17 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #177 from Jan Hubicka hubicka at gcc dot gnu.org 2013-01-17 15:13:53 UTC ---
Created attachment 29192
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29192
caching

Aha, now I see why you ask for the complete patch. I obviously messed up the
code. This is how I do caching (in a version that still has columns in it). I
removed the final incarnation of the patch, but it should be easy to re-do.


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-01-17 Thread hubicka at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #178 from Jan Hubicka hubicka at gcc dot gnu.org 2013-01-17 
17:11:13 UTC ---

The global cache with arbitrarily large size reduces usage down to 0.3%
(16908304 bytes).  So it seems that sharing across files is quite an important
part of the game.  I will try to fiddle with the cache size to see how big a
cache is actually needed.

Unpatched mainline needs 1073872920 bytes, which is the same as with dropping
columns and/or my initial local caching implementation.  This is apparently
because of the exponential resizing of the table (i.e. we simply do not save
enough to see a difference).



Honza
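The effect described above (fewer entries, identical allocation) can be illustrated with a toy model of a table that doubles its capacity whenever it fills up. This is only a sketch: the doubling policy, the 32-byte entry size, the initial capacity and the entry counts are all illustrative assumptions, not values taken from libcpp or from this build.

```python
def allocated_bytes(num_entries, entry_size=32, initial_capacity=16):
    """Bytes held by a table that doubles its capacity whenever it is full.

    entry_size and initial_capacity are hypothetical values chosen for
    illustration only.
    """
    capacity = initial_capacity
    while capacity < num_entries:
        capacity *= 2
    return capacity * entry_size


# Hypothetical entry counts: the second run stores 30% fewer entries,
# yet both land in the same power-of-two capacity bucket.
before = allocated_bytes(30_000_000)
after = allocated_bytes(21_000_000)
print(before, after, before == after)
```

Under such a policy a saving only becomes visible once the entry count drops below the previous power-of-two boundary, which would be consistent with the global cache showing a large reduction while the smaller savings showed none.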


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-01-16 Thread hubicka at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #171 from Jan Hubicka hubicka at gcc dot gnu.org 2013-01-16 
17:25:04 UTC ---

Created attachment 29182

  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=29182

Patch to compress line info



This patch removes column information from LTO (so we lose caret diagnostics
in warning/error output at LTO time, which seems a reasonable thing to do) and
avoids entering duplicate locators into the linemap.  The patch reduces linemap
usage from 23% to 5% of GGC memory, saving 1-2GB on Mozilla (and also reducing
LTO file size).
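The idea of dropping columns and not entering duplicate locators can be sketched as a small lookup cache keyed on (file, line). The class and method names below are hypothetical and this is not the code from the attached patch; it only shows why repeated references to the same line stop growing the table.

```python
class LocationTable:
    """Toy stand-in for a line map: one entry per distinct (file, line)."""

    def __init__(self):
        self.entries = []   # location id -> (file, line)
        self._cache = {}    # (file, line) -> location id

    def location_for(self, filename, line):
        """Return an existing location id, or allocate a new one."""
        key = (filename, line)
        loc = self._cache.get(key)
        if loc is None:
            loc = len(self.entries)
            self.entries.append(key)
            self._cache[key] = loc
        return loc


table = LocationTable()
# Streamed (file, line, column) triples; the column is deliberately ignored.
stream = [("a.h", 10, 4), ("a.h", 10, 9), ("b.c", 3, 1), ("a.h", 10, 4)]
locs = [table.location_for(f, l) for f, l, _col in stream]
print(locs, len(table.entries))
```

Four streamed positions collapse to two table entries here, because positions that differ only in column share one locator.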


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-01-10 Thread hubicka at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #170 from Jan Hubicka hubicka at gcc dot gnu.org 2013-01-10 
15:04:10 UTC ---

OK, here is updated memory use:

source location                                          Garbage            Freed             Leak         Overhead     Times
-----------------------------------------------------------------------------------------------------------------------------
cgraph.c:863 (cgraph_allocate_init_indirect_info    5905200: 0.1%          0: 0.0%    6020160: 0.1%          0: 0.0%    298134
tree.c:1237 (build_int_cst_wide)                   15554272: 0.4%          0: 0.0%     782528: 0.0%          0: 0.0%    510525
tree.c:1559 (build_string)                         10685931: 0.2%          0: 0.0%   16715642: 0.4%    2193469: 1.7%    563828
stringpool.c:75 (alloc_node)                              0: 0.0%          0: 0.0%   30574880: 0.7%          0: 0.0%    764372
lto/lto.c:2286 (create_subid_section_table)         1522184: 0.0%          0: 0.0%   39117064: 0.8%    8051472: 6.4%      3978
stringpool.c:58 (stringpool_ggc_alloc)                    0: 0.0%          0: 0.0%   41092405: 0.9%    2954893: 2.4%    764372
gimple.c:3167 (iterative_hash_canonical_type)      45040752: 1.0%          0: 0.0%          0: 0.0%          0: 0.0%   2815047
lto/lto.c:1222 (iterative_hash_gimple_type)        68276864: 1.6%          0: 0.0%          0: 0.0%          0: 0.0%   4267304
ggc-common.c:249 (ggc_cleared_alloc_ptr_array_tw      91784: 0.0%  487289424:48.8%   71432600: 1.5%     248976: 0.2%     10974
lto/lto.c:1266 (iterative_hash_gimple_type)        75288576: 1.8%          0: 0.0%          0: 0.0%          0: 0.0%   4705536
lto-section-in.c:362 (lto_new_in_decl_state)         694320: 0.0%          0: 0.0%   94861800: 2.0%          0: 0.0%    796301
tree.c:1263 (build_int_cst_wide)                   76232736: 1.8%          0: 0.0%   19358880: 0.4%          0: 0.0%   2987238
cgraph.c:794 (cgraph_create_edge_1)                       0: 0.0%          0: 0.0%  125510632: 2.7%          0: 0.0%   1206833
vec.h:565 ((null))                                 66034564: 1.5%      98716: 0.0%   68500548: 1.5%    3484420: 2.8%    597783
vec.h:695 ((null))                                124654648: 2.9%  122044288:12.2%   63749232: 1.4%    2614800: 2.1%   1590429
tree-streamer-in.c:562 (streamer_alloc_tree)      125829312: 2.9%          0: 0.0%   74222904: 1.6%       7072: 0.0%   2005091
lto/lto.c:267 (lto_read_in_decl_state)              1478720: 0.0%          0: 0.0%  216390688: 4.7%   38247784:30.5%   5574107
vec.h:747 ((null))                                173791988: 4.0%   19565412: 2.0%   68225644: 1.5%    2680332: 2.1%   1396070
vec.h:707 ((null))                                133872480: 3.1%          0: 0.0%  285212728: 6.1%     800360: 0.6%   1059913
cgraph.c:500 (cgraph_allocate_node)                       0: 0.0%          0: 0.0%  472831880:10.2%          0: 0.0%   1597405
tree.c:1223 (build_int_cst_wide)                  607138944:14.1%          0: 0.0%   10427664: 0.2%    4719336: 3.8%    315034
toplev.c:959 (realloc_for_line_map)                       0: 0.0%  358037664:35.8% 1073872920:23.1%        184: 0.0%        16
tree-streamer-in.c:573 (streamer_alloc_tree)     2762184192:64.2%          0: 0.0% 1861017624:40.0%   59027616:47.1%  34649937
Total                                            4302007795        999178184       4651003487        125411458        68828967





Actually it is a bit of an improvement over my past report.  Some obvious things:

1) we still soak in too many trees (40% of memory).  The per-tree stats are:
decls          17310018  -1609736744
types           8983387   1509209016
exprs           2427302     80045744
constants       4079292    135393547
binfos          2005091    200038072
random kinds    5691481    227659664

and counts:
tree_list       5691475
pointer_type    2337585
record_type     3702066
function_decl   1856282
field_decl      2812564
const_decl      2739702
parm_decl       3549707
type_decl       4780459
result_decl     1144482
tree_binfo      2005091

2) new linemaps are still a disaster
3) VEC rewrite did break stats.



Honza
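The negative byte total reported for decls in the per-tree stats looks like a counter that wrapped around. Assuming (this is not confirmed anywhere in the thread) that the statistic was printed through a signed 32-bit integer, the real value can be recovered as sketched below; the helper name is hypothetical.

```python
INT32_MODULUS = 1 << 32


def unwrap_int32(value):
    """Undo a signed 32-bit wraparound (the assumed cause of the negative total)."""
    return value + INT32_MODULUS if value < 0 else value


# Figures copied from the per-tree stats above.
decl_count = 17310018
decl_bytes_reported = -1609736744

decl_bytes = unwrap_int32(decl_bytes_reported)
print(decl_bytes, decl_bytes // decl_count)
```

Under that assumption decls account for roughly 2.7 GB, or about 155 bytes per decl, which is the same order of magnitude as the per-type figure (1509209016 bytes over 8983387 types).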


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-01-09 Thread hubicka at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #165 from Jan Hubicka hubicka at gcc dot gnu.org 2013-01-09 
15:16:26 UTC ---

OK, I tracked down the undefined reference behind

error: /tmp/cc0oq4BG.ltrans1.ltrans.o: requires dynamic R_X86_64_PC32 reloc
against '_ZN12SkAnnotationC1ER23SkFlattenableReadBuffer' which may overflow at
runtime; recompile with -fPIC

It is caused by a bug in Mozilla: it includes a file defining a virtual function
that uses '_ZN12SkAnnotationC1ER23SkFlattenableReadBuffer' (in SkPaint), but it
never links with the implementation.
Normally the function is optimized out.  Here it is not, because we never
optimize out virtual functions prior to inlining (they are kept for
devirtualization), and in the WPA path we forget to remove them when done.



Fixed by the following patch

Index: ipa-inline.c
===
--- ipa-inline.c    (revision 194916)
+++ ipa-inline.c    (working copy)
@@ -1793,7 +1793,7 @@
     }
 
   inline_small_functions ();
-  symtab_remove_unreachable_nodes (true, dump_file);
+  symtab_remove_unreachable_nodes (false, dump_file);
   free (order);
 
   /* Inline functions with a property that after inlining into all callers the
Index: lto/lto.c
===
--- lto/lto.c   (revision 194916)
+++ lto/lto.c   (working copy)
@@ -3215,6 +3215,7 @@
   cgraph_state = CGRAPH_STATE_IPA_SSA;
 
   execute_ipa_pass_list (all_regular_ipa_passes);
+  symtab_remove_unreachable_nodes (false, dump_file);
 
   if (cgraph_dump_file)
     {
Index: cgraphclones.c
===
--- cgraphclones.c  (revision 194916)
+++ cgraphclones.c  (working copy)
@@ -184,6 +184,7 @@
   new_node->symbol.decl = decl;
   symtab_register_node ((symtab_node)new_node);
   new_node->origin = n->origin;
+  new_node->symbol.lto_file_data = n->symbol.lto_file_data;
   if (new_node->origin)
     {
       new_node->next_nested = new_node->origin->nested;


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-01-09 Thread hubicka at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #166 from Jan Hubicka hubicka at gcc dot gnu.org 2013-01-09 
15:19:41 UTC ---

Markus, the appearance of the undefined references I fixed by the patch above is
highly sensitive to partitioning and inlining decisions.  Can you please check
whether the problem with PGO remains?  It may be another instance of the same
issue.


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-01-09 Thread markus at trippelsdorf dot de


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #167 from Markus Trippelsdorf markus at trippelsdorf dot de 
2013-01-09 19:58:33 UTC ---

(In reply to comment #166)

 Markus, the appearance of undefined references I fixed by patch above is highly
 sensitive to partitioning and inlining decision.  Can you, please, check if the
 problem with PGO remains?  It may be another instance of the same issue.



Just checked it using your patch from comment 165, but the issue from

comment 162 is still there:



/usr/lib/gcc/x86_64-pc-linux-gnu/4.8.0/../../../../x86_64-pc-linux-gnu/bin/ld:
error: /tmp/ccACx905.ltrans6.ltrans.o: requires dynamic R_X86_64_PC32 reloc
against '_ZN13nsXULDocument14MaybeBroadcastEv.466048' which may overflow at
runtime; recompile with -fPIC
/tmp/ccACx905.ltrans6.ltrans.o:ccACx905.ltrans6.o:function
nsRunnableMethodTraits<void (nsXULDocument::*)(), true>::base_type*
NS_NewRunnableMethod<nsXULDocument*, void (nsXULDocument::*)()>(nsXULDocument*,
void (nsXULDocument::*)()) [clone .local.42120] [clone .constprop.89117]:
error: undefined reference to 'nsXULDocument::MaybeBroadcast() [clone .466048]'
/tmp/ccACx905.ltrans6.ltrans.o:ccACx905.ltrans6.o:function
nsRunnableMethodTraits<void (nsXULDocument::*)(), true>::base_type*
NS_NewRunnableMethod<nsXULDocument*, void (nsXULDocument::*)()>(nsXULDocument*,
void (nsXULDocument::*)()) [clone .local.42120] [clone .constprop.89117]:
error: undefined reference to 'nsXULDocument::MaybeBroadcast() [clone .466048]'



Also the memory usage went through the roof (not sure if this is caused
by your patch or my recent git-pull of mozilla-central):
over 9GB RAM is needed (not much fun on my 8GB test machine).

(So I will stop testing Firefox for now, until LTO/PGO memory usage
gets sane again (hopefully for 4.9).)


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-01-09 Thread hubicka at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #168 from Jan Hubicka hubicka at gcc dot gnu.org 2013-01-09 
21:20:46 UTC ---

Too bad :( 

The patch should reduce memory usage, not increase it.  So it must be something

else.  



My build was around 7GB w/o PGO, I will need to try the PGO builds myself.

My tree is however somewhat out of date. I will try fresh checkout and post mem

usage stats.



Perhaps you can share somewhere the -lm.res and *wpa*cgraph dump of a
--save-temps -fdump-ipa-cgraph build?  I will try to figure out those symbols.


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-01-09 Thread hubicka at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #169 from Jan Hubicka hubicka at gcc dot gnu.org 2013-01-09 
21:22:33 UTC ---

Author: hubicka

Date: Wed Jan  9 21:22:26 2013

New Revision: 195066



URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=195066

Log:



PR lto/45375

* ipa-inline.c (ipa_inline): Remove extern inlines and virtual functions.

* cgraphclones.c (cgraph_clone_node): Copy also LTO file data.

* lto.c (do_whole_program_analysis): Remove unreachable nodes after IPA.



Modified:

trunk/gcc/ChangeLog

trunk/gcc/cgraphclones.c

trunk/gcc/ipa-inline.c

trunk/gcc/lto/ChangeLog

trunk/gcc/lto/lto.c


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2013-01-05 Thread leo at yuriev dot ru


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



Leo Yuriev leo at yuriev dot ru changed:



   What|Removed |Added



 CC||leo at yuriev dot ru



--- Comment #164 from Leo Yuriev leo at yuriev dot ru 2013-01-06 00:31:55 UTC 
---

Some trouble while building LLVM with -flto.



../x86_64-linux-gnu/bin/ld.gold: error: /tmp/cc60XH2F.ltrans0.ltrans.o:

requires dynamic R_X86_64_PC32 reloc against 'X86CompilationCallback2' which

may overflow at runtime; recompile with -fPIC



Code:



extern "C" {
  void X86CompilationCallback(void);
  asm(
    ".text\n"
    ".align 8\n"
    ".globl " ASMPREFIX "X86CompilationCallback\n"
    TYPE_FUNCTION(X86CompilationCallback)
  ASMPREFIX "X86CompilationCallback:\n"
    ...
    "movq    8(%rbp), %rdx\n"
    "call    " ASMPREFIX "X86CompilationCallback2\n"
    "addq    $32, %rsp\n"
    ...
  );
}

void __attribute__((used))
X86CompilationCallback2(intptr_t *StackPtr, intptr_t RetAddr) {
  intptr_t *RetAddrLoc = &StackPtr[1];
...
}

}


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-12-14 Thread hubicka at ucw dot cz


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #163 from Jan Hubicka hubicka at ucw dot cz 2012-12-14 18:24:31 
UTC ---

 

 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

 

 --- Comment #162 from Markus Trippelsdorf markus at trippelsdorf dot de 
 2012-12-13 22:25:27 UTC ---

 The libxul binary size issue is solved now.



Good

 

 During testing I came across another issue that looks similar 

 to the one Comment 146:

 /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.0/../../../../x86_64-pc-linux-gnu/bin/ld:
 error: /tmp/ccwu5G98.ltrans4.ltrans.o: requires dynamic R_X86_64_PC32 reloc
 against '_ZN13nsXULDocument14MaybeBroadcastEv.429466' which may overflow at
 runtime; recompile with -fPIC
 /tmp/ccwu5G98.ltrans4.ltrans.o:ccwu5G98.ltrans4.o:function
 nsRunnableMethodTraits<void (nsXULDocument::*)(), true>::base_type*
 NS_NewRunnableMethod<nsXULDocument*, void (nsXULDocument::*)()>(nsXULDocument*,
 void (nsXULDocument::*)()) [clone .local.39398] [clone .constprop.84952]:
 error: undefined reference to 'nsXULDocument::MaybeBroadcast() [clone .429466]'
 /tmp/ccwu5G98.ltrans4.ltrans.o:ccwu5G98.ltrans4.o:function
 nsRunnableMethodTraits<void (nsXULDocument::*)(), true>::base_type*
 NS_NewRunnableMethod<nsXULDocument*, void (nsXULDocument::*)()>(nsXULDocument*,
 void (nsXULDocument::*)()) [clone .local.39398] [clone .constprop.84952]:
 error: undefined reference to 'nsXULDocument::MaybeBroadcast() [clone .429466]'
 collect2: error: ld returned 1 exit status

 

 After I deleted both nsXULDocument.o and nsXULDocument.gcda and rebuild with:

  make -f client.mk realbuild MOZ_PROFILE_USE=1 

 the problem did go away.



This sounds like an independent problem with partitioning.  I am travelling till
the 17th, so I will try to check this locally myself.

Perhaps you can give details on your setup? (i.e. my Mozilla tree got quite
dirty with various local hacks I made over time, perhaps I should refresh to a
cleaner state)



Honza


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-12-13 Thread markus at trippelsdorf dot de

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #160 from Markus Trippelsdorf markus at trippelsdorf dot de 
2012-12-13 09:52:37 UTC ---
(In reply to comment #159)
  hal/Hal.gcda:   96.72%: num counts=30069, min counter=16389
  hal/Hal.gcda:   97.50%: num counts=35296, min counter=10241
  hal/Hal.gcda:   98.28%: num counts=43669, min counter=6145
  hal/Hal.gcda:   99.06%: num counts=59589, min counter=3072
  hal/Hal.gcda:   99.90%: num counts=115840, min counter=320
  
  So it looks like you would want a cutoff of 97.5% to get close to what
  was there before.
 
 Setting the default cutoff to something like 95% would sound fine to me.  I
 see i asked to reduce the parameter but suggested 990. Markus, can you
 try setting HOT_BB_COUNT_WS_PERMILLE to 950?

It doesn't help:

 HOT_BB_COUNT_WS_PERMILLE=950: size of libxul.so: 42149632 bytes

(In reply to comment #157)
 (Unfortunately this new ICE happens with yesterdays gcc when linking libxul:
 
 /var/tmp/mozilla-central/content/base/src/nsDocument.cpp: In member function
 ‘CreateRange’:
 /var/tmp/mozilla-central/content/base/src/nsDocument.cpp:4999:0: internal
 compiler error: in cgraph_mark_address_taken_node, at cgraph.c:1409
 
 I will open a new PR for this later.)

See PR55669


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-12-13 Thread markus at trippelsdorf dot de


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #161 from Markus Trippelsdorf markus at trippelsdorf dot de 
2012-12-13 12:59:59 UTC ---

I've opened a new bug for the binary size increase issue: 



http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55674


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-12-13 Thread markus at trippelsdorf dot de


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #162 from Markus Trippelsdorf markus at trippelsdorf dot de 
2012-12-13 22:25:27 UTC ---

The libxul binary size issue is solved now.



During testing I came across another issue that looks similar
to the one in comment 146:

/usr/lib/gcc/x86_64-pc-linux-gnu/4.8.0/../../../../x86_64-pc-linux-gnu/bin/ld:
error: /tmp/ccwu5G98.ltrans4.ltrans.o: requires dynamic R_X86_64_PC32 reloc
against '_ZN13nsXULDocument14MaybeBroadcastEv.429466' which may overflow at
runtime; recompile with -fPIC
/tmp/ccwu5G98.ltrans4.ltrans.o:ccwu5G98.ltrans4.o:function
nsRunnableMethodTraits<void (nsXULDocument::*)(), true>::base_type*
NS_NewRunnableMethod<nsXULDocument*, void (nsXULDocument::*)()>(nsXULDocument*,
void (nsXULDocument::*)()) [clone .local.39398] [clone .constprop.84952]:
error: undefined reference to 'nsXULDocument::MaybeBroadcast() [clone .429466]'
/tmp/ccwu5G98.ltrans4.ltrans.o:ccwu5G98.ltrans4.o:function
nsRunnableMethodTraits<void (nsXULDocument::*)(), true>::base_type*
NS_NewRunnableMethod<nsXULDocument*, void (nsXULDocument::*)()>(nsXULDocument*,
void (nsXULDocument::*)()) [clone .local.39398] [clone .constprop.84952]:
error: undefined reference to 'nsXULDocument::MaybeBroadcast() [clone .429466]'
collect2: error: ld returned 1 exit status



After I deleted both nsXULDocument.o and nsXULDocument.gcda and rebuild with:

 make -f client.mk realbuild MOZ_PROFILE_USE=1 

the problem did go away.


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-12-12 Thread markus at trippelsdorf dot de

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #157 from Markus Trippelsdorf markus at trippelsdorf dot de 
2012-12-12 11:43:27 UTC ---
With revision 193740 libxul's size is ~34MB, which is OK.

(Unfortunately this new ICE happens with yesterdays gcc when linking libxul:

/var/tmp/mozilla-central/content/base/src/nsDocument.cpp: In member function
‘CreateRange’:
/var/tmp/mozilla-central/content/base/src/nsDocument.cpp:4999:0: internal
compiler error: in cgraph_mark_address_taken_node, at cgraph.c:1409

I will open a new PR for this later.)

Here are the requested files:

(I don't know which of the ~3000 gcda files you need, so I've uploaded them
all)
http://www.trippelsdorf.de/gcda_before.tar.bz2 (4MB)
http://www.trippelsdorf.de/gcda_after.tar.bz2  (4MB)

(-fdump-ipa-inline output)
http://www.trippelsdorf.de/libxul_before.inline.tar.bz2 (100MB)
http://www.trippelsdorf.de/libxul_after.inline.tar.bz2  (68MB, everything 'till
the ICE hit)


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-12-12 Thread tejohnson at google dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #158 from Teresa Johnson tejohnson at google dot com 2012-12-12 
18:59:56 UTC ---
On Wed, Dec 12, 2012 at 3:43 AM, markus at trippelsdorf dot de
gcc-bugzi...@gcc.gnu.org wrote:

 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

 --- Comment #157 from Markus Trippelsdorf markus at trippelsdorf dot de 
 2012-12-12 11:43:27 UTC ---
 With revision 193740 libxul's size is ~34MB, which is OK.

 (Unfortunately this new ICE happens with yesterdays gcc when linking libxul:

 /var/tmp/mozilla-central/content/base/src/nsDocument.cpp: In member function
 ‘CreateRange’:
 /var/tmp/mozilla-central/content/base/src/nsDocument.cpp:4999:0: internal
 compiler error: in cgraph_mark_address_taken_node, at cgraph.c:1409

 I will open a new PR for this later.)

 Here are the requested files:

 (I don't know which of the ~3000 gcda files you need, so I've uploaded them
 all)
 http://www.trippelsdorf.de/gcda_before.tar.bz2 (4MB)
 http://www.trippelsdorf.de/gcda_after.tar.bz2  (4MB)

Sorry, I should have clarified that any one of them would do (as long
as it corresponded to an object file included in the LTO link for the
main executable), since the info I need is in the program summary
section for the executable, which is duplicated in each of them.


 (-fdump-ipa-inline output)
 http://www.trippelsdorf.de/libxul_before.inline.tar.bz2 (100MB)
 http://www.trippelsdorf.de/libxul_after.inline.tar.bz2  (68MB, everything 
 'till
 the ICE hit)

With the old heuristics, the hot bb cutoff was:
profile_info->sum_max / PARAM_VALUE (HOT_BB_COUNT_FRACTION)

In this case, sum_max is 103439951 and HOT_BB_COUNT_FRACTION was
10000, so the cutoff count was 10343.

From the working set computed from the histogram, the 99.9% cutoff
count is 320. See the end of this email for the full set of histograms
and working sets, but here are the top few working sets:

...
hal/Hal.gcda:   96.72%: num counts=30069, min counter=16389
hal/Hal.gcda:   97.50%: num counts=35296, min counter=10241
hal/Hal.gcda:   98.28%: num counts=43669, min counter=6145
hal/Hal.gcda:   99.06%: num counts=59589, min counter=3072
hal/Hal.gcda:   99.90%: num counts=115840, min counter=320

So it looks like you would want a cutoff of 97.5% to get close to what
was there before.
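The comparison between the old fraction-based cutoff and the histogram working sets can be reproduced with a few lines of arithmetic. This is a sketch, not GCC code: the working-set rows are copied from the excerpt above, the fraction of 10000 is inferred from the quoted numbers (103439951 divided by the stated cutoff of 10343), and the helper names are hypothetical.

```python
SUM_MAX = 103439951
HOT_BB_COUNT_FRACTION = 10000  # assumption: inferred from sum_max and the 10343 cutoff

# (percent of total count covered, number of counters, minimum counter value)
WORKING_SETS = [
    (96.72, 30069, 16389),
    (97.50, 35296, 10241),
    (98.28, 43669, 6145),
    (99.06, 59589, 3072),
    (99.90, 115840, 320),
]


def old_cutoff(sum_max, fraction):
    """Old heuristic: a block is hot if its count reaches sum_max / fraction."""
    return sum_max // fraction


def closest_working_set(cutoff, working_sets):
    """Pick the working set whose minimum counter is nearest to the cutoff."""
    return min(working_sets, key=lambda ws: abs(ws[2] - cutoff))


cutoff = old_cutoff(SUM_MAX, HOT_BB_COUNT_FRACTION)
percent, num_counts, min_counter = closest_working_set(cutoff, WORKING_SETS)
print(cutoff, percent, min_counter)
```

With these inputs the old cutoff sits between the 96.72% and 97.50% rows and is closest to the latter, while the 99.9% row lowers the threshold from about 10343 to 320.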

(Honza, I just made some changes to enable gcov-dump to optionally
compute and dump out the working sets from the histogram. I can send
this for upstream review as I have wanted this several times.)

The much smaller cutoff count is why there are fewer calls marked
unlikely and more inlining:

$ grep "call is unlikely" before/libxul.so.wpa.049i.inline  | wc
 442342 4944522 42560600

$ grep "call is unlikely" after/libxul.so.wpa.049i.inline  | wc
 392683 4349335 37477001

$ grep Inlined before/libxul.so.wpa.049i.inline  | grep eliminated
Inlined 60432 calls, eliminated 30986 functions

$ grep Inlined after/libxul.so.wpa.049i.inline  | grep eliminated
Inlined 89573 calls, eliminated 28921 functions

One thing that is interesting in the above info, and may be
contributing to the larger size now, is that there are more inlines,
but fewer functions are being eliminated. I'm not sure why that is
offhand. It's possible (probable) that inlining heuristics need some
retuning to make optimal use of the new cutoffs.
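To put the before/after dump counts above in relative terms, here is a small sketch that computes the percentage change of each figure. The numbers are copied from the grep output in this comment; the helper name is hypothetical.

```python
def percent_change(before, after):
    """Relative change from before to after, in percent, rounded to 0.1."""
    return round(100.0 * (after - before) / before, 1)


# (before, after) pairs taken from the -fdump-ipa-inline grep output above.
stats = {
    "calls marked unlikely": (442342, 392683),
    "calls inlined": (60432, 89573),
    "functions eliminated": (30986, 28921),
}

changes = {name: percent_change(b, a) for name, (b, a) in stats.items()}
for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

So roughly half again as many calls were inlined while slightly fewer functions were eliminated, which fits the observation that the extra inlining is not paying for itself in removed function bodies.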

We also see additional inlines in some of our large internal apps with
the change, but not much increase in binary size, and it sometimes
leads to better performance - although we are not as much affected
because the google branches were using a much larger
HOT_BB_COUNT_FRACTION of 60K already, in order to get more inlining.
In this case, it looks like you are getting more inlines but it is
apparently performance-neutral?

Looking at a graph of the working set data, the number of counters
starts increasing super-exponentially as the percentages approach
100%. I've been thinking that it may be useful to find the knee of
the curve to determine the appropriate cutoff percentage. I'll see if
I can make some progress on that.

Full histogram/working set data:

hal/Hal.gcda: a300: 512:PROGRAM_SUMMARY checksum=0x3aa34521
hal/Hal.gcda: counts=2109045, runs=7, sum_all=9749748271,
run_max=97136704, sum_max=103439951
hal/Hal.gcda: counter histogram:
hal/Hal.gcda: 0: num counts=1824318, min counter=0, cum_counter=0
hal/Hal.gcda: 1: num counts=30727, min counter=1, cum_counter=30727
hal/Hal.gcda: 2: num counts=11646, min counter=2, cum_counter=23292
hal/Hal.gcda: 3: num counts=5414, min counter=3, cum_counter=16242
hal/Hal.gcda: 4: num counts=5156, min counter=4, cum_counter=20624
hal/Hal.gcda: 5: num counts=3379, min counter=5, cum_counter=16895
hal/Hal.gcda: 6: num counts=3674, min counter=6, cum_counter=22044
hal/Hal.gcda: 7: num counts=2310, min counter=7, cum_counter=16170
hal/Hal.gcda: 8: num counts=4756, min counter=8, cum_counter=40330
hal/Hal.gcda: 9: num counts=4725, min counter=10, 

[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-12-12 Thread hubicka at ucw dot cz


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #159 from Jan Hubicka hubicka at ucw dot cz 2012-12-12 20:35:37 
UTC ---

 hal/Hal.gcda:   96.72%: num counts=30069, min counter=16389

 hal/Hal.gcda:   97.50%: num counts=35296, min counter=10241

 hal/Hal.gcda:   98.28%: num counts=43669, min counter=6145

 hal/Hal.gcda:   99.06%: num counts=59589, min counter=3072

 hal/Hal.gcda:   99.90%: num counts=115840, min counter=320

 

 So it looks like you would want a cutoff of 97.5% to get close to what

 was there before.



Setting the default cutoff to something like 95% would sound fine to me.  I
see I asked to reduce the parameter but suggested 990.  Markus, can you
try setting HOT_BB_COUNT_WS_PERMILLE to 950?



Honza


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-12-11 Thread tejohnson at google dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #154 from Teresa Johnson tejohnson at google dot com 2012-12-11 
19:30:53 UTC ---

What was the size of the gcc lto/pgo binary before the change to use the

histogram? Was it close to the gcc 4.7 lto/pgo size? In that case that is a

very large increase, ~25%.



Markus, could you attach to the bug one of the gcda files so that I can see the

program summary and figure out how far off the old hot bb threshold is from the

new histogram-based one? Also, it would be good to see the -fdump-ipa-inline

dumps before and after the regression (if necessary, the before one could be

from 4_7).


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-12-11 Thread markus at trippelsdorf dot de


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #155 from Markus Trippelsdorf markus at trippelsdorf dot de 
2012-12-11 22:57:14 UTC ---

(In reply to comment #154)

 What was the size of the gcc lto/pgo binary before the change to use the

 histogram? Was it close to the gcc 4.7 lto/pgo size? In that case that is a

 very large increase, ~25%.



With revision 193914 (before the change) the lto/pgo size is 42115424 bytes.
So it looks like Teresa is off the hook.



 Markus, could you attach to the bug one of the gcda files so that I can see 
 the

 program summary and figure out how far off the old hot bb threshold is from 
 the

 new histogram-based one? Also, it would be good to see the -fdump-ipa-inline

 dumps before and after the regression (if necessary, the before one could be

 from 4_7).



Will try to post them tomorrow .


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-12-11 Thread tejohnson at google dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #156 from Teresa Johnson tejohnson at google dot com 2012-12-12 
00:00:17 UTC ---

On Tue, Dec 11, 2012 at 2:57 PM, markus at trippelsdorf dot de

gcc-bugzi...@gcc.gnu.org wrote:



 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



 --- Comment #155 from Markus Trippelsdorf markus at trippelsdorf dot de 
 2012-12-11 22:57:14 UTC ---

 (In reply to comment #154)

 What was the size of the gcc lto/pgo binary before the change to use the

 histogram? Was it close to the gcc 4.7 lto/pgo size? In that case that is a

 very large increase, ~25%.



 With revision 193914 (before the change) the lto/pgo size is 42115424 bytes.

 So it looks like Theresa is off the hook.



Unfortunately, I am still possibly on the hook since the main suspect
change is r193747 (committed by Honza with changes made by him and me
to use the histogram instead of a hard limit for determining bb
hotness).  Between then and when I committed fixes for this under LTO
(r193999) I would expect that the code size might have been worse
temporarily because everything looked hot since the histogram was not
being streamed through the LTO files properly, and so inlining could
have gotten excessive.





 Markus, could you attach to the bug one of the gcda files so that I can see 
 the

 program summary and figure out how far off the old hot bb threshold is from 
 the

 new histogram-based one? Also, it would be good to see the -fdump-ipa-inline

 dumps before and after the regression (if necessary, the before one could be

 from 4_7).



 Will try to post them tomorrow .



Ok thanks.

Teresa












--

Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-12-02 Thread hubicka at ucw dot cz


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #147 from Jan Hubicka hubicka at ucw dot cz 2012-12-02 09:23:09 
UTC ---

 

 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

 

 --- Comment #146 from Markus Trippelsdorf markus at trippelsdorf dot de 
 2012-12-02 07:36:02 UTC ---

 (In reply to comment #145)

   

   http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

   

   --- Comment #144 from Markus Trippelsdorf markus at trippelsdorf dot de 
   2012-12-01 12:39:30 UTC ---

   It looks like there is a LTO code-size regression on trunk:

   (size of libxul.so, build without elfhack):

   

   gcc lto/pgo : size: 42204584 | Kraken bench: 2723.9ms +/- 0.9%

  

  About LTO+PGO please be sure that you have the Teresa's fix from this 
  Friday in

  your tree.

 

 Yes, my tree already included this fix and also the fix from bug 1.



Please try to reduce HOT_BB_COUNT_WS_PERMILLE to 990.  I also see some
regressions on some SPEC benchmarks (such as GCC) and this helps.  If it
doesn't, it would be nice to know what value is needed for comparable size.

 

   gcc : size: 34072808 | Kraken bench: 2804.3ms +/- 1.6%

  

  Is LTO w/o PGO bigger than previous builds?

 

 Couldn't tell, because it doesn't link:

 

 /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.0/../../../../x86_64-pc-linux-gnu/bin/ld:

 warning: hidden symbol 'pixman_add_triangles' in

 /var/tmp/moz-build-dir/toolkit/library/../../gfx/cairo/libpixman/src/pixman-trap.o

 is referenced by DSO /usr/lib64/libcairo.so

 /usr/lib/gcc/x86_64-pc-linux-gnu/4.8.0/../../../../x86_64-pc-linux-gnu/bin/ld:

 error: /tmp/cc0oq4BG.ltrans1.ltrans.o: requires dynamic R_X86_64_PC32 reloc

 against '_ZN12SkAnnotationC1ER23SkFlattenableReadBuffer' which may overflow at

 runtime; recompile with -fPIC

 /tmp/cc0oq4BG.ltrans0.ltrans.o:cc0oq4BG.ltrans0.o:function SharedStub: error:

 undefined reference to 'PrepareAndDispatch'

 /tmp/cc0oq4BG.ltrans1.ltrans.o:cc0oq4BG.ltrans1.o:function

 SkAnnotation::CreateProc(SkFlattenableReadBuffer) [clone 
 .local.7828.1055099]:

 error: undefined reference to

 'SkAnnotation::SkAnnotation(SkFlattenableReadBuffer)'

 collect2: error: ld returned 1 exit status

 

 The undefined reference to PrepareAndDispatch is easily fixed by

 an __attribute__ ((used)).

 Do you have an idea on how to fix the

 SkAnnotation::SkAnnotation(SkFlattenableReadBuffer) issue?



Hmm, I remember seeing this one, too.  I will check.



Honza


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-12-02 Thread markus at trippelsdorf dot de


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #148 from Markus Trippelsdorf markus at trippelsdorf dot de 
2012-12-02 11:57:27 UTC ---

(In reply to comment #147)

  

  http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

  

  --- Comment #146 from Markus Trippelsdorf markus at trippelsdorf dot de 
  2012-12-02 07:36:02 UTC ---

  (In reply to comment #145)



http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #144 from Markus Trippelsdorf markus at trippelsdorf dot 
de 2012-12-01 12:39:30 UTC ---

It looks like there is a LTO code-size regression on trunk:

(size of libxul.so, build without elfhack):



gcc lto/pgo : size: 42204584 | Kraken bench: 2723.9ms +/- 0.9%

   

   About LTO+PGO please be sure that you have the Teresa's fix from this 
   Friday in

   your tree.

  

  Yes, my tree already included this fix and also the fix from bug 1.

 

 Please try to reduce HOT_BB_COUNT_WS_PERMILLE to 990. I also see some

 regressions

 on some SPEC benchmarks (such as GCC) and this helps. If it doesn't it would 
 be

 nice to know what value is needed for comparable size.



Unfortunately it doesn't help much, because with --param

hot-bb-count-ws-permille=990 the size is only 0.25% smaller:

(With --param) : 42098856

(Without ) : 42204584



I will try smaller values later.
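
For reference, the 0.25% figure follows directly from the two libxul.so sizes
quoted above. A minimal sketch of the arithmetic (the helper name
percent_smaller is made up for illustration, it is not part of any tool):

```python
def percent_smaller(baseline: int, reduced: int) -> float:
    """Return how much smaller `reduced` is than `baseline`, in percent."""
    return (baseline - reduced) / baseline * 100.0


without_param = 42204584  # libxul.so size without the --param
with_param = 42098856     # with --param hot-bb-count-ws-permille=990

saved = without_param - with_param
print(saved, "bytes saved")
print(round(percent_smaller(without_param, with_param), 2), "% smaller")
```

The same helper can be reused for any smaller permille value that gets tried
later.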


Re: [Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-12-02 Thread Jan Hubicka
  Please try to reduce HOT_BB_COUNT_WS_PERMILLE to 990. I also see some
  regressions
  on some SPEC benchmarks (such as GCC) and this helps. If it doesn't it 
  would be
  nice to know what value is needed for comparable size.
 
 Unfortunately it doesn't help much, because with --param
 hot-bb-count-ws-permille=990 the size is only 0.25% smaller:
 (With --param) : 42098856
 (Without ) : 42204584
 
 I will try smaller values later.

Hmm, that sounds like quite bad news - the histogram code was supposed to help
in such cases.  I will try to fix the non-PGO case, and let's first compare how
the PGO and non-PGO builds differ.  If you could put the -fdump-ipa-inline dump
somewhere, I will try to check if there is something obviously wrong.

In the worst case we can resort to combining both heuristics - i.e. keeping the
hot_bb_fraction in addition to the histogram code. In fact I planned to do it
this way, but Teresa removed the old code and I did not see a good reason to
keep it.

Honza


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-12-02 Thread hubicka at ucw dot cz


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #149 from Jan Hubicka hubicka at ucw dot cz 2012-12-02 15:05:52 
UTC ---

  Please try to reduce HOT_BB_COUNT_WS_PERMILLE to 990. I also see some

  regressions

  on some SPEC benchmarks (such as GCC) and this helps. If it doesn't it 
  would be

  nice to know what value is needed for comparable size.

 

 Unfortunately it doesn't help much, because with --param

 hot-bb-count-ws-permille=990 the size is only 0.25% smaller:

 (With --param) : 42098856

 (Without ) : 42204584

 

 I will try smaller values later.



Hmm, that sounds like quite bad news - the histogram code was supposed to help
in such cases.  I will try to fix the non-PGO case, and let's first compare how
the PGO and non-PGO builds differ.  If you could put the -fdump-ipa-inline dump
somewhere, I will try to check if there is something obviously wrong.

In the worst case we can resort to combining both heuristics - i.e. keeping the
hot_bb_fraction in addition to the histogram code. In fact I planned to do it
this way, but Teresa removed the old code and I did not see a good reason to
keep it.



Honza


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-12-02 Thread markus at trippelsdorf dot de


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #150 from Markus Trippelsdorf markus at trippelsdorf dot de 
2012-12-02 18:03:28 UTC ---

For comparison I've just disabled skia and built with LTO only;
the size looks good for this case: 31356968


Re: [Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-12-02 Thread Jan Hubicka
Teresa committed another bugfix just today. So with a bit of luck it will work now?
I will try to look deeper into it ASAP, but I am just getting ready for a trip to
the USA.

Honza


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-12-02 Thread hubicka at ucw dot cz


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #151 from Jan Hubicka hubicka at ucw dot cz 2012-12-02 20:52:13 
UTC ---

Teresa committed another bugfix just today. So with a bit of luck it will work
now?
I will try to look deeper into it ASAP, but I am just getting ready for a trip to
the USA.



Honza


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-12-02 Thread hubicka at ucw dot cz


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #152 from Jan Hubicka hubicka at ucw dot cz 2012-12-02 21:09:24 
UTC ---

Also I suppose you don't have a comparison to 4.7 handy? (I am curious because of
the inliner heuristic retuning)



Honza


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-12-02 Thread markus at trippelsdorf dot de


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #153 from Markus Trippelsdorf markus at trippelsdorf dot de 
2012-12-02 21:13:21 UTC ---

On 2012.12.02 at 21:09 +, hubicka at ucw dot cz wrote:

 

 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

 

 --- Comment #152 from Jan Hubicka hubicka at ucw dot cz 2012-12-02 21:09:24 
 UTC ---

 Also I suppose you don't have a comparison to 4.7 handy? (I am curious because
 of the inliner heuristic retuning)



The LTO/PGO sizes were measured with the newest patch from Teresa

already applied.



gcc-4.7 lto/pgo: size: 7456 | Kraken bench: 2706.7ms +/- 1.1%


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-12-01 Thread markus at trippelsdorf dot de


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #144 from Markus Trippelsdorf markus at trippelsdorf dot de 
2012-12-01 12:39:30 UTC ---

It looks like there is a LTO code-size regression on trunk:

(size of libxul.so, build without elfhack):



gcc lto/pgo : size: 42204584 | Kraken bench: 2723.9ms +/- 0.9%

gcc : size: 34072808 | Kraken bench: 2804.3ms +/- 1.6%

clang lto   : size: 35071848 | Kraken bench: 2804.2ms +/- 1.2%

clang   : size: 36797384 | Kraken bench: 2819.6ms +/- 1.4%
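
To make the four rows easier to compare, here is a small sketch that expresses
each build relative to the row labelled "gcc". The numbers are copied from the
table above; the choice of baseline and the helper name relative are
assumptions made for illustration only.

```python
# (libxul.so size in bytes, Kraken time in ms), copied from the table above.
builds = {
    "gcc lto/pgo": (42204584, 2723.9),
    "gcc": (34072808, 2804.3),
    "clang lto": (35071848, 2804.2),
    "clang": (36797384, 2819.6),
}


def relative(value: float, base: float) -> float:
    """Difference of `value` from `base`, in percent of `base`."""
    return (value / base - 1.0) * 100.0


# Assumption: use the row labelled "gcc" as the baseline.
base_size, base_ms = builds["gcc"]

for name, (size, ms) in builds.items():
    print(f"{name:12s} size {relative(size, base_size):+6.1f}%  "
          f"Kraken {relative(ms, base_ms):+5.1f}%")
```

Note that the Kraken numbers carry error margins of roughly 1% to 1.6%, so
small differences in the time column should be read with care.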


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-12-01 Thread hubicka at ucw dot cz


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #145 from Jan Hubicka hubicka at ucw dot cz 2012-12-01 22:09:07 
UTC ---

 

 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

 

 --- Comment #144 from Markus Trippelsdorf markus at trippelsdorf dot de 
 2012-12-01 12:39:30 UTC ---

 It looks like there is a LTO code-size regression on trunk:

 (size of libxul.so, build without elfhack):

 

 gcc lto/pgo : size: 42204584 | Kraken bench: 2723.9ms +/- 0.9%



About LTO+PGO please be sure that you have the Teresa's fix from this Friday in

your tree.



 gcc : size: 34072808 | Kraken bench: 2804.3ms +/- 1.6%



Is LTO w/o PGO bigger than previous builds?



 clang lto   : size: 35071848 | Kraken bench: 2804.2ms +/- 1.2%

 clang   : size: 36797384 | Kraken bench: 2819.6ms +/- 1.4%


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-12-01 Thread markus at trippelsdorf dot de


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #146 from Markus Trippelsdorf markus at trippelsdorf dot de 
2012-12-02 07:36:02 UTC ---

(In reply to comment #145)

  

  http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

  

  --- Comment #144 from Markus Trippelsdorf markus at trippelsdorf dot de 
  2012-12-01 12:39:30 UTC ---

  It looks like there is a LTO code-size regression on trunk:

  (size of libxul.so, build without elfhack):

  

  gcc lto/pgo : size: 42204584 | Kraken bench: 2723.9ms +/- 0.9%

 

 About LTO+PGO please be sure that you have the Teresa's fix from this Friday 
 in

 your tree.



Yes, my tree already included this fix and also the fix from bug 1.



  gcc : size: 34072808 | Kraken bench: 2804.3ms +/- 1.6%

 

 Is LTO w/o PGO bigger than previous builds?



Couldn't tell, because it doesn't link:



/usr/lib/gcc/x86_64-pc-linux-gnu/4.8.0/../../../../x86_64-pc-linux-gnu/bin/ld:

warning: hidden symbol 'pixman_add_triangles' in

/var/tmp/moz-build-dir/toolkit/library/../../gfx/cairo/libpixman/src/pixman-trap.o

is referenced by DSO /usr/lib64/libcairo.so

/usr/lib/gcc/x86_64-pc-linux-gnu/4.8.0/../../../../x86_64-pc-linux-gnu/bin/ld:

error: /tmp/cc0oq4BG.ltrans1.ltrans.o: requires dynamic R_X86_64_PC32 reloc

against '_ZN12SkAnnotationC1ER23SkFlattenableReadBuffer' which may overflow at

runtime; recompile with -fPIC

/tmp/cc0oq4BG.ltrans0.ltrans.o:cc0oq4BG.ltrans0.o:function SharedStub: error:

undefined reference to 'PrepareAndDispatch'

/tmp/cc0oq4BG.ltrans1.ltrans.o:cc0oq4BG.ltrans1.o:function

SkAnnotation::CreateProc(SkFlattenableReadBuffer) [clone .local.7828.1055099]:

error: undefined reference to

'SkAnnotation::SkAnnotation(SkFlattenableReadBuffer)'

collect2: error: ld returned 1 exit status



The undefined reference to PrepareAndDispatch is easily fixed by

an __attribute__ ((used)).

Do you have an idea on how to fix the

SkAnnotation::SkAnnotation(SkFlattenableReadBuffer) issue?


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-10-08 Thread hubicka at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #142 from Jan Hubicka hubicka at gcc dot gnu.org 2012-10-08 
22:19:55 UTC ---

After updating Mozilla this weekend, I definitely run out of memory on an 8GB
machine. The peak in top is around 9-10GB.  I checked malloc usage and there
are not many surprises. It is about 300MB, mostly GGC overhead, pointer maps
and such.



Most memory is actually the GGC, about 7GB. Here 5GB survives type and decl

merging and is distributed as follows:

source location                                            Garbage             Freed              Leak          Overhead      Times
cgraph.c:722 (cgraph_allocate_init_indirect_info     1671240: 0.0%           0: 0.0%     8202960: 0.2%           0: 0.0%     246855
tree.c:1226 (build_int_cst_wide)                   625825208:12.3%           0: 0.0%    10437744: 0.2%     4863752: 3.1%     325009
ipa-prop.h:471 (ipa_check_create_edge_args)                0: 0.0%           0: 0.0%    16777216: 0.3%           0: 0.0%          1
ipa-inline-analysis.c:3697 (inline_read_section)           0: 0.0%    28298904: 1.6%    21095504: 0.4%     1064480: 0.7%     423701
tree.c:1561 (build_string)                          16526800: 0.3%           0: 0.0%    21695715: 0.4%     3395427: 2.2%     864326
ipa-prop.c:3393 (ipa_read_node_info)                       0: 0.0%     4302088: 0.2%    25029448: 0.5%      119192: 0.1%     246788
stringpool.c:75 (alloc_node)                               0: 0.0%           0: 0.0%    27817760: 0.5%           0: 0.0%     695444
ipa-ref.c:51 (ipa_record_reference)                        0: 0.0%   188442816:10.3%    28443272: 0.6%     2114424: 1.4%    1256259
stringpool.c:58 (stringpool_ggc_alloc)                     0: 0.0%           0: 0.0%    34673092: 0.7%     2619412: 1.7%     695444
lto/lto.c:2279 (create_subid_section_table)           275832: 0.0%           0: 0.0%    40363416: 0.8%     8051472: 5.2%       3978
tree-streamer-in.c:895 (lto_input_ts_constructor   171812232: 3.4%   192568640:10.6%    42205992: 0.8%     1425072: 0.9%     947082
ipa-prop.c:3380 (ipa_read_node_info)                       0: 0.0%    35825488: 2.0%    58764528: 1.1%      659704: 0.4%     909232
tree-streamer-in.c:488 (streamer_alloc_tree)       129846168: 2.6%           0: 0.0%    75997752: 1.5%        7072: 0.0%    2063753
tree.c:1263 (build_int_cst_wide)                   237791264: 4.7%           0: 0.0%    90464320: 1.8%           0: 0.0%   10257987
ipa-inline-analysis.c:3709 (inline_read_section)           0: 0.0%   133938484: 7.4%   101874268: 2.0%     1606480: 1.0%    1099389
lto-section-in.c:361 (lto_new_in_decl_state)            3240: 0.0%           0: 0.0%   107452560: 2.1%           0: 0.0%     895465
cgraph.c:653 (cgraph_create_edge_1)                        0: 0.0%           0: 0.0%   135509816: 2.6%           0: 0.0%    1302979
ggc-common.c:253 (ggc_cleared_alloc_ptr_array_tw        2040: 0.0%   866397160:47.6%   190623368: 3.7%      263888: 0.2%      11459
lto/lto.c:267 (lto_read_in_decl_state)                  3024: 0.0%           0: 0.0%   225743280: 4.4%    41057176:26.5%    6268255
ipa-inline-analysis.c:931 (inline_summary_alloc)           0: 0.0%           0: 0.0%   268435464: 5.2%           8: 0.0%          1
cgraph.c:362 (cgraph_allocate_node)                        0: 0.0%           0: 0.0%   515473640:10.1%           0: 0.0%    1741465
toplev.c:953 (realloc_for_line_map)                        0: 0.0%   358955168:19.7%  1074790424:21.0%         184: 0.0%         19
tree-streamer-in.c:499 (streamer_alloc_tree)      3668091656:72.1%           0: 0.0%  1995384408:38.9%    87485792:56.5%   46580224
Total                                             5089831352        1821058652        5124870115         154815271         91384962



I.e. 20% are now linemaps, 38% trees read by the streamer, 10% cgraph nodes, 5%

inline summaries, 4% streamer table converting UIDs to decls (that can be

freed).
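
The percentages in this summary can be re-derived from the Leak column of the
report above. A minimal sketch follows; the byte counts are copied from the
report, while the short labels and the mapping of "streamer table converting
UIDs to decls" to the lto_read_in_decl_state row are my reading, not something
the report states.

```python
# Leak column (bytes) for the biggest rows of the WPA memory report.
total_leak = 5124870115
leaks = {
    "linemaps": 1074790424,          # toplev.c:953 (realloc_for_line_map)
    "streamed trees": 1995384408,    # tree-streamer-in.c:499 (streamer_alloc_tree)
    "cgraph nodes": 515473640,       # cgraph.c:362 (cgraph_allocate_node)
    "inline summaries": 268435464,   # ipa-inline-analysis.c:931 (inline_summary_alloc)
    "decl state table": 225743280,   # lto/lto.c:267 (assumed to be the UID table)
}

# Share of the total leaked (i.e. still live) GGC memory, in percent.
shares = {name: 100.0 * size / total_leak for name, size in leaks.items()}

for name, pct in shares.items():
    print(f"{name:18s} {pct:5.1f}%")
```

Together these five rows account for about 80% of the surviving GGC memory.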



The trees are distributed as follows:

Kind                   Nodes        Bytes
-----------------------------------------
decls               20489087  -1105370640
types               10321297   1733977896
blocks                102012      8160960
stmts                      0            0
refs                   44297      1806000
exprs                8205133    264995952
constants           11667038    376994197
identifiers           695444     27817760
vecs                  325009    626535448
binfos               2063753    205829776
ssa names                  0            0
constructors          369886      8877264
random kinds         7039351    281574472
lang_decl kinds            0            0
lang_type kinds            0            0
omp clauses                0            0
-----------------------------------------
Total               61322307  -1863768211
-----------------------------------------

Code   Nodes



I think all the blocks read to WPA are bugs.  We may also do better on sharing

constants.



identifier_node   

[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-10-08 Thread steven at gcc dot gnu.org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375



--- Comment #143 from Steven Bosscher steven at gcc dot gnu.org 2012-10-08 
22:30:20 UTC ---

Created attachment 28395

  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28395

Use size_t for tree code book-keeping



...because overflow looks so sloppy.
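
The negative byte counts in the tree statistics of comment #142 are what a
wrapped 32-bit signed counter looks like. The sketch below only illustrates
that effect: it assumes the counters were 32-bit ints (the attachment title
suggests a move to size_t), and the "true" value used is simply the smallest
non-negative count consistent with the printed number, not a figure taken from
the report.

```python
def as_int32(value: int) -> int:
    """Interpret the low 32 bits of `value` as a signed two's-complement int."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


reported_decl_bytes = -1105370640            # as printed in comment #142

# Hypothetical true count: the printed value plus one wrap of 2**32.
candidate_true_bytes = reported_decl_bytes + 2**32

print(candidate_true_bytes)                  # about 3.19e9 bytes
print(as_int32(candidate_true_bytes))        # wraps back to the printed value
```

If a counter wrapped more than once, adding further multiples of 2**32 gives
the other candidates; the printed value alone cannot tell them apart.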


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-09-15 Thread markus at trippelsdorf dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #141 from Markus Trippelsdorf markus at trippelsdorf dot de 
2012-09-15 14:05:38 UTC ---
After the new IonMonkey JIT went in
(http://blog.mozilla.org/javascript/2012/09/12/ionmonkey-in-firefox-18/) 
peak memory use went up. It is now 6.8GB (gcc-4.7 roughly the same: 6.5GB).
So we're approaching the point where an 8GB machine isn't enough to
build Firefox with LTO...


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-08-18 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #139 from Jan Hubicka hubicka at gcc dot gnu.org 2012-08-18 
09:36:55 UTC ---
oprofile of WPA:
samples  %        image name       app name         symbol name
649295   18.2243  lto1             lto1             lto_main()
341256   9.5783   lto1             lto1             htab_find_slot_with_hash
126567   3.5525   lto1             lto1             do_estimate_growth_1(cgraph_node*, void*)
97142    2.7266   lto1             lto1             htab_expand
89658    2.5165   libc-2.11.1.so   libc-2.11.1.so   _int_malloc
82117    2.3048   lto1             lto1             pointer_map_insert(pointer_map_t*, void const*)
60238    1.6907   lto1             lto1             iterative_hash_hashval_t(unsigned int, unsigned int)
58145    1.6320   lto1             lto1             ggc_internal_alloc_stat(unsigned long, char const*, int, char const*)
53679    1.5067   lto1             lto1             linemap_lookup(line_maps*, unsigned int)
47271    1.3268   lto1             lto1             lto_output_tree(output_block*, tree_node*, bool, bool)
43043    1.2081   lto1             lto1             gt_ggc_mx_lang_tree_node(void*)
42675    1.1978   lto1             lto1             verify_cgraph_node(cgraph_node*)
40609    1.1398   lto1             lto1             streamer_tree_cache_insert_1(streamer_tree_cache_d*, tree_node*, unsigned int*, bool)
40245    1.1296   lto1             lto1             ggc_marked_p(void const*)
39474    1.1079   libc-2.11.1.so   libc-2.11.1.so   memset
38955    1.0934   libc-2.11.1.so   libc-2.11.1.so   malloc_consolidate
32085    0.9006   lto1             lto1             streamer_write_uhwi_stream(lto_output_stream*, unsigned long)
31965    0.8972   lto1             lto1             ggc_set_mark(void const*)
31406    0.8815   lto1             lto1             lto_input_tree(lto_input_block*, data_in*)
29213    0.8199   lto1             lto1             streamer_read_tree_bitfields(lto_input_block*, tree_node*)
26846    0.7535   lto1             lto1             hash_pointer
25870    0.7261   libc-2.11.1.so   libc-2.11.1.so   memcpy


We still spend an insanely long time walking types in lto_main (introduced by
Michael's patch).


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-08-18 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #140 from Jan Hubicka hubicka at gcc dot gnu.org 2012-08-19 
05:55:26 UTC ---
Author: hubicka
Date: Sun Aug 19 05:55:20 2012
New Revision: 190509

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190509
Log:

PR lto/45375
* ipa-inline.c (want_inline_small_function_p): Bypass
inline limits for hinted functions.
(edge_badness): Dump hints; decrease badness for hinted functions.
* ipa-inline.h (enum inline_hints_vals): New enum.
(inline_hints): New type.
(edge_growth_cache_entry): Add hints.
(dump_inline_summary): Update.
(dump_inline_hints): Declare.
(do_estimate_edge_hints): Declare.
(estimate_edge_hints): New inline function.
(reset_edge_growth_cache): Update.
* predict.c (cgraph_maybe_hot_edge_p): Do not ice on indirect edges.
* ipa-inline-analysis.c (dump_inline_hints): New function.
(estimate_edge_devirt_benefit): Return true when function should be
hinted.
(estimate_calls_size_and_time): New hints argument; set it when
devirtualization happens.
(estimate_node_size_and_time): New hints argument.
(do_estimate_edge_time): Cache hints.
(do_estimate_edge_growth): Update.
(do_estimate_edge_hints): New function

Modified:
trunk/gcc/ChangeLog
trunk/gcc/ipa-inline-analysis.c
trunk/gcc/ipa-inline.c
trunk/gcc/ipa-inline.h
trunk/gcc/predict.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/ipa/iinline-1.c


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-08-10 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #137 from Jan Hubicka hubicka at gcc dot gnu.org 2012-08-10 
15:06:51 UTC ---
So since the last report we managed to double WPA memory usage and compile
time...
12m wall, 42m user is needed for WPA build.
Execution times (seconds)
 phase opt and generate  :  97.34 (21%) usr   0.33 ( 1%) sys  97.70 (20%) wall   98900 kB ( 3%) ggc
 phase stream in         : 242.70 (51%) usr   5.12 (22%) sys 247.94 (50%) wall 3174311 kB (97%) ggc
 phase stream out        : 131.99 (28%) usr  17.49 (76%) sys 149.59 (30%) wall       0 kB ( 0%) ggc
 garbage collection      :  24.01 ( 5%) usr   0.00 ( 0%) sys  24.03 ( 5%)
 ipa lto gimple out      :  12.59 ( 3%) usr   1.07 ( 5%) sys  13.69 ( 3%) wall       0 kB ( 0%) ggc
 ipa lto decl in         : 188.50 (40%) usr   3.93 (17%) sys 192.53 (39%) wall 2083552 kB (64%) ggc
 ipa lto decl out        : 113.33 (24%) usr   8.48 (37%) sys 121.84 (25%) wall       0 kB ( 0%) ggc
 ipa lto cgraph I/O      :   5.58 ( 1%) usr   0.67 ( 3%) sys   6.25 ( 1%) wall  684122 kB (21%) ggc
 ipa lto decl merge      :  10.64 ( 2%) usr   0.01 ( 0%) sys  10.64 ( 2%) wall     291 kB ( 0%) ggc
 ipa lto cgraph merge    :   9.15 ( 2%) usr   0.01 ( 0%) sys   9.17 ( 2%) wall   15100 kB ( 0%) ggc
 whopr wpa               :   5.80 ( 1%) usr   0.05 ( 0%) sys   5.89 ( 1%) wall       1 kB ( 0%) ggc
 whopr wpa I/O           :   2.19 ( 0%) usr   7.94 (35%) sys  10.19 ( 2%)
 inline heuristics       :  61.46 (13%) usr   0.31 ( 1%) sys  61.80 (12%) wall  351753 kB (11%) ggc
 callgraph verifier      :  15.97 ( 3%) usr   0.06 ( 0%) sys  16.00 ( 3%) wall       0 kB ( 0%) ggc
 TOTAL                   : 472.05            22.94           495.25            3274649 kB


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-08-10 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #138 from Jan Hubicka hubicka at gcc dot gnu.org 2012-08-10 
15:35:44 UTC ---
Actually not, I looked up the wrong report. The last report in comment #121 shows:
 TOTAL                   : 616.43            22.26           651.79            2165706 kB

So we actually got noticeably faster, but need more memory. 1GB of GGC space,
but a lot more in top report.  I will look into mem report analysis once I am
done with merging some other cleanups/speedups.
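
As a quick check of "faster, but need more memory", the two TOTAL lines can be
compared directly. This is only a sketch: it reads the fused TOTAL fields as
usr / sys / wall seconds followed by GGC kB, which is my interpretation of the
flattened output, and the variable names are made up for illustration.

```python
# TOTAL lines from comment #137 (current) and comment #121 (previous).
current = {"usr": 472.05, "sys": 22.94, "wall": 495.25, "ggc_kb": 3274649}
previous = {"usr": 616.43, "sys": 22.26, "wall": 651.79, "ggc_kb": 2165706}

# Wall-clock time saved relative to the previous report, in percent.
wall_saving = (1.0 - current["wall"] / previous["wall"]) * 100.0

# Additional GGC memory, in kB and in GiB (1 GiB = 1024 * 1024 kB).
extra_ggc_kb = current["ggc_kb"] - previous["ggc_kb"]
extra_ggc_gib = extra_ggc_kb / (1024 * 1024)

print(f"wall time: {wall_saving:.1f}% lower")
print(f"GGC memory: +{extra_ggc_kb} kB (about {extra_ggc_gib:.2f} GiB)")
```

So WPA wall time dropped by roughly a quarter while GGC usage grew by a bit
over 1GB, which matches the remark above.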


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-05-13 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #136 from Jan Hubicka hubicka at gcc dot gnu.org 2012-05-13 
16:29:04 UTC ---
... and oprofile of compilation stage of -flto-partition=none
samples  %        image name       app name         symbol name
194976   2.8536   lto1             lto1             alloc_page
109091   1.5966   libc-2.11.1.so   libc-2.11.1.so   _int_malloc
99458    1.4556   lto1             lto1             operand_equal_p
88092    1.2893   lto1             lto1             record_reg_classes
87508    1.2807   lto1             lto1             bitmap_set_bit
75628    1.1069   lto1             lto1             estimate_edge_growth
68760    1.0064   lto1             lto1             mem_attrs_eq_p
62151    0.9096   lto1             lto1             for_each_rtx_1
58274    0.8529   libc-2.11.1.so   libc-2.11.1.so   memset
55257    0.8087   libc-2.11.1.so   libc-2.11.1.so   malloc
52116    0.7628   lto1             lto1             htab_find_slot_with_hash
50481    0.7388   oprofiled        oprofiled        /usr/bin/oprofiled
42524    0.6224   lto1             lto1             ggc_set_mark
40190    0.5882   lto1             lto1             constrain_operands
40124    0.5872   lto1             lto1             lookup_page_table_entry
39279    0.5749   lto1             lto1             extract_insn
34436    0.5040   lto1             lto1             ggc_internal_alloc_stat
33609    0.4919   lto1             lto1             preprocess_constraints
32843    0.4807   lto1             lto1             get_attr_enabled
32582    0.4769   lto1             lto1             reload_cse_simplify_operands
32573    0.4767   lto1             lto1             bitmap_clear_bit
32278    0.4724   libc-2.11.1.so   libc-2.11.1.so   malloc_consolidate
29633    0.4337   lto1             lto1             bitmap_bit_p
29593    0.4331   lto1             lto1             find_reg_note
29428    0.4307   libc-2.11.1.so   libc-2.11.1.so   _int_free
29161    0.4268   lto1             lto1             df_note_bb_compute
28939    0.4235   libc-2.11.1.so   libc-2.11.1.so   calloc
28794    0.4214   lto1             lto1             cse_insn
28084    0.4110   lto1             lto1             find_reloads
26192    0.3833   lto1             lto1             ix86_decompose_address
25211    0.3690   libc-2.11.1.so   libc-2.11.1.so   memcpy
25016    0.3661   lto1             lto1             df_ref_create_structure
24321    0.3560   lto1             lto1             nonzero_bits1
24066    0.3522   lto1             lto1             htab_traverse_noresize
23895    0.3497   libc-2.11.1.so   libc-2.11.1.so   free


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-05-12 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #130 from Jan Hubicka hubicka at gcc dot gnu.org 2012-05-12 
14:44:47 UTC ---
After fixing one linker error, I can now build Mozilla with
-flto-partition=none.  It takes 11GB and 40 minutes, so there is space for
improvement ;)

There are some obvious questions, like why IRA needs 63% of GGC memory, and VRP
23%

Also the -flto-partition=none .text section is now 18% smaller.  This is large
enough to be declared a bug, but I am not sure how to track it.

Note that my machine has quite poor single-CPU performance, so the compile times
are likely not comparable with the LLVM ones reported above (and I also use
a debugging build).

 ipa lto gimple in       :  52.12 ( 2%) usr   3.68 ( 9%) sys  55.72 ( 2%) wall 2998249 kB (84%) ggc
 ipa lto decl in         : 225.68 ( 8%) usr   2.39 ( 6%) sys 228.17 ( 8%) wall 1124821 kB (31%) ggc
 ipa lto cgraph I/O      :   4.82 ( 0%) usr   0.44 ( 1%) sys   5.27 ( 0%) wall  684110 kB (19%) ggc
 cfg construction        :   3.01 ( 0%) usr   0.12 ( 0%) sys   3.29 ( 0%) wall   70205 kB ( 2%) ggc
 cfg cleanup             :  46.57 ( 2%) usr   0.41 ( 1%) sys  46.69 ( 2%) wall   75005 kB ( 2%) ggc
 df live regs            :  78.21 ( 3%) usr   0.25 ( 1%) sys  77.55 ( 3%) wall       0 kB ( 0%) ggc
 alias analysis          :  25.59 ( 1%) usr   0.12 ( 0%) sys  25.88 ( 1%) wall  474769 kB (13%) ggc
 parser (global)         :   8.62 ( 0%) usr   0.65 ( 2%) sys  10.00 ( 0%) wall  259389 kB ( 7%) ggc
 inline heuristics       :  87.23 ( 3%) usr   0.51 ( 1%) sys  88.41 ( 3%) wall  451358 kB (13%) ggc
 integration             :  50.61 ( 2%) usr   1.51 ( 4%) sys  52.67 ( 2%) wall 1479979 kB (41%) ggc
 tree CFG cleanup        :  46.68 ( 2%) usr   0.43 ( 1%) sys  48.09 ( 2%) wall   70493 kB ( 2%) ggc
 tree VRP                :  65.88 ( 2%) usr   0.73 ( 2%) sys  66.71 ( 2%) wall  862879 kB (24%) ggc
 tree copy propagation   :  22.30 ( 1%) usr   0.17 ( 0%) sys  22.11 ( 1%) wall  144298 kB ( 4%) ggc
 tree PTA                :  46.70 ( 2%) usr   0.06 ( 0%) sys  46.90 ( 2%) wall  100249 kB ( 3%) ggc
 tree SSA rewrite        :  19.16 ( 1%) usr   0.15 ( 0%) sys  19.09 ( 1%) wall  149347 kB ( 4%) ggc
 tree SSA incremental    :  27.75 ( 1%) usr   0.61 ( 1%) sys  27.86 ( 1%) wall   72307 kB ( 2%) ggc
 tree operand scan       :  57.17 ( 2%) usr   3.03 ( 7%) sys  59.92 ( 2%) wall 1296208 kB (36%) ggc
 dominator optimization  :  35.95 ( 1%) usr   0.21 ( 0%) sys  35.74 ( 1%) wall  311024 kB ( 9%) ggc
 tree CCP                :  31.61 ( 1%) usr   0.12 ( 0%) sys  31.17 ( 1%) wall      69 kB ( 3%) ggc
 tree PRE                :  87.46 ( 3%) usr   0.60 ( 1%) sys  88.62 ( 3%) wall  538859 kB (15%) ggc
 tree FRE                :  47.37 ( 2%) usr   0.58 ( 1%) sys  45.89 ( 2%) wall  274455 kB ( 8%) ggc
 tree aggressive DCE     :   8.96 ( 0%) usr   0.22 ( 1%) sys   8.86 ( 0%) wall  137686 kB ( 4%) ggc
 tree forward propagate  :  10.28 ( 0%) usr   0.10 ( 0%) sys  10.33 ( 0%) wall   56466 kB ( 2%) ggc
 tree slp vectorization  :  25.42 ( 1%) usr   0.16 ( 0%) sys  25.50 ( 1%) wall  436119 kB (12%) ggc
 complete unrolling      :   5.81 ( 0%) usr   0.13 ( 0%) sys   6.07 ( 0%) wall  115165 kB ( 3%) ggc
 tree vectorization      :   1.44 ( 0%) usr   0.05 ( 0%) sys   1.36 ( 0%) wall   31337 kB ( 1%) ggc
 tree iv optimization    :  13.00 ( 0%) usr   0.08 ( 0%) sys  12.94 ( 0%) wall  185893 kB ( 5%) ggc
 dominance computation   :  48.61 ( 2%) usr   0.54 ( 1%) sys  47.65 ( 2%) wall       0 kB ( 0%) ggc
 expand vars             :  18.81 ( 1%) usr   0.09 ( 0%) sys  18.42 ( 1%) wall  167798 kB ( 5%) ggc
 expand                  : 116.32 ( 4%) usr   0.61 ( 1%) sys 116.22 ( 4%) wall 1508612 kB (42%) ggc
 forward prop            :  23.01 ( 1%) usr   0.36 ( 1%) sys  23.43 ( 1%) wall  130825 kB ( 4%) ggc
 CSE                     :  67.21 ( 2%) usr   0.23 ( 1%) sys  66.28 ( 2%) wall   44439 kB ( 1%) ggc
 dead store elim1        :  20.47 ( 1%) usr   0.10 ( 0%) sys  20.83 ( 1%) wall  103309 kB ( 3%) ggc
 dead store elim2        :  18.99 ( 1%) usr   0.18 ( 0%) sys  20.48 ( 1%) wall  140398 kB ( 4%) ggc
 CPROP                   :  52.83 ( 2%) usr   0.33 ( 1%) sys  52.91 ( 2%) wall  336514 kB ( 9%) ggc
 PRE                     :  30.60 ( 1%) usr   0.06 ( 0%) sys  30.51 ( 1%) wall   52724 kB ( 1%) ggc
 CSE 2                   :  37.89 ( 1%) usr   0.04 ( 0%) sys  38.88 ( 1%) wall   29785 kB ( 1%) ggc
 combiner                :  80.20 ( 3%) usr   0.23 ( 1%) sys  80.57 ( 3%) wall  400168 kB (11%) ggc
 integrated RA           : 191.13 ( 7%) usr   0.44 ( 1%) sys 190.64 ( 7%) wall 2328880 kB (65%) ggc
 reload                  :  65.46 ( 2%) usr   0.09 ( 0%) sys  67.43 ( 2%) wall  193522 kB ( 5%) ggc
 reload CSE regs         :  56.71 ( 2%) usr   0.14 ( 0%) sys  56.49 ( 2%) wall  241394 kB ( 7%) ggc
 thread pro- & epilogue  :  14.43 ( 1%) usr   0.15 ( 0%) sys  14.97 ( 1%) wall  201098 kB ( 6%) ggc
 final                   :  44.77 ( 2%) usr   2.80 ( 6%) sys  48.99 ( 2%) wall  367580 kB (10%) ggc
 rest 

[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-05-12 Thread steven at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

Steven Bosscher steven at gcc dot gnu.org changed:

   What|Removed |Added

 CC||steven at gcc dot gnu.org

--- Comment #131 from Steven Bosscher steven at gcc dot gnu.org 2012-05-12 
15:52:54 UTC ---
(In reply to comment #130)
 There are some obvious questions, like why IRA needs 63% of GGC memory,
 and VRP  23%

  tree VRP:  65.88 ( 2%) usr   0.73 ( 2%) sys  66.71 
( 2%) wall  862879 kB (24%) ggc

Is it possible to do this again with gathering statistics enabled? The
only thing I can think of for this would be ASSERT_EXPRs and all the
rewriting involved for them.


  tree slp vectorization  :  25.42 ( 1%) usr   0.16 ( 0%) sys  25.50
 ( 1%) wall  436119 kB (12%) ggc

This 12% also seems excessive.


  CPROP   :  52.83 ( 2%) usr   0.33 ( 1%) sys  52.91
 ( 2%) wall  336514 kB ( 9%) ggc

And this one also.  I'll see if I can understand and explain this one.


  integrated RA   : 191.13 ( 7%) usr   0.44 ( 1%) sys 190.64
 ( 7%) wall 2328880 kB (65%) ggc

Uh, wow! :-(


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-05-12 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #132 from Jan Hubicka hubicka at ucw dot cz 2012-05-12 18:32:14 
UTC ---
   tree VRP:  65.88 ( 2%) usr   0.73 ( 2%) sys  66.71 
 ( 2%) wall  862879 kB (24%) ggc
 
 Is it possible to do this again with gathering statistics enabled? The

I started it some time ago, but it takes a while (it runs out of RAM even
on my machine ;)

 only thing I can think of for this would be ASSERT_EXPRs and all the
 rewriting involved for them.

It also might be folding doing too much of temporary stuff.

   tree slp vectorization  :  25.42 ( 1%) usr   0.16 ( 0%) sys  25.50
  ( 1%) wall  436119 kB (12%) ggc
 
 This 12% also seems excessive.

Indeed it is.
   integrated RA   : 191.13 ( 7%) usr   0.44 ( 1%) sys 190.64
  ( 7%) wall 2328880 kB (65%) ggc
 
 Uh, wow! :-(

Yep, something seems degenerate here.  IRA is usually not that big of a memory
hog.

Honza


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-05-12 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #133 from Jan Hubicka hubicka at ucw dot cz 2012-05-12 19:07:32 
UTC ---
Another thing to observe is that GGC memory is just 4GB.  I am not sure where
the other 8GB goes when our IL is believed
to be the major memory consumer and it resides almost completely in GGC memory.

Perhaps some of the streaming hashtables get out of control.

Also it seems that line number info is about 1GB. It may be a win to write better
streaming of locations.
The current one enables almost no reuse of locators.

Honza


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-05-12 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #134 from Jan Hubicka hubicka at gcc dot gnu.org 2012-05-12 
20:22:27 UTC ---
I tracked down the LTO/WHOPR code size difference. It is EH handling. The EH frame
is empty for the LTO build and quite large for WHOPR.  Probably -fno-exceptions is
getting lost on the way to ltrans?

With memory stats there don't seem to be major surprises:
source location                                            Garbage             Freed              Leak          Overhead      Times
tree-phinodes.c:129 (allocate_phi_node)            110246192: 0.8%           0: 0.0%     3405296: 0.1%      409376: 0.0%     372408
gimple.c:600 (gimple_build_nop)                    119935632: 0.8%           0: 0.0%      252144: 0.0%           0: 0.0%    2503912
gimplify.c:437 (create_tmp_var_raw)                119589760: 0.8%           0: 0.0%     1119200: 0.0%           0: 0.0%     754431
tree-vrp.c:3993 (build_assert_expr_for)            124663296: 0.9%           0: 0.0%           0: 0.0%           0: 0.0%    1298576
emit-rtl.c:3731 (make_jump_insn_raw)               118395600: 0.8%           0: 0.0%    11138960: 0.3%           0: 0.0%    1619182
tree-streamer-in.c:484 (streamer_alloc_tree)        90340024: 0.6%           0: 0.0%    51300472: 1.5%        4376: 0.0%    1420249
simplify-rtx.c:183 (simplify_gen_binary)           153607224: 1.1%           0: 0.0%      619968: 0.0%           0: 0.0%    6426133
fold-const.c:1870 (fold_convert_loc)               154700600: 1.1%           0: 0.0%        2160: 0.0%           0: 0.0%    3867569
ggc-common.c:253 (ggc_cleared_alloc_ptr_array_tw    80243272: 0.6%  1267966456:15.3%    76357960: 2.2%    11155352: 1.2%    1833025
lto/lto.c:281 (lto_read_in_decl_state)                835696: 0.0%           0: 0.0%   163487336: 4.6%    31116920: 3.4%    4176305
cfg.c:216 (connect_src)                            174302184: 1.2%      623048: 0.0%     7861944: 0.2%      133632: 0.0%    4542618
cfg.c:226 (connect_dest)                           177198328: 1.2%     5444688: 0.1%     8603432: 0.2%      347648: 0.0%    4628047
tree.c:9115 (make_vector_type)                     206615472: 1.4%           0: 0.0%        6720: 0.0%           0: 0.0%    1229894
emit-rtl.c:639 (gen_rtx_MEM)                       202133352: 1.4%           0: 0.0%     6629016: 0.2%           0: 0.0%    8698432
dwarf2cfi.c:386 (copy_cfi_row)                     212886640: 1.5%           0: 0.0%           0: 0.0%           0: 0.0%    1400570
tree-inline.c:4851 (copy_decl_no_change)           211988960: 1.5%           0: 0.0%     7283480: 0.2%           0: 0.0%    1387268
tree-ssanames.c:78 (init_ssanames)                 224107008: 1.6%   252869632: 3.1%        1536: 0.0%   153516032:16.6%     309555
lists.c:144 (alloc_EXPR_LIST)                      236354400: 1.7%           0: 0.0%     5798160: 0.2%           0: 0.0%   10089690
gimple.c:2237 (gimple_copy)                        268995784: 1.9%           0: 0.0%     4002872: 0.1%      644208: 0.1%    2530798
gimple-streamer-in.c:95 (input_gimple_stmt)        272340080: 1.9%           0: 0.0%     4356168: 0.1%      917040: 0.1%    2550173
tree-inline.c:4331 (copy_tree_r)                   286698704: 2.0%           0: 0.0%     2053920: 0.1%           0: 0.0%    5999420
rtl.c:287 (copy_rtx)                               291942896: 2.0%           0: 0.0%      318864: 0.0%           0: 0.0%   12315136
emit-rtl.c:393 (gen_raw_REG)                       271761568: 1.9%           0: 0.0%    25188032: 0.7%           0: 0.0%    9279675
cselib.c:1896 (cselib_subst_to_values)             299291264: 2.1%           0: 0.0%           0: 0.0%           0: 0.0%   12658684
emit-rtl.c:5427 (init_emit)                        354914672: 2.5%    19547728: 0.2%           0: 0.0%   102897600:11.1%     132600
cgraph.c:359 (cgraph_allocate_node)                        0: 0.0%           0: 0.0%   401297520:11.4%           0: 0.0%    1286210
emit-rtl.c:3679 (make_insn_raw)                    435416472: 3.0%           0: 0.0%     1754496: 0.0%           0: 0.0%    6071819
fold-const.c:7624 (build_fold_addr_expr_with_typ   463283920: 3.2%           0: 0.0%       72880: 0.0%           0: 0.0%   11583920
tree-ssanames.c:141 (make_ssa_name_fn)             459164960: 3.2%           0: 0.0%     5805920: 0.2%           0: 0.0%    5812136
cfg.c:142 (alloc_block)                            469702464: 3.3%           0: 0.0%    20328672: 0.6%           0: 0.0%    4375278
toplev.c:964 (realloc_for_line_map)                        0: 0.0%   357908640: 4.3%  1073741848:30.4%         184: 0.0%          9
tree.c:1228 (build_int_cst_wide)                  1188738504: 8.3%           0: 0.0%    31478720: 0.9%   401175208:43.3%     295230
tree-streamer-in.c:495 (streamer_alloc_tree)      2413661896:16.9%           0: 0.0%  1163973288:32.9%    41183648: 4.4%   28110064
Total                                            14300758513        8262871404        3534486067         927547008        308001940

From explicitly freed GGC mem there are few interesting cases:
alias.c:2807 (init_alias_analysis)0: 0.0%  

[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-05-12 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #135 from Jan Hubicka hubicka at gcc dot gnu.org 2012-05-12 
21:33:36 UTC ---
... and mem reports on WPA stage:

toplev.c:964 (realloc_for_line_map)                       0: 0.0%   89473168: 9.4%  268435472:10.3%        160: 0.0%          8
cgraph.c:359 (cgraph_allocate_node)                       0: 0.0%          0: 0.0%  401297520:15.3%          0: 0.0%    1286210
tree.c:1228 (build_int_cst_wide)                 1188709752:33.7%          0: 0.0%   22765400: 0.9%  399425424:83.1%     208540
tree-streamer-in.c:495 (streamer_alloc_tree)     1950272016:55.3%          0: 0.0% 1143907104:43.7%   41182080: 8.6%   22462122
Total                                            3527995024        956449616       2618397893        480920037         47749265
source location                                     Garbage            Freed             Leak         Overhead          Times


So about 50% trees, 15% cgraph nodes (I do have plans for making those
smaller), 10% linemaps (I wonder whether a simple cache would save a lot of
locators), and 5% inline summaries.
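
The shares quoted above can be re-derived from the Leak column of the WPA
report. The snippet below is only a small arithmetic sketch: `leak_share` is a
hypothetical helper (not part of GCC), and the total of 2618397893 bytes is an
assumption, read off the "Total" line of the report where the digits run
together.

```c
#include <assert.h>

/* Hypothetical helper: share of PART in TOTAL, in percent.
   Both values are byte counts taken from the Leak column.  */
double
leak_share (unsigned long long part, unsigned long long total)
{
  assert (total != 0);
  return 100.0 * (double) part / (double) total;
}
```

With the assumed total, `leak_share (1143907104ULL, 2618397893ULL)` (the
streamer_alloc_tree row) comes out near 43.7, the cgraph_allocate_node row
(401297520 bytes) near 15.3, and the realloc_for_line_map row (268435472
bytes) near 10.3, which matches the percentages printed in the report.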

I wonder who is producing that 1GB of temporary integer nodes? Is someone
abusing them for counting? It is there before IPA, so it seems to be the
streaming or type machinery.

Heap vectors:

source location                                        Leak             Peak           Times
---
ipa-reference.c:186 (set_reference_optimization_   10289688:10.5%   11240664         13: 0.0%
lto-cgraph.c:118 (lto_cgraph_encoder_encode)       12756976:13.0%   23348152      26300: 0.2%
ipa-ref.c:55 (ipa_record_reference)                13593072:13.8%   41932432    1000565: 6.0%
passes.c:2214 (execute_one_pass)                   21214520:21.5%   41942992     557113: 3.3%
ipa-inline-analysis.c:804 (inline_summary_alloc)   30037064:30.5%   30037064          1: 0.0%
Total                                              98450004                    16768143

Bitmap Overall   Allocated   
PeakLeak   searched   search itr
-
ipa-reference.c:911 (propagate) 37274131244280   
3122372031223720  0  0
ipa-reference.c:739 (propagate) 32925813341680
3058960 3058960  0  0
ipa-reference.c:923 (propagate) 37218625153920   
2513852025138520  0  0
ipa-reference.c:417 (init_function_info)48726319809560   
1980956019809560551335
ipa-reference.c:418 (init_function_info)48726319584680   
1958468019584680 79 45
ipa-reference.c:747 (propagate) 32935113229360
3053920 3053920  0  0

Kind   Nodes  Bytes
---
decls11059354 1770384416
types6163492 1035466656
blocks 1 80
stmts  0  0
refs5243 267944
exprs1826905   7444
constants2198755   72290570
identifiers   538891   21555640
vecs  208540  412624304
binfos   1420249  141631744
ssa names111   8880
constructors  1591693820056
random kinds 3270917  130837088

Honza


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-05-11 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #124 from Jan Hubicka hubicka at ucw dot cz 2012-05-11 08:34:17 
UTC ---
> Just for comparison, clang with -O4 runs only single threaded and does
> everything in memory (no streaming out). It uses 3.5GB of memory (peak) and
> takes 19 minutes to finish...

Interesting.  Microsoft's compiler is also barely in 4GB space, right?
Is it with debug info?

I will try a non-WHOPR build to see how bad we are.  The actual IL is about
1.5GB of the footprint (measuring GGC memory).  I think a good part of the rest
comes from mmap address space (the object files are rather large).

Honza


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-05-11 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #125 from Richard Guenther rguenth at gcc dot gnu.org 2012-05-11 
08:44:51 UTC ---
(In reply to comment #122)
> oprofile shows:
> 139188   15.6963  lto1             lto1             uniquify_nodes
> 66390     7.4868  lto1             lto1             estimate_edge_growth
> 52815     5.9560  lto1             lto1             VEC_edge_growth_cache_entry_base_length
> 47137     5.3157  lto1             lto1             iterative_hash_hashval_t
> 34037     3.8384  lto1             lto1             htab_find_slot_with_hash
> 33604     3.7895  lto1             lto1             bp_unpack_value
> 26584     2.9979  lto1             lto1             do_estimate_growth_1
> 21410     2.4144  lto1             lto1             ggc_set_mark
> 17124     1.9311  lto1             lto1             inflate_fast
> 14464     1.6311  lto1             lto1             streamer_read_uhwi
> 14204     1.6018  lto1             lto1             lookup_page_table_entry
> 11430     1.2890  libc-2.11.1.so   libc-2.11.1.so   memset
> 11405     1.2861  lto1             lto1             streamer_read_hwi_in_range
> 11286     1.2727  lto1             lto1             gt_ggc_mx_lang_tree_node
> 11017     1.2424  lto1             lto1             iterative_hash_gimple_type
> 10851     1.2237  lto1             lto1             pointer_map_insert
> 10674     1.2037  lto1             lto1             lto_input_tree
> 10536     1.1881  lto1             lto1             ht_lookup_with_hash
> 10269     1.1580  lto1             lto1             streamer_read_uchar
> 9972      1.1245  lto1             lto1             streamer_read_uchar
> 9089      1.0250  libc-2.11.1.so   libc-2.11.1.so   _int_malloc
> 9086      1.0246  lto1             lto1             alloc_page
> 6603      0.7446  lto1             lto1             VEC_edge_growth_cache_entry_base_index
>
> looks like uniquify_nodes got out of control?

Well - the obvious possibly slow part of uniquify nodes is that it walks
all fields of record/union types.  So - do you have a more detailed profile
of uniquify_nodes?


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-05-11 Thread markus at trippelsdorf dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #126 from Markus Trippelsdorf markus at trippelsdorf dot de 
2012-05-11 08:46:39 UTC ---
(In reply to comment #124)
> > Just for comparison, clang with -O4 runs only single threaded and does
> > everything in memory (no streaming out). It uses 3.5GB of memory (peak) and
> > takes 19 minutes to finish...
>
> Interesting.  Microsoft's compiler is also barely in 4GB space, right?

IIRC Mozilla recently switched to a 64-bit toolchain on windows, because the
32-bit linker ran out of memory. So they are above 4GB already.

> Is it with debug info?

No.


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-05-11 Thread mh+gcc at glandium dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #127 from Mike Hommey mh+gcc at glandium dot org 2012-05-11 
08:52:24 UTC ---
(In reply to comment #126)
> (In reply to comment #124)
> > > Just for comparison, clang with -O4 runs only single threaded and does
> > > everything in memory (no streaming out). It uses 3.5GB of memory (peak)
> > > and takes 19 minutes to finish...
> >
> > Interesting.  Microsoft's compiler is also barely in 4GB space, right?
>
> IIRC Mozilla recently switched to a 64-bit toolchain on windows, because the
> 32-bit linker ran out of memory. So they are above 4GB already.

There is unfortunately no cross-linker in MSVC, so you can't link 32-bit
binaries with a 64-bit toolchain. We're in the process of switching to a 64-bit
OS with a 32-bit toolchain, which will allow an extra gigabyte of
address space. We've gone past the current 3GB limit a couple of times now, at
which point we moved some stuff out of libxul. Before that, we hit the 2GB
limit, at which point we used the /3GB option that allows for the extra GB.


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-05-11 Thread hubicka at ucw dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #128 from Jan Hubicka hubicka at ucw dot cz 2012-05-11 08:52:50 
UTC ---
> Well - the obvious possibly slow part of uniquify nodes is that it walks
> all fields of record/union types.  So - do you have a more detailed profile
> of uniquify_nodes?

No, I will try to generate annotated sources then.  I am a bit puzzled by this:
looking at the code, there seems to be nothing inherently expensive in it.

Honza


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-05-11 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #129 from Jan Hubicka hubicka at gcc dot gnu.org 2012-05-11 
19:05:19 UTC ---
OK, the slow part of uniquify_nodes is:
      /* Remove us from our main variant list if we are not the
         variant leader.  */
      if (TYPE_MAIN_VARIANT (t) != t)
        {
          tem = TYPE_MAIN_VARIANT (t);
          while (tem && TYPE_NEXT_VARIANT (tem) != t)
            tem = TYPE_NEXT_VARIANT (tem);
          if (tem)
            TYPE_NEXT_VARIANT (tem) = TYPE_NEXT_VARIANT (t);
          TYPE_NEXT_VARIANT (t) = NULL_TREE;
        }
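
A possible reading of why this loop shows up in the profile: the variant list
is singly linked, so finding the predecessor of t means walking from the main
variant. The block below is a minimal stand-alone C model of that walk, not GCC
code. `struct type_node`, `unlink_variant` and `unlink_all_from_tail` are
hypothetical names, and the unlink order (last variant first) is an assumption
chosen to show the worst case, not something measured on the Mozilla build.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a type node; only the two links matter.  */
struct type_node
{
  struct type_node *main_variant;  /* models TYPE_MAIN_VARIANT */
  struct type_node *next_variant;  /* models TYPE_NEXT_VARIANT */
};

/* Unlink T from its main variant's list, mirroring the excerpt above.
   Returns how many times the search loop advanced.  */
size_t
unlink_variant (struct type_node *t)
{
  size_t visited = 0;

  assert (t != NULL);
  if (t->main_variant != t)
    {
      struct type_node *tem = t->main_variant;
      while (tem && tem->next_variant != t)
        {
          tem = tem->next_variant;
          visited++;
        }
      if (tem)
        tem->next_variant = t->next_variant;
      t->next_variant = NULL;
    }
  return visited;
}

/* Chain N variants behind LEADER in array order, then unlink them
   last to first.  Returns the total number of loop advances.  */
size_t
unlink_all_from_tail (struct type_node *leader, struct type_node *variants,
                      size_t n)
{
  size_t i, total = 0;

  leader->main_variant = leader;
  leader->next_variant = n ? &variants[0] : NULL;
  for (i = 0; i < n; i++)
    {
      variants[i].main_variant = leader;
      variants[i].next_variant = (i + 1 < n) ? &variants[i + 1] : NULL;
    }
  for (i = n; i > 0; i--)
    total += unlink_variant (&variants[i - 1]);
  return total;
}
```

In this model, unlinking n variants from the tail costs n(n-1)/2 loop
advances in total, so a type with many variants makes the removal quadratic.
Whether the real build hits this pattern depends on the order in which
uniquify_nodes processes the variants.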


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-05-10 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #121 from Jan Hubicka hubicka at gcc dot gnu.org 2012-05-10 
21:45:10 UTC ---
With the inliner performance fix I am going to push out today, the situation
looks as follows:
Execution times (seconds)
 phase parsing       : 606.20 (98%) usr  21.98 (99%) sys 641.28 (98%) wall 2164274 kB (100%) ggc
 phase cgraph        : 337.00 (55%) usr  18.52 (83%) sys 367.32 (56%) wall   88841 kB ( 4%) ggc
 phase finalize      :  10.21 ( 2%) usr   0.28 ( 1%) sys  10.50 ( 2%) wall       0 kB ( 0%) ggc
 garbage collection  :  33.12 ( 5%) usr   0.04 ( 0%) sys  33.21 ( 5%) wall       0 kB ( 0%) ggc
 ipa cp              :   3.52 ( 1%) usr   0.15 ( 1%) sys   3.67 ( 1%) wall   93737 kB ( 4%) ggc
 ipa lto gimple out  :  14.43 ( 2%) usr   1.38 ( 6%) sys  15.89 ( 2%) wall       0 kB ( 0%) ggc
 ipa lto decl in     : 221.85 (36%) usr   2.52 (11%) sys 225.61 (35%) wall 1153296 kB (53%) ggc
 ipa lto decl out    : 179.65 (29%) usr   8.60 (39%) sys 198.90 (31%) wall       0 kB ( 0%) ggc
 ipa lto cgraph I/O  :   4.59 ( 1%) usr   0.50 ( 2%) sys   5.09 ( 1%) wall  550051 kB (25%) ggc
 ipa lto decl merge  :   9.57 ( 2%) usr   0.00 ( 0%) sys   9.58 ( 1%) wall     291 kB ( 0%) ggc
 ipa lto cgraph merge:   6.06 ( 1%) usr   0.00 ( 0%) sys   6.08 ( 1%) wall   14158 kB ( 1%) ggc
 whopr wpa           :   6.44 ( 1%) usr   0.06 ( 0%) sys   6.54 ( 1%) wall       2 kB ( 0%) ggc
 whopr wpa I/O       :   2.77 ( 0%) usr   8.03 (36%) sys  11.56 ( 2%) wall       0 kB ( 0%) ggc
 ipa reference       :   5.16 ( 1%) usr   0.08 ( 0%) sys   5.25 ( 1%) wall       0 kB ( 0%) ggc
 ipa profile         :   0.55 ( 0%) usr   0.00 ( 0%) sys   0.55 ( 0%) wall       0 kB ( 0%) ggc
 ipa pure const      :   5.59 ( 1%) usr   0.02 ( 0%) sys   5.61 ( 1%) wall       0 kB ( 0%) ggc
 parser (global)     :   3.98 ( 1%) usr   0.04 ( 0%) sys   4.04 ( 1%) wall       0 kB ( 0%) ggc
 inline heuristics   :  94.38 (15%) usr   0.31 ( 1%) sys  94.90 (15%) wall  342900 kB (16%) ggc
 tree CFG cleanup    :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 callgraph verifier  :  18.53 ( 3%) usr   0.08 ( 0%) sys  18.61 ( 3%) wall       0 kB ( 0%) ggc
 varconst            :   0.04 ( 0%) usr   0.03 ( 0%) sys   0.14 ( 0%) wall       0 kB ( 0%) ggc
 unaccounted todo    :   4.70 ( 1%) usr   0.10 ( 0%) sys   4.81 ( 1%) wall       0 kB ( 0%) ggc
 TOTAL               : 616.43            22.26           651.79            2165706 kB

So memory use is somewhat up (4GB compared to 3.2GB) but Mozilla grew a bit,
too, so I think there are no important changes since my last report.

Performance-wise we are in better shape than the 4.7 release (I will backport
the fix; 4.7 needs over 10 minutes in the inliner), but we are still way too
slow, with over 3 minutes needed for streaming in.


[Bug lto/45375] [meta-bug] Issues with building Mozilla with LTO

2012-05-10 Thread hubicka at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45375

--- Comment #122 from Jan Hubicka hubicka at gcc dot gnu.org 2012-05-10 
21:53:54 UTC ---
oprofile shows:
139188   15.6963  lto1             lto1             uniquify_nodes
66390     7.4868  lto1             lto1             estimate_edge_growth
52815     5.9560  lto1             lto1             VEC_edge_growth_cache_entry_base_length
47137     5.3157  lto1             lto1             iterative_hash_hashval_t
34037     3.8384  lto1             lto1             htab_find_slot_with_hash
33604     3.7895  lto1             lto1             bp_unpack_value
26584     2.9979  lto1             lto1             do_estimate_growth_1
21410     2.4144  lto1             lto1             ggc_set_mark
17124     1.9311  lto1             lto1             inflate_fast
14464     1.6311  lto1             lto1             streamer_read_uhwi
14204     1.6018  lto1             lto1             lookup_page_table_entry
11430     1.2890  libc-2.11.1.so   libc-2.11.1.so   memset
11405     1.2861  lto1             lto1             streamer_read_hwi_in_range
11286     1.2727  lto1             lto1             gt_ggc_mx_lang_tree_node
11017     1.2424  lto1             lto1             iterative_hash_gimple_type
10851     1.2237  lto1             lto1             pointer_map_insert
10674     1.2037  lto1             lto1             lto_input_tree
10536     1.1881  lto1             lto1             ht_lookup_with_hash
10269     1.1580  lto1             lto1             streamer_read_uchar
9972      1.1245  lto1             lto1             streamer_read_uchar
9089      1.0250  libc-2.11.1.so   libc-2.11.1.so   _int_malloc
9086      1.0246  lto1             lto1             alloc_page
6603      0.7446  lto1             lto1             VEC_edge_growth_cache_entry_base_index

looks like uniquify_nodes got out of control?

