------- Comment #12 from abel at gcc dot gnu dot org  2008-11-25 14:28 -------
I have reduced the testcase somewhat, so that the call has two ARG3's instead
of the ten coming from ARG4.  With this smaller testcase, I see that most of
the time is taken by register renaming (cross compiler to spu-elf, compiled
with -O2):

 scheduling            :   0.66 ( 2%) usr   0.03 (30%) sys   0.69 ( 2%) wall   19208 kB (32%) ggc
 integrated RA         :   4.55 (11%) usr   0.00 ( 0%) sys   4.53 (11%) wall     829 kB ( 1%) ggc
 reload                :   2.57 ( 6%) usr   0.01 (10%) sys   2.58 ( 6%) wall   11996 kB (20%) ggc
 reload CSE regs       :   0.23 ( 1%) usr   0.00 ( 0%) sys   0.22 ( 1%) wall    2940 kB ( 5%) ggc
 peephole 2            :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 rename registers      :  32.21 (76%) usr   0.01 (10%) sys  32.22 (75%) wall     993 kB ( 2%) ggc
 scheduling 2          :   0.58 ( 1%) usr   0.03 (30%) sys   0.61 ( 1%) wall    5375 kB ( 9%) ggc
 machine dep reorg     :   0.59 ( 1%) usr   0.01 (10%) sys   0.60 ( 1%) wall    5569 kB ( 9%) ggc
 final                 :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 TOTAL                 :  42.59             0.10            42.71               59919 kB

With -O2 -fno-rename-registers, I get

 scheduling            :   0.66 ( 6%) usr   0.04 (36%) sys   0.70 ( 7%) wall   19208 kB (33%) ggc
 integrated RA         :   4.56 (45%) usr   0.00 ( 0%) sys   4.57 (44%) wall     829 kB ( 1%) ggc
 reload                :   2.58 (25%) usr   0.00 ( 0%) sys   2.59 (25%) wall   11996 kB (21%) ggc
 reload CSE regs       :   0.23 ( 2%) usr   0.00 ( 0%) sys   0.24 ( 2%) wall    2940 kB ( 5%) ggc
 thread pro- & epilogue:   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall      22 kB ( 0%) ggc
 peephole 2            :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.02 ( 0%) wall       0 kB ( 0%) ggc
 rename registers      :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall       0 kB ( 0%) ggc
 scheduling 2          :   0.49 ( 5%) usr   0.04 (36%) sys   0.52 ( 5%) wall    4949 kB ( 9%) ggc
 machine dep reorg     :   0.50 ( 5%) usr   0.02 (18%) sys   0.51 ( 5%) wall    5055 kB ( 9%) ggc
 reorder blocks        :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 final                 :   0.05 ( 0%) usr   0.00 ( 0%) sys   0.05 ( 0%) wall       0 kB ( 0%) ggc
 TOTAL                 :  10.21             0.11            10.35               57732 kB
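
For anyone who wants to repeat the comparison on another testcase, here is a
minimal sketch of a script that diffs the usr column of two timing reports of
the shape shown above.  It is not part of GCC; the function names and the
shortened report excerpts are my own, used only for illustration.

```python
import re

# Matches a timing row such as
# " rename registers      :  32.21 (76%) usr ..."
# and captures the pass name and the first (usr) time.
ROW = re.compile(r"^\s*(?P<name>[^:]+?)\s*:\s*(?P<usr>\d+\.\d+)")


def parse_usr_times(report):
    """Map each pass name to its usr time in seconds."""
    times = {}
    for line in report.splitlines():
        m = ROW.match(line)
        if m:
            times[m.group("name")] = float(m.group("usr"))
    return times


def biggest_deltas(before, after, top=3):
    """Return (pass, seconds saved) pairs, largest saving first."""
    a, b = parse_usr_times(before), parse_usr_times(after)
    # A pass missing from the second report is treated as taking 0 s there.
    deltas = {name: a[name] - b.get(name, 0.0) for name in a if name != "TOTAL"}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)[:top]


# Shortened excerpts of the two reports above (illustration only).
with_rename = """
 rename registers      :  32.21 (76%) usr   0.01 (10%) sys  32.22 (75%) wall
 scheduling 2          :   0.58 ( 1%) usr   0.03 (30%) sys   0.61 ( 1%) wall
 TOTAL                 :  42.59             0.10            42.71
"""
without_rename = """
 rename registers      :   0.04 ( 0%) usr   0.00 ( 0%) sys   0.04 ( 0%) wall
 scheduling 2          :   0.49 ( 5%) usr   0.04 (36%) sys   0.52 ( 5%) wall
 TOTAL                 :  10.21             0.11            10.35
"""

for name, saved in biggest_deltas(with_rename, without_rename):
    print(f"{name:24s} {saved:6.2f} s")
```

On the numbers above, rename registers alone accounts for 32.21 - 0.04 =
32.17 s of the 42.59 - 10.21 = 32.38 s difference in total usr time.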

-frename-registers is enabled by default on spu, so no wonder this is not seen
on other targets. 

oprofile shows me this:

Samples  %        linenr info                 image name               app name                 symbol name
-------------------------------------------------------------------------------
362678   29.6888  rtlanal.c:1412              cc1                      cc1                      note_stores
  362678   100.000  rtlanal.c:1412              cc1                      cc1                      note_stores [self]
-------------------------------------------------------------------------------
304520   24.9280  regrename.c:1941            cc1                      cc1                      rest_of_handle_regrename
  304520   99.8727  regrename.c:1941            cc1                      cc1                      rest_of_handle_regrename [self]
  201       0.0659  bitmap.c:630                cc1                      cc1                      bitmap_set_bit
  99        0.0325  df-scan.c:1217              cc1                      cc1                      df_insn_rescan
  39        0.0128  df-problems.c:107           cc1                      cc1                      df_grow_bb_info
  24        0.0079  (no location information)   cc1                      cc1                      bitmap_clear_bit
  17        0.0056  df-scan.c:573               cc1                      cc1                      df_grow_reg_info
  8         0.0026  emit-rtl.c:1131             cc1                      cc1                      max_reg_num
-------------------------------------------------------------------------------
164550   13.4701  regrename.c:120             cc1                      cc1                      clear_dead_regs
  164550   100.000  regrename.c:120             cc1                      cc1                      clear_dead_regs [self]
-------------------------------------------------------------------------------
  6441     100.000  ira-color.c:1044            cc1                      cc1                      allocno_spill_priority_compare
59894     4.9029  ira-color.c:1044            cc1                      cc1                      allocno_spill_priority_compare
  59894    86.6547  ira-color.c:1044            cc1                      cc1                      allocno_spill_priority_compare [self]
  6441      9.3188  ira-color.c:1044            cc1                      cc1                      allocno_spill_priority_compare
  1148      1.6609  splay-tree.c:348            cc1                      cc1                      splay_tree_remove
  928       1.3426  splay-tree.c:139            cc1                      cc1                      splay_tree_splay
  460       0.6655  ira-color.c:1083            cc1                      cc1                      splay_tree_free
  247       0.3574  alloc-pool.c:325            cc1                      cc1                      pool_free

I don't have enough information to understand where the note_stores calls come
from, and I have stopped investigating for now.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31850
