[Bug middle-end/49310] [4.7 Regression] Compile time hog in var-tracking emit

2011-10-19 Thread aoliva at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49310

--- Comment #10 from Alexandre Oliva <aoliva at gcc dot gnu.org> 2011-10-19 15:50:04 UTC ---
Author: aoliva
Date: Wed Oct 19 15:50:00 2011
New Revision: 180194

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=180194
Log:
PR debug/49310
* var-tracking.c (loc_exp_dep, onepart_aux): New structs.
(variable_part): Replace offset with union.
(enum onepart_enum, onepart_enum_t): New.
(variable_def): Drop cur_loc_changed, add onepart.
(value_chain_def, const_value_chain): Remove.
(VAR_PART_OFFSET, VAR_LOC_1PAUX): New macros, with checking.
(VAR_LOC_DEP_LST, VAR_LOC_DEP_LSTP): New macros.
(VAR_LOC_FROM, VAR_LOC_DEPTH, VAR_LOC_DEP_VEC): Likewise.
(value_chain_pool, value_chains): Remove.
(dropped_values): New.
(struct parm_reg): Only if HAVE_window_save.
(vt_stack_adjustments): Don't record register arguments.
(dv_as_rtx): New.
(dv_onepart_p): Return a onepart_enum_t.
(onepart_pool): New.
(dv_pool): Remove.
(dv_from_rtx): New.
(variable_htab_free): Release onepart aux data.  Reset flags.
(value_chain_htab_hash, value_chain_htab_eq): Remove.
(unshare_variable): Use onepart field.  Propagate onepart aux
data or offset.  Drop cur_loc_changed.
(val_store): Cope with NULL insn.  Rephrase dump output.  Check
for unsuitable locs.  Add FIXME on using cselib locs.
(val_reset): Remove FIXME of unfounded concerns.
(val_resolve): Check for unsuitable locs.  Add FIXME on using
cselib locs.
(variable_union): Use onepart field, adjust access to offset.
(NO_LOC_P): New.
(VALUE_CHANGED, DECL_CHANGED): Update doc.
(set_dv_changed): Clear NO_LOC_P when changed.
(find_loc_in_1pdv): Use onepart field.
(intersect_loc_chains): Likewise.
(unsuitable_loc): New.
(loc_cmp): Keep ENTRY_VALUEs at the end of the loc list.
(add_value_chain, add_value_chains): Remove.
(add_cselib_value_chains, remove_value_chain): Likewise.
(remove_value_chains, remove_cselib_value_chains): Likewise.
(canonicalize_loc_order_check): Use onepart.  Drop cur_loc_changed.
(canonicalize_values_star, canonicalize_vars_star): Use onepart.
(variable_merge_over_cur): Likewise.  Adjust access to offset.
Drop cur_loc_changed.
(variable_merge_over_src): Use onepart field.
(remove_duplicate_values): Likewise.
(variable_post_merge_new_vals): Likewise.
(find_mem_expr_in_1pdv): Likewise.
(dataflow_set_preserve_mem_locs): Likewise.  Drop cur_loc_changed
and value chains.
(dataflow_set_remove_mem_locs): Likewise.  Use VAR_LOC_FROM.
(variable_different_p): Use onepart field.  Move onepart test out
of the loop.
(argument_reg_set): Drop.
(add_uses, add_stores): Preserve but do not record in dynamic
tables equivalences for ENTRY_VALUEs and CFA_based addresses.
Avoid unsuitable address expressions.
(EXPR_DEPTH): Unlimit.
(EXPR_USE_DEPTH): Repurpose PARAM_MAX_VARTRACK_EXPR_DEPTH.
(prepare_call_arguments): Use DECL_RTL_IF_SET.
(dump_var): Adjust access to offset.
(variable_from_dropped, recover_dropped_1paux): New.
(variable_was_changed): Drop cur_loc_changed.  Use onepart.
Preserve onepart aux in empty_var.  Recover empty_var and onepart
aux from dropped_values.
(find_variable_location_part): Special-case onepart.  Adjust
access to offset.
(set_slot_part): Use onepart.  Drop cur_loc_changed.  Adjust
access to offset.  Initialize onepaux.  Drop value chains.
(delete_slot_part): Drop value chains.  Use VAR_LOC_FROM.
(VEC (variable, heap), VEC (rtx, stack)): Define.
(expand_loc_callback_data): Drop dummy, cur_loc_changed,
ignore_cur_loc.  Add expanding, pending, depth.
(loc_exp_dep_alloc, loc_exp_dep_clear): New.
(loc_exp_dep_insert, loc_exp_dep_set): New.
(notify_dependents_of_resolved_value): New.
(update_depth, vt_expand_var_loc_chain): New.
(vt_expand_loc_callback): Revamped.
(resolve_expansions_pending_recursion): New.
(INIT_ELCD, FINI_ELCD): New.
(vt_expand_loc): Use the new macros above.  Drop ignore_cur_loc
parameter, adjust all callers.
(vt_expand_loc_dummy): Drop.
(vt_expand_1pvar): New.
(emit_note_insn_var_location): Operate on non-debug decls only.
Revamp multi-part cur_loc recomputation and one-part expansion.
Drop cur_loc_changed.  Adjust access to offset.
(VEC (variable, heap)): Drop.
(changed_variables_stack, changed_values_stack): Drop.
(check_changed_vars_0, check_changed_vars_1): Remove.
(check_changed_vars_2, check_changed_vars_3): Remove.
(values_to_stack, remove_value_from_changed_variables): New.
(notify_dependents_of_changed_value, process_changed_values): New.
(emit_notes_for_changes): Revamp onepart updates.
(emit_notes_for_differences_1): Use onepart.  Drop cur_loc_changed
and value chains.  Propagate onepaux.  Recover empty_var and onepaux
from dropped_values.
(emit_notes_for_differences_2): Drop value chains.
(emit_notes_in_bb): Adjust.
(vt_emit_notes): Drop value chains, changed_variables_stack.
Initialize and release dropped_values.
(create_entry_value): Revamp.
(vt_add_function_parameter): Use new interface.
(note_register_arguments): Remove.
(vt_initialize): Drop value chains and register arguments.

[Bug middle-end/49310] [4.7 Regression] Compile time hog in var-tracking emit

2011-07-22 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49310

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED

--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-07-22 09:52:45 UTC ---
http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=176538


[Bug middle-end/49310] [4.7 Regression] Compile time hog in var-tracking emit

2011-07-04 Thread Joost.VandeVondele at pci dot uzh.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49310

--- Comment #8 from Joost VandeVondele <Joost.VandeVondele at pci dot uzh.ch> 2011-07-04 18:57:45 UTC ---
patch ;-)

Index: gcc/params.def
===================================================================
--- gcc/params.def  (revision 175820)
+++ gcc/params.def  (working copy)
@@ -845,7 +845,7 @@
 DEFPARAM (PARAM_MAX_VARTRACK_EXPR_DEPTH,
          "max-vartrack-expr-depth",
          "Max. recursion depth for expanding var tracking expressions",
-         20, 0, 0)
+         12, 0, 0)

 /* Set minimum insn uid for non-debug insns.  */


[Bug middle-end/49310] [4.7 Regression] Compile time hog in var-tracking emit

2011-06-09 Thread Joost.VandeVondele at pci dot uzh.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49310

--- Comment #7 from Joost VandeVondele <Joost.VandeVondele at pci dot uzh.ch> 2011-06-09 06:54:33 UTC ---
Two more data points (depth=30 is still running):

max-vartrack-expr-depth=22: var-tracking emit :5459.44 (99%) usr
max-vartrack-expr-depth=25: var-tracking emit :42078.07 (100%) usr

These are the timings for the various -Ox levels:

'-g -O0 -fbounds-check' : 14s
'-g -O1 -fbounds-check' : 2631s
'   -O1 -fbounds-check' : 44s
'-g -O2 -fbounds-check' : 43s

From this point of view, something at -O2 seems to be very good at cleaning up
these very long expressions very cheaply. Would it make sense to run that pass
at -O1 as well (maybe only when these long expressions are observed)?


[Bug middle-end/49310] [4.7 Regression] Compile time hog in var-tracking emit

2011-06-08 Thread Joost.VandeVondele at pci dot uzh.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49310

--- Comment #2 from Joost VandeVondele <Joost.VandeVondele at pci dot uzh.ch> 2011-06-08 07:16:06 UTC ---
The testcase from

http://gcc.gnu.org/bugzilla/attachment.cgi?id=20290

can be used more conveniently. It runs in 1.4s and still spends 50% of time in
var-tracking emit. 

Using callgrind, most of the time is in emit_notes_for_changes, calling
htab_traverse.


[Bug middle-end/49310] [4.7 Regression] Compile time hog in var-tracking emit

2011-06-08 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49310

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aoliva at gcc dot gnu.org,
                   |                            |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-06-08 09:54:29 UTC ---
Using -g -O2 -fbounds-check instead of -g -O1 -fbounds-check cures it,
and e.g. -g -O1 -fbounds-check --param max-vartrack-expr-depth=5
speeds it up.  The programming style is very weird, and combining it with
-fbounds-check, which results in a huge number of bbs, doesn't help.
On top of that, the expression chains for the debug vars really seem to be
very long (and at the points where bounds checking failures are reported the
relevant registers holding the expressions are reused for something else).
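
For anyone who wants to repeat the comparison, the flag combinations above can
be swept with a small script. This is only a sketch: the compiler name
(gfortran) and the source file name (testcase.f90) are assumptions, not taken
from this report; the flags themselves are the ones quoted above plus
-ftime-report, which prints the per-pass timing table.

```python
import shlex


def vartrack_sweep_commands(source, depths, compiler="gfortran"):
    """Build one compile command per max-vartrack-expr-depth value.

    `compiler` and `source` are placeholders (assumptions); substitute the
    real driver and testcase.
    """
    commands = []
    for depth in depths:
        commands.append([
            compiler, "-c", "-g", "-O1", "-fbounds-check",
            "--param", f"max-vartrack-expr-depth={depth}",
            "-ftime-report",  # per-pass timings, including "var-tracking emit"
            source,
        ])
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no
    # dependency on a particular compiler being installed.
    for cmd in vartrack_sweep_commands("testcase.f90", [5, 10, 20]):
        print(shlex.join(cmd))
```

The commands are only printed; piping each one through a shell (or
subprocess.run) and grepping for "var-tracking emit" gives one row per depth.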


[Bug middle-end/49310] [4.7 Regression] Compile time hog in var-tracking emit

2011-06-08 Thread Joost.VandeVondele at pci dot uzh.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49310

--- Comment #4 from Joost VandeVondele <Joost.VandeVondele at pci dot uzh.ch> 2011-06-08 13:23:00 UTC ---
(In reply to comment #3)
> Using -g -O2 -fbounds-check instead of -g -O1 -fbounds-check cures it,
> or e.g. -g -O1 -fbounds-check --param max-vartrack-expr-depth=5
> speeds it up.  The programming style is very weird, and combined with
> -fbounds-check which results in huge number of bbs doesn't help it,
> plus the expression chains for the debug vars really seem to be very long (and
> at the points where bounds checking failures are reported the relevant
> registers holding the expressions are reused for something else).

Not so sure if I agree with your statement about my programming style ;-). 

Sure, timings explode with increasing max-vartrack-expr-depth; maybe the table
below can help to pick a good default?

max-vartrack-expr-depth=2: var-tracking emit :  32.66 (33%) usr
max-vartrack-expr-depth=3: var-tracking emit :  33.03 (34%) usr
max-vartrack-expr-depth=4: var-tracking emit :  33.66 (34%) usr
max-vartrack-expr-depth=5: var-tracking emit :  33.64 (34%) usr
max-vartrack-expr-depth=6: var-tracking emit :  34.34 (35%) usr
max-vartrack-expr-depth=7: var-tracking emit :  35.98 (35%) usr
max-vartrack-expr-depth=8: var-tracking emit :  42.52 (37%) usr
max-vartrack-expr-depth=9: var-tracking emit :  48.79 (39%) usr
max-vartrack-expr-depth=10: var-tracking emit :  53.09 (42%) usr
max-vartrack-expr-depth=12: var-tracking emit :  74.52 (46%) usr
max-vartrack-expr-depth=14: var-tracking emit : 118.90 (63%) usr
max-vartrack-expr-depth=16: var-tracking emit : 313.50 (81%) usr
max-vartrack-expr-depth=18: var-tracking emit : 833.84 (91%) usr
max-vartrack-expr-depth=20: var-tracking emit :2527.38 (97%) usr
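
To put a number on "explode", the per-level growth factor can be read off the
table above. The snippet below is a minimal sketch that copies the usr times
from the table and takes the geometric mean between two depths; the helper
name growth_per_level is made up for this illustration. On these numbers the
time is nearly flat from depth 2 to 7 (about 2% per level) and grows by
roughly 1.6-1.7x per level from depth 14 to 20.

```python
# (depth, seconds of usr time in "var-tracking emit"), copied from the table.
timings = [
    (2, 32.66), (3, 33.03), (4, 33.66), (5, 33.64), (6, 34.34),
    (7, 35.98), (8, 42.52), (9, 48.79), (10, 53.09), (12, 74.52),
    (14, 118.90), (16, 313.50), (18, 833.84), (20, 2527.38),
]


def growth_per_level(points):
    """Geometric-mean growth factor per depth level, first to last point."""
    (d0, t0), (d1, t1) = points[0], points[-1]
    return (t1 / t0) ** (1.0 / (d1 - d0))


flat_part = [p for p in timings if p[0] <= 7]
steep_part = [p for p in timings if p[0] >= 14]

print("depth 2..7  :", round(growth_per_level(flat_part), 3), "x per level")
print("depth 14..20:", round(growth_per_level(steep_part), 3), "x per level")
```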


[Bug middle-end/49310] [4.7 Regression] Compile time hog in var-tracking emit

2011-06-08 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49310

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-06-08 13:34:42 UTC ---
10 was the minimal value to get reasonable debug info in some cases (e.g.
gcc.dg/guality/), so perhaps 20 is too much and we should lower the
default to 12-15.


[Bug middle-end/49310] [4.7 Regression] Compile time hog in var-tracking emit

2011-06-08 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49310

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-06-08 13:38:51 UTC ---
Or alternatively make it more dynamic: if in one function the maximum
level is reached or almost reached (so it could be checked only in
vt_expand_loc_callback) more than some parameter-controlled number of times
(several million or so), the limit would temporarily drop to a lower value.
It would probably need to recheck all var locations at that spot though,
because dummy and real expansion should match.
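
A rough model of that idea, written in Python rather than as a var-tracking.c
patch. Everything here is hypothetical: the class name, the default numbers
and the "almost reached" threshold of limit - 1 are assumptions chosen for
illustration, and the recheck of already-expanded locations is only signalled
to the caller, not implemented.

```python
class AdaptiveDepthLimit:
    """Hypothetical per-function expansion depth limit.

    Counts how often an expansion reaches (or almost reaches) the current
    limit and lowers the limit once a hit budget is exceeded.
    """

    def __init__(self, max_depth=20, reduced_depth=10, hit_budget=1_000_000):
        self.max_depth = max_depth
        self.reduced_depth = reduced_depth
        self.hit_budget = hit_budget
        self.reset()

    def reset(self):
        """Restore the full limit, e.g. when starting a new function."""
        self.limit = self.max_depth
        self.hits = 0

    def note_expansion(self, depth):
        """Record one expansion that went `depth` levels deep.

        Returns True exactly once, when the limit has just been lowered;
        the caller would then have to recheck the locations it already
        expanded so that dummy and real expansion still agree.
        """
        if depth >= self.limit - 1:  # limit reached or almost reached
            self.hits += 1
        if self.hits > self.hit_budget and self.limit > self.reduced_depth:
            self.limit = self.reduced_depth
            return True
        return False
```

With a tiny budget for demonstration, AdaptiveDepthLimit(hit_budget=3) keeps
the limit at 20 for the first three deep expansions and drops it to 10 on the
fourth; reset() restores the full limit for the next function.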


[Bug middle-end/49310] [4.7 Regression] Compile time hog in var-tracking emit

2011-06-07 Thread Joost.VandeVondele at pci dot uzh.ch
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49310

Joost VandeVondele <Joost.VandeVondele at pci dot uzh.ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[4.7 Regression] Compile    |[4.7 Regression] Compile
                   |time hog                    |time hog in var-tracking
                   |                            |emit

--- Comment #1 from Joost VandeVondele <Joost.VandeVondele at pci dot uzh.ch> 2011-06-07 15:10:48 UTC ---
The time report is pretty clear:

 var-tracking emit           : 2565.20 ( 97%) usr  0.08 (  9%) sys  2565.58 ( 97%) wall   65881 kB ( 8%) ggc
 TOTAL                       : 2631.33             0.85             2632.52              788209 kB

For completeness the full report is below

Execution times (seconds)
 phase setup                 :    0.03 (  0%) usr  0.01 (  1%) sys     0.04 (  0%) wall     261 kB ( 0%) ggc
 phase parsing               :    1.12 (  0%) usr  0.06 (  7%) sys     1.18 (  0%) wall   45507 kB ( 6%) ggc
 phase generate              : 2630.17 (100%) usr  0.78 ( 92%) sys  2631.29 (100%) wall  742440 kB (94%) ggc
 phase finalize              :    0.01 (  0%) usr  0.00 (  0%) sys     0.01 (  0%) wall       0 kB ( 0%) ggc
 garbage collection          :    3.46 (  0%) usr  0.01 (  1%) sys     3.47 (  0%) wall       0 kB ( 0%) ggc
 callgraph construction      :    0.05 (  0%) usr  0.00 (  0%) sys     0.11 (  0%) wall    9747 kB ( 1%) ggc
 callgraph optimization      :    0.04 (  0%) usr  0.00 (  0%) sys     0.03 (  0%) wall     182 kB ( 0%) ggc
 ipa reference               :    0.01 (  0%) usr  0.00 (  0%) sys     0.01 (  0%) wall       0 kB ( 0%) ggc
 ipa pure const              :    0.09 (  0%) usr  0.00 (  0%) sys     0.07 (  0%) wall       0 kB ( 0%) ggc
 cfg construction            :    0.02 (  0%) usr  0.00 (  0%) sys     0.01 (  0%) wall     140 kB ( 0%) ggc
 cfg cleanup                 :    0.08 (  0%) usr  0.00 (  0%) sys     0.11 (  0%) wall       9 kB ( 0%) ggc
 CFG verifier                :    0.73 (  0%) usr  0.01 (  1%) sys     0.81 (  0%) wall       0 kB ( 0%) ggc
 trivially dead code         :    0.27 (  0%) usr  0.00 (  0%) sys     0.29 (  0%) wall       0 kB ( 0%) ggc
 df scan insns               :    0.27 (  0%) usr  0.00 (  0%) sys     0.24 (  0%) wall      14 kB ( 0%) ggc
 df multiple defs            :    0.07 (  0%) usr  0.00 (  0%) sys     0.08 (  0%) wall       0 kB ( 0%) ggc
 df reaching defs            :    0.06 (  0%) usr  0.00 (  0%) sys     0.05 (  0%) wall       0 kB ( 0%) ggc
 df live regs                :    0.76 (  0%) usr  0.00 (  0%) sys     0.69 (  0%) wall       0 kB ( 0%) ggc
 df live&initialized regs    :    0.31 (  0%) usr  0.00 (  0%) sys     0.37 (  0%) wall       0 kB ( 0%) ggc
 df use-def / def-use chains :    0.03 (  0%) usr  0.00 (  0%) sys     0.03 (  0%) wall       0 kB ( 0%) ggc
 df reg dead/unused notes    :    0.58 (  0%) usr  0.01 (  1%) sys     0.66 (  0%) wall    8709 kB ( 1%) ggc
 register information        :    0.22 (  0%) usr  0.00 (  0%) sys     0.27 (  0%) wall       0 kB ( 0%) ggc
 alias analysis              :    0.24 (  0%) usr  0.00 (  0%) sys     0.21 (  0%) wall   10901 kB ( 1%) ggc
 alias stmt walking          :    1.28 (  0%) usr  0.04 (  5%) sys     1.22 (  0%) wall     555 kB ( 0%) ggc
 register scan               :    0.01 (  0%) usr  0.00 (  0%) sys     0.03 (  0%) wall       0 kB ( 0%) ggc
 rebuild jump labels         :    0.08 (  0%) usr  0.00 (  0%) sys     0.09 (  0%) wall       0 kB ( 0%) ggc
 parser (global)             :    1.12 (  0%) usr  0.06 (  7%) sys     1.18 (  0%) wall   45506 kB ( 6%) ggc
 inline heuristics           :    0.23 (  0%) usr  0.00 (  0%) sys     0.25 (  0%) wall      86 kB ( 0%) ggc
 tree gimplify               :    0.46 (  0%) usr  0.03 (  4%) sys     0.49 (  0%) wall   59986 kB ( 8%) ggc
 tree eh                     :    0.01 (  0%) usr  0.00 (  0%) sys     0.02 (  0%) wall       0 kB ( 0%) ggc
 tree CFG construction       :    0.04 (  0%) usr  0.00 (  0%) sys     0.01 (  0%) wall    9046 kB ( 1%) ggc
 tree CFG cleanup            :    0.22 (  0%) usr  0.01 (  1%) sys     0.24 (  0%) wall      35 kB ( 0%) ggc
 tree copy propagation       :    0.23 (  0%) usr  0.02 (  2%) sys     0.21 (  0%) wall    2267 kB ( 0%) ggc
 tree find ref. vars         :    0.05 (  0%) usr  0.00 (  0%) sys     0.11 (  0%) wall    5044 kB ( 1%) ggc
 tree PTA                    :    0.62 (  0%) usr  0.05 (  6%) sys     0.70 (  0%) wall    1936 kB ( 0%) ggc
 tree PHI insertion          :    0.02 (  0%) usr  0.00 (  0%) sys     0.01 (  0%) wall     310 kB ( 0%) ggc
 tree SSA rewrite            :    0.17 (  0%) usr  0.02 (  2%) sys     0.22 (  0%) wall   21049 kB ( 3%) ggc
 tree SSA other              :    0.05 (  0%) usr  0.02 (  2%) sys     0.12 (  0%) wall      22 kB ( 0%) ggc
 tree SSA incremental        :    0.13 (  0%) usr  0.00 (  0%) sys     0.14 (  0%) wall     817 kB ( 0%) ggc
 tree operand scan           :    0.17 (  0%) usr  0.10 ( 12%) sys     0.23 (  0%) wall   19454 kB ( 2%) ggc
 dominator optimization      :    0.33 (  0%) usr  0.00 (  0%) sys     0.39 (  0%) wall    5073 kB ( 1%) ggc
 tree SRA                    :    0.01 (  0%) usr  0.00 (  0%) sys     0.01 (  0%) wall       0 kB ( 0%) ggc
 tree CCP                    :    2.13 (  0%) usr  0.00 (  0%) sys     2.13 (  0%) wall    5999 kB ( 1%) ggc
 tree PHI const/copy prop: