BZ69347 shows excessive memory consumption in the FSM threading code. The underlying problem has been there since the introduction of the FSM threader, but the increased reliance on the FSM threader has brought the issue to the forefront of things we need to look at.
Essentially the FSM bits, in two places, allocate vectors large enough for every basic block in the current function and does so each call into the FSM threader. The vectors in question are filled with safe_push style operations, so it's safe to initialize them to any constant size. I selected "10" entries. Doing that brings the memory consumption *way* down.
Second, I noticed that record_temporary_equivalences has gotten considerably more expensive with its immediate-use walking. I'm pondering how to bring that cost down, but in the mean time we can simply remove roughly 1/3 of the calls into the routine that were totally redundant.
We would call it from dom_opt_walker::thread_across_edge to record temporary equivalences related to the given edge. Then we'd call ::thread_across_edge from tree-ssa-threadedge.c, which in turn calls thread_through_normal_block which calls record_temporary_equivalences again on the same edge.
Bootstrapped and regression tested on x86_64. Also verified no code generation differences on my bucket of .i files. Installed on the trunk.
Jeff
commit 3ef2e1cc4f4c6904642891a4f779230737d50613 Author: law <law@138bc75d-0d04-0410-961f-82ee72b054a4> Date: Thu Jan 21 22:21:55 2016 +0000 [PATCH] [PR tree-optimization/69347] Fix memory consumption in threader & minor speed improvement PR middle-end/69347 * tree-ssa-dom.c (dom_opt_dom_walker::thread_across_edge): Avoid useless call to record_temporary_equivalences. * tree-ssa-threadbackward.c (find_jump_threads_backwards): Just allocate 10 slots in the bb_path vector and let it grow as needed. (fsm_find_control_statement_thread_paths): Similarly for the next_path vector. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@232711 138bc75d-0d04-0410-961f-82ee72b054a4 diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 5f0d7c0..c3908ea 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,13 @@ +2016-01-21 Jeff Law <l...@redhat.com> + + PR middle-end/69347 + * tree-ssa-dom.c (dom_opt_dom_walker::thread_across_edge): Avoid + useless call to record_temporary_equivalences. + * tree-ssa-threadbackward.c (find_jump_threads_backwards): Just + allocate 10 slots in the bb_path vector and let it grow as needed. + (fsm_find_control_statement_thread_paths): Similarly for the next_path + vector. + 2016-01-21 David Edelsohn <dje....@gmail.com> * configure.ac (gcc_cv_as_powerpc_mfcrf, gcc_cv_as_machine_directive): diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c index 3eeaa9c..84c9a6a 100644 --- a/gcc/tree-ssa-dom.c +++ b/gcc/tree-ssa-dom.c @@ -935,9 +935,6 @@ dom_opt_dom_walker::thread_across_edge (edge e) m_avail_exprs_stack->push_marker (); m_const_and_copies->push_marker (); - /* Traversing E may result in equivalences we can utilize. */ - record_temporary_equivalences (e, m_const_and_copies, m_avail_exprs_stack); - /* With all the edge equivalences in the tables, go ahead and attempt to thread through E->dest. */ ::thread_across_edge (m_dummy_cond, e, false, diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c index 8d8aa30..8be57a0 100644 --- a/gcc/tree-ssa-threadbackward.c +++ b/gcc/tree-ssa-threadbackward.c @@ -142,7 +142,7 @@ fsm_find_control_statement_thread_paths (tree name, int e_count = 0; edge_iterator ei; vec<basic_block, va_gc> *next_path; - vec_alloc (next_path, n_basic_blocks_for_fn (cfun)); + vec_alloc (next_path, 10); /* When VAR_BB == LAST_BB_IN_PATH, then the first block in the path will already be in VISITED_BBS. When they are not equal, then we @@ -379,7 +379,7 @@ find_jump_threads_backwards (edge e) return; vec<basic_block, va_gc> *bb_path; - vec_alloc (bb_path, n_basic_blocks_for_fn (cfun)); + vec_alloc (bb_path, 10); vec_safe_push (bb_path, e->dest); hash_set<basic_block> *visited_bbs = new hash_set<basic_block>;