Re: Compile-time improvement for if conversion.
On Fri, Oct 14, 2016 at 2:07 PM, Yuri Rumyantsevwrote: > Richard, > > Here is updated patch with the changes proposed by you. > > Bootstrapping and regression testing did not show any new failures. > Is it OK for trunk? Ok with dropping the free_dominance_info_for_region variant without struct function argument and making the one with take struct function * as _first_ argument. Thanks, Richard. > ChangeLog: > 2016-10-14 Yuri Rumyantsev > > * dominance.c (dom_info::dom_info): Add new constructor for region > presented by vector of basic blocks. > (dom_init): New method to initialize members common for both > constructors. > (dom_info::dom_info): Invoke dom_init for partial initialization. > (dom_info::get_idom): Add check to corner cases on basic blocks which > are not in region. > (dom_info::calc_dfs_tree): Check M_FAKE_EXIT_EDGE instead of M_REVERSE > to detect unreachable bbs. > (dom_info::calc_idoms): Likewise. > (compute_dom_fast_query_in_region): New function. > (calculate_dominance_info_for_region): Likewise. > (free_dominance_info_for_region): Add couple functions to free > dominance info for region. > * dominance.h: Add prototypes for introduced region-based functions > tree-if-conv.c: (build_region): New function. > (if_convertible_loop_p_1): Invoke local version of post-dominators > calculation before basic block predication with subsequent freeing > post-dominator info. > (tree_if_conversion): Remove free of post-dominator info > (pass_if_conversion::execute): Delete detection of infinite loops > and fake edges to exit block since post-dominator calculation is > performed per if-converted loop only. > > 2016-10-13 13:15 GMT+03:00 Richard Biener : >> On Wed, Oct 12, 2016 at 3:37 PM, Yuri Rumyantsev wrote: >>> Richard, >>> >>> Here is updated patch . I avoided creation of new entry/exit blocks >>> but instead add check to border cases - do not consider also blocks >>> which are out of region. >>> >>> Any comments will be appreciated. >> >> I mostly like it now. Can you split out all the common parts from the >> dom_info constructor >> to a helper method please and just compute m_n_basic_blocks and a max_index >> and >> do all the memory allocation in that common function? >> >> @@ -276,9 +334,10 @@ dom_info::calc_dfs_tree_nonrec (basic_block bb) >> bn = e->src; >> >> /* If the next node BN is either already visited or a border >> -block the current edge is useless, and simply overwritten >> -with the next edge out of the current node. */ >> - if (bn == m_end_block || m_dfs_order[bn->index]) >> +block or out of region the current edge is useless, and >> simply >> +overwritten with the next edge out of the current node. */ >> + if (bn == m_end_block || bn->dom[d_i] == NULL >> >> clever idea ;) Just thinking if this means we support single entry, >> multiple exit >> regions for CDI_DOMINATORs and multiple entry, single exit regions for >> CDI_POST_DOMINATORs. I think so. Please update the function comments >> accordingly. >> >> >> + m_dfsnum = 1; >> + m_nodes = 0; >> + m_fake_exit_edge = NULL; /* Assume that region is SCC. */ >> >> you mean SESE rather than SCC? >> >> @@ -511,7 +573,7 @@ dom_info::calc_idoms () >>: ei_start (bb->preds); >>edge_iterator einext; >> >> - if (m_reverse) >> + if (m_reverse && !in_region) >> { >> /* If this block has a fake edge to exit, process that first. */ >> if (bitmap_bit_p (m_fake_exit_edge, bb->index)) >> >> I think it's better to simply change the if (m_reverse) test to a if >> (m_fake_exit_edge) test. >> >> I noticed you do not initialize n_bbs_in_dom_tree (just set it to >> zero), it's not really used >> anywhere (but in an assert). >> >> free_dominance_info_for_region needs a function level comment (per >> coding guidelines). >> >> Please make free_dominance_info_for_region take a struct function * pointer. >> >> @@ -1318,11 +1350,13 @@ if_convertible_loop_p_1 (struct loop *loop, >> vec *refs) >> { >>unsigned int i; >>basic_block exit_bb = NULL; >> + vec region; >> >>if (find_data_references_in_loop (loop, refs) == chrec_dont_know) >> return false; >> >> - calculate_dominance_info (CDI_DOMINATORS); >> + if (dom_info_state (CDI_DOMINATORS) != DOM_OK) >> +calculate_dominance_info (CDI_DOMINATORS); >> >> This is a premature (unwanted) change. >> >> Other than that the only O(function-size) part left is the zeroing in >> >> + /* Determine max basic block index in region. */ >> + int max_index = region[0]->index; >> + for (size_t i = 1; i < num; i++) >> +if (region[i]->index > max_index) >> + max_index = region[i]->index; >> + max_index += 1; >> + m_dfs_order = new_zero_array (max_index + 1); >> + m_dfs_last =
Re: Compile-time improvement for if conversion.
Richard, Here is updated patch with the changes proposed by you. Bootstrapping and regression testing did not show any new failures. Is it OK for trunk? ChangeLog: 2016-10-14 Yuri Rumyantsev* dominance.c (dom_info::dom_info): Add new constructor for region presented by vector of basic blocks. (dom_init): New method to initialize members common for both constructors. (dom_info::dom_info): Invoke dom_init for partial initialization. (dom_info::get_idom): Add check to corner cases on basic blocks which are not in region. (dom_info::calc_dfs_tree): Check M_FAKE_EXIT_EDGE instead of M_REVERSE to detect unreachable bbs. (dom_info::calc_idoms): Likewise. (compute_dom_fast_query_in_region): New function. (calculate_dominance_info_for_region): Likewise. (free_dominance_info_for_region): Add couple functions to free dominance info for region. * dominance.h: Add prototypes for introduced region-based functions tree-if-conv.c: (build_region): New function. (if_convertible_loop_p_1): Invoke local version of post-dominators calculation before basic block predication with subsequent freeing post-dominator info. (tree_if_conversion): Remove free of post-dominator info (pass_if_conversion::execute): Delete detection of infinite loops and fake edges to exit block since post-dominator calculation is performed per if-converted loop only. 2016-10-13 13:15 GMT+03:00 Richard Biener : > On Wed, Oct 12, 2016 at 3:37 PM, Yuri Rumyantsev wrote: >> Richard, >> >> Here is updated patch . I avoided creation of new entry/exit blocks >> but instead add check to border cases - do not consider also blocks >> which are out of region. >> >> Any comments will be appreciated. > > I mostly like it now. Can you split out all the common parts from the > dom_info constructor > to a helper method please and just compute m_n_basic_blocks and a max_index > and > do all the memory allocation in that common function? > > @@ -276,9 +334,10 @@ dom_info::calc_dfs_tree_nonrec (basic_block bb) > bn = e->src; > > /* If the next node BN is either already visited or a border > -block the current edge is useless, and simply overwritten > -with the next edge out of the current node. */ > - if (bn == m_end_block || m_dfs_order[bn->index]) > +block or out of region the current edge is useless, and > simply > +overwritten with the next edge out of the current node. */ > + if (bn == m_end_block || bn->dom[d_i] == NULL > > clever idea ;) Just thinking if this means we support single entry, > multiple exit > regions for CDI_DOMINATORs and multiple entry, single exit regions for > CDI_POST_DOMINATORs. I think so. Please update the function comments > accordingly. > > > + m_dfsnum = 1; > + m_nodes = 0; > + m_fake_exit_edge = NULL; /* Assume that region is SCC. */ > > you mean SESE rather than SCC? > > @@ -511,7 +573,7 @@ dom_info::calc_idoms () >: ei_start (bb->preds); >edge_iterator einext; > > - if (m_reverse) > + if (m_reverse && !in_region) > { > /* If this block has a fake edge to exit, process that first. */ > if (bitmap_bit_p (m_fake_exit_edge, bb->index)) > > I think it's better to simply change the if (m_reverse) test to a if > (m_fake_exit_edge) test. > > I noticed you do not initialize n_bbs_in_dom_tree (just set it to > zero), it's not really used > anywhere (but in an assert). > > free_dominance_info_for_region needs a function level comment (per > coding guidelines). > > Please make free_dominance_info_for_region take a struct function * pointer. > > @@ -1318,11 +1350,13 @@ if_convertible_loop_p_1 (struct loop *loop, > vec *refs) > { >unsigned int i; >basic_block exit_bb = NULL; > + vec region; > >if (find_data_references_in_loop (loop, refs) == chrec_dont_know) > return false; > > - calculate_dominance_info (CDI_DOMINATORS); > + if (dom_info_state (CDI_DOMINATORS) != DOM_OK) > +calculate_dominance_info (CDI_DOMINATORS); > > This is a premature (unwanted) change. > > Other than that the only O(function-size) part left is the zeroing in > > + /* Determine max basic block index in region. */ > + int max_index = region[0]->index; > + for (size_t i = 1; i < num; i++) > +if (region[i]->index > max_index) > + max_index = region[i]->index; > + max_index += 1; > + m_dfs_order = new_zero_array (max_index + 1); > + m_dfs_last = _dfs_order[max_index]; > > I suppose we can try addressing this as followup. > > Thanks a lot for doing this. > Richard. > >> 2016-10-11 16:48 GMT+03:00 Richard Biener : >>> On Tue, Oct 11, 2016 at 3:23 PM, Yuri Rumyantsev wrote: Richard, I implemented this by passing callback function in_region which returns true if block belongs to region. I
Re: Compile-time improvement for if conversion.
On Wed, Oct 12, 2016 at 3:37 PM, Yuri Rumyantsevwrote: > Richard, > > Here is updated patch . I avoided creation of new entry/exit blocks > but instead add check to border cases - do not consider also blocks > which are out of region. > > Any comments will be appreciated. I mostly like it now. Can you split out all the common parts from the dom_info constructor to a helper method please and just compute m_n_basic_blocks and a max_index and do all the memory allocation in that common function? @@ -276,9 +334,10 @@ dom_info::calc_dfs_tree_nonrec (basic_block bb) bn = e->src; /* If the next node BN is either already visited or a border -block the current edge is useless, and simply overwritten -with the next edge out of the current node. */ - if (bn == m_end_block || m_dfs_order[bn->index]) +block or out of region the current edge is useless, and simply +overwritten with the next edge out of the current node. */ + if (bn == m_end_block || bn->dom[d_i] == NULL clever idea ;) Just thinking if this means we support single entry, multiple exit regions for CDI_DOMINATORs and multiple entry, single exit regions for CDI_POST_DOMINATORs. I think so. Please update the function comments accordingly. + m_dfsnum = 1; + m_nodes = 0; + m_fake_exit_edge = NULL; /* Assume that region is SCC. */ you mean SESE rather than SCC? @@ -511,7 +573,7 @@ dom_info::calc_idoms () : ei_start (bb->preds); edge_iterator einext; - if (m_reverse) + if (m_reverse && !in_region) { /* If this block has a fake edge to exit, process that first. */ if (bitmap_bit_p (m_fake_exit_edge, bb->index)) I think it's better to simply change the if (m_reverse) test to a if (m_fake_exit_edge) test. I noticed you do not initialize n_bbs_in_dom_tree (just set it to zero), it's not really used anywhere (but in an assert). free_dominance_info_for_region needs a function level comment (per coding guidelines). Please make free_dominance_info_for_region take a struct function * pointer. @@ -1318,11 +1350,13 @@ if_convertible_loop_p_1 (struct loop *loop, vec *refs) { unsigned int i; basic_block exit_bb = NULL; + vec region; if (find_data_references_in_loop (loop, refs) == chrec_dont_know) return false; - calculate_dominance_info (CDI_DOMINATORS); + if (dom_info_state (CDI_DOMINATORS) != DOM_OK) +calculate_dominance_info (CDI_DOMINATORS); This is a premature (unwanted) change. Other than that the only O(function-size) part left is the zeroing in + /* Determine max basic block index in region. */ + int max_index = region[0]->index; + for (size_t i = 1; i < num; i++) +if (region[i]->index > max_index) + max_index = region[i]->index; + max_index += 1; + m_dfs_order = new_zero_array (max_index + 1); + m_dfs_last = _dfs_order[max_index]; I suppose we can try addressing this as followup. Thanks a lot for doing this. Richard. > 2016-10-11 16:48 GMT+03:00 Richard Biener : >> On Tue, Oct 11, 2016 at 3:23 PM, Yuri Rumyantsev wrote: >>> Richard, >>> >>> I implemented this by passing callback function in_region which >>> returns true if block belongs to region. >>> I am testing it now >>> >>> I attach modified patch for your quick review. >> >> + FOR_EACH_VEC_ELT (region, i, bb) >> +{ >> + bb->dom[dir_index] = et_new_tree (bb); >> +} >> + /* Check that region is SESE region. */ >> + entry = region[0]; >> + if ( EDGE_COUNT (entry->succs) > 1) >> +{ >> + /* Create new entry block with the only successor. */ >> + basic_block succ = NULL; >> + bool found = false; >> + FOR_EACH_EDGE (e, ei, entry->succs) >> + if (in_region (region, e->dest)) >> + { >> + gcc_assert (!found); >> >> is that check equal to e->dest->dom[dir_index] != NULL? I think we >> shouldn't need the callback. >> >> +new_entry = create_empty_bb (entry); >> +unchecked_make_edge (new_entry, succ, 0); >> >> I still think this is somewhat gross and we should try to avoid it >> somehow - at least it's well-hidden now ;) >> >> +/* Put it to region as entry node. */ >> +region[0] = new_entry; >> >> so region[0] is overwritten now? >> >> As far as I understand calc_dfs_tree it should be possible to compute >> the region on-the-fly >> from calling calc_dfs_tree_nonrec on the entry/exit (also maybe >> avoiding some of the still >> "large" complexity of zeroing arrays in the constructor). >> >> And if we use that DFS walk directly we should be able to avoid >> creating those fake entry/exit >> blocks by using entry/exit edges instead... (somehow). >> >> Richard. >> >> >> >>> Thanks. >>> >>> 2016-10-11 13:33 GMT+03:00 Richard Biener : On Mon, Oct 10, 2016 at 4:17 PM, Yuri
Re: Compile-time improvement for if conversion.
Richard, Here is updated patch . I avoided creation of new entry/exit blocks but instead add check to border cases - do not consider also blocks which are out of region. Any comments will be appreciated. 2016-10-11 16:48 GMT+03:00 Richard Biener: > On Tue, Oct 11, 2016 at 3:23 PM, Yuri Rumyantsev wrote: >> Richard, >> >> I implemented this by passing callback function in_region which >> returns true if block belongs to region. >> I am testing it now >> >> I attach modified patch for your quick review. > > + FOR_EACH_VEC_ELT (region, i, bb) > +{ > + bb->dom[dir_index] = et_new_tree (bb); > +} > + /* Check that region is SESE region. */ > + entry = region[0]; > + if ( EDGE_COUNT (entry->succs) > 1) > +{ > + /* Create new entry block with the only successor. */ > + basic_block succ = NULL; > + bool found = false; > + FOR_EACH_EDGE (e, ei, entry->succs) > + if (in_region (region, e->dest)) > + { > + gcc_assert (!found); > > is that check equal to e->dest->dom[dir_index] != NULL? I think we > shouldn't need the callback. > > +new_entry = create_empty_bb (entry); > +unchecked_make_edge (new_entry, succ, 0); > > I still think this is somewhat gross and we should try to avoid it > somehow - at least it's well-hidden now ;) > > +/* Put it to region as entry node. */ > +region[0] = new_entry; > > so region[0] is overwritten now? > > As far as I understand calc_dfs_tree it should be possible to compute > the region on-the-fly > from calling calc_dfs_tree_nonrec on the entry/exit (also maybe > avoiding some of the still > "large" complexity of zeroing arrays in the constructor). > > And if we use that DFS walk directly we should be able to avoid > creating those fake entry/exit > blocks by using entry/exit edges instead... (somehow). > > Richard. > > > >> Thanks. >> >> 2016-10-11 13:33 GMT+03:00 Richard Biener : >>> On Mon, Oct 10, 2016 at 4:17 PM, Yuri Rumyantsev wrote: Richard, If "fake" exit or entry block is created in dominance how we can determine what is its the only predecessor or successor without using a notion of loop? >>> >>> The caller passes in an entry and exit edge instead of a block or loop. >>> >>> Richard. >>> 2016-10-10 15:00 GMT+03:00 Richard Biener : > On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsev > wrote: >> Thanks Richard for your comments. >> I'd like to answer on your last comment regarding use split_edge() >> instead of creating fake post-header. I started with this splitting >> but it requires to fix-up closed ssa form by creating additional phi >> nodes, so I decided to use only cfg change without updating ssa form. >> Other changes look reasonable and will fix them. > > Ah. In this case can you investigate what it takes to make the entry/exit > edges rather than BBs? That is, introduce those "fakes" only internally > in dominance.c? > >> 2016-10-10 12:52 GMT+03:00 Richard Biener : >>> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev >>> wrote: Hi All, Here is implementation of Richard proposal: < For general infrastructure it would be nice to expose a (post-)dominator < compute for MESE (post-dominators) / SEME (dominators) regions. I believe < what makes if-conversion expensive is the post-dom compute which happens < for each loop for the whole function. It shouldn't be very difficult < to write this, < sharing as much as possible code with the current DOM code might need < quite some refactoring though. I implemented this proposal by adding calculation of dominance info for SESE regions and incorporate this change to if conversion pass. SESE region is built by adding loop pre-header and possibly fake post-header blocks to loop body. Fake post-header is deleted after predication completion. Bootstrapping and regression testing did not show any new failures. Is it OK for trunk? >>> >>> It's mostly reasonable but I have a few comments. First, re-using >>> bb->dom[] for the dominator info is somewhat fragile but indeed >>> a requirement to make the patch reasonably small. Please, >>> in calculate_dominance_info_for_region, make sure that >>> !dom_info_available_p (dir). >>> >>> You pass loop * everywhere but require ->aux to be set up as >>> an array of BBs forming the region with special BBs at array ends. >>> >>> Please instead pass in a vec which avoids using ->aux >>> and also allows other non-loop-based SESE regions to be used >>> (I
Re: Compile-time improvement for if conversion.
On Tue, Oct 11, 2016 at 3:23 PM, Yuri Rumyantsevwrote: > Richard, > > I implemented this by passing callback function in_region which > returns true if block belongs to region. > I am testing it now > > I attach modified patch for your quick review. + FOR_EACH_VEC_ELT (region, i, bb) +{ + bb->dom[dir_index] = et_new_tree (bb); +} + /* Check that region is SESE region. */ + entry = region[0]; + if ( EDGE_COUNT (entry->succs) > 1) +{ + /* Create new entry block with the only successor. */ + basic_block succ = NULL; + bool found = false; + FOR_EACH_EDGE (e, ei, entry->succs) + if (in_region (region, e->dest)) + { + gcc_assert (!found); is that check equal to e->dest->dom[dir_index] != NULL? I think we shouldn't need the callback. +new_entry = create_empty_bb (entry); +unchecked_make_edge (new_entry, succ, 0); I still think this is somewhat gross and we should try to avoid it somehow - at least it's well-hidden now ;) +/* Put it to region as entry node. */ +region[0] = new_entry; so region[0] is overwritten now? As far as I understand calc_dfs_tree it should be possible to compute the region on-the-fly from calling calc_dfs_tree_nonrec on the entry/exit (also maybe avoiding some of the still "large" complexity of zeroing arrays in the constructor). And if we use that DFS walk directly we should be able to avoid creating those fake entry/exit blocks by using entry/exit edges instead... (somehow). Richard. > Thanks. > > 2016-10-11 13:33 GMT+03:00 Richard Biener : >> On Mon, Oct 10, 2016 at 4:17 PM, Yuri Rumyantsev wrote: >>> Richard, >>> >>> If "fake" exit or entry block is created in dominance how we can >>> determine what is its the only predecessor or successor without using >>> a notion of loop? >> >> The caller passes in an entry and exit edge instead of a block or loop. >> >> Richard. >> >>> 2016-10-10 15:00 GMT+03:00 Richard Biener : On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsev wrote: > Thanks Richard for your comments. > I'd like to answer on your last comment regarding use split_edge() > instead of creating fake post-header. I started with this splitting > but it requires to fix-up closed ssa form by creating additional phi > nodes, so I decided to use only cfg change without updating ssa form. > Other changes look reasonable and will fix them. Ah. In this case can you investigate what it takes to make the entry/exit edges rather than BBs? That is, introduce those "fakes" only internally in dominance.c? > 2016-10-10 12:52 GMT+03:00 Richard Biener : >> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev >> wrote: >>> Hi All, >>> >>> Here is implementation of Richard proposal: >>> >>> < For general infrastructure it would be nice to expose a >>> (post-)dominator >>> < compute for MESE (post-dominators) / SEME (dominators) regions. I >>> believe >>> < what makes if-conversion expensive is the post-dom compute which >>> happens >>> < for each loop for the whole function. It shouldn't be very difficult >>> < to write this, >>> < sharing as much as possible code with the current DOM code might need >>> < quite some refactoring though. >>> >>> I implemented this proposal by adding calculation of dominance info >>> for SESE regions and incorporate this change to if conversion pass. >>> SESE region is built by adding loop pre-header and possibly fake >>> post-header blocks to loop body. Fake post-header is deleted after >>> predication completion. >>> >>> Bootstrapping and regression testing did not show any new failures. >>> >>> Is it OK for trunk? >> >> It's mostly reasonable but I have a few comments. First, re-using >> bb->dom[] for the dominator info is somewhat fragile but indeed >> a requirement to make the patch reasonably small. Please, >> in calculate_dominance_info_for_region, make sure that >> !dom_info_available_p (dir). >> >> You pass loop * everywhere but require ->aux to be set up as >> an array of BBs forming the region with special BBs at array ends. >> >> Please instead pass in a vec which avoids using ->aux >> and also allows other non-loop-based SESE regions to be used >> (I couldn't spot anything that relies on this being a loop). >> >> Adding a convenience wrapper for loop * would be of course nice, >> to cover the special pre/post-header code in tree-if-conv.c. >> >> In theory a SESE region is fully specified by its entry end exit _edge_, >> so you might want to see if it's possible to use such a pair of edges >> to guard the dfs/idom walks to avoid the need to create
Re: Compile-time improvement for if conversion.
Richard, I implemented this by passing callback function in_region which returns true if block belongs to region. I am testing it now I attach modified patch for your quick review. Thanks. 2016-10-11 13:33 GMT+03:00 Richard Biener: > On Mon, Oct 10, 2016 at 4:17 PM, Yuri Rumyantsev wrote: >> Richard, >> >> If "fake" exit or entry block is created in dominance how we can >> determine what is its the only predecessor or successor without using >> a notion of loop? > > The caller passes in an entry and exit edge instead of a block or loop. > > Richard. > >> 2016-10-10 15:00 GMT+03:00 Richard Biener : >>> On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsev wrote: Thanks Richard for your comments. I'd like to answer on your last comment regarding use split_edge() instead of creating fake post-header. I started with this splitting but it requires to fix-up closed ssa form by creating additional phi nodes, so I decided to use only cfg change without updating ssa form. Other changes look reasonable and will fix them. >>> >>> Ah. In this case can you investigate what it takes to make the entry/exit >>> edges rather than BBs? That is, introduce those "fakes" only internally >>> in dominance.c? >>> 2016-10-10 12:52 GMT+03:00 Richard Biener : > On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev > wrote: >> Hi All, >> >> Here is implementation of Richard proposal: >> >> < For general infrastructure it would be nice to expose a >> (post-)dominator >> < compute for MESE (post-dominators) / SEME (dominators) regions. I >> believe >> < what makes if-conversion expensive is the post-dom compute which >> happens >> < for each loop for the whole function. It shouldn't be very difficult >> < to write this, >> < sharing as much as possible code with the current DOM code might need >> < quite some refactoring though. >> >> I implemented this proposal by adding calculation of dominance info >> for SESE regions and incorporate this change to if conversion pass. >> SESE region is built by adding loop pre-header and possibly fake >> post-header blocks to loop body. Fake post-header is deleted after >> predication completion. >> >> Bootstrapping and regression testing did not show any new failures. >> >> Is it OK for trunk? > > It's mostly reasonable but I have a few comments. First, re-using > bb->dom[] for the dominator info is somewhat fragile but indeed > a requirement to make the patch reasonably small. Please, > in calculate_dominance_info_for_region, make sure that > !dom_info_available_p (dir). > > You pass loop * everywhere but require ->aux to be set up as > an array of BBs forming the region with special BBs at array ends. > > Please instead pass in a vec which avoids using ->aux > and also allows other non-loop-based SESE regions to be used > (I couldn't spot anything that relies on this being a loop). > > Adding a convenience wrapper for loop * would be of course nice, > to cover the special pre/post-header code in tree-if-conv.c. > > In theory a SESE region is fully specified by its entry end exit _edge_, > so you might want to see if it's possible to use such a pair of edges > to guard the dfs/idom walks to avoid the need to create fake blocks. > > Btw, instead of using create_empty_bb, unchecked_make_edge, etc. > please use split_edge() of the entry/exit edges. > > Richard. > >> ChangeLog: >> 2016-10-05 Yuri Rumyantsev >> >> * dominance.c : Include cfgloop.h for loop recognition. >> (dom_info): Add new functions and add boolean argument to recognize >> computation for loop region. >> (dom_info::dom_info): New function. >> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not >> handle unvisited blocks. >> (dom_info::calc_idoms): Likewise. >> (compute_dom_fast_query_in_region): New function. >> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with >> false argument. >> (calculate_dominance_info_for_region): New function. >> (free_dominance_info_for_region): Likewise. >> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false >> argument. >> * dominance.h: Add prototype for introduced functions >> calculate_dominance_info_for_region and >> free_dominance_info_for_region. >> tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader. >> (build_sese_region): New function. >> (if_convertible_loop_p_1): Invoke local version of post-dominators >> calculation, free it after basic block predication and delete created >> fake post-header block if any. >>
Re: Compile-time improvement for if conversion.
On Mon, Oct 10, 2016 at 4:17 PM, Yuri Rumyantsevwrote: > Richard, > > If "fake" exit or entry block is created in dominance how we can > determine what is its the only predecessor or successor without using > a notion of loop? The caller passes in an entry and exit edge instead of a block or loop. Richard. > 2016-10-10 15:00 GMT+03:00 Richard Biener : >> On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsev wrote: >>> Thanks Richard for your comments. >>> I'd like to answer on your last comment regarding use split_edge() >>> instead of creating fake post-header. I started with this splitting >>> but it requires to fix-up closed ssa form by creating additional phi >>> nodes, so I decided to use only cfg change without updating ssa form. >>> Other changes look reasonable and will fix them. >> >> Ah. In this case can you investigate what it takes to make the entry/exit >> edges rather than BBs? That is, introduce those "fakes" only internally >> in dominance.c? >> >>> 2016-10-10 12:52 GMT+03:00 Richard Biener : On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev wrote: > Hi All, > > Here is implementation of Richard proposal: > > < For general infrastructure it would be nice to expose a (post-)dominator > < compute for MESE (post-dominators) / SEME (dominators) regions. I > believe > < what makes if-conversion expensive is the post-dom compute which happens > < for each loop for the whole function. It shouldn't be very difficult > < to write this, > < sharing as much as possible code with the current DOM code might need > < quite some refactoring though. > > I implemented this proposal by adding calculation of dominance info > for SESE regions and incorporate this change to if conversion pass. > SESE region is built by adding loop pre-header and possibly fake > post-header blocks to loop body. Fake post-header is deleted after > predication completion. > > Bootstrapping and regression testing did not show any new failures. > > Is it OK for trunk? It's mostly reasonable but I have a few comments. First, re-using bb->dom[] for the dominator info is somewhat fragile but indeed a requirement to make the patch reasonably small. Please, in calculate_dominance_info_for_region, make sure that !dom_info_available_p (dir). You pass loop * everywhere but require ->aux to be set up as an array of BBs forming the region with special BBs at array ends. Please instead pass in a vec which avoids using ->aux and also allows other non-loop-based SESE regions to be used (I couldn't spot anything that relies on this being a loop). Adding a convenience wrapper for loop * would be of course nice, to cover the special pre/post-header code in tree-if-conv.c. In theory a SESE region is fully specified by its entry end exit _edge_, so you might want to see if it's possible to use such a pair of edges to guard the dfs/idom walks to avoid the need to create fake blocks. Btw, instead of using create_empty_bb, unchecked_make_edge, etc. please use split_edge() of the entry/exit edges. Richard. > ChangeLog: > 2016-10-05 Yuri Rumyantsev > > * dominance.c : Include cfgloop.h for loop recognition. > (dom_info): Add new functions and add boolean argument to recognize > computation for loop region. > (dom_info::dom_info): New function. > (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not > handle unvisited blocks. > (dom_info::calc_idoms): Likewise. > (compute_dom_fast_query_in_region): New function. > (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with > false argument. > (calculate_dominance_info_for_region): New function. > (free_dominance_info_for_region): Likewise. > (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false > argument. > * dominance.h: Add prototype for introduced functions > calculate_dominance_info_for_region and > free_dominance_info_for_region. > tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader. > (build_sese_region): New function. > (if_convertible_loop_p_1): Invoke local version of post-dominators > calculation, free it after basic block predication and delete created > fake post-header block if any. > (tree_if_conversion): Delete call of free_dominance_info for > post-dominators, free ifc_sese_bbs which represents SESE region. > (pass_if_conversion::execute): Delete detection of infinite loops > and fake edges to exit block since post-dominator calculation is > performed per if-converted loop only.
Re: Compile-time improvement for if conversion.
Richard, If "fake" exit or entry block is created in dominance how we can determine what is its the only predecessor or successor without using a notion of loop? 2016-10-10 15:00 GMT+03:00 Richard Biener: > On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsev wrote: >> Thanks Richard for your comments. >> I'd like to answer on your last comment regarding use split_edge() >> instead of creating fake post-header. I started with this splitting >> but it requires to fix-up closed ssa form by creating additional phi >> nodes, so I decided to use only cfg change without updating ssa form. >> Other changes look reasonable and will fix them. > > Ah. In this case can you investigate what it takes to make the entry/exit > edges rather than BBs? That is, introduce those "fakes" only internally > in dominance.c? > >> 2016-10-10 12:52 GMT+03:00 Richard Biener : >>> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev wrote: Hi All, Here is implementation of Richard proposal: < For general infrastructure it would be nice to expose a (post-)dominator < compute for MESE (post-dominators) / SEME (dominators) regions. I believe < what makes if-conversion expensive is the post-dom compute which happens < for each loop for the whole function. It shouldn't be very difficult < to write this, < sharing as much as possible code with the current DOM code might need < quite some refactoring though. I implemented this proposal by adding calculation of dominance info for SESE regions and incorporate this change to if conversion pass. SESE region is built by adding loop pre-header and possibly fake post-header blocks to loop body. Fake post-header is deleted after predication completion. Bootstrapping and regression testing did not show any new failures. Is it OK for trunk? >>> >>> It's mostly reasonable but I have a few comments. First, re-using >>> bb->dom[] for the dominator info is somewhat fragile but indeed >>> a requirement to make the patch reasonably small. Please, >>> in calculate_dominance_info_for_region, make sure that >>> !dom_info_available_p (dir). >>> >>> You pass loop * everywhere but require ->aux to be set up as >>> an array of BBs forming the region with special BBs at array ends. >>> >>> Please instead pass in a vec which avoids using ->aux >>> and also allows other non-loop-based SESE regions to be used >>> (I couldn't spot anything that relies on this being a loop). >>> >>> Adding a convenience wrapper for loop * would be of course nice, >>> to cover the special pre/post-header code in tree-if-conv.c. >>> >>> In theory a SESE region is fully specified by its entry end exit _edge_, >>> so you might want to see if it's possible to use such a pair of edges >>> to guard the dfs/idom walks to avoid the need to create fake blocks. >>> >>> Btw, instead of using create_empty_bb, unchecked_make_edge, etc. >>> please use split_edge() of the entry/exit edges. >>> >>> Richard. >>> ChangeLog: 2016-10-05 Yuri Rumyantsev * dominance.c : Include cfgloop.h for loop recognition. (dom_info): Add new functions and add boolean argument to recognize computation for loop region. (dom_info::dom_info): New function. (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not handle unvisited blocks. (dom_info::calc_idoms): Likewise. (compute_dom_fast_query_in_region): New function. (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with false argument. (calculate_dominance_info_for_region): New function. (free_dominance_info_for_region): Likewise. (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false argument. * dominance.h: Add prototype for introduced functions calculate_dominance_info_for_region and free_dominance_info_for_region. tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader. (build_sese_region): New function. (if_convertible_loop_p_1): Invoke local version of post-dominators calculation, free it after basic block predication and delete created fake post-header block if any. (tree_if_conversion): Delete call of free_dominance_info for post-dominators, free ifc_sese_bbs which represents SESE region. (pass_if_conversion::execute): Delete detection of infinite loops and fake edges to exit block since post-dominator calculation is performed per if-converted loop only.
Re: Compile-time improvement for if conversion.
On Mon, Oct 10, 2016 at 1:42 PM, Yuri Rumyantsevwrote: > Thanks Richard for your comments. > I'd like to answer on your last comment regarding use split_edge() > instead of creating fake post-header. I started with this splitting > but it requires to fix-up closed ssa form by creating additional phi > nodes, so I decided to use only cfg change without updating ssa form. > Other changes look reasonable and will fix them. Ah. In this case can you investigate what it takes to make the entry/exit edges rather than BBs? That is, introduce those "fakes" only internally in dominance.c? > 2016-10-10 12:52 GMT+03:00 Richard Biener : >> On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev wrote: >>> Hi All, >>> >>> Here is implementation of Richard proposal: >>> >>> < For general infrastructure it would be nice to expose a (post-)dominator >>> < compute for MESE (post-dominators) / SEME (dominators) regions. I believe >>> < what makes if-conversion expensive is the post-dom compute which happens >>> < for each loop for the whole function. It shouldn't be very difficult >>> < to write this, >>> < sharing as much as possible code with the current DOM code might need >>> < quite some refactoring though. >>> >>> I implemented this proposal by adding calculation of dominance info >>> for SESE regions and incorporate this change to if conversion pass. >>> SESE region is built by adding loop pre-header and possibly fake >>> post-header blocks to loop body. Fake post-header is deleted after >>> predication completion. >>> >>> Bootstrapping and regression testing did not show any new failures. >>> >>> Is it OK for trunk? >> >> It's mostly reasonable but I have a few comments. First, re-using >> bb->dom[] for the dominator info is somewhat fragile but indeed >> a requirement to make the patch reasonably small. Please, >> in calculate_dominance_info_for_region, make sure that >> !dom_info_available_p (dir). >> >> You pass loop * everywhere but require ->aux to be set up as >> an array of BBs forming the region with special BBs at array ends. >> >> Please instead pass in a vec which avoids using ->aux >> and also allows other non-loop-based SESE regions to be used >> (I couldn't spot anything that relies on this being a loop). >> >> Adding a convenience wrapper for loop * would be of course nice, >> to cover the special pre/post-header code in tree-if-conv.c. >> >> In theory a SESE region is fully specified by its entry end exit _edge_, >> so you might want to see if it's possible to use such a pair of edges >> to guard the dfs/idom walks to avoid the need to create fake blocks. >> >> Btw, instead of using create_empty_bb, unchecked_make_edge, etc. >> please use split_edge() of the entry/exit edges. >> >> Richard. >> >>> ChangeLog: >>> 2016-10-05 Yuri Rumyantsev >>> >>> * dominance.c : Include cfgloop.h for loop recognition. >>> (dom_info): Add new functions and add boolean argument to recognize >>> computation for loop region. >>> (dom_info::dom_info): New function. >>> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not >>> handle unvisited blocks. >>> (dom_info::calc_idoms): Likewise. >>> (compute_dom_fast_query_in_region): New function. >>> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with >>> false argument. >>> (calculate_dominance_info_for_region): New function. >>> (free_dominance_info_for_region): Likewise. >>> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false >>> argument. >>> * dominance.h: Add prototype for introduced functions >>> calculate_dominance_info_for_region and >>> free_dominance_info_for_region. >>> tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader. >>> (build_sese_region): New function. >>> (if_convertible_loop_p_1): Invoke local version of post-dominators >>> calculation, free it after basic block predication and delete created >>> fake post-header block if any. >>> (tree_if_conversion): Delete call of free_dominance_info for >>> post-dominators, free ifc_sese_bbs which represents SESE region. >>> (pass_if_conversion::execute): Delete detection of infinite loops >>> and fake edges to exit block since post-dominator calculation is >>> performed per if-converted loop only.
Re: Compile-time improvement for if conversion.
Thanks Richard for your comments. I'd like to answer on your last comment regarding use split_edge() instead of creating fake post-header. I started with this splitting but it requires to fix-up closed ssa form by creating additional phi nodes, so I decided to use only cfg change without updating ssa form. Other changes look reasonable and will fix them. 2016-10-10 12:52 GMT+03:00 Richard Biener: > On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsev wrote: >> Hi All, >> >> Here is implementation of Richard proposal: >> >> < For general infrastructure it would be nice to expose a (post-)dominator >> < compute for MESE (post-dominators) / SEME (dominators) regions. I believe >> < what makes if-conversion expensive is the post-dom compute which happens >> < for each loop for the whole function. It shouldn't be very difficult >> < to write this, >> < sharing as much as possible code with the current DOM code might need >> < quite some refactoring though. >> >> I implemented this proposal by adding calculation of dominance info >> for SESE regions and incorporate this change to if conversion pass. >> SESE region is built by adding loop pre-header and possibly fake >> post-header blocks to loop body. Fake post-header is deleted after >> predication completion. >> >> Bootstrapping and regression testing did not show any new failures. >> >> Is it OK for trunk? > > It's mostly reasonable but I have a few comments. First, re-using > bb->dom[] for the dominator info is somewhat fragile but indeed > a requirement to make the patch reasonably small. Please, > in calculate_dominance_info_for_region, make sure that > !dom_info_available_p (dir). > > You pass loop * everywhere but require ->aux to be set up as > an array of BBs forming the region with special BBs at array ends. > > Please instead pass in a vec which avoids using ->aux > and also allows other non-loop-based SESE regions to be used > (I couldn't spot anything that relies on this being a loop). > > Adding a convenience wrapper for loop * would be of course nice, > to cover the special pre/post-header code in tree-if-conv.c. > > In theory a SESE region is fully specified by its entry end exit _edge_, > so you might want to see if it's possible to use such a pair of edges > to guard the dfs/idom walks to avoid the need to create fake blocks. > > Btw, instead of using create_empty_bb, unchecked_make_edge, etc. > please use split_edge() of the entry/exit edges. > > Richard. > >> ChangeLog: >> 2016-10-05 Yuri Rumyantsev >> >> * dominance.c : Include cfgloop.h for loop recognition. >> (dom_info): Add new functions and add boolean argument to recognize >> computation for loop region. >> (dom_info::dom_info): New function. >> (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not >> handle unvisited blocks. >> (dom_info::calc_idoms): Likewise. >> (compute_dom_fast_query_in_region): New function. >> (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with >> false argument. >> (calculate_dominance_info_for_region): New function. >> (free_dominance_info_for_region): Likewise. >> (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false >> argument. >> * dominance.h: Add prototype for introduced functions >> calculate_dominance_info_for_region and >> free_dominance_info_for_region. >> tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader. >> (build_sese_region): New function. >> (if_convertible_loop_p_1): Invoke local version of post-dominators >> calculation, free it after basic block predication and delete created >> fake post-header block if any. >> (tree_if_conversion): Delete call of free_dominance_info for >> post-dominators, free ifc_sese_bbs which represents SESE region. >> (pass_if_conversion::execute): Delete detection of infinite loops >> and fake edges to exit block since post-dominator calculation is >> performed per if-converted loop only.
Re: Compile-time improvement for if conversion.
On Wed, Oct 5, 2016 at 3:22 PM, Yuri Rumyantsevwrote: > Hi All, > > Here is implementation of Richard proposal: > > < For general infrastructure it would be nice to expose a (post-)dominator > < compute for MESE (post-dominators) / SEME (dominators) regions. I believe > < what makes if-conversion expensive is the post-dom compute which happens > < for each loop for the whole function. It shouldn't be very difficult > < to write this, > < sharing as much as possible code with the current DOM code might need > < quite some refactoring though. > > I implemented this proposal by adding calculation of dominance info > for SESE regions and incorporate this change to if conversion pass. > SESE region is built by adding loop pre-header and possibly fake > post-header blocks to loop body. Fake post-header is deleted after > predication completion. > > Bootstrapping and regression testing did not show any new failures. > > Is it OK for trunk? It's mostly reasonable but I have a few comments. First, re-using bb->dom[] for the dominator info is somewhat fragile but indeed a requirement to make the patch reasonably small. Please, in calculate_dominance_info_for_region, make sure that !dom_info_available_p (dir). You pass loop * everywhere but require ->aux to be set up as an array of BBs forming the region with special BBs at array ends. Please instead pass in a vec which avoids using ->aux and also allows other non-loop-based SESE regions to be used (I couldn't spot anything that relies on this being a loop). Adding a convenience wrapper for loop * would be of course nice, to cover the special pre/post-header code in tree-if-conv.c. In theory a SESE region is fully specified by its entry end exit _edge_, so you might want to see if it's possible to use such a pair of edges to guard the dfs/idom walks to avoid the need to create fake blocks. Btw, instead of using create_empty_bb, unchecked_make_edge, etc. please use split_edge() of the entry/exit edges. Richard. > ChangeLog: > 2016-10-05 Yuri Rumyantsev > > * dominance.c : Include cfgloop.h for loop recognition. > (dom_info): Add new functions and add boolean argument to recognize > computation for loop region. > (dom_info::dom_info): New function. > (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not > handle unvisited blocks. > (dom_info::calc_idoms): Likewise. > (compute_dom_fast_query_in_region): New function. > (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with > false argument. > (calculate_dominance_info_for_region): New function. > (free_dominance_info_for_region): Likewise. > (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false > argument. > * dominance.h: Add prototype for introduced functions > calculate_dominance_info_for_region and > free_dominance_info_for_region. > tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader. > (build_sese_region): New function. > (if_convertible_loop_p_1): Invoke local version of post-dominators > calculation, free it after basic block predication and delete created > fake post-header block if any. > (tree_if_conversion): Delete call of free_dominance_info for > post-dominators, free ifc_sese_bbs which represents SESE region. > (pass_if_conversion::execute): Delete detection of infinite loops > and fake edges to exit block since post-dominator calculation is > performed per if-converted loop only.
Compile-time improvement for if conversion.
Hi All, Here is implementation of Richard proposal: < For general infrastructure it would be nice to expose a (post-)dominator < compute for MESE (post-dominators) / SEME (dominators) regions. I believe < what makes if-conversion expensive is the post-dom compute which happens < for each loop for the whole function. It shouldn't be very difficult < to write this, < sharing as much as possible code with the current DOM code might need < quite some refactoring though. I implemented this proposal by adding calculation of dominance info for SESE regions and incorporate this change to if conversion pass. SESE region is built by adding loop pre-header and possibly fake post-header blocks to loop body. Fake post-header is deleted after predication completion. Bootstrapping and regression testing did not show any new failures. Is it OK for trunk? ChangeLog: 2016-10-05 Yuri Rumyantsev* dominance.c : Include cfgloop.h for loop recognition. (dom_info): Add new functions and add boolean argument to recognize computation for loop region. (dom_info::dom_info): New function. (dom_info::calc_dfs_tree): Add boolean argument IN_REGION to not handle unvisited blocks. (dom_info::calc_idoms): Likewise. (compute_dom_fast_query_in_region): New function. (calculate_dominance_info): Invoke calc_dfs_tree and calc_idoms with false argument. (calculate_dominance_info_for_region): New function. (free_dominance_info_for_region): Likewise. (verify_dominators): Invoke calc_dfs_tree and calc_idoms with false argument. * dominance.h: Add prototype for introduced functions calculate_dominance_info_for_region and free_dominance_info_for_region. tree-if-conv.c: Add to local variables ifc_sese_bbs & fake_postheader. (build_sese_region): New function. (if_convertible_loop_p_1): Invoke local version of post-dominators calculation, free it after basic block predication and delete created fake post-header block if any. (tree_if_conversion): Delete call of free_dominance_info for post-dominators, free ifc_sese_bbs which represents SESE region. (pass_if_conversion::execute): Delete detection of infinite loops and fake edges to exit block since post-dominator calculation is performed per if-converted loop only. patch Description: Binary data