Richard, I saw the sources of these functions, but I can't understand why I should use something else? Note that all predicate computations are located in basic blocks ( by design of if-conv) and there is special function that put these computations in bb (insert_gimplified_predicates). Edge contains only predicate not its computations. New function - find_insertion_point() does very simple search - it finds out the latest (in current bb) operand def-stmt of predicates taken from all incoming edges. In original algorithm the predicate of non-critical edge is taken to perform phi-node predication since for critical edge it does not work properly.
My question is: does your comments mean that I should re-design my extensions? Thanks. Yuri. BTW Jeff did initial review of my changes related to predicate computation for join blocks. I presented him updated patch with test-case and some minor changes in patch. But still did not get any feedback on it. Could you please take a look also on it? 2014-10-21 17:38 GMT+04:00 Richard Biener <richard.guent...@gmail.com>: > On Tue, Oct 21, 2014 at 3:20 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: >> Richard, >> >> Yes, This patch does not make sense since phi node predication for bb >> with critical incoming edges only performs another function which is >> absent (predicate_extended_scalar_phi). >> >> BTW I see that commit_edge_insertions() is used for rtx instructions >> only but you propose to use it for tree also. >> Did I miss something? > > Ah, it's gsi_commit_edge_inserts () (or gsi_commit_one_edge_insert > if you want easy access to the newly created basic block to push > the predicate to - see gsi_commit_edge_inserts implementation). > > Richard. > >> Thanks ahead. >> >> >> 2014-10-21 16:44 GMT+04:00 Richard Biener <richard.guent...@gmail.com>: >>> On Tue, Oct 21, 2014 at 2:25 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote: >>>> Richard, >>>> >>>> I did some changes in patch and ChangeLog to mark that support for >>>> if-convert of blocks with only critical incoming edges will be added >>>> in the future (more precise in patch.4). >>> >>> But the same reasoning applies to this version of the patch when >>> flag_force_vectorize is true!? (insertion point and invalid SSA form) >>> >>> Which means the patch doesn't make sense in isolation? >>> >>> Btw, I think for the case you should simply do gsi_insert_on_edge () >>> and commit_edge_insertions () before the call to combine_blocks >>> (pushing the edge predicate to the newly created block). >>> >>> Richard. >>> >>>> Could you please review it. >>>> >>>> Thanks. >>>> >>>> ChangeLog: >>>> >>>> 2014-10-21 Yuri Rumyantsev <ysrum...@gmail.com> >>>> >>>> (flag_force_vectorize): New variable. >>>> (edge_predicate): New function. >>>> (set_edge_predicate): New function. >>>> (add_to_dst_predicate_list): Conditionally invoke add_to_predicate_list >>>> if destination block of edge is not always executed. Set-up predicate >>>> for critical edge. >>>> (if_convertible_phi_p): Accept phi nodes with more than two args >>>> if FLAG_FORCE_VECTORIZE was set-up. >>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE. >>>> (if_convertible_stmt_p): Fix up pre-function comments. >>>> (all_preds_critical_p): New function. >>>> (if_convertible_bb_p): Use call of all_preds_critical_p >>>> to reject temporarily block if-conversion with incoming critical edges >>>> if FLAG_FORCE_VECTORIZE was not set-up. This restriction will be deleted >>>> after adding support for extended predication. >>>> (predicate_bbs): Skip loop exit block also.Invoke build2_loc >>>> to compute predicate instead of fold_build2_loc. >>>> Add zeroing of edge 'aux' field. >>>> (find_phi_replacement_condition): Extend function interface: >>>> it returns NULL if given phi node must be handled by means of >>>> extended phi node predication. If number of predecessors of phi-block >>>> is equal 2 and at least one incoming edge is not critical original >>>> algorithm is used. >>>> (tree_if_conversion): Temporarily set-up FLAG_FORCE_VECTORIZE to false. >>>> Nullify 'aux' field of edges for blocks with two successors. >>>> >>>> 2014-10-20 17:55 GMT+04:00 Yuri Rumyantsev <ysrum...@gmail.com>: >>>>> Richard, >>>>> >>>>> Thanks for your answer! >>>>> >>>>> In current implementation phi node conversion assume that one of >>>>> incoming edge to bb containing given phi has at least one non-critical >>>>> edge and choose it to insert predicated code. But if we choose >>>>> critical edge we need to determine insert point and insertion >>>>> direction (before/after) since in other case we can get invalid ssa >>>>> form (use before def). This is done by my new function which is not in >>>>> current patch ( I will present this patch later). SO I assume that we >>>>> need to leave this patch as it is to not introduce new bugs. >>>>> >>>>> Thanks. >>>>> Yuri. >>>>> >>>>> 2014-10-20 12:00 GMT+04:00 Richard Biener <richard.guent...@gmail.com>: >>>>>> On Fri, Oct 17, 2014 at 4:09 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>>> wrote: >>>>>>> Richard, >>>>>>> >>>>>>> I reworked the patch as you proposed, but I didn't understand what >>>>>>> did you mean by: >>>>>>> >>>>>>>>So please rework the patch so critical edges are always handled >>>>>>>>correctly. >>>>>>> >>>>>>> In current patch flag_force_vectorize is used (1) to reject phi nodes >>>>>>> with more than 2 arguments; (2) to reject basic blocks with only >>>>>>> critical incoming edges since support for extended predication of phi >>>>>>> nodes will be in next patch. >>>>>> >>>>>> I mean that (2) should not be rejected dependent on flag_force_vectorize. >>>>>> It was rejected because if-cvt couldn't handle it correctly before but >>>>>> with >>>>>> this patch this is fixed. I see no reason to still reject this then even >>>>>> for !flag_force_vectorize. >>>>>> >>>>>> Rejecting PHIs with more than two arguments with flag_force_vectorize >>>>>> is ok. >>>>>> >>>>>> Richard. >>>>>> >>>>>>> Could you please clarify your statement. >>>>>>> >>>>>>> I attached modified patch. >>>>>>> >>>>>>> ChangeLog: >>>>>>> >>>>>>> 2014-10-17 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>> >>>>>>> (flag_force_vectorize): New variable. >>>>>>> (edge_predicate): New function. >>>>>>> (set_edge_predicate): New function. >>>>>>> (add_to_dst_predicate_list): Conditionally invoke add_to_predicate_list >>>>>>> if destination block of edge is not always executed. Set-up predicate >>>>>>> for critical edge. >>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args >>>>>>> if FLAG_FORCE_VECTORIZE was set-up. >>>>>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE. >>>>>>> (if_convertible_stmt_p): Fix up pre-function comments. >>>>>>> (all_edges_are_critical): New function. >>>>>>> (if_convertible_bb_p): Use call of all_preds_critical_p >>>>>>> to reject block if-conversion with incoming critical edges only if >>>>>>> FLAG_FORCE_VECTORIZE was not set-up. >>>>>>> (predicate_bbs): Skip loop exit block also.Invoke build2_loc >>>>>>> to compute predicate instead of fold_build2_loc. >>>>>>> Add zeroing of edge 'aux' field. >>>>>>> (find_phi_replacement_condition): Extend function interface: >>>>>>> it returns NULL if given phi node must be handled by means of >>>>>>> extended phi node predication. If number of predecessors of phi-block >>>>>>> is equal 2 and atleast one incoming edge is not critical original >>>>>>> algorithm is used. >>>>>>> (tree_if_conversion): Temporary set-up FLAG_FORCE_VECTORIZE to false. >>>>>>> Nullify 'aux' field of edges for blocks with two successors. >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> 2014-10-17 13:09 GMT+04:00 Richard Biener <richard.guent...@gmail.com>: >>>>>>>> On Thu, Oct 16, 2014 at 5:42 PM, Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>> wrote: >>>>>>>>> Richard, >>>>>>>>> >>>>>>>>> Here is reduced patch as you requested. All your remarks have been >>>>>>>>> fixed. >>>>>>>>> Could you please look at it ( I have already sent the patch with >>>>>>>>> changes in add_to_predicate_list for review). >>>>>>>> >>>>>>>> + if (dump_file && (dump_flags & TDF_DETAILS)) >>>>>>>> + fprintf (dump_file, "More than two phi node args.\n"); >>>>>>>> + return false; >>>>>>>> + } >>>>>>>> + >>>>>>>> + } >>>>>>>> >>>>>>>> Excess vertical space. >>>>>>>> >>>>>>>> >>>>>>>> +/* Assumes that BB has more than 2 predecessors. >>>>>>>> >>>>>>>> More than 1 predecessor? >>>>>>>> >>>>>>>> + Returns false if at least one successor is not on critical edge >>>>>>>> + and true otherwise. */ >>>>>>>> + >>>>>>>> +static inline bool >>>>>>>> +all_edges_are_critical (basic_block bb) >>>>>>>> +{ >>>>>>>> >>>>>>>> "all_preds_critical_p" would be a better name >>>>>>>> >>>>>>>> + if (EDGE_COUNT (bb->preds) > 2) >>>>>>>> + { >>>>>>>> + if (!flag_force_vectorize) >>>>>>>> + return false; >>>>>>>> + } >>>>>>>> >>>>>>>> as I said in the last review I don't think we should restrict edge >>>>>>>> predicates to flag_force_vectorize. At least I can't see how >>>>>>>> if-conversion is magically more expensive for that case? >>>>>>>> >>>>>>>> So please rework the patch so critical edges are always handled >>>>>>>> correctly. >>>>>>>> >>>>>>>> Ok with that and the above suggested changes. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Richard. >>>>>>>> >>>>>>>> >>>>>>>>> Thanks. >>>>>>>>> Yuri. >>>>>>>>> ChangeLog >>>>>>>>> 2014-10-16 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>> >>>>>>>>> (flag_force_vectorize): New variable. >>>>>>>>> (edge_predicate): New function. >>>>>>>>> (set_edge_predicate): New function. >>>>>>>>> (add_to_dst_predicate_list): Conditionally invoke >>>>>>>>> add_to_predicate_list >>>>>>>>> if destination block of edge is not always executed. Set-up predicate >>>>>>>>> for critical edge. >>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args >>>>>>>>> if FLAG_FORCE_VECTORIZE was set-up. >>>>>>>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE. >>>>>>>>> (if_convertible_stmt_p): Fix up pre-function comments. >>>>>>>>> (all_edges_are_critical): New function. >>>>>>>>> (if_convertible_bb_p): Allow bb has more than two predecessors if >>>>>>>>> FLAG_FORCE_VECTORIZE was set-up. Use call of all_edges_are_critical >>>>>>>>> to reject block if-conversion with incoming critical edges only if >>>>>>>>> FLAG_FORCE_VECTORIZE was not set-up. >>>>>>>>> (predicate_bbs): Skip loop exit block also.Invoke build2_loc >>>>>>>>> to compute predicate instead of fold_build2_loc. >>>>>>>>> Add zeroing of edge 'aux' field. >>>>>>>>> (find_phi_replacement_condition): Extend function interface: >>>>>>>>> it returns NULL if given phi node must be handled by means of >>>>>>>>> extended phi node predication. If number of predecessors of phi-block >>>>>>>>> is equal 2 and atleast one incoming edge is not critical original >>>>>>>>> algorithm is used. >>>>>>>>> (tree_if_conversion): Temporary set-up FLAG_FORCE_VECTORIZE to false. >>>>>>>>> Nullify 'aux' field of edges for blocks with two successors. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> 2014-10-15 13:50 GMT+04:00 Richard Biener >>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>> On Mon, Oct 13, 2014 at 11:38 AM, Yuri Rumyantsev >>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>> Richard, >>>>>>>>>>> >>>>>>>>>>> Here is updated patch (part1) for extended if conversion. >>>>>>>>>>> >>>>>>>>>>> Second part of patch will be sent later. >>>>>>>>>> >>>>>>>>>> Ok, I'm starting to look at this. I'd still like you to split >>>>>>>>>> things up >>>>>>>>>> more. >>>>>>>>>> >>>>>>>>>> static inline void >>>>>>>>>> add_to_predicate_list (struct loop *loop, basic_block bb, tree nc) >>>>>>>>>> { >>>>>>>>>> ... >>>>>>>>>> >>>>>>>>>> + /* We use notion of cd equivalence to get simplier predicate >>>>>>>>>> for >>>>>>>>>> + join block, e.g. if join block has 2 predecessors with >>>>>>>>>> predicates >>>>>>>>>> + p1 & p2 and p1 & !p2, we'd like to get p1 for it instead of >>>>>>>>>> + p1 & p2 | p1 & !p2. */ >>>>>>>>>> + if (dom_bb != loop->header >>>>>>>>>> + && get_immediate_dominator (CDI_POST_DOMINATORS, dom_bb) >>>>>>>>>> == bb) >>>>>>>>>> + { >>>>>>>>>> + gcc_assert (flow_bb_inside_loop_p (loop, dom_bb)); >>>>>>>>>> + bc = bb_predicate (dom_bb); >>>>>>>>>> + gcc_assert (!is_true_predicate (bc)); >>>>>>>>>> >>>>>>>>>> these changes look worthwhile even for !flag_force_vectorize. So >>>>>>>>>> please >>>>>>>>>> split the change to add_to_predicate_list out and compute >>>>>>>>>> post-dominators >>>>>>>>>> unconditionally. Note that you should call free_dominance_info >>>>>>>>>> (CDI_POST_DOMINATORS) at the end of if-conversion. >>>>>>>>>> >>>>>>>>>> + if (!dominated_by_p (CDI_DOMINATORS, loop->latch, e->dest)) >>>>>>>>>> + add_to_predicate_list (loop, e->dest, cond); >>>>>>>>>> + >>>>>>>>>> + /* If edge E is critical save predicate on it. */ >>>>>>>>>> + if (EDGE_COUNT (e->dest->preds) >= 2) >>>>>>>>>> + set_edge_predicate (e, cond); >>>>>>>>>> >>>>>>>>>> how do we know the edge is critical by this simple check? Why not >>>>>>>>>> simply always save edge predicates (well, you kind of do but omit >>>>>>>>>> the case where e->src dominates e->dest). >>>>>>>>>> >>>>>>>>>> Btw, you can rely on edge->aux being NULL at the start of the >>>>>>>>>> pass but need to clear it at the end (best use clear_aux_for_edges () >>>>>>>>>> for that). So stuff like >>>>>>>>>> >>>>>>>>>> + extract_true_false_edges_from_block (bb, &true_edge, >>>>>>>>>> &false_edge); >>>>>>>>>> + if (flag_force_vectorize) >>>>>>>>>> + true_edge->aux = false_edge->aux = NULL; >>>>>>>>>> >>>>>>>>>> shouldn't be necessary. >>>>>>>>>> >>>>>>>>>> I think the edge predicate handling should also be unconditionally >>>>>>>>>> and not depend on flag_force_vectorize. >>>>>>>>>> >>>>>>>>>> + /* The loop latch and loop exit block are always executed and >>>>>>>>>> + have no extra conditions to be processed: skip them. */ >>>>>>>>>> + if (bb == loop->latch >>>>>>>>>> + || bb_with_exit_edge_p (loop, bb)) >>>>>>>>>> >>>>>>>>>> I don't think the edge stuff is true - given you still only reset the >>>>>>>>>> loop->latch bb predicate the change looks broken. >>>>>>>>>> >>>>>>>>>> + /* Fold_build2 can produce bool conversion which is not >>>>>>>>>> + supported by vectorizer, so re-build it without >>>>>>>>>> folding. >>>>>>>>>> + For example, such conversion is generated for sequence: >>>>>>>>>> + _Bool _7, _8, _9; >>>>>>>>>> + _7 = _6 != 13; _8 = _6 != 0; _9 = _8 & _9; >>>>>>>>>> + if (_9 != 0) --> (bool)_9. */ >>>>>>>>>> + >>>>>>>>>> + if (CONVERT_EXPR_P (c) >>>>>>>>>> + && TREE_CODE_CLASS (code) == tcc_comparison) >>>>>>>>>> >>>>>>>>>> I think you should simply use canonicalize_cond_expr_cond on the >>>>>>>>>> folding result. Or rather _not_ fold at all - we are taking the >>>>>>>>>> operands from the GIMPLE condition unmodified after all. >>>>>>>>>> >>>>>>>>>> - add_to_dst_predicate_list (loop, false_edge, >>>>>>>>>> - unshare_expr (cond), c2); >>>>>>>>>> + add_to_dst_predicate_list (loop, false_edge, unshare_expr >>>>>>>>>> (cond), >>>>>>>>>> + unshare_expr (c2)); >>>>>>>>>> >>>>>>>>>> why is it necessary to unshare c2? >>>>>>>>>> >>>>>>>>>> Please split out the PHI-with-multi-arg handling (I have not looked >>>>>>>>>> at >>>>>>>>>> that in detail). >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Richard. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> Changelog. >>>>>>>>>>> >>>>>>>>>>> 2014-10-13 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>>>> >>>>>>>>>>> * tree-if-conv.c (cgraph.h): Add include file to detect function >>>>>>>>>>> clone. >>>>>>>>>>> (flag_force_vectorize): New variable. >>>>>>>>>>> (edge_predicate): New function. >>>>>>>>>>> (set_edge_predicate): New function. >>>>>>>>>>> (add_to_predicate_list): Check unconditionally that bb is always >>>>>>>>>>> executed to early exit. Use predicate of cd-equivalent block >>>>>>>>>>> for join blocks if it exists. >>>>>>>>>>> (add_to_dst_predicate_list): Invoke add_to_predicate_list if >>>>>>>>>>> destination block of edge is not always executed. Set-up predicate >>>>>>>>>>> for critical edge. >>>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args >>>>>>>>>>> if FLAG_FORCE_VECTORIZE was set-up. >>>>>>>>>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE. >>>>>>>>>>> (if_convertible_stmt_p): Fix up pre-function comments. >>>>>>>>>>> (all_edges_are_critical): New function. >>>>>>>>>>> (if_convertible_bb_p): Allow bb has more than two predecessors if >>>>>>>>>>> FLAG_FORCE_VECTORIZE was set-up. Use call of all_edges_are_critical >>>>>>>>>>> to reject block if-conversion with incoming critical edges only if >>>>>>>>>>> FLAG_FORCE_VECTORIZE was not set-up. >>>>>>>>>>> (predicate_bbs): Skip loop exit block also. Add check that if >>>>>>>>>>> fold_build2 produces bool conversion, recompute predicate using >>>>>>>>>>> build2_loc. Add zeroing of edge 'aux' field under >>>>>>>>>>> FLAG_FORCE_VECTORIZE. >>>>>>>>>>> (if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if >>>>>>>>>>> FLAG_FORCE_VECTORIZE was set-up to calculate cd equivalent bb's. >>>>>>>>>>> (find_phi_replacement_condition): Extend function interface: >>>>>>>>>>> it returns NULL if given phi node must be handled by means of >>>>>>>>>>> extended phi node predication. If number of predecessors of >>>>>>>>>>> phi-block >>>>>>>>>>> is equal 2 and atleast one incoming edge is not critical original >>>>>>>>>>> algorithm is used. >>>>>>>>>>> (get_predicate_for_edge): New function. >>>>>>>>>>> (find_insertion_point): New function. >>>>>>>>>>> (predicate_arbitrary_scalar_phi): New function. >>>>>>>>>>> (predicate_all_scalar_phis): Introduce new variable BEFORE. >>>>>>>>>>> Invoke find_insertion_point to initialize gsi and >>>>>>>>>>> predicate_arbitrary_scalar_phi if TRUE_BB is NULL - it signals >>>>>>>>>>> that extended predication must be applied). >>>>>>>>>>> (insert_gimplified_predicates): Add test for non-predicated basic >>>>>>>>>>> blocks that there are no gimplified statements to insert. Insert >>>>>>>>>>> predicates at the block begining for extended if-conversion. >>>>>>>>>>> (tree_if_conversion): Initialize flag_force_vectorize from current >>>>>>>>>>> loop or outer loop (to support pragma omp declare).Do loop >>>>>>>>>>> versioning >>>>>>>>>>> for innermost loop marked with pragma omp simd and >>>>>>>>>>> FLAG_TREE_LOOP_IF_CONVERT was not sett-up. Nullify 'aux' field of >>>>>>>>>>> edges >>>>>>>>>>> for blocks with two successors. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> 2014-09-22 12:28 GMT+04:00 Yuri Rumyantsev <ysrum...@gmail.com>: >>>>>>>>>>>> Richard, >>>>>>>>>>>> >>>>>>>>>>>> here is reduced patch (part.1) which was reduced almost twice. >>>>>>>>>>>> Let's me also answer on your comments. >>>>>>>>>>>> >>>>>>>>>>>> 1. I really use edge field 'aux' to keep predicate for critical >>>>>>>>>>>> edges. >>>>>>>>>>>> My previous code was not correct and now it looks like: >>>>>>>>>>>> >>>>>>>>>>>> if (EDGE_COUNT (b->succs) == 1 || EDGE_COUNT (e->dest->preds) == >>>>>>>>>>>> 1) >>>>>>>>>>>> /* Edge E is not critical, use predicate of edge source bb. */ >>>>>>>>>>>> c = bb_predicate (b); >>>>>>>>>>>> else >>>>>>>>>>>> /* Edge E is critical and its aux field contains predicate. */ >>>>>>>>>>>> c = edge_predicate (e); >>>>>>>>>>>> >>>>>>>>>>>> 2. I completely delete all code related to creation of conditional >>>>>>>>>>>> expressions and completely rely on bool pattern recognition in >>>>>>>>>>>> vectorizer. But we need to delete all dead predicate computations >>>>>>>>>>>> which are not used since they prevent vectorization. I will add >>>>>>>>>>>> this >>>>>>>>>>>> local-dce function in next patch. >>>>>>>>>>>> 3. I also did not include in this patch recognition of general >>>>>>>>>>>> phi-nodes with two arguments only for which conversion of >>>>>>>>>>>> conditional >>>>>>>>>>>> scalar reduction can be applied also. >>>>>>>>>>>> Note that all these changes are applied for loop marked with pragma >>>>>>>>>>>> omp simd only. >>>>>>>>>>>> >>>>>>>>>>>> 2014-09-22 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>>>>> >>>>>>>>>>>> * tree-if-conv.c (cgraph.h): Add include file to detect function >>>>>>>>>>>> clone. >>>>>>>>>>>> (flag_force_vectorize): New variable. >>>>>>>>>>>> (edge_predicate): New function. >>>>>>>>>>>> (set_edge_predicate): New function. >>>>>>>>>>>> (convert_name_to_cmp): New function. >>>>>>>>>>>> (add_to_predicate_list): Check unconditionally that bb is always >>>>>>>>>>>> executed to early exit. Use predicate of cd-equivalent block >>>>>>>>>>>> for join blocks if it exists. >>>>>>>>>>>> (add_to_dst_predicate_list): Invoke add_to_predicate_list if >>>>>>>>>>>> destination block of edge is not always executed. Set-up predicate >>>>>>>>>>>> for critical edge. >>>>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args >>>>>>>>>>>> if FLAG_FORCE_VECTORIZE was set-up. >>>>>>>>>>>> (ifcvt_can_use_mask_load_store): Use FLAG_FORCE_VECTORIZE. >>>>>>>>>>>> (if_convertible_stmt_p): Fix up pre-function comments. >>>>>>>>>>>> (all_edges_are_critical): New function. >>>>>>>>>>>> (if_convertible_bb_p): Allow bb has more than two predecessors if >>>>>>>>>>>> FLAG_FORCE_VECTORIZE was set-up. Use call of all_edges_are_critical >>>>>>>>>>>> to reject block if-conversion with incoming critical edges only if >>>>>>>>>>>> FLAG_FORCE_VECTORIZE was not set-up. >>>>>>>>>>>> (predicate_bbs): Skip loop exit block also. Add check that if >>>>>>>>>>>> fold_build2 produces bool conversion, recompute predicate using >>>>>>>>>>>> build2_loc. Add zeroing of edge 'aux' field under >>>>>>>>>>>> FLAG_FORCE_VECTORIZE. >>>>>>>>>>>> (if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if >>>>>>>>>>>> FLAG_FORCE_VECTORIZE was set-up to calculate cd equivalent bb's. >>>>>>>>>>>> (find_phi_replacement_condition): Extend function interface: >>>>>>>>>>>> it returns NULL if given phi node must be handled by means of >>>>>>>>>>>> extended phi node predication. If number of predecessors of >>>>>>>>>>>> phi-block >>>>>>>>>>>> is equal 2 and atleast one incoming edge is not critical original >>>>>>>>>>>> algorithm is used. >>>>>>>>>>>> (get_predicate_for_edge): New function. >>>>>>>>>>>> (find_insertion_point): New function. >>>>>>>>>>>> (predicate_arbitrary_scalar_phi): New function. >>>>>>>>>>>> (predicate_all_scalar_phis): Introduce new variable BEFORE. >>>>>>>>>>>> Invoke find_insertion_point to initialize gsi and >>>>>>>>>>>> predicate_arbitrary_scalar_phi if TRUE_BB is NULL - it signals >>>>>>>>>>>> that extended predication must be applied). >>>>>>>>>>>> (insert_gimplified_predicates): Add test for non-predicated basic >>>>>>>>>>>> blocks that there are no gimplified statements to insert. Insert >>>>>>>>>>>> predicates at the block begining for extended if-conversion. >>>>>>>>>>>> (tree_if_conversion): Initialize flag_force_vectorize from current >>>>>>>>>>>> loop or outer loop (to support pragma omp declare).Do loop >>>>>>>>>>>> versioning >>>>>>>>>>>> for innermost loop marked with pragma omp simd and >>>>>>>>>>>> FLAG_TREE_LOOP_IF_CONVERT was not sett-up. Nullify 'aux' field of >>>>>>>>>>>> edges >>>>>>>>>>>> for blocks with two successors. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> 2014-09-08 17:10 GMT+04:00 Richard Biener >>>>>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>>>>> On Fri, Aug 15, 2014 at 2:02 PM, Yuri Rumyantsev >>>>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>>>> Richard! >>>>>>>>>>>>>> Here is updated patch with the following changes: >>>>>>>>>>>>>> >>>>>>>>>>>>>> 1. Any restrictions on phi-function were eliminated for extended >>>>>>>>>>>>>> conversion. >>>>>>>>>>>>>> 2. Put predicate for critical edges to 'aux' field of edge, i.e. >>>>>>>>>>>>>> negate_predicate was deleted. >>>>>>>>>>>>>> 3. Deleted splitting of critical edges, i.e. both outgoing edges >>>>>>>>>>>>>> can >>>>>>>>>>>>>> be critical. >>>>>>>>>>>>>> 4. Use notion of cd-equivalence to set-up predicate for join >>>>>>>>>>>>>> basic >>>>>>>>>>>>>> blocks to simplify it. >>>>>>>>>>>>>> 5. I decided to not design pre-pass since it will lead generating >>>>>>>>>>>>>> chain of cond expressions for phi-node if conversion, whereas >>>>>>>>>>>>>> for phi >>>>>>>>>>>>>> of kind >>>>>>>>>>>>>> x = PHI <1(2), 1(3), 2(4)> >>>>>>>>>>>>>> only one cond expression is required and this is considered as >>>>>>>>>>>>>> simple >>>>>>>>>>>>>> optimization for arbitrary phi-function. More precise, >>>>>>>>>>>>>> if phi-function have only two different arguments and one of >>>>>>>>>>>>>> them has >>>>>>>>>>>>>> single occurrence, if- conversion is performed as if phi have >>>>>>>>>>>>>> only 2 >>>>>>>>>>>>>> arguments. >>>>>>>>>>>>>> For arbitrary phi function a chain of cond expressions is >>>>>>>>>>>>>> produced. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Updated patch is attached. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Any comments will be appreciated. >>>>>>>>>>>>> >>>>>>>>>>>>> The patch is still very big and does multiple things at once >>>>>>>>>>>>> which makes >>>>>>>>>>>>> it hard to review. >>>>>>>>>>>>> >>>>>>>>>>>>> In addition to that it changes function singatures without >>>>>>>>>>>>> updating >>>>>>>>>>>>> the function comments. For example what is the convert_bool >>>>>>>>>>>>> argument doing to add_to_dst_predicate_list? Why do we need >>>>>>>>>>>>> all this added logic. >>>>>>>>>>>>> >>>>>>>>>>>>> You duplicate operand_equal_for_phi_arg_p. >>>>>>>>>>>>> >>>>>>>>>>>>> I think the code handling PHIs with more than two operands but >>>>>>>>>>>>> only two unequal operands is useful generally, so that's an >>>>>>>>>>>>> obvious >>>>>>>>>>>>> candidate for splitting out into a separate patch. >>>>>>>>>>>>> >>>>>>>>>>>>> + CONVERT_BOOL argument was added to convert bool predicate >>>>>>>>>>>>> computations >>>>>>>>>>>>> + which is not supported by vectorizer to int type through >>>>>>>>>>>>> creating of >>>>>>>>>>>>> + conditional expressions. */ >>>>>>>>>>>>> >>>>>>>>>>>>> Example? The vectorizer has patterns for bool predicate >>>>>>>>>>>>> computations. >>>>>>>>>>>>> This seems to be another feature that needs splitting out. >>>>>>>>>>>>> >>>>>>>>>>>>> The way you get around the critical edge parts looks awkward to >>>>>>>>>>>>> me. >>>>>>>>>>>>> Please either do _all_ predicates as edge predicates or simply >>>>>>>>>>>>> split critical edges (of the respective loop body). >>>>>>>>>>>>> >>>>>>>>>>>>> I still think that an utility doing same PHI arg merging by >>>>>>>>>>>>> introducing >>>>>>>>>>>>> forwarder blocks would be nicer to have. >>>>>>>>>>>>> >>>>>>>>>>>>> I'd restructure the main tree_if_conversion function to apply >>>>>>>>>>>>> these >>>>>>>>>>>>> CFG pre-transforms when we are going to version the loop >>>>>>>>>>>>> for if conversion (eventually transitioning to always doing that). >>>>>>>>>>>>> >>>>>>>>>>>>> So - please split up the patch. It's way too big. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Richard. >>>>>>>>>>>>> >>>>>>>>>>>>>> 2014-08-15 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>>>>>>> >>>>>>>>>>>>>> * tree-if-conv.c (cgraph.h): Add include file to detect function >>>>>>>>>>>>>> clone. >>>>>>>>>>>>>> (flag_force_vectorize): New variable. >>>>>>>>>>>>>> (edge_predicate): New function. >>>>>>>>>>>>>> (set_edge_predicate): New function. >>>>>>>>>>>>>> (add_stmt_to_bb_predicate_gimplified_stmts): New function. >>>>>>>>>>>>>> (init_bb_predicate): Add initialization of negate_predicate >>>>>>>>>>>>>> field. >>>>>>>>>>>>>> (reset_bb_predicate): Reset negate_predicate to NULL_TREE. >>>>>>>>>>>>>> (convert_name_to_cmp): New function. >>>>>>>>>>>>>> (get_type_for_cond): New function. >>>>>>>>>>>>>> (convert_bool_predicate): New function. >>>>>>>>>>>>>> (predicate_disjunction): New function. >>>>>>>>>>>>>> (predicate_conjunction): New function. >>>>>>>>>>>>>> (add_to_predicate_list): Add convert_bool argument. >>>>>>>>>>>>>> Use predicate of cd-equivalent block if convert_bool is true and >>>>>>>>>>>>>> such bb exists; save it in static variable for further possible >>>>>>>>>>>>>> use. >>>>>>>>>>>>>> Add call of predicate_disjunction if convert_bool argument is >>>>>>>>>>>>>> true. >>>>>>>>>>>>>> (add_to_dst_predicate_list): Add convert_bool argument. >>>>>>>>>>>>>> Add early function exit if edge target block is always executed. >>>>>>>>>>>>>> Add call of predicate_conjunction if convert_bool argument is >>>>>>>>>>>>>> true. >>>>>>>>>>>>>> Pass convert_bool argument for add_to_predicate_list. >>>>>>>>>>>>>> Set-up predicate for crritical edge if convert_bool is true. >>>>>>>>>>>>>> (equal_phi_args): New function. >>>>>>>>>>>>>> (phi_has_two_different_args): New function. >>>>>>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two args >>>>>>>>>>>>>> if flag_force_vectorize wa set-up. >>>>>>>>>>>>>> (ifcvt_can_use_mask_load_store): Add test on >>>>>>>>>>>>>> flag_force_vectorize. >>>>>>>>>>>>>> (if_convertible_stmt_p): Allow calls of function clones if >>>>>>>>>>>>>> flag_force_vectorize was set-up. >>>>>>>>>>>>>> (all_edges_are_critical): New function. >>>>>>>>>>>>>> (if_convertible_bb_p): Allow bb has more than two predecessors if >>>>>>>>>>>>>> flag_force_vectorize was set-up. Use call of >>>>>>>>>>>>>> all_edges_are_critical >>>>>>>>>>>>>> to reject block if-conversion with imcoming critical edges only >>>>>>>>>>>>>> if >>>>>>>>>>>>>> flag_force_vectorize was not set-up. >>>>>>>>>>>>>> (walk_cond_tree): New function. >>>>>>>>>>>>>> (vect_bool_pattern_is_applicable): New function. >>>>>>>>>>>>>> (predicate_bbs): Add convert_bool argument which is used to >>>>>>>>>>>>>> transform >>>>>>>>>>>>>> comparison expressions of boolean type into conditional >>>>>>>>>>>>>> expressions >>>>>>>>>>>>>> with integral operands. If convert_bool argument was set-up and >>>>>>>>>>>>>> vect bool pattern can be appied perform the following >>>>>>>>>>>>>> transformation: >>>>>>>>>>>>>> (bool) x != 0 --> y = (int) x; x != 0; >>>>>>>>>>>>>> Add check that if fold_build2 produces bool conversion if >>>>>>>>>>>>>> convert_bool >>>>>>>>>>>>>> was set-up, recompute predicate using build2_loc. Additional >>>>>>>>>>>>>> argument >>>>>>>>>>>>>> 'convert_bool" is passed to add_to_dst_predicate_list and >>>>>>>>>>>>>> add_to_predicate_list. >>>>>>>>>>>>>> (if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if >>>>>>>>>>>>>> flag_force_vectorize was set-up to calculate cd equivalent bb's. >>>>>>>>>>>>>> Call predicate_bbs with additional argument equal to false. >>>>>>>>>>>>>> (find_phi_replacement_condition): Extend function interface: >>>>>>>>>>>>>> it returns NULL if given phi node must be handled by means of >>>>>>>>>>>>>> extended phi node predication. If number of predecessors of >>>>>>>>>>>>>> phi-block >>>>>>>>>>>>>> is equal 2 and atleast one incoming edge is not critical original >>>>>>>>>>>>>> algorithm is used. >>>>>>>>>>>>>> (is_cond_scalar_reduction): Add 'extended' argument which >>>>>>>>>>>>>> signals that >>>>>>>>>>>>>> phi arguments must be evaluated through >>>>>>>>>>>>>> phi_has_two_different_args. >>>>>>>>>>>>>> (predicate_scalar_phi): Add invoсation of convert_name_to_cmp if >>>>>>>>>>>>>> cond >>>>>>>>>>>>>> is SSA_NAME. Add 'false' argument to call of >>>>>>>>>>>>>> is_cond_scalar_reduction. >>>>>>>>>>>>>> (get_predicate_for_edge): New function. >>>>>>>>>>>>>> (find_insertion_point): New function. >>>>>>>>>>>>>> (predicate_arbitrary_phi): New function. >>>>>>>>>>>>>> (predicate_extended_scalar_phi): New function. >>>>>>>>>>>>>> (predicate_all_scalar_phis): Add code to set-up gimple statement >>>>>>>>>>>>>> iterator for predication of extended scalar phi's for insertion. >>>>>>>>>>>>>> (insert_gimplified_predicates): Add test for non-predicated basic >>>>>>>>>>>>>> blocks that there are no gimplified statements to insert. Insert >>>>>>>>>>>>>> predicates at the block begining for extended if-conversion. >>>>>>>>>>>>>> (predicate_mem_writes): Invoke convert_name_to_cmp for extended >>>>>>>>>>>>>> predication to build mask. >>>>>>>>>>>>>> (combine_blocks): Pass flag_force_vectorize to predicate_bbs. >>>>>>>>>>>>>> (tree_if_conversion): Initialize flag_force_vectorize from >>>>>>>>>>>>>> current >>>>>>>>>>>>>> loop or outer loop (to support pragma omp declare).Do loop >>>>>>>>>>>>>> versioning >>>>>>>>>>>>>> for innermost loop marked with pragma omp simd. >>>>>>>>>>>>>> >>>>>>>>>>>>>> 2014-08-01 13:40 GMT+04:00 Richard Biener >>>>>>>>>>>>>> <richard.guent...@gmail.com>: >>>>>>>>>>>>>>> On Wed, Jun 25, 2014 at 4:06 PM, Yuri Rumyantsev >>>>>>>>>>>>>>> <ysrum...@gmail.com> wrote: >>>>>>>>>>>>>>>> Hi All, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> We implemented additional support for pragma omp simd in part >>>>>>>>>>>>>>>> of >>>>>>>>>>>>>>>> extended if-conversion loops with such pragma. These extensions >>>>>>>>>>>>>>>> include: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 1. All extensions are performed only if considered loop or its >>>>>>>>>>>>>>>> outer >>>>>>>>>>>>>>>> loop was marked with pragma omp simd (force_vectorize); For >>>>>>>>>>>>>>>> ordinary >>>>>>>>>>>>>>>> loops behavior was not changed. >>>>>>>>>>>>>>>> 2. Took off cfg restriction on basic block which can have more >>>>>>>>>>>>>>>> than 2 >>>>>>>>>>>>>>>> predecessors. >>>>>>>>>>>>>>>> 3. Put additional restriction on phi nodes which was missed in >>>>>>>>>>>>>>>> current design: >>>>>>>>>>>>>>>> all phi nodes must be in non-predicated basic block to >>>>>>>>>>>>>>>> conform >>>>>>>>>>>>>>>> semantic of COND_EXPR which is used for transformation. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> How is that so? If the PHI is predicated then its result will >>>>>>>>>>>>>>> be used >>>>>>>>>>>>>>> in a PHI node again and thus we'd create a sequence of >>>>>>>>>>>>>>> COND_EXPRs. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> No? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 4. Extend predication of phi nodes: phi may have more than 2 >>>>>>>>>>>>>>>> arguments >>>>>>>>>>>>>>>> with some limitations: >>>>>>>>>>>>>>>> - for phi nodes which have more than 2 arguments, but only >>>>>>>>>>>>>>>> two >>>>>>>>>>>>>>>> arguments are different and one of them has the only >>>>>>>>>>>>>>>> occurence, >>>>>>>>>>>>>>>> transformation to single COND_EXPR can be done. >>>>>>>>>>>>>>>> - if phi node has more different arguments and all edge >>>>>>>>>>>>>>>> predicates >>>>>>>>>>>>>>>> correspondent to phi-arguments are disjoint, a chain of >>>>>>>>>>>>>>>> COND_EXPR >>>>>>>>>>>>>>>> will be generated for it. In current design very simple >>>>>>>>>>>>>>>> check is used: >>>>>>>>>>>>>>>> check starting from end that two edges correspondent to >>>>>>>>>>>>>>>> neighbor >>>>>>>>>>>>>>>> arguments have common predecessor which is used for further >>>>>>>>>>>>>>>> check >>>>>>>>>>>>>>>> with next edge. >>>>>>>>>>>>>>>> These guarantee that phi predication will produce the correct >>>>>>>>>>>>>>>> result. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Btw, you can think of these extensions as unfactoring a PHI >>>>>>>>>>>>>>> node by >>>>>>>>>>>>>>> inserting forwarder blocks. Thus >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> x = PHI <1(2), 1(3), 2(4)> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> becomes >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> bb 5: <forwarder-from(2)-and(3)> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> x = PHI <1(5), 2(4)> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> and >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> x = PHI <1(2), 2(3), 3(4)> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> becomes >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> bb 5: >>>>>>>>>>>>>>> x' = PHI <1(2), 2(3)> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> b = PHI<x'(5), 3(4)> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> which means that 3) has to work. Note that we want this kind of >>>>>>>>>>>>>>> PHI transforms for out-of-SSA as well to reduce the number of >>>>>>>>>>>>>>> copies we need to insert on edges. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thus it would be nice if you implemented 4) in terms of a >>>>>>>>>>>>>>> pre-pass >>>>>>>>>>>>>>> over the force_vect loops PHI nodes, applying that CFG >>>>>>>>>>>>>>> transform. >>>>>>>>>>>>>>> And make 3) work properly if it doesn't already. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> It looks like you introduce a "negate predicate" to work around >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>> critical edge limitation? Please instead change if-conversion >>>>>>>>>>>>>>> to >>>>>>>>>>>>>>> work with edge predicates (as opposed to BB predicates). >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> Richard. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Here is example of such extended predication (compile with >>>>>>>>>>>>>>>> -march=core-avx2): >>>>>>>>>>>>>>>> #pragma omp simd safelen(8) >>>>>>>>>>>>>>>> for (i=0; i<512; i++) >>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>> float t = a[i]; >>>>>>>>>>>>>>>> if (t > 0 & t < 1.0e+17f) >>>>>>>>>>>>>>>> if (c[i] != 0) >>>>>>>>>>>>>>>> res += 1; >>>>>>>>>>>>>>>> } >>>>>>>>>>>>>>>> <bb 4>: >>>>>>>>>>>>>>>> # res_15 = PHI <res_1(5), 0(3)> >>>>>>>>>>>>>>>> # i_16 = PHI <i_11(5), 0(3)> >>>>>>>>>>>>>>>> # ivtmp_17 = PHI <ivtmp_14(5), 512(3)> >>>>>>>>>>>>>>>> t_5 = a[i_16]; >>>>>>>>>>>>>>>> _6 = t_5 > 0.0; >>>>>>>>>>>>>>>> _7 = t_5 < 9.9999998430674944e+16; >>>>>>>>>>>>>>>> _8 = _7 & _6; >>>>>>>>>>>>>>>> _ifc__28 = (unsigned int) _8; >>>>>>>>>>>>>>>> _10 = &c[i_16]; >>>>>>>>>>>>>>>> _ifc__36 = _ifc__28 != 0 ? 4294967295 : 0; >>>>>>>>>>>>>>>> _9 = MASK_LOAD (_10, 0B, _ifc__36); >>>>>>>>>>>>>>>> _ifc__29 = _ifc__28 != 0 ? 1 : 0; >>>>>>>>>>>>>>>> _ifc__30 = (int) _ifc__29; >>>>>>>>>>>>>>>> _ifc__31 = _9 != 0 ? _ifc__30 : 0; >>>>>>>>>>>>>>>> _ifc__32 = _ifc__28 != 0 ? 1 : 0; >>>>>>>>>>>>>>>> _ifc__33 = (int) _ifc__32; >>>>>>>>>>>>>>>> _ifc__34 = _9 == 0 ? _ifc__33 : 0; >>>>>>>>>>>>>>>> _ifc__35 = _ifc__31 != 0 ? 1 : 0; >>>>>>>>>>>>>>>> res_1 = res_15 + _ifc__35; >>>>>>>>>>>>>>>> i_11 = i_16 + 1; >>>>>>>>>>>>>>>> ivtmp_14 = ivtmp_17 - 1; >>>>>>>>>>>>>>>> if (ivtmp_14 != 0) >>>>>>>>>>>>>>>> goto <bb 4>; >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Bootstrap and regression testing did not show any new failures. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> gcc/ChageLog >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2014-06-25 Yuri Rumyantsev <ysrum...@gmail.com> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> * tree-if-conv.c (flag_force_vectorize): New variable. >>>>>>>>>>>>>>>> (struct bb_predicate_s): Add negate_predicate field. >>>>>>>>>>>>>>>> (bb_negate_predicate): New function. >>>>>>>>>>>>>>>> (set_bb_negate_predicate): New function. >>>>>>>>>>>>>>>> (bb_copy_predicate): New function. >>>>>>>>>>>>>>>> (add_stmt_to_bb_predicate_gimplified_stmts): New function. >>>>>>>>>>>>>>>> (init_bb_predicate): Add initialization of negate_predicate >>>>>>>>>>>>>>>> field. >>>>>>>>>>>>>>>> (reset_bb_predicate): Reset negate_predicate to NULL_TREE. >>>>>>>>>>>>>>>> (convert_name_to_cmp): New function. >>>>>>>>>>>>>>>> (get_type_for_cond): New function. >>>>>>>>>>>>>>>> (convert_bool_predicate): New function. >>>>>>>>>>>>>>>> (predicate_disjunction): New function. >>>>>>>>>>>>>>>> (predicate_conjunction): New function. >>>>>>>>>>>>>>>> (add_to_predicate_list): Add convert_bool argument. >>>>>>>>>>>>>>>> Add call of predicate_disjunction if convert_bool argument is >>>>>>>>>>>>>>>> true. >>>>>>>>>>>>>>>> (add_to_dst_predicate_list): Add convert_bool argument. >>>>>>>>>>>>>>>> Add early function exit if edge target block is always >>>>>>>>>>>>>>>> executed. >>>>>>>>>>>>>>>> Add call of predicate_conjunction if convert_bool argument is >>>>>>>>>>>>>>>> true. >>>>>>>>>>>>>>>> Pass convert_bool argument for add_to_predicate_list. >>>>>>>>>>>>>>>> (equal_phi_args): New function. >>>>>>>>>>>>>>>> (phi_has_two_different_args): New function. >>>>>>>>>>>>>>>> (phi_args_disjoint): New function. >>>>>>>>>>>>>>>> (if_convertible_phi_p): Accept phi nodes with more than two >>>>>>>>>>>>>>>> args >>>>>>>>>>>>>>>> for loops marked with pragma omp simd. Add check that phi >>>>>>>>>>>>>>>> nodes are >>>>>>>>>>>>>>>> in non-predicated basic blocks. >>>>>>>>>>>>>>>> (ifcvt_can_use_mask_load_store): Use flag_force_vectorize. >>>>>>>>>>>>>>>> (all_edges_are_critical): New function. >>>>>>>>>>>>>>>> (if_convertible_bb_p): Allow bb has more than two predecessors >>>>>>>>>>>>>>>> if >>>>>>>>>>>>>>>> flag_force_vectorize was setup. Use call of >>>>>>>>>>>>>>>> all_edges_are_critical >>>>>>>>>>>>>>>> to reject block if-conversion with imcoming critical edges >>>>>>>>>>>>>>>> only if >>>>>>>>>>>>>>>> flag_force_vectorize was not setup. >>>>>>>>>>>>>>>> (walk_cond_tree): New function. >>>>>>>>>>>>>>>> (vect_bool_pattern_is_applicable): New function. >>>>>>>>>>>>>>>> (predicate_bbs): Add convert_bool argument that is used to >>>>>>>>>>>>>>>> transform >>>>>>>>>>>>>>>> comparison expressions of boolean type into conditional >>>>>>>>>>>>>>>> expressions >>>>>>>>>>>>>>>> with integral operands. If bool_conv argument is false or both >>>>>>>>>>>>>>>> outgoing edges are not critical old algorithm of predicate >>>>>>>>>>>>>>>> assignments >>>>>>>>>>>>>>>> is used, otherwise the following code was added: check on >>>>>>>>>>>>>>>> applicable >>>>>>>>>>>>>>>> of vect-bool-pattern recognition and trnasformation of >>>>>>>>>>>>>>>> (bool) x != 0 --> y = (int) x; x != 0; >>>>>>>>>>>>>>>> compute predicates for both outgoing edges one of which is >>>>>>>>>>>>>>>> critical >>>>>>>>>>>>>>>> one using 'normal' edge, i.e. compute true and false >>>>>>>>>>>>>>>> predicates using >>>>>>>>>>>>>>>> normal outgoing edge only; evaluated predicates are stored in >>>>>>>>>>>>>>>> predicate and negate_predicate fields of struct bb_predicate_s >>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>> negate_predicate of normal edge conatins predicate of critical >>>>>>>>>>>>>>>> edge, >>>>>>>>>>>>>>>> but generated gimplified statements are stored in their >>>>>>>>>>>>>>>> destination >>>>>>>>>>>>>>>> block fields. Additional argument 'convert_bool" is passed to >>>>>>>>>>>>>>>> add_to_dst_predicate_list and add_to_predicate_list. >>>>>>>>>>>>>>>> (if_convertible_loop_p_1): Call predicate_bbs with additional >>>>>>>>>>>>>>>> argument >>>>>>>>>>>>>>>> equal to false. >>>>>>>>>>>>>>>> (find_phi_replacement_condition): Extend function interface: >>>>>>>>>>>>>>>> it returns NULL if given phi node must be handled by means of >>>>>>>>>>>>>>>> extended phi node predication. If number of predecessors of >>>>>>>>>>>>>>>> phi-block >>>>>>>>>>>>>>>> is equal 2 and atleast one incoming edge is not critical >>>>>>>>>>>>>>>> original >>>>>>>>>>>>>>>> algorithm is used. >>>>>>>>>>>>>>>> (is_cond_scalar_reduction): Add 'extended' argument which >>>>>>>>>>>>>>>> signals that >>>>>>>>>>>>>>>> both phi arguments must be evaluated through >>>>>>>>>>>>>>>> phi_has_two_different_args. >>>>>>>>>>>>>>>> (predicate_scalar_phi): Add invoсation of convert_name_to_cmp >>>>>>>>>>>>>>>> if cond >>>>>>>>>>>>>>>> is SSA_NAME. Add 'false' argument to call of >>>>>>>>>>>>>>>> is_cond_scalar_reduction. >>>>>>>>>>>>>>>> (get_predicate_for_edge): New function. >>>>>>>>>>>>>>>> (find_insertion_point): New function. >>>>>>>>>>>>>>>> (predicate_phi_disjoint_args): New function. >>>>>>>>>>>>>>>> (predicate_extended_scalar_phi): New function. >>>>>>>>>>>>>>>> (predicate_all_scalar_phis): Add code to set-up gimple >>>>>>>>>>>>>>>> statement >>>>>>>>>>>>>>>> iterator for predication of extended scalar phi's for >>>>>>>>>>>>>>>> insertion. >>>>>>>>>>>>>>>> (insert_gimplified_predicates): Add test for non-predicated >>>>>>>>>>>>>>>> basic >>>>>>>>>>>>>>>> blocks that there are no gimplified statements to insert. >>>>>>>>>>>>>>>> Insert >>>>>>>>>>>>>>>> predicates at the block begining for extended if-conversion. >>>>>>>>>>>>>>>> (predicate_mem_writes): Invoke convert_name_to_cmp for extended >>>>>>>>>>>>>>>> predication to build mask. >>>>>>>>>>>>>>>> (combine_blocks): Pass flag_force_vectorize to predicate_bbs. >>>>>>>>>>>>>>>> (split_crit_edge): New function. >>>>>>>>>>>>>>>> (tree_if_conversion): Initialize flag_force_vectorize from >>>>>>>>>>>>>>>> current >>>>>>>>>>>>>>>> loop or outer loop (to support pragma omp declare). Invoke >>>>>>>>>>>>>>>> split_crit_edge for extended predication. Do loop versioning >>>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>> innermost loop marked with pragma omp simd.