Richard!
Here is updated patch with the following changes:

1. Any restrictions on phi-function were eliminated for extended conversion.
2.  Put predicate for critical edges to 'aux' field of edge, i.e.
negate_predicate was deleted.
3. Deleted splitting of critical edges, i.e. both outgoing edges can
be critical.
4. Use notion of cd-equivalence to set-up predicate for join basic
blocks to simplify it.
5. I decided to not design pre-pass since it will lead generating
chain of cond expressions for phi-node if conversion, whereas for phi
of kind
  x = PHI <1(2), 1(3), 2(4)>
only one cond expression is required and this is considered as simple
optimization for arbitrary phi-function. More precise,
if phi-function have only two different arguments and one of them has
single occurrence, if- conversion is performed as if phi have only 2
arguments.
For arbitrary phi function a chain of cond expressions is produced.

Updated patch is attached.

Any comments will be appreciated.

2014-08-15  Yuri Rumyantsev  <ysrum...@gmail.com>

* tree-if-conv.c (cgraph.h): Add include file to detect function clone.
(flag_force_vectorize): New variable.
(edge_predicate): New function.
(set_edge_predicate): New function.
(add_stmt_to_bb_predicate_gimplified_stmts): New function.
(init_bb_predicate): Add initialization of negate_predicate field.
(reset_bb_predicate): Reset negate_predicate to NULL_TREE.
(convert_name_to_cmp): New function.
(get_type_for_cond): New function.
(convert_bool_predicate): New function.
(predicate_disjunction): New function.
(predicate_conjunction): New function.
(add_to_predicate_list): Add convert_bool argument.
Use predicate of cd-equivalent block if convert_bool is true and
such bb exists; save it in static variable for further possible use.
Add call of predicate_disjunction if convert_bool argument is true.
(add_to_dst_predicate_list): Add convert_bool argument.
Add early function exit if edge target block is always executed.
Add call of predicate_conjunction if convert_bool argument is true.
Pass convert_bool argument for add_to_predicate_list.
Set-up predicate for crritical edge if convert_bool is true.
(equal_phi_args): New function.
(phi_has_two_different_args): New function.
(if_convertible_phi_p): Accept phi nodes with more than two args
if flag_force_vectorize wa set-up.
(ifcvt_can_use_mask_load_store): Add test on flag_force_vectorize.
(if_convertible_stmt_p): Allow calls of function clones if
flag_force_vectorize was set-up.
(all_edges_are_critical): New function.
(if_convertible_bb_p): Allow bb has more than two predecessors if
flag_force_vectorize was set-up. Use call of all_edges_are_critical
to reject block if-conversion with imcoming critical edges only if
flag_force_vectorize was not set-up.
(walk_cond_tree): New function.
(vect_bool_pattern_is_applicable): New function.
(predicate_bbs): Add convert_bool argument which is used to transform
comparison expressions of boolean type into conditional expressions
with integral operands. If convert_bool argument was set-up and
vect bool pattern can be appied perform the following transformation:
(bool) x != 0  --> y = (int) x; x != 0;
Add check that if fold_build2 produces bool conversion if convert_bool
was set-up, recompute predicate using build2_loc. Additional argument
'convert_bool" is passed to add_to_dst_predicate_list and
add_to_predicate_list.
(if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if
flag_force_vectorize was set-up to calculate cd equivalent bb's.
Call predicate_bbs with additional argument equal to false.
(find_phi_replacement_condition): Extend function interface:
it returns NULL if given phi node must be handled by means of
extended phi node predication. If number of predecessors of phi-block
is equal 2 and atleast one incoming edge is not critical original
algorithm is used.
(is_cond_scalar_reduction): Add 'extended' argument which signals that
phi arguments must be evaluated through phi_has_two_different_args.
(predicate_scalar_phi): Add invoсation of convert_name_to_cmp if cond
is SSA_NAME. Add 'false' argument to call of is_cond_scalar_reduction.
(get_predicate_for_edge): New function.
(find_insertion_point): New function.
(predicate_arbitrary_phi): New function.
(predicate_extended_scalar_phi): New function.
(predicate_all_scalar_phis): Add code to set-up gimple statement
iterator for predication of extended scalar phi's for insertion.
(insert_gimplified_predicates): Add test for non-predicated basic
blocks that there are no gimplified statements to insert. Insert
predicates at the block begining for extended if-conversion.
(predicate_mem_writes): Invoke convert_name_to_cmp for extended
predication to build mask.
(combine_blocks): Pass flag_force_vectorize to predicate_bbs.
(tree_if_conversion): Initialize flag_force_vectorize from current
loop or outer loop (to support pragma omp declare).Do loop versioning
for innermost loop marked with pragma omp simd.

2014-08-01 13:40 GMT+04:00 Richard Biener <richard.guent...@gmail.com>:
> On Wed, Jun 25, 2014 at 4:06 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
>> Hi All,
>>
>> We implemented additional support for pragma omp simd in part of
>> extended if-conversion loops with such pragma. These extensions
>> include:
>>
>> 1. All extensions are performed only if considered loop or its outer
>>    loop was marked with pragma omp simd (force_vectorize); For ordinary
>>    loops behavior was not changed.
>> 2. Took off cfg restriction on basic block which can have more than 2
>>    predecessors.
>> 3. Put additional restriction on phi nodes which was missed in current 
>> design:
>>    all phi nodes must be in non-predicated basic block to conform
>>    semantic of COND_EXPR which is used for transformation.
>
> How is that so?  If the PHI is predicated then its result will be used
> in a PHI node again and thus we'd create a sequence of COND_EXPRs.
>
> No?
>
>> 4. Extend predication of phi nodes: phi may have more than 2 arguments
>> with some limitations:
>>    - for phi nodes which have more than 2 arguments, but only two
>>    arguments are different and one of them has the only occurence,
>> transformation to  single COND_EXPR can be done.
>>    - if phi node has more different arguments and all edge predicates
>>    correspondent to phi-arguments are disjoint, a chain of COND_EXPR
>>    will be generated for it. In current design very simple check is used:
>>    check starting from end that two edges correspondent to neighbor
>> arguments have common predecessor which is used for further check
>> with next edge.
>>  These guarantee that phi predication will produce the correct result.
>
> Btw, you can think of these extensions as unfactoring a PHI node by
> inserting forwarder blocks.  Thus
>
>    x = PHI <1(2), 1(3), 2(4)>
>
> becomes
>
>   bb 5: <forwarder-from(2)-and(3)>
>
>   x = PHI <1(5), 2(4)>
>
> and
>
>   x = PHI <1(2), 2(3), 3(4)>
>
> becomes
>
>   bb 5:
>   x' = PHI <1(2), 2(3)>
>
>   b = PHI<x'(5), 3(4)>
>
> which means that 3) has to work.  Note that we want this kind of
> PHI transforms for out-of-SSA as well to reduce the number of
> copies we need to insert on edges.
>
> Thus it would be nice if you implemented 4) in terms of a pre-pass
> over the force_vect loops PHI nodes, applying that CFG transform.
> And make 3) work properly if it doesn't already.
>
> It looks like you introduce a "negate predicate" to work around the
> critical edge limitation?  Please instead change if-conversion to
> work with edge predicates (as opposed to BB predicates).
>
> Thanks,
> Richard.
>
>>
>> Here is example of such extended predication (compile with -march=core-avx2):
>> #pragma omp simd safelen(8)
>>   for (i=0; i<512; i++)
>>   {
>>     float t = a[i];
>>     if (t > 0 & t < 1.0e+17f)
>>       if (c[i] != 0)
>> res += 1;
>>   }
>>   <bb 4>:
>>   # res_15 = PHI <res_1(5), 0(3)>
>>   # i_16 = PHI <i_11(5), 0(3)>
>>   # ivtmp_17 = PHI <ivtmp_14(5), 512(3)>
>>   t_5 = a[i_16];
>>   _6 = t_5 > 0.0;
>>   _7 = t_5 < 9.9999998430674944e+16;
>>   _8 = _7 & _6;
>>   _ifc__28 = (unsigned int) _8;
>>   _10 = &c[i_16];
>>   _ifc__36 = _ifc__28 != 0 ? 4294967295 : 0;
>>   _9 = MASK_LOAD (_10, 0B, _ifc__36);
>>   _ifc__29 = _ifc__28 != 0 ? 1 : 0;
>>   _ifc__30 = (int) _ifc__29;
>>   _ifc__31 = _9 != 0 ? _ifc__30 : 0;
>>   _ifc__32 = _ifc__28 != 0 ? 1 : 0;
>>   _ifc__33 = (int) _ifc__32;
>>   _ifc__34 = _9 == 0 ? _ifc__33 : 0;
>>   _ifc__35 = _ifc__31 != 0 ? 1 : 0;
>>   res_1 = res_15 + _ifc__35;
>>   i_11 = i_16 + 1;
>>   ivtmp_14 = ivtmp_17 - 1;
>>   if (ivtmp_14 != 0)
>>     goto <bb 4>;
>>
>> Bootstrap and regression testing did not show any new failures.
>>
>> gcc/ChageLog
>>
>> 2014-06-25  Yuri Rumyantsev  <ysrum...@gmail.com>
>>
>> * tree-if-conv.c (flag_force_vectorize): New variable.
>> (struct bb_predicate_s): Add negate_predicate field.
>> (bb_negate_predicate): New function.
>> (set_bb_negate_predicate): New function.
>> (bb_copy_predicate): New function.
>> (add_stmt_to_bb_predicate_gimplified_stmts): New function.
>> (init_bb_predicate): Add initialization of negate_predicate field.
>> (reset_bb_predicate): Reset negate_predicate to NULL_TREE.
>> (convert_name_to_cmp): New function.
>> (get_type_for_cond): New function.
>> (convert_bool_predicate): New function.
>> (predicate_disjunction): New function.
>> (predicate_conjunction): New function.
>> (add_to_predicate_list): Add convert_bool argument.
>> Add call of predicate_disjunction if convert_bool argument is true.
>> (add_to_dst_predicate_list): Add convert_bool argument.
>> Add early function exit if edge target block is always executed.
>> Add call of predicate_conjunction if convert_bool argument is true.
>> Pass convert_bool argument for add_to_predicate_list.
>> (equal_phi_args): New function.
>> (phi_has_two_different_args): New function.
>> (phi_args_disjoint): New function.
>> (if_convertible_phi_p): Accept phi nodes with more than two args
>> for loops marked with pragma omp simd. Add check that phi nodes are
>> in non-predicated basic blocks.
>> (ifcvt_can_use_mask_load_store): Use flag_force_vectorize.
>> (all_edges_are_critical): New function.
>> (if_convertible_bb_p): Allow bb has more than two predecessors if
>> flag_force_vectorize was setup. Use call of all_edges_are_critical
>> to reject block if-conversion with imcoming critical edges only if
>> flag_force_vectorize was not setup.
>> (walk_cond_tree): New function.
>> (vect_bool_pattern_is_applicable): New function.
>> (predicate_bbs): Add convert_bool argument that is used to transform
>> comparison expressions of boolean type into conditional expressions
>> with integral operands. If bool_conv argument is false or both
>> outgoing edges are not critical old algorithm of predicate assignments
>> is used, otherwise the following code was added: check on applicable
>> of vect-bool-pattern recognition and trnasformation of
>> (bool) x != 0  --> y = (int) x; x != 0;
>> compute predicates for both outgoing edges one of which is critical
>> one using 'normal' edge, i.e. compute true and false predicates using
>> normal outgoing edge only; evaluated predicates are stored in
>> predicate and negate_predicate fields of struct bb_predicate_s and
>> negate_predicate of normal edge conatins predicate of critical edge,
>> but generated gimplified statements are stored in their destination
>> block fields. Additional argument 'convert_bool" is passed to
>> add_to_dst_predicate_list and add_to_predicate_list.
>> (if_convertible_loop_p_1): Call predicate_bbs with additional argument
>> equal to false.
>> (find_phi_replacement_condition): Extend function interface:
>> it returns NULL if given phi node must be handled by means of
>> extended phi node predication. If number of predecessors of phi-block
>> is equal 2 and atleast one incoming edge is not critical original
>> algorithm is used.
>> (is_cond_scalar_reduction): Add 'extended' argument which signals that
>> both phi arguments must be evaluated through phi_has_two_different_args.
>> (predicate_scalar_phi): Add invoсation of convert_name_to_cmp if cond
>> is SSA_NAME. Add 'false' argument to call of is_cond_scalar_reduction.
>> (get_predicate_for_edge): New function.
>> (find_insertion_point): New function.
>> (predicate_phi_disjoint_args): New function.
>> (predicate_extended_scalar_phi): New function.
>> (predicate_all_scalar_phis): Add code to set-up gimple statement
>> iterator for predication of extended scalar phi's for insertion.
>> (insert_gimplified_predicates): Add test for non-predicated basic
>> blocks that there are no gimplified statements to insert. Insert
>> predicates at the block begining for extended if-conversion.
>> (predicate_mem_writes): Invoke convert_name_to_cmp for extended
>> predication to build mask.
>> (combine_blocks): Pass flag_force_vectorize to predicate_bbs.
>> (split_crit_edge): New function.
>> (tree_if_conversion): Initialize flag_force_vectorize from current
>> loop or outer loop (to support pragma omp declare). Invoke
>> split_crit_edge for extended predication. Do loop versioning for
>> innermost loop marked with pragma omp simd.

Attachment: patch.1
Description: Binary data

Reply via email to