On Tue, Jun 13, 2017 at 11:59 AM, Richard Biener
<richard.guent...@gmail.com> wrote:
> On Mon, Jun 12, 2017 at 7:02 PM, Bin Cheng <bin.ch...@arm.com> wrote:
>> Hi,
>> During the work I ran into a latent bug for distributing.  For the moment we 
>> sort statements
>> in dominance order, but that's not enough because basic blocks may be sorted 
>> in reverse order
>> of execution flow.  This results in wrong data dependence direction later.  
>> This patch fixes
>> the issue by sorting in topological order.
>>
>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>
> I suppose you are fixing
>
> static int
> pg_add_dependence_edges (struct graph *rdg, vec<loop_p> loops, int dir,
>                          vec<data_reference_p> drs1,
>                          vec<data_reference_p> drs2)
> {
> ...
>         /* Re-shuffle data-refs to be in dominator order.  */
>         if (rdg_vertex_for_stmt (rdg, DR_STMT (dr1))
>             > rdg_vertex_for_stmt (rdg, DR_STMT (dr2)))
>           {
>             std::swap (dr1, dr2);
>             this_dir = -this_dir;
>           }
>
> but then for stmts that are not "ordered" by RPO or DOM like
>
>    if (flag)
>      ... = dr1;
>    else
>      ... = dr2;
>
> this doesn't avoid spurious swaps?  Also the code was basically
No, this is mainly for below case:
  if (flag)
    {
      partition1: arr[i] = x;
    }
  partition2: arr[i] = y;

function pg_add_dependence_edges is like:
    /* Re-shuffle data-refs to be in dominator order. */
    if (rdg_vertex_for_stmt (rdg, DR_STMT (dr1))
        > rdg_vertex_for_stmt (rdg, DR_STMT (dr2)))
      {
        std::swap (dr1, dr2);
        this_dir = -this_dir;
      }
    //...
    else if (DDR_ARE_DEPENDENT (ddr) == NULL_TREE)
      {
         if (DDR_REVERSED_P (ddr))
           {
             std::swap (dr1, dr2);
             this_dir = -this_dir;
           }
         /* Known dependences can still be unordered througout the
            iteration space, see gcc.dg/tree-ssa/ldist-16.c.  */
         if (DDR_NUM_DIST_VECTS (ddr) != 1)
           this_dir = 2;
         /* If the overlap is exact preserve stmt order.  */
         else if (lambda_vector_zerop (DDR_DIST_VECT (ddr, 0), 1))
           ;
         else
           {
             /* Else as the distance vector is lexicographic positive
                swap the dependence direction.  */
             this_dir = -this_dir;
           }
      }
For non-ZERO distance vector dependence, the second part always
computes src->dest dependence info correctly, as well as edge
direction of PG.  For ZERO distance vector dependence, we rely on the
swap part (thus topological order) to get correct dependence
direction.  For mentioned case, the two data references are unordered
in dominance relation, but ordered in RPO.  This is why DOM is not
enough.  For if-then-else case, the order actually doesn't matter, and
the references are unordered in either dominance relation or RPO.
Specifically, src->dest is always computed correctly for non-ZERO
distance vector cases, no matter <dr1, dr2> or <dr2, dr1> is passed to
data dependence analyzer.  As for ZERO distance vector (exact
overlap), the order doesn't matter either because they control
dependent on the same condition.  We can simply assume an arbitrary
order for it.

> copied from tree-data-refs.c:find_data_references_in_loop which
> does iterate over get_loop_body_in_dom_order as well.  So isn't the
> issue latent there as well?
In theory maybe.  In practice, this is not a problem at all since loop
distribution is the only one handles control dependence so far.
>
> That said, what about those "unordered" stmts?  I suppose
> dependence analysis still happily computes a dependence
> distance but in reality we'd have to consider both execution
> orders?
As explained, there is no need to consider both orders.  GCC doesn't
really support control dependence parallelization?

Thanks,
bin
>
> Thanks,
> Richard.
>
>
>>
>> Thanks,
>> bin
>> 2017-06-07  Bin Cheng  <bin.ch...@arm.com>
>>
>>         * tree-loop-distribution.c (bb_top_order_index): New.
>>         (bb_top_order_index_size, bb_top_order_cmp): New.
>>         (stmts_from_loop): Use topological order.
>>         (pass_loop_distribution::execute): Compute topological order for.
>>         basic blocks.

Reply via email to