Re: [PATCH GCC][04/13]Sort statements in topological order for loop distribution

2017-06-14 Thread Richard Biener
On Wed, Jun 14, 2017 at 11:25 AM, Bin.Cheng  wrote:
> On Wed, Jun 14, 2017 at 10:15 AM, Richard Biener
>  wrote:
>> On Wed, Jun 14, 2017 at 9:53 AM, Bin.Cheng  wrote:
>>> On Tue, Jun 13, 2017 at 11:59 AM, Richard Biener
>>>  wrote:
 On Mon, Jun 12, 2017 at 7:02 PM, Bin Cheng  wrote:
> Hi,
> During the work I ran into a latent bug for distributing.  For the moment 
> we sort statements
> in dominance order, but that's not enough because basic blocks may be 
> sorted in reverse order
> of execution flow.  This results in wrong data dependence direction 
> later.  This patch fixes
> the issue by sorting in topological order.
>
> Bootstrap and test on x86_64 and AArch64.  Is it OK?

 I suppose you are fixing

 static int
 pg_add_dependence_edges (struct graph *rdg, vec loops, int dir,
  vec drs1,
  vec drs2)
 {
 ...
 /* Re-shuffle data-refs to be in dominator order.  */
 if (rdg_vertex_for_stmt (rdg, DR_STMT (dr1))
 > rdg_vertex_for_stmt (rdg, DR_STMT (dr2)))
   {
 std::swap (dr1, dr2);
 this_dir = -this_dir;
   }

 but then for stmts that are not "ordered" by RPO or DOM like

if (flag)
  ... = dr1;
else
  ... = dr2;

 this doesn't avoid spurious swaps?  Also the code was basically
>>> No, this is mainly for below case:
>>>   if (flag)
>>> {
>>>   partition1: arr[i] = x;
>>> }
>>>   partition2: arr[i] = y;
>>>
>>> function pg_add_dependence_edges is like:
>>> /* Re-shuffle data-refs to be in dominator order. */
>>> if (rdg_vertex_for_stmt (rdg, DR_STMT (dr1))
>>> > rdg_vertex_for_stmt (rdg, DR_STMT (dr2)))
>>>   {
>>> std::swap (dr1, dr2);
>>> this_dir = -this_dir;
>>>   }
>>> //...
>>> else if (DDR_ARE_DEPENDENT (ddr) == NULL_TREE)
>>>   {
>>>  if (DDR_REVERSED_P (ddr))
>>>{
>>>  std::swap (dr1, dr2);
>>>  this_dir = -this_dir;
>>>}
>>>  /* Known dependences can still be unordered througout the
>>> iteration space, see gcc.dg/tree-ssa/ldist-16.c.  */
>>>  if (DDR_NUM_DIST_VECTS (ddr) != 1)
>>>this_dir = 2;
>>>  /* If the overlap is exact preserve stmt order.  */
>>>  else if (lambda_vector_zerop (DDR_DIST_VECT (ddr, 0), 1))
>>>;
>>>  else
>>>{
>>>  /* Else as the distance vector is lexicographic positive
>>> swap the dependence direction.  */
>>>  this_dir = -this_dir;
>>>}
>>>   }
>>> For non-ZERO distance vector dependence, the second part always
>>> computes src->dest dependence info correctly, as well as edge
>>> direction of PG.  For ZERO distance vector dependence, we rely on the
>>> swap part (thus topological order) to get correct dependence
>>> direction.  For mentioned case, the two data references are unordered
>>> in dominance relation, but ordered in RPO.  This is why DOM is not
>>> enough.  For if-then-else case, the order actually doesn't matter, and
>>> the references are unordered in either dominance relation or RPO.
>>> Specifically, src->dest is always computed correctly for non-ZERO
>>> distance vector cases, no matter  or  is passed to
>>> data dependence analyzer.  As for ZERO distance vector (exact
>>> overlap), the order doesn't matter either because they control
>>> dependent on the same condition.  We can simply assume an arbitrary
>>> order for it.
>>
>> Ok, if it only is an issue for zero-distance then yes, I agree.
> Yeah, so for distribution, I think we can further simplify
> pg_add_dependence_edge because swap is only necessary for
> zero-distance cases.
>>
 copied from tree-data-refs.c:find_data_references_in_loop which
 does iterate over get_loop_body_in_dom_order as well.  So isn't the
 issue latent there as well?
>>> In theory maybe.  In practice, this is not a problem at all since loop
>>> distribution is the only one handles control dependence so far.
>>
>> You mean the only one running into the bogus BB ordering case.
>> I don't see how handling control dependences factor in here.
> Yes, that's what I meant.  The ordering issue doesn't exist if there
> is no control dependence between basic blocks in loop body I think.  I
> didn't realize the autopar pass.
>>

 That said, what about those "unordered" stmts?  I suppose
 dependence analysis still happily computes a dependence
 distance but in reality we'd have to consider both execution
 orders?
>>> As explained, there is no need to consider both orders.  GCC doesn't
>>> really support control dependence parallelization?
>>

Re: [PATCH GCC][04/13]Sort statements in topological order for loop distribution

2017-06-14 Thread Bin.Cheng
On Wed, Jun 14, 2017 at 10:15 AM, Richard Biener
 wrote:
> On Wed, Jun 14, 2017 at 9:53 AM, Bin.Cheng  wrote:
>> On Tue, Jun 13, 2017 at 11:59 AM, Richard Biener
>>  wrote:
>>> On Mon, Jun 12, 2017 at 7:02 PM, Bin Cheng  wrote:
 Hi,
 During the work I ran into a latent bug for distributing.  For the moment 
 we sort statements
 in dominance order, but that's not enough because basic blocks may be 
 sorted in reverse order
 of execution flow.  This results in wrong data dependence direction later. 
  This patch fixes
 the issue by sorting in topological order.

 Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>>
>>> I suppose you are fixing
>>>
>>> static int
>>> pg_add_dependence_edges (struct graph *rdg, vec loops, int dir,
>>>  vec drs1,
>>>  vec drs2)
>>> {
>>> ...
>>> /* Re-shuffle data-refs to be in dominator order.  */
>>> if (rdg_vertex_for_stmt (rdg, DR_STMT (dr1))
>>> > rdg_vertex_for_stmt (rdg, DR_STMT (dr2)))
>>>   {
>>> std::swap (dr1, dr2);
>>> this_dir = -this_dir;
>>>   }
>>>
>>> but then for stmts that are not "ordered" by RPO or DOM like
>>>
>>>if (flag)
>>>  ... = dr1;
>>>else
>>>  ... = dr2;
>>>
>>> this doesn't avoid spurious swaps?  Also the code was basically
>> No, this is mainly for below case:
>>   if (flag)
>> {
>>   partition1: arr[i] = x;
>> }
>>   partition2: arr[i] = y;
>>
>> function pg_add_dependence_edges is like:
>> /* Re-shuffle data-refs to be in dominator order. */
>> if (rdg_vertex_for_stmt (rdg, DR_STMT (dr1))
>> > rdg_vertex_for_stmt (rdg, DR_STMT (dr2)))
>>   {
>> std::swap (dr1, dr2);
>> this_dir = -this_dir;
>>   }
>> //...
>> else if (DDR_ARE_DEPENDENT (ddr) == NULL_TREE)
>>   {
>>  if (DDR_REVERSED_P (ddr))
>>{
>>  std::swap (dr1, dr2);
>>  this_dir = -this_dir;
>>}
>>  /* Known dependences can still be unordered througout the
>> iteration space, see gcc.dg/tree-ssa/ldist-16.c.  */
>>  if (DDR_NUM_DIST_VECTS (ddr) != 1)
>>this_dir = 2;
>>  /* If the overlap is exact preserve stmt order.  */
>>  else if (lambda_vector_zerop (DDR_DIST_VECT (ddr, 0), 1))
>>;
>>  else
>>{
>>  /* Else as the distance vector is lexicographic positive
>> swap the dependence direction.  */
>>  this_dir = -this_dir;
>>}
>>   }
>> For non-ZERO distance vector dependence, the second part always
>> computes src->dest dependence info correctly, as well as edge
>> direction of PG.  For ZERO distance vector dependence, we rely on the
>> swap part (thus topological order) to get correct dependence
>> direction.  For mentioned case, the two data references are unordered
>> in dominance relation, but ordered in RPO.  This is why DOM is not
>> enough.  For if-then-else case, the order actually doesn't matter, and
>> the references are unordered in either dominance relation or RPO.
>> Specifically, src->dest is always computed correctly for non-ZERO
>> distance vector cases, no matter  or  is passed to
>> data dependence analyzer.  As for ZERO distance vector (exact
>> overlap), the order doesn't matter either because they control
>> dependent on the same condition.  We can simply assume an arbitrary
>> order for it.
>
> Ok, if it only is an issue for zero-distance then yes, I agree.
Yeah, so for distribution, I think we can further simplify
pg_add_dependence_edge because swap is only necessary for
zero-distance cases.
>
>>> copied from tree-data-refs.c:find_data_references_in_loop which
>>> does iterate over get_loop_body_in_dom_order as well.  So isn't the
>>> issue latent there as well?
>> In theory maybe.  In practice, this is not a problem at all since loop
>> distribution is the only one handles control dependence so far.
>
> You mean the only one running into the bogus BB ordering case.
> I don't see how handling control dependences factor in here.
Yes, that's what I meant.  The ordering issue doesn't exist if there
is no control dependence between basic blocks in loop body I think.  I
didn't realize the autopar pass.
>
>>>
>>> That said, what about those "unordered" stmts?  I suppose
>>> dependence analysis still happily computes a dependence
>>> distance but in reality we'd have to consider both execution
>>> orders?
>> As explained, there is no need to consider both orders.  GCC doesn't
>> really support control dependence parallelization?
>
> I think autopar supports an arbitrary CFG inside the loops but as it
> will never split them it won't change stmt ordering for zero-distance.
>
> That said, if dependence info can be incorrect if 

Re: [PATCH GCC][04/13]Sort statements in topological order for loop distribution

2017-06-14 Thread Richard Biener
On Wed, Jun 14, 2017 at 9:53 AM, Bin.Cheng  wrote:
> On Tue, Jun 13, 2017 at 11:59 AM, Richard Biener
>  wrote:
>> On Mon, Jun 12, 2017 at 7:02 PM, Bin Cheng  wrote:
>>> Hi,
>>> During the work I ran into a latent bug for distributing.  For the moment 
>>> we sort statements
>>> in dominance order, but that's not enough because basic blocks may be 
>>> sorted in reverse order
>>> of execution flow.  This results in wrong data dependence direction later.  
>>> This patch fixes
>>> the issue by sorting in topological order.
>>>
>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>
>> I suppose you are fixing
>>
>> static int
>> pg_add_dependence_edges (struct graph *rdg, vec loops, int dir,
>>  vec drs1,
>>  vec drs2)
>> {
>> ...
>> /* Re-shuffle data-refs to be in dominator order.  */
>> if (rdg_vertex_for_stmt (rdg, DR_STMT (dr1))
>> > rdg_vertex_for_stmt (rdg, DR_STMT (dr2)))
>>   {
>> std::swap (dr1, dr2);
>> this_dir = -this_dir;
>>   }
>>
>> but then for stmts that are not "ordered" by RPO or DOM like
>>
>>if (flag)
>>  ... = dr1;
>>else
>>  ... = dr2;
>>
>> this doesn't avoid spurious swaps?  Also the code was basically
> No, this is mainly for below case:
>   if (flag)
> {
>   partition1: arr[i] = x;
> }
>   partition2: arr[i] = y;
>
> function pg_add_dependence_edges is like:
> /* Re-shuffle data-refs to be in dominator order. */
> if (rdg_vertex_for_stmt (rdg, DR_STMT (dr1))
> > rdg_vertex_for_stmt (rdg, DR_STMT (dr2)))
>   {
> std::swap (dr1, dr2);
> this_dir = -this_dir;
>   }
> //...
> else if (DDR_ARE_DEPENDENT (ddr) == NULL_TREE)
>   {
>  if (DDR_REVERSED_P (ddr))
>{
>  std::swap (dr1, dr2);
>  this_dir = -this_dir;
>}
>  /* Known dependences can still be unordered througout the
> iteration space, see gcc.dg/tree-ssa/ldist-16.c.  */
>  if (DDR_NUM_DIST_VECTS (ddr) != 1)
>this_dir = 2;
>  /* If the overlap is exact preserve stmt order.  */
>  else if (lambda_vector_zerop (DDR_DIST_VECT (ddr, 0), 1))
>;
>  else
>{
>  /* Else as the distance vector is lexicographic positive
> swap the dependence direction.  */
>  this_dir = -this_dir;
>}
>   }
> For non-ZERO distance vector dependence, the second part always
> computes src->dest dependence info correctly, as well as edge
> direction of PG.  For ZERO distance vector dependence, we rely on the
> swap part (thus topological order) to get correct dependence
> direction.  For mentioned case, the two data references are unordered
> in dominance relation, but ordered in RPO.  This is why DOM is not
> enough.  For if-then-else case, the order actually doesn't matter, and
> the references are unordered in either dominance relation or RPO.
> Specifically, src->dest is always computed correctly for non-ZERO
> distance vector cases, no matter  or  is passed to
> data dependence analyzer.  As for ZERO distance vector (exact
> overlap), the order doesn't matter either because they control
> dependent on the same condition.  We can simply assume an arbitrary
> order for it.

Ok, if it only is an issue for zero-distance then yes, I agree.

>> copied from tree-data-refs.c:find_data_references_in_loop which
>> does iterate over get_loop_body_in_dom_order as well.  So isn't the
>> issue latent there as well?
> In theory maybe.  In practice, this is not a problem at all since loop
> distribution is the only one handles control dependence so far.

You mean the only one running into the bogus BB ordering case.
I don't see how handling control dependences factor in here.

>>
>> That said, what about those "unordered" stmts?  I suppose
>> dependence analysis still happily computes a dependence
>> distance but in reality we'd have to consider both execution
>> orders?
> As explained, there is no need to consider both orders.  GCC doesn't
> really support control dependence parallelization?

I think autopar supports an arbitrary CFG inside the loops but as it
will never split them it won't change stmt ordering for zero-distance.

That said, if dependence info can be incorrect if applied to a loop
we should fixup tree-data-refs.c as well.  It might be useful
to make get_loop_body_in_rpo_order available then (and eventually
all _in_dom_order users can work with rpo order as well thus we
can replace it entirely as a second step).

Thanks,
Richard.

> Thanks,
> bin
>>
>> Thanks,
>> Richard.
>>
>>
>>>
>>> Thanks,
>>> bin
>>> 2017-06-07  Bin Cheng  
>>>
>>> * tree-loop-distribution.c (bb_top_order_index): New.
>>> (bb_top_order_index_size, bb_top_order_cmp): 

Re: [PATCH GCC][04/13]Sort statements in topological order for loop distribution

2017-06-14 Thread Bin.Cheng
On Tue, Jun 13, 2017 at 11:59 AM, Richard Biener
 wrote:
> On Mon, Jun 12, 2017 at 7:02 PM, Bin Cheng  wrote:
>> Hi,
>> During the work I ran into a latent bug for distributing.  For the moment we 
>> sort statements
>> in dominance order, but that's not enough because basic blocks may be sorted 
>> in reverse order
>> of execution flow.  This results in wrong data dependence direction later.  
>> This patch fixes
>> the issue by sorting in topological order.
>>
>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>
> I suppose you are fixing
>
> static int
> pg_add_dependence_edges (struct graph *rdg, vec loops, int dir,
>  vec drs1,
>  vec drs2)
> {
> ...
> /* Re-shuffle data-refs to be in dominator order.  */
> if (rdg_vertex_for_stmt (rdg, DR_STMT (dr1))
> > rdg_vertex_for_stmt (rdg, DR_STMT (dr2)))
>   {
> std::swap (dr1, dr2);
> this_dir = -this_dir;
>   }
>
> but then for stmts that are not "ordered" by RPO or DOM like
>
>if (flag)
>  ... = dr1;
>else
>  ... = dr2;
>
> this doesn't avoid spurious swaps?  Also the code was basically
No, this is mainly for below case:
  if (flag)
{
  partition1: arr[i] = x;
}
  partition2: arr[i] = y;

function pg_add_dependence_edges is like:
/* Re-shuffle data-refs to be in dominator order. */
if (rdg_vertex_for_stmt (rdg, DR_STMT (dr1))
> rdg_vertex_for_stmt (rdg, DR_STMT (dr2)))
  {
std::swap (dr1, dr2);
this_dir = -this_dir;
  }
//...
else if (DDR_ARE_DEPENDENT (ddr) == NULL_TREE)
  {
 if (DDR_REVERSED_P (ddr))
   {
 std::swap (dr1, dr2);
 this_dir = -this_dir;
   }
 /* Known dependences can still be unordered througout the
iteration space, see gcc.dg/tree-ssa/ldist-16.c.  */
 if (DDR_NUM_DIST_VECTS (ddr) != 1)
   this_dir = 2;
 /* If the overlap is exact preserve stmt order.  */
 else if (lambda_vector_zerop (DDR_DIST_VECT (ddr, 0), 1))
   ;
 else
   {
 /* Else as the distance vector is lexicographic positive
swap the dependence direction.  */
 this_dir = -this_dir;
   }
  }
For non-ZERO distance vector dependence, the second part always
computes src->dest dependence info correctly, as well as edge
direction of PG.  For ZERO distance vector dependence, we rely on the
swap part (thus topological order) to get correct dependence
direction.  For mentioned case, the two data references are unordered
in dominance relation, but ordered in RPO.  This is why DOM is not
enough.  For if-then-else case, the order actually doesn't matter, and
the references are unordered in either dominance relation or RPO.
Specifically, src->dest is always computed correctly for non-ZERO
distance vector cases, no matter  or  is passed to
data dependence analyzer.  As for ZERO distance vector (exact
overlap), the order doesn't matter either because they control
dependent on the same condition.  We can simply assume an arbitrary
order for it.

> copied from tree-data-refs.c:find_data_references_in_loop which
> does iterate over get_loop_body_in_dom_order as well.  So isn't the
> issue latent there as well?
In theory maybe.  In practice, this is not a problem at all since loop
distribution is the only one handles control dependence so far.
>
> That said, what about those "unordered" stmts?  I suppose
> dependence analysis still happily computes a dependence
> distance but in reality we'd have to consider both execution
> orders?
As explained, there is no need to consider both orders.  GCC doesn't
really support control dependence parallelization?

Thanks,
bin
>
> Thanks,
> Richard.
>
>
>>
>> Thanks,
>> bin
>> 2017-06-07  Bin Cheng  
>>
>> * tree-loop-distribution.c (bb_top_order_index): New.
>> (bb_top_order_index_size, bb_top_order_cmp): New.
>> (stmts_from_loop): Use topological order.
>> (pass_loop_distribution::execute): Compute topological order for.
>> basic blocks.


Re: [PATCH GCC][04/13]Sort statements in topological order for loop distribution

2017-06-13 Thread Richard Biener
On Mon, Jun 12, 2017 at 7:02 PM, Bin Cheng  wrote:
> Hi,
> During the work I ran into a latent bug for distributing.  For the moment we 
> sort statements
> in dominance order, but that's not enough because basic blocks may be sorted 
> in reverse order
> of execution flow.  This results in wrong data dependence direction later.  
> This patch fixes
> the issue by sorting in topological order.
>
> Bootstrap and test on x86_64 and AArch64.  Is it OK?

I suppose you are fixing

static int
pg_add_dependence_edges (struct graph *rdg, vec loops, int dir,
 vec drs1,
 vec drs2)
{
...
/* Re-shuffle data-refs to be in dominator order.  */
if (rdg_vertex_for_stmt (rdg, DR_STMT (dr1))
> rdg_vertex_for_stmt (rdg, DR_STMT (dr2)))
  {
std::swap (dr1, dr2);
this_dir = -this_dir;
  }

but then for stmts that are not "ordered" by RPO or DOM like

   if (flag)
 ... = dr1;
   else
 ... = dr2;

this doesn't avoid spurious swaps?  Also the code was basically
copied from tree-data-refs.c:find_data_references_in_loop which
does iterate over get_loop_body_in_dom_order as well.  So isn't the
issue latent there as well?

That said, what about those "unordered" stmts?  I suppose
dependence analysis still happily computes a dependence
distance but in reality we'd have to consider both execution
orders?

Thanks,
Richard.


>
> Thanks,
> bin
> 2017-06-07  Bin Cheng  
>
> * tree-loop-distribution.c (bb_top_order_index): New.
> (bb_top_order_index_size, bb_top_order_cmp): New.
> (stmts_from_loop): Use topological order.
> (pass_loop_distribution::execute): Compute topological order for.
> basic blocks.


[PATCH GCC][04/13]Sort statements in topological order for loop distribution

2017-06-12 Thread Bin Cheng
Hi,
During the work I ran into a latent bug for distributing.  For the moment we 
sort statements
in dominance order, but that's not enough because basic blocks may be sorted in 
reverse order
of execution flow.  This results in wrong data dependence direction later.  
This patch fixes
the issue by sorting in topological order.

Bootstrap and test on x86_64 and AArch64.  Is it OK?

Thanks,
bin
2017-06-07  Bin Cheng  

* tree-loop-distribution.c (bb_top_order_index): New.
(bb_top_order_index_size, bb_top_order_cmp): New.
(stmts_from_loop): Use topological order.
(pass_loop_distribution::execute): Compute topological order for.
basic blocks.From 4bb233239e080eca956b3db7836cdf64da486dbf Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Wed, 7 Jun 2017 13:47:52 +0100
Subject: [PATCH 04/14] sort-stmts-in-top-order-20170607.txt

---
 gcc/tree-loop-distribution.c | 58 +++-
 1 file changed, 52 insertions(+), 6 deletions(-)

diff --git a/gcc/tree-loop-distribution.c b/gcc/tree-loop-distribution.c
index b0b9d66..a32253c 100644
--- a/gcc/tree-loop-distribution.c
+++ b/gcc/tree-loop-distribution.c
@@ -373,16 +373,39 @@ create_rdg_vertices (struct graph *rdg, vec 
stmts, loop_p loop,
   return true;
 }
 
-/* Initialize STMTS with all the statements of LOOP.  The order in
-   which we discover statements is important as
-   generate_loops_for_partition is using the same traversal for
-   identifying statements in loop copies.  */
+/* Array mapping basic block's index to its topological order.  */
+static int *bb_top_order_index;
+/* And size of the array.  */
+static int bb_top_order_index_size;
+
+/* If X has a smaller topological sort number than Y, returns -1;
+   if greater, returns 1.  */
+
+static int
+bb_top_order_cmp (const void *x, const void *y)
+{
+  basic_block bb1 = *(const basic_block *) x;
+  basic_block bb2 = *(const basic_block *) y;
+
+  gcc_assert (bb1->index < bb_top_order_index_size
+ && bb2->index < bb_top_order_index_size);
+  gcc_assert (bb1 == bb2
+ || bb_top_order_index[bb1->index]
+!= bb_top_order_index[bb2->index]);
+
+  return (bb_top_order_index[bb1->index] - bb_top_order_index[bb2->index]);
+}
+
+/* Initialize STMTS with all the statements of LOOP.  We use topological
+   order to discover all statements.  The order is important because
+   generate_loops_for_partition is using the same traversal for identifying
+   statements in loop copies.  */
 
 static void
 stmts_from_loop (struct loop *loop, vec *stmts)
 {
   unsigned int i;
-  basic_block *bbs = get_loop_body_in_dom_order (loop);
+  basic_block *bbs = get_loop_body_in_custom_order (loop, bb_top_order_cmp);
 
   for (i = 0; i < loop->num_nodes; i++)
 {
@@ -1764,6 +1787,22 @@ pass_loop_distribution::execute (function *fun)
   if (number_of_loops (fun) <= 1)
 return 0;
 
+  /* Compute topological order for basic blocks.  Topological order is
+ needed because data dependence is computed for data references in
+ lexicographical order.  */
+  if (bb_top_order_index == NULL)
+{
+  int *rpo = XNEWVEC (int, last_basic_block_for_fn (cfun));
+
+  bb_top_order_index = XNEWVEC (int, last_basic_block_for_fn (cfun));
+  bb_top_order_index_size
+   = pre_and_rev_post_order_compute_fn (cfun, NULL, rpo, true);
+  for (int i = 0; i < bb_top_order_index_size; i++)
+   bb_top_order_index[rpo[i]] = i;
+
+  free (rpo);
+}
+
   FOR_ALL_BB_FN (bb, fun)
 {
   gimple_stmt_iterator gsi;
@@ -1881,13 +1920,20 @@ out:
   if (cd)
 delete cd;
 
+  if (bb_top_order_index != NULL)
+{
+  free (bb_top_order_index);
+  bb_top_order_index = NULL;
+  bb_top_order_index_size = 0;
+}
+
   if (changed)
 {
   /* Destroy loop bodies that could not be reused.  Do this late as we
 otherwise can end up refering to stale data in control dependences.  */
   unsigned i;
   FOR_EACH_VEC_ELT (loops_to_be_destroyed, i, loop)
- destroy_loop (loop);
+   destroy_loop (loop);
 
   /* Cached scalar evolutions now may refer to wrong or non-existing
 loops.  */
-- 
1.9.1