Hello All,

With the code given below, I expected the ppc compiler (e500mc v4.6.2)
to generate a 'memset' zero call for the loop initialization (at '-O3'),
but it generates a loop.

Case:1

int a[18], b[18];
foo () {
   int i;

   for (i=0; i < 18; i++)
      a[i] = 0;
}
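
To make the expectation concrete, here is a minimal sketch of the two forms I consider equivalent. The names foo_loop, foo_memset, same_result and the two separate arrays are introduced only for this illustration; they are not part of the original test case.

```c
#include <string.h>

#define N 18

int a_loop[N];    /* written by the explicit loop */
int a_memset[N];  /* written by the memset call */

/* The loop exactly as in case 1. */
void foo_loop (void)
{
  int i;

  for (i = 0; i < N; i++)
    a_loop[i] = 0;
}

/* The form I expected the compiler to emit for that loop. */
void foo_memset (void)
{
  memset (a_memset, 0, sizeof a_memset);
}

/* Returns 1 when both versions leave identical array contents. */
int same_result (void)
{
  /* Pre-fill with a nonzero byte pattern so the comparison is meaningful. */
  memset (a_loop, 0x5a, sizeof a_loop);
  memset (a_memset, 0x5a, sizeof a_memset);

  foo_loop ();
  foo_memset ();

  return memcmp (a_loop, a_memset, sizeof a_loop) == 0;
}
```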

However, with the test case shown below (taken from the gcc doc for the
'-ftree-loop-distribute-patterns' flag), the compiler does generate a
'memset' zero call.

Case:2

int a[18], b[18];
foo () {
   int i;

   for (i=0; i < 18; i++) {
      a[i] = 0;               /* (A) */
      b[i] = a[i] + i;        /* (B) */
   }
}

Here statements (A) and (B) are split into two loops, and for the first
loop the compiler generates a 'memset' zero call. Isn't the same
optimization supposed to happen with case (1)?
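
Written out by hand, the distribution described above would look roughly like the sketch below. This is my reading of the transformation, not compiler output; foo_fused and foo_distributed are names made up for the illustration.

```c
#include <string.h>

#define N 18

int a[N], b[N];

/* Case 2 as written: both statements in a single loop. */
void foo_fused (void)
{
  int i;

  for (i = 0; i < N; i++) {
    a[i] = 0;          /* (A) */
    b[i] = a[i] + i;   /* (B) */
  }
}

/* Hand-distributed form: (A) and (B) each get their own loop,
   and the loop for (A) is replaced by a memset call. */
void foo_distributed (void)
{
  int i;

  memset (a, 0, sizeof a);     /* former loop for (A) */

  for (i = 0; i < N; i++)      /* loop for (B) */
    b[i] = a[i] + i;
}
```

The split is safe here because (B) only reads the a[i] that (A) wrote in the same iteration, so running all of (A) first gives the same values.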

Also, for statement (A) in case (2), the compiler unrolls the loop when
the iteration count is < 18, and generates a 'memset' zero call when it
is >= 18.

Looking at the 'gcc/tree-loop-distribution.c' file:

static int
ldist_gen (struct loop *loop, struct graph *rdg,
           VEC (int, heap) *starting_vertices)
{
   ...
  BITMAP_FREE (processed);
  nbp = VEC_length (bitmap, partitions);

  if (nbp <= 1
      || partition_contains_all_rw (rdg, partitions))
    goto ldist_done;                                  /* (Z) */

  if (dump_file && (dump_flags & TDF_DETAILS))
    dump_rdg_partitions (dump_file, partitions);

  FOR_EACH_VEC_ELT (bitmap, partitions, i, partition)
    /* (Y): the code generating the built-in 'memset' is called from here.  */
    if (!generate_code_for_partition (loop, partition, i < nbp - 1))
      goto ldist_done;

  rewrite_into_loop_closed_ssa (NULL, TODO_update_ssa);
  update_ssa (TODO_update_ssa_only_virtuals | TODO_update_ssa);

 ldist_done:

  BITMAP_FREE (remaining_stmts);

  .........
  return nbp;
}

From the check at (Z), if the number of distributed loops is <= 1, the
code that generates the built-in function (Y) is not executed.
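
The effect of that guard can be modelled in isolation. The sketch below is a toy model, not GCC code: reaches_codegen is a hypothetical name, nbp stands for the partition count, and contains_all_rw stands in for the result of partition_contains_all_rw. I am assuming case 1 yields a single partition and case 2 yields two, which matches the observed behaviour but which I have not confirmed from a dump.

```c
#include <stdbool.h>

/* Toy model of the check at (Z).  Returns true when control would
   fall through to the code generation loop at (Y).  */
bool reaches_codegen (int nbp, bool contains_all_rw)
{
  if (nbp <= 1 || contains_all_rw)
    return false;   /* corresponds to 'goto ldist_done' */

  return true;      /* continues on to (Y) */
}
```

Under that assumption, reaches_codegen (1, false) is false for case 1, so the built-in is never considered, while reaches_codegen (2, false) is true for case 2.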

Would it be a good solution to update this conditional check so that a
single loop (which is not split) is also handled? Or is there another
place/pass where we can implement this?

Regards,
Rohit
