On Tue, Oct 29, 2013 at 10:18 AM, bin.cheng <bin.ch...@arm.com> wrote:
> Hi,
> I noticed that IVOPT generates complex address expressions like below for iv
> base.
>         &arr_base[0].y
>         &arr[0]
>         &MEM[p+o]
> It's even worse for targets support auto-increment addressing mode because
> IVOPT adjusts such base expression with +/- step, then creates below:
>         &arr_base[0].y +/- step
>         &arr[0] +/- step
>         &MEM[p+o] +/- step
> It has two disadvantages:
> 1) Cost computation in IVOPT can't handle complex address expression and
> general returns spill_cost for it, which is bad since address iv is
> important to IVOPT.
> 2) IVOPT creates duplicate candidates for IVs which have same value in
> different forms, for example, two candidates are generated with each for
> "&a[0]" and "&a".  Again, it's even worse for auto-increment addressing
> mode.
>
> This patch fixes the issue by simplifying address expression at the entry of
> allocating IV struct.  Maybe the simplification can be put in various fold*
> functions but I think it might be better in this way, because:
> 1) fold* functions are used from front-end to various tree optimizations,
> the simplified address expressions may not be what each optimizer wanted.
> Think about parallelism related passes, they might want the array index
> information kept for further analysis.
> 2) In some way, the simplification is conflict with current implementation
> of fold* function.  Take fold_binary_loc as an example, it tries to simplify
> "&a[i1] +p c* i2" into "&a[i1+i2]".  Of course we can simplify in this way
> for IVOPT too, but that will cause new problems like: a) we have to add code
> in IVOPT to cope with complex ARRAY_REF which is the exactly thing we want
> to avoid; b) the simplification can't always be done because of the
> sign/unsigned offset problem (especially for auto-increment addressing
> mode).
> 3) There are many entry point for fold* functions, the change will be
> non-trivial.
> 4) The simplification is only done in alloc_iv for true (not duplicate ones)
> iv struct, the number of such iv should be moderate.
>
> With these points, I think it might be a win to do the simplification in
> IVOPT and create a kind of sand box to let IVOPT play.  Any suggestions?
>
> Bootstrap and tested on x86/x86_64/arm.
> The patch causes three cases failed on some target, but all of them are
> false alarm, which can be resolved by refining test cases to check more
> accurate information.
>
> Is it OK?

Hmm.  I think you want what get_inner_reference_aff does, using
the return value of get_inner_reference as starting point for
determine_base_object.  And you don't want to restrict yourselves
so much on what exprs to process, but only exclude DECL_Ps.
Just amend get_inner_reference_aff to return the tree base object.

Note that this isn't really "simplifying" but rather lowering all
addresses.

The other changes in this patch are unrelated, right?

Thanks,
Richard.

> Thanks.
> bin
>
> gcc/testsuite/ChangeLog
> 2013-10-29  Bin Cheng  <bin.ch...@arm.com>
>
>         * gcc.dg/tree-ssa/loop-2.c: Refine check condition.
>         * gcc.dg/tree-ssa/ivopt_infer_2.c: Ditto.
>         * gcc.dg/tree-ssa/ivopt_mult_3.c: Ditto.
>
> 2013-10-29  Bin Cheng  <bin.ch...@arm.com>
>
>         * tree-ssa-loop-ivopts.c (alloc_iv): Simplify base expression.
>         (get_shiftadd_cost): Check equality using operand_equal_p.
>         (force_expr_to_var_cost): Refactor the code.  Handle type
>         conversion.
>         (split_address_cost): Call force_expr_to_var_cost.

Reply via email to