On Tue, Oct 29, 2013 at 10:18 AM, bin.cheng <bin.ch...@arm.com> wrote: > Hi, > I noticed that IVOPT generates complex address expressions like below for iv > base. > &arr_base[0].y > &arr[0] > &MEM[p+o] > It's even worse for targets support auto-increment addressing mode because > IVOPT adjusts such base expression with +/- step, then creates below: > &arr_base[0].y +/- step > &arr[0] +/- step > &MEM[p+o] +/- step > It has two disadvantages: > 1) Cost computation in IVOPT can't handle complex address expression and > general returns spill_cost for it, which is bad since address iv is > important to IVOPT. > 2) IVOPT creates duplicate candidates for IVs which have same value in > different forms, for example, two candidates are generated with each for > "&a[0]" and "&a". Again, it's even worse for auto-increment addressing > mode. > > This patch fixes the issue by simplifying address expression at the entry of > allocating IV struct. Maybe the simplification can be put in various fold* > functions but I think it might be better in this way, because: > 1) fold* functions are used from front-end to various tree optimizations, > the simplified address expressions may not be what each optimizer wanted. > Think about parallelism related passes, they might want the array index > information kept for further analysis. > 2) In some way, the simplification is conflict with current implementation > of fold* function. Take fold_binary_loc as an example, it tries to simplify > "&a[i1] +p c* i2" into "&a[i1+i2]". Of course we can simplify in this way > for IVOPT too, but that will cause new problems like: a) we have to add code > in IVOPT to cope with complex ARRAY_REF which is the exactly thing we want > to avoid; b) the simplification can't always be done because of the > sign/unsigned offset problem (especially for auto-increment addressing > mode). > 3) There are many entry point for fold* functions, the change will be > non-trivial. > 4) The simplification is only done in alloc_iv for true (not duplicate ones) > iv struct, the number of such iv should be moderate. > > With these points, I think it might be a win to do the simplification in > IVOPT and create a kind of sand box to let IVOPT play. Any suggestions? > > Bootstrap and tested on x86/x86_64/arm. > The patch causes three cases failed on some target, but all of them are > false alarm, which can be resolved by refining test cases to check more > accurate information. > > Is it OK?
Hmm. I think you want what get_inner_reference_aff does, using the return value of get_inner_reference as starting point for determine_base_object. And you don't want to restrict yourselves so much on what exprs to process, but only exclude DECL_Ps. Just amend get_inner_reference_aff to return the tree base object. Note that this isn't really "simplifying" but rather lowering all addresses. The other changes in this patch are unrelated, right? Thanks, Richard. > Thanks. > bin > > gcc/testsuite/ChangeLog > 2013-10-29 Bin Cheng <bin.ch...@arm.com> > > * gcc.dg/tree-ssa/loop-2.c: Refine check condition. > * gcc.dg/tree-ssa/ivopt_infer_2.c: Ditto. > * gcc.dg/tree-ssa/ivopt_mult_3.c: Ditto. > > 2013-10-29 Bin Cheng <bin.ch...@arm.com> > > * tree-ssa-loop-ivopts.c (alloc_iv): Simplify base expression. > (get_shiftadd_cost): Check equality using operand_equal_p. > (force_expr_to_var_cost): Refactor the code. Handle type > conversion. > (split_address_cost): Call force_expr_to_var_cost.