On Tue, May 13, 2014 at 4:59 PM, Richard Biener
<richard.guent...@gmail.com> wrote:
> On Sun, May 11, 2014 at 2:49 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>> On Thu, May 8, 2014 at 5:08 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>>> On Tue, May 6, 2014 at 6:44 PM, Richard Biener
>>> <richard.guent...@gmail.com> wrote:
>>>> On Tue, May 6, 2014 at 10:39 AM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>>>>> On Fri, Dec 6, 2013 at 6:19 PM, Richard Biener
>>>>> <richard.guent...@gmail.com> wrote:
>>>
>>>>> Hi,
>>>>> I split the patch into two and updated the test case.
>>>>> The patches pass bootstrap/tests on x86/x86_64, also pass test on arm 
>>>>> cortex-m3.
>>>>> Is it OK?
>>>>>
>>>>> Thanks,
>>>>> bin
>>>>>
>>>>> PATCH 1:
>>>>>
>>>>> 2014-05-06  Bin Cheng  <bin.ch...@arm.com>
>>>>>
>>>>>     * gcc/tree-affine.c (tree_to_aff_combination): Handle MEM_REF for
>>>>>     core part of address expressions.
>>>>
>>>> No gcc/ in the changelog
>>>>
>>>> Simplify that by using aff_combination_add_cst:
>>>>
>>>> +      if (TREE_CODE (core) == MEM_REF)
>>>> +       {
>>>> +         aff_combination_add_cst (comb, mem_ref_offset (core));
>>>> +         core = TREE_OPERAND (core, 0);
>>>>
>>>> patch 1 is ok with that change.
>>>
>>> Installed with below change because of wide-int merge:
>>> -      core = build_fold_addr_expr (core);
>>> +      if (TREE_CODE (core) == MEM_REF)
>>> +    {
>>> +      aff_combination_add_cst (comb, wi::to_widest (TREE_OPERAND (core, 
>>> 1)));
>>> +      core = TREE_OPERAND (core, 0);
>>>
>>>>
>>>>> PATCH 2:
>>>>>
>>>>> 2014-05-06  Bin Cheng  <bin.ch...@arm.com>
>>>>>
>>>>>     * gcc/tree-ssa-loop-ivopts.c (contain_complex_addr_expr): New.
>>>>>     (alloc_iv): Lower base expressions containing ADDR_EXPR.
>>>>
>>>> So this "lowers" addresses(?) which are based on &<not-decl>,
>>>> like &a[0] + 4 but not &a + 4.  I question this odd choice.  ISTR
>>> &a+4 is already in its lowered form, what we want to handle is &EXPR +
>>> 4, in which EXPR is MEM_REF/ARRAY_REF, etc..
>>>
>>>> when originally introducing address lowering (in rev. 204497) I was
>>>> concerned about the cost?
>>> Yes, you did.  I still think the iv number is relative small for each
>>> loop, thus the change won't cause compilation time issue considering
>>> the existing tree<->affine transformation in ivopt.
>>> I would like to collect more accurate time information for ivopt in
>>> gcc bootstrap.  Should I use "-ftime-report" then add all time slices
>>> in TV_TREE_LOOP_IVOPTS category?  Is there any better solutions?
>>> Thanks.
>>>
>>>>
>>>> Now I wonder why we bother to convert the lowered form back
>>>> to trees to store it in ->base and not simply keep (and always
>>>> compute) the affine expansion only.  Thus, change struct iv
>>>> to have aff_tree base instead of tree base?
>>>>
>>>> Can you see what it takes to do such change?  At the point
>>>> we replace uses we go into affine representation again anyway.
>>> Good idea, I may have a look.
>> After going through work flow of ivopt, I think it's practical to make
>> such change and can help compilation time.  Ivopt calls
>> tree_to_aff_combination to convert iv->base into affine_tree whenever
>> it tries to represent an iv_use by an iv_cand,  i.e., N*M times for
>> computing cost of each iv_use represented by each iv_cand, and M times
>> for rewriting each iv_use.  Here, N and M stands for number of iv_use
>> and iv_cand.  Also strip_offset (which is a recursive function) is
>> called for iv->base also at complexity of O(NM), so it may be improved
>> too.
>> To make a start, it's possible to store both tree and affine_tree of
>> base in struct iv, and use either of them whenever needed.
>>
>> It seems to me there are two potential issues of this change.  First,
>> though affine_tree of base is stored, we can't operate on it directly,
>> which means we have to copy it before using it.
>
> You'd use it as unchanged operand to some aff_* function to avoid
> actual copying of course.
>
>>  Second, ivopt
>> sometimes computes affine_tree of base in different type other than
>> TREE_TYPE(base), we may need to do additional calls to
>> aff_combination_convert to get wanted type.  All these will introduce
>> additional computation and compromise benefit of the change.
>
> Sure.
>
>> At last, I did some experiments on how many additional calls to
>> tree_to_aff_combination are introduced by this patch.  The data of
>> bootstrap shows it's less than 3% comparing to calls made by current
>> implementation of ivopt, which confirms that this patch shouldn't have
>> issue of compilation time.
>>
>> Any comments?
>
> I'd see the use of affines as making the algorithmic structure of
> IVOPTs clearer and more consistent (also with possibly using
> the _expand variants later to get even more simplifications).
Yes, one example, it's possible to rewrite iv uses in a loop-invariant
sensitive manner if we can use affine in the whole process.  Currently
fold_tree routines tend to decompose invariant into different
expressions.

>
> I don't want to force you to do this change but I thought it may
> help further changes (as you seem to be doing a lot of IVOPTs
> changes lately).
>
> So - the patch is ok as-is then, but please consider refactoring
Patch installed.

> IVOPTs code when you do changes.  It's current structure
> is somewhat awkward.
Understood.  I will continue to look at it when have proper time.

Thanks,
bin



-- 
Best Regards.

Reply via email to