On Mon, Aug 11, 2014 at 3:35 AM, Zhenqiang Chen
<zhenqiang.c...@linaro.org> wrote:
> On 8 August 2014 23:22, Ramana Radhakrishnan <ramana....@googlemail.com> 
> wrote:
>> On Tue, Aug 5, 2014 at 10:31 AM, Zhenqiang Chen
>> <zhenqiang.c...@linaro.org> wrote:
>>> Hi,
>>>
>>> For some large constants, ARM will split them during expanding, which
>>> makes impossible to hoist them out the loop or shared by different
>>> references (refer the test case in the patch).
>>>
>>> The patch keeps some constants in registers. If the constant can not
>>> be optimized, the cprop and combine passes can optimize them as what
>>> we do in current expand pass with
>>>
>>> define_insn_and_split "*arm_subsi3_insn"
>>> define_insn_and_split "*arm_andsi3_insn"
>>> define_insn_and_split "*iorsi3_insn"
>>> define_insn_and_split "*arm_xorsi3"
>>>
>>> The patch does not modify addsi3 since the define_insn_and_split
>>> "*arm_addsi3" is only valid when (reload_completed ||
>>> !arm_eliminable_register (operands[1])). The cprop and combine passes
>>> can not optimize the large constant if we put it in register, which
>>> will lead to regression.
>>>
>>> For logic operators, the patch skips changes for constants:
>>>
>>> INTVAL (operands[2]) < 0 && const_ok_for_arm (-INTVAL (operands[2])
>>>
>>> since expand pass always uses "sign-extend" to get the value
>>> (trunc_int_for_mode called from immed_wide_int_const) for rtl, and
>>> logs show most negative values are UNSIGNED when they are TREE node.
>>> And combine pass is smart enough to recover the negative value to
>>> positive value.
>>
>> I am unable to verify any change in code generation for this testcase
>> with and without the patch when I had a play with the patch.
>>
>> what gives ?
>
> Thanks for trying the patch.
>
> Do you add option -fno-gcse which is mentioned in dg-options " -O2
> -fno-gcse "? Without it, there is no change for the testcase since
> cprop pass will propagate the constant to AND expr (A patch to enhance
> cprop pass was discussed at
> https://gcc.gnu.org/ml/gcc-patches/2014-06/msg01321.html).

Probably not and I can now see the difference in code generated for
Thumb state. Why is it that in ARM state with -mcpu=cortex-a15 we see
the hoisting of the constant without your patch with -fno-gcse ?

So, the patch improves code generation for -mcpu=cortex-a15 -mthumb
-fno-gcse for the given testcase ?

>
> In addition, if the constant can not be hoisted out the loop, later
> combine pass can also optimize it. These (cprop and combine) are
> reasons why the patch itself has little impact on current tests.

Does this mean you need the referred to patch to be useful as a
pre-requisite ? I fail to  understand why this patch needs to go in if
it makes no difference without disabling GCSE. I cannot see -fno-gcse
being used by default for performant code.


regards
Ramana

Reply via email to