On Mon, Aug 11, 2014 at 3:35 AM, Zhenqiang Chen <zhenqiang.c...@linaro.org> wrote: > On 8 August 2014 23:22, Ramana Radhakrishnan <ramana....@googlemail.com> > wrote: >> On Tue, Aug 5, 2014 at 10:31 AM, Zhenqiang Chen >> <zhenqiang.c...@linaro.org> wrote: >>> Hi, >>> >>> For some large constants, ARM will split them during expanding, which >>> makes impossible to hoist them out the loop or shared by different >>> references (refer the test case in the patch). >>> >>> The patch keeps some constants in registers. If the constant can not >>> be optimized, the cprop and combine passes can optimize them as what >>> we do in current expand pass with >>> >>> define_insn_and_split "*arm_subsi3_insn" >>> define_insn_and_split "*arm_andsi3_insn" >>> define_insn_and_split "*iorsi3_insn" >>> define_insn_and_split "*arm_xorsi3" >>> >>> The patch does not modify addsi3 since the define_insn_and_split >>> "*arm_addsi3" is only valid when (reload_completed || >>> !arm_eliminable_register (operands[1])). The cprop and combine passes >>> can not optimize the large constant if we put it in register, which >>> will lead to regression. >>> >>> For logic operators, the patch skips changes for constants: >>> >>> INTVAL (operands[2]) < 0 && const_ok_for_arm (-INTVAL (operands[2]) >>> >>> since expand pass always uses "sign-extend" to get the value >>> (trunc_int_for_mode called from immed_wide_int_const) for rtl, and >>> logs show most negative values are UNSIGNED when they are TREE node. >>> And combine pass is smart enough to recover the negative value to >>> positive value. >> >> I am unable to verify any change in code generation for this testcase >> with and without the patch when I had a play with the patch. >> >> what gives ? > > Thanks for trying the patch. > > Do you add option -fno-gcse which is mentioned in dg-options " -O2 > -fno-gcse "? Without it, there is no change for the testcase since > cprop pass will propagate the constant to AND expr (A patch to enhance > cprop pass was discussed at > https://gcc.gnu.org/ml/gcc-patches/2014-06/msg01321.html).
Probably not and I can now see the difference in code generated for Thumb state. Why is it that in ARM state with -mcpu=cortex-a15 we see the hoisting of the constant without your patch with -fno-gcse ? So, the patch improves code generation for -mcpu=cortex-a15 -mthumb -fno-gcse for the given testcase ? > > In addition, if the constant can not be hoisted out the loop, later > combine pass can also optimize it. These (cprop and combine) are > reasons why the patch itself has little impact on current tests. Does this mean you need the referred to patch to be useful as a pre-requisite ? I fail to understand why this patch needs to go in if it makes no difference without disabling GCSE. I cannot see -fno-gcse being used by default for performant code. regards Ramana