On Fri, Jan 17, 2014 at 7:11 AM, Uros Bizjak <ubiz...@gmail.com> wrote:
> On Fri, Jan 17, 2014 at 3:46 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
>
>>>> ix86_split_lea_for_addr transforms a single LEA instruction into a series
>>>> of MOV and ADD instructions.  For
>>>>
>>>> lea 0x400(%eax, %ecx, 8), %edx
>>>>
>>>> we get
>>>>
>>>> mov %eax, %edx
>>>> add %ecx, %edx
>>>> add %ecx, %edx
>>>> add %ecx, %edx
>>>> add %ecx, %edx
>>>> add %ecx, %edx
>>>> add %ecx, %edx
>>>> add %ecx, %edx
>>>> add %ecx, %edx
>>>> add $0x400, %edx
>>>>
>>>> For -mtune=intel, we want to turn on X86_TUNE_OPT_AGU, but avoid
>>>> ix86_split_lea_for_addr.  This patch adds X86_TUNE_AVOID_LEA_FOR_ADDR
>>>> and PROCESSOR_INTEL.  We keep PROCESSOR_INTEL the same as
>>>> PROCESSOR_SILVERMONT, except that X86_TUNE_AVOID_LEA_FOR_ADDR isn't
>>>> turned on for PROCESSOR_INTEL.  OK for trunk?
>>>
>>> As said earlier, m_INTEL is not a processor, but equals a REAL
>>> processor, so the patch is not acceptable.
>>>
>>
>> -mtune=intel, similar to -mtune=generic, isn't equal to a single processor.
>> From invoke.texi:
>>
>> ---
>> @item intel
>> Produce code optimized for the most current Intel processors, which are
>> Haswell and Silvermont for this version of GCC.
>> ---
>>
>> We don't want -mtune=intel to define __tune_silvermont__ and we
>> want to generate balanced code for Haswell and Silvermont.
>> -mtune=intel started as -mtune=silvermont.  I am working on incremental
>> changes like this to better tune for Haswell without significantly impacting
>> Silvermont.
>
> OK, this clarifies the situation.
>
> So, -mtune=generic is too broad, and -mtune=intel is needed, as a
> generic tuning for latest Intel processors (note the plural). We want
> tuning options that cover Haswell and Silvermont for this version, but
> not something that degrades runtime too much (or unnecessarily
> increases code size too much).

Yes, that is correct.

> If this is the case, I agree with the approach.

I will check it in.

> BTW: There are some ix86_tune == XXX conditions scattered throughout
> LEA handling code. Can these be substituted with appropriate TARGET_*
> defines?

I have been looking at them closely to check their impact on both
Haswell and Silvermont.  I am planning to keep the simple LEA -> ADD
transformation, but avoid the complex LEA -> ADD/MOV/SHL transformation.
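
For reference, here is a minimal C sketch of the arithmetic behind the
split quoted at the top of this thread.  It is not GCC code: the function
names are made up for illustration, and it only models the 32-bit result
of the address computation, not flags or timing.  It shows that the MOV
plus repeated ADD sequence produces the same value as the single LEA with
scale 8, which is why the split is correct but long.

```c
#include <assert.h>
#include <stdint.h>

/* Value computed by `lea disp(base, index, 8), dst` (hypothetical helper).
   Unsigned arithmetic wraps modulo 2^32, like the 32-bit register. */
static uint32_t lea_scale8(uint32_t base, uint32_t index, uint32_t disp)
{
    return base + index * 8u + disp;
}

/* The split form: one MOV, eight ADDs of the index, one ADD of the
   displacement (hypothetical helper mirroring the quoted sequence). */
static uint32_t split_scale8(uint32_t base, uint32_t index, uint32_t disp)
{
    uint32_t dst = base;          /* mov %eax, %edx */
    for (int i = 0; i < 8; i++)
        dst += index;             /* add %ecx, %edx, repeated 8 times */
    dst += disp;                  /* add $0x400, %edx */
    return dst;
}

/* Compare both forms on a few inputs, including ones that wrap around. */
static void check_split_matches_lea(void)
{
    const uint32_t samples[] = { 0u, 1u, 3u, 0x1000u, 0x7fffffffu, 0xffffffffu };
    const int n = (int)(sizeof samples / sizeof samples[0]);

    for (int b = 0; b < n; b++)
        for (int i = 0; i < n; i++)
            assert(lea_scale8(samples[b], samples[i], 0x400u)
                   == split_scale8(samples[b], samples[i], 0x400u));
}
```

The split form needs ten instructions where the LEA needs one, so the
"complex" split costs code size even when each instruction is cheap.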

Thanks.

-- 
H.J.
