On Fri, Jan 3, 2014 at 2:43 PM, Jakub Jelinek <ja...@redhat.com> wrote:
> On Fri, Jan 03, 2014 at 02:35:36PM +0100, Uros Bizjak wrote:
>> Like in the patch below. Please note, that the block_tune setting for
>> the nocona is wrong, -march=native on my trusted old P4 returns:
>>
>> --param "l1-cache-size=16" --param "l1-cache-line-size=64" --param
>> "l2-cache-size=2048" "-mtune=nocona"
>>
>> which is consistent with the above quote from manual.
>>
>> 2014-01-02  Uros Bizjak  <ubiz...@gmail.com>
>>
>>     * config/i386/i386.c (ix86_data_alignment): Calculate max_align
>>     from prefetch_block tune setting.
>>     (nocona_cost): Correct size of prefetch block to 64.
>>
>> The patch was bootstrapped on x86_64-pc-linux-gnu and is currently in
>> regression testing. If there are no comments, I will commit it to
>> mainline and release branches after a couple of days.
>
> That still has the effect of not aligning (for most tunings) 32 to 63 bytes
> long aggregates to 32 bytes, while previously they were aligned.  Forcing
> aligning 32 byte long aggregates to 64 bytes would be overkill, 32 byte
> alignment is just fine for those (and ensures it never crosses 64 byte
> boundary), for 33 to 63 bytes perhaps using 64 bytes alignment wouldn't
> be that bad, just wouldn't match what we have done before.

Please note that the previous value was based on an earlier (pre-P4)
recommendation and was appropriate for older chips with a 32-byte
cache line. The value should have been updated long ago, when 64-byte
cache lines were introduced, but this was probably missed because a
magic value was used without a comment.

Ah, I see. My patch deals only with structures larger than a cache
line. Following the part of the manual that says "As long as 16-byte
boundaries (and cache lines) are never crossed, natural alignment is
not strictly necessary (though it is an easy way to enforce this).",
we should align smaller structures to 16 or 32 bytes.
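
To make the rule concrete, here is a minimal sketch in C. It is not the
code from either patch and not the real ix86_data_alignment: the names
sketch_data_alignment and crosses_boundary are hypothetical, and the
size thresholds (16 and 32 bytes, then the cache line size) are an
assumption taken from the discussion above.

```c
/* Hypothetical helper, NOT GCC's ix86_data_alignment.  Returns an
   alignment in bytes for an aggregate of SIZE bytes, assuming the
   thresholds discussed in this thread.  CACHE_LINE stands in for the
   prefetch_block tune setting (64 on the P4 quoted above).  */
static unsigned
sketch_data_alignment (unsigned size, unsigned cache_line)
{
  if (size >= cache_line)
    return cache_line;	/* Large aggregates: align to the cache line.  */
  if (size >= 32)
    return 32;		/* 32..cache_line-1 bytes: 32-byte alignment.  */
  if (size >= 16)
    return 16;		/* 16..31 bytes: 16-byte alignment.  */
  return 1;		/* Smaller: leave to natural alignment.  */
}

/* Nonzero if an object of SIZE bytes (SIZE > 0) starting at ADDR
   spans a BOUNDARY-byte boundary.  */
static int
crosses_boundary (unsigned long addr, unsigned size, unsigned long boundary)
{
  return addr / boundary != (addr + size - 1) / boundary;
}
```

With a 64-byte line, a 32-byte aggregate aligned to 32 bytes never
crosses a line boundary, while a 33-byte aggregate placed at address 32
does, which is why 64-byte alignment was floated for the 33 to 63 byte
range.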

Yes, I agree. Can you please merge your patch with the proposed one?

Thanks,
Uros.
