On Wed, Apr 11, 2012 at 5:34 PM, Andi Kleen <a...@firstfloor.org> wrote:
> Richard Guenther <richard.guent...@gmail.com> writes:
>>
>> 5% is not moderate.  Your patch does enable unrolling at -O2 but not -O3,
>> why? Why do you disable register renaming?  check_imull requires a function
>> comment.
>>
>> This completely looks like a hack for EEMBC2.0, so it's definitely not ok.
>>
>> -O2 is not supposed to give best benchmark results.
>
> Besides, it is against the Intel Optimization Manual recommendation
> to prefer small code on Atom to avoid falling out of the predecode hints
> in the cache.

Yes, this is a well-known concern for Atom. But at the same time,
unrolling can help a lot on in-order machines because it gives the
compiler's scheduler more opportunities. And experiments showed that
unrolling can really help.
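
To illustrate the scheduling point, here is a minimal hand-written sketch.
It is not taken from the patch and is not necessarily what GCC's unroller
emits; the function names and the unroll factor of 4 are assumptions chosen
for illustration. The unrolled body contains four independent load/add
pairs, so a scheduler has more instructions to interleave than in the
rolled loop, where each add depends on the previous one.

```c
#include <stddef.h>

/* Rolled loop: every add depends on the result of the previous add,
 * so there is little for an instruction scheduler to reorder. */
int sum_rolled(const int *a, size_t n)
{
    int s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hand-unrolled by 4 (illustrative factor) with separate accumulators.
 * The four load/add pairs in the body are independent of each other,
 * which gives a scheduler more freedom on an in-order core.
 * The price is a larger loop body plus a remainder loop. */
int sum_unrolled4(const int *a, size_t n)
{
    int s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }

    /* Remainder: handle the last n % 4 elements. */
    for (; i < n; i++)
        s0 += a[i];

    return s0 + s1 + s2 + s3;
}
```

The extra code size of the second version is exactly the trade-off against
the small-code recommendation discussed above.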

>
> So this would need much more benchmarking on macro workloads first, at least.

Like what, for example? I believe that in this case everything also
strongly depends on the test's usage model (e.g. it is usually compiled
with -Os, not -O2) and, let's say, on the test's internal structure:
whether it has hot loops that are suitable for unrolling.

>
> -Andi
>
> --
> a...@linux.intel.com -- Speaking for myself only
