On Wed, Apr 11, 2012 at 5:34 PM, Andi Kleen <a...@firstfloor.org> wrote:
> Richard Guenther <richard.guent...@gmail.com> writes:
>> 5% is not moderate. Your patch does enable unrolling at -O2 but not -O3,
>> why? Why do you disable register renaming? check_imull requires a function
>> This completely looks like a hack for EEMBC2.0, so it's definitely not ok.
>> -O2 is not supposed to give best benchmark results.
> Besides it is against the Intel Optimization Manual recommendation
> to prefer small code on Atom to avoid falling out of the predecode hints
> in the cache.
Yes, this is a well-known concern for Atom. But at the same time,
unrolling can help a lot on in-order machines because it gives the
compiler's scheduler more opportunities. And experiments showed that
unrolling really does help.
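To make the scheduling argument concrete, here is a minimal hand-written
sketch in C. It is only an illustration, not what GCC emits for any
particular flag, and the names sum_rolled and sum_unrolled4 are
hypothetical. In the rolled loop every add depends on the previous one,
so an in-order core has nothing else to issue while it waits. The
unrolled body contains four independent adds that a scheduler is free to
interleave, at the cost of larger code, which is exactly the Atom
concern raised above.

```c
#include <stddef.h>

/* Baseline: one add per iteration, each depending on the previous sum. */
long sum_rolled(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Hand-unrolled by 4 with separate accumulators (illustrative only).
 * The four adds in the body do not depend on each other, so a
 * scheduler can overlap them on an in-order pipeline. */
long sum_unrolled4(const int *a, size_t n)
{
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    /* Main body: process four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }

    /* Remainder loop: handle the last n % 4 elements. */
    for (; i < n; i++)
        s0 += a[i];

    return s0 + s1 + s2 + s3;
}
```

Both functions return the same result for any input. Whether the
unrolled form is actually faster depends on the loop and on how much the
extra code size costs in the instruction cache.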
> So would need much more benchmarking on macro workloads first at least.
Like what, for example? I believe that in this case everything also
strongly depends on the test's usage model (e.g. it is usually compiled
with -Os, not -O2) and, let's say, the internal test structure: whether
there are hot loops that are suitable for unrolling.
> a...@linux.intel.com -- Speaking for myself only