Bin Chen wrote:
> On Tue, 2007-10-16 at 10:04 +0200, Denis Oliver Kropp wrote:
>> Bin Chen wrote:
>>> On Tue, 2007-10-16 at 09:23 +0200, Denis Oliver Kropp wrote:
>>>> Phil Endecott wrote:
>>>>> I've also noticed that djpeg runs about 15% faster if compiled with -Os
>>>>> rather than -O4.
>>>> On x86?
>>>>
>>>> On embedded architectures, -O2 is often better than -O3, but if you have
>>>> a very small instruction cache, -Os could be best.
>>> That's interesting. Why is -O2 better?
>> -O3 can produce code (loop unrolling etc.) where the cache penalty is
>> bigger than the speed improvement.
>>
> Thanks Phil. So what is loop unrolling? Is it expanding a loop into repeated
> machine code to reduce the penalty of jumping back?
Yes.
> Such as
>
> for (i = 0; i < 5; i++) {
>     do sth for i;
> }
>
> expand to
>
> do sth for 0
> do sth for 1
> ...
> do sth for 4
>
> The expanded code doesn't need to jump, so it can increase prefetch
> efficiency.
Fewer instructions are executed in total, but more instructions need to be read
from memory, which is where a small instruction cache hurts.
--
Best regards,
Denis Oliver Kropp
.------------------------------------------.
| DirectFB - Hardware accelerated graphics |
| http://www.directfb.org/ |
"------------------------------------------"