On 24/01/13 22:30, Daniel Kang wrote:
>>>> Depending on the architecture (??) the functions are inlined, but are
>>>> often not. I suspect GCC's insane method of reordering registers
>>>> swallows any overhead from calling these functions, but due to macro
>>>> hell, I'm not sure of the best way to test this.
>>>
>>> Sorry, this was not very clear. I think the yasm version is faster
>>> despite calling overhead, because GCC uses some ridiculous method of
>>> reordering registers for the inline assembly.
>>
>> Do you have numbers?
> 
> Here's an example:
> 
> yasm (put_qpel16_mc21):
> 8285
> 8333
> 8278
> 8347
> 8273
> AVG: 8303.2
> 
> inline (put_qpel16_mc21):
> 8505
> 8424
> 8295
> 8400
> 8461
> AVG: 8417

While monitoring patches we got similar results: we definitely
over-inline, and/or gcc is too optimistic about the data cache.
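
To put the quoted numbers in perspective, here is a minimal sketch that
recomputes the averages and the relative difference. It assumes the
figures are timer readings where lower is better (the units are not
stated in the thread); the variable names are made up for illustration.

```python
from statistics import mean

# Timings quoted above for put_qpel16_mc21 (units as reported, assumed lower is better).
yasm = [8285, 8333, 8278, 8347, 8273]
inline = [8505, 8424, 8295, 8400, 8461]

yasm_avg = mean(yasm)
inline_avg = mean(inline)

# Relative saving of the yasm version with respect to the inline version.
saving_pct = 100 * (inline_avg - yasm_avg) / inline_avg

# Spread of each series, as a rough indication of run-to-run noise.
yasm_spread = max(yasm) - min(yasm)
inline_spread = max(inline) - min(inline)

print(f"yasm avg:   {yasm_avg:.1f} (spread {yasm_spread})")
print(f"inline avg: {inline_avg:.1f} (spread {inline_spread})")
print(f"yasm saves {saving_pct:.2f}% on average")
```

With only five runs per variant the gap is small relative to the spread
of the inline runs, so more samples would be needed before reading much
into it.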

http://blog.flameeyes.eu/2013/01/postmortem-of-a-patch-or-how-do-you-find-what-changed

The post above has some information about the tools in use; it could
probably go into the developer documentation sooner or later.

(yes, from time to time I do benchmark stuff I consider interesting)

lu

_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
