On 2017-02-05 00:34:16 +0200, Martin Storsjö wrote:
> On Sat, 4 Feb 2017, Janne Grunau wrote:
> 
> >I'm not really sure which variant I prefer. Is the speed difference
> >measurable for idct heavy real world samples? If you have a preference for one
> >or the other variant I trust your judgement.
> 
> It's measurable, but it's not much. For one sample, I originally got a full
> decode time like this (fastest time out of 2 runs) with the current master:
> user    2m53.980s
> Alternative 1:
> user    2m53.448s
> Alternative 2:
> user    2m52.952s

What is the approximate share of the idct in the whole decoding time?
 
> So alternative 2 is better, but produces a couple KB bigger binaries, and
> more duplicated code. (OTOH also allowing more exact special casing of minor
> details.)
> 
> I originally clearly preferred alt 2, but with your suggestions for alt 1,
> the diff for that one ends up very small and neat.

I think the numbers look pretty compelling for alternative 2: roughly 1s
vs. 0.5s overall decoding speedup over current master. The difference is
larger than I expected and imo justifies the code duplication and
increased binary size. While the patch for alternative 1 looks small and
nice, that's not really an argument. The patch for alternative 2 would
also look nicer if you did the macro move in a separate patch.
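
For reference, the deltas can be recomputed from the user times quoted
above. This is only a minimal Python sketch; the helper name
parse_user_time is made up for illustration and is not part of any tool,
and the percentages are relative to the whole decode, not to the idct
alone.

```python
def parse_user_time(text):
    """Convert a `time`-style duration such as '2m53.980s' to seconds."""
    minutes, seconds = text.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)

# User times quoted in this thread (fastest of 2 runs each).
master = parse_user_time("2m53.980s")
alt1 = parse_user_time("2m53.448s")
alt2 = parse_user_time("2m52.952s")

for name, t in (("Alternative 1", alt1), ("Alternative 2", alt2)):
    saved = master - t
    print(f"{name}: {saved:.3f}s saved ({100 * saved / master:.2f}% of master)")
```

With only two runs per variant, differences this small are close to what
run-to-run noise can produce, so more runs would make the comparison
firmer.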

Janne
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
