On 2017-02-05 00:34:16 +0200, Martin Storsjö wrote:
> On Sat, 4 Feb 2017, Janne Grunau wrote:
>
> >I'm not really sure which variant I prefer. Is the speed difference
> >measurable for idct heavy real world samples? If you have a preference
> >for one or the other variant I trust your judgement.
>
> It's measurable, but it's not much. For one sample, I originally got a full
> decode time like this (fastest time out of 2 runs) with the current master:
> user 2m53.980s
> Alternative 1:
> user 2m53.448s
> Alternative 2:
> user 2m52.952s

What is the approximate share of the idct in the whole decoding time?

> So alternative 2 is better, but produces a couple KB bigger binaries, and
> more duplicated code. (OTOH also allowing more exact special casing of minor
> details.)
>
> I originally clearly preferred alt 2, but with your suggestions for alt 1,
> the diff for that one ends up very small and neat.

I think the numbers look pretty compelling for alternative 2: 1s vs. 0.5s
overall decoding speedup. The difference is larger than I expected and
imo justifies the code duplication and increased binary size. While the
patch for alternative 1 looks small and nice, that's not really an
argument. The patch for alternative 2 would also look nicer if you did
the macro move in a separate patch.

Janne
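
For reference, the "1s vs. 0.5s" figures can be rechecked from the quoted user times with a minimal sketch like the one below. The helper name to_seconds and the dictionary layout are illustrative assumptions, not part of any libav tooling; the only inputs are the three timings quoted in this thread.

```python
def to_seconds(stamp: str) -> float:
    """Convert a `time`-style value such as '2m53.980s' to seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


# User times quoted in the thread (fastest of 2 runs each).
runs = {
    "master": "2m53.980s",
    "alternative 1": "2m53.448s",
    "alternative 2": "2m52.952s",
}

baseline = to_seconds(runs["master"])
savings = {}
for name, stamp in runs.items():
    # Time saved relative to current master, in seconds.
    saved = baseline - to_seconds(stamp)
    savings[name] = saved
    print(f"{name}: saved {saved:.3f}s ({100 * saved / baseline:.2f}% of master)")
```

This only compares whole-decode user time; it says nothing about the share of time spent in the idct itself, which would need a profiler.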
