On Wed, 31 Jul 2013, Ben Avison wrote:

> On Wed, 31 Jul 2013 14:14:02 +0100, Hendrik Leppkes <[email protected]> wrote:
>> Did you measure the overhead from the extra call, without any special
>> asm enhanced versions?
>
> I was rather hoping nobody would ask that, to save me the trouble of
> having to go back and re-profile them. The truth is that I only split
> patches 2 and 3 (and 1) apart in preparation for publishing the patch
> series. The benchmarks in patch 3 refer to the combined effect of patches
> 1-3 - if you recall, that was an overall 6% speedup.
>
> Profiling patch 2 in isolation does actually lead to a 5% regression,
> though this is more than compensated for by the fact that patch 3 by itself
> results in 11% speedup. Of course, patch 3 is ARM only. Other architectures
> will hopefully find that any regression due to patch 2 is compensated for
> by patches 4-6, plus there's also the option to write versions of patch 3
> targeted at them.

What do others think about this? Is the slowdown acceptable in itself?
As long as you actually do decoding, this slowdown shouldn't really be
measurable in the grand scheme of things - or is it? I guess it would
have the most impact on slow systems, and patch 3/6 provides an armv6
implementation.

Is anyone interested in writing an x86 asm version of the same function,
to offset the slowdown from the extra function call?

// Martin
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
