This patch set speeds up mantissa bit counting and also makes it better
suited to accurate incremental counts when the bandwidth varies.
The ARM version of compute_mantissa_size() was removed because the
function was changed completely.
Overall encoding is about 5% to 8% faster on x86.
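
As a rough illustration of the idea behind the second patch (keeping a
mantissa count for every bap value), here is a minimal portable sketch.
It is not the code from these patches: count_mantissas(),
mantissa_bits(), bap_bits[] and NUM_BAP are hypothetical names, and the
per-bap bit widths are my reading of the AC-3 quantizer table (bap 1 and
2 pack 3 mantissas into 5 and 7 bits, bap 4 packs 2 mantissas into 7
bits). It also assumes a partial group costs a full group.

```c
#include <stdint.h>

#define NUM_BAP 16  /* bap takes values 0..15 (hypothetical constant name) */

/* Bits per mantissa for ungrouped bap values.
 * Entries for bap 0 (no mantissa) and the grouped baps 1, 2, 4 are 0. */
static const uint8_t bap_bits[NUM_BAP] = {
    0, 0, 0, 3, 0, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16
};

/* Add one count per coefficient to the histogram entry for its bap.
 * The caller zeroes mant_cnt first, or keeps it to count incrementally. */
static void count_mantissas(uint16_t mant_cnt[NUM_BAP],
                            const uint8_t *bap, int nb_coefs)
{
    for (int i = 0; i < nb_coefs; i++)
        mant_cnt[bap[i]]++;
}

/* Turn the per-bap counts into a total mantissa bit count. */
static int mantissa_bits(const uint16_t mant_cnt[NUM_BAP])
{
    int bits = 0;

    /* Grouped mantissas; a partial group is assumed to cost a full one. */
    bits += ((mant_cnt[1] + 2) / 3) * 5;  /* bap 1: 3 mantissas in 5 bits */
    bits += ((mant_cnt[2] + 2) / 3) * 7;  /* bap 2: 3 mantissas in 7 bits */
    bits += ((mant_cnt[4] + 1) / 2) * 7;  /* bap 4: 2 mantissas in 7 bits */

    /* Ungrouped mantissas: fixed width per mantissa from the table. */
    for (int b = 0; b < NUM_BAP; b++)
        bits += mant_cnt[b] * bap_bits[b];

    return bits;
}
```

Because the counts are kept separately from the bit total, extending the
bandwidth only needs count_mantissas() to run over the newly added
coefficients before mantissa_bits() is called again.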
I also tested an SSSE3 version using 2 phaddd instructions instead of
movhlps/paddd/pshufd/paddd, but it made no measurable speed
difference on Atom, probably because that part is outside the loop
anyway. It would help if someone could test it on other processors to
see whether it makes any difference.
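
For anyone who wants to try that comparison, the
movhlps/paddd/pshufd/paddd sequence is a horizontal sum of four 32-bit
lanes. Below is a minimal sketch of it using C intrinsics rather than
the asm from the patch. hsum4_i32() is a hypothetical name, and the
scalar branch is only there so the example builds where SSE2 is not
available.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Sum four 32-bit integers (hypothetical helper, not part of the patch). */
static int32_t hsum4_i32(const int32_t v[4])
{
#if defined(__SSE2__)
    __m128i x = _mm_loadu_si128((const __m128i *)v);

    /* movhlps: copy the upper two dwords into the lower half */
    __m128i hi = _mm_castps_si128(
        _mm_movehl_ps(_mm_castsi128_ps(x), _mm_castsi128_ps(x)));
    x = _mm_add_epi32(x, hi);         /* paddd: lane 0 = v0+v2, lane 1 = v1+v3 */

    hi = _mm_shuffle_epi32(x, 0x01);  /* pshufd: bring lane 1 down to lane 0 */
    x = _mm_add_epi32(x, hi);         /* paddd: lane 0 = v0+v1+v2+v3 */

    return _mm_cvtsi128_si32(x);      /* read lane 0 */
#else
    /* Scalar fallback for targets without SSE2 */
    return v[0] + v[1] + v[2] + v[3];
#endif
}
```

An SSSE3 variant would replace the four reduction steps with two
_mm_hadd_epi32() calls (from tmmintrin.h), which is what the phaddd
version amounts to.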
I also tested unrolling the SIMD part, but it was significantly slower.

Justin Ruggles (3):
  ac3enc: split mantissa bit counting into a separate function.
  ac3enc: modify mantissa bit counting to keep bap counts for all
    values of bap instead of just 0 to 4.
  ac3enc: sse2 version of compute_mantissa_size()

 libavcodec/ac3dsp.c              |   33 +++++++-------
 libavcodec/ac3dsp.h              |    4 +-
 libavcodec/ac3enc.c              |   87 ++++++++++++++++++++++++--------------
 libavcodec/arm/Makefile          |    1 -
 libavcodec/arm/ac3dsp_arm.S      |   52 ----------------------
 libavcodec/arm/ac3dsp_init_arm.c |    2 -
 libavcodec/x86/ac3dsp.asm        |   48 +++++++++++++++++++++
 libavcodec/x86/ac3dsp_mmx.c      |    3 +
 8 files changed, 126 insertions(+), 104 deletions(-)
 delete mode 100644 libavcodec/arm/ac3dsp_arm.S