This patch set speeds up mantissa bit counting and also makes it easier
to adapt to accurate incremental counts when the bandwidth varies.
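
To illustrate the idea of counting mantissa bits from per-bap counts, here
is a minimal scalar sketch. It is not the code from the patches: the names
count_baps(), mantissa_bits() and bap_bits[] are made up for this example,
the table holds the per-mantissa bit widths from the AC-3 specification,
and rounding partial groups up to a whole group is an assumption of the
sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_BAPS 16

/* Bits per mantissa for the ungrouped baps (3 and 5..15).
 * Entries for bap 0 (no mantissa) and the grouped baps 1, 2 and 4
 * are unused and left at 0. */
static const uint8_t bap_bits[NUM_BAPS] = {
    0, 0, 0, 3, 0, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 16
};

/* Histogram pass: count how many coefficients use each bap value. */
static void count_baps(uint16_t cnt[NUM_BAPS], const uint8_t *bap, int n)
{
    memset(cnt, 0, NUM_BAPS * sizeof(cnt[0]));
    for (int i = 0; i < n; i++) {
        assert(bap[i] < NUM_BAPS);
        cnt[bap[i]]++;
    }
}

/* Turn the per-bap counts into a total mantissa bit count.
 * Assumption: an incomplete group still costs a full group. */
static int mantissa_bits(const uint16_t cnt[NUM_BAPS])
{
    int bits = 0;

    bits += ((cnt[1] + 2) / 3) * 5; /* bap 1: 3 mantissas in 5 bits */
    bits += ((cnt[2] + 2) / 3) * 7; /* bap 2: 3 mantissas in 7 bits */
    bits += cnt[3] * 3;             /* bap 3: 3 bits per mantissa   */
    bits += ((cnt[4] + 1) / 2) * 7; /* bap 4: 2 mantissas in 7 bits */
    for (int b = 5; b < NUM_BAPS; b++)
        bits += cnt[b] * bap_bits[b];
    return bits;
}
```

With counts kept for every bap value, a change in bandwidth can in
principle be handled by adjusting the counts for the coefficients that
were added or dropped and calling mantissa_bits() again, rather than
recounting everything.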

The ARM version of compute_mantissa_size() was removed because the
function was changed completely.

Overall encoding is about 5% to 8% faster on x86.

I also tested an SSSE3 version using 2 phaddd instructions instead of
movhlps/paddd/pshufd/paddd for the final horizontal sum, but there was no
measurable speed difference on Atom, probably because that part is
outside the loop anyway. Maybe someone could test it on other processors
to see if it makes any difference there.
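
For readers not familiar with the movhlps/paddd/pshufd/paddd sequence,
the following sketch shows the same kind of horizontal sum of four 32-bit
lanes using SSE2 intrinsics. It is only an illustration, not the assembly
from the patch: hsum_epi32() is a hypothetical helper, and the scalar
fallback is there so the example also builds on non-x86 targets.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Sum the four 32-bit integers in v[]. */
static int32_t hsum_epi32(const int32_t v[4])
{
#if defined(__SSE2__)
    __m128i x = _mm_loadu_si128((const __m128i *)v);

    /* movhlps: move the upper two dwords down onto the lower two */
    __m128i t = _mm_castps_si128(
        _mm_movehl_ps(_mm_castsi128_ps(x), _mm_castsi128_ps(x)));
    x = _mm_add_epi32(x, t);                            /* paddd  */

    /* pshufd: broadcast dword 1 so it lines up with dword 0 */
    t = _mm_shuffle_epi32(x, _MM_SHUFFLE(1, 1, 1, 1));
    x = _mm_add_epi32(x, t);                            /* paddd  */

    return _mm_cvtsi128_si32(x);                        /* lane 0 */
#else
    /* Scalar fallback for targets without SSE2. */
    return v[0] + v[1] + v[2] + v[3];
#endif
}
```

An SSSE3 variant would replace the two shuffle/add pairs with two
horizontal adds (phaddd, _mm_hadd_epi32 in intrinsics form). Either way
the reduction runs once after the loop.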

I also tested unrolling the SIMD part, but it was significantly slower.


Justin Ruggles (3):
  ac3enc: split mantissa bit counting into a separate function.
  ac3enc: modify mantissa bit counting to keep bap counts for all
    values of bap instead of just 0 to 4.
  ac3enc: sse2 version of compute_mantissa_size()

 libavcodec/ac3dsp.c              |   33 +++++++-------
 libavcodec/ac3dsp.h              |    4 +-
 libavcodec/ac3enc.c              |   87 ++++++++++++++++++++++++--------------
 libavcodec/arm/Makefile          |    1 -
 libavcodec/arm/ac3dsp_arm.S      |   52 ----------------------
 libavcodec/arm/ac3dsp_init_arm.c |    2 -
 libavcodec/x86/ac3dsp.asm        |   48 +++++++++++++++++++++
 libavcodec/x86/ac3dsp_mmx.c      |    3 +
 8 files changed, 126 insertions(+), 104 deletions(-)
 delete mode 100644 libavcodec/arm/ac3dsp_arm.S

_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel