On 05/14/2011 10:50 AM, Måns Rullgård wrote:
> Diego Biurrun <[email protected]> writes:
>
>> On Sat, May 14, 2011 at 09:41:01AM +0100, Måns Rullgård wrote:
>>> Justin Ruggles <[email protected]> writes:
>>>
>>>> This does all the actual bit counting as a final step.
>>>> x86 benchmarks:
>>>> 50% faster in function count_mantissa_bits()
>>>> 16% faster in function bit_alloc()
>>>> ---
>>>>  libavcodec/ac3dsp.c              |   33 ++++++++--------
>>>>  libavcodec/ac3dsp.h              |    4 +-
>>>>  libavcodec/ac3enc.c              |   78 +++++++++++++++++++++-----------------
>>>>  libavcodec/arm/Makefile          |    1 -
>>>>  libavcodec/arm/ac3dsp_arm.S      |   52 -------------------------
>>>>  libavcodec/arm/ac3dsp_init_arm.c |    2 -
>>>>  6 files changed, 63 insertions(+), 107 deletions(-)
>>>>  delete mode 100644 libavcodec/arm/ac3dsp_arm.S
>>>> +static void count_mantissa_bits_update_ch(AC3EncodeContext *s, int ch,
>>>> +                                          uint16_t mant_cnt[AC3_MAX_BLOCKS][16],
>>>> +                                          int start, int end)
>>>> +{
>>>> +    int blk, i;
>>>> +
>>>> +    for (blk = 0; blk < AC3_MAX_BLOCKS; blk++) {
>>>> +        uint8_t *bap = s->blocks[blk].exp_ref_block[ch]->bap[ch];
>>>> +        for (i = start; i < end; i++)
>>>> +            mant_cnt[blk][bap[i]]++;
>>>
>>> This loop will suck with gcc on ARM.
>>
>> I'm curious as to why, could you elaborate?
>
> Because gcc sucks, what else?  This particular suckage was the main
> reason for writing that function assembler at all.

Could this be written in asm for ARM then?  It's not too bad on x86, and
5-8% overall speed gain is significant.  Or can you see anything about
the C version that could be trivially changed to make gcc not mess it up
terribly?

-Justin
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
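
For reference, the kind of "trivial change" to the C version asked about above could look roughly like the sketch below. It is a standalone illustration, not code from the patch: MAX_BLOCKS, count_simple() and count_hoisted() are hypothetical names, the bap rows are passed in directly instead of going through AC3EncodeContext, and the value 6 for the block count is an assumption. The second function hoists the per-block row pointer and walks bap with a pointer, so the inner loop only has one load, one index and one increment to express. Whether gcc on ARM actually generates better code for it has not been measured here.

```c
#include <stdint.h>

/* Stand-in for AC3_MAX_BLOCKS; the value is assumed for this sketch. */
#define MAX_BLOCKS 6

/* Straightforward form, mirroring the inner loop quoted in the patch. */
static void count_simple(uint16_t mant_cnt[MAX_BLOCKS][16],
                         const uint8_t *bap[MAX_BLOCKS],
                         int start, int end)
{
    for (int blk = 0; blk < MAX_BLOCKS; blk++)
        for (int i = start; i < end; i++)
            mant_cnt[blk][bap[blk][i]]++;
}

/* Variant with the histogram row hoisted out of the inner loop and
 * bap traversed by pointer instead of by index. */
static void count_hoisted(uint16_t mant_cnt[MAX_BLOCKS][16],
                          const uint8_t *bap[MAX_BLOCKS],
                          int start, int end)
{
    for (int blk = 0; blk < MAX_BLOCKS; blk++) {
        uint16_t      *cnt  = mant_cnt[blk];     /* this block's 16 counters */
        const uint8_t *p    = bap[blk] + start;  /* first bap value to count */
        const uint8_t *stop = bap[blk] + end;    /* one past the last value  */

        while (p < stop)
            cnt[*p++]++;
    }
}
```

Both functions produce the same counts; the only way to know whether the second form is any better is to compare the generated assembly and benchmark it on the target.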
