On 05/14/2011 10:50 AM, Måns Rullgård wrote:

> Diego Biurrun <[email protected]> writes:
> 
>> On Sat, May 14, 2011 at 09:41:01AM +0100, Måns Rullgård wrote:
>>> Justin Ruggles <[email protected]> writes:
>>>
>>>> This does all the actual bit counting as a final step.
>>>> x86 benchmarks:
>>>> 50% faster in function count_mantissa_bits()
>>>> 16% faster in function bit_alloc()
>>>> ---
>>>>  libavcodec/ac3dsp.c              |   33 ++++++++--------
>>>>  libavcodec/ac3dsp.h              |    4 +-
>>>>  libavcodec/ac3enc.c              |   78 +++++++++++++++++++++-----------------
>>>>  libavcodec/arm/Makefile          |    1 -
>>>>  libavcodec/arm/ac3dsp_arm.S      |   52 -------------------------
>>>>  libavcodec/arm/ac3dsp_init_arm.c |    2 -
>>>>  6 files changed, 63 insertions(+), 107 deletions(-)
>>>>  delete mode 100644 libavcodec/arm/ac3dsp_arm.S
>>>> +static void count_mantissa_bits_update_ch(AC3EncodeContext *s, int ch,
>>>> +                                          uint16_t mant_cnt[AC3_MAX_BLOCKS][16],
>>>> +                                          int start, int end)
>>>> +{
>>>> +    int blk, i;
>>>> +
>>>> +    for (blk = 0; blk < AC3_MAX_BLOCKS; blk++) {
>>>> +        uint8_t *bap = s->blocks[blk].exp_ref_block[ch]->bap[ch];
>>>> +        for (i = start; i < end; i++)
>>>> +            mant_cnt[blk][bap[i]]++;
>>>
>>> This loop will suck with gcc on ARM.
>>
>> I'm curious as to why, could you elaborate?
> 
> Because gcc sucks, what else?  This particular suckage was the main
> reason for writing that function in assembler at all.


Could this be written in asm for ARM then?  It's not too bad on x86, and
5-8% overall speed gain is significant.  Or can you see anything about
the C version that could be trivially changed to make gcc not mess it up
terribly?
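
For reference, the counting loop quoted above can be isolated as a small
standalone histogram routine.  This is only a minimal sketch, not the
libavcodec code: count_bap_histogram, NUM_BLOCKS and NUM_BAPS are
hypothetical names, and the per-block bap pointers are passed in directly
instead of being looked up through s->blocks[blk].exp_ref_block[ch].  The
row pointer is hoisted into a local; whether that changes what gcc emits
on ARM is untested here.

```c
#include <stdint.h>

#define NUM_BLOCKS 6   /* stand-in for AC3_MAX_BLOCKS (assumed value) */
#define NUM_BAPS   16  /* second dimension of mant_cnt in the patch */

/*
 * Hypothetical standalone variant of the quoted loop.
 * bap[blk] points at the bap values of one channel in block blk.
 * Every bap value must be < NUM_BAPS, since it is used as an index.
 */
static void count_bap_histogram(uint16_t mant_cnt[NUM_BLOCKS][NUM_BAPS],
                                const uint8_t *bap[NUM_BLOCKS],
                                int start, int end)
{
    for (int blk = 0; blk < NUM_BLOCKS; blk++) {
        uint16_t *cnt = mant_cnt[blk];   /* row for this block */
        const uint8_t *b = bap[blk];     /* bap values for this block */

        /* Only occurrences are tallied here; converting the tallies
         * to a bit count is left as a separate final step. */
        for (int i = start; i < end; i++)
            cnt[b[i]]++;
    }
}
```

The caller is expected to zero mant_cnt before the first call and to run
the routine once per channel, so that the counts accumulate across
channels before any bits are computed.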

-Justin
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
