Justin Ruggles <[email protected]> writes:

> On 05/15/2011 04:49 AM, Måns Rullgård wrote:
>
>> Justin Ruggles <[email protected]> writes:
>> 
>>> On 05/14/2011 10:50 AM, Måns Rullgård wrote:
>>>
>>>> Diego Biurrun <[email protected]> writes:
>>>>
>>>>> On Sat, May 14, 2011 at 09:41:01AM +0100, Måns Rullgård wrote:
>>>>>> Justin Ruggles <[email protected]> writes:
>>>>>>
>>>>>>> This does all the actual bit counting as a final step.
>>>>>>> x86 benchmarks:
>>>>>>> 50% faster in function count_mantissa_bits()
>>>>>>> 16% faster in function bit_alloc()
>>>>>>> ---
>>>>>>>  libavcodec/ac3dsp.c              |   33 ++++++++--------
>>>>>>>  libavcodec/ac3dsp.h              |    4 +-
>>>>>>>  libavcodec/ac3enc.c              |   78 +++++++++++++++++++++-----------------
>>>>>>>  libavcodec/arm/Makefile          |    1 -
>>>>>>>  libavcodec/arm/ac3dsp_arm.S      |   52 -------------------------
>>>>>>>  libavcodec/arm/ac3dsp_init_arm.c |    2 -
>>>>>>>  6 files changed, 63 insertions(+), 107 deletions(-)
>>>>>>>  delete mode 100644 libavcodec/arm/ac3dsp_arm.S
>>>>>>> +static void count_mantissa_bits_update_ch(AC3EncodeContext *s, int ch,
>>>>>>> +                                          uint16_t mant_cnt[AC3_MAX_BLOCKS][16],
>>>>>>> +                                          int start, int end)
>>>>>>> +{
>>>>>>> +    int blk, i;
>>>>>>> +
>>>>>>> +    for (blk = 0; blk < AC3_MAX_BLOCKS; blk++) {
>>>>>>> +        uint8_t *bap = s->blocks[blk].exp_ref_block[ch]->bap[ch];
>>>>>>> +        for (i = start; i < end; i++)
>>>>>>> +            mant_cnt[blk][bap[i]]++;
>>>>>>
>>>>>> This loop will suck with gcc on ARM.
>>>>>
>>>>> I'm curious as to why, could you elaborate?
>>>>
>>>> Because gcc sucks, what else?  This particular suckage was the main
>>>> reason for writing that function in assembler at all.
>>>
>>> Could this be written in asm for ARM then?
>> 
>> If the code is reorganised to allow this, yes.
>
> Would it help to just have the inner loop in asm?

The outer loop looks simple enough to write in asm too.  The pointer
chasing is a bit worrisome though.  Is there any way to flatten some of
that into an array instead?
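
To illustrate the kind of flattening meant here, a minimal sketch (not
the actual libavcodec code): the caller resolves
s->blocks[blk].exp_ref_block[ch]->bap[ch] once per block into a plain
array of pointers, so the counting routine only sees flat arrays.  The
names count_mant_flat, bap_ptrs and MAX_BLOCKS are hypothetical;
MAX_BLOCKS stands in for AC3_MAX_BLOCKS and its value is assumed.

```c
#include <stdint.h>

#define MAX_BLOCKS 6   /* assumption: stand-in for AC3_MAX_BLOCKS */

/*
 * Hypothetical flattened variant of the counting loop.
 * bap_ptrs[blk] is assumed to have been filled in by the caller with
 * the already-resolved bap array for the channel in that block, so no
 * struct dereferencing happens inside the loops.
 */
void count_mant_flat(uint16_t mant_cnt[MAX_BLOCKS][16],
                     uint8_t *const bap_ptrs[MAX_BLOCKS],
                     int start, int end)
{
    for (int blk = 0; blk < MAX_BLOCKS; blk++) {
        const uint8_t *bap = bap_ptrs[blk];  /* one load per block */
        uint16_t *cnt = mant_cnt[blk];       /* histogram row for this block */

        /* Histogram update: one counter per bap value (0..15). */
        for (int i = start; i < end; i++)
            cnt[bap[i]]++;
    }
}
```

With this shape, an asm version would only need the two array base
addresses plus start and end, and the C caller keeps the pointer
chasing.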

-- 
Måns Rullgård
[email protected]
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel