Hi,

2012/3/17 Måns Rullgård <[email protected]>:
> "Ronald S. Bultje" <[email protected]> writes:
>
>> ---
>>  libavcodec/x86/cabac.h |   17 ++++++++++-------
>>  1 files changed, 10 insertions(+), 7 deletions(-)
>>
>> diff --git a/libavcodec/x86/cabac.h b/libavcodec/x86/cabac.h
>> index 3c3652d..c4832c3 100644
>> --- a/libavcodec/x86/cabac.h
>> +++ b/libavcodec/x86/cabac.h
>> @@ -105,8 +105,8 @@ static av_always_inline int 
>> get_cabac_bypass_sign_x86(CABACContext *c, int val)
>>  {
>>      x86_reg tmp;
>>      __asm__ volatile(
>> -        "movl %4, %k1                           \n\t"
>> -        "movl %2, %%eax                         \n\t"
>> +        "movl %c5(%2), %k1                      \n\t"
>> +        "movl %c3(%2), %%eax                    \n\t"
>
> %c5?  Last I checked, the code to get a plain number was 'a'.

Fixed.

>> -        :"+c"(val), "=&r"(tmp), "+m"(c->low), "+m"(c->bytestream)
>> -        :"m"(c->range)
>> -        : "%eax", "%edx"
>> +        : "+c"(val), "=&r"(tmp)
>> +        : "r"(c),
>> +          "i"(offsetof(CABACContext, low)),
>> +          "i"(offsetof(CABACContext, bytestream)),
>> +          "i"(offsetof(CABACContext, range))
>> +        : "%eax", "%edx", "memory"
>
> We changed this to use "m" operands to avoid the memory clobber.  I know
> why you're doing this, but I think it's the wrong approach.

It generates better code (less instructions for e.g.
decode_cabac_mb_mvd()) with gcc-4.2.1 (which is shipped with XCode).
Does it generate worse code anywhere? (It's true that later on it adds
instructions for the overread protection again, but this commit in
isolation makes things better, not worse.)

Ronald
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel

Reply via email to