"Ronald S. Bultje" <[email protected]> writes:

> Hi,
>
> 2011/6/9 Måns Rullgård <[email protected]>:
>> Kostya <[email protected]> writes:
>>> On Wed, Jun 08, 2011 at 12:55:01PM -0400, Ronald S. Bultje wrote:
>>>> This generates better code on some non-x86 architectures.
>>>> ---
>>>>  libswscale/swscale.c |   49 
>>>> ++++++++++++++++++-------------------------------
>>>>  1 files changed, 18 insertions(+), 31 deletions(-)
>>>>
>>>> diff --git a/libswscale/swscale.c b/libswscale/swscale.c
>>>> index 9833101..500d838 100644
>>>> --- a/libswscale/swscale.c
>>>> +++ b/libswscale/swscale.c
>>>> @@ -404,6 +404,12 @@ static void yuv2nv12X_c(SwsContext *c, const int16_t 
>>>> *lumFilter,
>>>>          Y2>>=19;\
>>>>          U >>=19;\
>>>>          V >>=19;\
>>>> +        if ((Y1|Y2|U|V)&256) {\
>>>> +            Y1 = av_clip_uint8(Y1); \
>>>> +            Y2 = av_clip_uint8(Y2); \
>>>> +            U  = av_clip_uint8(U); \
>>>> +            V  = av_clip_uint8(V); \
>>>> +        }\
>>>
>>> I'd write condition as (Y1|Y2|U|V) & 0x100 (since it needs some spaces and 
>>> hex
>>> constants are more readable here), the same below with 65536. Otherwise OK.
>>
>> The condition seems strange.  Why should they be clipped if bit 8 it
>> set, but not if bit 9 or higher is?
>
> I think it's because this particular calculation never overflows by
> more than a few values at most, so it can be 257 or 256, but not 512.
> Therefore, in practice, this works.
>
> I'm fine with changing that (or even removing the if condition at all,
> since that probably isn't necessary on e.g. arm which has an actual
> clip instruction), but let's do that later.

Indeed, doing the clipping is faster than calculating the condition on a
modern ARM.  Then again, these functions should be optimised in pure asm
anyway...

-- 
Måns Rullgård
[email protected]
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel

Reply via email to