"Ronald S. Bultje" <[email protected]> writes: > Hi, > > 2011/6/9 Måns Rullgård <[email protected]>: >> Kostya <[email protected]> writes: >>> On Wed, Jun 08, 2011 at 12:55:01PM -0400, Ronald S. Bultje wrote: >>>> This generates better code on some non-x86 architectures. >>>> --- >>>> libswscale/swscale.c | 49 >>>> ++++++++++++++++++------------------------------- >>>> 1 files changed, 18 insertions(+), 31 deletions(-) >>>> >>>> diff --git a/libswscale/swscale.c b/libswscale/swscale.c >>>> index 9833101..500d838 100644 >>>> --- a/libswscale/swscale.c >>>> +++ b/libswscale/swscale.c >>>> @@ -404,6 +404,12 @@ static void yuv2nv12X_c(SwsContext *c, const int16_t >>>> *lumFilter, >>>> Y2>>=19;\ >>>> U >>=19;\ >>>> V >>=19;\ >>>> + if ((Y1|Y2|U|V)&256) {\ >>>> + Y1 = av_clip_uint8(Y1); \ >>>> + Y2 = av_clip_uint8(Y2); \ >>>> + U = av_clip_uint8(U); \ >>>> + V = av_clip_uint8(V); \ >>>> + }\ >>> >>> I'd write condition as (Y1|Y2|U|V) & 0x100 (since it needs some spaces and >>> hex >>> constants are more readable here), the same below with 65536. Otherwise OK. >> >> The condition seems strange. Why should they be clipped if bit 8 it >> set, but not if bit 9 or higher is? > > I think it's because this particular calculation never overflows by > more than a few values at most, so it can be 257 or 256, but not 512. > Therefore, in practice, this works. > > I'm fine with changing that (or even removing the if condition at all, > since that probably isn't necessary on e.g. arm which has an actual > clip instruction), but let's do that later.
Indeed, doing the clipping is faster than calculating the condition on a modern ARM. Then again, these functions should be optimised in pure asm anyway... -- Måns Rullgård [email protected] _______________________________________________ libav-devel mailing list [email protected] https://lists.libav.org/mailman/listinfo/libav-devel
