Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

Timo Rothenpieler Fri, 02 Sep 2016 02:13:47 -0700

> Just sticking my head above the parapet, but shouldn’t things like...
> 
>> +            for (x = 0; x < c->srcW / 2; x++) {
>> +                dstUV[x*2  ] = src[1][x] << 6;
>> +                dstUV[x*2+1] = src[2][x] << 6;
>> +            }
> 
> …be more efficiently written as...
> 
> uint16_t* tdstUV = dstUV;
> uint16_t* tsrc1 = src[1];
> uint16_t* tsrc2 = src[2];
> for (x = c->srcW / 2; x > 0; x--) {
>     *tdstUV++ = *tsrc1++ << 6;
>     *tdstUV++ = *tsrc2++ << 6;
> }
> 
> …or is that really old-school and a modern compiler does all that when 
> optimising?
> 
> Or is readability considered more important than marginal gains in 
> performance?
> 
> Oliver (time travelling from the 1980s)


You would still have to skip the rest of the stride after each line.
The linesize is usually larger than the width, so that each line starts
at a properly aligned address.

So with your code, you'd still need something like

dstUV += dstStride[1] / 2 - 2 * x;
src[1] += srcStride[1] / 2 - x;
src[2] += srcStride[2] / 2 - x;

after it.
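
To make that concrete, here is a rough standalone sketch of the pointer
variant with the stride handled per line. It is not the swscale code: the
function name and parameters are made up for illustration, and strides are
assumed to be in bytes (hence the / 2 for 16-bit samples). Resetting the
working pointers from the line start avoids the "stride minus width"
arithmetic altogether.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of swscale): interleave two 10-bit chroma
 * planes into one P010-style UV plane, shifting samples into the high bits.
 * All strides are in bytes; chroma_w and chroma_h are in chroma samples. */
static void interleave_uv_p010(uint16_t *dst_uv, ptrdiff_t dst_stride,
                               const uint16_t *src_u, ptrdiff_t src_u_stride,
                               const uint16_t *src_v, ptrdiff_t src_v_stride,
                               int chroma_w, int chroma_h)
{
    for (int y = 0; y < chroma_h; y++) {
        /* Working pointers restart at the beginning of each line. */
        uint16_t *d = dst_uv;
        const uint16_t *u = src_u;
        const uint16_t *v = src_v;

        for (int x = 0; x < chroma_w; x++) {
            *d++ = (uint16_t)(*u++ << 6);
            *d++ = (uint16_t)(*v++ << 6);
        }

        /* Advance by the full stride, so padding bytes are skipped. */
        dst_uv += dst_stride / 2;
        src_u  += src_u_stride / 2;
        src_v  += src_v_stride / 2;
    }
}
```

Whether this is any faster than the indexed loop is a separate question;
the point is only that the line padding has to be accounted for somewhere.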