Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-02 Thread Timo Rothenpieler
Could you please make sure to properly reply to mails in the future? Otherwise this causes quite a mess to anyone who's viewing the ML in a threaded view, which includes the list archives.

[FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-02 Thread Ali KIZIL
Hi Oliver, Yes, for the same quality I also noticed a bandwidth usage drop of about 4-6%. This is my case. But as you know, it depends on the samples you work with. For CBR, you are right, it should bring higher quality. I compared decode of PSNR for CBR 8 bits and 10 bits HD NVENC HEVC encoded content with

Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-02 Thread Moritz Barsnick
On Fri, Sep 02, 2016 at 15:40:54 +0300, Oliver Collyer wrote: > In my test, my sample file went from 80mb encoded down to 69mb with > the same global quality setting. (using -constq -global_quality 21) I also wonder whether the 10-bit algorithms have the same visual result as the 8-bit algorithms,

Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-02 Thread Oliver Collyer
> Just one note: encoding from YUV420P to P010LE is still slow. It would be > nice if a similar patch were done for YUV420P 8 bits to P010LE 10 bits > conversion. (For reason: > http://x264.nl/x264/10bit_02-ateme-why_does_10bit_save_bandwidth.pdf) > Ali I’m curious as to whether you have managed to save

Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-02 Thread Timo Rothenpieler
>>> >>> …or is that really old-school and a modern compiler does all that when >>> optimising? >>> >>> Or is readability considered more important than marginal gains in >>> performance? >>> >>> Oliver (time travelling from the 1980s) >> >> You would still have to add the remaining stride. >> The

[FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-02 Thread Ali KIZIL
Hello Timo, I tested your patch. It increases UHD HEVC 10 bits Main10 encoding performance a lot while doing YUV420P10LE to P010LE (same level as Oliver's original 10 bits HEVC encoding patch). With your patch on top of current FFmpeg git source, encoding performance increases from 40-42 fps to 69-

Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-02 Thread Oliver Collyer
> On 2 Sep 2016, at 12:12, Timo Rothenpieler wrote: > >> Just sticking my head above the parapet, but shouldn’t things like... >> >>> +for (x = 0; x < c->srcW / 2; x++) { >>> +dstUV[x*2 ] = src[1][x] << 6; >>> +dstUV[x*2+1] = src[2][x] << 6; >>> +

Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-02 Thread Timo Rothenpieler
> Just sticking my head above the parapet, but shouldn’t things like... > >> +for (x = 0; x < c->srcW / 2; x++) { >> +dstUV[x*2 ] = src[1][x] << 6; >> +dstUV[x*2+1] = src[2][x] << 6; >> +} > > …be more efficiently written as... > > uint16_

Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-02 Thread Timo Rothenpieler
On 02.09.2016 at 11:02, Michael Niedermayer wrote: > On Fri, Sep 02, 2016 at 10:38:39AM +0200, Timo Rothenpieler wrote: +uint16_t *src[] = { +(uint16_t*)(src8[0] + srcStride[0] * srcSliceY), +(uint16_t*)(src8[1] + srcStride[1] * srcSliceY), +(uint16_

Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-02 Thread Michael Niedermayer
On Fri, Sep 02, 2016 at 10:38:39AM +0200, Timo Rothenpieler wrote: > >> +uint16_t *src[] = { > >> +(uint16_t*)(src8[0] + srcStride[0] * srcSliceY), > >> +(uint16_t*)(src8[1] + srcStride[1] * srcSliceY), > >> +(uint16_t*)(src8[2] + srcStride[2] * srcSliceY) > > > > this

Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-02 Thread Timo Rothenpieler
>> +uint16_t *src[] = { >> +(uint16_t*)(src8[0] + srcStride[0] * srcSliceY), >> +(uint16_t*)(src8[1] + srcStride[1] * srcSliceY), >> +(uint16_t*)(src8[2] + srcStride[2] * srcSliceY) > > this looks odd, why is this needed ? > Without it, every dstY[x] = src[0][x] << 6
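
For readers following the question about the `src[]` casts: the plane pointers arrive as byte pointers and the strides are counted in bytes, so the row offset has to be applied before the pointer is reinterpreted as `uint16_t *`. The following is only a minimal standalone sketch of that idea, not the patch itself. The function name `yuv420p10_to_p010` and its parameters are hypothetical, and it assumes native-endian 16-bit samples, even dimensions, and suitably aligned buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the patch.
 * src8[0..2]: Y, U, V planes of a yuv420p10 frame (10-bit values in 16-bit words).
 * dst8[0..1]: Y plane and interleaved UV plane of a P010 frame.
 * All strides are in BYTES, so row offsets are added to the byte pointer
 * first and only then is the result viewed as 16-bit samples. */
static void yuv420p10_to_p010(const uint8_t *const src8[3], const int src_stride[3],
                              uint8_t *const dst8[2], const int dst_stride[2],
                              int width, int height)
{
    assert(width % 2 == 0 && height % 2 == 0);

    /* Luma: P010 keeps the 10 significant bits in the high bits, hence << 6. */
    for (int y = 0; y < height; y++) {
        const uint16_t *src_y = (const uint16_t *)(src8[0] + (ptrdiff_t)src_stride[0] * y);
        uint16_t *dst_y = (uint16_t *)(dst8[0] + (ptrdiff_t)dst_stride[0] * y);
        for (int x = 0; x < width; x++)
            dst_y[x] = (uint16_t)(src_y[x] << 6);
    }

    /* Chroma: half resolution in both directions, U and V interleaved. */
    for (int y = 0; y < height / 2; y++) {
        const uint16_t *src_u = (const uint16_t *)(src8[1] + (ptrdiff_t)src_stride[1] * y);
        const uint16_t *src_v = (const uint16_t *)(src8[2] + (ptrdiff_t)src_stride[2] * y);
        uint16_t *dst_uv = (uint16_t *)(dst8[1] + (ptrdiff_t)dst_stride[1] * y);
        for (int x = 0; x < width / 2; x++) {
            dst_uv[x * 2]     = (uint16_t)(src_u[x] << 6);
            dst_uv[x * 2 + 1] = (uint16_t)(src_v[x] << 6);
        }
    }
}
```

Because each row pointer is recomputed from the byte stride, any padding at the end of a row is skipped rather than copied.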

Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-01 Thread Michael Niedermayer
On Thu, Sep 01, 2016 at 07:49:38PM +0200, Timo Rothenpieler wrote: > --- > libswscale/swscale_unscaled.c | 42 ++ > 1 file changed, 42 insertions(+) > > diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c > index b231abe..f47e1f4 10064

Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-01 Thread Oliver Collyer
Just sticking my head above the parapet, but shouldn’t things like... > +for (x = 0; x < c->srcW / 2; x++) { > +dstUV[x*2 ] = src[1][x] << 6; > +dstUV[x*2+1] = src[2][x] << 6; > +} …be more efficiently written as... uint16_t* tdstUV = dstU
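
To make the comparison in this message concrete, here is a small standalone sketch of the two loop styles under discussion: the indexed form quoted from the patch and a pointer-walking form. The suggested rewrite is cut off in the archive snippet, so the second function is an illustrative guess at the general style, not the code that was actually proposed. Both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Indexed form, mirroring the loop quoted from the patch. */
static void interleave_uv_indexed(uint16_t *dst_uv, const uint16_t *src_u,
                                  const uint16_t *src_v, size_t chroma_w)
{
    assert(dst_uv && src_u && src_v);
    for (size_t x = 0; x < chroma_w; x++) {
        dst_uv[x * 2]     = (uint16_t)(src_u[x] << 6);
        dst_uv[x * 2 + 1] = (uint16_t)(src_v[x] << 6);
    }
}

/* Pointer-walking form (illustrative assumption, not the proposed code). */
static void interleave_uv_pointer(uint16_t *dst_uv, const uint16_t *src_u,
                                  const uint16_t *src_v, size_t chroma_w)
{
    assert(dst_uv && src_u && src_v);
    const uint16_t *end = src_u + chroma_w;
    while (src_u < end) {
        *dst_uv++ = (uint16_t)(*src_u++ << 6);
        *dst_uv++ = (uint16_t)(*src_v++ << 6);
    }
}
```

Both functions write identical output. Whether one is measurably faster depends on the compiler and target, since an optimising compiler may well generate similar code for both.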

Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-01 Thread Michael Niedermayer
On Thu, Sep 01, 2016 at 06:44:56PM +0200, Timo Rothenpieler wrote: > On 9/1/2016 6:20 PM, Michael Niedermayer wrote: > > On Thu, Sep 01, 2016 at 05:23:04PM +0200, Timo Rothenpieler wrote: > >> --- > >> libswscale/swscale_unscaled.c | 39 +++ > >> 1 file changed,

[FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-01 Thread Timo Rothenpieler
--- libswscale/swscale_unscaled.c | 42 ++ 1 file changed, 42 insertions(+) diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c index b231abe..f47e1f4 100644 --- a/libswscale/swscale_unscaled.c +++ b/libswscale/swscale_unscaled.c @@ -

Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-01 Thread Timo Rothenpieler
On 9/1/2016 6:20 PM, Michael Niedermayer wrote: > On Thu, Sep 01, 2016 at 05:23:04PM +0200, Timo Rothenpieler wrote: >> --- >> libswscale/swscale_unscaled.c | 39 +++ >> 1 file changed, 39 insertions(+) >> >> diff --git a/libswscale/swscale_unscaled.c b/libswsca

Re: [FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-01 Thread Michael Niedermayer
On Thu, Sep 01, 2016 at 05:23:04PM +0200, Timo Rothenpieler wrote: > --- > libswscale/swscale_unscaled.c | 39 +++ > 1 file changed, 39 insertions(+) > > diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c > index b231abe..51768fa 100644

[FFmpeg-devel] [PATCH] swscale: add unscaled copy from yuv420p10 to p010

2016-09-01 Thread Timo Rothenpieler
--- libswscale/swscale_unscaled.c | 39 +++ 1 file changed, 39 insertions(+) diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c index b231abe..51768fa 100644 --- a/libswscale/swscale_unscaled.c +++ b/libswscale/swscale_unscaled.c @@ -197