Re: [FFmpeg-devel] [PATCH] lavfi: VAAPI video processing filter

2016-09-17 Thread Jun Zhao
I can't find a Skylake (SKL) machine right now, so I ran the tests on
Ivy Bridge (IVY) with Debian 8.5, kernel 3.16.0, libva master, intel-driver
master and ffmpeg master.

- Build config:

./configure --enable-libx264 --enable-gpl --enable-vaapi --prefix=/opt/ffmpeg

- Libva and intel-driver

barry@barry:~/Source/video/ffmpeg$ vainfo 
libva info: VA-API version 0.39.3
libva info: va_getDriverName() returns 0
libva info: Trying to open /opt/yami/vaapi/lib/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_0_39
libva info: va_openDriver() returns 0
vainfo: VA-API version: 0.39 (libva 1.7.3.pre1)
vainfo: Driver version: Intel i965 driver for Intel(R) Ivybridge Mobile - 1.7.3.pre1 (1.7.0-118-gb5cd299)
vainfo: Supported profile and entrypoints
  VAProfileMPEG2Simple: VAEntrypointVLD
  VAProfileMPEG2Simple: VAEntrypointEncSlice
  VAProfileMPEG2Main  : VAEntrypointVLD
  VAProfileMPEG2Main  : VAEntrypointEncSlice
  VAProfileH264ConstrainedBaseline: VAEntrypointVLD
  VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
  VAProfileH264Main   : VAEntrypointVLD
  VAProfileH264Main   : VAEntrypointEncSlice
  VAProfileH264High   : VAEntrypointVLD
  VAProfileH264High   : VAEntrypointEncSlice
  VAProfileH264StereoHigh : VAEntrypointVLD
  VAProfileVC1Simple  : VAEntrypointVLD
  VAProfileVC1Main: VAEntrypointVLD
  VAProfileVC1Advanced: VAEntrypointVLD
  VAProfileNone   : VAEntrypointVideoProc
  VAProfileJPEGBaseline   : VAEntrypointVLD

- Kernel and distribution

barry@barry:~/Source/video/ffmpeg$ uname -a
Linux barry 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt25-2+deb8u3 (2016-07-02) x86_64 GNU/Linux

barry@barry:~/Source/video/ffmpeg$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 8.5 (jessie)
Release:        8.5
Codename:       jessie

- Test result

a). denoise -> scale [With the patch adding vf_process_vaapi]

./ffmpeg_g -y -vaapi_device /dev/dri/card0 -hwaccel vaapi \
    -hwaccel_output_format vaapi \
    -i ../yami/ffmpeg_yami_testcase/skyfall2-trailer.mp4 -an \
    -vf 'format=vaapi|nv12,hwupload,process_vaapi=denoise=50,scale_vaapi=w=1280:h=720' \
    -c:v h264_vaapi -qp 20 out_denoise_scale.mp4

121 fps

b). scale -> denoise [With the patch adding vf_process_vaapi]

./ffmpeg_g -y -vaapi_device /dev/dri/card0 -hwaccel vaapi \
    -hwaccel_output_format vaapi \
    -i ../yami/ffmpeg_yami_testcase/skyfall2-trailer.mp4 -an \
    -vf 'format=vaapi|nv12,hwupload,scale_vaapi=w=1280:h=720,process_vaapi=denoise=50' \
    -c:v h264_vaapi -qp 20 out_scale_denoise.mp4

169 fps

c). scale + denoise in one filter [patch to vf_scale_vaapi]

./ffmpeg_g -y -vaapi_device /dev/dri/card0 -hwaccel vaapi \
    -hwaccel_output_format vaapi \
    -i ../yami/ffmpeg_yami_testcase/skyfall2-trailer.mp4 -an \
    -vf 'format=vaapi|nv12,hwupload,scale_vaapi=w=1280:h=720:denoise=50' \
    -c:v h264_vaapi -qp 20 out_all.mp4

139 fps

d). scale without denoise

./ffmpeg_g -y -vaapi_device /dev/dri/card0 -hwaccel vaapi \
    -hwaccel_output_format vaapi \
    -i ../yami/ffmpeg_yami_testcase/skyfall2-trailer.mp4 -an \
    -vf 'format=vaapi|nv12,hwupload,scale_vaapi=w=1280:h=720' \
    -c:v h264_vaapi -qp 20 out_scale.mp4

254 fps
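
To make the four results above easier to compare, here is a small Python sketch that converts each reported frame rate into an average time per frame and shows the extra cost relative to the scale-only run d). The fps values are the ones reported above. The helper names (`frame_time_ms`, `summarize`) are made up for illustration, and the per-frame figure is a whole-pipeline average (decode, filter, encode), not a measurement of the filter alone.

```python
# Throughput figures reported above (Ivy Bridge), in frames per second.
results_fps = {
    "a) denoise -> scale (process_vaapi, scale_vaapi)": 121,
    "b) scale -> denoise (scale_vaapi, process_vaapi)": 169,
    "c) scale + denoise in scale_vaapi": 139,
    "d) scale only": 254,
}


def frame_time_ms(fps):
    """Average wall-clock time per frame, in milliseconds."""
    return 1000.0 / fps


def summarize(results, baseline="d) scale only"):
    """Return (name, fps, ms per frame, extra ms vs. baseline) for each run."""
    baseline_ms = frame_time_ms(results[baseline])
    rows = []
    for name, fps in results.items():
        ms = frame_time_ms(fps)
        rows.append((name, fps, ms, ms - baseline_ms))
    return rows


if __name__ == "__main__":
    for name, fps, ms, extra in summarize(results_fps):
        print(f"{name:50s} {fps:4d} fps {ms:6.2f} ms/frame {extra:+6.2f} ms")
```

On these single runs, the extra per-frame cost over d) is smallest for b), where denoise runs after the scale, and largest for a), where it runs before.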

I will rerun this once I find a Skylake machine. :)

On 2016/9/14 6:06, Mark Thompson wrote:
> WIP.
> ---
> On 05/09/16 02:52, Jun Zhao wrote:
>> On 2016/8/31 6:48, Mark Thompson wrote:
>>> On 30/08/16 09:00, Jun Zhao wrote:
>>>> v3 : fix sharpness mapping issue
>>>> v2 : fix filter support flag check logic issue
>>>
>>> Hi,
>>>
>>> A general remark to start: vf_scale_vaapi is named to be a scaling filter 
>>> (i.e. it replaces vf_scale/swscale for AV_PIX_FMT_VAAPI) - is this 
>>> therefore really the right place to be adding other operations unrelated to 
>>> scaling?
>>>
>>> Do use-cases for these operations actually make sense to add here rather 
>>> than in a separate filter?  (I'm not sure what the answer to this should be 
>>> - I would definitely argue that the deinterlacer should be a separate 
>>> filter, but these other operations are unclear.)
>>>
>>>
>>
>> As you know, VPP uses a pipeline mode, so splitting
>> scale/denoise/sharpness/... into different filters may not be a good idea.
>> I guess nobody wants to call vaRenderPicture()/vaEndPicture()/... again and
>> again in vf_scale_vaapi.c/vf_denoise_vaapi.c/vf_sharpness_vaapi.c/...
> 
> How about something like this, then?  It adds a new filter to do the video 
> processing, while leaving the scale filter as-is.
> 
> Implements denoise, sharpen and all of the colour balance controls; lightly 
> tested but seems working on i965/Skylake.
> 
> Outstanding issues:
> * The name is not very good, but I can't think of anything better.
> * Needs more testing.
> * Some error recovery is missing.
> * Documentation.
> * Reuses the surface pool from the input hw_frames_ctx - is anything going to 
> object to that?

Re: [FFmpeg-devel] [PATCH] lavfi: VAAPI video processing filter

2016-09-14 Thread Mark Thompson
On 14/09/16 02:30, Jun Zhao wrote:
> On 2016/9/14 6:06, Mark Thompson wrote:
>> How about something like this, then?  It adds a new filter to do the video 
>> processing, while leaving the scale filter as-is.
> 
> Can we merge the VPP scale and the other VPP filters into one AVFilter, e.g.
> vf_postprocess_vaapi.c?
> If we split scaling from the other VPP filters, I suspect there may be a
> performance issue. When they are merged into one AVFilter, there is only
> one surface copy:
> 
> 1 input surface -> 1 output surface
>     // one copy for scale/de-noise/sharpness/...
> 
> But if we split them, it will lead to two surface copies in some cases:
> 
> 1 input surface -> 1 output surface -> 2 output surface
>     // 1st copy for scale, 2nd copy for the other VAAPI filters

Can you share what driver/platform you are testing on and what commands you are 
using to get the result that the combined filter is faster?

For example, I get (1080p H.264 input, current i965 on Skylake):


[With the patch to vf_scale_vaapi]

./ffmpeg_g -y -vaapi_device /dev/dri/renderD128 -hwaccel vaapi \
    -hwaccel_output_format vaapi -i in.mp4 -an \
    -vf 'format=vaapi|nv12,hwupload,scale_vaapi=denoise=50:w=1280:h=720' \
    -c:v h264_vaapi -qp 20 out.mp4

-> 225fps.


[With the patch adding vf_process_vaapi]

./ffmpeg_g -y -vaapi_device /dev/dri/renderD128 -hwaccel vaapi \
    -hwaccel_output_format vaapi -i in.mp4 -an \
    -vf 'format=vaapi|nv12,hwupload,process_vaapi=denoise=50,scale_vaapi=w=1280:h=720' \
    -c:v h264_vaapi -qp 20 out.mp4

-> 255fps.
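
For anyone wanting to reproduce the comparison, the two command lines above differ only in the filter graph. Below is a small Python sketch that builds both from one template, using only the options that appear in the commands above. The function name `vaapi_command` and its default arguments are hypothetical, and `process_vaapi` (like the `denoise` option on `scale_vaapi`) only exists with the respective patch from this thread applied.

```python
import shlex


def vaapi_command(vf_stages, src="in.mp4", dst="out.mp4",
                  device="/dev/dri/renderD128", ffmpeg="./ffmpeg_g"):
    """Build the argument list for one benchmark run.

    vf_stages is the list of VAAPI filters to apply after the common
    'format=vaapi|nv12,hwupload' prefix used in the commands above.
    """
    vf = ",".join(["format=vaapi|nv12", "hwupload", *vf_stages])
    return [
        ffmpeg, "-y",
        "-vaapi_device", device,
        "-hwaccel", "vaapi",
        "-hwaccel_output_format", "vaapi",
        "-i", src, "-an",
        "-vf", vf,
        "-c:v", "h264_vaapi", "-qp", "20",
        dst,
    ]


# Combined filter (patch to vf_scale_vaapi) vs. separate filters
# (patch adding vf_process_vaapi).
combined = vaapi_command(["scale_vaapi=denoise=50:w=1280:h=720"])
separate = vaapi_command(["process_vaapi=denoise=50",
                          "scale_vaapi=w=1280:h=720"])

if __name__ == "__main__":
    for cmd in (combined, separate):
        # shlex.join quotes the -vf argument so the line can be pasted
        # into a shell.
        print(shlex.join(cmd))
```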


I'm not sure why the separate filters are actually faster here, but I was 
certainly expecting them to be about the same - since we haven't introduced any 
additional synchronisation points in either sequence, it should all be fully 
pipelined in the batch buffer rings from the decoder to the encoder output.  I 
believe the argument about surfaces is specious because the combined case needs 
the same intermediates and therefore internally allocates temporary surfaces 
for them.
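
The two views of the surface-copy argument can be written down as a toy counting model in Python. This is only a sketch of the reasoning in this thread, not of what libva or the i965 driver actually does: the `Stage` type, the `surfaces_written` function and the `driver_uses_temporaries` switch are all hypothetical, and the assumption that a filter doing N operations needs N-1 hidden intermediates is a simplification of "internally allocates temporary surfaces".

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """One filter instance in the graph and the VPP operations it performs."""
    name: str
    operations: tuple


def surfaces_written(chain, driver_uses_temporaries):
    """Count surfaces written per frame by a chain of filters.

    Every filter writes one visible output surface. If
    driver_uses_temporaries is True, a filter doing N operations is
    assumed to also write N-1 hidden intermediate surfaces.
    """
    total = 0
    for stage in chain:
        total += 1
        if driver_uses_temporaries:
            total += len(stage.operations) - 1
    return total


combined = [Stage("scale_vaapi", ("denoise", "scale"))]
split = [Stage("process_vaapi", ("denoise",)),
         Stage("scale_vaapi", ("scale",))]

if __name__ == "__main__":
    for label, temporaries in (("no hidden temporaries", False),
                               ("hidden temporaries", True)):
        print(label,
              "combined:", surfaces_written(combined, temporaries),
              "split:", surfaces_written(split, temporaries))
```

Without hidden temporaries the combined filter writes one surface and the split chain two, which is the case made for merging; with them, both write two, which is the case made above for expecting similar performance.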

Thanks,

- Mark

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH] lavfi: VAAPI video processing filter

2016-09-13 Thread Jun Zhao


On 2016/9/14 6:06, Mark Thompson wrote:
> WIP.
> ---
> On 05/09/16 02:52, Jun Zhao wrote:
>> On 2016/8/31 6:48, Mark Thompson wrote:
>>> On 30/08/16 09:00, Jun Zhao wrote:
>>>> v3 : fix sharpness mapping issue
>>>> v2 : fix filter support flag check logic issue
>>>
>>> Hi,
>>>
>>> A general remark to start: vf_scale_vaapi is named to be a scaling filter 
>>> (i.e. it replaces vf_scale/swscale for AV_PIX_FMT_VAAPI) - is this 
>>> therefore really the right place to be adding other operations unrelated to 
>>> scaling?
>>>
>>> Do use-cases for these operations actually make sense to add here rather 
>>> than in a separate filter?  (I'm not sure what the answer to this should be 
>>> - I would definitely argue that the deinterlacer should be a separate 
>>> filter, but these other operations are unclear.)
>>>
>>>
>>
>> As you know, VPP uses a pipeline mode, so splitting
>> scale/denoise/sharpness/... into different filters may not be a good idea.
>> I guess nobody wants to call vaRenderPicture()/vaEndPicture()/... again and
>> again in vf_scale_vaapi.c/vf_denoise_vaapi.c/vf_sharpness_vaapi.c/...
> 
> How about something like this, then?  It adds a new filter to do the video 
> processing, while leaving the scale filter as-is.

Can we merge the VPP scale and the other VPP filters into one AVFilter, e.g.
vf_postprocess_vaapi.c?
If we split scaling from the other VPP filters, I suspect there may be a
performance issue. When they are merged into one AVFilter, there is only
one surface copy:

1 input surface -> 1 output surface
    // one copy for scale/de-noise/sharpness/...

But if we split them, it will lead to two surface copies in some cases:

1 input surface -> 1 output surface -> 2 output surface
    // 1st copy for scale, 2nd copy for the other VAAPI filters

Anyway, this needs more testing.

> 
> Implements denoise, sharpen and all of the colour balance controls; lightly 
> tested but seems working on i965/Skylake.
> 
> Outstanding issues:
> * The name is not very good, but I can't think of anything better.
> * Needs more testing.
> * Some error recovery is missing.
> * Documentation.
> * Reuses the surface pool from the input hw_frames_ctx - is anything going to 
> object to that?
> * Can't order the filters applied - does that matter?
> * Sharpness + anything else aborts inside the i965 driver, other combinations 
> work - should vaQueryVideoProcPipelineCaps() detect that, or is there some 
> other way to get it?

I think the i965 driver guys will fix this issue. :)

> 
> Thanks,
> 
> - Mark
> 
> 
>  libavfilter/Makefile           |   1 +
>  libavfilter/allfilters.c       |   1 +
>  libavfilter/vf_process_vaapi.c | 597 +++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 599 insertions(+)
>  create mode 100644 libavfilter/vf_process_vaapi.c
> 
> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> index 5cd10fa..10ffa78 100644
> --- a/libavfilter/Makefile
> +++ b/libavfilter/Makefile
> @@ -239,6 +239,7 @@ OBJS-$(CONFIG_PIXDESCTEST_FILTER)            += vf_pixdesctest.o
>  OBJS-$(CONFIG_PP_FILTER)                     += vf_pp.o
>  OBJS-$(CONFIG_PP7_FILTER)                    += vf_pp7.o
>  OBJS-$(CONFIG_PREWITT_FILTER)                += vf_convolution.o
> +OBJS-$(CONFIG_PROCESS_VAAPI_FILTER)          += vf_process_vaapi.o
>  OBJS-$(CONFIG_PSNR_FILTER)                   += vf_psnr.o dualinput.o framesync.o
>  OBJS-$(CONFIG_PULLUP_FILTER)                 += vf_pullup.o
>  OBJS-$(CONFIG_QP_FILTER)                     += vf_qp.o
> diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
> index 47d95f5..0684aef 100644
> --- a/libavfilter/allfilters.c
> +++ b/libavfilter/allfilters.c
> @@ -255,6 +255,7 @@ void avfilter_register_all(void)
>      REGISTER_FILTER(PP,             pp,             vf);
>      REGISTER_FILTER(PP7,            pp7,            vf);
>      REGISTER_FILTER(PREWITT,        prewitt,        vf);
> +    REGISTER_FILTER(PROCESS_VAAPI,  process_vaapi,  vf);
>      REGISTER_FILTER(PSNR,           psnr,           vf);
>      REGISTER_FILTER(PULLUP,         pullup,         vf);
>      REGISTER_FILTER(QP,             qp,             vf);
> diff --git a/libavfilter/vf_process_vaapi.c b/libavfilter/vf_process_vaapi.c
> new file mode 100644
> index 0000000..25701a0
> --- /dev/null
> +++ b/libavfilter/vf_process_vaapi.c
> @@ -0,0 +1,597 @@
> +/*
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for