Re: [FFmpeg-devel] [PATCH, v2] lavc/vaapi_encode: grow packet if vaMapBuffer returns multiple buffers

2019-12-11 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel  On Behalf Of Max
> Dmitrichenko
> Sent: Wednesday, November 20, 2019 3:04 PM
> To: FFmpeg development discussions and patches 
> Cc: Li, Zhong 
> Subject: Re: [FFmpeg-devel] [PATCH, v2] lavc/vaapi_encode: grow packet if
> vaMapBuffer returns multiple buffers
> 
> On Sun, Sep 29, 2019 at 3:19 AM Fu, Linjie  wrote:
> 
> > > -Original Message-
> > > From: Li, Zhong 
> > > Sent: Friday, September 13, 2019 00:05
> > > To: FFmpeg development discussions and patches  > > de...@ffmpeg.org>
> > > Cc: Fu, Linjie 
> > > Subject: RE: [FFmpeg-devel] [PATCH, v2] lavc/vaapi_encode: grow packet if
> > > vaMapBuffer returns multiple buffers
> > >
> > > > From: ffmpeg-devel  On Behalf Of
> > > Linjie Fu
> > > > Sent: Friday, May 31, 2019 8:35 AM
> > > > To: ffmpeg-devel@ffmpeg.org
> > > > Cc: Fu, Linjie 
> > > > Subject: [FFmpeg-devel] [PATCH, v2] lavc/vaapi_encode: grow packet if
> > > > vaMapBuffer returns multiple buffers
> > > >
> > > > It seems that VA_CODED_BUF_STATUS_SINGLE_NALU allows driver to
> > > map
> > > > buffer for each slice.
The patch LGTM, but the commit message line quoted above does not seem very
relevant.
I will remove that line from the commit message and apply the patch if there
are no objections.

Thanks!
Ruiling 

> > > >
> > > > Currently, assigning new buffer for pkt when multiple buffer returns
> > from
> > > > vaMapBuffer will cover the previous encoded pkt data and lead to encode
> > > issues.
> > > >
> > > > Iterate through the buf_list first to find out the total buffer size
> > needed for
> > > the
> > > > pkt, allocate the whole pkt to avoid repeated reallocation and memcpy,
> > > then copy
> > > > data from each buf to pkt.
> > > >
> > > > Signed-off-by: Linjie Fu 
> > > > ---
> > > > [v2]: allocate the whole pkt to avoid repeated reallocation and memcpy
> > > >
> > > >  libavcodec/vaapi_encode.c | 18 +-
> > > >  1 file changed, 13 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/libavcodec/vaapi_encode.c b/libavcodec/vaapi_encode.c
> > index
> > > > 2dda451..9c9e5dd 100644
> > > > --- a/libavcodec/vaapi_encode.c
> > > > +++ b/libavcodec/vaapi_encode.c
> > > > @@ -489,6 +489,8 @@ static int vaapi_encode_output(AVCodecContext
> > > *avctx,
> > > >  VAAPIEncodeContext *ctx = avctx->priv_data;
> > > >  VACodedBufferSegment *buf_list, *buf;
> > > >  VAStatus vas;
> > > > +int total_size = 0;
> > > > +uint8_t *ptr;
> > > >  int err;
> > > >
> > > >  err = vaapi_encode_wait(avctx, pic); @@ -505,15 +507,21 @@ static
> > int
> > > > vaapi_encode_output(AVCodecContext *avctx,
> > > >  goto fail;
> > > >  }
> > > >
> > > > +for (buf = buf_list; buf; buf = buf->next)
> > > > +total_size += buf->size;
> > > > +
> > > > +err = av_new_packet(pkt, total_size);
> > > > +ptr = pkt->data;
> > > > +
> > > > +if (err < 0)
> > > > +goto fail_mapped;
> > > > +
> > > >  for (buf = buf_list; buf; buf = buf->next) {
> > > >  av_log(avctx, AV_LOG_DEBUG, "Output buffer: %u bytes "
> > > > "(status %08x).\n", buf->size, buf->status);
> > > >
> > > > -err = av_new_packet(pkt, buf->size);
> > > > -if (err < 0)
> > > > -goto fail_mapped;
> > > > -
> > > > -memcpy(pkt->data, buf->buf, buf->size);
> > > > +memcpy(ptr, buf->buf, buf->size);
> > > > +ptr += buf->size;
> > > >  }
> > > >
> > > >  if (pic->type == PICTURE_TYPE_IDR)
> > > > --
> > > > 2.7.4
> > >
> > > LGTM
> >
> > Thanks for review.
> > A kindly ping.
> >
> > - linjie
> >
> 
> LGTM
> 
> regards
> Max
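
The two-pass approach described in the commit message (sum the sizes of all
returned segments, allocate the packet once, then copy each segment in turn)
can be sketched in plain C. This is only an illustration under stated
assumptions: `Segment` is a hypothetical stand-in for libva's
VACodedBufferSegment, `malloc` stands in for av_new_packet(), and
`concat_segments` is a made-up name. It is not the code that was applied.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a coded-buffer segment list (not the libva type). */
typedef struct Segment {
    const uint8_t  *buf;   /* segment payload */
    size_t          size;  /* payload size in bytes */
    struct Segment *next;  /* next segment, or NULL at the end of the list */
} Segment;

/* Pass 1 sums the segment sizes, pass 2 copies every segment into a single
 * allocation. Returns a malloc'ed buffer (the caller frees it) and stores the
 * total size in *out_size, or returns NULL if the allocation fails. */
static uint8_t *concat_segments(const Segment *list, size_t *out_size)
{
    const Segment *s;
    size_t total = 0;
    uint8_t *data, *ptr;

    assert(out_size != NULL);

    for (s = list; s; s = s->next)
        total += s->size;

    data = malloc(total ? total : 1);
    if (!data)
        return NULL;               /* check the allocation before using it */

    ptr = data;
    for (s = list; s; s = s->next) {
        if (s->size)
            memcpy(ptr, s->buf, s->size);
        ptr += s->size;            /* append, instead of overwriting */
    }

    *out_size = total;
    return data;
}
```

Allocating inside the copy loop, as the old code did, replaces the packet data
on every iteration, so only the last segment survives; sizing the allocation
up front avoids both that and repeated reallocation.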
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH v3] avfilter: Add tonemap vaapi filter for H2S

2019-12-10 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Vittorio Giovara
> Sent: Tuesday, December 3, 2019 2:28 AM
> To: FFmpeg development discussions and patches 
> Cc: Sun, Xinpeng ; Zhou, Zachary
> 
> Subject: Re: [FFmpeg-devel] [PATCH v3] avfilter: Add tonemap vaapi filter for
> H2S
> 
> On Mon, Dec 2, 2019 at 2:19 AM Xinpeng Sun  wrote:
> 
> > It performs HDR(High Dynamic Range) to SDR(Standard Dynamic Range)
> > conversion
> > with tone-mapping. It only supports HDR10 as input temporarily.
> >
> > An example command to use this filter with vaapi codecs:
> > FFMPEG -hwaccel vaapi -vaapi_device /dev/dri/renderD128
> > -hwaccel_output_format vaapi \
> > -i INPUT -vf 'tonemap_vaapi=format=p010' -c:v hevc_vaapi -profile 2 OUTPUT
> >
> > Signed-off-by: Xinpeng Sun 
> > Signed-off-by: Zachary Zhou 
> > ---
> >  configure  |   2 +
> >  doc/filters.texi   |  81 +++
> >  libavfilter/Makefile   |   1 +
> >  libavfilter/allfilters.c   |   1 +
> >  libavfilter/vf_tonemap_vaapi.c | 420
> +
> >  5 files changed, 505 insertions(+)
Are there any concerns or objections? If not, I will make the requested changes
and apply this version.

Thanks!
Ruiling

[...]

Re: [FFmpeg-devel] [PATCH 3/3] avfilter/vf_convolution: add X86 SIMD for filter_column()

2019-12-03 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> chen
> Sent: Wednesday, December 4, 2019 9:36 AM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH 3/3] avfilter/vf_convolution: add X86
> SIMD for filter_column()
> 
> 
> 
> At 2019-12-04 08:59:08, "Song, Ruiling"  wrote:
> >> -Original Message-
> >> From: ffmpeg-devel  On Behalf Of
> >> chen
> >> Sent: Tuesday, December 3, 2019 4:59 PM
> >> To: FFmpeg development discussions and patches  >> de...@ffmpeg.org>
> >> Subject: Re: [FFmpeg-devel] [PATCH 3/3] avfilter/vf_convolution: add X86
> >> SIMD for filter_column()
> >>
> >> comments inline in code
> >>
> >>
> >> At 2019-12-03 15:52:07, xuju...@sjtu.edu.cn wrote:
> >> >From: Xu Jun 
> >[...]
> >> >+
> >> >+cvtdq2ps m4, m4
> >> >+mulps m4, m0 ; sum *= rdiv
> >> >+addps m4, m1 ; sum += bias
> >>
> >> >+addps m4, m5 ; sum += 0.5
> >> I don't know how about precision mismatch if we pre-compute (bias+0.5)
> 
> >I think it is hard to prove it is safe to do pre-compute.
> Agree, I also worried precision issue since float operator is execute order
> dependent.
> How about ROUNDPS?
ROUNDPS does not seem to be an exact match.
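
To illustrate why the order of the float additions matters, here is a minimal C
sketch that compares the order used in the quoted asm, (sum + bias) + 0.5,
with a pre-computed (bias + 0.5). The function names are made up for this
note, and the sketch assumes IEEE-754 single precision with the default
round-to-nearest-even mode. The input that shows a difference is far larger
than anything the 8-bit filter would produce, so it only demonstrates the
principle and says nothing about whether a mismatch occurs in practice.

```c
#include <assert.h>

/* Same order as the quoted asm: (scaled_sum + bias) + 0.5.
 * volatile forces every intermediate to be rounded to float precision. */
static float add_separately(float scaled_sum, float bias)
{
    volatile float t = scaled_sum + bias;
    t = t + 0.5f;
    return t;
}

/* Alternative order with a pre-computed constant: scaled_sum + (bias + 0.5). */
static float add_precomputed(float scaled_sum, float bias)
{
    volatile float bias_half = bias + 0.5f;
    volatile float t = scaled_sum + bias_half;
    return t;
}
```

With scaled_sum = 8388608.0f (2^23, where adjacent floats are 1 apart) and
bias = 0.25f, add_separately() returns 8388608.0f because both small additions
are rounded away, while add_precomputed() returns 8388609.0f.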
> 
> 
> >
> >>
> >>
> >> >+cvttps2dq m4, m4
> >> >+packssdw m4, m4
> >> >+packuswb m4, m4
> >> >+movss [dstq + dst_offq], m4
> >> >+add c_offq, mmsize/4
> >> >+add dst_offq, mmsize/4
> >> >+
> >> >+add off16q, mmsize/4
> >> >+cmp off16q, widthq
> >> >+jl .loop16
> >> >+
> >> >+add widthq, rq
> >> >+cmp off16q, widthq
> >> >+jge .paraend
> >> >+
> >>
> >> >+.loopr:
> >> no idea about this loop, if we can read beyond, we can reuse above SIMD
> >> code
> >Reuse above SIMD code may write to the memory that does not belong to
> this slice-thread.
> 
> >IMO, the code to handle remainder columns is still necessary.
> 
> 
> Depends on algorithm & size,
> For example width=23
> Process #0 [0:15]
> Process #1 [7:22]
> Both of them is multiple of 16
Sounds interesting, but FFmpeg does not work this way at the moment.
One question: is there a penalty when different threads write to the same
memory addresses (here both would write columns 7-15)?

> 

Re: [FFmpeg-devel] [PATCH 3/3] avfilter/vf_convolution: add X86 SIMD for filter_column()

2019-12-03 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> chen
> Sent: Tuesday, December 3, 2019 4:59 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH 3/3] avfilter/vf_convolution: add X86
> SIMD for filter_column()
> 
> comments inline in code
> 
> 
> At 2019-12-03 15:52:07, xuju...@sjtu.edu.cn wrote:
> >From: Xu Jun 
[...]
> >+
> >+cvtdq2ps m4, m4
> >+mulps m4, m0 ; sum *= rdiv
> >+addps m4, m1 ; sum += bias
> 
> >+addps m4, m5 ; sum += 0.5
> I don't know how about precision mismatch if we pre-compute (bias+0.5)
I think it is hard to prove that pre-computing (bias + 0.5) is safe.

> 
> 
> >+cvttps2dq m4, m4
> >+packssdw m4, m4
> >+packuswb m4, m4
> >+movss [dstq + dst_offq], m4
> >+add c_offq, mmsize/4
> >+add dst_offq, mmsize/4
> >+
> >+add off16q, mmsize/4
> >+cmp off16q, widthq
> >+jl .loop16
> >+
> >+add widthq, rq
> >+cmp off16q, widthq
> >+jge .paraend
> >+
> 
> >+.loopr:
> no idea about this loop, if we can read beyond, we can reuse above SIMD
> code
Reusing the SIMD code above may write to memory that does not belong to this
slice thread.
IMO, the code that handles the remainder columns is still necessary.

Ruiling
> 
> 
> >+xor sumd, sumd
> >+xor iq, iq
> >+.loopr_i:
> >+mov ciq, [ptrq + iq * gprsize]
> >+movzx rd, byte [ciq + c_offq]
> >+imul rd, [matrixq + 4*iq]
> >+add sumd, rd
> >+
> >+add iq, 1
> >+cmp iq, radq
> >+jl .loopr_i
> >+
> >+pxor m4, m4
> >+cvtsi2ss m4, sumd
> >+mulss m4, m0 ; sum *= rdiv
> >+addss m4, m1 ; sum += bias
> >+addss m4, m5 ; sum += 0.5
> >+cvttps2dq m4, m4
> >+packssdw m4, m4
> >+packuswb m4, m4
> >+movd sumd, m4
> >+mov [dstq + dst_offq], sumb
> >+add c_offq, 1
> >+add dst_offq, 1
> >+add off16q, 1
> >+cmp off16q, widthq
> >+jl .loopr
> >+
> >+.paraend:
> >+sub c_offq, widthq
> >+sub dst_offq, widthq
> >+add c_offq, strideq
> >+add dst_offq, dstrideq
> >+
> >+sub heightq, 1
> >+cmp heightq, 0
> >+jg .loopy
> >+
> >+.end:
> >+RET
> 

Re: [FFmpeg-devel] [PATCH v3] avfilter: Add tonemap vaapi filter for H2S

2019-12-02 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel  On Behalf Of Fu,
> Linjie
> Sent: Tuesday, December 3, 2019 11:23 AM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Cc: Sun, Xinpeng ; Zhou, Zachary
> 
> Subject: Re: [FFmpeg-devel] [PATCH v3] avfilter: Add tonemap vaapi filter for
> H2S
> 
> > -Original Message-
> > From: ffmpeg-devel  On Behalf Of
> > Xinpeng Sun
> > Sent: Monday, December 2, 2019 15:17
> > To: ffmpeg-devel@ffmpeg.org
> > Cc: Sun, Xinpeng ; Zhou, Zachary
> > 
> > Subject: [FFmpeg-devel] [PATCH v3] avfilter: Add tonemap vaapi filter for
> > H2S
> >
> > It performs HDR(High Dynamic Range) to SDR(Standard Dynamic Range)
> > conversion
> > with tone-mapping. It only supports HDR10 as input temporarily.
> >
> > An example command to use this filter with vaapi codecs:
> > FFMPEG -hwaccel vaapi -vaapi_device /dev/dri/renderD128 -hwaccel_output_format vaapi \
> > -i INPUT -vf 'tonemap_vaapi=format=p010' -c:v hevc_vaapi -profile 2 OUTPUT
> >
> > Signed-off-by: Xinpeng Sun 
> > Signed-off-by: Zachary Zhou 
> > ---
> >  configure  |   2 +
> >  doc/filters.texi   |  81 +++
> >  libavfilter/Makefile   |   1 +
> >  libavfilter/allfilters.c   |   1 +
> >  libavfilter/vf_tonemap_vaapi.c | 420
> > +
> >  5 files changed, 505 insertions(+)
> >  create mode 100644 libavfilter/vf_tonemap_vaapi.c
> >
> > diff --git a/configure b/configure
> > index ca7137f341..5272fb2a57 100755
> > --- a/configure
> > +++ b/configure
> > @@ -3576,6 +3576,7 @@ tinterlace_filter_deps="gpl"
> >  tinterlace_merge_test_deps="tinterlace_filter"
> >  tinterlace_pad_test_deps="tinterlace_filter"
> >  tonemap_filter_deps="const_nan"
> > +tonemap_vaapi_filter_deps="vaapi
> > VAProcPipelineParameterBuffer_output_hdr_metadata"
> >  tonemap_opencl_filter_deps="opencl const_nan"
> >  transpose_opencl_filter_deps="opencl"
> >  transpose_vaapi_filter_deps="vaapi VAProcPipelineCaps_rotation_flags"
> > @@ -6576,6 +6577,7 @@ if enabled vaapi; then
> >
> >  check_type "va/va.h va/va_dec_hevc.h"
> > "VAPictureParameterBufferHEVC"
> >  check_struct "va/va.h" "VADecPictureParameterBufferVP9" bit_depth
> > +check_struct "va/va.h va/va_vpp.h" "VAProcPipelineParameterBuffer"
> > output_hdr_metadata
> >  check_struct "va/va.h va/va_vpp.h" "VAProcPipelineCaps"
> rotation_flags
> >  check_type "va/va.h va/va_enc_hevc.h"
> > "VAEncPictureParameterBufferHEVC"
> >  check_type "va/va.h va/va_enc_jpeg.h"
> > "VAEncPictureParameterBufferJPEG"
> > diff --git a/doc/filters.texi b/doc/filters.texi
> > index 5fdec6f015..7223ab89a3 100644
> > --- a/doc/filters.texi
> > +++ b/doc/filters.texi
> > @@ -20972,6 +20972,87 @@ Apply a strong blur of both luma and chroma
> > parameters:
> >
> >  @c man end OPENCL VIDEO FILTERS
> >
> > +@chapter VAAPI Video Filters
> > +@c man begin VAAPI VIDEO FILTERS
> > +
> > +VAAPI Video filters are usually used with VAAPI decoder and VAAPI
> > encoder. Below is a description of VAAPI video filters.
> > +
> > +To enable compilation of these filters you need to configure FFmpeg with
> > +@code{--enable-vaapi}.
> > +
> > +Running VAAPI filters requires you to initialize a hardware device and to
> > pass that device to all filters in any filter graph.
> > +@table @option
> > +
> > +@item -hwaccel vaapi
> > +Specify the hardware accelerator as @var{vaapi}.
> > +
> > +@item -vaapi_device @var{driver_path}
> > +Specify the vaapi driver path with @var{driver_path}.
> > +
> > +@item -hwaccel_output_format @var{vaapi}
> > +Specify the output format of hardware accelerator as @var{vaapi}. All
> > VAAPI hardware surfaces in ffmpeg are represented by the @var{vaapi}
> > pixfmt.
> > +
> > +@end table
> > +
> > +@itemize
> > +@item
> > +Example of running tonemap_vaapi filter with default parameters on it.
> > +@example
> > +-hwaccel vaapi -vaapi_device /dev/dri/renderD128 -hwaccel_output_format vaapi -i INPUT -vf "tonemap_vaapi, hwdownload" OUTPUT
> > +@end example
> > +@end itemize
> > +
> > +Since VAAPI filters are not able to access frame data in arbitrary memory,
> so
> > if you use a decoder other than VAAPI decoder before VAAPI filters, all
> > frame data needs to be uploaded(@ref{hwupload}) to hardware surfaces
> > connected to the appropriate device before being used. Also if you add a
> > encoder other than VAAPI encoder after VAAPI filters,
> 
> How about VAAPI decoder/filter + QSV encoder?
I think hwmap may help with this. In any case, we can improve the documentation
further later if you have a good idea.

Ruiling
> 

Re: [FFmpeg-devel] [PATCH v3] avfilter: Add tonemap vaapi filter for H2S

2019-12-02 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Vittorio Giovara
> Sent: Tuesday, December 3, 2019 2:28 AM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Cc: Sun, Xinpeng ; Zhou, Zachary
> 
> Subject: Re: [FFmpeg-devel] [PATCH v3] avfilter: Add tonemap vaapi filter for
> H2S
> 
> On Mon, Dec 2, 2019 at 2:19 AM Xinpeng Sun 
> wrote:
> 
> > It performs HDR(High Dynamic Range) to SDR(Standard Dynamic Range)
> > conversion
> > with tone-mapping. It only supports HDR10 as input temporarily.
> >
> > An example command to use this filter with vaapi codecs:
> > FFMPEG -hwaccel vaapi -vaapi_device /dev/dri/renderD128
> > -hwaccel_output_format vaapi \
> > -i INPUT -vf 'tonemap_vaapi=format=p010' -c:v hevc_vaapi -profile 2
> OUTPUT
> >
> > Signed-off-by: Xinpeng Sun 
> > Signed-off-by: Zachary Zhou 
> > ---
> >  configure  |   2 +
> >  doc/filters.texi   |  81 +++
> >  libavfilter/Makefile   |   1 +
> >  libavfilter/allfilters.c   |   1 +
> >  libavfilter/vf_tonemap_vaapi.c | 420
> +
> >  5 files changed, 505 insertions(+)
> >  create mode 100644 libavfilter/vf_tonemap_vaapi.c
> >
[...]
> > +static int tonemap_vaapi_save_metadata(AVFilterContext *avctx,
> AVFrame
> > *input_frame)
> > +{
> > +HDRVAAPIContext *ctx = avctx->priv;
> > +AVMasteringDisplayMetadata *hdr_meta;
> > +AVContentLightMetadata *light_meta;
> > +
> > +if (input_frame->color_trc != AVCOL_TRC_SMPTE2084) {
> > +av_log(avctx, AV_LOG_WARNING, "Only support HDR10 as input for
> > vaapi tone-mapping\n");
> > +input_frame->color_trc = AVCOL_TRC_SMPTE2084;
I don't think we need to modify input_frame->color_trc here. I am not sure
whether it has any side effects, but it may be misleading if you check that
value when debugging.
Simply removing this single line would be fine.

[...]
> > +err = av_frame_copy_props(output_frame, input_frame);
> > +if (err < 0)
> > +return err;
> > +
> > +if (ctx->color_primaries != AVCOL_PRI_UNSPECIFIED)
> > +output_frame->color_primaries = ctx->color_primaries;
> > +
> > +if (ctx->color_transfer != AVCOL_TRC_UNSPECIFIED)
> > +output_frame->color_trc = ctx->color_transfer;
> > +else
> > +output_frame->color_trc = AVCOL_TRC_BT709
> >
> 
> why does only this setting get special treatment?
Basically, the other properties can be copied from the source, but color_trc
cannot: after tone-mapping the output is SDR, so it no longer has the input's
transfer characteristic.
I guess BT.709 is a widely used SDR transfer characteristic, so we use it as the
default even if the user does not specify a target one.
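
The rule described here (copy everything from the source, override only what
the user set, and always replace the transfer characteristic) can be sketched
in standalone C. All type and function names below are hypothetical stand-ins
for the AVCOL_* enums and the filter context; this is an illustration, not the
filter's code.

```c
#include <assert.h>

/* Hypothetical stand-ins for AVCOL_PRI_* and AVCOL_TRC_* values. */
typedef enum { SK_PRI_UNSPECIFIED, SK_PRI_BT709, SK_PRI_BT2020 } SketchPrimaries;
typedef enum { SK_TRC_UNSPECIFIED, SK_TRC_BT709, SK_TRC_SMPTE2084 } SketchTransfer;

typedef struct SketchProps {
    SketchPrimaries primaries;
    SketchTransfer  trc;
} SketchProps;

/* Derives the output frame properties from the input frame properties and the
 * user options (an UNSPECIFIED value means "option not given"). */
static SketchProps output_props(SketchProps input, SketchProps user)
{
    SketchProps out = input;   /* start from a copy of the source properties */

    /* Primaries may stay as they were unless the user asks for a change. */
    if (user.primaries != SK_PRI_UNSPECIFIED)
        out.primaries = user.primaries;

    /* The source TRC (PQ) is never valid for the SDR output, so it is always
     * replaced: by the user's choice, or by BT.709 as the default. */
    out.trc = user.trc != SK_TRC_UNSPECIFIED ? user.trc : SK_TRC_BT709;

    return out;
}
```

This is why color_trc is the only property with an else branch: leaving it
untouched would label SDR output as SMPTE2084.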

[...]
> 
> Overall this lgtm, I'd push it but I don't have a platform to test it on.
I really appreciate that. I borrowed an Ice Lake machine from another team
member and tested this patch; the tone-mapped output video basically looks good.

Ruiling
> --
> Vittorio

Re: [FFmpeg-devel] [PATCH] avfilter/vf_convolution: add 16-column operation for filter_column() to prepare for x86 SIMD.

2019-12-01 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> xuju...@sjtu.edu.cn
> Sent: Wednesday, November 27, 2019 10:56 PM
> To: ffmpeg-devel@ffmpeg.org
> Cc: xuju...@sjtu.edu.cn
> Subject: [FFmpeg-devel] [PATCH] avfilter/vf_convolution: add 16-column
> operation for filter_column() to prepare for x86 SIMD.
> 
> From: Xu Jun 
> 
> In order to add x86 SIMD for filter_column(), I write a C function which
> processes 16 columns at a time.
> 
> Signed-off-by: Xu Jun 
> ---
>  libavfilter/vf_convolution.c  | 56 +++
>  libavfilter/x86/vf_convolution_init.c | 23 +++
>  2 files changed, 79 insertions(+)
> 
> diff --git a/libavfilter/vf_convolution.c b/libavfilter/vf_convolution.c
> index d022f1a04a..5291415d48 100644
> --- a/libavfilter/vf_convolution.c
> +++ b/libavfilter/vf_convolution.c
> @@ -520,6 +520,61 @@ static int filter_slice(AVFilterContext *ctx, void *arg,
> int jobnr, int nb_jobs)
>  continue;
>  }
> 
> +if (mode == MATRIX_COLUMN && s->filter[plane] != filter_column){
> +for (y = slice_start; y < slice_end - 16; y+=16) {
Please take care of the coding style: there should be whitespace between
variables and operators.
Also, I think this change makes the code harder to maintain. Let's try to avoid
code duplication as much as we can.
> +const int xoff = (y - slice_start) * bpc;
> +const int yoff = radius * stride;
> +for (x = 0; x < radius; x++) {
> +const int xoff = (y - slice_start) * bpc;
> +const int yoff = x * stride;
> +
> +s->setup[plane](radius, c, src, stride, x, width, y, 
> height, bpc);
> +s->filter[plane](dst + yoff + xoff, 1, rdiv,
> +bias, matrix, c, 16, radius,
> +dstride, stride);
> +}
> +s->setup[plane](radius, c, src, stride, radius, width, y, 
> height, bpc);
> +s->filter[plane](dst + yoff + xoff, sizew - 2 * radius,
> +rdiv, bias, matrix, c, 16, radius,
> +dstride, stride);
> +for (x = sizew - radius; x < sizew; x++) {
> +const int xoff = (y - slice_start) * bpc;
> +const int yoff = x * stride;
> +
> +s->setup[plane](radius, c, src, stride, x, width, y, 
> height, bpc);
> +s->filter[plane](dst + yoff + xoff, 1, rdiv,
> +bias, matrix, c, 16, radius,
> +dstride, stride);
> +}
> +}
> +if (y < slice_end){
> +const int xoff = (y - slice_start) * bpc;
> +const int yoff = radius * stride;
> +for (x = 0; x < radius; x++) {
> +const int xoff = (y - slice_start) * bpc;
> +const int yoff = x * stride;
> +
> +s->setup[plane](radius, c, src, stride, x, width, y, 
> height, bpc);
> +s->filter[plane](dst + yoff + xoff, 1, rdiv,
> +bias, matrix, c, slice_end - y, radius,
> +dstride, stride);
> +}
> +s->setup[plane](radius, c, src, stride, radius, width, y, 
> height, bpc);
> +s->filter[plane](dst + yoff + xoff, sizew - 2 * radius,
> +rdiv, bias, matrix, c, slice_end - y, radius,
> +dstride, stride);
> +for (x = sizew - radius; x < sizew; x++) {
> +const int xoff = (y - slice_start) * bpc;
> +const int yoff = x * stride;
> +
> +s->setup[plane](radius, c, src, stride, x, width, y, 
> height, bpc);
> +s->filter[plane](dst + yoff + xoff, 1, rdiv,
> +bias, matrix, c, slice_end - y, radius,
> +dstride, stride);
> +}
> +}
> +}
> +else {
>  for (y = slice_start; y < slice_end; y++) {
>  const int xoff = mode == MATRIX_COLUMN ? (y - slice_start) * bpc 
> :
> radius * bpc;
>  const int yoff = mode == MATRIX_COLUMN ? radius * stride : 0;
> @@ -550,6 +605,7 @@ static int filter_slice(AVFilterContext *ctx, void *arg,
> int jobnr, int nb_jobs)
>  dst += dstride;
>  }
>  }
> +}
> 
>  return 0;
>  }
> diff --git a/libavfilter/x86/vf_convolution_init.c
> b/libavfilter/x86/vf_convolution_init.c
> index d1e8c90ceb..6b1c2f0e9f 100644
> --- a/libavfilter/x86/vf_convolution_init.c
> +++ b/libavfilter/x86/vf_convolution_init.c
> @@ -34,6 +34,27 @@ void ff_filter_row_sse4(uint8_t *dst, int width,
>  const uint8_t 

Re: [FFmpeg-devel] [PATCH] avformat/hlsenc: remove duplicate code block

2019-11-28 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel  On Behalf Of Liu
> Steven
> Sent: Friday, November 29, 2019 7:42 AM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Cc: Liu Steven 
> Subject: Re: [FFmpeg-devel] [PATCH] avformat/hlsenc: remove duplicate
> code block
> 
> 
> 
> > On Nov 29, 2019, at 2:48 AM, Michael Niedermayer wrote:
> >
> > On Thu, Nov 28, 2019 at 11:26:24AM +0800, Steven Liu wrote:
> >>
> >>
> >>> On Nov 28, 2019, at 04:06, Michael Niedermayer wrote:
> >>>
> >>> mm-short.mpg
> >> Hi Michael,
> >>
> >>Where should i download the file mm-short.mpg?
> >
> > you can make it yourself, it is just:
> >
> > dd if=matrixbench_mpeg2.mpg of=mm-short.mpg count=4000
> 
> StevenLiu:dash StevenLiu$ find fate-suite -name matrixbench_mpeg2.mpg
> StevenLiu:dash StevenLiu$
> There is no file named matrixbench_mpeg2.mpg.
> I am asking for the mpg file because I want to know what it contains: just a
> video stream? An audio stream? Or other streams?
https://samples.ffmpeg.org/benchmark/testsuite1/

> 
> 
> Anyway, I will resubmit a new version of the patch and try to fix this problem.
> 
> >
> > db7c44ab3d2b75d6e61fe61b1a595b31  mm-short.mpg
> >
> 
> Thanks
> Steven
> 
> 
> 

Re: [FFmpeg-devel] [PATCH v2] avfilter: Add tonemap vaapi filter for H2S

2019-11-28 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Carl Eugen Hoyos
> Sent: Thursday, November 28, 2019 5:16 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH v2] avfilter: Add tonemap vaapi filter for
> H2S
> 
> On Thu, Nov 28, 2019 at 07:56, Song, Ruiling wrote:
> 
> > > > On Nov 28, 2019, at 06:37, Sun, Xinpeng wrote:
> > > >
> > > >>>
> > > >>> +if (input_frame->color_trc != AVCOL_TRC_SMPTE2084) {
> > > >>> +av_log(avctx, AV_LOG_ERROR, "Only support HDR10 as input
> for
> > > vaapi tone-mapping\n");
> > > >>> +return AVERROR(EINVAL);
> > > >>
> > > >> Shouldn't this also accept unknown trc?
> > > >> (With a warning)
> > > >
> > > > Sorry if I misunderstand "unknown trc". Did you mean the trc
> undefined by
> > > ffmpeg or the trc unsupported by the driver?
> > >
> > > My question is:
> > > If input_frame->color_trc is AVCOL_TRC_UNSPECIFIED, will the above fail?
> > > But shouldn’t the user be able to use the filter in this case?
> >
> > I am not sure if assuming the input is using SMPTE2084 sounds more
> acceptable
> > in case of unspecified? If yes, I think we can change as you suggested.
> 
> (Me neither.)
> A warning could be shown instead of failing.
Adding a warning sounds like a good idea. But in order to proceed with the
tone-mapping, a default input transfer function needs to be chosen; I think we
can use SMPTE2084 here.
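
A minimal sketch of the behavior discussed here: warn and assume SMPTE2084 when
the input transfer characteristic is unspecified. The enum and function names
are hypothetical stand-ins for AVCOL_TRC_* and av_log(). How an explicitly
different transfer characteristic should be handled is not settled in this
message; rejecting it, as the sketch does, is an assumption.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for AVCOL_TRC_* values. */
typedef enum {
    IN_TRC_UNSPECIFIED,
    IN_TRC_BT709,
    IN_TRC_SMPTE2084
} InputTrc;

/* Decides which transfer characteristic the tone-mapping should assume.
 * Returns 0 and stores the result in *assumed, or -1 if the input is tagged
 * with something other than SMPTE2084 (assumption: treat that as an error). */
static int resolve_input_trc(InputTrc tagged, InputTrc *assumed)
{
    assert(assumed != NULL);

    if (tagged == IN_TRC_SMPTE2084) {
        *assumed = tagged;
        return 0;
    }
    if (tagged == IN_TRC_UNSPECIFIED) {
        /* Do not fail: warn and fall back to the HDR10 transfer function. */
        fprintf(stderr, "warning: input TRC unspecified, assuming SMPTE2084\n");
        *assumed = IN_TRC_SMPTE2084;
        return 0;
    }
    return -1;
}
```

The result is kept in a separate variable rather than written back to the input
frame, so the tagged value stays available when debugging.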

Ruiling
> 
> Carl Eugen

Re: [FFmpeg-devel] [PATCH v2] avfilter: Add tonemap vaapi filter for H2S

2019-11-27 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Carl Eugen Hoyos
> Sent: Thursday, November 28, 2019 2:37 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH v2] avfilter: Add tonemap vaapi filter for
> H2S
> 
> 
> 
> > On Nov 28, 2019, at 06:37, Sun, Xinpeng wrote:
> >
> >>>
> >>> +if (input_frame->color_trc != AVCOL_TRC_SMPTE2084) {
> >>> +av_log(avctx, AV_LOG_ERROR, "Only support HDR10 as input for
> vaapi tone-mapping\n");
> >>> +return AVERROR(EINVAL);
> >>
> >> Shouldn't this also accept unknown trc?
> >> (With a warning)
> >
> > Sorry if I misunderstand "unknown trc". Did you mean the trc undefined by
> ffmpeg or the trc unsupported by the driver?
> 
> My question is:
> If input_frame->color_trc is AVCOL_TRC_UNSPECIFIED, will the above fail?
> But shouldn’t the user be able to use the filter in this case?
Hi Carl,

I am not sure whether assuming that the input uses SMPTE2084 is more acceptable
when the transfer characteristic is unspecified. If it is, I think we can change
it as you suggested.
I suggested that Xinpeng follow the behavior of tonemap_opencl in such cases,
i.e. fail explicitly for anything other than SMPTE2084, because otherwise people
may get incorrect colors.

Ruiling
> 
> Thank you for the explanations, Carl Eugen

Re: [FFmpeg-devel] [PATCH v1] avfilter: Add tonemap vaapi filter for H2S

2019-11-12 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Carl Eugen Hoyos
> Sent: Tuesday, November 12, 2019 5:52 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH v1] avfilter: Add tonemap vaapi filter for
> H2S
> 
> Hi!
> 
> > On Nov 12, 2019, at 17:59, Xinpeng Sun wrote:
> >
> > It performs HDR(High Dynamic Range) to SDR(Standard Dynamic Range)
> conversion
> > with tone-mapping. It supports HDR10 only as input temporarily.
> >
> > H2S: P010 -> NV12
> 
> No objection here but could you tell us if you (Intel) already have a plan how
> to deal with H2H?
Hi Carl,

I guess H2S would be much more useful than H2H, so I think it is fine to focus
on making the H2S patch work well and getting it accepted.

Ruiling
> 
> Thank you for your effort, Carl Eugen

Re: [FFmpeg-devel] [PATCH v1] avfilter: Add tonemap vaapi filter for H2S

2019-11-12 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Xinpeng Sun
> Sent: Tuesday, November 12, 2019 5:00 PM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Sun, Xinpeng ; Zhou, Zachary
> 
> Subject: [FFmpeg-devel] [PATCH v1] avfilter: Add tonemap vaapi filter for
> H2S
> 
> It performs HDR(High Dynamic Range) to SDR(Standard Dynamic Range)
> conversion
> with tone-mapping. It supports HDR10 only as input temporarily.
> 
> H2S: P010 -> NV12
Have you tried P010 HDR to P010 SDR? Does it work? I think people may want to
use 10-bit SDR because it preserves more color detail.

> 
> An example command to use this filter with vaapi codecs:
> FFMPEG -hwaccel vaapi -vaapi_device /dev/dri/renderD128 -hwaccel_output_format vaapi \
> -i INPUT -vf 'tonemap_vaapi=h2s,hwdownload,format=nv12' -pix_fmt nv12 \
> -f rawvideo -y OUTPUT
> 
> Signed-off-by: Xinpeng Sun 
> Signed-off-by: Zachary Zhou 
> ---
>  doc/filters.texi   |  30 
>  libavfilter/Makefile   |   1 +
>  libavfilter/allfilters.c   |   1 +
>  libavfilter/vaapi_vpp.c|   5 +
>  libavfilter/vf_tonemap_vaapi.c | 272
> +
>  5 files changed, 309 insertions(+)
>  create mode 100644 libavfilter/vf_tonemap_vaapi.c
> 
> diff --git a/doc/filters.texi b/doc/filters.texi
> index 6800124574..b1c466ba24 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -20754,6 +20754,36 @@ Convert HDR(PQ/HLG) video to bt2020-transfer-
> characteristic p010 format using li
>  @end example
>  @end itemize
> 
This should not be here. Please move it above the OpenCL video filters, or
start a separate chapter dedicated to VAAPI-accelerated video filters.
> +@section tonemap_vaapi
> +
> +Perform HDR(High Dynamic Range) to SDR(Standard Dynamic Range)
> conversion with tone-mapping.
> +It maps the dynamic range of HDR10 content to the SDR content.
> +It only accepts HDR10 as input temporarily.
> +
> +It accepts the following parameters:
> +
> +@table @option
> +@item type
> +Specify the tone-mapping operator to be used.
> +
> +Possible values are:
> +@table @var
> +@item h2s
> +Perform H2S(HDR to SDR), convert from p010 to nv12
> +@end table
> +
> +@end table
> +
> +@subsection Example
> +
> +@itemize
> +@item
> +Convert HDR video to SDR video from p010 format to nv12 format.
> +@example
> +-i INPUT -vf "tonemap_vaapi=h2s" OUTPUT
> +@end example
> +@end itemize
> +
>  @section unsharp_opencl
> 
>  Sharpen or blur the input video.
> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> index fce930360d..90a0e9945e 100644
> --- a/libavfilter/Makefile
> +++ b/libavfilter/Makefile
> @@ -410,6 +410,7 @@ OBJS-$(CONFIG_TMIX_FILTER)   += vf_mix.o
> framesync.o
>  OBJS-$(CONFIG_TONEMAP_FILTER)+= vf_tonemap.o colorspace.o
>  OBJS-$(CONFIG_TONEMAP_OPENCL_FILTER) += vf_tonemap_opencl.o
> colorspace.o opencl.o \
>  opencl/tonemap.o 
> opencl/colorspace_common.o
> +OBJS-$(CONFIG_TONEMAP_VAAPI_FILTER)  += vf_tonemap_vaapi.o
> vaapi_vpp.o
>  OBJS-$(CONFIG_TPAD_FILTER)   += vf_tpad.o
>  OBJS-$(CONFIG_TRANSPOSE_FILTER)  += vf_transpose.o
>  OBJS-$(CONFIG_TRANSPOSE_NPP_FILTER)  += vf_transpose_npp.o
> diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
> index 7c1e19e1da..b2fb1f8a98 100644
> --- a/libavfilter/allfilters.c
> +++ b/libavfilter/allfilters.c
> @@ -390,6 +390,7 @@ extern AVFilter ff_vf_tlut2;
>  extern AVFilter ff_vf_tmix;
>  extern AVFilter ff_vf_tonemap;
>  extern AVFilter ff_vf_tonemap_opencl;
> +extern AVFilter ff_vf_tonemap_vaapi;
>  extern AVFilter ff_vf_tpad;
>  extern AVFilter ff_vf_transpose;
>  extern AVFilter ff_vf_transpose_npp;
> diff --git a/libavfilter/vaapi_vpp.c b/libavfilter/vaapi_vpp.c
> index b5b245c8af..5776243fa0 100644
> --- a/libavfilter/vaapi_vpp.c
> +++ b/libavfilter/vaapi_vpp.c
> @@ -257,6 +257,11 @@ static const VAAPIColourProperties
> vaapi_colour_standard_map[] = {
>  { VAProcColorStandardSMPTE170M,   6,  6,  6 },
>  { VAProcColorStandardSMPTE240M,   7,  7,  7 },
>  { VAProcColorStandardGenericFilm, 8,  1,  1 },
> +
> +#if VA_CHECK_VERSION(2, 3, 0)
> +{ VAProcColorStandardExplicit,9,  16, AVCOL_SPC_BT2020_NCL},
> +#endif
> +
>  #if VA_CHECK_VERSION(1, 1, 0)
>  { VAProcColorStandardSRGB,1, 13,  0 },
>  { VAProcColorStandardXVYCC601,1, 11,  5 },
> diff --git a/libavfilter/vf_tonemap_vaapi.c b/libavfilter/vf_tonemap_vaapi.c
> new file mode 100644
> index 00..27ee17bf00
> --- /dev/null
> +++ b/libavfilter/vf_tonemap_vaapi.c
> @@ -0,0 +1,272 @@
> +/*
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will 

Re: [FFmpeg-devel] [PATCH v1 1/2] lavu/pixfmt: add new pixel format a2r10g10b10/a2b10g10r10

2019-09-27 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Carl Eugen Hoyos
> Sent: Friday, September 27, 2019 7:47 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH v1 1/2] lavu/pixfmt: add new pixel
> format a2r10g10b10/a2b10g10r10
> 
> Am Fr., 27. Sept. 2019 um 11:02 Uhr schrieb Sun, Xinpeng
> :
> 
> > > > Add two 10 bit RGBA pixel format for hardware color space conversion
> > > > support in VAAPI and QSV:
> > > >
> > > > 2:10:10:10 10 bit: A2R10G10B10
> > > > 2:10:10:10 10 bit: A2B10G10R10
> > >
> > > Without more explanation, this patch is not ok.
> 
> > The main reasons for adding these two format are as follows:
> > 1. For most HDR monitors, A2R10G10B10 is used for display format for
> > rendering. So this format is important to do 10bit RGB rendering support
> > in ffmpeg.
> 
> For which operating systems (and video drivers) is this true?
Here:
https://docs.microsoft.com/en-us/windows/win32/direct3d9/d3dformat
and here:
https://docs.microsoft.com/en-us/windows/win32/directshow/uncompressed-rgb-video-subtypes

They are defined in the Windows APIs, so I guess these are widely used 10-bit
RGB formats on Windows?
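
To make the 2:10:10:10 layout concrete, here is a minimal sketch of packing one
pixel. It assumes the D3D-style ordering, where alpha occupies the two most
significant bits and blue the ten least significant bits of a 32-bit word. The
helper names are made up for illustration and are not FFmpeg API.

```c
#include <stdint.h>

/* Hypothetical helper: pack one A2R10G10B10 pixel into a 32-bit word.
 * Assumed layout (MSB to LSB): A[31:30] R[29:20] G[19:10] B[9:0]. */
uint32_t pack_a2r10g10b10(unsigned a, unsigned r, unsigned g, unsigned b)
{
    return ((uint32_t)(a & 0x3)   << 30) |
           ((uint32_t)(r & 0x3FF) << 20) |
           ((uint32_t)(g & 0x3FF) << 10) |
            (uint32_t)(b & 0x3FF);
}

/* Hypothetical helper: extract the 10-bit red component again. */
unsigned red_of_a2r10g10b10(uint32_t px)
{
    return (px >> 20) & 0x3FF;
}
```

A2B10G10R10 would be the same packing with the red and blue positions swapped.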

> And which video players will profit from this filter?
> 
> > 2. HW VPP can do both p010->a2r10g10b10 and a2r10g10b10->p010
> > with this patch, which can provide support for hw encode pipeline
> > using a2r10g10b10 as input.
> 
> But if the pipeline (that you control, no?) would support GBRP10, not
> only one (very) specific use case would be supported but all thinkable
> use cases, or do I misunderstand?
AFAIK, Intel GPUs do not support planar RGB10 currently. Maybe we can add
format conversion between GBRP10 and the newly added formats in swscale? (I am
not sure whether this would help the use cases you have in mind.)
Could you also share your thoughts on why supporting planar RGB10 is important?
Thanks!

Ruiling
> 
> Afaict, the fact that FFmpeg cannot deal at all with HDR is the most
> pressing issue we have atm. I believe that only solving this problem
> for one specific use case is not the ideal solution.
> 
> Carl Eugen
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH V2 1/3] checkasm/vf_eq: add test for vf_eq

2019-09-24 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel  On Behalf Of Li,
> Zhong
> Sent: Tuesday, September 24, 2019 2:34 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH V2 1/3] checkasm/vf_eq: add test for
> vf_eq
> 
> > From: ffmpeg-devel  On Behalf Of
> Ting Fu
> > Sent: Wednesday, September 18, 2019 3:06 PM
> > To: ffmpeg-devel@ffmpeg.org
> > Subject: [FFmpeg-devel] [PATCH V2 1/3] checkasm/vf_eq: add test for
> vf_eq
> >
> > Signed-off-by: Ting Fu 
> > ---
> >  libavfilter/vf_eq.c   | 13 ---
> >  libavfilter/vf_eq.h   |  1 +
> >  tests/checkasm/Makefile   |  1 +
> >  tests/checkasm/checkasm.c |  3 ++
> >  tests/checkasm/checkasm.h |  1 +
> >  tests/checkasm/vf_eq.c| 79
> +++
[...]
> > +declare_func(void, EQParameters *param, uint8_t *dst, int dst_stride,
> > + const uint8_t *src, int src_stride, int w, int h);
> > +
> > +memset(src, 0, PIXELS);
> 
> Looks it is redundant with randomize_buffers() and make performance drop
Will remove and apply.
> 
> > +memset(dst_ref, 0, PIXELS);
> > +memset(dst_new, 0, PIXELS);
> > +randomize_buffers(src, PIXELS);
> > +ff_eq_init();
> > +
[...]
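
The redundancy noted above can be shown with a small, self-contained sketch of
the compare-against-reference idea. This is not the checkasm API; all names
below are made up for illustration. The source buffer is fully overwritten by
the random fill, so clearing it first only costs time.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PIXELS 256

typedef void (*process_fn)(uint8_t *dst, const uint8_t *src, int n);

/* Reference version: plain, obviously correct C. */
void invert_ref(uint8_t *dst, const uint8_t *src, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = 255 - src[i];
}

/* Stand-in for an optimized version (x ^ 0xFF equals 255 - x for bytes). */
void invert_fast(uint8_t *dst, const uint8_t *src, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = src[i] ^ 0xFF;
}

/* Returns 1 if both implementations agree on random input, 0 otherwise. */
int outputs_match(process_fn ref, process_fn opt, unsigned seed)
{
    uint8_t src[PIXELS], dst_ref[PIXELS], dst_new[PIXELS];

    srand(seed);
    for (int i = 0; i < PIXELS; i++)
        src[i] = rand() & 0xFF; /* every byte is written, no memset needed */

    ref(dst_ref, src, PIXELS);
    opt(dst_new, src, PIXELS);
    return memcmp(dst_ref, dst_new, PIXELS) == 0;
}
```

Calling outputs_match(invert_ref, invert_fast, 1) exercises both versions on
the same random data and compares the results byte for byte.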

Re: [FFmpeg-devel] [PATCH V2 1/3] checkasm/vf_eq: add test for vf_eq

2019-09-23 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Ting Fu
> Sent: Wednesday, September 18, 2019 3:06 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH V2 1/3] checkasm/vf_eq: add test for vf_eq
> 
> Signed-off-by: Ting Fu 

The patchset LGTM. I have also verified it on linux64, win64 and linux32.
Will apply the patches if there is no objection.
There were some indentation errors, which I have fixed locally. Please take
care next time.

Ruiling

> ---
>  libavfilter/vf_eq.c   | 13 ---
>  libavfilter/vf_eq.h   |  1 +
>  tests/checkasm/Makefile   |  1 +
>  tests/checkasm/checkasm.c |  3 ++
>  tests/checkasm/checkasm.h |  1 +
>  tests/checkasm/vf_eq.c| 79
> +++
>  tests/fate/checkasm.mak   |  1 +
>  7 files changed, 94 insertions(+), 5 deletions(-)
>  create mode 100644 tests/checkasm/vf_eq.c
> 
> diff --git a/libavfilter/vf_eq.c b/libavfilter/vf_eq.c
> index 2c4c7e4d54..0f9d129255 100644
> --- a/libavfilter/vf_eq.c
> +++ b/libavfilter/vf_eq.c
> @@ -174,12 +174,18 @@ static int set_expr(AVExpr **pexpr, const char
> *expr, const char *option, void *
>  return 0;
>  }
> 
> +void ff_eq_init(EQContext *eq)
> +{
> +eq->process = process_c;
> +if (ARCH_X86)
> +ff_eq_init_x86(eq);
> +}
> +
>  static int initialize(AVFilterContext *ctx)
>  {
>  EQContext *eq = ctx->priv;
>  int ret;
> -
> -eq->process = process_c;
> +ff_eq_init(eq);
> 
>  if ((ret = set_expr(&eq->contrast_pexpr, eq->contrast_expr,
> "contrast", ctx)) < 0 ||
>  (ret = set_expr(&eq->brightness_pexpr,   eq->brightness_expr,
> "brightness",   ctx)) < 0 ||
> @@ -191,9 +197,6 @@ static int initialize(AVFilterContext *ctx)
>  (ret = set_expr(&eq->gamma_weight_pexpr, eq->gamma_weight_expr,
> "gamma_weight", ctx)) < 0 )
>  return ret;
> 
> -if (ARCH_X86)
> -ff_eq_init_x86(eq);
> -
>  if (eq->eval_mode == EVAL_MODE_INIT) {
>  set_gamma(eq);
>  set_contrast(eq);
> diff --git a/libavfilter/vf_eq.h b/libavfilter/vf_eq.h
> index fa49d46e5c..cd0cd75f08 100644
> --- a/libavfilter/vf_eq.h
> +++ b/libavfilter/vf_eq.h
> @@ -100,6 +100,7 @@ typedef struct EQContext {
>  enum EvalMode { EVAL_MODE_INIT, EVAL_MODE_FRAME,
> EVAL_MODE_NB } eval_mode;
>  } EQContext;
> 
> +void ff_eq_init(EQContext *eq);
>  void ff_eq_init_x86(EQContext *eq);
> 
>  #endif /* AVFILTER_EQ_H */
> diff --git a/tests/checkasm/Makefile b/tests/checkasm/Makefile
> index 0112ff603e..de850c016e 100644
> --- a/tests/checkasm/Makefile
> +++ b/tests/checkasm/Makefile
> @@ -36,6 +36,7 @@ CHECKASMOBJS-$(CONFIG_AVCODEC)  +=
> $(AVCODECOBJS-yes)
>  AVFILTEROBJS-$(CONFIG_AFIR_FILTER) += af_afir.o
>  AVFILTEROBJS-$(CONFIG_BLEND_FILTER) += vf_blend.o
>  AVFILTEROBJS-$(CONFIG_COLORSPACE_FILTER) += vf_colorspace.o
> +AVFILTEROBJS-$(CONFIG_EQ_FILTER) += vf_eq.o
>  AVFILTEROBJS-$(CONFIG_GBLUR_FILTER)  += vf_gblur.o
>  AVFILTEROBJS-$(CONFIG_HFLIP_FILTER)  += vf_hflip.o
>  AVFILTEROBJS-$(CONFIG_THRESHOLD_FILTER)  += vf_threshold.o
> diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
> index d9a5c7f401..bcbe775510 100644
> --- a/tests/checkasm/checkasm.c
> +++ b/tests/checkasm/checkasm.c
> @@ -165,6 +165,9 @@ static const struct {
>  #if CONFIG_COLORSPACE_FILTER
>  { "vf_colorspace", checkasm_check_colorspace },
>  #endif
> +#if CONFIG_EQ_FILTER
> +{ "vf_eq", checkasm_check_vf_eq },
> +#endif
>  #if CONFIG_GBLUR_FILTER
>  { "vf_gblur", checkasm_check_vf_gblur },
>  #endif
> diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h
> index fdf9eeb75d..0a7f9f25c4 100644
> --- a/tests/checkasm/checkasm.h
> +++ b/tests/checkasm/checkasm.h
> @@ -72,6 +72,7 @@ void checkasm_check_sw_rgb(void);
>  void checkasm_check_utvideodsp(void);
>  void checkasm_check_v210dec(void);
>  void checkasm_check_v210enc(void);
> +void checkasm_check_vf_eq(void);
>  void checkasm_check_vf_gblur(void);
>  void checkasm_check_vf_hflip(void);
>  void checkasm_check_vf_threshold(void);
> diff --git a/tests/checkasm/vf_eq.c b/tests/checkasm/vf_eq.c
> new file mode 100644
> index 00..684718f2cd
> --- /dev/null
> +++ b/tests/checkasm/vf_eq.c
> @@ -0,0 +1,79 @@
> +/*
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with FFmpeg; if not, write to the Free Software Foundation, Inc.,
> + * 51 Franklin 

Re: [FFmpeg-devel] [PATCH 0/7] Import some x264asm patches from x264

2019-08-12 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of James Darnley
> Sent: Monday, August 5, 2019 9:39 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH 0/7] Import some x264asm patches from
> x264
> 
> Here are a few easy-to-import patches from x264.  These are all after x264
> commit 4a158b00 "x86inc: Correctly set mmreg variables" which FFmpeg
> already
> has (commit eb5f063e7c).
> 
> It does not include the following commits:
> * 82721eae "x86inc: Add x86-32 PIC support macros"
> * 101bd27d "x86inc: Support N_PEXT bit on Mach-O"
> 
> They would not apply cleanly because of existing differences between x264
> and
> FFmpeg.  The PIC one has a change to configure which would need remaking.
> 
> Henrik Gramner (7):
>   x86inc: Fix VEX -> EVEX instruction conversion
>   x86inc: Optimize VEX instruction encoding
>   x86inc: Improve SAVE/LOAD_MM_PERMUTATION macros
>   x86inc: Turn 'movsxd' into 'movifnidn' on x86-32
>   x86inc: Make 'non-adjacent' default in the TAIL_CALL macro
>   x86inc: Improve warnings for use of unsupported instructions
>   x86inc: Add support for GFNI instructions
> 
>  libavutil/x86/x86inc.asm | 219 ---
>  1 file changed, 161 insertions(+), 58 deletions(-)
> 
I think it is a good idea to apply these changes. Not sure whether anybody
else has a different opinion.

Ruiling
> --
> 2.22.0
> 

Re: [FFmpeg-devel] [PATCH] avfilter/vf_convolution: Fix build failures

2019-08-12 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Andreas Rheinhardt
> Sent: Monday, August 12, 2019 9:15 AM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Andreas Rheinhardt 
> Subject: [FFmpeg-devel] [PATCH] avfilter/vf_convolution: Fix build failures
> 
> 98e419cb added SIMD for the convolution filter for x64 systems. As
> usual, it used a check of the form
> if (ARCH_X86_64)
> ff_convolution_init_x86(s);
> and thereby relied on the compiler eliminating this pseudo-runtime check
> at compiletime for non x64 systems (for which ff_convolution_init_x86
> isn't defined) to compile. But vf_convolution.c contains more than one
> filter and if the convolution filter is disabled, but one of the other
> filters (prewitt, sobel, roberts) is enabled, the build will fail on x64,
> because ff_convolution_init_x86 isn't defined in this case.
> 
> Signed-off-by: Andreas Rheinhardt 
> ---
Will apply.

> Found via ubitux2's random FATE box:
> http://fate.ffmpeg.org/history.cgi?slot=x86_64-archlinux-gcc-random
[...]

Re: [FFmpeg-devel] [PATCH] avfilter/vf_convolution: Fix build failures

2019-08-11 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Andreas Rheinhardt
> Sent: Monday, August 12, 2019 9:15 AM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Andreas Rheinhardt 
> Subject: [FFmpeg-devel] [PATCH] avfilter/vf_convolution: Fix build failures
> 
> 98e419cb added SIMD for the convolution filter for x64 systems. As
> usual, it used a check of the form
> if (ARCH_X86_64)
> ff_convolution_init_x86(s);
> and thereby relied on the compiler eliminating this pseudo-runtime check
> at compiletime for non x64 systems (for which ff_convolution_init_x86
> isn't defined) to compile. But vf_convolution.c contains more than one
> filter and if the convolution filter is disabled, but one of the other
> filters (prewitt, sobel, roberts) is enabled, the build will fail on x64,
> because ff_convolution_init_x86 isn't defined in this case.

Sorry I missed that. Thanks for the fix. The patch LGTM.
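
For anyone reading along, a minimal sketch of the difference between the two
guard styles. The macro and function names are stand-ins for ARCH_X86_64,
CONFIG_CONVOLUTION_FILTER and ff_convolution_init_x86, chosen only for this
example.

```c
/* Hypothetical configuration macros; the values model "x64 build with the
 * convolution filter disabled", which is the failing case described above. */
#ifndef DEMO_ARCH_X86_64
#define DEMO_ARCH_X86_64 1
#endif
#ifndef DEMO_CONFIG_CONVOLUTION
#define DEMO_CONFIG_CONVOLUTION 0
#endif

typedef struct DemoContext {
    int simd_enabled;
} DemoContext;

#if DEMO_CONFIG_CONVOLUTION && DEMO_ARCH_X86_64
/* Only built when the filter is enabled on this architecture. */
void demo_init_x86(DemoContext *s)
{
    s->simd_enabled = 1;
}
#endif

void demo_config(DemoContext *s)
{
    s->simd_enabled = 0;
    /* Writing "if (DEMO_ARCH_X86_64) demo_init_x86(s);" here would still
     * reference demo_init_x86 even though it is not built, so the build
     * would fail. The preprocessor guard removes the call entirely. */
#if DEMO_CONFIG_CONVOLUTION && DEMO_ARCH_X86_64
    demo_init_x86(s);
#endif
}
```

With the default values the call is compiled out and demo_config() leaves
simd_enabled at 0, which mirrors the prewitt/sobel/roberts-only build.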
> 
> Signed-off-by: Andreas Rheinhardt 
> ---
> Found via ubitux2's random FATE box:
> http://fate.ffmpeg.org/history.cgi?slot=x86_64-archlinux-gcc-random
> 
>  libavfilter/vf_convolution.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/libavfilter/vf_convolution.c b/libavfilter/vf_convolution.c
> index e3bf1df79f..f29df38a20 100644
> --- a/libavfilter/vf_convolution.c
> +++ b/libavfilter/vf_convolution.c
> @@ -588,8 +588,9 @@ static int config_input(AVFilterLink *inlink)
>  s->filter[p] = filter16_7x7;
>  }
>  }
> -if (ARCH_X86_64)
> -ff_convolution_init_x86(s);
> +#if CONFIG_CONVOLUTION_FILTER && ARCH_X86_64
> +ff_convolution_init_x86(s);
> +#endif
>  } else if (!strcmp(ctx->filter->name, "prewitt")) {
>  if (s->depth > 8)
>  for (p = 0; p < s->nb_planes; p++)
> --
> 2.21.0
> 

Re: [FFmpeg-devel] [PATCH V2] avfilter/vf_convolution: add x86 SIMD for filter_3x3()

2019-08-07 Thread Song, Ruiling
> -Original Message-
> From: Song, Ruiling
> Sent: Wednesday, July 31, 2019 3:54 PM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Song, Ruiling 
> Subject: [PATCH V2] avfilter/vf_convolution: add x86 SIMD for filter_3x3()
> 
> Tested using a simple command (apply edge enhance):
> ./ffmpeg_g -i ~/Downloads/bbb_sunflower_1080p_30fps_normal.mp4 \
>  -vf convolution="0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 
> 0 0 -1 1 0 0
> 0 0:5:1:1:1:0:128:128:128" \
>  -an -vframes 1000 -f null /dev/null
> 
> The fps increase from 151 to 270 on my local machine.
> 
> Signed-off-by: Ruiling Song 
> ---
> v2:
>   fix a bug in scalar code path.
>   Use macro PROCESS_V/S for the first tap to simplify code.
Applied this version.

> 
>  libavfilter/convolution.h |  64 +++
>  libavfilter/vf_convolution.c  |  41 +--
>  libavfilter/x86/Makefile  |   2 +
>  libavfilter/x86/vf_convolution.asm| 156 ++
>  libavfilter/x86/vf_convolution_init.c |  46 
>  5 files changed, 271 insertions(+), 38 deletions(-)
>  create mode 100644 libavfilter/convolution.h
>  create mode 100644 libavfilter/x86/vf_convolution.asm
>  create mode 100644 libavfilter/x86/vf_convolution_init.c
[...]

Re: [FFmpeg-devel] [PATCH] avfilter/vf_convolution: add x86 SIMD for filter_3x3()

2019-07-31 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Paul B Mahol
> Sent: Wednesday, July 17, 2019 8:42 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH] avfilter/vf_convolution: add x86 SIMD
> for filter_3x3()
> 
> On 7/15/19, Song, Ruiling  wrote:
> >> -----Original Message-
> >> From: Song, Ruiling
> >> Sent: Tuesday, July 9, 2019 9:15 AM
> >> To: ffmpeg-devel@ffmpeg.org
> >> Cc: Song, Ruiling 
> >> Subject: [PATCH] avfilter/vf_convolution: add x86 SIMD for filter_3x3()
> >>
> >> Tested using a simple command (apply edge enhance):
> >> ./ffmpeg_g -i ~/Downloads/bbb_sunflower_1080p_30fps_normal.mp4 \
> >>  -vf convolution="0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0
> >> 0:0 0 0 -1 1 0 0
> >> 0 0:5:1:1:1:0:128:128:128" \
> >>  -an -vframes 1000 -f null /dev/null
> >>
> >> The fps increase from 151 to 270 on my local machine.
> >>
> >> Signed-off-by: Ruiling Song 
> > Ping?
> 
> Should be fine IFF output is exact with C version (under different
> parameters).
Thanks Paul. After fixing a bug in the scalar code path, v2 produces exactly
the same result as the C version.
I have tested it against many different parameters. Will apply in a few days.

Thanks!
Ruiling


Re: [FFmpeg-devel] [PATCH v1] Bug #8027 - Wrong result for FFSIGN(0)

2019-07-17 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Ulf Zibis
> Sent: Wednesday, July 17, 2019 2:34 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH v1] Bug #8027 - Wrong result for
> FFSIGN(0)
> 
> Again with the patch attached ...
> 
> Am 17.07.19 um 08:30 schrieb Ulf Zibis:
> > Hi,
> >
> > I have a patch for bug #8027  (see
> > attachment).
Why do you think FFSIGN(0.0) should return +1? What issue did you run into?
I think the value of FFSIGN(0) depends on how we define the behavior of
FFSIGN().

Thanks!
Ruiling
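
To illustrate what "how we define the behavior" means, here is a small sketch
of two possible conventions. The macro names are made up for this example. To
my knowledge the existing FFSIGN in libavutil/common.h is the two-way form,
but please check the header rather than rely on this.

```c
/* Two possible sign conventions; both macro names are hypothetical. */
#define SIGN_TWO_WAY(a)   ((a) > 0 ? 1 : -1)      /* zero maps to -1 */
#define SIGN_THREE_WAY(a) (((a) > 0) - ((a) < 0)) /* zero maps to 0  */

int sign_two_way_d(double v)
{
    return SIGN_TWO_WAY(v);
}

int sign_three_way_d(double v)
{
    return SIGN_THREE_WAY(v);
}
```

Neither form can tell -0.0 from +0.0, because both compare equal to zero.
Distinguishing them would need something like signbit() from <math.h>.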
> >
> > But there is still a problem with -0.0, but FFABS(-0.0) works fine.
> >
> > Testcode:
> >    av_log(NULL, AV_LOG_ERROR, "FFSIGN(0): %d\n", FFSIGN(0));
> >     av_log(NULL, AV_LOG_ERROR, "FFSIGN(-0): %d\n", FFSIGN(-0));
> >     av_log(NULL, AV_LOG_ERROR, "FFSIGN(0.0D): %d\n", FFSIGN(0.0D));
> >     av_log(NULL, AV_LOG_ERROR, "FFSIGN(-0.0D): %d\n", FFSIGN(-0.0D));
> >     av_log(NULL, AV_LOG_ERROR, "FFSIGN(-0.0F): %d\n", FFSIGN(-0.0F));
> >     av_log(NULL, AV_LOG_ERROR, "FFSIGN(-0.0): %d\n", FFSIGN(-0.0));
> >
> >     av_log(NULL, AV_LOG_ERROR, "FFABS(0): %d\n", FFABS(0));
> >     av_log(NULL, AV_LOG_ERROR, "FFABS(-0): %d\n", FFABS(-0));
> >     av_log(NULL, AV_LOG_ERROR, "FFABS(0.0D): %f\n", FFABS(0.0D));
> >     av_log(NULL, AV_LOG_ERROR, "FFABS(-0.0D): %f\n", FFABS(-0.0D));
> >     av_log(NULL, AV_LOG_ERROR, "FFABS(-0.0F): %f\n", FFABS(-0.0F));
> >     av_log(NULL, AV_LOG_ERROR, "FFABS(-0.0): %f\n", FFABS(-0.0));
> >
> > Results:
> > FFSIGN(0): 1
> > FFSIGN(-0): 1
> > FFSIGN(0.0D): 1
> > FFSIGN(-0.0D): 1
> > FFSIGN(-0.0F): 1
> > FFSIGN(-0.0): 1
> > FFABS(0): 0
> > FFABS(-0): 0
> > FFABS(0.0D): 0.00
> > FFABS(-0.0D): -0.00
> > FFABS(-0.0F): -0.00
> > FFABS(-0.0): -0.00
> >
> > -Ulf
> >

Re: [FFmpeg-devel] [PATCH] avfilter/vf_convolution: add x86 SIMD for filter_3x3()

2019-07-14 Thread Song, Ruiling
> -Original Message-
> From: Song, Ruiling
> Sent: Tuesday, July 9, 2019 9:15 AM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Song, Ruiling 
> Subject: [PATCH] avfilter/vf_convolution: add x86 SIMD for filter_3x3()
> 
> Tested using a simple command (apply edge enhance):
> ./ffmpeg_g -i ~/Downloads/bbb_sunflower_1080p_30fps_normal.mp4 \
>  -vf convolution="0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 0 0 -1 1 0 0 0 0:0 
> 0 0 -1 1 0 0
> 0 0:5:1:1:1:0:128:128:128" \
>  -an -vframes 1000 -f null /dev/null
> 
> The fps increase from 151 to 270 on my local machine.
> 
> Signed-off-by: Ruiling Song 
Ping?

> ---
>  libavfilter/convolution.h |  64 +++
>  libavfilter/vf_convolution.c  |  41 +--
>  libavfilter/x86/Makefile  |   2 +
>  libavfilter/x86/vf_convolution.asm| 158 ++
>  libavfilter/x86/vf_convolution_init.c |  46 
>  5 files changed, 273 insertions(+), 38 deletions(-)
>  create mode 100644 libavfilter/convolution.h
>  create mode 100644 libavfilter/x86/vf_convolution.asm
>  create mode 100644 libavfilter/x86/vf_convolution_init.c
> 
> diff --git a/libavfilter/convolution.h b/libavfilter/convolution.h
> new file mode 100644
> index 00..fc6aad58fd
> --- /dev/null
> +++ b/libavfilter/convolution.h
> @@ -0,0 +1,64 @@
> +/*
> + * Copyright (c) 2012-2013 Oka Motofumi (chikuzen.mo at gmail dot com)
> + * Copyright (c) 2015 Paul B Mahol
> + *
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301
> USA
> + */
> +#ifndef AVFILTER_CONVOLUTION_H
> +#define AVFILTER_CONVOLUTION_H
> +#include "avfilter.h"
> +
> +enum MatrixMode {
> +MATRIX_SQUARE,
> +MATRIX_ROW,
> +MATRIX_COLUMN,
> +MATRIX_NBMODES,
> +};
> +
> +typedef struct ConvolutionContext {
> +const AVClass *class;
> +
> +char *matrix_str[4];
> +float rdiv[4];
> +float bias[4];
> +int mode[4];
> +float scale;
> +float delta;
> +int planes;
> +
> +int size[4];
> +int depth;
> +int max;
> +int bpc;
> +int nb_planes;
> +int nb_threads;
> +int planewidth[4];
> +int planeheight[4];
> +int matrix[4][49];
> +int matrix_length[4];
> +int copy[4];
> +
> +void (*setup[4])(int radius, const uint8_t *c[], const uint8_t *src, int 
> stride,
> + int x, int width, int y, int height, int bpc);
> +void (*filter[4])(uint8_t *dst, int width,
> +  float rdiv, float bias, const int *const matrix,
> +  const uint8_t *c[], int peak, int radius,
> +  int dstride, int stride);
> +} ConvolutionContext;
> +
> +void ff_convolution_init_x86(ConvolutionContext *s);
> +#endif
> diff --git a/libavfilter/vf_convolution.c b/libavfilter/vf_convolution.c
> index 1305569c88..e3bf1df79f 100644
> --- a/libavfilter/vf_convolution.c
> +++ b/libavfilter/vf_convolution.c
> @@ -25,48 +25,11 @@
>  #include "libavutil/opt.h"
>  #include "libavutil/pixdesc.h"
>  #include "avfilter.h"
> +#include "convolution.h"
>  #include "formats.h"
>  #include "internal.h"
>  #include "video.h"
> 
> -enum MatrixMode {
> -MATRIX_SQUARE,
> -MATRIX_ROW,
> -MATRIX_COLUMN,
> -MATRIX_NBMODES,
> -};
> -
> -typedef struct ConvolutionContext {
> -const AVClass *class;
> -
> -char *matrix_str[4];
> -float rdiv[4];
> -float bias[4];
> -int mode[4];
> -float scale;
> -float delta;
> -int planes;
> -
> -int size[4];
> -int depth;
> -int max;
> -int bpc;
> -int nb_planes;
> -int nb_threads;
> -int planewidth[4];
> -int planeheight[4];
> -int matrix[4][49];
> -int matrix_length[4];
> -int copy[4];
> -
> -void (*setup[4])(int rad

Re: [FFmpeg-devel] [PATCH v1] lavc/libdavs2.c: optimize frame copy

2019-07-04 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of hwrenx
> Sent: Wednesday, July 3, 2019 11:24 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH v1] lavc/libdavs2.c: optimize frame copy
> 

I think it's better to use "correct ..." or "fix ..." instead of "optimize" in
the title.
A short commit message here would also be useful for reviewers.

> Signed-off-by: hwrenx 
> ---
>  libavcodec/libdavs2.c | 15 ++-
>  1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/libavcodec/libdavs2.c b/libavcodec/libdavs2.c
> index 218f3ec..15ed3a1 100644
> --- a/libavcodec/libdavs2.c
> +++ b/libavcodec/libdavs2.c
> @@ -62,7 +62,7 @@ static int davs2_dump_frames(AVCodecContext *avctx,
> davs2_picture_t *pic, int *g
>   davs2_seq_info_t *headerset, int ret_type, 
> AVFrame *frame)
>  {
>  DAVS2Context *cad= avctx->priv_data;
> -int bytes_per_sample = pic->bytes_per_sample;
> +int bytes_per_sample = pic->bytes_per_sample == 8 ? 1 : 2;
>  int plane = 0;
>  int line  = 0;
> 
> @@ -104,6 +104,7 @@ static int davs2_dump_frames(AVCodecContext
> *avctx, davs2_picture_t *pic, int *g
> 
>  for (plane = 0; plane < 3; ++plane) {
>  int size_line = pic->widths[plane] * bytes_per_sample;
> +void *dst, *src;
>  frame->buf[plane]  = av_buffer_alloc(size_line * pic->lines[plane]);
> 
>  if (!frame->buf[plane]){
> @@ -114,10 +115,14 @@ static int davs2_dump_frames(AVCodecContext
> *avctx, davs2_picture_t *pic, int *g
>  frame->data[plane] = frame->buf[plane]->data;
>  frame->linesize[plane] = size_line;
> 

Did you observe any performance difference from the lines changed below alone?
If it is just code cleanup, it's better to split it into a separate patch.

Thanks!
Ruiling
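
For reference, a minimal sketch of the row-by-row copy between buffers with
different strides. The function name and parameters are made up for this
example. It uses uint8_t pointers because arithmetic on void pointers, as in
the hunk below, is a compiler extension rather than standard C. libavutil also
provides av_image_copy_plane() for this kind of copy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `lines` rows of `row_bytes` bytes each. Source rows start
 * `src_stride` bytes apart, destination rows `dst_stride` bytes apart. */
void copy_plane(uint8_t *dst, ptrdiff_t dst_stride,
                const uint8_t *src, ptrdiff_t src_stride,
                size_t row_bytes, int lines)
{
    for (int line = 0; line < lines; line++) {
        memcpy(dst, src, row_bytes);
        dst += dst_stride;
        src += src_stride;
    }
}
```

With a padded source (stride larger than the row width) only the visible bytes
of each row are copied, so the destination ends up tightly packed.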
> -for (line = 0; line < pic->lines[plane]; ++line)
> -memcpy(frame->data[plane] + line * size_line,
> -   pic->planes[plane] + line * pic->strides[plane],
> -   pic->widths[plane] * bytes_per_sample);
> +dst = frame->data[plane];
> +src = pic->planes[plane];
> +
> +for (line = 0; line < pic->lines[plane]; ++line) {
> +memcpy(dst, src, size_line);
> +dst += size_line;
> +src += pic->strides[plane];
> +}
>  }
> 
>  frame->width = cad->headerset.width;
> --
> 2.7.4
> 

Re: [FFmpeg-devel] [PATCH] MAINTAINERS: add myself to the AMF section

2019-07-04 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Marton Balint
> Sent: Friday, July 5, 2019 2:27 AM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH] MAINTAINERS: add myself to the AMF
> section
> 
> 
> On Thu, 4 Jul 2019, Hendrik Leppkes wrote:
> 
> > On Thu, Jul 4, 2019 at 12:42 AM Lynne  wrote:
> >>
> >> NAK for reasons said on IRC
> >
> > For everyones benefit, why don't you actually formulate your reasons
> > here instead of asking people to piece them together from some chat
> > history, that way people can actually understand or respond to them.
> > I, for one, could not find an actual reason you listed that actually 
> > applies.
> 
> Also irclogs mailing list archives stopped working not long ago, maybe
> someone can take a look?
Burek said he will take a look at the issue.

> 
> Thanks,
> Marton

Re: [FFmpeg-devel] [PATCH 2/4] avcodec/hevc_ps: Fix integer overflow with num_tile_rows

2019-06-16 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Michael Niedermayer
> Sent: Sunday, June 16, 2019 6:07 AM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH 2/4] avcodec/hevc_ps: Fix integer
> overflow with num_tile_rows
> 
> On Sat, Jun 15, 2019 at 03:07:13PM +, Song, Ruiling wrote:
> > > -Original Message-
> > > From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On
> Behalf
> > > Of Michael Niedermayer
> > > Sent: Friday, June 14, 2019 2:33 AM
> > > To: FFmpeg development discussions and patches  > > de...@ffmpeg.org>
> > > Subject: [FFmpeg-devel] [PATCH 2/4] avcodec/hevc_ps: Fix integer
> overflow
> > > with num_tile_rows
> > >
> > > Fixes: signed integer overflow: -2147483648 - 1 cannot be represented in
> > > type 'int'
> > > Fixes: 14880/clusterfuzz-testcase-minimized-
> > > ffmpeg_AV_CODEC_ID_HEVC_fuzzer-5130977304641536
> > >
> > > Found-by: continuous fuzzing process https://github.com/google/oss-
> > > fuzz/tree/master/projects/ffmpeg
> > > Signed-off-by: Michael Niedermayer 
> > > ---
> > >  libavcodec/hevc_ps.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/libavcodec/hevc_ps.c b/libavcodec/hevc_ps.c
> > > index 80df417e4f..0ed6682bb4 100644
> > > --- a/libavcodec/hevc_ps.c
> > > +++ b/libavcodec/hevc_ps.c
> > > @@ -1596,7 +1596,7 @@ int ff_hevc_decode_nal_pps(GetBitContext
> *gb,
> > > AVCodecContext *avctx,
> > >  if (pps->num_tile_rows <= 0 ||
> > >  pps->num_tile_rows >= sps->height) {
> > >  av_log(avctx, AV_LOG_ERROR, "num_tile_rows_minus1 out of
> > > range: %d\n",
> > > -   pps->num_tile_rows - 1);
> > > +   pps->num_tile_rows - 1U);
> > I think the machine code generated here should be the same, right?
> > So you just tell fuzzer "I am doing subtraction between unsigned numbers",
> to make it happy?
> 
> its likely the same machine code, yes. A compiler might produce different
> code
> that break in case of the overflow though ...
OK, it seems num_tile_columns also needs the same kind of change.
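
To make the point about "- 1" versus "- 1U" concrete, here is a minimal sketch. The helper name minus_one_wrapping is made up for illustration and is not part of hevc_ps.c. With a signed operand, INT_MIN - 1 is undefined behaviour; with 1U the int is converted to unsigned first, so the subtraction is well defined and simply wraps.

```c
#include <limits.h>

/* Hypothetical helper, not FFmpeg code: computes value - 1 without any
 * signed overflow. Because one operand is unsigned, the int operand is
 * converted to unsigned before the subtraction, and unsigned arithmetic
 * wraps modulo UINT_MAX + 1 instead of being undefined. */
static unsigned minus_one_wrapping(int value)
{
    return value - 1U;
}
```

The result is only used for a log message in the patch, so the wrapped value is harmless there.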

> 
> thx
> 
> [...]
> --
> Michael GnuPG fingerprint:
> 9FF2128B147EF6730BADF133611EC787040B0FAB
> 
> When the tyrant has disposed of foreign enemies by conquest or treaty, and
> there is nothing more to fear from them, then he is always stirring up
> some war or other, in order that the people may require a leader. -- Plato

Re: [FFmpeg-devel] [PATCH 2/4] avcodec/hevc_ps: Fix integer overflow with num_tile_rows

2019-06-15 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Michael Niedermayer
> Sent: Friday, June 14, 2019 2:33 AM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: [FFmpeg-devel] [PATCH 2/4] avcodec/hevc_ps: Fix integer overflow
> with num_tile_rows
> 
> Fixes: signed integer overflow: -2147483648 - 1 cannot be represented in
> type 'int'
> Fixes: 14880/clusterfuzz-testcase-minimized-
> ffmpeg_AV_CODEC_ID_HEVC_fuzzer-5130977304641536
> 
> Found-by: continuous fuzzing process https://github.com/google/oss-
> fuzz/tree/master/projects/ffmpeg
> Signed-off-by: Michael Niedermayer 
> ---
>  libavcodec/hevc_ps.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/libavcodec/hevc_ps.c b/libavcodec/hevc_ps.c
> index 80df417e4f..0ed6682bb4 100644
> --- a/libavcodec/hevc_ps.c
> +++ b/libavcodec/hevc_ps.c
> @@ -1596,7 +1596,7 @@ int ff_hevc_decode_nal_pps(GetBitContext *gb,
> AVCodecContext *avctx,
>  if (pps->num_tile_rows <= 0 ||
>  pps->num_tile_rows >= sps->height) {
>  av_log(avctx, AV_LOG_ERROR, "num_tile_rows_minus1 out of
> range: %d\n",
> -   pps->num_tile_rows - 1);
> +   pps->num_tile_rows - 1U);
I think the machine code generated here should be the same, right?
So you just tell fuzzer "I am doing subtraction between unsigned numbers", to 
make it happy?

>  ret = AVERROR_INVALIDDATA;
>  goto err;
>  }
> --
> 2.21.0
> 

Re: [FFmpeg-devel] [PATCH V3 1/2] avfilter/vf_gblur: add x86 SIMD optimizations

2019-06-12 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Adam Sampson
> Sent: Wednesday, June 12, 2019 8:21 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH V3 1/2] avfilter/vf_gblur: add x86 SIMD
> optimizations
> 
> Hi Ruiling,
> 
> Ruiling Song  writes:
> 
> This breaks the build for me on x86-32 -- the asm helpers in
> vf_gblur.asm are only defined on x86-64, but vf_gblur_init.c expects
> them to exist on both architectures.
> 
> ld: libavfilter/libavfilter.so: undefined reference to `ff_horiz_slice_avx2'
> ld: libavfilter/libavfilter.so: undefined reference to `ff_horiz_slice_sse4'
> collect2: error: ld returned 1 exit status
> 
> Adding "#if ARCH_X86_64" conditionals to vf_gblur_init.c fixes it.
Thank you for reporting this. Sorry for that. Thank you James for fixing it.

> 
> Thanks,
> 
> --
> Adam Sampson  

Re: [FFmpeg-devel] [FFmpeg-cvslog] avfilter/vf_gblur: add x86 SIMD optimizations

2019-06-12 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Reimar Döffinger
> Sent: Wednesday, June 12, 2019 1:51 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [FFmpeg-cvslog] avfilter/vf_gblur: add x86
> SIMD optimizations
> 
> 
> 
> On 12.06.2019, at 03:00, Ruiling Song  wrote:
> 
> > ffmpeg | branch: master | Ruiling Song  | Wed
> May 15 17:54:10 2019 +0800| [83f9da77684e7ea0d8e9f9712ec716424140043a]
> | committer: Ruiling Song
> >
> > avfilter/vf_gblur: add x86 SIMD optimizations
> >
> > The horizontal pass get ~2x performance with the patch
> > under single thread.
> >
> > Tested overall performance using the command(avx2 enabled):
> > ./ffmpeg -i 1080p.mp4 -vf gblur -f null /dev/null
> > ./ffmpeg -i 1080p.mp4 -vf gblur=threads=1 -f null /dev/null
> > For single thread, the fps improves from 43 to 60, about 40%.
> > For multi-thread, the fps improves from 110 to 130, about 20%.
> >
> > Signed-off-by: Ruiling Song 
> >
> >>
> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=83f9da77684e7e
> a0d8e9f9712ec716424140043a
> > ---
> >
> > libavfilter/gblur.h |  55 
> > libavfilter/vf_gblur.c  |  71 +++
> > libavfilter/x86/Makefile|   2 +
> > libavfilter/x86/vf_gblur.asm| 185
> 
> > libavfilter/x86/vf_gblur_init.c |  36 
> > 5 files changed, 310 insertions(+), 39 deletions(-)
> >
> > diff --git a/libavfilter/gblur.h b/libavfilter/gblur.h
> > new file mode 100644
> > index 00..87129801de
> > --- /dev/null
> > +++ b/libavfilter/gblur.h
> > @@ -0,0 +1,55 @@
> > +/*
> > + * Copyright (c) 2011 Pascal Getreuer
> > + * Copyright (c) 2016 Paul B Mahol
> > + *
> > + * Redistribution and use in source and binary forms, with or without
> modification,
> > + * are permitted provided that the following conditions are met:
> > + *
> > + *  * Redistributions of source code must retain the above copyright
> > + *notice, this list of conditions and the following disclaimer.
> > + *  * Redistributions in binary form must reproduce the above
> > + *copyright notice, this list of conditions and the following
> > + *disclaimer in the documentation and/or other materials provided
> > + *with the distribution.
> 
> Where does this license come from?
The license comes from vf_gblur.c, because the code was copied from there.
If I read it correctly, this is the "Simplified BSD License".

> Is that even GPL-compatible?
> I mean how is someone compiling ffmpeg even supposed to know
> they have to put this license text in their documentation?



Re: [FFmpeg-devel] [PATCH V3 2/2] checkasm/vf_gblur: add test for horiz_slice simd

2019-06-11 Thread Song, Ruiling


> -Original Message-
> From: Song, Ruiling
> Sent: Friday, June 7, 2019 5:59 AM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: RE: [FFmpeg-devel] [PATCH V3 2/2] checkasm/vf_gblur: add test for
> horiz_slice simd
> 
> > -Original Message-
> > From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On
> Behalf
> > Of Michael Niedermayer
> > Sent: Thursday, June 6, 2019 6:45 PM
> > To: FFmpeg development discussions and patches  > de...@ffmpeg.org>
> > Subject: Re: [FFmpeg-devel] [PATCH V3 2/2] checkasm/vf_gblur: add test
> for
> > horiz_slice simd
> >
> > On Wed, Jun 05, 2019 at 10:29:36PM +0800, Ruiling Song wrote:
> > > Signed-off-by: Ruiling Song 
> > > ---
> > >  tests/checkasm/Makefile   |  1 +
> > >  tests/checkasm/checkasm.c |  3 ++
> > >  tests/checkasm/checkasm.h |  1 +
> > >  tests/checkasm/vf_gblur.c | 67
> > +++
> > >  tests/fate/checkasm.mak   |  1 +
> > >  5 files changed, 73 insertions(+)
> > >  create mode 100644 tests/checkasm/vf_gblur.c
> >
> > this patchset seems to fix the fate failure of the last
> Thanks Michael, I will wait a few more days to see if anybody has comment
> on the patch.
> Will apply later next week if no objection.
Patchset Applied.

> 
> >
> > thanks
> >
> > [...]
> > --
> > Michael GnuPG fingerprint:
> > 9FF2128B147EF6730BADF133611EC787040B0FAB
> >
> > He who knows, does not speak. He who speaks, does not know. -- Lao Tsu

Re: [FFmpeg-devel] [PATCH V3 2/2] checkasm/vf_gblur: add test for horiz_slice simd

2019-06-06 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Michael Niedermayer
> Sent: Thursday, June 6, 2019 6:45 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH V3 2/2] checkasm/vf_gblur: add test for
> horiz_slice simd
> 
> On Wed, Jun 05, 2019 at 10:29:36PM +0800, Ruiling Song wrote:
> > Signed-off-by: Ruiling Song 
> > ---
> >  tests/checkasm/Makefile   |  1 +
> >  tests/checkasm/checkasm.c |  3 ++
> >  tests/checkasm/checkasm.h |  1 +
> >  tests/checkasm/vf_gblur.c | 67
> +++
> >  tests/fate/checkasm.mak   |  1 +
> >  5 files changed, 73 insertions(+)
> >  create mode 100644 tests/checkasm/vf_gblur.c
> 
> this patchset seems to fix the fate failure of the last
Thanks Michael, I will wait a few more days to see if anybody has comments on 
the patch.
Will apply later next week if there is no objection.

> 
> thanks
> 
> [...]
> --
> Michael GnuPG fingerprint:
> 9FF2128B147EF6730BADF133611EC787040B0FAB
> 
> He who knows, does not speak. He who speaks, does not know. -- Lao Tsu

Re: [FFmpeg-devel] [PATCH V2 2/2] checkasm/vf_gblur: add test for horiz_slice simd

2019-06-05 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Michael Niedermayer
> Sent: Wednesday, June 5, 2019 4:16 AM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH V2 2/2] checkasm/vf_gblur: add test for
> horiz_slice simd
> 
> On Tue, Jun 04, 2019 at 04:42:09PM +0800, Ruiling Song wrote:
> > Signed-off-by: Ruiling Song 
> > ---
> >  tests/checkasm/Makefile   |  1 +
> >  tests/checkasm/checkasm.c |  3 ++
> >  tests/checkasm/checkasm.h |  1 +
> >  tests/checkasm/vf_gblur.c | 67
> +++
> >  tests/fate/checkasm.mak   |  1 +
> >  5 files changed, 73 insertions(+)
> >  create mode 100644 tests/checkasm/vf_gblur.c
> 
> this fails here: (ubuntu x86-64)
> 
> Test checkasm-vf_gblur failed. Look at tests/data/fate/checkasm-
> vf_gblur.err for details.
> checkasm: using random seed 1608403213
> test failed comparing 258.619 with 212.24 (abs diff=46.3793 with EPS=0.01)
> SSE4.1:
>horiz_slice_sse4 (vf_gblur.c:60)
>  - vf_gblur.horiz_slice [FAILED]
> checkasm: 1 of 1 tests have failed
> make: *** [fate-checkasm-vf_gblur] Error 1
Hi Michael,

Thanks so much for testing. I tried three different machines running Ubuntu 
18.04 and failed to reproduce the issue, which is really strange :(
But I did reproduce a failure on WIN64. The root cause of the bug is that I 
missed the important fact that an 'int' parameter is passed in the lower 32 
bits of a 64-bit register, and the upper 32 bits may contain garbage.
I have fixed the issue in V3 and hope it also solves the failure you saw. 
Please help test it when you have time.
If it does not fix your issue, please share your CPU info, Ubuntu version, 
gcc version, and nasm/yasm version. Thanks so much!
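
To illustrate the register issue in plain C, here is a small model. It is only a sketch: widen_int_arg is a made-up name and this is not the code from V3. A 64-bit value stands in for the register, and only its low 32 bits are meaningful, so they have to be sign-extended before being used as a 64-bit quantity.

```c
#include <stdint.h>

/* Hypothetical model of a 64-bit register that carries a 32-bit 'int'
 * argument in its low half. The upper half is ignored because the
 * caller is allowed to leave garbage there. */
static int64_t widen_int_arg(uint64_t reg)
{
    uint32_t low = (uint32_t)reg;   /* drop the (possibly garbage) upper bits */

    /* Sign-extend by hand so the result does not depend on
     * implementation-defined unsigned-to-signed conversion. */
    if (low >= UINT32_C(0x80000000))
        return (int64_t)low - INT64_C(4294967296);
    return (int64_t)low;
}
```

In x86-64 assembly the analogous step is a sign-extending move of the 32-bit argument before it is used in 64-bit address arithmetic.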

Thanks!
Ruiling
> 
> 
> 
> [...]
> --
> Michael GnuPG fingerprint:
> 9FF2128B147EF6730BADF133611EC787040B0FAB
> 
> You can kill me, but you cannot change the truth.

Re: [FFmpeg-devel] [PATCH] avfilter/vf_gblur: add x86 SIMD optimizations

2019-06-01 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Carl Eugen Hoyos
> Sent: Saturday, June 1, 2019 6:12 AM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH] avfilter/vf_gblur: add x86 SIMD
> optimizations
> 
> Am Do., 30. Mai 2019 um 05:46 Uhr schrieb Ruiling Song
> :
> >
> > For details of the implementation, please refer to the comment
> > inlined in the assembly code.
> 
> This sentence sounds unneeded to me.
> 
> > It improves the horizontal pass
> > performance about 100% under single thread.
> 
> I am not a native speaker but I wonder what a "100% speed
> improvement" could mean...
It means a 50% reduction in running time.
For example, one horizontal pass per frame previously took 12ms; now it takes 
6ms.
Any comments on the assembly code?
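
Since the two percentages are easy to mix up, here is the arithmetic as a small sketch. The function names are made up for illustration. A pass going from 12ms to 6ms is a 100% throughput gain and, at the same time, a 50% reduction in running time.

```c
/* Hypothetical helpers, for illustration only. */

/* Throughput gain in percent: how much more work is done per unit time. */
static double throughput_gain_percent(double old_time, double new_time)
{
    return (old_time / new_time - 1.0) * 100.0;
}

/* Running time reduction in percent: how much less time one pass takes. */
static double time_reduction_percent(double old_time, double new_time)
{
    return (1.0 - new_time / old_time) * 100.0;
}
```

The same formula applied to frame rates uses the inverse ratio, since fps is already a throughput number.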

> 
> > Tested overall performance using the command(avx2 enabled):
> > ./ffmpeg -i 1080p.mp4 -vf gblur -f null /dev/null
> > ./ffmpeg -i 1080p.mp4 -vf gblur=threads=1 -f null /dev/null
> > For single thread, the fps improves from 43 to 60, about 40%.
> > For multi-thread, the fps improves from 110 to 130, about 20%.
> 
> Carl Eugen

Re: [FFmpeg-devel] [PATCH] avfilter/vf_zscale: add slice threading

2019-05-30 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Paul B Mahol
> Sent: Thursday, May 30, 2019 3:22 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH] avfilter/vf_zscale: add slice threading
> 
> Signed-off-by: Paul B Mahol 
> ---
>  libavfilter/vf_zscale.c | 335 +---
>  1 file changed, 211 insertions(+), 124 deletions(-)

Some testing shows that this patch introduces a big performance drop when 
scaling from 1080p to 720p with the command below:
./ffmpeg -i 1080p.mp4 -vf zscale=w=1280:h=720 -f null /dev/null
On my local machine (i7-6770HQ with 4 cores, thus 8 threads), the fps 
drops from 240 to 160.
Did you observe any performance gain with this patch for some use case?

 [...]
> @@ -706,10 +790,12 @@ static void uninit(AVFilterContext *ctx)
>  {
>  ZScaleContext *s = ctx->priv;
> 
> -zimg_filter_graph_free(s->graph);
> -zimg_filter_graph_free(s->alpha_graph);
> -av_freep(&s->tmp);
> -s->tmp_size = 0;
> +for (int i = 0; i < s->nb_threads; i++) {
> +zimg_filter_graph_free(s->ztd[i].graph);
> +zimg_filter_graph_free(s->ztd[i].alpha_graph);
> +av_freep(&s->ztd[i].tmp);
> +s->ztd[i].tmp_size = 0;
> +}
Missing av_freep(&s->ztd) here?
>  }
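
To spell out the cleanup pattern I mean, here is a minimal sketch. It uses plain malloc/free instead of the libavutil helpers, and the names Ctx, WorkerData, ctx_init and ctx_uninit are made up; it is not the zscale code. The point is that after the per-thread members are released, the per-thread array itself also has to be freed.

```c
#include <stdlib.h>

/* Hypothetical per-thread scratch data, standing in for s->ztd[i]. */
typedef struct WorkerData {
    void  *tmp;
    size_t tmp_size;
} WorkerData;

/* Hypothetical filter context owning an array of per-thread data. */
typedef struct Ctx {
    WorkerData *workers;
    int         nb_workers;
} Ctx;

static int ctx_init(Ctx *c, int nb_workers, size_t tmp_size)
{
    c->workers    = NULL;
    c->nb_workers = 0;
    if (nb_workers <= 0)
        return -1;
    c->workers = calloc((size_t)nb_workers, sizeof(*c->workers));
    if (!c->workers)
        return -1;
    c->nb_workers = nb_workers;
    for (int i = 0; i < nb_workers; i++) {
        c->workers[i].tmp = malloc(tmp_size);
        if (!c->workers[i].tmp)
            return -1;              /* caller is expected to call ctx_uninit() */
        c->workers[i].tmp_size = tmp_size;
    }
    return 0;
}

static void ctx_uninit(Ctx *c)
{
    /* First release what each worker owns ... */
    for (int i = 0; i < c->nb_workers; i++) {
        free(c->workers[i].tmp);
        c->workers[i].tmp      = NULL;
        c->workers[i].tmp_size = 0;
    }
    /* ... then the array itself, which is the step that is easy to forget. */
    free(c->workers);
    c->workers    = NULL;
    c->nb_workers = 0;
}
```

Resetting the pointer and the count makes a second call to the uninit function harmless.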
> 
>  static int process_command(AVFilterContext *ctx, const char *cmd, const
> char *args,
> @@ -890,4 +976,5 @@ AVFilter ff_vf_zscale = {
>  .inputs  = avfilter_vf_zscale_inputs,
>  .outputs = avfilter_vf_zscale_outputs,
>  .process_command = process_command,
> +.flags   = AVFILTER_FLAG_SLICE_THREADS,
>  };
> --
> 2.17.1
> 

Re: [FFmpeg-devel] [PATCH] avfilter/vf_gblur: add x86 SIMD optimizations

2019-05-30 Thread Song, Ruiling


> -Original Message-
> From: Paul B Mahol [mailto:one...@gmail.com]
> Sent: Thursday, May 30, 2019 3:24 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Cc: Song, Ruiling 
> Subject: Re: [FFmpeg-devel] [PATCH] avfilter/vf_gblur: add x86 SIMD
> optimizations
> 
> On 5/30/19, Ruiling Song  wrote:
> > For details of the implementation, please refer to the comment
> > inlined in the assembly code. It improves the horizontal pass
> > performance about 100% under single thread.
> >
> > Tested overall performance using the command(avx2 enabled):
> > ./ffmpeg -i 1080p.mp4 -vf gblur -f null /dev/null
> > ./ffmpeg -i 1080p.mp4 -vf gblur=threads=1 -f null /dev/null
> > For single thread, the fps improves from 43 to 60, about 40%.
> > For multi-thread, the fps improves from 110 to 130, about 20%.
> >
> > Signed-off-by: Ruiling Song 
> > ---
> >  libavfilter/gblur.h |  54 ++
> >  libavfilter/vf_gblur.c  |  66 +---
> >  libavfilter/x86/Makefile|   2 +
> >  libavfilter/x86/vf_gblur.asm| 182
> 
> >  libavfilter/x86/vf_gblur_init.c |  36 +++
> >  5 files changed, 302 insertions(+), 38 deletions(-)
> >  create mode 100644 libavfilter/gblur.h
> >  create mode 100644 libavfilter/x86/vf_gblur.asm
> >  create mode 100644 libavfilter/x86/vf_gblur_init.c

[...]
> > diff --git a/libavfilter/vf_gblur.c b/libavfilter/vf_gblur.c
> > index b91a8c074a..4e876bca05 100644
> > --- a/libavfilter/vf_gblur.c
> > +++ b/libavfilter/vf_gblur.c
> > @@ -30,29 +30,11 @@
> >  #include "libavutil/pixdesc.h"
> >  #include "avfilter.h"
> >  #include "formats.h"
> > +#include "gblur.h"
> >  #include "internal.h"
> >  #include "video.h"
> > +#include 
> 
> Is this header really needed?
Oh, this is not needed. I forgot to remove it after experimenting with SSE 
intrinsics.
Will remove it. Thanks!

Ruiling

Re: [FFmpeg-devel] [PATCH V2] avfilter/vf_unsharp: enable slice threading

2019-05-29 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Song, Ruiling
> Sent: Thursday, May 23, 2019 9:26 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH V2] avfilter/vf_unsharp: enable slice
> threading
> 
> > -Original Message-
> > From: Song, Ruiling
> > Sent: Thursday, May 16, 2019 5:48 PM
> > To: ffmpeg-devel@ffmpeg.org
> > Cc: Song, Ruiling 
> > Subject: [PATCH V2] avfilter/vf_unsharp: enable slice threading
> >
> > benchmarking with a simple command:
> > ffmpeg -i 1080p.mp4 -vf unsharp=la=3:ca=3 -an -f null /dev/null
> > with the patch, the fps increase from 50 to 120 on my local machine (i7-
> > 6770HQ).
> >
> > v2:
> > make av_image_copy_plane() only copy per-slice content.
> >
> > Signed-off-by: Ruiling Song 
> Ping? Any comments?
Ping? Will apply next week if nobody objects.

Ruiling

Re: [FFmpeg-devel] [PATCH V2] lavfi/lut: Add slice threading support

2019-05-29 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Paul B Mahol
> Sent: Wednesday, May 29, 2019 4:24 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH V2] lavfi/lut: Add slice threading
> support
> 
> On 5/29/19, Song, Ruiling  wrote:
> >> -Original Message-
> >> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On
> Behalf
> >> Of Jun Zhao
> >> Sent: Saturday, May 25, 2019 10:33 AM
> >> To: ffmpeg-devel@ffmpeg.org
> >> Cc: Jun Zhao 
> >> Subject: [FFmpeg-devel] [PATCH V2] lavfi/lut: Add slice threading support
> >>
> >> V2: - update comments
> >>
> >> Jun Zhao (1):
> >>   lavfi/lut: Add slice threading support
> >>
> >>  libavfilter/vf_lut.c |  329 +---
> ---
> >> ---
> >>  1 files changed, 216 insertions(+), 113 deletions(-)
> > I have attached the patch which I would like to happen.
> > I only test lutyuv on 1080p input, seems the performance penalty is less
> > than 5%.
> > I would like also hear performance numbers from your environment.
> > I am not sure do you like the patch?
> >
> 
> You should use function pointers.
Do you mean a function pointer for the per-pixel function?
I think a function pointer would add the overhead of a function call for each 
pixel, whereas for now the xxx_pixel_8/16 functions are simply inlined.
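
A small sketch of the two shapes I have in mind. The names negate_pixel_8, process_row_direct and process_row_indirect are made up for illustration and are not from the patch. In the direct version the compiler can inline the per-pixel helper into the loop; in the indirect version every pixel goes through a call via the pointer unless the compiler manages to devirtualize it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-pixel operation, standing in for xxx_pixel_8(). */
static inline uint8_t negate_pixel_8(uint8_t v)
{
    return (uint8_t)(255 - v);
}

/* Direct call: the helper is visible at the call site and can be inlined. */
static void process_row_direct(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = negate_pixel_8(src[i]);
}

typedef uint8_t (*pixel_fn)(uint8_t);

/* Indirect call: one call through the pointer per pixel. */
static void process_row_indirect(uint8_t *dst, const uint8_t *src, size_t n,
                                 pixel_fn fn)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = fn(src[i]);
}
```

If the function pointer selects a whole row or slice function instead, the call cost is paid once per row rather than once per pixel, which may be what you meant.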


Re: [FFmpeg-devel] [PATCH V2] lavfi/lut: Add slice threading support

2019-05-29 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Jun Zhao
> Sent: Saturday, May 25, 2019 10:33 AM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Jun Zhao 
> Subject: [FFmpeg-devel] [PATCH V2] lavfi/lut: Add slice threading support
> 
> V2: - update comments
> 
> Jun Zhao (1):
>   lavfi/lut: Add slice threading support
> 
>  libavfilter/vf_lut.c |  329 +--
> ---
>  1 files changed, 216 insertions(+), 113 deletions(-)
I have attached a patch showing the change I would like to see.
I only tested lutyuv on 1080p input; the performance penalty seems to be less 
than 5%.
I would also like to hear performance numbers from your environment.
I am not sure whether you like the patch.

Ruiling
> 


0001-avoid-too-much-duplicate-code.patch
Description: 0001-avoid-too-much-duplicate-code.patch

Re: [FFmpeg-devel] [PATCH V2] avfilter/vf_unsharp: enable slice threading

2019-05-23 Thread Song, Ruiling
> -Original Message-
> From: Song, Ruiling
> Sent: Thursday, May 16, 2019 5:48 PM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Song, Ruiling 
> Subject: [PATCH V2] avfilter/vf_unsharp: enable slice threading
> 
> benchmarking with a simple command:
> ffmpeg -i 1080p.mp4 -vf unsharp=la=3:ca=3 -an -f null /dev/null
> with the patch, the fps increase from 50 to 120 on my local machine (i7-
> 6770HQ).
> 
> v2:
> make av_image_copy_plane() only copy per-slice content.
> 
> Signed-off-by: Ruiling Song 
Ping? Any comments?
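
For reviewers, the per-slice split used in the patch below boils down to the following row partition. This is only a sketch: the SliceRange type and the slice_range helper are made up for illustration, while the two formulas are the ones from unsharp_slice().

```c
/* Hypothetical helper type: half-open row range [start, end) of one job. */
typedef struct SliceRange {
    int start;
    int end;
} SliceRange;

/* Splits 'height' rows across 'nb_jobs' jobs. Consecutive jobs share a
 * boundary, so every row is covered exactly once even when height is
 * not a multiple of nb_jobs. */
static SliceRange slice_range(int height, int jobnr, int nb_jobs)
{
    SliceRange r;
    r.start = (height * jobnr) / nb_jobs;
    r.end   = (height * (jobnr + 1)) / nb_jobs;
    return r;
}
```

With amount == 0 each job then only copies rows start..end-1 instead of the whole plane.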

> ---
>  libavfilter/unsharp.h|   4 +-
>  libavfilter/vf_unsharp.c | 102 ++-
>  2 files changed, 81 insertions(+), 25 deletions(-)
> 
> diff --git a/libavfilter/unsharp.h b/libavfilter/unsharp.h
> index caff986fc1..a60b30f31a 100644
> --- a/libavfilter/unsharp.h
> +++ b/libavfilter/unsharp.h
> @@ -37,7 +37,8 @@ typedef struct UnsharpFilterParam {
>  int steps_y; ///< vertical step count
>  int scalebits;   ///< bits to shift pixel
>  int32_t halfscale;   ///< amount to add to pixel
> -uint32_t *sc[MAX_MATRIX_SIZE - 1];   ///< finite state machine 
> storage
> +uint32_t *sr;///< finite state machine storage within a row
> +uint32_t **sc;   ///< finite state machine storage across rows
>  } UnsharpFilterParam;
> 
>  typedef struct UnsharpContext {
> @@ -47,6 +48,7 @@ typedef struct UnsharpContext {
>  UnsharpFilterParam luma;   ///< luma parameters (width, height, amount)
>  UnsharpFilterParam chroma; ///< chroma parameters (width, height,
> amount)
>  int hsub, vsub;
> +int nb_threads;
>  int opencl;
>  int (* apply_unsharp)(AVFilterContext *ctx, AVFrame *in, AVFrame *out);
>  } UnsharpContext;
> diff --git a/libavfilter/vf_unsharp.c b/libavfilter/vf_unsharp.c
> index 41ccc56942..af05833a5d 100644
> --- a/libavfilter/vf_unsharp.c
> +++ b/libavfilter/vf_unsharp.c
> @@ -47,15 +47,22 @@
>  #include "libavutil/pixdesc.h"
>  #include "unsharp.h"
> 
> -static void apply_unsharp(  uint8_t *dst, int dst_stride,
> -  const uint8_t *src, int src_stride,
> -  int width, int height, UnsharpFilterParam *fp)
> +typedef struct TheadData {
> +UnsharpFilterParam *fp;
> +uint8_t   *dst;
> +const uint8_t *src;
> +int dst_stride;
> +int src_stride;
> +int width;
> +int height;
> +} ThreadData;
> +
> +static int unsharp_slice(AVFilterContext *ctx, void *arg, int jobnr, int
> nb_jobs)
>  {
> +ThreadData *td = arg;
> +UnsharpFilterParam *fp = td->fp;
>  uint32_t **sc = fp->sc;
> -uint32_t sr[MAX_MATRIX_SIZE - 1], tmp1, tmp2;
> -
> -int32_t res;
> -int x, y, z;
> +uint32_t *sr = fp->sr;
>  const uint8_t *src2 = NULL;  //silence a warning
>  const int amount = fp->amount;
>  const int steps_x = fp->steps_x;
> @@ -63,30 +70,54 @@ static void apply_unsharp(  uint8_t *dst, int
> dst_stride,
>  const int scalebits = fp->scalebits;
>  const int32_t halfscale = fp->halfscale;
> 
> +uint8_t *dst = td->dst;
> +const uint8_t *src = td->src;
> +const int dst_stride = td->dst_stride;
> +const int src_stride = td->src_stride;
> +const int width = td->width;
> +const int height = td->height;
> +const int sc_offset = jobnr * 2 * steps_y;
> +const int sr_offset = jobnr * (MAX_MATRIX_SIZE - 1);
> +const int slice_start = (height * jobnr) / nb_jobs;
> +const int slice_end = (height * (jobnr+1)) / nb_jobs;
> +
> +int32_t res;
> +int x, y, z;
> +uint32_t tmp1, tmp2;
> +
>  if (!amount) {
> -av_image_copy_plane(dst, dst_stride, src, src_stride, width, height);
> -return;
> +av_image_copy_plane(dst + slice_start * dst_stride, dst_stride,
> +src + slice_start * src_stride, src_stride,
> +width, slice_end - slice_start);
> +return 0;
>  }
> 
>  for (y = 0; y < 2 * steps_y; y++)
> -memset(sc[y], 0, sizeof(sc[y][0]) * (width + 2 * steps_x));
> +memset(sc[sc_offset + y], 0, sizeof(sc[y][0]) * (width + 2 * 
> steps_x));
> 
> -for (y = -steps_y; y < height + steps_y; y++) {
> +// if this is not the first tile, we start from (slice_start - steps_y),
> +// so we can get smooth result at slice boundary
> +if (slice_start > steps_y) {
> +src += (slice_start - steps_y) * src_stride;
> +dst

Re: [FFmpeg-devel] [PATCH V1] lavfi/lut: Add slice threading support

2019-05-21 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of myp...@gmail.com
> Sent: Wednesday, May 22, 2019 11:14 AM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Cc: Jun Zhao 
> Subject: Re: [FFmpeg-devel] [PATCH V1] lavfi/lut: Add slice threading
> support
> 
> On Wed, May 22, 2019 at 11:03 AM Song, Ruiling 
> wrote:
> >
> >
> >
> > > -Original Message-
> > > From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On
> Behalf
> > > Of Jun Zhao
> > > Sent: Wednesday, May 22, 2019 12:29 AM
> > > To: ffmpeg-devel@ffmpeg.org
> > > Cc: Jun Zhao 
> > > Subject: [FFmpeg-devel] [PATCH V1] lavfi/lut: Add slice threading support
> > >
> > > From: Jun Zhao 
> > >
> > > Used the command for 1080p h264 clip as follow:
> > >
> > > a). ffmpeg -i input -vf lutyuv="u=128:v=128" -f null /dev/null
> > > b). ffmpeg -i input -vf lutrgb="g=0:b=0" -f null /dev/null
> > >
> > > after enabled the slice threading, the fps change from:
> > >
> > > a). 144fps to 258fps (lutyuv)
> > > b). 94fps  to 153fps (lutrgb)
> > >
> > > in Intel(R) Core(TM) i5-8265U CPU @ 1.60GHz
> > >
> > > Signed-off-by: Jun Zhao 
> > > ---
> > >  libavfilter/vf_lut.c |  328 +--
> 
> > > ---
> > >  1 files changed, 216 insertions(+), 112 deletions(-)
> > >
> > > diff --git a/libavfilter/vf_lut.c b/libavfilter/vf_lut.c
> > > index c815ddc..61550ee 100644
> > > --- a/libavfilter/vf_lut.c
> > > +++ b/libavfilter/vf_lut.c
> > > @@ -337,13 +337,194 @@ static int config_props(AVFilterLink *inlink)
> > >  return 0;
> > >  }
> > >
> > > +struct thread_data {
> > > +AVFrame *in;
> > > +AVFrame *out;
> > > +
> > > +int w;
> > > +int h;
> > > +};
> >
> > I think it's better to refine the patch to avoid duplicating code, the 
> > exiting
> source code has been copy-pasted too much.
> > Maybe we just need lut_packed() and lut_planar(). For 8/16 variation, I
> think it is easy to add one field( like "int is_16bit;")in thread_data to 
> solve it.
> Ha, in fact, they are come from origin code, and I noticed the code
> redundancy in origin code, as my plan, I plan to split with 2 steps:
> step 1: enabling the slice thread, it's will help to review + test (as
> this patch)
> step 2: refine the code redundancy, (the next round patch).
> 
>  So you want to combine step 1 and step 2 as one patch ? Thanks.
Yes, I don't see much benefit in splitting it into 2 steps. I prefer reviewing 
clean code.


Re: [FFmpeg-devel] [PATCH V1] lavfi/lut: Add slice threading support

2019-05-21 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Jun Zhao
> Sent: Wednesday, May 22, 2019 12:29 AM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Jun Zhao 
> Subject: [FFmpeg-devel] [PATCH V1] lavfi/lut: Add slice threading support
> 
> From: Jun Zhao 
> 
> Used the command for 1080p h264 clip as follow:
> 
> a). ffmpeg -i input -vf lutyuv="u=128:v=128" -f null /dev/null
> b). ffmpeg -i input -vf lutrgb="g=0:b=0" -f null /dev/null
> 
> after enabled the slice threading, the fps change from:
> 
> a). 144fps to 258fps (lutyuv)
> b). 94fps  to 153fps (lutrgb)
> 
> in Intel(R) Core(TM) i5-8265U CPU @ 1.60GHz
> 
> Signed-off-by: Jun Zhao 
> ---
>  libavfilter/vf_lut.c |  328 +--
> ---
>  1 files changed, 216 insertions(+), 112 deletions(-)
> 
> diff --git a/libavfilter/vf_lut.c b/libavfilter/vf_lut.c
> index c815ddc..61550ee 100644
> --- a/libavfilter/vf_lut.c
> +++ b/libavfilter/vf_lut.c
> @@ -337,13 +337,194 @@ static int config_props(AVFilterLink *inlink)
>  return 0;
>  }
> 
> +struct thread_data {
> +AVFrame *in;
> +AVFrame *out;
> +
> +int w;
> +int h;
> +};

I think it's better to refine the patch to avoid duplicating code; the existing 
source code has already been copy-pasted too much.
Maybe we just need lut_packed() and lut_planar(). For the 8/16-bit variation, I 
think it is easy to add one field (like "int is_16bit;") in thread_data to solve it.
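
To make the suggestion concrete, here is a minimal sketch of a single planar 
worker that branches on an is_16bit field instead of having separate 8-bit and 
16-bit copies. It is only an illustration: the struct below is a simplified, 
hypothetical stand-in for the patch's thread_data (the in/out/linesize/lut 
fields are assumptions), and the worker signature is reduced so that it 
compiles without libavfilter.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the patch's thread_data. */
struct thread_data {
    const uint8_t  *in;        /* input plane */
    uint8_t        *out;       /* output plane */
    int             linesize;  /* bytes per row (same for in and out here) */
    int             w;         /* samples per row */
    int             h;         /* number of rows */
    int             is_16bit;  /* 0: 8-bit samples, 1: 16-bit samples */
    const uint16_t *lut;       /* 256 or 65536 entries, depending on depth */
};

/* One planar LUT worker that covers both sample sizes. */
int lut_planar(const struct thread_data *td, int jobnr, int nb_jobs)
{
    /* Rows [slice_start, slice_end) belong to this job. */
    const int slice_start = (td->h *  jobnr)      / nb_jobs;
    const int slice_end   = (td->h * (jobnr + 1)) / nb_jobs;

    for (int y = slice_start; y < slice_end; y++) {
        const uint8_t *src = td->in  + (size_t)y * td->linesize;
        uint8_t       *dst = td->out + (size_t)y * td->linesize;

        if (td->is_16bit) {
            /* Assumes the planes really hold suitably aligned 16-bit data. */
            const uint16_t *s = (const uint16_t *)src;
            uint16_t       *d = (uint16_t *)dst;
            for (int x = 0; x < td->w; x++)
                d[x] = td->lut[s[x]];
        } else {
            for (int x = 0; x < td->w; x++)
                dst[x] = (uint8_t)td->lut[src[x]];
        }
    }
    return 0;
}
```

The branch sits outside the inner loops, so the per-sample cost should stay 
close to that of the specialised copies. A packed variant would look the same, 
with one table per component.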

Ruiling
> +
> +/* packed, 16-bit */
> +static int lut_packed_16bits(AVFilterContext *ctx, void *arg, int jobnr, int


Re: [FFmpeg-devel] [PATCH V3] lavfi/opencl: add nlmeans_opencl filter

2019-05-20 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Mark Thompson
> Sent: Tuesday, May 21, 2019 6:16 AM
> To: 'FFmpeg development discussions and patches'  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH V3] lavfi/opencl: add nlmeans_opencl
> filter
> 
> On 20/05/2019 02:18, Song, Ruiling wrote:
> >> -----Original Message-
> >> From: Song, Ruiling
> >> Sent: Monday, May 13, 2019 10:18 AM
> >> To: FFmpeg development discussions and patches  >> de...@ffmpeg.org>; 'Mark Thompson' 
> >> Subject: RE: [FFmpeg-devel] [PATCH V3] lavfi/opencl: add
> nlmeans_opencl
> >> filter
> >>
> >>> -Original Message-
> >>> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On
> >> Behalf
> >>> Of Ruiling Song
> >>> Sent: Tuesday, May 7, 2019 10:45 AM
> >>> To: ffmpeg-devel@ffmpeg.org
> >>> Cc: Song, Ruiling 
> >>> Subject: [FFmpeg-devel] [PATCH V3] lavfi/opencl: add nlmeans_opencl
> >> filter
> >>>
> >>> Signed-off-by: Ruiling Song 
> >>> ---
> >>>  configure   |   1 +
> >>>  doc/filters.texi|   4 +
> >>>  libavfilter/Makefile|   1 +
> >>>  libavfilter/allfilters.c|   1 +
> >>>  libavfilter/opencl/nlmeans.cl   | 115 +
> >>>  libavfilter/opencl_source.h |   1 +
> >>>  libavfilter/vf_nlmeans_opencl.c | 443
> >>> 
> >>>  7 files changed, 566 insertions(+)
> >>>  create mode 100644 libavfilter/opencl/nlmeans.cl
> >>>  create mode 100644 libavfilter/vf_nlmeans_opencl.c
> >> Hi Mark,
> >>
> >> Do you have further comment on v3?
> > Will apply if no further comments.
> 
> No more from me.  I also did some testing of this on Mali, all good.
Thanks Mark for your valuable comments on the patch, will apply.

> 
> Thanks,
> 
> - Mark

Re: [FFmpeg-devel] [PATCH V3] lavfi/opencl: add nlmeans_opencl filter

2019-05-19 Thread Song, Ruiling
> -Original Message-
> From: Song, Ruiling
> Sent: Monday, May 13, 2019 10:18 AM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>; 'Mark Thompson' 
> Subject: RE: [FFmpeg-devel] [PATCH V3] lavfi/opencl: add nlmeans_opencl
> filter
> 
> > -Original Message-
> > From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On
> Behalf
> > Of Ruiling Song
> > Sent: Tuesday, May 7, 2019 10:45 AM
> > To: ffmpeg-devel@ffmpeg.org
> > Cc: Song, Ruiling 
> > Subject: [FFmpeg-devel] [PATCH V3] lavfi/opencl: add nlmeans_opencl
> filter
> >
> > Signed-off-by: Ruiling Song 
> > ---
> >  configure   |   1 +
> >  doc/filters.texi|   4 +
> >  libavfilter/Makefile|   1 +
> >  libavfilter/allfilters.c|   1 +
> >  libavfilter/opencl/nlmeans.cl   | 115 +
> >  libavfilter/opencl_source.h |   1 +
> >  libavfilter/vf_nlmeans_opencl.c | 443
> > 
> >  7 files changed, 566 insertions(+)
> >  create mode 100644 libavfilter/opencl/nlmeans.cl
> >  create mode 100644 libavfilter/vf_nlmeans_opencl.c
> Hi Mark,
> 
> Do you have further comment on v3?
Will apply if no further comments.

> 
> Ruiling

Re: [FFmpeg-devel] [PATCH V2] avutil/tx: add check against (*ctx)

2019-05-16 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of James Almer
> Sent: Friday, May 17, 2019 5:30 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH V2] avutil/tx: add check against (*ctx)
> 
> On 5/16/2019 6:06 PM, Lynne wrote:
> > May 16, 2019, 8:43 PM by geo...@nsup.org:
> >
> >> Lynne (12019-05-16):
> >>
> >>> I'm not, I still want the 2 checks.
> >>>
> >>
> >> Arguments please. As I explained, the first check is harmful to
> >> applications because it hides bug.
> >>
> >
> > Nevermind, its not really possible to give av_tx_uninit() a NULL double
> pointer so
> > whatever. Can the author just push a patch already?
> 
> I did it just now, but as the author and maintainer of the code in
> question you're the most adequate person to review and push patches for it.
After reading the discussion, I agree to go with v1 and just crash if a null 
pointer is passed in. Thanks James for approving and pushing the patch, and 
thank you all for the active discussion :)
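
For readers skimming the thread, the outcome as I understand it can be 
sketched like this: the uninit function checks only (*ctx) and deliberately 
does not check ctx itself, so a NULL double pointer stays a visible caller bug 
rather than being silently ignored. DemoContext, demo_init() and demo_uninit() 
are hypothetical names used only for illustration; this is not the avutil/tx 
code.

```c
#include <stdlib.h>

/* Hypothetical context type, standing in for the real transform context. */
typedef struct DemoContext {
    float *buf;
} DemoContext;

DemoContext *demo_init(size_t n)
{
    DemoContext *c = calloc(1, sizeof(*c));
    if (!c)
        return NULL;
    c->buf = calloc(n, sizeof(*c->buf));
    if (!c->buf) {
        free(c);
        return NULL;
    }
    return c;
}

void demo_uninit(DemoContext **ctx)
{
    /* No "if (!ctx)" check on purpose: passing a NULL double pointer is a
     * bug in the caller, and dereferencing it here makes that bug crash
     * early instead of hiding it. */
    if (!*ctx)
        return;              /* already uninitialised, nothing to do */

    free((*ctx)->buf);
    free(*ctx);
    *ctx = NULL;             /* makes a second call harmless */
}
```

Calling demo_uninit(&c) twice in a row is safe with this shape, because the 
first call resets c to NULL.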

Ruiling

Re: [FFmpeg-devel] [PATCH V3] lavfi/opencl: add nlmeans_opencl filter

2019-05-12 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Ruiling Song
> Sent: Tuesday, May 7, 2019 10:45 AM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Song, Ruiling 
> Subject: [FFmpeg-devel] [PATCH V3] lavfi/opencl: add nlmeans_opencl filter
> 
> Signed-off-by: Ruiling Song 
> ---
>  configure   |   1 +
>  doc/filters.texi|   4 +
>  libavfilter/Makefile|   1 +
>  libavfilter/allfilters.c|   1 +
>  libavfilter/opencl/nlmeans.cl   | 115 +
>  libavfilter/opencl_source.h |   1 +
>  libavfilter/vf_nlmeans_opencl.c | 443
> 
>  7 files changed, 566 insertions(+)
>  create mode 100644 libavfilter/opencl/nlmeans.cl
>  create mode 100644 libavfilter/vf_nlmeans_opencl.c
Hi Mark,

Do you have further comment on v3?

Ruiling

Re: [FFmpeg-devel] [PATCH] avfilter/vf_unsharp: enable slice threading

2019-05-12 Thread Song, Ruiling
> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Carl Eugen Hoyos
> Sent: Friday, May 10, 2019 4:53 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH] avfilter/vf_unsharp: enable slice
> threading
> 
> Am Fr., 10. Mai 2019 um 08:50 Uhr schrieb Song, Ruiling
> :
> >
> > > -Original Message-
> > > From: Song, Ruiling
> > > Sent: Thursday, May 9, 2019 3:43 PM
> > > To: ffmpeg-devel@ffmpeg.org
> > > Cc: Song, Ruiling 
> > > Subject: [PATCH] avfilter/vf_unsharp: enable slice threading
> > >
> > > Signed-off-by: Ruiling Song 
> > > ---
> > >  libavfilter/unsharp.h|  4 +-
> > >  libavfilter/vf_unsharp.c | 98 ++--
> 
> > >  2 files changed, 78 insertions(+), 24 deletions(-)
> >
> > Add some performance number in case somebody have interest to know.
> > Running "ffmpeg -i 1080p.mp4 -vf unsharp=la=3:ca=3 -an -f null /dev/null"
> on my local machine (i7-6770HQ): the fps increase from 50 to 120.
> 
> Something like this should be part of the commit message imo.
OK, I will add this to the commit message when pushing the patch, and will 
include such numbers up front next time.
I really hope someone can take a look at the change and check whether it is 
functionally OK.

Ruiling
> 
> Carl Eugen

Re: [FFmpeg-devel] [PATCH] avfilter/vf_unsharp: enable slice threading

2019-05-10 Thread Song, Ruiling
> -Original Message-
> From: Song, Ruiling
> Sent: Thursday, May 9, 2019 3:43 PM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Song, Ruiling 
> Subject: [PATCH] avfilter/vf_unsharp: enable slice threading
> 
> Signed-off-by: Ruiling Song 
> ---
>  libavfilter/unsharp.h|  4 +-
>  libavfilter/vf_unsharp.c | 98 ++--
>  2 files changed, 78 insertions(+), 24 deletions(-)

Adding some performance numbers in case somebody is interested.
Running "ffmpeg -i 1080p.mp4 -vf unsharp=la=3:ca=3 -an -f null /dev/null" on my 
local machine (i7-6770HQ), the fps increases from 50 to 120.

Thanks!
Ruiling

Re: [FFmpeg-devel] [PATCHv2] lavfi: add gblur_opencl filter

2019-05-08 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Dylan Fernando
> Sent: Tuesday, May 7, 2019 8:27 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCHv2] lavfi: add gblur_opencl filter
> 
> Anyone have any comments/feedback?
I think unsharp_opencl with a negative amount should do a similar thing to this 
one.
What's the difference? Better quality, or better speed?
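
To illustrate why a negative amount blurs, here is a small sketch using the 
textbook unsharp-mask formula out = in + amount * (in - blur(in)). This is an 
assumption made for illustration: it uses a 1-D 3-tap box blur and float 
samples, and it is not the actual unsharp or unsharp_opencl implementation. 
box_blur3() and unsharp_1d() are hypothetical helper names.

```c
#include <stddef.h>

/* 3-tap box blur with edge clamping (stand-in for the real blur kernel). */
void box_blur3(const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        const float l = in[i > 0 ? i - 1 : 0];
        const float r = in[i + 1 < n ? i + 1 : n - 1];
        out[i] = (l + in[i] + r) / 3.0f;
    }
}

/* Textbook unsharp mask: amount > 0 sharpens, amount < 0 blurs. */
void unsharp_1d(const float *in, float *out, size_t n, float amount)
{
    box_blur3(in, out, n);                  /* out now holds blur(in) */
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] + amount * (in[i] - out[i]);
}
```

With amount = -1 the expression collapses to blur(in), so under this formula 
the filter degenerates into the blur it is built on. How close that gets to a 
Gaussian blur depends on the blur kernel used internally.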

Thanks!
Ruiling

Re: [FFmpeg-devel] [PATCH V2 2/2] lavfi/opencl: add nlmeans_opencl filter

2019-05-06 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Mark Thompson
> Sent: Monday, May 6, 2019 10:20 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH V2 2/2] lavfi/opencl: add
> nlmeans_opencl filter
> 
> On 29/04/2019 03:06, Song, Ruiling wrote:>
> > In order to verify the patch, I also have more testing on the CPU OpenCL
> driver from Intel.
> > I make it run 100 times, and still not see any reported overflow. So I think
> we can say the filter is in good quality to be merged. Any different idea?
> 
> I've tried a lot more times on some additional platforms (Skylake-GT3,
> Mali-G52) and I can't reproduce it on anything else.  So, I think I agree
> that it must be a driver issue and shouldn't block anything.
> 
> 
> On 12/04/2019 16:09, Ruiling Song wrote:
> > Signed-off-by: Ruiling Song 
> > ---
> >  configure   |   1 +
> >  doc/filters.texi|   4 +
> >  libavfilter/Makefile|   1 +
> >  libavfilter/allfilters.c|   1 +
> >  libavfilter/opencl/nlmeans.cl   | 115 +
> >  libavfilter/opencl_source.h |   1 +
> >  libavfilter/vf_nlmeans_opencl.c | 442
> 
> >  7 files changed, 565 insertions(+)
> >  create mode 100644 libavfilter/opencl/nlmeans.cl
> >  create mode 100644 libavfilter/vf_nlmeans_opencl.c
> >
> > ...
> > +
> > +static int nlmeans_plane(AVFilterContext *avctx, cl_mem dst, cl_mem src,
> > + cl_int width, cl_int height, cl_int p, cl_int r)
> > +{
> > +NLMeansOpenCLContext *ctx = avctx->priv;
> > +const float zero = 0.0f;
> > +const size_t worksize1[] = {height};
> > +const size_t worksize2[] = {width};
> > +const size_t worksize3[2] = {width, height};
> > +int dx, dy, err = 0, weight_buf_size;
> > +cl_int cle;
> > +int nb_pixel, *tmp, idx = 0;
> > +cl_int *dxdy;
> > +
> > +weight_buf_size = width * height * sizeof(float);
> > +cle = clEnqueueFillBuffer(ctx->command_queue, ctx->weight,
> > +  &zero, sizeof(float), 0, weight_buf_size,
> > +  0, NULL, NULL);
> > +CL_FAIL_ON_ERROR(AVERROR(EIO), "Failed to fill weight buffer: %d.\n",
> > + cle);
> > +cle = clEnqueueFillBuffer(ctx->command_queue, ctx->sum,
> > +  &zero, sizeof(float), 0, weight_buf_size,
> > +  0, NULL, NULL);
> > +CL_FAIL_ON_ERROR(AVERROR(EIO), "Failed to fill sum buffer: %d.\n",
> > + cle);
> > +
> > +nb_pixel = (2 * r + 1) * (2 * r + 1) - 1;
> > +dxdy = av_malloc(nb_pixel * 2 * sizeof(cl_int));
> > +tmp = av_malloc(nb_pixel * 2 * sizeof(int));
> > +
> > +if (!dxdy || !tmp)
> > +goto fail;
> > +
> > +for (dx = -r; dx <= r; dx++) {
> > +for (dy = -r; dy <= r; dy++) {
> > +if (dx || dy) {
> > +tmp[idx++] = dx;
> > +tmp[idx++] = dy;
> > +}
> > +}
> > +}
> > +// repack dx/dy separately, as we want to do four pairs of dx/dy in a batch
> > +for (int i = 0; i < nb_pixel / 4; i++) {
> > +dxdy[i * 8] = tmp[i * 8]; // dx0
> > +dxdy[i * 8 + 1] = tmp[i * 8 + 2]; // dx1
> > +dxdy[i * 8 + 2] = tmp[i * 8 + 4]; // dx2
> > +dxdy[i * 8 + 3] = tmp[i * 8 + 6]; // dx3
> > +dxdy[i * 8 + 4] = tmp[i * 8 + 1]; // dy0
> > +dxdy[i * 8 + 5] = tmp[i * 8 + 3]; // dy1
> > +dxdy[i * 8 + 6] = tmp[i * 8 + 5]; // dy2
> > +dxdy[i * 8 + 7] = tmp[i * 8 + 7]; // dy3
> > +}
> > +av_freep(&tmp);
> > +
> > +for (int i = 0; i < nb_pixel / 4; i++) {
> > +int *dx_cur = dxdy + 8 * i;
> > +int *dy_cur = dxdy + 8 * i + 4;
> 
> cl_int.
Fixed
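
For anyone who wants to check the repacking loop quoted above in isolation, 
here is a standalone sketch of the same logic. It uses plain int instead of 
cl_int so that it builds without the OpenCL headers, and build_dxdy() is a 
hypothetical wrapper name. It assumes r >= 1.

```c
#include <stdlib.h>

/*
 * Build the research-window offsets for radius r (r >= 1), excluding (0,0),
 * and repack them in groups of four: dx0..dx3 followed by dy0..dy3.
 * Returns the number of (dx,dy) pairs, or -1 on allocation failure.
 * The caller frees *out.
 */
int build_dxdy(int r, int **out)
{
    /* (2r+1)^2 - 1 == 4*r*(r+1), so the count is always a multiple of 4. */
    const int nb_pixel = (2 * r + 1) * (2 * r + 1) - 1;
    int *tmp  = malloc((size_t)nb_pixel * 2 * sizeof(*tmp));
    int *dxdy = malloc((size_t)nb_pixel * 2 * sizeof(*dxdy));
    int idx = 0;

    if (!tmp || !dxdy) {
        free(tmp);
        free(dxdy);
        return -1;
    }

    /* Interleaved list: dx0, dy0, dx1, dy1, ... */
    for (int dx = -r; dx <= r; dx++)
        for (int dy = -r; dy <= r; dy++)
            if (dx || dy) {
                tmp[idx++] = dx;
                tmp[idx++] = dy;
            }

    /* Repack so each batch of four pairs is stored as dx[4] then dy[4]. */
    for (int i = 0; i < nb_pixel / 4; i++)
        for (int k = 0; k < 4; k++) {
            dxdy[i * 8 + k]     = tmp[i * 8 + 2 * k];      /* dx_k */
            dxdy[i * 8 + 4 + k] = tmp[i * 8 + 2 * k + 1];  /* dy_k */
        }

    free(tmp);
    *out = dxdy;
    return nb_pixel;
}
```

For r = 1 this yields 8 pairs in two batches. Because the pair count is always 
a multiple of 4, the integer division in the loop bound never drops a pair.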
> 
> > +
> > +// horizontal pass
> > +// integral(x,y) = sum([u(v,y) - u(v+dx,y+dy)]^2) for v in [0, x]
> > +CL_SET_KERNEL_ARG(ctx->horiz_kernel, 0, cl_mem, &ctx->integral_img);
> > +CL_SET_KERNEL_ARG(ctx->horiz_kernel, 1, cl_mem, &src);
> > +CL_SET_KERNEL_ARG(ctx->horiz_kernel, 2, cl_int, &width);
> > +CL_SET_KERNEL_ARG(ctx->horiz_kernel, 3, cl_int, &height);
> > +CL_SET_KERNEL_ARG(ctx->horiz_kernel, 4, cl_int4, dx_cur);
> > +CL_SET_KERNEL_ARG(ctx->horiz_kernel, 5, cl_int4, dy_

Re: [FFmpeg-devel] [PATCH] lavfi/gblur: doing several columns at the same time

2019-05-06 Thread Song, Ruiling


> -Original Message-
> From: Paul B Mahol [mailto:one...@gmail.com]
> Sent: Monday, May 6, 2019 4:02 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Cc: Song, Ruiling 
> Subject: Re: [FFmpeg-devel] [PATCH] lavfi/gblur: doing several columns at
> the same time
> 
> On 5/6/19, Ruiling Song  wrote:
> > Instead of doing each column one by one, doing several columns
> > together gives about 30% better performance.
> >
> > Signed-off-by: Ruiling Song 
> > ---
> > Below are some performance numbers (fps) on my i7-6770HQ (decode + gblur):
> >
> > resolution:     480p | 720p | 1080p | 4k
> > without patch:  393  | 146  | 71    | 14
> > with patch:     502  | 184  | 95    | 18
> >  libavfilter/vf_gblur.c | 62 --
> >  1 file changed, 42 insertions(+), 20 deletions(-)
> >
> 
> LGTM
Thanks Paul, will apply. 
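
For the archive, the idea behind the patch can be sketched as follows. This is 
a simplified illustration only: it shows a single forward recursive pass down 
the columns of a row-major float buffer, and the block width COLUMN_STEP and 
both function names are assumptions, not taken from vf_gblur.c. The real 
filter does more than this one pass.

```c
#include <stddef.h>

#define COLUMN_STEP 8   /* hypothetical block width */

/* Reference: walk each column top to bottom, one column at a time.
 * Every step jumps a full row ahead in memory. */
void vertical_pass_single(float *buf, int w, int h, float nu)
{
    for (int x = 0; x < w; x++)
        for (int y = 1; y < h; y++)
            buf[(size_t)y * w + x] += nu * buf[(size_t)(y - 1) * w + x];
}

/* Same recursion, but COLUMN_STEP neighbouring columns advance together,
 * so each row visit touches a short contiguous run of samples. */
void vertical_pass_blocked(float *buf, int w, int h, float nu)
{
    for (int x0 = 0; x0 < w; x0 += COLUMN_STEP) {
        const int x1 = x0 + COLUMN_STEP < w ? x0 + COLUMN_STEP : w;
        for (int y = 1; y < h; y++)
            for (int x = x0; x < x1; x++)
                buf[(size_t)y * w + x] += nu * buf[(size_t)(y - 1) * w + x];
    }
}
```

Both functions compute the same values because each column only depends on 
itself. The blocked form just reorders the work so that memory accesses are 
friendlier to the cache and easier for the compiler to vectorise.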

Ruiling

Re: [FFmpeg-devel] [PATCH V2 2/2] lavfi/opencl: add nlmeans_opencl filter

2019-05-05 Thread Song, Ruiling
Will apply.


Re: [FFmpeg-devel] [PATCH V2 2/2] lavfi/opencl: add nlmeans_opencl filter

2019-04-28 Thread Song, Ruiling


> -Original Message-
> From: Song, Ruiling
> Sent: Tuesday, April 23, 2019 4:52 PM
> To: 'FFmpeg development discussions and patches'  de...@ffmpeg.org>
> Subject: RE: [FFmpeg-devel] [PATCH V2 2/2] lavfi/opencl: add
> nlmeans_opencl filter
> 
> 
> 
> > -Original Message-
> > From: Song, Ruiling
> > Sent: Sunday, April 21, 2019 8:18 PM
> > To: FFmpeg development discussions and patches  > de...@ffmpeg.org>
> > Subject: RE: [FFmpeg-devel] [PATCH V2 2/2] lavfi/opencl: add
> > nlmeans_opencl filter
> >
> >
> >
> > > -Original Message-
> > > From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On
> > Behalf Of
> > > Mark Thompson
> > > Sent: Saturday, April 20, 2019 11:08 PM
> > > To: ffmpeg-devel@ffmpeg.org
> > > Subject: Re: [FFmpeg-devel] [PATCH V2 2/2] lavfi/opencl: add
> > nlmeans_opencl
> > > filter
> > >
> > > On 17/04/2019 03:43, Song, Ruiling wrote:
> > > >> -Original Message-
> > > >> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On
> > Behalf
> > > Of
> > > >> Mark Thompson
> > > >> Sent: Wednesday, April 17, 2019 5:28 AM
> > > >> To: ffmpeg-devel@ffmpeg.org
> > > >> Subject: Re: [FFmpeg-devel] [PATCH V2 2/2] lavfi/opencl: add
> > > nlmeans_opencl
> > > >> filter
> > > >>
> > > >> On 12/04/2019 16:09, Ruiling Song wrote:
> > > >>> Signed-off-by: Ruiling Song 
> > > >>
> > > >> I can't work out where the problem is, but there is something really
> > weirdly
> > > >> nondeterministic going on here.
> > > >>
> > > >> E.g.
> > > >>
> > > >> $ ./ffmpeg_g -y -init_hw_device opencl:0.0 -i ~/video/test/jellyfish-
> 120-
> > > mbps-
> > > >> 4k-uhd-hevc-10bit.mkv -an -filter_hw_device opencl0 -vf
> > > >>
> >
> format=yuv420p,hwupload,nlmeans_opencl,hwdownload,format=yuv420p -
> > > >> frames:v 10 -f framemd5 -
> > > >> ...
> > > >> 0,  0,  0,1, 12441600, 
> > > >> 8b8805818076b23ae6f80ec2b5a349d4
> > > >> 0,  1,  1,1, 12441600, 
> > > >> 7a7fdaa083dc337cfb6af31b643f30a3
> > > >> 0,  2,  2,1, 12441600, 
> > > >> b10ef2a1e5125cc67e262e086f8040b5
> > > >> 0,  3,  3,1, 12441600, 
> > > >> c06b53ad90e0357e537df41b63d5b1dc
> > > >> 0,  4,  4,1, 12441600, 
> > > >> 5aa2da07703859a3dee080847dd17d46
> > > >> 0,  5,  5,1, 12441600, 
> > > >> 733364c6be6af825057e905a6092937d
> > > >> 0,  6,  6,1, 12441600, 
> > > >> 47edae2dec956a582b04babb745d26b0
> > > >> 0,  7,  7,1, 12441600, 
> > > >> 4e45fe8268df4298d06a17ab8e46c3e9
> > > >> 0,  8,  8,1, 12441600, 
> > > >> 960d722a3f8787c9191299a114c04174
> > > >> 0,  9,  9,1, 12441600, 
> > > >> e759c07ee4834a9cf94bfcb4128e7612
> > > >> $ ./ffmpeg_g -y -init_hw_device opencl:0.0 -i ~/video/test/jellyfish-
> 120-
> > > mbps-
> > > >> 4k-uhd-hevc-10bit.mkv -an -filter_hw_device opencl0 -vf
> > > >>
> >
> format=yuv420p,hwupload,nlmeans_opencl,hwdownload,format=yuv420p -
> > > >> frames:v 10 -f framemd5 -
> > > >> 0,  0,  0,1, 12441600, 
> > > >> 8b8805818076b23ae6f80ec2b5a349d4
> > > >> [Parsed_nlmeans_opencl_2 @ 0x5557ae580d00] integral image
> > overflow
> > > >> 2157538
> > > >> 0,  1,  1,1, 12441600, 
> > > >> bce72e10a9f1118940c5a8392ad78ec3
> > > >> 0,  2,  2,1, 12441600, 
> > > >> b10ef2a1e5125cc67e262e086f8040b5
> > > >> 0,  3,  3,1, 12441600, 
> > > >> c06b53ad90e0357e537df41b63d5b1dc
> > > >> 0,  4,  4,1, 12441600, 
> > > >> 5aa2da07703859a3dee080847dd17d46
> > > >> 0,  5,  5,1, 12441600, 
> > > >> 733364c6be6af825057e905a6092937d
> > > >> 0,  6,  6,1, 12441600, 

Re: [FFmpeg-devel] [PATCH V2 1/2] lavfi/opencl: add more opencl helper macro

2019-04-25 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Mark Thompson
> Sent: Wednesday, April 17, 2019 5:25 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH V2 1/2] lavfi/opencl: add more opencl
> helper macro
> 
> On 12/04/2019 16:09, Ruiling Song wrote:
> > Signed-off-by: Ruiling Song 
> > ---
> >  libavfilter/opencl.h | 38 ++
> >  1 file changed, 38 insertions(+)
> >
> > diff --git a/libavfilter/opencl.h b/libavfilter/opencl.h
> > index 0b06232ade..0fa5b49d3f 100644
> > --- a/libavfilter/opencl.h
> > +++ b/libavfilter/opencl.h
> > @@ -73,6 +73,44 @@ typedef struct OpenCLFilterContext {
> >  goto fail; \
> >  }  \
> >  } while(0)
> > +/**
> > +  * release an OpenCL Kernel
> > +  */
> > +#define CL_RELEASE_KERNEL(k)  \
> > +do {  \
> > +if (k) {  \
> > +cle = clReleaseKernel(k); \
> > +if (cle != CL_SUCCESS)\
> > +av_log(avctx, AV_LOG_ERROR, "Failed to release "  \
> > +   "OpenCL kernel: %d.\n", cle);  \
> > +} \
> > +} while(0)
> > +
> > +/**
> > +  * release an OpenCL Memory Object
> > +  */
> > +#define CL_RELEASE_MEMORY(m)  \
> > +do {  \
> > +if (m) {  \
> > +cle = clReleaseMemObject(m);  \
> > +if (cle != CL_SUCCESS)\
> > +av_log(avctx, AV_LOG_ERROR, "Failed to release "  \
> > +   "OpenCL memory: %d.\n", cle);  \
> > +} \
> > +} while(0)
> > +
> > +/**
> > +  * release an OpenCL Command Queue
> > +  */
> > +#define CL_RELEASE_QUEUE(q)   \
> > +do {  \
> > +if (q) {  \
> > +cle = clReleaseCommandQueue(q);   \
> > +if (cle != CL_SUCCESS)\
> > +av_log(avctx, AV_LOG_ERROR, "Failed to release "  \
> > +   "cl command queue: %d.\n", cle);   \
> > +} \
> > +} while(0)
> >
> >  /**
> >   * Return that all inputs and outputs support only AV_PIX_FMT_OPENCL.
> >
> 
> LGTM.
Pushed this patch so we can use it in other opencl filters. Thanks!
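
As a note for other filter authors, the pattern these macros follow can be 
tried without OpenCL using the sketch below. fake_release() and 
RELEASE_OBJECT() are hypothetical stand-ins. The point to notice is that, like 
the quoted macros (which use cle and avctx), the macro relies on a variable 
from the calling scope, here err.

```c
#include <stdio.h>

int fake_release_calls = 0;

/* Mock release function: returns 0 on success, like CL_SUCCESS. */
int fake_release(void *obj)
{
    fake_release_calls++;
    return obj ? 0 : -1;
}

/*
 * Release only if the handle is set, and log (but do not abort) on failure.
 * Requires an "int err" in the calling scope, just as the quoted macros
 * require "cle" and "avctx".
 */
#define RELEASE_OBJECT(o)                                                \
do {                                                                     \
    if (o) {                                                             \
        err = fake_release(o);                                           \
        if (err != 0)                                                    \
            fprintf(stderr, "Failed to release object: %d.\n", err);     \
    }                                                                    \
} while (0)

int cleanup(void *a, void *b)
{
    int err = 0;            /* assigned by the macro */

    RELEASE_OBJECT(a);
    RELEASE_OBJECT(b);      /* NULL handles are skipped */
    return err;
}
```

The do { ... } while (0) wrapper lets the macro be used like an ordinary 
statement, including inside an unbraced if/else. Skipping NULL handles keeps 
uninit paths simple when initialisation failed halfway.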
> 
> Thanks,
> 
> - Mark

Re: [FFmpeg-devel] [PATCH V2 2/2] lavfi/opencl: add nlmeans_opencl filter

2019-04-23 Thread Song, Ruiling


> -Original Message-
> From: Song, Ruiling
> Sent: Sunday, April 21, 2019 8:18 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: RE: [FFmpeg-devel] [PATCH V2 2/2] lavfi/opencl: add
> nlmeans_opencl filter
> 
> 
> 
> > -Original Message-
> > From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On
> Behalf Of
> > Mark Thompson
> > Sent: Saturday, April 20, 2019 11:08 PM
> > To: ffmpeg-devel@ffmpeg.org
> > Subject: Re: [FFmpeg-devel] [PATCH V2 2/2] lavfi/opencl: add
> nlmeans_opencl
> > filter
> >
> > On 17/04/2019 03:43, Song, Ruiling wrote:
> > >> -Original Message-
> > >> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On
> Behalf
> > Of
> > >> Mark Thompson
> > >> Sent: Wednesday, April 17, 2019 5:28 AM
> > >> To: ffmpeg-devel@ffmpeg.org
> > >> Subject: Re: [FFmpeg-devel] [PATCH V2 2/2] lavfi/opencl: add
> > nlmeans_opencl
> > >> filter
> > >>
> > >> On 12/04/2019 16:09, Ruiling Song wrote:
> > >>> Signed-off-by: Ruiling Song 
> > >>
> > >> I can't work out where the problem is, but there is something really
> weirdly
> > >> nondeterministic going on here.
> > >>
> > >> E.g.
> > >>
> > >> $ ./ffmpeg_g -y -init_hw_device opencl:0.0 -i ~/video/test/jellyfish-120-
> > mbps-
> > >> 4k-uhd-hevc-10bit.mkv -an -filter_hw_device opencl0 -vf
> > >>
> format=yuv420p,hwupload,nlmeans_opencl,hwdownload,format=yuv420p -
> > >> frames:v 10 -f framemd5 -
> > >> ...
> > >> 0,  0,  0,1, 12441600, 
> > >> 8b8805818076b23ae6f80ec2b5a349d4
> > >> 0,  1,  1,1, 12441600, 
> > >> 7a7fdaa083dc337cfb6af31b643f30a3
> > >> 0,  2,  2,1, 12441600, 
> > >> b10ef2a1e5125cc67e262e086f8040b5
> > >> 0,  3,  3,1, 12441600, 
> > >> c06b53ad90e0357e537df41b63d5b1dc
> > >> 0,  4,  4,1, 12441600, 
> > >> 5aa2da07703859a3dee080847dd17d46
> > >> 0,  5,  5,1, 12441600, 
> > >> 733364c6be6af825057e905a6092937d
> > >> 0,  6,  6,1, 12441600, 
> > >> 47edae2dec956a582b04babb745d26b0
> > >> 0,  7,  7,1, 12441600, 
> > >> 4e45fe8268df4298d06a17ab8e46c3e9
> > >> 0,  8,  8,1, 12441600, 
> > >> 960d722a3f8787c9191299a114c04174
> > >> 0,  9,  9,1, 12441600, 
> > >> e759c07ee4834a9cf94bfcb4128e7612
> > >> $ ./ffmpeg_g -y -init_hw_device opencl:0.0 -i ~/video/test/jellyfish-120-
> > mbps-
> > >> 4k-uhd-hevc-10bit.mkv -an -filter_hw_device opencl0 -vf
> > >>
> format=yuv420p,hwupload,nlmeans_opencl,hwdownload,format=yuv420p -
> > >> frames:v 10 -f framemd5 -
> > >> 0,  0,  0,1, 12441600, 
> > >> 8b8805818076b23ae6f80ec2b5a349d4
> > >> [Parsed_nlmeans_opencl_2 @ 0x5557ae580d00] integral image
> overflow
> > >> 2157538
> > >> 0,  1,  1,1, 12441600, 
> > >> bce72e10a9f1118940c5a8392ad78ec3
> > >> 0,  2,  2,1, 12441600, 
> > >> b10ef2a1e5125cc67e262e086f8040b5
> > >> 0,  3,  3,1, 12441600, 
> > >> c06b53ad90e0357e537df41b63d5b1dc
> > >> 0,  4,  4,1, 12441600, 
> > >> 5aa2da07703859a3dee080847dd17d46
> > >> 0,  5,  5,1, 12441600, 
> > >> 733364c6be6af825057e905a6092937d
> > >> 0,  6,  6,1, 12441600, 
> > >> 47edae2dec956a582b04babb745d26b0
> > >> 0,  7,  7,1, 12441600, 
> > >> 4e45fe8268df4298d06a17ab8e46c3e9
> > >> 0,  8,  8,1, 12441600, 
> > >> 960d722a3f8787c9191299a114c04174
> > >> 0,  9,  9,1, 12441600, 
> > >> e759c07ee4834a9cf94bfcb4128e7612
> > >> $ ./ffmpeg_g -y -init_hw_device opencl:0.0 -i ~/video/test/jellyfish-120-
> > mbps-
> > >> 4k-uhd-hevc-10bit.mkv -an -filter_hw_device opencl0 -vf
> > >>
> format=yuv420p,hwupload,nlmeans_opencl,hwdownload,format=yuv420p -
> > >> frames:v 10 -f framemd5 -
>

Re: [FFmpeg-devel] [PATCH V2 2/2] lavfi/opencl: add nlmeans_opencl filter

2019-04-21 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Mark Thompson
> Sent: Saturday, April 20, 2019 11:08 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH V2 2/2] lavfi/opencl: add nlmeans_opencl
> filter
> 
> On 17/04/2019 03:43, Song, Ruiling wrote:
> >> -Original Message-
> >> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of
> >> Mark Thompson
> >> Sent: Wednesday, April 17, 2019 5:28 AM
> >> To: ffmpeg-devel@ffmpeg.org
> >> Subject: Re: [FFmpeg-devel] [PATCH V2 2/2] lavfi/opencl: add
> nlmeans_opencl
> >> filter
> >>
> >> On 12/04/2019 16:09, Ruiling Song wrote:
> >>> Signed-off-by: Ruiling Song 
> >>
> >> I can't work out where the problem is, but there is something really 
> >> weirdly
> >> nondeterministic going on here.
> >>
> >> E.g.
> >>
> >> $ ./ffmpeg_g -y -init_hw_device opencl:0.0 -i ~/video/test/jellyfish-120-
> mbps-
> >> 4k-uhd-hevc-10bit.mkv -an -filter_hw_device opencl0 -vf
> >> format=yuv420p,hwupload,nlmeans_opencl,hwdownload,format=yuv420p -
> >> frames:v 10 -f framemd5 -
> >> ...
> >> 0,  0,  0,1, 12441600, 
> >> 8b8805818076b23ae6f80ec2b5a349d4
> >> 0,  1,  1,1, 12441600, 
> >> 7a7fdaa083dc337cfb6af31b643f30a3
> >> 0,  2,  2,1, 12441600, 
> >> b10ef2a1e5125cc67e262e086f8040b5
> >> 0,  3,  3,1, 12441600, 
> >> c06b53ad90e0357e537df41b63d5b1dc
> >> 0,  4,  4,1, 12441600, 
> >> 5aa2da07703859a3dee080847dd17d46
> >> 0,  5,  5,1, 12441600, 
> >> 733364c6be6af825057e905a6092937d
> >> 0,  6,  6,1, 12441600, 
> >> 47edae2dec956a582b04babb745d26b0
> >> 0,  7,  7,1, 12441600, 
> >> 4e45fe8268df4298d06a17ab8e46c3e9
> >> 0,  8,  8,1, 12441600, 
> >> 960d722a3f8787c9191299a114c04174
> >> 0,  9,  9,1, 12441600, 
> >> e759c07ee4834a9cf94bfcb4128e7612
> >> $ ./ffmpeg_g -y -init_hw_device opencl:0.0 -i ~/video/test/jellyfish-120-
> mbps-
> >> 4k-uhd-hevc-10bit.mkv -an -filter_hw_device opencl0 -vf
> >> format=yuv420p,hwupload,nlmeans_opencl,hwdownload,format=yuv420p -
> >> frames:v 10 -f framemd5 -
> >> 0,  0,  0,1, 12441600, 
> >> 8b8805818076b23ae6f80ec2b5a349d4
> >> [Parsed_nlmeans_opencl_2 @ 0x5557ae580d00] integral image overflow
> >> 2157538
> >> 0,  1,  1,1, 12441600, 
> >> bce72e10a9f1118940c5a8392ad78ec3
> >> 0,  2,  2,1, 12441600, 
> >> b10ef2a1e5125cc67e262e086f8040b5
> >> 0,  3,  3,1, 12441600, 
> >> c06b53ad90e0357e537df41b63d5b1dc
> >> 0,  4,  4,1, 12441600, 
> >> 5aa2da07703859a3dee080847dd17d46
> >> 0,  5,  5,1, 12441600, 
> >> 733364c6be6af825057e905a6092937d
> >> 0,  6,  6,1, 12441600, 
> >> 47edae2dec956a582b04babb745d26b0
> >> 0,  7,  7,1, 12441600, 
> >> 4e45fe8268df4298d06a17ab8e46c3e9
> >> 0,  8,  8,1, 12441600, 
> >> 960d722a3f8787c9191299a114c04174
> >> 0,  9,  9,1, 12441600, 
> >> e759c07ee4834a9cf94bfcb4128e7612
> >> $ ./ffmpeg_g -y -init_hw_device opencl:0.0 -i ~/video/test/jellyfish-120-
> mbps-
> >> 4k-uhd-hevc-10bit.mkv -an -filter_hw_device opencl0 -vf
> >> format=yuv420p,hwupload,nlmeans_opencl,hwdownload,format=yuv420p -
> >> frames:v 10 -f framemd5 -
> >> 0,  0,  0,1, 12441600, 
> >> 8b8805818076b23ae6f80ec2b5a349d4
> >> 0,  1,  1,1, 12441600, 
> >> 7a7fdaa083dc337cfb6af31b643f30a3
> >> [Parsed_nlmeans_opencl_2 @ 0x557c51fbfe80] integral image overflow
> >> 2098545
> >> 0,  2,  2,1, 12441600, 
> >> 68b390535adc5cfa0f8a7942c42a47ca
> >> 0,  3,  3,1, 12441600, 
> >> c06b53ad90e0357e537df41b63d5b1dc
> >> 0,  4,  4,1, 12441600, 
> >> 5aa2da07703859a3dee080847dd17d46
> >> 0,  5,  5,1, 12441600, 
> >> 733364c6be6af825057e905a6092937d
> >

Re: [FFmpeg-devel] [PATCH V2 2/2] lavfi/opencl: add nlmeans_opencl filter

2019-04-16 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Mark Thompson
> Sent: Wednesday, April 17, 2019 5:28 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH V2 2/2] lavfi/opencl: add nlmeans_opencl
> filter
> 
> On 12/04/2019 16:09, Ruiling Song wrote:
> > Signed-off-by: Ruiling Song 
> 
> I can't work out where the problem is, but there is something really weirdly
> nondeterministic going on here.
> 
> E.g.
> 
> $ ./ffmpeg_g -y -init_hw_device opencl:0.0 -i ~/video/test/jellyfish-120-mbps-4k-uhd-hevc-10bit.mkv -an -filter_hw_device opencl0 -vf format=yuv420p,hwupload,nlmeans_opencl,hwdownload,format=yuv420p -frames:v 10 -f framemd5 -
> ...
> 0,  0,  0,1, 12441600, 8b8805818076b23ae6f80ec2b5a349d4
> 0,  1,  1,1, 12441600, 7a7fdaa083dc337cfb6af31b643f30a3
> 0,  2,  2,1, 12441600, b10ef2a1e5125cc67e262e086f8040b5
> 0,  3,  3,1, 12441600, c06b53ad90e0357e537df41b63d5b1dc
> 0,  4,  4,1, 12441600, 5aa2da07703859a3dee080847dd17d46
> 0,  5,  5,1, 12441600, 733364c6be6af825057e905a6092937d
> 0,  6,  6,1, 12441600, 47edae2dec956a582b04babb745d26b0
> 0,  7,  7,1, 12441600, 4e45fe8268df4298d06a17ab8e46c3e9
> 0,  8,  8,1, 12441600, 960d722a3f8787c9191299a114c04174
> 0,  9,  9,1, 12441600, e759c07ee4834a9cf94bfcb4128e7612
> $ ./ffmpeg_g -y -init_hw_device opencl:0.0 -i ~/video/test/jellyfish-120-mbps-4k-uhd-hevc-10bit.mkv -an -filter_hw_device opencl0 -vf format=yuv420p,hwupload,nlmeans_opencl,hwdownload,format=yuv420p -frames:v 10 -f framemd5 -
> 0,  0,  0,1, 12441600, 8b8805818076b23ae6f80ec2b5a349d4
> [Parsed_nlmeans_opencl_2 @ 0x5557ae580d00] integral image overflow 2157538
> 0,  1,  1,1, 12441600, bce72e10a9f1118940c5a8392ad78ec3
> 0,  2,  2,1, 12441600, b10ef2a1e5125cc67e262e086f8040b5
> 0,  3,  3,1, 12441600, c06b53ad90e0357e537df41b63d5b1dc
> 0,  4,  4,1, 12441600, 5aa2da07703859a3dee080847dd17d46
> 0,  5,  5,1, 12441600, 733364c6be6af825057e905a6092937d
> 0,  6,  6,1, 12441600, 47edae2dec956a582b04babb745d26b0
> 0,  7,  7,1, 12441600, 4e45fe8268df4298d06a17ab8e46c3e9
> 0,  8,  8,1, 12441600, 960d722a3f8787c9191299a114c04174
> 0,  9,  9,1, 12441600, e759c07ee4834a9cf94bfcb4128e7612
> $ ./ffmpeg_g -y -init_hw_device opencl:0.0 -i ~/video/test/jellyfish-120-mbps-4k-uhd-hevc-10bit.mkv -an -filter_hw_device opencl0 -vf format=yuv420p,hwupload,nlmeans_opencl,hwdownload,format=yuv420p -frames:v 10 -f framemd5 -
> 0,  0,  0,1, 12441600, 8b8805818076b23ae6f80ec2b5a349d4
> 0,  1,  1,1, 12441600, 7a7fdaa083dc337cfb6af31b643f30a3
> [Parsed_nlmeans_opencl_2 @ 0x557c51fbfe80] integral image overflow 2098545
> 0,  2,  2,1, 12441600, 68b390535adc5cfa0f8a7942c42a47ca
> 0,  3,  3,1, 12441600, c06b53ad90e0357e537df41b63d5b1dc
> 0,  4,  4,1, 12441600, 5aa2da07703859a3dee080847dd17d46
> 0,  5,  5,1, 12441600, 733364c6be6af825057e905a6092937d
> 0,  6,  6,1, 12441600, 47edae2dec956a582b04babb745d26b0
> 0,  7,  7,1, 12441600, 4e45fe8268df4298d06a17ab8e46c3e9
> 0,  8,  8,1, 12441600, 960d722a3f8787c9191299a114c04174
> 0,  9,  9,1, 12441600, e759c07ee4834a9cf94bfcb4128e7612
> 
> Frame 1 gave an overflow on the second run, and gets a different answer, then
> frame 2 in the same way on the third run?  I can't characterise when this
> happens, it seems to be pretty random with low probability.

I tried to reproduce this on my SKL and KBL machines, with both Beignet and
Neo, but could not reproduce the issue.
Because of a network problem I could not download the video sample you used
(I am using https://4ksamples.com/ses-astra-uhd-test-2-2160p-uhdtv/ instead);
I can try again later with the same video.
Maybe it is an OpenCL driver issue? I am not sure yet. Could you tell me which
hardware and OpenCL driver version you are using, so that I can do some
debugging if possible?

> 
> (Input here is a 4K file from , but I don't think it 
> matters - I
> saw it with others sometimes as well.)
> 
> >  configure   |   1 +
> >  doc/filters.texi|   4 +
> >  libavfilter/Makefile|   1 +
> >  libavfilter/allfilters.c|   1 +
> >  libavfilter/opencl/nlmeans.cl   | 115 +
> >  

Re: [FFmpeg-devel] [PATCH v4 1/7] vf_crop: Add support for cropping hardware frames

2019-04-14 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Mark Thompson
> Sent: Wednesday, April 10, 2019 6:07 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH v4 1/7] vf_crop: Add support for cropping
> hardware frames
> 
> Set the cropping fields in the AVFrame.
The patchset looks fine to me, but I am not sure whether others are happy with
this crop patch.
If nobody objects, I think you can push the patchset once you have made sure
it does not trigger the build failures reported by Michael.
(There is one unnecessary empty line below; please remove it.)

Thanks!
Ruiling

> ---
> On 26/03/2019 10:59, Song, Ruiling wrote:>
> > I think we need to make scale_vaapi evaluate input dimensions considering
> crop information. What do you think?
> 
> I agree.  But the cropping information is currently carried on the frame, not 
> at
> any higher level (from the codec context or on the filter link), so we don't 
> have
> any idea how big the output frames need to be at setup time when we have to
> set the size on the output link.  Making it work requires carrying more 
> complete
> information through filter setup, similar to the problem with colour range and
> other properties which aren't reflected in the existing format.  (This is on 
> my to-
> do list.)
> 
> 
>  libavfilter/vf_crop.c | 74 +--
>  1 file changed, 51 insertions(+), 23 deletions(-)
> 
> diff --git a/libavfilter/vf_crop.c b/libavfilter/vf_crop.c
> index 84be4c7d0d..7f6b0f03d3 100644
> --- a/libavfilter/vf_crop.c
> +++ b/libavfilter/vf_crop.c
> @@ -98,9 +98,17 @@ static int query_formats(AVFilterContext *ctx)
> 
>  for (fmt = 0; av_pix_fmt_desc_get(fmt); fmt++) {
>  const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(fmt);
> -if (!(desc->flags & (AV_PIX_FMT_FLAG_HWACCEL |
> AV_PIX_FMT_FLAG_BITSTREAM)) &&
> -!((desc->log2_chroma_w || desc->log2_chroma_h) && !(desc->flags &
> AV_PIX_FMT_FLAG_PLANAR)) &&
> -(ret = ff_add_format(, fmt)) < 0)
> +if (desc->flags & AV_PIX_FMT_FLAG_BITSTREAM)
> +continue;
> +if (!(desc->flags & AV_PIX_FMT_FLAG_HWACCEL)) {
> +// Not usable if there is any subsampling but the format is
> +// not planar (e.g. YUYV422).
> +if ((desc->log2_chroma_w || desc->log2_chroma_h) &&
> +!(desc->flags & AV_PIX_FMT_FLAG_PLANAR))
> +continue;
> +}
> +ret = ff_add_format(, fmt);
> +if (ret < 0)
>  return ret;
>  }
> 
> @@ -157,8 +165,14 @@ static int config_input(AVFilterLink *link)
>  s->var_values[VAR_POS]   = NAN;
> 
>  av_image_fill_max_pixsteps(s->max_step, NULL, pix_desc);
> -s->hsub = pix_desc->log2_chroma_w;
> -s->vsub = pix_desc->log2_chroma_h;
> +
> +if (pix_desc->flags & AV_PIX_FMT_FLAG_HWACCEL) {
> +s->hsub = 1;
> +s->vsub = 1;
> +} else {
> +s->hsub = pix_desc->log2_chroma_w;
> +s->vsub = pix_desc->log2_chroma_h;
> +}
> 
>  if ((ret = av_expr_parse_and_eval(, (expr = s->w_expr),
>var_names, s->var_values,
> @@ -237,9 +251,15 @@ fail_expr:
>  static int config_output(AVFilterLink *link)
>  {
>  CropContext *s = link->src->priv;
> +const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(link->format);
> 
> -link->w = s->w;
> -link->h = s->h;
> +if (desc->flags & AV_PIX_FMT_FLAG_HWACCEL) {
> +// Hardware frames adjust the cropping regions rather than
> +// changing the frame size.
> +} else {
> +link->w = s->w;
> +link->h = s->h;
> +}
>  link->sample_aspect_ratio = s->out_sar;
> 
>  return 0;
> @@ -252,9 +272,6 @@ static int filter_frame(AVFilterLink *link, AVFrame
> *frame)
>  const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(link->format);
>  int i;
> 
> -frame->width  = s->w;
> -frame->height = s->h;
> -
>  s->var_values[VAR_N] = link->frame_count_out;
>  s->var_values[VAR_T] = frame->pts == AV_NOPTS_VALUE ?
>  NAN : frame->pts * av_q2d(link->time_base);
> @@ -285,22 +302,33 @@ static int filter_frame(AVFilterLink *link, AVFrame
> *frame)
>  (int)s->var_values[VAR_N], s->var_values[VAR_T], s-
> >var_values[VAR_POS],
>  s->x, s->y, s->x+s->w, s->y+s->h);
> 
>

Re: [FFmpeg-devel] [PATCH v4 2/7] doc/indevs: Add example using cropping to capture part of a plane

2019-04-14 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Mark Thompson
> Sent: Wednesday, April 10, 2019 6:07 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH v4 2/7] doc/indevs: Add example using cropping
> to capture part of a plane
> 
> ---
>  doc/indevs.texi | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/doc/indevs.texi b/doc/indevs.texi
> index 1d5ed65773..a4f0f608d7 100644
> --- a/doc/indevs.texi
> +++ b/doc/indevs.texi
> @@ -910,6 +910,14 @@ Capture from CRTC ID 42 at 60fps, map the result to
> VAAPI, convert to NV12 and e
>  ffmpeg -crtc_id 42 -framerate 60 -f kmsgrab -i - -vf
> 'hwmap=derive_device=vaapi,scale_vaapi=w=1920:h=1080:format=nv12' -c:v
> h264_vaapi output.mp4
>  @end example
> 
> +@item
> +To capture only part of a plane the output can be cropped - this can be used 
> to
> capture
> +a single window, as long as it has a known absolute position.  For example, 
> to
Should this be "absolute position and window size"?
> capture
> +and encode the middle quarter of a 1920x1080 plane:
> +@example
> +ffmpeg -f kmsgrab -i - -vf
> 'hwmap=derive_device=vaapi,crop=960:540:480:270,scale_vaapi=960:540:nv12'
> -c:v h264_vaapi output.mp4
> +@end example
> +
>  @end itemize
> 
>  @section lavfi
> --
> 2.20.1
> 
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
> 
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH] lavfi: add nlmeans_opencl filter

2019-04-14 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Mark Thompson
> Sent: Sunday, April 14, 2019 1:23 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH] lavfi: add nlmeans_opencl filter
> 
> On 12/04/2019 08:38, Song, Ruiling wrote:
> >>>> +#define RELEASE_KERNEL(k)\
> >>>> +do { \
> >>>> +if (k) { \
> >>>> +cle = clReleaseKernel(k);\
> >>>> +if (cle != CL_SUCCESS)   \
> >>>> +av_log(avctx, AV_LOG_ERROR, "Failed to release " \
> >>>> +   "kernel: %d.\n", cle);\
> >>>> +}\
> >>>> +} while(0)
> >>>
> >>> This appears multiple times here and also in other filters.  Maybe it 
> >>> should
> be a
> >>> macro in opencl.h like CL_SET_KERNEL_ARG?
> > Hi Mark,
> >
> > I am rethinking about this problem, can we just simply call 
> > clReleaseKernel()
> and not checking the input and the error_code.
> > OpenCL spec has require implementation to check the input argument. So I
> think we can just ignore the if-null check.
> 
> I'm not sure that's true?  The spec allows a CL_INVALID_KERNEL error, but
> doesn't offer any clear indication of when it should be returned (NULL is
> distinguished in other cases, but not here).  Random pointers certainly do 
> crash
> implementations, so they aren't interpreting it as a requirement to validate 
> the
> pointer generally (against some list in the context, say).
Yes, it seems the spec is not clear about the null pointer check.
Because a null pointer check is cheap, I assumed every well-written OpenCL
driver would do it.
Maybe you are right; I am not so sure now :(
So we can keep the check as before. I have added a macro for this. Please take
a look at V2 when you have time.

Thanks!
Ruiling
> 
> The standard ICD loader does have a null check returning CL_INVALID_KERNEL,
> but there is no requirement that it is used rather than linking to a 
> particular ICD
> directly.
> 
> > As we are destroying the objects, is it still useful to care the error code
> returned?
> 
> Probably not, I agree.
> 
> - Mark

Re: [FFmpeg-devel] [PATCH] lavfi: add nlmeans_opencl filter

2019-04-12 Thread Song, Ruiling
> > > +#define RELEASE_KERNEL(k)\
> > > +do { \
> > > +if (k) { \
> > > +cle = clReleaseKernel(k);\
> > > +if (cle != CL_SUCCESS)   \
> > > +av_log(avctx, AV_LOG_ERROR, "Failed to release " \
> > > +   "kernel: %d.\n", cle);\
> > > +}\
> > > +} while(0)
> >
> > This appears multiple times here and also in other filters.  Maybe it 
> > should be a
> > macro in opencl.h like CL_SET_KERNEL_ARG?
Hi Mark,

I have been rethinking this. Can we simply call clReleaseKernel() without
checking the input or the error code?
The OpenCL spec requires the implementation to check the input argument, so I
think we can drop the if-null check.
And since we are destroying the objects, is it still useful to care about the
returned error code?

Thanks!
Ruiling

Re: [FFmpeg-devel] [PATCH, v2] lavu/hwcontext_qsv: Fix the realign check for hwupload

2019-04-11 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Fu, Linjie
> Sent: Thursday, April 11, 2019 3:59 PM
> To: Li, Zhong ; FFmpeg development discussions and
> patches 
> Subject: Re: [FFmpeg-devel] [PATCH, v2] lavu/hwcontext_qsv: Fix the realign
> check for hwupload
> 
> > -Original Message-
> > From: Li, Zhong
> > Sent: Thursday, April 11, 2019 10:51
> > To: FFmpeg development discussions and patches  > de...@ffmpeg.org>
> > Cc: Fu, Linjie 
> > Subject: RE: [FFmpeg-devel] [PATCH, v2] lavu/hwcontext_qsv: Fix the realign
> > check for hwupload
> >
> > > From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On
> > Behalf
> > > Of Linjie Fu
> > > Sent: Wednesday, April 10, 2019 7:56 PM
> > > To: ffmpeg-devel@ffmpeg.org
> > > Cc: Fu, Linjie 
> > > Subject: [FFmpeg-devel] [PATCH, v2] lavu/hwcontext_qsv: Fix the realign
> > > check for hwupload
> > >
> > > Fix the aligned check in hwupload, input surface should be 16 aligned too.
> > >
> > > Fix #7830.
> > >
> > > Signed-off-by: Linjie Fu 
> > > ---
> > >
> > >  libavutil/hwcontext_qsv.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/libavutil/hwcontext_qsv.c b/libavutil/hwcontext_qsv.c index
> > > b6d8bfe2bf..8b000fe636 100644
> > > --- a/libavutil/hwcontext_qsv.c
> > > +++ b/libavutil/hwcontext_qsv.c
> > > @@ -892,7 +892,8 @@ static int
> > > qsv_transfer_data_to(AVHWFramesContext *ctx, AVFrame *dst,
> > >  return ret;
> > >
> > >
> > > -if (src->height & 16 || src->linesize[0] & 16) {
> > > +if (src->height & 15 || src->width & 15 ||
> > > +src->linesize[0] & 15) {
> >
> > Should be better to use FFALIGN()
> >
> > Another question is it really necessary to check width alignment if we 
> > already
> > checked linesize to fix this issue?
> > (I guess it it not necessary, and if it is needed, many other places 
> > probably
> > needed to be changed too.)
> 
> Checked the code in qsvvpp.c and qsvenc.c, it has the same check using &:
> 
> libavcodec/qsvenc.c: if ((frame->height & 31 || frame->linesize[0] & 
> (q-
> >width_align - 1)) ||
> libavfilter/qsvvpp.c:if (picref->height & 31 || picref->linesize[0] & 
> 31) {
> 
> These should be matched, so the FFALIGN is better for all?
I think FFALIGN() is meant for aligning a value, not for checking alignment.
> 
> For the width check, added for redundant check, can be removed to match the
> whole behavior.
Checking only linesize[0] sounds good.
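
To illustrate the difference discussed above, here is a minimal sketch in plain
C. ALIGN_UP is a local stand-in with the same expression as libavutil's
FFALIGN(); the function names are made up for this example and are not part of
FFmpeg. It shows why masking with 16 tests a single bit, while masking with 15
tests for a multiple of 16.

```c
#include <stdbool.h>

/* Local stand-in for FFALIGN(): rounds x up to a multiple of a.
 * a must be a power of two. */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Correct test: true when x is a multiple of 16 (low four bits clear). */
static bool is_aligned16(int x)
{
    return (x & 15) == 0;
}

/* Same test expressed through rounding: an aligned value is unchanged. */
static bool is_aligned16_via_round(int x)
{
    return ALIGN_UP(x, 16) == x;
}

/* The pre-patch condition: it only looks at bit 4, so it flags some
 * aligned values (720) and misses some unaligned ones (8). */
static bool old_check_flags_unaligned(int x)
{
    return (x & 16) != 0;
}
```

With these helpers, 720 is a multiple of 16 but is still flagged by the old
condition, while 8 is not a multiple of 16 and passes it unnoticed.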
> 
> Thanks for comments.
> 

Re: [FFmpeg-devel] [PATCH] lavfi: add nlmeans_opencl filter

2019-04-10 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Carl Eugen Hoyos
> Sent: Tuesday, April 9, 2019 9:21 PM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH] lavfi: add nlmeans_opencl filter
> 
> 2019-04-09 4:54 GMT+02:00, Song, Ruiling :
> 
> >> > +kernel void vert_sum(__global uint4 *ii,
> >> > + int width,
> >> > + int height)
> >> > +{
> >> > +int x = get_global_id(0);
> >> > +uint4 sum = 0;
> >> > +for (int i = 0; i < height; i++) {
> >> > +ii[i * width + x] += sum;
> >> > +sum = ii[i * width + x];
> >>
> >> This looks like it might be able to overflow in extreme cases?
> >>
> >> 3840 * 2160 * (1 - 0)^2 * 255 * 255 = 539,343,360,000 which
> >> is a long way out of range for a 32-bit int.  That requires
> >> impossible input (all pixels differing by the most extreme
> >> value), but something like a chequerboard might be of the
> >> same order?
> > Yes this is a dilemma for me. Generally the filter is with
> > high computation cost.
> > To fix the overflow, we have to use 64bit integer for the
> > integral image. Most GPUs are not good at 64bit integer
> > calculation I think. May be we can try later.
> > So I would prefer to stay with 32bit integer for a while.
> 
> Can the overflow be detected at runtime?
Will add the check.
> 
> Could the user choose between 32 and 64 bit calculation?
I may mark this as TODO.
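
For reference, the worst-case figure quoted above and one possible way to
notice a 32-bit wrap can be sketched in host-side C as follows. This is only an
illustration under my own assumptions (8-bit samples, hypothetical helper
names); it is not necessarily the check the filter ends up using.

```c
#include <stdbool.h>
#include <stdint.h>

/* Upper bound on the last integral-image entry when every pixel
 * contributes the largest squared difference of 8-bit samples. */
static uint64_t worst_case_integral(uint32_t width, uint32_t height)
{
    return (uint64_t)width * height * 255u * 255u;
}

/* Adds v to *sum using unsigned wraparound arithmetic.
 * Returns true if the 32-bit accumulator wrapped. */
static bool add_u32_detect_wrap(uint32_t *sum, uint32_t v)
{
    uint32_t before = *sum;
    *sum = before + v;
    return *sum < before;
}
```

For a 3840x2160 frame the bound is 539,343,360,000, far above UINT32_MAX, which
is why only unusual input can get close to overflowing in practice.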
> 
> Carl Eugen

Re: [FFmpeg-devel] [PATCH] lavfi: add nlmeans_opencl filter

2019-04-08 Thread Song, Ruiling
Thanks for the valuable comments!

> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Mark Thompson
> Sent: Tuesday, April 9, 2019 4:26 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH] lavfi: add nlmeans_opencl filter
> 
> On 01/04/2019 08:52, Ruiling Song wrote:
> > Signed-off-by: Ruiling Song 
> > ---
> > This filter runs about 2x faster on integrated GPU than nlmeans on my 
> > Skylake
> CPU.
> > Anybody like to give some comments?
> 
> Nice!
> 
> >  configure   |   1 +
> >  doc/filters.texi|   4 +
> >  libavfilter/Makefile|   1 +
> >  libavfilter/allfilters.c|   1 +
> >  libavfilter/opencl/nlmeans.cl   | 108 +
> >  libavfilter/opencl_source.h |   1 +
> >  libavfilter/vf_nlmeans_opencl.c | 390 
> >  7 files changed, 506 insertions(+)
> >  create mode 100644 libavfilter/opencl/nlmeans.cl
> >  create mode 100644 libavfilter/vf_nlmeans_opencl.c
> >
> > diff --git a/configure b/configure
> > index f6123f53e5..a233512491 100755
> > --- a/configure
> > +++ b/configure
> > @@ -3460,6 +3460,7 @@ mpdecimate_filter_select="pixelutils"
> >  minterpolate_filter_select="scene_sad"
> >  mptestsrc_filter_deps="gpl"
> >  negate_filter_deps="lut_filter"
> > +nlmeans_opencl_filter_deps="opencl"
> >  nnedi_filter_deps="gpl"
> >  ocr_filter_deps="libtesseract"
> >  ocv_filter_deps="libopencv"
> > diff --git a/doc/filters.texi b/doc/filters.texi
> > index 867607d870..21c2c1a4b5 100644
> > --- a/doc/filters.texi
> > +++ b/doc/filters.texi
> > @@ -19030,6 +19030,10 @@ Apply erosion filter with threshold0 set to 30,
> threshold1 set 40, threshold2 se
> >  @end example
> >  @end itemize
> >
> > +@section nlmeans_opencl
> > +
> > +Non-local Means denoise filter through OpenCL, this filter accepts same
> options as @ref{nlmeans}.
> > +
> >  @section overlay_opencl
> >
> >  Overlay one video on top of another.
> > diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> > index fef6ec5c55..92039bfdcf 100644
> > --- a/libavfilter/Makefile
> > +++ b/libavfilter/Makefile
> > @@ -291,6 +291,7 @@ OBJS-$(CONFIG_MIX_FILTER)+= vf_mix.o
> >  OBJS-$(CONFIG_MPDECIMATE_FILTER) += vf_mpdecimate.o
> >  OBJS-$(CONFIG_NEGATE_FILTER) += vf_lut.o
> >  OBJS-$(CONFIG_NLMEANS_FILTER)+= vf_nlmeans.o
> > +OBJS-$(CONFIG_NLMEANS_OPENCL_FILTER) += vf_nlmeans_opencl.o
> opencl.o opencl/nlmeans.o
> >  OBJS-$(CONFIG_NNEDI_FILTER)  += vf_nnedi.o
> >  OBJS-$(CONFIG_NOFORMAT_FILTER)   += vf_format.o
> >  OBJS-$(CONFIG_NOISE_FILTER)  += vf_noise.o
> > diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
> > index c51ae0f3c7..2a6390c92d 100644
> > --- a/libavfilter/allfilters.c
> > +++ b/libavfilter/allfilters.c
> > @@ -277,6 +277,7 @@ extern AVFilter ff_vf_mix;
> >  extern AVFilter ff_vf_mpdecimate;
> >  extern AVFilter ff_vf_negate;
> >  extern AVFilter ff_vf_nlmeans;
> > +extern AVFilter ff_vf_nlmeans_opencl;
> >  extern AVFilter ff_vf_nnedi;
> >  extern AVFilter ff_vf_noformat;
> >  extern AVFilter ff_vf_noise;
> > diff --git a/libavfilter/opencl/nlmeans.cl b/libavfilter/opencl/nlmeans.cl
> > new file mode 100644
> > index 00..dcb04834ca
> > --- /dev/null
> > +++ b/libavfilter/opencl/nlmeans.cl
> > @@ -0,0 +1,108 @@
> > +/*
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with FFmpeg; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301
> USA
> > + */
> > +
> > +const sampler_t sampler = (CLK_NORMALIZED_COORDS_FALSE |
> > +   CLK_ADDRESS_CLAMP_TO_EDGE   |
> > +   CLK_FILTER_NEAREST);
> > +
> > +kernel void horiz_sum(__global uint4 *ii,
> > +  __read_only image2d_t src,
> > +  int width,
> > +  int height,
> > +  int4 dx,
> > +  int4 dy)
> > +{
> > +
> > +int y = get_global_id(0);
> > +int work_size = get_global_size(0);
> > +
> > +uint4 sum = (uint4)(0);
> > +float4 s2;
> > +for (int i = 0; i < width; i++) {
> > +float s1 = read_imagef(src, sampler, (int2)(i, y)).x;
> > +

Re: [FFmpeg-devel] [PATCH v2] libavutil/hwcontext_opencl.c: fix bug in `opencl_get_plane_format`

2019-04-08 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Mark Thompson
> Sent: Tuesday, April 9, 2019 3:49 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH v2] libavutil/hwcontext_opencl.c: fix bug 
> in
> `opencl_get_plane_format`
> 
> On 08/04/2019 03:01, Jarek Samic wrote:
> > The `opencl_get_plane_format` function was incorrectly determining the
> > value used to set the image channel order. This resulted in all RGB
> > pixel formats being set to the `CL_RGBA` pixel format, regardless of
> > whether or not they actually *were* RGBA.
> >
> > This patch fixes the issue by using the `offset` and depth of components
> > rather than the loop index to determine the value of `order`.
> >
> > Signed-off-by: Jarek Samic 
> > ---
> > I have updated this patch in response to the comments on the first version.
> > RGB is no longer special-cased, the 2, 3, and 4 mappings to `CL_R` have been
> > removed, and the mapping for `CL_ARGB` has been changed to the correct
> value.
> >
> >  libavutil/hwcontext_opencl.c | 8 +++-
> >  1 file changed, 3 insertions(+), 5 deletions(-)
> >
> > diff --git a/libavutil/hwcontext_opencl.c b/libavutil/hwcontext_opencl.c
> > index b116c5b708..593de1ca41 100644
> > --- a/libavutil/hwcontext_opencl.c
> > +++ b/libavutil/hwcontext_opencl.c
> > @@ -1419,8 +1419,9 @@ static int opencl_get_plane_format(enum
> AVPixelFormat pixfmt,
> >  // from the same component.
> >  if (step && comp->step != step)
> >  return AVERROR(EINVAL);
> > -order = order * 10 + c + 1;
> > +
> >  depth = comp->depth;
> > +order = order * 10 + comp->offset / ((depth + 7) / 8) + 1;
> >  step  = comp->step;
> >  alpha = (desc->flags & AV_PIX_FMT_FLAG_ALPHA &&
> >   c == desc->nb_components - 1);
> 
> This part LGTM, I can split it off and apply it on its own if you like.
> 
> > @@ -1456,14 +1457,11 @@ static int opencl_get_plane_format(enum
> AVPixelFormat pixfmt,
> >  case order: image_format->image_channel_order = type; break;
> >  switch (order) {
> >  CHANNEL_ORDER(1,CL_R);
> > -CHANNEL_ORDER(2,CL_R);
> > -CHANNEL_ORDER(3,CL_R);
> > -CHANNEL_ORDER(4,CL_R);
> >  CHANNEL_ORDER(12,   CL_RG);
> >  CHANNEL_ORDER(23,   CL_RG);
> 
> 23 should be gone too, I think?
Agreed.
> 
> >  CHANNEL_ORDER(1234, CL_RGBA);
> > +CHANNEL_ORDER(2341, CL_ARGB);
> >  CHANNEL_ORDER(3214, CL_BGRA);
> > -CHANNEL_ORDER(4123, CL_ARGB);
> 
> I'm not sure I believe this part:
> 
> 1 = R
> 2 = G
> 3 = B
> 4 = A
The assumption above does not hold.
The new logic builds the number from the offset index of each of R, G, B and A.
For CL_ARGB, R is at offset 2, G at offset 3, B at offset 4 and A at offset 1,
so it is 2341 that maps to ARGB.
It is interesting that these two ways of representing the swizzle sometimes
match and sometimes do not.
Thanks!
Ruiling
> 
> gives
> 
> RGBA -> 1234
> BGRA -> 3214
> ARGB -> 4123
> ABGR -> 4321
> 
> The others match, so why would ARGB be different?  2341 should be GBAR.
> 
> (Can you try this with multiple ARGB sources or OpenCL ICDs?  Maybe there is a
> bug somewhere else...)
> 
> >  #ifdef CL_ABGR
> >  CHANNEL_ORDER(4321, CL_ABGR);
> >  #endif
> >
> 
> Thanks,
> 
> - Mark

Re: [FFmpeg-devel] [PATCH] lavfi: add nlmeans_opencl filter

2019-04-07 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> myp...@gmail.com
> Sent: Monday, April 8, 2019 9:37 AM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH] lavfi: add nlmeans_opencl filter
> 
> On Mon, Apr 8, 2019 at 9:33 AM Song, Ruiling  wrote:
> >
> > > -Original Message-
> > > From: Song, Ruiling
> > > Sent: Monday, April 1, 2019 3:53 PM
> > > To: ffmpeg-devel@ffmpeg.org
> > > Cc: Song, Ruiling 
> > > Subject: [PATCH] lavfi: add nlmeans_opencl filter
> > >
> > > Signed-off-by: Ruiling Song 
> > > ---
> > > This filter runs about 2x faster on integrated GPU than nlmeans on my
> Skylake
> > > CPU.
> > > Anybody like to give some comments?
> >
> > Ping?
> >
> Tested and verified in i5-8265U

Thanks for testing. Comments on the code itself are welcome too.
The performance numbers depend heavily on the research-window parameters and
on the hardware.
You can play with the parameters to trade off speed against quality.

Thanks!
Ruiling
> 
> OpenCL CPU/pocl 1.2fps with 1080P input
> OpenCL GPU/intel NEO 1.2 fps with 1080P input

Re: [FFmpeg-devel] [PATCH] lavfi: add nlmeans_opencl filter

2019-04-07 Thread Song, Ruiling
> -Original Message-
> From: Song, Ruiling
> Sent: Monday, April 1, 2019 3:53 PM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Song, Ruiling 
> Subject: [PATCH] lavfi: add nlmeans_opencl filter
> 
> Signed-off-by: Ruiling Song 
> ---
> This filter runs about 2x faster on integrated GPU than nlmeans on my Skylake
> CPU.
> Anybody like to give some comments?

Ping?


Re: [FFmpeg-devel] [PATCH] libavutil/hwcontext_opencl.c: fix bug in `opencl_get_plane_format`

2019-04-07 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Cld fire
> Sent: Monday, April 8, 2019 8:11 AM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH] libavutil/hwcontext_opencl.c: fix bug in
> `opencl_get_plane_format`
> 
> >
> > For P010, I guess that division needs to round up?
> >
> 
> Yep, rounding the division up did the trick; thanks!
> 
> One last observation before I submit a new patch: I actually missed
> previously that the order number is still not lining up for the ARGB format
> (the order number that maps to ARGB in the code is currently 4123 while the
> new method using the offset and depth is ending up with an order number of
> 2341). Simply changing the order number that maps to ARGB from 4123 to 2341
> seems to work fine; is it okay to make that change or do the mapping
> numbers need to remain exactly the same?
I think the values do not need to stay exactly as before; you can update the
case values accordingly.
And 2341 seems correct for ARGB.
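
On the rounding point quoted above, a minimal sketch of why rounding the
division up matters for 10-bit formats such as P010. The helper names are made
up, the truncating variant is only my assumption about what the first version
of the patch computed (the thread does not show it), and the byte offset 2 for
the second chroma component is likewise an assumption about the layout.

```c
/* Bytes needed to store one component of the given bit depth,
 * rounded up, as suggested in the review: (depth + 7) / 8. */
static int element_size(int depth)
{
    return (depth + 7) / 8;
}

/* 1-based position of a component within its pixel. */
static int component_index(int offset, int depth)
{
    return offset / element_size(depth) + 1;
}

/* Assumed earlier behaviour: truncating division. Only valid for
 * depth >= 8, and it treats a 10-bit component as one byte wide. */
static int component_index_truncating(int offset, int depth)
{
    return offset / (depth / 8) + 1;
}
```

With a 10-bit component stored in two bytes, a component at byte offset 2 is
the second element of the pixel; the truncating version would report it as the
third.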

Thanks!
Ruiling

Re: [FFmpeg-devel] [PATCH] libavutil/hwcontext_opencl.c: fix bug in `opencl_get_plane_format`

2019-04-07 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Mark Thompson
> Sent: Monday, April 8, 2019 7:27 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH] libavutil/hwcontext_opencl.c: fix bug in
> `opencl_get_plane_format`
> >
> > This is mostly fine, but it looks like nv21, ayuv64le, and p010le are all
> > having their order numbers changed to broken values.
> 
> The changes to AYUV and NV21 both make sense - they can be supported
> because the layout works, but they require special treatment to use beyond 
> just
> taking the given planes in the order common to other formats.  It doesn't seem
> unreasonable to drop them because of that?  I don't think any existing code
> actually supports them (e.g. trying to overlay AYUV on anything is going to 
> mess
> up totally).
I think AYUV can be mapped the same way as ARGB.
For NV21, OpenCL does not support CL_GR, so I am OK with not supporting this
format unless someone has a strong reason to need it.

Thanks!
Ruiling
> 
> For P010, I guess that division needs to round up?  element_size = 
> (comp->depth
> + 7) / 8.
> 
> Thanks,
> 
> - Mark

Re: [FFmpeg-devel] [PATCH] configure: include pkgconfig path as vaapi header search

2019-04-01 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Dennis Mungai
> Sent: Thursday, March 28, 2019 7:11 AM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH] configure: include pkgconfig path as vaapi
> header search
> 
> On Thu, 28 Mar 2019 at 02:05, Mark Thompson  wrote:
> 
> > On 20/03/2019 07:57, Zhong Li wrote:
> > > Currently only the standard header path can be found,
> > > check_type/struct will fail if vaapi is installed somewhere else.
> > > ---
> > >  configure | 18 ++
> > >  1 file changed, 10 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/configure b/configure
> > > index eaf543df96..0e3c2d24bf 100755
> > > --- a/configure
> > > +++ b/configure
> > > @@ -6024,14 +6024,6 @@ check_type "windows.h d3d11.h"
> > "ID3D11VideoDecoder"
> > >  check_type "windows.h d3d11.h" "ID3D11VideoContext"
> > >  check_type "d3d9.h dxva2api.h" DXVA2_ConfigPictureDecode
> > -D_WIN32_WINNT=0x0602
> > >
> > > -check_type "va/va.h va/va_dec_hevc.h" "VAPictureParameterBufferHEVC"
> > > -check_struct "va/va.h" "VADecPictureParameterBufferVP9" bit_depth
> > > -check_struct "va/va.h va/va_vpp.h" "VAProcPipelineCaps" rotation_flags
> > > -check_type "va/va.h va/va_enc_hevc.h"
> "VAEncPictureParameterBufferHEVC"
> > > -check_type "va/va.h va/va_enc_jpeg.h"
> "VAEncPictureParameterBufferJPEG"
> > > -check_type "va/va.h va/va_enc_vp8.h"  "VAEncPictureParameterBufferVP8"
> > > -check_type "va/va.h va/va_enc_vp9.h"  "VAEncPictureParameterBufferVP9"
> > > -
> > >  check_type "vdpau/vdpau.h" "VdpPictureInfoHEVC"
> > >
> > >  if enabled cuda_sdk; then
> > > @@ -6469,6 +6461,16 @@ if enabled vaapi; then
> > >  check_cpp_condition vaapi_1 "va/va.h" "VA_CHECK_VERSION(1, 0, 0)"
> > >  fi
> > >
> > > +if enabled vaapi; then
> >
> > Merge this into the previous block, which has the same condition.
> >
> > > +check_type "va/va.h va/va_dec_hevc.h"
> "VAPictureParameterBufferHEVC"
> > > +check_struct "va/va.h" "VADecPictureParameterBufferVP9" bit_depth
> > > +check_struct "va/va.h va/va_vpp.h" "VAProcPipelineCaps"
> > rotation_flags
> > > +check_type "va/va.h va/va_enc_hevc.h"
> > "VAEncPictureParameterBufferHEVC"
> > > +check_type "va/va.h va/va_enc_jpeg.h"
> > "VAEncPictureParameterBufferJPEG"
> > > +check_type "va/va.h va/va_enc_vp8.h"
> > "VAEncPictureParameterBufferVP8"
> > > +check_type "va/va.h va/va_enc_vp9.h"
> > "VAEncPictureParameterBufferVP9"
> > > +fi
> > > +
> > >  if enabled_all opencl libdrm ; then
> > >  check_type "CL/cl_intel.h" "clCreateImageFromFdINTEL_fn" &&
> > >  enable opencl_drm_beignet
> > >
> >
> > LGTM with that.
> >
> > Thanks,
> >
> > - Mark
> >
> >
> Does a similar check exist for Intel's Neo OpenCL runtime?
I find that Neo does not install an intel-opencl.pc file. I guess the reason is
that Neo is designed to be loaded through the OpenCL ICD, so a pkg-config check
does not seem possible.
Currently FFmpeg only checks for the cl_va_api_media_sharing_intel.h header
file, which works in a similar way to the check for Beignet.
For example, on Ubuntu 18.04, the opencl-c-headers package shipped is version
2.2, which already includes this header file.
If you configure FFmpeg with --enable-opencl and also install Neo and
intel-media-driver, then everything should work.
The only thing you need to take care of is that both Neo and intel-media-driver
depend on gmmlib, so you should choose gmmlib and intel-media-driver versions
that work with the Neo release.

Ruiling

Re: [FFmpeg-devel] [PATCH] lavfi: add nlmeans_opencl filter

2019-04-01 Thread Song, Ruiling



> Can you supply some detailed performance data?

On my i7-6770HQ, nlmeans takes 1.2s to process one 1080p frame,
and nlmeans_opencl takes 500ms to process one frame.
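
For reference, the ratio implied by those two figures can be worked out as in
the sketch below. It is only arithmetic on the numbers quoted above; the helper
names are made up for illustration and nothing here was measured separately.

```c
#include <assert.h>

/* Hypothetical helper: how many times faster the accelerated path is,
 * given per-frame times in milliseconds. */
static double speedup(double baseline_ms, double accelerated_ms)
{
    assert(accelerated_ms > 0.0);
    return baseline_ms / accelerated_ms;
}

/* Hypothetical helper: frames per second for a given per-frame time. */
static double frames_per_second(double frame_ms)
{
    assert(frame_ms > 0.0);
    return 1000.0 / frame_ms;
}
```

With the numbers above, speedup(1200.0, 500.0) is about 2.4, which corresponds
to roughly 0.83 fps for nlmeans versus 2 fps for nlmeans_opencl at 1080p.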

Ruiling

Re: [FFmpeg-devel] [PATCH] configure: include pkgconfig path as vaapi header search

2019-03-27 Thread Song, Ruiling
> > >
> > > Neo is the successor to Beignet, correct?
> > Yes, that's the truth.
> > Currently we simply check for the specific OpenCL header file,
> > which is in fact not accurate.
> > I am not sure whether you would like to use Neo together with
> > intel-media-driver, which is the most targeted opencl usage in FFmpeg.
> > If that's the case, I think it may be hard to find a matching
> > intel-media-driver to work with Neo release package.
> > Because Neo release version depends on a very outdated libva revision.
> > I just sent a patch to Neo to update libva revision dependency. Once they
> > accept the patch and new Neo release package comes out,
> > I think we can change to check against Neo package. People would not need
> > to build Neo themselves then.
> >
> > Thanks!
> > Ruiling
> > >
> > > Enabling similar functionality for Neo should allow for the same feature
> > > support for these not using Beignet.
> >
> >
> Indeed, I'd want to use Neo + intel-media-driver.
> Judging by the (relatively low) development activity on Beignet of late,
> it's considered ready to deprecate in favor of Neo, applicable on anything
> newer than Kabylake.
I think Mark has no plan to deprecate Beignet now, and neither do I.
FFmpeg's OpenCL support currently uses direct buffer sharing between OpenCL and
the VAAPI driver.
One obvious limitation I did not notice before is that 10-bit and 12-bit buffer
sharing is not supported by Neo.
I pinged the author of cl_intel_va_api_media_sharing, but got no response.
Maybe I will put in some effort to update the extension spec and implement it
in Neo myself.
Mark, are there any other Neo limitations you want to add?

> Let's see how long it takes for the libva revision to be bumped up in Neo.
> Here's the request for the version bump:
> https://github.com/intel/compute-runtime/issues/131
> And the PR, for those following along:
> https://github.com/intel/compute-runtime/pull/151

Re: [FFmpeg-devel] [PATCH] configure: include pkgconfig path as vaapi header search

2019-03-27 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Dennis Mungai
> Sent: Thursday, March 28, 2019 11:15 AM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH] configure: include pkgconfig path as vaapi
> header search
> 
> On Thu, 28 Mar 2019 at 06:10, Song, Ruiling  wrote:
> 
> >
> >
> > > -Original Message-
> > > From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of
> > > Dennis Mungai
> > > Sent: Thursday, March 28, 2019 7:11 AM
> > > To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> > > Subject: Re: [FFmpeg-devel] [PATCH] configure: include pkgconfig path as
> > vaapi
> > > header search
> > >
> > > On Thu, 28 Mar 2019 at 02:05, Mark Thompson  wrote:
> > >
> > > > On 20/03/2019 07:57, Zhong Li wrote:
> > > > > Currently just the standard header path can be found,
> > > > > check_type/struct will fail if vaapi is installed somewhere else.
> > > > > ---
> > > > >  configure | 18 ++
> > > > >  1 file changed, 10 insertions(+), 8 deletions(-)
> > > > >
> > > > > diff --git a/configure b/configure
> > > > > index eaf543df96..0e3c2d24bf 100755
> > > > > --- a/configure
> > > > > +++ b/configure
> > > > > @@ -6024,14 +6024,6 @@ check_type "windows.h d3d11.h"
> > > > "ID3D11VideoDecoder"
> > > > >  check_type "windows.h d3d11.h" "ID3D11VideoContext"
> > > > >  check_type "d3d9.h dxva2api.h" DXVA2_ConfigPictureDecode
> > > > -D_WIN32_WINNT=0x0602
> > > > >
> > > > > -check_type "va/va.h va/va_dec_hevc.h"
> "VAPictureParameterBufferHEVC"
> > > > > -check_struct "va/va.h" "VADecPictureParameterBufferVP9" bit_depth
> > > > > -check_struct "va/va.h va/va_vpp.h" "VAProcPipelineCaps"
> > rotation_flags
> > > > > -check_type "va/va.h va/va_enc_hevc.h"
> > > "VAEncPictureParameterBufferHEVC"
> > > > > -check_type "va/va.h va/va_enc_jpeg.h"
> > > "VAEncPictureParameterBufferJPEG"
> > > > > -check_type "va/va.h va/va_enc_vp8.h"
> > "VAEncPictureParameterBufferVP8"
> > > > > -check_type "va/va.h va/va_enc_vp9.h"
> > "VAEncPictureParameterBufferVP9"
> > > > > -
> > > > >  check_type "vdpau/vdpau.h" "VdpPictureInfoHEVC"
> > > > >
> > > > >  if enabled cuda_sdk; then
> > > > > @@ -6469,6 +6461,16 @@ if enabled vaapi; then
> > > > >  check_cpp_condition vaapi_1 "va/va.h" "VA_CHECK_VERSION(1, 0,
> > 0)"
> > > > >  fi
> > > > >
> > > > > +if enabled vaapi; then
> > > >
> > > > Merge this into the previous block, which has the same condition.
> > > >
> > > > > +check_type "va/va.h va/va_dec_hevc.h"
> > > "VAPictureParameterBufferHEVC"
> > > > > +check_struct "va/va.h" "VADecPictureParameterBufferVP9"
> > bit_depth
> > > > > +check_struct "va/va.h va/va_vpp.h" "VAProcPipelineCaps"
> > > > rotation_flags
> > > > > +check_type "va/va.h va/va_enc_hevc.h"
> > > > "VAEncPictureParameterBufferHEVC"
> > > > > +check_type "va/va.h va/va_enc_jpeg.h"
> > > > "VAEncPictureParameterBufferJPEG"
> > > > > +check_type "va/va.h va/va_enc_vp8.h"
> > > > "VAEncPictureParameterBufferVP8"
> > > > > +check_type "va/va.h va/va_enc_vp9.h"
> > > > "VAEncPictureParameterBufferVP9"
> > > > > +fi
> > > > > +
> > > > >  if enabled_all opencl libdrm ; then
> > > > >  check_type "CL/cl_intel.h" "clCreateImageFromFdINTEL_fn" &&
> > > > >  enable opencl_drm_beignet
> > > > >
> > > >
> > > > LGTM with that.
> > > >
> > > > Thanks,
> > > >
> > > > - Mark
> > > >
> > > >
> > > Does a similar check exist

Re: [FFmpeg-devel] [PATCH] configure: include pkgconfig path as vaapi header search

2019-03-27 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Dennis Mungai
> Sent: Thursday, March 28, 2019 7:11 AM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH] configure: include pkgconfig path as vaapi
> header search
> 
> On Thu, 28 Mar 2019 at 02:05, Mark Thompson  wrote:
> 
> > On 20/03/2019 07:57, Zhong Li wrote:
> > > Currently just the standard header path can be found,
> > > check_type/struct will fail if vaapi is installed somewhere else.
> > > ---
> > >  configure | 18 ++
> > >  1 file changed, 10 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/configure b/configure
> > > index eaf543df96..0e3c2d24bf 100755
> > > --- a/configure
> > > +++ b/configure
> > > @@ -6024,14 +6024,6 @@ check_type "windows.h d3d11.h"
> > "ID3D11VideoDecoder"
> > >  check_type "windows.h d3d11.h" "ID3D11VideoContext"
> > >  check_type "d3d9.h dxva2api.h" DXVA2_ConfigPictureDecode
> > -D_WIN32_WINNT=0x0602
> > >
> > > -check_type "va/va.h va/va_dec_hevc.h" "VAPictureParameterBufferHEVC"
> > > -check_struct "va/va.h" "VADecPictureParameterBufferVP9" bit_depth
> > > -check_struct "va/va.h va/va_vpp.h" "VAProcPipelineCaps" rotation_flags
> > > -check_type "va/va.h va/va_enc_hevc.h"
> "VAEncPictureParameterBufferHEVC"
> > > -check_type "va/va.h va/va_enc_jpeg.h"
> "VAEncPictureParameterBufferJPEG"
> > > -check_type "va/va.h va/va_enc_vp8.h"  "VAEncPictureParameterBufferVP8"
> > > -check_type "va/va.h va/va_enc_vp9.h"  "VAEncPictureParameterBufferVP9"
> > > -
> > >  check_type "vdpau/vdpau.h" "VdpPictureInfoHEVC"
> > >
> > >  if enabled cuda_sdk; then
> > > @@ -6469,6 +6461,16 @@ if enabled vaapi; then
> > >  check_cpp_condition vaapi_1 "va/va.h" "VA_CHECK_VERSION(1, 0, 0)"
> > >  fi
> > >
> > > +if enabled vaapi; then
> >
> > Merge this into the previous block, which has the same condition.
> >
> > > +check_type "va/va.h va/va_dec_hevc.h"
> "VAPictureParameterBufferHEVC"
> > > +check_struct "va/va.h" "VADecPictureParameterBufferVP9" bit_depth
> > > +check_struct "va/va.h va/va_vpp.h" "VAProcPipelineCaps"
> > rotation_flags
> > > +check_type "va/va.h va/va_enc_hevc.h"
> > "VAEncPictureParameterBufferHEVC"
> > > +check_type "va/va.h va/va_enc_jpeg.h"
> > "VAEncPictureParameterBufferJPEG"
> > > +check_type "va/va.h va/va_enc_vp8.h"
> > "VAEncPictureParameterBufferVP8"
> > > +check_type "va/va.h va/va_enc_vp9.h"
> > "VAEncPictureParameterBufferVP9"
> > > +fi
> > > +
> > >  if enabled_all opencl libdrm ; then
> > >  check_type "CL/cl_intel.h" "clCreateImageFromFdINTEL_fn" &&
> > >  enable opencl_drm_beignet
> > >
> >
> > LGTM with that.
> >
> > Thanks,
> >
> > - Mark
> >
> >
> Does a similar check exist for Intel's Neo OpenCL runtime?
Do you mean checking whether the intel-opencl package (the Neo package) exists
on the system?


Re: [FFmpeg-devel] [PATCH] configure: Do not enable both OpenCL-VAAPI interop modes simultaneously

2019-03-26 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Mark Thompson
> Sent: Wednesday, March 27, 2019 7:39 AM
> To: FFmpeg development discussions and patches 
> Subject: [FFmpeg-devel] [PATCH] configure: Do not enable both OpenCL-VAAPI
> interop modes simultaneously
> 
> Beignet offers a far more flexible and complete interface, so choose it
> by default if available.
Sorry I missed your last mail. Sure, I agree Beignet sharing is far more 
flexible.
The patch LGTM.
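
To illustrate why enabling both interop modes at once broke the build, here is
a simplified sketch. It is not the actual hwcontext_opencl.c code: the enum, the
function name and the returned strings are invented for the example, and only
the two HAVE_* macro names are taken from the config.h output quoted below. If
both macros were 1, the switch would contain two identical case labels, which
is the "duplicate case value" error; making the modes mutually exclusive at
configure time avoids that.

```c
/* Assumed configure result after the patch: only one mode is enabled. */
#define HAVE_OPENCL_VAAPI_BEIGNET     1
#define HAVE_OPENCL_VAAPI_INTEL_MEDIA 0

#if HAVE_OPENCL_VAAPI_BEIGNET && HAVE_OPENCL_VAAPI_INTEL_MEDIA
#error "both modes enabled: the switch below would have duplicate case labels"
#endif

/* Hypothetical stand-in for the hardware device type enum. */
enum SketchDeviceType { SKETCH_TYPE_VAAPI, SKETCH_TYPE_OTHER };

/* Returns a label for the interop path that would handle this device type. */
static const char *sketch_derive_mode(enum SketchDeviceType type)
{
    switch (type) {
#if HAVE_OPENCL_VAAPI_BEIGNET
    case SKETCH_TYPE_VAAPI:
        return "beignet";
#endif
#if HAVE_OPENCL_VAAPI_INTEL_MEDIA
    case SKETCH_TYPE_VAAPI:
        return "intel_media";
#endif
    default:
        return "unsupported";
    }
}
```

Setting both macros to 1 in this sketch trips the #error first; without that
guard the compiler would report the duplicate case label instead.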

Thanks!
Ruiling
> ---
> On 23/03/2019 12:27, Mark Thompson wrote:
> > On 22/03/2019 01:40, Ruiling Song wrote:
> >> ffmpeg | branch: master | Ruiling Song | Fri Nov 23 13:39:12 2018 +0800 |
> >> [61cb505d18b8a335bd118d88c05b9daf40eb5f9b] | committer: Ruiling Song
> >>
> >> lavu/opencl: replace va_ext.h with standard name
> >>
> >> Khronos OpenCL header (https://github.com/KhronosGroup/OpenCL-Headers)
> >> uses cl_va_api_media_sharing_intel.h. And Intel's official OpenCL driver
> >> for Intel GPU (https://github.com/intel/compute-runtime) was compiled
> >> against Khronos OpenCL header. So it's better to align with Khronos.
> >>
> >> Signed-off-by: Ruiling Song 
> >>
> >>> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=61cb505d18b8a335bd118d88c05b9daf40eb5f9b
> >> ---
> >>
> >>  configure| 2 +-
> >>  libavutil/hwcontext_opencl.c | 2 +-
> >>  2 files changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/configure b/configure
> >> index a817479559..331393f8d5 100755
> >> --- a/configure
> >> +++ b/configure
> >> @@ -6472,7 +6472,7 @@ fi
> >>
> >>  if enabled_all opencl vaapi ; then
> >>  enabled opencl_drm_beignet && enable opencl_vaapi_beignet
> >> -check_type "CL/cl.h CL/va_ext.h"
> "clCreateFromVA_APIMediaSurfaceINTEL_fn" &&
> >> +check_type "CL/cl.h CL/cl_va_api_media_sharing_intel.h"
> "clCreateFromVA_APIMediaSurfaceINTEL_fn" &&
> >>  enable opencl_vaapi_intel_media
> >>  fi
> >>
> >> diff --git a/libavutil/hwcontext_opencl.c b/libavutil/hwcontext_opencl.c
> >> index d3df6221c4..b116c5b708 100644
> >> --- a/libavutil/hwcontext_opencl.c
> >> +++ b/libavutil/hwcontext_opencl.c
> >> @@ -50,7 +50,7 @@
> >>  #include 
> >>  #endif
> >>  #include 
> >> -#include 
> >> +#include 
> >>  #include "hwcontext_vaapi.h"
> >>  #endif
> >>
> >
> > This broke the build when both are available.
> >
> > $ make
> > CC  libavutil/hwcontext_opencl.o
> > src/libavutil/hwcontext_opencl.c: In function ‘opencl_device_derive’:
> > src/libavutil/hwcontext_opencl.c:1236:5: error: duplicate case value
> >  case AV_HWDEVICE_TYPE_VAAPI:
> >  ^~~~
> > src/libavutil/hwcontext_opencl.c:1205:5: note: previously used here
> >  case AV_HWDEVICE_TYPE_VAAPI:
> >  ^~~~
> > src/libavutil/hwcontext_opencl.c: In function ‘opencl_map_to’:
> > src/libavutil/hwcontext_opencl.c:2831:5: error: duplicate case value
> >  case AV_PIX_FMT_VAAPI:
> >  ^~~~
> > src/libavutil/hwcontext_opencl.c:2825:5: note: previously used here
> >  case AV_PIX_FMT_VAAPI:
> >  ^~~~
> > src/libavutil/hwcontext_opencl.c: In function ‘opencl_frames_derive_to’:
> > src/libavutil/hwcontext_opencl.c:2873:5: error: duplicate case value
> >  case AV_HWDEVICE_TYPE_VAAPI:
> >  ^~~~
> > src/libavutil/hwcontext_opencl.c:2866:5: note: previously used here
> >  case AV_HWDEVICE_TYPE_VAAPI:
> >  ^~~~
> > make: *** [ffbuild/common.mak:60: libavutil/hwcontext_opencl.o] Error 1
> > make: Target 'all' not remade because of errors.
> >
> > $ cat config.h | grep HAVE_OPENCL
> > #define HAVE_OPENCL_D3D11 0
> > #define HAVE_OPENCL_DRM_ARM 0
> > #define HAVE_OPENCL_DRM_BEIGNET 1
> > #define HAVE_OPENCL_DXVA2 0
> > #define HAVE_OPENCL_VAAPI_BEIGNET 1
> > #define HAVE_OPENCL_VAAPI_INTEL_MEDIA 1
> >
> >
> > I think in general the Beignet mapping is more useful if present since it has far
> > fewer constraints, so perhaps disable this one if Beignet is there?
> 
>  configure | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/configure b/configure
> index 331393f8d5..c94f516224 100755
> --- a/configure
> +++ b/configure
> @@ -6471,9 +6471,12 @@ if enabled_all opencl libdrm ; then
>  fi
> 
>  if enabled_all opencl vaapi ; then
> -enabled opencl_drm_beignet && enable opencl_vaapi_beignet
> -check_type "CL/cl.h CL/cl_va_api_media_sharing_intel.h"
> "clCreateFromVA_APIMediaSurfaceINTEL_fn" &&
> -enable opencl_vaapi_intel_media
> +if enabled opencl_drm_beignet ; then
> +enable opencl_vaapi_beignet
> +else
> +check_type "CL/cl.h CL/cl_va_api_media_sharing_intel.h"
> "clCreateFromVA_APIMediaSurfaceINTEL_fn" &&
> +enable opencl_vaapi_intel_media
> +fi
>  fi
> 
>  if enabled_all opencl dxva2 ; then
> --
> 2.19.2

Re: [FFmpeg-devel] [PATCH v3 1/2] vf_crop: Add support for cropping hardware frames

2019-03-26 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Mark Thompson
> Sent: Sunday, March 24, 2019 12:19 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH v3 1/2] vf_crop: Add support for cropping
> hardware frames
> 
> Set the cropping fields in the AVFrame.
> ---
>  libavfilter/vf_crop.c | 74 +--
>  1 file changed, 51 insertions(+), 23 deletions(-)
> 
> There is the slightly unfortunate effect the filter links don't carry the cropping
> information, so we don't know how big the cropped output is in following links
> until we actually get a frame.
> 
> For example, to get the middle ninth of a stream:
> 
> ./ffmpeg_g -y -hwaccel vaapi -hwaccel_device /dev/dri/renderD128 \
>   -hwaccel_output_format vaapi -i in.mp4 -an \
>   -vf "crop=iw/3:ih/3:iw/3:ih/3,scale_vaapi=iw/3:ih/3" -c:v h264_vaapi out.mp4
Hi Mark,

I tested the command with the patch applied, and it works.
But for people who do not know the implementation details, I think the
"scale_vaapi=iw/3:ih/3" part will be very confusing.
People would naturally assume that the input buffer to scale_vaapi already has
the cropped size.
I think we need to make scale_vaapi take the crop information into account when
it evaluates its input dimensions. What do you think?
Also, do we need to add a warning in the encoder when a frame still carries
crop information because the user did not add a vaapi filter after crop?
It seems that the vaapi encoder does not encode correctly with crop; is that
right?
Thanks!
Ruiling
> 
> Without the extra arguments to scale it will take the cropped part correctly but
> then scale it to the original size.
> 
> diff --git a/libavfilter/vf_crop.c b/libavfilter/vf_crop.c
> index 84be4c7d0d..7f6b0f03d3 100644
> --- a/libavfilter/vf_crop.c
> +++ b/libavfilter/vf_crop.c
> @@ -98,9 +98,17 @@ static int query_formats(AVFilterContext *ctx)
> 
>  for (fmt = 0; av_pix_fmt_desc_get(fmt); fmt++) {
>  const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(fmt);
> -if (!(desc->flags & (AV_PIX_FMT_FLAG_HWACCEL |
> AV_PIX_FMT_FLAG_BITSTREAM)) &&
> -!((desc->log2_chroma_w || desc->log2_chroma_h) && !(desc->flags &
> AV_PIX_FMT_FLAG_PLANAR)) &&
> -(ret = ff_add_format(, fmt)) < 0)
> +if (desc->flags & AV_PIX_FMT_FLAG_BITSTREAM)
> +continue;
> +if (!(desc->flags & AV_PIX_FMT_FLAG_HWACCEL)) {
> +// Not usable if there is any subsampling but the format is
> +// not planar (e.g. YUYV422).
> +if ((desc->log2_chroma_w || desc->log2_chroma_h) &&
> +!(desc->flags & AV_PIX_FMT_FLAG_PLANAR))
> +continue;
> +}
> +ret = ff_add_format(, fmt);
> +if (ret < 0)
>  return ret;
>  }
> 
> @@ -157,8 +165,14 @@ static int config_input(AVFilterLink *link)
>  s->var_values[VAR_POS]   = NAN;
> 
>  av_image_fill_max_pixsteps(s->max_step, NULL, pix_desc);
> -s->hsub = pix_desc->log2_chroma_w;
> -s->vsub = pix_desc->log2_chroma_h;
> +
> +if (pix_desc->flags & AV_PIX_FMT_FLAG_HWACCEL) {
> +s->hsub = 1;
> +s->vsub = 1;
> +} else {
> +s->hsub = pix_desc->log2_chroma_w;
> +s->vsub = pix_desc->log2_chroma_h;
> +}
> 
>  if ((ret = av_expr_parse_and_eval(, (expr = s->w_expr),
>var_names, s->var_values,
> @@ -237,9 +251,15 @@ fail_expr:
>  static int config_output(AVFilterLink *link)
>  {
>  CropContext *s = link->src->priv;
> +const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(link->format);
> 
> -link->w = s->w;
> -link->h = s->h;
> +if (desc->flags & AV_PIX_FMT_FLAG_HWACCEL) {
> +// Hardware frames adjust the cropping regions rather than
> +// changing the frame size.
> +} else {
> +link->w = s->w;
> +link->h = s->h;
> +}
>  link->sample_aspect_ratio = s->out_sar;
> 
>  return 0;
> @@ -252,9 +272,6 @@ static int filter_frame(AVFilterLink *link, AVFrame
> *frame)
>  const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(link->format);
>  int i;
> 
> -frame->width  = s->w;
> -frame->height = s->h;
> -
>  s->var_values[VAR_N] = link->frame_count_out;
>  s->var_values[VAR_T] = frame->pts == AV_NOPTS_VALUE ?
>  NAN : frame->pts * av_q2d(link->time_base);
> @@ -285,22 +302,33 @@ static int filter_frame(AVFilterLink *link, AVFrame
> *frame)
>  (int)s->var_values[VAR_N], s->var_values[VAR_T], s-
> >var_values[VAR_POS],
>  s->x, s->y, s->x+s->w, s->y+s->h);
> 
> -frame->data[0] += s->y * frame->linesize[0];
> -frame->data[0] += s->x * s->max_step[0];
> -
> -if (!(desc->flags & AV_PIX_FMT_FLAG_PAL || desc->flags & FF_PSEUDOPAL))
> {
> -for (i = 1; i < 3; i ++) {
> -if (frame->data[i]) {
> -frame->data[i] += (s->y >> s->vsub) * frame->linesize[i];
> -frame->data[i] += (s->x * s->max_step[i]) >> s->hsub;
> +  

Re: [FFmpeg-devel] [PATCH][FFmpeg-devel v2] Add GPU accelerated video crop filter

2019-03-25 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Timo Rothenpieler
> Sent: Monday, March 25, 2019 6:31 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH][FFmpeg-devel v2] Add GPU accelerated
> video crop filter
> 
> On 25/03/2019 09:27, Tao Zhang wrote:
> >>> Hi,
> >>>
> >>> Timo and Mark and I have been discussing this, and we think the right
> >>> thing to do is add support to vf_scale_cuda to respect the crop
> >>> properties on an input AVFrame. Mark posted a patch to vf_crop to
> >>> ensure that the properties are set, and then the scale filter should
> >>> respect those properties if they are set. You can look at
> >>> vf_scale_vaapi for how the properties are read, but they will require
> >>> explicit handling to adjust the src dimensions passed to the scale
> >>> filter.
> > Maybe a little not intuitive to users.
> >>>
> >>> This will be a more efficient way of handling crops, in terms of total
> >>> lines of code and also allowing crop/scale with one less copy.
> >>>
> >>> I know this is quite different from the approach you've taken here, and
> >>> we appreciate the work you've done, but it should be better overall to
> >>> implement this integrated method.
> >> Hi Philip,
> >>
> >> Glad to hear you guys had discussion on this. As I am also considering the
> problem, I have some questions about your idea.
> >> So, what if user did not insert a scale_cuda after crop filter? Do you 
> >> plan to
> automatically insert scale_cuda or just ignore the crop?
> >> What if user want to do crop,transpose_cuda,scale_cuda? So we also need
> to handle crop inside transpose_cuda filter?
>  >
> > I have the same question.
> Ideally, scale_cuda should be auto-inserted at the required places once
> it works that way.
> Otherwise it seems pointless to me if the user still has to manually
> insert it after the generic filters setting metadata.
Agree.

> 
> For that reason it should also still support getting its parameters
> passed directly as a fallback, and potentially even expose multiple
> filter names, so crop_cuda and transpose_cuda are still visible, but
> ultimately point to the same filter code.
> 
> We have a transpose_npp, right now, but with libnpp slowly being on its
> way out, transpose_cuda is needed, and ultimately even a format_cuda
> filter, since right now scale_npp is the only filter that can convert
> pixel formats on the hardware.
> I'd also like to see scale_cuda to support a few more interpolation
> algorithms, but that's not very important for now.
> 
> All this functionality can be in the same filter, which is scale_cuda.
> The point of that is that it avoids needless expensive frame copies as
> much as possible.

Crop and transpose are just copy-like kernels, so it may be a good idea to
merge them with other kernels.
But I am not sure how much overall performance we would gain in a
transcoding pipeline, and merging everything together may make the code very
complex.
For example, crop+scale or crop+transpose may be easy to merge, but
crop+transpose+scale or crop+transpose+scale+format will be more complex.

I want to share some of my experience from developing the OpenCL scale filter
( https://patchwork.ffmpeg.org/patch/11910/ ).
I tried to merge scale and format conversion into one single OpenCL kernel,
but I failed to keep the code clean after adding interpolation methods like
bicubic, so I now plan to split them into two kernels.

My experiments with scale_opencl also show that merging scale with format
conversion does not always bring a benefit.
For example, for 1080p scale-down, merging the two operations is
about 10% faster (for decode+scale), but for 4K input, merging the two kernels
makes it slower.
My guess is that the different planes compete for the limited GPU cache. For
scale only, we can work plane by plane, but for format conversion you have to
read all the input planes and write all the output planes at the same time.
This is just a guess; I have not root-caused the real reason. But I
think keeping scale and format conversion in separate kernel functions is
better.
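
To make the difference in access pattern concrete, here is a plain C sketch
(CPU code, not an OpenCL kernel, and not taken from scale_opencl). The function
names, the nearest-neighbour method and the planar GBR to packed RGB24 example
are all choices made for illustration. The scale function touches one plane
per call, while the conversion has to read three input planes for every output
pixel.

```c
#include <assert.h>
#include <stdint.h>

/* Nearest-neighbour scale of one 8-bit plane: only this plane is read. */
static void sketch_scale_plane(const uint8_t *src, int src_w, int src_h,
                               int src_stride,
                               uint8_t *dst, int dst_w, int dst_h,
                               int dst_stride)
{
    assert(src_w > 0 && src_h > 0 && dst_w > 0 && dst_h > 0);
    for (int y = 0; y < dst_h; y++) {
        int sy = (int)((int64_t)y * src_h / dst_h);
        for (int x = 0; x < dst_w; x++) {
            int sx = (int)((int64_t)x * src_w / dst_w);
            dst[y * dst_stride + x] = src[sy * src_stride + sx];
        }
    }
}

/* Planar G/B/R to packed RGB24: every pixel reads all three planes. */
static void sketch_gbrp_to_rgb24(const uint8_t *g, const uint8_t *b,
                                 const uint8_t *r, int w, int h,
                                 int src_stride,
                                 uint8_t *dst, int dst_stride)
{
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            uint8_t *px = dst + y * dst_stride + 3 * x;
            px[0] = r[y * src_stride + x];
            px[1] = g[y * src_stride + x];
            px[2] = b[y * src_stride + x];
        }
    }
}
```

A merged kernel would combine both loops, so each work item would pull from
all source planes at once; whether that is really what hurts at 4K is still
only the guess stated above.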

I am also thinking about this issue another way: would it be possible to
simply do the needed copy in crop/transpose and, when configuring the filter
pipeline, optimize away one filter if two are neighbors and pass its options
to the other?
In any case, I am really interested to see the work you described happen in
FFmpeg.
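
A minimal sketch of that "optimize away a neighbor" idea, under strong
assumptions: the filter chain is modelled as a flat array of hypothetical
descriptors, only crop directly followed by scale is handled, and none of
these types or functions exist in libavfilter.

```c
#include <assert.h>
#include <stddef.h>

enum SketchFilterKind { SKETCH_CROP, SKETCH_SCALE };

/* Hypothetical filter descriptor. */
struct SketchFilter {
    enum SketchFilterKind kind;
    int x, y, w, h;    /* source rectangle this filter reads    */
    int out_w, out_h;  /* output size (only meaningful for scale) */
};

/* Drop every crop that is directly followed by a scale and hand its
 * rectangle to that scale. Crops without such a neighbor stay in the
 * chain and do their own copy. Returns the new chain length. */
static size_t sketch_fold_crops(struct SketchFilter *chain, size_t n)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (chain[i].kind == SKETCH_CROP && i + 1 < n &&
            chain[i + 1].kind == SKETCH_SCALE) {
            /* Pass the crop options on to the neighboring scale. */
            chain[i + 1].x = chain[i].x;
            chain[i + 1].y = chain[i].y;
            chain[i + 1].w = chain[i].w;
            chain[i + 1].h = chain[i].h;
            continue; /* the crop itself is removed */
        }
        chain[out++] = chain[i];
    }
    return out;
}
```

With a chain of crop followed by scale, this leaves a single scale that reads
the cropped rectangle, so one copy is saved; a crop at the end of the chain is
kept as it is.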

Thanks!
Ruiling

Re: [FFmpeg-devel] [PATCH][FFmpeg-devel v2] Add GPU accelerated video crop filter

2019-03-25 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Philip Langdale via ffmpeg-devel
> Sent: Monday, March 25, 2019 12:57 PM
> To: FFmpeg development discussions and patches 
> Cc: Philip Langdale 
> Subject: Re: [FFmpeg-devel] [PATCH][FFmpeg-devel v2] Add GPU accelerated
> video crop filter
> 
> On Sat, 23 Mar 2019 23:51:10 +0800
> UsingtcNower  wrote:
> 
> > Signed-off-by: UsingtcNower 
> > ---
> >  Changelog   |   1 +
> >  configure   |   1 +
> >  doc/filters.texi|  31 +++
> >  libavfilter/Makefile|   1 +
> >  libavfilter/allfilters.c|   1 +
> >  libavfilter/version.h   |   2 +-
> >  libavfilter/vf_crop_cuda.c  | 638
> > 
> > libavfilter/vf_crop_cuda.cu | 109  8 files changed, 783
> > insertions(+), 1 deletion(-) create mode 100644
> > libavfilter/vf_crop_cuda.c create mode 100644
> > libavfilter/vf_crop_cuda.cu
> >
> > diff --git a/Changelog b/Changelog
> > index ad7e82f..f224fc8 100644
> > --- a/Changelog
> > +++ b/Changelog
> > @@ -20,6 +20,7 @@ version :
> >  - libaribb24 based ARIB STD-B24 caption support (profiles A and C)
> >  - Support decoding of HEVC 4:4:4 content in nvdec and cuviddec
> >  - removed libndi-newtek
> > +- crop_cuda GPU accelerated video crop filter
> 
> Hi,
> 
> Timo and Mark and I have been discussing this, and we think the right
> thing to do is add support to vf_scale_cuda to respect the crop
> properties on an input AVFrame. Mark posted a patch to vf_crop to
> ensure that the properties are set, and then the scale filter should
> respect those properties if they are set. You can look at
> vf_scale_vaapi for how the properties are read, but they will require
> explicit handling to adjust the src dimensions passed to the scale
> filter.
> 
> This will be a more efficient way of handling crops, in terms of total
> lines of code and also allowing crop/scale with one less copy.
> 
> I know this is quite different from the approach you've taken here, and
> we appreciate the work you've done, but it should be better overall to
> implement this integrated method.
Hi Philip,

Glad to hear you have discussed this. As I am also thinking about the
problem, I have some questions about your idea.
What if the user does not insert scale_cuda after the crop filter? Do you plan
to insert scale_cuda automatically, or just ignore the crop?
And what if the user wants to do crop,transpose_cuda,scale_cuda? Would we then
also need to handle the crop inside the transpose_cuda filter?
(It looks like we do not have transpose_cuda right now, but such a filter seems
necessary if the user wants to transpose using CUDA.)

Thanks!
Ruiling
> 
> Thanks,
> 
> >
> >  version 4.1:
> > diff --git a/configure b/configure
> > index 331393f..3f3ac2f 100755
> > --- a/configure
> > +++ b/configure
> > @@ -2973,6 +2973,7 @@ qsvvpp_select="qsv"
> >  vaapi_encode_deps="vaapi"
> >  v4l2_m2m_deps="linux_videodev2_h sem_timedwait"
> >
> > +crop_cuda_filter_deps="ffnvcodec cuda_nvcc"
> >  hwupload_cuda_filter_deps="ffnvcodec"
> >  scale_npp_filter_deps="ffnvcodec libnpp"
> >  scale_cuda_filter_deps="ffnvcodec cuda_nvcc"
> > diff --git a/doc/filters.texi b/doc/filters.texi
> > index 4ffb392..ee16a2d 100644
> > --- a/doc/filters.texi
> > +++ b/doc/filters.texi
> > @@ -7415,6 +7415,37 @@ If the specified expression is not valid, it
> > is kept at its current value.
> >  @end table
> >
> > +@section crop_cuda
> > +
> > +Crop the input video to given dimensions, implemented in CUDA.
> > +
> > +It accepts the following parameters:
> > +
> > +@table @option
> > +
> > +@item w
> > +The width of the output video. It defaults to @code{iw}.
> > +This expression is evaluated only once during the filter
> > +configuration.
> > +
> > +@item h
> > +The height of the output video. It defaults to @code{ih}.
> > +This expression is evaluated only once during the filter
> > +configuration.
> > +
> > +@item x
> > +The horizontal position, in the input video, of the left edge of the
> > output +video. It defaults to @code{(in_w-out_w)/2}.
> > +This expression is evaluated only once during the filter
> > +configuration.
> > +
> > +@item y
> > +The vertical position, in the input video, of the top edge of the
> > output video. +It defaults to @code{(in_h-out_h)/2}.
> > +This expression is evaluated only once during the filter
> > +configuration.
> > +@end table
> > +
> >  @section cropdetect
> >
> >  Auto-detect the crop size.
> > diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> > index fef6ec5..84df037 100644
> > --- a/libavfilter/Makefile
> > +++ b/libavfilter/Makefile
> > @@ -187,6 +187,7 @@ OBJS-$(CONFIG_COPY_FILTER)   +=
> > vf_copy.o OBJS-$(CONFIG_COREIMAGE_FILTER)  +=
> > vf_coreimage.o OBJS-$(CONFIG_COVER_RECT_FILTER) +=
> > vf_cover_rect.o lavfutils.o
> > OBJS-$(CONFIG_CROP_FILTER)   += vf_crop.o
> > 

Re: [FFmpeg-devel] [PATCH][FFmpeg-devel v2] Add GPU accelerated video crop filter

2019-03-24 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> UsingtcNower
> Sent: Saturday, March 23, 2019 11:51 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH][FFmpeg-devel v2] Add GPU accelerated video
> crop filter
> 
> Signed-off-by: UsingtcNower 
> ---
>  Changelog   |   1 +
>  configure   |   1 +
>  doc/filters.texi|  31 +++
>  libavfilter/Makefile|   1 +
>  libavfilter/allfilters.c|   1 +
>  libavfilter/version.h   |   2 +-
>  libavfilter/vf_crop_cuda.c  | 638
> 
>  libavfilter/vf_crop_cuda.cu | 109 
>  8 files changed, 783 insertions(+), 1 deletion(-)
>  create mode 100644 libavfilter/vf_crop_cuda.c
>  create mode 100644 libavfilter/vf_crop_cuda.cu
> 
> diff --git a/Changelog b/Changelog
> index ad7e82f..f224fc8 100644
> --- a/Changelog
> +++ b/Changelog
> @@ -20,6 +20,7 @@ version :
>  - libaribb24 based ARIB STD-B24 caption support (profiles A and C)
>  - Support decoding of HEVC 4:4:4 content in nvdec and cuviddec
>  - removed libndi-newtek
> +- crop_cuda GPU accelerated video crop filter
> 
> 
>  version 4.1:
> diff --git a/configure b/configure
> index 331393f..3f3ac2f 100755
> --- a/configure
> +++ b/configure
> @@ -2973,6 +2973,7 @@ qsvvpp_select="qsv"
>  vaapi_encode_deps="vaapi"
>  v4l2_m2m_deps="linux_videodev2_h sem_timedwait"
> 
> +crop_cuda_filter_deps="ffnvcodec cuda_nvcc"
It seems you are using NAN, so you may also need to add a dependency on
"const_nan".

>  hwupload_cuda_filter_deps="ffnvcodec"
>  scale_npp_filter_deps="ffnvcodec libnpp"
>  scale_cuda_filter_deps="ffnvcodec cuda_nvcc"
> diff --git a/doc/filters.texi b/doc/filters.texi
> index 4ffb392..ee16a2d 100644
> --- a/doc/filters.texi
> +++ b/doc/filters.texi
> @@ -7415,6 +7415,37 @@ If the specified expression is not valid, it is kept at its current
>  value.
>  @end table
> 
> +@section crop_cuda
> +
> +Crop the input video to given dimensions, implemented in CUDA.
> +
> +It accepts the following parameters:
> +
> +@table @option
> +
> +@item w
> +The width of the output video. It defaults to @code{iw}.
> +This expression is evaluated only once during the filter
> +configuration.
> +
> +@item h
> +The height of the output video. It defaults to @code{ih}.
> +This expression is evaluated only once during the filter
> +configuration.
> +
> +@item x
> +The horizontal position, in the input video, of the left edge of the output
> +video. It defaults to @code{(in_w-out_w)/2}.
> +This expression is evaluated only once during the filter
> +configuration.
> +
> +@item y
> +The vertical position, in the input video, of the top edge of the output video.
> +It defaults to @code{(in_h-out_h)/2}.
> +This expression is evaluated only once during the filter
> +configuration.
> +@end table
> +
>  @section cropdetect
> 
>  Auto-detect the crop size.
> diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> index fef6ec5..84df037 100644
> --- a/libavfilter/Makefile
> +++ b/libavfilter/Makefile
> @@ -187,6 +187,7 @@ OBJS-$(CONFIG_COPY_FILTER)   += vf_copy.o
>  OBJS-$(CONFIG_COREIMAGE_FILTER)  += vf_coreimage.o
>  OBJS-$(CONFIG_COVER_RECT_FILTER) += vf_cover_rect.o lavfutils.o
>  OBJS-$(CONFIG_CROP_FILTER)   += vf_crop.o
> +OBJS-$(CONFIG_CROP_CUDA_FILTER)  += vf_crop_cuda.o vf_crop_cuda.ptx.o
>  OBJS-$(CONFIG_CROPDETECT_FILTER) += vf_cropdetect.o
>  OBJS-$(CONFIG_CUE_FILTER)+= f_cue.o
>  OBJS-$(CONFIG_CURVES_FILTER) += vf_curves.o
> diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
> index c51ae0f..550e545 100644
> --- a/libavfilter/allfilters.c
> +++ b/libavfilter/allfilters.c
> @@ -175,6 +175,7 @@ extern AVFilter ff_vf_copy;
>  extern AVFilter ff_vf_coreimage;
>  extern AVFilter ff_vf_cover_rect;
>  extern AVFilter ff_vf_crop;
> +extern AVFilter ff_vf_crop_cuda;
>  extern AVFilter ff_vf_cropdetect;
>  extern AVFilter ff_vf_cue;
>  extern AVFilter ff_vf_curves;
> diff --git a/libavfilter/version.h b/libavfilter/version.h
> index c71282c..5aa95f4 100644
> --- a/libavfilter/version.h
> +++ b/libavfilter/version.h
> @@ -31,7 +31,7 @@
> 
>  #define LIBAVFILTER_VERSION_MAJOR   7
>  #define LIBAVFILTER_VERSION_MINOR  48
> -#define LIBAVFILTER_VERSION_MICRO 100
> +#define LIBAVFILTER_VERSION_MICRO 101
> 
>  #define LIBAVFILTER_VERSION_INT AV_VERSION_INT(LIBAVFILTER_VERSION_MAJOR, \
> LIBAVFILTER_VERSION_MINOR, \
> diff --git a/libavfilter/vf_crop_cuda.c b/libavfilter/vf_crop_cuda.c
> new file mode 100644
> index 000..fc6a2a6
> --- /dev/null
> +++ b/libavfilter/vf_crop_cuda.c
> @@ -0,0 +1,638 @@
> +/*
> +* Copyright (c) 2019, iQIYI CORPORATION. All rights reserved.
> +*
> +* Permission is hereby granted, free of charge, to any person obtaining a
> +* copy of this software and 

Re: [FFmpeg-devel] [PATCH 1/5] lavu/opencl: replace va_ext.h with standard name

2019-03-13 Thread Song, Ruiling


> -Original Message-
> From: Song, Ruiling
> Sent: Tuesday, January 22, 2019 3:16 PM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Song, Ruiling 
> Subject: [PATCH 1/5] lavu/opencl: replace va_ext.h with standard name
> 
> Khronos OpenCL header (https://github.com/KhronosGroup/OpenCL-Headers)
> uses cl_va_api_media_sharing_intel.h. And Intel's official OpenCL driver
> for Intel GPU (https://github.com/intel/compute-runtime) was compiled
> against Khronos OpenCL header. So it's better to align with Khronos.
> 
> Signed-off-by: Ruiling Song 
> ---
>  configure| 2 +-
>  libavutil/hwcontext_opencl.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
Ping?
If nobody objects, I will push the patchset next week.

Ruiling
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] [PATCH v3 2/2] vf_scale_vaapi: Add options to configure output colour properties

2019-03-04 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Mark Thompson
> Sent: Thursday, February 28, 2019 8:38 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH v3 2/2] vf_scale_vaapi: Add options to
> configure output colour properties
> 
> The "out_color_matrix" and "out_range" properties match the same options
> in vf_scale; the others attempt to follow the same pattern.
> ---
Looks good.

Ruiling


Re: [FFmpeg-devel] [PATCH v3 1/2] lavfi/vaapi: Improve support for colour properties

2019-03-04 Thread Song, Ruiling

The patch basically looks good. Some comments inline.
> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Mark Thompson
> Sent: Thursday, February 28, 2019 8:38 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH v3 1/2] lavfi/vaapi: Improve support for colour
> properties
> 
> Attempts to pick the set of supported colour properties best matching the
> input.  Output is then set with the same values, except for the colour
> matrix which may change when converting between RGB and YUV.
> ---
> Not much change since the version sent two months ago - rebased, and the
> transpose filter updated to match the others.
> 
> 
>  libavfilter/vaapi_vpp.c| 273 -
>  libavfilter/vaapi_vpp.h|   5 +-
>  libavfilter/vf_deinterlace_vaapi.c |  16 +-
>  libavfilter/vf_misc_vaapi.c|  13 +-
>  libavfilter/vf_procamp_vaapi.c |  13 +-
>  libavfilter/vf_scale_vaapi.c   |  12 +-
>  libavfilter/vf_transpose_vaapi.c   |  15 +-
>  7 files changed, 309 insertions(+), 38 deletions(-)
> 
> diff --git a/libavfilter/vaapi_vpp.c b/libavfilter/vaapi_vpp.c
> index c5bbc3b85b..f4ee622a2b 100644
> --- a/libavfilter/vaapi_vpp.c
> +++ b/libavfilter/vaapi_vpp.c
> @@ -234,18 +234,273 @@ fail:
>  return err;
>  }
> 
> -int ff_vaapi_vpp_colour_standard(enum AVColorSpace av_cs)
> +typedef struct VAAPIColourProperties {
> +VAProcColorStandardType va_color_standard;
> +
> +enum AVColorPrimaries color_primaries;
> +enum AVColorTransferCharacteristic color_trc;
> +enum AVColorSpace colorspace;
> +
> +uint8_t va_chroma_sample_location;
> +uint8_t va_color_range;
> +
> +enum AVColorRange color_range;
> +enum AVChromaLocation chroma_sample_location;
> +} VAAPIColourProperties;
> +
> +static const VAAPIColourProperties*
> +vaapi_vpp_find_colour_props(VAProcColorStandardType vacs)
> +{
> +static const VAAPIColourProperties cs_map[] = {

Why use these magic numbers instead of the meaningful enum values?
Also, two entries are added for VAProcColorStandardBT601; it seems that only the
first will ever be returned?
> +{ VAProcColorStandardBT601,   5,  6,  5 },
> +{ VAProcColorStandardBT601,   6,  6,  6 },
> +{ VAProcColorStandardBT709,   1,  1,  1 },
> +{ VAProcColorStandardBT470M,  4,  4,  4 },
> +{ VAProcColorStandardBT470BG, 5,  5,  5 },
> +{ VAProcColorStandardSMPTE170M,   6,  6,  6 },
> +{ VAProcColorStandardSMPTE240M,   7,  7,  7 },
> +{ VAProcColorStandardGenericFilm, 8,  1,  1 },
> +#if VA_CHECK_VERSION(1, 1, 0)
> +{ VAProcColorStandardSRGB,1, 13,  0 },
> +{ VAProcColorStandardXVYCC601,1, 11,  5 },
> +{ VAProcColorStandardXVYCC709,1, 11,  1 },
> +{ VAProcColorStandardBT2020,  9, 14,  9 },
> +#endif
> +};
> +int i;
> +for (i = 0; i < FF_ARRAY_ELEMS(cs_map); i++) {
> +if (vacs == cs_map[i].va_color_standard)
> +return &cs_map[i];
> +}
> +return NULL;
> +}
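
On the two review points above (magic numbers, and the duplicate BT601 entry), here is a minimal sketch of what a table with named values and a first-match lookup looks like. Everything with a DEMO_/demo_ prefix is hypothetical and local to this sketch: the enum values only mirror what I believe the numbers 5 and 6 correspond to in libavutil/pixfmt.h (BT470BG and SMPTE170M), and demo_standard is a stand-in for VAProcColorStandardType, not the real type.

```c
#include <stddef.h>

/* Local stand-ins for the libavutil enum values behind the magic numbers.
 * Assumption: 5 == BT470BG and 6 == SMPTE170M, as in libavutil/pixfmt.h. */
enum { DEMO_PRI_BT709 = 1, DEMO_PRI_BT470BG = 5, DEMO_PRI_SMPTE170M = 6 };
enum { DEMO_TRC_BT709 = 1, DEMO_TRC_SMPTE170M = 6 };
enum { DEMO_SPC_BT709 = 1, DEMO_SPC_BT470BG = 5, DEMO_SPC_SMPTE170M = 6 };

/* Hypothetical stand-in for VAProcColorStandardType. */
enum demo_standard { DEMO_STD_BT601 = 1, DEMO_STD_BT709 = 2 };

struct demo_props {
    enum demo_standard standard;
    int primaries, trc, colorspace;
};

static const struct demo_props demo_map[] = {
    /* Two rows share the same standard, as in the patch. */
    { DEMO_STD_BT601, DEMO_PRI_BT470BG,   DEMO_TRC_SMPTE170M, DEMO_SPC_BT470BG   },
    { DEMO_STD_BT601, DEMO_PRI_SMPTE170M, DEMO_TRC_SMPTE170M, DEMO_SPC_SMPTE170M },
    { DEMO_STD_BT709, DEMO_PRI_BT709,     DEMO_TRC_BT709,     DEMO_SPC_BT709     },
};

/* First-match lookup by standard: a later row with the same standard
 * can never be returned by this function. */
static const struct demo_props *demo_find(enum demo_standard s)
{
    size_t i;
    for (i = 0; i < sizeof(demo_map) / sizeof(demo_map[0]); i++)
        if (demo_map[i].standard == s)
            return &demo_map[i];
    return NULL;
}
```

With this shape, demo_find(DEMO_STD_BT601) always yields the BT470BG row; the SMPTE170M row would only be reachable by a lookup that also compares the colour properties.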
> +
> +static void vaapi_vpp_fill_colour_standard(VAAPIColourProperties *props,
> +   VAProcColorStandardType *vacs,
> +   int nb_vacs)
> +{
> +const VAAPIColourProperties *t;
> +int i, k, score, best_score, worst_score;
> +
> +// If the driver supports explicit use of the standard values then just
> +// use them and avoid doing any mapping.  (The driver may not support
> +// some particular code point, but it still has enough information to
> +// make a better fallback choice than we do in that case.)
> +#if VA_CHECK_VERSION(1, 1, 0)
> +for (i = 0; i < nb_vacs; i++) {
> +if (vacs[i] == VAProcColorStandardExplicit) {
> +props->va_color_standard = VAProcColorStandardExplicit;
> +return;
> +}
> +}
> +#endif
> +
> +// Give scores to the possible options and choose the lowest one.
> +// An exact match will score zero and therefore always be chosen, as
> +// will a partial match where all unmatched elements are explicitly
> +// unspecified.  (If all elements are unspecified this will use the
> +// first available value.)  If no options match at all then just
> +// pass "none" to the driver and let it make its own choice.
Here (a*4 + b*2 + c) is chosen as the score function. I am not sure whether a
plain (a + b + c) would be good enough?
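
To make the question concrete, here is a small sketch comparing the two score functions. It assumes each field contributes 1 when the candidate differs from the wanted value and 0 otherwise; the struct and function names are made up for illustration and are not from the patch.

```c
/* 1 if the candidate differs from the wanted value in that field, else 0. */
struct demo_mismatch {
    int colorspace;
    int trc;
    int primaries;
};

/* Weighted score: a colorspace mismatch (4) outranks a trc mismatch plus a
 * primaries mismatch combined (2 + 1 = 3), so the fields form a strict
 * priority order. */
static int demo_score_weighted(struct demo_mismatch m)
{
    return 4 * m.colorspace + 2 * m.trc + m.primaries;
}

/* Flat score: only the number of mismatching fields matters. */
static int demo_score_flat(struct demo_mismatch m)
{
    return m.colorspace + m.trc + m.primaries;
}
```

Take candidate A, which mismatches only the colorspace, and candidate B, which mismatches trc and primaries. The weighted score gives A = 4 and B = 3, so B wins and the colorspace stays correct; the flat score gives A = 1 and B = 2, so A wins. The two functions therefore do not always select the same standard.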

> +best_score = -1;
> +worst_score = 4 * (props->colorspace != AVCOL_SPC_UNSPECIFIED) +
> +  2 * (props->color_trc != AVCOL_TRC_UNSPECIFIED) +
> +  (props->color_primaries != AVCOL_PRI_UNSPECIFIED);
It seems that the outer loop here is only used to iterate over nb_vacs again to
find the best match.
Can we remove the outer loop over k, like below?

best_va_standard = VAProcColorStandardNone;
for (i = 0; i < nb_vacs; i++) {
...
...

Re: [FFmpeg-devel] [PATCH] MAINTAINERS: add myself for tonemap_opencl

2019-02-18 Thread Song, Ruiling
> -Original Message-
> From: Song, Ruiling
> Sent: Wednesday, February 13, 2019 9:29 AM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Song, Ruiling 
> Subject: [PATCH] MAINTAINERS: add myself for tonemap_opencl
> 
> Signed-off-by: Ruiling Song 
> ---
>  MAINTAINERS | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 7ac2d22..412a739 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -362,6 +362,7 @@ Filters:
>vf_ssim.c Paul B Mahol
>vf_stereo3d.c Paul B Mahol
>vf_telecine.c Paul B Mahol
> +  vf_tonemap_opencl.c   Ruiling Song
>vf_yadif.cMichael Niedermayer
>vf_zoompan.c  Paul B Mahol
Ping?

> 
> --
> 2.7.4



Re: [FFmpeg-devel] [PATCH 1/2] lavfi/vf_hwmap: make hwunmap from software frame work.

2018-12-25 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Mark Thompson
> Sent: Friday, December 21, 2018 7:39 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH 1/2] lavfi/vf_hwmap: make hwunmap from
> software frame work.
> 
> On 18/12/2018 01:28, Song, Ruiling wrote:
> >> -Original Message-
> >> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of
> >> Mark Thompson
> >> Sent: Tuesday, December 18, 2018 6:33 AM
> >> To: ffmpeg-devel@ffmpeg.org
> >> Subject: Re: [FFmpeg-devel] [PATCH 1/2] lavfi/vf_hwmap: make hwunmap
> from
> >> software frame work.
> >>
> >> On 13/12/2018 01:50, Ruiling Song wrote:
> >>> This patch was used to fix the second hwmap filter issue:
> >>> [vaapi_frame] hwmap [software filters] hwmap [vaapi_frame]
> >>> For such case, we also need to allocate the hardware frame
> >>> and map it back to software.
> >>>
> >>> Signed-off-by: Ruiling Song 
> >>> ---
> >>>  libavfilter/vf_hwmap.c | 125 +-
> ---
> >> 
> >>>  1 file changed, 75 insertions(+), 50 deletions(-)
> >>>
> >>> diff --git a/libavfilter/vf_hwmap.c b/libavfilter/vf_hwmap.c
> >>> index 290559a..03cb325 100644
> >>> --- a/libavfilter/vf_hwmap.c
> >>> +++ b/libavfilter/vf_hwmap.c
> >>> @@ -50,6 +50,36 @@ static int hwmap_query_formats(AVFilterContext
> >> *avctx)
> >>>  return 0;
> >>>  }
> >>>
> >>> +static int create_hwframe_context(HWMapContext *ctx, AVFilterContext
> >> *avctx,
> >>> +  AVBufferRef *device, int format,
> >>> +  int sw_format, int width, int height)
> >>> +{
> >>> +int err;
> >>> +AVHWFramesContext *frames;
> >>> +
> >>> +ctx->hwframes_ref = av_hwframe_ctx_alloc(device);
> >>> +if (!ctx->hwframes_ref) {
> >>> +return AVERROR(ENOMEM);
> >>> +}
> >>> +frames = (AVHWFramesContext*)ctx->hwframes_ref->data;
> >>> +
> >>> +frames->format= format;
> >>> +frames->sw_format = sw_format;
> >>> +frames->width = width;
> >>> +frames->height= height;
> >>> +
> >>> +if (avctx->extra_hw_frames >= 0)
> >>> +frames->initial_pool_size = 2 + avctx->extra_hw_frames;
> >>> +
> >>> +err = av_hwframe_ctx_init(ctx->hwframes_ref);
> >>> +if (err < 0) {
> >>> +av_log(avctx, AV_LOG_ERROR, "Failed to initialise "
> >>> +   "target frames context: %d.\n", err);
> >>> +return err;
> >>> +}
> >>> +return 0;
> >>> +}
> >>> +
> >>>  static int hwmap_config_output(AVFilterLink *outlink)
> >>>  {
> >>>  AVFilterContext *avctx = outlink->src;
> >>> @@ -130,29 +160,11 @@ static int hwmap_config_output(AVFilterLink
> >> *outlink)
> >>>  // overwrite the input hwframe context with a derived context
> >>>  // mapped from that back to the source type.
> >>>  AVBufferRef *source;
> >>> -AVHWFramesContext *frames;
> >>> -
> >>> -ctx->hwframes_ref = av_hwframe_ctx_alloc(device);
> >>> -if (!ctx->hwframes_ref) {
> >>> -err = AVERROR(ENOMEM);
> >>> +err = create_hwframe_context(ctx, avctx, device, outlink->format,
> >>> + hwfc->sw_format, hwfc->width,
> >>> + hwfc->height);
> >>> +if (err < 0)
> >>>  goto fail;
> >>> -}
> >>> -frames = (AVHWFramesContext*)ctx->hwframes_ref->data;
> >>> -
> >>> -frames->format= outlink->format;
> >>> -frames->sw_format = hwfc->sw_format;
> >>> -frames->width = hwfc->width;
> >>> -frames->height= hwfc->height;
> >>> -
> >>> -if (

Re: [FFmpeg-devel] [PATCH] lavfi/tonemap_opencl: reuse matrix calculation from vf_colorspace

2018-12-19 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Song, Ruiling
> Sent: Tuesday, December 4, 2018 3:33 PM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH] lavfi/tonemap_opencl: reuse matrix
> calculation from vf_colorspace
> 
> 
> 
> > -Original Message-
> > From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> > Ruiling Song
> > Sent: Wednesday, November 28, 2018 2:09 PM
> > To: ffmpeg-devel@ffmpeg.org
> > Cc: Song, Ruiling 
> > Subject: [FFmpeg-devel] [PATCH] lavfi/tonemap_opencl: reuse matrix
> > calculation from vf_colorspace
> >
> > As these functions are moved to shared file, other colorspace-related
> > filters could also leverage the code.
> >
> > Signed-off-by: Ruiling Song 
> > ---
> >  libavfilter/colorspace.c| 71 +
> >  libavfilter/colorspace.h|  4 ++
> >  libavfilter/opencl/colorspace_common.cl | 25 ---
> >  libavfilter/vf_colorspace.c | 80 
> > ++---
> >  libavfilter/vf_tonemap_opencl.c | 62 +++--
> >  5 files changed, 106 insertions(+), 136 deletions(-)
> >
> > diff --git a/libavfilter/colorspace.c b/libavfilter/colorspace.c
> > index c668221..19616e4 100644
> > --- a/libavfilter/colorspace.c
> > +++ b/libavfilter/colorspace.c
> > @@ -93,6 +93,77 @@ void ff_fill_rgb2xyz_table(const struct
> > PrimaryCoefficients *coeffs,
> >  rgb2xyz[2][1] *= sg;
> >  rgb2xyz[2][2] *= sb;
> >  }
> > +static const double ycgco_matrix[3][3] =
> > +{
> > +{  0.25, 0.5,  0.25 },
> > +{ -0.25, 0.5, -0.25 },
> > +{  0.5,  0,   -0.5  },
> > +};
> > +
> > +static const double gbr_matrix[3][3] =
> > +{
> > +{ 0,1,   0   },
> > +{ 0,   -0.5, 0.5 },
> > +{ 0.5, -0.5, 0   },
> > +};
> > +
> > +/*
> > + * All constants explained in e.g. https://linuxtv.org/downloads/v4l-dvb-apis/ch02s06.html
> > + * The older ones (bt470bg/m) are also explained in their respective ITU docs
> > + * (e.g. https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.470-5-199802-S!!PDF-E.pdf)
> > + * whereas the newer ones can typically be copied directly from wikipedia :)
> > + */
> > +static const struct LumaCoefficients luma_coefficients[AVCOL_SPC_NB] = {
> > +[AVCOL_SPC_FCC]= { 0.30,   0.59,   0.11   },
> > +[AVCOL_SPC_BT470BG]= { 0.299,  0.587,  0.114  },
> > +[AVCOL_SPC_SMPTE170M]  = { 0.299,  0.587,  0.114  },
> > +[AVCOL_SPC_BT709]  = { 0.2126, 0.7152, 0.0722 },
> > +[AVCOL_SPC_SMPTE240M]  = { 0.212,  0.701,  0.087  },
> > +[AVCOL_SPC_YCOCG]  = { 0.25,   0.5,0.25   },
> > +[AVCOL_SPC_RGB]= { 1,  1,  1  },
> > +[AVCOL_SPC_BT2020_NCL] = { 0.2627, 0.6780, 0.0593 },
> > +[AVCOL_SPC_BT2020_CL]  = { 0.2627, 0.6780, 0.0593 },
> > +};
> > +
> > +const struct LumaCoefficients *ff_get_luma_coefficients(enum AVColorSpace csp)
> > +{
> > +const struct LumaCoefficients *coeffs;
> > +
> > +if (csp >= AVCOL_SPC_NB)
> > +return NULL;
> > +coeffs = &luma_coefficients[csp];
> > +if (!coeffs->cr)
> > +return NULL;
> > +
> > +return coeffs;
> > +}
> > +
> > +void ff_fill_rgb2yuv_table(const struct LumaCoefficients *coeffs,
> > +   double rgb2yuv[3][3])
> > +{
> > +double bscale, rscale;
> > +
> > +// special ycgco matrix
> > +if (coeffs->cr == 0.25 && coeffs->cg == 0.5 && coeffs->cb == 0.25) {
> > +memcpy(rgb2yuv, ycgco_matrix, sizeof(double) * 9);
> > +return;
> > +} else if (coeffs->cr == 1 && coeffs->cg == 1 && coeffs->cb == 1) {
> > +memcpy(rgb2yuv, gbr_matrix, sizeof(double) * 9);
> > +return;
> > +}
> > +
> > +rgb2yuv[0][0] = coeffs->cr;
> > +rgb2yuv[0][1] = coeffs->cg;
> > +rgb2yuv[0][2] = coeffs->cb;
> > +bscale = 0.5 / (coeffs->cb - 1.0);
> > +rscale = 0.5 / (coeffs->cr - 1.0);
> > +rgb2yuv[1][0] = bscale * coeffs->cr;
> > +rgb2yuv[1][1] = bscale * coeffs->cg;
> > +rgb2yuv[1][2] = 0.5;
> > +rgb2yuv[2][0] = 0.5;
> > +rgb2yuv[2][1] = rscale * coeffs->cg;
>

Re: [FFmpeg-devel] [PATCH 1/2] lavfi/vf_hwmap: make hwunmap from software frame work.

2018-12-17 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Mark Thompson
> Sent: Tuesday, December 18, 2018 6:33 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH 1/2] lavfi/vf_hwmap: make hwunmap from
> software frame work.
> 
> On 13/12/2018 01:50, Ruiling Song wrote:
> > This patch was used to fix the second hwmap filter issue:
> > [vaapi_frame] hwmap [software filters] hwmap [vaapi_frame]
> > For such case, we also need to allocate the hardware frame
> > and map it back to software.
> >
> > Signed-off-by: Ruiling Song 
> > ---
> >  libavfilter/vf_hwmap.c | 125 +
> 
> >  1 file changed, 75 insertions(+), 50 deletions(-)
> >
> > diff --git a/libavfilter/vf_hwmap.c b/libavfilter/vf_hwmap.c
> > index 290559a..03cb325 100644
> > --- a/libavfilter/vf_hwmap.c
> > +++ b/libavfilter/vf_hwmap.c
> > @@ -50,6 +50,36 @@ static int hwmap_query_formats(AVFilterContext
> *avctx)
> >  return 0;
> >  }
> >
> > +static int create_hwframe_context(HWMapContext *ctx, AVFilterContext
> *avctx,
> > +  AVBufferRef *device, int format,
> > +  int sw_format, int width, int height)
> > +{
> > +int err;
> > +AVHWFramesContext *frames;
> > +
> > +ctx->hwframes_ref = av_hwframe_ctx_alloc(device);
> > +if (!ctx->hwframes_ref) {
> > +return AVERROR(ENOMEM);
> > +}
> > +frames = (AVHWFramesContext*)ctx->hwframes_ref->data;
> > +
> > +frames->format= format;
> > +frames->sw_format = sw_format;
> > +frames->width = width;
> > +frames->height= height;
> > +
> > +if (avctx->extra_hw_frames >= 0)
> > +frames->initial_pool_size = 2 + avctx->extra_hw_frames;
> > +
> > +err = av_hwframe_ctx_init(ctx->hwframes_ref);
> > +if (err < 0) {
> > +av_log(avctx, AV_LOG_ERROR, "Failed to initialise "
> > +   "target frames context: %d.\n", err);
> > +return err;
> > +}
> > +return 0;
> > +}
> > +
> >  static int hwmap_config_output(AVFilterLink *outlink)
> >  {
> >  AVFilterContext *avctx = outlink->src;
> > @@ -130,29 +160,11 @@ static int hwmap_config_output(AVFilterLink
> *outlink)
> >  // overwrite the input hwframe context with a derived context
> >  // mapped from that back to the source type.
> >  AVBufferRef *source;
> > -AVHWFramesContext *frames;
> > -
> > -ctx->hwframes_ref = av_hwframe_ctx_alloc(device);
> > -if (!ctx->hwframes_ref) {
> > -err = AVERROR(ENOMEM);
> > +err = create_hwframe_context(ctx, avctx, device, outlink->format,
> > + hwfc->sw_format, hwfc->width,
> > + hwfc->height);
> > +if (err < 0)
> >  goto fail;
> > -}
> > -frames = (AVHWFramesContext*)ctx->hwframes_ref->data;
> > -
> > -frames->format= outlink->format;
> > -frames->sw_format = hwfc->sw_format;
> > -frames->width = hwfc->width;
> > -frames->height= hwfc->height;
> > -
> > -if (avctx->extra_hw_frames >= 0)
> > -frames->initial_pool_size = 2 + avctx->extra_hw_frames;
> > -
> > -err = av_hwframe_ctx_init(ctx->hwframes_ref);
> > -if (err < 0) {
> > -av_log(avctx, AV_LOG_ERROR, "Failed to initialise "
> > -   "target frames context: %d.\n", err);
> > -goto fail;
> > -}
> >
> >  err = av_hwframe_ctx_create_derived(&source,
> >  inlink->format,
> > @@ -175,10 +187,20 @@ static int hwmap_config_output(AVFilterLink
> *outlink)
> >  inlink->hw_frames_ctx = source;
> >
> >  } else if ((outlink->format == hwfc->format &&
> > -inlink->format  == hwfc->sw_format) ||
> > -   inlink->format == hwfc->format) {
> > -// Map from a hardware format to a software format, or
> > -// undo an existing such mapping.
> > +inlink->format  == hwfc->sw_format)) {
> > +// unmap a software frame back to hardware
> > +ctx->reverse = 1;
> > +// in case user does not provide filter device, use the device_ref
> > +// from inlink
> > +if (!device)
> > +device = hwfc->device_ref;
> > +
> > +err = create_hwframe_context(ctx, avctx, device, outlink->format,
> > + inlink->format, inlink->w, inlink->h);
> > +if (err < 0)
> > +goto fail;
> 
> I don't think the unmap case here wants to make a new hardware frames
> context?  You have a software frame which is actually a mapping of a 

Re: [FFmpeg-devel] [PATCH] lavfi/tonemap_opencl: reuse matrix calculation from vf_colorspace

2018-12-03 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Ruiling Song
> Sent: Wednesday, November 28, 2018 2:09 PM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Song, Ruiling 
> Subject: [FFmpeg-devel] [PATCH] lavfi/tonemap_opencl: reuse matrix
> calculation from vf_colorspace
> 
> As these functions are moved to shared file, other colorspace-related
> filters could also leverage the code.
> 
> Signed-off-by: Ruiling Song 
> ---
>  libavfilter/colorspace.c| 71 +
>  libavfilter/colorspace.h|  4 ++
>  libavfilter/opencl/colorspace_common.cl | 25 ---
>  libavfilter/vf_colorspace.c | 80 
> ++---
>  libavfilter/vf_tonemap_opencl.c | 62 +++--
>  5 files changed, 106 insertions(+), 136 deletions(-)
> 
> diff --git a/libavfilter/colorspace.c b/libavfilter/colorspace.c
> index c668221..19616e4 100644
> --- a/libavfilter/colorspace.c
> +++ b/libavfilter/colorspace.c
> @@ -93,6 +93,77 @@ void ff_fill_rgb2xyz_table(const struct
> PrimaryCoefficients *coeffs,
>  rgb2xyz[2][1] *= sg;
>  rgb2xyz[2][2] *= sb;
>  }
> +static const double ycgco_matrix[3][3] =
> +{
> +{  0.25, 0.5,  0.25 },
> +{ -0.25, 0.5, -0.25 },
> +{  0.5,  0,   -0.5  },
> +};
> +
> +static const double gbr_matrix[3][3] =
> +{
> +{ 0,1,   0   },
> +{ 0,   -0.5, 0.5 },
> +{ 0.5, -0.5, 0   },
> +};
> +
> +/*
> + * All constants explained in e.g. https://linuxtv.org/downloads/v4l-dvb-apis/ch02s06.html
> + * The older ones (bt470bg/m) are also explained in their respective ITU docs
> + * (e.g. https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.470-5-199802-S!!PDF-E.pdf)
> + * whereas the newer ones can typically be copied directly from wikipedia :)
> + */
> +static const struct LumaCoefficients luma_coefficients[AVCOL_SPC_NB] = {
> +[AVCOL_SPC_FCC]= { 0.30,   0.59,   0.11   },
> +[AVCOL_SPC_BT470BG]= { 0.299,  0.587,  0.114  },
> +[AVCOL_SPC_SMPTE170M]  = { 0.299,  0.587,  0.114  },
> +[AVCOL_SPC_BT709]  = { 0.2126, 0.7152, 0.0722 },
> +[AVCOL_SPC_SMPTE240M]  = { 0.212,  0.701,  0.087  },
> +[AVCOL_SPC_YCOCG]  = { 0.25,   0.5,0.25   },
> +[AVCOL_SPC_RGB]= { 1,  1,  1  },
> +[AVCOL_SPC_BT2020_NCL] = { 0.2627, 0.6780, 0.0593 },
> +[AVCOL_SPC_BT2020_CL]  = { 0.2627, 0.6780, 0.0593 },
> +};
> +
> +const struct LumaCoefficients *ff_get_luma_coefficients(enum AVColorSpace csp)
> +{
> +const struct LumaCoefficients *coeffs;
> +
> +if (csp >= AVCOL_SPC_NB)
> +return NULL;
> +coeffs = &luma_coefficients[csp];
> +if (!coeffs->cr)
> +return NULL;
> +
> +return coeffs;
> +}
> +
> +void ff_fill_rgb2yuv_table(const struct LumaCoefficients *coeffs,
> +   double rgb2yuv[3][3])
> +{
> +double bscale, rscale;
> +
> +// special ycgco matrix
> +if (coeffs->cr == 0.25 && coeffs->cg == 0.5 && coeffs->cb == 0.25) {
> +memcpy(rgb2yuv, ycgco_matrix, sizeof(double) * 9);
> +return;
> +} else if (coeffs->cr == 1 && coeffs->cg == 1 && coeffs->cb == 1) {
> +memcpy(rgb2yuv, gbr_matrix, sizeof(double) * 9);
> +return;
> +}
> +
> +rgb2yuv[0][0] = coeffs->cr;
> +rgb2yuv[0][1] = coeffs->cg;
> +rgb2yuv[0][2] = coeffs->cb;
> +bscale = 0.5 / (coeffs->cb - 1.0);
> +rscale = 0.5 / (coeffs->cr - 1.0);
> +rgb2yuv[1][0] = bscale * coeffs->cr;
> +rgb2yuv[1][1] = bscale * coeffs->cg;
> +rgb2yuv[1][2] = 0.5;
> +rgb2yuv[2][0] = 0.5;
> +rgb2yuv[2][1] = rscale * coeffs->cg;
> +rgb2yuv[2][2] = rscale * coeffs->cb;
> +}
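
As a worked check of the general (non-YCgCo, non-GBR) branch of the matrix fill above, here is a standalone sketch that repeats the same arithmetic outside FFmpeg. The demo_ names are local to this sketch; the BT.709 coefficients used in the note below are the ones from the luma_coefficients table in the patch.

```c
/* Local stand-in for struct LumaCoefficients (fields cr, cg, cb). */
struct demo_luma {
    double cr, cg, cb;
};

/* Same arithmetic as the general branch of ff_fill_rgb2yuv_table():
 * row 0 is luma, rows 1 and 2 are the scaled B-Y and R-Y differences. */
static void demo_fill_rgb2yuv(const struct demo_luma *c, double m[3][3])
{
    double bscale = 0.5 / (c->cb - 1.0);
    double rscale = 0.5 / (c->cr - 1.0);

    m[0][0] = c->cr;
    m[0][1] = c->cg;
    m[0][2] = c->cb;
    m[1][0] = bscale * c->cr;
    m[1][1] = bscale * c->cg;
    m[1][2] = 0.5;
    m[2][0] = 0.5;
    m[2][1] = rscale * c->cg;
    m[2][2] = rscale * c->cb;
}

/* Sum of one matrix row; used to check the result on a grey input. */
static double demo_row_sum(double m[3][3], int row)
{
    return m[row][0] + m[row][1] + m[row][2];
}
```

For BT.709 (0.2126, 0.7152, 0.0722) the coefficients sum to 1, so the luma row sums to 1 and both chroma rows sum to 0. That is the expected behaviour: a grey pixel (R = G = B) keeps its value in Y and produces zero chroma.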
> 
>  double ff_determine_signal_peak(AVFrame *in)
>  {
> diff --git a/libavfilter/colorspace.h b/libavfilter/colorspace.h
> index 9366818..459a5df 100644
> --- a/libavfilter/colorspace.h
> +++ b/libavfilter/colorspace.h
> @@ -44,6 +44,10 @@ void ff_fill_rgb2xyz_table(const struct
> PrimaryCoefficients *coeffs,
> const struct WhitepointCoefficients *wp,
> double rgb2xyz[3][3]);
> 
> +const struct LumaCoefficients *ff_get_luma_coefficients(enum AVColorSpace csp);
> +void ff_fill_rgb2yuv_table(const struct LumaCoefficients *coeffs,
> +   double rgb2yuv[3][3]);
> +
>  double ff_determine_signal_peak(AVFrame *in);
>  void ff_update_hdr_metadata(AVFrame *in, double peak);
> 
> diff --git a/libavfilter/openc

Re: [FFmpeg-devel] [PATCH V2] lavf: add transpose_opencl filter

2018-12-03 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Mark Thompson
> Sent: Monday, December 3, 2018 8:10 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH V2] lavf: add transpose_opencl filter
> 
> On 28/11/2018 02:27, Ruiling Song wrote:
> > Signed-off-by: Ruiling Song 
> > ---
> >  configure |   1 +
> >  libavfilter/Makefile  |   1 +
> >  libavfilter/allfilters.c  |   1 +
> >  libavfilter/opencl/transpose.cl   |  35 +
> >  libavfilter/opencl_source.h   |   1 +
> >  libavfilter/transpose.h   |  34 +
> >  libavfilter/vf_transpose.c|  14 +-
> >  libavfilter/vf_transpose_opencl.c | 288
> ++
> >  8 files changed, 362 insertions(+), 13 deletions(-)
> >  create mode 100644 libavfilter/opencl/transpose.cl
> >  create mode 100644 libavfilter/transpose.h
> >  create mode 100644 libavfilter/vf_transpose_opencl.c
> 
> Testing the passthrough option here reveals a slightly unfortunate interaction
> with mapping - if this is the only filter in use, then not doing a redundant copy
> can fall over.
> 
> For example, on Rockchip (Mali) decoding with rkmpp then using:
> 
> -vf hwmap=derive_device=opencl,transpose_opencl=dir=clock:passthrough=landscape,hwdownload,format=nv12
> 
> fails at the download in the passthrough case because it doesn't allow the read
> (the extension does explicitly document this constraint -
>  emory.txt>).
> 
> VAAPI has a similar problem with a decode followed by:
> 
> -vf hwmap=derive_device=opencl,transpose_opencl,hwmap=derive_device=vaapi:reverse=1
> 
> because the reverse mapping tries to replace the inlink hw_frames_ctx in a way
> which doesn't actually work.
> 
> All of these cases do of course work if anything else is in the way - any additional
> opencl filter on either side makes it work.  I think it's fine to ignore this (after all,
> the hwmap immediately followed by hwdownload case can already fail in the
> same way), but any thoughts you have on making that better are welcome.
I also noticed that during my testing. Currently I have no idea how to fix it,
but I am interested in looking for a better fix for this issue.
Right now I am still struggling to understand the source code of hwmap.
I have not figured out how hwmap is used to map from a software format to a
hardware format.
That is the piece of code starting from line 200 in vf_hwmap.c:
https://github.com/FFmpeg/FFmpeg/blob/master/libavfilter/vf_hwmap.c#L200
Could you show me an example command that would go into this branch?

Thanks!
Ruiling
> 
> 
> >> Does the dependency on dir have any effect on speed here?  Any call is only ever
> >> going to use one side of each of the dir cases, so it feels like it might be nicer to
> >> hard-code that so they aren't included in the compiled code at all.
> > For such memory bound OpenCL kernel, some little more arithmetic operation
> > would not affect the overall performance.
> > I did some more testing, and see no obvious performance difference for
> > different 'dir' parameter. So I just keep it as now.
> 
> That makes sense, thank you for checking.
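
For readers following the 'dir' discussion without the kernel at hand, here is a CPU-side sketch in C of what a runtime dependency on dir can look like. It is not the OpenCL kernel from the patch: the function name is made up, and the bit tests assume the numbering of the software transpose filter (0 = cclock_flip, 1 = clock, 2 = cclock, 3 = clock_flip). The point is only that the per-pixel cost of the dir handling is two selects and two subtractions, next to one load and one store.

```c
/* Transpose a single 8-bit plane of src_w x src_h pixels into dst, which
 * must hold src_h x src_w pixels. dir is evaluated per pixel at run time
 * rather than being fixed at compile time. */
static void demo_transpose(unsigned char *dst, const unsigned char *src,
                           int src_w, int src_h, int dir)
{
    int dst_w = src_h;
    int dst_h = src_w;
    int x, y;

    for (y = 0; y < dst_h; y++) {
        for (x = 0; x < dst_w; x++) {
            /* Source column comes from the destination row, optionally
             * mirrored; source row comes from the destination column. */
            int xin = (dir & 2) ? (dst_h - 1 - y) : y;
            int yin = (dir & 1) ? (dst_w - 1 - x) : x;

            dst[y * dst_w + x] = src[yin * src_w + xin];
        }
    }
}
```

Under the assumed numbering, dir = 1 turns the 3x2 plane {1 2 3 / 4 5 6} into the 2x3 plane {4 1 / 5 2 / 6 3}, i.e. a clockwise rotation by 90 degrees.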
> 
> 
> So, LGTM and applied.
> 
> Thanks,
> 
> - Mark


Re: [FFmpeg-devel] [PATCH] lavf: add tranpose_opencl filter

2018-11-27 Thread Song, Ruiling
Thanks for your valuable comments, reply inline.

> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Mark Thompson
> Sent: Wednesday, November 28, 2018 8:41 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH] lavf: add tranpose_opencl filter
> 
> On 26/11/2018 07:05, Ruiling Song wrote:
> > Signed-off-by: Ruiling Song 
> > ---
> >  configure |   1 +
> >  libavfilter/Makefile  |   1 +
> >  libavfilter/allfilters.c  |   1 +
> >  libavfilter/opencl/transpose.cl   |  35 +
> >  libavfilter/opencl_source.h   |   1 +
> >  libavfilter/transpose.h   |  34 +
> >  libavfilter/vf_transpose.c|  14 +-
> >  libavfilter/vf_transpose_opencl.c | 294
> ++
> >  8 files changed, 368 insertions(+), 13 deletions(-)
> >  create mode 100644 libavfilter/opencl/transpose.cl
> >  create mode 100644 libavfilter/transpose.h
> >  create mode 100644 libavfilter/vf_transpose_opencl.c
> >
> > diff --git a/configure b/configure
> > index b4f944c..dcb3f5f 100755
> > --- a/configure
> > +++ b/configure
> > @@ -3479,6 +3479,7 @@ tinterlace_merge_test_deps="tinterlace_filter"
> >  tinterlace_pad_test_deps="tinterlace_filter"
> >  tonemap_filter_deps="const_nan"
> >  tonemap_opencl_filter_deps="opencl const_nan"
> > +transpose_opencl_filter_deps="opencl"
> >  unsharp_opencl_filter_deps="opencl"
> >  uspp_filter_deps="gpl avcodec"
> >  vaguedenoiser_filter_deps="gpl"
> > diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> > index 1895fa2..6e26581 100644
> > --- a/libavfilter/Makefile
> > +++ b/libavfilter/Makefile
> > @@ -393,6 +393,7 @@ OBJS-$(CONFIG_TONEMAP_OPENCL_FILTER) += vf_tonemap_opencl.o colorspace.o
> >  OBJS-$(CONFIG_TPAD_FILTER)   += vf_tpad.o
> >  OBJS-$(CONFIG_TRANSPOSE_FILTER)  += vf_transpose.o
> >  OBJS-$(CONFIG_TRANSPOSE_NPP_FILTER)  += vf_transpose_npp.o cuda_check.o
> > +OBJS-$(CONFIG_TRANSPOSE_OPENCL_FILTER)   += vf_transpose_opencl.o opencl.o opencl/transpose.o
> >  OBJS-$(CONFIG_TRIM_FILTER)   += trim.o
> >  OBJS-$(CONFIG_UNPREMULTIPLY_FILTER)  += vf_premultiply.o framesync.o
> >  OBJS-$(CONFIG_UNSHARP_FILTER)+= vf_unsharp.o
> > diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
> > index 837c99e..a600069 100644
> > --- a/libavfilter/allfilters.c
> > +++ b/libavfilter/allfilters.c
> > @@ -372,6 +372,7 @@ extern AVFilter ff_vf_tonemap_opencl;
> >  extern AVFilter ff_vf_tpad;
> >  extern AVFilter ff_vf_transpose;
> >  extern AVFilter ff_vf_transpose_npp;
> > +extern AVFilter ff_vf_transpose_opencl;
> >  extern AVFilter ff_vf_trim;
> >  extern AVFilter ff_vf_unpremultiply;
> >  extern AVFilter ff_vf_unsharp;
> > diff --git a/libavfilter/opencl/transpose.cl 
> > b/libavfilter/opencl/transpose.cl
> > new file mode 100644
> > index 000..e6388ab
> > --- /dev/null
> > +++ b/libavfilter/opencl/transpose.cl
> > @@ -0,0 +1,35 @@
> > +/*
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with FFmpeg; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301
> USA
> > + */
> > +kernel void transpose(__write_only image2d_t dst,
> > +  __read_only image2d_t src,
> > +  int dir) {
> > +const sampler_t sampler = (CLK_NORMALIZED_COORDS_FALSE |
> > +   CLK_ADDRESS_CLAMP_TO_EDGE   |
> > +   CLK_FILTER_NEAREST);
> > +
> > +int2 size = get_image_dim(dst);
> > +int x = get_global_id(0);
> > +int y = get_global_id(1);
> > +
> > +int xin = (dir & 2) ? (size.y - 1 - y) : y;
> > +int yin = (dir & 1) ? (size.x - 1 - x) : x;
> > +float4 data = read_imagef(src, sampler, (int2)(xin, yin));
> > +
> > +if (x < size.x && y < size.y)
> > +write_imagef(dst, (int2)(x, y), data);
> > +}
> 
> Does the dependency on dir have any effect on speed here?  Any call is only 
> ever
> going to use one side of each of the dir cases, so it feels like it might be 
> nicer to
> hard-code that so they aren't included in the compiled code at all.
For such a memory-bound OpenCL kernel, a few more arithmetic operations 
would not affect the overall 
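
For reference, the index mapping in the quoted kernel can be checked on the CPU. What follows is only a minimal sketch, not part of the patch: the name transpose_ref and the tightly packed single-plane 8-bit layout are assumptions made for illustration. The dir bits are interpreted exactly as in the kernel above (bit 1 flips the source x, bit 0 flips the source y).

```c
#include <stddef.h>

/* CPU reference for the coordinate mapping used by the quoted kernel.
 * dst is dst_w x dst_h, src is dst_h x dst_w (width x height); both are
 * tightly packed, row-major, one byte per sample (an assumption). */
static void transpose_ref(unsigned char *dst, const unsigned char *src,
                          int dst_w, int dst_h, int dir)
{
    int src_w = dst_h; /* source width equals destination height */

    for (int y = 0; y < dst_h; y++) {
        for (int x = 0; x < dst_w; x++) {
            /* Same expressions as in the kernel, with size = (dst_w, dst_h). */
            int xin = (dir & 2) ? (dst_h - 1 - y) : y;
            int yin = (dir & 1) ? (dst_w - 1 - x) : x;

            dst[(size_t)y * dst_w + x] = src[(size_t)yin * src_w + xin];
        }
    }
}
```

Both branches of each conditional cost one subtraction per pixel, which is small next to the image read and write per pixel.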

Re: [FFmpeg-devel] [PATCH] hwcontext_opencl: Use correct function to enumerate devices

2018-11-27 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Mark Thompson
> Sent: Wednesday, November 28, 2018 8:17 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH] hwcontext_opencl: Use correct function to
> enumerate devices
> 
> Also assert that all required functions are present.
> ---
> On 26/11/2018 08:57, Song, Ruiling wrote:
> >> -Original Message-
> >> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of
> >> Mark Thompson
> >> Sent: Monday, November 26, 2018 6:08 AM
> >> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> >> Subject: [FFmpeg-devel] [PATCH] hwcontext_opencl: Use correct function to
> >> enumerate devices
> >>
> >> ---
> >>  libavutil/hwcontext_opencl.c | 6 +++---
> >>  1 file changed, 3 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/libavutil/hwcontext_opencl.c b/libavutil/hwcontext_opencl.c
> >> index e6cef74269..6a26354c87 100644
> >> --- a/libavutil/hwcontext_opencl.c
> >> +++ b/libavutil/hwcontext_opencl.c
> >> @@ -542,9 +542,9 @@ static int
> >> opencl_device_create_internal(AVHWDeviceContext *hwdev,
> >>  continue;
> >>  }
> >>
> >> -err = opencl_enumerate_devices(hwdev, platforms[p], platform_name,
> >> -   &nb_devices, &devices,
> >> -   selector->context);
> >> +err = selector->enumerate_devices(hwdev, platforms[p],
> platform_name,
> >> +  &nb_devices, &devices,
> >> +  selector->context);
> > I think it is better to check enumerate_devices against a null pointer before
> calling it, although it should work well currently.
> 
> The two enumerate functions should always be set when entering the function,
> since they are always required.  (Unlike the filter, where "do nothing" is a
> reasonable case.)
> 
> How about an assert at the start of the function to check that, like this?
Yes, that's good. It is helpful when somebody adds support for a new platform but 
forgets to provide a meaningful function pointer.

Ruiling
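
To illustrate the idea, here is a small standalone sketch of asserting that mandatory callbacks are set before use. The ExampleSelector type and all example_* names are hypothetical stand-ins, not the real hwcontext_opencl structures, and plain assert() takes the place of av_assert0().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the device selector. */
typedef struct ExampleSelector {
    int (*enumerate_platforms)(void *context);
    int (*enumerate_devices)(void *context);
    void *context;
} ExampleSelector;

/* Sample callback: counts how often it was invoked. */
static int example_enumerate(void *context)
{
    ++*(int *)context;
    return 0;
}

static int example_device_create(const ExampleSelector *selector)
{
    int err;

    /* Both callbacks are mandatory, so a missing one is a programming
     * error and should fail loudly instead of crashing later. */
    assert(selector->enumerate_platforms && selector->enumerate_devices);

    err = selector->enumerate_platforms(selector->context);
    if (err)
        return err;
    return selector->enumerate_devices(selector->context);
}
```

With this in place, a new selector that leaves one pointer unset is caught at the first call rather than through a null dereference.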
> 
> - Mark
> 
> 
>  libavutil/hwcontext_opencl.c | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/libavutil/hwcontext_opencl.c b/libavutil/hwcontext_opencl.c
> index be71c8323e..d3df6221c4 100644
> --- a/libavutil/hwcontext_opencl.c
> +++ b/libavutil/hwcontext_opencl.c
> @@ -500,6 +500,9 @@ static int
> opencl_device_create_internal(AVHWDeviceContext *hwdev,
>   *device_name_src   = NULL;
>  int err, found, p, d;
> 
> +av_assert0(selector->enumerate_platforms &&
> +   selector->enumerate_devices);
> +
>  err = selector->enumerate_platforms(hwdev, &nb_platforms, &platforms,
>  selector->context);
>  if (err)
> @@ -531,9 +534,9 @@ static int
> opencl_device_create_internal(AVHWDeviceContext *hwdev,
>  continue;
>  }
> 
> -err = opencl_enumerate_devices(hwdev, platforms[p], platform_name,
> -   &nb_devices, &devices,
> -   selector->context);
> +err = selector->enumerate_devices(hwdev, platforms[p], platform_name,
> +  &nb_devices, &devices,
> +  selector->context);
>  if (err < 0)
>  continue;
> 
> --
> 2.19.1


Re: [FFmpeg-devel] [PATCH] hwcontext_opencl: Only release command queue if it exists

2018-11-26 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Mark Thompson
> Sent: Monday, November 26, 2018 3:16 AM
> To: FFmpeg development discussions and patches 
> Subject: [FFmpeg-devel] [PATCH] hwcontext_opencl: Only release command
> queue if it exists
> 
> If the frames context creation fails then the command queue reference
> need not exist when uninit is called.
> ---
>  libavutil/hwcontext_opencl.c | 11 +++
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/libavutil/hwcontext_opencl.c b/libavutil/hwcontext_opencl.c
> index c745b91775..e6cef74269 100644
> --- a/libavutil/hwcontext_opencl.c
> +++ b/libavutil/hwcontext_opencl.c
> @@ -1750,10 +1750,13 @@ static void
> opencl_frames_uninit(AVHWFramesContext *hwfc)
>  av_freep(&priv->mapped_frames);
>  #endif
> 
> -cle = clReleaseCommandQueue(priv->command_queue);
> -if (cle != CL_SUCCESS) {
> -av_log(hwfc, AV_LOG_ERROR, "Failed to release frame "
> -   "command queue: %d.\n", cle);
> +if (priv->command_queue) {
> +cle = clReleaseCommandQueue(priv->command_queue);
> +if (cle != CL_SUCCESS) {
> +av_log(hwfc, AV_LOG_ERROR, "Failed to release frame "
> +   "command queue: %d.\n", cle);
> +}
> +priv->command_queue = NULL;

Seems ok.

Ruiling
>  }
>  }
> 
> --
> 2.19.1
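
The pattern in this patch (release only if the handle exists, then clear it) can be shown in isolation. This is just a sketch under stated assumptions: ExampleQueue, ExampleFramesPriv and example_release_queue() are hypothetical stand-ins for the OpenCL handle, the private context and clReleaseCommandQueue(), and the stub is assumed to reject a null handle much as the real call is expected to reject an invalid one.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the OpenCL queue handle and private context. */
typedef struct ExampleQueue {
    int refcount;
} ExampleQueue;

typedef struct ExampleFramesPriv {
    ExampleQueue *command_queue;
} ExampleFramesPriv;

/* Stub release function: returns 0 on success, nonzero for a null handle. */
static int example_release_queue(ExampleQueue *queue)
{
    if (!queue)
        return -1;
    queue->refcount--;
    return 0;
}

/* Safe to call after a failed init and safe to call twice. */
static int example_frames_uninit(ExampleFramesPriv *priv)
{
    int err = 0;

    if (priv->command_queue) {
        err = example_release_queue(priv->command_queue);
        priv->command_queue = NULL; /* prevents a second release */
    }
    return err;
}
```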


Re: [FFmpeg-devel] [PATCH] hwcontext_opencl: Use correct function to enumerate devices

2018-11-26 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Mark Thompson
> Sent: Monday, November 26, 2018 6:08 AM
> To: FFmpeg development discussions and patches 
> Subject: [FFmpeg-devel] [PATCH] hwcontext_opencl: Use correct function to
> enumerate devices
> 
> ---
>  libavutil/hwcontext_opencl.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/libavutil/hwcontext_opencl.c b/libavutil/hwcontext_opencl.c
> index e6cef74269..6a26354c87 100644
> --- a/libavutil/hwcontext_opencl.c
> +++ b/libavutil/hwcontext_opencl.c
> @@ -542,9 +542,9 @@ static int
> opencl_device_create_internal(AVHWDeviceContext *hwdev,
>  continue;
>  }
> 
> -err = opencl_enumerate_devices(hwdev, platforms[p], platform_name,
> -   &nb_devices, &devices,
> -   selector->context);
> +err = selector->enumerate_devices(hwdev, platforms[p], platform_name,
> +  &nb_devices, &devices,
> +  selector->context);
I think it is better to check enumerate_devices against a null pointer before 
calling it, although it should work well currently.

Ruiling
>  if (err < 0)
>  continue;
> 
> --
> 2.19.1
> 


Re: [FFmpeg-devel] [PATCH 2/4] lavfi/opencl: Handle overlay input formats correctly.

2018-11-13 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Mark Thompson
> Sent: Sunday, November 11, 2018 9:55 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH 2/4] lavfi/opencl: Handle overlay input
> formats correctly.
> 
> On 29/10/18 05:56, Ruiling Song wrote:
> > The main input may have alpha channel, we just ignore it.
> 
> This doesn't ignore it - it leaves it uninitialised in the output, so a YUVA 
> or GBRAP
> output will never write to the A plane.  I don't think that's what you're 
> intending.
What I meant was that the alpha channel of the main input is ignored.
The question is what the user would expect the output alpha channel to contain.
I don't have a clear answer, so I just left it uninitialized.
The other comments make sense; I will fix them.

Thanks!
Ruiling
> 
> > Also add some checks for incompatible input formats.
> >
> > Signed-off-by: Ruiling Song 
> > ---
> >  libavfilter/vf_overlay_opencl.c | 58 ---
> --
> >  1 file changed, 46 insertions(+), 12 deletions(-)
> >
> > diff --git a/libavfilter/vf_overlay_opencl.c 
> > b/libavfilter/vf_overlay_opencl.c
> > index e9c8532..320c1a5 100644
> > --- a/libavfilter/vf_overlay_opencl.c
> > +++ b/libavfilter/vf_overlay_opencl.c
> > @@ -37,7 +37,7 @@ typedef struct OverlayOpenCLContext {
> >
> >  FFFrameSync  fs;
> >
> > -int  nb_planes;
> > +int  nb_color_planes;
> 
> This name change seems wrong - it includes the luma plane, which does not
> contain colour information.
> 
> >  int  x_subsample;
> >  int  y_subsample;
> >  int  alpha_separate;
> > @@ -46,6 +46,22 @@ typedef struct OverlayOpenCLContext {
> >  int  y_position;
> >  } OverlayOpenCLContext;
> >
> > +static int has_planar_alpha(const AVPixFmtDescriptor *fmt) {
> 
> { on new line.
> 
> > +int nb_components;
> > +int has_alpha = !!(fmt->flags & AV_PIX_FMT_FLAG_ALPHA);
> > +if (!has_alpha) return 0;
> 
> So, if the format does not not not contain alpha?  Perhaps instead write:
> 
> if (!(fmt->flags & AV_PIX_FMT_FLAG_ALPHA))
> return 0;
> 
> > +
> > +nb_components = fmt->nb_components;
> > +// PAL8
> > +if (nb_components < 2) return 0;
> 
> Check AV_PIX_FMT_FLAG_PAL instead?
> 
> > +
> > +if (fmt->comp[nb_components - 1].plane >
> > +fmt->comp[nb_components - 2].plane)
> > +return 1;
> > +else
> > +return 0;
> > +}
> > +
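
For illustration, the helper with the review comments above applied might look roughly like the following standalone sketch. ExamplePixDesc and the EX_FLAG_* constants are hypothetical stand-ins for AVPixFmtDescriptor and the AV_PIX_FMT_FLAG_* flags; the nb_components guard is kept only to protect the array indexing.

```c
/* Hypothetical stand-ins for the pixel format flags. */
#define EX_FLAG_PAL   (1u << 1)
#define EX_FLAG_ALPHA (1u << 7)

typedef struct ExampleComp {
    int plane; /* index of the plane holding this component */
} ExampleComp;

typedef struct ExamplePixDesc {
    int nb_components;
    unsigned flags;
    ExampleComp comp[4];
} ExamplePixDesc;

/* Returns 1 if alpha lives in its own plane, 0 otherwise. */
static int example_has_planar_alpha(const ExamplePixDesc *fmt)
{
    int n = fmt->nb_components;

    if (!(fmt->flags & EX_FLAG_ALPHA))
        return 0;
    /* Paletted formats keep alpha in the palette, not in a plane. */
    if (fmt->flags & EX_FLAG_PAL)
        return 0;
    if (n < 2)
        return 0;

    /* Alpha is the last component; it is planar if it sits on a later
     * plane than the component before it. */
    return fmt->comp[n - 1].plane > fmt->comp[n - 2].plane;
}
```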
> >  static int overlay_opencl_load(AVFilterContext *avctx,
> > enum AVPixelFormat main_format,
> > enum AVPixelFormat overlay_format)
> > @@ -55,10 +71,13 @@ static int overlay_opencl_load(AVFilterContext *avctx,
> >  const char *source = ff_opencl_source_overlay;
> >  const char *kernel;
> >  const AVPixFmtDescriptor *main_desc, *overlay_desc;
> > -int err, i, main_planes, overlay_planes;
> > +int err, i, main_planes, overlay_planes, overlay_alpha,
> > +main_planar_alpha, overlay_planar_alpha;
> >
> >  main_desc= av_pix_fmt_desc_get(main_format);
> >  overlay_desc = av_pix_fmt_desc_get(overlay_format);
> > +overlay_alpha = !!(overlay_desc->flags & AV_PIX_FMT_FLAG_ALPHA);
> > +main_planar_alpha = has_planar_alpha(main_desc);
> >
> >  main_planes = overlay_planes = 0;
> >  for (i = 0; i < main_desc->nb_components; i++)
> > @@ -68,7 +87,7 @@ static int overlay_opencl_load(AVFilterContext *avctx,
> >  overlay_planes = FFMAX(overlay_planes,
> > overlay_desc->comp[i].plane + 1);
> >
> > -ctx->nb_planes = main_planes;
> > +ctx->nb_color_planes = main_planar_alpha ? (main_planes - 1) :
> main_planes;
> >  ctx->x_subsample = 1 << main_desc->log2_chroma_w;
> >  ctx->y_subsample = 1 << main_desc->log2_chroma_h;
> >
> > @@ -80,15 +99,30 @@ static int overlay_opencl_load(AVFilterContext *avctx,
> > ctx->x_subsample, ctx->y_subsample);
> >  }
> >
> > -if (main_planes == overlay_planes) {
> > -if (main_desc->nb_components == overlay_desc->nb_components)
> > -kernel = "overlay_no_alpha";
> > -else
> > -kernel = "overlay_internal_alpha";
> > +if ((main_desc->flags & AV_PIX_FMT_FLAG_RGB) !=
> > +(overlay_desc->flags & AV_PIX_FMT_FLAG_RGB)) {
> > +av_log(avctx, AV_LOG_ERROR, "mixed YUV/RGB input formats.\n");
> > +return AVERROR(EINVAL);
> > +}
> > +
> > +if (main_desc->log2_chroma_w != overlay_desc->log2_chroma_w ||
> > +main_desc->log2_chroma_h != overlay_desc->log2_chroma_h) {
> > +av_log(avctx, AV_LOG_ERROR, "incompatible chroma sub-sampling.\n");
> > +return AVERROR(EINVAL);
> > +}
> > +
> > +if (!overlay_alpha) {
> >  ctx->alpha_separate = 0;
> > +kernel = "overlay_no_alpha";
> >  } else {
> > -

Re: [FFmpeg-devel] [PATCH 2/4] lavfi/opencl: Handle overlay input formats correctly.

2018-11-09 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Li, Zhong
> Sent: Thursday, November 8, 2018 11:39 AM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH 2/4] lavfi/opencl: Handle overlay input
> formats correctly.
> 
> > > -Original Message-
> > > From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On
> > Behalf
> > > Of Li, Zhong
> > > Sent: Wednesday, November 7, 2018 4:37 PM
> > > To: FFmpeg development discussions and patches
> > > 
> > > Subject: Re: [FFmpeg-devel] [PATCH 2/4] lavfi/opencl: Handle overlay
> > > input formats correctly.
> > >
> > > > > > > -Original Message-
> > > > > > > From: Song, Ruiling
> > > > > > > Sent: Monday, October 29, 2018 1:18 PM
> > > > > > > To: ffmpeg-devel@ffmpeg.org
> > > > > > > Cc: Song, Ruiling 
> > > > > > > Subject: [PATCH 2/4] lavfi/opencl: Handle overlay input
> > > > > > > formats
> > > > correctly.
> > > > > > >
> > > > > > > The main input may have alpha channel, we just ignore it.
> > > > > > > Also add some checks for incompatible input formats.
> > > > > > >
> > > > > > > Signed-off-by: Ruiling Song 
> > > > > LGTM.
> > > > > BTW, could the main input with alpha case be supported?
> > > >
> > > > I am not sure what kind of support do you mean?
> > > > I simply ignore the alpha channel of the main input, and do the
> > > > alpha blending using the overlay alpha.
> > > > Before this patch, the filter will do it wrong if the main input has
> > > > alpha channel. Now it works with this patch.
> > > >
> > > > Thanks!
> > > > Ruiling
> > >
> > > I mean support alpha blending with alpha channel of main input when no
> > > overlay alpha.
> > I think I got your idea, this patch aims to fix the issues reported by Gyan.
> > If people really want it (blending using alpha of main input) be supported, 
> > we
> > can add it later.
> > I am not sure whether this sounds ok?
> >
> > Thanks!
> > Ruiling
> 
> Sure, sound good.

Can we merge this patch? Any objection or concern?

Ruiling


Re: [FFmpeg-devel] [PATCH 2/4] lavfi/opencl: Handle overlay input formats correctly.

2018-11-07 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Li, Zhong
> Sent: Wednesday, November 7, 2018 4:37 PM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH 2/4] lavfi/opencl: Handle overlay input
> formats correctly.
> 
> > > > > -Original Message-
> > > > > From: Song, Ruiling
> > > > > Sent: Monday, October 29, 2018 1:18 PM
> > > > > To: ffmpeg-devel@ffmpeg.org
> > > > > Cc: Song, Ruiling 
> > > > > Subject: [PATCH 2/4] lavfi/opencl: Handle overlay input formats
> > correctly.
> > > > >
> > > > > The main input may have alpha channel, we just ignore it.
> > > > > Also add some checks for incompatible input formats.
> > > > >
> > > > > Signed-off-by: Ruiling Song 
> > > LGTM.
> > > BTW, could the main input with alpha case be supported?
> >
> > I am not sure what kind of support do you mean?
> > I simply ignore the alpha channel of the main input, and do the alpha
> > blending using the overlay alpha.
> > Before this patch, the filter will do it wrong if the main input has alpha
> > channel. Now it works with this patch.
> >
> > Thanks!
> > Ruiling
> 
> I mean support alpha blending with alpha channel of main input when no overlay
> alpha.
I think I understand your idea. This patch only aims to fix the issues reported by Gyan.
If people really want blending with the alpha of the main input to be supported, we 
can add it later.
Does that sound OK?

Thanks!
Ruiling



Re: [FFmpeg-devel] [PATCH 2/4] lavfi/opencl: Handle overlay input formats correctly.

2018-11-07 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Li, Zhong
> Sent: Wednesday, November 7, 2018 2:58 PM
> To: FFmpeg development discussions and patches 
> Subject: Re: [FFmpeg-devel] [PATCH 2/4] lavfi/opencl: Handle overlay input
> formats correctly.
> 
> > > -Original Message-
> > > From: Song, Ruiling
> > > Sent: Monday, October 29, 2018 1:18 PM
> > > To: ffmpeg-devel@ffmpeg.org
> > > Cc: Song, Ruiling 
> > > Subject: [PATCH 2/4] lavfi/opencl: Handle overlay input formats correctly.
> > >
> > > The main input may have alpha channel, we just ignore it.
> > > Also add some checks for incompatible input formats.
> > >
> > > Signed-off-by: Ruiling Song 
> LGTM.
> BTW, could the main input with alpha case be supported?

I am not sure what kind of support you mean.
I simply ignore the alpha channel of the main input and do the alpha blending 
using the overlay alpha.
Before this patch, the filter produced wrong output if the main input had an alpha 
channel. With this patch it works.

Thanks!
Ruiling
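
To make "blending using the overlay alpha" concrete, here is a minimal per-sample sketch. It assumes 8-bit samples and straight (non-premultiplied) alpha, and the name example_blend is hypothetical; the actual OpenCL kernels may work on normalized floats instead.

```c
/* Blend one 8-bit sample: out = overlay * a + main * (1 - a),
 * with a = overlay_alpha / 255, rounded to nearest.
 * Any alpha carried by the main input is not consulted at all. */
static unsigned char example_blend(unsigned char main_px,
                                   unsigned char overlay_px,
                                   unsigned char overlay_alpha)
{
    unsigned int a = overlay_alpha;

    return (unsigned char)((overlay_px * a + main_px * (255u - a) + 127u) / 255u);
}
```

With alpha 0 the main sample passes through unchanged, and with alpha 255 the overlay sample replaces it.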


Re: [FFmpeg-devel] [PATCH 2/4] lavfi/opencl: Handle overlay input formats correctly.

2018-11-06 Thread Song, Ruiling


> -Original Message-
> From: Song, Ruiling
> Sent: Monday, October 29, 2018 1:18 PM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Song, Ruiling 
> Subject: [PATCH 2/4] lavfi/opencl: Handle overlay input formats correctly.
> 
> The main input may have alpha channel, we just ignore it.
> Also add some checks for incompatible input formats.
> 
> Signed-off-by: Ruiling Song 
> ---
>  libavfilter/vf_overlay_opencl.c | 58 -
> 
>  1 file changed, 46 insertions(+), 12 deletions(-)
> 
> diff --git a/libavfilter/vf_overlay_opencl.c b/libavfilter/vf_overlay_opencl.c
> index e9c8532..320c1a5 100644
> --- a/libavfilter/vf_overlay_opencl.c
> +++ b/libavfilter/vf_overlay_opencl.c
> @@ -37,7 +37,7 @@ typedef struct OverlayOpenCLContext {
> 
>  FFFrameSync  fs;
> 
> -int  nb_planes;
> +int  nb_color_planes;
>  int  x_subsample;
>  int  y_subsample;
>  int  alpha_separate;
> @@ -46,6 +46,22 @@ typedef struct OverlayOpenCLContext {
>  int  y_position;
>  } OverlayOpenCLContext;
> 
> +static int has_planar_alpha(const AVPixFmtDescriptor *fmt) {
> +int nb_components;
> +int has_alpha = !!(fmt->flags & AV_PIX_FMT_FLAG_ALPHA);
> +if (!has_alpha) return 0;
> +
> +nb_components = fmt->nb_components;
> +// PAL8
> +if (nb_components < 2) return 0;
> +
> +if (fmt->comp[nb_components - 1].plane >
> +fmt->comp[nb_components - 2].plane)
> +return 1;
> +else
> +return 0;
> +}
> +
>  static int overlay_opencl_load(AVFilterContext *avctx,
> enum AVPixelFormat main_format,
> enum AVPixelFormat overlay_format)
> @@ -55,10 +71,13 @@ static int overlay_opencl_load(AVFilterContext *avctx,
>  const char *source = ff_opencl_source_overlay;
>  const char *kernel;
>  const AVPixFmtDescriptor *main_desc, *overlay_desc;
> -int err, i, main_planes, overlay_planes;
> +int err, i, main_planes, overlay_planes, overlay_alpha,
> +main_planar_alpha, overlay_planar_alpha;
> 
>  main_desc= av_pix_fmt_desc_get(main_format);
>  overlay_desc = av_pix_fmt_desc_get(overlay_format);
> +overlay_alpha = !!(overlay_desc->flags & AV_PIX_FMT_FLAG_ALPHA);
> +main_planar_alpha = has_planar_alpha(main_desc);
> 
>  main_planes = overlay_planes = 0;
>  for (i = 0; i < main_desc->nb_components; i++)
> @@ -68,7 +87,7 @@ static int overlay_opencl_load(AVFilterContext *avctx,
>  overlay_planes = FFMAX(overlay_planes,
> overlay_desc->comp[i].plane + 1);
> 
> -ctx->nb_planes = main_planes;
> +ctx->nb_color_planes = main_planar_alpha ? (main_planes - 1) : 
> main_planes;
>  ctx->x_subsample = 1 << main_desc->log2_chroma_w;
>  ctx->y_subsample = 1 << main_desc->log2_chroma_h;
> 
> @@ -80,15 +99,30 @@ static int overlay_opencl_load(AVFilterContext *avctx,
> ctx->x_subsample, ctx->y_subsample);
>  }
> 
> -if (main_planes == overlay_planes) {
> -if (main_desc->nb_components == overlay_desc->nb_components)
> -kernel = "overlay_no_alpha";
> -else
> -kernel = "overlay_internal_alpha";
> +if ((main_desc->flags & AV_PIX_FMT_FLAG_RGB) !=
> +(overlay_desc->flags & AV_PIX_FMT_FLAG_RGB)) {
> +av_log(avctx, AV_LOG_ERROR, "mixed YUV/RGB input formats.\n");
> +return AVERROR(EINVAL);
> +}
> +
> +if (main_desc->log2_chroma_w != overlay_desc->log2_chroma_w ||
> +main_desc->log2_chroma_h != overlay_desc->log2_chroma_h) {
> +av_log(avctx, AV_LOG_ERROR, "incompatible chroma sub-sampling.\n");
> +return AVERROR(EINVAL);
> +}
> +
> +if (!overlay_alpha) {
>  ctx->alpha_separate = 0;
> +kernel = "overlay_no_alpha";
>  } else {
> -kernel = "overlay_external_alpha";
> -ctx->alpha_separate = 1;
> +overlay_planar_alpha = has_planar_alpha(overlay_desc);
> +if (overlay_planar_alpha) {
> +ctx->alpha_separate = 1;
> +kernel = "overlay_external_alpha";
> +} else {
> +ctx->alpha_separate = 0;
> +kernel = "overlay_internal_alpha";
> +}
>  }
> 
>  av_log(avctx, AV_LOG_DEBUG, "Using kernel %s.\n", kernel);
> @@ -155,7 +189,7 
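
The kernel selection in the hunk above boils down to a small decision table. Below is a standalone sketch of it; the function name example_select_kernel is hypothetical, while the kernel names and the alpha_separate values are taken from the diff.

```c
/* Mirrors the selection logic from the patch:
 *   overlay without alpha     -> "overlay_no_alpha",       alpha_separate = 0
 *   overlay with planar alpha -> "overlay_external_alpha", alpha_separate = 1
 *   overlay with packed alpha -> "overlay_internal_alpha", alpha_separate = 0
 */
static const char *example_select_kernel(int overlay_alpha,
                                         int overlay_planar_alpha,
                                         int *alpha_separate)
{
    if (!overlay_alpha) {
        *alpha_separate = 0;
        return "overlay_no_alpha";
    }
    if (overlay_planar_alpha) {
        *alpha_separate = 1;
        return "overlay_external_alpha";
    }
    *alpha_separate = 0;
    return "overlay_internal_alpha";
}
```

Note that the main input's alpha plays no role in the choice, which matches the commit message.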

Re: [FFmpeg-devel] [PATCH 1/4] doc/filters: add document for opencl filters

2018-10-29 Thread Song, Ruiling


> -Original Message-
> From: Song, Ruiling
> Sent: Monday, October 29, 2018 1:57 PM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Song, Ruiling ; Danil Iashchenko
> 
> Subject: [PATCH 1/4] doc/filters: add document for opencl filters
> 
> Signed-off-by: Danil Iashchenko 
> Signed-off-by: Ruiling Song 
> ---
> Seems like Danil is not working on this recently.
> So I re-submit this patch to address the comment over overlay_opencl.
> 
> Thanks!
> Ruiling
Sorry, I sent the patch set again by accident. The patches are exactly the same.

Thanks!
Ruiling


Re: [FFmpeg-devel] [PATCH V3] Add a filter implementing HDR image reconstruction from a single exposure using deep CNNs

2018-10-22 Thread Song, Ruiling
> Thanks for the link, however i'm still not sold on the term. You "generate"
> hdr data, not "reconstruct": it's generated/estimated/made up data, not
> data that is lost and needs to be reconstructed. I suggested "tonemap"
> because you're mapping SDR tones (aka colors) to HDR ones, and that seems
> the right term to use. If you really dislike it, at least consider "HDR
> image generation from a single exposure using deep CNNs" which would work
I think "inverse/reverse tone mapping" looks a little better. I see many papers 
use this term when talking about SDR-to-HDR conversion.

> much better.
> --
> Vittorio


Re: [FFmpeg-devel] [PATCH 1/4] vf_tonemap: Update the default peak values

2018-08-03 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Vittorio Giovara
> Sent: Wednesday, July 25, 2018 8:47 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH 1/4] vf_tonemap: Update the default peak
> values
> 
> When there is no metadata attached to a frame, take into account both
> the PQ and HLG transfers, and change the HLG default value to 10:
> the value of 12 is the maximum range in scene referred light, but
> the reference OOTF maps this from 0 to 1000 cd/m² on the ideal HLG
> monitor.
The patch-set looks good to me.

Ruiling
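
As a rough illustration of the defaults described in the commit message, the fallback peak can be thought of as the nominal display peak divided by a reference white. This is only a sketch: the 100 cd/m² reference white, the enum and the function name are assumptions, and 10000 cd/m² is the nominal PQ maximum. The return value 0 for other transfers is a placeholder meaning "no default chosen here".

```c
/* Hypothetical transfer identifiers for the sketch. */
typedef enum ExampleTransfer {
    EX_TRC_PQ,
    EX_TRC_HLG,
    EX_TRC_OTHER
} ExampleTransfer;

/* Assumed reference white in cd/m^2. */
#define EX_REFERENCE_WHITE 100.0

/* Default signal peak (relative to reference white) when a frame
 * carries no metadata. */
static double example_default_peak(ExampleTransfer trc)
{
    switch (trc) {
    case EX_TRC_PQ:
        return 10000.0 / EX_REFERENCE_WHITE; /* nominal PQ maximum */
    case EX_TRC_HLG:
        return 1000.0 / EX_REFERENCE_WHITE;  /* ideal HLG monitor: 10 */
    default:
        return 0.0; /* placeholder: caller must decide */
    }
}
```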


Re: [FFmpeg-devel] [PATCH 1/2] docs/filters: add documentation to all existing OpenCL filters

2018-08-01 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Danil Iashchenko
> Sent: Tuesday, July 31, 2018 8:14 AM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Danil Iashchenko 
> Subject: [FFmpeg-devel] [PATCH 1/2] docs/filters: add documentation to all
> existing OpenCL filters
> 
> docs/filters: add documentation to all existing OpenCL filters
> 
> ---
> 
> Thanks for your comments! Most of the issues have been fixed.
> 
> >The filter source has many more options defined.
> >In addition, there are many missing values for the tonemap algo; it appears 
> >not
> all are effected. If that's the case, remove or note that in the AVOptions 
> table.
> 
> It seems I'd need more time than expected to fully acquaint myself with
> tonemap_opencl so as to write appropriately detailed documentation, (as you
> said, there are a lot of options, and it's unclear which ones are actually 
> working).
> This hinders overall progress on the documentation and filter implementation 
> of
> my GSoC project and there is not much time left. I suggest putting it on the
> backburner for the moment and leaving it out until the next patch.
> Again, only tonemap_opencl presents an issue; the rest of the documentation is
> fine.
I am sorry for the late reply. I have just written a patch that adds the tonemap_opencl 
filter documentation.
It is based on this patch.

Thanks!
Ruiling 



Re: [FFmpeg-devel] [PATCH] lavfi/convolution_opencl: implement CL_FAIL_ON_ERR macro

2018-07-12 Thread Song, Ruiling


> -Original Message-
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of
> Danil Iashchenko
> Sent: Thursday, July 12, 2018 7:02 AM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Danil Iashchenko 
> Subject: [FFmpeg-devel] [PATCH] lavfi/convolution_opencl: implement
> CL_FAIL_ON_ERR macro
Hi Danil,

The patch looks good, but I think a title like "switch to use CL_FAIL_ON_ERROR" 
or "use CL_FAIL_ON_ERROR for error handling" would be more appropriate.

Ruiling
> 
> ---
>  libavfilter/vf_convolution_opencl.c | 46 
> +
>  1 file changed, 11 insertions(+), 35 deletions(-)


