[FFmpeg-devel] [PATCH] libavfi/dnn: enable LibTorch xpu device option support

2024-06-02 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Add xpu device support to libtorch backend. To enable xpu support you need to add "-Wl,--no-as-needed -lintel-ext-pt-gpu -Wl,--as-needed" to "--extra-libs" when configuring FFmpeg. Signed-off-by: Wenbin Chen --- libavfilter/dnn/dnn_backend_torch.cpp | 16

[FFmpeg-devel] [PATCH v1 2/2][GSoC 2024] tests/checkasm/vvc_mc: for SAD, only test valid subblock sizes

2024-05-28 Thread Stone Chen
According to the VVC specification (section 8.5.1), the maximum width/height of a subblock passed for DMVR SAD is 16. This along with previous constraint requiring width * height >= 128 means that 8x16, 16x8, and 16x16 are the only allowed sizes. This changes check_vvc_sad() to only test and
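The size constraint described above can be checked with a small enumeration. This is only a sketch, not code from the patch: the helper names are hypothetical, and it assumes that block dimensions are powers of two with a minimum of 8 (the w >= 8, h >= 8 condition is quoted elsewhere in this thread).

```c
/* Hypothetical helper, not taken from the patch: returns 1 when a DMVR SAD
 * subblock of size w x h satisfies the constraints quoted in the message. */
static int sad_size_is_valid(int w, int h)
{
    /* Assumption: dimensions are powers of two, each at least 8. */
    int pow2 = (w & (w - 1)) == 0 && (h & (h - 1)) == 0;

    return pow2 && w >= 8 && h >= 8   /* assumed lower bound             */
        && w <= 16 && h <= 16         /* maximum subblock width/height   */
        && w * h >= 128;              /* earlier constraint on the area  */
}

/* Counts the valid sizes among power-of-two candidates from 8 to 128. */
static int count_valid_sad_sizes(void)
{
    int count = 0;

    for (int w = 8; w <= 128; w <<= 1)
        for (int h = 8; h <= 128; h <<= 1)
            count += sad_size_is_valid(w, h);
    return count;
}
```

Under these assumptions only 8x16, 16x8 and 16x16 pass, which matches the three sizes named in the message.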

[FFmpeg-devel] [PATCH v1 1/2][GSoC 2024] libavcode/x86/vvc: change label to vvc_sad_16 to reflect block sizes

2024-05-28 Thread Stone Chen
According to the VVC specification (section 8.5.1), the maximum width/height of a subblock passed for DMVR SAD is 16. This along with previous constraint requiring width * height >= 128 means that 8x16, 16x8, and 16x16 are the only allowed sizes. This re-labels vvc_sad_16_128 to vvc_sad_16 to

Re: [FFmpeg-devel] [PATCH v5 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-23 Thread Stone Chen
On Thu, May 23, 2024 at 9:18 AM Nuo Mi wrote: > On Thu, May 23, 2024 at 7:38 AM James Almer wrote: > > > On 5/21/2024 10:01 PM, Ronald S. Bultje wrote: > > > Hi, > > > > > > On Tue, May 21, 2024 at 8:01 PM Stone Chen > > wrote: > > >

Re: [FFmpeg-devel] [PATCH v4 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-21 Thread Stone Chen
On Mon, May 20, 2024 at 7:23 AM Ronald S. Bultje wrote: > Hi, > > This is mostly good, the following is tiny nitpicks. > > On Sun, May 19, 2024 at 8:46 PM Stone Chen > wrote: > >> +%macro INIT_OFFSET 6 ; src1, src2, dxq, dyq, off1, off2 >> > > The macro

[FFmpeg-devel] [PATCH v5 2/2][GSoC 2024] tests/checkasm: Add check_vvc_sad to vvc_mc.c

2024-05-21 Thread Stone Chen
Adds checkasm for DMVR SAD AVX2 implementation.
Benchmarks (AMD 7940HS):
  vvc_sad_8x8_c: 50.3
  vvc_sad_8x8_avx2: 0.3
  vvc_sad_16x16_c: 250.3
  vvc_sad_16x16_avx2: 10.3
  vvc_sad_32x32_c: 1020.3
  vvc_sad_32x32_avx2: 60.3
  vvc_sad_64x64_c: 3850.3
  vvc_sad_64x64_avx2: 220.3
  vvc_sad_128x128_c: 14100.3

[FFmpeg-devel] [PATCH v5 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-21 Thread Stone Chen
codec/x86/vvc/vvc_sad.asm new file mode 100644 index 00..9766446b11 --- /dev/null +++ b/libavcodec/x86/vvc/vvc_sad.asm @@ -0,0 +1,130 @@ +; /* +; * Provide SIMD DMVR SAD functions for VVC decoding +; * +; * Copyright (c) 2024 Stone Chen +; * +; * This file is part of FFmpeg. +; * +; * FFmpeg is

[FFmpeg-devel] [PATCH v4 2/2][GSoC 2024] tests/checkasm: Add check_vvc_sad to vvc_mc.c

2024-05-19 Thread Stone Chen
Adds checkasm for DMVR SAD AVX2 implementation.
Benchmarks (AMD 7940HS):
  vvc_sad_8x8_c: 70.0
  vvc_sad_8x8_avx2: 10.0
  vvc_sad_16x16_c: 280.0
  vvc_sad_16x16_avx2: 20.0
  vvc_sad_32x32_c: 1020.0
  vvc_sad_32x32_avx2: 70.0
  vvc_sad_64x64_c: 3560.0
  vvc_sad_64x64_avx2: 270.0
  vvc_sad_128x128_c: 13760.0

[FFmpeg-devel] [PATCH v4 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-19 Thread Stone Chen
codec/x86/vvc/vvc_sad.asm new file mode 100644 index 00..58a24635d2 --- /dev/null +++ b/libavcodec/x86/vvc/vvc_sad.asm @@ -0,0 +1,138 @@ +; /* +; * Provide SIMD DMVR SAD functions for VVC decoding +; * +; * Copyright (c) 2024 Stone Chen +; * +; * This file is part of FFmpeg. +; * +; * FFmpe

Re: [FFmpeg-devel] [PATCH v3 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-19 Thread Stone Chen
On Sat, May 18, 2024 at 11:33 AM Ronald S. Bultje wrote: > Hi, > > On Tue, May 14, 2024 at 4:40 PM Stone Chen > wrote: > >> +vvc_sad_8: >> +.loop_height: >> +movu xm0, [src1q] >> +movu xm1, [src2q] >

Re: [FFmpeg-devel] [PATCH v3 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-18 Thread Stone Chen
On Sat, May 18, 2024 at 9:04 AM Ronald S. Bultje wrote: > Hi, > > On Tue, May 14, 2024 at 4:40 PM Stone Chen > wrote: > >> Implements AVX2 DMVR (decoder-side motion vector refinement) SAD >> functions. DMVR SAD is only calculated if w >= 8, h >= 8, and w * h >

[FFmpeg-devel] [PATCH v3 2/2][GSoC 2024] tests/checkasm: Add check_vvc_sad to vvc_mc.c

2024-05-14 Thread Stone Chen
Adds checkasm for DMVR SAD AVX2 implementation.
Benchmarks (AMD 7940HS):
  vvc_sad_8x8_c: 63.0
  vvc_sad_8x8_avx2: 3.0
  vvc_sad_16x16_c: 263.0
  vvc_sad_16x16_avx2: 23.0
  vvc_sad_32x32_c: 1003.0
  vvc_sad_32x32_avx2: 83.0
  vvc_sad_64x64_c: 3923.0
  vvc_sad_64x64_avx2: 373.0
  vvc_sad_128x128_c: 17533.0

[FFmpeg-devel] [PATCH v3 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-14 Thread Stone Chen
c/x86/vvc/vvc_sad.asm new file mode 100644 index 00..530142ad35 --- /dev/null +++ b/libavcodec/x86/vvc/vvc_sad.asm @@ -0,0 +1,157 @@ +; /* +; * Provide SIMD DMVR SAD functions for VVC decoding +; * +; * Copyright (c) 2024 Stone Chen +; * +; * This file is part of FFmpeg. +; * +; * FFmpe

[FFmpeg-devel] [PATCH v2 2/2][GSoC 2024] Terminal tests/checkasm: Add check_vvc_sad to vvc_mc.c

2024-05-11 Thread Stone Chen
Adds checkasm for DMVR SAD AVX2 implementation.
Benchmarks (AMD 7940HS):
  vvc_sad_8x8_c: 63.0
  vvc_sad_8x8_avx2: 3.0
  vvc_sad_16x16_c: 263.0
  vvc_sad_16x16_avx2: 23.0
  vvc_sad_32x32_c: 1003.0
  vvc_sad_32x32_avx2: 83.0
  vvc_sad_64x64_c: 3923.0
  vvc_sad_64x64_avx2: 373.0
  vvc_sad_128x128_c: 17533.0

[FFmpeg-devel] [PATCH v2 1/2][GSoC] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-11 Thread Stone Chen
1184c731c --- /dev/null +++ b/libavcodec/x86/vvc/vvc_sad.asm @@ -0,0 +1,155 @@ +; /* +; * Provide SIMD DMVR SAD functions for VVC decoding +; * +; * Copyright (c) 2024 Stone Chen +; * +; * This file is part of FFmpeg. +; * +; * FFmpeg is free software; you can redistribute it and/or +; * modify

Re: [FFmpeg-devel] [PATCH 1/3][GSoC 2024] libavcodec/vvc: convert (*sad) to (*sad[6]) to prepare for AVX2 funcs

2024-05-06 Thread Stone Chen
On Wed, May 1, 2024 at 6:59 PM Andreas Rheinhardt < andreas.rheinha...@outlook.com> wrote: > Stone Chen: > > To prepare for adding AVX2 functions for different block widths, change > VVCInterDSPContext to contain (*sad[6]) instead of (*sad). This also > default initiali

[FFmpeg-devel] [PATCH 3/3][GSoC 2024] tests/checkasm: Add check_vvc_sad to vvc_mc.c

2024-05-01 Thread Stone Chen
Adds checkasm for DMVR SAD AVX2 implementation.
Benchmarks (AMD 7940HS):
  vvc_sad_8_16bpc_c: 112.5
  vvc_sad_8_16bpc_avx2: 2.5
  vvc_sad_16_16bpc_c: 232.5
  vvc_sad_16_16bpc_avx2: 22.5
  vvc_sad_32_16bpc_c: 912.5
  vvc_sad_32_16bpc_avx2: 82.5
  vvc_sad_64_16bpc_c: 3582.5
  vvc_sad_64_16bpc_avx2: 392.5

[FFmpeg-devel] [PATCH 2/3][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-01 Thread Stone Chen
ull +++ b/libavcodec/x86/vvc/vvc_sad.asm @@ -0,0 +1,193 @@ +; /* +; * Provide SIMD DMVR SAD functions for VVC decoding +; * +; * Copyright (c) 2024 Stone Chen +; * +; * This file is part of FFmpeg. +; * +; * FFmpeg is free software; you can redistribute it and/or +; * modify it under the terms of the G

[FFmpeg-devel] [PATCH 1/3][GSoC 2024] libavcodec/vvc: convert (*sad) to (*sad[6]) to prepare for AVX2 funcs

2024-05-01 Thread Stone Chen
To prepare for adding AVX2 functions for different block widths, change VVCInterDSPContext to contain (*sad[6]) instead of (*sad). This also default initializes the pointer array with the scalar function and the calling sites to jump to the correct function based on block width. There's no
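The idea of a per-width function pointer table can be sketched as follows. This is an illustration under stated assumptions, not the patch itself: the function signature, the table size macro, the width-to-index mapping (widths 4 to 128 mapping to indices 0 to 5) and the plain full-block SAD are all hypothetical stand-ins.

```c
#include <stdint.h>
#include <stdlib.h>

#define SAD_TABLE_SIZE 6

/* Hypothetical signature; the real prototype in the patch may differ. */
typedef int (*sad_fn)(const int16_t *src0, const int16_t *src1,
                      int stride, int w, int h);

/* Simplified scalar reference: plain sum of absolute differences. */
static int sad_scalar(const int16_t *src0, const int16_t *src1,
                      int stride, int w, int h)
{
    int sum = 0;

    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            sum += abs(src0[y * stride + x] - src1[y * stride + x]);
    return sum;
}

/* Assumed mapping: widths 4, 8, 16, 32, 64, 128 map to indices 0..5. */
static int width_to_index(int w)
{
    int idx = 0;

    while (w > 4) {
        w >>= 1;
        idx++;
    }
    return idx;
}

/* Default-initialize every slot with the scalar function. */
static void sad_table_init(sad_fn table[SAD_TABLE_SIZE])
{
    for (int i = 0; i < SAD_TABLE_SIZE; i++)
        table[i] = sad_scalar;
}

/* Call site: pick the function from the table based on block width. */
static int sad_dispatch(sad_fn table[SAD_TABLE_SIZE], const int16_t *src0,
                        const int16_t *src1, int stride, int w, int h)
{
    return table[width_to_index(w)](src0, src1, stride, w, h);
}
```

With this layout, a SIMD initializer would only need to overwrite the slots for the widths it supports, leaving the scalar fallback in place for the rest.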

Re: [FFmpeg-devel] [PATCH WIP v2 1/9] avfilter/dnn: Refactor DNN parameter configuration system

2024-04-29 Thread Chen, Wenbin
> -Original Message- > From: ffmpeg-devel On Behalf Of Zhao > Zhili > Sent: Sunday, April 28, 2024 2:47 PM > To: ffmpeg-devel@ffmpeg.org > Cc: Zhao Zhili > Subject: [FFmpeg-devel] [PATCH WIP v2 1/9] avfilter/dnn: Refactor DNN > parameter configuration system > > From: Zhao Zhili > + >

Re: [FFmpeg-devel] [PATCH WIP 0/9] Refactor DNN

2024-04-29 Thread Chen, Wenbin
> > On Apr 29, 2024, at 18:29, Guo, Yejun > wrote: > > > > > > > >> -Original Message- > >> From: ffmpeg-devel On Behalf Of > Zhao > >> Zhili > >> Sent: Sunday, April 28, 2024 6:55 PM > >> To: FFmpeg development discussions and patches >> de...@ffmpeg.org> > >> Subject: Re:

Re: [FFmpeg-devel] [PATCH WIP 0/9] Refactor DNN

2024-04-28 Thread Chen, Wenbin
> > On Apr 28, 2024, at 18:58, Paul B Mahol wrote: > > > > Extremely low quality filters, both in source code quality and > > performance/security and output quality should be queued for removal. These filters cannot be removed because there are users using them. What are your suggestions to

[FFmpeg-devel] [PATCH 2/2] libavfilter/dnn_io_proc: Take step into consideration when crop frame

2024-04-02 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Signed-off-by: Wenbin Chen --- libavfilter/dnn/dnn_io_proc.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/libavfilter/dnn/dnn_io_proc.c b/libavfilter/dnn/dnn_io_proc.c index e5d6edb301..d2ec9f63f5 100644 --- a/libavfilter/dnn/dnn_io_proc.c +++ b

[FFmpeg-devel] [PATCH 1/2] libavfilter/dnn_backend_openvino: Check bbox's height

2024-04-02 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Check bbox's height with frame's height rather than frame's width. Signed-off-by: Wenbin Chen --- libavfilter/dnn/dnn_backend_openvino.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/libavfilter/dnn/dnn_backend_openvino.c b/libavfilter/dnn
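The kind of check this fix concerns can be illustrated with a minimal sketch. The function and parameter names are hypothetical and not taken from dnn_backend_openvino.c; the point is only that the vertical extent of a box must be compared with the frame height.

```c
/* Hypothetical bounds check: returns 1 when the box lies inside the frame. */
static int bbox_is_inside_frame(int x, int y, int w, int h,
                                int frame_w, int frame_h)
{
    if (x < 0 || y < 0 || w <= 0 || h <= 0)
        return 0;
    if (x + w > frame_w)   /* horizontal extent against frame width  */
        return 0;
    if (y + h > frame_h)   /* vertical extent against frame height,  */
        return 0;          /* not against frame width                */
    return 1;
}
```

For a 1920x1080 frame, a box 1200 pixels tall is rejected here, whereas a check that compared the height with the frame width would wrongly accept it.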

Re: [FFmpeg-devel] [PATCH v2] doc: Add libtoch backend option to dnn_processing

2024-03-27 Thread Chen, Wenbin
> >> -Original Message- > >> From: ffmpeg-devel On Behalf Of > >> wenbin.chen-at-intel@ffmpeg.org > >> Sent: Monday, March 25, 2024 10:15 AM > >> To: ffmpeg-devel@ffmpeg.org > >> Subject: [FFmpeg-devel] [PATCH v2] doc: Add libtoch backend option to > >> dnn_processing > > Typo in

[FFmpeg-devel] [PATCH v2] doc: Add libtoch backend option to dnn_processing

2024-03-24 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Signed-off-by: Wenbin Chen --- doc/filters.texi | 12 +++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/doc/filters.texi b/doc/filters.texi index 18f0d1c5a7..bfa8ccec8b 100644 --- a/doc/filters.texi +++ b/doc/filters.texi @@ -12073,11 +12073,21

[FFmpeg-devel] [PATCH] doc: Add libtoch backend option to dnn_processing

2024-03-21 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Signed-off-by: Wenbin Chen --- doc/filters.texi | 12 +++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/doc/filters.texi b/doc/filters.texi index 913365671d..20605e72b2 100644 --- a/doc/filters.texi +++ b/doc/filters.texi @@ -12069,11 +12069,21

Re: [FFmpeg-devel] [PATCH] Changelog: Add libtorch

2024-03-20 Thread Chen, Wenbin
> On date Wednesday 2024-03-20 16:01:36 +0800, wenbin.chen-at- > intel@ffmpeg.org wrote: > > From: Wenbin Chen > > > > Signed-off-by: Wenbin Chen > > --- > > Changelog | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/

[FFmpeg-devel] [PATCH v2] Changelog: Add libtorch

2024-03-20 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Signed-off-by: Wenbin Chen --- Changelog | 1 + 1 file changed, 1 insertion(+) diff --git a/Changelog b/Changelog index e3ca52430c..4af55ff537 100644 --- a/Changelog +++ b/Changelog @@ -35,6 +35,7 @@ version : - AEA muxer - ffmpeg CLI loopback decoders - Support

[FFmpeg-devel] [PATCH] Changelog: Add libtorch

2024-03-20 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Signed-off-by: Wenbin Chen --- Changelog | 1 + 1 file changed, 1 insertion(+) diff --git a/Changelog b/Changelog index e3ca52430c..d0c41887f3 100644 --- a/Changelog +++ b/Changelog @@ -35,6 +35,7 @@ version : - AEA muxer - ffmpeg CLI loopback decoders - Support

[FFmpeg-devel] [PATCH v6] libavfi/dnn: add LibTorch as one of DNN backend

2024-03-14 Thread wenbin . chen-at-intel . com
From: Wenbin Chen PyTorch is an open source machine learning framework that accelerates the path from research prototyping to production deployment. Official website: https://pytorch.org/. We call the C++ library of PyTorch as LibTorch, the same below. To build FFmpeg with LibTorch, please take

Re: [FFmpeg-devel] [PATCH v5] libavfi/dnn: add LibTorch as one of DNN backend

2024-03-14 Thread Chen, Wenbin
> > -Original Message- > > From: ffmpeg-devel On Behalf Of > > wenbin.chen-at-intel@ffmpeg.org > > Sent: Monday, March 11, 2024 1:02 PM > > To: ffmpeg-devel@ffmpeg.org > > Subject: [FFmpeg-devel] [PATCH v5] libavfi/dnn: add LibTorch as one of DNN

[FFmpeg-devel] [PATCH] doc/filters: Change rdiv (vf_convolution) documentation to reflect actual behavior

2024-03-14 Thread Stone Chen
The documentation correctly states that the rdiv is a multiplier but incorrectly states the default behavior is to multiply by the sum of all matrix elements - it multiplies by 1/sum. This changes the documentation to match the code. Address trac #10889 --- doc/filters.texi | 2 +- 1 file
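The default described above (multiply by 1/sum of the matrix elements) can be sketched as below. The function name is hypothetical and this is not the code from vf_convolution.c; the handling of a zero sum (falling back to a multiplier of 1) is an assumption made here to avoid division by zero.

```c
/* Hypothetical helper: default multiplier for a kernel of n coefficients. */
static float default_rdiv(const int *matrix, int n)
{
    int sum = 0;

    for (int i = 0; i < n; i++)
        sum += matrix[i];
    if (sum == 0)      /* assumption: avoid dividing by zero */
        sum = 1;
    return 1.0f / (float)sum;
}
```

For a 3x3 box blur of nine ones this gives 1/9, so the filtered output keeps the original brightness; multiplying by the sum itself (9) would instead scale the result far out of range.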

Re: [FFmpeg-devel] [PATCH] Change rdiv (vf_convolution) documentation to reflect actual behavior

2024-03-13 Thread Stone Chen
On Wed, Mar 13, 2024 at 4:26 AM Marton Balint wrote: > > > On Tue, 12 Mar 2024, Stone Chen wrote: > > > The documentation correctly states that the rdiv is a multiplier but > incorrectly states the default behavior is to multiply by the sum of all > matrix elements

[FFmpeg-devel] [PATCH] Change rdiv (vf_convolution) documentation to reflect actual behavior

2024-03-12 Thread Stone Chen
The documentation correctly states that the rdiv is a multiplier but incorrectly states the default behavior is to multiply by the sum of all matrix elements - it multiplies by 1/sum. This changes the documentation to match the code. --- doc/filters.texi | 2 +- 1 file changed, 1 insertion(+),

[FFmpeg-devel] [PATCH v5] libavfi/dnn: add LibTorch as one of DNN backend

2024-03-10 Thread wenbin . chen-at-intel . com
From: Wenbin Chen PyTorch is an open source machine learning framework that accelerates the path from research prototyping to production deployment. Official website: https://pytorch.org/. We call the C++ library of PyTorch as LibTorch, the same below. To build FFmpeg with LibTorch, please take

Re: [FFmpeg-devel] [PATCH v4] libavfi/dnn: add LibTorch as one of DNN backend

2024-03-05 Thread Chen, Wenbin
> > > On Feb 20, 2024, at 7:07 PM, wenbin.chen-at-intel@ffmpeg.org wrote: > > > > From: Wenbin Chen > > > > PyTorch is an open source machine learning framework that accelerates > > the path from research prototyping to production deployment. Officia

Re: [FFmpeg-devel] [PATCH] Add float user_rdiv[4] to allow user options to apply correctly

2024-02-24 Thread Stone Chen
On Sat, Feb 24, 2024 at 6:34 PM Marton Balint wrote: > > > On Sat, 24 Feb 2024, Stone Chen wrote: > > > On Sat, Feb 24, 2024 at 3:56 PM Marton Balint wrote: > > > >> > >> > >> On Sat, 24 Feb 2024, Stone Chen wrote: > >> > >>

Re: [FFmpeg-devel] [PATCH] Add float user_rdiv[4] to allow user options to apply correctly

2024-02-24 Thread Stone Chen
On Sat, Feb 24, 2024 at 3:56 PM Marton Balint wrote: > > > On Sat, 24 Feb 2024, Stone Chen wrote: > > > Previously to support dynamic reconfigurations of the matrix string > (e.g. 0m), the rdiv values would always be cleared to 0.f, causing the rdiv > to be recalculated

Re: [FFmpeg-devel] [PATCH] Add float user_rdiv[4] to allow user options to apply correctly

2024-02-24 Thread Stone Chen
Sorry I just realized I messed up my git commit (new to git), I've attached a patch file with that correction. On Sat, Feb 24, 2024 at 10:49 AM Stone Chen wrote: > Previously to support dynamic reconfigurations of the matrix string (e.g. > 0m), the rdiv values would always be cleared

[FFmpeg-devel] [PATCH] Add float user_rdiv[4] to allow user options to apply correctly

2024-02-24 Thread Stone Chen
Previously to support dynamic reconfigurations of the matrix string (e.g. 0m), the rdiv values would always be cleared to 0.f, causing the rdiv to be recalculated based on the new filter. This however had the side effect of always ignoring user specified rdiv values. Instead float user_rdiv[0]

Re: [FFmpeg-devel] [PATCH v3] libavfi/dnn: add LibTorch as one of DNN backend

2024-02-20 Thread Chen, Wenbin
> Hello, > > On Tue, 20 Feb 2024, at 05:48, wenbin.chen-at-intel@ffmpeg.org wrote: > > From: Wenbin Chen > > > > PyTorch is an open source machine learning framework that accelerates > > OK for me > > > the path from research prototyping to product

[FFmpeg-devel] [PATCH v4] libavfi/dnn: add LibTorch as one of DNN backend

2024-02-20 Thread wenbin . chen-at-intel . com
From: Wenbin Chen PyTorch is an open source machine learning framework that accelerates the path from research prototyping to production deployment. Official website: https://pytorch.org/. We call the C++ library of PyTorch as LibTorch, the same below. To build FFmpeg with LibTorch, please take

[FFmpeg-devel] [PATCH v3] libavfi/dnn: add LibTorch as one of DNN backend

2024-02-19 Thread wenbin . chen-at-intel . com
From: Wenbin Chen PyTorch is an open source machine learning framework that accelerates the path from research prototyping to production deployment. Official website: https://pytorch.org/. We call the C++ library of PyTorch as LibTorch, the same below. To build FFmpeg with LibTorch, please take

Re: [FFmpeg-devel] [PATCH v2] libavfi/dnn: add LibTorch as one of DNN backend

2024-02-19 Thread Chen, Wenbin
> > Hello, > > > > On Fri, 2 Feb 2024, at 08:26, wenbin.chen-at-intel@ffmpeg.org wrote: > > > +static void infer_completion_callback(void *args) { > > > +THRequestItem *request = (THRequestItem*)args; > > > +LastLevelTaskItem *lltask = request->lltask; > > > +TaskItem *task =

Re: [FFmpeg-devel] [PATCH v2] libavfi/dnn: add LibTorch as one of DNN backend

2024-02-19 Thread Chen, Wenbin
> Hello, > > On Fri, 2 Feb 2024, at 08:26, wenbin.chen-at-intel@ffmpeg.org wrote: > > +static void infer_completion_callback(void *args) { > > +THRequestItem *request = (THRequestItem*)args; > > +LastLevelTaskItem *lltask = request->lltask; > > +TaskItem *task = lltask->task; > >

Re: [FFmpeg-devel] [PATCH] Fix rdiv always being set to 0 in vf_convolution.c

2024-02-18 Thread Stone Chen
Hi Marton, Thanks for the feedback! I'm not sure what dynamic reconfiguration is, from some searching I think it might be related to commands? On Sun, Feb 18, 2024 at 7:08 PM Marton Balint wrote: > > > On Sun, 18 Feb 2024, Stone Chen wrote: > > > In commit 6c45d34

Re: [FFmpeg-devel] [PATCH] Fix rdiv always being set to 0 in vf_convolution.c

2024-02-18 Thread Stone Chen
Sorry I think I didn't correctly attach the patch the first time. On Sun, Feb 18, 2024 at 2:21 PM Stone Chen wrote: > In commit 6c45d34, a line was added that always sets rdiv to 0, overriding > any user input. This removes that line allowing user set values for 0rdiv, > 1rdiv, 2rd

[FFmpeg-devel] [PATCH] Fix rdiv always being set to 0 in vf_convolution.c

2024-02-18 Thread Stone Chen
In commit 6c45d34, a line was added that always sets rdiv to 0, overriding any user input. This removes that line allowing user set values for 0rdiv, 1rdiv, 2rdiv, 3rdiv to apply as expected. This fixes ticket #10294. Signed-off-by: Stone Chen --- libavfilter/vf_convolution.c | 1 - 1 file

Re: [FFmpeg-devel] [PATCH v2 1/1] avfilter/vf_vpp_qsv: apply 3D LUT from file.

2024-02-03 Thread Chen Yufei
On Tue, Jan 30, 2024 at 10:59 AM Chen Yufei wrote: > > On Tue, Jan 30, 2024 at 1:07 AM Anton Khirnov wrote: > > > > Quoting Chen Yufei (2024-01-29 04:01:51) > > > On Sun, Jan 28, 2024 at 10:10 PM Anton Khirnov wrote: > > > > > >

[FFmpeg-devel] [PATCH v2] libavfi/dnn: add LibTorch as one of DNN backend

2024-02-01 Thread wenbin . chen-at-intel . com
From: Wenbin Chen PyTorch is an open source machine learning framework that accelerates the path from research prototyping to production deployment. Official website: https://pytorch.org/. We call the C++ library of PyTorch as LibTorch, the same below. To build FFmpeg with LibTorch, please take

Re: [FFmpeg-devel] [PATCH v2 1/1] avfilter/vf_vpp_qsv: apply 3D LUT from file.

2024-01-29 Thread Chen Yufei
On Tue, Jan 30, 2024 at 1:07 AM Anton Khirnov wrote: > > Quoting Chen Yufei (2024-01-29 04:01:51) > > On Sun, Jan 28, 2024 at 10:10 PM Anton Khirnov wrote: > > > > > > Quoting Zhao Zhili (2024-01-28 14:51:58) > > > > > > > > >

Re: [FFmpeg-devel] [PATCH v2 1/1] avfilter/vf_vpp_qsv: apply 3D LUT from file.

2024-01-29 Thread Chen Yufei
etect LUT file type and call different parse functions. If we add another option to specify LUT file type, then vf_lut3d's command line option would require change. -- Best regards, Chen Yufei ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http

Re: [FFmpeg-devel] [PATCH v2 1/1] avfilter/vf_vpp_qsv: apply 3D LUT from file.

2024-01-28 Thread Chen Yufei
On Sun, Jan 28, 2024 at 10:10 PM Anton Khirnov wrote: > > Quoting Zhao Zhili (2024-01-28 14:51:58) > > > > > > > On Jan 28, 2024, at 18:31, Anton Khirnov wrote: > > > > > > Quoting Chen Yufei (2024-01-25 17:16:46) > > >

[FFmpeg-devel] [PATCH v3 1/1] avfilter/vf_vpp_qsv: apply 3D LUT from file.

2024-01-27 Thread Chen Yufei
Usage: "vpp_qsv=lut3d_file=" Requires oneVPL, using system memory 3D LUT surface. Signed-off-by: Chen Yufei --- libavfilter/Makefile | 8 +- libavfilter/lut3d.c | 669 +++ libavfilter/lut3d.h | 13 + libavfilter/vf_lut3d

[FFmpeg-devel] [PATCH v3 0/1] avfilter/vf_vpp_qsv: apply 3D LUT from file.

2024-01-27 Thread Chen Yufei
This version of PATCH use `QSV_RUNTIME_VERSION_ATLEAST` to apply 3D LUT when libvpl runtime API version >= 2.11. Chen Yufei (1): avfilter/vf_vpp_qsv: apply 3D LUT from file. libavfilter/Makefile | 8 +- libavfilter/lut3d.c | 669 +++ libavfil

Re: [FFmpeg-devel] [PATCH v2 1/1] avfilter/vf_vpp_qsv: apply 3D LUT from file.

2024-01-25 Thread Chen Yufei
On Wed, Jan 24, 2024 at 7:39 PM Anton Khirnov wrote: > > Quoting Chen Yufei (2024-01-20 16:14:29) > > Usage: "vpp_qsv=lut3d_file=" > > Passing file paths to a filter and having the filter load the file is > not recommended, it is generally preferable to have

Re: [FFmpeg-devel] [PATCH v2 0/1] avfilter/vf_vpp_qsv: apply 3D LUT from file

2024-01-23 Thread Chen Yufei
On Tue, Jan 23, 2024 at 10:00 AM Xiang, Haihao wrote: > > On Sa, 2024-01-20 at 23:14 +0800, Chen Yufei wrote: > > This patch adds support for applying 3D LUT from file using oneVPL VPP. > > > > PATCH v1 uses VA-API to create LUT surface. Because oneVPL can't work with

[FFmpeg-devel] [PATCH] libavfi/dnn: add LibTorch as one of DNN backend

2024-01-21 Thread wenbin . chen-at-intel . com
From: Wenbin Chen PyTorch is an open source machine learning framework that accelerates the path from research prototyping to production deployment. Official website: https://pytorch.org/. We call the C++ library of PyTorch as LibTorch, the same below. To build FFmpeg with LibTorch, please take

[FFmpeg-devel] [PATCH v2 1/1] avfilter/vf_vpp_qsv: apply 3D LUT from file.

2024-01-20 Thread Chen Yufei
Usage: "vpp_qsv=lut3d_file=" Requires oneVPL, using system memory 3D LUT surface. Signed-off-by: Chen Yufei --- libavfilter/Makefile | 8 +- libavfilter/lut3d.c | 669 +++ libavfilter/lut3d.h | 13 + libavfilter/vf_lut3d

[FFmpeg-devel] [PATCH v2 0/1] avfilter/vf_vpp_qsv: apply 3D LUT from file

2024-01-20 Thread Chen Yufei
: requires oneVPL-intel-gpu version >= 24.1.1 because this version contains a fix for creating LUT in video memory. (For details, refer to https://github.com/oneapi-src/oneVPL-intel-gpu/issues/307) Chen Yufei (1): avfilter/vf_vpp_qsv: apply 3D LUT from file. libavfilter/Makefile |

[FFmpeg-devel] [PATCH 3/3] libavfilter/vf_dnn_detect: Use class confidence to filt boxes

2024-01-16 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Use class confidence instead of box_score to filter boxes, which is more accurate. Class confidence is obtained by multiplying class probability distribution and box_score. Signed-off-by: Wenbin Chen --- libavfilter/vf_dnn_detect.c | 6 +++--- 1 file changed, 3 insertions
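The class-confidence computation described in this message can be sketched as follows. The function name, parameters and threshold handling are hypothetical and are not taken from vf_dnn_detect.c; only the product of class probability and box_score comes from the message.

```c
/* Hypothetical helper: returns the index of the best class, or -1 when the
 * best class confidence is below the threshold. On success, *out_conf holds
 * class probability * box_score for the chosen class. */
static int best_class(const float *class_probs, int nb_classes,
                      float box_score, float threshold, float *out_conf)
{
    int best = -1;
    float best_conf = 0.0f;

    for (int i = 0; i < nb_classes; i++) {
        float conf = class_probs[i] * box_score;   /* class confidence */

        if (conf > best_conf) {
            best_conf = conf;
            best = i;
        }
    }
    if (best < 0 || best_conf < threshold)
        return -1;
    *out_conf = best_conf;
    return best;
}
```

With probabilities {0.1, 0.8, 0.1} and a threshold of 0.5, a box_score of 0.6 would pass a filter on box_score alone, but its class confidence of about 0.48 is rejected here.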

[FFmpeg-devel] [PATCH 2/3] libavfilter/dnn_interface: use dims to represent shapes

2024-01-16 Thread wenbin . chen-at-intel . com
From: Wenbin Chen For detect and classify output, width and height make no sense, so change width, height to dims to represent the shape of tensor. Use layout and dims to get width, height and channel. Signed-off-by: Wenbin Chen --- libavfilter/dnn/dnn_backend_openvino.c | 80
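Deriving width, height and channel from a shape array plus a layout can be sketched like this. The enum and function names are hypothetical and do not reproduce the identifiers used in libavfilter/dnn; the sketch assumes a 4-element shape in either NCHW or NHWC order.

```c
/* Hypothetical layout tag for a 4-dimensional tensor shape. */
typedef enum {
    LAYOUT_NCHW,   /* dims = { batch, channel, height, width } */
    LAYOUT_NHWC    /* dims = { batch, height, width, channel } */
} TensorLayout;

static int dims_channel(const int dims[4], TensorLayout layout)
{
    return layout == LAYOUT_NCHW ? dims[1] : dims[3];
}

static int dims_height(const int dims[4], TensorLayout layout)
{
    return layout == LAYOUT_NCHW ? dims[2] : dims[1];
}

static int dims_width(const int dims[4], TensorLayout layout)
{
    return layout == LAYOUT_NCHW ? dims[3] : dims[2];
}
```

Keeping only dims and layout means outputs that are not images, such as detection or classification results, can still describe their shape without pretending to have a width and height.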

[FFmpeg-devel] [PATCH 1/3] libavfilter/dnn_bakcend_openvino: Add automatic input/output detection

2024-01-16 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Now when using openvino backend, user doesn't need to set input/output names in command line. Model ports will be automatically detected. For example: ffmpeg -i input.png -vf \ dnn_detect=dnn_backend=openvino:model=model.xml:input=image:\ output=detection_out -y output.png

Re: [FFmpeg-devel] [PATCH 1/3] swscale: don't assign range converters for float

2024-01-10 Thread Chen, Wenbin
> On Mon, 27 Nov 2023 02:10:11 +0000 "Chen, Wenbin" intel@ffmpeg.org> wrote: > > > > From: Niklas Haas > > > > > > > > This logic was incongruent with logic used elsewhere, where floating > > > > point formats are explici

[FFmpeg-devel] [PATCH 2/2] libavfilter/vf_dnn_detect: Add two outputs ssd support

2023-12-26 Thread wenbin . chen-at-intel . com
From: Wenbin Chen For this kind of model, we can directly use its output as final result just like ssd model. The difference is that it splits output into two tensors. [x_min, y_min, x_max, y_max, confidence] and [label_id]. Model example refer to: https://github.com/openvinotoolkit

[FFmpeg-devel] [PATCH 1/2] libavfilter/dnn_backend_openvino: Add dynamic output support

2023-12-26 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Add dynamic outputs support. Some models don't have fixed output size. Its size changes according to result. Now openvino can run these kinds of models. Signed-off-by: Wenbin Chen --- libavfilter/dnn/dnn_backend_openvino.c | 134 +++-- 1 file changed, 59

[FFmpeg-devel] [PATCH 2/2] libavfilter/vf_dnn_detect: Add initialized value to function pointer

2023-12-17 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Signed-off-by: Wenbin Chen --- libavfilter/vf_dnn_detect.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/libavfilter/vf_dnn_detect.c b/libavfilter/vf_dnn_detect.c index 52d5c3d798..88865c8a8e 100644 --- a/libavfilter/vf_dnn_detect.c +++ b/libavfilter

[FFmpeg-devel] [PATCH 1/2] libavfilter/vf_dnn_detect: Fix a control flow issue

2023-12-17 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Signed-off-by: Wenbin Chen --- libavfilter/vf_dnn_detect.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/libavfilter/vf_dnn_detect.c b/libavfilter/vf_dnn_detect.c index fcc64118b6..52d5c3d798 100644 --- a/libavfilter/vf_dnn_detect.c +++ b/libavfilter/vf_dnn_detect.c

[FFmpeg-devel] [PATCH 4/4] libavfilter/vf_dnn_detect: Set used pointer to NULL

2023-12-13 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Set used pointer to NULL in case it leaks the storage. Signed-off-by: Wenbin Chen --- libavfilter/vf_dnn_detect.c | 1 + 1 file changed, 1 insertion(+) diff --git a/libavfilter/vf_dnn_detect.c b/libavfilter/vf_dnn_detect.c index 5668b8b017..3464af86c8 100644

[FFmpeg-devel] [PATCH 3/4] libavfilter/vf_dnn_detect: Fix uninitialized variables error

2023-12-13 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Signed-off-by: Wenbin Chen --- libavfilter/vf_dnn_detect.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/libavfilter/vf_dnn_detect.c b/libavfilter/vf_dnn_detect.c index b2e9b8d4c6..5668b8b017 100644 --- a/libavfilter/vf_dnn_detect.c +++ b/libavfilter

[FFmpeg-devel] [PATCH 2/4] libavfilter/vf_dnn_detect: Add NULL pointer check

2023-12-13 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Signed-off-by: Wenbin Chen --- libavfilter/vf_dnn_detect.c | 4 1 file changed, 4 insertions(+) diff --git a/libavfilter/vf_dnn_detect.c b/libavfilter/vf_dnn_detect.c index b82916ce6d..b2e9b8d4c6 100644 --- a/libavfilter/vf_dnn_detect.c +++ b/libavfilter/vf_dnn_detect.c

[FFmpeg-devel] [PATCH 1/4] libavfilter/vf_dnn_detect: Fix an incorrect expression

2023-12-13 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Signed-off-by: Wenbin Chen --- libavfilter/vf_dnn_detect.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/libavfilter/vf_dnn_detect.c b/libavfilter/vf_dnn_detect.c index 7ac3bb0b58..b82916ce6d 100644 --- a/libavfilter/vf_dnn_detect.c +++ b/libavfilter

Re: [FFmpeg-devel] [PATCH 1/3] swscale: don't assign range converters for float

2023-12-11 Thread Chen, Wenbin
> > > From: Niklas Haas > > > > > > This logic was incongruent with logic used elsewhere, where floating > > > point formats are explicitly exempted from range conversion. Fixes an > > > issue where floating point formats were not going through special > > > unscaled converters even when it was

[FFmpeg-devel] [PATCH v2 3/4] libavfilter/vf_dnn_detect: Add yolov3 support

2023-12-11 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Add yolov3 support. The difference of yolov3 is that it has multiple outputs in different scale to perform better on both large and small object. The model detail refer to: https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/yolo-v3-tf Signed-off

[FFmpeg-devel] [PATCH v2 1/4] libavfiter/dnn_backend_openvino: Add multiple output support

2023-12-11 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Add multiple output support to openvino backend. You can use '&' to split different output when you set output name using command line. Signed-off-by: Wenbin Chen --- libavfilter/dnn/dnn_backend_common.c | 7 - libavfilter/dnn/dnn_backend_openvino.c |
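Splitting an output-name option on '&' can be sketched as below. This is a plain C illustration with a hypothetical function name, not the parsing code from the patch; it splits the string in place and stores pointers to each name.

```c
#include <string.h>

/* Hypothetical helper: splits "a&b&c" in place and stores up to max_out
 * pointers in out. Returns the number of names found. An empty segment
 * (as in "a&&b") is returned as an empty string. */
static int split_output_names(char *names, char **out, int max_out)
{
    int n = 0;
    char *p = names;

    while (p && *p && n < max_out) {
        char *sep = strchr(p, '&');

        if (sep)
            *sep = '\0';          /* terminate the current name */
        out[n++] = p;
        p = sep ? sep + 1 : NULL; /* continue after the separator */
    }
    return n;
}
```

Given a writable buffer holding "out_a&out_b&out_c" (example names only), the function returns 3 and the entries point at "out_a", "out_b" and "out_c".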

[FFmpeg-devel] [PATCH v2 4/4] libavfilter/vf_dnn_detect: Add yolov4 support

2023-12-11 Thread wenbin . chen-at-intel . com
From: Wenbin Chen The difference of yolov4 is that sigmoid function needed to be applied on x, y coordinates. Also make it compatible with NHWC output as the yolov4 model from openvino model zoo has NHWC output layout. Model refer to: https://github.com/openvinotoolkit/open_model_zoo/tree

[FFmpeg-devel] [PATCH v2 2/4] libavfilter/vf_dnn_detect: Add input pad

2023-12-11 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Add input pad to get model input resolution. Detection models always have fixed input size. And the output coordinates are based on the input resolution, so we need to get input size to map coordinates to our real output frames. Signed-off-by: Wenbin Chen --- libavfilter
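The mapping from model input resolution to frame resolution described here can be sketched as a simple rescale. The function name and the truncating conversion are assumptions for illustration, not the logic in vf_dnn_detect.c.

```c
#include <assert.h>

/* Hypothetical helper: maps a coordinate expressed in model input pixels
 * to frame pixels along one axis. */
static int scale_coord(float v, int model_size, int frame_size)
{
    assert(model_size > 0);   /* the model input size must be known */
    return (int)(v * (float)frame_size / (float)model_size);
}
```

For a model with a 416-pixel-wide input and a 1920-pixel-wide frame, a box edge at x = 208 in model space lands at x = 960 in the frame.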

[FFmpeg-devel] [PATCH 4/4] libavfilter/vf_dnn_detect: Add yolov4 support

2023-12-03 Thread wenbin . chen-at-intel . com
From: Wenbin Chen The difference with yolov4 is that the sigmoid function needs to be applied to the x, y coordinates. Also make it compatible with NHWC output, as the yolov4 model from the openvino model zoo has an NHWC output layout. For the model, refer to: https://github.com/openvinotoolkit/open_model_zoo/tree

[FFmpeg-devel] [PATCH 3/4] libavfilter/vf_dnn_detect: Add yolov3 support

2023-12-03 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Add yolov3 support. The difference with yolov3 is that it has multiple outputs at different scales to perform better on both large and small objects. For model details, refer to: https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/yolo-v3-tf Signed-off

[FFmpeg-devel] [PATCH 2/4] libavfilter/vf_dnn_detect: Add input pad

2023-12-03 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Add input pad to get model input resolution. Detection models always have a fixed input size, and the output coordinates are based on the input resolution, so we need to get the input size to map the coordinates to our real output frames. Signed-off-by: Wenbin Chen --- libavfilter

[FFmpeg-devel] [PATCH 1/4] libavfiter/dnn/dnn_backend_openvino: add multiple output support

2023-12-03 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Add multiple output support to openvino backend. You can use '&' to separate the different outputs when you set the output names on the command line. Signed-off-by: Wenbin Chen --- libavfilter/dnn/dnn_backend_common.c | 7 - libavfilter/dnn/dnn_backend_openvino.c |

Re: [FFmpeg-devel] [PATCH 1/3] swscale: don't assign range converters for float

2023-11-26 Thread Chen, Wenbin
> > From: Niklas Haas > > > > This logic was incongruent with logic used elsewhere, where floating > > point formats are explicitly exempted from range conversion. Fixes an > > issue where floating point formats were not going through special > > unscaled converters even when it was otherwise

[FFmpeg-devel] [PATCH 2/2] libavfilter/vf_dnn_detect: Add yolo support

2023-11-20 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Add yolo support. A yolo model doesn't output the final result. It outputs candidate boxes, so we need post-processing to remove overlapping boxes and get the final results. Also, the box coordinates are relative to cells and anchors, so we need this information to calculate the boxes as well. Model
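Removing overlapping candidate boxes is commonly done with greedy non-maximum suppression (NMS). The sketch below shows that general technique under stated assumptions; it is not taken from the patch, and the function names and the 0.5 threshold are illustrative choices.

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, best score first.

    A candidate is kept only if it does not overlap any already kept
    box by more than iou_threshold (an assumed default).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

The greedy form is quadratic in the number of candidates, which is usually acceptable once low-confidence boxes have been filtered out beforehand.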

[FFmpeg-devel] [PATCH 1/2] libavfilter/vf_dnn_detect: Add model_type option.

2023-11-20 Thread wenbin . chen-at-intel . com
From: Wenbin Chen There are many kinds of detection DNN models, and they have different preprocessing and postprocessing methods. To support more models, a "model_type" option is added to help choose the preprocessing and postprocessing functions. Signed-off-by: Wenbin Chen --- libavfilter/vf_dnn_det

Re: [FFmpeg-devel] [PATCH 1/3] swscale: don't assign range converters for float

2023-11-13 Thread Chen, Wenbin
> From: Niklas Haas > > This logic was incongruent with logic used elsewhere, where floating > point formats are explicitly exempted from range conversion. Fixes an > issue where floating point formats were not going through special > unscaled converters even when it was otherwise possible. >

Re: [FFmpeg-devel] [PATCH v3 1/8] swscale: fix sws_setColorspaceDetails after sws_init_context

2023-11-12 Thread Chen, Wenbin
> > Will apply soon. > Hi Niklas: This patchset causes a regression. The command: "ffmpeg -i input.png -vf format=grayf32,format=gray8 output.png" reports error. If I configure with "--disable-sse2", the error is unseen. Thanks Wenbin > ___ >

[FFmpeg-devel] [PATCH] libavfilter/dnn/openvino: Reduce redundant memory allocation

2023-11-09 Thread wenbin . chen-at-intel . com
From: Wenbin Chen We can get the data ptr directly from the tensor, so the extra memory allocation can be removed. Signed-off-by: Wenbin Chen --- libavfilter/dnn/dnn_backend_openvino.c | 42 +- 1 file changed, 21 insertions(+), 21 deletions(-) diff --git a/libavfilter/dnn

[FFmpeg-devel] [PATCH] fftools/ffmpeg_mux: keep write_header and write_packet in the same thread

2023-11-01 Thread angus . chen-at-intel . com
From: "Chen, Angus" sdl2_muxer(wayland): In ffmpeg6, we create a separate thread for muxer after calling avformat_write_header(). It may generate EGL_BAD_ACCESS when we call write_packet. This is because egl_context is bound to previous thread. From EGL spec: If ctx is current

Re: [FFmpeg-devel] [PATCH 2/2] avfilter/vf_vpp_qsv: apply 3D LUT from file.

2023-10-22 Thread Chen Yufei
multiple times? On Mon, Oct 16, 2023 at 4:05 PM Xiang, Haihao wrote: > > On Sa, 2023-09-23 at 23:36 +0800, Chen Yufei wrote: > > Usage: "vpp_qsv=lut3d_file=" > > > > Only enabled with VAAPI because using VASurface to store 3D LUT. > > > > Signed-off-by: Che

Re: [FFmpeg-devel] [PATCH 1/2] avfilter/vf_lut3d: expose 3D LUT file parse function.

2023-10-22 Thread Chen Yufei
Thanks for reviewing this patch. Do you mean this should be merged with the change to vf_vpp_qsv file and send only one patch file? On Mon, Oct 16, 2023 at 3:51 PM Xiang, Haihao wrote: > > On Sa, 2023-09-23 at 23:36 +0800, Chen Yufei wrote: > > Signed-off-by

[FFmpeg-devel] [PATCH 2/2] avfilter/vf_vpp_qsv: apply 3D LUT from file.

2023-09-23 Thread Chen Yufei
Usage: "vpp_qsv=lut3d_file=" Only enabled with VAAPI because using VASurface to store 3D LUT. Signed-off-by: Chen Yufei --- libavfilter/vf_vpp_qsv.c | 241 ++- 1 file changed, 236 insertions(+), 5 deletions(-) diff --git a/libavfilter/vf_v

[FFmpeg-devel] [PATCH 1/2] avfilter/vf_lut3d: expose 3D LUT file parse function.

2023-09-23 Thread Chen Yufei
Signed-off-by: Chen Yufei --- libavfilter/Makefile | 8 +- libavfilter/lut3d.c| 669 + libavfilter/lut3d.h| 13 + libavfilter/vf_lut3d.c | 590 +--- 4 files changed, 689 insertions(+), 591 deletions(-) create

[FFmpeg-devel] [PATCH 0/2] avfilter/vf_vpp_qsv: apply 3D LUT from file

2023-09-23 Thread Chen Yufei
on a Thunderbolt 3 GPU dock. I compared transcoding output with `vf_lut3d` and don't see noticeable difference with my eyes. `make fate` passes without error. Chen Yufei (2): avfilter/vf_lut3d: expose 3D LUT file parse function. avfilter/vf_vpp_qsv: apply 3D LUT from file. libavfilter/Makefile | 8

Re: [FFmpeg-devel] [PATCH 3/3] libavfilter/dnn: Initialze DNNData variables

2023-09-20 Thread Chen, Wenbin
> > On Sep 20, 2023, at 10:26, wenbin.chen-at-intel@ffmpeg.org wrote: > > > > From: Wenbin Chen > > > > Signed-off-by: Wenbin Chen > > --- > > libavfilter/dnn/dnn_backend_tf.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > >

[FFmpeg-devel] [PATCH v2 3/3] libavfilter/dnn: Initialze DNNData variables

2023-09-20 Thread wenbin . chen-at-intel . com
From: Wenbin Chen Signed-off-by: Wenbin Chen --- libavfilter/dnn/dnn_backend_tf.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/libavfilter/dnn/dnn_backend_tf.c b/libavfilter/dnn/dnn_backend_tf.c index b521de7fbe..25046b58d9 100644 --- a/libavfilter/dnn

[FFmpeg-devel] [PATCH v2 2/3] libavfilter/dnn: Add scale and mean preprocess to openvino backend

2023-09-20 Thread wenbin . chen-at-intel . com
From: Wenbin Chen DNN models have different data preprocessing requirements. Scale and mean parameters are added to preprocess input data. Signed-off-by: Wenbin Chen --- libavfilter/dnn/dnn_backend_openvino.c | 43 -- libavfilter/dnn/dnn_io_proc.c | 82
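A minimal sketch of scale and mean preprocessing follows. The formula `(value - mean) / scale` is an assumption based on a common normalisation convention; the snippet above does not show which formula the patch uses, and `normalize` is a hypothetical name.

```python
def normalize(pixels, mean=0.0, scale=1.0):
    """Apply (value - mean) / scale to each input value.

    The formula and the defaults (which leave the data unchanged)
    are assumptions for illustration, not the patch's code.
    """
    if scale == 0:
        raise ValueError("scale must be non-zero")
    return [(p - mean) / scale for p in pixels]


# Example: map 8-bit pixel values into the range [-1, 1].
normalized = normalize([0, 127.5, 255], mean=127.5, scale=127.5)
```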
