Re: [FFmpeg-devel] [PATCH v5 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-23 Thread Stone Chen
On Thu, May 23, 2024 at 9:18 AM Nuo Mi wrote: > On Thu, May 23, 2024 at 7:38 AM James Almer wrote: > > > On 5/21/2024 10:01 PM, Ronald S. Bultje wrote: > > > Hi, > > > > > > On Tue, May 21, 2024 at 8:01 PM Stone Chen > > wrote: > > > >

Re: [FFmpeg-devel] [PATCH v4 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-21 Thread Stone Chen
On Mon, May 20, 2024 at 7:23 AM Ronald S. Bultje wrote: > Hi, > > This is mostly good, the following is tiny nitpicks. > > On Sun, May 19, 2024 at 8:46 PM Stone Chen > wrote: > >> +%macro INIT_OFFSET 6 ; src1, src2, dxq, dyq, off1, off2 >> > > The macro

[FFmpeg-devel] [PATCH v5 2/2][GSoC 2024] tests/checkasm: Add check_vvc_sad to vvc_mc.c

2024-05-21 Thread Stone Chen
Adds checkasm for DMVR SAD AVX2 implementation. Benchmarks ( AMD 7940HS ) vvc_sad_8x8_c: 50.3 vvc_sad_8x8_avx2: 0.3 vvc_sad_16x16_c: 250.3 vvc_sad_16x16_avx2: 10.3 vvc_sad_32x32_c: 1020.3 vvc_sad_32x32_avx2: 60.3 vvc_sad_64x64_c: 3850.3 vvc_sad_64x64_avx2: 220.3 vvc_sad_128x128_c: 14100.3

[FFmpeg-devel] [PATCH v5 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-21 Thread Stone Chen
codec/x86/vvc/vvc_sad.asm new file mode 100644 index 00..9766446b11 --- /dev/null +++ b/libavcodec/x86/vvc/vvc_sad.asm @@ -0,0 +1,130 @@ +; /* +; * Provide SIMD DMVR SAD functions for VVC decoding +; * +; * Copyright (c) 2024 Stone Chen +; * +; * This file is part of FFmpeg. +; * +; * FFmpeg is

[FFmpeg-devel] [PATCH v4 2/2][GSoC 2024] tests/checkasm: Add check_vvc_sad to vvc_mc.c

2024-05-19 Thread Stone Chen
Adds checkasm for DMVR SAD AVX2 implementation. Benchmarks ( AMD 7940HS ) vvc_sad_8x8_c: 70.0 vvc_sad_8x8_avx2: 10.0 vvc_sad_16x16_c: 280.0 vvc_sad_16x16_avx2: 20.0 vvc_sad_32x32_c: 1020.0 vvc_sad_32x32_avx2: 70.0 vvc_sad_64x64_c: 3560.0 vvc_sad_64x64_avx2: 270.0 vvc_sad_128x128_c: 13760.0

[FFmpeg-devel] [PATCH v4 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-19 Thread Stone Chen
codec/x86/vvc/vvc_sad.asm new file mode 100644 index 00..58a24635d2 --- /dev/null +++ b/libavcodec/x86/vvc/vvc_sad.asm @@ -0,0 +1,138 @@ +; /* +; * Provide SIMD DMVR SAD functions for VVC decoding +; * +; * Copyright (c) 2024 Stone Chen +; * +; * This file is part of FFmpeg. +; * +; * FFmpe

Re: [FFmpeg-devel] [PATCH v3 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-19 Thread Stone Chen
On Sat, May 18, 2024 at 11:33 AM Ronald S. Bultje wrote: > Hi, > > On Tue, May 14, 2024 at 4:40 PM Stone Chen > wrote: > >> +vvc_sad_8: >> +.loop_height: >> +movu xm0, [src1q] >> +movu xm1, [src2q] >

Re: [FFmpeg-devel] [PATCH v3 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-18 Thread Stone Chen
On Sat, May 18, 2024 at 9:04 AM Ronald S. Bultje wrote: > Hi, > > On Tue, May 14, 2024 at 4:40 PM Stone Chen > wrote: > >> Implements AVX2 DMVR (decoder-side motion vector refinement) SAD >> functions. DMVR SAD is only calculated if w >= 8, h >= 8, and w * h >

[FFmpeg-devel] [PATCH v3 2/2][GSoC 2024] tests/checkasm: Add check_vvc_sad to vvc_mc.c

2024-05-14 Thread Stone Chen
Adds checkasm for DMVR SAD AVX2 implementation. Benchmarks ( AMD 7940HS ) vvc_sad_8x8_c: 63.0 vvc_sad_8x8_avx2: 3.0 vvc_sad_16x16_c: 263.0 vvc_sad_16x16_avx2: 23.0 vvc_sad_32x32_c: 1003.0 vvc_sad_32x32_avx2: 83.0 vvc_sad_64x64_c: 3923.0 vvc_sad_64x64_avx2: 373.0 vvc_sad_128x128_c: 17533.0

[FFmpeg-devel] [PATCH v3 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-14 Thread Stone Chen
c/x86/vvc/vvc_sad.asm new file mode 100644 index 00..530142ad35 --- /dev/null +++ b/libavcodec/x86/vvc/vvc_sad.asm @@ -0,0 +1,157 @@ +; /* +; * Provide SIMD DMVR SAD functions for VVC decoding +; * +; * Copyright (c) 2024 Stone Chen +; * +; * This file is part of FFmpeg. +; * +; * FFmpe

[FFmpeg-devel] [PATCH v2 2/2][GSoC 2024] Terminal tests/checkasm: Add check_vvc_sad to vvc_mc.c

2024-05-11 Thread Stone Chen
Adds checkasm for DMVR SAD AVX2 implementation. Benchmarks ( AMD 7940HS ) vvc_sad_8x8_c: 63.0 vvc_sad_8x8_avx2: 3.0 vvc_sad_16x16_c: 263.0 vvc_sad_16x16_avx2: 23.0 vvc_sad_32x32_c: 1003.0 vvc_sad_32x32_avx2: 83.0 vvc_sad_64x64_c: 3923.0 vvc_sad_64x64_avx2: 373.0 vvc_sad_128x128_c: 17533.0

[FFmpeg-devel] [PATCH v2 1/2][GSoC] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-11 Thread Stone Chen
1184c731c --- /dev/null +++ b/libavcodec/x86/vvc/vvc_sad.asm @@ -0,0 +1,155 @@ +; /* +; * Provide SIMD DMVR SAD functions for VVC decoding +; * +; * Copyright (c) 2024 Stone Chen +; * +; * This file is part of FFmpeg. +; * +; * FFmpeg is free software; you can redistribute it and/or +; * modify

Re: [FFmpeg-devel] [PATCH 1/3][GSoC 2024] libavcodec/vvc: convert (*sad) to (*sad[6]) to prepare for AVX2 funcs

2024-05-06 Thread Stone Chen
On Wed, May 1, 2024 at 6:59 PM Andreas Rheinhardt < andreas.rheinha...@outlook.com> wrote: > Stone Chen: > > To prepare for adding AVX2 functions for different block widths, change > VVCInterDSPContext to contain (*sad[6]) instead of (*sad). This also > default initiali

[FFmpeg-devel] [PATCH 3/3][GSoC 2024] tests/checkasm: Add check_vvc_sad to vvc_mc.c

2024-05-01 Thread Stone Chen
Adds checkasm for DMVR SAD AVX2 implementation. Benchmarks ( AMD 7940HS ) vvc_sad_8_16bpc_c: 112.5 vvc_sad_8_16bpc_avx2: 2.5 vvc_sad_16_16bpc_c: 232.5 vvc_sad_16_16bpc_avx2: 22.5 vvc_sad_32_16bpc_c: 912.5 vvc_sad_32_16bpc_avx2: 82.5 vvc_sad_64_16bpc_c: 3582.5 vvc_sad_64_16bpc_avx2: 392.5

[FFmpeg-devel] [PATCH 2/3][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-01 Thread Stone Chen
ull +++ b/libavcodec/x86/vvc/vvc_sad.asm @@ -0,0 +1,193 @@ +; /* +; * Provide SIMD DMVR SAD functions for VVC decoding +; * +; * Copyright (c) 2024 Stone Chen +; * +; * This file is part of FFmpeg. +; * +; * FFmpeg is free software; you can redistribute it and/or +; * modify it under the terms of the G

[FFmpeg-devel] [PATCH 1/3][GSoC 2024] libavcodec/vvc: convert (*sad) to (*sad[6]) to prepare for AVX2 funcs

2024-05-01 Thread Stone Chen
To prepare for adding AVX2 functions for different block widths, change VVCInterDSPContext to contain (*sad[6]) instead of (*sad). This also default initializes the pointer array with the scalar function and the calling sites to jump to the correct function based on block width. There's no
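
The dispatch pattern this message describes can be sketched roughly as follows. This is a minimal illustration, not the patch's code: `DemoInterDSP`, `demo_sad_scalar`, `demo_width_to_index` and the width-to-index mapping (4 maps to 0 up to 128 maps to 5) are all hypothetical names and assumptions, and the SAD here is a plain sum of absolute differences without any DMVR-specific offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical function-pointer type for a SAD routine. */
typedef int (*demo_sad_fn)(const int16_t *src0, const int16_t *src1,
                           ptrdiff_t stride, int w, int h);

/* Hypothetical stand-in for a DSP context holding one pointer per width. */
typedef struct DemoInterDSP {
    demo_sad_fn sad[6];
} DemoInterDSP;

/* Plain scalar sum of absolute differences over a w x h block. */
int demo_sad_scalar(const int16_t *src0, const int16_t *src1,
                    ptrdiff_t stride, int w, int h)
{
    int sum = 0;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            sum += abs(src0[x] - src1[x]);
        src0 += stride;
        src1 += stride;
    }
    return sum;
}

/* Assumed mapping: widths 4, 8, 16, 32, 64, 128 -> indices 0..5. */
int demo_width_to_index(int w)
{
    int idx = 0;
    assert(w >= 4 && w <= 128);
    while (w > 4) {
        w >>= 1;
        idx++;
    }
    return idx;
}

/* Default-initialize every slot with the scalar function; an optimized
 * build could later overwrite individual slots with faster versions. */
void demo_dsp_init(DemoInterDSP *dsp)
{
    for (int i = 0; i < 6; i++)
        dsp->sad[i] = demo_sad_scalar;
}
```

A call site would then look up the function with `dsp.sad[demo_width_to_index(w)]` rather than calling a single `sad` pointer, so per-width SIMD versions can be dropped in without touching the callers.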

[FFmpeg-devel] [PATCH] doc/filters: Change rdiv (vf_convolution) documentation to reflect actual behavior

2024-03-14 Thread Stone Chen
The documentation correctly states that the rdiv is a multiplier but incorrectly states the default behavior is to multiply by the sum of all matrix elements - it multiplies by 1/sum. This changes the documentation to match the code. Address trac #10889 --- doc/filters.texi | 2 +- 1 file
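
The default described here (multiply by 1/sum, not by the sum) can be illustrated with a small sketch. `demo_default_rdiv` is a hypothetical helper, and the fallback to 1 when the elements sum to zero is an assumption of this sketch rather than something stated in the message.

```c
#include <assert.h>

/* Hypothetical helper: default multiplier for a convolution matrix,
 * computed as the reciprocal of the sum of its elements. */
float demo_default_rdiv(const int *matrix, int n)
{
    float sum = 0.f;
    assert(n > 0);
    for (int i = 0; i < n; i++)
        sum += (float)matrix[i];
    /* Assumption: treat a zero sum as 1 to avoid dividing by zero. */
    if (sum == 0.f)
        sum = 1.f;
    return 1.f / sum;
}
```

For a 3x3 box kernel of all ones the sum is 9, so the default multiplier is 1/9 and the output keeps roughly the input brightness; multiplying by 9 instead would blow it out.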

Re: [FFmpeg-devel] [PATCH] Change rdiv (vf_convolution) documentation to reflect actual behavior

2024-03-13 Thread Stone Chen
On Wed, Mar 13, 2024 at 4:26 AM Marton Balint wrote: > > > On Tue, 12 Mar 2024, Stone Chen wrote: > > > The documentation correctly states that the rdiv is a multiplier but > incorrectly states the default behavior is to multiply by the sum of all > matrix elements

[FFmpeg-devel] [PATCH] Change rdiv (vf_convolution) documentation to reflect actual behavior

2024-03-12 Thread Stone Chen
The documentation correctly states that the rdiv is a multiplier but incorrectly states the default behavior is to multiply by the sum of all matrix elements - it multiplies by 1/sum. This changes the documentation to match the code. --- doc/filters.texi | 2 +- 1 file changed, 1 insertion(+),

Re: [FFmpeg-devel] [PATCH] Add float user_rdiv[4] to allow user options to apply correctly

2024-02-24 Thread Stone Chen
On Sat, Feb 24, 2024 at 6:34 PM Marton Balint wrote: > > > On Sat, 24 Feb 2024, Stone Chen wrote: > > > On Sat, Feb 24, 2024 at 3:56 PM Marton Balint wrote: > > > >> > >> > >> On Sat, 24 Feb 2024, Stone Chen wrote: > >> > >>

Re: [FFmpeg-devel] [PATCH] Add float user_rdiv[4] to allow user options to apply correctly

2024-02-24 Thread Stone Chen
On Sat, Feb 24, 2024 at 3:56 PM Marton Balint wrote: > > > On Sat, 24 Feb 2024, Stone Chen wrote: > > > Previously to support dynamic reconfigurations of the matrix string > (e.g. 0m), the rdiv values would always be cleared to 0.f, causing the rdiv > to be recalculated

Re: [FFmpeg-devel] [PATCH] Add float user_rdiv[4] to allow user options to apply correctly

2024-02-24 Thread Stone Chen
Sorry I just realized I messed up my git commit (new to git), I've attached a patch file with that correction. On Sat, Feb 24, 2024 at 10:49 AM Stone Chen wrote: > Previously to support dynamic reconfigurations of the matrix string (e.g. > 0m), the rdiv values would always be cleared

[FFmpeg-devel] [PATCH] Add float user_rdiv[4] to allow user options to apply correctly

2024-02-24 Thread Stone Chen
Previously to support dynamic reconfigurations of the matrix string (e.g. 0m), the rdiv values would always be cleared to 0.f, causing the rdiv to be recalculated based on the new filter. This however had the side effect of always ignoring user specified rdiv values. Instead float user_rdiv[0]
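
One way to picture the idea is the sketch below. It is not the patch itself: `DemoConvolution`, `demo_reconfigure` and the convention that a user value of 0 means "derive it from the matrix" are assumptions made for illustration, and the zero-sum fallback is likewise assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-filter state: one entry per plane. */
typedef struct DemoConvolution {
    float user_rdiv[4]; /* what the user asked for; 0 means "auto" (assumed) */
    float rdiv[4];      /* effective multiplier used when filtering */
} DemoConvolution;

/* Re-derive the effective rdiv after a matrix change. The user's values are
 * kept in user_rdiv and never overwritten, so they survive reconfiguration. */
void demo_reconfigure(DemoConvolution *s, const float matrix_sum[4])
{
    assert(s != NULL);
    for (int p = 0; p < 4; p++) {
        s->rdiv[p] = s->user_rdiv[p];
        if (s->rdiv[p] == 0.f) {
            /* Assumption: a zero sum falls back to 1 to avoid dividing by zero. */
            float sum = matrix_sum[p] == 0.f ? 1.f : matrix_sum[p];
            s->rdiv[p] = 1.f / sum;
        }
    }
}
```

Keeping the user's input separate from the derived value means a new matrix string only changes the planes left on "auto", which is the side effect the message says the earlier clearing-to-zero approach got wrong.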

Re: [FFmpeg-devel] [PATCH] Fix rdiv always being set to 0 in vf_convolution.c

2024-02-18 Thread Stone Chen
Hi Marton, Thanks for the feedback! I'm not sure what dynamic reconfiguration is, from some searching I think it might be related to commands? On Sun, Feb 18, 2024 at 7:08 PM Marton Balint wrote: > > > On Sun, 18 Feb 2024, Stone Chen wrote: > > > In commit 6c45d34

Re: [FFmpeg-devel] [PATCH] Fix rdiv always being set to 0 in vf_convolution.c

2024-02-18 Thread Stone Chen
Sorry I think I didn't correctly attach the patch the first time. On Sun, Feb 18, 2024 at 2:21 PM Stone Chen wrote: > In commit 6c45d34, a line was added that always sets rdiv to 0, overriding > any user input. This removes that line allowing user set values for 0rdiv, > 1rdiv, 2rd

[FFmpeg-devel] [PATCH] Fix rdiv always being set to 0 in vf_convolution.c

2024-02-18 Thread Stone Chen
In commit 6c45d34, a line was added that always sets rdiv to 0, overriding any user input. This removes that line allowing user set values for 0rdiv, 1rdiv, 2rdiv, 3rdiv to apply as expected. This fixes ticket #10294. Signed-off-by: Stone Chen --- libavfilter/vf_convolution.c | 1 - 1 file