On Thu, May 23, 2024 at 9:18 AM Nuo Mi wrote:
> On Thu, May 23, 2024 at 7:38 AM James Almer wrote:
>
> > On 5/21/2024 10:01 PM, Ronald S. Bultje wrote:
> > > Hi,
> > >
> > > On Tue, May 21, 2024 at 8:01 PM Stone Chen
> > wrote:
> > >
On Mon, May 20, 2024 at 7:23 AM Ronald S. Bultje wrote:
> Hi,
>
> This is mostly good, the following is tiny nitpicks.
>
> On Sun, May 19, 2024 at 8:46 PM Stone Chen
> wrote:
>
>> +%macro INIT_OFFSET 6 ; src1, src2, dxq, dyq, off1, off2
>>
>
> The macro
Adds checkasm for DMVR SAD AVX2 implementation.
Benchmarks ( AMD 7940HS )
vvc_sad_8x8_c: 50.3
vvc_sad_8x8_avx2: 0.3
vvc_sad_16x16_c: 250.3
vvc_sad_16x16_avx2: 10.3
vvc_sad_32x32_c: 1020.3
vvc_sad_32x32_avx2: 60.3
vvc_sad_64x64_c: 3850.3
vvc_sad_64x64_avx2: 220.3
vvc_sad_128x128_c: 14100.3
codec/x86/vvc/vvc_sad.asm
new file mode 100644
index 00..9766446b11
--- /dev/null
+++ b/libavcodec/x86/vvc/vvc_sad.asm
@@ -0,0 +1,130 @@
+; /*
+; * Provide SIMD DMVR SAD functions for VVC decoding
+; *
+; * Copyright (c) 2024 Stone Chen
+; *
+; * This file is part of FFmpeg.
+; *
+; * FFmpeg is
Adds checkasm for DMVR SAD AVX2 implementation.
Benchmarks ( AMD 7940HS )
vvc_sad_8x8_c: 70.0
vvc_sad_8x8_avx2: 10.0
vvc_sad_16x16_c: 280.0
vvc_sad_16x16_avx2: 20.0
vvc_sad_32x32_c: 1020.0
vvc_sad_32x32_avx2: 70.0
vvc_sad_64x64_c: 3560.0
vvc_sad_64x64_avx2: 270.0
vvc_sad_128x128_c: 13760.0
codec/x86/vvc/vvc_sad.asm
new file mode 100644
index 00..58a24635d2
--- /dev/null
+++ b/libavcodec/x86/vvc/vvc_sad.asm
@@ -0,0 +1,138 @@
+; /*
+; * Provide SIMD DMVR SAD functions for VVC decoding
+; *
+; * Copyright (c) 2024 Stone Chen
+; *
+; * This file is part of FFmpeg.
+; *
+; * FFmpe
On Sat, May 18, 2024 at 11:33 AM Ronald S. Bultje
wrote:
> Hi,
>
> On Tue, May 14, 2024 at 4:40 PM Stone Chen
> wrote:
>
>> +vvc_sad_8:
>> +.loop_height:
>> +movu xm0, [src1q]
>> +movu xm1, [src2q]
>
On Sat, May 18, 2024 at 9:04 AM Ronald S. Bultje wrote:
> Hi,
>
> On Tue, May 14, 2024 at 4:40 PM Stone Chen
> wrote:
>
>> Implements AVX2 DMVR (decoder-side motion vector refinement) SAD
>> functions. DMVR SAD is only calculated if w >= 8, h >= 8, and w * h >
Adds checkasm for DMVR SAD AVX2 implementation.
Benchmarks ( AMD 7940HS )
vvc_sad_8x8_c: 63.0
vvc_sad_8x8_avx2: 3.0
vvc_sad_16x16_c: 263.0
vvc_sad_16x16_avx2: 23.0
vvc_sad_32x32_c: 1003.0
vvc_sad_32x32_avx2: 83.0
vvc_sad_64x64_c: 3923.0
vvc_sad_64x64_avx2: 373.0
vvc_sad_128x128_c: 17533.0
c/x86/vvc/vvc_sad.asm
new file mode 100644
index 00..530142ad35
--- /dev/null
+++ b/libavcodec/x86/vvc/vvc_sad.asm
@@ -0,0 +1,157 @@
+; /*
+; * Provide SIMD DMVR SAD functions for VVC decoding
+; *
+; * Copyright (c) 2024 Stone Chen
+; *
+; * This file is part of FFmpeg.
+; *
+; * FFmpe
Adds checkasm for DMVR SAD AVX2 implementation.
Benchmarks ( AMD 7940HS )
vvc_sad_8x8_c: 63.0
vvc_sad_8x8_avx2: 3.0
vvc_sad_16x16_c: 263.0
vvc_sad_16x16_avx2: 23.0
vvc_sad_32x32_c: 1003.0
vvc_sad_32x32_avx2: 83.0
vvc_sad_64x64_c: 3923.0
vvc_sad_64x64_avx2: 373.0
vvc_sad_128x128_c: 17533.0
1184c731c
--- /dev/null
+++ b/libavcodec/x86/vvc/vvc_sad.asm
@@ -0,0 +1,155 @@
+; /*
+; * Provide SIMD DMVR SAD functions for VVC decoding
+; *
+; * Copyright (c) 2024 Stone Chen
+; *
+; * This file is part of FFmpeg.
+; *
+; * FFmpeg is free software; you can redistribute it and/or
+; * modify
On Wed, May 1, 2024 at 6:59 PM Andreas Rheinhardt <
andreas.rheinha...@outlook.com> wrote:
> Stone Chen:
> > To prepare for adding AVX2 functions for different block widths, change
> VVCInterDSPContext to contain (*sad[6]) instead of (*sad). This also
> default initiali
Adds checkasm for DMVR SAD AVX2 implementation.
Benchmarks ( AMD 7940HS )
vvc_sad_8_16bpc_c: 112.5
vvc_sad_8_16bpc_avx2: 2.5
vvc_sad_16_16bpc_c: 232.5
vvc_sad_16_16bpc_avx2: 22.5
vvc_sad_32_16bpc_c: 912.5
vvc_sad_32_16bpc_avx2: 82.5
vvc_sad_64_16bpc_c: 3582.5
vvc_sad_64_16bpc_avx2: 392.5
ull
+++ b/libavcodec/x86/vvc/vvc_sad.asm
@@ -0,0 +1,193 @@
+; /*
+; * Provide SIMD DMVR SAD functions for VVC decoding
+; *
+; * Copyright (c) 2024 Stone Chen
+; *
+; * This file is part of FFmpeg.
+; *
+; * FFmpeg is free software; you can redistribute it and/or
+; * modify it under the terms of the G
To prepare for adding AVX2 functions for different block widths, change
VVCInterDSPContext to contain (*sad[6]) instead of (*sad). This also
default-initializes the pointer array with the scalar function and updates the
calling sites to jump to the correct function based on block width. There's no
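The dispatch described in this message can be sketched roughly as follows. This is a minimal illustration, not the patch itself: the `sad_fn` prototype, the `InterDSP` struct, the helper names, and the width-to-index mapping (widths 4 through 128 filling six slots) are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical prototype; the real one is not shown in this message. */
typedef int (*sad_fn)(const int16_t *a, const int16_t *b,
                      int stride, int w, int h);

/* Stand-in for the DSP context: one function slot per block width. */
typedef struct InterDSP {
    sad_fn sad[6];
} InterDSP;

/* Scalar reference: sum of absolute differences over a w x h block. */
static int sad_scalar(const int16_t *a, const int16_t *b,
                      int stride, int w, int h)
{
    int sum = 0;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++)
            sum += abs(a[x] - b[x]);
        a += stride;
        b += stride;
    }
    return sum;
}

/* Assumed mapping: width 4 -> 0, 8 -> 1, ..., 128 -> 5. */
static int width_to_index(int w)
{
    int idx = 0;
    assert(w >= 4 && w <= 128 && (w & (w - 1)) == 0);
    while ((4 << idx) < w)
        idx++;
    return idx;
}

/* Fill every slot with the scalar version; a SIMD init could
 * later overwrite individual slots with faster functions. */
static void inter_dsp_init(InterDSP *c)
{
    for (int i = 0; i < 6; i++)
        c->sad[i] = sad_scalar;
}

/* Call site: pick the function by block width. */
static int block_sad(const InterDSP *c, const int16_t *a, const int16_t *b,
                     int stride, int w, int h)
{
    return c->sad[width_to_index(w)](a, b, stride, w, h);
}
```

Because every slot starts out pointing at the scalar function, a platform-specific init only needs to replace the widths it actually implements.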
The documentation correctly states that the rdiv is a multiplier but
incorrectly states the default behavior is to multiply by the sum of all matrix
elements - it multiplies by 1/sum.
This changes the documentation to match the code.
Address trac #10889
---
doc/filters.texi | 2 +-
1 file
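To make the corrected wording concrete, here is a small sketch of a default multiplier of 1/sum. The helper name `default_rdiv` and the fallback to 1 for a zero-sum matrix are assumptions for illustration, not taken from the filter source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: default multiplier for one plane's matrix. */
static float default_rdiv(const int *matrix, size_t n)
{
    float sum = 0.f;
    assert(matrix != NULL);
    for (size_t i = 0; i < n; i++)
        sum += (float)matrix[i];
    /* Assumption: fall back to 1 when the elements sum to zero. */
    return sum != 0.f ? 1.f / sum : 1.f;
}
```

For a 3x3 matrix of nine 1s this gives 1/9, so the filtered value is the neighborhood average rather than nine times the average.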
On Wed, Mar 13, 2024 at 4:26 AM Marton Balint wrote:
>
>
> On Tue, 12 Mar 2024, Stone Chen wrote:
>
> > The documentation correctly states that the rdiv is a multiplier but
> incorrectly states the default behavior is to multiply by the sum of all
> matrix elements
The documentation correctly states that the rdiv is a multiplier but
incorrectly states the default behavior is to multiply by the sum of all matrix
elements - it multiplies by 1/sum.
This changes the documentation to match the code.
---
doc/filters.texi | 2 +-
1 file changed, 1 insertion(+),
On Sat, Feb 24, 2024 at 6:34 PM Marton Balint wrote:
>
>
> On Sat, 24 Feb 2024, Stone Chen wrote:
>
> > On Sat, Feb 24, 2024 at 3:56 PM Marton Balint wrote:
> >
> >>
> >>
> >> On Sat, 24 Feb 2024, Stone Chen wrote:
> >>
> >>
On Sat, Feb 24, 2024 at 3:56 PM Marton Balint wrote:
>
>
> On Sat, 24 Feb 2024, Stone Chen wrote:
>
> > Previously to support dynamic reconfigurations of the matrix string
> (e.g. 0m), the rdiv values would always be cleared to 0.f, causing the rdiv
> to be recalculated
Sorry, I just realized I messed up my git commit (I'm new to git). I've attached
a patch file with that correction.
On Sat, Feb 24, 2024 at 10:49 AM Stone Chen
wrote:
> Previously to support dynamic reconfigurations of the matrix string (e.g.
> 0m), the rdiv values would always be cleared
Previously, to support dynamic reconfigurations of the matrix string (e.g. 0m),
the rdiv values would always be cleared to 0.f, causing the rdiv to be
recalculated based on the new filter. However, this had the side effect of
always ignoring user-specified rdiv values.
Instead float user_rdiv[0]
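A rough sketch of the idea, assuming a separate field keeps the user's value while the effective multiplier is recomputed on each reconfiguration. `PlaneState`, `plane_reconfigure`, and the zero-sum fallback are hypothetical names and choices for illustration; only `user_rdiv` appears in the message itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-plane state. */
typedef struct PlaneState {
    float user_rdiv; /* value given by the user; 0 means "not set" */
    float rdiv;      /* effective multiplier used when filtering */
} PlaneState;

/* Recompute the effective multiplier after the matrix changes. */
static void plane_reconfigure(PlaneState *p, const int *matrix, size_t n)
{
    float sum = 0.f;
    assert(p != NULL && matrix != NULL);
    if (p->user_rdiv != 0.f) {
        /* A user-specified value always wins. */
        p->rdiv = p->user_rdiv;
        return;
    }
    for (size_t i = 0; i < n; i++)
        sum += (float)matrix[i];
    /* Assumption: fall back to 1 when the elements sum to zero. */
    p->rdiv = sum != 0.f ? 1.f / sum : 1.f;
}
```

Keeping the user's value out of the field that gets recalculated means a matrix change can refresh the default without discarding an explicit setting.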
Hi Marton,
Thanks for the feedback!
I'm not sure what dynamic reconfiguration is; from some searching, I think
it might be related to commands?
On Sun, Feb 18, 2024 at 7:08 PM Marton Balint wrote:
>
>
> On Sun, 18 Feb 2024, Stone Chen wrote:
>
> > In commit 6c45d34
Sorry, I think I didn't attach the patch correctly the first time.
On Sun, Feb 18, 2024 at 2:21 PM Stone Chen wrote:
> In commit 6c45d34, a line was added that always sets rdiv to 0, overriding
> any user input. This removes that line allowing user set values for 0rdiv,
> 1rdiv, 2rd
In commit 6c45d34, a line was added that always sets rdiv to 0, overriding any
user input. This removes that line, allowing user-set values for 0rdiv, 1rdiv,
2rdiv, and 3rdiv to apply as expected. This fixes ticket #10294.
Signed-off-by: Stone Chen
---
libavfilter/vf_convolution.c | 1 -
1 file