On Sat, May 18, 2024 at 9:04 AM Ronald S. Bultje <rsbul...@gmail.com> wrote:

> Hi,
>
> On Tue, May 14, 2024 at 4:40 PM Stone Chen <chen.stonec...@gmail.com>
> wrote:
>
>> Implements AVX2 DMVR (decoder-side motion vector refinement) SAD
>> functions. DMVR SAD is only calculated if w >= 8, h >= 8, and w * h > 128.
>> To reduce complexity, SAD is only calculated on even rows. This is
>> calculated for all video bitdepths, but the values passed to the function
>> are always 16bit (even if the original video bitdepth is 8). The AVX2
>> implementation uses min/max/sub.
>>
>> Benchmarks ( AMD 7940HS )
>> Before:
>> BQTerrace_1920x1080_60_10_420_22_RA.vvc | 80.7 |
>> Chimera_8bit_1080P_1000_frames.vvc | 158.0 |
>> NovosobornayaSquare_1920x1080.bin | 159.7 |
>> RitualDance_1920x1080_60_10_420_37_RA.266 | 146.3 |
>>
>> After:
>> BQTerrace_1920x1080_60_10_420_22_RA.vvc | 82.7 |
>> Chimera_8bit_1080P_1000_frames.vvc | 167.0 |
>> NovosobornayaSquare_1920x1080.bin | 166.3 |
>> RitualDance_1920x1080_60_10_420_37_RA.266 | 154.0 |
>>
>
> I assume these are FPS benchmarks? Can you provide checkasm --bench output
> for these functions also?
>
> Ronald
>

Hi Ronald,

Correct, those are FPS benchmarks. There's a separate patch (Add
check_vvc_sad to vvc_mc.c) that has the checkasm --bench output in its
commit message. I've copied the benchmark snippet below:


> vvc_sad_8x8_c: 63.0
> vvc_sad_8x8_avx2: 3.0
> vvc_sad_16x16_c: 263.0
> vvc_sad_16x16_avx2: 23.0
> vvc_sad_32x32_c: 1003.0
> vvc_sad_32x32_avx2: 83.0
> vvc_sad_64x64_c: 3923.0
> vvc_sad_64x64_avx2: 373.0
> vvc_sad_128x128_c: 17533.0
> vvc_sad_128x128_avx2: 1683.0
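
For anyone reading along without the patch open, here is a rough scalar
sketch of what these functions compute, going by the description in the
commit message (16-bit samples, even rows only, absolute difference taken
as max minus min). It is only an illustration, not the code from the patch:
the name sad_even_rows, its signature, and the strides being counted in
samples are all assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference helper, NOT the FFmpeg function.
 * Assumption: strides are given in samples, not bytes. */
static int sad_even_rows(const int16_t *src0, ptrdiff_t stride0,
                         const int16_t *src1, ptrdiff_t stride1,
                         int w, int h)
{
    int sad = 0;

    /* Only even rows contribute, which halves the work. */
    for (int y = 0; y < h; y += 2) {
        const int16_t *a = src0 + y * stride0;
        const int16_t *b = src1 + y * stride1;

        for (int x = 0; x < w; x++) {
            /* |a - b| == max(a, b) - min(a, b); this is the scalar
             * form of the min/max/sub approach named in the commit
             * message. */
            int mx = a[x] > b[x] ? a[x] : b[x];
            int mn = a[x] < b[x] ? a[x] : b[x];
            sad += mx - mn;
        }
    }
    return sad;
}
```

The max/min form avoids needing a separate absolute-value step, which is
presumably why it maps well onto packed min/max/sub instructions.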


Also, your blog post was very helpful for getting started with asm!

Cheers,
Stone