On Sat, May 18, 2024 at 9:04 AM Ronald S. Bultje <rsbul...@gmail.com> wrote:
> Hi,
>
> On Tue, May 14, 2024 at 4:40 PM Stone Chen <chen.stonec...@gmail.com>
> wrote:
>
>> Implements AVX2 DMVR (decoder-side motion vector refinement) SAD
>> functions. DMVR SAD is only calculated if w >= 8, h >= 8, and w * h > 128.
>> To reduce complexity, SAD is only calculated on even rows. This is
>> calculated for all video bitdepths, but the values passed to the function
>> are always 16bit (even if the original video bitdepth is 8). The AVX2
>> implementation uses min/max/sub.
>>
>> Benchmarks ( AMD 7940HS )
>> Before:
>> BQTerrace_1920x1080_60_10_420_22_RA.vvc   |  80.7 |
>> Chimera_8bit_1080P_1000_frames.vvc        | 158.0 |
>> NovosobornayaSquare_1920x1080.bin         | 159.7 |
>> RitualDance_1920x1080_60_10_420_37_RA.266 | 146.3 |
>>
>> After:
>> BQTerrace_1920x1080_60_10_420_22_RA.vvc   |  82.7 |
>> Chimera_8bit_1080P_1000_frames.vvc        | 167.0 |
>> NovosobornayaSquare_1920x1080.bin         | 166.3 |
>> RitualDance_1920x1080_60_10_420_37_RA.266 | 154.0 |
>>
>
> I assume these are FPS benchmarks? Can you provide checkasm --bench output
> for these functions also?
>
> Ronald
>

Hi Ronald,

Correct, those are FPS benchmarks. There's a separate patch (Add
check_vvc_sad to vvc_mc.c) that has the checkasm --bench output in its
commit message. I've copied the benchmark snippet below:

> vvc_sad_8x8_c: 63.0
> vvc_sad_8x8_avx2: 3.0
> vvc_sad_16x16_c: 263.0
> vvc_sad_16x16_avx2: 23.0
> vvc_sad_32x32_c: 1003.0
> vvc_sad_32x32_avx2: 83.0
> vvc_sad_64x64_c: 3923.0
> vvc_sad_64x64_avx2: 373.0
> vvc_sad_128x128_c: 17533.0
> vvc_sad_128x128_avx2: 1683.0

Also, your blog post was very helpful for getting started with asm!

Cheers,
Stone
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
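
For anyone following the thread without the patch open, the SAD described in
the quoted commit message could be sketched in scalar C roughly as below. This
is only an illustration under stated assumptions, not the code from the patch:
the names dmvr_sad_applies and sad_even_rows, the argument list, and the
stride being counted in samples are all hypothetical. It shows the size gate
(w >= 8, h >= 8, w * h > 128), the even-rows-only loop over 16-bit samples,
and the absolute difference formed as max minus min, which is the scalar
counterpart of the min/max/sub approach mentioned for AVX2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the block-size gate quoted in the commit message. */
static int dmvr_sad_applies(int w, int h)
{
    return w >= 8 && h >= 8 && w * h > 128;
}

/* Hypothetical scalar reference: sum of absolute differences over even
 * rows only. Samples are 16-bit regardless of the video bit depth.
 * Assumption of this sketch: stride is counted in samples, not bytes. */
static int sad_even_rows(const int16_t *a, const int16_t *b,
                         ptrdiff_t stride, int w, int h)
{
    int sad = 0;

    assert(w > 0 && h > 0);
    for (int y = 0; y < h; y += 2) {            /* odd rows are skipped */
        const int16_t *ra = a + (ptrdiff_t)y * stride;
        const int16_t *rb = b + (ptrdiff_t)y * stride;

        for (int x = 0; x < w; x++) {
            int p  = ra[x];
            int q  = rb[x];
            int hi = p > q ? p : q;             /* max(p, q) */
            int lo = p < q ? p : q;             /* min(p, q) */

            sad += hi - lo;                     /* equals |p - q| */
        }
    }
    return sad;
}
```

Forming the difference as max minus min avoids a separate absolute-value
step; whether the patch orders the operations exactly this way is not
something this sketch claims.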