The new code is placed in a separate file because the existing combined MMX/SSE/AVX file is humongous and already takes forever to assemble.

This adds ~16 KiB of .text. The existing 8bpc asm is ~240 KiB, of which the corresponding AVX2 functions make up ~42 KiB.

Tested to pass FATE on Linux and Windows.

Checkasm numbers vs AVX2 on Zen 5 (Strix Halo):
vp9_inv_adst_adst_16x16_sub16_add_8_avx2:       209.3
vp9_inv_adst_adst_16x16_sub16_add_8_avx512icl:   99.5
vp9_inv_adst_dct_16x16_sub16_add_8_avx2:        165.2
vp9_inv_adst_dct_16x16_sub16_add_8_avx512icl:    89.7
vp9_inv_dct_adst_16x16_sub16_add_8_avx2:        165.9
vp9_inv_dct_adst_16x16_sub16_add_8_avx512icl:    87.7
vp9_inv_dct_dct_16x16_sub16_add_8_avx2:         121.3
vp9_inv_dct_dct_16x16_sub16_add_8_avx512icl:     79.2
vp9_inv_dct_dct_32x32_sub32_add_8_avx2:         745.5
vp9_inv_dct_dct_32x32_sub32_add_8_avx512icl:    285.5
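
For readers who want the relative gains rather than raw timings, the checkasm figures above can be turned into speedup ratios. The following is only a rough sketch, not part of the patch: it assumes that lower checkasm numbers mean faster code, and the names `TIMINGS`, `speedup` and `report` are made up for illustration.

```python
# Checkasm timings copied from the numbers above: (avx2, avx512icl).
# Assumption: lower is faster, so avx2 / avx512icl is the speedup factor.
TIMINGS = {
    "vp9_inv_adst_adst_16x16_sub16_add_8": (209.3, 99.5),
    "vp9_inv_adst_dct_16x16_sub16_add_8": (165.2, 89.7),
    "vp9_inv_dct_adst_16x16_sub16_add_8": (165.9, 87.7),
    "vp9_inv_dct_dct_16x16_sub16_add_8": (121.3, 79.2),
    "vp9_inv_dct_dct_32x32_sub32_add_8": (745.5, 285.5),
}


def speedup(baseline, optimized):
    """Return how many times faster `optimized` is than `baseline`."""
    return baseline / optimized


def report(timings):
    """Map each function name to its AVX2 -> AVX-512 speedup factor."""
    return {name: speedup(avx2, avx512) for name, (avx2, avx512) in timings.items()}


if __name__ == "__main__":
    for name, ratio in report(TIMINGS).items():
        print(f"{name}: {ratio:.2f}x")
```

Under that assumption the 16x16 transforms come out at roughly 1.5x to 2.1x and the 32x32 DCT at roughly 2.6x over AVX2.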

Attachment: vp9_itx_avx512.patch