For 10-bit and 12-bit, these functions should stay SSE4 as well, because of packusdw. Converting them to SSSE3 takes a few extra instructions to replace it; see below.

static av_always_inline __m128i _MM_PACKUS_EPI32(__m128i a, __m128i b)
{
    /* Sign-extend the low 16 bits of each 32-bit lane so that the signed
     * pack below keeps those bits unchanged instead of saturating them. */
    a = _mm_slli_epi32(a, 16);
    a = _mm_srai_epi32(a, 16);
    b = _mm_slli_epi32(b, 16);
    b = _mm_srai_epi32(b, 16);
    a = _mm_packs_epi32(a, b);
    return a;
}

Mickaël

On 23 August 2014 at 15:22, Christophe Gisquet <christophe.gisq...@gmail.com> wrote:
> As far as I can see, the only reason those functions are SSE4 is because
> of the pextrw needed for the following block widths:
> - 2, used only by chroma;
> - 6, used by chroma and indirectly by luma;
> - 12, used by both.
> The better solution would be to convert all chroma handling to NV12, but
> it is vastly simpler to modify the above cases to not use pextrw.
>
> This is done in 2 steps:
> - Fix width of 12 to do 8+4 instead of 6+6;
> - Modify the store macros for width 2 and 6 by passing data through
>   a GPR (alas, at the cost of an extra GPR for some functions).
>
> Christophe Gisquet (2):
>   x86: hevc_mc: split differently calls
>   x86: hevc_mc: convert to ssse3
>
>  libavcodec/x86/hevc_mc.asm    |  63 +++--
>  libavcodec/x86/hevcdsp.h      |  48 ++--
>  libavcodec/x86/hevcdsp_init.c | 561 ++++++++++++++++++++++------------------------------
>  3 files changed, 362 insertions(+), 310 deletions(-)
>
> --
> 1.9.2.msysgit.0
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
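
A note on the _MM_PACKUS_EPI32 helper at the top of this message: as far as I can tell it does not saturate like packusdw; it keeps the low 16 bits of each lane. The two should agree only when every input lane is already in [0, 65535], which ought to cover clipped 10-bit and 12-bit samples. Below is a minimal scalar sketch of one lane to illustrate that assumption. The names emulated_pack_lane, packusdw_lane and count_mismatches are hypothetical and exist only for this illustration; they are not part of the patch.

```c
#include <stdint.h>

/* Hypothetical scalar model of one 32-bit lane going through
 * slli 16, srai 16, then the signed pack (packssdw). */
static uint16_t emulated_pack_lane(int32_t x)
{
    /* slli 16 + srai 16: keep the low 16 bits, sign-extended. */
    int32_t s = (int32_t)((uint32_t)x & 0xFFFFu);
    if (s >= 0x8000)
        s -= 0x10000;

    /* packssdw: signed saturation to [-32768, 32767].
     * s is already in that range, so this never changes it. */
    if (s > 32767)
        s = 32767;
    if (s < -32768)
        s = -32768;

    /* Reinterpret the 16-bit result as unsigned (modulo 2^16). */
    return (uint16_t)s;
}

/* Hypothetical scalar model of one lane of packusdw:
 * unsigned saturation to [0, 65535]. */
static uint16_t packusdw_lane(int32_t x)
{
    if (x < 0)
        return 0;
    if (x > 65535)
        return 65535;
    return (uint16_t)x;
}

/* Count the inputs in [lo, hi] for which the two models disagree. */
static long count_mismatches(int32_t lo, int32_t hi)
{
    long n = 0;
    for (int64_t x = lo; x <= hi; x++)
        if (emulated_pack_lane((int32_t)x) != packusdw_lane((int32_t)x))
            n++;
    return n;
}
```

In this model, count_mismatches(0, 65535) returns 0, while an out-of-range input such as -1 gives 0xFFFF from the emulation and 0 from the packusdw model. So the replacement looks safe only where the values are clipped before the pack.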