On Wed, Sep 30, 2015 at 9:36 PM, Ronald S. Bultje <rsbul...@gmail.com> wrote:
diff --git a/libavcodec/x86/vp9intrapred_16bpp.asm b/libavcodec/x86/vp9intrapred_16bpp.asm
+pd_65535: times 8 dd 0xffff

Duplicate of pd_0f from h264_qpel_10bit.asm.

+%if cpuflag(ssse3)
+    ; FIXME this can be done without three-op-instr by doing pshfhw m1, m0
+    ; but then interleaving decreases, measure which is faster
+    pshufb m1, m0, [pb_2to15_14_15] ; bcdefghh
+%else
+    psrldq m1, m0, 2 ; bcdefgh.
+%endif
+    pshufhw m0, m0, q3310 ; abcdefhh
+%if notcpuflag(ssse3)
+    pshufhw m1, m1, q2210 ; bcdefghh
+%endif

Move the pshufhw on m1 into the %else part instead of opening a second %if notcpuflag(ssse3) block. There's also a typo (pshfhw) in the comment.

+%if cpuflag(ssse3)
+    pshufb m0, m4
+%else
+    psrldq m0, 2 ; CDEFGHh.
+%endif
+    pshuflw m1, m1, q3321 ; GHhhhhhh
+%if notcpuflag(ssse3)
+    pshufhw m0, m0, q2210 ; CDEFGHhh
+%endif

Ditto.

+%if cpuflag(ssse3)
+    pshufb m1, m3
+    pshufb m2, m3
+%else
+    psrldq m1, 2
+    psrldq m2, 2
+    pshufhw m1, m1, q2210
+    pshufhw m2, m2, q2210
+%endif
+    mova [dstq+strideq*2], m1
+    mova [dstq+stride3q ], m2
+    lea dstq, [dstq+strideq*4]
+%if cpuflag(ssse3)
+    pshufb m1, m3
+    pshufb m2, m3
+%else
+    psrldq m1, 2
+    psrldq m2, 2
+    pshufhw m1, m1, q2210
+    pshufhw m2, m2, q2210
+%endif
+    mova [dstq+strideq*0], m1
+    mova [dstq+strideq*1], m2
+%if cpuflag(ssse3)
+    pshufb m1, m3
+    pshufb m2, m3
+%else
+    psrldq m1, 2
+    psrldq m2, 2
+    pshufhw m1, m1, q2210
+    pshufhw m2, m2, q2210
+%endif
+    mova [dstq+strideq*2], m1
+    mova [dstq+stride3q ], m2

Possibly some deduplication here. There are a few very similar segments in other places as well; it might be possible to turn them into a macro.

+%if cpuflag(ssse3)
+    pshufb m2, [pb_4_5_8to13_8x0]
+%else
+    pshuflw m2, m2, q2222
+%endif
+    psrldq m0, 6
+%if notcpuflag(ssse3)
+    psrldq m2, 6
+%endif

Move the psrldq on m2 into the %else part.

It's quite a large patch, so I mostly just skimmed through it fairly quickly, but the rest looks fine to me.
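
For anyone who wants to sanity-check the lane comments quoted above, here is a minimal scalar sketch in Python. It is not FFmpeg code. It models a register as eight 16-bit words and assumes that pb_2to15_14_15 holds the byte indices 2 through 15 followed by 14 and 15, which is only inferred from the constant's name. The helper names (q, shift_dup_sse2, shift_dup_ssse3) are made up for illustration. Under those assumptions, the fallback pair psrldq + pshufhw q2210 gives the same "bcdefghh" result as the single pshufb, and the pair touches only one register, so both instructions can sit together in the %else branch.

```python
# Scalar model of the shuffles quoted in the review. A register is a list
# of eight 16-bit words, lowest word first. Illustration only, not FFmpeg code.


def _to_bytes(words):
    """Pack eight 16-bit words into 16 little-endian bytes."""
    return b"".join(w.to_bytes(2, "little") for w in words)


def _to_words(raw):
    """Unpack the first 16 bytes into eight 16-bit words."""
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, 16, 2)]


def q(a, b, c, d):
    """Hypothetical helper mirroring x86inc's qABCD constants (q3210 = identity)."""
    return (a << 6) | (b << 4) | (c << 2) | d


def psrldq(words, nbytes):
    """Model PSRLDQ: shift the 128-bit value right by nbytes bytes, zero-filling."""
    raw = _to_bytes(words)
    return _to_words(raw[nbytes:] + bytes(nbytes))


def pshufhw(words, imm):
    """Model PSHUFHW: permute the four high words, copy the low words through."""
    high = words[4:]
    return words[:4] + [high[(imm >> (2 * k)) & 3] for k in range(4)]


def pshufb(words, mask):
    """Model PSHUFB: pick source bytes by index; a set high bit yields zero."""
    raw = _to_bytes(words)
    return _to_words(bytes(0 if m & 0x80 else raw[m & 0x0F] for m in mask))


# ASSUMPTION: contents of pb_2to15_14_15, guessed from its name.
PB_2TO15_14_15 = list(range(2, 16)) + [14, 15]


def shift_dup_sse2(words):
    """Fallback path: both instructions grouped, as they would be in %else."""
    return pshufhw(psrldq(words, 2), q(2, 2, 1, 0))


def shift_dup_ssse3(words):
    """SSSE3 path: a single byte shuffle."""
    return pshufb(words, PB_2TO15_14_15)
```

Calling shift_dup_sse2 and shift_dup_ssse3 on the same eight-word list should return identical lists, with every word moved down one lane and the last word duplicated. The same model can be reused to check the q3310 comment ("abcdefhh") by calling pshufhw(words, q(3, 3, 1, 0)).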