On Tue, Jan 10, 2012 at 10:58 AM, Christophe Gisquet <[email protected]> wrote:
> Hi,
>
> the attached patch deals with a low-hanging fruit. It should be
> entirely independent from my other patches.

> + movd m4, [%3+0]
> + movd m5, [%4+0]
> + movd m6, [%3+4]
> + movd m7, [%4+4]
> + punpcklbw m4, m0
> + punpcklbw m5, m0
> + punpcklbw m6, m0
> + punpcklbw m7, m0

You don't need separate MMX and SSE macros. Just use movh. This code
is needlessly redundant right now.

> + psllw m4, 7
> + psllw m5, 7
> + psllw m6, 7
> + psllw m7, 7
> + pmulhw m4, m3
> + pmulhw m5, m2
> + pmulhw m6, m3
> + pmulhw m7, m2

Can you just scale m2/m3 up by <<7 to begin with, or are they too
large for this?

> +%if %1 == 0

You need comments explaining these arguments, right now they're opaque.

> + cmp r6, 0 ; are both multiple of 2^9?

Does this happen often? Is it worth optimizing for? Comment about this.

Consider whether the pmac* instructions might be usable here (xop).

Jason
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
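
The question about scaling m2/m3 up by <<7 can be checked with a scalar
model of a single pmulhw lane. The sketch below is not taken from the
patch: the helper names (pmulhw, shift_pixel, prescaled_weight,
count_mismatches, self_check) are hypothetical, pixels are assumed to be
unsigned 8-bit values, and the real range of the weights is not known
from this mail. Under those assumptions the two orders should agree
whenever the weight still fits in 16 bits after the shift, that is for
weights in [-256, 255], and diverge once it wraps.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one pmulhw lane: high 16 bits of a signed 16x16 product.
 * Assumes >> on a negative int is an arithmetic shift, as on gcc/clang. */
static int16_t pmulhw(int16_t a, int16_t b)
{
    return (int16_t)(((int32_t)a * (int32_t)b) >> 16);
}

/* Order used in the quoted patch: psllw the pixel by 7, then pmulhw. */
static int16_t shift_pixel(uint8_t pix, int16_t w)
{
    return pmulhw((int16_t)(pix * 128), w);
}

/* Suggested order: the weight is scaled by 1 << 7 once, up front.
 * The cast models a 16-bit lane: it wraps once w leaves [-256, 255]
 * (modulo conversion, as gcc/clang implement it). */
static int16_t prescaled_weight(uint8_t pix, int16_t w)
{
    int16_t w7 = (int16_t)(w * 128);
    return pmulhw((int16_t)pix, w7);
}

/* Number of 8-bit pixel values for which the two orders disagree. */
static int count_mismatches(int16_t w)
{
    int bad = 0;
    for (int pix = 0; pix < 256; pix++)
        if (shift_pixel((uint8_t)pix, w) != prescaled_weight((uint8_t)pix, w))
            bad++;
    return bad;
}

/* Usage: identical inside the safe range, different once the weight wraps. */
static void self_check(void)
{
    assert(count_mismatches(64) == 0);
    assert(count_mismatches(-256) == 0);
    assert(count_mismatches(300) > 0);
}
```

If the weights in the patch are guaranteed to stay within that range,
the four psllw instructions could be dropped in favour of one shift on
m2/m3 outside the loop; if not, the per-pixel shift has to stay.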
