Hi,

On Sat, Oct 22, 2011 at 11:01 AM, Jason Garrett-Glaser <[email protected]> wrote:
> +    mova            m0, [r0+r2*4+mmsize*0]
> +    mova            m1, [r0+r2*4+mmsize*1]
> +    mova            m2, [r0+r2*4+mmsize*2]
> +    mova            m3, [r0+r2*4+mmsize*3]
> +    paddd           m0, m4
> +    paddd           m1, m4
> +    paddd           m2, m4
> +    paddd           m3, m4
>
> In AVX, can't this be:
>
> paddd m0, m4, [r0+r2*4+mmsize*0]
>
> or something of the sort?
>
> We might have to ifdef it because mova between regs is an extra uop on
> Intel, whereas a load isn't.

Good idea, see attached.

Ronald
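
For readers without the attachment, here is a minimal sketch of the ifdef'd variant discussed above. It is not necessarily what the attached patch does. It assumes x86inc.asm is included, that m4 already holds the constant being added (as in the quoted hunk), and that the AVX check is spelled cpuflag(avx); that spelling is an assumption and may differ between x86inc versions.

```asm
; Sketch only, under the assumptions stated above.
%if cpuflag(avx)                      ; assumed spelling of the AVX check
    ; VEX three-operand form: m0 = m4 + [mem], the load is folded into the add
    paddd           m0, m4, [r0+r2*4+mmsize*0]
    paddd           m1, m4, [r0+r2*4+mmsize*1]
    paddd           m2, m4, [r0+r2*4+mmsize*2]
    paddd           m3, m4, [r0+r2*4+mmsize*3]
%else
    ; pre-AVX: explicit load followed by a two-operand add, as in the quoted hunk
    mova            m0, [r0+r2*4+mmsize*0]
    mova            m1, [r0+r2*4+mmsize*1]
    mova            m2, [r0+r2*4+mmsize*2]
    mova            m3, [r0+r2*4+mmsize*3]
    paddd           m0, m4
    paddd           m1, m4
    paddd           m2, m4
    paddd           m3, m4
%endif
```

The reason for keeping both branches is the one given in the quote: without AVX, writing the three-operand form would presumably be emulated as a register-to-register mova plus a paddd with a memory operand, and that register move is the extra uop mentioned above, whereas the explicit load is not.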
0001-swscale-write-yuv2plane1-MMX-SSE2-SSE4-AVX-functions.patch
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
