Hi,

On Sat, Oct 22, 2011 at 11:01 AM, Jason Garrett-Glaser <[email protected]> wrote:
> +    mova            m0, [r0+r2*4+mmsize*0]
> +    mova            m1, [r0+r2*4+mmsize*1]
> +    mova            m2, [r0+r2*4+mmsize*2]
> +    mova            m3, [r0+r2*4+mmsize*3]
> +    paddd           m0, m4
> +    paddd           m1, m4
> +    paddd           m2, m4
> +    paddd           m3, m4
>
> In AVX, can't this be:
>
> paddd m0, m4, [r0+r2*4+mmsize*0]
>
> or something of the sort?
>
> We might have to ifdef it because mova between regs is an extra uop on
> Intel, whereas a load isn't.

Good idea, see attached.
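
For illustration only, a rough sketch of how the suggestion could look. This
is not necessarily what the attached patch does. It assumes a version of
x86inc.asm that provides the cpuflag() macro and accepts the three-operand
form, and it reuses the registers and addressing from the quoted hunk.

```
%if cpuflag(avx)
    ; AVX: non-destructive three-operand form, so each load is folded
    ; into the add and no separate mova is needed
    paddd           m0, m4, [r0+r2*4+mmsize*0]
    paddd           m1, m4, [r0+r2*4+mmsize*1]
    paddd           m2, m4, [r0+r2*4+mmsize*2]
    paddd           m3, m4, [r0+r2*4+mmsize*3]
%else
    ; pre-AVX: keep the explicit loads; writing the three-operand form
    ; here would make x86inc emulate it with a reg-reg mova (m0 <- m4)
    ; followed by paddd m0, [mem], which costs the extra uop noted above
    mova            m0, [r0+r2*4+mmsize*0]
    mova            m1, [r0+r2*4+mmsize*1]
    mova            m2, [r0+r2*4+mmsize*2]
    mova            m3, [r0+r2*4+mmsize*3]
    paddd           m0, m4
    paddd           m1, m4
    paddd           m2, m4
    paddd           m3, m4
%endif
```

Both branches leave m0..m3 holding the loaded values plus m4. Only the
instruction count and uop cost differ.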

Ronald

Attachment: 0001-swscale-write-yuv2plane1-MMX-SSE2-SSE4-AVX-functions.patch
Description: Binary data