Hi,

On 12/01/2012 06:17 AM, Christophe Gisquet wrote:
> +cglobal sbr_qmf_post_shuffle, 2,3,3,W,z

This should have INIT_XMM sse before the cglobal line.

> +    lea       r2q, [zq + (64-4)*4]
> +.loop:
> +    mova       m0, [r2q]
> +    mova       m1, [zq ]
> +    xorps      m0, [ps_neg]
> +    shufps     m0, m0, 0x1B
> +    mova       m2, m0
> +    unpcklps   m0, m1
> +    unpckhps   m2, m1
> +    mova  [Wq +  0], m0
> +    mova  [Wq + 16], m2

Putting [ps_neg] in a register and swapping m0 and m2 in the unpacking
would allow the 3-operand AVX forms of the XMM instructions to be used,
like so:

    mova       m3, [ps_neg]
.loop:
    mova       m1, [zq]
    xorps      m0, m3, [r2q]
    shufps     m0, m0, m0, q0123
    unpcklps   m2, m0, m1
    unpckhps   m0, m0, m1
    mova  [Wq +  0], m2
    mova  [Wq + 16], m0

> +    add        Wq, 32
> +    sub       r2q, 16
> +    add        zq, 16
> +    cmp        zq, r2q
> +    jl      .loop
> +    REP_RET
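
For reference, here is a rough scalar C model of what the loop computes,
next to a lane-by-lane emulation of the reordered sequence above. It is
only a sketch: it assumes ps_neg holds the sign-bit mask in all four
lanes (the constant is not shown in the quoted hunk), and the function
names are made up for illustration.

```c
/* Scalar model: W[k] = { -z[63 - k], z[k] } for k = 0..31.
 * Assumes ps_neg flips the sign bit of every lane. */
void qmf_post_shuffle_ref(float W[32][2], const float z[64])
{
    for (int k = 0; k < 32; k++) {
        W[k][0] = -z[63 - k];
        W[k][1] =  z[k];
    }
}

/* Lane-by-lane emulation of the suggested loop (hypothetical helper). */
void qmf_post_shuffle_lanes(float W[32][2], const float z[64])
{
    float       *out = &W[0][0];   /* Wq  */
    const float *lo  = z;          /* zq  */
    const float *hi  = z + 60;     /* r2q = zq + (64-4)*4 bytes */

    while (lo < hi) {              /* cmp zq, r2q / jl .loop */
        float m0[4], m1[4], m2[4], t[4];

        for (int i = 0; i < 4; i++)
            m1[i] = lo[i];         /* mova   m1, [zq]          */
        for (int i = 0; i < 4; i++)
            m0[i] = -hi[3 - i];    /* xorps + shufps ..., q0123 */

        /* unpcklps m2, m0, m1: interleave the low halves */
        m2[0] = m0[0]; m2[1] = m1[0]; m2[2] = m0[1]; m2[3] = m1[1];
        /* unpckhps m0, m0, m1: interleave the high halves */
        t[0]  = m0[2]; t[1]  = m1[2]; t[2]  = m0[3]; t[3]  = m1[3];

        for (int i = 0; i < 4; i++) {
            out[i]     = m2[i];    /* mova [Wq +  0], m2 */
            out[4 + i] = t[i];     /* mova [Wq + 16], m0 */
        }

        out += 8;                  /* add Wq, 32  */
        hi  -= 4;                  /* sub r2q, 16 */
        lo  += 4;                  /* add zq, 16  */
    }
}
```

Both versions should fill W identically, so the register swap only
changes which register holds which half, not what gets stored.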

Thanks,
Justin