Hi,
On 12/01/2012 06:17 AM, Christophe Gisquet wrote:
> +INIT_XMM sse
> +cglobal sbr_qmf_post_shuffle, 2,3,3,W,z
> + lea r2q, [zq + (64-4)*4]
> +.loop:
> + mova m0, [r2q]
> + mova m1, [zq ]
> + xorps m0, [ps_neg]
> + shufps m0, m0, 0x1B
> + mova m2, m0
> + unpcklps m0, m1
> + unpckhps m2, m1
> + mova [Wq + 0], m0
> + mova [Wq + 16], m2
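Putting [ps_neg] in a register and swapping m0 and m2 in the unpacking
would allow the 3-operand (AVX-style) XMM forms to be used, which in an
AVX build saves the explicit "mova m2, m0" copy. Since this also uses m3,
the XMM register count in cglobal would need to be 4 instead of 3.
Like so: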
mova m3, [ps_neg]
.loop:
mova m1, [zq]
xorps m0, m3, [r2q]
shufps m0, m0, m0, q0123
unpcklps m2, m0, m1
unpckhps m0, m0, m1
mova [Wq + 0], m2
mova [Wq + 16], m0
> + add Wq, 32
> + sub r2q, 16
> + add zq, 16
> + cmp zq, r2q
> + jl .loop
> + REP_RET
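
To check that the reordering does not change the output, here is a rough scalar C model of both instruction sequences. It is only a sketch: the vec4 type, the helper names and the three function names are made up for illustration, and it assumes ps_neg has the sign bit set in all four lanes (the constant is not shown in the quoted hunk).

```c
#include <string.h>

/* Four-lane float vector standing in for an XMM register (hypothetical helper type). */
typedef struct { float f[4]; } vec4;

static vec4 load4(const float *p) { vec4 v; memcpy(v.f, p, sizeof v.f); return v; }
static void store4(float *p, vec4 v) { memcpy(p, v.f, sizeof v.f); }

/* Models xorps with ps_neg. ASSUMPTION: ps_neg flips the sign bit of all four lanes. */
static vec4 xor_ps_neg(vec4 a)
{
    for (int i = 0; i < 4; i++)
        a.f[i] = -a.f[i];
    return a;
}

/* Models shufps a, a, 0x1B (q0123): reverses the four lanes. */
static vec4 reverse4(vec4 a)
{
    vec4 r = {{ a.f[3], a.f[2], a.f[1], a.f[0] }};
    return r;
}

/* Models unpcklps: interleave the low two lanes of a and b. */
static vec4 unpcklps(vec4 a, vec4 b)
{
    vec4 r = {{ a.f[0], b.f[0], a.f[1], b.f[1] }};
    return r;
}

/* Models unpckhps: interleave the high two lanes of a and b. */
static vec4 unpckhps(vec4 a, vec4 b)
{
    vec4 r = {{ a.f[2], b.f[2], a.f[3], b.f[3] }};
    return r;
}

/* Scalar statement of the result implied by the lane model above. */
void qmf_post_shuffle_ref(float *W, const float *z)
{
    for (int k = 0; k < 32; k++) {
        W[2 * k]     = -z[63 - k];
        W[2 * k + 1] =  z[k];
    }
}

/* Register usage of the quoted patch: copy m0 to m2, then unpack in place. */
void qmf_post_shuffle_patch(float *W, const float *z)
{
    const float *r2 = z + 60;            /* lea r2q, [zq + (64-4)*4] */
    do {
        vec4 m0 = load4(r2);
        vec4 m1 = load4(z);
        m0 = xor_ps_neg(m0);
        m0 = reverse4(m0);
        vec4 m2 = m0;                    /* mova m2, m0 */
        m0 = unpcklps(m0, m1);
        m2 = unpckhps(m2, m1);
        store4(W, m0);
        store4(W + 4, m2);
        W += 8; r2 -= 4; z += 4;
    } while (z < r2);                    /* cmp zq, r2q / jl .loop */
}

/* Register usage of the suggested variant: 3-operand forms, m0 and m2 swapped. */
void qmf_post_shuffle_suggested(float *W, const float *z)
{
    const float *r2 = z + 60;
    do {
        vec4 m1 = load4(z);
        vec4 m0 = xor_ps_neg(load4(r2)); /* xorps m0, m3, [r2q] */
        m0 = reverse4(m0);               /* shufps m0, m0, m0, q0123 */
        vec4 m2 = unpcklps(m0, m1);      /* unpcklps m2, m0, m1 */
        m0 = unpckhps(m0, m1);           /* unpckhps m0, m0, m1 */
        store4(W, m2);
        store4(W + 4, m0);
        W += 8; r2 -= 4; z += 4;
    } while (z < r2);
}
```

Both loops run 8 times and write 64 floats. Under the stated assumption about ps_neg, the two variants produce identical output, with W[2k] = -z[63-k] and W[2k+1] = z[k].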
Thanks,
Justin