On 12/02/2012 05:04 AM, Christophe Gisquet wrote:
> movh generates a SSE2 movq instruction, so explicitly use movlps.
> 
> -- Christophe
> 
> 
> 0004-SBR-DSP-x86-implement-SSE-qmf_pre_shuffle.patch
> 
> 
> From c237585dbbc307e41378a2c2f619672fb1d58298 Mon Sep 17 00:00:00 2001
> From: Christophe Gisquet <[email protected]>
> Date: Sun, 25 Nov 2012 09:10:36 +0100
> Subject: [PATCH 04/12] SBR DSP x86: implement SSE qmf_pre_shuffle
> 
> From 253 to 185c.
> ---
>  libavcodec/x86/sbrdsp.asm    |   23 +++++++++++++++++++++++
>  libavcodec/x86/sbrdsp_init.c |    2 ++
>  2 files changed, 25 insertions(+), 0 deletions(-)
> 
> diff --git a/libavcodec/x86/sbrdsp.asm b/libavcodec/x86/sbrdsp.asm
> index ef2260f..7fe136f 100644
> --- a/libavcodec/x86/sbrdsp.asm
> +++ b/libavcodec/x86/sbrdsp.asm
> @@ -224,3 +224,26 @@ cglobal sbr_qmf_post_shuffle, 2,3,3,W,z
>      cmp        zq, r2q
>      jl      .loop
>      REP_RET
> +
> +cglobal sbr_qmf_pre_shuffle, 1,4,4,z

Please add INIT_XMM sse before the cglobal line.

Also, I only see 3 gp registers being used, not 4. Looks like r1 is unused?

> +    movlps     m3, [zq]
> +    lea       r3q, [zq + 64*4]
> +    lea       r2q, [zq + (64-3)*4]
> +    add        zq, 4
> +.loop:
> +    movu       m0, [r2q]
> +    movu       m1, [zq ]
> +    xorps      m0, [ps_neg]
> +    shufps     m0, m0, 0x1B
> +    mova       m2, m0
> +    unpcklps   m0, m1
> +    unpckhps   m2, m1
> +    mova  [r3q +  0], m0
> +    mova  [r3q + 16], m2

The loads from zq and r2q use movu, but the stores to r3q use mova.
How can z be unaligned while r3 is aligned?

-Justin
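
For reference, the byte offsets in the quoted setup code can be checked
with a few lines of C. This is only a sketch of the arithmetic: it assumes
that z itself is 16-byte aligned on entry, which the patch does not state,
and it covers only the pointer values before the first loop iteration,
since the pointer updates at the end of the loop are not part of the quote.
The names OFF_LOAD_LO, OFF_LOAD_HI, OFF_STORE, is_aligned16 and
check_offsets are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offsets, relative to the incoming z, of the three pointers
 * set up before .loop in the quoted asm. */
enum {
    OFF_LOAD_LO = 4,            /* zq  after "add zq, 4"           */
    OFF_LOAD_HI = (64 - 3) * 4, /* r2q = zq + (64-3)*4 = 244 bytes */
    OFF_STORE   = 64 * 4        /* r3q = zq + 64*4     = 256 bytes */
};

/* Returns 1 if an address that is base_misalign bytes past a 16-byte
 * boundary, plus offset, lands on a 16-byte boundary again. */
static int is_aligned16(size_t base_misalign, size_t offset)
{
    return (base_misalign + offset) % 16 == 0;
}

/* Assumption: z is 16-byte aligned (base_misalign == 0). */
static void check_offsets(void)
{
    assert(!is_aligned16(0, OFF_LOAD_LO)); /* 4 % 16 == 4: needs movu   */
    assert(!is_aligned16(0, OFF_LOAD_HI)); /* 244 % 16 == 4: needs movu */
    assert( is_aligned16(0, OFF_STORE));   /* 256 % 16 == 0: mova is ok */
}
```

Under that assumption both load addresses sit 4 bytes past a 16-byte
boundary while the store address is on one, which would be consistent with
the movu/mova split in the patch. If z is not guaranteed to be aligned, the
mova stores would not be safe either.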
