On 12/02/2012 05:04 AM, Christophe Gisquet wrote:
> movh generates a SSE2 movq instruction, so explicitly use movlps.
>
> -- Christophe
>
>
> 0004-SBR-DSP-x86-implement-SSE-qmf_pre_shuffle.patch
>
>
> From c237585dbbc307e41378a2c2f619672fb1d58298 Mon Sep 17 00:00:00 2001
> From: Christophe Gisquet <[email protected]>
> Date: Sun, 25 Nov 2012 09:10:36 +0100
> Subject: [PATCH 04/12] SBR DSP x86: implement SSE qmf_pre_shuffle
>
> From 253 to 185c.
> ---
>  libavcodec/x86/sbrdsp.asm    | 23 +++++++++++++++++++++++
>  libavcodec/x86/sbrdsp_init.c |  2 ++
>  2 files changed, 25 insertions(+), 0 deletions(-)
>
> diff --git a/libavcodec/x86/sbrdsp.asm b/libavcodec/x86/sbrdsp.asm
> index ef2260f..7fe136f 100644
> --- a/libavcodec/x86/sbrdsp.asm
> +++ b/libavcodec/x86/sbrdsp.asm
> @@ -224,3 +224,26 @@ cglobal sbr_qmf_post_shuffle, 2,3,3,W,z
>      cmp         zq, r2q
>      jl          .loop
>      REP_RET
> +
> +cglobal sbr_qmf_pre_shuffle, 1,4,4,z

Please add INIT_XMM sse.

Also, I only see 3 gp registers being used, not 4. Looks like r1 is unused?

> +    movlps      m3, [zq]
> +    lea         r3q, [zq + 64*4]
> +    lea         r2q, [zq + (64-3)*4]
> +    add         zq, 4
> +.loop:
> +    movu        m0, [r2q]
> +    movu        m1, [zq ]
> +    xorps       m0, [ps_neg]
> +    shufps      m0, m0, 0x1B
> +    mova        m2, m0
> +    unpcklps    m0, m1
> +    unpckhps    m2, m1
> +    mova        [r3q +  0], m0
> +    mova        [r3q + 16], m2

How can z be unaligned while r3 is aligned?

-Justin
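
For context on the alignment question, the byte offsets that the quoted code forms before entering the loop can be checked with a little arithmetic. The sketch below is not part of the patch: it assumes z is 16-byte aligned on entry (the quoted mail does not say so), the helper names misalign and check_quoted_offsets are made up for illustration, and it covers only the initial pointers, since the loop's pointer updates are not shown in the quote.

```c
#include <assert.h>
#include <stdint.h>

/* Distance of (base + byte_off) from the previous 16-byte boundary.
 * A result of 0 means an aligned SSE load/store (mova) is safe. */
static unsigned misalign(uintptr_t base, long byte_off)
{
    return (unsigned)((base + (uintptr_t)byte_off) & 15);
}

/* Check the three initial pointers from the quoted asm, assuming a
 * hypothetical 16-byte aligned z. */
static void check_quoted_offsets(void)
{
    uintptr_t z = 0x1000; /* assumed aligned base address */

    assert(misalign(z, 4) == 4);            /* add zq, 4                 */
    assert(misalign(z, (64 - 3) * 4) == 4); /* lea r2q, [zq + (64-3)*4]  */
    assert(misalign(z, 64 * 4) == 0);       /* lea r3q, [zq + 64*4]      */
}
```

Under that assumption, zq after the add and r2q both sit 4 bytes past a 16-byte boundary, while r3q stays on one, which would be consistent with movu on the loads and mova on the stores.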
