On 28/03/14 4:15 PM, Jason Garrett-Glaser wrote:
> On Thu, Mar 20, 2014 at 11:37 AM, James Almer <jamr...@gmail.com> wrote:
>> Build only on x86_32 targets.
>>
>> Signed-off-by: James Almer <jamr...@gmail.com>
>> ---
>>  libavcodec/x86/dcadsp.asm    | 55 +++++++++++++++++++++++++++++++++-----------
>>  libavcodec/x86/dcadsp_init.c | 45 ++++++++++++++++++++++--------------
>>  2 files changed, 70 insertions(+), 30 deletions(-)
>>
>> diff --git a/libavcodec/x86/dcadsp.asm b/libavcodec/x86/dcadsp.asm
>> index 56039ba..970ec3d 100644
>> --- a/libavcodec/x86/dcadsp.asm
>> +++ b/libavcodec/x86/dcadsp.asm
>> @@ -199,15 +199,31 @@ INIT_XMM sse
>>  DCA_LFE_FIR 0
>>  DCA_LFE_FIR 1
>>
>> -INIT_XMM sse2
>> +%macro SETZERO 1
>> +%if cpuflag(sse2)
>> +    pxor %1, %1
>> +%else
>> +    xorps %1, %1, %1
>> +%endif
>> +%endmacro
>
> Is there some reason we can't just use xorps here for all versions? I
> mean, it is float data, right?
>
>>  %if ARCH_X86_32 || WIN64
>> +%if cpuflag(sse2)
>>      movd scale, scalem
>> +%else
>> +    movss scale, scalem
>> +%endif
>
> Same here; does this need to be ifdeffed?
>
> Otherwise looks okay.
>
> Jason

You're right that it's all float data, but both Christophe and I tested it, and xorps/shufps was a bit slower than pxor/pshufd (about five cycles slower in my tests), so I decided to use some ifdeffery to keep the SSE2 version intact.