On 28/03/14 4:15 PM, Jason Garrett-Glaser wrote:
> On Thu, Mar 20, 2014 at 11:37 AM, James Almer <jamr...@gmail.com> wrote:
>> Build only on x86_32 targets.
>>
>> Signed-off-by: James Almer <jamr...@gmail.com>
>> ---
>>  libavcodec/x86/dcadsp.asm    | 55 +++++++++++++++++++++++++++++++++-----------
>>  libavcodec/x86/dcadsp_init.c | 45 ++++++++++++++++++++++--------------
>>  2 files changed, 70 insertions(+), 30 deletions(-)
>>
>> diff --git a/libavcodec/x86/dcadsp.asm b/libavcodec/x86/dcadsp.asm
>> index 56039ba..970ec3d 100644
>> --- a/libavcodec/x86/dcadsp.asm
>> +++ b/libavcodec/x86/dcadsp.asm
>> @@ -199,15 +199,31 @@ INIT_XMM sse
>>  DCA_LFE_FIR 0
>>  DCA_LFE_FIR 1
>>
>> -INIT_XMM sse2
>> +%macro SETZERO 1
>> +%if cpuflag(sse2)
>> +    pxor          %1, %1
>> +%else
>> +    xorps         %1, %1, %1
>> +%endif
>> +%endmacro
> 
> Is there some reason we can't just use xorps here for all versions?  I
> mean, it is float data, right?
> 
>>  %if ARCH_X86_32 || WIN64
>> +%if cpuflag(sse2)
>>      movd       scale, scalem
>> +%else
>> +    movss      scale, scalem
>> +%endif
> 
> Same here; does this need to be ifdeffed?
> 
> Otherwise looks okay.
> 
> Jason

You're right that it's all float data, but both Christophe and I tested it, and
xorps/shufps was a bit slower than pxor/pshufd (at least in my tests it was
about five cycles slower), so I decided to use some ifdeffery to keep the
SSE2 version intact.
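
To make it clear that the ifdefs only matter for speed and not for the
result, here is a rough C sketch using intrinsics. It is not part of the
patch, and the function names (same_bits, zeroing_matches,
scalar_load_matches, self_check) are made up for illustration. It assumes an
x86 target with SSE2, and the compiler is free to pick a different
instruction than the one named in each comment.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in the SSE ones */
#include <stdint.h>
#include <string.h>

/* Returns 1 if the two 128-bit values are bitwise identical. */
static int same_bits(__m128i a, __m128i b)
{
    return _mm_movemask_epi8(_mm_cmpeq_epi8(a, b)) == 0xFFFF;
}

/* Zeroing in the integer domain (pxor-style) vs. the float domain
 * (xorps-style): both leave the register with all bits cleared. */
static int zeroing_matches(void)
{
    __m128i zi = _mm_setzero_si128();
    __m128i zf = _mm_castps_si128(_mm_setzero_ps());
    return same_bits(zi, zf);
}

/* Loading one 32-bit scalar via the integer path (movd-style) vs. the
 * float path (movss-style): low lane holds the value, the rest is zero. */
static int scalar_load_matches(float scale)
{
    int32_t bits;
    memcpy(&bits, &scale, sizeof(bits));   /* reinterpret without aliasing issues */
    __m128i vi = _mm_cvtsi32_si128(bits);
    __m128i vf = _mm_castps_si128(_mm_load_ss(&scale));
    return same_bits(vi, vf);
}

/* Small self-check tying the two comparisons together. */
static void self_check(void)
{
    assert(zeroing_matches());
    assert(scalar_load_matches(0.5f));
}
```

Since both paths produce the same bits, picking between them comes down to
the cycle counts measured on real hardware, which is what the ifdefs
preserve for the SSE2 version.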
_______________________________________________
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel
