Re: [libav-devel] [PATCH 3/3] x86: add XOP code for FFT

Ronald S. Bultje Fri, 11 May 2012 14:20:13 -0700

Hi,

On Fri, May 11, 2012 at 1:44 PM, Jason Garrett-Glaser <[email protected]> wrote:
> On Fri, May 11, 2012 at 1:35 PM, Vitor Sessak <[email protected]> wrote:
>> On 05/11/2012 10:31 PM, Vitor Sessak wrote:
>>>
>>> ---
>>>  libavcodec/x86/fft.c       |    9 +++-
>>>  libavcodec/x86/fft.h       |    2 +
>>>  libavcodec/x86/fft_mmx.asm |  108 +++++++++++++++++++++++++-------------------
>>>  libavcodec/x86/fft_sse.c   |    7 +++
>>>  libavutil/x86/x86inc.asm   |    4 +-
>>>  5 files changed, 81 insertions(+), 49 deletions(-)
>>
>>
>> Note that I don't have the hardware to test if this actually works, so
>> consider this patch more as a request for testers.
>
> I should note, in my experience, 256-bit float is always slower than
> 128-bit on Bulldozer/Trinity, so the XOP functions should all be xmm
> regs, not ymm.


That sounds like the old Core Duo SSE2 performance problem again?

Ronald