Re: [libav-devel] [PATCH 3/3] x86: add XOP code for FFT

Jason Garrett-Glaser Fri, 11 May 2012 14:24:53 -0700

On Fri, May 11, 2012 at 2:20 PM, Ronald S. Bultje <[email protected]> wrote:
> Hi,
>
> On Fri, May 11, 2012 at 1:44 PM, Jason Garrett-Glaser <[email protected]> wrote:
>> On Fri, May 11, 2012 at 1:35 PM, Vitor Sessak <[email protected]> wrote:
>>> On 05/11/2012 10:31 PM, Vitor Sessak wrote:
>>>>
>>>> ---
>>>>  libavcodec/x86/fft.c       |    9 +++-
>>>>  libavcodec/x86/fft.h       |    2 +
>>>>  libavcodec/x86/fft_mmx.asm |  108 +++++++++++++++++++++++++-------------------
>>>>  libavcodec/x86/fft_sse.c   |    7 +++
>>>>  libavutil/x86/x86inc.asm   |    4 +-
>>>>  5 files changed, 81 insertions(+), 49 deletions(-)
>>>
>>>
>>> Note that I don't have the hardware to test if this actually works, so
>>> consider this patch more as a request for testers.
>>
>> I should note, in my experience, 256-bit float is always slower than
>> 128-bit on Bulldozer/Trinity, so the XOP functions should all be xmm
>> regs, not ymm.
>
> That sounds like the old Core Duo SSE2 performance problem again?


Kind of.  Remember, AVX is an Intel thing, XOP is an AMD thing, and
while AMD supports AVX, they don't have 256-bit execution units yet.
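
To make the dispatch consequence concrete, here is a rough sketch of the selection rule I have in mind. The flag names and the function are hypothetical, made up for illustration; they are not libav's AV_CPU_FLAG_* values or its actual init code. The only assumption it encodes is the one above: a CPU reporting XOP is an AMD part without 256-bit execution units, so float code should stay on xmm even though AVX is advertised.

```c
/* Hypothetical feature bits for illustration only;
 * these are NOT libav's AV_CPU_FLAG_* values. */
enum {
    CPU_AVX = 1 << 0,  /* 256-bit ymm instructions are available */
    CPU_XOP = 1 << 1   /* AMD-only extension (Bulldozer/Trinity) */
};

/* Return the register width, in bits, to prefer for float SIMD code. */
static int preferred_float_width(int cpu_flags)
{
    /* XOP is checked first: it implies an AMD core with no 256-bit
     * execution units, so xmm is preferred even when AVX is set. */
    if (cpu_flags & CPU_XOP)
        return 128;
    /* AVX without XOP: assume ymm is worthwhile. */
    if (cpu_flags & CPU_AVX)
        return 256;
    /* Plain SSE fallback. */
    return 128;
}
```

So a Bulldozer-style flag set (CPU_AVX | CPU_XOP) would end up on the 128-bit path, while an AVX-only CPU would get the 256-bit one. Checking XOP before AVX is the whole point of the ordering.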

Jason
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
