On Fri, May 11, 2012 at 2:20 PM, Ronald S. Bultje <[email protected]> wrote:
> Hi,
>
> On Fri, May 11, 2012 at 1:44 PM, Jason Garrett-Glaser <[email protected]> wrote:
>> On Fri, May 11, 2012 at 1:35 PM, Vitor Sessak <[email protected]> wrote:
>>> On 05/11/2012 10:31 PM, Vitor Sessak wrote:
>>>>
>>>> ---
>>>> libavcodec/x86/fft.c | 9 +++-
>>>> libavcodec/x86/fft.h | 2 +
>>>> libavcodec/x86/fft_mmx.asm | 108
>>>> +++++++++++++++++++++++++------------------
>>>> libavcodec/x86/fft_sse.c | 7 +++
>>>> libavutil/x86/x86inc.asm | 4 +-
>>>> 5 files changed, 81 insertions(+), 49 deletions(-)
>>>
>>>
>>> Note that I don't have the hardware to test if this actually work, so
>>> consider this patch more as a request for testers.
>>
>> I should note, in my experience, 256-bit float is always slower than
>> 128-bit on Bulldozer/Trinity, so the XOP functions should all be xmm
>> regs, not ymm.
>
> That sounds like the old Core Duo SSE2 performance problem again?

Kind of. Remember, AVX is an Intel thing, XOP is an AMD thing, and while AMD supports AVX, they don't have 256-bit execution units yet.

Jason
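
For illustration only, here is a minimal sketch of the selection rule implied above: if a CPU reports XOP, treat it as an AMD part without 256-bit execution units and stay on xmm, even though it also reports AVX. The flag names and the function are hypothetical, not libav's actual cpu flags or init code, and the "AVX without XOP means ymm is worthwhile" branch is an assumption.

```c
#include <assert.h>

/* Hypothetical feature bits for illustration; not libav's AV_CPU_FLAG_* values. */
enum {
    CPU_SSE = 1 << 0,
    CPU_AVX = 1 << 1,
    CPU_XOP = 1 << 2,
};

/* Returns the preferred float vector width in bits, or 0 for scalar code. */
static int preferred_float_width(unsigned flags)
{
    /* Reject bits this sketch does not know about. */
    assert((flags & ~(unsigned)(CPU_SSE | CPU_AVX | CPU_XOP)) == 0);

    /* XOP is AMD-only; those CPUs accept AVX instructions but have no
       256-bit execution units, so prefer xmm (128-bit) registers. */
    if (flags & CPU_XOP)
        return 128;

    /* Assumption: AVX without XOP means 256-bit (ymm) is worth using. */
    if (flags & CPU_AVX)
        return 256;

    if (flags & CPU_SSE)
        return 128;

    return 0;
}
```

With these made-up flags, a CPU reporting SSE, AVX and XOP together gets 128, while one reporting only SSE and AVX gets 256.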
