Hi,

On Fri, May 11, 2012 at 1:44 PM, Jason Garrett-Glaser <[email protected]> wrote:
> On Fri, May 11, 2012 at 1:35 PM, Vitor Sessak <[email protected]> wrote:
>> On 05/11/2012 10:31 PM, Vitor Sessak wrote:
>>>
>>> ---
>>> libavcodec/x86/fft.c | 9 +++-
>>> libavcodec/x86/fft.h | 2 +
>>> libavcodec/x86/fft_mmx.asm | 108 +++++++++++++++++++++++++-------------------
>>> libavcodec/x86/fft_sse.c | 7 +++
>>> libavutil/x86/x86inc.asm | 4 +-
>>> 5 files changed, 81 insertions(+), 49 deletions(-)
>>
>>
>> Note that I don't have the hardware to test if this actually work, so
>> consider this patch more as a request for testers.
>
> I should note, in my experience, 256-bit float is always slower than
> 128-bit on Bulldozer/Trinity, so the XOP functions should all be xmm
> regs, not ymm.

That sounds like the old Core Duo SSE2 performance problem again?

Ronald
