Hi,

Very interesting findings, thanks for looking into this!
Here are some more results from my Core 2 PC.

flags:    CFLAGS=-O3 -mmmx -msse -mfpmath=sse -ftree-vectorize
compiler: gcc 4.1.3 or 4.2.1 (it didn't make a difference)

1. #define FRAGMENTSIZE 32
Benchmarking mixdown (WITH coeff):
Process time for pure C++: 1505 useconds
Process time for ASM SSE: 2871 useconds
Process time for GCC vector extensions: 503 useconds

2. #define FRAGMENTSIZE 64
Benchmarking mixdown (WITH coeff):
Process time for pure C++: 3006 useconds
Process time for ASM SSE: 5072 useconds
Process time for GCC vector extensions: 1568 useconds

3. #define FRAGMENTSIZE 128
Benchmarking mixdown (WITH coeff):
Process time for pure C++: 6793 useconds
Process time for ASM SSE: 8232 useconds
Process time for GCC vector extensions: 6091 useconds

4. #define FRAGMENTSIZE 512
Benchmarking mixdown (WITH coeff):
Process time for pure C++: 19843 useconds
Process time for ASM SSE: 31141 useconds
Process time for GCC vector extensions: 17669 useconds

5. #define FRAGMENTSIZE 1024
Benchmarking mixdown (WITH coeff):
Process time for pure C++: 27083 useconds
Process time for ASM SSE: 42730 useconds
Process time for GCC vector extensions: 31436 useconds

The only change I made to the example was to use gettimeofday() instead of clock().

Maybe a gcc developer can shed some light on this issue?

Greetings,

Remon
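P.S. For anyone who wants to repeat the measurement, here is a rough sketch of timing a mixdown loop with gettimeofday() rather than clock(). It is not the code from the benchmark itself: the names now_usec, mix_with_coeff and time_mixdown, and the plain C++ loop, are placeholders I made up for illustration. gettimeofday() reports wall-clock time, so other processes running on the machine will show up in the numbers.

```cpp
#include <sys/time.h>   // gettimeofday(), struct timeval (POSIX)
#include <cstddef>      // std::size_t

// Wall-clock time in microseconds since the epoch.
static long long now_usec()
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return static_cast<long long>(tv.tv_sec) * 1000000LL + tv.tv_usec;
}

// Placeholder kernel (assumption): mix src into dst, scaled by a coefficient.
static void mix_with_coeff(float *dst, const float *src,
                           float coeff, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        dst[i] += src[i] * coeff;
}

// Run the kernel `repeats` times and return the elapsed microseconds.
static long long time_mixdown(float *dst, const float *src, float coeff,
                              std::size_t n, int repeats)
{
    const long long start = now_usec();
    for (int r = 0; r < repeats; ++r)
        mix_with_coeff(dst, src, coeff, n);
    return now_usec() - start;
}
```

Calling time_mixdown() with a buffer of FRAGMENTSIZE floats and a large repeat count gives one "Process time" figure in useconds. Repeating many times matters for the small fragment sizes, where a single call is too short to measure reliably.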
