Hi Steven

On Tuesday 21 June 2011 05:55:40 Steven G. Johnson wrote:
> I am one of the FFTW developers, and wanted to comment on this.

Thanks a lot!

> Yes, you should definitely use --enable-sse/--enable-sse2 flags when
> compiling single/double precision versions of FFTW on all x86 and x86-64
> platforms. This is *not* just a matter of compiler flags -- it enables
> the compilation of special computational kernels in FFTW that explicitly
> use SSE/SSE2 intrinsics.
>
> In addition to x86-64, note that this is SAFE to enable in general for
> all 32-bit x86 platforms. FFTW checks at runtime to see whether the
> processor supports SSE/SSE2 and disables its SSE/SSE2 code if not.
> (Similarly for Altivec on PowerPC, and similarly in the next release for
> AVX instructions.)
>

Well, that depends on what you are aiming for. If you want a single
32-bit x86 package that is guaranteed to work on all x86-compatible CPUs
out there, starting at, say, Pentium II level, you have to ensure that
this will still work. For my case, where I have ~1800 computers doing
number crunching and all of them are 64-bit, this is a different matter
than the one Debian has for packaging.

>
> For benchmarking, I would recommend using the "bench" program that comes
> with FFTW. e.g. you can compare for a size-1024 FFT with and without the
> SSE/SSE2 kernels just by doing:
>     ./bench -opatient 1024
>     ./bench -opatient -onosimd 1024
> On my 64-bit Intel Xeon E5440 running FFTW 3.2.2 and Debian GNU/Linux,
> the SSE/SSE2 version is faster for size 1024 by a factor of 1.7 in
> double precision and by a factor of 3.4 in single precision.

Interesting. I think I need to rerun my tests, but then again the
difference could be because I was just using a 'measured' plan.

Thanks a lot for the insight!

Cheers

Carsten
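
For the mixed-hardware case discussed above (one 32-bit package that must run on everything from a Pentium II upwards), it can help to see which machines actually advertise SSE/SSE2. The following is only a minimal sketch, assuming an x86 Linux host where /proc/cpuinfo has a "flags" line. The helper names parse_cpu_flags and supports are made up for illustration. This is not how FFTW performs its own runtime check; it just inspects what the kernel reports.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text.

    Assumption: x86 Linux layout, where each processor entry has a line
    of the form 'flags : fpu vme ... sse sse2 ...'. Only the first such
    line is used.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def supports(cpuinfo_text, *features):
    """True if every named feature (e.g. 'sse', 'sse2') is listed."""
    flags = parse_cpu_flags(cpuinfo_text)
    return all(feature in flags for feature in features)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            text = fh.read()
    except OSError:
        # Not Linux, or /proc is not mounted: report nothing as supported.
        text = ""
    for feature in ("sse", "sse2"):
        print(feature, "yes" if supports(text, feature) else "no")
```

Flags are compared as whole words, so "sse" does not accidentally match "sse2" or "ssse3".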

