Hi Joachim,
On 03/10/16 12:01, Joachim Hein wrote:
Hi,
I am trying to do a version push on a package that needs FFTW. I noticed that
FFTW 3.3.5 is out now and there are FFTW 3.3.5 packages in EB 2.9.0 (e.g. for
foss/2016.09).
Looking into the release notes for FFTW 3.3.5, AVX2 seems a big addition.
Looking into the EB 2.9.0 configurations, it seems sse2 is hard coded. To my
understanding that is a factor of 4 in theoretical performance (2x for register
length and 2x for FMA). The current SSE2 should be a lowest common denominator
for current hardware. I am pretty sure an AVX2 version will fail with an illegal
instruction error on oldish hardware (e.g. Interlagos).
Thank you very much for bringing this up, and carefully checking the
FFTW release notes and verifying what the FFTW easyconfigs specify.
This type of thing is exactly what makes the EasyBuild community so
valuable...
We now have a bit of a can of worms. I am not supporting any pre-AVX2 hardware
any longer, but I am sure some people here do.
For FFTW we could make that explicit, e.g. adding an avx2 to the name of the
FFTW. But then I am not building FFTW for the sake of building FFTW, I want to
build NAMD with it. If I now supply a NAMD using the avx2 FFTW into the git
repository, it will fail on any system deploying pre-AVX2 hardware. For the
compiler we are dealing with this kind of thing via the -xHost and -march=native
options. Is there an EB mechanism to deal with this kind of thing?
To my recollection, the 'hardcoding' (via --enable-sse2) you refer to
was not done for the sake of hardcoding to SSE2 as a common base, but
more to ensure that the FFTW builds with single and double (default)
precision are properly using *at least* SSE2 (which they don't seem to
do by default). Of course, that does imply that AVX(2) is bypassed
entirely...
That does make me wonder how the higher precision builds
(--enable-long-double and --enable-quad-precision) are handled on older
systems...
Anyway, you're right, the real problem to me seems to be that these
configure options are specified completely independently of the 'optarch'
configuration option.
The question is, how do we fix this...
Do we implement a custom easyblock for FFTW that takes into account both
the --optarch configuration setting and the features of the system it is
being installed on?
That way, we could probably do a more intelligent pick of configure
options to use (keeping --enable-sse2 as the default, unless
--optarch=GENERIC is used).
That's probably easier said than done, but I don't see a better way
since this is quite specific to FFTW?
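To make the idea a bit more concrete, here is a rough sketch of the selection
logic in plain Python. This is not an actual easyblock: the function names
(parse_cpu_flags, pick_fftw_simd_opts) are made up for illustration, reading
the flags from /proc/cpuinfo-style text is just one possible way to detect CPU
features on Linux, and skipping the SIMD options for the higher precision
builds is an assumption on my part that would need to be checked.

```python
def parse_cpu_flags(cpuinfo_text):
    """Extract the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(':')
        if key.strip() == 'flags':
            return set(value.split())
    return set()


def pick_fftw_simd_opts(cpu_flags, optarch=None, precision='double'):
    """Return a list of FFTW configure options (hypothetical helper)."""
    # assumption: SIMD options only make sense for single/double precision
    if precision not in ('single', 'double'):
        return []
    # a generic build should not be tied to the features of the build host
    if optarch == 'GENERIC':
        return []
    # keep --enable-sse2 as the default, add more only if the CPU supports it
    opts = ['--enable-sse2']
    for flag in ('avx', 'avx2'):
        if flag in cpu_flags:
            opts.append('--enable-%s' % flag)
    return opts


# example with made-up /proc/cpuinfo content
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 avx avx2 fma\n"
configopts = ' '.join(pick_fftw_simd_opts(parse_cpu_flags(sample)))
```

On a real system the text could be read from /proc/cpuinfo instead of the
sample string, and the resulting options would be appended to the configure
command for the single and double precision builds only.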
regards,
Kenneth