Jack Perdue
Lead Systems Administrator
High Performance Research Computing
TAMU Division of Research
[email protected]    http://hprc.tamu.edu
HPRC Helpdesk: [email protected]

On 10/04/2016 07:13 AM, Kenneth Hoste wrote:
Hi Joachim,

On 03/10/16 12:01, Joachim Hein wrote:
Hi,

I am trying to do a version push on a package that needs FFTW. I noticed that FFTW 3.3.5 is out now and that there are FFTW 3.3.5 packages in EB 2.9.0 (e.g. for foss/2016.09).

Looking into the release notes for FFTW 3.3.5, avx2 seems a big addition. Looking into the EB 2.9.0 configurations, it seems sse2 is hard coded. To my understanding that is a factor of 4 in theoretical performance (2x for register length and 2x for FMA). The current SSE2 should be a lowest common denominator for current hardware. I am pretty sure an AVX2 version will give a nice illegal instruction error on oldish hardware (e.g. Interlagos).
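The factor of 4 quoted above can be checked with a few lines of arithmetic. This is only a sketch of theoretical peak per vector instruction for double precision; the function name is hypothetical, and it assumes 128-bit SSE2 registers without FMA versus 256-bit AVX2 registers with FMA (counting a fused multiply-add as two floating-point operations).

```python
def peak_flops_per_instruction(register_bits, fma, element_bits=64):
    """Theoretical floating-point operations per vector instruction.

    Hypothetical helper for illustration only: it ignores clock speed,
    issue width, and memory bandwidth.
    """
    lanes = register_bits // element_bits  # doubles that fit in one register
    return lanes * (2 if fma else 1)       # FMA counts as multiply + add


sse2 = peak_flops_per_instruction(128, fma=False)     # 2 lanes, 1 op each
avx2_fma = peak_flops_per_instruction(256, fma=True)  # 4 lanes, 2 ops each
speedup = avx2_fma / sse2

print(sse2, avx2_fma, speedup)  # prints: 2 8 4.0
```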

Thank you very much for bringing this up, and carefully checking the FFTW release notes and verifying what the FFTW easyconfigs specify.

This type of thing is exactly what makes the EasyBuild community so valuable...

We now have a bit of a can of worms. I am not supporting any pre avx2 hardware any longer, but I am sure some people here do.

For FFTW we could make that explicit, e.g. by adding avx2 to the FFTW name. But then I am not building FFTW for the sake of building FFTW, I want to build NAMD with it. If I now supply a NAMD using the avx2 FFTW into the git repository, it will fail on any system deploying pre-avx2 hardware. For the compiler we are dealing with this kind of thing via the -xHost and -march=native options. Is there an EB mechanism to deal with this kind of thing?

To my recollection, the 'hardcoding' (via --enable-sse2) you refer to was not done for the sake of hardcoding to SSE2 as a common base, but more to ensure that the FFTW builds with single and double (default) precision are properly using *at least* SSE2 (which they don't seem to do by default). Of course, that does imply that AVX(2) is bypassed entirely... That does make me wonder how the higher precision builds (--enable-long-double and --enable-quad-precision) are handled on older systems...

Anyway, you're right, the real problem to me seems to be that these configuration options are specified completely independently of the 'optarch' configuration option.
The question is, how do we fix this...

Do we implement a custom easyblock for FFTW that takes into account both the --optarch configuration setting and the features of the system it is being installed on? That way, we could probably do a more intelligent pick of configure options to use (keeping --enable-sse2 as the default, unless --optarch=GENERIC is used).

That's probably easier said than done, but I don't see a better way since this is quite specific to FFTW?
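The selection logic proposed above might look roughly like the following. This is a minimal sketch, not an actual EasyBuild easyblock: the function name and its arguments are hypothetical, the feature names follow the Linux /proc/cpuinfo spelling, and it takes the parenthetical literally (GENERIC means no SIMD-specific options). It also assumes that SIMD options only make sense for single and double precision builds.

```python
def pick_fftw_simd_options(cpu_features, optarch=None, precision="double"):
    """Return FFTW configure options for SIMD support.

    Hypothetical helper: 'cpu_features' is a set of feature names such as
    {"sse2", "avx", "avx2"}, 'optarch' mirrors the --optarch setting.
    """
    # Assumption: long-double and quad precision builds get no SIMD options.
    if precision not in ("single", "double"):
        return []

    # Literal reading of "unless --optarch=GENERIC is used":
    # a generic build gets no SIMD-specific options at all.
    if optarch == "GENERIC":
        return []

    options = ["--enable-sse2"]  # the default discussed in this thread
    if "avx" in cpu_features:
        options.append("--enable-avx")
    if "avx2" in cpu_features:
        options.append("--enable-avx2")
    return options


# Example: a host with AVX but without AVX2 (as described for Interlagos)
print(pick_fftw_simd_options({"sse2", "avx"}))
```

Keeping the choice in a pure function like this makes it easy to test the decision table separately from the code that inspects the build host.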

Don't forget altivec on Power. :)

http://www.siliconslick.com/easybuild/ebfiles_repo_cleaned/curie/FFTW/FFTW-3.3.4-gompi-2016a.eb

Anyway, it demonstrates one approach (querying the build system CPU).
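For illustration, querying the build system CPU could be sketched as below. This is not taken from the linked easyconfig; the function names are hypothetical, and it assumes a Linux host where /proc/cpuinfo has a "flags" line on x86 and a "cpu" line mentioning "altivec supported" on Power.

```python
def cpu_features_from_cpuinfo(cpuinfo_text):
    """Extract CPU feature names from the text of /proc/cpuinfo.

    Hypothetical helper; the line formats handled here are assumptions
    about Linux on x86 and Power.
    """
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key : value" line
        key = key.strip().lower()
        if key == "flags":
            # x86: space-separated list such as "fpu sse sse2 avx avx2"
            features.update(value.split())
        elif key == "cpu" and "altivec supported" in value.lower():
            # Power: e.g. "cpu : POWER8 (raw), altivec supported"
            features.add("altivec")
    return features


def read_build_host_features(path="/proc/cpuinfo"):
    """Read features of the build host; empty set if the file is unreadable."""
    try:
        with open(path) as handle:
            return cpu_features_from_cpuinfo(handle.read())
    except OSError:
        return set()


print(sorted(read_build_host_features()))
```

Note that this describes the machine doing the build, which is not necessarily the machine that will run the resulting binaries.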

jack
