At JSC we handle this issue by treating CUDA as a simple dependency of packages 
built at the compiler level. We only incorporate it into a toolchain when we 
use a CUDA-aware MPI, which means that the MODULEPATH expansion happens only 
once rather than twice (once for CUDA and once for MPI). Since our MPI 
implementations are in a "family", this is very safe. It also has very few 
side effects, because how CUDA is included is very heterogeneous across packages 
and typically needs to be implemented by hand anyway.

On 2 March 2017 at 16:31, Alan O'Cais <[email protected]> wrote:
Dear Shahzeb,

I think this is probably the same issue (or at least a related one) as the one 
being discussed in https://github.com/hpcugent/easybuild-framework/pull/2135

It also exposes one of the problems of an HMNS: the potential non-uniqueness of 
module names. The problem with not building software with minimal toolchains is 
that you can have multiple copies at various levels of your toolchain 
hierarchy. Which module you end up loading then depends on the order in which 
you load things (perhaps not in Lmod, because it is hierarchy-aware, but 
definitely for other module tools). This can clearly lead to issues.
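
To illustrate the order dependence, here is a minimal Python sketch. It assumes a deliberately simplified module tool that takes the first match on MODULEPATH; the helper names (prepend, resolve) and the in-memory trees are hypothetical and do not reflect how Lmod or any other real tool is implemented.

```python
# Simplified model (an assumption): the first directory on MODULEPATH that
# provides the requested module wins.

INTEL = "Compiler/intel/2017.1.132-GCC-5.2.0"
INTEL_CUDA = "Compiler/intel-CUDA/2017.1.132-GCC-5.2.0-7.5.18"

# The same module name exists at two levels of the hierarchy.
trees = {
    INTEL: {"impi/2017.1.132"},
    INTEL_CUDA: {"impi/2017.1.132"},
}


def prepend(modulepath, directory):
    """Mimic prepend_path: the most recently added directory is searched first."""
    return [directory] + [d for d in modulepath if d != directory]


def resolve(module_name, modulepath):
    """Return the first directory on modulepath that provides module_name."""
    for directory in modulepath:
        if module_name in trees.get(directory, ()):
            return directory
    return None


# Same two directories, added in opposite orders.
path_a = prepend(prepend([], INTEL), INTEL_CUDA)
path_b = prepend(prepend([], INTEL_CUDA), INTEL)

first_a = resolve("impi/2017.1.132", path_a)
first_b = resolve("impi/2017.1.132", path_b)
print(first_a)
print(first_b)
```

Under this model the two load orders pick different copies of the same module name, which is the non-uniqueness problem described above.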

Alan

On 2 March 2017 at 15:32, Siddiqui, Shahzeb <[email protected]> wrote:
Hello,

I have noticed an issue with modules built using HierarchicalNamingScheme 
when building out the intel and intelcuda toolchains.

I notice that the icc and ifort modules set MODULEPATH to the intel directory. 
This is correct when setting up the intel toolchain.

hpcswadm@hpcv18$grep -iR MODULEPATH
icc/2017.1.132-GCC-5.2.0.lua:prepend_path("MODULEPATH", "/nfs/grid/software/testing/RHEL7/easybuild/modules/all/Compiler/intel/2017.1.132-GCC-5.2.0")
ifort/2017.1.132-GCC-5.2.0.lua:prepend_path("MODULEPATH", "/nfs/grid/software/testing/RHEL7/easybuild/modules/all/Compiler/intel/2017.1.132-GCC-5.2.0")

impi gets installed in the intel path, as expected.

hpcswadm@hpcv18$ls -R /nfs/grid/software/testing/RHEL7/easybuild/modules/all/Compiler/intel/2017.1.132-GCC-5.2.0/
/nfs/grid/software/testing/RHEL7/easybuild/modules/all/Compiler/intel/2017.1.132-GCC-5.2.0/:
impi

/nfs/grid/software/testing/RHEL7/easybuild/modules/all/Compiler/intel/2017.1.132-GCC-5.2.0/impi:
2017.1.132.lua

The impi built with iccifortcuda, however, gets installed in:

hpcswadm@hpcv18$ls -R /nfs/grid/software/testing/RHEL7/easybuild/modules/all/Compiler/intel-CUDA/
/nfs/grid/software/testing/RHEL7/easybuild/modules/all/Compiler/intel-CUDA/:
2017.1.132-GCC-5.2.0-7.5.18

/nfs/grid/software/testing/RHEL7/easybuild/modules/all/Compiler/intel-CUDA/2017.1.132-GCC-5.2.0-7.5.18:
impi

/nfs/grid/software/testing/RHEL7/easybuild/modules/all/Compiler/intel-CUDA/2017.1.132-GCC-5.2.0-7.5.18/impi:
2017.1.132.lua
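
The two listings above suggest how the compiler-level subdirectory is composed. The Python sketch below only mirrors the layout observed in those listings; the function name compiler_module_subdir is hypothetical and this is not EasyBuild's actual implementation.

```python
import posixpath


def compiler_module_subdir(compiler_version, cuda_version=None):
    """Build the compiler-level module subdirectory seen in the listings.

    Without CUDA:  Compiler/intel/<compiler_version>
    With CUDA:     Compiler/intel-CUDA/<compiler_version>-<cuda_version>
    """
    if cuda_version is None:
        return posixpath.join("Compiler", "intel", compiler_version)
    return posixpath.join(
        "Compiler", "intel-CUDA", f"{compiler_version}-{cuda_version}"
    )


plain = compiler_module_subdir("2017.1.132-GCC-5.2.0")
with_cuda = compiler_module_subdir("2017.1.132-GCC-5.2.0", "7.5.18")
print(plain)
print(with_cuda)
```

Since icc and ifort only prepend the first of these two directories to MODULEPATH, an impi installed under the second one would not become visible through them.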

The problem is that loading the iimpic toolchain loads the icc and ifort 
modules along with the impi that belongs to the intel module tree, not the 
intel-CUDA one. I am not sure whether this is a problem, but it seems that 
impi is not being picked up correctly.

hpcswadm@hpcv18$ml av iimpi
------------------------------------- /nfs/grid/software/testing/RHEL7/easybuild/modules/all/Core -------------------------------------
   iimpi/2017.01-GCC-5.2.0 (TC)    iimpic/2017.01-GCC-5.2.0 (TC)

hpcswadm@hpcv18$ml

Currently Loaded Modules:
  1) EasyBuild/3.1.0

hpcswadm@hpcv18$ml iimpic
Currently Loaded Modules:
  1) EasyBuild/3.1.0   3) icc/2017.1.132-GCC-5.2.0   (I)   5) impi/2017.1.132 (I)   7) iimpic/2017.01-GCC-5.2.0 (TC)
  2) GCC/5.2.0         4) ifort/2017.1.132-GCC-5.2.0 (I)   6) CUDA/7.5.18

hpcswadm@hpcv18$which mpicc
/nfs/grid/software/testing/RHEL7/easybuild/software/Compiler/intel/2017.1.132-GCC-5.2.0/impi/2017.1.132/bin64/mpicc

The one that should be loaded is from
/nfs/grid/software/testing/RHEL7/easybuild/software/Compiler/intel-CUDA/2017.1.132-GCC-5.2.0-7.5.18/impi/2017.1.132/bin64/mpicc
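
A quick way to check which branch of the hierarchy a resolved binary comes from is to look at the path component that follows "Compiler". The Python sketch below does this for the two paths quoted above; the function name hierarchy_branch is hypothetical, and it assumes the install prefix always contains a "Compiler" component, as it does here.

```python
from pathlib import PurePosixPath


def hierarchy_branch(binary_path):
    """Return the path component right after 'Compiler', or None if absent."""
    parts = PurePosixPath(binary_path).parts
    if "Compiler" in parts:
        idx = parts.index("Compiler")
        if idx + 1 < len(parts):
            return parts[idx + 1]
    return None


# Path reported by `which mpicc` after loading iimpic.
found = (
    "/nfs/grid/software/testing/RHEL7/easybuild/software/Compiler/intel/"
    "2017.1.132-GCC-5.2.0/impi/2017.1.132/bin64/mpicc"
)
# Path of the impi that was built with iccifortcuda.
expected = (
    "/nfs/grid/software/testing/RHEL7/easybuild/software/Compiler/intel-CUDA/"
    "2017.1.132-GCC-5.2.0-7.5.18/impi/2017.1.132/bin64/mpicc"
)

found_branch = hierarchy_branch(found)
expected_branch = hierarchy_branch(expected)
print(found_branch, expected_branch)
```

On a live system the argument could be the result of shutil.which("mpicc") instead of a hard-coded string.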

I think the impi module should not sit inside the intel directory, or else the 
MODULEPATH set by icc and ifort needs to change to intel-CUDA when loading iimpic:
/nfs/grid/software/testing/RHEL7/easybuild/modules/all/Compiler/intel/2017.1.132-GCC-5.2.0
 ----------------------
   impi/2017.1.132 (I,L)

Has anyone else come across this issue?


Shahzeb Siddiqui
HPC Linux Engineer
B2220-447.2
Groton, CT




--
Dr. Alan O'Cais
E-CAM Software Manager
Juelich Supercomputing Centre
Forschungszentrum Juelich GmbH
52425 Juelich, Germany

Phone: +49 2461 61 5213
Fax: +49 2461 61 6656
E-mail: [email protected]
WWW:    http://www.fz-juelich.de/ias/jsc/EN





