You can mitigate this issue by installing as much as possible at the
compiler level -- we do that at Compute Canada. I have some pending work on
the framework that could make that possible for Python too.
The major incompatibility between goolfc and goolf/foss is in the MPI
libraries: one is built with CUDA support and one without. But anything
that does not need MPI should not need recompilation.

Basically for goolfc you'd have this shiny alphabet soup:

                gcccuda - gompic - goolfc
               /       \        /
GCCcore - GCC - golf   -  golfc
               \       \
                gompi  -  goolf


i.e.:
goolf has two direct subtoolchains (golf = GNU+OpenBLAS+LAPACK+FFTW, and
gompi),
goolfc has two direct subtoolchains (golfc and gompic), and
golfc has two direct subtoolchains (golf and gcccuda).
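The hierarchy above can also be written down as a small graph, which makes
the subtoolchain relationships easy to check. This is just an illustrative
sketch (the dict and helper function are mine, not EasyBuild framework code;
gompic's parents are read off the diagram):

```python
# Direct subtoolchains of each toolchain, as in the diagram above.
SUBTOOLCHAINS = {
    'GCC': ['GCCcore'],
    'gcccuda': ['GCC'],
    'golf': ['GCC'],
    'gompi': ['GCC'],
    'golfc': ['golf', 'gcccuda'],
    'gompic': ['gcccuda'],
    'goolf': ['golf', 'gompi'],
    'goolfc': ['golfc', 'gompic'],
}

def ancestors(tc):
    """Return the set of all (transitive) subtoolchains of tc."""
    seen = set()
    stack = list(SUBTOOLCHAINS.get(tc, []))
    while stack:
        sub = stack.pop()
        if sub not in seen:
            seen.add(sub)
            stack.extend(SUBTOOLCHAINS.get(sub, []))
    return seen

print(sorted(ancestors('goolfc')))
```

So a module installed at the golf level is visible to everything above it
(golfc, goolf, goolfc), which is the whole point of pushing installations
as far down the hierarchy as possible.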

Then you can install Python (except for mpi4py) for the golf toolchain.
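As a rough sketch of what that could look like in easyconfig terms (the
toolchain version and extension versions below are purely illustrative, not
taken from a real easyconfig):

```python
# Illustrative sketch of a Python easyconfig built against the MPI-free
# 'golf' subtoolchain, so the same Python module can be reused by both
# goolf and goolfc builds. Versions here are made up for the example.
name = 'Python'
version = '3.6.4'

toolchain = {'name': 'golf', 'version': '2018a'}

# mpi4py is deliberately NOT in exts_list: it needs MPI, so it would be
# installed separately at the gompi/goolf (or gompic/goolfc) level.
exts_list = [
    ('setuptools', '38.4.0'),
    ('numpy', '1.14.0'),
]
```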

I ran into one headache when trying to implement this, though: what about
FFTW in a hierarchy? There need to be two FFTW modules, one with MPI (in
goolf) and one without (in golf), and the majority of applications (even
those that use MPI) use the serial FFTW. Some options:
1. use a -serial suffix for the serial one, and no suffix for the MPI one?
2. use a -mpi suffix for the MPI one, and no suffix for the serial one?
3. use pFFTW and FFTW as separate names?
4. anything else? Also, should the MPI FFTW module contain only the parallel
libraries (libfftw3*_mpi.*), or also a copy of the non-MPI ones?
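For what it's worth, option 2 could look roughly like this in easyconfig
terms (toolchain names/versions are made up; this is only a sketch of the
naming scheme, not a working easyconfig):

```python
# Serial FFTW would sit at the golf level with a plain module name:
#   name = 'FFTW'; version = '3.3.7'; toolchain = {'name': 'golf', ...}

# The MPI-enabled FFTW sits at the goolf level, distinguished only by a
# versionsuffix, so 'module load FFTW' keeps giving the serial library:
name = 'FFTW'
version = '3.3.7'
versionsuffix = '-mpi'

toolchain = {'name': 'gompi', 'version': '2018a'}

# Build the MPI interface libraries; whether this module should also ship
# a copy of the serial libs is exactly the open question above.
configopts = '--enable-mpi'
```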

Bart


On 21 March 2018 at 10:55, Jakob Schiøtz <[email protected]> wrote:

> I very strongly agree with Jack on this.  If only a single program /
> python module uses CUDA, it is wasteful to have to build and install a new
> toolchain, and to rebuild everything on the system, including Python and
> perhaps even X11 (if using matplotlib).
>
> But there may be something I have overlooked.
>
> Best regards
>
> Jakob
>
>
> > On 19 Mar 2018, at 15:49, Jack Perdue <[email protected]> wrote:
> >
> >
> > On 03/19/2018 09:43 AM, Joachim Hein wrote:
> >> Hi,
> >>
> >> I am currently installing tensorflow via easybuild (I assume many of us
> do these days) and am trying to understand EasyBuild’s ideas on toolchains
> supporting cuda.
> >>
> >> I looked at TensorFlow-1.5.0-goolfc-2017b-Python-3.6.3.eb, which
> builds on top of a toolchain containing GCC, CUDA (installed as a compiler
> module), OpenMPI, BLAS, FFTW, etc.
> >>
> >> I now noticed that there is a new
> >> TensorFlow-1.6.0-foss-2018a-Python-3.6.4-CUDA-9.1.85.eb,
> which has been accepted into the development branch (PR 6016).   This
> builds on top of a “vanilla” foss-2018a toolchain, using CUDA and cuDNN
> modules installed as core modules (system compiler).
> >>
> >> I am wondering how we want to organise ourselves in future?  Do we
> want to continue with the goolfc idea, or do we go for a “core” CUDA and
> cuDNN?  I feel this needs standardising soonish.  It is also something I
> need to document for my users, who want to build their own CUDA-based
> software: which modules should be loaded to build software?
> >>
> >> Any comments, how we take this further?
> >>
> >> Best wishes
> >>    Joachim
> >>
> >
> > FWIW ($.02) I'm partial to the latter approach since it allows
> > more flexibility of CUDA version without redefining an entire
> > toolchain (which then requires everything to be rebuilt (e.g. Python)
> > whether they need CUDA or not).
> >
> > Jack Perdue
> > Lead Systems Administrator
> > High Performance Research Computing
> > TAMU Division of Research
> > [email protected]    http://hprc.tamu.edu
> > HPRC Helpdesk: [email protected]
> >
> >
> >
>
> --
> Jakob Schiøtz, professor, Ph.D.
> Department of Physics
> Technical University of Denmark
> DK-2800 Kongens Lyngby, Denmark
> http://www.fysik.dtu.dk/~schiotz/
>
>
>
>


-- 
Dr. Bart E. Oldeman | [email protected] | [email protected]
Scientific Computing Analyst / Analyste en calcul scientifique
McGill HPC Centre / Centre de Calcul Haute Performance de McGill |
http://www.hpc.mcgill.ca
Calcul Québec | http://www.calculquebec.ca
Compute/Calcul Canada | http://www.computecanada.ca
Tel/Tél: 514-396-8926 | Fax/Télécopieur: 514-396-8934
