Thanks Robert,
While I understand and am all in favor of reusing dependencies, most
packages with dependencies do not provide download hooks for specific
versions of their dependencies. PETSc is this way, I believe OpenFoam
too. When the developers go through this trouble, I prefer to assume
that they know what they are doing and test with specific versions of
the dependencies. If a user finds a bug, it will be that much simpler to
diagnose if it uses the same versions as the developers.
One could install the specific versions of the dependencies, but this
means that we have to keep tracking each PETSc version and adjust the
version of each dependency every time there is a new version of PETSc. I
don't think that the space we save by reusing the packages is worth
the trouble of making sure it works for every version.
If we bundle the dependencies into PETSc, we should either set conflicts
in the modules or use RPATH so that the bundled libraries are the ones
found at run time.
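For the RPATH route, here is a minimal sketch of how the linker flags could be assembled. The helper name rpath_flags and the fallback prefix are made up for illustration and are not part of our actual install; -Wl,-rpath is the usual way to pass an RPATH entry through the compiler driver to the linker.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of PETSc or EasyBuild): turn a list of
# library directories into linker flags that embed them as RPATH entries.
rpath_flags() {
    local flags=""
    local dir
    for dir in "$@"; do
        # Separate successive flags with a single space.
        flags="${flags:+$flags }-Wl,-rpath,${dir}"
    done
    printf '%s\n' "$flags"
}

# Illustrative fallback only; substitute the real PETSc install prefix.
PREFIX="${PREFIX:-/opt/petsc/3.7.2}"
LDFLAGS="$(rpath_flags "${PREFIX}/lib")"
echo "LDFLAGS=${LDFLAGS}"
```

After the build, something like "readelf -d libpetsc.so" should show whether an RPATH or RUNPATH entry actually ended up in the library.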
I am working on a recipe that would use the petsc.py easyblock. If I
succeed, I'll open a PR for it.
Maxime
On 17-02-21 20:01, Robert Schmidt wrote:
I'm not an expert at PETSc (though I know it is widely used on large
clusters, I haven't seen it in bioinfo). My reading is that there are
two general EasyBuild principles at work here, and they are certainly
at odds with a few of these "bundle all the deps" type installs that we
are seeing more of (cmake and bazel come to mind as tools that do this
kind of thing).
The first principle is that your build nodes might not have internet
access, so the downloads should be handled separately and stashed into
the source directory. I think you could even imagine using EasyBuild
to just run the fetch_step on the login node, while the build would
be done on some cluster node that doesn't have direct internet access.
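To sketch how that split could look for a bundled PETSc build: PETSc's configure accepts a local tarball path in place of "yes" for its --download-<package> options, so the tarballs can be stashed on the login node and consumed on the build node. The helper name download_flags, the <pkg>-*.tar.gz naming, and the directory layout are assumptions for illustration; the real tarball names vary per package.

```shell
#!/usr/bin/env bash
# Hypothetical helper: for each package name, look for a stashed tarball
# named <pkg>-*.tar.gz in the source directory and print a PETSc
# --download-<pkg>=<path> option pointing at it.
download_flags() {
    local srcdir="$1"; shift
    local pkg tarball
    for pkg in "$@"; do
        # Take the first glob match; an unmatched glob stays literal
        # and fails the -e test, leaving tarball empty.
        for tarball in "${srcdir}/${pkg}"-*.tar.gz; do
            [ -e "$tarball" ] && break
            tarball=""
        done
        if [ -n "$tarball" ]; then
            printf -- '--download-%s=%s\n' "$pkg" "$tarball"
        else
            echo "missing tarball for ${pkg} in ${srcdir}" >&2
            return 1
        fi
    done
}

# Usage on the build node (the stash path is illustrative):
#   ./configure $(download_flags /path/to/stash hypre metis parmetis) ...
```

The function fails loudly when a tarball is missing, so a build node without internet access stops there rather than attempting a download.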
The second principle is that dependencies are often useful for more
than one piece of software (or as a development library directly) and
that we would install them for cluster-wide use (Boost is definitely
in that category). Also with LD_LIBRARY_PATH you need to be somewhat
careful which libraries are loaded (maybe PETSc would handle RPATHing
its deps?). It is easier if there is one module loaded that represents
the version being used in that shell instance.
I don't think there is anything wrong with doing it the other way
either, and it would be pretty easy to write an easyconfig that just
used the standard ConfigureMake easyblock. The main thing is that you
will be able to keep a reproducible record of your build and hopefully
share it for others to use. EasyConfigs are really more like examples
and helpful defaults.
As a related aside, this is a pretty big deal with things like
tensorflow. These packages do end up being more like "Bundles" of
software and dependencies. Disk space is relatively cheap.
On Tue, Feb 21, 2017 at 1:41 PM Maxime Boissonneault wrote:
To make my question more concrete, here is what our PETSc install
script looks like. It is pretty much a configure, make, make install.
# module --force purge
# module load compilers/intel mpi/openmpi libs/mkl/11.1 apps/cmake/2.8.12 apps/buildtools apps/devtools mpi/openmpi/1.8.8
# cd /software6/src
# NAME=petsc
# VERSION=3.7.2
# SRCDIR=${NAME}-${VERSION}
# COMPILER=intel
# MPI=openmpi1.8.8
# PREFIX=/software6/libs/${NAME}/${VERSION}_${COMPILER}_${MPI}_patched
# ARCHIVE=${NAME}-${VERSION}-build-${COMPILER}-${MPI}.tar.xz
#
# curl -O http://ftp.mcs.anl.gov/pub/petsc/release-snapshots/${NAME}-lite-${VERSION}.tar.gz
# tar xvfz ${NAME}-lite-${VERSION}.tar.gz
# cd ${SRCDIR}
# export CFLAGS="-O3 -xHost -mkl -fPIC -m64 -no-diag-message-catalog"
# export FFLAGS="-O3 -xHost -mkl -fPIC -m64 -no-diag-message-catalog"
# export MPIDIR=$(dirname $(dirname $(which mpiexec)))
# Required to build hypre, which expects a GNU AR
# unset AR
# ./config/configure.py PETSC_ARCH=linux-gnu-intel \
CFLAGS="$CFLAGS" FFLAGS="$FFLAGS" \
--prefix=$PREFIX \
--with-x=0 \
--with-mpi-compilers=1 \
--with-mpi-dir=$MPIDIR \
--known-mpi-shared-libraries=1 \
--with-debugging=no \
--with-shared-libraries=1 \
--with-blas-lapack-dir=$MKLROOT/lib/intel64 \
--with-scalapack=1 \
--with-scalapack-include=$MKLROOT/include \
--with-scalapack-lib="-lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64" \
--with-mkl_pardiso=1 \
--with-mkl_pardiso-dir=$MKLROOT \
--download-mumps=yes \
--download-ptscotch=yes \
--download-superlu=yes \
--download-superlu_dist=yes \
--download-parmetis=yes \
--download-metis=yes \
--download-ml=yes \
--download-suitesparse=yes \
--download-hypre=yes |& tee configure.log
# make PETSC_DIR=/software6/src/${SRCDIR} PETSC_ARCH=linux-gnu-intel all |& tee make.log
# make PETSC_DIR=/software6/src/${SRCDIR} PETSC_ARCH=linux-gnu-intel install |& tee make-install.log
# make PETSC_DIR=$PREFIX test |& tee make-test.log
# chmod -R g+w ${PREFIX}; chmod g+w ${PREFIX}/..
# cd ..
# tar cfJ ${ARCHIVE} ${SRCDIR} && xz -t ${ARCHIVE} && rm -rf ${SRCDIR}
Maxime
On 17-02-21 13:31, Maxime Boissonneault wrote:
> Hi,
>
> I am looking at installing PETSc through EasyBuild. I am surprised to
> see that the EasyBlock relies heavily on other recipes. My personal
> experience with PETSc is that one is best to stick with whatever
> version the authors of PETSc decided was best for each of the packages.
>
>
> Therefore, on my system, PETSc depends only on MKL and OpenMPI.
>
> For every other package, I use the download config options so that
> PETSc fetches whatever version of the packages it needs.
>
> --download-mumps --download-ptscotch --download-superlu
> --download-superlu_dist --download-parmetis --download-metis
> --download-ml --download-suitesparse --download-hypre
>
>
> Is this contrary to other people's experience?
>
>
>
--
---------------------------------
Maxime Boissonneault
Computing Analyst - Calcul Québec, Université Laval
Chair - Calcul Québec Research Support Coordination Committee
Team lead - Research Support National Team, Compute Canada
Software Carpentry Instructor
Ph.D. in Physics