Hi,

On 24/01/2020 at 13:17, Rafael Laboissière wrote:
>>> I'm not sure subprocess is involved at all, it may be that MPI_Init()
>>> simply dies because h5py has already called it earlier.
>>>
>>
>> That sounds plausible.
>
> I am not sure this would be an explanation for the problem.
>
> At any rate, I found a minimal example that exposes the bug. Consider
> these two scripts:
> [...]

Looking for similar behavior outside of Python (thus independently of
import subprocess):

8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--
$ mpirun mpirun
**********************************************************
mpirun does not support recursive calls
**********************************************************
8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--

Another try, setting only the environment (not a real recursive call,
but mpirun perceives it as recursive):

8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--
$ env mpirun > mpirun.env
$ while read line ; do export $line ; done < mpirun.env
$ mpirun
**********************************************************
mpirun does not support recursive calls
**********************************************************
8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--8<--

My guess: import h5py implicitly sets up the MPI stack, including all
those environment variables. This is equivalent to running mpirun
recursively, which is not supported. Either mpirun notices and errors
out, or it does not notice and dies.

Like other issues, this points to not setting up the MPI stack when not
explicitly requested (see bug#944769 as mentioned by Drew).

Kind regards,

Thibaut.
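
P.S. A minimal sketch of how the guess above could be checked from a
shell: list the MPI-looking variables in the current environment, then
run a command in a copy of the environment with them removed. The
function names are hypothetical, and the OMPI_ and PMIX_ prefixes are an
assumption on my side; I have not verified which variables mpirun
actually inspects, nor that removing them avoids the error.

```shell
#!/bin/sh
# Print the names of environment variables that look launcher-related.
# ASSUMPTION: the OMPI_ and PMIX_ prefixes are illustrative only; the
# variables mpirun really checks for recursion are not confirmed here.
mpi_env_names() {
    env | awk -F= '/^(OMPI|PMIX)_/ { print $1 }'
}

# Run a command with those variables removed. The body is a subshell
# (parentheses instead of braces), so the caller's environment is left
# untouched.
run_without_mpi_env() (
    for name in $(mpi_env_names); do
        unset "$name"
    done
    "$@"
)

# Hypothetical usage:
#   mpi_env_names                  # what is currently set?
#   run_without_mpi_env mpirun ... # retry with a cleaned environment
```

If the recursion message disappears under run_without_mpi_env, that
would support the idea that the environment alone triggers it.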
-- 
debian-science-maintainers mailing list
https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/debian-science-maintainers
