Drew Parsons wrote on 06/04/2020 at 05:08:
> On 2020-04-06 09:56, Drew Parsons wrote:
>> On 2020-04-06 01:48, Gilles Filippini wrote:
>>> Drew Parsons wrote on 05/04/2020 at 18:57:
>>>>
>>>> Another option is to create an environment variable to force h5py to
>>>> load the mpi version even when run in a serial environment without
>>>> mpirun. Easy enough to set up, though I'm interested to see if "mpirun
>>>> -n 1 dh_auto_build" or a variation of that is viable. Maybe
>>>>
>>>> %:
>>>>         mpirun -n 1 dh $@ --with python3 --buildsystem=pybuild
>>>
>>> This way, the test run against python3.7 is OK, but it fails
>>> against python3.8 with:
>>>
>>> I: pybuild base:217: cd
>>> /build/bitshuffle-z2ZvpN/bitshuffle-0.3.5/.pybuild/cpython3_3.8_bitshuffle/build;
>>> python3.8 -m unittest discover -v
>>> [pinibrem15:43725] OPAL ERROR: Unreachable in file ext3x_client.c at
>>> line 112
>>> *** An error occurred in MPI_Init_thread
>>> *** on a NULL communicator
>>> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>>> *** and potentially your MPI job)
>>> [pinibrem15:43725] Local abort before MPI_INIT completed completed
>>> successfully, but am not able to aggregate error messages, and not able
>>> to guarantee that all other processes were killed!
>>> E: pybuild pybuild:352: test: plugin distutils failed with: exit code=1:
>>> cd
>>> /build/bitshuffle-z2ZvpN/bitshuffle-0.3.5/.pybuild/cpython3_3.8_bitshuffle/build;
>>> python3.8 -m unittest discover -v
>>> dh_auto_test: error: pybuild --test -i python{version} -p "3.7 3.8"
>>> returned exit code 13
>>>
>>> But the HDF5 error is no longer present with python3.7. So that seems
>>> like progress.
>>
>> Strange again. I would have expected the same behaviour in python3.8
>> and python3.7, whether successful or unsuccessful.
>
> Putting dh inside mpirun seems to be interfering with process spawning.
> Once MPI is initialised (for the python3.7 test) it is not reinitialised
> for the python3.8 test, so MPI is in a bad state for that test.
> Something like that.
>
> It's only in the tests where h5py is invoked that we hit the problem.
> This variant works, applying mpirun separately to each test run:
>
> override_dh_auto_test:
>         set -e; \
>         for py in `py3versions -s -v`; do \
>                 mpirun -n 1 pybuild --test -i python{version} -p $$py; \
>         done
>
> (We could use mpirun -n $(NPROC) for real MPI testing.)
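[Editor's note: the per-interpreter loop above can be sketched in Python for clarity. This only shows command construction and the stop-on-first-failure behaviour of `set -e`; the `mpirun`/`pybuild` arguments are copied verbatim from the recipe, and `mpirun_test_cmd`/`run_all` are hypothetical helper names, not part of any tool.]

```python
import subprocess

def mpirun_test_cmd(py_version, nproc=1):
    """Build the command the override_dh_auto_test loop runs for one
    interpreter: a fresh mpirun per test run, so MPI_Init happens
    exactly once per process.  'python{version}' is passed literally
    to pybuild, which expands it itself."""
    return ["mpirun", "-n", str(nproc),
            "pybuild", "--test", "-i", "python{version}", "-p", py_version]

def run_all(versions):
    # Equivalent of `set -e; for py in ...`: abort on the first failure.
    for py in versions:
        subprocess.run(mpirun_test_cmd(py), check=True)
```

The point of the per-version loop is that each supported interpreter gets its own mpirun process, sidestepping the one-MPI_Init-per-process limitation that broke the single `mpirun -n 1 dh` invocation.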
Yes, it works! \o/

> Do we want to use this as a solution? Or would you prefer an environment
> variable that h5py can check to allow mpi invocation in a serial process?

I leave this decision up to you. Whichever you choose, it deserves a big
fat note in README.Debian.

> Note that this means bitshuffle as built now is expressly tied to
> hdf5-mpi and h5py-mpi (this seems intentional from debian/rules and
> debian/control, though Build-Depends must be updated to
> python3-h5py-mpi). It's a separate question whether it's desirable to
> also support an hdf5-serial build of bitshuffle. Likewise we need to
> think about what we want to happen when bitshuffle is invoked in a
> serial process.

I'll leave that to the bitshuffle maintainer. I'll propose a patch to fix
the current FTBFS, sticking with the mpi flavour to stay conservative
with respect to bitshuffle's previous builds.

> I think part of the confusion here is that bitshuffle (at least in the
> tests) is double-handling the HDF5 library: direct calls on the one
> hand, and indirect calls through h5py on the other.

Thanks,

_g.
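[Editor's note: the alternative floated in the thread, an environment variable h5py could check before initialising MPI, might look roughly like the sketch below. The variable name `H5PY_FORCE_MPI` and both function names are purely hypothetical (no such switch exists in h5py); only the gating logic is illustrated, and the h5py/mpi4py imports are deferred so the serial path needs neither installed.]

```python
import os

def want_mpi(environ=os.environ):
    """Return True when the (hypothetical) H5PY_FORCE_MPI switch asks
    for MPI-enabled file access even in a serial process."""
    return environ.get("H5PY_FORCE_MPI") == "1"

def open_hdf5(path, mode="r"):
    """Open a file with the mpio driver only when explicitly requested.
    Imports are deferred so that merely loading this module does not
    trigger MPI_Init outside an mpirun environment."""
    import h5py
    if want_mpi():
        from mpi4py import MPI
        return h5py.File(path, mode, driver="mpio", comm=MPI.COMM_WORLD)
    return h5py.File(path, mode)
```

A switch like this would let the mpi build of h5py be imported from a plain serial process, which is the scenario the thread worries about for serial users of bitshuffle.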