Thanks for this detailed explanation, Drew. I released version 3.1.0+dfsg2-5 of the xmds2 package before reading it. I added python3-h5py to Build-Depends and libhdf5-mpi-dev to Depends, as you suggested (although there is a typo in the debian/changelog entry, which erroneously states that libhdf5-serial-dev was added; I will fix this in the next release).

I also used H5PY_ALWAYS_USE_MPI=1, as you mentioned.

As for also adding python3-h5py-serial to Depends and putting fallback code in place, I will have to give it a little thought. Maybe I should discuss this with the upstream authors to find out what they think. Let us see how things evolve. At least, I hope that version 3.1.0+dfsg2-5 will really fix Bug#1053314 and that the h5py transition will be completed.

Best,

Rafael

* Drew Parsons <dpars...@debian.org> [2023-10-09 02:23]:

Nilesh explained most of the situation correctly. I can give some more background. It made sense (to me) to have h5py built against hdf5-mpi, since I figured that if you need the complexity of the hdf5 file format then you probably want to use it in an MPI environment.

There was a complaint from a user though, who wanted to make use of a massive ensemble of HDF5 (h5py) serial jobs, and the small cost of loading up MPI support was interfering with their throughput.

So the compromise solution was to provide both builds, with a custom __init__.py to select the serial or MPI build depending on runtime environment. If an MPI environment is detected then the h5py MPI build is loaded, otherwise the serial build is loaded.
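A minimal sketch of that kind of runtime selection follows. It is an illustration only, not the actual Debian h5py/__init__.py: the function name select_build and the list of launcher variables are assumptions (Open MPI exports OMPI_COMM_WORLD_SIZE to the processes it launches; PMI_SIZE is assumed here for MPICH's Hydra launcher).

```python
# Hypothetical sketch; NOT the real Debian h5py/__init__.py.
# Environment variables assumed to be exported by MPI launchers:
# OMPI_COMM_WORLD_SIZE (Open MPI) and PMI_SIZE (MPICH/Hydra).
MPI_LAUNCHER_VARS = ("OMPI_COMM_WORLD_SIZE", "PMI_SIZE")


def select_build(environ):
    """Return "mpi" if the mapping looks like an MPI job, else "serial"."""
    for var in MPI_LAUNCHER_VARS:
        if var in environ:
            return "mpi"
    return "serial"
```

Called as select_build(os.environ), this would pick the MPI build under mpirun and the serial build in a plain python3 process.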

If you want to run h5py in a serial process, then one might say you'd normally want the serial build. As Nilesh noted, I put in a mechanism to load the MPI build if you really want to access the mpi build in a serial process (mpirun -n 1 is not a "serial" process as such, it's still an MPI environment even though using only 1 process).

The mechanism to force MPI loading is NOT to set OMPI_COMM_WORLD_SIZE. I recommend NOT doing that. I couldn't promise it won't mess up other things, certainly it will get in the way of an MPICH environment. No, the mechanism for handling this for h5py is described in /usr/share/doc/python3-h5py/README.Debian: set H5PY_ALWAYS_USE_MPI=1
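For a package that wants this from inside its own Python code, a hedged sketch could look like the following. It assumes that the Debian h5py wrapper reads H5PY_ALWAYS_USE_MPI at import time, so the variable has to be set before the first import of h5py; the helper name request_h5py_mpi is made up for illustration, and this require_h5py is a simplified stand-in rather than the one in xmds2.

```python
import os


def request_h5py_mpi(environ=os.environ):
    """Ask for the h5py MPI build via H5PY_ALWAYS_USE_MPI (see README.Debian)."""
    # setdefault keeps any value the user has already exported.
    environ.setdefault("H5PY_ALWAYS_USE_MPI", "1")
    return environ["H5PY_ALWAYS_USE_MPI"]


def require_h5py():
    """Import h5py lazily, after the variable is in place (assumed ordering)."""
    request_h5py_mpi()
    import h5py  # deliberately imported only after the environment is set
    return h5py
```

Unlike setting OMPI_COMM_WORLD_SIZE, this only touches a variable that is specific to h5py, so it should not interfere with Open MPI or MPICH themselves.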

Is there a way to force h5py to import _debian_h5py_serial instead of _debian_h5py_mpi, via the generic h5py namespace?

It sounds like there is some confusion about how xmds2 should be used. Is it intended to be used as a serial process or MPI? I noted in the bug report that xmds2 Depends: libhdf5-serial-dev. Is it even using MPI? If you want it to be using h5py-serial, then why does xmds2 depend on python3-h5py-mpi?

It seems to me that xmds2's h5py dependency should be the same as its hdf5 dependency. If it uses libhdf5-serial then should it be depending on just python3-h5py (implying python3-h5py-serial, make it explicit if needed) and not depend on python3-h5py-mpi?

If xmds2 is intended to be flexible, equally happy in serial and MPI environments (and can actually make use of h5py-mpi) then perhaps the dependency should cover all cases, Depends: python3-h5py, python3-h5py-serial, python3-h5py-mpi all three explicitly, since otherwise one or the other of -serial or -mpi would be missed.

The problem raises interesting questions about h5py configuration. I set it up so you could choose how you want it to work, with or without MPI support. But it looks like an edge case is missing: it's failing in serial jobs if you chose to set up your installation with python3-h5py-mpi and explicitly don't want python3-h5py-serial (unless you always set H5PY_ALWAYS_USE_MPI). Perhaps I should add an additional fallback to try h5py-mpi if h5py-serial is not found (in a serial job), the same way that h5py-serial is loaded as a fallback in an MPI job if h5py-mpi is not found. On the other hand maybe that just hides the real problem, that h5py-serial was not installed when actually it was wanted after all. The ImportError correctly identifies that case.
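A generic sketch of such a fallback is given below. It is only an illustration of the idea, not code from the h5py package: the helper name import_first is hypothetical, and it takes ordinary absolute module names rather than the relative imports used inside h5py/__init__.py.

```python
import importlib


def import_first(candidates):
    """Import and return the first module in `candidates` that can be imported."""
    errors = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            # Remember why this candidate failed and try the next one.
            errors.append(f"{name}: {exc}")
    raise ImportError("no candidate could be imported: " + "; ".join(errors))
```

In a serial job the candidate list would name the serial build first and the MPI build second. The trade-off is the one described above: a silent fallback would no longer report that the serial build was wanted but not installed.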




On 2023-10-08 17:38, Nilesh Patra wrote:
Hello,

On 10/8/23 17:22, Rafael Laboissière wrote:
Ok, I tried to fix the building problem by including python3-h5py,
alongside python3-h5py-mpi, in Build-Depends, as suggested
by Drew, but the xmds2 package FTBFS.

Here is a way to reproduce the problem without building the package:

  $ dpkg -l python3-h5py\*
  Desired=Unknown/Install/Remove/Purge/Hold
  | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
  |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
  ||/ Name                Version      Architecture Description
  +++-===================-============-============-=======================================================
  ii  python3-h5py        3.9.0-3      all          general-purpose Python interface to hdf5
  ii  python3-h5py-mpi    3.9.0-3      amd64        general-purpose Python interface to hdf5 (Python 3 MPI)
  un  python3-h5py-serial <none>       <none>       (no description available)

  $ echo 'import h5py' | python3
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/usr/lib/python3/dist-packages/h5py/__init__.py", line 21, in <module>
      from . import _debian_h5py_serial as _h5py
  ImportError: cannot import name '_debian_h5py_serial' from partially initialized module 'h5py' (most likely due to a circular import) (/usr/lib/python3/dist-packages/h5py/__init__.py)

Is there a way to force h5py to import _debian_h5py_serial instead of _debian_h5py_mpi, via the generic h5py namespace?

Drew would probably answer that question better but from taking a brief look, it seems to be on expected lines. This should work if you run it explicitly with mpi.

$ mpirun -n 1 python3 -c "import h5py" && echo "true"
true

or with setting the MPI var manually.

$ OMPI_COMM_WORLD_SIZE=1 python3 -c "import h5py" && echo "true"
true

If you want the _debian_h5py_serial interface then you need python3-h5py-serial and the B-D (and Depends) on h5py-mpi should be dropped which would mean this package does not need the -mpi package.

Otherwise, an (unreliable) hack that you could use is to add a B-D on h5py *before* mpi, and then -serial should also be installed (at least in my env).

If the code really needs h5py-mpi, then it should be running the build/tests with mpi enabled (via openmpi). At least that's the impression I get from reading.

        https://sources.debian.org/src/h5py/3.9.0-3/debian/README.Debian/

This patch gets the package building for me with h5py-mpi+h5py, but not sure if it is the right thing to do -- please verify for yourself as package maintainer :)

--- a/xpdeint/XSILFile.py
+++ b/xpdeint/XSILFile.py
@@ -31,6 +31,9 @@ numpy = None
 
+# Set env var to use h5py-mpi
+os.environ['OMPI_COMM_WORLD_SIZE'] = '1'
+
 def require_h5py():
   global h5py
   if not h5py:

