Bug#1058876: libopenmpi-dev: paths missing /usr/include...(for fortran mpi.mod)
On 27/12/2023 21:30, Drew Parsons wrote:
> Hi Alastair, given the complexity around hacking openmpi to accommodate placing the mod files under /usr/include, I'm starting to wonder whether it's the best way of resolving Bug#1058526 in the first place.
>
> I did a bit of background reading on the Fortran mod files. There's a fair bit of dissent about them, and no consensus on a proper location, e.g.
> https://fortranwiki.org/fortran/show/Library+distribution
>
> The files are binary dependent (and compiler version dependent), and it's not clear that /usr/include is the best place for them anyway. mpich seems to be fine placing them in /usr/lib/x86_64-linux-gnu/fortran/gfortran-mod-15/mpich, and openmpi seemed to be happy enough doing the same up until Bug#1058526.
>
> Is there a different way of resolving Bug#1058526 without moving the mod files to /usr/include?
>
> Drew

I had altered FMODDIR from /usr/lib/ to /usr/include to match what appears to be the most common convention, but given the problems caused, I'm backing out that change and reverting to /usr/lib/${DEB_HOST_MULTIARCH}/. This needs changes in dh-fortran-mod and openmpi, which I'm doing today.

Alastair

-- 
Alastair McKinstry, GPG: 82383CE9165B347C787081A2CBE6BB4E5D9AD3A5
ph: +353 87 6847928 e: alast...@mckinstry.ie, im: @alastair:mckinstry.ie
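For readers following along, here is a minimal sketch of the per-architecture layout being reverted to. It assumes the openmpi modules end up in the same shape as the mpich path quoted above (/usr/lib/<multiarch>/fortran/gfortran-mod-15/<flavour>). The function name `fortran_mod_dir` and its parameters are hypothetical, not part of dh-fortran-mod.

```python
from pathlib import PurePosixPath


def fortran_mod_dir(multiarch: str,
                    compiler_dir: str = "gfortran-mod-15",
                    flavour: str = "openmpi") -> str:
    """Build a module directory under /usr/lib/<multiarch>/fortran/.

    The layout mirrors the mpich path quoted in the thread; whether openmpi
    uses exactly this subdirectory after the revert is an assumption.
    """
    # Reject empty values or anything that would add extra path components.
    if not multiarch or "/" in multiarch:
        raise ValueError(f"not a multiarch triplet: {multiarch!r}")
    path = PurePosixPath("/usr/lib") / multiarch / "fortran" / compiler_dir / flavour
    return str(path)


if __name__ == "__main__":
    # The triplet would normally come from DEB_HOST_MULTIARCH at build time.
    print(fortran_mod_dir("aarch64-linux-gnu"))
```

The point of keeping the triplet as an argument is that nothing architecture-specific is baked in at the time the helper is written, which is the property the amd64-built files lacked.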
Bug#1058876: libopenmpi-dev: paths missing /usr/include...(for fortran mpi.mod)
Hi Alastair, given the complexity around hacking openmpi to accommodate placing the mod files under /usr/include, I'm starting to wonder whether it's the best way of resolving Bug#1058526 in the first place.

I did a bit of background reading on the Fortran mod files. There's a fair bit of dissent about them, and no consensus on a proper location, e.g.
https://fortranwiki.org/fortran/show/Library+distribution

The files are binary dependent (and compiler version dependent), and it's not clear that /usr/include is the best place for them anyway. mpich seems to be fine placing them in /usr/lib/x86_64-linux-gnu/fortran/gfortran-mod-15/mpich, and openmpi seemed to be happy enough doing the same up until Bug#1058526.

Is there a different way of resolving Bug#1058526 without moving the mod files to /usr/include?

Drew
Bug#1058876: libopenmpi-dev: paths missing /usr/include...(for fortran mpi.mod)
On 2023-12-27 09:51, Alastair McKinstry wrote:
> On 27/12/2023 08:45, Drew Parsons wrote:
>> ...
>> I guess the problem must be the common files from openmpi-common in /usr/share/openmpi/. They're not actually arch-independent. Do mpif90.openmpi and the other components actively read them at runtime?
>> ..
>
> This appears to be it. I've been building on arm64 recently (a VM on a mac) and don't see this.
>
> There appears to be a mechanism for including ${includedir} and ${libdir} and evaluating the wrapper-data files at runtime. My hacking on these files in d/rules is causing the errors. I'll work on a better solution.

I can see that at the lowest level the location is pkgdatadir, at l.110 (and elsewhere) in ompi/tools/wrappers/Makefile.am.

It's not clear whether hacking it at that point will interfere with the orterun binary finding them. If not, then it could in principle be replaced with something like $(pkglibdir)/$(datadir) (i.e. in a share subdir under the openmpi libdir). We might call it "pkglibdatadir".

The default value for pkgdatadir is set as $(datadir)/@PACKAGE@, l.129 in the toplevel Makefile.in. datadir is the autotools default ${prefix}/share (i.e. /usr/share), see
https://www.gnu.org/software/automake/manual/html_node/Standard-Directory-Variables.html

If orterun can be trained to look for the wrapper txt files in pkglibdatadir (presumably as well as pkgdatadir, not instead of it), then setting and using "pkglibdatadir" instead of pkgdatadir in ompi/tools/wrappers/Makefile.am "might" be simple and reliable. Reliability depends on whether any other component uses these wrapper txt files.
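To make the "as well as pkgdatadir, not instead of" idea concrete, here is a small sketch of the lookup order being proposed: try an arch-specific directory first, then fall back to the shared one. This is only an illustration in Python, not how the Open MPI wrapper code is written; `find_wrapper_data` and the example directories are hypothetical, and "pkglibdatadir" is the name coined in this thread rather than an existing autotools variable.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_wrapper_data(name: str, search_dirs: Iterable[Path]) -> Optional[Path]:
    """Return the first <name>-wrapper-data.txt found in search_dirs, in order.

    Listing the arch-specific directory (the proposed "pkglibdatadir") before
    the shared pkgdatadir lets a per-architecture file win when both exist.
    """
    for directory in search_dirs:
        candidate = Path(directory) / f"{name}-wrapper-data.txt"
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Hypothetical locations, for illustration only.
    dirs = [
        Path("/usr/lib/aarch64-linux-gnu/openmpi/share"),  # "pkglibdatadir"
        Path("/usr/share/openmpi"),                        # pkgdatadir
    ]
    print(find_wrapper_data("mpif90.openmpi", dirs))
```

With this ordering, existing installs that only ship files in /usr/share/openmpi keep working, while arch-specific files take precedence where they are installed.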
Bug#1058876: libopenmpi-dev: paths missing /usr/include...(for fortran mpi.mod)
On 27/12/2023 08:45, Drew Parsons wrote:
> On 2023-12-26 12:45, Drew Parsons wrote:
>> I can manually reproduce the error trivially on an arm64 chroot (amdahl.debian.org).
>>
>> Copying hello.f90 from openmpi's debian/tests and manually running
>> mpif90 -o hello hello.f90
>> reproduces the error referencing the x86_64 include path on the arm64 machine.
>>
>> `mpif90.openmpi -print-search-dirs` only shows aarch64 paths though.
>
> I guess the problem must be the common files from openmpi-common in /usr/share/openmpi/. They're not actually arch-independent. Do mpif90.openmpi and the other components actively read them at runtime?
>
> For instance, /usr/share/openmpi/mpif90.openmpi-wrapper-data.txt contains
> fmoddir=/usr/include/x86_64-linux-gnu/fortran/gfortran-mod-15
>
> Since openmpi-common is marked Arch: all, it's only built once, on amd64, hence x86_64-linux-gnu gets carried to the other arches.
>
> The compiler_flags variable is also affected, alongside fmoddir.
>
> It looks like only the MPI Fortran wrapper txts are affected:
> mpif77-wrapper-data.txt
> mpif77.openmpi-wrapper-data.txt
> mpif90-wrapper-data.txt
> mpif90.openmpi-wrapper-data.txt
> mpifort-wrapper-data.txt
> mpifort.openmpi-wrapper-data.txt
>
> Should these be moved from openmpi-common to libopenmpi-dev (or openmpi-bin) at /usr/lib//openmpi/share ?

This appears to be it. I've been building on arm64 recently (a VM on a mac) and don't see this.

There appears to be a mechanism for including ${includedir} and ${libdir} and evaluating the wrapper-data files at runtime. My hacking on these files in d/rules is causing the errors. I'll work on a better solution.

Alastair

-- 
Alastair McKinstry, GPG: 82383CE9165B347C787081A2CBE6BB4E5D9AD3A5
ph: +353 87 6847928 e: alast...@mckinstry.ie, im: @alastair:mckinstry.ie
Bug#1058876: libopenmpi-dev: paths missing /usr/include...(for fortran mpi.mod)
On 2023-12-26 12:45, Drew Parsons wrote:
> I can manually reproduce the error trivially on an arm64 chroot (amdahl.debian.org).
>
> Copying hello.f90 from openmpi's debian/tests and manually running
> mpif90 -o hello hello.f90
> reproduces the error referencing the x86_64 include path on the arm64 machine.
>
> `mpif90.openmpi -print-search-dirs` only shows aarch64 paths though.

I guess the problem must be the common files from openmpi-common in /usr/share/openmpi/. They're not actually arch-independent. Do mpif90.openmpi and the other components actively read them at runtime?

For instance, /usr/share/openmpi/mpif90.openmpi-wrapper-data.txt contains
fmoddir=/usr/include/x86_64-linux-gnu/fortran/gfortran-mod-15

Since openmpi-common is marked Arch: all, it's only built once, on amd64, hence x86_64-linux-gnu gets carried to the other arches.

The compiler_flags variable is also affected, alongside fmoddir.

It looks like only the MPI Fortran wrapper txts are affected:
mpif77-wrapper-data.txt
mpif77.openmpi-wrapper-data.txt
mpif90-wrapper-data.txt
mpif90.openmpi-wrapper-data.txt
mpifort-wrapper-data.txt
mpifort.openmpi-wrapper-data.txt

Should these be moved from openmpi-common to libopenmpi-dev (or openmpi-bin) at /usr/lib//openmpi/share ?
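A quick way to check which variables in a wrapper-data file carry a foreign triplet is sketched below. It assumes the files are plain key=value text with optional # comment lines, as the fmoddir= line above suggests; the helper name `foreign_triplets` and the regular expression are my own and only cover *-linux-gnu* style triplets.

```python
import re
from typing import Dict, Set

# Matches a multiarch triplet directly under /usr/include or /usr/lib,
# e.g. x86_64-linux-gnu or arm-linux-gnueabihf (assumed pattern).
TRIPLET_RE = re.compile(r"/usr/(?:include|lib)/([A-Za-z0-9_]+-linux-gnu[A-Za-z0-9_]*)")


def foreign_triplets(wrapper_text: str, host_triplet: str) -> Dict[str, Set[str]]:
    """Map each key to the non-host triplets that appear in its value."""
    found: Dict[str, Set[str]] = {}
    for line in wrapper_text.splitlines():
        line = line.strip()
        # Skip blanks, comments and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        others = {t for t in TRIPLET_RE.findall(value) if t != host_triplet}
        if others:
            found[key.strip()] = others
    return found


if __name__ == "__main__":
    sample = "fmoddir=/usr/include/x86_64-linux-gnu/fortran/gfortran-mod-15\n"
    print(foreign_triplets(sample, "aarch64-linux-gnu"))
```

Run against each of the six files listed above on a non-amd64 host, any non-empty result would point at a variable that was fixed at build time on amd64.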
Bug#1058876: libopenmpi-dev: paths missing /usr/include...(for fortran mpi.mod)
On 2023-12-26 12:31, Drew Parsons wrote:
> ...
> It's not just adios2 and sundials though. openmpi's own arm64 tests are failing on debci with a reference to x86_64-linux-gnu
> ...
> openmpi's compile_run_mpif90 test doesn't use pkgconfig anyway. It builds directly with mpif90. Could the problem be inside the mpif90.openmpi binary? That would be strange though. arm64's mpif90.openmpi oughtn't be referring to x86_64 any more than the pkgconfig file.

I can manually reproduce the error trivially on an arm64 chroot (amdahl.debian.org).

Copying hello.f90 from openmpi's debian/tests and manually running
mpif90 -o hello hello.f90
reproduces the error referencing the x86_64 include path on the arm64 machine.

`mpif90.openmpi -print-search-dirs` only shows aarch64 paths though.
Bug#1058876: libopenmpi-dev: paths missing /usr/include...(for fortran mpi.mod)
On 2023-12-26 11:00, Alastair McKinstry wrote:
> On 24/12/2023 10:50, Drew Parsons wrote:
>> reopen 1058876
>> block 1058944 by 1058876
>> thanks
>>
>> Alas, the fix in openmpi 4.1.6-3 for the include path to the openmpi fortran modules has hardcoded x86_64-linux-gnu.
>>
>> This is preventing builds and tests on other architectures, e.g. rebuilding sundials for the petsc transition.
>>
>> I think openmpi's debian/tests will also need Depends: pkg-config for the new compile_run_cc_pkgconfig test.
>
> The problem appears to be the heuristics in upstream/FindMPI.cmake in adios2 (and sundials). It happens in sid tests but not my arm64 devel environment. Debugging slowly.

It's not just adios2 and sundials though. openmpi's own arm64 tests are failing on debci with a reference to x86_64-linux-gnu, e.g.

79s Setting up libopenmpi-dev:arm64 (4.1.6-3) ...
79s update-alternatives: using /usr/lib/aarch64-linux-gnu/openmpi/include to provide /usr/include/aarch64-linux-gnu/mpi (mpi-aarch64-linux-gnu) in auto mode
79s Setting up autopkgtest-satdep (0) ...
79s Processing triggers for libc-bin (2.37-12) ...
83s (Reading database ... 17753 files and directories currently installed.)
83s Removing autopkgtest-satdep (0) ...
86s autopkgtest [03:14:37]: test compile_run_mpif90: [---
86s f951: Warning: Nonexistent include directory ‘/usr/include/x86_64-linux-gnu/fortran/gfortran-mod-15/openmpi’ [-Wmissing-include-dirs]
86s hello.f90:3:11:
86s
86s     3 |     use mpi
86s       |           1
86s Fatal Error: Cannot open module file ‘mpi.mod’ for reading at (1): No such file or directory

It's a strange error to be sure. From that error message, I thought x86_64-linux-gnu might have gotten hardcoded into the include path in ompi-f90.pc for arm64. But downloading libopenmpi-dev_4.1.6-3_arm64.deb and inspecting manually, I can see that arm64's ompi-f90.pc contains
-I/usr/include/aarch64-linux-gnu/fortran/gfortran-mod-15/openmpi
which would be the correct path.

I unpacked libopenmpi-dev_4.1.6-3_arm64.deb manually, but I can't find any reference to include/x86_64 inside its files.

openmpi's compile_run_mpif90 test doesn't use pkgconfig anyway. It builds directly with mpif90. Could the problem be inside the mpif90.openmpi binary? That would be strange though. arm64's mpif90.openmpi oughtn't be referring to x86_64 any more than the pkgconfig file.

Best of luck with debugging.

Drew
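When triaging many debci logs for this symptom, a small filter can pull out the "Nonexistent include directory" warnings and flag the ones that name a triplet other than the host's. This is a sketch under the assumption that gfortran's warning text looks like the log above; the function names are hypothetical.

```python
import re
from typing import List

# gfortran quotes the path with typographic or ASCII quotes depending on locale.
MISSING_DIR_RE = re.compile(r"Nonexistent include directory [‘'`]([^’']+)[’']")


def missing_include_dirs(log: str) -> List[str]:
    """Return every directory reported as a nonexistent include directory."""
    return MISSING_DIR_RE.findall(log)


def wrong_arch_dirs(log: str, host_triplet: str) -> List[str]:
    """Return missing directories that contain a *-linux-gnu* triplet other than the host's."""
    wrong = []
    for directory in missing_include_dirs(log):
        triplets = [part for part in directory.split("/") if "-linux-gnu" in part]
        if any(t != host_triplet for t in triplets):
            wrong.append(directory)
    return wrong


if __name__ == "__main__":
    line = ("86s f951: Warning: Nonexistent include directory "
            "‘/usr/include/x86_64-linux-gnu/fortran/gfortran-mod-15/openmpi’ "
            "[-Wmissing-include-dirs]")
    print(wrong_arch_dirs(line, "aarch64-linux-gnu"))
```

A non-empty result on an arm64 log, as here, means the include path was generated for a different architecture rather than merely being absent.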
Bug#1058876: libopenmpi-dev: paths missing /usr/include...(for fortran mpi.mod)
On 24/12/2023 10:50, Drew Parsons wrote:
> reopen 1058876
> block 1058944 by 1058876
> thanks
>
> Alas, the fix in openmpi 4.1.6-3 for the include path to the openmpi fortran modules has hardcoded x86_64-linux-gnu.
>
> This is preventing builds and tests on other architectures, e.g. rebuilding sundials for the petsc transition.
>
> I think openmpi's debian/tests will also need Depends: pkg-config for the new compile_run_cc_pkgconfig test.

The problem appears to be the heuristics in upstream/FindMPI.cmake in adios2 (and sundials). It happens in sid tests but not my arm64 devel environment. Debugging slowly.

-- 
Alastair McKinstry, GPG: 82383CE9165B347C787081A2CBE6BB4E5D9AD3A5
ph: +353 87 6847928 e: alast...@mckinstry.ie, im: @alastair:mckinstry.ie
Bug#1058876: libopenmpi-dev: paths missing /usr/include...(for fortran mpi.mod)
reopen 1058876
block 1058944 by 1058876
thanks

Alas, the fix in openmpi 4.1.6-3 for the include path to the openmpi fortran modules has hardcoded x86_64-linux-gnu.

This is preventing builds and tests on other architectures, e.g. rebuilding sundials for the petsc transition.

I think openmpi's debian/tests will also need Depends: pkg-config for the new compile_run_cc_pkgconfig test.