Bug#1070300: pmix_psquash_base_select failed during MPI_INIT on 32bit architectures

2024-05-05 Thread Samuel Thibault
Samuel Thibault, on Sat 04 May 2024 11:49:40 +0200, wrote:
> Samuel Thibault, on Fri 03 May 2024 19:00:22 +0200, wrote:
> > This has been posing migration issues for quite some time, I have
> > uploaded the attached fix to delayed/0.
> 
> Some of the components depend on libmca_common_dstore, which also
> needs to be installed; otherwise openmpi emits some text on stderr,
> which some autopkgtests don't like. I have uploaded the attached changes
> to delayed/0.

Sorry, it seems my tests had gone bogus: I do remember testing the result,
but I apparently failed to do so properly. I have double-checked my changes
this time; they are attached and uploaded to delayed/0 (now that openmpi
has been more or less force-migrated to testing).

Samuel
diff -Nru openmpi-4.1.6/debian/changelog openmpi-4.1.6/debian/changelog
--- openmpi-4.1.6/debian/changelog  2024-05-04 11:32:26.0 +0200
+++ openmpi-4.1.6/debian/changelog  2024-05-05 20:38:36.0 +0200
@@ -1,3 +1,10 @@
+openmpi (4.1.6-13.3) unstable; urgency=medium
+
+  * Non-maintainer Upload
+  * Really install libmca_common_dstore.
+
+ -- Samuel Thibault   Sun, 05 May 2024 20:38:36 +0200
+
 openmpi (4.1.6-13.2) unstable; urgency=medium
 
   * Non-maintainer Upload
diff -Nru openmpi-4.1.6/debian/rules openmpi-4.1.6/debian/rules
--- openmpi-4.1.6/debian/rules  2024-05-04 11:32:26.0 +0200
+++ openmpi-4.1.6/debian/rules  2024-05-05 20:38:36.0 +0200
@@ -289,10 +289,11 @@
    dh_install -p libopenmpi3t64 $(LIBDIR)/openmpi/lib/libpmix.so.2.2.35 $(LIBDIR) ; \
    dh_install -p libopenmpi3t64 /usr/share/pmix ; \
    dh_install -p libopenmpi3t64 "/usr/lib/$(DEB_HOST_MULTIARCH)/openmpi/lib/pmix/*.so" ; \
-   if test -f $(DESTDIR)/$(LIBDIR)/openmpi/lib/libmca_common_libdstore.so.1.0.2 ; then \
-   dh_install -p libopenmpi3t64 $(LIBDIR)/libmca_common_libdstore.so.1.0.2 ; \
-   dh_link -p libopenmpi3t64 $(LIBDIR)/libmca_common_libdstore.so.1.0.2 $(LIBDIR)/libmca_common_libdstore.so.1 ; \
-   dh_link -p libopenmpi-dev $(LIBDIR)/libmca_common_libdstore.so.1  $(LIBDIR)/libmca_common_libdstore.so ; \
+   if test -f $(DESTDIR)/$(LIBDIR)/openmpi/lib/libmca_common_dstore.so.1.0.2 ; then \
+   dh_install -p libopenmpi3t64 $(LIBDIR)/openmpi/lib/libmca_common_dstore.so.1.0.2 $(LIBDIR) ; \
+   dh_link -p libopenmpi3t64 $(LIBDIR)/libmca_common_dstore.so.1.0.2 $(LIBDIR)/libmca_common_dstore.so.1 ; \
+   dh_link -p libopenmpi-dev $(LIBDIR)/libmca_common_dstore.so.1   $(LIBDIR)/openmpi/lib/libmca_common_dstore.so ; \
+   dh_link -p libopenmpi-dev $(LIBDIR)/libmca_common_dstore.so.1   $(LIBDIR)/libmca_common_dstore.so ; \
    fi ; \
    dh_link -p libopenmpi3t64 $(LIBDIR)/libpmix.so.2.2.35 $(LIBDIR)/libpmix.so.2  ; \
    dh_link -p libopenmpi-dev $(LIBDIR)/libpmix.so.2 $(LIBDIR)/openmpi/lib/libpmix.so ; \
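
The hunk above only changes the file name tested by the `test -f` guard (and the paths that follow it). A guard like this fails quietly: when the name does not match anything in the staging tree, the whole block is skipped without an error. The following is a minimal sketch of that behaviour on a throwaway directory. The function name `guarded_install` and the directory layout are hypothetical and only stand in for the block in debian/rules.

```shell
#!/bin/sh
# Hypothetical stand-in for the guarded block in debian/rules: report whether
# the guarded commands would run for a given library file name.
guarded_install() {
    # $1: staging directory, $2: library file name
    if test -f "$1/$2"; then
        echo "install $2"
    else
        echo "skip $2"
    fi
}

# Throwaway staging tree that contains only the correctly named library.
staging=$(mktemp -d)
touch "$staging/libmca_common_dstore.so.1.0.2"

# Name used by the removed lines: the guard is false, nothing is installed.
guarded_install "$staging" libmca_common_libdstore.so.1.0.2
# Name used by the added lines: the guard is true.
guarded_install "$staging" libmca_common_dstore.so.1.0.2

rm -rf "$staging"
```

Because the skipped branch produces no diagnostic, a build with the wrong name still succeeds, which would be consistent with the problem only showing up later in the installed package.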


Bug#1070300: pmix_psquash_base_select failed during MPI_INIT on 32bit architectures

2024-05-04 Thread Samuel Thibault
Samuel Thibault, on Fri 03 May 2024 19:00:22 +0200, wrote:
> This has been posing migration issues for quite some time, I have
> uploaded the attached fix to delayed/0.

Some of the components depend on libmca_common_dstore, which also
needs to be installed; otherwise openmpi emits some text on stderr,
which some autopkgtests don't like. I have uploaded the attached changes
to delayed/0.

Samuel
diff -Nru openmpi-4.1.6/debian/changelog openmpi-4.1.6/debian/changelog
--- openmpi-4.1.6/debian/changelog  2024-05-03 18:53:52.0 +0200
+++ openmpi-4.1.6/debian/changelog  2024-05-04 11:32:26.0 +0200
@@ -1,3 +1,11 @@
+openmpi (4.1.6-13.2) unstable; urgency=medium
+
+  * Non-maintainer Upload
+  * Also install libmca_common_dstore.
+  * Do not install .la pmix files.
+
+ -- Samuel Thibault   Sat, 04 May 2024 11:32:26 +0200
+
 openmpi (4.1.6-13.1) unstable; urgency=medium
 
   * Non-maintainer Upload
diff -Nru openmpi-4.1.6/debian/rules openmpi-4.1.6/debian/rules
--- openmpi-4.1.6/debian/rules  2024-05-03 18:49:28.0 +0200
+++ openmpi-4.1.6/debian/rules  2024-05-04 11:32:26.0 +0200
@@ -288,7 +288,12 @@
    echo "PMIX: install " ;  \
    dh_install -p libopenmpi3t64 $(LIBDIR)/openmpi/lib/libpmix.so.2.2.35 $(LIBDIR) ; \
    dh_install -p libopenmpi3t64 /usr/share/pmix ; \
-   dh_install -p libopenmpi3t64 /usr/lib/$(DEB_HOST_MULTIARCH)/openmpi/lib/pmix ; \
+   dh_install -p libopenmpi3t64 "/usr/lib/$(DEB_HOST_MULTIARCH)/openmpi/lib/pmix/*.so" ; \
+   if test -f $(DESTDIR)/$(LIBDIR)/openmpi/lib/libmca_common_libdstore.so.1.0.2 ; then \
+   dh_install -p libopenmpi3t64 $(LIBDIR)/libmca_common_libdstore.so.1.0.2 ; \
+   dh_link -p libopenmpi3t64 $(LIBDIR)/libmca_common_libdstore.so.1.0.2 $(LIBDIR)/libmca_common_libdstore.so.1 ; \
+   dh_link -p libopenmpi-dev $(LIBDIR)/libmca_common_libdstore.so.1  $(LIBDIR)/libmca_common_libdstore.so ; \
+   fi ; \
    dh_link -p libopenmpi3t64 $(LIBDIR)/libpmix.so.2.2.35 $(LIBDIR)/libpmix.so.2  ; \
    dh_link -p libopenmpi-dev $(LIBDIR)/libpmix.so.2 $(LIBDIR)/openmpi/lib/libpmix.so ; \
    dh_link -p libopenmpi-dev $(LIBDIR)/libpmix.so.2 $(LIBDIR)/libpmix.so ; \
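
The changelog entry "Do not install .la pmix files" corresponds to replacing the whole `pmix` directory with the `pmix/*.so` pattern in the hunk above. As a rough illustration of what such a pattern selects, here is a sketch that lists the `.so` files of a directory and ignores everything else. The helper `select_components` and the file names in the demonstration are made up for the example, and the sketch lets the shell expand the pattern, whereas in debian/rules the pattern is quoted, presumably so that dh_install expands it itself.

```shell
#!/bin/sh
# Hypothetical helper: print the base names of the files in directory $1
# that match the *.so pattern, one per line.
select_components() {
    for f in "$1"/*.so; do
        # When nothing matches, the pattern stays literal; skip that case.
        if [ -e "$f" ]; then
            basename "$f"
        fi
    done
}

# Throwaway directory with one plugin and its libtool archive
# (illustrative file names, not taken from a real build).
demo=$(mktemp -d)
touch "$demo/mca_psquash_native.so" "$demo/mca_psquash_native.la"

# Only the .so file is listed; the .la file is left out.
select_components "$demo"

rm -rf "$demo"
```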


Bug#1070300: pmix_psquash_base_select failed during MPI_INIT on 32bit architectures

2024-05-03 Thread Samuel Thibault
Hello,

This has been posing migration issues for quite some time, I have
uploaded the attached fix to delayed/0.

Samuel
diff -Nru openmpi-4.1.6/debian/changelog openmpi-4.1.6/debian/changelog
--- openmpi-4.1.6/debian/changelog  2024-04-27 18:37:26.0 +0200
+++ openmpi-4.1.6/debian/changelog  2024-05-03 18:53:52.0 +0200
@@ -1,3 +1,10 @@
+openmpi (4.1.6-13.1) unstable; urgency=medium
+
+  * Non-maintainer Upload
+  * Also install pmix components on 32-bit systems. Closes: #1070300
+
+ -- Samuel Thibault   Fri, 03 May 2024 18:53:52 +0200
+
 openmpi (4.1.6-13) unstable; urgency=medium
 
   * Move pmix help files to libopenmpi3t64, not openmpi3-common
diff -Nru openmpi-4.1.6/debian/rules openmpi-4.1.6/debian/rules
--- openmpi-4.1.6/debian/rules  2024-04-27 18:37:26.0 +0200
+++ openmpi-4.1.6/debian/rules  2024-05-03 18:49:28.0 +0200
@@ -287,7 +287,8 @@
    if $(DO_OWN_PMIX); then \
    echo "PMIX: install " ;  \
    dh_install -p libopenmpi3t64 $(LIBDIR)/openmpi/lib/libpmix.so.2.2.35 $(LIBDIR) ; \
-   dh_install -p libopenmpi3t64 /usr/share/pmix/* ; \
+   dh_install -p libopenmpi3t64 /usr/share/pmix ; \
+   dh_install -p libopenmpi3t64 /usr/lib/$(DEB_HOST_MULTIARCH)/openmpi/lib/pmix ; \
    dh_link -p libopenmpi3t64 $(LIBDIR)/libpmix.so.2.2.35 $(LIBDIR)/libpmix.so.2  ; \
    dh_link -p libopenmpi-dev $(LIBDIR)/libpmix.so.2 $(LIBDIR)/openmpi/lib/libpmix.so ; \
    dh_link -p libopenmpi-dev $(LIBDIR)/libpmix.so.2 $(LIBDIR)/libpmix.so ; \
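
The fix above installs the pmix component directory, which per the changelog entry was missing on 32-bit systems. One way to check an installed package for this kind of problem is to count the psquash plugins in that directory. The sketch below does so on a throwaway directory. The helper `count_psquash` is hypothetical, and the `mca_psquash_*.so` naming pattern is an assumption derived from the framework name in the error message, so it should be verified against the actual package contents.

```shell
#!/bin/sh
# Hypothetical helper: count the psquash plugins in component directory $1.
# The mca_psquash_*.so pattern is an assumption made for illustration.
count_psquash() {
    n=0
    for f in "$1"/mca_psquash_*.so; do
        # When nothing matches, the pattern stays literal; do not count it.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Demonstration on a throwaway directory, not on a real installation.
plugdir=$(mktemp -d)
echo "before: $(count_psquash "$plugdir")"
touch "$plugdir/mca_psquash_native.so"
echo "after: $(count_psquash "$plugdir")"
rm -rf "$plugdir"
```

On a real system the argument would be the installed component directory, for example the `/usr/lib/$(DEB_HOST_MULTIARCH)/openmpi/lib/pmix` path from the diff with the multiarch triplet filled in. A count of zero there would fit the `pmix_psquash_base_select failed` error from the original report.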


Bug#1070300: pmix_psquash_base_select failed during MPI_INIT on 32bit architectures

2024-05-03 Thread Markus Blatt

Hi,

the problem already appears in OpenMPI's own autopkgtests, see [1]

Best,

Markus

[1] https://ci.debian.net/packages/o/openmpi/unstable/i386/46207866/



Bug#1070300: pmix_psquash_base_select failed during MPI_INIT on 32bit architectures

2024-05-03 Thread Markus Blatt
Source: openmpi
Version: 4.1.6-13
Severity: serious
Justification: unknown
Control: affects -1 src:dune-grid

Dear Maintainer,

I just uploaded a new version of package dune-grid and noticed that none of our
parallel tests start successfully on 32bit architectures.

  2/66 Test  #2: scsgmappertest ***Failed    0.15 sec
--
It looks like pmix_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during pmix_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
PMIX developer):

  pmix_psquash_base_select failed
  --> Returned value -46 instead of PMIX_SUCCESS
--

[arm-ubc-05:12560] PMIX ERROR: NOT-FOUND in file ../../../../../../../../opal/mca/pmix/pmix3x/pmix/src/server/pmix_server.c at line 237
[arm-ubc-05:12559] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a daemon on the local node in file ../../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line 716
[arm-ubc-05:12559] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to start a daemon on the local node in file ../../../../../../orte/mca/ess/singleton/ess_singleton_module.c at line 172
--
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_init failed
  --> Returned value Unable to start a daemon on the local node (-127) instead of ORTE_SUCCESS
--
--
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: ompi_rte_init failed
  --> Returned "Unable to start a daemon on the local node" (-127) instead of "Success" (0)
--
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[arm-ubc-05:12559] Local abort before MPI_INIT completed completed
successfully, but am not able to aggregate error messages, and not able to
guarantee that all other processes were killed!

See [1] for a complete build where the tests using mpirun fail in this way.

This happens on these architectures: armel, armhf, i386, hppa

Best,

Markus
[1] https://buildd.debian.org/status/fetch.php?pkg=dune-grid&arch=armel&ver=2.9.0-4&stamp=1714724856&raw=0


-- System Information:
Debian Release: 12.5
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 6.1.0-20-amd64 (SMP w/64 CPU threads; PREEMPT)
Locale: LANG=de_DE.UTF-8, LC_CTYPE=de_DE.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages openmpi-bin depends on:
ii  libc6                        2.36-9+deb12u6
ii  libevent-core-2.1-7          2.1.12-stable-8
ii  libopenmpi3                  4.1.4-3+b1
ii  openmpi-common               4.1.4-3
ii  openssh-client [ssh-client]  1:9.2p1-2+deb12u2

openmpi-bin recommends no packages.

Versions of packages openmpi-bin suggests:
ii  gfortran [fortran-compiler]  4:12.2.0-3

-- no debconf information