Re: [OMPI devel] Announcing Open MPI v5.0.0rc2
On 01.01.2022 20:07, Barrett, Brian wrote:
> Marco -
>
> There are some patches that haven't made it to the 5.0 branch to make this
> behavior better. I didn't get a chance to back port them before the holiday
> break, but they will be in the next RC. That said, the issue below is a
> warning, not an error, so you should still end up with a build that works
> (with an included PMIx). The issue is that png-config can't be found, so we
> have trouble guessing what libraries are dependencies of PMIx, which is a
> potential problem in complicated builds with static libraries.
>
> Brian

Thanks Brian,

the build error was actually caused by the setting for threads.

Up to v4.1 I was using

--with-threads=posix

which is no longer accepted, but no error is reported, and the result is a
different setting that does not work on CYGWIN.
Removing the option seems to work.

I have, however, found a logic error in prrte that probably calls for a
check of all the HAVE_*_H macros between configuration and code:

/pub/devel/openmpi/v5.0/openmpi-5.0.0-0.1.x86_64/src/openmpi-5.0.0rc2/3rd-party/prrte/src/mca/odls/default/odls_default_module.c:114:14:
fatal error: sys/ptrace.h: No such file or directory
  114 | #include <sys/ptrace.h>
      | ^~

caused by

$ grep -rH HAVE_SYS_PTRACE_H .
./3rd-party/prrte/config.log:| #define HAVE_SYS_PTRACE_H 0
./3rd-party/prrte/config.log:| #define HAVE_SYS_PTRACE_H 0
./3rd-party/prrte/config.log:#define HAVE_SYS_PTRACE_H 0
./3rd-party/prrte/config.status:D["HAVE_SYS_PTRACE_H"]=" 0"
./3rd-party/prrte/src/include/prte_config.h:#define HAVE_SYS_PTRACE_H 0

while the code in
3rd-party/prrte/src/mca/odls/default/odls_default_module.c
has

#ifdef HAVE_SYS_PTRACE_H
#include <sys/ptrace.h>
#endif

so the #ifdef is true even though the macro is defined as 0.

Currently I am stuck at

0rc2/3rd-party/prrte/src/mca/oob/tcp/oob_tcp_connection.c:61:
/pub/devel/openmpi/v5.0/openmpi-5.0.0-0.1.x86_64/src/openmpi-5.0.0rc2/3rd-party/prrte/src/mca/oob/tcp/oob_tcp_connection.c: In function ‘prte_oob_tcp_peer_try_connect’:
/pub/devel/openmpi/v5.0/openmpi-5.0.0-0.1.x86_64/src/openmpi-5.0.0rc2/3rd-party/prrte/src/mca/oob/tcp/oob_tcp_connection.c:163:16:
error: expected identifier or ‘(’ before ‘struct’
  163 | prte_if_t *interface;
      | ^
/pub/devel/openmpi/v5.0/openmpi-5.0.0-0.1.x86_64/src/openmpi-5.0.0rc2/3rd-party/prrte/src/mca/oob/tcp/oob_tcp_connection.c:180:19:
error: expected ‘{’ before ‘=’ token
  180 | interface = PRTE_NEW(prte_if_t);
      | ^

I am not sure whether this is caused by a new GCC 11 requirement or by the
wrong headers being pulled in.

Has anyone built with GCC 11?

Regards
Marco
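The HAVE_SYS_PTRACE_H mismatch reported above can be reproduced in isolation. The following is only a minimal sketch: the macro is defined by hand to mimic what the grep output shows in prte_config.h, and the two helper functions (guarded_by_ifdef, guarded_by_if) are hypothetical names used for illustration, not anything from the prrte sources.

```c
/* Assumption: mimic the generated prte_config.h shown in the grep output. */
#define HAVE_SYS_PTRACE_H 0

/* Hypothetical helper: reports which branch an #ifdef guard selects. */
int guarded_by_ifdef(void)
{
#ifdef HAVE_SYS_PTRACE_H
    return 1; /* taken: the macro is defined, its value 0 is never looked at */
#else
    return 0;
#endif
}

/* Hypothetical helper: reports which branch an #if guard selects. */
int guarded_by_if(void)
{
#if HAVE_SYS_PTRACE_H
    return 1;
#else
    return 0; /* taken: #if evaluates the value, and 0 is false */
#endif
}
```

With the macro defined as 0, guarded_by_ifdef() returns 1 and guarded_by_if() returns 0, which is why an #ifdef guard still reaches the #include on a platform without the header. An #if guard also treats an undefined macro as 0, so it would work whether the configure script defines the macro as 0 or leaves it undefined.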
Re: [OMPI devel] Announcing Open MPI v5.0.0rc2
Hi Marco

Try the patch here (for the prrte 3rd-party subdirectory):

https://github.com/openpmix/prrte/pull/1173

Ralph

> On Jan 9, 2022, at 12:29 AM, Marco Atzeri via devel wrote:
>
> On 01.01.2022 20:07, Barrett, Brian wrote:
>> Marco -
>> There are some patches that haven't made it to the 5.0 branch to make this
>> behavior better. I didn't get a chance to back port them before the holiday
>> break, but they will be in the next RC. That said, the issue below is a
>> warning, not an error, so you should still end up with a build that works
>> (with an included PMIx). The issue is that png-config can't be found, so we
>> have trouble guessing what libraries are dependencies of PMIx, which is a
>> potential problem in complicated builds with static libraries.
>> Brian
>
> Thanks Brian,
>
> the build error was in reality in setting for threads.
>
> I was using up to v4.1
>
> --with-threads=posix
>
> that currently is not accepted anymore but no error is reported,
> causing a different setting that does not work in CYGWIN.
> Removing the configuration seems to work
>
>
> I have however found a logic error in prrte that
> probably need a verification of all the HAVE_*_H
> between configuration and code
>
>
> /pub/devel/openmpi/v5.0/openmpi-5.0.0-0.1.x86_64/src/openmpi-5.0.0rc2/3rd-party/prrte/src/mca/odls/default/odls_default_module.c:114:14:
> fatal error: sys/ptrace.h: No such file or directory
> 114 | #include <sys/ptrace.h>
> | ^~
>
> caused by
>
> $ grep -rH HAVE_SYS_PTRACE_H .
> ./3rd-party/prrte/config.log:| #define HAVE_SYS_PTRACE_H 0
> ./3rd-party/prrte/config.log:| #define HAVE_SYS_PTRACE_H 0
> ./3rd-party/prrte/config.log:#define HAVE_SYS_PTRACE_H 0
> ./3rd-party/prrte/config.status:D["HAVE_SYS_PTRACE_H"]=" 0"
> ./3rd-party/prrte/src/include/prte_config.h:#define HAVE_SYS_PTRACE_H 0
>
> while the code in
> 3rd-party/prrte/src/mca/odls/default/odls_default_module.c
> has
>
> #ifdef HAVE_SYS_PTRACE_H
> #include <sys/ptrace.h>
> #endif
>
>
> currently I am stacked at
>
> 0rc2/3rd-party/prrte/src/mca/oob/tcp/oob_tcp_connection.c:61:
> /pub/devel/openmpi/v5.0/openmpi-5.0.0-0.1.x86_64/src/openmpi-5.0.0rc2/3rd-party/prrte/src/mca/oob/tcp/oob_tcp_connection.c: In function
> ‘prte_oob_tcp_peer_try_connect’:
> /pub/devel/openmpi/v5.0/openmpi-5.0.0-0.1.x86_64/src/openmpi-5.0.0rc2/3rd-party/prrte/src/mca/oob/tcp/oob_tcp_connection.c:163:16:
> error: expected identifier or ‘(’ before ‘struct’
> 163 | prte_if_t *interface;
> | ^
> /pub/devel/openmpi/v5.0/openmpi-5.0.0-0.1.x86_64/src/openmpi-5.0.0rc2/3rd-party/prrte/src/mca/oob/tcp/oob_tcp_connection.c:180:19:
> error: expected ‘{’ before ‘=’ token
> 180 | interface = PRTE_NEW(prte_if_t);
> | ^
>
>
> not sure if it is caused by new GCC 11 requirement or from wrong headers
> being pulled in.
>
> Has anyone built with GCC 11 ?
>
> Regards
> Marco
Re: [OMPI devel] Announcing Open MPI v5.0.0rc2
On 10.01.2022 06:50, Marco Atzeri wrote:
> On 09.01.2022 15:54, Ralph Castain via devel wrote:
>> Hi Marco
>>
>> Try the patch here (for the prrte 3rd-party subdirectory):
>>
>> https://github.com/openpmix/prrte/pull/1173
>>
>> Ralph
>
> Thanks Ralph,
>
> I will do on the next build as I need still to test the current build.

The tests are not satisfactory.

I have only one test failure,

FAIL: dlopen_test.exe

which I suspect is due to a wrong name in the test, but a simple run fails:

$ mpirun -n 4 ./hello_c.exe
[116] /pub/devel/openmpi/v5.0/openmpi-5.0.0-0.1.x86_64/src/openmpi-5.0.0rc2/3rd-party/openpmix/src/mca/ptl/base/ptl_base_listener.c:498 bind() failed for socket 13 storage size 16: Cannot assign requested address
Hello, world, I am 0 of 1, (Open MPI v5.0.0rc2, package: Open MPI Marco@LAPTOP-82F08ILC Distribution, ident: 5.0.0rc2, repo rev: v5.0.0rc2, Oct 18, 2021, 125)
--
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

PML add procs failed
--> Returned "Not found" (-13) instead of "Success" (0)
--
[LAPTOP-82F08ILC:0] *** An error occurred in MPI_Init
[LAPTOP-82F08ILC:0] *** reported by process [36547002369,1]
[LAPTOP-82F08ILC:0] *** on a NULL communicator
[LAPTOP-82F08ILC:0] *** Unknown error
[LAPTOP-82F08ILC:0] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[LAPTOP-82F08ILC:0] *** and MPI will try to terminate your MPI job as well)
--

Any suggestions on what to look for?

Regards
Marco
Re: [OMPI devel] Announcing Open MPI v5.0.0rc2
On 09.01.2022 15:54, Ralph Castain via devel wrote:
> Hi Marco
>
> Try the patch here (for the prrte 3rd-party subdirectory):
>
> https://github.com/openpmix/prrte/pull/1173
>
> Ralph

Thanks Ralph,

I will do so on the next build, as I still need to test the current build.

To complete the build I also needed the attached patches:

5.0.0rc2-amend-DEF.patch
to correct a missing def in OPAL

CYGWIN-undefined.patch
to pass "-no-undefined" to pmix and romio to allow shared lib

CYGWIN-interface-workaround.patch
temporary workaround to avoid the "interface" collision with Windows headers.
I will look into providing a better solution for this point.

Regards
Marco

--- origsrc/openmpi-5.0.0rc2/opal/util/minmax.h 2021-10-18 17:27:42.0 +0200
+++ src/openmpi-5.0.0rc2/opal/util/minmax.h 2022-01-09 12:46:07.148969800 +0100
@@ -40,7 +40,7 @@ OPAL_DEFINE_MINMAX(float, float)
 OPAL_DEFINE_MINMAX(double, double)
 OPAL_DEFINE_MINMAX(void *, ptr)
-#if OPAL_C_HAVE__GENERIC
+#ifdef OPAL_C_HAVE__GENERIC
 #define opal_min(a, b) \
 (_Generic((a) + (b),\
 int8_t: opal_min_8,\
--- origsrc/openmpi-5.0.0rc2/3rd-party/prrte/src/mca/oob/tcp/oob_tcp_connection.c 2021-10-18 17:28:05.0 +0200
+++ src/openmpi-5.0.0rc2/3rd-party/prrte/src/mca/oob/tcp/oob_tcp_connection.c 2022-01-09 11:13:29.687212300 +0100
@@ -160,6 +160,9 @@ void prte_oob_tcp_peer_try_connect(int f
     prte_oob_tcp_peer_t *peer;
     prte_oob_tcp_addr_t *addr;
     bool connected = false;
+#if defined interface
+# undef interface
+#endif
     prte_if_t *interface;
     char *host;
--- origsrc/openmpi-5.0.0rc2/opal/mca/btl/tcp/btl_tcp_proc.c 2021-10-18 17:27:42.0 +0200
+++ src/openmpi-5.0.0rc2/opal/mca/btl/tcp/btl_tcp_proc.c 2022-01-09 12:48:27.075225600 +0100
@@ -160,6 +160,9 @@ static int mca_btl_tcp_proc_create_inter
        fields needed in the proc version */
     for (i = 0; i < btl_proc->proc_addr_count; i++) {
         /* Construct opal_if_t objects for the remote interfaces */
+#ifdef interface
+# undef interface
+#endif
         opal_if_t *interface = OBJ_NEW(opal_if_t);
         if (NULL == interface) {
             rc = OPAL_ERR_OUT_OF_RESOURCE;
--- origsrc/openmpi-5.0.0rc2/3rd-party/openpmix/configure.ac 2021-10-18 17:28:09.0 +0200
+++ src/openmpi-5.0.0rc2/3rd-party/openpmix/configure.ac 2022-01-09 04:51:14.271907200 +0100
@@ -337,6 +337,23 @@ AC_ARG_ENABLE(werror,
 ])
+# no-undefined needed on some platform for shared lib
+
+
+AC_MSG_CHECKING([if libtool needs -no-undefined flag to build shared libraries])
+case "`uname`" in
+  CYGWIN*|MINGW*|AIX*)
+    ## Add in the -no-undefined flag to LDFLAGS for libtool.
+    AC_MSG_RESULT([yes])
+    LDFLAGS="$LDFLAGS -no-undefined"
+    ;;
+  *)
+    ## Don't add in anything.
+    AC_MSG_RESULT([no])
+    ;;
+esac
+
+
 # Version information
--- origsrc/openmpi-5.0.0rc2/3rd-party/romio341/configure.ac 2021-10-18 17:27:42.0 +0200
+++ src/openmpi-5.0.0rc2/3rd-party/romio341/configure.ac 2022-01-09 04:58:32.150499100 +0100
@@ -1784,6 +1784,19 @@ AM_PROG_LIBTOOL
 # support gcov test coverage information
 PAC_ENABLE_COVERAGE
+AC_MSG_CHECKING([if libtool needs -no-undefined flag to build shared libraries])
+case "`uname`" in
+  CYGWIN*|MINGW*|AIX*)
+    ## Add in the -no-undefined flag to LDFLAGS for libtool.
+    AC_MSG_RESULT([yes])
+    LDFLAGS="$LDFLAGS -no-undefined"
+    ;;
+  *)
+    ## Don't add in anything.
+    AC_MSG_RESULT([no])
+    ;;
+esac
+
 AC_MSG_NOTICE([setting CC to $CC])
 AC_MSG_NOTICE([setting F77 to $F77])
 AC_MSG_NOTICE([setting TEST_CC to $TEST_CC])
--- origsrc/openmpi-5.0.0rc2/3rd-party/romio341/mpl/configure.ac 2021-10-18 17:27:42.0 +0200
+++ src/openmpi-5.0.0rc2/3rd-party/romio341/mpl/configure.ac 2022-01-09 05:01:27.462723200 +0100
@@ -1077,6 +1077,19 @@ CFLAGS=""
 AX_GCC_FUNC_ATTRIBUTE(fallthrough)
 PAC_POP_ALL_FLAGS
+AC_MSG_CHECKING([if libtool needs -no-undefined flag to build shared libraries])
+case "`uname`" in
+  CYGWIN*|MINGW*|AIX*)
+    ## Add in the -no-undefined flag to LDFLAGS for libtool.
+    AC_MSG_RESULT([yes])
+    LDFLAGS="$LDFLAGS -no-undefined"
+    ;;
+  *)
+    ## Don't add in anything.
+    AC_MSG_RESULT([no])
+    ;;
+esac
+
 dnl Final output
 AC_CONFIG_FILES([Makefile localdefs include/mpl_timer.h])
 AC_OUTPUT
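
The "interface" collision handled by CYGWIN-interface-workaround.patch can be shown in a few lines. This is only a sketch under stated assumptions: the #define below is written by hand to stand in for the macro that the Windows headers appear to provide (the compiler message "expected identifier or ‘(’ before ‘struct’" suggests it expands to struct), and demo_if_t and demo_if_index are hypothetical names, not Open MPI or prrte code.

```c
#include <stddef.h>

/* Assumption: stand-in for the macro that arrives via the Windows headers. */
#define interface struct

/* Hypothetical stand-in for a type such as prte_if_t or opal_if_t. */
typedef struct {
    int if_index;
} demo_if_t;

/* Same guard as in the workaround patch. Without it, the parameter below
 * would expand to "demo_if_t *struct" and fail to compile. */
#ifdef interface
#undef interface
#endif

/* Hypothetical helper that uses "interface" as an ordinary identifier. */
int demo_if_index(demo_if_t *interface)
{
    return (NULL == interface) ? -1 : interface->if_index;
}
```

The #undef only affects the rest of the translation unit, so any later code that relies on the macro would no longer see it. Renaming the variable avoids that side effect and may be the kind of better solution mentioned above.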