Re: [OMPI devel] Announcing Open MPI v5.0.0rc2

2022-01-09 Thread Marco Atzeri via devel

On 01.01.2022 20:07, Barrett, Brian wrote:

> Marco -
>
> There are some patches that haven't made it to the 5.0 branch to make this
> behavior better.  I didn't get a chance to backport them before the holiday
> break, but they will be in the next RC.  That said, the issue below is a
> warning, not an error, so you should still end up with a build that works (with
> an included PMIx).  The issue is that pkg-config can't be found, so we have
> trouble guessing what libraries are dependencies of PMIx, which is a potential
> problem in complicated builds with static libraries.
>
> Brian



Thanks Brian,

the build error was in reality caused by the threads setting.

Up to v4.1 I was using

  --with-threads=posix

which is no longer accepted, but no error is reported; it results
in a different setting that does not work on Cygwin.
Removing the option seems to work.


I have, however, found a logic error in prrte that
probably calls for checking every HAVE_*_H macro
for consistency between configuration and code


/pub/devel/openmpi/v5.0/openmpi-5.0.0-0.1.x86_64/src/openmpi-5.0.0rc2/3rd-party/prrte/src/mca/odls/default/odls_default_module.c:114:14: fatal error: sys/ptrace.h: No such file or directory
  114 | #include <sys/ptrace.h>
      |          ^~~~~~~~~~~~~~

caused by

$ grep -rH HAVE_SYS_PTRACE_H .
./3rd-party/prrte/config.log:| #define HAVE_SYS_PTRACE_H 0
./3rd-party/prrte/config.log:| #define HAVE_SYS_PTRACE_H 0
./3rd-party/prrte/config.log:#define HAVE_SYS_PTRACE_H 0
./3rd-party/prrte/config.status:D["HAVE_SYS_PTRACE_H"]=" 0"
./3rd-party/prrte/src/include/prte_config.h:#define HAVE_SYS_PTRACE_H 0

while the code in
3rd-party/prrte/src/mca/odls/default/odls_default_module.c
has

#ifdef HAVE_SYS_PTRACE_H
#include <sys/ptrace.h>
#endif


currently I am stuck at

0rc2/3rd-party/prrte/src/mca/oob/tcp/oob_tcp_connection.c:61:
/pub/devel/openmpi/v5.0/openmpi-5.0.0-0.1.x86_64/src/openmpi-5.0.0rc2/3rd-party/prrte/src/mca/oob/tcp/oob_tcp_connection.c: In function ‘prte_oob_tcp_peer_try_connect’:
/pub/devel/openmpi/v5.0/openmpi-5.0.0-0.1.x86_64/src/openmpi-5.0.0rc2/3rd-party/prrte/src/mca/oob/tcp/oob_tcp_connection.c:163:16: error: expected identifier or ‘(’ before ‘struct’
  163 | prte_if_t *interface;
      |            ^
/pub/devel/openmpi/v5.0/openmpi-5.0.0-0.1.x86_64/src/openmpi-5.0.0rc2/3rd-party/prrte/src/mca/oob/tcp/oob_tcp_connection.c:180:19: error: expected ‘{’ before ‘=’ token
  180 | interface = PRTE_NEW(prte_if_t);
      |           ^


I am not sure whether this is caused by a new GCC 11 requirement or by
the wrong headers being pulled in.

Has anyone built with GCC 11?

Regards
Marco



Re: [OMPI devel] Announcing Open MPI v5.0.0rc2

2022-01-09 Thread Ralph Castain via devel
Hi Marco

Try the patch here (for the prrte 3rd-party subdirectory): 
https://github.com/openpmix/prrte/pull/1173


Ralph





Re: [OMPI devel] Announcing Open MPI v5.0.0rc2

2022-01-09 Thread Marco Atzeri via devel

On 10.01.2022 06:50, Marco Atzeri wrote:
> On 09.01.2022 15:54, Ralph Castain via devel wrote:
>> Hi Marco
>>
>> Try the patch here (for the prrte 3rd-party subdirectory):
>> https://github.com/openpmix/prrte/pull/1173
>>
>> Ralph
>
> Thanks Ralph,
>
> I will do on the next build
> as I need still to test the current build.



The tests are not satisfactory.

Only one test fails

  FAIL: dlopen_test.exe

which I suspect is due to a wrong name in the test,

but a simple run fails:

$ mpirun -n 4 ./hello_c.exe
[116] /pub/devel/openmpi/v5.0/openmpi-5.0.0-0.1.x86_64/src/openmpi-5.0.0rc2/3rd-party/openpmix/src/mca/ptl/base/ptl_base_listener.c:498 bind() failed for socket 13 storage size 16: Cannot assign requested address
Hello, world, I am 0 of 1, (Open MPI v5.0.0rc2, package: Open MPI Marco@LAPTOP-82F08ILC Distribution, ident: 5.0.0rc2, repo rev: v5.0.0rc2, Oct 18, 2021, 125)

--
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Not found" (-13) instead of "Success" (0)
--
[LAPTOP-82F08ILC:0] *** An error occurred in MPI_Init
[LAPTOP-82F08ILC:0] *** reported by process [36547002369,1]
[LAPTOP-82F08ILC:0] *** on a NULL communicator
[LAPTOP-82F08ILC:0] *** Unknown error
[LAPTOP-82F08ILC:0] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[LAPTOP-82F08ILC:0] ***    and MPI will try to terminate your MPI job as well)

--

Any suggestions on what to look for?

Regards
Marco




Re: [OMPI devel] Announcing Open MPI v5.0.0rc2

2022-01-09 Thread Marco Atzeri via devel

On 09.01.2022 15:54, Ralph Castain via devel wrote:

> Hi Marco
>
> Try the patch here (for the prrte 3rd-party subdirectory):
> https://github.com/openpmix/prrte/pull/1173
>
> Ralph



Thanks Ralph,

I will apply it on the next build,
as I still need to test the current build.


To complete the build I also needed the attached patches:


5.0.0rc2-amend-DEF.patch
to correct a missing definition in OPAL

CYGWIN-undefined.patch
to pass "-no-undefined" to pmix and romio so that shared libraries can be built

CYGWIN-interface-workaround.patch
temporary workaround to avoid the "interface" collision with
Windows headers

I will look into providing a better solution for this last point.

Regards
Marco
--- origsrc/openmpi-5.0.0rc2/opal/util/minmax.h 2021-10-18 17:27:42.0 +0200
+++ src/openmpi-5.0.0rc2/opal/util/minmax.h 2022-01-09 12:46:07.148969800 +0100
@@ -40,7 +40,7 @@ OPAL_DEFINE_MINMAX(float, float)
 OPAL_DEFINE_MINMAX(double, double)
 OPAL_DEFINE_MINMAX(void *, ptr)
 
-#if OPAL_C_HAVE__GENERIC
+#ifdef OPAL_C_HAVE__GENERIC
 #define opal_min(a, b) \
 (_Generic((a) + (b),\
  int8_t: opal_min_8,\
--- origsrc/openmpi-5.0.0rc2/3rd-party/prrte/src/mca/oob/tcp/oob_tcp_connection.c 2021-10-18 17:28:05.0 +0200
+++ src/openmpi-5.0.0rc2/3rd-party/prrte/src/mca/oob/tcp/oob_tcp_connection.c 2022-01-09 11:13:29.687212300 +0100
@@ -160,6 +160,9 @@ void prte_oob_tcp_peer_try_connect(int f
 prte_oob_tcp_peer_t *peer;
 prte_oob_tcp_addr_t *addr;
 bool connected = false;
+#if defined interface
+#  undef interface
+#endif
 prte_if_t *interface;
 char *host;
 
--- origsrc/openmpi-5.0.0rc2/opal/mca/btl/tcp/btl_tcp_proc.c 2021-10-18 17:27:42.0 +0200
+++ src/openmpi-5.0.0rc2/opal/mca/btl/tcp/btl_tcp_proc.c 2022-01-09 12:48:27.075225600 +0100
@@ -160,6 +160,9 @@ static int mca_btl_tcp_proc_create_inter
fields needed in the proc version */
 for (i = 0; i < btl_proc->proc_addr_count; i++) {
 /* Construct opal_if_t objects for the remote interfaces */
+#ifdef interface
+#  undef interface
+#endif
 opal_if_t *interface = OBJ_NEW(opal_if_t);
 if (NULL == interface) {
 rc = OPAL_ERR_OUT_OF_RESOURCE;
--- origsrc/openmpi-5.0.0rc2/3rd-party/openpmix/configure.ac 2021-10-18 17:28:09.0 +0200
+++ src/openmpi-5.0.0rc2/3rd-party/openpmix/configure.ac 2022-01-09 04:51:14.271907200 +0100
@@ -337,6 +337,23 @@ AC_ARG_ENABLE(werror,
 ])
 
 
+# no-undefined needed on some platform for shared lib
+
+
+AC_MSG_CHECKING([if libtool needs -no-undefined flag to build shared libraries])
+case "`uname`" in
+  CYGWIN*|MINGW*|AIX*)
+## Add in the -no-undefined flag to LDFLAGS for libtool.
+AC_MSG_RESULT([yes])
+LDFLAGS="$LDFLAGS -no-undefined"
+;;
+  *)
+## Don't add in anything.
+AC_MSG_RESULT([no])
+;;
+esac
+
+
 # Version information
 
 
--- origsrc/openmpi-5.0.0rc2/3rd-party/romio341/configure.ac 2021-10-18 17:27:42.0 +0200
+++ src/openmpi-5.0.0rc2/3rd-party/romio341/configure.ac 2022-01-09 04:58:32.150499100 +0100
@@ -1784,6 +1784,19 @@ AM_PROG_LIBTOOL
 # support gcov test coverage information
 PAC_ENABLE_COVERAGE
 
+AC_MSG_CHECKING([if libtool needs -no-undefined flag to build shared libraries])
+case "`uname`" in
+  CYGWIN*|MINGW*|AIX*)
+## Add in the -no-undefined flag to LDFLAGS for libtool.
+AC_MSG_RESULT([yes])
+LDFLAGS="$LDFLAGS -no-undefined"
+;;
+  *)
+## Don't add in anything.
+AC_MSG_RESULT([no])
+;;
+esac
+
 AC_MSG_NOTICE([setting CC to $CC])
 AC_MSG_NOTICE([setting F77 to $F77])
 AC_MSG_NOTICE([setting TEST_CC to $TEST_CC])
--- origsrc/openmpi-5.0.0rc2/3rd-party/romio341/mpl/configure.ac 2021-10-18 17:27:42.0 +0200
+++ src/openmpi-5.0.0rc2/3rd-party/romio341/mpl/configure.ac 2022-01-09 05:01:27.462723200 +0100
@@ -1077,6 +1077,19 @@ CFLAGS=""
 AX_GCC_FUNC_ATTRIBUTE(fallthrough)
 PAC_POP_ALL_FLAGS
 
+AC_MSG_CHECKING([if libtool needs -no-undefined flag to build shared libraries])
+case "`uname`" in
+  CYGWIN*|MINGW*|AIX*)
+## Add in the -no-undefined flag to LDFLAGS for libtool.
+AC_MSG_RESULT([yes])
+LDFLAGS="$LDFLAGS -no-undefined"
+;;
+  *)
+## Don't add in anything.
+AC_MSG_RESULT([no])
+;;
+esac
+
 dnl Final output
 AC_CONFIG_FILES([Makefile localdefs include/mpl_timer.h])
 AC_OUTPUT