Re: [UPDATE] devel/openmpi 4.0.1 -> 4.0.2

2019-12-19 Thread Raymond, David
Hello Martin,

It is odd that I already had "PMIX_MCA_gds=hash" and still got the
problem on a beowolf with multiple boxes connected by ethernet.

Since the lack of appropriate "#ifdefs" seems like an oversight on the
part of the openmpi developers, I think it would be appropriate to
push it upstream.  Are you prepared to do that?  I could try, but it
would take me a while to educate myself on this.

Best,

Dave

On 12/19/19, Martin Reindl  wrote:
> [moved to ports@]
>
> On Tue, Dec 17, 2019 at 04:16:25PM -0700, Raymond, David wrote:
>> Martin,
>>
>> I have been using openmpi 4.0.2 on my computer system and I found a
>> bug that is provoked by running a job (a Go program interfaced to the
>> Clang MPI package) on multiple machines connected by ethernet.  This
>> crashes the program with the following output:
> [...]
>>
>> I traced this to the fact that OpenBSD's version of pthreads doesn't
>> have "pthread_mutexattr_setpshared".  It turns out that the
>> configuration file undefines a flag if this is so, but the actual code
>> doesn't pay any attention to this.  I fixed the problem by putting
>> appropriate ifdefs around the code generating the error, which itself
>> is simple error checking code.  This seems to work.  I have attached
>> two patches for the 4.0.2 source.
>
> Hello Dave,
>
> Thanks for your input, I've updated the 4.0.2 diff.
>
> We already were aware of the problem with 4.0.1 back in June and worked
> around the problem by setting PMIX_MCA_gds=hash before execution to avoid
> GDS/ds21 and GDS/12.
>
> Your diff is of course a much better way, do you want to try to push it
> upstream?
>
> -m
>
> Index: Makefile
> ===
> RCS file: /cvs/ports/devel/openmpi/Makefile,v
> retrieving revision 1.28
> diff -u -p -u -p -r1.28 Makefile
> --- Makefile  28 Jun 2019 11:05:11 -  1.28
> +++ Makefile  19 Dec 2019 07:18:30 -
> @@ -2,9 +2,8 @@
>
>  COMMENT =open source MPI-3.1 implementation
>
> -V =  4.0.1
> +V =  4.0.2
>  DISTNAME =   openmpi-$V
> -REVISION =   0
>
>  SHARED_LIBS +=  mca_common_dstore 0.0 # 1.0
>  SHARED_LIBS +=  mca_common_monitoring 0.0 # 60.0
> Index: distinfo
> ===
> RCS file: /cvs/ports/devel/openmpi/distinfo,v
> retrieving revision 1.4
> diff -u -p -u -p -r1.4 distinfo
> --- distinfo  27 Jun 2019 13:52:00 -  1.4
> +++ distinfo  19 Dec 2019 07:18:30 -
> @@ -1,2 +1,2 @@
> -SHA256 (openmpi-4.0.1.tar.gz) =
> 5V4hP+CaIUq58scirP2L97ObvBgA5LekZNON8V5wf1k=
> -SIZE (openmpi-4.0.1.tar.gz) = 17513706
> +SHA256 (openmpi-4.0.2.tar.gz) =
> ZigFhw6GoUceWXObDDTG+QBODHoi2waFYtU4jsRCGQQ=
> +SIZE (openmpi-4.0.2.tar.gz) = 17373487
> Index:
> patches/patch-opal_mca_pmix_pmix3x_pmix_src_mca_gds_ds12_gds_ds12_lock_pthread_c
> ===
> RCS file:
> patches/patch-opal_mca_pmix_pmix3x_pmix_src_mca_gds_ds12_gds_ds12_lock_pthread_c
> diff -N
> patches/patch-opal_mca_pmix_pmix3x_pmix_src_mca_gds_ds12_gds_ds12_lock_pthread_c
> --- /dev/null 1 Jan 1970 00:00:00 -
> +++
> patches/patch-opal_mca_pmix_pmix3x_pmix_src_mca_gds_ds12_gds_ds12_lock_pthread_c
>   19
> Dec 2019 07:18:30 -
> @@ -0,0 +1,20 @@
> +$OpenBSD$
> +
> +Index: opal/mca/pmix/pmix3x/pmix/src/mca/gds/ds12/gds_ds12_lock_pthread.c
> +---
> opal/mca/pmix/pmix3x/pmix/src/mca/gds/ds12/gds_ds12_lock_pthread.c.orig
>  opal/mca/pmix/pmix3x/pmix/src/mca/gds/ds12/gds_ds12_lock_pthread.c
> +@@ -132,12 +132,14 @@ pmix_status_t
> pmix_gds_ds12_lock_init(pmix_common_dsto
> + PMIX_ERROR_LOG(rc);
> + goto error;
> + }
> ++#ifdef HAVE_PTHREAD_SHARED
> + if (0 != pthread_rwlockattr_setpshared(,
> PTHREAD_PROCESS_SHARED)) {
> + pthread_rwlockattr_destroy();
> + rc = PMIX_ERR_INIT;
> + PMIX_ERROR_LOG(rc);
> + goto error;
> + }
> ++#endif
> + #ifdef HAVE_PTHREAD_SETKIND
> + if (0 != pthread_rwlockattr_setkind_np(,
> +
> PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP)) {
> Index:
> patches/patch-opal_mca_pmix_pmix3x_pmix_src_mca_gds_ds21_gds_ds21_lock_pthread_c
> ===
> RCS file:
> patches/patch-opal_mca_pmix_pmix3x_pmix_src_mca_gds_ds21_gds_ds21_lock_pthread_c
> diff -N
> patches/patch-opal_mca_pmix_pmix3x_pmix_src_mca_gds_ds21_gds_ds21_lock_pthread_c
> --- /dev/null 1 Jan 1970 00:00:00 -
> +++
> patches/patch-opal_mca_pmix_pmix3x_pmix_src_mca_gds_ds21_gds_ds21_lock_pthread_c
>   19
> Dec 2019 07:18:30 -
> @@ -0,0 +1,21 @@
> +$OpenBSD$
> +
> +Index: opal/mca/pmix/pmix3x/pmix/src/mca/gds/ds21/gds_ds21_lock_pthread.c
> +---
> opal/mca/pmix/pmix3x/pmix/src/mca/gds/ds21/gds_ds21_lock_pthread.c.orig
>  opal/mca/pmix/pmix3x/pmix/src/mca/gds/ds21/gds_ds21_lock_pthread.c
> +@@ -182,12 +182,15 @@ pmix_status_t
> 

Re: [UPDATE] devel/openmpi 4.0.1 -> 4.0.2

2019-12-18 Thread Martin Reindl
[moved to ports@]

On Tue, Dec 17, 2019 at 04:16:25PM -0700, Raymond, David wrote:
> Martin,
> 
> I have been using openmpi 4.0.2 on my computer system and I found a
> bug that is provoked by running a job (a Go program interfaced to the
> Clang MPI package) on multiple machines connected by ethernet.  This
> crashes the program with the following output:
[...]
> 
> I traced this to the fact that OpenBSD's version of pthreads doesn't
> have "pthread_mutexattr_setpshared".  It turns out that the
> configuration file undefines a flag if this is so, but the actual code
> doesn't pay any attention to this.  I fixed the problem by putting
> appropriate ifdefs around the code generating the error, which itself
> is simple error checking code.  This seems to work.  I have attached
> two patches for the 4.0.2 source.

Hello Dave,

Thanks for your input, I've updated the 4.0.2 diff.

We already were aware of the problem with 4.0.1 back in June and worked
around the problem by setting PMIX_MCA_gds=hash before execution to avoid
GDS/ds21 and GDS/12.

Your diff is of course a much better way, do you want to try to push it
upstream?

-m

Index: Makefile
===
RCS file: /cvs/ports/devel/openmpi/Makefile,v
retrieving revision 1.28
diff -u -p -u -p -r1.28 Makefile
--- Makefile28 Jun 2019 11:05:11 -  1.28
+++ Makefile19 Dec 2019 07:18:30 -
@@ -2,9 +2,8 @@
 
 COMMENT =  open source MPI-3.1 implementation
 
-V =4.0.1
+V =4.0.2
 DISTNAME = openmpi-$V
-REVISION = 0
 
 SHARED_LIBS +=  mca_common_dstore 0.0 # 1.0
 SHARED_LIBS +=  mca_common_monitoring 0.0 # 60.0
Index: distinfo
===
RCS file: /cvs/ports/devel/openmpi/distinfo,v
retrieving revision 1.4
diff -u -p -u -p -r1.4 distinfo
--- distinfo27 Jun 2019 13:52:00 -  1.4
+++ distinfo19 Dec 2019 07:18:30 -
@@ -1,2 +1,2 @@
-SHA256 (openmpi-4.0.1.tar.gz) = 5V4hP+CaIUq58scirP2L97ObvBgA5LekZNON8V5wf1k=
-SIZE (openmpi-4.0.1.tar.gz) = 17513706
+SHA256 (openmpi-4.0.2.tar.gz) = ZigFhw6GoUceWXObDDTG+QBODHoi2waFYtU4jsRCGQQ=
+SIZE (openmpi-4.0.2.tar.gz) = 17373487
Index: 
patches/patch-opal_mca_pmix_pmix3x_pmix_src_mca_gds_ds12_gds_ds12_lock_pthread_c
===
RCS file: 
patches/patch-opal_mca_pmix_pmix3x_pmix_src_mca_gds_ds12_gds_ds12_lock_pthread_c
diff -N 
patches/patch-opal_mca_pmix_pmix3x_pmix_src_mca_gds_ds12_gds_ds12_lock_pthread_c
--- /dev/null   1 Jan 1970 00:00:00 -
+++ 
patches/patch-opal_mca_pmix_pmix3x_pmix_src_mca_gds_ds12_gds_ds12_lock_pthread_c
19 Dec 2019 07:18:30 -
@@ -0,0 +1,20 @@
+$OpenBSD$
+
+Index: opal/mca/pmix/pmix3x/pmix/src/mca/gds/ds12/gds_ds12_lock_pthread.c
+--- opal/mca/pmix/pmix3x/pmix/src/mca/gds/ds12/gds_ds12_lock_pthread.c.orig
 opal/mca/pmix/pmix3x/pmix/src/mca/gds/ds12/gds_ds12_lock_pthread.c
+@@ -132,12 +132,14 @@ pmix_status_t pmix_gds_ds12_lock_init(pmix_common_dsto
+ PMIX_ERROR_LOG(rc);
+ goto error;
+ }
++#ifdef HAVE_PTHREAD_SHARED
+ if (0 != pthread_rwlockattr_setpshared(, 
PTHREAD_PROCESS_SHARED)) {
+ pthread_rwlockattr_destroy();
+ rc = PMIX_ERR_INIT;
+ PMIX_ERROR_LOG(rc);
+ goto error;
+ }
++#endif
+ #ifdef HAVE_PTHREAD_SETKIND
+ if (0 != pthread_rwlockattr_setkind_np(,
+ 
PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP)) {
Index: 
patches/patch-opal_mca_pmix_pmix3x_pmix_src_mca_gds_ds21_gds_ds21_lock_pthread_c
===
RCS file: 
patches/patch-opal_mca_pmix_pmix3x_pmix_src_mca_gds_ds21_gds_ds21_lock_pthread_c
diff -N 
patches/patch-opal_mca_pmix_pmix3x_pmix_src_mca_gds_ds21_gds_ds21_lock_pthread_c
--- /dev/null   1 Jan 1970 00:00:00 -
+++ 
patches/patch-opal_mca_pmix_pmix3x_pmix_src_mca_gds_ds21_gds_ds21_lock_pthread_c
19 Dec 2019 07:18:30 -
@@ -0,0 +1,21 @@
+$OpenBSD$
+
+Index: opal/mca/pmix/pmix3x/pmix/src/mca/gds/ds21/gds_ds21_lock_pthread.c
+--- opal/mca/pmix/pmix3x/pmix/src/mca/gds/ds21/gds_ds21_lock_pthread.c.orig
 opal/mca/pmix/pmix3x/pmix/src/mca/gds/ds21/gds_ds21_lock_pthread.c
+@@ -182,12 +182,15 @@ pmix_status_t pmix_gds_ds21_lock_init(pmix_common_dsto
+ PMIX_ERROR_LOG(rc);
+ goto error;
+ }
++
++#ifdef HAVE_PTHREAD_MUTEXATTR_SETPSHARED
+ if (0 != pthread_mutexattr_setpshared(, PTHREAD_PROCESS_SHARED)) 
{
+ pthread_mutexattr_destroy();
+ rc = PMIX_ERR_INIT;
+ PMIX_ERROR_LOG(rc);
+ goto error;
+ }
++#endif
+ 
+ segment_hdr_t *seg_hdr = 
(segment_hdr_t*)lock_item->seg_desc->seg_info.seg_base_addr;
+ seg_hdr->num_locks = local_size;
Index: pkg/PLIST

Re: [UPDATE] devel/openmpi 4.0.1 -> 4.0.2

2019-11-19 Thread Martin Reindl
Am 31.10.19 um 15:59 schrieb Martin Reindl:
> Hello,
> 
> The attached diff updates devel/openmpi to the current stable release 4.0.2.
> Changelog can be found here:
> https://raw.githubusercontent.com/open-mpi/ompi/v4.0.x/NEWS
> Tested on amd64 and arm64.
> 
> Most noteably this update fixes problems reported by David Raymond off-list.

Ping.



[UPDATE] devel/openmpi 4.0.1 -> 4.0.2

2019-10-31 Thread Martin Reindl
Hello,

The attached diff updates devel/openmpi to the current stable release 4.0.2.
Changelog can be found here:
https://raw.githubusercontent.com/open-mpi/ompi/v4.0.x/NEWS
Tested on amd64 and arm64.

Most noteably this update fixes problems reported by David Raymond off-list.

-m


Index: Makefile
===
RCS file: /cvs/ports/devel/openmpi/Makefile,v
retrieving revision 1.28
diff -u -p -u -p -r1.28 Makefile
--- Makefile28 Jun 2019 11:05:11 -  1.28
+++ Makefile30 Oct 2019 21:44:23 -
@@ -2,9 +2,9 @@
 
 COMMENT =  open source MPI-3.1 implementation
 
-V =4.0.1
+V =4.0.2
 DISTNAME = openmpi-$V
-REVISION = 0
+#REVISION =0
 
 SHARED_LIBS +=  mca_common_dstore 0.0 # 1.0
 SHARED_LIBS +=  mca_common_monitoring 0.0 # 60.0
Index: distinfo
===
RCS file: /cvs/ports/devel/openmpi/distinfo,v
retrieving revision 1.4
diff -u -p -u -p -r1.4 distinfo
--- distinfo27 Jun 2019 13:52:00 -  1.4
+++ distinfo30 Oct 2019 21:44:23 -
@@ -1,2 +1,2 @@
-SHA256 (openmpi-4.0.1.tar.gz) = 5V4hP+CaIUq58scirP2L97ObvBgA5LekZNON8V5wf1k=
-SIZE (openmpi-4.0.1.tar.gz) = 17513706
+SHA256 (openmpi-4.0.2.tar.gz) = ZigFhw6GoUceWXObDDTG+QBODHoi2waFYtU4jsRCGQQ=
+SIZE (openmpi-4.0.2.tar.gz) = 17373487
Index: pkg/PLIST
===
RCS file: /cvs/ports/devel/openmpi/pkg/PLIST,v
retrieving revision 1.5
diff -u -p -u -p -r1.5 PLIST
--- pkg/PLIST   27 Jun 2019 13:52:00 -  1.5
+++ pkg/PLIST   30 Oct 2019 21:44:23 -
@@ -143,15 +143,6 @@ lib/openmpi/mca_compress_gzip.so
 lib/openmpi/mca_crs_none.a
 lib/openmpi/mca_crs_none.la
 lib/openmpi/mca_crs_none.so
-lib/openmpi/mca_dfs_app.a
-lib/openmpi/mca_dfs_app.la
-lib/openmpi/mca_dfs_app.so
-lib/openmpi/mca_dfs_orted.a
-lib/openmpi/mca_dfs_orted.la
-lib/openmpi/mca_dfs_orted.so
-lib/openmpi/mca_dfs_test.a
-lib/openmpi/mca_dfs_test.la
-lib/openmpi/mca_dfs_test.so
 lib/openmpi/mca_errmgr_default_app.a
 lib/openmpi/mca_errmgr_default_app.la
 lib/openmpi/mca_errmgr_default_app.so
@@ -221,9 +212,6 @@ lib/openmpi/mca_iof_tool.so
 lib/openmpi/mca_mpool_hugepage.a
 lib/openmpi/mca_mpool_hugepage.la
 lib/openmpi/mca_mpool_hugepage.so
-lib/openmpi/mca_notifier_syslog.a
-lib/openmpi/mca_notifier_syslog.la
-lib/openmpi/mca_notifier_syslog.so
 lib/openmpi/mca_odls_default.a
 lib/openmpi/mca_odls_default.la
 lib/openmpi/mca_odls_default.so
@@ -288,6 +276,9 @@ lib/openmpi/mca_reachable_weighted.so
 lib/openmpi/mca_regx_fwd.a
 lib/openmpi/mca_regx_fwd.la
 lib/openmpi/mca_regx_fwd.so
+lib/openmpi/mca_regx_naive.a
+lib/openmpi/mca_regx_naive.la
+lib/openmpi/mca_regx_naive.so
 lib/openmpi/mca_regx_reverse.a
 lib/openmpi/mca_regx_reverse.la
 lib/openmpi/mca_regx_reverse.so
@@ -315,9 +306,6 @@ lib/openmpi/mca_rml_oob.so
 lib/openmpi/mca_routed_binomial.a
 lib/openmpi/mca_routed_binomial.la
 lib/openmpi/mca_routed_binomial.so
-lib/openmpi/mca_routed_debruijn.a
-lib/openmpi/mca_routed_debruijn.la
-lib/openmpi/mca_routed_debruijn.so
 lib/openmpi/mca_routed_direct.a
 lib/openmpi/mca_routed_direct.la
 lib/openmpi/mca_routed_direct.so