Re: [OMPI devel] Open MPI 3.1.0rc4 posted

2018-04-17 Thread r...@open-mpi.org
I’ll let you decide about 3.1.0. FWIW: I think Gilles’ fix should work for external PMIx v1.2.5 as well. > On Apr 17, 2018, at 7:56 AM, Barrett, Brian via devel > wrote: > > Do we honestly care for 3.1.0? I mean, we went 6 months without it working > and no one cared. We can’t fix all bugs,

Re: [OMPI devel] [OMPI users] 3.x - hang in MPI_Comm_disconnect

2018-05-21 Thread r...@open-mpi.org
Comm_connect and Comm_disconnect are both broken in OMPI v2.0 and above, including OMPI master - the precise reasons differ across the various releases. From what I can tell, the problem is in the OMPI side (as opposed to PMIx). I’ll try to file a few issues (since the problem is different in th

Re: [OMPI devel] About supporting HWLOC 2.0.x

2018-05-22 Thread r...@open-mpi.org
I’ve been running with hwloc 2.0.1 for quite some time now without problem, including use of the shared memory segment. It would be interesting to hear what changes you had to make. However, that said, there is a significant issue in ORTE when trying to map-by NUMA as hwloc 2.0.1 no longer asso

Re: [OMPI devel] About supporting HWLOC 2.0.x

2018-05-22 Thread r...@open-mpi.org
Arg - just remembered. I should have noted in my comment that I started with that PR and did make a few further adjustments, though not much. > On May 22, 2018, at 8:49 AM, Jeff Squyres (jsquyres) > wrote: > > Geoffroy -- check out https://github.com/open-mpi/ompi/pull/4677. > > If all those

Re: [OMPI devel] [OMPI users] 3.x - hang in MPI_Comm_disconnect

2018-05-22 Thread r...@open-mpi.org
_disconnect when closing the cluster. I think the idea is that > they can then create and destroy clusters several times within the same R > script. But of course, that won’t work here when you can’t disconnect > processes. > > Cheers, > Ben > > > >> On 22 May

Re: [OMPI devel] Running on Kubernetes

2018-05-28 Thread r...@open-mpi.org
n this version we've managed to not use ssh, relying on `kubectl exec` > instead. It's still pretty "ghetto", but at least we've managed to train some > tensorflow models with it. :) Please take a look and let me know what you > think. > > Thanks,

[OMPI devel] Some disturbing warnings on master today

2018-05-30 Thread r...@open-mpi.org
In file included from /usr/include/stdio.h:411:0, from ../../opal/util/malloc.h:24, from ../../opal/include/opal_config_bottom.h:331, from ../../opal/include/opal_config.h:2919, from ../../opal/util/argv.h:33, from

[OMPI devel] Master warnings?

2018-06-01 Thread r...@open-mpi.org
Geez guys - what happened? In file included from monitoring_prof.c:47:0: ../../../../ompi/include/mpi.h:423:9: warning: ‘__error__’ attribute ignored [-Wattributes] __mpi_interface_removed__("MPI_Comm_errhandler_fn was removed in MPI-3.0; use MPI_Comm_errhandler_function instead");

Re: [OMPI devel] Master warnings?

2018-06-02 Thread r...@open-mpi.org
No problem - I just commented because earlier in the week it had built clean, so I was surprised to get the flood. This was with gcc 6.3.0, so not that old > On Jun 2, 2018, at 7:19 AM, Nathan Hjelm wrote: > > Should have it fixed today or tomorrow. Guess I didn't have a sufficiently > old g

[OMPI devel] Master broken

2018-06-03 Thread r...@open-mpi.org
On my system, which has libfabric installed (but maybe an older version than expected?): btl_ofi_component.c: In function ‘mca_btl_ofi_component_progress’: btl_ofi_component.c:557:63: error: ‘FI_EINTR’ undeclared (first use in this function) } else if (OPAL_UNLIKELY(ret != -FI_EAGAIN &&

Re: [OMPI devel] Master broken

2018-06-03 Thread r...@open-mpi.org
IN && ret != -FI_EINTR)) { ^ What the heck version was this tested against??? > On Jun 3, 2018, at 7:32 AM, r...@open-mpi.org wrote: > > On my system, which has libfabric installed (but maybe an older version than >

[OMPI devel] Remove prun tool from OMPI?

2018-06-05 Thread r...@open-mpi.org
Hey folks Does anyone have heartburn if I remove the “prun” tool from ORTE? I don’t believe anyone is using it, and it doesn’t look like it even works. I ask because the name conflicts with PRRTE and can cause problems when running OMPI against PRRTE Ralph

Re: [OMPI devel] Remove prun tool from OMPI?

2018-06-05 Thread r...@open-mpi.org
naught...@ornl.gov > Research Associate (865) 576-4184 > > > On Tue, 5 Jun 2018, r...@open-mpi.org wrote: > >> Hey folks >> >> Does anyone have heartburn if I remove the “prun” tool from ORTE? I don’t >> believe any

Re: [OMPI devel] Remove prun tool from OMPI?

2018-06-05 Thread r...@open-mpi.org
> > Thanks, > --tjn > > _ > Thomas Naughton naught...@ornl.gov > Research Associate (865) 576-4184 > > > On Tue, 5 Jun 2018, r

Re: [OMPI devel] Remove prun tool from OMPI?

2018-06-06 Thread r...@open-mpi.org
I have renamed prun for now - will do the update in a bit > On Jun 5, 2018, at 12:20 PM, Thomas Naughton wrote: > > > On Tue, 5 Jun 2018, r...@open-mpi.org <mailto:r...@open-mpi.org> wrote: > >> >> >>> On Jun 5, 2018, at 11:59 AM, Thomas Naughton

[OMPI devel] PRRTE+OMPI status

2018-06-07 Thread r...@open-mpi.org
Hi folks I now have it so that you can run MTT using OMPI against PRRTE. Current results look promising: +-+-+-+--+--+--+--+--+--+ | Phase | Section

[OMPI devel] ARM failure on PR to master

2018-06-08 Thread r...@open-mpi.org
Can someone who knows/cares about ARM perhaps take a look at PR https://github.com/open-mpi/ompi/pull/5247 ? I’m hitting an error in the ARM CI tests that I can’t understand: --> Running example: hello_c ---

Re: [OMPI devel] ARM failure on PR to master

2018-06-10 Thread r...@open-mpi.org
Now moved to https://github.com/open-mpi/ompi/pull/5258 <https://github.com/open-mpi/ompi/pull/5258> - same error > On Jun 8, 2018, at 9:04 PM, r...@open-mpi.org wrote: > > Can someone who knows/cares about ARM perhaps take a look at PR > https://github.com/open-mpi/ompi/p

[OMPI devel] New binding option

2018-06-21 Thread r...@open-mpi.org
Hello all I have added a new binding option to OMPI master: Alternatively, processes can be assigned to processors based on their local rank on a node using the \fI--bind-to cpuset:ordered\fP option with an associated \fI--cpu-list "0,2,5"\fP. This directs that the first rank on a node be bound t
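As a rough illustration of the ordered binding described above, here is a minimal Python sketch. It is not Open MPI code: the function names are hypothetical, and it assumes (the snippet is cut off before saying so explicitly) that the i-th local rank on a node is bound to the i-th entry of the `--cpu-list` value.

```python
from typing import Dict, List


def parse_cpu_list(cpu_list: str) -> List[int]:
    """Parse a comma-separated CPU list such as "0,2,5" into integers."""
    return [int(tok) for tok in cpu_list.split(",") if tok.strip()]


def ordered_binding(cpu_list: str, num_local_ranks: int) -> Dict[int, int]:
    """Map each local rank to one CPU, in the order the CPUs were listed.

    Assumption (not confirmed by the truncated message): local rank i
    is bound to the i-th CPU in the list.
    """
    cpus = parse_cpu_list(cpu_list)
    if num_local_ranks > len(cpus):
        # This sketch simply refuses to bind more ranks than listed CPUs;
        # what the real option does in that case is not stated in the source.
        raise ValueError("more local ranks than CPUs in the list")
    return {rank: cpus[rank] for rank in range(num_local_ranks)}


if __name__ == "__main__":
    for rank, cpu in ordered_binding("0,2,5", 3).items():
        print(f"local rank {rank} -> cpu {cpu}")
```

The point of the sketch is only the ordering: with the list "0,2,5", the mapping follows the order given on the command line rather than any topology-derived order.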

Re: [OMPI devel] New binding option

2018-06-21 Thread r...@open-mpi.org
> On Jun 21, 2018, at 6:47 AM, Jeff Squyres (jsquyres) via devel > wrote: > > On Jun 21, 2018, at 9:41 AM, r...@open-mpi.org wrote: >> >> Alternatively, processes can be assigned to processors based on >> their local rank on a node using the \fI--bind-to cpuse

Re: [OMPI devel] New binding option

2018-06-21 Thread r...@open-mpi.org
> On Jun 21, 2018, at 7:37 AM, Jeff Squyres (jsquyres) via devel > wrote: > > On Jun 21, 2018, at 10:26 AM, r...@open-mpi.org wrote: >> >>>> Alternatively, processes can be assigned to processors based on >>>> their local rank on a node using

Re: [OMPI devel] Open MPI: Undefined reference to pthread_atfork

2018-06-22 Thread r...@open-mpi.org
OMPI 2.1.3??? Is there any way you could update to something more recent? > On Jun 22, 2018, at 12:28 PM, lille stor wrote: > > Hi, > > > When compiling a C++ source file named test.cpp that needs a shared library > named libUtils.so (which in its turn needs Open MPI shared library, hence th

[OMPI devel] Fwd: [pmix] Release candidates available for testing

2018-07-01 Thread r...@open-mpi.org
FYI - v3.0.0 will go into master for the OMPI v4 branch. v2.1.2 should go into updates for OMPI v3.1 and v3.0 branches Ralph > Begin forwarded message: > > From: "r...@open-mpi.org" > Subject: [pmix] Release candidates available for testing > Date: June 29, 2018 at

[OMPI devel] Odd warning in OMPI v3.0.x

2018-07-06 Thread r...@open-mpi.org
I’m seeing this when building the v3.0.x branch: runtime/ompi_mpi_init.c:395:49: warning: passing argument 2 of ‘opal_atomic_cmpset_32’ makes integer from pointer without a cast [-Wint-conversion] if (!opal_atomic_cmpset_32(&ompi_mpi_state, &expected, desired)) {

Re: [OMPI devel] Odd warning in OMPI v3.0.x

2018-07-06 Thread r...@open-mpi.org
ok, i’ll fix it > On Jul 6, 2018, at 3:09 PM, Nathan Hjelm via devel > wrote: > > Looks like a bug to me. The second argument should be a value in v3.x.x. > > -Nathan > >> On Jul 6, 2018, at 4:00 PM, r...@open-mpi.org wrote: >> >> I’m seei

Re: [OMPI devel] sm BTL performace of the openmpi-2.0.0

2016-08-05 Thread r...@open-mpi.org
Perhaps those flags need to be the default? > On Aug 5, 2016, at 7:14 AM, tmish...@jcity.maeda.co.jp wrote: > > Hi Christoph, > > I applied the commits - pull/#1250 as Nathan told me and added "-mca > btl_openib_flags 311" to the mpirun command line option, then it worked for > me. I don't know

[OMPI devel] PMIx Language Bindings

2016-08-07 Thread r...@open-mpi.org
Hi folks I’m looking for someone(s) interested in writing some simple language bindings (e.g., Python, Java, Fortran) for the PMIx library (which is written in C). There aren’t a lot of APIs, so I don’t envision this as being a monstrous effort. Please let me know if you have any interest - an

Re: [OMPI devel] PMIx Language Bindings

2016-08-08 Thread r...@open-mpi.org
> > > > On 8/7/16, 3:21 PM, "devel on behalf of r...@open-mpi.org" > wrote: > >> Hi folks >> >> I’m looking for someone(s) interested in writing some simple language >> bindings (e.g., Python, Java, Fortran) for the PMIx library (which i
