Re: [OMPI devel] OPAL_PMIX_NODEID is not set by orted

2016-08-11 Thread r...@open-mpi.org
I’m working on providing the info, guys - just sitting in a branch right now. Too many meetings...sigh. > On Aug 11, 2016, at 10:09 AM, George Bosilca wrote: > > I just pushed a solution to this problem in 8d0baf140f. If we are unable to > extract the expected

Re: [OMPI devel] OPAL_PMIX_NODEID is not set by orted

2016-08-12 Thread r...@open-mpi.org
Fixed in https://github.com/open-mpi/ompi/pull/1959 > On Aug 11, 2016, at 6:23 PM, Gilles Gouaillardet wrote: > > Thanks George, > > > fwiw, note the current behavior is a bit more "twisted" than that. > > OPAL_MODEX_RECV_VALUE() returns successfully (e.g. err ==

[OMPI devel] OMPI v1.10.6rc1 ready for test

2017-01-30 Thread r...@open-mpi.org
Usual place: https://www.open-mpi.org/software/ompi/v1.10/ Scheduled release: Fri Feb 3rd 1.10.6 -- - Fix bug in timer code that caused problems at optimization settings greater than 2 - OSHMEM: make mmap allocator the default instead of

Re: [OMPI devel] Reminder: assign as well as request review

2017-01-27 Thread r...@open-mpi.org
> > -Paul > > > On Fri, Jan 27, 2017 at 7:46 AM, r...@open-mpi.org <mailto:r...@open-mpi.org> > <r...@open-mpi.org <mailto:r...@open-mpi.org>> wrote: > Hey folks > > Just a reminder. If you request a review from someone, GitHub doesn’t show > that

[OMPI devel] Reminder: assign as well as request review

2017-01-27 Thread r...@open-mpi.org
Hey folks Just a reminder. If you request a review from someone, GitHub doesn’t show that person’s icon when looking at the list of PRs. It only shows their icon and marks the PR with their ID if you actually “assign” it to that person. Thus, just requesting a review without assigning the PR

[OMPI devel] Problem on master

2017-01-27 Thread r...@open-mpi.org
Hello all There is a known issue on master that we are attempting to debug. Sadly, it is one that only shows on multi-node operations, and the signature varies based on your environment. We hope to have this resolved soon (and no, it doesn’t appear to be due to any one specific commit). In

Re: [OMPI devel] define a new ENV variable in etc/openmpi-mca-params.conf

2017-02-24 Thread r...@open-mpi.org
I think Jeff got lost in the weeds here. If you define a new MCA param in the default param file, we will automatically pick it up and it will be in the environment of your application. You don’t need to do anything. However, you checked for the wrong envar. Anything you provide is going to

Re: [OMPI devel] Q: Using a hostfile in managed environment?

2017-02-24 Thread r...@open-mpi.org
> On Feb 24, 2017, at 11:57 AM, Thomas Naughton wrote: > > Hi, > > We're trying to track down some curious behavior and decided to take a step > back and check a base assumption. > > When running within a managed environment (job allocation): > >Q: Should you be able

Re: [OMPI devel] OMPI v1.10.6

2017-01-18 Thread r...@open-mpi.org
Will someone be submitting that PR soon? > On Jan 18, 2017, at 10:09 AM, George Bosilca <bosi...@icl.utk.edu> wrote: > > https://github.com/open-mpi/ompi/issues/2750 > <https://github.com/open-mpi/ompi/issues/2750> > > George. > > > > On Wed, J

Re: [OMPI devel] OMPI v1.10.6

2017-01-18 Thread r...@open-mpi.org
Last call for v1.10.6 changes - we still have a few pending for review, but none marked as critical. If you want them included, please push for a review _now_ Thanks Ralph > On Jan 12, 2017, at 1:54 PM, r...@open-mpi.org wrote: > > Hi folks > > It looks like we may have motiva

Re: [OMPI devel] Coll/sync component missing???

2016-08-19 Thread r...@open-mpi.org
with legacy codes that don’t want to refactor their algorithms. > On Aug 19, 2016, at 8:48 PM, Nathan Hjelm <hje...@me.com> wrote: > >> On Aug 19, 2016, at 4:24 PM, r...@open-mpi.org wrote: >> >> Hi folks >> >> I had a question arise regarding a problem

[OMPI devel] Warnings on master

2016-08-20 Thread r...@open-mpi.org
We seem to have gotten into a state again of generating a ton of warnings on master - can folks take a look at these and clean them up? opal_datatype_pack.c: In function ‘pack_predefined_heterogeneous’: opal_datatype_pack.c:421:24: warning: variable ‘_l_blength’ set but not used

Re: [OMPI devel] Coll/sync component missing???

2016-08-20 Thread r...@open-mpi.org
and scatter*) was only helping to exacerbate > the issue. However, doing a loop around a small MPI_Send will also end on a > memory exhaustion issue, one that would not be easily circumvented by adding > synchronizations deep inside the library. > > George. > > > On Sat, Au

Re: [OMPI devel] Coll/sync component missing???

2016-08-20 Thread r...@open-mpi.org
> so even if Open MPI provides a fix or workaround, the end user will be left > with some important load imbalance, which is far from being optimal from > his/her performance point of view. > > > Cheers, > > Gilles > > On Sunday, August 21, 2016, r...@open-mpi.o

Re: [OMPI devel] Coll/sync component missing???

2016-08-22 Thread r...@open-mpi.org
doing a loop around a small MPI_Send will also end on a >> memory exhaustion issue, one that would not be easily circumvented by adding >> synchronizations deep inside the library. >> >> George. >> >> >> On Sat, Aug 20, 2016 at 12:30 AM, r...@open-mpi.

Re: [OMPI devel] v2.1.0rc1 has been released

2017-02-26 Thread r...@open-mpi.org
> On Feb 26, 2017, at 6:47 AM, Jeff Squyres (jsquyres) > wrote: > > - Should be none. > ^^^ JMS Did we change --host or --hostfile behavior? > Not that I am aware of - we talked about it, but I think we concluded that we couldn’t/shouldn’t as that would constitute a

Re: [OMPI devel] Binding with --oversubscribe in 2.0.0

2016-08-24 Thread r...@open-mpi.org
ge (as "overload-allowed", which also works). > > Cheers, > Ben > > > -Original Message- > From: devel [mailto:devel-boun...@lists.open-mpi.org] On Behalf Of > r...@open-mpi.org > Sent: Thursday, 25 August 2016 2:03 AM > To: OpenMPI Deve

Re: [OMPI devel] Binding with --oversubscribe in 2.0.0

2016-08-24 Thread r...@open-mpi.org
versubscribe in 2.0.0 > > Hi Ralph, > > Thanks for that... that option's not on the man page for mpirun, but I can > see it in the --help message (as "overload-allowed", which also works). > > Cheers, > Ben > > > -Original Message- > From: de

Re: [OMPI devel] Binding with --oversubscribe in 2.0.0

2016-08-24 Thread r...@open-mpi.org
2016 9:36 AM >> To: 'Open MPI Developers' <devel@lists.open-mpi.org> >> Subject: Re: [OMPI devel] Binding with --oversubscribe in 2.0.0 >> >> Hi Ralph, >> >> Thanks for that... that option's not on the man page for mpirun, but I can >> see it in the --h

Re: [MTT devel] reporter error using pyclient

2016-09-02 Thread r...@open-mpi.org
ubmitting the result_stderr as an array still. The server treats both of > those keys the same, so it must be on the python client side. > > I don't think the server requires a MPI Install phase. It's listed as an > optional field for the test_build/test_run phases. If you don't submit an MP

Re: [MTT devel] the boolean keyval error traceback with debug

2016-09-02 Thread r...@open-mpi.org
Should be fixed in https://github.com/open-mpi/mtt/pull/469 > On Aug 31, 2016, at 6:54 PM, r...@open-mpi.org wrote: > > I’ll dig into this tomorrow > >> On Aug 31, 2016, at 10:59 AM, Howard Pritchard <hpprit...@gmail.com >> <mailto:hpprit...@gmail.com>> w

Re: [MTT devel] the boolean keyval error traceback with debug

2016-08-31 Thread r...@open-mpi.org
I’ll dig into this tomorrow > On Aug 31, 2016, at 10:59 AM, Howard Pritchard wrote: > > Hi Folks, > > Here's what I'm seeing with the boolean keyval issue: > > Working repo g...@github.com:open-mpi/ompi-tests > > Working final repo g...@github.com:open-mpi/ompi-tests > >

Re: [OMPI devel] Question about Open MPI bindings

2016-09-02 Thread r...@open-mpi.org
I’ll dig more later, but just checking offhand, I can’t replicate this on my box, so it may be something in hwloc for that box (or maybe you have some MCA params set somewhere?): $ mpirun -n 2 --bind-to core --report-bindings hostname [rhc001:83938] MCW rank 0 bound to socket 0[core 0[hwt

Re: [OMPI devel] Question about Open MPI bindings

2016-09-03 Thread r...@open-mpi.org
Okay, can you add --display-devel-map --mca rmaps_base_verbose 10 to your cmd line? It sounds like there is something about that topo that is bothering the mapper > On Sep 2, 2016, at 9:31 PM, George Bosilca wrote: > > Thanks Gilles, that's a very useful trick. The

Re: [OMPI devel] Question about Open MPI bindings

2016-09-03 Thread r...@open-mpi.org
u:17451 <http://dancer.icl.utk.edu:17451/>] [[41198,0],0] > GOT 1 CPUS > [dancer.icl.utk.edu:17451 <http://dancer.icl.utk.edu:17451/>] [[41198,0],0] > PROC [[41198,1],2] BITMAP 1,9 > [dancer.icl.utk.edu:17451 <http://dancer.icl.utk.edu:17451/>] [[41198,0],0] >

Re: [OMPI devel] OMPI devel] Question about Open MPI bindings

2016-09-03 Thread r...@open-mpi.org
ore test plus the --hetero-nodes option ? > > Bottom line, you might have to set yet an other MCA param equivalent to the > --hetero-nodes option. > > Cheers, > > Gilles > > r...@open-mpi.org wrote: > Interesting - well, it looks like ORTE is working correctly. Th

Re: [OMPI devel] Question about Open MPI bindings

2016-09-05 Thread r...@open-mpi.org
u want the behavior you describe, then you simply tell ORTE to “--map-by core --bind-to core” > On Sep 5, 2016, at 11:05 AM, George Bosilca <bosi...@icl.utk.edu> wrote: > > On Sat, Sep 3, 2016 at 10:34 AM, r...@open-mpi.org <mailto:r...@open-mpi.org> > <r...@open-mpi.org

[OMPI devel] Hanging tests

2016-09-05 Thread r...@open-mpi.org
Hey folks All of the tests that involve either ISsend_ator, SSend_ator, ISsend_rtoa, or SSend_rtoa are hanging on master and v2.x. Does anyone know what these tests do, and why we never seem to pass them? Do we care? Ralph

Re: [OMPI devel] Hanging tests

2016-09-05 Thread r...@open-mpi.org
have a look tomorrow if the hang is still there > > Cheers, > > Gilles > > r...@open-mpi.org wrote: >> Hey folks >> >> All of the tests that involve either ISsend_ator, SSend_ator, ISsend_rtoa, >> or SSend_rtoa are hanging on master and v2.x. Does anyone

Re: [OMPI devel] Hanging tests

2016-09-06 Thread r...@open-mpi.org
MPI_COMM_WORLD, ); > if (1 == rank) { > b = 0x; > MPI_Recv(, 1, MPI_INT, 0, 0, comm, MPI_STATUS_IGNORE); > if (0x != b) MPI_Abort(MPI_COMM_WORLD, 2); > } > MPI_Comm_free(); > > MPI_Finalize(); > > return 0; >

[OMPI devel] PMIx shared memory dstore now off by default

2016-09-01 Thread r...@open-mpi.org
Hi folks In order to let some folks continue working on dynamic operations on the master, I have turned the PMIx shared memory data store support “off” by default for the embedded code. You can turn it “on” using the --enable-pmix3-dstore option. Once the dynamics support is functional, we

Re: [OMPI devel] C89 support

2016-08-30 Thread r...@open-mpi.org
Chris At the risk of being annoying, it would really help me if you could answer my question: is Gilles correct in his feeling that we are looking at a scenario where you can support 90% of C99 (e.g., C99-style comments, named structure fields), and only the things modified in this PR are

Re: [OMPI devel] C89 support

2016-08-30 Thread r...@open-mpi.org
Chris For me, this is the critical point: > On Aug 29, 2016, at 9:50 PM, Gilles Gouaillardet wrote: > > iirc, we use C99 struct initialisers, so strictly speaking, i do not think > Open MPI can be built with a pure C89 compiler when configure'd > > with the --disable-c99

Re: [OMPI devel] OpenMPI 2.x: bug: violent break at beginning with (sequential) runs...

2016-09-14 Thread r...@open-mpi.org
This has nothing to do with PMIx, Josh - the error is coming out of the usock OOB component. > On Sep 14, 2016, at 7:17 AM, Joshua Ladd wrote: > > Eric, > > We are looking into the PMIx code path that sets up the jobid. The session > directories are created based on

Re: [OMPI devel] OpenMPI 2.x: bug: violent break at beginning with (sequential) runs...

2016-09-14 Thread r...@open-mpi.org
lles > > On Wednesday, September 14, 2016, r...@open-mpi.org > <mailto:r...@open-mpi.org> <r...@open-mpi.org <mailto:r...@open-mpi.org>> > wrote: > This has nothing to do with PMIx, Josh - the error is coming out of the usock > OOB component. > > >> O

Re: [OMPI devel] Lots of new features rolled out on github.com today

2016-09-14 Thread r...@open-mpi.org
> On Sep 14, 2016, at 11:37 AM, Jeff Squyres (jsquyres) > wrote: > > - Code reviews got better / more organized > - Some project management tools now available > - We can enforce the use of 2-factor authentication Please don’t do that... > >

Re: [OMPI devel] Lots of new features rolled out on github.com today

2016-09-14 Thread r...@open-mpi.org
, 2016, at 2:40 PM, r...@open-mpi.org wrote: >> >>> - Code reviews got better / more organized >>> - Some project management tools now available >>> - We can enforce the use of 2-factor authentication >> >> Please don’t do that... >

Re: [OMPI devel] Lots of new features rolled out on github.com today

2016-09-14 Thread r...@open-mpi.org
rush at all; in fact, this is probably a decent topic >> for our next face-to-face. >> >> >>> On Sep 14, 2016, at 2:46 PM, r...@open-mpi.org wrote: >>> >>> I’d want to _fully_ understand the implications before forcing >>> something on

Re: [OMPI devel] link issue on master with --disable-shared --enable-static --disable-dlopen

2016-09-13 Thread r...@open-mpi.org
I should think we could pass the disable-pdl-open option downward - can’t see any reason why not. > On Sep 13, 2016, at 7:51 PM, Gilles Gouaillardet wrote: > > Folks, > > > i configure'd Open MPI with > > --disable-shared --enable-static --disable-dlopen > > and i can no

Re: [OMPI devel] OpenMPI 2.x: bug: violent break at beginning with (sequential) runs...

2016-09-15 Thread r...@open-mpi.org
I don’t understand this fascination with PMIx. PMIx didn’t calculate this jobid - OMPI did. Yes, it is in the opal/pmix layer, but it had -nothing- to do with PMIx. So why do you want to continue to blame PMIx for this problem?? > On Sep 15, 2016, at 4:29 AM, Joshua Ladd

Re: [OMPI devel] OMPI devel] OpenMPI 2.x: bug: violent break at beginning with (sequential) runs...

2016-09-15 Thread r...@open-mpi.org
I don’t think a collision was the issue here. We were taking the mpirun-generated jobid and passing it thru the hash, thus creating an incorrect and invalid value. What I’m more surprised by is that it doesn’t -always- fail. Only thing I can figure is that, unlike with PMIx, the usock oob

Re: [OMPI devel] OpenMPI 2.x: bug: violent break at beginning with (sequential) runs...

2016-09-15 Thread r...@open-mpi.org
sn't > clear. > > Josh > > On Thu, Sep 15, 2016 at 10:07 AM, r...@open-mpi.org > <mailto:r...@open-mpi.org> <r...@open-mpi.org <mailto:r...@open-mpi.org>> > wrote: > I don’t understand this fascination with PMIx. PMIx didn’t calculate this > jobid - OMPI

Re: [OMPI devel] toward a unique session directory

2016-09-15 Thread r...@open-mpi.org
> Ralph, >> >> that looks good to me. >> >> can you please remind me how to test if an app was launched by >> mpirun/orted or direct launched by the RM ? >> >> right now, which direct launch method is supported ? >> i am aware of srun (SLURM)

Re: [OMPI devel] OpenMPI 2.x: bug: violent break at beginning with (sequential) runs...

2016-09-14 Thread r...@open-mpi.org
Ah...I take that back. We changed this and now we _do_ indeed go down that code path. Not good. So yes, we need that putenv so it gets the jobid from the HNP that was launched, like it used to do. You want to throw that in? Thanks Ralph > On Sep 14, 2016, at 8:18 PM, r...@open-mpi.org wr

Re: [OMPI devel] toward a unique session directory

2016-09-14 Thread r...@open-mpi.org
away temp dirs. It isn’t the RM-based environment that is of concern - it’s the non-RM one where epilog scripts don’t exist that is the problem. > On Sep 14, 2016, at 6:05 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote: > > Ralph, > > On 9/15/2016 12:11 AM, r...@o

Re: [OMPI devel] OpenMPI 2.x: bug: violent break at beginning with (sequential) runs...

2016-09-14 Thread r...@open-mpi.org
what is causing the trouble. > On Sep 14, 2016, at 8:26 PM, r...@open-mpi.org wrote: > > Ah...I take that back. We changed this and now we _do_ indeed go down that > code path. Not good. > > So yes, we need that putenv so it gets the jobid from the HNP that was > launched, li

Re: [OMPI devel] OpenMPI 2.x: bug: violent break at beginning with (sequential) runs...

2016-09-14 Thread r...@open-mpi.org
Nah, something isn’t right here. The singleton doesn’t go thru that code line, or it isn’t supposed to do so. I think the problem lies in the way the singleton in 2.x is starting up. Let me take a look at how singletons are working over there. > On Sep 14, 2016, at 8:10 PM, Gilles Gouaillardet

Re: [OMPI devel] OMPI devel] RFC: Reenabling the TCP BTL over local interfaces (when specifically requested)

2016-09-21 Thread r...@open-mpi.org
FWIW: you know the location of every proc (to at least the host level) from time of orte_init, which should precede anything in the BTL > On Sep 21, 2016, at 8:31 AM, Gilles Gouaillardet > wrote: > > George, > > Is proc locality already set at that time ? > >

Re: [OMPI devel] toward a unique session directory

2016-09-15 Thread r...@open-mpi.org
> On Sep 15, 2016, at 12:51 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote: > > Ralph, > > > > my reply is in the text > > > On 9/15/2016 11:11 AM, r...@open-mpi.org <mailto:r...@open-mpi.org> wrote: >> If we are going to make a change, the

Re: [OMPI devel] Sample of merging ompi and ompi-release

2016-09-19 Thread r...@open-mpi.org
One question, to be discussed on the webex: now that github has a “reviewed” feature, do we still need/want the “thumbs-up” bot? If we retain it, then how do we deal with the non-sync’d, duplicative mechanisms? > On Sep 19, 2016, at 4:23 PM, George Bosilca wrote: > >

Re: [OMPI devel] Error in hwloc configury

2016-09-22 Thread r...@open-mpi.org
CPPFLAGS="-I$OPAL_TOP_SRCDIR/$file/include $CPPFLAGS" >> >>unset file >> >> ]) >> >> OPAL_VAR_SCOPE_POP >> >> diff --git a/opal/mca/hwloc/hwloc1113/hwloc/config/hwloc.m4 >> b/opal/mca/hwloc/hwloc1113/h

Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. v2.x-dev-2911-gc7bf9a0

2016-09-22 Thread r...@open-mpi.org
Hey Gilles This fix doesn’t look right to me. > +/* we read something - better get more */ > +num_chars_read += rc; > +orted_uri = realloc((void*)orted_uri, buffer_length+chunk); > +memset(&orted_uri[buffer_length], 0, chunk); > +

[OMPI devel] Error in hwloc configury

2016-09-22 Thread r...@open-mpi.org
Hey folks I’m encountering an issue with the way we detect external HWLOC. If I have a directory that includes an hwloc installation in my CPPFLAGS, then we fail to build, even if I don’t specify anything with regard to hwloc on my configure cmd line. The errors I get look like: In file

Re: [OMPI devel] Error in hwloc configury

2016-09-22 Thread r...@open-mpi.org
Ralph, > > Is the root cause we append our stuff to CPPFLAGS, instead of prepend ? > > You can retrieve the compile command line with > make V=1 > > If my guess is correct, does someone know the rationale for append vs prepend > ? > > Cheers, > > Gill

Re: [MTT devel] reporter error using pyclient

2016-08-26 Thread r...@open-mpi.org
3: ... merge_stdout_stderr = E'false' AND result_stderr = ARRAY[E'... ^ HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts. Powered by http://www.cherrypy.org

Re: [MTT devel] reporter error using pyclient

2016-08-26 Thread r...@open-mpi.org
Hmmm...Hey Josh - is it possible that your server is requiring an MPI_Install phase? We don’t have one since we just build and then set the path accordingly - is it complaining about missing data for an install phase? > On Aug 26, 2016, at 7:33 PM, r...@open-mpi.org wrote: > > Okay,

Re: [MTT devel] reporter error using pyclient

2016-08-26 Thread r...@open-mpi.org
BTW: here is my .ini snippet [Reporter:IUdatabase] plugin = IUDatabase realm = OMPI username = intel pwfile = /home/common/mttpwd.txt platform = bend-rsh hostname = rhc00[1-2] url = https://mtt.open-mpi.org/submit/cpy/ email = r...@open-mpi.org So it looks like the CherryPy server is adding

Re: [MTT devel] reporter error using pyclient

2016-08-26 Thread r...@open-mpi.org
FWIW: the extra “/” is inserted in the IUDatabase reporter plugin. Removing it didn’t make any difference. Must be something on the server side, I fear > On Aug 26, 2016, at 12:08 PM, r...@open-mpi.org wrote: > > BTW: here is my .ini snippet > > > [Reporter:IUdatabase] >

Re: [OMPI devel] stdin issue with master

2016-08-22 Thread r...@open-mpi.org
Yeah, I started working on it earlier this evening - will look some more tomorrow > On Aug 22, 2016, at 7:57 PM, Gilles Gouaillardet wrote: > > Folks, > > > i made a trivial test > > > echo hello | mpirun -np 1 cat > > > and with v2.x and v1.10, the output is "hello"

Re: [OMPI devel] stdin issue with master

2016-08-22 Thread r...@open-mpi.org
Fixed in 9210230 > On Aug 22, 2016, at 8:49 PM, r...@open-mpi.org wrote: > > Yeah, I started working on it earlier this evening - will look some more > tomorrow > >> On Aug 22, 2016, at 7:57 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote: >> >>

Re: [OMPI devel] [2.0.1.rc1] runtime failure on MacOS 10.6

2016-08-22 Thread r...@open-mpi.org
or to appearing in POSIX.1 and so might be on most any > Linux system regardless of age. > > -Paul > > On Mon, Aug 22, 2016 at 9:17 PM, r...@open-mpi.org <mailto:r...@open-mpi.org> > <r...@open-mpi.org <mailto:r...@open-mpi.org>> wrote: > Hey Paul > >

Re: [OMPI devel] C89 support

2016-08-29 Thread r...@open-mpi.org
Just so people don’t spend a lot of time on this: as the release manager for the 1.10 series, you are going to have to provide me with a great deal of motivation to accept this proposed change. We ended C89 support way back in the 1.7 series, so reviving it here would really seem odd. I

Re: [OMPI devel] C89 support

2016-08-29 Thread r...@open-mpi.org
Just to clarify: we primarily use c99 features in our plugins as a means of directly specifying which functions are being implemented, and which are not. In c89, this can only be done by maintaining positional alignment - c99 allows us to do this using the function names. Thus, the c99 method
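A minimal sketch of the distinction described in this message, using a hypothetical plugin function table. The struct, field, and function names below are illustrative assumptions, not Open MPI's actual module definitions. In C89 the table can only be filled positionally, so every unimplemented slot has to be spelled out in declaration order; C99 designated initializers name just the functions that are implemented and leave the rest zero-initialized.

```c
#include <stddef.h>

/* Hypothetical plugin interface (illustrative names only). */
typedef struct {
    int (*init)(void);
    int (*send)(int peer);
    int (*recv)(int peer);
    int (*finalize)(void);
} example_module_t;

int example_init(void) { return 0; }
int example_recv(int peer) { return peer; }

/* C89 style: positional. Each slot must appear in declaration order,
 * and unimplemented functions need an explicit NULL placeholder. */
const example_module_t example_module_c89 = {
    example_init,
    NULL,           /* send: not implemented */
    example_recv,
    NULL            /* finalize: not implemented */
};

/* C99 style: designated initializers. Fields are set by name, in any
 * order; fields not mentioned are implicitly initialized to NULL. */
const example_module_t example_module_c99 = {
    .recv = example_recv,
    .init = example_init
};
```

Both tables end up with identical contents. The difference is that the C99 form keeps working if a new function pointer is later added in the middle of the struct, whereas the positional form would silently shift entries into the wrong slots.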

Re: [OMPI devel] [2.0.1.rc1] runtime failure on MacOS 10.6

2016-08-22 Thread r...@open-mpi.org
Hey Paul I just checked on my Mac and had no problem. However, I’m at 10.11, and so I’m wondering if the old 10.6 just doesn’t have strnlen on it? What compiler were you using? > On Aug 22, 2016, at 9:14 PM, r...@open-mpi.org wrote: > > Huh - I’ll take a look. Thanks! > >>

Re: [OMPI devel] [2.0.1.rc1] runtime failure on MacOS 10.6

2016-08-23 Thread r...@open-mpi.org
Actually, I found that we already dealt with this, but the version in the 2.0.1 branch didn’t include the update. I’ll see what else is missing and ask that it be brought across. Thanks Paul Ralph > On Aug 22, 2016, at 9:25 PM, r...@open-mpi.org wrote: > > Hmmm...okay. I guess we’l

Re: [OMPI devel] [2.0.1.rc1] Solaris MPIX failure

2016-08-23 Thread r...@open-mpi.org
n Aug 23, 2016, at 5:55 AM, r...@open-mpi.org wrote: > > Thanks Gilles! > >> On Aug 23, 2016, at 3:42 AM, Gilles Gouaillardet >> <gilles.gouaillar...@gmail.com <mailto:gilles.gouaillar...@gmail.com>> wrote: >> >> Thanks Paul, >> >> at fir

Re: [OMPI devel] [2.0.1.rc1] runtime failure on MacOS 10.6

2016-08-22 Thread r...@open-mpi.org
Huh - I’ll take a look. Thanks! > On Aug 22, 2016, at 9:11 PM, Paul Hargrove wrote: > > On a Mac OSX 10.6 system: > > $ mpirun -mca btl sm,self -np 2 examples/ring_c > dyld: lazy symbol binding failed: Symbol not found: _strnlen > Referenced from: >

Re: [OMPI devel] [2.0.1.rc1] Solaris MPIX failure

2016-08-23 Thread r...@open-mpi.org
Looks like Solaris has a “getpeerucred” - can you take a look at it, Gilles? We’d have to add that to our AC_CHECK_FUNCS and update the native sec component. > On Aug 23, 2016, at 6:32 AM, r...@open-mpi.org wrote: > > I took a quick glance at this one, and the only way I can s

[OMPI devel] OMPI v2.0.1rc1 available for test

2016-08-22 Thread r...@open-mpi.org
Hello folks Dunno where the head honchos are hiding, but per their request: the newest v2.0.1 release candidate has been posted in the usual place: https://www.open-mpi.org/software/ompi/v2.0/ Beat it up, please! Ralph 2.0.1 -- 23 August 2016 --- Bug fixes/minor

Re: [OMPI devel] [2.0.1.rc1] Solaris MPIX failure

2016-08-23 Thread r...@open-mpi.org
Thanks Gilles! > On Aug 23, 2016, at 3:42 AM, Gilles Gouaillardet > wrote: > > Thanks Paul, > > at first glance, something is going wrong in the sec module under solaris. > I will keep digging tomorrow > > Cheers, > > Gilles > > On Tuesday, August 23, 2016,

Re: [OMPI devel] Binding with --oversubscribe in 2.0.0

2016-08-24 Thread r...@open-mpi.org
Well, that’s a new one! I imagine we could modify the logic to allow a combination of oversubscribe and overload flags. Won’t get out until 2.1, though you could pull the patch in advance if it is holding you up. > On Aug 23, 2016, at 11:46 PM, Ben Menadue wrote: > >

Re: [OMPI devel] Binding with --oversubscribe in 2.0.0

2016-08-24 Thread r...@open-mpi.org
who don’t have such kind scenarios, and don’t realize we are otherwise binding by default. So in your case, you’d want something like: mpirun --map-by core:oversubscribe --bind-to core:overload HTH Ralph > On Aug 24, 2016, at 7:33 AM, r...@open-mpi.org wrote: > > Well, that’s a n

Re: [OMPI devel] [2.0.1.rc1] Solaris MPIX failure

2016-08-24 Thread r...@open-mpi.org
> > <https://patch-diff.githubusercontent.com/raw/open-mpi/ompi-release/pull/1336.patch> > (note you need recent autotools in order to use it) > > > Cheers, > > > Gilles > > On 8/23/2016 10:40 PM, r...@open-mpi.org <mailto:r...@open-mpi.org> wrote

Re: [OMPI devel] C89 support

2016-08-29 Thread r...@open-mpi.org
I hadn’t realized we still have a --disable-c99 configure option - that sounds bad as we can’t possibly build that way. We need to internally perform the configure check, but we shouldn’t be exposing a configure option as that just confuses people into thinking it really is an option. > On Aug

Re: [OMPI devel] Open MPI, PMIx and munge

2016-10-03 Thread r...@open-mpi.org
OMPI should not build munge support unless specifically requested to do so > On Oct 2, 2016, at 7:04 PM, Gilles Gouaillardet wrote: > > Folks, > > > Open MPI policy is to build munge support if it is found, whereas PMIx policy > is to build munge support only if it

Re: [OMPI devel] mtl/psm2 and $PSM2_DEVICES

2016-09-29 Thread r...@open-mpi.org
PSM2_DEVICES="self,shm" ). >> This is to avoid “reserving” HW resources in the HFI card that wouldn’t be >> used unless you later on spawn ranks in other nodes. Therefore, to allow >> dynamic process to be spawned on other nodes you need to tell PSM2 to >> inst

Re: [OMPI devel] use of OBJ_NEW and related calls

2016-10-10 Thread r...@open-mpi.org
See opal/class/opal_object.h And your assumption is correct :-) > On Oct 10, 2016, at 1:18 PM, Emani, Murali wrote: > > Hi, > > Could someone help me in understanding where the functions OBJ_NEW/ > OBJ_CONSTRUCT/ OBJ_DESTRUCT are defined in the source code. Are these >

Re: [OMPI devel] PMIx in 2.x

2016-11-08 Thread r...@open-mpi.org
> > PACK-PMIX-VALUE: UNSUPPORTED TYPE 0 > > > PMIX ERROR: ERROR in file src/server/pmix_server.c at line 1881 > >

Re: [OMPI devel] PMIx in 2.x

2016-11-08 Thread r...@open-mpi.org
t be present in master). > > Thanks, > Pieter > From: devel <devel-boun...@lists.open-mpi.org> on behalf of r...@open-mpi.org > <r...@open-mpi.org> > Sent: Tuesday, November 8, 2016 12:17:29 PM > To: OpenMPI Devel > Subject: Re: [OMPI devel] PMIx in 2.x > >

Re: [OMPI devel] PMIx in 2.x

2016-11-07 Thread r...@open-mpi.org
I’m not sure your description of the 1.x behavior is entirely accurate. What actually happened in that scenario is that the various mpirun’s would connect to each other, proxying the various MPI dynamic calls across each other. You had to tell each mpirun how to find the other - this was in the

Re: [OMPI devel] RFC: Rename nightly snapshot tarballs

2016-10-17 Thread r...@open-mpi.org
You make a valid point - I too prefer simplicity to a 50-character-long name. There should be some simple way of making it clear which branch the tarball came from...your suggestions seem reasonable and easy to do. I’m sure we’ll be talking about this on the telecon in the morning. > On Oct

[OMPI devel] Update to Open MPI Administrative Rules

2016-10-25 Thread r...@open-mpi.org
Hello all Some of you may have noticed that we have been receiving pull requests on Github from contributors who have not signed a formal Contributor’s Agreement. This has raised some discussion in the community about how we accept such contributions without conflicting with our bylaws.

Re: [OMPI devel] Update to Open MPI Administrative Rules

2016-10-25 Thread r...@open-mpi.org
M, Jeff Squyres (jsquyres) <jsquy...@cisco.com> > wrote: > > On Oct 25, 2016, at 12:43 PM, r...@open-mpi.org wrote: >> >> signed-off by: > > Just to nit-pick: it's "Signed-off-by" (with a capital S and a -). It's the > output you get when you

Re: [hwloc-devel] [RFC] applying native OS restrictions to XML imported topology

2016-10-21 Thread r...@open-mpi.org
Hmmm...I think maybe we are only seeing a small portion of the picture here. There are two pieces of the problem when looking at large SMPs: * time required for discovery - your proposal is attempting to address that, assuming that the RM daemon collects the topology and then communicates it to

Re: [hwloc-devel] [RFC] applying native OS restrictions to XML imported topology

2016-10-21 Thread r...@open-mpi.org
whatever info hwloc would like passed into its calls - doesn’t have to be something “understandable” by the proc itself. > On Oct 21, 2016, at 8:15 AM, r...@open-mpi.org wrote: > > Hmmm...I think maybe we are only seeing a small portion of the picture here. > There are two pieces of

Re: [hwloc-devel] [RFC] applying native OS restrictions to XML imported topology

2016-10-21 Thread r...@open-mpi.org
> On Oct 21, 2016, at 10:09 AM, Brice Goglin <brice.gog...@inria.fr> wrote: > > Le 21/10/2016 17:21, r...@open-mpi.org <mailto:r...@open-mpi.org> a écrit : >> I should add: this does beg the question of how a proc “discovers” its >> resource constraints withou

[OMPI devel] Supercomputing 2016: Birds-of-a-Feather meetings

2016-10-24 Thread r...@open-mpi.org
Hello all This year, we will again be hosting Birds-of-a-Feather meetings for Open MPI and PMIx. Open MPI: Wed, Nov 16th, 5:15-7pm http://sc16.supercomputing.org/presentation/?id=bof103=sess322 PMIx: Wed, Nov 16th,

Re: [OMPI devel] master nightly tarballs stopped on 11/21

2016-11-23 Thread r...@open-mpi.org
I’ll turn my crontab back on for the holiday, in case Brian isn’t available - worst case, the tarball gets pushed upstream twice. > On Nov 23, 2016, at 7:59 AM, Pritchard Jr., Howard wrote: > > Hi Brian, > > Could you check what’s going on with the nightly tarball builds? >

Re: [OMPI devel] master nightly tarballs stopped on 11/21

2016-11-23 Thread r...@open-mpi.org
:/. > > On Nov 23, 2016, at 08:28, "r...@open-mpi.org <mailto:r...@open-mpi.org>" > <r...@open-mpi.org <mailto:r...@open-mpi.org>> wrote: > >> I’ll turn my crontab back on for the holiday, in case Brian isn’t available >> - worst case, the t

Re: [OMPI devel] [OMPI users] funny SIGSEGV in 'ompi_info'

2016-11-22 Thread r...@open-mpi.org
The “correct” answer is, of course, to propagate the error upwards so that the highest level caller (e.g., MPI_Init or ompi_info) can return it to the user, who can then decide what to do. Disregarding the parameter is not an option as it violates our “do what the user said to do, else return
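The approach described in this message, passing an error up through each layer until the top-level caller can report it, could be sketched as follows. This is only an illustration under stated assumptions: the function names, the error codes, and the 0..100 range check are all hypothetical and are not taken from the Open MPI code base.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical status codes (not Open MPI's). */
#define EXAMPLE_SUCCESS        0
#define EXAMPLE_ERR_BAD_PARAM (-1)

/* Lowest layer: validate a user-supplied parameter string.
 * Accepts only a decimal integer in the range 0..100. */
int example_parse_level(const char *text, int *out)
{
    char *end = NULL;
    long value = strtol(text, &end, 10);

    if (end == text || *end != '\0' || value < 0 || value > 100) {
        return EXAMPLE_ERR_BAD_PARAM;   /* do not silently ignore it */
    }
    *out = (int)value;
    return EXAMPLE_SUCCESS;
}

/* Middle layer: no recovery attempted here, the status is handed up. */
int example_component_open(const char *param, int *level)
{
    int rc = example_parse_level(param, level);
    if (EXAMPLE_SUCCESS != rc) {
        return rc;
    }
    return EXAMPLE_SUCCESS;
}

/* Top layer: the point where the user is told what went wrong and
 * the status is returned so the caller can decide what to do. */
int example_framework_init(const char *param)
{
    int level = 0;
    int rc = example_component_open(param, &level);
    if (EXAMPLE_SUCCESS != rc) {
        fprintf(stderr, "invalid parameter value \"%s\"\n", param);
        return rc;
    }
    return EXAMPLE_SUCCESS;
}
```

With this shape, a bad value is never replaced by a default behind the user's back: `example_framework_init("bogus")` returns the error code and prints a message, while a valid value such as `"10"` returns success.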

Re: [OMPI devel] Developing MPI program without mpirun

2016-11-18 Thread r...@open-mpi.org
The 2.0.1 NEWS states that the MPI dynamics operations (comm_spawn, connect, and accept) do not work on that release. They are being fixed for the 2.0.2 release. > On Nov 18, 2016, at 7:48 AM, Rui Liu wrote: > > Hi Howard, > > 1. I am using a cluster which involves 20

Re: [OMPI devel] direct launch problem with master

2016-10-31 Thread r...@open-mpi.org
I should hope bisecting would be a last resort. The simplest interim solution is to set OMPI_MCA_routed=direct in your environment. I’ll take a look at a more permanent solution in the morning. > On Oct 30, 2016, at 6:33 PM, Pritchard Jr., Howard wrote: > > Hi Folks, > >
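The workaround above sets an MCA parameter through the environment, where the variable name is the parameter name with an OMPI_MCA_ prefix (OMPI_MCA_routed for the "routed" parameter). A minimal sketch of how a program could check for such a setting is shown here; the helper name and buffer handling are hypothetical, and this is not how Open MPI itself reads its parameters.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: build "OMPI_MCA_<param>" in namebuf and look it
 * up in the environment. Returns the value, or NULL if the variable is
 * unset or the name does not fit in the caller's buffer. */
const char *example_mca_env_lookup(const char *param,
                                   char *namebuf, size_t len)
{
    int written = snprintf(namebuf, len, "OMPI_MCA_%s", param);

    if (written < 0 || (size_t)written >= len) {
        return NULL;    /* name was truncated; refuse to look it up */
    }
    return getenv(namebuf);
}
```

For example, after `export OMPI_MCA_routed=direct` in the shell, calling `example_mca_env_lookup("routed", buf, sizeof buf)` with a 64-byte buffer would return the string `direct`.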

Re: [OMPI devel] direct launch problem with master

2016-10-31 Thread r...@open-mpi.org
Fixed in PR https://github.com/open-mpi/ompi/pull/2322 <https://github.com/open-mpi/ompi/pull/2322> > On Oct 31, 2016, at 1:20 AM, r...@open-mpi.org wrote: > > I should hope bisecting would be a last resort. The simplest interim solution > is to set OMPI_MCA_routed=direct i

Re: [OMPI devel] New Open MPI Community Bylaws to discuss

2016-10-12 Thread r...@open-mpi.org
The OMPI community members have had their respective legal offices review the changes, but we decided to provide notice and get input from others prior to the formal vote of acceptance. Once approved, there will no longer be a CLA at all. The only requirement for contribution will be the

Re: [OMPI devel] New Open MPI Community Bylaws to discuss

2016-10-12 Thread r...@open-mpi.org
mis <pasharesea...@gmail.com> wrote: > > Regardless, I would have to notify legal teams about amendment of the > existing CLA. If organizations that already signed the agreement don't have > any say, then this conversation is pointless. > > -Pasha > > On Wed, Oct 12, 2

Re: [OMPI devel] Errors with CXX=pgc++ (but CXX=pgCC OK)

2016-12-17 Thread r...@open-mpi.org
Added to 1.10 README - thanks! > On Dec 16, 2016, at 4:18 PM, Paul Hargrove wrote: > > With the 1.10.5rc1 tarball on linux/x86-64 and various versions of the PGI > compilers I have configured with > --prefix=[...] --enable-debug CC=pgcc CXX=pgc++ FC=pgfortran > > I see the

Re: [OMPI devel] [OMPI users] still segmentation fault with openmpi-2.0.2rc3 on Linux

2017-01-11 Thread r...@open-mpi.org
Looking at this note again: how many procs is spawn_master generating? > On Jan 11, 2017, at 7:39 PM, r...@open-mpi.org wrote: > > Sigh - yet another corner case. Lovely. Will take a poke at it later this > week. Thx for tracking it down > >> On Jan 11, 2017, at 5:27 PM

Re: [OMPI devel] Fwd: Re: [OMPI users] still segmentation fault with openmpi-2.0.2rc3 on Linux

2017-01-11 Thread r...@open-mpi.org
ots than available, to specify fewer > slots than available, or to specify more slots than needed for > the processes? > > > Kind regards > > Siegmar > > Am 11.01.2017 um 10:04 schrieb Gilles Gouaillardet: > Siegmar, > > I was able to reproduce the issue on my vm

Re: [OMPI devel] OMPI devel] hwloc missing NUMANode object

2017-01-11 Thread r...@open-mpi.org
Should be fixed here: https://github.com/open-mpi/ompi/pull/2711 <https://github.com/open-mpi/ompi/pull/2711> > On Jan 5, 2017, at 6:42 AM, r...@open-mpi.org wrote: > > I can add a check to see if we have NUMA, and if not we can fall back to > socket (if present) or just “no

Re: [OMPI devel] [OMPI users] still segmentation fault with openmpi-2.0.2rc3 on Linux

2017-01-12 Thread r...@open-mpi.org
"--slot-list 0:0-5,1:0-5". Does incorrect mean that it isn't > allowed to specify more slots than available, to specify fewer > slots than available, or to specify more slots than needed for > the processes? > > > Kind regards > > Siegmar > > Am 11.01.2017 um 10:
