Re: [OMPI devel] simple_spawn test fails using different set of btls.

2014-11-05 Thread Ralph Castain
> On Nov 5, 2014, at 6:11 PM, Gilles Gouaillardet wrote: > > Elena, > > the first case (-mca btl tcp,self) crashing is a bug and i will have a look at it. > > the second case (-mca sm,self) is a feature : the sm btl cannot be used with tasks having different jobids (this is the case

Re: [OMPI devel] RFC: revamp btl rdma interface

2014-11-05 Thread Paul Hargrove
All atomics must be done through not just "the same btl" but the same btl MODULE, since atomics from two IB HCAs, for instance, are not necessarily coherent. So, how is the "best" one to be selected? -Paul [Sent from my phone] On Nov 5, 2014 7:15 AM, "Nathan Hjelm" wrote: > > In the new osc com

Re: [OMPI devel] simple_spawn test fails using different set of btls.

2014-11-05 Thread Gilles Gouaillardet
Elena, the first case (-mca btl tcp,self) crashing is a bug and i will have a look at it. the second case (-mca sm,self) is a feature : the sm btl cannot be used with tasks having different jobids (this is the case after a spawn), and obviously, self cannot be used also, so the behaviour and erro

Re: [OMPI devel] mpirun does not honor rankfile

2014-11-05 Thread Ralph Castain
I suspect the issue may be with physical vs logical numbering. As I said, we use logical numbering in the rankfile, not physical. So I’m not entirely sure how to translate the cpumask in your final table into the numbering shown in your rankfile listings. Is the cpumask showing a physical core n

Re: [OMPI devel] mpirun does not honor rankfile

2014-11-05 Thread Tom Wurgler
Well, further investigation found this: If I edit the rank file and change it like this:

before:
rank 0=mach1 slot=0
rank 1=mach1 slot=4
rank 2=mach1 slot=8
rank 3=mach1 slot=12
rank 4=mach1 slot=16
rank 5=mach1 slot=20
rank 6=mach1 slot=24
rank 7=mach1 slot=28
rank 8=mach1 slot=32
rank 9=mach

Re: [OMPI devel] mpirun does not honor rankfile

2014-11-05 Thread Ralph Castain
Hmmm…well, it seems to be working fine in 1.8.4rc1 (I only have 12 cores on my humble machine). However, I can’t test any interactions with LSF, though that shouldn’t be an issue:

$ mpirun -host bend001 -rf ./rankfile --report-bindings --display-devel-map hostname
Data for JOB [60677,1] offset

Re: [OMPI devel] Open MPI Developers F2F Q1 2015 (poll closes on Friday, 7th of November)

2014-11-05 Thread Edgar Gabriel
To throw in my $0.02: I am probably not able to attend the entire meeting. However, Dallas would be within driving distance, so I would try to attend parts of the meeting as well. Thanks Edgar On 11/5/2014 1:10 PM, Howard Pritchard wrote: Hi Folks, I think Dallas (either Love or DFW) is cheaper to fly

Re: [OMPI devel] Open MPI Developers F2F Q1 2015 (poll closes on Friday, 7th of November)

2014-11-05 Thread Howard Pritchard
Hi Folks, I think Dallas (either Love or DFW) is cheaper to fly in to than Atlanta. Howard 2014-11-05 11:46 GMT-07:00 Jeff Squyres (jsquyres) : > Isn't Dallas 1 flight away from Knoxville? Dallas is a bit more central > (i.e., shorter flights for those coming from the west) > > > > On Nov 5,

Re: [OMPI devel] thread-tests hang

2014-11-05 Thread Paul Hargrove
Jeff wrote: MPI_THREAD_MULTIPLE support barely works in v1.8. Why have it on by default, especially when there's a performance penalty? I think the "barely works" state of threading support is a stronger argument for return to the 1.6.x behavior than PSM performance. Who knows what subtle bugs h

Re: [OMPI devel] Open MPI Developers F2F Q1 2015 (poll closes on Friday, 7th of November)

2014-11-05 Thread Jeff Squyres (jsquyres)
Isn't Dallas 1 flight away from Knoxville? Dallas is a bit more central (i.e., shorter flights for those coming from the west) On Nov 5, 2014, at 1:35 PM, George Bosilca wrote: > Even to US attendees Atlanta might seem more appealing, as it is one hop away > from most locations and it has r

Re: [OMPI devel] Request for a Open MPI SotU BoF slot for VampirTrace

2014-11-05 Thread Jeff Squyres (jsquyres)
Bert - That would be great. Could you send me 1-3 slides about this? On Nov 4, 2014, at 4:10 AM, Bert Wesarg wrote: > All, > > the TU Dresden would like to talk a little bit about the current state of > VampirTrace in Open MPI, its successor Score-P [1] and the future of the > collaboratio

Re: [OMPI devel] Open MPI Developers F2F Q1 2015 (poll closes on Friday, 7th of November)

2014-11-05 Thread George Bosilca
Even to US attendees Atlanta might seem more appealing, as it is one hop away from most locations and it has reasonable weather forecast for January/February (not as good as Dallas I concede). George. On Wed, Nov 5, 2014 at 1:18 PM, Jeff Squyres (jsquyres) wrote: > SHORT VERSION > ==

Re: [OMPI devel] Open MPI Developers F2F Q1 2015 (poll closes on Friday, 7th of November)

2014-11-05 Thread Jeff Squyres (jsquyres)
SHORT VERSION
=============

Will anyone be attending from Europe? This may influence the location of the meeting.

MORE DETAIL
===========

We're tentatively thinking that Dallas, TX would be a good location for the meeting (at the Cisco facility). The rationale was as follows:

1. Chicago is n

Re: [OMPI devel] thread-tests hang

2014-11-05 Thread Jeff Squyres (jsquyres)
On Nov 5, 2014, at 12:03 PM, Joshua Ladd wrote: > I think this is a pretty significant change in behavior for a minor release, > Jeff. According to the interested parties: > > "I'm reporting a performance (message rate 16%, latency 3%) regression when > using PSM that occurred between OMPI v1.

Re: [OMPI devel] thread-tests hang

2014-11-05 Thread Ralph Castain
I don’t think anyone is proposing a major change in behavior. We are proposing to fix a bug that crept into the 1.8 series without prior detection - i.e., that mpi-thread-multiple was enabled by default, which is definitely not the intention. I just looked at the configure code, and it does beha

Re: [OMPI devel] thread-tests hang

2014-11-05 Thread Joshua Ladd
I think this is a pretty significant change in behavior for a minor release, Jeff. According to the interested parties: "I'm reporting a performance (message rate 16%, latency 3%) regression when using PSM that occurred between OMPI v1.6.5 and v1.8.1. I would guess it affects other networks too,

Re: [OMPI devel] thread-tests hang

2014-11-05 Thread Ralph Castain
I think the issue is that the revert may have resulted in having to set both --enable-mpi-thread-multiple and --enable-opal-multi-threads, and Mike is asking that the first automatically turn on the second. > On Nov 5, 2014, at 8:51 AM, Jeff Squyres (jsquyres) wrote: > > On Nov 5, 2014, at 11:
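
The behaviour being requested here, where asking for one option implicitly enables the other, can be modelled in a few lines. This is only a sketch of the idea in Python, not Open MPI's actual configure logic; the function name `resolve_thread_options` and its keyword arguments are hypothetical stand-ins for the two configure flags.

```python
def resolve_thread_options(mpi_thread_multiple=False, opal_multi_threads=False):
    """Hypothetical model of the requested flag dependency.

    Assumption: requesting MPI_THREAD_MULTIPLE support should imply
    thread support inside OPAL, so the user only has to pass one flag.
    """
    if mpi_thread_multiple:
        # The first option turns on the second automatically.
        opal_multi_threads = True
    return {
        "mpi_thread_multiple": mpi_thread_multiple,
        "opal_multi_threads": opal_multi_threads,
    }


# Passing only the first option yields both enabled in this model.
print(resolve_thread_options(mpi_thread_multiple=True))
```

The implication only runs one way in this model: enabling OPAL thread support on its own does not switch on MPI_THREAD_MULTIPLE.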

Re: [OMPI devel] thread-tests hang

2014-11-05 Thread Jeff Squyres (jsquyres)
On Nov 5, 2014, at 11:43 AM, Mike Dubman wrote: > sorry, > >>>"now we use only this "--enable-mpi-thread-multiple" and it worked." > > I meant it worked fine before the "configure logic" changes. It went back to the way it was in the v1.6 series. The issue is that --enable-mpi-thread-multi

Re: [OMPI devel] thread-tests hang

2014-11-05 Thread Mike Dubman
sorry, >>>"now we use only this "--enable-mpi-thread-multiple" and it worked." I meant it worked fine before the "configure logic" changes. On Wed, Nov 5, 2014 at 6:39 PM, Jeff Squyres (jsquyres) wrote: > I thought you said passing only --enable-mpi-thread-multiple made it > work...? > > On Nov

Re: [OMPI devel] thread-tests hang

2014-11-05 Thread Jeff Squyres (jsquyres)
I thought you said passing only --enable-mpi-thread-multiple made it work...? On Nov 5, 2014, at 11:37 AM, Mike Dubman wrote: > the problem is that now the behavior is changed. > Before: user provided single flag and could use MT support. > Now same method will not work starting from v1.8.4 whic

Re: [OMPI devel] thread-tests hang

2014-11-05 Thread Mike Dubman
The problem is that the behavior has now changed. Before: the user provided a single flag and could use MT support. Now the same method will not work starting from v1.8.4, which is a production branch and will live with it for a long time. Is it possible that someone familiar with this configure kung-fu will fi

Re: [OMPI devel] RFC: revamp btl rdma interface

2014-11-05 Thread Nathan Hjelm
In the osc component, no. Though it would be straightforward to add that feature. -Nathan On Wed, Nov 05, 2014 at 11:05:48AM -0500, Joshua Ladd wrote: >Does this mean that you maintain a separate channel for 'put' and 'gets' >that can use multiple transports and another for atomics? > >

[OMPI devel] Open MPI Developers F2F Q1 2015 (poll closes on Friday, 7th of November)

2014-11-05 Thread Howard Pritchard
Hi Folks, We've gotten a number of responses to the doodle poll for a week to hold the next OMPI developers F2F. The responses are definitely favoring a meeting the week of January 26th. The poll will be kept open till COB (PST) Friday, the 7th of November.

[OMPI devel] simple_spawn test fails using different set of btls.

2014-11-05 Thread Elena Elkina
Hi, It looks like there is a problem in trunk which reproduces with the simple_spawn test (orte/test/mpi/simple_spawn.c). It seems to be an issue with pmix. It doesn't reproduce with the default set of btls, but it does reproduce when several btls are specified. For example, salloc -N5 $OMPI_HOME/install/bin/mp

Re: [OMPI devel] RFC: revamp btl rdma interface

2014-11-05 Thread Joshua Ladd
Does this mean that you maintain a separate channel for 'put' and 'gets' that can use multiple transports and another for atomics? Josh On Wed, Nov 5, 2014 at 10:15 AM, Nathan Hjelm wrote: > > In the new osc component I don't try to handle that case. All atomics > have to be done through the sa

Re: [OMPI devel] thread-tests hang

2014-11-05 Thread Jeff Squyres (jsquyres)
On Nov 5, 2014, at 9:42 AM, Mike Dubman wrote: > Hey Jeff, > > now we use only this "--enable-mpi-thread-multiple" and it worked. > does it mean that now we need to pass "--enable-mpi-thread-multiple > --enable-opal-multi-threads" to get it working again? > Maybe if one of the params used it sh

[OMPI devel] mpirun does not honor rankfile

2014-11-05 Thread twurgl
I am using openmpi v 1.8.3 and LSF 9.1.3. LSF creates a rankfile that looks like:

RANK_FILE:
==
rank 0=mach1 slot=0
rank 1=mach1 slot=4
rank 2=mach1 slot=8
rank 3=mach1 slot=12
rank 4=mach1 slot=16
rank 5=mach1 slot=20
rank 6=mac
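
A minimal sketch of reading entries in the `rank N=host slot=S` form shown in the rankfile above, which can help when comparing what LSF wrote against what mpirun reports. The function name `parse_rankfile` and the regular expression are illustrative assumptions, not part of Open MPI or LSF. The slot is kept as a string because nothing in the post says it must always be a plain integer.

```python
import re

# One entry looks like "rank <N>=<host> slot=<S>", as in the post above.
ENTRY = re.compile(r"rank\s+(\d+)=(\S+)\s+slot=(\S+)")


def parse_rankfile(text):
    """Return a list of (rank, host, slot) tuples found in `text`.

    Works whether the entries are on separate lines or run together
    on a single line, since it scans for the pattern anywhere in the text.
    """
    return [(int(rank), host, slot) for rank, host, slot in ENTRY.findall(text)]


sample = "rank 0=mach1 slot=0 rank 1=mach1 slot=4 rank 2=mach1 slot=8"
for rank, host, slot in parse_rankfile(sample):
    print(rank, host, slot)
```

Incomplete entries, such as one cut off before its `slot=` field, are simply skipped by the pattern.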

Re: [OMPI devel] RFC: revamp btl rdma interface

2014-11-05 Thread Nathan Hjelm
In the new osc component I don't try to handle that case. All atomics have to be done through the same btl (including atomics on self). I did this because with the default setup of Gemini they can not be mixed. If it is possible to mix them with other networks I would be happy to add an atomic fla

Re: [OMPI devel] thread-tests hang

2014-11-05 Thread Mike Dubman
Hey Jeff, now we use only this "--enable-mpi-thread-multiple" and it worked. does it mean that now we need to pass "--enable-mpi-thread-multiple --enable-opal-multi-threads" to get it working again? Maybe if one of the params used it should enable another one as well? Thanks On Wed, Nov 5, 2014

Re: [OMPI devel] thread-tests hang

2014-11-05 Thread Jeff Squyres (jsquyres)
$ ./configure --help |& grep thread
    code will ever run in SMP or multi-threaded
  --enable-opal-multi-threads
    Enable thread support inside OPAL (default:
  --enable-mpi-thread-multiple
    Enable MPI_THREAD_MULTIPLE support (

Re: [OMPI devel] RFC: revamp btl rdma interface

2014-11-05 Thread Joshua Ladd
Quick question. Out of curiosity, how do you handle the (common) case of mixing network atomics with CPU atomics? Say for a single target with two initiators, one initiator is on host with the target, so goes through the SM BTL, and the other initiator is off host, so goes through the network BTL.

Re: [OMPI devel] thread-tests hang

2014-11-05 Thread Joshua Ladd
Jeff, What configure voodoo do we need to add to our MTT to get this functional again? Josh On Tue, Nov 4, 2014 at 12:33 PM, Ralph Castain wrote: > That would be correct - we restored some configure flags that are required > to make multi-thread programs work. Jeff can probably provide more in