Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-21 Thread Jeff Squyres
On Aug 17, 2009, at 7:59 PM, Chris Samuel wrote: Ah, I think I've misunderstood the website then. :-( It calls 1.3 stable and 1.2 old and I presumed old meant deprecated. :-( To clarify... 1.3 *is* stable, meaning "ok for production use." We test all 1.3 releases before they go out, it

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-21 Thread Chris Samuel
- "Eugene Loh" wrote: > Actually, the current proposed defaults for 1.3.4 are > not to change the defaults at all. Thanks, I hadn't picked up on the latest update to the trac ticket 3 days ago that says that the defaults will stay the same. Sounds good to me! All the best and have a good w

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-21 Thread Eugene Loh
Chris Samuel wrote: - "Chris Samuel" wrote: $ mpiexec --mca opal_paffinity_alone 1 -bysocket -bind-to-socket -mca odls_base_report_bindings 99 -mca odls_base_verbose 7 ./cpi-1.4 To clarify - does that command line accurately reflect the proposed defaults for OMPI 1.3

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-21 Thread Chris Samuel
- "Chris Samuel" wrote: > $ mpiexec --mca opal_paffinity_alone 1 -bysocket -bind-to-socket -mca > odls_base_report_bindings 99 -mca odls_base_verbose 7 ./cpi-1.4 To clarify - does that command line accurately reflect the proposed defaults for OMPI 1.3.4 ? cheers, Chris -- Christopher Samu
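A rough sketch of what the -bysocket and -bind-to-socket options in the command above ask for, assuming the usual reading that ranks are dealt round-robin across sockets and each rank is then confined to the socket it landed on. The function name and the 4-rank, 2-socket layout are hypothetical; this is a simplified model, not Open MPI's mapping code.

```python
from typing import Dict, List


def map_by_socket(num_ranks: int, num_sockets: int) -> Dict[int, List[int]]:
    """Deal ranks round-robin across sockets (simplified model, hypothetical helper)."""
    placement: Dict[int, List[int]] = {s: [] for s in range(num_sockets)}
    for rank in range(num_ranks):
        # Rank 0 goes to socket 0, rank 1 to socket 1, then wrap around.
        placement[rank % num_sockets].append(rank)
    return placement


# Assumed example node: 4 ranks on 2 sockets.
print(map_by_socket(4, 2))
```

With binding to socket rather than to core, each rank may still migrate between the cores of its own socket.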

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-18 Thread Chris Samuel
- "Chris Samuel" wrote: > This is most likely because it's getting an error from the > kernel when trying to bind to a socket it's not permitted > to access. This is what strace reports: 18561 sched_setaffinity(18561, 8, { f0 } 18561 <... sched_setaffinity resumed> ) = -1 EINVAL (Invalid
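To make the strace line easier to read: the mask f0 has bits 4 to 7 set, so the call asks for CPUs 4-7, and Linux documents EINVAL for sched_setaffinity when the mask contains no CPU the process is permitted to use (for instance under a cpuset). The sketch below decodes the mask and checks it against an allowed set. The allowed set of CPUs 0-3 and both function names are assumptions for illustration, not details from the original report.

```python
from typing import Set


def mask_to_cpus(mask: int) -> Set[int]:
    """Return the CPU indices whose bits are set in an affinity bitmask."""
    return {bit for bit in range(mask.bit_length()) if (mask >> bit) & 1}


def would_be_rejected(requested: Set[int], allowed: Set[int]) -> bool:
    """True when no requested CPU is allowed, the case documented to yield EINVAL."""
    return not (requested & allowed)


requested = mask_to_cpus(0xF0)  # mask shown in the strace output
allowed = {0, 1, 2, 3}          # hypothetical cpuset granted to the job
print(sorted(requested), would_be_rejected(requested, allowed))
```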

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-18 Thread Chris Samuel
- "Eugene Loh" wrote: > Ah, you're missing the third secret safety switch that prevents > hapless mortals from using this stuff accidentally! :^) Sounds good to me. :-) > I think you need to add > > --mca opal_paffinity_alone 1 Yup, looks like that's it; it fails to launch with tha

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-18 Thread Eugene Loh
Chris Samuel wrote: OK, grabbed that (1.4a1r21825). Configured with: ./configure --prefix=$FOO --with-openib --with-tm=/usr/local/torque/latest --enable-static --enable-shared It built & installed OK, but when running a trivial example with it I don't see evidence for that code getting calle

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-18 Thread Chris Samuel
- "Ralph Castain" wrote: > Hi Chris Hiya, > The devel trunk has all of this in it - you can get that tarball from > the OMPI web site (take the nightly snapshot). OK, grabbed that (1.4a1r21825). Configured with: ./configure --prefix=$FOO --with-openib --with-tm=/usr/local/torque/latest

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread Ralph Castain
Hi Chris The devel trunk has all of this in it - you can get that tarball from the OMPI web site (take the nightly snapshot). I plan to work on cpuset support beginning Tues morning. Ralph On Aug 17, 2009, at 7:18 PM, Chris Samuel wrote: - "Eugene Loh" wrote: Hi Eugene, [...] It

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread Chris Samuel
- "Eugene Loh" wrote: Hi Eugene, [...] > It would be even better to have binding selections adapt to other > bindings on the system. Indeed! This touches on the earlier thread about making OMPI aware of its cpuset/cgroup allocation on the node (for those sites that are using it), it might

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread Ralph Castain
On Aug 17, 2009, at 5:59 PM, Chris Samuel wrote: - "Jeff Squyres" wrote: An important point to raise here: the 1.3 series is *not* the super stable series. It is the *feature* series. Specifically: it is not out of scope to introduce or change features within the 1.3 series. Ah, I t

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread Chris Samuel
- "Jeff Squyres" wrote: > An important point to raise here: the 1.3 series is *not* the super > stable series. It is the *feature* series. Specifically: it is not > out of scope to introduce or change features within the 1.3 series. Ah, I think I've misunderstood the website then. :-(

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread N.M. Maclaren
On Aug 17 2009, Paul H. Hargrove wrote: + I wonder if one can do any "introspection" with the dynamic linker to detect hybrid OpenMP (no "I") apps and avoid pinning them by default (examining OMP_NUM_THREADS in the environment is no good, since that variable may have a site default value othe

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread Patrick Geoffray
Jeff, Jeff Squyres wrote: ignored it whenever presenting competitive data. The 1,000,000th time I saw this, I gave up arguing that our competitors were not being fair and simply changed our defaults to always leave memory pinned for OpenFabrics-based networks. Instead, you should have tol

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread Paul H. Hargrove
Some more thoughts in this thread that I've not seen expressed yet (perhaps I missed them): + Some argue that this change in the middle of a stable series may, to some users, appear to be a performance regression when they update. However, I would argue that if the alternative is to delay thi

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread Ashley Pittman
Some very good points in this thread all round. On Mon, 2009-08-17 at 09:00 -0400, Jeff Squyres wrote: > > This is probably not too surprising (i.e., allowing the OS to move > jobs around between cores on a socket can probably involve a little > cache thrashing, resulting in that 5-10% loss)

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread Jeff Squyres
On Aug 17, 2009, at 3:23 PM, N.M. Maclaren wrote: >Yes, BUT... We had a similar option to this for a long, long time. Sorry, perhaps I should have spelled out what I meant by "mandatory". The system would not build (or run, depending on where it was set) without such a value being specified.

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread N.M. Maclaren
On Aug 17 2009, Jeff Squyres wrote: Yes, BUT... We had a similar option to this for a long, long time. Sorry, perhaps I should have spelled out what I meant by "mandatory". The system would not build (or run, depending on where it was set) without such a value being specified. There would

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread Jeff Squyres
On Aug 17, 2009, at 12:11 PM, N.M. Maclaren wrote: 1) To have a mandatory configuration option setting the default, which would have a name like 'performance' for the binding option. YOU could then beat up anyone who benchmarkets without it for being biassed. This is a better solution

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread N.M. Maclaren
On Aug 17 2009, Ralph Castain wrote: At issue for us is that other MPIs -do- bind by default, thus creating an apparent performance advantage for themselves compared to us on standard benchmarks run "out-of-the-box". We repeatedly get beat-up in papers and elsewhere over our performance, when ma

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread Ralph Castain
I don't disagree with your statements. However, I was addressing the specific question of two OpenMPI programs conflicting on process placement, not the overall question you are raising. The issue of when/if to bind has been debated for a long time. I agree that having more options (bind-to-socket

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread Kenneth Lloyd
pen-mpi.org > [mailto:devel-boun...@open-mpi.org] On Behalf Of Jeff Squyres > Sent: Monday, August 17, 2009 7:01 AM > To: Open MPI Developers > Subject: Re: [OMPI devel] Heads up on new feature to 1.3.4 > > On Aug 16, 2009, at 11:02 PM, Ralph Castain wrote: > > > I think the pro

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread N.M. Maclaren
On Aug 17 2009, Ralph Castain wrote: The problem is that the two mpiruns don't know about each other, and therefore the second mpirun doesn't know that another mpirun has already used socket 0. We hope to change that at some point in the future. It won't help. The problem is less likely

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread Eugene Loh
Jeff Squyres wrote: On Aug 16, 2009, at 11:02 PM, Ralph Castain wrote: UNLESS you have a threaded application, in which case -any- binding can be highly detrimental to performance. I'm not quite sure I understand this statement. Binding is not inherently contrary to multi-threaded applic

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread N.M. Maclaren
On Aug 17 2009, Jeff Squyres wrote: On Aug 16, 2009, at 11:02 PM, Ralph Castain wrote: I think the problem here, Eugene, is that performance benchmarks are far from the typical application. We have repeatedly seen this - optimizing for benchmarks frequently makes applications run less effi

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread Jeff Squyres
On Aug 16, 2009, at 8:56 PM, George Bosilca wrote: I tend to agree with Chris. Changing the behavior of the 1.3 in the middle of the stable release cycle, will be very confusing for our users. An important point to raise here: the 1.3 series is *not* the super stable series. It is the *fea

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread Jeff Squyres
On Aug 16, 2009, at 11:02 PM, Ralph Castain wrote: I think the problem here, Eugene, is that performance benchmarks are far from the typical application. We have repeatedly seen this - optimizing for benchmarks frequently makes applications run less efficiently. So I concur with Chris on th

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread Ralph Castain
The problem is that the two mpiruns don't know about each other, and therefore the second mpirun doesn't know that another mpirun has already used socket 0. We hope to change that at some point in the future. Ralph On Aug 17, 2009, at 4:02 AM, Lenny Verkhovsky wrote: In the multi job envi

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread Lenny Verkhovsky
In the multi job environment, can't we just start binding processes on the first available and unused socket? I mean first job/user will start binding itself from socket 0, the next job/user will start binding itself from socket 2, for instance. Lenny. On Mon, Aug 17, 2009 at 6:02 AM, Ralph Casta
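A minimal sketch of the idea Lenny describes, assuming every job on the node could consult a shared record of which sockets are already taken. Ralph's reply in this thread notes that separate mpiruns currently have no such knowledge of each other, so the `used` set and the function name here are hypothetical.

```python
from typing import Optional, Set


def first_unused_socket(num_sockets: int, used: Set[int]) -> Optional[int]:
    """Return the lowest-numbered socket not in `used`, or None if all are taken."""
    for socket in range(num_sockets):
        if socket not in used:
            return socket
    return None


used: Set[int] = set()                 # hypothetical shared per-node record
first = first_unused_socket(4, used)   # first job starts at socket 0
used.add(first)
second = first_unused_socket(4, used)  # next job starts at the next free socket
print(first, second)
```

The hard part is not the selection itself but maintaining the shared record across independent launches.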

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-16 Thread Ralph Castain
On Aug 16, 2009, at 8:16 PM, Eugene Loh wrote: Chris Samuel wrote: - "Eugene Loh" wrote: This is an important discussion. Indeed! My big fear is that people won't pick up the significance of the change and will complain about performance regressions in the middle of an OMPI stable re

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-16 Thread Eugene Loh
Chris Samuel wrote: - "Eugene Loh" wrote: This is an important discussion. Indeed! My big fear is that people won't pick up the significance of the change and will complain about performance regressions in the middle of an OMPI stable release cycle. 2) The pro

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-16 Thread Chris Samuel
- "Eugene Loh" wrote: > This is an important discussion. Indeed! My big fear is that people won't pick up the significance of the change and will complain about performance regressions in the middle of an OMPI stable release cycle. > Do note: > > 1) Bind-to-core is actually the default be

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-16 Thread George Bosilca
I tend to agree with Chris. Changing the behavior of the 1.3 in the middle of the stable release cycle, will be very confusing for our users. Moreover, as Ralph pointed out, everything in Open MPI is configurable so if we advertise this feature in the Changelog, the institutions where the n

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-16 Thread Chris Samuel
- "Ralph Castain" wrote: > Hi Chris Hiya Ralph, > There would be a "-do-not-bind" option that will prevent us from > binding processes to anything which should cover that situation. Gotcha. > My point was only that we would be changing the out-of-the-box > behavior to the opposite of tod

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-16 Thread Eugene Loh
This is an important discussion.  Do note: 1) Bind-to-core is actually the default behavior of many MPIs today. 2) The proposed OMPI bind-to-socket default is less severe.  In the general case, it would allow multiple jobs to bind in the same way without oversubscribing any core or socket.  (

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-16 Thread Ralph Castain
Hi Chris There would be a "-do-not-bind" option that will prevent us from binding processes to anything which should cover that situation. My point was only that we would be changing the out-of-the-box behavior to the opposite of today's, so all those such as yourself would now have to add the -d

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-16 Thread Chris Samuel
- "Terry Dontje" wrote: > I just wanted to give everyone a heads up if they do not get bugs > email. I just submitted a CMR to move over some new paffinity options > from the trunk to the v1.3 branch. Ralph's comments imply that for those sites that share nodes between jobs (such as oursel

[OMPI devel] Heads up on new feature to 1.3.4

2009-08-13 Thread Terry Dontje
I just wanted to give everyone a heads up if they do not get bugs email. I just submitted a CMR to move over some new paffinity options from the trunk to the v1.3 branch. You can read the gory details in https://svn.open-mpi.org/trac/ompi/ticket/1997 --td