Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-11 Thread Ralph Castain
You are more than welcome - we really appreciate your spotting the problem! As a side note: you commented about how this works now even if you don’t set the “yield” MCA param. Just as an FYI: we automatically set the “yield” param for you when we detect that you are oversubscribing the node as

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-10 Thread Eric Chamberland
On 12/10/2014 12:55 PM, Ralph Castain wrote: Tarball now available on web site http://www.open-mpi.org/nightly/v1.8/ I’ll run the tarball generator now so you can try the nightly tarball. ok, retrieved openmpi-v1.8.3-236-ga21cb20 and it compiled, linked, and executed nicely when

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-10 Thread Eric Chamberland
On 12/10/2014 12:55 PM, Ralph Castain wrote: Tarball now available on web site http://www.open-mpi.org/nightly/v1.8/ On Dec 10, 2014, at 9:40 AM, Ralph Castain wrote: I’ll run the tarball generator now so you can try the nightly tarball.

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-10 Thread Ralph Castain
Tarball now available on web site http://www.open-mpi.org/nightly/v1.8/ > On Dec 10, 2014, at 9:40 AM, Ralph Castain wrote: > > I’ll run the tarball generator now so you can try the nightly tarball. > >> On Dec 10, 2014, at 9:20 AM,

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-10 Thread Ralph Castain
I’ll run the tarball generator now so you can try the nightly tarball. > On Dec 10, 2014, at 9:20 AM, Eric Chamberland wrote: > > On 12/10/2014 10:40 AM, Ralph Castain wrote: >> You should be able to apply the patch - I don’t think that section of >> code

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-10 Thread Eric Chamberland
On 12/10/2014 10:40 AM, Ralph Castain wrote: You should be able to apply the patch - I don’t think that section of code differs from what is in the 1.8 repo. It compiles and links, but now gives me a segmentation violation: #0 0x7f1827b00e91 in mca_allocator_component_lookup () from

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-10 Thread Ralph Castain
You should be able to apply the patch - I don’t think that section of code differs from what is in the 1.8 repo. The sha for 1.8.3 can be found on the web site (see right-most column in table): http://www.open-mpi.org/software/ompi/v1.8/ > On Dec

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-10 Thread Eric Chamberland
Hi Nathan, I pulled your commit d0da29351f9 and tested it against our example. It now works perfectly. Strangely, I can even unset "OMPI_MCA_mpi_yield_when_idle=1" and it doesn't seem to take any longer. Can I apply the patch to a fresh "1.8.3", and should it work? Other question: how can I
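For reference, a minimal sketch of the two usual ways to pass the yield hint discussed above. The environment variable is the form quoted in the message; the command-line form, the process count, and `./myprog` are placeholders for illustration, not taken from the poster's setup.

```shell
# Option 1: environment variable, the form quoted in the message above.
export OMPI_MCA_mpi_yield_when_idle=1

# Option 2: the same MCA parameter on the mpirun command line. Left as a
# comment because ./myprog and the process count are placeholders:
#   mpirun --mca mpi_yield_when_idle 1 -np 32 ./myprog

# To repeat the run without the hint, drop the variable again:
#   unset OMPI_MCA_mpi_yield_when_idle

# Show what a subsequent mpirun would inherit from the environment.
echo "mpi_yield_when_idle=${OMPI_MCA_mpi_yield_when_idle}"
```

Comparing wall-clock time with and without the variable, as was done here, is a quick way to see whether the hint still makes a difference after the fix.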

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-09 Thread Eric Chamberland
On 12/09/2014 04:19 PM, Nathan Hjelm wrote: yield when idle is broken on 1.8. Fixing now. ok, thanks a lot! will wait for the fix! Eric

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-09 Thread Nathan Hjelm
yield when idle is broken on 1.8. Fixing now. -Nathan On Tue, Dec 09, 2014 at 01:02:08PM -0800, Ralph Castain wrote: > Hmmm….well, it looks like we are doing the right thing and running unbound > when oversubscribed like this. I don’t have any brilliant idea why it would > be running so

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-09 Thread Ralph Castain
Hmmm….well, it looks like we are doing the right thing and running unbound when oversubscribed like this. I don’t have any brilliant idea why it would be running so slowly in that situation when compared with 1.6.5 - it could be that yield-when-idle is borked. I’ll try to dig into that notion a

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-09 Thread Eric Chamberland
Hi again, I sorted and "seded" (cat output.1.00 |sed 's/default/default value/g'|sed 's/true/1/g' |sed 's/false/0/g') the output.1.00 file from: mpirun --output-filename output -mca mpi_show_mca_params all --report-bindings -np 32 myprog between a launch with 1.6.5 vs 1.8.3. The diff may be
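The normalize-then-diff step described above can be sketched as a small shell function. The three substitutions are the ones quoted in the message; the function name and the file names in the usage comment are hypothetical, and this assumes each dump has one parameter per line.

```shell
#!/bin/sh
# Normalize one mpi_show_mca_params dump so that output from two Open MPI
# versions can be compared line by line. The substitutions mirror the sed
# pipeline quoted above; normalize_params is a hypothetical helper name.
normalize_params() {
    sed -e 's/default/default value/g' \
        -e 's/true/1/g' \
        -e 's/false/0/g' "$1" | LC_ALL=C sort
}

# Usage (hypothetical file names, one dump per Open MPI version):
#   normalize_params output-1.6.5.1.00 > params-1.6.5.txt
#   normalize_params output-1.8.3.1.00 > params-1.8.3.txt
#   diff params-1.6.5.txt params-1.8.3.txt
```

Sorting under `LC_ALL=C` keeps the ordering identical on both sides, so the diff shows only parameters whose values or defaults actually changed.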

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-09 Thread Eric Chamberland
On 12/09/2014 12:24 PM, Ralph Castain wrote: Can you provide an example cmd line you use to launch one of these tests using 1.8.3? Some of the options changed between the 1.6 and 1.8 series, and we bind by default in 1.8 - the combination may be causing you a problem. I very simply launch:

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-09 Thread Ralph Castain
Not for that many procs - we default to binding to socket for anything more than 2 procs > On Dec 9, 2014, at 9:24 AM, Nathan Hjelm wrote: > > > One thing that changed between 1.6 and 1.8 is the default binding > policy. Open MPI 1.6 did not bind by default but 1.8 binds to

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-09 Thread Ralph Castain
Can you provide an example cmd line you use to launch one of these tests using 1.8.3? Some of the options changed between the 1.6 and 1.8 series, and we bind by default in 1.8 - the combination may be causing you a problem. > On Dec 9, 2014, at 9:14 AM, Eric Chamberland >

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-09 Thread Nathan Hjelm
One thing that changed between 1.6 and 1.8 is the default binding policy. Open MPI 1.6 did not bind by default but 1.8 binds to core. You can unset the binding policy by adding --bind-to none. -Nathan Hjelm HPC-5, LANL On Tue, Dec 09, 2014 at 12:14:32PM -0500, Eric Chamberland wrote: > Hi, > >
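A minimal sketch of a launch line with default binding turned off, as suggested above. It only prints the command instead of running it; `build_launch_cmd`, the process count, and `./myprog` are placeholders. `--report-bindings` is included so that the resulting (un)bound state shows up in the output.

```shell
# Build, and only print, a 1.8-series launch line with binding disabled.
# build_launch_cmd is a hypothetical helper: $1 is the process count and
# $2 is the program to run.
build_launch_cmd() {
    printf 'mpirun --bind-to none --report-bindings -np %s %s\n' "$1" "$2"
}

build_launch_cmd 32 ./myprog
```

Running the printed command with and without `--bind-to none` is one way to check whether the new default binding policy is what slows down an oversubscribed run.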