Re: [OMPI users] large memory usage and hangs when preconnecting beyond 1000 cpus

2014-10-30 Thread Marshall Ward
Hi, I'm just following up on this to say that the problem was not related to preconnection, but just very large memory usage for high CPU jobs. Preconnecting was just acting to send off a large number of isend/irecv messages and trigger the memory consumption. I tried experimenting a bit with

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-30 Thread Gus Correa
On 10/30/2014 07:32 PM, Ralph Castain wrote: Just for FYI: I believe Nathan misspoke. The new capability is in 1.8.4, which I hope to release next Friday (Nov 7th) Hi Ralph That is even better! Look forward to OMPI 1.8.4. I still would love to hear from Nathan / OMPI team about my remaining

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-30 Thread Ralph Castain
Just for FYI: I believe Nathan misspoke. The new capability is in 1.8.4, which I hope to release next Friday (Nov 7th) > On Oct 30, 2014, at 4:24 PM, Gus Correa wrote: > > Hi Nathan > > Thank you very much for addressing this problem. > > I read your notes on Jeff's

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-30 Thread Gus Correa
Hi Nathan Thank you very much for addressing this problem. I read your notes on Jeff's blog about vader, and that clarified many things that were obscure to me when I first started this thread whining that knem was not working in OMPI 1.8.3. Thank you also for writing that blog post, and for

[OMPI users] orte-ps and orte-top behavior

2014-10-30 Thread Brock Palen
If I'm on the node hosting mpirun for a job, and run: orte-ps It finds the job and shows the pids and info for all ranks. If I use orte-top though it does no such default, I have to find the mpirun pid and then use it. Why do the two have different behavior? They show data from the same

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-30 Thread Nathan Hjelm
I want to close the loop on this issue. 1.8.5 will address it in several ways: - knem support in btl/sm has been fixed. A sanity check was disabling knem during component registration. I wrote the sanity check before the 1.7 release and didn't intend this side-effect. - vader now

Re: [OMPI users] Allgather in OpenMPI 1.4.3

2014-10-30 Thread Sebastian Rettenberger
Since I can't upgrade the system packages anyway (due to dependencies), I installed version 1.8.3. The bug is fixed in this version. Thank you Sebastian On 29.10.2014 16:03, Jeff Squyres (jsquyres) wrote: Can you at least upgrade to 1.4.5? That's the last release in the 1.4.x series. Note

[OMPI users] engineer position on hwloc+netloc

2014-10-30 Thread Brice Goglin
Hello, There's an R&D engineer position opening in my research team at Inria Bordeaux (France) for developing hwloc and netloc software (both Open MPI subprojects). All details available at http://runtime.bordeaux.inria.fr/goglin/201410-Engineer-hwloc+netloc.en.pdf or French version

[hwloc-users] engineer position on hwloc+netloc

2014-10-30 Thread Brice Goglin
Hello, There's an R&D engineer position opening in my research team at Inria Bordeaux (France) for developing hwloc and netloc software (both Open MPI subprojects). All details available at http://runtime.bordeaux.inria.fr/goglin/201410-Engineer-hwloc+netloc.en.pdf or French version