[OMPI users] CuEventCreate Failed...

2014-10-17 Thread Steven Eliuk
Hi All, We have run into issues that don’t really seem to materialize into incorrect results; nonetheless, we hope to figure out why we are getting them. We have several environments with tests ranging from one machine, with say 1-16 processes per node, to several machines with 1-16 processes. All

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-17 Thread Ralph Castain
> On Oct 17, 2014, at 12:06 PM, Gus Correa wrote:
>
> Hi Jeff
>
> Many thanks for looking into this and filing a bug report at 11:16PM!
>
> Thanks to Aurelien, Ralph and Nathan for their help and clarifications
> also.
>
> **
>
> Related suggestion:
>
> Add a note

Re: [OMPI users] Open MPI 1.8.3 openmpi-mca-params.conf: old and new parameters

2014-10-17 Thread Ralph Castain
> On Oct 17, 2014, at 10:23 AM, Gus Correa wrote:
>
> Hi Ralph
>
> Thank you.
> Your fixes covered much more than I could find.
> The section about the three levels of process placement
> ("Mapping, Ranking, and Binding: Oh My!") really helps.
> I would just add at the

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-17 Thread Gus Correa
Hi Jeff Many thanks for looking into this and filing a bug report at 11:16PM! Thanks to Aurelien, Ralph and Nathan for their help and clarifications also. ** Related suggestion: Add a note to the FAQ explaining that in OMPI 1.8 the new (default) btl is vader (and what it is). It was a real

Re: [OMPI users] Open MPI 1.8.3 openmpi-mca-params.conf: old and new parameters

2014-10-17 Thread Gus Correa
Hi Ralph Thank you. Your fixes covered much more than I could find. The section about the three levels of process placement ("Mapping, Ranking, and Binding: Oh My!") really helps. I would just add at the very beginning short sentences quickly characterizing each of the three levels. Kind of an

Re: [OMPI users] Open MPI 1.8: link problem when Fortran+C+Platform LSF

2014-10-17 Thread Ralph Castain
Sigh - the original message didn’t get in there, I think. See below:
Paul - it looks to me like we are adding the required libraries, but perhaps not to the wrapper compilers. Jeff is on travel today, but I’ll check with him next week.
Ralph
> Dear Open MPI developer,
>
> we have both Open

Re: [OMPI users] [FEniCS] Question about MPI barriers

2014-10-17 Thread Jeff Squyres (jsquyres)
Thanks; I filed https://github.com/open-mpi/ompi/issues/242.
On Oct 17, 2014, at 5:59 AM, Jed Brown wrote:
> Martin Sandve Alnæs writes:
>
>> Thanks, but ibarrier doesn't seem to be in the stable version of openmpi:
>> http://www.open-mpi.org/doc/v1.8/

Re: [OMPI users] [FEniCS] Question about MPI barriers

2014-10-17 Thread Jed Brown
Martin Sandve Alnæs writes:
> Thanks, but ibarrier doesn't seem to be in the stable version of openmpi:
> http://www.open-mpi.org/doc/v1.8/
> Otherwise mpi_ibarrier+mpi_test+homemade time/sleep loop would do the trick.
MPI_Ibarrier is there (since 1.7), just missing a man

[OMPI users] large memory usage and hangs when preconnecting beyond 1000 cpus

2014-10-17 Thread Marshall Ward
I currently have a numerical model that, for reasons unknown, requires preconnection to avoid hanging on an initial MPI_Allreduce call. But when we try to scale out beyond around 1000 cores, we are unable to get past MPI_Init's preconnection phase. To test this, I have a basic C program

Re: [OMPI users] Open MPI 1.8.3 openmpi-mca-params.conf: old and new parameters

2014-10-17 Thread Ralph Castain
I know this commit could be a little hard to parse, but I have updated the mpirun man page on the trunk and will port the change over to the 1.8 series tomorrow. FWIW, I’ve provided the link to the commit below so you can “preview” it.

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-17 Thread Jeff Squyres (jsquyres)
On Oct 16, 2014, at 1:35 PM, Gus Correa wrote:
> and on the MCA parameter file:
>
> btl_sm_use_knem = 1
I think the logic enforcing this MCA param got broken when we revamped the MCA param system. :-(
> I am scratching my head to understand why a parameter with such