Re: [OMPI users] Bug in OpenMPI-1.8.1: missing routines mpi_win_allocate_shared, mpi_win_shared_query called from Ftn95-code

2014-06-09 Thread Jeff Squyres (jsquyres)
Oops. Looks like we missed these in the Fortran interfaces. I'll file a bug; we'll get this fixed in OMPI 1.8.2. Many thanks for reporting this. On Jun 5, 2014, at 5:41 AM, michael.rach...@dlr.de wrote: > Dear developers of OpenMPI, > > I found that when building an executable from a Fortr
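
For context, the two routines in question are the MPI-3 shared-memory window calls. A minimal C sketch of their C-binding counterparts (illustrative only; the C bindings are not the interfaces reported missing here):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* Communicator containing only the ranks that share a node. */
    MPI_Comm shm_comm;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &shm_comm);

    int rank;
    MPI_Comm_rank(shm_comm, &rank);

    /* Each rank contributes 1024 bytes to a node-local shared window. */
    MPI_Win win;
    int *base;
    MPI_Win_allocate_shared(1024, sizeof(int), MPI_INFO_NULL, shm_comm,
                            &base, &win);

    /* Query the segment owned by rank 0 of the shared communicator. */
    MPI_Aint size;
    int disp_unit;
    int *rank0_base;
    MPI_Win_shared_query(win, 0, &size, &disp_unit, &rank0_base);

    printf("rank %d sees rank 0's segment of %ld bytes\n", rank, (long)size);

    MPI_Win_free(&win);
    MPI_Comm_free(&shm_comm);
    MPI_Finalize();
    return 0;
}

The Fortran interfaces for these same two calls are what the report found missing in 1.8.1.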

Re: [OMPI users] intermittent segfaults with openib on ring_c.c

2014-06-09 Thread Jeff Squyres (jsquyres)
I'm digging out from mail backlog from being at the MPI Forum last week... Yes, from looking at the stack traces, it's segv'ing inside the memory allocator, which typically means some other memory error occurred before this. I.e., this particular segv is a symptom of the problem, not the actual
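
To illustrate the general point (purely illustrative C, not code from the thread): a heap overrun in one place usually corrupts allocator metadata silently, and the failure only surfaces later inside malloc()/free():

#include <stdlib.h>
#include <string.h>

int main(void)
{
    char *a = malloc(16);
    char *b = malloc(16);

    /* Bug: writes far past the end of 'a', trampling the allocator's
     * bookkeeping for the neighbouring block. Nothing fails yet. */
    memset(a, 0, 64);

    /* The damage shows up here, inside the allocator, well away from
     * the line that caused it (often an abort or segfault in free()). */
    free(b);
    free(a);
    return 0;
}

That is why the stack trace points into the memory allocator rather than at the real bug.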

Re: [OMPI users] openib segfaults with Torque

2014-06-09 Thread Jeff Squyres (jsquyres)
I seem to recall that you have an IB-based cluster, right? From a *very quick* glance at the code, it looks like this might be a simple incorrect-finalization issue. That is: - you run the job on a single server - openib disqualifies itself because you're running on a single server - openib t

Re: [OMPI users] orted 1.6.4 and 1.8.1 segv with bonded Cisco P81E

2014-06-09 Thread Jeff Squyres (jsquyres)
On Jun 9, 2014, at 7:00 PM, Vineet Rawat wrote: > We actually do ship the /share and /etc directories. We set OPAL_PREFIX to a sub-directory of our installation and make sure those things are in our PATH/LD_LIBRARY_PATH. > > I can try adding the additional shared libs but it doesn't sound

Re: [OMPI users] orted 1.6.4 and 1.8.1 segv with bonded Cisco P81E

2014-06-09 Thread Vineet Rawat
On Mon, Jun 9, 2014 at 3:40 PM, Jeff Squyres (jsquyres) wrote: > On Jun 9, 2014, at 6:36 PM, Vineet Rawat wrote: > > > No, we only included what seemed necessary (from ldd output and experience on other clusters). The only things in my /lib/openmpi are libompi_dbg_msgq*. Is that what you're

Re: [OMPI users] orted 1.6.4 and 1.8.1 segv with bonded Cisco P81E

2014-06-09 Thread Vineet Rawat
On Mon, Jun 9, 2014 at 3:31 PM, Jeff Squyres (jsquyres) wrote: > On Jun 9, 2014, at 5:41 PM, Vineet Rawat wrote: > > > We've deployed OpenMPI on a small cluster but get a SEGV in orted. Debug information is very limited as the cluster is at a remote customer site. > They have a network card wi

Re: [OMPI users] orted 1.6.4 and 1.8.1 segv with bonded Cisco P81E

2014-06-09 Thread Ralph Castain
There is one new "feature" in 1.8 - it now checks to see if the version on the backend matches the version on the frontend. In other words, mpirun checks to see if the orted connecting to it is from the same version - if not, the orted will die. Shouldn't segfault, though - just abort. You cou
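
If the job gets far enough to start MPI processes, one way to double-check for a front-end/back-end mismatch is to have every rank print the library version it is actually running. A small sketch (not something suggested in the thread; uses the MPI-3 MPI_Get_library_version call):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    char host[MPI_MAX_PROCESSOR_NAME];
    int host_len;
    MPI_Get_processor_name(host, &host_len);

    /* Version string of the MPI library each rank is linked against. */
    char version[MPI_MAX_LIBRARY_VERSION_STRING];
    int ver_len;
    MPI_Get_library_version(version, &ver_len);

    printf("rank %d on %s: %s\n", rank, host, version);

    MPI_Finalize();
    return 0;
}

(If the orteds die before launch, comparing ompi_info output on the head node and a compute node is the simpler check.)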

Re: [OMPI users] orted 1.6.4 and 1.8.1 segv with bonded Cisco P81E

2014-06-09 Thread Jeff Squyres (jsquyres)
On Jun 9, 2014, at 6:36 PM, Vineet Rawat wrote: > No, we only included what seemed necessary (from ldd output and experience on other clusters). The only things in my /lib/openmpi are libompi_dbg_msgq*. Is that what you're referring to? In /lib for 1.8.1 (ignoring the VampirTrace libs)

Re: [OMPI users] orted 1.6.4 and 1.8.1 segv with bonded Cisco P81E

2014-06-09 Thread Vineet Rawat
On Mon, Jun 9, 2014 at 3:21 PM, Ralph Castain wrote: > > On Jun 9, 2014, at 2:41 PM, Vineet Rawat wrote: > > Hi, > > We've deployed OpenMPI on a small cluster but get a SEGV in orted. Debug information is very limited as the cluster is at a remote customer site. > They have a network card with

Re: [OMPI users] orted 1.6.4 and 1.8.1 segv with bonded Cisco P81E

2014-06-09 Thread Jeff Squyres (jsquyres)
On Jun 9, 2014, at 5:41 PM, Vineet Rawat wrote: > We've deployed OpenMPI on a small cluster but get a SEGV in orted. Debug information is very limited as the cluster is at a remote customer site. They have a network card with which I'm not familiar (Cisco Systems Inc VIC P81E PCIe Ethern

Re: [OMPI users] orted 1.6.4 and 1.8.1 segv with bonded Cisco P81E

2014-06-09 Thread Ralph Castain
On Jun 9, 2014, at 2:41 PM, Vineet Rawat wrote: > Hi, > > We've deployed OpenMPI on a small cluster but get a SEGV in orted. Debug information is very limited as the cluster is at a remote customer site. They have a network card with which I'm not familiar (Cisco Systems Inc VIC P81E P

[OMPI users] orted 1.6.4 and 1.8.1 segv with bonded Cisco P81E

2014-06-09 Thread Vineet Rawat
Hi, We've deployed OpenMPI on a small cluster but get a SEGV in orted. Debug information is very limited as the cluster is at a remote customer site. They have a network card with which I'm not familiar (Cisco Systems Inc VIC P81E PCIe Ethernet NIC) and it seems capable of using the usNIC BTL. I'm

Re: [OMPI users] openmpi linking problem

2014-06-09 Thread Marco Atzeri
On 09/06/2014 19:14, Sergii Veremieiev wrote: Dear Sir/Madam, I'm trying to link a C/FORTRAN code on Cygwin with Open MPI 1.7.5 and GCC 4.8.2: mpicxx ./lib/Multigrid.o ./lib/GridFE.o ./lib/Data.o ./lib/GridFD.o ./lib/Parameters.o ./lib/MtInt.o ./lib/MtPol.o ./lib/MtDob.o -o Test_cygwin_openmpi_

Re: [OMPI users] openmpi linking problem

2014-06-09 Thread Tim Prince
On 6/9/2014 1:14 PM, Sergii Veremieiev wrote: Dear Sir/Madam, I'm trying to link a C/FORTRAN code on Cygwin with Open MPI 1.7.5 and GCC 4.8.2: mpicxx ./lib/Multigrid.o ./lib/GridFE.o ./lib/Data.o ./lib/GridFD.o ./lib/Parameters.o ./lib/MtInt.o ./lib/MtPol.o ./lib/MtDob.o -o Test_cygwin_ope

[OMPI users] openmpi linking problem

2014-06-09 Thread Sergii Veremieiev
Dear Sir/Madam, I'm trying to link a C/FORTRAN code on Cygwin with Open MPI 1.7.5 and GCC 4.8.2: mpicxx ./lib/Multigrid.o ./lib/GridFE.o ./lib/Data.o ./lib/GridFD.o ./lib/Parameters.o ./lib/MtInt.o ./lib/MtPol.o ./lib/MtDob.o -o Test_cygwin_openmpi_gcc -L./external/MUMPS/lib -ldmumps_cygwin_open

Re: [OMPI users] Compiling OpenMPI 1.8.1 for Cray XC30

2014-06-09 Thread Nathan Hjelm
I have a platform file for the XC30 that I haven't yet pushed to the repository. I will try to push it later today. -Nathan On Thu, Jun 05, 2014 at 04:00:03PM +, Hammond, Simon David (-EXP) wrote: > Hi OpenMPI developers/users, > > Does anyone have a working configure line for OpenMPI 1.8.1

Re: [OMPI users] Determining what parameters a scheduler passes to OpenMPI

2014-06-09 Thread Sasso, John (GE Power & Water, Non-GE)
I will be testing w/ 1.6.5 and update accordingly. Thanks much! -Original Message- From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain Sent: Sunday, June 08, 2014 7:40 PM To: Open MPI Users Subject: Re: [OMPI users] Determining what parameters a scheduler passes to

Re: [OMPI users] Bind multiple cores to rank - OpenMPI 1.8.1

2014-06-09 Thread Dan Dietz
Ack - that was my fault. Too early on a Monday morning. This seems to work perfectly when I correctly submit a job! Thanks! Dan On Mon, Jun 9, 2014 at 9:34 AM, Dan Dietz wrote: > Yes, you're exactly right - this system has 2 Phi cards per node. I believe the "PCI 8086" device in the lstopo out

Re: [OMPI users] Bind multiple cores to rank - OpenMPI 1.8.1

2014-06-09 Thread Dan Dietz
Yes, you're exactly right - this system has 2 Phi cards per node. I believe the "PCI 8086" device in the lstopo output is them. Possibly related, we've observed a weird bug with Torque and the allocation it provides when you request the Phis. When requesting them you get a nodefile with only 1 entr
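
A quick way to confirm what each rank actually got bound to, independent of what Torque handed out, is to have the ranks report their CPU affinity masks. A Linux-only sketch (not from the thread; uses sched_getaffinity):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Read the calling process's CPU affinity mask, as set by mpirun's
     * binding policy (or left wide open if no binding was applied). */
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) == 0) {
        printf("rank %d bound to cores:", rank);
        for (int c = 0; c < CPU_SETSIZE; c++) {
            if (CPU_ISSET(c, &mask))
                printf(" %d", c);
        }
        printf("\n");
    }

    MPI_Finalize();
    return 0;
}

Comparing this output against the intended map-by/bind-to settings makes binding mistakes (or a bad nodefile from the scheduler) easy to spot.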