Re: [OMPI users] [OMPI devel] mpirun: symbol lookup error: /usr/local/lib/openmpi/mca_plm_lsf.so: undefined symbol: lsb_init

2009-03-31 Thread Alessandro Surace
Hi Jeff, yes, I've installed LSF, and liblsf and libbat are found by configure, as you can see in the previous attachment and here: /opt/lsf/7.0/linux2.6-glibc2.3-x86/lib -rw-r--r-- 1 root 10007 1771182 Sep 24 2008 libbat.a -rw-r--r-- 1 root 10007 31278 Nov 23 2007 libbat.jsdl.a -rwxr-xr-x 1

Re: [OMPI users] Generic Type

2009-03-31 Thread Gabriele Fatigati
Mm, I suppose Open MPI functions like MPI_Irecv do pointer arithmetic over the receive buffer using the type info in ompi_datatype_t. I'm trying to write a wrapper around MPI_Gather using Irecv functions: int MPI_FT_Gather(void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuff,

Re: [OMPI users] Generic Type

2009-03-31 Thread Gabriele Fatigati
Thanks Massimo, now it works well. I erroneously thought that Irecv did this automatically using the recvtype fields. 2009/3/31 Massimo Cafaro : > Hi, > > let me say that it is still not clear to me why you want to reimplement the > MPI_Gather supplied by an MPI

Re: [OMPI users] Generic Type

2009-03-31 Thread Massimo Cafaro
Hi, unfortunately it's up to us to provide the starting address of the buffer and the number of elements to be received multiplied by the datatype extent. This kind of thing is dealt with automatically in the internals of collective communication operations. Massimo On 31/mar/09, at 14:00,

[OMPI users] Strange behaviour of SGE+OpenMPI

2009-03-31 Thread PN
Dear all, I'm using Open MPI 1.3.1 and SGE 6.2u2 on CentOS 5.2. I have 2 compute nodes for testing; each node has a single quad-core CPU. Here is my submission script and PE config: $ cat hpl-8cpu.sge #!/bin/bash # #$ -N HPL_8cpu_IB #$ -pe mpi-fu 8 #$ -cwd #$ -j y #$ -S /bin/bash #$ -V # cd
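The "-machinefile $TMPDIR/machines" the thread discusses is conventionally generated from SGE's $PE_HOSTFILE. The sketch below shows one common recipe; the node names in the standalone fallback are hypothetical, and only $PE_HOSTFILE/$TMPDIR themselves come from the thread:

```shell
#!/bin/sh
# Sketch: build an Open MPI machinefile from SGE's $PE_HOSTFILE.
# Each PE_HOSTFILE line looks like: "node0001 4 all.q@node0001 UNDEFINED"
# (field 1 = hostname, field 2 = slots granted on that host).
TMPDIR=${TMPDIR:-/tmp}
if [ -z "$PE_HOSTFILE" ] || [ ! -r "$PE_HOSTFILE" ]; then
    # Outside an SGE job: fabricate a tiny example hostfile so the
    # script can be exercised standalone (hypothetical node names).
    PE_HOSTFILE="$TMPDIR/pe_hostfile.example"
    printf 'node0001 4 all.q@node0001 UNDEFINED\nnode0002 4 all.q@node0002 UNDEFINED\n' > "$PE_HOSTFILE"
fi
awk '{ print $1 " slots=" $2 }' "$PE_HOSTFILE" > "$TMPDIR/machines"
cat "$TMPDIR/machines"
```

With tight integration working, this file should not be needed at all: mpirun reads the allocation directly from SGE.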

Re: [OMPI users] Strange behaviour of SGE+OpenMPI

2009-03-31 Thread Rolf Vandevaart
On 03/31/09 11:43, PN wrote: Dear all, I'm using Open MPI 1.3.1 and SGE 6.2u2 on CentOS 5.2 I have 2 compute nodes for testing, each node has a single quad core CPU. Here is my submission script and PE config: $ cat hpl-8cpu.sge #!/bin/bash # #$ -N HPL_8cpu_IB #$ -pe mpi-fu 8 #$ -cwd #$ -j y

Re: [OMPI users] Linux opteron infiniband sunstudio configure, problem

2009-03-31 Thread Jeff Squyres
On Mar 31, 2009, at 1:31 PM, Terry Dontje wrote: Can you manually run UNAME_REL=`(/bin/uname -X|grep Release|sed -e 's/.*= //')` in your shell without error? Better would be to put this small script by itself: #! /bin/sh UNAME_REL=`(/bin/uname -X|grep Release|sed -e 's/.*= //')` echo got

Re: [OMPI users] Linux opteron infiniband sunstudio configure, problem

2009-03-31 Thread Kevin McManus
On Tue, Mar 31, 2009 at 01:37:22PM -0400, Jeff Squyres wrote: > On Mar 31, 2009, at 1:31 PM, Terry Dontje wrote: > > >Can you manually run UNAME_REL=`(/bin/uname -X|grep Release|sed -e > >'s/.*= //')` in your shell without error? > > > > Better would be to put this small script by itself: > >

Re: [OMPI users] OpenMPI 1.3.1 + BLCR build problem

2009-03-31 Thread Dave Love
M C writes: > --- MCA component crs:blcr (m4 configuration macro) > checking for MCA component crs:blcr compile mode... dso > checking --with-blcr value... sanity check ok (/opt/blcr) > checking --with-blcr-libdir value... sanity check ok (/opt/blcr/lib) > configure:

Re: [OMPI users] Strange behaviour of SGE+OpenMPI

2009-03-31 Thread Dave Love
Rolf Vandevaart writes: >> However, I found that if I explicitly specify the "-machinefile >> $TMPDIR/machines", all 8 mpi processes were spawned within a single >> node, i.e. node0002. I had that sort of behaviour recently when the tight integration was broken on the

Re: [OMPI users] OpenMPI 1.3.1 + BLCR build problem

2009-03-31 Thread Josh Hursey
I think that the missing configure option might be the problem as well. The BLCR configure logic checks whether you have enabled checkpoint/restart in Open MPI. If you haven't, it fails out of configure (it probably should print a better error message; I'll put that on my todo list).
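Josh's point maps to a configure invocation along these lines, a sketch rather than a verbatim recipe from the thread: checkpoint/restart must be enabled explicitly before the BLCR CRS component will build (the /opt/blcr paths follow the values quoted earlier; exact flags may differ for your build):

```sh
./configure --with-ft=cr --enable-mpi-threads \
    --with-blcr=/opt/blcr --with-blcr-libdir=/opt/blcr/lib
```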

Re: [OMPI users] Linux opteron infiniband sunstudio configure, problem

2009-03-31 Thread Bogdan Costescu
On Tue, 31 Mar 2009, Jeff Squyres wrote: UNAME_REL=`(/bin/uname -X|grep Release|sed -e 's/.*= //')` Not sure what you want to achieve here... 'uname -X' is valid on Solaris, but not on Linux. The OP has indicated already that he is running this on Linux (SLES) so the above line is supposed
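Bogdan's observation can be made concrete with a small portability guard. This is a sketch of my own, not a fragment of the actual configure script: probe for 'uname -X' (Solaris) and fall back to 'uname -r' elsewhere:

```shell
#!/bin/sh
# 'uname -X' is valid on Solaris but fails on Linux, so probe first.
if /bin/uname -X >/dev/null 2>&1; then
    # Solaris: pull the "Release = ..." line out of the -X report.
    UNAME_REL=`/bin/uname -X | grep Release | sed -e 's/.*= //'`
else
    # Linux and others: plain kernel release.
    UNAME_REL=`/bin/uname -r`
fi
echo "got $UNAME_REL"
```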

Re: [OMPI users] Linux opteron infiniband sunstudio configure, problem

2009-03-31 Thread Kevin McManus
On Tue, Mar 31, 2009 at 10:11:17PM +0200, Bogdan Costescu wrote: > On Tue, 31 Mar 2009, Bogdan Costescu wrote: > > >'uname -X' is valid on Solaris, but not on Linux. > > Not good to reply to oneself, but I've looked at the archives and > realized that 'uname -X' comes from a message of the OP.

Re: [OMPI users] Linux opteron infiniband sunstudio configure, problem

2009-03-31 Thread Jeff Squyres
My goal in having you try that statement in a standalone shell script wasn't the success or failure of the uname command -- but rather to figure out if something in that statement itself was causing the syntax error. Apparently it is not. There's an errant character elsewhere that is

Re: [OMPI users] Strange behaviour of SGE+OpenMPI

2009-03-31 Thread Rolf Vandevaart
On 03/31/09 14:50, Dave Love wrote: Rolf Vandevaart writes: However, I found that if I explicitly specify the "-machinefile $TMPDIR/machines", all 8 mpi processes were spawned within a single node, i.e. node0002. I had that sort of behaviour recently when the tight

Re: [OMPI users] Linux opteron infiniband sunstudio configure, problem

2009-03-31 Thread Kevin McManus
On Tue, Mar 31, 2009 at 04:59:00PM -0400, Jeff Squyres wrote: > My goal in having you try that statement in a standalone shell script > wasn't the success or failure of the uname command -- but rather to > figure out if something in that statement itself was causing the > syntax error. > >

Re: [OMPI users] Linux opteron infiniband sunstudio configure, problem

2009-03-31 Thread Jeff Squyres
On Mar 31, 2009, at 5:25 PM, Kevin McManus wrote: --- MCA component mtl:psm (m4 configuration macro) checking for MCA component mtl:psm compile mode... static checking --with-psm value... simple ok (unspecified) checking --with-psm-libdir value... sanity check ok (/usr/lib64) checking psm.h

Re: [OMPI users] Linux opteron infiniband sunstudio configure, problem

2009-03-31 Thread Kevin McManus
On Tue, Mar 31, 2009 at 05:36:19PM -0400, Jeff Squyres wrote: > On Mar 31, 2009, at 5:25 PM, Kevin McManus wrote: > > >--- MCA component mtl:psm (m4 configuration macro) > >checking for MCA component mtl:psm compile mode... static > >checking --with-psm value... simple ok (unspecified) >

Re: [OMPI users] job runs with mpirun on a node but not if submitted via Torque.

2009-03-31 Thread Ralph Castain
It is very hard to debug the problem with so little information. We regularly run OMPI jobs on Torque without issue. Are you getting an allocation from somewhere for the nodes? If so, are you using Moab to get it? Do you have a $PBS_NODEFILE in your environment? I have no idea why your
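Ralph's checklist can be turned into a small in-job diagnostic. A sketch (the wording of the messages is mine; only $PBS_NODEFILE itself comes from the thread):

```shell
#!/bin/sh
# Quick Torque sanity check: is a node allocation visible to this shell?
if [ -n "$PBS_NODEFILE" ] && [ -r "$PBS_NODEFILE" ]; then
    echo "allocation found, slots per node:"
    # The nodefile lists one hostname per slot; count repeats per host.
    sort "$PBS_NODEFILE" | uniq -c
else
    echo "no PBS_NODEFILE - not inside a Torque job?"
fi
```

Run from inside a submitted job, the first branch should fire; if it doesn't, mpirun has no allocation to read and will behave as if launched standalone.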

Re: [OMPI users] job runs with mpirun on a node but not if submitted via Torque.

2009-03-31 Thread Rahul Nabar
2009/3/31 Ralph Castain : > It is very hard to debug the problem with so little information. We Thanks Ralph! I'm sorry my first post lacked specifics. I'll do my best to fill you guys in on as much debug info as I can. > regularly run OMPI jobs on Torque without issue.

Re: [OMPI users] job runs with mpirun on a node but not if submitted via Torque.

2009-03-31 Thread Rahul Nabar
2009/3/31 Ralph Castain : > > Information would be most helpful - the information we really need is > specified here: http://www.open-mpi.org/community/help/ Output of "ompi_info --all" is attached in a file. echo $LD_LIBRARY_PATH

Re: [OMPI users] job runs with mpirun on a node but not if submitted via Torque.

2009-03-31 Thread Rahul Nabar
2009/3/31 Ralph Castain : > It is very hard to debug the problem with so little information. We > regularly run OMPI jobs on Torque without issue. Another small thing that I noticed. Not sure if it is relevant. When the job starts running there is an orte process. The args to this

Re: [OMPI users] Strange behaviour of SGE+OpenMPI

2009-03-31 Thread PN
Dear Rolf, Thanks for your reply. I've created another PE and changed the submission script, explicitly specifying the hostname with "--host". However, the result is the same. # qconf -sp orte pe_name orte slots 8 user_lists NONE xuser_lists NONE