Hi Jeff,
Yes, I've installed LSF, and liblsf and libbat are found by configure,
as you can see in the previous attachment and here:
/opt/lsf/7.0/linux2.6-glibc2.3-x86/lib
-rw-r--r-- 1 root 10007 1771182 Sep 24 2008 libbat.a
-rw-r--r-- 1 root 10007 31278 Nov 23 2007 libbat.jsdl.a
-rwxr-xr-x 1
Hmm,
Open MPI functions like MPI_Irecv do pointer arithmetic over the recv
buffer using the type info in ompi_datatype_t, I suppose. I'm trying to
write a wrapper for MPI_Gather using Irecv calls:
int MPI_FT_Gather(void *sendbuf, int sendcount, MPI_Datatype sendtype,
void *recvbuff,
Thanks Massimo,
now it works well.
I had erroneously thought that Irecv did this automatically using the recvtype fields.
2009/3/31 Massimo Cafaro :
> Hi,
>
> let me say that it is still not clear to me why you want to reimplement the
> MPI_Gather supplied by an MPI
Hi,
unfortunately it's up to us to provide the starting address of the
buffer and the number of elements to be received multiplied by the
datatype extent.
This kind of thing is dealt with automatically in the internals of
collective communication operations.
Massimo
On 31 Mar 2009, at 14:00,
Dear all,
I'm using Open MPI 1.3.1 and SGE 6.2u2 on CentOS 5.2
I have 2 compute nodes for testing, each node has a single quad core CPU.
Here is my submission script and PE config:
$ cat hpl-8cpu.sge
#!/bin/bash
#
#$ -N HPL_8cpu_IB
#$ -pe mpi-fu 8
#$ -cwd
#$ -j y
#$ -S /bin/bash
#$ -V
#
cd
On 03/31/09 11:43, PN wrote:
Dear all,
I'm using Open MPI 1.3.1 and SGE 6.2u2 on CentOS 5.2
I have 2 compute nodes for testing, each node has a single quad core CPU.
Here is my submission script and PE config:
$ cat hpl-8cpu.sge
#!/bin/bash
#
#$ -N HPL_8cpu_IB
#$ -pe mpi-fu 8
#$ -cwd
#$ -j y
On Mar 31, 2009, at 1:31 PM, Terry Dontje wrote:
Can you manually run UNAME_REL=`(/bin/uname -X|grep Release|sed -e
's/.*= //')` in your shell without error?
Better would be to put this small script by itself:
#! /bin/sh
UNAME_REL=`(/bin/uname -X|grep Release|sed -e 's/.*= //')`
echo got
On Tue, Mar 31, 2009 at 01:37:22PM -0400, Jeff Squyres wrote:
> On Mar 31, 2009, at 1:31 PM, Terry Dontje wrote:
>
> >Can you manually run UNAME_REL=`(/bin/uname -X|grep Release|sed -e
> >'s/.*= //')` in your shell without error?
> >
>
> Better would be to put this small script by itself:
>
>
M C writes:
> --- MCA component crs:blcr (m4 configuration macro)
> checking for MCA component crs:blcr compile mode... dso
> checking --with-blcr value... sanity check ok (/opt/blcr)
> checking --with-blcr-libdir value... sanity check ok (/opt/blcr/lib)
> configure:
Rolf Vandevaart writes:
>> However, I found that if I explicitly specify the "-machinefile
>> $TMPDIR/machines", all 8 mpi processes were spawned within a single
>> node, i.e. node0002.
I had that sort of behaviour recently when the tight integration was
broken on the
I think that the missing configure option might be the problem as
well. The BLCR configure logic checks to see if you have enabled
checkpoint/restart in Open MPI. If you haven't then it fails out of
configure (probably should print a better error message - I'll put
that on my todo list).
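For reference, a configure invocation that enables checkpoint/restart alongside the BLCR paths quoted earlier in this thread might look roughly like the fragment below. The --with-ft=cr flag is the C/R enable switch in the 1.3 series; treat the exact flag set as an assumption and verify against ./configure --help for your version.

```shell
# Hypothetical configure line; the BLCR paths are the ones from the
# configure output quoted above.
./configure --with-ft=cr \
            --with-blcr=/opt/blcr \
            --with-blcr-libdir=/opt/blcr/lib
```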
On Tue, 31 Mar 2009, Jeff Squyres wrote:
UNAME_REL=`(/bin/uname -X|grep Release|sed -e 's/.*= //')`
Not sure what you want to achieve here... 'uname -X' is valid on
Solaris, but not on Linux. The OP has indicated already that he is
running this on Linux (SLES) so the above line is supposed
On Tue, Mar 31, 2009 at 10:11:17PM +0200, Bogdan Costescu wrote:
> On Tue, 31 Mar 2009, Bogdan Costescu wrote:
>
> >'uname -X' is valid on Solaris, but not on Linux.
>
> Not good to reply to oneself, but I've looked at the archives and
> realized that 'uname -X' comes from a message of the OP.
My goal in having you try that statement in a standalone shell script
wasn't the success or failure of the uname command -- but rather to
figure out if something in that statement itself was causing the
syntax error.
Apparently it is not. There's an errant character elsewhere that is
On 03/31/09 14:50, Dave Love wrote:
Rolf Vandevaart writes:
However, I found that if I explicitly specify the "-machinefile
$TMPDIR/machines", all 8 mpi processes were spawned within a single
node, i.e. node0002.
I had that sort of behaviour recently when the tight
On Tue, Mar 31, 2009 at 04:59:00PM -0400, Jeff Squyres wrote:
> My goal in having you try that statement in a standalone shell script
> wasn't the success or failure of the uname command -- but rather to
> figure out if something in that statement itself was causing the
> syntax error.
>
>
On Mar 31, 2009, at 5:25 PM, Kevin McManus wrote:
--- MCA component mtl:psm (m4 configuration macro)
checking for MCA component mtl:psm compile mode... static
checking --with-psm value... simple ok (unspecified)
checking --with-psm-libdir value... sanity check ok (/usr/lib64)
checking psm.h
On Tue, Mar 31, 2009 at 05:36:19PM -0400, Jeff Squyres wrote:
> On Mar 31, 2009, at 5:25 PM, Kevin McManus wrote:
>
> >--- MCA component mtl:psm (m4 configuration macro)
> >checking for MCA component mtl:psm compile mode... static
> >checking --with-psm value... simple ok (unspecified)
>
It is very hard to debug the problem with so little information. We
regularly run OMPI jobs on Torque without issue.
Are you getting an allocation from somewhere for the nodes? If so, are
you using Moab to get it? Do you have a $PBS_NODEFILE in your
environment?
I have no idea why your
2009/3/31 Ralph Castain :
> It is very hard to debug the problem with so little information. We
Thanks Ralph! I'm sorry my first post lacked enough specifics. I'll
try my best to fill you guys in on as much debug info as I can.
> regularly run OMPI jobs on Torque without issue.
2009/3/31 Ralph Castain :
>
> Information would be most helpful - the information we really need is
> specified here: http://www.open-mpi.org/community/help/
Output of "ompi_info --all" is attached in a file.
echo $LD_LIBRARY_PATH
2009/3/31 Ralph Castain :
> It is very hard to debug the problem with so little information. We
> regularly run OMPI jobs on Torque without issue.
Another small thing that I noticed. Not sure if it is relevant.
When the job starts running there is an orte process. The args to this
Dear Rolf,
Thanks for your reply.
I've created another PE and changed the submission script, explicitly
specify the hostname with "--host".
However the result is the same.
# qconf -sp orte
pe_name            orte
slots              8
user_lists         NONE
xuser_lists        NONE