Hi All,
I am trying to submit a simple "hello" MPI job from one cluster to another with
the help of Globus through SGE. The RSL job description file is:
& (count=8)
(jobType=mpi)
(environment=
(GLOBUS_TCP_PORT_RANGE "3000,3090")
(GLOBUS_DUROC_SUBJOB_INDEX 0)
)
(directory=/home/osdd/)
(executable=/home/osdd/hello)
(stdout=hello-mpi.out)
(stderr=hello-mpi.err)
I submit the job with:
globusrun -r "cluster.hpc.org:/jobmanager-sge" -f hello-mpi.rsl
But the job seems to run on a single processor. The output is below:
-catch_rsh
/opt/gridengine/default/spool/compute-0-0/active_jobs/179.1/pe_hostfile
compute-0-0
compute-0-0
compute-0-1
compute-0-1
compute-0-1
compute-0-1
compute-0-1
compute-0-1
HELLO_WORLD - Master process:
C version
An MPI example program.
The number of processes is 1.
Process 0 says 'Hello, world!'
HELLO_WORLD - Master process:
Normal end of execution: 'Goodbye, world!'
Elapsed wall clock time = 0.000023 seconds.
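As a sanity check, the eight host lines printed in the output above can be
tallied per node. This is only a small sketch: the host list is copied from
that output, and count_slots is a hypothetical helper name, not part of SGE
or Globus.

```shell
#!/bin/bash
# Hypothetical helper: given one host name per line on stdin,
# print how many slots each host received.
count_slots() {
    sort | uniq -c | awk '{ print $2 ": " $1 }'
}

# Host list exactly as printed in the job output above.
hosts="compute-0-0
compute-0-0
compute-0-1
compute-0-1
compute-0-1
compute-0-1
compute-0-1
compute-0-1"

printf '%s\n' "$hosts" | count_slots
```

This gives 2 slots on compute-0-0 and 6 on compute-0-1, so 8 slots appear to
be granted even though the program reports only 1 process.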
From "qstat -f" it seems the job is distributed to compute-0-0 and compute-0-1
properly:
queuename qtype used/tot. load_avg arch states
----------------------------------------------------------------------------
[email protected] BIP 2/2 0.01 lx26-amd64
181 0.55500 scheduler_ osdd r 02/25/2009 19:10:02 2
----------------------------------------------------------------------------
[email protected] BIP 6/16 1.01 lx26-amd64
181 0.55500 scheduler_ osdd r 02/25/2009 19:10:02 6
If I instead submit the MPI job directly to SGE on the execution host, using
the submit script below, all 8 processes start:
#!/bin/bash
#$ -S /bin/bash
#
# set the P4_GLOBMEMSIZE
#$ -v P4_GLOBMEMSIZE=10000000
#
# Set the Parallel Environment and number of procs.
#$ -pe mpi 8
/usr/bin/mpirun -np $NSLOTS -machinefile /home/osdd/machines /home/osdd/hello > hello_output.txt
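For comparison, a variant of this script could build the machine file from the
hosts SGE actually grants, instead of using the fixed /home/osdd/machines. This
is only a sketch under assumptions: it assumes the parallel environment exports
$PE_HOSTFILE with lines of the form "host slots queue processor-range", and
pe_to_machinefile is a hypothetical helper name. The mpirun call is shown only
as a comment and has not been tested through Globus.

```shell
#!/bin/bash
# Hypothetical helper: expand an SGE PE hostfile
# ("host slots queue processor-range" per line) into a machine file
# with one host name per granted slot.
pe_to_machinefile() {
    awk '{ for (i = 0; i < $2; i++) print $1 }' "$1"
}

# Inside a job script this might be used as follows (assumption, not run here):
#   pe_to_machinefile "$PE_HOSTFILE" > "$TMPDIR/machines.$JOB_ID"
#   /usr/bin/mpirun -np "$NSLOTS" -machinefile "$TMPDIR/machines.$JOB_ID" /home/osdd/hello
```

With a hostfile granting 2 slots on compute-0-0 and 6 on compute-0-1, the
helper would emit 8 lines, one per process that mpirun should start.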
The typical output I get is the following:
Process 1 says 'Hello, world!'
Process 2 says 'Hello, world!'
Process 5 says 'Hello, world!'
Process 4 says 'Hello, world!'
Process 6 says 'Hello, world!'
Process 7 says 'Hello, world!'
Process 3 says 'Hello, world!'
HELLO_WORLD - Master process:
C version
An MPI example program.
The number of processes is 8.
Process 0 says 'Hello, world!'
HELLO_WORLD - Master process:
Normal end of execution: 'Goodbye, world!'
Elapsed wall clock time = 0.000082 seconds.
My execution cluster has two nodes: the first node, compute-0-0, has 2
processors and compute-0-1 has 16 processors.
Could you please help me allocate multiple processors to MPI jobs via SGE
using RSL scripts in a Globus environment?
Thanking you,
Best regards,
Soumyadeep