On Mar 18, 2009, at 9:56 PM, tracy_luofengji wrote:
Dear all,
After reading Nicholas Karonis's paper "MPICH-G2: A Grid-Enabled
Implementation of the Message Passing Interface" and the MPICH-G2
web page, I want to ask two questions in order to understand MPICH-G2
better.
1. As the paper and the web page say, MPICH-G2 uses the vendor-
supplied MPI implementation to perform intra-machine communication,
where "vendor-supplied MPI" means the MPI implementation that
already exists on the cluster and is not MPICH-based. But what
should I do if my cluster already has MPICH installed? In that case,
how does MPICH-G2 perform intra-machine communication? And if I have
already configured my cluster with plain MPICH, should I remove that
MPICH installation and install MPICH-G2 on the head node instead?
The early versions of MPICH-G2 could not be configured with
an MPI flavor of the GT library that was, in turn, built with
an MPICH-based MPI. However, as of MPICH-G2 v1.2.5.1 (which was
probably released after the article was published) that restriction
was removed.
So, you should be able to take the MPICH-based vendor-MPI on your
cluster, use it to build an MPI flavor of the Globus libraries,
and then use that MPI flavor of the Globus libraries to configure
and build MPICH-G2. In this setting:
(1) the Globus Job Manager script that runs on that cluster
will have to be modified to use the 'mpirun' that comes
with the vendor-supplied MPI (note, NOT MPICH-G2's mpirun)
when the subjob in the RSL to run on that cluster specifies
(jobtype=mpi),
(2) all RSL subjobs that run on that cluster should specify
(jobtype=mpi), and
(3) when doing (1)+(2) above all intra-cluster messages will
be done over the vendor-supplied MPI.
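As a sketch of what (1)+(2) look like in practice, an RSL subjob
for that cluster might read as follows (the hostname, paths, and
executable name are placeholders, not taken from any real site):

```
+
( &(resourceManagerContact="cluster.example.edu/jobmanager-pbs")
   (count=10)
   (jobtype=mpi)
   (label="subjob 0")
   (environment=(GLOBUS_DUROC_SUBJOB_INDEX 0))
   (executable="/home/user/app")
)
```

The key line is (jobtype=mpi): it is what tells the (modified)
Job Manager script on that cluster to launch the subjob with the
vendor-supplied mpirun.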
Note also, if you don't have a vendor-supplied MPI on your cluster
or if you don't want to use the one that's there, you can always
build a non-MPI flavor of the Globus libraries, configure and
build MPICH-G2 atop those Globus libraries, and run on the
cluster that way. In that case you do not need to modify
the Globus Job Manager, you do not specify (jobtype=mpi)
in your RSL subjob for that cluster, and all intra-cluster
messaging will be done via TCP/IP.
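By contrast, in this non-MPI-flavor setup the subjob for the same
(hypothetical) cluster simply omits (jobtype=mpi):

```
+
( &(resourceManagerContact="cluster.example.edu/jobmanager-pbs")
   (count=10)
   (label="subjob 0")
   (environment=(GLOBUS_DUROC_SUBJOB_INDEX 0))
   (executable="/home/user/app")
)
```

The unmodified Job Manager then starts the processes directly, and
MPICH-G2 carries all messaging, including intra-cluster messages,
over TCP/IP.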
2. Nick said MPICH-G2 works based on the infrastructure we
already have that controls access to our cluster. I understand that,
but I am still a little confused about how Globus submits MPI jobs
to the local scheduler (such as PBS). When my cluster is managed by
PBS and has plain MPICH installed, I submit MPI jobs to PBS using
something like "mpirun -machinefile $PBS_NODEFILE -np 10 app". Now
suppose I remove the plain MPICH and install MPICH-G2; the machinefile
used by MPICH-G2 then no longer lists the nodes of my cluster, but
the addresses of compute resources in the grid. In this case, how
does the PBS jobmanager submit MPI jobs to PBS?
Yes, this can be confusing. It all works by modifying the Globus
PBS Job Manager script to detect when (jobtype=mpi) is in the
Globus RSL subjob, and when it is, to call the vendor-supplied
MPICH mpirun (as described above). It is also important for
MPICH-G2 that the env vars (those specified in the RSL and those
in the user's environment, e.g., .cshrc) all get propagated to
the running app. This, too, often requires some hacking of the
Globus Job Manager script.
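As a rough illustration only (the real Globus PBS Job Manager is a
Perl script; the variable names and structure below are invented for
clarity), the decision being described amounts to:

```shell
# Hypothetical sketch: choose the launch command the way a modified
# PBS Job Manager script would, based on the RSL subjob's jobtype.
JOBTYPE="mpi"   # would be parsed from the RSL subjob
NP=10           # would come from the RSL (count=...) attribute
APP="./app"     # would come from the RSL (executable=...) attribute

if [ "$JOBTYPE" = "mpi" ]; then
    # (jobtype=mpi): launch via the vendor-supplied MPICH mpirun,
    # over the node list PBS hands the job in $PBS_NODEFILE.
    LAUNCH_CMD="mpirun -machinefile \$PBS_NODEFILE -np $NP $APP"
else
    # No (jobtype=mpi): start plain processes; intra-cluster
    # messaging then goes over TCP/IP via MPICH-G2 itself.
    LAUNCH_CMD="$APP"
fi
echo "$LAUNCH_CMD"
```

The real script would also arrange for the RSL and user environment
variables to reach the launched processes, as noted above.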
There are folks in the Globus community (developers and users) who
know how to hack the Job Manager scripts as described above.
They might be willing to share their hacks ;-). I think, for
example, certain TeraGrid sites might have these hacks in place
for PBS (you might try [email protected]).
Nick
Any help will be appreciated!
Thanks,
Tracy