Karonis Nicholas wrote:
Jan,

I don't know of a way to do what you're suggesting
(i.e., prologue - run MPI job - epilogue) using
Globus alone.  There may be ways to do this with
higher-level "workflow" tools built atop Globus.

One simple solution that came to mind that might
work is to write a script that does the prologue,
then calls "mpirun" to run the MPI job, then does
the epilogue.

Sorry that I couldn't be of more help.

Nick,

Thanks for the reply. Your suggested solution won't work because $PBS_NODEFILE is not defined for Globus jobs (the dreaded use of rsh/ssh even for 'single' type jobs gets in the way). However, mpirun would need the parameter -machinefile $PBS_NODEFILE. Even if it worked, it would not be nice, because site-specific mpirun syntax would have to be hardcoded into job scripts.
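For reference, the wrapper Nick suggested would look roughly like this under PBS. This is only a sketch; setup.sh, cleanup.sh, my_mpi_app, and NPROCS are placeholder names, and the -machinefile argument is exactly the part that fails under Globus, since $PBS_NODEFILE is not set there:

```shell
#!/bin/sh
# Sketch of a prologue - mpirun - epilogue wrapper as a PBS job script.
# All file names below are placeholders.

./setup.sh                                          # prologue

# Under a plain PBS submission this works; under Globus-submitted
# jobs $PBS_NODEFILE is undefined, so mpirun has no host list.
mpirun -np "$NPROCS" -machinefile "$PBS_NODEFILE" ./my_mpi_app
status=$?

./cleanup.sh                                        # epilogue
exit $status
```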

I've been examining different options since my original posting. The general idea for the one I like most (so far) is sketched here: https://bi.offis.de/wisent/tiki-index.php?page=Condor-GT4-BigJobs

The trick is to submit an 'mpi' type job, but to use a special Perl script instead of the real MPI executable. It works fine, except in environments where mpirun force-kills the processes after MPI_Finalize (in such cases the cleanup phase won't run).
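To illustrate the idea (the actual Perl script on the linked page differs in detail; setup.sh, cleanup.sh, and real_mpi_app are placeholder names): mpirun starts one instance of the wrapper per process, and each instance brackets the real executable with the setup and cleanup phases:

```shell
#!/bin/sh
# Sketch of a wrapper submitted to Globus in place of the real
# MPI executable of an 'mpi' type job. Placeholder file names.

./setup.sh                 # setup phase
./real_mpi_app "$@"        # the actual MPI program
status=$?
./cleanup.sh               # cleanup phase; may never run if mpirun
                           # force-kills processes after MPI_Finalize
exit $status
```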

I guess I'm going to submit an enhancement request for a feature that would allow invoking a user-specified executable before and after mpirun in the Globus-generated PBS script. The same option would also be beneficial for jobs of type 'multiple'. Using "workflow" tools to achieve the desired effect sounds like overkill to me. From a user's perspective, the setup-MPI-cleanup sequence is really a single, atomic job, not three jobs that need to be coordinated.

Best regards,
Jan Ploski
