On Dec 11, 2008, at 2:55 PM, Terry Dontje wrote:

Well, SGE lets you have it send mpirun a SIGUSR1 some number of minutes before it sends the Suspend signal.


My point is that the right approach might be to work in the context of Josh's CR stuff -- he's already got hooks for "do this right before pausing for checkpoint" / "do this right after resuming", etc.

Sure, we're not checkpointing, but several of the characteristics of this action are pretty similar to what is required for checkpointing/restarting. So it might be good to use that framework for it...?
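
To make the idea concrete, here is a rough sketch of the general pattern: catch the early-warning SIGUSR1, run a "right before pausing" hook, and run a "right after resuming" hook when SIGCONT arrives. This is only an illustration in Python, not Open MPI's CR framework or its API; the names pre_suspend, post_resume, and events are hypothetical, and it assumes a POSIX system where the resume is delivered as SIGCONT.

```python
import signal

# Record of hook invocations, kept only so the sketch is observable.
events = []

def pre_suspend():
    # Hypothetical hook: quiesce communication, flush buffers, etc.
    events.append("pre_suspend")

def post_resume():
    # Hypothetical hook: re-establish whatever was torn down before the pause.
    events.append("post_resume")

def on_sigusr1(signum, frame):
    # Assumption: the batch system sends SIGUSR1 ahead of the suspend.
    pre_suspend()

def on_sigcont(signum, frame):
    # Assumption: the process is resumed with SIGCONT, which can be caught.
    post_resume()

signal.signal(signal.SIGUSR1, on_sigusr1)
signal.signal(signal.SIGCONT, on_sigcont)
```

The point of routing both signals through named hooks is that the same two entry points could serve a checkpoint/restart path as well as a suspend/resume path.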

--
Jeff Squyres
Cisco Systems
