There is a 'self' checkpointer (CRS component) that does application-level 
checkpointing, exposed at the MPI level. I don't know how different it is from 
what you are working on, but maybe something like that could be harnessed. Note 
that I have not tested the 'self' checkpointer with the process migration 
support; it -should- work, but there might be some bugs to work out.

Documentation and examples at the link below:
  http://osl.iu.edu/research/ft/ompi-cr/examples.php#example-self

-- Josh

On Aug 26, 2011, at 6:17 PM, Ralph Castain wrote:

> FWIW: I'm in the process of porting some code from a branch that allows apps 
> to do on-demand checkpoint/recovery style operations at the app level. 
> Specifically, it provides the ability to:
> 
> * request a "recovery image" - an application-level blob containing state 
> info required for the app to recover its state.
> 
> * register a callback point for providing a "recovery image", either to store 
> for later use (separate API is used to indicate when to acquire it) or to 
> provide to another process upon request
> 
> This is at the RTE level, so it would have to be exposed via an appropriate 
> MPI call for use at that layer (I'm open to changes to support that use, if 
> someone is interested).
> 
> 
> On Aug 26, 2011, at 3:16 PM, Josh Hursey wrote:
> 
>> There are some great comments in this thread. Process migration (like
>> many topics in systems) can get complex fast.
>> 
>> The Open MPI process migration implementation is checkpoint/restart
>> based (currently using BLCR), and uses an 'eager' style of migration.
>> This style of migration stops a process completely on the source
>> machine, checkpoints/terminates it, restarts it on the destination
>> machine, then rejoins it with the other running processes. I think the
>> only documentation that we have is at the webpage below (and my PhD
>> thesis, if you want the finer details):
>> http://osl.iu.edu/research/ft/ompi-cr/
>> 
>> We have wanted to experiment with a 'pre-copy' or 'live' migration
>> style, but have not had the necessary support from the underlying
>> checkpointer or time to devote to making it happen. I think BLCR is
>> working on including the necessary pieces in a future release (there
>> are papers where a development version of BLCR has done this with
>> LAM/MPI). So that might be something of interest.
>> 
>> Process migration techniques can benefit from fault prediction and
>> 'good' target destination selection. Fault prediction allows us to
>> move processes away from soon-to-fail locations, but it can be
>> difficult to accurately predict failures. Open MPI has some hooks in
>> the runtime layer that support 'sensors' which might help here. Good
>> target destination selection is equally complex, but the idea here is
>> to move processes to a machine where they can continue supporting the
>> efficient execution of the application. So this might mean moving to
>> the least loaded machine, or moving to a machine with other processes
>> to reduce interprocess communication (something like dynamic load
>> balancing).
>> 
>> So there are some ideas to get you started.
>> 
>> -- Josh
>> 
>> On Thu, Aug 25, 2011 at 12:06 PM, Rayson Ho <raysonlo...@gmail.com> wrote:
>>> Don't know which SSI project you are referring to... I only know the
>>> OpenSSI project, and I was one of the first who subscribed to its
>>> mailing list (since 2001).
>>> 
>>> http://openssi.org/cgi-bin/view?page=openssi.html
>>> 
>>> I don't think those OpenSSI clusters are designed for tens of
>>> thousands of nodes, and I'm not sure they scale well to even a thousand
>>> nodes -- so IMO they have limited use for HPC clusters.
>>> 
>>> Rayson
>>> 
>>> 
>>> 
>>> On Thu, Aug 25, 2011 at 11:45 AM, Durga Choudhury <dpcho...@gmail.com> 
>>> wrote:
>>>> Also, in 2005 there was an attempt to implement SSI (Single System
>>>> Image) functionality in the then-current 2.6.10 kernel. The proposal
>>>> was very detailed and covered most of the bases of task creation, PID
>>>> allocation, etc., across a loosely tied cluster (without using fancy
>>>> hardware such as an RDMA fabric). Does anybody know if it was ever
>>>> implemented? Any pointers in this direction?
>>>> 
>>>> Thanks and regards
>>>> Durga
>>>> 
>>>> 
>>>> On Thu, Aug 25, 2011 at 11:08 AM, Rayson Ho <raysonlo...@gmail.com> wrote:
>>>>> Srinivas,
>>>>> 
>>>>> There's also Kernel-Level Checkpointing vs. User-Level Checkpointing -
>>>>> if you can checkpoint an MPI task and restart it on a new node, then
>>>>> this is also "process migration".
>>>>> 
>>>>> Of course, doing a checkpoint & restart can be slower than pure
>>>>> in-kernel process migration, but the advantage is that you don't need
>>>>> any kernel support, and can in fact do all of it in user-space.
>>>>> 
>>>>> Rayson
>>>>> 
>>>>> 
>>>>> On Thu, Aug 25, 2011 at 10:26 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>> It also depends on what part of migration interests you - are you 
>>>>>> wanting to look at the MPI part of the problem (reconnecting MPI 
>>>>>> transports, ensuring messages are not lost, etc.) or the RTE part of the 
>>>>>> problem (where to restart processes, detecting failures, etc.)?
>>>>>> 
>>>>>> 
>>>>>> On Aug 24, 2011, at 7:04 AM, Jeff Squyres wrote:
>>>>>> 
>>>>>>> Be aware that process migration is a pretty complex issue.
>>>>>>> 
>>>>>>> Josh is probably the best one to answer your question directly, but 
>>>>>>> he's out today.
>>>>>>> 
>>>>>>> 
>>>>>>> On Aug 24, 2011, at 5:45 AM, srinivas kundaram wrote:
>>>>>>> 
>>>>>>>> I am a final-year grad student looking for my final-year project in 
>>>>>>>> Open MPI. We are a group of 4 students.
>>>>>>>> I wanted to know about the "process migration" of MPI 
>>>>>>>> processes in Open MPI.
>>>>>>>> Can anyone suggest ideas for a project related to process 
>>>>>>>> migration in Open MPI, or for other topics in systems?
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> regards,
>>>>>>>> Srinivas Kundaram
>>>>>>>> srinu1...@gmail.com
>>>>>>>> +91-8149399160
>>>>>>>> _______________________________________________
>>>>>>>> users mailing list
>>>>>>>> us...@open-mpi.org
>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Jeff Squyres
>>>>>>> jsquy...@cisco.com
>>>>>>> For corporate legal information go to:
>>>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Rayson
>>>>> 
>>>>> ==================================================
>>>>> Open Grid Scheduler - The Official Open Source Grid Engine
>>>>> http://gridscheduler.sourceforge.net/
>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> 
>> 
>> -- 
>> Joshua Hursey
>> Postdoctoral Research Associate
>> Oak Ridge National Laboratory
>> http://users.nccs.gov/~jjhursey
>> 
> 
> 

