[ 
https://issues.apache.org/jira/browse/MYRIAD-13?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Santosh Marella updated MYRIAD-13:
----------------------------------
    Summary: High Availability for Myriad  (was: High Availability Mode for 
Myriad)

> High Availability for Myriad
> ----------------------------
>
>                 Key: MYRIAD-13
>                 URL: https://issues.apache.org/jira/browse/MYRIAD-13
>             Project: Myriad
>          Issue Type: Improvement
>            Reporter: Mohit Soni
>            Assignee: Swapnil Daingade
>             Fix For: Myriad 0.1.0
>
>
> When recovering from a failure, either a ResourceManager/Myriad JVM failure 
> (new process) or a driver crash (same process), Myriad should be able:
> 1. to reconstruct its existing state during recovery
> 2. to reconcile the TaskStatus of non-terminal tasks
> To achieve 1, Myriad needs to persist its state externally so that the state 
> outlives a Myriad process run. State can be stored either in ZooKeeper or in 
> the replicated log abstraction provided by Mesos. (Issue MYRIAD-15)
> To achieve 2, Myriad needs to leverage the reconciliation feature. Ben Mahler's 
> [document|https://gist.github.com/bmahler/18409fc4f052df43f403] on 
> reconciliation discusses an algorithm which frameworks can use to reconcile 
> tasks. This should be implemented and used until reconciliation is managed by 
> the SchedulerDriver itself. (Issue MYRIAD-16)
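
For illustration only, here is a rough sketch of what a framework-side reconciliation loop for point 2 might look like: record when each status update arrives, ask for reconciliation of the tasks that have not reported since the loop started, back off, and repeat. Everything in it is an assumption made for the example rather than Myriad code or the exact algorithm from the linked document: the class name ReconciliationSketch, the onStatusUpdate hook, the injected clock and sleeper, and the 1 s base / 30 s cap backoff values are all hypothetical. In a real scheduler the requestReconciliation callback would presumably wrap a call such as Mesos's SchedulerDriver.reconcileTasks.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;
import java.util.function.LongConsumer;
import java.util.function.LongSupplier;

/** Illustrative only: every name here is hypothetical, not a Myriad or Mesos API. */
final class ReconciliationSketch {
    // Task id -> time (ms) at which the most recent status update arrived.
    private final Map<String, Long> lastUpdateMillis = new HashMap<>();
    private final LongSupplier clock;   // injected so the loop can be tested without real time
    private final LongConsumer sleeper; // injected so the loop can be tested without real sleeps

    ReconciliationSketch(LongSupplier clock, LongConsumer sleeper) {
        this.clock = clock;
        this.sleeper = sleeper;
    }

    /** Intended to be called from the scheduler's status-update callback. */
    synchronized void onStatusUpdate(String taskId) {
        lastUpdateMillis.put(taskId, clock.getAsLong());
    }

    /** Truncated exponential backoff: base * 2^attempt, capped at capMillis. */
    static long backoffMillis(int attempt, long baseMillis, long capMillis) {
        int shift = Math.min(attempt, 20); // bound the shift to avoid overflow
        return Math.min(baseMillis << shift, capMillis);
    }

    /**
     * Repeatedly requests reconciliation for tasks that have not reported a
     * status since this call started. Returns the ids still unconfirmed after
     * maxAttempts rounds (empty if every task reported).
     */
    Set<String> reconcile(Set<String> nonTerminalTaskIds,
                          Consumer<Set<String>> requestReconciliation,
                          int maxAttempts) {
        long start = clock.getAsLong();
        Set<String> remaining = new HashSet<>(nonTerminalTaskIds);
        for (int attempt = 0; attempt < maxAttempts && !remaining.isEmpty(); attempt++) {
            requestReconciliation.accept(new HashSet<>(remaining));
            sleeper.accept(backoffMillis(attempt, 1000L, 30000L)); // wait for updates
            remaining = stillUnconfirmed(remaining, start);
        }
        return remaining;
    }

    // Keeps only the tasks whose last update (if any) predates the start time.
    private synchronized Set<String> stillUnconfirmed(Set<String> candidates, long start) {
        Set<String> out = new HashSet<>();
        for (String id : candidates) {
            Long seen = lastUpdateMillis.get(id);
            if (seen == null || seen < start) {
                out.add(id);
            }
        }
        return out;
    }
}
```

Tasks that are still unconfirmed when the loop gives up would need a policy decision (for example, treating them as lost), which this sketch deliberately leaves to the caller.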



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
