[ https://issues.apache.org/jira/browse/MYRIAD-13?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Swapnil Daingade reassigned MYRIAD-13:
--------------------------------------

    Assignee: Swapnil Daingade

> High Availability Mode for Myriad
> ---------------------------------
>
>                 Key: MYRIAD-13
>                 URL: https://issues.apache.org/jira/browse/MYRIAD-13
>             Project: Myriad
>          Issue Type: Improvement
>            Reporter: Mohit Soni
>            Assignee: Swapnil Daingade
>
> When recovering from a failure, either a ResourceManager/Myriad JVM failure
> (new process) or a driver crash (same process), Myriad should be able:
> 1. to reconstruct its existing state during recovery
> 2. to reconcile the TaskStatus of non-terminal tasks
> To achieve 1, Myriad needs to persist its state externally so that the state
> outlives a Myriad process run. State can be stored either in ZooKeeper or in
> the replicated log abstraction provided by Mesos. (Issue MYRIAD-15)
> To achieve 2, Myriad needs to leverage the reconciliation feature. Ben Mahler's
> [document|https://gist.github.com/bmahler/18409fc4f052df43f403] on
> reconciliation discusses an algorithm that frameworks can use to reconcile
> tasks. This should be implemented and used until reconciliation is managed by
> the SchedulerDriver itself. (Issue MYRIAD-16)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
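
For illustration only, requirement 2 (reconciling the TaskStatus of non-terminal tasks) could be sketched roughly as below. This is not Myriad code and not a transcription of the linked document; it is a minimal retry-with-backoff loop under stated assumptions. The names `reconcile_until_settled`, `request_reconciliation`, and `TERMINAL_STATES` are hypothetical, and the callback merely stands in for whatever call the framework makes to the Mesos driver to request status updates.

```python
import time
from typing import Callable, Dict, Iterable, Set

# Assumed set of terminal task states; adjust to the Mesos version in use.
TERMINAL_STATES = {"TASK_FINISHED", "TASK_FAILED", "TASK_KILLED", "TASK_LOST"}


def reconcile_until_settled(
    task_states: Dict[str, str],
    request_reconciliation: Callable[[Set[str]], Iterable[str]],
    max_rounds: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Set[str]:
    """Return the ids of tasks that never reported a status.

    task_states: task id -> last known state (e.g. restored from persisted state).
    request_reconciliation: hypothetical callback; given the ids still pending,
        it asks for their status and returns the ids that answered.
    """
    # Only non-terminal tasks need to be reconciled.
    remaining = {tid for tid, state in task_states.items()
                 if state not in TERMINAL_STATES}

    for attempt in range(max_rounds):
        if not remaining:
            break
        # Ask again only for tasks that have not answered yet.
        updated = set(request_reconciliation(set(remaining)))
        remaining -= updated
        # Truncated exponential backoff before the next round, if there is one.
        if remaining and attempt < max_rounds - 1:
            sleep(min(base_delay * (2 ** attempt), max_delay))

    return remaining
```

Tasks still in the returned set after the last round would need a policy decision (for example, treating them as lost); the sketch deliberately leaves that open.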