[ https://issues.apache.org/jira/browse/STORM-44?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rick Kellogg updated STORM-44:
------------------------------
    Component/s: storm-core

> Replication
> -----------
>
>                 Key: STORM-44
>                 URL: https://issues.apache.org/jira/browse/STORM-44
>             Project: Apache Storm
>          Issue Type: Wish
>          Components: storm-core
>            Reporter: James Xu
>
> https://github.com/nathanmarz/storm/issues/132
> This is an idea to replicate a computation across many tasks on different 
> machines. The "replication" part is already possible since you can implement 
> your own grouping which sends the tuple to multiple tasks. What is needed is 
> help from Nimbus to make sure those tasks run on different machines.
> Replicated computation would be useful for doing things like highly available 
> DRPC. Essentially you do the same DRPC multiple times and at the end pick the 
> first one that finishes for the result.
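The "send the tuple to multiple tasks" part of the description can be sketched without any Storm dependency. The class and method names below (ReplicatingChooser, chooseTasks) are hypothetical and only illustrate the selection logic that a custom grouping would need; this is not Storm's API. The sketch assumes the tuple carries some key whose hash picks the first target task. It does nothing about placing those tasks on different machines, which is the part the issue says needs help from Nimbus.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: picks `replicas` distinct target tasks for every tuple key. */
public class ReplicatingChooser {
    private final List<Integer> targetTasks;
    private final int replicas;

    public ReplicatingChooser(List<Integer> targetTasks, int replicas) {
        if (replicas < 1 || replicas > targetTasks.size()) {
            throw new IllegalArgumentException(
                "replicas must be between 1 and the number of target tasks");
        }
        this.targetTasks = new ArrayList<>(targetTasks);
        this.replicas = replicas;
    }

    /** Returns `replicas` consecutive task ids, starting at a slot derived from the key's hash. */
    public List<Integer> chooseTasks(Object key) {
        int n = targetTasks.size();
        // floorMod keeps the index non-negative even for negative hash codes.
        int start = Math.floorMod(key.hashCode(), n);
        List<Integer> chosen = new ArrayList<>(replicas);
        for (int i = 0; i < replicas; i++) {
            // Wrap around so the chosen tasks are always distinct.
            chosen.add(targetTasks.get((start + i) % n));
        }
        return chosen;
    }
}
```

Because the choice depends only on the key, the same request always reaches the same set of replica tasks, and each replica can process it independently.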
> -----------------------------------------------------------------------------------------------------
> LiSu: I am trying to implement this replication feature. My idea is to have 
> replica tasks for each primary task. The replica and the primary receive 
> the same input and perform the same execution. While the primary is 
> running, the output of the replica is blocked. If the primary fails, 
> the output of the replica is used instead. What I am doing now is setting 
> a switch in the ComponentCommon (or the topology context of each task) to 
> control the blocking. When Nimbus sees that the primary's heartbeat has 
> stopped, the state of the switch is changed. My problem now is that a 
> change to the ComponentCommon on Nimbus is not reflected in the 
> workers. Does anyone have an idea how the state of the switch could be 
> announced to the tasks in time?
> -----------------------------------------------------------------------------------------------------
> nathanmarz: Trying to coordinate the replicas with each other like this is a 
> dead-end. The point of this feature is that if a replica suddenly dies, 
> there's no loss of availability because the computation is happening anyway 
> on another task. Obviously, replicated tasks require there to be no side 
> effects.
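
The uncoordinated approach described in the last comment (run the same side-effect-free computation on every replica and take the first result that finishes) can be sketched in plain Java. This is only an in-process illustration under stated assumptions: each replica is modelled as a Callable, and FirstResultWins and callReplicated are hypothetical names, not part of Storm or its DRPC client. It relies on the standard ExecutorService.invokeAny, which returns the result of one task that completed without throwing.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch: first successful replica wins, no coordination between replicas. */
public class FirstResultWins {
    /**
     * Submits the same request to every replica and returns the first result
     * that completes successfully. Replicas must have no side effects, since
     * all of them may run.
     */
    public static <T> T callReplicated(List<Callable<T>> replicas) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(replicas.size());
        try {
            // invokeAny ignores replicas that throw as long as at least one succeeds.
            return pool.invokeAny(replicas);
        } finally {
            // Interrupt any replicas that are still running.
            pool.shutdownNow();
        }
    }
}
```

If one replica throws (standing in for a task that died), the call still returns the result from another replica, which is the availability property the comment describes. Only when every replica fails does invokeAny throw an ExecutionException.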



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
