[ https://issues.apache.org/jira/browse/STORM-44?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rick Kellogg updated STORM-44:
------------------------------
Component/s: storm-core
> Replication
> -----------
>
> Key: STORM-44
> URL: https://issues.apache.org/jira/browse/STORM-44
> Project: Apache Storm
> Issue Type: Wish
> Components: storm-core
> Reporter: James Xu
>
> https://github.com/nathanmarz/storm/issues/132
> This is an idea to replicate a computation across many tasks on different
> machines. The "replication" part is already possible since you can implement
> your own grouping which sends the tuple to multiple tasks. What is needed is
> help from Nimbus to make sure those tasks run on different machines.
> Replicated computation would be useful for doing things like highly available
> DRPC. Essentially you do the same DRPC multiple times and at the end pick the
> first one that finishes for the result.
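The grouping idea described above can be sketched roughly as follows. This is a minimal, standalone illustration and not Storm code: the class name ReplicatingGrouping and the replicas parameter are hypothetical, and the two methods only loosely mirror the prepare/chooseTasks shape of Storm's CustomStreamGrouping interface without implementing it. It shows only the "send each tuple to several tasks" part; it does nothing about placing those tasks on different machines, which is the help this issue asks of Nimbus.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical sketch: picks several distinct target tasks for every tuple.
 * It mimics the shape of a custom stream grouping but has no Storm dependency.
 */
final class ReplicatingGrouping {
    private final int replicas;                       // assumed replication factor
    private List<Integer> targetTasks = new ArrayList<>();

    ReplicatingGrouping(int replicas) {
        if (replicas < 1) {
            throw new IllegalArgumentException("replicas must be >= 1");
        }
        this.replicas = replicas;
    }

    /** Remembers the ids of the downstream tasks that may receive tuples. */
    void prepare(List<Integer> targetTasks) {
        this.targetTasks = new ArrayList<>(targetTasks);
    }

    /**
     * Returns up to `replicas` distinct task ids for the given tuple values.
     * The same values always map to the same set of tasks.
     */
    List<Integer> chooseTasks(List<Object> values) {
        List<Integer> chosen = new ArrayList<>();
        int n = targetTasks.size();
        if (n == 0) {
            return chosen;
        }
        int count = Math.min(replicas, n);
        // floorMod keeps the start index non-negative for negative hash codes.
        int start = Math.floorMod(values.hashCode(), n);
        for (int i = 0; i < count; i++) {
            chosen.add(targetTasks.get((start + i) % n));
        }
        return chosen;
    }

    public static void main(String[] args) {
        ReplicatingGrouping grouping = new ReplicatingGrouping(2);
        grouping.prepare(Arrays.asList(10, 11, 12, 13));
        List<Object> tuple = Arrays.<Object>asList("drpc-request-1");
        // Prints the two task ids that would both receive this tuple.
        System.out.println(grouping.chooseTasks(tuple));
    }
}
```

Choosing consecutive entries from the task list guarantees the replicas are distinct tasks, but two distinct tasks may still share a worker or a host, which is exactly the gap described in the issue.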
> -----------------------------------------------------------------------------------------------------
> LiSu: I am trying to implement this replication feature. My idea is to have
> replica tasks for each primary task. The replica and the primary receive the
> same input and perform the same execution. While the primary is running, the
> output of the replica is blocked. After the primary fails, the output of the
> replica is used instead. What I am doing now is setting a switch in the
> ComponentCommon (or the topology context for each task) to control the
> blocking. When Nimbus sees that the primary's heartbeat has stopped, the
> state of the switch is changed. But my problem now is that a change to the
> ComponentCommon on Nimbus is not reflected in the workers. Does anyone have
> an idea how the state of the switch could be announced to the tasks in time?
> -----------------------------------------------------------------------------------------------------
> nathanmarz: Trying to coordinate the replicas with each other like this is a
> dead-end. The point of this feature is that if a replica suddenly dies,
> there's no loss of availability because the computation is happening anyway
> on another task. Obviously, replicated tasks require there to be no side
> effects.
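To illustrate the uncoordinated approach in the thread above, here is a minimal sketch in plain Java rather than Storm. It assumes each replica can be modeled as a side-effect-free Callable; the names FirstResultWins, callReplicated and square are hypothetical. The caller simply takes whichever replica finishes successfully first, so a replica that dies costs nothing as long as another one completes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch: run the same pure computation on every replica. */
final class FirstResultWins {

    /**
     * Runs all replicas concurrently and returns the first successful result.
     * Replicas never coordinate with each other; failed ones are ignored
     * unless every replica fails, in which case ExecutionException is thrown.
     */
    static <T> T callReplicated(List<Callable<T>> replicas)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(replicas.size());
        try {
            // invokeAny returns the result of one task that completed
            // successfully and cancels the tasks still running.
            return pool.invokeAny(replicas);
        } finally {
            pool.shutdownNow();
        }
    }

    /** A pure function: safe to execute any number of times. */
    static int square(int x) {
        return x * x;
    }

    public static void main(String[] args) throws Exception {
        List<Callable<Integer>> replicas = new ArrayList<>();
        // Replica 1 dies immediately.
        replicas.add(() -> { throw new IllegalStateException("replica died"); });
        // Replica 2 is slow but healthy.
        replicas.add(() -> { Thread.sleep(200); return square(7); });
        // Replica 3 is fast and healthy.
        replicas.add(() -> square(7));

        System.out.println(callReplicated(replicas)); // prints 49
    }
}
```

Because every healthy replica computes the same value, it does not matter which one wins. That only holds when the computation has no side effects, as noted in the last comment of the thread.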