[ https://issues.apache.org/jira/browse/STORM-24?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rick Kellogg updated STORM-24:
------------------------------
Component/s: storm-core
> Refactor internal routing to more efficiently send the same values to
> multiple tasks
> ------------------------------------------------------------------------------------
>
> Key: STORM-24
> URL: https://issues.apache.org/jira/browse/STORM-24
> Project: Apache Storm
> Issue Type: Improvement
> Components: storm-core
> Reporter: James Xu
> Labels: HighPriority
>
> https://github.com/nathanmarz/storm/issues/408
> Storm should be more efficient when sending the same payload to multiple
> tasks. Rather than creating a separate tuple for each target task, the
> internal routing should send [list of task ids, payload] to the target worker
> as one message, and the recipient will then turn that into a tuple for each
> task in the worker.
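
As a rough illustration of the grouping described above, the following Java sketch collects the target task ids by worker and builds one [list of task ids, payload] message per worker. The names WorkerBatching, BatchedMessage, and groupByWorker, and the task-to-worker map keyed by a worker string, are hypothetical and are not Storm's actual internals.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Storm: batches one payload per target worker.
final class WorkerBatching {

    /** One message for a single worker: every local target task plus the shared payload. */
    static final class BatchedMessage {
        public final List<Integer> taskIds;
        public final List<Object> values;

        BatchedMessage(List<Integer> taskIds, List<Object> values) {
            this.taskIds = taskIds;
            this.values = values;
        }
    }

    /**
     * Groups the target tasks by the worker that hosts them and returns one
     * message per worker. The payload list is shared by reference, not copied.
     */
    static Map<String, BatchedMessage> groupByWorker(List<Integer> targetTasks,
                                                     Map<Integer, String> taskToWorker,
                                                     List<Object> values) {
        // LinkedHashMap keeps workers in first-seen order, which makes output predictable.
        Map<String, List<Integer>> tasksPerWorker = new LinkedHashMap<>();
        for (Integer task : targetTasks) {
            String worker = taskToWorker.get(task);
            if (worker == null) {
                throw new IllegalArgumentException("No worker assigned for task " + task);
            }
            tasksPerWorker.computeIfAbsent(worker, w -> new ArrayList<>()).add(task);
        }

        Map<String, BatchedMessage> messages = new LinkedHashMap<>();
        for (Map.Entry<String, List<Integer>> entry : tasksPerWorker.entrySet()) {
            messages.put(entry.getKey(), new BatchedMessage(entry.getValue(), values));
        }
        return messages;
    }
}
```

With tasks 1 and 3 on one worker and task 2 on another, groupByWorker yields two messages rather than three tuples, so a large payload would cross the wire once per worker instead of once per task.
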
> This issue is a prerequisite for having a "stats" stream (for use in
> dynamically adjusting tasks), as the stats payload is fairly large.
> This issue comprises the following pieces:
> * Internal routing changes from [task id, tuple] to [list of task ids,
>   tuple values, list of message ids].
> * The transfer thread turns [list of task ids, tuple values, list of message
>   ids] into as few messages as possible.
> * The routing thread needs similar modifications to the transfer thread (the
>   two should probably share code).
> * The receiver transforms [list of task ids, tuple values, list of message
>   ids] into a tuple for every task.
> * Serialization code needs to be refactored to understand this new format.
> * Tuples aren't created outright; they are created later, once the message
>   reaches the destination worker (because message ids and tuple payload need
>   to be kept separate).
> * Another emitDirect that takes in a list of task ids.
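
The receiver step described above could look roughly like the following Java sketch. It is only an illustration: ReceiverExpansion, LocalTuple, and expand are hypothetical names, and the code assumes the message id list is parallel to the task id list (one id per target task), which the issue does not specify.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical receiver-side helper, not part of Storm.
final class ReceiverExpansion {

    /** A per-task tuple built only after the message reaches the destination worker. */
    static final class LocalTuple {
        public final int taskId;
        public final List<Object> values;
        public final long messageId;

        LocalTuple(int taskId, List<Object> values, long messageId) {
            this.taskId = taskId;
            this.values = values;
            this.messageId = messageId;
        }
    }

    /**
     * Turns [list of task ids, tuple values, list of message ids] into one
     * tuple per task. Assumption: messageIds.get(i) belongs to taskIds.get(i).
     */
    static List<LocalTuple> expand(List<Integer> taskIds,
                                   List<Object> values,
                                   List<Long> messageIds) {
        if (taskIds.size() != messageIds.size()) {
            throw new IllegalArgumentException("task ids and message ids must have the same length");
        }
        List<LocalTuple> tuples = new ArrayList<>(taskIds.size());
        for (int i = 0; i < taskIds.size(); i++) {
            // Every tuple references the same values list; only ids differ.
            tuples.add(new LocalTuple(taskIds.get(i), values, messageIds.get(i)));
        }
        return tuples;
    }
}
```

Keeping the values list separate from the ids until this point is what lets the payload be serialized once per worker, in line with the note that message ids and tuple payload need to be kept separate.
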