Siwoon Son created STORM-2539:
---------------------------------
Summary: Locality aware grouping, which is a new grouping method
considering locality
Key: STORM-2539
URL: https://issues.apache.org/jira/browse/STORM-2539
Project: Apache Storm
Issue Type: New Feature
Components: storm-core
Reporter: Siwoon Son
Attachments: receiver-imbalance.png, sender-imbalance.png
I’d like to propose a new grouping method. This method solves problems that can
occur with existing _Shuffle grouping_ method and _Local-or-shuffle_ method.
I was motivated by the following a question.
bq. "Would not it be more efficient to transfer tuples to nearby nodes?"
Among the Storm's grouping methods, _Shuffle grouping_ is a method of randomly
selecting a task of the next node. In this method, all nodes can receive data
evenly, but the same amount of data is transferred to a relatively distant
node, which can cause a high delay. To solve this problem, Storm provides
Local-or-shuffle grouping considering locality.
_Local-or-shuffle grouping_ can minimize the delay by internally processing
data in the process if the task receiving the data is in the same process.
Local-or-shuffle grouping, however, considers *only locality*, which may leads
to two load balancing problems: +the sender imbalance+ and +the receiver
imbalance+. First, the sender imbalance problem is a load balancing problem in
which traffic is concentrated on a specific task because the number of senders’
tasks and the number of nodes are not equal. Next, the receiver imbalance
problem is a load balancing problem in which traffic is concentrated on a
specific object because the number of receivers' tasks and the number of nodes
are not equal. If these problems occur, the tasks of receivers will perform
different amounts of work, resulting in performance degradation and processing
delays.
!sender-imbalance.png!
(a) Example of sender imbalance problem.
!receiver-imbalance.png!
(b) Example of receiver imbalance problem.
So, I propose locality aware grouping which can solve the load balancing
problem while considering the locality. Locality aware grouping is a method of
periodically calculating the ping response time between nodes and transmitting
more tuples probabilistically to nodes with low response time. I implemented
the proposed Locality aware grouping at
[https://github.com/dke-knu/i2am/tree/master/i2am-app/locality-aware-grouping].
LocalityAwareGrouping.java is a class that implements locality aware grouping.
LocalityGroupingTestTopology.java, TupleGeneratorSpout.java, and
PerformanceLoggingBolt.java are topology classes for testing this.
LocalityAwareGrouping$prepare() method reads the network information of each
node from the Zookeeper and activates the thread. This thread periodically
calculates the ping response time of each node.
LocalityAwareGrouping$chooseTasks() method selects a task by a higher
probability for the nodes with lower network response times.
But the implementation is straightforward. To calculate the ping between nodes,
the network information of the nodes that tasks are performing is needed. I got
this information from Zookeeper using the Zookeeper$getData() method. At this
time, in order to create a Zookeeper object, I had no choice but to receive the
connection information of the Zookeeper from the user.
Please let me know, if there is a way to get the network information of each
node without requiring additional parameters from the user and if you have any
additional comments.
Thank you for reading.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)