GitHub user wendyshusband opened a pull request:
https://github.com/apache/storm/pull/2164
[STORM-2557]: A bug in DisruptorQueue causing severe underestimation of…
We have recently been tuning the performance of our topology and deploying
theoretical performance models that rely heavily on the metrics of query
arrival rates. We found a bug in `DisruptorQueue` that leads to severe
underestimation of the queue arrival rates. After further investigation, we
found that the current implementation of `DisruptorQueue` measures arrival
rates as the number of batches of tuples rather than the actual number of
tuples, so the reported rate is too low whenever a batch holds more than one
tuple.
To be more specific, in the `DisruptorQueue.publishDirectSingle()` and
`DisruptorQueue.publishDirect()` functions, objects containing tuples are
published to the buffer and the metrics are notified by calling
`_metrics.notifyArrivals(1)`. This works fine when the object is simply a
wrapper around a single tuple. However, the object could also be an instance of
`ArrayList<AddressedTuple>` or `HashMap<Integer, ArrayList<TaskMessage>>`. In
such cases, we should get the actual number of tuples in the object and notify
`_metrics` with the right value.
To fix this issue, I added some code that determines the type of the object in
the `DisruptorQueue.publishDirectSingle()` and `DisruptorQueue.publishDirect()`
functions.
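The counting idea can be sketched roughly as follows. This is only an
illustration and not the code in the patch: the class `ArrivalCounter` and the
method `countTuples` are hypothetical names, and plain strings stand in for
`AddressedTuple` and `TaskMessage` so that the example runs without Storm on
the classpath.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Storm: it only illustrates how the number
// of tuples inside a published object could be determined by its type.
public class ArrivalCounter {

    // Returns the number of tuples carried by obj.
    public static int countTuples(Object obj) {
        if (obj instanceof Collection) {
            // e.g. ArrayList<AddressedTuple>: one arrival per element
            return ((Collection<?>) obj).size();
        }
        if (obj instanceof Map) {
            // e.g. HashMap<Integer, ArrayList<TaskMessage>>: sum the list sizes
            int total = 0;
            for (Object value : ((Map<?, ?>) obj).values()) {
                if (value instanceof Collection) {
                    total += ((Collection<?>) value).size();
                } else {
                    total += 1; // assumption: a non-collection value is one tuple
                }
            }
            return total;
        }
        // Anything else is treated as a wrapper around a single tuple.
        return 1;
    }

    public static void main(String[] args) {
        // Strings stand in for the real tuple types (assumption for the demo).
        List<String> batch = new ArrayList<>();
        batch.add("t1");
        batch.add("t2");
        batch.add("t3");

        Map<Integer, List<String>> byTask = new HashMap<>();
        byTask.put(1, new ArrayList<>(batch.subList(0, 2)));
        byTask.put(2, new ArrayList<>(batch.subList(2, 3)));

        System.out.println(countTuples("single tuple"));
        System.out.println(countTuples(batch));
        System.out.println(countTuples(byTask));
    }
}
```

With a helper like this, the publish functions would pass the computed count
to `notifyArrivals` in place of the constant `1`.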
Any comments or suggestions regarding this pull request are welcome.
Thanks.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/wendyshusband/storm master
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/storm/pull/2164.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2164
----
commit 70d45b29c5fd7d67e4d841a2cb2c8504fb999e2c
Author: wendyshusband <[email protected]>
Date: 2017-06-17T05:40:17Z
STORM-2557: A bug in DisruptorQueue causing severe underestimation of queue
arrival rates
----