Jungtaek Lim created STORM-1698:
-----------------------------------
Summary: Asynchronous MetricsConsumerBolt
Key: STORM-1698
URL: https://issues.apache.org/jira/browse/STORM-1698
Project: Apache Storm
Issue Type: Improvement
Components: storm-core
Affects Versions: 1.0.0, 2.0.0
Reporter: Jungtaek Lim
Assignee: Jungtaek Lim
Currently, MetricsConsumerBolt delegates to MetricsConsumer to handle data
points synchronously.
When MetricsConsumer cannot keep up, backpressure is triggered once (queue
size + overflow buffer size) reaches the high watermark, which in turn slows
down the topology.
Slowing down is not a problem in itself, because that's what backpressure is
for. The actual problem is that backpressure only throttles spouts, not
metrics. If MetricsConsumerBolt cannot keep up with incoming tuples,
backpressure never ends and the topology just hangs. If we turn off
backpressure, the queue is unbounded and the worker could eventually throw an
OOME.
Making MetricsConsumerBolt asynchronous can resolve this issue. One downside of
making it async is that it becomes hard to see whether MetricsConsumerBolt is
keeping up (capacity will always be around 0).
I don't have an idea for that yet, but I think it's still better than the
current behavior.
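As a rough illustration of the asynchronous hand-off described above, here is
a minimal Java sketch. It is not Storm code and not a proposed patch: the class
name AsyncMetricsHandoff, its methods, the bounded queue, and the drop-on-full
policy are all assumptions made for illustration. The idea is only that the
caller (for example a bolt's execute path) enqueues data points without
blocking, while a separate thread feeds the slow consumer.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// Hypothetical helper, not part of Storm: decouples a fast producer from a
// slow consumer through a bounded queue drained by one background thread.
final class AsyncMetricsHandoff<T> {
    private final BlockingQueue<T> queue;
    private final Consumer<T> consumer;
    private final AtomicLong dropped = new AtomicLong();
    private final Thread worker;
    private volatile boolean running = true;

    AsyncMetricsHandoff(int capacity, Consumer<T> consumer) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.consumer = consumer;
        this.worker = new Thread(this::drainLoop, "metrics-handoff");
        this.worker.setDaemon(true);
        this.worker.start();
    }

    // Never blocks the caller. When the queue is full the item is dropped
    // and counted (assumed policy; other policies are possible).
    boolean offer(T dataPoints) {
        boolean accepted = queue.offer(dataPoints);
        if (!accepted) {
            dropped.incrementAndGet();
        }
        return accepted;
    }

    // Number of items rejected because the queue was full.
    long droppedCount() {
        return dropped.get();
    }

    // Number of items waiting to be handed to the consumer.
    int pendingCount() {
        return queue.size();
    }

    // Runs on the worker thread: passes queued items to the consumer until
    // shutdown is requested and the queue has been emptied.
    private void drainLoop() {
        try {
            while (running || !queue.isEmpty()) {
                T item = queue.poll(100, TimeUnit.MILLISECONDS);
                if (item == null) {
                    continue;
                }
                try {
                    consumer.accept(item);
                } catch (RuntimeException e) {
                    // Keep the worker alive if the consumer fails on one item.
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Stops accepting work in the loop condition, lets the worker drain what
    // is already queued, then waits for it to exit.
    void shutdown() {
        running = false;
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With a bounded queue the memory used by pending data points stays capped, at
the cost of losing some of them when the consumer falls behind. Counters such
as droppedCount() and pendingCount() might also be one way to see whether the
consumer is keeping up, though that is only a possibility, not something
decided in this issue.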
Before reaching consensus on a huge change to metrics, I'd love to improve the
current metrics without breaking backward compatibility. It could be applied
to 1.x-branch, and even 0.10.x-branch.