Jungtaek Lim created STORM-1698:
-----------------------------------

             Summary: Asynchronous MetricsConsumerBolt
                 Key: STORM-1698
                 URL: https://issues.apache.org/jira/browse/STORM-1698
             Project: Apache Storm
          Issue Type: Improvement
          Components: storm-core
    Affects Versions: 1.0.0, 2.0.0
            Reporter: Jungtaek Lim
            Assignee: Jungtaek Lim


Currently MetricsConsumerBolt delegates the handling of data points to
MetricsConsumer synchronously.

When MetricsConsumer cannot keep up, backpressure is triggered once (queue
size + overflow buffer size) reaches the high watermark, which slows down
the topology as a result.

Slowing down itself is not a problem, because that’s what backpressure is for.
The actual problem is that backpressure only throttles spouts, not metrics. If
MetricsConsumerBolt cannot keep up with incoming tuples, backpressure never
ends and the topology just hangs. If we turn off backpressure, the queue is
unbounded and the worker could eventually throw an OutOfMemoryError.

Making MetricsConsumerBolt asynchronous can resolve this issue. One downside of
making it async is that it becomes hard to see whether MetricsConsumerBolt is
keeping up (its capacity will always be around 0).
I don't have an idea for that yet, but I think it's still better than the
current behavior.
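
To make the idea concrete, here is a minimal sketch of an asynchronous
hand-off in plain Java. It is not Storm code and not the proposed patch:
the class name AsyncHandOff, its methods, the bounded queue, and the
drop-when-full policy are all assumptions made for illustration. The issue
itself does not say what should happen when the consumer falls behind.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// Hypothetical helper (not part of Storm): decouples the caller from a
// slow consumer by handing items to a background thread.
final class AsyncHandOff<T> implements AutoCloseable {
    private final BlockingQueue<T> queue;
    private final AtomicLong dropped = new AtomicLong();
    private final Thread worker;
    private volatile boolean running = true;

    AsyncHandOff(int capacity, Consumer<T> slowConsumer) {
        // Bounded queue: memory use stays capped even if the consumer stalls.
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.worker = new Thread(() -> {
            try {
                // Keep draining after close() until the queue is empty.
                while (running || !queue.isEmpty()) {
                    T item = queue.poll(100, TimeUnit.MILLISECONDS);
                    if (item == null) {
                        continue;
                    }
                    try {
                        slowConsumer.accept(item);
                    } catch (RuntimeException e) {
                        // Assumption: a failing consumer must not kill the worker.
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "async-hand-off");
        worker.setDaemon(true);
        worker.start();
    }

    // Never blocks. Returns false and counts a drop when the queue is full
    // (assumed policy: lose data points rather than stall the caller).
    boolean submit(T item) {
        boolean accepted = queue.offer(item);
        if (!accepted) {
            dropped.incrementAndGet();
        }
        return accepted;
    }

    // Number of items rejected so far because the queue was full.
    long droppedCount() {
        return dropped.get();
    }

    // Stops accepting work logically and waits for queued items to drain.
    @Override
    public void close() {
        running = false;
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With something like this, the bolt's tuple-handling path would only call
submit() and return immediately, so a slow MetricsConsumer would no longer
fill the bolt's receive queue. A counter such as droppedCount() is one
possible substitute for the lost capacity signal, though that is only a
suggestion and not something this issue settles.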

Before reaching consensus on a large change to metrics, I'd love to improve
the current metrics without breaking backward compatibility. This could be
applied to 1.x-branch, and even 0.10.x-branch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
