[jira] [Commented] (STORM-2914) Remove enable.auto.commit support from storm-kafka-client

JIRA Sat, 27 Jan 2018 15:07:58 -0800

    [ 
https://issues.apache.org/jira/browse/STORM-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16342365#comment-16342365
 ]


Stig Rohde Døssing commented on STORM-2914:
-------------------------------------------

{quote}
My belief is that our alerting topology gets stuck when there's a huge peak in 
message rate because somehow we saturate Storm by exceeding its capacity, 
and/or maybe we hit some race condition that hangs the spouts in a way that 
"normal load" would never trigger.
{quote}

I'd also add that what you're describing here is something you can likely avoid 
by configuring your topologies differently. For example, you can set 
topology.max.spout.pending to limit how many pending tuples will be allowed 
into the topology at a time, so a spike in messages written to Kafka doesn't 
trigger a flood of messages in Storm. Even without this configuration, the 
topology shouldn't hang (an OutOfMemoryError would be more likely). I'd encourage you to 
try to get some more information (e.g. Storm UI screenshots, thread dumps of 
the workers) if it happens again, and raise a new issue here so we can try to 
figure out what's happening.
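
As a minimal sketch of that setting (not taken from your topology): Storm's 
topology configuration is a map from string keys to values, so the cap can be 
expressed with the topology.max.spout.pending key. The class name, the helper 
method and the value 500 below are made up for illustration. With Storm on the 
classpath, Config.setMaxSpoutPending(int) sets the same key. Keep in mind that 
the limit applies per spout task, and only to tuples the spout emits with a 
message id.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, for illustration only. It uses a plain Map so it
// runs without Storm on the classpath; Storm's Config is itself a
// HashMap<String, Object>.
public class SpoutPendingConfig {
    // The configuration key named in the comment above.
    static final String TOPOLOGY_MAX_SPOUT_PENDING = "topology.max.spout.pending";

    // Builds a topology configuration that caps the number of pending
    // (emitted but not yet acked or failed) tuples per spout task.
    static Map<String, Object> buildConf(int maxSpoutPending) {
        if (maxSpoutPending <= 0) {
            throw new IllegalArgumentException("maxSpoutPending must be positive");
        }
        Map<String, Object> conf = new HashMap<>();
        conf.put(TOPOLOGY_MAX_SPOUT_PENDING, maxSpoutPending);
        return conf;
    }

    public static void main(String[] args) {
        // 500 is an arbitrary example value, not a recommendation.
        System.out.println(buildConf(500));
    }
}
```

A suitable value depends on tuple size and processing latency, so it is 
probably best found by testing under a load similar to the spike you saw.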

> Remove enable.auto.commit support from storm-kafka-client
> ---------------------------------------------------------
>
>                 Key: STORM-2914
>                 URL: https://issues.apache.org/jira/browse/STORM-2914
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-kafka-client
>    Affects Versions: 2.0.0, 1.2.0
>            Reporter: Stig Rohde Døssing
>            Assignee: Stig Rohde Døssing
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> The enable.auto.commit option causes the KafkaConsumer to periodically commit 
> the latest offsets it has returned from poll(). It is convenient for use 
> cases where messages are polled from Kafka and processed synchronously, in a 
> loop. 
> Due to https://issues.apache.org/jira/browse/STORM-2913 we'd really like to 
> store some metadata in Kafka when the spout commits. This is not possible 
> with enable.auto.commit. I took a look at what that setting actually does, 
> and it just causes the KafkaConsumer to call commitAsync during poll (and 
> during a few other operations, e.g. close and assign) at some interval. 
> Ideally I'd like to get rid of ProcessingGuarantee.NONE, since I think 
> ProcessingGuarantee.AT_MOST_ONCE covers the same use cases, and is likely 
> almost as fast. The primary difference between them is that AT_MOST_ONCE 
> commits synchronously.
> If we really want to keep ProcessingGuarantee.NONE, I think we should make 
> our ProcessingGuarantee.NONE setting cause the spout to call commitAsync 
> after poll, and never use the enable.auto.commit option. This allows us to 
> include metadata in the commit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
