[ 
https://issues.apache.org/jira/browse/KAFKA-6399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392105#comment-16392105
 ] 

Guozhang Wang commented on KAFKA-6399:
--------------------------------------

I'm +1 on changing this config's default from Integer.MAX_VALUE, but I'm on the 
fence for setting the default to 30 seconds. The reason is that in Streams, it 
is quite common to have stateful processing such that each record may take some 
time to process. I'd prefer setting the default to larger values, say 5 minute 
than 30 seconds.

> Consider reducing "max.poll.interval.ms" default for Kafka Streams
> ------------------------------------------------------------------
>
>                 Key: KAFKA-6399
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6399
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>    Affects Versions: 1.0.0
>            Reporter: Matthias J. Sax
>            Assignee: Khaireddine Rezgui
>            Priority: Minor
>
> In Kafka {{0.10.2.1}} we change the default value of 
> {{max.poll.intervall.ms}} for Kafka Streams to {{Integer.MAX_VALUE}}. The 
> reason was that long state restore phases during rebalance could yield 
> "rebalance storms" as consumers drop out of a consumer group even if they are 
> healthy as they didn't call {{poll()}} during state restore phase.
> In version {{0.11}} and {{1.0}} the state restore logic was improved a lot 
> and thus, now Kafka Streams does call {{poll()}} even during restore phase. 
> Therefore, we might consider setting a smaller timeout for 
> {{max.poll.intervall.ms}} to detect bad behaving Kafka Streams applications 
> (ie, targeting user code) that don't make progress any more during regular 
> operations.
> The open question would be, what a good default might be. Maybe the actual 
> consumer default of 30 seconds might be sufficient. During one {{poll()}} 
> roundtrip, we would only call {{restoreConsumer.poll()}} once and restore a 
> single batch of records. This should take way less time than 30 seconds.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to