[ 
https://issues.apache.org/jira/browse/KAFKA-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001370#comment-16001370
 ] 

Amit Daga commented on KAFKA-4479:
----------------------------------

[~ijuma] Initial findings: After trying to debug for a while, I found that the 
test fails even before we try to close the stream runnables. Lines beyond [1] 
are not executed. 

Test report has been attached for your reference (test.zip). Please let me know 
your inputs on this.

[1] 
https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/integration/QueryableStateIntegrationTest.java#L357

> Streams tests should pass without hardcoded Time.SYSTEM in GroupCoordinator
> ---------------------------------------------------------------------------
>
>                 Key: KAFKA-4479
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4479
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Ismael Juma
>            Assignee: Amit Daga
>            Priority: Minor
>              Labels: newbie, newbie++
>         Attachments: test.zip
>
>
> If we pass `KafkaServer.time` to `GroupCoordinator`[1], some streams tests 
> like QueryableStateIntegrationTest fail sem-regularly. [~damianguy] looked 
> into it and described it as:
> {quote}
> Looking at the sequence of events, one thread is stopped, and hence leaves 
> the group triggering a rebalance, but the other thread doesn’t seem to get 
> the memo, tries to commit, fails, and then game-over.
> So.. the case that it fails the one alive thread is not getting a rebalance. 
> This would happen during  a `poll(..)` right? However i can see the thread is 
> polling many times after the other thread has shutdown.
> It tries to commit every time around the loop, so:
> poll(..)
> process(..)
> maybeCommit(..)
> and there is like < 10ms between calls to `poll`.
> {quote}
> A theory was that the mock time was not advancing enough to trigger a 
> rebalance in the group coordinator. However, the consumer is closed, so that 
> should trigger a `LeaveGroup` request and it's unclear why a rebalance is not 
> triggered for the live consumer.
> PR where this issue was first seen and discussed: 
> https://github.com/apache/kafka/pull/2095
> [1] 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaServer.scala#L222



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to