[ 
https://issues.apache.org/jira/browse/KAFKA-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismael Juma reopened KAFKA-4479:
--------------------------------

The code still passes Time.SYSTEM:

{code}
/* start group coordinator */
        // Hardcode Time.SYSTEM for now as some Streams tests fail otherwise, 
it would be good to fix the underlying issue
        groupCoordinator = GroupCoordinator(config, zkClient, replicaManager, 
Time.SYSTEM)
        groupCoordinator.startup()
{code}

> Streams tests should pass without hardcoded Time.SYSTEM in GroupCoordinator
> ---------------------------------------------------------------------------
>
>                 Key: KAFKA-4479
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4479
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams, unit tests
>            Reporter: Ismael Juma
>            Priority: Minor
>              Labels: newbie, newbie++
>         Attachments: test.zip
>
>
> If we pass `KafkaServer.time` to `GroupCoordinator`[1], some streams tests 
> like QueryableStateIntegrationTest fail sem-regularly. [~damianguy] looked 
> into it and described it as:
> {quote}
> Looking at the sequence of events, one thread is stopped, and hence leaves 
> the group triggering a rebalance, but the other thread doesn’t seem to get 
> the memo, tries to commit, fails, and then game-over.
> So.. the case that it fails the one alive thread is not getting a rebalance. 
> This would happen during  a `poll(..)` right? However i can see the thread is 
> polling many times after the other thread has shutdown.
> It tries to commit every time around the loop, so:
> poll(..)
> process(..)
> maybeCommit(..)
> and there is like < 10ms between calls to `poll`.
> {quote}
> A theory was that the mock time was not advancing enough to trigger a 
> rebalance in the group coordinator. However, the consumer is closed, so that 
> should trigger a `LeaveGroup` request and it's unclear why a rebalance is not 
> triggered for the live consumer.
> PR where this issue was first seen and discussed: 
> https://github.com/apache/kafka/pull/2095
> [1] 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/KafkaServer.scala#L222



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to