[jira] [Commented] (KAFKA-8529) Flakey test ConsumerBounceTest#testCloseDuringRebalance

A. Sophie Blee-Goldman (Jira) Tue, 13 Jul 2021 12:58:07 -0700


    [ 
https://issues.apache.org/jira/browse/KAFKA-8529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380138#comment-17380138
 ]


A. Sophie Blee-Goldman commented on KAFKA-8529:
-----------------------------------------------

This has been failing _very_ frequently as of late, for example 
[https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-pr/detail/PR-11009/4/tests].

I took a brief look at the logs and it appears to be an issue with the 
connection, or possibly something to do with the incremental fetch (not sure if 
this warning is a red herring or not):

 
{code:java}
[2021-07-13 12:31:46,774] WARN [ReplicaFetcher replicaId=1, leaderId=2, 
fetcherId=0] Error in response for fetch request (type=FetchRequest, 
replicaId=1, maxWait=500, minBytes=1, maxBytes=10485760, 
fetchData={closetest-1=PartitionData(fetchOffset=0, logStartOffset=0, 
maxBytes=1048576, currentLeaderEpoch=Optional[0], 
lastFetchedEpoch=Optional.empty), closetest-7=PartitionData(fetchOffset=0, 
logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[0], 
lastFetchedEpoch=Optional.empty), closetest-4=PartitionData(fetchOffset=0, 
logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[0], 
lastFetchedEpoch=Optional.empty)}, isolationLevel=READ_UNCOMMITTED, toForget=, 
metadata=(sessionId=1829936913, epoch=1), rackId=) 
(kafka.server.ReplicaFetcherThread:72)
org.apache.kafka.common.errors.FetchSessionTopicIdException: The fetch session 
encountered inconsistent topic ID usage
[2021-07-13 12:31:47,002] WARN [ReplicaFetcher replicaId=1, leaderId=0, 
fetcherId=0] Error 
in response for fetch request (type=FetchRequest, replicaId=1, maxWait=500, 
minBytes=1, maxBytes=10485760, fetchData={}, isolationLevel=READ_UNCOMMITTED, 
toForget=, metadata=(sessionId=2081224068, epoch=1), rackId=) 
(kafka.server.ReplicaFetcherThread:72)
org.apache.kafka.common.errors.FetchSessionTopicIdException: The fetch session 
encountered inconsistent topic ID usage
[2021-07-13 12:31:48,545] WARN [Consumer clientId=ConsumerTestConsumer, 
groupId=group1] 
Close timed out with 3 pending requests to coordinator, terminating client 
connections 
(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:1024)
[2021-07-13 12:31:48,549] WARN [ReplicaFetcher replicaId=0, leaderId=2, 
fetcherId=0] Error 
in response for fetch request (type=FetchRequest, replicaId=0, maxWait=500, 
minBytes=1, maxBytes=10485760, 
fetchData={closetest-1=PartitionData(fetchOffset=0, logStartOffset=0, 
maxBytes=1048576, currentLeaderEpoch=Optional[0], 
lastFetchedEpoch=Optional.empty), closetest-7=PartitionData(fetchOffset=0, 
logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[0], 
lastFetchedEpoch=Optional.empty), topic-1=PartitionData(fetchOffset=0, 
logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[0], 
lastFetchedEpoch=Optional.empty), closetest-4=PartitionData(fetchOffset=0, 
logStartOffset=0, maxBytes=1048576, currentLeaderEpoch=Optional[0], 
lastFetchedEpoch=Optional.empty)}, isolationLevel=READ_UNCOMMITTED, toForget=, 
metadata=(sessionId=1532743228, epoch=INITIAL), rackId=) 
(kafka.server.ReplicaFetcherThread:72)
java.io.IOException: Connection to 2 was disconnected before the response was read
    at org.apache.kafka.clients.NetworkClientUtils.sendAndReceive(NetworkClientUtils.java:100)
    at kafka.server.ReplicaFetcherBlockingSend.sendRequest(ReplicaFetcherBlockingSend.scala:109)
    at kafka.server.ReplicaFetcherThread.fetchFromLeader(ReplicaFetcherThread.scala:219)
    at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:313)
    at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3(AbstractFetcherThread.scala:137)
    at kafka.server.AbstractFetcherThread.$anonfun$maybeFetch$3$adapted(AbstractFetcherThread.scala:136)
    at scala.Option.foreach(Option.scala:437)
    at kafka.server.AbstractFetcherThread.maybeFetch(AbstractFetcherThread.scala:136)
    at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:119)
    at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:96)
{code}
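
For triage, a rough way to see which warnings dominate a log like the one above is to split it into entries and tally the WARN entries by logger and exception class. The sketch below is an illustration only and is not part of the Kafka code base or its test suite: `tally_warnings` is a hypothetical helper, and `SAMPLE` is an abbreviated copy of two entries from the log above, with the fetch request details cut out. It assumes every entry starts with a bracketed timestamp in the format shown.

```python
import re
from collections import Counter

# Bracketed timestamp that starts each entry, e.g. "[2021-07-13 12:31:46,774]".
# A zero-width lookahead, so splitting keeps the timestamp with its entry.
ENTRY_START = re.compile(r"(?=\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}\])")
# Logger name and line number in parentheses, e.g. "(kafka.server.ReplicaFetcherThread:72)".
LOGGER = re.compile(r"\(([\w.]+):\d+\)")
# First dotted class name ending in "Exception" or "Error".
EXCEPTION = re.compile(r"((?:\w+\.)+\w*(?:Exception|Error))")


def tally_warnings(log_text):
    """Count WARN entries, keyed by (logger, exception class)."""
    # Collapse line wrapping so timestamps broken across lines are rejoined.
    normalized = " ".join(log_text.split())
    counts = Counter()
    for entry in ENTRY_START.split(normalized):
        if " WARN " not in entry:
            continue
        logger = LOGGER.search(entry)
        exc = EXCEPTION.search(entry)
        key = (
            logger.group(1) if logger else "unknown",
            exc.group(1) if exc else "no exception",
        )
        counts[key] += 1
    return counts


# Abbreviated sample (hypothetical shortening of two entries from the log above).
SAMPLE = (
    "[2021-07-13 12:31:46,774] WARN [ReplicaFetcher replicaId=1, leaderId=2, "
    "fetcherId=0] Error in response for fetch request "
    "(kafka.server.ReplicaFetcherThread:72)"
    "org.apache.kafka.common.errors.FetchSessionTopicIdException: The fetch "
    "session encountered inconsistent topic ID usage"
    "[2021-07-13 12:31:48,545] WARN [Consumer clientId=ConsumerTestConsumer, "
    "groupId=group1] Close timed out with 3 pending requests to coordinator, "
    "terminating client connections "
    "(org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:1024)"
)

if __name__ == "__main__":
    for (logger, exc), n in tally_warnings(SAMPLE).most_common():
        print(n, logger, exc)
```

Running the same tally over the logs of a passing run would be one way to judge whether the `FetchSessionTopicIdException` warnings appear only in failing runs or are a red herring.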
 

Wondering if we should bump the priority on this? Can someone more 
familiar with this test maybe take a few minutes to glance over these logs and 
chime in on whether this is potentially concerning or not? cc [~hachikuji] 
[~cmccabe]

> Flakey test ConsumerBounceTest#testCloseDuringRebalance
> -------------------------------------------------------
>
>                 Key: KAFKA-8529
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8529
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>            Reporter: Boyang Chen
>            Priority: Major
>
> [https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/5450/consoleFull]
>  
> *16:16:10* kafka.api.ConsumerBounceTest > testCloseDuringRebalance STARTED
> *16:16:22* kafka.api.ConsumerBounceTest.testCloseDuringRebalance failed, log available in /home/jenkins/jenkins-slave/workspace/kafka-pr-jdk11-scala2.12/core/build/reports/testOutput/kafka.api.ConsumerBounceTest.testCloseDuringRebalance.test.stdout
> *16:16:22*
> *16:16:22* kafka.api.ConsumerBounceTest > testCloseDuringRebalance FAILED
> *16:16:22*     java.lang.AssertionError: Rebalance did not complete in time
> *16:16:22*         at org.junit.Assert.fail(Assert.java:89)
> *16:16:22*         at org.junit.Assert.assertTrue(Assert.java:42)
> *16:16:22*         at kafka.api.ConsumerBounceTest.waitForRebalance$1(ConsumerBounceTest.scala:402)
> *16:16:22*         at kafka.api.ConsumerBounceTest.checkCloseDuringRebalance(ConsumerBounceTest.scala:416)
> *16:16:22*         at kafka.api.ConsumerBounceTest.testCloseDuringRebalance(ConsumerBounceTest.scala:379)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
