[GitHub] [kafka] jolshan commented on a change in pull request #11097: KAFKA-8529: Flakey test ConsumerBounceTest#testCloseDuringRebalance

GitBox Thu, 22 Jul 2021 10:29:32 -0700


jolshan commented on a change in pull request #11097:
URL: https://github.com/apache/kafka/pull/11097#discussion_r675011109




##########
File path: core/src/test/scala/unit/kafka/server/AbstractFetcherThreadTest.scala
##########
@@ -144,6 +144,44 @@ class AbstractFetcherThreadTest {
     assertEquals(2L, replicaState.highWatermark)
   }
 
+  @Test
+  def testDelay(): Unit = {
+    val partition = new TopicPartition("topic", 0)
+
+    class ErrorMockFetcherThread(fetchBackOffMs: Int)
+      extends MockFetcherThread(fetchBackOffMs =  fetchBackOffMs) {
+
+      override def fetchFromLeader(fetchRequest: FetchRequest.Builder): Map[TopicPartition, FetchData] = {
+         throw new UnknownTopicIdException("Topic ID was unknown as expected for this test")
+      }
+    }
+    val fetcher = new ErrorMockFetcherThread(fetchBackOffMs = 1000)
+
+    fetcher.setReplicaState(partition, MockFetcherThread.PartitionState(leaderEpoch = 0))
+    fetcher.addPartitions(Map(partition -> initialFetchState(0L, leaderEpoch = 0)))
+
+    val batch = mkBatch(baseOffset = 0L, leaderEpoch = 0,
+      new SimpleRecord("a".getBytes), new SimpleRecord("b".getBytes))
+    val leaderState = MockFetcherThread.PartitionState(Seq(batch), leaderEpoch = 0, highWatermark = 2L)
+    fetcher.setLeaderState(partition, leaderState)
+
+    // Do work for the first time. This should result in all partitions in error.
+    val timeBeforeFirst = System.currentTimeMillis()
+    fetcher.doWork()
+    val timeAfterFirst = System.currentTimeMillis()
+    val firstWork = timeAfterFirst - timeBeforeFirst
+
+    // The second doWork will pause for fetchBackOffMs since all partitions will be delayed
+    val timeBeforeSecond = System.currentTimeMillis()
+    fetcher.doWork()
+    val timeAfterSecond = System.currentTimeMillis()

Review comment:
       Ah hmm. It does seem to be a little flaky on the second check (fetchBackOffMs < secondWorkDuration).
   
   In a sample of 50 runs with fetchBackOffMs = 500, there were 8 failures, and it seems like all of them had secondWorkDuration = 500, i.e. exactly equal to the backoff. So maybe I can just change the comparison to <=.
   
   Rerunning with this setup 200 times and with fetchBackOffMs = 250, I saw 0 failures.
   
   Of course, this was all run locally. I'm not sure whether Jenkins will behave differently.
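
   The boundary case described above can be illustrated with a minimal, standalone sketch. It is not the test code from the PR: the object `BackoffCheckSketch` and the helpers `strictCheck` and `inclusiveCheck` are hypothetical names introduced only for illustration, and the values are the ones reported from the local runs (a 500 ms backoff and a measured duration of exactly 500 ms).

```scala
object BackoffCheckSketch {
  // Strict comparison, mirroring the check quoted above
  // (fetchBackOffMs < secondWorkDuration). Hypothetical helper name.
  def strictCheck(fetchBackOffMs: Long, secondWorkDuration: Long): Boolean =
    fetchBackOffMs < secondWorkDuration

  // Inclusive comparison, the proposed change to <=. Hypothetical helper name.
  def inclusiveCheck(fetchBackOffMs: Long, secondWorkDuration: Long): Boolean =
    fetchBackOffMs <= secondWorkDuration

  def main(args: Array[String]): Unit = {
    val fetchBackOffMs = 500L
    // Duration observed in the failing local runs: exactly the backoff value.
    val secondWorkDuration = 500L

    println(s"strict:    ${strictCheck(fetchBackOffMs, secondWorkDuration)}")    // false
    println(s"inclusive: ${inclusiveCheck(fetchBackOffMs, secondWorkDuration)}") // true
  }
}
```

   With millisecond-granularity timestamps, a pause of exactly fetchBackOffMs can be measured as a duration equal to the backoff, which the strict check rejects and the inclusive check accepts.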




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
