machi1990 commented on PR #13665: URL: https://github.com/apache/kafka/pull/13665#issuecomment-1598653771
> @showuon I was looking onto this and after several local runs, I managed to eliminate some flasky test and came up with the list of failures that are only caused by this change. The failure total number of failures that I've seen locally are: > > ``` > kafka.api.TransactionsBounceTest.testWithGroupId() > kafka.api.TransactionsBounceTest.testWithGroupMetadata() > kafka.api.TransactionsTest.testSendOffsetsWithGroupId(String)[1] > kafka.api.TransactionsTest.testSendOffsetsWithGroupId(String)[2] > kafka.api.TransactionsTest.testSendOffsetsWithGroupMetadata(String)[1] > kafka.api.TransactionsTest.testSendOffsetsWithGroupMetadata(String)[2] > kafka.server.ClientQuotasRequestTest.testAlterClientQuotasBadIp()[1] > kafka.server.ClientQuotasRequestTest.testAlterClientQuotasBadIp()[2] > kafka.server.ClientQuotasRequestTest.testAlterClientQuotasBadIp()[3] > kafka.server.DynamicConfigChangeUnitTest.testIpHandlerUnresolvableAddress() > kafka.zk.ZkMigrationIntegrationTest.testNewAndChangedTopicsInDualWrite(ClusterInstance)[1] > kafka.admin.ConfigCommandTest.shouldFailIfInvalidHost() > ``` > > And only the `TransactionsTest` and `TransactionsBounceTest` are the ones that I've identified to be related to this PR. I've started to investigate these it so far my conclusion is that the failure there are related to reading of stale cache values because the cache item is stored only once when fetching the offset in [1]. The test neither calls `commitSync`, nor `commitAsync` which means that the cache is never updated in [2] after initially set in [1]. > > I was thinking of dropping off cache update when fetching committed offsets i.e in [1] and only perform cache update when during offset commit [2] > > 1. https://github.com/apache/kafka/blob/76d25c94e2c8723eec31a3df64c752bc66c79b34/clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerCoordinator.java#L1075 > 2. https://github.com/apache/kafka/blob/76d25c94e2c8723eec31a3df64c752bc66c79b34/clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerCoordinator.java#L1456 > > That would align to the comment you raised in [#13665 (comment)](https://github.com/apache/kafka/pull/13665#discussion_r1197621674) Let me know what you think. I've pushed this change in https://github.com/apache/kafka/pull/13665/commits/9539c559a782aba8ce95c9b8b48831c6879821d2 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org