gharris1727 opened a new pull request, #13838: URL: https://github.com/apache/kafka/pull/13838
The MirrorConnectorsIntegrationBaseTest-derived suites sometimes fail with the following error message:

> java.lang.AssertionError: Connector MirrorCheckpointConnector tasks did not start in time on cluster: backup-connect-cluster

This assertion fails because the MirrorCheckpointConnector never generates any task configurations:

> [2023-06-09 21:36:43,037] INFO GET response for URL=http://localhost:37695/connectors/MirrorCheckpointConnector/status is {"name":"MirrorCheckpointConnector","connector":{"state":"RUNNING","worker_id":"localhost:35597"},"tasks":[],"type":"source"} (org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster:905)

The connector generates no task configurations because there are no consumer groups to distribute to the tasks: the only consumer group available is filtered out:

> [2023-06-09 21:34:39,370] DEBUG [MirrorCheckpointConnector|worker] Ignoring the following groups which do not have any offsets for topics that are accepted by the topic filter: [consumer-group-dummy] (org.apache.kafka.connect.mirror.MirrorCheckpointConnector:190)

The group is filtered out because consumer-group-dummy did not commit offsets for any topic, so it does not satisfy the filtering logic added in #13446. The dummy group is supposed to commit its offsets in warmUpConsumer:

https://github.com/apache/kafka/blob/7eea2a3908fdcee1627c18827e6dcb5ed0089fdd/connect/mirror/src/test/java/org/apache/kafka/connect/mirror/integration/MirrorConnectorsIntegrationBaseTest.java#L1189-L1198

If the single invocation of poll times out before the consumer can retrieve the metadata for the partition, then commitSync does not commit any offsets for the consumer, leaving the group with no offsets instead of the expected offset 0. The poll duration is 500ms, which is regularly exceeded in my 30% CPU de-flaking environment.

To fix this, increase the poll timeout from 500ms to 5s, to make consumer call-sites less flaky.
Unfortunately, this strategy is not ideal for the warmUpConsumer function itself, which is always called on an empty topic and would therefore always block for the full poll duration, lengthening the runtime of the test. So for warmUpConsumer, replace the empty poll() with Admin calls that retrieve the set of partitions and then commit offset 0 for all of them.

### Committer Checklist (excluded from commit message)
- [ ] Verify design and implementation
- [ ] Verify test coverage and CI build status
- [ ] Verify documentation (including upgrade notes)
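
For illustration only, the warmUpConsumer change described above can be sketched as follows. This is not the code from the PR: `OffsetAdmin`, `InMemoryAdmin`, `warmUp`, and the topic name `test-topic-1` are hypothetical names invented for this sketch, and the interface is a stand-in so the example runs without a broker. In a real test the two operations would presumably map to Kafka's `Admin.describeTopics` and `Admin.alterConsumerGroupOffsets`, but that mapping is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class WarmUpSketch {

    /** Hypothetical stand-in for the two admin operations the description mentions. */
    interface OffsetAdmin {
        /** Returns the partition ids of the given topic. */
        List<Integer> partitionsOf(String topic);

        /** Commits a partition -> offset map on behalf of a consumer group. */
        void commitOffsets(String groupId, String topic, Map<Integer, Long> offsets);
    }

    /**
     * Commits offset 0 for every partition of the topic without polling,
     * so the call never blocks waiting for records on an empty topic.
     */
    static Map<Integer, Long> warmUp(OffsetAdmin admin, String groupId, String topic) {
        Map<Integer, Long> offsets = new LinkedHashMap<>();
        for (int partition : admin.partitionsOf(topic)) {
            offsets.put(partition, 0L);
        }
        admin.commitOffsets(groupId, topic, offsets);
        return offsets;
    }

    /** In-memory fake (hypothetical) used only to exercise the sketch. */
    static class InMemoryAdmin implements OffsetAdmin {
        final Map<String, Map<Integer, Long>> committed = new LinkedHashMap<>();
        private final int partitionCount;

        InMemoryAdmin(int partitionCount) {
            this.partitionCount = partitionCount;
        }

        @Override
        public List<Integer> partitionsOf(String topic) {
            List<Integer> ids = new ArrayList<>();
            for (int i = 0; i < partitionCount; i++) {
                ids.add(i);
            }
            return ids;
        }

        @Override
        public void commitOffsets(String groupId, String topic, Map<Integer, Long> offsets) {
            // Store a copy keyed by "group/topic" so later changes to the argument do not leak in.
            committed.put(groupId + "/" + topic, new LinkedHashMap<>(offsets));
        }
    }

    public static void main(String[] args) {
        InMemoryAdmin admin = new InMemoryAdmin(3);
        Map<Integer, Long> offsets = warmUp(admin, "consumer-group-dummy", "test-topic-1");
        System.out.println(offsets);
    }
}
```

Because the offsets are written directly rather than as a side effect of poll(), the group always ends up with a committed offset for each partition, which is what the checkpoint connector's group filter needs in order to keep the group.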