gharris1727 opened a new pull request, #13838:
URL: https://github.com/apache/kafka/pull/13838

   The MirrorIntegrationBaseTest-derived suites sometimes fail with the 
following error message:
   
   > java.lang.AssertionError: Connector MirrorCheckpointConnector tasks did 
not start in time on cluster: backup-connect-cluster
   
   This assertion fails because the MirrorCheckpointConnector never generates 
any task configurations:
   > [2023-06-09 21:36:43,037] INFO GET response for 
URL=http://localhost:37695/connectors/MirrorCheckpointConnector/status is 
{"name":"MirrorCheckpointConnector","connector":{"state":"RUNNING","worker_id":"localhost:35597"},"tasks":[],"type":"source"}
 (org.apache.kafka.connect.util.clusters.EmbeddedConnectCluster:905)
   
   The connector generates no task configurations because there are no 
consumer groups to distribute to the tasks: the only consumer group 
available is filtered out:
   > [2023-06-09 21:34:39,370] DEBUG [MirrorCheckpointConnector|worker] 
Ignoring the following groups which do not have any offsets for topics that are 
accepted by the topic filter: [consumer-group-dummy] 
(org.apache.kafka.connect.mirror.MirrorCheckpointConnector:190)
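
   To make the failure mode concrete, here is a minimal sketch of this kind of group filtering. It is a simplified model and not the actual MirrorCheckpointConnector code: the class name, method name, and the map-of-sets input are all hypothetical, chosen only for illustration. It shows that a group with no committed offsets can never pass the filter, whatever the topic filter accepts, which leaves nothing to assign to tasks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical model of the filtering step; not the connector's real implementation.
public class GroupFilterSketch {

    // Keep only the groups that have committed an offset for at least one
    // topic accepted by the topic filter.
    static List<String> groupsToCheckpoint(Map<String, Set<String>> committedTopicsByGroup,
                                           Predicate<String> topicFilter) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : committedTopicsByGroup.entrySet()) {
            // A group with an empty set of committed topics can never match.
            if (entry.getValue().stream().anyMatch(topicFilter)) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }
}
```

   In this model, a group that committed nothing maps to an empty set, so the result is an empty list even when the filter accepts every topic.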
   
   The group is filtered out because consumer-group-dummy did not commit any 
offsets for any topics, and so doesn't satisfy the filtering logic added in 
#13446. The dummy group is supposed to commit its offsets in warmUpConsumer: 
https://github.com/apache/kafka/blob/7eea2a3908fdcee1627c18827e6dcb5ed0089fdd/connect/mirror/src/test/java/org/apache/kafka/connect/mirror/integration/MirrorConnectorsIntegrationBaseTest.java#L1189-L1198
   
   If the single invocation of poll times out before the consumer can retrieve 
the metadata for the partition, then the commitSync does not commit any offsets 
for the consumer, leaving no offsets instead of the expected offset 0. This 
poll duration is 500ms, which is regularly exceeded in my 30% CPU de-flaking 
environment.
   
   To fix this, increase the poll timeout from 500ms to 5s, to make consumer 
call-sites less flaky.
   Unfortunately this strategy is not ideal for the warmUpConsumer function 
itself, which is always called on an empty topic and will always block for the 
whole duration, lengthening the runtime of the test.
   So for warmUpConsumer, replace the empty poll() with Admin calls which 
retrieve the set of partitions and then commit offset 0 for all partitions.
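
   The Admin-based approach could look roughly like the sketch below. This is an illustration under stated assumptions, not the code in this PR: the class and method names (`WarmUpSketch`, `zeroOffsets`, `commitZeroOffsets`) are hypothetical, and it assumes a kafka-clients version that provides `DescribeTopicsResult.allTopicNames()`. It looks up the topic's partitions with `Admin.describeTopics` and then writes offset 0 for each partition with `Admin.alterConsumerGroupOffsets`, so nothing depends on a poll completing in time.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutionException;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.TopicDescription;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

// Hypothetical helper class; names are illustrative, not taken from the PR.
public class WarmUpSketch {

    // Build an offset-0 entry for every partition of the topic (no broker needed).
    static Map<TopicPartition, OffsetAndMetadata> zeroOffsets(String topic, int numPartitions) {
        Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
        for (int partition = 0; partition < numPartitions; partition++) {
            offsets.put(new TopicPartition(topic, partition), new OffsetAndMetadata(0L));
        }
        return offsets;
    }

    // Look up the topic's partitions, then commit offset 0 for the group on each one.
    static void commitZeroOffsets(Admin admin, String groupId, String topic)
            throws InterruptedException, ExecutionException {
        TopicDescription description = admin
                .describeTopics(Collections.singleton(topic))
                .allTopicNames()
                .get()
                .get(topic);
        int numPartitions = description.partitions().size();
        admin.alterConsumerGroupOffsets(groupId, zeroOffsets(topic, numPartitions))
                .all()
                .get();
    }
}
```

   One caveat with this sketch: as far as I know, `alterConsumerGroupOffsets` is only expected to succeed while the group has no active members, so it fits a warm-up step that runs before any real consumer joins the group.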
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   

