[ https://issues.apache.org/jira/browse/KAFKA-15372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17795958#comment-17795958 ]
Greg Harris commented on KAFKA-15372:
-------------------------------------

This is now set to release in 3.7 and 3.6, but I ran into issues with the 3.5 backport and had to revert it. In particular, DedicatedMirrorIntegrationTest has this persistent failure:

{noformat}
org.apache.kafka.test.NoRetryException
    at app//org.apache.kafka.connect.mirror.integration.DedicatedMirrorIntegrationTest.lambda$awaitTaskConfigurations$8(DedicatedMirrorIntegrationTest.java:363)
    at app//org.apache.kafka.test.TestUtils.lambda$waitForCondition$4(TestUtils.java:337)
    at app//org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:385)
    at app//org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:334)
    at app//org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:318)
    at app//org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:308)
    at app//org.apache.kafka.connect.mirror.integration.DedicatedMirrorIntegrationTest.awaitTaskConfigurations(DedicatedMirrorIntegrationTest.java:353)
    at app//org.apache.kafka.connect.mirror.integration.DedicatedMirrorIntegrationTest.testMultiNodeCluster(DedicatedMirrorIntegrationTest.java:301)
Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.connect.runtime.distributed.RebalanceNeededException: Request cannot be completed because a rebalance is expected
    at org.apache.kafka.connect.util.ConvertingFutureCallback.result(ConvertingFutureCallback.java:123)
    at org.apache.kafka.connect.util.ConvertingFutureCallback.get(ConvertingFutureCallback.java:115)
    at org.apache.kafka.connect.mirror.integration.DedicatedMirrorIntegrationTest.lambda$awaitTaskConfigurations$8(DedicatedMirrorIntegrationTest.java:357)
    ... 7 more
Caused by: org.apache.kafka.connect.runtime.distributed.RebalanceNeededException: Request cannot be completed because a rebalance is expected
{noformat}

> MM2 rolling restart can drop configuration changes silently
> -----------------------------------------------------------
>
>                 Key: KAFKA-15372
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15372
>             Project: Kafka
>          Issue Type: Bug
>          Components: mirrormaker
>            Reporter: Daniel Urban
>            Assignee: Greg Harris
>            Priority: Major
>             Fix For: 3.7.0, 3.6.2
>
>
> When MM2 is restarted, it tries to update the Connector configuration in all
> flows. This is a one-time attempt, and it fails if the Connect worker is not
> the leader of the group.
> In a distributed setup with a rolling restart, it is possible that, for a
> specific flow, the Connect worker of the just-restarted MM2 instance is not
> the leader, meaning that Connector configurations can get dropped.
> For example, assuming 2 MM2 instances and one flow A->B:
> # MM2 instance 1 is restarted; the worker inside MM2 instance 2 becomes the
> leader of the A->B Connect group.
> # MM2 instance 1 tries to update the Connector configurations, but fails
> (instance 2 has the leader, not instance 1).
> # MM2 instance 2 is restarted; leadership moves to the worker in MM2 instance 1.
> # MM2 instance 2 tries to update the Connector configurations, but fails.
> At this point, the configuration changes made before the restart are never
> applied. This can often happen silently, without any indication.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)