[jira] [Commented] (KAFKA-15372) MM2 rolling restart can drop configuration changes silently

Greg Harris (Jira) Tue, 12 Dec 2023 15:17:40 -0800


    [ 
https://issues.apache.org/jira/browse/KAFKA-15372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17795958#comment-17795958
 ]


Greg Harris commented on KAFKA-15372:
-------------------------------------

The fix is now set to release in 3.7.0 and 3.6.2, but I ran into issues with 
the 3.5 backport and had to revert it. In particular, 
DedicatedMirrorIntegrationTest has this persistent failure:
{noformat}
    org.apache.kafka.test.NoRetryException
        at app//org.apache.kafka.connect.mirror.integration.DedicatedMirrorIntegrationTest.lambda$awaitTaskConfigurations$8(DedicatedMirrorIntegrationTest.java:363)
        at app//org.apache.kafka.test.TestUtils.lambda$waitForCondition$4(TestUtils.java:337)
        at app//org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:385)
        at app//org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:334)
        at app//org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:318)
        at app//org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:308)
        at app//org.apache.kafka.connect.mirror.integration.DedicatedMirrorIntegrationTest.awaitTaskConfigurations(DedicatedMirrorIntegrationTest.java:353)
        at app//org.apache.kafka.connect.mirror.integration.DedicatedMirrorIntegrationTest.testMultiNodeCluster(DedicatedMirrorIntegrationTest.java:301)

    Caused by:
        java.util.concurrent.ExecutionException: org.apache.kafka.connect.runtime.distributed.RebalanceNeededException: Request cannot be completed because a rebalance is expected
            at org.apache.kafka.connect.util.ConvertingFutureCallback.result(ConvertingFutureCallback.java:123)
            at org.apache.kafka.connect.util.ConvertingFutureCallback.get(ConvertingFutureCallback.java:115)
            at org.apache.kafka.connect.mirror.integration.DedicatedMirrorIntegrationTest.lambda$awaitTaskConfigurations$8(DedicatedMirrorIntegrationTest.java:357)
            ... 7 more

    Caused by:
        org.apache.kafka.connect.runtime.distributed.RebalanceNeededException: Request cannot be completed because a rebalance is expected
{noformat}

> MM2 rolling restart can drop configuration changes silently
> -----------------------------------------------------------
>
>                 Key: KAFKA-15372
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15372
>             Project: Kafka
>          Issue Type: Bug
>          Components: mirrormaker
>            Reporter: Daniel Urban
>            Assignee: Greg Harris
>            Priority: Major
>             Fix For: 3.7.0, 3.6.2
>
>
> When MM2 is restarted, it tries to update the Connector configuration in all 
> flows. This is a one-time attempt, and it fails if the Connect worker is not 
> the leader of the group.
> In a distributed setup with a rolling restart, it is possible that for a 
> specific flow, the Connect worker of the just-restarted MM2 instance is not 
> the leader, meaning that Connector configurations can get dropped.
> For example, assuming 2 MM2 instances and one flow A->B:
>  # MM2 instance 1 is restarted; the worker inside MM2 instance 2 becomes the 
> leader of the A->B Connect group.
>  # MM2 instance 1 tries to update the Connector configurations, but fails 
> (instance 2 has the leader, not instance 1).
>  # MM2 instance 2 is restarted; leadership moves to the worker in MM2 
> instance 1.
>  # MM2 instance 2 tries to update the Connector configurations, but fails.
> At this point, the configuration changes made before the restart are never 
> applied. This can often happen silently, without any indication.
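
The four steps in the quoted description can be sketched as a toy model. This 
is not MirrorMaker or Connect code: the class RollingRestartSketch, the Group 
type, and the retryOnLeadershipGain flag are all hypothetical names made up 
for illustration, and the "retry when leadership is gained" branch is only one 
conceivable mitigation, not necessarily what the actual patch does. The model 
assumes that leadership always moves to the other instance when one restarts, 
and that only the leader can write Connector configurations.

```java
/** Toy model of the rolling-restart scenario; hypothetical, not Kafka code. */
class RollingRestartSketch {

    /** Stand-in for the Connect group of one flow (A->B) with two workers. */
    static final class Group {
        int leader;                    // index of the instance whose worker leads
        String appliedConfig = "old";  // configuration currently stored

        /** Assumption: only the leader may write the configuration. */
        boolean tryUpdate(int instance, String config) {
            if (instance != leader) {
                return false;          // non-leader update fails
            }
            appliedConfig = config;
            return true;
        }
    }

    /**
     * Restarts instance 0, then instance 1 (instances 1 and 2 in the
     * description). Each restarted instance makes one update attempt.
     * If retryOnLeadershipGain is true, an instance whose attempt failed
     * tries once more when it later becomes leader (hypothetical mitigation).
     */
    static String rollingRestart(boolean retryOnLeadershipGain) {
        Group group = new Group();
        group.leader = 0;
        boolean[] pending = new boolean[2];
        String newConfig = "new";

        for (int restarted = 0; restarted < 2; restarted++) {
            int other = 1 - restarted;
            group.leader = other;      // leadership moves away on restart

            // Hypothetical: the new leader replays an update that failed earlier.
            if (retryOnLeadershipGain && pending[other]) {
                pending[other] = !group.tryUpdate(other, newConfig);
            }

            // The restarted instance makes its one-time attempt as a non-leader.
            pending[restarted] = !group.tryUpdate(restarted, newConfig);
        }
        return group.appliedConfig;
    }

    public static void main(String[] args) {
        System.out.println("one-time attempt: " + rollingRestart(false));
        System.out.println("retry on leadership gain: " + rollingRestart(true));
    }
}
```

In this model the one-time attempt leaves the old configuration in place after 
both restarts, and neither attempt reports anything beyond a failed return 
value, which mirrors the silent drop described above.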



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
