[ https://issues.apache.org/jira/browse/KAFKA-15372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17755833#comment-17755833 ]
Daniel Urban commented on KAFKA-15372:
--------------------------------------

[~gharris1727] I'm not sure I follow this part: "should forward configurations to the leader via the internal REST API."

I checked org.apache.kafka.connect.mirror.MirrorMaker#configureConnector, which calls org.apache.kafka.connect.runtime.distributed.DistributedHerder#putConnectorConfig, and I don't see any sign of forwarding to the leader. The validation callback explicitly handles the non-leader state with a failure:
{code:java}
if (!isLeader()) {
    callback.onCompletion(new NotLeaderException("Only the leader can set connector configs.", leaderUrl()), null);
    return null;
}
{code}
So I think current trunk is also affected: there is no Connector configuration forwarding to the leader in MM2. Additionally, I'm not sure a single forward attempt is enough to ensure correctness, but that is an implementation detail.

Unfortunately, I don't have an exact reproduction, but I saw this happen in an actual cluster; the leadership changes occurred as I detailed in the ticket description.

> MM2 rolling restart can drop configuration changes silently
> -----------------------------------------------------------
>
>                 Key: KAFKA-15372
>                 URL: https://issues.apache.org/jira/browse/KAFKA-15372
>             Project: Kafka
>          Issue Type: Improvement
>          Components: mirrormaker
>            Reporter: Daniel Urban
>            Priority: Major
>
> When MM2 is restarted, it tries to update the Connector configuration in all
> flows. This is a one-time trial, and fails if the Connect worker is not the
> leader of the group.
> In a distributed setup and with a rolling restart, it is possible that for a
> specific flow, the Connect worker of the just restarted MM2 instance is not
> the leader, meaning that Connector configurations can get dropped.
> For example, assuming 2 MM2 instances, and one flow A->B:
> # MM2 instance 1 is restarted, the worker inside MM2 instance 2 becomes the
> leader of A->B Connect group.
> # MM2 instance 1 tries to update the Connector configurations, but fails
> (instance 2 has the leader, not instance 1)
> # MM2 instance 2 is restarted, leadership moves to worker in MM2 instance 1
> # MM2 instance 2 tries to update the Connector configurations, but fails
> At this point, the configuration changes before the restart are never
> applied. Many times, this can also happen silently, without any indication.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
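
For reference, the four-step sequence from the ticket description can be replayed with a toy model. This is only a sketch under stated assumptions: RollingRestartSketch, Group, tryUpdate and replay are hypothetical names made up for illustration, not Kafka Connect or MM2 classes, and leadership is reduced to a single integer. It shows only that a one-shot update which is rejected on a non-leader leaves the stored configuration unchanged after both restarts.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the rolling-restart scenario; hypothetical, not Kafka Connect code. */
public class RollingRestartSketch {

    /** Hypothetical stand-in for one Connect group (flow A->B) with a single stored config. */
    static final class Group {
        int leader;                       // id of the MM2 instance whose worker currently leads
        String storedConfig = "old";      // configuration currently held by the group
        final List<String> log = new ArrayList<>();

        /** One-shot update: applied only when the calling instance's worker is the leader. */
        boolean tryUpdate(int instance, String config) {
            if (instance != leader) {
                log.add("instance " + instance + ": rejected, leader is instance " + leader);
                return false;
            }
            storedConfig = config;
            log.add("instance " + instance + ": applied " + config);
            return true;
        }
    }

    /** Replays the four steps from the description with MM2 instances 1 and 2. */
    static Group replay() {
        Group group = new Group();
        group.leader = 2;             // step 1: instance 1 restarts, instance 2's worker leads
        group.tryUpdate(1, "new");    // step 2: instance 1 tries once and is rejected
        group.leader = 1;             // step 3: instance 2 restarts, leadership moves to instance 1
        group.tryUpdate(2, "new");    // step 4: instance 2 tries once and is rejected
        return group;
    }

    public static void main(String[] args) {
        Group group = replay();
        group.log.forEach(System.out::println);
        System.out.println("stored config: " + group.storedConfig);
    }
}
```

In this model neither attempt reaches the leader, so the stored configuration is still the pre-restart value once the rolling restart completes. Whether forwarding to the leader, retrying, or both would close the gap in MM2 itself is the open question in this thread.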