[ https://issues.apache.org/jira/browse/NIFI-15906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18080959#comment-18080959 ]
ASF subversion and git services commented on NIFI-15906:
--------------------------------------------------------
Commit 5bf68c8913d67776566c7c3210525a315985b844 in nifi's branch
refs/heads/main from Rakesh Kumar Singh
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=5bf68c8913d ]
NIFI-15906 Fixed Exception on cluster reconnect for running RemoteGroupPort
Destinations (#11212)
During cluster reconnect, StandardVersionedComponentSynchronizer calls
updateConnectionDestinations(), which temporarily re-points connections to
a dummy Funnel. When the current destination is a RemoteGroupPort (RGP)
that has versionedComponentId=null (common for S2S ports discovered at
runtime), the synchronizer cannot match it to the versioned component map
and attempts the temp-Funnel detour. Because the port is running,
StandardConnection.setDestination() rejects the change with an
IllegalStateException.
Signed-off-by: David Handermann <[email protected]>
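The failure mode described in the commit message can be modeled with a small, self-contained sketch. This is not NiFi code: the classes Component and Connection, the methods isMatched and detourIfUnmatched, and the "running" guard are hypothetical stand-ins, assumed only from the commit message and the exception text in the issue. The sketch shows why a destination with a null versionedComponentId never matches by id and why the detour then fails when that destination is running. It does not attempt to show how the commit fixes the problem.

```java
import java.util.Set;

public class ReconnectSketch {

    /** Hypothetical stand-in for a connectable component (funnel, remote port, ...). */
    public static class Component {
        public final String name;
        public final String versionedComponentId; // null for ports discovered at runtime
        public final boolean running;

        public Component(String name, String versionedComponentId, boolean running) {
            this.name = name;
            this.versionedComponentId = versionedComponentId;
            this.running = running;
        }
    }

    /** Hypothetical stand-in for a connection that guards its destination. */
    public static class Connection {
        private Component destination;

        public Connection(Component destination) {
            this.destination = destination;
        }

        public Component getDestination() {
            return destination;
        }

        public void setDestination(Component newDestination) {
            // Assumed guard, modeled on the message in the reported stack trace.
            if (destination.running) {
                throw new IllegalStateException(
                        "Cannot change destination of Connection because the current destination ("
                                + destination.name + ") is running");
            }
            destination = newDestination;
        }
    }

    /** A component with a null versioned id can never be matched by id. */
    public static boolean isMatched(Component component, Set<String> proposedIds) {
        return component.versionedComponentId != null
                && proposedIds.contains(component.versionedComponentId);
    }

    /**
     * Re-points the connection to the temporary funnel when its current
     * destination is unmatched. Returns true if the detour was taken.
     */
    public static boolean detourIfUnmatched(Connection connection,
                                            Set<String> proposedIds,
                                            Component tempFunnel) {
        if (isMatched(connection.getDestination(), proposedIds)) {
            return false;
        }
        connection.setDestination(tempFunnel); // throws if the current destination is running
        return true;
    }

    public static void main(String[] args) {
        Set<String> proposedIds = Set.of("versioned-id-1");
        Component tempFunnel = new Component("temp-funnel", null, false);
        Component runningPort = new Component("TARGET_PORT", null, true);
        Connection connection = new Connection(runningPort);
        try {
            detourIfUnmatched(connection, proposedIds, tempFunnel);
            System.out.println("detour succeeded");
        } catch (IllegalStateException e) {
            System.out.println("detour failed: " + e.getMessage());
        }
    }
}
```

In this model, a stopped port with a null id is re-pointed without error, and a port whose id appears in the proposed set is left alone. Only the combination of a null id and a running destination reaches the guard and throws.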
> Cluster reconnect inheritance throws IllegalStateException when a
> connection's destination is a running RemoteGroupPort whose
> versionedComponentId is null
> ----------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: NIFI-15906
> URL: https://issues.apache.org/jira/browse/NIFI-15906
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 2.5.0
> Reporter: Xinyu Wang
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> *Symptom*
> When a cluster node performs reconnect inheritance and the local flow
> contains a Connection whose destination is a RemoteGroupPort (RGP) with
> transmission=ON, the synchronizer aborts with:
> ERROR [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Handling reconnection request failed
> org.apache.nifi.controller.serialization.FlowSynchronizationException: Failed to connect node to cluster because local flow controller partially updated. Administrator should disconnect node and review flow for corruption.
>     at o.a.n.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:947)
>     ...
> Caused by: java.lang.IllegalStateException: Cannot change destination of Connection because the current destination ([RemoteGroupPort[name=TARGET_PORT,targets=https://nifi-1:8443]]) is running
>     at o.a.n.connectable.StandardConnection.setDestination(StandardConnection.java:296)
>     at o.a.n.flow.synchronization.StandardVersionedComponentSynchronizer.updateConnectionDestinations(StandardVersionedComponentSynchronizer.java:863)
>     at o.a.n.flow.synchronization.StandardVersionedComponentSynchronizer.synchronize(StandardVersionedComponentSynchronizer.java:573)
>     ...
> The node is then marked DISCONNECTED with Disconnect Code = Node's Flow did
> not Match Cluster Flow, and requires manual intervention to rejoin.
> *Steps to Reproduce*
> Minimal repro on a fresh two-node 2.5.0 cluster:
> 1. Build the following minimal flow on node-1 via REST API:
> - An InputPort with allowRemoteAccess=true (the "target port")
> - A Funnel connected from the InputPort (so the InputPort can be started)
> - A RemoteProcessGroup whose targetUris points back at the same cluster
> (https://node-1:8443) — i.e. a loopback RPG
> - Wait for the RPG to discover the target port via S2S handshake (creates
> a local RemoteGroupPort instance with versionedComponentId = null)
> - Add a GenerateFlowFile processor with a connection whose destination is
> the discovered RemoteGroupPort
> - Enable RemoteProcessGroup transmission (RGP becomes RUNNING)
> 2. Disconnect node-2: PUT /controller/cluster/nodes/{id} with status:
> DISCONNECTING. Wait until DISCONNECTED.
> 3. Immediately reconnect node-2: PUT /controller/cluster/nodes/{id} with
> status: CONNECTING.
> Note: The bug fires on any reconnect that triggers inheritance, as long as
> the local RGP has versionedComponentId = null and is RUNNING.
> *Expected*: node-2 reconnects to CONNECTED.
> *Actual*: node-2 logs IllegalStateException: Cannot change destination of
> Connection ... within ~250 ms of the reconnect, transitions to DISCONNECTED
> with Node's Flow did not Match Cluster Flow, and stays disconnected.
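Steps 2 and 3 of the reproduction can be scripted. The sketch below only builds the two PUT requests with the JDK's java.net.http API and does not send them. Several details are assumptions rather than facts from the report: the /nifi-api base path, the JSON body shape (a "node" object carrying "nodeId" and "status"), and the placeholder NODE_ID. Authentication and TLS trust setup are omitted, and the class and method names are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class NodeStatusRequests {

    /** Assumed request body shape; check it against the NiFi REST API docs. */
    public static String body(String nodeId, String status) {
        return "{\"node\":{\"nodeId\":\"" + nodeId + "\",\"status\":\"" + status + "\"}}";
    }

    /** Builds (but does not send) the PUT that changes a node's cluster status. */
    public static HttpRequest statusChange(String apiBase, String nodeId, String status) {
        return HttpRequest.newBuilder()
                .uri(URI.create(apiBase + "/controller/cluster/nodes/" + nodeId))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(body(nodeId, status)))
                .build();
    }

    public static void main(String[] args) {
        // Assumed base URL and placeholder node id; replace both for a real cluster.
        String apiBase = "https://node-1:8443/nifi-api";
        String nodeId = "NODE_ID";

        HttpRequest disconnect = statusChange(apiBase, nodeId, "DISCONNECTING");
        HttpRequest reconnect = statusChange(apiBase, nodeId, "CONNECTING");

        System.out.println(disconnect.method() + " " + disconnect.uri());
        System.out.println(reconnect.method() + " " + reconnect.uri());
    }
}
```

To run the reproduction, each request would be sent with an HttpClient configured for the cluster's credentials. Between the two calls, the script would poll the node until it reports DISCONNECTED, as step 2 requires.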
--
This message was sent by Atlassian Jira
(v8.20.10#820010)