Xinyu Wang created NIFI-15906:
---------------------------------
Summary: Cluster reconnect inheritance throws
IllegalStateException when a connection's destination is a running
RemoteGroupPort whose versionedComponentId is null
Key: NIFI-15906
URL: https://issues.apache.org/jira/browse/NIFI-15906
Project: Apache NiFi
Issue Type: Bug
Components: Core Framework
Affects Versions: 2.5.0
Reporter: Xinyu Wang
*Symptom*
When a cluster node performs reconnect inheritance and the local flow
contains a Connection whose destination is a RemoteGroupPort (RGP) with
transmission=ON, the synchronizer aborts with:
ERROR [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Handling reconnection request failed
org.apache.nifi.controller.serialization.FlowSynchronizationException: Failed to connect node to cluster because local flow controller partially updated. Administrator should disconnect node and review flow for corruption.
    at o.a.n.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:947)
    ...
Caused by: java.lang.IllegalStateException: Cannot change destination of Connection because the current destination ([RemoteGroupPort[name=TARGET_PORT,targets=https://nifi-1:8443]]) is running
    at o.a.n.connectable.StandardConnection.setDestination(StandardConnection.java:296)
    at o.a.n.flow.synchronization.StandardVersionedComponentSynchronizer.updateConnectionDestinations(StandardVersionedComponentSynchronizer.java:863)
    at o.a.n.flow.synchronization.StandardVersionedComponentSynchronizer.synchronize(StandardVersionedComponentSynchronizer.java:573)
    ...
The node is then marked DISCONNECTED with Disconnect Code = "Node's Flow did
not Match Cluster Flow", and it requires manual intervention to rejoin.
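To make the failing guard concrete, here is a toy model of the check implied by the exception message above. It is an illustrative sketch only, not NiFi code: the class and method names are hypothetical Python stand-ins, and it does not attempt to model why the synchronizer decides that the destination has to be changed.

```python
class RemoteGroupPort:
    """Hypothetical stand-in for a discovered remote port."""

    def __init__(self, name, running=False, versioned_component_id=None):
        self.name = name
        self.running = running
        # Discovered ports in this report carry no versioned component id.
        self.versioned_component_id = versioned_component_id


class Connection:
    """Hypothetical stand-in for a connection with a guarded destination."""

    def __init__(self, destination):
        self.destination = destination

    def set_destination(self, new_destination):
        # Assumption: the guard refuses any change while the current
        # destination is running, as the exception message suggests.
        if self.destination.running:
            raise RuntimeError(
                "Cannot change destination of Connection because the "
                f"current destination ({self.destination.name}) is running"
            )
        self.destination = new_destination
```

In this model, a call to `set_destination` fails for as long as the current port has `running=True`, and succeeds once the port is stopped, which matches the symptom of the failure appearing only when transmission is ON.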
*Steps to Reproduce*
Minimal repro on a fresh two-node 2.5.0 cluster:
1. Build the following minimal flow on node-1 via REST API:
- An InputPort with allowRemoteAccess=true (the "target port")
- A Funnel connected from the InputPort (so the InputPort can be started)
- A RemoteProcessGroup whose targetUris points back at the same cluster
(https://node-1:8443) — i.e. a loopback RPG
- Wait for the RPG to discover the target port via S2S handshake (creates a
local RemoteGroupPort instance with versionedComponentId = null)
- Add a GenerateFlowFile processor with a connection whose destination is
the discovered RemoteGroupPort
- Enable RemoteProcessGroup transmission (RGP becomes RUNNING)
2. Disconnect node-2: PUT /controller/cluster/nodes/{id} with status:
DISCONNECTING. Wait until DISCONNECTED.
3. Immediately reconnect node-2: PUT /controller/cluster/nodes/{id} with
status: CONNECTING.
Note: The bug fires on any reconnect that triggers inheritance, as long as
the local RGP has versionedComponentId = null and is RUNNING.
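Steps 2 and 3 can be scripted. The sketch below only builds the PUT request and does not send it. It rests on assumptions that are not stated in this report: that the REST API is served under the /nifi-api prefix, that the request body is a JSON entity of the form {"node": {"nodeId": ..., "status": ...}}, and that a bearer token is used for authentication. The helper name build_node_status_request is hypothetical.

```python
import json
import urllib.request

# Assumption: the REST API lives under this prefix on each node.
API_PREFIX = "/nifi-api"


def build_node_status_request(base_url, node_id, status, token=None):
    """Build (but do not send) the PUT that changes a node's cluster status."""
    url = f"{base_url.rstrip('/')}{API_PREFIX}/controller/cluster/nodes/{node_id}"
    # Assumption: the endpoint accepts a node entity shaped like this.
    body = json.dumps({"node": {"nodeId": node_id, "status": status}}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: bearer-token authentication is in use.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")
```

A repro run would send one request with status DISCONNECTING via urllib.request.urlopen, poll the node until it reports DISCONNECTED, and then send a second request with status CONNECTING. TLS settings and credentials are deployment-specific and are left out here.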
*Expected*: node-2 reconnects to CONNECTED.
*Actual*: node-2 logs IllegalStateException: Cannot change destination of
Connection ... within ~250 ms of the reconnect, transitions to DISCONNECTED
with Node's Flow did not Match Cluster Flow, and stays disconnected.