[ 
https://issues.apache.org/jira/browse/NIFI-15906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18080959#comment-18080959
 ] 

ASF subversion and git services commented on NIFI-15906:
--------------------------------------------------------

Commit 5bf68c8913d67776566c7c3210525a315985b844 in nifi's branch 
refs/heads/main from Rakesh Kumar Singh
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=5bf68c8913d ]

NIFI-15906 Fixed Exception on cluster reconnect for running RemoteGroupPort 
Destinations (#11212)

During cluster reconnect, StandardVersionedComponentSynchronizer calls
updateConnectionDestinations(), which temporarily re-points connections to
a dummy Funnel. When the current destination is a RemoteGroupPort (RGP)
that has versionedComponentId=null (common for S2S ports discovered at
runtime), the synchronizer cannot match it to the versioned component map
and attempts a temp-Funnel detour. If that RGP is running,
StandardConnection.setDestination() rejects the change with an
IllegalStateException.

Signed-off-by: David Handermann <[email protected]>
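
The failure mode described in the commit message can be illustrated with a
small Python model. This is a sketch under stated assumptions, not NiFi code:
the classes Connectable and Connection, the exception IllegalStateError, and
the function update_connection_destinations are hypothetical stand-ins for
the Java types named in the stack trace, and the matching rule (a null
versionedComponentId never matches the versioned map) is taken from the
description above.

```python
class IllegalStateError(RuntimeError):
    """Stand-in for Java's IllegalStateException."""


class Connectable:
    """Hypothetical stand-in for a Funnel or RemoteGroupPort."""

    def __init__(self, name, running=False, versioned_component_id=None):
        self.name = name
        self.running = running
        self.versioned_component_id = versioned_component_id


class Connection:
    def __init__(self, destination):
        self.destination = destination

    def set_destination(self, new_destination):
        # Models the guard reported in the stack trace: a running
        # destination cannot be swapped out.
        if self.destination.running:
            raise IllegalStateError(
                "Cannot change destination of Connection because the "
                f"current destination ({self.destination.name}) is running"
            )
        self.destination = new_destination


def update_connection_destinations(connections, versioned_ids, temp_funnel):
    """Detour every connection whose destination has no match in versioned_ids."""
    for connection in connections:
        if connection.destination.versioned_component_id not in versioned_ids:
            connection.set_destination(temp_funnel)


# A running port discovered at runtime: versioned_component_id stays None.
rgp = Connectable("TARGET_PORT", running=True)
connection = Connection(rgp)
try:
    update_connection_destinations(
        [connection], {"known-id"}, Connectable("temp-funnel")
    )
except IllegalStateError as error:
    print(error)
```

In this model the detour is attempted only because the port cannot be
matched, and it fails only because the port is running. A stopped port with
the same null ID is re-pointed to the temporary funnel without error.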

> Cluster reconnect inheritance throws IllegalStateException when a 
> connection's destination is a running RemoteGroupPort whose 
> versionedComponentId is null
> ----------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-15906
>                 URL: https://issues.apache.org/jira/browse/NIFI-15906
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 2.5.0
>            Reporter: Xinyu Wang
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Symptom*
>   When a cluster node performs reconnect inheritance and the local flow 
> contains a Connection whose destination is a RemoteGroupPort (RGP) with 
> transmission=ON, the synchronizer aborts with:
>   ERROR [Reconnect to Cluster] o.a.nifi.controller.StandardFlowService Handling reconnection request failed
>   org.apache.nifi.controller.serialization.FlowSynchronizationException: Failed to connect node to cluster because local flow controller partially updated.
>   Administrator should disconnect node and review flow for corruption.
>       at o.a.n.controller.StandardFlowService.loadFromConnectionResponse(StandardFlowService.java:947)
>       ...
>   Caused by: java.lang.IllegalStateException: Cannot change destination of Connection because the current destination
>   ([RemoteGroupPort[name=TARGET_PORT,targets=https://nifi-1:8443]]) is running
>       at o.a.n.connectable.StandardConnection.setDestination(StandardConnection.java:296)
>       at o.a.n.flow.synchronization.StandardVersionedComponentSynchronizer.updateConnectionDestinations(StandardVersionedComponentSynchronizer.java:863)
>       at o.a.n.flow.synchronization.StandardVersionedComponentSynchronizer.synchronize(StandardVersionedComponentSynchronizer.java:573)
>       ...
>   The node is then marked DISCONNECTED (Disconnect Code = Node's Flow did 
> not Match Cluster Flow) and requires manual intervention to rejoin.
> *Steps to Reproduce*
>   Minimal repro on a fresh two-node 2.5.0 cluster:
>   1. Build the following minimal flow on node-1 via REST API:
>     - An InputPort with allowRemoteAccess=true (the "target port")
>     - A Funnel connected from the InputPort (so the InputPort can be started)
>     - A RemoteProcessGroup whose targetUris points back at the same cluster 
> (https://node-1:8443) — i.e. a loopback RPG
>     - Wait for the RPG to discover the target port via S2S handshake (creates 
> a local RemoteGroupPort instance with versionedComponentId = null)
>     - Add a GenerateFlowFile processor with a connection whose destination is 
> the discovered RemoteGroupPort
>     - Enable RemoteProcessGroup transmission (RGP becomes RUNNING)
>   2. Disconnect node-2: PUT /controller/cluster/nodes/{id} with status: 
> DISCONNECTING. Wait until DISCONNECTED.
>   3. Immediately reconnect node-2: PUT /controller/cluster/nodes/{id} with 
> status: CONNECTING.
>   Note: The bug fires on any reconnect that triggers inheritance, as long as 
> the local RGP has versionedComponentId = null and is RUNNING.
> *Expected*: node-2 reconnects to CONNECTED.
> *Actual*: node-2 logs IllegalStateException: Cannot change destination of 
> Connection ... within ~250 ms of the reconnect, transitions to DISCONNECTED 
> with Node's Flow did not Match Cluster Flow, and stays disconnected.
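
Steps 2 and 3 of the reproduction can be scripted. The following Python
sketch only builds the two PUT requests and does not send them. Several
details are assumptions not stated in the issue: the base URL with its
/nifi-api prefix, the JSON body shape {"node": {"nodeId": ..., "status":
...}}, the node ID placeholder, and the helper name node_status_request.
Authentication and TLS setup are omitted.

```python
import json
from urllib.request import Request


def node_status_request(base_url, node_id, status):
    """Build a PUT request that asks the cluster to change a node's status."""
    # Assumed payload shape; verify against the REST API docs for your version.
    body = json.dumps({"node": {"nodeId": node_id, "status": status}})
    return Request(
        url=f"{base_url}/controller/cluster/nodes/{node_id}",
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


BASE_URL = "https://node-1:8443/nifi-api"  # assumed API root
NODE_ID = "node-2-id"                      # placeholder, not a real node ID

disconnect = node_status_request(BASE_URL, NODE_ID, "DISCONNECTING")
reconnect = node_status_request(BASE_URL, NODE_ID, "CONNECTING")

for request in (disconnect, reconnect):
    print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```

To run the reproduction, each request would be passed to
urllib.request.urlopen with suitable credentials, waiting for the node to
report DISCONNECTED between the two calls.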



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
