[
https://issues.apache.org/jira/browse/NIFI-12221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17775062#comment-17775062
]
ASF subversion and git services commented on NIFI-12221:
--------------------------------------------------------
Commit f4ae292a457638d3226fb0491e5186fa52ae8518 in nifi's branch
refs/heads/main from Mark Payne
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=f4ae292a45 ]
NIFI-12221: This closes #7876. Be more lenient about which Disconnection Codes
we allow a node to be reconnected to a cluster vs. when we notify the node to
disconnect again. Also updated the timeout for OffloadIT because it
occasionally times out while running properly.
Signed-off-by: Joseph Witt <[email protected]>
> Make heartbeat responses more lenient in some cases
> ---------------------------------------------------
>
> Key: NIFI-12221
> URL: https://issues.apache.org/jira/browse/NIFI-12221
> Project: Apache NiFi
> Issue Type: Improvement
> Components: Core Framework
> Reporter: Mark Payne
> Assignee: Mark Payne
> Priority: Major
> Fix For: 2.latest
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When a heartbeat is received by the Cluster Coordinator, it responds based on
> the node's current connection state. In the case of a disconnected node, it
> either notifies the node that it is disconnected so that it will stop
> heartbeating, or it requests the node to reconnect to the cluster.
> Due to changes that were made in 1.16, as well as a few additional changes
> that have been made since, we can be much more lenient about when we ask the
> node to reconnect vs. disconnect. For example, if a node was disconnected due
> to not handling an update request, we previously needed to request that the
> node disconnect again. However, now we can ask the node to reconnect, as it
> may well be able to reconcile any differences and rejoin.
> We currently even request that a node disconnect when we receive a heartbeat
> from a node whose last state was "Disconnected because Node was Shutdown". We
> should definitely be more lenient in this case, as it's occasionally causing
> System Test failures (e.g.,
> https://github.com/apache/nifi/actions/runs/6498488206).
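
The decision described above can be sketched roughly as follows. This is a
minimal illustration and not the code from the commit: the class, enum, and
method names are hypothetical, and treating a user-initiated disconnect as the
case that still gets a "disconnected" notification is an assumption made only
to show the other branch. The two reconnectable reasons are the ones named in
the description.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch of how a coordinator might answer a heartbeat.
// None of these names are taken from the NiFi code base.
class HeartbeatResponseSketch {

    enum ConnectionState { CONNECTED, DISCONNECTED }

    // Illustrative reasons only; the real set of disconnection codes is larger.
    enum DisconnectReason { NODE_SHUTDOWN, FAILED_TO_SERVICE_REQUEST, USER_DISCONNECTED }

    enum Response { ACKNOWLEDGE, REQUEST_RECONNECT, NOTIFY_DISCONNECTED }

    // Reasons for which the node is asked to rejoin instead of being told again
    // that it is disconnected. Both entries come from the issue description.
    private static final Set<DisconnectReason> RECONNECTABLE =
            EnumSet.of(DisconnectReason.NODE_SHUTDOWN,
                       DisconnectReason.FAILED_TO_SERVICE_REQUEST);

    static Response respond(ConnectionState state, DisconnectReason reason) {
        if (state == ConnectionState.CONNECTED) {
            // A connected node's heartbeat needs no corrective action.
            return Response.ACKNOWLEDGE;
        }
        // Disconnected node: either invite it back or tell it to stop heartbeating.
        return RECONNECTABLE.contains(reason)
                ? Response.REQUEST_RECONNECT
                : Response.NOTIFY_DISCONNECTED;
    }

    public static void main(String[] args) {
        for (DisconnectReason reason : DisconnectReason.values()) {
            System.out.println(reason + " -> "
                    + respond(ConnectionState.DISCONNECTED, reason));
        }
    }
}
```

Keeping the lenient cases in a single set means that relaxing the behavior for
another reason is a one-line change to that set.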
--
This message was sent by Atlassian Jira
(v8.20.10#820010)