[
https://issues.apache.org/jira/browse/IGNITE-14053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vladimir Steshin updated IGNITE-14053:
--------------------------------------
Description:
Suggestion: remove the duplicated ‘ping’ and make the code simpler.
To detect failed nodes, TcpDiscoverySpi already has a sustained ping,
TcpDiscoveryConnectionCheckMessage and the backward connection check. But there
is also a status check message (TcpDiscoveryStatusCheckMessage), which looks
outdated. This message was introduced in the first versions of discovery, when
cluster stability and message delivery were still under development.
was:
Suggestion: remove the duplicated ‘ping’ and make the code simpler.
To detect failed nodes, TcpDiscoverySpi already has a sustained ping,
TcpDiscoveryConnectionCheckMessage and the backward connection check. But there
is also a status check message (TcpDiscoveryStatusCheckMessage), which looks
outdated. This message was introduced in the first versions of discovery, when
cluster stability and message delivery were still under development.
Currently, TcpDiscoveryStatusCheckMessage is actually sent only occasionally,
at cluster start, and not later because of the ping: the ping updates the time
of the last received message, so the condition for raising the status check is
never met.
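The suppression described above can be illustrated with a minimal sketch. It
is not Ignite code: the class StatusCheckTimer, its methods and the threshold
are hypothetical, and it only assumes that a status check is raised after a
period with no received messages.

```java
/**
 * Hypothetical model (not the TcpDiscoverySpi implementation): a status check
 * is raised only when no message has been received for thresholdMs.
 */
final class StatusCheckTimer {
    private final long thresholdMs;
    private long lastMsgRcvdMs;

    StatusCheckTimer(long thresholdMs, long nowMs) {
        this.thresholdMs = thresholdMs;
        this.lastMsgRcvdMs = nowMs;
    }

    /** Any received message, including a ping, resets the silence window. */
    void onMessageReceived(long nowMs) {
        lastMsgRcvdMs = nowMs;
    }

    /** True only if the node has heard nothing for at least thresholdMs. */
    boolean shouldSendStatusCheck(long nowMs) {
        return nowMs - lastMsgRcvdMs >= thresholdMs;
    }
}
```

As long as pings arrive more often than the threshold, shouldSendStatusCheck
never returns true in this model.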
It is possible for a node to lose all incoming connections but keep its
connection to the next node. In this case the node gets removed from the ring
by its follower, but it cannot recognize the failure because it still
successfully sends messages to the next node. Instead of the complex processing
of TcpDiscoveryStatusCheckMessage, it seems enough to answer a message with
'OK, but you are not in the ring'. Every other node sees the failure of the
malfunctioning node and can report it in the message response.
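A minimal sketch of the proposed response, under stated assumptions: the names
RingResponder, Response and OK_NOT_IN_RING are hypothetical and do not come
from the Ignite code base. The receiver simply checks the sender against its
own view of the ring before acknowledging.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical receiver-side check; not the actual discovery protocol. */
final class RingResponder {
    /** Possible acknowledgements; OK_NOT_IN_RING tells the sender it was dropped. */
    enum Response { OK, OK_NOT_IN_RING }

    /** This node's current view of the ring members. */
    private final Set<String> ringNodeIds = new HashSet<>();

    void addNode(String nodeId) {
        ringNodeIds.add(nodeId);
    }

    void removeNode(String nodeId) {
        ringNodeIds.remove(nodeId);
    }

    /** Acknowledge the message, flagging senders that are no longer in the ring. */
    Response onMessage(String senderId) {
        return ringNodeIds.contains(senderId) ? Response.OK : Response.OK_NOT_IN_RING;
    }
}
```

In this sketch a node that was removed while still able to send learns about
it from the first acknowledgement it receives, without a separate status
check round trip.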
The ticket has been additionally verified with the integration discovery test:
[https://github.com/apache/ignite/pull/8716]
The parent ticket (IGNITE-13980) suggests keeping
TcpDiscoveryStatusCheckMessage for backward compatibility with older versions
of Ignite.
> Remove status check message at all.
> -----------------------------------
>
> Key: IGNITE-14053
> URL: https://issues.apache.org/jira/browse/IGNITE-14053
> Project: Ignite
> Issue Type: Sub-task
> Reporter: Vladimir Steshin
> Assignee: Vladimir Steshin
> Priority: Minor
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Suggestion: remove the duplicated ‘ping’ and make the code simpler.
> To detect failed nodes, TcpDiscoverySpi already has a sustained ping,
> TcpDiscoveryConnectionCheckMessage and the backward connection check. But
> there is also a status check message (TcpDiscoveryStatusCheckMessage), which
> looks outdated. This message was introduced in the first versions of
> discovery, when cluster stability and message delivery were still under
> development.