[
https://issues.apache.org/jira/browse/IGNITE-25802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Roman Puchkovskiy updated IGNITE-25802:
---------------------------------------
Description:
There is a log from a Docker run where a line reports the following
2025-07-02 08:32:57:705 +0000 [INFO][main][ConnectionManager] Server started
[address=/0.0.0.0:3344]
2025-07-02 08:32:57:705 +0000 [INFO][main][LocalIpAddresses] Local IP
addresses: [/172.28.0.2, /127.0.0.1]
But later there is
2025-07-02 08:32:57:904 +0000 [INFO][main][MembershipProtocol]
[default:Test_cluster_0:[email protected]:3344] Making initial Sync
to all seed members: [172.17.0.1:3344, 172.17.0.1:3345, 172.17.0.1:3346]
2025-07-02 08:32:58:028 +0000
[WARNING][Test_cluster_0-network-worker-1][HandshakeManagerUtils] Rejecting
handshake: Got handshake start from self, this should never happen; this is a
programming error [localNode={id=a2363b06-889f-42ab-a32e-d5563b0dacf8,
name=Test_cluster_0, address=172.28.0.2:3344},
acceptorNode=ClusterNodeMessageImpl [host=172.28.0.2,
id=a2363b06-889f-42ab-a32e-d5563b0dacf8, name=Test_cluster_0, port=3344]]
which means that the current node is also reachable as 172.17.0.1, but for
some reason it did not detect that IP as one of its own on start.
# Maybe our code that collects all IPs of the node we run on is incomplete; in
this case we need to fix it.
# If we try to establish a connection via some IP1 and find out that it
leads to the same node, we could just add IP1 to the list of known local IPs
and retry sending the message (this time, it would be handled
locally, without going through the network).
If it's not possible to make the detection reliable, we can just remember the
fact that IP X points to ourselves and use it later when detecting
'calls to self', instead of throwing an exception.
was:
There is a log from a Docker run where a line reports the following
2025-07-02 08:32:57:705 +0000 [INFO][main][ConnectionManager] Server started
[address=/0.0.0.0:3344]
2025-07-02 08:32:57:705 +0000 [INFO][main][LocalIpAddresses] Local IP
addresses: [/172.28.0.2, /127.0.0.1]
But later there is
2025-07-02 08:32:57:904 +0000 [INFO][main][MembershipProtocol]
[default:Test_cluster_0:[email protected]:3344] Making initial Sync
to all seed members: [172.17.0.1:3344, 172.17.0.1:3345, 172.17.0.1:3346]
2025-07-02 08:32:58:028 +0000
[WARNING][Test_cluster_0-network-worker-1][HandshakeManagerUtils] Rejecting
handshake: Got handshake start from self, this should never happen; this is a
programming error [localNode={id=a2363b06-889f-42ab-a32e-d5563b0dacf8,
name=Test_cluster_0, address=172.28.0.2:3344},
acceptorNode=ClusterNodeMessageImpl [host=172.28.0.2,
id=a2363b06-889f-42ab-a32e-d5563b0dacf8, name=Test_cluster_0, port=3344]]
which means that the current node is also reachable as 172.17.0.1, but for
some reason it did not detect that IP as one of its own on start.
# Maybe our code that collects all IPs of the node we run on is incomplete; in
this case we need to fix it.
# If we try to establish a connection via some IP1 and find out that it
leads to the same node, we could just add IP1 to the list of known local IPs
and retry sending the message (this time, it would be handled
locally, without going through the network).
> 'Not my IP' detection is unreliable in Docker envs
> --------------------------------------------------
>
> Key: IGNITE-25802
> URL: https://issues.apache.org/jira/browse/IGNITE-25802
> Project: Ignite
> Issue Type: Bug
> Reporter: Roman Puchkovskiy
> Priority: Major
> Labels: ignite-3
>
> There is a log from a Docker run where a line reports the following
> 2025-07-02 08:32:57:705 +0000 [INFO][main][ConnectionManager] Server started
> [address=/0.0.0.0:3344]
> 2025-07-02 08:32:57:705 +0000 [INFO][main][LocalIpAddresses] Local IP
> addresses: [/172.28.0.2, /127.0.0.1]
> But later there is
> 2025-07-02 08:32:57:904 +0000 [INFO][main][MembershipProtocol]
> [default:Test_cluster_0:[email protected]:3344] Making initial Sync
> to all seed members: [172.17.0.1:3344, 172.17.0.1:3345, 172.17.0.1:3346]
> 2025-07-02 08:32:58:028 +0000
> [WARNING][Test_cluster_0-network-worker-1][HandshakeManagerUtils] Rejecting
> handshake: Got handshake start from self, this should never happen; this is a
> programming error [localNode={id=a2363b06-889f-42ab-a32e-d5563b0dacf8,
> name=Test_cluster_0, address=172.28.0.2:3344},
> acceptorNode=ClusterNodeMessageImpl [host=172.28.0.2,
> id=a2363b06-889f-42ab-a32e-d5563b0dacf8, name=Test_cluster_0, port=3344]]
> which means that the current node is also reachable as 172.17.0.1, but for
> some reason it did not detect that IP as one of its own on start.
> # Maybe our code that collects all IPs of the node we run on is incomplete;
> in this case we need to fix it.
> # If we try to establish a connection via some IP1 and find out that it
> leads to the same node, we could just add IP1 to the list of known local IPs
> and retry sending the message (this time, it would be handled
> locally, without going through the network).
> If it's not possible to make the detection reliable, we can just remember the
> fact that IP X points to ourselves and use it later when detecting
> 'calls to self', instead of throwing an exception.
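
The two options in the description (scan local interfaces, then learn extra self-addresses from handshakes) could be combined roughly as in the sketch below. This is only an illustration under stated assumptions: SelfAddressRegistry, scanInterfaces, isSelf and onHandshake are hypothetical names, not existing Ignite classes or methods, and the real handshake code may look quite different. The sketch records learned addresses as IP plus port rather than as a bare IP, because the seed list in the log uses 172.17.0.1 with three different ports, which presumably belong to three different nodes.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Collections;
import java.util.Enumeration;
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper (not an Ignite class): tracks which addresses lead back to the local node. */
final class SelfAddressRegistry {
    private final UUID localNodeId;
    private final int localPort;
    private final Set<InetAddress> interfaceAddresses;

    /** Addresses (IP plus port) that a handshake has shown to reach this very node. */
    private final Set<InetSocketAddress> learned = ConcurrentHashMap.newKeySet();

    SelfAddressRegistry(UUID localNodeId, int localPort, Set<InetAddress> interfaceAddresses) {
        this.localNodeId = localNodeId;
        this.localPort = localPort;
        this.interfaceAddresses = Set.copyOf(interfaceAddresses);
    }

    /** Best-effort scan of the addresses bound to local network interfaces. */
    static Set<InetAddress> scanInterfaces() {
        Set<InetAddress> result = new HashSet<>();
        try {
            Enumeration<NetworkInterface> ifaces = NetworkInterface.getNetworkInterfaces();
            if (ifaces != null) {
                for (NetworkInterface iface : Collections.list(ifaces)) {
                    result.addAll(Collections.list(iface.getInetAddresses()));
                }
            }
        } catch (SocketException e) {
            // Scan failed: return whatever was collected so far (possibly nothing).
        }
        return result;
    }

    /** True if the target is a scanned interface address on our port, or was learned earlier. */
    boolean isSelf(InetSocketAddress target) {
        if (learned.contains(target)) {
            return true;
        }
        InetAddress ip = target.getAddress(); // null when the host name is unresolved
        return ip != null && target.getPort() == localPort && interfaceAddresses.contains(ip);
    }

    /**
     * Called when a handshake with the target reveals the remote node id.
     * If that id is our own, remember the target instead of treating it as an error.
     *
     * @return true if the target turned out to be this node
     */
    boolean onHandshake(InetSocketAddress target, UUID remoteNodeId) {
        if (!localNodeId.equals(remoteNodeId)) {
            return false;
        }
        learned.add(target);
        return true;
    }
}
```

With the values from the log, a registry built for node a2363b06-889f-42ab-a32e-d5563b0dacf8 on port 3344 with interface addresses 172.28.0.2 and 127.0.0.1 would not recognize 172.17.0.1:3344 at first. After onHandshake reports the node's own id for that address, isSelf returns true for it, so later sends could be routed locally, while 172.17.0.1:3345 and 172.17.0.1:3346 remain remote. One possible explanation for the missing IP (an assumption, not confirmed by the log) is that 172.17.0.1 belongs to the Docker host and a published port forwards to the container; in that case no interface scan inside the container could find it, and only the learning step would help.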