Ilya Kasnacheev created IGNITE-8633:
---------------------------------------
Summary: Node fails to bail out of wrong BLT, instead hanging
around indefinitely
Key: IGNITE-8633
URL: https://issues.apache.org/jira/browse/IGNITE-8633
Project: Ignite
Issue Type: Bug
Affects Versions: 2.4
Reporter: Ilya Kasnacheev
Assignee: Stanislav Lukyanov
Follow-up on
https://stackoverflow.com/questions/50234056/how-to-give-multiple-static-ip-in-apache-ignite-cache-configuration-xml-file/50270676?noredirect=1#comment88095814_50270676
but not quite the same.
I have three nodes: A, B and C.
I've started A and C and performed activation.
Then I stopped them both, started B and performed activation on it.
Now I have two BlT clusters: (A, C) and (B).
However, when I start B and then try to launch node A or C, I get inconsistent
behavior:
When I launch C, I get the error:
{code}
org.apache.ignite.spi.IgniteSpiException: BaselineTopology of joining node
(8c1e210f-52bb-424f-9c7c-a2e7b1bab546 ) is not compatible with BaselineTopology
in the cluster. Branching history of cluster BlT ([-1349069127]) doesn't
contain branching point hash of joining node BlT (631694798). Consider cleaning
persistent storage of the node and adding it to the cluster again.
{code}
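To illustrate the check that this error message describes, here is a toy model
in plain Java. It is only a sketch of my reading of the message: the class and
method names are hypothetical, this is not Ignite's actual validation code, and
the two hashes are copied from the error above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy model of the check described by the error message; NOT Ignite's code.
final class BltCompatibilitySketch {
    /**
     * Hypothetical helper: a joining node is accepted only if the cluster's
     * branching history contains the joining node's branching point hash.
     */
    static boolean isCompatible(Set<Integer> clusterBranchingHistory,
                                int joiningBranchingPointHash) {
        return clusterBranchingHistory.contains(joiningBranchingPointHash);
    }

    public static void main(String[] args) {
        // Cluster (B) branching history and joining node (C) hash from the log.
        Set<Integer> clusterHistory = new HashSet<>(Arrays.asList(-1349069127));
        int joiningHash = 631694798;

        System.out.println("compatible = " + isCompatible(clusterHistory, joiningHash));
    }
}
```

With these values the check rejects the joining node, which matches what I see
for C.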
But when I launch A, it never enters the topology, yet it never fails either.
Moreover, A and B keep pinging each other indefinitely:
{code}
[20:16:38,596][WARNING][main][TcpDiscoverySpi] Node has not been connected to
topology and will repeat join process. Check remote nodes logs for possible
error messages. Note that large topology may require significant time to start.
Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting
this message on the starting nodes [networkTimeout=5000]
[20:17:29,514][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery accepted
incoming connection [rmtAddr=/172.25.1.36, rmtPort=49030]
[20:17:29,522][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery spawning
a new thread for connection [rmtAddr=/172.25.1.36, rmtPort=49030]
[20:17:29,523][INFO][tcp-disco-sock-reader-#26][TcpDiscoverySpi] Started
serving remote node connection [rmtAddr=/172.25.1.36:49030, rmtPort=49030]
[20:17:29,524][INFO][tcp-disco-sock-reader-#26][TcpDiscoverySpi] Received ping
request from the remote node [rmtNodeId=37104137-a21e-4b6f-a70b-09164300bbfc,
rmtAddr=/172.25.1.36:49030, rmtPort=49030]
[20:17:29,525][INFO][tcp-disco-sock-reader-#26][TcpDiscoverySpi] Finished
writing ping response [rmtNodeId=37104137-a21e-4b6f-a70b-09164300bbfc,
rmtAddr=/172.25.1.36:49030, rmtPort=49030]
[20:17:29,526][INFO][tcp-disco-sock-reader-#26][TcpDiscoverySpi] Finished
serving remote node connection [rmtAddr=/172.25.1.36:49030, rmtPort=49030
[20:18:30,733][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery accepted
incoming connection [rmtAddr=/172.25.1.36, rmtPort=50857]
[20:18:30,733][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery spawning
a new thread for connection [rmtAddr=/172.25.1.36, rmtPort=50857]
[20:18:30,733][INFO][tcp-disco-sock-reader-#47][TcpDiscoverySpi] Started
serving remote node connection [rmtAddr=/172.25.1.36:50857, rmtPort=50857]
[20:18:30,734][INFO][tcp-disco-sock-reader-#47][TcpDiscoverySpi] Received ping
request from the remote node [rmtNodeId=37104137-a21e-4b6f-a70b-09164300bbfc,
rmtAddr=/172.25.1.36:50857, rmtPort=50857]
[20:18:30,734][INFO][tcp-disco-sock-reader-#47][TcpDiscoverySpi] Finished
writing ping response [rmtNodeId=37104137-a21e-4b6f-a70b-09164300bbfc,
rmtAddr=/172.25.1.36:50857, rmtPort=50857]
[20:18:30,734][INFO][tcp-disco-sock-reader-#47][TcpDiscoverySpi] Finished
serving remote node connection [rmtAddr=/172.25.1.36:50857, rmtPort=50857
{code}
{code}
[20:16:28,793][INFO][tcp-disco-msg-worker-#3][GridSnapshotAwareClusterStateProcessorImpl]
Received state change finish message: true
[20:16:28,803][INFO][exchange-worker-#62][time] Finished exchange init
[topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1], crd=true]
[20:16:28,812][INFO][exchange-worker-#62][GridCachePartitionExchangeManager]
Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion
[topVer=1, minorTopVer=1], evt=DISCOVERY_CUSTOM_EVT,
node=37104137-a21e-4b6f-a70b-09164300bbfc]
[20:16:28,818][INFO][sys-#68][GridSnapshotAwareClusterStateProcessorImpl]
Successfully performed final activation steps
[nodeId=37104137-a21e-4b6f-a70b-09164300bbfc, client=false,
topVer=AffinityTopologyVersion [topVer=1, minorTopVer=1]]
[20:16:33,571][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery accepted
incoming connection [rmtAddr=/172.25.1.35, rmtPort=42500]
[20:16:33,579][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery spawning
a new thread for connection [rmtAddr=/172.25.1.35, rmtPort=42500]
[20:16:33,580][INFO][tcp-disco-sock-reader-#9][TcpDiscoverySpi] Started serving
remote node connection [rmtAddr=/172.25.1.35:42500, rmtPort=42500]
[20:16:33,592][INFO][tcp-disco-sock-reader-#9][TcpDiscoverySpi] Finished
serving remote node connection [rmtAddr=/172.25.1.35:42500, rmtPort=42500
[20:16:39,801][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery accepted
incoming connection [rmtAddr=/172.25.1.35, rmtPort=42714]
[20:16:39,801][INFO][tcp-disco-srvr-#2][TcpDiscoverySpi] TCP discovery spawning
a new thread for connection [rmtAddr=/172.25.1.35, rmtPort=42714]
[20:16:39,802][INFO][tcp-disco-sock-reader-#10][TcpDiscoverySpi] Started
serving remote node connection [rmtAddr=/172.25.1.35:42714, rmtPort=42714]
[20:16:39,806][INFO][tcp-disco-sock-reader-#10][TcpDiscoverySpi] Finished
serving remote node connection [rmtAddr=/172.25.1.35:42714, rmtPort=42714
{code}
I don't think this is expected behaviour. I will attach config and work
directories.
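As a rough sketch of the behaviour I would expect instead, the join loop could
retry only while there is no response and give up as soon as the cluster
rejects the node. Everything below is hypothetical (the enum, the method and
the attempt limit are made up for illustration); it is not TcpDiscoverySpi code.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical model of a fail-fast join loop; not taken from Ignite.
final class JoinLoopSketch {
    enum JoinResult { JOINED, NO_RESPONSE, INCOMPATIBLE_BLT }

    /**
     * Retries only on NO_RESPONSE. Returns the number of attempts used on
     * success and throws as soon as the cluster reports an incompatible BLT.
     */
    static int join(Iterator<JoinResult> attempts, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts && attempts.hasNext(); attempt++) {
            switch (attempts.next()) {
                case JOINED:
                    return attempt;
                case INCOMPATIBLE_BLT:
                    // Fatal rejection: retrying cannot help, so bail out.
                    throw new IllegalStateException(
                        "BaselineTopology mismatch, giving up after attempt " + attempt);
                default:
                    break; // NO_RESPONSE: leave the switch and try again.
            }
        }
        throw new IllegalStateException(
            "Join did not complete within " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        // Scripted outcome: one unanswered attempt, then a BLT rejection.
        List<JoinResult> scripted =
            Arrays.asList(JoinResult.NO_RESPONSE, JoinResult.INCOMPATIBLE_BLT);
        try {
            join(scripted.iterator(), 5);
        } catch (IllegalStateException e) {
            System.out.println("Node stopped: " + e.getMessage());
        }
    }
}
```

In this model node A would stop with an error on the first rejection, the same
way C does, rather than repeating the join process forever.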