[ https://issues.apache.org/jira/browse/IGNITE-9144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Evgenii Zagumennov reassigned IGNITE-9144:
------------------------------------------

    Assignee: Evgenii Zagumennov

> A client node leaving a grid may trigger the wrong message about coordinator change in the logs
> -----------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-9144
>                 URL: https://issues.apache.org/jira/browse/IGNITE-9144
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Ivan Artukhov
>            Assignee: Evgenii Zagumennov
>            Priority: Major
>
> The issue was introduced by https://issues.apache.org/jira/browse/IGNITE-8738.
> Suppose we have a grid with X server nodes and Y client nodes. The server nodes
> are restarted periodically while the client nodes are left untouched. In this
> case the *order* of the current coordinator may be greater than the *order* of
> some client node. Then, when such a client node leaves the grid, we erroneously
> print the *Coordinator changed* message with the *client* node reported as the
> previous coordinator. E.g.:
> {noformat}
> [2018-07-19 14:55:28,897][INFO ][disco-event-worker-#61] Node left topology: TcpDiscoveryNode [id=7240957f-a51b-452d-bfc8-420e8ef9ea68, addrs=[127.0.0.1, 172.17.0.1, 172.25.1.15], sockAddrs=[/172.17.0.1:0, /127.0.0.1:0, lab15.gridgain.local/172.25.1.15:0], discPort=0, order=16, intOrder=11, lastExchangeTime=1532001260398, loc=false, ver=2.5.1#20180717-sha1:80e51c80, isClient=true]
> [2018-07-19 14:55:28,899][INFO ][disco-event-worker-#61] Topology snapshot [ver=27, servers=3, clients=4, CPUs=96, offheap=260.0GB, heap=56.0GB]
> [2018-07-19 14:55:28,899][INFO ][disco-event-worker-#61] Coordinator changed [prev=TcpDiscoveryNode [id=7240957f-a51b-452d-bfc8-420e8ef9ea68, addrs=[127.0.0.1, 172.17.0.1, 172.25.1.15], sockAddrs=[/172.17.0.1:0, /127.0.0.1:0, lab15.gridgain.local/172.25.1.15:0], discPort=0, order=16, intOrder=11, lastExchangeTime=1532001260398, loc=false, ver=2.5.1#20180717-sha1:80e51c80, isClient=true], cur=TcpDiscoveryNode [id=760fd8f2-b9d7-4953-aa86-3954c05c9feb, addrs=[127.0.0.1, 172.17.0.1, 172.25.1.21], sockAddrs=[/172.17.0.1:47500, lab21.gridgain.local/172.25.1.21:47500, /127.0.0.1:47500], discPort=47500, order=21, intOrder=15, lastExchangeTime=1532001260428, loc=false, ver=2.5.1#20180717-sha1:80e51c80, isClient=false]]
> [2018-07-19 14:55:28,899][INFO ][disco-event-worker-#61]   ^-- Node [id=22B15E97-9944-48B5-A473-5C64E75A4D5A, clusterState=ACTIVE]
> [2018-07-19 14:55:28,899][INFO ][disco-event-worker-#61]   ^-- Baseline [id=6, size=3, online=3, offline=0]
> [2018-07-19 14:55:28,899][INFO ][disco-event-worker-#61] Data Regions Configured:
> [2018-07-19 14:55:28,900][INFO ][disco-event-worker-#61]   ^-- default [initSize=256.0 MiB, maxSize=60.0 GiB, persistenceEnabled=true]
> {noformat}
> The *Coordinator changed* message should not appear here, because the
> coordinator did not in fact change.
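
The scenario described in the issue can be illustrated with a small, self-contained sketch. It is not Ignite code: the `Node` class and the `oldestNode` and `oldestServer` helpers are hypothetical names introduced here, and the assumption that the logged "previous coordinator" is simply the node with the smallest *order* is one plausible reading of the description, not something this message confirms. The orders 16 (client) and 21 (server) are taken from the log above.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class CoordinatorSketch {
    /** Hypothetical stand-in for a discovery node; not an Ignite class. */
    static final class Node {
        final String id;
        final long order;
        final boolean client;

        Node(String id, long order, boolean client) {
            this.id = id;
            this.order = order;
            this.client = client;
        }
    }

    /** Assumed faulty selection: smallest order among ALL nodes, clients included. */
    static Optional<Node> oldestNode(List<Node> nodes) {
        return nodes.stream().min(Comparator.comparingLong((Node n) -> n.order));
    }

    /** Selection that ignores clients: smallest order among server nodes only. */
    static Optional<Node> oldestServer(List<Node> nodes) {
        return nodes.stream()
            .filter(n -> !n.client)
            .min(Comparator.comparingLong((Node n) -> n.order));
    }

    public static void main(String[] args) {
        Node client = new Node("7240957f", 16, true);   // long-lived client
        Node server = new Node("760fd8f2", 21, false);  // restarted server

        List<Node> before = List.of(client, server);    // topology before the client leaves
        List<Node> after = List.of(server);             // topology after the client leaves

        // Clients included: the "oldest" node differs before and after,
        // which would look like a coordinator change.
        boolean naiveChanged =
            !oldestNode(before).get().id.equals(oldestNode(after).get().id);

        // Servers only: the same server is selected both times.
        boolean serverOnlyChanged =
            !oldestServer(before).get().id.equals(oldestServer(after).get().id);

        System.out.println("clients included, changed = " + naiveChanged);
        System.out.println("servers only, changed = " + serverOnlyChanged);
    }
}
```

Under these assumptions, filtering out client nodes before comparing orders is enough to avoid the spurious change; whether the actual fix takes this form is not stated in the issue.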