Hello, I'm running Apache Ignite (2.4.0) embedded in a Java application that runs in a master/slave architecture. This means that there are only ever two nodes in the grid, in FULL_SYNC, REPLICATED mode. Only the master application writes to the grid; the slave only reads from it once it has been promoted to master on a failover.
In such an architecture, network segmentation issues mean different things. The usual way to handle segmentation, as far as I can see, is to restart the node that experienced the issue. In this scenario, however, if the master is segmented I do not want to restart it, and I cannot fail over either, because a network issue has just happened and the stand-by may be invalid. The fix is to always restart the slave.
However, I have tried handling the EVT_NODE_SEGMENTED event, adding a SegmentationProcess, running with SegmentationPolicy.NOOP, and installing a segmentation plugin that always returns true/OK. Regardless of all this, the node in the master process always remains in the segmented state, and it is impossible for it to re-join the cluster after I restart the slave node.
Is there some mechanism I can use to tell the node within my master process to completely ignore segmentation? Or to tell it that everything is fine, so that discovery can still happen after I restart the slave node?
Currently I use port 4444 with TcpDiscoverySpi and hard-coded addresses (the master and slave IP addresses). When the master node is segmented (I simulate network issues from the command line), there appears to be no way for discovery to recover: port 4444 is shut down, and the slave node always comes up blind to the master.
I would appreciate any insights on this issue. Thank you.
-- Sent from: http://apache-ignite-users.70518.x6.nabble.com/
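For reference, the master-side configuration described above might look roughly like the sketch below. It is only an illustration of the setup, not a fix for the re-join problem. The IP addresses, the cache name `exampleCache`, the key/value types and the class name are placeholders I made up for the example; the port, cache mode, write synchronization mode and segmentation policy are the ones mentioned in the post.

```java
import java.util.Arrays;

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.CacheWriteSynchronizationMode;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.events.Event;
import org.apache.ignite.events.EventType;
import org.apache.ignite.lang.IgnitePredicate;
import org.apache.ignite.plugin.segmentation.SegmentationPolicy;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;

public class MasterNodeSketch {
    public static void main(String[] args) {
        // Static discovery: hard-coded master and slave addresses (placeholder IPs).
        TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
        ipFinder.setAddresses(Arrays.asList("192.0.2.10:4444", "192.0.2.11:4444"));

        TcpDiscoverySpi discovery = new TcpDiscoverySpi();
        discovery.setLocalPort(4444);
        discovery.setIpFinder(ipFinder);

        // Replicated cache with fully synchronous writes (cache name is hypothetical).
        CacheConfiguration<String, String> cacheCfg = new CacheConfiguration<>("exampleCache");
        cacheCfg.setCacheMode(CacheMode.REPLICATED);
        cacheCfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDiscoverySpi(discovery);
        cfg.setCacheConfiguration(cacheCfg);
        // Do not stop the node or restart the JVM when segmentation is detected.
        cfg.setSegmentationPolicy(SegmentationPolicy.NOOP);
        // Make sure the segmentation event is recorded so the listener below can see it.
        cfg.setIncludeEventTypes(EventType.EVT_NODE_SEGMENTED);

        Ignite ignite = Ignition.start(cfg);

        // Local listener for the segmentation event; returning true keeps it registered.
        IgnitePredicate<Event> onSegmented = evt -> {
            System.err.println("Local node segmented: " + evt.message());
            return true;
        };
        ignite.events().localListen(onSegmented, EventType.EVT_NODE_SEGMENTED);
    }
}
```

With this kind of configuration the listener does fire on the master, but the node still stays segmented afterwards, which is the behaviour I am asking about.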
