> I do not want to restart it and I cannot do a failover because a network
> issue just happened and the stand-by may be invalid. The fix is to always
> restart the slave.

You can enable CacheWriteSynchronizationMode.FULL_SYNC and there will be no differences between primary and backup partitions. In this case, you can just restart your master node; the backup node will have valid data.
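
A minimal sketch of the FULL_SYNC setting mentioned above, using Ignite's Java configuration API. The cache name "stateCache", the String key/value types, and the class name are made up for illustration and do not come from this thread; REPLICATED mode is taken from the setup described in the original question.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.CacheWriteSynchronizationMode;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class FullSyncCacheExample {
    public static void main(String[] args) {
        // "stateCache" is a hypothetical cache name used only for this sketch.
        CacheConfiguration<String, String> cacheCfg = new CacheConfiguration<>("stateCache");

        // Every node in the grid keeps a full copy of the cache.
        cacheCfg.setCacheMode(CacheMode.REPLICATED);

        // A write completes only after all participating nodes have acknowledged it,
        // so the copy on the other node should not lag behind the writer.
        cacheCfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setCacheConfiguration(cacheCfg);

        // Ignite is AutoCloseable, so the node is stopped when the block exits.
        try (Ignite ignite = Ignition.start(cfg)) {
            IgniteCache<String, String> cache = ignite.cache("stateCache");
            cache.put("key", "value");
        }
    }
}
```

With this mode a write costs one acknowledgement round trip per update, in exchange for the other node holding the same data at the moment a restart is needed.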
There is no way to join nodes after segmentation without restarting one of the nodes.

Evgenii

Tue, Jun 16, 2020 at 06:26, Actarus <mathieu.grig...@grassvalley.com>:
> Hello,
>
> I'm running Apache Ignite (2.4.0) embedded into a Java application that runs
> in a master/slave architecture. This means that there are only ever two
> nodes in a grid, in FULL_SYNC, REPLICATED mode. Only the master application
> writes to the grid, the slave only reads from it when it gets promoted to
> master on a failover.
>
> In such an architecture, network segmentation issues mean different things.
> Typically I see that for handling segmentation, the node that experienced
> the issue would need to be restarted. However in this scenario if the master
> is segmented, I do not want to restart it and I cannot do a failover because
> a network issue just happened and the stand-by may be invalid. The fix is to
> always restart the slave.
>
> However I notice that regardless of handling the EVT_NODE_SEGMENTED event,
> adding a SegmentationProcess, running with SegmentationPolicy.NOOP and
> having a segmentation plugin and always returning true/OK, I find that the
> node that runs in master always remains in segmented state, and it is
> impossible for it to re-join a cluster after restarting the slave node.
>
> Is there some mechanism I can use to tell the node within my master process
> to completely ignore segmentation? Or tell it that it is fine so that
> discovery can still happen after I restart the slave node? Currently I used
> port 4444 with TcpDiscoverySpi with hard-coded addresses (master and slave
> IP addresses). When the master node is segmented (by simulating network
> issues on the command-line) it appears there's no way for the discovery to
> recover - port 4444 is shut down, and the slave node always comes up blind
> to the master.
>
> I would appreciate any insights on this issue. Thank you.
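
For reference, the discovery setup described in the quoted message, combined with a restart of the segmented node, might look roughly like the sketch below. Several things here are assumptions rather than facts from the thread: the IP addresses are placeholders, the class and method names are hypothetical, and it assumes that stopping and starting the embedded Ignite instance inside the same JVM is an acceptable way of "restarting the node". Which of the two nodes should restart remains an application-level decision.

```java
import java.util.Arrays;

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.events.Event;
import org.apache.ignite.events.EventType;
import org.apache.ignite.lang.IgnitePredicate;
import org.apache.ignite.plugin.segmentation.SegmentationPolicy;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;

public class SegmentationRestartSketch {
    // Placeholder addresses (documentation range); replace with the real master/slave IPs.
    private static final String MASTER_ADDR = "192.0.2.10:4444";
    private static final String SLAVE_ADDR = "192.0.2.11:4444";

    static IgniteConfiguration buildConfig() {
        // Static IP finder with the two hard-coded addresses, as in the question.
        TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
        ipFinder.setAddresses(Arrays.asList(MASTER_ADDR, SLAVE_ADDR));

        TcpDiscoverySpi discovery = new TcpDiscoverySpi();
        discovery.setLocalPort(4444);
        discovery.setIpFinder(ipFinder);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDiscoverySpi(discovery);
        // NOOP: Ignite takes no action itself; the application reacts to the event.
        cfg.setSegmentationPolicy(SegmentationPolicy.NOOP);
        cfg.setIncludeEventTypes(EventType.EVT_NODE_SEGMENTED);
        return cfg;
    }

    static void startNode() {
        Ignite ignite = Ignition.start(buildConfig());

        IgnitePredicate<Event> onSegmented = evt -> {
            // Restart from a separate thread; stopping the node from inside
            // the event callback itself is best avoided.
            new Thread(() -> {
                Ignition.stop(ignite.name(), true); // in-memory data on this node is dropped
                startNode();                        // fresh node, fresh discovery on port 4444
            }, "ignite-restart").start();
            return false; // unregister this listener; the new node registers its own
        };
        ignite.events().localListen(onSegmented, EventType.EVT_NODE_SEGMENTED);
    }

    public static void main(String[] args) {
        startNode();
    }
}
```

Because the restarted node comes up empty and reloads its replicated caches from the other node, this only makes sense when the other node's copy is known to be valid, which is the point of the FULL_SYNC advice.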