Hello! Personally, I've never seen a split brain. We recommend having collocated clusters, in which case nodes will only fail one by one as opposed to forming a segmented cluster. But if you are really concerned about split brain, you can use ZooKeeper-based discovery, since ZooKeeper has built-in split-brain protection that you can rely on.

Regards,
--
Ilya Kasnacheev

Tue, Dec 24, 2019 at 14:37, Akash Shinde <[email protected]>:

> Can someone please help me on this?
>
> On Thu, Dec 12, 2019 at 1:11 PM Akash Shinde <[email protected]>
> wrote:
>
>> Hi,
>>
>> Can you please explain at a high level how the GridGain implementation
>> protects from having two segments that are alive at the same time, which
>> could lead to data inconsistency over time? What exactly does it do to
>> achieve this?
>>
>> Regards,
>> A.
>>
>> On Wed, Dec 11, 2019 at 5:48 PM Stanislav Lukyanov <
>> [email protected]> wrote:
>>
>>> In Ignite a node can go into the "segmented" state in two cases really:
>>> 1. A node was unavailable (sleeping, hanging in full GC, etc.) for a
>>> long time.
>>> 2. The cluster detected a possible split-brain situation and marked the
>>> node as "segmented".
>>>
>>> Yes, split-brain protection (in the GridGain implementation and in theory
>>> too) doesn't protect your node from stopping. It protects you from having
>>> two segments that are alive at the same time, which could lead to data
>>> inconsistency over time.
>>>
>>> Regarding Discovery and large clusters. If your cluster is too big for
>>> the ring-based TcpDiscoverySpi to work well then you should use Zookeeper
>>> Discovery, which was created specifically to support large clusters.
>>>
>>> Stan
>>>
>>> On Mon, Dec 9, 2019 at 4:02 PM Prasad Bhalerao <
>>> [email protected]> wrote:
>>>
>>>>
>>>> Can someone please advise on this?
>>>>>
>>>>> ---------- Forwarded message ---------
>>>>> From: Prasad Bhalerao <[email protected]>
>>>>> Date: Fri, Nov 29, 2019 at 7:53 AM
>>>>> Subject: Re: Local node terminated after segmentation
>>>>> To: <[email protected]>
>>>>>
>>>>>
>>>>> I had checked the resource you mentioned, but I was confused by the
>>>>> GridGain doc describing it as protection against split-brain. Because if
>>>>> the node is segmented the only thing one can do is stop/restart/noop.
>>>>> I was just wondering how it provides protection against split-brain.
>>>>> Now I think by protection it means kill the segmented node/nodes or
>>>>> restart it and bring it back into the cluster.
>>>>>
>>>>> Ignite uses TcpDiscoverySpi to send a heartbeat to the next node in the
>>>>> ring to check if the node is reachable or not.
>>>>> So the question is: in what situation does one need additional ways to
>>>>> check if the node is reachable or not using different resolvers?
>>>>>
>>>>> Please let me know if my understanding is correct.
>>>>>
>>>>> The article you mentioned, I had checked that code. It requires a node
>>>>> to be configured in advance so that the resolver can check if that node
>>>>> is reachable from the local host. It does not check if all the nodes are
>>>>> reachable from the local host.
>>>>>
>>>>> E.g.: node1 will check for node2, node2 will check for node3, and
>>>>> node3 will check for node1 to complete the ring.
>>>>> Just wondering how to configure this plugin in a prod env with a large
>>>>> cluster.
>>>>> I tried to check the GridGain doc to see if they have provided any sample
>>>>> code to configure their plugins just to get an idea but did not find any.
>>>>>
>>>>> Can you please advise?
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Prasad
>>>>>
>>>>> On Thu 28 Nov, 2019, 11:41 PM akurbanov <[email protected]> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Basically this is a mechanism to implement custom logical/network
>>>>>> split-brain protection. Segmentation resolvers allow you to implement
>>>>>> a way to determine if a node has to be segmented/stopped/etc. in the
>>>>>> method isValidSegment() and possibly use different combinations of
>>>>>> resolvers within the processor.
>>>>>>
>>>>>> If you want to check out how it could be done, some articles/source
>>>>>> samples that might give you a good insight may be easily found on the
>>>>>> web, like:
>>>>>>
>>>>>> https://medium.com/@aamargajbhiye/how-to-handle-network-segmentation-in-apache-ignite-35dc5fa6f239
>>>>>>
>>>>>> http://apache-ignite-users.70518.x6.nabble.com/Segmentation-Plugin-blog-or-article-td27955.html
>>>>>>
>>>>>> 2-3 are described in the documentation, copying the link just to
>>>>>> point out which one:
>>>>>> https://apacheignite.readme.io/docs/critical-failures-handling
>>>>>>
>>>>>> By default the answer to 2 is: Ignite doesn't ignore the node
>>>>>> FailureType SEGMENTATION and calls the failure handler in this case.
>>>>>> Actions that are taken are defined in the failure handler.
>>>>>>
>>>>>> The AbstractFailureHandler class has only SYSTEM_WORKER_BLOCKED and
>>>>>> SYSTEM_CRITICAL_OPERATION_TIMEOUT ignored by default. However, you
>>>>>> might override the failure handler and call .setIgnoredFailureTypes().
>>>>>>
>>>>>> Links:
>>>>>> Extend this class:
>>>>>>
>>>>>> https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/failure/AbstractFailureHandler.java
>>>>>> Check for custom implementations used in Ignite tests and how they
>>>>>> are used.
>>>>>>
>>>>>> Sample from tests:
>>>>>>
>>>>>> https://github.com/apache/ignite/blob/master/modules/core/src/test/java/org/apache/ignite/failure/SystemWorkersBlockingTest.java
>>>>>>
>>>>>> Failure processor:
>>>>>>
>>>>>> https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/processors/failure/FailureProcessor.java
>>>>>>
>>>>>> Best regards,
>>>>>> Anton
>>>>>>
>>>>>> --
>>>>>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>>>>>>
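
For illustration only: the resolver idea discussed in this thread (an isValidSegment() check that decides whether the local node still belongs to a valid segment) can be sketched in plain Java. This is not Ignite or GridGain code. SegmentCheck, MajorityReachableCheck, the peer names, and the probe function are all hypothetical, and the "strict majority of configured peers" rule is an assumed policy chosen for the sketch, not something the thread confirms any product uses. It addresses the question above about checking more than one pre-configured node.

```java
import java.util.List;
import java.util.function.Predicate;

public class QuorumSegmentationSketch {
    /** Hypothetical stand-in that only mirrors the shape of an isValidSegment() check. */
    interface SegmentCheck {
        boolean isValidSegment();
    }

    /**
     * Assumed policy: the local segment is valid only if a strict majority
     * of the configured peers answers the probe.
     */
    static final class MajorityReachableCheck implements SegmentCheck {
        private final List<String> peers;
        private final Predicate<String> probe;

        MajorityReachableCheck(List<String> peers, Predicate<String> probe) {
            this.peers = List.copyOf(peers);
            this.probe = probe;
        }

        @Override
        public boolean isValidSegment() {
            // Count the peers the probe reports as reachable.
            long reachable = peers.stream().filter(probe).count();
            // Strict majority; an empty peer list is never a valid segment.
            return reachable * 2 > peers.size();
        }
    }

    public static void main(String[] args) {
        List<String> peers = List.of("node1", "node2", "node3");
        // Stub probe for the demo: pretend only node1 answers.
        SegmentCheck check = new MajorityReachableCheck(peers, "node1"::equals);
        System.out.println("valid segment: " + check.isValidSegment());
    }
}
```

In a real deployment the probe would be a network reachability test, and the result would feed whatever action the configured failure handler or segmentation policy takes (stop, restart, or no-op, as discussed above).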
