Hello!

Personally, I've never seen a split brain. We recommend having collocated
clusters, in which case nodes will only fail one by one as opposed to
forming a segmented cluster.
But, if you are really concerned with split brain, you can use
ZooKeeper-based discovery, since ZooKeeper has built-in split brain
protection that you can rely on.
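
To illustrate the idea at a high level: ZooKeeper's protection rests on a
majority quorum, so after a network split at most one side can keep
operating. The snippet below is only a sketch of that majority rule, not
Ignite or ZooKeeper code; the class and method names are made up for the
example.

```java
import java.util.List;

public class QuorumSketch {
    /**
     * Returns true if a partition that can reach {@code reachable} members
     * out of {@code ensembleSize} holds a strict majority.
     * Hypothetical helper, for illustration only.
     */
    public static boolean hasQuorum(int reachable, int ensembleSize) {
        return 2 * reachable > ensembleSize;
    }

    public static void main(String[] args) {
        int ensembleSize = 5;
        // A network split leaves 3 members on one side and 2 on the other.
        for (int reachable : List.of(3, 2)) {
            String verdict = hasQuorum(reachable, ensembleSize)
                ? "keeps operating"
                : "must stop";
            System.out.println(reachable + " of " + ensembleSize + " reachable: " + verdict);
        }
    }
}
```

Since two disjoint partitions cannot both hold a strict majority, two live
segments cannot coexist. With an even split (for example 2 and 2 out of 4)
neither side has a majority, which is why an odd ensemble size is the usual
choice.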

Regards,
-- 
Ilya Kasnacheev


Tue, Dec 24, 2019 at 14:37, Akash Shinde <[email protected]>:

> Can someone please help me on this?
>
> On Thu, Dec 12, 2019 at 1:11 PM Akash Shinde <[email protected]>
> wrote:
>
>> Hi,
>>
>> Can you please explain at a high level how the GridGain implementation
>> protects from having two segments alive at the same time, which
>> could lead to data inconsistency over time? What exactly does it do to
>> achieve this?
>>
>> Regards,
>> A.
>>
>> On Wed, Dec 11, 2019 at 5:48 PM Stanislav Lukyanov <
>> [email protected]> wrote:
>>
>>> In Ignite a node can go into the "segmented" state in two cases: 1. The
>>> node was unavailable (sleeping, hanging in a full GC, etc.) for a long
>>> time. 2. The cluster detected a possible split-brain situation and marked
>>> the node as "segmented".
>>>
>>> Yes, split-brain protection (in GridGain implementation and in theory
>>> too) doesn't protect your node from stopping. It protects you from having
>>> two segments that are alive at the same time which could lead to data
>>> inconsistency over time.
>>>
>>> Regarding Discovery and large clusters. If your cluster is too big for
>>> the ring-based TcpDiscoverySpi to work well then you should use Zookeeper
>>> Discovery which was created specifically to support large clusters.
>>>
>>> Stan
>>>
>>> On Mon, Dec 9, 2019 at 4:02 PM Prasad Bhalerao <
>>> [email protected]> wrote:
>>>
>>>>
>>>> Can someone please advise on this?
>>>>>
>>>>> ---------- Forwarded message ---------
>>>>> From: Prasad Bhalerao <[email protected]>
>>>>> Date: Fri, Nov 29, 2019 at 7:53 AM
>>>>> Subject: Re: Local node terminated after segmentation
>>>>> To: <[email protected]>
>>>>>
>>>>>
>>>>> I had checked the resource you mentioned, but I was confused by the
>>>>> GridGain doc describing it as protection against split-brain, because if
>>>>> the node is segmented the only thing one can do is stop/restart/noop.
>>>>> I was just wondering how it provides protection against split-brain.
>>>>> Now I think that by protection it means killing the segmented node/nodes,
>>>>> or restarting them and bringing them back into the cluster.
>>>>>
>>>>> Ignite uses TcpDiscoverySpi to send a heartbeat to the next node in the
>>>>> ring to check whether that node is reachable.
>>>>> So the question is: in what situation does one need additional ways to
>>>>> check whether a node is reachable, using different resolvers?
>>>>>
>>>>> Please let me know if my understanding is correct.
>>>>>
>>>>> I had checked the code in the article you mentioned. It requires a node
>>>>> to be configured in advance so that the resolver can check whether that
>>>>> node is reachable from the local host. It does not check whether all the
>>>>> nodes are reachable from the local host.
>>>>>
>>>>> E.g.: node1 will check node2, node2 will check node3, and
>>>>> node3 will check node1 to complete the ring.
>>>>> I am just wondering how to configure this plugin in a prod env with a
>>>>> large cluster.
>>>>> I tried to check the GridGain doc to see if they have provided any sample
>>>>> code to configure their plugins, just to get an idea, but did not find any.
>>>>>
>>>>> Can you please advise?
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Prasad
>>>>>
>>>>> On Thu 28 Nov, 2019, 11:41 PM akurbanov <[email protected]> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Basically this is a mechanism to implement custom logical/network
>>>>>> split-brain protection. Segmentation resolvers allow you to implement,
>>>>>> in the isValidSegment() method, a way to determine whether a node has
>>>>>> to be segmented/stopped/etc., and possibly to use different
>>>>>> combinations of resolvers within the processor.
>>>>>>
>>>>>> If you want to check out how it could be done, some articles/source
>>>>>> samples
>>>>>> that might give you a good insight may be easily found on the web,
>>>>>> like:
>>>>>>
>>>>>> https://medium.com/@aamargajbhiye/how-to-handle-network-segmentation-in-apache-ignite-35dc5fa6f239
>>>>>>
>>>>>> http://apache-ignite-users.70518.x6.nabble.com/Segmentation-Plugin-blog-or-article-td27955.html
>>>>>>
>>>>>> 2-3 are described in the documentation, copying the link just to
>>>>>> point out
>>>>>> which one:
>>>>>> https://apacheignite.readme.io/docs/critical-failures-handling
>>>>>>
>>>>>> By default the answer to 2 is: Ignite doesn't ignore the FailureType
>>>>>> SEGMENTATION and calls the failure handler in this case. The actions
>>>>>> taken are defined in the failure handler.
>>>>>>
>>>>>> AbstractFailureHandler class has only SYSTEM_WORKER_BLOCKED and
>>>>>> SYSTEM_CRITICAL_OPERATION_TIMEOUT ignored by default. However, you
>>>>>> might
>>>>>> override the failure handler and call .setIgnoredFailureTypes().
>>>>>>
>>>>>> Links:
>>>>>> Extend this class:
>>>>>>
>>>>>> https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/failure/AbstractFailureHandler.java
>>>>>> — check for custom implementations used in Ignite tests and how they
>>>>>> are
>>>>>> used.
>>>>>>
>>>>>> Sample from tests:
>>>>>>
>>>>>> https://github.com/apache/ignite/blob/master/modules/core/src/test/java/org/apache/ignite/failure/SystemWorkersBlockingTest.java
>>>>>>
>>>>>> Failure processor:
>>>>>>
>>>>>> https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/internal/processors/failure/FailureProcessor.java
>>>>>>
>>>>>> Best regards,
>>>>>> Anton
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>>>>>>
>>>>>
