Guys - I've started documenting this here [1] and will put together a prototype this week.

One question though, perhaps one for Dan/Adrian: is there any special handling for state transfer if a MergeView is detected?

- M

[1] https://community.jboss.org/wiki/DesignDealingWithNetworkPartitions

On 6 Apr 2013, at 04:26, Bela Ban <[email protected]> wrote:

> On 4/5/13 3:53 PM, Manik Surtani wrote:
>> Guys,
>>
>> So this is what I have in mind for this, looking for opinions.
>>
>> 1. We write a SplitBrainListener which is registered when the
>> channel connects. The aim of this listener is to identify when we
>> have a partition. This can be identified when a view change is
>> detected, and the new view is significantly smaller than the old
>> view. Easier to detect for large clusters, but smaller clusters will
>> be harder - trying to decide between a node leaving vs a partition.
>> (Any better ideas here?)
>>
>> 2. The SBL flips a switch in an interceptor
>> (SplitBrainHandlerInterceptor?) which switches the node to be
>> read-only (reject invocations that change the state of the local
>> node) if it is in the smaller partition (newView.size < oldView.size
>> / 2). Only works reliably for odd-numbered cluster sizes, and the
>> issues with small clusters seen in (1) will apply here as well.
>>
>> 3. The SBL can flip the switch in the interceptor back to normal
>> operation once a MergeView is detected.
>>
>> It's nowhere near perfect but at least it means that we can recommend
>> enabling this and setting up an odd number of nodes, with a cluster
>> size of at least N if you want to reduce inconsistency in your grid
>> during partitions.
>>
>> Is this even useful?
>
> So I assume this is to shut down (or make read-only) non-primary
> partitions. I'd go with an approach similar to [1] section 5.6.2, which
> makes a partition read-only once it drops below a certain number of nodes N.
>
>> Bela, is there a more reliable mechanism to detect a split in (1)?
>
> I'm afraid not. We never know whether a large number of members being
> removed from the view means that they left, or that we have a partition,
> e.g. because a switch crashed.
>
> One thing you could do though is for members who are about to leave
> regularly to broadcast a LEAVE message, so that when the view is
> received, the SBL knows those members, and might be able to determine
> better whether we have a partition, or not.
>
> [1] http://www.jgroups.org/manual-3.x/html/user-advanced.html, section 5.6.2
>
> --
> Bela Ban, JGroups lead (http://www.jgroups.org)
> _______________________________________________
> infinispan-dev mailing list
> [email protected]
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
Manik Surtani
[email protected]
twitter.com/maniksurtani

Platform Architect, JBoss Data Grid
http://red.ht/data-grid
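
For reference, the rule proposed in points 1-3 of the quoted thread, combined with Bela's LEAVE suggestion, could be sketched roughly as follows. This is only an illustration under stated assumptions: `PartitionDetector`, `onLeaveAnnounced` and `onViewChange` are hypothetical names, members are modelled as plain strings, and nothing here uses the actual JGroups or Infinispan APIs. Discounting announced leavers before applying the "less than half" test is one possible reading of Bela's idea, not something the thread specifies.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SplitBrainSketch {

    /** Hypothetical stand-in for the proposed SplitBrainListener; not a real JGroups/Infinispan class. */
    static final class PartitionDetector {
        private Set<String> lastView = new HashSet<>();
        private final Set<String> announcedLeavers = new HashSet<>();
        private boolean readOnly = false;

        /** Assumed hook: a member broadcast a LEAVE message before going away. */
        void onLeaveAnnounced(String member) {
            announcedLeavers.add(member);
        }

        /** Assumed hook: a new view was installed; isMergeView marks a merge after a partition. */
        void onViewChange(List<String> newMembers, boolean isMergeView) {
            Set<String> newView = new HashSet<>(newMembers);
            if (isMergeView) {
                // Point 3: a merge flips the switch back to normal operation.
                readOnly = false;
            } else if (!lastView.isEmpty()) {
                // Members that announced their departure are not counted as lost.
                Set<String> expected = new HashSet<>(lastView);
                expected.removeAll(announcedLeavers);
                // Point 2: newView.size < oldView.size / 2, written without integer division.
                if (2 * newView.size() < expected.size()) {
                    readOnly = true;
                }
            }
            // Keep only announcements from members that are still present.
            announcedLeavers.retainAll(newView);
            lastView = newView;
        }

        boolean isReadOnly() {
            return readOnly;
        }
    }

    public static void main(String[] args) {
        PartitionDetector d = new PartitionDetector();
        d.onViewChange(List.of("A", "B", "C", "D", "E"), false);

        d.onViewChange(List.of("A", "B"), false);                // 2 of 5 remain, no LEAVE seen
        System.out.println("after split: readOnly=" + d.isReadOnly());

        d.onViewChange(List.of("A", "B", "C", "D", "E"), true);  // merge
        System.out.println("after merge: readOnly=" + d.isReadOnly());

        d.onLeaveAnnounced("C");
        d.onLeaveAnnounced("D");
        d.onLeaveAnnounced("E");
        d.onViewChange(List.of("A", "B"), false);                // same shrink, but announced
        System.out.println("after announced leaves: readOnly=" + d.isReadOnly());
    }
}
```

Note that with an even split (say 4 nodes dropping to 2) the strict "less than half" test leaves both sides writable, which is the odd-numbered-cluster caveat from point 2.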
