In the case of a MergeView, the cluster topology manager running on the (new) coordinator will request the current cache topology from all members and compute a new topology as the union of them all. The new topology id is computed as the maximum of the existing topology ids plus 2. Any rebalance currently pending in any subpartition is ended at that point, and a new rebalance is triggered for the merged cluster. No data version conflict resolution is performed => chaos :)
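
Schematically, the merge step boils down to something like the code below. This is only a sketch: CacheTopology here is a bare stand-in holding just an id and a member list, and mergeTopologies() is a made-up helper, not the actual ClusterTopologyManager code.

import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for Infinispan's cache topology: an id plus a member list.
class CacheTopology {
    final int topologyId;
    final List<String> members;

    CacheTopology(int topologyId, List<String> members) {
        this.topologyId = topologyId;
        this.members = members;
    }
}

class MergeSketch {
    // On a MergeView, the coordinator collects the current topology from every
    // member and produces the union, with a topology id of max + 2 so that the
    // merged topology supersedes anything any subpartition has installed.
    static CacheTopology mergeTopologies(List<CacheTopology> reported) {
        int maxId = 0;
        List<String> union = new ArrayList<>();
        for (CacheTopology t : reported) {
            maxId = Math.max(maxId, t.topologyId);
            for (String member : t.members)
                if (!union.contains(member))
                    union.add(member);
        }
        // Pending rebalances in the subpartitions are ended at this point and a
        // new rebalance is started for this merged topology; no per-key version
        // conflict resolution happens anywhere along the way.
        return new CacheTopology(maxId + 2, union);
    }
}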
On 04/16/2013 10:05 PM, Manik Surtani wrote:
> Guys - I've started documenting this here [1] and will put together a
> prototype this week.
>
> One question though, perhaps one for Dan/Adrian - is there any special
> handling for state transfer if a MergeView is detected?
>
> - M
>
> [1] https://community.jboss.org/wiki/DesignDealingWithNetworkPartitions
>
> On 6 Apr 2013, at 04:26, Bela Ban <[email protected]> wrote:
>
>> On 4/5/13 3:53 PM, Manik Surtani wrote:
>>> Guys,
>>>
>>> So this is what I have in mind for this, looking for opinions.
>>>
>>> 1. We write a SplitBrainListener which is registered when the
>>> channel connects. The aim of this listener is to identify when we
>>> have a partition. This can be identified when a view change is
>>> detected and the new view is significantly smaller than the old
>>> view. Easier to detect for large clusters, but smaller clusters will
>>> be harder - trying to decide between a node leaving vs a partition.
>>> (Any better ideas here?)
>>>
>>> 2. The SBL flips a switch in an interceptor
>>> (SplitBrainHandlerInterceptor?) which switches the node to be
>>> read-only (reject invocations that change the state of the local
>>> node) if it is in the smaller partition (newView.size < oldView.size
>>> / 2). This only works reliably for odd-numbered cluster sizes, and
>>> the issues with small clusters seen in (1) apply here as well.
>>>
>>> 3. The SBL can flip the switch in the interceptor back to normal
>>> operation once a MergeView is detected.
>>>
>>> It's nowhere near perfect, but at least it means that we can recommend
>>> enabling this and setting up an odd number of nodes, with a cluster
>>> size of at least N, if you want to reduce inconsistency in your grid
>>> during partitions.
>>>
>>> Is this even useful?
>>
>> So I assume this is to shut down (or make read-only) non-primary
>> partitions. I'd go with an approach similar to [1] section 5.6.2, which
>> makes a partition read-only once it drops below a certain number of nodes N.
>>
>>> Bela, is there a more reliable mechanism to detect a split in (1)?
>>
>> I'm afraid not. We never know whether a large number of members being
>> removed from the view means that they left, or that we have a partition,
>> e.g. because a switch crashed.
>>
>> One thing you could do, though, is for members who are about to leave
>> to regularly broadcast LEAVE messages, so that when the view is
>> received, the SBL knows those members and might be able to determine
>> better whether we have a partition or not.
>>
>> [1] http://www.jgroups.org/manual-3.x/html/user-advanced.html, section 5.6.2
>>
>> --
>> Bela Ban, JGroups lead (http://www.jgroups.org)
>
> --
> Manik Surtani
> [email protected]
> twitter.com/maniksurtani
>
> Platform Architect, JBoss Data Grid
> http://red.ht/data-grid
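
For reference, here is a rough sketch of how points (1)-(3) of Manik's quoted proposal, plus Bela's LEAVE-broadcast refinement, could fit together on top of the plain JGroups Receiver API. SplitBrainListener, readOnly and memberAnnouncedLeave are illustrative names only, not existing Infinispan or JGroups API; in practice the flag would be flipped inside the proposed SplitBrainHandlerInterceptor rather than read off the listener.

import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

import org.jgroups.Address;
import org.jgroups.MergeView;
import org.jgroups.ReceiverAdapter;
import org.jgroups.View;

public class SplitBrainListener extends ReceiverAdapter {

    // Members that announced a graceful leave (Bela's LEAVE refinement).
    private final Set<Address> announcedLeavers =
            Collections.newSetFromMap(new ConcurrentHashMap<Address, Boolean>());
    private volatile int lastViewSize = -1;
    private volatile boolean readOnly;

    // Called when a LEAVE broadcast from a departing member is received, so a
    // shrinking view can be told apart from a suspected partition.
    public void memberAnnouncedLeave(Address member) {
        announcedLeavers.add(member);
    }

    @Override
    public void viewAccepted(View newView) {
        if (newView instanceof MergeView) {
            // (3) The partitions have healed: flip back to normal operation.
            readOnly = false;
        } else if (lastViewSize > 0) {
            // Discount members we knew were leaving before comparing sizes.
            int expectedSize = lastViewSize - announcedLeavers.size();
            // (2) newView.size < oldView.size / 2 => we are (probably) in the
            // minority partition, so reject state-changing invocations. Only
            // reliable for odd cluster sizes; small clusters stay ambiguous.
            if (2 * newView.size() < expectedSize)
                readOnly = true;
        }
        announcedLeavers.retainAll(newView.getMembers());
        lastViewSize = newView.size();
    }

    public boolean isReadOnly() {
        return readOnly;
    }
}

Per point (1), the listener would be registered via channel.setReceiver(...) when the channel connects, and the interceptor would consult isReadOnly() before letting a write through.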
