On 11 Jun 2013, at 16:27, Dan Berindei <[email protected]> wrote:
> On Tue, Jun 11, 2013 at 2:01 PM, Manik Surtani <[email protected]> wrote:
>> On 10 Jun 2013, at 15:12, Dan Berindei <[email protected]> wrote:
>>> Erik, I think in your case you'd be better served by a
>>> ConsistentHashFactory that always assigns at most one owner from each
>>> machine for each segment.
>>>
>>> I guess the fix for ISPN-3140 should work as well, but it wouldn't be
>>> very straightforward: you'd have to keep the rebalancingEnabled attribute
>>> set to false by default, and you'd have to enable it temporarily every
>>> time you have a topology change that you do want to process.
>>
>> Why? Does the workflow detailed in ISPN-3140 not work?
>
> ISPN-3140 is geared toward planned shutdowns; my understanding is that
> Erik's scenario involves an unexpected failure.
>
> Say we have a cluster with 4 nodes spread over 2 machines: A(m1), B(m1),
> C(m2), D(m2). If m2 fails, rebalancing will start automatically and m1
> will end up with 2 copies of each entry (one on A and one on B).
> Trying to suspend rebalancing after m2 has already failed won't have any
> effect - if state transfer is already in progress, it won't be cancelled.
> To avoid the unnecessary transfers, rebalancing would have to be suspended
> before the failure - i.e. rebalancing would have to be suspended by
> default.
>
>>> It's certainly possible to do this automatically from your app or from a
>>> monitoring daemon, but I'm pretty sure an enhanced topology-aware CHF
>>> would be a better fit.
>>
>> Do explain.
>
> A custom ConsistentHashFactory could distribute segments so that a machine
> never has more than 1 copy of each segment. If m2 failed, there would be
> just one machine in the cluster, and just one copy of each segment. The
> factory would not change the consistent hash, so there wouldn't be any
> state transfer. But that's bad for unplanned failures, as you lose data in
> that case.
>
> It could be even simpler: the existing TopologyAwareConsistentHashFactory
> and TopologyAwareSyncConsistentHashFactory implementations already ensure
> just one copy per machine when the number of machines is >= numOwners. So
> a custom ConsistentHashFactory could just extend one of these and skip
> calling super.rebalance() when the number of machines in the cluster is
> < numOwners.
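To make that concrete, here's a minimal sketch of such a factory. It assumes
the current ConsistentHashFactory API (rebalance() taking and returning a
DefaultConsistentHash) and that members can be cast to TopologyAwareAddress
to read the machine id - both worth double-checking against the version in
use:

import java.util.HashSet;
import java.util.Set;

import org.infinispan.distribution.ch.DefaultConsistentHash;
import org.infinispan.distribution.ch.TopologyAwareSyncConsistentHashFactory;
import org.infinispan.remoting.transport.Address;
import org.infinispan.remoting.transport.TopologyAwareAddress;

public class MachineAwareConsistentHashFactory
      extends TopologyAwareSyncConsistentHashFactory {

   @Override
   public DefaultConsistentHash rebalance(DefaultConsistentHash baseCH) {
      // Count the distinct machines among the current members.
      Set<String> machines = new HashSet<String>();
      for (Address member : baseCH.getMembers()) {
         machines.add(((TopologyAwareAddress) member).getMachineId());
      }
      // With fewer machines than numOwners, rebalancing would only create
      // extra copies on the surviving machines - keep the old CH instead.
      if (machines.size() < baseCH.getNumOwners())
         return baseCH;
      return super.rebalance(baseCH);
   }
}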
>>> On Fri, Jun 7, 2013 at 1:45 PM, Erik Salter <[email protected]> wrote:
>>>> I'd like something similar. If I have equal keys on two machines (given
>>>> an orthogonal setup and a TACH), I'd like to suppress state transfer
>>>> and run with only one copy until I can recover my machines. The
>>>> business case is that in a degraded scenario, additional replicas
>>>> aren't going to buy me anything, as a failure will most likely be at
>>>> the machine level and will cause me to lose data. Once I've recovered
>>>> the other machine, I can turn state transfer back on to regain my data
>>>> redundancy.
>>>>
>>>> Erik
>>>>
>>>> -----Original Message-----
>>>> From: [email protected]
>>>> [mailto:[email protected]] On Behalf Of Mircea Markus
>>>> Sent: Tuesday, June 04, 2013 5:44 AM
>>>> To: infinispan -Dev List
>>>> Subject: Re: [infinispan-dev] Suppressing state transfer via JMX
>>>>
>>>> Manik, what's wrong with Dan's suggestion of clearing the cache before
>>>> shutdown?
>>>>
>>>> On 31 May 2013, at 14:20, Manik Surtani <[email protected]> wrote:
>>>>
>>>>>> If we only want to deal with full cluster shutdown, then I think
>>>>>> stopping all application requests, calling Cache.clear() on one node,
>>>>>> and then shutting down all the nodes should be simpler. On start,
>>>>>> assuming no cache store, the caches will start empty, so starting all
>>>>>> the nodes at once and only allowing application requests once they've
>>>>>> all joined should also work without extra effort.
>>>>>>
>>>>>> If we only want to stop a part of the cluster, suppressing
>>>>>> rebalancing would be better, because we wouldn't lose all the data.
>>>>>> But we'd still lose the keys whose owners are all among the nodes we
>>>>>> want to stop. I've discussed this with Adrian, and we think that if
>>>>>> we want to stop a part of the cluster without losing data, we need a
>>>>>> JMX operation on the coordinator that will "atomically" remove a set
>>>>>> of nodes from the CH. After the operation completes, the user will
>>>>>> know it's safe to stop those nodes without losing data.
>>>>>
>>>>> I think the no-data-loss option is bigger scope, perhaps part of
>>>>> ISPN-1394. And that's not what I am asking about.
>>>>>
>>>>>> When it comes to starting a part of the cluster, a "pause
>>>>>> rebalancing" option would probably be better - but again, on the
>>>>>> coordinator, not on each joining node. And clearly, if more than
>>>>>> numOwners nodes leave while rebalancing is suspended, data will be
>>>>>> lost.
>>>>>
>>>>> Yup. This sort of option would only be used where data loss isn't an
>>>>> issue (such as a distributed cache). Where data loss is an issue, we'd
>>>>> need more control - ISPN-1394.
>>>>
>>>> Cheers,
>>>> --
>>>> Mircea Markus
>>>> Infinispan lead (www.infinispan.org)
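Since the whole thread is about driving this over JMX: once a
rebalancingEnabled attribute is exposed on the coordinator (per ISPN-3140),
suspending and resuming rebalancing from a monitoring script could look
roughly like the sketch below. Only the attribute name comes from this
thread; the ObjectName pattern and the LocalTopologyManager component name
are assumptions to verify against the actual implementation:

import javax.management.Attribute;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class RebalancingToggle {

   public static void main(String[] args) throws Exception {
      // Connect to the coordinator's JMX server (host/port are
      // placeholders).
      JMXServiceURL url = new JMXServiceURL(
            "service:jmx:rmi:///jndi/rmi://coordinator-host:9999/jmxrmi");
      JMXConnector connector = JMXConnectorFactory.connect(url);
      try {
         MBeanServerConnection mbsc = connector.getMBeanServerConnection();
         // Assumed registration - check jconsole for the real ObjectName.
         ObjectName topologyManager = new ObjectName(
               "org.infinispan:type=CacheManager,name=\"DefaultCacheManager\","
                     + "component=LocalTopologyManager");
         // Suspend rebalancing before the planned topology change...
         mbsc.setAttribute(topologyManager,
               new Attribute("rebalancingEnabled", false));
         // ... stop/start the nodes here, then resume rebalancing.
         mbsc.setAttribute(topologyManager,
               new Attribute("rebalancingEnabled", true));
      } finally {
         connector.close();
      }
   }
}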
--
Manik Surtani
[email protected]
twitter.com/maniksurtani

Platform Architect, JBoss Data Grid
http://red.ht/data-grid

_______________________________________________
infinispan-dev mailing list
[email protected]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
