On 1 Feb 2012, at 12:23, Dan Berindei wrote:

> Bela, you're right, this is essentially what we talked about in Lisbon:
> https://community.jboss.org/wiki/AsymmetricCachesAndManualRehashingDesign
>
> For joins I actually started working on a policy of coalescing joins
> that happen one after the other in a short time interval. The current
> implementation is very primitive, as I shifted focus to stability, but
> it does coalesce joins 1 second after another join started (or while
> that join is still running).
>
> I don't quite agree with Sanne's assessment that it's fine for
> getCache() to block for 5 minutes until the administrator allows the
> new node to join. We should modify startCaches() instead to signal to
> the coordinator that we are ready to receive data for one or all of
> the defined caches, and wait with a customizable time limit until the
> caches have properly joined the cluster.
>
> The getCache() timeout should not be increased at all. Instead I would
> propose that getCache() returns a functional cache immediately, even
> if the cache didn't receive any data, and it works solely as an L1
> cache until the administrator allows it to join. I'd even make it
> possible to designate a cache as an L1-only cache, so it's never an
> owner for any key.

I presume this would be encoded in the Address? That would make sense
for a node permanently designated as an L1 node. But then how would this
work for a node temporarily acting as L1 only, until it has been allowed
to join? Change the Address instance on the fly? A delegating Address? :/

> For leaves, the main problem is that every node has to compute the
> same primary owner for a key, at all times. So we need a 2PC cache
> view installation immediately after any leave to ensure that every
> node determines the primary owner in the same way - we can't coalesce
> or postpone leaves.

Yes, manual rehashing would probably just be for joins. Controlled
shutdown in itself is manual, and crashes, well, need to be dealt with
immediately IMO.

>
> For 5.2 I will try to decouple the cache view installation from the
> state transfer, so in theory we will be able to coalesce/postpone the
> state transfer for leaves as well
> (https://issues.jboss.org/browse/ISPN-1827). I kind of need it for
> non-blocking state transfer, because with the current implementation a
> leave forces us to cancel any state transfer in progress and restart
> with the updated cache view - a state transfer rollback will be very
> expensive with NBST.
>
>
> Erik does raise a valid point - with TACH, if we bring up a node with
> a different siteId, then it will be an owner for all the keys in the
> cache. That node probably isn't provisioned to hold all the keys, so
> it would very likely run out of memory or evict much of the data. I
> guess that makes it a 5.2 issue?

Yes.

> Shutting down a site should be possible even with what we have now -
> just insert a DISCARD protocol in the JGroups stack of all the nodes
> that are shutting down, and when FD finally times out on the nodes in
> the surviving datacenter they won't have any state transfer to do
> (although it may cause a few failed state transfer attempts). We could
> make it simpler though.
>
>
> Cheers
> Dan
>
>
> On Tue, Jan 31, 2012 at 6:21 PM, Erik Salter <[email protected]> wrote:
>> ...such as bringing up a backup data center.
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Bela Ban
>> Sent: Tuesday, January 31, 2012 11:18 AM
>> To: [email protected]
>> Subject: Re: [infinispan-dev] Proposal: ISPN-1394 Manual rehashing in 5.2
>>
>> I cannot volunteer either, but I find it important to be done in 5.2 !
>>
>> Unless rehashing works flawlessly with a large number of nodes joining
>> at the same time, I think manual rehashing is crucial...
>>
>>
>>
>> On 1/31/12 5:13 PM, Sanne Grinovero wrote:
>>> On 31 January 2012 16:06, Bela Ban<[email protected]> wrote:
>>>> This is essentially what I suggested at the Lisbon meeting, right ?
>>>
>>> Yes!
>>>
>>>> I think Dan had a design wiki on this somewhere...
>>>
>>> Just raising it here as it was moved to 6.0, while I think it deserves
>>> a dedicated thread to better think about it. If it's not hard, I think
>>> it should be done sooner.
>>> But while I started the thread to wake up the brilliant minds, I can't
>>> volunteer for this to make it happen.
>>>
>>> Sanne
>>>
>>>>
>>>>
>>>> On 1/31/12 4:53 PM, Sanne Grinovero wrote:
>>>>> I think this is an important feature to have soon;
>>>>>
>>>>> My understanding of it:
>>>>>
>>>>> We default with the feature off, and newly discovered nodes are
>>>>> added/removed as usual. With a JMX-operable switch, one can disable
>>>>> this:
>>>>>
>>>>> If a remote node is joining the JGroups view, but rehash is off: it
>>>>> will be added to a to-be-installed view, but this won't be installed
>>>>> until rehash is enabled again. This gives time to add more changes
>>>>> before starting the rehash, and would help a lot to start larger
>>>>> clusters.
>>>>>
>>>>> If the [self] node is booting and joining a cluster with manual rehash
>>>>> off, the start process and any getCache() invocation should block and
>>>>> wait for it to be enabled. This would need of course to override the
>>>>> usually low timeouts.
>>>>>
>>>>> When a node is suspected it's a bit of a different story as we need to
>>>>> make sure no data is lost. The principle is the same, but maybe we
>>>>> should have two flags: one which is a "soft request" to avoid rehashes
>>>>> of less than N members (and refuse N>=numOwners ?), one which is just
>>>>> disable it and don't care: data might be in a cachestore, data might
>>>>> not be important. Which reminds me, we should consider as well a JMX
>>>>> command to flush the container to the CacheLoader.
>>>>>
>>>>> --Sanne
>>>>> _______________________________________________
>>>>> infinispan-dev mailing list
>>>>> [email protected]
>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>>
>>>> --
>>>> Bela Ban
>>>> Lead JGroups (http://www.jgroups.org)
>>>> JBoss / Red Hat

--
Manik Surtani
[email protected]
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org
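
For illustration only, here is a minimal Java sketch of the switch Sanne describes in the quoted message above: while rehash is off, joiners are parked in a to-be-installed view, and they are installed together once rehash is enabled again. The class name PendingViewGate and all of its methods are hypothetical; none of this is Infinispan or JGroups API. Nodes are plain strings rather than Address instances, and leaves are applied immediately, on the assumption (from this thread) that manual rehashing would probably only cover joins.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of a manual-rehash switch; not Infinispan's API. */
final class PendingViewGate {
    // Members of the currently installed cache view, in join order.
    private final Set<String> installed = new LinkedHashSet<>();
    // Joiners parked in the to-be-installed view while rehash is off.
    private final Set<String> pending = new LinkedHashSet<>();
    private boolean rehashEnabled = true;

    PendingViewGate(List<String> initialMembers) {
        installed.addAll(initialMembers);
    }

    /** Installs a joiner right away when rehash is on, otherwise parks it. */
    synchronized void onJoin(String node) {
        if (rehashEnabled) {
            installed.add(node);
        } else if (!installed.contains(node)) {
            pending.add(node);
        }
    }

    /** Leaves are never postponed in this sketch, parked joiners included. */
    synchronized void onLeave(String node) {
        installed.remove(node);
        pending.remove(node);
    }

    /** Re-enabling rehash installs all parked joiners as one view change. */
    synchronized void setRehashEnabled(boolean enabled) {
        rehashEnabled = enabled;
        if (enabled) {
            installed.addAll(pending);
            pending.clear();
        }
    }

    /** Snapshot of the installed view. */
    synchronized List<String> installedView() {
        return new ArrayList<>(installed);
    }

    /** Snapshot of the joiners still waiting to be installed. */
    synchronized List<String> pendingJoiners() {
        return new ArrayList<>(pending);
    }
}
```

In a real implementation the toggle would presumably be exposed as a JMX operation, and installing the parked joiners would trigger a single state transfer rather than one per joiner; both are left out of the sketch.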
