The coordinator is the first member of the JGroups view, which is easy
to obtain via JMX from any node, but you do make a valid point. We
should take this burden off the admin and automate a bit. I like the
forwarding idea.
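For illustration only: assuming the cluster view can be read from any node (e.g. as a list of member addresses exposed via a CacheManager MBean attribute; the exact attribute name may differ), picking the coordinator is just taking the first entry. A minimal sketch, with hypothetical names:

```java
import java.util.Arrays;
import java.util.List;

public class CoordinatorLookup {

    // The JGroups view lists members in a stable order;
    // by convention the first entry is the coordinator.
    static String coordinatorOf(List<String> view) {
        if (view.isEmpty()) {
            throw new IllegalStateException("empty view");
        }
        return view.get(0);
    }

    public static void main(String[] args) {
        // Hypothetical view as it might be read over JMX from any node.
        List<String> view = Arrays.asList("node-A", "node-B", "node-C");
        System.out.println(coordinatorOf(view)); // prints "node-A"
    }
}
```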
Excluding the coordinator should not be a problem. After this is
performed there will be no running caches left on the coordinator, but
there will still be a ClusterTopologyManager instance running on it
which will continue to manage the topology of the cluster.
This was made possible (in 5.1?) when the concept of asymmetric caches
was implemented. After the coordinator is excluded it can keep working
until it is killed, at which point a new coordinator is elected and takes control.
Thanks for the feedback, Dennis!
On 05/31/2013 07:52 PM, Dennis Reed wrote:
I see 2 potential issues:
1. How does the user know which node is the master to connect to,
since the operations are a no-op on all the others?
- instead of a no-op, what if the other nodes just forward the
operation to the correct node?
Then the user doesn't have to know who the current master is,
and can just connect to any node.
2. What if the current master is one of the nodes being stopped?
-Dennis
On 05/31/2013 11:40 AM, Adrian Nistor wrote:
Yes, ISPN-1394 has a broader scope, but the proposed solution for
ISPN-3140 solves quite a lot of ISPN-1394 and it's not complex. We
might not even need ISPN-1394 soon unless somebody really wants to
control data ownership down to segment granularity. If we only want
to batch joins/leaves and manually kick out nodes, with or without
losing their data, then this proposal should be enough. This solution
should not prevent implementation of ISPN-1394 in the future and will not
need to be removed/undone.
Here are the details:
1. /Add a JMX writable attribute (or operation?) to
ClusterTopologyManager (name it suppressRehashing?) that is false by
default but should also be configurable via API or XML. While this
attribute is true the ClusterTopologyManager queues all
join/leave/exclude (see below) requests and does not execute them on
the spot as it would normally happen. The value of this attribute is
ignored on all nodes but the coordinator. When it is set back to
false all queued operations (except the ones that cancel each other
out) are executed. The setter should be synchronous, so when setting
it back to false it does not return until the queue is empty and all
rehashing has been processed./
2. /We add a JMX operation excludeNodes(list of addresses) to
ClusterTopologyManager. Calling this method on any node but the
coordinator is a no-op. This operation removes the node from the
topology (almost as if it left) and forces a rebalance./ The node is
still present in the current CH but not in the pending CH. It's
basically disowned by all its data, which is now being transferred to
other (not excluded) nodes. At the end of the rebalance the node is
removed from the topology for good and can be shut down without losing
data. Note that if suppressRehashing==true, operation
excludeNodes(..) just queues the exclusions for later execution. We can batch
multiple such exclusions and then re-activate the rehashing.
The parts that need to be implemented are written in italic above.
Everything else is already there.
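A rough model of the queueing semantics in points 1 and 2 above might look like the following. This is a toy sketch, not Infinispan's actual ClusterTopologyManager API: all names are hypothetical, rebalancing is reduced to a counter, and "cancel each other out" is modeled as a join and a leave for the same node annulling one another.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of the proposed suppressRehashing + queued topology ops.
public class TopologyQueueModel {

    enum Op { JOIN, LEAVE }

    private boolean suppressRehashing;
    // Net pending operation per node; a LEAVE queued after a JOIN for the
    // same node (or vice versa) cancels it, so only the net effect is kept.
    private final Map<String, Op> queued = new LinkedHashMap<>();
    private int rebalances; // how many rebalance rounds were triggered

    // Synchronous setter: turning suppression off flushes the whole queue
    // as a single rebalance round before returning.
    void setSuppressRehashing(boolean suppress) {
        this.suppressRehashing = suppress;
        if (!suppress && !queued.isEmpty()) {
            queued.clear();
            rebalances++;
        }
    }

    void request(String node, Op op) {
        if (!suppressRehashing) {
            rebalances++; // executed on the spot, one rebalance per request
            return;
        }
        Op previous = queued.remove(node);
        if (previous == null || previous == op) {
            queued.put(node, op); // nothing to cancel against
        }
        // else: the opposite op was pending for this node; both are dropped
    }

    int rebalances() { return rebalances; }
    int queuedCount() { return queued.size(); }
}
```

With suppression on, two joins plus a leave of one of the joiners leaves a single net operation queued, and turning suppression back off triggers exactly one rebalance instead of three.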
excludeNodes is a way of achieving a soft shutdown and should be used
only if we care about preserving data in the extreme case where the
nodes are the last/single owners. We can just kill the node directly
if we do not care about its data.
suppressRehashing is a way of achieving some kind of batching of
topology changes. This should speed up state transfer significantly
because it avoids a lot of pointless reshuffling of data segments when
we have many successive joiners/leavers.
So what happens if the current coordinator dies for whatever reason?
The new one will take control and will not have knowledge of the
existing rehash queue or the previous status of the suppressRehashing
attribute, so it will just get the current cache membership status
from all members of the current view and proceed with the rehashing as
usual. If the user does not want this they can set a default value of
true for suppressRehashing. The admin now has to interact via JMX
with the new coordinator. But that's not as bad as the alternative
where all the nodes are involved in this JMX scheme :) I think having
only the coordinator involved in this is a plus.
Manik, how does this fit for the full and partial shutdown?
Cheers
Adi
On 05/31/2013 04:20 PM, Manik Surtani wrote:
On 31 May 2013, at 13:52, Dan Berindei <[email protected]> wrote:
If we only want to deal with full cluster shutdown, then I think
stopping all application requests, calling Cache.clear() on one
node, and then shutting down all the nodes should be simpler. On
start, assuming no cache store, the caches will start empty, so
starting all the nodes at once and only allowing application
requests when they've all joined should also work without extra effort.
If we only want to stop a part of the cluster, suppressing
rebalancing would be better, because we wouldn't lose all the data.
But we'd still lose the keys whose owners are all among the nodes
we want to stop. I've discussed this with Adrian, and we think if
we want to stop a part of the cluster without losing data we need a
JMX operation on the coordinator that will "atomically" remove a
set of nodes from the CH. After the operation completes, the user
will know it's safe to stop those nodes without losing data.
I think the no-data-loss option is bigger scope, perhaps part of
ISPN-1394. And that's not what I am asking about.
When it comes to starting a part of the cluster, a "pause
rebalancing" option would probably be better - but again, on the
coordinator, not on each joining node. And clearly, if more than
numOwners nodes leave while rebalancing is suspended, data will be lost.
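To make the last point concrete: a key is lost exactly when all of its owners are among the stopped nodes. A quick sketch of that check (hypothetical helper, not Infinispan code):

```java
import java.util.List;
import java.util.Set;

public class DataLossCheck {

    // A key survives the shutdown of `stopping` iff at least one of its
    // owners stays up. With numOwners copies per key, stopping all
    // numOwners owners of some key loses that key.
    static boolean keyLost(List<String> owners, Set<String> stopping) {
        return stopping.containsAll(owners);
    }
}
```

For example, with numOwners = 2 and a key owned by A and B, stopping both A and B loses the key, while stopping only A leaves a surviving copy on B.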
Yup. This sort of option would only be used where data loss isn't
an issue (such as a distributed cache). Where data loss is an
issue, we'd need more control - ISPN-1394.
Cheers
Dan
On Fri, May 31, 2013 at 12:17 PM, Manik Surtani
<[email protected]> wrote:
Guys
We've discussed ISPN-3140 elsewhere before; I'm bringing it to
this forum now.
https://issues.jboss.org/browse/ISPN-3140
Any thoughts/concerns? Particularly looking to hear from Dan
or Adrian about viability, complexity, ease of implementation.
Thanks
Manik
--
Manik Surtani
[email protected]
twitter.com/maniksurtani
Platform Architect, JBoss Data Grid
http://red.ht/data-grid
_______________________________________________
infinispan-dev mailing list
[email protected]
https://lists.jboss.org/mailman/listinfo/infinispan-dev