Hi all,

This topic came up in a separate discussion with Mircea, and he suggested I post something on the mailing list for a wider audience.
I have a business case where I need joining nodes to read the value of the rebalancing flag. Let's say we have a TACH where we want our keys striped across machines, racks, etc. Due to how NBST works, if we start a bunch of nodes on one side of the topology marker, we will end up with all keys dog-piling on the first node that joins before being disseminated to the other nodes. In other words, the first joining node on the other side of the topology acts as a "pivot."

That's bad, especially if the key is marked as DELTA_WRITE, where the receiving node must pull the key from the readCH before applying the changelog. So not only do we have a single choke-point, but it's made worse by the initial burst of writes, each of which requires numOwners threads for remote reads.

If we disable rebalancing and start up the nodes on the other side of the topology, we can process this in a single view change. But there's a catch, and this is the reason I added the state of the flag. We've run into a case where the current coordinator changed (crash or a MERGE) while the other nodes were starting up, and the new coordinator was elected from the new side of the topology. So we had two separate but balanced CHs, one on each side of the topology, and data integrity went out the window. Hence the flag. Note also that this deployment requires the awaitInitialTransfer flag to be false.

In a real production environment, this has saved me more times than I can count. Node failover/failback is now reasonably deterministic, with a simple operational procedure for our customer(s) to follow.

The question is whether this feature would be useful for the community. Even with the new partition handling, I think this implementation is still viable and may warrant inclusion in 7.0 (or 7.1). What does the team think? I welcome any and all feedback.
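To make the coordinator-change scenario concrete, here is a rough toy model in Python. It is not Infinispan code: the class, method, and node names are all hypothetical, and it assumes a heavily simplified rule that only the current coordinator's copy of the flag decides whether a rebalance starts. It only illustrates why a joiner that does not read the flag can undo a disable once it becomes coordinator.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    rebalancing_enabled: bool = True  # what a freshly started node assumes


class ToyCluster:
    """Hypothetical model: the coordinator's flag alone gates rebalancing."""

    def __init__(self, joiners_read_flag):
        self.joiners_read_flag = joiners_read_flag
        self.members = []      # members[0] acts as the coordinator
        self.rebalances = []   # reasons for every rebalance that was started

    @property
    def coordinator(self):
        return self.members[0]

    def set_rebalancing(self, enabled):
        # The operator flips the flag on the current coordinator.
        self.coordinator.rebalancing_enabled = enabled
        self._maybe_rebalance("flag changed")

    def join(self, name):
        node = Node(name)
        if self.joiners_read_flag and self.members:
            # The joiner copies the coordinator's view of the flag.
            node.rebalancing_enabled = self.coordinator.rebalancing_enabled
        self.members.append(node)
        self._maybe_rebalance(f"{name} joined")

    def change_coordinator(self, name):
        # Stand-in for a crash or MERGE that hands coordination to `name`.
        node = next(n for n in self.members if n.name == name)
        self.members.remove(node)
        self.members.insert(0, node)
        self._maybe_rebalance(f"{name} became coordinator")

    def _maybe_rebalance(self, reason):
        if self.coordinator.rebalancing_enabled:
            self.rebalances.append(reason)


def run_scenario(joiners_read_flag):
    cluster = ToyCluster(joiners_read_flag)
    cluster.join("old-1")
    cluster.join("old-2")
    cluster.set_rebalancing(False)  # disable before starting the new side
    cluster.rebalances.clear()      # only count what happens after the disable
    cluster.join("new-1")
    cluster.join("new-2")
    cluster.change_coordinator("new-1")
    return cluster.rebalances


print(run_scenario(joiners_read_flag=False))
print(run_scenario(joiners_read_flag=True))
```

In this sketch, the run where joiners ignore the flag starts a rebalance as soon as new-1 takes over, while the run where joiners read it stays quiet until the operator re-enables rebalancing.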
Regards,

Erik Salter
Cisco Systems, SPVTG
(404) 317-0693
