I'm in favor of doing things incrementally, but also in favor of a system less dependent on a single master. As always, you have good ideas, Andrew ;) Locks and all leases should be transferred to ZK.
For the next release, if we stick to the current plan, an HRS failing while the master is unavailable is indeed very bad, so it would be advisable to keep master downtime as short as possible (for example, by keeping a sleeping master that pings the unique master lock every 2 seconds). The first thing the new master then has to do is scan ROOT to check that all META assignments are correct and fix those that are wrong (and do the same for ROOT itself after reading ZK). Then it scans META to confirm all region assignments, reassigning regions where needed.

J-D

On Wed, Jan 7, 2009 at 3:04 PM, Andrew Purtell <[email protected]> wrote:
> > From: Jean-Daniel Cryans
> >
> > With ZK in 0.20, a cluster restart won't be necessary.
> > Since the ROOT address will be stored in ZK, the clients
> > will practically never communicate with the master and
> > the region servers will just keep serving regions. If
> > the master fails, the RS should not block gets/puts but
> > won't be able to do splits. However, the new master will
> > have to be started manually (or we can implement a
> > simple way to have extra masters sleeping just in case)
> > so that it gets its unique lock which will surely
> > contain it's address.
>
> If the master fails, and then a HRS fails, then what?
> Especially what happens if the HRS is carrying ROOT or META?
>
> There's no reason that all the live HRS cannot use ZK to
> negotiate among themselves who should also assume the master
> role, since the master role will also go on a diet. After ZK
> integration, is there a need for separate processes for the
> master and region server functions?
>
> In fact the master role might be distributed among the HRS
> via ZK. About the only need for a master would be to manage
> region assignments upon splits and HRS failures. Why not put
> up locks (or appropriate synchronization primitives) for
> every region and have the HRS figure out among themselves
> who should carry new or unassigned regions?
>
> Just thinking out loud here.
>
>    - Andy
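
To make the failover sequence in my reply a bit more concrete, here is a rough Python sketch of the two steps: a sleeping master that retries the unique master lock every 2 seconds, then a recovery pass that checks ROOT, then META, then the user regions. Everything in it is hypothetical and in-memory: `FakeLockService` only stands in for whatever ZK primitive we end up using (probably an ephemeral znode, but that is an assumption), the tables are plain dicts, and the round-robin reassignment is a placeholder rather than a proposed policy. None of these names are existing HBase or ZooKeeper APIs.

```python
import itertools
import time


class FakeLockService:
    """Hypothetical in-memory stand-in for the unique master lock kept in ZK."""

    def __init__(self):
        self.holder = None

    def try_acquire(self, name):
        # Take the lock if it is free; report whether `name` holds it.
        if self.holder is None:
            self.holder = name
        return self.holder == name

    def release(self, name):
        # Simulates the lock vanishing when its holder dies.
        if self.holder == name:
            self.holder = None


def wait_for_master_lock(locks, name, interval=2.0, max_attempts=None,
                         sleep=time.sleep):
    """Sleeping master: retry the lock every `interval` seconds."""
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        if locks.try_acquire(name):
            return True
        attempts += 1
        sleep(interval)
    return False


def fix_assignments(assignments, live_servers, pick_server):
    """Reassign entries hosted on dead servers; return the names fixed."""
    fixed = []
    for region, server in sorted(assignments.items()):
        if server not in live_servers:
            assignments[region] = pick_server(region)
            fixed.append(region)
    return fixed


def recover(root_location, root_table, meta_table, live_servers):
    """New master start-up: check ROOT, then META (via ROOT), then regions."""
    servers = sorted(live_servers)
    if not servers:
        raise RuntimeError("no live region servers to assign to")
    counter = itertools.count()

    def pick(_region):
        # Placeholder policy: plain round-robin over live servers.
        return servers[next(counter) % len(servers)]

    return {
        "root": fix_assignments(root_location, live_servers, pick),
        "meta": fix_assignments(root_table, live_servers, pick),
        "regions": fix_assignments(meta_table, live_servers, pick),
    }
```

The order inside `recover` is the point of the sketch: ROOT has to be known good before its META entries can be trusted, and META has to be fixed before the region scan means anything. In tests, passing `sleep=lambda s: None` lets the standby loop run without actually waiting.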
