Rong-En Fan,

I agree multi-master requires manual tasks and the current lack of doc does not help (it's on my list tho). I also agree that stop on a backup master shouldn't stop the cluster. Can you fill in a Jira? (kill -9 works well btw)

wrt multi-master conf, I personally ruled it out of 0.20.0 but do you think we should still include it for usability? Is it currently too rough?

Thx,

J-D

On Thu, Jul 16, 2009 at 12:40 PM, Rong-en Fan<[email protected]> wrote:
> Few days ago, I played with the latest trunk to see how fail-tolerance
> works in 0.20. While running PerformanceEvaluation to generate
> workloads, killing HRS and HMaster is not a big deal. The client
> recovers after tens of secs to few minutes. This is good.
>
> For multi masters, it seems that I have to manually start backup master by
>
> bin/hbase-daemon.sh start master
>
> This is ok, though it's better that we can specify this as part of
> hbase-site.xml or a new conf/masters.
>
> But stop backup master is messy... if I just do
>
> bin/hbase-daemon.sh stop master
>
> It will bring the whole cluster down. That's bad.
>
> Not sure if we can do something like this :
>
> 1. if there is an active master, stop master will just make HMaster
> die without shutdown the whole cluster
> 2. otherwise, shutdown the whole cluster as before
>
> Any ideas?
>
> Thanks,
> Rong-En Fan
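
For reference, the two-step stop rule proposed in the quoted message could be sketched roughly as below. This is only an illustration of the decision, not HBase code: `StopAction`, `decide_stop_action`, and the `another_master_is_active` flag are hypothetical names, and how a master would actually learn whether another master is active is left out entirely.

```python
from enum import Enum


class StopAction(Enum):
    """Hypothetical outcomes of a 'stop master' request (not an HBase API)."""
    EXIT_THIS_MASTER_ONLY = "exit this master only"
    SHUTDOWN_CLUSTER = "shut down the whole cluster"


def decide_stop_action(another_master_is_active: bool) -> StopAction:
    """Apply the proposed rule to a master that received a stop request.

    Assumption: the caller already knows whether some other master is
    currently active; this sketch does not show how that is determined.
    """
    if another_master_is_active:
        # Rule 1: an active master exists, so only this process exits.
        return StopAction.EXIT_THIS_MASTER_ONLY
    # Rule 2: otherwise, shut down the whole cluster as before.
    return StopAction.SHUTDOWN_CLUSTER


if __name__ == "__main__":
    for flag in (True, False):
        print(flag, "->", decide_stop_action(flag).value)
```

With a rule like this, stopping a backup master would leave the cluster running, while stopping the last remaining master would keep today's behaviour.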
