Thanks a lot, Stack.

Now, I have a much clearer understanding.

I think I made a mistake in my previous experiment. Since I use
CDH3 for testing, I shut down the master using the command
         service hadoop-hbase-master stop
This shuts down the master via hbase-daemon.sh, which just sends
the KILL signal to the existing master process. Thus, this master
shutdown has no chance to set the flag in zookeeper.

Meanwhile, stop-hbase.sh doesn't use hbase-daemon.sh to shut down the
master, so it has a chance to set the flag in zookeeper.
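
To make the difference concrete, here is a toy Python model of the two
paths. It is only a sketch of my understanding, not HBase code: every
class and method name below is made up for illustration, and it skips
the part where the master waits for the regionservers before exiting.

```python
# Toy model only: none of these names come from HBase itself.
class ToyZooKeeper:
    def __init__(self):
        self.shutdown_flag = False  # the "cluster is going down" flag


class ToyRegionServer:
    def __init__(self, zk):
        self.zk = zk
        self.running = True

    def poll(self):
        # A regionserver only shuts itself down once it sees the flag.
        if self.zk.shutdown_flag:
            self.running = False


class ToyMaster:
    def __init__(self, zk):
        self.zk = zk
        self.running = True

    def graceful_stop(self):
        # Path described for stop-hbase.sh: set the flag, then exit.
        self.zk.shutdown_flag = True
        self.running = False

    def kill(self):
        # Path I hit with the service script: the process dies
        # without ever setting the flag.
        self.running = False


def simulate(stop_method):
    """Stop the master one way or the other; report which RS still run."""
    zk = ToyZooKeeper()
    master = ToyMaster(zk)
    servers = [ToyRegionServer(zk) for _ in range(3)]
    getattr(master, stop_method)()
    for rs in servers:
        rs.poll()
    return [rs.running for rs in servers]
```

In this model, simulate("graceful_stop") leaves no regionserver running,
while simulate("kill") leaves all three alive, which matches what I saw.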

Thanks
Weihua

2011/7/19 Stack <[email protected]>:
> On Tue, Jul 19, 2011 at 12:02 AM, Weihua JIANG <[email protected]> wrote:
>> It seems stop-hbase.sh only stops master/backup masters and zookeepers.
>>
>
> Usually it sends a signal to the master that then sets a flag in
> zookeeper.  When regionservers see this flag, they start to close down
> user-space regions.  When all user-space regions have been closed,
> the server will close catalog regions.  When a regionserver is
> carrying no regions, it shuts itself down.
>
> The master waits until all regionservers are down.  It will then go
> down itself.
>
> If you have set hbase to manage zookeeper, the last thing done on the
> way out is to shut down the zk ensemble.
>
> This is how it is supposed to work.
>
>
>> So, according to my understanding, region servers shall shut
>> themselves down since they can't find either master or zookeeper.
>>
>
> Hmm.  Don't they keep retrying?
>
>
>> But I recently ran an experiment on our hbase cluster. After 2
>> days of master/zookeeper shutdown, the region servers are still alive.
>
> That doesn't seem correct.  Did the cluster come up cleanly?  Or did
> the master go down before regionservers came up?
>
>> I am not sure whether it is a problem in the hbase release or our own
>> problem, since our version is a heavily patched one.
>>
>> Then, can I shut down the hbase cluster in the following way?
>> 1. stop master
>> 2. stop master backups
>> 3. stop zookeepers
>> 4. stop region servers
>>
>> The only difference is step #4. If I manually stop the RS, will it
>> affect data integrity? If not, then I can safely perform these steps
>> to shut down the cluster.
>>
>
> If a regionserver crashes rather than shutting down cleanly, it will
> leave its WAL logs around.  The master will notice them and replay
> them.  So try not to crash out your regionservers. ./bin/stop-hbase.sh
> should bring the regionservers all down cleanly.
>
> If you do ./bin/hbase-daemon.sh stop regionserver, that'll send the
> process a signal.  It'll run its shutdown signal handler.  I think
> this will bring on a clean shutdown.  See the code to be sure.
>
> If it is a clean shutdown, data should be preserved.  Even if it's not
> a clean shutdown, as long as the log splitting is allowed to complete,
> there should be no data loss even if the server crashed.
>
> St.Ack
>
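
As a rough illustration of the WAL point in the quoted reply, here is a
toy Python model of why a crashed regionserver need not lose data. It
is a sketch under assumptions, not HBase's implementation: the names
are hypothetical, and it assumes a clean shutdown flushes in-memory
edits to durable storage while a crash leaves only the log behind.

```python
# Toy model only: hypothetical names, not HBase classes.
class ToyLoggedServer:
    def __init__(self):
        self.wal = []        # write-ahead log, assumed to survive a crash
        self.memstore = {}   # in-memory edits, lost on a crash
        self.flushed = {}    # edits already persisted to store files

    def put(self, key, value):
        self.wal.append((key, value))  # log the edit first
        self.memstore[key] = value     # then apply it in memory

    def clean_shutdown(self):
        # Assumption: a clean shutdown persists everything,
        # so the log is no longer needed.
        self.flushed.update(self.memstore)
        self.memstore.clear()
        self.wal.clear()

    def crash(self):
        # In-memory state is gone; the WAL is left around.
        self.memstore.clear()


def replay(flushed, wal):
    """Rebuild the data by applying leftover WAL edits in order."""
    recovered = dict(flushed)
    for key, value in wal:
        recovered[key] = value
    return recovered
```

After a crash, replaying the leftover log over the flushed data gives
back every edit that was written, which is the "no data loss as long as
log splitting completes" case; after a clean shutdown there is nothing
left to replay.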
