Re: how to restart a hbase cluster

highpointe Tue, 19 Jul 2011 10:09:25 -0700

Dave,

Would you be willing to post your custom scripts?


Your setup sounds useful for what we are doing.

Thanks. 

On Jul 19, 2011, at 10:49 AM, "Buttler, David" <[email protected]> wrote:

> Hi Stack,
> 
> As a further data point, I always use the hbase-daemon.sh scripts to 
> start/stop HBase.  I modified the start/stop-hbase.sh scripts so that they 
> don't start/stop zookeeper, and I have a modified version that I call 
> start/stop-zookeeper.sh.  This allows me to use HBase to manage zookeeper so 
> I can have a more sane configuration system, but not necessarily stop 
> zookeeper when I stop HBase, since I use zookeeper for some other stuff too.
> 
> Sometimes the region servers don't die when I want them to, so I have another 
> script that calls the hbase-daemon.sh stop regionserver script in parallel on 
> all of the machines.  Only rarely do I have to kill -9 one.  But, as far as I 
> can tell, I have never lost data doing this.
> 
> Dave
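
For anyone following along, here is a rough sketch of what a parallel stop like the one described above might look like. It is not Dave's actual script. The /opt/hbase default, the RUNNER variable, and the function name are assumptions made for illustration; only hbase-daemon.sh stop regionserver comes from the thread.

```shell
#!/bin/sh
# Sketch only: stop region servers on many hosts at once.
# The HBASE_HOME default and the RUNNER variable are assumptions.
HBASE_HOME="${HBASE_HOME:-/opt/hbase}"
RUNNER="${RUNNER:-ssh}"   # set RUNNER=echo for a dry run

# stop_regionservers HOSTS_FILE
# HOSTS_FILE holds one hostname per line; blank lines are skipped.
stop_regionservers() {
  hosts_file="$1"
  while read -r host; do
    [ -n "$host" ] || continue
    # Run each remote stop in the background so the hosts proceed in parallel.
    # </dev/null keeps ssh from consuming the rest of the host list.
    "$RUNNER" "$host" "$HBASE_HOME/bin/hbase-daemon.sh stop regionserver" </dev/null &
  done < "$hosts_file"
  wait   # block until every stop command has returned
}
```

It could be called as stop_regionservers "$HBASE_HOME/conf/regionservers", since conf/regionservers is where HBase lists the region server hosts. Running with RUNNER=echo first prints the commands without touching any machine.
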
> 
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Stack
> Sent: Tuesday, July 19, 2011 12:11 AM
> To: [email protected]
> Subject: Re: how to restart a hbase cluster
> 
> On Tue, Jul 19, 2011 at 12:02 AM, Weihua JIANG <[email protected]> wrote:
>> It seems stop-hbase.sh only stops master/backup masters and zookeepers.
>> 
> 
> Usually it sends a signal to the master that then sets a flag in
> zookeeper.  When regionservers see this flag, they start to close down
> user-space regions.  When all user-space regions have been closed,
> the server will close catalog regions.  When a regionserver is
> carrying no regions, it shuts itself down.
> 
> The master waits until all regionservers are down.  It then will go down 
> itself.
> 
> If you have set hbase to manage zookeeper, the last thing done on the
> way out is to shut down the zk ensemble.
> 
> This is how it is supposed to work.
> 
> 
>> So, according to my understanding, region servers should shut
>> themselves down since they can't find either master or zookeeper.
>> 
> 
> Hmm.  Don't they keep retrying?
> 
> 
>> But I recently ran an experiment on our hbase cluster. Two days after
>> the master/zookeeper shutdown, the region servers were still alive.
> 
> That doesn't seem correct.  Did the cluster come up cleanly?  Or did
> the master go down before regionservers came up?
> 
>> I am not sure whether this is a problem in the hbase release or our own
>> problem, since our version is heavily patched.
>> 
>> Then, can I shut down the hbase cluster in the following way?
>> 1. stop master
>> 2. stop master backups
>> 3. stop zookeepers
>> 4. stop region servers
>> 
>> The only difference is step #4. If I manually stop the RSs, will it
>> affect data integrity? If not, then I can safely perform these steps
>> to shut down the cluster.
>> 
> 
> If a regionserver crashes rather than shutting down cleanly, it will
> leave its WAL logs around.  The master will notice them and replay
> them.  So try not to crash out your regionservers. ./bin/stop-hbase.sh
> should bring all the regionservers down cleanly.
> 
> If you do ./bin/hbase-daemon.sh stop regionserver, that'll send the
> process a signal.  It'll run its shutdown signal handler.  I think
> this will bring on a clean shutdown.  See the code to be sure.
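
A minimal sketch of the signal-then-wait pattern described just above, with kill -9 held back as a last resort after a timeout. This is not the code in hbase-daemon.sh. The function name stop_with_timeout and its return codes are made up for illustration, and it assumes you already have the regionserver's PID.

```shell
#!/bin/sh
# Sketch only: ask a process to stop, wait, and force it only on timeout.

# stop_with_timeout PID SECONDS
# Returns 0 if the process exited after SIGTERM (or was already gone),
# 1 if SIGKILL had to be sent once SECONDS had elapsed.
stop_with_timeout() {
  swt_pid="$1"
  swt_timeout="$2"
  swt_waited=0
  # SIGTERM lets the process run its shutdown handler.
  kill -TERM "$swt_pid" 2>/dev/null || return 0
  # kill -0 sends no signal; it only checks that the process still exists.
  while kill -0 "$swt_pid" 2>/dev/null; do
    if [ "$swt_waited" -ge "$swt_timeout" ]; then
      kill -KILL "$swt_pid" 2>/dev/null
      return 1
    fi
    sleep 1
    swt_waited=$((swt_waited + 1))
  done
  return 0
}
```

A generous timeout matters here, because a regionserver that is still closing regions needs time to finish. A forced kill is the case where the WAL logs get left behind for the master to replay.
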
> 
> If it is a clean shutdown, data should be preserved.  Even if it's not a
> clean shutdown, as long as the log splitting is allowed to complete,
> there should be no data loss even if the server crashed.
> 
> St.Ack
