Dave,

Would you be willing to post your custom scripts?

Your setup sounds useful for what we are doing.

Thanks. 

On Jul 19, 2011, at 10:49 AM, "Buttler, David" <[email protected]> wrote:

> Hi Stack,
> 
> As a further data point, I always use the hbase-daemon.sh scripts to 
> start/stop HBase.  I modified the start/stop-hbase.sh scripts so that they 
> don't start/stop zookeeper, and I have a modified version that I call 
> start/stop-zookeeper.sh.  This allows me to use HBase to manage zookeeper so 
> I can have a more sane configuration system, but not necessarily stop 
> zookeeper when I stop HBase, since I use zookeeper for some other stuff too.
> 
> Sometimes the region servers don't die when I want them to, so I have another 
> script that calls the hbase-daemon.sh stop regionserver script in parallel on 
> all of the machines.  Only rarely do I have to kill -9 one.  But, as far as I 
> can tell, I have never lost data doing this.
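
For anyone reading along, the parallel stop described above could be sketched roughly as below. This is not Dave's actual script. It assumes passwordless ssh, a host list with one regionserver per line (such as conf/regionservers), and the same HBASE_HOME on every machine. The names stop_regionservers and REMOTE_RUN and the /opt/hbase default are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: ask every regionserver to stop, all hosts in parallel.
# HBASE_HOME default and REMOTE_RUN are assumptions, not HBase settings.

HBASE_HOME="${HBASE_HOME:-/opt/hbase}"
# Command used to run something on a remote host; defaults to ssh.
REMOTE_RUN="${REMOTE_RUN:-ssh}"

stop_regionservers() {
  local hosts_file="$1"
  local host
  while IFS= read -r host; do
    # Skip blank lines and comment lines in the host list.
    case "$host" in ''|'#'*) continue ;; esac
    # Background each call so the hosts are handled in parallel.
    # stdin comes from /dev/null so ssh cannot swallow the host list.
    $REMOTE_RUN "$host" "$HBASE_HOME/bin/hbase-daemon.sh stop regionserver" </dev/null &
  done < "$hosts_file"
  # Block until every background stop command has returned.
  wait
}
```

It would be called as `stop_regionservers "$HBASE_HOME/conf/regionservers"`. Running the stops in the background and then calling `wait` is what makes them parallel; a plain loop would stop one machine at a time.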
> 
> Dave
> 
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Stack
> Sent: Tuesday, July 19, 2011 12:11 AM
> To: [email protected]
> Subject: Re: how to restart a hbase cluster
> 
> On Tue, Jul 19, 2011 at 12:02 AM, Weihua JIANG <[email protected]> wrote:
>> It seems stop-hbase.sh only stops master/backup masters and zookeepers.
>> 
> 
> Usually it sends a signal to the master that then sets a flag in
> zookeeper.  When regionservers see this flag, they start to close down
> user-space regions.  When all user-space regions have been closed,
> the server will close catalog regions.  When a regionserver is
> carrying no regions, it shuts itself down.
> 
> The master waits until all regionservers are down.  It then will go down 
> itself.
> 
> If you have set hbase to manage zookeeper, the last thing done on the
> way out is shutting down the zk ensemble.
> 
> This is how it is supposed to work.
> 
> 
>> So, according to my understanding, region servers should shut
>> themselves down since they can't find either master or zookeeper.
>> 
> 
> Hmm.  Don't they keep retrying?
> 
> 
>> But I recently ran an experiment on our hbase cluster. Two days after
>> the master/zookeeper shutdown, the region servers were still alive.
> 
> That doesn't seem correct.  Did the cluster come up cleanly?  Or did
> the master go down before regionservers came up?
> 
>> I am not sure whether this is a problem in the hbase release or in our
>> own changes, since our version is heavily patched.
>> 
>> Then, can I shut down the hbase cluster in the following way?
>> 1. stop master
>> 2. stop master backups
>> 3. stop zookeepers
>> 4. stop region servers
>> 
>> The only difference is step #4. If I manually stop the RSs, will it
>> affect data integrity? If not, then I can safely perform these steps
>> to shut down the cluster.
>> 
> 
> If a regionserver crashes rather than shutting down cleanly, it will
> leave its WAL logs around.  The master will notice them and replay
> them.  So try not to crash your regionservers.  ./bin/stop-hbase.sh
> should bring all the regionservers down cleanly.
> 
> If you do ./bin/hbase-daemon.sh stop regionserver, that'll send the
> process a signal.  It'll run its shutdown signal handler.  I think
> this will bring on a clean shutdown.  See the code to be sure.
> 
> If the shutdown is clean, data should be preserved.  Even if it's not
> a clean shutdown, as long as log splitting is allowed to complete,
> there should be no data loss even if the server crashed.
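
The general pattern behind a pid-file based stop (send SIGTERM, wait for the process to exit, and only then fall back to kill -9) can be sketched as follows. This is an illustration of the pattern, not the code in hbase-daemon.sh. The name stop_by_pidfile, the 30 second default timeout and the return codes are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from hbase-daemon.sh.
# Usage: stop_by_pidfile <pidfile> [timeout_seconds]
# Returns 0 if the process exited after SIGTERM (or was not running),
# 1 if it had to be sent SIGKILL after the timeout.
stop_by_pidfile() {
  local pidfile="$1"
  local timeout="${2:-30}"
  local pid waited=0

  # No pid file means there is nothing to stop.
  [ -f "$pidfile" ] || return 0
  pid=$(cat "$pidfile")

  # Plain kill sends SIGTERM, which lets the shutdown handler run.
  kill "$pid" 2>/dev/null || return 0

  # kill -0 sends no signal; it only checks that the process still exists.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      # Last resort: the process did not exit in time.
      kill -9 "$pid" 2>/dev/null
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

A call might look like `stop_by_pidfile /tmp/hbase-$USER-regionserver.pid 60`, where the pid file path is an assumption that depends on your HBASE_PID_DIR setting. A return value of 1 means the server was killed rather than shut down cleanly, so its WAL logs will be left behind for the master to split and replay.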
> 
> St.Ack
