> * The most common reason I've had for Region Server Suicide is
> zookeeper. The region server thinks zookeeper is down. I thought this
> had to do with heavy load, but this also happens for me even when there
> is nothing running. I haven't been able to find a quantifiable cause.
> This is just a weakness that exists in the hbase-zookeeper dependency.
> Higher loads exacerbate the problem, but are not required for a Region
> Server Suicide event to occur.
Can you go into more detail here? You seem to be saying that for absolutely no reason you can lose communication between the RS and the ZooKeeper quorum for extended periods of time? The "suicide" only happens if the RS loses its ZK session, which can take upwards of a minute. I've been running clusters of all sizes and have not seen this happen. Over the past few weeks I've been running kill tests on ZK, ensuring we properly ride over the failure of a ZK node. Things generally work, but it depends on what version you are on. In any case, there should not be random session expirations w/o reason. Do you have networking issues? Do you see dropped packets on your device stats?

> * Another reason is the HDFS dependency... if a file is perhaps
> temporarily unavailable for any reason, HBase handles this situation
> with Region Server Suicide.

Saying "perhaps temporarily unavailable" is not exactly right. There are retries and timeouts. Files do not just disappear for no reason. Yes, given enough time or issues, the RS will eventually kill itself. We can do better here, but in the end any database that loses its file system is going to have no choice but to shut down.

I certainly agree that we have work to do on robustness, but I disagree with the picture you're painting here that HBase is this unstable and flaky. Many of us have been in production for quite some time.

> Perhaps if there were a setting, whether or not a region server is
> allowed to commit suicide, some of us would feel more comfortable with
> the idea.

What you are calling suicide is something that we are using more, not less, these days. We do it when we get into some kind of irrecoverable state (generally ZK session expiration or losing the FS). The best way to handle this from an operations pov is to have a monitoring process on the RS that will start the process again if it ever dies. Hopefully we can open source some scripts to do this soon. So this can't be an option to turn on/off.

Are you using Ganglia or other cluster monitoring software? These kinds of ZK and HDFS issues usually come from over-extended clusters with swapping and/or io starvation. Sorry if you've had previous threads and I just forget.

> In the mean time, you can try to work around any of these issues by
> using bigger hardware than you would otherwise think is needed and not
> letting the load get very high. For example, I tend to have these
> kinds of problems much less often when the load on any individual
> machine never goes above the number of cores.

What kind of hardware do you think should be needed? And are you talking about machine specs or cluster size? I've run HBase on small, slow clusters and big, fast ones. Of course, you need to configure properly and have the right expectations. I'm not even sure what you mean by "load goes above number of cores".

Looking back at your previous threads, I see you're running on EC2. I guess that explains much of your experiences, though I know some people running in production and happy on EC2 w/ HBase. In the future, preface your post with that, as I think it has a lot to do with the problems you've been seeing.

JG
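
The "upwards of a minute" above is the ZK session timeout, which HBase exposes as zookeeper.session.timeout in hbase-site.xml. As a minimal sketch of where that knob lives (the 60000 ms value is only illustrative; the default varies by version, and the effective timeout is also capped by the ZK server's own maxSessionTimeout):

  <!-- hbase-site.xml: how long ZK waits before expiring the RS session (ms). -->
  <!-- A longer timeout rides over GC or network hiccups, but delays failure detection. -->
  <property>
    <name>zookeeper.session.timeout</name>
    <value>60000</value>
  </property>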
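
On the monitoring-process suggestion: a rough sketch of the idea, not the scripts referred to above. It assumes the stock bin/hbase-daemon.sh launcher, a region server JVM whose command line contains "HRegionServer", and placeholder paths and intervals; adjust all of these for your install.

  #!/usr/bin/env python3
  # Rough watchdog sketch: restart the region server process if it dies.
  import os
  import subprocess
  import time

  HBASE_HOME = os.environ.get("HBASE_HOME", "/usr/local/hbase")  # placeholder path
  CHECK_INTERVAL = 30  # seconds between liveness checks

  def regionserver_running():
      # pgrep -f matches against the full command line, so it finds the RS JVM.
      return subprocess.call(["pgrep", "-f", "HRegionServer"],
                             stdout=subprocess.DEVNULL) == 0

  def start_regionserver():
      subprocess.call([os.path.join(HBASE_HOME, "bin", "hbase-daemon.sh"),
                       "start", "regionserver"])

  if __name__ == "__main__":
      while True:
          if not regionserver_running():
              # RS is gone (e.g. ZK session expiry killed it); bring it back
              # and let the master reassign regions as it normally would.
              start_regionserver()
          time.sleep(CHECK_INTERVAL)

In practice something like daemontools, monit, or an init-based respawn does the same job more robustly than a hand-rolled loop.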
