> * The most common reason I've had for Region Server Suicide is
> zookeeper. The region server thinks zookeeper is down. I thought this
> had to do with heavy load, but this also happens for me even when there
> is nothing running. I haven't been able to find a quantifiable cause.
> This is just a weakness that exists in the hbase-zookeeper dependency.
> Higher loads exacerbate the problem, but are not required for a Region
> Server Suicide event to occur.
Can you go into more detail here? You seem to be saying that for absolutely no reason you can lose communication between the RS and the ZooKeeper quorum for extended periods of time? The "suicide" only happens if the RS loses its ZK session, which can take upwards of a minute. I've been running clusters of all sizes and have not seen this happen. Over the past few weeks I've been running kill tests on ZK, ensuring we properly ride over the failure of a ZK node. Things generally work, but it depends on what version you are on. In any case, there should not be random session expirations w/o reason. Do you have networking issues? Do you see dropped packets on your device stats?

> * Another reason is the HDFS dependency... if a file is perhaps
> temporarily unavailable for any reason, HBase handles this situation
> with Region Server Suicide.

Saying "perhaps temporarily unavailable" is not exactly right. There are retries and timeouts. Files do not just disappear for no reason. Yes, given enough time or issues, the RS will eventually kill itself. We can do better here, but in the end any database that loses its file system is going to have no choice but to shut down.

I certainly agree that we have work to do on robustness, but I disagree with the picture you're painting here that HBase is this unstable and flaky. Many of us have been in production for quite some time.

> Perhaps if there were a setting, whether or not a region server is
> allowed to commit suicide, some of us would feel more comfortable with
> the idea.

What you are calling suicide is something that we are using more, not less, these days. We do it when we get into some kind of irrecoverable state (generally ZK session expiration or losing the FS). The best way to handle this from an operations pov is to have a monitoring process on the RS that will start the process again if it ever dies. Hopefully we can open source some scripts to do this soon. So this can't be an option to turn on/off.

Are you using Ganglia or other cluster monitoring software? These kinds of ZK and HDFS issues usually come from over-extended clusters with swapping and/or io starvation. Sorry if you've had previous threads and I just forget.

> In the mean time, you can try to work around any of these issues by
> using bigger hardware than you would otherwise think is needed and not
> letting the load get very high. For example, I tend to have these
> kinds of problems much less often when the load on any individual
> machine never goes above the number of cores.

What kind of hardware do you think should be needed? And are you talking about machine specs or cluster size? I've run HBase on small, slow clusters and big, fast ones. Of course, you need to configure properly and have the right expectations. I'm not even sure what you mean by "load goes above number of cores".

Looking back at your previous threads, I see you're running on EC2. I guess that explains much of your experiences, though I know some people running in production and happy on EC2 w/ HBase. In the future, preface your post with that, as I think it has a lot to do with the problems you've been seeing.

JG
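
The "upwards of a minute" above is the ZK session timeout, which HBase exposes as zookeeper.session.timeout in hbase-site.xml. As a minimal sketch of where that knob lives (the 60000 ms value is only illustrative; the default varies by version, and the effective timeout is also capped by the ZK server's own maxSessionTimeout):

  <!-- hbase-site.xml: how long ZK waits before expiring the RS session (ms). -->
  <!-- A longer timeout rides over GC or network hiccups, but delays failure detection. -->
  <property>
    <name>zookeeper.session.timeout</name>
    <value>60000</value>
  </property>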
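
On the monitoring-process suggestion: a rough sketch of the idea, not the scripts referred to above. It assumes the stock bin/hbase-daemon.sh launcher, a region server JVM whose command line contains "HRegionServer", and placeholder paths and intervals; adjust all of these for your install.

  #!/usr/bin/env python3
  # Rough watchdog sketch: restart the region server process if it dies.
  import os
  import subprocess
  import time

  HBASE_HOME = os.environ.get("HBASE_HOME", "/usr/local/hbase")  # placeholder path
  CHECK_INTERVAL = 30  # seconds between liveness checks

  def regionserver_running():
      # pgrep -f matches against the full command line, so it finds the RS JVM.
      return subprocess.call(["pgrep", "-f", "HRegionServer"],
                             stdout=subprocess.DEVNULL) == 0

  def start_regionserver():
      subprocess.call([os.path.join(HBASE_HOME, "bin", "hbase-daemon.sh"),
                       "start", "regionserver"])

  if __name__ == "__main__":
      while True:
          if not regionserver_running():
              # RS is gone (e.g. ZK session expiry killed it); bring it back
              # and let the master reassign regions as it normally would.
              start_regionserver()
          time.sleep(CHECK_INTERVAL)

In practice something like daemontools, monit, or an init-based respawn does the same job more robustly than a hand-rolled loop.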
