HBase Thrift health checker

Daniel Einspanjer Thu, 09 Sep 2010 17:37:03 -0700

Cross-posting my recent blog entry...

As documented in THRIFT-601, sending random data to Thrift can cause it to leak memory.

At Mozilla, we use a web load balancer to distribute traffic to our Thrift machines, and the default liveness check it uses is a simple TCP connect. We also had Nagios performing TCP connect checks on these nodes for general alerting.

All these connects were causing the Thrift servers to start generating OOM errors, sometimes as quickly as a few days after being started.

I wrote a test utility that performs a legitimate Thrift API call (it actually tries to get the schema of the .META. table) and reports success if it can execute the call.

The utility can either run from the command line or use the lightweight HTTP server class that is part of the Sun JRE 6, in which case it listens for a request to /thrift/health and reports back the status.
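For readers who have not used the JRE's built-in HTTP server, here is a minimal sketch of what a /thrift/health endpoint along those lines might look like. It is not the code from HbaseThriftTester: the class name, the BooleanSupplier standing in for the real Thrift call, the 200/503 status codes and the "OK"/"FAIL" bodies are all assumptions made for illustration, and it uses lambda syntax, so it needs Java 8 or later rather than Java 6.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.function.BooleanSupplier;

// Hypothetical sketch, not the actual HbaseThriftTester source.
class HealthEndpointSketch {

    /**
     * Starts an HTTP server that runs the supplied check each time
     * /thrift/health is requested. The check is a stand-in for a real
     * Thrift API call; port 0 asks the OS for any free port.
     */
    static HttpServer start(int port, BooleanSupplier thriftCheck) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/thrift/health", exchange -> {
            boolean healthy;
            try {
                healthy = thriftCheck.getAsBoolean();
            } catch (RuntimeException e) {
                // Treat any unexpected error in the check as a failed check.
                healthy = false;
            }
            // Assumed response convention: 200 "OK" or 503 "FAIL".
            byte[] body = (healthy ? "OK\n" : "FAIL\n").getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(healthy ? 200 : 503, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder check that always succeeds; a real one would make a Thrift call.
        start(8080, () -> true);
    }
}
```

A load balancer or Nagios can then request the URL and look at the status code instead of opening a bare TCP connection to the Thrift port.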


$ java -jar HbaseThriftTester.jar

Missing required option: [-check Immediately checks the following host:port combinations and returns a summary message with an exit value of the number of failures., -listen Run as an HTTP daemon listening on port. Checks the hosts every time /thrift/health URL is requested.]

usage: HbaseThriftTester [-timeout <ms>] <mode> <host:port>...
 -check               Immediately checks the following host:port
                      combinations and returns a summary message with an
                      exit value of the number of failures.
 -listen <port>       Run as an HTTP daemon listening on port. Checks the
                      hosts every time /thrift/health URL is requested.
 -timeout <seconds>   Number of seconds to wait for Thrift call to
                      complete
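To make the -check behaviour concrete, here is a rough sketch of how an exit value equal to the number of failures could be computed. Again, this is not the tool's actual source: the ThriftProbe interface, the class and method names, and the per-host output lines are hypothetical, and the timeout is taken in milliseconds purely for illustration, since the usage text above mentions both ms and seconds.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the actual HbaseThriftTester source.
class CheckModeSketch {

    /** Stand-in for a real Thrift API call; throws if the call fails. */
    interface ThriftProbe {
        void probe(String host, int port) throws Exception;
    }

    /** Probes each host:port target and returns how many failed or timed out. */
    static int countFailures(List<String> targets, ThriftProbe probe, long timeoutMs) {
        ExecutorService pool = Executors.newCachedThreadPool();
        int failures = 0;
        try {
            for (String target : targets) {
                Future<?> result = null;
                try {
                    int colon = target.lastIndexOf(':');
                    if (colon < 0) {
                        throw new IllegalArgumentException("expected host:port");
                    }
                    String host = target.substring(0, colon);
                    int port = Integer.parseInt(target.substring(colon + 1));
                    // Run the probe on a worker thread so it can be abandoned on timeout.
                    result = pool.submit(() -> {
                        probe.probe(host, port);
                        return null;
                    });
                    result.get(timeoutMs, TimeUnit.MILLISECONDS);
                    System.out.println(target + " OK");
                } catch (Exception e) {
                    // Bad target, failed call, and timeout all count as one failure.
                    if (result != null) {
                        result.cancel(true);
                    }
                    failures++;
                    System.out.println(target + " FAILED: " + e);
                }
            }
        } finally {
            pool.shutdownNow();
        }
        return failures;
    }

    public static void main(String[] args) {
        // Placeholder probe: always fails until a real Thrift call is plugged in.
        ThriftProbe placeholder = (host, port) -> {
            throw new UnsupportedOperationException("plug in a real Thrift call here");
        };
        System.exit(countFailures(Arrays.asList(args), placeholder, 5000));
    }
}
```

Returning the failure count as the exit value means a wrapper script only has to test for a non-zero status to know that at least one host is unhealthy.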

The app is bundled up using one-jar, so it is simple and easy to call from within a Nagios script or some such. Maybe it will be useful to someone else. Just pull down the project, then build with ant.


https://svn.mozilla.org/metrics/hadoop/hbase/HbaseThriftTester/

-Daniel
