We do a simple TCP check. I know that some people run a more extensive check that calls a low-overhead method like getConfiguration, just to see whether the region server can answer requests.
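
For illustration, a TCP check of this kind could be sketched as a small Nagios-style plugin. This is only a rough sketch, not the script we actually run: the function name check_tcp, the 5 second timeout, and the message wording are assumptions. It relies on the usual Nagios plugin convention of exit code 0 for OK and 2 for CRITICAL, and it only proves that the port accepts connections, not that the server can answer requests.

```python
import socket
import sys

# Standard Nagios plugin exit codes.
OK, CRITICAL = 0, 2


def check_tcp(host, port, timeout=5.0):
    """Return (exit_code, message) after trying a plain TCP connect."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return OK, "OK - %s:%d accepts connections" % (host, port)
    except OSError as exc:
        return CRITICAL, "CRITICAL - %s:%d unreachable (%s)" % (host, port, exc)


# Hypothetical command-line use: check_tcp.py <host> <port>
if __name__ == "__main__" and len(sys.argv) == 3:
    code, message = check_tcp(sys.argv[1], int(sys.argv[2]))
    print(message)
    sys.exit(code)
```

You would point one Nagios service definition at each daemon's port (region server, datanode, thrift), using whatever ports your own configuration assigns, so that each one alarms separately.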

J-D

On Mon, Jun 6, 2011 at 6:55 PM, Wayne <[email protected]> wrote:
> Are there any recommended methods/scripts to monitor nodes via nagios? It
> would be best to have a simple nagios call to check hadoop, hbase, &
> thrift separately and alarm if one of them is awol (and not have the script
> cause damage like I have read with thrift). For example our friendly CMF
> issues which will occasionally go beyond the 60s zookeeper timeout will
> cause hbase nodes to shutdown. We want a nagios alarm/text message to be
> sent to let us know we have to go restart the region server.
>
> I know we can figure out something on our own...but wanted to see if there
> are some std methods for monitoring that have been developed and shared that
> we can use.
>
> Thanks.
>
