bq. data node processes doesn’t die

Which Hadoop version are you using?
Have you read the following section in http://hbase.apache.org/book.html#_hbase_and_hdfs ?

HDFS takes a while to mark a node as dead. You can configure HDFS to avoid using stale DataNodes.

Cheers

On Wed, Jun 24, 2015 at 10:19 AM, Arun Mishra <[email protected]> wrote:

> I am guessing that HBASE-7351 won’t work for my case since process won’t
> be able to read the script from disk.
>
> Regards,
> Arun
>
> > On Jun 23, 2015, at 9:48 PM, Arun Mishra <[email protected]> wrote:
> >
> > Hello,
> >
> > I am using hbase cdh version 0.98.6. I am facing a problem where a disk
> > controller fails on a host and all disk operation kind of hang up on that
> > host. But region server/data node processes doesn’t die and at the same
> > time the zookeeper session keeps alive. Resulting in all requests to that
> > region server failing. Currently, I use zookeeper client to delete the
> > corresponding znode manually to initiate the recovery process. It will take
> > some time to figure out the hardware issue and fix it. Meanwhile, I am
> > looking to find some solution to automate the recovery process.
> >
> > I came across HBASE-7351. I am wondering if any one has used this
> > feature or if any other option is available to kill a region server in
> > similar partial hardware failures case. Any insight would be very helpful
> > to me.
> >
> > Thanks - Arun.
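
For the stale DataNode suggestion at the top of this reply, a minimal sketch of the relevant hdfs-site.xml settings on the NameNode might look like the following. The property names are the ones listed in hdfs-default.xml for Hadoop 2.x; the values are illustrative assumptions, not tuned recommendations, and should be checked against the Hadoop version that ships with your CDH release.

```xml
<!-- hdfs-site.xml (NameNode side); values are illustrative, not recommendations -->
<property>
  <!-- Prefer non-stale DataNodes when choosing replicas for reads -->
  <name>dfs.namenode.avoid.read.stale.datanode</name>
  <value>true</value>
</property>
<property>
  <!-- Avoid stale DataNodes when placing new blocks for writes -->
  <name>dfs.namenode.avoid.write.stale.datanode</name>
  <value>true</value>
</property>
<property>
  <!-- Time without a heartbeat, in milliseconds, before a DataNode is treated as stale (default 30000) -->
  <name>dfs.namenode.stale.datanode.interval</name>
  <value>30000</value>
</property>
```

Note that staleness is judged from missed heartbeats, so if the DataNode process keeps heartbeating while its disks hang, these settings alone may not take the node out of rotation.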
