I'm having a strange problem that my usual bag of tricks is having
trouble sorting out. On Friday, queries stopped returning for some reason.
You could see them come in, and there would be a resource utilization
spike that would fade out after an appropriate amount of time; however,
the query would never actually return. This could be related to our
client code but I wasn't able to dig into it since this was the middle
of the day on a production system. Since this had happened before and
bouncing HBase cleared it up, I proceeded to disable tables and restart
HBase. After bringing HBase back up, a few thousand regions are stuck in
the PENDING_OPEN state and refuse to leave it. I've run hbck
-repair a number of times under a few conditions (even the offline
repair), deleted everything under /hbase in ZooKeeper, and even
migrated the cluster to new servers (EMR), all with no luck. When I spin
HBase up, the regions are already in PENDING_OPEN even though the tables
are offline.
Any ideas on what's going on here would be a huge help.
Thanks,
Austin