Re: Regions Stuck PENDING_OPEN

Austin Heyne Mon, 01 Oct 2018 07:46:20 -0700

I'm running HBase 1.4.4 on EMR. In following your suggestions I realized that the master is trying to assign the regions to dead/non-existent region servers. While trying to fix this problem I had killed the EMR cluster and started a new one. It's still trying to assign some regions to those region servers in the previous cluster. I tried to manually move one of the regions to a good region server but I'm getting 'ERROR: No route to host' when I try to close the region.

I've tried nuking the /hbase directory in ZooKeeper but that didn't seem to help so I'm not sure where it's getting these references from.


-Austin


On 09/30/2018 02:38 PM, Josh Elser wrote:

First off: You're on EMR? What version of HBase are you using? (Maybe Zach or Stephen can help here too.) Can you figure out the RegionServer(s) which are stuck opening these PENDING_OPEN regions? Can you get a jstack/thread-dump from those RS's?
In terms of how the system is supposed to work: the PENDING_OPEN state for a Region "R" means: the active Master has asked a RegionServer to open R. That RS should have an active thread which is trying to open R. Upon success, the state of R will move from PENDING_OPEN to OPEN. Otherwise, the Master will try to assign R again.
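The flow described above can be sketched as a toy model. This is only an illustration of the state transitions, not HBase's actual assignment code; the names `RegionState`, `assign`, and `try_open` are hypothetical, and the round-robin retry policy is an assumption made to keep the example short.

```python
from enum import Enum


class RegionState(Enum):
    OFFLINE = "OFFLINE"
    PENDING_OPEN = "PENDING_OPEN"
    OPEN = "OPEN"


def assign(region, servers, try_open, max_attempts=3):
    """Toy model of the Master assigning one region.

    try_open(server, region) stands in for the RegionServer's attempt to
    open the region and returns True on success (hypothetical callback).
    Returns (final_state, server), where server is None if nothing opened it.
    """
    state = RegionState.OFFLINE
    if not servers:
        return state, None  # nobody to ask, so the region never leaves OFFLINE
    for attempt in range(max_attempts):
        # Assumed policy: walk the server list round-robin on each retry.
        server = servers[attempt % len(servers)]
        # The Master has asked `server` to open the region.
        state = RegionState.PENDING_OPEN
        if try_open(server, region):
            # Success: PENDING_OPEN -> OPEN.
            state = RegionState.OPEN
            return state, server
        # Failure: fall through and let the Master try to assign again.
    # Every attempt failed, so the region is left in PENDING_OPEN.
    return state, None
```

For example, `assign("R", ["dead-host"], lambda s, r: False)` leaves the region in PENDING_OPEN with no server, which is roughly the symptom reported in this thread when every candidate server is unreachable.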
In the absence of any custom coprocessors (including Phoenix), this would mean some subset of RegionServers are in a bad state. Figuring out what those RS's are trying to do will be the next step in figuring out why they're stuck like that. It might be obvious from the UI, or you might have to look at hbase:meta or the master log to figure it out.
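One way to act on the hbase:meta suggestion is to compare the server recorded for each region against the servers that are actually alive. The sketch below assumes you have already collected (region, server) pairs by some means, for instance by reading the info:server column of hbase:meta; the function name, the input format, and the host names in the example are all hypothetical.

```python
def find_stale_assignments(meta_rows, live_servers):
    """Return {region: server} for regions recorded against a non-live server.

    meta_rows: iterable of (region_name, server) pairs, where server is a
        "host:port" string or None if the row has no server recorded
        (assumed input format, not something HBase emits directly).
    live_servers: iterable of "host:port" strings for the live RegionServers.
    """
    live = set(live_servers)
    return {
        region: server
        for region, server in meta_rows
        if server and server not in live
    }


# Example with made-up hosts: r2 still points at a server from an old cluster.
rows = [
    ("r1", "ip-10-0-0-1:16020"),
    ("r2", "ip-10-0-0-9:16020"),
    ("r3", None),
]
stale = find_stale_assignments(rows, {"ip-10-0-0-1:16020"})
```

Regions that show up in the result are the ones still referencing servers that no longer exist, which narrows down where to look in the master log.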
One caveat: it's possible that the Master is just not doing the right thing as described above. If the steps described above don't seem to match what your system is doing, you might have to look closer at the Master log. Make sure you have DEBUG on to get anything of value out of the system.
On 9/30/18 1:43 PM, Austin Heyne wrote:
I'm having a strange problem that my usual bag of tricks is having trouble sorting out. On Friday queries stopped returning for some reason. You could see them come in and there would be a resource utilization spike that would fade out after an appropriate amount of time; however, the query would never actually return. This could be related to our client code but I wasn't able to dig into it since this was the middle of the day on a production system. Since this had happened before and bouncing HBase cleared it up, I proceeded to disable tables and restart HBase. Upon bringing HBase back up, a few thousand regions are stuck in PENDING_OPEN state and refuse to move from that state. I've run hbck -repair a number of times under a few conditions (even the offline repair), have deleted everything out of /hbase in ZooKeeper and even migrated the cluster to new servers (EMR) with no luck. When I spin HBase up the regions are already at PENDING_OPEN even though the tables are offline.
Any ideas on what's going on here would be a huge help.

Thanks,
Austin


--
Austin L. Heyne
