> While I'm trying to figure out what is causing the region to be 
> non-responsive, what's the best way to recover?

I'd recover from that the same way I'd recover from any unavailable
storage component, since that can happen with any DB/SAN/etc., right?
The link between your client and HBase could be cut, a split could
take way too long, or something more serious could happen, like your
whole cluster going down. Your choices are to fail the user's request
(if it's user-facing), to buffer the edits somewhere until the problem
is resolved, or to use any other nifty trick you can think of.
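
To make the "buffer the edits" option concrete, here is a rough sketch in
Python. It isn't tied to the HBase client API: `put` stands in for whatever
call writes one edit to your store, and `StoreUnavailable`, `BufferingWriter`
and `max_buffered` are names made up for illustration. It queues edits while
the store is unreachable, fails the request once the queue is full, and
replays the queue in order when the store comes back.

```python
from collections import deque


class StoreUnavailable(Exception):
    """Hypothetical error raised when the storage backend cannot be reached."""


class BufferingWriter:
    """Wraps a put(row, value) callable and buffers edits while it is failing."""

    def __init__(self, put, max_buffered=1000):
        self._put = put                  # assumed: raises StoreUnavailable on failure
        self._pending = deque()          # edits waiting to be replayed, oldest first
        self._max_buffered = max_buffered

    def write(self, row, value):
        """Return True if written through, False if the edit was buffered."""
        if self._pending:
            # Keep ordering: new edits must not overtake buffered ones.
            self._enqueue(row, value)
            return False
        try:
            self._put(row, value)
            return True
        except StoreUnavailable:
            self._enqueue(row, value)
            return False

    def _enqueue(self, row, value):
        if len(self._pending) >= self._max_buffered:
            # Buffer is full: fall back to failing the request.
            raise StoreUnavailable("buffer full; failing the request")
        self._pending.append((row, value))

    def flush(self):
        """Replay buffered edits in order; stop at the first failure."""
        replayed = 0
        while self._pending:
            row, value = self._pending[0]
            try:
                self._put(row, value)
            except StoreUnavailable:
                break
            self._pending.popleft()
            replayed += 1
        return replayed

    def pending(self):
        return len(self._pending)
```

You'd call `flush()` from a timer or before the next write. Note the buffer
here lives in memory, so the edits are lost if the client process dies; a
durable queue would be needed if that matters for you.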

BTW we're planning to clean up the client retry policy, see
https://issues.apache.org/jira/browse/HBASE-2445

J-D

On Fri, Jun 18, 2010 at 6:13 AM, Michael Segel
<michael_se...@hotmail.com> wrote:
>
> Hi,
>
> Here's the situation ...
>
> Something is futzing up our HBase tables and when processes try to run a 
> query, they end up getting an error of trying to connect to a region which 
> isn't responding. After 10 tries it fails.
>
> While I'm trying to figure out what is causing the region to be 
> non-responsive, what's the best way to recover?
>
> It looks like ZK and the Region Server are out of sync. But that's a guess.
> I had this problem late last night and I was using one of the tools in HBase 
> Shell that seemed to correct it and then I could truncate the table. (Not 
> sure which one did it.)
>
> Thx
>
> -Mike
>
