Hi,

>> We are running ZK 3.3.4, Cloudera cdh3u3, HBase 0.94.16.

The ZK version is quite old. From what I can see, ClientCnxn only catches
IOException, so when there is an OOME the SendThread exits.
I think that's the reason for the client hanging. A client-side thread dump will
help us check whether the SendThread is still alive.
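
A thread dump from jstack is the usual way to check this. If it is easier to
check from inside the client JVM, here is a minimal sketch. It assumes the ZK
client's sender thread has "SendThread" somewhere in its name; the class and
method names below are made up for illustration and are not part of ZK or HBase.

```java
public class SendThreadCheck {
    // Returns true if any live thread's name contains the given fragment.
    // The fragment to look for ("SendThread") is an assumption about how the
    // ZK client names its sender thread.
    static boolean hasLiveThread(String nameFragment) {
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && t.getName().contains(nameFragment)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("SendThread alive: " + hasLiveThread("SendThread"));
    }
}
```

If this reports false while a caller is still blocked on a ZK request, that
would be consistent with the SendThread having exited.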

Client-side exception handling has been modified in the 3.4 and 3.5 branches.
Can you check the possibility of upgrading to 3.4.6, the latest release?

Regards,
Rakesh

-----Original Message-----
From: Qiang Tian [mailto:tian...@gmail.com] 
Sent: 14 August 2014 11:03
To: user@hbase.apache.org; d...@zookeeper.apache.org
Subject: Re: HBase client hangs after client-side OOM

The SendThread stack trace does not look right. Do you have the client log
(in case the ZK client code logged something there)?
From the ZK code, it looks like ClientCnxn$SendThread.run should have caught
it (Throwable) and done the cleanup work, e.g. notifying the main thread so
that it can wake up from ClientCnxn.submitRequest.
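
To make the point concrete, here is a small self-contained sketch of the
pattern. It is not the ZK code: doIo, callerWakesUp and the latch are
hypothetical stand-ins, and the OOME is simulated by throwing the Error
directly. It only shows that a worker catching IOException alone never
releases the waiting caller, while one that handles Throwable and cleans up
in a finally block does.

```java
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class SendThreadCleanupDemo {
    // Simulates an I/O step that fails with an Error rather than an IOException.
    static void doIo() throws IOException {
        throw new OutOfMemoryError("simulated");
    }

    // Runs a worker and reports whether the blocked caller was released in time.
    static boolean callerWakesUp(boolean catchThrowable, long timeoutMs)
            throws InterruptedException {
        CountDownLatch requestDone = new CountDownLatch(1);
        Runnable narrow = () -> {
            try {
                doIo();
            } catch (IOException e) {
                requestDone.countDown(); // cleanup runs only for IOException
            }
        };
        Runnable broad = () -> {
            try {
                doIo();
            } catch (Throwable t) {
                // a real client would log t here
            } finally {
                requestDone.countDown(); // cleanup runs on every exit path
            }
        };
        Thread worker = new Thread(catchThrowable ? broad : narrow, "demo-SendThread");
        worker.setUncaughtExceptionHandler((th, ex) ->
                System.err.println(th.getName() + " died: " + ex));
        worker.start();
        worker.join();
        return requestDone.await(timeoutMs, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("IOException only: caller released = "
                + callerWakesUp(false, 200));
        System.out.println("Throwable + finally: caller released = "
                + callerWakesUp(true, 200));
    }
}
```

In the first case the caller is only saved by the timeout on await; a caller
that waits without a timeout would hang, which matches what Ted is seeing.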

Sending this to the ZooKeeper list for help.
Thanks.



On Thu, Aug 14, 2014 at 11:19 AM, Ted Tuttle <t...@mentacapital.com> wrote:

> Hi Lars-
>
> We are running ZK 3.3.4, Cloudera cdh3u3, HBase 0.94.16.
>
> Thanks,
> Ted
>
> > On Aug 13, 2014, at 5:36 PM, "lars hofhansl" <la...@apache.org> wrote:
> >
> > Hey Ted,
> >
> > so this is a problem with the ZK client, it seems to not clean itself up
> > correctly upon receiving an exception at the wrong moment.
> > Which version of ZK are you using?
> >
> >
> > -- Lars
> >
> >
> >
> > ----- Original Message -----
> > From: Ted Tuttle <t...@mentacapital.com>
> > To: "user@hbase.apache.org" <user@hbase.apache.org>
> > Cc: Development <developm...@mentacapital.com>
> > Sent: Wednesday, August 13, 2014 4:38 PM
> > Subject: HBase client hangs after client-side OOM
> >
> > Hello-
> >
> > We are running HBase v0.94.16 on an 8 node cluster.
> >
> > We have a recurring problem w/ HBase clients hanging.  In the latest
> > occurrence, I observed the following sequence of events:
> >
> > 0) client plays w/ HBase for a long time w/o issue
> > 1) client runs out of memory during HBase operation:
> >
> >                 http://pastebin.com/b5x44Lx7
> >
> > 2) Exception is thrown, memory is released
> > 3) In some shutdown logic the client tries to access HBase again and hangs:
> >
> >                 http://pastebin.com/xU4MSq9k
> >
> > Clearly I need to fix OOM.  However, the fact that client hangs is not
> > nice.  Any ideas why?
> >
> > BTW- I started by looking at zookeeper log. Not much there but here you go:
> >
> >                 http://pastebin.com/wZvE0Fbv
> >
> > Thanks,
> > Ted
> >
>
