I took a quick look, but didn't find a smoking gun. Can you get a jstack of the ThriftServer when it has the 8K threads active? I'd like to see what exactly is keeping the threads running.

-Joey

On Mon, Apr 30, 2012 at 9:29 PM, Dhaval Y Shah <[email protected]> wrote:
>
> Hi Joey.. Did you get a chance to look at the code? I dug through this a
> bit further but didn't get much.. I added logging to the TThreadPoolServer
> class and the thread pool there is pretty small.. I looked through the
> ThriftServer class and the Hbase.java file generated by thrift and there
> aren't many places where HBase configurations are created.. The ones where
> they are created look exactly the same as CDH3B4, which works correctly,
> which is confusing the hell out of me now
>
>
> Dhaval Y Shah wrote:
>>
>> Cool thanks Joey... What I also noticed is that the number of threads
>> spawned by the Thrift server also goes to > 16000 when the number of file
>> descriptors is going to > 32000.. Probably we are not re-using threads in
>> a correct way and we are missing the option to limit the number of threads
>> (which I guess is fixed for 0.94 as per what I read in a jira ticket)
>>
>>
>> Joey Echeverria-2 wrote:
>>>
>>> I don't have the code in front of me, but I bet we're not reusing the
>>> Configuration object when getting the HTable handle. I'll take a look at
>>> when the regression went in this evening.
>>>
>>> -Joey
>>>
>>> Sent from my iPhone
>>>
>>> On Apr 27, 2012, at 14:02, Dhaval Y Shah <[email protected]> wrote:
>>>
>>>>
>>>> We have an app written in Ruby which uses HBase as the backing store..
>>>> It uses Thrift to connect to it.. We were using HBase from Cloudera's
>>>> CDH3B4 distro until now and it worked fine.. I just upgraded our Hadoop
>>>> install to CDH3U3 (which is the latest stable CDH release at this point)
>>>> and in a matter of hours all Thrift servers went down..
>>>>
>>>> Upon further investigation I realized that it was hitting the limit on
>>>> the number of allowed file descriptors (which is pretty high at 32k)..
>>>> This problem occurs if I use thrift in any configuration (hsha, framed
>>>> transport, threadpool) except the nonblocking mode.. Digging further I
>>>> realized a couple of things:
>>>> 1. Even with light load (1-2 processes hitting the thrift server in
>>>> quick succession), thrift is spinning up new threads and each of the
>>>> threads is maintaining a socket connection to zookeeper.. In a matter of
>>>> minutes (with this load test), thrift has > 32k open connections with
>>>> > 8k threads having connection to zookeeper which do not seem to die
>>>> even after a day..
>>>> 2. The logs show approx 3-4 open connections (presumably for each
>>>> thread):
>>>> java 53588 hbase 4135r FIFO 0,6 177426 pipe
>>>> java 53588 hbase 4136w FIFO 0,6 177426 pipe
>>>> java 53588 hbase 4137r 0000 0,11 0 177427 eventpoll
>>>> java 53588 hbase 4138u IPv4 177428 TCP njhaddev05:49729->njhaddev01:2181 (ESTABLISHED)
>>>>
>>>> CDH3B4 with the exact same configurations and the exact same setup works
>>>> fine but CDH3U3 does not.. Using Thrift in nonblocking mode isn't really
>>>> an option because of the low throughput and single threaded nature..
>>>>
>>>> Any help is greatly appreciated..
>>>>
>>>> --
>>>> View this message in context:
>>>> http://old.nabble.com/HBase-Thrift-for-CDH3U3-leaking-file-descriptors-socket-connections-to-Zookeeper-tp33760461p33760461.html
>>>> Sent from the HBase User mailing list archive at Nabble.com.
>>>>
>>>
>>>
>>
>>
>
> --
> View this message in context:
> http://old.nabble.com/HBase-Thrift-for-CDH3U3-leaking-file-descriptors-socket-connections-to-Zookeeper-tp33760461p33763327.html
> Sent from the HBase User mailing list archive at Nabble.com.
>

--
Joey Echeverria
Senior Solutions Architect
Cloudera, Inc.
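
A note on the jstack request at the top of this thread: with thousands of threads the raw dump is hard to read, so it can help to group threads by name first. The following is only a rough sketch, not something taken from the thread. It assumes the usual jstack layout where each thread header line begins with the thread name in double quotes; the function names `group_key` and `summarize_threads` are made up for illustration.

```python
import re
from collections import Counter

# Assumption: each thread header in the dump starts with the quoted thread
# name, e.g.  "pool-1-thread-12" prio=10 tid=0x... nid=0x... runnable
THREAD_HEADER = re.compile(r'^"([^"]+)"')


def group_key(name):
    """Replace every run of digits with 'N' so numbered threads group together."""
    return re.sub(r"\d+", "N", name)


def summarize_threads(dump_text):
    """Count threads per normalized name in the text of a jstack dump."""
    counts = Counter()
    for line in dump_text.splitlines():
        match = THREAD_HEADER.match(line)
        if match:
            counts[group_key(match.group(1))] += 1
    return counts


def format_summary(counts, limit=20):
    """Return the most common thread groups as printable lines."""
    return [f"{n:6d}  {name}" for name, n in counts.most_common(limit)]
```

One way to use it would be to save the dump with `jstack <pid> > dump.txt` (the pid being that of the Thrift server JVM), read the file, and print `format_summary(summarize_threads(text))`. If the leak is in the ZooKeeper client connections, the largest groups would probably be the ZooKeeper client threads (typically named with `SendThread` and `EventThread`), but that is a guess until the dump is actually inspected.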
