We have an app written in Ruby which uses HBase as the backing store. It uses Thrift to connect to it. We were using HBase from Cloudera's CDH3B4 distro until now and it worked fine. I just upgraded our Hadoop install to CDH3U3 (which is the latest stable CDH release at this point) and in a matter of hours all Thrift servers went down.
Upon further investigation I realized that it was hitting the limit on the number of allowed file descriptors (which is pretty high at 32k). This problem occurs if I use Thrift in any configuration (hsha, framed transport, threadpool) except the nonblocking mode.

Digging further I realized a couple of things:

1. Even with light load (1-2 processes hitting the Thrift server in quick succession), Thrift is spinning up new threads, and each of the threads is maintaining a socket connection to ZooKeeper. In a matter of minutes (with this load test), Thrift has > 32k open connections, with > 8k threads holding a connection to ZooKeeper, and these do not seem to die even after a day.

2. The logs show roughly 3-4 open file descriptors (presumably for each thread):

    java 53588 hbase 4135r FIFO 0,6 177426 pipe
    java 53588 hbase 4136w FIFO 0,6 177426 pipe
    java 53588 hbase 4137r 0000 0,11 0 177427 eventpoll
    java 53588 hbase 4138u IPv4 177428 TCP njhaddev05:49729->njhaddev01:2181 (ESTABLISHED)

CDH3B4 with the exact same configuration and the exact same setup works fine, but CDH3U3 does not. Using Thrift in nonblocking mode isn't really an option because of its low throughput and single-threaded nature.

Any help is greatly appreciated.

-- 
View this message in context: http://old.nabble.com/HBase-Thrift-for-CDH3U3-leaking-file-descriptors-socket-connections-to-Zookeeper-tp33760461p33760461.html
Sent from the HBase User mailing list archive at Nabble.com.
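
For anyone trying to reproduce the descriptor counts above, here is a rough Python sketch that tallies a listing like the one in point 2 by descriptor type and counts the sockets whose remote end is the ZooKeeper port. It assumes the listing is lsof-style output with the descriptor type in the fifth column, which is a guess from the sample lines. The function name `summarize_lsof` and the `zk_port` parameter are made up for illustration, and the sample data is just the four lines quoted in the post.

```python
from collections import Counter


def summarize_lsof(lines, zk_port=2181):
    """Tally descriptor types and count sockets connected to ZooKeeper.

    Assumes whitespace-separated, lsof-style lines where the fifth field
    is the descriptor type (FIFO, IPv4, ...). This layout is an assumption
    based on the sample lines in the post.
    """
    types = Counter()
    zk_connections = 0
    for line in lines:
        fields = line.split()
        # Skip blank/short lines and a header row, if one is present.
        if len(fields) < 5 or fields[0] == "COMMAND":
            continue
        types[fields[4]] += 1
        # A TCP entry looks like local:port->remote:port; check the remote port.
        for field in fields:
            if "->" in field and field.split("->")[1].endswith(f":{zk_port}"):
                zk_connections += 1
                break
    return types, zk_connections


# The four descriptor lines quoted in the post.
sample = [
    "java 53588 hbase 4135r FIFO 0,6 177426 pipe",
    "java 53588 hbase 4136w FIFO 0,6 177426 pipe",
    "java 53588 hbase 4137r 0000 0,11 0 177427 eventpoll",
    "java 53588 hbase 4138u IPv4 177428 TCP njhaddev05:49729->njhaddev01:2181 (ESTABLISHED)",
]

types, zk_connections = summarize_lsof(sample)
print("descriptors by type:", dict(types))
print("ZooKeeper connections:", zk_connections)
```

Running this against a full listing for the Thrift server process at intervals during the load test should show whether the ZooKeeper socket count grows along with the thread count.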
