Seems like a "bad" bug. Please file a JIRA detailing steps to reproduce this. Someone should be able to take a quick look at it.
Thanks.
-Hanifi

On Tue, Jan 12, 2016 at 7:38 AM, Ian Maloney <[email protected]> wrote:

> I've got two nodes currently running Drill, so I ran it on both, before
> and after restarting.
>
> Node mentioned previously:
> Before: 396
> After: 396
>
> Other node:
> Before: 14
> After: 395
>
> On Monday, January 11, 2016, Hanifi GUNES <[email protected]> wrote:
>
> > Can you run the following on a clean Drillbit (after restart) to report
> > the number of open files? Please run this before & after executing your
> > queries.
> >
> > 1: lsof -a -p DRILL_PID | wc -l
> >
> > 2016-01-11 14:13 GMT-08:00 Ian Maloney <[email protected]>:
> >
> > > I don't know a lot about checking the limits, but I'm using Hive (on
> > > HDFS). I'm able to perform a count on a Hive table, so I don't think
> > > the issue is there. The Hive client and drillbit sit on different
> > > nodes, though. I ran ulimit -a and got:
> > >
> > > core file size          (blocks, -c) 0
> > > data seg size           (kbytes, -d) unlimited
> > > scheduling priority             (-e) 0
> > > file size               (blocks, -f) unlimited
> > > pending signals                 (-i) 31403
> > > max locked memory       (kbytes, -l) 64
> > > max memory size         (kbytes, -m) unlimited
> > > open files                      (-n) 1024
> > > pipe size            (512 bytes, -p) 8
> > > POSIX message queues     (bytes, -q) 819200
> > > real-time priority              (-r) 0
> > > stack size              (kbytes, -s) 10240
> > > cpu time               (seconds, -t) unlimited
> > > max user processes              (-u) 1024
> > > virtual memory          (kbytes, -v) unlimited
> > > file locks                      (-x) unlimited
> > >
> > > If you have any steps you'd like me to take to help debug, I'd be
> > > happy to attempt them.
> > >
> > > On Monday, January 11, 2016, Hanifi GUNES <[email protected]> wrote:
> > >
> > > > Interesting. Can you check your procfs to confirm Drill opens an
> > > > excessive number of file descriptors after making sure that the
> > > > query is complete through the Web UI? If so, we likely have some
> > > > leak. Also, can you confirm the soft & hard limits for open file
> > > > descriptors are not set too low on your platform?
> > > >
> > > > -Hanifi
> > > >
> > > > 2016-01-11 13:13 GMT-08:00 Ian Maloney <[email protected]>:
> > > >
> > > > > Hanifi,
> > > > >
> > > > > Thanks for the suggestions. I'm not using any concurrency at the
> > > > > moment. In the web UI under profiles I see "No running queries",
> > > > > but the error is still happening, even after restarting the bit.
> > > > >
> > > > > On Monday, January 11, 2016, Hanifi GUNES <[email protected]> wrote:
> > > > >
> > > > > > - *Any ideas on how to keep the (suspected) resource leak from
> > > > > >   happening?*
> > > > > >
> > > > > > The answer to this depends on your workload as well. You have
> > > > > > mentioned you are running a lot of queries, so Drill might
> > > > > > ordinarily open too many descriptors to serve the high demand.
> > > > > > Of course, this assumes that you do not limit the level of
> > > > > > concurrency. If so, why don't you try enabling queues to bound
> > > > > > concurrent execution and run your workload again? [1]
> > > > > >
> > > > > > You can always open up the web UI to see if your queries are
> > > > > > completed or hanging around. If you see too many queries
> > > > > > pending and stuck for a long time, it is a good indicator of
> > > > > > the problem I described above. [2]
> > > > > >
> > > > > > -Hanifi
> > > > > >
> > > > > > 1: https://drill.apache.org/docs/enabling-query-queuing/
> > > > > > 2: https://drill.apache.org/docs/query-profiles/
> > > > > >
> > > > > > 2016-01-11 10:35 GMT-08:00 Ian Maloney <[email protected]>:
> > > > > >
> > > > > > > Hi Abdel,
> > > > > > >
> > > > > > > I just noticed I'm still on v. 1.2, so maybe upgrading will
> > > > > > > help. I could open a JIRA, if need be.
> > > > > > >
> > > > > > > As far as restarting and reproducing: I restarted, and after
> > > > > > > running my app with the JDBC logic for a while, I get an
> > > > > > > IOException: Error accessing /
> > > > > > > I paused the app and restarted the specific bit from the JDBC
> > > > > > > connection below, which gave me the "Too many open files"
> > > > > > > exception again.
> > > > > > >
> > > > > > > Now, restarting that bit won't fix the error.
> > > > > > >
> > > > > > > On Monday, January 11, 2016, Abdel Hakim Deneche
> > > > > > > <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi Ian,
> > > > > > > >
> > > > > > > > Can you open up a JIRA for this? Is it easy to reproduce?
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > > On Mon, Jan 11, 2016 at 8:59 AM, Ian Maloney
> > > > > > > > <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > I've been running a lot of queries via JDBC/Drill. I have
> > > > > > > > > four drillbits, but I could not get the zk JDBC URL to
> > > > > > > > > work, so I used:
> > > > > > > > > jdbc:drill:drillbit=a-bits-hostname
> > > > > > > > >
> > > > > > > > > Now I get a SocketException for too many open files, even
> > > > > > > > > when accessing via the CLI. I imagine I could restart the
> > > > > > > > > bits, but for something intended for production, that
> > > > > > > > > doesn't seem like a viable solution. Any ideas on how to
> > > > > > > > > keep the (suspected) resource leak from happening?
> > > > > > > > >
> > > > > > > > > I'm closing ResultSet, Statement, and Connection after
> > > > > > > > > each query.
> > > > > > > >
> > > > > > > > --
> > > > > > > > Abdelhakim Deneche
> > > > > > > > Software Engineer
> > > > > > > > <http://www.mapr.com/>
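For reference, here is a minimal sketch of the client-side pattern discussed
in this thread: connecting through the ZooKeeper quorum (so the driver can
pick any live drillbit instead of pinning every connection to the single
node named in jdbc:drill:drillbit=...) and closing ResultSet, Statement,
and Connection with try-with-resources. The ZooKeeper hosts and cluster id
below are placeholders, not values from this thread, and the sketch only
shows correct client hygiene; it would not fix a server-side descriptor
leak, which is what the lsof counts above point at and what the JIRA
should capture.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class DrillQuerySketch {
        // Placeholder ZooKeeper quorum and cluster id -- substitute your own.
        // The zk= form lets the driver choose any registered drillbit.
        private static final String URL =
            "jdbc:drill:zk=zk1:2181,zk2:2181,zk3:2181/drill/drillbits1";

        public static void main(String[] args) throws Exception {
            // try-with-resources closes ResultSet, Statement, and Connection
            // in reverse order even if the query throws, so client-side
            // descriptors do not accumulate across many queries.
            try (Connection conn = DriverManager.getConnection(URL);
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT * FROM sys.version")) {
                // Optionally bound concurrent execution per the query-queuing
                // doc linked above, e.g.:
                // stmt.execute("ALTER SYSTEM SET `exec.queue.enable` = true");
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        }
    }

This assumes the Drill JDBC driver jar is on the classpath; the query
against sys.version is just a stand-in for the real workload.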
