Seems like a "bad" bug. Please file a JIRA detailing steps to reproduce
this. Someone should be able to take a quick look at it.

Thanks.
-Hanifi

On Tue, Jan 12, 2016 at 7:38 AM, Ian Maloney <[email protected]>
wrote:

> I've got two nodes currently running Drill, so I ran it on both, before
> and after restarting.
>
> Node mentioned previously:
>    Before: 396
>    After: 396
> Other node:
>    Before: 14
>    After: 395
>
> On Monday, January 11, 2016, Hanifi GUNES <[email protected]> wrote:
>
> > Can you run the following on a clean Drillbit (after restart) to report
> > the number of open files? Please run this before & after executing your
> > queries.
> >
> > 1: lsof -a -p DRILL_PID | wc -l
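> >
> > If the Drillbit PID isn't handy, assuming the JDK's jps is on the PATH,
> > something like this should find it and run the count in one go:
> >
> >     DRILL_PID=$(jps | grep Drillbit | awk '{print $1}')
> >     lsof -a -p $DRILL_PID | wc -l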
> >
> > 2016-01-11 14:13 GMT-08:00 Ian Maloney <[email protected]>:
> >
> > > I don't know a lot about checking the limits, but I'm using Hive (on
> > > HDFS). I'm able to perform a count on a Hive table, so I don't think
> > > the issue is there. The Hive client and drillbit sit on different nodes
> > > though. I did a ulimit -a and got:
> > >
> > > core file size          (blocks, -c) 0
> > > data seg size           (kbytes, -d) unlimited
> > > scheduling priority             (-e) 0
> > > file size               (blocks, -f) unlimited
> > > pending signals                 (-i) 31403
> > > max locked memory       (kbytes, -l) 64
> > > max memory size         (kbytes, -m) unlimited
> > > open files                      (-n) 1024
> > > pipe size            (512 bytes, -p) 8
> > > POSIX message queues     (bytes, -q) 819200
> > > real-time priority              (-r) 0
> > > stack size              (kbytes, -s) 10240
> > > cpu time               (seconds, -t) unlimited
> > > max user processes              (-u) 1024
> > > virtual memory          (kbytes, -v) unlimited
> > > file locks                      (-x) unlimited
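> > >
> > > If that 1024 open files limit turns out to be the problem, I assume I'd
> > > raise it for the user running the Drillbit with something like the
> > > lines below in /etc/security/limits.conf (the user name and values are
> > > just examples), then log back in and restart the bit:
> > >
> > >     drill  soft  nofile  32768
> > >     drill  hard  nofile  65536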
> > >
> > >
> > > If you have any steps you'd like me to take to help debug, I'd be
> > > happy to attempt them.
> > >
> > >
> > >
> > > On Monday, January 11, 2016, Hanifi GUNES <[email protected]> wrote:
> > >
> > > > Interesting. Can you check your procfs to confirm Drill opens an
> > > > excessive number of file descriptors after making sure that query is
> > > > complete through the Web UI? If so, we likely have some leak. Also,
> > > > can you confirm soft & hard limits for open file descriptors are not
> > > > set too low on your platform?
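> > > >
> > > > For the procfs check, something along these lines should do on Linux,
> > > > with DRILL_PID set to the Drillbit's process id (note that ulimit
> > > > reports the current shell's limits, so run it as the user that starts
> > > > the Drillbit):
> > > >
> > > >     ls /proc/$DRILL_PID/fd | wc -l   # descriptors open right now
> > > >     ulimit -Sn; ulimit -Hn           # soft & hard open-file limits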
> > > >
> > > > -Hanifi
> > > >
> > > > 2016-01-11 13:13 GMT-08:00 Ian Maloney <[email protected]>:
> > > >
> > > > > Hanifi,
> > > > >
> > > > > Thanks for the suggestions. I'm not using any concurrency at the
> > > > > moment. In the web UI under Profiles I see "No running queries", but
> > > > > the error is still happening, even after restarting the bit.
> > > > >
> > > > >
> > > > >
> > > > > On Monday, January 11, 2016, Hanifi GUNES <[email protected]> wrote:
> > > > >
> > > > > > - *Any ideas on how to keep the (suspected) resource leak from
> > > > > > happening?*
> > > > > >
> > > > > > The answer to this depends on your workload as well. You have
> > > > > > mentioned you are running a lot of queries, so Drill might
> > > > > > ordinarily open too many descriptors to serve the high demand. Of
> > > > > > course, this assumes that you do not limit the level of
> > > > > > concurrency. If so, why don't you try enabling queues to bound
> > > > > > concurrent execution and run your workload again? [1]
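> > > > > >
> > > > > > As a minimal sketch, queuing can be switched on from sqlline along
> > > > > > these lines (the option names follow the queuing docs in [1]; the
> > > > > > numbers are illustrative, not recommendations):
> > > > > >
> > > > > >     ALTER SYSTEM SET `exec.queue.enable` = true;
> > > > > >     -- at most 2 large and 10 small queries run concurrently
> > > > > >     ALTER SYSTEM SET `exec.queue.large` = 2;
> > > > > >     ALTER SYSTEM SET `exec.queue.small` = 10;
> > > > > >     -- queries whose estimated cost exceeds this threshold go to
> > > > > >     -- the large queue
> > > > > >     ALTER SYSTEM SET `exec.queue.threshold` = 30000000;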
> > > > > >
> > > > > > You can always open up the Web UI to see if your queries are
> > > > > > completed or hanging around. If you see too many queries pending
> > > > > > and stuck for a long time, it is a good indicator of the problem I
> > > > > > described above. [2]
> > > > > >
> > > > > >
> > > > > > -Hanifi
> > > > > >
> > > > > > 1: https://drill.apache.org/docs/enabling-query-queuing/
> > > > > > 2: https://drill.apache.org/docs/query-profiles/
> > > > > >
> > > > > > 2016-01-11 10:35 GMT-08:00 Ian Maloney <[email protected]>:
> > > > > >
> > > > > > > Hi Abdel,
> > > > > > >
> > > > > > > I just noticed I'm still on v. 1.2, so maybe upgrading will
> > > > > > > help. I could open a JIRA, if need be.
> > > > > > >
> > > > > > > As far as restarting and reproducing: I restarted, and after
> > > > > > > running my app with the JDBC logic for a while, I get an
> > > > > > > IOException: Error accessing /
> > > > > > > I paused the app and restarted the specific bit from the JDBC
> > > > > > > connection below, which gave me the "Too many open files"
> > > > > > > exception again.
> > > > > > >
> > > > > > > Now, restarting that bit won't fix the error.
> > > > > > >
> > > > > > >
> > > > > > > On Monday, January 11, 2016, Abdel Hakim Deneche <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi Ian,
> > > > > > > >
> > > > > > > > Can you open up a JIRA for this? Is it easy to reproduce?
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > > On Mon, Jan 11, 2016 at 8:59 AM, Ian Maloney <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > I've been running a lot of queries via JDBC/Drill. I have
> > > > > > > > > four drillbits, but I could not get the ZooKeeper JDBC URL
> > > > > > > > > to work, so I used:
> > > > > > > > > jdbc:drill:drillbit=a-bits-hostname
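> > > > > > > > >
> > > > > > > > > For reference, the ZooKeeper form of the URL I tried follows
> > > > > > > > > the documented shape below, where the hosts and cluster id
> > > > > > > > > are placeholders and the cluster id is supposed to match
> > > > > > > > > drill.exec.cluster-id in drill-override.conf:
> > > > > > > > >
> > > > > > > > >     jdbc:drill:zk=zk1:2181,zk2:2181,zk3:2181/drill/drillbits1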
> > > > > > > > >
> > > > > > > > > Now I get a SocketException for too many open files, even
> > > > > > > > > when accessing via the CLI. I imagine I could restart the
> > > > > > > > > bits, but for something intended for production, that
> > > > > > > > > doesn't seem like a viable solution. Any ideas on how to
> > > > > > > > > keep the (suspected) resource leak from happening?
> > > > > > > > >
> > > > > > > > > I'm closing the ResultSet, Statement, and Connection after
> > > > > > > > > each query.
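> > > > > > > > >
> > > > > > > > > Concretely, each query runs through a try-with-resources
> > > > > > > > > block along these lines (java.sql imports assumed; the URL
> > > > > > > > > and query below are placeholders, not my real ones):
> > > > > > > > >
> > > > > > > > >     try (Connection conn = DriverManager.getConnection(
> > > > > > > > >              "jdbc:drill:drillbit=a-bits-hostname");
> > > > > > > > >          Statement stmt = conn.createStatement();
> > > > > > > > >          ResultSet rs = stmt.executeQuery(
> > > > > > > > >              "SELECT 1 FROM (VALUES(1))")) {
> > > > > > > > >         while (rs.next()) {
> > > > > > > > >             // consume each row before the block exits
> > > > > > > > >         }
> > > > > > > > >     }   // rs, stmt, and conn all close here, even on error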
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Abdelhakim Deneche
> > > > > > > > Software Engineer
> > > > > > > > <http://www.mapr.com/>