When auth is *enabled*, is the worker process logging out after its queries
are done? When auth is *disabled*, can you set session_max_idle_secs in the
drill.exec.http block of drill-override.conf to something like 30 (seconds)
and try again? That way anonymous sessions are closed quickly instead of
being kept for 1hr (the default value). I think we may need to avoid
creating sessions in anonymous mode (when auth is disabled).
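
A minimal sketch of that override, assuming the usual nested layout of
drill-override.conf. Only the drill.exec.http block and the
session_max_idle_secs key come from the suggestion above; 30 is just the
trial value, and any settings already in the file stay as they are:

```
# drill-override.conf (sketch; merge into the existing drill.exec block)
drill.exec: {
  http: {
    # Expire idle web/REST sessions after 30 seconds instead of the 1hr default.
    session_max_idle_secs: 30
  }
}
```

The Drillbit would most likely need a restart to pick up the change.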

Thanks
Venki

On Tue, Feb 2, 2016 at 4:02 PM, Josh Schlesser <j...@spoutable.com> wrote:

> I have a background worker process (on a server, not a browser) that kicks
> off every minute or so and issues some queries sequentially to the REST
> query endpoint. In 1.4 with no authentication this worked fine, except
> that in one instance I need to issue a CTAS query with a different format
> (json).
>
> I upgraded to 1.5-SNAPSHOT commit bb3fc15216d9cab804fc9a6f0e5bd34597dd4394
>
> Since the upgrade I am getting a resource starvation problem with or
> without authentication.
> The drillbit process stays up for an hour or less and then becomes
> unresponsive and eats up the CPU.
>
> It is definitely a resource starvation issue; I am not sure if it's a
> resource leak.
> Below is a stack trace.
> Also, when I run lsof on the pid there are a lot (more than a thousand) of
> files like this listed, which are used by NIO selectors, so it smells like
> a resource leak.
>
> COMMAND  PID USER   FD   TYPE             DEVICE  SIZE/OFF    NODE NAME
> java    2931 root  288u  0000               0,11         0    7705 anon_inode
>
> 2016-02-02 21:56:26,520 [qtp1250890858-11590] ERROR
> o.a.d.e.s.r.a.AnonymousLoginService - Login failed.
> java.lang.IllegalStateException: failed to create a child event loop
>         at
> io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:68)
> ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
>         at
> io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:49)
> ~[netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at
> io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:61)
> ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>         at
> io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:49)
> ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>         at
> org.apache.drill.exec.rpc.TransportCheck.createEventLoopGroup(TransportCheck.java:73)
> ~[drill-rpc-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>         at
> org.apache.drill.exec.client.DrillClient.createEventLoop(DrillClient.java:239)
> ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>         at
> org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:220)
> ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>         at
> org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:178)
> ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>         at
> org.apache.drill.exec.server.rest.auth.AbstractDrillLoginService.createDrillClient(AbstractDrillLoginService.java:56)
> ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>         at
> org.apache.drill.exec.server.rest.auth.AnonymousLoginService.login(AnonymousLoginService.java:47)
> ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>         at
> org.apache.drill.exec.server.rest.auth.AnonymousAuthenticator.validateRequest(AnonymousAuthenticator.java:71)
> [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>         at
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:503)
> [jetty-security-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221)
> [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1111)
> [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:478)
> [jetty-servlet-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:183)
> [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1045)
> [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
> [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at org.eclipse.jetty.server.Server.handle(Server.java:462)
> [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:279)
> [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:232)
> [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:534)
> [jetty-io-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:607)
> [jetty-util-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:536)
> [jetty-util-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_91]
> Caused by: java.lang.RuntimeException: epoll_create1() failed: Too many
> open files
>         at io.netty.channel.epoll.Native.epollCreate(Native Method)
> ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>         at
> io.netty.channel.epoll.EpollEventLoop.<init>(EpollEventLoop.java:74)
> ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>         at
> io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:76)
> ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>         at
> io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:64)
> ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
>         ... 25 common frames omitted
> 2016-02-02 21:56:30,130 [qtp1250890858-11591] ERROR
> o.a.d.e.s.r.a.AnonymousLoginService - Login failed.
> java.lang.IllegalStateException: failed to create a child event loop
>         at
> io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:68)
> ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
>         at
> io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:49)
> ~[netty-transport-4.0.27.Final.jar:4.0.27.Final]
>         at
> io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:61)
> ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>         at
> io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:49)
> ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>         at
> org.apache.drill.exec.rpc.TransportCheck.createEventLoopGroup(TransportCheck.java:73)
> ~[drill-rpc-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>         at
> org.apache.drill.exec.client.DrillClient.createEventLoop(DrillClient.java:239)
> ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>         at
> org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:220)
> ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>         at
> org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:178)
> ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>         at
> org.apache.drill.exec.server.rest.auth.AbstractDrillLoginService.createDrillClient(AbstractDrillLoginService.java:56)
> ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>         at
> org.apache.drill.exec.server.rest.auth.AnonymousLoginService.login(AnonymousLoginService.java:47)
> ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>         at
> org.apache.drill.exec.server.rest.auth.AnonymousAuthenticator.validateRequest(AnonymousAuthenticator.java:71)
> [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>         at
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:503)
> [jetty-security-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221)
> [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1111)
> [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:478)
> [jetty-servlet-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:183)
> [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1045)
> [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
> [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at org.eclipse.jetty.server.Server.handle(Server.java:462)
> [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:279)
> [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:232)
> [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:534)
> [jetty-io-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:607)
> [jetty-util-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at
> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:536)
> [jetty-util-9.1.5.v20140505.jar:9.1.5.v20140505]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_91]
> Caused by: java.lang.RuntimeException: epoll_create1() failed: Too many
> open files
>         at io.netty.channel.epoll.Native.epollCreate(Native Method)
> ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>         at
> io.netty.channel.epoll.EpollEventLoop.<init>(EpollEventLoop.java:74)
> ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>         at
> io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:76)
> ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>         at
> io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:64)
> ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
>         ... 25 common frames omitted
>
>
>
> > On Feb 2, 2016, at 7:40 AM, Venki Korukanti <venki.koruka...@gmail.com> wrote:
> >
> > Currently we keep one DrillClient per session. All the state is in the
> > server, and the DrillClient is the reference used to reuse that state. The
> > DrillClient is automatically closed when the session expires (default is
> > 1hr after the last activity on the session) or the user explicitly logs
> > out. I am trying to understand if there is a resource leak. Do you have
> > too many sessions open when the system load is at its max, or just a few
> > sessions on which you have already run many queries? If it is the former,
> > the resource usage is understandable, because each session holds its own
> > connection for the life of the session. Also, are the resources not
> > freed up after logout?
> >
> > If you need to have multiple simultaneous sessions, it is better to
> > connect to different Drillbits (maybe in a round-robin fashion) than to
> > always connect to a single Drillbit.
> >
> > Thanks
> > Venki
> >
> > On Mon, Feb 1, 2016 at 11:51 PM, Josh Schlesser <j...@spoutable.com> wrote:
> >
> >> First: I'm a total newb at contributing to Apache projects, so please
> >> excuse any indiscretions. Feel free to give comments on style or
> >> whatever; I take feedback well. Thick skin too.
> >>
> >>
> >> I'll give some background next and then a proposal.
> >>
> >> Background:
> >> I recently changed over to using authentication in the 1.5 snapshot
> >> because I need to have a session via the REST API so that I can set the
> >> session storage options in an initial query for a subsequent CTAS query.
> >> Previously all REST calls seemed to be completely independent.
> >>
> >> Since the change I have started seeing 'too many open files' errors in
> >> my drillbit.log, and the drillbit java process becomes effectively hung
> >> waiting for open file descriptor slots. When running the top command,
> >> the machine is running at max load due to the drillbit process, and the
> >> drillbit becomes effectively unresponsive; even the simple pages in the
> >> web console don't respond. Investigating further, it seems that there
> >> might be a file kept open per session by the drillbit process for the
> >> life of the session. I used the lsof unix command on the drillbit
> >> process and found a lot of unix pipes. Looking at the code, it looks
> >> like these pipes could be for the communication between the web process
> >> and the rpc server, with one being allocated per session. I haven't
> >> validated this; it's just a guess after scanning the code. I had 1.4
> >> running without this requirement and without ever seeing the error. It
> >> seems that without authentication the number of open files is a
> >> non-issue for me, so the problem is possibly due to sessions.
> >>
> >> I'm wondering if my guess about what is causing the 'too many open
> >> files' error is plausible. Does anybody with a deeper understanding of
> >> the architecture have any comments on this?
> >>
> >> Proposal:
> >> Assuming sessions are the issue, I am making some changes to my REST
> >> client so that sessions are used more effectively, and I can raise the
> >> ulimit for the drillbit process for the linux user in hopes of
> >> mitigating this. I am effectively creating a REST-client-based session
> >> pool that resets session variables to defaults when a session gets
> >> reused. However, it seems hacky.
> >>
> >> Below is an idea for getting per request based settings which seems less
> >> hacky in the long term.
> >>
> >> Can I add a new array member to the query.json REST method in a
> >> backwards-compatible way to set session-level parameters in a single
> >> request? Currently a REST request via the API has a body like so:
> >> { "queryType": "SQL", "query": "<drill query>" }
> >>
> >> I'd like to do the following:
> >>
> >> { "queryType": "SQL", "query": "<drill query>", "sessionSettings":
> >> ["option_1_name":"option_1_value", "option_2_name":"option_2_value"]}
> >>
> >> or even
> >>
> >> { "queryType": "SQL", "query": "<drill query>", "sessionSettings":
> >> ["SET `option_name` = value", "SET `option_name1` = value1",
> >> "SET `option_name2` = value2", "SET `option_name3` = value3"]}
> >>
> >> As far as I can tell, drill is essentially stateless between queries
> >> right now except for session-level system parameters and
> >> authentication. There aren't any in-memory temp tables or cursors or
> >> variables like PL/SQL or PSQL or other SQLs that would make it stateful.
> >>
> >> Given the stateless assumption, being able to set session-level params
> >> on a per-request basis would cover all of the cases that I might need.
> >> It looks relatively straightforward to add something to QueryWrapper to
> >> accept an optional query session settings section of the json packet
> >> and execute those 'SET' commands before the final query. This will work
> >> for me, as I can run without authentication in a 'secure' backend
> >> environment, which will remove sessions and hence file descriptors,
> >> assuming my assumptions about file descriptors and sessions are correct.
> >>
> >>
> >> My java is rusty (circa 2003), but some casual googling implies that if
> >> this were added as a 3rd @FormParam to submitQuery in QueryResources it
> >> would magically be null if it weren't present and could easily be
> >> ignored. If it's present, then an alternative constructor of
> >> QueryWrapper could be called with the extra param, and it would be easy
> >> to alter its run method to execute the SET commands. There would need
> >> to be some error handling, of course, if the SET commands were illegal
> >> or failed to run for some reason.
> >>
> >> If this seems reasonable, how do I go about contributing? I looked
> >> through the links in the docs to apache foundation incubator projects,
> >> but the links to drill were broken :( http://drill.apache.org/team.html
> >> I read this:
> >> http://drill.apache.org/docs/apache-drill-contribution-guidelines/
> >> and I have subscribed to the dev mailing list (obvious since you are
> >> getting this). It said to post here before creating a JIRA. Am I
> >> missing anything in my assumptions? Comments? Should I just submit a
> >> JIRA and a patch, or submit a JIRA and a comment, or wait for comments
> >> before coding stuff up as an example?
> >>
> >> Thanks for taking the time to read and respond.
> >>
> >> Josh
>
>
