When auth is *enabled*, does the worker process log out after its queries are done? When auth is *disabled*, can you set session_max_idle_secs in the drill.exec.http block of drill-override.conf to something like 30 (seconds) and try again? That way anonymous sessions are closed quickly instead of being kept for 1 hour (the default value). I think we may need to avoid creating sessions in anonymous mode (when auth is disabled).
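
A minimal sketch of that change in drill-override.conf, assuming the file uses the usual nested drill.exec block; the value 30 is only the suggested test value, and any other settings already in the file stay as they are:

```
drill.exec: {
  http: {
    # Idle timeout for web sessions, in seconds (suggested test value).
    session_max_idle_secs: 30
  }
}
```

The drillbit needs a restart to pick up changes to drill-override.conf.
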
Thanks
Venki

On Tue, Feb 2, 2016 at 4:02 PM, Josh Schlesser <j...@spoutable.com> wrote:

> I have a background worker process (on a server, not a browser) that kicks
> off every minute or so and issues some queries sequentially to the REST
> query endpoint. In 1.4 with no authentication this worked fine, except
> that in one instance I need to issue a CTAS query with a different format
> (json).
>
> I upgraded to 1.5-SNAPSHOT, commit bb3fc15216d9cab804fc9a6f0e5bd34597dd4394.
>
> Since the upgrade I am getting a resource starvation problem, with or
> without authentication. The drillbit process stays up for an hour or less
> and then becomes unresponsive and eats up the CPU.
>
> It is definitely a resource starvation issue; I'm not sure whether it's a
> resource leak. Below is a stack trace.
> Also, when I run lsof on the pid there are a lot (more than a thousand) of
> files like this listed, which are used by NIO selectors, so it smells like
> a resource leak.
>
> COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
> java    2931 root  288u  0000   0,11        0 7705 anon_inode
>
> 2016-02-02 21:56:26,520 [qtp1250890858-11590] ERROR o.a.d.e.s.r.a.AnonymousLoginService - Login failed.
> java.lang.IllegalStateException: failed to create a child event loop
>     at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:68) ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:49) ~[netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:61) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>     at io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:49) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>     at org.apache.drill.exec.rpc.TransportCheck.createEventLoopGroup(TransportCheck.java:73) ~[drill-rpc-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>     at org.apache.drill.exec.client.DrillClient.createEventLoop(DrillClient.java:239) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>     at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:220) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>     at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:178) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>     at org.apache.drill.exec.server.rest.auth.AbstractDrillLoginService.createDrillClient(AbstractDrillLoginService.java:56) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>     at org.apache.drill.exec.server.rest.auth.AnonymousLoginService.login(AnonymousLoginService.java:47) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>     at org.apache.drill.exec.server.rest.auth.AnonymousAuthenticator.validateRequest(AnonymousAuthenticator.java:71) [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:503) [jetty-security-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1111) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:478) [jetty-servlet-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:183) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1045) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.server.Server.handle(Server.java:462) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:279) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:232) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:534) [jetty-io-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:607) [jetty-util-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:536) [jetty-util-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at java.lang.Thread.run(Thread.java:745) [na:1.7.0_91]
> Caused by: java.lang.RuntimeException: epoll_create1() failed: Too many open files
>     at io.netty.channel.epoll.Native.epollCreate(Native Method) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>     at io.netty.channel.epoll.EpollEventLoop.<init>(EpollEventLoop.java:74) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>     at io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:76) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>     at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:64) ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
>     ... 25 common frames omitted
> 2016-02-02 21:56:30,130 [qtp1250890858-11591] ERROR o.a.d.e.s.r.a.AnonymousLoginService - Login failed.
> java.lang.IllegalStateException: failed to create a child event loop
>     at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:68) ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:49) ~[netty-transport-4.0.27.Final.jar:4.0.27.Final]
>     at io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:61) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>     at io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:49) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>     at org.apache.drill.exec.rpc.TransportCheck.createEventLoopGroup(TransportCheck.java:73) ~[drill-rpc-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>     at org.apache.drill.exec.client.DrillClient.createEventLoop(DrillClient.java:239) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>     at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:220) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>     at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:178) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>     at org.apache.drill.exec.server.rest.auth.AbstractDrillLoginService.createDrillClient(AbstractDrillLoginService.java:56) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>     at org.apache.drill.exec.server.rest.auth.AnonymousLoginService.login(AnonymousLoginService.java:47) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>     at org.apache.drill.exec.server.rest.auth.AnonymousAuthenticator.validateRequest(AnonymousAuthenticator.java:71) [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:503) [jetty-security-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1111) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:478) [jetty-servlet-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:183) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1045) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.server.Server.handle(Server.java:462) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:279) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:232) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:534) [jetty-io-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:607) [jetty-util-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:536) [jetty-util-9.1.5.v20140505.jar:9.1.5.v20140505]
>     at java.lang.Thread.run(Thread.java:745) [na:1.7.0_91]
> Caused by: java.lang.RuntimeException: epoll_create1() failed: Too many open files
>     at io.netty.channel.epoll.Native.epollCreate(Native Method) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>     at io.netty.channel.epoll.EpollEventLoop.<init>(EpollEventLoop.java:74) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>     at io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:76) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
>     at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:64) ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
>     ... 25 common frames omitted
>
> On Feb 2, 2016, at 7:40 AM, Venki Korukanti <venki.koruka...@gmail.com> wrote:
>
> > Currently we keep one DrillClient per session. All the state is in the
> > server, and the DrillClient is the reference used to reuse that state.
> > The DrillClient is automatically closed when the session expires (the
> > default is 1hr after the last activity on the session) or when the user
> > explicitly logs out. I am trying to understand if there is a resource
> > leak. Do you have too many sessions open when the system load is at its
> > max, or just a few sessions on which you have already run many queries?
> > If it is the former, that is understandable, since there is one
> > connection per session for the life of the session. Also, are the
> > resources not freed up after logout?
> >
> > If you need to have multiple simultaneous sessions, it is better to
> > connect to different Drillbits (maybe in a round-robin fashion) than to
> > always connect to a single Drillbit.
> >
> > Thanks
> > Venki
> >
> > On Mon, Feb 1, 2016 at 11:51 PM, Josh Schlesser <j...@spoutable.com> wrote:
> >
> >> First: I'm a total newb at contributing to Apache projects, so please
> >> excuse any indiscretions. Feel free to give comments on style or
> >> whatever; I take feedback well. Thick skin too.
> >>
> >> I'll give some background next and then a proposal.
> >>
> >> Background:
> >> I recently changed over to using authentication in the 1.5 snapshot
> >> because I need to have a session via the REST API so that I can set the
> >> session storage options in an initial query for a subsequent CTAS query.
> >> Previously all REST calls seemed to be completely independent.
> >>
> >> Since the change I have started seeing 'too many files open' errors in
> >> my drillbit.log, and the drillbit java process becomes effectively hung
> >> waiting for open file descriptor slots. According to the top command,
> >> the machine is running at max load due to the drillbit process, and the
> >> drillbit becomes effectively unresponsive; even the simple pages in the
> >> web console don't respond. Investigating further, it seems that there
> >> might be a file kept open per session by the drillbit process for the
> >> life of the session. I used the lsof unix command on the drillbit
> >> process and found a lot of unix pipes. Looking at the code, it looks
> >> like these pipes could be for the communication between the web process
> >> and the RPC server, with one being allocated per session. I haven't
> >> validated this; it's just a guess after scanning the code. I had 1.4
> >> running without this requirement and without ever seeing the error. It
> >> seems that without authentication the number of open files is a
> >> non-issue for me, possibly because no sessions are kept.
> >>
> >> I'm wondering if my guess about what is causing the 'too many open
> >> files' error is plausible?
> >> Does anybody with a deeper understanding of the architecture have any
> >> comments on this?
> >>
> >> Proposal:
> >> Assuming sessions are the issue, I am making some changes to my REST
> >> client so that sessions are used more effectively, and I can raise the
> >> ulimit for the Linux user that runs the drillbit process in hopes of
> >> mitigating this. I am effectively creating a session pool in the REST
> >> client that resets session variables to their defaults when a session
> >> gets reused. However, it seems hacky.
> >>
> >> Below is an idea for per-request settings, which seems less hacky in
> >> the long term.
> >>
> >> Can I add a new array member to the query.json REST method in a
> >> backwards-compatible way to set session-level parameters in a single
> >> request? Currently a REST request via the API has a body like so:
> >>
> >> { "queryType": "SQL", "query": "<drill query>" }
> >>
> >> I'd like to do the following:
> >>
> >> { "queryType": "SQL", "query": "<drill query>", "sessionSettings":
> >> [{"option_1_name": "option_1_value"}, {"option_2_name": "option_2_value"}] }
> >>
> >> or even
> >>
> >> { "queryType": "SQL", "query": "<drill query>", "sessionSettings":
> >> ["SET `option_name` = value", "SET `option_name1` = value1",
> >> "SET `option_name2` = value2", "SET `option_name3` = value3"] }
> >>
> >> As far as I can tell, Drill is essentially stateless between queries
> >> right now, except for session-level system parameters and
> >> authentication. There aren't any in-memory temp tables, cursors, or
> >> variables like those in PL/SQL, PSQL, or other SQL dialects that would
> >> make it stateful.
> >>
> >> Given the stateless assumption, being able to set session-level params
> >> on a per-request basis would cover all of the cases that I might need.
> >> It looks relatively straightforward to add something to QueryWrapper to
> >> accept an optional session settings section of the JSON packet and
> >> execute those 'SET' commands before the final query. This will work for
> >> me, as I can run without authentication in a 'secure' backend
> >> environment, which will remove sessions and hence file descriptors,
> >> assuming my assumptions about file descriptors and sessions are correct.
> >>
> >> My Java is rusty (circa 2003), but some casual googling implies that if
> >> this were added as a 3rd @FormParam to submitQuery in QueryResources, it
> >> would magically be null if it weren't present and could easily be
> >> ignored. If it's present, then an alternative constructor of
> >> QueryWrapper could be called with the extra param, and it would be easy
> >> to alter its run method to execute the SET commands. There would need to
> >> be some error handling, of course, in case the SET commands were illegal
> >> or failed to run for some reason.
> >>
> >> If this seems reasonable, how do I go about contributing? I looked
> >> through the links in the docs to Apache Foundation incubator projects,
> >> but the links to Drill were broken :(
> >> http://drill.apache.org/team.html
> >> I read this:
> >> http://drill.apache.org/docs/apache-drill-contribution-guidelines/
> >> and I have subscribed to the dev mailing list (obvious, since you are
> >> getting this). It said to post here before creating a JIRA. Am I missing
> >> anything in my assumptions? Comments? Should I just submit a JIRA and a
> >> patch, submit a JIRA and a comment, or wait for comments before coding
> >> stuff up as an example?
> >>
> >> Thanks for taking the time to read and respond.
> >>
> >> Josh
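
The per-request settings idea in the proposal above can be sketched roughly as follows. This is only an illustration in Python, not Drill code: the `sessionSettings` field is the hypothetical addition from the proposal, the function names are made up for the example, and `store.format` is used merely as a sample option. It shows how a request body could be expanded into SET statements followed by the final query, while requests that omit the field behave exactly as before.

```python
import json


def format_value(value):
    """Render a JSON value as a literal for a SET statement."""
    # bool must be tested before int/float because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        # Double any single quotes so the string literal stays well-formed.
        return "'" + value.replace("'", "''") + "'"
    raise ValueError(f"unsupported option value: {value!r}")


def statements_for_request(body):
    """Return the statements to run, in order, for one request body."""
    request = json.loads(body)
    # "sessionSettings" is the hypothetical optional field from the proposal.
    # Existing clients never send it, so their requests are unaffected.
    settings = request.get("sessionSettings") or []
    if isinstance(settings, dict):
        # Also accept a single object mapping option names to values.
        settings = [settings]
    statements = []
    for entry in settings:
        for name, value in entry.items():
            if "`" in name:
                raise ValueError(f"invalid option name: {name!r}")
            statements.append(f"SET `{name}` = {format_value(value)}")
    # The actual query always runs last, after every setting is applied.
    statements.append(request["query"])
    return statements


if __name__ == "__main__":
    body = json.dumps({
        "queryType": "SQL",
        "query": "SELECT 1",
        "sessionSettings": [{"store.format": "json"}],
    })
    for statement in statements_for_request(body):
        print(statement)
```

Expanding the settings on the server side like this would keep each request self-contained, which is the point of the proposal: no session has to outlive the request just to carry option values. A failing SET statement would still need its own error handling, as noted above.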