I think you made valid points. It makes sense to have session-less REST calls in both the auth-enabled and auth-disabled cases.
When auth is enabled:

1) Session-less calls can be authenticated using Basic auth as a start (this was already asked for on the mailing list some time back), moving to token-based auth later. These requests usually come from non-browser clients. The only issue is setting options before the query. For this we can implement your suggestion of enhancing the query REST API to accept the options.

2) Session-based calls using form auth for browser-based access. If we enhance the UI to enter options in the query form, we don't actually need any session on the server.

I will get a fix in ASAP to remove the sessions for anonymous calls, as those sessions are not reused in the non-browser case.

Thanks
Venki

On Tue, Feb 2, 2016 at 9:20 PM, Josh Schlesser <j...@spoutable.com> wrote:

> No, it wasn’t logging out, it was just stopping; obviously that caused dangling sessions for the authenticated scenario.
>
> I don’t think that a short timeout for anonymous sessions is a good way to go for anonymous API calls. Session management isn’t what anybody would expect when using a REST API that is anonymous in a server-to-server context. I would expect to use a token for authorization for a server-to-server REST API as well. I’m not saying that is what it should be here, but that is what my general expectation is based on using other APIs. In the case of browser-to-server REST APIs, I have run into authentication for a browser session and subsequent REST calls leaning on a browser cookie for persistent authentication.
>
> Removing sessions for anonymous calls seems like the right path and possibly easy, and I think it would be the expected behavior for most developers.
> I would advocate for sessionless, token-authenticated REST APIs for the server-to-server case when using authentication, and cookie-based auth with a session for the browser-to-server scenario. But it's really the browser that has a session, not the API per se; it's just piggybacking on a regular authenticated web session for the REST API calls.
>
> This would actually leave me in a quandary for what I am trying to do, which is set a session configuration option 'store.format', but I can't think of any reason that those types of settings shouldn’t just be set on a per-request basis for a REST API. In a server-to-server context for a REST API, keeping it sessionless means you could front a cluster of drillbits with a load balancer and not worry about dying nodes, sticky sessions, etc.
>
> I have to get something up and running quickly right now, so I'm reverting to 1.4 and just spinning up a separate drillbit that will have the store.format system variable set to 'json'. It will be OK for me until a good long-term solution arrives in Drill.
>
> I’ll run the test with session_max_idle_secs set to 30 seconds on 1.5.0-SNAPSHOT to see if that gets rid of the file handle starvation problem, but keep in mind that this means users of the web console will have 30 seconds between pages or they have to authenticate again, which will probably be very annoying. It doesn't seem like a good long-term solution either.
>
> How do you think all of this should work? I look forward to staying involved.
>
> Cheers,
> Josh
>
>
> On Feb 2, 2016, at 4:40 PM, Venki Korukanti <venki.koruka...@gmail.com> wrote:
> >
> > When auth is *enabled*, is the worker process logging out after queries are done? When auth is *disabled* can you set session_max_idle_secs in the drill.exec.http block in drill-override.conf to something like 30 (secs) and try? This way anonymous sessions are closed quickly and not kept for 1hr (default value).
> > I think we may need to avoid creating sessions in anonymous mode (when auth is disabled).
> >
> > Thanks
> > Venki
> >
> > On Tue, Feb 2, 2016 at 4:02 PM, Josh Schlesser <j...@spoutable.com> wrote:
> >
> >> I have a background worker process (on a server, not a browser) that kicks off every minute or so and issues some queries sequentially to the REST query endpoint. In 1.4 with no authentication this worked fine, except that in one instance I need to issue a CTAS query with a different format (json).
> >>
> >> I upgraded to 1.5-SNAPSHOT commit bb3fc15216d9cab804fc9a6f0e5bd34597dd4394
> >>
> >> Since the upgrade I am getting a resource starvation problem with or without authentication.
> >> The drillbit process stays up for an hour or less and then becomes unresponsive and eats up the CPU.
> >>
> >> It is definitely a resource starvation issue; I'm not sure if it's a resource leak.
> >> Below is a stack trace.
> >> Also, when I run lsof on the pid there are a lot (more than a thousand) of files like this listed, which are used by NIO selectors, so it smells like a resource leak.
> >>
> >> COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
> >> java 2931 root 288u 0000 0,11 0 7705 anon_inode
> >>
> >> 2016-02-02 21:56:26,520 [qtp1250890858-11590] ERROR o.a.d.e.s.r.a.AnonymousLoginService - Login failed.
> >> java.lang.IllegalStateException: failed to create a child event loop
> >>     at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:68) ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
> >>     at io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:49) ~[netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >>     at io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:61) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> >>     at io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:49) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> >>     at org.apache.drill.exec.rpc.TransportCheck.createEventLoopGroup(TransportCheck.java:73) ~[drill-rpc-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >>     at org.apache.drill.exec.client.DrillClient.createEventLoop(DrillClient.java:239) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >>     at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:220) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >>     at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:178) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >>     at org.apache.drill.exec.server.rest.auth.AbstractDrillLoginService.createDrillClient(AbstractDrillLoginService.java:56) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >>     at org.apache.drill.exec.server.rest.auth.AnonymousLoginService.login(AnonymousLoginService.java:47) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >>     at org.apache.drill.exec.server.rest.auth.AnonymousAuthenticator.validateRequest(AnonymousAuthenticator.java:71) [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:503) [jetty-security-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1111) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:478) [jetty-servlet-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:183) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1045) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.server.Server.handle(Server.java:462) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:279) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:232) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:534) [jetty-io-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:607) [jetty-util-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:536) [jetty-util-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at java.lang.Thread.run(Thread.java:745) [na:1.7.0_91]
> >> Caused by: java.lang.RuntimeException: epoll_create1() failed: Too many open files
> >>     at io.netty.channel.epoll.Native.epollCreate(Native Method) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> >>     at io.netty.channel.epoll.EpollEventLoop.<init>(EpollEventLoop.java:74) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> >>     at io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:76) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> >>     at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:64) ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
> >>     ... 25 common frames omitted
> >> 2016-02-02 21:56:30,130 [qtp1250890858-11591] ERROR o.a.d.e.s.r.a.AnonymousLoginService - Login failed.
> >> java.lang.IllegalStateException: failed to create a child event loop
> >>     at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:68) ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
> >>     at io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:49) ~[netty-transport-4.0.27.Final.jar:4.0.27.Final]
> >>     at io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:61) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> >>     at io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:49) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> >>     at org.apache.drill.exec.rpc.TransportCheck.createEventLoopGroup(TransportCheck.java:73) ~[drill-rpc-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >>     at org.apache.drill.exec.client.DrillClient.createEventLoop(DrillClient.java:239) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >>     at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:220) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >>     at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:178) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >>     at org.apache.drill.exec.server.rest.auth.AbstractDrillLoginService.createDrillClient(AbstractDrillLoginService.java:56) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >>     at org.apache.drill.exec.server.rest.auth.AnonymousLoginService.login(AnonymousLoginService.java:47) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >>     at org.apache.drill.exec.server.rest.auth.AnonymousAuthenticator.validateRequest(AnonymousAuthenticator.java:71) [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:503) [jetty-security-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1111) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:478) [jetty-servlet-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:183) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1045) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.server.Server.handle(Server.java:462) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:279) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:232) [jetty-server-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:534) [jetty-io-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:607) [jetty-util-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:536) [jetty-util-9.1.5.v20140505.jar:9.1.5.v20140505]
> >>     at java.lang.Thread.run(Thread.java:745) [na:1.7.0_91]
> >> Caused by: java.lang.RuntimeException: epoll_create1() failed: Too many open files
> >>     at io.netty.channel.epoll.Native.epollCreate(Native Method) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> >>     at io.netty.channel.epoll.EpollEventLoop.<init>(EpollEventLoop.java:74) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> >>     at io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:76) ~[netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
> >>     at io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:64) ~[netty-common-4.0.27.Final.jar:4.0.27.Final]
> >>     ... 25 common frames omitted
> >>
> >>
> >>
> >>> On Feb 2, 2016, at 7:40 AM, Venki Korukanti <venki.koruka...@gmail.com> wrote:
> >>>
> >>> Currently we keep the DrillClient per session. All the state is in Server and DrillClient is the reference to reuse the state.
> >>> DrillClient is automatically closed when the session expires (default value is 1hr after the last activity on the session) or the user explicitly logs out. I am trying to understand if there is a resource leak. Do you have too many sessions open when the system load is at its max, or just a few sessions with many queries already run through them? If it is the former, that is understandable, as there is one connection per session for the life of the session. Also, are the resources not freed up after logout?
> >>>
> >>> If you need to have multiple simultaneous sessions, it is better to connect to different Drillbits (maybe in a round-robin fashion) than to always connect to a single Drillbit.
> >>>
> >>> Thanks
> >>> Venki
> >>>
> >>> On Mon, Feb 1, 2016 at 11:51 PM, Josh Schlesser <j...@spoutable.com <mailto:j...@spoutable.com>> wrote:
> >>>
> >>>> First: I'm a total newb at contributing to Apache projects, so please excuse any indiscretions. Feel free to give comments on style or whatever; I take feedback well. Thick skin too.
> >>>>
> >>>>
> >>>> I'll give some background next and then a proposal.
> >>>>
> >>>> Background:
> >>>> I recently changed over to using authentication in the 1.5 snapshot because I need to have a session via the REST API so that I can set the session storage options in an initial query for a subsequent CTAS query. Previously all REST calls seemed to be completely independent.
> >>>>
> >>>> Since the change I have started seeing ‘too many files open’ errors in my drillbit.log, and the drillbit java process becomes effectively hung waiting for open file descriptor slots. When running the top command, the machine is running at max load due to the drillbit process and the drillbit becomes effectively unresponsive; even the simple pages in the web console don’t respond.
> >>>> Investigating further, it seems that there might be a file kept open per session by the drillbit process for the life of the session. I used the lsof unix command on the drillbit process and found a lot of unix pipes. Looking at the code, it looks like these pipes could be for the communication between the web process and the RPC server, with one being allocated per session. I haven’t validated this; it's just a guess after scanning the code. I had 1.4 running without this requirement and without ever seeing the error. It seems that without authentication the number of open files is a non-issue for me, possibly due to sessions.
> >>>>
> >>>> I'm wondering if my guess about what is causing the ‘too many open files’ error is plausible? Does anybody with a deeper understanding of the architecture have any comments on this?
> >>>>
> >>>> Proposal:
> >>>> Assuming sessions are the issue, I am making some changes to my REST client so that sessions are more effectively used, and I can up the ulimit for the drillbit process for the Linux user in hopes of mitigating this. I am effectively creating a REST-client-based session pool that resets session variables to defaults when the session gets reused. However, it seems hacky.
> >>>>
> >>>> Below is an idea for getting per-request settings, which seems less hacky in the long term.
> >>>>
> >>>> Can I add a new array member to the query.json REST method in a backwards-compatible way to set session-level parameters in a single request?
> >>>> Currently a REST request via the API has a body like so:
> >>>> { "queryType": "SQL", "query" : "<drill query>"}
> >>>>
> >>>> I'd like to do the following:
> >>>>
> >>>> { "queryType": "SQL", "query" : "<drill query>", "sessionSettings": ["option_1_name":"option_1_value", "option_2_name":"option_2_value"]}
> >>>>
> >>>> or even
> >>>>
> >>>> { "queryType": "SQL", "query" : "<drill query>", "sessionSettings": ["SET `option_name` = value", "SET `option_name1` = value1", "SET `option_name2` = value2", "SET `option_name3` = value3"]}
> >>>>
> >>>> As far as I can tell, Drill is essentially stateless between queries right now except for session-level system parameters and authentication. There aren’t any in-memory temp tables or cursors or variables like PL/SQL or PSQL or other SQLs that would make it stateful.
> >>>>
> >>>> Given the stateless assumption, being able to set session-level params on a per-request basis would cover all of the cases that I might need. It looks relatively straightforward to add something to QueryWrapper to accept an optional query session settings section of the JSON packet and execute those 'SET' commands before the final query. This will work for me, as I can run without authentication in a 'secure' backend environment, which will remove sessions and hence file descriptors, assuming my assumptions about file descriptors and sessions are correct.
> >>>>
> >>>>
> >>>> My Java is rusty (circa 2003), but some casual googling implies that if this were added as a 3rd @FormParam to submitQuery in QueryResources, it would magically be null if it weren't present and could easily be ignored. If it's present, then an alternative constructor of QueryWrapper could be called with the extra param, and it would be easy to alter its run method to execute the SET commands.
> >>>> There would need to be some error handling, of course, if the SET commands were illegal or failed to run for some reason.
> >>>>
> >>>> If this seems reasonable, how do I go about contributing? I looked through the links in the docs to Apache Foundation incubator projects, but the links to Drill were broken :( http://drill.apache.org/team.html
> >>>> I read this http://drill.apache.org/docs/apache-drill-contribution-guidelines/ and I have subscribed to the dev mailing list (obvious since you are getting this). It said to post here before creating a JIRA. Am I missing anything in my assumptions? Comments? Should I just submit a JIRA and a patch, or submit a JIRA and a comment, or wait for comments before coding stuff up as an example?
> >>>>
> >>>> Thanks for taking the time to read and respond.
> >>>>
> >>>> Josh
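
For reference, the per-request settings idea from the original proposal can be sketched from the client side roughly as follows. This is only an illustration under stated assumptions: the `sessionSettings` field is the hypothetical addition proposed in this thread, not an existing part of the query API; it is modelled here as a JSON object of option names to values rather than the array shown in the proposal; and `build_query_body`, `format_value`, and `statements_to_run` are made-up helper names.

```python
import json


def format_value(value):
    """Render a Python value as a SQL literal for a SET statement."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    # Treat everything else as a string; double any embedded single quotes.
    return "'" + str(value).replace("'", "''") + "'"


def build_query_body(query, session_settings=None, query_type="SQL"):
    """Build the JSON body for one self-contained REST query request.

    "sessionSettings" is the hypothetical field proposed in the thread.
    It is omitted entirely when no settings are given, so the body stays
    identical to what existing clients already send.
    """
    body = {"queryType": query_type, "query": query}
    if session_settings:
        body["sessionSettings"] = dict(session_settings)
    return json.dumps(body)


def statements_to_run(body_json):
    """Expand a request body into the ordered statements a server could run:
    one SET per requested option, followed by the query itself."""
    body = json.loads(body_json)
    settings = body.get("sessionSettings") or {}
    statements = [
        f"SET `{name}` = {format_value(value)}"
        for name, value in settings.items()
    ]
    statements.append(body["query"])
    return statements


if __name__ == "__main__":
    request = build_query_body(
        "CREATE TABLE dfs.tmp.t AS SELECT 1",
        {"store.format": "json"},
    )
    for statement in statements_to_run(request):
        print(statement)
```

Because every request carries its own options, no server-side session is needed to remember them, which is what would allow a load balancer to send each call to any drillbit.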