Re: Drill Session ID between Nodes
I stand corrected on one point (thanks, Sorabh!): the Drill web server does have a session timeout, configurable in the boot options, that defaults to one hour.

- Paul

> On Jun 23, 2017, at 2:10 PM, Paul Rogers wrote:
>
> Hi John,
>
> Your use case is interesting. I’m certainly not an expert in the network
> aspects of what you are trying to do, but I can take a shot at the related
> Drill issues.
>
> Drill’s primary use case is connecting via the Drill client (typically via
> JDBC or ODBC.) The Drill client handles security. It also allows SQL
> sessions, and hence session options.
>
> Your use case is based on the REST API. At present, the REST API is best
> described as a “prototype.” REST supports username/password login, and
> sessions associated with the login (on a single Drillbit). Sessions never
> time out (as far as I can tell.) More importantly, the REST API returns all
> query results in a single message, encoded as JSON. This is great for small
> queries, but does not scale well when returning millions of large rows.
> (Hint: we are looking for contributions to improve the REST API!)
>
> As Keys pointed out, the important question is this: does your app need
> session state other than security? If so, then you need to consider overall
> SQL session state, not just SSL connections. If your script does “ALTER
> SESSION” followed by a query, then the ALTER SESSION might be sent to node A,
> with the query going to node B. Node B does not know about the session on A,
> and so results will be different than what you expect. The same is true with
> temp tables.
>
> Said another way, you’d like your scripts to do round-robin per *request*,
> but Drill is designed to do round-robin *per session.* (The Drill client,
> when using ZK, does random selection of nodes that achieves roughly the same
> result.) In short, your use case is clear, but is not supported today in
> Drill.
>
> Putting this together:
>
> 1. Sessions must be sticky to a single Drillbit so that session state, temp
> tables and so on are persisted (on that one Drillbit.)
> 2. If a session on one Drillbit drops, the app must establish a new session
> on another Drillbit. That involves not just security tokens and cookies, but
> also resetting session options, rebuilding temp tables, etc.
> 3. Since the app has to handle session recreation when switching Drillbits,
> the security issue, while a nuisance, is a necessary result of switching
> sessions.
> 4. (As Keys points out,) changing sessions is a rare event (due to timeouts,
> node failures, etc.) so session recovery should be rare.
>
> The only way to make sessions “portable” is to create a global session
> shared across Drillbits, which is what you are proposing. Doing so is
> non-trivial: it requires a global session registry (or a way of synchronizing
> session state). Such sharing is not supported in Drill’s distributed,
> shared-nothing architecture. Could we add it? Probably, but not in the short
> term. If we ever find the need for a “metastore” (or central work scheduler),
> then at that time Drill would have a mechanism to support session
> portability; but that is a ways off.
>
> For the short term, can you perhaps rethink the use case given that sessions
> are local? How will your app handle failover? Is the security issue as much
> of a problem when seen as part of session recreation? (I’m not an expert
> here; I’m asking how this might work: are there things, short of persistent
> sessions, we can do to help?)
>
> You mentioned Drill-on-YARN (DoY). DoY is an interesting question. On the
> surface, REST works the same on DoY as in “regular” Drill: the REST endpoint
> doesn’t care how the Drillbit was launched. Whatever works with regular Drill
> will work with DoY. Under DoY, JDBC/ODBC clients work as usual: they maintain a
> session with one Drillbit, and use ZK to find a fall-back Drillbit if the
> first one fails (with the need for the client to re-establish the SQL session
> state by resending session options, etc.) Can we improve this? Yes, if we did
> the work described earlier.
>
> (BTW: I’m still looking for volunteers to help with code reviews so we can
> contribute DoY to Apache Drill…)
>
> We have not yet looked into the security setup for DoY. (We wanted to get the
> security fully working with Drill itself first.) You raise some good issues
> that we must wrestle with as we enhance DoY to use the security features that
> are becoming available in Drill itself.
>
> Thanks,
>
> - Paul
>
>
>> On Jun 23, 2017, at 9:50 AM, John Omernik wrote:
>>
>> That makes sense, ya, I would love to hear about the challenges of this in
>> general from the Drill folks.
>>
>> Also, I wonder if Paul R at MapR has any thoughts on how something like
>> this would be handled in the Drill-on-YARN setup.
>>
>>
>> John
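Editor's illustration of the four points in Paul's list: a minimal Python sketch of an app-side wrapper that keeps a session sticky to one Drillbit and, on the rare failure, establishes a new session on another Drillbit and replays the session options. Everything named here is an assumption made for illustration and is not part of Drill: the `StickySession` class, the `DrillbitUnavailable` error, and the `connect(host)` callable, which is assumed to log in and return an object with a `run(sql)` method. Rebuilding temp tables is left out.

```python
import random


class DrillbitUnavailable(Exception):
    """Hypothetical error a transport raises when its Drillbit is gone."""


class StickySession:
    """Pins every request to one Drillbit; rebuilds session state on failover."""

    def __init__(self, drillbits, connect, session_options=None, rng=None):
        self.drillbits = list(drillbits)
        self.connect = connect  # hypothetical: host -> logged-in transport
        # Values are SQL literals, e.g. {"store.format": "'json'"}.
        self.session_options = dict(session_options or {})
        self.rng = rng or random.Random()
        self.host = None
        self.transport = None

    def _establish(self, exclude=None):
        # Try the other Drillbits in random order.
        candidates = [h for h in self.drillbits if h != exclude]
        self.rng.shuffle(candidates)
        for host in candidates:
            try:
                transport = self.connect(host)
                # Session options live only on one Drillbit, so replay them.
                for name, value in self.session_options.items():
                    transport.run(f"ALTER SESSION SET `{name}` = {value}")
            except DrillbitUnavailable:
                continue
            self.host, self.transport = host, transport
            return
        raise DrillbitUnavailable("no Drillbit accepted a new session")

    def run(self, sql):
        if self.transport is None:
            self._establish()
        try:
            return self.transport.run(sql)
        except DrillbitUnavailable:
            # Rare path: the pinned Drillbit dropped. Recreate the session
            # on another node, then retry the statement once.
            self._establish(exclude=self.host)
            return self.transport.run(sql)
```

The design choice is that round-robin happens once per session, when a Drillbit is picked, and never per request, so an ALTER SESSION and the query that follows it always reach the same node.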
Re: Drill Session ID between Nodes
is also important for Drill-on-YARN setups (unless there is some sort of application container proxy back to the bits). If you want to have security (SSL) with hostnames, the session maintenance must be addressed.

So that's why I toss it out here... this is a desirable feature, I would imagine. Even if people are not asking for it now, that may not be because they don't need it; in their testing of Drill, and in how they are using it now, it may simply not have come up. When they have multiple people and services hitting Drill endpoints, pointing them at individual nodes for SSL management etc. becomes a nightmare. Thus, as a thought exercise: could we securely maintain valid session IDs in ZooKeeper for nodes to check against? What would an ideal setup for something like that be?

On Fri, Jun 23, 2017 at 7:07 AM, Keys Botzum <kbot...@mapr.com> wrote:

Why is a wildcard certificate a problem? They are quite common. One just needs all of the Drillbits to share a common domain for the wildcard to be easy and thus avoid having to list individual hosts. Are you saying that you can't use hostnames and must use IPs?

In case I'm not clear, here's an example of what I'm saying.

this is good with wildcards:
drill1.mydrill.corp.com, drill2.mydrill.corp.com, drill3.mydrill.corp.com, drill4.mydrill.corp.com

this is bad with wildcards:
drill1, drill2, drill3, drill4

Keys
___
Keys Botzum
MapR Technologies

On Jun 22, 2017, at 8:24 PM, John Omernik <j...@omernik.com> wrote:

Would there be interest in finding a way to globalize this? This is challenging for me and others who may run Drill with multi-tenant orchestrators. In my particular setup, each node running Drill gets added to an A record automatically, giving me HA and distribution of REST API queries. It also allows me to have a single certificate for my cluster rather than managing certificates on an individual basis. I set things up to connect via IP, but then I had certificate mismatch warnings. My goal is to find a way to connect to the REST API while maintaining a session to a single node, without sacrificing HA and balancing and without compromising SSL security. I know it's a tall order, but if there are ideas outside of global state management, I am all ears.

Note some ideas I've also considered:

1. Using a load balancer that would allow me to pin connections. Not ideal because it's another service to manage, but it would work.

2. There may be a way to hack things with a wildcard cert, but it seems complicated and fragile.

On Jun 22, 2017 5:47 PM, "Sorabh Hamirwasia" <shamirwa...@mapr.com> wrote:

Hi John,
As Paul mentioned, session IDs are not global. Each session is part of the BitToUserConnection instance created for a connection between Drillbit and client. Hence it's local to that Drillbit only, and the lifetime of the session is tied to the lifetime of the connection. You can find the code here<https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/rpc/user/UserServer.java#L102>.

Thanks,
Sorabh

____________
From: Paul Rogers <prog...@mapr.com>
Sent: Thursday, June 22, 2017 2:19:50 PM
To: user@drill.apache.org
Subject: Re: Drill Session ID between Nodes

Hi John,

I do not believe that session IDs are global. Each Drillbit maintains its own concept of sessions. A global session would require some centralized registry of sessions, which Drill does not have.

Would be great if someone can confirm…

- Paul

On Jun 22, 2017, at 12:14 PM, John Omernik <j...@omernik.com> wrote:

When I log onto a Drill node and get a session ID, if I connect to another Drill node in the cluster, will the session ID be valid?

I am guessing not, but want to validate.

My conundrum: I have my Drill cluster running in such a way that the connections to the nodes are load balanced via DNS. However, if I get a different DNS IP while in session, the session appears to be invalidated, which forces me to log on again...
Re: Drill Session ID between Nodes
Why is a wildcard certificate a problem? They are quite common. One just needs all of the Drillbits to share a common domain for the wildcard to be easy and thus avoid having to list individual hosts. Are you saying that you can't use hostnames and must use IPs?

In case I'm not clear, here's an example of what I'm saying.

this is good with wildcards:
drill1.mydrill.corp.com, drill2.mydrill.corp.com, drill3.mydrill.corp.com, drill4.mydrill.corp.com

this is bad with wildcards:
drill1, drill2, drill3, drill4

Keys
___
Keys Botzum
MapR Technologies

On Jun 22, 2017, at 8:24 PM, John Omernik <j...@omernik.com> wrote:

Would there be interest in finding a way to globalize this? This is challenging for me and others who may run Drill with multi-tenant orchestrators. In my particular setup, each node running Drill gets added to an A record automatically, giving me HA and distribution of REST API queries. It also allows me to have a single certificate for my cluster rather than managing certificates on an individual basis. I set things up to connect via IP, but then I had certificate mismatch warnings. My goal is to find a way to connect to the REST API while maintaining a session to a single node, without sacrificing HA and balancing and without compromising SSL security. I know it's a tall order, but if there are ideas outside of global state management, I am all ears.

Note some ideas I've also considered:

1. Using a load balancer that would allow me to pin connections. Not ideal because it's another service to manage, but it would work.

2. There may be a way to hack things with a wildcard cert, but it seems complicated and fragile.

On Jun 22, 2017 5:47 PM, "Sorabh Hamirwasia" <shamirwa...@mapr.com> wrote:

Hi John,
As Paul mentioned, session IDs are not global. Each session is part of the BitToUserConnection instance created for a connection between Drillbit and client. Hence it's local to that Drillbit only, and the lifetime of the session is tied to the lifetime of the connection. You can find the code here<https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/rpc/user/UserServer.java#L102>.

Thanks,
Sorabh

From: Paul Rogers <prog...@mapr.com>
Sent: Thursday, June 22, 2017 2:19:50 PM
To: user@drill.apache.org
Subject: Re: Drill Session ID between Nodes

Hi John,

I do not believe that session IDs are global. Each Drillbit maintains its own concept of sessions. A global session would require some centralized registry of sessions, which Drill does not have.

Would be great if someone can confirm…

- Paul

On Jun 22, 2017, at 12:14 PM, John Omernik <j...@omernik.com> wrote:

When I log onto a Drill node and get a session ID, if I connect to another Drill node in the cluster, will the session ID be valid?

I am guessing not, but want to validate.

My conundrum: I have my Drill cluster running in such a way that the connections to the nodes are load balanced via DNS. However, if I get a different DNS IP while in session, the session appears to be invalidated, which forces me to log on again...
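Editor's illustration of Keys' good/bad example: a minimal Python sketch of why a wildcard certificate covers the fully qualified names but not the bare ones. The function name `wildcard_matches` is hypothetical, and the rule is deliberately simplified (the wildcard must be the whole leftmost label and stands for exactly one label); real TLS libraries apply more detailed checks.

```python
def wildcard_matches(pattern, hostname):
    """Return True if a certificate name such as '*.mydrill.corp.com'
    covers hostname under the simplified rule described above."""
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().rstrip(".").split(".")
    # The wildcard stands for exactly one label, so the counts must agree.
    if len(p_labels) != len(h_labels):
        return False
    if p_labels[0] == "*":
        # Require a real parent domain after the wildcard, and a
        # non-empty leftmost label in the hostname.
        return (
            len(p_labels) > 2
            and h_labels[0] != ""
            and p_labels[1:] == h_labels[1:]
        )
    return p_labels == h_labels
```

With the pattern `*.mydrill.corp.com`, the name `drill1.mydrill.corp.com` matches, while the short name `drill1` does not, because it has no domain labels for the wildcard's parent domain to match.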
Re: Drill Session ID between Nodes
Would there be interest in finding a way to globalize this? This is challenging for me and others who may run Drill with multi-tenant orchestrators. In my particular setup, each node running Drill gets added to an A record automatically, giving me HA and distribution of REST API queries. It also allows me to have a single certificate for my cluster rather than managing certificates on an individual basis. I set things up to connect via IP, but then I had certificate mismatch warnings. My goal is to find a way to connect to the REST API while maintaining a session to a single node, without sacrificing HA and balancing and without compromising SSL security. I know it's a tall order, but if there are ideas outside of global state management, I am all ears.

Note some ideas I've also considered:

1. Using a load balancer that would allow me to pin connections. Not ideal because it's another service to manage, but it would work.

2. There may be a way to hack things with a wildcard cert, but it seems complicated and fragile.

On Jun 22, 2017 5:47 PM, "Sorabh Hamirwasia" <shamirwa...@mapr.com> wrote:

> Hi John,
> As Paul mentioned, session IDs are not global. Each session is part of the
> BitToUserConnection instance created for a connection between Drillbit and
> client. Hence it's local to that Drillbit only, and the lifetime of the
> session is tied to the lifetime of the connection. You can find the code here<
> https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/rpc/user/UserServer.java#L102>.
>
> Thanks,
> Sorabh
>
>
> From: Paul Rogers <prog...@mapr.com>
> Sent: Thursday, June 22, 2017 2:19:50 PM
> To: user@drill.apache.org
> Subject: Re: Drill Session ID between Nodes
>
> Hi John,
>
> I do not believe that session IDs are global. Each Drillbit maintains its
> own concept of sessions. A global session would require some centralized
> registry of sessions, which Drill does not have.
>
> Would be great if someone can confirm…
>
> - Paul
>
> > On Jun 22, 2017, at 12:14 PM, John Omernik <j...@omernik.com> wrote:
> >
> > When I log onto a Drill node and get a session ID, if I connect to another
> > Drill node in the cluster, will the session ID be valid?
> >
> > I am guessing not, but want to validate.
> >
> > My conundrum: I have my Drill cluster running in such a way that the
> > connections to the nodes are load balanced via DNS. However, if I get a
> > different DNS IP while in session, the session appears to be invalidated,
> > which forces me to log on again...
Re: Drill Session ID between Nodes
Hi John,
As Paul mentioned, session IDs are not global. Each session is part of the BitToUserConnection instance created for a connection between Drillbit and client. Hence it's local to that Drillbit only, and the lifetime of the session is tied to the lifetime of the connection. You can find the code here<https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/rpc/user/UserServer.java#L102>.

Thanks,
Sorabh

From: Paul Rogers <prog...@mapr.com>
Sent: Thursday, June 22, 2017 2:19:50 PM
To: user@drill.apache.org
Subject: Re: Drill Session ID between Nodes

Hi John,

I do not believe that session IDs are global. Each Drillbit maintains its own concept of sessions. A global session would require some centralized registry of sessions, which Drill does not have.

Would be great if someone can confirm…

- Paul

> On Jun 22, 2017, at 12:14 PM, John Omernik <j...@omernik.com> wrote:
>
> When I log onto a Drill node and get a session ID, if I connect to another
> Drill node in the cluster, will the session ID be valid?
>
> I am guessing not, but want to validate.
>
> My conundrum: I have my Drill cluster running in such a way that the
> connections to the nodes are load balanced via DNS. However, if I get a
> different DNS IP while in session, the session appears to be invalidated,
> which forces me to log on again...
Re: Drill Session ID between Nodes
Hi John,

I do not believe that session IDs are global. Each Drillbit maintains its own concept of sessions. A global session would require some centralized registry of sessions, which Drill does not have.

Would be great if someone can confirm…

- Paul

> On Jun 22, 2017, at 12:14 PM, John Omernik wrote:
>
> When I log onto a Drill node and get a session ID, if I connect to another
> Drill node in the cluster, will the session ID be valid?
>
> I am guessing not, but want to validate.
>
> My conundrum: I have my Drill cluster running in such a way that the
> connections to the nodes are load balanced via DNS. However, if I get a
> different DNS IP while in session, the session appears to be invalidated,
> which forces me to log on again...
Drill Session ID between Nodes
When I log onto a Drill node and get a session ID, if I connect to another Drill node in the cluster, will the session ID be valid?

I am guessing not, but want to validate.

My conundrum: I have my Drill cluster running in such a way that the connections to the nodes are load balanced via DNS. However, if I get a different DNS IP while in session, the session appears to be invalidated, which forces me to log on again...
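Editor's illustration of one client-side way around the situation described above: resolve the round-robin DNS name once, keep using a single address for the life of the session, and still have TLS verify the certificate against the shared name rather than the IP. This is a sketch under assumptions, not something proposed in the thread: the helper names `resolve_all` and `open_pinned_tls` are hypothetical, and it assumes the certificate is issued for the DNS name the client resolved.

```python
import socket
import ssl


def resolve_all(hostname, port, resolver=socket.getaddrinfo):
    """Return the distinct IP addresses behind a round-robin DNS name."""
    infos = resolver(hostname, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses


def open_pinned_tls(hostname, ip, port, timeout=10.0):
    """Connect to one specific IP, but verify the certificate for hostname."""
    context = ssl.create_default_context()
    raw = socket.create_connection((ip, port), timeout=timeout)
    try:
        # server_hostname drives both SNI and the certificate name check,
        # so connecting by IP does not cause a certificate mismatch.
        return context.wrap_socket(raw, server_hostname=hostname)
    except Exception:
        raw.close()
        raise
```

A caller would pick one entry from `resolve_all(...)` at login and pass that same IP to `open_pinned_tls(...)` for every later request in the session, choosing a different entry only when the pinned node fails and a new login is needed anyway.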