Hi John,

Your use case is interesting. I’m certainly not an expert in the network 
aspects of what you are trying to do, but I can take a shot at the related 
Drill issues.

Drill’s primary use case is connecting via the Drill client (typically via JDBC 
or ODBC.) The Drill client handles security. It also allows SQL sessions, and 
hence session options.

Your use case is based on the REST API. At present, the REST API is best 
described as a “prototype.” REST supports username/password login, and sessions 
associated with the login (on a single Drillbit). Sessions never time out (as 
far as I can tell.) More importantly, the REST API returns all query results in 
a single message, encoded as JSON. This is great for small queries, but does 
not scale well when returning millions of large rows. (Hint: we are looking for 
contributions to improve the REST API!)

As Keys pointed out, the important question is this: does your app need session 
state other than security? If so, then you need to consider overall SQL session 
state, not just SSL connections. If your script does “ALTER SESSION” followed 
by a query, then the ALTER SESSION might be sent to node A, with the query 
going to node B. Node B does not know about the session on A, and so results 
will be different than what you expect. The same is true with temp tables.

Said another way, you’d like your scripts to do round-robin per *request*, but 
Drill is designed to do round-robin *per session.* (The Drill client, when 
using ZK, does random selection of nodes that achieves roughly the same 
result.) In short, your use case is clear, but is not supported today in Drill.

Putting this together:

1. Sessions must be sticky to a single Drillbit so that session state, temp 
tables and so on are persisted (on that one Drillbit.)
2. If a session on one Drillbit drops, the app must establish a new session on 
another Drillbit. That involves not just security tokens and cookies, but also 
resetting session options, rebuilding temp tables, etc.
3. Since the app has to handle session recreation when switching Drillbits, the 
security issue, while a nuisance, is a necessary result of switching sessions.
4. As Keys points out, changing sessions is a rare event (due to timeouts, 
node failures, etc.) so session recovery should be rare.
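
To make the four points above concrete, here is a rough sketch of what the 
app side might look like. It is only an illustration, not anything that exists 
in Drill: the StickySession class and the send(node, sql) callable are 
hypothetical names, send stands in for however your script posts SQL to a 
Drillbit's REST endpoint (login and cookies included), and I am assuming a 
dead node shows up as a ConnectionError.

```python
from typing import Callable, List

# Hypothetical transport: send(node, sql) returns the query result and
# raises ConnectionError when the Drillbit cannot be reached.
Send = Callable[[str, str], object]


class StickySession:
    """Pins every statement to one Drillbit and replays setup on failover."""

    def __init__(self, nodes: List[str], send: Send):
        if not nodes:
            raise ValueError("need at least one Drillbit")
        self.nodes = list(nodes)
        self.send = send
        self.current = 0             # index of the Drillbit we are pinned to
        self.setup: List[str] = []   # ALTER SESSION, temp tables, etc.
        self.needs_replay = False    # True right after switching Drillbits

    def configure(self, sql: str):
        """Run a session-setup statement and remember it for later replay."""
        result = self._run(sql)
        self.setup.append(sql)
        return result

    def query(self, sql: str):
        """Run a query on the pinned Drillbit, failing over if it is down."""
        return self._run(sql)

    def _run(self, sql: str):
        # Try each Drillbit at most once, starting with the pinned one.
        for _ in range(len(self.nodes)):
            node = self.nodes[self.current]
            try:
                if self.needs_replay:
                    # New Drillbit: rebuild session state before the query.
                    for stmt in self.setup:
                        self.send(node, stmt)
                    self.needs_replay = False
                return self.send(node, sql)
            except ConnectionError:
                # Move to the next Drillbit; its session starts out empty.
                self.current = (self.current + 1) % len(self.nodes)
                self.needs_replay = True
        raise ConnectionError("no Drillbit reachable")
```

The idea is that the ALTER SESSION and the query that depends on it always go 
to the same Drillbit, and the setup statements are resent only when the app is 
forced onto a different one.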

The only way to make sessions “portable” is to create a global session 
shared across Drillbits, which is what you are proposing. Doing so is 
non-trivial: it requires a global session registry (or a way of synchronizing 
session state). Such sharing is not supported in Drill’s distributed, 
shared-nothing architecture. Could we add it? Probably, but not in the short 
term. If we ever find the need for a “metastore” (or central work scheduler), 
then at that time Drill would have a mechanism to support session portability; 
but that is a ways off.

For the short term, can you perhaps rethink the use case given that sessions 
are local? How will your app handle failover? Is the security issue as much of 
a problem when seen as part of session recreation? (I’m not an expert here; I’m 
asking how this might work: are there things, short of persistent sessions, we 
can do to help?)

You mentioned Drill-on-YARN (DoY). DoY is an interesting question. On the 
surface, REST works the same on DoY as in “regular” Drill: the REST endpoint 
doesn’t care how the Drillbit was launched. Whatever works with regular Drill 
will work with DoY. Under DoY, J/ODBC clients work as usual: they maintain a 
session with one Drillbit, and use ZK to find a fall-back Drillbit if the first 
one fails (with the need for the client to re-establish the SQL session state 
by resending session options, etc.) Can we improve this? Yes, if we did the 
work described earlier.

(BTW: I’m still looking for volunteers to help with code reviews so we can 
contribute DoY to Apache Drill…)

We have not yet looked into the security setup for DoY. (We wanted to get the 
security fully working with Drill itself first.) You raise some good issues 
that we must wrestle with as we enhance DoY to use the security features that 
are becoming available in Drill itself.

Thanks,

- Paul


> On Jun 23, 2017, at 9:50 AM, John Omernik <[email protected]> wrote:
> 
> That makes sense, ya, I would love to hear about the challenges of this in
> general from the Drill folks.
> 
> Also, I wonder if Paul R at MapR has any thoughts in how something like
> this would be handled in the Drill on Yarn Setup.
> 
> 
> John
> 
