Paul, let's talk about this race condition you mention Let's use a real option here for clarity. store.format.
SYSTEM store.format is parquet Scenario 1: I log on, I set SESSION store.format to csv and run CREATE TABLE foo as select * from bar. The SESSION variable is read from my login and that is passed to all drill bits that may get a fragment.... correct? I won't have some fragments on other drill bits reading SYSTEM of parquet because I set a SESSION of csv. Scenario 2 (Your described race condition): I logon in, I do NOT set a SESSION variable for store.format. (Thus it's parquet). and run CREATE TABLE foo as select * from bar. As it is running, and before the final output, I change SYSTEM store.format to csv. And now I will run into this race condition? Scenario 2 is odd to me. Using the principle of least surprise, I would expect to have my variables read once during planning and projected to all downstream fragments rather than have a situation where variables could be read at different times. This also seems like a lot of unnecessary ZK reads... each fragment has to look up ZK settings? Should the foremen of the query do that once? Are you sure this race condition exists? This just doesn't sit right with me. On Wed, Sep 5, 2018 at 11:35 AM, Paul Rogers <[email protected]> wrote: > Hi Joe, > > To answer our question about how options "travel"... > > Drill maintains system options in ZK. Session options are maintained per > connection on the Foreman Drillbit to which the user connects. This is why > a simple round-robin load balancer does not work: why load balancing has to > be session-oriented as is the Drill JDBC an ODBC clients. (It is also why > the non-secure REST API does not preserve session options.) If you query > the session options table, you get a merger of the system options from ZK > overplayed with the session options on the Forman. > > When a query executes, the query planner copies session options into the > query play that it sends to Drillbits for execution. The operators in each > Drillbit merge the session options (from the query plan) and the system > options (from ZK) to get the full set of options for the query. > > Note that this does, in fact, create a race condition: if we change a > system option while queries run, some fragments of the query may see the > old value, others the new value. The workaround is simple: don't change > system options while queries are running. > > Thanks, > - Paul > > > > On Wednesday, September 5, 2018, 8:42:02 AM PDT, Joe Auty < > [email protected]> wrote: > > Hmmm.... > > So, if user 1 sets the store.format to CSV on Drillbit 1 and work gets > farmed out to Drillbit 2, this session setting will "travel" with the > user from drillbit to drillbit? We were originally thinking that this > would be the case if the session information was retained in Zookeeper, > but we weren't so sure about this. How is session information tracked > and maintained across multiple drillbits? > > Thanks so much for taking the time to engage on this! > > > > John Omernik wrote on 2018-09-05 9:40 AM: > > Rereading your post, I think there is some concern between > embedded/single > > drillbit mode, and distributed mode. > > > > When you run multiple drillbits in distributed mode, you will (should) be > > enabling authentication. Thus each user will log in to "a" drill bit. > > There is no concern on which one, it's not the only one doing the work, > You > > could have two users on the same drill bit, or on different bits. The > > "System" options will be set by default for all users on logon and at > this > > point, a session for the user is created. You can see what the current > > options for that user is with "Select * from sys.options" If you are a > > user who is in the administrative group (security.admin.users in select * > > from sys.options) then you can use > > > > ALTER SYSTEM set `somekey` = `someval` > > > > And that will set the default option for all users when they logon. > > > > A user can use > > > > ALTER SESSION set `somekey = `someval` > > > > To alter what say their store.format is. So User 1 can authentication > and > > connect to Drillbit 1 or 2, and set store.format (at the session level) > to > > parquet and user 2 can logon it Drillbit 1 or 2 (doesn't matter, even if > > user 1 is logged into the same drill bit with a different store.format) > and > > set it to csv. > > > > I think the key here is authenticated distributed mode in Drill and that > > will be how you will do what you need to do. > > > > John > > > > > > > > On Tue, Sep 4, 2018 at 7:30 PM, John Omernik <[email protected]> wrote: > > > >> The session is the users session, not the drill bit. Since you are > >> connected to a specific drill bit, when you alter session it will work. > Try > >> to use session stickiness or pinning on your HA solution and you will be > >> go. With my DNS round robin it picks a "connecting" drill bit and > sticks to > >> it until the session is done. The settings apply to the drill cluster > while > >> in distributed mode. > >> > >> On Tue, Sep 4, 2018 at 3:56 PM, Joe Auty <[email protected]> > wrote: > >> > >>> Thanks for your response John! > >>> > >>> We are using Drill both in an ETL context, as well as for general > >>> warehouse queries. One Drill user uses store format set to Parquet > while > >>> the other uses store format set to CSV to read and write from HDFS. We > are > >>> currently using Kubernetes Services rather than DNS round robin, but we > >>> only have one drillbit in the cluster while we try to sort out this > issue. > >>> > >>> I'm not clear though how alter session would work with a session with > >>> multiple drillbits involved? We need to ensure that the right store > format > >>> option is set, so like I said we are using different Drill usernames > and > >>> sessions to accommodate this, but how would alter session commands > apply to > >>> preserve these different settings across multiple drillbits? > >>> > >>> > >>> > >>> > >>> > >>> > >>> John Omernik wrote on 2018-09-04 2:48 PM: > >>> > >>>> Are these ETL ish type queries? store.format should only apply when > >>>> Drill > >>>> is writing data, when it is reading, it uses the filenames and other > >>>> hints > >>>> to read. > >>>> > >>>> Thus, if you do HA, say with DNS (like like in the other thread) and > >>>> prior > >>>> to running your CREATE TABLE AS (I Am assuming this is what you are > >>>> doing) > >>>> you can do ALTER SESSION set store.format = 'parquet' > >>>> > >>>> Instead of setting the ALTER SYSTEM, you can use ALTER SESSION so it > only > >>>> applies to the current session, regardless of foreman. > >>>> > >>>> John > >>>> > >>>> > >>>> On Tue, Sep 4, 2018 at 1:00 PM, Joe Auty <[email protected]> > wrote: > >>>> > >>>> Hello, > >>>>> We need to have some queries executed with store.format set to > parquet > >>>>> and > >>>>> some with this option set to CSV. To date we have experimented with > >>>>> setting > >>>>> the store format for sessions controlled by using two separate user > >>>>> logins > >>>>> as a sort of context switch, but I'm wondering if the group here > might > >>>>> have > >>>>> suggestions for a better way to handle this, particularly one that > will > >>>>> scale a little better for us? > >>>>> > >>>>> The main problem we have with this approach is in introducing > multiple > >>>>> drillbits/HA and assuring that the session and the settings we need > are > >>>>> respected across all drillbits (whether with an HAProxy + sticky > session > >>>>> approach or any other approach). There is a more general thread > (which > >>>>> I've > >>>>> chosen not to hijack) about HA Drill from a more general standpoint, > you > >>>>> might think of my question here as being similar, but with the need > for > >>>>> a > >>>>> context switch to support multiple Drill configurations/session > options. > >>>>> > >>>>> Here are the various attempts and approaches we have come up with so > >>>>> far. > >>>>> I'm wondering if you'd have any general advice as to which approach > >>>>> would > >>>>> be best for us to take, considering future plans for Drill itself. > For > >>>>> example, if need be we can write our own plugin(s) if this is the > >>>>> smartest > >>>>> approach: > >>>>> > >>>>> - embedded the store.format option into the query itself by chaining > >>>>> multiple queries together separated by a comma (it appears that this > >>>>> doesn't work at all) > >>>>> - look into writing some sort of plugin to allow us to scale our > current > >>>>> approach somehow (I realize that this is vague) > >>>>> - a "foreman" approach where we stick with our current approach and > >>>>> direct > >>>>> all requests to our "foreman"/master with the hope and expectation > that > >>>>> it > >>>>> will farm out work to the workers/slaves > >>>>> - multiple clusters set with different settings > >>>>> > >>>>> Each of these approaches seems to have its pros and cons. To > reiterate: > >>>>> what approach do you think would be the smartest and most > future-proof > >>>>> approach for us to take? > >>>>> > >>>>> Thanks in advance! > >>>>> > >>>>> > > >
