The sys.boot data is the current node's configuration as seen from
sys.drillbits. Right now, we don't have a way of looking across nodes. Good
feature request though.

--
Jacques Nadeau
CTO and Co-Founder, Dremio

On Mon, Nov 9, 2015 at 11:00 AM, John Omernik <[email protected]> wrote:

> That's actually a good follow-up, if sys.boot (and others) are drillbit
> specific, do we have a way to query sys.boot across all bits to show
> values?
>
>
>
> On Mon, Nov 9, 2015 at 12:58 PM, John Omernik <[email protected]> wrote:
>
> > I did
> >
> > select * from boot where name like '%sort%'
> >
> > and I did get
> >
> > string_val - [ # env var DRILL_SPILLLOC
> >                     "/var/mapr/local/onedrillbithostname/drillspill"
> >         ]
> >
> > onedrillbithostname is one of my 5 nodes. I suppose this is only going to
> > be the foreman of the query (or perhaps the node that zk gave the JDBC
> > connection to connect to). That seems well and good then. Looks like the
> > value is getting propagated well. I'd love to see data in those directories,
> > to ensure I don't have some silly situation where, since sys.boot only
> > shows one value, all the nodes try to use the same spill location
> > (i.e. proving that each bit is truly writing to its own location), but
> > this all looks very promising.
> >
> >
> >
> >
> > On Mon, Nov 9, 2015 at 12:51 PM, John Omernik <[email protected]> wrote:
> >
> >> Just for future readers of the user list: Jacques posted to the JIRA
> >> that HOCON variables likely already have this. I created some scripts for
> >> my MapR setup (MapRTech folks, please take a look at the JIRA, I think I
> >> created effectively a local volume correctly using maprcli in the
> >> drill-env.sh).
> >>
> >> The one last piece of the puzzle is how to test that this is working. The
> >> drillbits start with no errors, but I'd like to validate it's all working
> >> as intended, i.e. can I force my memory low? What type of query would cause
> >> a spill? If Drill tries to not use spill as much as possible, this may be
> >> hard to prove... perhaps a query that shows this is set up right under the
> >> hood?
> >>
> >> John
> >>
> >> On Mon, Nov 9, 2015 at 8:28 AM, John Omernik <[email protected]> wrote:
> >>
> >>> Hey all, after speaking to Andries, I've gone ahead and created a JIRA
> >>> to support variables in the drill-override.conf file:
> >>>
> >>> https://issues.apache.org/jira/browse/DRILL-4052
> >>>
> >>> This will be a huge help and a great flexibility for administrators
> >>> looking to organize their Drill clusters. Please comment with ideas if you
> >>> have thoughts on the subject!
> >>>
> >>> John
> >>>
> >>>
> >>>
> >>> On Fri, Sep 25, 2015 at 10:20 AM, Andy Pernsteiner <[email protected]> wrote:
> >>>
> >>>> That was considered, and we may elect to do that or some variation
> >>>> (a separate mount point going to the target directory per node, but where
> >>>> the local mount point is identical across all cluster nodes). I was hoping
> >>>> that Drill had a way of parsing options within the config. If not, I’ll
> >>>> file a JIRA for enhancement, since this sort of thing would be useful for a
> >>>> number of scenarios.
> >>>>
> >>>>
> >>>>
> >>>>  Andy Pernsteiner
> >>>>  Manager, Field Enablement
> >>>> ph: 206.228.0737
> >>>>
> >>>> www.mapr.com
> >>>> Now Available - Free Hadoop On-Demand Training
> >>>>
> >>>>
> >>>>
> >>>> From: kbotzum <[email protected]>
> >>>> Reply: [email protected]
> >>>> Date: September 24, 2015 at 5:36:31 PM
> >>>> To: [email protected]
> >>>> Cc: Andries Engelbrecht <[email protected]>
> >>>> Subject: Re: Setting drill.exec.sort.external.spill.directories
> >>>>
> >>>> How about a symbolic link from the local file system on each node to
> >>>> the node-specific tmp dir? A little hacky but workable. You could do that
> >>>> once and then copy the Drill config without concern.
> >>>>
> >>>> FYI, many eons ago a file system known as AFS had special vars that
> >>>> would expand in pathnames to handle this type of thing transparently. My
> >>>> memory is fuzzy but I think we had @sys, @host, and probably a few others.
> >>>>
> >>>> Keys
> >>>> _______________________________
> >>>> Keys Botzum
> >>>> Senior Principal Technologist
> >>>> [email protected]
> >>>> 443-718-0098
> >>>> MapR Technologies
> >>>> http://www.mapr.com
> >>>>
> >>>>
> >>>>
> >>>> On Sep 24, 2015, at 5:30 PM, Andy Pernsteiner <[email protected]> wrote:
> >>>>
> >>>> > One question for those in the know: Is there a way to use shell (or
> >>>> > other) variables in these options? I'd much prefer $HOSTNAME, as opposed
> >>>> > to having to set the variable differently on each node in my cluster.
> >>>> >
> >>>> >
> >>>> >
> >>>> > On Thu, Sep 24, 2015 at 5:22 PM, Andy Pernsteiner <[email protected]> wrote:
> >>>> >
> >>>> >> So, I *think* I got things working. I had some inconsistencies in what
> >>>> >> I would see depending on which user I had launched sqlline as, but I
> >>>> >> can’t reproduce them reliably.
> >>>> >>
> >>>> >> In any case, here’s what I put in the config:
> >>>> >>
> >>>> >> drill.exec: {
> >>>> >>   cluster-id: "se1-drillbits",
> >>>> >>   zk.connect: "10.10.15.10:5181,10.10.15.11:5181,10.10.15.12:5181",
> >>>> >>   sys.store.provider.zk.blobroot: "maprfs:///user/mapr/profiles",
> >>>> >>   sort.external.spill.directories: [ "/var/mapr/local/se-node10.se.lab/drillspill" ],
> >>>> >>   sort.external.spill.fs: "maprfs:///",
> >>>> >>   impersonation: {
> >>>> >>     enabled: true,
> >>>> >>     max_chained_user_hops: 3
> >>>> >>   }
> >>>> >> }
> >>>> >>
> >>>> >> Note: putting a shell variable ($HOSTNAME) in the path did not seem to
> >>>> >> work (I’d get errors when running queries that resulted in a spill to
> >>>> >> disk, complaining about directory permissions, likely because it
> >>>> >> couldn’t resolve the path).
> >>>> >>
> >>>> >> If I can figure out the original issue I had (i.e. if I can reproduce
> >>>> >> it), I will file a JIRA.
> >>>> >>
> >>>> >>
> >>>> >>
> >>>> >> Andy Pernsteiner
> >>>> >> Manager, Field Enablement
> >>>> >> ph: 206.228.0737
> >>>> >>
> >>>> >> www.mapr.com
> >>>> >>
> >>>> >>
> >>>> >> From: Andries Engelbrecht <[email protected]>
> >>>> >> Reply: [email protected]
> >>>> >> Date: September 24, 2015 at 4:21:50 PM
> >>>> >> To: [email protected]
> >>>> >> Subject: Re: Setting drill.exec.sort.external.spill.directories
> >>>> >>
> >>>> >> Maybe try
> >>>> >>
> >>>> >> sort.external.spill.directories: [ "/var/mapr/local/$hostname/drillspill" ],
> >>>> >>
> >>>> >> —Andries
> >>>> >>
> >>>> >>> On Sep 24, 2015, at 12:38 PM, Andy Pernsteiner <[email protected]> wrote:
> >>>> >>>
> >>>> >>> I’m trying to do some experimentation and set the
> >>>> >>> drill.exec.sort.external.spill.directories value. Since this option
> >>>> >>> appears as a ‘boot’ option (https://drill.apache.org/docs/start-up-options/),
> >>>> >>> I believe the right way is to set this in drill-override.conf on each node.
> >>>> >>>
> >>>> >>> I tried doing this via the following:
> >>>> >>>
> >>>> >>>
> >>>> >>> drill.exec: {
> >>>> >>>   cluster-id: "se1-drillbits",
> >>>> >>>   zk.connect: "10.10.15.10:5181,10.10.15.11:5181,10.10.15.12:5181",
> >>>> >>>   sys.store.provider.zk.blobroot: "maprfs:///user/mapr/profiles",
> >>>> >>>   sort.external.spill.directories: [ "/var/mapr/$hostname/drillspill" ],
> >>>> >>>   sort.external.spill.fs: "maprfs:///",
> >>>> >>>   impersonation: {
> >>>> >>>     enabled: true,
> >>>> >>>     max_chained_user_hops: 3
> >>>> >>>   }
> >>>> >>> }
> >>>> >>>
> >>>> >>> I also tried setting via:
> >>>> >>>
> >>>> >>> sort: {
> >>>> >>>   purge.threshold : 100,
> >>>> >>>   external: {
> >>>> >>>     batch.size : 4000,
> >>>> >>>     spill: {
> >>>> >>>       batch.size : 4000,
> >>>> >>>       group.size : 100,
> >>>> >>>       threshold : 200,
> >>>> >>>       directories : [ "/var/mapr/$hostname/drillspill" ],
> >>>> >>>       fs : "maprfs:///"
> >>>> >>>     }
> >>>> >>>   }
> >>>> >>> },
> >>>> >>>
> >>>> >>>
> >>>> >>> But then looking at the sys.boot table after restarting the drillbits,
> >>>> >>> I still see the default values:
> >>>> >>>
> >>>> >>> 0: jdbc:drill:> select * from sys.boot where name like '%spill%';
> >>>> >>>
> >>>> >>> +------+------+------+--------+---------+------------+----------+-----------+
> >>>> >>> | name | kind | type | status | num_val | string_val | bool_val | float_val |
> >>>> >>> +------+------+------+--------+---------+------------+----------+-----------+
> >>>> >>> | drill.exec.sort.external.spill.batch.size | LONG | BOOT | BOOT | 4000 | null | null | null |
> >>>> >>> | drill.exec.sort.external.spill.directories | STRING | BOOT | BOOT | null | [ # jar:file:/opt/mapr/drill/drill-1.1.0/jars/drill-java-exec-1.1.0.jar!/drill-module.conf: 145 "/tmp/drill/spill" ] | null | null |
> >>>> >>> | drill.exec.sort.external.spill.fs | STRING | BOOT | BOOT | null | "file:///" | null | null |
> >>>> >>> | drill.exec.sort.external.spill.group.size | LONG | BOOT | BOOT | 40000 | null | null | null |
> >>>> >>> | drill.exec.sort.external.spill.threshold | LONG | BOOT | BOOT | 40000 | null | null | null |
> >>>> >>> +------+------+------+--------+---------+------------+----------+-----------+
> >>>> >>>
> >>>> >>> Note that I’ve tried removing the shell ‘$hostname’ variable (in case
> >>>> >>> it causes issues), no dice.
> >>>> >>>
> >>>> >>> What’s the right way to set these values?
> >>>> >>>
> >>>> >>>
> >>>> >>>
> >>>> >>>
> >>>> >>>
> >>>> >>>
> >>>> >>> Andy Pernsteiner
> >>>> >>> Manager, Field Enablement
> >>>> >>> ph: 206.228.0737
> >>>> >>>
> >>>> >>> www.mapr.com
> >>>> >>>
> >>>> >>>
> >>>> >>
> >>>> >>
> >>>> >
> >>>> >
> >>>> > --
> >>>> > Andy Pernsteiner
> >>>> > Manager, Field Enablement
> >>>> > ph: 206.228.0737
> >>>> >
> >>>> > www.mapr.com
> >>>>
> >>>>
> >>>
> >>
> >
>
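
For anyone finding this thread later: the string_val quoted above
("# env var DRILL_SPILLLOC") suggests the spill path was resolved through a
HOCON substitution of an environment variable. A minimal sketch of such a
setup follows. It assumes DRILL_SPILLLOC is exported by drill-env.sh on each
node; the export line and the use of `hostname -f` are illustrative
assumptions and are not confirmed anywhere in this thread.

```
# drill-env.sh (assumed, evaluated per node):
#   export DRILL_SPILLLOC="/var/mapr/local/$(hostname -f)/drillspill"
#
# drill-override.conf (can be identical on every node):
drill.exec: {
  # ${DRILL_SPILLLOC} is an unquoted HOCON substitution. When no config key
  # of that name exists, it falls back to the environment variable.
  sort.external.spill.directories: [ ${DRILL_SPILLLOC} ],
  sort.external.spill.fs: "maprfs:///"
}
```

The substitution has to sit outside the quotes. HOCON does not expand
anything inside a quoted string, so a value such as
"/var/mapr/$hostname/drillspill" is taken literally.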
