So, I *think* I got things working. I saw some inconsistencies depending on
which user I had launched sqlline as, but I can’t reproduce them reliably.
In any case, here’s what I put in the config:
drill.exec: {
  cluster-id: "se1-drillbits",
  zk.connect: "10.10.15.10:5181,10.10.15.11:5181,10.10.15.12:5181",
  sys.store.provider.zk.blobroot: "maprfs:///user/mapr/profiles",
  sort.external.spill.directories: [
    "/var/mapr/local/se-node10.se.lab/drillspill" ],
  sort.external.spill.fs: "maprfs:///",
  impersonation: {
    enabled: true,
    max_chained_user_hops: 3
  }
}
Note: putting a shell variable ($HOSTNAME) in the path did not seem to work.
Queries that spilled to disk failed with errors about directory permissions,
likely because the path could not be resolved.
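
One thing that might be worth trying: drill-override.conf is a HOCON file, and
HOCON only expands substitutions written as ${NAME} outside of quotes, falling
back to environment variables when the config itself has no such key. A
"$HOSTNAME" inside a quoted string stays literal text. The fragment below is
an untested sketch of that syntax. It assumes HOSTNAME is actually exported
in the environment the drillbit is started from (bash sets it but does not
export it by default), and that Drill resolves substitutions when it loads
this file.

```
drill.exec: {
  sort.external.spill.directories: [
    # Unquoted ${HOSTNAME} is a HOCON substitution; the quoted pieces around
    # it are concatenated with its value into a single string.
    # Assumption: HOSTNAME is exported to the drillbit process.
    "/var/mapr/local/"${HOSTNAME}"/drillspill"
  ],
  sort.external.spill.fs: "maprfs:///"
}
```

If the variable is not set, a required substitution like this should fail at
config load time rather than silently producing a bad path, which would at
least make the problem visible at startup.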
If I can figure out the original issue I had (e.g.: if I can reproduce), I will
file a JIRA.
Andy Pernsteiner
Manager, Field Enablement
ph: 206.228.0737
www.mapr.com
From: Andries Engelbrecht <[email protected]>
Reply: [email protected] <[email protected]>
Date: September 24, 2015 at 4:21:50 PM
To: [email protected] <[email protected]>
Subject: Re: Setting drill.exec.sort.external.spill.directories
Maybe try
sort.external.spill.directories: [ "/var/mapr/local/$hostname/drillspill" ],
—Andries
> On Sep 24, 2015, at 12:38 PM, Andy Pernsteiner <[email protected]>
> wrote:
>
> I’m trying to do some experimentation and set the
> drill.exec.sort.external.spill.directories value. Since this option appears
> as a ‘boot’ option (https://drill.apache.org/docs/start-up-options/), I
> believe the right way is to set this in drill-override.conf on each node.
>
> I tried doing this via the following:
>
>
> drill.exec: {
>   cluster-id: "se1-drillbits",
>   zk.connect: "10.10.15.10:5181,10.10.15.11:5181,10.10.15.12:5181",
>   sys.store.provider.zk.blobroot: "maprfs:///user/mapr/profiles",
>   sort.external.spill.directories: [ "/var/mapr/$hostname/drillspill" ],
>   sort.external.spill.fs: "maprfs:///",
>   impersonation: {
>     enabled: true,
>     max_chained_user_hops: 3
>   }
> }
>
> I also tried setting via:
>
> sort: {
>   purge.threshold : 100,
>   external: {
>     batch.size : 4000,
>     spill: {
>       batch.size : 4000,
>       group.size : 100,
>       threshold : 200,
>       directories : [ "/var/mapr/$hostname/drillspill" ],
>       fs : "maprfs:///"
>     }
>   }
> },
>
>
> But then looking at the sys.boot table after restarting the drill bits, I
> still see the default values:
>
> 0: jdbc:drill:> select * from sys.boot where name like '%spill%';
> +------+------+------+--------+---------+------------+----------+-----------+
> | name | kind | type | status | num_val | string_val | bool_val | float_val |
> +------+------+------+--------+---------+------------+----------+-----------+
> | drill.exec.sort.external.spill.batch.size | LONG | BOOT | BOOT | 4000 | null | null | null |
> | drill.exec.sort.external.spill.directories | STRING | BOOT | BOOT | null | [ # jar:file:/opt/mapr/drill/drill-1.1.0/jars/drill-java-exec-1.1.0.jar!/drill-module.conf: 145 "/tmp/drill/spill" ] | null | null |
> | drill.exec.sort.external.spill.fs | STRING | BOOT | BOOT | null | "file:///" | null | null |
> | drill.exec.sort.external.spill.group.size | LONG | BOOT | BOOT | 40000 | null | null | null |
> | drill.exec.sort.external.spill.threshold | LONG | BOOT | BOOT | 40000 | null | null | null |
> +------+------+------+--------+---------+------------+----------+-----------+
>
>
> Note that I’ve tried removing the shell ‘$hostname’ variable (in case it
> causes issues), no dice.
>
> What’s the right way to set these values?