[ 
https://issues.apache.org/jira/browse/DRILL-4543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216522#comment-15216522
 ] 

Paul Rogers edited comment on DRILL-4543 at 3/30/16 6:02 PM:
-------------------------------------------------------------

(Revised to reflect current implementation as explained below by Jacques. 
Revised again to reverse list order.) Drill already provides four layers of 
config:

* -Dname=value system properties from the command line.
* Overrides config file (provided by user)
* Per-module defaults (from class path)
* Defaults config file.

Items higher in the list take precedence over items lower in the list.
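
As a rough illustration of the precedence rule above, the sketch below walks 
a list of layers from highest to lowest precedence and returns the first 
value found. The class name LayeredLookup and the use of plain maps are 
assumptions made for this example; it is not Drill's actual config-loading 
code.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper: resolves a key against ordered config layers. */
final class LayeredLookup {
    private LayeredLookup() {}

    /**
     * Layers are ordered from highest to lowest precedence, e.g.
     * system properties, overrides file, per-module defaults, defaults file.
     * Returns the value from the first layer that defines the key.
     */
    static Optional<String> resolve(String key, List<Map<String, String>> layers) {
        for (Map<String, String> layer : layers) {
            String value = layer.get(key);
            if (value != null) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }
}
```

With this shape, adding a new layer (such as environment variables) amounts 
to inserting one more map at the right position in the list.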

A suggestion would be to insert an additional step for selected options:

* System properties
* Environment variables <-- New
* Overrides

The challenge is that env vars cannot use the same syntax as used elsewhere. 
Perhaps we need a map of (env var name: system prop name) values. Note that two 
JVM options already are configured as env vars in drill-env.sh:

DRILL_MAX_DIRECT_MEMORY="8G"
DRILL_HEAP="4G"

The -Dname=value system properties are handy, but they require assistance from 
the launch script to 1) format the properties as system properties, and 2) put 
them into the right place on the command line.

It may be easier for tools such as Mesos to set an environment variable and not 
alter the command line.

The easiest solution is simply for the drillbit.sh script to special-case the 
few variables (to Jacques' point below) that are needed: ports, etc. If the 
DRILL_HOST_PORT env var is set, say, then add the -Dname=value equivalent to 
the command line.
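
To make the (env var name: system prop name) idea concrete, here is a 
minimal sketch of the translation step, written in Java for illustration 
even though the real work would likely live in drillbit.sh. DRILL_HOST_PORT 
comes from the text above; DRILL_HTTP_PORT and the property names on the 
right-hand side are placeholders picked for this example, not a confirmed 
mapping.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: turns selected env vars into -Dname=value arguments. */
final class EnvToSystemProps {
    private EnvToSystemProps() {}

    // Assumed mapping for illustration only; the real list would be
    // whatever small set of options the launch script chooses to support.
    static final Map<String, String> ENV_TO_PROP = new LinkedHashMap<>();
    static {
        ENV_TO_PROP.put("DRILL_HOST_PORT", "drill.exec.rpc.user.server.port");
        ENV_TO_PROP.put("DRILL_HTTP_PORT", "drill.exec.http.port");
    }

    /** Emits one -D argument per mapped env var that is set and non-empty. */
    static List<String> toJvmArgs(Map<String, String> env) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> entry : ENV_TO_PROP.entrySet()) {
            String value = env.get(entry.getKey());
            if (value != null && !value.isEmpty()) {
                args.add("-D" + entry.getValue() + "=" + value);
            }
        }
        return args;
    }
}
```

A caller could pass System.getenv() as the env argument. Env vars that are 
not in the map are ignored, which keeps the feature limited to the selected 
options.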

Another alternative is to allow multiple layers of overrides:

* Drill-bit overrides (new, per-Drill-bit file)
* Site Overrides (existing file)

In this case, the custom properties would be written to a per-drill-bit file 
that the Drill bit reads. (Details omitted for now.)

All three proposals allow Mesos, YARN (or Ansible or other tools) to override 
the site-wide config as needed.

Because we are altering values per-Drill-bit, the values that impact query 
planning must be communicated to the other Drill bits. Thus the request that 
each Drill-bit use ZK to advertise its actual values as computed using the 
override rules. Net result: YARN (or Mesos) can adjust ports and resources, and 
other Drill-bits can learn of those customizations.
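
A rough sketch of the advertise-and-read cycle follows. It assumes a plain 
key=value text payload purely as a stand-in for the real wire format (the 
existing ZK entry is Protobuf-encoded), leaves out the ZK client entirely, 
and uses invented names (DrillbitAdvert, valueFor). The receiver falls back 
to its own config when a value is not advertised, matching the 
backward-compatibility note in the issue description.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of per-Drill-bit advertised values. */
final class DrillbitAdvert {
    private DrillbitAdvert() {}

    /** Sender side: encodes the effective values as sorted key=value lines. */
    static String encode(Map<String, String> effective) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : new TreeMap<>(effective).entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    /** Receiver side: parses the payload; lines without a key are skipped. */
    static Map<String, String> decode(String payload) {
        Map<String, String> out = new TreeMap<>();
        for (String line : payload.split("\n")) {
            int eq = line.indexOf('=');
            if (eq > 0) {
                out.put(line.substring(0, eq), line.substring(eq + 1));
            }
        }
        return out;
    }

    /** Uses the advertised value if present, else the local config value. */
    static String valueFor(String key, Map<String, String> advertised,
                           Map<String, String> localConfig) {
        String value = advertised.get(key);
        return value != null ? value : localConfig.get(key);
    }
}
```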

It would also be good, in the Web UI, to display the set of actual values. A 
bonus would be to tag each value with its source (defaults, config file name, 
env, system) to aid admin troubleshooting. (One can see actual values today 
using

SELECT * FROM sys.boot;

But it doesn't show the origin of the value.)
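
One possible way to carry the origin along with each value is to resolve 
against named layers and return both pieces together. The sketch below 
assumes that approach; the class names and the layer labels a caller would 
use ("system", "env", and so on) are illustrative only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: resolves a key and remembers which layer supplied it. */
final class TaggedConfig {
    private TaggedConfig() {}

    /** A resolved value together with the name of its source layer. */
    static final class Tagged {
        final String value;
        final String source;

        Tagged(String value, String source) {
            this.value = value;
            this.source = source;
        }

        @Override
        public String toString() {
            return value + " (" + source + ")";
        }
    }

    /**
     * Layers are keyed by source name and iterated in insertion order,
     * which the caller sets up from highest to lowest precedence.
     * Returns null when no layer defines the key.
     */
    static Tagged resolve(String key, LinkedHashMap<String, Map<String, String>> layers) {
        for (Map.Entry<String, Map<String, String>> layer : layers.entrySet()) {
            String value = layer.getValue().get(key);
            if (value != null) {
                return new Tagged(value, layer.getKey());
            }
        }
        return null;
    }
}
```

A Web UI page or an extra column in a sys.boot-style table could then show 
the source next to the value.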

(This JIRA covers only the ZK advertisement; the other details above are 
provided as background. We'll move the config and reporting items to another 
JIRA.)


was (Author: paul-rogers):
(Revised to reflect current implementation as explained below by Jacques.) Drill 
already provides four layers of config:

* Defaults config file.
* Per-module defaults (from class path)
* Overrides config file (provided by user)
* -Dname=value system properties from the command line.

Items higher in order take precedence over items lower in the order.

A suggestion would be to insert an additional step:

* Overrides
* Environment variables <-- New
* System properties

The challenge is that env vars cannot use the same syntax as used elsewhere. 
Perhaps we need a map of (env var name: system prop name) values.

The -Dname=value system properties are handy, but they require assistance from 
the launch script to 1) format the properties as system properties, and 2) put 
them into the right place on the command line.

It may be easier for tools such as Mesos to set an environment variable and not 
alter the command line.

Another alternative is to allow multiple layers of overrides:

* Site Overrides (existing file)
* Drill-bit overrides (new, per-Drill-bit file)

In this case, the custom properties would be written to a per-drill-bit file 
that the Drill bit reads. (Details omitted for now.)

All three proposals allow Mesos, YARN (or Ansible or other tools) to override 
the site-wide config as needed.

Because we are altering values per-Drill-bit, the values that impact query 
planning must be communicated to the other Drill bits. Thus the request that 
each Drill-bit use ZK to advertise its actual values as computed using the 
override rules. Net result: YARN (or Mesos) can adjust ports and resources, and 
other Drill-bits can learn of those customizations.

It would also be good, in the Web UI, to display the set of actual values. A 
bonus would be to tag each value with its source (defaults, config file name, 
env, system) to aid admin troubleshooting. (One can see actual values today 
using

SELECT * FROM sys.boot;

But it doesn't show the origin of the value.)

(This JIRA covers only the ZK advertisement; the other details above are 
provided as background. We'll move the config and reporting items to another 
JIRA.)

> Advertise Drill-bit ports, status, capabilities in ZooKeeper
> ------------------------------------------------------------
>
>                 Key: DRILL-4543
>                 URL: https://issues.apache.org/jira/browse/DRILL-4543
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components:  Server
>            Reporter: Paul Rogers
>             Fix For: 2.0.0
>
>
> Today Drill uses ZooKeeper (ZK) to advertise the existence of a Drill-bit, 
> providing the host name/IP Address of the Drill-bit and the ports used, 
> encoded in Protobuf format. All other information (status, CPUs, memory) is 
> assumed to be the same across all Drill-bits in the cluster as specified in 
> the Drill config file. (Amended to reflect 1.6 behavior.)
> Moving forward, as Drill becomes more sophisticated, Drill should advertise 
> the specifics of each Drill-bit so that one Drill bit can differ from another.
> For example, when running on YARN, we need a way for Drill to gracefully shut 
> down. Advertising a status of Ready or Unavailable will help. Ready is the 
> normal state. Unavailable means the Drill-bit will finish in-flight queries, 
> but won't accept new ones. (The actual status is a separate enhancement.)
> In a YARN cluster, Drill should take advantage of machines with more memory, 
> but live with machines with less. (Perhaps some are newer, some are older or 
> more heavily loaded.) Drill should use ZK to identify its available memory 
> and CPUs so that the planner can use them. (Use of the info is a separate 
> enhancement.)
> There may be times when two drill bits run on a single machine. If so, they 
> must use separate ports. So, each Drill-bit should advertise its ports in ZK.
> For backward compatibility, the information is optional; if not present, the 
> receiver should assume the information defaults to that in the config file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
