[ 
https://issues.apache.org/jira/browse/DRILL-4543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15220209#comment-15220209
 ] 

Paul Rogers commented on DRILL-4543:
------------------------------------

Here is a revised take on the request.

Today, Drill advertises its Drill-bits with host name and three of five ports. 
The information is stored in ZK in Protobuf format. Drill itself uses the ZK 
information, but it is also valuable to management tools (such as YARN). Because 
of the Protobuf format, Drill code is required to read the values (or the user 
must roll their own Protobuf code). Bringing in Drill code creates complex 
dependencies (DRILL-4561).

The requested changes are:

1. Store the ZK information as simple values (text, numbers) rather than in 
Protobuf format, so that non-Drill clients can read the information without the 
need for Drill code.
2. Identify each node with a unique identifier such as hostName:userPort. 
(Drill uses multiple ports, but only one bit can use any given port number, so 
using a single port in the identifier is sufficient.)
3. Identify the node in a version-aware way, for example as the string 
drill://hostName:userPort, which is easily differentiated from the existing 
Protobuf format. (New clients can read both formats; older clients are broken 
in this case.)
4. Add other key information (ports, capabilities) as child nodes, again using 
simple values. (This assumes a znode subtree has the same timeout semantics 
as a single node; this needs to be verified.)
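
As a rough illustration of items 2 through 4, here is a minimal Python sketch 
of how a non-Drill client might tell the proposed string identifier apart from 
the existing Protobuf payload, and how extra values could be laid out as child 
nodes. Everything beyond the drill://hostName:userPort string is an assumption 
made for illustration: the function names, the base path, the child names, and 
the choice to use hostName:userPort as the znode name with the drill:// string 
as its data. None of this is an agreed-upon Drill format.

```python
from urllib.parse import urlsplit

SCHEME = "drill"  # prefix taken from the proposal: drill://hostName:userPort


def format_drillbit_id(host, user_port):
    """Build the proposed version-aware identifier string."""
    return f"{SCHEME}://{host}:{user_port}"


def parse_drillbit_id(data):
    """Return (host, user_port) if the payload is in the proposed text format.

    Returns None for anything else, for example a legacy Protobuf payload,
    so the caller can fall back to the old decoding path.
    """
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return None
    if not text.startswith(SCHEME + "://"):
        return None
    parts = urlsplit(text)
    try:
        port = parts.port  # raises ValueError for a non-numeric or out-of-range port
    except ValueError:
        return None
    if parts.hostname is None or port is None:
        return None
    # Note: urlsplit lower-cases the host name.
    return parts.hostname, port


def child_znodes(base_path, host, user_port, values):
    """Map each extra value to a hypothetical child path holding plain text."""
    node = f"{base_path}/{host}:{user_port}"
    return {f"{node}/{name}": str(value).encode("utf-8")
            for name, value in values.items()}
```

A client would call parse_drillbit_id on the raw znode data and treat a None 
result as "not the new format", which is what makes the string prefix useful 
as a version marker.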

Then, we need a simple way to pass start-up parameters to Drill without a file; 
see DRILL-4569.

> Advertise Drill-bit ports, status, capabilities in ZooKeeper
> ------------------------------------------------------------
>
>                 Key: DRILL-4543
>                 URL: https://issues.apache.org/jira/browse/DRILL-4543
>             Project: Apache Drill
>          Issue Type: Sub-task
>          Components:  Server
>            Reporter: Paul Rogers
>             Fix For: 2.0.0
>
>
> Today Drill uses ZooKeeper (ZK) to advertise the existence of a Drill-bit, 
> providing the host name/IP address of the Drill-bit and the ports used, 
> encoded in Protobuf format. All other information (status, CPUs, memory) is 
> assumed to be the same across all Drill-bits in the cluster as specified in 
> the Drill config file. (Amended to reflect 1.6 behavior.)
> Moving forward, as Drill becomes more sophisticated, Drill should advertise 
> the specifics of each Drill-bit so that one Drill-bit can differ from another.
> For example, when running on YARN, we need a way for Drill to gracefully shut 
> down. Advertising a status of Ready or Unavailable will help. Ready is the 
> normal state. Unavailable means the Drill-bit will finish in-flight queries, 
> but won't accept new ones. (The actual status is a separate enhancement.)
> In a YARN cluster, Drill should take advantage of machines with more memory, 
> but live with machines with less. (Perhaps some are newer, some are older or 
> more heavily loaded.) Drill should use ZK to identify its available memory 
> and CPUs so that the planner can use them. (Use of the info is a separate 
> enhancement.)
> There may be times when two Drill-bits run on a single machine. If so, they 
> must use separate ports. So, each Drill-bit should advertise its ports in ZK.
> For backward compatibility, the information is optional; if not present, the 
> receiver should assume the information defaults to that in the config file.
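
The backward-compatibility rule in the description (optional values fall back 
to the config file) could be sketched roughly as follows. The function name 
and the setting names in the usage lines are hypothetical, and treating a None 
value as "not advertised" is an assumption of this sketch.

```python
def effective_settings(config_defaults, advertised):
    """Overlay values advertised in ZK on the config-file defaults.

    Any setting that is absent (or None) in `advertised` keeps the default.
    """
    merged = dict(config_defaults)
    merged.update({k: v for k, v in advertised.items() if v is not None})
    return merged


# Hypothetical setting names, for illustration only.
defaults = {"userPort": 31010, "memoryMb": 8192}
settings = effective_settings(defaults, {"memoryMb": 16384})
```

Here a Drill-bit that advertises only its memory still gets the user port 
from the config file.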



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
