[ 
https://issues.apache.org/jira/browse/BIGTOP-1746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355193#comment-14355193
 ] 

Richard Pelavin commented on BIGTOP-1746:
-----------------------------------------

One approach to consider is a simple topology DSL, say in YAML, that the
end user interacts with to specify the logical set of nodes, which daemons
run on each node, and how they are interconnected (for more complex things
like HA, or indicating how monitors, user authentication, etc. could plug
in). It would then be easy to write some simple code to "compile" this
into, for example, hiera.yaml files, an ENC, or even both.
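
A minimal sketch of such a compile step, under stated assumptions: the
topology layout (a "nodes" map plus a "links" map), the "topology::" key
names, and the function name are all hypothetical and are not existing
Bigtop or Hiera conventions. The topology is written as a Python dict here
to keep the example self-contained; in practice it would be loaded from the
YAML file.

```python
# Hypothetical topology description (would normally come from a YAML file).
TOPOLOGY = {
    "nodes": {
        "master": {"daemons": ["namenode", "resourcemanager"]},
        "worker": {"daemons": ["datanode", "nodemanager"]},
    },
    # Logical links: each daemon on the left needs to reach the daemon on the right.
    "links": {"datanode": "namenode", "nodemanager": "resourcemanager"},
}


def compile_to_hiera(topology):
    """Return {logical_node_name: hiera_style_data} for a topology dict."""
    nodes = topology["nodes"]
    # Index which logical node hosts each daemon.
    # Simplification: if a daemon runs on several nodes, the last one wins.
    host_of = {
        daemon: name
        for name, spec in nodes.items()
        for daemon in spec["daemons"]
    }
    result = {}
    for name, spec in nodes.items():
        data = {"topology::roles": sorted(spec["daemons"])}
        for daemon in spec["daemons"]:
            target = topology["links"].get(daemon)
            if target is not None:
                # Record which logical node this daemon has to connect to.
                data[f"topology::{target}_node"] = host_of[target]
        result[name] = data
    return result


hiera = compile_to_hiera(TOPOLOGY)
```

Each value in the result could then be dumped to a per-node data file or
served from an ENC script; the values are still logical node names, not
host addresses.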

I think this raises the question of what type of end user is expected to
use this. If it is a Puppet-savvy end user, then having them specify things
in hiera would be clear, but if we are going after end users who are not well
versed in Puppet, having a "Bigtop topology DSL" may resonate better and can
be simpler.

I have done a lot of work in this area, so I would be happy to propose a
starting point for a topology DSL if this approach makes sense. I can also
flesh out a number of issues that could be addressed, to see what priority
people would give them.
One issue, for example, is whether the topology just logically identifies
nodes and groups of nodes (e.g., the set of slaves) and does not require IP
or DNS addresses to be assigned to them; this allows more sharable designs
without locking users into what would be one team's deployment-specific
settings. It also facilitates the process where, given a topology, we spin up
a set of nodes and, in a late-binding way, attach the host addresses to the
nodes and to the attributes for connecting between hosts.
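
The late-binding step can be sketched roughly as follows. This is only an
illustration: the settings layout, the addresses, and the function name are
assumptions made for the example, not part of Bigtop.

```python
# Hypothetical per-node settings that still refer to logical node names.
LOGICAL_SETTINGS = {
    "master": {"roles": ["namenode"]},
    "worker": {"roles": ["datanode"], "namenode_node": "master"},
}

# Addresses that become known only after the nodes have been spun up.
ADDRESSES = {"master": "10.0.0.1", "worker": "10.0.0.2"}


def bind_addresses(node_settings, addresses):
    """Return settings keyed by host address, with node references resolved."""
    missing = sorted(set(node_settings) - set(addresses))
    if missing:
        raise KeyError(f"no address assigned for nodes: {missing}")
    bound = {}
    for node, settings in node_settings.items():
        bound[addresses[node]] = {
            # Simplification: any string value equal to a logical node name
            # is treated as a reference to that node and replaced.
            key: addresses.get(value, value) if isinstance(value, str) else value
            for key, value in settings.items()
        }
    return bound


bound = bind_addresses(LOGICAL_SETTINGS, ADDRESSES)
```

The same logical settings can be reused with a different address map, which
is what keeps the design sharable between teams.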

For the specific example of init-hdfs.sh, if I am guessing correctly, the
issue is that ideally you want to create directories only for the services
that are actually being used, rather than for all services; or,
equivalently, given the data-driven way in which the directories are
created, you want to construct the list of directories to include as a
function of which daemons are on the topology nodes. This is something I
have tackled, and I can include it in a write-up if there is interest.
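
As a rough sketch of computing the directory list from the topology: the
daemon-to-directory table below is purely illustrative and is not the list
that init-hdfs.sh actually uses; the function name is hypothetical too.

```python
# Illustrative mapping only; the real set of directories per service differs.
DIRS_BY_DAEMON = {
    "namenode": ["/tmp"],
    "hbase-master": ["/hbase"],
    "hive-metastore": ["/user/hive/warehouse"],
}

# Hypothetical topology: logical nodes and the daemons placed on them.
NODES = {
    "master": {"daemons": ["namenode", "hbase-master"]},
    "worker": {"daemons": ["datanode"]},
}


def dirs_for_topology(nodes, dirs_by_daemon):
    """Return the sorted directories needed by the daemons actually deployed."""
    deployed = {d for spec in nodes.values() for d in spec["daemons"]}
    dirs = set()
    for daemon in deployed:
        # Daemons without an entry in the table need no directories.
        dirs.update(dirs_by_daemon.get(daemon, []))
    return sorted(dirs)


needed = dirs_for_topology(NODES, DIRS_BY_DAEMON)
```

Here no Hive directory is produced, because no Hive daemon appears on any
node in the topology.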

> Introduce the concept of roles in bigtop cluster deployment
> -----------------------------------------------------------
>
>                 Key: BIGTOP-1746
>                 URL: https://issues.apache.org/jira/browse/BIGTOP-1746
>             Project: Bigtop
>          Issue Type: New Feature
>          Components: deployment
>            Reporter: vishnu gajendran
>              Labels: features
>             Fix For: 0.9.0
>
>
> Currently, during cluster deployment, puppet categorizes nodes as head_node, 
> worker_nodes, gateway_nodes, standy_node based on user specified info. This 
> functionality gives user control over picking up a particular node as 
> head_node, standy_node, gateway_node and rest others as worker_nodes. But, I 
> would like to have more fine-grained control over which daemons should run on 
> which node. For example, I do not want to run namenode, datanode on the same 
> node. This functionality can be introduced with the concept of roles. Each 
> node can be assigned a set of roles. For example, Node A can be assigned 
> ["namenode", "resourcemanager"] roles. Node B can be assigned ["datanode", 
> "nodemanager"] and Node C can be assigned ["nodemanager", "hadoop-client"]. 
> Now, each node will only run the specified daemons. Prerequisite for this 
> kind of deployment is that each node should be given the necessary 
> configurations that it needs to know. For example, each datanode should know 
> which is the namenode etc... This functionality will allow users to customize 
> the cluster deployment according to their needs. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
