[ 
https://issues.apache.org/jira/browse/AMBARI-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178617#comment-14178617
 ] 

John Speidel commented on AMBARI-6275:
--------------------------------------

For 2.0, these are what I consider to be the minimum requirements for blueprint 
add hosts:
- add 1-n hosts with a single api call
- api call is asynchronous and will return request information which can be 
used to track the request status
- provide api syntax that allows for mapping of components to new hosts (See 
description for more on this)
- new hosts can contain slave and/or client components for services already 
represented in the cluster (restrict adding of new master components)
- automatically update configurations as necessary
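
A rough sketch of what a single "add 1-n hosts" request body might look like 
when built by a client script.  The field names ("blueprint", "host_group", 
"host_name") and the function name are assumptions for illustration only; the 
actual api syntax is still to be decided.

```python
import json


def build_add_hosts_request(blueprint, host_group_to_hosts):
    """Build one request body that adds every listed host in a single call.

    All field names are hypothetical placeholders, not a defined api.
    """
    entries = []
    # Sort host groups so the generated body is deterministic.
    for host_group, hosts in sorted(host_group_to_hosts.items()):
        for host in hosts:
            entries.append(
                {"blueprint": blueprint, "host_group": host_group, "host_name": host}
            )
    if not entries:
        raise ValueError("at least one host is required")
    return entries


body = build_add_hosts_request(
    "my-blueprint",
    {"slave_group": ["host5.example.com", "host6.example.com"]},
)
print(json.dumps(body, indent=2))
```

Mapping each new host to a host group is what lets the server derive the 
components for that host without the caller listing them individually.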

Restrictions:
- can't add master components
- can't modify configuration for existing hostgroups 
- can't modify components on existing hosts
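
As an illustration only, two of these restrictions could be checked up front 
before any work is scheduled.  The master component set below is a small 
hand-picked subset, and the function and parameter names are hypothetical; the 
host group configuration restriction is not covered here.

```python
# Illustrative subset only; a real list would come from the stack definition.
MASTER_COMPONENTS = {"NAMENODE", "RESOURCEMANAGER", "HBASE_MASTER"}


def validate_add_hosts(new_host_to_group, existing_hosts, group_components):
    """Return a list of violations; an empty list means the request is acceptable."""
    errors = []
    for host, group in sorted(new_host_to_group.items()):
        # Restriction: components on existing hosts can't be modified.
        if host in existing_hosts:
            errors.append(f"{host}: already in the cluster, cannot modify its components")
        # Restriction: new hosts can't carry master components.
        masters = sorted(MASTER_COMPONENTS & set(group_components.get(group, [])))
        if masters:
            errors.append(f"{host}: host group '{group}' contains master components {masters}")
    return errors


errors = validate_add_hosts(
    {"host5.example.com": "slave_group", "host6.example.com": "master_group"},
    existing_hosts={"host1.example.com"},
    group_components={
        "slave_group": ["DATANODE", "HDFS_CLIENT"],
        "master_group": ["NAMENODE", "DATANODE"],
    },
)
for error in errors:
    print(error)
```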

Additional Concerns:
In some cases, adding hosts to a cluster could require additional 
context-specific operations.  For example, rebalancing HDFS after adding 
datanodes, or updating configuration properties such as “hive.zookeeper.quorum” 
when a ZK server is added.  Restarting services/components may also be 
necessary in some cases.  These additional context-specific operations may need 
to be accounted for in the api.  Unlike the UI, blueprint scaling operations 
are likely to be headless (no administrator sitting in a chair adding the 
hosts), so the UI's approach of notifying the user to restart a service, etc. 
may not work for blueprints.  Adding multiple hosts in a single request will 
likely perform better in cases where additional context-specific operations 
like HDFS rebalancing are needed, since these could be executed once after all 
of the hosts are added.  Determining how to deal with this issue will likely be 
the most difficult aspect of adding a scaling api.  The above restrictions are 
in place primarily to minimize these concerns.
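
To illustrate why batching helps, here is a minimal sketch that derives the 
follow-up operations once for a whole request rather than once per host.  The 
rule table only encodes the two examples above, and the names and operation 
strings are placeholders, not existing Ambari identifiers for these actions.

```python
# Hypothetical rule table: component added -> follow-up operation it triggers.
FOLLOW_UPS = {
    "DATANODE": "rebalance HDFS",
    "ZOOKEEPER_SERVER": "update hive.zookeeper.quorum",
}


def follow_ups_for_batch(host_to_components):
    """Collect each follow-up operation at most once for the whole batch."""
    operations = []
    for components in host_to_components.values():
        for component in components:
            operation = FOLLOW_UPS.get(component)
            if operation is not None and operation not in operations:
                operations.append(operation)
    return operations


batch = {
    "host5.example.com": ["DATANODE"],
    "host6.example.com": ["DATANODE"],
    "host7.example.com": ["DATANODE", "HDFS_CLIENT"],
}
follow_ups = follow_ups_for_batch(batch)
print(follow_ups)
```

Three new datanode hosts yield a single rebalance here, whereas three separate 
one-host requests would each suggest their own.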

Possible approaches (certainly not exhaustive) for dealing with additional 
context-specific operations during cluster scaling:
- Don’t account for this in the api.  This is obviously the easiest to 
implement but the least usable.   Somehow a user would need to determine which 
additional operations are needed and then figure out how to execute them via 
the api after the scaling operation completes.  Because of the extreme 
difficulties the user would have in successfully using the api with this 
approach, I don’t consider this a viable option.

- Include the suggested/required operations in the response to the scaling 
operation.  This could include hrefs and descriptions for each action. These 
actions could be placed in “suggested” and “required” categories to indicate 
necessity.  This is better than doing nothing but is still complicated by the 
fact that the script executing the scaling operation would need to process the 
response and have the necessary logic for determining which operations to 
invoke and then invoking the commands after the asynchronous scaling operation 
completes.

- Handle operations automatically.  After executing all of the add host 
operations for a scaling operation, we could also determine and execute any 
additional context-specific operations for the scaling operation with no 
further user input.  This is very user friendly (assuming we make the correct 
decisions) but likely wouldn’t provide the necessary level of control.  Since 
some operations, such as HDFS rebalancing after adding datanodes, are not 
strictly required, executing them autonomously would likely not be the best 
solution, as a user may want to explicitly control when the operation occurs.

- Likely the best solution would be somewhere between fully manual and 
autonomous execution of related tasks.  A user could specify operations to 
occur after adding the hosts in the scaling operation.  We could potentially 
allow operations scoped at different granularities.  For example, we might want 
to allow a user to specify that all required operations be executed.  Or at a 
more granular level, “all required service restarts”.  Or at an even more 
granular level, “HDFS service restart”.  To make this work, a user would need 
to know which operations would be relevant (suggested/required) for a scaling 
operation prior to invoking the scaling api.  One solution for this discovery 
would be to allow the user to ask the api for the set of relevant operations 
for a scaling operation without actually executing the scaling operation.  The 
api syntax for specifying the “commands” to execute would take a lot of thought 
to ensure that it is easy to use but also flexible and extensible.
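
A minimal sketch of the last approach, assuming the discovery call returns a 
flat list of operations tagged as "required" or "suggested".  The Operation 
fields and the select function are hypothetical; they only show how one 
filter could cover all three granularities mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    service: str    # e.g. "HDFS"
    kind: str       # e.g. "restart", "rebalance"
    necessity: str  # "required" or "suggested"


def select(operations, necessity=None, kind=None, service=None):
    """Keep operations matching every criterion given; None means 'any'."""
    return [
        op
        for op in operations
        if (necessity is None or op.necessity == necessity)
        and (kind is None or op.kind == kind)
        and (service is None or op.service == service)
    ]


# Stand-in for what a "discover without executing" call might return.
discovered = [
    Operation("HDFS", "restart", "required"),
    Operation("YARN", "restart", "required"),
    Operation("HDFS", "rebalance", "suggested"),
]

all_required = select(discovered, necessity="required")
required_restarts = select(discovered, necessity="required", kind="restart")
hdfs_restart = select(discovered, kind="restart", service="HDFS")
print(len(all_required), len(required_restarts), len(hdfs_restart))
```

A caller could run the discovery step first, inspect the result, and then pass 
the chosen selection along with the real scaling request.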


At this time, given the 2.0 timeline, I feel that the minimum requirements 
shouldn't account for any additional context-specific operations.  By 
restricting the adding of master components we largely eliminate the need 
to update configurations and restart services for all services in the current 
stacks.  For one of the primary use cases, adding additional DATANODE hosts, 
rebalancing of HDFS would have to be done in a separate api call after the 
request completes.  Providing a robust "add host" api without the above 
restrictions that properly handles/identifies all additional context-specific 
operations will be complicated and will likely be iterative, accomplished via 
many finer grained steps.
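
Under the minimum requirements, a headless script would therefore have to wait 
for the asynchronous request and then trigger the rebalance itself.  A sketch 
of that flow follows; the status strings and both callables are assumptions 
standing in for whatever the api ends up exposing.

```python
import time

# Assumed terminal status names; the real set depends on the api.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "ABORTED", "TIMEDOUT"}


def wait_for_request(fetch_status, poll_seconds=5.0, max_polls=120, sleep=time.sleep):
    """Poll fetch_status() until it returns a terminal status."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError("request did not reach a terminal status")


def scale_then_rebalance(fetch_status, rebalance, **wait_options):
    """Run the follow-up only if the scaling request completed successfully."""
    status = wait_for_request(fetch_status, **wait_options)
    if status == "COMPLETED":
        rebalance()
    return status


# Simulated run: a canned status sequence and no real sleeping.
statuses = iter(["PENDING", "IN_PROGRESS", "COMPLETED"])
calls = []
result = scale_then_rebalance(
    lambda: next(statuses),
    lambda: calls.append("rebalance HDFS"),
    sleep=lambda seconds: None,
)
print(result, calls)
```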



> Add support for "add hosts" with Blueprints API
> -----------------------------------------------
>
>                 Key: AMBARI-6275
>                 URL: https://issues.apache.org/jira/browse/AMBARI-6275
>             Project: Ambari
>          Issue Type: Improvement
>          Components: ambari-server
>    Affects Versions: 1.7.0
>            Reporter: Yusaku Sako
>            Assignee: John Speidel
>             Fix For: 2.0.0
>
>
> Support for "adding hosts" based on *blueprint* style *host_group* via Ambari 
> REST API. There are two scenarios to consider for this JIRA:
> 1) Add hosts based on an existing host in the cluster (and its *blueprint* 
> style *host_group* component layout). This enables the user to add hosts with 
> components similar to existing hosts in the cluster. For example: expand this 
> cluster with these X hosts and make each of these hosts like Y host 
> (components + configs) existing in the cluster.
> 2) Add hosts based on components + configs. This would be a verbose method 
> that uses *blueprint* style *host_groups* and *configs* to allow you to add 
> hosts to a cluster that do not necessarily have a component layout or config 
> of a similar host existing in the cluster. For example: expand this cluster 
> with these X hosts and make each of these hosts include Y components with Z 
> configs. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
