[
https://issues.apache.org/jira/browse/AMBARI-6275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14178617#comment-14178617
]
John Speidel commented on AMBARI-6275:
--------------------------------------
For 2.0, here are what I consider to be the minimum requirements for blueprint
add hosts:
- add 1-n hosts with a single api call
- api call is asynchronous and returns request information which can be
used to track the request status
- provide api syntax that allows for mapping of components to new hosts (See
description for more on this)
- new hosts can contain slave and/or client components for services already
represented in the cluster (restrict adding of new master components)
- automatically update configurations as necessary
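
To make the first three requirements concrete, here is a rough Python sketch of
what a single multi-host request body and a status check might look like. The
field names ("blueprint", "host_group", "host_name"), the status strings, and
both helper functions are hypothetical placeholders for illustration, not a
proposed or existing Ambari syntax.

```python
import json


def build_add_hosts_request(blueprint, host_groups):
    """Build one request body that adds every listed host.

    host_groups maps a host group name to the host names that should join it.
    All field names are hypothetical, not a confirmed api syntax.
    """
    entries = []
    for group, hosts in sorted(host_groups.items()):
        if not hosts:
            raise ValueError(f"host group {group!r} has no hosts")
        for host in hosts:
            entries.append(
                {"blueprint": blueprint, "host_group": group, "host_name": host}
            )
    return entries


def is_finished(request_status):
    """Return True once a polled request status is terminal.

    The status names are assumptions made for this sketch.
    """
    return request_status in {"COMPLETED", "FAILED", "ABORTED"}


# One call carries all new hosts; the caller would then poll the returned
# request until is_finished() reports a terminal status.
body = build_add_hosts_request(
    "my-blueprint", {"slaves": ["h1.example.com", "h2.example.com"]}
)
print(json.dumps(body, indent=2))
```

The point of the sketch is only that 1-n hosts travel in one body and that the
caller tracks a request rather than blocking on the call.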
Restrictions:
- can't add master components
- can't modify configuration for existing hostgroups
- can't modify components on existing hosts
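
As an illustration of how the first and third restrictions could be enforced
before any work is scheduled, here is a minimal validation sketch. The category
table is hard-coded purely for the example (a real implementation would
presumably read categories from the stack definition), and validate_add_hosts
is a hypothetical helper, not existing Ambari code.

```python
# Illustrative component categories, hard-coded only for this sketch.
COMPONENT_CATEGORY = {
    "NAMENODE": "MASTER",
    "ZOOKEEPER_SERVER": "MASTER",
    "DATANODE": "SLAVE",
    "NODEMANAGER": "SLAVE",
    "HDFS_CLIENT": "CLIENT",
}


def validate_add_hosts(new_hosts, existing_hosts):
    """Return a list of restriction violations (empty means acceptable).

    new_hosts maps a host name to the components requested for it;
    existing_hosts is the set of host names already in the cluster.
    """
    errors = []
    for host, components in new_hosts.items():
        if host in existing_hosts:
            errors.append(f"{host}: existing hosts cannot be modified")
        for comp in components:
            category = COMPONENT_CATEGORY.get(comp)
            if category is None:
                errors.append(f"{host}: unknown component {comp}")
            elif category == "MASTER":
                errors.append(f"{host}: master component {comp} cannot be added")
    return errors


print(validate_add_hosts({"h3": ["DATANODE", "NAMENODE"]}, {"h1", "h2"}))
```

Collecting all violations instead of failing on the first one lets a headless
caller see every problem with a request in a single response.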
Additional Concerns:
In some cases, adding hosts to a cluster could require additional
context-specific operations. For example, rebalancing HDFS after adding
datanodes, or updating configuration properties such as “hive.zookeeper.quorum”
when a ZK server is added. Restarting services/components may also be
necessary in some cases. These context-specific operations may need to be
accounted for in the api. Unlike the UI, blueprint scaling operations are
likely to be headless (no administrator sitting in a chair adding the hosts),
so the UI's approach of notifying the user to restart a service, etc. may not
work for blueprints. Adding multiple hosts in a single request will likely
perform better when context-specific operations like HDFS rebalancing are
needed, since these could be executed once after all of the hosts are added.
Determining how to deal with this issue will likely be the most difficult
aspect of adding a scaling api. The above restrictions are in place primarily
to minimize these concerns.
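
The "executed once after all of the hosts are added" point can be sketched as
a small planning step. The operation names ("ADD_HOST", "REBALANCE_HDFS") and
the rule that a new DATANODE implies a rebalance are assumptions made only for
this example.

```python
def follow_up_operations(components):
    """Map newly added components to follow-up operations (illustrative rule)."""
    ops = []
    if "DATANODE" in components:
        ops.append("REBALANCE_HDFS")
    return ops


def plan_scaling(new_hosts):
    """Return the ordered steps for one request: every host addition first,
    then each distinct follow-up operation exactly once."""
    steps = [f"ADD_HOST {host}" for host in new_hosts]
    follow_ups = []
    for components in new_hosts.values():
        for op in follow_up_operations(components):
            if op not in follow_ups:
                follow_ups.append(op)
    return steps + follow_ups


plan = plan_scaling({"h1": ["DATANODE"], "h2": ["DATANODE"], "h3": ["DATANODE"]})
print(plan)
```

With three separate single-host requests the rebalance would be a candidate
three times; with one request it appears once at the end of the plan.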
Possible approaches (certainly not exhaustive) for dealing with additional
context-specific operations during cluster scaling:
- Don’t account for this in the api. This is obviously the easiest to
implement but the least usable. Somehow a user would need to determine which
additional operations are needed and then figure out how to execute them via
the api after the scaling operation completes. Because of the extreme
difficulties the user would have in successfully using the api with this
approach, I don’t consider this a viable option.
- Include the suggested/required operations in the response to the scaling
operation. This could include hrefs and descriptions for each action. These
actions could be placed in “suggested” and “required” categories to indicate
necessity. This is better than doing nothing but is still complicated by the
fact that the script executing the scaling operation would need to process the
response and have the necessary logic for determining which operations to
invoke and then invoking the commands after the asynchronous scaling operation
completes.
- Handle operations automatically. After executing all of the add host
operations for a scaling operation, we could also determine and execute any
additional context specific operations for the scaling operation with no
further user input. This is very user friendly (assuming we make the correct
decisions) but likely wouldn’t provide the necessary level of control. Some
operations, such as HDFS rebalancing after adding datanodes, are not strictly
required, so executing them autonomously would likely not be the best solution:
a user may want to explicitly control when the operation occurs.
- Likely the best solution would be somewhere between fully manual and
autonomous execution of related tasks. A user could specify operations to
occur after adding the hosts in the scaling operation. We could potentially
allow operations scoped at different granularities. For example, we might want
to allow a user to specify that all required operations be executed. Or at a
more granular level, “all required service restarts”. Or at an even more
granular level, “HDFS service restart”. To make this work, a user would need
to know which operations would be relevant (suggested/required) for a scaling
operation prior to invoking the scaling api. One solution for this discovery
would be to allow the user to ask the api for the set of relevant operations
for a scaling operation without actually executing the scaling operation. The
api syntax for specifying the “commands” to execute would take a lot of thought
to ensure that it is easy to use but also flexible and extensible.
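
To illustrate the last approach, here is a rough sketch of selecting follow-up
operations at different granularities from a discovered list. The Operation
type, the "suggested"/"required" values as fields, the sample list, and the
select helper are all hypothetical; the sample list stands in for what a
"discover without executing" call might return.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    name: str       # kind of operation, e.g. "restart" or "rebalance"
    service: str    # service it applies to, e.g. "HDFS"
    necessity: str  # "required" or "suggested"


# Stand-in for the result of a hypothetical discovery (dry-run) call.
DISCOVERED = [
    Operation("rebalance", "HDFS", "suggested"),
    Operation("restart", "HDFS", "required"),
    Operation("restart", "HIVE", "required"),
]


def select(ops, necessity=None, name=None, service=None):
    """Filter operations by scope; a criterion left as None matches anything."""
    return [
        op
        for op in ops
        if (necessity is None or op.necessity == necessity)
        and (name is None or op.name == name)
        and (service is None or op.service == service)
    ]


all_required = select(DISCOVERED, necessity="required")
required_restarts = select(DISCOVERED, necessity="required", name="restart")
hdfs_restart = select(DISCOVERED, name="restart", service="HDFS")
print(len(all_required), len(required_restarts), len(hdfs_restart))
```

Each additional criterion narrows the scope, which mirrors the progression from
"all required operations" to "all required service restarts" to "HDFS service
restart".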
At this time, given the 2.0 timeline, I feel that the minimum requirements
shouldn't account for any additional context-specific operations. By
restricting the addition of master components we largely eliminate the need
to update configurations and restart services for all services in the current
stacks. For one of the primary use cases, adding additional DATANODE hosts,
rebalancing of HDFS would have to be done via a separate api call after the
request completes. Providing a robust "add host" api without the above
restrictions that properly handles/identifies all additional context-specific
operations will be complicated and will likely be iterative, accomplished via
many finer-grained steps.
> Add support for "add hosts" with Blueprints API
> -----------------------------------------------
>
> Key: AMBARI-6275
> URL: https://issues.apache.org/jira/browse/AMBARI-6275
> Project: Ambari
> Issue Type: Improvement
> Components: ambari-server
> Affects Versions: 1.7.0
> Reporter: Yusaku Sako
> Assignee: John Speidel
> Fix For: 2.0.0
>
>
> Support for "adding hosts" based on *blueprint* style *host_group* via Ambari
> REST API. There are two scenarios to consider for this JIRA:
> 1) Add hosts based on an existing host in the cluster (and its *blueprint*
> style *host_group* component layout). This enables the user to add hosts with
> components similar to existing hosts in the cluster. For example: expand this
> cluster with these X hosts and make each of these hosts like Y host
> (components + configs) existing in the cluster.
> 2) Add hosts based on components + configs. This would be a verbose method
> that uses *blueprint* style *host_groups* and *configs* to allow you to add
> hosts to a cluster that do not necessarily have a component layout or config
> of a similar host existing in the cluster. For example: expand this cluster
> with these X hosts and make each of these hosts include Y components with Z
> configs.