ASF GitHub Bot updated AMBARI-23467:
    Labels: pull-request-available  (was: )

> Blueprint configuration support for multiple NameNode HA deployments in a 
> Federated cluster
> -------------------------------------------------------------------------------------------
>                 Key: AMBARI-23467
>                 URL: https://issues.apache.org/jira/browse/AMBARI-23467
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.7.0
>            Reporter: Robert Nettleton
>            Assignee: Robert Nettleton
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 2.7.0
> This new requirement was discovered while investigating the failures around 
> Blueprint Deployments of HA NameNodes in a Federated cluster.  
> In previous versions of Ambari (from Ambari 2.0 up to Ambari 2.6), HDFS 
> NameNode HA deployments with Blueprints relied on some custom configuration 
> in order to start up the Active and StandBy namenodes properly. In 
> particular, we had to introduce the following configuration properties in the 
> "hadoop-env" configuration type:
>  # *dfs_ha_initial_namenode_active* - This property should contain the 
> hostname for the “active” NameNode in this cluster.
>  # *dfs_ha_initial_namenode_standby* - This property should contain the host 
> name for the “passive” NameNode in this cluster.
> These properties could be set by users to determine the initial state of the 
> NameNode cluster, meaning which NameNode would be Active vs. Standby in the 
> initial startup. This was required because the startup commands for a 
> NameNode in a Blueprint deployment (which occurs from scratch) differ for the 
> Active and Standby cases. By default (the most common case), users did not 
> set this property, and the BlueprintConfigurationProcessor would choose the 
> Active and Standby nodes, and pass this information down to the ambari-agent 
> via the properties listed above. The agents would then use this configuration 
> to determine which NameNode commands to run on each node.
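As a hypothetical illustration (the host names below are made up, not from the issue), the original single-nameservice properties would be supplied in a Blueprint's "configurations" block roughly like this:

```json
{
  "configurations": [
    {
      "hadoop-env": {
        "properties": {
          "dfs_ha_initial_namenode_active": "nn1.example.com",
          "dfs_ha_initial_namenode_standby": "nn2.example.com"
        }
      }
    }
  ]
}
```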
> Based on information provided by [~swagle], 
> in a new Federated cluster, there will be an HA deployment within each HDFS 
> nameservice, consisting of a pair of Active/Standby nodes.
> The current Blueprint configuration properties mentioned above assume a 
> single nameservice, and so we'll need to introduce some new configuration in 
> order to configure the ambari-agents to start each Active/Standby NameNode 
> pair within each configured nameservice.
> Since it appears to be possible to have an arbitrary number of nameservices 
> defined, I propose that we add some new properties to "hadoop-env", with the 
> express purpose of allowing either users or the Blueprint configuration 
> processor (by default) to specify the set of Active and Standby NameNodes for 
> the initial install:
>  # *dfs_ha_initial_namenode_active_set*: A comma-separated list of Active 
> NameNode hosts, across all known nameservices defined.
>  # *dfs_ha_initial_namenode_standby_set*: A comma-separated list of Standby 
> NameNode hosts, across all known nameservices defined.
> There should be no intersection between these two sets, meaning that a host 
> can be listed in only one of these properties. The Blueprint config processor 
> should verify this in the case that these properties are customized by a 
> user. Generally, these properties should only be set by the 
> BlueprintConfigurationProcessor, but for the sake of flexibility we should 
> provide support for customization if the need arises.
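A minimal sketch of the proposed disjointness check (not the actual BlueprintConfigurationProcessor code; property parsing and function names are assumptions):

```python
# Hypothetical validation for the proposed *_set properties: parse the
# comma-separated host lists and reject any host listed in both sets.

def parse_host_set(value):
    """Parse a comma-separated host list property into a set of host names."""
    return {host.strip() for host in value.split(",") if host.strip()}

def validate_namenode_sets(active_prop, standby_prop):
    """Return (active, standby) host sets; raise if they intersect."""
    active = parse_host_set(active_prop)
    standby = parse_host_set(standby_prop)
    overlap = active & standby
    if overlap:
        raise ValueError(
            "Hosts listed as both Active and Standby NameNodes: %s"
            % ", ".join(sorted(overlap)))
    return active, standby
```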
> Since we'd like to preserve backwards compatibility with the original 
> Blueprint HA support, I think we should keep the original properties, and use 
> them in the non-Federated case. When multiple HDFS nameservices are defined, 
> then the new properties above should be used.
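The backwards-compatibility rule above can be sketched as follows (a hypothetical helper, not code from the patch; the function name and argument shapes are assumptions):

```python
# Hypothetical sketch of the proposed rule: keep the original
# single-nameservice properties, and switch to the new *_set
# properties only when more than one HDFS nameservice is defined.

def choose_ha_properties(nameservices, active_hosts, standby_hosts):
    """Return the hadoop-env properties describing initial NameNode states.

    nameservices  -- list of HDFS nameservice IDs defined in the cluster
    active_hosts  -- hosts chosen as the initial Active NameNodes
    standby_hosts -- hosts chosen as the initial Standby NameNodes
    """
    if len(nameservices) > 1:
        # Federated cluster: one Active/Standby pair per nameservice.
        return {
            "dfs_ha_initial_namenode_active_set": ",".join(active_hosts),
            "dfs_ha_initial_namenode_standby_set": ",".join(standby_hosts),
        }
    # Single nameservice: preserve the original Blueprint HA properties.
    return {
        "dfs_ha_initial_namenode_active": active_hosts[0],
        "dfs_ha_initial_namenode_standby": standby_hosts[0],
    }
```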
> This JIRA tracks the work to properly configure these properties, and add 
> them to the cluster configuration at deployment time. I'll file a separate 
> companion JIRA to track the work required in the HDFS stack scripts to parse 
> out the hostnames in these new properties, and use that information to 
> determine the Active/Standby status of a host.

This message was sent by Atlassian JIRA
