Fallback logic would be good, perhaps required. Is there any documentation on dfs.namenode.shared.edits.dir that I can refer to regarding its content and how reliable it is?

My original thoughts were:
* For fresh NN HA enablements and fresh blueprint deployments, it will work fine, as Ambari can add and rely on the existence of "dfs.internal.nameservices".
* For existing clusters, during Ambari upgrade it can automatically copy the value from dfs.nameservices to dfs.internal.nameservices if the value is not a comma-separated list.
** If the value is a comma-separated list, then we already have a problem, so those deployments need to fix it manually after the upgrade, as Ambari can now understand dfs.internal.nameservices.
* We may still need fallback logic.
** For the fallback, I was thinking of falling back to the current logic (mostly, as I did not know about dfs.namenode.shared.edits.dir).
** A fallback should not be needed if we can fix the configs automatically during Ambari upgrade.

-Sumit
________________________________________
From: Henning Kropp <[email protected]>
Sent: Friday, April 01, 2016 12:49 PM
To: [email protected]
Subject: Re: Support for Multiple Nameservices

Hi Sumit,

very interesting. I will certainly explore the possibility of using dfs.internal.nameservices (leaving federation for another day :)

Do you think a fallback to dfs.nameservices (the current behaviour), if internal nameservices is not set, is needed?

Regards,
Henning

On 01/04/16 at 20:30, Sumit Mohanty wrote:
> Hi Henning.
>
> It's a very good problem that you brought up.
>
> https://community.hortonworks.com/questions/8989/how-to-use-name-service-id-between-to-clusters.html
> talks about "dfs.internal.nameservices" and if Ambari populates this and
> uses it for all its internal scripts/alerts then that may be a solution.
>
> This, as you notice, also says "nameservices" (plural), which points to HDFS
> federation. That is probably a problem for a different day.
> In any case, in the absence of HDFS federation, we can have Ambari refer to
> "dfs.internal.nameservices" for its own use. The requirement will be to have
> the NN HA wizard and Blueprint deployment populate this property.
>
> AMBARI-15615 modified the NN HA wizard to populate the property. But as far
> as I can see no other changes have happened.
>
> I was not aware of hdfs_site['dfs.namenode.shared.edits.dir'] as a possible
> solution, but talking to a few folks from HDFS it seems that populating/using
> "dfs.internal.nameservices" will be more desirable.
>
> Do you want to explore the possibility of using "dfs.internal.nameservices"?
>
> thanks
> Sumit
> ________________________________________
> From: Henning Kropp <[email protected]>
> Sent: Friday, April 01, 2016 10:38 AM
> To: [email protected]
> Subject: Support for Multiple Nameservices
>
> Hi,
>
> we noticed that throughout Ambari hdfs_site['dfs.nameservices'] is
> treated as a string denoting just one nameservice. It is possible to
> configure multiple nameservices, for example to support seamless distcp
> between two HA clusters. In general this makes it possible to use
> multiple nameservices with HDFS.
>
> This makes dfs.nameservices a comma-separated string holding multiple
> nameservices. Doing this with the current Ambari release leads
> to multiple problems. One is that Ambari marks the restart of HDFS as
> failed, even though the restart was successful. The reason for that is
> that hdfs_resources.py does not work with a comma-separated list of
> nameservices (AMBARI-15506).
>
> We created an umbrella JIRA to track the other issues (AMBARI-15507).
> Problems with Blueprint install, because the DNs were registered with
> the other cluster (AMBARI-15509). Web alerting for NNs does not work
> (AMBARI-15508).
>
> There might be other places where dfs.nameservices is treated as just a
> string? How can web alerting be refactored to work with multiple
> nameservices?
>
> Also I would appreciate any feedback about the function to resolve the
> nameservice for the current cluster.
>
> For AMBARI-15506 I defined the following nameservice resolution:
> 1. Split the names by ','.
> 2. For each name, check if the string is also contained in
> dfs.namenode.shared.edits.dir. Typically this is
> journalnode1,journalnode2,journalnode3:port/servicename. Here it would
> probably be better to verify the name with fs.defaultFS, but this is
> part of core-site, not hdfs-site, which would add a separate dependency.
> For namenode_ha_utils.py this seemed like an issue to me, because
> refactoring all the Python scripts to also include the core-site seemed
> much more involved.
> 3. As a default fallback, the first string in the list is used as the
> nameservice.
>
> Thanks and regards,
> Henning
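
For reference, the three resolution steps quoted above could be sketched roughly as follows. This is only an illustration, not the code from AMBARI-15506: the function name resolve_nameservice and the plain dict standing in for hdfs_site are assumptions, and the sketch compares each name against the last path segment of dfs.namenode.shared.edits.dir instead of doing a plain substring check, so that a short name cannot accidentally match part of a JournalNode hostname.

```python
def resolve_nameservice(hdfs_site):
    """Guess which nameservice belongs to the local cluster.

    hdfs_site is assumed to be a plain dict of hdfs-site properties
    (a hypothetical stand-in for what the Ambari scripts receive).
    """
    # Step 1: split the comma-separated value into candidate names.
    raw = hdfs_site.get("dfs.nameservices", "")
    names = [name.strip() for name in raw.split(",") if name.strip()]
    if not names:
        return None

    # Step 2: look for a candidate that matches the journal ID, i.e. the
    # last path segment of the shared edits URI. Assumption: this is
    # stricter than the substring check described in the thread.
    shared_edits = hdfs_site.get("dfs.namenode.shared.edits.dir", "")
    journal_id = shared_edits.rstrip("/").rsplit("/", 1)[-1]
    for name in names:
        if name == journal_id:
            return name

    # Step 3: default fallback, the first name in the list.
    return names[0]


if __name__ == "__main__":
    site = {
        "dfs.nameservices": "remotecluster,mycluster",
        "dfs.namenode.shared.edits.dir":
            "qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster",
    }
    print(resolve_nameservice(site))
```

If "dfs.internal.nameservices" ends up being populated by the NN HA wizard and Blueprint deployments, a check for that property could be placed in front of step 1, leaving these steps as the fallback for clusters where it is not set.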
