Hi,

We noticed that throughout Ambari hdfs_site['dfs.nameservices'] is treated as a string denoting just one nameservice. It is possible to configure multiple nameservices, for example to support seamless distcp between two HA clusters. In general this makes it possible to use multiple nameservices with HDFS.

This makes dfs.nameservices a comma-separated string holding multiple nameservices. With the current Ambari release, such a configuration leads to several problems. One is that Ambari marks the restart of HDFS as failed, even though the restart was successful. The reason is that hdfs_resources.py does not work with a comma-separated list of nameservices (AMBARI-15506).
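To illustrate the kind of failure, here is a minimal sketch. It is not the actual Ambari code; the function names and the sample values (nsA, nsB, nn1, nn2) are made up for illustration. It only shows why building a property key from the raw dfs.nameservices value breaks as soon as the value is a list.

```python
# Hypothetical hdfs-site properties for two HA nameservices.
hdfs_site = {
    'dfs.nameservices': 'nsA,nsB',
    'dfs.ha.namenodes.nsA': 'nn1,nn2',
    'dfs.ha.namenodes.nsB': 'nn1,nn2',
}


def namenode_ids_single(hdfs_site):
    """Treat dfs.nameservices as one name (the problematic assumption)."""
    name_service = hdfs_site['dfs.nameservices']
    # With 'nsA,nsB' this looks up 'dfs.ha.namenodes.nsA,nsB',
    # a key that does not exist in the configuration.
    return hdfs_site.get('dfs.ha.namenodes.' + name_service)


def namenode_ids_per_service(hdfs_site):
    """Split the list first and look up each nameservice separately."""
    result = {}
    for name in hdfs_site['dfs.nameservices'].split(','):
        name = name.strip()
        ids = hdfs_site.get('dfs.ha.namenodes.' + name, '')
        result[name] = [i.strip() for i in ids.split(',') if i.strip()]
    return result
```

With the sample properties, the first function finds nothing, while the second returns the NameNode ids for both nsA and nsB.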

We created an umbrella JIRA, AMBARI-15507, to track the other issues. So far these are: problems with Blueprint installs, because the DNs were registered with the other cluster (AMBARI-15509), and web alerting for NNs not working (AMBARI-15508).

Might there be other places where dfs.nameservices is treated as just a single string? How can web alerting be refactored to work with multiple nameservices?

Also, I would appreciate any feedback on the function that resolves the nameservice of the current cluster.

For AMBARI-15506 I defined the following nameservice resolution:
1. Split dfs.nameservices by ','.
2. For each nameservice, check whether its name is also contained in dfs.namenode.shared.edits.dir. Typically this has the form qjournal://journalnode1:port;journalnode2:port;journalnode3:port/servicename. It would probably be better to verify the name against fs.defaultFS, but that property is part of core-site, not hdfs-site, which would add a separate dependency. For namenode_ha_utils.py this seemed like an issue to me, because refactoring all the Python scripts to also include core-site seemed much more involved.
3. As a default fallback, the first string in the list is used as the nameservice.
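The steps above can be sketched roughly as follows. This is a simplified illustration, not the patch attached to AMBARI-15506; the function name resolve_nameservice and the sample host names are assumptions, and hdfs_site is taken to be a plain dict of hdfs-site properties.

```python
def resolve_nameservice(hdfs_site):
    """Return the nameservice of the local cluster, or None if none is set."""
    # Step 1: split the comma-separated list and drop empty entries.
    raw = hdfs_site.get('dfs.nameservices', '')
    names = [n.strip() for n in raw.split(',') if n.strip()]
    if not names:
        return None

    # Step 2: prefer the name that also appears in the shared edits dir.
    edits_dir = hdfs_site.get('dfs.namenode.shared.edits.dir', '')
    for name in names:
        if name in edits_dir:
            return name

    # Step 3: fall back to the first configured nameservice.
    return names[0]


# Example with made-up values: the local cluster is nsB.
example = {
    'dfs.nameservices': 'nsA,nsB',
    'dfs.namenode.shared.edits.dir':
        'qjournal://jn1:8485;jn2:8485;jn3:8485/nsB',
}
local_nameservice = resolve_nameservice(example)
```

One caveat of the plain substring check: if one nameservice name is a prefix of another (say ns1 and ns10), or happens to occur in a JournalNode host name, the wrong name could match. Comparing against the last path segment of the edits dir might be more robust.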

Thanks and regards,
Henning
