mmiklavc commented on a change in pull request #1365: Metron-2050: 
Automatically populate a list of enrichments from HBase
URL: https://github.com/apache/metron/pull/1365#discussion_r273648026
 
 

 ##########
 File path: 
metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/configuration/metron-env.xml
 ##########
 @@ -121,6 +121,14 @@
             <empty-value-valid>true</empty-value-valid>
         </value-attributes>
     </property>
+    <property>
 
 Review comment:
   @merrimanr - I played around with this a bit. The one benefit of the way I 
have it is that the user can change the HDFS URL if there is ever a problem 
with it; on install, the default recommended value is pulled from the 
`fs.defaultFS` property. My main concern with having this property at all is 
NameNode HA. I can't use the same approach that was used for Storm and 
ZooKeeper because I can't get the NameNode port, i.e. `hdfs_url = 
default("/clusterHostInfo/namenode_host", [])` only gives me a host, no port, 
and there's no other property in clusterHostInfo that provides it. I was, 
however, able to find a variation on the service_advisor approach that lets 
me grab it from inside the params files: 
`config["configurations"]["core-site"]["fs.defaultFS"]`. That gives me the full 
URL `hdfs://node1:8020`, as desired. The shortcoming is that if there's an 
issue, there is no way to change the value without modifying the Python files. 
But I believe this should still work fine with NameNode HA, per 
[https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html)
   
   > fs.defaultFS - the default path prefix used by the Hadoop FS client when 
none is given
   > 
   > Optionally, you may now configure the default path for Hadoop clients to 
use the new HA-enabled logical URI. If you used “mycluster” as the nameservice 
ID earlier, this will be the value of the authority portion of all of your HDFS 
paths. This may be configured like so, in your core-site.xml file:
   
   ```
   <property>
     <name>fs.defaultFS</name>
     <value>hdfs://mycluster</value>
   </property>
   ```
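   
   To make the difference between the two URI forms concrete, here is a small 
illustrative sketch using only the Python standard library. It does not touch 
Ambari; the helper name `describe_default_fs` is hypothetical, and the two 
sample values are the ones mentioned in this comment.
   
   ```python
   from urllib.parse import urlparse


   def describe_default_fs(default_fs):
       """Split an fs.defaultFS value into its authority and optional port.

       Hypothetical helper for illustration only; not part of Metron or Ambari.
       """
       parsed = urlparse(default_fs)
       return {
           "scheme": parsed.scheme,
           "authority": parsed.hostname,
           # None when the URI carries no port, as with an HA logical URI
           "port": parsed.port,
       }


   # Non-HA: the authority is a NameNode host and an explicit port is present.
   non_ha = describe_default_fs("hdfs://node1:8020")

   # HA: the authority is the nameservice ID and there is no port to look up.
   ha = describe_default_fs("hdfs://mycluster")
   ```
   
   In both cases the value can be passed along as-is, so nothing downstream 
needs to know the NameNode port.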
   
   I guess I don't have a strong opinion either way - both can work. Exposing 
the property means that users can change it if they need to, but they will also 
need to manage it manually whenever the NameNode nameservice ID changes. 
Not exposing it means they cannot change it at all, but it will always pick up 
the latest ID after any NameNode HA changes. Which do you think is 
better? Since HA will probably work fine with fs.defaultFS, I'm leaning 
towards not exposing the property.
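   
   For comparison, a rough sketch of the two options side by side. This is 
not the actual params code: a plain dict stands in for the Ambari `config` 
object, and `resolve_hdfs_url` and `user_override` are hypothetical names used 
only for illustration.
   
   ```python
   def resolve_hdfs_url(config, user_override=None):
       """Return the HDFS URL to use.

       user_override models the "exposed property" option: when set, it wins
       and the user is responsible for keeping it current.
       With no override, the value always tracks fs.defaultFS in core-site.
       """
       if user_override:
           return user_override
       return config["configurations"]["core-site"]["fs.defaultFS"]


   # Stand-in for the Ambari config on a non-HA cluster (assumed shape).
   config = {"configurations": {"core-site": {"fs.defaultFS": "hdfs://node1:8020"}}}

   # Not exposed: follows core-site.
   tracked = resolve_hdfs_url(config)

   # Exposed: the user pinned a value at install time.
   pinned = resolve_hdfs_url(config, user_override="hdfs://node1:8020")

   # After enabling NameNode HA, core-site changes to the logical URI.
   config["configurations"]["core-site"]["fs.defaultFS"] = "hdfs://mycluster"
   tracked_after_ha = resolve_hdfs_url(config)
   pinned_after_ha = resolve_hdfs_url(config, user_override="hdfs://node1:8020")
   ```
   
   After the HA change, the tracked value follows core-site while the pinned 
value stays stale until someone updates it by hand, which is the trade-off 
described above.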
