mmiklavc commented on a change in pull request #1365: Metron-2050:
Automatically populate a list of enrichments from HBase
URL: https://github.com/apache/metron/pull/1365#discussion_r273648026
##########
File path:
metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/configuration/metron-env.xml
##########
@@ -121,6 +121,14 @@
<empty-value-valid>true</empty-value-valid>
</value-attributes>
</property>
+ <property>
Review comment:
@merrimanr - I played around with this a bit. The one benefit of the way I
have it is that the user can change the HDFS URL if there's an issue
with it for any reason; on install, the default recommended value is pulled
from the property `fs.defaultFS`. My main concern with having this property at
all is NameNode HA. I can't use the same approach as was done for Storm and
ZooKeeper because I can't get the NameNode port, i.e. `hdfs_url =
default("/clusterHostInfo/namenode_host", [])` only gives me a host, no port,
and there's no other property in clusterHostInfo to obtain it. I was,
however, able to find a variation on the service_advisor style that lets
me grab it from inside the params files:
`config["configurations"]["core-site"]["fs.defaultFS"]`. That gives me the full
URL `hdfs://node1:8020` as desired. The shortcoming here is that, if there's an
issue, there is no way to change this value without modifying the Python files.
But I believe this should still work with NameNode HA just fine, per
[https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithNFS.html)
> fs.defaultFS - the default path prefix used by the Hadoop FS client when
none is given
>
> Optionally, you may now configure the default path for Hadoop clients to
use the new HA-enabled logical URI. If you used “mycluster” as the nameservice
ID earlier, this will be the value of the authority portion of all of your HDFS
paths. This may be configured like so, in your core-site.xml file:
```
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>
```
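To make the HA vs. non-HA difference concrete, here is a minimal Python sketch of what each form of `fs.defaultFS` contains. The helper name `describe_default_fs` is made up for illustration and is not part of Metron or Ambari; the two URLs are the ones mentioned in this comment.

```python
from urllib.parse import urlsplit


def describe_default_fs(default_fs):
    """Split an fs.defaultFS value into (scheme, host or nameservice ID, port).

    Hypothetical helper for illustration only. The port is None when the
    value carries no explicit port, as with an HA logical URI.
    """
    parts = urlsplit(default_fs)
    return parts.scheme, parts.hostname, parts.port


# Non-HA style value: the authority is a concrete host and port.
single = describe_default_fs("hdfs://node1:8020")

# HA style value: the authority is the nameservice ID, with no port at all.
ha = describe_default_fs("hdfs://mycluster")
```

In both cases the value can be passed along as-is, which is why reading `fs.defaultFS` directly should avoid the missing-port problem entirely.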
I guess I don't have a strong opinion either way - both can work. Exposing
the property means that the user can change it if they need to, but they will
also need to manage it manually whenever the NameNode nameservice ID changes.
Not exposing it means they cannot change it at all, but it will always pull
the latest ID after any NameNode HA changes, etc. Which do you think is
better? Given that HA will probably work fine with `fs.defaultFS`, I'm leaning
towards not exposing the property.
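For reference, a rough sketch of the two lookups discussed above, run against a made-up command config. The `lookup` function is a hypothetical stand-in for Ambari's `default(...)` helper, written only so the example is self-contained, and the dictionary contents are assumptions rather than real cluster output.

```python
def lookup(config, path, fallback):
    """Hypothetical stand-in for Ambari's default(): walk a '/'-separated
    path through nested dicts and return fallback if any key is missing."""
    node = config
    for key in path.strip("/").split("/"):
        if not isinstance(node, dict) or key not in node:
            return fallback
        node = node[key]
    return node


# Made-up config for illustration; a real one is supplied by Ambari.
config = {
    "clusterHostInfo": {"namenode_host": ["node1"]},
    "configurations": {"core-site": {"fs.defaultFS": "hdfs://node1:8020"}},
}

# clusterHostInfo approach: yields host names only, no port.
namenode_hosts = lookup(config, "/clusterHostInfo/namenode_host", [])

# core-site approach: yields the full URL, read fresh on each invocation.
hdfs_url = config["configurations"]["core-site"]["fs.defaultFS"]
```

The second lookup is the "not exposed" option: nothing for the user to edit, but the value follows whatever `core-site` currently says.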