[ https://issues.apache.org/jira/browse/HDFS-9117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994534#comment-14994534 ]
Bob Hansen commented on HDFS-9117:
----------------------------------

[~wheat9]: thanks for the feedback and for carrying the conversation forward.

The primary use case of the Configuration class (as I see it) is to provide compatibility not with libhdfs, but with deployed Hadoop Java environments. As an example, let's say we are writing a native replacement for the dfs tool using the native libhdfs++ codebase (not the libhdfs compatibility layer) that can do "-ls", "-copyFromLocal", etc. To provide Least Astonishment for our consumers, they would expect that a properly configured Hadoop node [with HADOOP_HOME pointing to /etc/hadoop-2.9.9 and its config files] could run "hdfspp -ls /tmp" and have it automatically find the NN and configure the communications parameters correctly to talk to their cluster.

To fully support that use case, we need to read XML in the currently-deployed file format (which specifies that we honor "final" tags where they appear in the files), and dereference at least HADOOP_HOME when loading the default files. We could force our consumers to do that themselves, but that doesn't seem a kindness for code we need to write anyway. We also need to be able to read the encodings that are being used in the field (such as "1M" for buffer sizes).

If we really think that config-value substitution and environment-variable substitution are exceedingly rare in the field, we can defer the work, but I am concerned that we will deploy a libhdfs++ application to the field only to find that it can't read an early adopter's config file. That work has already been shuffled off to HDFS-9385, so we can revisit it later.

Other use cases may not need to read existing hdfs-site.xml files, which is why I think you are wise in having a separation between the Config reader and the Options object. I agree with your concern that the libhdfs++ default Options object will get out of sync with the Java defaults, and I will happily write a unit test that verifies that they stay together.
[~wheat9] - given this context, do you agree that we need to support libhdfs++ compatibility with the hdfs-site.xml files that are already deployed at customer sites?

> Config file reader / options classes for libhdfs++
> --------------------------------------------------
>
>                 Key: HDFS-9117
>                 URL: https://issues.apache.org/jira/browse/HDFS-9117
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>    Affects Versions: HDFS-8707
>            Reporter: Bob Hansen
>            Assignee: Bob Hansen
>         Attachments: HDFS-9117.HDFS-8707.001.patch, HDFS-9117.HDFS-8707.002.patch, HDFS-9117.HDFS-8707.003.patch, HDFS-9117.HDFS-8707.004.patch, HDFS-9117.HDFS-8707.005.patch, HDFS-9117.HDFS-8707.006.patch, HDFS-9117.HDFS-8707.008.patch, HDFS-9117.HDFS-8707.009.patch, HDFS-9117.HDFS-8707.010.patch, HDFS-9117.HDFS-8707.011.patch, HDFS-9117.HDFS-8707.012.patch, HDFS-9117.HDFS-9288.007.patch
>
> For environmental compatibility with HDFS installations, libhdfs++ should be able to read the configurations from Hadoop XML files and behave in line with the Java implementation.
> Most notably, machine names and ports should be readable from Hadoop XML configuration files.
> Similarly, an internal Options architecture for libhdfs++ should be developed to efficiently transport the configuration information within the system.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)