[ 
https://issues.apache.org/jira/browse/HDFS-9117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14994534#comment-14994534
 ] 

Bob Hansen commented on HDFS-9117:
----------------------------------

[~wheat9]: thanks for the feedback and carrying the conversation forward.

The primary use case of the Configuration class (as I see it) is to provide 
compatibility not with libhdfs, but with deployed Hadoop Java environments.  

As an example, let's say we are writing a native replacement for the dfs tool 
using the native libhdfs++ codebase (not the libhdfs compatibility layer) that 
can do "-ls", "-copyFromLocal", etc.  To honor the Principle of Least 
Astonishment, our consumers would expect that a properly configured Hadoop node 
[with HADOOP_HOME pointing to /etc/hadoop-2.9.9 and its config files] could run 
"hdfspp -ls /tmp" and have it automatically find the NN and configure the 
communications parameters correctly to talk to their cluster.
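
For concreteness, this is roughly the lookup I mean, assuming the usual layout 
of honoring HADOOP_CONF_DIR and falling back to $HADOOP_HOME/etc/hadoop.  The 
function and file names are only illustrative, not what the patch implements:

{code:cpp}
// Sketch: resolve the default config file locations the way deployed Hadoop
// nodes expect. Honors HADOOP_CONF_DIR if set, else $HADOOP_HOME/etc/hadoop.
#include <cstdlib>
#include <string>
#include <vector>

std::vector<std::string> DefaultConfigFiles() {
  std::string conf_dir;
  if (const char* env = std::getenv("HADOOP_CONF_DIR")) {
    conf_dir = env;
  } else if (const char* home = std::getenv("HADOOP_HOME")) {
    conf_dir = std::string(home) + "/etc/hadoop";
  } else {
    return {};  // nothing to load; caller falls back to built-in defaults
  }
  return {conf_dir + "/core-site.xml", conf_dir + "/hdfs-site.xml"};
}
{code}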

To fully support that use case, we need to read XML in the currently-deployed 
file format (which specifies that we honor "final" tags where they appear in 
the files), and dereference at least HADOOP_HOME when loading the default files.  
We could force our consumers to do that themselves, but that doesn't seem a 
kindness for code we need to write anyway.  We also need to be able to read the 
encodings that are being used in the field (such as "1M" for buffer sizes).  
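
To make those two behaviors concrete, here is a rough sketch (not the patch's 
actual API; names and types are illustrative) of honoring "final" when a later 
resource tries to override a key, and of decoding size strings like "1M":

{code:cpp}
#include <cctype>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

struct ConfigValue {
  std::string value;
  bool final = false;  // set when the XML property carried <final>true</final>
};

// Later resources may not override a key that an earlier resource marked final.
void SetProperty(std::map<std::string, ConfigValue>& props,
                 const std::string& key, const std::string& value,
                 bool is_final) {
  auto it = props.find(key);
  if (it != props.end() && it->second.final) return;  // keep the final value
  props[key] = ConfigValue{value, is_final};
}

// Accepts plain integers plus the k/m/g/t suffixes seen in deployed configs.
std::optional<int64_t> ParseByteSize(const std::string& text) {
  if (text.empty()) return std::nullopt;
  size_t pos = 0;
  int64_t base;
  try {
    base = std::stoll(text, &pos);
  } catch (...) {
    return std::nullopt;
  }
  if (pos == text.size()) return base;             // no suffix, e.g. "4096"
  if (pos + 1 != text.size()) return std::nullopt; // trailing junk
  switch (std::tolower(static_cast<unsigned char>(text[pos]))) {
    case 'k': return base << 10;
    case 'm': return base << 20;   // "1M" -> 1048576
    case 'g': return base << 30;
    case 't': return base << 40;
    default:  return std::nullopt;
  }
}
{code}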

If we really think that config substitution and environment-variable 
substitution are exceedingly rare in the field, we can defer the work, but I am 
concerned that we will deploy a libhdfs++ application to the field only to find 
that it can't read an early adopter's config file.  That work has already been 
shuffled off to HDFS-9385 so we can revisit it later.
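
By substitution I mean expansion along these lines, where ${other.key} resolves 
against the loaded properties and ${env.NAME} against the process environment.  
A loose sketch of what HDFS-9385 would cover (not a committed design):

{code:cpp}
#include <cstdlib>
#include <map>
#include <string>

// Expand ${key} and ${env.NAME} references in a config value, leaving
// unresolved references in place, as the Java Configuration does.
std::string ExpandValue(const std::map<std::string, std::string>& props,
                        std::string value) {
  size_t search_from = 0;
  for (int depth = 0; depth < 20; ++depth) {  // bound nesting to avoid cycles
    size_t start = value.find("${", search_from);
    if (start == std::string::npos) break;
    size_t end = value.find('}', start);
    if (end == std::string::npos) break;
    std::string name = value.substr(start + 2, end - start - 2);

    const char* env = nullptr;
    auto it = props.end();
    if (name.rfind("env.", 0) == 0) {
      env = std::getenv(name.c_str() + 4);
    } else {
      it = props.find(name);
    }

    if (env) {
      value.replace(start, end - start + 1, env);
    } else if (it != props.end()) {
      value.replace(start, end - start + 1, it->second);
    } else {
      search_from = end + 1;  // unresolved: skip past it and keep the literal
    }
  }
  return value;
}
{code}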

Other use cases may not need to read existing hdfs-site.xml files, which is why 
I think you are wise to keep a separation between the Config reader and the 
Options object.
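
The shape I picture is something like the following, where only the reader 
knows about XML files and key names and the rest of the library consumes a 
plain Options struct.  Key names and defaults here are placeholders:

{code:cpp}
#include <map>
#include <string>

struct Options {
  int rpc_timeout_ms = 30000;   // placeholder defaults, not the real ones
  int max_rpc_retries = 0;
};

class HdfsConfiguration {       // hypothetical reader over hdfs-site.xml et al.
 public:
  void Set(const std::string& key, const std::string& value) {
    props_[key] = value;
  }

  int GetInt(const std::string& key, int default_value) const {
    auto it = props_.find(key);
    if (it == props_.end()) return default_value;
    try { return std::stoi(it->second); } catch (...) { return default_value; }
  }

  // The only place that knows about key names; everything downstream of this
  // point works purely with Options.
  Options GetOptions() const {
    Options options;
    options.rpc_timeout_ms =
        GetInt("ipc.client.rpc-timeout.ms", options.rpc_timeout_ms);
    options.max_rpc_retries =
        GetInt("ipc.client.connect.max.retries", options.max_rpc_retries);
    return options;
  }

 private:
  std::map<std::string, std::string> props_;
};
{code}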

I agree with your concern that the libhdfs++ default Options object will get 
out of sync with the Java defaults, and will happily write a unit test that 
verifies that they stay in sync.
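
Something along these lines, assuming gtest and that we mirror the Java-side 
defaults as named constants (the literal values below are placeholders, not the 
authoritative Java defaults):

{code:cpp}
#include <gtest/gtest.h>

struct Options {
  int rpc_timeout_ms = 30000;   // placeholder defaults
  int max_rpc_retries = 0;
};

// Values copied by hand from the Java client's *-default.xml files
// (placeholders here); the test fails if either side drifts.
constexpr int kJavaDefaultRpcTimeoutMs = 30000;
constexpr int kJavaDefaultMaxRpcRetries = 0;

TEST(OptionsTest, DefaultsMatchJavaClient) {
  Options defaults;
  EXPECT_EQ(kJavaDefaultRpcTimeoutMs, defaults.rpc_timeout_ms);
  EXPECT_EQ(kJavaDefaultMaxRpcRetries, defaults.max_rpc_retries);
}
{code}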

[~wheat9] - given this context, do you agree that we need to support libhdfs++ 
compatibility with the hdfs-site.xml files that are already deployed at 
customer sites?

> Config file reader / options classes for libhdfs++
> --------------------------------------------------
>
>                 Key: HDFS-9117
>                 URL: https://issues.apache.org/jira/browse/HDFS-9117
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>    Affects Versions: HDFS-8707
>            Reporter: Bob Hansen
>            Assignee: Bob Hansen
>         Attachments: HDFS-9117.HDFS-8707.001.patch, 
> HDFS-9117.HDFS-8707.002.patch, HDFS-9117.HDFS-8707.003.patch, 
> HDFS-9117.HDFS-8707.004.patch, HDFS-9117.HDFS-8707.005.patch, 
> HDFS-9117.HDFS-8707.006.patch, HDFS-9117.HDFS-8707.008.patch, 
> HDFS-9117.HDFS-8707.009.patch, HDFS-9117.HDFS-8707.010.patch, 
> HDFS-9117.HDFS-8707.011.patch, HDFS-9117.HDFS-8707.012.patch, 
> HDFS-9117.HDFS-9288.007.patch
>
>
> For environmental compatibility with HDFS installations, libhdfs++ should be 
> able to read the configurations from Hadoop XML files and behave in line with 
> the Java implementation.
> Most notably, machine names and ports should be readable from Hadoop XML 
> configuration files.
> Similarly, an internal Options architecture for libhdfs++ should be developed 
> to efficiently transport the configuration information within the system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
