yanghua commented on a change in pull request #2430:
URL: https://github.com/apache/hudi/pull/2430#discussion_r557306549
##########
File path: hudi-flink/src/main/java/org/apache/hudi/util/StreamerUtil.java
##########
@@ -81,16 +103,50 @@ public static DFSPropertiesConfiguration readConfig(FileSystem fs, Path cfgPath,
     return conf;
   }
- public static Configuration getHadoopConf() {
- return new Configuration();
+ public static org.apache.hadoop.conf.Configuration getHadoopConf() {
+    // create a Hadoop Configuration from the configured hadoop conf directory.
+    org.apache.hadoop.conf.Configuration hadoopConf = null;
+    for (String possibleHadoopConfPath : HadoopUtils.possibleHadoopConfPaths(new Configuration())) {
Review comment:
> The method first looks for the path specified by `fs.hdfs.hadoopconf`, then for the directories `HADOOP_CONF_DIR`, `HADOOP_HOME/conf`, and `HADOOP_HOME/etc/hadoop` from the system environment.
>
I had read the source code before raising this concern.
> Even if storage is separated from computing, the `FileSystem` we create is still correct, as long as we supply the hadoop conf files correctly.
>
> In any case, we should not pass an empty hadoop configuration.
What I mean is: should the user's explicit parameter assignment take the highest priority, above default conventions that some users may not even know about?
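To make the priority question concrete, here is a minimal, self-contained sketch of the lookup order under discussion. The class name `HadoopConfPathResolver`, the method `resolveHadoopConfPaths`, and the explicit-path parameter are hypothetical illustrations, not the actual Flink `HadoopUtils` API; the point is only that an explicitly configured directory is consulted before environment-derived defaults.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the conf-directory lookup order debated above.
public class HadoopConfPathResolver {

  // Returns candidate hadoop conf directories, highest priority first:
  //   1. the explicitly configured directory (e.g. via fs.hdfs.hadoopconf)
  //   2. the HADOOP_CONF_DIR environment variable
  //   3. HADOOP_HOME/conf and HADOOP_HOME/etc/hadoop
  public static List<String> resolveHadoopConfPaths(String explicitConfDir,
                                                    Map<String, String> env) {
    List<String> candidates = new ArrayList<>();
    if (explicitConfDir != null && !explicitConfDir.isEmpty()) {
      candidates.add(explicitConfDir);
    }
    String confDir = env.get("HADOOP_CONF_DIR");
    if (confDir != null) {
      candidates.add(confDir);
    }
    String hadoopHome = env.get("HADOOP_HOME");
    if (hadoopHome != null) {
      candidates.add(hadoopHome + "/conf");
      candidates.add(hadoopHome + "/etc/hadoop");
    }
    return candidates;
  }

  public static void main(String[] args) {
    // With both an explicit directory and environment defaults set,
    // the explicit directory comes first.
    Map<String, String> env = Map.of(
        "HADOOP_CONF_DIR", "/etc/hadoop/conf",
        "HADOOP_HOME", "/opt/hadoop");
    System.out.println(resolveHadoopConfPaths("/user/my-conf", env));
  }
}
```

Under this ordering, a user's explicit setting always wins, while the environment-based defaults remain a fallback for users who never set anything, which is one way to resolve the concern without ever passing an empty configuration.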
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]