openinx commented on pull request #1558: URL: https://github.com/apache/iceberg/pull/1558#issuecomment-709014580
> What I don't understand is why Iceberg should do anything other than ensure that hive-site.xml is added as a default resource from the classpath.

For Flink SQL, we usually build only one `iceberg-flink-runtime.jar` and put it on the Flink classpath. Different users run their Flink SQL jobs on the same Flink distribution. If `iceberg-flink-runtime.jar` is bundled with a specific `hive-site.xml` in its resources, how could different users use the same distribution to access different Hive metastores? Would each user have to build a separate distribution?

For Hadoop configurations, we provide the `HADOOP_HOME` environment variable, the `fs.hdfs.hadoopconf` key in the Flink configuration file, or the `HADOOP_CONF_DIR` environment variable to load the Hadoop configuration (hdfs-site.xml, core-site.xml), so Hadoop does not have the problem that the Hive config has. I created a patch that provides similar behavior for hive-site.xml here: https://github.com/apache/iceberg/pull/1586/files#diff-dfee8e9c94fb35806da6eea03a18614d2c5ad778563749493452829bcaec7cc1R95.

@zhangjun0x01, as for uploading the configuration files to HDFS and then downloading and loading them for a Flink streaming job in `application` mode, I think that is over-designed for now. Besides hive-site.xml, the Hadoop configurations are loaded from the environment, from the classpath, or from a path configured in the Flink conf file. Would the Flink module in Iceberg also need to download and load those? That does not make sense. A better way is to follow the current Flink design: bundle hive-site.xml and the other related config files into your Flink DataStream jar, and upload it to the Flink cluster. Flink DataStream job submission is more flexible than Flink SQL, so it makes sense to build a separate bundled jar per job.
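To make the per-user configuration idea concrete, here is a minimal sketch of resolving `hive-site.xml` from a directory supplied at runtime instead of from the jar's resources. It is only an illustration: the class `HiveSiteLocator`, the methods `locateHiveSite` and `writeSampleHiveSite`, and the `hiveConfDir` parameter are hypothetical names, not Iceberg or Flink API, and the sketch does not claim to reproduce what the linked patch does. An empty result stands for "nothing configured, fall back to whatever default is on the classpath".

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

public class HiveSiteLocator {

  /**
   * Hypothetical helper (not Iceberg or Flink API): looks for hive-site.xml in a
   * directory chosen by the user at runtime. Returns empty when no directory is
   * given or when the file is missing, so the caller can fall back to a default.
   */
  public static Optional<Path> locateHiveSite(String hiveConfDir) {
    if (hiveConfDir == null || hiveConfDir.trim().isEmpty()) {
      return Optional.empty();
    }
    Path candidate = Paths.get(hiveConfDir.trim(), "hive-site.xml");
    return Files.isRegularFile(candidate) ? Optional.of(candidate) : Optional.empty();
  }

  /** Demo-only helper: creates a temp directory holding an empty hive-site.xml. */
  public static Path writeSampleHiveSite(String prefix) {
    try {
      Path dir = Files.createTempDirectory(prefix);
      Path site = dir.resolve("hive-site.xml");
      Files.write(site, "<configuration/>".getBytes(StandardCharsets.UTF_8));
      return site;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Two users share one distribution but point at different config directories.
    Path userA = writeSampleHiveSite("user-a-conf");
    Path userB = writeSampleHiveSite("user-b-conf");

    System.out.println("user A: " + locateHiveSite(userA.getParent().toString()));
    System.out.println("user B: " + locateHiveSite(userB.getParent().toString()));
    // No directory configured: the caller would use the classpath default instead.
    System.out.println("unset:  " + locateHiveSite(null));
  }
}
```

The point of the design is that the runtime jar stays identical for everyone, and only the directory passed in differs per user or per job.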
I provided a more reasonable patch (https://github.com/apache/iceberg/pull/1586) to handle hive-site.xml. (Since the Iceberg 0.10.0 release is coming and we hope to resolve this as soon as possible, I opened a pull request for the same issue. Your discussion is valuable, @zhangjun0x01; I hope you don't mind. Thanks.)
