palladium-coder commented on issue #16640: URL: https://github.com/apache/iceberg/issues/16640#issuecomment-4634178837
Hello @steveloughran, thanks for the suggestions; we are trying them out. Digging deeper, I do see scope for improvement on the Iceberg side. For example, [HadoopOutputFile](https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/hadoop/HadoopOutputFile.java#L57) lets the user provide their own FileSystem, but [ParquetIO](https://github.com/apache/iceberg/blob/main/parquet/src/main/java/org/apache/iceberg/parquet/ParquetIO.java#L70) never uses it and instead creates its own in [another library](https://github.com/apache/parquet-java/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/HadoopOutputFile.java#L58). This special handling of HadoopOutputFile isn't known to users of the Iceberg library; it only surfaced when we disabled the FileSystem cache.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
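
To illustrate why this only shows up with the cache disabled, here is a minimal sketch. It does not use the real Hadoop, Iceberg, or parquet-java classes: `FakeFileSystem`, `FakeFsFactory`, `FakeOutputFile`, and the path `s3a://bucket/data.parquet` are all hypothetical stand-ins, and the cache is keyed by scheme only, which is simpler than what Hadoop actually does. It only models the idea that a conversion which keeps the path and resolves a FileSystem again silently drops the instance the caller supplied.

```java
import java.util.HashMap;
import java.util.Map;

public class FsCacheSketch {
  // Hypothetical stand-in for a Hadoop FileSystem instance.
  static final class FakeFileSystem {
    final String scheme;

    FakeFileSystem(String scheme) {
      this.scheme = scheme;
    }
  }

  // Hypothetical stand-in for a FileSystem lookup with an optional cache.
  // Simplification: the cache key here is the scheme alone.
  static final class FakeFsFactory {
    private final Map<String, FakeFileSystem> cache = new HashMap<>();
    private final boolean cacheEnabled;

    FakeFsFactory(boolean cacheEnabled) {
      this.cacheEnabled = cacheEnabled;
    }

    FakeFileSystem get(String scheme) {
      if (!cacheEnabled) {
        // With the cache disabled, every lookup builds a fresh instance.
        return new FakeFileSystem(scheme);
      }
      return cache.computeIfAbsent(scheme, FakeFileSystem::new);
    }
  }

  // Hypothetical output file that remembers the caller-supplied file system.
  static final class FakeOutputFile {
    final String path;
    final FakeFileSystem fs;

    FakeOutputFile(String path, FakeFileSystem fs) {
      this.path = path;
      this.fs = fs;
    }
  }

  // Models a conversion that keeps only the path and looks the file system
  // up again, ignoring the instance stored on the output file.
  static FakeFileSystem resolveDroppingUserFs(FakeOutputFile file, FakeFsFactory factory) {
    String scheme = file.path.substring(0, file.path.indexOf(':'));
    return factory.get(scheme);
  }

  // Returns true when the writer ends up with the caller's own instance.
  public static boolean userFsSurvives(boolean cacheEnabled) {
    FakeFsFactory factory = new FakeFsFactory(cacheEnabled);
    FakeFileSystem userFs = factory.get("s3a");
    FakeOutputFile file = new FakeOutputFile("s3a://bucket/data.parquet", userFs);
    return resolveDroppingUserFs(file, factory) == userFs;
  }

  public static void main(String[] args) {
    System.out.println("cache enabled:  " + userFsSurvives(true));
    System.out.println("cache disabled: " + userFsSurvives(false));
  }
}
```

In this model the cached case hands back the same instance, so the dropped reference goes unnoticed; with the cache off, the second lookup returns a different object and the caller's instance is no longer the one in use.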
