[GitHub] [incubator-iceberg] vrozov commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.
URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r378465876

## File path: core/src/main/java/org/apache/iceberg/hadoop/HadoopInputFile.java ##
@@ -133,7 +143,7 @@ public SeekableInputStream newStream() {
   }

   public Configuration getConf() {
-    return conf;
+    return fs.getConf();

Review comment:
The same still applies to the `FileIO` implementation. Without a deep copy of the `Configuration` inside `HadoopInputFile`, a class that implements `FileIO` can cache the `Configuration` object and expose methods that modify it. I am not suggesting that this needs to be handled as part of this PR; I am only pointing out that, even though the reference cannot be changed, the returned object is not immutable. Assuming that there are no other issues to address, can you please approve the PR?

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards, Apache Git Services
-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org
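The point about a fixed reference to a mutable object can be illustrated with a minimal sketch. It does not use Hadoop: `java.util.Properties` stands in for `Configuration`, and `FileHandle`, `getConfCopy` and the key name are hypothetical names chosen only for this illustration. In real code, the copy would presumably be made with Hadoop's `Configuration(Configuration other)` copy constructor.

```java
import java.util.Properties;

// Hypothetical stand-in for a class such as HadoopInputFile.
// java.util.Properties plays the role of Hadoop's mutable Configuration.
final class FileHandle {
  // The reference is final, but the object it points to can still be modified.
  private final Properties conf;

  FileHandle(Properties conf) {
    this.conf = conf;
  }

  // Returns the internal object itself: callers can change its contents.
  Properties getConf() {
    return conf;
  }

  // Returns a snapshot: changes made by the caller stay in the copy.
  Properties getConfCopy() {
    Properties copy = new Properties();
    copy.putAll(conf);
    return copy;
  }
}

final class GetterDemo {
  public static void main(String[] args) {
    Properties conf = new Properties();
    conf.setProperty("io.buffer", "4096");
    FileHandle file = new FileHandle(conf);

    file.getConfCopy().setProperty("io.buffer", "1"); // no effect on the handle
    file.getConf().setProperty("io.buffer", "8192");  // modifies the handle's state

    System.out.println(file.getConf().getProperty("io.buffer"));
  }
}
```

Whether copying on every call is acceptable depends on how often the getter is used; the sketch only shows the difference in behavior, not a recommendation for this PR.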
[GitHub] [incubator-iceberg] vrozov commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.
URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r378004624

## File path: core/src/main/java/org/apache/iceberg/hadoop/HadoopInputFile.java ##
@@ -133,7 +143,7 @@ public SeekableInputStream newStream() {
   }

   public Configuration getConf() {
-    return conf;
+    return fs.getConf();

Review comment:
I don't see much difference between
```
Configuration conf = new Configuration();
Path path = new Path(...);
HadoopInputFile file = HadoopInputFile.fromPath(path, conf);
...
conf = new Configuration();
file.setConf(conf);
```
and
```
Configuration conf = new Configuration();
Path path = new Path(...);
HadoopInputFile file = HadoopInputFile.fromPath(path, conf);
...
conf.set();
```
if I correctly understand your concern about the configuration being updated incorrectly. Preventing that kind of hijacking would require copying the passed configuration object.
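As a rough illustration of why the two snippets behave alike, and of what copying the passed object would change, here is a minimal sketch that does not depend on Hadoop. `java.util.Properties` stands in for `Configuration`; `SharingFile`, `CopyingFile` and the property values are hypothetical and exist only for this example.

```java
import java.util.Properties;

// Hypothetical file handle that keeps the caller's object as-is.
// java.util.Properties stands in for Hadoop's mutable Configuration.
final class SharingFile {
  private final Properties conf;

  SharingFile(Properties conf) {
    this.conf = conf; // same object as the caller's: later edits are visible here
  }

  String get(String key) {
    return conf.getProperty(key);
  }
}

// Hypothetical file handle that copies the passed object on construction.
final class CopyingFile {
  private final Properties conf;

  CopyingFile(Properties conf) {
    Properties copy = new Properties();
    copy.putAll(conf); // snapshot of the entries at construction time
    this.conf = copy;
  }

  String get(String key) {
    return conf.getProperty(key);
  }
}

final class AliasingDemo {
  public static void main(String[] args) {
    Properties conf = new Properties();
    conf.setProperty("fs.defaultFS", "file:///");

    SharingFile sharing = new SharingFile(conf);
    CopyingFile copying = new CopyingFile(conf);

    // The caller changes the configuration after the handles were created.
    conf.setProperty("fs.defaultFS", "hdfs://example:8020");

    System.out.println(sharing.get("fs.defaultFS")); // sees the caller's change
    System.out.println(copying.get("fs.defaultFS")); // still the original value
  }
}
```

With a shared reference, a setter on the file adds little exposure that in-place mutation does not already provide; only the copying variant isolates the file from later changes.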
[GitHub] [incubator-iceberg] vrozov commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.
URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r377980728

## File path: core/src/main/java/org/apache/iceberg/hadoop/HadoopInputFile.java ##
@@ -133,7 +143,7 @@ public SeekableInputStream newStream() {
   }

   public Configuration getConf() {
-    return conf;
+    return fs.getConf();

Review comment:
I don't see how extending from `Configured` will add complexity. The behavior of `HadoopInputFile` or `HadoopOutputFile` won't change.
[GitHub] [incubator-iceberg] vrozov commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.
URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r377660527

## File path: core/src/main/java/org/apache/iceberg/hadoop/HadoopInputFile.java ##
@@ -30,6 +30,8 @@
 import org.apache.iceberg.io.InputFile;
 import org.apache.iceberg.io.SeekableInputStream;

+import static com.google.common.base.Preconditions.checkArgument;

Review comment:
Would it be good to remove the `com.google.common.base.Preconditions.*` exclusion from the `AvoidStaticImport` rule in the project's checkstyle configuration?
[GitHub] [incubator-iceberg] vrozov commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.
URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r377654499

## File path: core/src/main/java/org/apache/iceberg/hadoop/HadoopInputFile.java ##
@@ -133,7 +143,7 @@ public SeekableInputStream newStream() {
   }

   public Configuration getConf() {
-    return conf;
+    return fs.getConf();

Review comment:
@rdblue I made the change, but I would like to clarify one point. If the `conf` of a `HadoopInputFile` may differ from the `conf` of its `FileSystem`, would it be better to make that explicit by implementing `Configurable` or extending `Configured`?
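For readers unfamiliar with the pattern being proposed, the following is a minimal sketch of its shape, not the PR's code. Hadoop's `Configurable` declares `setConf` and `getConf`, and `Configured` is a base class that stores the value; the types below (`ConfigurableSketch`, `ConfiguredSketch`, `InputFileSketch`, `fileSystemConf`) are hypothetical stand-ins that mirror that shape, with `java.util.Properties` in place of `Configuration`.

```java
import java.util.Properties;

// Stand-in mirroring the shape of Hadoop's Configurable interface.
interface ConfigurableSketch {
  void setConf(Properties conf);

  Properties getConf();
}

// Stand-in mirroring Hadoop's Configured base class: it only stores the value.
class ConfiguredSketch implements ConfigurableSketch {
  private Properties conf;

  ConfiguredSketch(Properties conf) {
    setConf(conf);
  }

  @Override
  public void setConf(Properties conf) {
    this.conf = conf;
  }

  @Override
  public Properties getConf() {
    return conf;
  }
}

// Hypothetical input file whose own conf is explicitly separate from
// the configuration of the file system it was created with.
final class InputFileSketch extends ConfiguredSketch {
  private final Properties fsConf; // stands in for fs.getConf()

  InputFileSketch(Properties fsConf, Properties conf) {
    super(conf);
    this.fsConf = fsConf;
  }

  Properties fileSystemConf() {
    return fsConf;
  }
}
```

In this shape the file's configuration and the file system's configuration are two visibly separate values, which is the explicitness the question is about; the cost is that `setConf` makes the file's configuration replaceable after construction.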