[GitHub] [incubator-iceberg] vrozov commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.

2020-02-12 Thread GitBox
vrozov commented on a change in pull request #784: Allow caller to construct 
HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem 
object.
URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r378465876
 
 

 ##
 File path: core/src/main/java/org/apache/iceberg/hadoop/HadoopInputFile.java
 ##
 @@ -133,7 +143,7 @@ public SeekableInputStream newStream() {
   }
 
   public Configuration getConf() {
-return conf;
+return fs.getConf();
 
 Review comment:
   The same still applies to a `FileIO` implementation. Without a deep copy of 
the `Configuration` inside `HadoopInputFile`, a class that implements `FileIO` 
can cache the `Configuration` object and expose methods that modify it. I am 
not suggesting that this be handled as part of this PR; I am only pointing out 
that, even though the reference cannot be changed, the returned object is not 
immutable. Assuming there are no other issues to address, can you please 
approve the PR?
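
   To illustrate the point about a fixed reference to a mutable object, here is 
a minimal sketch. It is not Iceberg or Hadoop code: `Holder`, `getConfCopy`, and 
the property key are hypothetical, and `java.util.Properties` is only a stand-in 
for Hadoop's `Configuration` so that the example runs without extra dependencies.

```java
import java.util.Properties;

public class FinalReferenceDemo {
    // Hypothetical holder; Properties stands in for a mutable configuration object.
    static final class Holder {
        private final Properties conf; // the reference itself can never be reassigned

        Holder(Properties conf) {
            this.conf = new Properties();
            this.conf.putAll(conf); // snapshot the caller's entries
        }

        // Returns the live object: callers can still change its contents.
        Properties getConf() {
            return conf;
        }

        // Returns a fresh copy: changes made by callers stay local to the copy.
        Properties getConfCopy() {
            Properties copy = new Properties();
            copy.putAll(conf);
            return copy;
        }
    }

    public static void main(String[] args) {
        Properties initial = new Properties();
        initial.setProperty("io.file.buffer.size", "4096");
        Holder holder = new Holder(initial);

        // Mutating a copy leaves the holder's state untouched.
        holder.getConfCopy().setProperty("io.file.buffer.size", "1");
        System.out.println(holder.getConf().getProperty("io.file.buffer.size"));

        // Mutating the returned live object changes the holder's state,
        // even though the field is final.
        holder.getConf().setProperty("io.file.buffer.size", "65536");
        System.out.println(holder.getConf().getProperty("io.file.buffer.size"));
    }
}
```

   The `final` modifier only protects the reference; protecting the contents 
takes a copy on the way in, on the way out, or both.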
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org



[GitHub] [incubator-iceberg] vrozov commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.

2020-02-11 Thread GitBox
vrozov commented on a change in pull request #784: Allow caller to construct 
HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem 
object.
URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r378004624
 
 

 ##
 File path: core/src/main/java/org/apache/iceberg/hadoop/HadoopInputFile.java
 ##
 @@ -133,7 +143,7 @@ public SeekableInputStream newStream() {
   }
 
   public Configuration getConf() {
-return conf;
+return fs.getConf();
 
 Review comment:
   I don't see much difference between
   ```
   Configuration conf = new Configuration();
   Path path = new Path(...);
   HadoopInputFile file = HadoopInputFile.fromPath(path, conf);
   ...
   conf = new Configuration();
   file.setConf(conf);
   ```
   and
   ```
   Configuration conf = new Configuration();
   Path path = new Path(...);
   HadoopInputFile file = HadoopInputFile.fromPath(path, conf);
   ...
   conf.set(...);
   ```
   if I correctly understand your concern about the configuration being 
updated incorrectly. To prevent hijacking, the passed configuration object 
would have to be copied.
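
   As a rough illustration of the second case, the sketch below contrasts 
keeping the caller's reference with copying it at construction time. The class 
names `SharedFile` and `CopyingFile` and the property values are hypothetical, 
and `java.util.Properties` is used as a stand-in for Hadoop's `Configuration` 
so that the example is self-contained. With Hadoop, the copy constructor 
`new Configuration(other)` would play the role of the snapshot step.

```java
import java.util.Properties;

public class DefensiveCopyDemo {
    // Hypothetical wrapper that keeps the caller's reference (no copy).
    static final class SharedFile {
        private final Properties conf;

        SharedFile(Properties conf) {
            this.conf = conf; // aliasing: the caller can still mutate this object
        }

        String get(String key) {
            return conf.getProperty(key);
        }
    }

    // Hypothetical wrapper that snapshots the configuration when constructed.
    static final class CopyingFile {
        private final Properties conf;

        CopyingFile(Properties conf) {
            this.conf = new Properties();
            this.conf.putAll(conf); // defensive copy of the current entries
        }

        String get(String key) {
            return conf.getProperty(key);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("fs.defaultFS", "file:///");

        SharedFile shared = new SharedFile(conf);
        CopyingFile copied = new CopyingFile(conf);

        // The caller changes the configuration after both wrappers were created.
        conf.setProperty("fs.defaultFS", "hdfs://example:8020");

        System.out.println("shared sees: " + shared.get("fs.defaultFS"));
        System.out.println("copied sees: " + copied.get("fs.defaultFS"));
    }
}
```

   Only the copying variant is isolated from later changes made by the caller, 
at the cost of one copy per instance.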





[GitHub] [incubator-iceberg] vrozov commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.

2020-02-11 Thread GitBox
vrozov commented on a change in pull request #784: Allow caller to construct 
HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem 
object.
URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r377980728
 
 

 ##
 File path: core/src/main/java/org/apache/iceberg/hadoop/HadoopInputFile.java
 ##
 @@ -133,7 +143,7 @@ public SeekableInputStream newStream() {
   }
 
   public Configuration getConf() {
-return conf;
+return fs.getConf();
 
 Review comment:
   I don't see how extending `Configured` would add complexity. The behavior 
of `HadoopInputFile` and `HadoopOutputFile` won't change.





[GitHub] [incubator-iceberg] vrozov commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.

2020-02-11 Thread GitBox
vrozov commented on a change in pull request #784: Allow caller to construct 
HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem 
object.
URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r377660527
 
 

 ##
 File path: core/src/main/java/org/apache/iceberg/hadoop/HadoopInputFile.java
 ##
 @@ -30,6 +30,8 @@
 import org.apache.iceberg.io.InputFile;
 import org.apache.iceberg.io.SeekableInputStream;
 
+import static com.google.common.base.Preconditions.checkArgument;
 
 Review comment:
   Would it be good to remove the `com.google.common.base.Preconditions.*` 
exclusion from the `AvoidStaticImport` rule in the project's checkstyle 
configuration?





[GitHub] [incubator-iceberg] vrozov commented on a change in pull request #784: Allow caller to construct HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem object.

2020-02-11 Thread GitBox
vrozov commented on a change in pull request #784: Allow caller to construct 
HadoopInputFile and HadoopOutputFile using an existing instance of FileSystem 
object.
URL: https://github.com/apache/incubator-iceberg/pull/784#discussion_r377654499
 
 

 ##
 File path: core/src/main/java/org/apache/iceberg/hadoop/HadoopInputFile.java
 ##
 @@ -133,7 +143,7 @@ public SeekableInputStream newStream() {
   }
 
   public Configuration getConf() {
-return conf;
+return fs.getConf();
 
 Review comment:
   @rdblue I made the change, but I would like to clarify one thing. If the 
`HadoopInputFile` `conf` may not be the same as the `FileSystem` `conf`, would 
it be better to make that more explicit by implementing `Configurable` or 
extending `Configured`?

