[ 
https://issues.apache.org/jira/browse/HADOOP-19199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HADOOP-19199:
------------------------------------
    Labels: pull-request-available  (was: )

> Include FileStatus when opening a file from FileSystem
> ------------------------------------------------------
>
>                 Key: HADOOP-19199
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19199
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 3.4.0
>            Reporter: Oliver Caballero Alvarez
>            Priority: Major
>              Labels: pull-request-available
>
> The FileSystem abstract class offers no way to open a file using a 
> FileStatus that the caller already holds. As a result, implementations of 
> the open method have to request the FileStatus of the same file again, 
> which causes unnecessary requests.
> A clear example is the parquet-hadoop implementation, currently at version 
> 1.14.0:
> https://github.com/apache/parquet-java/blob/apache-parquet-1.14.0/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/util/HadoopInputFile.java
> Although creating a HadoopInputFile requires looking up the file's 
> FileStatus, only the path is passed when the file is opened, because that 
> is all the FileSystem API accepts. The FileSystem implementation will 
> therefore most likely check in its open method that the file exists, or 
> look up its metadata, and so repeat the same operation to obtain the 
> FileStatus.
>  
> This could be resolved by taking the current latest version of FileSystem:
>  
> [https://github.com/apache/hadoop/blob/release-3.4.0-RC3/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java]
> and adding the following method:
>  
>     public FSDataInputStream open(FileStatus f) throws IOException {
>         return this.open(f.getPath(),
>             this.getConf().getInt("io.file.buffer.size", 4096));
>     }
>  
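> Because this default implementation only delegates to the existing 
> path-based open method, it is backward compatible with all current 
> FileSystem implementations. Since it lives in the base class, callers can 
> use it whenever the FileStatus is already known, and implementations can 
> override it to avoid the repeated lookup.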
>  
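
The proposal can be illustrated with a minimal, self-contained sketch. It does
not use the real Hadoop classes: SketchFileSystem, Status, Stream, LegacyStore
and StatusAwareStore are hypothetical stand-ins, paths are plain strings, and
the io.file.buffer.size argument is left out. Under those assumptions it shows
that a delegating default keeps existing subclasses working, while a subclass
that overrides the new overload can skip the second status request.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

public class OpenWithStatusSketch {

    /** Hypothetical stand-in for FileStatus, reduced to two fields. */
    static final class Status {
        final String path;
        final long length;
        Status(String path, long length) { this.path = path; this.length = length; }
    }

    /** Hypothetical stand-in for FSDataInputStream. */
    static final class Stream {
        final String path;
        final long length;
        Stream(String path, long length) { this.path = path; this.length = length; }
    }

    /** Hypothetical stand-in for the FileSystem abstract class. */
    abstract static class SketchFileSystem {
        abstract Status getFileStatus(String path) throws IOException;
        abstract Stream open(String path) throws IOException;

        // Proposed overload: the default only delegates to the path-based
        // open, so subclasses written before it existed behave as before.
        Stream open(Status status) throws IOException {
            return open(status.path);
        }
    }

    /** An existing implementation that knows nothing about the overload. */
    static class LegacyStore extends SketchFileSystem {
        int statusRequests = 0; // counts simulated metadata requests

        @Override
        Status getFileStatus(String path) {
            statusRequests++;
            return new Status(path, 42L);
        }

        @Override
        Stream open(String path) {
            // Needs the length, so it has to look the status up itself.
            Status s = getFileStatus(path);
            return new Stream(path, s.length);
        }
    }

    /** An implementation that reuses the status the caller already holds. */
    static class StatusAwareStore extends LegacyStore {
        @Override
        Stream open(Status status) {
            return new Stream(status.path, status.length);
        }
    }

    /** Looks up the status once, opens with it, and reports total requests. */
    static int requestsToOpen(LegacyStore fs, String path) {
        try {
            Status known = fs.getFileStatus(path);
            fs.open(known);
            return fs.statusRequests;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("legacy: "
            + requestsToOpen(new LegacyStore(), "/data/part-0.parquet"));
        System.out.println("status-aware: "
            + requestsToOpen(new StatusAwareStore(), "/data/part-0.parquet"));
    }
}
```

In this sketch the legacy store issues two status requests for one open and
the status-aware store issues one. The saving only appears once an
implementation overrides the overload; the default alone changes nothing.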



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
