Juan Yu created HDFS-10466:
------------------------------
Summary: DistributedFileSystem.listLocatedStatus() should return
HdfsBlockLocation instead of BlockLocation
Key: HDFS-10466
URL: https://issues.apache.org/jira/browse/HDFS-10466
Project: Hadoop HDFS
Issue Type: Improvement
Components: hdfs
Reporter: Juan Yu
Assignee: Juan Yu
Priority: Minor
https://issues.apache.org/jira/browse/HDFS-202 added a new API,
listLocatedStatus(), which returns the status of every file in a directory
together with its block locations. This is great because we no longer need to
call FileSystem.getFileBlockLocations() for each file, and it is much faster
(about 8-10 times).
However, the returned LocatedFileStatus contains only basic BlockLocation
objects instead of HdfsBlockLocation; the LocatedBlock details are stripped out.
It should behave like DFSClient.getBlockLocations() and return
HdfsBlockLocation, which provides full block location details.
The implementation of DistributedFileSystem.listLocatedStatus() retrieves
HdfsLocatedFileStatus, which contains all the information, but when converting
it to LocatedFileStatus it does not keep the LocatedBlock data. Keeping the
LocatedBlock details is a simple (and compatible) change.
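
To illustrate the difference, here is a minimal, self-contained sketch. The
classes below are simplified stand-ins that only borrow the names of the Hadoop
types mentioned above; their fields, constructors, and the helper class
LocatedStatusSketch are assumptions made for illustration and are not the real
Hadoop code or the actual patch. The sketch contrasts a conversion that copies
only the basic fields with one that wraps each LocatedBlock, so a caller can
still reach the block details.

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for the HDFS LocatedBlock (hypothetical fields).
class LocatedBlock {
    final long blockId;
    final long offset;
    final long length;
    final String[] hosts;

    LocatedBlock(long blockId, long offset, long length, String[] hosts) {
        this.blockId = blockId;
        this.offset = offset;
        this.length = length;
        this.hosts = hosts;
    }
}

// Simplified stand-in for the generic BlockLocation: hosts, offset, length only.
class BlockLocation {
    private final String[] hosts;
    private final long offset;
    private final long length;

    BlockLocation(String[] hosts, long offset, long length) {
        this.hosts = hosts;
        this.offset = offset;
        this.length = length;
    }

    String[] getHosts() { return hosts; }
    long getOffset() { return offset; }
    long getLength() { return length; }
}

// Simplified stand-in for HdfsBlockLocation: a BlockLocation that also
// keeps a reference to the LocatedBlock it was built from.
class HdfsBlockLocation extends BlockLocation {
    private final LocatedBlock block;

    HdfsBlockLocation(LocatedBlock block) {
        super(block.hosts, block.offset, block.length);
        this.block = block;
    }

    LocatedBlock getLocatedBlock() { return block; }
}

// Hypothetical helper contrasting the two conversions.
class LocatedStatusSketch {
    // Behaviour described in the issue: LocatedBlock details are dropped.
    static BlockLocation[] toBasicLocations(List<LocatedBlock> blocks) {
        BlockLocation[] out = new BlockLocation[blocks.size()];
        for (int i = 0; i < out.length; i++) {
            LocatedBlock b = blocks.get(i);
            out[i] = new BlockLocation(b.hosts, b.offset, b.length);
        }
        return out;
    }

    // Proposed behaviour: same declared return type, richer runtime type.
    static BlockLocation[] toHdfsLocations(List<LocatedBlock> blocks) {
        BlockLocation[] out = new BlockLocation[blocks.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = new HdfsBlockLocation(blocks.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        List<LocatedBlock> blocks = Arrays.asList(
            new LocatedBlock(42L, 0L, 128L, new String[] {"dn1", "dn2"}));
        BlockLocation loc = toHdfsLocations(blocks)[0];
        // Callers that know about HDFS can downcast; others are unaffected.
        if (loc instanceof HdfsBlockLocation) {
            LocatedBlock lb = ((HdfsBlockLocation) loc).getLocatedBlock();
            System.out.println("block id = " + lb.blockId);
        }
    }
}
```

Because the declared element type stays BlockLocation in this sketch, existing
callers would keep working unchanged, which is the sense in which the change
is described as compatible.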