[
https://issues.apache.org/jira/browse/HDFS-14440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ayush Saxena updated HDFS-14440:
--------------------------------
Attachment: HDFS-14440-HDFS-13891-05.patch
> RBF: Optimize the file write process in case of multiple destinations.
> ----------------------------------------------------------------------
>
> Key: HDFS-14440
> URL: https://issues.apache.org/jira/browse/HDFS-14440
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Ayush Saxena
> Assignee: Ayush Saxena
> Priority: Major
> Attachments: HDFS-14440-HDFS-13891-01.patch,
> HDFS-14440-HDFS-13891-02.patch, HDFS-14440-HDFS-13891-03.patch,
> HDFS-14440-HDFS-13891-04.patch, HDFS-14440-HDFS-13891-05.patch
>
>
> When a path has multiple destinations, we need to check whether the file
> already exists in one of the subclusters. For this we use the existing
> getBlockLocation() API, which by default is a sequential call.
> In the common case where the file does not exist yet and has to be created,
> every subcluster is checked one after another. These checks can be done
> concurrently to save time.
> In the other case, where the file is found but its last block is null, we
> currently need to call getFileInfo on all the locations to find the one where
> the file exists. This can also be avoided by using ConcurrentCall, since we
> would then already have the remoteLocation for which getBlockLocation
> returned a non-null entry.
>
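
For illustration only, the idea in the description can be sketched in plain Java. This is not code from the attached patches and does not use the router's actual classes: ConcurrentLookupSketch, Hit and findFirstNonNull are hypothetical names, and the lookup function merely stands in for a getBlockLocation-style call that returns null when the file is absent from a subcluster. The sketch assumes that issuing all lookups at once and remembering which location answered is enough to skip a second round of per-location calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

/** Hypothetical sketch; not part of Hadoop or of the HDFS-14440 patches. */
public class ConcurrentLookupSketch {

  /** Pairs a location with the non-null result that location returned. */
  public static final class Hit<L, R> {
    public final L location;
    public final R result;

    Hit(L location, R result) {
      this.location = location;
      this.result = result;
    }
  }

  /**
   * Runs the lookup against every location concurrently and returns the
   * first location (in list order) whose lookup returned a non-null value.
   * An empty Optional means no location has the file, so it can be created.
   */
  public static <L, R> Optional<Hit<L, R>> findFirstNonNull(
      List<L> locations, Function<L, R> lookup) {
    if (locations.isEmpty()) {
      return Optional.empty();
    }
    ExecutorService pool = Executors.newFixedThreadPool(locations.size());
    try {
      // Submit one lookup per location so they all run in parallel.
      List<CompletableFuture<R>> futures = new ArrayList<>();
      for (L loc : locations) {
        futures.add(CompletableFuture.supplyAsync(() -> lookup.apply(loc), pool));
      }
      // Collect results in the original order. join() rethrows a lookup
      // failure wrapped in a CompletionException.
      for (int i = 0; i < locations.size(); i++) {
        R result = futures.get(i).join();
        if (result != null) {
          return Optional.of(new Hit<>(locations.get(i), result));
        }
      }
      return Optional.empty();
    } finally {
      pool.shutdown();
    }
  }
}
```

Because the returned Hit carries the location together with the result, a caller that finds the last block missing would already know which subcluster holds the file, which is the saving the description refers to.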