[
https://issues.apache.org/jira/browse/HDFS-14440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16844738#comment-16844738
]
Vinayakumar B commented on HDFS-14440:
--------------------------------------
{quote}I'm still wrapping my head around using invokeConcurrent() and
invokeSequential()...
What about using sequential for HASH and HASH_ALL and concurrent for the others?
{quote}
I can see this would not be a problem at all in the latest patch.
Although {{invokeConcurrent()}} is used, the returned map is checked against the
original list of remote locations, so the results are still evaluated in that
list's order.
Also, {{getFileInfo()}} is much lighter than {{getBlockLocations()}}.
So I would say the changes are direct and straightforward, compared with the
earlier complex check based on block locations.
+1
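
For readers following the thread, the ordering argument above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: the class {{OrderedConcurrentCheck}}, the method {{firstExisting()}} and the {{exists}} predicate are hypothetical names, and plain {{java.util.concurrent}} stands in for the Router's RPC client. The point it shows is that the calls may complete in any order, but the winner is picked by walking the original location list.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

// Hypothetical illustration only; none of these names are Router or HDFS APIs.
class OrderedConcurrentCheck {

  /**
   * Runs the existence check against every location in parallel, then scans
   * the results in the caller's original order and returns the first hit.
   */
  static Optional<String> firstExisting(List<String> orderedLocations,
      Predicate<String> exists) {
    if (orderedLocations.isEmpty()) {
      return Optional.empty();
    }
    ExecutorService pool =
        Executors.newFixedThreadPool(orderedLocations.size());
    try {
      // Submit all checks up front so they overlap in time.
      // A HashMap is used on purpose: its iteration order is never relied on.
      Map<String, Future<Boolean>> pending = new HashMap<>();
      for (String loc : orderedLocations) {
        Callable<Boolean> task = () -> exists.test(loc);
        pending.put(loc, pool.submit(task));
      }
      // Completion order is arbitrary; walking the original list keeps the
      // outcome identical to what a sequential scan would have returned.
      for (String loc : orderedLocations) {
        if (pending.get(loc).get()) {
          return Optional.of(loc);
        }
      }
      return Optional.empty();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    List<String> locations = Arrays.asList("ns0", "ns1", "ns2");
    // Pretend the file exists in ns1 and ns2 but not in ns0.
    Optional<String> hit = firstExisting(locations, ns -> !ns.equals("ns0"));
    System.out.println(hit.orElse("not found")); // prints "ns1"
  }
}
```

Because the result is resolved against the ordered list rather than against whichever call finished first, an order-based policy would still see the same answer as with a sequential scan, while the wall-clock cost is roughly that of the slowest single call.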
> RBF: Optimize the file write process in case of multiple destinations.
> ----------------------------------------------------------------------
>
> Key: HDFS-14440
> URL: https://issues.apache.org/jira/browse/HDFS-14440
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Ayush Saxena
> Assignee: Ayush Saxena
> Priority: Major
> Attachments: HDFS-14440-HDFS-13891-01.patch,
> HDFS-14440-HDFS-13891-02.patch, HDFS-14440-HDFS-13891-03.patch,
> HDFS-14440-HDFS-13891-04.patch, HDFS-14440-HDFS-13891-05.patch,
> HDFS-14440-HDFS-13891-06.patch
>
>
> In case of multiple destinations, we need to check whether the file already
> exists in one of the subclusters. For this we use the existing
> getBlockLocation() API, which by default is a sequential call.
> In the ideal scenario, where the file needs to be created, each subcluster is
> checked sequentially; this can be done concurrently to save time.
> In the other case, where the file is found and the last block is null, we
> need to call getFileInfo on all the locations to find the location where the
> file exists. This can also be avoided by using ConcurrentCall, since we would
> already have the remoteLocation for which getBlockLocation returned a
> non-null entry.
>