[ https://issues.apache.org/jira/browse/HDFS-12778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16255428#comment-16255428 ]
Virajith Jalaparti edited comment on HDFS-12778 at 11/16/17 2:48 PM:
---------------------------------------------------------------------
Thanks for taking a look [~elgoiri]. Posting a new patch with the additional
test cases ({{testNumberOfProvidedLocations}} and
{{testNumberOfProvidedLocationsManyBlocks}}).
bq. Should we make the block locations deterministic to some degree? I can see
two mappers trying to access the same block, and in that way some caching
could be done.
Yes, that is entirely possible. I agree that returning a consistent set of
locations can help with things like caching. We can fix this as part of
HDFS-12809.
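To make the idea concrete, here is a minimal sketch of one way a consistent set of locations could be chosen. This is not the patch attached here, nor what HDFS-12809 will necessarily do. The class {{DeterministicProvidedLocations}} and the method {{chooseLocations}} are hypothetical names, Datanodes are represented as plain strings, and the sketch assumes every caller sees the Datanode list in the same stable order.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the HDFS code base.
public class DeterministicProvidedLocations {

  /**
   * Picks up to numLocations Datanodes for a block. The starting index is
   * derived from the block ID, so the same block ID and the same (ordered)
   * Datanode list always produce the same locations.
   */
  static List<String> chooseLocations(
      long blockId, List<String> datanodes, int numLocations) {
    List<String> chosen = new ArrayList<>();
    if (datanodes.isEmpty() || numLocations <= 0) {
      return chosen;
    }
    int n = datanodes.size();
    // Never return more locations than there are Datanodes.
    int count = Math.min(numLocations, n);
    // floorMod keeps the index non-negative even for negative block IDs.
    int start = (int) Math.floorMod(blockId, (long) n);
    for (int i = 0; i < count; i++) {
      // Walk the list from the starting index, wrapping around at the end.
      chosen.add(datanodes.get((start + i) % n));
    }
    return chosen;
  }
}
```

With something along these lines, two mappers asking for the same block would get the same Datanodes back, as long as the set of Datanodes has not changed in between. A real implementation would also have to account for Datanodes joining or leaving.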
> [READ] Report multiple locations for PROVIDED blocks
> ----------------------------------------------------
>
> Key: HDFS-12778
> URL: https://issues.apache.org/jira/browse/HDFS-12778
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Virajith Jalaparti
> Assignee: Virajith Jalaparti
> Attachments: HDFS-12778-HDFS-9806.001.patch,
> HDFS-12778-HDFS-9806.002.patch
>
>
> On {{getBlockLocations}}, only one Datanode is returned as the location for
> all PROVIDED blocks. This can hurt the performance of applications, which
> typically expect 3 locations per block. We need to return multiple Datanodes
> for each PROVIDED block for better application performance/resilience.