[ 
https://issues.apache.org/jira/browse/HDFS-14872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16938614#comment-16938614
 ] 

David Mollitor commented on HDFS-14872:
---------------------------------------

[~sodonnell]

I imagine something like this: the client looks up the size of the file in 
HDFS and pre-allocates a file of that size on the local system.  It then gets 
the list of all the blocks for the file, shuffles it, and iterates over the 
shuffled list, writing each block to the local file at its required offset.  
Once the list of blocks is exhausted, the file is complete and is made 
available to the application.
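
A rough sketch of that flow, in plain Java.  This is not HDFS client code: a 
local file stands in for the HDFS file, and the block list is computed from an 
assumed fixed block size instead of being fetched from the NameNode.  The 
class and method names ({{RandomOrderCopy}}, {{blockOffsets}}, 
{{copyShuffled}}) are made up for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Sketch only: a local file stands in for the HDFS file, and "blocks" are
// fixed-size ranges derived from an assumed block size.
class RandomOrderCopy {

    /** Returns the start offset of every block in a file of the given length. */
    static List<Long> blockOffsets(long fileLength, long blockSize) {
        List<Long> offsets = new ArrayList<>();
        for (long off = 0; off < fileLength; off += blockSize) {
            offsets.add(off);
        }
        return offsets;
    }

    /** Copies src to dst block by block, visiting the blocks in shuffled order. */
    static void copyShuffled(Path src, Path dst, int blockSize, Random rng)
            throws IOException {
        try (RandomAccessFile in = new RandomAccessFile(src.toFile(), "r");
             RandomAccessFile out = new RandomAccessFile(dst.toFile(), "rw")) {
            long length = in.length();
            out.setLength(length);              // size the local file up front
            List<Long> offsets = blockOffsets(length, blockSize);
            Collections.shuffle(offsets, rng);  // randomize the download order
            byte[] buf = new byte[blockSize];
            for (long off : offsets) {
                int n = (int) Math.min(blockSize, length - off);
                in.seek(off);
                in.readFully(buf, 0, n);        // read one block from the source
                out.seek(off);
                out.write(buf, 0, n);           // write it at the same offset
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("src", ".bin");
        Path dst = Files.createTempFile("dst", ".bin");
        byte[] data = new byte[1000];
        new Random(42).nextBytes(data);
        Files.write(src, data);
        copyShuffled(src, dst, 128, new Random());
        // The copy should be byte-identical regardless of the block order.
        System.out.println(Arrays.equals(data, Files.readAllBytes(dst)));
    }
}
```

In a real client the offsets would presumably come from the block locations 
reported for the file, and each read would be a positioned read against the 
HDFS input stream rather than a seek on a local file.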

The first use case that comes to mind is better support for large files 
submitted to the cluster that are required by MapReduce / Spark applications.  
The jobs will not start until all of the required files have been localized 
from HDFS onto the local host by the YARN NodeManager.  If the job requires a 
large JAR file or, even more likely, a large dependency file, all of the nodes 
will fight with each other to download the same blocks in the same order.

One could increase {{mapreduce.client.submit.file.replication}}; however, this 
has its limitations as well.  In a large cluster, it may take a long time for 
the NameNode to schedule all of the replication required to bring all of the 
blocks up to the requested replication factor.

https://hadoop.apache.org/docs/r3.1.1/hadoop-yarn/hadoop-yarn-site/SharedCache.html
https://blog.cloudera.com/resource-localization-in-yarn-deep-dive/
https://hadoop.apache.org/docs/r2.7.2/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml

> Read HDFS Blocks in Random Order
> --------------------------------
>
>                 Key: HDFS-14872
>                 URL: https://issues.apache.org/jira/browse/HDFS-14872
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: hdfs-client
>    Affects Versions: 2.8.5, 3.2.1
>            Reporter: David Mollitor
>            Priority: Major
>
> When the HDFS client is downloading (copying) an entire file, allow the 
> client to download the blocks in random order.  If a lot of clients are 
> reading the same file, in parallel, they will all download the first block, 
> the second block, and so on, stampeding down the line.
> It would be interesting to spread the load across all the available 
> DataNodes.



