Tim Yao created MAPREDUCE-6877:
----------------------------------
Summary: Assign map task preferentially to the data node where the
split is on faster storage type
Key: MAPREDUCE-6877
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6877
Project: Hadoop Map/Reduce
Issue Type: Improvement
Reporter: Tim Yao
SSDs are widely used in HDFS to improve read and write performance. However,
SSDs cost much more than HDDs, so the ONE-SSD storage policy (one replica on
SSD, the remaining replicas on HDD) exists as a tradeoff between performance
and cost. The open question under this policy is whether applications actually
read the replica stored on SSD. If they do not, the benefit of the SSD is lost,
and performance will be much poorer than under the ALL-SSD policy. MapReduce
currently assigns tasks based on data locality alone. The storage types of all
replicas of each split should also be taken into consideration, so that a map
task is assigned preferentially to a node where its split is located on a
faster storage type.
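
The proposal above can be illustrated with a minimal sketch. It is not the
MapReduce scheduler code and does not use Hadoop classes: the names
StorageAwarePlacement, Tier, Replica and preferredHosts are hypothetical, and
the tier ordering (RAM_DISK, SSD, DISK, ARCHIVE, fastest first) is an
assumption made for illustration. It only shows the idea of ranking the hosts
of a split's replicas by storage tier before choosing where to run the map
task.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class StorageAwarePlacement {
    // Hypothetical storage tiers, declared fastest first (lower ordinal = faster).
    enum Tier { RAM_DISK, SSD, DISK, ARCHIVE }

    // One replica of a split: the host holding it and the tier it is stored on.
    static final class Replica {
        final String host;
        final Tier tier;
        Replica(String host, Tier tier) { this.host = host; this.tier = tier; }
    }

    // Returns the replica hosts ordered by tier, fastest first.
    // The sort is stable, so hosts on the same tier keep their input order.
    static List<String> preferredHosts(List<Replica> replicas) {
        List<Replica> sorted = new ArrayList<>(replicas);
        sorted.sort(Comparator.comparingInt((Replica r) -> r.tier.ordinal()));
        List<String> hosts = new ArrayList<>();
        for (Replica r : sorted) {
            hosts.add(r.host);
        }
        return hosts;
    }

    public static void main(String[] args) {
        // A split stored under a ONE-SSD style layout: one SSD replica, two on disk.
        List<Replica> split = List.of(
            new Replica("node-a", Tier.DISK),
            new Replica("node-b", Tier.SSD),
            new Replica("node-c", Tier.DISK));
        System.out.println(preferredHosts(split)); // prints [node-b, node-a, node-c]
    }
}
```

In this sketch every listed host is still data-local; the storage tier is used
only to rank the local candidates, so the SSD-backed node is tried first.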