DFS should provide partition information for blocks, and map/reduce should 
avoid scheduling mappers whose splits are on the same file system 
partition at the same time
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

                 Key: HADOOP-2093
                 URL: https://issues.apache.org/jira/browse/HADOOP-2093
             Project: Hadoop
          Issue Type: New Feature
            Reporter: Runping Qi



The summary is a bit long, but the basic idea is to make better use of 
multiple file system partitions.
For example, suppose a map/reduce job has 100 splits local to a node, spread 
across 4 file system partitions, and the node is allowed to run 4 mappers 
concurrently. It is better for each mapper to work on a split from a 
different file system partition. In the worst case, all 4 mappers work on 
splits from the same file system partition, and the other three partitions 
are not utilized at all.
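
To make the example above concrete, here is a minimal sketch of one possible 
selection policy. It is not existing Hadoop code. It assumes DFS could report 
a partition id for each local split; the function name pick_splits, the 
(split_id, partition_id) pairs, and the "diskN" labels are all hypothetical.

```python
from collections import defaultdict, deque


def pick_splits(pending, slots):
    """Choose up to `slots` splits, spreading them across partitions.

    `pending` is a list of (split_id, partition_id) pairs. Both fields are
    hypothetical stand-ins for information DFS would have to expose.
    """
    # Group the pending splits by the partition that holds them.
    by_partition = defaultdict(deque)
    for split_id, partition_id in pending:
        by_partition[partition_id].append(split_id)

    chosen = []
    # Visit partitions round-robin, so a partition is reused only after
    # every other partition with pending splits has contributed one.
    queues = deque(by_partition.items())
    while queues and len(chosen) < slots:
        partition_id, splits = queues.popleft()
        chosen.append((splits.popleft(), partition_id))
        if splits:
            queues.append((partition_id, splits))
    return chosen


# 100 local splits spread over 4 partitions, with 4 mapper slots.
pending = [(f"split-{i}", f"disk{i % 4}") for i in range(100)]
batch = pick_splits(pending, slots=4)
```

With these inputs each of the 4 chosen splits comes from a different 
partition. If fewer partitions than slots have pending splits, the sketch 
falls back to reusing a partition instead of leaving a slot idle.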



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
