Lin Yiqun created MAPREDUCE-6551:
------------------------------------

             Summary: Dynamic adjust mapTaskAttempt memory size
                 Key: MAPREDUCE-6551
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6551
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: task
    Affects Versions: 2.7.1
            Reporter: Lin Yiqun
            Assignee: Lin Yiqun


I found a scenario where map tasks consume a large share of cluster resources. It 
happens when a job reads many small file blocks (some not even reaching 1 MB), 
which leads to many map tasks. In general, a map task attempt uses the default 
config {{MRJobConfig#MAP_MEMORY_MB}} to set its resourceCapability's memory, 
regardless of how much data it will actually process. As a result, map tasks can 
reserve far more memory than their small target data requires. So I have an idea: 
can we dynamically set the mapTaskAttempt memory size according to its input data 
length? This value can be obtained from the 
{{TaskSplitMetaInfo#getInputDataLength}} method. Besides that, we should provide a 
standard unit of data length per standard unit of memory size.
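A minimal sketch of the scaling idea, assuming a configurable "unit data length per unit memory" mapping as proposed above. The method and parameter names here (scaledMemoryMb, unitDataLength, unitMemoryMb, etc.) are illustrative, not existing Hadoop config keys or APIs; a real patch would read the split size from {{TaskSplitMetaInfo#getInputDataLength}} and the bounds from job configuration:

```java
// Hypothetical sketch: scale a map attempt's memory request by the size of its
// input split, instead of always requesting the full MRJobConfig#MAP_MEMORY_MB.
public class MapMemoryScaler {

    /**
     * Grant unitMemoryMb for every full unitDataLength bytes of input,
     * clamped between minMemoryMb and the configured default.
     */
    static int scaledMemoryMb(long inputDataLength,
                              long unitDataLength,
                              int unitMemoryMb,
                              int minMemoryMb,
                              int defaultMemoryMb) {
        // Number of whole "units" of data in this split (floor division).
        long units = inputDataLength / unitDataLength;
        long mb = units * unitMemoryMb;
        // Never go below the floor or above the configured default.
        return (int) Math.max(minMemoryMb, Math.min(mb, defaultMemoryMb));
    }

    public static void main(String[] args) {
        // A tiny 1 MB split only gets the configured floor (512 MB).
        System.out.println(scaledMemoryMb(1L << 20, 256L << 20, 1024, 512, 2048));
        // A full 256 MB split gets one unit of memory (1024 MB).
        System.out.println(scaledMemoryMb(256L << 20, 256L << 20, 1024, 512, 2048));
        // A 1 GB split would want 4096 MB but is capped at the default (2048 MB).
        System.out.println(scaledMemoryMb(1L << 30, 256L << 20, 1024, 512, 2048));
    }
}
```

With a rule like this, the many tiny-block map tasks described above would request only the floor amount instead of the full default, freeing cluster memory for other containers.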



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
