Noel C. F. Codella, Ph.D. created MAPREDUCE-5313:
----------------------------------------------------

             Summary: JobTracker Creates Empty Mapper Task, and a Mapper Task with 2 FileSplits.
                 Key: MAPREDUCE-5313
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5313
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: jobtracker
    Affects Versions: 1.2.0
         Environment: Linux
            Reporter: Noel C. F. Codella, Ph.D.


When reading an input file, the JobTracker appears to assign the first two 
FileSplits to a single Mapper Task and then assign an empty FileSplit (end of 
file) to another Mapper Task, which finishes immediately. This can unbalance 
the job, since one map task is now twice as large as the others.

In "src/mapred/org/apache/hadoop/mapred/LineRecordReader.java", line 110, there 
is a comment about skipping the first line of the input by default, since 
"next()" reads one extra line anyway. Version 0.20.2 did not behave this way 
and did not have this problem.

This logic appears to be implemented incorrectly and to cause the issue 
described above.
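
To illustrate the suspected mechanism, here is a small self-contained toy model 
in Java. It is not the Hadoop source: the class SplitBoundaryDemo, the methods 
readSplit and endOfLine, and the readExtraLine flag are hypothetical names made 
up for this sketch. It assumes that every split boundary falls exactly on a 
line boundary, and that the newer reader discards the first line of every 
non-first split and keeps reading while the position is <= the split end, 
whereas the older reader only discards a partial first line and stops at the 
split end.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class SplitBoundaryDemo {

    // Returns the offset just past the next '\n' at or after pos,
    // or data.length if no newline remains.
    static int endOfLine(byte[] data, int pos) {
        while (pos < data.length && data[pos] != '\n') {
            pos++;
        }
        return Math.min(pos + 1, data.length);
    }

    // Returns the lines that a reader for the byte range [start, end) emits.
    // readExtraLine = true  : assumed newer convention (skip the first line of
    //                         a non-first split, read while pos <= end).
    // readExtraLine = false : assumed older convention (skip only a partial
    //                         first line, read while pos < end).
    static List<String> readSplit(byte[] data, int start, int end,
                                  boolean readExtraLine) {
        int pos = start;
        if (start != 0) {
            // Backing up one byte in the older convention keeps a line that
            // begins exactly at 'start'; the newer convention discards it.
            pos = endOfLine(data, readExtraLine ? start : start - 1);
        }
        List<String> records = new ArrayList<>();
        while (pos < data.length && (readExtraLine ? pos <= end : pos < end)) {
            int next = endOfLine(data, pos);
            int len = next - pos;
            if (len > 0 && data[next - 1] == '\n') {
                len--; // drop the line terminator from the record
            }
            records.add(new String(data, pos, len, StandardCharsets.US_ASCII));
            pos = next;
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "aaa\nbbb\nccc\n".getBytes(StandardCharsets.US_ASCII);
        int[][] splits = {{0, 4}, {4, 8}, {8, 12}}; // one line per split
        for (boolean extra : new boolean[] {true, false}) {
            System.out.println(extra ? "read-extra-line model:" : "strict model:");
            for (int[] s : splits) {
                System.out.println("  [" + s[0] + ", " + s[1] + ") -> "
                        + readSplit(data, s[0], s[1], extra));
            }
        }
    }
}
```

In this model, with one line per split, the read-extra-line variant gives the 
first split two records ("aaa", "bbb"), the second split one ("ccc"), and the 
last split none, which matches the symptom reported above. The strict variant 
gives each split exactly one record. Whether this is what actually happens in 
1.2.0 would need to be confirmed against the real LineRecordReader.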


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
