Pass the size of the MapReduce input to JobInProgress
-----------------------------------------------------

                 Key: HADOOP-3441
                 URL: https://issues.apache.org/jira/browse/HADOOP-3441
             Project: Hadoop Core
          Issue Type: Improvement
          Components: mapred
    Affects Versions: 0.17.0
         Environment: all
            Reporter: Ari Rabkin
            Assignee: Ari Rabkin
            Priority: Minor
             Fix For: 0.18.0
         Attachments: addDataSize.patch

Currently, there's no easy way for the JobInProgress to know how large the 
job's input data is.

This patch addresses that by recording the size of each input split's data 
in its RawSplit.  The per-split sizes are then summed and made available 
via JobInProgress.getInputSize().
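
The idea can be sketched roughly as follows.  This is not the code in 
addDataSize.patch: SplitInfo and JobInfo are hypothetical stand-ins for 
RawSplit and JobInProgress, and getDataLength() is an assumed accessor name.  
Only getInputSize() is taken from the description above.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; these are NOT the real Hadoop classes.
public class InputSizeSketch {

    // Hypothetical stand-in for RawSplit: carries the byte length of its split.
    static final class SplitInfo {
        private final long dataLength;

        SplitInfo(long dataLength) {
            this.dataLength = dataLength;
        }

        // Assumed accessor name, chosen for illustration.
        long getDataLength() {
            return dataLength;
        }
    }

    // Hypothetical stand-in for JobInProgress: totals the split sizes once,
    // at construction, so later lookups are a plain field read.
    static final class JobInfo {
        private final long inputSize;

        JobInfo(List<SplitInfo> splits) {
            long total = 0L;
            for (SplitInfo split : splits) {
                total += split.getDataLength();
            }
            this.inputSize = total;
        }

        long getInputSize() {
            return inputSize;
        }
    }

    public static void main(String[] args) {
        List<SplitInfo> splits = Arrays.asList(
                new SplitInfo(128L), new SplitInfo(256L), new SplitInfo(64L));
        JobInfo job = new JobInfo(splits);
        System.out.println(job.getInputSize()); // prints 448
    }
}
```

Summing once up front, rather than on every call, keeps getInputSize() cheap 
for a caller such as a scheduler that may query it repeatedly.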

Among other uses, knowing how much data a job will run on should help in 
building smarter schedulers.
