[ https://issues.apache.org/jira/browse/HADOOP-3441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ari Rabkin updated HADOOP-3441:
-------------------------------
Comment: was deleted
> Pass the size of the MapReduce input to JobInProgress
> -----------------------------------------------------
>
> Key: HADOOP-3441
> URL: https://issues.apache.org/jira/browse/HADOOP-3441
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.17.0
> Environment: all
> Reporter: Ari Rabkin
> Assignee: Ari Rabkin
> Priority: Minor
> Fix For: 0.18.0
>
> Attachments: addDataSize.patch
>
>
> Currently, there's no easy way for the JobInProgress to know how large the
> job's input data is.
> This patch corrects the problem by storing the size of each input split's
> data in the RawSplit. The split sizes are then totaled and made available
> via JobInProgress.getInputSize().
> This is needed, among other reasons, so that the JobInProgress knows how much
> data it's being run on, which will help build smarter schedulers.
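
For illustration only, the approach described above might look roughly like the sketch below. This is not the contents of addDataSize.patch: SplitSketch and JobSketch are hypothetical stand-ins for RawSplit and JobInProgress, and every field and method name other than getInputSize() is an assumption.

```java
import java.util.Arrays;
import java.util.List;

// A minimal sketch, NOT the actual Hadoop classes or the attached patch.
final class InputSizeSketch {

    // Hypothetical stand-in for RawSplit: carries the byte size of its data.
    static final class SplitSketch {
        private final long dataLength; // bytes covered by this split (assumed field)

        SplitSketch(long dataLength) {
            if (dataLength < 0) {
                throw new IllegalArgumentException("dataLength must be >= 0");
            }
            this.dataLength = dataLength;
        }

        long getDataLength() {
            return dataLength;
        }
    }

    // Hypothetical stand-in for JobInProgress: totals the split sizes once.
    static final class JobSketch {
        private final long inputSize;

        JobSketch(List<SplitSketch> splits) {
            long total = 0L;
            for (SplitSketch split : splits) {
                total += split.getDataLength();
            }
            this.inputSize = total;
        }

        // Mirrors the accessor named in the issue description.
        long getInputSize() {
            return inputSize;
        }
    }

    public static void main(String[] args) {
        List<SplitSketch> splits = Arrays.asList(
                new SplitSketch(100L), new SplitSketch(200L), new SplitSketch(300L));
        System.out.println(new JobSketch(splits).getInputSize());
    }
}
```

A scheduler could then compare getInputSize() across jobs, for example to favor small jobs; that policy is an assumption here and not something the issue specifies.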
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.