[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859679#action_12859679
 ] 

Vinod K V commented on MAPREDUCE-1719:
--------------------------------------

This was found by Karam while testing the fix for MAPREDUCE-1635. Thanks Karam!

I think this happens because the 'long' arithmetic overflows. The first time a 
TT reports its map output, the calculations are roughly as follows:

reportedOutputOfMap  =   10751353136 bytes(10GB)
cumulativeOutputSize =   10751353136 (same as above)
cumulativeInputSize  = 1206552881823 (10TB total input for the whole job)

getEstimatedTotalMapOutputSize():
{code}
  long estimate = Math.round((inputSize * 
          completedMapsOutputSize * 2.0)/completedMapsInputSize);
{code}

I ran a simple Java program, and this comes out to 6655367 (about 6MB, because 
the product of the two longs overflows before it is multiplied by 2.0).
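
To make the overflow concrete, here is a minimal sketch of the same expression 
shape next to a widened variant. The class and method names are hypothetical 
and are not ResourceEstimator code. The figures above do not give a separate 
value for inputSize, so the sketch assumes it equals the cumulative input size, 
purely for illustration; it therefore does not try to reproduce 6655367.

```java
class EstimateOverflowSketch {

    // Same shape as the quoted expression: the two longs are multiplied
    // first, in 64-bit integer arithmetic, and only then promoted to double.
    static long naiveEstimate(long inputSize, long completedMapsOutputSize,
                              long completedMapsInputSize) {
        return Math.round((inputSize * completedMapsOutputSize * 2.0)
                / completedMapsInputSize);
    }

    // One possible remedy (not necessarily the fix committed for this
    // issue): widen to double before the first multiplication.
    static long wideEstimate(long inputSize, long completedMapsOutputSize,
                             long completedMapsInputSize) {
        return Math.round(((double) inputSize * completedMapsOutputSize * 2.0)
                / completedMapsInputSize);
    }

    // Math.multiplyExact throws ArithmeticException when a long product
    // does not fit in 64 bits, so it can be used to detect the overflow.
    static boolean productOverflows(long a, long b) {
        try {
            Math.multiplyExact(a, b);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        long completedMapsOutputSize = 10751353136L;   // from the comment
        long completedMapsInputSize = 1206552881823L;  // from the comment
        // ASSUMPTION: inputSize is not stated separately in the comment;
        // it is set equal to the cumulative input size as a placeholder.
        long inputSize = completedMapsInputSize;

        System.out.println("long product overflows: "
                + productOverflows(inputSize, completedMapsOutputSize));
        System.out.println("naive estimate: " + naiveEstimate(
                inputSize, completedMapsOutputSize, completedMapsInputSize));
        System.out.println("wide estimate:  " + wideEstimate(
                inputSize, completedMapsOutputSize, completedMapsInputSize));
    }
}
```

With these placeholder inputs the widened version returns 21502706272 (twice 
the completed output). The naive version cannot exceed roughly 15 million in 
magnitude, because the wrapped long product is bounded by 2^63 before it is 
doubled and divided by the input size.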

getEstimatedMapOutputSize():
{code}
 estimate = getEstimatedTotalMapOutputSize()  / job.desiredMaps();
{code}
This will be 59422 (59KB, which is totally wrong).

My guess is that further iterations of this estimate eventually settle at the 
stabilized 9642-byte estimate.

> Wrong map-output estimates by ResourceEstimator because of overflow errors in 
> 'long' calculations
> -------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1719
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1719
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>            Reporter: Vinod K V
>
> On a cluster with nearly full disks, while running a simple sort job on 
> 10TB of data with ~100 maps, ResourceEstimator is not triggered even after 
> 10% of the maps have completed. Instead, maps keep getting scheduled until 
> the disks become full. Only when the free disk space reaches zero does 
> ResourceEstimator finally come back alive, saying it can find only zero bytes 
> instead of the estimated '9642' bytes. The estimate should have been close to 
> 10GB.

