[ 
https://issues.apache.org/jira/browse/MAPREDUCE-743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12730677#action_12730677
 ] 

Ravi Gummadi commented on MAPREDUCE-743:
----------------------------------------

When compressed files are given as input to maps, the progress is effectively 
never updated. The uncompressed size of the input file is not known, so it is 
taken to be Long.MAX_VALUE, and the progress of a map task reading a 
compressed file is computed as a fraction of that value. The result is so 
small (on the order of 1/Long.MAX_VALUE) that it is ignored. The progress 
values seen are of the order of 10^-17 to 10^-11.
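
To make the arithmetic concrete, here is a minimal sketch of the division 
described above. The class and method names are hypothetical and are not taken 
from the Hadoop source; it only illustrates why a Long.MAX_VALUE denominator 
gives values in the range quoted.

```java
// Hypothetical illustration only; not the actual Hadoop record reader code.
class ProgressSketch {
    // Fraction of the input consumed, given bytes read so far and the
    // (assumed) total size of the input.
    static double fraction(long bytesRead, long totalBytes) {
        if (totalBytes <= 0) {
            return 0.0;
        }
        return Math.min(1.0, (double) bytesRead / (double) totalBytes);
    }

    public static void main(String[] args) {
        long[] samples = {100L, 1L << 20, 1L << 30};
        for (long bytesRead : samples) {
            // With an unknown size treated as Long.MAX_VALUE the fraction
            // stays many orders of magnitude below anything displayable.
            System.out.println(bytesRead + " bytes -> "
                    + fraction(bytesRead, Long.MAX_VALUE));
        }
    }
}
```

Reading 100 bytes gives a value near 10^-17 and reading 1 GB gives a value 
near 10^-10, which is consistent with the progress values observed.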

I found a page on the web, 
http://www.abeel.be/content/determine-uncompressed-size-gzip-file , which says 
that the last 4 bytes of a gzipped file contain the uncompressed file size. 
But this works only if the size is < 4GB, because a 4-byte field can only 
hold the size modulo 2^32.
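
A rough sketch of reading that trailer field, for discussion only. It assumes 
a local file with a single gzip member; the class and method names are made up 
for illustration, and a real fix would have to read through the Hadoop 
FileSystem API instead of RandomAccessFile.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper; not part of Hadoop.
class GzipIsize {
    // Returns the 4-byte little-endian size field stored at the end of a
    // gzip file. This is the uncompressed length modulo 2^32, so it is
    // wrong for inputs of 4GB or more, and for a file made of several
    // concatenated gzip members it describes only the last member.
    static long readIsize(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            // 10-byte header plus 8-byte trailer is the bare minimum.
            if (raf.length() < 18) {
                throw new IOException("too short to be a gzip file: " + path);
            }
            raf.seek(raf.length() - 4);
            byte[] b = new byte[4];
            raf.readFully(b);
            // Assemble as an unsigned 32-bit little-endian value.
            return (b[0] & 0xFFL)
                    | ((b[1] & 0xFFL) << 8)
                    | ((b[2] & 0xFFL) << 16)
                    | ((b[3] & 0xFFL) << 24);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readIsize(args[0]));
    }
}
```

Even where it applies, the value could only serve as an estimate of the total 
size for progress reporting, given the limits noted in the comments.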

Any thoughts on getting the uncompressed size of compressed files (at 
least for gzipped files)?

> Progress of map phase in map task is not updated properly
> ---------------------------------------------------------
>
>                 Key: MAPREDUCE-743
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-743
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: task
>    Affects Versions: 0.21.0
>            Reporter: Ravi Gummadi
>            Assignee: Ravi Gummadi
>             Fix For: 0.21.0
>
>         Attachments: MR-743.patch, MR-743.v1.patch
>
>
> The progress of the map phase in a map task is not updated properly. The 
> progress reported by TrackedRecordReader and NewTrackingRecordReader should 
> be set on the progress object of the map phase. Instead, it was being set 
> as the progress of the whole task, and because the task is divided into 
> phases, that value is not counted as part of the map task's progress.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
