[ https://issues.apache.org/jira/browse/HADOOP-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12481705 ]
Owen O'Malley commented on HADOOP-1128:
---------------------------------------
This is a good catch. A couple of other points:
1. start should be set to the position in the stream after the sync, so that
the "real" start point is used.
2. the result of the division should be capped at 1.0 so that getProgress
never returns a value greater than 1.0f. That can happen because splits are
chosen blindly and then adjusted to the sync boundaries. (Both the start and
end boundaries are pushed back to the next sync boundary.)
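
To make the two points concrete, here is a minimal sketch of the calculation,
not the attached patch. The class and method names (ProgressSketch,
truncatedProgress, cappedProgress) are hypothetical, and the positions are
passed in as plain longs instead of being read from the stream. It assumes
start already holds the position after the first sync, as in point 1.

```java
/** Hypothetical illustration of the progress calculation discussed above. */
final class ProgressSketch {

    /** Integer division: yields 0 until pos reaches end, then jumps to 1. */
    static float truncatedProgress(long start, long end, long pos) {
        if (end == start) {
            return 0.0f;
        }
        // Both operands are long, so the quotient is truncated before
        // it is widened to float.
        return (pos - start) / (end - start);
    }

    /** Float division, capped so the result never exceeds 1.0f. */
    static float cappedProgress(long start, long end, long pos) {
        if (end == start) {
            return 0.0f;
        }
        // The cast forces floating-point division. The cap covers the case
        // where the reader runs past the nominal end of the split to reach
        // the next sync boundary.
        return Math.min(1.0f, (pos - start) / (float) (end - start));
    }
}
```

With start = 0, end = 100 and pos = 50, truncatedProgress returns 0.0f while
cappedProgress returns 0.5f. With pos = 120 (past the nominal end),
cappedProgress returns 1.0f.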
> Missing progress information in map tasks
> -----------------------------------------
>
> Key: HADOOP-1128
> URL: https://issues.apache.org/jira/browse/HADOOP-1128
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.12.1
> Reporter: Andrzej Bialecki
> Assigned To: Andrzej Bialecki
> Fix For: 0.12.1
>
> Attachments: progress.patch
>
>
> Long-running map tasks don't update their progress properly - the progress
> percentage stays at 0% and then jumps suddenly to 100% at the end of the task.
> The reason, discovered by Espen Amble Kolstad, is that there's a missing cast
> to float in SequenceFileRecordReader and in LineRecordReader.