[ 
https://issues.apache.org/jira/browse/HADOOP-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy updated HADOOP-3131:
----------------------------------

    Status: Open  (was: Patch Available)

Matei, sorry I missed this piece the first time around:

{noformat}
+          for (Segment<K, V> s: segmentsToMerge) {
+            totalBytesProcessed += s.getPosition(); // Count initial bytes read
+          }
+          if (totalBytes != 0) {
+            mergeProgress.set(totalBytesProcessed * progPerByte);
+          } else {
+            mergeProgress.set(1.0f);
+          }
{noformat}

At best this block reports progress slightly early (i.e. before the final merge 
begins), and at worst it reports a completely wrong progress value while merging 
intermediate map-outputs, since the output for all reduces is in a single file. 
Hence {{s.getPosition}} is hopelessly off as a measure of merge progress... I 
vote we just do away with that block.
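
To illustrate the concern, here is a minimal, self-contained sketch. It assumes that a segment's position is an absolute offset into a map-output file shared by every reduce partition, which is how the comment above reads. {{SegmentSketch}}, its fields, and both helper methods are hypothetical stand-ins, not Hadoop's actual {{Merger.Segment}} API.

```java
// Illustrative sketch only. SegmentSketch and its fields are hypothetical
// stand-ins, not Hadoop's actual Merger.Segment API.
final class MergeProgressSketch {

    /** One reduce partition's byte range inside a shared map-output file. */
    static final class SegmentSketch {
        final long startOffset; // assumed: where this partition begins in the file
        final long length;      // bytes that belong to this partition
        long position;          // assumed: absolute file offset of the reader

        SegmentSketch(long startOffset, long length) {
            this.startOffset = startOffset;
            this.length = length;
            this.position = startOffset; // reader is positioned, nothing merged yet
        }
    }

    /** Mirrors the quoted block: sums absolute positions as "bytes processed". */
    static float progressFromAbsolutePosition(SegmentSketch[] segments) {
        long totalBytes = 0;
        long processed = 0;
        for (SegmentSketch s : segments) {
            totalBytes += s.length;
            processed += s.position;
        }
        return totalBytes == 0 ? 1.0f : (float) processed / totalBytes;
    }

    /** Alternative: counts only the bytes consumed inside each segment. */
    static float progressFromBytesConsumed(SegmentSketch[] segments) {
        long totalBytes = 0;
        long processed = 0;
        for (SegmentSketch s : segments) {
            totalBytes += s.length;
            processed += s.position - s.startOffset;
        }
        return totalBytes == 0 ? 1.0f : (float) processed / totalBytes;
    }

    public static void main(String[] args) {
        // Two map outputs, each holding this reduce's 1000-byte partition
        // at offset 3000 of its file. No bytes have been merged yet.
        SegmentSketch[] segments = {
            new SegmentSketch(3000, 1000),
            new SegmentSketch(3000, 1000)
        };
        System.out.println(progressFromAbsolutePosition(segments));
        System.out.println(progressFromBytesConsumed(segments));
    }
}
```

With these made-up numbers, the position-based figure is 6000 / 2000 before a single record has been merged, while the segment-relative figure is 0. Whether the real {{Segment}} exposes a start offset for such a correction is not established here, so dropping the block may well remain the simpler fix.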

> enabling BLOCK compression for map outputs breaks the reduce progress counters
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-3131
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3131
>             Project: Hadoop Core
>          Issue Type: Bug
>    Affects Versions: 0.17.1, 0.17.0, 0.17.2, 0.18.0, 0.19.0
>            Reporter: Colin Evans
>            Assignee: Matei Zaharia
>             Fix For: 0.19.0
>
>         Attachments: HADOOP-3131-v2.patch, HADOOP-3131-v3.patch, 
> HADOOP-3131-v4.patch, HADOOP-3131-v5.patch, merge-progress-trunk.patch, 
> merge-progress.patch, Picture 1.png
>
>
> Enabling map output compression and setting the compression type to BLOCK 
> causes the progress counters during the reduce to go crazy and report 
> progress counts over 100%.
> This is problematic for speculative execution because it thinks the tasks are 
> doing fine.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
