[ 
http://issues.apache.org/jira/browse/HADOOP-76?page=comments#action_12450232 ] 
            
Owen O'Malley commented on HADOOP-76:
-------------------------------------

The proposed PhasedFileSystem needs to support mkdirs in order to work with 
MapFileOutputFormat.

It will break fewer user OutputFormats if we also support the read operations:
exists
openRaw
getLength
isDirectory
listPathsRaw
setWorkingDirectory
getWorkingDirectory
by passing the request to the underlying FileSystem.
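
A minimal sketch of that pass-through, assuming a cut-down file system 
interface. MiniFs, InMemoryFs and ForwardingFs are hypothetical names invented 
for illustration and are much smaller than the real FileSystem class; only the 
forwarding pattern is the point. How mkdirs should interact with the temporary 
file phase is left open here, so the sketch simply forwards it.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, cut-down stand-in for FileSystem; not the Hadoop API.
interface MiniFs {
  boolean exists(String path);
  boolean isDirectory(String path);
  boolean mkdirs(String path);
}

// In-memory implementation used only to exercise the wrapper.
class InMemoryFs implements MiniFs {
  private final Set<String> dirs = new HashSet<String>();
  public boolean exists(String path) { return dirs.contains(path); }
  public boolean isDirectory(String path) { return dirs.contains(path); }
  public boolean mkdirs(String path) { dirs.add(path); return true; }
}

// Wrapper in the spirit of PhasedFileSystem: each query is handed
// unchanged to the underlying file system.
class ForwardingFs implements MiniFs {
  private final MiniFs base;
  ForwardingFs(MiniFs base) { this.base = base; }
  public boolean exists(String path) { return base.exists(path); }
  public boolean isDirectory(String path) { return base.isDirectory(path); }
  public boolean mkdirs(String path) { return base.mkdirs(path); }
}
```

An OutputFormat that calls exists or isDirectory before writing would then see 
the same answers through the wrapper as it would from the underlying file 
system.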

It is confusing to use TaskInProgress.hasSucceededTask for reduces and 
TaskInProgress.completes for maps. I think it would be better to use 
completes for both.

Thanks for putting the generic types into activeTasks, but it should look like:

Map<String, String> activeTasks = new HashMap();

The declared type should use the interface and the constructor doesn't need the 
generic types.
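
To illustrate why declaring against the interface helps, here is a small 
sketch. ActiveTasksSketch and the key/value meaning (task id to tracker name) 
are assumptions made for the example, not taken from the patch. It also assumes 
a Java 7 or later compiler, where the diamond operator keeps the constructor 
free of repeated type arguments without an unchecked warning.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

class ActiveTasksSketch {
  // Declared against the Map interface, so callers never depend on HashMap.
  // The meaning of key and value is an assumption for this sketch.
  static Map<String, String> newActiveTasks() {
    return new HashMap<>();   // diamond (Java 7+): type arguments are inferred
  }

  // Because the declared type is the interface, the implementation can be
  // swapped without touching any caller.
  static Map<String, String> newSortedActiveTasks() {
    return new TreeMap<>();
  }
}
```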

The logic in TaskInProgress.isRunnable is pretty convoluted, although I think 
that if you use completes for reduces, it doesn't need to change.

TaskInProgress.hasRanOnMachine should be hasRunOnMachine.

findNewTask should add a new condition to the else-if instead of using continue:


               } else if (specTarget == -1 &&
                          task.hasSpeculativeTask(avgProgress)) {
+                if(task.hasRanOnMachine(taskTracker)){
+                  continue ;
+                }
                 specTarget = i;
               }

should be:

               } else if (specTarget == -1 &&
                          task.hasSpeculativeTask(avgProgress) &&
                          !task.hasRanOnMachine(taskTracker)) {
                 specTarget = i;
               }
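
For reference, a self-contained sketch of the selection with the extra 
condition folded in. SketchTask and FindNewTaskSketch are hypothetical 
stand-ins for TaskInProgress and the real findNewTask, and the earlier branches 
of the original if/else chain are omitted.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for TaskInProgress with only what the loop needs.
class SketchTask {
  private final boolean speculative;
  private final Set<String> machines;

  SketchTask(boolean speculative, String... machines) {
    this.speculative = speculative;
    this.machines = new HashSet<String>(Arrays.asList(machines));
  }

  // The real method compares progress against avgProgress; here the
  // answer is fixed at construction time.
  boolean hasSpeculativeTask(double avgProgress) { return speculative; }

  boolean hasRanOnMachine(String taskTracker) {
    return machines.contains(taskTracker);
  }
}

class FindNewTaskSketch {
  // Returns the index of the first speculative candidate that has not
  // already run on taskTracker, or -1 if there is none.
  static int findSpecTarget(List<SketchTask> tasks, String taskTracker,
                            double avgProgress) {
    int specTarget = -1;
    for (int i = 0; i < tasks.size(); i++) {
      SketchTask task = tasks.get(i);
      if (specTarget == -1 &&
          task.hasSpeculativeTask(avgProgress) &&
          !task.hasRanOnMachine(taskTracker)) {
        specTarget = i;
      }
    }
    return specTarget;
  }
}
```

With a single condition there is no continue that skips the rest of the loop 
body, so anything added after the if/else chain later still runs for every 
task.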

The patch always creates a PhasedFileSystem, even when speculative execution 
is off and it will never be used.

@@ -298,7 +305,14 @@

     } finally {
       reducer.close();
-      out.close(reporter);
+      if( runSpeculative ){
+        out.close(reporter);
+        pfs.commit();
+        pfs.close();
+       }else{
+        out.close(reporter);
+        fs.close();
+      }
     }

"out.close(reporter);" should be lifted out of the branch. And it is usually 
better not to close file systems because they are cached and may be used in 
another context. So I'd drop the else clause altogether.
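
A sketch of the resulting cleanup order. ReduceCloseSketch is a hypothetical 
harness that only records which calls would be made; the strings stand in for 
the real calls in the patch.

```java
import java.util.ArrayList;
import java.util.List;

class ReduceCloseSketch {
  // Returns the calls in the order they would happen.
  static List<String> finish(boolean runSpeculative) {
    List<String> calls = new ArrayList<String>();
    try {
      calls.add("reduce");
    } finally {
      calls.add("reducer.close()");
      calls.add("out.close(reporter)");   // done once, outside the branch
      if (runSpeculative) {
        calls.add("pfs.commit()");
        calls.add("pfs.close()");
      }
      // No else clause: the cached fs is left open for other users.
    }
    return calls;
  }
}
```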




> Implement speculative re-execution of reduces
> ---------------------------------------------
>
>                 Key: HADOOP-76
>                 URL: http://issues.apache.org/jira/browse/HADOOP-76
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.1.0
>            Reporter: Doug Cutting
>         Assigned To: Sanjay Dahiya
>            Priority: Minor
>         Attachments: Hadoop-76.patch, Hadoop-76_1.patch, Hadoop-76_2.patch, 
> Hadoop-76_3.patch, Hadoop-76_4.patch, Hadoop-76_5.patch, spec_reducev.patch
>
>
> As a first step, reduce task outputs should go to temporary files which are 
> renamed when the task completes.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
