[ 
https://issues.apache.org/jira/browse/HADOOP-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdy updated HADOOP-6093:
--------------------------

    Description: 
If the Capacity Scheduler decides to kill a reduce task, it selects the task 
that has made the least progress. In my test setup I created a dummy reduce task 
that does nothing but wait indefinitely. Every reducer reports a progress of "1" 
because all reducers are still processing their last record. As a result, 
"getRunningTaskWithLeastProgress(tip)" returns null, so no task is killed.

Although this is unlikely to occur in a production setup (timeout killing 
would kick in anyway), the behavior may be unexpected.

I will attach a patch.

  was:
If the Capacity Scheduler decides to kill a reduce task, it selects the task 
that has made the least progress. In my test setup I created a dummy reduce task 
that does nothing but wait indefinitely. Every reducer reports a progress of "1". 
As a result, "getRunningTaskWithLeastProgress(tip)" returns null, so no task is 
killed.

Although this is unlikely to occur in a production setup (timeout killing 
would kick in anyway), the behavior may be unexpected.

I will attach a patch.

        Summary: Capacity Scheduler does not kill reduce tasks if all reducers 
are in the progress of their last record.  (was: Capacity Scheduler does not 
kill reduce tasks if no running reducers have made any progress at all.)

> Capacity Scheduler does not kill reduce tasks if all reducers are in the 
> progress of their last record.
> -------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-6093
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6093
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.20.0
>            Reporter: Ferdy
>            Priority: Minor
>         Attachments: PatchHadoop6093v1.patch
>
>
> If the Capacity Scheduler decides to kill a reduce task, it selects the 
> task that has made the least progress. In my test setup I created a dummy 
> reduce task that does nothing but wait indefinitely. Every reducer reports a 
> progress of "1" because all reducers are still processing their last record. 
> As a result, "getRunningTaskWithLeastProgress(tip)" returns null, so no task 
> is killed.
> Although this is unlikely to occur in a production setup (timeout 
> killing would kick in anyway), the behavior may be unexpected.
> I will attach a patch.
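
The behavior described above can be reproduced in isolation. The following is a minimal sketch, not the Hadoop source: it assumes the selection seeds its running minimum at 1 and compares with a strict less-than, which is one way such a method could return null when every reducer reports a progress of 1. The class LeastProgressSketch, the Attempt type, and both method names are hypothetical. The second method shows one possible alternative and is not necessarily what the attached patch does.

```java
import java.util.Arrays;
import java.util.List;

public class LeastProgressSketch {

    /** Hypothetical stand-in for a running task attempt; not a Hadoop class. */
    static final class Attempt {
        final String id;
        final double progress; // fraction completed, from 0.0 to 1.0

        Attempt(String id, double progress) {
            this.id = id;
            this.progress = progress;
        }
    }

    /**
     * Assumed shape of the problematic selection: the minimum starts at 1.0
     * and a candidate is accepted only if its progress is strictly lower.
     * When every attempt reports 1.0, nothing is accepted and null is returned.
     */
    static Attempt pickWithSentinel(List<Attempt> running) {
        double least = 1.0;
        Attempt chosen = null;
        for (Attempt a : running) {
            if (a.progress < least) {
                least = a.progress;
                chosen = a;
            }
        }
        return chosen;
    }

    /**
     * One possible alternative: use the first running attempt as the baseline,
     * so a non-empty list always yields a candidate to kill.
     */
    static Attempt pickFirstAsBaseline(List<Attempt> running) {
        Attempt chosen = null;
        for (Attempt a : running) {
            if (chosen == null || a.progress < chosen.progress) {
                chosen = a;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Two reducers that are both waiting on their last record.
        List<Attempt> stuck = Arrays.asList(
                new Attempt("r_0", 1.0),
                new Attempt("r_1", 1.0));

        System.out.println(pickWithSentinel(stuck) == null); // prints "true"
        System.out.println(pickFirstAsBaseline(stuck).id);   // prints "r_0"
    }
}
```

With the baseline variant, ties are broken in favor of the first attempt encountered; whether that is the right attempt to kill is a separate policy question.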

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
