[ https://issues.apache.org/jira/browse/HADOOP-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Smith updated HADOOP-3687:
--------------------------------
Attachment: hadoop-pausing.8.trunk.patch
Attached is a patch showing my current progress on this issue. It defines a new
job priority (PAUSED), which prevents new reducers from being started and pauses
reduce tasks that are already running. You can also pause individual (reduce)
tasks via the command line or web UI. Paused tasks (from non-paused jobs) are
resumed when their tracker requests new work and there are no higher-priority
tasks waiting.
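
To make the resume rule concrete, here is a rough sketch of the check as
described above. It is not taken from the patch: the class, the method name, the
Priority enum and the position of PAUSED in it are all assumptions for
illustration only.

```java
import java.util.Arrays;
import java.util.List;

public class ResumeSketch {
    // Hypothetical priority list, highest first. Placing PAUSED last is an
    // assumption, not something the patch is known to do.
    enum Priority { VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW, PAUSED }

    /**
     * Decides whether a paused task may resume when its tracker asks for work.
     * jobPriority is the priority of the paused task's job; waiting holds the
     * job priorities of tasks still waiting to be scheduled.
     */
    static boolean mayResume(Priority jobPriority, List<Priority> waiting) {
        if (jobPriority == Priority.PAUSED) {
            return false; // the whole job is paused, so its tasks stay paused
        }
        for (Priority p : waiting) {
            // Enum order is highest-first, so "less than" means higher priority.
            if (p.compareTo(jobPriority) < 0) {
                return false; // higher-priority work is waiting
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(mayResume(Priority.LOW, Arrays.asList(Priority.HIGH)));
        System.out.println(mayResume(Priority.LOW, Arrays.asList(Priority.VERY_LOW)));
    }
}
```
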
The TaskTracker communicates with in-progress tasks by replacing the boolean
response to ping/updateStatus in the TaskUmbilicalProtocol with a
TaskPingResponse object, which specifies both whether the task is known by the
tracker and whether it is paused. Once a task is paused, it sits in a sleep loop
waiting to be unpaused.
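
For anyone who wants the shape of this without reading the patch, a minimal
sketch follows. Only the name TaskPingResponse and the two pieces of information
it carries come from the description above; the field and method names, the
trimmed-down Umbilical interface and the waitWhilePaused helper are hypothetical
and may not match the patch.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class PauseSketch {
    /** Stand-in for the reply that replaces the old boolean (fields assumed). */
    static final class TaskPingResponse {
        private final boolean known;
        private final boolean paused;

        TaskPingResponse(boolean known, boolean paused) {
            this.known = known;
            this.paused = paused;
        }

        boolean isKnown() { return known; }
        boolean isPaused() { return paused; }
    }

    /** Hypothetical one-method stand-in for TaskUmbilicalProtocol. */
    interface Umbilical {
        TaskPingResponse ping(String taskId);
    }

    /**
     * Sleep loop on the task side: keeps pinging until the tracker reports the
     * task as unpaused. Returns false if the tracker no longer knows the task
     * or the thread is interrupted, true once the task may continue.
     */
    static boolean waitWhilePaused(Umbilical umbilical, String taskId, long sleepMillis) {
        while (true) {
            TaskPingResponse response = umbilical.ping(taskId);
            if (!response.isKnown()) {
                return false;
            }
            if (!response.isPaused()) {
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Fake tracker: reports "paused" for the first two pings, then resumes.
        AtomicInteger pings = new AtomicInteger();
        Umbilical fake = id -> new TaskPingResponse(true, pings.incrementAndGet() < 3);
        boolean resumed = waitWhilePaused(fake, "example_task", 10L);
        System.out.println(resumed + " after " + pings.get() + " pings");
    }
}
```

Note that a task waiting in this loop still holds its heap, which is why the
number of paused tasks is bounded by memory.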
As previously mentioned, paused tasks are kept in memory, so there's an obvious
limit on how many tasks you can pause at once. We're currently testing the patch
on a cluster to see whether or not this is problematic in practice.
Comments/suggestions welcome!
> Ability to pause/resume tasks
> -----------------------------
>
> Key: HADOOP-3687
> URL: https://issues.apache.org/jira/browse/HADOOP-3687
> Project: Hadoop Core
> Issue Type: New Feature
> Components: mapred
> Reporter: Chris Smith
> Assignee: Chris Smith
> Priority: Minor
> Attachments: hadoop-pausing.8.trunk.patch
>
>
> It would be nice to be able to pause (and subsequently resume) tasks that are
> currently running, in order to allow tasks from higher priority jobs to
> execute. At present it is quite easy for long-running tasks from low priority
> jobs to block a task from a newer high priority job, and there is no way to
> force the execution of the high priority task without killing the low
> priority jobs.