[jira] Updated: (HADOOP-2847) [HOD] Idle cluster cleanup does not work if the JobTracker becomes unresponsive to RPC calls

Hemanth Yamijala (JIRA) Thu, 21 Feb 2008 10:32:00 -0800

     [ https://issues.apache.org/jira/browse/HADOOP-2847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Hemanth Yamijala updated HADOOP-2847:
-------------------------------------

    Attachment: hadoop-2847

This patch adds error handling around the code that calls the hadoop client 
to determine the number of running jobs. If an exception is thrown there, 
typically due to SocketTimeout or SocketException, the error code returned by 
the hadoop client is captured and used when determining the idle time.
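
The idea can be sketched roughly as follows. This is not the attached patch, only a minimal illustration under stated assumptions: the helper names (count_running_jobs, IdleTracker) are hypothetical, the output format "N jobs currently running" is assumed, and treating a failed client call as "no running jobs" is one possible reading of how the error code feeds into the idle time.

```python
import subprocess
import time


def count_running_jobs(cmd, timeout=60):
    """Run a job-listing command and return (exit_code, job_count).

    job_count is None when the client fails or its output cannot be parsed.
    Assumption: the first output line looks like "N jobs currently running".
    """
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True,
                              timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        # The client could not be started or did not return in time.
        return -1, None
    if proc.returncode != 0:
        # Keep the client's error code instead of raising, so the caller
        # can still update its idle-time bookkeeping.
        return proc.returncode, None
    lines = proc.stdout.strip().splitlines()
    try:
        return 0, int(lines[0].split()[0])
    except (IndexError, ValueError):
        return 0, None


class IdleTracker:
    """Tracks how long the cluster has gone without a confirmed running job."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._last_busy = clock()

    def update(self, exit_code, job_count):
        # Only a successful reply that reports running jobs resets the timer.
        # A failed call (non-zero exit code) lets idle time keep accumulating.
        if exit_code == 0 and job_count:
            self._last_busy = self._clock()
        return self._clock() - self._last_busy
```

A caller might poll with count_running_jobs(["hadoop", "job", "-list"]) and deallocate the cluster once update() returns more than the configured idle limit; the exact command line HOD uses may differ.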

> [HOD] Idle cluster cleanup does not work if the JobTracker becomes 
> unresponsive to RPC calls
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2847
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2847
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/hod
>    Affects Versions: 0.16.0
>            Reporter: Hemanth Yamijala
>            Assignee: Hemanth Yamijala
>            Priority: Blocker
>             Fix For: 0.16.1
>
>         Attachments: hadoop-2847
>
>
> Under some error conditions, the Hadoop JobTracker becomes unresponsive to 
> RPC calls (for example, if a misconfiguration causes the JobTracker to run 
> out of memory). In such cases, a cluster allocated by HOD no longer runs any 
> jobs and needlessly holds on to its nodes. The usual idle cluster cleaner 
> should deallocate such a cluster, but it does not.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
