[ 
https://issues.apache.org/jira/browse/HADOOP-2847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12570745#action_12570745
 ] 

Hemanth Yamijala commented on HADOOP-2847:
------------------------------------------

Runping, your point is valid. However, the issue at hand is independent: it 
concerns what HOD should do when the JobTracker is unresponsive.

We discussed the point of missing small jobs while developing HOD 0.4. However, 
we didn't have any information then that this was a limitation on our user 
clusters, as HOD 0.3 essentially followed the same approach. Further, we needed 
support from Hadoop to do this. AFAIK, the JobStatus object doesn't expose a 
job's completion time. All that said, there is room for improvement. I have 
just filed HADOOP-2859 to track this issue.

Meanwhile, as fixing this issue is more urgent, I am going ahead with the 
basic approach and changing the error handling.

> [HOD] Idle cluster cleanup does not work if the JobTracker becomes 
> unresponsive to RPC calls
> --------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-2847
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2847
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/hod
>    Affects Versions: 0.16.0
>            Reporter: Hemanth Yamijala
>            Assignee: Hemanth Yamijala
>            Priority: Blocker
>             Fix For: 0.16.1
>
>
> In some error conditions, the Hadoop JobTracker becomes unresponsive to 
> RPC calls (e.g. if a misconfiguration causes the JobTracker to run out of 
> memory). In such cases, a cluster allocated by HOD no longer runs any jobs 
> and wastefully holds on to nodes. The usual idle cluster cleaner should 
> ideally deallocate the cluster, but it does not.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
