[jira] Commented: (HADOOP-5478) Provide a node health check script and run it periodically to check the node health status

Hemanth Yamijala (JIRA) Tue, 16 Jun 2009 21:16:33 -0700

    [ https://issues.apache.org/jira/browse/HADOOP-5478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12720490#action_12720490 ]


Hemanth Yamijala commented on HADOOP-5478:
------------------------------------------

bq. This is disappointing. Hadoop has enough ports open that I think it 
qualifies as a cheese.

I was counting on you to object, Allen *smile*.

What do others think? I guess we could piggyback on the port used for the 
TaskUmbilicalProtocol, though I need Sreekanth to confirm this. The main 
concern is whether it would interfere with the processing of tasks reporting 
their status to the TT. To be clear, though, the amount of work done in the 
RPC for the health reporter call is minimal: it just sets two variables.
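
To illustrate how little the handler would do, here is a minimal sketch. It 
assumes the two variables are a health flag and a report string; the class, 
field and method names are hypothetical and are not taken from the attached 
patches.

```java
// Hypothetical sketch of the state a health reporter RPC would update.
// All names here are illustrative assumptions, not the patch's actual code.
class NodeHealthState {
    // The two variables the call sets (assumed: a flag and a message).
    // volatile makes each write visible to other threads without a lock.
    private volatile boolean healthy = true;
    private volatile String report = "";

    // Body of the assumed RPC handler: two field writes, no I/O, no locking,
    // so it should hold an RPC handler thread only very briefly.
    // Note: the two writes are not atomic as a pair; a reader may briefly
    // see the new flag together with the old report.
    void reportHealth(boolean isHealthy, String healthReport) {
        this.healthy = isHealthy;
        this.report = (healthReport == null) ? "" : healthReport;
    }

    boolean isHealthy() {
        return healthy;
    }

    String getReport() {
        return report;
    }
}
```

If the handler really is this small, sharing the port should add very little 
load next to the task status updates, but that would still need to be 
measured.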

> Provide a node health check script and run it periodically to check the node 
> health status
> ------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5478
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5478
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Aroop Maliakkal
>            Assignee: Sreekanth Ramakrishnan
>         Attachments: hadoop-5478-1.patch, hadoop-5478-2.patch, 
> hadoop-5478-3.patch, hadoop-5478-4.patch, hadoop-5478-5.patch
>
>
> Hadoop must have some mechanism to find the health status of a node. It 
> should run the health check script periodically and, if there are any 
> errors, it should blacklist the node. This will be really helpful when we 
> run static mapred clusters. Otherwise we may have to run some 
> scripts/daemons periodically to find the node status and take it offline 
> manually.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
