[ 
https://issues.apache.org/jira/browse/HADOOP-3585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arun C Murthy reassigned HADOOP-3585:
-------------------------------------

    Assignee: Ioannis Koltsidas

The audit warnings occur because the patch adds new files that lack the 
Apache License header. Please fix them, thanks!

> Hardware Failure Monitoring in large clusters running Hadoop/HDFS
> -----------------------------------------------------------------
>
>                 Key: HADOOP-3585
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3585
>             Project: Hadoop Core
>          Issue Type: New Feature
>         Environment: Linux
>            Reporter: Ioannis Koltsidas
>            Assignee: Ioannis Koltsidas
>            Priority: Minor
>         Attachments: FailMon-standalone.zip, failmon.pdf, failmon.pdf, 
> failmon2.pdf, FailMon_Package_descrip.html, FailMon_QuickStart.html, 
> HADOOP-3585.2.patch, HADOOP-3585.3.patch, HADOOP-3585.patch, HADOOP-3585.patch
>
>   Original Estimate: 480h
>  Remaining Estimate: 480h
>
> At IBM we're interested in identifying hardware failures on large clusters 
> running Hadoop/HDFS. We are working on a framework that will enable nodes to 
> identify failures on their hardware using the Hadoop log, the system log and 
> various OS hardware diagnostic utilities. The implementation details are not 
> yet settled, but a draft of our design is in the attached document 
> failmon.doc. We are very interested in Hadoop and system logs from failed 
> machines, so if you have any, you are welcome to contribute them; 
> they would be of great value for hardware failure diagnosis. 
> More details will follow in a later post.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
