[ 
https://issues.apache.org/jira/browse/HADOOP-4103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Kunz updated HADOOP-4103:
-----------------------------------

    Description: 
A large number of datanodes were marked dead because network problems caused 
heartbeat timeouts, although the datanodes themselves were fine.

Many processes started to fail because of the corrupted filesystem.

In order to catch and diagnose such problems faster, the namenode should detect 
the corruption automatically and provide a way to alert operations. At a 
minimum, it should show the corruption in the GUI.

  was:
A whole bunch of datanodes became dead because of some network problems 
resulting in  heartbeat timeouts although datanodes were fine.

Many processes started to fail because of the corrupted filesystem.

In order to catch and diagnose such problems faster the namenode should detect 
the corruption automatically and provide a way to alert operations. At the 
minimum it should the fact of corruption on the GUI.


> Alert for missing blocks
> ------------------------
>
>                 Key: HADOOP-4103
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4103
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.17.2
>            Reporter: Christian Kunz
>
> A large number of datanodes were marked dead because network problems caused 
> heartbeat timeouts, although the datanodes themselves were fine.
> Many processes started to fail because of the corrupted filesystem.
> In order to catch and diagnose such problems faster, the namenode should 
> detect the corruption automatically and provide a way to alert operations. At 
> a minimum, it should show the corruption in the GUI.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
