[jira] [Updated] (HDFS-8869) Don't mark storages as failed before first block report

2018-03-30 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-8869:
--
Target Version/s:   (was: 2.7.6)

> Don't mark storages as failed before first block report
> ---
>
> Key: HDFS-8869
> URL: https://issues.apache.org/jira/browse/HDFS-8869
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.7.0
> Reporter: Rushabh S Shah
> Assignee: Daryn Sharp
> Priority: Major
>
> Creating this ticket on behalf of [~daryn].
> Heartbeat processing performs the failed-storage check. The DataNode (DN) 
> reports its storages, and any previously known storages missing from the 
> report (e.g. after a unique storage ID upgrade) are marked failed. The 
> heartbeat monitor then removes all blocks associated with the failed 
> storage, and a replication storm ensues for all blocks on the node.
> Eventually the DN sends block reports for the new storages, up to 15 minutes 
> later on large clusters. The NameNode (NN) now has many excess blocks to 
> invalidate. If the cluster has failed over in the past 24 hours (e.g. during 
> a rolling upgrade), the standby that became active queues the block 
> invalidations, which triggers the severe performance degradation described 
> in HDFS-8674. That problem has been greatly lessened but is still an issue.
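
The sequence described in the ticket can be illustrated with a small, self-contained sketch. This is not the Hadoop code and not the actual patch: the class StorageCheckSketch, its DatanodeView type, the guard flag and the storage IDs are all hypothetical names invented for illustration. It assumes one possible reading of the ticket title, namely that the failed-storage check is skipped until the node has sent its first block report.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model. None of these names are Hadoop APIs.
class StorageCheckSketch {

    /** NameNode-side view of a single DataNode (assumed structure). */
    static class DatanodeView {
        // storage ID -> block IDs the NameNode believes live on that storage
        final Map<String, Set<Long>> blocksByStorage = new HashMap<>();
        final Set<String> failedStorages = new HashSet<>();
        boolean firstBlockReportReceived = false;

        void addBlock(String storageId, long blockId) {
            blocksByStorage.computeIfAbsent(storageId, k -> new HashSet<>()).add(blockId);
        }
    }

    /**
     * Processes the storage IDs carried by one heartbeat and returns the
     * number of blocks dropped because their storage was marked failed.
     * guard == false mimics the behaviour described in the ticket;
     * guard == true defers the check until the first block report (assumption).
     */
    static int processHeartbeat(DatanodeView dn, Set<String> reported, boolean guard) {
        if (guard && !dn.firstBlockReportReceived) {
            return 0; // too early to tell a failed storage from a renamed one
        }
        int dropped = 0;
        for (Map.Entry<String, Set<Long>> e : dn.blocksByStorage.entrySet()) {
            boolean missing = !reported.contains(e.getKey());
            // failedStorages.add returns true only the first time a storage is marked
            if (missing && dn.failedStorages.add(e.getKey())) {
                dropped += e.getValue().size();
                e.getValue().clear(); // these blocks would now look under-replicated
            }
        }
        return dropped;
    }

    /** A node whose three blocks are still recorded under an old storage ID. */
    static DatanodeView sampleNode() {
        DatanodeView dn = new DatanodeView();
        for (long b = 1; b <= 3; b++) {
            dn.addBlock("DS-old-1", b);
        }
        return dn;
    }

    public static void main(String[] args) {
        // After the upgrade the node heartbeats with a new storage ID only.
        Set<String> reported = Collections.singleton("DS-new-1");
        System.out.println("without guard: dropped "
                + processHeartbeat(sampleNode(), reported, false) + " block(s)");
        System.out.println("with guard: dropped "
                + processHeartbeat(sampleNode(), reported, true) + " block(s)");
    }
}
```

In this toy model the unguarded heartbeat drops all three blocks recorded under the old storage ID, while the guarded one drops none because no block report has arrived yet. The sketch deliberately leaves out replication scheduling, invalidation queues and failover, which the ticket also mentions.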



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-8869) Don't mark storages as failed before first block report

2016-08-17 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-8869:
--
Target Version/s: 2.7.4  (was: 2.7.3)

2.7.3 is under release process, changing target-version to 2.7.4.



[jira] [Updated] (HDFS-8869) Don't mark storages as failed before first block report

2015-11-03 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated HDFS-8869:
--
Target Version/s: 2.7.3  (was: 2.7.2)

Moving out all non-critical / non-blocker issues that didn't make it out of 
2.7.2 into 2.7.3.
