[ 
https://issues.apache.org/jira/browse/HDFS-11155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-11155.
------------------------------------
    Resolution: Not A Problem

It turns out that the symptom described in this JIRA is caused by HDFS-11160, 
which is the root cause. I am closing this issue so that I can concentrate the 
fix on HDFS-11160.

> VolumeScanner should report the latest generation stamp of a bad replica
> ------------------------------------------------------------------------
>
>                 Key: HDFS-11155
>                 URL: https://issues.apache.org/jira/browse/HDFS-11155
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.7.4
>         Environment: CDH5.7.3
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>
> HDFS-10512 fixed a race condition that caused VolumeScanner to terminate 
> abruptly when it detected a corrupt replica that was being updated. 
> However, when such a corrupt replica is detected, VolumeScanner still reports 
> the old replica generation stamp to the NN. The NN then directs the DN to 
> remove the older replica. Because the generation stamp has since been updated, 
> the DN cannot find that replica, so the corrupt replica is never removed.
> The NN's log shows something similar to the following:
> {quote}
> 2016-11-17 21:08:05,350 INFO BlockStateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: blk_1077571736 added as corrupt on 
> 192.168.168.58:50010 by /192.168.168.58  because client machine reported it
> 2016-11-17 21:08:05,350 INFO BlockStateChange: BLOCK* invalidateBlock: 
> blk_1077571736_3991953(stored=blk_1077571736_3992018) on 192.168.168.58:50010
> {quote}
> The DN's log shows the following:
> {noformat}
> 2016-11-17 21:08:04,815 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Appending to FinalizedReplica, blk_1077571736_3991953, FINALIZED
>   getNumBytes()     = 39061752
>   getBytesOnDisk()  = 39061752
>   getVisibleLength()= 39061752
>   getVolume()       = /data/3/dfs/dn/current
>   getBlockFile()    = 
> /data/3/dfs/dn/current/BP-1092022411-192.168.168.55-1474407949037/current/finalized/subdir58/subdir112/blk_1077571736
> 2016-11-17 21:08:09,158 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed 
> to delete replica blk_1077571736_3991953: ReplicaInfo not found.
> {noformat}
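
The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Hadoop code: the class StaleGenStampSketch and its methods append and invalidate are hypothetical names, and the map stands in loosely for the DN's replica bookkeeping. The block ID and generation stamps are taken from the logs quoted above.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: NOT the real FsDatasetImpl or VolumeScanner logic.
public class StaleGenStampSketch {
    // Hypothetical stand-in for the DN replica map: block ID -> current generation stamp.
    private final Map<Long, Long> replicas = new HashMap<>();

    /** Records a new generation stamp, as an append to a finalized replica would. */
    void append(long blockId, long newGenStamp) {
        replicas.put(blockId, newGenStamp);
    }

    /** Deletes the replica only if the requested generation stamp matches the stored one. */
    boolean invalidate(long blockId, long genStamp) {
        Long stored = replicas.get(blockId);
        if (stored == null || stored != genStamp) {
            return false; // analogous to "ReplicaInfo not found" in the DN log
        }
        replicas.remove(blockId);
        return true;
    }

    public static void main(String[] args) {
        StaleGenStampSketch dn = new StaleGenStampSketch();
        long blockId = 1077571736L;
        dn.append(blockId, 3991953L); // replica as first finalized
        dn.append(blockId, 3992018L); // append bumps the generation stamp

        // The scanner reported the old stamp, so the delete request misses.
        System.out.println(dn.invalidate(blockId, 3991953L)); // prints "false"
        // Reporting the latest stamp would let the delete succeed.
        System.out.println(dn.invalidate(blockId, 3992018L)); // prints "true"
    }
}
```

In this model, the delete request carrying the stale stamp 3991953 is rejected while the stored stamp is 3992018, which mirrors the invalidateBlock line in the NN log and the "Failed to delete replica" line in the DN log.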



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
