[ https://issues.apache.org/jira/browse/HDFS-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865782#comment-13865782 ]

Eric Sirianni commented on HDFS-5483:
-------------------------------------

Arpit - I noticed that the supplied patch only ignores the extra replica in the full Block Report code path ({{processReport()}}). Doesn't this leave the assertion still exposed on the {{BLOCK_RECEIVED}} ({{processIncrementalReportedBlock()}}) path?

It seems like this code might need to be changed to search based on storage ID also:
{code}
if (reportedState == ReplicaState.FINALIZED
    && (storedBlock.findDatanode(dn) < 0
        || corruptReplicas.isReplicaCorrupt(storedBlock, dn))) {
  toAdd.add(storedBlock);
}
{code}

> NN should gracefully handle multiple block replicas on same DN
> --------------------------------------------------------------
>
>                 Key: HDFS-5483
>                 URL: https://issues.apache.org/jira/browse/HDFS-5483
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>    Affects Versions: Heterogeneous Storage (HDFS-2832)
>            Reporter: Arpit Agarwal
>             Fix For: 3.0.0
>
>         Attachments: h5483.02.patch
>
>
> {{BlockManager#reportDiff}} can cause an assertion failure in {{BlockInfo#moveBlockToHead}} if the block report shows the same block as belonging to more than one storage.
> The issue is that {{moveBlockToHead}} assumes it will find the DatanodeStorageInfo for the given block.
> Exception details:
> {code}
> java.lang.AssertionError: Index is out of bound
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.setNext(BlockInfo.java:152)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.moveBlockToHead(BlockInfo.java:351)
>         at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.moveBlockToHead(DatanodeStorageInfo.java:243)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.reportDiff(BlockManager.java:1841)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:1709)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:1637)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReport(NameNodeRpcServer.java:984)
>         at org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure.testVolumeFailure(TestDataNodeVolumeFailure.java:165)
> {code}

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
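
For illustration only, here is a toy sketch of the difference between a datanode-level lookup and a storage-level lookup, which is one reading of the concern raised in the comment. It is not Hadoop code: {{ReplicaLookupSketch}}, {{Storage}}, {{StoredBlock}}, {{findStorage}} and the two {{shouldAdd...}} helpers are hypothetical names invented for this example, and only the name {{findDatanode}} is borrowed from the quoted snippet (its behavior here is an assumption).

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only. All types and methods here are hypothetical stand-ins,
// not the real Hadoop BlockInfo / DatanodeStorageInfo classes.
class ReplicaLookupSketch {

    /** One storage (volume) on a datanode, identified by both IDs. */
    static final class Storage {
        final String datanodeId;
        final String storageId;

        Storage(String datanodeId, String storageId) {
            this.datanodeId = datanodeId;
            this.storageId = storageId;
        }
    }

    /** A stored block together with the storages known to hold a replica. */
    static final class StoredBlock {
        private final List<Storage> storages = new ArrayList<>();

        void addStorage(Storage s) {
            storages.add(s);
        }

        /** Index of the first replica on the given datanode, or -1 if none. */
        int findDatanode(String datanodeId) {
            for (int i = 0; i < storages.size(); i++) {
                if (storages.get(i).datanodeId.equals(datanodeId)) {
                    return i;
                }
            }
            return -1;
        }

        /** Index of the replica on exactly this storage, or -1 if none. */
        int findStorage(String datanodeId, String storageId) {
            for (int i = 0; i < storages.size(); i++) {
                Storage s = storages.get(i);
                if (s.datanodeId.equals(datanodeId) && s.storageId.equals(storageId)) {
                    return i;
                }
            }
            return -1;
        }
    }

    /** Datanode-level test: "is there any replica on this datanode?" */
    static boolean shouldAddByDatanode(StoredBlock block, Storage reported) {
        return block.findDatanode(reported.datanodeId) < 0;
    }

    /** Storage-level test: "is there a replica on this exact storage?" */
    static boolean shouldAddByStorage(StoredBlock block, Storage reported) {
        return block.findStorage(reported.datanodeId, reported.storageId) < 0;
    }

    public static void main(String[] args) {
        Storage known = new Storage("dn-1", "storage-A");
        Storage reported = new Storage("dn-1", "storage-B"); // same DN, other storage

        StoredBlock block = new StoredBlock();
        block.addStorage(known);

        // The datanode-level check treats the replica as already known ...
        System.out.println("byDatanode=" + shouldAddByDatanode(block, reported)); // byDatanode=false
        // ... while the storage-level check sees it as a new replica.
        System.out.println("byStorage=" + shouldAddByStorage(block, reported));   // byStorage=true
    }
}
```

In this toy model the two checks disagree exactly when the same datanode reports the block from a second storage, which is the situation described in the issue. Which answer the NameNode should act on is the question the comment leaves open.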