[ 
https://issues.apache.org/jira/browse/HDFS-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852131#comment-13852131
 ] 

Buddy commented on HDFS-5483:
-----------------------------

To clarify: I described the error case incorrectly in my comment above.

I think we are hitting this when the DataNode reports multiple replicas of 
the same block.
It looks like there are two replicas of the block on two different storages. 
The DataNode has both storages mounted: one is mounted read-write and the 
other is mounted read-only.

It reports the first replica on the first storage as read-write and the second 
replica on the second storage as read-only.
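
To make the scenario concrete, here is a minimal sketch of how a report that 
lists the same block on two storages could be detected before it is processed. 
Everything in it is hypothetical: DuplicateReplicaCheck, StorageReport and 
findDuplicates are illustrative stand-ins, not the actual HDFS classes or the 
proposed fix, and the storage IDs and block IDs are made up.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; none of these types are the real HDFS classes. */
public class DuplicateReplicaCheck {

    /** One storage's portion of a block report (illustrative stand-in). */
    static final class StorageReport {
        final String storageId;
        final boolean readOnly;   // mirrors the read-write / read-only mounts
        final long[] blockIds;

        StorageReport(String storageId, boolean readOnly, long[] blockIds) {
            this.storageId = storageId;
            this.readOnly = readOnly;
            this.blockIds = blockIds;
        }
    }

    /**
     * Returns every block ID that is reported by more than one storage,
     * mapped to the IDs of the storages that reported it (in report order).
     */
    static Map<Long, Set<String>> findDuplicates(List<StorageReport> reports) {
        Map<Long, Set<String>> seen = new LinkedHashMap<>();
        for (StorageReport report : reports) {
            for (long blockId : report.blockIds) {
                seen.computeIfAbsent(blockId, k -> new LinkedHashSet<>())
                    .add(report.storageId);
            }
        }
        // Keep only the blocks that appear on two or more distinct storages.
        seen.values().removeIf(storages -> storages.size() < 2);
        return seen;
    }

    public static void main(String[] args) {
        // Assumed example data: block 1001 has a replica on both storages.
        List<StorageReport> reports = List.of(
            new StorageReport("DS-rw", false, new long[] {1001L, 1002L}),
            new StorageReport("DS-ro", true, new long[] {1001L}));
        System.out.println(findDuplicates(reports));
    }
}
```

A check like this only identifies the duplicates. Whether the NameNode should 
skip them, log them, or track both replicas is a separate decision.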




> Make reportDiff resilient to malformed block reports
> ----------------------------------------------------
>
>                 Key: HDFS-5483
>                 URL: https://issues.apache.org/jira/browse/HDFS-5483
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>    Affects Versions: Heterogeneous Storage (HDFS-2832)
>            Reporter: Arpit Agarwal
>
> {{BlockManager#reportDiff}} can cause an assertion failure in 
> {{BlockInfo#moveBlockToHead}} if the block report shows the same block as 
> belonging to more than one storage.
> The issue is that {{moveBlockToHead}} assumes it will find the 
> DatanodeStorageInfo for the given block.
> Exception details:
> {code}
> java.lang.AssertionError: Index is out of bound
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.setNext(BlockInfo.java:152)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.moveBlockToHead(BlockInfo.java:351)
>         at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.moveBlockToHead(DatanodeStorageInfo.java:243)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.reportDiff(BlockManager.java:1841)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:1709)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:1637)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReport(NameNodeRpcServer.java:984)
>         at org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure.testVolumeFailure(TestDataNodeVolumeFailure.java:165)
> {code}



