[
https://issues.apache.org/jira/browse/HDFS-17003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17730030#comment-17730030
]
ASF GitHub Bot commented on HDFS-17003:
---------------------------------------
hfutatzhanghb commented on code in PR #5643:
URL: https://github.com/apache/hadoop/pull/5643#discussion_r1221253077
##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestReadStripedFileWithDecoding.java:
##########
@@ -169,6 +171,129 @@ public void testInvalidateBlock() throws IOException, InterruptedException {
}
}
+ /**
+ * This unit test tries to cover the following situation:
+ * Suppose we have an EC file with an RS(d,p) policy whose block group id
+ * is blk_-9223372036845119810_1920002.
+ * If the first and second data blocks in this EC block group are corrupted
+ * and the EC file is read at the same time, the read triggers reportBadBlocks
+ * RPCs that add blk_-9223372036845119810_1920002
+ * and blk_-9223372036845119809_1920002 to corruptReplicas.
+ * It will also reconstruct the two blocks and send IBRs to the namenode,
+ * which then executes the BlockManager#addStoredBlock and
+ * invalidateCorruptReplicas methods. Suppose we first receive the IBR of
+ * blk_-9223372036845119810_1920002; then, in the invalidateCorruptReplicas
+ * method, only blk_-9223372036845119809_1920002 is invalidated on the two
+ * datanodes containing the two corrupt blocks.
+ *
+ * @throws Exception
+ */
+ @Test
+ public void testCorruptionECBlockInvalidate() throws Exception {
+
+ final Path file = new Path("/invalidate_corrupted");
+ final int length = BLOCK_SIZE * NUM_DATA_UNITS;
+ final byte[] bytes = StripedFileTestUtil.generateBytes(length);
+ DFSTestUtil.writeFile(dfs, file, bytes);
+
+ int dnIndex = findFirstDataNode(cluster, dfs, file,
+ CELL_SIZE * NUM_DATA_UNITS);
+ int dnIndex2 = findDataNodeAtIndex(cluster, dfs, file,
+ CELL_SIZE * NUM_DATA_UNITS, 2);
+ Assert.assertNotEquals(-1, dnIndex);
+ Assert.assertNotEquals(-1, dnIndex2);
+
+ LocatedStripedBlock slb = (LocatedStripedBlock) dfs.getClient()
+ .getLocatedBlocks(file.toString(), 0, CELL_SIZE * NUM_DATA_UNITS)
+ .get(0);
+ final LocatedBlock[] blks = StripedBlockUtil.parseStripedBlockGroup(slb,
+ CELL_SIZE, NUM_DATA_UNITS, NUM_PARITY_UNITS);
+
+ final Block b = blks[0].getBlock().getLocalBlock();
+ final Block b2 = blks[1].getBlock().getLocalBlock();
+
+ // Find the first block file.
+ File storageDir = cluster.getInstanceStorageDir(dnIndex, 0);
+ File blkFile = MiniDFSCluster.getBlockFile(storageDir, blks[0].getBlock());
+ Assert.assertTrue("Block file does not exist", blkFile.exists());
+ // Corrupt the block file.
+ LOG.info("Deliberately corrupting file " + blkFile.getName());
+ try (FileOutputStream out = new FileOutputStream(blkFile)) {
+ out.write("corruption".getBytes());
+ out.flush();
+ }
+
+ // Find the second block file.
+ File storageDir2 = cluster.getInstanceStorageDir(dnIndex2, 0);
+ File blkFile2 = MiniDFSCluster.getBlockFile(storageDir2,
+     blks[1].getBlock());
+ Assert.assertTrue("Block file does not exist", blkFile2.exists());
+ // Corrupt the second block file.
+ LOG.info("Deliberately corrupting file " + blkFile2.getName());
+ try (FileOutputStream out = new FileOutputStream(blkFile2)) {
+ out.write("corruption".getBytes());
+ out.flush();
+ }
+
+ // Disable the heartbeat from DN so that the corrupted block record is kept
+ // in NameNode.
+ for (DataNode dataNode : cluster.getDataNodes()) {
+ DataNodeTestUtils.setHeartbeatsDisabledForTests(dataNode, true);
+ }
+ try {
+ // Do stateful read.
+ StripedFileTestUtil.verifyStatefulRead(dfs, file, length, bytes,
+ ByteBuffer.allocate(1024));
+
+ // Check whether the corruption has been reported to the NameNode.
+ final FSNamesystem ns = cluster.getNamesystem();
+ final BlockManager bm = ns.getBlockManager();
+ BlockInfo blockInfo =
+     (ns.getFSDirectory().getINode4Write(file.toString())
+         .asFile().getBlocks())[0];
+ GenericTestUtils.waitFor(() -> {
+ if (bm.getCorruptReplicas(blockInfo) == null) {
+ return false;
+ }
+ return bm.getCorruptReplicas(blockInfo).size() == 2;
+ }, 250, 60000);
+ // Double check.
+ Assert.assertEquals(2, bm.getCorruptReplicas(blockInfo).size());
+
+ DatanodeDescriptor dnd =
+     NameNodeAdapter.getDatanode(ns,
+         cluster.getDataNodes().get(dnIndex).getDatanodeId());
+
+ DatanodeDescriptor dnd2 =
+     NameNodeAdapter.getDatanode(ns,
+         cluster.getDataNodes().get(dnIndex2).getDatanodeId());
+
+ for (DataNode datanode : cluster.getDataNodes()) {
+ if (!datanode.getDatanodeUuid().equals(dnd.getDatanodeUuid()) ||
Review Comment:
OMG, I am sorry, I mistook your review as referring to lines 276-277. My bad.
Yes, it should be && here. I will fix it.
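For context, a minimal standalone sketch (with hypothetical UUID strings; not the actual test code) of why the guard needs `&&` rather than `||`: with `||`, the predicate is true for every datanode, because a node can match at most one of the two corrupt-replica holders.

```java
// Hypothetical sketch: dnd/dnd2 stand in for the UUIDs of the two datanodes
// holding the corrupt replicas; the goal is to select every *other* datanode.
public class GuardSketch {
    // Buggy form from the diff: true for every node, since no node can
    // equal both holders at once.
    static boolean skipWithOr(String uuid, String dnd, String dnd2) {
        return !uuid.equals(dnd) || !uuid.equals(dnd2);
    }

    // Corrected form: true only for nodes that hold neither corrupt replica.
    static boolean skipWithAnd(String uuid, String dnd, String dnd2) {
        return !uuid.equals(dnd) && !uuid.equals(dnd2);
    }

    public static void main(String[] args) {
        String dnd = "dn-1", dnd2 = "dn-2";
        // || wrongly includes a corrupt-replica holder in the "others".
        if (!skipWithOr("dn-1", dnd, dnd2)) throw new AssertionError();
        // && excludes both holders and keeps the rest.
        if (skipWithAnd("dn-1", dnd, dnd2)) throw new AssertionError();
        if (skipWithAnd("dn-2", dnd, dnd2)) throw new AssertionError();
        if (!skipWithAnd("dn-3", dnd, dnd2)) throw new AssertionError();
    }
}
```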
> Erasure coding: invalidate wrong block after reporting bad blocks from
> datanode
> -------------------------------------------------------------------------------
>
> Key: HDFS-17003
> URL: https://issues.apache.org/jira/browse/HDFS-17003
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: farmmamba
> Assignee: farmmamba
> Priority: Critical
> Labels: pull-request-available
>
> After receiving a reportBadBlocks RPC from a datanode, the NameNode computes
> the wrong block to invalidate. This is dangerous behaviour and may cause data
> loss. Some logs from our production cluster are below:
>
> NameNode log:
> {code:java}
> 2023-05-08 21:23:49,112 INFO org.apache.hadoop.hdfs.StateChange: *DIR*
> reportBadBlocks for block:
> BP-932824627-xxxx-1680179358678:blk_-9223372036848404320_1471186 on datanode:
> datanode1:50010
> 2023-05-08 21:23:49,183 INFO org.apache.hadoop.hdfs.StateChange: *DIR*
> reportBadBlocks for block:
> BP-932824627-xxxx-1680179358678:blk_-9223372036848404319_1471186 on datanode:
> datanode2:50010{code}
> datanode1 log:
> {code:java}
> 2023-05-08 21:23:49,088 WARN
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad
> BP-932824627-xxxx-1680179358678:blk_-9223372036848404320_1471186 on
> /data7/hadoop/hdfs/datanode
> 2023-05-08 21:24:00,509 INFO
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed
> to delete replica blk_-9223372036848404319_1471186: ReplicaInfo not
> found.{code}
>
> This phenomenon can be reproduced.
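The two block IDs in the logs above differ by exactly the block index inside the striped group (internal block id = group id + index). A minimal sketch of that arithmetic, modeled on HDFS's BlockIdManager (the 4-bit index mask is restated here as an assumption, not taken from this issue):

```java
// Sketch of striped-block ID arithmetic (modeled on HDFS BlockIdManager;
// the mask constant is an assumption restated for illustration).
public class StripedIdSketch {
    // Assumed: the low 4 bits of a striped internal block ID encode the
    // index within the block group; the group ID has those bits zeroed.
    static final long BLOCK_GROUP_INDEX_MASK = 15L;

    static long blockGroupId(long internalBlockId) {
        return internalBlockId & ~BLOCK_GROUP_INDEX_MASK;
    }

    static int indexInGroup(long internalBlockId) {
        return (int) (internalBlockId & BLOCK_GROUP_INDEX_MASK);
    }

    public static void main(String[] args) {
        long idx0 = -9223372036848404320L; // blk_...404320 from datanode1's log
        long idx1 = -9223372036848404319L; // blk_...404319 from datanode2's log
        // Both internal blocks resolve to the same block group, which is why
        // mixing up the reported block with the per-datanode block ends up
        // invalidating the wrong sibling inside the group.
        if (blockGroupId(idx0) != blockGroupId(idx1)) throw new AssertionError();
        if (indexInGroup(idx0) != 0) throw new AssertionError();
        if (indexInGroup(idx1) != 1) throw new AssertionError();
    }
}
```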
--
This message was sent by Atlassian Jira
(v8.20.10#820010)