[ https://issues.apache.org/jira/browse/HDFS-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stephen O'Donnell updated HDFS-14626:
-------------------------------------
Attachment: test-to-reproduce.patch
> Decommission all nodes hosting last block of open file succeeds unexpectedly
> -----------------------------------------------------------------------------
>
> Key: HDFS-14626
> URL: https://issues.apache.org/jira/browse/HDFS-14626
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.3.0
> Reporter: Stephen O'Donnell
> Assignee: Stephen O'Donnell
> Priority: Major
> Attachments: test-to-reproduce.patch
>
>
> I have been investigating scenarios that cause decommission to hang,
> especially around one long-standing issue: an open block on the host which is
> being decommissioned can cause the process to never complete.
> Checking the history, there seems to have been at least one change (HDFS-5579)
> which greatly improved the situation, but from reading comments and support
> cases, there still seem to be some scenarios where open blocks on a DN host
> cause the decommission to get stuck.
> No matter what I try, I have not been able to reproduce this, but I think I
> have uncovered another issue that may partly explain why.
> If I do the following, the nodes will decommission without any issues:
> 1. Create a file and write to it so it crosses a block boundary, leaving one
> complete block and one under construction (UC) block. Keep the file open and
> write a few bytes periodically (a rough sketch of this write pattern is
> included after the stack trace below).
> 2. Note the nodes on which the UC block is currently being written, and
> decommission them all.
> 3. The decommission succeeds.
> 4. Now attempt to close the open file. It fails to close with an error like
> the one below, probably because decommissioned nodes are not allowed to send
> IBRs:
> {code:java}
> java.io.IOException: Unable to close file because the last block BP-646926902-192.168.0.20-1562099323291:blk_1073741827_1003 does not have enough number of replicas.
>     at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:968)
>     at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:911)
>     at org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:894)
>     at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:849)
>     at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>     at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
> {code}
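> For illustration, here is a minimal sketch of the write pattern from steps
> 1-4. This is not the attached test patch; the path, block size and chunk size
> are arbitrary, and it assumes the client's default filesystem points at the
> cluster being decommissioned:
> {code:java}
> // Minimal sketch of steps 1-4: create a file that crosses a block boundary,
> // keep it open, and only close it after the DNs holding the last (UC) block
> // have been decommissioned. Path and sizes are illustrative.
> import java.util.Arrays;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FSDataOutputStream;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
>
> public class OpenFileDecommissionRepro {
>   public static void main(String[] args) throws Exception {
>     Configuration conf = new Configuration();      // assumes fs.defaultFS points at the cluster
>     conf.setLong("dfs.blocksize", 1024 * 1024);    // small block size so we cross a boundary quickly
>     FileSystem fs = FileSystem.get(conf);
>
>     FSDataOutputStream out = fs.create(new Path("/tmp/open-file-test"));
>     byte[] chunk = new byte[512 * 1024];
>     Arrays.fill(chunk, (byte) 'x');
>
>     // Write 1.5 blocks worth of data: one COMPLETE block plus one
>     // UNDER_CONSTRUCTION last block, and keep the stream open.
>     for (int i = 0; i < 3; i++) {
>       out.write(chunk);
>       out.hflush();
>     }
>
>     // Steps 2/3: decommission every DN holding the UC block (e.g. add them to
>     // the dfs.hosts.exclude file and run `hdfs dfsadmin -refreshNodes`).
>     // Decommission completes, which is the surprising part.
>
>     // Step 4: the close then fails with "Unable to close file because the
>     // last block ... does not have enough number of replicas".
>     out.close();
>   }
> }
> {code}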
> Interestingly, if you recommission the nodes without restarting them before
> closing the file, it will close OK, and writes to it can continue even once
> decommission has completed.
> I don't think this is expected behaviour; that is, decommission should not be
> able to complete on all of the nodes hosting the last UC block of a file.
> From what I have figured out, I don't think UC blocks are considered in the
> DatanodeAdminManager at all. This is because the original list of blocks it
> cares about is taken from the datanode block iterator, which takes them from
> the DatanodeStorageInfo objects attached to the datanode instance. I believe
> UC blocks don't make it into the DatanodeStorageInfo until after they have
> been completed and an IBR sent, so the decommission logic never considers
> them.
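> To illustrate the gap, here is a heavily simplified, hypothetical model of
> that scan (the class and method names are made up, not the real
> DatanodeAdminManager/DatanodeStorageInfo code): because the loop is seeded
> only from blocks already attached to the node's storages, the open file's UC
> last block is never examined, so nothing holds decommission back.
> {code:java}
> import java.util.ArrayList;
> import java.util.List;
>
> public class DecommissionScanSketch {
>   // Stand-in for a block replica reported by a storage; names are illustrative.
>   static class Block {
>     final long id;
>     Block(long id) { this.id = id; }
>   }
>
>   // Placeholder for the real "does this block have enough live replicas
>   // elsewhere?" check; assume it would flag the UC block if it were ever asked.
>   static boolean sufficientlyReplicatedElsewhere(Block b) {
>     return true;
>   }
>
>   public static void main(String[] args) {
>     // Only completed blocks are attached to the datanode's storages, so the
>     // open file's UNDER_CONSTRUCTION last block never appears in this list.
>     List<Block> blocksAttachedToStorages = new ArrayList<>();
>     blocksAttachedToStorages.add(new Block(1073741826L)); // the completed first block
>
>     // The decommission scan only walks the storage-attached blocks.
>     List<Block> blocksDelayingDecommission = new ArrayList<>();
>     for (Block b : blocksAttachedToStorages) {
>       if (!sufficientlyReplicatedElsewhere(b)) {
>         blocksDelayingDecommission.add(b);
>       }
>     }
>
>     // The pending list drains to empty and decommission completes, even
>     // though the open file's last block lives only on the decommissioning
>     // nodes.
>     System.out.println("Blocks holding up decommission: "
>         + blocksDelayingDecommission.size());
>   }
> }
> {code}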
> What troubles me about this explanation is this: if the decommission logic
> never checks for open files, how did they previously cause decommission to
> get stuck? I suspect I am missing something.
> I will attach a patch with a test case that demonstrates this issue. It
> reproduces on trunk, and I also tested on CDH 5.8.1, which is based on the
> 2.6 branch but carries a lot of backports.