Hui Fei created HDFS-15875:
------------------------------
Summary: Check whether file is being truncated before truncate
Key: HDFS-15875
URL: https://issues.apache.org/jira/browse/HDFS-15875
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 3.2.2, 3.1.4, 3.3.0
Reporter: Hui Fei
Assignee: Hui Fei
We ran into the following problem.
* Job A sends a truncate request to the namenode, and block recovery starts.
* DataNode D times out while connecting to another datanode (60s), so block
recovery takes more than 60s.
* Job A fails, then job B starts and sends a truncate request to the namenode.
A new recoveryId is generated during lease recovery.
* DataNode D calls commitBlockSynchronization and gets the error "does not
match current recovery id".
As a result, the truncate never completes. DataNode D has a replica with the
new length, while the two other datanodes have replicas with the old length.
The DN then logs the error message "Inconsistent size of finalized replicas".
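
The sequence above can be illustrated with a small toy model. This is only a sketch and is not HDFS code: the class TruncateRaceSketch, its fields, the starting recovery id 1000, and the rejectConcurrentTruncate flag are all hypothetical names chosen for illustration. It assumes the namenode tracks one current recovery id per file and that a possible guard, in the spirit of the issue summary, is to refuse a second truncate while one is still in progress.

```java
import java.io.IOException;

/** Toy model of namenode-side truncate state for one file. Hypothetical, not HDFS code. */
class TruncateRaceSketch {
    private static final long NO_RECOVERY = -1L;

    private long nextRecoveryId = 1000L;          // arbitrary starting value
    private long currentRecoveryId = NO_RECOVERY; // NO_RECOVERY means no truncate in progress
    private final boolean rejectConcurrentTruncate;

    TruncateRaceSketch(boolean rejectConcurrentTruncate) {
        this.rejectConcurrentTruncate = rejectConcurrentTruncate;
    }

    /** Starts truncate recovery and returns the recovery id given to the primary datanode. */
    long truncate() throws IOException {
        if (rejectConcurrentTruncate && currentRecoveryId != NO_RECOVERY) {
            // Assumed guard: check whether the file is already being truncated.
            throw new IOException("File is being truncated, recovery id " + currentRecoveryId);
        }
        currentRecoveryId = nextRecoveryId++;
        return currentRecoveryId;
    }

    /** Called by the datanode when its (possibly slow) block recovery finishes. */
    void commitBlockSynchronization(long recoveryId) throws IOException {
        if (recoveryId != currentRecoveryId) {
            throw new IOException("The recovery id " + recoveryId
                + " does not match current recovery id " + currentRecoveryId);
        }
        currentRecoveryId = NO_RECOVERY;
    }

    boolean isUnderRecovery() {
        return currentRecoveryId != NO_RECOVERY;
    }

    public static void main(String[] args) {
        for (boolean guard : new boolean[] {false, true}) {
            TruncateRaceSketch nn = new TruncateRaceSketch(guard);
            try {
                long idA = nn.truncate(); // job A starts truncate
                try {
                    nn.truncate();        // job B truncates while recovery is still running
                } catch (IOException e) {
                    System.out.println("guard=" + guard + " job B rejected: " + e.getMessage());
                }
                nn.commitBlockSynchronization(idA); // DataNode D reports back with job A's id
                System.out.println("guard=" + guard + " commit accepted");
            } catch (IOException e) {
                System.out.println("guard=" + guard + " commit rejected: " + e.getMessage());
            }
        }
    }
}
```

Without the guard, the second truncate replaces the recovery id and the late commit from DataNode D is refused. With the guard, the second truncate is refused instead and the first recovery can still commit.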
The related code is in BlockRecoveryWorker.java:
{code}
for (BlockRecord r : syncList) {
  assert r.rInfo.getNumBytes() > 0 : "zero length replica";
  ReplicaState rState = r.rInfo.getOriginalReplicaState();
  if (rState.getValue() < bestState.getValue()) {
    bestState = rState;
  }
  if (rState == ReplicaState.FINALIZED) {
    if (finalizedLength > 0 && finalizedLength != r.rInfo.getNumBytes()) {
      throw new IOException("Inconsistent size of finalized replicas. " +
          "Replica " + r.rInfo + " expected size: " + finalizedLength);
    }
    finalizedLength = r.rInfo.getNumBytes();
  }
}
{code}
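
To show why the state described above trips this check, here is a standalone sketch of the same length comparison. It is a simplification under stated assumptions: the Replica class, the reduced ReplicaState enum, the datanode names, and the byte counts 512 and 1024 are hypothetical stand-ins, not the real BlockRecord or ReplicaRecoveryInfo types.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

/** Standalone sketch of the finalized-length check. Hypothetical types, not HDFS code. */
class FinalizedLengthCheckSketch {
    enum ReplicaState { FINALIZED, RBW, RWR } // reduced set of states for illustration

    static final class Replica {
        final String datanode;
        final ReplicaState state;
        final long numBytes;

        Replica(String datanode, ReplicaState state, long numBytes) {
            this.datanode = datanode;
            this.state = state;
            this.numBytes = numBytes;
        }

        @Override
        public String toString() {
            return datanode + ":" + state + ":" + numBytes;
        }
    }

    /** Returns the common length of all FINALIZED replicas, or throws if they disagree. */
    static long checkFinalizedLengths(List<Replica> syncList) throws IOException {
        long finalizedLength = -1;
        for (Replica r : syncList) {
            if (r.state == ReplicaState.FINALIZED) {
                if (finalizedLength > 0 && finalizedLength != r.numBytes) {
                    throw new IOException("Inconsistent size of finalized replicas. "
                        + "Replica " + r + " expected size: " + finalizedLength);
                }
                finalizedLength = r.numBytes;
            }
        }
        return finalizedLength;
    }

    public static void main(String[] args) {
        // DataNode D already holds the truncated (new) length; the other two hold the old one.
        List<Replica> afterFailedTruncate = Arrays.asList(
            new Replica("D", ReplicaState.FINALIZED, 512L),
            new Replica("E", ReplicaState.FINALIZED, 1024L),
            new Replica("F", ReplicaState.FINALIZED, 1024L));
        try {
            checkFinalizedLengths(afterFailedTruncate);
            System.out.println("replicas are consistent");
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Once one finalized replica has the new length and the others keep the old one, every later recovery attempt in this model hits the same exception.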
--
This message was sent by Atlassian Jira
(v8.3.4#803005)