ZanderXu commented on PR #5583:
URL: https://github.com/apache/hadoop/pull/5583#issuecomment-1521121956
Yeah, for this case, all the replicas with different GS are on the same
datanodes, as follows:
| block | DN1 | DN2 | DN3 |
| --- | --- | --- | --- |
| blk_1024_1001 | dn1 | dn2 | dn3 |
| blk_1024_1002 | dn1 | dn2 | dn3 |
The Standby NameNode first stored blk_1024_1002 in the blocksMap by replaying
edits from the Active. When it then processes the BRD RPCs from the datanodes,
the Standby NameNode postpones the messages with the smaller GS. After all BRD
requests have been processed, the block state on the Standby is as follows:
- PendingDataNodeMessages: [blk_1024_1001, dn1], [blk_1024_1001, dn2],
[blk_1024_1001, dn3]
- BlocksMap: [blk_1024_1002, **(dn3)**]
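The postponing step described above can be sketched with a toy model. This is not Hadoop code: the class `StandbyPostponeSketch` and its methods are hypothetical, and the only rule it encodes is the one stated in this comment, namely that a report whose GS is smaller than the stored block's GS is queued instead of applied.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model, not Hadoop code: a standby that postpones stale-GS reports. */
final class StandbyPostponeSketch {
    private final long storedGenStamp;                      // GS of the block in the blocks map
    private final List<String> pending = new ArrayList<>(); // postponed reports

    StandbyPostponeSketch(long storedGenStamp) {
        this.storedGenStamp = storedGenStamp;
    }

    /** Returns true if the report was postponed because its GS is older. */
    boolean report(long reportedGenStamp, String datanode) {
        if (reportedGenStamp < storedGenStamp) {
            // Block id 1024 is hard-coded to match the example in this thread.
            pending.add("[blk_1024_" + reportedGenStamp + ", " + datanode + "]");
            return true;
        }
        return false; // GS matches the stored block; nothing to postpone
    }

    List<String> pending() {
        return pending;
    }

    public static void main(String[] args) {
        StandbyPostponeSketch standby = new StandbyPostponeSketch(1002L);
        for (String dn : new String[] {"dn1", "dn2", "dn3"}) {
            standby.report(1001L, dn); // stale replica, postponed
            standby.report(1002L, dn); // newest replica, not postponed
        }
        System.out.println(standby.pending());
    }
}
```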
While starting the Active service, the namenode shouldn't mark (blk_1024_1001,
dn1), (blk_1024_1001, dn2) and (blk_1024_1001, dn3) as corrupt blocks, because
dn1, dn2 and dn3 have already reported the newest replicas (blk_1024_1002).
Tracing `markBlockAsCorrupt` shows that it removes dn1 and dn2 from the
storage list of the stored block (blk_1024_1002), because
`corruptedDuringWrite` is true. It does not remove dn3 from the storage
list of the stored block (blk_1024_1002), because `corruptedDuringWrite`
is false there (`minReplicationSatisfied` is false).
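To make the cascade easier to follow, here is a simplified model of it. It is a sketch, not the real `BlockManager`: the class `MarkCorruptCascadeSketch` is hypothetical, a minimum replication of 1 is assumed, and `corruptedDuringWrite` is reduced to "GS mismatch and `minReplicationSatisfied`", which is how this comment describes it.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Simplified model, not the real BlockManager, of the removal cascade. */
final class MarkCorruptCascadeSketch {
    static final int MIN_REPLICATION = 1; // assumption: minimum replication is 1

    /** Storage list of the stored block (blk_1024_1002 in the example). */
    final Set<String> storages = new LinkedHashSet<>();
    /** Datanodes whose replica has been recorded as corrupt. */
    final Set<String> corrupt = new LinkedHashSet<>();

    MarkCorruptCascadeSketch(String... datanodes) {
        for (String dn : datanodes) {
            storages.add(dn);
        }
    }

    /** Replays one postponed stale-GS report; returns true if dn was removed. */
    boolean markBlockAsCorrupt(String dn) {
        corrupt.add(dn);
        int live = 0;
        for (String s : storages) {
            if (!corrupt.contains(s)) {
                live++;
            }
        }
        boolean minReplicationSatisfied = live >= MIN_REPLICATION;
        // Every replayed report here has a GS mismatch, so only the
        // replication check decides the outcome in this model.
        boolean corruptedDuringWrite = minReplicationSatisfied;
        if (corruptedDuringWrite) {
            storages.remove(dn); // replica invalidated, dn leaves the storage list
        }
        return corruptedDuringWrite;
    }

    public static void main(String[] args) {
        MarkCorruptCascadeSketch m = new MarkCorruptCascadeSketch("dn1", "dn2", "dn3");
        for (String dn : new String[] {"dn1", "dn2", "dn3"}) {
            System.out.println(dn + " removed: " + m.markBlockAsCorrupt(dn));
        }
        System.out.println("storages left: " + m.storages);
    }
}
```

In this model dn1 and dn2 are dropped while live replicas still remain, and dn3 survives only because no live replica is left by the time its report is replayed.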
@ayushtkn you are right here.
So for this case, maybe `markBlockAsCorrupt` should do something to avoid
this wrong action (removing the expected dn1 and dn2 from the storage list).
```
if (storageInfo != null) {
  storageInfo.addBlock(b.getStored(), b.getCorrupted());
}
```
If this step returns `AddBlockResult.ALREADY_EXIST`, it means the datanode
already has the newest replica, so `markBlockAsCorrupt` should return
immediately.
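One possible reading of this proposal, as a toy model rather than a patch: the class `EarlyReturnSketch` is hypothetical, its `AddBlockResult` enum only mirrors the two outcomes relevant here, and a minimum replication of 1 is assumed. Whether an early return is safe in the real `BlockManager` is exactly the open question.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Toy model of the proposed early return; not a patch for BlockManager. */
final class EarlyReturnSketch {
    /** Mirrors only the two outcomes needed for this illustration. */
    enum AddBlockResult { ADDED, ALREADY_EXIST }

    static final int MIN_REPLICATION = 1; // assumption: minimum replication is 1

    final Set<String> storages = new LinkedHashSet<>(); // storage list of the stored block
    final Set<String> corrupt = new LinkedHashSet<>();  // datanodes with a corrupt replica

    EarlyReturnSketch(String... datanodes) {
        for (String dn : datanodes) {
            storages.add(dn);
        }
    }

    /** Adds dn to the storage list unless it is already there. */
    AddBlockResult addBlock(String dn) {
        return storages.add(dn) ? AddBlockResult.ADDED : AddBlockResult.ALREADY_EXIST;
    }

    /** Returns true if dn was removed from the storage list. */
    boolean markBlockAsCorrupt(String dn) {
        if (addBlock(dn) == AddBlockResult.ALREADY_EXIST) {
            // Proposed change: dn already holds the newest replica, so the
            // stale report must not evict it from the storage list.
            return false;
        }
        corrupt.add(dn);
        int live = 0;
        for (String s : storages) {
            if (!corrupt.contains(s)) {
                live++;
            }
        }
        boolean corruptedDuringWrite = live >= MIN_REPLICATION;
        if (corruptedDuringWrite) {
            storages.remove(dn);
        }
        return corruptedDuringWrite;
    }

    public static void main(String[] args) {
        EarlyReturnSketch m = new EarlyReturnSketch("dn1", "dn2", "dn3");
        for (String dn : new String[] {"dn1", "dn2", "dn3"}) {
            m.markBlockAsCorrupt(dn);
        }
        System.out.println("storages left: " + m.storages);
    }
}
```

With the early return, replaying the three stale reports leaves dn1, dn2 and dn3 in the storage list, while a stale replica from a datanode that is not yet in the list still goes through the original path.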
@ayushtkn How about directly modifying the logic of `markBlockAsCorrupt` to
fix this problem? What do you think?