[ https://issues.apache.org/jira/browse/HDFS-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987793#comment-15987793 ]
Todd Lipcon commented on HDFS-2970:
-----------------------------------
Yep, if my memory from 5 years ago serves me correctly, we can just remove the
error, or at least the "THIS IS NOT SUPPOSED TO HAPPEN" part :)
> Contending block synchronizations can result in scary log messages
> ------------------------------------------------------------------
>
> Key: HDFS-2970
> URL: https://issues.apache.org/jira/browse/HDFS-2970
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Affects Versions: 0.23.0
> Reporter: Todd Lipcon
> Assignee: Wei-Chiu Chuang
>
> If multiple datanodes are attempting to act as the coordinator for a block
> recovery, but one is being particularly slow, it's possible that you see the
> following interleaving:
> - Primary A receives block recovery command for recovery genstamp = 1, then
> starts acting slow
> - Primary B receives block recovery command for recovery genstamp = 2
> - Primary B calls initReplicaRecovery on other nodes for genstamp = 2
> - Primary A calls initReplicaRecovery on other nodes for genstamp = 1
>
> This results in a scary message. For example:
>
> java.io.IOException: java.io.IOException: THIS IS NOT SUPPOSED TO HAPPEN:
> replica.getGenerationStamp() >= recoveryId = 4148,
> block=blk_6899379920748342698_4136, replica=FinalizedReplica,
> blk_6899379920748342698_4176, FINALIZED
>
> BUT, this scenario is properly handled by the recovery protocol. We should
> tone down the message a bit.
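
To illustrate the race described above, here is a minimal sketch in Java. It is not the actual DataNode code: the class RecoverySketch, the simplified Replica type, the finishRecovery helper, and the reworded exception text are all hypothetical. The only thing taken from the issue is the condition replica.getGenerationStamp() >= recoveryId. The sketch also assumes that primary B's recovery completes, advancing the replica's generation stamp, before slow primary A's call arrives.

```java
import java.io.IOException;

public class RecoverySketch {

    /** Hypothetical, heavily simplified stand-in for a datanode replica. */
    static final class Replica {
        private long generationStamp;

        Replica(long generationStamp) {
            this.generationStamp = generationStamp;
        }

        synchronized long getGenerationStamp() {
            return generationStamp;
        }

        /** Rejects a recovery attempt whose ID is not newer than the replica. */
        synchronized void initReplicaRecovery(long recoveryId) throws IOException {
            if (generationStamp >= recoveryId) {
                // A slower coordinator losing the race is an expected outcome,
                // so this illustrative message is worded calmly.
                throw new IOException("Ignoring stale recovery attempt: "
                        + "replica.getGenerationStamp() = " + generationStamp
                        + " >= recoveryId = " + recoveryId);
            }
        }

        /** Assumed helper: completing a recovery bumps the generation stamp. */
        synchronized void finishRecovery(long recoveryId) {
            generationStamp = recoveryId;
        }
    }

    /** Replays the interleaving; returns true if A's stale attempt is rejected. */
    static boolean staleAttemptIsRejected() {
        Replica replica = new Replica(0);
        try {
            replica.initReplicaRecovery(2); // primary B, recovery genstamp = 2
            replica.finishRecovery(2);      // assumption: B finishes first
            replica.initReplicaRecovery(1); // slow primary A, genstamp = 1
            return false;                   // not reached in this scenario
        } catch (IOException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("stale attempt rejected: " + staleAttemptIsRejected());
    }
}
```

In this model the rejection of A's call is the protocol working as intended, which is why an alarming message is out of proportion to the event.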