[
https://issues.apache.org/jira/browse/HDFS-12704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217230#comment-16217230
]
Daryn Sharp commented on HDFS-12704:
------------------------------------
During a decomm of a faulty node, the NNs frequently reported invalid protobufs
from the node during decode of {{reportBlock}}, interleaved with
{{ArrayIndexOutOfBoundsException}} during actual processing of the report. The
JVM clipped the stack trace of the exception, so it is unknown where it occurs.
The {{DecommissionManager}} stopped after the first AIOOBE, which is probably
the root cause of HDFS-12703. The block states appear to be corrupted into an
unknown state. Because the decomm task aborts and the exception is lost, it's
impossible to know where the bug is occurring.
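
For illustration only, here is one way a periodic task can stop silently after
a single runtime exception. This is a sketch, not the HDFS code: it assumes the
monitor is scheduled with {{ScheduledExecutorService.scheduleAtFixedRate}},
which by contract suppresses all later runs once one run throws. The names
{{PeriodicTaskDemo}}, {{scanOnce}} and {{runPasses}} are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo; none of these names come from the HDFS code base.
class PeriodicTaskDemo {

    // Stand-in for one monitor pass; the second pass always fails.
    static void scanOnce(int pass) {
        if (pass == 2) {
            throw new ArrayIndexOutOfBoundsException("simulated failure on pass " + pass);
        }
    }

    // Schedules the task and waits until it has started `wanted` passes or
    // has died. Returns the number of passes that were started.
    static int runPasses(boolean guard, int wanted) {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger passes = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(1);
        Runnable task = () -> {
            if (passes.get() >= wanted) {
                return; // enough passes; idle until shutdown
            }
            int pass = passes.incrementAndGet();
            try {
                scanOnce(pass);
            } catch (RuntimeException e) {
                if (!guard) {
                    done.countDown();
                    throw e; // escapes the task: the executor cancels all later runs
                }
                // Guarded: record the full exception and keep the task alive.
                System.err.println("pass " + pass + " failed, continuing: " + e);
            }
            if (pass >= wanted) {
                done.countDown();
            }
        };
        pool.scheduleAtFixedRate(task, 0, 1, TimeUnit.MILLISECONDS);
        try {
            done.await(5, TimeUnit.SECONDS);
            pool.shutdownNow();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return passes.get();
    }

    public static void main(String[] args) {
        System.out.println("unguarded passes: " + runPasses(false, 5)); // 2
        System.out.println("guarded passes:   " + runPasses(true, 5));  // 5
    }
}
```

In the unguarded run nothing is printed for the failure: the exception is held
in the unread {{ScheduledFuture}}, which is one way an exception can be "lost".
Separately, if the clipped stack trace comes from HotSpot's fast-throw
optimization (an assumption), running with {{-XX:-OmitStackTraceInFastThrow}}
should restore full traces.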
> FBR may corrupt block state
> ---------------------------
>
> Key: HDFS-12704
> URL: https://issues.apache.org/jira/browse/HDFS-12704
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.8.0
> Reporter: Daryn Sharp
> Priority: Critical
>
> If FBR processing generates a runtime exception it is believed to foul the
> block state and lead to unpredictable behavior.