[ 
https://issues.apache.org/jira/browse/HDFS-12704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217230#comment-16217230
 ] 

Daryn Sharp commented on HDFS-12704:
------------------------------------

During a decomm of a faulty node, the NNs frequently reported invalid protobufs 
from the node during decode of {{reportBlock}}, interleaved with 
{{ArrayIndexOutOfBoundsException}}s during actual processing of the report.  The 
JVM clipped the stack trace of the exception, so it is unknown where it occurs.

The {{DecommissionManager}} stopped after the first AIOB, which is probably the 
root cause of HDFS-12703.  The block states appear to be corrupted into an 
unknown state.  Since the decomm task aborts and the exception is lost, it's 
impossible to know where the bug is occurring.
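
For illustration only, here is a minimal sketch of how a periodic task can die 
silently after one runtime exception.  It assumes the decomm monitor runs on a 
{{ScheduledExecutorService}}, which this comment does not confirm; the class and 
method names ({{GuardedPeriodicTask}}, {{guarded}}, {{runsBeforeStop}}) are 
hypothetical and not taken from the Hadoop source.  The clipped stack trace is 
also consistent with HotSpot's fast-throw optimization, which can be turned off 
with {{-XX:-OmitStackTraceInFastThrow}}, though that was not verified here.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

public class GuardedPeriodicTask {

    /** Wraps a task so a runtime exception is logged instead of ending the schedule. */
    static Runnable guarded(Runnable task) {
        return () -> {
            try {
                task.run();
            } catch (RuntimeException e) {
                // Log the failure here; without this catch the executor
                // suppresses all later runs and the exception is only
                // visible through the returned future.
                e.printStackTrace();
            }
        };
    }

    /** Schedules a task that always throws and returns how many times it ran. */
    static int runsBeforeStop(boolean guard) {
        AtomicInteger runs = new AtomicInteger();
        CountDownLatch threeRuns = new CountDownLatch(3);
        Runnable faulty = () -> {
            int n = runs.incrementAndGet();
            threeRuns.countDown();
            int[] blocks = new int[1];
            blocks[n] = 0; // n >= 1, so this always throws ArrayIndexOutOfBoundsException
        };
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        try {
            ScheduledFuture<?> future = pool.scheduleAtFixedRate(
                    guard ? guarded(faulty) : faulty, 0, 10, TimeUnit.MILLISECONDS);
            if (guard) {
                // The guarded task keeps running, so wait for three runs.
                threeRuns.await(5, TimeUnit.SECONDS);
            } else {
                try {
                    // The unguarded task fails once; get() then reports the failure.
                    future.get(5, TimeUnit.SECONDS);
                } catch (ExecutionException | TimeoutException expected) {
                    // Nothing to do: the point is only that the schedule ended.
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return runs.get();
    }

    public static void main(String[] args) {
        System.out.println("unguarded runs: " + runsBeforeStop(false));
        System.out.println("guarded runs:   " + runsBeforeStop(true));
    }
}
```

The unguarded task runs once and is never rescheduled, while the guarded one 
keeps running and prints each failure.  Whether continuing after a failure is 
safe is a separate question if the block state is already corrupted.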

> FBR may corrupt block state
> ---------------------------
>
>                 Key: HDFS-12704
>                 URL: https://issues.apache.org/jira/browse/HDFS-12704
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.8.0
>            Reporter: Daryn Sharp
>            Priority: Critical
>
> If FBR processing generates a runtime exception it is believed to foul the 
> block state and lead to unpredictable behavior.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
