[ 
https://issues.apache.org/jira/browse/HDFS-7199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14163847#comment-14163847
 ] 

Jason Lowe commented on HDFS-7199:
----------------------------------

bq.  But I also wonder why you are getting a non-IOException in the first 
place. That seems like a bug.

The bug in the case we encountered was bad hardware.  The JVM was glitching out 
and happened to generate a java.lang.VerifyError in the DataStreamer thread.  
Unfortunately, due to this bug, the reducer ended up with a "successful" run that 
generated a zero-length file, and the data was silently dropped.  We caught it 
later downstream when a subsequent job tried to consume the empty file.

> DFSOutputStream can silently drop data if DataStreamer crashes with a non-I/O 
> exception
> ---------------------------------------------------------------------------------------
>
>                 Key: HDFS-7199
>                 URL: https://issues.apache.org/jira/browse/HDFS-7199
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.5.0
>            Reporter: Jason Lowe
>            Assignee: Chen He
>            Priority: Critical
>
> If the DataStreamer thread encounters a non-I/O exception then it closes the 
> output stream but does not set lastException.  When the client later calls 
> close on the output stream then it will see the stream is already closed with 
> lastException == null, mistakenly think this is a redundant close call, and 
> fail to report any error to the client.
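
The failure mode in the description can be reproduced in a small, 
self-contained model. This is only a sketch: SilentDropSketch, SketchStream, 
runStreamer and the recordAllThrowables flag are hypothetical names invented 
for illustration, not the actual DFSOutputStream/DataStreamer code, and the 
"guarded" variant (wrapping any non-I/O Throwable in an IOException) is one 
possible way to surface the error, not necessarily the fix applied for this 
issue. The streamer is run inline rather than on a separate thread to keep the 
example short.

```java
import java.io.IOException;

// Hypothetical, simplified model of the failure mode. All names here are
// illustrative; this is not the actual DFSOutputStream/DataStreamer code.
public class SilentDropSketch {

    /** Work that may throw anything; stands in for the streamer loop body. */
    interface StreamerWork {
        void run() throws Exception;
    }

    static class SketchStream {
        private boolean closed = false;
        private IOException lastException = null;
        // Assumption: when true, non-I/O failures are also recorded.
        private final boolean recordAllThrowables;

        SketchStream(boolean recordAllThrowables) {
            this.recordAllThrowables = recordAllThrowables;
        }

        /** Stands in for the streamer thread: on any failure it closes the stream. */
        void runStreamer(StreamerWork work) {
            try {
                work.run();
            } catch (Throwable t) {
                if (t instanceof IOException) {
                    lastException = (IOException) t;
                } else if (recordAllThrowables) {
                    // Guarded variant: wrap the failure so close() can report it.
                    lastException = new IOException("streamer failed", t);
                }
                // Buggy variant reaches this point with lastException still null.
                closed = true;
            }
        }

        /** Client-side close: rethrows a recorded failure, if there is one. */
        void close() throws IOException {
            if (closed) {
                if (lastException == null) {
                    return; // indistinguishable from a redundant close call
                }
                throw lastException;
            }
            closed = true;
        }
    }

    /** Returns true if close() reports an error after the streamer dies with a VerifyError. */
    static boolean closeReportsError(boolean recordAllThrowables) {
        SketchStream stream = new SketchStream(recordAllThrowables);
        stream.runStreamer(() -> { throw new VerifyError("simulated JVM glitch"); });
        try {
            stream.close();
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("buggy variant reports error: " + closeReportsError(false));
        System.out.println("guarded variant reports error: " + closeReportsError(true));
    }
}
```

In this model the buggy variant returns from close() without any exception, 
which is how a writer can finish "successfully" with an empty file, while the 
guarded variant makes close() throw an IOException.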



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
