[
https://issues.apache.org/jira/browse/HDFS-2043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271828#comment-15271828
]
Masatake Iwasaki commented on HDFS-2043:
----------------------------------------
In both the IOException and ClosedByInterruptException cases, I can see the
message "Got expected exception during close" in the test logs. The exception
was thrown by the second {{stm.close()}} in the catch block below.
{code}
try {
  stm.close();
  // If we made it past the close(), then that means that the ack made it back
  // from the pipeline before we got to the wait() call. In that case we should
  // still have interrupted status.
  assertTrue(Thread.interrupted());
} catch (InterruptedIOException ioe) {
  System.out.println("Got expected exception during close");
  // If we got the exception, we shouldn't have interrupted status anymore.
  assertFalse(Thread.currentThread().isInterrupted());
  // Now do a successful close.
  stm.close();
}
{code}
The caught ioe points to {{DFSOutputStream#closeImpl}}. (The stack trace below
was logged by modifying TestHFlush in my local environment.)
{noformat}
java.io.InterruptedIOException: Interrupted while waiting for data to be acknowledged by pipeline
	at org.apache.hadoop.hdfs.DataStreamer.waitForAckedSeqno(DataStreamer.java:771)
	at org.apache.hadoop.hdfs.DFSOutputStream.flushInternal(DFSOutputStream.java:697)
	at org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:778)
	at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:755)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
	at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:101)
	at org.apache.hadoop.hdfs.TestHFlush.testHFlushInterrupted(TestHFlush.java:480)
{noformat}
testHFlushInterrupted expects the second {{stm.close()}} to succeed, but that
is not the case: the underlying streamer thread has already been shut down
because {{closeThreads(true)}} is called in the finally block of
{{DFSOutputStream#closeImpl}}.
{code}
} finally {
  // Failures may happen when flushing data.
  // Streamers may keep waiting for the new block information.
  // Thus need to force closing these threads.
  // Don't need to call setClosed() because closeThreads(true)
  // calls setClosed() in the finally block.
  closeThreads(true);
}
{code}
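To make the failure mode concrete, here is a minimal, self-contained
illustration (not the actual HDFS code; the class and method names are
invented for this example) of a close() whose finally block tears down the
worker, so that retrying close() after an interrupted first attempt throws
instead of succeeding:
{code}
import java.io.Closeable;
import java.io.IOException;
import java.io.InterruptedIOException;

// Hypothetical stand-in for the behaviour described above: the first close()
// is interrupted while waiting for acks, its finally block tears down the
// streamer (like closeThreads(true)), and a second close() then fails
// because the worker is already gone.
class FailOnSecondCloseStream implements Closeable {
  private boolean streamerClosed = false;

  @Override
  public synchronized void close() throws IOException {
    if (streamerClosed) {
      throw new IOException("stream is closed: streamer already shut down");
    }
    try {
      waitForAcks(); // analogous to DataStreamer#waitForAckedSeqno
    } finally {
      streamerClosed = true; // analogous to closeThreads(true)
    }
  }

  private void waitForAcks() throws InterruptedIOException {
    if (Thread.currentThread().isInterrupted()) {
      throw new InterruptedIOException(
          "Interrupted while waiting for data to be acknowledged");
    }
  }

  public static void main(String[] args) throws IOException {
    FailOnSecondCloseStream stm = new FailOnSecondCloseStream();
    Thread.currentThread().interrupt(); // simulate the interrupt in the test
    try {
      stm.close();
    } catch (InterruptedIOException e) {
      System.out.println("Got expected exception during close");
      Thread.interrupted(); // clear interrupted status, as the test asserts
      try {
        stm.close(); // does NOT succeed: the worker is already closed
      } catch (IOException e2) {
        System.out.println("Second close failed as described: " + e2);
      }
    }
  }
}
{code}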
I think we should just catch IOException on the second {{stm.close()}} and
ignore it. The final check in the test will still fail if there is a real
problem.
{code}
// verify that entire file is good
AppendTestUtil.checkFullFile(fs, p, 4, fileContents,
    "Failed to deal with thread interruptions", false);
{code}
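For example, the catch block in testHFlushInterrupted could be changed along
these lines (just a sketch of the idea, not a tested patch):
{code}
} catch (InterruptedIOException ioe) {
  System.out.println("Got expected exception during close");
  // If we got the exception, we shouldn't have interrupted status anymore.
  assertFalse(Thread.currentThread().isInterrupted());
  // Retry the close, but ignore a failure here: the streamer may already be
  // shut down. checkFullFile() below still verifies the file contents.
  try {
    stm.close();
  } catch (IOException ioe2) {
    System.out.println("Ignoring exception during second close: " + ioe2);
  }
}
{code}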
> TestHFlush failing intermittently
> ---------------------------------
>
> Key: HDFS-2043
> URL: https://issues.apache.org/jira/browse/HDFS-2043
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Aaron T. Myers
> Assignee: Lin Yiqun
> Attachments: HDFS-2043.002.patch, HDFS-2043.003.patch, HDFS.001.patch
>
>
> I can't reproduce this failure reliably, but it seems like TestHFlush has
> been failing intermittently, with the frequency increasing of late.
> Note the following two pre-commit test runs from different JIRAs where
> TestHFlush seems to have failed spuriously:
> https://builds.apache.org/job/PreCommit-HDFS-Build/734//testReport/
> https://builds.apache.org/job/PreCommit-HDFS-Build/680//testReport/