Mingliang Liu updated HDFS-10293:
---------------------------------
Attachment: HDFS-10293.000.patch
The code is as follows:
{code}
static int readAll(FSDataInputStream in, byte[] buf) throws IOException {
  int readLen = 0;
  int ret;
  // Once the buffer is full (readLen == buf.length), read(buf, readLen, 0)
  // returns 0, so ret >= 0 stays true and the loop never terminates.
  while ((ret = in.read(buf, readLen, buf.length - readLen)) >= 0 &&
      readLen <= buf.length) {
    readLen += ret;
  }
  return readLen;
}
{code}
If {{readLen}} equals {{buf.length}}, then {{buf.length - readLen}} is zero, and
{{in.read()}} simply returns zero without reading from the stream. In that case
no exception is thrown, and the code is stuck in the while-loop.
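For context, this matches the general {{java.io.InputStream#read(byte[], int, int)}} contract: a zero-length read request returns 0 rather than -1. A tiny standalone illustration (using {{ByteArrayInputStream}} purely for demonstration, not the HDFS stream; it is left with unread bytes so the zero-length call is not at EOF):
{code}
import java.io.ByteArrayInputStream;
import java.io.IOException;

public class ZeroLengthReadDemo {
  public static void main(String[] args) throws IOException {
    byte[] buf = new byte[4];
    // The source has more bytes than the buffer, so the stream is not at EOF
    // when the zero-length read is issued.
    ByteArrayInputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5, 6});

    int readLen = in.read(buf, 0, buf.length);              // fills buf, readLen == 4
    int ret = in.read(buf, readLen, buf.length - readLen);  // requested length is 0

    System.out.println("readLen=" + readLen + ", ret=" + ret);  // prints readLen=4, ret=0
    // ret is 0, not -1, so a loop that only exits on a negative return value spins forever.
  }
}
{code}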
One possible fix is to tighten the condition to {{(ret = in.read(buf, readLen,
buf.length - readLen)) > 0 && readLen < buf.length}}. A probably better fix is
to use {{IOUtils.readFully()}}, which throws an {{IOException}} when it hits a
premature EOF on the input stream; see the v0 patch.
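For illustration, a minimal sketch of what a {{readFully}}-based helper could look like (only a sketch of the idea; the actual v0 patch may differ):
{code}
import java.io.IOException;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.io.IOUtils;

class ReadAllSketch {  // hypothetical holder class, just to keep the snippet self-contained
  static int readAll(FSDataInputStream in, byte[] buf) throws IOException {
    // IOUtils.readFully either fills buf completely or throws an EOFException
    // ("Premature EOF from inputStream"), so the helper can no longer hang.
    IOUtils.readFully(in, buf, 0, buf.length);
    return buf.length;
  }
}
{code}
With this, a short read surfaces as an exception and the test fails fast instead of timing out.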
> StripedFileTestUtil#readAll flaky
> ---------------------------------
>
> Key: HDFS-10293
> URL: https://issues.apache.org/jira/browse/HDFS-10293
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: erasure-coding, test
> Affects Versions: 3.0.0
> Reporter: Mingliang Liu
> Assignee: Mingliang Liu
> Attachments: HDFS-10293.000.patch
>
>
> The flaky test helper method causes several unit tests to fail intermittently.
> For example, the
> {{TestDFSStripedOutputStreamWithFailure#testAddBlockWhenNoSufficientParityNumOfNodes}}
> timed out in a recent run (see
> [exception|https://builds.apache.org/job/PreCommit-HDFS-Build/15158/testReport/org.apache.hadoop.hdfs/TestDFSStripedOutputStreamWithFailure/testAddBlockWhenNoSufficientParityNumOfNodes/]),
> which can be easily reproduced locally.
> Debugging the code suggests that the helper method is stuck in an
> infinite loop. We need a fix to make the test robust.