[ https://issues.apache.org/jira/browse/HDFS-9384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Nauroth updated HDFS-9384:
--------------------------------
Attachment: HDFS-9384.001.patch
I'm attaching a patch to fix the problem. I've run the test with this patch
multiple times in multiple environments, and the failure no longer reproduces.
I left a comment in the patch explaining the problem in more detail. I'm
pasting it here for convenience:
{code}
// The second request can be sent with Transfer-Encoding: chunked.
// The Java HTTP client tends to split the headers and the chunked
// body into separate writes, so the first read above likely only read
// the headers. We must fully consume the input to prevent a hang on
// the client side.
{code}
{{TestWebHdfsTimeouts}} is an example of an existing similar test that already
works correctly, because it follows the same strategy of fully consuming the
input sent by the client.
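For illustration only, here is a minimal sketch of what "fully consuming the
input" can look like in a hand-coded test server. This is not the code from
HDFS-9384.001.patch. The class and method names ({{RequestDrainer}},
{{drainRequest}}, {{readLine}}, {{skipFully}}) are hypothetical, and the sketch
assumes a well-formed HTTP/1.1 request that uses either Content-Length or
Transfer-Encoding: chunked. It reads the headers, then keeps reading until the
body (including the terminating zero-length chunk) has been consumed, however
the client split its writes.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.Locale;

/** Hypothetical helper for a hand-coded test server; not taken from the patch. */
final class RequestDrainer {

  /** Reads one line terminated by LF (CR is dropped); null if the stream is already at EOF. */
  static String readLine(InputStream in) throws IOException {
    StringBuilder sb = new StringBuilder();
    int b;
    while ((b = in.read()) != -1) {
      if (b == '\n') {
        return sb.toString();
      }
      if (b != '\r') {
        sb.append((char) b);
      }
    }
    return sb.length() > 0 ? sb.toString() : null;
  }

  /** Reads and discards exactly n bytes, looping over short reads. */
  static void skipFully(InputStream in, long n) throws IOException {
    byte[] buf = new byte[4096];
    while (n > 0) {
      int r = in.read(buf, 0, (int) Math.min(buf.length, n));
      if (r < 0) {
        throw new EOFException("stream ended inside request body");
      }
      n -= r;
    }
  }

  /** Consumes one full request (headers and body); returns the body size in bytes. */
  static long drainRequest(InputStream in) throws IOException {
    long contentLength = 0;
    boolean chunked = false;
    String line;
    // The request line and headers end at the first empty line.
    while ((line = readLine(in)) != null && !line.isEmpty()) {
      String lower = line.toLowerCase(Locale.ROOT);
      if (lower.startsWith("content-length:")) {
        contentLength = Long.parseLong(line.substring(line.indexOf(':') + 1).trim());
      } else if (lower.startsWith("transfer-encoding:") && lower.contains("chunked")) {
        chunked = true;
      }
    }
    if (!chunked) {
      skipFully(in, contentLength);
      return contentLength;
    }
    long total = 0;
    while (true) {
      String sizeLine = readLine(in);
      if (sizeLine == null) {
        throw new EOFException("stream ended inside chunked body");
      }
      // The hex chunk size may be followed by ";extension".
      int semi = sizeLine.indexOf(';');
      String hex = (semi >= 0 ? sizeLine.substring(0, semi) : sizeLine).trim();
      long size = Long.parseLong(hex, 16);
      if (size == 0) {
        break; // last chunk
      }
      skipFully(in, size);
      total += size;
      readLine(in); // CRLF that follows the chunk data
    }
    // Optional trailer headers, terminated by an empty line.
    while ((line = readLine(in)) != null && !line.isEmpty()) {
      // discard
    }
    return total;
  }
}
```

A test server would call something like {{RequestDrainer.drainRequest(client.getInputStream())}}
once per request before writing its canned response, instead of issuing a
single {{read}} and assuming it captured everything.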
> TestWebHdfsContentLength intermittently hangs and fails due to TCP
> conversation mismatch between client and server.
> -------------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-9384
> URL: https://issues.apache.org/jira/browse/HDFS-9384
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: test
> Reporter: Chris Nauroth
> Assignee: Chris Nauroth
> Priority: Minor
> Attachments: HDFS-9384.001.patch
>
>
> {{TestWebHdfsContentLength}} runs a simple hand-coded HTTP server in a
> background thread to simulate some WebHDFS server responses. In some
> environments (notably Windows), I have observed that the test can hang and
> fail intermittently. The root cause is that the server fails to fully
> consume the client's input. This causes a mismatch in the TCP conversation
> state, and ultimately the client side hangs, then aborts after the 60-second
> socket timeout.