[
https://issues.apache.org/jira/browse/HDFS-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419410#comment-13419410
]
Daryn Sharp commented on HDFS-3577:
-----------------------------------
Yes, HDFS-3166 (add timeouts) exposed the 200s tail on >2GB files caused by a
Java bug. The content-length has to be known in order to work around the Java
bug, and thus avoid the read timeout.
What I think can be done:
* If chunked, eliminate the content-length requirement
* If not chunked and there is no content-length, obtain the length from a
file stat, a HEAD request, etc.
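
A rough sketch of that decision logic, for discussion only. This is not the
actual ByteRangeInputStream/URLOpener code: the class name, the UNKNOWN
sentinel, and the statLength callback (standing in for a file stat or HEAD
request) are all hypothetical. The header map has the shape returned by
HttpURLConnection.getHeaderFields().

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical helper, not part of Hadoop: decides how the client learns the stream length.
final class StreamLengthResolver {
    /** Sentinel: length is unknown, so read until the chunked stream ends. */
    static final long UNKNOWN = -1L;

    /** Case-insensitive header lookup; skips the null key used for the status line. */
    static String header(Map<String, List<String>> headers, String name) {
        for (Map.Entry<String, List<String>> e : headers.entrySet()) {
            String key = e.getKey();
            if (key != null && key.equalsIgnoreCase(name) && !e.getValue().isEmpty()) {
                return e.getValue().get(0);
            }
        }
        return null;
    }

    /**
     * @param headers    response headers of the open request
     * @param statLength assumed fallback (file stat, HEAD, etc.); only invoked when needed
     */
    static long resolveLength(Map<String, List<String>> headers, LongSupplier statLength) {
        String encoding = header(headers, "Transfer-Encoding");
        if (encoding != null && encoding.toLowerCase(Locale.ROOT).contains("chunked")) {
            // Chunked: no Content-Length requirement.
            return UNKNOWN;
        }
        String contentLength = header(headers, "Content-Length");
        if (contentLength != null) {
            // Not chunked, header present: use it. Parsed as long so >2GB values fit.
            return Long.parseLong(contentLength.trim());
        }
        // Not chunked and no Content-Length: ask the fallback for the length.
        return statLength.getAsLong();
    }
}
```

The fallback is passed as a callback so the extra round trip is only paid in
the third case.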
> WebHdfsFileSystem can not read files larger than 24KB
> -----------------------------------------------------
>
> Key: HDFS-3577
> URL: https://issues.apache.org/jira/browse/HDFS-3577
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs client
> Affects Versions: 0.23.3, 2.0.0-alpha
> Reporter: Alejandro Abdelnur
> Assignee: Tsz Wo (Nicholas), SZE
> Priority: Blocker
> Fix For: 0.23.3, 2.1.0-alpha
>
> Attachments: h3577_20120705.patch, h3577_20120708.patch,
> h3577_20120714.patch, h3577_20120716.patch, h3577_20120717.patch
>
>
> When reading a file large enough that the HTTP server running
> webhdfs/httpfs uses chunked transfer encoding (more than 24K in the case of
> webhdfs), the WebHdfsFileSystem client fails with an IOException with the
> message *Content-Length header is missing*.
> It looks like WebHdfsFileSystem is delegating opening of the inputstream to
> *ByteRangeInputStream.URLOpener* class, which checks for the *Content-Length*
> header, but when using chunked transfer encoding the *Content-Length* header
> is not present and the *URLOpener.openInputStream()* method throws an
> exception.