[
https://issues.apache.org/jira/browse/HDFS-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415162#comment-13415162
]
Daryn Sharp commented on HDFS-3577:
-----------------------------------
{{BoundedInputStream}} is a no-op when it is constructed without a length, so I
think this:
{code}final InputStream is = cl == null ? new BoundedInputStream(in)
    : new BoundedInputStream(in, Long.parseLong(cl));{code}
can be:
{code}final InputStream is = cl == null ? in
    : new BoundedInputStream(in, Long.parseLong(cl));{code}
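For illustration only, here is a minimal self-contained sketch of that selection. {{LimitedInputStream}}, {{wrap}} and {{drain}} are hypothetical names made up for this example; {{LimitedInputStream}} is a simplified stand-in for {{BoundedInputStream}} so that the sketch compiles without extra dependencies, and it is not the code in the patch.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class BoundedStreamSketch {
    /**
     * Hypothetical, simplified stand-in for a length-limited stream.
     * It only limits read(); skip() and available() are not adjusted.
     */
    static final class LimitedInputStream extends FilterInputStream {
        private long remaining;

        LimitedInputStream(InputStream in, long limit) {
            super(in);
            this.remaining = limit;
        }

        @Override
        public int read() throws IOException {
            if (remaining <= 0) {
                return -1; // limit reached: report end of stream
            }
            int b = super.read();
            if (b >= 0) {
                remaining--;
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            if (len == 0) {
                return 0;
            }
            if (remaining <= 0) {
                return -1;
            }
            int n = super.read(buf, off, (int) Math.min(len, remaining));
            if (n > 0) {
                remaining -= n;
            }
            return n;
        }
    }

    /** cl is the raw Content-Length value, or null when the header is absent. */
    static InputStream wrap(InputStream in, String cl) {
        // No length known: hand back the raw stream instead of a no-op wrapper.
        return cl == null ? in : new LimitedInputStream(in, Long.parseLong(cl));
    }

    /** Reads the stream to its end and returns the number of bytes seen. */
    static int drain(InputStream is) {
        try {
            int total = 0;
            while (is.read() != -1) {
                total++;
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[10];
        System.out.println(drain(wrap(new ByteArrayInputStream(data), "4")));
        System.out.println(drain(wrap(new ByteArrayInputStream(data), null)));
    }
}
```

With a null {{cl}} the caller gets the original stream back unchanged, which is the point of the simplification suggested above.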
A chunk size can be specified for an {{HttpURLConnection}}, and we should be able
to enable keep-alive on the socket (I thought it was the default?) to avoid
opening a new connection for every chunk. I don't know anything about
{{MessageBodyWriter}} et al., so if my suggestion isn't feasible and someone else
OKs the {{MessageBodyWriter}} approach, I'm fine with it.
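A rough sketch of the {{HttpURLConnection}} side of this, under some assumptions: the chunk size is taken to mean {{setChunkedStreamingMode(int)}}, which applies to the request body that the client sends; keep-alive is taken to mean the JVM-wide {{http.keepAlive}} system property, which defaults to true. The class name, method names, URL and chunk length are placeholders invented for the example, and whether this fits the WebHDFS code paths is not something the sketch establishes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class ChunkedConnectionSketch {
    /** Reports the JVM-wide keep-alive switch; it is on unless set to false. */
    static boolean keepAliveEnabled() {
        return Boolean.parseBoolean(System.getProperty("http.keepAlive", "true"));
    }

    /**
     * Builds (but does not open) a connection whose request body would be
     * streamed in chunks of chunkLen bytes. The URL is a placeholder.
     */
    static HttpURLConnection prepare(String url, int chunkLen) {
        try {
            HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setRequestMethod("PUT");
            conn.setDoOutput(true);
            // Must be called before the connection is opened.
            conn.setChunkedStreamingMode(chunkLen);
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpURLConnection conn =
            prepare("http://localhost:50070/webhdfs/v1/tmp/example?op=CREATE", 4096);
        System.out.println(conn.getRequestMethod() + " keepAlive=" + keepAliveEnabled());
    }
}
```

For the underlying socket to be reused, the response body generally has to be read to the end and its stream closed rather than calling {{disconnect()}}.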
> WebHdfsFileSystem can not read files larger than 24KB
> -----------------------------------------------------
>
> Key: HDFS-3577
> URL: https://issues.apache.org/jira/browse/HDFS-3577
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs client
> Affects Versions: 2.0.0-alpha
> Reporter: Alejandro Abdelnur
> Assignee: Tsz Wo (Nicholas), SZE
> Priority: Blocker
> Attachments: h3577_20120705.patch, h3577_20120708.patch,
> h3577_20120714.patch
>
>
> When reading a file large enough that the HTTP server running
> webhdfs/httpfs uses chunked transfer encoding (more than 24K in the case of
> webhdfs), the WebHdfsFileSystem client fails with an IOException with
> message *Content-Length header is missing*.
> It looks like WebHdfsFileSystem delegates opening of the input stream to the
> *ByteRangeInputStream.URLOpener* class, which checks for the *Content-Length*
> header. With chunked transfer encoding the *Content-Length* header is not
> present, so the *URLOpener.openInputStream()* method throws an
> exception.
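To make the failure mode in the description concrete, here is a small sketch of a length check that tolerates chunked responses. {{ContentLengthSketch}} and {{expectedLength}} are hypothetical names; this is not the logic in {{URLOpener}} or in the attached patches, and it assumes the two header values have already been read from the response. It also treats {{Transfer-Encoding}} as a single token, which is a simplification.

```java
import java.io.IOException;
import java.util.OptionalLong;

final class ContentLengthSketch {
    /**
     * Returns the declared body length, or empty when the response is chunked
     * and therefore has no Content-Length. Fails only when neither header
     * gives a way to know where the body ends.
     */
    static OptionalLong expectedLength(String contentLength, String transferEncoding)
            throws IOException {
        if (contentLength != null) {
            return OptionalLong.of(Long.parseLong(contentLength.trim()));
        }
        if (transferEncoding != null && "chunked".equalsIgnoreCase(transferEncoding.trim())) {
            return OptionalLong.empty(); // length unknown up front; read until end of stream
        }
        throw new IOException("Content-Length header is missing");
    }

    public static void main(String[] args) throws IOException {
        System.out.println(expectedLength("1024", null));
        System.out.println(expectedLength(null, "chunked"));
    }
}
```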