[ https://issues.apache.org/jira/browse/HDFS-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418974#comment-13418974 ]
Eli Collins commented on HDFS-3577:
-----------------------------------
Hey Nicholas,
Did you test this with distcp? Trying to distcp from a recent trunk build with
this change still fails with *Content-Length header is missing*, whereas
hadoop fs -get over webhdfs works for the same file.
{noformat}
12/07/19 23:56:43 INFO mapreduce.Job: Task Id : attempt_1342766959778_0002_m_000000_0, Status : FAILED
Error: java.io.IOException: File copy failed: webhdfs://eli-thinkpad:50070/user/eli/data1/big.iso --> hdfs://localhost:8020/user/eli/data4/data1/big.iso
    at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:262)
    at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:229)
    at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:726)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:154)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:149)
Caused by: java.io.IOException: Couldn't run retriable-command: Copying webhdfs://eli-thinkpad:50070/user/eli/data1/big.iso to hdfs://localhost:8020/user/eli/data4/data1/big.iso
    at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101)
    at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:258)
    ... 10 more
Caused by: org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: java.io.IOException: Content-Length header is missing
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:201)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyBytes(RetriableFileCopyCommand.java:167)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyToTmpFile(RetriableFileCopyCommand.java:112)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:90)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:71)
    at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
    ... 11 more
Caused by: java.io.IOException: Content-Length header is missing
    at org.apache.hadoop.hdfs.ByteRangeInputStream.openInputStream(ByteRangeInputStream.java:125)
    at org.apache.hadoop.hdfs.ByteRangeInputStream.getInputStream(ByteRangeInputStream.java:103)
    at org.apache.hadoop.hdfs.ByteRangeInputStream.read(ByteRangeInputStream.java:158)
    at java.io.DataInputStream.read(DataInputStream.java:132)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
    at java.io.FilterInputStream.read(FilterInputStream.java:90)
    at org.apache.hadoop.tools.util.ThrottledInputStream.read(ThrottledInputStream.java:70)
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:198)
    ... 16 more
{noformat}
> WebHdfsFileSystem can not read files larger than 24KB
> -----------------------------------------------------
>
> Key: HDFS-3577
> URL: https://issues.apache.org/jira/browse/HDFS-3577
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs client
> Affects Versions: 0.23.3, 2.0.0-alpha
> Reporter: Alejandro Abdelnur
> Assignee: Tsz Wo (Nicholas), SZE
> Priority: Blocker
> Fix For: 0.23.3, 2.1.0-alpha
>
> Attachments: h3577_20120705.patch, h3577_20120708.patch,
> h3577_20120714.patch, h3577_20120716.patch, h3577_20120717.patch
>
>
> When reading a file large enough that the HTTP server running
> webhdfs/httpfs uses chunked transfer encoding (more than 24K in the case of
> webhdfs), the WebHdfsFileSystem client fails with an IOException with the
> message *Content-Length header is missing*.
> It looks like WebHdfsFileSystem delegates opening the input stream to the
> *ByteRangeInputStream.URLOpener* class, which checks for the *Content-Length*
> header. With chunked transfer encoding the *Content-Length* header
> is not present, so the *URLOpener.openInputStream()* method throws an
> exception.
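
For illustration, the header check described in the issue can be sketched roughly as below. This is a minimal sketch only, not the actual ByteRangeInputStream code and not the attached patch: the class name ContentLengthCheck, the method resolveLength, and the use of -1 to mean "length unknown" are all hypothetical. It assumes the response headers are already available as a plain map with exactly these key spellings (real HTTP header names are case-insensitive).

```java
import java.io.IOException;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of Hadoop.
final class ContentLengthCheck {

    /**
     * Returns the declared body length, or -1 when the response uses
     * chunked transfer encoding and therefore carries no Content-Length.
     * Throws only when neither header gives a way to delimit the body.
     */
    static long resolveLength(Map<String, String> headers) throws IOException {
        String contentLength = headers.get("Content-Length");
        if (contentLength != null) {
            return Long.parseLong(contentLength.trim());
        }
        // A chunked response legitimately omits Content-Length, so treat the
        // length as unknown and let the caller read until end of stream.
        String transferEncoding = headers.get("Transfer-Encoding");
        if (transferEncoding != null
                && transferEncoding.trim().equalsIgnoreCase("chunked")) {
            return -1L;
        }
        throw new IOException("Content-Length header is missing");
    }

    public static void main(String[] args) throws IOException {
        // Small response with an explicit length.
        System.out.println(resolveLength(Map.of("Content-Length", "1024")));      // prints 1024
        // Large response sent with chunked transfer encoding.
        System.out.println(resolveLength(Map.of("Transfer-Encoding", "chunked"))); // prints -1
    }
}
```

A check that requires Content-Length unconditionally would take the throwing branch for the second case, which matches the exception message in the report.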