[jira] [Commented] (HDFS-3577) WebHdfsFileSystem can not read files larger than 24KB

Eli Collins (JIRA) Fri, 20 Jul 2012 00:02:39 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13418974#comment-13418974
 ]


Eli Collins commented on HDFS-3577:
-----------------------------------

Hey Nicholas,

Did you test this with distcp? Trying to distcp from a recent trunk build with 
this change still fails with *Content-Length header is missing*. Hadoop fs -get 
using webhdfs with the same file works.

{noformat}
12/07/19 23:56:43 INFO mapreduce.Job: Task Id : 
attempt_1342766959778_0002_m_000000_0, Status : FAILED
Error: java.io.IOException: File copy failed: 
webhdfs://eli-thinkpad:50070/user/eli/data1/big.iso --> 
hdfs://localhost:8020/user/eli/data4/data1/big.iso
        at 
org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:262)
        at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:229)
        at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:45)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:726)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:333)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:154)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:149)
Caused by: java.io.IOException: Couldn't run retriable-command: Copying 
webhdfs://eli-thinkpad:50070/user/eli/data1/big.iso to 
hdfs://localhost:8020/user/eli/data4/data1/big.iso
        at 
org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101)
        at 
org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:258)
        ... 10 more
Caused by: 
org.apache.hadoop.tools.mapred.RetriableFileCopyCommand$CopyReadException: 
java.io.IOException: Content-Length header is missing
        at 
org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:201)
        at 
org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyBytes(RetriableFileCopyCommand.java:167)
        at 
org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.copyToTmpFile(RetriableFileCopyCommand.java:112)
        at 
org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:90)
        at 
org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:71)
        at 
org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87)
        ... 11 more
Caused by: java.io.IOException: Content-Length header is missing
        at 
org.apache.hadoop.hdfs.ByteRangeInputStream.openInputStream(ByteRangeInputStream.java:125)
        at 
org.apache.hadoop.hdfs.ByteRangeInputStream.getInputStream(ByteRangeInputStream.java:103)
        at 
org.apache.hadoop.hdfs.ByteRangeInputStream.read(ByteRangeInputStream.java:158)
        at java.io.DataInputStream.read(DataInputStream.java:132)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
        at java.io.FilterInputStream.read(FilterInputStream.java:90)
        at 
org.apache.hadoop.tools.util.ThrottledInputStream.read(ThrottledInputStream.java:70)
        at 
org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.readBytes(RetriableFileCopyCommand.java:198)
        ... 16 more
{noformat}
                
> WebHdfsFileSystem can not read files larger than 24KB
> -----------------------------------------------------
>
>                 Key: HDFS-3577
>                 URL: https://issues.apache.org/jira/browse/HDFS-3577
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 0.23.3, 2.0.0-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Tsz Wo (Nicholas), SZE
>            Priority: Blocker
>             Fix For: 0.23.3, 2.1.0-alpha
>
>         Attachments: h3577_20120705.patch, h3577_20120708.patch, 
> h3577_20120714.patch, h3577_20120716.patch, h3577_20120717.patch
>
>
> If reading a file large enough for which the httpserver running 
> webhdfs/httpfs uses chunked transfer encoding (more than 24K in the case of 
> webhdfs), then the WebHdfsFileSystem client fails with an IOException with 
> message *Content-Length header is missing*.
> It looks like WebHdfsFileSystem is delegating opening of the inputstream to 
> *ByteRangeInputStream.URLOpener* class, which checks for the *Content-Length* 
> header, but when using chunked transfer encoding the *Content-Length* header 
> is not present and  the *URLOpener.openInputStream()* method thrown an 
> exception.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HDFS-3577) WebHdfsFileSystem can not read files larger than 24KB

Reply via email to