[ https://issues.apache.org/jira/browse/HADOOP-8582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13410312#comment-13410312 ]
Harsh J commented on HADOOP-8582:
---------------------------------
Hi Paul, thanks for filing this and for the patch! I've run into this as well.
The patch looks good, but could you also add a test, if possible, so that we
don't regress on this? You can selectively disable loading of the native libs
via configuration if need be.
Also, perhaps the error message could be more specific, for example
"SequenceFile.Reader can't read Gzip compressed files without native-hadoop
libraries" or so. Feel free to improve the Writer's message while you're at it
too, if needed.
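To make the suggestion concrete, here is a rough sketch of the kind of up-front check and message meant above. It is not taken from the attached patch: the class NativeGzipCheck, the method requireNativeForGzip and the boolean nativeLoaded are hypothetical stand-ins for whatever native-library check the Reader would really use.

```java
import java.io.IOException;

public class NativeGzipCheck {
    // Hypothetical helper: 'nativeLoaded' stands in for the real
    // native-library availability check; 'component' names the caller.
    static void requireNativeForGzip(boolean nativeLoaded, String component)
            throws IOException {
        if (!nativeLoaded) {
            throw new IOException(component
                + " can't read Gzip compressed files without native-hadoop libraries");
        }
    }

    public static void main(String[] args) {
        try {
            // Simulate a Reader being opened with native libs unavailable.
            requireNativeForGzip(false, "SequenceFile.Reader");
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing early with a message like this would point the user at the missing native libraries rather than at the gzip stream itself.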
> Improve error reporting for GZIP-compressed SequenceFiles with missing native
> libraries.
> ----------------------------------------------------------------------------------------
>
> Key: HADOOP-8582
> URL: https://issues.apache.org/jira/browse/HADOOP-8582
> Project: Hadoop Common
> Issue Type: Improvement
> Components: io
> Affects Versions: 2.0.0-alpha
> Environment: Centos 5.8, Java 6 Update 26
> Reporter: Paul Wilkinson
> Priority: Minor
> Attachments: HADOOP-8582-1.diff
>
>
> At present it is not possible to write or read block-compressed SequenceFiles
> using the GZIP codec without the native libraries being available.
> The SequenceFile.Writer code checks for the availability of native libraries
> and throws a useful exception, but the SequenceFile.Reader doesn't do the
> same:
> {noformat}
> Exception in thread "main" java.io.EOFException
> at java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:249)
> at java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:239)
> at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:142)
> at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:58)
> at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:67)
> at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream$ResetableGZIPInputStream.<init>(GzipCodec.java:95)
> at org.apache.hadoop.io.compress.GzipCodec$GzipInputStream.<init>(GzipCodec.java:104)
> at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:173)
> at org.apache.hadoop.io.compress.GzipCodec.createInputStream(GzipCodec.java:183)
> at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1591)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1493)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1480)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1475)
> at test.SequenceReader.read(SequenceReader.java:23)
> {noformat}
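The top frames of the trace above are plain JDK code: java.util.zip.GZIPInputStream reads the gzip header in its constructor and throws EOFException when no bytes are available. The sketch below reproduces only that JDK behaviour with an empty stream and no Hadoop classes. That the Reader hands the codec a stream with no data at this point is an assumption read off the trace, and the class name EmptyGzipDemo is illustrative.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;

public class EmptyGzipDemo {
    // Returns the simple name of the exception thrown while opening a
    // GZIPInputStream over 'data', or "none" if construction succeeds.
    static String openGzip(byte[] data) {
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(data))) {
            return "none";
        } catch (IOException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // The constructor tries to read the gzip header from zero bytes.
        System.out.println(openGzip(new byte[0])); // prints "EOFException"
    }
}
```

The bare EOFException says nothing about native libraries, which is why it is hard to diagnose from the Reader side.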