[
https://issues.apache.org/jira/browse/HADOOP-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195856#comment-13195856
]
Tom White commented on HADOOP-8003:
-----------------------------------
I think we should try to do this without breaking compatibility, e.g. by adding
a new SplittableCompressionCodec interface whose createInputStream method
returns a SplittableCompressionInputStream interface.
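
A rough sketch of what that could look like, in plain Java with no Hadoop
dependency. Everything here is an assumption made for illustration:
PassThroughDecompressorStream stands in for an existing concrete stream such
as DecompressorStream, SplittableStreamCodec is a hypothetical name for the
new codec interface, and the getAdjustedStart/getAdjustedEnd accessors are
only modeled on the existing abstract class.

```java
import java.io.Closeable;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Stand-in for an existing concrete stream such as DecompressorStream.
// Hypothetical: a real one would decompress; this one passes bytes through.
class PassThroughDecompressorStream extends FilterInputStream {
    PassThroughDecompressorStream(InputStream in) {
        super(in);
    }
}

// The new interface from the comment: split accessors plus the read/close
// calls a caller needs, so that no common base class is required.
interface SplittableCompressionInputStream extends Closeable {
    long getAdjustedStart();
    long getAdjustedEnd();
    int read(byte[] b, int off, int len) throws IOException;
}

// Hypothetical new codec interface. The existing codec interface and the
// existing abstract class stay as they are, so current codecs keep compiling.
interface SplittableStreamCodec {
    SplittableCompressionInputStream createInputStream(
            InputStream raw, long start, long end) throws IOException;
}

// A splittable stream that reuses the concrete stream's code by extending it
// and adds the split accessors by implementing the interface.
class SplittableDemoStream extends PassThroughDecompressorStream
        implements SplittableCompressionInputStream {
    private final long start;
    private final long end;

    SplittableDemoStream(InputStream in, long start, long end) {
        super(in);
        this.start = start;
        this.end = end;
    }

    @Override
    public long getAdjustedStart() {
        return start;
    }

    @Override
    public long getAdjustedEnd() {
        return end;
    }
}

// Minimal codec that hands out the stream above.
class DemoCodec implements SplittableStreamCodec {
    @Override
    public SplittableCompressionInputStream createInputStream(
            InputStream raw, long start, long end) {
        return new SplittableDemoStream(raw, start, end);
    }
}
```

The point of the sketch is that SplittableDemoStream inherits read and close
from the concrete stream, which is what the abstract class rules out today.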
> Make SplitCompressionInputStream an interface instead of an abstract class
> --------------------------------------------------------------------------
>
> Key: HADOOP-8003
> URL: https://issues.apache.org/jira/browse/HADOOP-8003
> Project: Hadoop Common
> Issue Type: New Feature
> Components: io
> Affects Versions: 0.21.0, 0.22.0, 0.23.0, 1.0.0
> Reporter: Tim Broberg
>
> To be splittable, a codec must implement SplittableCompressionCodec, which
> has a method returning a SplitCompressionInputStream.
> SplitCompressionInputStream is an abstract class which extends
> CompressionInputStream, the lowest level compression stream class.
> So, no codec that wants to be splittable can reuse any code from
> DecompressorStream or BlockDecompressorStream.
> You either have to duplicate that code, or not be splittable.
> SplitCompressionInputStream adds just a few very thin functions. Can we make
> this an interface rather than an abstract class to allow splittable
> decompression streams to extend DecompressorStream, BlockDecompressorStream,
> or whatever else we should scheme up in the future?
> To my knowledge, this would impact only the BZip2 codec. None of the other
> codecs implement this form of splittability yet.
> LineRecordReader looks only at whether the codec is an instance of
> SplittableCompressionCodec, and then calls the appropriate version of
> createInputStream. This would not change, so the application code should not
> have to change, just the BZip2 codec and SplitCompressionInputStream.
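
For reference, the dispatch described in the last quoted paragraph can be
sketched roughly as below. This is not the actual LineRecordReader code: Codec
and SplittableCodec are simplified stand-ins for the Hadoop interfaces, the
(raw, start, end) overload is a hypothetical signature, and the returned
strings exist only to make the chosen branch visible.

```java
import java.io.InputStream;

// Stand-in for the base codec interface; not the real Hadoop signature.
interface Codec {
    InputStream createInputStream(InputStream raw);
}

// Stand-in for the splittable codec interface, with a hypothetical
// split-aware overload that takes the byte range of the split.
interface SplittableCodec extends Codec {
    InputStream createInputStream(InputStream raw, long start, long end);
}

// A codec that cannot be split: only the base method is available.
class PlainCodec implements Codec {
    @Override
    public InputStream createInputStream(InputStream raw) {
        return raw;
    }
}

// A codec that can be split: both overloads are available.
class RangeCodec implements SplittableCodec {
    @Override
    public InputStream createInputStream(InputStream raw) {
        return raw;
    }

    @Override
    public InputStream createInputStream(InputStream raw, long start, long end) {
        return raw;
    }
}

class ReaderSketch {
    // The caller tests only the codec type, then picks the matching overload.
    // Whether the returned stream type is an interface or an abstract class
    // never shows up in this check.
    static String open(Codec codec, InputStream raw, long start, long end) {
        if (codec instanceof SplittableCodec) {
            ((SplittableCodec) codec).createInputStream(raw, start, end);
            return "split-aware";
        }
        codec.createInputStream(raw);
        return "whole-file";
    }
}
```

Because the check is made on the codec rather than on the stream, changing
what the stream type is would leave a caller of this shape untouched.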