[ 
https://issues.apache.org/jira/browse/HADOOP-8003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13195856#comment-13195856
 ] 

Tom White commented on HADOOP-8003:
-----------------------------------

I think we should try to do this without breaking compatibility, e.g. by adding 
a new SplittableCompressionCodec interface whose createInputStream method 
returns a SplittableCompressionInputStream interface.
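
A rough sketch of that compatible route follows. It is an illustration under 
assumptions, not Hadoop code: DecompressorStreamStandIn stands in for the real 
DecompressorStream, NewSplittableCodec is a hypothetical name for the new codec 
interface, the createInputStream signature is simplified, and the 8-byte block 
size is invented. The existing abstract class is not touched at all; the point 
is only that a stream can extend a decompressor base class and still be 
splittable once the split accessors live on an interface.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

class CompatSketch {
    // Stand-in for Hadoop's DecompressorStream (assumption: the real class
    // decompresses; this one only passes bytes through).
    static class DecompressorStreamStandIn extends InputStream {
        protected final InputStream in;
        DecompressorStreamStandIn(InputStream in) { this.in = in; }
        @Override public int read() throws IOException { return in.read(); }
    }

    // The new interface named in the comment. The accessor names are borrowed
    // from the existing abstract class; read() is declared here only so that
    // callers can consume bytes through the interface (an assumption).
    interface SplittableCompressionInputStream {
        long getAdjustedStart();
        long getAdjustedEnd();
        int read() throws IOException;
    }

    // Hypothetical name and simplified signature for the new codec interface.
    interface NewSplittableCodec {
        SplittableCompressionInputStream createInputStream(
                InputStream in, long start, long end) throws IOException;
    }

    // Reuses the base class's read() and adds the split accessors.
    static class FixedBlockStream extends DecompressorStreamStandIn
            implements SplittableCompressionInputStream {
        static final long BLOCK = 8; // invented block size for the sketch
        private final long adjustedStart;
        private final long adjustedEnd;

        FixedBlockStream(InputStream in, long start, long end) throws IOException {
            super(in);
            adjustedStart = roundUp(start);
            adjustedEnd = roundUp(end);
            // Position at the first whole block. A real stream would loop,
            // because skip() is allowed to skip fewer bytes than requested.
            in.skip(adjustedStart);
        }

        private static long roundUp(long x) { return (x + BLOCK - 1) / BLOCK * BLOCK; }
        @Override public long getAdjustedStart() { return adjustedStart; }
        @Override public long getAdjustedEnd() { return adjustedEnd; }
    }

    static class FixedBlockCodec implements NewSplittableCodec {
        @Override
        public SplittableCompressionInputStream createInputStream(
                InputStream in, long start, long end) throws IOException {
            return new FixedBlockStream(in, start, end);
        }
    }

    // Returns {adjustedStart, adjustedEnd, firstByteRead} for a split [5, 20).
    static long[] demo() {
        byte[] data = new byte[32];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        try {
            SplittableCompressionInputStream s = new FixedBlockCodec()
                    .createInputStream(new ByteArrayInputStream(data), 5, 20);
            return new long[] { s.getAdjustedStart(), s.getAdjustedEnd(), s.read() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // prints [8, 24, 8]
    }
}
```

Existing codecs built on the abstract class would keep compiling in this 
arrangement, since nothing they depend on changes.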
                
> Make SplitCompressionInputStream an interface instead of an abstract class
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-8003
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8003
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: io
>    Affects Versions: 0.21.0, 0.22.0, 0.23.0, 1.0.0
>            Reporter: Tim Broberg
>
> To be splittable, a codec must implement SplittableCompressionCodec, which has a 
> method returning a SplitCompressionInputStream.
> SplitCompressionInputStream is an abstract class which extends 
> CompressionInputStream, the lowest level compression stream class.
> So, no codec that wants to be splittable can reuse any code from 
> DecompressorStream or BlockDecompressorStream.
> You either have to duplicate that code, or not be splittable.
> SplitCompressionInputStream adds just a few very thin methods. Can we make 
> this an interface rather than an abstract class, to allow splittable 
> decompression streams to extend DecompressorStream, BlockDecompressorStream, 
> or whatever else we dream up in the future?
> To my knowledge, this would impact only the BZip2 codec. None of the others 
> implement this form of splittability yet.
> LineRecordReader looks only at whether the codec is an instance of 
> SplittableCompressionCodec, and then calls the appropriate version of 
> createInputStream. This would not change, so application code should not 
> have to change, just the BZip2 codec and SplitCompressionInputStream.
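
The caller-side behaviour described in the issue can be sketched as below. 
This is a simplified illustration, not the actual LineRecordReader: all types 
are local stand-ins with reduced signatures, BlockDecompressorStreamStandIn 
replaces the real BlockDecompressorStream, the open() helper is hypothetical, 
and the split stream returns its boundaries unadjusted. It shows that the 
instanceof check on the codec reads the same whether SplitCompressionInputStream 
is an abstract class or, as here, an interface.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class DispatchSketch {
    interface CompressionCodec {
        InputStream createInputStream(InputStream in);
    }

    // SplitCompressionInputStream recast as an interface, as the issue proposes.
    interface SplitCompressionInputStream {
        long getAdjustedStart();
        long getAdjustedEnd();
    }

    // Simplified: the real createInputStream overload takes more parameters.
    interface SplittableCompressionCodec extends CompressionCodec {
        SplitCompressionInputStream createInputStream(InputStream in, long start, long end);
    }

    // Stand-in for BlockDecompressorStream; it only passes bytes through.
    static class BlockDecompressorStreamStandIn extends InputStream {
        private final InputStream in;
        BlockDecompressorStreamStandIn(InputStream in) { this.in = in; }
        @Override public int read() throws IOException { return in.read(); }
    }

    // Possible only because the split type is an interface: the stream
    // inherits the block-decompression code and is still splittable.
    static class SplitBlockStream extends BlockDecompressorStreamStandIn
            implements SplitCompressionInputStream {
        private final long start;
        private final long end;
        SplitBlockStream(InputStream in, long start, long end) {
            super(in);
            this.start = start;
            this.end = end;
        }
        // A real codec would move these to block boundaries.
        @Override public long getAdjustedStart() { return start; }
        @Override public long getAdjustedEnd() { return end; }
    }

    static class PlainCodec implements CompressionCodec {
        @Override public InputStream createInputStream(InputStream in) {
            return new BlockDecompressorStreamStandIn(in);
        }
    }

    static class SplitCodec implements SplittableCompressionCodec {
        @Override public InputStream createInputStream(InputStream in) {
            return new BlockDecompressorStreamStandIn(in);
        }
        @Override public SplitCompressionInputStream createInputStream(
                InputStream in, long start, long end) {
            return new SplitBlockStream(in, start, end);
        }
    }

    // Hypothetical helper mirroring the check the issue attributes to
    // LineRecordReader: only the codec's type decides which overload is used.
    static String open(CompressionCodec codec, InputStream raw, long start, long end) {
        if (codec instanceof SplittableCompressionCodec) {
            SplitCompressionInputStream s =
                    ((SplittableCompressionCodec) codec).createInputStream(raw, start, end);
            return "split[" + s.getAdjustedStart() + "," + s.getAdjustedEnd() + ")";
        }
        codec.createInputStream(raw);
        return "whole-file";
    }

    public static void main(String[] args) {
        InputStream raw = new ByteArrayInputStream(new byte[32]);
        System.out.println(open(new SplitCodec(), raw, 5, 20));
        System.out.println(open(new PlainCodec(), raw, 5, 20));
    }
}
```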

