[ 
https://issues.apache.org/jira/browse/COMPRESS-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15223246#comment-15223246
 ] 

Thomas Meyer commented on COMPRESS-207:
---------------------------------------

Some background: I was inspired to process the bzip2 stream block by block
while reading this book: Hadoop: The Definitive Guide
(http://shop.oreilly.com/product/0636920033448.do). Hadoop has so-called
splittable compression streams, which AFAIK work like this: it divides the
total length of the compressed input file by 2 (for example) and then searches
the stream for the pi marker (the start-of-block magic number). Once the
marker is found, it starts to uncompress. This should mostly work, but I guess
one can create a bzip2 stream that contains the pi digits as output of the
compression algorithm, although this is very theoretical.
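
To make the marker search concrete, here is a rough sketch in Java. It is not
Hadoop's or commons-compress's actual code; the class and method names are made
up for illustration. It assumes the bzip2 block magic is the 48-bit value
0x314159265359 and that blocks are not byte-aligned, so the scan has to slide
over the stream bit by bit (most significant bit first):

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of any library.
final class BlockMarkerScan {
    // 48-bit bzip2 block header magic: the BCD digits of pi (3.14159265359).
    static final long BLOCK_MAGIC = 0x314159265359L;
    static final long MASK_48 = 0xFFFFFFFFFFFFL;

    /**
     * Returns the bit offset (counted from where the scan started) of the
     * first block magic, or -1 if the stream ends without a match.
     */
    static long findBlockMagic(InputStream in) throws IOException {
        long window = 0;   // the last 48 bits seen
        long bitsRead = 0;
        int b;
        while ((b = in.read()) != -1) {
            // bzip2 packs bits most-significant-bit first.
            for (int i = 7; i >= 0; i--) {
                window = ((window << 1) | ((b >> i) & 1)) & MASK_48;
                bitsRead++;
                if (bitsRead >= 48 && window == BLOCK_MAGIC) {
                    return bitsRead - 48;
                }
            }
        }
        return -1;
    }
}
```

A splitter along these lines would seek to roughly the middle of the file,
call something like findBlockMagic from there, and start decompressing at the
returned bit offset. A match inside compressed data would be a false positive,
which is the theoretical case mentioned above.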

See also:
https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/BZip2Codec.java
and 
https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/bzip2/CBZip2InputStream.java
(which seems to have been copied and pasted from the commons-compress
implementation; at least it looks very similar)



> add notifier support for new block in BZip2CompressorInputStream
> ----------------------------------------------------------------
>
>                 Key: COMPRESS-207
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-207
>             Project: Commons Compress
>          Issue Type: New Feature
>          Components: Compressors
>    Affects Versions: 1.4.1
>            Reporter: Thomas Meyer
>            Priority: Minor
>              Labels: API, bzip
>         Attachments: 
> 0001-Add-notifier-support-for-new-block-in-BZip2Compresso.patch, 
> BZip2CompressorInputStream-add-newBlock-notifier.patch, 
> BZip2CompressorInputStream-add-newBlock-notifier.patch, 
> BZip2CompressorInputStream-add-newBlock-notifier.patch
>
>
> Hi,
> the attached patch enables a program to add a listener that is called when
> a new bzip2 block is detected.
> The notifier is called with:
> - xxx.newBlock(this, currBlockPosition)
> - this = the current BZip2CompressorInputStream object
> - currBlockPosition = the offset (i.e. start position) in the compressed
> input stream of the current block
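
For illustration only, a listener of the shape described in the issue could
look roughly like this. The interface and class names are hypothetical and may
not match the attached patch; only the newBlock(this, currBlockPosition) call
shape is taken from the description, and the stream is typed as a plain
InputStream to keep the sketch self-contained:

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback; the name used in the attached patch may differ.
interface NewBlockListener {
    // source: the stream that detected the block
    // currBlockPosition: start offset of the block in the compressed input
    void newBlock(InputStream source, long currBlockPosition);
}

// Example listener that just records every reported block offset.
final class BlockOffsetRecorder implements NewBlockListener {
    final List<Long> offsets = new ArrayList<>();

    @Override
    public void newBlock(InputStream source, long currBlockPosition) {
        offsets.add(currBlockPosition);
    }
}
```

A program could register such a recorder before reading and afterwards use
the collected offsets as candidate split points.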



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
