[
https://issues.apache.org/jira/browse/COMPRESS-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15223246#comment-15223246
]
Thomas Meyer commented on COMPRESS-207:
---------------------------------------
Some background: I was inspired by the possibility of processing a bzip2
stream block by block while reading the book Hadoop: The Definitive Guide
(http://shop.oreilly.com/product/0636920033448.do). Hadoop has so-called
splittable compression streams, which AFAIK work like this: it divides the
total length of the compressed input file by, e.g., 2 and then searches the
stream for the pi marker (the block-start magic 0x314159265359); once found,
it starts to decompress from there. This should mostly work, but I guess one
could create a bzip2 stream in which the pi number also appears in the output
of the compression algorithm, i.e. inside a block, although this is very
theoretical.
See also:
https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/BZip2Codec.java
and
https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/bzip2/CBZip2InputStream.java
(which seems to have been copied from the commons-compress
implementation; at least it looks very similar)
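
To illustrate the marker search described above, here is a minimal sketch of
how one could look for the block-start magic from an arbitrary byte offset.
It is not the Hadoop or commons-compress code: the class name
BlockMarkerScan and the method findBlockMarker are made up for this example.
It assumes the whole compressed input is available as a byte array and
relies on two properties of the bzip2 format: bits are packed most
significant bit first, and blocks are not byte-aligned, so the search has to
run bit by bit.

```java
public class BlockMarkerScan {
    // 48-bit bzip2 block-start magic (the BCD digits of pi).
    static final long BLOCK_MAGIC = 0x314159265359L;
    static final long MASK_48 = 0xFFFFFFFFFFFFL;

    /**
     * Hypothetical helper: scans data bit by bit, starting at byte index
     * fromByte, and returns the absolute bit offset at which the first
     * block magic starts, or -1 if none is found.
     */
    static long findBlockMarker(byte[] data, int fromByte) {
        long window = 0;   // the most recent 48 bits, MSB first
        long bitsSeen = 0; // number of bits consumed since fromByte
        for (int pos = fromByte; pos < data.length; pos++) {
            for (int bit = 7; bit >= 0; bit--) {
                window = ((window << 1) | ((data[pos] >> bit) & 1)) & MASK_48;
                bitsSeen++;
                // Only compare once a full 48-bit window has been read.
                if (bitsSeen >= 48 && window == BLOCK_MAGIC) {
                    return (long) fromByte * 8 + bitsSeen - 48;
                }
            }
        }
        return -1;
    }
}
```

For a stream that starts with the 4-byte header "BZh9" followed directly by
the first block, findBlockMarker(data, 0) returns 32. A hit only means that
the 48-bit pattern occurs at that position. As noted above, the same pattern
could in theory also show up inside compressed data, so a real splitter
would still have to validate the block it finds.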
> add notifier support for new block in BZip2CompressorInputStream
> ----------------------------------------------------------------
>
> Key: COMPRESS-207
> URL: https://issues.apache.org/jira/browse/COMPRESS-207
> Project: Commons Compress
> Issue Type: New Feature
> Components: Compressors
> Affects Versions: 1.4.1
> Reporter: Thomas Meyer
> Priority: Minor
> Labels: API, bzip
> Attachments:
> 0001-Add-notifier-support-for-new-block-in-BZip2Compresso.patch,
> BZip2CompressorInputStream-add-newBlock-notifier.patch,
> BZip2CompressorInputStream-add-newBlock-notifier.patch,
> BZip2CompressorInputStream-add-newBlock-notifier.patch
>
>
> Hi,
> the attached patch enables a program to add a listener that is notified
> when a new bzip2 block is detected.
> The notifier is called with:
> - xxx.newBlock(this, currBlockPosition)
> - this = the current BZip2CompressorInputStream object
> - currBlockPosition = The offset (i.e. start position) in the compressed
> input stream of the current block
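
As a rough illustration of the callback shape described in the issue, the
sketch below uses made-up names throughout: BlockListener, FakeBlockStream
and collectOffsets are hypothetical and are not taken from the attached
patch or from the commons-compress API. A stand-in class plays the role of
BZip2CompressorInputStream and simply reports a list of block offsets that
is known in advance.

```java
import java.util.ArrayList;
import java.util.List;

public class NewBlockNotifierSketch {
    /** Hypothetical callback; the attached patch may define it differently. */
    interface BlockListener {
        void newBlock(Object source, long currBlockPosition);
    }

    /** Stand-in for a decompressing stream that knows where its blocks start. */
    static class FakeBlockStream {
        private final long[] blockOffsets;
        private final BlockListener listener;

        FakeBlockStream(long[] blockOffsets, BlockListener listener) {
            this.blockOffsets = blockOffsets;
            this.listener = listener;
        }

        /** Notifies the listener once per block, in stream order. */
        void readAll() {
            for (long offset : blockOffsets) {
                listener.newBlock(this, offset);
                // A real stream would decompress the block here.
            }
        }
    }

    /** Collects every offset the listener is told about. */
    static List<Long> collectOffsets(long[] blockOffsets) {
        List<Long> seen = new ArrayList<>();
        new FakeBlockStream(blockOffsets, (source, pos) -> seen.add(pos)).readAll();
        return seen;
    }
}
```

A caller interested in split points would register such a listener and
record each reported offset, for example to build an index of block start
positions while reading the file once.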