[
https://issues.apache.org/jira/browse/COMPRESS-111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13651308#comment-13651308
]
Damjan Jovanovic commented on COMPRESS-111:
-------------------------------------------
The fundamental problem is that Commons Compress does decompression via
CompressorInputStream’s read() methods, which are a pull-model interface, while
the LZMA SDK (in the public domain) does it with Decoder.code(), a method that
takes a compressed input stream and an output stream to decompress to, then
reads, decompresses, and writes, only returning when the entire file is
decompressed. There is no clean way to convert this to a pull-model
CompressorInputStream: either you have to pull in one thread while pushing from
another, or push everything into a byte array (which needs O(n) memory!!) and
then pull from it through a ByteArrayInputStream afterwards. Both are really
ugly solutions: a thread per stream is heavy, and creating new threads is not
allowed in some environments (e.g. unsigned Applets and Java EE servers), while
trying to allocate O(n) memory can hit an OutOfMemoryError that affects the
entire JVM.
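To make the two workarounds concrete, here is a minimal sketch of both. It does
not use the LZMA SDK itself: PushDecoder is a hypothetical stand-in for a
blocking method shaped like Decoder.code(), and PushToPull, viaThread and
viaBuffer are names made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

/** Hypothetical stand-in for a push-model decoder such as the LZMA SDK's Decoder.code(). */
interface PushDecoder {
    void code(InputStream compressed, OutputStream decompressed) throws IOException;
}

final class PushToPull {
    private PushToPull() {}

    /** Workaround 1: push from a helper thread, pull from the other end of a pipe. */
    static InputStream viaThread(PushDecoder decoder, InputStream compressed) throws IOException {
        PipedInputStream pullSide = new PipedInputStream(8192);
        PipedOutputStream pushSide = new PipedOutputStream(pullSide);
        Thread worker = new Thread(() -> {
            // Closing pushSide (even on failure) lets the reader see end-of-stream.
            try (OutputStream out = pushSide) {
                decoder.code(compressed, out);
            } catch (IOException e) {
                // Sketch only: a real wrapper must report this to the reading thread.
            }
        }, "push-decoder");
        worker.setDaemon(true);
        worker.start();
        return pullSide;
    }

    /** Workaround 2: decode everything up front, which holds the whole output in memory. */
    static InputStream viaBuffer(PushDecoder decoder, InputStream compressed) throws IOException {
        ByteArrayOutputStream everything = new ByteArrayOutputStream();
        decoder.code(compressed, everything); // returns only once all input is decoded
        return new ByteArrayInputStream(everything.toByteArray());
    }
}
```

Both methods hand back an ordinary InputStream, so callers cannot tell them
apart, but the first costs one thread per open stream and the second costs
memory proportional to the decompressed size.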
The Java LZMA attempts out there rate as follows:
* Maurel’s patch here uses O(n) memory: it decompresses the entire stream in
  the constructor and stores it in a ByteArrayInputStream, which is then
  copied from on each read().
* http://jponge.github.io/lzma-java/ is licensed ASLv2 and states how it
  solved the push/pull problem: “Although not a derivate work, the streaming
  api classes were inspired from the work of Christopher League. I reused his
  technique of fake streams and working threads to pass the data around
  between encoders/decoders and "normal" Java streams.” In other words, it
  pushes in one thread and pulls in another. Actual decompression in the other
  thread is still done with the LZMA SDK, which it just wraps into an
  InputStream subclass.
* http://contrapunctus.net/league/haques/lzmajio/ was done by Christopher
  League. It’s under “LGPL or the Common Public License” and has the same
  push-in-one-thread, pull-in-another story. It’s also just a wrapper around
  the LZMA SDK.
* http://tukaani.org/xz/java.html is in the public domain and is already used
  by Commons Compress to provide XZ compression support. It supports XZ and
  LZMA2 only and supports them well: a proper pull-model InputStream with no
  O(n) memory or background threads. LZMA2 is a different file format from
  LZMA, but then again LZMA2 uses LZMA internally. I’ll have to investigate
  in detail.
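For contrast, this is roughly what a "proper pull-model" decoder looks like:
it keeps its decoding state in fields between calls and reads only as much
compressed input as the next read() needs. The sketch below is a toy
run-length decoder over (count, value) byte pairs, chosen only because it is
short. It is not LZMA and is not taken from any of the libraries above, and
RunLengthInputStream is a made-up name.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Toy pull-model decoder for (count, value) byte pairs. Illustrative only; this is not LZMA. */
final class RunLengthInputStream extends InputStream {
    private final InputStream in;
    private int remaining; // bytes still to be produced from the current run
    private int value;     // byte value of the current run

    RunLengthInputStream(InputStream in) {
        this.in = in;
    }

    @Override
    public int read() throws IOException {
        while (remaining == 0) {          // run exhausted: pull just one more pair
            int count = in.read();
            if (count == -1) {
                return -1;                // clean end of the compressed input
            }
            int next = in.read();
            if (next == -1) {
                throw new EOFException("truncated run: count without value");
            }
            remaining = count;            // a zero-length run is skipped by the loop
            value = next;
        }
        remaining--;
        return value;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}
```

Memory use here is constant regardless of stream length and no extra thread
is involved. The open question for LZMA is whether the decoder's state can be
suspended and resumed between read() calls in the same way.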
> support for lzma files
> ----------------------
>
> Key: COMPRESS-111
> URL: https://issues.apache.org/jira/browse/COMPRESS-111
> Project: Commons Compress
> Issue Type: New Feature
> Components: Compressors
> Affects Versions: 1.0
> Reporter: maurel jean francois
> Attachments: compress-trunk-lzmaRev0.patch,
> compress-trunk-lzmaRev1.patch
>
>
> adding support for compressing and decompressing files with the LZMA
> algorithm (Lempel-Ziv-Markov chain algorithm)
> (see
> http://markmail.org/search/?q=list%3Aorg.apache.commons.users/#query:list%3Aorg.apache.commons.users%2F+page:1+mid:syn4uuvbzusevtko+state:results)