[ https://issues.apache.org/jira/browse/COMPRESS-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452455#comment-17452455 ]
Gary D. Gregory commented on COMPRESS-600:
------------------------------------------
Hi [~davoustp]
Feel free to provide a PR on GitHub ;)
> Add capability to configure Deflater strategy in GzipCompressorOutputStream
> ---------------------------------------------------------------------------
>
> Key: COMPRESS-600
> URL: https://issues.apache.org/jira/browse/COMPRESS-600
> Project: Commons Compress
> Issue Type: Improvement
> Components: Compressors
> Affects Versions: 1.21
> Environment: Any JDK-based environment.
> Reporter: Pascal Davoust
> Priority: Major
>
> The {{GzipCompressorOutputStream}} uses a {{java.util.zip.Deflater}} to
> perform the compression heavy lifting.
> However, the {{java.util.zip.Deflater}} class (which delegates to the
> underlying native {{zlib}} library) allows specifying a strategy that
> controls which parts of the deflate algorithm are used. The output remains
> fully compatible with the deflate format, so no change is required on the
> decoding side. See
> [https://docs.oracle.com/javase/8/docs/api/java/util/zip/Deflater.html#setStrategy-int-]
> Adding the capability to control this strategy within the {{GzipParameters}}
> would be a very welcome addition, as there is no way to sub-class and extend
> {{GzipCompressorOutputStream}} to do so.
> The rationale behind this request is related to compressing base64-heavy
> content.
> It turns out that because base64 breaks byte alignment, the LZ77 part of
> the deflate algorithm runs in sub-optimal conditions (read: it is defeated
> most of the time), consuming CPU cycles for almost no gain.
> Skipping the LZ77 part of the deflate algorithm and using Huffman coding only
> does the job pretty well (at least in our case): compression takes 3x to 5x
> less time (= CPU cycles) and saves 26% of the initial data size instead of
> 27% with default settings (a minimal drop in compression ratio in exchange
> for a significant reduction in CPU usage).
> Irrespective of our own use case and measurements, this looks like a very
> slick addition to this utility class to expose an already proven and
> available feature.
> I'm happy to provide a PR for you guys to review, just let me know.
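For illustration only, here is a minimal JDK-only sketch of the effect described in the issue. It does not use Commons Compress and is not the API proposed for {{GzipParameters}}; the class name GzipStrategySketch and the helpers gzip/gunzip are hypothetical. It relies on java.util.zip.GZIPOutputStream exposing its Deflater through the protected field 'def', and it checks nothing about timing; it only shows that a Huffman-only stream is still readable by a standard gzip decoder.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Base64;
import java.util.Random;
import java.util.zip.Deflater;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration class, not part of Commons Compress.
class GzipStrategySketch {

    /** Gzips data with the given Deflater strategy (e.g. Deflater.HUFFMAN_ONLY). */
    static byte[] gzip(byte[] data, int strategy) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // The anonymous subclass sets the strategy on the inherited protected
        // Deflater 'def' before any payload byte is written.
        try (OutputStream out = new GZIPOutputStream(sink) {
            { def.setStrategy(strategy); }
        }) {
            out.write(data);
        }
        return sink.toByteArray();
    }

    /** Decompresses with the stock GZIPInputStream; no strategy is needed here. */
    static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                sink.write(buf, 0, n);
            }
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Assumed test input: base64 text of pseudo-random bytes (fixed seed).
        byte[] raw = new byte[1 << 20];
        new Random(42).nextBytes(raw);
        byte[] base64 = Base64.getEncoder().encode(raw);

        byte[] dflt = gzip(base64, Deflater.DEFAULT_STRATEGY);
        byte[] huff = gzip(base64, Deflater.HUFFMAN_ONLY);

        System.out.println("input bytes:        " + base64.length);
        System.out.println("default strategy:   " + dflt.length);
        System.out.println("huffman-only:       " + huff.length);
        System.out.println("huffman round trip: "
                + java.util.Arrays.equals(base64, gunzip(huff)));
    }
}
```

The sizes printed depend on the input, so they should not be read as a reproduction of the 26% vs. 27% figures quoted above; measuring the CPU difference would need a proper benchmark on representative data.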
--
This message was sent by Atlassian Jira
(v8.20.1#820001)