[ 
https://issues.apache.org/jira/browse/COMPRESS-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452455#comment-17452455
 ] 

Gary D. Gregory commented on COMPRESS-600:
------------------------------------------

Hi [~davoustp] 

Feel free to provide a PR on GitHub ;)

> Add capability to configure Deflater strategy in GzipCompressorOutputStream
> ---------------------------------------------------------------------------
>
>                 Key: COMPRESS-600
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-600
>             Project: Commons Compress
>          Issue Type: Improvement
>          Components: Compressors
>    Affects Versions: 1.21
>         Environment: Any JDK-based environment.
>            Reporter: Pascal Davoust
>            Priority: Major
>
> The {{GzipCompressorOutputStream}} uses a {{java.util.zip.Deflater}} to 
> perform the compression heavy lifting.
> However, the {{java.util.zip.Deflater}} class (which delegates to the 
> underlying native {{zlib}} library) lets callers specify a strategy that 
> controls which parts of the deflate algorithm are used, while keeping full 
> deflate format compatibility and requiring no change on the decoding side; see 
> [https://docs.oracle.com/javase/8/docs/api/java/util/zip/Deflater.html#setStrategy-int-]
> Adding the capability to control this strategy within the {{GzipParameters}} 
> would be a very welcome addition, as there is no way to subclass and extend 
> {{GzipCompressorOutputStream}} to do so.
> The rationale behind this request is related to compressing base64-heavy 
> content.
> It turns out that because base64 breaks byte alignment, the LZ77 part of 
> the deflate algorithm runs in sub-optimal conditions (read: it is defeated 
> most of the time), consuming CPU cycles for almost no gain.
> Skipping the LZ77 part of the deflate algorithm and using Huffman coding only 
> does the job pretty well (at least in our case): compression takes 3x to 5x 
> less time (= CPU cycles) and saves 26% of the initial data size instead of 
> 27% with default settings (a minimal drop in compression ratio in exchange 
> for a very significant CPU usage win).
> Irrespective of our own use case and measurements, this looks like a very 
> slick addition to this utility class to expose an already proven and 
> available feature.
> I'm happy to provide a PR for you guys to review, just let me know.
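
For readers who want to try the effect described in the issue, here is a minimal sketch that uses only the JDK's {{java.util.zip.Deflater}} directly, not {{GzipCompressorOutputStream}} or {{GzipParameters}}. The class name {{DeflaterStrategyDemo}}, its helper methods, and the random base64 sample are all hypothetical and chosen for illustration; the sample is not the reporter's data, so sizes and timings will differ from the figures quoted above. The output is a zlib-wrapped stream rather than gzip, which should not matter for comparing strategies.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.Base64;
import java.util.Random;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo class: compares Deflater strategies on base64 text.
final class DeflaterStrategyDemo {

    // Compress input with the given strategy (e.g. Deflater.HUFFMAN_ONLY).
    static byte[] deflate(byte[] input, int strategy) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION);
        try {
            deflater.setStrategy(strategy);
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib resources
        }
    }

    // Decompress with a stock Inflater; the strategy needs no decoder support.
    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or unsupported stream");
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt stream", e);
        } finally {
            inflater.end();
        }
    }

    // Assumption: base64 of pseudo-random bytes stands in for "base64-heavy content".
    static byte[] sampleBase64(int rawBytes) {
        byte[] raw = new byte[rawBytes];
        new Random(42).nextBytes(raw);
        return Base64.getEncoder().encode(raw);
    }

    public static void main(String[] args) {
        byte[] input = sampleBase64(1 << 20);
        int[] strategies = {Deflater.DEFAULT_STRATEGY, Deflater.HUFFMAN_ONLY};
        for (int strategy : strategies) {
            long start = System.nanoTime();
            byte[] compressed = deflate(input, strategy);
            long micros = (System.nanoTime() - start) / 1_000;
            boolean ok = Arrays.equals(input, inflate(compressed));
            System.out.printf("strategy=%d: %d -> %d bytes in %d us, round trip ok=%b%n",
                    strategy, input.length, compressed.length, micros, ok);
        }
    }
}
```

A single un-warmed timing like this is only a rough indication; a proper comparison would use a benchmark harness and representative input.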



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
