[
https://issues.apache.org/jira/browse/DAFFODIL-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Guichard Desrosiers updated DAFFODIL-3082:
------------------------------------------
Description:
Currently, the gzip layer always uses java.util.zip.GZIPOutputStream at the
JDK's default compression level (which zlib resolves to level 6) when
unparsing. There is no way for a schema or user to choose a different level.
Make the gzip compression level configurable via a new DFDL layer parameter
variable, `gz:compressionLevel`. Valid values:
  0    no compression
  1-9  increasing compression, 1 = fastest and 9 = best compression
  -1   sentinel meaning "use the implementation default" (level 6)
The variable should default to -1 and be declared external="true" so users
can set it without having to modify the schema. It can also be set in the
schema via `dfdl:newVariableInstance` and `dfdl:setVariable`.
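
As a rough illustration of what honoring the level could look like on the
Java side: java.util.zip.GZIPOutputStream has no constructor that takes a
compression level, but its inherited protected Deflater field (`def`) can be
adjusted from a subclass. The sketch below assumes that approach. The names
GzipLevelSketch, LevelGzipOutputStream, and checkLevel are hypothetical and
are not taken from the Daffodil code base, and the actual layer implementation
may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch, not the Daffodil implementation.
public class GzipLevelSketch {

    /** GZIPOutputStream whose deflater level is chosen by the caller. */
    static final class LevelGzipOutputStream extends GZIPOutputStream {
        LevelGzipOutputStream(OutputStream out, int level) throws IOException {
            super(out);
            // `def` is the protected Deflater inherited from DeflaterOutputStream.
            def.setLevel(level);
        }
    }

    /** Accepts -1 (implementation default) and 0 through 9, rejects anything else. */
    static int checkLevel(int level) {
        if (level < Deflater.DEFAULT_COMPRESSION || level > Deflater.BEST_COMPRESSION) {
            throw new IllegalArgumentException(
                "compressionLevel must be -1 or 0..9, got " + level);
        }
        return level;
    }

    /** Compresses the input as gzip at the requested level. */
    static byte[] gzip(byte[] input, int level) {
        checkLevel(level);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new LevelGzipOutputStream(sink, level)) {
            gz.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    /** Decompresses gzip data, used here only to check the round trip. */
    static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "hello gzip layer ".repeat(500).getBytes(StandardCharsets.UTF_8);
        for (int level : new int[] {-1, 0, 1, 9}) {
            byte[] out = gzip(data, level);
            System.out.println("level " + level + ": " + out.length + " bytes");
        }
    }
}
```

Validating the value before constructing the stream keeps an out-of-range
setting from surfacing as a failure partway through unparsing.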
was:
Current gzip layer tests use the default compression level (level 6), but for
the same input that level sometimes yields different output between classic
zlib and zlib-ng. Setting the default level to 9 instead yields deterministic
output, but level 9 is slow and may not reflect typical DFDL use. Make the gzip
compression level configurable via layer parameters. The default on the Java
side should be -1, which zlib internally translates to level 6, so the JDK/zlib
default is kept while tests can set a deterministic level when needed.
Some JDK distributions (e.g., Temurin) bundle their own zlib implementation
with {{java.util.zip}}, using the classic stock zlib, while others link to
the system zlib, which may be stock zlib or zlib-ng depending on the OS. With
default compression, stock zlib and zlib-ng produce byte-different gzip output
for the same input. A configurable level (e.g., 0 or 9) lets users choose a
compression level that yields more deterministic output across different
JDK/zlib combinations when necessary.
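
One way to check whether two JDK/zlib combinations agree is to print a digest
of the gzip output at each level on both and compare the results. The sketch
below is only an illustration of that idea. The class name GzipFingerprint and
its helpers are hypothetical, and no particular digest values are implied,
since the output depends on the zlib build in use.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for comparing gzip output across JDK/zlib builds.
public class GzipFingerprint {

    /** Compresses the input as gzip at the given level (-1 or 0..9). */
    static byte[] gzip(byte[] input, int level) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // The instance initializer adjusts the inherited protected Deflater.
        try (GZIPOutputStream gz = new GZIPOutputStream(sink) {{ def.setLevel(level); }}) {
            gz.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    /** Returns the SHA-256 digest of the gzip output as lowercase hex. */
    static String fingerprint(byte[] input, int level) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : sha.digest(gzip(input, level))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] sample = "The quick brown fox jumps over the lazy dog. "
            .repeat(200).getBytes(StandardCharsets.UTF_8);
        // Run on each JDK/zlib combination and compare the printed lines.
        for (int level : new int[] {-1, 0, 9}) {
            System.out.println("level " + level + ": " + fingerprint(sample, level));
        }
    }
}
```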
> Make GZIP layer compression level configurable
> ----------------------------------------------
>
> Key: DAFFODIL-3082
> URL: https://issues.apache.org/jira/browse/DAFFODIL-3082
> Project: Daffodil
> Issue Type: New Feature
> Components: Back End
> Reporter: Guichard Desrosiers
> Assignee: Guichard Desrosiers
> Priority: Major