[
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17713524#comment-17713524
]
Claude Warren commented on CASSANDRA-12937:
-------------------------------------------
hints_compression and commitlog_compression use the standard ParameterizedClass.
CompressionParams has three fields that it extracts or derives from the
parameters in the ParameterizedClass:
{code:java}
private final int chunkLength;
private final int maxCompressedLength; // In content we store max length to avoid rounding errors causing compress/decompress mismatch.
private final double minCompressRatio; // In configuration we store min ratio, the input parameter.
{code}
The ParameterizedClass constructor that accepts the Map<String,String> of
options expects a key of "chunk_length_in_kb" or "chunk_length_kb" as well as
a "min_compress_ratio".
The change I made does not alter the hints_compression or
commitlog_compression options.
The yaml file has an additional set of requirements:
* The chunkLength (yaml: chunk_length) should be specified with the
DataStorageSpec suffix (e.g. KiB).
* The maxCompressedLength should be accepted as a parameter.
* The maxCompressedLength (yaml: max_compressed_length) should be specified
with the DataStorageSpec suffix (e.g. KiB).
* maxCompressedLength and minCompressRatio are related to each other via
chunk_length; so only one can be specified.
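To illustrate why only one of the two values needs to be supplied, here is a minimal sketch of how each can be derived from the other once the chunk length is known. The class and method names are hypothetical and are not taken from the patch; the clamping of a zero ratio to Integer.MAX_VALUE ("no limit") is an assumption of this sketch.

```java
// Hypothetical helper for illustration only; names are not from the patch.
final class CompressionLengthSketch
{
    /** Derives the maximum compressed length (bytes) from a chunk length (bytes) and a minimum ratio. */
    static int maxCompressedLength(int chunkLength, double minCompressRatio)
    {
        // Assumption: a ratio of 0 means "no limit". The division then yields
        // infinity, which is clamped to Integer.MAX_VALUE before rounding up.
        return (int) Math.ceil(Math.min(chunkLength / minCompressRatio, Integer.MAX_VALUE));
    }

    /** Derives the minimum compression ratio back from a chunk length and a maximum compressed length. */
    static double minCompressRatio(int chunkLength, int maxCompressedLength)
    {
        if (maxCompressedLength == Integer.MAX_VALUE)
            return 0.0; // "no limit" maps back to a ratio of 0
        return (double) chunkLength / maxCompressedLength;
    }
}
```

With a 16KiB chunk (16384 bytes), a ratio of 2.0 corresponds to a maximum compressed length of 8192 bytes, and vice versa, so accepting both in the same configuration would allow them to contradict each other.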
I could work chunkLength and maxCompressedLength into the class_name
parameters. However, I believe this would add two more reserved words, both
of which would need to be removed from the parameter list. This change would
affect every CompressionParams construction that uses the Map<String,String>
format.
I will make the change with the following rules for resolving conflicting
values:
* If both max_compressed_length and min_compress_ratio are specified, a
ConfigurationException will be thrown.
* If chunk_length is specified together with chunk_length_in_kb or
chunk_length_kb and the values are not equal, a ConfigurationException will
be thrown.
* If chunk_length or max_compressed_length is specified without a
DataStorageSpec suffix, a ConfigurationException will be thrown.
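The three rules above could be sketched roughly as follows. This is not the patch itself: the class name, the stand-in ConfigurationException, and the simplified size pattern (an integer followed by B, KiB, MiB or GiB) are assumptions made so that the example compiles on its own, and the legacy keys are assumed to hold a plain integer number of KiB.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the collision rules; not the actual patch.
final class CompressionOptionChecksSketch
{
    /** Stand-in for Cassandra's ConfigurationException so the sketch is self-contained. */
    static final class ConfigurationException extends RuntimeException
    {
        ConfigurationException(String message) { super(message); }
    }

    // Assumed size syntax: a non-negative integer followed by B, KiB, MiB or GiB.
    private static final Pattern SIZE = Pattern.compile("(\\d+)(B|KiB|MiB|GiB)");

    /** Parses a suffixed size into bytes and rejects values that lack a suffix. */
    static long parseBytes(String key, String value)
    {
        Matcher m = SIZE.matcher(value.trim());
        if (!m.matches())
            throw new ConfigurationException(key + " must use a DataStorageSpec suffix (e.g. 16KiB): " + value);
        long quantity = Long.parseLong(m.group(1));
        switch (m.group(2))
        {
            case "KiB": return quantity << 10;
            case "MiB": return quantity << 20;
            case "GiB": return quantity << 30;
            default:    return quantity; // "B"
        }
    }

    /** Applies the three collision rules to a raw option map. */
    static void validate(Map<String, String> options)
    {
        // Rule 1: the two values are derived from each other, so only one may be given.
        if (options.containsKey("max_compressed_length") && options.containsKey("min_compress_ratio"))
            throw new ConfigurationException("Only one of max_compressed_length and min_compress_ratio may be specified");

        // Rule 3: max_compressed_length must carry a suffix.
        if (options.containsKey("max_compressed_length"))
            parseBytes("max_compressed_length", options.get("max_compressed_length"));

        if (options.containsKey("chunk_length"))
        {
            // Rule 3: chunk_length must carry a suffix.
            long chunkBytes = parseBytes("chunk_length", options.get("chunk_length"));
            // Rule 2: a legacy key may accompany chunk_length only if it states the same size.
            for (String legacyKey : new String[]{ "chunk_length_in_kb", "chunk_length_kb" })
            {
                String legacy = options.get(legacyKey);
                if (legacy != null && Long.parseLong(legacy.trim()) * 1024L != chunkBytes)
                    throw new ConfigurationException("chunk_length and " + legacyKey + " disagree");
            }
        }
    }
}
```

In this sketch, a map containing chunk_length=16KiB and chunk_length_in_kb=16 passes, while chunk_length=16 (no suffix) or chunk_length=16KiB with chunk_length_in_kb=32 is rejected.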
I will also ensure that the short names: lz4, none, noop, snappy, deflate, and
zstd will work as class names and use the defaults specified by the
CompressionParams methods of the same names.
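One way the short-name handling could look is sketched below. The class and method names are hypothetical, the mapping to compressor class names is an assumption of this sketch, and "none" is assumed to mean that compression is disabled; any other value is passed through unchanged as a class name.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of short-name resolution; not the actual patch.
final class ShortNameSketch
{
    // Assumed mapping from short names to compressor class names.
    private static final Map<String, String> SHORT_NAMES = Map.of(
        "lz4", "LZ4Compressor",
        "snappy", "SnappyCompressor",
        "deflate", "DeflateCompressor",
        "zstd", "ZstdCompressor",
        "noop", "NoopCompressor");

    /** Returns the compressor class name, or empty when compression is disabled ("none"). */
    static Optional<String> resolve(String className)
    {
        if ("none".equals(className))
            return Optional.empty();
        // Unknown values are treated as an explicit class name and returned as-is.
        return Optional.of(SHORT_NAMES.getOrDefault(className, className));
    }
}
```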
> Default setting (yaml) for SSTable compression
> ----------------------------------------------
>
> Key: CASSANDRA-12937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12937
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Config
> Reporter: Michael Semb Wever
> Assignee: Claude Warren
> Priority: Low
> Labels: AdventCalendar2021, lhf
> Fix For: 5.x
>
> Time Spent: 3h
> Remaining Estimate: 0h
>
> In many situations the choice of compression for sstables is more relevant to
> the disks attached than to the schema and data.
> This issue is to add to cassandra.yaml a default value for sstable
> compression that new tables will inherit (instead of the defaults found in
> {{CompressionParams.DEFAULT}}).
> Examples where this can be relevant are filesystems that do on-the-fly
> compression (btrfs, zfs) or specific disk configurations or even specific C*
> versions (see CASSANDRA-10995).
> +Additional information for newcomers+
> Some new fields need to be added to {{cassandra.yaml}} to allow specifying
> the field required for defining the default compression parameters. In
> {{DatabaseDescriptor}} a new {{CompressionParams}} field should be added for
> the default compression. This field should be initialized in
> {{DatabaseDescriptor.applySimpleConfig()}}. At the different places where
> {{CompressionParams.DEFAULT}} was used the code should call
> {{DatabaseDescriptor#getDefaultCompressionParams}} that should return some
> copy of configured {{CompressionParams}}.
> A unit test using {{OverrideConfigurationLoader}} should be added to verify
> that the table schema uses the new default when a new table is created (see
> CreateTest for an example).