[
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17837237#comment-17837237
]
Alex Petrov commented on CASSANDRA-12937:
-----------------------------------------
bq. Yes, I think this is the most ideal solution. If somebody wants to
experiment with a new compressor and similar, there would need to be some knob
to override it, like some JMX method or similar, and all risks attached to that
(divergence of the configuration caused by operator's negligence) would be on
him.
Some things are actually quite useful for gradual rollout. For example,
compression. You probably do not want to rewrite your sstables across the
entire cluster. Similar arguments may be made for canary deployments of
memtable settings and other things.
I agree that it is fine if these parameters are completely transient (i.e. if
you have set one to something that diverges from the clusterwide value, it will
be reverted after a node bounce). In such a case, they will probably not
go through TCM and will be purely node-local.
Examples of things that are now configurable via yaml but will be configurable
via TCM if we go ahead with this proposal: partitioner, memtable configuration,
default compaction strategy, compression. As Sam has mentioned, "which specific
value makes it into schema just depends on which instance acts as the
coordinator for a given DDL statement".
bq. but I remain unconvinced that just picking the defaults from whatever node
happens to be coordinating is the right way to go.
I talked with Sam briefly to make sure I understand it correctly
before trying to describe it. Since this was first worded in a way that
suggested a problem but did not directly propose a solution (possibly described
elsewhere), I will attempt to do this. Sam has already described a part of the
solution as:
bq. That should probably be in a parallel local datastructure though, not in
the node's local log table as we don't want to ship those local defaults to
peers when providing log catchup (because they should use their own defaults).
The part that was missing for me was where the values would come from and
what the precedence would be. When executing a {{CREATE}} statement on some node
_without_ specifying, say, compression, the statement will be created and
executed without the value for compression set at all. Every node will pick the
value from the ephemeral parallel structure Sam described (which is also
settable via JMX and the like, as Stefan mentioned). If no value is present in
this structure, it will be picked from yaml (alternatively, we could just populate
this structure from yaml, too, but I consider these things roughly equivalent).
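To make the precedence concrete, here is a minimal sketch of the lookup each node would perform. It is only an illustration of the order described above (value in the statement, then the node-local ephemeral structure, then yaml). The class {{TableDefaultResolver}}, its methods, and the string-keyed maps are hypothetical names invented for this example and do not exist in Cassandra.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch; none of these names exist in Cassandra. */
final class TableDefaultResolver {
    // Node-local, ephemeral overrides (e.g. set via JMX). They are lost on
    // restart and are never shipped to peers during log catchup.
    private final Map<String, String> localOverrides = new ConcurrentHashMap<>();

    // Values read from this node's yaml at startup (assumed immutable afterwards).
    private final Map<String, String> yamlDefaults;

    TableDefaultResolver(Map<String, String> yamlDefaults) {
        this.yamlDefaults = new HashMap<>(yamlDefaults);
    }

    void setLocalOverride(String option, String value) {
        localOverrides.put(option, value);
    }

    void clearLocalOverride(String option) {
        localOverrides.remove(option);
    }

    /**
     * Resolves an option on this node. Precedence:
     * 1. the value given explicitly in the statement, if any;
     * 2. the node-local ephemeral override, if any;
     * 3. the yaml default, if any.
     */
    Optional<String> resolve(String option, Optional<String> fromStatement) {
        if (fromStatement.isPresent()) {
            return fromStatement;
        }
        String local = localOverrides.get(option);
        if (local != null) {
            return Optional.of(local);
        }
        return Optional.ofNullable(yamlDefaults.get(option));
    }
}
```

Because the statement itself carries no value when the option is omitted, two nodes with different overrides or yaml files may resolve the same table to different effective settings, which is the intended behaviour for a gradual rollout.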
> Default setting (yaml) for SSTable compression
> ----------------------------------------------
>
> Key: CASSANDRA-12937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12937
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Config
> Reporter: Michael Semb Wever
> Assignee: Stefan Miklosovic
> Priority: Low
> Labels: AdventCalendar2021
> Fix For: 5.x
>
> Time Spent: 8h
> Remaining Estimate: 0h
>
> In many situations the choice of compression for sstables is more relevant to
> the disks attached than to the schema and data.
> This issue is to add to cassandra.yaml a default value for sstable
> compression that new tables will inherit (instead of the defaults found in
> {{CompressionParams.DEFAULT}}).
> Examples where this can be relevant are filesystems that do on-the-fly
> compression (btrfs, zfs) or specific disk configurations or even specific C*
> versions (see CASSANDRA-10995 ).
> +Additional information for newcomers+
> Some new fields need to be added to {{cassandra.yaml}} to allow specifying
> the field required for defining the default compression parameters. In
> {{DatabaseDescriptor}} a new {{CompressionParams}} field should be added for
> the default compression. This field should be initialized in
> {{DatabaseDescriptor.applySimpleConfig()}}. At the different places where
> {{CompressionParams.DEFAULT}} was used the code should call
> {{DatabaseDescriptor#getDefaultCompressionParams}} that should return some
> copy of configured {{CompressionParams}}.
> Some unit test using {{OverrideConfigurationLoader}} should be used to test
> that the table schema uses the new default when a new table is created (see
> CreateTest for some example).