[
https://issues.apache.org/jira/browse/CASSANDRA-15379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091240#comment-17091240
]
Joey Lynch commented on CASSANDRA-15379:
----------------------------------------
Final commit, with some quick fixes to the docs to make them a little clearer.
Test runs are linked below.
||trunk||
|[063811c44|https://github.com/jolynch/cassandra/commit/063811c44f41996ee4903c92a95aa108e7ff7ad4]|
|[branch|https://github.com/apache/cassandra/compare/trunk...jolynch:CASSANDRA-15379-final]|
|[!https://circleci.com/gh/jolynch/cassandra/tree/CASSANDRA-15379-final.png?circle-token=1102a59698d04899ec971dd36e925928f7b521f5!|https://circleci.com/gh/jolynch/cassandra/tree/CASSANDRA-15379-final]|
All unit tests and in-jvm dtests passed. There were a few dtest flakes on java8
and java11 that I'm pretty sure are unrelated (a transient replication dtest
and two nodetool dtests):
* test_refresh_size_estimates_clears_invalid_entries - nodetool_test.TestNodetool
* test_optimized_primary_range_repair - transient_replication_test.TestTransientReplication
* test_repaired_tracking_with_mismatching_replicas - repair_tests.incremental_repair_test.TestIncRepair
All appear to be unrelated failures.
> Make it possible to flush with a different compression strategy than we
> compact with
> ------------------------------------------------------------------------------------
>
> Key: CASSANDRA-15379
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15379
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Compaction, Local/Config, Local/Memtable
> Reporter: Joey Lynch
> Assignee: Joey Lynch
> Priority: Normal
> Fix For: 4.0-alpha
>
> Attachments: 15379_backfill_drops_zstd_level10.png,
> 15379_backfill_duration_zstd_level10.png,
> 15379_backfill_queueing_zstd_level10.png, 15379_backfill_zstd_level10.png,
> 15379_baseline_flush_trace.png, 15379_candidate_flush_trace.png,
> 15379_concurrent_flushes_zstd_level10.png, 15379_coordinator_defaults.png,
> 15379_coordinator_zstd_defaults.png, 15379_coordinator_zstd_level10.png,
> 15379_flush_flamegraph_zstd_level10.png,
> 15379_message_drops_zstd_level10.png, 15379_replica_defaults.png,
> 15379_replica_zstd_defaults.png, 15379_request_queueing_zstd_level10.png,
> 15379_system_defaults.png, 15379_system_zstd_defaults.png
>
>
> [~josnyder] and I have been testing out CASSANDRA-14482 (Zstd compression) on
> some of our most dense clusters and have been observing close to a 50%
> reduction in footprint with Zstd on some of our workloads! Unfortunately, we
> have been running into an issue where a flush can take so long (Zstd is
> slower to compress than LZ4) that it blocks the next flush and causes
> instability.
> Internally we are working around this with a very simple patch which flushes
> SSTables with the default compression strategy (LZ4) regardless of the table
> params. This is a simple solution, but I think the ideal solution might be
> for the flush compression strategy to be configurable separately from the
> table compression strategy (while defaulting to the same thing). Instead of
> adding yet another compression option to the yaml (like hints and commitlog),
> I was thinking of just adding it to the table parameters and then adding a
> {{default_table_parameters}} yaml option like:
> {noformat}
> # Default table properties to apply on freshly created tables. The currently
> # supported defaults are:
> # * compression       : How are SSTables compressed in general (flush,
> #                       compaction, etc ...)
> # * flush_compression : How are SSTables compressed as they flush
> default_table_parameters:
>   compression:
>     class_name: 'LZ4Compressor'
>     parameters:
>       chunk_length_in_kb: 16
>   flush_compression:
>     class_name: 'LZ4Compressor'
>     parameters:
>       chunk_length_in_kb: 4
> {noformat}
> This would also have the nice effect of giving our configuration a path
> forward to user-specified defaults for table creation (so e.g. a user who
> wants a different default chunk_length_in_kb could set it there).
> So the proposed (~mandatory) scope is:
> * Flush with a faster compression strategy
> I'd like to implement the following at the same time:
> * Per table flush compression configuration
> * Ability to default the table flush and compaction compression in the yaml.
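
The fallback proposed in the quoted description (a per-table flush_compression
that defaults to the table's compression, with cluster-wide defaults taken from
the yaml) could be sketched roughly as follows. This is only an illustration in
Python, not the patch itself: the names resolve_table_params and
compressor_for, the dict layout, and the precedence order are all assumptions
made for the example.

```python
from copy import deepcopy

# Hypothetical stand-in for the parsed `default_table_parameters` yaml section
# shown in the issue description.
YAML_DEFAULTS = {
    "compression": {
        "class_name": "LZ4Compressor",
        "parameters": {"chunk_length_in_kb": 16},
    },
    "flush_compression": {
        "class_name": "LZ4Compressor",
        "parameters": {"chunk_length_in_kb": 4},
    },
}


def resolve_table_params(table_params, yaml_defaults=YAML_DEFAULTS):
    """Return the effective compression settings for one table.

    Precedence (an assumption of this sketch): explicit table params win over
    yaml defaults; if neither sets flush_compression, it falls back to the
    table's own compression.
    """
    effective = deepcopy(yaml_defaults)
    effective.update(deepcopy(table_params))
    if "flush_compression" not in effective:
        effective["flush_compression"] = deepcopy(effective["compression"])
    return effective


def compressor_for(effective, operation):
    """Pick the compressor class for a flush or for any other SSTable write."""
    key = "flush_compression" if operation == "flush" else "compression"
    return effective[key]["class_name"]


# A table that compacts with Zstd but does not set flush_compression itself.
zstd_table = {
    "compression": {
        "class_name": "ZstdCompressor",
        "parameters": {"compression_level": 10},
    }
}
effective = resolve_table_params(zstd_table)
print(compressor_for(effective, "flush"))       # flush path
print(compressor_for(effective, "compaction"))  # compaction path
```

With the yaml defaults above, the flush path resolves to LZ4 while compaction
keeps Zstd; with no yaml defaults at all, both resolve to the table's own
compressor.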