smiklosovic commented on code in PR #4399: URL: https://github.com/apache/cassandra/pull/4399#discussion_r2410519082
########## src/java/org/apache/cassandra/db/compression/CompressionDictionary.java: ########## @@ -0,0 +1,210 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.cassandra.db.compression; + +import java.io.DataInput; +import java.io.DataOutput; +import java.io.EOFException; +import java.io.IOException; +import java.util.Objects; +import javax.annotation.Nullable; + +import com.google.common.hash.Hasher; +import com.google.common.hash.Hashing; + +import org.apache.cassandra.cql3.UntypedResultSet; + +public interface CompressionDictionary extends AutoCloseable +{ + /** + * Get the dictionary id + * + * @return dictionary id + */ + DictId identifier(); + + /** + * Get the raw bytes of the compression dictionary + * + * @return raw compression dictionary + */ + byte[] rawDictionary(); + + /** + * Get the kind of the compression algorithm + * + * @return compression algorithm kind + */ + default Kind kind() + { + return identifier().kind; + } + + /** + * Write compression dictionary to file + * + * @param out file output stream + * @throws IOException on any I/O exception when writing to the file + */ + default void serialize(DataOutput out) throws IOException + { + DictId dictId = identifier(); 
+ int ordinal = dictId.kind.ordinal(); + out.writeByte(ordinal); + out.writeLong(dictId.id); + byte[] dict = rawDictionary(); + out.writeInt(dict.length); + out.write(dict); + int checksum = calculateChecksum((byte) ordinal, dictId.id, dict); + out.writeInt(checksum); + } + + /** + * A factory method to create concrete CompressionDictionary from the file content + * + * @param input file input stream + * @param manager compression dictionary manager that caches the dictionaries + * @return compression dictionary; otherwise, null if there is no dictionary + * @throws IOException on any I/O exception when reading from the file + */ + @Nullable + static CompressionDictionary deserialize(DataInput input, @Nullable CompressionDictionaryManager manager) throws IOException + { + int kindOrdinal; + try + { + kindOrdinal = input.readByte(); + } + catch (EOFException eof) + { + // no dictionary + return null; + } + + if (kindOrdinal < 0 || kindOrdinal >= Kind.values().length) + { + throw new IOException("Invalid compression dictionary kind: " + kindOrdinal); + } + Kind kind = Kind.values()[kindOrdinal]; + long id = input.readLong(); + DictId dictId = new DictId(kind, id); + + if (manager != null) + { + CompressionDictionary dictionary = manager.get(dictId); + if (dictionary != null) + { + return dictionary; + } + } + + int length = input.readInt(); + byte[] dict = new byte[length]; + input.readFully(dict); + int checksum = input.readInt(); + int calculatedChecksum = calculateChecksum((byte) kindOrdinal, id, dict); + if (checksum != calculatedChecksum) + { + throw new IOException("Compression dictionary checksum does not match"); + } + + CompressionDictionary dictionary = null; Review Comment: This is also too complicated I think. 
Similarly as below: int length = input.readInt(); byte[] dict = new byte[length]; input.readFully(dict); int checksum = input.readInt(); int calculatedChecksum = calculateChecksum((byte) kindOrdinal, id, dict); if (checksum != calculatedChecksum) throw new IOException("Compression dictionary checksum does not match"); // here we are guaranteed to have a dictionary, no? CompressionDictionary dictionary = kind.getDictionary(dictId, dict); // update the dictionary manager if it exists if (manager != null) manager.add(dictionary); return dictionary; ########## conf/cassandra.yaml: ########## @@ -617,6 +617,54 @@ counter_cache_save_period: 7200s # Disabled by default, meaning all keys are going to be saved # counter_cache_keys_to_save: 100 +# Dictionary compression settings for ZSTD dictionary-based compression +# These settings control the automatic training and caching of compression dictionaries +# for tables that use ZSTD dictionary compression. + +# How often to refresh compression dictionaries across the cluster. +# During refresh, nodes will check for newer dictionary versions and update their caches. +# Min unit: s +compression_dictionary_refresh_interval: 3600s + +# Initial delay before starting the first dictionary refresh cycle after node startup. +# This prevents all nodes from refreshing simultaneously when the cluster starts. +# Min unit: s +compression_dictionary_refresh_initial_delay: 10s + +# Maximum number of compression dictionaries to cache per table. +# Each table using dictionary compression can have multiple dictionaries cached +# (current version plus recently used versions for reading older SSTables). +compression_dictionary_cache_size: 10 + +# How long to keep compression dictionaries in the cache before they expire. +# Expired dictionaries will be removed from memory but can be reloaded if needed. 
+# Min unit: s +compression_dictionary_cache_expire: 3600s + +# Dictionary training configuration (advanced settings) +# These settings control how compression dictionaries are trained from sample data. + +# Maximum size of a trained compression dictionary in bytes. +# Larger dictionaries may provide better compression but use more memory. +# Min unit: B +compression_dictionary_training_max_dictionary_size: 65536 + +# Maximum total size of sample data to collect for dictionary training. +# More sample data generally produces better dictionaries but takes longer to train. +# The recommended sample size is 100x the dictionary size. +# Min unit: B +compression_dictionary_training_max_total_sample_size: 10485760 + +# Enable automatic dictionary training based on sampling of write operations. +# When enabled, the system will automatically collect samples and train new dictionaries. +# Manual training via nodetool is always available regardless of this setting. +compression_dictionary_training_auto_train_enabled: false + +# Sampling rate for automatic dictionary training (1-10000). +# Value of 100 means 1% of writes are sampled. Lower values reduce overhead but may +# result in less representative sample data for dictionary training. +compression_dictionary_training_sampling_rate: 100 Review Comment: Does it really matter to be so granular as to express 0.5% as "50"? The "problem" with this is that a reader has to manually re-compute the number in their head. For example, 2.35% would be 235 and 0.8% would be 80. It would be far more comfortable to express these figures as percentages directly. There is `data_disk_usage_percentage_warn_threshold`, which conceptually operates with percentages. We can convert the percentage to whatever data type we need internally afterwards. So where the value would be `1` instead of `100`, we would just multiply by 100 internally. -- This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]

