Re: [PR] CASSANDRA-21178 add created_at column to system_distributed.compression_dictionaries [cassandra]

via GitHub Tue, 17 Feb 2026 11:08:27 -0800


yifan-c commented on code in PR #4622:
URL: https://github.com/apache/cassandra/pull/4622#discussion_r2818627638



##########
src/java/org/apache/cassandra/db/compression/CompressionDictionaryDetailsTabularData.java:
##########
@@ -278,15 +289,11 @@ private void validate()
             if (table == null)
                 throw new IllegalArgumentException("Table not specified.");
             if (tableId == null)
-                throw new IllegalArgumentException("Table id not specified");
+                throw new IllegalArgumentException("Table id not specified.");
             if (dictId <= 0)
                 throw new IllegalArgumentException("Provided dictionary id must be positive but it is '" + dictId + "'.");
             if (dict == null || dict.length == 0)
                 throw new IllegalArgumentException("Provided dictionary byte array is null or empty.");
-            if (dict.length > FileUtils.ONE_MIB)

Review Comment:
   Thanks for the background! It helps me understand.
   
   Dictionaries are attached to every SSTable, and the size limit was added 
with that in mind. Dictionaries are typically 64~100 KiB. That said, the 
underlying zstd trainer does allow training larger dictionaries. The question 
is: do we want to train dictionaries larger than 1 MiB? The added dictionary 
size might outweigh the compression gains (64 KiB vs. 1 MiB dictionaries).
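
To make the trade-off concrete, here is a rough back-of-the-envelope sketch. It assumes the dictionary is stored once per SSTable and that using it saves a fixed percentage of the SSTable's size. The class, the method names, and the 5% saving are hypothetical and only for illustration; they are not part of Cassandra and not measured numbers.

```java
// Hypothetical sketch: class, method names and the 5% figure are illustrative
// assumptions, not Cassandra code and not measured results.
final class DictionaryOverheadSketch
{
    static final long KIB = 1024L;
    static final long MIB = 1024L * KIB;

    private DictionaryOverheadSketch() {}

    /**
     * Smallest SSTable size (in bytes) at which the bytes saved by the
     * dictionary are at least as large as the dictionary itself.
     * savingPercent is the assumed size reduction, as a whole percent (1..100).
     */
    static long breakEvenBytes(long dictBytes, int savingPercent)
    {
        if (dictBytes <= 0)
            throw new IllegalArgumentException("dictBytes must be positive");
        if (savingPercent < 1 || savingPercent > 100)
            throw new IllegalArgumentException("savingPercent must be in 1..100");
        // ceil(dictBytes * 100 / savingPercent) using integer arithmetic
        return (dictBytes * 100 + savingPercent - 1) / savingPercent;
    }

    /**
     * Net bytes saved for one SSTable: assumed compression gain minus the
     * attached dictionary. A negative result means the dictionary costs more
     * than it saves.
     */
    static long netSavingBytes(long sstableBytes, int savingPercent, long dictBytes)
    {
        return sstableBytes * savingPercent / 100 - dictBytes;
    }

    public static void main(String[] args)
    {
        int assumedSavingPercent = 5; // hypothetical gain from using a dictionary

        // 64 KiB dictionary pays for itself from 1.25 MiB of SSTable data.
        System.out.println(breakEvenBytes(64 * KIB, assumedSavingPercent)); // prints 1310720
        // 1 MiB dictionary needs 20 MiB of SSTable data to break even.
        System.out.println(breakEvenBytes(MIB, assumedSavingPercent));      // prints 20971520
    }
}
```

Under these assumptions a 1 MiB dictionary only pays off for SSTables 16 times larger than a 64 KiB one would need, unless the larger dictionary also compresses noticeably better. Small SSTables (for example, freshly flushed ones) are where the overhead would be felt most.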



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

