jyothsnakonisa commented on code in PR #4523:
URL: https://github.com/apache/cassandra/pull/4523#discussion_r2636117041
##########
doc/modules/cassandra/pages/managing/operating/compression.adoc:
##########
@@ -298,12 +298,6 @@ next access.
=== Training Configuration
-* `compression_dictionary_training_max_dictionary_size` (default: `65536`):
-Maximum size of trained dictionaries in bytes. Larger dictionaries can
-capture more patterns but increase memory overhead.
-* `compression_dictionary_training_max_total_sample_size` (default:
-`10485760`): Maximum total size of sample data to collect for training,
-approximately 10MB.
* `compression_dictionary_training_auto_train_enabled` (default: `false`):
Enable automatic background dictionary training. When enabled, Cassandra
samples writes and trains dictionaries automatically.
Review Comment:
isn't sampling rate supposed to be a decimal, Please change the
documentation accordingly everywhere
##########
.circleci/config.yml:
##########
@@ -24,7 +24,7 @@ jobs:
resource_class: medium
working_directory: ~/
shell: /bin/bash -eo pipefail -l
- parallelism: 4
+ parallelism: 25
Review Comment:
Did you checkin these changes in the file by accident?
##########
src/java/org/apache/cassandra/db/compression/ZstdDictionaryTrainer.java:
##########
@@ -290,15 +297,16 @@ public TrainingState getTrainingState()
}
@Override
- public boolean start(boolean manualTraining)
+ public boolean start(boolean manualTraining,
CompressionDictionaryTrainingConfig trainingConfig)
{
if (closed || !(manualTraining || shouldAutoStartTraining()))
return false;
try
{
// reset on starting; a new zstdTrainer instance is created during
reset
- reset();
+ reset(trainingConfig);
+ config = trainingConfig;
Review Comment:
Should config be volatile? reset(...) is setting the `config` to be null.
There could be other thread calling `isReady()` method which checks config to
be not null and there could be a race condition.
##########
src/java/org/apache/cassandra/db/compression/ZstdDictionaryTrainer.java:
##########
@@ -68,17 +68,23 @@ public class ZstdDictionaryTrainer implements
ICompressionDictionaryTrainer
private volatile TrainingStatus currentTrainingStatus;
private volatile String failureMessage;
- public ZstdDictionaryTrainer(String keyspaceName, String tableName,
- CompressionDictionaryTrainingConfig config,
- int compressionLevel)
+ public ZstdDictionaryTrainer(String keyspaceName, String tableName, int
compressionLevel)
+ {
+ this(keyspaceName,
+ tableName,
+ compressionLevel,
+ Math.round(1 /
DatabaseDescriptor.getCompressionDictionaryTrainingSamplingRate()));
+ }
+
+ @VisibleForTesting
+ public ZstdDictionaryTrainer(String keyspaceName, String tableName, int
compressionLevel, int samplingRate)
Review Comment:
You are passing `1/samplingRate` here, should the name of the variable be
percentage? or any other appropriate name
```suggestion
public ZstdDictionaryTrainer(String keyspaceName, String tableName, int
compressionLevel, int samplingPercentage)
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]