[ 
https://issues.apache.org/jira/browse/CASSANDRA-21188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18060843#comment-18060843
 ] 

Stefan Miklosovic commented on CASSANDRA-21188:
-----------------------------------------------

If I understand this correctly, the problem is that compaction takes SSTables 
away from the training. Once they are compacted they disappear, so the training 
can no longer see or reference them. I am trying to come up with a solution. 
The obvious one would be to just disable compaction while training, but I am 
not completely sure what happens to SSTables that are already being compacted 
when we disable compaction and then train. Even with compaction disabled, I 
guess that an ongoing compaction would still run; it is just that no new 
compaction would be scheduled.

So maybe a better solution is to 1) disable compaction, 2) wait until all 
ongoing compactions have finished, and 3) only then run the training. Or we 
just say that if any compaction is happening we fail the training, which is not 
very user friendly. That would mean that a user would need to "guess" when 
there is no compaction, maybe repeatedly until they are lucky.
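
To make the disable / wait / train ordering concrete, here is a minimal, 
self-contained sketch. It is not Cassandra code: CompactionGate, Status and 
train are hypothetical names invented for illustration, and the gate merely 
models "no new compactions are scheduled, ongoing ones are allowed to finish". 
The sketch also always returns a terminal status, so a caller polling it would 
never be left at SAMPLING.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class TrainingGateSketch {
    // Hypothetical status enum; only loosely mirrors the states named in the ticket.
    enum Status { SAMPLING, COMPLETED, FAILED }

    /** Hypothetical stand-in for the compaction scheduler; not a Cassandra class. */
    static final class CompactionGate {
        private final ReentrantLock lock = new ReentrantLock();
        private final Condition idle = lock.newCondition();
        private boolean paused;   // true while new compactions must not start
        private int active;       // number of compactions currently running

        /** Called by a compaction before it starts; refused while paused. */
        boolean tryBegin() {
            lock.lock();
            try {
                if (paused) return false;
                active++;
                return true;
            } finally { lock.unlock(); }
        }

        /** Called by a compaction when it finishes. */
        void end() {
            lock.lock();
            try {
                if (--active == 0) idle.signalAll();
            } finally { lock.unlock(); }
        }

        /** Steps 1 and 2: stop new compactions, then wait for running ones to drain. */
        boolean pauseAndAwaitIdle(long timeout, TimeUnit unit) throws InterruptedException {
            long nanos = unit.toNanos(timeout);
            lock.lock();
            try {
                paused = true;
                while (active > 0) {
                    if (nanos <= 0) return false;      // timed out, still busy
                    nanos = idle.awaitNanos(nanos);
                }
                return true;
            } finally { lock.unlock(); }
        }

        void resume() {
            lock.lock();
            try { paused = false; } finally { lock.unlock(); }
        }
    }

    /** Step 3: run the sampling only once the gate is idle; always end in a terminal status. */
    static Status train(CompactionGate gate, Runnable sampling, long timeoutMillis) {
        try {
            if (!gate.pauseAndAwaitIdle(timeoutMillis, TimeUnit.MILLISECONDS))
                return Status.FAILED;                  // compactions did not drain in time
            sampling.run();
            return Status.COMPLETED;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Status.FAILED;
        } catch (RuntimeException e) {
            return Status.FAILED;                      // sampling itself failed
        } finally {
            gate.resume();                             // re-enable compaction in every case
        }
    }
}
```

The timeout is a design choice of the sketch: it bounds how long training 
waits for ongoing compactions, so the user gets a FAILED result instead of 
having to guess when the table is quiet.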

> Race between compaction and dictionary compression training. Status stuck at 
> SAMPLING. ExportImportListCompressionDictionaryTest hangs
> --------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-21188
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21188
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Feature/Compression
>            Reporter: Maxim Muzafarov
>            Priority: Normal
>             Fix For: 5.x
>
>
> There is a race between the compaction process and the start of dictionary 
> compression training:
>  # CompressionDictionaryManager: We collect all live sstables
>  # ICompressionDictionaryTrainer: start a new training
>  # currentTrainingStatus moved to SAMPLING
>  # All SSTables get compacted within concurrent compaction thread
>  # SSTableSamplingTask: in the constructor sstable.tryRef returns null
>  # We run this task on a thread pool and it exits in cancelManualTraining
>  # the currentTrainingStatus remains SAMPLING (should be *FAILED* or 
> {*}COMPLETED!{*})
> ExportImportListCompressionDictionaryTest hangs for 10 minutes (configured 
> constant) for no reason. 
> The logs:
> {code}
> INFO  [PerDiskMemtableFlushWriter_0:1] 2026-02-21T17:07:05,061 
> Flushing.java:157 - Writing 
> Memtable-table_testexportingspecificdictionary_strateg_18@1268950324(61.523KiB
>  serialized bytes, 1000 ops, 506.836KiB (0%) on-heap, 0B (0%) off-heap), 
> flushed range = [min(-9223372036854775808), max(9223372036854775807))
> INFO  [PerDiskMemtableFlushWriter_0:1] 2026-02-21T17:07:05,061 
> Flushing.java:197 - Completed flushing 
> /Users/maxim.muzafarov/IdeaProjects/cassandra/build/test/cassandra/data/cql_test_keyspace/table_testexportingspecificdictionary_strateg_18-1b255f4def2540a60000000000000056/pa-11-big-Data.db
>  (28.198KiB) for commitlog position 
> CommitLogPosition(segmentId=1771693567140, position=654098), time spent: 0 
> ms, bytes flushed: 28875 / (rate: 28.198KiB/s), partitions flushed: 1000 / 
> (rate: 1000/s), rows: 1000 / (rate: 1000/s), cpu time: 0 ms, heap allocated: 
> 220.711KiB
> INFO  [MemtableFlushWriter:1] 2026-02-21T17:07:05,084 LogTransaction.java:266 
> - Unfinished transaction log, deleting 
> /Users/maxim.muzafarov/IdeaProjects/cassandra/build/test/cassandra/data/cql_test_keyspace/table_testexportingspecificdictionary_strateg_18-1b255f4def2540a60000000000000056/pa_txn_flush_bf3ac330-0f47-11f1-88d2-574197b4b378.log
>  
> DEBUG [MemtableFlushWriter:1] 2026-02-21T17:07:05,087 
> ColumnFamilyStore.java:1416 - Flushed to 
> [BigTableReader:big(path='/Users/maxim.muzafarov/IdeaProjects/cassandra/build/test/cassandra/data/cql_test_keyspace/table_testexportingspecificdictionary_strateg_18-1b255f4def2540a60000000000000056/pa-11-big-Data.db')]
>  (1 sstables, 30.889KiB), biggest 30.889KiB, smallest 30.889KiB
> INFO  [main] 2026-02-21T17:07:05,091 ColumnFamilyStore.java:1088 - Enqueuing 
> flush of cql_test_keyspace.table_testexportingspecificdictionary_strateg_18, 
> Reason: UNIT_TESTS, Usage: 506.836KiB (0%) on-heap, 0B (0%) off-heap
> INFO  [PerDiskMemtableFlushWriter_0:2] 2026-02-21T17:07:05,092 
> Flushing.java:157 - Writing 
> Memtable-table_testexportingspecificdictionary_strateg_18@957877902(61.523KiB 
> serialized bytes, 1000 ops, 506.836KiB (0%) on-heap, 0B (0%) off-heap), 
> flushed range = [min(-9223372036854775808), max(9223372036854775807))
> INFO  [PerDiskMemtableFlushWriter_0:2] 2026-02-21T17:07:05,094 
> Flushing.java:197 - Completed flushing 
> /Users/maxim.muzafarov/IdeaProjects/cassandra/build/test/cassandra/data/cql_test_keyspace/table_testexportingspecificdictionary_strateg_18-1b255f4def2540a60000000000000056/pa-12-big-Data.db
>  (28.201KiB) for commitlog position 
> CommitLogPosition(segmentId=1771693567140, position=726098), time spent: 0 
> ms, bytes flushed: 28878 / (rate: 28.201KiB/s), partitions flushed: 1000 / 
> (rate: 1000/s), rows: 1000 / (rate: 1000/s), cpu time: 0 ms, heap allocated: 
> 220.711KiB
> INFO  [MemtableFlushWriter:2] 2026-02-21T17:07:05,112 LogTransaction.java:266 
> - Unfinished transaction log, deleting 
> /Users/maxim.muzafarov/IdeaProjects/cassandra/build/test/cassandra/data/cql_test_keyspace/table_testexportingspecificdictionary_strateg_18-1b255f4def2540a60000000000000056/pa_txn_flush_bf3fa530-0f47-11f1-88d2-574197b4b378.log
>  
> DEBUG [MemtableFlushWriter:2] 2026-02-21T17:07:05,116 
> ColumnFamilyStore.java:1416 - Flushed to 
> [BigTableReader:big(path='/Users/maxim.muzafarov/IdeaProjects/cassandra/build/test/cassandra/data/cql_test_keyspace/table_testexportingspecificdictionary_strateg_18-1b255f4def2540a60000000000000056/pa-12-big-Data.db')]
>  (1 sstables, 30.887KiB), biggest 30.887KiB, smallest 30.887KiB
> DEBUG [CompactionExecutor:2] 2026-02-21T17:07:05,117 Directories.java:554 - 
> FileStore /System/Volumes/Data (/dev/disk3s5) has 593792975872 bytes 
> available, checking if we can write 103847 bytes
> INFO  [CompactionExecutor:2] 2026-02-21T17:07:05,117 CompactionTask.java:229 
> - Compacting (bf4375c0-0f47-11f1-88d2-574197b4b378) 
> [/Users/maxim.muzafarov/IdeaProjects/cassandra/build/test/cassandra/data/cql_test_keyspace/table_testexportingspecificdictionary_strateg_18-1b255f4def2540a60000000000000056/pa-11-big-Data.db,
>  
> /Users/maxim.muzafarov/IdeaProjects/cassandra/build/test/cassandra/data/cql_test_keyspace/table_testexportingspecificdictionary_strateg_18-1b255f4def2540a60000000000000056/pa-12-big-Data.db,
>  
> /Users/maxim.muzafarov/IdeaProjects/cassandra/build/test/cassandra/data/cql_test_keyspace/table_testexportingspecificdictionary_strateg_18-1b255f4def2540a60000000000000056/pa-9-big-Data.db,
>  
> /Users/maxim.muzafarov/IdeaProjects/cassandra/build/test/cassandra/data/cql_test_keyspace/table_testexportingspecificdictionary_strateg_18-1b255f4def2540a60000000000000056/pa-10-big-Data.db,
>  ]
> DEBUG [CompactionExecutor:2] 2026-02-21T17:07:05,118 CursorCompactor.java:152 
> - Cursor compaction for table: 
> table_testexportingspecificdictionary_strateg_18 keyspace: cql_test_keyspace 
> is supported.
> INFO  [RMI TCP Connection(26)-127.0.0.1] 2026-02-21T17:07:05,146 
> CommandInvokerService.java:185 - Executing command 'train' with execution ID: 
> a5960218-7f58-41a2-a06f-d627acf20efd
> INFO  [RMI TCP Connection(26)-127.0.0.1] 2026-02-21T17:07:05,147 
> CompressionDictionaryManager.java:237 - Starting SSTable-based training for 
> cql_test_keyspace.table_testexportingspecificdictionary_strateg_18 with 1 
> SSTables
> INFO  [RMI TCP Connection(26)-127.0.0.1] 2026-02-21T17:07:05,150 
> CompressionDictionaryScheduler.java:101 - Starting SSTable-based dictionary 
> training for 
> cql_test_keyspace.table_testexportingspecificdictionary_strateg_18 from 1 
> SSTables
> DEBUG [RMI TCP Connection(26)-127.0.0.1] 2026-02-21T17:07:05,150 
> CompressionDictionaryScheduler.java:198 - Couldn't acquire reference to 
> SSTable 
> /Users/maxim.muzafarov/IdeaProjects/cassandra/build/test/cassandra/data/cql_test_keyspace/table_testexportingspecificdictionary_strateg_18-1b255f4def2540a60000000000000056/pa-13-big.
>  It may have been removed.
> WARN  [NonPeriodicTasks:1] 2026-02-21T17:07:05,150 
> CompressionDictionaryScheduler.java:213 - No SSTables available for sampling 
> in cql_test_keyspace.table_testexportingspecificdictionary_strateg_18
> INFO  [RMI TCP Connection(26)-127.0.0.1] 2026-02-21T17:07:05,150 
> ToolRunner.java:927 - >>>> Polling training status...SAMPLING
> INFO  [CompactionExecutor:2] 2026-02-21T17:07:05,152 
> CursorCompactor.java:1574 - Compaction ended 
> bf4375c0-0f47-11f1-88d2-574197b4b378: { data bytes read = 294620, data bytes 
> written = 297868,  input (keys = [1:10000,] = 10000, rows = [1:10000,] = 
> 10000, cells = [1:10000,] = 10000),  output (keys = 10000, rows = 10000, 
> cells = 10000)}
> INFO  [CompactionExecutor:2] 2026-02-21T17:07:05,153 CompactionTask.java:336 
> - Compacted (bf4375c0-0f47-11f1-88d2-574197b4b378) 4 sstables to 
> [build/test/cassandra/data/cql_test_keyspace/table_testexportingspecificdictionary_strateg_18-1b255f4def2540a60000000000000056/pa-13-big,]
>  to level=0.  101.413KiB to 91.637KiB (~90% of original) in 35ms.  Read 
> Throughput = 2.826MiB/s, Write Throughput = 2.554MiB/s, Row Throughput = 
> ~10,000/s.  10,000 total partitions merged to 10,000.  Partition merge counts 
> were {1:10000, }. Time spent writing keys = 10ms
> INFO  [NonPeriodicTasks:1] 2026-02-21T17:07:05,153 BigFormat.java:324 - 
> Deleting sstable: 
> /Users/maxim.muzafarov/IdeaProjects/cassandra/build/test/cassandra/data/cql_test_keyspace/table_testexportingspecificdictionary_strateg_18-1b255f4def2540a60000000000000056/pa-11-big
> INFO  [NonPeriodicTasks:1] 2026-02-21T17:07:05,154 BigFormat.java:324 - 
> Deleting sstable: 
> /Users/maxim.muzafarov/IdeaProjects/cassandra/build/test/cassandra/data/cql_test_keyspace/table_testexportingspecificdictionary_strateg_18-1b255f4def2540a60000000000000056/pa-12-big
> INFO  [NonPeriodicTasks:1] 2026-02-21T17:07:05,154 BigFormat.java:324 - 
> Deleting sstable: 
> /Users/maxim.muzafarov/IdeaProjects/cassandra/build/test/cassandra/data/cql_test_keyspace/table_testexportingspecificdictionary_strateg_18-1b255f4def2540a60000000000000056/pa-9-big
> INFO  [NonPeriodicTasks:1] 2026-02-21T17:07:05,155 BigFormat.java:324 - 
> Deleting sstable: 
> /Users/maxim.muzafarov/IdeaProjects/cassandra/build/test/cassandra/data/cql_test_keyspace/table_testexportingspecificdictionary_strateg_18-1b255f4def2540a60000000000000056/pa-10-big
> INFO  [NonPeriodicTasks:1] 2026-02-21T17:07:05,155 LogTransaction.java:266 - 
> Unfinished transaction log, deleting 
> /Users/maxim.muzafarov/IdeaProjects/cassandra/build/test/cassandra/data/cql_test_keyspace/table_testexportingspecificdictionary_strateg_18-1b255f4def2540a60000000000000056/pa_txn_compaction_bf4375c0-0f47-11f1-88d2-574197b4b378.log
>  
> INFO  [RMI TCP Connection(26)-127.0.0.1] 2026-02-21T17:07:06,155 
> ToolRunner.java:927 - >>>> Polling training status...SAMPLING
> INFO  [RMI TCP Connection(26)-127.0.0.1] 2026-02-21T17:07:07,159 
> ToolRunner.java:927 - >>>> Polling training status...SAMPLING
> INFO  [RMI TCP Connection(26)-127.0.0.1] 2026-02-21T17:07:08,163 
> ToolRunner.java:927 - >>>> Polling training status...SAMPLING
> INFO  [RMI TCP Connection(26)-127.0.0.1] 2026-02-21T17:07:09,168 
> ToolRunner.java:927 - >>>> Polling training status...SAMPLING
> {code}


