The old files will not be split. TWCS doesn’t ever do that.
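
To illustrate why: as far as I understand it, TWCS assigns each whole SSTable to a single time window based on its maximum timestamp, and it only ever merges SSTables. It never rewrites one file into several. The Python sketch below mimics that bucketing for the 8-day window from the JMX command. The function names and the max timestamps are hypothetical, chosen for illustration (a file's mtime is not its data timestamp), and this is not the actual Cassandra code.

```python
from datetime import datetime, timezone

# compaction_window_size=8, compaction_window_unit=DAYS (from the JMX command)
WINDOW_SECONDS = 8 * 24 * 3600


def epoch(year, month, day):
    """Epoch seconds for midnight UTC on the given date."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp())


def window_start(max_timestamp_s, window_s=WINDOW_SECONDS):
    """Lower bound (epoch seconds) of the window containing the timestamp."""
    return max_timestamp_s - (max_timestamp_s % window_s)


def bucket_sstables(sstables, window_s=WINDOW_SECONDS):
    """Group (name, max_timestamp_s) pairs by window; each file lands in exactly one bucket."""
    buckets = {}
    for name, max_ts in sstables:
        buckets.setdefault(window_start(max_ts, window_s), []).append(name)
    return buckets


# Assumed max timestamps, for illustration only.
sstables = [
    ("lb-123800-big-Data.db", epoch(2017, 12, 31)),
    ("lb-123801-big-Data.db", epoch(2017, 12, 31)),
    ("hypothetical-new-Data.db", epoch(2018, 1, 9)),
]

for start, names in sorted(bucket_sstables(sstables).items()):
    day = datetime.fromtimestamp(start, tz=timezone.utc).date()
    print(day, names)
```

Under these assumptions the two large files fall into the same old window as whole files, however wide a time range they actually cover, while newly flushed data goes to the current window.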

> On Jan 9, 2018, at 12:26 AM, wxn...@zjqunshuo.com wrote:
>
> Hi Alex,
> After I changed one node to TWCS using JMX command, it started to compact. I
> expect the old large sstable files will be split into smaller ones according
> to the time bucket. But I got still large sstable file.
>
> JMX command used:
> set CompactionParametersJson
> {"class":"com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy","compaction_window_unit":"DAYS","compaction_window_size":"8"}
>
> Logs:
> INFO [CompactionExecutor:4] 2018-01-09 15:55:04,525 CompactionManager.java:654 - Will not compact /mnt/hadoop/cassandra/data/system/batchlog-0290003c977e397cac3efdfdc01d626b/lb-37-big: it is not an active sstable
> INFO [CompactionExecutor:4] 2018-01-09 15:55:04,525 CompactionManager.java:664 - No files to compact for user defined compaction
>
> The last log means something?
>
> Cheers,
> -Simon
>
> From: wxn...@zjqunshuo.com
> Date: 2018-01-05 15:54
> To: user <user@cassandra.apache.org>
> Subject: Re: Full repair caused disk space increase issue
> Thanks Alex. Some nodes have finished anticompaction and disk space got
> reclaimed as you mentioned.
> BTW, after reading your
> post(http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html) on TWCS, I
> decided to use TWCS, and doing the full repair is one of the preparation of
> changing to TWCS.
>
> From: Alexander Dejanovski <a...@thelastpickle.com>
> Date: 2018-01-05 15:17
> To: user <user@cassandra.apache.org>
> Subject: Re: Full repair caused disk space increase issue
> Hi Simon,
>
> since Cassandra 2.2, anticompaction is performed in all types of repairs,
> except subrange repair.
> Given that you have some very big SSTables, the temporary space used by
> anticompaction (which does the opposite of compaction : read one sstable,
> output two sstables) will impact your disk usage while it's running.
> It will reach a peak when they are close to completion.
> The anticompaction that is reported by compactionstats is currently using an
> extra 147GB*[compression ratio]. So with a compression ratio of 0.3 for
> example, that would be 44GB that will get reclaimed shortly after the
> anticompaction is over.
>
> You can check the current overhead of compaction by listing temporary
> sstables : *tmp*Data.db
>
> It's also possible that you have some overstreaming that occurred during your
> repair, which will increase the size on disk until it gets compacted away
> (over time).
> You should also check if you don't have snapshots sticking around by running
> "nodetool listsnapshots".
>
> Now, you're mentioning that you ran repair to evict tombstones. This is not
> what repair does, and tombstones are evicted through compaction when they
> meet the requirements (gc_grace_seconds and all the cells of the partition
> involved in the same compaction).
> If you want to optimize your tombstone eviction, especially with STCS, I
> advise to turn on unchecked_tombstone_compaction, which will allow single
> sstables compactions to be triggered by Cassandra when there is more than 20%
> of estimated droppable tombstones in an SSTable.
> You can check your current droppable tombstone ratio by running
> sstablemetadata on all your sstables.
> A command like the following should do the trick (it will print out min/max
> timestamps too) :
>
> for f in *Data.db; do
>   meta=$(sudo sstablemetadata "$f")
>   echo -e "Max:" $(date --date=@$(echo "$meta" | grep Maximum\ time | cut -d" " -f3 | cut -c 1-10) '+%m/%d/%Y') \
>     "Min:" $(date --date=@$(echo "$meta" | grep Minimum\ time | cut -d" " -f3 | cut -c 1-10) '+%m/%d/%Y') \
>     $(echo "$meta" | grep droppable) ' \t ' \
>     $(ls -lh "$f" | awk '{print $5" "$6" "$7" "$8" "$9}')
> done | sort
>
> Check if the 20% threshold is high enough by verifying that newly created
> SSTables don't already reach that level, and adjust accordingly if it's the
> case (for example raise the threshold to 50%).
>
> To activate the tombstone compactions, with a 50% droppable tombstone
> threshold, perform the following statement on your table :
>
> ALTER TABLE cargts.eventdata WITH compaction =
> {'class':'SizeTieredCompactionStrategy',
> 'unchecked_tombstone_compaction':'true', 'tombstone_threshold':'0.5'};
>
> Picking the right threshold is up to you.
> Note that tombstone compactions running more often will use temporary space
> as well, but they should help evicting tombstones faster if the partitions
> are contained within a single SSTable.
>
> If you are dealing with TTLed data and your partitions spread over time, I'd
> strongly suggest considering TWCS instead of STCS which can remove fully
> expired SSTables much more efficiently.
>
> Cheers,
>
>
> On Fri, Jan 5, 2018 at 7:43 AM wxn...@zjqunshuo.com <wxn...@zjqunshuo.com> wrote:
> Hi All,
> In order to evict tombstones, I issued full repair with the command "nodetool
> -pr -full". Then the data load size was indeed decreased by 100G for each
> node by using "nodetool status" to check. But the actual disk usage increased
> by 500G for each node. The repair is still ongoing and leaving less and less
> disk space for me.
>
> From compactionstats, I see "Anticompaction after repair".
> Based on my understanding, it is for incremental repair by changing sstable
> metadata to indicate which file is repaired, so in next repair it is not
> going to be repaired. But I'm doing full repair, so why anticompaction?
> 9e09c490-f1be-11e7-b2ea-b3085f85ccae Anticompaction after repair cargts eventdata 147.3 GB 158.54 GB bytes 92.91%
>
> There is a pair of sstable files; I mean they have the same timestamp, as
> shown below. I guess one of them or both of them should be deleted during
> repair, but for some unknown reason, the repair process failed to delete them.
> -rw-r--r-- 1 root root 237G Dec 31 12:48 lb-123800-big-Data.db
> -rw-r--r-- 1 root root 243G Dec 31 12:48 lb-123801-big-Data.db
>
> C* version is 2.2.8 with STCS. Any ideas?
>
> Cheers,
> -Simon
>
>
> --
> -----------------
> Alexander Dejanovski
> France
> @alexanderdeja
>
> Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com