Thanks Jeff. There was no restart between the "Compacting" and "Compacted" log lines, but I observed that a full repair (-pr) was running at that time, with errors:
*Caused by: java.lang.RuntimeException: java.io.IOException: Cannot proceed on repair because a neighbor (/aa.bb.cc.dd) is dead: session failed*

Does anyone remember any JIRA ticket related to obsolete sstables not being deleted after compaction?

Regards
Manish

On Wed, Jan 22, 2020 at 11:37 AM Jeff Jirsa <jji...@gmail.com> wrote:

> On Tue, Jan 21, 2020 at 8:58 PM manish khandelwal <
> manishkhandelwa...@gmail.com> wrote:
>
>> Thanks Nitan,
>>
>> Thanks for your reply.
>>
>> I am using following methodology to find obsolete sstables and just want
>> to make sure that I don't delete live data if I delete them.
>>
>> In the following logs I searched for sstable
>> "keyspace-columnfamily-jb-456789" and found that this
>> "*CompactionExecutor:1957*" thread compacted
>> keyspace-columnfamily-jb-123456-Data.db,
>> keyspace-columnfamily-jb-234567-Data.db,
>> keyspace-columnfamily-jb-345678-Data.db. These files are still present in
>> my data directory so I am assuming that they are obsolete. *Is my
>> assumption correct*?
>>
>
> The lines from 'Compacting' are the ones obsoleted IF and ONLY IF you see
> a completed "Compacted" line for the same thread without a restart in
> between.
>
>>
>> INFO [CompactionExecutor:1957] 2020-01-20 06:44:56,721
>> CompactionTask.java (line 120) Compacting
>> [SSTableReader(path='/var/lib/cassandra/data/keyspace/columnfamily/
>> *keyspace-columnfamily-jb-123456-Data.db*'),
>> SSTableReader(path='/var/lib/cassandra/data/keyspace/columnfamily/
>> *keyspace-columnfamily-jb-234567-Data.db*'),
>> SSTableReader(path='/var/lib/cassandra/data/keyspace/columnfamily/
>> *keyspace-columnfamily-jb-345678-Data.db*')]
>> INFO [CompactionExecutor:1957] 2020-01-20 12:45:23,270
>> ColumnFamilyStore.java (line 795) Enqueuing flush of
>> Memtable-compactions_in_progress@519967741(0/0 serialized/live bytes, 1
>> ops)
>> INFO [*CompactionExecutor:1957*] 2020-01-20 12:45:23,502
>> CompactionTask.java (line 296) Compacted 3 sstables to
>> [/var/lib/cassandra/data/keyspace/columnfamily/
>> *keyspace-columnfamily-jb-456789*,]. 136,795,757,524 bytes to
>> 100,529,812,389 (~73% of original) in 21,626,781ms = 4.433055MB/s.
>> 1,738,999,743 total partitions merged to 1,274,232,528. Partition merge
>> counts were {1:1049583261, 2:309997005, 3:23140824, }
>>
>
> In this case,
> /var/lib/cassandra/data/keyspace/columnfamily/keyspace-columnfamily-jb-123456-*,
> /var/lib/cassandra/data/keyspace/columnfamily/keyspace-columnfamily-jb-234567-*,
> and
> /var/lib/cassandra/data/keyspace/columnfamily/keyspace-columnfamily-jb-345678-*
> are all obsolete and should be gc'd "soon". If they're not being gc'd,
> there's something wrong and you should figure out what's going on. The
> cases where this happened in 2.0.x (which is what you're running) were
> usually pretty nasty bugs, and consider this a reason why you should be
> upgrading.
>
> Note that if you just `rm` those files, you'll probably throw FileNotFound
> exceptions and break the node until you restart, which is bad. You'd have
> to stop the host, confirm everything is shut down, then remove that 137GB
> worth of input files if they still exist.
>
> Also, please upgrade to 2.1.20. Your life will probably be much easier
> because of it.
>
> As with all things, these are personal opinions, I cant guarantee they're
> safe, manually mucking around with database data files is scary, make sure
> you have a backup, practice in a lab, etc.
>
>> Regards
>> Manish
>>
>>
>> On Tue, Jan 21, 2020 at 9:09 PM Nitan Kainth <nitankai...@gmail.com>
>> wrote:
>>
>>> If you are certain that you don’t need data, your plan is good. Make
>>> sure to delete all the files for any given sequence number ie data, index,
>>> toc etc
>>>
>>> Regards,
>>>
>>> Nitan
>>>
>>> Cell: 510 449 9629
>>>
>>> On Jan 21, 2020, at 5:36 AM, manish khandelwal <
>>> manishkhandelwa...@gmail.com> wrote:
>>>
>>>
>>> Hi Team
>>>
>>> I am observing some obsolete files in Cassandra 2.0.14 which are already
>>> compacted but not removed from the system after compaction.
>>> As per CASSANDRA-7872
>>> <https://issues.apache.org/jira/browse/CASSANDRA-7872> , after GC grace
>>> period has passed the sstables are open for read again and can lead to data
>>> resurrection. I am facing disk crunch (90% full ) as well and need to
>>> remove those obsolete files ASAP.
>>>
>>>
>>> To avoid this what should be our strategy? I am thinking on following
>>> lines
>>> 1. Stop the Cassandra server.
>>> 2. Remove the obsolete files manually.
>>> 3. Start the Cassandra server.
>>>
>>> Regards
>>> Manish
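
For anyone repeating the log check discussed in this thread (a "Compacting" line is only trusted once a "Compacted" line shows up for the same CompactionExecutor thread, with no restart in between), here is a rough Python sketch. It is not an official tool. The function name find_obsolete_inputs is made up for illustration, the script assumes each log entry sits on one physical line in system.log (the mail client wrapped the lines quoted above), and the restart marker string "Logging initialized" is an assumption about what the node writes at startup, so check it against your own logs before relying on it.

```python
import re

# Matches the thread tag, e.g. [CompactionExecutor:1957]
THREAD_RE = re.compile(r"\[(CompactionExecutor:\d+)\]")
# Matches each input sstable listed on a "Compacting" line
PATH_RE = re.compile(r"SSTableReader\(path='([^']+)'\)")


def find_obsolete_inputs(log_lines, restart_marker="Logging initialized"):
    """Return input sstable paths whose compaction is logged as completed.

    restart_marker is an ASSUMED startup string; pass whatever your node
    actually logs when it starts. Any compaction still in flight when the
    marker is seen is discarded, since a restart breaks the guarantee.
    """
    pending = {}   # thread name -> input paths of the compaction in flight
    obsolete = []
    for line in log_lines:
        if restart_marker in line:
            pending.clear()
            continue
        match = THREAD_RE.search(line)
        if not match:
            continue
        thread = match.group(1)
        if " Compacting [" in line:
            pending[thread] = PATH_RE.findall(line)
        elif " Compacted " in line:
            # Completed on the same thread: its inputs are obsolete.
            obsolete.extend(pending.pop(thread, []))
    return obsolete
```

Feeding it open("/var/log/cassandra/system.log") (path assumed) and then checking each returned path with os.path.exists would show which obsolete Data.db files are still on disk. This only reports what the logs say; as noted above, any manual removal should happen with the node stopped, after a backup, and should cover every component file of that generation.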