[ https://issues.apache.org/jira/browse/CASSANDRA-15378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jon Haddad updated CASSANDRA-15378:
-----------------------------------
    Resolution: Not A Bug
        Status: Resolved  (was: Triage Needed)

This is not a bug.  Cassandra doesn't guarantee that the data will be removed 
once gc_grace_seconds has passed, only that it becomes eligible for removal.  
Tombstones can be dropped during compaction if and only if there's no data in 
other SSTables that the tombstones _might_ shadow.
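That purge rule can be sketched as follows. This is a minimal illustration of the idea, not Cassandra's actual implementation; the function name, the SSTable dicts, and the timestamps are all hypothetical:

```python
GC_GRACE_SECONDS = 120  # matches the reporter's table setting

def tombstone_purgeable(partition_key, deletion_time, now, other_sstables):
    """A tombstone may be dropped during compaction only if:
    1. gc_grace_seconds have elapsed since the deletion, AND
    2. no SSTable outside this compaction holds data for the same
       partition that is older than (i.e. shadowed by) the tombstone.
    """
    if now < deletion_time + GC_GRACE_SECONDS:
        return False  # still within the grace period
    return all(
        partition_key not in sst["partitions"]
        or sst["min_timestamp"] > deletion_time
        for sst in other_sstables
    )

# A hypothetical SSTable that is NOT part of the current compaction and
# still holds older data for partition "pk1":
others = [{"partitions": {"pk1"}, "min_timestamp": 500}]

print(tombstone_purgeable("pk1", 1000, 5000, others))  # False: older data may be shadowed
print(tombstone_purgeable("pk1", 1000, 5000, []))      # True: nothing left to shadow
```

The second condition is why gc_grace_seconds alone never frees disk space: a compaction must actually run over the SSTable holding the tombstone, and no overlapping SSTable may still contain older data for that partition.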

For additional reading on when compaction will run, see my blog post: 
https://thelastpickle.com/blog/2017/03/16/compaction-nuance.html

I'm closing this out; please reopen if compaction is run and still isn't 
clearing out the tombstones.
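The schema quoted below sets tombstone_threshold: 0.1 and unchecked_tombstone_compaction: true, which govern when a single-SSTable tombstone compaction is even considered. As a rough sketch (not Cassandra's actual code; the function and parameter names are illustrative), the trigger behaves like:

```python
def should_tombstone_compact(droppable_ratio, threshold=0.1,
                             unchecked=True, overlaps=False):
    """Simplified single-SSTable tombstone-compaction trigger:
    the estimated droppable-tombstone ratio must exceed
    tombstone_threshold; unless unchecked_tombstone_compaction is
    enabled, Cassandra additionally checks overlap with other SSTables
    before scheduling the compaction."""
    if droppable_ratio <= threshold:
        return False  # too few droppable tombstones to bother
    return unchecked or not overlaps

print(should_tombstone_compact(0.05))  # False: below the 10% threshold
print(should_tombstone_compact(0.25))  # True
```

Even when this trigger fires, the tombstones themselves are only dropped if the shadowing check passes, so an SSTable can be repeatedly compacted and still keep its data.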

> Data not cleaned up from disk for SSTables after compaction
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-15378
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15378
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Piush
>            Priority: Normal
>
> Hello Team,
> We have an application where we create data in a cf and delete the data based 
> on the partition key on a frequent basis. We have gc_grace_seconds set to a 
> low value (2 mins) to evict tombstones on the cf.
>  Version of cassandra - 3.11.3
> We are noticing a behaviour where even though the number of records in the cf 
> is 0, the data is left behind on disk in the Cassandra data directory for the 
> specific cf.
>  
> Size on filesystem for cfs subscriber_event_by_id_shadow, 
> subscriber_event_shadow:
> 112M subscriber_event_by_id_shadow-4f08b880f59311e98530a93a5d955b83
> 129M subscriber_event_shadow-4e7b1e80f59311e98530a93a5d955b83
> We see 0 records in these tables:
>  
> cqlsh:apim> select count (id) from subscriber_event_shadow;
>  
>  count
> -------
>      0
>  
> (1 rows)
>  
> Warnings:
> Aggregation query used without partition key
>  
> cqlsh:apim> select count(id) from subscriber_event_by_id_shadow;
>  
>  count
> -------
>      0
>  
> (1 rows)
>  
> Schema for the cfs:
> CREATE TABLE apim.subscriber_event_by_id_shadow (
>     transaction_id uuid,
>     shadow_version text,
>     id uuid,
>     namespace text,
>     generated_at timeuuid,
>     api_version text,
>     created_at timestamp,
>     event text,
>     event_type text,
>     filter text,
>     metadata map<text, text>,
>     name text,
>     occ_keys list<text>,
>     operation text,
>     payload blob,
>     retries int,
>     scope text,
>     shadow boolean,
>     shadow_id timeuuid,
>     shadow_metadata map<text, text>,
>     state text,
>     summary text,
>     title text,
>     type text,
>     updated_at timestamp,
>     url text,
>     PRIMARY KEY (transaction_id, shadow_version, id, namespace, generated_at)
> ) WITH CLUSTERING ORDER BY (shadow_version ASC, id ASC, namespace ASC, 
> generated_at ASC)
>     AND bloom_filter_fp_chance = 0.01
>     AND caching = {'keys': 'ALL', 'rows_per_partition': '10'}
>     AND comment = ''
>     AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4', 'tombstone_threshold': '0.1', 
> 'unchecked_tombstone_compaction': 'true'}
>     AND compression = {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND crc_check_chance = 1.0
>     AND dclocal_read_repair_chance = 0.1
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 120
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99PERCENTILE';
> We see gc_grace_seconds set to 120 (2 mins); our understanding is that the 
> tombstones should have been evicted and the disk space cleaned up.
>  
> However, the keyspace has the following contents in the file system:
> bash-4.2$ cd subscriber_event_shadow-4e7b1e80f59311e98530a93a5d955b83
> bash-4.2$ du -sh *
> 4.0K backups
> 4.0K mc-102-big-CompressionInfo.db
> 22M mc-102-big-Data.db
> 4.0K mc-102-big-Digest.crc32
> 4.0K mc-102-big-Filter.db
> 8.0K mc-102-big-Index.db
> 8.0K mc-102-big-Statistics.db
> 4.0K mc-102-big-Summary.db
> 4.0K mc-102-big-TOC.txt
> 4.0K mc-103-big-CompressionInfo.db
> 4.5M mc-103-big-Data.db
> 4.0K mc-103-big-Digest.crc32
> 4.0K mc-103-big-Filter.db
> 4.0K mc-103-big-Index.db
> 8.0K mc-103-big-Statistics.db
> 4.0K mc-103-big-Summary.db
> 4.0K mc-103-big-TOC.txt
> 4.0K mc-104-big-CompressionInfo.db
> 4.0K mc-104-big-Data.db
> 4.0K mc-104-big-Digest.crc32
> 4.0K mc-104-big-Filter.db
> 4.0K mc-104-big-Index.db
> 8.0K mc-104-big-Statistics.db
> 4.0K mc-104-big-Summary.db
> 4.0K mc-104-big-TOC.txt
> 8.0K mc-95-big-CompressionInfo.db
> 52M mc-95-big-Data.db
> 4.0K mc-95-big-Digest.crc32
> 4.0K mc-95-big-Filter.db
> 8.0K mc-95-big-Index.db
> 8.0K mc-95-big-Statistics.db
> 4.0K mc-95-big-Summary.db
> 4.0K mc-95-big-TOC.txt
> 8.0K mc-96-big-CompressionInfo.db
> 51M mc-96-big-Data.db
> 4.0K mc-96-big-Digest.crc32
> 4.0K mc-96-big-Filter.db
> 12K mc-96-big-Index.db
> 8.0K mc-96-big-Statistics.db
> 4.0K mc-96-big-Summary.db
> 4.0K mc-96-big-TOC.txt
> 4.0K snapshots
> bash-4.2$
> We are not able to figure out why we still see .db files with ~50 MB of data 
> on disk.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
