Hello Cassandra guru:
We are using Cassandra 4.0.1. I want to understand the meaning of
liveness_info in sstabledump output. Here is sample output for one record
from sstabledump:
{
  "partition" : {
    "key" : [ "1065", "25034769", "6" ],
    "position" : 110384220
  },
  "rows" : [
    {
      "type" : "row",
      "position" : 110384255,
      "clustering" : [ "2021-12-03 08:11:00.000Z" ],
      "liveness_info" : { "tstamp" : "2021-12-03T08:11:00Z", "ttl" : 259200,
                          "expires_at" : "2021-12-13T08:34:04Z", "expired" : false },
      "cells" : [
        { "name" : "close", "value" : {"size": 10000, "ts": "2021-12-03 08:11:51.919Z", "value": 132.259} },
        { "name" : "high", "value" : {"size": 10000, "ts": "2021-12-03 08:11:37.852Z", "value": 132.263} },
        { "name" : "low", "value" : {"size": 10000, "ts": "2021-12-03 08:11:21.377Z", "value": 132.251} },
        { "name" : "open", "value" : {"size": 10000, "ts": "2021-12-03 08:11:00.434Z", "value": 132.261} }
      ]
    }
  ]
},
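To make the discrepancy concrete, here is the arithmetic as a quick Python
sketch (the timestamps are copied from the dump above):

```python
from datetime import datetime, timedelta

# Values copied from the liveness_info above
tstamp = datetime(2021, 12, 3, 8, 11, 0)        # "tstamp"
expires_at = datetime(2021, 12, 13, 8, 34, 4)   # "expires_at"
ttl = timedelta(seconds=259200)                 # "ttl"

print(ttl.days)                    # 3  -- the TTL is 3 days
print((expires_at - tstamp).days)  # 10 -- but expires_at is ~10 days after tstamp
print(expires_at - ttl)            # 2021-12-10 08:34:04 -- expires_at minus the TTL
```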
What I am puzzled about is the "expires_at" timestamp.
The TTL is 259200 seconds, which is 3 days. Yet expires_at is set for 10 days
after the tstamp. This non-expired data seems to stay in the database longer
than expected, which causes read latency towards the end of the week.
I am not sure where the 10-day "expires_at" is coming from. Our table is
configured as follows:
CREATE TABLE storage_system.sample_rate (
    market smallint,
    sin bigint,
    field smallint,
    slot timestamp,
    close frozen<pricerecord>,
    high frozen<pricerecord>,
    low frozen<pricerecord>,
    open frozen<pricerecord>,
    PRIMARY KEY ((market, sin, field), slot)
) WITH CLUSTERING ORDER BY (slot ASC)
    AND additional_write_policy = '99p'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND cdc = false
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
        'compaction_window_size': '2', 'compaction_window_unit': 'HOURS',
        'max_threshold': '32', 'min_threshold': '4',
        'tombstone_compaction_interval': '86400',
        'unchecked_tombstone_compaction': 'true',
        'unsafe_aggressive_sstable_expiration': 'true'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 86400
    AND extensions = {}
    AND gc_grace_seconds = 3600
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99p';
Our application has logic so that Monday-to-Thursday records have a TTL of
one day, while Friday records have a TTL of 3 days. The Monday-to-Thursday
records are cleaned up properly; it is always Friday's data that seems to
have an extended expires_at.
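For reference, the weekday-based TTL selection is roughly the following (a
hypothetical sketch, not our actual application code; the function name
ttl_for is made up for illustration):

```python
from datetime import date

def ttl_for(record_date: date) -> int:
    """Pick a TTL in seconds based on the weekday of the record.
    Monday-Thursday records expire after one day; Friday records
    after three days. Hypothetical sketch of our application logic."""
    if record_date.weekday() == 4:   # Friday
        return 3 * 86400             # 259200 seconds
    return 86400                     # one day

# 2021-12-03 (the record in the dump above) was a Friday,
# which matches the "ttl" : 259200 shown by sstabledump.
print(ttl_for(date(2021, 12, 3)))
```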
Thanks in advance to anyone who can provide some pointers on where to look
for the problem.
Eric Wong