Josh Wickman created CASSANDRA-11577:
----------------------------------------
Summary: Traces persist for longer than 24 hours
Key: CASSANDRA-11577
URL: https://issues.apache.org/jira/browse/CASSANDRA-11577
Project: Cassandra
Issue Type: Bug
Reporter: Josh Wickman
Priority: Minor
My deployment currently has clusters on both Cassandra 1.2 (1.2.19) and 2.1
(2.1.11) with tracing enabled. On 2.1, trace records persist for longer than
the [documented 24
hours|https://docs.datastax.com/en/cql/3.3/cql/cql_reference/tracing_r.html]:
{noformat}
cqlsh> select started_at from system_traces.sessions limit 10;
started_at
--------------------------
2016-03-11 23:28:40+0000
2016-03-14 21:09:07+0000
2016-03-14 16:42:25+0000
2016-03-14 16:13:13+0000
2016-03-14 19:12:11+0000
2016-03-14 21:25:57+0000
2016-03-29 22:45:28+0000
2016-03-14 19:56:27+0000
2016-03-09 23:31:41+0000
2016-03-10 23:08:44+0000
(10 rows)
{noformat}
My systems on 1.2 do not exhibit this problem:
{noformat}
cqlsh> select started_at from system_traces.sessions limit 10;
started_at
--------------------------
2016-04-13 22:49:31+0000
2016-04-14 18:06:45+0000
2016-04-14 07:57:00+0000
2016-04-14 04:35:05+0000
2016-04-14 03:54:20+0000
2016-04-14 10:54:38+0000
2016-04-14 18:34:04+0000
2016-04-14 12:56:57+0000
2016-04-14 01:57:20+0000
2016-04-13 21:36:01+0000
{noformat}
The event records also persist alongside the session records, for example:
{noformat}
cqlsh> select session_id, dateOf(event_id) from system_traces.events where
session_id = fc8c1e80-e7e0-11e5-a2fb-1968ff3c067b;
session_id | dateOf(event_id)
--------------------------------------+--------------------------
fc8c1e80-e7e0-11e5-a2fb-1968ff3c067b | 2016-03-11 23:28:40+0000
{noformat}
Between these versions, the table parameter {{default_time_to_live}} was
introduced. The {{system_traces}} tables report the default value of 0:
{noformat}
cqlsh> desc table system_traces.sessions

CREATE TABLE system_traces.sessions (
    session_id uuid PRIMARY KEY,
    coordinator inet,
    duration int,
    parameters map<text, text>,
    request text,
    started_at timestamp
) WITH bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = 'traced sessions'
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.SnappyCompressor'}
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 0
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';
{noformat}
I suspect that {{default_time_to_live}} supersedes the mechanism used in
1.2 to expire the trace records. However, I am not permitted to change this
parameter on this table:
{noformat}
cqlsh> alter table system_traces.sessions with default_time_to_live = 86400;
Unauthorized: code=2100 [Unauthorized] message="Cannot ALTER <table system_traces.sessions>"
{noformat}
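If 1.2 expires traces through a TTL set on each write (an assumption on my part, not something I have confirmed in the code), the remaining TTL on the stored cells should show the difference between the two versions. A sketch of the check, using the standard CQL {{TTL()}} function on the {{started_at}} column of the sessions table; I have not included output here:
{noformat}
cqlsh> select session_id, started_at, TTL(started_at) from system_traces.sessions limit 10;
{noformat}
A null TTL on 2.1 next to a value of at most 86400 on 1.2 would suggest that the trace writes on 2.1 no longer carry a TTL, rather than that {{default_time_to_live}} itself is at fault.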
I realize Cassandra 1.2 is no longer supported, but the problem occurs on
Cassandra 2.1 (I included 1.2 only for comparison). I couldn't find an
existing ticket addressing this issue, so I'm concerned that it may also be
present in more recent versions of Cassandra, but I have not tested these.
The persistent trace records are consuming disk space and, more importantly,
making the trace data harder to analyze. Is there a workaround for this?