I will look into raising the gc_grace_seconds.
We are using LocalQuorum for all reads and writes. We avoid ALL precisely 
because of the outage risk.



________________________________
From: Jeff Jirsa <jji...@gmail.com>
Sent: Saturday, February 4, 2023 8:44 PM
To: user@cassandra.apache.org <user@cassandra.apache.org>
Subject: Re: Deletions getting omitted

While you'd expect only_purge_repaired_tombstones:true to be sufficient, your 
gc_grace_seconds of 1 hour is making you unusually susceptible to resurrecting 
data.

(To be clear, you should be safe to do this, but if there is a bug hiding in 
there somewhere, your low gc_grace_seconds will make it likely to resurrect; if 
this is causing you problems, I'd try raising that first to mitigate while you 
investigate the real cause).

If it's CASSANDRA-15690, a second read at consistency ALL may cause the data to 
properly show up "deleted" (you don't want to use ALL all the time, because 
it'll be an outage if you ever have a node go down). Given CASSANDRA-15690 
exists, you probably want to upgrade.
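
The availability trade-off between quorum and ALL can be sketched with a little arithmetic. This is a minimal illustration assuming a single datacenter with replication factor 3, as in this thread; the function names are made up for the example.

```go
package main

import "fmt"

// quorum returns how many replicas must respond for a QUORUM or
// LOCAL_QUORUM request at the given replication factor.
func quorum(rf int) int {
	return rf/2 + 1
}

// tolerated returns how many replicas can be down while a request
// needing `required` responses still succeeds.
func tolerated(rf, required int) int {
	return rf - required
}

func main() {
	rf := 3
	q := quorum(rf)

	fmt.Println(q, tolerated(rf, q))   // LOCAL_QUORUM: replicas needed, nodes that may be down
	fmt.Println(rf, tolerated(rf, rf)) // ALL: replicas needed, nodes that may be down
	fmt.Println(q+q > rf)              // quorum reads and writes overlap on at least one replica
}
```

At RF 3, quorum needs 2 replicas and survives one node being down, while ALL needs all 3 and survives none, so ALL is better kept for a one-off diagnostic read.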



On Sat, Feb 4, 2023 at 4:56 PM shankha b <shankha-ms-wor...@outlook.com> wrote:
We are facing an issue on one of our production systems: after we delete 
data, it does not appear to be deleted. We issue a Get call right after the 
delete call, and the data still shows up.

Versions

    cassandra : 3.11.6
    gocqlx : v2 v2.1.0


1. Client Settings: LocalQuorum
2. Number of Nodes : 3
3. All 3 nodes up and running for weeks.
4. Inserts were done a few days earlier, so there is a good amount of time 
between the inserts and the deletes, and the inserts went through successfully.


The Delete Call :

    q := s.session.Query(stmt, names).BindStruct(*customModel)
    err := q.ExecRelease()

We do check the error and it is nil.
There are no exceptions during that time on either the client side or the 
server side.


The Get Call :

    q := s.session.Query(stmt, names).BindStruct(*customModel)
    err := q.GetRelease(customModel)

This returns the data successfully.

We do have these two options enabled.
1. commitlog_sync: 
https://docs.datastax.com/en/dse/6.8/dse-dev/datastax_enterprise/config/configCassandra_yaml.html#configCassandra_yaml__commitlog_sync

    batch - Send ACK signal for writes after the commit log has been flushed to disk. Each incoming write triggers the flush task.

2. only_purge_repaired_tombstones

This does not happen for all delete operations; for many of them the delete 
seems to go through. It does not seem to be timing-related, and the successful 
and unsuccessful deletes are spread out.


CASSANDRA-15690
Single partition queries can mistakenly omit partition deletions and resurrect data

I am trying to go through this PR and ticket. If you have any suggestions, 
please do let me know.


The table structure is the following

    CREATE KEYSPACE cycling
        WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'}
        AND durable_writes = true;

    CREATE TABLE cycling.rider (
        uuid text,
        created_at timestamp,
        PRIMARY KEY (uuid, created_at)
    ) WITH CLUSTERING ORDER BY (created_at DESC)
        AND bloom_filter_fp_chance = 0.01
        AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
        AND comment = ''
        AND compaction = {
            'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
            'max_threshold': '32',
            'min_threshold': '4',
            'only_purge_repaired_tombstones': 'true'}
        AND compression = {
            'chunk_length_in_kb': '64',
            'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
        AND crc_check_chance = 1.0
        AND dclocal_read_repair_chance = 0.1
        AND default_time_to_live = 0
        AND gc_grace_seconds = 3600
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND min_index_interval = 128
        AND read_repair_chance = 0.0
        AND speculative_retry = '99PERCENTILE';

Thanks

