[jira] [Commented] (CASSANDRA-18105) TRUNCATED data come back after a restart or upgrade

Stefan Miklosovic (Jira) Wed, 19 Apr 2023 08:32:44 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-18105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17714175#comment-17714175
 ]


Stefan Miklosovic commented on CASSANDRA-18105:
-----------------------------------------------

With great help from [~samt], we found the problem. When an index is dropped, 
the code eventually calls (1), but the id passed there is the id of the base 
table, not of the index. The call therefore removes the base table's record 
from the truncated_at map in system.local. In other words, TRUNCATE puts the 
record there, and the next DROP of the index removes it again.

Note that the index has the same id as the base table because of (2).

I was told that there is a reason for sharing the id between the base table 
and the index, but we should probably revisit this decision. Personally, I am 
not sure why it is done this way.

The fix consists of a simple check that skips removing the truncated_at entry 
when the table metadata belongs to an index:

{code}
        if (!metadata.get().isIndex())
            SystemKeyspace.removeTruncationRecord(metadata.id);
{code}

(1) 
[https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java#L695]
(2) 
[https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/index/internal/CassandraIndex.java#L739]
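
The interaction described in this comment can be modelled with a small toy 
sketch. Nothing below is Cassandra code: TableMeta, truncatedAt and the drop 
helpers are hypothetical stand-ins. The sketch assumes only what the comment 
states: the index metadata carries the base table's id, TRUNCATE adds a record 
keyed by that id, and the drop path removes the record for metadata.id.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Toy model only: none of these types are real Cassandra classes.
class TruncationRecordSketch {
    /** Hypothetical stand-in for table metadata; an index reuses the base table's id. */
    static final class TableMeta {
        final UUID id;
        final boolean isIndex;

        TableMeta(UUID id, boolean isIndex) {
            this.id = id;
            this.isIndex = isIndex;
        }
    }

    /** Stand-in for the truncated_at map in system.local: table id -> truncation time. */
    static final Map<UUID, Long> truncatedAt = new HashMap<>();

    /** TRUNCATE records the truncation under the table's id. */
    static void truncate(TableMeta table, long timestamp) {
        truncatedAt.put(table.id, timestamp);
    }

    /** Drop path without the check: always removes the record for meta.id. */
    static void dropUnguarded(TableMeta meta) {
        truncatedAt.remove(meta.id);
    }

    /** Drop path with the check from the comment: skip removal for index metadata. */
    static void dropGuarded(TableMeta meta) {
        if (!meta.isIndex)
            truncatedAt.remove(meta.id);
    }

    /** Returns whether the base table's record survives TRUNCATE followed by DROP INDEX. */
    static boolean recordSurvives(boolean guarded) {
        truncatedAt.clear();
        UUID sharedId = UUID.randomUUID();
        TableMeta base = new TableMeta(sharedId, false);
        TableMeta index = new TableMeta(sharedId, true); // same id as the base table

        truncate(base, 1L);
        if (guarded)
            dropGuarded(index);
        else
            dropUnguarded(index);
        return truncatedAt.containsKey(base.id);
    }

    public static void main(String[] args) {
        System.out.println("unguarded: record survives = " + recordSurvives(false));
        System.out.println("guarded:   record survives = " + recordSurvives(true));
    }
}
```

In this model the unguarded drop loses the base table's record (false), while 
the guarded drop keeps it (true), which mirrors the behaviour the isIndex() 
check is meant to restore.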

> TRUNCATED data come back after a restart or upgrade
> ---------------------------------------------------
>
>                 Key: CASSANDRA-18105
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18105
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Feature/2i Index
>            Reporter: Ke Han
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 2.2.x, 3.0.x, 3.11.x, 4.0.x, 4.1.x, 5.x
>
>
> When we use the TRUNCATE command to delete all data in the table, the deleted 
> data come back after a node restart or upgrade. This problem happens in the 
> latest releases (2.2.19, 3.0.28, and 4.0.7).
> h1. Steps to reproduce
> h2. To reproduce it at release (3.0.28 or 4.0.7)
> Start up a single Cassandra node using the default configuration and execute 
> the following cqlsh commands.
> {code:java}
> CREATE KEYSPACE IF NOT EXISTS ks WITH REPLICATION = { 'class' : 
> 'SimpleStrategy', 'replication_factor' : 1 };
> CREATE TABLE  ks.tb (c3 TEXT,c4 TEXT,c2 INT,c1 TEXT, PRIMARY KEY (c1, c2, c3 
> ));
> INSERT INTO ks.tb (c3, c1, c2) VALUES ('val1','val2',1);
> CREATE INDEX IF NOT EXISTS tb ON ks.tb ( c3);
> TRUNCATE TABLE ks.tb;
> DROP INDEX IF EXISTS ks.tb; {code}
> Execute a read command
> {code:java}
> cqlsh> SELECT c2 FROM ks.tb; 
>  c2
> ----
> (0 rows) {code}
> Then, we flush the node and kill the Cassandra daemon by
> {code:java}
> bin/nodetool flush
> pgrep -f cassandra | xargs kill -9 {code}
> We restart the node. When the node has started, perform the same read, and 
> the deleted data comes back again.
> {code:java}
> cqlsh> SELECT c2 FROM ks.tb; 
>  c2
> ----
>   1
> (1 rows) {code}
> h2. To reproduce it at release (2.2.19)
> We don't need to kill the Cassandra daemon. Using bin/nodetool stopdaemon is 
> enough. The other steps are the same as for reproducing it at 4.0.7 or 3.0.28.
> {code:java}
> bin/nodetool -h ::FFFF:127.0.0.1 flush 
> bin/nodetool -h ::FFFF:127.0.0.1 stopdaemon{code}
>  
> I have put the full log to reproduce it for release 4.0.7 and 2.2.19 in the 
> comments.


