Hi,

I have a table defined like this:

CREATE TABLE myTable (
course_id text,
assignment_id text,
assignment_item_id text,
data text,
boolean active,
PRIMARY KEY (course_id, assignment_id, assignment_item_id)
);
i.e. course_id as the partition key and assignment_id, assignment_item_id
as clustering keys.

After data is populated, some delete queries by course_id and assignment_id
occurs, e.g. "DELETE FROM myTable WHERE course_id = 'C' AND assignment_id =
'A1';". This would create tombstones so query "SELECT * FROM myTable WHERE
course_id = 'C';" would be affected, right? Would query "SELECT * FROM
myTable WHERE course_id = 'C' AND assignment_id = 'A2';" be affected too?

For query "SELECT * FROM myTable WHERE course_id = 'C';", to workaround the
tombstone problem, we are thinking about not doing hard deletes, instead
doing soft deletes. So instead of doing "DELETE FROM myTable WHERE
course_id = 'C' AND assignment_id = 'A1';", we do "UPDATE myTable SET
active = false WHERE course_id = 'C' AND assignment_id = 'A1';". Then in
the application, we do query "SELECT * FROM myTable WHERE course_id = 'C';"
and filter out records that have "active" equal to "false". I am not really
sure this would improve performance because C* still has to scan through
all records with the partition key "C". It is just instead of scanning
through X records + Y tombstone records with hard deletes that generate
tombstones, it now scans through X + Y records with soft deletes and no
tombstones. Am I right?

Thanks.

George.

Reply via email to