Laxmikant,
You mentioned that option-1 requires filtering records on status='pending', but
I don't see any filtering in that option. It only sets status to 'processed'
for the row that matches the given key in each table. For delete
(option-2), since the statement specifies both date and id, it will remove only
that row from the records_by_date table (leaving a row tombstone), not the whole
partition, if that's what you want.
Regards,
Aakash Pandhi
On Saturday, May 23, 2020, 09:09:48 AM CDT, Laxmikant Upadhyay
<[email protected]> wrote:
Hi All,
I have a query regarding Cassandra data modelling. I have created two
tables:
1. CREATE TABLE ks.records_by_id (id uuid PRIMARY KEY, status text, details text);
2. CREATE TABLE ks.records_by_date (date date, id uuid, status text, PRIMARY KEY (date, id));
I need to fetch records by date and then process each of them. Which of the
following options is the better way to update the tables once a record has been
processed?
Option-1:
BEGIN BATCH
UPDATE ks.records_by_id SET status = 'processed' WHERE id = <id1>;
UPDATE ks.records_by_date SET status = 'processed' WHERE id = <id1> AND date = 'date1';
APPLY BATCH;
Option-2:
BEGIN BATCH
UPDATE ks.records_by_id SET status = 'processed' WHERE id = <id1>;
DELETE FROM ks.records_by_date WHERE id = <id1> AND date = 'date1';
APPLY BATCH;
Option-1 will not create tombstones, but I need to filter the records on
status='pending' at the application layer for each date. Option-2 will create
tombstones (although the number of tombstones in a partition will be limited),
but it will not require application-side filtering.
I think we should avoid tombstones, especially row-level ones, so we should go
with option-1. Could you advise on the above, or suggest a better approach?
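For reference, the application-side filtering that option-1 implies might look
roughly like the sketch below. It is only an illustration: the rows are
hypothetical in-memory dicts standing in for the result of a per-date query on
ks.records_by_date, and the helper name pending_ids is made up for this example.

```python
from uuid import uuid4


def pending_ids(rows):
    """Return the ids of rows whose status is still 'pending'."""
    return [row["id"] for row in rows if row["status"] == "pending"]


# Hypothetical rows, standing in for the result of something like:
#   SELECT id, status FROM ks.records_by_date WHERE date = 'date1';
id_a, id_b, id_c = uuid4(), uuid4(), uuid4()
rows = [
    {"id": id_a, "status": "pending"},
    {"id": id_b, "status": "processed"},
    {"id": id_c, "status": "pending"},
]

# Only these ids still need processing; already processed rows are skipped
# in the application rather than by the database.
to_process = pending_ids(rows)
```

With option-1 every read of a date partition returns processed rows as well, so
the cost of this filter grows with the total number of records for that date.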
--
regards,
Laxmikant Upadhyay