Laxmikant,
You mentioned that you need to filter records on status='pending' in
option-1, but I don't see that filtering done anywhere in that option. It only
sets status to 'processed' for the row matching the given primary key in each
table. The delete in option-2 specifies both date and id, so it will remove
that single row from the records_by_date partition (not the whole partition),
if that's what you want.
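
To make the missing step concrete, here is a minimal sketch of what the read
side of option-1 could look like. The date literal is a placeholder, and the
second statement is shown only for contrast: status is a regular (non-key,
unindexed) column, so Cassandra rejects a server-side filter on it unless
ALLOW FILTERING is added.

```cql
-- Read every record for one day; the application then skips rows
-- whose status is not 'pending'. '2020-05-23' is a placeholder date.
SELECT id, status FROM ks.records_by_date WHERE date = '2020-05-23';

-- Server-side alternative: status is a regular column, so this needs
-- ALLOW FILTERING. The scan stays within the single date partition.
SELECT id, status FROM ks.records_by_date
 WHERE date = '2020-05-23' AND status = 'pending' ALLOW FILTERING;
```
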
Regards,
Aakash Pandhi
 

    On Saturday, May 23, 2020, 09:09:48 AM CDT, Laxmikant Upadhyay 
<laxmikant....@gmail.com> wrote:  
 
Hi All,

I have a query regarding Cassandra data modelling. I have created two tables:

1. CREATE TABLE ks.records_by_id (id uuid PRIMARY KEY, status text, details text);
2. CREATE TABLE ks.records_by_date (date date, id uuid, status text, PRIMARY KEY (date, id));

I need to fetch records by date and then process each of them. Which of the
following options is better once a record has been processed?

Option-1:
BEGIN BATCH
UPDATE ks.records_by_id SET status = 'processed' WHERE id = <id1>;
UPDATE ks.records_by_date SET status = 'processed' WHERE date = 'date1' AND id = <id1>;
APPLY BATCH ;

Option-2:
BEGIN BATCH
UPDATE ks.records_by_id SET status = 'processed' WHERE id = <id1>;
DELETE FROM ks.records_by_date WHERE id = <id1> and date='date1';
APPLY BATCH ;

Option-1 will not create tombstones, but I need to filter the records on
status='pending' at the application layer for each date. Option-2 will create
tombstones (however, the number of tombstones in a partition will be limited),
but it will not require application-side filtering.
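
For reference, a sketch of the two delete shapes on records_by_date; the date
and uuid literals are placeholders. The statement in option-2 names the full
primary key (date, id), so it is of the first kind.

```cql
-- Full primary key: removes one row and writes a row-level tombstone.
DELETE FROM ks.records_by_date
 WHERE date = '2020-05-23' AND id = 123e4567-e89b-12d3-a456-426614174000;

-- Partition key only: removes every row for that date with a single
-- partition-level tombstone.
DELETE FROM ks.records_by_date WHERE date = '2020-05-23';
```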

I think we should avoid tombstones, especially row-level ones, so we should go
with option-1. Kindly advise on the above, or suggest a better approach.

-- 

Regards,
Laxmikant Upadhyay
  
