Re: Optimising the data model for reads

Romain Hardouin Thu, 29 Sep 2016 06:10:41 -0700

Hi Julian,
The problem with any deletes here is that you can *read* potentially many 
tombstones. I mean you have two concerns: 1. Avoid to read tombstones during a 
query 2. How to evict tombstones as quickly as possible to reclaim disk space   
 The first point is a data model consideration. Generally speaking, to avoid to 
read tombstones we have to think about order. Let's take an example not related 
to your data model: say you have a "updated_at" column, maybe you always want 
to read the newest data (e.g. < 7 days) while oldest ones will be TTL'ed 
(tombstones). If you order your data by "updated_at DESC" (and TTL>7 days and 
there are no manual deletes) you won't read tombstones.
The second point depends on many factors: gc_grace, compaction strategy, 
compaction throughput, number of compactors, IO performances, #CPUs, ...    
Also, with such a data model, you will have unbalance data distribution. What 
if a user has 1,000,000 files or more?You can use a composite partition key to 
avoid that: PRIMARY KEY ((userid, fileid), ...).The data distribution will be 
much better and on top of that you won't read tombstones when a file is deleted 
(because you won't query the partition key at all). *However if you always read 
many files per user, each query will hit many nodes.*You have to decide 
depending on the query pattern, the average/max number of files per user, the 
average/max file size, etc.
Regarding the compaction strategy, LCS is good for read heavy workload but you 
need good disk IO and enough CPUs/vCPUs (watch out if your write workload is 
quite heavy).The LCS will compact frequently so, *if tombstones are evictable*, 
they will be evicted faster that with STCS.As you mentioned, you have 10 days 
of gc_grace so you might consider to lower this value if maintenance repair are 
running in few hours/days.
LCS is doing a good job with updates and that gives me an idea: what about soft 
deletes? A clustering column "status int" could do the trick. Let's say 
1=>"live file", 2=>"to delete".When a user deletes a file, you set the "status" 
to 2 and write the userid and fileid in a table "files_to_delete" (the 
partition key can be the date of the day if there are not millions of deletion 
per day). Then a batch job can run during off-peak hours to delete i.e. add a 
tombstone on files to delete.In read queries you would have to add "WHERE 
status = 1 AND ...". Again it's just an idea that crosses my mind, I never 
tested this model, but maybe you can think about it. The bonus is that you can 
"undeleted" a file as long as the batch job has not been triggered.
Best,
Romain


    Le Jeudi 29 septembre 2016 11h31, Thomas Julian <thomasjul...@zoho.com> a 
écrit :
 

 Hello,

I have created a column family for User File Management.
CREATE TABLE "UserFile" ("USERID" bigint,"FILEID" text,"FILETYPE" 
int,"FOLDER_UID" text,"FILEPATHINFO" text,"JSONCOLUMN" text,PRIMARY KEY 
("USERID","FILEID"));

Sample Entry

(4*************003, 3f9**************************6a1, null, 2 , 
[{"FOLDER_TYPE":"-1","UID":"1","FOLDER":"\"HOME\""}] 
,{"filename":"untitled","size":1,"kind":-1,"where":""})


Queries :

Select "USERID","FILEID","FILETYPE","FOLDER_UID","JSONCOLUMN" from "UserFile" 
where "USERID"=<value> and "FILEID" in (<value>,<value>,...)

Select "USERID","FILEID","FILEPATHINFO" from "UserFile" where "USERID"=<value> 
and "FILEID" in (<value>,<value>,...) 

This column family was perfectly working in our lab. I was able to fetch the 
results for the queries stated at less than 10ms. I deployed this in 
production(Cassandra 2.1.13), It was working perfectly for a month or two. But 
now at times the queries are taking 5s to 10s. On analysing further, I found 
that few users are deleting the files too frequently. This generates too many 
tombstones. I have set the gc_grace_seconds to the default 10 days and I have 
chosen SizeTieredCompactionStrategy. I want to optimise this Data Model for 
read efficiency. 

Any help is much appreciated.

Best Regards,
Julian.

Re: Optimising the data model for reads

Reply via email to