Hi, 


I need to store a distinct value for each key in each hour. The problem with
saving everything and computing the distinct values in memory is that there
is too much repeated data.

Table schema:

CREATE TABLE distinct (
    hourNumber int,
    key text,
    distinctValue bigint,
    PRIMARY KEY (hourNumber)
);



I want to retrieve the distinct count of all keys in a specific hour. With this
data model, that can be done by reading a single partition.
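
A minimal sketch of the read I have in mind, assuming the schema above. The
hour value 437000 is only a placeholder for illustration:

```cql
-- Read everything stored for one hour (a single partition).
-- 437000 is a made-up hourNumber, not a value from my data.
SELECT key, distinctValue
FROM distinct
WHERE hourNumber = 437000;
```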

The problem: I can't read from this table. system.log indicates that more than
100K tombstones were read, with no live data among them. gc_grace_seconds is
at the default (10 days), so I thought of decreasing it to 1 hour and running
compaction. But is this the right approach at all? I mean, the whole idea of
replacing some millions of rows, each about 10 times, in a partition again and
again, which creates a lot of tombstones, just to achieve distinct behavior?
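
For reference, the change I am considering would look roughly like the sketch
below. The keyspace name my_keyspace is a placeholder, and 3600 is simply
1 hour expressed in seconds:

```cql
-- Lower gc_grace_seconds from the default 864000 (10 days) to 1 hour.
-- my_keyspace is a hypothetical keyspace name.
ALTER TABLE my_keyspace.distinct WITH gc_grace_seconds = 3600;
-- A compaction would then be run on each node, e.g. with:
--   nodetool compact my_keyspace distinct
```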



Thanks in advance

