[ 
https://issues.apache.org/jira/browse/CASSANDRA-11407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Kale updated CASSANDRA-11407:
-------------------------------------
    Summary: Proposal for simplified DTCS  (was: Proposal for a simple DTCS)

> Proposal for simplified DTCS
> ----------------------------
>
>                 Key: CASSANDRA-11407
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11407
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction
>            Reporter: Anubhav Kale
>         Attachments: 0001-Simple-DTCS.patch
>
>
> Today's DTCS implementation has been discussed and debated in a few JIRAs 
> already (the notable one is 
> https://issues.apache.org/jira/browse/CASSANDRA-9666). One of the main 
> challenges with the current approach is that it is very difficult to reason 
> about how the "Target" class makes buckets, thus making it difficult to 
> reason about the expected file layout on disk.
> I am proposing a simplification of the current approach that keeps intact
> most of the DTCS properties that make it a great fit for time-series data.
> The simplification is as follows.
> Given the min and max timestamps across all SSTables in question, start from
> min and make windows based on base and min_threshold. The logic in GetWindow
> simply tries to fit maximum-sized windows from min to max.
> This keeps the DTCS properties intact, except that we don't need to wait for
> min_threshold windows before making a bigger one. I would argue this
> simplifies the algorithm to a great extent and is easy to reason about, and
> the end result isn't drastically different from the original DTCS in most
> cases. We give up the "alignment" logic in the current class, but I honestly
> don't think it buys us much besides complexity.
> The implementation can obviously be optimized and cleaned up more if folks 
> think this is a good idea. 
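
The windowing described above can be sketched roughly as follows. This is not the code in the attached patch, which is not reproduced here. It is a minimal Python illustration of one possible reading of the description: window sizes are assumed to be base * min_threshold^k, and at each step, starting from the oldest timestamp, the largest such size that still fits before the upper bound is chosen. The function name make_windows and the exclusive upper bound end_ts are hypothetical and chosen only for this sketch.

```python
from typing import List, Tuple


def make_windows(min_ts: int, end_ts: int, base: int,
                 min_threshold: int) -> List[Tuple[int, int]]:
    """Split [min_ts, end_ts) into half-open windows, oldest first.

    Assumption (not confirmed by the patch): window sizes are
    base * min_threshold**k, and each step greedily takes the largest
    size that does not run past end_ts.
    """
    if base <= 0 or min_threshold < 2:
        raise ValueError("base must be > 0 and min_threshold must be >= 2")

    windows: List[Tuple[int, int]] = []
    start = min_ts
    while start < end_ts:
        size = base
        # Grow the window while the next larger size still fits.
        while start + size * min_threshold <= end_ts:
            size *= min_threshold
        # The last window may extend past end_ts when less than one
        # base unit of time remains.
        windows.append((start, start + size))
        start += size
    return windows


if __name__ == "__main__":
    for window in make_windows(0, 100, base=1, min_threshold=4):
        print(window)
```

Under these assumptions, make_windows(0, 100, base=1, min_threshold=4) yields (0, 64), (64, 80), (80, 96), (96, 100): the oldest data lands in the largest window and windows shrink toward the newest data, without any alignment to a fixed origin.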



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
