Björn Hegerfors created CASSANDRA-8361:
------------------------------------------

             Summary: Make DTCS split SSTables to perfectly fit time windows
                 Key: CASSANDRA-8361
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8361
             Project: Cassandra
          Issue Type: Improvement
            Reporter: Björn Hegerfors
            Priority: Minor


DTCS tries to align SSTables to its time windows, because that structure is 
what gives the best performance. I filed CASSANDRA-8360, which takes SSTables 
one step closer to aligning 1:1 with these windows.

The idea in this ticket is to align SSTables perfectly with the DTCS time 
windows by splitting SSTables that cross window borders. The main benefit is 
probably consistency and predictability: for every value old enough to have 
stabilized, it would be well defined which SSTable stores it.
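
To make the splitting idea concrete, here is a minimal sketch in Python. It is 
not Cassandra code: it assumes fixed-size windows aligned to multiples of the 
window size (real DTCS windows grow with age), and the names window_start and 
split_by_window are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def window_start(timestamp, window_size):
    """Lower border of the fixed-size window that contains timestamp.

    Assumption: windows are aligned to multiples of window_size.
    """
    return timestamp - (timestamp % window_size)


def split_by_window(cells, window_size):
    """Partition (timestamp, value) pairs so no group crosses a window border.

    Each returned group stands in for one SSTable after the split.
    """
    groups = defaultdict(list)
    for timestamp, value in cells:
        groups[window_start(timestamp, window_size)].append((timestamp, value))
    # Sort by window start so the oldest window comes first.
    return dict(sorted(groups.items()))
```

With a one-day window (86400 seconds), a cell written one second before 
midnight and a cell written exactly at midnight end up in different groups, 
which is the border behaviour the proposal asks for.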

Read queries could then be aligned with windows to guarantee a single disk 
seek (although the client would need to know where the windows are placed). 
For example, SSTables could be made to align perfectly on day borders. Right 
now, an SSTable may almost represent a day, but not exactly, so some of that 
day's data still sits in another SSTable.
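
As a rough illustration of what "knowing the window placements" could mean on 
the client side, the Python sketch below computes the borders of the window 
around a timestamp and counts how many windows a query range overlaps. It 
assumes fixed-size windows aligned to multiples of the window size, which is a 
simplification of DTCS, and both function names are hypothetical.

```python
def window_bounds(timestamp, window_size):
    """Half-open [start, end) borders of the window containing timestamp.

    Assumption: windows are aligned to multiples of window_size.
    """
    start = (timestamp // window_size) * window_size
    return start, start + window_size


def windows_touched(query_start, query_end, window_size):
    """Number of windows overlapped by the half-open range [query_start, query_end).

    Requires query_end > query_start.
    """
    if query_end <= query_start:
        raise ValueError("query_end must be greater than query_start")
    first = query_start // window_size
    last = (query_end - 1) // window_size
    return last - first + 1
```

A range that exactly matches one window touches a single window, while a range 
that straddles a border touches two. Under the proposal, only the first kind of 
query could be served from one SSTable.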

It could also be a useful property for tombstone expiration and repairs.

Practically all splits would happen in the latest time windows, on the newest 
and smallest SSTables. Once those are split, DTCS would never compact SSTables 
across window borders. I have a hard time seeing when this would be expensive, 
except when switching from another compaction strategy (or even from current 
DTCS) and after a major compaction. In fact, major compaction for DTCS should 
place data perfectly in windows rather than putting everything in one SSTable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
