[ 
https://issues.apache.org/jira/browse/CASSANDRA-1658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932584#action_12932584
 ] 

Jonathan Ellis commented on CASSANDRA-1658:
-------------------------------------------

FWIW, this seems like much less bang-for-complexity-buck than limiting sstable 
size as in CASSANDRA-1608 to me.  (They are not mutually exclusive but I would 
like to see how urgent this feels after limiting size is done, first.)

> support incremental sstable switching
> -------------------------------------
>
>                 Key: CASSANDRA-1658
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1658
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Peter Schuller
>            Priority: Minor
>
> I have been thinking about how to minimize the impact of compaction further 
> beyond CASSANDRA-1470. 1470 deals with the impact of the compaction process 
> itself in that it avoids going through the buffer cache; however, once 
> compaction is complete you are still switching to new sstables which will 
> imply cold reads.
> Instead of switching all at once, one could keep both the old and new 
> sstables around for a bit and incrementally switch over traffic to the new 
> sstables.
> A given request would go to the new or old sstable depending on e.g. the hash 
> of the row key coupled with the point in time relative to compaction 
> completion and relative to the intended target sstable switch-over.
> In terms of end-user configuration/mnemonics, one would specify, for a given 
> column family, something like "sstable transition period per gb of data" or 
> similar. The "per gb of data" would refer to the size of the newly written 
> sstable after a compaction. So, for a major compaction, you would wait for a 
> very significant period of time since the entire database just went cold. For 
> a minor compaction, you would only wait for a short period of time.
> The result should be a modest cost in e.g. disk space usage, but hopefully a 
> very significant benefit in terms of making the sstable transition as smooth 
> as possible for the node.
> I like this because it feels pretty simple, does not rely on OS-specific 
> features or other support from the OS beyond a "well functioning cache 
> mechanism", and does not imply something hugely significant like writing our 
> own page cache layer. The CPU overhead should be very small, but the 
> improvement in terms of disk I/O should be very significant for workloads 
> where it matters.
> The feature would be optional and per-sstable (or possibly global for the 
> node).
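
The routing rule in the description (hash of the row key combined with the time elapsed since compaction completed, with a transition period scaled per GB of new sstable) could be sketched roughly as below. This is only an illustration under stated assumptions, not Cassandra code: the class name IncrementalSwitch, the method names, the use of CRC32 as the hash, the bucket count, and the linear ramp are all hypothetical choices made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical sketch; none of these names exist in Cassandra.
final class IncrementalSwitch {
    // Number of hash buckets used to ramp traffic (assumed granularity).
    static final int BUCKETS = 10_000;

    // "sstable transition period per gb of data": scale the period by the
    // size of the newly written sstable.
    static long transitionMillis(long newSstableBytes, long millisPerGb) {
        double gb = newSstableBytes / (1024.0 * 1024.0 * 1024.0);
        return Math.round(gb * millisPerGb);
    }

    // Stable bucket in [0, BUCKETS) for a row key. CRC32 is an assumption;
    // any hash that is stable per key would do.
    static int bucket(byte[] rowKey) {
        CRC32 crc = new CRC32();
        crc.update(rowKey);
        return (int) (crc.getValue() % BUCKETS);
    }

    // Decide whether a read for rowKey should go to the new sstable.
    // The share of buckets routed to the new sstable grows linearly from
    // 0 at compaction completion to all of them at the end of the period.
    static boolean useNewSstable(byte[] rowKey, long nowMillis,
                                 long compactionDoneMillis, long transitionMillis) {
        long elapsed = nowMillis - compactionDoneMillis;
        if (elapsed <= 0) {
            return false; // compaction has only just completed
        }
        if (transitionMillis <= 0 || elapsed >= transitionMillis) {
            return true;  // transition finished, or feature disabled
        }
        long threshold = elapsed * BUCKETS / transitionMillis;
        return bucket(rowKey) < threshold;
    }

    public static void main(String[] args) {
        byte[] key = "row-42".getBytes(StandardCharsets.UTF_8);
        long period = transitionMillis(2L << 30, 600_000L); // 2 GB, 10 min per GB
        for (long t = 0; t <= period; t += period / 4) {
            System.out.println(t + " ms: new=" + useNewSstable(key, t, 0L, period));
        }
    }
}
```

Because the threshold only increases with time, a given row key moves to the new sstable once and stays there, so each key pays for a cold read at most once during the transition.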

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
