[
https://issues.apache.org/jira/browse/CASSANDRA-5561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jonathan Ellis updated CASSANDRA-5561:
--------------------------------------
    Fix Version/s: (was: 2.0)
                   2.0.1
> Compaction strategy that minimizes re-compaction of old/frozen data
> -------------------------------------------------------------------
>
> Key: CASSANDRA-5561
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5561
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Affects Versions: 1.2.3
> Reporter: Tupshin Harper
> Fix For: 2.0.1
>
>
> Neither LCS nor STCS is good for data that becomes immutable over time. The
> most obvious case is time-series data where the application can guarantee
> that out-of-order delivery (to Cassandra) of an event can't take place more
> than N minutes/seconds/hours/days after the event's real (wall-clock) time.
> There are various approaches that could involve paying attention to the row
> keys (if they include a time component) and/or the column names (if they are
> TimeUUID- or integer-based and hence inherently time-ordered), but it might
> be sufficient to just look at the timestamps of the columns themselves.
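> To make the timestamp check concrete, here is a rough sketch in Java.
> SSTableMeta and newestColumnTimestampMicros() are made-up stand-ins for
> whatever max-timestamp metadata the real sstable reader exposes, not the
> actual API:
>
>     // Hypothetical handle onto an sstable's metadata (not the real API).
>     interface SSTableMeta {
>         // Assumed to return the newest column timestamp recorded in the
>         // sstable, in microseconds since the epoch.
>         long newestColumnTimestampMicros();
>     }
>
>     final class FrozenCheck {
>         // An sstable is "frozen" when every column in it is older than
>         // the configured max-out-of-order window.
>         static boolean isFrozen(SSTableMeta sstable, long windowMicros) {
>             long nowMicros = System.currentTimeMillis() * 1000L;
>             return sstable.newestColumnTimestampMicros() < nowMicros - windowMicros;
>         }
>     }
>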
> A possible approach:
> 1) Define an optional max-out-of-order window on a per-CF basis.
> 2) Use the normal (LCS or STCS) compaction strategy for any SSTables that
> include any columns younger than the max-out-of-order window.
> 3) Use an alternate compaction strategy (call it TWCS, time window
> compaction strategy, for now) for any SSTables that contain only columns
> older than the max-out-of-order window.
> 4) TWCS will only compact sstables containing data older than the
> max-out-of-order window.
> 5) TWCS will only perform compaction to reduce row fragmentation (if any
> remains by the time data reaches TWCS) or to reduce the number of small
> sstables.
> 6) To minimize re-compaction in TWCS, it should aggressively try to compact
> as many small sstables as possible into one large sstable that would never
> have to get recompacted (see the sketch after this list).
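> As a rough illustration of steps 2-6 (again with hypothetical names,
> reusing SSTableMeta and FrozenCheck from the sketch above; a real
> implementation would plug into the compaction strategy interfaces instead):
>
>     import java.util.ArrayList;
>     import java.util.List;
>
>     final class TwcsSelection {
>         // Steps 2-3: partition candidates by freshness. SSTables with
>         // any column younger than the window stay with the normal
>         // strategy; fully frozen sstables fall through to TWCS.
>         static List<SSTableMeta> frozenCandidates(List<SSTableMeta> sstables,
>                                                   long windowMicros) {
>             List<SSTableMeta> frozen = new ArrayList<>();
>             for (SSTableMeta s : sstables)
>                 if (FrozenCheck.isFrozen(s, windowMicros))
>                     frozen.add(s);
>             // Steps 4-6: TWCS would compact everything in 'frozen'
>             // together, once, into a single large sstable that should
>             // never need recompaction.
>             return frozen;
>         }
>     }
>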
> In the case of large datasets (e.g. 5TB per node) with LCS, there would be
> on the order of seven levels, and hence seven separate writes of the same
> data over time. With this approach, it should be possible to get down to
> about three compactions per column (two under the normal strategy and one
> more once the data reaches TWCS) in most cases, cutting the write workload
> by a factor of two or more for high-volume time-series applications.
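> The level count above is back-of-the-envelope; here is the arithmetic,
> assuming a 10x per-level fanout (the per-sstable target size has varied
> across versions, so both inputs are parameters):
>
>     final class LcsLevels {
>         // Rough number of LCS levels (including L0) needed to hold
>         // dataBytes, with each level holding 10x the previous one.
>         static int levels(double dataBytes, double sstableBytes) {
>             int n = 0;
>             double cap = sstableBytes;
>             while (cap < dataBytes) {
>                 cap *= 10;
>                 n++;
>             }
>             return n + 1;
>         }
>     }
>
> For example, levels(5e12, 5e6) returns 7: with 5MB sstables, a 5TB node
> spans L0 through L6, i.e. roughly seven rewrites of each column.
>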
> Note that the only workaround I can currently suggest to minimize compaction
> for these workloads is to programmatically shard your data across
> time-window ranges (e.g. a new CF per week), but that pushes unnecessary
> writing and querying logic out to the user and is neither as convenient nor
> as flexible.
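> For concreteness, the workaround looks something like this on the client
> side (the bucketing scheme and CF naming are made up for illustration):
>
>     final class WeekShard {
>         private static final long WEEK_MILLIS = 7L * 24 * 3600 * 1000;
>
>         // Pick a CF name from the event's timestamp, e.g. "events_w2262".
>         // A real deployment would more likely bucket on calendar weeks.
>         static String cfForTimestamp(long epochMillis) {
>             return "events_w" + (epochMillis / WEEK_MILLIS);
>         }
>     }
>
> Every reader then has to duplicate this bucketing and fan queries out
> across all the CFs its time range touches, which is exactly the
> inconvenience described above.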
> Also note that I am not convinced that the approach I've suggested above is
> the best/most general way to solve the problem, but it does appear to be a
> relatively easy one to implement.