[
https://issues.apache.org/jira/browse/CASSANDRA-7203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jonathan Ellis resolved CASSANDRA-7203.
---------------------------------------
Resolution: Won't Fix
I'm okay with accepting that our performance will be suboptimal if you have
partitions with wildly different workload characteristics.
> Flush (and Compact) High Traffic Partitions Separately
> ------------------------------------------------------
>
> Key: CASSANDRA-7203
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7203
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Benedict
> Labels: compaction, performance
>
> An idea possibly worth exploring is the use of streaming count-min sketches
> to collect data over the up-time of a server to estimate the velocity of
> different partitions, so that high-volume partitions can be flushed
> separately on the assumption that they will be much smaller in number, thus
> reducing write amplification by permitting compaction independently of any
> low-velocity data.
> Whilst the idea is reasonably straightforward, it seems that the biggest
> problem here will be defining any success metric. Obviously any workload
> following an exponential/Zipf/extreme distribution is likely to benefit from
> such an approach, but whether or not that would translate in real terms is
> another matter.
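
For readers unfamiliar with the technique named in the description, the following is a minimal sketch of how a count-min sketch could track approximate per-partition write counts and split keys into high- and low-velocity groups at flush time. It is not Cassandra code (the issue was resolved Won't Fix). The class, the `split_for_flush` helper, the width/depth values, the threshold, and the partition key names are all hypothetical choices made for illustration.

```python
import hashlib


class CountMinSketch:
    """Fixed-memory approximate counters (hypothetical illustration)."""

    def __init__(self, width=1024, depth=4):
        # width and depth are assumed values; larger values reduce collisions.
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _indexes(self, key):
        # One hash per row, obtained by salting the hash with the row number.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                key.encode("utf-8"),
                digest_size=8,
                salt=row.to_bytes(2, "big"),
            ).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, key, count=1):
        # Record `count` writes against the partition key.
        for row, col in self._indexes(key):
            self.rows[row][col] += count

    def estimate(self, key):
        # Collisions can only inflate a counter, so the minimum across rows
        # is an upper-biased estimate that never undercounts.
        return min(self.rows[row][col] for row, col in self._indexes(key))


def split_for_flush(partition_keys, sketch, threshold):
    """Separate keys whose estimated write count reaches `threshold`
    (an assumed cut-off) from the rest."""
    hot, cold = [], []
    for key in partition_keys:
        (hot if sketch.estimate(key) >= threshold else cold).append(key)
    return hot, cold


sketch = CountMinSketch()
for _ in range(500):          # one partition receives most of the writes
    sketch.add("user:hot")
for i in range(200):          # many partitions receive one write each
    sketch.add(f"user:{i}")

hot, cold = split_for_flush(["user:hot", "user:7"], sketch, threshold=100)
```

Because the estimate never undercounts, a truly high-volume partition is always classified as hot. The possible error runs the other way: a low-volume key that collides with a hot key in every row could be misclassified as hot. A real implementation would also need to decay or reset the counters over time to measure velocity rather than lifetime totals, which this sketch omits.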
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)