[
https://issues.apache.org/jira/browse/CASSANDRA-6446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Oleg Anastasyev updated CASSANDRA-6446:
---------------------------------------
Summary: Faster range tombstones on wide partitions (was: Faster range
tombstones on wide rows)
> Faster range tombstones on wide partitions
> ------------------------------------------
>
> Key: CASSANDRA-6446
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6446
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Oleg Anastasyev
> Attachments: RangeTombstonesReadOptimization.diff,
> RangeTombstonesWriteOptimization.diff
>
>
> With wide partitions (~1M CQL rows in a single partition), we found
> inefficiencies in the handling of range tombstones on both the write and
> read paths after deleting some of the rows.
> I have attached two patches: one for the write path
> (RangeTombstonesWriteOptimization.diff) and one for the read path
> (RangeTombstonesReadOptimization.diff).
> On the write path, each deletion of a CQL row by primary key is represented
> by a range tombstone. When this tombstone is put into the memtable, the
> original code takes every column of the partition from the memtable and
> checks DeletionInfo.isDeleted in a brute-force loop to decide whether the
> column should stay in the memtable or was deleted by the new tombstone.
> The more columns a partition has, the slower deletions become, because CPU
> time is spent on this brute-force range tombstone check.
> For partitions with more than 10000 columns, the
> RangeTombstonesWriteOptimization.diff patch loops over the tombstones
> instead and checks for the existence of columns covered by each of them.
> It also copies the whole memtable range tombstone list only if there are
> changes to be made to it (the original code copies the range tombstone list
> on every write).
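
The difference between the two write-path loops can be sketched roughly as
follows. This is a simplified model, not the code from the attached patch:
the class, the Tombstone type, the integer clustering keys and the
"written at or before deletedAt" rule are all assumptions made for the
illustration. Only the 10000-column threshold comes from the description
above.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative model only: none of these types are Cassandra classes.
final class RangeTombstoneWriteSketch {

    // Hypothetical tombstone: deletes keys in [start, end] written at or before deletedAt.
    static final class Tombstone {
        final int start, end;
        final long deletedAt;

        Tombstone(int start, int end, long deletedAt) {
            this.start = start;
            this.end = end;
            this.deletedAt = deletedAt;
        }

        boolean covers(int key, long writtenAt) {
            return key >= start && key <= end && writtenAt <= deletedAt;
        }
    }

    // Threshold quoted in the issue text.
    static final int WIDE_PARTITION_THRESHOLD = 10000;

    // Original approach as described: visit every column, test it against the tombstones.
    static int purgeByColumnScan(NavigableMap<Integer, Long> columns, List<Tombstone> tombstones) {
        int removed = 0;
        Iterator<Map.Entry<Integer, Long>> it = columns.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Integer, Long> e = it.next();
            for (Tombstone t : tombstones) {
                if (t.covers(e.getKey(), e.getValue())) {
                    it.remove();
                    removed++;
                    break;
                }
            }
        }
        return removed;
    }

    // Patched approach as described: visit each tombstone, touch only the columns in its range.
    static int purgeByTombstoneLookup(NavigableMap<Integer, Long> columns, List<Tombstone> tombstones) {
        int removed = 0;
        for (Tombstone t : tombstones) {
            Iterator<Long> it = columns.subMap(t.start, true, t.end, true).values().iterator();
            while (it.hasNext()) {
                if (it.next() <= t.deletedAt) {
                    it.remove();
                    removed++;
                }
            }
        }
        return removed;
    }

    // Picks a strategy by partition width, mirroring the threshold in the description.
    static int purge(NavigableMap<Integer, Long> columns, List<Tombstone> tombstones) {
        return columns.size() > WIDE_PARTITION_THRESHOLD
                ? purgeByTombstoneLookup(columns, tombstones)
                : purgeByColumnScan(columns, tombstones);
    }

    // Builds a partition with keys 0..size-1, each written at timestamp 1.
    static NavigableMap<Integer, Long> partition(int size) {
        NavigableMap<Integer, Long> columns = new TreeMap<>();
        for (int key = 0; key < size; key++) {
            columns.put(key, 1L);
        }
        return columns;
    }

    public static void main(String[] args) {
        NavigableMap<Integer, Long> columns = partition(20000);
        List<Tombstone> tombstones = Arrays.asList(new Tombstone(10, 19, 5L), new Tombstone(15, 24, 5L));
        System.out.println("removed " + purge(columns, tombstones) + " columns");
    }
}
```

With a sorted column map, the per-tombstone lookup costs roughly one tree
search plus the columns actually covered, so a single-row deletion no longer
walks the whole partition. The column scan may still be the cheaper choice
for narrow partitions, which is presumably why the patch keeps a threshold.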
> On the read path, the original code scans the whole range tombstone list of
> a partition to match sstable columns to their range tombstones. The
> RangeTombstonesReadOptimization.diff patch scans only the necessary range of
> tombstones, according to the filter used for the read.
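
The read-path idea can be illustrated in the same spirit. Again this is only
a sketch under stated assumptions, not the patch itself: the Range type, the
index keyed by range start, the integer clustering keys and the requirement
that tombstones do not overlap are all hypothetical simplifications, and the
"filter" is modelled as a plain slice [from, to].

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative model only: none of these types are Cassandra classes.
final class RangeTombstoneReadSketch {

    // Hypothetical tombstone covering clustering keys in [start, end].
    static final class Range {
        final int start, end;

        Range(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    // Builds an index keyed by range start. Assumes the ranges do not overlap.
    static NavigableMap<Integer, Range> index(int[][] ranges) {
        NavigableMap<Integer, Range> byStart = new TreeMap<>();
        for (int[] r : ranges) {
            byStart.put(r[0], new Range(r[0], r[1]));
        }
        return byStart;
    }

    // Original approach as described: test every tombstone of the partition.
    static List<Range> scanAll(NavigableMap<Integer, Range> byStart, int from, int to) {
        List<Range> out = new ArrayList<>();
        for (Range r : byStart.values()) {
            if (r.start <= to && r.end >= from) {
                out.add(r);
            }
        }
        return out;
    }

    // Patched approach as described: visit only tombstones that can intersect [from, to].
    static List<Range> scanSlice(NavigableMap<Integer, Range> byStart, int from, int to) {
        // The tombstone starting at or before `from` may still reach into the slice.
        Map.Entry<Integer, Range> before = byStart.floorEntry(from);
        int lower = (before != null && before.getValue().end >= from) ? before.getKey() : from;
        return new ArrayList<>(byStart.subMap(lower, true, to, true).values());
    }

    public static void main(String[] args) {
        NavigableMap<Integer, Range> idx = index(new int[][] {{0, 4}, {10, 14}, {20, 24}, {30, 34}});
        for (Range r : scanSlice(idx, 12, 22)) {
            System.out.println("[" + r.start + ", " + r.end + "]");
        }
    }
}
```

Both methods return the same tombstones for a given slice. The sliced
variant does a tree lookup and then walks only the matching entries, which
matters when a partition carries many tombstones and the read touches a
small part of it.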