[
https://issues.apache.org/jira/browse/CASSANDRA-8547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aleksey Yeschenko updated CASSANDRA-8547:
-----------------------------------------
Fix Version/s: (was: 2.1.x)
> Make RangeTombstone.Tracker.isDeleted() faster
> ----------------------------------------------
>
> Key: CASSANDRA-8547
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8547
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Environment: 2.0.11
> Reporter: Dominic Letz
> Assignee: Dominic Letz
> Labels: tombstone
> Attachments: Selection_044.png, cassandra-2.0.11-8547.txt,
> cassandra-2.1-8547.txt, rangetombstone.tracker.txt
>
>
> During compaction and repairs with many tombstones, an exorbitant amount of
> time is spent in RangeTombstone.Tracker.isDeleted().
> The time spent there can be so large that compactions and repairs
> look "stalled", with the estimated time remaining frozen at the same value
> for days.
> Using VisualVM, I sample-profiled the code during execution and found this
> both during compaction and during repairs (point-in-time backtraces
> attached).
> Looking at the code, the problem is obviously the linear scan:
> {code}
> public boolean isDeleted(Column column)
> {
>     for (RangeTombstone tombstone : ranges)
>     {
>         if (comparator.compare(column.name(), tombstone.min) >= 0
>             && comparator.compare(column.name(), tombstone.max) <= 0
>             && tombstone.maxTimestamp() >= column.timestamp())
>         {
>             return true;
>         }
>     }
>     return false;
> }
> {code}
> I propose replacing the linear scan with a lookup in a sorted list (e.g.
> RangeTombstoneList).
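
To illustrate the idea of a sorted lookup, here is a minimal sketch. It is not the attached patch and not the real RangeTombstoneList API. It assumes the ranges do not overlap (a real implementation would have to handle overlapping tombstones, for example by splitting them), and it uses hypothetical stand-ins: a `Range` class with String bounds in place of RangeTombstone, and a `SortedTombstoneSketch` class in place of the tracker. Under those assumptions, a binary search on the range start replaces the linear scan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: sorted, NON-OVERLAPPING ranges searched by binary search.
class SortedTombstoneSketch {
    // Stand-in for RangeTombstone: inclusive [min, max] plus a deletion timestamp.
    static final class Range {
        final String min;
        final String max;
        final long timestamp;

        Range(String min, String max, long timestamp) {
            this.min = min;
            this.max = max;
            this.timestamp = timestamp;
        }
    }

    private final List<Range> ranges = new ArrayList<>();
    private final Comparator<String> comparator;

    SortedTombstoneSketch(Comparator<String> comparator, List<Range> input) {
        this.comparator = comparator;
        this.ranges.addAll(input);
        // Sort once by range start so that lookups can use binary search.
        this.ranges.sort((a, b) -> comparator.compare(a.min, b.min));
    }

    // O(log n) per lookup instead of O(n); valid only if ranges do not overlap.
    boolean isDeleted(String name, long timestamp) {
        int lo = 0;
        int hi = ranges.size() - 1;
        int candidate = -1;
        // Find the last range whose start is <= name.
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (comparator.compare(ranges.get(mid).min, name) <= 0) {
                candidate = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        if (candidate < 0) {
            return false; // name sorts before every range
        }
        Range r = ranges.get(candidate);
        // Same conditions as the original loop, checked for one range only.
        return comparator.compare(name, r.max) <= 0 && r.timestamp >= timestamp;
    }

    public static void main(String[] args) {
        SortedTombstoneSketch tracker = new SortedTombstoneSketch(
                Comparator.<String>naturalOrder(),
                Arrays.asList(new Range("f", "h", 20L), new Range("b", "d", 10L)));
        System.out.println(tracker.isDeleted("c", 5L));  // inside [b, d], older than the tombstone
        System.out.println(tracker.isDeleted("e", 5L));  // falls between the two ranges
        System.out.println(tracker.isDeleted("g", 25L)); // inside [f, h], but newer than the tombstone
    }
}
```

With n tombstones and m columns, this reduces the work from roughly n * m comparisons to about m * log(n), which is where the speedup for tombstone-heavy compactions and repairs would come from.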
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)