[
https://issues.apache.org/jira/browse/CASSANDRA-11831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15289223#comment-15289223
]
Jonathan Ellis commented on CASSANDRA-11831:
--------------------------------------------
Some thoughts:
# Manually disabling the check is a big hammer we can add easily enough for 2.1
(a rough sketch of what that guard could look like follows this list).
# Moving forward, reducing L0 backup (CASSANDRA-10979, CASSANDRA-11833) is
probably the biggest factor in avoiding this situation.
# I'm not a fan of exposing this more widely than a -D flag, because disabling
tombstone purge is very dangerous in its own right.
# We could try to be clever and add a threshold (e.g. <1% tombstones in the
sstable) below which we don't bother purging. However, if we have few
tombstones then ipso facto we won't be calling this code very often anyway.
And if we have a lot of tombstones then we will want to purge those, again
unless L0 is pathological (point #2). So I keep coming back to that as the
real problem.
# If we can in fact keep L0 growth under control then I don't think we need to
add behavior like "skip the check if there are more than N sstables involved."
> Ability to disable purgeable tombstone check via startup flag
> -------------------------------------------------------------
>
> Key: CASSANDRA-11831
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11831
> Project: Cassandra
> Issue Type: New Feature
> Reporter: Ryan Svihla
>
> On Cassandra 2.1.14, when a node gets way behind and has tens of thousands of
> sstables, it appears a lot of the CPU time is spent doing checks like this on
> a call to getMaxPurgeableTimestamp:
> org.apache.cassandra.utils.Murmur3BloomFilter.hash(java.nio.ByteBuffer, int, int, long, long[]) @bci=13, line=57 (Compiled frame; information may be imprecise)
> - org.apache.cassandra.utils.BloomFilter.indexes(java.nio.ByteBuffer) @bci=22, line=82 (Compiled frame)
> - org.apache.cassandra.utils.BloomFilter.isPresent(java.nio.ByteBuffer) @bci=2, line=107 (Compiled frame)
> - org.apache.cassandra.db.compaction.CompactionController.maxPurgeableTimestamp(org.apache.cassandra.db.DecoratedKey) @bci=89, line=186 (Compiled frame)
> - org.apache.cassandra.db.compaction.LazilyCompactedRow.getMaxPurgeableTimestamp() @bci=21, line=99 (Compiled frame)
> - org.apache.cassandra.db.compaction.LazilyCompactedRow.access$300(org.apache.cassandra.db.compaction.LazilyCompactedRow) @bci=1, line=49 (Compiled frame)
> - org.apache.cassandra.db.compaction.LazilyCompactedRow$Reducer.getReduced() @bci=241, line=296 (Compiled frame)
> - org.apache.cassandra.db.compaction.LazilyCompactedRow$Reducer.getReduced() @bci=1, line=206 (Compiled frame)
> - org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext() @bci=44, line=206 (Compiled frame)
> - com.google.common.collect.AbstractIterator.tryToComputeNext() @bci=9, line=143 (Compiled frame)
> - com.google.common.collect.AbstractIterator.hasNext() @bci=61, line=138 (Compiled frame)
> - com.google.common.collect.Iterators$7.computeNext() @bci=4, line=645 (Compiled frame)
> - com.google.common.collect.AbstractIterator.tryToComputeNext() @bci=9, line=143 (Compiled frame)
> - com.google.common.collect.AbstractIterator.hasNext() @bci=61, line=138 (Compiled frame)
> - org.apache.cassandra.db.ColumnIndex$Builder.buildForCompaction(java.util.Iterator) @bci=1, line=166 (Compiled frame)
> - org.apache.cassandra.db.compaction.LazilyCompactedRow.write(long, org.apache.cassandra.io.util.DataOutputPlus) @bci=52, line=121 (Compiled frame)
> - org.apache.cassandra.io.sstable.SSTableWriter.append(org.apache.cassandra.db.compaction.AbstractCompactedRow) @bci=18, line=193 (Compiled frame)
> - org.apache.cassandra.io.sstable.SSTableRewriter.append(org.apache.cassandra.db.compaction.AbstractCompactedRow) @bci=13, line=127 (Compiled frame)
> - org.apache.cassandra.db.compaction.CompactionTask.runMayThrow() @bci=666, line=197 (Compiled frame)
> - org.apache.cassandra.utils.WrappedRunnable.run() @bci=1, line=28 (Compiled frame)
> - org.apache.cassandra.db.compaction.CompactionTask.executeInternal(org.apache.cassandra.db.compaction.CompactionManager$CompactionExecutorStatsCollector) @bci=6, line=73 (Compiled frame)
> - org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(org.apache.cassandra.db.compaction.CompactionManager$CompactionExecutorStatsCollector) @bci=2, line=59 (Compiled frame)
> - org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run() @bci=125, line=264 (Compiled frame)
> - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=511 (Compiled frame)
> - java.util.concurrent.FutureTask.run() @bci=42, line=266 (Compiled frame)
> - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1142 (Compiled frame)
> - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=617 (Compiled frame)
> - java.lang.Thread.run() @bci=11, line=745 (Compiled frame)
> It would help if we could at least pass a startup flag like
> -DskipTombstonePurgeCheck so that in these particularly bad cases we could
> skip the calculation and merge tables until we have less to worry about, then
> restart the node without that flag once we're down to a more manageable
> number of sstables.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)