[
https://issues.apache.org/jira/browse/CASSANDRA-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299920#comment-14299920
]
Benedict commented on CASSANDRA-8689:
-------------------------------------
I've been meaning to file a bug about this for a while, once I had
investigated whether it actually happens anywhere, and it looks to me like this
may be it in action (in one of many potential bolt holes).
In markCompacting we don't ensure the sstable we're marking compacting is
actually in the set of live sstables. So we really need to take a reference
either before or after marking compacting, or we need to ensure the
intersection with the live set is the same as the set we are marking. We can't
do this universally, though, because we sometimes deliberately markCompacting
files not in the live set.
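
A minimal sketch of the "intersection with the live set" check described
above. The class and method names (LiveSetTracker, View, drop) are hypothetical
stand-ins, not Cassandra's actual tracker code, and sstables are represented as
plain strings. It covers only the strict case; callers that deliberately mark
files outside the live set would need a separate path.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the tracker state; not Cassandra's real classes.
final class LiveSetTracker {
    // Immutable snapshot of the live and compacting sets.
    static final class View {
        final Set<String> live;
        final Set<String> compacting;

        View(Set<String> live, Set<String> compacting) {
            this.live = Collections.unmodifiableSet(new HashSet<String>(live));
            this.compacting = Collections.unmodifiableSet(new HashSet<String>(compacting));
        }
    }

    private final AtomicReference<View> view;

    LiveSetTracker(Set<String> initialLive) {
        view = new AtomicReference<View>(new View(initialLive, Collections.<String>emptySet()));
    }

    // Marks the candidates compacting only if every one of them is still live
    // and none is already compacting. The check and the update are applied to
    // the same snapshot via compare-and-set, so a concurrent drop cannot slip
    // in between them.
    boolean markCompacting(Set<String> candidates) {
        while (true) {
            View current = view.get();
            if (!current.live.containsAll(candidates))
                return false; // at least one candidate has already been dropped
            if (!Collections.disjoint(current.compacting, candidates))
                return false; // at least one candidate is already compacting
            Set<String> compacting = new HashSet<String>(current.compacting);
            compacting.addAll(candidates);
            if (view.compareAndSet(current, new View(current.live, compacting)))
                return true;
            // Another thread changed the view; retry against the new snapshot.
        }
    }

    // Removes an sstable from the live set.
    void drop(String sstable) {
        while (true) {
            View current = view.get();
            Set<String> live = new HashSet<String>(current.live);
            live.remove(sstable);
            if (view.compareAndSet(current, new View(live, current.compacting)))
                return;
        }
    }
}
```

With this shape, a caller that loses the race gets false back and can skip the
file instead of working on something that has already been cleaned up.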
Anyway, the upshot is that if we don't do this we can start working on a file
that has been dropped and cleaned up. The race window for grabbing such a file
is narrow, but the window over which it can cause problems once we do is
unbounded.
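
The alternative mentioned above, taking a reference around the mark, can be
sketched roughly as follows. RefCountedResource and its methods are
hypothetical and only illustrate the idea; this is not the SSTableReader API.
The point is that acquiring a reference fails once the count has reached zero,
so a file that has already been dropped and cleaned up can never be picked up
again.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical reference-counted handle; a stand-in for an sstable whose
// off-heap resources are freed when the last reference is released.
final class RefCountedResource {
    // Starts at 1: the reference held by the owner (e.g. the live set).
    private final AtomicInteger refs = new AtomicInteger(1);
    private volatile boolean cleanedUp = false;

    // Attempts to take an extra reference. Returns false if the resource has
    // already been fully released, in which case the caller must not use it.
    boolean tryRef() {
        while (true) {
            int n = refs.get();
            if (n <= 0)
                return false;
            if (refs.compareAndSet(n, n + 1))
                return true;
        }
    }

    // Releases one reference; the last release performs the cleanup.
    void release() {
        if (refs.decrementAndGet() == 0)
            cleanedUp = true; // stand-in for freeing off-heap memory
    }

    boolean isCleanedUp() {
        return cleanedUp;
    }
}
```

A worker would call tryRef() before marking the file compacting, skip the file
when it returns false, and call release() in a finally block once done.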
IndexSummaryManager.getCompactingAndNonCompactingSSTables looks to me to
exhibit this race condition.
> Assertion error in 2.1.2: ERROR [IndexSummaryManager:1]
> -------------------------------------------------------
>
> Key: CASSANDRA-8689
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8689
> Project: Cassandra
> Issue Type: Bug
> Reporter: Jeff Liu
> Fix For: 2.1.3
>
>
> After upgrading a 6-node Cassandra cluster from 2.1.0 to 2.1.2, we started
> getting the following assertion error.
> {noformat}
> ERROR [IndexSummaryManager:1] 2015-01-26 20:55:40,451 CassandraDaemon.java:153 - Exception in thread Thread[IndexSummaryManager:1,1,main]
> java.lang.AssertionError: null
>     at org.apache.cassandra.io.util.Memory.size(Memory.java:307) ~[apache-cassandra-2.1.2.jar:2.1.2]
>     at org.apache.cassandra.io.sstable.IndexSummary.getOffHeapSize(IndexSummary.java:192) ~[apache-cassandra-2.1.2.jar:2.1.2]
>     at org.apache.cassandra.io.sstable.SSTableReader.getIndexSummaryOffHeapSize(SSTableReader.java:1070) ~[apache-cassandra-2.1.2.jar:2.1.2]
>     at org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(IndexSummaryManager.java:292) ~[apache-cassandra-2.1.2.jar:2.1.2]
>     at org.apache.cassandra.io.sstable.IndexSummaryManager.redistributeSummaries(IndexSummaryManager.java:238) ~[apache-cassandra-2.1.2.jar:2.1.2]
>     at org.apache.cassandra.io.sstable.IndexSummaryManager$1.runMayThrow(IndexSummaryManager.java:139) ~[apache-cassandra-2.1.2.jar:2.1.2]
>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.1.2.jar:2.1.2]
>     at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:77) ~[apache-cassandra-2.1.2.jar:2.1.2]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_45]
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) [na:1.7.0_45]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) [na:1.7.0_45]
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.7.0_45]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_45]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_45]
>     at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> {noformat}
> The Cassandra service is still running despite the issue. Each node has 8G
> of memory in total, with 2G allocated to the heap. We are basically only
> running read queries to retrieve data out of Cassandra.