[
https://issues.apache.org/jira/browse/CASSANDRA-6666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078171#comment-14078171
]
Vishal Mehta commented on CASSANDRA-6666:
-----------------------------------------
Hello everyone,
Please pardon my ignorance; this is the first time I am commenting on an
open-source bug report.
I think I recently hit this bug, because I saw similar symptoms in my 3-node
Cassandra setup. I am running a test at around 12K qps (inserts into 3
different tables) with the TTL set to 1 hour, and the keyspace has GC seconds
set to 14400 (4 hours).
The test eventually runs to a point where Cassandra scans more than 100K
tombstones, and it crashes with the following exception in
/var/log/cassandra/cassandra.log:
{noformat}
ERROR 13:23:56,747 Scanned over 100000 tombstones in system.hints; query aborted (see tombstone_fail_threshold)
ERROR 13:23:56,962 Exception in thread Thread[HintedHandoff:1,1,main]
org.apache.cassandra.db.filter.TombstoneOverwhelmingException
    at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:202)
    at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
    at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1547)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1376)
    at org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:373)
    at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:330)
    at org.apache.cassandra.db.HintedHandOffManager.access$300(HintedHandOffManager.java:91)
    at org.apache.cassandra.db.HintedHandOffManager$5.run(HintedHandOffManager.java:547)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
INFO 13:24:00,987 No gossip backlog; proceeding
{noformat}
*Note:* Would it be reasonable to set GC seconds closer to the TTL? Also, I
could see that one of the nodes deleted all the records from disk and freed up
the space, whereas the other two nodes never deleted their tombstones.
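To put the GC-seconds question in rough numbers, here is a minimal
back-of-the-envelope sketch using only the figures quoted above (12K qps, 1 hour
TTL, 14400 GC seconds). It assumes one cell per insert, a constant write rate,
and that an expired cell lingers as a tombstone for about the GC seconds value
before compaction can drop it. The function name estimate_cells is hypothetical,
and real purge timing depends on compaction, so treat the output as an order of
magnitude only.

```python
# Rough steady-state estimate. Every input is an assumption taken from the
# numbers quoted in this comment, not a measurement from the cluster.

def estimate_cells(writes_per_sec, ttl_seconds, gc_grace_seconds):
    """Return (live_cells, tombstoned_cells) at steady state.

    Assumes one cell per insert, a constant write rate, and that an expired
    cell is kept as a tombstone for roughly gc_grace_seconds after it
    expires. Actual purge timing depends on when compaction runs.
    """
    live = writes_per_sec * ttl_seconds          # cells still inside their TTL
    tombstoned = writes_per_sec * gc_grace_seconds  # expired, not yet purgeable
    return live, tombstoned


live, dead = estimate_cells(12_000, 3600, 14400)
print(f"live cells:               {live:,}")
print(f"tombstoned cells:         {dead:,}")
print(f"tombstones per live cell: {dead / live:.1f}")
```

Under these assumptions the ratio of tombstoned to live cells is simply GC
seconds divided by TTL, so bringing the two values closer together shrinks the
tombstone backlog proportionally. Note that the trace above was raised while
scanning system.hints, so this estimate only illustrates the user-table side of
the question.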
> Avoid accumulating tombstones after partial hint replay
> -------------------------------------------------------
>
> Key: CASSANDRA-6666
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6666
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Jonathan Ellis
> Priority: Minor
> Labels: hintedhandoff
> Fix For: 2.0.10
>
> Attachments: 6666.txt, cassandra_system.log.debug.gz
>
>
--
This message was sent by Atlassian JIRA
(v6.2#6252)