[ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jason Harvey updated CASSANDRA-4492:
------------------------------------
Attachment: (was: jstack)
> Large HintsColumnFamily compactions hang
> ----------------------------------------
>
> Key: CASSANDRA-4492
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4492
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: 1.0.11
> Reporter: Jason Harvey
> Priority: Minor
>
> I'm running into an issue on a 6-node ring running 1.0.11: whenever a
> somewhat large set of hints builds up (seen with as little as 400MB),
> compaction on the hints CF hangs indefinitely. There is nothing of note in
> the logs. In some cases, the compaction hangs before a tmp sstable is even
> created.
> I've wiped out every hints sstable I have and restarted several times. The
> issue always comes back rather quickly and predictably after wiping the
> sstables. Compaction always seems to succeed if the hints CFs are rather
> small.
> Hints are enabled, and my hint window is the default of 1 hour. I do have
> some copies of HintsColumnFamily sstables that reproduce this issue. However,
> the hints may contain confidential data. If they'd be helpful in
> troubleshooting this issue, let me know and I can see about sending them
> directly.
> This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.
> Here is the output I see from compactionstats where a compaction has hung.
> The 'bytes compacted' column never changes.
> {code}
> pending tasks: 1
> compaction type   keyspace   column family       bytes compacted   bytes total   progress
> Compaction        system     HintsColumnFamily   268082            464784758     0.06%
> {code}
> The stack of the hung thread is as follows (full jstack attached as well):
> {code}
> "CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
>     at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
>     at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
>     at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>     at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
>     at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
>     at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>     at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
>     at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>     at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>     at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
>     at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
> {code}
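
The trace above shows the compaction thread parked inside LinkedBlockingQueue.take(), which waits without a timeout until some other thread puts an element on the queue. The following is a minimal sketch of that blocking behaviour only. It is not Cassandra code, the class and method names (BlockedTakeSketch and its helpers) are hypothetical, and it makes no claim about why the queue stays empty in this ticket. It shows that a consumer calling take() on a queue nobody feeds sits in the WAITING state, while poll() with a timeout returns null and lets the caller notice the stall.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Illustration only: hypothetical names, not part of Cassandra.
class BlockedTakeSketch {

    // Starts a consumer that calls take() on an empty queue and returns the
    // thread state observed once it has parked (or after roughly 2 seconds).
    static Thread.State stateOfConsumerOnEmptyQueue() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Thread consumer = new Thread(() -> {
            try {
                queue.take(); // nothing is ever offered, so this waits indefinitely
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "consumer");
        consumer.setDaemon(true);
        consumer.start();
        try {
            // Give the consumer time to reach take() and park.
            for (int i = 0; i < 200 && consumer.getState() != Thread.State.WAITING; i++) {
                Thread.sleep(10);
            }
            return consumer.getState();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return consumer.getState();
        } finally {
            consumer.interrupt(); // release the parked consumer so it can exit
        }
    }

    // Bounded alternative: poll() gives up after the timeout and returns null,
    // so the caller can detect that no element arrived.
    static boolean pollTimesOut(long millis) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        try {
            return queue.poll(millis, TimeUnit.MILLISECONDS) == null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("consumer state: " + stateOfConsumerOnEmptyQueue());
        System.out.println("poll timed out: " + pollTimesOut(100));
    }
}
```

A thread in this state makes no progress and logs nothing on its own, which would be consistent with the unchanging 'bytes compacted' value and the quiet logs described in the report.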