[ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Harvey updated CASSANDRA-4492:
------------------------------------

    Attachment:     (was: jstack)
    
> Large HintsColumnFamily compactions hang
> ----------------------------------------
>
>                 Key: CASSANDRA-4492
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4492
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.0.11
>            Reporter: Jason Harvey
>            Priority: Minor
>
> Running into an issue on a 6-node ring running 1.0.11: whenever a 
> somewhat large set of hints builds up (seen with as little as 400MB), 
> compaction on the hints CF hangs indefinitely. Nothing of note in the logs. 
> In some cases, the compaction hangs before a tmp sstable is even created.
> I've wiped out every hints sstable I have and restarted several times. The 
> issue always comes back rather quickly and predictably after wiping the 
> sstables. Compaction always seems to succeed if the hints CFs are rather 
> small.
> Hints are enabled, and my hint window is the default of 1hr. I do have some 
> copies of HintsColumnFamily sstables that reproduce this issue. However, 
> the hints may contain confidential data. If they'd be helpful in 
> troubleshooting this issue, let me know and I can see about sending them 
> directly.
> This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.
> Here is the output I see from compactionstats where a compaction has hung. 
> The 'bytes compacted' column never changes.
> {code}
> pending tasks: 1
>           compaction type        keyspace   column family bytes compacted     bytes total  progress
>                Compaction          systemHintsColumnFamily          268082       464784758     0.06%
> {code}
> The hung thread's stack is as follows (full jstack attached as well):
> {code}
> "CompactionExecutor:37" daemon prio=10 tid=0x00000000063df800 nid=0x49d9 waiting on condition [0x00007eb8c6ffa000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x000000050f2e0e58> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
>         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
>         at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
>         at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>         at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:141)
>         at org.apache.cassandra.db.compaction.CompactionManager$7.call(CompactionManager.java:395)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> {code}
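
The symptom described above (a 'bytes compacted' value that never changes) can be checked mechanically by comparing two compactionstats samples taken some time apart. The following is only a rough sketch and not part of nodetool or Cassandra: the function names and the SAMPLE constant are hypothetical, and it assumes the output layout matches the 1.0.11 sample quoted in the report, where each data row ends with bytes compacted, bytes total, and a progress percentage. Because the keyspace and column family can run together (as in "systemHintsColumnFamily"), the row is parsed from the right.

```python
# Hypothetical helper: flags compactions whose 'bytes compacted' did not move
# between two compactionstats samples. Layout assumed from the report above.

SAMPLE = """pending tasks: 1
          compaction type        keyspace   column family bytes compacted     bytes total  progress
               Compaction          systemHintsColumnFamily          268082       464784758     0.06%
"""


def parse_compaction_rows(stats_text):
    """Return a list of (label, bytes_compacted, bytes_total) tuples."""
    rows = []
    for line in stats_text.splitlines():
        fields = line.split()
        # A data row ends with: <bytes compacted> <bytes total> <progress%>
        if len(fields) < 4 or not fields[-1].endswith("%"):
            continue
        if not (fields[-3].isdigit() and fields[-2].isdigit()):
            continue
        # Everything left of the numeric columns identifies the compaction.
        label = " ".join(fields[:-3])
        rows.append((label, int(fields[-3]), int(fields[-2])))
    return rows


def stalled_compactions(earlier_text, later_text):
    """Return labels whose 'bytes compacted' is unchanged and below the total."""
    earlier = {label: done for label, done, _ in parse_compaction_rows(earlier_text)}
    return [
        label
        for label, done, total in parse_compaction_rows(later_text)
        if done < total and earlier.get(label) == done
    ]


if __name__ == "__main__":
    # Comparing a sample with itself stands in for two identical readings.
    print(stalled_compactions(SAMPLE, SAMPLE))
```

The sampling interval is left to the caller. Two readings taken too close together can match even for a healthy compaction, and two concurrent compactions with the same label would be indistinguishable in this sketch.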
