[jira] [Comment Edited] (CASSANDRA-4492) HintsColumnFamily compactions hang when using multithreaded compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534970#comment-13534970 ]

T Jake Luciani edited comment on CASSANDRA-4492 at 12/18/12 3:34 PM:
---------------------------------------------------------------------

This is still happening. Looking at the code, it seems there are two places where HintedHandoffManager calls a user-defined compact() for all sstables. Multithreaded compaction would allow these calls to race, since I see no check to avoid multiple calls to user-defined compaction for the same sstables.

was (Author: tjake):
This is still happening. Looking at the code, it seems there are two places where HintedHandoffManager calls a user-defined compact() for all sstables. Multithreaded compaction would allow these calls to race, since I see no check to avoid multiple calls to user-defined compaction.

HintsColumnFamily compactions hang when using multithreaded compaction
----------------------------------------------------------------------

Key: CASSANDRA-4492
URL: https://issues.apache.org/jira/browse/CASSANDRA-4492
Project: Cassandra
Issue Type: Bug
Affects Versions: 1.0.11
Reporter: Jason Harvey
Priority: Minor
Attachments: jstack.txt

Running into an issue on a 6 node ring running 1.0.11 where HintsColumnFamily compactions often hang indefinitely when using multithreaded compaction. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable and restarted several times. The issue always comes back rather quickly and predictably. The compactions sometimes complete if the hint sstables are very small. Disabling multithreaded compaction stops this issue from occurring. Compactions of all other CFs seem to work just fine.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

I should note that the ring gets a huge amount of writes, and as a result the HintedHandoff rows can be quite wide. I didn't see any large-row compaction notices when the compaction was hanging (perhaps the bug was triggered by incremental compaction?). After disabling multithreaded compaction, several of the rows that were successfully compacted were over 1GB.

Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.

{code}
pending tasks: 1
compaction type   keyspace   column family       bytes compacted   bytes total   progress
Compaction        system     HintsColumnFamily   268082            464784758     0.06%
{code}

The hung thread stack is as follows (full jstack attached as well):

{code}
CompactionExecutor:37 daemon prio=10 tid=0x063df800 nid=0x49d9 waiting on condition [0x7eb8c6ffa000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for 0x00050f2e0e58 (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
    at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
    at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
    at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
    at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
    at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
    at
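
The missing check described in the comment above (nothing stops two user-defined compactions from being submitted for the same sstables) could be sketched roughly as follows. This is only an illustration of the idea, not Cassandra code: the class CompactionGuard, its methods, and the use of plain strings as sstable names are all hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/**
 * Hypothetical guard (not part of Cassandra): remembers which sstables are
 * already claimed by a running compaction so a second request is refused.
 */
class CompactionGuard {
    // Names of sstables currently owned by some compaction task.
    private final Set<String> compacting = new HashSet<String>();

    /** Claims all requested sstables atomically, or none if any is already taken. */
    public synchronized boolean tryClaim(Collection<String> sstables) {
        for (String name : sstables) {
            if (compacting.contains(name)) {
                return false; // another compaction already owns this sstable
            }
        }
        compacting.addAll(sstables);
        return true;
    }

    /** Releases the sstables once the compaction has finished or failed. */
    public synchronized void release(Collection<String> sstables) {
        compacting.removeAll(sstables);
    }

    public static void main(String[] args) {
        CompactionGuard guard = new CompactionGuard();
        List<String> hints = Arrays.asList("hints-1", "hints-2"); // made-up names
        System.out.println(guard.tryClaim(hints)); // first caller wins
        System.out.println(guard.tryClaim(hints)); // second caller is refused
        guard.release(hints);
    }
}
```

With a check like this, the second of the two call sites would skip its compaction (or retry later) instead of racing the first over the same files. Whether that would actually resolve the hang reported here is not established by the ticket.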
[jira] [Comment Edited] (CASSANDRA-4492) HintsColumnFamily compactions hang when using multithreaded compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13534973#comment-13534973 ]

Thomas Vachon edited comment on CASSANDRA-4492 at 12/18/12 3:35 PM:
--------------------------------------------------------------------

{quote}Looking at the code it seems there are two places where HintedHandoffManager calls a user defined compact() for all sstable{quote}

Well, that would explain why every time I restart and get hints, every sstable gets compacted.

was (Author: tvachon):
{quote}Looking at the code it seems there are two places where HintedHandoffManager calls a user defined compact() for all sstable{quote}

Well, that would explain why every time I start and get hints, every sstable gets compacted.

HintsColumnFamily compactions hang when using multithreaded compaction
----------------------------------------------------------------------

Key: CASSANDRA-4492
URL: https://issues.apache.org/jira/browse/CASSANDRA-4492
Project: Cassandra
Issue Type: Bug
Affects Versions: 1.0.11
Reporter: Jason Harvey
Priority: Minor
Attachments: jstack.txt

Running into an issue on a 6 node ring running 1.0.11 where HintsColumnFamily compactions often hang indefinitely when using multithreaded compaction. Nothing of note in the logs. In some cases, the compaction hangs before a tmp sstable is even created.

I've wiped out every hints sstable and restarted several times. The issue always comes back rather quickly and predictably. The compactions sometimes complete if the hint sstables are very small. Disabling multithreaded compaction stops this issue from occurring. Compactions of all other CFs seem to work just fine.

This ring was upgraded from 1.0.7. I didn't keep any hints from the upgrade.

I should note that the ring gets a huge amount of writes, and as a result the HintedHandoff rows can be quite wide. I didn't see any large-row compaction notices when the compaction was hanging (perhaps the bug was triggered by incremental compaction?). After disabling multithreaded compaction, several of the rows that were successfully compacted were over 1GB.

Here is the output I see from compactionstats where a compaction has hung. The 'bytes compacted' column never changes.

{code}
pending tasks: 1
compaction type   keyspace   column family       bytes compacted   bytes total   progress
Compaction        system     HintsColumnFamily   268082            464784758     0.06%
{code}

The hung thread stack is as follows (full jstack attached as well):

{code}
CompactionExecutor:37 daemon prio=10 tid=0x063df800 nid=0x49d9 waiting on condition [0x7eb8c6ffa000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for 0x00050f2e0e58 (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
    at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:329)
    at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Deserializer.computeNext(ParallelCompactionIterable.java:281)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
    at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:126)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:100)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
    at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:101)
    at org.apache.cassandra.db.compaction.ParallelCompactionIterable$Unwrapper.computeNext(ParallelCompactionIterable.java:88)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
    at com.google.common.collect.Iterators$7.computeNext(Iterators.java:614)
    at
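
The stack above shows the compaction thread parked inside LinkedBlockingQueue.take(), which waits without limit until another thread puts an element on the queue. The standalone sketch below only illustrates that behaviour with the standard java.util.concurrent API and contrasts it with a timed poll(), which returns null when nothing arrives. The class StalledConsumerDemo and the helper nextOrNull are made up for illustration; nothing here is taken from ParallelCompactionIterable, and the trace alone does not say why no element arrived.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo: a consumer that notices a stalled producer instead of parking forever. */
class StalledConsumerDemo {

    /**
     * Waits up to timeoutMillis for the next element. Unlike take(), which
     * blocks until an element is available, this returns null on timeout.
     */
    public static <T> T nextOrNull(BlockingQueue<T> queue, long timeoutMillis) {
        try {
            return queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return null;
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> rows = new LinkedBlockingQueue<String>();
        rows.add("row-1");
        System.out.println(nextOrNull(rows, 100)); // prints "row-1"
        // Nothing else is ever added: take() would hang here, poll() gives up.
        System.out.println(nextOrNull(rows, 100)); // prints "null"
    }
}
```

A timed wait like this does not fix a producer that never delivers, but it turns a silent hang into something that can be logged or retried.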