[
https://issues.apache.org/jira/browse/CASSANDRA-13037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15750591#comment-15750591
]
Stefania commented on CASSANDRA-13037:
--------------------------------------
The multiplexed runs above did not produce any timeouts, but they were both
against 2.2. I've also verified that the last timeout in 2.2 was over 2 months
ago, so it looks like the problem in 2.2 has been fixed or hidden by some other
change.
I've done another multiplexed run against 2.1
[here|http://cassci.datastax.com/job/stef1927-testall-multiplex/57/] and I got
5 timeouts out of 50 iterations. In each case, the timeout is caused by a
deadlock; full stack traces for all threads are available
[here|http://cassci.datastax.com/job/stef1927-testall-multiplex/57/testReport/junit/org.apache.cassandra.cql3/DropKeyspaceCommitLogRecycleTest_17/testRecycle/].
The test is waiting for a blocking flush when dropping the keyspace. The flush
is blocked on the write barrier for the memtables:
{code}
Thread MemtableFlushWriter:1
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:315)
at org.apache.cassandra.utils.concurrent.WaitQueue$AbstractSignal.awaitUninterruptibly(WaitQueue.java:283)
at org.apache.cassandra.utils.concurrent.OpOrder$Barrier.await(OpOrder.java:417)
at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1104)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
{code}
The memtable is not released because the NonPeriodicTasks thread is waiting for
a CL segment sync while clearing the sstable read meter in the GlobalTidy:
{code}
Thread NonPeriodicTasks:1
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:315)
at org.apache.cassandra.utils.concurrent.WaitQueue$AbstractSignal.awaitUninterruptibly(WaitQueue.java:283)
at org.apache.cassandra.db.commitlog.CommitLogSegment$Allocation.awaitDiskSync(CommitLogSegment.java:642)
at org.apache.cassandra.db.commitlog.BatchCommitLogService.maybeWaitForSync(BatchCommitLogService.java:34)
at org.apache.cassandra.db.commitlog.AbstractCommitLogService.finishWriteFor(AbstractCommitLogService.java:152)
at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:253)
at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:379)
at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:359)
at org.apache.cassandra.db.Mutation.apply(Mutation.java:214)
at org.apache.cassandra.cql3.statements.ModificationStatement.executeInternalWithoutCondition(ModificationStatement.java:675)
at org.apache.cassandra.cql3.statements.ModificationStatement.executeInternal(ModificationStatement.java:663)
at org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:317)
at org.apache.cassandra.db.SystemKeyspace.clearSSTableReadMeter(SystemKeyspace.java:1004)
at org.apache.cassandra.io.sstable.SSTableReader$GlobalTidy.tidy(SSTableReader.java:2409)
at org.apache.cassandra.utils.concurrent.Ref$GlobalState.release(Ref.java:293)
at org.apache.cassandra.utils.concurrent.Ref$State.release(Ref.java:195)
at org.apache.cassandra.utils.concurrent.Ref.release(Ref.java:95)
at org.apache.cassandra.io.sstable.SSTableReader$DescriptorTypeTidy.tidy(SSTableReader.java:2308)
at org.apache.cassandra.utils.concurrent.Ref$GlobalState.release(Ref.java:293)
at org.apache.cassandra.utils.concurrent.Ref$State.release(Ref.java:195)
at org.apache.cassandra.utils.concurrent.Ref.release(Ref.java:95)
at org.apache.cassandra.io.sstable.SSTableReader$InstanceTidier$1.run(SSTableReader.java:2245)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
{code}
[~blambov], have you seen this deadlock before? Any idea why the CL segment
cannot be synced? So far I've seen this sort of problem when stopping the CL
service, not when it is still running.
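To make the wait chain in the two traces concrete, here is a minimal standalone
sketch. It does not use any Cassandra classes: the class name WaitChainSketch,
the method flushCompletes and the three latches are hypothetical stand-ins, and
the mapping to the real OpOrder barrier and commit log sync is only one reading
of the traces. Unlike awaitUninterruptibly in the real code, the waits here are
interruptible so that the sketch can terminate.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class WaitChainSketch {

    /**
     * Simulates the chain "flush waits for write, write waits for disk sync".
     * Returns true if the simulated flush finishes within timeoutMillis.
     * syncHappens controls whether the simulated segment sync is ever signalled.
     */
    public static boolean flushCompletes(boolean syncHappens, long timeoutMillis)
            throws InterruptedException {
        // Hypothetical stand-in for the CL segment sync signal.
        CountDownLatch diskSync = new CountDownLatch(1);
        // Hypothetical stand-in for the in-flight write finishing, which is
        // what the flush barrier is assumed to be waiting for.
        CountDownLatch writeFinished = new CountDownLatch(1);
        CountDownLatch flushDone = new CountDownLatch(1);

        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Plays the role of NonPeriodicTasks:1: a write that cannot
            // finish until its segment has been synced.
            pool.execute(() -> {
                try {
                    diskSync.await();
                    writeFinished.countDown();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // Plays the role of MemtableFlushWriter:1: the flush cannot
            // proceed until the earlier write has finished.
            pool.execute(() -> {
                try {
                    writeFinished.await();
                    flushDone.countDown();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            if (syncHappens) {
                diskSync.countDown();
            }
            // The test thread waits for the blocking flush, with a timeout.
            return flushDone.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } finally {
            // Interrupt any thread that is still parked so the JVM can exit.
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("sync signalled: flush completes = " + flushCompletes(true, 1000));
        System.out.println("sync missing:   flush completes = " + flushCompletes(false, 200));
    }
}
```

When the sync is never signalled, both worker threads stay parked and the
caller times out, which matches the shape of the test timeout. The sketch says
nothing about why the sync does not happen in the real run.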
> DropKeyspaceCommitLogRecycleTest.testRecycle times out in 2.1 and 2.2
> ---------------------------------------------------------------------
>
> Key: CASSANDRA-13037
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13037
> Project: Cassandra
> Issue Type: Bug
> Components: Testing
> Reporter: Stefania
> Assignee: Stefania
> Fix For: 2.1.x, 2.2.x
>
>
> DropKeyspaceCommitLogRecycleTest.testRecycle times out in 2.1 and 2.2:
> http://cassci.datastax.com/job/cassandra-2.2_testall/589/testReport/junit/org.apache.cassandra.cql3/DropKeyspaceCommitLogRecycleTest/testRecycle/
> http://cassci.datastax.com/job/cassandra-2.1_testall/399/testReport/org.apache.cassandra.cql3/DropKeyspaceCommitLogRecycleTest/testRecycle/
> {code}
> Error Message
> Timeout occurred. Please note the time in the report does not reflect the
> time until the timeout.
> Stacktrace
> junit.framework.AssertionFailedError: Timeout occurred. Please note the time
> in the report does not reflect the time until the timeout.
> {code}