[
https://issues.apache.org/jira/browse/CASSANDRA-13037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15753849#comment-15753849
]
Stefania commented on CASSANDRA-13037:
--------------------------------------
The changes that I made in order to print debug information in
awaitDiskSync() have fixed the failures, but [some
tests|http://cassci.datastax.com/job/stef1927-testall-multiplex/59/testReport/org.apache.cassandra.cql3/]
now take 1 minute longer to complete, and in these tests we can see the
following errors:
{code}
ERROR 08:28:16 CL disk sync still waiting after 1 minute: segment.lastSyncedOffset 5242880, position 121
{code}
See, for example, [this
test|http://cassci.datastax.com/job/stef1927-testall-multiplex/59/testReport/org.apache.cassandra.cql3/DropKeyspaceCommitLogRecycleTest_48/testRecycle/].
These errors are generated by this modified code:
{code}
void awaitDiskSync()
{
    while (segment.lastSyncedOffset < position)
    {
        WaitQueue.Signal signal = segment.syncComplete.register(CommitLog.instance.metrics.waitingOnCommit.time());
        if (segment.lastSyncedOffset < position)
        {
            do
            {
                try
                {
                    if (signal.awaitUntil(System.nanoTime() + TimeUnit.MINUTES.toNanos(1)))
                        break;
                    logger.error("CL disk sync still waiting after 1 minute: segment.lastSyncedOffset {}, position {}",
                                 segment.lastSyncedOffset, position);
                }
                catch (InterruptedException t)
                {
                    logger.error("CL disk sync wait was interrupted: segment.lastSyncedOffset {}, position {}",
                                 segment.lastSyncedOffset, position);
                }
            } while (segment.lastSyncedOffset < position);
        }
        else
        {
            signal.cancel();
        }
    }
}
{code}
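Instead of waiting forever in {{signal.awaitUninterruptibly()}}, it waits for
1 minute and then checks whether {{segment.lastSyncedOffset < position}}. In
the error logged above, {{lastSyncedOffset}} (5242880) is already past
{{position}} (121), so the sync had completed by the time the wait timed out,
yet the waiter was never woken up. I'm guessing that a race is preventing the
non-periodic-task thread from being signaled. However, I still don't
understand where the race is.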
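
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of a timed wait that re-checks its condition after every timeout, so that a missed signal costs at most one timeout instead of blocking forever. It is only an illustration: it uses {{java.util.concurrent.locks.Condition}} from the JDK rather than Cassandra's {{WaitQueue}}, and the class {{TimedSyncWait}} and its methods {{markSynced}} and {{awaitDiskSync}} are hypothetical names, not the code in the patch.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a commit log segment; NOT Cassandra's actual classes.
public class TimedSyncWait
{
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition syncComplete = lock.newCondition();
    private long lastSyncedOffset = 0; // guarded by lock

    // Called by the syncing thread once data up to 'offset' has been flushed.
    public void markSynced(long offset)
    {
        lock.lock();
        try
        {
            lastSyncedOffset = Math.max(lastSyncedOffset, offset);
            syncComplete.signalAll();
        }
        finally
        {
            lock.unlock();
        }
    }

    // Blocks until lastSyncedOffset >= position, waking up at least every
    // 'timeoutNanos' to re-check the condition. Returns how many timeouts
    // elapsed while the condition was still unmet (0 if none).
    public int awaitDiskSync(long position, long timeoutNanos)
    {
        int timeouts = 0;
        lock.lock();
        try
        {
            while (lastSyncedOffset < position)
            {
                try
                {
                    // awaitNanos returns a value <= 0 when the wait timed out.
                    long remaining = syncComplete.awaitNanos(timeoutNanos);
                    if (remaining <= 0 && lastSyncedOffset < position)
                        timeouts++; // a real implementation would log here
                }
                catch (InterruptedException e)
                {
                    // Restore the interrupt flag and give up waiting.
                    Thread.currentThread().interrupt();
                    return timeouts;
                }
            }
            return timeouts;
        }
        finally
        {
            lock.unlock();
        }
    }
}
```

In this sketch the condition is read and the signal is sent under the same lock, so a wake-up cannot be lost between the check and the wait. The timeout is kept only as a safety net and as a place to emit diagnostics.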
> DropKeyspaceCommitLogRecycleTest.testRecycle times out in 2.1 and 2.2
> ---------------------------------------------------------------------
>
> Key: CASSANDRA-13037
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13037
> Project: Cassandra
> Issue Type: Bug
> Components: Testing
> Reporter: Stefania
> Assignee: Stefania
> Fix For: 2.1.x, 2.2.x
>
>
> DropKeyspaceCommitLogRecycleTest.testRecycle times out in 2.1 and 2.2:
> http://cassci.datastax.com/job/cassandra-2.2_testall/589/testReport/junit/org.apache.cassandra.cql3/DropKeyspaceCommitLogRecycleTest/testRecycle/
> http://cassci.datastax.com/job/cassandra-2.1_testall/399/testReport/org.apache.cassandra.cql3/DropKeyspaceCommitLogRecycleTest/testRecycle/
> {code}
> Error Message
> Timeout occurred. Please note the time in the report does not reflect the
> time until the timeout.
> Stacktrace
> junit.framework.AssertionFailedError: Timeout occurred. Please note the time
> in the report does not reflect the time until the timeout.
> {code}