Possible livelock during commit log playback
--------------------------------------------
Key: CASSANDRA-3751
URL: https://issues.apache.org/jira/browse/CASSANDRA-3751
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.0.1
Environment: Linux (CentOS 5.7)
Reporter: John Chakerian
In CommitLog.recover, there seems to be the possibility of concurrent inserts
to tablesRecovered (a HashSet) in the Runnables instantiated a bit below (line
323 in 1.0.7). This apparently happened during a commit log playback during
startup of a node that had not shut down cleanly (the cluster was under heavy
load previously and there were several gigabytes of commit logs), resulting in
two threads running in perpetuity (2 cores were at 100% from running these
threads), preventing the node from coming up. The relevant portion of the stack
trace is:
{noformat}
INFO | jvm 1 | 2012/01/16 16:54:42 | "MutationStage:25" prio=10
tid=0x00002aaad01e0800 nid=0x6f62 runnable [0x0000000044d54000]
INFO | jvm 1 | 2012/01/16 16:54:42 | java.lang.Thread.State: RUNNABLE
INFO | jvm 1 | 2012/01/16 16:54:42 | at
java.util.HashMap.put(HashMap.java:374)
INFO | jvm 1 | 2012/01/16 16:54:42 | at
java.util.HashSet.add(HashSet.java:200)
INFO | jvm 1 | 2012/01/16 16:54:42 | at
org.apache.cassandra.db.commitlog.CommitLog$2.runMayThrow(CommitLog.java:338)
INFO | jvm 1 | 2012/01/16 16:54:42 | at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
INFO | jvm 1 | 2012/01/16 16:54:42 | at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
INFO | jvm 1 | 2012/01/16 16:54:42 | at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
INFO | jvm 1 | 2012/01/16 16:54:42 | at
java.util.concurrent.FutureTask.run(FutureTask.java:138)
INFO | jvm 1 | 2012/01/16 16:54:42 | at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
INFO | jvm 1 | 2012/01/16 16:54:42 | at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
INFO | jvm 1 | 2012/01/16 16:54:42 | at
java.lang.Thread.run(Thread.java:662)
INFO | jvm 1 | 2012/01/16 16:54:42 |
INFO | jvm 1 | 2012/01/16 16:54:42 | "MutationStage:21" prio=10
tid=0x00002aaad00a2800 nid=0x6f5e runnable [0x0000000044950000]
INFO | jvm 1 | 2012/01/16 16:54:42 | java.lang.Thread.State: RUNNABLE
INFO | jvm 1 | 2012/01/16 16:54:42 | at
java.util.HashMap.put(HashMap.java:374)
INFO | jvm 1 | 2012/01/16 16:54:42 | at
java.util.HashSet.add(HashSet.java:200)
INFO | jvm 1 | 2012/01/16 16:54:42 | at
org.apache.cassandra.db.commitlog.CommitLog$2.runMayThrow(CommitLog.java:338)
INFO | jvm 1 | 2012/01/16 16:54:42 | at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
INFO | jvm 1 | 2012/01/16 16:54:42 | at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
INFO | jvm 1 | 2012/01/16 16:54:42 | at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
INFO | jvm 1 | 2012/01/16 16:54:42 | at
java.util.concurrent.FutureTask.run(FutureTask.java:138)
INFO | jvm 1 | 2012/01/16 16:54:42 | at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
INFO | jvm 1 | 2012/01/16 16:54:42 | at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
INFO | jvm 1 | 2012/01/16 16:54:42 | at
java.lang.Thread.run(Thread.java:662)
{noformat}
The most recently modified file in the commit log directory was this entry:
{noformat}
-rw-r----- 1 <redacted> <redacted> 0 Jan 16 16:03 CommitLog-1326758622599.log
{noformat}
though I'm not sure if this was related or not.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira