[ https://issues.apache.org/jira/browse/CASSANDRA-13652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16081873#comment-16081873 ]
Fuud commented on CASSANDRA-13652:
----------------------------------
To keep everything in one place, here is a copy of my mailing-list post:
http://www.mail-archive.com/[email protected]/msg11313.html
-------------------------------------------------------------
Hello,
I found a possible deadlock in AbstractCommitLogSegmentManager. The root cause is
incorrect use of the LockSupport.park/unpark pair: unpark should be invoked only if
the caller is sure that the target thread is parked at the appropriate place. Otherwise
the permit granted by calling unpark can be consumed by other structures (for
example, inside a ReadWriteLock).
Jira: https://issues.apache.org/jira/browse/CASSANDRA-13652
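To illustrate the race in isolation, here is a small standalone demo (a sketch only, not Cassandra code; the class name and timing values are made up). The single permit granted by unpark() is consumed by whatever park() the target thread happens to hit first, so the later, intended park() blocks forever:
{code}
import java.util.concurrent.locks.LockSupport;

public class ParkPermitDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread manager = new Thread(() -> {
            // Stand-in for an unrelated blocking point (e.g. a logging framework's
            // ReadWriteLock) that also parks internally and therefore swallows the
            // permit granted by a premature unpark().
            LockSupport.parkNanos(500_000_000L);   // returns early, permit consumed

            System.out.println("reached the intended wait point");
            LockSupport.park();                    // permit already gone -> waits forever
            System.out.println("woken up");        // never printed
        });
        manager.start();

        Thread.sleep(100);                         // let the thread reach the first park
        LockSupport.unpark(manager);               // "blind" unpark, as in wakeManager

        manager.join(2000);
        System.out.println("manager still parked: " + manager.isAlive());
    }
}
{code}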
I suggest the simplest solution: replace LockSupport with a Semaphore.
PR: https://github.com/apache/cassandra/pull/127
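The idea, roughly (a minimal sketch, not the exact PR code; the class, field and method names here are illustrative). A permit released into the Semaphore before the manager starts waiting is not lost, and it cannot be stolen by unrelated code the way a LockSupport permit can:
{code}
import java.util.concurrent.Semaphore;

class CommitLogManagerWakeup {
    private final Semaphore wakeup = new Semaphore(0);

    /** Called by writers that need a new segment ("wakeManager"). */
    void wakeManager() {
        wakeup.release();
    }

    /** Called by the COMMIT-LOG-ALLOCATOR thread between units of work. */
    void waitForWork() throws InterruptedException {
        wakeup.acquire();        // blocks until at least one release()
        wakeup.drainPermits();   // collapse any extra wake-ups into a single pass
    }
}
{code}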
I also suggest another solution, with a SynchronousQueue-like structure that hands
available segments from the manager thread to consumers. With these changes the code
becomes clearer and more straightforward.
PR: https://github.com/apache/cassandra/pull/129
We cannot use j.u.c.SynchronousQueue because we need to support shutdown, and the
only way to terminate SynchronousQueue.put is to call Thread.interrupt(). But C* uses
NIO, which does not expect ClosedByInterruptException during I/O operations, so we
cannot interrupt the manager thread.
I implemented o.a.c.u.c.Transferer, which supports shutdown and restart (needed
for tests):
https://github.com/Fuud/cassandra/blob/e1a695874dc24e532ae21ef627e852bf999a75f3/src/java/org/apache/cassandra/utils/concurrent/Transferer.java
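Roughly, the hand-off works like this (a simplified sketch only; the real Transferer linked above also supports restart and is more careful, and the names here are illustrative). The key point is that shutdown() unblocks a waiting put() without calling Thread.interrupt():
{code}
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

class HandOff<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();
    private T item;               // the single element being handed over
    private boolean shutdown;

    /** Producer side (manager thread): blocks until a consumer takes the item
     *  or shutdown() is called; never relies on Thread.interrupt(). */
    boolean put(T t) {
        lock.lock();
        try {
            item = t;
            changed.signalAll();
            while (item != null && !shutdown) {
                changed.awaitUninterruptibly();
            }
            boolean taken = (item == null);
            item = null;
            return taken;
        } finally {
            lock.unlock();
        }
    }

    /** Consumer side (writer threads): blocks until an item is offered
     *  or shutdown() is called (then returns null). */
    T take() {
        lock.lock();
        try {
            while (item == null && !shutdown) {
                changed.awaitUninterruptibly();
            }
            T t = item;
            item = null;
            changed.signalAll();
            return t;
        } finally {
            lock.unlock();
        }
    }

    void shutdown() {
        lock.lock();
        try {
            shutdown = true;
            changed.signalAll();
        } finally {
            lock.unlock();
        }
    }
}
{code}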
I also modified o.a.c.d.c.SimpleCachedBufferPool to support waiting for free
space.
Please feel free to ask any questions.
Thank you.
Feodor Bobin
[email protected]
> Deadlock in AbstractCommitLogSegmentManager
> -------------------------------------------
>
> Key: CASSANDRA-13652
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13652
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Fuud
>
> AbstractCommitLogSegmentManager uses LockSupport.(un)park incorrectly. It invokes
> unpark without checking whether the manager thread is parked at the appropriate place.
> For example, logging frameworks use queues, and queues use ReadWriteLocks, which in
> turn use LockSupport. Therefore AbstractCommitLogSegmentManager.wakeManager can wake
> the thread while it is inside such a lock, and the manager thread will then sleep
> forever at the park() call (because the unpark permit was already consumed inside the lock).
> Example stack traces:
> {code}
> "MigrationStage:1" id=412 state=WAITING
> at sun.misc.Unsafe.park(Native Method)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
> at
> org.apache.cassandra.utils.concurrent.WaitQueue$AbstractSignal.awaitUninterruptibly(WaitQueue.java:279)
> at
> org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.awaitAvailableSegment(AbstractCommitLogSegmentManager.java:263)
> at
> org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.advanceAllocatingFrom(AbstractCommitLogSegmentManager.java:237)
> at
> org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager.forceRecycleAll(AbstractCommitLogSegmentManager.java:279)
> at
> org.apache.cassandra.db.commitlog.CommitLog.forceRecycleAllSegments(CommitLog.java:210)
> at org.apache.cassandra.config.Schema.dropView(Schema.java:708)
> at
> org.apache.cassandra.schema.SchemaKeyspace.lambda$updateKeyspace$23(SchemaKeyspace.java:1361)
> at
> org.apache.cassandra.schema.SchemaKeyspace$$Lambda$382/1123232162.accept(Unknown
> Source)
> at java.util.LinkedHashMap$LinkedValues.forEach(LinkedHashMap.java:608)
> at
> java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1080)
> at
> org.apache.cassandra.schema.SchemaKeyspace.updateKeyspace(SchemaKeyspace.java:1361)
> at
> org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1332)
> at
> org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1282)
> - locked java.lang.Class@cc38904
> at
> org.apache.cassandra.db.DefinitionsUpdateVerbHandler$1.runMayThrow(DefinitionsUpdateVerbHandler.java:51)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
> org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$LocalSessionWrapper.run(DebuggableThreadPoolExecutor.java:322)
> at
> com.ringcentral.concurrent.executors.MonitoredRunnable.run(MonitoredRunnable.java:36)
> at MON_R_MigrationStage.run(NamedRunnableFactory.java:67)
> at
> com.ringcentral.concurrent.executors.MonitoredThreadPoolExecutor$MdcAwareRunnable.run(MonitoredThreadPoolExecutor.java:114)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
> at
> org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$61/1733339045.run(Unknown
> Source)
> at java.lang.Thread.run(Thread.java:745)
> "COMMIT-LOG-ALLOCATOR:1" id=80 state=WAITING
> at sun.misc.Unsafe.park(Native Method)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
> at
> org.apache.cassandra.db.commitlog.AbstractCommitLogSegmentManager$1.runMayThrow(AbstractCommitLogSegmentManager.java:128)
> at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
> at
> org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$61/1733339045.run(Unknown
> Source)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> The solution is to use a Semaphore instead of low-level LockSupport.