[ https://issues.apache.org/jira/browse/CASSANDRA-15367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036235#comment-17036235 ]
Benedict Elliott Smith commented on CASSANDRA-15367:
----------------------------------------------------
Now that I have some time, I'm looking at this more closely. Could you
outline how you think the deadlock occurs between
{{setCommitLogUpperBound}} and {{writeBarrier.issue()}}? The deadlock
requires a new cohort to exist, and that cohort is not instantiated until
{{writeBarrier.issue()}}, so surely the deadlock cannot occur before then?
However, there _is_ a window _after_ {{!writeOp.isBehindBarrier()}} that
cannot be avoided, because there is no timed-wait mechanism for obtaining a
monitor, and {{tryMonitorEnter}} isn't available in later versions of Java
anyway.
So, I propose a variant of my earlier approach, which definitely worked but
waited for all earlier operations to complete. The variant essentially
inverts the behaviour of your suggestion: if any older operations are still
running, refuse to lock until they all complete (and invoke
{{Thread.yield()}} once to give them an opportunity with the CPU). Locking is
thus essentially disabled for all newer operations until the older ones
expire, and we try to give the older ones first claim on the CPU, if the
scheduler lets us, so that this window is as narrow as possible.
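
A minimal sketch of the rule described above, in Java. Everything named here
is hypothetical: {{OlderOpsGate}}, its counter and its methods are
illustrative assumptions and are not taken from the Cassandra code base. It
also assumes the caller tracks when older operations start and finish, and
falls back to a non-locking update path whenever {{shouldLock()}} returns
false.

```java
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical sketch only; not Cassandra code.
 * Newer operations may take the partition lock only when no older
 * operation is still running.
 */
final class OlderOpsGate
{
    // Number of older operations still in flight (assumed to be
    // maintained by the caller via the two methods below).
    private final AtomicInteger runningOlderOps = new AtomicInteger();

    void olderOpStarted()  { runningOlderOps.incrementAndGet(); }
    void olderOpFinished() { runningOlderOps.decrementAndGet(); }

    /**
     * Returns true if a newer operation is allowed to lock.
     * While any older operation is running, yields the CPU once and
     * refuses, so the caller must proceed without the lock.
     */
    boolean shouldLock()
    {
        if (runningOlderOps.get() == 0)
            return true;
        Thread.yield(); // a scheduler hint only; it may have no effect
        return false;
    }

    public static void main(String[] args)
    {
        OlderOpsGate gate = new OlderOpsGate();
        System.out.println(gate.shouldLock()); // prints "true"
        gate.olderOpStarted();
        System.out.println(gate.shouldLock()); // prints "false"
        gate.olderOpFinished();
        System.out.println(gate.shouldLock()); // prints "true"
    }
}
```

Refusing rather than waiting means a newer operation never blocks while older
ones are still running; the single {{Thread.yield()}} only hints to the
scheduler that the older operations should get the CPU first.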
> Memtable memory allocations may deadlock
> ----------------------------------------
>
> Key: CASSANDRA-15367
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15367
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Commit Log, Local/Memtable
> Reporter: Benedict Elliott Smith
> Assignee: Benedict Elliott Smith
> Priority: Normal
> Fix For: 4.0, 2.2.x, 3.0.x, 3.11.x
>
>
> * Under heavy contention, we guard modifications to a partition with a mutex,
> for the lifetime of the memtable.
> * Memtables block for the completion of all {{OpOrder.Group}} started before
> their flush began
> * Memtables permit operations from this cohort to fall-through to the
> following Memtable, in order to guarantee a precise commitLogUpperBound
> * Memtable memory limits may be lifted for operations in the first cohort,
> since they block flush (and hence block future memory allocation)
>
> With very unfortunate scheduling:
> * A contended partition may rapidly escalate to a mutex
> * The system may reach memory limits that prevent allocations for the new
> Memtable’s cohort (C2)
> * An operation from C2 may hold the mutex when this occurs
> * Operations from a prior Memtable’s cohort (C1), for a contended partition,
> may fall-through to the next Memtable
> * The operations from C1 may execute after the above is encountered by those
> from C2