[
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13835123#comment-13835123
]
Benedict commented on CASSANDRA-5549:
-------------------------------------
I suggest the following (somewhat complex-seeming) approach, building on my
patch for 3578:
We extract the CLS.Allocation object into a CommitState object that is
allocated in CFS.apply() prior to performing CL.add(). The CLS.AppendLock
object is rewritten as a purpose-built primitive, which we will for now call
MutationBarrier, and is extracted along with CommitState.
MutationBarrier will be a synchronisation primitive with two jobs. First, it
permits issuing periodic barriers, each of which ensures that all operations
started prior to the barrier have completed. Second, it can create a token for
an about-to-be-issued barrier; once the barrier *is* issued, the token ensures
that operations started after it know not to interfere with any state of an
object that is using the barrier.
That is probably unclear, but the steps of CFS.apply() using it may clarify it:
# Allocate CommitState, register operation with MutationBarrier
# Call CL.add(); on exit, it has updated CommitState with the segment and
position of the replay position.*
# Check the Memtable's BarrierToken to see if we are permitted to modify it:
#* if it's absent we simply make our modification and scoot;
#* if it's present and permits us to make our modification, we do so but ALSO
update a ReplayPosition property (with CAS, ensuring it is >= the one we have
in CommitState)
#* if it's present and does not permit us to modify, we look up its replacement
Memtable and repeat
# Release our hold on the MutationBarrier, signalling any waiters
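To make the steps above a little more concrete, here is a rough Java sketch of the barrier and token semantics. It is only an illustration under assumptions: the names Group, start(), issue(), tokenForNextBarrier() and isDrained() are hypothetical, comparing group ids is just one possible way to implement the token, and it assumes a single flusher at a time. Registration and release are CAS loops; only the flusher side takes a lock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch only: every name here is hypothetical, not the eventual patch. */
final class MutationBarrierSketch {

    /** All operations started between two barriers share one group. */
    static final class Group {
        final long id;
        // state >= 0: open, value is the number of running operations.
        // state <  0: closed by a barrier, value is -(running + 1).
        private final AtomicInteger state = new AtomicInteger(0);
        private final CountDownLatch drained = new CountDownLatch(1);

        Group(long id) { this.id = id; }

        boolean tryRegister() {
            while (true) {
                int s = state.get();
                if (s < 0) return false;                 // already closed
                if (state.compareAndSet(s, s + 1)) return true;
            }
        }

        /** Step 4 of apply(): release our hold, signalling waiters if last. */
        void release() {
            while (true) {
                int s = state.get();
                int next = s > 0 ? s - 1 : s + 1;        // open vs. closed encoding
                if (state.compareAndSet(s, next)) {
                    if (next == -1) drained.countDown(); // closed and empty
                    return;
                }
            }
        }

        /** Called once per group, by issue() only. */
        void close() {
            while (true) {
                int s = state.get();
                if (state.compareAndSet(s, -(s + 1))) {
                    if (s == 0) drained.countDown();     // nothing was running
                    return;
                }
            }
        }

        boolean isDrained() { return drained.getCount() == 0; }

        /** Flusher side: wait until every operation in this group has released. */
        void await() throws InterruptedException { drained.await(); }
    }

    /** Set on a Memtable: writers in groups up to lastGroupId may still modify it. */
    static final class BarrierToken {
        final long lastGroupId;
        BarrierToken(long lastGroupId) { this.lastGroupId = lastGroupId; }
        boolean permits(Group op) { return op.id <= lastGroupId; }
    }

    private volatile Group current = new Group(0);

    /** Step 1 of apply(): non-blocking registration with the current group. */
    Group start() {
        while (true) {
            Group g = current;
            if (g.tryRegister()) return g;
            // g was closed, so a newer group is already published; retry.
        }
    }

    /** Token for the barrier that the next issue() call will create. */
    synchronized BarrierToken tokenForNextBarrier() {
        return new BarrierToken(current.id);
    }

    /** Issues the barrier: publish a fresh group, then close the old one. */
    synchronized Group issue() {
        Group old = current;
        current = new Group(old.id + 1);
        old.close();
        return old;
    }
}
```

A writer that called start() before issue() gets token.permits(op) == true and may still modify the old Memtable; one that started afterwards gets false and would follow the chain to the replacement Memtable instead.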
When we flush a memtable, we:
# Set the ReplayPosition
# Create the replacement Memtable and chain it from ourselves
# Set the BarrierToken
# Wait on the Barrier
# Flush to disk, using the ReplayPosition we have maintained
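The flush ordering above can be sketched roughly as follows. This is an illustration under assumptions, not the patch: the Memtable class is a hypothetical stand-in, the ReplayPosition is simplified to a single long, the BarrierToken is reduced to a boolean flag, and the barrier wait is passed in as a Runnable so the example stays self-contained. The point it tries to show is that the position only ever moves forward, so writers that finish during step 4 are still covered by the position used in step 5.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

/** Sketch only: hypothetical names, simplified types. */
final class FlushSketch {

    /** Stand-in for a Memtable, holding only the fields the steps mention. */
    static final class Memtable {
        final AtomicLong replayPosition = new AtomicLong(-1);
        final AtomicReference<Memtable> replacement = new AtomicReference<>();
        volatile boolean barrierTokenSet; // stand-in for the real BarrierToken

        /** CAS-based update that can raise the position but never lower it. */
        void advanceReplayPosition(long observed) {
            replayPosition.accumulateAndGet(observed, Math::max);
        }
    }

    /**
     * Runs the five flush steps in order and returns the replay position
     * that would be recorded with the flushed data.
     */
    static long flush(Memtable m, long commitLogPosition, Runnable awaitBarrier) {
        m.advanceReplayPosition(commitLogPosition);        // 1. set the ReplayPosition
        m.replacement.compareAndSet(null, new Memtable()); // 2. chain the replacement
        m.barrierTokenSet = true;                          // 3. set the BarrierToken
        awaitBarrier.run();                                // 4. wait on the barrier
        return m.replayPosition.get();                     // 5. flush with the maintained position
    }
}
```

For example, if a permitted writer advances the position to 42 while the flusher is waiting in step 4, flush(m, 10, ...) returns 42 rather than 10.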
Note that we will no longer perform any reference counting on the Memtables. I
will ensure that the mutation calls are all non-blocking, but, for correctness
and simplicity, I may make callers that flush or issue a new barrier take out
a (possibly spin-) lock on the MutationBarrier (MB) in order to issue a Token
atomically.
Thoughts?
\*CL.sync() will use the MB to fulfil the AppendLock role also. CL.add() will
release a hold on the MB that is related to CL only, to permit CL to proceed
immediately.
> Remove Table.switchLock
> -----------------------
>
> Key: CASSANDRA-5549
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
> Project: Cassandra
> Issue Type: Bug
> Reporter: Jonathan Ellis
> Assignee: Vijay
> Labels: performance
> Fix For: 2.1
>
> Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write
> path. ReentrantReadWriteLock is not lightweight, even if there is no
> contention per se between readers and writers of the lock (in Cassandra,
> memtable updates and switches).