[ 
https://issues.apache.org/jira/browse/CASSANDRA-5549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13835123#comment-13835123
 ] 

Benedict commented on CASSANDRA-5549:
-------------------------------------

I suggest the following (somewhat complex-seeming) approach, building on my 
patch for 3578:

We extract the CLS.Allocation object into a CommitState object that is 
allocated in CFS.apply() prior to performing CL.add(). The CLS.AppendLock 
object is rewritten as a more specialised primitive, which for now we will 
call MutationBarrier, and is extracted along with CommitState.

MutationBarrier will be a synchronisation primitive with two roles. First, it 
permits issuing periodic barriers, each of which ensures that every operation 
started prior to the barrier has completed. Second, it can create a token on 
an about-to-be-issued barrier; once the barrier *is* issued, operations started 
after it see the token and know not to interfere with any state of an object 
that is using the barrier. That description is probably unclear, but the steps 
CFS.apply() takes when using it may clarify it:

# Allocate CommitState, register operation with MutationBarrier
# Call CL.add(): on exit, it has updated CommitState with the segment and 
position of the replay position.*
# Check the Memtable's BarrierToken to see whether we are permitted to modify it: 
#* if it's absent we simply make our modification and scoot;
#* if it's present and permits us to make our modification, we do so but ALSO 
update a ReplayPosition property (with a CAS, ensuring it is >= the one we have 
in CommitState)
#* if it's present and does not permit us to modify, we look up its replacement 
Memtable and repeat
# Release our hold on the MutationBarrier, signalling any waiters
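
The memtable check in step 3 could look roughly like the sketch below. This is 
only an illustration under stated assumptions: MemtableSketch, the group ids, 
lastPermittedGroup and the use of a single long for the replay position are all 
hypothetical stand-ins, not existing Cassandra classes or fields.

```java
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of step 3 of CFS.apply(); none of these names exist in Cassandra.
final class MemtableSketch {
    /** Assumed token shape: operations in groups <= lastPermittedGroup may still write. */
    static final class BarrierToken {
        final long lastPermittedGroup;
        BarrierToken(long lastPermittedGroup) { this.lastPermittedGroup = lastPermittedGroup; }
    }

    final ConcurrentSkipListMap<String, String> data = new ConcurrentSkipListMap<>();
    // A single long stands in for the (segment, position) pair of a ReplayPosition.
    final AtomicLong replayPosition = new AtomicLong(-1);
    volatile BarrierToken token;          // null until a flush of this memtable begins
    volatile MemtableSketch replacement;  // assumed to be chained before the token is set

    /** Applies one write and returns the memtable that accepted it. */
    static MemtableSketch apply(MemtableSketch head, long opGroup, long commitPosition,
                                String key, String value) {
        MemtableSketch m = head;
        while (true) {
            BarrierToken t = m.token;
            if (t == null) {
                // Token absent: make the modification and leave.
                m.data.put(key, value);
                return m;
            }
            if (opGroup <= t.lastPermittedGroup) {
                // Token present and permits us: modify, and also raise the replay
                // position (a CAS loop) so it is >= the one in our CommitState.
                m.data.put(key, value);
                m.replayPosition.accumulateAndGet(commitPosition, Math::max);
                return m;
            }
            // Token present and does not permit us: repeat on the replacement.
            m = m.replacement;
        }
    }
}
```

The loop never blocks: a writer either modifies the memtable it is looking at 
or moves on to its replacement.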

When we flush a memtable, we:
# Set the ReplayPosition
# Create the replacement Memtable and chain it from ourselves
# Set the BarrierToken
# Wait on the Barrier
# Flush to disk, using the ReplayPosition we have maintained
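
The ordering of the flush steps above can be sketched as follows. Everything 
here is an assumption made for illustration: FlushSketch, its Memtable and 
Barrier types, and modelling the BarrierToken as a group id returned by 
issue() are hypothetical, and "flush to disk" is reduced to recording the 
replay position that would be written.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the flush sequence; not Cassandra code.
final class FlushSketch {
    static final class Memtable {
        // A single long stands in for the (segment, position) pair of a ReplayPosition.
        final AtomicLong replayPosition = new AtomicLong(-1);
        volatile Memtable replacement;  // next memtable in the chain
        volatile Long barrierToken;     // null until a flush starts
        volatile long flushedAt = -1;   // what the "disk write" recorded
    }

    /** Assumed hook for the barrier primitive. */
    interface Barrier {
        long issue();            // returns the last group allowed to touch the old memtable
        void await(long group);  // returns once all operations up to that group have finished
    }

    static Memtable flush(Memtable old, long commitLogPosition, Barrier barrier) {
        // 1. Set the ReplayPosition (never moving it backwards).
        old.replayPosition.accumulateAndGet(commitLogPosition, Math::max);
        // 2. Create the replacement Memtable and chain it from the old one.
        Memtable next = new Memtable();
        old.replacement = next;
        // 3. Set the BarrierToken; writers that see it and are not permitted
        //    follow the chain to the replacement.
        long group = barrier.issue();
        old.barrierToken = group;
        // 4. Wait on the barrier so that all permitted writers have finished.
        barrier.await(group);
        // 5. Flush, using the replay position maintained by flusher and writers.
        old.flushedAt = old.replayPosition.get();
        return next;
    }
}
```

Chaining the replacement before publishing the token matters in this sketch: 
a writer that is turned away by the token must always find somewhere to go.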

Note that we will no longer perform any reference counting on the Memtables. I 
will ensure that the mutation calls are all non-blocking, but for correctness 
and simplicity I may make those attempting to flush/issue a new barrier take 
out a (possibly spin-) lock on the MB in order to issue a Token atomically.
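
To make the barrier semantics concrete, here is a minimal sketch of such a 
primitive. It is deliberately lock-based throughout (a plain monitor), which 
is simpler than the non-blocking mutation path intended above; only the name 
MutationBarrier comes from this proposal, and Group, start(), finish(), 
issue(), isComplete() and await() are hypothetical.

```java
// Hypothetical, lock-based sketch; the real primitive is meant to keep writers non-blocking.
final class MutationBarrier {
    /** All operations started between two consecutive barriers. */
    final class Group {
        final long id;
        private Group prev;   // earlier group that may still have running operations
        private int running;  // operations started in this group and not yet finished

        private Group(long id, Group prev) { this.id = id; this.prev = prev; }
    }

    private Group current = new Group(0, null);

    /** Register an operation with the current group. */
    synchronized Group start() {
        current.running++;
        return current;
    }

    /** Release the operation's hold and signal any waiters. */
    synchronized void finish(Group group) {
        group.running--;
        notifyAll();
    }

    /** Issue a barrier: later operations join a fresh group; the closed group is returned. */
    synchronized Group issue() {
        Group closed = current;
        current = new Group(closed.id + 1, closed);
        return closed;
    }

    /** True once every operation started before the barrier has finished. */
    synchronized boolean isComplete(Group closed) {
        for (Group g = closed; g != null; g = g.prev)
            if (g.running > 0) return false;
        closed.prev = null;  // everything earlier is done, so drop the chain
        return true;
    }

    /** Block until every operation started before the barrier has finished. */
    synchronized void await(Group closed) throws InterruptedException {
        while (!isComplete(closed)) wait();
    }
}
```

A writer would call start() before CL.add() and finish() after touching the 
memtable; a flusher would call issue() and then await() on the returned group. 
Operations started after issue() land in a new group and do not delay the 
waiter.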

Thoughts?

\*CL.sync() will use the MB to fulfil the AppendLock role also. CL.add() will 
release a hold on the MB that is related to CL only, to permit CL to proceed 
immediately.

> Remove Table.switchLock
> -----------------------
>
>                 Key: CASSANDRA-5549
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5549
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jonathan Ellis
>            Assignee: Vijay
>              Labels: performance
>             Fix For: 2.1
>
>         Attachments: 5549-removed-switchlock.png, 5549-sunnyvale.png
>
>
> As discussed in CASSANDRA-5422, Table.switchLock is a bottleneck on the write 
> path.  ReentrantReadWriteLock is not lightweight, even if there is no 
> contention per se between readers and writers of the lock (in Cassandra, 
> memtable updates and switches).



