[
https://issues.apache.org/jira/browse/CASSANDRA-1991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12982300#action_12982300
]
Peter Schuller commented on CASSANDRA-1991:
-------------------------------------------
I started a proof-of-concept whereby maybeSwitchMemtable() would take an
additional boolean "forceCheckPoint", and forceFlush() and forceBlockingFlush()
would take a boolean "checkpoint". The idea is to share the code paths as much
as possible rather than introduce a new method. It depends on correctly
handling clean memtables (proceeding with the switch, but not writing an empty
file).
However, there are *lots* of calls to these. I looked at and changed all of
them in the main code base, but there is still quite a large number in the unit
tests. So I'm awaiting feedback/thoughts before going ahead; if this is
obviously not the right solution, there's no sense in going through all those
tests trying to determine whether they should force a checkpoint.
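To make the clean-memtable handling concrete, here is a toy sketch of the intended behavior (all names hypothetical, not the actual Cassandra code): a clean memtable with no checkpoint requested is a no-op, while a forced checkpoint still switches but skips writing an empty file.

```java
// Toy model of the proposed maybeSwitchMemtable(forceCheckpoint) behavior.
// All names are hypothetical; this is not Cassandra code.
class MemtableSwitchSketch {
    boolean dirty = false;   // does the current memtable hold any writes?
    int sstablesWritten = 0;

    // Returns true if the memtable was switched (i.e. a checkpoint advanced).
    boolean maybeSwitchMemtable(boolean forceCheckpoint) {
        if (!dirty && !forceCheckpoint)
            return false;        // clean and no checkpoint requested: skip
        if (dirty)
            sstablesWritten++;   // flush real data...
        // ...but a clean, forced switch writes no empty file.
        dirty = false;
        return true;
    }
}
```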
I'll attach the work-in-progress diff anyway, but note that it doesn't result
in a buildable codebase.
Thoughts?
> CFS.maybeSwitchMemtable() calls CommitLog.instance.getContext(), which may
> block, under flusher lock write lock
> ---------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-1991
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1991
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Peter Schuller
> Assignee: Peter Schuller
> Attachments: 1991-checkpointing-flush.txt, 1991-logchanges.txt,
> 1991-trunk-v2.txt, 1991-trunk.txt, 1991-v3.txt, 1991-v4.txt, trigger.py
>
>
> While investigating CASSANDRA-1955 I realized I was seeing very poor latencies
> for reasons that had nothing to do with flush_writers, even when using
> periodic commit log mode (and flush writers set ridiculously high, 500).
> It turns out blocked writes were slow because Table.apply() was spending lots
> of time (I can easily trigger seconds on moderate work-load) trying to
> acquire a flusher lock read lock ("flush lock millis" log printout in the
> logging patch I'll attach).
> That in turn is caused by CFS.maybeSwitchMemtable(), which acquires the
> flusher lock write lock.
> Bisecting further revealed that the offending line of code that blocked was:
> final CommitLogSegment.CommitLogContext ctx =
> writeCommitLog ? CommitLog.instance.getContext() : null;
> Indeed, CommitLog.getContext() simply returns currentSegment().getContext(),
> but does so by submitting a callable on the service executor. So
> independently of the flush writers, this can very easily block all writes
> (globally, for all CFs), and does.
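The blocking shape described above can be reproduced in miniature: calling get() on a Future submitted to a busy single-threaded executor stalls the caller for as long as the executor is occupied. This is a toy sketch under assumed names, not Cassandra code.

```java
import java.util.concurrent.*;

// Toy reproduction of the getContext() blocking pattern: the caller submits
// a callable to a single-threaded executor and waits for the result, so any
// work already queued on that executor (e.g. an fsync) delays the caller.
// All names here are hypothetical; this is not Cassandra code.
class BlockingContextDemo {
    // Returns { context value, milliseconds blocked waiting for it }.
    static long[] getContextWhileExecutorBusy() throws Exception {
        ExecutorService commitLogExecutor = Executors.newSingleThreadExecutor();
        // Simulate the commit log executor being busy.
        commitLogExecutor.submit(() -> {
            try { Thread.sleep(500); } catch (InterruptedException e) { }
        });

        long start = System.nanoTime();
        // Same shape as CommitLog.instance.getContext(): submit a callable
        // and wait for its result.
        Future<Long> ctx = commitLogExecutor.submit(() -> 42L);
        long value = ctx.get();  // blocks until the sleeping task finishes
        long waitedMs = (System.nanoTime() - start) / 1_000_000;
        commitLogExecutor.shutdown();
        return new long[] { value, waitedMs };
    }
}
```

In the real code path this wait happens while holding the flusher write lock, which is why every Table.apply() trying to take the read lock stalls behind it.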
> I'll attach a file that is an independent Python script that triggers it on
> my Mac OS laptop (with an Intel SSD, which is why I was particularly
> surprised) (it assumes CPython, out-of-the-box-or-almost Cassandra on
> localhost that isn't in a cluster, and it will drop/recreate a keyspace
> called '1955').
> I'm also attaching, just FYI, the patch with log entries that I used while
> tracking it down.
> Finally, I'll attach a patch with a suggested solution of keeping track of
> the latest commit log with an AtomicReference (as an alternative to
> synchronizing all access to segments). With that patch applied, latencies are
> not affected by my trigger case like they were before. There are some
> sub-optimal > 100 ms cases on my test machine, but for other reasons. I'm no
> longer able to trigger the extremes.
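A minimal sketch of the AtomicReference idea (names and structure assumed, not the actual patch): readers obtain the current segment's context with a plain atomic read instead of a round-trip through the service executor.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch of tracking the latest commit log segment in an
// AtomicReference, so getContext() never touches the executor. Names are
// assumed; this is not the actual patch.
class CommitLogSketch {
    static class Segment {
        final long position;
        Segment(long position) { this.position = position; }
        long getContext() { return position; }  // current replay position
    }

    private final AtomicReference<Segment> current =
            new AtomicReference<>(new Segment(0));

    // Non-blocking: an atomic read instead of executor.submit(...).get().
    long getContext() { return current.get().getContext(); }

    // Only segment roll-over needs to publish a new reference.
    void rollSegment(long newPosition) { current.set(new Segment(newPosition)); }
}
```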
--
This message is automatically generated by JIRA.