[
https://issues.apache.org/jira/browse/CASSANDRA-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14569535#comment-14569535
]
Benedict commented on CASSANDRA-9533:
-------------------------------------
bq. matches the comment *less* well
You're being a bit selective in which parts you bold :)
"it will wait up to" implies _it will wait_ - which it would not, at all. The
reference to Postgres' behaviour also indicates it will actually wait that
period (although with separate sibling requirements, and on a microsecond time
horizon which is much more sensible). Further, our docs say "To avoid syncing
after every write, Cassandra groups the mutations into batches and *syncs
every* {{commitlog_batch_window_in_ms.}}" Not at least that often, but -
implicitly - exactly that often, as we do now.
bq. I thought we did it that way because we don't have a queue of operations
to peek into anymore, so it's difficult to provide the old behavior of "stop
sleeping when the queue is empty."
No, unfortunately the best I can find on the origin of this change is some
offline discussion between Brandon, Jeremiah and myself, which occurred 2-3
months after commit:
{quote}
Benedict Elliott Smith so, just figured out why that CL unit test @yukim
found went from hero to zero in 2.1
Benedict Elliott Smith SHANP on IRC has found the old Batch CL code doesn't
behave in the same way
Benedict Elliott Smith the window only serves as a maximum for buffering
Benedict Elliott Smith so if you get one record arrive, it will immediately
sync
Benedict Elliott Smith in 2.1 this changed. i'm not sure if I got the wrong
end of the stick (and it seems everyone else who's been discussing it since,
maybe), or if this is a mistake that's been present all along
Benedict Elliott Smith but we should probably decide which behaviour we want
to go with in 2.1
dr driftx I don't think it was a mistake, exactly, I've explained it
lots of times in training that way (pre-2.1 behavior)
Benedict Elliott Smith right
Benedict Elliott Smith so question is: did I just misunderstand, or did
somebody tell me to implement it this way? and ignoring the answer, do we want
to restore the old behaviour?
Benedict Elliott Smith "To avoid syncing after every write, Cassandra groups
the mutations into batches and syncs every {{commitlog_batch_window_in_ms. }}"
Benedict Elliott Smith anyway, i'm easy, and heading to bed. if nobody says
anything i'll leave it how it is :-)
Jeremiah D Jordan As long as the window time is the max time between
syncs and writes block until they sync. I think your code is fine. As that is
how we have documented it working all along
Benedict Elliott Smith @jd my version has it as the exact time between
syncs
{quote}
So it is not at all clear why it happened. My recollection was that we
discussed it and decided to normalise on the doc behaviour, but I cannot find a
reference to that, so it is possible I simply implemented it how the docs
described it, and you reviewed it with the same lens. Either way, it was
decided to let the change stand due to it matching the doc descriptions and the
lack of further feedback.
We can certainly wait for the "queue" to empty, by issuing Barriers on the
OpOrder; we already issue one and wait for any operations active at that moment
to complete, which is probably good enough. If we want to wait until _none_ can
be active, we can simply check, on completion of the barrier, whether any are
now running and issue another if there are.
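
To make the "issue another barrier if anything is still running" idea concrete, here is a rough sketch. It does not use the real OpOrder class: {{ToyOpOrder}}, {{Group}}, {{issueBarrierAndAwait}} and {{drain}} are hypothetical names invented for illustration, and the model assumes a single syncer thread issues the barriers.

```java
// Toy model only: ToyOpOrder is a hypothetical stand-in, NOT Cassandra's OpOrder.
final class BarrierDrainSketch {

    /** Tracks in-flight writes in groups; issuing a barrier closes the current group. */
    static final class ToyOpOrder {
        /** One group of writes that started between two barriers. */
        static final class Group {
            private int running; // guarded by the enclosing ToyOpOrder's monitor
        }

        private Group current = new Group();

        /** A write starts: it joins whichever group is current. */
        synchronized Group start() {
            current.running++;
            return current;
        }

        /** A write finishes: it leaves its group and wakes a waiting syncer. */
        synchronized void finish(Group group) {
            group.running--;
            notifyAll();
        }

        /**
         * Issue a barrier and wait for every write that was active at that
         * moment. Returns how many writes started in the meantime and are
         * still running. Assumes only one syncer thread ever calls this.
         */
        synchronized int issueBarrierAndAwait() {
            Group old = current;
            current = new Group(); // later writes land behind the barrier
            try {
                while (old.running > 0) {
                    wait(); // releases the monitor so writers can start/finish
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while awaiting barrier", e);
            }
            return current.running;
        }
    }

    /** Keep issuing barriers until one completes with nothing running; returns the barrier count. */
    static int drain(ToyOpOrder order) {
        int barriers = 0;
        int stillRunning;
        do {
            stillRunning = order.issueBarrierAndAwait();
            barriers++;
        } while (stillRunning > 0);
        return barriers;
    }

    public static void main(String[] args) {
        ToyOpOrder order = new ToyOpOrder();
        ToyOpOrder.Group g = order.start();
        order.finish(g);
        System.out.println("barriers issued: " + drain(order));
    }
}
```

Note that in this sketch the loop only ends once a barrier completes with no write running, so under sustained write load it could keep going for a long time; the single barrier we issue today does not have that problem.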
> Make batch commitlog mode easier to tune
> ----------------------------------------
>
> Key: CASSANDRA-9533
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9533
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Jonathan Ellis
> Assignee: Benedict
> Fix For: 3.x
>
>
> As discussed in CASSANDRA-9504, 2.1 changed commitlog_sync_batch_window_in_ms
> from a maximum time to wait between fsync to the minimum time, so one must be
> very careful to keep it small enough that most writers aren't kept waiting.