[
https://issues.apache.org/jira/browse/CASSANDRA-10202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15340354#comment-15340354
]
Benedict edited comment on CASSANDRA-10202 at 6/20/16 8:40 PM:
---------------------------------------------------------------
We also ideally want lock-free swapping-in of the new segment, no? Currently
we don't have it, but until we reach _pure_-TPC (probably never) en route fewer
application threads exposes us to a higher risk of gumming up the system.
But yes, we could do full mutex, but it is still significantly safer to move it
all into one structure where that is well managed. The prior art has it
clumsily littered amongst all the other code. Thing is, once you do that you
essentially have the new algorithm, just with one of the methods wrapped in an
unnecessary mutex call.
I do agree the code should be tested better, but that is true of everything -
the current code is trusted only on the word of commitlog-stress, making this
as trustworthy, but it is always better to improve that.
I would however reiterate I don't necessarily think the patch entirely warrants
inclusion, I just want the discussion to be a bit more informed.
On the topic of generic linked-lists, I have two view points: 1) I've attempted
to integrate any number of generic linked-lists, and they are universally
rejected\*, so I gave up and tried to stick to hyper-safety-oriented structures
that have functionality hamstrung as far as possible in light of the use case
constraints; 2) those constraints matter for readability and function, too, and
you can end up with a more powerful linked-list for your situation despite a
less powerful overall structure, as well as one that tells you more about the
behaviour of its users.
I'd point out that this whole code area is massively concurrent, as is the
whole project. This linked-list is by far the easiest part of this code, and
most of the project, to reason about concurrency-wise. If we do not trust
ourselves to write it, we should probably start introspecting about what that
means.
NB: I must admit I haven't read the code in question for a while, and am typing
this all from memory, in bed recovering from flu, so I might just be delirious.
It could all be terrible.
\* Notably, I can recall at least two serious bugs that would have been avoided
with one of these structures has they been included when proferred. One
occurred in this code, the other was down to a pathological and unexpected
behaviour in ConcurrentLinkedQueue, the most battle-tested structure around (it
was an understood behaviour by the author, just undocumented and unexpected).
was (Author: benedict):
We also ideally want lock-free swapping-in of the new segment, no? Currently
we don't have it, but until we reach _pure_-TPC (probably never) fen route ewer
application threads exposes us to a higher risk of gumming up the system.
But yes, we could do full mutex, but it is still significantly safer to move it
all into one structure where that is well managed. The prior art has it
clumsily littered amongst all the other code. Thing is, once you do that you
essentially have the new algorithm, just with one of the methods wrapped in an
unnecessary mutex call.
I do agree the code should be tested better, but that is true of everything -
the current code is trusted only on the word of commitlog-stress, making this
as trustworthy, but it is always better to improve that.
I would however reiterate I don't necessarily think the patch entirely warrants
inclusion, I just want the discussion to be a bit more informed.
On the topic of generic linked-lists, I have two view points: 1) I've attempted
to integrate any number of generic linked-lists, and they are universally
rejected\*, so I gave up and tried to stick to hyper-safety-oriented structures
that have functionality hamstrung as far as possible in light of the use case
constraints; 2) those constraints matter for readability and function, too, and
you can end up with a more powerful linked-list for your situation despite a
less powerful overall structure, as well as one that tells you more about the
behaviour of its users.
I'd point out that this whole code area is massively concurrent, as is the
whole project. This linked-list is by far the easiest part of this code, and
most of the project, to reason about concurrency-wise. If we do not trust
ourselves to write it, we should probably start introspecting about what that
means.
NB: I must admit I haven't read the code in question for a while, and am typing
this all from memory, in bed recovering from flu, so I might just be delirious.
It could all be terrible.
\* Notably, I can recall at least two serious bugs that would have been avoided
with one of these structures has they been included when proferred. One
occurred in this code, the other was down to a pathological and unexpected
behaviour in ConcurrentLinkedQueue, the most battle-tested structure around (it
was an understood behaviour by the author, just undocumented and unexpected).
> simplify CommitLogSegmentManager
> --------------------------------
>
> Key: CASSANDRA-10202
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10202
> Project: Cassandra
> Issue Type: Improvement
> Components: Local Write-Read Paths
> Reporter: Jonathan Ellis
> Assignee: Branimir Lambov
> Priority: Minor
>
> Now that we only keep one active segment around we can simplify this from the
> old recycling design.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)