[
https://issues.apache.org/jira/browse/CASSANDRA-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16609099#comment-16609099
]
Jason Brown commented on CASSANDRA-14503:
-----------------------------------------
Patch available here:
||14503||
|[branch|https://github.com/jasobrown/cassandra/tree/14503]|
|[utests &
dtests|https://circleci.com/gh/jasobrown/workflows/cassandra/tree/14503]|
||
Additionally, I've [created a Pull
Request|https://github.com/apache/cassandra/pull/264] for review, as well.
Note: this patch will need to be rebased when CASSANDRA-13630 is committed, and
incorprate the changes ChannelWriter for large messages, but that should not
affect this patch much (I've been keeping that in mind as I worked on this)
- OutboundMessagingConnection changes
-- All producer threads queue messages into the backlog, and messages are only
consumed by a task on a fixed thread (the event loop). Producers will contend
to schedule the consumer,
but have no further involvement in sending a message (unlike the current
implementation).
-- All netty-related activity (setting up a remote connection,
connection-related callbacks and time outs, consuming form the backlog and
writing to the channel and associated callbacks)
are all handled on the event loop. OutboundMessagingConnection gets a reference
to a event loop in it's constructor, and uses that for the duration of it's
lifetime.
-- Finally forward-ported the queue bounding functionality of CASSANDRA-13265.
In short, we want to limit the size of queued messages in order to not OOM.
Thus, we schedule a task for the consumer thread
that examines the queue looking for elements to prune. Further, I've added a
naive upper bound to the queue so that producers drop messages before enqueuing
if the backlog is in a *really* bad state.
@djoshi3 has recomended bounding by message size rather than by message count,
which I agree with, but propose saving that for a followup ticket.
-- Cleaner, more documented, and better tested State machine to manage state
transitions for the class.
- ChannelWriter and MessageOutHandler became much simpler as we can control the
flush behaviors from the OMC (instead of the previous complicated CW/MOH dance)
because we're already on the event loop
when consuming from the backlog and writing to the channel.
- I was able to clean up/remove a bunch of extra code due to this
simplification, as well (ExpiredException, OutboundMessagingParameters,
MessageResult)
- Updated all the javadoc documentation for these changes (mostly OMC and
ChannelWriter)
> Internode connection management is race-prone
> ---------------------------------------------
>
> Key: CASSANDRA-14503
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14503
> Project: Cassandra
> Issue Type: Bug
> Components: Streaming and Messaging
> Reporter: Sergio Bossa
> Assignee: Jason Brown
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Following CASSANDRA-8457, internode connection management has been rewritten
> to rely on Netty, but the new implementation in
> {{OutboundMessagingConnection}} seems quite race prone to me, in particular
> on those two cases:
> * {{#finishHandshake()}} racing with {{#close()}}: i.e. in such case the
> former could run into an NPE if the latter nulls the {{channelWriter}} (but
> this is just an example, other conflicts might happen).
> * Connection timeout and retry racing with state changing methods:
> {{connectionRetryFuture}} and {{connectionTimeoutFuture}} are cancelled when
> handshaking or closing, but there's no guarantee those will be actually
> cancelled (as they might be already running), so they might end up changing
> the connection state concurrently with other methods (i.e. by unexpectedly
> closing the channel or clearing the backlog).
> Overall, the thread safety of {{OutboundMessagingConnection}} is very
> difficult to assess given the current implementation: I would suggest to
> refactor it into a single-thread model, where all connection state changing
> actions are enqueued on a single threaded scheduler, so that state
> transitions can be clearly defined and checked.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]