[
https://issues.apache.org/jira/browse/CASSANDRA-15066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16860129#comment-16860129
]
Aleksey Yeschenko commented on CASSANDRA-15066:
-----------------------------------------------
Agreed on readiness to commit in current state. To complete the list
(non-exhaustively), below are some notable changes on my part.
To start with, the largest change has been the redesign of large message
handling, suggested by [~xedin] during review. Previously, a companion
thread deserialized the large message as new frames kept coming. Now we
accumulate all the frames needed to deserialize the large message, and then
schedule a task directly on that message's verb's {{Stage}}: a task that
deserializes the message and executes the verb handler in one go. This not
only simplifies the logic in {{InboundMessageHandler}}, but also increases
locality and reduces the lifetime of large messages on the heap.
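As a rough illustration of the accumulate-then-schedule approach described above, here is a minimal sketch. It is not the code from the patch: {{LargeMessageAccumulator}}, its fields, and the use of a plain {{Executor}} in place of a {{Stage}} are all assumptions made for the example, and the "deserialization" is just a UTF-8 decode.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

// Hypothetical sketch, not the actual InboundMessageHandler logic.
// Frames are buffered until the whole message has arrived; only then is a
// single task submitted that both deserializes and handles the message.
final class LargeMessageAccumulator {
    private final int expectedSize;             // total payload size in bytes
    private final Executor stage;               // stands in for the verb's Stage
    private final Consumer<String> verbHandler; // stands in for the verb handler
    private final List<ByteBuffer> frames = new ArrayList<>();
    private int received;

    LargeMessageAccumulator(int expectedSize, Executor stage, Consumer<String> verbHandler) {
        this.expectedSize = expectedSize;
        this.stage = stage;
        this.verbHandler = verbHandler;
    }

    // Called by a single network thread for each arriving frame.
    // Returns true once the message is complete and the task has been scheduled.
    boolean onFrame(ByteBuffer frame) {
        frames.add(frame);
        received += frame.remaining();
        if (received < expectedSize)
            return false; // keep accumulating; no deserialization work yet
        // Deserialize and handle in one go, on the stage's own thread.
        stage.execute(() -> verbHandler.accept(deserialize()));
        return true;
    }

    // Assumption for the example: the payload is simply UTF-8 text.
    private String deserialize() {
        ByteBuffer all = ByteBuffer.allocate(received);
        for (ByteBuffer f : frames)
            all.put(f.duplicate()); // duplicate() leaves the frame's position untouched
        all.flip();
        return StandardCharsets.UTF_8.decode(all).toString();
    }
}
```

With a same-thread executor such as {{Runnable::run}}, feeding two 5-byte frames to an accumulator created with an expected size of 10 invokes the handler exactly once, after the second frame. The sketch does not guard against frames that overshoot the expected size.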
Other changes include:
- Fixed a bug with double-release of permits on deserialization exceptions in
{{InboundMessageHandler}}
- Fixed forgetting to signal a {{WaitQueue}} when releasing permits back
after a partial allocation failure
- Fixed {{FrameDecoder}} not propagating {{channelClose()}} to
{{InboundMessageHandler}}
- Fixed several legacy handshake issues
- Fixed legacy LZ4 frame encoder and decoder performance (broken Netty xxhash
behaviour)
- Fixed mutation forwarding to remote DCs mistakenly including the picked
forwarder node itself (spotted by [~jmeredithco])
- Started immediately expiring callbacks for all forwarded mutation
destinations when failing to send to the forwarder
- Introduced inbound backpressure counters (throttled count and nanos)
- Started treating all deserialize exceptions as non-fatal, to prevent
unnecessary message loss and reconnects
- Factored out header fields from {{Message}} into a standalone {{Header}}
class to prevent double-deserialization of some fields and to clean up callback
signatures
- Introduced max message size config param, akin to max mutation size - set to
endpoint reserve capacity by default
- Introduced an MPSC linked queue with volatile offer semantics and
non-blocking {{poll()}} and {{drain()}}, and used it to fix visibility issues or
blocking behaviour in {{OutboundMessageQueue}},
{{InboundMessageHandler.WaitQueue}}, and Netty's event loops; then used it to
minimise the amount of signalling done when {{InboundMessageHandler}} instances
get registered on the wait queue
- Refactored callbacks and callback map ({{RequestCallbacks}}) to allow reusing
the same request ID for multiple messages, got rid of an extra object per entry
- Building on the refactoring above, reduced and mostly eliminated allocation
of extra {{Message}} objects, which saves on {{serializedSize}}
invocations and some garbage
- Reworked integration between {{InboundMessageHandler}} and {{FrameDecoder}}
for clarity and performance
- Fixed {{FrameDecoder}} over-issuing {{channel.read()}} calls in some
circumstances
- Refactored {{InboundMessageHandler}} frame handling and callbacks
- Pushed processing exception handling to callbacks/message sink
- Added a lot of comments/documentation and tests, made various logging
improvements, and gave threads better names
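To make the MPSC queue item above more concrete, here is a minimal sketch of a multi-producer, single-consumer linked queue with a volatile-publishing {{offer()}} and non-blocking {{poll()}} and {{drain()}}. It follows the well-known linked-node design with a stub node and is not the implementation from the patch; the class name {{MpscLinkedQueue}} and its layout are assumptions made for the example.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical sketch of an MPSC linked queue; not the queue from the patch.
// Any number of threads may call offer(); only one thread may call poll()/drain().
final class MpscLinkedQueue<E> {
    private static final class Node<E> {
        E item;
        volatile Node<E> next; // volatile write in offer() publishes the node and its item
        Node(E item) { this.item = item; }
    }

    private final AtomicReference<Node<E>> tail; // producers swap themselves in here
    private Node<E> head;                        // touched by the consumer only

    MpscLinkedQueue() {
        Node<E> stub = new Node<>(null);
        head = stub;
        tail = new AtomicReference<>(stub);
    }

    // Never blocks or spins: one atomic swap, then one volatile write.
    void offer(E item) {
        Node<E> node = new Node<>(Objects.requireNonNull(item));
        Node<E> prev = tail.getAndSet(node);
        prev.next = node;
    }

    // Non-blocking: returns null if the queue is empty, or if a producer has
    // swapped the tail but not yet linked its node.
    E poll() {
        Node<E> next = head.next;
        if (next == null)
            return null;
        E item = next.item;
        next.item = null; // the consumed node becomes the new stub
        head = next;
        return item;
    }

    // Passes every currently visible item to the consumer; returns how many.
    int drain(Consumer<? super E> consumer) {
        int n = 0;
        for (E e; (e = poll()) != null; n++)
            consumer.accept(e);
        return n;
    }
}
```

Because {{poll()}} may return null while an {{offer()}} is still in flight, a consumer built on a queue like this needs some separate signal to retry later rather than treating null as "permanently empty".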
Also, some changes made by [~ifesdjeen] directly - in addition to his many
helpful review corrections:
- Introduced in-JVM proxy to test expirations and closure, added tests for
inbound expirations
- Fixed a bug in the outbound virtual table (overflow_count and overflow_bytes
values were swapped), and the same bug in outbound metrics
- Introduced {{UnknownColumnsException}} to more places instead of
{{RuntimeException}}
- Fixed {{Message.Builder.builder(Message)}} to copy over original flags
> Improvements to Internode Messaging
> -----------------------------------
>
> Key: CASSANDRA-15066
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15066
> Project: Cassandra
> Issue Type: Improvement
> Components: Messaging/Internode
> Reporter: Benedict
> Assignee: Benedict
> Priority: High
> Fix For: 4.0
>
> Attachments: 20k_backfill.png, 60k_RPS.png,
> 60k_RPS_CPU_bottleneck.png, backfill_cass_perf_ft_msg_tst.svg,
> baseline_patch_vs_30x.png, increasing_reads_latency.png,
> many_reads_cass_perf_ft_msg_tst.svg
>
>
> CASSANDRA-8457 introduced asynchronous networking to internode messaging, but
> there have been several follow-up endeavours to improve some semantic issues.
> CASSANDRA-14503 and CASSANDRA-13630 are the latest such efforts, and were
> combined some months ago into a single overarching refactor of the original
> work, to address some of the issues that have been discovered. Given the
> criticality of this work to the project, we wanted to bring some more eyes to
> bear to ensure the release goes ahead smoothly. In doing so, we uncovered a
> number of issues with messaging, some of which are long-standing, that we felt
> needed to be addressed. This patch widens the scope of CASSANDRA-14503 and
> CASSANDRA-13630 in an effort to close the book on the messaging service, at
> least for the foreseeable future.
> The patch includes a number of clarifying refactors that touch outside of the
> {{net.async}} package, and a number of semantic changes to the {{net.async}}
> package itself. We believe it clarifies the intent and behaviour of the
> code while improving system stability, which we will outline in comments
> below.
> https://github.com/belliottsmith/cassandra/tree/messaging-improvements