[
https://issues.apache.org/jira/browse/CASSANDRA-14746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204328#comment-17204328
]
Vinay Chella commented on CASSANDRA-14746:
------------------------------------------
Thank you for following up on this, [~pauloricardomg].
{quote}a) Is work on this issue still active?
{quote}
Yes, it was active until I took a long break from work for personal reasons.
As you can see in CASSANDRA-15181 and CASSANDRA-14764, I started some of this
work but had to put it on hold. I am getting back in motion and should be able
to make progress in the coming weeks.
{quote}
b) Can we complete this issue once all subtasks are completed or are there more
subtasks to be added?
{quote}
Quoting from the description: "The goal is that 4.0 should have better latency,
more throughput, fewer threads, fewer context switches, less GC allocation, and
faster recovery time." I would say it is all about building confidence in 4.0.
I can sign up to add more tasks as we make progress and gather findings from
CASSANDRA-14747, CASSANDRA-15181, and CASSANDRA-14764.
> Ensure Netty Internode Messaging Refactor is Solid
> --------------------------------------------------
>
> Key: CASSANDRA-14746
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14746
> Project: Cassandra
> Issue Type: Improvement
> Components: Legacy/Streaming and Messaging
> Reporter: Joey Lynch
> Assignee: Joey Lynch
> Priority: Normal
> Labels: 4.0-QA
> Fix For: 4.0-beta
>
>
> Before we release 4.0 let's ensure that the internode messaging refactor is
> 100% solid. Because internode messaging is used in many code paths and is
> widely configurable, we have a large number of cluster configurations and
> test configurations that must be vetted.
> We plan to vary the following:
> * Version of Cassandra 3.0.17 vs 4.0-alpha
> * Cluster sizes with *multi-dc* deployments ranging from 6 - 100 nodes
> * Client request rates varying between 1k QPS and 100k QPS of varying sizes
> and shapes (BATCH, INSERT, SELECT point, SELECT range, etc ...)
> * Internode compression
> * Internode SSL (as well as openssl vs jdk)
> * Internode Coalescing options
> We are looking to measure the following as appropriate:
> * Latency distributions of reads and writes (lower is better)
> * Scaling limit, aka maximum throughput before violating p99 latency
> deadline of 10ms @ LOCAL_QUORUM, on a fixed hardware deployment for 100%
> writes, 100% reads and 50-50 writes+reads (higher is better)
> * Thread counts (lower is better)
> * Context switches (lower is better)
> * On-CPU time of tasks (longer periods without a context switch are better)
> * GC allocation rates / throughput for a fixed size heap (lower allocation
> is better)
> * Streaming recovery time for a single node failure, i.e. can Cassandra
> saturate the NIC
>
> The goal is that 4.0 should have better latency, more throughput, fewer
> threads, fewer context switches, less GC allocation, and faster recovery
> time. I'm putting Jason Brown as the reviewer since he implemented most of
> the internode refactor.
> Current collaborators driving this QA task: Dinesh Joshi, Jordan West, Joey
> Lynch (Netflix), Vinay Chella (Netflix)
> Owning committer(s): Jason Brown
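
The "scaling limit" metric in the description above (maximum throughput before violating the 10ms p99 deadline) could be computed from load-test results roughly as follows. This is only a sketch: the function names, the nearest-rank percentile method, and the shape of the input (a map from offered QPS to the latencies observed at that rate) are assumptions for illustration, not something specified in the ticket.

```python
def p99(latencies_ms):
    """Nearest-rank 99th percentile of a non-empty list of latencies (ms)."""
    ordered = sorted(latencies_ms)
    # 1-based nearest-rank index: ceil(0.99 * n), done in integer arithmetic.
    rank = -(-99 * len(ordered) // 100)
    return ordered[rank - 1]


def scaling_limit(runs, deadline_ms=10.0):
    """Highest offered QPS whose p99 latency stays within the deadline.

    `runs` is a hypothetical mapping {offered_qps: [latency_ms, ...]}.
    Rates are walked in increasing order and the search stops at the
    first rate that violates the deadline. Returns None if even the
    lowest rate violates it.
    """
    limit = None
    for qps in sorted(runs):
        if p99(runs[qps]) > deadline_ms:
            break
        limit = qps
    return limit
```

Running this once per workload mix (100% writes, 100% reads, 50-50) on the same hardware would give one scaling limit per mix to compare between versions.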