[
https://issues.apache.org/jira/browse/CASSANDRA-19979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886978#comment-17886978
]
Jon Haddad commented on CASSANDRA-19979:
----------------------------------------
Unfortunately, it's in a completely different location. The logic we need for
streaming is isolated to CassandraCompressedStreamWriter, which isn't hit
during compaction.
My original goal was to do several things in CASSANDRA-15452, but given how
long it's taken to get merged, my thinking is to merge the improvements we
already have and do this as a follow-up.
> Use internal buffer on streaming slow path
> ------------------------------------------
>
> Key: CASSANDRA-19979
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19979
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Jon Haddad
> Priority: Normal
> Attachments: image-2024-10-04-12-40-26-727.png
>
>
> CASSANDRA-15452 is introducing an internal buffer to compaction in order to
> increase throughput while reducing IOPS. We can do the same thing with our
> streaming slow path. There's a common misconception that the overhead comes
> from serde, but I've found that on a lot of devices it is due to our read
> patterns. This is most commonly found on non-NVMe drives,
> especially disaggregated storage such as EBS where the latency is higher and
> more variable.
> Attached is a perf profile showing that the cost of streaming is dominated by
> pread. The team I was working with found they could stream only 12MB
> per streaming session. Reducing the number of read operations by using
> internal buffered reads should improve this by at least 3-5x, and should
> also reduce CPU overhead by making fewer system calls.
>
> !image-2024-10-04-12-40-26-727.png!
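
To make the read-pattern point in the description concrete, here is a minimal
sketch of the general idea. It is not Cassandra code and does not reflect how
CassandraCompressedStreamWriter is actually written. The class name, method
names, and sizes are all hypothetical. It compares one positional read per
small chunk against filling a larger internal buffer with a single read and
handing out chunks from memory. FileChannel.read(ByteBuffer, long) is used
because on Linux it is typically backed by pread.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; names and sizes are assumptions.
class StreamingReadSketch {

    /** Reads the file chunk by chunk; every chunk costs one positional read. */
    static long readsWithoutBuffer(Path file, int chunkSize) throws IOException {
        long calls = 0;
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer chunk = ByteBuffer.allocate(chunkSize);
            long pos = 0;
            long size = ch.size();
            while (pos < size) {
                chunk.clear();
                int n = ch.read(chunk, pos); // one read call per chunk
                calls++;
                if (n <= 0) break;
                pos += n;
            }
        }
        return calls;
    }

    /** Fills a larger internal buffer per read, then serves chunks from memory. */
    static long readsWithBuffer(Path file, int chunkSize, int bufferSize) throws IOException {
        long calls = 0;
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(bufferSize);
            byte[] chunk = new byte[chunkSize];
            long pos = 0;
            long size = ch.size();
            while (pos < size) {
                buf.clear();
                int n = ch.read(buf, pos); // one read call per buffer fill
                calls++;
                if (n <= 0) break;
                pos += n;
                buf.flip();
                // The consumer still sees chunk-sized pieces, with no extra I/O.
                while (buf.hasRemaining()) {
                    int len = Math.min(chunk.length, buf.remaining());
                    buf.get(chunk, 0, len);
                }
            }
        }
        return calls;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("streaming-sketch", ".bin");
        try {
            Files.write(file, new byte[1 << 20]); // 1 MiB of zeros as test data
            System.out.println("unbuffered reads: " + readsWithoutBuffer(file, 4 * 1024));
            System.out.println("buffered reads:   " + readsWithBuffer(file, 4 * 1024, 64 * 1024));
        } finally {
            Files.delete(file);
        }
    }
}
```

With a 4 KiB chunk and a 64 KiB buffer, the buffered variant should issue
roughly one sixteenth as many read calls for the same data. How much that
helps in practice depends on the device; the description suggests the gain
is largest where per-read latency is high, such as EBS.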