[jira] [Comment Edited] (CASSANDRA-15452) Improve disk access patterns during compaction and range reads

Benedict Elliott Smith (Jira) Fri, 11 Apr 2025 04:21:07 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17943516#comment-17943516
 ]


Benedict Elliott Smith edited comment on CASSANDRA-15452 at 4/11/25 11:20 AM:
------------------------------------------------------------------------------

Hi, I have noticed this patch showing up in traces, allocating buffers within 
`clear`. It looks like `close` and `deallocateResources` both invoke `clear`, 
but `clear` is not designed to be invoked twice. If it is, the second 
invocation will allocate a new buffer before clearing it again.

Interestingly, I see allocations in the `deallocateResources` call (which 
appears to occur before `close`), suggesting that readers which never 
allocated a buffer may be allocating one just so it can be deallocated. I 
haven't investigated further to verify why this is happening, but we should 
probably confirm that we have a buffer to deallocate before doing any cleanup.
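
To illustrate the kind of guard suggested above, here is a minimal sketch. It 
is not the code from the patch: `LazyBufferReader`, its lazy `buffer()` 
accessor, the 4096-byte size and the `allocations` counter are all 
hypothetical, and the lazy accessor is only an assumption about why a second 
`clear` would allocate. The point is that a guarded `clear` is a no-op when no 
buffer exists, so calling it from both `deallocateResources` and `close` is 
harmless.

```java
import java.nio.ByteBuffer;

/** Hypothetical reader used only to illustrate the guard; not Cassandra's actual class. */
final class LazyBufferReader implements AutoCloseable {
    private ByteBuffer buffer;   // null until first use, and again after clear()
    int allocations = 0;         // demonstration counter, not part of any real API

    /** Lazy accessor: allocates on first use (assumed mechanism behind the extra allocation). */
    private ByteBuffer buffer() {
        if (buffer == null) {
            buffer = ByteBuffer.allocate(4096);
            allocations++;
        }
        return buffer;
    }

    void read() {
        buffer().put((byte) 1);
    }

    /** Problematic shape: goes through the accessor, so it allocates when no buffer exists. */
    void clearUnguarded() {
        buffer().clear();
        buffer = null;
    }

    /** Guarded shape: does nothing unless there is a buffer to release. */
    void clear() {
        if (buffer == null)
            return;
        buffer.clear();
        buffer = null;
    }

    void deallocateResources() {
        clear();
    }

    @Override
    public void close() {
        clear();
    }
}
```

With the guarded version, a reader that never read anything can go through 
`deallocateResources` and then `close` without allocating, whereas calling 
`clearUnguarded` twice on a fresh reader allocates a buffer each time.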


> Improve disk access patterns during compaction and range reads
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-15452
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15452
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Legacy/Local Write-Read Paths, Local/Compaction
>            Reporter: Jon Haddad
>            Assignee: Jordan West
>            Priority: Normal
>             Fix For: 5.0.4, 5.1
>
>         Attachments: ci_summary_jrwest_jwest-15452-5.0_153.html, everyfs.txt, 
> image-2024-11-22-16-17-23-194.png, image-2025-01-07-16-04-23-909.png, 
> image-2025-01-07-16-56-12-853.png, image-2025-01-07-16-57-29-134.png, 
> iostat-5.0-head.output, iostat-5.0-patched.output, iostat-ebs-15452.png, 
> iostat-ebs-head.png, iostat-instance-15452.png, iostat-instance-head.png, 
> results.txt, results_details_jrwest_jwest-15452-5.0_153.tar.xz, 
> screenshot-1.png, screenshot-2.png, screenshot-3.png, screenshot-4.png, 
> screenshot-5.png, screenshot-6.png, sequential.fio, throughput-1.png, 
> throughput.png
>
>          Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> On read-heavy workloads Cassandra performs much better when using a low 
> read-ahead setting. In my tests I've seen a 5x improvement in throughput and 
> more than a 50% reduction in latency. However, I've also observed that it 
> can have a negative impact on compaction and streaming throughput. It 
> especially hurts cloud environments, where small reads incur high costs in 
> IOPS due to tiny requests.
>  # We should investigate using POSIX_FADV_DONTNEED on files we're compacting 
> to see if we can improve performance and reduce page faults. 
>  # This should be combined with an internal read-ahead style buffer that 
> Cassandra manages, similar to a BufferedInputStream but with our own 
> machinery.  This buffer should read fairly large blocks of data off disk at 
> a time.  EBS, for example, allows 1 IOP to be up to 256KB.  A considerable 
> amount of time is spent in blocking I/O during compaction and streaming. 
> Reducing how often we read from disk should speed up all sequential I/O 
> operations.
>  # We can reduce system calls by buffering writes as well, but I think it 
> will have less of an impact than the reads.
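
As a rough illustration of the read-ahead buffer idea in item 2 of the 
description, the sketch below reads a file sequentially in large blocks and 
serves individual bytes from memory. It is not the implementation from this 
ticket: `ReadAheadReader`, its `diskReads` counter and the `demo` helper are 
hypothetical names, and it uses only the standard `java.nio` `FileChannel` 
API. The 256 KiB block size in the usage note simply mirrors the EBS figure 
quoted above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sequential reader that fetches one large block per channel read. */
final class ReadAheadReader implements AutoCloseable {
    private final FileChannel channel;
    private final ByteBuffer block;
    private long filePosition = 0;  // offset of the next block to fetch
    int diskReads = 0;              // demonstration counter of channel reads issued

    ReadAheadReader(Path path, int blockSize) throws IOException {
        this.channel = FileChannel.open(path, StandardOpenOption.READ);
        this.block = ByteBuffer.allocate(blockSize);
        this.block.limit(0);        // start empty so the first read() triggers a fill
    }

    /** Returns the next byte as 0..255, or -1 at end of file. */
    int read() throws IOException {
        if (!block.hasRemaining() && !fill())
            return -1;
        return block.get() & 0xFF;
    }

    /** Refills the block with a single positional read; returns false at end of file. */
    private boolean fill() throws IOException {
        block.clear();
        int n = channel.read(block, filePosition);
        diskReads++;
        if (n <= 0) {
            block.limit(0);         // leave the buffer empty
            return false;
        }
        filePosition += n;
        block.flip();
        return true;
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }

    /** Writes a temp file of fileSize bytes, reads it back, returns {bytesRead, diskReads}. */
    static long[] demo(int fileSize, int blockSize) {
        try {
            Path tmp = Files.createTempFile("readahead", ".bin");
            try {
                Files.write(tmp, new byte[fileSize]);
                long bytes = 0;
                try (ReadAheadReader reader = new ReadAheadReader(tmp, blockSize)) {
                    while (reader.read() >= 0)
                        bytes++;
                    return new long[] { bytes, reader.diskReads };
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, `ReadAheadReader.demo(600 * 1024, 256 * 1024)` reads a 600 KiB 
file with at least four channel reads (three or more for data, one that 
detects end of file), rather than one read per byte requested.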



