[ 
https://issues.apache.org/jira/browse/CASSANDRA-19987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18055159#comment-18055159
 ] 

Sam Lightfoot commented on CASSANDRA-19987:
-------------------------------------------

[~dnk] Thank you, and great question.

Currently, Direct IO is only supported for iterator-based compaction (via the 
scanner path). Cursor-based compaction does not yet use it, though the 
underlying Direct IO infrastructure is already in place. Wiring Direct IO 
through to cursor-based compaction would be a straightforward follow-up, which 
I'd be keen to do in the very near future.

> Direct IO support for compaction reads
> --------------------------------------
>
>                 Key: CASSANDRA-19987
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19987
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Local/Compaction
>            Reporter: Jon Haddad
>            Assignee: Sam Lightfoot
>            Priority: Normal
>             Fix For: 5.x
>
>         Attachments: buffered_during_compaction-reads_1m.txt, 
> buffered_during_compaction-reads_5m.txt, buffered_page_cache_compaction, 
> direct_during_compaction-reads_1m.txt, direct_during_compaction-reads_5m.txt, 
> direct_page_cache_compaction, image-2025-12-24-10-20-44-947.png, 
> image-2025-12-24-10-21-11-928.png, image-2025-12-24-10-34-10-834.png
>
>          Time Spent: 4h
>  Remaining Estimate: 0h
>
> If we use Direct IO to read SSTables during compaction, we can avoid 
> polluting the page cache with data we're about to delete.  As a side effect 
> of that pollution, we also evict pages to make room for whatever we're 
> putting in.  This unnecessary churn leads to higher CPU overhead and can 
> cause dips in client read latency, as we're going to be evicting pages that 
> could be used to serve those reads.
> This is most notable with STCS as the SSTables get larger, potentially 
> evicting the entire hot dataset out of cache, but every compaction strategy 
> is affected.
> This is a follow-up to be done after CASSANDRA-15452, since we will have an 
> internal buffer.
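
For readers unfamiliar with the mechanism described above, here is a minimal 
sketch of reading a file with Direct IO on the JVM, so the data bypasses the 
page cache. This is not Cassandra's implementation. The class and method names 
(DirectReadSketch, countBytes, and so on) are hypothetical. It assumes Java 10 
or later and a filesystem that accepts O_DIRECT, and it falls back to a 
buffered read where Direct IO is unavailable.

```java
import com.sun.nio.file.ExtendedOpenOption;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical illustration only; not taken from the Cassandra code base. */
final class DirectReadSketch {

    /** Counts the bytes in a file, preferring Direct IO and falling back to buffered reads. */
    static long countBytes(Path file, int blocksPerRead) throws IOException {
        try {
            return countBytesDirect(file, blocksPerRead);
        } catch (IOException | UnsupportedOperationException e) {
            // Assumption: some filesystems and platforms reject O_DIRECT.
            return countBytesBuffered(file, 64 * 1024);
        }
    }

    /** Reads the file through a channel opened with ExtendedOpenOption.DIRECT. */
    static long countBytesDirect(Path file, int blocksPerRead) throws IOException {
        // Direct IO needs the buffer address, the file position and the
        // transfer size to all be multiples of the filesystem block size.
        int blockSize = Math.toIntExact(Files.getFileStore(file).getBlockSize());
        int capacity = Math.multiplyExact(blockSize, blocksPerRead);
        // Over-allocate by (blockSize - 1) so an aligned slice of `capacity` bytes fits.
        ByteBuffer buffer = ByteBuffer.allocateDirect(capacity + blockSize - 1)
                                      .alignedSlice(blockSize);
        try (FileChannel channel = FileChannel.open(
                file, StandardOpenOption.READ, ExtendedOpenOption.DIRECT)) {
            long position = 0;
            long total = 0;
            while (true) {
                buffer.clear();
                int n = channel.read(buffer, position);
                if (n <= 0) {
                    break; // end of file
                }
                total += n;
                if (n % blockSize != 0) {
                    break; // a partial block can only be the tail of the file
                }
                position += n; // stays block-aligned
            }
            return total;
        }
    }

    /** Ordinary buffered read, which does go through the page cache. */
    static long countBytesBuffered(Path file, int bufferSize) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(bufferSize);
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long total = 0;
            int n;
            while ((n = channel.read(buffer)) != -1) {
                total += n;
                buffer.clear();
            }
            return total;
        }
    }
}
```

The sketch uses positional reads so that every read starts at a block-aligned 
offset. The aligned internal buffer is the kind of prerequisite the 
description refers to when it says this work follows CASSANDRA-15452.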
