[
https://issues.apache.org/jira/browse/CASSANDRA-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305095#comment-14305095
]
Ariel Weisberg commented on CASSANDRA-8729:
-------------------------------------------
There are other reasons to use private memory, though they may be less important.
For in-memory write workloads you get latency outliers if threads write directly
to a memory-mapped file. These tended to show up in the very long tail: P99.99,
P99.999. With a dedicated thread draining to the filesystem you can control how
much data is buffered when the filesystem is out to lunch.
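A minimal sketch of the dedicated-drainer idea, assuming a bounded queue is
what caps the buffered data. The class name DrainingWriter, the
maxPendingBuffers parameter and the thread name are hypothetical and are not
taken from the Cassandra code base.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch, not Cassandra code: producers fill private buffers and
// hand them to one thread that is the only one touching the filesystem.
final class DrainingWriter implements AutoCloseable {
    // Sentinel compared by identity; tells the drainer to stop.
    private static final ByteBuffer POISON = ByteBuffer.allocate(0);

    private final BlockingQueue<ByteBuffer> pending;
    private final FileChannel channel;
    private final Thread drainer;
    private volatile IOException failure;

    DrainingWriter(Path file, int maxPendingBuffers) throws IOException {
        // The queue capacity is the cap on how much data may sit in memory.
        pending = new ArrayBlockingQueue<>(maxPendingBuffers);
        channel = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING);
        drainer = new Thread(this::drain, "commitlog-drainer");
        drainer.start();
    }

    /** Non-blocking hand-off. Returns false when the cap is reached or a write failed. */
    boolean offer(ByteBuffer buf) {
        return failure == null && pending.offer(buf);
    }

    private void drain() {
        try {
            while (true) {
                ByteBuffer buf = pending.take();
                if (buf == POISON) return;
                if (failure != null) continue; // discard buffers after a failed write
                try {
                    while (buf.hasRemaining()) channel.write(buf);
                } catch (IOException e) {
                    failure = e;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    @Override
    public void close() throws IOException, InterruptedException {
        pending.put(POISON); // queued behind any buffers still waiting
        drainer.join();
        channel.close();
        if (failure != null) throw failure;
    }
}
```

When offer returns false the caller decides what to do (block, shed load, or
report backpressure), which is the control a writer thread touching a mapped
page does not have.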
If you write a quick benchmark that just spits out zeroes to a file via write()
vs. a memory-mapped file, do you see a difference in throughput or CPU
utilization? I am skeptical that mmap is actually much faster; it might even be
slower.
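A rough sketch of such a benchmark, assuming 4k chunks and a 256 MiB file. The
class name ZeroWriteBench and both sizes are arbitrary choices for
illustration. It only reports wall-clock time; CPU utilization would have to
be observed externally.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical micro-benchmark: writes zeroes through FileChannel.write and
// through a MappedByteBuffer, returning elapsed nanoseconds for each.
final class ZeroWriteBench {
    static final int CHUNK = 4096;

    private static void check(long totalBytes) {
        // A single mapping cannot exceed Integer.MAX_VALUE bytes.
        if (totalBytes <= 0 || totalBytes % CHUNK != 0 || totalBytes > Integer.MAX_VALUE)
            throw new IllegalArgumentException("totalBytes must be a positive multiple of " + CHUNK);
    }

    static long viaWrite(Path file, long totalBytes) throws IOException {
        check(totalBytes);
        ByteBuffer zeroes = ByteBuffer.allocateDirect(CHUNK); // zero-filled private memory
        long start = System.nanoTime();
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            for (long written = 0; written < totalBytes; written += CHUNK) {
                zeroes.clear(); // reset position and limit, contents stay zero
                while (zeroes.hasRemaining()) ch.write(zeroes);
            }
            ch.force(false); // include the flush in the measurement
        }
        return System.nanoTime() - start;
    }

    static long viaMmap(Path file, long totalBytes) throws IOException {
        check(totalBytes);
        byte[] zeroes = new byte[CHUNK];
        long start = System.nanoTime();
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, totalBytes);
            for (long written = 0; written < totalBytes; written += CHUNK) map.put(zeroes);
            map.force(); // include the flush in the measurement
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        long total = 256L * 1024 * 1024;
        Path a = Files.createTempFile("bench-write", ".bin");
        Path b = Files.createTempFile("bench-mmap", ".bin");
        System.out.printf("write: %d ms%n", viaWrite(a, total) / 1_000_000);
        System.out.printf("mmap:  %d ms%n", viaMmap(b, total) / 1_000_000);
    }
}
```

Both runs here start from an empty file. Reproducing the overwrite case from
the issue would presumably need a pre-existing file whose pages are not in the
page cache, and a single pass like this says nothing about tail latency.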
> Commitlog causes read before write when overwriting
> ---------------------------------------------------
>
> Key: CASSANDRA-8729
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8729
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Ariel Weisberg
>
> The memory mapped commit log implementation writes directly to the page
> cache. If a page is not in the cache, the kernel will read it in even though
> we are going to overwrite it.
> The way to avoid this is to write to private memory, and then pad the write
> with 0s at the end so it is page (4k) aligned before writing to a file.
> The commit log would benefit from being refactored into something that looks
> more like a pipeline with incoming requests receiving private memory to write
> in, completed buffers being submitted to a parallelized compression/checksum
> step, followed by submission to another thread for writing to a file that
> preserves the order.
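The padding step in the description above can be sketched as follows, assuming
the 4k page size it mentions. The names PagePadding, paddedLength and
padToPage are hypothetical helpers, not Cassandra APIs.

```java
import java.nio.ByteBuffer;

// Hypothetical helper: rounds a write up to a whole number of 4k pages so the
// kernel never has to read a partially overwritten page from disk first.
final class PagePadding {
    static final int PAGE_SIZE = 4096; // assumption: 4k pages, as in the description

    /** Rounds length up to the next multiple of PAGE_SIZE (no overflow check). */
    static int paddedLength(int length) {
        return (length + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
    }

    /** Copies data into private memory and returns it followed by zero padding. */
    static ByteBuffer padToPage(ByteBuffer data) {
        int padded = paddedLength(data.remaining());
        ByteBuffer out = ByteBuffer.allocate(padded); // new buffers are zero-filled
        out.put(data.duplicate()); // duplicate() leaves the caller's position untouched
        out.rewind();              // position 0, limit stays at the padded length
        return out;
    }
}
```

The returned buffer can be passed to FileChannel.write as is; its remaining
length is always a multiple of the page size.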
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)