[
https://issues.apache.org/jira/browse/CASSANDRA-6746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13915658#comment-13915658
]
Benedict commented on CASSANDRA-6746:
-------------------------------------
2.0:
INFO [main] 2014-02-27 18:54:12,724 CLibrary.java (line 63) JNA not found. Native methods will be disabled.
2.1:
INFO [main] 2014-02-27 19:51:10,440 CLibrary.java:117 - JNA mlockall successful
The result is that we do not "skip IO cache" in 2.0, whereas in 2.1 we do.
Assuming the OS actually honours the DONTNEED advice, this leaves 2.1 with an
empty page cache even when the OS has room to keep the data cached. This is OS
dependent: whilst testing on my own box, I found my OS would keep the pages
cached anyway. On searching I found that this behaviour was modified in the
Linux kernel sometime around 2010/11, as referenced
[here|http://lists.samba.org/archive/rsync/2010-November/025827.html], although
it's not clear which kernel version it first made it into. Clearly it is not in
the build on this cluster, and it is on my laptop.
Note this is confirmed by fincore on the data files: in 2.0 all files remain
100% cached at all times; in 2.1 they drop to 0% cached immediately after
compaction completes.
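For reference, the DONTNEED advice discussed above corresponds to
posix_fadvise(2) with POSIX_FADV_DONTNEED. The following is a minimal Python
sketch of that call, assuming Linux; it is not Cassandra's code (which goes
through JNA), and the helper name drop_from_page_cache is hypothetical. The
call is only a hint, so whether the pages are actually evicted depends on the
kernel, which is the variation described above.

```python
import os


def drop_from_page_cache(path):
    """Hint to the kernel that cached pages of `path` are no longer needed.

    Advisory only: the kernel may ignore the hint, and dirty (unflushed)
    pages are generally not dropped by it.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset=0 and len=0 apply the advice to the whole file.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

Running fincore on the file before and after such a call is one way to see
whether a given kernel acted on the hint.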
I'm not sure what the correct response to this is. Largely this is simply
behaving as expected, except that issuing a DONTNEED when we probably DO need
the data is not a great idea. The rationale, of course, is that if we're
compacting stale data we don't want to pollute the page cache; but if we're
compacting live data we will actively destroy the page cache when the OS
follows the DONTNEED strictly (which in this case it apparently does, even
though it has plenty of room to ignore us). Unless we can be smarter about
issuing these commands, I don't think issuing them is a good idea, at least
not on kernel versions that exhibit this behaviour. However, I'm not convinced
they make sense on newer kernel versions either: live reads going to files
that are about to be discarded could still leave us with an empty buffer
cache, as the new pages are evicted before they become the live versions. In
this scenario, simply letting the kernel keep whatever pages it wants is
probably best, so there's no sudden performance cliff. Moving to incremental
opening of the new sstables might be a better solution, though, so that reads
transfer progressively, always to data that is already in buffer, keeping the
transition smooth.
> Reads have a slow ramp up in speed
> ----------------------------------
>
> Key: CASSANDRA-6746
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6746
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Ryan McGuire
> Assignee: Benedict
> Labels: performance
> Fix For: 2.1 beta2
>
> Attachments: 2.1_vs_2.0_read.png
>
>
> On a physical four node cluster I am doing a big write and then a big read.
> The read takes a long time to ramp up to respectable speeds.
> !2.1_vs_2.0_read.png!
> [See data
> here|http://ryanmcguire.info/ds/graph/graph.html?stats=stats.2.1_vs_2.0_vs_1.2.retry1.json&metric=interval_op_rate&operation=stress-read&smoothing=1]
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)