[
https://issues.apache.org/jira/browse/CASSANDRA-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088061#comment-13088061
]
Peter Schuller commented on CASSANDRA-2594:
-------------------------------------------
Well, the theory strongly suggests that it is likely to help, and I have
empirically observed several cases where the allocation is in fact unbalanced.
However, I cannot claim any scientific evidence of the impact the change has.
I've made anecdotal observations, but that's about it.
It's the kind of thing I'd want to tweak on a production system in order to
eliminate it as a source of problems, whether it's InnoDB, Cassandra, etc. I
think it's unlikely to make things worse, and there seems to be a general
consensus among kernel people that interleaving makes sense when you're
optimizing for cache hit ratio w.r.t. disk and are not concerned with CPU
efficiency.
(This is an educated guess and not based on evidence: In the case of Cassandra I
think it's unlikely that we'd actually get any CPU efficiency anyway since
there is in general no particular affinity with respect to CPU for a given
piece of data; if anything we throw data around like crazy so the fact that
some.)
FWIW, my main concern is if there's a bug in the shell scripting such that it
would break for someone. I tried to avoid that :)
> run cassandra under numactl --interleave=all
> --------------------------------------------
>
> Key: CASSANDRA-2594
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2594
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Peter Schuller
> Assignee: Peter Schuller
> Priority: Minor
> Fix For: 0.8.5
>
> Attachments: CASSANDRA-2594-trunkk.txt
>
>
> By default, Linux attempts to be smart about memory allocations such that
> data is close to the NUMA node on which it runs. For big database type of
> applications, this is not the best thing to do if the priority is to avoid
> disk I/O. In particular with Cassandra, we're heavily multi-threaded anyway
> and there is no particular reason to believe that one NUMA node is "better"
> than another.
> Consequences of allocating unevenly among NUMA nodes can include excessive
> page cache eviction when the kernel tries to allocate memory - such as when
> restarting the JVM.
> With that briefly stated background, I propose the following patch to make the
> Cassandra script run Cassandra with numactl --interleave=all if numactl seems
> to be available.
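
For readers who want to see the idea concretely, here is a minimal sketch of
the "use numactl if it seems to be available" check described above. It is
not the attached patch; the function name numa_prefix and the launch_cmd
placeholder are made up for illustration, and the real startup script may
structure this differently.

```shell
#!/bin/sh
# Sketch only: pick a launch prefix depending on whether numactl is usable.

# Prints "numactl --interleave=all" if numactl is installed and the policy
# can actually be applied on this host; prints nothing otherwise.
numa_prefix() {
    if command -v numactl >/dev/null 2>&1 \
        && numactl --interleave=all true >/dev/null 2>&1; then
        printf '%s\n' "numactl --interleave=all"
    fi
}

# launch_cmd is a hypothetical stand-in for the real JVM invocation.
launch_cmd="java -version"

# Prepend the prefix only when it is non-empty.
prefix=$(numa_prefix)
echo "would exec: ${prefix:+$prefix }$launch_cmd"
```

Running a trivial command (true) under the policy first, rather than only
checking that the binary exists, is meant to guard against hosts where
numactl is installed but the policy cannot be set, so that startup falls
back to the plain command instead of failing.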