[ https://issues.apache.org/jira/browse/CASSANDRA-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881055#action_12881055 ]
Jonathan Ellis commented on CASSANDRA-1214:
-------------------------------------------
It seems that what is happening is:
- the JVM hasn't needed to run a major collection in a while,
- so Linux says "I'll swap out part of the JVM's heap so I can pull more of this
hot sstable into RAM,"
- then the JVM goes to GC and thrashes pulling its heap back in from swap.
The "right" solution is probably to call mlockall(MCL_CURRENT) on JVM start
(with min heap = max heap so that the heap gets pre-allocated), and only then
perform the mmapping.
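A minimal native-side sketch of that ordering, assuming Linux and plain C rather than whatever binding the JVM would actually use: lock the pages that are mapped right now, then map the data file afterwards so the mapping stays evictable page cache. The helper names lock_current_memory and map_file_readonly are hypothetical, and mlockall can fail with EPERM or ENOMEM when RLIMIT_MEMLOCK is too low, so the caller should treat a failure as "run unlocked" rather than as fatal.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: pin every page that is mapped at the time of the call.
 * MCL_CURRENT does not cover mappings created later, which is the point:
 * the process's existing memory is pinned, later mmap'd files are not.
 * Returns 0 on success, -1 on failure (typically EPERM or ENOMEM). */
static int lock_current_memory(void) {
    if (mlockall(MCL_CURRENT) != 0) {
        fprintf(stderr, "mlockall(MCL_CURRENT) failed: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}

/* Hypothetical helper: map a whole file read-only.
 * Returns the mapping and stores its length in *len_out, or NULL on error
 * (including an empty file, which cannot be mapped). */
static void *map_file_readonly(const char *path, size_t *len_out) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return NULL;
    }
    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    if (p == MAP_FAILED) {
        return NULL;
    }
    *len_out = (size_t)st.st_size;
    return p;
}
```

The intended call order is lock_current_memory() first and map_file_readonly() second; reversing it would pin the mapped file as well, which defeats the purpose.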
mmap'd I/O is enough faster that it is probably worth biting the native-code
bullet for this.
> Make standard IO the default
> ----------------------------
>
> Key: CASSANDRA-1214
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1214
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: 0.7
> Reporter: James Golick
>
> The way mmap()'d IO is handled in Cassandra is dangerous. It allocates
> potentially massive buffers without any care for bounding the total size of
> the program's buffers. As the node's dataset grows, this *will* lead to
> swapping and instability.
> This is a dangerous and wrong default for a couple of reasons.
> 1) People are likely to test Cassandra with the default settings. This issue
> is insidious because it only appears once a node holds enough data, there is
> absolutely no way to control it, and it doesn't at all respect the memory
> limits that you give to the JVM.
> That can all be ascertained by reading the code, and people should certainly
> do their homework, but nevertheless, Cassandra should ship with sane defaults
> that don't break down when you cross some magic unknown threshold.
> 2) It's deceptive. Unless you are extremely careful with capacity planning,
> you will get bitten by this. Most people won't really be able to use this in
> production, so why get them excited about performance that they can't
> actually have?