[ https://issues.apache.org/jira/browse/CASSANDRA-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881055#action_12881055 ]
Jonathan Ellis commented on CASSANDRA-1214:
-------------------------------------------

It seems that what is happening is,

 - the JVM hasn't needed to run a major collection in a while,
 - so Linux says "I'll swap part of the JVM's heap so I can pull more of this hot sstable into ram,"
 - then the JVM goes to GC and thrashes pulling its heap in from swap

The "right" solution is probably to use mlockall(MCL_CURRENT) on JVM start (with min heap = max heap so that gets pre-allocated). Then perform the mmapping.

mmap'd io is enough faster that this is probably worth biting the native code bullet for.

> Make standard IO the default
> ----------------------------
>
>                 Key: CASSANDRA-1214
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1214
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.7
>            Reporter: James Golick
>
> The way mmap()'d IO is handled in cassandra is dangerous. It allocates
> potentially massive buffers without any care for bounding the total size of
> the program's buffers. As the node's dataset grows, this *will* lead to
> swapping and instability.
> This is a dangerous and wrong default for a couple of reasons.
> 1) People are likely to test cassandra with the default settings. This issue
> is insidious because it only appears when you have sufficient data in a
> certain node, there is absolutely no way to control it, and it doesn't at all
> respect the memory limits that you give to the JVM.
> That can all be ascertained by reading the code, and people should certainly
> do their homework, but nevertheless, cassandra should ship with sane defaults
> that don't break down when you cross some magic unknown threshold.
> 2) It's deceptive. Unless you are extremely careful with capacity planning,
> you will get bit by this. Most people won't really be able to use this in
> production, so why get them excited about performance that they can't
> actually have?
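
For illustration only, the mlockall(MCL_CURRENT) call proposed in the comment can be tried from any process. The sketch below is not Cassandra code: it uses Python's ctypes rather than a JVM native binding, the helper name try_mlockall is hypothetical, and the value MCL_CURRENT = 1 is an assumption that holds on common Linux architectures (check <sys/mman.h> on other platforms). It only shows the call and its usual failure modes.

```python
import ctypes
import ctypes.util
import os

# Assumed constant: MCL_CURRENT is 1 on common Linux architectures
# (x86, ARM). Other platforms may define a different value.
MCL_CURRENT = 1


def try_mlockall(flags=MCL_CURRENT):
    """Ask the kernel to pin all currently mapped pages in RAM.

    Hypothetical helper. Returns (ok, message) instead of raising, since
    the call commonly fails for unprivileged processes.
    """
    try:
        # use_errno=True lets us read errno after the call fails.
        libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    except (OSError, TypeError) as exc:
        return False, "could not load libc: %s" % exc

    if not hasattr(libc, "mlockall"):
        return False, "mlockall is not available on this platform"

    libc.mlockall.argtypes = [ctypes.c_int]
    libc.mlockall.restype = ctypes.c_int

    if libc.mlockall(flags) != 0:
        # Typically ENOMEM (RLIMIT_MEMLOCK too small) or EPERM
        # (no CAP_IPC_LOCK and a zero memlock limit).
        return False, os.strerror(ctypes.get_errno())
    return True, "current pages locked"


if __name__ == "__main__":
    ok, message = try_mlockall()
    print("mlockall:", "ok" if ok else "failed", "-", message)
```

The ordering in the comment matters because MCL_CURRENT covers only pages mapped at the time of the call: a heap pre-allocated with min heap = max heap would be pinned, while sstables mmap'd afterwards stay evictable page cache. An unprivileged process usually needs a raised memlock limit (ulimit -l) or CAP_IPC_LOCK for the call to succeed.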