[jira] Commented: (CASSANDRA-1214) Make standard IO the default
[ https://issues.apache.org/jira/browse/CASSANDRA-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896473#action_12896473 ]

Peter Schuller commented on CASSANDRA-1214:
---

I'll admit I did not investigate JNA (or POSIX-JNA) for this particular case. The last time I did, however, I found it lacking. Very trivial cases were okay, but even something as simple as grab errno became a holy mess of portability concerns. I looked briefly at what posix-jna does; I was unable to find any magic bullets in there, and instead saw things like hard-coded constants that are non-portable and whose breakage is difficult to detect when some particular platform changes.

The proposed JNA patch seems to suffer from exactly this problem as far as I can see, making assumptions about what the concrete values are of MCL_CURRENT and MCL_FUTURE.

As far as I can tell, once one has gotten over the initial one-time hurdle of using JNI and the associated building issues, you have a much more correct/standards-compliant access to the native platform than through JNA since you're in compile time with access to appropriate headers etc. Please do correct me if I'm wrong, since the idea of avoiding compile time/build issues is certainly *very* attractive and the reason why I tried to find an acceptable solution with JNA in the past.

Make standard IO the default

Key: CASSANDRA-1214
URL: https://issues.apache.org/jira/browse/CASSANDRA-1214
Project: Cassandra
Issue Type: Bug
Reporter: James Golick
Attachments: mlockall-jna.patch.txt, Read Throughput with mmap.jpg, trunk-1214.txt

The way mmap()'d IO is handled in cassandra is dangerous. It allocates potentially massive buffers without any care for bounding the total size of the program's buffers. As the node's dataset grows, this *will* lead to swapping and instability. This is a dangerous and wrong default for a couple of reasons.

1) People are likely to test cassandra with the default settings. This issue is insidious because it only appears when you have sufficient data in a certain node, there is absolutely no way to control it, and it doesn't at all respect the memory limits that you give to the JVM. That can all be ascertained by reading the code, and people should certainly do their homework, but nevertheless, cassandra should ship with sane defaults that don't break down when you cross some magic unknown threshold.

2) It's deceptive. Unless you are extremely careful with capacity planning, you will get bit by this. Most people won't really be able to use this in production, so why get them excited about performance that they can't actually have?

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
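The issue description's point that mmap'd IO "doesn't at all respect the memory limits that you give to the JVM" can be illustrated with a minimal sketch. It uses only the JDK's FileChannel.map; the class name, method names, and the temp file standing in for an sstable are made up for illustration and are not taken from Cassandra's code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustration only; hypothetical names, not Cassandra's sstable reader.
class MappedBufferDemo {

    // Maps the first `length` bytes of `file` read-only via FileChannel.map.
    static MappedByteBuffer mapReadOnly(Path file, long length) {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping remains valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Creates a zero-filled temp file that stands in for a data file.
    static Path createSampleFile(int size) {
        try {
            Path file = Files.createTempFile("sstable-demo", ".db");
            Files.write(file, new byte[size]);
            file.toFile().deleteOnExit();
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int size = 1 << 20; // 1 MiB
        MappedByteBuffer buffer = mapReadOnly(createSampleFile(size), size);
        // A mapped buffer is a direct buffer: its pages are not heap objects,
        // so the -Xmx limit does not bound them.
        System.out.println("direct (off-heap): " + buffer.isDirect());
        System.out.println("mapped bytes:      " + buffer.capacity());
        System.out.println("max heap bytes:    " + Runtime.getRuntime().maxMemory());
    }
}
```

The mapped pages are managed by the operating system's page cache, which is why the total mapped size can grow with the dataset regardless of the heap settings.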
[jira] Commented: (CASSANDRA-1214) Make standard IO the default
[ https://issues.apache.org/jira/browse/CASSANDRA-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896571#action_12896571 ]

Folke Behrens commented on CASSANDRA-1214:
---

{quote}
How does the JNA approach behave if there is no C library (Windows?) or mlockall doesn't exist (OS X?)
{quote}

In the case of Mac OS X, an UnsatisfiedLinkError will be thrown. Windows? I don't know. Maybe a JNA-specific exception, maybe a ULE, too. Operating systems can easily be detected with Platform.isXXX() and dealt with accordingly.

{quote}
something as simple as grab errno became a holy mess of portability concerns.
{quote}

Yes, but errno is a particularly hard case. The inventors messed up big time with this. That's why the JNA developers provide two ways to check errno: you either mark your methods with throws LastErrorException or you ask Native.getLastError(). This works under Windows, too.

{quote}
The proposed JNA patch seems to suffer from exactly this problem as far as I can see, making assumptions about what the concrete values are of MCL_CURRENT and MCL_FUTURE.
{quote}

Theoretically, you're right. In practice, however, I can't find a single POSIX system that assigns different values to MCL_CURRENT or MCL_FUTURE, and I think it's highly unlikely that these will change in the future. If they do, Cassandra's code can be adjusted.

{quote}
As far as I can tell, once one has gotten over the initial one-time hurdle of using JNI and the associated building issues, you have a much more correct/standards-compliant access to the native platform than through JNA since you're in compile time with access to appropriate headers etc. Please do correct me if I'm wrong, since the idea of avoiding compile time/build issues is certainly very attractive and the reason why I tried to find an acceptable solution with JNA in the past.
{quote}

You're absolutely right, and your JNI code is really superb. If Cassandra needs to bind a couple more native functions I'd say JNI is the way to go. But not just yet.
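The two concerns in this exchange (a missing native function surfacing as an UnsatisfiedLinkError, and hard-coded MCL_* values) can be sketched as a small fallback policy. This is a sketch under stated assumptions, not the attached patch: MemoryLockPolicy, NativeLocker, and Result are hypothetical names, the native call is replaced by a stand-in interface so the example runs without JNA, and the constant values are assumptions that would need to be checked against each platform's sys/mman.h.

```java
import java.util.Locale;
import java.util.OptionalInt;

// Hypothetical helper; none of these names come from the attached patches.
class MemoryLockPolicy {

    // Stand-in for a native binding such as a JNA-mapped mlockall(int).
    interface NativeLocker {
        int mlockall(int flags); // 0 on success, like the C function
    }

    enum Result { LOCKED, UNSUPPORTED_PLATFORM, NATIVE_UNAVAILABLE, FAILED }

    // ASSUMPTION: MCL_CURRENT is 1 on most systems, while some Linux ports
    // (PowerPC is the usual example) are reported to use 0x2000. Verify
    // against the target platform's headers before relying on these values.
    static OptionalInt mclCurrentFor(String osName, String osArch) {
        String os = osName.toLowerCase(Locale.ROOT);
        String arch = osArch.toLowerCase(Locale.ROOT);
        if (os.contains("windows")) {
            return OptionalInt.empty(); // no mlockall to call
        }
        if (os.contains("linux") && arch.contains("ppc")) {
            return OptionalInt.of(0x2000);
        }
        return OptionalInt.of(1);
    }

    // Tries to lock memory and degrades gracefully instead of failing startup.
    static Result tryLock(NativeLocker locker, String osName, String osArch) {
        OptionalInt flag = mclCurrentFor(osName, osArch);
        if (!flag.isPresent()) {
            return Result.UNSUPPORTED_PLATFORM;
        }
        try {
            return locker.mlockall(flag.getAsInt()) == 0 ? Result.LOCKED : Result.FAILED;
        } catch (UnsatisfiedLinkError e) {
            // The function is missing from the C library (the OS X case above).
            return Result.NATIVE_UNAVAILABLE;
        }
    }

    public static void main(String[] args) {
        // A fake locker that always succeeds; real code would bind the C call.
        NativeLocker fake = flags -> 0;
        Result result = tryLock(fake, System.getProperty("os.name"), System.getProperty("os.arch"));
        System.out.println("memory lock result: " + result);
    }
}
```

With a real JNA binding, the FAILED branch is where errno would be inspected, through LastErrorException or Native.getLastError() as described above.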
[jira] Commented: (CASSANDRA-1214) Make standard IO the default
[ https://issues.apache.org/jira/browse/CASSANDRA-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896617#action_12896617 ]

Peter Schuller commented on CASSANDRA-1214:
---

It all sounds reasonable. So I take it the way forward would be to take your JNA version and combine it with the configuration/policy parts of my patch (assuming people agree that those parts are a good idea), go with that version for now, and maybe move to JNI in the future if JNI becomes a dependency anyway for some other reason.

Any objections?
[jira] Commented: (CASSANDRA-1214) Make standard IO the default
[ https://issues.apache.org/jira/browse/CASSANDRA-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896444#action_12896444 ]

Jonathan Ellis commented on CASSANDRA-1214:
---

How does the JNA approach behave if there is no C library (Windows?) or mlockall doesn't exist (OS X?)
[jira] Commented: (CASSANDRA-1214) Make standard IO the default
[ https://issues.apache.org/jira/browse/CASSANDRA-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888950#action_12888950 ]

Jonathan Ellis commented on CASSANDRA-1214:
---

According to http://andrigoss.blogspot.com/2008/02/jvm-performance-tuning.html, using huge pages automatically gives us the lock-jvm-heap-in-memory behavior we want, and may provide a substantial performance benefit as well.

See also: http://java.sun.com/javase/technologies/hotspot/largememory.jsp

Make standard IO the default

Key: CASSANDRA-1214
URL: https://issues.apache.org/jira/browse/CASSANDRA-1214
Project: Cassandra
Issue Type: Bug
Affects Versions: 0.7
Reporter: James Golick
Attachments: Read Throughput with mmap.jpg
[jira] Commented: (CASSANDRA-1214) Make standard IO the default
[ https://issues.apache.org/jira/browse/CASSANDRA-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886037#action_12886037 ]

Nate McCall commented on CASSANDRA-1214:
---

I have not hit this issue yet, but has anyone tried using the -XX:MaxDirectMemorySize option?
[jira] Commented: (CASSANDRA-1214) Make standard IO the default
[ https://issues.apache.org/jira/browse/CASSANDRA-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881055#action_12881055 ]

Jonathan Ellis commented on CASSANDRA-1214:
---

It seems that what is happening is:
- the JVM hasn't needed to run a major collection in a while,
- so Linux says "I'll swap part of the JVM's heap so I can pull more of this hot sstable into RAM",
- then the JVM goes to GC and thrashes pulling its heap in from swap.

The right solution is probably to use mlockall(MCL_CURRENT) on JVM start (with min heap = max heap so that it gets pre-allocated). Then perform the mmapping.

mmap'd IO is enough faster that this is probably worth biting the native code bullet for.
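The "min heap = max heap" precondition in the comment above can be checked at startup before any locking is attempted. The following is a minimal sketch, assuming the heap is configured through the standard -Xms and -Xmx flags; HeapFlagCheck and its methods are hypothetical names, not part of any attached patch. Since MCL_CURRENT only covers pages mapped at the time of the call, data files mapped afterwards stay unlocked, which is the reason for doing the mmapping last.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Locale;

// Hypothetical startup check; names are illustrative only.
class HeapFlagCheck {

    // Returns the value of the last argument starting with `prefix`, or null.
    static String lastValue(List<String> jvmArgs, String prefix) {
        String found = null;
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix) && arg.length() > prefix.length()) {
                found = arg.substring(prefix.length());
            }
        }
        return found;
    }

    // Parses sizes such as "512k", "4096m", or "4g" into bytes.
    static long parseSize(String text) {
        String s = text.toLowerCase(Locale.ROOT);
        char unit = s.charAt(s.length() - 1);
        long multiplier = unit == 'k' ? 1L << 10
                        : unit == 'm' ? 1L << 20
                        : unit == 'g' ? 1L << 30
                        : 1L;
        String digits = multiplier == 1L ? s : s.substring(0, s.length() - 1);
        return Long.parseLong(digits) * multiplier;
    }

    // True when both -Xms and -Xmx are given and describe the same size.
    static boolean heapIsFixedSize(List<String> jvmArgs) {
        String min = lastValue(jvmArgs, "-Xms");
        String max = lastValue(jvmArgs, "-Xmx");
        return min != null && max != null && parseSize(min) == parseSize(max);
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        if (heapIsFixedSize(jvmArgs)) {
            // Intended order: 1) lock memory, 2) only then map the data files.
            System.out.println("heap is fixed-size; safe to attempt the memory lock");
        } else {
            System.out.println("-Xms and -Xmx differ or are unset; the heap may still grow after locking");
        }
    }
}
```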
[jira] Commented: (CASSANDRA-1214) Make standard IO the default
[ https://issues.apache.org/jira/browse/CASSANDRA-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880681#action_12880681 ]

Jeff Hodges commented on CASSANDRA-1214:
---

This is one of the very first things we've had to do with every cluster we've built. The mmap implementation just does not work for anything I've seen in production beyond trivial datasets. This would be a wonderful, reality-driven change.