[jira] Commented: (CASSANDRA-1214) Make standard IO the default

2010-08-09 Thread Peter Schuller (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896473#action_12896473
]

Peter Schuller commented on CASSANDRA-1214:
---

I'll admit I did not investigate JNA (or POSIX-JNA) for this particular case.
The last time I did, however, I found it lacking. Very trivial cases were okay,
but even something as simple as grabbing errno became a holy mess of
portability concerns.

I looked briefly at what posix-jna does, and I was unable to find any magic
bullets in there and instead saw things like hard-coded constants that are
non-portable and difficult to detect when they break due to changes to some
particular platform.

The proposed JNA patch seems to suffer from exactly this problem as far as I
can see, making assumptions about what the concrete values are of MCL_CURRENT
and MCL_FUTURE.

As far as I can tell, once one has gotten over the initial one-time hurdle of
using JNI and the associated building issues, you have a much more
correct/standards-compliant access to the native platform than through JNA
since you're working at compile time with access to the appropriate headers etc.

Please do correct me if I'm wrong, since the idea of avoiding compile
time/build issues is certainly *very* attractive and the reason why I tried to
find an acceptable solution with JNA in the past.

Make standard IO the default

Key: CASSANDRA-1214
URL: https://issues.apache.org/jira/browse/CASSANDRA-1214
Project: Cassandra
Issue Type: Bug
Reporter: James Golick
Attachments: mlockall-jna.patch.txt, Read Throughput with mmap.jpg,
trunk-1214.txt

The way mmap()'d IO is handled in cassandra is dangerous. It allocates
potentially massive buffers without any care for bounding the total size of
the program's buffers. As the node's dataset grows, this *will* lead to
swapping and instability.
This is a dangerous and wrong default for a couple of reasons.
1) People are likely to test cassandra with the default settings. This issue
is insidious because it only appears when you have sufficient data in a
certain node, there is absolutely no way to control it, and it doesn't at all
respect the memory limits that you give to the JVM.
That can all be ascertained by reading the code, and people should certainly
do their homework, but nevertheless, cassandra should ship with sane defaults
that don't break down when you cross some magic unknown threshold.
2) It's deceptive. Unless you are extremely careful with capacity planning,
you will get bitten by this. Most people won't really be able to use this in
production, so why get them excited about performance that they can't
actually have?
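
For readers unfamiliar with the mechanism the description refers to, here is a minimal sketch of mmap'd file reads in Java. It is not Cassandra's code: the class and method names are hypothetical, and a small temp file stands in for an sstable. The point it illustrates is that the mapped region is backed by the OS page cache rather than the Java heap, so the JVM heap limit does not bound it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapReadSketch {
    /** Maps the whole file read-only and returns the byte at the given offset. */
    public static byte readByteViaMmap(Path file, int offset) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping lives outside the Java heap; -Xmx does not limit its size.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            return buf.get(offset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates a small temp file (hypothetical stand-in for a data file). */
    public static Path writeTempFile(String contents) {
        try {
            Path p = Files.createTempFile("mmap-sketch", ".bin");
            Files.write(p, contents.getBytes(StandardCharsets.US_ASCII));
            p.toFile().deleteOnExit();
            return p;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path p = writeTempFile("hello");
        System.out.println((char) readByteViaMmap(p, 1));
    }
}
```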

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (CASSANDRA-1214) Make standard IO the default

2010-08-09 Thread Folke Behrens (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896571#action_12896571
]

Folke Behrens commented on CASSANDRA-1214:
--

{quote}
How does the JNA approach behave if there is no C library (Windows?) or
mlockall doesn't exist (OS X?)
{quote}
On Mac OS X, an UnsatisfiedLinkError (ULE) will be thrown. Windows? I don't
know. Maybe a JNA-specific exception, maybe a ULE, too. Operating systems can
easily be detected with Platform.isXXX() and dealt with accordingly.
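
A rough sketch of that "detect and deal with it" idea follows. To stay self-contained it does not use JNA: the `MemoryLocker` interface is a hypothetical stand-in for whatever binding ends up calling mlockall, and the `os.name` check is only an approximation of what JNA's Platform checks would do. The names and the result enum are assumptions made for illustration.

```java
import java.util.Locale;

public class NativeLockSketch {
    /** Hypothetical stand-in for a JNA- or JNI-backed mlockall binding. */
    public interface MemoryLocker {
        int lockAll(int flags);
    }

    public enum Result { LOCKED, FAILED, UNAVAILABLE, SKIPPED }

    /** Skips Windows up front, where no mlockall is expected to exist. */
    public static boolean shouldAttempt(String osName) {
        return !osName.toLowerCase(Locale.ROOT).contains("windows");
    }

    public static Result tryLock(String osName, MemoryLocker locker, int flags) {
        if (!shouldAttempt(osName)) {
            return Result.SKIPPED;
        }
        try {
            // mlockall returns 0 on success and -1 on failure.
            return locker.lockAll(flags) == 0 ? Result.LOCKED : Result.FAILED;
        } catch (UnsatisfiedLinkError e) {
            // Library or symbol is missing: keep running without locked memory.
            return Result.UNAVAILABLE;
        }
    }

    public static void main(String[] args) {
        MemoryLocker missing = flags -> { throw new UnsatisfiedLinkError("mlockall"); };
        System.out.println(tryLock(System.getProperty("os.name"), missing, 1));
    }
}
```

The design choice here is to treat a missing native function as a degraded mode rather than a startup failure, so the node still runs on platforms without the call.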

{quote}
something as simple as grabbing errno became a holy mess of portability concerns.
{quote}
Yes, but errno is a particularly hard case. The inventors messed up big time
with this. That's why the JNA developers provide two ways to check errno: you
either declare your methods with "throws LastErrorException" or you call
Native.getLastError(). This works under Windows, too.

{quote}
The proposed JNA patch seems to suffer from exactly this problem as far as I
can see, making assumptions about what the concrete values are of MCL_CURRENT
and MCL_FUTURE.
{quote}
Theoretically, you're right. In practice, however, I can't find a single POSIX
system that assigns different values to MCL_CURRENT or MCL_FUTURE, and I think
it's highly unlikely that these will change in the future. If they do,
Cassandra's code can be adjusted.
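
One way to limit the risk being discussed, sketched under stated assumptions: only hand out hard-coded flag values on platforms where they have been checked against the system headers, and return nothing elsewhere so the caller can skip the lock. The values 1 and 2 are the ones found in Linux headers on x86 and x86-64; the class name, method names, and the set of "verified" platforms are hypothetical.

```java
import java.util.Locale;
import java.util.OptionalInt;

public class MlockallFlagsSketch {
    // Values as defined in Linux headers on x86/x86-64; unverified elsewhere.
    static final int LINUX_X86_MCL_CURRENT = 1;
    static final int LINUX_X86_MCL_FUTURE = 2;

    /** True only for the platform combinations assumed to have been verified. */
    static boolean isVerified(String osName, String osArch) {
        String os = osName.toLowerCase(Locale.ROOT);
        String arch = osArch.toLowerCase(Locale.ROOT);
        boolean x86 = arch.equals("amd64") || arch.equals("x86_64")
                || arch.equals("x86") || arch.equals("i386");
        return os.equals("linux") && x86;
    }

    /** MCL_CURRENT for a verified platform, or empty so the caller can skip locking. */
    public static OptionalInt mclCurrentFor(String osName, String osArch) {
        return isVerified(osName, osArch)
                ? OptionalInt.of(LINUX_X86_MCL_CURRENT) : OptionalInt.empty();
    }

    /** MCL_FUTURE for a verified platform, or empty so the caller can skip locking. */
    public static OptionalInt mclFutureFor(String osName, String osArch) {
        return isVerified(osName, osArch)
                ? OptionalInt.of(LINUX_X86_MCL_FUTURE) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(mclCurrentFor(
                System.getProperty("os.name"), System.getProperty("os.arch")));
    }
}
```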

{quote}
As far as I can tell, once one has gotten over the initial one-time hurdle of
using JNI and the associated building issues, you have a much more
correct/standards-compliant access to the native platform than through JNA
since you're working at compile time with access to the appropriate headers etc.
Please do correct me if I'm wrong, since the idea of avoiding compile
time/build issues is certainly very attractive and the reason why I tried to
find an acceptable solution with JNA in the past.
{quote}
You're absolutely right, and your JNI code is really superb. If Cassandra needs
to bind a couple more native functions I'd say JNI is the way to go. But not
just yet.

[jira] Commented: (CASSANDRA-1214) Make standard IO the default

2010-08-09 Thread Peter Schuller (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896617#action_12896617
]

Peter Schuller commented on CASSANDRA-1214:
---

It all sounds reasonable.

So I take it the way forward would be to take your JNA version, combine it with
the configuration/policy parts of my patch (assuming people agree that those
parts are a good idea), go with that version for now, and maybe move to JNI in
the future if JNI becomes a dependency anyway for some other reason.

Any objections?

[jira] Commented: (CASSANDRA-1214) Make standard IO the default

2010-08-08 Thread Jonathan Ellis (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12896444#action_12896444
]

Jonathan Ellis commented on CASSANDRA-1214:
---

How does the JNA approach behave if there is no C library (Windows?) or
mlockall doesn't exist (OS X?)

[jira] Commented: (CASSANDRA-1214) Make standard IO the default

2010-07-15 Thread Jonathan Ellis (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888950#action_12888950
]

Jonathan Ellis commented on CASSANDRA-1214:
---

According to http://andrigoss.blogspot.com/2008/02/jvm-performance-tuning.html,
using huge pages automatically gives us the lock-jvm-heap-in-memory behavior we
want, and may provide a substantial performance benefit as well.

See also: http://java.sun.com/javase/technologies/hotspot/largememory.jsp

Make standard IO the default

Key: CASSANDRA-1214
URL: https://issues.apache.org/jira/browse/CASSANDRA-1214
Project: Cassandra
Issue Type: Bug
Affects Versions: 0.7
Reporter: James Golick
Attachments: Read Throughput with mmap.jpg

[jira] Commented: (CASSANDRA-1214) Make standard IO the default

2010-07-07 Thread Nate McCall (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886037#action_12886037
]

Nate McCall commented on CASSANDRA-1214:

I have not hit this issue yet, but has anyone tried using the
-XX:MaxDirectMemorySize option?

[jira] Commented: (CASSANDRA-1214) Make standard IO the default

2010-06-21 Thread Jonathan Ellis (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881055#action_12881055
]

Jonathan Ellis commented on CASSANDRA-1214:
---

It seems that what is happening is:

- the JVM hasn't needed to run a major collection in a while,
- so Linux says "I'll swap out part of the JVM's heap so I can pull more of
this hot sstable into RAM",
- then the JVM goes to GC and thrashes pulling its heap back in from swap.

The right solution is probably to use mlockall(MCL_CURRENT) on JVM start
(with min heap = max heap so that gets pre-allocated). Then perform the
mmapping.
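
The "min heap = max heap" precondition can be checked at startup before attempting the lock. Below is a minimal sketch, assuming the sizes are passed as plain -Xms/-Xmx arguments with optional k/m/g suffixes; the class and method names are hypothetical, and the check only compares the configured sizes, it does not establish that the heap pages are actually resident.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class HeapPreallocationCheck {
    /** Returns the value of the last argument starting with prefix, or null. */
    static String lastValue(List<String> args, String prefix) {
        String found = null;
        for (String a : args) {
            if (a.startsWith(prefix)) {
                found = a.substring(prefix.length());
            }
        }
        return found;
    }

    /** Parses sizes such as "512m", "4G" or "1048576" into bytes. */
    static long parseBytes(String s) {
        char unit = Character.toLowerCase(s.charAt(s.length() - 1));
        long mult = unit == 'k' ? 1L << 10
                : unit == 'm' ? 1L << 20
                : unit == 'g' ? 1L << 30 : 1L;
        String digits = mult == 1L ? s : s.substring(0, s.length() - 1);
        return Long.parseLong(digits) * mult;
    }

    /** True when both -Xms and -Xmx are given and denote the same size. */
    public static boolean minEqualsMax(List<String> jvmArgs) {
        String xms = lastValue(jvmArgs, "-Xms");
        String xmx = lastValue(jvmArgs, "-Xmx");
        if (xms == null || xmx == null || xms.isEmpty() || xmx.isEmpty()) {
            return false;
        }
        return parseBytes(xms) == parseBytes(xmx);
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("min heap == max heap: " + minEqualsMax(jvmArgs));
    }
}
```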

mmap'd IO is sufficiently faster that this is probably worth biting the native
code bullet for.

[jira] Commented: (CASSANDRA-1214) Make standard IO the default

2010-06-20 Thread Jeff Hodges (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880681#action_12880681
]

Jeff Hodges commented on CASSANDRA-1214:

This is one of the very first things we've had to do with every cluster we've
built. The mmap implementation just does not work for anything I've seen in
production beyond trivial datasets. This would be a wonderful, reality-driven
change.
