Marcus Olsson created CASSANDRA-13886:
-----------------------------------------
Summary: OOM put node in limbo
Key: CASSANDRA-13886
URL: https://issues.apache.org/jira/browse/CASSANDRA-13886
Project: Cassandra
Issue Type: Bug
Environment: Cassandra version 2.2.10
Reporter: Marcus Olsson
Priority: Minor
In one of our test clusters we have had some issues with OOM. While working on
fixing this, we discovered that one of the nodes that hit an OutOfMemoryError
was not actually shut down properly. Instead it went into a half-up state where
the affected node considered itself up while all other nodes considered it down.
The following stack trace was observed, which seems to be the cause:
{noformat}
java.lang.NoClassDefFoundError: Could not initialize class java.lang.UNIXProcess
    at java.lang.ProcessImpl.start(ProcessImpl.java:130) ~[na:1.8.0_131]
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029) ~[na:1.8.0_131]
    at java.lang.Runtime.exec(Runtime.java:620) ~[na:1.8.0_131]
    at java.lang.Runtime.exec(Runtime.java:485) ~[na:1.8.0_131]
    at org.apache.cassandra.utils.HeapUtils.generateHeapDump(HeapUtils.java:88) ~[apache-cassandra-2.2.10.jar:2.2.10]
    at org.apache.cassandra.utils.JVMStabilityInspector.inspectThrowable(JVMStabilityInspector.java:56) ~[apache-cassandra-2.2.10.jar:2.2.10]
    at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:168) ~[apache-cassandra-2.2.10.jar:2.2.10]
    at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) ~[apache-cassandra-2.2.10.jar:2.2.10]
    at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) ~[apache-cassandra-2.2.10.jar:2.2.10]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_131]
{noformat}
It seems that if an unexpected exception or error is thrown inside
JVMStabilityInspector.inspectThrowable (here, from the heap dump attempt in
HeapUtils.generateHeapDump), the JVM is not shut down but keeps running. My
expectation is that the JVM should shut down whenever an OutOfMemoryError is
thrown.
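
To illustrate the failure mode, here is a minimal standalone sketch (not the
actual Cassandra code and not a proposed patch). It assumes the handler runs a
best-effort diagnostic step, such as a heap dump, before exiting. The class
GuardedOomHandler, the Diagnostic interface and the injected exit action are
all hypothetical names made up for this example. The point is only that
wrapping the diagnostic in try/finally lets the exit step run even when the
diagnostic itself throws:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class GuardedOomHandler {
    /** Best-effort diagnostic step; it may itself fail when the JVM is unhealthy. */
    interface Diagnostic {
        void run() throws Throwable;
    }

    private final Diagnostic diagnostic;
    private final Runnable exitAction;

    GuardedOomHandler(Diagnostic diagnostic, Runnable exitAction) {
        this.diagnostic = diagnostic;
        this.exitAction = exitAction;
    }

    /** For an OutOfMemoryError: try the diagnostic, then always run the exit action. */
    void inspect(Throwable t) {
        if (!(t instanceof OutOfMemoryError)) {
            return; // other throwables are ignored in this sketch
        }
        try {
            diagnostic.run();
        } catch (Throwable secondary) {
            // A failing diagnostic must not prevent the shutdown below.
            System.err.println("diagnostic failed: " + secondary);
        } finally {
            exitAction.run();
        }
    }

    public static void main(String[] args) {
        AtomicBoolean exited = new AtomicBoolean(false);
        GuardedOomHandler handler = new GuardedOomHandler(
            // Simulates the error seen in the stack trace above.
            () -> { throw new NoClassDefFoundError("Could not initialize class java.lang.UNIXProcess"); },
            // Stand-in for a real exit so that the demo does not kill the JVM.
            () -> exited.set(true));
        handler.inspect(new OutOfMemoryError("simulated"));
        System.out.println("exit action ran: " + exited.get()); // prints "exit action ran: true"
    }
}
```

In a real handler the exit action could be something like
() -> Runtime.getRuntime().halt(100), where the exit code 100 is an arbitrary
value chosen for illustration. Runtime.halt skips shutdown hooks, which may or
may not be desirable here.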
A potential workaround is to add:
{noformat}
JVM_OPTS="$JVM_OPTS -XX:+ExitOnOutOfMemoryError"
{noformat}
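to cassandra-env.sh. This flag is available from Java 8u92 onward, so it
applies to the 1.8.0_131 JVM shown in the stack trace.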