[
https://issues.apache.org/jira/browse/CASSANDRA-15214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16901555#comment-16901555
]
Joseph Lynch commented on CASSANDRA-15214:
------------------------------------------
[~yifanc] If you are OK with it, I can add your test cases to
[jvmquake|https://github.com/Netflix-Skunkworks/jvmquake/tree/master/tests] to
ensure it handles all the edge cases. For what it's worth, jvmquake is a strict
superset of jvmkill, and I wouldn't advocate using jvmkill (I'm biased,
though). In my production experience, jvmquake actually detects the GC death
spirals that C* runs into, while jvmkill simply doesn't work because C*
doesn't actually go OOM; it just death spirals. See the "hard oom" [test
cases|https://github.com/Netflix-Skunkworks/jvmquake/blob/master/tests/test_hard_ooms.py]
for examples where jvmkill won't work but jvmquake will.
> OOMs caught and not rethrown
> ----------------------------
>
> Key: CASSANDRA-15214
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15214
> Project: Cassandra
> Issue Type: Bug
> Components: Messaging/Client, Messaging/Internode
> Reporter: Benedict
> Priority: Normal
> Fix For: 4.0
>
> Attachments: oom-experiments.zip
>
>
> Netty (at least, and perhaps elsewhere in Executors) catches all exceptions,
> so presently there is no way to ensure that an OOM reaches the JVM handler to
> trigger a crash/heapdump.
> It may be that the simplest, most consistent way to do this would be to have a
> single thread spawned at startup that waits for any exceptions we must
> propagate to the Runtime.
> We could probably submit a patch upstream to Netty, but for a guaranteed,
> future-proof approach, it may be worth paying the cost of a single thread.
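
The single-thread idea in the description above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the patch that was attached to or committed for this ticket: the class name FatalErrorPropagator, the report() method, and the injected onFatal callback are all hypothetical. Code that catches everything (for example a Netty handler) would hand any OutOfMemoryError to report(), and the dedicated thread would then act on it, for instance by rethrowing it or halting the process.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

/** Hypothetical sketch; names and behavior are assumptions, not Cassandra's code. */
final class FatalErrorPropagator {
    // Pre-allocated bounded queue, so handing over an error does not need
    // to allocate a new queue node while the heap is already exhausted.
    private final BlockingQueue<Throwable> pending = new ArrayBlockingQueue<>(16);

    /**
     * Spawns the single waiting thread at startup.
     * onFatal is an assumed callback; a real one might rethrow the error
     * from this thread or halt the JVM.
     */
    FatalErrorPropagator(Consumer<Throwable> onFatal) {
        Thread thread = new Thread(() -> {
            try {
                // Block until some catch-all handler reports a fatal error.
                onFatal.accept(pending.take());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "fatal-error-propagator");
        thread.setDaemon(true);
        thread.start();
    }

    /** Called from code that catches all Throwables; ignores non-fatal ones. */
    void report(Throwable t) {
        if (t instanceof OutOfMemoryError) {
            // offer() never blocks; if the queue is full, an error is already pending.
            pending.offer(t);
        }
    }
}
```

The bounded, pre-allocated queue is a design choice made for this sketch: the reporting path should do as little allocation as possible, since it runs exactly when memory is scarce.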