Matei Zaharia created SPARK-1989:
------------------------------------
Summary: Exit executors faster if they get into a cycle of heavy GC
Key: SPARK-1989
URL: https://issues.apache.org/jira/browse/SPARK-1989
Project: Spark
Issue Type: New Feature
Components: Spark Core
Reporter: Matei Zaharia
Fix For: 1.1.0
I've seen situations where an application allocates too much memory across
its tasks + cache to make progress, but instead of giving up, the JVM gets
into a cycle where it repeatedly runs full GCs, frees up a small fraction of
the heap, and continues. This leads to timeouts and confusing error messages.
It would be better to crash with an OOM sooner. The JVM has options to
support this: http://java.dzone.com/articles/tracking-excessive-garbage.
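
For background, the mechanism here is HotSpot's GC overhead limit. A sketch
of the relevant flags with their documented defaults (when both thresholds
are crossed, the JVM throws java.lang.OutOfMemoryError: GC overhead limit
exceeded):

    -XX:+UseGCOverheadLimit   enables the check (on by default with the
                              parallel collector)
    -XX:GCTimeLimit=98        OOM if more than 98% of total time is spent
                              in GC...
    -XX:GCHeapFreeLimit=2     ...while full GCs recover less than 2% of
                              the heap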
The right solution would probably be:
- Add config options, read by spark-submit, that set -XX:GCTimeLimit and
-XX:GCHeapFreeLimit with more conservative values than the defaults (e.g. a
90% time limit and a 5% free limit)
- Make sure we pass these into the Java options for executors in each
deployment mode (see the sketch below)
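
As a rough sketch of what this looks like today, the flags can already be
passed by hand through the existing spark.executor.extraJavaOptions property
(the class name and jar below are placeholders, and the threshold values are
just the examples above, not decided defaults):

    spark-submit \
      --class com.example.MyApp \
      --conf "spark.executor.extraJavaOptions=-XX:GCTimeLimit=90 -XX:GCHeapFreeLimit=5" \
      my-app.jar

The new config options would essentially bake values like these in as
defaults for each deployment mode.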
--
This message was sent by Atlassian JIRA
(v6.2#6252)