[ https://issues.apache.org/jira/browse/SPARK-11022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
colin shaw updated SPARK-11022:
-------------------------------
    Description:
The Worker process often goes down even though there are no abnormal tasks; it simply crashes without any message. After adding "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=${SPARK_HOME}/logs", a heap dump shows that "17,010 instances of "org.apache.spark.deploy.worker.ExecutorRunner", loaded by "sun.misc.Launcher$AppClassLoader @ 0xe2abfcc8" occupy 496,706,920 (96.14%) bytes."
All of these instances are held by a single "org.apache.spark.deploy.worker.Worker" instance, whose finishedExecutors field holds many ExecutorRunner objects.
The code (Worker.scala) only ever uses finishedExecutors as "finishedExecutors(fullId) = executor" and "finishedExecutors.values.toList". Nothing removes an executor from it, so every finished executor stays in memory and the Worker crashes after running for a long time.
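The leak described above comes from a map that is only ever added to. One possible way to bound it is sketched below, assuming it is acceptable to forget the oldest finished executors. This is not taken from Worker.scala: FinishedExecutor, FinishedExecutorCache and the retained parameter are hypothetical names used only for illustration.

```scala
import scala.collection.mutable

// Hypothetical stand-in for ExecutorRunner; only the id matters in this sketch.
final case class FinishedExecutor(fullId: String)

// Keeps at most `retained` finished executors and evicts the oldest first.
// (Assumed design for illustration, not the actual Worker implementation.)
final class FinishedExecutorCache(retained: Int) {
  require(retained > 0, "retained must be positive")

  // LinkedHashMap iterates in insertion order, so its head is the oldest entry.
  private val finished = mutable.LinkedHashMap.empty[String, FinishedExecutor]

  def add(executor: FinishedExecutor): Unit = {
    // Remove first so that a re-added id moves to the newest position.
    finished.remove(executor.fullId)
    finished(executor.fullId) = executor
    // Trim from the oldest end until the size bound holds again.
    while (finished.size > retained) {
      finished.remove(finished.head._1)
    }
  }

  def values: List[FinishedExecutor] = finished.values.toList

  def size: Int = finished.size
}

object FinishedExecutorCacheDemo {
  def main(args: Array[String]): Unit = {
    val cache = new FinishedExecutorCache(retained = 1000)
    // 17,010 mirrors the instance count reported in the heap dump.
    (1 to 17010).foreach(i => cache.add(FinishedExecutor(s"app-1/$i")))
    println(cache.size) // prints 1000
  }
}
```

With a bound like this, memory held by finished executors stays proportional to the retained count instead of growing with the Worker's uptime. The cost is that older entries are no longer available to whatever reads finishedExecutors.values.toList.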
> Spark Worker process find Memory leak after long time running
> -------------------------------------------------------------
>
>                 Key: SPARK-11022
>                 URL: https://issues.apache.org/jira/browse/SPARK-11022
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.4.0
>            Reporter: colin shaw
>
> The Worker process often goes down even though there are no abnormal tasks;
> it simply crashes without any message. After adding
> "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=${SPARK_HOME}/logs", a heap
> dump shows that "17,010 instances of
> "org.apache.spark.deploy.worker.ExecutorRunner", loaded by
> "sun.misc.Launcher$AppClassLoader @ 0xe2abfcc8" occupy 496,706,920 (96.14%)
> bytes."
> All of these instances are held by a single
> "org.apache.spark.deploy.worker.Worker" instance, whose finishedExecutors
> field holds many ExecutorRunner objects.
> The code (Worker.scala) only ever uses finishedExecutors as
> "finishedExecutors(fullId) = executor" and
> "finishedExecutors.values.toList". Nothing removes an executor from it, so
> every finished executor stays in memory and the Worker crashes after running
> for a long time.