[
https://issues.apache.org/jira/browse/SPARK-6539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382292#comment-14382292
]
davep commented on SPARK-6539:
------------------------------
Looks like it's more related to memory usage with Java 8. I saw this in the logs from the cluster:
{code}
2015-03-26 14:51:32,266 WARN
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
Container [pid=1324,containerID=container_1427380869947_0004_01_000001] is
running beyond virtual memory limits. Current usage: 167.8 MB of 1 GB physical
memory used; 2.1 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1427380869947_0004_01_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS)
VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 1324 534 1324 1324 (bash) 0 0 15007744 707 /bin/bash -c
/usr/java/default/bin/java -server -Xmx512m
-Djava.io.tmpdir=/var/lib/gphd/hadoop-yarn/cache/yarn/nm-local-dir/usercache/hdfs/appcache/application_1427380869947_0004/container_1427380869947_0004_01_000001/tmp
'-Dspark.executor.id=<driver>'
'-Dspark.tachyonStore.folderName=spark-f683a69a-5ab4-4d9f-9551-810a5101bb87'
'-Dspark.app.name=Spark Test' '-Dspark.master=yarn-client'
'-Dspark.driver.host=davescinema.tor.pivotallabs.com'
'-Dspark.fileserver.uri=http://10.74.5.105:56106' '-Dspark.driver.port=56105'
'-Dspark.driver.appUIAddress=http://davescinema.tor.pivotallabs.com:4040'
-Dspark.yarn.app.container.log.dir=/var/log/gphd/hadoop-yarn/containers/application_1427380869947_0004/container_1427380869947_0004_01_000001
org.apache.spark.deploy.yarn.ExecutorLauncher --arg
'davescinema.tor.pivotallabs.com:56105' --executor-memory 1024m
--executor-cores 1 --num-executors 2 1>
/var/log/gphd/hadoop-yarn/containers/application_1427380869947_0004/container_1427380869947_0004_01_000001/stdout
2>
/var/log/gphd/hadoop-yarn/containers/application_1427380869947_0004/container_1427380869947_0004_01_000001/stderr
|- 1328 1324 1324 1324 (java) 467 68 2267566080 42248
/usr/java/default/bin/java -server -Xmx512m
-Djava.io.tmpdir=/var/lib/gphd/hadoop-yarn/cache/yarn/nm-local-dir/usercache/hdfs/appcache/application_1427380869947_0004/container_1427380869947_0004_01_000001/tmp
-Dspark.executor.id=<driver>
-Dspark.tachyonStore.folderName=spark-f683a69a-5ab4-4d9f-9551-810a5101bb87
-Dspark.app.name=Spark Test -Dspark.master=yarn-client
-Dspark.driver.host=davescinema.tor.pivotallabs.com
-Dspark.fileserver.uri=http://10.74.5.105:56106 -Dspark.driver.port=56105
-Dspark.driver.appUIAddress=http://davescinema.tor.pivotallabs.com:4040
-Dspark.yarn.app.container.log.dir=/var/log/gphd/hadoop-yarn/containers/application_1427380869947_0004/container_1427380869947_0004_01_000001
org.apache.spark.deploy.yarn.ExecutorLauncher --arg
davescinema.tor.pivotallabs.com:56105 --executor-memory 1024m --executor-cores
1 --num-executors 2
{code}
Related Hadoop JIRA: https://issues.apache.org/jira/browse/HADOOP-11364
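HADOOP-11364 tracks the larger virtual-memory footprint of Java 8 tripping YARN's vmem check. Note the 2.1 GB limit in the log above is just the 1 GB container size times the default {{yarn.nodemanager.vmem-pmem-ratio}} of 2.1. A YARN-side workaround sketch (these are stock yarn-site.xml properties, not anything Spark-specific; the value 4 is an arbitrary example) is to raise that ratio or disable the check entirely:
{code:xml}
<!-- yarn-site.xml: either raise the virtual-to-physical memory ratio
     (default 2.1) so Java 8's larger vmem footprint fits... -->
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>4</value>
</property>
<!-- ...or disable the virtual-memory check on the NodeManager. -->
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
{code}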
This snippet fixes our issues with Java 8:
{code}
SparkConf conf = new SparkConf().setAppName(appName).setMaster("yarn-client");
conf.set("spark.yarn.am.extraJavaOptions",
    "-XX:ReservedCodeCacheSize=100M -XX:MaxMetaspaceSize=256m -XX:CompressedClassSpaceSize=256m");
{code}
But those arguments will fail if the cluster runs Java 7, because those VM options are unrecognized by a Java 7 JVM.
This results in a more accurate error:
{code}
org.apache.spark.SparkException: Yarn application has already ended! It might
have been killed or unable to launch application master.
at
org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:113)
at
org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:59)
at
org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:141)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:379)
at
org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
{code}
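One way to keep a single driver working on both JVMs is to apply the Java 8-only flags conditionally. The sketch below is my own, not from the report; the {{AmOptions}} class and helper are hypothetical, and it reads the driver's local {{java.version}}, which is assumed to match the cluster's JVM:

```java
// Hypothetical helper (not part of Spark): build the AM extraJavaOptions
// value only for Java 8, so the same driver still launches on Java 7.
public class AmOptions {
    static String amExtraJavaOptions(String javaVersion) {
        if (javaVersion.startsWith("1.8")) {
            // Metaspace replaced PermGen in Java 8; these flags are
            // Java 8-only and make a Java 7 JVM fail to start with
            // "Unrecognized VM option".
            return "-XX:ReservedCodeCacheSize=100M"
                 + " -XX:MaxMetaspaceSize=256m -XX:CompressedClassSpaceSize=256m";
        }
        return ""; // Java 7 and earlier: leave the defaults alone
    }

    public static void main(String[] args) {
        String opts = amExtraJavaOptions(System.getProperty("java.version"));
        // Applied to a SparkConf as:
        //   conf.set("spark.yarn.am.extraJavaOptions", opts);
        System.out.println(opts);
    }
}
```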
My expectation would be that a SparkException is thrown instead of a
NullPointerException when containers are killed due to memory issues.
> SparkContext throws NullPointerException
> ----------------------------------------
>
> Key: SPARK-6539
> URL: https://issues.apache.org/jira/browse/SPARK-6539
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.3.0
> Environment: Spark built using the following:
> {code}
> ./make-distribution.sh --tgz -Pyarn -Phadoop-2.2 -Dhadoop.version=2.2.0
> -DskipTests
> {code}
> Targeting Pivotal HD 2.1 (Hadoop 2.2) cluster with the nodes running JDK 8u40
> Reporter: davep
>
> The following snippet of the client driver will throw a NullPointerException
> when trying to create a SparkContext
> {code:java}
> SparkConf conf = new SparkConf().setAppName(appName).setMaster("yarn-client");
> JavaSparkContext sc = new JavaSparkContext(conf);
> {code}
> The exception is thrown when trying to create the block manager source here:
> https://github.com/apache/spark/blob/branch-1.3/core/src/main/scala/org/apache/spark/SparkContext.scala#L544
> Whether I compile the client driver with JDK 7 or JDK 8, it throws the same
> exception. Changing the JDK on the Hadoop cluster to version 7 resolves the issue.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)