[ 
https://issues.apache.org/jira/browse/SPARK-6539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382292#comment-14382292
 ] 

davep commented on SPARK-6539:
------------------------------

Looks like this is actually a memory usage problem under Java 8.

Saw this in the NodeManager logs from the cluster:
{code}
    2015-03-26 14:51:32,266 WARN 
org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
 Container [pid=1324,containerID=container_1427380869947_0004_01_000001] is 
running beyond virtual memory limits. Current usage: 167.8 MB of 1 GB physical 
memory used; 2.1 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1427380869947_0004_01_000001 :
  |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) 
VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
  |- 1324 534 1324 1324 (bash) 0 0 15007744 707 /bin/bash -c 
/usr/java/default/bin/java -server -Xmx512m 
-Djava.io.tmpdir=/var/lib/gphd/hadoop-yarn/cache/yarn/nm-local-dir/usercache/hdfs/appcache/application_1427380869947_0004/container_1427380869947_0004_01_000001/tmp
 '-Dspark.executor.id=<driver>' 
'-Dspark.tachyonStore.folderName=spark-f683a69a-5ab4-4d9f-9551-810a5101bb87' 
'-Dspark.app.name=Spark Test' '-Dspark.master=yarn-client' 
'-Dspark.driver.host=davescinema.tor.pivotallabs.com' 
'-Dspark.fileserver.uri=http://10.74.5.105:56106' '-Dspark.driver.port=56105' 
'-Dspark.driver.appUIAddress=http://davescinema.tor.pivotallabs.com:4040' 
-Dspark.yarn.app.container.log.dir=/var/log/gphd/hadoop-yarn/containers/application_1427380869947_0004/container_1427380869947_0004_01_000001
 org.apache.spark.deploy.yarn.ExecutorLauncher --arg 
'davescinema.tor.pivotallabs.com:56105' --executor-memory 1024m 
--executor-cores 1 --num-executors  2 1> 
/var/log/gphd/hadoop-yarn/containers/application_1427380869947_0004/container_1427380869947_0004_01_000001/stdout
 2> 
/var/log/gphd/hadoop-yarn/containers/application_1427380869947_0004/container_1427380869947_0004_01_000001/stderr
 
  |- 1328 1324 1324 1324 (java) 467 68 2267566080 42248 
/usr/java/default/bin/java -server -Xmx512m 
-Djava.io.tmpdir=/var/lib/gphd/hadoop-yarn/cache/yarn/nm-local-dir/usercache/hdfs/appcache/application_1427380869947_0004/container_1427380869947_0004_01_000001/tmp
 -Dspark.executor.id=<driver> 
-Dspark.tachyonStore.folderName=spark-f683a69a-5ab4-4d9f-9551-810a5101bb87 
-Dspark.app.name=Spark Test -Dspark.master=yarn-client 
-Dspark.driver.host=davescinema.tor.pivotallabs.com 
-Dspark.fileserver.uri=http://10.74.5.105:56106 -Dspark.driver.port=56105 
-Dspark.driver.appUIAddress=http://davescinema.tor.pivotallabs.com:4040 
-Dspark.yarn.app.container.log.dir=/var/log/gphd/hadoop-yarn/containers/application_1427380869947_0004/container_1427380869947_0004_01_000001
 org.apache.spark.deploy.yarn.ExecutorLauncher --arg 
davescinema.tor.pivotallabs.com:56105 --executor-memory 1024m --executor-cores 
1 --num-executors 2 
{code}

Related Hadoop JIRA: https://issues.apache.org/jira/browse/HADOOP-11364
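
For context, the 2.1 GB virtual memory limit in the log above is the 1 GB container allocation multiplied by the default {{yarn.nodemanager.vmem-pmem-ratio}} of 2.1; Java 8 reserves more virtual address space (metaspace, code cache) than Java 7, which trips that check. A cluster-side alternative to the JVM flags below (untested on this particular cluster, so treat it as a sketch) would be to relax or disable the vmem check in yarn-site.xml:
{code:xml}
<!-- yarn-site.xml: either raise the virtual/physical memory ratio
     above the 2.1 default... -->
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>4</value>
</property>
<!-- ...or disable the virtual memory check entirely -->
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
{code}
Disabling the check sidesteps the container kill, but obviously doesn't address the error-reporting problem described further down.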

This snippet fixes the issue for us on Java 8:
{code}
SparkConf conf = new SparkConf().setAppName(appName).setMaster("yarn-client");
conf.set("spark.yarn.am.extraJavaOptions", "-XX:ReservedCodeCacheSize=100M 
-XX:MaxMetaspaceSize=256m -XX:CompressedClassSpaceSize=256m");
{code}

But those arguments will fail if the cluster uses Java 7, because those VM 
options are unrecognized there.
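
Since the options are only needed (and only accepted) on Java 8, one workaround is to set them conditionally. The sketch below is only a rough idea: it checks the *client* JVM's {{java.specification.version}}, which assumes the driver and the cluster nodes run the same Java major version -- that may not hold in mixed environments.
{code:java}
public class JvmVersionCheck {
    // java.specification.version uses the "1.x" scheme up to Java 8
    // ("1.7", "1.8") and plain numbers ("9", "11", ...) afterwards.
    static boolean isJava8OrNewer(String spec) {
        if (spec.startsWith("1.")) {
            return spec.compareTo("1.8") >= 0;
        }
        return true;
    }

    public static void main(String[] args) {
        String spec = System.getProperty("java.specification.version");
        if (isJava8OrNewer(spec)) {
            // Only then is it safe to add the Java 8-only VM options, e.g.:
            // conf.set("spark.yarn.am.extraJavaOptions",
            //     "-XX:ReservedCodeCacheSize=100M -XX:MaxMetaspaceSize=256m"
            //     + " -XX:CompressedClassSpaceSize=256m");
        }
        System.out.println(spec);
    }
}
{code}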

This results in a more accurate error:

{code}
org.apache.spark.SparkException: Yarn application has already ended! It might 
have been killed or unable to launch application master.
        at 
org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:113)
        at 
org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:59)
        at 
org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:141)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:379)
        at 
org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
{code}

My expectation would be that a SparkException is thrown, rather than a 
NullPointerException, when containers are killed due to memory issues.

> SparkContext throws NullPointerException
> ----------------------------------------
>
>                 Key: SPARK-6539
>                 URL: https://issues.apache.org/jira/browse/SPARK-6539
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.3.0
>         Environment: Spark built using the following:
> {code}
> ./make-distribution.sh --tgz -Pyarn -Phadoop-2.2 -Dhadoop.version=2.2.0 
> -DskipTests
> {code}
> Targeting Pivotal HD 2.1 (Hadoop 2.2) cluster with the nodes running JDK 8u40
>            Reporter: davep
>
> The following snippet of the client driver will throw a NullPointerException 
> when trying to create a SparkContext
> {code:java}
> SparkConf conf = new SparkConf().setAppName(appName).setMaster("yarn-client");
> JavaSparkContext sc = new JavaSparkContext(conf);
> {code}
> The exception is thrown when trying to create the block manager source here:
> https://github.com/apache/spark/blob/branch-1.3/core/src/main/scala/org/apache/spark/SparkContext.scala#L544
> If I compile the client driver with JDK7 or JDK8 it throws the same 
> exception. Changing the JDK on the Hadoop cluster to v7 resolves this issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
