[jira] [Commented] (SPARK-7700) Spark 1.3.0 on YARN: Application failed 2 times due to AM Container
[ https://issues.apache.org/jira/browse/SPARK-7700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14569016#comment-14569016 ]

Kaveen Raajan commented on SPARK-7700:
--

I used Patch-1.patch and Spark is working fine.

Spark 1.3.0 on YARN: Application failed 2 times due to AM Container
---

                Key: SPARK-7700
                URL: https://issues.apache.org/jira/browse/SPARK-7700
            Project: Spark
         Issue Type: Story
         Components: Build
   Affects Versions: 1.3.1
        Environment: Windows 8 (single language), Hadoop-2.5.2, Protocol Buffers-2.5.0, Scala-2.11
           Reporter: Kaveen Raajan
        Attachments: Patch-1.patch

I built Spark with YARN support using the following command, and the build succeeded.
{panel}
mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive -Phive-0.12.0 -Phive-thriftserver -DskipTests clean package
{panel}
I set the following property in the spark-env.cmd file:
{panel}
SET SPARK_JAR=hdfs://master:9000/user/spark/jar
{panel}
*Note:* The Spark jar files were moved to the specified HDFS location. The Spark classpath was also added to hadoop-config.cmd, and HADOOP_CONF_DIR was set as an environment variable.

I tried to run the following SparkPi example in yarn-cluster mode.
{panel}
spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors 3 --driver-memory 4g --executor-memory 2g --executor-cores 1 --queue default S:\Hadoop\Spark\spark-1.3.1\examples\target\spark-examples_2.10-1.3.1.jar 10
{panel}
The job could be submitted to the Hadoop cluster, but it always stayed in the ACCEPTED state and then failed with the following error:
{panel}
15/05/14 13:00:51 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/05/14 13:00:51 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
15/05/14 13:00:51 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8048 MB per container)
15/05/14 13:00:51 INFO yarn.Client: Will allocate AM container, with 4480 MB memory including 384 MB overhead
15/05/14 13:00:51 INFO yarn.Client: Setting up container launch context for our AM
15/05/14 13:00:51 INFO yarn.Client: Preparing resources for our AM container
15/05/14 13:00:52 INFO yarn.Client: Source and destination file systems are the same.
Not copying hdfs://master:9000/user/spark/jar
15/05/14 13:00:52 INFO yarn.Client: Uploading resource file:/S:/Hadoop/Spark/spark-1.3.1/examples/target/spark-examples_2.10-1.3.1.jar -> hdfs://master:9000/user/HDFS/.sparkStaging/application_1431587916618_0003/spark-examples_2.10-1.3.1.jar
15/05/14 13:00:52 INFO yarn.Client: Setting up the launch environment for our AM container
15/05/14 13:00:52 INFO spark.SecurityManager: Changing view acls to: HDFS
15/05/14 13:00:52 INFO spark.SecurityManager: Changing modify acls to: HDFS
15/05/14 13:00:52 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(HDFS); users with modify permissions: Set(HDFS)
15/05/14 13:00:52 INFO yarn.Client: Submitting application 3 to ResourceManager
15/05/14 13:00:52 INFO impl.YarnClientImpl: Submitted application application_1431587916618_0003
15/05/14 13:00:53 INFO yarn.Client: Application report for application_1431587916618_0003 (state: ACCEPTED)
15/05/14 13:00:53 INFO yarn.Client:
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1431588652790
     final status: UNDEFINED
     tracking URL: http://master:8088/proxy/application_1431587916618_0003/
     user: HDFS
15/05/14 13:00:54 INFO yarn.Client: Application report for application_1431587916618_0003 (state: ACCEPTED)
15/05/14 13:00:55 INFO yarn.Client: Application report for application_1431587916618_0003 (state: ACCEPTED)
15/05/14 13:00:56 INFO yarn.Client: Application report for application_1431587916618_0003 (state: ACCEPTED)
15/05/14 13:00:57 INFO yarn.Client: Application report for application_1431587916618_0003 (state: ACCEPTED)
15/05/14 13:00:58 INFO yarn.Client: Application report for application_1431587916618_0003 (state: ACCEPTED)
15/05/14 13:00:59 INFO yarn.Client: Application report for application_1431587916618_0003 (state: FAILED)
15/05/14 13:00:59 INFO yarn.Client:
     client token: N/A
     diagnostics:
Application application_1431587916618_0003 failed 2 times due to AM Container for appattempt_1431587916618_0003_02 exited with exitCode: 1
For more detailed output, check the application tracking page: http://master:8088/proxy/application_1431587916618_0003/ Then, click on links to logs of each attempt.
Diagnostics: Exception from
{panel}
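For context, the SparkPi example submitted above estimates π by Monte Carlo sampling. A minimal, non-Spark Scala sketch of the same arithmetic (the function name and sample count here are illustrative, not taken from the Spark sources):

```scala
import scala.util.Random

// Estimate pi by sampling points in the unit square and counting those that
// fall inside the quarter circle x*x + y*y <= 1; the inside ratio approaches pi/4.
def estimatePi(samples: Int, seed: Long = 42L): Double = {
  val rng = new Random(seed)
  val inside = (1 to samples).count { _ =>
    val x = rng.nextDouble()
    val y = rng.nextDouble()
    x * x + y * y <= 1.0
  }
  4.0 * inside.toDouble / samples
}
```

Spark's own version distributes the sampling loop across executors; the failure discussed in this issue happens before any of that work starts, while YARN is launching the AM container.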
[jira] [Commented] (SPARK-7700) Spark 1.3.0 on YARN: Application failed 2 times due to AM Container
[ https://issues.apache.org/jira/browse/SPARK-7700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551983#comment-14551983 ]

Sean Owen commented on SPARK-7700:
--

I'm not quite sure what change you're proposing, but we work in terms of pull requests on GitHub. However, I think this is the same issue as SPARK-5754, so I'd try to review and join any work there.
[jira] [Commented] (SPARK-7700) Spark 1.3.0 on YARN: Application failed 2 times due to AM Container
[ https://issues.apache.org/jira/browse/SPARK-7700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550436#comment-14550436 ]

Kaveen Raajan commented on SPARK-7700:
--

Hi [~srowen],
Thanks for the update. I am now able to run Spark jobs successfully in both yarn-client and yarn-cluster mode. I made the following change in the Spark code, replacing the single quotes with double quotes.

*Line Number - 163*
{code:title=YarnSparkHadoopUtil.scala|borderStyle=solid}
def escapeForShell(arg: String): String = {
  if (arg != null) {
    val escaped = new StringBuilder("'")
    for (i <- 0 to arg.length() - 1) {
      arg.charAt(i) match {
        case '$' => escaped.append("\\$")
        case '"' => escaped.append("\\\"")
        case '\'' => escaped.append("'\\''")
        case c => escaped.append(c)
      }
    }
    escaped.append("'").toString()
  } else {
    arg
  }
}
{code}
After making this change, we faced the issue:
{color:red}Error: Could not find or load main class PWD.Syncfusion.BigDataSDK.2.1.0.70.SDK.Hadoop.logs.userlogs.application_1431950105623_0001.container_1431950105623_0001_01_04{color}
To resolve this, we identified where the issue is reproduced in the Spark source and removed -XX:OnOutOfMemoryError='kill %p' from the commands.

*Line Number - 213*
{code:title=ExecutorRunnable.scala|borderStyle=solid}
val commands = prefixEnv ++ Seq(
  YarnSparkHadoopUtil.expandEnvironment(Environment.JAVA_HOME) + "/bin/java",
  "-server",
  // Kill if OOM is raised - leverage yarn's failure handling to cause rescheduling.
  // Not killing the task leaves various aspects of the executor and (to some extent) the jvm in
  // an inconsistent state.
  // TODO: If the OOM is not recoverable by rescheduling it on different node, then do
  // 'something' to fail job ... akin to blacklisting trackers in mapred ?
  "-XX:OnOutOfMemoryError='kill %p'") ++
  javaOpts ++
  Seq("org.apache.spark.executor.CoarseGrainedExecutorBackend",
    "--driver-url", masterAddress.toString,
    "--executor-id", slaveId.toString,
    "--hostname", hostname.toString,
    "--cores", executorCores.toString,
    "--app-id", appId) ++
  userClassPath ++
  Seq(
    "1>", ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stdout",
    "2>", ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stderr")
{code}
Now we are able to run all Spark jobs in both yarn-cluster and yarn-client mode. A few questions: Is there an equivalent patch available for these changes? Why does Windows not accept single quotes? And what is the reason for the line -XX:OnOutOfMemoryError='kill %p'?
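The quoting problem discussed in this comment can be illustrated without Spark: a POSIX shell strips single quotes before the program sees its arguments, while cmd.exe gives single quotes no special meaning, so a single-quoted argument reaches a Windows program with the quotes still attached. A minimal sketch (the helper names are hypothetical, not Spark code), assuming simple tokens with no embedded quotes:

```scala
// How a launched JVM sees the token '-Dspark.master=yarn-cluster' after the
// shell has processed it. POSIX sh removes the surrounding single quotes;
// cmd.exe passes them through unchanged, so java receives a token that starts
// with ' and cannot parse it as a -D system property.
def posixUnquote(token: String): String =
  if (token.length >= 2 && token.head == '\'' && token.last == '\'')
    token.substring(1, token.length - 1)
  else token

def cmdExeUnquote(token: String): String = token // single quotes are ordinary characters
```

For example, `posixUnquote("'-Dspark.master=yarn-cluster'")` yields a valid JVM option, while `cmdExeUnquote` leaves the quotes in place, which matches the "Could not find or load main class" symptom above.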
[jira] [Commented] (SPARK-7700) Spark 1.3.0 on YARN: Application failed 2 times due to AM Container
[ https://issues.apache.org/jira/browse/SPARK-7700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14547641#comment-14547641 ]

Kaveen Raajan commented on SPARK-7700:
--

Hi [~srowen],
I'm sure there is no space in my Hadoop or Spark path. Since I am running in a Windows environment, I replaced spark-env.sh with spark-env.cmd, which contains _SET SPARK_JAR=hdfs://master:9000/user/spark/jar_
*Note:* The Spark jar files were moved to the specified HDFS location. The Spark classpath was also added to hadoop-config.cmd, and HADOOP_CONF_DIR was set as an environment variable. I also did not put any JVM argument in the classes involved.
Looking at the launch-container.cmd file, these lines are executed:
{panel}
@C:\Hadoop\bin\winutils.exe symlink __spark__.jar \tmp\hadoop-HDFS\nm-local-dir\filecache\10\jar
@C:\Hadoop\bin\winutils.exe symlink __app__.jar \tmp\hadoop-HDFS\nm-local-dir\usercache\HDFS\filecache\10\spark-examples-1.3.1-hadoop2.5.2.jar
@call %JAVA_HOME%/bin/java -server -Xmx4096m -Djava.io.tmpdir=%PWD%/tmp '-Dspark.executor.memory=2g' '-Dspark.app.name=org.apache.spark.examples.SparkPi' '-Dspark.master=yarn-cluster' -Dspark.yarn.app.container.log.dir=C:/Hadoop/logs/userlogs/application_1431924261044_0002/container_1431924261044_0002_01_01 org.apache.spark.deploy.yarn.ApplicationMaster --class 'org.apache.spark.examples.SparkPi' --jar file:/C:/Spark/lib/spark-examples-1.3.1-hadoop2.5.2.jar --arg '10' --executor-memory 2048m --executor-cores 1 --num-executors 3 1> C:/Hadoop/logs/userlogs/application_1431924261044_0002/container_1431924261044_0002_01_01/stdout 2> C:/Hadoop/logs/userlogs/application_1431924261044_0002/container_1431924261044_0002_01_01/stderr
{panel}