Hi Zhan,

Alas, setting -Dhdp.version=2.2.0.0-2041 does not help. I still get the same error:

15/04/13 09:53:59 INFO yarn.Client:
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1428918838408
     final status: UNDEFINED
     tracking URL: http://foo.bar.site:8088/proxy/application_1427875242006_0037/
     user: test
15/04/13 09:54:00 INFO yarn.Client: Application report for application_1427875242006_0037 (state: ACCEPTED)
15/04/13 09:54:01 INFO yarn.Client: Application report for application_1427875242006_0037 (state: ACCEPTED)
15/04/13 09:54:02 INFO yarn.Client: Application report for application_1427875242006_0037 (state: ACCEPTED)
15/04/13 09:54:03 INFO yarn.Client: Application report for application_1427875242006_0037 (state: FAILED)
15/04/13 09:54:03 INFO yarn.Client:
     client token: N/A
     diagnostics: Application application_1427875242006_0037 failed 2 times due to AM Container for appattempt_1427875242006_0037_000002 exited with exitCode: 1
For more detailed output, check application tracking page: http://foo.bar.site:8088/proxy/application_1427875242006_0037/ Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1427875242006_0037_02_000001
Exit code: 1
Exception message: /mnt/hdfs01/hadoop/yarn/local/usercache/test/appcache/application_1427875242006_0037/container_1427875242006_0037_02_000001/launch_container.sh: line 27:
$PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
Stack trace: ExitCodeException exitCode=1: /mnt/hdfs01/hadoop/yarn/local/usercache/test/appcache/application_1427875242006_0037/container_1427875242006_0037_02_000001/launch_container.sh: line 27:
$PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
	at org.apache.hadoop.util.Shell.run(Shell.java:455)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
	at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1428918838408
     final status: FAILED
     tracking URL: http://foo.bar.site:8088/cluster/app/application_1427875242006_0037
     user: test
Exception in thread "main" org.apache.spark.SparkException: Application finished with failed status
	at org.apache.spark.deploy.yarn.Client.run(Client.scala:622)
	at org.apache.spark.deploy.yarn.Client$.main(Client.scala:647)
	at org.apache.spark.deploy.yarn.Client.main(Client.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

On Fri, Apr 10, 2015 at 8:50 PM, Zhan Zhang <zzh...@hortonworks.com> wrote:
> Hi Zork,
>
> There is some script change in spark-1.3 when starting Spark. You can
> try putting a java-opts file in your conf/ with the following contents:
>
> -Dhdp.version=2.2.0.0-2041
>
> Please let me know whether it works or not.
>
> Thanks.
>
> Zhan Zhang
>
> On Apr 10, 2015, at 7:21 AM, Zork Sail <zorks...@gmail.com> wrote:
>
> Many thanks. Yet even after setting:
>
> spark.driver.extraJavaOptions -Dhdp.version=2.2.0.0-2041
> spark.yarn.am.extraJavaOptions -Dhdp.version=2.2.0.0-2041
>
> in SPARK_HOME/conf/spark-defaults.conf, I still have exactly the same
> error log as before ((
>
> On Fri, Apr 10, 2015 at 5:44 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> Zork:
>> See http://search-hadoop.com/m/JW1q5iQhwz1
>>
>> On Apr 10, 2015, at 5:08 AM, Zork Sail <zorks...@gmail.com> wrote:
>>
>> I have built Spark with the command:
>>
>> mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver -DskipTests package
>>
>> What is missing in this command to build it for YARN?
>>
>> I have also tried the latest pre-built version with Hadoop support.
>> In both cases I get the same errors described above.
>> What else can be wrong? Maybe Spark 1.3.0 does not support Hadoop 2.6?
>>
>> On Fri, Apr 10, 2015 at 3:29 PM, Sean Owen <so...@cloudera.com> wrote:
>>
>>> I see at least two possible problems: maybe you did not build Spark
>>> for YARN, and it looks like a variable hdp.version is expected in your
>>> environment but not set (this isn't specific to Spark).
>>>
>>> On Fri, Apr 10, 2015 at 6:34 AM, Zork Sail <zorks...@gmail.com> wrote:
>>> >
>>> > Please help! Completely stuck trying to run Spark 1.3.0 on YARN!
>>> > I have `Hadoop 2.6.0.2.2.0.0-2041` with `Hive 0.14.0.2.2.0.0-2041`.
>>> > After building Spark with the command:
>>> >
>>> > mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.6.0 -Phive -Phive-thriftserver -DskipTests package
>>> >
>>> > I try to run the Pi example on YARN with the following command:
>>> >
>>> > export HADOOP_CONF_DIR=/etc/hadoop/conf
>>> > /var/home2/test/spark/bin/spark-submit \
>>> >     --class org.apache.spark.examples.SparkPi \
>>> >     --master yarn-cluster \
>>> >     --executor-memory 3G \
>>> >     --num-executors 50 \
>>> >     hdfs:///user/test/jars/spark-examples-1.3.0-hadoop2.4.0.jar \
>>> >     1000
>>> >
>>> > I get the exception `application_1427875242006_0029 failed 2 times due to AM
>>> > Container for appattempt_1427875242006_0029_000002 exited with exitCode: 1`,
>>> > which in fact is `Diagnostics: Exception from container-launch.` (please see
>>> > the log below).
>>> >
>>> > The application tracking URL reveals the following messages:
>>> >
>>> > java.lang.Exception: Unknown container. Container either has not started
>>> > or has already completed or doesn't belong to this node at all
>>> >
>>> > and also:
>>> >
>>> > Error: Could not find or load main class
>>> > org.apache.spark.deploy.yarn.ApplicationMaster
>>> >
>>> > I have Hadoop working fine on 4 nodes and am completely at a loss how to
>>> > make Spark work on YARN. Please advise where to look, any ideas would be of
>>> > great help, thank you!
>>> >
>>> > Spark assembly has been built with Hive, including Datanucleus jars on classpath
>>> > 15/04/06 10:53:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>> > 15/04/06 10:53:42 INFO impl.TimelineClientImpl: Timeline service address: http://etl-hdp-yarn.foo.bar.com:8188/ws/v1/timeline/
>>> > 15/04/06 10:53:42 INFO client.RMProxy: Connecting to ResourceManager at etl-hdp-yarn.foo.bar.com/192.168.0.16:8050
>>> > 15/04/06 10:53:42 INFO yarn.Client: Requesting a new application from cluster with 4 NodeManagers
>>> > 15/04/06 10:53:42 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (4096 MB per container)
>>> > 15/04/06 10:53:42 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
>>> > 15/04/06 10:53:42 INFO yarn.Client: Setting up container launch context for our AM
>>> > 15/04/06 10:53:42 INFO yarn.Client: Preparing resources for our AM container
>>> > 15/04/06 10:53:43 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
>>> > 15/04/06 10:53:43 INFO yarn.Client: Uploading resource file:/var/home2/test/spark-1.3.0/assembly/target/scala-2.10/spark-assembly-1.3.0-hadoop2.6.0.jar -> hdfs://etl-hdp-nn1.foo.bar.com:8020/user/test/.sparkStaging/application_1427875242006_0029/spark-assembly-1.3.0-hadoop2.6.0.jar
>>> > 15/04/06 10:53:44 INFO yarn.Client: Source and destination file systems are the same. Not copying hdfs:/user/test/jars/spark-examples-1.3.0-hadoop2.4.0.jar
>>> > 15/04/06 10:53:44 INFO yarn.Client: Setting up the launch environment for our AM container
>>> > 15/04/06 10:53:44 INFO spark.SecurityManager: Changing view acls to: test
>>> > 15/04/06 10:53:44 INFO spark.SecurityManager: Changing modify acls to: test
>>> > 15/04/06 10:53:44 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(test); users with modify permissions: Set(test)
>>> > 15/04/06 10:53:44 INFO yarn.Client: Submitting application 29 to ResourceManager
>>> > 15/04/06 10:53:44 INFO impl.YarnClientImpl: Submitted application application_1427875242006_0029
>>> > 15/04/06 10:53:45 INFO yarn.Client: Application report for application_1427875242006_0029 (state: ACCEPTED)
>>> > 15/04/06 10:53:45 INFO yarn.Client:
>>> >     client token: N/A
>>> >     diagnostics: N/A
>>> >     ApplicationMaster host: N/A
>>> >     ApplicationMaster RPC port: -1
>>> >     queue: default
>>> >     start time: 1428317623905
>>> >     final status: UNDEFINED
>>> >     tracking URL: http://etl-hdp-yarn.foo.bar.com:8088/proxy/application_1427875242006_0029/
>>> >     user: test
>>> > 15/04/06 10:53:46 INFO yarn.Client: Application report for application_1427875242006_0029 (state: ACCEPTED)
>>> > 15/04/06 10:53:47 INFO yarn.Client: Application report for application_1427875242006_0029 (state: ACCEPTED)
>>> > 15/04/06 10:53:48 INFO yarn.Client: Application report for application_1427875242006_0029 (state: ACCEPTED)
>>> > 15/04/06 10:53:49 INFO yarn.Client: Application report for application_1427875242006_0029 (state: FAILED)
>>> > 15/04/06 10:53:49 INFO yarn.Client:
>>> >     client token: N/A
>>> >     diagnostics: Application application_1427875242006_0029 failed 2 times due to AM Container for appattempt_1427875242006_0029_000002 exited with exitCode: 1
>>> > For more detailed output, check application tracking page: http://etl-hdp-yarn.foo.bar.com:8088/proxy/application_1427875242006_0029/ Then, click on links to logs of each attempt.
>>> > Diagnostics: Exception from container-launch.
>>> > Container id: container_1427875242006_0029_02_000001
>>> > Exit code: 1
>>> > Exception message: /mnt/hdfs01/hadoop/yarn/local/usercache/test/appcache/application_1427875242006_0029/container_1427875242006_0029_02_000001/launch_container.sh: line 27:
>>> > $PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
>>> >
>>> > Stack trace: ExitCodeException exitCode=1: /mnt/hdfs01/hadoop/yarn/local/usercache/test/appcache/application_1427875242006_0029/container_1427875242006_0029_02_000001/launch_container.sh: line 27:
>>> > $PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
>>> >
>>> >     at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
>>> >     at org.apache.hadoop.util.Shell.run(Shell.java:455)
>>> >     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
>>> >     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
>>> >     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
>>> >     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
>>> >     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>> >     at java.lang.Thread.run(Thread.java:745)
>>> >
>>> > Container exited with a non-zero exit code 1
>>> > Failing this attempt. Failing the application.
>>> >     ApplicationMaster host: N/A
>>> >     ApplicationMaster RPC port: -1
>>> >     queue: default
>>> >     start time: 1428317623905
>>> >     final status: FAILED
>>> >     tracking URL: http://etl-hdp-yarn.foo.bar.com:8088/cluster/app/application_1427875242006_0029
>>> >     user: test
>>> > Exception in thread "main" org.apache.spark.SparkException: Application finished with failed status
>>> >     at org.apache.spark.deploy.yarn.Client.run(Client.scala:622)
>>> >     at org.apache.spark.deploy.yarn.Client$.main(Client.scala:647)
>>> >     at org.apache.spark.deploy.yarn.Client.main(Client.scala)
>>> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> >     at java.lang.reflect.Method.invoke(Method.java:606)
>>> >     at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
>>> >     at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
>>> >     at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
>>> >     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
>>> >     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
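
P.S. In case it helps narrow things down: the `bad substitution` message comes from bash itself, not from Hadoop or Spark. `${hdp.version}` is not valid shell parameter syntax (a dot cannot appear in a variable name), so when YARN writes the unexpanded `${hdp.version}` token into launch_container.sh, bash aborts the whole CLASSPATH line. It is easy to reproduce locally:

```shell
# The literal token left in launch_container.sh is not legal bash:
# '.' is not allowed in a parameter name, so bash rejects the line
# with "bad substitution" before the container can even start.
bash -c 'echo /usr/hdp/${hdp.version}/hadoop/lib' 2>&1 | grep 'bad substitution'
```

So whatever we try, the fix has to ensure the real value (2.2.0.0-2041 here) is substituted before the launch script is generated, e.g. via the conf/java-opts file or the spark-defaults.conf properties discussed above (or, if I read the docs right, the equivalent --conf spark.driver.extraJavaOptions=-Dhdp.version=2.2.0.0-2041 flag on the spark-submit command line).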