Koert, is there any chance that your fs.defaultFS isn't set up right?
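For comparison, a minimal core-site.xml with an HDFS default filesystem. This is a sketch: the namenode host and port below are assumptions inferred from the hostnames in the quoted logs, not Koert's actual values.

```xml
<!-- core-site.xml: minimal sketch; host and port are assumed values -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <!-- If this resolves to file:/// (the Hadoop built-in default),
         the client stages job files on the local filesystem. -->
    <value>hdfs://cdh5-yarn.tresata.com:8020</value>
  </property>
</configuration>
```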
On Fri, Jun 20, 2014 at 9:57 AM, Koert Kuipers <ko...@tresata.com> wrote:
> yeah sure see below. i strongly suspect its something i misconfigured
> causing yarn to try to use local filesystem mistakenly.
>
> *********************
>
> [koert@cdh5-yarn ~]$ /usr/local/lib/spark/bin/spark-submit --class
> org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors 3
> --executor-cores 1
> hdfs://cdh5-yarn/lib/spark-examples-1.0.0-hadoop2.3.0-cdh5.0.2.jar 10
> 14/06/20 12:54:40 WARN NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
> 14/06/20 12:54:40 INFO RMProxy: Connecting to ResourceManager at
> cdh5-yarn.tresata.com/192.168.1.85:8032
> 14/06/20 12:54:41 INFO Client: Got Cluster metric info from
> ApplicationsManager (ASM), number of NodeManagers: 1
> 14/06/20 12:54:41 INFO Client: Queue info ... queueName: root.default,
> queueCurrentCapacity: 0.0, queueMaxCapacity: -1.0,
> queueApplicationCount = 0, queueChildQueueCount = 0
> 14/06/20 12:54:41 INFO Client: Max mem capabililty of a single resource in
> this cluster 8192
> 14/06/20 12:54:41 INFO Client: Preparing Local resources
> 14/06/20 12:54:41 WARN BlockReaderLocal: The short-circuit local reads
> feature cannot be used because libhadoop cannot be loaded.
> 14/06/20 12:54:41 INFO Client: Uploading
> hdfs://cdh5-yarn/lib/spark-examples-1.0.0-hadoop2.3.0-cdh5.0.2.jar to
> file:/home/koert/.sparkStaging/application_1403201750110_0060/spark-examples-1.0.0-hadoop2.3.0-cdh5.0.2.jar
> 14/06/20 12:54:43 INFO Client: Setting up the launch environment
> 14/06/20 12:54:43 INFO Client: Setting up container launch context
> 14/06/20 12:54:43 INFO Client: Command for starting the Spark
> ApplicationMaster: List($JAVA_HOME/bin/java, -server, -Xmx512m,
> -Djava.io.tmpdir=$PWD/tmp, -Dspark.akka.retry.wait=\"30000\",
> -Dspark.storage.blockManagerTimeoutIntervalMs=\"120000\",
> -Dspark.storage.blockManagerHeartBeatMs=\"120000\",
> -Dspark.app.name=\"org.apache.spark.examples.SparkPi\",
> -Dspark.akka.frameSize=\"10000\", -Dspark.akka.timeout=\"30000\",
> -Dspark.worker.timeout=\"30000\",
> -Dspark.akka.logLifecycleEvents=\"true\",
> -Dlog4j.configuration=log4j-spark-container.properties,
> org.apache.spark.deploy.yarn.ApplicationMaster, --class,
> org.apache.spark.examples.SparkPi, --jar ,
> hdfs://cdh5-yarn/lib/spark-examples-1.0.0-hadoop2.3.0-cdh5.0.2.jar,
> --args '10' , --executor-memory, 1024, --executor-cores, 1,
> --num-executors , 3, 1>, <LOG_DIR>/stdout, 2>, <LOG_DIR>/stderr)
> 14/06/20 12:54:43 INFO Client: Submitting application to ASM
> 14/06/20 12:54:43 INFO YarnClientImpl: Submitted application
> application_1403201750110_0060
> 14/06/20 12:54:44 INFO Client: Application report from ASM:
> application identifier: application_1403201750110_0060
> appId: 60
> clientToAMToken: null
> appDiagnostics:
> appMasterHost: N/A
> appQueue: root.koert
> appMasterRpcPort: -1
> appStartTime: 1403283283505
> yarnAppState: ACCEPTED
> distributedFinalState: UNDEFINED
> appTrackingUrl:
> http://cdh5-yarn.tresata.com:8088/proxy/application_1403201750110_0060/
> appUser: koert
> 14/06/20 12:54:46 INFO Client: Application report from ASM:
> application identifier: application_1403201750110_0060
> appId: 60
> clientToAMToken: null
> appDiagnostics:
> appMasterHost: N/A
> appQueue: root.koert
> appMasterRpcPort: -1
> appStartTime: 1403283283505
> yarnAppState: ACCEPTED
> distributedFinalState: UNDEFINED
> appTrackingUrl:
> http://cdh5-yarn.tresata.com:8088/proxy/application_1403201750110_0060/
> appUser: koert
> 14/06/20 12:54:47 INFO Client: Application report from ASM:
> application identifier: application_1403201750110_0060
> appId: 60
> clientToAMToken: null
> appDiagnostics: Application application_1403201750110_0060 failed 2
> times due to AM Container for appattempt_1403201750110_0060_000002 exited
> with exitCode: -1000 due to: File
> file:/home/koert/.sparkStaging/application_1403201750110_0060/spark-examples-1.0.0-hadoop2.3.0-cdh5.0.2.jar
> does not exist
> .Failing this attempt.. Failing the application.
> appMasterHost: N/A
> appQueue: root.koert
> appMasterRpcPort: -1
> appStartTime: 1403283283505
> yarnAppState: FAILED
> distributedFinalState: FAILED
> appTrackingUrl:
> cdh5-yarn.tresata.com:8088/cluster/app/application_1403201750110_0060
> appUser: koert
>
>
> On Fri, Jun 20, 2014 at 12:42 PM, Marcelo Vanzin <van...@cloudera.com>
> wrote:
>
>> Hi Koert,
>>
>> Could you provide more details? Job arguments, log messages, errors, etc.
>>
>> On Fri, Jun 20, 2014 at 9:40 AM, Koert Kuipers <ko...@tresata.com> wrote:
>> > i noticed that when i submit a job to yarn it mistakenly tries to upload
>> > files to local filesystem instead of hdfs. what could cause this?
>> >
>> > in spark-env.sh i have HADOOP_CONF_DIR set correctly (and spark-submit
>> > does find yarn), and my core-site.xml has a fs.defaultFS that is hdfs,
>> > not local filesystem.
>> >
>> > thanks! koert
>>
>> --
>> Marcelo
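The failure in the quoted logs (the jar uploaded to a file: URI that the NodeManager then cannot find) is consistent with the client qualifying a scheme-less staging path against a default filesystem of file:///. A rough, hypothetical sketch of that resolution, not Spark's or Hadoop's actual code, with a shortened staging path:

```python
from urllib.parse import urlparse

def qualify(path: str, default_fs: str) -> str:
    """Roughly how a path is qualified: if it already carries a scheme
    (hdfs://, file:, ...), keep it; otherwise prepend the default
    filesystem."""
    if urlparse(path).scheme:
        return path
    return default_fs.rstrip("/") + path

# Hypothetical staging path, shortened from the one in the logs.
staging = "/home/koert/.sparkStaging/app_0001/spark-examples.jar"

# If the client falls back to the built-in default (file:///), the
# staging copy lands on the client's local disk, as in the logs:
print(qualify(staging, "file://"))

# With fs.defaultFS pointing at HDFS, the same unqualified path is
# staged where the NodeManagers can actually read it:
print(qualify(staging, "hdfs://cdh5-yarn"))
```

This is why a correct HADOOP_CONF_DIR alone is not enough: the configuration actually loaded by the client at submit time has to resolve fs.defaultFS to HDFS, or every unqualified staging path degrades to the local filesystem.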