Re: Exception adding resource files in latest Spark
Thanks for flagging this. I reverted the relevant YARN fix in the Spark 1.2 release. We can try to debug this in master.

On Thu, Dec 4, 2014 at 9:51 PM, Jianshi Huang wrote:
> I created a ticket for this:
>
> https://issues.apache.org/jira/browse/SPARK-4757
>
> Jianshi
Re: Exception adding resource files in latest Spark
I created a ticket for this:

https://issues.apache.org/jira/browse/SPARK-4757

Jianshi
Re: Exception adding resource files in latest Spark
Correction:

According to Liancheng, this hotfix might be the root cause:

https://github.com/apache/spark/commit/38cb2c3a36a5c9ead4494cbc3dde008c2f0698ce

Jianshi
Re: Exception adding resource files in latest Spark
Looks like the datanucleus*.jar files shouldn't appear in the HDFS path in yarn-client mode.

Maybe this patch broke yarn-client:

https://github.com/apache/spark/commit/a975dc32799bb8a14f9e1c76defaaa7cfbaf8b53

Jianshi
Re: Exception adding resource files in latest Spark
Actually my HADOOP_CLASSPATH has already been set to include /etc/hadoop/conf/*:

export HADOOP_CLASSPATH=/etc/hbase/conf/hbase-site.xml:/usr/lib/hbase/lib/hbase-protocol.jar:$(hbase classpath)

Jianshi
Re: Exception adding resource files in latest Spark
Looks like somehow Spark failed to find core-site.xml in /etc/hadoop/conf.

I've already set the following env variables:

export YARN_CONF_DIR=/etc/hadoop/conf
export HADOOP_CONF_DIR=/etc/hadoop/conf
export HBASE_CONF_DIR=/etc/hbase/conf

Should I put $HADOOP_CONF_DIR/* in HADOOP_CLASSPATH?

Jianshi
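For anyone debugging the same setup: a quick way to see which directory a core-site.xml will actually be picked up from is to walk the candidate conf dirs in order. This is only an illustrative Python sketch; the candidate list and ordering below are assumptions about a typical layout, not Spark's exact launcher logic.

```python
import os

def find_core_site(conf_dirs):
    """Return the first directory in conf_dirs that actually contains a
    readable core-site.xml, or None if none do."""
    for d in conf_dirs:
        if d and os.path.isfile(os.path.join(d, "core-site.xml")):
            return d
    return None

# Check the usual suspects in order: HADOOP_CONF_DIR, YARN_CONF_DIR,
# then the conventional distro default (an assumption for this sketch).
candidates = [os.environ.get("HADOOP_CONF_DIR"),
              os.environ.get("YARN_CONF_DIR"),
              "/etc/hadoop/conf"]
print("core-site.xml found in:", find_core_site(candidates))
```

If this prints `None` for the dirs your driver JVM sees, the default filesystem will silently fall back to file:///, which matches the symptom above.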
Exception adding resource files in latest Spark
I got the following error during Spark startup (yarn-client mode):

14/12/04 19:33:58 INFO Client: Uploading resource file:/x/home/jianshuang/spark/spark-latest/lib/datanucleus-api-jdo-3.2.6.jar -> hdfs://stampy/user/jianshuang/.sparkStaging/application_1404410683830_531767/datanucleus-api-jdo-3.2.6.jar
java.lang.IllegalArgumentException: Wrong FS: hdfs://stampy/user/jianshuang/.sparkStaging/application_1404410683830_531767/datanucleus-api-jdo-3.2.6.jar, expected: file:///
        at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:643)
        at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:79)
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:506)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:724)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:501)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397)
        at org.apache.spark.deploy.yarn.ClientDistributedCacheManager.addResource(ClientDistributedCacheManager.scala:67)
        at org.apache.spark.deploy.yarn.ClientBase$$anonfun$prepareLocalResources$5.apply(ClientBase.scala:257)
        at org.apache.spark.deploy.yarn.ClientBase$$anonfun$prepareLocalResources$5.apply(ClientBase.scala:242)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.deploy.yarn.ClientBase$class.prepareLocalResources(ClientBase.scala:242)
        at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:35)
        at org.apache.spark.deploy.yarn.ClientBase$class.createContainerLaunchContext(ClientBase.scala:350)
        at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:35)
        at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:80)
        at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:57)
        at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:140)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:335)
        at org.apache.spark.repl.SparkILoop.createSparkContext(SparkILoop.scala:986)
        at $iwC$$iwC.<init>(<console>:9)
        at $iwC.<init>(<console>:18)
        at <init>(<console>:20)
        at .<init>(<console>:24)

I'm using the latest Spark built from yesterday's master HEAD. Is this a bug?

--
Jianshi Huang

LinkedIn: jianshi
Twitter: @jshuang
Github & Blog: http://huangjs.github.com/
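For context, the "Wrong FS" message comes from Hadoop's FileSystem.checkPath, which rejects any Path whose URI scheme differs from the filesystem asked to resolve it. Here the client resolved an hdfs:// staging path against the local filesystem, i.e. the default FS fell back to file:///. The following is a rough, simplified Python analogue of that check, not Hadoop's actual implementation:

```python
from urllib.parse import urlparse

def check_path(fs_scheme: str, path: str) -> None:
    """Simplified analogue of Hadoop's FileSystem.checkPath: a filesystem
    only accepts paths whose scheme matches its own; scheme-less paths
    inherit the filesystem's scheme."""
    scheme = urlparse(path).scheme or fs_scheme
    if scheme != fs_scheme:
        raise ValueError(f"Wrong FS: {path}, expected: {fs_scheme}:///")

# The local filesystem (file://) accepts local paths...
check_path("file", "file:/tmp/datanucleus-api-jdo-3.2.6.jar")
# ...but rejects an hdfs:// staging path -- exactly the failure above,
# where core-site.xml was not found and fs.defaultFS stayed file:///.
try:
    check_path("file", "hdfs://stampy/user/x/.sparkStaging/app/x.jar")
except ValueError as e:
    print(e)
```

In other words, the upload itself is fine; it's the local FileSystem object being handed an HDFS URI that blows up.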