RE: No FileSystem for scheme: hdfs

2014-07-04 Thread Steven Cox
Thanks for the help folks.

Adding the config files was necessary but not sufficient.

I also had hadoop 1.0.4 classes on the classpath because a bad jar:

   spark-0.9.1/jars/spark-assembly-0.9.1-hadoop1.0.4.jar

was in my spark executor tar.gz (stored in HDFS).

I believe this was due to a bit of unfortunate devops hygiene during the 
install of our new cluster.
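
In case it saves someone else time: a quick way to see which FileSystem
implementations are actually visible on a given classpath. This is just a
sketch (the object name is made up), and it assumes Hadoop 2.x client jars,
which register filesystems via ServiceLoader:

   import java.util.ServiceLoader
   import org.apache.hadoop.fs.FileSystem
   import scala.collection.JavaConverters._
   import scala.util.Try

   // List every FileSystem implementation the current classpath exposes.
   // With a stray hadoop1 assembly jar shadowing things, "hdfs" either
   // disappears or resolves to an unexpected class.
   object ListFileSystems extends App {
     ServiceLoader.load(classOf[FileSystem]).asScala.foreach { fs =>
       val scheme = Try(fs.getScheme).getOrElse("?")
       println(scheme + " -> " + fs.getClass.getName)
     }
   }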

After ensuring the pom referenced hadoop 2.3.0 and rebuilding with:

   mvn -Pyarn -Dhadoop.version=2.3.0 -Dyarn.version=2.3.0 -DskipTests clean package

I repackaged, chucked it into HDFS, and relaunched my app.

Problem solved.

Hopefully, this will save someone else some tedium.
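
And if rebuilding isn't an option right away, this particular error can
reportedly also be sidestepped by pinning the filesystem implementation
classes directly, since merged assembly jars are known to clobber the
META-INF/services entries Hadoop 2.x uses for discovery. A sketch, untested
here:

   import org.apache.hadoop.conf.Configuration
   import org.apache.hadoop.fs.LocalFileSystem
   import org.apache.hadoop.hdfs.DistributedFileSystem

   // Name the classes explicitly so FileSystem.get() never needs to
   // consult the (possibly clobbered) service registry.
   val conf = new Configuration()
   conf.set("fs.hdfs.impl", classOf[DistributedFileSystem].getName)
   conf.set("fs.file.impl", classOf[LocalFileSystem].getName)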

Thanks,

Steve



From: Akhil Das [ak...@sigmoidanalytics.com]
Sent: Friday, July 04, 2014 1:55 AM
To: user@spark.apache.org
Subject: Re: No FileSystem for scheme: hdfs

Most likely you are missing the hadoop configuration files (present in
conf/*.xml).
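
If putting the directory on the classpath is awkward, you can also load them
explicitly - something like this, where the /etc/hadoop/conf paths are just
an example (use wherever your distribution keeps the site files):

   import org.apache.hadoop.conf.Configuration
   import org.apache.hadoop.fs.{FileSystem, Path}

   val conf = new Configuration()
   conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"))
   conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"))

   // Should print an hdfs:// URI; "No FileSystem for scheme: hdfs" here
   // means the configuration/classpath problem is still present.
   val fs = FileSystem.get(conf)
   println(fs.getUri)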

Thanks
Best Regards


On Fri, Jul 4, 2014 at 7:38 AM, Steven Cox 
s...@renci.org wrote:
They weren't. They are now, and the logs look a bit better - like perhaps
some serialization is completing that wasn't before.

But I still get the same error periodically. Other thoughts?


From: Soren Macbeth [so...@yieldbot.com]
Sent: Thursday, July 03, 2014 9:54 PM
To: user@spark.apache.org
Subject: Re: No FileSystem for scheme: hdfs

Are the hadoop configuration files on the classpath for your mesos executors?
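
One quick way to check from the job itself - a sketch, assuming sc is your
SparkContext:

   // Ask a few tasks whether core-site.xml is visible to the executor's
   // classloader; null means the config is not on that executor's classpath.
   sc.parallelize(1 to 4, 4).map { _ =>
     val cl = Thread.currentThread.getContextClassLoader
     java.net.InetAddress.getLocalHost.getHostName + ": " +
       cl.getResource("core-site.xml")
   }.collect().foreach(println)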


On Thu, Jul 3, 2014 at 6:45 PM, Steven Cox 
s...@renci.org wrote:
...and a real subject line.

From: Steven Cox [s...@renci.org]
Sent: Thursday, July 03, 2014 9:21 PM
To: user@spark.apache.org
Subject:


Folks, I have a program derived from the Kafka streaming wordcount example 
which works fine standalone.


Running on Mesos is not working so well. For starters, I get the error below,
"No FileSystem for scheme: hdfs".


I've looked at lots of promising comments on this issue, so now I have:

* Every jar under hadoop in my classpath

* Hadoop HDFS and Client in my pom.xml


I find it odd that the app writes checkpoint files to HDFS successfully for a 
couple of cycles then throws this exception. This would suggest the problem is 
not with the syntax of the hdfs URL, for example.


Any thoughts on what I'm missing?


Thanks,


Steve


Mesos : 0.18.2

Spark : 0.9.1



14/07/03 21:14:20 WARN TaskSetManager: Lost TID 296 (task 1514.0:0)

14/07/03 21:14:20 WARN TaskSetManager: Lost TID 297 (task 1514.0:1)

14/07/03 21:14:20 WARN TaskSetManager: Lost TID 298 (task 1514.0:0)

14/07/03 21:14:20 ERROR TaskSetManager: Task 1514.0:0 failed 10 times; aborting 
job

14/07/03 21:14:20 ERROR JobScheduler: Error running job streaming job 
140443646 ms.0

org.apache.spark.SparkException: Job aborted: Task 1514.0:0 failed 10 times 
(most recent failure: Exception failure: java.io.IOException: No FileSystem for 
scheme: hdfs)

at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1020)

at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1018)

at 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)

at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)

at 
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1018)

at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)

at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)

at scala.Option.foreach(Option.scala:236)

at 
org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:604)

at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:190)

at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)

at akka.actor.ActorCell.invoke(ActorCell.scala:456)

at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)






[no subject]

2014-07-03 Thread Steven Cox
Folks, I have a program derived from the Kafka streaming wordcount example 
which works fine standalone.


Running on Mesos is not working so well. For starters, I get the error below,
"No FileSystem for scheme: hdfs".


I've looked at lots of promising comments on this issue, so now I have:

* Every jar under hadoop in my classpath

* Hadoop HDFS and Client in my pom.xml


I find it odd that the app writes checkpoint files to HDFS successfully for a 
couple of cycles then throws this exception. This would suggest the problem is 
not with the syntax of the hdfs URL, for example.


Any thoughts on what I'm missing?


Thanks,


Steve


Mesos : 0.18.2

Spark : 0.9.1



14/07/03 21:14:20 WARN TaskSetManager: Lost TID 296 (task 1514.0:0)

14/07/03 21:14:20 WARN TaskSetManager: Lost TID 297 (task 1514.0:1)

14/07/03 21:14:20 WARN TaskSetManager: Lost TID 298 (task 1514.0:0)

14/07/03 21:14:20 ERROR TaskSetManager: Task 1514.0:0 failed 10 times; aborting 
job

14/07/03 21:14:20 ERROR JobScheduler: Error running job streaming job 
140443646 ms.0

org.apache.spark.SparkException: Job aborted: Task 1514.0:0 failed 10 times 
(most recent failure: Exception failure: java.io.IOException: No FileSystem for 
scheme: hdfs)

at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1020)

at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1018)

at 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)

at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)

at 
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1018)

at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)

at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)

at scala.Option.foreach(Option.scala:236)

at 
org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:604)

at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:190)

at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)

at akka.actor.ActorCell.invoke(ActorCell.scala:456)

at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)




No FileSystem for scheme: hdfs

2014-07-03 Thread Steven Cox
...and a real subject line.

From: Steven Cox [s...@renci.org]
Sent: Thursday, July 03, 2014 9:21 PM
To: user@spark.apache.org
Subject:


Folks, I have a program derived from the Kafka streaming wordcount example 
which works fine standalone.


Running on Mesos is not working so well. For starters, I get the error below,
"No FileSystem for scheme: hdfs".


I've looked at lots of promising comments on this issue, so now I have:

* Every jar under hadoop in my classpath

* Hadoop HDFS and Client in my pom.xml


I find it odd that the app writes checkpoint files to HDFS successfully for a 
couple of cycles then throws this exception. This would suggest the problem is 
not with the syntax of the hdfs URL, for example.


Any thoughts on what I'm missing?


Thanks,


Steve


Mesos : 0.18.2

Spark : 0.9.1



14/07/03 21:14:20 WARN TaskSetManager: Lost TID 296 (task 1514.0:0)

14/07/03 21:14:20 WARN TaskSetManager: Lost TID 297 (task 1514.0:1)

14/07/03 21:14:20 WARN TaskSetManager: Lost TID 298 (task 1514.0:0)

14/07/03 21:14:20 ERROR TaskSetManager: Task 1514.0:0 failed 10 times; aborting 
job

14/07/03 21:14:20 ERROR JobScheduler: Error running job streaming job 
140443646 ms.0

org.apache.spark.SparkException: Job aborted: Task 1514.0:0 failed 10 times 
(most recent failure: Exception failure: java.io.IOException: No FileSystem for 
scheme: hdfs)

at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1020)

at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1018)

at 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)

at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)

at 
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1018)

at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)

at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)

at scala.Option.foreach(Option.scala:236)

at 
org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:604)

at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:190)

at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)

at akka.actor.ActorCell.invoke(ActorCell.scala:456)

at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)




RE: No FileSystem for scheme: hdfs

2014-07-03 Thread Steven Cox
They weren't. They are now, and the logs look a bit better - like perhaps
some serialization is completing that wasn't before.

But I still get the same error periodically. Other thoughts?


From: Soren Macbeth [so...@yieldbot.com]
Sent: Thursday, July 03, 2014 9:54 PM
To: user@spark.apache.org
Subject: Re: No FileSystem for scheme: hdfs

Are the hadoop configuration files on the classpath for your mesos executors?


On Thu, Jul 3, 2014 at 6:45 PM, Steven Cox 
s...@renci.org wrote:
...and a real subject line.

From: Steven Cox [s...@renci.org]
Sent: Thursday, July 03, 2014 9:21 PM
To: user@spark.apache.org
Subject:


Folks, I have a program derived from the Kafka streaming wordcount example 
which works fine standalone.


Running on Mesos is not working so well. For starters, I get the error below,
"No FileSystem for scheme: hdfs".


I've looked at lots of promising comments on this issue, so now I have:

* Every jar under hadoop in my classpath

* Hadoop HDFS and Client in my pom.xml


I find it odd that the app writes checkpoint files to HDFS successfully for a 
couple of cycles then throws this exception. This would suggest the problem is 
not with the syntax of the hdfs URL, for example.


Any thoughts on what I'm missing?


Thanks,


Steve


Mesos : 0.18.2

Spark : 0.9.1



14/07/03 21:14:20 WARN TaskSetManager: Lost TID 296 (task 1514.0:0)

14/07/03 21:14:20 WARN TaskSetManager: Lost TID 297 (task 1514.0:1)

14/07/03 21:14:20 WARN TaskSetManager: Lost TID 298 (task 1514.0:0)

14/07/03 21:14:20 ERROR TaskSetManager: Task 1514.0:0 failed 10 times; aborting 
job

14/07/03 21:14:20 ERROR JobScheduler: Error running job streaming job 
140443646 ms.0

org.apache.spark.SparkException: Job aborted: Task 1514.0:0 failed 10 times 
(most recent failure: Exception failure: java.io.IOException: No FileSystem for 
scheme: hdfs)

at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1020)

at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1018)

at 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)

at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)

at 
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1018)

at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)

at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)

at scala.Option.foreach(Option.scala:236)

at 
org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:604)

at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:190)

at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)

at akka.actor.ActorCell.invoke(ActorCell.scala:456)

at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)





Spark 0.9.1 core dumps on Mesos 0.18.0

2014-04-17 Thread Steven Cox
So I tried a fix found on the list...

   The issue was due to mesos version mismatch as I am using latest mesos
   0.17.0, but spark uses 0.13.0. Fixed by updating SparkBuild.scala to the
   latest version.

I changed this line in SparkBuild.scala
"org.apache.mesos" % "mesos" % "0.13.0",
to
"org.apache.mesos" % "mesos" % "0.18.0",

...ran make-distribution.sh, repackaged and redeployed the tar.gz to HDFS.

It still core dumps like this:
https://gist.github.com/stevencox/11002498

In this environment:
  Ubuntu 13.10
  Mesos 0.18.0
  Spark 0.9.1
  JDK 1.7.0_45
  Scala 2.10.1

What am I missing?
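
(For anyone trying to reproduce this: one way to isolate whether the crash is
in the Mesos JNI bindings themselves, independent of Spark. A sketch - it
assumes the mesos jar is on the classpath and MESOS_NATIVE_LIBRARY points at
the 0.18.0 libmesos:)

   import org.apache.mesos.MesosNativeLibrary

   object LoadMesos extends App {
     // If the jar's JNI bindings and the installed libmesos.so disagree,
     // the SIGSEGV typically shows up at (or right after) this call.
     println("MESOS_NATIVE_LIBRARY = " +
       sys.env.getOrElse("MESOS_NATIVE_LIBRARY", "(unset)"))
     MesosNativeLibrary.load()
     println("libmesos loaded OK")
   }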


RE: Spark 0.9.1 core dumps on Mesos 0.18.0

2014-04-17 Thread Steven Cox
FYI, I've tried older versions (JDK 6.x) and OpenJDK. Also, here's a fresh
core dump on jdk7u55-b13:


# A fatal error has been detected by the Java Runtime Environment:

#

#  SIGSEGV (0xb) at pc=0x7f7c6b718d39, pid=7708, tid=140171900581632

#

# JRE version: Java(TM) SE Runtime Environment (7.0_55-b13) (build 1.7.0_55-b13)

# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.55-b03 mixed mode linux-amd64 
compressed oops)

# Problematic frame:

# V  [libjvm.so+0x632d39]  jni_GetByteArrayElements+0x89

#

# Failed to write core dump. Core dumps have been disabled. To enable core
dumping, try "ulimit -c unlimited" before starting Java again

#

# An error report file with more information is saved as:

# /home/scox/skylr/skylr-analytics/hs_err_pid7708.log

#

# If you would like to submit a bug report, please visit:

#   http://bugreport.sun.com/bugreport/crash.jsp


Steve



From: andy petrella [andy.petre...@gmail.com]
Sent: Thursday, April 17, 2014 3:21 PM
To: user@spark.apache.org
Subject: Re: Spark 0.9.1 core dumps on Mesos 0.18.0

No, of course not, but I was guessing that some native libs imported in the
project (to communicate with Mesos) could miserably crash the JVM.

Anyway, you're telling us that with this Oracle version you don't have any
issues using Spark on Mesos 0.18.0. That's interesting, because as far as I
recall, my last test (run late at night, so my memory of it is fuzzy) used
this particular version as well.

Just to make things clear, Sean: you're using Spark 0.9.1 on Mesos 0.18.0
with Hadoop 2.x (x >= 2), without any modification other than specifying
which version of Hadoop to build against when you ran make-distribution?

Thanks for your help,

Andy

On Thu, Apr 17, 2014 at 9:11 PM, Sean Owen 
so...@cloudera.com wrote:
I don't know if it's anything you or the project is missing... that's
just a JDK bug.
FWIW I am on 1.7.0_51 and have not seen anything like that.

I don't think it's a protobuf issue -- you don't crash the JVM with
simple version incompatibilities :)
--
Sean Owen | Director, Data Science | London


On Thu, Apr 17, 2014 at 7:29 PM, Steven Cox 
s...@renci.org wrote:
 So I tried a fix found on the list...

 The issue was due to mesos version mismatch as I am using latest mesos
 0.17.0, but spark uses 0.13.0. Fixed by updating SparkBuild.scala to the
 latest version.

 I changed this line in SparkBuild.scala
 "org.apache.mesos" % "mesos" % "0.13.0",
 to
 "org.apache.mesos" % "mesos" % "0.18.0",

 ...ran make-distribution.sh, repackaged and redeployed the tar.gz to HDFS.

 It still core dumps like this:
 https://gist.github.com/stevencox/11002498

 In this environment:
   Ubuntu 13.10
   Mesos 0.18.0
   Spark 0.9.1
   JDK 1.7.0_45
   Scala 2.10.1

 What am I missing?