Re: spark 0.8

Matei Zaharia Thu, 17 Oct 2013 16:57:06 -0700

Koert, did you link your Spark job to the right version of HDFS as well? In 
Spark 0.8, you have to add a Maven dependency on "hadoop-client" for your 
version of Hadoop. See 
http://spark.incubator.apache.org/docs/latest/quick-start.html#a-standalone-app-in-scala
 for example.
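
For illustration, a minimal sbt sketch of such a build, modeled on the quick-start page linked above. The project name, the Scala and Spark version strings, the hadoop-client version string for CDH4.3.0, and the Cloudera resolver URL are all assumptions here and should be checked against the actual cluster before use.

```scala
// build.sbt (sketch only; every name and version below is an assumption)
name := "Simple Project"

version := "1.0"

scalaVersion := "2.9.3"

// Spark 0.8.0 core artifact
libraryDependencies += "org.apache.spark" %% "spark-core" % "0.8.0-incubating"

// hadoop-client must match the Hadoop/HDFS version running on the cluster;
// the CDH4.3.0 version string here is assumed, not taken from this thread.
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.0.0-mr1-cdh4.3.0"

resolvers ++= Seq(
  "Akka Repository" at "http://repo.akka.io/releases/",
  // Assumed location of the CDH artifacts
  "Cloudera Repository" at "https://repository.cloudera.com/artifactory/cloudera-repos/"
)
```

If the hadoop-client version in the job jar differs from the one Spark itself was built against, the Hadoop classes on the driver and on the executors may not agree on their serialized form, which is one plausible way to end up with a deserialization error.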


Matei

On Oct 17, 2013, at 4:38 PM, Koert Kuipers wrote:

> i got the job a little further along by also setting this:
> System.setProperty("spark.closure.serializer", "org.apache.spark.serializer.KryoSerializer")
> 
> not sure why i need to... but anyhow, now my workers start and then they blow 
> up on this:
> 
> 13/10/17 19:22:57 ERROR Executor: Uncaught exception in thread Thread[pool-5-thread-1,5,main]
> java.lang.NullPointerException
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>     at java.lang.Thread.run(Thread.java:662)
> 
> 
> which is:
>  val metrics = attemptedTask.flatMap(t => t.metrics)
> 
> On Thu, Oct 17, 2013 at 7:30 PM, dachuan wrote:
> thanks, Mark.
> 
> 
> On Thu, Oct 17, 2013 at 6:36 PM, Mark Hamstra wrote:
> SNAPSHOTs are not fixed versions, but are floating names associated with 
> whatever is the most recent code.  So, Spark 0.8.0 is the current released 
> version of Spark, which is exactly the same today as it was yesterday, and 
> will be the same thing forever.  Spark 0.8.1-SNAPSHOT is whatever is 
> currently in branch-0.8.  It changes every time new code is committed to that 
> branch (which should be just bug fixes and the few additional features that 
> we wanted to get into 0.8.0, but that didn't quite make it.)  Not too long 
> from now there will be a release of Spark 0.8.1, at which time the SNAPSHOT 
> will go to 0.8.2 and 0.8.1 will be forever frozen.  Meanwhile, the wild new 
> development is taking place on the master branch, and whatever is currently 
> in that branch becomes 0.9.0-SNAPSHOT.  This could be quite different from 
> day to day, and there are no guarantees that things won't be broken in 
> 0.9.0-SNAPSHOT.  Several months from now there will be a release of Spark 
> 0.9.0 (unless the decision is made to bump the version to 1.0.0), at which 
> point the SNAPSHOT goes to 0.9.1 and the whole process advances to the next 
> phase of development.
> 
> The short answer is that releases are stable, SNAPSHOTs are not, and 
> SNAPSHOTs that aren't on maintenance branches can break things.  You make 
> your choice of which to use and pay the consequences. 
> 
> 
> On Thu, Oct 17, 2013 at 3:18 PM, dachuan wrote:
> yeah, I mean 0.9.0-SNAPSHOT. I use git clone and that's what I got.. what's 
> the difference? I mean SNAPSHOT and non-SNAPSHOT.
> 
> 
> On Thu, Oct 17, 2013 at 6:15 PM, Mark Hamstra wrote:
> Of course, you mean 0.9.0-SNAPSHOT.  There is no Spark 0.9.0, and won't be 
> for several months.
> 
> 
> 
> On Thu, Oct 17, 2013 at 3:11 PM, dachuan wrote:
> I'm sorry if this doesn't answer your question directly, but I have tried 
> spark 0.9.0 and hdfs 1.0.4 just now, it works..
> 
> 
> On Thu, Oct 17, 2013 at 6:05 PM, Koert Kuipers wrote:
> after upgrading from spark 0.7 to spark 0.8 i can no longer access any files 
> on HDFS.
> i see the error below. any ideas?
> 
> i am running spark standalone on a cluster that also has CDH4.3.0 and rebuilt 
> spark accordingly. the jars in lib_managed look good to me.
> 
> i noticed similar errors in the mailing list but found no suggested 
> solutions. 
> 
> thanks! koert
> 
> 
> 13/10/17 17:43:23 ERROR Executor: Exception in task ID 0
> java.io.EOFException
>       at java.io.ObjectInputStream$BlockDataInputStream.readFully(ObjectInputStream.java:2703)
>       at java.io.ObjectInputStream.readFully(ObjectInputStream.java:1008)
>       at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:68)
>       at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:106)
>       at org.apache.hadoop.io.UTF8.readChars(UTF8.java:258)
>       at org.apache.hadoop.io.UTF8.readString(UTF8.java:250)
>       at org.apache.hadoop.mapred.FileSplit.readFields(FileSplit.java:87)
>       at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:280)
>       at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:75)
>       at org.apache.spark.SerializableWritable.readObject(SerializableWritable.scala:39)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969)
>       at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1852)
>       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1756)
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1326)
>       at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1950)
>       at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1874)
>       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1756)
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1326)
>       at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348)
>       at org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:135)
>       at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1795)
>       at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1754)
>       at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1326)
>       at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348)
>       at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:39)
>       at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:61)
>       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:153)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>       at java.lang.Thread.run(Thread.java:662)
> 
> 
> 
> -- 
> Dachuan Huang
> Cellphone: 614-390-7234
> 2015 Neil Avenue
> Ohio State University
> Columbus, Ohio
> U.S.A.
> 43210
>
