Koert, did you link your Spark job to the right version of HDFS as well? In Spark 0.8, you have to add a Maven dependency on "hadoop-client" for your version of Hadoop. See http://spark.incubator.apache.org/docs/latest/quick-start.html#a-standalone-app-in-scala for an example.
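For reference, a minimal sketch of what that dependency might look like in an sbt build for a standalone app. The project name, the Scala version, the hadoop-client version string "2.0.0-cdh4.3.0" (a guess at the artifact matching CDH4.3.0) and the Cloudera repository URL are all assumptions here, not something confirmed in this thread; the hadoop-client version has to match whatever Hadoop the cluster actually runs.

```scala
// build.sbt: a sketch only; every name and version below is an assumption

name := "Simple Project"

version := "1.0"

scalaVersion := "2.9.3"

libraryDependencies ++= Seq(
  // Spark core for the 0.8.0 release
  "org.apache.spark" %% "spark-core" % "0.8.0-incubating",
  // Hadoop client matching the cluster's HDFS (assumed CDH 4.3.0 artifact)
  "org.apache.hadoop" % "hadoop-client" % "2.0.0-cdh4.3.0"
)

resolvers ++= Seq(
  "Akka Repository" at "http://repo.akka.io/releases/",
  // Assumed location of the CDH artifacts
  "Cloudera Repository" at "https://repository.cloudera.com/artifactory/cloudera-repos/"
)
```

If the job is built with Maven instead, the same coordinates (groupId org.apache.hadoop, artifactId hadoop-client) would go into the pom as a regular dependency.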
Matei

On Oct 17, 2013, at 4:38 PM, Koert Kuipers <[email protected]> wrote:

> i got the job a little further along by also setting this:
> System.setProperty("spark.closure.serializer", "org.apache.spark.serializer.KryoSerializer")
>
> not sure why i need to... but anyhow, now my workers start and then they
> blow up on this:
>
> 13/10/17 19:22:57 ERROR Executor: Uncaught exception in thread Thread[pool-5-thread-1,5,main]
> java.lang.NullPointerException
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>     at java.lang.Thread.run(Thread.java:662)
>
> which is:
> val metrics = attemptedTask.flatMap(t => t.metrics)
>
>
> On Thu, Oct 17, 2013 at 7:30 PM, dachuan <[email protected]> wrote:
> thanks, Mark.
>
>
> On Thu, Oct 17, 2013 at 6:36 PM, Mark Hamstra <[email protected]> wrote:
> SNAPSHOTs are not fixed versions, but are floating names associated with
> whatever is the most recent code. So, Spark 0.8.0 is the current released
> version of Spark, which is exactly the same today as it was yesterday, and
> will be the same thing forever. Spark 0.8.1-SNAPSHOT is whatever is
> currently in branch-0.8. It changes every time new code is committed to
> that branch (which should be just bug fixes and the few additional features
> that we wanted to get into 0.8.0, but that didn't quite make it.) Not too
> long from now there will be a release of Spark 0.8.1, at which time the
> SNAPSHOT will go to 0.8.2 and 0.8.1 will be forever frozen.
>
> Meanwhile, the wild new development is taking place on the master branch,
> and whatever is currently in that branch becomes 0.9.0-SNAPSHOT. This could
> be quite different from day to day, and there are no guarantees that things
> won't be broken in 0.9.0-SNAPSHOT. Several months from now there will be a
> release of Spark 0.9.0 (unless the decision is made to bump the version to
> 1.0.0), at which point the SNAPSHOT goes to 0.9.1 and the whole process
> advances to the next phase of development.
>
> The short answer is that releases are stable, SNAPSHOTs are not, and
> SNAPSHOTs that aren't on maintenance branches can break things. You make
> your choice of which to use and pay the consequences.
>
>
> On Thu, Oct 17, 2013 at 3:18 PM, dachuan <[email protected]> wrote:
> yeah, I mean 0.9.0-SNAPSHOT. I use git clone and that's what I got.. what's
> the difference? I mean SNAPSHOT and non-SNAPSHOT.
>
>
> On Thu, Oct 17, 2013 at 6:15 PM, Mark Hamstra <[email protected]> wrote:
> Of course, you mean 0.9.0-SNAPSHOT. There is no Spark 0.9.0, and won't be
> for several months.
>
>
> On Thu, Oct 17, 2013 at 3:11 PM, dachuan <[email protected]> wrote:
> I'm sorry if this doesn't answer your question directly, but I have tried
> spark 0.9.0 and hdfs 1.0.4 just now, it works..
>
>
> On Thu, Oct 17, 2013 at 6:05 PM, Koert Kuipers <[email protected]> wrote:
> after upgrading from spark 0.7 to spark 0.8 i can no longer access any
> files on HDFS.
> i see the error below. any ideas?
>
> i am running spark standalone on a cluster that also has CDH4.3.0 and
> rebuilt spark accordingly. the jars in lib_managed look good to me.
>
> i noticed similar errors in the mailing list but found no suggested
> solutions.
>
> thanks! koert
>
>
> 13/10/17 17:43:23 ERROR Executor: Exception in task ID 0
> java.io.EOFException
>     at java.io.ObjectInputStream$BlockDataInputStream.readFully(ObjectInputStream.java:2703)
>     at java.io.ObjectInputStream.readFully(ObjectInputStream.java:1008)
>     at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:68)
>     at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:106)
>     at org.apache.hadoop.io.UTF8.readChars(UTF8.java:258)
>     at org.apache.hadoop.io.UTF8.readString(UTF8.java:250)
>     at org.apache.hadoop.mapred.FileSplit.readFields(FileSplit.java:87)
>     at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:280)
>     at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:75)
>     at org.apache.spark.SerializableWritable.readObject(SerializableWritable.scala:39)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1852)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1756)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1326)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1950)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1874)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1756)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1326)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348)
>     at org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:135)
>     at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1795)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1754)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1326)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348)
>     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:39)
>     at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:61)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:153)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>     at java.lang.Thread.run(Thread.java:662)
>
>
> --
> Dachuan Huang
> Cellphone: 614-390-7234
> 2015 Neil Avenue
> Ohio State University
> Columbus, Ohio
> U.S.A.
> 43210
