yes i did that and i can see the correct jars sitting in lib_managed
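
For context, a minimal sketch of the kind of sbt setup Matei's suggestion below refers to; the CDH artifact version and repository URL here are assumptions (CDH4.3.0 is mentioned further down the thread), not details taken from it:

    // build.sbt -- a rough sketch, not taken from the thread.
    // Spark 0.8.0 is built for Scala 2.9.3; the hadoop-client version below is an
    // assumption matching a CDH 4.3.0 (MRv1) cluster -- adjust to your own HDFS version.
    name := "my-spark-job"

    scalaVersion := "2.9.3"

    resolvers += "Cloudera Repository" at "https://repository.cloudera.com/artifactory/cloudera-repos/"

    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "0.8.0-incubating",
      // must match the HDFS version actually running on the cluster
      "org.apache.hadoop" % "hadoop-client" % "2.0.0-mr1-cdh4.3.0"
    )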
On Thu, Oct 17, 2013 at 7:56 PM, Matei Zaharia <[email protected]> wrote:

> Koert, did you link your Spark job to the right version of HDFS as well?
> In Spark 0.8, you have to add a Maven dependency on "hadoop-client" for
> your version of Hadoop. See
> http://spark.incubator.apache.org/docs/latest/quick-start.html#a-standalone-app-in-scala
> for example.
>
> Matei
>
> On Oct 17, 2013, at 4:38 PM, Koert Kuipers <[email protected]> wrote:
>
> i got the job a little further along by also setting this:
>
>     System.setProperty("spark.closure.serializer",
>       "org.apache.spark.serializer.KryoSerializer")
>
> not sure why i need to... but anyhow, now my workers start and then they
> blow up on this:
>
>     13/10/17 19:22:57 ERROR Executor: Uncaught exception in thread Thread[pool-5-thread-1,5,main]
>     java.lang.NullPointerException
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>         at java.lang.Thread.run(Thread.java:662)
>
> which is:
>
>     val metrics = attemptedTask.flatMap(t => t.metrics)
>
> On Thu, Oct 17, 2013 at 7:30 PM, dachuan <[email protected]> wrote:
>
>> thanks, Mark.
>>
>> On Thu, Oct 17, 2013 at 6:36 PM, Mark Hamstra <[email protected]> wrote:
>>
>>> SNAPSHOTs are not fixed versions, but are floating names associated with
>>> whatever is the most recent code. So, Spark 0.8.0 is the current released
>>> version of Spark, which is exactly the same today as it was yesterday, and
>>> will be the same thing forever. Spark 0.8.1-SNAPSHOT is whatever is
>>> currently in branch-0.8. It changes every time new code is committed to
>>> that branch (which should be just bug fixes and the few additional features
>>> that we wanted to get into 0.8.0, but that didn't quite make it.) Not too
>>> long from now there will be a release of Spark 0.8.1, at which time the
>>> SNAPSHOT will go to 0.8.2 and 0.8.1 will be forever frozen. Meanwhile,
>>> the wild new development is taking place on the master branch, and whatever
>>> is currently in that branch becomes 0.9.0-SNAPSHOT. This could be quite
>>> different from day to day, and there are no guarantees that things won't be
>>> broken in 0.9.0-SNAPSHOT. Several months from now there will be a release
>>> of Spark 0.9.0 (unless the decision is made to bump the version to 1.0.0),
>>> at which point the SNAPSHOT goes to 0.9.1 and the whole process advances to
>>> the next phase of development.
>>>
>>> The short answer is that releases are stable, SNAPSHOTs are not, and
>>> SNAPSHOTs that aren't on maintenance branches can break things. You make
>>> your choice of which to use and pay the consequences.
>>>
>>> On Thu, Oct 17, 2013 at 3:18 PM, dachuan <[email protected]> wrote:
>>>
>>>> yeah, I mean 0.9.0-SNAPSHOT. I use git clone and that's what I got..
>>>> what's the difference? I mean SNAPSHOT and non-SNAPSHOT.
>>>>
>>>> On Thu, Oct 17, 2013 at 6:15 PM, Mark Hamstra <[email protected]> wrote:
>>>>
>>>>> Of course, you mean 0.9.0-SNAPSHOT. There is no Spark 0.9.0, and
>>>>> won't be for several months.
>>>>>
>>>>> On Thu, Oct 17, 2013 at 3:11 PM, dachuan <[email protected]> wrote:
>>>>>
>>>>>> I'm sorry if this doesn't answer your question directly, but I have
>>>>>> tried spark 0.9.0 and hdfs 1.0.4 just now, it works..
>>>>>>
>>>>>> On Thu, Oct 17, 2013 at 6:05 PM, Koert Kuipers <[email protected]> wrote:
>>>>>>
>>>>>>> after upgrading from spark 0.7 to spark 0.8 i can no longer access
>>>>>>> any files on HDFS. i see the error below. any ideas?
>>>>>>>
>>>>>>> i am running spark standalone on a cluster that also has CDH4.3.0
>>>>>>> and rebuild spark accordingly. the jars in lib_managed look good to me.
>>>>>>>
>>>>>>> i noticed similar errors in the mailing list but found no suggested
>>>>>>> solutions.
>>>>>>>
>>>>>>> thanks! koert
>>>>>>>
>>>>>>> 13/10/17 17:43:23 ERROR Executor: Exception in task ID 0
>>>>>>> java.io.EOFException
>>>>>>>     at java.io.ObjectInputStream$BlockDataInputStream.readFully(ObjectInputStream.java:2703)
>>>>>>>     at java.io.ObjectInputStream.readFully(ObjectInputStream.java:1008)
>>>>>>>     at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:68)
>>>>>>>     at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:106)
>>>>>>>     at org.apache.hadoop.io.UTF8.readChars(UTF8.java:258)
>>>>>>>     at org.apache.hadoop.io.UTF8.readString(UTF8.java:250)
>>>>>>>     at org.apache.hadoop.mapred.FileSplit.readFields(FileSplit.java:87)
>>>>>>>     at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:280)
>>>>>>>     at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:75)
>>>>>>>     at org.apache.spark.SerializableWritable.readObject(SerializableWritable.scala:39)
>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>>     at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969)
>>>>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1852)
>>>>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1756)
>>>>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1326)
>>>>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1950)
>>>>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1874)
>>>>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1756)
>>>>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1326)
>>>>>>>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348)
>>>>>>>     at org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:135)
>>>>>>>     at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1795)
>>>>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1754)
>>>>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1326)
>>>>>>>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348)
>>>>>>>     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:39)
>>>>>>>     at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:61)
>>>>>>>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:153)
>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>>>>>>>     at java.lang.Thread.run(Thread.java:662)
>>>>>>
>>>>>> --
>>>>>> Dachuan Huang
>>>>>> Cellphone: 614-390-7234
>>>>>> 2015 Neil Avenue
>>>>>> Ohio State University
>>>>>> Columbus, Ohio
>>>>>> U.S.A.
>>>>>> 43210
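
For reference, a minimal sketch of where the spark.closure.serializer workaround mentioned above would sit in a Spark 0.8 driver program; the master URL, app name, and HDFS path are placeholders, not values from the thread:

    import org.apache.spark.SparkContext

    object MyJob {
      def main(args: Array[String]) {
        // In Spark 0.8, configuration is passed through system properties that
        // must be set before the SparkContext is created.
        System.setProperty("spark.closure.serializer",
          "org.apache.spark.serializer.KryoSerializer")

        // placeholder master URL and application name
        val sc = new SparkContext("spark://master:7077", "MyJob")

        // placeholder HDFS path -- reading it exercises the hadoop-client version
        // the job was linked against
        val lines = sc.textFile("hdfs://namenode:8020/path/to/input")
        println(lines.count())

        sc.stop()
      }
    }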
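
And a small sketch of how Mark's release-vs-SNAPSHOT distinction shows up in an sbt build; the snapshot repository line is an assumption for illustration and only applies if SNAPSHOT artifacts are actually being published from branch-0.8:

    // A fixed release: 0.8.0-incubating always resolves to the same frozen artifact.
    libraryDependencies += "org.apache.spark" %% "spark-core" % "0.8.0-incubating"

    // A SNAPSHOT floats: it is re-resolved to whatever was last published from the
    // branch, so builds can change (or break) from day to day.
    // resolvers += "Apache Snapshots" at "https://repository.apache.org/snapshots/"
    // libraryDependencies += "org.apache.spark" %% "spark-core" % "0.8.1-incubating-SNAPSHOT"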
