i got the job a little further along by also setting this:
System.setProperty("spark.closure.serializer",
"org.apache.spark.serializer.KryoSerializer")
not sure why i need to... but anyhow, now my workers start and then they
blow up on this:
13/10/17 19:22:57 ERROR Executor: Uncaught exception in thread Thread[pool-5-thread-1,5,main]
java.lang.NullPointerException
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:662)
which is this line (Executor.scala:187):
val metrics = attemptedTask.flatMap(t => t.metrics)
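One way a line of this shape can throw a NullPointerException is when the Option holds a null, i.e. Some(null) rather than None. Whether that is what happens on the executor here is only a guess. The sketch below reproduces the pattern outside Spark; Task, TaskMetrics, metricsOf and safeMetricsOf are hypothetical stand-ins written for this illustration, not Spark's real classes or API.

```scala
// Hypothetical stand-ins for illustration only; these are NOT Spark's classes.
final case class TaskMetrics(hostname: String)
final class Task(val metrics: Option[TaskMetrics])

object AttemptedTaskSketch {
  // Same shape as the failing line: dereferences t without a null check.
  def metricsOf(attemptedTask: Option[Task]): Option[TaskMetrics] =
    attemptedTask.flatMap(t => t.metrics)

  // Null-safe variant: Option(null) evaluates to None, so a null task is dropped
  // before its metrics field is touched.
  def safeMetricsOf(attemptedTask: Option[Task]): Option[TaskMetrics] =
    attemptedTask.flatMap(t => Option(t)).flatMap(_.metrics)

  def main(args: Array[String]): Unit = {
    // Assumption: deserialization handed back null and it was wrapped in Some.
    val broken: Option[Task] = Some(null)

    val threw =
      try { metricsOf(broken); false }
      catch { case _: NullPointerException => true }

    println(s"plain flatMap threw NPE: $threw")
    println(s"null-safe variant returned: ${safeMetricsOf(broken)}")
  }
}
```

If this guess is right, the NPE is a symptom of the task not being deserialized properly rather than a problem with the metrics themselves.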
On Thu, Oct 17, 2013 at 7:30 PM, dachuan <[email protected]> wrote:
> thanks, Mark.
>
>
> On Thu, Oct 17, 2013 at 6:36 PM, Mark Hamstra <[email protected]> wrote:
>
>> SNAPSHOTs are not fixed versions, but are floating names associated with
>> whatever is the most recent code. So, Spark 0.8.0 is the current released
>> version of Spark, which is exactly the same today as it was yesterday, and
>> will be the same thing forever. Spark 0.8.1-SNAPSHOT is whatever is
>> currently in branch-0.8. It changes every time new code is committed to
>> that branch (which should be just bug fixes and the few additional features
>> that we wanted to get into 0.8.0 but that didn't quite make it). Not too
>> long from now there will be a release of Spark 0.8.1, at which time the
>> SNAPSHOT will go to 0.8.2 and 0.8.1 will be forever frozen. Meanwhile,
>> the wild new development is taking place on the master branch, and whatever
>> is currently in that branch becomes 0.9.0-SNAPSHOT. This could be quite
>> different from day to day, and there are no guarantees that things won't be
>> broken in 0.9.0-SNAPSHOT. Several months from now there will be a release
>> of Spark 0.9.0 (unless the decision is made to bump the version to 1.0.0),
>> at which point the SNAPSHOT goes to 0.9.1 and the whole process advances to
>> the next phase of development.
>>
>> The short answer is that releases are stable, SNAPSHOTs are not, and
>> SNAPSHOTs that aren't on maintenance branches can break things. You make
>> your choice of which to use and pay the consequences.
>>
>>
>> On Thu, Oct 17, 2013 at 3:18 PM, dachuan <[email protected]> wrote:
>>
>>> yeah, I mean 0.9.0-SNAPSHOT. I use git clone and that's what I got..
>>> what's the difference? I mean SNAPSHOT and non-SNAPSHOT.
>>>
>>>
>>> On Thu, Oct 17, 2013 at 6:15 PM, Mark Hamstra
>>> <[email protected]> wrote:
>>>
>>>> Of course, you mean 0.9.0-SNAPSHOT. There is no Spark 0.9.0, and won't
>>>> be for several months.
>>>>
>>>>
>>>>
>>>> On Thu, Oct 17, 2013 at 3:11 PM, dachuan <[email protected]> wrote:
>>>>
>>>>> I'm sorry if this doesn't answer your question directly, but I have
>>>>> tried spark 0.9.0 and hdfs 1.0.4 just now, it works..
>>>>>
>>>>>
>>>>> On Thu, Oct 17, 2013 at 6:05 PM, Koert Kuipers <[email protected]> wrote:
>>>>>
>>>>>> after upgrading from spark 0.7 to spark 0.8 i can no longer access
>>>>>> any files on HDFS.
>>>>>> i see the error below. any ideas?
>>>>>>
>>>>>> i am running spark standalone on a cluster that also has CDH4.3.0 and
>>>>>> rebuilt spark accordingly. the jars in lib_managed look good to me.
>>>>>>
>>>>>> i noticed similar errors in the mailing list but found no suggested
>>>>>> solutions.
>>>>>>
>>>>>> thanks! koert
>>>>>>
>>>>>>
>>>>>> 13/10/17 17:43:23 ERROR Executor: Exception in task ID 0
>>>>>> java.io.EOFException
>>>>>>     at java.io.ObjectInputStream$BlockDataInputStream.readFully(ObjectInputStream.java:2703)
>>>>>>     at java.io.ObjectInputStream.readFully(ObjectInputStream.java:1008)
>>>>>>     at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:68)
>>>>>>     at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:106)
>>>>>>     at org.apache.hadoop.io.UTF8.readChars(UTF8.java:258)
>>>>>>     at org.apache.hadoop.io.UTF8.readString(UTF8.java:250)
>>>>>>     at org.apache.hadoop.mapred.FileSplit.readFields(FileSplit.java:87)
>>>>>>     at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:280)
>>>>>>     at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:75)
>>>>>>     at org.apache.spark.SerializableWritable.readObject(SerializableWritable.scala:39)
>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>     at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969)
>>>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1852)
>>>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1756)
>>>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1326)
>>>>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1950)
>>>>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1874)
>>>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1756)
>>>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1326)
>>>>>>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348)
>>>>>>     at org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:135)
>>>>>>     at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1795)
>>>>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1754)
>>>>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1326)
>>>>>>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:348)
>>>>>>     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:39)
>>>>>>     at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:61)
>>>>>>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:153)
>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>>>>>>     at java.lang.Thread.run(Thread.java:662)
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Dachuan Huang
>>>>> Cellphone: 614-390-7234
>>>>> 2015 Neil Avenue
>>>>> Ohio State University
>>>>> Columbus, Ohio
>>>>> U.S.A.
>>>>> 43210
>>>>>