Great, thanks a lot!!!
2013/12/13 Yadid Ayzenberg <[email protected]>

> In order for Spark to ship your objects to the slaves, they must be
> serializable. Also make sure to read the Data Serialization section in the
> tuning guide:
> http://spark.incubator.apache.org/docs/latest/tuning.html
> If performance is an issue, you may want to switch to Kryo serialization
> instead of Java serialization.
>
> Yadid
>
> On 12/13/13 9:22 AM, Jie Deng wrote:
>
> Thanks Yadid, that really works!
>
> So does that mean a static method works only because Spark has not
> distributed the task yet, and that the right way to use Spark is to make
> every class implement Serializable?
>
> Thanks a lot!
>
> 2013/12/13 Yadid Ayzenberg <[email protected]>
>
>> Hi Jie,
>>
>> It seems that SparkPrefix is not serializable. You can try adding
>> *implements Serializable* and see if that solves the problem.
>>
>> Yadid
>>
>> On 12/13/13 5:10 AM, Jie Deng wrote:
>>
>> Hi all,
>>
>> Thanks for taking the time to read this.
>>
>> When I first tried to write a new Java class and use Spark inside it, I
>> kept getting an exception:
>>
>> Exception in thread "main" org.apache.spark.SparkException: Job
>> aborted: Task not serializable: java.io.NotSerializableException:
>> org.dcu.test.SparkPrefix
>>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1011)
>>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1009)
>>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:60)
>>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1009)
>>     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:777)
>>     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:720)
>>     at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:554)
>>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$1$$anonfun$receive$1.apply(DAGScheduler.scala:201)
>>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$1$$anonfun$receive$1.apply(DAGScheduler.scala:192)
>>     at akka.actor.Actor$class.apply(Actor.scala:318)
>>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$1.apply(DAGScheduler.scala:173)
>>     at akka.actor.ActorCell.invoke(ActorCell.scala:626)
>>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:197)
>>     at akka.dispatch.Mailbox.run(Mailbox.scala:179)
>>     at akka.dispatch.ForkJoinExecutorConfigurator$MailboxExecutionTask.exec(AbstractDispatcher.scala:516)
>>     at akka.jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:259)
>>     at akka.jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:975)
>>     at akka.jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1479)
>>     at akka.jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
>>
>> Then I noticed that all of the Spark examples use only the main function
>> or other static functions, so is there a pattern behind that, and why?
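
For anyone who finds this thread later, here is a minimal sketch of the fix
Yadid suggests. The class name mirrors the org.dcu.test.SparkPrefix from the
trace, but the prefix field and the RDD pipeline are invented for
illustration; the key point is that the anonymous Function captures the
enclosing instance, so the enclosing class itself must implement
Serializable:

    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.api.java.function.Function;

    import java.io.Serializable;
    import java.util.Arrays;

    // Implementing Serializable lets Spark ship instances of this class
    // (captured by the anonymous Function below) to the slaves.
    public class SparkPrefix implements Serializable {

        private final String prefix = "item-";  // hypothetical field

        public void run() {
            JavaSparkContext sc = new JavaSparkContext("local", "SparkPrefix");
            JavaRDD<Integer> numbers = sc.parallelize(Arrays.asList(1, 2, 3));

            // Reading the prefix field makes the anonymous Function hold a
            // reference to the enclosing SparkPrefix instance, so SparkPrefix
            // itself gets serialized along with the task.
            JavaRDD<String> prefixed = numbers.map(new Function<Integer, String>() {
                public String call(Integer n) {
                    return prefix + n;
                }
            });

            System.out.println(prefixed.collect());
            sc.stop();
        }

        public static void main(String[] args) {
            new SparkPrefix().run();
        }
    }

This is also why the bundled examples get away with static methods: code in a
static main does not capture a `this` reference, so there is no outer
instance for Spark to serialize in the first place.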
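
And if serialization performance becomes an issue, a sketch of the Kryo
switch described in the tuning guide linked above. The property-based
configuration matches the Spark 0.8-era docs, and MyRegistrator is a
hypothetical class name for this example:

    import com.esotericsoftware.kryo.Kryo;
    import org.apache.spark.serializer.KryoRegistrator;

    // Hypothetical registrator: register the classes Spark will ship so
    // Kryo does not have to write full class names into every record.
    public class MyRegistrator implements KryoRegistrator {
        public void registerClasses(Kryo kryo) {
            kryo.register(SparkPrefix.class);
        }
    }

Then, before creating the JavaSparkContext:

    System.setProperty("spark.serializer",
            "org.apache.spark.serializer.KryoSerializer");
    System.setProperty("spark.kryo.registrator", "org.dcu.test.MyRegistrator");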
