In order for Spark to ship your objects to the slaves, they must be serializable. Also make sure to read the Data Serialization section of the tuning guide (http://spark.incubator.apache.org/docs/latest/tuning.html). If performance is an issue, you may want to switch from Java serialization to Kryo serialization.
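As a rough sketch of what that switch might look like on the 0.8-era releases the tuning guide covers (the MyRegistrator class is a hypothetical example; registering SparkPrefix just ties it back to this thread):

    package org.dcu.test;

    import com.esotericsoftware.kryo.Kryo;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.serializer.KryoRegistrator;

    // Hypothetical registrator: tell Kryo up front about the classes
    // you ship to the workers, so it can encode them compactly.
    public class MyRegistrator implements KryoRegistrator {
        public void registerClasses(Kryo kryo) {
            kryo.register(SparkPrefix.class); // the user's class from this thread
        }

        public static void main(String[] args) {
            // Both properties must be set before the SparkContext is created.
            System.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
            System.setProperty("spark.kryo.registrator", "org.dcu.test.MyRegistrator");
            JavaSparkContext sc = new JavaSparkContext("local", "KryoExample");
            // ... build and run your RDDs as usual; shipped data now uses Kryo
        }
    }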

Yadid


On 12/13/13 9:22 AM, Jie Deng wrote:
Thanks Yadid, that really works!

So does that mean static methods worked only because Spark had not distributed the task yet, and that the right way to use Spark is to make every class implement Serializable?

Thanks a lot!




2013/12/13 Yadid Ayzenberg <[email protected]>

    Hi Jie,

    It seems that SparkPrefix is not serializable. You can try adding
    "implements Serializable" and see if that solves the problem.
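    A minimal sketch of that change (the serialVersionUID and the comment
    about transient fields are just illustrative additions):

        import java.io.Serializable;

        public class SparkPrefix implements Serializable {
            // Good practice for Serializable classes: pin the version explicitly.
            private static final long serialVersionUID = 1L;

            // Fields and methods as before. Any field that cannot be
            // serialized should be marked transient (or moved out of the
            // class) so Spark can ship instances to the workers.
        }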

    Yadid


    On 12/13/13 5:10 AM, Jie Deng wrote:
    Hi all,

    Thanks for taking the time to read this.

    When I first tried to write a new Java class and use Spark in it, I
    always got an exception:

    Exception in thread "main" org.apache.spark.SparkException: Job aborted: Task not serializable: java.io.NotSerializableException: org.dcu.test.SparkPrefix
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1011)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1009)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:60)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1009)
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:777)
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:720)
        at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:554)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$1$$anonfun$receive$1.apply(DAGScheduler.scala:201)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$1$$anonfun$receive$1.apply(DAGScheduler.scala:192)
        at akka.actor.Actor$class.apply(Actor.scala:318)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$1.apply(DAGScheduler.scala:173)
        at akka.actor.ActorCell.invoke(ActorCell.scala:626)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:197)
        at akka.dispatch.Mailbox.run(Mailbox.scala:179)
        at akka.dispatch.ForkJoinExecutorConfigurator$MailboxExecutionTask.exec(AbstractDispatcher.scala:516)
        at akka.jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:259)
        at akka.jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:975)
        at akka.jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1479)
        at akka.jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
    Then I found that all the Spark examples use only the main function or
    other static functions. Is there a pattern to this, and why?
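
    (As an illustration of the likely reason, assuming the 0.8-era Java API:
    a function defined inside a static method captures no enclosing instance,
    so only the function object itself, which Spark's Java API already makes
    Serializable, has to be shipped. The same function written inside an
    instance method of SparkPrefix captures "this", which is why the whole
    class must then be serializable.)

        import java.util.Arrays;
        import org.apache.spark.api.java.JavaRDD;
        import org.apache.spark.api.java.JavaSparkContext;
        import org.apache.spark.api.java.function.Function;

        public class StaticExample {
            public static void main(String[] args) {
                JavaSparkContext sc = new JavaSparkContext("local", "StaticExample");
                JavaRDD<String> words = sc.parallelize(Arrays.asList("spark", "kryo"));
                // Declared in a static method: the anonymous Function captures
                // no outer instance, so serializing it drags in no other class.
                JavaRDD<Integer> lengths = words.map(new Function<String, Integer>() {
                    public Integer call(String s) { return s.length(); }
                });
                System.out.println(lengths.collect()); // [5, 4]
            }
        }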


