In order for Spark to ship your objects to the slaves, they must be serializable. Also make sure to read the Data Serialization section of the tuning guide (http://spark.incubator.apache.org/docs/latest/tuning.html). If performance is an issue, you may want to switch from Java serialization to Kryo serialization.
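As a rough sketch of what that switch might look like on the 0.8-era releases the tuning guide covers (the MyRegistrator class is a hypothetical example; registering SparkPrefix just ties it back to this thread):

    package org.dcu.test;

    import com.esotericsoftware.kryo.Kryo;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.serializer.KryoRegistrator;

    // Hypothetical registrator: tell Kryo up front about the classes
    // you ship to the workers, so it can encode them compactly.
    public class MyRegistrator implements KryoRegistrator {
        public void registerClasses(Kryo kryo) {
            kryo.register(SparkPrefix.class); // the user's class from this thread
        }

        public static void main(String[] args) {
            // Both properties must be set before the SparkContext is created.
            System.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
            System.setProperty("spark.kryo.registrator", "org.dcu.test.MyRegistrator");
            JavaSparkContext sc = new JavaSparkContext("local", "KryoExample");
            // ... build and run your RDDs as usual; shipped data now uses Kryo
        }
    }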

Yadid


On 12/13/13 9:22 AM, Jie Deng wrote:
Thanks Yadid, that really works!

So does that mean static methods worked only because Spark had not distributed the task yet, and that the right way to use Spark is to make every class implement Serializable?

Thanks a lot!




2013/12/13 Yadid Ayzenberg <[email protected]>

    Hi Jie,

    It seems that SparkPrefix is not serializable. You can try adding
    "implements Serializable" and see if that solves the problem.
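    A minimal sketch of that change (the serialVersionUID and the comment
    about transient fields are just illustrative additions):

        import java.io.Serializable;

        public class SparkPrefix implements Serializable {
            // Good practice for Serializable classes: pin the version explicitly.
            private static final long serialVersionUID = 1L;

            // Fields and methods as before. Any field that cannot be
            // serialized should be marked transient (or moved out of the
            // class) so Spark can ship instances to the workers.
        }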

    Yadid


    On 12/13/13 5:10 AM, Jie Deng wrote:
    Hi all,

    Thanks for taking the time to read this.

    When I first tried to write a new Java class and use Spark in it, I
    always got an exception:

    Exception in thread "main" org.apache.spark.SparkException: Job aborted: Task not serializable: java.io.NotSerializableException: org.dcu.test.SparkPrefix
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1011)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1009)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:60)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1009)
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:777)
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:720)
        at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:554)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$1$$anonfun$receive$1.apply(DAGScheduler.scala:201)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$1$$anonfun$receive$1.apply(DAGScheduler.scala:192)
        at akka.actor.Actor$class.apply(Actor.scala:318)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$1.apply(DAGScheduler.scala:173)
        at akka.actor.ActorCell.invoke(ActorCell.scala:626)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:197)
        at akka.dispatch.Mailbox.run(Mailbox.scala:179)
        at akka.dispatch.ForkJoinExecutorConfigurator$MailboxExecutionTask.exec(AbstractDispatcher.scala:516)
        at akka.jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:259)
        at akka.jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:975)
        at akka.jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1479)
        at akka.jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
    Then I found that all the Spark examples use only the main function or
    other static functions. Is there a pattern to this, and why?
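
    (As an illustration of the likely reason, assuming the 0.8-era Java API:
    a function defined inside a static method captures no enclosing instance,
    so only the function object itself, which Spark's Java API already makes
    Serializable, has to be shipped. The same function written inside an
    instance method of SparkPrefix captures "this", which is why the whole
    class must then be serializable.)

        import java.util.Arrays;
        import org.apache.spark.api.java.JavaRDD;
        import org.apache.spark.api.java.JavaSparkContext;
        import org.apache.spark.api.java.function.Function;

        public class StaticExample {
            public static void main(String[] args) {
                JavaSparkContext sc = new JavaSparkContext("local", "StaticExample");
                JavaRDD<String> words = sc.parallelize(Arrays.asList("spark", "kryo"));
                // Declared in a static method: the anonymous Function captures
                // no outer instance, so serializing it drags in no other class.
                JavaRDD<Integer> lengths = words.map(new Function<String, Integer>() {
                    public Integer call(String s) { return s.length(); }
                });
                System.out.println(lengths.collect()); // [5, 4]
            }
        }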


