[jira] [Commented] (SPARK-11751) Doc describe error in the "Spark Streaming Programming Guide" page

yangping wu (JIRA) Mon, 16 Nov 2015 01:30:57 -0800

    [ https://issues.apache.org/jira/browse/SPARK-11751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15006390#comment-15006390 ]


yangping wu commented on SPARK-11751:
-------------------------------------

Hi [~srowen], thank you for your reply.

I would like to change it to:
{quote}*Task Serialization*: Using Java serialization for serializing tasks can 
reduce the task sizes, and therefore reduce the time taken to send them to the 
slaves. Currently only the Java serializer is supported; see 
*spark.closure.serializer*.{quote}
Do you have any suggestions?
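
For context, the distinction behind this wording can be sketched as a configuration fragment. This is only an illustration, assuming a spark-defaults.conf file on one of the affected 1.x versions. Enabling Kryo for *spark.serializer* is an assumption made for the example and is not part of the proposed doc text.
{code}
# Data serialization (records that are shuffled or cached in serialized form):
# Kryo can be selected here. Using it is an assumption for this example.
spark.serializer          org.apache.spark.serializer.KryoSerializer

# Closure/task serialization: keep the default Java serializer.
# The issue quoted below reports an EOFException when this is set to KryoSerializer.
spark.closure.serializer  org.apache.spark.serializer.JavaSerializer
{code}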

> Doc describe error in the "Spark Streaming Programming Guide" page
> ------------------------------------------------------------------
>
>                 Key: SPARK-11751
>                 URL: https://issues.apache.org/jira/browse/SPARK-11751
>             Project: Spark
>          Issue Type: Documentation
>          Components: Documentation
>    Affects Versions: 1.4.1, 1.5.0, 1.5.1, 1.5.2
>            Reporter: yangping wu
>            Priority: Trivial
>
> In the *Task Launching Overheads* section,
> {quote}*Task Serialization*: Using Kryo serialization for serializing tasks 
> can reduce the task sizes, and therefore reduce the time taken to send them 
> to the slaves.{quote}
> As we know, *Task Serialization* is configured by the 
> *spark.closure.serializer* parameter, but currently only the Java serializer 
> is supported. If we set *spark.closure.serializer* to 
> *org.apache.spark.serializer.KryoSerializer*, an exception is thrown, as 
> follows:
> {code}
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 516 in stage 0.0 failed 4 times, most recent failure: Lost task 516.3 in stage 0.0 (TID 21, spark-cluster.data.com): java.io.EOFException
>       at java.io.DataInputStream.readInt(DataInputStream.java:392)
>       at org.apache.spark.scheduler.Task$.deserializeWithDependencies(Task.scala:188)
>       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:192)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:744)
> Driver stacktrace:
>       at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1273)
>       at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1264)
>       at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1263)
>       at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>       at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>       at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1263)
>       at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
>       at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:730)
>       at scala.Option.foreach(Option.scala:236)
>       at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:730)
>       at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1457)
>       at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1418)
>       at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

