GitHub user witgo opened a pull request: https://github.com/apache/spark/pull/15505
[WIP][SPARK-17931] taskScheduler has some unneeded serialization

## What changes were proposed in this pull request?

When taskScheduler instantiates a TaskDescription, it calls `Task.serializeWithDependencies(task, sched.sc.addedFiles, sched.sc.addedJars, ser)`, which serializes the task together with its dependencies. However, since SPARK-2521 was merged into master, the ResultTask and ShuffleMapTask classes no longer contain rdd and closure objects. The TaskDescription class can therefore be changed as follows:

```scala
class TaskDescription(
    var taskId: Long,
    var attemptNumber: Int,
    var executorId: String,
    var name: String,
    var index: Int,
    var taskFiles: mutable.Map[String, Long],
    var taskJars: mutable.Map[String, Long],
    var task: Task[_])
```

## How was this patch tested?

TODO ...

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/witgo/spark SPARK-17931

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/15505.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #15505

----
commit b51d00ce0f737e5b3568668e5d9225a805a4abea
Author: Guoqiang Li <wi...@qq.com>
Date:   2016-10-16T08:27:37Z

    taskScheduler has some unneeded serialization

----
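
To illustrate the TaskDescription shape proposed in this pull request, here is a minimal, self-contained sketch that does not depend on Spark. It assumes plain Java serialization. `DemoTask`, `TaskDescriptionSketch`, `serialize`, `deserialize`, and `roundTrip` are hypothetical names introduced only for this example; `DemoTask` stands in for `Task[_]`. The actual patch may serialize the description differently.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import scala.collection.mutable

// Hypothetical stand-in for Spark's Task[_]; not the real class.
case class DemoTask(stageId: Int, partitionId: Int)

// Same fields as the class proposed in the PR, with DemoTask in place of Task[_].
// The task is carried as a regular field rather than as pre-serialized bytes.
class TaskDescription(
    var taskId: Long,
    var attemptNumber: Int,
    var executorId: String,
    var name: String,
    var index: Int,
    var taskFiles: mutable.Map[String, Long],
    var taskJars: mutable.Map[String, Long],
    var task: DemoTask) extends Serializable

object TaskDescriptionSketch {
  // Assumption: plain Java serialization, used here only for illustration.
  def serialize(obj: AnyRef): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try out.writeObject(obj) finally out.close()
    bytes.toByteArray
  }

  def deserialize[T](data: Array[Byte]): T = {
    val in = new ObjectInputStream(new ByteArrayInputStream(data))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  // Builds a description, serializes it in a single pass, and reads it back.
  def roundTrip(): TaskDescription = {
    val desc = new TaskDescription(
      taskId = 1L,
      attemptNumber = 0,
      executorId = "exec-1",
      name = "task 1.0",
      index = 3,
      taskFiles = mutable.Map("a.txt" -> 100L),
      taskJars = mutable.Map("dep.jar" -> 200L),
      task = DemoTask(stageId = 0, partitionId = 3))
    deserialize[TaskDescription](serialize(desc))
  }
}
```

In this sketch the task, the file map, and the jar map are all written in one serialization pass over the description, so there is no separate step that first packs the task and its dependencies into a byte buffer.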