[ https://issues.apache.org/jira/browse/SPARK-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981820#comment-13981820 ]
Nan Zhu commented on SPARK-1298:
--------------------------------
Addressed in https://github.com/apache/spark/pull/186
> Remove duplicate partition id checking
> --------------------------------------
>
> Key: SPARK-1298
> URL: https://issues.apache.org/jira/browse/SPARK-1298
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 1.0.0
> Reporter: Nan Zhu
> Assignee: Nan Zhu
> Priority: Minor
> Fix For: 1.0.0
>
>
> In the current implementation, we validate the partition IDs in
> SparkContext.runJob():
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkContext.scala#L896
> However, immediately afterwards, on the call path SparkContext.runJob ->
> DAGScheduler.runJob -> DAGScheduler.submitJob, DAGScheduler checks them
> again, only missing the < 0 condition:
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L432
> I propose removing the check from SparkContext and doing it only in
> DAGScheduler, which in my view is the more sensible place for it.