[ https://issues.apache.org/jira/browse/SPARK-1298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981820#comment-13981820 ]

Nan Zhu commented on SPARK-1298:
--------------------------------

addressed in https://github.com/apache/spark/pull/186

> Remove duplicate partition id checking
> --------------------------------------
>
>                 Key: SPARK-1298
>                 URL: https://issues.apache.org/jira/browse/SPARK-1298
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 1.0.0
>            Reporter: Nan Zhu
>            Assignee: Nan Zhu
>            Priority: Minor
>             Fix For: 1.0.0
>
>
> In the current implementation, we check whether the partition ids make sense 
> in SparkContext.runJob():
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkContext.scala#L896
> However, immediately afterwards DAGScheduler checks them again (along the 
> call path SparkContext.runJob -> DAGScheduler.runJob -> 
> DAGScheduler.submitJob), just missing the < 0 condition:
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L432
> I propose removing the check in SparkContext and keeping the one in 
> DAGScheduler, which makes more sense in my view.


