[ https://issues.apache.org/jira/browse/SPARK-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080452#comment-14080452 ]
Larry Xiao commented on SPARK-2728:
-----------------------------------
I want to try :)
> Integer overflow in partition index calculation RangePartitioner
> ----------------------------------------------------------------
>
> Key: SPARK-2728
> URL: https://issues.apache.org/jira/browse/SPARK-2728
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 1.0.0
> Environment: Spark 1.0.1
> Reporter: Jianshi Huang
> Labels: easyfix
>
> If the number of partitions is greater than 10362, Spark will report an
> ArrayIndexOutOfBoundsException.
> The reason is in the partition index calculation in rangeBounds:
> #Line: 112
> val bounds = new Array[K](partitions - 1)
> for (i <- 0 until partitions - 1) {
>   val index = (rddSample.length - 1) * (i + 1) / partitions
>   bounds(i) = rddSample(index)
> }
> Here (rddSample.length - 1) * (i + 1) overflows to a negative Int.
> Would casting rddSample.length - 1 to Long be enough for a fix?
> Jianshi
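To make the report concrete, here is a minimal standalone sketch of the index calculation quoted above, next to the Long-widening variant the reporter suggests. It does not use Spark at all. The object name RangeBoundsOverflow, the helpers indexInt and indexLong, and the sample length of 20 * partitions are assumptions made for illustration; the issue text does not state the sample size, and this is not necessarily the fix that was merged.

```scala
object RangeBoundsOverflow {
  // Same expression as in the quoted snippet: the product is computed in Int,
  // so it can wrap around before the division happens.
  def indexInt(sampleLength: Int, i: Int, partitions: Int): Int =
    (sampleLength - 1) * (i + 1) / partitions

  // Suggested fix: widen to Long before multiplying, then narrow the
  // (small) quotient back to Int for use as an array index.
  def indexLong(sampleLength: Int, i: Int, partitions: Int): Int =
    ((sampleLength - 1).toLong * (i + 1) / partitions).toInt

  def main(args: Array[String]): Unit = {
    val partitions = 10363
    // Assumption for illustration only: 20 sampled keys per partition.
    val sampleLength = 20 * partitions
    // Last iteration of `for (i <- 0 until partitions - 1)`.
    val i = partitions - 2
    println(s"Int arithmetic:  ${indexInt(sampleLength, i, partitions)}")
    println(s"Long arithmetic: ${indexLong(sampleLength, i, partitions)}")
  }
}
```

Under that assumed sample size, the Int version yields a negative index at 10363 partitions, while the Long version yields 207239, which is within the bounds of the sample array. At 10362 partitions the product still fits in an Int and both versions agree, which is consistent with the threshold in the report.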