[ 
https://issues.apache.org/jira/browse/SPARK-2862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14086181#comment-14086181
 ] 

Sean Owen commented on SPARK-2862:
----------------------------------

It looks like a Scala bug, which I see you've already found and proposed a 
workaround for. I was about to paste this simpler proof of the bug, so here it 
is for spectators:

scala> val increment = (99.0 - 6.0) / 9.0
increment: Double = 10.333333333333334
scala> Range.Double.inclusive(6.0, 99.0, increment).toArray
java.lang.IndexOutOfBoundsException: 9
...

Range.Double.inclusive(6.0, 99.0, 10.333333333333333) is fine, as is your 
"6.0 to (99.0, increment)". I think you can open a bug for the Scala class.
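
For spectators who want to see how bucket boundaries can be built without 
going through Range.Double at all, here is a minimal sketch. It is only an 
illustration, not the fix adopted in Spark or Scala; BucketBoundaries and 
evenBuckets are hypothetical names, and it assumes the caller wants 
bucketCount + 1 evenly spaced boundaries from min to max.

```scala
// Hypothetical helper for illustration; not part of Spark or the Scala library.
object BucketBoundaries {
  /** Returns bucketCount + 1 evenly spaced boundaries from min to max. */
  def evenBuckets(min: Double, max: Double, bucketCount: Int): Array[Double] = {
    require(bucketCount > 0, "bucketCount must be positive")
    val increment = (max - min) / bucketCount
    // Derive each boundary from its index rather than from a range object,
    // so rounding in the step cannot change the number of elements.
    // The last boundary is pinned to max so the top bucket closes exactly.
    Array.tabulate(bucketCount + 1) { i =>
      if (i == bucketCount) max else min + i * increment
    }
  }
}
```

With the inputs from this issue, BucketBoundaries.evenBuckets(6.0, 99.0, 9) 
produces 10 boundaries that start at 6.0 and end at 99.0.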

> DoubleRDDFunctions.histogram() throws exception for some inputs
> ---------------------------------------------------------------
>
>                 Key: SPARK-2862
>                 URL: https://issues.apache.org/jira/browse/SPARK-2862
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 0.9.0, 0.9.1, 1.0.0
>         Environment: Scala version 2.9.2 (OpenJDK 64-Bit Server VM, Java 
> 1.7.0_55) running on Ubuntu 14.04
>            Reporter: Chandan Kumar
>
> The histogram method call throws the stack trace below when the choice of 
> bucketCount partitions the RDD in non-terminating fractional increments, e.g.:
> scala> val r = sc.parallelize(6 to 99)
> r: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at <console>:12
> scala> r.histogram(9)
> java.lang.IndexOutOfBoundsException: 9
> at scala.collection.immutable.NumericRange.apply(NumericRange.scala:124)
> at scala.collection.immutable.NumericRange$$anon$1.apply(NumericRange.scala:176)
> at scala.collection.IndexedSeqLike$Elements.next(IndexedSeqLike.scala:66)
> at scala.collection.IterableLike$class.copyToArray(IterableLike.scala:237)
> at scala.collection.AbstractIterable.copyToArray(Iterable.scala:54)
> at scala.collection.TraversableOnce$class.copyToArray(TraversableOnce.scala:241)
> at scala.collection.AbstractTraversable.copyToArray(Traversable.scala:105)
> at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:249)
> at scala.collection.AbstractTraversable.toArray(Traversable.scala:105)
> at org.apache.spark.rdd.DoubleRDDFunctions.histogram(DoubleRDDFunctions.scala:116)
> at $iwC$$iwC$$iwC$$iwC.<init>(<console>:15)
> at $iwC$$iwC$$iwC.<init>(<console>:20)
> at $iwC$$iwC.<init>(<console>:22)
> at $iwC.<init>(<console>:24)
> at <init>(<console>:26)


