[ https://issues.apache.org/jira/browse/SPARK-2862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14086181#comment-14086181 ]
Sean Owen commented on SPARK-2862:
----------------------------------
It looks like a Scala bug, which I see you've already found and proposed a
workaround for. I was about to paste this simpler proof of the bug, so here it
is for spectators:
scala> val increment = (99.0 - 6.0) / 9.0
increment: Double = 10.333333333333334
scala> Range.Double.inclusive(6.0, 99.0, increment).toArray
java.lang.IndexOutOfBoundsException: 9
...
Range.Double.inclusive(6.0, 99.0, 10.333333333333333) is fine, as is your
"6.0 to (99.0, increment)". I think you can open a bug against the Scala class.
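For anyone who needs the bucket boundaries in the meantime, here is a minimal sketch that avoids NumericRange entirely by deriving each boundary from its index. The object and method names are hypothetical, and this is not the workaround proposed on the issue or Spark's actual code, just one way to sidestep the range class under the assumption that evenly spaced boundaries are all that is needed:

```scala
// Hypothetical helper for illustration only; not part of Spark or Scala.
object BucketBoundaries {
  // Returns bucketCount + 1 evenly spaced boundaries from min to max.
  def bucketBoundaries(min: Double, max: Double, bucketCount: Int): Array[Double] = {
    require(bucketCount > 0, "bucketCount must be positive")
    val increment = (max - min) / bucketCount
    // The element count is fixed up front rather than inferred from
    // floating-point division, and the last boundary is pinned to max.
    Array.tabulate(bucketCount + 1) { i =>
      if (i == bucketCount) max else min + i * increment
    }
  }

  def main(args: Array[String]): Unit = {
    val buckets = bucketBoundaries(6.0, 99.0, 9)
    println(buckets.length) // 10
    println(buckets.head)   // 6.0
    println(buckets.last)   // 99.0
  }
}
```

Pinning the final boundary to max means rounding error in the increment cannot push the last bucket edge past the largest value.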
> DoubleRDDFunctions.histogram() throws exception for some inputs
> ---------------------------------------------------------------
>
> Key: SPARK-2862
> URL: https://issues.apache.org/jira/browse/SPARK-2862
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 0.9.0, 0.9.1, 1.0.0
> Environment: Scala version 2.9.2 (OpenJDK 64-Bit Server VM, Java
> 1.7.0_55) running on Ubuntu 14.04
> Reporter: Chandan Kumar
>
> The histogram method throws the stack trace below when the chosen
> bucketCount splits the RDD's value range into non-terminating increments
> (here 93 / 9 = 10.333...), e.g.
> scala> val r = sc.parallelize(6 to 99)
> r: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at <console>:12
> scala> r.histogram(9)
> java.lang.IndexOutOfBoundsException: 9
> at scala.collection.immutable.NumericRange.apply(NumericRange.scala:124)
> at scala.collection.immutable.NumericRange$$anon$1.apply(NumericRange.scala:176)
> at scala.collection.IndexedSeqLike$Elements.next(IndexedSeqLike.scala:66)
> at scala.collection.IterableLike$class.copyToArray(IterableLike.scala:237)
> at scala.collection.AbstractIterable.copyToArray(Iterable.scala:54)
> at scala.collection.TraversableOnce$class.copyToArray(TraversableOnce.scala:241)
> at scala.collection.AbstractTraversable.copyToArray(Traversable.scala:105)
> at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:249)
> at scala.collection.AbstractTraversable.toArray(Traversable.scala:105)
> at org.apache.spark.rdd.DoubleRDDFunctions.histogram(DoubleRDDFunctions.scala:116)
> at $iwC$$iwC$$iwC$$iwC.<init>(<console>:15)
> at $iwC$$iwC$$iwC.<init>(<console>:20)
> at $iwC$$iwC.<init>(<console>:22)
> at $iwC.<init>(<console>:24)
> at <init>(<console>:26)