Good catch, Daniel. Looks like this is a Scala bug, not a Spark one. Still, Spark users should be careful not to use NumericRange.
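Until the Scala fix lands, the safest route is the one Daniel points at below: materialize the range before handing it to Spark. A minimal sketch of that workaround, assuming an existing SparkContext `sc` and a Scala version that still supports Double ranges (2.10 here):

// Workaround sketch: force the NumericRange into a plain collection so
// Spark never calls take/drop on the buggy NumericRange itself.
// Assumes `sc` is an already-created SparkContext.
val values = (0.0 to 1.0 by 0.1).toArray   // all 11 elements materialized up front
val rdd = sc.makeRDD(values)               // equivalent to sc.parallelize(values)
println(rdd.collect().length)              // 11, regardless of the number of cores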
On Fri, Apr 18, 2014 at 9:05 PM, Daniel Darabos <daniel.dara...@lynxanalytics.com> wrote:

> To make up for mocking Scala, I've filed a bug
> (https://issues.scala-lang.org/browse/SI-8518) and will try to patch this.
>
> On Fri, Apr 18, 2014 at 9:24 PM, Daniel Darabos <daniel.dara...@lynxanalytics.com> wrote:
>
>> Looks like NumericRange in Scala is just a joke.
>>
>> scala> val x = 0.0 to 1.0 by 0.1
>> x: scala.collection.immutable.NumericRange[Double] = NumericRange(0.0,
>> 0.1, 0.2, 0.30000000000000004, 0.4, 0.5, 0.6, 0.7, 0.7999999999999999,
>> 0.8999999999999999, 0.9999999999999999)
>>
>> scala> x.take(3)
>> res1: scala.collection.immutable.NumericRange[Double] = NumericRange(0.0,
>> 0.1, 0.2)
>>
>> scala> x.drop(3)
>> res2: scala.collection.immutable.NumericRange[Double] =
>> NumericRange(0.30000000000000004, 0.4, 0.5, 0.6, 0.7, 0.7999999999999999,
>> 0.8999999999999999, 0.9999999999999999)
>>
>> So far so good.
>>
>> scala> x.drop(3).take(3)
>> res3: scala.collection.immutable.NumericRange[Double] =
>> NumericRange(0.30000000000000004, 0.4)
>>
>> Why only two values? Where's 0.5?
>>
>> scala> x.drop(6)
>> res4: scala.collection.immutable.NumericRange[Double] =
>> NumericRange(0.6000000000000001, 0.7000000000000001, 0.8, 0.9)
>>
>> And where did the last value disappear now?
>>
>> You have to approach Scala with a healthy amount of distrust. You're on
>> the right track with toArray.
>>
>> On Fri, Apr 18, 2014 at 8:01 PM, Mark Hamstra <m...@clearstorydata.com> wrote:
>>
>>> Please file an issue: Spark Project JIRA
>>> <https://issues.apache.org/jira/browse/SPARK>
>>>
>>> On Fri, Apr 18, 2014 at 10:25 AM, Aureliano Buendia <buendia...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I just noticed that sc.makeRDD() does not keep all the values given when
>>>> the input is a NumericRange; try this in the Spark shell:
>>>>
>>>> $ MASTER=local[4] bin/spark-shell
>>>>
>>>> scala> sc.makeRDD(0.0 to 1 by 0.1).collect().length
>>>>
>>>> 8
>>>>
>>>> The expected length is 11. This works correctly when launching Spark
>>>> with only one core:
>>>>
>>>> $ MASTER=local[1] bin/spark-shell
>>>>
>>>> scala> sc.makeRDD(0.0 to 1 by 0.1).collect().length
>>>>
>>>> 11
>>>>
>>>> This also works correctly when using toArray():
>>>>
>>>> $ MASTER=local[4] bin/spark-shell
>>>>
>>>> scala> sc.makeRDD((0.0 to 1 by 0.1).toArray).collect().length
>>>>
>>>> 11
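For what it's worth, the lost elements look consistent with a partitioner that walks the collection by repeated take/drop. That is my assumption about how makeRDD slices a Seq, not a reading of the Spark source, but the same pattern in plain Scala already loses values once the chunks come from a Double NumericRange (Scala 2.10):

// Split an 11-element Double NumericRange into 4 chunks by repeated
// take/drop, roughly the way a Seq might be partitioned, and count what
// survives. With SI-8518, take/drop on the shifted range loses values,
// so the total comes out short of 11.
val range = 0.0 to 1.0 by 0.1              // 11 elements
val numSlices = 4
var rest: Seq[Double] = range              // still a NumericRange at runtime
var total = 0
for (i <- 0 until numSlices) {
  // proportional chunk sizes: 2, 3, 3, 3
  val size = (range.length * (i + 1)) / numSlices - (range.length * i) / numSlices
  total += rest.take(size).length          // take() on the dropped range misbehaves
  rest = rest.drop(size)
}
println(total)                             // expected 11; the NumericRange bug yields fewer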