Hi Dongjoon,

Yes, it seems to be the same. So, was this done on purpose to match the behavior of Hive?
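For reference, here is a minimal sketch in a plain Scala REPL (outside Spark SQL, just to illustrate the precision loss; the variable name d is only for the example): math.ceil keeps the Double value intact, while the cast to Long saturates at Long.MaxValue.

    // 9.223372036854786E20 is greater than Long.MaxValue
    val d = 9.223372036854786E20

    // math.ceil returns a Double, so the value is preserved
    math.ceil(d)          // Double = 9.223372036854786E20

    // casting the ceiled value to Long saturates at Long.MaxValue,
    // which matches the ceil_result in the example below
    math.ceil(d).toLong   // Long = 9223372036854775807
    Long.MaxValue         // Long = 9223372036854775807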
Best regards,
Anton

2017-05-19 16:39 GMT+02:00 Dong Joon Hyun <dh...@hortonworks.com>:

> Hi, Anton.
>
> It’s the same result with Hive, isn’t it?
>
> hive> select 9.223372036854786E20, ceil(9.223372036854786E20);
> OK
> _c0                     _c1
> 9.223372036854786E20    9223372036854775807
> Time taken: 2.041 seconds, Fetched: 1 row(s)
>
> Bests,
> Dongjoon.
>
> *From: *Anton Okolnychyi <anton.okolnyc...@gmail.com>
> *Date: *Friday, May 19, 2017 at 7:26 AM
> *To: *"dev@spark.apache.org" <dev@spark.apache.org>
> *Subject: *[Spark SQL] ceil and floor functions on doubles
>
> Hi all,
>
> I am wondering why the results of the ceil and floor functions on doubles
> are internally cast to longs. This causes a loss of precision, since
> doubles can hold larger numbers.
>
> Consider the following example:
>
> // 9.223372036854786E20 is greater than Long.MaxValue
> val df = sc.parallelize(Array(("col", 9.223372036854786E20))).toDF()
> df.createOrReplaceTempView("tbl")
> spark.sql("select _2 AS original_value, ceil(_2) as ceil_result from tbl").show()
>
> +--------------------+-------------------+
> |      original_value|        ceil_result|
> +--------------------+-------------------+
> |9.223372036854786E20|9223372036854775807|
> +--------------------+-------------------+
>
> So, the original double value is rounded to 9223372036854775807, which is
> Long.MaxValue.
>
> I think it would be better to return 9.223372036854786E20 as it was (and
> as it is actually returned by math.ceil before the cast to long). If it
> is a problem, then I can fix this.
>
> Best regards,
> Anton