GitHub user bavardage commented on the issue:

    https://github.com/apache/spark/pull/21794
  
    It does seem that Spark currently does distinguish -0.0 and 0.0, at least as far as group-bys go:
    
    ```
    scala> case class Thing(x : Float)
    defined class Thing
    
    scala> val df = Seq(Thing(0.0f), Thing(-0.0f), Thing(0.0f), Thing(-0.0f), Thing(0.0f), Thing(-0.0f), Thing(0.0f), Thing(-0.0f)).toDF
    df: org.apache.spark.sql.DataFrame = [x: float]
    
    scala> df.groupBy($"x").count
    res0: org.apache.spark.sql.DataFrame = [x: float, count: bigint]
    
    scala> res0.collect
    res1: Array[org.apache.spark.sql.Row] = Array([-0.0,4], [0.0,4])
    ```
    
    Doubles are hashed via `doubleToLongBits` (https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala#L338), and floats analogously via `floatToIntBits`, which yield different bitwise representations for positive and negative zero, hence the two separate groups above.
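    For reference, a minimal sketch in plain Scala (outside Spark; the names and comments are mine, not from the codebase) showing the two bit patterns:
    
    ```
    // doubleToLongBits preserves the sign bit of negative zero, so the raw
    // bits fed into the hash differ even though the values compare equal.
    val posBits = java.lang.Double.doubleToLongBits(0.0)   // 0L
    val negBits = java.lang.Double.doubleToLongBits(-0.0)  // 0x8000000000000000L (sign bit set)
    println(f"+0.0 bits: $posBits%016x")  // 0000000000000000
    println(f"-0.0 bits: $negBits%016x")  // 8000000000000000
    println(0.0 == -0.0)                  // true: IEEE 754 comparison treats them as equal
    println(posBits == negBits)           // false: the hash inputs differ
    ```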

