GitHub user bavardage commented on the issue:
https://github.com/apache/spark/pull/21794
It does seem that Spark currently does distinguish -0.0 and 0.0, at least as far as groupBy operations go:
```
scala> case class Thing(x: Float)
defined class Thing

scala> val df = Seq(Thing(0.0f), Thing(-0.0f), Thing(0.0f), Thing(-0.0f),
     |   Thing(0.0f), Thing(-0.0f), Thing(0.0f), Thing(-0.0f)).toDF
df: org.apache.spark.sql.DataFrame = [x: float]

scala> df.groupBy($"x").count
res0: org.apache.spark.sql.DataFrame = [x: float, count: bigint]

scala> res0.collect
res1: Array[org.apache.spark.sql.Row] = Array([-0.0,4], [0.0,4])
```
Doubles are hashed via `doubleToLongBits`:
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala#L338
which yields different bit patterns for positive and negative zero, so the two values end up in different hash buckets.
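For reference, this matches how `java.lang.Double.doubleToLongBits` (and `Float.floatToIntBits` for floats) behave in plain Scala, independent of Spark: the two zeros compare equal with `==` but have distinct bit patterns, so any hash built on the bits separates them. A quick illustrative REPL check:
```
scala> java.lang.Double.doubleToLongBits(0.0)
res2: Long = 0

scala> java.lang.Double.doubleToLongBits(-0.0)
res3: Long = -9223372036854775808

scala> java.lang.Float.floatToIntBits(-0.0f)
res4: Int = -2147483648

scala> 0.0 == -0.0
res5: Boolean = true
```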