nchammas commented on code in PR #45036:
URL: https://github.com/apache/spark/pull/45036#discussion_r1482123330
##########
core/src/test/scala/org/apache/spark/util/collection/OpenHashSetSuite.scala:
##########
@@ -269,4 +269,35 @@ class OpenHashSetSuite extends SparkFunSuite with Matchers
{
assert(pos1 == pos2)
}
}
+
+ test("SPARK-45599: 0.0 and -0.0 are equal but not the same") {
Review Comment:
Consider another interesting case where `java.util.HashSet` and
`OpenHashSet` differ:
```scala
scala> import java.util.HashSet
scala> val h = new HashSet[Double]()
val h: java.util.HashSet[Double] = []
scala> h.add(Double.NaN)
val res9: Boolean = true
scala> h.add(Double.NaN)
val res10: Boolean = false
scala> h.size()
val res11: Int = 1
```
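The `HashSet` result follows from how boxed doubles compare. `java.lang.Double.equals` compares bit patterns (via `doubleToLongBits`), whereas primitive `==` follows IEEE 754. A minimal sketch of the difference (the object and value names are only for illustration):

```scala
object NaNEquality {
  private val nan: Double = Double.NaN
  private val posZero: Double = 0.0
  private val negZero: Double = -0.0

  // Primitive comparison follows IEEE 754: NaN is never equal to itself,
  // while 0.0 and -0.0 compare equal.
  val primitiveNaNEqual: Boolean = nan == nan
  val primitiveZerosEqual: Boolean = posZero == negZero

  // Boxed comparison goes through java.lang.Double.equals, which compares
  // bit patterns: NaN equals NaN, but 0.0 does not equal -0.0.
  val boxedNaNEqual: Boolean =
    java.lang.Double.valueOf(nan).equals(java.lang.Double.valueOf(nan))
  val boxedZerosEqual: Boolean =
    java.lang.Double.valueOf(posZero).equals(java.lang.Double.valueOf(negZero))
}
```

Since `java.util.HashSet` stores boxed values and relies on `equals`/`hashCode`, the second `add(Double.NaN)` is treated as a duplicate.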
On `master`, `OpenHashSet` does what is IMO the wrong thing and stores both NaNs:
```scala
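// OpenHashSet is private[spark], so run this from Spark's own code or tests.
import org.apache.spark.util.collection.OpenHashSet

val set = new OpenHashSet[Double]()
set.add(Double.NaN)
set.add(Double.NaN)
set.size // returns 2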
```
This could lead to a bug like the one reported in SPARK-45599 but in
reverse, where a new NaN row is added rather than dropped. I will see if I
can construct such a scenario as a demonstration. Regardless, I think this
behavior is incorrect in itself.
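To illustrate how a set can end up with two NaNs, here is a toy open-addressing set. It is not Spark's `OpenHashSet`; `ToyDoubleSet`, `sameKey`, and `ToyDemo` are hypothetical names, and the sketch assumes the duplication comes from checking keys with primitive `==` during probing.

```scala
// Toy linear-probing set of doubles. NOT Spark's OpenHashSet: no resizing,
// so it must never be filled to capacity.
final class ToyDoubleSet(sameKey: (Double, Double) => Boolean, capacity: Int = 8) {
  private val keys = new Array[Double](capacity)
  private val used = new Array[Boolean](capacity)
  private var count = 0

  def size: Int = count

  def add(k: Double): Unit = {
    // Hash the bit pattern, then probe linearly until a match or a free slot.
    var pos = (java.lang.Double.hashCode(k) & Int.MaxValue) % capacity
    while (used(pos)) {
      if (sameKey(keys(pos), k)) return // already present
      pos = (pos + 1) % capacity
    }
    keys(pos) = k
    used(pos) = true
    count += 1
  }
}

object ToyDemo {
  // Keys compared with primitive ==: NaN never matches the stored NaN.
  val byPrimitiveEq = new ToyDoubleSet(_ == _)
  byPrimitiveEq.add(Double.NaN)
  byPrimitiveEq.add(Double.NaN)

  // Keys compared by bit pattern, as java.lang.Double.equals does.
  val byBits = new ToyDoubleSet((a, b) =>
    java.lang.Double.doubleToLongBits(a) == java.lang.Double.doubleToLongBits(b))
  byBits.add(Double.NaN)
  byBits.add(Double.NaN)
}
```

In this sketch `ToyDemo.byPrimitiveEq.size` is 2 and `ToyDemo.byBits.size` is 1. Both NaNs hash to the same slot, so the only difference is the key comparison used while probing.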
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]