nchammas commented on code in PR #45036:
URL: https://github.com/apache/spark/pull/45036#discussion_r1482085088
##########
core/src/test/scala/org/apache/spark/util/collection/OpenHashSetSuite.scala:
##########
@@ -269,4 +269,35 @@ class OpenHashSetSuite extends SparkFunSuite with Matchers
{
assert(pos1 == pos2)
}
}
+
+ test("SPARK-45599: 0.0 and -0.0 are equal but not the same") {
Review Comment:
> This is a bit tricky and it's better if we can find a reference system
> that defines this semantic.
```scala
scala> import java.util.HashSet
import java.util.HashSet
scala> val h = new HashSet[Double]()
val h: java.util.HashSet[Double] = []
scala> h.add(0.0)
val res0: Boolean = true
scala> h.add(-0.0)
val res1: Boolean = true
scala> h.size()
val res2: Int = 2
```
The doc for [HashSet.add][1] states:
> More formally, adds the specified element e to this set if this set
> contains no element e2 such that Objects.equals(e, e2). If this set already
> contains the element, the call leaves the set unchanged and returns false.
In other words, `java.util.HashSet` compares elements with `equals` and not
`==`. For boxed doubles, `Double.equals` compares bit patterns, so the set
treats `0.0` and `-0.0` as distinct elements.
So this PR brings `OpenHashSet` more in line with the semantics of
`java.util.HashSet`.
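The difference between `==` and `equals` described above can be checked directly. The following is only a minimal sketch: the object name `SignedZeroDemo` and its method names are made up for illustration and are not part of this PR or of Spark. It uses only `java.lang.Double` and `java.util.HashSet` from the JDK.

```scala
import java.util.{HashSet => JHashSet}

// Hypothetical helper object, for illustration only (not part of the PR).
object SignedZeroDemo {
  private val posZero = java.lang.Double.valueOf(0.0)
  private val negZero = java.lang.Double.valueOf(-0.0)

  // Primitive comparison follows IEEE 754: the two zeros compare equal.
  def primitiveEqual: Boolean = 0.0 == -0.0

  // Boxed comparison goes through Double.equals, which compares the
  // doubleToLongBits representations and so distinguishes the sign bit.
  def boxedEqual: Boolean = posZero.equals(negZero)

  // The two zeros differ only in the sign bit of their bit patterns.
  def bitsDiffer: Boolean =
    java.lang.Double.doubleToLongBits(0.0) != java.lang.Double.doubleToLongBits(-0.0)

  // java.util.HashSet relies on equals/hashCode, so it keeps both zeros.
  def javaSetSize: Int = {
    val h = new JHashSet[java.lang.Double]()
    h.add(posZero)
    h.add(negZero)
    h.size()
  }

  def main(args: Array[String]): Unit = {
    println(s"0.0 == -0.0            : $primitiveEqual")
    println(s"boxed equals           : $boxedEqual")
    println(s"bit patterns differ    : $bitsDiffer")
    println(s"java.util.HashSet size : $javaSetSize")
  }
}
```

The set size of 2 matches the REPL session above: the set never consults the primitive `==` result.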
[1]: https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/HashSet.html#add(E)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]