GitHub user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22569#discussion_r220954056
  
    --- Diff: core/src/test/scala/org/apache/spark/util/collection/OpenHashSetSuite.scala ---
    @@ -255,4 +255,16 @@ class OpenHashSetSuite extends SparkFunSuite with Matchers {
         val set = new OpenHashSet[Long](0)
         assert(set.size === 0)
       }
    +
    +  test("support for more than 12M items") {
    +    val cnt = 12000000 // 12M
    +    val set = new OpenHashSet[Int](cnt)
    +    for (i <- 0 until cnt) {
    +      set.add(i)
    +
    +      val pos1 = set.addWithoutResize(i) & OpenHashSet.POSITION_MASK
    +      val pos2 = set.getPos(i)
    +      assert(pos1 == pos2)
    +    }
    --- End diff --
    
    nit: Would it be better to also add the following, so that every value is checked again after all of them have been added?
    ```
    for (i <- 0 until cnt) {
      assert(set.contains(i))
    }
    ```
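
    To illustrate why a second pass can catch what the in-loop check misses: a value may be visible right after it is inserted and still be lost by a later resize. The sketch below uses `TinyIntSet`, a hypothetical stand-in written only for this example (it is not Spark's `OpenHashSet`, and its hashing and load factor are assumptions), with a much smaller `cnt` so that it runs quickly.

    ```scala
    // Hypothetical stand-in for illustration only; NOT Spark's OpenHashSet.
    final class TinyIntSet(initialCapacity: Int = 16) {
      require(initialCapacity > 0 && (initialCapacity & (initialCapacity - 1)) == 0,
        "capacity must be a power of two")

      private var keys = new Array[Int](initialCapacity)
      private var used = new Array[Boolean](initialCapacity)
      private var count = 0

      def size: Int = count

      // Linear probing: returns the slot holding k, or the first free slot.
      private def slotFor(k: Int, ks: Array[Int], us: Array[Boolean]): Int = {
        val mask = ks.length - 1
        var pos = (k * 0x9E3779B9) & mask // multiplicative hash (an assumption)
        while (us(pos) && ks(pos) != k) {
          pos = (pos + 1) & mask
        }
        pos
      }

      def contains(k: Int): Boolean = used(slotFor(k, keys, used))

      def add(k: Int): Unit = {
        val pos = slotFor(k, keys, used)
        if (!used(pos)) {
          keys(pos) = k
          used(pos) = true
          count += 1
          if (count * 10 > keys.length * 7) grow() // keep load factor <= 0.7
        }
      }

      // Doubles the table and reinserts every key. A bug here would only
      // show up in a check that runs after later insertions.
      private def grow(): Unit = {
        val newKeys = new Array[Int](keys.length * 2)
        val newUsed = new Array[Boolean](keys.length * 2)
        for (i <- keys.indices if used(i)) {
          val pos = slotFor(keys(i), newKeys, newUsed)
          newKeys(pos) = keys(i)
          newUsed(pos) = true
        }
        keys = newKeys
        used = newUsed
      }
    }

    object PostInsertCheck {
      def main(args: Array[String]): Unit = {
        val cnt = 100000 // far smaller than the 12M in the PR
        val set = new TinyIntSet()
        for (i <- 0 until cnt) {
          set.add(i)
          assert(set.contains(i)) // pass 1: visible right after insertion
        }
        for (i <- 0 until cnt) {
          assert(set.contains(i)) // pass 2: still visible after every resize
        }
        assert(set.size == cnt)
      }
    }
    ```

    The first loop alone would pass even if `grow()` dropped older keys, because it only looks at the key that was just inserted; the second loop is the one that covers the resize path.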

