Github user andrewor14 commented on a diff in the pull request:

    https://github.com/apache/incubator-spark/pull/612#discussion_r9925630
  
    --- Diff: core/src/test/scala/org/apache/spark/util/collection/ExternalAppendOnlyMapSuite.scala ---
    @@ -83,6 +83,28 @@ class ExternalAppendOnlyMapSuite extends FunSuite with BeforeAndAfter with Local
           (3, Set[Int](30))))
       }
     
    +  test("insert with collision on hashCode Int.MaxValue") {
    +    val conf = new SparkConf(false)
    +    sc = new SparkContext("local", "test", conf)
    +
    +    val map = new ExternalAppendOnlyMap[Int, Int, ArrayBuffer[Int]](createCombiner,
    +      mergeValue, mergeCombiners)
    +
    +    map.insert(Int.MaxValue, 10)
    +    map.insert(2, 20)
    +    map.insert(3, 30)
    +    map.insert(Int.MaxValue, 100)
    +    map.insert(2, 200)
    +    map.insert(Int.MaxValue, 1000)
    +    val it = map.iterator
    +    assert(it.hasNext)
    +    val result = it.toSet[(Int, ArrayBuffer[Int])].map(kv => (kv._1, kv._2.toSet))
    +    assert(result == Set[(Int, Set[Int])](
    +      (Int.MaxValue, Set[Int](10, 100, 1000)),
    +      (2, Set[Int](20, 200)),
    +      (3, Set[Int](30))))
    --- End diff --
    
    Even after setting the memory parameters, we still need to insert a lot into the map to induce spilling. I have been able to trigger the exception that you found with the following:
    
    // Insert enough entries to fill the map and induce a spill to disk
    (1 until 100000).foreach { i => map.insert(i, i) }
    // An Int's hashCode is the Int itself, so this key hashes to Int.MaxValue
    map.insert(Int.MaxValue, Int.MaxValue)
    
    // Iterating merges the in-memory and spilled contents, which triggers the exception
    val it = map.iterator
    while (it.hasNext) {
      it.next()
    }
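    
    For reference, the memory parameters I'm referring to would be set on the conf before creating the SparkContext, roughly along these lines (just a sketch, assuming the spark.shuffle.spill and spark.shuffle.memoryFraction keys are what govern the spill threshold here):
    
    // Sketch only: use a very small memory fraction so spilling kicks in after few inserts
    val conf = new SparkConf(false)
    conf.set("spark.shuffle.spill", "true")
    conf.set("spark.shuffle.memoryFraction", "0.001")
    sc = new SparkContext("local", "test", conf)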

