Github user chouqin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/5737#discussion_r29304781
  
    --- Diff: core/src/test/scala/org/apache/spark/util/collection/ExternalSorterSuite.scala ---
    @@ -506,7 +506,10 @@ class ExternalSorterSuite extends FunSuite with LocalSparkContext with PrivateMe
         val agg = new Aggregator[Int, Int, Int](i => i, (i, j) => i + j, (i, j) => i + j)
         val ord = implicitly[Ordering[Int]]
         val sorter = new ExternalSorter(Some(agg), Some(new HashPartitioner(3)), Some(ord), None)
    -    sorter.insertAll((0 until 100000).iterator.map(i => (i / 2, i)))
    +
    +    // avoid combine before spill
    +    sorter.insertAll((0 until 50000).iterator.map(i => (i, 2 * i)))
    +    sorter.insertAll((0 until 50000).iterator.map(i => (i, 2 * i + 1)))
    --- End diff ---
    
    Yes, it failed on my computer (the test case didn't stop). I set the
    memory limit to a small value on my machine to guarantee spills. I think
    we can also guarantee them by setting `spark.shuffle.memoryFraction` to
    a small value. The original value of `spark.shuffle.memoryFraction` is
    0.001, and I don't know whether that is small enough to guarantee
    spills. If not, we should change it to a smaller value.
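
    In case it helps, here is a minimal sketch of what I mean (my own
    illustration, not the suite's exact code): pin the fraction very low in
    the test conf so that `insertAll` has to spill. It assumes the Spark
    1.x config keys and that it compiles inside the `org.apache.spark`
    package, since `ExternalSorter` is private[spark]:

    ```scala
    import org.apache.spark.{Aggregator, HashPartitioner, SparkConf, SparkContext}
    import org.apache.spark.util.collection.ExternalSorter

    // Tiny fraction (well below the suite's 0.001) so even small inserts spill.
    val conf = new SparkConf(false)
      .setMaster("local")
      .setAppName("spill-test")
      .set("spark.shuffle.spill", "true")
      .set("spark.shuffle.memoryFraction", "0.0001")
    // Creating the context initializes the SparkEnv the sorter reads from.
    val sc = new SparkContext(conf)

    val agg = new Aggregator[Int, Int, Int](i => i, _ + _, _ + _)
    val ord = implicitly[Ordering[Int]]
    val sorter = new ExternalSorter(Some(agg), Some(new HashPartitioner(3)), Some(ord), None)

    // Each pass uses unique keys, so nothing combines in memory and the
    // sorter must spill rather than shrink its in-memory map.
    sorter.insertAll((0 until 50000).iterator.map(i => (i, 2 * i)))
    sorter.insertAll((0 until 50000).iterator.map(i => (i, 2 * i + 1)))
    sorter.stop() // clean up intermediate spill files
    ```

    Either way, the test would still need an assertion that a spill really
    happened (e.g. by checking for spill files on disk); otherwise a
    too-large fraction silently makes the test vacuous.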

