Github user eyalfa commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21369#discussion_r209499005
  
    --- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala ---
    @@ -585,17 +592,15 @@ class ExternalAppendOnlyMap[K, V, C](
           } else {
             logInfo(s"Task ${context.taskAttemptId} force spilling in-memory map to disk and " +
               s"it will release ${org.apache.spark.util.Utils.bytesToString(getUsed())} memory")
    -        nextUpstream = spillMemoryIteratorToDisk(upstream)
    +        val nextUpstream = spillMemoryIteratorToDisk(upstream)
    +        assert(!upstream.hasNext)
             hasSpilled = true
    +        upstream = nextUpstream
    --- End diff --
    
    @cloud-fan, do you think this is worth doing? I'm referring to the CompletionIterator delaying GC of the sub-iterator and the cleanup function (usually a closure referring to a larger collection).
    If so, I'd open a separate JIRA + PR for this.
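
To make the concern concrete, here is a minimal sketch of a completion-style wrapper that drops its references to the sub-iterator and the cleanup closure as soon as the sub-iterator is exhausted, so neither stays reachable through the wrapper. This is not Spark's CompletionIterator: the class name `ReleasingCompletionIterator` and its constructor shape are hypothetical and chosen only for illustration.

```scala
// Hypothetical, simplified stand-in for a completion-style iterator wrapper.
// It is NOT Spark's CompletionIterator; the name and signature are assumptions.
final class ReleasingCompletionIterator[A](sub: Iterator[A], cleanup: () => Unit)
    extends Iterator[A] {

  // Both references live in vars so they can be cleared after completion.
  // The constructor parameters are only read in these initializers.
  private var iter: Iterator[A] = sub
  private var onComplete: () => Unit = cleanup
  private var completed = false

  override def hasNext: Boolean = {
    val more = iter.hasNext
    if (!more && !completed) {
      completed = true
      val f = onComplete
      // Drop the references before running the cleanup, so the wrapper no
      // longer keeps the sub-iterator or the closure (and whatever larger
      // collection the closure captures) reachable.
      iter = Iterator.empty
      onComplete = null
      f()
    }
    more
  }

  override def next(): A = iter.next()
}
```

For example, wrapping `Iterator(1, 2, 3)` with a cleanup that increments a counter yields the three elements, runs the cleanup exactly once, and keeps returning `false` from `hasNext` afterwards without running it again.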


---
