Github user pwendell commented on a diff in the pull request:

    https://github.com/apache/spark/pull/4450#discussion_r29118289
  
    --- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala ---
    @@ -740,15 +723,29 @@ private[spark] class ExternalSorter[K, V, C](
               in.close()
             }
           }
    +    } else if (spills.isEmpty && partitionWriters == null) {
    --- End diff --
    
    The branching here is starting to get very complicated (#1799 added a 
second level of branching, and now this adds a third). Also, it is a bit redundant 
with the branching in `partitionedIterator`, which also has its own special 
case for this. However, I don't see an obvious way to improve it given the 
design of these other data structures. I'll keep thinking.


