Github user j-baker commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18543#discussion_r128027434
  
    --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/execution/UnsafeExternalRowSorter.java ---
    @@ -211,7 +211,10 @@ public int compare(Object baseObj1, long baseOff1, Object baseObj2, long baseOff
           // TODO: Why are the sizes -1?
           row1.pointTo(baseObj1, baseOff1, -1);
           row2.pointTo(baseObj2, baseOff2, -1);
    -      return ordering.compare(row1, row2);
    +      int comparison = ordering.compare(row1, row2);
    +      row1.pointTo(null, 0L, -1);
    +      row2.pointTo(null, 0L, -1);
    --- End diff --
    
    I think this is correct if the data ends up in memory, but otherwise the comparator is also used in the [UnsafeSorterSpillMerger](https://github.com/apache/spark/blob/master/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillMerger.java#L51).
    
    Since we're handing back an iterator, am I right in thinking that you always run the risk of this kind of leak unless you either clear the rows after each comparison or have some kind of periodic or async cleanup task?
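
To make the "clear after each comparison" option concrete, here is a minimal sketch. It does not use Spark's classes: `RowPointer` and `ClearingComparator` are hypothetical stand-ins for a reusable row object and the comparator that owns two of them, and the `holdsReferences` helper exists only for illustration. Unlike the diff above, the sketch resets the pointers in a `finally` block, which is an assumption of mine rather than something taken from the patch.

```java
import java.util.Comparator;

/** Hypothetical stand-in for a reusable row pointer (not Spark's UnsafeRow). */
final class RowPointer {
    Object base;   // backing object, e.g. a memory page; null when cleared
    long offset;   // offset of the record inside the backing object

    void pointTo(Object base, long offset) {
        this.base = base;
        this.offset = offset;
    }
}

/** Hypothetical comparator that reuses two pointers and clears them after every call. */
final class ClearingComparator {
    private final RowPointer row1 = new RowPointer();
    private final RowPointer row2 = new RowPointer();
    private final Comparator<RowPointer> ordering;

    ClearingComparator(Comparator<RowPointer> ordering) {
        this.ordering = ordering;
    }

    int compare(Object baseObj1, long baseOff1, Object baseObj2, long baseOff2) {
        row1.pointTo(baseObj1, baseOff1);
        row2.pointTo(baseObj2, baseOff2);
        try {
            return ordering.compare(row1, row2);
        } finally {
            // Drop the references so a long-lived comparator does not keep
            // the last compared backing objects reachable.
            row1.pointTo(null, 0L);
            row2.pointTo(null, 0L);
        }
    }

    /** Illustration only: true if either pointer still references a backing object. */
    boolean holdsReferences() {
        return row1.base != null || row2.base != null;
    }
}
```

The `finally` block means the pointers are also reset when the ordering throws. The cost is two extra field writes per comparison, which is the trade-off against relying on a separate cleanup task.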


