GitHub user davies commented on the issue:

    https://github.com/apache/spark/pull/15722
  
    @jiexiong The longArray will not grow indefinitely; it only grows when the 
number of keys reaches 50% of its size. Another assumption is that the memory 
used by longArray should be much smaller than the memory used by the pages 
(longArray takes 32 bytes per key, while the pages take 56 bytes per key for an 
aggregate with 3 grouping keys and 1 aggregate). Is that true for your workload?
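
    To make the two assumptions above concrete, here is a rough Python sketch. 
It is not Spark code: the function names, the initial capacity of 64, and the 
choice to double the capacity on each growth are illustrative assumptions. 
Only the 50% growth threshold and the 32 and 56 bytes-per-key figures come 
from the comment itself.

```python
# Hypothetical model of the behaviour described above; none of these names
# are Spark APIs.
LONG_ARRAY_BYTES_PER_KEY = 32   # figure quoted in the comment
PAGE_BYTES_PER_KEY = 56         # quoted for 3 grouping keys + 1 aggregate
GROWTH_LOAD_FACTOR = 0.5        # grow when keys reach 50% of capacity


def capacity_after_inserts(num_keys, initial_capacity=64):
    """Capacity after inserting num_keys distinct keys.

    Assumes (for illustration) that capacity doubles each time the key
    count reaches GROWTH_LOAD_FACTOR of the current capacity.
    """
    capacity = initial_capacity
    for count in range(1, num_keys + 1):
        if count >= capacity * GROWTH_LOAD_FACTOR:
            capacity *= 2
    return capacity


def estimate_memory(num_keys):
    """Return (long_array_bytes, page_bytes) using the per-key figures above."""
    return (num_keys * LONG_ARRAY_BYTES_PER_KEY,
            num_keys * PAGE_BYTES_PER_KEY)


def long_array_share(num_keys):
    """Fraction of the combined estimate that is attributed to longArray."""
    long_array_bytes, page_bytes = estimate_memory(num_keys)
    return long_array_bytes / (long_array_bytes + page_bytes)


if __name__ == "__main__":
    keys = 1_000_000
    print(capacity_after_inserts(keys))
    print(estimate_memory(keys))
    print(round(long_array_share(keys), 3))
```

    Plugging in the real number of distinct keys and the actual row width of 
the workload shows whether longArray stays small relative to the pages.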
    
    If the bookkeeping in the memory manager is right, it may spill more often 
(because longArray is using more memory than expected), but it should not OOM. 
It's true that this patch could fix the OOM you saw in that query, but changing 
the memory factor (or other configs) should also fix that. I'm worried that 
a bug somewhere else, rather than this one, is causing the problem. Could you 
dump a log of how the memory was being used when the OOM happened?

