Github user mingyukim commented on the pull request:

    https://github.com/apache/spark/pull/4420#issuecomment-73303667
  
    Can you elaborate on the "memory size as an additional heuristic" idea?
The current behavior is consistently causing OOMs in one of our workflows, which
is exactly the situation that spilling to disk is supposed to prevent. I'm happy
to work on it on my end if you have suggestions.
    
    A few ideas off the top of my head:
    - Put a threshold on the {currentMemory - myMemoryThreshold} difference, so the
collection tries to spill once the gap gets large enough.
    - Better yet, why not remove the threshold check entirely, as originally
suggested in #3656? You can control how often spills happen by setting a minimum
on the amount of memory requested from ShuffleMemoryManager, which guarantees the
spill files are never too small. If you still end up with too many files, that's
unavoidable: the shuffle is simply too big to fit in memory, so you either spill
a lot or the JVM will OOM. Basically, I don't think trackMemoryThreshold and
trackMemoryFrequency are the right way to control spill frequency or spill file
size, since they can lead to OOMs when individual elements are large. (A rough
sketch of what I mean is below.)

