Github user andrewor14 commented on the pull request:

    https://github.com/apache/spark/pull/3302#issuecomment-63398241
  
    I see, after a while we unconditionally try to spill every 32 elements, 
regardless of whether the in-memory buffer has exceeded the spill threshold. 
This is a serious problem, and the fix is an easy one: it looks like the 
original code simply omitted updating `elementsRead` in this code path. 
Changes here LGTM.
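
    For reference, a rough sketch of the intended spill check being discussed. 
The names (`elementsRead`, `myMemoryThreshold`, `spill`) follow the 
Spillable-style collections, but the class, defaults, and exact condition 
below are my own simplification for illustration, not the code touched by 
this PR:

    ```scala
    // Illustrative sketch only: a Spillable-style collection that considers
    // spilling on every insert. If an insert path forgets to bump
    // `elementsRead`, the periodic "% 32" guard no longer throttles spilling
    // the way it is meant to.
    abstract class SpillableSketch[T] {
      protected var elementsRead: Long = 0L            // elements inserted since the last spill
      protected var myMemoryThreshold: Long = 5L << 20 // bytes allowed in memory (assumed 5 MB floor)

      protected def spill(): Unit                      // write the in-memory buffer to disk
      protected def estimateSize(): Long               // current in-memory size in bytes
      protected def insertIntoBuffer(elem: T): Unit

      // Intended behavior: only consider spilling every 32 elements, and only
      // if the buffer has actually grown past the memory threshold.
      protected def addElement(elem: T): Unit = {
        insertIntoBuffer(elem)
        elementsRead += 1                              // the update that was omitted on the buggy path
        if (elementsRead % 32 == 0 && estimateSize() >= myMemoryThreshold) {
          spill()
          elementsRead = 0
        }
      }
    }
    ```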
    
    I think this is the first step towards fixing the "too many open files" 
issue that many users are seeing. We still need to hunt down the root cause 
of why the lower bound on how much memory a data structure can hold is not 
being accounted for properly.

