GitHub user mikeringenburg commented on the pull request:
https://github.com/apache/spark/pull/5403#issuecomment-112938589
We have one of the configurations to which @kayousterhout refers: a good
deal of local memory, but no local file system, only a global parallel file
system (Lustre). Using Lustre for the shuffle's temporary directory performs
very poorly, and using a local RAM disk is limiting because of one of the
issues mentioned in the updated PR description: shuffle data is cleaned up
very slowly, so we may run out of memory after a number of iterations.
So my feeling is that finding a way to clean up shuffle data more
aggressively might be a bigger priority. It would make something like this PR
more suitable for production, and would also make using a RAM disk for
shuffle data more viable.
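For reference, a minimal sketch of the RAM disk setup described above. It
assumes a Linux tmpfs mount at /dev/shm on every node; the directory path and
application name are hypothetical and do not come from this thread.
`spark.local.dir` is Spark's scratch-directory setting, and cluster managers
may override it through their own environment variables.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Assumption: /dev/shm is a RAM-backed tmpfs mount on every worker node.
// The path and app name below are illustrative only.
val conf = new SparkConf()
  .setAppName("ramdisk-shuffle-sketch")
  // Point Spark's scratch space (including shuffle files) at the RAM disk.
  .set("spark.local.dir", "/dev/shm/spark-local")

val sc = new SparkContext(conf)
```

This only relocates the shuffle files. It does not change how quickly they
are cleaned up, so the RAM disk can still fill over many iterations.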