Mark Khaitman created SPARK-5782:
------------------------------------
Summary: Python Worker / Pyspark Daemon Memory Issue
Key: SPARK-5782
URL: https://issues.apache.org/jira/browse/SPARK-5782
Project: Spark
Issue Type: Bug
Components: PySpark, Shuffle
Affects Versions: 1.2.1, 1.3.0, 1.2.2
Environment: CentOS 7, Spark Standalone
Reporter: Mark Khaitman
I'm including the Shuffle component on this, since a brief scan through the code
(which I'm not 100% familiar with just yet) shows that it does a large amount of
the memory handling.
It appears that any type of join between two RDDs spawns twice as many
pyspark.daemon workers as the default 1 task -> 1 core configuration in our
environment. This becomes problematic when you build up a tree of RDD joins,
since the pyspark.daemon processes do not exit until the top-level join has
completed (or so it seems). This can lead to memory exhaustion by a single
framework, even though it is set to have a 512MB Python worker memory limit
and only a few GB of executor memory.
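For reference, here is a minimal sketch of the kind of join tree that triggers
this for us (the RDD names and data below are made up purely for illustration;
our real jobs join several much larger keyed RDDs):

    from pyspark import SparkContext

    sc = SparkContext(appName="join-tree-example")

    # Three keyed RDDs; in our jobs these are much larger.
    a = sc.parallelize([(i, i) for i in range(1000)])
    b = sc.parallelize([(i, i * 2) for i in range(1000)])
    c = sc.parallelize([(i, i * 3) for i in range(1000)])

    # Each join below appears to spawn an additional set of pyspark.daemon
    # workers, and the earlier workers seem to stick around until the
    # top-level join finishes.
    ab = a.join(b)
    abc = ab.join(c)
    abc.count()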
Another related issue is that the individual Python workers are not supposed to
exceed 512MB by very much; once they do, they are supposed to spill to disk.
I came across this bit of code in shuffle.py which *may* have something to do
with allowing some of our Python workers to somehow reach 2GB each. When that
is multiplied by the number of cores per executor and the number of joins
occurring in some cases, it causes the Out-of-Memory killer to step up to its
unfortunate job! :(
    def _next_limit(self):
        """
        Return the next memory limit. If the memory is not released
        after spilling, it will dump the data only when the used memory
        starts to increase.
        """
        return max(self.memory_limit, get_used_memory() * 1.05)
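If the Python process never returns memory to the OS after a spill (which
CPython generally doesn't), the threshold for the next spill decision becomes
105% of whatever is currently resident rather than the configured
spark.python.worker.memory. A rough, purely hypothetical simulation of that
ratchet (the numbers are made up to show the growth pattern, not actual Spark
behaviour):

    # Hypothetical illustration of how the spill threshold can creep upward
    # when used memory never drops back after a spill.
    memory_limit = 512  # MB, e.g. spark.python.worker.memory

    def next_limit(used_memory):
        # mirrors shuffle.py: max(self.memory_limit, get_used_memory() * 1.05)
        return max(memory_limit, used_memory * 1.05)

    used = 512.0  # assume RSS stays at the limit after the first spill
    limit = next_limit(used)
    for spill in range(1, 21):
        used = limit          # memory keeps growing until the new limit is hit
        limit = next_limit(used)
        print("after spill %2d: limit ~= %.0f MB" % (spill, limit))

    # After roughly 20 spills the threshold is already around 1.4 GB, and by
    # roughly 28 spills it passes 2 GB, so a worker can drift well past the
    # 512 MB limit without ever being forced to spill.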
I've only just started looking into the code, and would definitely love to
contribute towards Spark, though I figured it might be quicker to resolve if
someone already owns the code!