Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/23055#discussion_r236345625
--- Diff:
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
private val reuseWorker = conf.getBoolean("spark.python.worker.reuse",
true)
// each python worker gets an equal part of the allocation. the worker
pool will grow to the
// number of concurrent tasks, which is determined by the number of
cores in this executor.
- private val memoryMb = conf.get(PYSPARK_EXECUTOR_MEMORY)
+ private val memoryMb = if (Utils.isWindows) {
--- End diff --
> functionality is disabled in Python side
The only functionality that is disabled is limiting the memory space. The
allocation for Python is still requested from resource managers. Setting the
environment property tells python how much memory it was allocated, no matter
how that is used or enforced.
> code consistency - usually the configuration is dealt with in JVM side if
possible
The JVM is handling the setting by requesting that memory for python and
passing on the amount requested to python. The fact that the python process
can't limit doesn't affect how the JVM side should behave. This needlessly
couples JVM and python behavior with an assumption that may not be true in the
future.
> Why are you so against about disabling it in JVM side?
There is no benefit to disabling this. It is more code with no purpose and
it makes assumptions about what python can or cannot do that aren't obvious.
What if pandas implements some method to spill to disk to limit memory
consumption? Will implementers of that future feature know that the environment
variable is not set when running in windows? This adds complexity for no
benefit because it doesn't change either the resource allocation in the JVM or
the behavior of the python process. It only avoids sending valuable
information. I see no reason for it.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]