Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23055#discussion_r236478702
--- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
   private val reuseWorker = conf.getBoolean("spark.python.worker.reuse", true)
   // each python worker gets an equal part of the allocation. the worker pool will grow to the
   // number of concurrent tasks, which is determined by the number of cores in this executor.
-  private val memoryMb = conf.get(PYSPARK_EXECUTOR_MEMORY)
+  private val memoryMb = if (Utils.isWindows) {
--- End diff ---
Strictly speaking, we should move this into the JVM rather than adding more controls at the workers; otherwise we will end up sending a bunch of data and environment variables to the Python workers. The JVM already sends an environment variable indicating that this configuration is set, and I was thinking that what we should do is disable it under certain conditions on the JVM side, rather than checking the package and skipping it on the Python worker side. I would rather remove the Python side's check, then.
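Roughly, a minimal sketch of what I mean by resolving it on the JVM side (illustrative only, not the PR's actual code; `isWindows` stands in for `Utils.isWindows`, and `pysparkMemoryMb` is a hypothetical stand-in for `conf.get(PYSPARK_EXECUTOR_MEMORY)`):

```scala
// Illustrative sketch only: resolve the platform check once in the JVM so
// the Python worker never needs to probe for the `resource` module itself.
object MemoryLimitSketch {
  // Mirrors what Utils.isWindows checks in Spark.
  private val isWindows: Boolean =
    sys.props.getOrElse("os.name", "").startsWith("Windows")

  // Hypothetical stand-in for the optional spark.executor.pyspark.memory value.
  private val pysparkMemoryMb: Option[Long] = Some(512L)

  // On Windows the Python `resource` module is unavailable, so the limit
  // cannot be enforced; treat the configuration as unset there.
  val memoryMb: Option[Long] = if (isWindows) None else pysparkMemoryMb

  def main(args: Array[String]): Unit = {
    // Only pass the limit to the worker when it is actually usable, so the
    // Python side needs no platform check of its own.
    memoryMb.foreach(mb => println(s"would set PYSPARK_EXECUTOR_MEMORY_MB=$mb"))
  }
}
```

With the decision made once in the JVM like this, the worker-side check could simply be dropped.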