Github user HyukjinKwon commented on a diff in the pull request:
https://github.com/apache/spark/pull/23055#discussion_r236926113
--- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
@@ -74,8 +74,13 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
private val reuseWorker = conf.getBoolean("spark.python.worker.reuse", true)
// each python worker gets an equal part of the allocation. the worker pool will grow to the
// number of concurrent tasks, which is determined by the number of cores in this executor.
- private val memoryMb = conf.get(PYSPARK_EXECUTOR_MEMORY)
+ private val memoryMb = if (Utils.isWindows) {
--- End diff ---
I don't think I'm the only one, though.
> Why is this code needed?
I explained this above multiple times. See above.
> it's not doing anything useful if you keep the python check around
See above. I wanted to delete them, but added them per a review comment.
> The JVM doesn't understand exactly what Python supports or not; it's better to let the python code decide that.
Not really. The `resource` module is a Python built-in module that exists on Unix-based systems; it just does not exist on Windows. So the JVM can tell from the OS alone whether the limit can be enforced.
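To make that concrete, here is a minimal, self-contained sketch of the idea in the diff above (not the actual patch; `MemoryLimitGate` and `effectiveMemoryMb` are illustrative names standing in for the real `Utils.isWindows` and `conf.get(PYSPARK_EXECUTOR_MEMORY)`):

```scala
// Sketch only: whether Python's `resource` module exists is purely a
// function of the OS, so the JVM side can decide with a platform check.
object MemoryLimitGate {
  // Stand-in for Spark's Utils.isWindows.
  private def isWindows: Boolean =
    System.getProperty("os.name").toLowerCase.startsWith("windows")

  // Stand-in for `conf.get(PYSPARK_EXECUTOR_MEMORY)`, an optional MiB value.
  def effectiveMemoryMb(configured: Option[Long]): Option[Long] =
    if (isWindows) None // `resource` does not exist on Windows; skip the limit
    else configured

  def main(args: Array[String]): Unit =
    println(effectiveMemoryMb(Some(512))) // Some(512) on Unix, None on Windows
}
```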
> You say we should disable the feature on Windows. The python-side changes already do that.
I explained this above. See the first change I proposed, 2d3315a. It relies on the environment variable.
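For illustration, a hedged sketch of what relying on the environment variable looks like (the variable name and helper are assumed here, not quoted from the patch): the JVM decides whether a limit applies and, only if it does, hands it to the worker, so the Python side never has to guess.

```scala
import scala.collection.mutable

object WorkerEnvRelay {
  // Sketch: env var name assumed for illustration, not quoted from the patch.
  def addMemoryLimit(envVars: mutable.Map[String, String],
                     memoryMb: Option[Long]): Unit =
    // On Windows the JVM leaves memoryMb empty, the variable is never
    // set, and the Python side has nothing to act on.
    memoryMb.foreach(mb => envVars.put("PYSPARK_EXECUTOR_MEMORY_MB", mb.toString))

  def main(args: Array[String]): Unit = {
    val env = mutable.Map.empty[String, String]
    addMemoryLimit(env, Some(512))
    println(env) // Map(PYSPARK_EXECUTOR_MEMORY_MB -> 512)
  }
}
```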
> We should not remove the extra memory requested from the resource manager just because you're running on Windows - you'll still need that memory, you'll just get a different error message if you end up using more than you requested.
Yeah, I know it would probably work. My question is: has it ever been tested? One failure case was already found, and it looks a bit odd to document that it works when it's not even tested. Shall we keep it simple, rather than have it work differently, until it's tested?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]