Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/21977
Okay, workers are tracked in `PythonWorkerFactory`. Its `create` method
returns an idle worker if one is available. When a task finishes, it calls
`SparkEnv.releasePythonWorker`, which calls `releaseWorker` on the factory to
return the worker socket to the list of idle workers. So the number of workers
should grow to match the number of concurrent tasks.
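To make the pooling behavior concrete, here is a minimal Scala sketch of that pattern. It is not the actual `PythonWorkerFactory` code; the names `SimpleWorkerPool` and `spawnWorker` are hypothetical and used only for illustration.

```scala
import java.net.Socket
import scala.collection.mutable

// Minimal sketch of the idle-worker pooling pattern described above.
// NOT the real PythonWorkerFactory; names here are illustrative only.
class SimpleWorkerPool(host: String, port: Int) {

  // Workers returned by finished tasks, available for reuse.
  private val idleWorkers = mutable.Queue.empty[Socket]

  // Called when a task needs a worker: reuse an idle one if present,
  // otherwise start a new one. The pool therefore grows until it
  // matches the peak number of concurrent tasks.
  def create(): Socket = synchronized {
    if (idleWorkers.nonEmpty) idleWorkers.dequeue()
    else spawnWorker()
  }

  // Called when a task finishes: hand the worker's socket back so the
  // next task can reuse it instead of starting a new process.
  def releaseWorker(worker: Socket): Unit = synchronized {
    idleWorkers.enqueue(worker)
  }

  // Stand-in for launching a new worker process and connecting to it.
  private def spawnWorker(): Socket = new Socket(host, port)
}
```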
I'll update this to always divide by the number of cores.
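For illustration only (the variable names below are assumptions, not the PR's actual code), "always divide by the number of cores" amounts to giving each Python worker an equal share of the executor's Python memory:

```scala
// Hypothetical example of dividing the Python memory budget per core.
val pysparkMemoryMb = 4096                               // executor-wide Python memory budget
val executorCores = 4                                    // concurrent tasks per executor
val memoryPerWorkerMb = pysparkMemoryMb / executorCores  // 1024 MB per worker
```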