Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20151#discussion_r160864049
  
    --- Diff: 
core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala ---
    @@ -34,17 +34,39 @@ private[spark] class PythonWorkerFactory(pythonExec: 
String, envVars: Map[String
     
       import PythonWorkerFactory._
     
    -  // Because forking processes from Java is expensive, we prefer to launch a single Python daemon
    -  // (pyspark/daemon.py) and tell it to fork new workers for our tasks. This daemon currently
    -  // only works on UNIX-based systems now because it uses signals for child management, so we can
    -  // also fall back to launching workers (pyspark/worker.py) directly.
    +  // Because forking processes from Java is expensive, we prefer to launch a single Python
    +  // daemon, pyspark/daemon.py (by default), and tell it to fork new workers for our tasks.
    +  // This daemon currently only works on UNIX-based systems because it uses signals for
    +  // child management, so we can also fall back to launching workers, pyspark/worker.py
    +  // (by default), directly.
       val useDaemon = {
         val useDaemonEnabled = SparkEnv.get.conf.getBoolean("spark.python.use.daemon", true)
     
         // This flag is ignored on Windows as it's unable to fork.
         !System.getProperty("os.name").startsWith("Windows") && useDaemonEnabled
       }
     
    +  // WARN: Both configurations, 'spark.python.daemon.module' and 'spark.python.worker.module',
    +  // are experimental and intended for very advanced users. Each should be considered an
    +  // expert-only option, and shouldn't be used before knowing what it means exactly.
    +
    +  // This configuration specifies the module to run as the daemon that executes Python workers.
    +  val daemonModule = SparkEnv.get.conf.getOption("spark.python.daemon.module").map { value =>
    --- End diff --
    
    Hm, actually we could also check whether it's an empty string. But I wrote 
"shouldn't be used before knowing what it means exactly." above, so I think 
it's fine.
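
    To illustrate, the empty-string check mentioned above could look roughly like 
the sketch below. Note this is a hypothetical helper, not the actual 
`PythonWorkerFactory` code: the object name, method name, and the 
`pyspark.daemon` fallback value are illustrative assumptions.

    ```scala
    // Hypothetical sketch of an empty-string guard for a module-name config:
    // an unset, empty, or whitespace-only value falls back to the default.
    object DaemonModuleConf {
      def resolveDaemonModule(confValue: Option[String]): String =
        confValue.map(_.trim).filter(_.nonEmpty).getOrElse("pyspark.daemon")
    }
    ```

    With a guard like this, `Some("")` and `Some("  ")` behave the same as an 
unset config, instead of silently producing an invalid module name.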


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org