Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21977#discussion_r211763465
  
    --- Diff: 
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
    @@ -60,14 +61,20 @@ private[spark] object PythonEvalType {
      */
     private[spark] abstract class BasePythonRunner[IN, OUT](
         funcs: Seq[ChainedPythonFunctions],
    -    bufferSize: Int,
    -    reuseWorker: Boolean,
         evalType: Int,
    -    argOffsets: Array[Array[Int]])
    +    argOffsets: Array[Array[Int]],
    +    conf: SparkConf)
    --- End diff --
    
    You're right, we could do that. Originally, I thought it would be a good 
idea to pass in the right config, but that isn't possible: if we use a config 
at all, it has to be `SparkEnv.get.conf`. The alternative is to go back to 
spreading the config logic into every RDD or Exec node, reading the conf on 
the driver and serializing the values to make them available on the 
executors, which is ugly.
    
    I'd prefer just using `SparkEnv.get.conf` in `PythonRunner`.
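    To illustrate the tradeoff, here is a minimal, self-contained Scala 
sketch. The `SparkConf`/`SparkEnv` classes below are simplified stand-ins, 
not the real Spark API, and the config keys shown are only examples: the 
point is that a runner constructed on the executor can read its settings 
from the process-wide env, so callers no longer have to thread `bufferSize` 
and `reuseWorker` through every constructor or serialize them from the 
driver.

    ```scala
    object ConfSketch {
      // Hypothetical stand-in for SparkConf: string settings with typed getters.
      final class SparkConf(settings: Map[String, String]) {
        def getInt(key: String, default: Int): Int =
          settings.get(key).map(_.toInt).getOrElse(default)
        def getBoolean(key: String, default: Boolean): Boolean =
          settings.get(key).map(_.toBoolean).getOrElse(default)
      }

      // Hypothetical stand-in for SparkEnv.get: a process-wide handle to the
      // active conf, available on both the driver and the executors.
      object SparkEnv {
        var get: SparkConf = new SparkConf(Map.empty)
      }

      // Runner that reads its settings from the env at construction time,
      // instead of receiving them as constructor parameters.
      class BasePythonRunner {
        private val conf = SparkEnv.get
        val bufferSize: Int = conf.getInt("spark.buffer.size", 65536)
        val reuseWorker: Boolean =
          conf.getBoolean("spark.python.worker.reuse", true)
      }

      def main(args: Array[String]): Unit = {
        SparkEnv.get = new SparkConf(Map("spark.buffer.size" -> "131072"))
        val runner = new BasePythonRunner
        println(runner.bufferSize)   // read from the env, not a constructor arg
        println(runner.reuseWorker)  // falls back to the default
      }
    }
    ```

    The cost of this approach is a hidden dependency on the global env, but 
it keeps the config plumbing in one place rather than repeated in every 
call site.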


---
