Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21977#discussion_r211763465
  
    --- Diff: 
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala ---
    @@ -60,14 +61,20 @@ private[spark] object PythonEvalType {
      */
     private[spark] abstract class BasePythonRunner[IN, OUT](
         funcs: Seq[ChainedPythonFunctions],
    -    bufferSize: Int,
    -    reuseWorker: Boolean,
         evalType: Int,
    -    argOffsets: Array[Array[Int]])
    +    argOffsets: Array[Array[Int]],
    +    conf: SparkConf)
    --- End diff --
    
    You're right, we could do that. Originally, I thought it would be a good 
idea to pass in the right config, but that isn't possible: if we use a config 
at all, it has to be `SparkEnv.get.conf`. The alternative is to go back to 
spreading the config logic into every RDD or Exec node, reading the conf on 
the driver and serializing the values to make them available on the 
executors, which is ugly.
    
    I'd prefer just using `SparkEnv.get.conf` in `PythonRunner`.
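    To illustrate the tradeoff, here is a minimal, self-contained Scala 
sketch. The `SparkConf`/`SparkEnv` classes below are simplified stand-ins, 
not the real Spark API, and the config keys shown are only examples: the 
point is that a runner constructed on the executor can read its settings 
from the process-wide env, so callers no longer have to thread `bufferSize` 
and `reuseWorker` through every constructor or serialize them from the 
driver.

    ```scala
    object ConfSketch {
      // Hypothetical stand-in for SparkConf: string settings with typed getters.
      final class SparkConf(settings: Map[String, String]) {
        def getInt(key: String, default: Int): Int =
          settings.get(key).map(_.toInt).getOrElse(default)
        def getBoolean(key: String, default: Boolean): Boolean =
          settings.get(key).map(_.toBoolean).getOrElse(default)
      }

      // Hypothetical stand-in for SparkEnv.get: a process-wide handle to the
      // active conf, available on both the driver and the executors.
      object SparkEnv {
        var get: SparkConf = new SparkConf(Map.empty)
      }

      // Runner that reads its settings from the env at construction time,
      // instead of receiving them as constructor parameters.
      class BasePythonRunner {
        private val conf = SparkEnv.get
        val bufferSize: Int = conf.getInt("spark.buffer.size", 65536)
        val reuseWorker: Boolean =
          conf.getBoolean("spark.python.worker.reuse", true)
      }

      def main(args: Array[String]): Unit = {
        SparkEnv.get = new SparkConf(Map("spark.buffer.size" -> "131072"))
        val runner = new BasePythonRunner
        println(runner.bufferSize)   // read from the env, not a constructor arg
        println(runner.reuseWorker)  // falls back to the default
      }
    }
    ```

    The cost of this approach is a hidden dependency on the global env, but 
it keeps the config plumbing in one place rather than repeated in every 
call site.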


---
