[GitHub] spark pull request #22104: [SPARK-24721][SQL] Exclude Python UDFs filters in...

icexelloss Tue, 21 Aug 2018 12:38:20 -0700

Github user icexelloss commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22104#discussion_r211733007
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/EvalPythonExec.scala ---
    @@ -117,15 +117,18 @@ abstract class EvalPythonExec(udfs: Seq[PythonUDF], output: Seq[Attribute], chil
               }
             }.toArray
           }.toArray
    -      val projection = newMutableProjection(allInputs, child.output)
    +
    +      // Project input rows to unsafe row so we can put it in the row queue
    +      val unsafeProjection = UnsafeProjection.create(child.output, child.output)
    --- End diff --
    
    Friendly ping @cloud-fan. Do you think forcing an unsafe projection here to 
    deal with non-unsafe rows from data sources is correct? Is there a way to 
    know whether the child nodes output unsafe rows, so that we can avoid an 
    unnecessary unsafe projection here?
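
    To make the question concrete, one possible alternative is a per-row type 
    check that only projects rows that are not already `UnsafeRow`. This is a 
    rough sketch, not code from the PR: the helper name `toUnsafeRows` is 
    hypothetical, and it assumes the consumer copies each row before the next 
    one is produced (the projection reuses its output buffer), as a row queue 
    that copies the row bytes would.

```scala
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.{Attribute, UnsafeProjection, UnsafeRow}

// Hypothetical helper (not part of the PR): pass UnsafeRow instances through
// untouched and project everything else into the unsafe format.
def toUnsafeRows(
    rows: Iterator[InternalRow],
    output: Seq[Attribute]): Iterator[UnsafeRow] = {
  // Created lazily so no projection is built if every row is already unsafe.
  lazy val unsafeProjection = UnsafeProjection.create(output, output)
  rows.map {
    case unsafe: UnsafeRow => unsafe
    // Assumption: the caller consumes or copies the result before the next
    // row is requested, because the projection reuses one output buffer.
    case other => unsafeProjection(other)
  }
}
```

    The trade-off is a type check per row versus an unconditional projection; 
    whether that is cheaper in practice would need measuring.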



