Github user juliuszsompolski commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23127#discussion_r236391673

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala ---
    @@ -406,14 +415,62 @@ trait BlockingOperatorWithCodegen extends CodegenSupport {
       override def limitNotReachedChecks: Seq[String] = Nil
     }

    +/**
    + * Leaf codegen node reading from a single RDD.
    + */
    +trait InputRDDCodegen extends CodegenSupport {
    +
    +  def inputRDD: RDD[InternalRow]
    +
    +  // If the input is an RDD of InternalRow which are potentially not UnsafeRow,
    +  // and there is no parent to consume it, it needs an UnsafeProjection.
    +  protected val createUnsafeProjection: Boolean = (parent == null)
    +
    +  override def inputRDDs(): Seq[RDD[InternalRow]] = {
    +    inputRDD :: Nil
    +  }
    +
    +  override def doProduce(ctx: CodegenContext): String = {
    --- End diff --

    The new one should behave the same as the previous `RowDataSourceScanExec.doProduce` and `RDDScanExec.doProduce` when `createUnsafeProjection == true`, and the same as the previous `InputAdapter.doProduce` and `LocalTableScanExec.doProduce` when `createUnsafeProjection == false`.

    From the fact that `InputAdapter` did not do an explicit unsafe projection, even though its input could be InternalRows that are not UnsafeRows, I derived the assumption that it is safe to skip the projection as long as there is a parent operator. This assumes that the parent operator will always cause an UnsafeProjection to be added eventually, and hence the output of the WholeStageCodegen will be UnsafeRows.
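To illustrate the branching being discussed, here is a minimal plain-Scala sketch (hypothetical names, no Spark dependencies) of how a unified `doProduce` could emit different generated-code bodies depending on `createUnsafeProjection`: with the flag set, each row is run through a projection before being appended; without it, rows are forwarded as-is, relying on a parent operator to project them eventually. This is only an assumed shape of the generated loop, not the actual Spark implementation.

```scala
// Hypothetical sketch of the code-generation branching described in the
// review comment. doProduceSketch returns the Java source fragment that a
// unified InputRDDCodegen.doProduce might emit.
object InputRDDCodegenSketch {
  def doProduceSketch(createUnsafeProjection: Boolean): String = {
    val input = "inputIterator" // hypothetical variable name in generated code
    if (createUnsafeProjection) {
      // Leaf node with no parent: input rows may not be UnsafeRows, so the
      // generated loop projects each row before appending it to the output.
      s"""while ($input.hasNext()) {
         |  InternalRow row = (InternalRow) $input.next();
         |  append(unsafeProjection.apply(row));
         |}""".stripMargin
    } else {
      // A parent operator exists and is assumed to eventually add an
      // UnsafeProjection, so rows are consumed without projecting here.
      s"""while ($input.hasNext()) {
         |  InternalRow row = (InternalRow) $input.next();
         |  append(row);
         |}""".stripMargin
    }
  }

  def main(args: Array[String]): Unit = {
    println(doProduceSketch(createUnsafeProjection = true))
    println(doProduceSketch(createUnsafeProjection = false))
  }
}
```

The point of the sketch is that the two branches mirror the previous `RDDScanExec`-style produce (projecting) and the previous `InputAdapter`-style produce (pass-through), so folding both into one trait preserves the earlier behavior in each case.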