[GitHub] spark pull request #20153: [SPARK-22392][SQL] data source v2 columnar batch ...

gatorsmile Fri, 12 Jan 2018 07:37:49 -0800

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20153#discussion_r161252949
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryTableScanExec.scala ---
    @@ -90,14 +92,56 @@ case class InMemoryTableScanExec(
         columnarBatch
       }
     
    -  override def inputRDDs(): Seq[RDD[InternalRow]] = {
    -    assert(supportCodegen)
    +  private lazy val inputRDD: RDD[InternalRow] = {
         val buffers = filteredCachedBatches()
    -    // HACK ALERT: This is actually an RDD[ColumnarBatch].
    -    // We're taking advantage of Scala's type erasure here to pass these batches along.
    -    Seq(buffers.map(createAndDecompressColumn(_)).asInstanceOf[RDD[InternalRow]])
    +    if (supportsBatch) {
    +      // HACK ALERT: This is actually an RDD[ColumnarBatch].
    +      // We're taking advantage of Scala's type erasure here to pass these batches along.
    +      buffers.map(createAndDecompressColumn).asInstanceOf[RDD[InternalRow]]
    +    } else {
    +      val numOutputRows = longMetric("numOutputRows")
    +
    +      if (enableAccumulators) {
    --- End diff --
    
    The name of this conf is really confusing. Maybe rename it to `enableAccumulatorsForTestingOnly`?
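
For readers unfamiliar with the "HACK ALERT" comment in the diff above, here is a minimal sketch of the type-erasure trick it describes. It deliberately does not use Spark: `FakeRow`, `FakeBatch`, and `ErasureSketch` are hypothetical stand-ins invented for illustration, and a plain `Seq` plays the role of the RDD. The point is only that a cast between two instantiations of the same generic type is unchecked at runtime, so the mismatch surfaces when an element is used as the declared type, not at the cast itself.

```scala
// Hypothetical stand-ins for InternalRow and ColumnarBatch (not Spark classes).
final case class FakeRow(values: Seq[Any])
final case class FakeBatch(rows: Seq[FakeRow])

object ErasureSketch {
  // The declared element type is FakeRow, but every element is really a FakeBatch.
  // The cast succeeds because the element type parameter is erased at runtime.
  def disguise(batches: Seq[FakeBatch]): Seq[FakeRow] =
    batches.asInstanceOf[Seq[FakeRow]]

  // A consumer that knows about the trick casts back before touching elements.
  def recover(rows: Seq[FakeRow]): Seq[FakeBatch] =
    rows.asInstanceOf[Seq[FakeBatch]]
}

object ErasureDemo {
  def main(args: Array[String]): Unit = {
    val batches = Seq(FakeBatch(Seq(FakeRow(Seq(1, 2)))))
    val disguised = ErasureSketch.disguise(batches)

    // Operations that never look at an element as a FakeRow are safe.
    println(disguised.size)

    // Recovering the real type works because the objects never changed.
    println(ErasureSketch.recover(disguised).head.rows.head.values)

    // Treating an element as a FakeRow fails only at the point of use.
    try {
      val row: FakeRow = disguised.head
      println(row.values)
    } catch {
      case _: ClassCastException => println("element is not actually a FakeRow")
    }
  }
}
```

In this sketch the cast is only safe when every consumer of the disguised collection knows to cast back before reading elements, which is presumably why the diff guards the same pattern behind `supportsBatch`.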



