Github user jose-torres commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21029#discussion_r182267906
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala ---
    @@ -95,21 +77,29 @@ case class DataSourceV2ScanExec(
              sparkContext.getLocalProperty(ContinuousExecution.EPOCH_COORDINATOR_ID_KEY),
              sparkContext.env)
            .askSync[Unit](SetReaderPartitions(readerFactories.size))
    -      new ContinuousDataSourceRDD(sparkContext, sqlContext, readerFactories)
    -        .asInstanceOf[RDD[InternalRow]]
    -
    -    case r: SupportsScanColumnarBatch if r.enableBatchRead() =>
    -      new DataSourceRDD(sparkContext, batchReaderFactories).asInstanceOf[RDD[InternalRow]]
    -
    +      if (readerFactories.exists(_.dataFormat() == DataFormat.COLUMNAR_BATCH)) {
    +        throw new IllegalArgumentException(
    +          "continuous stream reader does not support columnar read yet.")
    --- End diff ---
    
    I've thought about this further. Shouldn't it be trivial to write a wrapper that simply converts a DataReader[ColumnarBatch] to a DataReader[InternalRow]? If so, then we can easily support it after the current PR.
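    
    For concreteness, a minimal sketch of what such a wrapper could look like, assuming the current DataSourceV2 DataReader interface and ColumnarBatch.rowIterator(); the RowDataReaderAdapter name is purely illustrative and not part of this PR:
    
        import org.apache.spark.sql.catalyst.InternalRow
        import org.apache.spark.sql.sources.v2.reader.DataReader
        import org.apache.spark.sql.vectorized.ColumnarBatch
    
        // Hypothetical adapter: flattens each ColumnarBatch produced by the
        // wrapped reader into its per-row iterator.
        class RowDataReaderAdapter(batchReader: DataReader[ColumnarBatch])
          extends DataReader[InternalRow] {
    
          // Rows of the batch currently being consumed; empty until the first
          // batch is fetched from the underlying reader.
          private var rowIter: java.util.Iterator[InternalRow] =
            java.util.Collections.emptyIterator[InternalRow]()
    
          override def next(): Boolean = {
            // Advance to the next batch whenever the current one is exhausted.
            while (!rowIter.hasNext) {
              if (!batchReader.next()) return false
              rowIter = batchReader.get().rowIterator()
            }
            true
          }
    
          override def get(): InternalRow = rowIter.next()
    
          override def close(): Unit = batchReader.close()
        }
    
    The adapter only tracks the row iterator of the batch it is currently draining, so the conversion is a pure flattening step with no buffering of its own.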


---
