Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22009#discussion_r208638600
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ScanExec.scala ---
@@ -93,21 +81,17 @@ case class DataSourceV2ScanExec(
         sparkContext,
         sqlContext.conf.continuousStreamingExecutorQueueSize,
         sqlContext.conf.continuousStreamingExecutorPollIntervalMs,
-        partitions).asInstanceOf[RDD[InternalRow]]
-
-    case r: SupportsScanColumnarBatch if r.enableBatchRead() =>
-      new DataSourceRDD(sparkContext, batchPartitions).asInstanceOf[RDD[InternalRow]]
+        partitions,
+        schema,
+        partitionReaderFactory.asInstanceOf[ContinuousPartitionReaderFactory])
--- End diff ---
However you want to do it is fine with me, but I've seen excessive casting in the SQL back-end, so I'm against adding casts where they aren't necessary, as in this case.
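As a minimal sketch of the alternative, using hypothetical simplified traits rather than Spark's actual interfaces: if the continuous-scan path is declared with the specific factory type it actually produces, the call site needs no `asInstanceOf` at all.

```scala
// Hypothetical stand-ins for the real Spark interfaces, for illustration only.
trait PartitionReaderFactory
trait ContinuousPartitionReaderFactory extends PartitionReaderFactory

// Cast required: the field is typed too generally for the continuous path.
class ScanWithCast(val readerFactory: PartitionReaderFactory) {
  def continuousFactory: ContinuousPartitionReaderFactory =
    readerFactory.asInstanceOf[ContinuousPartitionReaderFactory]
}

// No cast: the continuous scan exposes the specific type it creates.
class ScanWithoutCast(val readerFactory: ContinuousPartitionReaderFactory) {
  def continuousFactory: ContinuousPartitionReaderFactory = readerFactory
}
```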