cloud-fan commented on a change in pull request #25679: [SPARK-28974][SQL]
centralize the Data Source V2 table capability checks
URL: https://github.com/apache/spark/pull/25679#discussion_r320766615
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/TableCapabilityCheck.scala
##########
@@ -18,25 +18,59 @@
package org.apache.spark.sql.execution.datasources.v2
import org.apache.spark.sql.AnalysisException
-import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.catalyst.expressions.Literal
+import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LogicalPlan, OverwriteByExpression, OverwritePartitionsDynamic}
import org.apache.spark.sql.execution.streaming.{StreamingRelation, StreamingRelationV2}
-import org.apache.spark.sql.sources.v2.TableCapability.{CONTINUOUS_READ, MICRO_BATCH_READ}
+import org.apache.spark.sql.sources.v2.TableCapability._
+import org.apache.spark.sql.types.BooleanType
/**
- * This rules adds some basic table capability check for streaming scan, without knowing the actual
- * streaming execution mode.
+ * Checks the capabilities of Data Source V2 tables, and fail problematic queries earlier.
*/
-object V2StreamingScanSupportCheck extends (LogicalPlan => Unit) {
+object TableCapabilityCheck extends (LogicalPlan => Unit) {
import DataSourceV2Implicits._
+ private def failAnalysis(msg: String): Unit = throw new AnalysisException(msg)
+
override def apply(plan: LogicalPlan): Unit = {
- plan.foreach {
+ plan foreach {
+ case r: DataSourceV2Relation if !r.table.supports(BATCH_READ) =>
Review comment:
Here I add the batch scan check. A table can implement `SupportsRead`
without reporting the `BATCH_READ` capability, for example a streaming table
that doesn't support batch scan. We must therefore check the `BATCH_READ`
capability here, instead of relying on the `.isInstanceOf[SupportsRead]` check
on the planner side.
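
To illustrate the difference between the two checks, here is a minimal,
self-contained sketch. It does not use the real Spark classes: `Capability`,
`Table`, `SupportsRead`, `StreamOnlyTable`, `BatchTable`, and the two check
functions are simplified, hypothetical stand-ins written only for this example.

```scala
// Simplified stand-ins for the DSv2 types; NOT the real Spark API.
object CapabilityCheckSketch {
  sealed trait Capability
  case object BatchRead extends Capability
  case object MicroBatchRead extends Capability

  trait Table {
    def name: String
    def capabilities: Set[Capability]
    def supports(c: Capability): Boolean = capabilities.contains(c)
  }

  // Mix-in that only says "this table can build a scan", not in which mode.
  trait SupportsRead extends Table

  // A streaming table: readable, but it never reports BatchRead.
  final class StreamOnlyTable extends SupportsRead {
    val name = "stream_only"
    val capabilities: Set[Capability] = Set(MicroBatchRead)
  }

  final class BatchTable extends SupportsRead {
    val name = "batch"
    val capabilities: Set[Capability] = Set(BatchRead)
  }

  // Type-based check: passes for any readable table, including stream-only ones.
  def typeCheckPasses(t: Table): Boolean = t.isInstanceOf[SupportsRead]

  // Capability-based check: yields an error message when batch scan is unsupported.
  def batchReadError(t: Table): Option[String] =
    if (t.supports(BatchRead)) None
    else Some(s"Table ${t.name} does not support batch scan.")

  def main(args: Array[String]): Unit = {
    Seq(new StreamOnlyTable, new BatchTable).foreach { t =>
      println(s"${t.name}: typeCheck=${typeCheckPasses(t)}, error=${batchReadError(t)}")
    }
  }
}
```

In this sketch the stream-only table passes the type-based check but is
rejected by the capability-based one, which is the case the new rule is meant
to catch.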