[GitHub] [spark] cloud-fan commented on a change in pull request #25955: [SPARK-29277][SQL] Add early DSv2 filter and projection pushdown

GitBox Mon, 21 Oct 2019 20:24:23 -0700

cloud-fan commented on a change in pull request #25955: [SPARK-29277][SQL] Add 
early DSv2 filter and projection pushdown
URL: https://github.com/apache/spark/pull/25955#discussion_r337318929


 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
 ##########
 @@ -243,17 +247,36 @@ class FindDataSourceTable(sparkSession: SparkSession) extends Rule[LogicalPlan]
   override def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
     case i @ InsertIntoStatement(UnresolvedCatalogRelation(tableMeta), _, _, _, _)
         if DDLUtils.isDatasourceTable(tableMeta) =>
-      i.copy(table = readDataSourceTable(tableMeta))
+      if (DataSource.isV2Provider(tableMeta.provider.get, sparkSession.sessionState.conf)) {
 
 Review comment:
   I see the problem now. The table lookup for SELECT/INSERT is more 
complicated than I thought:
   1. Try to look up a temp view first.
   2. Look up the table/view. If it's a table from the session catalog, we 
should create a v1 relation if the table provider is v1, and a v2 relation 
otherwise.
   
   In fact, we rely on the order of `ResolveTables` and `ResolveRelations`, 
which is pretty bad and violates the design of Catalyst: the rules in one batch 
should be order-insensitive.
   
   This fix does resolve the problem: even if we mistakenly resolve to a v1 
relation, we still have a chance to correct it to a v2 relation. But I think 
it's better to fix the root cause: `ResolveTables` and `ResolveRelations` are 
order-sensitive.
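
The lookup order described above can be sketched as a single pass, so the result does not depend on which of two separate rules runs first. This is only a toy model under stated assumptions: `Relation`, `TempView`, `V1Relation`, `V2Relation`, `TableMeta`, `LookupSketch.resolve`, and the `isV2Provider` predicate parameter are hypothetical names for illustration, not Spark classes or the actual analyzer rules.

```scala
// Toy model only: these types are hypothetical stand-ins, not Spark classes.
sealed trait Relation
case class TempView(name: String) extends Relation
case class V1Relation(name: String, provider: String) extends Relation
case class V2Relation(name: String, provider: String) extends Relation

// Hypothetical, simplified stand-in for session catalog table metadata.
case class TableMeta(name: String, provider: String)

object LookupSketch {
  // Resolve a name in one pass:
  //   1. temp views win over everything else;
  //   2. otherwise consult the session catalog and pick v1 or v2
  //      based on the table's provider.
  // `isV2Provider` is an assumed predicate supplied by the caller.
  def resolve(
      name: String,
      tempViews: Set[String],
      sessionCatalog: Map[String, TableMeta],
      isV2Provider: String => Boolean): Option[Relation] = {
    if (tempViews.contains(name)) {
      Some(TempView(name))
    } else {
      sessionCatalog.get(name).map { meta =>
        if (isV2Provider(meta.provider)) V2Relation(meta.name, meta.provider)
        else V1Relation(meta.name, meta.provider)
      }
    }
  }
}
```

Because both steps live in one function, there is no intermediate v1 relation that a later rule would have to correct to v2.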
