[GitHub] [spark] cloud-fan commented on a change in pull request #32807: [SPARK-35669][SQL] Fix special char in CSV header with filter pushdown

GitBox Mon, 07 Jun 2021 22:01:45 -0700


cloud-fan commented on a change in pull request #32807:
URL: https://github.com/apache/spark/pull/32807#discussion_r647111703




##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
##########
@@ -699,20 +699,25 @@ abstract class PushableColumnBase {
 
   def unapply(e: Expression): Option[String] = {
     import org.apache.spark.sql.connector.catalog.CatalogV2Implicits.MultipartIdentifierHelper
-    def helper(e: Expression): Option[Seq[String]] = e match {
-      case a: Attribute =>
-        // Attribute that contains dot "." in name is supported only when
-        // nested predicate pushdown is enabled.
-        if (nestedPredicatePushdownEnabled || !a.name.contains(".")) {
-          Some(Seq(a.name))
-        } else {
-          None
-        }
-      case s: GetStructField if nestedPredicatePushdownEnabled =>
-        helper(s.child).map(_ :+ s.childSchema(s.ordinal).name)
-      case _ => None
+    if (nestedPredicatePushdownEnabled) {

Review comment:
   Note that:
   1. nestedPredicatePushdownEnabled is always enabled for DS v2 (by default)
   2. nestedPredicatePushdownEnabled is never enabled for DS v1
   3. nestedPredicatePushdownEnabled is enabled only for the Parquet and ORC file sources (by default)

   After changing the quoting logic:
   1. DS v1 is not affected.
   2. File sources are built in, so we are fine.
   3. DS v2 will be affected if the column name contains special chars.

   Personally, I think the new quoting behavior is better (closer to ANSI SQL), and most v2 implementations won't be affected, as they already need to deal with quoted names.
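
   To make the effect on DS v2 concrete, here is a minimal sketch of what quoting a multi-part column name could look like. It is an approximation for illustration only, not the code in this PR or Spark's actual helper: the object name `QuotingSketch`, both function names, and the exact "plain identifier" rule are assumptions.

```scala
// Illustrative sketch only; names and the identifier rule are hypothetical.
object QuotingSketch {
  // Leave plain identifiers untouched; wrap anything else in backticks,
  // doubling any backtick that appears inside the name part.
  def quoteIfNeeded(part: String): String =
    if (part.matches("[a-zA-Z_][a-zA-Z0-9_]*")) part
    else "`" + part.replace("`", "``") + "`"

  // Join the (possibly quoted) parts of a nested column reference with dots.
  def quoted(parts: Seq[String]): String =
    parts.map(quoteIfNeeded).mkString(".")
}
```

   Under these assumptions, the nested field `Seq("a", "b")` is rendered as `a.b`, while a single top-level column literally named `a.b` is rendered with backticks around it. A v2 source that receives the pushed-down name can then tell the two cases apart, which is not possible with an unquoted string.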




