cloud-fan commented on code in PR #39667: URL: https://github.com/apache/spark/pull/39667#discussion_r1097264613
########## sql/core/src/main/scala/org/apache/spark/sql/jdbc/JdbcSQLQueryBuilder.scala: ########## @@ -0,0 +1,142 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.jdbc + +import org.apache.spark.sql.connector.expressions.filter.Predicate +import org.apache.spark.sql.execution.datasources.jdbc.{JDBCOptions, JDBCPartition} +import org.apache.spark.sql.execution.datasources.v2.TableSampleInfo + +/** + * The builder to build a single SELECT query. + * + * Note: All the `withXXX` methods will be invoked at most once. The invocation order does not + * matter, as all these clauses follow the natural SQL order: sample the table first, then filter, + * then group by, then sort, then offset, then limit. + * + * @since 3.4.0 + */ +class JdbcSQLQueryBuilder(dialect: JdbcDialect, options: JDBCOptions) { + + /** + * `columns`, but as a String suitable for injection into a SQL query. + */ + protected var columnList: String = "1" + + /** + * A WHERE clause representing both `filters`, if any, and the current partition. + */ + protected var whereClause: String = "" + + /** + * A GROUP BY clause representing pushed-down grouping columns. 
+ */ + protected var groupByClause: String = "" + + /** + * An ORDER BY clause representing pushed-down sort of top n. + */ + protected var orderByClause: String = "" + + /** + * A LIMIT value representing pushed-down limit. + */ + protected var limit: Int = -1 + + /** + * An OFFSET value representing pushed-down offset. + */ + protected var offset: Int = -1 + + /** + * A table sample clause representing pushed-down table sample. + */ + protected var tableSampleClause: String = "" + + def withColumns(columns: Array[String]): JdbcSQLQueryBuilder = { + if (columns.nonEmpty) { + columnList = columns.mkString(",") + } + this + } + + def withPredicates(predicates: Array[Predicate], part: JDBCPartition): JdbcSQLQueryBuilder = { + // `filters`, but as a WHERE clause suitable for injection into a SQL query. + val filterWhereClause: String = { + predicates.flatMap(dialect.compileExpression(_)).map(p => s"($p)").mkString(" AND ") + } + + // A WHERE clause representing both `filters`, if any, and the current partition. + whereClause = if (part.whereClause != null && filterWhereClause.length > 0) { + "WHERE " + s"($filterWhereClause)" + " AND " + s"(${part.whereClause})" + } else if (part.whereClause != null) { + "WHERE " + part.whereClause + } else if (filterWhereClause.length > 0) { + "WHERE " + filterWhereClause + } else { + "" + } + + this + } + + def withGroupByColumns(groupByColumns: Option[Array[String]]): JdbcSQLQueryBuilder = { + if (groupByColumns.nonEmpty && groupByColumns.get.nonEmpty) { + // The GROUP BY columns should already be quoted on the caller side. 
+ groupByClause = s"GROUP BY ${groupByColumns.get.mkString(", ")}" + } + + this + } + + def withSortOrders(sortOrders: Array[String]): JdbcSQLQueryBuilder = { + if (sortOrders.nonEmpty) { + orderByClause = s" ORDER BY ${sortOrders.mkString(", ")}" + } + + this + } + + def withLimit(limit: Int): JdbcSQLQueryBuilder = { + this.limit = limit + + this + } + + def withOffset(offset: Int): JdbcSQLQueryBuilder = { + this.offset = offset + + this + } + + def withTableSample(sample: Option[TableSampleInfo]): JdbcSQLQueryBuilder = { Review Comment: I don't think the parameter type should be `Option`. Spark should not invoke `withTableSample` at all if there is no table sample in the query. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
