[spark] branch master updated: [SPARK-43205][DOC] identifier clause docs

wenchen Thu, 17 Aug 2023 18:27:06 -0700

This is an automated email from the ASF dual-hosted git repository.

wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/master by this push:
     new 02a07cd6adc [SPARK-43205][DOC] identifier clause docs
02a07cd6adc is described below

commit 02a07cd6adc7f0674bc673e3f917d71d9b290199
Author: srielau <se...@rielau.com>
AuthorDate: Fri Aug 18 09:26:32 2023 +0800

    [SPARK-43205][DOC] identifier clause docs
    
    ### What changes were proposed in this pull request?
    
    Document the IDENTIFIER() clause
    
    ### Why are the changes needed?
    
    Docs are good!
    
    ### Does this PR introduce _any_ user-facing change?
    
    ### How was this patch tested?
    See attached
    <img width="892" alt="Screenshot 2023-08-15 at 4 26 27 PM" src="https://github.com/apache/spark/assets/3514644/55823375-8d1a-4473-bf19-74796d273416">
    
    <img width="747" alt="Screenshot 2023-08-15 at 4 45 23 PM" src="https://github.com/apache/spark/assets/3514644/0ee852a9-6a11-4c87-bed9-43531c55fc31">
    
    Closes #42506 from srielau/SPARK-43205-3.5-IDENTIFIER-clause-docs.
    
    Authored-by: srielau <se...@rielau.com>
    Signed-off-by: Wenchen Fan <wenc...@databricks.com>
    (cherry picked from commit 7786d0b2f359eccd570461a399da0fca84e515c1)
    Signed-off-by: Wenchen Fan <wenc...@databricks.com>
---
 docs/sql-ref-identifier-clause.md | 106 ++++++++++++++++++++++++++++++++++++++
 docs/sql-ref.md                   |   1 +
 2 files changed, 107 insertions(+)
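
The IDENTIFIER clause documented in the patch below is aimed at code that receives object names as data. A minimal sketch of that use follows, assuming a Spark build that includes the clause and the `spark.sql(sqlText, args)` overload used in the doc examples. The names `IdentifierClauseSketch`, `countRows`, and `sketch_view` are hypothetical and exist only for illustration; they are not part of Spark or of this commit.

```scala
import org.apache.spark.sql.SparkSession
import scala.util.Try

object IdentifierClauseSketch {
  // Hypothetical helper, not part of Spark: counts the rows of a relation
  // whose name is supplied as data rather than spliced into the SQL text.
  def countRows(spark: SparkSession, relation: String): Long =
    spark
      .sql("SELECT count(*) FROM IDENTIFIER(:rel)", args = Map("rel" -> relation))
      .head()
      .getLong(0)

  def main(cliArgs: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("identifier-clause-sketch")
      .getOrCreate()
    try {
      // A temporary view avoids depending on any catalog or warehouse setup.
      spark.range(3).createOrReplaceTempView("sketch_view")
      println(countRows(spark, "sketch_view"))

      // The argument is interpreted as an identifier only, so a value carrying
      // extra SQL text is expected to be rejected rather than executed
      // (an assumption about the failure mode, not verified against this build).
      println(Try(countRows(spark, "sketch_view; DROP TABLE t")).isFailure)
    } finally {
      spark.stop()
    }
  }
}
```

Keeping the SQL text constant and passing the relation name through `args` is the design point: the caller never concatenates untrusted input into the statement.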

diff --git a/docs/sql-ref-identifier-clause.md b/docs/sql-ref-identifier-clause.md
new file mode 100644
index 00000000000..694731109f8
--- /dev/null
+++ b/docs/sql-ref-identifier-clause.md
@@ -0,0 +1,106 @@
+---
+layout: global
+title: Identifier clause
+displayTitle: IDENTIFIER clause
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+### Description
+
+Converts a constant `STRING` expression into a SQL object name.
+The purpose of this clause is to allow for templating of identifiers in SQL statements without opening up the risk of SQL injection attacks.
+Typically, this clause is used with a parameter marker as its argument.
+
+### Syntax
+
+```sql
+IDENTIFIER ( strExpr )
+```
+
+### Parameters
+
+- **strExpr**: A constant `STRING` expression. Typically, the expression includes a parameter marker.
+
+### Returns
+
+A (qualified) identifier which can be used as a:
+
+- qualified table name
+- namespace name
+- function name
+- qualified column or attribute reference
+
+### Examples
+
+These examples use named parameter markers to templatize queries.
+
+```scala
+// Creation of a table using a parameter marker.
+spark.sql("CREATE TABLE IDENTIFIER(:mytab)(c1 INT)", args = Map("mytab" -> "tab1")).show()
+
+spark.sql("DESCRIBE IDENTIFIER(:mytab)", args = Map("mytab" -> "tab1")).show()
++--------+---------+-------+
+|col_name|data_type|comment|
++--------+---------+-------+
+|      c1|      int|   NULL|
++--------+---------+-------+
+
+// Altering a table with a fixed schema and a parameterized table name.
+spark.sql("ALTER TABLE IDENTIFIER('default.' || :mytab) ADD COLUMN c2 INT", args = Map("mytab" -> "tab1")).show()
+
+spark.sql("DESCRIBE IDENTIFIER(:mytab)", args = Map("mytab" -> "default.tab1")).show()
++--------+---------+-------+
+|col_name|data_type|comment|
++--------+---------+-------+
+|      c1|      int|   NULL|
+|      c2|      int|   NULL|
++--------+---------+-------+
+
+// A parameterized reference to a table in a query. This table name is qualified and uses back-ticks.
+spark.sql("SELECT * FROM IDENTIFIER(:mytab)", args = Map("mytab" -> "`default`.`tab1`")).show()
++---+---+
+| c1| c2|
++---+---+
++---+---+
+
+
+// You cannot qualify the IDENTIFIER clause or use it as a qualifier itself.
+spark.sql("SELECT * FROM myschema.IDENTIFIER(:mytab)", args = Map("mytab" -> "`tab1`")).show()
+[INVALID_SQL_SYNTAX.INVALID_TABLE_VALUED_FUNC_NAME] `myschema`.`IDENTIFIER`.
+
+spark.sql("SELECT * FROM IDENTIFIER(:myschema).mytab", args = Map("myschema" -> "`default`")).show()
+[PARSE_SYNTAX_ERROR]
+
+// Dropping a table with separate schema and table parameters.
+spark.sql("DROP TABLE IDENTIFIER(:myschema || '.' || :mytab)", args = Map("myschema" -> "default", "mytab" -> "tab1")).show()
+
+// A parameterized column reference.
+spark.sql("SELECT IDENTIFIER(:col) FROM VALUES(1) AS T(c1)", args = Map("col" -> "t.c1")).show()
++---+
+| c1|
++---+
+|  1|
++---+
+
+// Passing in a function name as a parameter.
+spark.sql("SELECT IDENTIFIER(:func)(-1)", args = Map("func" -> "abs")).show()
++-------+
+|abs(-1)|
++-------+
+|      1|
++-------+
+```
diff --git a/docs/sql-ref.md b/docs/sql-ref.md
index 026d072c07d..8f3289e9b77 100644
--- a/docs/sql-ref.md
+++ b/docs/sql-ref.md
@@ -32,6 +32,7 @@ Spark SQL is Apache Spark's module for working with structured data. This guide
    * [User-Defined Aggregate Functions (UDAFs)](sql-ref-functions-udf-aggregate.html)
    * [Integration with Hive UDFs/UDAFs/UDTFs](sql-ref-functions-udf-hive.html)
  * [Identifiers](sql-ref-identifier.html)
+ * [IDENTIFIER clause](sql-ref-identifier-clause.html)
  * [Literals](sql-ref-literals.html)
  * [Null Semantics](sql-ref-null-semantics.html)
  * [SQL Syntax](sql-ref-syntax.html)


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
