maropu commented on a change in pull request #28196: [SPARK-31428][SQL][DOCS]
Document Common Table Expression in SQL Reference
URL: https://github.com/apache/spark/pull/28196#discussion_r407278361
##########
File path: docs/sql-ref-syntax-qry-select-cte.md
##########
@@ -19,4 +19,120 @@ license: |
limitations under the License.
---
-**This page is under construction**
+### Description
+
+A common table expression (CTE) defines a temporary result set that a user can reference within the duration of a SQL statement. A CTE can be used wherever a SELECT statement is used; for example, it can be used in a SELECT, INSERT, UPDATE, DELETE, MERGE or CREATE VIEW statement. A CTE has a name with an optional column names list, followed by a query expression. When a CTE references itself, it is called a recursive CTE.
+
+### Syntax
+
+{% highlight sql %}
+WITH common_table_expression [ , ... ]
+{% endhighlight %}
+
+While ```common_table_expression``` is defined as
+{% highlight sql %}
+expression_name [ ( column_name [ , ... ] ) ] [ AS ] ( [ common_table_expression ] query )
+{% endhighlight %}
+
+### Parameters
+
+<dl>
+ <dt><code><em>expression_name</em></code></dt>
+ <dd>
+ Specifies a name for the common table expression.
+ </dd>
+</dl>
+<dl>
+ <dt><code><em>query</em></code></dt>
+ <dd>
+ A <a href="sql-ref-syntax-qry-select.html">SELECT</a> statement.
+ </dd>
+</dl>
+
+### Examples
+
+{% highlight sql %}
+-- CTE with multiple column aliases
+WITH t(x, y) AS (SELECT 1, 2)
+SELECT * FROM t WHERE x = 1 AND y = 2;
+ +---+---+
+ | x| y|
+ +---+---+
+ | 1| 2|
+ +---+---+
+
+-- CTE in CTE definition
+WITH t as (
+ WITH t2 AS (SELECT 1)
+ SELECT * FROM t2
+)
+SELECT * FROM t;
+ +---+
+ | 1|
+ +---+
+ | 1|
+ +---+
+
+-- CTE in subquery
+SELECT max(c) FROM (
+ WITH t(c) AS (SELECT 1)
+ SELECT * FROM t
+);
+ +------+
+ |max(c)|
+ +------+
+ | 1|
+ +------+
+
+-- CTE in subquery expression
+SELECT (
+ WITH t AS (SELECT 1)
+ SELECT * FROM t
+);
+ +----------------+
+ |scalarsubquery()|
+ +----------------+
+ | 1|
+ +----------------+
+
+-- Name conflict in nested CTE. Spark throws an AnalysisException by default.
+-- SET spark.sql.legacy.ctePrecedencePolicy = CORRECTED (which is recommended),
+-- inner CTE definitions take precedence over outer definitions.
+SET spark.sql.legacy.ctePrecedencePolicy = CORRECTED;
+WITH
+ t AS (SELECT 1),
+ t2 AS (
+ WITH t AS (SELECT 2)
+ SELECT * FROM t
+ )
+SELECT * FROM t2;
+ +---+
+ | 2|
+ +---+
+ | 2|
+ +---+
+
+-- SET spark.sql.legacy.ctePrecedencePolicy = LEGACY (the behavior in version 2.4 and earlier)
Review comment:
Does the 3.0 document need to describe this legacy behaviour? I personally think
we don't need it. cc: @gatorsmile
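
For context, here is a rough sketch of the case in question. It reuses the nested-CTE query from the CORRECTED example in the diff and only switches the policy. The expected result in the comment is taken from the Spark 3.0 migration guide and was not re-run here.

```sql
-- With LEGACY, the outer definition of t takes precedence over the inner one,
-- which is the behavior in version 2.4 and earlier.
SET spark.sql.legacy.ctePrecedencePolicy = LEGACY;
WITH
  t AS (SELECT 1),
  t2 AS (
    WITH t AS (SELECT 2)
    SELECT * FROM t
  )
SELECT * FROM t2;
-- Per the migration guide this returns 1, whereas CORRECTED returns 2
-- and the default (EXCEPTION) throws an AnalysisException.
```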