comphead commented on code in PR #22672:
URL: https://github.com/apache/datafusion/pull/22672#discussion_r3377580941


##########
docs/source/user-guide/sql/select.md:
##########
@@ -19,76 +19,320 @@
 
 # SELECT syntax
 
-The queries in DataFusion scan data from tables and return 0 or more rows.
-Please be aware that column names in queries are made lower-case, but not on 
the inferred schema. Accordingly, if you
-want to query against a capitalized field, make sure to use double quotes. 
Please see this
-[example](https://datafusion.apache.org/user-guide/example-usage.html) for 
clarification.
-In this documentation we describe the SQL syntax in DataFusion.
-
-DataFusion supports the following syntax for queries:
-<code class="language-sql hljs">
-
-[ [WITH](#with-clause) with_query [, ...] ] <br/>
-[SELECT](#select-clause) [ ALL | DISTINCT ] select_expr [, ...] <br/>
-[ [FROM](#from-clause) from_item [, ...] ] <br/>
-[ [JOIN](#join-clause) join_item [, ...] ] <br/>
-[ [WHERE](#where-clause) condition ] <br/>
-[ [GROUP BY](#group-by-clause) grouping_element [, ...] ] <br/>
-[ [HAVING](#having-clause) condition] <br/>
-[ [QUALIFY](#qualify-clause) condition] <br/>
-[ [UNION](#union-clause) [ ALL | select ] <br/>
-[ [ORDER BY](#order-by-clause) expression [ ASC | DESC ][, ...] ] <br/>
-[ [LIMIT](#limit-clause) count ] <br/>
-[ [EXCLUDE | EXCEPT](#exclude-and-except-clause) ] <br/>
-[Pipe operators](#pipe-operators) <br/>
-
-</code>
+Queries in DataFusion scan data from tables, subqueries, table functions, or
+literal values and return zero or more rows. DataFusion supports the following
+general form for `SELECT` queries. Optional clauses can be omitted. The linked
+sections describe each clause in more detail.
+
+<pre><code>[ <a href="#with-clause">WITH</a> cte [, ...] ]
+<a href="#select-clause">SELECT</a> select_item [, ...]
+[ <a href="#select-into">INTO</a> table_name ]
+[ <a href="#from-clause">FROM</a> from_item [, ...] ]
+[ <a href="#join-clause">JOIN</a> join_item ... ]
+[ <a href="#where-clause">WHERE</a> condition ]
+[ <a href="#group-by-clause">GROUP BY</a> grouping_element [, ...] | GROUP BY 
ALL ]
+[ <a href="#having-clause">HAVING</a> condition ]
+[ <a href="#window-clause">WINDOW</a> window_name AS (window_definition) [, 
...] ]
+[ <a href="#qualify-clause">QUALIFY</a> condition ]
+[ { <a href="#set-operations">UNION</a> | <a 
href="#set-operations">INTERSECT</a> | <a href="#set-operations">EXCEPT</a> } 
query ] [...]
+[ <a href="#order-by-clause">ORDER BY</a> order_expression [, ...] ]
+[ <a href="#limit-and-offset-clauses">LIMIT</a> count ] [ <a 
href="#limit-and-offset-clauses">OFFSET</a> count ]
+[ <a href="#pipe-operators">|&gt;</a> pipe_operator ... ]</code></pre>
+
+Unquoted identifiers are normalized to lower case in SQL queries, but inferred
+schema field names are not changed. If a field name contains capital letters or
+other characters that require quoting, reference it with double quotes. See 
this
+[example](https://datafusion.apache.org/user-guide/example-usage.html) for
+clarification.
 
 ## WITH clause
 
-A with clause allows to give names for queries and reference them by name.
+```text
+WITH [RECURSIVE] cte_name [(column_name [, ...])] AS (query) [, ...]
+```
+
+A `WITH` clause defines common table expressions (CTEs) that can be referenced
+by name in the rest of the query.
+
+Examples:
 
 ```sql
 WITH x AS (SELECT a, MAX(b) AS b FROM t GROUP BY a)
 SELECT a, b FROM x;
 ```
 
+CTEs can also rename their output columns:
+
+```sql
+WITH x(key, total) AS (
+  SELECT a, SUM(b) FROM t GROUP BY a
+)
+SELECT key, total FROM x;
+```
+
+DataFusion supports `WITH RECURSIVE` for recursive CTEs. Recursive CTE support
+is controlled by the `datafusion.execution.enable_recursive_ctes` configuration
+setting, which is enabled by default.
+
+```sql
+WITH RECURSIVE numbers AS (
+  SELECT 1 AS n
+  UNION ALL
+  SELECT n + 1 FROM numbers WHERE n < 3
+)
+SELECT n FROM numbers;
+```
+
 ## SELECT clause
 
-Example:
+```text
+SELECT [ALL | DISTINCT | DISTINCT ON (expression [, ...])]
+       select_item [, ...]
+       [INTO table_name]
+```
+
+The `SELECT` list can contain column references, arbitrary expressions, scalar
+functions, aggregate functions, window functions, scalar subqueries, and
+wildcards.
+
+Examples:
 
 ```sql
-SELECT a, b, a + b FROM table
+SELECT a, b, a + b AS sum_ab FROM table_name;
 ```
 
-The `DISTINCT` quantifier can be added to make the query return all distinct 
rows.
-By default `ALL` will be used, which returns all the rows.
+Aliases can be written with or without `AS`:
 
 ```sql
-SELECT DISTINCT person, age FROM employees
+SELECT a AS key, b value FROM table_name;
+```
+
+`SELECT` can be used without a `FROM` clause when the selected expressions do
+not need input rows:
+
+```sql
+SELECT 1 + 2 AS three;
+```
+
+`SELECT *` requires a `FROM` clause.
+
+### DISTINCT
+
+```text
+SELECT DISTINCT select_item [, ...]
+SELECT DISTINCT ON (expression [, ...]) select_item [, ...]
+```
+
+By default, `SELECT` uses `ALL` semantics and returns every row. The `DISTINCT`
+quantifier removes duplicate rows from the query result.
+
+Examples:
+
+```sql
+SELECT DISTINCT person, age FROM employees;
+```
+
+DataFusion also supports PostgreSQL-style `DISTINCT ON`, which keeps one row 
for
+each distinct value of the listed expressions. Use `ORDER BY` to choose which
+row is kept for each group. When `ORDER BY` is present, the initial `ORDER BY`
+expressions must match the `DISTINCT ON` expressions.
+
+If multiple rows have the same `DISTINCT ON` values and the `ORDER BY` clause
+does not fully order those rows, the row that is kept is not specified. Add
+additional `ORDER BY` expressions to make the choice deterministic.
+
+```sql
+SELECT DISTINCT ON (customer_id) customer_id, order_id, order_date
+FROM orders
+ORDER BY customer_id, order_date DESC;
+```
+
+### Wildcards
+
+```text
+*
+table_alias.*
+* EXCLUDE column_name
+* EXCLUDE (column_name [, ...])
+* EXCEPT column_name
+* EXCEPT (column_name [, ...])
+* REPLACE (expression AS column_name [, ...])
+```
+
+Use `*` to select all columns, or `table_alias.*` to select all columns from a
+specific input.
+
+Examples:
+
+```sql
+SELECT * FROM orders;
+SELECT o.* FROM orders AS o;
+```
+
+Wildcard projections support `EXCLUDE` and `EXCEPT` to omit columns. Both
+accept either a single column name or a parenthesized list of column names.
+
+```sql
+SELECT * EXCLUDE customer_id FROM orders;
+SELECT * EXCLUDE (customer_id, internal_note) FROM orders;
+SELECT * EXCEPT customer_id FROM orders;
+SELECT * EXCEPT (customer_id, internal_note) FROM orders;
+SELECT o.* EXCLUDE (internal_note) FROM orders AS o;
+```
+
+Every name in an `EXCLUDE` or `EXCEPT` list must refer to an existing column.
+The list must not name the same column more than once, and the wildcard must
+not expand to zero columns.
+
+Wildcard projections also support `REPLACE`, which keeps the original column
+name but substitutes a new expression for that column.
+
+```sql
+SELECT * REPLACE (price * 2 AS price) FROM products;
+SELECT p.* REPLACE (price * 2 AS price, product_id + 1000 AS product_id)
+FROM products AS p;
+```
+
+`RENAME` and wildcard aliases such as `* AS alias` are not supported.
+
+### SELECT INTO
+
+```text
+SELECT select_item [, ...] INTO table_name FROM ...
+```
+
+`SELECT ... INTO table_name` creates an in-memory table from the query result.
+It is similar to [`CREATE TABLE ... AS SELECT`](ddl.md#create-table).
+
+```sql
+SELECT customer_id, SUM(amount) AS total
+INTO customer_totals
+FROM orders
+GROUP BY customer_id;
 ```
 
 ## FROM clause
 
-Example:
+```text
+FROM from_item [, ...]
+
+from_item:
+  table_name [[AS] alias [(column_alias [, ...])]]
+| (query) [[AS] alias [(column_alias [, ...])]]
+| VALUES (expression [, ...]) [, ...] [[AS] alias [(column_alias [, ...])]]
+| table_function(argument [, ...]) [[AS] alias [(column_alias [, ...])]]
+| UNNEST(expression) [[AS] alias [(column_alias [, ...])]]
+```
+
+The `FROM` clause specifies the input relations for the query. Supported inputs
+include tables, CTEs, derived tables, `VALUES`, table functions, and `UNNEST`.
+
+Examples:
+
+```sql
+SELECT t.a FROM table_name AS t;
+```
+
+Table aliases can include column aliases:
+
+```sql
+SELECT x, y
+FROM some_table AS t(x, y);
+```
+
+Subqueries can be used in the `FROM` clause:
+
+```sql
+SELECT q.a
+FROM (SELECT a FROM table_name WHERE a > 10) AS q;
+```
+
+`VALUES` can be used as a table expression:
 
 ```sql
-SELECT t.a FROM table AS t
+SELECT *
+FROM VALUES (1, 'a'), (2, 'b') AS t(id, label);
 ```
 
+Table functions such as `range` and `generate_series` can be used in `FROM`:
+
+```sql
+SELECT value FROM range(0, 3);
+```
+
+`UNNEST` expands a list, array, or similar nested value into one row for each
+element. It can be used in the `SELECT` list to expand a value in each input
+row, or as an input relation in `FROM`. When used in `FROM`, it can be given a
+table alias and column alias.
+
+```sql
+SELECT * FROM UNNEST([1, 2, 3]) AS u(value);
+```
+
+To expand a column for each input row, use `UNNEST` in the `SELECT` list:
+
+```sql
+SELECT id, UNNEST(items) FROM orders;
+```
+
+`UNNEST` in the `FROM` clause cannot yet reference columns from preceding 
`FROM`
+items (implicit lateral references such as `FROM orders AS t, UNNEST(t.items)`
+are not currently supported).
+
 ## WHERE clause
 
-Example:
+```text
+WHERE condition
+```
+
+The `WHERE` clause filters input rows before grouping, aggregation, and window
+processing.
 
 ```sql
-SELECT a FROM table WHERE a > 10
+SELECT a FROM table_name WHERE a > 10;
 ```
 
 ## JOIN clause
 
-DataFusion supports `INNER JOIN`, `LEFT OUTER JOIN`, `RIGHT OUTER JOIN`, `FULL 
OUTER JOIN`, `NATURAL JOIN`, `CROSS JOIN`, `LEFT SEMI JOIN`, `RIGHT SEMI JOIN`, 
`LEFT ANTI JOIN`, `RIGHT ANTI JOIN`, `LATERAL JOIN`, and `LEFT JOIN LATERAL`.
+```text
+from_item [join_type] JOIN from_item [join_condition]
+from_item CROSS JOIN from_item
+from_item NATURAL JOIN from_item
+from_item [join_type] JOIN LATERAL (query) AS alias [join_condition]
+from_item, LATERAL (query) AS alias
+
+join_type:
+  INNER
+| LEFT [OUTER]
+| RIGHT [OUTER]
+| FULL [OUTER]
+| LEFT SEMI
+| RIGHT SEMI
+| LEFT ANTI

Review Comment:
   btw we also have mark joins, however its more like optimization.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to