peter-toth commented on a change in pull request #25029: [SPARK-28228][SQL] Fix 
substitution order of nested WITH clauses
URL: https://github.com/apache/spark/pull/25029#discussion_r300582136
 
 

 ##########
 File path: docs/sql-migration-guide-upgrade.md
 ##########
 @@ -147,6 +147,8 @@ license: |
 
   - Since Spark 3.0, if files or subdirectories disappear during recursive 
directory listing (i.e. they appear in an intermediate listing but then cannot 
be read or listed during later phases of the recursive directory listing, due 
to either concurrent file deletions or object store consistency issues) then 
the listing will fail with an exception unless 
`spark.sql.files.ignoreMissingFiles` is `true` (default `false`). In previous 
versions, these missing files or subdirectories would be ignored. Note that 
this change of behavior only applies during initial table file listing (or 
during `REFRESH TABLE`), not during query execution: the net change is that 
`spark.sql.files.ignoreMissingFiles` is now obeyed during table file listing / 
query planning, not only at query execution time.
 
+  - Since Spark 3.0, substitution order of nested WITH clauses is changed and 
an inner CTE definition takes precedence over an outer. The previous behaviour 
can be restored by setting `spark.sql.legacy.cte.substitution.enabled` to 
`true`.
 
 Review comment:
   https://github.com/apache/spark/pull/24831 added support for the `WITH` 
clause in subqueries and introduced a new kind of nesting, as in:
   ```
   WITH t(c) AS (SELECT 1)
   SELECT max(c) FROM (
     WITH t(c) AS (SELECT 2)
     SELECT * FROM t
   )
   ```
   
   Other kinds of nested `WITH` clauses were available even before that PR, 
such as:
   - Nesting in CTE:
     ```
     WITH
       t AS (SELECT 1),
       t2 AS (
         WITH t AS (SELECT 2)
         SELECT * FROM t
       )
     SELECT * FROM t2
     ```
   - Nesting in subquery expression:
     ```
     WITH t AS (SELECT 1)
     SELECT (
       WITH t AS (SELECT 2)
       SELECT * FROM t
     )
     ```
   The issue this PR fixes is that the last two examples return `1`, whereas 
the inner definition of `t` should take precedence and they should return `2`.
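
To make the difference in substitution order concrete, here is a minimal toy model in Python. It is not Spark's analyzer, and it does not capture which nesting forms were affected in practice. The names `Lit`, `Ref`, `With`, `substitute`, `eval_outer_first` and `eval_inner_first` are hypothetical and exist only for this sketch. It encodes the "nesting in CTE" example and evaluates it two ways: by substituting outer definitions into the whole tree first, and by letting the nearest enclosing definition win.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union


@dataclass
class Lit:
    """Stands in for `SELECT <value>`."""
    value: int


@dataclass
class Ref:
    """Stands in for `SELECT * FROM <name>`."""
    name: str


@dataclass
class With:
    """Stands in for `WITH <ctes> <body>`; CTEs are defined in order."""
    ctes: Dict[str, "Query"]
    body: "Query"


Query = Union[Lit, Ref, With]


def substitute(q: Query, name: str, definition: Query) -> Query:
    """Replace every Ref(name) in q, ignoring any inner redefinition."""
    if isinstance(q, Ref):
        return definition if q.name == name else q
    if isinstance(q, With):
        ctes = {n: substitute(d, name, definition) for n, d in q.ctes.items()}
        return With(ctes, substitute(q.body, name, definition))
    return q


def eval_outer_first(q: Query) -> int:
    """Toy 'outer first' order: outer CTEs are pushed into nested WITHs."""
    if isinstance(q, Lit):
        return q.value
    if isinstance(q, Ref):
        raise KeyError(f"unresolved relation: {q.name}")
    resolved: Dict[str, Query] = {}
    for name, definition in q.ctes.items():
        # Earlier CTEs of the same WITH are substituted into later ones.
        for prev_name, prev_def in resolved.items():
            definition = substitute(definition, prev_name, prev_def)
        resolved[name] = definition
    body = q.body
    for name, definition in resolved.items():
        body = substitute(body, name, definition)
    return eval_outer_first(body)


def eval_inner_first(q: Query, scope: Optional[Dict[str, int]] = None) -> int:
    """Toy 'inner first' order: the nearest enclosing definition wins."""
    scope = dict(scope or {})  # copy, so inner bindings do not leak outward
    if isinstance(q, Lit):
        return q.value
    if isinstance(q, Ref):
        return scope[q.name]
    for name, definition in q.ctes.items():
        scope[name] = eval_inner_first(definition, scope)
    return eval_inner_first(q.body, scope)


# WITH t AS (SELECT 1),
#      t2 AS (WITH t AS (SELECT 2) SELECT * FROM t)
# SELECT * FROM t2
nested_in_cte = With(
    ctes={
        "t": Lit(1),
        "t2": With(ctes={"t": Lit(2)}, body=Ref("t")),
    },
    body=Ref("t2"),
)

print(eval_outer_first(nested_in_cte))  # 1: the outer `t` was substituted first
print(eval_inner_first(nested_in_cte))  # 2: the inner `t` shadows the outer one
```

In this model, the outer-first evaluator replaces the reference to `t` inside `t2` before the inner `WITH` is ever visited, so the inner definition has no effect. The inner-first evaluator treats each `WITH` as a new scope, which matches the precedence rule described in the migration note.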

