[GitHub] [spark] srielau commented on a diff in pull request #40126: [SPARK-40822][SQL] Stable derived column aliases

via GitHub Fri, 03 Mar 2023 06:32:37 -0800


srielau commented on code in PR #40126:
URL: https://github.com/apache/spark/pull/40126#discussion_r1124536818



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -465,7 +465,20 @@ class Analyzer(override val catalogManager: CatalogManager) extends RuleExecutor
   }
 
   /**
-   * Replaces [[UnresolvedAlias]]s with concrete aliases.
+   * Replaces [[UnresolvedAlias]]s with concrete aliases by applying the following rules:
+   *   1. Use the specified name of named expressions;
+   *   2. Derive a stable alias from the original normalized SQL text except of:
+   *     2.1. A multipart identifier (column, field, mapkey, ...) -> the right most identifier.
+   *          For example: a.b.c => c
+   *     2.2. A CAST, or try_cast -> the argument of the cast.
+   *          For example: CAST(c1 AS INT) => c1
+   *     2.3. A map lookup with a literal -> the map key.
+   *          For example: map[5] => 5
+   *     2.4. Squeezing out unwanted whitespace or comments.
+   *          For example: T.c1 + /* test */ foo( 5 ) => T.c1+foo(5)
+   *     2.5. Normalize SQL string literals when ANSI mode is enabled and the SQL config
+   *          `spark.sql.ansi.doubleQuotedIdentifiers` is set to `true`.
+   *          For example: "abc" => 'abc'

Review Comment:
   This description seems incomplete. I see examples below with \t, \r, and \n in
   strings. Do we preserve them? I doubt that is useful.
   Also, you seem to have some rules around whitespace.
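
To make the whitespace question concrete, here is a minimal sketch of one possible reading of rule 2.4. It is not the code in this PR: `AliasSketch` and `squeeze` are hypothetical names, and the choice to leave single-quoted string literals untouched is an assumption made only to show where characters such as \t or \n inside a string would survive.

```scala
// Hypothetical sketch, not the implementation under review.
// Assumption: whitespace and block comments are dropped only OUTSIDE
// single-quoted string literals; literal contents are copied verbatim.
object AliasSketch {
  def squeeze(sql: String): String = {
    val out = new StringBuilder
    var i = 0
    var inString = false
    while (i < sql.length) {
      val c = sql.charAt(i)
      if (inString) {
        // Inside a literal: keep every character, including \t, \r, \n.
        out.append(c)
        if (c == '\'') inString = false
        i += 1
      } else if (c == '\'') {
        // Opening quote of a string literal.
        inString = true
        out.append(c)
        i += 1
      } else if (sql.startsWith("/*", i)) {
        // Skip a block comment; an unterminated comment runs to the end.
        val end = sql.indexOf("*/", i + 2)
        i = if (end < 0) sql.length else end + 2
      } else if (c.isWhitespace) {
        // Drop whitespace outside literals.
        i += 1
      } else {
        out.append(c)
        i += 1
      }
    }
    out.toString
  }
}
```

With this reading, the example from rule 2.4 comes out as `T.c1+foo(5)`, while a tab inside `'a<TAB>b'` is kept. The sketch ignores backslash escapes and removes the whitespace between keywords too, so any real rule would need to be token-aware; the doc comment should state which behavior is intended.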
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


