srielau commented on code in PR #40126: URL: https://github.com/apache/spark/pull/40126#discussion_r1124536818
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala:
##########
@@ -465,7 +465,20 @@ class Analyzer(override val catalogManager: CatalogManager) extends RuleExecutor
   }
   /**
-   * Replaces [[UnresolvedAlias]]s with concrete aliases.
+   * Replaces [[UnresolvedAlias]]s with concrete aliases by applying the following rules:
+   *  1. Use the specified name of named expressions;
+   *  2. Derive a stable alias from the original normalized SQL text except of:
+   *    2.1. A multipart identifier (column, field, mapkey, ...) -> the right most identifier.
+   *         For example: a.b.c => c
+   *    2.2. A CAST, or try_cast -> the argument of the cast.
+   *         For example: CAST(c1 AS INT) => c1
+   *    2.3. A map lookup with a literal -> the map key.
+   *         For example: map[5] => 5
+   *    2.4. Squeezing out unwanted whitespace or comments.
+   *         For example: T.c1 + /* test */ foo( 5 ) => T.c1+foo(5)
+   *    2.5. Normalize SQL string literals when ANSI mode is enabled and the SQL config
+   *         `spark.sql.ansi.doubleQuotedIdentifiers` is set to `true`.
+   *         For example: "abc" => 'abc'

Review Comment:
This description seems incomplete. I see examples below with \t, \r, and \n in strings. Do we preserve them? (I'm doubtful that's useful.) Also, you seem to have some rules around whitespace.
