MaxGekk opened a new pull request, #40126:
URL: https://github.com/apache/spark/pull/40126
### What changes were proposed in this pull request?
In this PR, I propose to change how column aliases are auto-generated (the case
when a user doesn't assign an alias explicitly). Before the changes, Spark
SQL generates such an alias from the `Expression`; this PR proposes to take the
parse tree (the output of the parser) and generate the alias from the terminal
tokens of the tree.
The new helper function `ParserUtils.toExprAlias` takes a `ParseTree` from
`Antlr4` and converts it to a `String` using the following simple rules:
1. Adds a gap after every terminal node (`TerminalNodeImpl`) except `(<[.`
2. Removes the gap before `(), <>, []` and `,.`
For example, the sequence of tokens "(", "columnA", "+", "1", ")" is
converted to the alias "(columnA + 1)".
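
The two rules above can be illustrated with a minimal sketch. This is not the code from the PR (the real helper is `ParserUtils.toExprAlias`, which walks an ANTLR `ParseTree`); it is a hypothetical Python stand-in that works on a flat list of token strings. It assumes that rule 2 applies to each of the characters `( ) < > [ ] , .` individually, which is one reading of the rule as written.

```python
# Hypothetical illustration of the alias rules; NOT the PR's Scala implementation.
# Assumption: tokens arrive as a flat list of strings (the text of terminal nodes).

# Rule 1: no gap is added after these tokens.
NO_GAP_AFTER = {"(", "<", "[", "."}
# Rule 2 (assumed reading): the gap before each of these tokens is removed.
NO_GAP_BEFORE = {"(", ")", "<", ">", "[", "]", ",", "."}


def to_expr_alias(tokens):
    """Join token texts into an alias string using the two spacing rules."""
    out = ""
    for tok in tokens:
        if tok in NO_GAP_BEFORE:
            # Drop the gap that the previous token may have added.
            out = out.rstrip(" ")
        out += tok
        if tok not in NO_GAP_AFTER:
            out += " "
    # Remove the trailing gap left by the last token.
    return out.strip()


print(to_expr_alias(["(", "columnA", "+", "1", ")"]))
```

With the token sequence from the example, the sketch prints `(columnA + 1)`. Under the same assumptions, the tokens of `substring('hello', 5)` would be joined back into `substring('hello', 5)`, that is, the text the user actually typed.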
Closes #39332
### Why are the changes needed?
To improve the user experience with Spark SQL. It is always best practice to
name the result of any expression in a query's select list if one plans to
reference it later. This yields the most readable and stable results.
However, sometimes queries are generated, or we’re just lazy and trust
the auto-generated names. The problem is that the auto-generated names are produced
by pretty-printing the expression tree, which, while “generally” readable, is
not meant to be stable over long periods of time. For example:
```sql
spark-sql> DESC SELECT substring('hello', 5);
substring(hello, 5, 2147483647) string
```
The auto-generated column alias `substring(hello, 5, 2147483647)` contains
non-obvious elements: the length `2147483647` never appeared in the query, and
the quotes around `hello` are dropped.
### Does this PR introduce _any_ user-facing change?
Yes.
### How was this patch tested?
By existing test suites.