[ https://issues.apache.org/jira/browse/SPARK-27561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640214#comment-17640214 ]
Apache Spark commented on SPARK-27561:
--------------------------------------
User 'anchovYu' has created a pull request for this issue:
https://github.com/apache/spark/pull/38776
> Support "lateral column alias references" to allow column aliases to be used
> within SELECT clauses
> --------------------------------------------------------------------------------------------------
>
> Key: SPARK-27561
> URL: https://issues.apache.org/jira/browse/SPARK-27561
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 3.1.0
> Reporter: Josh Rosen
> Priority: Major
>
> Amazon Redshift has a feature called "lateral column alias references":
> [https://aws.amazon.com/about-aws/whats-new/2018/08/amazon-redshift-announces-support-for-lateral-column-alias-reference/].
> Quoting from that blog post:
> {quote}The support for lateral column alias reference enables you to write
> queries without repeating the same expressions in the SELECT list. For
> example, you can define the alias 'probability' and use it within the same
> select statement:
> {code:sql}
> select clicks / impressions as probability,
>        round(100 * probability, 1) as percentage
> from raw_data;
> {code}
> {quote}
> There's more information about this feature on
> [https://docs.aws.amazon.com/redshift/latest/dg/r_SELECT_list.html]:
> {quote}The benefit of the lateral alias reference is you don't need to repeat
> the aliased expression when building more complex expressions in the same
> target list. When Amazon Redshift parses this type of reference, it just
> inlines the previously defined aliases. If there is a column with the same
> name defined in the FROM clause as the previously aliased expression, the
> column in the FROM clause takes priority. For example, in the above query if
> there is a column named 'probability' in table raw_data, the 'probability' in
> the second expression in the target list will refer to that column instead of
> the alias name 'probability'.
> {quote}
> It would be nice if Spark supported this syntax. I don't think that this is
> standard SQL, so it might be a good idea to research if other SQL databases
> support similar syntax (and to see if they implement the same column
> resolution strategy as Redshift).
> We should also consider whether this needs to be feature-flagged as part of a
> specific SQL compatibility mode / dialect.
> One possibly-related existing ticket: SPARK-9338, which discusses the use of
> SELECT aliases in GROUP BY expressions.
> /cc [~hvanhovell]
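
To make the resolution rule quoted above concrete, here is a minimal Python sketch of the behavior the Redshift documentation describes: earlier aliases are inlined into later expressions, unless a FROM-clause column has the same name, in which case the column wins. This is only a toy model for discussion. It is not Spark's analyzer and not the approach taken in the pull request, and the function name inline_lateral_aliases and its arguments are hypothetical. It also assumes case-sensitive identifiers and does not handle string literals or qualified names.

```python
import re

# Toy model of lateral column alias resolution (hypothetical helper,
# not Spark or Redshift code).
def inline_lateral_aliases(select_list, from_columns):
    """select_list: ordered (expression, alias) pairs from one SELECT list.
    from_columns: set of column names provided by the FROM clause.
    Returns the (expanded expression, alias) pairs after inlining."""
    aliases = {}   # alias -> already-expanded expression defined earlier
    resolved = []
    for expr, alias in select_list:
        def substitute(match):
            name = match.group(0)
            # A FROM-clause column with the same name takes priority,
            # and unknown identifiers (functions, columns) are left alone.
            if name in from_columns or name not in aliases:
                return name
            return "(" + aliases[name] + ")"
        expanded = re.sub(r"[A-Za-z_][A-Za-z_0-9]*", substitute, expr)
        resolved.append((expanded, alias))
        aliases[alias] = expanded
    return resolved

query = [
    ("clicks / impressions", "probability"),
    ("round(100 * probability, 1)", "percentage"),
]

# raw_data has no column named 'probability': the alias is inlined.
print(inline_lateral_aliases(query, {"clicks", "impressions"})[1][0])
# round(100 * (clicks / impressions), 1)

# raw_data does have a 'probability' column: the column takes priority.
print(inline_lateral_aliases(query, {"clicks", "impressions", "probability"})[1][0])
# round(100 * probability, 1)
```

The second call shows why the resolution strategy matters for compatibility: adding a column to the table silently changes what the second expression refers to.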
--
This message was sent by Atlassian Jira
(v8.20.10#820010)