[jira] [Created] (SPARK-27561) Support "lateral column alias references" to allow column aliases to be used within SELECT clauses

Josh Rosen (JIRA) Wed, 24 Apr 2019 16:07:27 -0700

Josh Rosen created SPARK-27561:
----------------------------------

             Summary: Support "lateral column alias references" to allow column aliases to be used within SELECT clauses
                 Key: SPARK-27561
                 URL: https://issues.apache.org/jira/browse/SPARK-27561
             Project: Spark
          Issue Type: New Feature
          Components: SQL
    Affects Versions: 2.4.0
            Reporter: Josh Rosen



Amazon Redshift has a feature called "lateral column alias references": 
[https://aws.amazon.com/about-aws/whats-new/2018/08/amazon-redshift-announces-support-for-lateral-column-alias-reference/].
Quoting from that blog post:
{quote}The support for lateral column alias reference enables you to write 
queries without repeating the same expressions in the SELECT list. For example, 
you can define the alias 'probability' and use it within the same select 
statement:
{code:sql}
select clicks / impressions as probability,
       round(100 * probability, 1) as percentage
from raw_data;
{code}
{quote}
There's more information about this feature at 
[https://docs.aws.amazon.com/redshift/latest/dg/r_SELECT_list.html]:
{quote}The benefit of the lateral alias reference is you don't need to repeat 
the aliased expression when building more complex expressions in the same 
target list. When Amazon Redshift parses this type of reference, it just 
inlines the previously defined aliases. If there is a column with the same name 
defined in the FROM clause as the previously aliased expression, the column in 
the FROM clause takes priority. For example, in the above query if there is a 
column named 'probability' in table raw_data, the 'probability' in the second 
expression in the target list will refer to that column instead of the alias 
name 'probability'.
{quote}
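To make the resolution rule quoted above concrete, here is a rough Python sketch of how the inlining could behave. This is only an illustration under my own assumptions, not Redshift's or Spark's actual implementation: the function name inline_lateral_aliases and its arguments are hypothetical, identifiers are matched with a naive regex (so string literals and quoted identifiers are not handled), and names are compared case-sensitively.

```python
import re

# Naive identifier pattern; a real analyzer would work on a parsed tree.
IDENT = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")


def inline_lateral_aliases(select_list, from_columns):
    """Inline references to earlier SELECT-list aliases.

    select_list  -- ordered list of (expression, alias) pairs
    from_columns -- set of column names exposed by the FROM clause

    Hypothetical helper for illustration only.
    """
    aliases = {}   # alias name -> already-resolved expression text
    resolved = []

    def substitute(match):
        name = match.group(0)
        # A FROM-clause column with the same name takes priority.
        if name in from_columns:
            return name
        # Otherwise fall back to an alias defined earlier in the list.
        if name in aliases:
            return "(" + aliases[name] + ")"
        # Function names, keywords, unknown names: leave untouched.
        return name

    for expr, alias in select_list:
        new_expr = IDENT.sub(substitute, expr)
        resolved.append((new_expr, alias))
        if alias is not None:
            aliases[alias] = new_expr
    return resolved


query = [
    ("clicks / impressions", "probability"),
    ("round(100 * probability, 1)", "percentage"),
]

# raw_data has no 'probability' column: the alias is inlined.
print(inline_lateral_aliases(query, {"clicks", "impressions"}))

# raw_data does have a 'probability' column: the column wins.
print(inline_lateral_aliases(query, {"clicks", "impressions", "probability"}))
```

In the first call the second expression becomes round(100 * (clicks / impressions), 1); in the second call it is left as round(100 * probability, 1) because the table column shadows the alias.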
It would be nice if Spark supported this syntax. I don't think that this is 
standard SQL, so it might be a good idea to research whether other SQL databases 
support similar syntax (and whether they implement the same column resolution 
strategy as Redshift).

One possibly-related existing ticket: SPARK-9338, which discusses the use of 
SELECT aliases in GROUP BY expressions.

/cc [~hvanhovell]



