[ 
https://issues.apache.org/jira/browse/SPARK-10034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan updated SPARK-10034:
--------------------------------
    Description: 
Before #8371, `Sort` on `Aggregate` had a bug: an aggregate expression could not 
be named `_aggOrdering`, and a query could not use more than one ordering 
expression containing an aggregate function. The cause is that the aggregate 
expression in `SortOrder` never got resolved: we aliased it with `_aggOrdering` 
and called `toAttribute`, which gives us an `UnresolvedAttribute`. So we were 
actually referencing the aggregate expression by name, not by exprId as we 
thought. If there was already an aggregate expression named `_aggOrdering`, or 
if more than one ordering expression contained aggregate functions, the names 
conflicted and the lookup by name failed.
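
The difference between the two lookups can be sketched with a toy model. This is only an illustration under stated assumptions: `NamedAgg`, `AggLookup`, and `Demo` are hypothetical names and are not Spark's Catalyst classes; the sketch only shows why a duplicate name breaks a name-based lookup while an id-based lookup is unaffected.

```scala
// Toy model only: these types are hypothetical and are not Spark's Catalyst classes.
final case class NamedAgg(exprId: Long, name: String, fn: String)

object AggLookup {
  // Name-based lookup: succeeds only when exactly one expression has the name.
  def byName(aggs: Seq[NamedAgg], name: String): Either[String, NamedAgg] =
    aggs.filter(_.name == name) match {
      case Seq(single) => Right(single)
      case Seq()       => Left(s"no aggregate named '$name'")
      case many        => Left(s"ambiguous name '$name': ${many.size} candidates")
    }

  // Id-based lookup: ids are unique, so a duplicate name cannot cause a conflict.
  def byExprId(aggs: Seq[NamedAgg], id: Long): Option[NamedAgg] =
    aggs.find(_.exprId == id)
}

object Demo {
  // A user aggregate that happens to be called `_aggOrdering`, plus the
  // ordering aggregate that was aliased under the same name.
  val userAgg  = NamedAgg(1L, "_aggOrdering", "max(j)")
  val orderAgg = NamedAgg(2L, "_aggOrdering", "sum(j)")
  val aggs     = Seq(userAgg, orderAgg)

  // The name is shared by two expressions, so this lookup cannot pick one.
  val nameResult = AggLookup.byName(aggs, "_aggOrdering")
  // The id identifies exactly one expression regardless of its name.
  val idResult   = AggLookup.byExprId(aggs, 2L)
}
```

In this model `Demo.nameResult` is a `Left` describing the ambiguity, while `Demo.idResult` finds the ordering aggregate, which mirrors the by-name versus by-exprId behavior described above.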

However, after #8371 was merged, the `SortOrder`s are guaranteed to be resolved 
and we always reference aggregate expressions by exprId. The bug no longer 
exists, and this PR adds regression tests for it.

  was:
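{code}
// Assumes a Spark SQL QueryTest suite with the SQL implicits and functions._ in scope.
val df = Seq(1 -> 2).toDF("i", "j")
// The user-defined alias collides with the internally used `_aggOrdering` name.
val query = df.groupBy('i)
  .agg(max('j).as("_aggOrdering"))
  .orderBy(sum('j))
checkAnswer(query, Row(1, 2))
{code}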


> add regression test for Sort on Aggregate
> -----------------------------------------
>
>                 Key: SPARK-10034
>                 URL: https://issues.apache.org/jira/browse/SPARK-10034
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Wenchen Fan


