[ https://issues.apache.org/jira/browse/SPARK-10034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wenchen Fan updated SPARK-10034:
--------------------------------
Description:
Before #8371, `Sort` on `Aggregate` had a bug: we couldn't use an aggregate
expression named `_aggOrdering`, and we couldn't use more than one ordering
expression containing aggregate functions. The reason for this bug is that
the aggregate expression in `SortOrder` never got resolved: we aliased it with
`_aggOrdering` and called `toAttribute`, which gave us an `UnresolvedAttribute`.
So we were actually referencing the aggregate expression by name, not by exprId
as we thought. If there was already an aggregate expression named
`_aggOrdering`, or if more than one ordering expression contained aggregate
functions, the names conflicted and the lookup by name failed.
However, after #8371 was merged, the `SortOrder`s are guaranteed to be resolved
and we always reference aggregate expressions by exprId. The bug no longer
exists, and this PR adds regression tests for it.
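To illustrate why the by-name reference was fragile, here is a minimal sketch in plain Scala. It is a toy model, not Catalyst code: `Named`, `resolveByName`, and `resolveById` are hypothetical names made up for this example, and the two entries stand in for a user aggregate called `_aggOrdering` and an ordering aggregate aliased with the same fixed name.

```scala
// Toy model only: none of these types are Spark's Catalyst classes.
object AggOrderingSketch {
  // A named expression with a unique id, loosely mirroring an alias and its exprId.
  final case class Named(name: String, exprId: Long, sql: String)

  // Name-based lookup: succeeds only when exactly one candidate carries the name.
  def resolveByName(name: String, output: Seq[Named]): Either[String, Named] =
    output.filter(_.name == name) match {
      case Seq(single) => Right(single)
      case Seq()       => Left(s"cannot resolve '$name'")
      case many        => Left(s"reference '$name' is ambiguous (${many.size} matches)")
    }

  // Id-based lookup: ids are unique, so identical names cannot collide.
  def resolveById(exprId: Long, output: Seq[Named]): Option[Named] =
    output.find(_.exprId == exprId)

  def main(args: Array[String]): Unit = {
    // Aggregate written by the user that happens to be named _aggOrdering.
    val userAgg = Named("_aggOrdering", 1L, "max(j)")
    // Aggregate taken from the ordering and aliased with the same fixed name.
    val orderAgg = Named("_aggOrdering", 2L, "sum(j)")
    val output = Seq(userAgg, orderAgg)

    println(resolveByName("_aggOrdering", output))  // name lookup: ambiguous
    println(resolveById(orderAgg.exprId, output))   // id lookup: finds sum(j)
  }
}
```

In this model the name lookup fails as soon as two entries share `_aggOrdering`, whether the second one comes from the user or from another ordering expression, while the id lookup still finds the intended entry.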
was:
{code}
val df = Seq(1 -> 2).toDF("i", "j")
val query = df.groupBy('i)
  .agg(max('j).as("_aggOrdering"))
  .orderBy(sum('j))
checkAnswer(query, Row(1, 2))
{code}
> add regression test for Sort on Aggregate
> -----------------------------------------
>
> Key: SPARK-10034
> URL: https://issues.apache.org/jira/browse/SPARK-10034
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Reporter: Wenchen Fan