[
https://issues.apache.org/jira/browse/SPARK-11329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15027246#comment-15027246
]
Yin Huai commented on SPARK-11329:
----------------------------------
Oh, sorry. Yes, you are right. Right now, when a struct type column is used as
the argument of max, the planner uses the SortBasedAggregate operator instead
of the TungstenAggregate operator. This is expected. We will improve this (i.e.,
use TungstenAggregate to handle complex-type aggregate function arguments) in
future versions.
> Expand Star when creating a struct
> ----------------------------------
>
> Key: SPARK-11329
> URL: https://issues.apache.org/jira/browse/SPARK-11329
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Reporter: Yin Huai
> Assignee: Nong Li
> Fix For: 1.6.0
>
>
> It is pretty common for customers to do regular extractions of update data
> from an external datasource (e.g. MySQL or PostgreSQL). While this is possible
> today, the syntax is a little onerous. With some small improvements to the
> analyzer, I think we could make this much easier.
> Goal: Allow users to execute the following two queries, as well as their
> DataFrame equivalents.
> To find the most recent record for each key:
> {{SELECT max(struct(timestamp, *)) as mostRecentRecord GROUP BY key}}
> To unnest the struct from above:
> {{SELECT mostRecentRecord.* FROM data}}
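The idea behind max(struct(timestamp, *)) can be illustrated outside Spark. The sketch below is plain Python, not the Spark API. It assumes that structs are ordered field by field from left to right, which Python tuples also do, so putting the timestamp first makes max pick the latest row for each key. The sample rows and the helper name most_recent_by_key are hypothetical and only serve the illustration.

```python
# Hypothetical sample rows in the form (key, timestamp, value).
rows = [
    ("a", 1, "old"),
    ("a", 3, "new"),
    ("b", 2, "only"),
    ("a", 2, "mid"),
]


def most_recent_by_key(rows):
    """Return, per key, the largest (timestamp, *row) tuple.

    Tuples compare lexicographically, so the leading timestamp decides
    the ordering, analogous to max(struct(timestamp, *)) ... GROUP BY key.
    """
    best = {}
    for row in rows:
        key, timestamp, _value = row
        record = (timestamp, *row)  # timestamp first, then every column
        if key not in best or record > best[key]:
            best[key] = record
    return best


most_recent = most_recent_by_key(rows)

# "Unnest" the record again, analogous to SELECT mostRecentRecord.*
for key, record in sorted(most_recent.items()):
    print(key, record)
```

Rows with equal timestamps are ordered by the remaining fields, so the result stays deterministic as long as all fields are comparable.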