[ 
https://issues.apache.org/jira/browse/SPARK-7712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551957#comment-14551957
 ] 

Apache Spark commented on SPARK-7712:
-------------------------------------

User 'hvanhovell' has created a pull request for this issue:
https://github.com/apache/spark/pull/6278

> Native Spark Window Functions & Performance Improvements 
> ---------------------------------------------------------
>
>                 Key: SPARK-7712
>                 URL: https://issues.apache.org/jira/browse/SPARK-7712
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 1.4.0
>            Reporter: Herman van Hovell tot Westerflier
>             Fix For: 1.5.0
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Hi All,
> After playing with the current Spark window implementation, I tried to take 
> this to the next level. My main goal is/was to address the following issues:
> - Native Spark SQL: the current implementation relies only on Hive UDAFs. The 
> improved implementation uses Spark SQL Aggregates. Hive UDAFs are still 
> supported though.
> - Much better performance (10x) in running cases (e.g. BETWEEN UNBOUNDED 
> PRECEDING AND CURRENT ROW) and UNBOUNDED FOLLOWING cases.
> - Increased optimization opportunities. AggregateEvaluation-style 
> optimization should be possible for in-frame processing. Tungsten might also 
> provide interesting optimization opportunities.
> The current work is available at the following location: 
> https://github.com/hvanhovell/spark-window
> I will try to turn this into a PR in the next couple of days. Meanwhile, 
> comments, feedback and other discussion are much appreciated.
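
As a rough illustration of why the "running" frame mentioned in the description (BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) leaves room for a large speedup, here is a minimal sketch in plain Python. It is not taken from the pull request or from Spark; the function names are hypothetical, and it assumes a single partition that is already sorted and a SUM aggregate. Recomputing the aggregate over the whole frame for every row costs O(n^2) per partition, while carrying the aggregate forward from row to row costs O(n).

```python
from typing import List


def running_sum_naive(values: List[int]) -> List[int]:
    """Re-aggregate the full frame [0, i] for every row i: O(n^2) work."""
    return [sum(values[: i + 1]) for i in range(len(values))]


def running_sum_incremental(values: List[int]) -> List[int]:
    """Carry the aggregate forward, adding only the new row: O(n) work."""
    out: List[int] = []
    acc = 0  # aggregate over all rows seen so far in this partition
    for v in values:
        acc += v  # the frame grows by exactly one row
        out.append(acc)
    return out


if __name__ == "__main__":
    partition = [1, 2, 3, 4]
    # Both strategies produce the same per-row result.
    print(running_sum_naive(partition))
    print(running_sum_incremental(partition))
```

Both functions return the same values; only the amount of repeated work differs. Under the same assumptions, an UNBOUNDED FOLLOWING frame could be handled symmetrically by walking the partition in reverse.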



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
