[ https://issues.apache.org/jira/browse/SPARK-7712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551957#comment-14551957 ]
Apache Spark commented on SPARK-7712:
-------------------------------------

User 'hvanhovell' has created a pull request for this issue:
https://github.com/apache/spark/pull/6278

> Native Spark Window Functions & Performance Improvements
> ---------------------------------------------------------
>
> Key: SPARK-7712
> URL: https://issues.apache.org/jira/browse/SPARK-7712
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 1.4.0
> Reporter: Herman van Hovell tot Westerflier
> Fix For: 1.5.0
>
> Original Estimate: 336h
> Remaining Estimate: 336h
>
> Hi All,
> After playing with the current Spark window implementation, I tried to take
> this to the next level. My main goal is/was to address the following issues:
> - Native Spark SQL: the current implementation relies only on Hive UDAFs. The
>   improved implementation uses Spark SQL Aggregates. Hive UDAFs are still
>   supported though.
> - Much better performance (10x) in running cases (e.g. BETWEEN UNBOUNDED
>   PRECEDING AND CURRENT ROW) and UNBOUNDED FOLLOWING cases.
> - Increased optimization opportunities. AggregateEvaluation-style
>   optimization should be possible for in-frame processing. Tungsten might also
>   provide interesting optimization opportunities.
> The current work is available at the following location:
> https://github.com/hvanhovell/spark-window
> I will try to turn this into a PR in the next couple of days. Meanwhile,
> comments, feedback and other discussion are much appreciated.
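
For readers unfamiliar with the "running" case mentioned in the description, the following is a minimal sketch of why a frame such as BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW leaves room for a large speedup. It is plain Python, not Spark code, and it is not taken from the pull request; the function names are hypothetical and a sum stands in for an arbitrary aggregate. The only assumption is that the rows of one partition are already sorted in window order.

```python
from typing import List


def running_sum_naive(values: List[float]) -> List[float]:
    """Re-aggregate the whole frame for every row.

    For row i the frame is values[0..i], so a partition of n rows
    costs on the order of n^2 additions.
    """
    return [sum(values[: i + 1]) for i in range(len(values))]


def running_sum_incremental(values: List[float]) -> List[float]:
    """Reuse the previous row's aggregate.

    Moving from row i to row i+1 adds exactly one row to the frame,
    so the aggregate can be updated in place: n additions in total.
    """
    out: List[float] = []
    total = 0.0
    for v in values:
        total += v  # the frame grows by one row; nothing is recomputed
        out.append(total)
    return out


if __name__ == "__main__":
    partition = [1.0, 2.0, 3.0, 4.0]
    print(running_sum_naive(partition))
    print(running_sum_incremental(partition))
```

Both functions return the same values; only the amount of work differs. Whether the actual implementation achieves its gains this way is not stated in the message, so this should be read as an illustration of the general idea only.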