[jira] [Updated] (SPARK-8638) Window Function Performance Improvements

Herman van Hovell tot Westerflier (JIRA) Fri, 03 Jul 2015 09:24:35 -0700

     [ 
https://issues.apache.org/jira/browse/SPARK-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Herman van Hovell tot Westerflier updated SPARK-8638:
-----------------------------------------------------
    Description: 
Improve the performance of Spark Window Functions in the following cases:
# Much better performance (10x) in the running case (e.g. BETWEEN UNBOUNDED 
PRECEDING AND CURRENT ROW). The current implementation in spark uses a sliding 
window approach in these cases. This means that an aggregate is maintained for 
every row, so space usage is N (N being the number of rows). This also means 
that all these aggregates all need to be updated separately, this takes 
N*(N-1)/2 updates. The running case differs from the Sliding case because we 
are only adding data to an aggregate function (no reset is required), we only 
need to maintain one aggregate (like in the UNBOUNDED PRECEDING AND UNBOUNDED 
case), update the aggregate for each row, and get the aggregate value after 
each update. This is what the new implementation does. This approach only uses 
1 buffer, and only requires N updates; I am currently working on data with 
window sizes of 500-1000 doing running sums and this saves a lot of time.
#. Fewer comparisons in the sliding case. The current implementation determines 
frame boundaries for every input row. The new implementation makes more use of 
the fact that the window is sorted, maintains the boundaries, and only moves 
them when the current row order changes. This is a minor improvement.
# A single Window node is able to process all types of Frames for the same 
Partitioning/Ordering. This saves a little time/memory spent buffering and 
managing partitions. This will be enabled in a follow-up PR.
# A lot of the staging code is moved from the execution phase to the 
initialization phase. Minor performance improvement, and improves readability 
of the execution code.

The attached perf_test.scala file contains s number of queries which can be 
used to measure the differences between the current and the proposed window 
function implementation. In the tests the new implementation outperforms the 
current implementation by a factor 7x in sliding window cases, and by a factor 
14x in the running window cases.

  was:
Improve the performance of Spark Window Functions in the following cases:
# Much better performance (10x) in running cases (e.g. BETWEEN UNBOUNDED 
PRECEDING AND CURRENT ROW) and UNBOUDED FOLLOWING cases. The current 
implementation in spark uses a sliding window approach in these cases. This 
means that an aggregate is maintained for every row, so space usage is N (N 
being the number of rows). This also means that all these aggregates all need 
to be updated separately, this takes N*(N-1)/2 updates. The running case 
differs from the Sliding case because we are only adding data to an aggregate 
function (no reset is required), we only need to maintain one aggregate (like 
in the UNBOUNDED PRECEDING AND UNBOUNDED case), update the aggregate for each 
row, and get the aggregate value after each update. This is what the new 
implementation does. This approach only uses 1 buffer, and only requires N 
updates; I am currently working on data with window sizes of 500-1000 doing 
running sums and this saves a lot of time. The CURRENT ROW AND UNBOUNDED 
FOLLOWING case also uses this approach and the fact that aggregate operations 
are communitative, there is one twist though it will process the input buffer 
in reverse.
#. Fewer comparisons in the sliding case. The current implementation determines 
frame boundaries for every input row. The new implementation makes more use of 
the fact that the window is sorted, maintains the boundaries, and only moves 
them when the current row order changes. This is a minor improvement.
# A single Window node is able to process all types of Frames for the same 
Partitioning/Ordering. This saves a little time/memory spent buffering and 
managing partitions. This will be enabled in a follow-up PR.
# A lot of the staging code is moved from the execution phase to the 
initialization phase. Minor performance improvement, and improves readability 
of the execution code.

The attached perf_test.scala file contains s number of queries which can be 
used to measure the differences between the current and the proposed window 
function implementation. In the tests the new implementation outperforms the 
current implementation by a factor 7x in sliding window cases, and by a factor 
14x in the running window cases.


> Window Function Performance Improvements
> ----------------------------------------
>
>                 Key: SPARK-8638
>                 URL: https://issues.apache.org/jira/browse/SPARK-8638
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Herman van Hovell tot Westerflier
>         Attachments: perf_test.scala
>
>
> Improve the performance of Spark Window Functions in the following cases:
> # Much better performance (10x) in the running case (e.g. BETWEEN UNBOUNDED 
> PRECEDING AND CURRENT ROW). The current implementation in spark uses a 
> sliding window approach in these cases. This means that an aggregate is 
> maintained for every row, so space usage is N (N being the number of rows). 
> This also means that all these aggregates all need to be updated separately, 
> this takes N*(N-1)/2 updates. The running case differs from the Sliding case 
> because we are only adding data to an aggregate function (no reset is 
> required), we only need to maintain one aggregate (like in the UNBOUNDED 
> PRECEDING AND UNBOUNDED case), update the aggregate for each row, and get the 
> aggregate value after each update. This is what the new implementation does. 
> This approach only uses 1 buffer, and only requires N updates; I am currently 
> working on data with window sizes of 500-1000 doing running sums and this 
> saves a lot of time.
> #. Fewer comparisons in the sliding case. The current implementation 
> determines frame boundaries for every input row. The new implementation makes 
> more use of the fact that the window is sorted, maintains the boundaries, and 
> only moves them when the current row order changes. This is a minor 
> improvement.
> # A single Window node is able to process all types of Frames for the same 
> Partitioning/Ordering. This saves a little time/memory spent buffering and 
> managing partitions. This will be enabled in a follow-up PR.
> # A lot of the staging code is moved from the execution phase to the 
> initialization phase. Minor performance improvement, and improves readability 
> of the execution code.
> The attached perf_test.scala file contains s number of queries which can be 
> used to measure the differences between the current and the proposed window 
> function implementation. In the tests the new implementation outperforms the 
> current implementation by a factor 7x in sliding window cases, and by a 
> factor 14x in the running window cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Updated] (SPARK-8638) Window Function Performance Improvements

Reply via email to