Deneche A. Hakim created DRILL-3952:
---------------------------------------
Summary: Improve Window Functions performance when not all batches
are required to process the current batch
Key: DRILL-3952
URL: https://issues.apache.org/jira/browse/DRILL-3952
Project: Apache Drill
Issue Type: Bug
Components: Execution - Relational Operators
Affects Versions: 1.2.0
Reporter: Deneche A. Hakim
Assignee: Deneche A. Hakim
Fix For: 1.3.0
Currently, the window operator blocks until all batches of the current partition
are available. For some queries this is necessary (e.g. an aggregate with no
order-by in the window definition), but in other cases the window operator can
process the current batch and pass it downstream sooner.
Implementing this should help the window operator use less memory and run
faster, especially in the presence of a limit operator.
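To illustrate why releasing batches early matters under a limit, here is a
minimal Python sketch. It is not Drill code: the generator-based operators, the
upstream source and every name in it are hypothetical, and it emits rows rather
than whole batches for brevity. It compares a blocking row_number with a
streaming one over a single partition and counts how many upstream batches each
variant reads before a downstream limit of 3 rows is satisfied.

```python
from itertools import islice


def upstream(num_batches, batch_size, counter):
    """Hypothetical source: yields the batches of one partition, counting reads."""
    for b in range(num_batches):
        counter["read"] += 1
        yield [f"row-{b}-{i}" for i in range(batch_size)]


def row_number_blocking(batches):
    """Buffers every batch of the partition before emitting anything."""
    buffered = list(batches)  # forces all upstream batches to be read
    n = 0
    for batch in buffered:
        for row in batch:
            n += 1
            yield (row, n)


def row_number_streaming(batches):
    """Emits rows as each batch arrives; row_number needs no lookahead."""
    n = 0
    for batch in batches:
        for row in batch:
            n += 1
            yield (row, n)


def run_with_limit(operator, limit, num_batches=10, batch_size=4):
    """Applies a downstream limit and reports how many batches were read."""
    counter = {"read": 0}
    source = upstream(num_batches, batch_size, counter)
    rows = list(islice(operator(source), limit))
    return rows, counter["read"]


blocking_rows, blocking_reads = run_with_limit(row_number_blocking, 3)
streaming_rows, streaming_reads = run_with_limit(row_number_streaming, 3)
print(blocking_reads, streaming_reads)  # prints "10 1"
```

Both variants return the same three rows, but the blocking one reads and
buffers all 10 batches while the streaming one stops after the first.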
The purpose of this JIRA is to improve the window operator in the following
cases:
- aggregate, when the window definition has an order-by clause, can process
the current batch as soon as it receives the last peer row
- lead can process the current batch as soon as it receives 1 more batch
- lag can process the current batch immediately
- first_value can process the current batch immediately
- last_value, when the window definition has an order-by clause, can process
the current batch as soon as it receives the last peer row
- row_number, rank and dense_rank can process the current batch immediately
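
The cases above can be summarised as a small readiness check. The following
Python sketch is only an illustration, not the operator's actual logic: the
function name, its parameters and the two set constants are hypothetical. It
also makes two assumptions that the list does not state: any function can
proceed once the whole partition has arrived, and last_value without an
order-by waits for the whole partition just like an aggregate without one.

```python
# Functions that never need rows beyond the current batch.
IMMEDIATE = {"lag", "first_value", "row_number", "rank", "dense_rank"}
# Functions that need the last peer row when an order-by is present.
PEER_BOUND = {"aggregate", "last_value"}


def can_process_current_batch(function, has_order_by, *,
                              partition_complete=False,
                              last_peer_received=False,
                              following_batches=0):
    """Return True if the current batch can be processed and sent downstream.

    All parameters are hypothetical stand-ins for operator state.
    """
    if function not in IMMEDIATE | PEER_BOUND | {"lead"}:
        raise ValueError(f"unknown window function: {function}")
    if partition_complete:
        # Assumption: nothing is left to wait for once the partition is complete.
        return True
    if function in IMMEDIATE:
        return True
    if function == "lead":
        # lead looks one row ahead, so one extra batch is enough.
        return following_batches >= 1
    # aggregate / last_value: only the order-by case can finish early;
    # without an order-by the whole partition is required.
    return has_order_by and last_peer_received


print(can_process_current_batch("lag", has_order_by=True))
print(can_process_current_batch("lead", has_order_by=True, following_batches=1))
print(can_process_current_batch("aggregate", has_order_by=False,
                                last_peer_received=True))
```

Under these assumptions the three calls print True, True and False: the
aggregate without an order-by still has to wait for the rest of the partition.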