GitHub user sirpkt opened a pull request:

    https://github.com/apache/tajo/pull/454

    TAJO-1415: Window frame support

    This patch supports both ROWS and RANGE window frames.
    
    Cases where a window frame is not applied:
    - no ORDER BY clause is used
    - built-in window functions that do not support window frames:
      row_number, rank, dense_rank, percent_rank, cume_dist, tile, lag, lead
    
    Cases where a window frame should be applied:
    - the other built-in window functions: first_value, last_value, nth_value
    - normal aggregation functions
    
    Based on the above, this patch distinguishes three types of window
    function:
     1. built-in window functions without window frame support
     2. built-in window functions with window frame support
     3. normal aggregation functions used as window functions, for which
        window frames must be supported
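
    As an illustration only, the three function types and the ORDER BY rule
    above could be sketched as follows. The enum, the helper names, and the
    set constants are hypothetical and are not taken from the patch; only the
    function names listed above come from this description.

```python
from enum import Enum

# Function names copied from the lists in this description.
NO_FRAME_BUILTINS = {"row_number", "rank", "dense_rank", "percent_rank",
                     "cume_dist", "tile", "lag", "lead"}
FRAME_BUILTINS = {"first_value", "last_value", "nth_value"}


class WindowFunctionType(Enum):  # hypothetical name
    BUILTIN_WITHOUT_FRAME = 1
    BUILTIN_WITH_FRAME = 2
    AGGREGATION = 3


def classify_function(name: str) -> WindowFunctionType:
    """Map a function name to one of the three types described above."""
    lowered = name.lower()
    if lowered in NO_FRAME_BUILTINS:
        return WindowFunctionType.BUILTIN_WITHOUT_FRAME
    if lowered in FRAME_BUILTINS:
        return WindowFunctionType.BUILTIN_WITH_FRAME
    # Assumption: anything else is a normal aggregation used as a window function.
    return WindowFunctionType.AGGREGATION


def frame_applies(name: str, has_order_by: bool) -> bool:
    """A frame is applied only with ORDER BY and a frame-aware function."""
    if not has_order_by:
        return False
    return classify_function(name) is not WindowFunctionType.BUILTIN_WITHOUT_FRAME
```

    For example, frame_applies("sum", has_order_by=True) is true, while
    frame_applies("rank", has_order_by=True) is false.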
    
    It further distinguishes four types of window frame:
     1. the entire partition
     2. from the start of the partition to a moving end point relative to
        the current row
     3. from a moving start point relative to the current row to the end of
        the partition
     4. a sliding frame whose start and end both move with the current row
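
    A minimal sketch of how a frame specification might be mapped to these
    four cases. The representation is an assumption made for illustration
    (None stands for UNBOUNDED, an integer is an offset relative to the
    current row), and the names are hypothetical rather than those used in
    the patch.

```python
from enum import Enum
from typing import Optional


class FrameType(Enum):  # hypothetical name
    ENTIRE_PARTITION = 1       # UNBOUNDED PRECEDING .. UNBOUNDED FOLLOWING
    TO_MOVING_END = 2          # UNBOUNDED PRECEDING .. offset from current row
    FROM_MOVING_START = 3      # offset from current row .. UNBOUNDED FOLLOWING
    SLIDING = 4                # both ends move with the current row


def classify_frame(start: Optional[int], end: Optional[int]) -> FrameType:
    """Classify a frame; None means unbounded, 0 means CURRENT ROW,
    negative means PRECEDING and positive means FOLLOWING (assumed encoding)."""
    if start is None and end is None:
        return FrameType.ENTIRE_PARTITION
    if start is None:
        return FrameType.TO_MOVING_END
    if end is None:
        return FrameType.FROM_MOVING_START
    return FrameType.SLIDING
```

    Under this encoding, ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING would be
    classify_frame(-1, 1), which falls into the sliding case.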
    
    Case 1 is handled the same way window functions were handled before.
    Case 2 is handled by incremental termination of the aggregation function:
    merge() and terminate() of the given function are called for every row.
    Case 3 is handled almost the same as case 2, except that rows are fed to
    the function from the end of the partition toward the start of the frame,
    i.e., in reverse order.
    Case 4 is handled with a two-pass approach: for each row, a small loop
    feeds the rows of its frame to the function to compute that row's value.
    I think this is inevitable, since aggregation functions do not support
    sliding window aggregation.
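
    The three strategies for cases 2 to 4 can be sketched for a ROWS frame
    over one partition as below. This is an illustration under assumptions,
    not the WindowAggExec code: the aggregation is modelled as a two-argument
    merge callable, offsets are relative to the current row (negative for
    PRECEDING), an empty frame yields None, and the reverse-order trick in
    case 3 assumes the merge is commutative.

```python
from typing import Callable, List, Optional, Sequence

Merge = Callable[[float, float], float]


def case2_to_moving_end(values: Sequence[float], end: int,
                        merge: Merge) -> List[Optional[float]]:
    """Frame: start of partition .. (current row + end), fed incrementally."""
    out: List[Optional[float]] = []
    state: Optional[float] = None
    fed = 0  # number of rows already merged into the running state
    n = len(values)
    for i in range(n):
        last = min(i + end, n - 1)
        while fed <= last:
            state = values[fed] if state is None else merge(state, values[fed])
            fed += 1
        out.append(state)  # "terminate" the running state for this row
    return out


def case3_from_moving_start(values: Sequence[float], start: int,
                            merge: Merge) -> List[Optional[float]]:
    """Frame: (current row + start) .. end of partition, fed in reverse."""
    reversed_out = case2_to_moving_end(list(reversed(values)), -start, merge)
    return list(reversed(reversed_out))


def case4_sliding(values: Sequence[float], start: int, end: int,
                  merge: Merge) -> List[Optional[float]]:
    """Frame: (current row + start) .. (current row + end), recomputed per row."""
    n = len(values)
    out: List[Optional[float]] = []
    for i in range(n):
        lo, hi = max(i + start, 0), min(i + end, n - 1)
        state: Optional[float] = None
        for k in range(lo, hi + 1):  # inner loop over the rows of the frame
            state = values[k] if state is None else merge(state, values[k])
        out.append(state)
    return out
```

    Cases 2 and 3 touch each row once per partition, while case 4 costs one
    pass over the frame per row, which is the price of not having a sliding
    (retractable) aggregation interface.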
    
    All of the above is implemented for ROWS first and then extended to
    RANGE by including rows that have the same ORDER BY value as the current
    row in the computation of the window function.
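
    To illustrate the difference, the sketch below computes a running sum
    with a frame ending at CURRENT ROW, extended the RANGE way so that the
    frame also covers every peer of the current row (rows with an equal
    ORDER BY value). The helper names are hypothetical, and the input is
    assumed to be one partition already sorted by the ORDER BY key.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple


def peer_bounds(keys: Sequence[int]) -> List[Tuple[int, int]]:
    """For each row, return (first_peer_index, last_peer_index) in sorted keys."""
    n = len(keys)
    bounds: List[Tuple[int, int]] = [(0, 0)] * n
    group_start = 0
    for i in range(1, n + 1):
        # Close the current peer group when the key changes or input ends.
        if i == n or keys[i] != keys[group_start]:
            for k in range(group_start, i):
                bounds[k] = (group_start, i - 1)
            group_start = i
    return bounds


def range_running_sum(keys: Sequence[int], values: Sequence[int]) -> List[int]:
    """RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW as a running sum."""
    prefix = list(accumulate(values))
    # The frame ends at the last peer of the current row, not at the row itself.
    return [prefix[last] for _, last in peer_bounds(keys)]
```

    With keys [1, 2, 2, 3] and values [10, 20, 30, 40], the ROWS result
    would be [10, 30, 60, 100], whereas range_running_sum gives
    [10, 60, 60, 100] because the two rows with key 2 share one frame.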
    
    This patch includes the following changes:
    - the parser can handle integer offsets with PRECEDING and FOLLOWING
    - ExprAnnotator reflects window frame information in WindowFunctionEval,
      including default value handling
    - WindowAggExec can handle ROWS and RANGE with window frame support
    - parameter checking is added to the parser and ExprAnnotator
    - last_value is re-implemented as a WindowAggFunc, and the first_value
      implementation becomes simpler
    - window-related classes in tajo-plan get the new prefix 'Logical' to
      distinguish them from the same-named classes in tajo-algebra
    - plan.proto is modified with data structures that distinguish function
      types and frame types
    - test cases for window frames are added


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sirpkt/tajo TAJO-1415

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tajo/pull/454.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #454
    
----
commit 3e2cfabb8513ae730cdef7c42347a5730df47988
Author: Keuntae Park <[email protected]>
Date:   2015-03-22T13:38:30Z

    window frame ROWS support is added, RANGE is not supported yet

commit ddd6797d3c029ec984e0e3f99eb68755ee05261f
Author: Keuntae Park <[email protected]>
Date:   2015-03-22T13:39:56Z

    Merge remote-tracking branch 'upstream/master' into TAJO-1415

commit ddc7c1b2d1a2c8ad8cc5bee8f8b6141a13116973
Author: Keuntae Park <[email protected]>
Date:   2015-03-23T00:39:09Z

    bug fix during master merge

commit aa97dbba3b055cc598667c5c167607b62cf64de3
Author: Keuntae Park <[email protected]>
Date:   2015-03-23T06:56:56Z

    support for RANGE window frame

commit 7b21415dfc2e67508d4cca192aa69e6f3bede68d
Author: Keuntae Park <[email protected]>
Date:   2015-03-23T06:57:11Z

    Merge remote-tracking branch 'upstream/master' into TAJO-1415

commit 973d99fd33f819387dc33f29b775f02ede860198
Author: Keuntae Park <[email protected]>
Date:   2015-03-23T07:56:34Z

    Fix bug for no order by case, where window function SHOULD work on the
    entire partition

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
