[ 
https://issues.apache.org/jira/browse/CALCITE-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alessandro Solimando updated CALCITE-7376:
------------------------------------------
    Description: 
Calcite currently supports ROWS and RANGE window frame types, but does not 
support the GROUPS frame type.

The GROUPS frame type counts peer groups (rows with equal values in the ORDER 
BY columns) rather than individual rows or value ranges. For example:
{code:java}
SELECT
  department_id,
  salary,
  SUM(salary) OVER (
    PARTITION BY department_id
    ORDER BY salary
    GROUPS BETWEEN 1 PRECEDING AND 1 FOLLOWING
  ) as group_sum
FROM employees
{code}
 
This would include the current row's peer group plus one peer group before and 
one after.

This differs from:
 * ROWS: counts individual physical rows
 * RANGE: counts by value offset from the current row's value

h2. Current Behavior

Parsing a query with GROUPS fails:
{code:java}
SELECT SUM(x) OVER (ORDER BY y GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW) FROM 
t
{code}
 
Error:
{noformat}
Encountered "GROUPS" at line 1, column 32.
  Was expecting one of:
      "ALLOW" ...
      "ASC" ...
      "DESC" ...
      "DISALLOW" ...
      "NULLS" ...
      "RANGE" ...
      "ROWS" ...
      ")" ...
      ...
{noformat}
h2. Implementation Notes

The _WindowSpecification()_ rule 
([Parser.jj#L2881-L2885|https://github.com/apache/calcite/blob/1a4b00b77a1cff4f8dd3f16834a743b209f02659/core/src/main/codegen/templates/Parser.jj#L2881-L2885])
 only handles ROWS and RANGE:
{noformat}
(
    <ROWS> { isRows = SqlLiteral.createBoolean(true, getPos()); }
|
    <RANGE> { isRows = SqlLiteral.createBoolean(false, getPos()); }
)
{noformat}
  Key areas requiring changes:
 * SqlWindow.java - deprecated the _isRows_ boolean field and add an enum-based 
representation (e.g., FrameType \{ ROWS, RANGE, GROUPS }), to keep backwards 
compatibility
 * Parser.jj - extend WindowSpecification() to parse GROUPS
 * SqlToRelConverter - handle GROUPS in window rel conversion
 * Window.java (rel core) - change Window.Group.isRows boolean to enum
 * EnumerableWindow.java - implement GROUPS execution logic

PostgreSQL documentation can be a useful reference for implementing this 
feature: 
[https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS]

  was:
Calcite currently supports ROWS and RANGE window frame types, but does not 
support the GROUPS frame type.

The GROUPS frame type counts peer groups (rows with equal values in the ORDER 
BY columns) rather than individual rows or value ranges. For example:
{code:java}
SELECT
  department_id,
  salary,
  SUM(salary) OVER (
    PARTITION BY department_id
    ORDER BY salary
    GROUPS BETWEEN 1 PRECEDING AND 1 FOLLOWING
  ) as group_sum
FROM employees
{code}
 
This would include the current row's peer group plus one peer group before and 
one after.

This differs from:
 * ROWS: counts individual physical rows
 * RANGE: counts by value offset from the current row's value

h2. Current Behavior

Parsing a query with GROUPS fails:
{code:java}
SELECT SUM(x) OVER (ORDER BY y GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW) FROM 
t
{code}
 
Error:
{noformat}
Encountered "GROUPS" at line 1, column 32.
  Was expecting one of:
      "ALLOW" ...
      "ASC" ...
      "DESC" ...
      "DISALLOW" ...
      "NULLS" ...
      "RANGE" ...
      "ROWS" ...
      ")" ...
      ...
{noformat}
h2. Implementation Notes

The _WindowSpecification()_ rule 
([Parser.jj#L2881-L2885|https://github.com/apache/calcite/blob/1a4b00b77a1cff4f8dd3f16834a743b209f02659/core/src/main/codegen/templates/Parser.jj#L2881-L2885])
 only handles ROWS and RANGE:
{noformat}
(
    <ROWS> { isRows = SqlLiteral.createBoolean(true, getPos()); }
|
    <RANGE> { isRows = SqlLiteral.createBoolean(false, getPos()); }
)
{noformat}
  Key areas requiring changes:
 * SqlWindow.java - change isRows boolean to an enum-based representation 
(e.g., FrameType \{ ROWS, RANGE, GROUPS })
 * Parser.jj - extend WindowSpecification() to parse GROUPS
 * SqlToRelConverter - handle GROUPS in window rel conversion
 * Window.java (rel core) - change Window.Group.isRows boolean to enum
 * EnumerableWindow.java - implement GROUPS execution logic

PostgreSQL documentation can be a useful reference for implementing this 
feature: 
[https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS]


> Support GROUPS window frame type in window specifications
> ---------------------------------------------------------
>
>                 Key: CALCITE-7376
>                 URL: https://issues.apache.org/jira/browse/CALCITE-7376
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 1.41.0
>            Reporter: Alessandro Solimando
>            Priority: Major
>
> Calcite currently supports ROWS and RANGE window frame types, but does not 
> support the GROUPS frame type.
> The GROUPS frame type counts peer groups (rows with equal values in the ORDER 
> BY columns) rather than individual rows or value ranges. For example:
> {code:java}
> SELECT
>   department_id,
>   salary,
>   SUM(salary) OVER (
>     PARTITION BY department_id
>     ORDER BY salary
>     GROUPS BETWEEN 1 PRECEDING AND 1 FOLLOWING
>   ) as group_sum
> FROM employees
> {code}
>  
> This would include the current row's peer group plus one peer group before 
> and one after.
> This differs from:
>  * ROWS: counts individual physical rows
>  * RANGE: counts by value offset from the current row's value
> h2. Current Behavior
> Parsing a query with GROUPS fails:
> {code:java}
> SELECT SUM(x) OVER (ORDER BY y GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW) 
> FROM t
> {code}
>  
> Error:
> {noformat}
> Encountered "GROUPS" at line 1, column 32.
>   Was expecting one of:
>       "ALLOW" ...
>       "ASC" ...
>       "DESC" ...
>       "DISALLOW" ...
>       "NULLS" ...
>       "RANGE" ...
>       "ROWS" ...
>       ")" ...
>       ...
> {noformat}
> h2. Implementation Notes
> The _WindowSpecification()_ rule 
> ([Parser.jj#L2881-L2885|https://github.com/apache/calcite/blob/1a4b00b77a1cff4f8dd3f16834a743b209f02659/core/src/main/codegen/templates/Parser.jj#L2881-L2885])
>  only handles ROWS and RANGE:
> {noformat}
> (
>     <ROWS> { isRows = SqlLiteral.createBoolean(true, getPos()); }
> |
>     <RANGE> { isRows = SqlLiteral.createBoolean(false, getPos()); }
> )
> {noformat}
>   Key areas requiring changes:
>  * SqlWindow.java - deprecated the _isRows_ boolean field and add an 
> enum-based representation (e.g., FrameType \{ ROWS, RANGE, GROUPS }), to keep 
> backwards compatibility
>  * Parser.jj - extend WindowSpecification() to parse GROUPS
>  * SqlToRelConverter - handle GROUPS in window rel conversion
>  * Window.java (rel core) - change Window.Group.isRows boolean to enum
>  * EnumerableWindow.java - implement GROUPS execution logic
> PostgreSQL documentation can be a useful reference for implementing this 
> feature: 
> [https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to