[
https://issues.apache.org/jira/browse/CALCITE-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alessandro Solimando updated CALCITE-7376:
------------------------------------------
Description:
Calcite currently supports ROWS and RANGE window frame types, but does not
support the GROUPS frame type.
The GROUPS frame type counts peer groups (rows with equal values in the ORDER
BY columns) rather than individual rows or value ranges. For example:
{code:java}
SELECT
department_id,
salary,
SUM(salary) OVER (
PARTITION BY department_id
ORDER BY salary
GROUPS BETWEEN 1 PRECEDING AND 1 FOLLOWING
) as group_sum
FROM employees
{code}
This would include the current row's peer group plus one peer group before and
one after.
This differs from:
* ROWS: counts individual physical rows
* RANGE: counts by value offset from the current row's value
h2. Current Behavior
Parsing a query with GROUPS fails:
{code:java}
SELECT SUM(x) OVER (ORDER BY y GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW) FROM
t
{code}
Error:
{noformat}
Encountered "GROUPS" at line 1, column 32.
Was expecting one of:
"ALLOW" ...
"ASC" ...
"DESC" ...
"DISALLOW" ...
"NULLS" ...
"RANGE" ...
"ROWS" ...
")" ...
...
{noformat}
h2. Implementation Notes
The _WindowSpecification()_ rule
([Parser.jj#L2881-L2885|https://github.com/apache/calcite/blob/1a4b00b77a1cff4f8dd3f16834a743b209f02659/core/src/main/codegen/templates/Parser.jj#L2881-L2885])
only handles ROWS and RANGE:
{noformat}
(
<ROWS> { isRows = SqlLiteral.createBoolean(true, getPos()); }
|
<RANGE> { isRows = SqlLiteral.createBoolean(false, getPos()); }
)
{noformat}
Key areas requiring changes:
* SqlWindow.java - deprecated the _isRows_ boolean field and add an enum-based
representation (e.g., FrameType \{ ROWS, RANGE, GROUPS }), to keep backwards
compatibility
* Parser.jj - extend WindowSpecification() to parse GROUPS
* SqlToRelConverter - handle GROUPS in window rel conversion
* Window.java (rel core) - change Window.Group.isRows boolean to enum
* EnumerableWindow.java - implement GROUPS execution logic
PostgreSQL documentation can be a useful reference for implementing this
feature:
[https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS]
was:
Calcite currently supports ROWS and RANGE window frame types, but does not
support the GROUPS frame type.
The GROUPS frame type counts peer groups (rows with equal values in the ORDER
BY columns) rather than individual rows or value ranges. For example:
{code:java}
SELECT
department_id,
salary,
SUM(salary) OVER (
PARTITION BY department_id
ORDER BY salary
GROUPS BETWEEN 1 PRECEDING AND 1 FOLLOWING
) as group_sum
FROM employees
{code}
This would include the current row's peer group plus one peer group before and
one after.
This differs from:
* ROWS: counts individual physical rows
* RANGE: counts by value offset from the current row's value
h2. Current Behavior
Parsing a query with GROUPS fails:
{code:java}
SELECT SUM(x) OVER (ORDER BY y GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW) FROM
t
{code}
Error:
{noformat}
Encountered "GROUPS" at line 1, column 32.
Was expecting one of:
"ALLOW" ...
"ASC" ...
"DESC" ...
"DISALLOW" ...
"NULLS" ...
"RANGE" ...
"ROWS" ...
")" ...
...
{noformat}
h2. Implementation Notes
The _WindowSpecification()_ rule
([Parser.jj#L2881-L2885|https://github.com/apache/calcite/blob/1a4b00b77a1cff4f8dd3f16834a743b209f02659/core/src/main/codegen/templates/Parser.jj#L2881-L2885])
only handles ROWS and RANGE:
{noformat}
(
<ROWS> { isRows = SqlLiteral.createBoolean(true, getPos()); }
|
<RANGE> { isRows = SqlLiteral.createBoolean(false, getPos()); }
)
{noformat}
Key areas requiring changes:
* SqlWindow.java - change isRows boolean to an enum-based representation
(e.g., FrameType \{ ROWS, RANGE, GROUPS })
* Parser.jj - extend WindowSpecification() to parse GROUPS
* SqlToRelConverter - handle GROUPS in window rel conversion
* Window.java (rel core) - change Window.Group.isRows boolean to enum
* EnumerableWindow.java - implement GROUPS execution logic
PostgreSQL documentation can be a useful reference for implementing this
feature:
[https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS]
> Support GROUPS window frame type in window specifications
> ---------------------------------------------------------
>
> Key: CALCITE-7376
> URL: https://issues.apache.org/jira/browse/CALCITE-7376
> Project: Calcite
> Issue Type: New Feature
> Components: core
> Affects Versions: 1.41.0
> Reporter: Alessandro Solimando
> Priority: Major
>
> Calcite currently supports ROWS and RANGE window frame types, but does not
> support the GROUPS frame type.
> The GROUPS frame type counts peer groups (rows with equal values in the ORDER
> BY columns) rather than individual rows or value ranges. For example:
> {code:java}
> SELECT
> department_id,
> salary,
> SUM(salary) OVER (
> PARTITION BY department_id
> ORDER BY salary
> GROUPS BETWEEN 1 PRECEDING AND 1 FOLLOWING
> ) as group_sum
> FROM employees
> {code}
>
> This would include the current row's peer group plus one peer group before
> and one after.
> This differs from:
> * ROWS: counts individual physical rows
> * RANGE: counts by value offset from the current row's value
> h2. Current Behavior
> Parsing a query with GROUPS fails:
> {code:java}
> SELECT SUM(x) OVER (ORDER BY y GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW)
> FROM t
> {code}
>
> Error:
> {noformat}
> Encountered "GROUPS" at line 1, column 32.
> Was expecting one of:
> "ALLOW" ...
> "ASC" ...
> "DESC" ...
> "DISALLOW" ...
> "NULLS" ...
> "RANGE" ...
> "ROWS" ...
> ")" ...
> ...
> {noformat}
> h2. Implementation Notes
> The _WindowSpecification()_ rule
> ([Parser.jj#L2881-L2885|https://github.com/apache/calcite/blob/1a4b00b77a1cff4f8dd3f16834a743b209f02659/core/src/main/codegen/templates/Parser.jj#L2881-L2885])
> only handles ROWS and RANGE:
> {noformat}
> (
> <ROWS> { isRows = SqlLiteral.createBoolean(true, getPos()); }
> |
> <RANGE> { isRows = SqlLiteral.createBoolean(false, getPos()); }
> )
> {noformat}
> Key areas requiring changes:
> * SqlWindow.java - deprecated the _isRows_ boolean field and add an
> enum-based representation (e.g., FrameType \{ ROWS, RANGE, GROUPS }), to keep
> backwards compatibility
> * Parser.jj - extend WindowSpecification() to parse GROUPS
> * SqlToRelConverter - handle GROUPS in window rel conversion
> * Window.java (rel core) - change Window.Group.isRows boolean to enum
> * EnumerableWindow.java - implement GROUPS execution logic
> PostgreSQL documentation can be a useful reference for implementing this
> feature:
> [https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)