Simon Dew has uploaded this change for review. (
https://asterix-gerrit.ics.uci.edu/3448
Change subject: [NO ISSUE] Document window functions.
......................................................................
[NO ISSUE] Document window functions.
Add new window function document.
Add new window clause (OVER clause) document.
Update Query document:
- update list of aggregate functions,
- update list of syntactic sugar functions.
Update Aggregate SQL Function document:
- add link for DISTINCT keyword,
- add link for window clause,
- add link for SQL standard functions.
Add extra blank line if file ends with indented code.
Add OVER to reserved keywords.
Update Functions ToC.
Update Builtins POM.
Change-Id: I52d6e97a27c2fa51208810c6ac3d98cb21a0e2b1
---
M asterixdb/asterix-doc/pom.xml
M asterixdb/asterix-doc/src/main/markdown/builtins/0_toc.md
A asterixdb/asterix-doc/src/main/markdown/builtins/14_window.md
A asterixdb/asterix-doc/src/main/markdown/builtins/15_over.md
M asterixdb/asterix-doc/src/main/markdown/builtins/7_allens.md
M asterixdb/asterix-doc/src/main/markdown/builtins/7_temporal.md
M asterixdb/asterix-doc/src/main/markdown/builtins/9_aggregate_aql.md
M asterixdb/asterix-doc/src/main/markdown/builtins/9_aggregate_sql.md
M asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md
M asterixdb/asterix-doc/src/main/markdown/sqlpp/appendix_1_keywords.md
10 files changed, 1,707 insertions(+), 22 deletions(-)
git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb
refs/changes/48/3448/1
diff --git a/asterixdb/asterix-doc/pom.xml b/asterixdb/asterix-doc/pom.xml
index b6f77a5..963daec 100644
--- a/asterixdb/asterix-doc/pom.xml
+++ b/asterixdb/asterix-doc/pom.xml
@@ -55,7 +55,7 @@
<filelist dir="${project.basedir}/src/main/markdown/sqlpp"
files="0_toc.md,1_intro.md,2_expr_title.md,2_expr.md,3_query_title.md,3_declare_dataverse.md,3_declare_function.md,3_query.md,4_error_title.md,4_error.md,5_ddl_head.md,5_ddl_dataset_index.md,5_ddl_function_removal.md,5_ddl_dml.md,appendix_1_title.md,appendix_1_keywords.md,appendix_2_title.md,appendix_2_parameters.md,appendix_2_index_only.md,appendix_3_title.md,appendix_3_resolution.md"
/>
</concat>
<concat
destfile="${project.build.directory}/generated-site/markdown/sqlpp/builtins.md">
- <filelist
dir="${project.basedir}/src/main/markdown/builtins"
files="0_toc.md,1_numeric_common.md,1_numeric_delta.md,2_string_common.md,2_string_delta.md,3_binary.md,4_spatial.md,5_similarity.md,6_tokenizing.md,7_temporal.md,7_allens.md,8_record.md,9_aggregate_sql.md,10_comparison.md,11_type.md,13_conditional.md,12_misc.md"
/>
+ <filelist
dir="${project.basedir}/src/main/markdown/builtins"
files="0_toc.md,1_numeric_common.md,1_numeric_delta.md,2_string_common.md,2_string_delta.md,3_binary.md,4_spatial.md,5_similarity.md,6_tokenizing.md,7_temporal.md,7_allens.md,8_record.md,9_aggregate_sql.md,10_comparison.md,11_type.md,13_conditional.md,12_misc.md,14_window.md,15_over.md"
/>
</concat>
<concat
destfile="${project.build.directory}/generated-site/markdown/aql/builtins.md">
<filelist
dir="${project.basedir}/src/main/markdown/builtins"
files="0_toc.md,1_numeric_common.md,1_numeric_delta.md,2_string_common.md,2_string_delta.md,3_binary.md,4_spatial.md,5_similarity.md,6_tokenizing.md,7_temporal.md,7_allens.md,8_record.md,9_aggregate_sql.md,10_comparison.md,11_type.md,13_conditional.md,12_misc.md"
/>
diff --git a/asterixdb/asterix-doc/src/main/markdown/builtins/0_toc.md
b/asterixdb/asterix-doc/src/main/markdown/builtins/0_toc.md
index e0d23d4..7bb23bb 100644
--- a/asterixdb/asterix-doc/src/main/markdown/builtins/0_toc.md
+++ b/asterixdb/asterix-doc/src/main/markdown/builtins/0_toc.md
@@ -34,6 +34,8 @@
* [Type Functions](#TypeFunctions)
* [Conditional Functions](#ConditionalFunctions)
* [Miscellaneous Functions](#MiscFunctions)
+* [Window Functions](#WindowFunctions)
+* [Window Clause (OVER Clause)](#WindowClause)
The system provides various classes of functions to support operations on
numeric, string, spatial, and temporal data.
This document explains how to use these functions.
diff --git a/asterixdb/asterix-doc/src/main/markdown/builtins/14_window.md
b/asterixdb/asterix-doc/src/main/markdown/builtins/14_window.md
new file mode 100644
index 0000000..08bcef6
--- /dev/null
+++ b/asterixdb/asterix-doc/src/main/markdown/builtins/14_window.md
@@ -0,0 +1,1302 @@
+<!--
+ ! Licensed to the Apache Software Foundation (ASF) under one
+ ! or more contributor license agreements. See the NOTICE file
+ ! distributed with this work for additional information
+ ! regarding copyright ownership. The ASF licenses this file
+ ! to you under the Apache License, Version 2.0 (the
+ ! "License"); you may not use this file except in compliance
+ ! with the License. You may obtain a copy of the License at
+ !
+ ! http://www.apache.org/licenses/LICENSE-2.0
+ !
+ ! Unless required by applicable law or agreed to in writing,
+ ! software distributed under the License is distributed on an
+ ! "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ ! KIND, either express or implied. See the License for the
+ ! specific language governing permissions and limitations
+ ! under the License.
+ !-->
+
+## <a id="WindowFunctions">Window Functions</a> ##
+
+Window functions are used to compute an aggregate or cumulative value, based on
+a group of objects.
+For each input object, a movable window of objects is defined.
+The window determines the objects to be used by the window function.
+
+All window functions must be used with a window clause,
+which is introduced by the `OVER` keyword.
+Refer to [Window Clause](#WindowClause) for details.
+
+Window functions cannot appear in the FROM clause clause or LIMIT clause.
+
+The examples in this section use the `GleambookMessages` dataset,
+described in the section on [SELECT Statements](manual.html#SELECT_statements).
+
+### cume_dist ###
+
+* Syntax:
+
+ CUME_DIST() OVER ([window-partition-clause] window-order-clause)
+
+* Returns the percentile rank of the current object as part of the cumulative
+ distribution – that is, the number of objects ranked lower than or equal to
+ the current object, including the current object, divided by the total number
+ of objects in the window partition.
+
+* Arguments:
+
+ * None.
+
+* Clauses:
+
+ * (Optional) [Window Partition Clause](#window-partition-clause).
+
+ * (Required) [Window Order Clause](#window-order-clause).
+
+* Return Value:
+
+ * A number greater than 0 and less than or equal to 1.
+ The higher the value, the higher the ranking.
+
+* Example:
+
+ For each author, find the cumulative distribution of all messages
+ in order of message ID.
+
+ SELECT m.messageId, m.authorId, CUME_DIST() OVER (
+ PARTITION BY m.authorId
+ ORDER BY m.messageId
+ ) AS `rank`
+ FROM GleambookMessages AS m;
+
+* The expected result is:
+
+ [
+ {
+ "rank": 0.2,
+ "messageId": 2,
+ "authorId": 1
+ },
+ {
+ "rank": 0.4,
+ "messageId": 4,
+ "authorId": 1
+ },
+ {
+ "rank": 0.6,
+ "messageId": 8,
+ "authorId": 1
+ },
+ {
+ "rank": 0.8,
+ "messageId": 10,
+ "authorId": 1
+ },
+ {
+ "rank": 1,
+ "messageId": 11,
+ "authorId": 1
+ },
+ {
+ "rank": 0.5,
+ "messageId": 3,
+ "authorId": 2
+ },
+ {
+ "rank": 1,
+ "messageId": 6,
+ "authorId": 2
+ }
+ ]
+
+### dense_rank ###
+
+* Syntax:
+
+ DENSE_RANK() OVER ([window-partition-clause] window-order-clause)
+
+* Returns the dense rank of the current object – that is, the number of
+ distinct objects preceding this object in the current window partition, plus
+ one.
+
+ The objects are ordered by the window order clause.
+ If any objects are tied, they will have the same rank.
+
+ For this function, when any objects have the same rank, the rank of the next
+ object will be consecutive, so there will not be a gap in the sequence of
+ returned values.
+ For example, if there are three objects ranked 2, the next dense rank is 3.
+
+* Arguments:
+
+ * None.
+
+* Clauses:
+
+ * (Optional) [Window Partition Clause](#window-partition-clause).
+
+ * (Required) [Window Order Clause](#window-order-clause).
+
+* Return Value:
+
+ * An integer, greater than or equal to 1.
+
+* Example:
+
+ For each author, find the dense rank of all messages in order of location.
+
+ SELECT m.authorId, m.messageId, m.senderLocation[1] as longitude,
+ DENSE_RANK() OVER (
+ PARTITION BY m.authorId
+ ORDER BY m.senderLocation[1]
+ ) AS `rank`
+ FROM GleambookMessages AS m;
+
+* The expected result is:
+
+ [
+ {
+ "rank": 1,
+ "authorId": 1,
+ "messageId": 10,
+ "longitude": 70.01
+ },
+ {
+ "rank": 2,
+ "authorId": 1,
+ "messageId": 11,
+ "longitude": 77.49
+ },
+ {
+ "rank": 3,
+ "authorId": 1,
+ "messageId": 2,
+ "longitude": 80.87
+ },
+ {
+ "rank": 3,
+ "authorId": 1,
+ "messageId": 8,
+ "longitude": 80.87
+ },
+ {
+ "rank": 4,
+ "authorId": 1,
+ "messageId": 4,
+ "longitude": 97.04
+ },
+ {
+ "rank": 1,
+ "authorId": 2,
+ "messageId": 6,
+ "longitude": 75.56
+ },
+ {
+ "rank": 2,
+ "authorId": 2,
+ "messageId": 3,
+ "longitude": 81.01
+ }
+ ]
+
+### first_value ###
+
+* Syntax:
+
+ FIRST_VALUE(expr) [nulls-treatment] OVER (window-clause)
+
+* Returns the requested value from the first object in the current window
+ frame, where the window frame is specified by the [window
+ clause](#window-clause).
+
+* Arguments:
+
+ * `expr`: The value that you want to return from the first
+ object in the window frame. <sup>\[[1](#fn_1)\]</sup>
+
+* Modifiers:
+
+ * [Nulls Treatment](#nulls-treatment): (Optional) Determines how NULL or
+ MISSING values are treated when finding the first object in the window
+ frame.
+
+ - `IGNORE NULLS`: If the values for any objects evaluate to NULL or
+ MISSING, those objects are ignored when finding the first object.
+ In this case, the function returns the first non-NULL, non-MISSING
+ value.
+
+ - `RESPECT NULLS`: If the values for any objects evaluate to NULL or
+ MISSING, those objects are included when finding the first object.
+
+ If this modifier is omitted, the default is `RESPECT NULLS`.
+
+* Clauses:
+
+ * (Optional) [Window Partition Clause](#window-partition-clause).
+
+ * (Optional) [Window Order Clause](#window-order-clause).
+
+ * (Optional) [Window Frame Clause](#window-frame-clause).
+
+* Return Value:
+
+ * The specified value from the first object.
+
+ * If all values are NULL or MISSING it returns NULL.
+
+ * In the following cases, this function may return unpredictable results.
+
+ - If the window order clause is omitted.
+
+ - If the window frame is defined by `ROWS`, and there are tied objects
+ in the window frame.
+
+ * To make the function return deterministic results, add a window order
+ clause, or add further ordering terms to the window order clause so that
+ no objects are tied.
+
+ * If the window frame is defined by `RANGE` or `GROUPS`, and there are
+ tied objects in the window frame, the function returns the lowest value
+ of the input expression.
+
+* Example:
+
+ For each author, show the length of each message, including the
+ length of the shortest message from that author.
+
+ SELECT m.authorId, m.messageId,
+ LENGTH(m.message) AS message_length,
+ FIRST_VALUE(LENGTH(m.message)) OVER (
+ PARTITION BY m.authorId
+ ORDER BY LENGTH(m.message)
+ ) AS shortest_message
+ FROM GleambookMessages AS m;
+
+* The expected result is:
+
+ [
+ {
+ "message_length": 31,
+ "shortest_message": 31,
+ "authorId": 1,
+ "messageId": 8
+ },
+ {
+ "message_length": 39,
+ "shortest_message": 31,
+ "authorId": 1,
+ "messageId": 11
+ },
+ {
+ "message_length": 44,
+ "shortest_message": 31,
+ "authorId": 1,
+ "messageId": 4
+ },
+ {
+ "message_length": 45,
+ "shortest_message": 31,
+ "authorId": 1,
+ "messageId": 2
+ },
+ {
+ "message_length": 51,
+ "shortest_message": 31,
+ "authorId": 1,
+ "messageId": 10
+ },
+ {
+ "message_length": 35,
+ "shortest_message": 35,
+ "authorId": 2,
+ "messageId": 3
+ },
+ {
+ "message_length": 44,
+ "shortest_message": 35,
+ "authorId": 2,
+ "messageId": 6
+ }
+ ]
+
+### lag ###
+
+* Syntax:
+
+ LAG(expr[, offset[, default]]) [nulls-treatment] OVER
([window-partition-clause] window-order-clause)
+
+* Returns the value of an object at a given offset prior to the current object
+ position.
+
+* Arguments:
+
+ * `expr`: The value that you want to return from the offset
+ object. <sup>\[[1](#fn_1)\]</sup>
+
+ * `offset`: (Optional) A positive integer greater than 0.
+ If omitted, the default is 1.
+
+ * `default`: (Optional) The value to return when the offset goes out of
+ window scope.
+ If omitted, the default is NULL.
+
+* Modifiers:
+
+ * [Nulls Treatment](#nulls-treatment): (Optional) Determines how NULL or
+ MISSING values are treated when finding the first object in the window
+ frame.
+
+ - `IGNORE NULLS`: If the values for any objects evaluate to NULL or
+ MISSING, those objects are ignored when finding the first object.
+ In this case, the function returns the first non-NULL, non-MISSING
+ value.
+
+ - `RESPECT NULLS`: If the values for any objects evaluate to NULL or
+ MISSING, those objects are included when finding the first object.
+
+ If this modifier is omitted, the default is `RESPECT NULLS`.
+
+* Clauses:
+
+ * (Optional) [Window Partition Clause](#window-partition-clause).
+
+ * (Required) [Window Order Clause](#window-order-clause).
+
+* Return Value:
+
+ * The specified value from the offset object.
+
+ * If the offset object is out of scope, it returns the default value,
+ or NULL if no default is specified.
+
+* Example:
+
+ For each author, show the length of each message, including the
+ length of the next-shortest message.
+
+ SELECT m.authorId, m.messageId,
+ LENGTH(m.message) AS message_length,
+ LAG(LENGTH(m.message), 1, "No shorter message") OVER (
+ PARTITION BY m.authorId
+ ORDER BY LENGTH(m.message)
+ ) AS previous_message
+ FROM GleambookMessages AS m;
+
+* The expected result is:
+
+ [
+ {
+ "message_length": 31,
+ "authorId": 1,
+ "messageId": 8,
+ "previous_message": "No shorter message"
+ },
+ {
+ "message_length": 39,
+ "authorId": 1,
+ "messageId": 11,
+ "previous_message": 31
+ },
+ {
+ "message_length": 44,
+ "authorId": 1,
+ "messageId": 4,
+ "previous_message": 39
+ },
+ {
+ "message_length": 45,
+ "authorId": 1,
+ "messageId": 2,
+ "previous_message": 44
+ },
+ {
+ "message_length": 51,
+ "authorId": 1,
+ "messageId": 10,
+ "previous_message": 45
+ },
+ {
+ "message_length": 35,
+ "authorId": 2,
+ "messageId": 3,
+ "previous_message": "No shorter message"
+ },
+ {
+ "message_length": 44,
+ "authorId": 2,
+ "messageId": 6,
+ "previous_message": 35
+ }
+ ]
+
+### last_value ###
+
+* Syntax:
+
+ LAST_VALUE(expr) [nulls-treatment] OVER (window-clause)
+
+* Returns the requested value from the last object in the current window frame,
+ where the window frame is specified by the [window clause](#window-clause).
+
+* Arguments:
+
+ * `expr`: The value that you want to return from the last object
+ in the window frame. <sup>\[[1](#fn_1)\]</sup>
+
+* Modifiers:
+
+ * [Nulls Treatment](#nulls-treatment): (Optional) Determines how NULL or
+ MISSING values are treated when finding the first object in the window
+ frame.
+
+ - `IGNORE NULLS`: If the values for any objects evaluate to NULL or
+ MISSING, those objects are ignored when finding the first object.
+ In this case, the function returns the first non-NULL, non-MISSING
+ value.
+
+ - `RESPECT NULLS`: If the values for any objects evaluate to NULL or
+ MISSING, those objects are included when finding the first object.
+
+ If this modifier is omitted, the default is `RESPECT NULLS`.
+
+* Clauses:
+
+ * (Optional) [Window Partition Clause](#window-partition-clause).
+
+ * (Optional) [Window Order Clause](#window-order-clause).
+
+ * (Optional) [Window Frame Clause](#window-frame-clause).
+
+* Return Value:
+
+ * The specified value from the last object.
+
+ * If all values are NULL or MISSING it returns NULL.
+
+ * In the following cases, this function may return unpredictable results.
+
+ - If the window order clause is omitted.
+
+ - If the window frame clause is omitted.
+
+ - If the window frame is defined by `ROWS`, and there are tied objects
+ in the window frame.
+
+ * To make the function return deterministic results, add a window order
+ clause, or add further ordering terms to the window order clause so that
+ no objects are tied.
+
+ * If the window frame is defined by `RANGE` or `GROUPS`, and there are
+ tied objects in the window frame, the function returns the highest
+ value of the input expression.
+
+* Example:
+
+ For each author, show the length of each message, including the
+ length of the longest message from that author.
+
+ SELECT m.authorId, m.messageId,
+ LENGTH(m.message) AS message_length,
+ LAST_VALUE(LENGTH(m.message)) OVER (
+ PARTITION BY m.authorId
+ ORDER BY LENGTH(m.message)
+ ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING -- ➊
+ ) AS longest_message
+ FROM GleambookMessages AS m;
+
+* The expected result is:
+
+ [
+ {
+ "message_length": 31,
+ "longest_message": 51,
+ "authorId": 1,
+ "messageId": 8
+ },
+ {
+ "message_length": 39,
+ "longest_message": 51,
+ "authorId": 1,
+ "messageId": 11
+ },
+ {
+ "message_length": 44,
+ "longest_message": 51,
+ "authorId": 1,
+ "messageId": 4
+ },
+ {
+ "message_length": 45,
+ "longest_message": 51,
+ "authorId": 1,
+ "messageId": 2
+ },
+ {
+ "message_length": 51,
+ "longest_message": 51,
+ "authorId": 1,
+ "messageId": 10
+ },
+ {
+ "message_length": 35,
+ "longest_message": 44,
+ "authorId": 2,
+ "messageId": 3
+ },
+ {
+ "message_length": 44,
+ "longest_message": 44,
+ "authorId": 2,
+ "messageId": 6
+ }
+ ]
+
+ ➀ This clause specifies that the window frame should extend to the
+ end of the window partition.
+ Without this clause, the end point of the window frame would always be the
+ current object.
+ This would mean that the longest message would always be the same as the
+ current message.
+
+### lead ###
+
+* Syntax:
+
+ LEAD(expr[, offset[, default]]) [nulls-treatment] OVER
([window-partition-clause] window-order-clause)
+
+* Returns the value of an object at a given offset ahead of the current object
+ position.
+
+* Arguments:
+
+ * `expr`: The value that you want to return from the offset
+ object. <sup>\[[1](#fn_1)\]</sup>
+
+ * `offset`: (Optional) A positive integer greater than 0. If omitted, the
+ default is 1.
+
+ * `default`: (Optional) The value to return when the offset goes out of
+ window scope.
+ If omitted, the default is NULL.
+
+* Modifiers:
+
+ * [Nulls Treatment](#nulls-treatment): (Optional) Determines how NULL or
+ MISSING values are treated when finding the first object in the window
+ frame.
+
+ - `IGNORE NULLS`: If the values for any objects evaluate to NULL or
+ MISSING, those objects are ignored when finding the first object.
+ In this case, the function returns the first non-NULL, non-MISSING
+ value.
+
+ - `RESPECT NULLS`: If the values for any objects evaluate to NULL or
+ MISSING, those objects are included when finding the first object.
+
+ If this modifier is omitted, the default is `RESPECT NULLS`.
+
+* Clauses:
+
+ * (Optional) [Window Partition Clause](#window-partition-clause).
+
+ * (Required) [Window Order Clause](#window-order-clause).
+
+* Return Value:
+
+ * The specified value from the offset object.
+
+ * If the offset object is out of scope, it returns the default value, or
+ NULL if no default is specified.
+
+* Example:
+
+ For each author, show the length of each message, including the
+ length of the next-longest message.
+
+ SELECT m.authorId, m.messageId,
+ LENGTH(m.message) AS message_length,
+ LEAD(LENGTH(m.message), 1, "No longer message") OVER (
+ PARTITION BY m.authorId
+ ORDER BY LENGTH(m.message)
+ ) AS next_message
+ FROM GleambookMessages AS m;
+
+* The expected result is:
+
+ [
+ {
+ "message_length": 31,
+ "authorId": 1,
+ "messageId": 8,
+ "next_message": 39
+ },
+ {
+ "message_length": 39,
+ "authorId": 1,
+ "messageId": 11,
+ "next_message": 44
+ },
+ {
+ "message_length": 44,
+ "authorId": 1,
+ "messageId": 4,
+ "next_message": 45
+ },
+ {
+ "message_length": 45,
+ "authorId": 1,
+ "messageId": 2,
+ "next_message": 51
+ },
+ {
+ "message_length": 51,
+ "authorId": 1,
+ "messageId": 10,
+ "next_message": "No longer message"
+ },
+ {
+ "message_length": 35,
+ "authorId": 2,
+ "messageId": 3,
+ "next_message": 44
+ },
+ {
+ "message_length": 44,
+ "authorId": 2,
+ "messageId": 6,
+ "next_message": "No longer message"
+ }
+ ]
+
+### nth_value ###
+
+* Syntax:
+
+ NTH_VALUE(expr, offset) [nthval-from] [nulls-treatment] OVER
(window-clause)
+
+* Returns the requested value from an object in the current window frame, where
+ the window frame is specified by the window clause.
+
+* Arguments:
+
+ * `expr`: The value that you want to return from the offset
+ object in the window frame. <sup>\[[1](#fn_1)\]</sup>
+
+ * `offset`: The number of the offset object within the window
+ frame, counting from 1.
+
+* Modifiers:
+
+ * [Nth Val From](#nthval-from): (Optional) Determines where the function
+ starts counting the offset.
+
+ - `FROM FIRST`: Counting starts at the first object in the window
frame.
+ In this case, an offset of 1 is the first object in the window frame,
+ 2 is the second object, and so on.
+
+ - `FROM LAST`: Counting starts at the last object in the window frame.
+ In this case, an offset of 1 is the last object in the window frame,
+ 2 is the second-to-last object, and so on.
+
+ If this modifier is omitted, the default is `FROM FIRST`.
+
+ * [Nulls Treatment](#nulls-treatment): (Optional) Determines how NULL or
+ MISSING values are treated when finding the first object in the window
+ frame.
+
+ - `IGNORE NULLS`: If the values for any objects evaluate to NULL or
+ MISSING, those objects are ignored when finding the first object.
+ In this case, the function returns the first non-NULL, non-MISSING
+ value.
+
+ - `RESPECT NULLS`: If the values for any objects evaluate to NULL or
+ MISSING, those objects are included when finding the first object.
+
+ If this modifier is omitted, the default is `RESPECT NULLS`.
+
+* Clauses:
+
+ * (Optional) [Window Partition Clause](#window-partition-clause).
+
+ * (Optional) [Window Order Clause](#window-order-clause).
+
+ * (Optional) [Window Frame Clause](#window-order-clause).
+
+* Return Value:
+
+ * The specified value from the offset object.
+
+ * In the following cases, this function may return unpredictable results.
+
+ - If the window order clause is omitted.
+
+ - If the window frame clause is omitted.
+
+ - If the window frame is defined by `ROWS`, and there are tied objects
+ in the window frame.
+
+ * To make the function return deterministic results, add a window order
+ clause, or add further ordering terms to the window order clause so that
+ no objects are tied.
+
+ * If the window frame is defined by `RANGE` or `GROUPS`, and there are
+ tied objects in the window frame, the function returns the lowest value
+ of the input expression when counting `FROM FIRST`, or the highest
+ value of the input expression when counting `FROM LAST`.
+
+* Example 1:
+
+ For each author, show the length of each message, including the
+ length of the second shortest message from that author.
+
+ SELECT m.authorId, m.messageId,
+ LENGTH(m.message) AS message_length,
+ NTH_VALUE(LENGTH(m.message), 2) FROM FIRST OVER (
+ PARTITION BY m.authorId
+ ORDER BY LENGTH(m.message)
+ ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING -- ➊
+ ) AS shortest_message_but_1
+ FROM GleambookMessages AS m;
+
+* The expected result is:
+
+ [
+ {
+ "message_length": 31,
+ "shortest_message_but_1": 39,
+ "authorId": 1,
+ "messageId": 8
+ },
+ {
+ "message_length": 39,
+ "shortest_message_but_1": 39,
+ "authorId": 1,
+ "messageId": 11 // ➋
+ },
+ {
+ "message_length": 44,
+ "shortest_message_but_1": 39,
+ "authorId": 1,
+ "messageId": 4
+ },
+ {
+ "message_length": 45,
+ "shortest_message_but_1": 39,
+ "authorId": 1,
+ "messageId": 2
+ },
+ {
+ "message_length": 51,
+ "shortest_message_but_1": 39,
+ "authorId": 1,
+ "messageId": 10
+ },
+ {
+ "message_length": 35,
+ "shortest_message_but_1": 44,
+ "authorId": 2,
+ "messageId": 3
+ },
+ {
+ "message_length": 44,
+ "shortest_message_but_1": 44,
+ "authorId": 2,
+ "messageId": 6 // ➋
+ }
+ ]
+
+ ➀ This clause specifies that the window frame should extend to the
+ end of the window partition.
+ Without this clause, the end point of the window frame would always be the
+ current object.
+ This would mean that for the shortest message, the function
+ would be unable to find the route with the second shortest message.
+
+ ➁ The second shortest message from this author.
+
+* Example 2:
+
+ For each author, show the length of each message, including the
+ length of the second longest message from that author.
+
+ SELECT m.authorId, m.messageId,
+ LENGTH(m.message) AS message_length,
+ NTH_VALUE(LENGTH(m.message), 2) FROM LAST OVER (
+ PARTITION BY m.authorId
+ ORDER BY LENGTH(m.message)
+ ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING -- ➊
+ ) AS longest_message_but_1
+ FROM GleambookMessages AS m;
+
+* The expected result is:
+
+ [
+ {
+ "message_length": 31,
+ "longest_message_but_1": 45,
+ "authorId": 1,
+ "messageId": 8
+ },
+ {
+ "message_length": 39,
+ "longest_message_but_1": 45,
+ "authorId": 1,
+ "messageId": 11
+ },
+ {
+ "message_length": 44,
+ "longest_message_but_1": 45,
+ "authorId": 1,
+ "messageId": 4
+ },
+ {
+ "message_length": 45,
+ "longest_message_but_1": 45,
+ "authorId": 1,
+ "messageId": 2 // ➋
+ },
+ {
+ "message_length": 51,
+ "longest_message_but_1": 45,
+ "authorId": 1,
+ "messageId": 10
+ },
+ {
+ "message_length": 35,
+ "longest_message_but_1": 35,
+ "authorId": 2,
+ "messageId": 3 // ➋
+ },
+ {
+ "message_length": 44,
+ "longest_message_but_1": 35,
+ "authorId": 2,
+ "messageId": 6
+ }
+ ]
+
+ ➀ This clause specifies that the window frame should extend to the
+ end of the window partition.
+ Without this clause, the end point of the window frame would always be the
+ current object.
+ This would mean the function would be unable to find the second longest
+ message for shorter messages.
+
+ ➁ The second longest message from this author.
+
+### ntile ###
+
+* Syntax:
+
+ NTILE(num_tiles) OVER ([window-partition-clause] window-order-clause)
+
+* Divides the window partition into the specified number of tiles, and
+ allocates each object in the window partition to a tile, so that as far as
+ possible each tile has an equal number of objects.
+ When the set of objects is not equally divisible by the number of tiles, the
+ function puts more objects into the lower-numbered tiles.
+ For each object, the function returns the number of the tile into which that
+ object was placed.
+
+* Arguments:
+
+ * `num_tiles`: The number of tiles into which you want to divide
+ the window partition.
+ This argument can be an expression and must evaluate to a number.
+ If the number is not an integer, it will be truncated.
+ If the expression depends on an object, it evaluates from the first
+ object in the window partition.
+
+* Clauses:
+
+ * (Optional) [Window Partition Clause](#window-partition-clause).
+
+ * (Required) [Window Order Clause](#window-order-clause).
+
+* Return Value:
+
+ * An value greater than or equal to 1 and less than or equal to the number
+ of tiles.
+
+* Example:
+
+ Allocate each message to one of three tiles by length and message ID.
+
+ SELECT m.messageId, LENGTH(m.message) AS `length`,
+ NTILE(3) OVER (
+ ORDER BY LENGTH(m.message), m.messageId
+ ) AS `ntile`
+ FROM GleambookMessages AS m;
+
+* The expected result is:
+
+ [
+ {
+ "length": 31,
+ "ntile": 1,
+ "messageId": 8
+ },
+ {
+ "length": 35,
+ "ntile": 1,
+ "messageId": 3
+ },
+ {
+ "length": 39,
+ "ntile": 1,
+ "messageId": 11
+ },
+ {
+ "length": 44,
+ "ntile": 2,
+ "messageId": 4
+ },
+ {
+ "length": 44,
+ "ntile": 2,
+ "messageId": 6
+ },
+ {
+ "length": 45,
+ "ntile": 3,
+ "messageId": 2
+ },
+ {
+ "length": 51,
+ "ntile": 3,
+ "messageId": 10
+ }
+ ]
+
+### percent_rank ###
+
+* Syntax:
+
+ PERCENT_RANK() OVER ([window-partition-clause] window-order-clause)
+
+* Returns the percentile rank of the current object – that is, the rank of the
+ objects minus one, divided by the total number of objects in the window
+ partition minus one.
+
+* Arguments:
+
+ * None.
+
+* Clauses:
+
+ * (Optional) [Window Partition Clause](#window-partition-clause).
+
+ * (Required) [Window Order Clause](#window-order-clause).
+
+* Return Value:
+
+ * A number between 0 and 1.
+ The higher the value, the higher the ranking.
+
+* Example:
+
+ For each author, find the percentile rank of all messages in order
+ of message ID.
+
+ SELECT m.messageId, m.authorId, PERCENT_RANK() OVER (
+ PARTITION BY m.authorId
+ ORDER BY m.messageId
+ ) AS `rank`
+ FROM GleambookMessages AS m;
+
+* The expected result is:
+
+ [
+ {
+ "rank": 0,
+ "messageId": 2,
+ "authorId": 1
+ },
+ {
+ "rank": 0.25,
+ "messageId": 4,
+ "authorId": 1
+ },
+ {
+ "rank": 0.5,
+ "messageId": 8,
+ "authorId": 1
+ },
+ {
+ "rank": 0.75,
+ "messageId": 10,
+ "authorId": 1
+ },
+ {
+ "rank": 1,
+ "messageId": 11,
+ "authorId": 1
+ },
+ {
+ "rank": 0,
+ "messageId": 3,
+ "authorId": 2
+ },
+ {
+ "rank": 1,
+ "messageId": 6,
+ "authorId": 2
+ }
+ ]
+
+### rank ###
+
+* Syntax:
+
+ RANK() OVER ([window-partition-clause] window-order-clause)
+
+* Returns the rank of the current object – that is, the number of distinct
+ objects preceding this object in the current window partition, plus one.
+
+ The objects are ordered by the window order clause.
+ If any objects are tied, they will have the same rank.
+
+ When any objects have the same rank, the rank of the next object will include
+ all preceding objects, so there may be a gap in the sequence of returned
+ values.
+ For example, if there are three objects ranked 2, the next rank is 5.
+
+* Arguments:
+
+ * None.
+
+* Clauses:
+
+ * (Optional) [Window Partition Clause](#window-partition-clause).
+
+ * (Required) [Window Order Clause](#window-order-clause).
+
+* Return Value:
+
+ * An integer, greater than or equal to 1.
+
+* Example:
+
+ For each author, find the rank of all messages in order of location.
+
+ SELECT m.authorId, m.messageId, m.senderLocation[1] as longitude,
+ RANK() OVER (
+ PARTITION BY m.authorId
+ ORDER BY m.senderLocation[1]
+ ) AS `rank`
+ FROM GleambookMessages AS m;
+
+* The expected result is:
+
+ [
+ {
+ "rank": 1,
+ "authorId": 1,
+ "messageId": 10,
+ "longitude": 70.01
+ },
+ {
+ "rank": 2,
+ "authorId": 1,
+ "messageId": 11,
+ "longitude": 77.49
+ },
+ {
+ "rank": 3,
+ "authorId": 1,
+ "messageId": 2,
+ "longitude": 80.87
+ },
+ {
+ "rank": 3,
+ "authorId": 1,
+ "messageId": 8,
+ "longitude": 80.87
+ },
+ {
+ "rank": 5,
+ "authorId": 1,
+ "messageId": 4,
+ "longitude": 97.04
+ },
+ {
+ "rank": 1,
+ "authorId": 2,
+ "messageId": 6,
+ "longitude": 75.56
+ },
+ {
+ "rank": 2,
+ "authorId": 2,
+ "messageId": 3,
+ "longitude": 81.01
+ }
+ ]
+
+### ratio_to_report ###
+
+* Syntax:
+
+ RATIO_TO_REPORT(expr) OVER (window-clause)
+
+* Returns the fractional ratio of the specified value for each object to the
+ sum of values for all objects in the window partition.
+
+* Arguments:
+
+ * `expr`: The value for which you want to calculate the
+ fractional ratio. <sup>\[[1](#fn_1)\]</sup>
+
+* Clauses:
+
+ * (Optional) [Window Partition Clause](#window-partition-clause).
+
+ * (Optional) [Window Order Clause](#window-order-clause).
+
+ * (Optional) [Window Frame Clause](#window-frame-clause).
+
+* Return Value:
+
+ * A number between 0 and 1, representing the fractional ratio of the value
+ for the current object to the sum of values for all objects in the
+ current window frame.
+ The sum of values for all objects in the current window frame is 1.
+
+ * If the input expression does not evaluate to a number, or the sum of
+ values for all objects is zero, it returns NULL.
+
+* Example:
+
+ For each author, calculate the length of each message as a
+ fraction of the total length of all messages.
+
+ SELECT m.messageId, m.authorId,
+ RATIO_TO_REPORT(LENGTH(m.message)) OVER (
+ PARTITION BY m.authorId
+ ) AS length_ratio
+ FROM GleambookMessages AS m;
+
+* The expected result is:
+
+ [
+ {
+ "length_ratio": 0.21428571428571427,
+ "messageId": 2,
+ "authorId": 1
+ },
+ {
+ "length_ratio": 0.20952380952380953,
+ "messageId": 4,
+ "authorId": 1
+ },
+ {
+ "length_ratio": 0.14761904761904762,
+ "messageId": 8,
+ "authorId": 1
+ },
+ {
+ "length_ratio": 0.24285714285714285,
+ "messageId": 10,
+ "authorId": 1
+ },
+ {
+ "length_ratio": 0.18571428571428572,
+ "messageId": 11,
+ "authorId": 1
+ },
+ {
+ "length_ratio": 0.4430379746835443,
+ "messageId": 3,
+ "authorId": 2
+ },
+ {
+ "length_ratio": 0.5569620253164557,
+ "messageId": 6,
+ "authorId": 2
+ }
+ ]
+
+### row_number ###
+
+* Syntax:
+
+ ROW_NUMBER() OVER ([window-partition-clause] [window-order-clause])
+
+* Returns a unique row number for every object in every window partition.
+ In each window partition, the row numbering starts at 1.
+
+ The window order clause determines the sort order of the objects.
+ If the window order clause is omitted, the return values may be
unpredictable.
+
+* Arguments:
+
+ * None.
+
+* Clauses:
+
+ * (Optional) [Window Partition Clause](#window-partition-clause).
+
+ * (Optional) [Window Order Clause](#window-order-clause).
+
+* Return Value:
+
+ * An integer, greater than or equal to 1.
+
+* Example:
+
+ For each author, number all messages in order of length.
+
+ SELECT m.messageId, m.authorId,
+ ROW_NUMBER() OVER (
+ PARTITION BY m.authorId
+ ORDER BY LENGTH(m.message)
+ ) AS `row`
+ FROM GleambookMessages AS m;
+
+* The expected result is:
+
+ [
+ {
+ "row": 1,
+ "messageId": 8,
+ "authorId": 1
+ },
+ {
+ "row": 2,
+ "messageId": 11,
+ "authorId": 1
+ },
+ {
+ "row": 3,
+ "messageId": 4,
+ "authorId": 1
+ },
+ {
+ "row": 4,
+ "messageId": 2,
+ "authorId": 1
+ },
+ {
+ "row": 5,
+ "messageId": 10,
+ "authorId": 1
+ },
+ {
+ "row": 1,
+ "messageId": 3,
+ "authorId": 2
+ },
+ {
+ "row": 2,
+ "messageId": 6,
+ "authorId": 2
+ }
+ ]
+
+---
+
+<a id="fn_1">1</a>.
+If the query contains the GROUP BY clause or any
+[aggregate functions](#AggregateFunctions), this expression must only
+depend on GROUP BY expressions or aggregate functions.
diff --git a/asterixdb/asterix-doc/src/main/markdown/builtins/15_over.md
b/asterixdb/asterix-doc/src/main/markdown/builtins/15_over.md
new file mode 100644
index 0000000..8703de8
--- /dev/null
+++ b/asterixdb/asterix-doc/src/main/markdown/builtins/15_over.md
@@ -0,0 +1,343 @@
+<!--
+ ! Licensed to the Apache Software Foundation (ASF) under one
+ ! or more contributor license agreements. See the NOTICE file
+ ! distributed with this work for additional information
+ ! regarding copyright ownership. The ASF licenses this file
+ ! to you under the Apache License, Version 2.0 (the
+ ! "License"); you may not use this file except in compliance
+ ! with the License. You may obtain a copy of the License at
+ !
+ ! http://www.apache.org/licenses/LICENSE-2.0
+ !
+ ! Unless required by applicable law or agreed to in writing,
+ ! software distributed under the License is distributed on an
+ ! "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ ! KIND, either express or implied. See the License for the
+ ! specific language governing permissions and limitations
+ ! under the License.
+ !-->
+
+## <a id="WindowClause">Window Clause (OVER Clause)</a> ##
+
+All window functions must have a window clause to define the window partitions,
+the order of objects within the window, and the size of the window frame.
+The `OVER` keyword introduces the window clause.
+Some window functions take additional window options, which are specified by
+further clauses before the window clause.
+
+The query language has a dedicated set of window functions.
+Aggregate functions can also be used as window functions, when they are used
+with a window clause.
+
+### <a id="window-function">Window Function</a> ###
+
+ WindowFunctionCall ::= WindowFunctionType "(" WindowFunctionArguments ")"
+ (WindowFunctionOptions)? <OVER> (Variable <AS>)? "(" WindowClause ")"
+
+### <a id="window-function-type">Window Function Type</a> ###
+
+ WindowFunctionType ::= AggregateFunctions | WindowFunctions
+
+Refer to the [Aggregate Functions](#AggregateFunctions) section for a list of
+aggregate functions.
+
+Refer to the [Window Functions](#WindowFunctions) section for a list of window
+functions.
+
+### <a id="window-function-arguments">Window Function Arguments</a> ###
+
+ WindowFunctionArguments ::= ( (<DISTINCT>)? Expression |
+ (Expression ("," Expression ("," Expression)? )? )? )
+
+Refer to the [Aggregate Functions](#AggregateFunctions) section or the
+[Window Functions](#WindowFunctions) section for details of the arguments for
+individual functions.
+
+### <a id="window-function-options">Window Function Options</a> ###
+
+ WindowFunctionOptions ::= (NthValFrom)? (NullsTreatment)?
+
+Window function options cannot be used with [aggregate
+functions](#AggregateFunctions).
+
+Window function options can only be used with some [window
+functions](#WindowFunctions), as described below.
+
+#### <a id="nthval-from">Nth Val From</a> ####
+
+ NthValFrom ::= <FROM> ( <FIRST> | <LAST> )
+
+The **nth val from** modifier determines whether the computation begins at the
+first or last object in the window.
+
+This modifier can only be used with the `nth_value()` function.
+
+This modifier is optional.
+If omitted, the default setting is `FROM FIRST`.
+
+#### <a id="nulls-treatment">Nulls Treatment</a> ####
+
+ nulls-treatment ::= ( <RESPECT> | <IGNORE> ) <NULLS>
+
+The **nulls treatment** modifier determines whether NULL values are included in
+the computation, or ignored.
+MISSING values are treated the same way as NULL values.
+
+This modifier can only be used with the `first_value()`, `last_value()`,
+`nth_value()`, `lag()`, and `lead()` functions.
+
+This modifier is optional.
+If omitted, the default setting is `RESPECT NULLS`.
+
+### <a id="window-frame-variable">Window Frame Variable</a> ###
+
+The AS keyword enables you to specify an alias for the window clause.
+When using a built-in [aggregate function](#AggregateFunctions) as a
+window function, the function’s argument must be a subquery which refers to the
+window clause using this alias, for example:
+
+ FROM source
+ SELECT ARRAY_COUNT(DISTINCT (FROM alias SELECT VALUE alias.source.expr))
+ OVER alias AS (PARTITION BY … ORDER BY …)
+
+The alias is not necessary when using a [window function](#WindowFunctions),
+or when using a standard SQL aggregate function with the window clause.
+
+#### Standard SQL Aggregate Functions with the Window Clause ####
+
+A standard SQL aggregate function with a window clause is rewritten by the
+query compiler using a built-in aggregate function over a frame variable.
+For example, the following query with the `sum()` function:
+
+ FROM source
+ SELECT SUM(expr)
+ OVER (PARTITION BY … ORDER BY …)
+
+Is rewritten as the following query using the `array_sum()` function:
+
+ FROM source
+ SELECT ARRAY_SUM( (FROM alias SELECT VALUE alias.source.expr) )
+ OVER alias AS (PARTITION BY … ORDER BY …)
+
+This is similar to the way that standard SQL aggregate functions are rewritten
+as built-in aggregate functions in the presence of the GROUP BY clause.
+
+### <a id="window-clause">Window Clause</a> ###
+
+ WindowClause ::= (WindowPartitionClause)? (WindowOrderClause
+ (WindowFrameClause (WindowFrameExclusion)? )? )?
+
+The **window clause** specifies the partitioning, ordering, and framing for
+window functions.
+
+#### <a id="window-partition-clause">Window Partition Clause</a> ####
+
+ WindowPartitionClause ::= <PARTITION> <BY> Expression ("," Expression)*
+
+The **window partition clause** groups the query results into partitions using
+one or more expressions.
+
+This clause may be used with any [window function](#WindowFunctions), or any
+[aggregate function](#AggregateFunctions) used as a window function.
+
+This clause is optional.
+If omitted, all the query results are grouped into single partition.
+
+#### <a id="window-order-clause">Window Order Clause</a> ####
+
+ WindowOrderClause ::= <ORDER> <BY> OrderingTerm ("," OrderingTerm)*
+
+The **window order clause** determines how objects are ordered within each
+partition.
+The window function works on objects in the order specified by this clause.
+
+This clause may be used with any [window function](#WindowFunctions), or any
+[aggregate function](#AggregateFunctions) used as a window function.
+
+This clause is optional for some functions, and required for others.
+Refer to the [Aggregate Functions](#AggregateFunctions) section or the
+[Window Functions](#WindowFunctions) section for details of the syntax of
+individual functions.
+
+If this clause is omitted, all objects are considered peers, i.e. their order
+is tied.
+When objects in the window partition are tied, each window function behaves
+differently.
+
+* The `row_number()` function returns a distinct number for each object.
+ If objects are tied, the results may be unpredictable.
+
+* The `rank()`, `dense_rank()`, `percent_rank()`, and `cume_dist()` functions
+ return the same result for each object.
+
+* For other functions, if the [window frame](#window-frame-clause) is
+ defined by `ROWS`, the results may be unpredictable.
+ If the window frame is defined by `RANGE` or `GROUPS`, the results are same
+ for each object.
+
+This clause may have multiple [ordering terms](#ordering-term).
+To reduce the number of ties, add additional [ordering terms](#ordering-term).
+
+##### NOTE #####
+
+This clause does not guarantee the overall order of the query results.
+To guarantee the order of the final results, use the query ORDER BY clause.
+
+#### <a id="ordering-term">Ordering Term</a> ####
+
+ OrderingTerm ::= Expression ( <ASC> | <DESC> )?
+
+The **ordering term** specifies an ordering expression and collation.
+
+This clause has the same syntax and semantics as the ordering term for queries.
+Refer to the [ORDER BY clause](manual.html#Order_By_clauses) section
+for details.
+
+#### <a id="window-frame-clause">Window Frame Clause</a> ####
+
+ WindowFrameClause ::= ( <ROWS> | <RANGE> | <GROUPS> ) WindowFrameExtent
+
+The **window frame clause** defines the window frame.
+
+This clause can be used with all [aggregate functions](#AggregateFunctions) and
+some [window functions](#WindowFunctions) — refer to the descriptions of
+individual functions for more details.
+
+This clause is allowed only when the [window order
+clause](#window-order-clause) is present.
+
+This clause is optional.
+
+* If this clause is omitted and there is no [window order
+ clause](#window-order-clause), the window frame is the entire partition.
+
+* If this clause is omitted but there is a [window order
+ clause](#window-order-clause), the window frame becomes all objects
+ in the partition preceding the current object and its peers — the
+ same as `RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW`.
+
+The window frame can be defined in the following ways:
+
+* `ROWS`: Counts the exact number of objects within the frame.
+ If window ordering doesn’t result in unique ordering, the function may
+ produce unpredictable results.
+ You can add a unique expression or more window ordering expressions to
+ produce unique ordering.
+
+* `RANGE`: Looks for a value offset within the frame.
+ The function produces deterministic results.
+
+* `GROUPS`: Counts all groups of tied rows within the frame.
+ The function produces deterministic results.
+
+##### NOTE #####
+
+If this clause uses `RANGE` with either `Expression PRECEDING` or
+`Expression FOLLOWING`, the [window order clause](#window-order-clause) must
+have only a single ordering term.
+The ordering term expression must evaluate to a number.
+
+If the ordering term expression does not evaluate to a number, the window
+frame will be empty, which means the window function will return its default
+value: in most cases this is NULL, except for `strict_count()` or
+`array_count()`, whose default value is 0.
+
+This restriction does not apply when the window frame uses `ROWS` or
+`GROUPS`.
+
+##### TIP #####
+
+The `RANGE` window frame is commonly used to define a window frames based
+on date or time.
+In JSON, dates and times are represented as a string in ISO-8601 standard.
+
+If you want to use `RANGE` with either `Expression PRECEDING` or
+`Expression FOLLOWING`, and you want to use an ordering expression based on
+date or time, use the appropriate [date or time function](#TemporalFunctions)
+to convert the date or time into milliseconds, then use the resulting number
+in the ordering expression.
+
+#### <a id="window-frame-extent">Window Frame Extent</a> ####
+
+ WindowFrameExtent ::= ( <UNBOUNDED> <PRECEDING> | <CURRENT> <ROW> |
+ Expression <FOLLOWING> ) | <BETWEEN> ( <UNBOUNDED> <PRECEDING> | <CURRENT>
+ <ROW> | Expression ( <PRECEDING> | <FOLLOWING> ) ) <AND> ( <UNBOUNDED>
+ <FOLLOWING> | <CURRENT> <ROW> | Expression ( <PRECEDING> | <FOLLOWING> ) )
+
+The **window frame extent clause** specifies the start point and end point of
+the window frame.
+The expression before `AND` is the start point and the expression after `AND`
+is the end point.
+If `BETWEEN` is omitted, you can only specify the start point; the end point
+becomes `CURRENT ROW`.
+
+The window frame end point can’t be before the start point.
+If this clause violates this restriction explicitly, an error will result.
+If it violates this restriction implicitly, the window frame will be empty,
+which means the window function will return its default value:
+in most cases this is NULL, except for `strict_count()` or
+`array_count()`, whose default value is 0.
+
+Window frame extents that result in an explicit violation are:
+
+* `BETWEEN CURRENT ROW AND Expression PRECEDING`
+
+* `BETWEEN Expression FOLLOWING AND Expression PRECEDING`
+
+* `BETWEEN Expression FOLLOWING AND CURRENT ROW`
+
+Window frame extents that result in an implicit violation are:
+
+* `BETWEEN UNBOUNDED PRECEDING AND Expression PRECEDING` — if `Expression` is
+ too high, some objects may generate an empty window frame.
+
+* `BETWEEN Expression PRECEDING AND Expression PRECEDING` — if the second
+ `Expression` is greater than or equal to the first `Expression`,
+ all result sets will generate an empty window frame.
+
+* `BETWEEN Expression FOLLOWING AND Expression FOLLOWING` — if the first
+ `Expression` is greater than or equal to the second `Expression`, all result
+ sets will generate an empty window frame.
+
+* `BETWEEN Expression FOLLOWING AND UNBOUNDED FOLLOWING` — if `Expression` is
+ too high, some objects may generate an empty window frame.
+
+* If the [window frame exclusion clause](#window-frame-exclusion) is present,
+ any window frame specification may result in empty window frame.
+
+The `Expression` must be a positive constant or an expression that evaluates as
+a positive number.
+For `ROWS` or `GROUPS`, the `Expression` must be an integer.
+
+#### <a id="window-frame-exclusion">Window Frame Exclusion</a> ####
+
+ WindowFrameExclusion ::= <EXCLUDE> ( <CURRENT> <ROW> | <GROUP> | <TIES> |
+ <NO> <OTHERS> )
+
+The **window frame exclusion clause** enables you to exclude specified
+objects from the window frame.
+
+This clause can be used with all [aggregate functions](#AggregateFunctions) and
+some [window functions](#WindowFunctions) — refer to the descriptions of
+individual functions for more details.
+
+This clause is allowed only when the [window frame
+clause](#window-frame-clause) is present.
+
+This clause is optional.
+If this clause is omitted, the default is no exclusion —
+the same as `EXCLUDE NO OTHERS`.
+
+* `EXCLUDE CURRENT ROW`: If the current object is still part of the window
+ frame, it is removed from the window frame.
+
+* `EXCLUDE GROUP`: The current object and any peers of the current object are
+ removed from the window frame.
+
+* `EXCLUDE TIES`: Any peers of the current object, but not the current object
+ itself, are removed from the window frame.
+
+* `EXCLUDE NO OTHERS`: No additional objects are removed from the window frame.
+
+If the current object is already removed from the window frame, then it remains
+removed from the window frame.
diff --git a/asterixdb/asterix-doc/src/main/markdown/builtins/7_allens.md
b/asterixdb/asterix-doc/src/main/markdown/builtins/7_allens.md
index ad7bd7e..e84ec97 100644
--- a/asterixdb/asterix-doc/src/main/markdown/builtins/7_allens.md
+++ b/asterixdb/asterix-doc/src/main/markdown/builtins/7_allens.md
@@ -272,3 +272,4 @@
* The expected result is:
{ "interval_ends": true, "interval_ended_by": true }
+
diff --git a/asterixdb/asterix-doc/src/main/markdown/builtins/7_temporal.md
b/asterixdb/asterix-doc/src/main/markdown/builtins/7_temporal.md
index dd07124..ab8b753 100644
--- a/asterixdb/asterix-doc/src/main/markdown/builtins/7_temporal.md
+++ b/asterixdb/asterix-doc/src/main/markdown/builtins/7_temporal.md
@@ -800,3 +800,4 @@
interval(datetime("2000-01-01T00:00:00.000Z"),
datetime("2100-01-01T00:00:00.000Z"))
]
};
+
diff --git
a/asterixdb/asterix-doc/src/main/markdown/builtins/9_aggregate_aql.md
b/asterixdb/asterix-doc/src/main/markdown/builtins/9_aggregate_aql.md
index 6498307..4482df8 100644
--- a/asterixdb/asterix-doc/src/main/markdown/builtins/9_aggregate_aql.md
+++ b/asterixdb/asterix-doc/src/main/markdown/builtins/9_aggregate_aql.md
@@ -295,3 +295,4 @@
* The expected result is:
100.0
+
diff --git
a/asterixdb/asterix-doc/src/main/markdown/builtins/9_aggregate_sql.md
b/asterixdb/asterix-doc/src/main/markdown/builtins/9_aggregate_sql.md
index 17b6e13..4737fa2 100644
--- a/asterixdb/asterix-doc/src/main/markdown/builtins/9_aggregate_sql.md
+++ b/asterixdb/asterix-doc/src/main/markdown/builtins/9_aggregate_sql.md
@@ -19,11 +19,22 @@
## <a id="AggregateFunctions">Aggregate Functions (Array Functions) </a> ##
-This section contains detailed descriptions of each aggregate function (i.e.,
array function) in the query language.
-Note that standard SQL aggregate functions (e.g., `MIN`, `MAX`, `SUM`,
`COUNT`, and `AVG`)
-are not real functions in the query language, but just syntactic sugars over
corresponding
+This section contains detailed descriptions of the built-in aggregate
functions in the query language.
+
+The query language also supports standard SQL aggregate functions (e.g.,
`MIN`, `MAX`, `SUM`, `COUNT`, and `AVG`).
+Note that these are not real functions in the query language, but just
syntactic sugars over corresponding
builtin aggregate functions (e.g., `ARRAY_MIN`, `ARRAY_MAX`,
`ARRAY_SUM`, `ARRAY_COUNT`, and `ARRAY_AVG`).
+Refer to [SQL-92 Aggregation
Functions](manual.html#SQL-92_aggregation_functions) for details.
+
+The `DISTINCT` keyword may be used with built-in aggregate functions and
standard SQL aggregate functions.
+It may also be used with aggregate functions used as window functions.
+It determines whether the function aggregates all values in the group, or
distinct values only.
+Refer to [Aggregation Functions](manual.html#Aggregation_functions) for
details.
+
+Aggregate functions may also be used as window functions when they are used
with a window clause,
+which is introduced by the `OVER` keyword.
+Refer to [Window Clause](#WindowClause) for details.
### array_count ###
* Syntax:
@@ -610,3 +621,4 @@
* The expected result is:
-1.5
+
diff --git a/asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md
b/asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md
index 17d935a..9f9db63 100644
--- a/asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md
+++ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/3_query.md
@@ -1180,15 +1180,27 @@
| STRICT_MAX | returns NULL | returns NULL | returns NULL |
| STRICT_MIN | returns NULL | returns NULL | returns NULL |
| STRICT_AVG | returns NULL | returns NULL | returns NULL |
+| STRICT_STDDEV_SAMP | returns NULL | returns NULL | returns NULL |
+| STRICT_STDDEV_POP | returns NULL | returns NULL | returns NULL |
+| STRICT_VAR_SAMP | returns NULL | returns NULL | returns NULL |
+| STRICT_VAR_POP | returns NULL | returns NULL | returns NULL |
+| STRICT_SKEWNESS | returns NULL | returns NULL | returns NULL |
+| STRICT_KURTOSIS | returns NULL | returns NULL | returns NULL |
| ARRAY_COUNT | not counted | not counted | 0 |
| ARRAY_SUM | ignores NULL | ignores NULL | returns NULL |
| ARRAY_MAX | ignores NULL | ignores NULL | returns NULL |
| ARRAY_MIN | ignores NULL | ignores NULL | returns NULL |
| ARRAY_AVG | ignores NULL | ignores NULL | returns NULL |
+| ARRAY_STDDEV_SAMP | ignores NULL | ignores NULL | returns NULL |
+| ARRAY_STDDEV_POP | ignores NULL | ignores NULL | returns NULL |
+| ARRAY_VAR_SAMP | ignores NULL | ignores NULL | returns NULL |
+| ARRAY_VAR_POP | ignores NULL | ignores NULL | returns NULL |
+| ARRAY_SKEWNESS | ignores NULL | ignores NULL | returns NULL |
+| ARRAY_KURTOSIS | ignores NULL | ignores NULL | returns NULL |
-Notice that the query language has twice as many functions listed above as
there are aggregate functions in SQL-92.
-This is because the language offers two versions of each -- one that handles
`UNKNOWN` values in a semantically
-strict fashion, where unknown values in the input result in unknown values in
the output -- and one that
+Notice that the query language offers two versions for each of the aggregate
functions listed above.
+For each function, the STRICT version handles `UNKNOWN` values in a
semantically strict fashion,
+where unknown values in the input result in unknown values in the output; and
the ARRAY version
handles them in the ad hoc "just ignore the unknown values" fashion that the
SQL standard chose to adopt.
##### Example
@@ -1225,7 +1237,7 @@
The query then uses the collection aggregate function ARRAY_COUNT to get the
cardinality of each
group of messages.
-Each aggregation function in the query language supports DISTINCT modifier
that removes duplicate values from
+Each aggregation function in the query language supports the DISTINCT modifier
that removes duplicate values from
the input collection.
##### Example
@@ -1265,15 +1277,26 @@
In contrast to the collection aggregate functions of the query language, these
special SQL-92 function symbols
can only be used in the same way they are in standard SQL (i.e., with the same
restrictions).
-DISTINCT modifier is also supported for these aggregate functions.
+The DISTINCT modifier is also supported for these aggregate functions.
-The following aggregate function aliases are supported
+The following table shows the SQL-92 functions supported by the query
language, their aliases where available,
+and their corresponding built-in functions.
-| Function | Aliases |
-|----------------|-------------------------|
-| STDDEV_SAMP | STDDEV |
-| VAR_SAMP | VARIANCE, VARIANCE_SAMP |
-| VAR_POP | VARIANCE_POP |
+| SQL-92 Function | Aliases | Corresponding Built-in Function |
+|-----------------|-------------------------|---------------------------------|
+| COUNT | | ARRAY_COUNT |
+| SUM | | ARRAY_SUM |
+| MAX | | ARRAY_MAX |
+| MIN | | ARRAY_MIN |
+| AVG | | ARRAY_AVG |
+| ARRAY_AGG | | (none) |
+| STDDEV_SAMP | STDDEV | ARRAY_STDDEV_SAMP |
+| STDDEV_POP | | ARRAY_STDDEV_POP |
+| VAR_SAMP | VARIANCE, VARIANCE_SAMP | ARRAY_VAR_SAMP |
+| VAR_POP | VARIANCE_POP | ARRAY_VAR_POP |
+
+Note that the `ARRAY_AGG` function symbol is rewritten simply to return the
result of the generated subquery,
+without applying any built-in function.
### <a id="SQL-92_compliant_gby">SQL-92 Compliant GROUP BY Aggregations</a>
The query language provides full support for SQL-92 `GROUP BY` aggregation
queries.
@@ -1329,7 +1352,7 @@
## <a id="Where_having_clauses">WHERE Clauses and HAVING Clauses</a>
Both `WHERE` clauses and `HAVING` clauses are used to filter input data based
on a condition expression.
Only tuples for which the condition expression evaluates to `TRUE` are
propagated.
-Note that if the condition expression evaluates to `NULL` or `MISSING` the
input tuple will be disgarded.
+Note that if the condition expression evaluates to `NULL` or `MISSING` the
input tuple will be discarded.
## <a id="Order_By_clauses">ORDER BY Clauses</a>
The `ORDER BY` clause is used to globally sort data in either ascending order
(i.e., `ASC`) or descending order (i.e., `DESC`).
diff --git
a/asterixdb/asterix-doc/src/main/markdown/sqlpp/appendix_1_keywords.md
b/asterixdb/asterix-doc/src/main/markdown/sqlpp/appendix_1_keywords.md
index 6d60249..d7c02d8 100644
--- a/asterixdb/asterix-doc/src/main/markdown/sqlpp/appendix_1_keywords.md
+++ b/asterixdb/asterix-doc/src/main/markdown/sqlpp/appendix_1_keywords.md
@@ -33,10 +33,10 @@
| INNER | INSERT | INTERNAL | INTERSECT | IS | JOIN |
| KEYWORD | LEFT | LETTING | LET | LIKE | LIMIT |
| LOAD | NODEGROUP | NGRAM | NOT | OFFSET | ON |
-| OPEN | OR | ORDER | OUTER | OUTPUT | PATH |
-| POLICY | PRE-SORTED | PRIMARY | RAW | REFRESH | RETURN |
-| RTREE | RUN | SATISFIES | SECONDARY | SELECT | SET |
-| SOME | TEMPORARY | THEN | TYPE | UNKNOWN | UNNEST |
-| UPDATE | USE | USING | VALUE | WHEN | WHERE |
-| WITH | WRITE | | | | |
+| OPEN | OR | ORDER | OUTER | OUTPUT | OVER |
+| PATH | POLICY | PRE-SORTED | PRIMARY | RAW | REFRESH |
+| RETURN | RTREE | RUN | SATISFIES | SECONDARY | SELECT |
+| SET | SOME | TEMPORARY | THEN | TYPE | UNKNOWN |
+| UNNEST | UPDATE | USE | USING | VALUE | WHEN |
+| WHERE | WITH | WRITE | | | |
--
To view, visit https://asterix-gerrit.ics.uci.edu/3448
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I52d6e97a27c2fa51208810c6ac3d98cb21a0e2b1
Gerrit-Change-Number: 3448
Gerrit-PatchSet: 1
Gerrit-Owner: Simon Dew <[email protected]>