This is an automated email from the ASF dual-hosted git repository.

victoria pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git


The following commit(s) were added to refs/heads/master by this push:
     new 3caacba8c52 update window functions doc (#15902)
3caacba8c52 is described below

commit 3caacba8c5297eac767f24e247aebe0483cfced5
Author: Charles Smith <techdocsm...@gmail.com>
AuthorDate: Thu Mar 7 15:16:52 2024 -0800

    update window functions doc (#15902)
    
    Co-authored-by: Victoria Lim <vt...@users.noreply.github.com>
---
 docs/querying/sql-functions.md        |  80 +++++++++++
 docs/querying/sql-translation.md      |   1 -
 docs/querying/sql-window-functions.md | 250 +++++++++++++++++++++++++++++++---
 website/.spelling                     |   6 +
 website/sidebars.json                 |   2 +-
 5 files changed, 318 insertions(+), 21 deletions(-)

diff --git a/docs/querying/sql-functions.md b/docs/querying/sql-functions.md
index 2d0c51f6c12..9a43afab1d7 100644
--- a/docs/querying/sql-functions.md
+++ b/docs/querying/sql-functions.md
@@ -480,6 +480,14 @@ Calculates the trigonometric cotangent of an angle 
expressed in radians.
 
 Counts the number of rows.
 
+## CUME_DIST
+
+`CUME_DIST()`
+
+**Function type:** [Window](sql-window-functions.md#window-function-reference)
+
+Returns the cumulative distribution of the current row within the window 
calculated as `number of window rows at the same rank or higher than current 
row` / `total window rows`. The return value ranges between `1/number of rows` 
and 1.
+
 ## CURRENT_DATE
 
 `CURRENT_DATE`
@@ -529,6 +537,14 @@ Decodes a Base64-encoded string into a UTF-8 encoded 
string.
 
 Converts an angle from radians to degrees.
 
+## DENSE_RANK
+
+`DENSE_RANK()`
+
+**Function type:** [Window](sql-window-functions.md#window-function-reference)
+
+Returns the rank for a row within a window without gaps. For example, if two 
rows tie for a rank of 1, the subsequent row is ranked 2.
+
 ## DIV
 
 `DIV(x, y)`
@@ -684,6 +700,14 @@ Calculates _e_ raised to the power of the numeric 
expression.
 
 Extracts the value of some unit of the timestamp, optionally from a certain 
time zone, and returns the number.
 
+## FIRST_VALUE
+
+`FIRST_VALUE(expr)`
+
+**Function type:** [Window](sql-window-functions.md#window-function-reference)
+
+Returns the value evaluated for the expression for the first row within the 
window.
+
 ## FLOOR (date and time)
 
 `FLOOR(<TIMESTAMP> TO <TIME_UNIT>)`
@@ -860,6 +884,22 @@ Extracts an `ARRAY<COMPLEX<json>>` value from `expr` at 
the specified `path`. If
 
 Extracts a literal value from `expr` at the specified `path`. If you specify 
`RETURNING` and an SQL type name (such as `VARCHAR`, `BIGINT`, `DOUBLE`, etc) 
the function plans the query using the suggested type. Otherwise, it attempts 
to infer the type based on the context. If it can't infer the type, it defaults 
to `VARCHAR`.
 
+## LAG
+
+`LAG(expr[, offset])`
+
+**Function type:** [Window](sql-window-functions.md#window-function-reference)
+
+If you do not supply an `offset`, returns the value evaluated at the row 
preceding the current row. Specify an offset number `n` to return the value 
evaluated at `n` rows preceding the current one.
+
+## LAST_VALUE
+
+`LAST_VALUE(expr)`
+
+**Function type:** [Window](sql-window-functions.md#window-function-reference)
+
+Returns the value evaluated for the expression for the last row within the 
window.
+
 ## LATEST
 
 `LATEST(expr, [maxBytesPerValue])`
@@ -876,6 +916,14 @@ Returns the value of a numeric or string expression 
corresponding to the latest
 
 Returns the value of a numeric or string expression corresponding to the 
latest time value from `timestampExpr`.
 
+## LEAD
+
+`LEAD(expr[, offset])`
+
+**Function type:** [Window](sql-window-functions.md#window-function-reference)
+
+If you do not supply an `offset`, returns the value evaluated at the row 
following the current row. Specify an offset number `n` to return the value 
evaluated at `n` rows following the current one; if there is no such row, 
returns the given default value.
+
 ## LEAST
 
 `LEAST([expr1, ...])`
@@ -1092,6 +1140,14 @@ Returns a slice of the array from the zero-based start 
and end indexes.
 
 Joins all elements of the array together by the given delimiter.
 
+## NTILE
+
+`NTILE(tiles)`
+
+**Function type:** [Window](sql-window-functions.md#window-function-reference)
+
+Divides the rows within a window as evenly as possible into the number of 
tiles, also called buckets, and returns the value of the tile that the row 
falls into.
+
 ## NULLIF
 
 `NULLIF(value1, value2)`
@@ -1124,6 +1180,14 @@ Parses `expr` into a `COMPLEX<json>` object. This 
operator deserializes JSON val
 
 Converts a string into a BIGINT with the given base or into a DECIMAL data 
type if the base is not specified.
 
+## PERCENT_RANK
+
+`PERCENT_RANK()`
+
+**Function type:** [Window](sql-window-functions.md#window-function-reference)
+
+Returns the relative rank of the row calculated as a percentage according to 
the formula: `RANK() OVER (window) / COUNT(1) OVER (window)`.
+
 ## POSITION
 
 `POSITION(<CHARACTER> IN <CHARACTER> [FROM <INTEGER>])`
@@ -1148,6 +1212,14 @@ Calculates a numerical expression raised to the 
specified power.
 
 Converts an angle from degrees to radians.
 
+## RANK
+
+`RANK()`
+
+**Function type:** [Window](sql-window-functions.md#window-function-reference)
+
+Returns the rank with gaps for a row within a window. For example, if two rows 
tie for rank 1, the next rank is 3.
+
 ## REGEXP_EXTRACT
 
 `REGEXP_EXTRACT(<CHARACTER>, <CHARACTER>, [<INTEGER>])`
@@ -1213,6 +1285,14 @@ Returns the rightmost number of characters from an 
expression.
 
 Calculates the rounded value for a numerical expression.
 
+## ROW_NUMBER
+
+`ROW_NUMBER()`
+
+**Function type:** [Window](sql-window-functions.md#window-function-reference)
+
+Returns the number of the row within the window starting from 1.
+
 ## RPAD
 
 `RPAD(<CHARACTER>, <INTEGER>, [<CHARACTER>])`
diff --git a/docs/querying/sql-translation.md b/docs/querying/sql-translation.md
index 29edc932531..63ef8fdbdb3 100644
--- a/docs/querying/sql-translation.md
+++ b/docs/querying/sql-translation.md
@@ -822,7 +822,6 @@ Druid does not support all SQL features. In particular, the 
following features a
 - JOIN conditions that are not an equality between expressions from the left- 
and right-hand sides.
 - JOIN conditions containing a constant value inside the condition.
 - JOIN conditions on a column which contains a multi-value dimension.
-- OVER clauses, and analytic functions such as `LAG` and `LEAD`.
 - ORDER BY for a non-aggregating query, except for `ORDER BY __time` or `ORDER 
BY __time DESC`, which are supported.
   This restriction only applies to non-aggregating queries; you can ORDER BY 
any column in an aggregating query.
 - DDL and DML.
diff --git a/docs/querying/sql-window-functions.md 
b/docs/querying/sql-window-functions.md
index 4624ef07715..d64538779f0 100644
--- a/docs/querying/sql-window-functions.md
+++ b/docs/querying/sql-window-functions.md
@@ -1,6 +1,7 @@
 ---
 id: sql-window-functions
 title: Window functions
+description: Reference for window functions
 ---
 
 <!--
@@ -38,10 +39,10 @@ Set the context parameter `enableWindowing: true` to use 
window functions.
 
 Window functions in Apache Druid produce values based upon the relationship of 
one row within a window of rows to the other rows within the same window. A 
window is a group of related rows within a result set. For example, rows with 
the same value for a specific dimension.
 
-The following example organizes results with the same `channel` value into 
windows. For each window, the query returns the rank of each row in ascending 
order based upon its `delta` value.
-
 Window functions in Druid require a GROUP BY statement. Druid performs the 
row-level aggregations for the GROUP BY before performing the window function 
calculations.
 
+The following example organizes results with the same `channel` value into 
windows. For each window, the query returns the rank of each row in ascending 
order based upon its `changed` value.
+
 ```sql
 SELECT FLOOR(__time TO DAY) AS event_time,
     channel,
@@ -88,11 +89,55 @@ You can use the OVER clause to treat other Druid 
aggregation functions as window
 
 Window functions support aliasing.
 
-## Define a window with the OVER clause
+## Window function syntax
+
+You can write a window function in Druid using either syntax below.
+The second syntax shows a window alias to reference a window that you can 
reuse.
+
+
+```sql
+window_function() OVER (
+  [PARTITION BY partitioning expression]
+  [ORDER BY order expression]
+  [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY partitioning expression] [ORDER BY order expression]
+  [[ROWS, RANGE] BETWEEN range start AND range end])
+GROUP BY dimensions
+```
 
 The OVER clause defines the query windows for window functions as follows:
-- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- PARTITION BY indicates the dimension that defines window boundaries
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that 
all data belongs to a single window.
+
+In the following example, the following OVER clause example sets the window 
dimension to `channel` and orders the results by the absolute value of `delta` 
ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used 
for the windowed aggregation.
+
+ROWS and RANGE accept the following values for `range start` and `range end`:
+- UNBOUNDED PRECEDING: from the beginning of the window as ordered by the 
order expression
+- _N_ ROWS PRECEDING: _N_ rows before the current row as ordered by the order 
expression
+- CURRENT ROW: the current row
+- _N_ ROWS FOLLOWING: _N_ rows after the current row as ordered by the order 
expression
+- UNBOUNDED FOLLOWING: to the end of the window as ordered by the order 
expression
+
+See [Example with window frames](#example-with-window-frames) for more detail.
+ 
+Druid applies the GROUP BY dimensions before calculating all non-window 
aggregation functions. Then it applies the window function over the aggregated 
results.
 
 :::note
 
@@ -100,28 +145,126 @@ Sometimes windows are called partitions. However, the 
partitioning for window fu
 
 :::
 
-The following OVER clause example sets the window dimension to `channel` and 
orders the results by the absolute value of `delta` ascending:
+### ORDER BY windows
+
+When the window definition only specifies ORDER BY and not PARTITION BY, it 
sorts the aggregate data set and applies the function in that order.
+
+The following query uses `ORDER BY SUM(delta) DESC` to rank user hourly 
activity from the most changed the least changed within an hour:
 
 ```sql
-...
-RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
-...
+SELECT
+    TIME_FLOOR(__time, 'PT1H') as time_hour, 
+    channel, 
+    user,
+    SUM(delta) net_user_changes,
+    RANK() OVER (ORDER BY SUM(delta) DESC) AS editing_rank
+FROM "wikipedia"
+WHERE channel IN ('#kk.wikipedia', '#lt.wikipedia')
+  AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY TIME_FLOOR(__time, 'PT1H'), channel, user
+ORDER BY 5 
 ```
 
+<details>
+<summary> View results </summary>
+
+| `time_hour` | `channel` | `user` | `net_user_changes` | `editing_rank` |
+| --- | --- | --- | --- | --- |
+| `2016-06-27T15:00:00.000Z` | `#kk.wikipedia` | `Nurkhan` | 6900| 1 |
+| `2016-06-27T19:00:00.000Z` | `#lt.wikipedia` | `77.221.66.41` | 4358 | 2 |
+| `2016-06-27T09:00:00.000Z` | `#kk.wikipedia` | `Салиха` | 2702 | 3 |
+| `2016-06-27T04:00:00.000Z` | `#kk.wikipedia` | `Nurkhan` | 2440 | 4 |
+| `2016-06-27T09:00:00.000Z` | `#lt.wikipedia` | `80.4.147.222` | 894 | 5 |
+| `2016-06-27T09:00:00.000Z` | `#lt.wikipedia` | `178.11.203.212` | 447 | 6 |
+| `2016-06-27T11:00:00.000Z` | `#kk.wikipedia` | `Нұрлан Рахымжанов` | 126 | 7 
|
+| `2016-06-27T06:00:00.000Z` | `#kk.wikipedia` | `Шокай` | 91 | 8 |
+| `2016-06-27T11:00:00.000Z` | `#lt.wikipedia` | `MaryroseB54` | 59 | 9 |
+| `2016-06-27T04:00:00.000Z` | `#kk.wikipedia` | `Нұрлан Рахымжанов` | 56 | 10 
|
+| `2016-06-27T12:00:00.000Z` | `#lt.wikipedia` | `Karoliuk` | 53 | 11 |
+| `2016-06-27T12:00:00.000Z` | `#lt.wikipedia` | `Powermelon` | 28 | 12 |
+| `2016-06-27T07:00:00.000Z` | `#lt.wikipedia` | `Powermelon` | 13 | 13 |
+| `2016-06-27T10:00:00.000Z` | `#lt.wikipedia` | `80.4.147.222` | 1 | 14 |
+| `2016-06-27T07:00:00.000Z` | `#kk.wikipedia` | `Салиха` | -1 | 15 |
+| `2016-06-27T06:00:00.000Z` | `#lt.wikipedia` | `Powermelon` | -2 | 16 |
+</details>
+
+### PARTITION BY windows
+
+When a window only specifies PARTITION BY partition expression, Druid 
calculates the aggregate window function over all the rows that share a value 
within the selected dataset.
+
+The following example demonstrates a query that uses two different 
windows—`PARTITION BY channel` and `PARTITION BY user`—to calculate the total 
activity in the channel and total activity by the user so that they can be 
compared to individual hourly activity:
+
+```sql
+SELECT
+    TIME_FLOOR(__time, 'PT1H') as time_hour,
+    channel,
+    user,
+    SUM(delta) AS hourly_user_changes,
+    SUM(SUM(delta)) OVER (PARTITION BY user) AS total_user_changes,
+    SUM(SUM(delta)) OVER (PARTITION BY channel) AS total_channel_changes
+FROM "wikipedia"
+WHERE channel IN ('#kk.wikipedia', '#lt.wikipedia')
+  AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY TIME_FLOOR(__time, 'PT1H'), 2, 3
+ORDER BY channel, TIME_FLOOR(__time, 'PT1H'), user
+```
+
+<details>
+<summary> View results </summary>
+
+| `time_hour` | `channel` | `user` | `hourly_user_changes` | 
`total_user_changes` | `total_channel_changes` |
+| --- | ---| ---| --- | --- | --- |
+| `2016-06-27T04:00:00.000Z` | `#kk.wikipedia` | `Nurkhan` | 2440 | 9340 | 
12314 |
+| `2016-06-27T04:00:00.000Z` | `#kk.wikipedia` | `Нұрлан Рахымжанов` | 56 | 
182 | 12314 |
+| `2016-06-27T06:00:00.000Z` | `#kk.wikipedia` | `Шокай` | 91 | 91 | 12314 |
+| `2016-06-27T07:00:00.000Z` | `#kk.wikipedia` | `Салиха` | -1 | 2701 | 12314 |
+| `2016-06-27T09:00:00.000Z` | `#kk.wikipedia` | `Салиха` | 2702 | 2701 | 
12314 |
+| `2016-06-27T11:00:00.000Z` | `#kk.wikipedia` | `Нұрлан Рахымжанов` | 126 | 
182 | 12314 |
+| `2016-06-27T15:00:00.000Z` | `#kk.wikipedia` | `Nurkhan` | 6900 | 9340 | 
12314 |
+| `2016-06-27T06:00:00.000Z` | `#lt.wikipedia` | `Powermelon` | -2 | 39 | 5851 
|
+| `2016-06-27T07:00:00.000Z` | `#lt.wikipedia` | `Powermelon` | 13 | 39 | 5851 
|
+| `2016-06-27T09:00:00.000Z` | `#lt.wikipedia` | `178.11.203.212` | 447 | 447 
| 5851 |
+| `2016-06-27T09:00:00.000Z` | `#lt.wikipedia` | `80.4.147.222` | 894 | 895 | 
5851 |
+| `2016-06-27T10:00:00.000Z` | `#lt.wikipedia` | `80.4.147.222` | 1 | 895 | 
5851 |
+| `2016-06-27T11:00:00.000Z` | `#lt.wikipedia` | `MaryroseB54` | 59 | 59 | 
5851 |
+| `2016-06-27T12:00:00.000Z` | `#lt.wikipedia` | `Karoliuk` | 53 | 53 | 5851 |
+| `2016-06-27T12:00:00.000Z` | `#lt.wikipedia` | `Powermelon` | 28 | 39 | 5851 
|
+| `2016-06-27T19:00:00.000Z` | `#lt.wikipedia` | `77.221.66.41` | 4358 | 4358 
| 5851 |
+
+</details>
+
+In this example, the dataset is filtered for a single day. Therefore the 
window function results represent the total activity for the day, for the 
`user` and for the `channel` dimensions respectively.
+
+This type of result helps you analyze the impact of an individual user's 
hourly activity:
+- the impact to the channel by comparing `hourly_user_changes` to 
`total_channel_changes`
+- the impact of each user over the channel by `total_user_changes` to 
`total_channel_changes`
+- the progress of each user's individual activity by comparing 
`hourly_user_changes` to `total_user_changes`
+
+#### Window frame guardrails
+
+Druid has guardrail logic to prevent you from executing window function 
queries with window frame expressions that might return unexpected results.
+
+For example:
+- You cannot set expressions as bounds for window frames.
+- You cannot use two FOLLOWING expressions in the window frame. For example: 
`ROWS BETWEEN 2 ROWS FOLLOWING and 3 ROWS FOLLOWING`.
+- You can only use a RANGE frames when both endpoints are unbounded or current 
row.
+
+If you write a query that violates one of these conditions, Druid throws an 
error: "The query contains a window frame which may return incorrect results. 
To disregard this warning, set `windowingStrictValidation` to false in the 
query context."
+
 ## Window function reference
 
 |Function|Notes|
 |--------|-----|
-| `ROW_NUMBER()`| Returns the number of the row within the window |
-|`RANK()`| Returns the rank for a row within a window | 
-|`DENSE_RANK()`| Returns the rank for a row within a window without gaps. For 
example, if two rows tie for rank of 1, the subsequent row is ranked 2. |
-|`PERCENT_RANK()`| Returns the rank of the row calculated as a percentage 
according to the formula: `(rank - 1) / (total window rows - 1)` |
-|`CUME_DIST()`| Returns the cumulative distribution of the current row within 
the window calculated as `number of window rows at the same rank or higher than 
current row` / `total window rows` |
-|`NTILE(tiles)`| Divides the rows within a window as evenly as possible into 
the number of tiles, also called buckets, and returns the value of the tile 
that the row falls into | None |
-|`LAG(expr[, offset])`| Returns the value evaluated at the row that precedes 
the current row by the offset number within the window. `offset` defaults to 1 
if not provided |
-|`LEAD(expr[, offset])`| Returns the value evaluated at the row that follows 
the current row by the offset number within the window; if there is no such 
row, returns the given default value. `offset` defaults to 1 if not provided |
-|`FIRST_VALUE(expr)`| Returns the value for the expression for the first row 
within the window |
-|`LAST_VALUE(expr)`| Returns the value for the expression for the last row 
within the window |
+| `ROW_NUMBER()` | Returns the number of the row within the window starting 
from 1 |
+| `RANK()` | Returns the rank with gaps for a row within a window. For 
example, if two rows tie for rank 1, the next rank is 3 | 
+| `DENSE_RANK()` | Returns the rank for a row within a window without gaps. 
For example, if two rows tie for rank of 1, the subsequent row is ranked 2. |
+| `PERCENT_RANK()` | Returns the relative rank of the row calculated as a 
percentage according to the formula: `RANK() OVER (window) / COUNT(1) OVER 
(window)` |
+| `CUME_DIST()` | Returns the cumulative distribution of the current row 
within the window calculated as number of window rows at the same rank or 
higher than current row divided by total window rows. The return value ranges 
between `1/number of rows` and 1 |
+| `NTILE(tiles)` | Divides the rows within a window as evenly as possible into 
the number of tiles, also called buckets, and returns the value of the tile 
that the row falls into | None |
+| `LAG(expr[, offset])` | If you do not supply an `offset`, returns the value 
evaluated at the row preceding the current row. Specify an offset number, `n`, 
to return the value evaluated at `n` rows preceding the current one |
+| `LEAD(expr[, offset])` | If you do not supply an `offset`, returns the value 
evaluated at the row following the current row. Specify an offset number `n` to 
return the value evaluated at `n` rows following the current one; if there is 
no such row, returns the given default value |
+| `FIRST_VALUE(expr)` | Returns the value evaluated for the expression for the 
first row within the window |
+| `LAST_VALUE(expr)` | Returns the value evaluated for the expression for the 
last row within the window |
 
 ## Examples
 
@@ -214,6 +357,75 @@ GROUP BY channel, __time, delta
 
 </details>
 
+### Example with window frames
+
+The following query uses a few different window frames to calculate overall 
activity by channel:
+
+```sql
+SELECT
+    channel, 
+    TIME_FLOOR(__time, 'PT1H')      AS time_hour, 
+    SUM(delta)                      AS hourly_channel_changes,
+    SUM(SUM(delta)) OVER cumulative AS cumulative_activity_in_channel,
+    SUM(SUM(delta)) OVER moving5    AS csum5,
+    COUNT(1) OVER moving5           AS count5
+FROM "wikipedia"
+WHERE channel = '#en.wikipedia'
+  AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY 1, TIME_FLOOR(__time, 'PT1H')
+WINDOW cumulative AS (   
+                         PARTITION BY channel 
+                         ORDER BY TIME_FLOOR(__time, 'PT1H') 
+                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
+                     )
+                     ,
+        moving5 AS ( 
+                    PARTITION BY channel 
+                    ORDER BY TIME_FLOOR(__time, 'PT1H') 
+                    ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
+                  )
+```
+
+<details>
+<summary> View results </summary>
+
+| `channel` | `time_hour` | `hourly_channel_changes` | 
`cumulative_activity_in_channel` | `csum5` | `count5` |
+| --- | --- | --- | --- | --- | --- | --- | --- |
+| `#en.wikipedia` | `2016-06-27T00:00:00.000Z` | 74996 | 74996 | 74996 | 1 |
+| `#en.wikipedia` | `2016-06-27T01:00:00.000Z` | 24150 | 99146 | 99146 | 2 |
+| `#en.wikipedia` | `2016-06-27T02:00:00.000Z` | 102372 | 201518 | 201518 | 3 |
+| `#en.wikipedia` | `2016-06-27T03:00:00.000Z` | 61362 | 262880 | 262880 | 4 |
+| `#en.wikipedia` | `2016-06-27T04:00:00.000Z` | 61666 | 324546 | 324546 | 5 |
+| `#en.wikipedia` | `2016-06-27T05:00:00.000Z` | 144199 | 468745 | 393749 | 5 |
+| `#en.wikipedia` | `2016-06-27T06:00:00.000Z` | 33414 | 502159 | 403013 | 5 |
+| `#en.wikipedia` | `2016-06-27T07:00:00.000Z` | 79397 | 581556 | 380038 | 5 |
+| `#en.wikipedia` | `2016-06-27T08:00:00.000Z` | 104436 | 685992 | 423112 | 5 |
+| `#en.wikipedia` | `2016-06-27T09:00:00.000Z` | 58020 | 744012 | 419466 | 5 |
+| `#en.wikipedia` | `2016-06-27T10:00:00.000Z` | 93904 | 837916 | 369171 | 5 |
+| `#en.wikipedia` | `2016-06-27T11:00:00.000Z` | 74436 | 912352 | 410193 | 5 |
+| `#en.wikipedia` | `2016-06-27T12:00:00.000Z` | 83491 | 995843 | 414287 | 5 |
+| `#en.wikipedia` | `2016-06-27T13:00:00.000Z` | 103051 | 1098894 | 412902 | 5 
|
+| `#en.wikipedia` | `2016-06-27T14:00:00.000Z` | 211411 | 1310305 | 566293 | 5 
|
+| `#en.wikipedia` | `2016-06-27T15:00:00.000Z` | 101247 | 1411552 | 573636 | 5 
|
+| `#en.wikipedia` | `2016-06-27T16:00:00.000Z` | 189765 | 1601317 | 688965 | 5 
|
+| `#en.wikipedia` | `2016-06-27T17:00:00.000Z` | 74404 | 1675721 | 679878 | 5 |
+| `#en.wikipedia` | `2016-06-27T18:00:00.000Z` | 104824 | 1780545 | 681651 | 5 
|
+| `#en.wikipedia` | `2016-06-27T19:00:00.000Z` | 71268 | 1851813 | 541508 | 5 |
+| `#en.wikipedia` | `2016-06-27T20:00:00.000Z` | 88185 | 1939998 | 528446 | 5 |
+| `#en.wikipedia` | `2016-06-27T21:00:00.000Z` | 42584 | 1982582 | 381265 | 5 |
+
+</details>
+
+The example defines multiple window specifications in the WINDOW clause that 
you can use for various window function calculations.
+
+The query uses two windows:
+- `cumulative` is partitioned by channel and includes all rows from the 
beginning of partition up to the current row as ordered by `__time` to enable 
cumulative aggregation
+- `moving5` is also partitioned by channel but only includes up to the last 
four rows and the current row as ordered by time
+
+The number of rows considered for the `moving5` window for the `count5` column:
+- starts at a single row because there are no rows before the current one
+- grows up to five rows as defined by `ROWS BETWEEN 4 ROWS PRECEDING AND 
CURRENT ROW`
+
 ## Known issues
 
 The following are known issues with window functions:
diff --git a/website/.spelling b/website/.spelling
index f26711c6d5f..86495fdf84e 100644
--- a/website/.spelling
+++ b/website/.spelling
@@ -57,7 +57,9 @@ CloudWatch
 ColumnDescriptor
 Corretto
 CLI
+CUME_DIST
 DDL
+DENSE_RANK
 DML
 DNS
 DRUIDVERSION
@@ -94,6 +96,7 @@ ETL
 Elasticsearch
 Enums
 FirehoseFactory
+FIRST_VALUE
 FlattenSpec
 Float.NEGATIVE_INFINITY
 Float.NEGATIVE_INFINITY.
@@ -416,6 +419,7 @@ nonfinalized
 non-null
 non-nullable
 noop
+NTILE
 numerics
 numShards
 parameterize
@@ -427,6 +431,7 @@ partitioner
 partitionFunction
 partitionsSpec
 pathParts
+PERCENT_RANK
 performant
 plaintext
 pluggable
@@ -489,6 +494,7 @@ repo
 requireSSL
 rollup
 rollups
+ROW_NUMBER
 rsync
 runtime
 schemas
diff --git a/website/sidebars.json b/website/sidebars.json
index e3178726ce5..0b5bbc6e845 100644
--- a/website/sidebars.json
+++ b/website/sidebars.json
@@ -122,6 +122,7 @@
         "label": "Druid SQL",
         "items": [
           "querying/sql",
+          "querying/sql-functions",
           "querying/tips-good-queries",
           "querying/query-deep-storage",
           "querying/sql-data-types",
@@ -132,7 +133,6 @@
           "querying/sql-array-functions",
           "querying/sql-multivalue-string-functions",
           "querying/sql-json-functions",
-          "querying/sql-functions",
           "querying/sql-query-context",
           "querying/sql-metadata-tables",
           "querying/sql-translation"


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org

Reply via email to