vtlim commented on code in PR #15902:
URL: https://github.com/apache/druid/pull/15902#discussion_r1501114160
##########
docs/querying/sql-functions.md:
##########
@@ -529,6 +537,14 @@ Decodes a Base64-encoded string into a UTF-8 encoded
string.
Converts an angle from radians to degrees.
+## DENSE_RANK
+
+`DENSE_RANK()`
+
+**Function type:** [Window](sql-window-functions.md#window-function-reference)
+
+Returns the rank for a row within a window without gaps. For example, if two
rows tie for rank of 1, the subsequent row is ranked 2.
Review Comment:
```suggestion
Returns the rank for a row within a window without gaps. For example, if two
rows tie for a rank of 1, the subsequent row is ranked 2.
```
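The tie behavior in the suggested wording can be sanity-checked outside Druid. A minimal sketch using Python's bundled SQLite as a stand-in (not Druid; the `scores` table and its values are invented; SQLite ≥ 3.25 is assumed for window-function support):

```python
# Sketch of the tie behavior described above, using SQLite rather than
# Druid; the table and values here are invented for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?, ?)",
                 [("a", 90), ("b", 90), ("c", 80)])

rows = conn.execute("""
    SELECT name,
           RANK()       OVER (ORDER BY score DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rnk
    FROM scores
    ORDER BY score DESC, name
""").fetchall()

# Two rows tie for rank 1; RANK skips to 3, DENSE_RANK continues at 2.
print(rows)  # [('a', 1, 1), ('b', 1, 1), ('c', 3, 2)]
```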
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
Review Comment:
Should this say "use one of the following two (equivalent?) syntaxes"?
Is there a difference between the two in query execution or planning? And if
so, how do you decide when to use one or the other?
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used
for the windowed aggregation. The general syntax is:
Review Comment:
```suggestion
Window frames, set in ROWS and RANGE expressions, limit the set of rows used
for the windowed aggregation.
```
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used
for the windowed aggregation. The general syntax is:
+
+ROWS AND RANGE accept the following values start and end :
Review Comment:
```suggestion
ROWS and RANGE accept the following values for `range start` and `range end`:
```
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used
for the windowed aggregation. The general syntax is:
+
+ROWS AND RANGE accept the following values start and end :
+- UNBOUND PRECEDING - from the beginning of the partition as order by the
order expression
+- N ROWS PRECEDING - N rows before the current row as ordered by the order
expression
+- CURRENT ROW - the current row
+- N ROWS FOLLOWING - N rows after the current row as ordered by the order
expression
+- UNBOUNDED FOLLOWING - to the end of the partition as ordered by the order
expression
+
+See (Example with window frames)(#example-with-window-frames) for more detail.
+
+Druid applies the GROUP BY dimensions first before calculating all non-window
aggregation functions. Then it applies the window function over the aggregate
results.
Review Comment:
```suggestion
Druid applies the GROUP BY dimensions before calculating all non-window
aggregation functions. Then it applies the window function over the aggregated
results.
```
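The evaluation order in the suggested wording (GROUP BY first, window function over the aggregated results) matches standard SQL semantics and can be demonstrated outside Druid. A sketch with SQLite as a stand-in (the `edits` table and data are invented): the window function sees the grouped rows, not the raw rows.

```python
# Evaluation-order sketch (SQLite stand-in for Druid; data invented):
# the window COUNT sees the grouped rows, not the raw rows.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edits (user TEXT, delta INTEGER)")
conn.executemany("INSERT INTO edits VALUES (?, ?)",
                 [("u1", 1), ("u1", 2), ("u2", 3), ("u2", 4), ("u3", 5)])

rows = conn.execute("""
    SELECT user,
           SUM(delta) AS user_changes,
           COUNT(*) OVER () AS rows_seen_by_window
    FROM edits
    GROUP BY user
    ORDER BY user
""").fetchall()

# Five raw rows collapse to three groups; the window counts 3, not 5.
print(rows)  # [('u1', 3, 3), ('u2', 7, 3), ('u3', 5, 3)]
```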
##########
docs/querying/sql-window-functions.md:
##########
Review Comment:
not related to the changes in this PR, but before the first example, it
seems you should switch the ordering of these two sentences
>The following example organizes results with the same `channel` value into
windows. For each window, the query returns the rank of each row in ascending
order based upon its `delta` value.
>Window functions in Druid require a GROUP BY statement. Druid performs the
row-level aggregations for the GROUP BY before performing the window function
calculations.
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
Review Comment:
```suggestion
In general, window functions in Druid use the following syntax:
```
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
Review Comment:
Since the previous syntax has `partitioning expression` and `order
expression`, should you include it here too?
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
Review Comment:
```suggestion
[[ROWS, RANGE] BETWEEN range start AND range end])
```
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
Review Comment:
Is this specifically `PARTITION` and not `PARTITIONED` like in
https://druid.apache.org/docs/latest/multi-stage-query/reference#partitioned-by?
Or can you use both interchangeably? If not, maybe we should note the
difference.
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
Review Comment:
possibly "that defines the window boundaries"
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used
for the windowed aggregation. The general syntax is:
+
+ROWS AND RANGE accept the following values start and end :
+- UNBOUND PRECEDING - from the beginning of the partition as order by the
order expression
+- N ROWS PRECEDING - N rows before the current row as ordered by the order
expression
+- CURRENT ROW - the current row
+- N ROWS FOLLOWING - N rows after the current row as ordered by the order
expression
+- UNBOUNDED FOLLOWING - to the end of the partition as ordered by the order
expression
Review Comment:
```suggestion
- `UNBOUNDED PRECEDING`: from the beginning of the window as ordered by the
order expression
- `N ROWS PRECEDING`: _N_ rows before the current row as ordered by the
order expression
- `CURRENT ROW`: the current row
- `N ROWS FOLLOWING`: _N_ rows after the current row as ordered by the order
expression
- `UNBOUNDED FOLLOWING`: to the end of the window as ordered by the order
expression
```
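The frame bounds listed in the suggestion can be exercised outside Druid. A sketch with SQLite as a stand-in (the `hourly` table and data are invented); note that standard SQL spells the bounds `N PRECEDING` / `N FOLLOWING` inside `ROWS BETWEEN`:

```python
# Frame-bounds sketch (SQLite, not Druid; data invented). Standard SQL
# spells the bounds "N PRECEDING" / "N FOLLOWING" inside ROWS BETWEEN.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hourly (hr INTEGER, delta INTEGER)")
conn.executemany("INSERT INTO hourly VALUES (?, ?)",
                 [(1, 10), (2, 20), (3, 30), (4, 40)])

rows = conn.execute("""
    SELECT hr,
           SUM(delta) OVER (
               ORDER BY hr
               ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS moving_sum,
           SUM(delta) OVER (
               ORDER BY hr
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_sum
    FROM hourly
""").fetchall()

# moving_sum covers at most two rows; running_sum grows from the start.
print(rows)  # [(1, 10, 10), (2, 30, 30), (3, 50, 60), (4, 70, 100)]
```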
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
Review Comment:
Not sure if I understand what an empty OVER clause is, but doesn't the
example have a non-empty OVER clause?
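For reference, the "empty OVER clause" case can be illustrated outside Druid. A sketch with SQLite as a stand-in (the `edits` table and data are invented): `OVER ()` with no PARTITION BY makes the entire result set a single window, so every row sees the same grand total.

```python
# "Empty OVER clause" sketch (SQLite stand-in; data invented): OVER ()
# with no PARTITION BY treats the whole result set as one window.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edits (channel TEXT, delta INTEGER)")
conn.executemany("INSERT INTO edits VALUES (?, ?)",
                 [("#a", 1), ("#b", 2), ("#c", 3)])

rows = conn.execute("""
    SELECT channel, delta,
           SUM(delta) OVER () AS total_changes
    FROM edits
    ORDER BY channel
""").fetchall()

# Every row carries the same grand total (6) from the single window.
print(rows)  # [('#a', 1, 6), ('#b', 2, 6), ('#c', 3, 6)]
```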
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used
for the windowed aggregation. The general syntax is:
+
+ROWS AND RANGE accept the following values start and end :
+- UNBOUND PRECEDING - from the beginning of the partition as order by the
order expression
+- N ROWS PRECEDING - N rows before the current row as ordered by the order
expression
+- CURRENT ROW - the current row
+- N ROWS FOLLOWING - N rows after the current row as ordered by the order
expression
+- UNBOUNDED FOLLOWING - to the end of the partition as ordered by the order
expression
+
+See (Example with window frames)(#example-with-window-frames) for more detail.
+
+Druid applies the GROUP BY dimensions first before calculating all non-window
aggregation functions. Then it applies the window function over the aggregate
results.
:::note
Sometimes windows are called partitions. However, the partitioning for window
functions are a shuffle (partition) of the result set created at query time and
is not to be confused with Druid's segment partitioning feature which
partitions data at ingest time.
:::
-The following OVER clause example sets the window dimension to `channel` and
orders the results by the absolute value of `delta` ascending:
+### ORDER BY windows
+
+When the window definition only specifies ORDER BY , it sorts the aggregate
data set and applies the function in that order.
+
+The following query uses ORDER BY SUM(delta) DESC to rank user hourly activity
from the most changed the least changed within an hour:
```sql
-...
-RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
-...
+SELECT
+ TIME_FLOOR(__time, 'PT1H') as time_hour,
+ channel,
+ user,
+ SUM(delta) net_user_changes,
+ RANK( ) OVER ( ORDER BY SUM(delta) DESC ) editing_rank
+FROM "wikipedia"
+WHERE channel IN ('#kk.wikipedia', '#lt.wikipedia')
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY TIME_FLOOR(__time, 'PT1H'), channel, user
+ORDER BY 5
```
+### PARTITION BY windows
+
+When a window only specifies PARTITION BY partition expression, Druid
calculates the aggregate window function over all the rows that share a values
within the selected dataset.
+
+The following example demonstrates a query that uses two different windows
PARTITION BY channel and PARTITION BY user to calculate the total activity in
the channel and total activity by the user so that they can be compared to
individual hourly activity:
Review Comment:
```suggestion
The following example demonstrates a query that uses two different
windows—`PARTITION BY channel` and `PARTITION BY user`—to calculate the total
activity in the channel and total activity by the user so that they can be
compared to individual hourly activity:
```
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used
for the windowed aggregation. The general syntax is:
+
+ROWS AND RANGE accept the following values start and end :
+- UNBOUND PRECEDING - from the beginning of the partition as order by the
order expression
+- N ROWS PRECEDING - N rows before the current row as ordered by the order
expression
+- CURRENT ROW - the current row
+- N ROWS FOLLOWING - N rows after the current row as ordered by the order
expression
+- UNBOUNDED FOLLOWING - to the end of the partition as ordered by the order
expression
+
+See (Example with window frames)(#example-with-window-frames) for more detail.
+
+Druid applies the GROUP BY dimensions first before calculating all non-window
aggregation functions. Then it applies the window function over the aggregate
results.
:::note
Sometimes windows are called partitions. However, the partitioning for window
functions are a shuffle (partition) of the result set created at query time and
is not to be confused with Druid's segment partitioning feature which
partitions data at ingest time.
:::
-The following OVER clause example sets the window dimension to `channel` and
orders the results by the absolute value of `delta` ascending:
+### ORDER BY windows
+
+When the window definition only specifies ORDER BY , it sorts the aggregate
data set and applies the function in that order.
+
+The following query uses ORDER BY SUM(delta) DESC to rank user hourly activity
from the most changed the least changed within an hour:
```sql
Review Comment:
Could you also include query results? Like with the first example.
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used
for the windowed aggregation. The general syntax is:
+
+ROWS AND RANGE accept the following values start and end :
+- UNBOUND PRECEDING - from the beginning of the partition as order by the
order expression
+- N ROWS PRECEDING - N rows before the current row as ordered by the order
expression
+- CURRENT ROW - the current row
+- N ROWS FOLLOWING - N rows after the current row as ordered by the order
expression
+- UNBOUNDED FOLLOWING - to the end of the partition as ordered by the order
expression
+
+See (Example with window frames)(#example-with-window-frames) for more detail.
+
+Druid applies the GROUP BY dimensions first before calculating all non-window
aggregation functions. Then it applies the window function over the aggregate
results.
:::note
Sometimes windows are called partitions. However, the partitioning for window
functions are a shuffle (partition) of the result set created at query time and
is not to be confused with Druid's segment partitioning feature which
partitions data at ingest time.
:::
-The following OVER clause example sets the window dimension to `channel` and
orders the results by the absolute value of `delta` ascending:
+### ORDER BY windows
+
+When the window definition only specifies ORDER BY , it sorts the aggregate
data set and applies the function in that order.
+
+The following query uses ORDER BY SUM(delta) DESC to rank user hourly activity
from the most changed the least changed within an hour:
Review Comment:
```suggestion
The following query uses `ORDER BY SUM(delta) DESC` to rank user hourly
activity from the most changed to the least changed within an hour:
```
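The ranking pattern in the suggestion (RANK over `ORDER BY SUM(...)` with no PARTITION BY) can be sketched outside Druid with SQLite as a stand-in (the `edits` table and data are invented): the whole aggregated result set forms one window, and the rank orders the per-group sums.

```python
# ORDER BY-only window sketch (SQLite stand-in; data invented): with no
# PARTITION BY, RANK() orders the whole aggregated result set as one window.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edits (user TEXT, delta INTEGER)")
conn.executemany("INSERT INTO edits VALUES (?, ?)",
                 [("u1", 5), ("u1", 5), ("u2", 20), ("u3", 1)])

rows = conn.execute("""
    SELECT user,
           SUM(delta) AS net_user_changes,
           RANK() OVER (ORDER BY SUM(delta) DESC) AS editing_rank
    FROM edits
    GROUP BY user
    ORDER BY editing_rank
""").fetchall()

# Ranks the per-user sums (20, 10, 1) across the single window.
print(rows)  # [('u2', 20, 1), ('u1', 10, 2), ('u3', 1, 3)]
```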
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used
for the windowed aggregation. The general syntax is:
+
+ROWS AND RANGE accept the following values start and end :
+- UNBOUND PRECEDING - from the beginning of the partition as order by the
order expression
+- N ROWS PRECEDING - N rows before the current row as ordered by the order
expression
+- CURRENT ROW - the current row
+- N ROWS FOLLOWING - N rows after the current row as ordered by the order
expression
+- UNBOUNDED FOLLOWING - to the end of the partition as ordered by the order
expression
+
+See (Example with window frames)(#example-with-window-frames) for more detail.
+
+Druid applies the GROUP BY dimensions first before calculating all non-window
aggregation functions. Then it applies the window function over the aggregate
results.
:::note
Sometimes windows are called partitions. However, the partitioning for window
functions are a shuffle (partition) of the result set created at query time and
is not to be confused with Druid's segment partitioning feature which
partitions data at ingest time.
:::
-The following OVER clause example sets the window dimension to `channel` and
orders the results by the absolute value of `delta` ascending:
+### ORDER BY windows
+
+When the window definition only specifies ORDER BY , it sorts the aggregate
data set and applies the function in that order.
Review Comment:
How are the window boundaries determined if there's no PARTITION BY? Is the
entire result set (the result of WHERE) one big window?
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used
for the windowed aggregation. The general syntax is:
+
+ROWS AND RANGE accept the following values start and end :
+- UNBOUND PRECEDING - from the beginning of the partition as order by the
order expression
+- N ROWS PRECEDING - N rows before the current row as ordered by the order
expression
+- CURRENT ROW - the current row
+- N ROWS FOLLOWING - N rows after the current row as ordered by the order
expression
+- UNBOUNDED FOLLOWING - to the end of the partition as ordered by the order
expression
+
+See (Example with window frames)(#example-with-window-frames) for more detail.
+
+Druid applies the GROUP BY dimensions first before calculating all non-window
aggregation functions. Then it applies the window function over the aggregate
results.
:::note
Sometimes windows are called partitions. However, the partitioning for window
functions are a shuffle (partition) of the result set created at query time and
is not to be confused with Druid's segment partitioning feature which
partitions data at ingest time.
:::
-The following OVER clause example sets the window dimension to `channel` and
orders the results by the absolute value of `delta` ascending:
+### ORDER BY windows
+
+When the window definition only specifies ORDER BY , it sorts the aggregate
data set and applies the function in that order.
+
+The following query uses ORDER BY SUM(delta) DESC to rank user hourly activity
from the most changed the least changed within an hour:
```sql
-...
-RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
-...
+SELECT
+ TIME_FLOOR(__time, 'PT1H') as time_hour,
+ channel,
+ user,
+ SUM(delta) net_user_changes,
+ RANK( ) OVER ( ORDER BY SUM(delta) DESC ) editing_rank
+FROM "wikipedia"
+WHERE channel IN ('#kk.wikipedia', '#lt.wikipedia')
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY TIME_FLOOR(__time, 'PT1H'), channel, user
+ORDER BY 5
```
+### PARTITION BY windows
+
+When a window only specifies PARTITION BY partition expression, Druid
calculates the aggregate window function over all the rows that share a values
within the selected dataset.
+
+The following example demonstrates a query that uses two different windows
PARTITION BY channel and PARTITION BY user to calculate the total activity in
the channel and total activity by the user so that they can be compared to
individual hourly activity:
+
+```sql
+SELECT
+ TIME_FLOOR(__time, 'PT1H') as time_hour, channel, user,
+ SUM(delta) hourly_user_changes,
+ SUM(SUM(delta)) OVER (PARTITION BY user ) AS total_user_changes,
+ SUM(SUM(delta)) OVER (PARTITION BY channel ) AS total_channel_changes
Review Comment:
```suggestion
SUM(SUM(delta)) OVER (PARTITION BY user) AS total_user_changes,
SUM(SUM(delta)) OVER (PARTITION BY channel) AS total_channel_changes
```
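The two-window pattern in the suggestion (`SUM(SUM(delta)) OVER (PARTITION BY ...)` twice in one query) can be sketched outside Druid with SQLite as a stand-in (the `edits` table and data are invented): each window function partitions the grouped aggregates independently.

```python
# Two-window sketch (SQLite stand-in for Druid; data invented): one query,
# two PARTITION BY windows over the grouped aggregates.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edits (channel TEXT, user TEXT, delta INTEGER)")
conn.executemany("INSERT INTO edits VALUES (?, ?, ?)", [
    ("#a", "u1", 5), ("#a", "u1", 5), ("#a", "u2", 10), ("#b", "u1", 20),
])

rows = conn.execute("""
    SELECT channel, user,
           SUM(delta) AS user_changes,
           SUM(SUM(delta)) OVER (PARTITION BY user) AS total_user_changes,
           SUM(SUM(delta)) OVER (PARTITION BY channel) AS total_channel_changes
    FROM edits
    GROUP BY channel, user
    ORDER BY channel, user
""").fetchall()

# Per-group sums (10, 10, 20), user totals (u1=30, u2=10), channel totals
# (#a=20, #b=20), all from one pass over the grouped rows.
print(rows)
# [('#a', 'u1', 10, 30, 20), ('#a', 'u2', 10, 10, 20), ('#b', 'u1', 20, 30, 20)]
```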
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used for the windowed aggregation.
+
+ROWS and RANGE accept the following values for the range start and end:
+- UNBOUNDED PRECEDING - from the beginning of the partition as ordered by the order expression
+- N ROWS PRECEDING - N rows before the current row as ordered by the order expression
+- CURRENT ROW - the current row
+- N ROWS FOLLOWING - N rows after the current row as ordered by the order expression
+- UNBOUNDED FOLLOWING - to the end of the partition as ordered by the order expression
+
+See (Example with window frames)(#example-with-window-frames) for more detail.
+
+Druid applies the GROUP BY dimensions first before calculating all non-window
aggregation functions. Then it applies the window function over the aggregate
results.
:::note
Sometimes windows are called partitions. However, the partitioning for window functions is a shuffle (partition) of the result set created at query time and is not to be confused with Druid's segment partitioning feature, which partitions data at ingest time.
:::
-The following OVER clause example sets the window dimension to `channel` and
orders the results by the absolute value of `delta` ascending:
+### ORDER BY windows
+
+When the window definition only specifies ORDER BY, it sorts the aggregate data set and applies the function in that order.
+
+The following query uses ORDER BY SUM(delta) DESC to rank user hourly activity from the most changed to the least changed within an hour:
```sql
-...
-RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
-...
+SELECT
+ TIME_FLOOR(__time, 'PT1H') as time_hour,
+ channel,
+ user,
+ SUM(delta) net_user_changes,
+ RANK( ) OVER ( ORDER BY SUM(delta) DESC ) editing_rank
+FROM "wikipedia"
+WHERE channel IN ('#kk.wikipedia', '#lt.wikipedia')
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY TIME_FLOOR(__time, 'PT1H'), channel, user
+ORDER BY 5
```
+### PARTITION BY windows
+
+When a window only specifies PARTITION BY partition expression, Druid
calculates the aggregate window function over all the rows that share a values
within the selected dataset.
Review Comment:
"share a value" or "share values"
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used for the windowed aggregation.
+
+ROWS and RANGE accept the following values for the range start and end:
+- UNBOUNDED PRECEDING - from the beginning of the partition as ordered by the order expression
+- N ROWS PRECEDING - N rows before the current row as ordered by the order expression
+- CURRENT ROW - the current row
+- N ROWS FOLLOWING - N rows after the current row as ordered by the order expression
+- UNBOUNDED FOLLOWING - to the end of the partition as ordered by the order expression
+
+See (Example with window frames)(#example-with-window-frames) for more detail.
+
+Druid applies the GROUP BY dimensions first before calculating all non-window
aggregation functions. Then it applies the window function over the aggregate
results.
:::note
Sometimes windows are called partitions. However, the partitioning for window functions is a shuffle (partition) of the result set created at query time and is not to be confused with Druid's segment partitioning feature, which partitions data at ingest time.
:::
-The following OVER clause example sets the window dimension to `channel` and
orders the results by the absolute value of `delta` ascending:
+### ORDER BY windows
+
+When the window definition only specifies ORDER BY, it sorts the aggregate data set and applies the function in that order.
+
+The following query uses ORDER BY SUM(delta) DESC to rank user hourly activity from the most changed to the least changed within an hour:
```sql
-...
-RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
-...
+SELECT
+ TIME_FLOOR(__time, 'PT1H') as time_hour,
+ channel,
+ user,
+ SUM(delta) net_user_changes,
+ RANK( ) OVER ( ORDER BY SUM(delta) DESC ) editing_rank
+FROM "wikipedia"
+WHERE channel IN ('#kk.wikipedia', '#lt.wikipedia')
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY TIME_FLOOR(__time, 'PT1H'), channel, user
+ORDER BY 5
```
+### PARTITION BY windows
+
+When a window only specifies PARTITION BY partition expression, Druid
calculates the aggregate window function over all the rows that share a values
within the selected dataset.
+
+The following example demonstrates a query that uses two different windows
PARTITION BY channel and PARTITION BY user to calculate the total activity in
the channel and total activity by the user so that they can be compared to
individual hourly activity:
+
+```sql
+SELECT
+ TIME_FLOOR(__time, 'PT1H') as time_hour, channel, user,
Review Comment:
```suggestion
TIME_FLOOR(__time, 'PT1H') as time_hour,
channel,
user,
```
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used for the windowed aggregation.
+
+ROWS and RANGE accept the following values for the range start and end:
+- UNBOUNDED PRECEDING - from the beginning of the partition as ordered by the order expression
+- N ROWS PRECEDING - N rows before the current row as ordered by the order expression
+- CURRENT ROW - the current row
+- N ROWS FOLLOWING - N rows after the current row as ordered by the order expression
+- UNBOUNDED FOLLOWING - to the end of the partition as ordered by the order expression
+
+See (Example with window frames)(#example-with-window-frames) for more detail.
+
+Druid applies the GROUP BY dimensions first before calculating all non-window
aggregation functions. Then it applies the window function over the aggregate
results.
:::note
Sometimes windows are called partitions. However, the partitioning for window functions is a shuffle (partition) of the result set created at query time and is not to be confused with Druid's segment partitioning feature, which partitions data at ingest time.
:::
-The following OVER clause example sets the window dimension to `channel` and
orders the results by the absolute value of `delta` ascending:
+### ORDER BY windows
+
+When the window definition only specifies ORDER BY, it sorts the aggregate data set and applies the function in that order.
+
+The following query uses ORDER BY SUM(delta) DESC to rank user hourly activity from the most changed to the least changed within an hour:
```sql
-...
-RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
-...
+SELECT
+ TIME_FLOOR(__time, 'PT1H') as time_hour,
+ channel,
+ user,
+ SUM(delta) net_user_changes,
+ RANK( ) OVER ( ORDER BY SUM(delta) DESC ) editing_rank
+FROM "wikipedia"
+WHERE channel IN ('#kk.wikipedia', '#lt.wikipedia')
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY TIME_FLOOR(__time, 'PT1H'), channel, user
+ORDER BY 5
```
+### PARTITION BY windows
+
+When a window only specifies PARTITION BY partition expression, Druid
calculates the aggregate window function over all the rows that share a values
within the selected dataset.
+
+The following example demonstrates a query that uses two different windows
PARTITION BY channel and PARTITION BY user to calculate the total activity in
the channel and total activity by the user so that they can be compared to
individual hourly activity:
+
+```sql
+SELECT
+ TIME_FLOOR(__time, 'PT1H') as time_hour, channel, user,
+ SUM(delta) hourly_user_changes,
+ SUM(SUM(delta)) OVER (PARTITION BY user ) AS total_user_changes,
+ SUM(SUM(delta)) OVER (PARTITION BY channel ) AS total_channel_changes
+FROM "wikipedia"
+WHERE channel IN ('#kk.wikipedia', '#lt.wikipedia')
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY TIME_FLOOR(__time, 'PT1H'),2,3
+ORDER BY channel,TIME_FLOOR(__time, 'PT1H'), user
+```
+
+In this example, the dataset is filtered for a single day, therefore the
window function results represent the total activity for the day, for the
`user` and for the `channel` respectively.
+
+This type of result helps you analyze the impact of an individual user's
hourly activity:
+- the impact to the channel by comparing `hourly_user_changes` to
`total_channel_changes`
+- the impact of each user over the channel by comparing `total_user_changes` to `total_channel_changes`
+- the progress of each user's individual activity by comparing
`hourly_user_changes` to `total_user_changes`
+
+
+
+
+#### Window frame guardrails
+
+Druid has guardrail logic to prevent you from executing window function
queries with window frame expressions that might return unexpected results.
+
+For example:
+- You cannot set expressions as bounds for window frames.
+- You cannot use two FOLLOWING expressions in the window frame. For example: `ROWS BETWEEN 2 FOLLOWING AND 3 FOLLOWING`.
+- You can only use RANGE frames when both endpoints are unbounded or the current row.
+
+If you write a query that violates one of these conditions, Druid throws an
error: "The query contains a window frame which may return incorrect results.
To disregard this warning, set [`windowingStrictValidation`] to false in the
query context."
Review Comment:
Does the query context parameter need a link? Or remove `[]`
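To make the parameter concrete, a query context value like the one in the quoted error message travels in the `context` object of a Druid SQL API request body (`POST /druid/v2/sql`). A minimal sketch of that payload shape in Python; the query text is a placeholder, and `windowingStrictValidation` is taken from the error message above:

```python
import json

# Request body for Druid's SQL HTTP API (POST /druid/v2/sql).
# The "context" object carries query context parameters such as
# windowingStrictValidation (placeholder query text below).
request_body = {
    "query": "SELECT channel, RANK() OVER (ORDER BY SUM(delta) DESC) "
             "FROM wikipedia GROUP BY channel",
    "context": {"windowingStrictValidation": False},
}
print(json.dumps(request_body, indent=2))
```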
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used for the windowed aggregation.
+
+ROWS and RANGE accept the following values for the range start and end:
+- UNBOUNDED PRECEDING - from the beginning of the partition as ordered by the order expression
+- N ROWS PRECEDING - N rows before the current row as ordered by the order expression
+- CURRENT ROW - the current row
+- N ROWS FOLLOWING - N rows after the current row as ordered by the order expression
+- UNBOUNDED FOLLOWING - to the end of the partition as ordered by the order expression
+
+See (Example with window frames)(#example-with-window-frames) for more detail.
+
+Druid applies the GROUP BY dimensions first before calculating all non-window
aggregation functions. Then it applies the window function over the aggregate
results.
:::note
Sometimes windows are called partitions. However, the partitioning for window functions is a shuffle (partition) of the result set created at query time and is not to be confused with Druid's segment partitioning feature, which partitions data at ingest time.
:::
-The following OVER clause example sets the window dimension to `channel` and
orders the results by the absolute value of `delta` ascending:
+### ORDER BY windows
+
+When the window definition only specifies ORDER BY, it sorts the aggregate data set and applies the function in that order.
+
+The following query uses ORDER BY SUM(delta) DESC to rank user hourly activity from the most changed to the least changed within an hour:
```sql
-...
-RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
-...
+SELECT
+ TIME_FLOOR(__time, 'PT1H') as time_hour,
+ channel,
+ user,
+ SUM(delta) net_user_changes,
+ RANK( ) OVER ( ORDER BY SUM(delta) DESC ) editing_rank
Review Comment:
```suggestion
RANK() OVER (ORDER BY SUM(delta) DESC) AS editing_rank
```
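Since tie handling is what distinguishes RANK from the DENSE_RANK addition in this PR, a quick standalone check using Python's built-in sqlite3 (SQLite, not Druid; assumes the interpreter bundles SQLite 3.25+, which added window functions):

```python
import sqlite3

# Two rows tie on delta = 50: RANK leaves a gap (1, 1, 3) while
# DENSE_RANK does not (1, 1, 2), matching the doc text for DENSE_RANK.
conn = sqlite3.connect(":memory:")
conn.executescript(
    "CREATE TABLE scores (delta INTEGER);"
    "INSERT INTO scores VALUES (50), (50), (30);"
)
rows = conn.execute(
    "SELECT delta,"
    "       RANK() OVER (ORDER BY delta DESC) AS rnk,"
    "       DENSE_RANK() OVER (ORDER BY delta DESC) AS dense_rnk"
    " FROM scores"
).fetchall()
print(rows)  # [(50, 1, 1), (50, 1, 1), (30, 3, 2)]
```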
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used for the windowed aggregation.
+
+ROWS and RANGE accept the following values for the range start and end:
+- UNBOUNDED PRECEDING - from the beginning of the partition as ordered by the order expression
+- N ROWS PRECEDING - N rows before the current row as ordered by the order expression
+- CURRENT ROW - the current row
+- N ROWS FOLLOWING - N rows after the current row as ordered by the order expression
+- UNBOUNDED FOLLOWING - to the end of the partition as ordered by the order expression
+
+See (Example with window frames)(#example-with-window-frames) for more detail.
+
+Druid applies the GROUP BY dimensions first before calculating all non-window
aggregation functions. Then it applies the window function over the aggregate
results.
:::note
Sometimes windows are called partitions. However, the partitioning for window functions is a shuffle (partition) of the result set created at query time and is not to be confused with Druid's segment partitioning feature, which partitions data at ingest time.
:::
-The following OVER clause example sets the window dimension to `channel` and
orders the results by the absolute value of `delta` ascending:
+### ORDER BY windows
+
+When the window definition only specifies ORDER BY, it sorts the aggregate data set and applies the function in that order.
+
+The following query uses ORDER BY SUM(delta) DESC to rank user hourly activity from the most changed to the least changed within an hour:
```sql
-...
-RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
-...
+SELECT
+ TIME_FLOOR(__time, 'PT1H') as time_hour,
+ channel,
+ user,
+ SUM(delta) net_user_changes,
+ RANK( ) OVER ( ORDER BY SUM(delta) DESC ) editing_rank
+FROM "wikipedia"
+WHERE channel IN ('#kk.wikipedia', '#lt.wikipedia')
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY TIME_FLOOR(__time, 'PT1H'), channel, user
+ORDER BY 5
```
+### PARTITION BY windows
+
+When a window only specifies PARTITION BY partition expression, Druid
calculates the aggregate window function over all the rows that share a values
within the selected dataset.
+
+The following example demonstrates a query that uses two different windows
PARTITION BY channel and PARTITION BY user to calculate the total activity in
the channel and total activity by the user so that they can be compared to
individual hourly activity:
+
+```sql
+SELECT
+ TIME_FLOOR(__time, 'PT1H') as time_hour, channel, user,
+ SUM(delta) hourly_user_changes,
+ SUM(SUM(delta)) OVER (PARTITION BY user ) AS total_user_changes,
+ SUM(SUM(delta)) OVER (PARTITION BY channel ) AS total_channel_changes
+FROM "wikipedia"
+WHERE channel IN ('#kk.wikipedia', '#lt.wikipedia')
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY TIME_FLOOR(__time, 'PT1H'),2,3
+ORDER BY channel,TIME_FLOOR(__time, 'PT1H'), user
Review Comment:
```suggestion
ORDER BY channel, TIME_FLOOR(__time, 'PT1H'), user
```
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used for the windowed aggregation.
+
+ROWS and RANGE accept the following values for the range start and end:
+- UNBOUNDED PRECEDING - from the beginning of the partition as ordered by the order expression
+- N ROWS PRECEDING - N rows before the current row as ordered by the order expression
+- CURRENT ROW - the current row
+- N ROWS FOLLOWING - N rows after the current row as ordered by the order expression
+- UNBOUNDED FOLLOWING - to the end of the partition as ordered by the order expression
+
+See (Example with window frames)(#example-with-window-frames) for more detail.
+
+Druid applies the GROUP BY dimensions first before calculating all non-window
aggregation functions. Then it applies the window function over the aggregate
results.
:::note
Sometimes windows are called partitions. However, the partitioning for window functions is a shuffle (partition) of the result set created at query time and is not to be confused with Druid's segment partitioning feature, which partitions data at ingest time.
:::
-The following OVER clause example sets the window dimension to `channel` and
orders the results by the absolute value of `delta` ascending:
+### ORDER BY windows
+
+When the window definition only specifies ORDER BY, it sorts the aggregate data set and applies the function in that order.
+
+The following query uses ORDER BY SUM(delta) DESC to rank user hourly activity from the most changed to the least changed within an hour:
```sql
-...
-RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
-...
+SELECT
+ TIME_FLOOR(__time, 'PT1H') as time_hour,
+ channel,
+ user,
+ SUM(delta) net_user_changes,
+ RANK( ) OVER ( ORDER BY SUM(delta) DESC ) editing_rank
+FROM "wikipedia"
+WHERE channel IN ('#kk.wikipedia', '#lt.wikipedia')
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY TIME_FLOOR(__time, 'PT1H'), channel, user
+ORDER BY 5
```
+### PARTITION BY windows
+
+When a window only specifies PARTITION BY partition expression, Druid
calculates the aggregate window function over all the rows that share a values
within the selected dataset.
+
+The following example demonstrates a query that uses two different windows
PARTITION BY channel and PARTITION BY user to calculate the total activity in
the channel and total activity by the user so that they can be compared to
individual hourly activity:
+
+```sql
+SELECT
+ TIME_FLOOR(__time, 'PT1H') as time_hour, channel, user,
+ SUM(delta) hourly_user_changes,
+ SUM(SUM(delta)) OVER (PARTITION BY user ) AS total_user_changes,
+ SUM(SUM(delta)) OVER (PARTITION BY channel ) AS total_channel_changes
+FROM "wikipedia"
+WHERE channel IN ('#kk.wikipedia', '#lt.wikipedia')
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY TIME_FLOOR(__time, 'PT1H'),2,3
+ORDER BY channel,TIME_FLOOR(__time, 'PT1H'), user
+```
+
+In this example, the dataset is filtered for a single day, therefore the
window function results represent the total activity for the day, for the
`user` and for the `channel` respectively.
Review Comment:
```suggestion
In this example, the dataset is filtered for a single day. Therefore the
window function results represent the total activity for the day, for the
`user` and for the `channel` dimensions respectively.
```
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used for the windowed aggregation.
+
+ROWS and RANGE accept the following values for the range start and end:
+- UNBOUNDED PRECEDING - from the beginning of the partition as ordered by the order expression
+- N ROWS PRECEDING - N rows before the current row as ordered by the order expression
+- CURRENT ROW - the current row
+- N ROWS FOLLOWING - N rows after the current row as ordered by the order expression
+- UNBOUNDED FOLLOWING - to the end of the partition as ordered by the order expression
+
+See (Example with window frames)(#example-with-window-frames) for more detail.
+
+Druid applies the GROUP BY dimensions first before calculating all non-window
aggregation functions. Then it applies the window function over the aggregate
results.
:::note
Sometimes windows are called partitions. However, the partitioning for window functions is a shuffle (partition) of the result set created at query time and is not to be confused with Druid's segment partitioning feature, which partitions data at ingest time.
:::
-The following OVER clause example sets the window dimension to `channel` and
orders the results by the absolute value of `delta` ascending:
+### ORDER BY windows
+
+When the window definition only specifies ORDER BY, it sorts the aggregate data set and applies the function in that order.
+
+The following query uses ORDER BY SUM(delta) DESC to rank user hourly activity from the most changed to the least changed within an hour:
```sql
-...
-RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
-...
+SELECT
+ TIME_FLOOR(__time, 'PT1H') as time_hour,
+ channel,
+ user,
+ SUM(delta) net_user_changes,
+ RANK( ) OVER ( ORDER BY SUM(delta) DESC ) editing_rank
+FROM "wikipedia"
+WHERE channel IN ('#kk.wikipedia', '#lt.wikipedia')
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY TIME_FLOOR(__time, 'PT1H'), channel, user
+ORDER BY 5
```
+### PARTITION BY windows
+
+When a window only specifies PARTITION BY partition expression, Druid
calculates the aggregate window function over all the rows that share a values
within the selected dataset.
+
+The following example demonstrates a query that uses two different windows
PARTITION BY channel and PARTITION BY user to calculate the total activity in
the channel and total activity by the user so that they can be compared to
individual hourly activity:
+
+```sql
+SELECT
+ TIME_FLOOR(__time, 'PT1H') as time_hour, channel, user,
+ SUM(delta) hourly_user_changes,
+ SUM(SUM(delta)) OVER (PARTITION BY user ) AS total_user_changes,
+ SUM(SUM(delta)) OVER (PARTITION BY channel ) AS total_channel_changes
+FROM "wikipedia"
+WHERE channel IN ('#kk.wikipedia', '#lt.wikipedia')
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY TIME_FLOOR(__time, 'PT1H'),2,3
+ORDER BY channel,TIME_FLOOR(__time, 'PT1H'), user
+```
+
+In this example, the dataset is filtered for a single day, therefore the
window function results represent the total activity for the day, for the
`user` and for the `channel` respectively.
+
+This type of result helps you analyze the impact of an individual user's
hourly activity:
+- the impact to the channel by comparing `hourly_user_changes` to
`total_channel_changes`
+- the impact of each user over the channel by comparing `total_user_changes` to `total_channel_changes`
+- the progress of each user's individual activity by comparing
`hourly_user_changes` to `total_user_changes`
+
+
+
+
+#### Window frame guardrails
Review Comment:
```suggestion
- the progress of each user's individual activity by comparing
`hourly_user_changes` to `total_user_changes`
#### Window frame guardrails
```
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used for the windowed aggregation.
+
+ROWS and RANGE accept the following values for the range start and end:
+- UNBOUNDED PRECEDING - from the beginning of the partition as ordered by the order expression
+- N ROWS PRECEDING - N rows before the current row as ordered by the order expression
+- CURRENT ROW - the current row
+- N ROWS FOLLOWING - N rows after the current row as ordered by the order expression
+- UNBOUNDED FOLLOWING - to the end of the partition as ordered by the order expression
+
+See (Example with window frames)(#example-with-window-frames) for more detail.
Review Comment:
```suggestion
See [Example with window frames](#example-with-window-frames) for more
detail.
```
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used
for the windowed aggregation. The general syntax is:
+
+ROWS AND RANGE accept the following values start and end :
+- UNBOUND PRECEDING - from the beginning of the partition as order by the
order expression
+- N ROWS PRECEDING - N rows before the current row as ordered by the order
expression
+- CURRENT ROW - the current row
+- N ROWS FOLLOWING - N rows after the current row as ordered by the order
expression
+- UNBOUNDED FOLLOWING - to the end of the partition as ordered by the order
expression
+
+See (Example with window frames)(#example-with-window-frames) for more detail.
+
+Druid applies the GROUP BY dimensions first before calculating all non-window
aggregation functions. Then it applies the window function over the aggregate
results.
:::note
Sometimes windows are called partitions. However, the partitioning for window
functions are a shuffle (partition) of the result set created at query time and
is not to be confused with Druid's segment partitioning feature which
partitions data at ingest time.
:::
-The following OVER clause example sets the window dimension to `channel` and
orders the results by the absolute value of `delta` ascending:
+### ORDER BY windows
+
+When the window definition only specifies ORDER BY , it sorts the aggregate
data set and applies the function in that order.
Review Comment:
```suggestion
When the window definition only specifies ORDER BY and not PARTITION BY, it
sorts the aggregate data set and applies the function in that order.
```
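To make the suggested wording concrete, here is a minimal sketch run on SQLite rather than Druid (the table and column names are made up for illustration): a window with only ORDER BY and no PARTITION BY applies the function across the whole result set in that order.

```python
import sqlite3

# Minimal sketch (SQLite, not Druid): an OVER clause with only ORDER BY
# and no PARTITION BY ranks across the entire result set.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE changes (editor TEXT, delta INTEGER)")
conn.executemany("INSERT INTO changes VALUES (?, ?)",
                 [("a", 30), ("b", 10), ("c", 20)])
rows = conn.execute("""
    SELECT editor, RANK() OVER (ORDER BY delta DESC) AS editing_rank
    FROM changes
    ORDER BY editing_rank
""").fetchall()
print(rows)  # [('a', 1), ('c', 2), ('b', 3)]
```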
##########
docs/querying/sql-window-functions.md:
##########
@@ -214,6 +309,45 @@ GROUP BY channel, __time, delta
</details>
+### Example with window frames
+
+The following query uses a few different window frames to calculate overall
activity by channel:
+
+```sql
+SELECT
+ channel,
+ TIME_FLOOR(__time, 'PT1H') AS time_hour,
+ SUM(delta) AS hourly_channel_changes,
+ SUM(SUM(delta)) OVER cumulative AS cumulative_activity_in_channel,
+ SUM(SUM(delta)) OVER moving5 AS csum5,
+ COUNT(1) OVER moving5 AS count5
+FROM "wikipedia"
+WHERE channel = '#en.wikipedia'
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY 1, TIME_FLOOR(__time, 'PT1H')
+WINDOW cumulative AS (
+ PARTITION BY channel
+ ORDER BY TIME_FLOOR(__time, 'PT1H')
+ ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
+ )
+ ,
+ moving5 AS (
+ PARTITION BY channel
+ ORDER BY TIME_FLOOR(__time, 'PT1H')
+ ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
+ )
+```
+
+The example defines multiple window specifications in the WINDOW clause that
you can use for various window function calculations.
+
+The query uses two windows:
+- cumulative is partitioned by channel and includes all rows from the
beginning of partition up to the current row as ordered by `__time` to enable
cumulative aggregation
Review Comment:
```suggestion
- `cumulative` is partitioned by channel and includes all rows from the
beginning of partition up to the current row as ordered by `__time` to enable
cumulative aggregation
```
##########
docs/querying/sql-window-functions.md:
##########
@@ -214,6 +309,45 @@ GROUP BY channel, __time, delta
</details>
+### Example with window frames
+
+The following query uses a few different window frames to calculate overall
activity by channel:
+
+```sql
+SELECT
+ channel,
+ TIME_FLOOR(__time, 'PT1H') AS time_hour,
+ SUM(delta) AS hourly_channel_changes,
+ SUM(SUM(delta)) OVER cumulative AS cumulative_activity_in_channel,
+ SUM(SUM(delta)) OVER moving5 AS csum5,
+ COUNT(1) OVER moving5 AS count5
+FROM "wikipedia"
+WHERE channel = '#en.wikipedia'
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY 1, TIME_FLOOR(__time, 'PT1H')
+WINDOW cumulative AS (
+ PARTITION BY channel
+ ORDER BY TIME_FLOOR(__time, 'PT1H')
+ ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
+ )
+ ,
+ moving5 AS (
+ PARTITION BY channel
+ ORDER BY TIME_FLOOR(__time, 'PT1H')
+ ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
+ )
+```
+
+The example defines multiple window specifications in the WINDOW clause that
you can use for various window function calculations.
+
+The query uses two windows:
+- cumulative is partitioned by channel and includes all rows from the
beginning of partition up to the current row as ordered by `__time` to enable
cumulative aggregation
+- `moving5` is also partitioned by channel but only includes up to the last 4
rows and the current row as ordered by time
+
+The number of rows considered for the `moving5` window for the count 5 column:
+- starts at 1 because there are no rows before the current one
+- grows up to 5 as defined by ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
Review Comment:
See the earlier comment on whether it's `4 PRECEDING` or `4 ROWS PRECEDING`.

##########
docs/querying/sql-window-functions.md:
##########
@@ -214,6 +309,45 @@ GROUP BY channel, __time, delta
</details>
+### Example with window frames
+
+The following query uses a few different window frames to calculate overall
activity by channel:
+
+```sql
+SELECT
+ channel,
+ TIME_FLOOR(__time, 'PT1H') AS time_hour,
+ SUM(delta) AS hourly_channel_changes,
+ SUM(SUM(delta)) OVER cumulative AS cumulative_activity_in_channel,
+ SUM(SUM(delta)) OVER moving5 AS csum5,
+ COUNT(1) OVER moving5 AS count5
+FROM "wikipedia"
+WHERE channel = '#en.wikipedia'
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY 1, TIME_FLOOR(__time, 'PT1H')
+WINDOW cumulative AS (
+ PARTITION BY channel
+ ORDER BY TIME_FLOOR(__time, 'PT1H')
+ ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
+ )
+ ,
+ moving5 AS (
+ PARTITION BY channel
+ ORDER BY TIME_FLOOR(__time, 'PT1H')
+ ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
Review Comment:
Should this be `4 PRECEDING` or should line 131 say `N PRECEDING`? (right
now it has `N ROWS PRECEDING`)
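As a hedged sanity check of the bound syntax in question, the snippet below runs a frame written as `1 PRECEDING` (no extra ROWS keyword inside the bound) on SQLite; this matches standard SQL frame grammar, though the example engine is not Druid and the table is invented for illustration.

```python
import sqlite3

# Hedged check against standard SQL (SQLite, not Druid): a frame bound is
# written `N PRECEDING`, with no extra ROWS keyword inside the bound itself.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,), (4,)])
sums = [r[0] for r in conn.execute("""
    SELECT SUM(x) OVER (ORDER BY x ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
    FROM t
    ORDER BY x
""")]
print(sums)  # [1, 3, 5, 7]
```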
##########
docs/querying/sql-window-functions.md:
##########
@@ -214,6 +309,45 @@ GROUP BY channel, __time, delta
</details>
+### Example with window frames
+
+The following query uses a few different window frames to calculate overall
activity by channel:
+
+```sql
+SELECT
+ channel,
+ TIME_FLOOR(__time, 'PT1H') AS time_hour,
+ SUM(delta) AS hourly_channel_changes,
+ SUM(SUM(delta)) OVER cumulative AS cumulative_activity_in_channel,
+ SUM(SUM(delta)) OVER moving5 AS csum5,
+ COUNT(1) OVER moving5 AS count5
+FROM "wikipedia"
+WHERE channel = '#en.wikipedia'
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY 1, TIME_FLOOR(__time, 'PT1H')
+WINDOW cumulative AS (
+ PARTITION BY channel
+ ORDER BY TIME_FLOOR(__time, 'PT1H')
+ ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
+ )
+ ,
+ moving5 AS (
+ PARTITION BY channel
+ ORDER BY TIME_FLOOR(__time, 'PT1H')
+ ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
+ )
+```
+
+The example defines multiple window specifications in the WINDOW clause that
you can use for various window function calculations.
+
+The query uses two windows:
+- cumulative is partitioned by channel and includes all rows from the
beginning of partition up to the current row as ordered by `__time` to enable
cumulative aggregation
+- `moving5` is also partitioned by channel but only includes up to the last 4
rows and the current row as ordered by time
+
+The number of rows considered for the `moving5` window for the count 5 column:
Review Comment:
```suggestion
The number of rows considered for the `moving5` window for the `count5`
column:
```
##########
docs/querying/sql-window-functions.md:
##########
@@ -213,7 +213,7 @@ If you write a query that violates one of these conditions,
Druid throws an erro
| `PERCENT_RANK()` | Returns the relative rank of the row calculated as a
percentage according to the formula: `RANK() OVER (window) / COUNT(1) OVER
(window)` |
| `CUME_DIST()` | Returns the cumulative distribution of the current row
within the window calculated as number of window rows at the same rank or
higher than current row divided by total window rows. The return value ranges
between `1/number of rows` and 1 |
| `NTILE(tiles)` | Divides the rows within a window as evenly as possible into
the number of tiles, also called buckets, and returns the value of the tile
that the row falls into | None |
-| `LAG(expr[, offset])` | If you do not supply an `offset`, returns the value
evaluated at the row preeeding the current row. Specify an offset number, `n`
to return the value evaluated at `n` rows preceding the current one |
+| `LAG(expr[, offset])` | If you do not supply an `offset`, returns the value
evaluated at the row preding the current row. Specify an offset number, `n` to
return the value evaluated at `n` rows preceding the current one |
Review Comment:
```suggestion
| `LAG(expr[, offset])` | If you do not supply an `offset`, returns the
value evaluated at the row preceding the current row. Specify an offset number,
`n`, to return the value evaluated at `n` rows preceding the current one |
```
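The corrected description can be illustrated with a small runnable sketch; this uses SQLite rather than Druid, with an invented table, but the `LAG` semantics being documented are the same: no offset means the immediately preceding row, and `LAG(expr, n)` reads the row `n` positions back.

```python
import sqlite3

# Sketch (SQLite, not Druid): LAG(expr) reads the immediately preceding
# row; LAG(expr, n) reads the row n positions back. Rows before the start
# of the window produce NULL (None in Python).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(10,), (20,), (30,)])
rows = conn.execute("""
    SELECT x,
           LAG(x) OVER (ORDER BY x)    AS prev_row,
           LAG(x, 2) OVER (ORDER BY x) AS two_back
    FROM t
    ORDER BY x
""").fetchall()
print(rows)  # [(10, None, None), (20, 10, None), (30, 20, 10)]
```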
##########
docs/querying/sql-window-functions.md:
##########
@@ -214,6 +309,45 @@ GROUP BY channel, __time, delta
</details>
+### Example with window frames
+
+The following query uses a few different window frames to calculate overall
activity by channel:
+
+```sql
+SELECT
+ channel,
+ TIME_FLOOR(__time, 'PT1H') AS time_hour,
+ SUM(delta) AS hourly_channel_changes,
+ SUM(SUM(delta)) OVER cumulative AS cumulative_activity_in_channel,
+ SUM(SUM(delta)) OVER moving5 AS csum5,
+ COUNT(1) OVER moving5 AS count5
+FROM "wikipedia"
+WHERE channel = '#en.wikipedia'
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY 1, TIME_FLOOR(__time, 'PT1H')
+WINDOW cumulative AS (
+ PARTITION BY channel
+ ORDER BY TIME_FLOOR(__time, 'PT1H')
+ ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
+ )
+ ,
+ moving5 AS (
+ PARTITION BY channel
+ ORDER BY TIME_FLOOR(__time, 'PT1H')
+ ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
+ )
+```
+
+The example defines multiple window specifications in the WINDOW clause that
you can use for various window function calculations.
+
+The query uses two windows:
+- cumulative is partitioned by channel and includes all rows from the
beginning of partition up to the current row as ordered by `__time` to enable
cumulative aggregation
+- `moving5` is also partitioned by channel but only includes up to the last 4
rows and the current row as ordered by time
Review Comment:
```suggestion
- `moving5` is also partitioned by channel but only includes up to the last
four rows and the current row as ordered by time
```
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used
for the windowed aggregation. The general syntax is:
+
+ROWS AND RANGE accept the following values start and end :
+- UNBOUND PRECEDING - from the beginning of the partition as order by the
order expression
+- N ROWS PRECEDING - N rows before the current row as ordered by the order
expression
+- CURRENT ROW - the current row
+- N ROWS FOLLOWING - N rows after the current row as ordered by the order
expression
+- UNBOUNDED FOLLOWING - to the end of the partition as ordered by the order
expression
Review Comment:
For the SQL value, I assume N is a placeholder here, so we should denote it
as a placeholder somehow, like `<N> ROWS PRECEDING`.
For the description, we should italicize N since it's a variable
(https://developers.google.com/style/text-formatting).
##########
docs/querying/sql-window-functions.md:
##########
@@ -214,6 +309,45 @@ GROUP BY channel, __time, delta
</details>
+### Example with window frames
+
+The following query uses a few different window frames to calculate overall
activity by channel:
+
+```sql
+SELECT
+ channel,
+ TIME_FLOOR(__time, 'PT1H') AS time_hour,
+ SUM(delta) AS hourly_channel_changes,
+ SUM(SUM(delta)) OVER cumulative AS cumulative_activity_in_channel,
+ SUM(SUM(delta)) OVER moving5 AS csum5,
+ COUNT(1) OVER moving5 AS count5
+FROM "wikipedia"
+WHERE channel = '#en.wikipedia'
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY 1, TIME_FLOOR(__time, 'PT1H')
+WINDOW cumulative AS (
+ PARTITION BY channel
+ ORDER BY TIME_FLOOR(__time, 'PT1H')
+ ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
+ )
+ ,
+ moving5 AS (
+ PARTITION BY channel
+ ORDER BY TIME_FLOOR(__time, 'PT1H')
+ ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
+ )
+```
+
+The example defines multiple window specifications in the WINDOW clause that
you can use for various window function calculations.
+
+The query uses two windows:
+- cumulative is partitioned by channel and includes all rows from the
beginning of partition up to the current row as ordered by `__time` to enable
cumulative aggregation
+- `moving5` is also partitioned by channel but only includes up to the last 4
rows and the current row as ordered by time
+
+The number of rows considered for the `moving5` window for the count 5 column:
+- starts at 1 because there are no rows before the current one
+- grows up to 5 as defined by ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
Review Comment:
```suggestion
- starts at a single row because there are no rows before the current one
- grows up to five rows as defined by `ROWS BETWEEN 4 PRECEDING AND CURRENT
ROW`
```
##########
docs/querying/sql-window-functions.md:
##########
Review Comment:
I think `based upon its delta value.` should be `based upon its change
value.`
##########
docs/querying/sql-functions.md:
##########
@@ -890,7 +890,7 @@ Extracts a literal value from `expr` at the specified
`path`. If you specify `RE
**Function type:** [Window](sql-window-functions.md#window-function-reference)
-If you do not supply an `offset`, returns the value evaluated at the row
preeeding the current row. Specify an offset number `n` to return the value
evaluated at `n` rows preceding the current one.
+If you do not supply an `offset`, returns the value evaluated at the row
preding the current row. Specify an offset number `n` to return the value
evaluated at `n` rows preceding the current one.
Review Comment:
```suggestion
If you do not supply an `offset`, returns the value evaluated at the row
preceding the current row. Specify an offset number `n` to return the value
evaluated at `n` rows preceding the current one.
```
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used
for the windowed aggregation. The general syntax is:
+
+ROWS AND RANGE accept the following values start and end :
+- UNBOUND PRECEDING - from the beginning of the partition as order by the
order expression
+- N ROWS PRECEDING - N rows before the current row as ordered by the order
expression
+- CURRENT ROW - the current row
+- N ROWS FOLLOWING - N rows after the current row as ordered by the order
expression
+- UNBOUNDED FOLLOWING - to the end of the partition as ordered by the order
expression
+
+See (Example with window frames)(#example-with-window-frames) for more detail.
+
+Druid applies the GROUP BY dimensions first before calculating all non-window
aggregation functions. Then it applies the window function over the aggregate
results.
:::note
Sometimes windows are called partitions. However, the partitioning for window
functions are a shuffle (partition) of the result set created at query time and
is not to be confused with Druid's segment partitioning feature which
partitions data at ingest time.
:::
-The following OVER clause example sets the window dimension to `channel` and
orders the results by the absolute value of `delta` ascending:
+### ORDER BY windows
+
+When the window definition only specifies ORDER BY , it sorts the aggregate
data set and applies the function in that order.
+
+The following query uses ORDER BY SUM(delta) DESC to rank user hourly activity
from the most changed the least changed within an hour:
```sql
-...
-RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
-...
+SELECT
+ TIME_FLOOR(__time, 'PT1H') as time_hour,
+ channel,
+ user,
+ SUM(delta) net_user_changes,
+ RANK( ) OVER ( ORDER BY SUM(delta) DESC ) editing_rank
+FROM "wikipedia"
+WHERE channel IN ('#kk.wikipedia', '#lt.wikipedia')
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY TIME_FLOOR(__time, 'PT1H'), channel, user
+ORDER BY 5
```
+### PARTITION BY windows
+
+When a window only specifies PARTITION BY partition expression, Druid
calculates the aggregate window function over all the rows that share a values
within the selected dataset.
+
+The following example demonstrates a query that uses two different windows
PARTITION BY channel and PARTITION BY user to calculate the total activity in
the channel and total activity by the user so that they can be compared to
individual hourly activity:
+
+```sql
+SELECT
+ TIME_FLOOR(__time, 'PT1H') as time_hour, channel, user,
+ SUM(delta) hourly_user_changes,
Review Comment:
```suggestion
SUM(delta) AS hourly_user_changes,
```
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used
for the windowed aggregation. The general syntax is:
+
+ROWS AND RANGE accept the following values start and end :
+- UNBOUND PRECEDING - from the beginning of the partition as order by the
order expression
+- N ROWS PRECEDING - N rows before the current row as ordered by the order
expression
+- CURRENT ROW - the current row
+- N ROWS FOLLOWING - N rows after the current row as ordered by the order
expression
+- UNBOUNDED FOLLOWING - to the end of the partition as ordered by the order
expression
+
+See (Example with window frames)(#example-with-window-frames) for more detail.
+
+Druid applies the GROUP BY dimensions first before calculating all non-window
aggregation functions. Then it applies the window function over the aggregate
results.
:::note
Sometimes windows are called partitions. However, the partitioning for window
functions are a shuffle (partition) of the result set created at query time and
is not to be confused with Druid's segment partitioning feature which
partitions data at ingest time.
:::
-The following OVER clause example sets the window dimension to `channel` and
orders the results by the absolute value of `delta` ascending:
+### ORDER BY windows
+
+When the window definition only specifies ORDER BY , it sorts the aggregate
data set and applies the function in that order.
+
+The following query uses ORDER BY SUM(delta) DESC to rank user hourly activity
from the most changed the least changed within an hour:
```sql
-...
-RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
-...
+SELECT
+ TIME_FLOOR(__time, 'PT1H') as time_hour,
+ channel,
+ user,
+ SUM(delta) net_user_changes,
+ RANK( ) OVER ( ORDER BY SUM(delta) DESC ) editing_rank
+FROM "wikipedia"
+WHERE channel IN ('#kk.wikipedia', '#lt.wikipedia')
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY TIME_FLOOR(__time, 'PT1H'), channel, user
+ORDER BY 5
```
+### PARTITION BY windows
+
+When a window only specifies PARTITION BY partition expression, Druid
calculates the aggregate window function over all the rows that share a values
within the selected dataset.
+
+The following example demonstrates a query that uses two different windows
PARTITION BY channel and PARTITION BY user to calculate the total activity in
the channel and total activity by the user so that they can be compared to
individual hourly activity:
+
+```sql
+SELECT
+ TIME_FLOOR(__time, 'PT1H') as time_hour, channel, user,
+ SUM(delta) hourly_user_changes,
+ SUM(SUM(delta)) OVER (PARTITION BY user ) AS total_user_changes,
+ SUM(SUM(delta)) OVER (PARTITION BY channel ) AS total_channel_changes
+FROM "wikipedia"
+WHERE channel IN ('#kk.wikipedia', '#lt.wikipedia')
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY TIME_FLOOR(__time, 'PT1H'),2,3
+ORDER BY channel,TIME_FLOOR(__time, 'PT1H'), user
+```
+
+In this example, the dataset is filtered for a single day, therefore the
window function results represent the total activity for the day, for the
`user` and for the `channel` respectively.
+
+This type of result helps you analyze the impact of an individual user's
hourly activity:
+- the impact to the channel by comparing `hourly_user_changes` to
`total_channel_changes`
+- the impact of each user over the channel by `total_user_changes` to
`total_channel_changes`
+- the progress of each user's individual activity by comparing
`hourly_user_changes` to `total_user_changes`
+
+
+
+
+#### Window frame guardrails
+
+Druid has guardrail logic to prevent you from executing window function
queries with window frame expressions that might return unexpected results.
+
+For example:
+- You cannot set expressions as bounds for window frames.
+- You cannot use two FOLLOWING expressions in the window frame. For example:
"rows between 2 FOLLOWING and 3 FOLLOWING"
Review Comment:
```suggestion
- You cannot use two FOLLOWING expressions in the window frame. For example:
`ROWS BETWEEN 2 ROWS FOLLOWING and 3 ROWS FOLLOWING`.
```
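For context on this guardrail, the frame shape in question is valid in standard SQL engines, so the restriction appears to be Druid-specific; the sketch below runs a both-bounds-FOLLOWING frame on SQLite (not Druid) with an invented table.

```python
import sqlite3

# Illustration (SQLite, not Druid): standard SQL permits a frame whose
# bounds are both FOLLOWING. The final row's frame is empty, so its SUM
# is NULL (None in Python).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,), (4,)])
sums = [r[0] for r in conn.execute("""
    SELECT SUM(x) OVER (ORDER BY x ROWS BETWEEN 1 FOLLOWING AND 2 FOLLOWING)
    FROM t
    ORDER BY x
""")]
print(sums)  # [5, 7, 4, None]
```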
##########
docs/querying/sql-window-functions.md:
##########
@@ -88,40 +88,135 @@ You can use the OVER clause to treat other Druid
aggregation functions as window
Window functions support aliasing.
-## Define a window with the OVER clause
+## Window function syntax
+
+In general, Druid Window functions use the following syntax:
+
+
+```sql
+window_function() OVER (
+ [PARTITION BY partitioning expression]
+ [ORDER BY order expression]
+ [[ROWS, RANGE] BETWEEN range start AND range end])
+FROM table
+GROUP BY dimensions
+```
+
+```sql
+window_function() OVER w
+FROM table
+WINDOW w AS ([PARTITION BY ] [ORDER BY]
+ [[ROWS, RANGE] BETWEEN range start AND range end] )
+GROUP BY dimensions
+```
The OVER clause defines the query windows for window functions as follows:
- PARTITION BY indicates the dimension that defines the rows within the window
-- ORDER BY specifies the order of the rows within the windows.
+- ORDER BY specifies the order of the rows within the windows
+
+An empty OVER clause or the absence of a PARTITION BY clause indicates that
all data belongs to a single window.
+
+For example, the following OVER clause example sets the window dimension to
`channel` and orders the results by the absolute value of `delta` ascending:
+
+```sql
+...
+RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
+...
+```
+
+Window frames, set in ROWS and RANGE expressions, limit the set of rows used
for the windowed aggregation. The general syntax is:
+
+ROWS AND RANGE accept the following values start and end :
+- UNBOUND PRECEDING - from the beginning of the partition as order by the
order expression
+- N ROWS PRECEDING - N rows before the current row as ordered by the order
expression
+- CURRENT ROW - the current row
+- N ROWS FOLLOWING - N rows after the current row as ordered by the order
expression
+- UNBOUNDED FOLLOWING - to the end of the partition as ordered by the order
expression
+
+See (Example with window frames)(#example-with-window-frames) for more detail.
+
+Druid applies the GROUP BY dimensions first before calculating all non-window
aggregation functions. Then it applies the window function over the aggregate
results.
:::note
Sometimes windows are called partitions. However, the partitioning for window
functions are a shuffle (partition) of the result set created at query time and
is not to be confused with Druid's segment partitioning feature which
partitions data at ingest time.
:::
-The following OVER clause example sets the window dimension to `channel` and
orders the results by the absolute value of `delta` ascending:
+### ORDER BY windows
+
+When the window definition only specifies ORDER BY , it sorts the aggregate
data set and applies the function in that order.
+
+The following query uses ORDER BY SUM(delta) DESC to rank user hourly activity
from the most changed the least changed within an hour:
```sql
-...
-RANK() OVER (PARTITION BY channel ORDER BY ABS(delta) ASC)
-...
+SELECT
+ TIME_FLOOR(__time, 'PT1H') as time_hour,
+ channel,
+ user,
+ SUM(delta) net_user_changes,
+ RANK( ) OVER ( ORDER BY SUM(delta) DESC ) editing_rank
+FROM "wikipedia"
+WHERE channel IN ('#kk.wikipedia', '#lt.wikipedia')
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY TIME_FLOOR(__time, 'PT1H'), channel, user
+ORDER BY 5
```
+### PARTITION BY windows
+
+When a window only specifies PARTITION BY partition expression, Druid
calculates the aggregate window function over all the rows that share a values
within the selected dataset.
+
+The following example demonstrates a query that uses two different windows
PARTITION BY channel and PARTITION BY user to calculate the total activity in
the channel and total activity by the user so that they can be compared to
individual hourly activity:
+
+```sql
+SELECT
+ TIME_FLOOR(__time, 'PT1H') as time_hour, channel, user,
+ SUM(delta) hourly_user_changes,
+ SUM(SUM(delta)) OVER (PARTITION BY user ) AS total_user_changes,
+ SUM(SUM(delta)) OVER (PARTITION BY channel ) AS total_channel_changes
+FROM "wikipedia"
+WHERE channel IN ('#kk.wikipedia', '#lt.wikipedia')
+ AND __time BETWEEN '2016-06-27' AND '2016-06-28'
+GROUP BY TIME_FLOOR(__time, 'PT1H'),2,3
Review Comment:
```suggestion
GROUP BY TIME_FLOOR(__time, 'PT1H'), 2, 3
```
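The mixed positional references in the suggested `GROUP BY` can be sketched on SQLite (not Druid; table and data invented for illustration): integer constants in `GROUP BY` refer back to columns of the SELECT list by position.

```python
import sqlite3

# Sketch (SQLite, not Druid): ordinals in GROUP BY refer to SELECT-list
# columns by position, so GROUP BY 1, 2 groups on channel and editor.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edits (channel TEXT, editor TEXT, delta INTEGER)")
conn.executemany("INSERT INTO edits VALUES (?, ?, ?)",
                 [("#a", "u1", 5), ("#a", "u1", 7), ("#b", "u2", 3)])
rows = conn.execute("""
    SELECT channel, editor, SUM(delta)
    FROM edits
    GROUP BY 1, 2
    ORDER BY 1, 2
""").fetchall()
print(rows)  # [('#a', 'u1', 12), ('#b', 'u2', 3)]
```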
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]