soumyava commented on code in PR #14739: URL: https://github.com/apache/druid/pull/14739#discussion_r1381152441
########## docs/querying/sql-window-functions.md: ########## @@ -0,0 +1,221 @@ +--- +id: sql-window-functions +title: Window functions +--- + +<!-- + ~ Licensed to the Apache Software Foundation (ASF) under one + ~ or more contributor license agreements. See the NOTICE file + ~ distributed with this work for additional information + ~ regarding copyright ownership. The ASF licenses this file + ~ to you under the Apache License, Version 2.0 (the + ~ License); you may not use this file except in compliance + ~ with the License. You may obtain a copy of the License at + ~ + ~ http://www.apache.org/licenses/LICENSE-2.0 + ~ + ~ Unless required by applicable law or agreed to in writing, + ~ software distributed under the License is distributed on an + ~ AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + ~ KIND, either express or implied. See the License for the + ~ specific language governing permissions and limitations + ~ under the License. + --> + +:::info + +Apache Druid supports two query languages: [Druid SQL](sql.md) and [native queries](querying.md). +This document describes the SQL language. + +Window functions are an [experimental](../development/experimental.md) feature. Development and testing are still at early stage. Feel free to try window functions and provide your feedback. + +There are known issues where ORDER BY only works on ascending order and certain options may cause errors. + +Set the context parameter `windowsAreForClosers: true` to use window functions. + +::: + +Window functions in Apache Druid produce values based upon the relationship of one row within a window of rows to the other rows within the same window. A window is a group of related rows within a result set. For example, rows with the same value for a specific dimension. + +The following example groups results with the same `channel` value into windows. For each window, the query returns the rank of each row in ascending order based upon its `delta` value. + +```sql +SELECT FLOOR(__time TO DAY) AS event_time, + channel, + ABS(delta) AS change, + RANK() OVER w AS rank_value +FROM wikipedia +WHERE channel in ('#kk.wikipedia', '#lt.wikipedia') +AND '2016-06-28' > FLOOR(__time TO DAY) > '2016-06-26' +GROUP BY channel, ABS(delta), __time +WINDOW w AS (PARTITION BY channel ORDER BY ABS(delta) ASC) +``` + +<details> +<summary> View results </summary> + +| `event_time` | `channel` | `change`| `rank_value` | +| -- | -- | -- | -- | +| `2016-06-27T00:00:00.000Z`| `#kk.wikipedia`| 1 | 1 | +| `2016-06-27T00:00:00.000Z`| `#kk.wikipedia`| 1 | 1 | +| `2016-06-27T00:00:00.000Z`| `#kk.wikipedia`| 7 | 3 | +| `2016-06-27T00:00:00.000Z`| `#kk.wikipedia`| 56 | 4 | +| `2016-06-27T00:00:00.000Z`| `#kk.wikipedia`| 56 | 4 | +| `2016-06-27T00:00:00.000Z`| `#kk.wikipedia`| 63 | 6 | +| `2016-06-27T00:00:00.000Z`| `#kk.wikipedia`| 91 | 7 | +| `2016-06-27T00:00:00.000Z`| `#kk.wikipedia`| 2440 | 8 | +| `2016-06-27T00:00:00.000Z`| `#kk.wikipedia`| 2703 | 9 | +| `2016-06-27T00:00:00.000Z`| `#kk.wikipedia`| 6900 |10 | +| `2016-06-27T00:00:00.000Z`| `#lt.wikipedia`| 1 | 1 | +| `2016-06-27T00:00:00.000Z`| `#lt.wikipedia`| 2 | 2 | +| `2016-06-27T00:00:00.000Z`| `#lt.wikipedia`| 13 | 3 | +| `2016-06-27T00:00:00.000Z`| `#lt.wikipedia`| 28 | 4 | +| `2016-06-27T00:00:00.000Z`| `#lt.wikipedia`| 53 | 5 | +| `2016-06-27T00:00:00.000Z`| `#lt.wikipedia`| 56 | 6 | +| `2016-06-27T00:00:00.000Z`| `#lt.wikipedia`| 59 | 7 | +| `2016-06-27T00:00:00.000Z`| `#lt.wikipedia`| 391 | 8 | +| `2016-06-27T00:00:00.000Z`| `#lt.wikipedia`| 894 | 9 | +| `2016-06-27T00:00:00.000Z`| `#lt.wikipedia`| 4358 | 10 | + +</details> + +Window functions are similar to [aggregation functions](./aggregations.md). + +You can use the OVER clause to treat other Druid aggregation functions as window functions. For example, the sum of a value for rows within a window. + +When working with window functions, consider the following: +- Window functions only work on GROUP BY queries. Review Comment: I tried a couple of queries like ``` SELECT RANK() OVER (PARTITION BY m1 ORDER BY m1 DESC) AS ranking, m1 FROM foo ``` Does not give any results but if I add a `GROUP BY m1` it does give results. As for this I would like to include GROUP BY in the docs as the window functions are still experimental and once we are confident that the cases run without the group by we can update it in the next release -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
