As part of writing a talk about window functions, I have done some cleanups of the window function docs, attached, that I would like to apply to head.
-- Bruce Momjian <br...@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
diff --git a/doc/src/sgml/advanced.sgml b/doc/src/sgml/advanced.sgml index 1e45511..59c1e7e 100644 --- a/doc/src/sgml/advanced.sgml +++ b/doc/src/sgml/advanced.sgml @@ -328,7 +328,7 @@ COMMIT; A <firstterm>window function</> performs a calculation across a set of table rows that are somehow related to the current row. This is comparable to the type of calculation that can be done with an aggregate function. - But unlike regular aggregate functions, use of a window function does not + But unlike normal aggregate functions, use of a window function does not cause rows to become grouped into a single output row — the rows retain their separate identities. Behind the scenes, the window function is able to access more than just the current row of the query @@ -363,20 +363,19 @@ SELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname) FROM emps <structname>empsalary</>, and there is one output row for each row in the table. The fourth column represents an average taken across all the table rows that have the same <structfield>depname</> value as the current row. - (This actually is the same function as the regular <function>avg</> + (This actually is the same function as the normal <function>avg</> aggregate function, but the <literal>OVER</> clause causes it to be - treated as a window function and computed across an appropriate set of - rows.) + treated as a window function and computed across the entire partition.) </para> <para> A window function call always contains an <literal>OVER</> clause directly following the window function's name and argument(s). This is what - syntactically distinguishes it from a regular function or aggregate + syntactically distinguishes it from a normal function or aggregate function. The <literal>OVER</> clause determines exactly how the rows of the query are split up for processing by the window function. - The <literal>PARTITION BY</> list within <literal>OVER</> specifies - dividing the rows into groups, or partitions, that share the same + The <literal>PARTITION BY</> clause within <literal>OVER</> + divides the rows into groups, or partitions, that share the same values of the <literal>PARTITION BY</> expression(s). For each row, the window function is computed across the rows that fall into the same partition as the current row. @@ -411,8 +410,8 @@ FROM empsalary; </screen> As shown here, the <function>rank</> function produces a numerical rank - within the current row's partition for each distinct <literal>ORDER BY</> - value, in the order defined by the <literal>ORDER BY</> clause. + for each distinct <literal>ORDER BY</> value in the current row's + partition, in the order defined by the <literal>ORDER BY</> clause. <function>rank</> needs no explicit parameter, because its behavior is entirely determined by the <literal>OVER</> clause. </para> @@ -424,20 +423,20 @@ FROM empsalary; if any. For example, a row removed because it does not meet the <literal>WHERE</> condition is not seen by any window function. A query can contain multiple window functions that slice up the data - in different ways by means of different <literal>OVER</> clauses, but + in different ways using different <literal>OVER</> clauses, but they all act on the same collection of rows defined by this virtual table. </para> <para> We already saw that <literal>ORDER BY</> can be omitted if the ordering of rows is not important. It is also possible to omit <literal>PARTITION - BY</>, in which case there is just one partition containing all the rows. + BY</>, in which case there is a single partition containing all rows. </para> <para> There is another important concept associated with window functions: for each row, there is a set of rows within its partition called its - <firstterm>window frame</>. Many (but not all) window functions act only + <firstterm>window frame</>. Some window functions act only on the rows of the window frame, rather than of the whole partition. By default, if <literal>ORDER BY</> is supplied then the frame consists of all rows from the start of the partition up through the current row, plus @@ -515,7 +514,7 @@ SELECT salary, sum(salary) OVER (ORDER BY salary) FROM empsalary; elsewhere, such as in <literal>GROUP BY</>, <literal>HAVING</> and <literal>WHERE</literal> clauses. This is because they logically execute after the processing of those clauses. Also, window functions - execute after regular aggregate functions. This means it is valid to + execute after normal aggregate functions. This means it is valid to include an aggregate function call in the arguments of a window function, but not vice versa. </para> diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index e3d852c..91c76a3 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -14413,17 +14413,17 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab; <para> The built-in window functions are listed in <xref linkend="functions-window-table">. Note that these functions - <emphasis>must</> be invoked using window function syntax; that is an + <emphasis>must</> be invoked using window function syntax, i.e. an <literal>OVER</> clause is required. </para> <para> In addition to these functions, any built-in or user-defined normal - aggregate function (but not ordered-set or hypothetical-set aggregates) + aggregate function (i.e. not ordered-set or hypothetical-set aggregates) can be used as a window function; see <xref linkend="functions-aggregate"> for a list of the built-in aggregates. - Aggregate functions act as window functions only when an <literal>OVER</> - clause follows the call; otherwise they act as regular aggregates. + Normal aggregate functions act as window functions only when an <literal>OVER</> + clause follows the call; otherwise they act as normal aggregates. </para> <table id="functions-window-table"> @@ -14488,7 +14488,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab; <entry> <type>double precision</type> </entry> - <entry>relative rank of the current row: (<function>rank</> - 1) / (total rows - 1)</entry> + <entry>relative rank of the current row: (<function>rank</> - 1) / (total partition rows - 1)</entry> </row> <row> @@ -14501,7 +14501,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab; <entry> <type>double precision</type> </entry> - <entry>relative rank of the current row: (number of rows preceding or peer with current row) / (total rows)</entry> + <entry>cumulative distribution: (number of partition rows preceding or peer with current row) / total partition rows</entry> </row> <row> @@ -14634,9 +14634,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab; All of the functions listed in <xref linkend="functions-window-table"> depend on the sort ordering specified by the <literal>ORDER BY</> clause of the associated window - definition. Rows that are not distinct in the <literal>ORDER BY</> - ordering are said to be <firstterm>peers</>; the four ranking functions - are defined so that they give the same answer for any two peer rows. + definition. Rows that are not distinct based on <literal>ORDER BY</> + ordering are said to be <firstterm>peers</>. The four ranking + functions (includes <function>cume_dist</>) are defined so that they + give the same answer for any two peer rows. </para> <para> @@ -14646,14 +14647,13 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab; partition through the last peer of the current row. This is likely to give unhelpful results for <function>last_value</> and sometimes also <function>nth_value</>. You can redefine the frame by - adding a suitable frame specification (<literal>RANGE</> or - <literal>ROWS</>) to the <literal>OVER</> clause. - See <xref linkend="syntax-window-functions"> for more information - about frame specifications. + modifying the frame specification; see <xref + linkend="syntax-window-functions"> for more information about frame + specifications. </para> <para> - When an aggregate function is used as a window function, it aggregates + When a normal aggregate function is used as a window function, it aggregates over the rows within the current row's window frame. An aggregate used with <literal>ORDER BY</> and the default window frame definition produces a <quote>running sum</> type of behavior, which may or @@ -14679,6 +14679,13 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab; </para> </note> + <para> + <function>cume_dist</> computes the ratio of partition rows that + are less than or equal to the current row and its peers, while + <function>percent_rank</> computes the ratio of lesser partition rows, + assuming the current row is does not exist in the partition. + </para> + </sect1> <sect1 id="functions-subquery"> diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml index 40f722c..bdb4d52 100644 --- a/doc/src/sgml/syntax.sgml +++ b/doc/src/sgml/syntax.sgml @@ -1779,10 +1779,10 @@ FROM generate_series(1,10) AS s(i); <para> A <firstterm>window function call</firstterm> represents the application of an aggregate-like function over some portion of the rows selected - by a query. Unlike regular aggregate function calls, this is not tied + by a query. Unlike normal aggregate function calls, this is not tied to grouping of the selected rows into a single output row — each row remains separate in the query output. However the window function - is able to scan all the rows that would be part of the current row's + has access to all the rows that would be part of the current row's group according to the grouping specification (<literal>PARTITION BY</> list) of the window function call. The syntax of a window function call is one of the following: @@ -1831,20 +1831,20 @@ UNBOUNDED FOLLOWING named window in the <literal>WINDOW</literal> clause; see the <xref linkend="sql-select"> reference page for details. It's worth pointing out that <literal>OVER wname</> is not exactly equivalent to - <literal>OVER (wname)</>; the latter implies copying and modifying the + <literal>OVER (wname ...)</>; the latter implies copying and modifying the window definition, and will be rejected if the referenced window specification includes a frame clause. </para> <para> - The <literal>PARTITION BY</> option groups the rows of the query into + The <literal>PARTITION BY</> clause groups the rows of the query into <firstterm>partitions</>, which are processed separately by the window function. <literal>PARTITION BY</> works similarly to a query-level <literal>GROUP BY</> clause, except that its expressions are always just expressions and cannot be output-column names or numbers. Without <literal>PARTITION BY</>, all rows produced by the query are treated as a single partition. - The <literal>ORDER BY</> option determines the order in which the rows + The <literal>ORDER BY</> clause determines the order in which the rows of a partition are processed by the window function. It works similarly to a query-level <literal>ORDER BY</> clause, but likewise cannot use output-column names or numbers. Without <literal>ORDER BY</>, rows are @@ -1922,8 +1922,8 @@ UNBOUNDED FOLLOWING The built-in window functions are described in <xref linkend="functions-window-table">. Other window functions can be added by the user. Also, any built-in or user-defined normal aggregate function - can be used as a window function. Ordered-set aggregates presently - cannot be used as window functions, however. + can be used as a window function. However, ordered-set aggregate + cannot presently be used as window functions. </para> <para>
-- Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-docs