As part of writing a talk about window functions, I have done some
cleanups of the window function docs, attached, that I would like to
apply to head.
--
Bruce Momjian <[email protected]> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
diff --git a/doc/src/sgml/advanced.sgml b/doc/src/sgml/advanced.sgml
index 1e45511..59c1e7e 100644
--- a/doc/src/sgml/advanced.sgml
+++ b/doc/src/sgml/advanced.sgml
@@ -328,7 +328,7 @@ COMMIT;
A <firstterm>window function</> performs a calculation across a set of
table rows that are somehow related to the current row. This is comparable
to the type of calculation that can be done with an aggregate function.
- But unlike regular aggregate functions, use of a window function does not
+ But unlike normal aggregate functions, use of a window function does not
cause rows to become grouped into a single output row — the
rows retain their separate identities. Behind the scenes, the window
function is able to access more than just the current row of the query
@@ -363,20 +363,19 @@ SELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname) FROM emps
<structname>empsalary</>, and there is one output row for each row in the
table. The fourth column represents an average taken across all the table
rows that have the same <structfield>depname</> value as the current row.
- (This actually is the same function as the regular <function>avg</>
+ (This actually is the same function as the normal <function>avg</>
aggregate function, but the <literal>OVER</> clause causes it to be
- treated as a window function and computed across an appropriate set of
- rows.)
+ treated as a window function and computed across the entire partition.)
</para>
<para>
A window function call always contains an <literal>OVER</> clause
directly following the window function's name and argument(s). This is what
- syntactically distinguishes it from a regular function or aggregate
+ syntactically distinguishes it from a normal function or aggregate
function. The <literal>OVER</> clause determines exactly how the
rows of the query are split up for processing by the window function.
- The <literal>PARTITION BY</> list within <literal>OVER</> specifies
- dividing the rows into groups, or partitions, that share the same
+ The <literal>PARTITION BY</> clause within <literal>OVER</>
+ divides the rows into groups, or partitions, that share the same
values of the <literal>PARTITION BY</> expression(s). For each row,
the window function is computed across the rows that fall into the
same partition as the current row.
@@ -411,8 +410,8 @@ FROM empsalary;
</screen>
As shown here, the <function>rank</> function produces a numerical rank
- within the current row's partition for each distinct <literal>ORDER BY</>
- value, in the order defined by the <literal>ORDER BY</> clause.
+ for each distinct <literal>ORDER BY</> value in the current row's
+ partition, in the order defined by the <literal>ORDER BY</> clause.
<function>rank</> needs no explicit parameter, because its behavior
is entirely determined by the <literal>OVER</> clause.
</para>
@@ -424,20 +423,20 @@ FROM empsalary;
if any. For example, a row removed because it does not meet the
<literal>WHERE</> condition is not seen by any window function.
A query can contain multiple window functions that slice up the data
- in different ways by means of different <literal>OVER</> clauses, but
+ in different ways using different <literal>OVER</> clauses, but
they all act on the same collection of rows defined by this virtual table.
</para>
<para>
We already saw that <literal>ORDER BY</> can be omitted if the ordering
of rows is not important. It is also possible to omit <literal>PARTITION
- BY</>, in which case there is just one partition containing all the rows.
+ BY</>, in which case there is a single partition containing all rows.
</para>
<para>
There is another important concept associated with window functions:
for each row, there is a set of rows within its partition called its
- <firstterm>window frame</>. Many (but not all) window functions act only
+ <firstterm>window frame</>. Some window functions act only
on the rows of the window frame, rather than of the whole partition.
By default, if <literal>ORDER BY</> is supplied then the frame consists of
all rows from the start of the partition up through the current row, plus
@@ -515,7 +514,7 @@ SELECT salary, sum(salary) OVER (ORDER BY salary) FROM empsalary;
elsewhere, such as in <literal>GROUP BY</>, <literal>HAVING</>
and <literal>WHERE</literal> clauses. This is because they logically
execute after the processing of those clauses. Also, window functions
- execute after regular aggregate functions. This means it is valid to
+ execute after normal aggregate functions. This means it is valid to
include an aggregate function call in the arguments of a window function,
but not vice versa.
</para>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e3d852c..91c76a3 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -14413,17 +14413,17 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<para>
The built-in window functions are listed in
<xref linkend="functions-window-table">. Note that these functions
- <emphasis>must</> be invoked using window function syntax; that is an
+ <emphasis>must</> be invoked using window function syntax, i.e. an
<literal>OVER</> clause is required.
</para>
<para>
In addition to these functions, any built-in or user-defined normal
- aggregate function (but not ordered-set or hypothetical-set aggregates)
+ aggregate function (i.e. not ordered-set or hypothetical-set aggregates)
can be used as a window function; see
<xref linkend="functions-aggregate"> for a list of the built-in aggregates.
- Aggregate functions act as window functions only when an <literal>OVER</>
- clause follows the call; otherwise they act as regular aggregates.
+ Normal aggregate functions act as window functions only when an <literal>OVER</>
+ clause follows the call; otherwise they act as normal aggregates.
</para>
<table id="functions-window-table">
@@ -14488,7 +14488,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<entry>
<type>double precision</type>
</entry>
- <entry>relative rank of the current row: (<function>rank</> - 1) / (total rows - 1)</entry>
+ <entry>relative rank of the current row: (<function>rank</> - 1) / (total partition rows - 1)</entry>
</row>
<row>
@@ -14501,7 +14501,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
<entry>
<type>double precision</type>
</entry>
- <entry>relative rank of the current row: (number of rows preceding or peer with current row) / (total rows)</entry>
+ <entry>cumulative distribution: (number of partition rows preceding or peer with current row) / total partition rows</entry>
</row>
<row>
@@ -14634,9 +14634,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
All of the functions listed in
<xref linkend="functions-window-table"> depend on the sort ordering
specified by the <literal>ORDER BY</> clause of the associated window
- definition. Rows that are not distinct in the <literal>ORDER BY</>
- ordering are said to be <firstterm>peers</>; the four ranking functions
- are defined so that they give the same answer for any two peer rows.
+ definition. Rows that are not distinct based on <literal>ORDER BY</>
+ ordering are said to be <firstterm>peers</>. The four ranking
+ functions (includes <function>cume_dist</>) are defined so that they
+ give the same answer for any two peer rows.
</para>
<para>
@@ -14646,14 +14647,13 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
partition through the last peer of the current row. This is
likely to give unhelpful results for <function>last_value</> and
sometimes also <function>nth_value</>. You can redefine the frame by
- adding a suitable frame specification (<literal>RANGE</> or
- <literal>ROWS</>) to the <literal>OVER</> clause.
- See <xref linkend="syntax-window-functions"> for more information
- about frame specifications.
+ modifying the frame specification; see <xref
+ linkend="syntax-window-functions"> for more information about frame
+ specifications.
</para>
<para>
- When an aggregate function is used as a window function, it aggregates
+ When a normal aggregate function is used as a window function, it aggregates
over the rows within the current row's window frame.
An aggregate used with <literal>ORDER BY</> and the default window frame
definition produces a <quote>running sum</> type of behavior, which may or
@@ -14679,6 +14679,13 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
</para>
</note>
+ <para>
+ <function>cume_dist</> computes the ratio of partition rows that
+ are less than or equal to the current row and its peers, while
+ <function>percent_rank</> computes the ratio of lesser partition rows,
+ assuming the current row is does not exist in the partition.
+ </para>
+
</sect1>
<sect1 id="functions-subquery">
diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
index 40f722c..bdb4d52 100644
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@@ -1779,10 +1779,10 @@ FROM generate_series(1,10) AS s(i);
<para>
A <firstterm>window function call</firstterm> represents the application
of an aggregate-like function over some portion of the rows selected
- by a query. Unlike regular aggregate function calls, this is not tied
+ by a query. Unlike normal aggregate function calls, this is not tied
to grouping of the selected rows into a single output row — each
row remains separate in the query output. However the window function
- is able to scan all the rows that would be part of the current row's
+ has access to all the rows that would be part of the current row's
group according to the grouping specification (<literal>PARTITION BY</>
list) of the window function call.
The syntax of a window function call is one of the following:
@@ -1831,20 +1831,20 @@ UNBOUNDED FOLLOWING
named window in the <literal>WINDOW</literal> clause; see the
<xref linkend="sql-select"> reference page for details. It's worth
pointing out that <literal>OVER wname</> is not exactly equivalent to
- <literal>OVER (wname)</>; the latter implies copying and modifying the
+ <literal>OVER (wname ...)</>; the latter implies copying and modifying the
window definition, and will be rejected if the referenced window
specification includes a frame clause.
</para>
<para>
- The <literal>PARTITION BY</> option groups the rows of the query into
+ The <literal>PARTITION BY</> clause groups the rows of the query into
<firstterm>partitions</>, which are processed separately by the window
function. <literal>PARTITION BY</> works similarly to a query-level
<literal>GROUP BY</> clause, except that its expressions are always just
expressions and cannot be output-column names or numbers.
Without <literal>PARTITION BY</>, all rows produced by the query are
treated as a single partition.
- The <literal>ORDER BY</> option determines the order in which the rows
+ The <literal>ORDER BY</> clause determines the order in which the rows
of a partition are processed by the window function. It works similarly
to a query-level <literal>ORDER BY</> clause, but likewise cannot use
output-column names or numbers. Without <literal>ORDER BY</>, rows are
@@ -1922,8 +1922,8 @@ UNBOUNDED FOLLOWING
The built-in window functions are described in <xref
linkend="functions-window-table">. Other window functions can be added by
the user. Also, any built-in or user-defined normal aggregate function
- can be used as a window function. Ordered-set aggregates presently
- cannot be used as window functions, however.
+ can be used as a window function. However, ordered-set aggregate
+ cannot presently be used as window functions.
</para>
<para>
--
Sent via pgsql-docs mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs