[DOCS] Window function docs

Bruce Momjian Thu, 06 Apr 2017 14:49:52 -0700

As part of writing a talk about window functions, I have done some
cleanups of the window function docs, attached, that I would like to
apply to head.


-- 
  Bruce Momjian  <[email protected]>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +

diff --git a/doc/src/sgml/advanced.sgml b/doc/src/sgml/advanced.sgml
index 1e45511..59c1e7e 100644
--- a/doc/src/sgml/advanced.sgml
+++ b/doc/src/sgml/advanced.sgml
@@ -328,7 +328,7 @@ COMMIT;
     A <firstterm>window function</> performs a calculation across a set of
     table rows that are somehow related to the current row.  This is comparable
     to the type of calculation that can be done with an aggregate function.
-    But unlike regular aggregate functions, use of a window function does not
+    But unlike normal aggregate functions, use of a window function does not
     cause rows to become grouped into a single output row &mdash; the
     rows retain their separate identities.  Behind the scenes, the window
     function is able to access more than just the current row of the query
@@ -363,20 +363,19 @@ SELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname) FROM emps
     <structname>empsalary</>, and there is one output row for each row in the
     table.  The fourth column represents an average taken across all the table
     rows that have the same <structfield>depname</> value as the current row.
-    (This actually is the same function as the regular <function>avg</>
+    (This actually is the same function as the normal <function>avg</>
     aggregate function, but the <literal>OVER</> clause causes it to be
-    treated as a window function and computed across an appropriate set of
-    rows.)
+    treated as a window function and computed across the entire partition.)
    </para>
 
    <para>
     A window function call always contains an <literal>OVER</> clause
     directly following the window function's name and argument(s).  This is what
-    syntactically distinguishes it from a regular function or aggregate
+    syntactically distinguishes it from a normal function or aggregate
     function.  The <literal>OVER</> clause determines exactly how the
     rows of the query are split up for processing by the window function.
-    The <literal>PARTITION BY</> list within <literal>OVER</> specifies
-    dividing the rows into groups, or partitions, that share the same
+    The <literal>PARTITION BY</> clause within <literal>OVER</>
+    divides the rows into groups, or partitions, that share the same
     values of the <literal>PARTITION BY</> expression(s).  For each row,
     the window function is computed across the rows that fall into the
     same partition as the current row.
@@ -411,8 +410,8 @@ FROM empsalary;
 </screen>
 
     As shown here, the <function>rank</> function produces a numerical rank
-    within the current row's partition for each distinct <literal>ORDER BY</>
-    value, in the order defined by the <literal>ORDER BY</> clause.
+    for each distinct <literal>ORDER BY</> value in the current row's
+    partition, in the order defined by the <literal>ORDER BY</> clause.
     <function>rank</> needs no explicit parameter, because its behavior
     is entirely determined by the <literal>OVER</> clause.
    </para>
@@ -424,20 +423,20 @@ FROM empsalary;
     if any.  For example, a row removed because it does not meet the
     <literal>WHERE</> condition is not seen by any window function.
     A query can contain multiple window functions that slice up the data
-    in different ways by means of different <literal>OVER</> clauses, but
+    in different ways using different <literal>OVER</> clauses, but
     they all act on the same collection of rows defined by this virtual table.
    </para>
 
    <para>
     We already saw that <literal>ORDER BY</> can be omitted if the ordering
     of rows is not important.  It is also possible to omit <literal>PARTITION
-    BY</>, in which case there is just one partition containing all the rows.
+    BY</>, in which case there is a single partition containing all rows.
    </para>
 
    <para>
     There is another important concept associated with window functions:
     for each row, there is a set of rows within its partition called its
-    <firstterm>window frame</>.  Many (but not all) window functions act only
+    <firstterm>window frame</>.  Some window functions act only
     on the rows of the window frame, rather than of the whole partition.
     By default, if <literal>ORDER BY</> is supplied then the frame consists of
     all rows from the start of the partition up through the current row, plus
@@ -515,7 +514,7 @@ SELECT salary, sum(salary) OVER (ORDER BY salary) FROM empsalary;
     elsewhere, such as in <literal>GROUP BY</>, <literal>HAVING</>
     and <literal>WHERE</literal> clauses.  This is because they logically
     execute after the processing of those clauses.  Also, window functions
-    execute after regular aggregate functions.  This means it is valid to
+    execute after normal aggregate functions.  This means it is valid to
     include an aggregate function call in the arguments of a window function,
     but not vice versa.
    </para>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e3d852c..91c76a3 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -14413,17 +14413,17 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
   <para>
    The built-in window functions are listed in
    <xref linkend="functions-window-table">.  Note that these functions
-   <emphasis>must</> be invoked using window function syntax; that is an
+   <emphasis>must</> be invoked using window function syntax, i.e. an
    <literal>OVER</> clause is required.
   </para>
 
   <para>
    In addition to these functions, any built-in or user-defined normal
-   aggregate function (but not ordered-set or hypothetical-set aggregates)
+   aggregate function (i.e. not ordered-set or hypothetical-set aggregates)
    can be used as a window function; see
    <xref linkend="functions-aggregate"> for a list of the built-in aggregates.
-   Aggregate functions act as window functions only when an <literal>OVER</>
-   clause follows the call; otherwise they act as regular aggregates.
+   Normal aggregate functions act as window functions only when an <literal>OVER</>
+   clause follows the call; otherwise they act as normal aggregates.
   </para>
 
   <table id="functions-window-table">
@@ -14488,7 +14488,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
       <entry>
        <type>double precision</type>
       </entry>
-      <entry>relative rank of the current row: (<function>rank</> - 1) / (total rows - 1)</entry>
+      <entry>relative rank of the current row: (<function>rank</> - 1) / (total partition rows - 1)</entry>
      </row>
 
      <row>
@@ -14501,7 +14501,7 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
       <entry>
        <type>double precision</type>
       </entry>
-      <entry>relative rank of the current row: (number of rows preceding or peer with current row) / (total rows)</entry>
+      <entry>cumulative distribution: (number of partition rows preceding or peer with current row) / total partition rows</entry>
      </row>
 
      <row>
@@ -14634,9 +14634,10 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
    All of the functions listed in
    <xref linkend="functions-window-table"> depend on the sort ordering
    specified by the <literal>ORDER BY</> clause of the associated window
-   definition.  Rows that are not distinct in the <literal>ORDER BY</>
-   ordering are said to be <firstterm>peers</>; the four ranking functions
-   are defined so that they give the same answer for any two peer rows.
+   definition.  Rows that are not distinct based on <literal>ORDER BY</>
+   ordering are said to be <firstterm>peers</>.  The four ranking
+   functions (includes <function>cume_dist</>) are defined so that they
+   give the same answer for any two peer rows.
   </para>
 
   <para>
@@ -14646,14 +14647,13 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
    partition through the last peer of the current row.  This is
    likely to give unhelpful results for <function>last_value</> and
    sometimes also <function>nth_value</>.  You can redefine the frame by
-   adding a suitable frame specification (<literal>RANGE</> or
-   <literal>ROWS</>) to the <literal>OVER</> clause.
-   See <xref linkend="syntax-window-functions"> for more information
-   about frame specifications.
+   modifying the frame specification;  see <xref
+   linkend="syntax-window-functions"> for more information about frame
+   specifications.
   </para>
 
   <para>
-   When an aggregate function is used as a window function, it aggregates
+   When a normal aggregate function is used as a window function, it aggregates
    over the rows within the current row's window frame.
    An aggregate used with <literal>ORDER BY</> and the default window frame
    definition produces a <quote>running sum</> type of behavior, which may or
@@ -14679,6 +14679,13 @@ SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
    </para>
   </note>
 
+  <para>
+   <function>cume_dist</> computes the ratio of partition rows that
+   are less than or equal to the current row and its peers, while
+   <function>percent_rank</> computes the ratio of lesser partition rows,
+   assuming the current row is does not exist in the partition.
+  </para>
+
  </sect1>
 
  <sect1 id="functions-subquery">
diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
index 40f722c..bdb4d52 100644
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@@ -1779,10 +1779,10 @@ FROM generate_series(1,10) AS s(i);
    <para>
     A <firstterm>window function call</firstterm> represents the application
     of an aggregate-like function over some portion of the rows selected
-    by a query.  Unlike regular aggregate function calls, this is not tied
+    by a query.  Unlike normal aggregate function calls, this is not tied
     to grouping of the selected rows into a single output row &mdash; each
     row remains separate in the query output.  However the window function
-    is able to scan all the rows that would be part of the current row's
+    has access to all the rows that would be part of the current row's
     group according to the grouping specification (<literal>PARTITION BY</>
     list) of the window function call.
     The syntax of a window function call is one of the following:
@@ -1831,20 +1831,20 @@ UNBOUNDED FOLLOWING
     named window in the <literal>WINDOW</literal> clause; see the
     <xref linkend="sql-select"> reference page for details.  It's worth
     pointing out that <literal>OVER wname</> is not exactly equivalent to
-    <literal>OVER (wname)</>; the latter implies copying and modifying the
+    <literal>OVER (wname ...)</>; the latter implies copying and modifying the
     window definition, and will be rejected if the referenced window
     specification includes a frame clause.
    </para>
 
    <para>
-    The <literal>PARTITION BY</> option groups the rows of the query into
+    The <literal>PARTITION BY</> clause groups the rows of the query into
     <firstterm>partitions</>, which are processed separately by the window
     function.  <literal>PARTITION BY</> works similarly to a query-level
     <literal>GROUP BY</> clause, except that its expressions are always just
     expressions and cannot be output-column names or numbers.
     Without <literal>PARTITION BY</>, all rows produced by the query are
     treated as a single partition.
-    The <literal>ORDER BY</> option determines the order in which the rows
+    The <literal>ORDER BY</> clause determines the order in which the rows
     of a partition are processed by the window function.  It works similarly
     to a query-level <literal>ORDER BY</> clause, but likewise cannot use
     output-column names or numbers.  Without <literal>ORDER BY</>, rows are
@@ -1922,8 +1922,8 @@ UNBOUNDED FOLLOWING
     The built-in window functions are described in <xref
     linkend="functions-window-table">.  Other window functions can be added by
     the user.  Also, any built-in or user-defined normal aggregate function
-    can be used as a window function.  Ordered-set aggregates presently
-    cannot be used as window functions, however.
+    can be used as a window function.  However, ordered-set aggregate
+    cannot presently be used as window functions.
    </para>
 
    <para>

-- 
Sent via pgsql-docs mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs

[DOCS] Window function docs

Reply via email to