New version, a bit more ambitious.  I think it's better to describe
behavior for partitioned tables ahead of inheritance.  Also, in the
ANALYZE reference page I split the topic in two: in one single paragraph
we now describe what happens with manual analyze for partitioned tables
and inheritance hierarchies; we describe the behavior of autovacuum in
one separate paragraph for each type of hierarchy, since the differences
are stark.

I noticed that difference while verifying the behavior that I was to
document.  If you look at ANALYZE VERBOSE output, it seems a bit
wasteful:

create table part (a int) partition by list (a);
create table part0 partition of part for values in (0);
create table part1 partition of part for values in (1);
create table part23 partition of part for values in (2, 3) partition by list 
(a);
create table part2 partition of part23 for values in (2);
create table part3 partition of part23 for values in (3);
insert into part select g%4 from generate_series(1, 50000000) g;

analyze verbose part;

INFO:  analyzing "public.part" inheritance tree
INFO:  "part1": scanned 7500 of 55310 pages, containing 1695000 live rows and 0 
dead rows; 7500 rows in sample, 12500060 estimated total rows
INFO:  "part2": scanned 7500 of 55310 pages, containing 1695000 live rows and 0 
dead rows; 7500 rows in sample, 12500060 estimated total rows
INFO:  "part3": scanned 7500 of 55310 pages, containing 1695000 live rows and 0 
dead rows; 7500 rows in sample, 12500060 estimated total rows
INFO:  "part4": scanned 7500 of 55310 pages, containing 1695000 live rows and 0 
dead rows; 7500 rows in sample, 12500060 estimated total rows
INFO:  analyzing "public.part1"
INFO:  "part1": scanned 30000 of 55310 pages, containing 6779940 live rows and 
0 dead rows; 30000 rows in sample, 12499949 estimated total rows
INFO:  analyzing "public.part2"
INFO:  "part2": scanned 30000 of 55310 pages, containing 6779940 live rows and 
0 dead rows; 30000 rows in sample, 12499949 estimated total rows
INFO:  analyzing "public.part34" inheritance tree
INFO:  "part3": scanned 15000 of 55310 pages, containing 3390000 live rows and 
0 dead rows; 15000 rows in sample, 12500060 estimated total rows
INFO:  "part4": scanned 15000 of 55310 pages, containing 3389940 live rows and 
0 dead rows; 15000 rows in sample, 12499839 estimated total rows
INFO:  analyzing "public.part3"
INFO:  "part3": scanned 30000 of 55310 pages, containing 6780000 live rows and 
0 dead rows; 30000 rows in sample, 12500060 estimated total rows
INFO:  analyzing "public.part4"
INFO:  "part4": scanned 30000 of 55310 pages, containing 6780000 live rows and 
0 dead rows; 30000 rows in sample, 12500060 estimated total rows
ANALYZE

-- 
Álvaro Herrera       Valdivia, Chile
"The eagle never lost so much time, as
when he submitted to learn of the crow." (William Blake)
>From 6961e64a3ad5bfd10a14f544c470dbb93f9aadc3 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvhe...@alvh.no-ip.org>
Date: Thu, 13 May 2021 16:24:11 -0400
Subject: [PATCH] update docs on analyze on partitioned tables

---
 doc/src/sgml/maintenance.sgml    |  5 ++++
 doc/src/sgml/perform.sgml        |  3 ++-
 doc/src/sgml/ref/analyze.sgml    | 40 +++++++++++++++++++++++---------
 doc/src/sgml/ref/pg_restore.sgml |  6 +++--
 4 files changed, 40 insertions(+), 14 deletions(-)

diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index de7fd75e1c..b390debf2e 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -817,6 +817,11 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
 </programlisting>
     is compared to the total number of tuples inserted, updated, or deleted
     since the last <command>ANALYZE</command>.
+    For partitioned tables, inserts and updates on partitions are counted
+    towards this threshold; however partition meta-operations such as
+    attachment, detachment or drop are not, so running a manual
+    <command>ANALYZE</command> is recommended if the partition added or
+    removed contains a statistically significant volume of data.
    </para>
 
    <para>
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index 89ff58338e..ddd6c3ff3e 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -1767,7 +1767,8 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
    <para>
     Whenever you have significantly altered the distribution of data
     within a table, running <link linkend="sql-analyze"><command>ANALYZE</command></link> is strongly recommended. This
-    includes bulk loading large amounts of data into the table.  Running
+    includes bulk loading large amounts of data into the table as well as
+    attaching, detaching or dropping partitions.  Running
     <command>ANALYZE</command> (or <command>VACUUM ANALYZE</command>)
     ensures that the planner has up-to-date statistics about the
     table.  With no statistics or obsolete statistics, the planner might
diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index c8fcebc161..8f8d3af985 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -250,20 +250,38 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea
   </para>
 
   <para>
-    If the table being analyzed has one or more children,
-    <command>ANALYZE</command> will gather statistics twice: once on the
-    rows of the parent table only, and a second time on the rows of the
-    parent table with all of its children.  This second set of statistics
-    is needed when planning queries that traverse the entire inheritance
-    tree.  The autovacuum daemon, however, will only consider inserts or
-    updates on the parent table itself when deciding whether to trigger an
-    automatic analyze for that table.  If that table is rarely inserted into
-    or updated, the inheritance statistics will not be up to date unless you
-    run <command>ANALYZE</command> manually.
+   If the table being analyzed is partitioned, <command>ANALYZE</command>
+   will gather statistics by sampling block randomly from its partitions;
+   and in addition it will analyze each individual partition separately.
+   (However, in multi-level partitioning scenarios, each leaf partition
+   will only be analyzed once.)
+   By constrast, if the table being analyzed has inheritance children,
+   <command>ANALYZE</command> will gather statistics for that table twice:
+   once on the rows of the parent table only, and a second time on the
+   rows of the parent table with all of its children.  This second set of
+   statistics is needed when planning queries that traverse the entire
+   inheritance tree.  The children tables are not individually analyzed
+   in this case.
   </para>
 
   <para>
-    If any of the child tables are foreign tables whose foreign data wrappers
+   The autovacuum daemon counts inserts and updates in the partitions
+   to determine if auto-analyze is needed.  However, adding or
+   removing partitions does not affect the autovacuum daemon decisions,
+   so triggering a manual <command>ANALYZE</command> is recommended when
+   they occur.
+  </para>
+
+  <para>
+   For tables with inheritance children, the autovacuum daemon only
+   counts inserts and deletes in the parent table itself when deciding
+   whether to trigger an automatic analyze for that table.  If that table
+   is rarely inserted into or updated, the inheritance statistics will
+   not be up to date unless you run <command>ANALYZE</command> manually.
+  </para>
+
+  <para>
+    If any of the child tables or partitions are foreign tables whose foreign data wrappers
     do not support <command>ANALYZE</command>, those child tables are ignored while
     gathering inheritance statistics.
   </para>
diff --git a/doc/src/sgml/ref/pg_restore.sgml b/doc/src/sgml/ref/pg_restore.sgml
index 93ea937ac8..474f18c73f 100644
--- a/doc/src/sgml/ref/pg_restore.sgml
+++ b/doc/src/sgml/ref/pg_restore.sgml
@@ -922,8 +922,10 @@ CREATE DATABASE foo WITH TEMPLATE template0;
 
   <para>
    Once restored, it is wise to run <command>ANALYZE</command> on each
-   restored table so the optimizer has useful statistics; see
-   <xref linkend="vacuum-for-statistics"/> and
+   restored table so the optimizer has useful statistics.
+   If the table is a partition or an inheritance child, it may also be useful
+   to analyze the parent table.
+   See <xref linkend="vacuum-for-statistics"/> and
    <xref linkend="autovacuum"/> for more information.
   </para>
 
-- 
2.20.1

Reply via email to