New version, a bit more ambitious. I think it's better to describe behavior for partitioned tables ahead of inheritance. Also, in the ANALYZE reference page I split the topic in two: in one single paragraph we now describe what happens with manual analyze for partitioned tables and inheritance hierarchies; we describe the behavior of autovacuum in one separate paragraph for each type of hierarchy, since the differences are stark.
I noticed that difference while verifying the behavior that I was to document. If you look at ANALYZE VERBOSE output, it seems a bit wasteful: create table part (a int) partition by list (a); create table part0 partition of part for values in (0); create table part1 partition of part for values in (1); create table part23 partition of part for values in (2, 3) partition by list (a); create table part2 partition of part23 for values in (2); create table part3 partition of part23 for values in (3); insert into part select g%4 from generate_series(1, 50000000) g; analyze verbose part; INFO: analyzing "public.part" inheritance tree INFO: "part1": scanned 7500 of 55310 pages, containing 1695000 live rows and 0 dead rows; 7500 rows in sample, 12500060 estimated total rows INFO: "part2": scanned 7500 of 55310 pages, containing 1695000 live rows and 0 dead rows; 7500 rows in sample, 12500060 estimated total rows INFO: "part3": scanned 7500 of 55310 pages, containing 1695000 live rows and 0 dead rows; 7500 rows in sample, 12500060 estimated total rows INFO: "part4": scanned 7500 of 55310 pages, containing 1695000 live rows and 0 dead rows; 7500 rows in sample, 12500060 estimated total rows INFO: analyzing "public.part1" INFO: "part1": scanned 30000 of 55310 pages, containing 6779940 live rows and 0 dead rows; 30000 rows in sample, 12499949 estimated total rows INFO: analyzing "public.part2" INFO: "part2": scanned 30000 of 55310 pages, containing 6779940 live rows and 0 dead rows; 30000 rows in sample, 12499949 estimated total rows INFO: analyzing "public.part34" inheritance tree INFO: "part3": scanned 15000 of 55310 pages, containing 3390000 live rows and 0 dead rows; 15000 rows in sample, 12500060 estimated total rows INFO: "part4": scanned 15000 of 55310 pages, containing 3389940 live rows and 0 dead rows; 15000 rows in sample, 12499839 estimated total rows INFO: analyzing "public.part3" INFO: "part3": scanned 30000 of 55310 pages, containing 6780000 live rows and 0 dead rows; 30000 rows in sample, 12500060 estimated total rows INFO: analyzing "public.part4" INFO: "part4": scanned 30000 of 55310 pages, containing 6780000 live rows and 0 dead rows; 30000 rows in sample, 12500060 estimated total rows ANALYZE -- Álvaro Herrera Valdivia, Chile "The eagle never lost so much time, as when he submitted to learn of the crow." (William Blake)
>From 6961e64a3ad5bfd10a14f544c470dbb93f9aadc3 Mon Sep 17 00:00:00 2001 From: Alvaro Herrera <alvhe...@alvh.no-ip.org> Date: Thu, 13 May 2021 16:24:11 -0400 Subject: [PATCH] update docs on analyze on partitioned tables --- doc/src/sgml/maintenance.sgml | 5 ++++ doc/src/sgml/perform.sgml | 3 ++- doc/src/sgml/ref/analyze.sgml | 40 +++++++++++++++++++++++--------- doc/src/sgml/ref/pg_restore.sgml | 6 +++-- 4 files changed, 40 insertions(+), 14 deletions(-) diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml index de7fd75e1c..b390debf2e 100644 --- a/doc/src/sgml/maintenance.sgml +++ b/doc/src/sgml/maintenance.sgml @@ -817,6 +817,11 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu </programlisting> is compared to the total number of tuples inserted, updated, or deleted since the last <command>ANALYZE</command>. + For partitioned tables, inserts and updates on partitions are counted + towards this threshold; however partition meta-operations such as + attachment, detachment or drop are not, so running a manual + <command>ANALYZE</command> is recommended if the partition added or + removed contains a statistically significant volume of data. </para> <para> diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml index 89ff58338e..ddd6c3ff3e 100644 --- a/doc/src/sgml/perform.sgml +++ b/doc/src/sgml/perform.sgml @@ -1767,7 +1767,8 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse; <para> Whenever you have significantly altered the distribution of data within a table, running <link linkend="sql-analyze"><command>ANALYZE</command></link> is strongly recommended. This - includes bulk loading large amounts of data into the table. Running + includes bulk loading large amounts of data into the table as well as + attaching, detaching or dropping partitions. Running <command>ANALYZE</command> (or <command>VACUUM ANALYZE</command>) ensures that the planner has up-to-date statistics about the table. With no statistics or obsolete statistics, the planner might diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml index c8fcebc161..8f8d3af985 100644 --- a/doc/src/sgml/ref/analyze.sgml +++ b/doc/src/sgml/ref/analyze.sgml @@ -250,20 +250,38 @@ ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replacea </para> <para> - If the table being analyzed has one or more children, - <command>ANALYZE</command> will gather statistics twice: once on the - rows of the parent table only, and a second time on the rows of the - parent table with all of its children. This second set of statistics - is needed when planning queries that traverse the entire inheritance - tree. The autovacuum daemon, however, will only consider inserts or - updates on the parent table itself when deciding whether to trigger an - automatic analyze for that table. If that table is rarely inserted into - or updated, the inheritance statistics will not be up to date unless you - run <command>ANALYZE</command> manually. + If the table being analyzed is partitioned, <command>ANALYZE</command> + will gather statistics by sampling block randomly from its partitions; + and in addition it will analyze each individual partition separately. + (However, in multi-level partitioning scenarios, each leaf partition + will only be analyzed once.) + By constrast, if the table being analyzed has inheritance children, + <command>ANALYZE</command> will gather statistics for that table twice: + once on the rows of the parent table only, and a second time on the + rows of the parent table with all of its children. This second set of + statistics is needed when planning queries that traverse the entire + inheritance tree. The children tables are not individually analyzed + in this case. </para> <para> - If any of the child tables are foreign tables whose foreign data wrappers + The autovacuum daemon counts inserts and updates in the partitions + to determine if auto-analyze is needed. However, adding or + removing partitions does not affect the autovacuum daemon decisions, + so triggering a manual <command>ANALYZE</command> is recommended when + they occur. + </para> + + <para> + For tables with inheritance children, the autovacuum daemon only + counts inserts and deletes in the parent table itself when deciding + whether to trigger an automatic analyze for that table. If that table + is rarely inserted into or updated, the inheritance statistics will + not be up to date unless you run <command>ANALYZE</command> manually. + </para> + + <para> + If any of the child tables or partitions are foreign tables whose foreign data wrappers do not support <command>ANALYZE</command>, those child tables are ignored while gathering inheritance statistics. </para> diff --git a/doc/src/sgml/ref/pg_restore.sgml b/doc/src/sgml/ref/pg_restore.sgml index 93ea937ac8..474f18c73f 100644 --- a/doc/src/sgml/ref/pg_restore.sgml +++ b/doc/src/sgml/ref/pg_restore.sgml @@ -922,8 +922,10 @@ CREATE DATABASE foo WITH TEMPLATE template0; <para> Once restored, it is wise to run <command>ANALYZE</command> on each - restored table so the optimizer has useful statistics; see - <xref linkend="vacuum-for-statistics"/> and + restored table so the optimizer has useful statistics. + If the table is a partition or an inheritance child, it may also be useful + to analyze the parent table. + See <xref linkend="vacuum-for-statistics"/> and <xref linkend="autovacuum"/> for more information. </para> -- 2.20.1