Re: document the need to analyze partitioned tables

Laurenz Albe Tue, 05 Sep 2023 20:54:24 -0700

Sorry for dropping the ball on this; I'll add it to the next commitfest.

On Wed, 2023-01-25 at 21:43 +1300, David Rowley wrote:
> > I think your first sentence it a bit clumsy and might be streamlined to
> > 
> >   Partitioned tables do not directly store tuples and consequently do not
> >   require autovacuum to perform any <command>VACUUM</command> operations.
> 
> That seems better than what I had.


Ok, I went with it.

> > Also, I am a little bit unhappy about
> > 
> > 1. Your paragraph states that partitioned table need no autovacuum,
> >    but doesn't state unmistakably that they will never be treated
> >    by autovacuum.
> 
> hmm. I assume the reader realises from the text that lack of any
> tuples means VACUUM is not required.  The remaining part of what
> autovacuum does not do is explained when the text goes on to say that
> ANALYZE operations are also not performed on partitioned tables. I'm
> not sure what is left that's mistakable there.

I rewrote the paragraph a little so that it looks clearer to me.
I hope it is OK for you as well.

> > 2. You make a distinction between table partitions and "normal tables",
> >    but really there is no distiction.
> 
> We may have different mental models here. This relates to the part
> that I wasn't keen on in your patch, i.e:
> 
> +    The partitions of a partitioned table are normal tables and get processed
> +    by autovacuum
> 
> While I agree that the majority of partitions are likely to be
> relkind='r', which you might ordinarily consider a "normal table", you
> just might change your mind when you try to INSERT or UPDATE records
> that would violate the partition constraint. Some partitions might
> also be themselves partitioned tables and others might be foreign
> tables. That does not really matter much when it comes to what
> autovacuum does or does not do, but I'm not really keen to imply in
> our documents that partitions are "normal tables".

Agreed, there are differences between partitions and normal tables.
And this is not the place in the documentation where we would like to
get into detail about the differences.

Attached is the next version of my patch.

Yours,
Laurenz Albe

From 33ef30888b5f5b57c776a1bd00065e0c94daccdb Mon Sep 17 00:00:00 2001
From: Laurenz Albe <laurenz.a...@cybertec.at>
Date: Wed, 6 Sep 2023 05:43:30 +0200
Subject: [PATCH] Improve autovacuum doc on partitioned tables

The documentation mentioned that autovacuum doesn't process
partitioned tables, but it was unclear about the impact.
The old wording could be interpreted to mean that there are
problems with dead tuple cleanup on partitioned tables.
Clarify that the only potential problem is autoanalyze, and
that statistics for the partitions will be gathered.

Author: Laurenz Albe
Reviewed-by: Nathan Bossart, Justin Pryzby
Discussion: https://postgr.es/m/1fd81ddc7710a154834030133c6fea41e55c8efb.camel%40cybertec.at
---
 doc/src/sgml/maintenance.sgml | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index 9cf9d030a8..10b1f211e8 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -861,10 +861,18 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
    </para>
 
    <para>
-    Partitioned tables are not processed by autovacuum.  Statistics
-    should be collected by running a manual <command>ANALYZE</command> when it is
-    first populated, and again whenever the distribution of data in its
-    partitions changes significantly.
+    Partitioned tables do not directly store tuples and consequently do not
+    require autovacuum to perform any <command>VACUUM</command> operations.
+    Autovacuum performs a <command>VACUUM</command> on the partitioned
+    table's partitions the same as it does with other tables.  While that
+    works fine, the fact that autovacuum doesn't process partitioned tables
+    also means that it doesn't run <command>ANALYZE</command> on them, and this
+    can be problematic, as there are various places in the query planner that
+    attempt to make use of table statistics for partitioned tables when
+    partitioned tables are queried.  For now, you can work around this problem
+    by running a manual <command>ANALYZE</command> command on the partitioned
+    table when the partitioned table is first populated, and again whenever
+    the distribution of data in its partitions changes significantly.
    </para>
 
    <para>
-- 
2.41.0

Re: document the need to analyze partitioned tables

Reply via email to