Hello hackers,
I wanted to revive this thread, specifically around the attislocal
optimization discussion. As of
https://github.com/postgres/postgres/commit/b3f0e0503f3, pg_dump now
batches all the attislocal UPDATEs for a table into a single command,
which improves performance.
I think we can go one step further and skip the attislocal UPDATE
entirely for partition tables. The reason is that attislocal is already
set to false during the ATTACH PARTITION step, which happens right after
CREATE TABLE (see attislocal being set to false in
MergeAttributesIntoExisting). The UPDATEs emitted by pg_dump are therefore
redundant. Even with batching, the single UPDATE still modifies N rows
(N = number of columns), causing N relcache invalidations. The same work
is then repeated by ATTACH PARTITION, causing another N relcache
invalidations.
Skipping the attislocal UPDATE noticeably speeds up restore when there
are many partition tables, because we avoid a large number of relcache
invalidations and rebuild calls. Since this optimization removes the
attislocal UPDATE completely, the effect is even more pronounced for
wider partition tables.
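As a rough illustration of the savings, here is a back-of-envelope sketch
in Python. It assumes the simplified one-invalidation-per-modified-row
model described above, and the partition/column counts are the ones from
my experiment later in this mail:

```python
# Back-of-envelope model of relcache invalidations during restore.
# Simplifying assumption: each modified pg_attribute row triggers one
# relcache invalidation for its partition.

def invalidations(n_partitions, n_columns, skip_dump_update):
    # ATTACH PARTITION always updates the N attislocal rows; pg_dump's
    # batched UPDATE adds a second, redundant pass unless skipped.
    passes = 1 if skip_dump_update else 2
    return n_partitions * n_columns * passes

baseline = invalidations(60000, 300, skip_dump_update=False)
patched = invalidations(60000, 300, skip_dump_update=True)
print(baseline, patched)  # 36000000 18000000
```

i.e. roughly half the invalidation traffic disappears under this model.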
There's already precedent for this:
* attinhcount is never explicitly set by pg_dump; it is only modified by
MergeAttributesIntoExisting during ATTACH PARTITION.
* conislocal for CHECK constraints is deliberately not fixed for
partitions; see the comment "No need to fix conislocal: ATTACH PARTITION
does that" in dumpTableSchema.
The only risk I can foresee is the window between CREATE TABLE and ATTACH
PARTITION, during which attislocal will incorrectly be true. But I think
this window is small enough not to worry about, since ATTACH PARTITION
immediately follows CREATE TABLE in the dump output (barring some other
minor updates).
Here is a simple patch:
```
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 137161aa5e0..d3d7403228a 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -17734,7 +17734,8 @@ dumpTableSchema(Archive *fout, const TableInfo *tbinfo)
for (j = 0; j < tbinfo->numatts; j++)
{
if (!tbinfo->attisdropped[j] &&
- !tbinfo->attislocal[j])
+ !tbinfo->attislocal[j] &&
+ !tbinfo->ispartition)
{
if (firstitem)
{
```
I tried a few experiments on my local MacBook and saw an improvement of
around 33-36% (the exact percentage varies, mostly with the number of
columns).
Setup (master branch): 300 partitioned root tables with 200 leaves each =
60000 partition tables

300 columns per partition:
baseline   : ~30 mins
with patch : ~20 mins (~33% faster)

700 columns per partition:
baseline   : ~66 mins
with patch : ~42 mins (~36% faster)
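(For the record, the quoted percentages are just the relative wall-clock
savings; a trivial Python check, assuming the minute figures above:)

```python
# Relative wall-clock savings from the timings above.
def speedup_pct(baseline_min, patched_min):
    return round(100 * (baseline_min - patched_min) / baseline_min)

print(speedup_pct(30, 20))  # 33 -> the 300-column case
print(speedup_pct(66, 42))  # 36 -> the 700-column case
```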
Thanks
Nikhil
Broadcom Inc.
On Mon, 23 Mar 2026 at 11:50, Alexander Korotkov <[email protected]>
wrote:
> On Mon, Jul 29, 2024 at 12:24 AM Tom Lane <[email protected]> wrote:
> > So I'm forced to the conclusion that we'd better make the transaction
> > size adaptive as per Alexander's suggestion.
> >
> > In addition to the patches attached, I experimented with making
> > dumpTableSchema fold all the ALTER TABLE commands for a single table
> > into one command. That's do-able without too much effort, but I'm now
> > convinced that we shouldn't. It would break the semicolon-counting
> > hack for detecting that tables like these involve extra work.
> > I'm also not very confident that the backend won't have trouble with
> > ALTER TABLE commands containing hundreds of subcommands. That's
> > something we ought to work on probably, but it's not a project that
> > I want to condition v17 pg_upgrade's stability on.
> >
> > Anyway, proposed patches attached. 0001 is some trivial cleanup
> > that I noticed while working on the failed single-ALTER-TABLE idea.
> > 0002 merges the catalog-UPDATE commands that dumpTableSchema issues,
> > and 0003 is Alexander's suggestion.
>
> Nice to see you picked up my idea. I took a look over the patchset.
> Looks good to me.
>
> ------
> Regards,
> Alexander Korotkov
> Supabase