Hello Experts
We have a large-ish (16 TB) database cluster on which we are performing the
following sequence:
- First we upgrade the whole cluster from pg11 to pg13 using pg_upgrade
  (this succeeds).
- Next we run a migration script on each database in the cluster. The
  migration script converts a large number of tables from inheritance-based
  partitioning to declarative partitioning. Unfortunately I am not at
  liberty to share the migration script.
The second step succeeds for most of the databases in the cluster but fails
on one of them. The migration is performed inside a transaction, and while
committing the transaction the following error is thrown:
[2021-08-11 11:27:50 CEST] aue_75@218006 218015@[local] db_vrqv1 ERROR: invalid memory alloc request size 1073741824
[2021-08-11 11:27:50 CEST] aue_75@218006 218015@[local] db_vrqv1 STATEMENT: commit
[2021-08-11 11:27:50 CEST] aue_75@218006 218015@[local] db_vrqv1 WARNING: AbortTransaction while in COMMIT state
The transaction is rolled back.
I have looked into the error message. It is raised at a very low level, by
the memory manager, when a single memory allocation of >= 1 GB is requested.
Most of the hits on Google for this error point to database corruption;
however, I am not sure this is the case for us, as we have been able to do a
complete pg_dump of the database without errors.
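As a sanity check on that number, here is a minimal sketch in Python. It
assumes PostgreSQL's allocation cap is MaxAllocSize = 0x3fffffff (1 GB minus
one byte), as I understand it from the server's memory-management headers;
the helper name alloc_size_is_valid is purely illustrative and is not a real
API.

```python
# Assumption: PostgreSQL's cap on a single ordinary allocation (MaxAllocSize)
# is 0x3fffffff, i.e. 1 GiB minus one byte.
MAX_ALLOC_SIZE = 0x3FFFFFFF


def alloc_size_is_valid(size: int) -> bool:
    """Illustrative stand-in for the allocator's request-size check."""
    return 0 <= size <= MAX_ALLOC_SIZE


# Size reported in the error message from our log.
requested = 1073741824

# A positive integer is a power of two if it has exactly one bit set.
is_power_of_two = requested > 0 and (requested & (requested - 1)) == 0

print("request accepted:", alloc_size_is_valid(requested))
print("bytes over the cap:", requested - MAX_ALLOC_SIZE)
print("power of two:", is_power_of_two, "exponent:", requested.bit_length() - 1)
```

The requested size is exactly 2^30, one byte over the assumed cap. That might
point to a buffer that grows by doubling rather than to a garbage length read
from a corrupted page, though that is only a guess on my part.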
We repeated the migration with all Postgres debug logging enabled, but this
did not provide any more detail than the above.
Luckily this occurred while we were testing the procedure on a replica of
the production system, not on the actual production system.
We are using pg 13.0. We are currently re-testing this with huge pages
disabled (on a hunch), and after that we plan to re-test it on 13.3.
Any ideas as to what could be causing this problem, or any suggestions for
troubleshooting steps we could take?
Many thanks in advance,
Cheers
Mike.