On Mon, 24 Mar 2025, Adrian Klaver wrote:

On 3/24/25 07:24, Dimitrios Apostolou wrote:
 On Sun, 23 Mar 2025, Laurenz Albe wrote:

 On Thu, 2025-03-20 at 23:48 +0100, Dimitrios Apostolou wrote:
 Performance issues: (important as my db size is >5TB)

 * WAL writes: I didn't manage to avoid writing to the WAL, despite
 having
    setting wal_level=minimal. I even wrote my own function to ALTER all
    tables to UNLOGGED, but failed with "could not change table T to
    unlogged because it references logged table".  I'm out of ideas on
 this
    one.

 You'd have to create an load the table in the same transaction, that is,
 you'd have to run pg_restore with --single-transaction.

 That would restore the schema from the dump, while I want to create the
 schema from the SQL code in version control.


I am not following, from your original post:

"
... create a
clean database by running the SQL schema definition from version control, and
then copy the data for only the tables created.

For this case, I choose to run pg_restore --data-only, and run it as the user
who owns the database (dbowner), not as a superuser, in order to avoid
changes being introduced under the radar.
"

You are running the process in two steps, where the first does not involve
pg_restore. Not sure why doing the pg_restore --data-only portion in single
transaction is not possible?

Laurenz informed me that I could avoid writing to the WAL if I "create and
load the table in a single transaction".
I haven't tried, but here is what I would do to try --single-transaction:

Transaction 1: manually issuing all of CREATE TABLE etc.

Transaction 2: pg_restore --single-transaction --data-only

The COPY command in transaction 2 would still need to write to WAL, since
it's separate from the CREATE TABLE.

Am I wrong somewhere?

 Something that might work, would be for pg_restore to issue a TRUNCATE
 before the COPY. I believe this would require superuser privelege though,
 that I would prefer to avoid. Currently I issue TRUNCATE for all tables
 manually before running pg_restore, but of course this is in a different
 transaction so it doesn't help.

 By the way do you see potential problems with using --single-transaction
 to restore billion-rows tables?

COPY is all or none(version 17+ caveat(see
https://www.postgresql.org/docs/current/sql-copy.html  ON_ERROR)), so if the
data dump fails in --single-transaction everything rolls back.

So if I restore all tables, then an error about a "table not found" would
not roll back already copied tables, since it's not part of a COPY?


Thank you for the feedback,
Dimitris

Reply via email to