Why are you regularly having emergencies requiring the restoration of multi-TB tables to databases with lots of cruft?
Fixing that would go a long way towards eliminating your problems with pg_restore. On Mon, Mar 24, 2025 at 11:51 AM Dimitrios Apostolou <ji...@gmx.net> wrote: > On Mon, 24 Mar 2025, Adrian Klaver wrote: > > > On 3/24/25 07:24, Dimitrios Apostolou wrote: > >> On Sun, 23 Mar 2025, Laurenz Albe wrote: > >> > >>> On Thu, 2025-03-20 at 23:48 +0100, Dimitrios Apostolou wrote: > >>>> Performance issues: (important as my db size is >5TB) > >>>> > >>>> * WAL writes: I didn't manage to avoid writing to the WAL, despite > >>>> having > >>>> setting wal_level=minimal. I even wrote my own function to ALTER > all > >>>> tables to UNLOGGED, but failed with "could not change table T to > >>>> unlogged because it references logged table". I'm out of ideas on > >>>> this > >>>> one. > >>> > >>> You'd have to create an load the table in the same transaction, that > is, > >>> you'd have to run pg_restore with --single-transaction. > >> > >> That would restore the schema from the dump, while I want to create the > >> schema from the SQL code in version control. > > > > > > I am not following, from your original post: > > > > " > > ... create a > > clean database by running the SQL schema definition from version > control, and > > then copy the data for only the tables created. > > > > For this case, I choose to run pg_restore --data-only, and run it as the > user > > who owns the database (dbowner), not as a superuser, in order to avoid > > changes being introduced under the radar. > > " > > > > You are running the process in two steps, where the first does not > involve > > pg_restore. Not sure why doing the pg_restore --data-only portion in > single > > transaction is not possible? > > Laurenz informed me that I could avoid writing to the WAL if I "create and > load the table in a single transaction". > I haven't tried, but here is what I would do to try --single-transaction: > > Transaction 1: manually issuing all of CREATE TABLE etc. > > Transaction 2: pg_restore --single-transaction --data-only > > The COPY command in transaction 2 would still need to write to WAL, since > it's separate from the CREATE TABLE. > > Am I wrong somewhere? > > >> Something that might work, would be for pg_restore to issue a TRUNCATE > >> before the COPY. I believe this would require superuser privelege > though, > >> that I would prefer to avoid. Currently I issue TRUNCATE for all tables > >> manually before running pg_restore, but of course this is in a > different > >> transaction so it doesn't help. > >> > >> By the way do you see potential problems with using > --single-transaction > >> to restore billion-rows tables? > > > > COPY is all or none(version 17+ caveat(see > > https://www.postgresql.org/docs/current/sql-copy.html ON_ERROR)), so > if the > > data dump fails in --single-transaction everything rolls back. > > So if I restore all tables, then an error about a "table not found" would > not roll back already copied tables, since it's not part of a COPY? > > > Thank you for the feedback, > Dimitris > > -- Death to <Redacted>, and butter sauce. Don't boil me, I'm still alive. <Redacted> lobster!