Sorry, that should be a "vacuum full freeze" (i.e. VACUUM (FULL, FREEZE))... not just a "vacuum freeze".
On Fri, 25 May 2018 at 07:07 Andrew Bartley <ambart...@gmail.com> wrote:
> Hi,
>
> The two main techniques we use are:
>
> 1. Create a script that passes the -T (--exclude-table) param to pg_dump
> so the log tables are left out. The idea here is to back up the rest of
> your DB under one backup regime and the log tables under another (sketch
> 1 below). We set it up so that at the end of the day the current log
> table is backed up and loaded into a backup archive, then we vacuum
> freeze the log table. The benefit is that once a log table is "closed"
> (meaning you will never write to that log table again), it is backed up
> once only... ever. It speeds housekeeping up, and your daily backups are
> much smaller.
>
> 2. Use UNLOGGED in the log table creates (sketch 2 below).
>
> Have fun.
>
> Andrew
>
> On Fri, 25 May 2018 at 02:55 Paul Jungwirth <p...@illuminatedcomputing.com>
> wrote:
>
>> Hi,
>>
>> A project of mine uses a trigger-based approach to record changes to an
>> audit table. The audit table is partitioned by month (pg 9.5, so
>> old-fashioned partitioning). These tables are write-heavy but
>> append-only and practically write-only: we never UPDATE or DELETE, and
>> we seem to consult them only a few times a year. But they are enormous:
>> bigger than the rest of the database, in fact. They slow down our
>> backups, they increase WAL volume and streaming replication traffic,
>> they add to recovery time, they make upgrades more time-consuming, and
>> I suppose they compete for RAM.
>>
>> This is all on an AWS EC2 instance with EBS storage. We also run a warm
>> standby with streaming replication.
>>
>> Since these tables are so different from everything else, I'm wondering
>> what opportunities we have to reduce their performance cost. I'm
>> interested both in practical, high-bang-for-buck changes and in harder,
>> just-interesting-to-think-about, last-resort approaches. Here are a few
>> ideas of my own, but I'm curious what others think:
>>
>> We already have no indexes or foreign keys on these tables, so at least
>> there's no cost there.
>>
>> Since they are already partitioned, we could move old data to offline
>> storage and drop those tables (sketch 3 below). This feels like the
>> biggest, easiest win, and something we should have done a long time
>> ago. Probably it's all we need.
>>
>> Put them on a different tablespace (sketch 4 below). This one is also
>> pretty obvious, but aside from using a separate disk, I'm curious what
>> other crazy things we could do. Is there any per-tablespace tuning
>> possible? (I think the answer within Postgres is no, but I wish we
>> could change wal_level for them, or exclude them from replication, or
>> something, so I'm wondering if we could achieve the same effect by
>> exploiting being on a separate filesystem.) Maybe put the tablespace on
>> some FUSE filesystem to get async writes? Or just pick different mount
>> options, e.g. on ext4 lazytime,delalloc,data=writeback? I don't know.
>> Or at a different level: change the triggers so they call a custom
>> function that asynchronously hands the audit records off to be stored
>> elsewhere (sketch 5 below). Maybe these ideas are all too risky, but I
>> think the organization is fine with slightly relaxed durability
>> guarantees for this data, and anyway I'm just curious to have a list of
>> possibilities before I categorize anything as too crazy or not. :-)
>>
>> If we upgraded to pg 10 we could use logical replication and leave out
>> the audit tables (sketch 6 below). That is appealing. Even without
>> upgrading, I guess we could replace those tables with postgres_fdw
>> ones, so that they are not replicated? Has anyone else used that trick?
>>
>> Thanks!
>>
>> --
>> Paul ~{:-)
>> p...@illuminatedcomputing.com
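Sketch 1 -- Andrew's two-regime backup. A minimal sketch, assuming
hypothetical names: a database "mydb" and monthly partitions named
audit_log_YYYY_MM. Note that pg_dump's -T/--exclude-table excludes
matching tables, while -t/--table selects them:

    #!/bin/sh
    # Daily regime: everything except the audit partitions.
    pg_dump -Fc -T 'audit_log_*' mydb > mydb_daily.dump

    # One-time regime: once a partition is "closed" and will never be
    # written again, archive it once, then freeze it (per the correction
    # above, VACUUM FULL FREEZE rather than plain VACUUM FREEZE).
    pg_dump -Fc -t audit_log_2018_04 mydb > audit_log_2018_04.dump
    psql -d mydb -c 'VACUUM (FULL, FREEZE) audit_log_2018_04;'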
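Sketch 2 -- UNLOGGED log tables. A sketch, again with hypothetical
names. UNLOGGED tables skip WAL entirely, which also means they are not
streamed to the standby and are truncated after a crash, so this fits
the relaxed-durability stance Paul describes:

    psql -d mydb <<'SQL'
    -- New partitions can be created unlogged from the start.
    CREATE UNLOGGED TABLE audit_log_2018_06 (
        changed_at  timestamptz NOT NULL DEFAULT now(),
        table_name  text NOT NULL,
        row_data    jsonb
    );

    -- On 9.5+ an existing partition can be converted
    -- (this rewrites the whole table).
    ALTER TABLE audit_log_2018_05 SET UNLOGGED;
    SQL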
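Sketch 3 -- archive a cold partition offline, then drop it. A sketch
assuming old-style inheritance partitioning under a hypothetical parent
table audit_log, with a hypothetical S3 bucket as the offline storage:

    #!/bin/sh
    # Dump the partition and ship it to cheap storage.
    pg_dump -Fc -t audit_log_2017_01 mydb > audit_log_2017_01.dump
    aws s3 cp audit_log_2017_01.dump s3://example-audit-archive/

    # Detach it from the parent, then drop it.
    psql -d mydb <<'SQL'
    ALTER TABLE audit_log_2017_01 NO INHERIT audit_log;
    DROP TABLE audit_log_2017_01;
    SQL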
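Sketch 4 -- a separate tablespace on its own volume. Postgres indeed has
no per-tablespace wal_level or replication setting, so any tuning
happens at the filesystem layer. A sketch, assuming a hypothetical
second EBS volume at /dev/xvdf (mount steps run as root):

    #!/bin/sh
    # Relaxed ext4 options; note data=writeback weakens crash
    # consistency for everything on this filesystem.
    mount -o noatime,lazytime,delalloc,data=writeback /dev/xvdf /mnt/audit
    mkdir -p /mnt/audit/pgdata
    chown postgres:postgres /mnt/audit/pgdata

    psql -d mydb <<'SQL'
    CREATE TABLESPACE audit_space LOCATION '/mnt/audit/pgdata';
    -- Moving an existing partition rewrites it; new partitions can be
    -- created there directly with a TABLESPACE clause.
    ALTER TABLE audit_log_2018_05 SET TABLESPACE audit_space;
    SQL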
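Sketch 5 -- handing audit rows to a separate process instead of writing
them in the trigger. One way to approximate this (my substitution for
the "new thread" idea, since a backend cannot safely spawn threads) is
LISTEN/NOTIFY with an external collector. Payloads are capped at about
8000 bytes and notifications are lost if no listener is attached, so
this sits firmly in the relaxed-durability bucket. Hypothetical names
throughout:

    psql -d mydb <<'SQL'
    CREATE OR REPLACE FUNCTION audit_notify() RETURNS trigger AS $$
    BEGIN
        -- Hand the new row to whatever LISTENs on the 'audit' channel;
        -- a separate collector process does the actual storing.
        PERFORM pg_notify('audit', row_to_json(NEW)::text);
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER orders_audit        -- hypothetical audited table
        AFTER INSERT OR UPDATE ON orders
        FOR EACH ROW EXECUTE PROCEDURE audit_notify();
    SQL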
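Sketch 6 -- leaving the audit tables out of logical replication on
pg 10. FOR ALL TABLES cannot exclude tables in pg 10, so the publication
has to list the non-audit tables explicitly (hypothetical names), and
the subscriber needs matching table definitions created beforehand:

    #!/bin/sh
    # On the primary (requires wal_level = logical):
    psql -d mydb -c 'CREATE PUBLICATION no_audit
        FOR TABLE users, orders, order_items;'

    # On the replica, after creating the same schema minus the audit
    # tables:
    psql -d mydb -c "CREATE SUBSCRIPTION no_audit_sub
        CONNECTION 'host=primary.example.com dbname=mydb'
        PUBLICATION no_audit;"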