Sorry, that should be a "vacuum full freeze"... not just a "vacuum freeze".

On Fri, 25 May 2018 at 07:07 Andrew Bartley <ambart...@gmail.com> wrote:

> Hi,
>
> The two main techniques we use are.
>
> 1. Create a script that passes the -T (exclude-table) param to pg_dump so
> the log tables are left out of your main backup.  The idea here is to back
> up the rest of your DB under one backup regime and the log tables under
> another. We set it up so that at the end of the day the current log table
> is backed up and loaded into a backup archive, then we vacuum freeze the
> log table (see the sketch below).  The benefit is that once each log table
> is "closed" (meaning you will never write to that log table again), it is
> backed up once only.... ever...   It speeds housekeeping up, and your
> daily backups are much smaller.
>
> 2. Use UNLOGGED in the log table CREATEs (second sketch below).
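>
> A rough sketch of what the end-of-day script for technique 1 might look
> like (table and file names here are made up for illustration):
>
>   # daily backup: everything except the log tables
>   pg_dump -Fc -T 'log_*' mydb > mydb_daily.dump
>
>   # archive the log table that has just been "closed", once only
>   pg_dump -Fc -t log_2018_05 mydb > log_2018_05.dump
>
>   # then freeze it so later vacuums have nothing left to do on it
>   psql -c 'VACUUM FULL FREEZE log_2018_05;' mydb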
>
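> For technique 2, the only change is at table creation time, e.g. (again a
> hypothetical table):
>
>   CREATE UNLOGGED TABLE log_2018_06 (
>       logged_at  timestamptz NOT NULL DEFAULT now(),
>       message    text
>   );
>
> Just keep in mind that unlogged tables skip WAL, so they are truncated
> after a crash and their contents never reach streaming replicas.
>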
> Have fun.
>
> Andrew
>
> On Fri, 25 May 2018 at 02:55 Paul Jungwirth <p...@illuminatedcomputing.com>
> wrote:
>
>> Hi,
>>
>> A project of mine uses a trigger-based approach to record changes to an
>> audit table. The audit table is partitioned by month (pg 9.5, so
>> old-fashioned partitioning). These tables are write-heavy but
>> append-only and practically write-only: we never UPDATE or DELETE, and
>> we seem to consult them only a few times a year. But they are enormous:
>> bigger than the rest of the database in fact. They slow down our
>> backups, they inflate WAL volume and streaming-replication traffic, they
>> add to recovery time, they make upgrades more time-consuming, and I
>> suppose they compete for RAM.
>>
>> This is all on an AWS EC2 instance with EBS storage. We also run a warm
>> standby with streaming replication.
>>
>> Since these tables are so different from everything else, I'm wondering
>> what opportunities we have to reduce their performance cost. I'm
>> interested both in practical high-bang-for-buck changes and in harder,
>> just-interesting-to-think-about last-resort approaches. Here are
>> a few ideas of my own, but I'm curious what others think:
>>
>> We already have no indexes or foreign keys on these tables, so at least
>> there's no cost there.
>>
>> Since they are already partitioned, we could move old data to offline
>> storage and drop those tables. This feels like the biggest, easiest win,
>> and something we should have done a long time ago. Probably it's all we
>> need.
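>>
>> (Per old partition that would amount to something like the following,
>> with made-up names:
>>
>>   pg_dump -Fc -t audit_2017_01 mydb > audit_2017_01.dump   # ship to S3/Glacier
>>   psql -c 'DROP TABLE audit_2017_01;' mydb
>>
>> and pg_restore can always bring a month back if we ever need to query it
>> again.)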
>>
>> Put them on a different tablespace. This one is also pretty obvious, but
>> aside from using a separate disk, I'm curious what other crazy things we
>> could do. Is there any per-tablespace tuning possible? (I think the
>> answer within Postgres is no, but I wish we could change the settings
>> for wal_level, or exclude them from replication, or something, so I'm
>> wondering if we could achieve the same effect by exploiting being on a
>> separate filesystem.) Maybe put the tablespace on some FUSE filesystem
>> to get async writes? Or just pick different mount options, e.g. on ext4
>> lazytime,delalloc,data=writeback? I don't know. Or at a different level:
>> change the triggers so they call a custom function that uses a new
>> thread to store the audit records elsewhere. Maybe these ideas are all
>> too risky, but I think the organization is fine with slightly relaxed
>> durability guarantees for this data, and anyway I'm just curious to have
>> a list of possibilities before I categorize anything as too crazy or
>> not. :-)
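>>
>> (The tablespace mechanics themselves would presumably just be, with a
>> made-up mount point:
>>
>>   CREATE TABLESPACE audit_space LOCATION '/mnt/audit';
>>   ALTER TABLE audit_2018_05 SET TABLESPACE audit_space;
>>
>> with all the interesting "tuning" happening at the filesystem and mount
>> level underneath /mnt/audit, since wal_level and replication can't be
>> set per tablespace.)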
>>
>> If we upgraded to pg 10 we could use logical replication and leave out
>> the audit tables. That is appealing. Even without upgrading, I guess we
>> could replace those tables with postgres_fdw ones, so that they are not
>> replicated? Has anyone else used that trick?
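>>
>> (With pg 10 logical replication I imagine that would look roughly like:
>>
>>   -- on the primary: publish only the non-audit tables
>>   CREATE PUBLICATION no_audit FOR TABLE accounts, orders;
>>
>>   -- on the replica:
>>   CREATE SUBSCRIPTION sub
>>       CONNECTION 'host=primary dbname=mydb user=repl'
>>       PUBLICATION no_audit;
>>
>> with the table list naming everything except the audit partitions, since
>> there's no "all tables except" form. Table names above are placeholders,
>> of course.)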
>>
>> Thanks!
>>
>> --
>> Paul              ~{:-)
>> p...@illuminatedcomputing.com
>>
>>
