On 07.06.2013 22:15, Robert Haas wrote:
> On Fri, Jun 7, 2013 at 3:10 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
>> The long running query problem hasn't ever been looked at, it seems,
>> until here and now.
>
> For what it's worth (and that may not be much), I think most people
> will die a horrible death due to bloat after holding a transaction
> open for a tiny fraction of 2B XIDs.  :-(

Yeah, we should fix that too ;-).

While we're at it: I've been thinking that we should try harder to vacuum dead tuples that are no longer visible to any snapshot, even if there's an even older snapshot still open. The stereotypical scenario is a table with a single row that's updated very frequently, like a counter. Autovacuum can normally keep it in check, but if you have a long-running transaction, it will bloat horrendously. But if you only have one long-running transaction with one really old snapshot, and everything else is recent, you'd really only need to keep one old tuple version around for the old snapshot to see, and a recent version or two for everyone else. At worst, the table bloats to double its size, but not more than that.
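
To illustrate what I mean, here's a minimal, standalone sketch (not actual PostgreSQL code, all names hypothetical): rather than pruning only versions older than the oldest snapshot, keep a dead version only if some registered snapshot actually falls between the version's creation and deletion in commit order.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stddef.h>

    typedef uint64_t SnapshotPoint;   /* stand-in for "position in commit order" */

    typedef struct
    {
        SnapshotPoint created;        /* when this version became visible */
        SnapshotPoint deleted;        /* when a later update/delete superseded it */
    } TupleVersion;

    /*
     * A dead version is still needed only if some snapshot was taken after
     * the version was created but before it was deleted; only such a
     * snapshot can still see it.
     */
    static bool
    version_still_needed(const TupleVersion *v,
                         const SnapshotPoint *snapshots, size_t nsnapshots)
    {
        for (size_t i = 0; i < nsnapshots; i++)
        {
            if (snapshots[i] >= v->created && snapshots[i] < v->deleted)
                return true;
        }
        return false;
    }

The coarse rule we effectively use today is just "deleted before the oldest snapshot", which keeps every version newer than the old snapshot even though nothing between that snapshot and "now" can see them.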

To know which tuples are dead at that fine-grained level, vacuum would need to know in more detail what snapshots the backends hold. I'm really excited about Ants Aasma's proposal to use a CSN for snapshots, or more precisely the variant that uses the commit record's LSN for that. If a snapshot is just a single integer, it becomes much easier for backends to share their snapshots in a limited amount of shared memory.
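
Again only a sketch, assuming the CSN/LSN-based snapshot design Ants proposed; the names, array size and (absent) locking are made up. The point is just that a one-integer snapshot can be published in a fixed-size shared-memory array, and vacuum can read the whole set cheaply and feed it to the interval test above:

    #include <stdint.h>
    #include <stddef.h>

    #define MAX_BACKENDS 128
    #define INVALID_CSN  UINT64_MAX

    typedef uint64_t CommitSeqNo;     /* e.g. the commit record's LSN */

    typedef struct
    {
        /* one slot per backend; INVALID_CSN means "no snapshot held" */
        volatile CommitSeqNo snapshot_csn[MAX_BACKENDS];
    } SharedSnapshotArray;

    /* A backend takes or releases a snapshot by writing its own slot. */
    static void
    publish_snapshot(SharedSnapshotArray *a, int backend_id, CommitSeqNo csn)
    {
        a->snapshot_csn[backend_id] = csn;
    }

    /*
     * Vacuum gathers every active snapshot CSN; combined with the
     * per-version interval test sketched earlier, that's enough to decide
     * which dead versions any backend could still see.
     */
    static size_t
    collect_snapshots(const SharedSnapshotArray *a,
                      CommitSeqNo *out, size_t max)
    {
        size_t n = 0;

        for (int i = 0; i < MAX_BACKENDS && n < max; i++)
        {
            CommitSeqNo csn = a->snapshot_csn[i];

            if (csn != INVALID_CSN)
                out[n++] = csn;
        }
        return n;
    }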

- Heikki


