On Tue, Jun 12, 2012 at 6:02 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> The part I think is actually hard is how to clean up if the inserting
> xact doesn't reach commit. I think what we're basically looking at here
> is pushing more cost into that path in order to avoid cost in successful
> cases. The first design that comes to mind is
>
> (1) the inserting xact remembers which tables it's inserted pre-hinted
> tuples into, and if it has to abort, it first seqscans those tables to
> reset the hint bits;

I don't think we can count on that to be safe in an arbitrarily chosen abort path. Anything FATAL, for instance. I think we're going to need to keep track of some kind of table-xmin value, representing the oldest operation on the table that's not cleaned up yet, and make it autovacuum's job to clean up any that precede OldestXmin. If the backend can clean up after itself, great, but there has to be some kind of allowance for the case where that doesn't happen.

I'm also skeptical about the notion that "scan the whole table" is going to be a good idea. It really will have to be a full sequential scan if we're setting visibility map bits as we go, not just a scan of the pages that are not all-visible, as vacuum normally does. I think if we want to go this route, we need to log the TID of every tuple we write into the heap into some kind of undo fork (or maybe just the block numbers), so that if the transaction aborts, we (or autovacuum) can go back, find all of those TIDs, and mark the tuples dead without having to scan through (potentially) terabytes of data.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
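
To make the undo-fork idea above a bit more concrete, here is a rough toy model in Python. It is only a sketch under stated assumptions: UndoLog, record_insert, table_xmin, cleanup and insert_prehinted are hypothetical names invented for illustration, not PostgreSQL APIs, and the heap is just a dict of blocks. It records block numbers only (the cheaper of the two variants mentioned), and it shows that cleanup after an abort touches only the logged blocks rather than the whole table.

```python
# Toy model only. All names here are hypothetical and exist purely for
# illustration; none of this is PostgreSQL code or a PostgreSQL API.


class UndoLog:
    """Remembers which heap blocks each transaction wrote pre-hinted tuples into."""

    def __init__(self):
        self.blocks = {}  # xid -> set of (table, block_no)

    def record_insert(self, xid, table, block_no):
        # Block-number-only variant: many inserts into one block cost one entry.
        self.blocks.setdefault(xid, set()).add((table, block_no))

    def table_xmin(self, table):
        # Oldest xid with uncleaned entries for this table, or None.
        # Real XIDs wrap around; this toy ignores that and uses plain min().
        xids = [xid for xid, blks in self.blocks.items()
                if any(t == table for t, _ in blks)]
        return min(xids) if xids else None

    def cleanup(self, xid, heap):
        # Run by the aborting backend, or later by a vacuum-like worker.
        # Visits only the logged blocks and returns how many were visited.
        visited = 0
        for key in sorted(self.blocks.pop(xid, ())):
            for tup in heap[key]:
                if tup["xmin"] == xid:
                    tup["hinted_committed"] = False  # reset the optimistic hint
                    tup["dead"] = True               # mark the tuple dead
            visited += 1
        return visited


def insert_prehinted(heap, log, xid, table, block_no):
    # Log the block first, then write the tuple with its hint already set.
    log.record_insert(xid, table, block_no)
    heap.setdefault((table, block_no), []).append(
        {"xmin": xid, "hinted_committed": True, "dead": False})
```

With a table of, say, 1000 blocks of old data and an aborted transaction that touched two of them, cleanup visits two blocks instead of 1000. While the entries are still pending, table_xmin gives a per-table value that a background worker could compare against OldestXmin to decide what still needs cleaning when the backend never got the chance.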