Does anybody remember my patch to track the minimum Xid present in a table, which allows the freeze xid to be updated on a per-table basis? The motivation behind it was to remove the requirement for database-wide vacuums.

The problem I found with it was that it required all tables to be vacuumed at least once every billion transactions, even frozen tables, because there was the danger that somebody might insert new tuples into the table without marking that fact in the catalogs (thus minxid would remain FrozenTransactionId while reality would be different).

My proposal to solve that problem is to make any transaction that inserts or modifies tuples in a table marked as frozen unfreeze it first. The problem I had last time was finding a good spot in the code for doing so. I'm now proposing to do it in the parser, in setTargetTable(). This routine currently opens the target relation and acquires RowExclusiveLock on it. At this point we can check its relminxid, and if it's FrozenTransactionId, open pg_class and change the value there.

The problem with this is that it turns a possibly innocuous insert into an operation that needs to open pg_class. But for a relation that is not in the relcache, that will happen anyway, so it's not really all that serious. (The assumption here is that an unfreeze event is at least as unlikely as a cache miss or a cache invalidation.)

The attached patch implements this idea. (Of course it doesn't work standalone; it requires the rest of my min-xid patch.) I attach it here separately because it's small and the proposal can be reviewed independently of the rest of the patch, which is much bigger.

Note that there's a FIXME in heap_unfreeze() saying that the shared invalidation would not occur if the transaction aborts. This comment comes verbatim from vacuum.c's vac_update_relstats(); however, I made a small experiment and it seems to be false. I'm not sure about it, but it seems to me to be critical to send an invalidation message so that other backends will notice immediately when we unfreeze a relation.

Comments?

-- 
Alvaro Herrera http://www.amazon.com/gp/registry/CTMLCN8V17R4
"La victoria es para quien se atreve a estar solo"

Index: catalog/heap.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/catalog/heap.c,v
retrieving revision 1.293
diff -c -r1.293 heap.c
*** catalog/heap.c	22 Nov 2005 18:17:08 -0000	1.293
--- catalog/heap.c	9 Dec 2005 16:21:09 -0000
***************
*** 2074,2076 ****
--- 2137,2202 ----
  	systable_endscan(fkeyScan);
  	heap_close(fkeyRel, AccessShareLock);
  }
+ 
+ /*
+  * Mark a relation as no longer frozen in pg_class.  We violate no-overwrite
+  * semantics here by storing the new minxid value directly into the pg_class
+  * tuple that's already in the page.  The reason for this is that our change
+  * must persist even if our transaction aborts.
+  */
+ void
+ heap_unfreeze(Relation rel)
+ {
+ 	Relation	classRel;
+ 	Form_pg_class classForm;
+ 	HeapTuple	tuple;
+ 	HeapTupleData rtup;
+ 	Oid			relid = RelationGetRelid(rel);
+ 	Buffer		buffer;
+ 
+ 	/*
+ 	 * Throw a warning due to deadlock risk.  This is not an error so we have
+ 	 * a chance to correct the bogus state and continue with the original
+ 	 * operation.  It's possible to get a deadlock below however.
+ 	 */
+ 	if (IsSystemRelation(rel))
+ 		ereport(WARNING,
+ 				(errcode(ERRCODE_WARNING),
+ 				 errmsg("system catalogs must not be marked as frozen")));
+ 
+ 	classRel = heap_open(RelationRelationId, RowExclusiveLock);
+ 
+ 	tuple = SearchSysCache(RELOID,
+ 						   ObjectIdGetDatum(relid),
+ 						   0, 0, 0);
+ 	if (!HeapTupleIsValid(tuple))
+ 		elog(ERROR, "cache lookup failed for relation %u", relid);
+ 
+ 	rtup.t_self = tuple->t_self;
+ 	ReleaseSysCache(tuple);
+ 	if (!heap_fetch(classRel, SnapshotNow, &rtup, &buffer, false, NULL))
+ 		elog(ERROR, "pg_class entry for relid %u vanished during unfreezing",
+ 			 relid);
+ 
+ 	/* Ensure no one does this at the same time */
+ 	LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
+ 
+ 	classForm = (Form_pg_class) GETSTRUCT(&rtup);
+ 	classForm->relminxid = RecentXmin;
+ 
+ 	LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
+ 
+ 	/*
+ 	 * Invalidate the tuple in the catcaches; this also arranges to flush
+ 	 * the relation's relcache entry.
+ 	 *
+ 	 * FIXME -- If we fail to commit for some reason, no flush will occur.
+ 	 * This is a bug.
+ 	 */
+ 	CacheInvalidateHeapTuple(classRel, &rtup);
+ 
+ 	/* Write the buffer */
+ 	WriteBuffer(buffer);
+ 
+ 	heap_close(classRel, RowExclusiveLock);
+ }
Index: parser/parse_clause.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/parser/parse_clause.c,v
retrieving revision 1.144
diff -c -r1.144 parse_clause.c
*** parser/parse_clause.c	22 Nov 2005 18:17:16 -0000	1.144
--- parser/parse_clause.c	9 Dec 2005 15:37:02 -0000
***************
*** 157,162 ****
--- 157,168 ----
  	pstate->p_target_relation = heap_openrv(relation, RowExclusiveLock);
  
  	/*
+ 	 * If the relation is currently frozen, need to unfreeze it before use.
+ 	 */
+ 	if (pstate->p_target_relation->rd_rel->relminxid == FrozenTransactionId)
+ 		heap_unfreeze(pstate->p_target_relation);
+ 
+ 	/*
  	 * Now build an RTE.
  	 */
  	rte = addRangeTableEntryForRelation(pstate, pstate->p_target_relation,
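
For anyone who wants to play with the control flow outside a backend, here is a minimal standalone sketch of the unfreeze-on-first-write idea. It is only a toy model, not part of the patch: ToyRelation, toy_unfreeze(), toy_open_for_write() and the catalog_writes counter are hypothetical names made up for illustration, and there is no locking, buffer handling, or cache invalidation. The only thing borrowed from the real code is that FrozenTransactionId is the permanent XID 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins (hypothetical); FrozenTransactionId is XID 2 in PostgreSQL. */
typedef uint32_t ToyXid;
#define TOY_FROZEN_XID ((ToyXid) 2)

/* Simplified "pg_class row": only the field the min-xid patch adds. */
typedef struct ToyRelation
{
    ToyXid relminxid;       /* oldest XID that may be present in the table */
    int    catalog_writes;  /* how many times the "catalog" was modified */
} ToyRelation;

/* Rough analogue of heap_unfreeze(): overwrite the frozen marker in place. */
static void
toy_unfreeze(ToyRelation *rel, ToyXid recent_xmin)
{
    rel->relminxid = recent_xmin;
    rel->catalog_writes++;
}

/*
 * Rough analogue of the check added to setTargetTable(): before a write,
 * unfreeze the relation if it is still marked frozen.  Returns true only
 * when an unfreeze actually happened.
 */
static bool
toy_open_for_write(ToyRelation *rel, ToyXid recent_xmin)
{
    if (rel->relminxid == TOY_FROZEN_XID)
    {
        toy_unfreeze(rel, recent_xmin);
        return true;
    }
    return false;
}
```

The point of the sketch is that only the first write after a freeze pays for the catalog update; later writes see a normal relminxid and skip it, which is why the extra cost should be comparable to an occasional cache miss.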