Anybody remembers my patch to allow tracking the minimum Xid present in
a table, allowing to update the freeze xid on a per-table basis?  The
motivation behind it was to remove the requirement of database-wide
vacuums.

The problem I found with it was that it required all tables to be
vacuumed at least once every billion transactions, even frozen tables,
because there was the danger that somebody may insert new tuples into
the table without marking that fact in the catalogs (thus minxid would
remain FrozenTransactionId but reality would be different.)

My proposal to solve that problem, is to make any transaction that
inserts or modifies tuples in a table that is marked as frozen, unfreeze
it first.  The problem I had last time was finding a good spot in the
code for doing so.  I'm now proposing to do it in the parser, in
setTargetTable().  This routine currently opens the target relation and
acquires RowExclusiveLock on it.  At this point we can check its
relminxid, and if it's FreezeTransactionId, open pg_class and change the
value there.

The problem with this is that it seems to turn a possibly innocuous
insert into an operation that needs to open pg_class.  But in the case
of a relation not in the relcache, it will happen anyway, so it's not
really all that serious.  (The assumption here is that an unfreeze event
is at least as unlikely as a cache miss or a cache invalidation.)

The attached patch implements this idea.  (Of course it doesn't work
standalone; it requires the rest of my min-xid patch.)  I attach it here
separately because it's small and the proposal can be reviewed
independently of the rest of the patch, which is quite bigger.

Note that there's a FIXME on heap_unfreeze() saying that the shared
invalidation would not occur if the transaction aborts.  This comment
comes verbatim from vacuum.c's vac_update_relstats(); however I made a
small experiment and it seems to be false.  I'm not sure about it, but
it seems to me to be critical to send a invalidation message so that
other backends will notice immediately when we unfreeze a relation.

Comments?

-- 
Alvaro Herrera                 http://www.amazon.com/gp/registry/CTMLCN8V17R4
"La victoria es para quien se atreve a estar solo"
Index: catalog/heap.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/catalog/heap.c,v
retrieving revision 1.293
diff -c -r1.293 heap.c
*** catalog/heap.c      22 Nov 2005 18:17:08 -0000      1.293
--- catalog/heap.c      9 Dec 2005 16:21:09 -0000
***************
*** 2074,2076 ****
--- 2137,2202 ----
        systable_endscan(fkeyScan);
        heap_close(fkeyRel, AccessShareLock);
  }
+ 
+ /*
+  * Mark a relation as no longer frozen in pg_class.  We violate no-overwrite
+  * semantics here by storing the new minxid value directly into the pg_class
+  * tuple that's already in the page.  The reason for this is that our change
+  * must persist even if our transaction aborts.
+  */
+ void
+ heap_unfreeze(Relation rel)
+ {
+       Relation        classRel;
+       Form_pg_class classForm;
+       HeapTuple       tuple;
+       HeapTupleData rtup;
+       Oid                     relid = RelationGetRelid(rel);
+       Buffer          buffer;
+ 
+       /*
+        * Throw a warning due to deadlock risk.  This is not an error so we 
have
+        * a chance to correct the bogus state and continue with the original
+        * operation.  It's possible to get a deadlock below however.
+        */
+       if (IsSystemRelation(rel))
+               ereport(WARNING,
+                               (errcode(ERRCODE_WARNING),
+                                errmsg("system catalogs must not be marked as 
frozen")));
+ 
+       classRel = heap_open(RelationRelationId, RowExclusiveLock);
+ 
+       tuple = SearchSysCache(RELOID,
+                                                  ObjectIdGetDatum(relid),
+                                                  0, 0, 0);
+       if (!HeapTupleIsValid(tuple))
+               elog(ERROR, "cache lookup failed for relation %u", relid);
+ 
+       rtup.t_self = tuple->t_self;
+       ReleaseSysCache(tuple);
+       if (!heap_fetch(classRel, SnapshotNow, &rtup, &buffer, false, NULL))
+               elog(ERROR, "pg_class entry for relid %u vanished during 
unfreezing",
+                        relid);
+ 
+       /* Ensure no one does this at the same time */
+       LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
+ 
+       classForm = (Form_pg_class) GETSTRUCT(&rtup);
+       classForm->relminxid = RecentXmin;
+ 
+       LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
+ 
+       /*
+        * Invalidate the tuple in the catcaches; this also arranges to flush
+        * the relation's relcache entry.
+        *
+        * FIXME -- If we fail to commit for some reason, no flush will occur.
+        * This is a bug.
+        */
+       CacheInvalidateHeapTuple(classRel, &rtup);
+ 
+       /* Write the buffer */
+       WriteBuffer(buffer);
+ 
+       heap_close(classRel, RowExclusiveLock);
+ }
Index: parser/parse_clause.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/parser/parse_clause.c,v
retrieving revision 1.144
diff -c -r1.144 parse_clause.c
*** parser/parse_clause.c       22 Nov 2005 18:17:16 -0000      1.144
--- parser/parse_clause.c       9 Dec 2005 15:37:02 -0000
***************
*** 157,162 ****
--- 157,168 ----
        pstate->p_target_relation = heap_openrv(relation, RowExclusiveLock);
  
        /*
+        * If the relation is currently frozen, need to unfreeze it before use.
+        */
+       if (pstate->p_target_relation->rd_rel->relminxid == FrozenTransactionId)
+               heap_unfreeze(pstate->p_target_relation);
+ 
+       /*
         * Now build an RTE.
         */
        rte = addRangeTableEntryForRelation(pstate, pstate->p_target_relation,
---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Reply via email to