Hi,

Hannu Krosing asked me about his patch to ignore transactions running VACUUM LAZY in other vacuum transactions. I attach a version of the patch updated to the current sources.

Just to remind what this is about: the point of the patch is to be able to run more than one VACUUM LAZY simultaneously without them interfering with each other.

For example, assume you have a database with two tables, one very big and another very small but with a high update rate. One usually wants to vacuum the small one very frequently in order to keep the number of dead tuples low. But if one starts to vacuum the big table, it will take a long time, during which the vacuums applied to the smaller table won't be able to recover any tuples, because each of them will think the big vacuum's transaction may want to read some of the tuples it is trying to remove.

We know this is not so -- a VACUUM can only be run in a standalone transaction, and it only checks the one table it's vacuuming. Thus we can optimize the vacuuming so that if the only thing that's holding the tuples undeletable is another big vacuum operation, we ignore it and delete the tuples anyway.

One exception is that we can't ignore full vacuums. The reason is that a full vacuum may want to run user-defined functions to be able to index the tuples it moves. This isn't a problem normally, except in the case where the function tries to scan some other table: if we ignored that transaction, then another lazy vacuum might delete tuples from that table that the function needs to see.

In a previous version of the patch, there was a provision somewhere that made the code not ignore lazy vacuums in the case where we were running database-wide vacuums. The reason was that the value we computed was also used as the truncate point for pg_clog; thus if we ignored that transaction, the truncate point could be further ahead than the ignored vacuum's xid, so the clog page for the vacuum transaction could be gone and it wouldn't be able to commit. This is no longer the case, because with the patch I committed yesterday, the clog truncation point is calculated differently and thus we don't need to take special care about this.
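To make the idea concrete, here is a small standalone sketch of the "oldest xmin" computation with and without the ignore-lazy-vacuum rule. It is not the code from the patch: SketchProc, sketch_xid_precedes and sketch_oldest_xmin are hypothetical stand-ins for PGPROC, TransactionIdPrecedes and GetOldestXmin, and the database filter (allDbs) is left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified transaction id; 0 plays the role of "invalid". */
typedef uint32_t SketchXid;
#define SKETCH_INVALID_XID ((SketchXid) 0)

/* Hypothetical stand-in for the PGPROC fields the patch looks at. */
typedef struct SketchProc
{
	SketchXid	xid;		/* running transaction id, or invalid */
	SketchXid	xmin;		/* oldest xid its snapshot may need, or invalid */
	bool		in_vacuum;	/* true while running a lazy VACUUM */
} SketchProc;

/* Wraparound-aware "a is older than b" (assumes both are normal xids). */
static bool
sketch_xid_precedes(SketchXid a, SketchXid b)
{
	return (int32_t) (a - b) < 0;
}

/*
 * Return the oldest xid/xmin among the given backends, starting from
 * "start" (the caller's own xid).  When ignore_vacuum is true, backends
 * flagged as running a lazy VACUUM are skipped entirely.
 */
static SketchXid
sketch_oldest_xmin(const SketchProc *procs, size_t nprocs,
				   SketchXid start, bool ignore_vacuum)
{
	SketchXid	result = start;

	assert(procs != NULL || nprocs == 0);

	for (size_t i = 0; i < nprocs; i++)
	{
		const SketchProc *proc = &procs[i];

		if (ignore_vacuum && proc->in_vacuum)
			continue;

		if (proc->xid != SKETCH_INVALID_XID &&
			sketch_xid_precedes(proc->xid, result))
			result = proc->xid;

		if (proc->xmin != SKETCH_INVALID_XID &&
			sketch_xid_precedes(proc->xmin, result))
			result = proc->xmin;
	}

	return result;
}
```

With one backend at xid 100 flagged in_vacuum (the long vacuum of the big table) and an ordinary backend at xid 250 with xmin 240, a caller at xid 300 gets 100 when ignore_vacuum is false and 240 when it is true. That gap is what lets the frequent vacuums of the small table reclaim tuples while the big one is still running.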
-- 
Alvaro Herrera                        http://www.advogato.org/person/alvherre
"One fights when it is necessary... not when one is in the mood!
Mood is for cattle, or for making love, or for playing the baliset.
Not for fighting." (Gurney Halleck)
Index: src/backend/access/transam/twophase.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/access/transam/twophase.c,v
retrieving revision 1.19
diff -c -r1.19 twophase.c
*** src/backend/access/transam/twophase.c	5 Mar 2006 15:58:22 -0000	1.19
--- src/backend/access/transam/twophase.c	11 Jul 2006 16:44:03 -0000
***************
*** 279,284 ****
--- 279,286 ----
  	gxact->proc.pid = 0;
  	gxact->proc.databaseId = databaseid;
  	gxact->proc.roleId = owner;
+ 	gxact->proc.inVacuum = false;
+ 	gxact->proc.nonInVacuumXmin = InvalidTransactionId;
  	gxact->proc.lwWaiting = false;
  	gxact->proc.lwExclusive = false;
  	gxact->proc.lwWaitLink = NULL;
Index: src/backend/access/transam/xact.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/access/transam/xact.c,v
retrieving revision 1.221
diff -c -r1.221 xact.c
*** src/backend/access/transam/xact.c	20 Jun 2006 22:51:59 -0000	1.221
--- src/backend/access/transam/xact.c	11 Jul 2006 16:44:03 -0000
***************
*** 1529,1534 ****
--- 1529,1536 ----
  	LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
  	MyProc->xid = InvalidTransactionId;
  	MyProc->xmin = InvalidTransactionId;
+ 	MyProc->inVacuum = false;	/* must be cleared with xid/xmin */
+ 	MyProc->nonInVacuumXmin = InvalidTransactionId;	/* this too */

  	/* Clear the subtransaction-XID cache too while holding the lock */
  	MyProc->subxids.nxids = 0;
***************
*** 1762,1767 ****
--- 1764,1771 ----
  	LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
  	MyProc->xid = InvalidTransactionId;
  	MyProc->xmin = InvalidTransactionId;
+ 	MyProc->inVacuum = false;	/* must be cleared with xid/xmin */
+ 	MyProc->nonInVacuumXmin = InvalidTransactionId;	/* this too */

  	/* Clear the subtransaction-XID cache too while holding the lock */
  	MyProc->subxids.nxids = 0;
***************
*** 1925,1930 ****
--- 1929,1936 ----
  	LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
  	MyProc->xid = InvalidTransactionId;
  	MyProc->xmin = InvalidTransactionId;
+ 	MyProc->inVacuum = false;	/* must be cleared with xid/xmin */
+ 	MyProc->nonInVacuumXmin = InvalidTransactionId;	/* this too */

  	/* Clear the subtransaction-XID cache too while holding the lock */
  	MyProc->subxids.nxids = 0;
Index: src/backend/access/transam/xlog.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.242
diff -c -r1.242 xlog.c
*** src/backend/access/transam/xlog.c	27 Jun 2006 18:59:17 -0000	1.242
--- src/backend/access/transam/xlog.c	11 Jul 2006 16:44:03 -0000
***************
*** 5417,5423 ****
  	 * StartupSUBTRANS hasn't been called yet.
  	 */
  	if (!InRecovery)
! 		TruncateSUBTRANS(GetOldestXmin(true));

  	if (!shutdown)
  		ereport(DEBUG2,
--- 5417,5423 ----
  	 * StartupSUBTRANS hasn't been called yet.
  	 */
  	if (!InRecovery)
! 		TruncateSUBTRANS(GetOldestXmin(true, false));

  	if (!shutdown)
  		ereport(DEBUG2,
Index: src/backend/catalog/index.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/catalog/index.c,v
retrieving revision 1.268
diff -c -r1.268 index.c
*** src/backend/catalog/index.c	3 Jul 2006 22:45:37 -0000	1.268
--- src/backend/catalog/index.c	11 Jul 2006 16:44:03 -0000
***************
*** 1365,1371 ****
  	else
  	{
  		snapshot = SnapshotAny;
! 		OldestXmin = GetOldestXmin(heapRelation->rd_rel->relisshared);
  	}

  	scan = heap_beginscan(heapRelation, /* relation */
--- 1365,1372 ----
  	else
  	{
  		snapshot = SnapshotAny;
! 		/* okay to ignore lazy VACUUMs here */
! 		OldestXmin = GetOldestXmin(heapRelation->rd_rel->relisshared, true);
  	}

  	scan = heap_beginscan(heapRelation, /* relation */
Index: src/backend/commands/vacuum.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/commands/vacuum.c,v
retrieving revision 1.333
diff -c -r1.333 vacuum.c
*** src/backend/commands/vacuum.c	10 Jul 2006 16:20:50 -0000	1.333
--- src/backend/commands/vacuum.c	11 Jul 2006 20:03:35 -0000
***************
*** 40,45 ****
--- 40,46 ----
  #include "postmaster/autovacuum.h"
  #include "storage/freespace.h"
  #include "storage/pmsignal.h"
+ #include "storage/proc.h"
  #include "storage/procarray.h"
  #include "storage/smgr.h"
  #include "tcop/pquery.h"
***************
*** 594,600 ****
  {
  	TransactionId limit;

! 	*oldestXmin = GetOldestXmin(sharedRel);

  	Assert(TransactionIdIsNormal(*oldestXmin));

--- 595,607 ----
  {
  	TransactionId limit;

! 	/*
! 	 * We can always ignore processes running lazy vacuum.  This is because we
! 	 * use these values only for deciding which tuples we must keep in the
! 	 * tables.  Since lazy vacuum doesn't write its xid to the table, it's
! 	 * safe to ignore it.
! 	 */
! 	*oldestXmin = GetOldestXmin(sharedRel, true);

  	Assert(TransactionIdIsNormal(*oldestXmin));

***************
*** 650,655 ****
--- 657,667 ----
   *		pg_class would've been obsoleted.  Of course, this only works for
   *		fixed-size never-null columns, but these are.
   *
+  *		Another reason for doing it this way is that when we are in a lazy
+  *		VACUUM and have inVacuum set, we mustn't do any updates --- somebody
+  *		vacuuming pg_class might think they could delete a tuple marked with
+  *		xmin = our xid.
+  *
   *		This routine is shared by full VACUUM, lazy VACUUM, and stand-alone
   *		ANALYZE.
   */
***************
*** 1001,1008 ****

  	/* Begin a transaction for vacuuming this relation */
  	StartTransactionCommand();
! 	/* functions in indexes may want a snapshot set */
! 	ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());

  	/*
  	 * Tell the cache replacement strategy that vacuum is causing all
--- 1013,1047 ----

  	/* Begin a transaction for vacuuming this relation */
  	StartTransactionCommand();
!
! 	if (vacstmt->full)
! 	{
! 		/* functions in indexes may want a snapshot set */
! 		ActiveSnapshot = CopySnapshot(GetTransactionSnapshot());
! 	}
! 	else
! 	{
! 		/*
! 		 * During a lazy VACUUM we do not run any user-supplied functions,
! 		 * and so it should be safe to not create a transaction snapshot.
! 		 *
! 		 * We can furthermore set the inVacuum flag, which lets other
! 		 * concurrent VACUUMs know that they can ignore this one while
! 		 * determining their OldestXmin.  (The reason we don't set inVacuum
! 		 * during a full VACUUM is exactly that we may have to run user-
! 		 * defined functions for functional indexes, and we want to make
! 		 * sure that if they use the snapshot set above, any tuples it
! 		 * requires can't get removed from other tables.  An index function
! 		 * that depends on the contents of other tables is arguably broken,
! 		 * but we won't break it here by violating transaction semantics.)
! 		 *
! 		 * Note: the inVacuum flag remains set until CommitTransaction or
! 		 * AbortTransaction.  We don't want to clear it until we reset
! 		 * MyProc->xid/xmin, else OldestXmin might appear to go backwards,
! 		 * which is probably Not Good.
! 		 */
! 		MyProc->inVacuum = true;
! 	}

  	/*
  	 * Tell the cache replacement strategy that vacuum is causing all
Index: src/backend/storage/ipc/procarray.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/storage/ipc/procarray.c,v
retrieving revision 1.12
diff -c -r1.12 procarray.c
*** src/backend/storage/ipc/procarray.c	19 Jun 2006 01:51:21 -0000	1.12
--- src/backend/storage/ipc/procarray.c	11 Jul 2006 20:26:35 -0000
***************
*** 387,406 ****
   * If allDbs is TRUE then all backends are considered; if allDbs is FALSE
   * then only backends running in my own database are considered.
   *
   * This is used by VACUUM to decide which deleted tuples must be preserved
   * in a table.	allDbs = TRUE is needed for shared relations, but allDbs =
   * FALSE is sufficient for non-shared relations, since only backends in my
!  * own database could ever see the tuples in them.
   *
   * This is also used to determine where to truncate pg_subtrans.  allDbs
!  * must be TRUE for that case.
   *
   * Note: we include the currently running xids in the set of considered xids.
   * This ensures that if a just-started xact has not yet set its snapshot,
   * when it does set the snapshot it cannot set xmin less than what we compute.
   */
  TransactionId
! GetOldestXmin(bool allDbs)
  {
  	ProcArrayStruct *arrayP = procArray;
  	TransactionId result;
--- 387,410 ----
   * If allDbs is TRUE then all backends are considered; if allDbs is FALSE
   * then only backends running in my own database are considered.
   *
+  * If ignoreVacuum is TRUE then backends with inVacuum set are ignored.
+  *
   * This is used by VACUUM to decide which deleted tuples must be preserved
   * in a table.	allDbs = TRUE is needed for shared relations, but allDbs =
   * FALSE is sufficient for non-shared relations, since only backends in my
!  * own database could ever see the tuples in them.  Also, we can ignore
!  * concurrently running lazy VACUUMs because (a) they must be working on other
!  * tables, and (b) they don't need to do snapshot-based lookups.
   *
   * This is also used to determine where to truncate pg_subtrans.  allDbs
!  * must be TRUE for that case, and ignoreVacuum FALSE.
   *
   * Note: we include the currently running xids in the set of considered xids.
   * This ensures that if a just-started xact has not yet set its snapshot,
   * when it does set the snapshot it cannot set xmin less than what we compute.
   */
  TransactionId
! GetOldestXmin(bool allDbs, bool ignoreVacuum)
  {
  	ProcArrayStruct *arrayP = procArray;
  	TransactionId result;
***************
*** 424,429 ****
--- 428,436 ----
  	{
  		PGPROC	   *proc = arrayP->procs[index];

+ 		if (ignoreVacuum && proc->inVacuum)
+ 			continue;
+
  		if (allDbs || proc->databaseId == MyDatabaseId)
  		{
  			/* Fetch xid just once - see GetNewTransactionId */
***************
*** 481,486 ****
--- 488,494 ----
  	TransactionId xmin;
  	TransactionId xmax;
  	TransactionId globalxmin;
+ 	TransactionId noninvacuumxmin;
  	int			index;
  	int			count = 0;

***************
*** 514,520 ****
  					 errmsg("out of memory")));
  	}

! 	globalxmin = xmin = GetTopTransactionId();

  	/*
  	 * If we are going to set MyProc->xmin then we'd better get exclusive
--- 522,528 ----
  					 errmsg("out of memory")));
  	}

! 	globalxmin = xmin = noninvacuumxmin = GetTopTransactionId();

  	/*
  	 * If we are going to set MyProc->xmin then we'd better get exclusive
***************
*** 573,578 ****
--- 581,591 ----

  			if (TransactionIdPrecedes(xid, xmin))
  				xmin = xid;
+
+ 			/* Only consider non-vacuum transactions for nonInVacuumXmin */
+ 			if (TransactionIdPrecedes(xid, noninvacuumxmin) && !proc->inVacuum)
+ 				noninvacuumxmin = xid;
+
  			snapshot->xip[count] = xid;
  			count++;

***************
*** 584,590 ****
--- 597,606 ----
  	}

  	if (serializable)
+ 	{
  		MyProc->xmin = TransactionXmin = xmin;
+ 		MyProc->nonInVacuumXmin = noninvacuumxmin;
+ 	}

  	LWLockRelease(ProcArrayLock);

Index: src/backend/storage/lmgr/proc.c
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/backend/storage/lmgr/proc.c,v
retrieving revision 1.175
diff -c -r1.175 proc.c
*** src/backend/storage/lmgr/proc.c	20 Jun 2006 22:52:00 -0000	1.175
--- src/backend/storage/lmgr/proc.c	11 Jul 2006 16:44:03 -0000
***************
*** 258,263 ****
--- 258,265 ----
  	/* databaseId and roleId will be filled in later */
  	MyProc->databaseId = InvalidOid;
  	MyProc->roleId = InvalidOid;
+ 	MyProc->inVacuum = false;
+ 	MyProc->nonInVacuumXmin = InvalidTransactionId;
  	MyProc->lwWaiting = false;
  	MyProc->lwExclusive = false;
  	MyProc->lwWaitLink = NULL;
***************
*** 389,394 ****
--- 391,398 ----
  	MyProc->xmin = InvalidTransactionId;
  	MyProc->databaseId = InvalidOid;
  	MyProc->roleId = InvalidOid;
+ 	MyProc->inVacuum = false;
+ 	MyProc->nonInVacuumXmin = InvalidTransactionId;
  	MyProc->lwWaiting = false;
  	MyProc->lwExclusive = false;
  	MyProc->lwWaitLink = NULL;
Index: src/include/storage/proc.h
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/include/storage/proc.h,v
retrieving revision 1.88
diff -c -r1.88 proc.h
*** src/include/storage/proc.h	14 Apr 2006 03:38:56 -0000	1.88
--- src/include/storage/proc.h	11 Jul 2006 20:06:15 -0000
***************
*** 74,79 ****
--- 74,84 ----
  	Oid			databaseId;		/* OID of database this backend is using */
  	Oid			roleId;			/* OID of role using this backend */

+ 	bool		inVacuum;		/* true if current xact is a LAZY VACUUM */
+
+ 	TransactionId nonInVacuumXmin;	/* same as xmin with transactions where
+ 									 * (proc->inVacuum == true) excluded */
+
  	/* Info about LWLock the process is currently waiting for, if any. */
  	bool		lwWaiting;		/* true if waiting for an LW lock */
  	bool		lwExclusive;	/* true if waiting for exclusive access */
Index: src/include/storage/procarray.h
===================================================================
RCS file: /home/alvherre/cvs/pgsql/src/include/storage/procarray.h,v
retrieving revision 1.9
diff -c -r1.9 procarray.h
*** src/include/storage/procarray.h	19 Jun 2006 01:51:22 -0000	1.9
--- src/include/storage/procarray.h	11 Jul 2006 16:44:03 -0000
***************
*** 24,30 ****

  extern bool TransactionIdIsInProgress(TransactionId xid);
  extern bool TransactionIdIsActive(TransactionId xid);
! extern TransactionId GetOldestXmin(bool allDbs);

  extern PGPROC *BackendPidGetProc(int pid);
  extern int	BackendXidGetPid(TransactionId xid);
--- 24,30 ----

  extern bool TransactionIdIsInProgress(TransactionId xid);
  extern bool TransactionIdIsActive(TransactionId xid);
! extern TransactionId GetOldestXmin(bool allDbs, bool ignoreVacuum);

  extern PGPROC *BackendPidGetProc(int pid);
  extern int	BackendXidGetPid(TransactionId xid);