On 3/28/2011 12:35 PM, Jan Wieck wrote:
> On 3/27/2011 10:43 PM, Tom Lane wrote:
>> In particular, I thought the direction Jan was headed was to release and
>> reacquire the lock between truncating off limited-size chunks of the
>> file. If we do that, we probably *don't* want or need to allow autovac
>> to be booted off the lock more quickly.
>
> That is correct.
>
>>> 3) Scanning backwards 8MB at a time, scanning each 8MB chunk forwards
>>> instead of just going back block by block.
>>
>> Maybe. I'd want to see some experimental evidence justifying the choice
>> of chunk size; I'm pretty sure this will become counterproductive once
>> the chunk size is too large.
>
> Me too, which is why that part of my proposal is highly questionable and
> requires a lot of evidence to be even remotely considered for back releases.
Attached is a patch against HEAD that implements the part that truncates
the heap in small batches (512 pages at a time) without fiddling with
the scan direction.
It makes several attempts at taking the exclusive lock. This is because,
when truncating in batches, I discovered that locks queued up behind the
exclusive lock held by autovacuum otherwise make it too likely that the
conditional lock attempt fails after just a few batches.
I am going to see what similar logic does on 8.4, where the
exclusive lock has far more severe consequences for client connections.
Jan
--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index a5c024c..a3e9a99 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -63,8 +63,17 @@
* REL_TRUNCATE_MINIMUM or (relsize / REL_TRUNCATE_FRACTION) (whichever
* is less) potentially-freeable pages.
*/
-#define REL_TRUNCATE_MINIMUM 1000
-#define REL_TRUNCATE_FRACTION 16
+#define REL_TRUNCATE_MINIMUM 1000
+#define REL_TRUNCATE_FRACTION 16
+
+/*
+ * When truncating a heap, we need an exclusive lock while rescanning
+ * the relation for newly created rows. We do it in batches of pages
+ * to minimize interference between autovacuum and concurrent transactions.
+ */
+#define REL_TRUNCATE_BATCHSIZE 512
+#define REL_TRUNCATE_LOCKATTEMPTS 10
+#define REL_TRUNCATE_LOCKWAIT 50
/*
* Guesstimation of number of dead tuples per page. This is used to
@@ -1017,72 +1026,140 @@ lazy_cleanup_index(Relation indrel,
static void
lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats)
{
+ BlockNumber orig_rel_pages = vacrelstats->rel_pages;
BlockNumber old_rel_pages = vacrelstats->rel_pages;
- BlockNumber new_rel_pages;
+ BlockNumber old_nonempty_pages = vacrelstats->nonempty_pages;
+ BlockNumber new_rel_pages = 0;
+ int retry;
PGRUsage ru0;
pg_rusage_init(&ru0);
- /*
- * We need full exclusive lock on the relation in order to do truncation.
- * If we can't get it, give up rather than waiting --- we don't want to
- * block other backends, and we don't want to deadlock (which is quite
- * possible considering we already hold a lower-grade lock).
- */
- if (!ConditionalLockRelation(onerel, AccessExclusiveLock))
- return;
-
- /*
- * Now that we have exclusive lock, look to see if the rel has grown
- * whilst we were vacuuming with non-exclusive lock. If so, give up; the
- * newly added pages presumably contain non-deletable tuples.
- */
- new_rel_pages = RelationGetNumberOfBlocks(onerel);
- if (new_rel_pages != old_rel_pages)
+ vacrelstats->nonempty_pages = old_rel_pages;
+ while (vacrelstats->nonempty_pages > old_nonempty_pages)
{
- /* might as well use the latest news when we update pg_class stats */
- vacrelstats->rel_pages = new_rel_pages;
- UnlockRelation(onerel, AccessExclusiveLock);
- return;
- }
+ /*
+ * We need full exclusive lock on the relation in order to do
+ * truncation. Rather than waiting we retry this a number of
+ * times --- we don't want to block other backends, and we don't
+ * want to deadlock (which is quite possible considering we
+ * already hold a lower-grade lock).
+ */
+ for (retry = 1; retry <= REL_TRUNCATE_LOCKATTEMPTS; retry++)
+ {
+ CHECK_FOR_INTERRUPTS();
- /*
- * Scan backwards from the end to verify that the end pages actually
- * contain no tuples. This is *necessary*, not optional, because other
- * backends could have added tuples to these pages whilst we were
- * vacuuming.
- */
- new_rel_pages = count_nondeletable_pages(onerel, vacrelstats);
+ if (!ConditionalLockRelation(onerel, AccessExclusiveLock))
+ {
+ if (retry == REL_TRUNCATE_LOCKATTEMPTS)
+ {
+ vacrelstats->nonempty_pages = old_nonempty_pages;
- if (new_rel_pages >= old_rel_pages)
- {
- /* can't do anything after all */
- UnlockRelation(onerel, AccessExclusiveLock);
- return;
- }
+ if (old_rel_pages < orig_rel_pages)
+ {
+ ereport(elevel,
+ (errmsg("\"%s\": truncated %u to %u pages",
+ RelationGetRelationName(onerel),
+ orig_rel_pages, old_rel_pages),
+ errdetail("%s.",
+ pg_rusage_show(&ru0))));
+ }
- /*
- * Okay to truncate.
- */
- RelationTruncate(onerel, new_rel_pages);
+ return;
+ }
+ else
+ pg_usleep(REL_TRUNCATE_LOCKWAIT * 1000L);
+ }
+ else
+ break;
+ }
- /*
- * We can release the exclusive lock as soon as we have truncated. Other
- * backends can't safely access the relation until they have processed the
- * smgr invalidation that smgrtruncate sent out ... but that should happen
- * as part of standard invalidation processing once they acquire lock on
- * the relation.
- */
- UnlockRelation(onerel, AccessExclusiveLock);
+ /*
+ * Now that we have exclusive lock, look to see if the rel has grown
+ * whilst we did not hold an exclusive lock. If so, give up; the
+ * newly added pages presumably contain non-deletable tuples.
+ */
+ new_rel_pages = RelationGetNumberOfBlocks(onerel);
+ if (new_rel_pages != old_rel_pages)
+ {
+ /* We can use the latest news when we update pg_class stats */
+ vacrelstats->rel_pages = new_rel_pages;
+ UnlockRelation(onerel, AccessExclusiveLock);
+
+ if (old_rel_pages < orig_rel_pages)
+ {
+ ereport(elevel,
+ (errmsg("\"%s\": truncated %u to %u pages",
+ RelationGetRelationName(onerel),
+ orig_rel_pages, old_rel_pages),
+ errdetail("%s.",
+ pg_rusage_show(&ru0))));
+ }
+
+ return;
+ }
+
+ /*
+ * Adjust vacrelstats so that we at maximum scan one batch
+ * size in count_nondeletable_pages().
+ */
+ if ((vacrelstats->nonempty_pages - old_nonempty_pages) >
+ REL_TRUNCATE_BATCHSIZE)
+ vacrelstats->nonempty_pages -= REL_TRUNCATE_BATCHSIZE;
+ else
+ vacrelstats->nonempty_pages = old_nonempty_pages;
+
+ /*
+ * Scan backwards from the end to verify that the end pages actually
+ * contain no tuples. This is *necessary*, not optional, because other
+ * backends could have added tuples to these pages whilst we were
+ * vacuuming.
+ */
+ new_rel_pages = count_nondeletable_pages(onerel, vacrelstats);
+
+ if (new_rel_pages >= old_rel_pages)
+ {
+ /* can't do anything after all */
+ UnlockRelation(onerel, AccessExclusiveLock);
+
+ if (old_rel_pages < orig_rel_pages)
+ {
+ ereport(elevel,
+ (errmsg("\"%s\": truncated %u to %u pages",
+ RelationGetRelationName(onerel),
+ orig_rel_pages, old_rel_pages),
+ errdetail("%s.",
+ pg_rusage_show(&ru0))));
+ }
+
+ return;
+ }
+
+ /*
+ * Okay to truncate this batch.
+ */
+ RelationTruncate(onerel, new_rel_pages);
+
+ /*
+ * We can release the exclusive lock as soon as we have truncated.
+ * Other backends can't safely access the relation until they have
+ * processed the smgr invalidation that smgrtruncate sent out ...
+ * but that should happen as part of standard invalidation processing
+ * once they acquire lock on the relation.
+ */
+ UnlockRelation(onerel, AccessExclusiveLock);
+
+ /* update statistics */
+ vacrelstats->rel_pages = new_rel_pages;
+ vacrelstats->pages_removed = orig_rel_pages - new_rel_pages;
+ old_rel_pages = new_rel_pages;
- /* update statistics */
- vacrelstats->rel_pages = new_rel_pages;
- vacrelstats->pages_removed = old_rel_pages - new_rel_pages;
+ }
ereport(elevel,
(errmsg("\"%s\": truncated %u to %u pages",
RelationGetRelationName(onerel),
- old_rel_pages, new_rel_pages),
+ orig_rel_pages, new_rel_pages),
errdetail("%s.",
pg_rusage_show(&ru0))));
}
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers