On 2013-11-26 13:32:44 +0100, Andres Freund wrote:
> A longer period of staring revealed a likely reason, in lazy_vacuum_rel:
>     /* Do the vacuuming */
>     lazy_scan_heap(onerel, vacrelstats, Irel, nindexes, scan_all);
> ...
>     if (whatever)
>        lazy_truncate_heap(onerel, vacrelstats);
> ...
>     new_frozen_xid = FreezeLimit;
>     if (vacrelstats->scanned_pages < vacrelstats->rel_pages)
>         new_frozen_xid = InvalidTransactionId;
> but lazy_tuncate_heap() does, after it's finished truncating:
>     vacrelstats->rel_pages = new_rel_pages;
> 
> Which means, we might consider a partial vacuum as a vacuum that has
> frozen all old rows if just enough pages have been truncated away.

repro.sql is a reproducer for the problem.

> This seems to be the case since
> b4b6923e03f4d29636a94f6f4cc2f5cf6298b8c8. I suggest we go back to using
> scan_all to determine whether we can set new_frozen_xid. That's a slight
> pessimization when we scan a relation fully without explicitly scanning
> it in its entirety, but given this isn't the first bug around
> scanned_pages/rel_pages I'd rather go that way. The aforementioned
> commit wasn't primarily concerned with that.
> Alternatively we could just compute new_frozen_xid et al before the
> lazy_truncate_heap.

I've gone for the latter in this preliminary patch. Not increasing
relfrozenxid after an initial data load seems like a bit of a shame.

I wonder if we should just do scan_all || vacrelstats->scanned_pages <
vacrelstats->rel_pages?

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
RESET vacuum_freeze_min_age;
DROP TABLE IF EXISTS vacfailure;
CREATE TABLE vacfailure(id serial primary key, data int);
-- disable autovac so we don't have spurious failures
ALTER TABLE vacfailure SET (AUTOVACUUM_ENABLED = false);

-- insert data and mark as all-visible
INSERT INTO vacfailure(data) SELECT * FROM generate_series(1, 100000);
SELECT relname, relfrozenxid, txid_current() FROM pg_class WHERE oid = 
'vacfailure'::regclass;
SELECT min(xmin::text::bigint) FROM vacfailure WHERE NOT xmin = 2;
VACUUM VERBOSE vacfailure;
SELECT min(xmin::text::bigint) FROM vacfailure WHERE NOT xmin = 2;

-- so the test doesn't need to create a couple million xids
SET vacuum_freeze_min_age = 0;
SELECT txid_current();SELECT txid_current();SELECT txid_current();
SELECT txid_current();SELECT txid_current();SELECT txid_current();
SELECT txid_current();SELECT txid_current();SELECT txid_current();

SELECT relname, relfrozenxid, txid_current() FROM pg_class WHERE oid = 
'vacfailure'::regclass;
-- produce dead rows at the end of the relation
BEGIN;
INSERT INTO vacfailure(data) SELECT * FROM generate_series(1, 500000);
ROLLBACK;
-- partial vacuum, if "row versions in 2213 out of 2655 pages" or similar
VACUUM VERBOSE vacfailure;
-- if different than the the last time, the bug has hit
SELECT relname, relfrozenxid, txid_current() FROM pg_class WHERE oid = 
'vacfailure'::regclass;
SELECT min(xmin::text::bigint) FROM vacfailure WHERE NOT xmin = 2;
>From 9907880952f40dfd11795625d659ea723352c376 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Tue, 26 Nov 2013 23:57:55 +0100
Subject: [PATCH] Don't sometimes incorrectly increase relfrozenxid in vacuums
 that truncate.

(auto-)vacuum recognizes that it can increase relfrozenxid by checking
whether it has processed all pages of a relation and frozen all that
are older than the new relfrozenxid. Unfortunately it performed that
check after truncating the dead pages of the end of the relation and
used the new number of pages to decide whether all pages have been
scanned.

This can lead to relfrozenxid being increased above the actual oldest
xid which in turn can lead to data loss due to xid wraparounds with
some rows suddently missing. This likely has escaped notice so far
because it takes a large number (~2^31) of xids being used to see the
effect, while a full-table vacuum before that would fix the issue.
---
 src/backend/commands/vacuumlazy.c | 26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 6778c7d..5d9d32f 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -226,6 +226,24 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
 	vac_close_indexes(nindexes, Irel, NoLock);
 
 	/*
+	 * Compute whether we actually scanned the whole relation, if we did, we
+	 * can adjust relfrozenxid.
+	 *
+	 * NB: We need to check before truncating the relation, because that will
+	 * change ->rel_pages.
+	 */
+	new_frozen_xid = FreezeLimit;
+	if (vacrelstats->scanned_pages < vacrelstats->rel_pages)
+	{
+		Assert(!scan_all);
+		new_frozen_xid = InvalidTransactionId;
+	}
+
+	new_min_multi = MultiXactCutoff;
+	if (scan_all || vacrelstats->scanned_pages < vacrelstats->rel_pages)
+		new_min_multi = InvalidMultiXactId;
+
+	/*
 	 * Optionally truncate the relation.
 	 *
 	 * Don't even think about it unless we have a shot at releasing a goodly
@@ -269,14 +287,6 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
 	if (new_rel_allvisible > new_rel_pages)
 		new_rel_allvisible = new_rel_pages;
 
-	new_frozen_xid = FreezeLimit;
-	if (vacrelstats->scanned_pages < vacrelstats->rel_pages)
-		new_frozen_xid = InvalidTransactionId;
-
-	new_min_multi = MultiXactCutoff;
-	if (vacrelstats->scanned_pages < vacrelstats->rel_pages)
-		new_min_multi = InvalidMultiXactId;
-
 	vac_update_relstats(onerel,
 						new_rel_pages,
 						new_rel_tuples,
-- 
1.8.3.251.g1462b67

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to