Re: recovering from "found xmin ... from before relfrozenxid ..."

Ashutosh Sharma Mon, 14 Sep 2020 03:27:10 -0700

On Sun, Sep 13, 2020 at 3:30 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
>
> Robert Haas <robertmh...@gmail.com> writes:
> > I have committed this version.
>
> This failure says that the test case is not entirely stable:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer&dt=2020-09-12%2005%3A13%3A12
>
> diff -U3 
> /home/nm/farm/gcc64/HEAD/pgsql.build/contrib/pg_surgery/expected/heap_surgery.out
>  
> /home/nm/farm/gcc64/HEAD/pgsql.build/contrib/pg_surgery/results/heap_surgery.out
> --- 
> /home/nm/farm/gcc64/HEAD/pgsql.build/contrib/pg_surgery/expected/heap_surgery.out
>    2020-09-11 06:31:36.000000000 +0000
> +++ 
> /home/nm/farm/gcc64/HEAD/pgsql.build/contrib/pg_surgery/results/heap_surgery.out
>     2020-09-12 11:40:26.000000000 +0000
> @@ -116,7 +116,6 @@
>   vacuum freeze htab2;
>   -- unused TIDs should be skipped
>   select heap_force_kill('htab2'::regclass, ARRAY['(0, 2)']::tid[]);
> - NOTICE:  skipping tid (0, 2) for relation "htab2" because it is marked 
> unused
>    heap_force_kill
>   -----------------
>
>
> sungazer's first run after pg_surgery went in was successful, so it's
> not a hard failure.  I'm guessing that it's timing dependent.
>
> The most obvious theory for the cause is that what VACUUM does with
> a tuple depends on whether the tuple's xmin is below global xmin,
> and a concurrent autovacuum could very easily be holding back global
> xmin.  While I can't easily get autovac to run at just the right
> time, I did verify that a concurrent regular session holding back
> global xmin produces the symptom seen above.  (To replicate, insert
> "select pg_sleep(...)" in heap_surgery.sql before "-- now create an unused
> line pointer"; run make installcheck; and use the delay to connect
> to the database manually, start a serializable transaction, and do
> any query to acquire a snapshot.)
>


Thanks for reporting. I'm able to reproduce the issue by creating some
delay just before "-- now create an unused line pointer" and use the
delay to start a new session either with repeatable read or
serializable transaction isolation level and run some query on the
test table. To fix this, as you suggested I've converted the test
table to the temp table. Attached is the patch with the changes.
Please have a look and let me know about any concerns.

Thanks,

--
With Regards,
Ashutosh Sharma
EnterpriseDB:http://www.enterprisedb.com

diff --git a/contrib/pg_surgery/expected/heap_surgery.out b/contrib/pg_surgery/expected/heap_surgery.out
index 9451c57..d2cf647 100644
--- a/contrib/pg_surgery/expected/heap_surgery.out
+++ b/contrib/pg_surgery/expected/heap_surgery.out
@@ -87,7 +87,7 @@ NOTICE:  skipping tid (0, 6) for relation "htab" because the item number is out
 
 rollback;
 -- set up a new table with a redirected line pointer
-create table htab2(a int) with (autovacuum_enabled = off);
+create temp table htab2(a int);
 insert into htab2 values (100);
 update htab2 set a = 200;
 vacuum htab2;
@@ -175,6 +175,5 @@ ERROR:  "vw" is not a table, materialized view, or TOAST table
 select heap_force_freeze('vw'::regclass, ARRAY['(0, 1)']::tid[]);
 ERROR:  "vw" is not a table, materialized view, or TOAST table
 -- cleanup.
-drop table htab2;
 drop view vw;
 drop extension pg_surgery;
diff --git a/contrib/pg_surgery/sql/heap_surgery.sql b/contrib/pg_surgery/sql/heap_surgery.sql
index 8a27214..3635bd5 100644
--- a/contrib/pg_surgery/sql/heap_surgery.sql
+++ b/contrib/pg_surgery/sql/heap_surgery.sql
@@ -41,7 +41,7 @@ select heap_force_freeze('htab'::regclass, ARRAY['(0, 0)', '(0, 6)']::tid[]);
 rollback;
 
 -- set up a new table with a redirected line pointer
-create table htab2(a int) with (autovacuum_enabled = off);
+create temp table htab2(a int);
 insert into htab2 values (100);
 update htab2 set a = 200;
 vacuum htab2;
@@ -86,6 +86,5 @@ select heap_force_kill('vw'::regclass, ARRAY['(0, 1)']::tid[]);
 select heap_force_freeze('vw'::regclass, ARRAY['(0, 1)']::tid[]);
 
 -- cleanup.
-drop table htab2;
 drop view vw;
 drop extension pg_surgery;

Re: recovering from "found xmin ... from before relfrozenxid ..."

Reply via email to