Re: visibility map corruption

2021-07-26 Thread Bruce Momjian
On Sat, Jul 24, 2021 at 10:01:05AM -0400, Bruce Momjian wrote: > On Fri, Jul 23, 2021 at 09:01:18PM -0400, Bruce Momjian wrote: > > On Fri, Jul 23, 2021 at 05:47:18PM -0700, Peter Geoghegan wrote: > > > > I could perhaps see corruption happening if pg_control's oldest xid > > > > value was closer

Re: visibility map corruption

2021-07-24 Thread Bruce Momjian
On Fri, Jul 23, 2021 at 09:01:18PM -0400, Bruce Momjian wrote: > On Fri, Jul 23, 2021 at 05:47:18PM -0700, Peter Geoghegan wrote: > > > I could perhaps see corruption happening if pg_control's oldest xid > > > value was closer to the current xid value than it should be, but I can't > > > see how

Re: visibility map corruption

2021-07-23 Thread Bruce Momjian
On Fri, Jul 23, 2021 at 05:47:18PM -0700, Peter Geoghegan wrote: > > I could perhaps see corruption happening if pg_control's oldest xid > > value was closer to the current xid value than it should be, but I can't > > see how having it 2-billion away could cause harm, unless perhaps > > pg_upgrade

Re: visibility map corruption

2021-07-23 Thread Peter Geoghegan
On Fri, Jul 23, 2021 at 5:08 PM Bruce Momjian wrote: > However, I am now stuck on the commit message text, and I think this is > the point Peter Geoghegan was trying to make earlier --- while we know > that preserving the oldest xid in pg_control is the right thing to do, > and that setting it to

Re: visibility map corruption

2021-07-23 Thread Bruce Momjian
On Thu, Jul 8, 2021 at 09:51:47AM -0400, Bruce Momjian wrote: > On Thu, Jul 8, 2021 at 08:11:14AM -0500, Justin Pryzby wrote: > > Also, the pg_upgrade status message still seems to be misplaced: > > > > In 20210706190612.gm22...@telsasoft.com, Justin Pryzby wrote: > > > I re-arranged the

Re: visibility map corruption

2021-07-08 Thread Bruce Momjian
On Thu, Jul 8, 2021 at 08:11:14AM -0500, Justin Pryzby wrote: > Also, the pg_upgrade status message still seems to be misplaced: > > In 20210706190612.gm22...@telsasoft.com, Justin Pryzby wrote: > > I re-arranged the pg_upgrade output of that patch: it was in the middle of > > the > > two

Re: visibility map corruption

2021-07-08 Thread Justin Pryzby
Also, the pg_upgrade status message still seems to be misplaced: In 20210706190612.gm22...@telsasoft.com, Justin Pryzby wrote: > I re-arranged the pg_upgrade output of that patch: it was in the middle of the > two halves: "Setting next transaction ID and epoch for new cluster" +++

Re: visibility map corruption

2021-07-08 Thread Bruce Momjian
On Thu, Jul 8, 2021 at 07:35:58AM +0200, Drouvot, Bertrand wrote: > Thanks for having looked at it. > > It looks good to me, but i have one question: > > +    printf(_("  -u, --oldest-transaction-id=XID  set oldest transaction > ID\n")); > > and > > +   if

Re: visibility map corruption

2021-07-06 Thread Bruce Momjian
On Tue, Jul 6, 2021 at 08:36:13PM -0400, Bruce Momjian wrote: > On Tue, Jul 6, 2021 at 06:49:10PM -0400, Bruce Momjian wrote: > > My point is that there are a lot internals involved here that are not > > part of pg_upgrade, though it probably only affects pg_upgrade. Anyway, > > Bertrand patch

Re: visibility map corruption

2021-07-06 Thread Bruce Momjian
On Tue, Jul 6, 2021 at 06:49:10PM -0400, Bruce Momjian wrote: > On Tue, Jul 6, 2021 at 03:46:48PM -0700, Peter Geoghegan wrote: > > On Tue, Jul 6, 2021 at 3:30 PM Bruce Momjian wrote: > > > Yes, I can, though it seems like a much bigger issue than pg_upgrade. > > > I will be glad to dig into

Re: visibility map corruption

2021-07-06 Thread Peter Geoghegan
On Tue, Jul 6, 2021 at 3:49 PM Bruce Momjian wrote: > My point is that there are a lot internals involved here that are not > part of pg_upgrade, though it probably only affects pg_upgrade. Anyway, > Bertrand patch seems to have what I need. I was confused by your remarks because I am kind of

Re: visibility map corruption

2021-07-06 Thread Bruce Momjian
On Tue, Jul 6, 2021 at 03:46:48PM -0700, Peter Geoghegan wrote: > On Tue, Jul 6, 2021 at 3:30 PM Bruce Momjian wrote: > > Yes, I can, though it seems like a much bigger issue than pg_upgrade. > > I will be glad to dig into it. > > I'm not sure what you mean by that. Technically this would be an

Re: visibility map corruption

2021-07-06 Thread Bruce Momjian
On Tue, Jul 6, 2021 at 06:30:41PM -0400, Bruce Momjian wrote: > On Tue, Jul 6, 2021 at 02:27:34PM -0700, Peter Geoghegan wrote: > > > BTW, is it really necessary for copy_xact_xlog_xid to invoke pg_resetwal > > > so many times? Why can't we pass all of the update-this options in one > > > call?

Re: visibility map corruption

2021-07-06 Thread Peter Geoghegan
On Tue, Jul 6, 2021 at 3:30 PM Bruce Momjian wrote: > Yes, I can, though it seems like a much bigger issue than pg_upgrade. > I will be glad to dig into it. I'm not sure what you mean by that. Technically this would be an issue for any program that uses "pg_resetwal -x" in the way that

Re: visibility map corruption

2021-07-06 Thread Bruce Momjian
On Tue, Jul 6, 2021 at 02:27:34PM -0700, Peter Geoghegan wrote: > > BTW, is it really necessary for copy_xact_xlog_xid to invoke pg_resetwal > > so many times? Why can't we pass all of the update-this options in one > > call? > > No opinion here. > > > Who's going to do the legwork on this? >

Re: visibility map corruption

2021-07-06 Thread Peter Geoghegan
On Tue, Jul 6, 2021 at 3:12 PM Mark Dilger wrote: > Thanks, Peter, for drawing my attention to this. I had already been > following this thread, but had not yet thought about the problem in terms of > amcheck. > > I will investigate possible solutions in verify_heapam(). Thanks! Great that we

Re: visibility map corruption

2021-07-06 Thread Mark Dilger
> On Jul 6, 2021, at 2:27 PM, Peter Geoghegan wrote: > > It looks like amcheck's verify_heapam.c functionality almost catches > bugs like this one. Something for Mark (CC'd) to consider. Does it > matter that we usually "ctx.oldest_xid = ctx.relfrozenxid", and so > usually use

Re: visibility map corruption

2021-07-06 Thread Peter Geoghegan
On Tue, Jul 6, 2021 at 11:57 AM Tom Lane wrote: > Peter Geoghegan writes: > > ... We should just carry forward the original oldestXid. > > Yup. It's a bit silly that we recognized the need to do that > for oldestMultiXid yet not for oldestXid. True. But at the same time it somehow doesn't seem

Re: visibility map corruption

2021-07-06 Thread Tom Lane
Peter Geoghegan writes: > ... We should just carry forward the original oldestXid. Yup. It's a bit silly that we recognized the need to do that for oldestMultiXid yet not for oldestXid. BTW, is it really necessary for copy_xact_xlog_xid to invoke pg_resetwal so many times? Why can't we pass

Re: visibility map corruption

2021-07-06 Thread Peter Geoghegan
On Tue, Jul 6, 2021 at 10:58 AM Bruce Momjian wrote: > Well, pg_upgrade corruptions are rare, but so is modifying > autovacuum_freeze_max_age. If we have a corruption and we know > autovacuum_freeze_max_age was modified, odds are that is the cause. My point is that there isn't necessarily that

Re: visibility map corruption

2021-07-06 Thread Bruce Momjian
On Tue, Jul 6, 2021 at 10:32:24AM -0700, Peter Geoghegan wrote: > On Tue, Jul 6, 2021 at 10:27 AM Bruce Momjian wrote: > > OK, this is confirmation that the pg_resetwal bug, and its use by > > pg_upgrade, is a serious issue that needs to be addressed. I am > > prepared to work on it now. > >

Re: visibility map corruption

2021-07-06 Thread Peter Geoghegan
On Tue, Jul 6, 2021 at 10:27 AM Bruce Momjian wrote: > OK, this is confirmation that the pg_resetwal bug, and its use by > pg_upgrade, is a serious issue that needs to be addressed. I am > prepared to work on it now. To be clear, I'm not 100% sure that this is related to the pg_upgrade +

Re: visibility map corruption

2021-07-06 Thread Bruce Momjian
On Sun, Jul 4, 2021 at 10:28:25PM +, Floris Van Nee wrote: > > > > I wonder if it's related to this issue: > > > > https://www.postgresql.org/message- > > id/20210423234256.hwopuftipdmp3...@alap3.anarazel.de > > > > Have you increased autovacuum_freeze_max_age from its default? This > >

RE: visibility map corruption

2021-07-04 Thread Floris Van Nee
> > I wonder if it's related to this issue: > > https://www.postgresql.org/message- > id/20210423234256.hwopuftipdmp3...@alap3.anarazel.de > > Have you increased autovacuum_freeze_max_age from its default? This > already sounds like the kind of database where that would make sense. >

Re: visibility map corruption

2021-07-04 Thread Peter Geoghegan
On Sun, Jul 4, 2021 at 2:26 PM Floris Van Nee wrote: > > Have you ever used pg_upgrade on this database? > > > > Yes. The last time (from v11 to v12) was in October 2020. The transaction id > in the tuples (the one PG is trying to check in the tx log) dated from > February 2021. I do believe

RE: visibility map corruption

2021-07-04 Thread Floris Van Nee
> On Sun, Jul 4, 2021 at 1:44 PM Floris Van Nee > wrote: > > We recently ran into an issue where the visibility map of a relation was > corrupt, running Postgres 12.4. The error we'd get when running a SELECT * > from this table is: > > > > could not access status of transaction 3704450152 > >

Re: visibility map corruption

2021-07-04 Thread Peter Geoghegan
On Sun, Jul 4, 2021 at 1:44 PM Floris Van Nee wrote: > We recently ran into an issue where the visibility map of a relation was > corrupt, running Postgres 12.4. The error we'd get when running a SELECT * > from this table is: > > could not access status of transaction 3704450152 > DETAIL:

visibility map corruption

2021-07-04 Thread Floris Van Nee
Hi hackers, We recently ran into an issue where the visibility map of a relation was corrupt, running Postgres 12.4. The error we'd get when running a SELECT * from this table is: could not access status of transaction 3704450152 DETAIL: Could not open file "pg_xact/0DCC": No such file or