Re: Possible corruption by CreateRestartPoint at promotion

2022-05-15 Thread Michael Paquier
On Mon, May 09, 2022 at 09:24:06AM +0900, Michael Paquier wrote: > Okay, applied this one on HEAD after going back-and-forth on it for > the last couple of days. I have found myself shaping the patch in > what looks like its simplest form, by applying the check based on an > older checkpoint to al

Re: Possible corruption by CreateRestartPoint at promotion

2022-05-08 Thread Michael Paquier
On Fri, May 06, 2022 at 07:58:43PM +0900, Michael Paquier wrote: > And I have spent a bit of time on this stuff to finish with the attached. It > will be a plus to get that done on HEAD for beta1, so I'll try to deal > with it on Monday. I am still a bit stressed about the back branches > as concurrent c

Re: Possible corruption by CreateRestartPoint at promotion

2022-05-06 Thread Michael Paquier
On Fri, May 06, 2022 at 08:52:45AM -0700, Nathan Bossart wrote: > I was looking at other changes in this area (e.g., 3c64dcb), and now I'm > wondering if we actually should invalidate the minRecoveryPoint when the > control file no longer indicates archive recovery. Specifically, what > happens if

Re: Possible corruption by CreateRestartPoint at promotion

2022-05-06 Thread Nathan Bossart
On Fri, May 06, 2022 at 07:58:43PM +0900, Michael Paquier wrote: > And I have spent a bit of time on this stuff to finish with the attached. It > will be a plus to get that done on HEAD for beta1, so I'll try to deal > with it on Monday. I am still a bit stressed about the back branches > as concurrent c

Re: Possible corruption by CreateRestartPoint at promotion

2022-05-06 Thread Michael Paquier
On Thu, Apr 28, 2022 at 03:49:42PM +0900, Michael Paquier wrote: > I am not sure what you mean here. FWIW, I am translating the > suggestion of Nathan to split the existing check in > CreateRestartPoint() that we are discussing here into two if blocks, > instead of just one: > - Move the update of

Re: Possible corruption by CreateRestartPoint at promotion

2022-04-27 Thread Michael Paquier
On Thu, Apr 28, 2022 at 11:43:57AM +0900, Kyotaro Horiguchi wrote: > At Thu, 28 Apr 2022 09:12:13 +0900, Michael Paquier > wrote in >> On Wed, Apr 27, 2022 at 11:09:45AM -0700, Nathan Bossart wrote: >>> On Wed, Apr 27, 2022 at 02:16:01PM +0900, Michael Paquier wrote: - if (ControlFile->st

Re: Possible corruption by CreateRestartPoint at promotion

2022-04-27 Thread Kyotaro Horiguchi
At Wed, 27 Apr 2022 01:31:55 -0400, Tom Lane wrote in > Michael Paquier writes: > > On Wed, Apr 27, 2022 at 12:36:10PM +0800, Rui Zhao wrote: > >> Do you have interest in adding a test like one in my patch? > > > I have studied the test case you are proposing, and I am afraid that > > it is too

Re: Possible corruption by CreateRestartPoint at promotion

2022-04-27 Thread Kyotaro Horiguchi
At Thu, 28 Apr 2022 09:12:13 +0900, Michael Paquier wrote in > On Wed, Apr 27, 2022 at 11:09:45AM -0700, Nathan Bossart wrote: > > On Wed, Apr 27, 2022 at 02:16:01PM +0900, Michael Paquier wrote: > >> - if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY && > >> - ControlFile->checkPointCop

Re: Possible corruption by CreateRestartPoint at promotion

2022-04-27 Thread Kyotaro Horiguchi
At Wed, 27 Apr 2022 14:16:01 +0900, Michael Paquier wrote in > On Tue, Apr 26, 2022 at 08:26:09PM -0700, Nathan Bossart wrote: > > On Wed, Apr 27, 2022 at 10:43:53AM +0900, Kyotaro Horiguchi wrote: > >> At Tue, 26 Apr 2022 11:33:49 -0700, Nathan Bossart > >> wrote in > >>> I suspect we'll sta

Re: Possible corruption by CreateRestartPoint at promotion

2022-04-27 Thread Michael Paquier
On Wed, Apr 27, 2022 at 11:09:45AM -0700, Nathan Bossart wrote: > On Wed, Apr 27, 2022 at 02:16:01PM +0900, Michael Paquier wrote: >> - if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY && >> - ControlFile->checkPointCopy.redo < lastCheckPoint.redo) >> - { >> 7ff23c6 has removed the last c

Re: Possible corruption by CreateRestartPoint at promotion

2022-04-27 Thread Nathan Bossart
On Wed, Apr 27, 2022 at 02:16:01PM +0900, Michael Paquier wrote: > On Tue, Apr 26, 2022 at 08:26:09PM -0700, Nathan Bossart wrote: >> On Wed, Apr 27, 2022 at 10:43:53AM +0900, Kyotaro Horiguchi wrote: >>> + ControlFile->minRecoveryPoint = InvalidXLogRecPtr; >>> + ControlFile->mi

Re: Possible corruption by CreateRestartPoint at promotion

2022-04-26 Thread Tom Lane
Michael Paquier writes: > On Wed, Apr 27, 2022 at 12:36:10PM +0800, Rui Zhao wrote: >> Do you have interest in adding a test like one in my patch? > I have studied the test case you are proposing, and I am afraid that > it is too expensive as designed. That was my feeling too. It's certainly a

Re: Possible corruption by CreateRestartPoint at promotion

2022-04-26 Thread Michael Paquier
On Wed, Apr 27, 2022 at 12:36:10PM +0800, Rui Zhao wrote: > Do you have interest in adding a test like one in my patch? I have studied the test case you are proposing, and I am afraid that it is too expensive as designed. And it is actually racy as you expect the restart point to take longer than

Re: Possible corruption by CreateRestartPoint at promotion

2022-04-26 Thread Michael Paquier
On Tue, Apr 26, 2022 at 08:26:09PM -0700, Nathan Bossart wrote: > On Wed, Apr 27, 2022 at 10:43:53AM +0900, Kyotaro Horiguchi wrote: >> At Tue, 26 Apr 2022 11:33:49 -0700, Nathan Bossart >> wrote in >>> I suspect we'll start seeing this problem more often once end-of-recovery >>> checkpoints are

Re: Possible corruption by CreateRestartPoint at promotion

2022-04-26 Thread Nathan Bossart
On Wed, Apr 27, 2022 at 10:43:53AM +0900, Kyotaro Horiguchi wrote: > At Tue, 26 Apr 2022 11:33:49 -0700, Nathan Bossart > wrote in >> I suspect we'll start seeing this problem more often once end-of-recovery >> checkpoints are removed [0]. Would you mind creating a commitfest entry >> for this

Re: Possible corruption by CreateRestartPoint at promotion

2022-04-26 Thread Kyotaro Horiguchi
At Tue, 26 Apr 2022 11:33:49 -0700, Nathan Bossart wrote in > On Wed, Mar 16, 2022 at 10:24:44AM +0900, Kyotaro Horiguchi wrote: > > While discussing additional LSNs in the checkpoint log message, > > Fujii-san pointed out [2] that there is a case where > > CreateRestartPoint leaves an unrecoverable

Re: Possible corruption by CreateRestartPoint at promotion

2022-04-26 Thread Nathan Bossart
On Wed, Mar 16, 2022 at 10:24:44AM +0900, Kyotaro Horiguchi wrote: > While discussing additional LSNs in the checkpoint log message, > Fujii-san pointed out [2] that there is a case where > CreateRestartPoint leaves an unrecoverable database when concurrent > promotion happens. That corruption is "fixe

Re: Possible corruption by CreateRestartPoint at promotion

2022-03-16 Thread Kyotaro Horiguchi
Just for the record: an instance of the corruption showed up on this mailing list [1]. [1] https://www.postgresql.org/message-id/flat/9EB4CF63-1107-470E-B5A4-061FB9EF8CC8%40outlook.com regards. -- Kyotaro Horiguchi NTT Open Source Software Center

Possible corruption by CreateRestartPoint at promotion

2022-03-15 Thread Kyotaro Horiguchi
Hello, (Cc:ed Fujii-san) This topic diverged from [1] and is summarized in $SUBJECT. To recap: While discussing additional LSNs in the checkpoint log message, Fujii-san pointed out [2] that there is a case where CreateRestartPoint leaves an unrecoverable database when concurrent promotion h