On Sat, Dec 15, 2012 at 9:36 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
> On Sat, Dec 8, 2012 at 12:51 AM, Heikki Linnakangas
> <hlinnakan...@vmware.com> wrote:
>> On 06.12.2012 15:39, Amit Kapila wrote:
>>>
>>> On Thursday, December 06, 2012 12:53 AM Heikki Linnakangas wrote:
>>>>
>>>> On 05.12.2012 14:32, Amit Kapila wrote:
>>>>>
>>>>> On Tuesday, December 04, 2012 10:01 PM Heikki Linnakangas wrote:
>>>>>>
>>>>>> After some diversions to fix bugs and refactor existing code, I've
>>>>>> committed a couple of small parts of this patch, which just add some
>>>>>> sanity checks to notice incorrect PITR scenarios. Here's a new
>>>>>> version of the main patch based on current HEAD.
>>>>>
>>>>>
>>>>> After testing with the new patch, the following problems are observed.
>>>>>
>>>>> Defect - 1:
>>>>>
>>>>> 1. start primary A
>>>>> 2. start standby B following A
>>>>> 3. start cascade standby C following B.
>>>>> 4. start another standby D following C.
>>>>> 5. Promote standby B.
>>>>> 6. After successful time line switch in cascade standby C & D, stop D.
>>>>> 7. Restart D, Startup is successful and connecting to standby C.
>>>>> 8. Stop C.
>>>>> 9. Restart C, startup is failing.
>>>>
>>>>
>>>> Ok, the error I get in that scenario is:
>>>>
>>>> C 2012-12-05 19:55:43.840 EET 9283 FATAL: requested timeline 2 does not contain minimum recovery point 0/3023F08 on timeline 1
>>>> C 2012-12-05 19:55:43.841 EET 9282 LOG: startup process (PID 9283) exited with exit code 1
>>>> C 2012-12-05 19:55:43.841 EET 9282 LOG: aborting startup due to startup process failure
>>>>
>>>>
>>>> That mismatch causes the error. I'd like to fix this by always treating
>>>> the checkpoint record to be part of the new timeline. That feels more
>>>> correct. The most straightforward way to implement that would be to peek
>>>> at the xlog record before updating replayEndRecPtr and replayEndTLI. If
>>>> it's a checkpoint record that changes TLI, set replayEndTLI to the new
>>>> timeline before calling the redo-function. But it's a bit of a
>>>> modularity violation to peek into the record like that.
>>>>
>>>> Or we could just revert the sanity check at beginning of recovery that
>>>> throws the "requested timeline 2 does not contain minimum recovery point
>>>> 0/3023F08 on timeline 1" error. The error I added to redo of checkpoint
>>>> record that says "unexpected timeline ID %u in checkpoint record, before
>>>> reaching minimum recovery point %X/%X on timeline %u" checks basically
>>>> the same thing, but at a later stage. However, the way
>>>> minRecoveryPointTLI is updated still seems wrong to me, so I'd like to
>>>> fix that.
>>>>
>>>> I'm thinking of something like the attached (with some more comments
>>>> before committing). Thoughts?
>>>
>>>
>>> This has fixed the problem reported.
>>> However, I am not able to think will there be any problem if we remove
>>> check "requested timeline 2 does not contain minimum recovery point
>>> 0/3023F08 on timeline 1" at beginning of recovery and just update
>>> replayEndTLI with ThisTimeLineID?
>>
>>
>> Well, it seems wrong for the control file to contain a situation like this:
>>
>> pg_control version number:           932
>> Catalog version number:              201211281
>> Database system identifier:          5819228770976387006
>> Database cluster state:              shut down in recovery
>> pg_control last modified:            pe 7. joulukuuta 2012 17.39.57
>> Latest checkpoint location:          0/3023EA8
>> Prior checkpoint location:           0/2000060
>> Latest checkpoint's REDO location:   0/3023EA8
>> Latest checkpoint's REDO WAL file:   000000020000000000000003
>> Latest checkpoint's TimeLineID:      2
>> ...
>> Time of latest checkpoint:           pe 7. joulukuuta 2012 17.39.49
>> Min recovery ending location:        0/3023F08
>> Min recovery ending loc's timeline:  1
>>
>> Note the latest checkpoint location and its TimelineID, and compare them
>> with the min recovery ending location. The min recovery ending location is
>> ahead of latest checkpoint's location; the min recovery ending location
>> actually points to the end of the checkpoint record. But how come the min
>> recovery ending location's timeline is 1, while the checkpoint record's
>> timeline is 2.
>>
>> Now maybe that would happen to work if remove the sanity check, but it still
>> seems horribly confusing. I'm afraid that discrepancy will come back to
>> haunt us later if we leave it like that. So I'd like to fix that.
>>
>> Mulling over this for some more, I propose the attached patch. With the
>> patch, we peek into the checkpoint record, and actually perform the timeline
>> switch (by changing ThisTimeLineID) before replaying it. That way the
>> checkpoint record is really considered to be on the new timeline for all
>> purposes. At the moment, the only difference that makes in practice is that
>> we set replayEndTLI, and thus minRecoveryPointTLI, to the new TLI, but it
>> feels logically more correct to do it that way.
>
> This patch has already been included in HEAD. Right?
>
> I found another "requested timeline does not contain minimum recovery point"
> error scenario in HEAD:
>
> 1. Set up the master 'M', one standby 'S1', and one cascade standby 'S2'.
> 2. Shutdown the master 'M' and promote the standby 'S1', and wait for 'S2'
>    to reconnect to 'S1'.
> 3. Set up new cascade standby 'S3' connecting to 'S2'.
>    Then 'S3' fails to start the recovery because of the following error:
>
>     FATAL: requested timeline 2 does not contain minimum recovery point 0/3000000 on timeline 1
>     LOG: startup process (PID 33104) exited with exit code 1
>     LOG: aborting startup due to startup process failure
>
> The result of pg_controldata of 'S3' is:
>
>     Latest checkpoint location:          0/3000088
>     Prior checkpoint location:           0/2000060
>     Latest checkpoint's REDO location:   0/3000088
>     Latest checkpoint's REDO WAL file:   000000020000000000000003
>     Latest checkpoint's TimeLineID:      2
>     <snip>
>     Min recovery ending location:        0/3000000
>     Min recovery ending loc's timeline:  1
>     Backup start location:               0/0
>     Backup end location:                 0/0
>
> The content of the timeline history file '00000002.history' is:
>
>     1 0/3000088 no recovery target specified
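
For anyone comparing the two pg_controldata dumps quoted above by hand, here is a minimal Python sketch that does the comparison mechanically. It is not the server's check. The helper names (parse_lsn, timeline_of_point, ahead_but_on_older_timeline) are made up for illustration, and the rule that a position before a history entry's switch point belongs to that entry's timeline is a simplifying assumption, not taken from xlog.c.

```python
def parse_lsn(text):
    """Convert an 'X/Y' WAL location (two hex halves) to one integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def timeline_of_point(lsn, history, target_tli):
    """Map an LSN to a timeline using (tli, switch_point) history entries.

    Assumption: anything strictly before an entry's switch point belongs
    to that entry's timeline; everything else belongs to target_tli.
    """
    for tli, switch_point in history:
        if lsn < switch_point:
            return tli
    return target_tli


def ahead_but_on_older_timeline(ckpt_lsn, ckpt_tli, min_lsn, min_tli):
    """True for the discrepancy Heikki describes: the min recovery point
    is at or past the latest checkpoint, yet is labeled with an older TLI."""
    return min_lsn >= ckpt_lsn and min_tli < ckpt_tli


# Values copied from the two control-file dumps quoted above.
heikki = ahead_but_on_older_timeline(
    parse_lsn("0/3023EA8"), 2, parse_lsn("0/3023F08"), 1)
fujii = ahead_but_on_older_timeline(
    parse_lsn("0/3000088"), 2, parse_lsn("0/3000000"), 1)

# History entry from 00000002.history: timeline 1 ended at 0/3000088.
history = [(1, parse_lsn("0/3000088"))]
s3_min_tli = timeline_of_point(parse_lsn("0/3000000"), history, 2)

print(heikki, fujii, s3_min_tli)
```

Under these assumptions the first dump shows the discrepancy and the 'S3' dump does not: its minimum recovery point lies before the switch point in the history file and maps to timeline 1, which matches the control file. The sketch says nothing about why the server still rejects it.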

I can still reproduce this problem. The attached shell script reproduces it.

Regards,

-- 
Fujii Masao
fujii_test.sh
Description: Bourne shell script