On Thu, Aug 20, 2015 at 9:57 AM, Michael Paquier <michael.paqu...@gmail.com> wrote:
> On Wed, Jul 22, 2015 at 4:28 PM, Alexander Korotkov < > a.korot...@postgrespro.ru> wrote: > >> On Wed, Jul 22, 2015 at 8:48 AM, Michael Paquier < >> michael.paqu...@gmail.com> wrote >> >>> On Mon, Jul 20, 2015 at 9:18 PM, Alexander Korotkov >>> <a.korot...@postgrespro.ru> wrote: >>> > attached patch allows pg_rewind to work when target timeline was >>> switched. >>> > Actually, this patch fixes TODO from pg_rewind comments. >>> > >>> > /* >>> > * Trace the history backwards, until we hit the target timeline. >>> > * >>> > * TODO: This assumes that there are no timeline switches on the >>> target >>> > * cluster after the fork. >>> > */ >>> > >>> > This patch allows pg_rewind to handle data directory synchronization >>> is much >>> > more general way. >>> For instance, user can return promoted standby to old master. >> >> > + /* > + * Since incomplete segments are copied into next > timelines, find the > + * lastest timeline holding required segment. > + */ > + while (private->tliIndex < targetNentries - 1 && > + targetHistory[private->tliIndex].end < > targetSegEnd) > + { > + private->tliIndex++; > + tli_index++; > + } > It seems to me that the patch is able to handle timeline switches onwards, > but not backwards and this is what would be required to return a promoted > standby, that got switched to let's say timeline 2 when promoted, to an old > master, that is still on timeline 1. This code actually fails when scanning > for the last checkpoint before WAL forked as it will be on the timeline 1 > of the old master. Imagine for example that the WAL has forked at > 0/30XXXXX which is saved in segment 000000020000000000000003 (say 2/0/3) on > the promoted standby, and that the last checkpoint record is on 0/20XXXXX, > which is part of 000000010000000000000002 (1/0/2). I think that we should > scan 2/0/3 (not the partial segment 1/0/3), and then 1/0/2 when looking for > the last checkpoint record. Hence the startup index TLI should be set to > the highest timeline and should be decremented depending on what is in the > history file. > The code above looks correct to me when scanning the WAL history onwards > though, which is what is done when extracting the page map, but not > backwards when we try to find the last common checkpoint record. This code > actually fails trying to open 2/0/2 that does not exist in the promoted > standby's pg_xlog in my test case. > > Attached is a small script I have used to reproduce the failure. > Right, thanks! It should be fixed in the attached version of patch. I think that the documentation needs a brush up as well to outline the fact > that pg_rewind would be able to put back as well a standby in a cluster, > after for example an operator mistake when promoting a node that should not > be. > Yes. But from current pg_rewind documentation I can't figure out that it can't do so. However, we can add an explicit statement which claims that it can. ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
pg-rewind-target-switch-3.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers