On Thu, Aug 20, 2015 at 9:57 AM, Michael Paquier <michael.paqu...@gmail.com>
wrote:

> On Wed, Jul 22, 2015 at 4:28 PM, Alexander Korotkov <
> a.korot...@postgrespro.ru> wrote:
>
>> On Wed, Jul 22, 2015 at 8:48 AM, Michael Paquier <
>> michael.paqu...@gmail.com> wrote
>>
>>> On Mon, Jul 20, 2015 at 9:18 PM, Alexander Korotkov
>>> <a.korot...@postgrespro.ru> wrote:
>>> > attached patch allows pg_rewind to work when target timeline was
>>> switched.
>>> > Actually, this patch fixes TODO from pg_rewind comments.
>>> >
>>> >   /*
>>> >    * Trace the history backwards, until we hit the target timeline.
>>> >    *
>>> >    * TODO: This assumes that there are no timeline switches on the
>>> target
>>> >    * cluster after the fork.
>>> >    */
>>> >
>>> > This patch allows pg_rewind to handle data directory synchronization
>>> is much
>>> > more general way.
>>> For instance, user can return promoted standby to old master.
>>
>>
> +               /*
> +                * Since incomplete segments are copied into next
> timelines, find the
> +                * lastest timeline holding required segment.
> +                */
> +               while (private->tliIndex < targetNentries - 1 &&
> +                          targetHistory[private->tliIndex].end <
> targetSegEnd)
> +               {
> +                       private->tliIndex++;
> +                       tli_index++;
> +               }
> It seems to me that the patch is able to handle timeline switches onwards,
> but not backwards and this is what would be required to return a promoted
> standby, that got switched to let's say timeline 2 when promoted, to an old
> master, that is still on timeline 1. This code actually fails when scanning
> for the last checkpoint before WAL forked as it will be on the timeline 1
> of the old master. Imagine for example that the WAL has forked at
> 0/30XXXXX which is saved in segment 000000020000000000000003 (say 2/0/3) on
> the promoted standby, and that the last checkpoint record is on 0/20XXXXX,
> which is part of 000000010000000000000002 (1/0/2). I think that we should
> scan 2/0/3 (not the partial segment 1/0/3), and then 1/0/2 when looking for
> the last checkpoint record. Hence the startup index TLI should be set to
> the highest timeline and should be decremented depending on what is in the
> history file.
> The code above looks correct to me when scanning the WAL history onwards
> though, which is what is done when extracting the page map, but not
> backwards when we try to find the last common checkpoint record. This code
> actually fails trying to open 2/0/2 that does not exist in the promoted
> standby's pg_xlog in my test case.
>
> Attached is a small script I have used to reproduce the failure.
>

Right, thanks! It should be fixed in the attached version of patch.

I think that the documentation needs a brush up as well to outline the fact
> that pg_rewind would be able to put back as well a standby in a cluster,
> after for example an operator mistake when promoting a node that should not
> be.
>

Yes. But from current pg_rewind documentation I can't figure out that it
can't do so. However, we can add an explicit statement which claims that it
can.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment: pg-rewind-target-switch-3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to