Re: [HACKERS] Using non-sequential timelines in order to help with possible collisions

Michael Paquier Wed, 19 Jul 2017 10:42:52 -0700

On Wed, Jul 19, 2017 at 7:00 PM, Robert Haas <[email protected]> wrote:
> On Wed, Jul 19, 2017 at 11:23 AM, Brian Faherty
> <[email protected]> wrote:
>>   I was working with replication and recovery the other day and noticed that
>> there were scenarios where I could cause multiple servers to enter the same
>> timeline while possibly having divergent data. One such scenario is Master A
>> and Replica B are both on timeline 1. There is an event that causes Replica
>> B to become promoted which changes it to timeline 2. Following this, you
>> perform a restore on Master A to a point before the event happened. Once
>> Postgres completes this recovery on Master A, it will switch over to
>> timeline 2. There are now WAL files that have been written to timeline 2
>> from both servers.
>>
>> From this scenario, I would like to suggest considering using non-sequential
>> timelines. From what I have investigated so far, I believe the *.history
>> files in the WAL directory already have all the timelines id's in them and
>> are in order. If we could make those timeline ids to be a bit more
>> unique/random, and still rely on the ordering in the *.history file, I think
>> this would help prevent multiple servers on the same timeline with divergent
>> data.


It seems to me that you are missing one piece here: the history files
generated at the moment of the timeline bump. When recovery finishes,
an instance scans the archives or from the instances it is streaming
from for history files, and chooses a timeline number that does not
match existing ones. So you are trying to avoid a problem that can
easily be solved with a proper archive for example.

>> I was hoping to begin a conversation on whether or not non-sequential
>> timelines are a good idea before I looked at the code around timelines.
>
> It's interesting that you bring this up.  I've also wondered why we
> don't use random TLIs.  I suppose I'm internally assuming that it's
> because the people who wrote the code are far more brilliant and
> knowledgeable of this area than I could ever be and that doing
> anything else would create some kind of awful problem, but maybe
> that's not so.

I am not the only who worked on that, but the result code is a tad
more simple, as it is possible to guess more easily some hierarchy for
the timelines, of course with the history files at hand.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Using non-sequential timelines in order to help with possible collisions

Reply via email to