Re: pg_rewind does not rewind diverging timelines

Mats Kindahl Sun, 21 Jun 2026 02:10:02 -0700

On 6/8/26 12:48, Andrey Borodin wrote:

On 30 Apr 2026, at 13:19, Mats Kindahl <[email protected]> wrote:


There is one scenario that I assume is known that TLC found, but does not seem 
to be fixed. It is a relatively rare case, but since the fix is quite easy, I 
thought I'd share it with you and get feedback.

Hi Mats,


Hi Andrey,

Thanks for looking at this.

Thanks for working on this. I think the problem is real, but I wonder if
adding a separate UUID to timeline history files is solving it one step
too late.

If two independent promotions manage to choose the same numeric TLI, then
we already have two different histories with the same timeline identifier.
Their history files will also have the same name. A UUID in the file lets
tools detect the mismatch afterwards, but it does not prevent the archive
namespace from containing two different meanings for the same TLI.


Yes, that is correct.

In normal deployments with a shared archive this should only be possible
when the history file is not visible to the other promoting server:
either there is no usable restore_command/shared archive, or there is a
race around publishing and observing the history file. In other words, TLI
allocation is not atomic, but it is intended to be coordinated through the
archive.

Yes, that is the ideal way it should work when you have a shared archive. This works because you have a central authority that synchronizes the timelines (in theory, not counting bugs).

Maybe we should keep TimelineID as the actual branch identifier and make
that allocation harder to collide instead of adding a second identifier.
For example, when choosing a new TLI, add some randomness rather than just
using the next sequential value. That would make the race window much less
dangerous: two independent promotions would be extremely unlikely to
choose the same TLI, the history file names would remain distinct, and TLI
would keep its current role as the timeline identifier.

This also keeps the operational model simpler. TimelineID is already the
identifier exposed in WAL file names, history file names, logs, and
recovery configuration. If we add UUIDs, we effectively introduce another
identity for the same object, and tools then need to reason about both.
If instead we make TLI allocation less deterministic under races, the
existing model remains intact.

Does that framing make sense, or am I missing a case where duplicate TLIs
are unavoidable even with a shared archive and a less collision-prone
allocation scheme?

I considered using a random increment of the TLI in the manner you describe, but there are some issues that make this solution more complicated from an operational perspective:


 * If you skip some TLIs (in the sense of picking a TLI that is "random
   but larger"), then it is not clear what the relation between them is.
     o The history files contain the complete linkage of the timelines,
       so that is covered, but the naming would be strange.
         + For example, if you have history files 1, 5, 7, and 8, then
           these can all belong to different timelines (except 1), or
           form a single timeline, and it is hard to tell which without
           looking through the files.
     o With more promotions, the relation becomes even harder to follow,
       and the risk of collisions increases. (For example, imagine one
       timeline with 1, 5, 7, 8, 11, and one timeline that forks off 1.
       Then an increment of 4, 6, 7, or 10 from the fork point results
       in a collision.)
 * To actually reduce the risk significantly, you need a very wide
   range for the random increment. A smaller range is easier to work
   with, but then you still have to handle the case where timelines
   collide.
 * Normally, the history file with the highest number will be the only
   relevant one. With this approach, you have to check the contents of
   the files to understand which ones are relevant, which increases the
   operational burden.

In contrast, if you use a UUID in this manner:

 * Adding a UUID does not require a central coordinator, is practically
   impossible to collide, and is very straightforward to add. It also
   comes with a low risk, since the places in the code that require
   changes are very few and not likely to have unexpected consequences
   elsewhere. This works both with and without a shared archive.
 * Normally, a shared archive should only contain a single timeline.
   Anything else is an anomaly and should be corrected.
 * I think it is still necessary to handle the case where you do not
   have a shared archive; it would be an odd limitation to say that
   promote only works if you have a shared archive.
 * The UUID still serves a purpose in capturing a situation where
   things have gone wrong. Think of the UUID as a safety measure
   similar to a checksum: an extra precaution to prevent things from
   going wrong.
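
The "checksum" role of the UUID can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the `TimelineHistory` record, the `branch_uuid` field, and the `promote` helper are stand-ins for illustration and do not reflect the actual history file format or the patch. The sequential `parent_tli + 1` allocation is a simplification of the case where neither server can see the other's history file.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TimelineHistory:
    """Hypothetical stand-in for a timeline history file."""
    tli: int
    parent_tli: int
    # Generated locally at promotion time; no coordinator is involved.
    branch_uuid: uuid.UUID = field(default_factory=uuid.uuid4)


def promote(parent_tli):
    """Simplified promotion: take the next sequential TLI and a fresh UUID."""
    return TimelineHistory(tli=parent_tli + 1, parent_tli=parent_tli)


def same_timeline(a, b):
    """Two histories describe the same timeline only if TLI and UUID both match."""
    return a.tli == b.tli and a.branch_uuid == b.branch_uuid


# Two servers promote independently from TLI 1 without a shared archive.
a = promote(1)
b = promote(1)

print(a.tli == b.tli)       # True: both chose TLI 2
print(same_timeline(a, b))  # False: the UUIDs expose the divergence
```

The numeric TLI alone cannot tell the two branches apart here, while the UUID comparison detects the mismatch without any coordination between the servers.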

In short, I think the operational issues with a random increment of the history file number are worse, not better, and we should deal with the name collisions correctly for shared archives instead. There is an issue in that it needs to work even in the case where a promotion generates a new UUID but the correct history file already exists (reported in the other message); I will look into that.


Best wishes,
Mats Kindahl

Best regards, Andrey Borodin.
