> On 11 May 2015, at 21:00, Heikki Linnakangas <[email protected]> wrote:
> 
> Applied that part.
> 
>> Now that we got this last-partial-segment problem out of the way, I'm
>> going to try fixing the problem you (Michael) pointed out about relying
>> on pgstat file. Meanwhile, I'd love to get more feedback on the rest of
>> the patch, and the documentation.
> 
> And here is a new version of the patch. I kept the approach of using pgstat, 
> but it now only polls pgstat every 10 seconds, and doesn't block to wait for 
> updated stats.

Hi Heikki,

There’s a nearby thread [0] (about 10 years later) where I’m working on a 
problem your patch from this thread helps solve.

In large datacenter outages, 1–2% of clusters end up with gaps in their PITR 
timeline. In HA setups, when the primary is lost, some WAL can be missing from 
the archive even though it was streamed to the standby. Many HA tools 
(PGConsul, Patroni, etc.) try to re-archive from the standby, but by then 
those WAL files may already have been removed.

Your “shared” archive mode addresses this: the standby keeps WAL until it’s 
archived. Setting archive_mode=always and relying on the archive tool can also 
work, but it’s expensive. In WAL-G, for example, the archive command does a 
GET on the standby’s WAL, then decrypts and compares. Switching to HEAD would 
reduce the cost in some clouds, but would not eliminate it.

Another option is coordinating archiving outside Postgres, but that would mean 
building distributed coordination into the archive tool.

Shared archive mode tackles this in Postgres itself.
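
To illustrate the retention rule, here is a toy model in Python. It is only a 
sketch of the idea, not code from the patch: the class and method names are 
hypothetical, and it assumes a single timeline, where fixed-width WAL segment 
names sort in WAL order.

```python
class StandbyWalRetention:
    """Toy model: a standby keeps each WAL segment until it is reported archived."""

    def __init__(self):
        # Names of the WAL segments currently kept on the standby.
        self.segments = set()

    def receive(self, segment):
        # A segment streamed from the primary is kept locally.
        self.segments.add(segment)

    def report_archived(self, last_archived):
        # Assumption: the primary reports the newest segment it has archived.
        # Everything up to and including that segment may be recycled.
        removable = {s for s in self.segments if s <= last_archived}
        self.segments -= removable
        return sorted(removable)

    def promote(self):
        # After failover, whatever is still retained was never confirmed as
        # archived, so the new primary has to archive it itself.
        return sorted(self.segments)


standby = StandbyWalRetention()
for name in ("000000010000000000000001",
             "000000010000000000000002",
             "000000010000000000000003"):
    standby.receive(name)

recycled = standby.report_archived("000000010000000000000002")
pending = standby.promote()
```

In this model, losing the primary after the report leaves segment 3 on the 
standby, so it can still be archived and the PITR timeline has no gap.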

I’ve retrofitted your patch, incorporated ideas from the Greenplum work [1], 
and made some improvements.

The patchset has three parts:
 * Rebase + tests – Your original patch, rebased, with tests added.
 * Timeline switching – Correct handling of timeline switches in archive 
   status updates.
 * Avoid directory scans – Skip scanning archive_status when possible, which 
   was costly in WAL-G setups.

What do you think?

Best regards, Andrey Borodin.

Attachment: v4-0001-Add-archive_mode-shared-for-coordinated-WAL-archi.patch

Attachment: v4-0003-Optimize-ProcessArchivalReport-to-avoid-directory.patch

Attachment: v4-0002-Mark-ancestor-timeline-WAL-segments-as-archived.patch
