On Tue, Jul 05, 2022 at 09:58:38AM -0700, Nathan Bossart wrote:
> Thanks! I wonder if we should add a comment in writeTimeLineHistoryFile()
> about possible concurrent use by a WAL receiver and the startup process and
> why that is okay.
Agreed. Adding an extra note at the top of the routine
On Tue, Jul 05, 2022 at 10:19:49AM +0900, Michael Paquier wrote:
> On Thu, May 05, 2022 at 08:10:02PM +0900, Michael Paquier wrote:
>> I'd agree with removing all the callers at the end. pgrename() is
>> quite robust on Windows, but I'd keep the two checks in
>> writeTimeLineHistory(), as the
On Thu, May 05, 2022 at 08:10:02PM +0900, Michael Paquier wrote:
> I'd agree with removing all the callers at the end. pgrename() is
> quite robust on Windows, but I'd keep the two checks in
> writeTimeLineHistory(), as the logic around findNewestTimeLine() would
> consider a past TLI history
On Mon, May 02, 2022 at 04:06:13PM -0700, Nathan Bossart wrote:
> Here is a new patch set. For now, I've only removed the file existence
> check in writeTimeLineHistoryFile(). I don't know if I'm totally convinced
> that there isn't a problem here (e.g., due to concurrent .ready file
>
On Mon, May 02, 2022 at 10:39:07AM -0700, Nathan Bossart wrote:
> On Mon, May 02, 2022 at 07:48:18PM +0900, Michael Paquier wrote:
>> The WAL receiver upgrades the ERROR to a FATAL, and restarts
>> streaming shortly after. Using durable_rename() would not be an issue
>> here.
>
> Thanks for
On Mon, May 02, 2022 at 07:48:18PM +0900, Michael Paquier wrote:
> Skimming through at the buildfarm logs, it happens that the tests are
> able to see this race from time to time. Here is one such example on
> rorqual:
>
On Sun, May 01, 2022 at 10:08:53PM +0900, Michael Paquier wrote:
> At the end, switching directly from durable_rename_excl() to
> durable_rename() should be fine for the WAL segment initialization,
> but we could do things a bit more carefully by adding a check on the
> file existence before
On Sun, May 01, 2022 at 10:08:53PM +0900, Michael Paquier wrote:
> Now, I am surprised by the third code path of durable_rename_excl(),
> as of the WAL receiver doing writeTimeLineHistoryFile(), to not cause
> any issues, as link() should exit with EEXIST when the startup process
> grabs the same
On Tue, Apr 12, 2022 at 09:27:42AM -0700, Nathan Bossart wrote:
> On Tue, Apr 12, 2022 at 03:46:31PM +0900, Kyotaro Horiguchi wrote:
>> At Mon, 11 Apr 2022 09:52:57 -0700, Nathan Bossart
>> wrote in
>>> I traced this back a while ago. I believe the link() was first added in
>>> November 2000
On Wed, Apr 27, 2022 at 11:42:04AM -0700, Nathan Bossart wrote:
> Here is a new patch set with these assertions added. I think at least the
> xlog.c change ought to be back-patched. The problem may be unlikely, but
> AFAICT the possible consequences include WAL corruption.
Okay, so I have
On Wed, Apr 27, 2022 at 04:09:20PM +0900, Michael Paquier wrote:
> I am not sure that have any need to backpatch this change based on the
> unlikeliness of the problem, TBH. One thing that is itching me a bit,
> like Robert upthread, is that we don't check anymore that the newfile
> does not
On Tue, Apr 26, 2022 at 01:09:35PM -0700, Nathan Bossart wrote:
> Here is an attempt at creating something that can be back-patched. 0001
> simply replaces calls to durable_rename_excl() with durable_rename() and is
> intended to be back-patched. 0002 removes the definition of
>
Here is an attempt at creating something that can be back-patched. 0001
simply replaces calls to durable_rename_excl() with durable_rename() and is
intended to be back-patched. 0002 removes the definition of
durable_rename_excl() and is _not_ intended for back-patching. I imagine
0002 will need
The readdir interface allows processes to be in the middle of reading
a directory and unless a kernel was happy to either materialize the
entire directory list when the readdir starts, or lock the entire
directory against modification for the entire time the a process has a
readdir fd open it's
Michael Paquier writes:
> On Fri, Apr 08, 2022 at 09:00:36PM -0400, Robert Haas wrote:
>> I wonder if this is really true. I thought rename() was supposed to be
>> atomic.
> Not always. For example, some old versions of MacOS have a non-atomic
> implementation of rename(), like prairiedog with
On Mon, Apr 18, 2022 at 04:48:35PM +0900, Michael Paquier wrote:
> Saying that, it would be nice to see durable_rename_excl() gone as it
> has created quite a bit of pain for us in the past years.
Yeah, I think this is the right thing to do. Patch upthread [0].
For back-branches, I suspect
On Fri, Apr 08, 2022 at 09:00:36PM -0400, Robert Haas wrote:
> On Fri, Apr 8, 2022 at 12:53 PM Nathan Bossart
> wrote:
>> I think there might be another problem. The man page for rename() seems to
>> indicate that overwriting an existing file also introduces a window where
>> the old and new
On Tue, Apr 12, 2022 at 03:46:31PM +0900, Kyotaro Horiguchi wrote:
> At Mon, 11 Apr 2022 09:52:57 -0700, Nathan Bossart
> wrote in
>> I traced this back a while ago. I believe the link() was first added in
>> November 2000 as part of f0e37a8. This even predates WAL recycling, which
>> was
At Mon, 11 Apr 2022 09:52:57 -0700, Nathan Bossart
wrote in
> On Mon, Apr 11, 2022 at 12:28:47PM -0400, Tom Lane wrote:
> > Robert Haas writes:
> >> On Mon, Apr 11, 2022 at 5:12 AM Kyotaro Horiguchi
> >> wrote:
> >>> If this diagnosis is correct, the comment is proved to be paranoid.
> >
>
On Mon, Apr 11, 2022 at 12:28:47PM -0400, Tom Lane wrote:
> Robert Haas writes:
>> On Mon, Apr 11, 2022 at 5:12 AM Kyotaro Horiguchi
>> wrote:
>>> If this diagnosis is correct, the comment is proved to be paranoid.
>
>> It's sometimes difficult to understand what problems really old code
>>
Robert Haas writes:
> On Mon, Apr 11, 2022 at 5:12 AM Kyotaro Horiguchi
> wrote:
>> If this diagnosis is correct, the comment is proved to be paranoid.
> It's sometimes difficult to understand what problems really old code
> comments are worrying about. For example, could they have been
>
On Mon, Apr 11, 2022 at 5:12 AM Kyotaro Horiguchi
wrote:
> So, the only thing we need to care is segment switch. Without it, the
> segment that InstallXLogFileSegment found by the stat loop is known to
> be safe to overwrite even if exists.
>
> When segment switch finds an existing file, it's no
At Thu, 7 Apr 2022 11:29:54 -0700, Nathan Bossart
wrote in
> The attached patch prevents this problem by using durable_rename() instead
> of durable_rename_excl() for WAL recycling. This removes the protection
> against accidentally overwriting an existing WAL file, but there shouldn't
> be
On Fri, Apr 08, 2022 at 09:00:36PM -0400, Robert Haas wrote:
> > I think there might be another problem. The man page for rename() seems to
> > indicate that overwriting an existing file also introduces a window where
> > the old and new path are hard links to the same file. This isn't a problem
On Fri, Apr 8, 2022 at 12:53 PM Nathan Bossart wrote:
> On Fri, Apr 08, 2022 at 10:38:03AM -0400, Robert Haas wrote:
> > I see that durable_rename_excl() has the following comment: "Similar
> > to durable_rename(), except that this routine tries (but does not
> > guarantee) not to overwrite the
On Fri, Apr 08, 2022 at 10:38:03AM -0400, Robert Haas wrote:
> I'd actually be in favor of nuking durable_rename_excl() from orbit
> and putting the file-exists tests in the callers. Otherwise, someone
> might assume that it actually has the semantics that its name
> suggests, which could be
On Fri, Apr 08, 2022 at 09:53:12AM -0700, Nathan Bossart wrote:
> On Fri, Apr 08, 2022 at 10:38:03AM -0400, Robert Haas wrote:
>> I'd actually be in favor of nuking durable_rename_excl() from orbit
>> and putting the file-exists tests in the callers. Otherwise, someone
>> might assume that it
On Fri, Apr 08, 2022 at 10:38:03AM -0400, Robert Haas wrote:
> I see that durable_rename_excl() has the following comment: "Similar
> to durable_rename(), except that this routine tries (but does not
> guarantee) not to overwrite the target file." If those are the desired
> semantics, we could
On Thu, Apr 7, 2022 at 2:30 PM Nathan Bossart wrote:
> Presently, WAL recycling uses durable_rename_excl(), which notes that a
> crash at an unfortunate moment can result in two links to the same file.
> My testing [1] demonstrated that it was possible to end up with two links
> to the same file
rom 244726f6a78aca52c2fe6e70cef966f152057191 Mon Sep 17 00:00:00 2001
From: Nathan Bossart
Date: Thu, 7 Apr 2022 10:07:42 -0700
Subject: [PATCH v1 1/1] Avoid multiple hard links to same WAL file after a
crash.
Presently, WAL recycling uses durable_rename_excl(), which notes that a cr
30 matches
Mail list logo