Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-12-09 Thread Michael Paquier
On Thu, Dec 06, 2018 at 11:18:12PM +, Bossart, Nathan wrote: > That seems reasonable to me. Thanks, committed after renaming a couple of variables, after adding a comment about the use of unlink() and also adding a comment explaining what the different retry counters are here for. -- Michael

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-12-06 Thread Bossart, Nathan
On 12/6/18, 4:54 PM, "Michael Paquier" wrote: > On Thu, Dec 06, 2018 at 02:43:35PM +0900, Michael Paquier wrote: >> Why? A WARNING would be logged if the first unlink() fails, and >> another, different WARNING would be logged if the subsequent fsync >> fails. It looks enough to me to make a dist

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-12-06 Thread Michael Paquier
On Thu, Dec 06, 2018 at 02:43:35PM +0900, Michael Paquier wrote: > Why? A WARNING would be logged if the first unlink() fails, and > another, different WARNING would be logged if the subsequent fsync > fails. It looks enough to me to make a distinction between both. Now, > you may have a point i

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-12-05 Thread Michael Paquier
On Thu, Dec 06, 2018 at 01:55:46PM +0900, Kyotaro HORIGUCHI wrote: > durable_unlink has two modes of faiure. Failure to unlink and > fsync. If once it fails at the fsync stage, subsequent > durable_unlink calls for the same file always fail to unlink with > ENOENT. If durable_unlink is intended to

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-12-05 Thread Kyotaro HORIGUCHI
Hello. At Wed, 5 Dec 2018 16:11:23 +, "Bossart, Nathan" wrote in > The v4 patch looks good to me! durable_unlnk has two modes of faiure. Failure to unlink and fsync. If once it fails at the fsync stage, subsequent durable_unlink calls for the same file always fail to unlink with ENOENT. I

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-12-05 Thread Bossart, Nathan
On 12/4/18, 7:35 PM, "Michael Paquier" wrote: > On Tue, Dec 04, 2018 at 08:40:40PM +, Bossart, Nathan wrote: >> Thanks for the updated patch! The code looks good to me, the patch >> applies cleanly and builds without warnings, and it seems to work well >> in my manual tests. I just have a fe

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-12-04 Thread Michael Paquier
On Tue, Dec 04, 2018 at 08:40:40PM +, Bossart, Nathan wrote: > Thanks for the updated patch! The code looks good to me, the patch > applies cleanly and builds without warnings, and it seems to work well > in my manual tests. I just have a few wording suggestions. How are you testing this? I

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-12-04 Thread Bossart, Nathan
On 12/4/18, 1:36 AM, "Michael Paquier" wrote: > Okay, here is an updated patch for this stuff, which does the following: > - Check for a WAL segment if it has a ".ready" status file, an orphaned > status file is removed only on ENOENT. > - If durable_unlink fails, retry 3 times. If there are too

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-12-03 Thread Michael Paquier
On Thu, Nov 29, 2018 at 03:00:42PM +, Bossart, Nathan wrote: > +1 Okay, here is an updated patch for this stuff, which does the following: - Check for a WAL segment if it has a ".ready" status file, an orphaned status file is removed only on ENOENT. - If durable_unlink fails, retry 3 times. I

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-11-29 Thread Bossart, Nathan
On 11/27/18, 3:53 PM, "Michael Paquier" wrote: > On Tue, Nov 27, 2018 at 09:49:29PM +, Bossart, Nathan wrote: >> That sounds good to me. I was actually thinking of using the same >> retry counter that we use for pgarch_archiveXlog(), but on second >> thought, it is probably better to have two

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-11-27 Thread Michael Paquier
On Tue, Nov 27, 2018 at 09:49:29PM +, Bossart, Nathan wrote: > That sounds good to me. I was actually thinking of using the same > retry counter that we use for pgarch_archiveXlog(), but on second > thought, it is probably better to have two independent retry counters > for these two unrelated

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-11-27 Thread Bossart, Nathan
On 11/27/18, 3:20 PM, "Michael Paquier" wrote: > On Tue, Nov 27, 2018 at 08:43:06PM +, Bossart, Nathan wrote: >> IIUC any time that the file does not exist, we will attempt to unlink >> it. Regardless of whether unlinking fails or succeeds, we then >> proceed to give up archiving for now, but

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-11-27 Thread Michael Paquier
On Tue, Nov 27, 2018 at 08:43:06PM +, Bossart, Nathan wrote: > Don't we also need to check that errno is ENOENT here? Yep. > IIUC any time that the file does not exist, we will attempt to unlink > it. Regardless of whether unlinking fails or succeeds, we then > proceed to give up archiving f

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-11-27 Thread Bossart, Nathan
On 11/27/18, 2:46 PM, "Andres Freund" wrote: > On 2018-11-27 20:43:06 +, Bossart, Nathan wrote: >> I don't have exact figures to share, but yes, a huge number of calls >> to sync_file_range() and fsync() can use up a lot of time. Presumably >> Postgres processes files individually instead of

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-11-27 Thread Andres Freund
Hi, On 2018-11-27 20:43:06 +, Bossart, Nathan wrote: > I don't have exact figures to share, but yes, a huge number of calls > to sync_file_range() and fsync() can use up a lot of time. Presumably > Postgres processes files individually instead of using sync() because > sync() may return befor

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-11-27 Thread Bossart, Nathan
On 11/21/18, 10:16 PM, "Michael Paquier" wrote: >> At Fri, 02 Nov 2018 14:47:08 +, Nathan Bossart >> wrote in >> <154117002849.5569.14588306221618961668.p...@coridan.postgresql.org>: >>> Granted, any added delay from this patch is unlikely to be noticeable >>> unless your archiver is way behi

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-11-27 Thread Michael Paquier
On Thu, Nov 22, 2018 at 01:16:09PM +0900, Michael Paquier wrote: > No, pgarch_readyXLog() should still look after .ready files as those are > here for this purpose, but we could have an additional check to see if > the segment linked with it actually exists and can be archived. This > check could

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-11-21 Thread Michael Paquier
On Thu, Nov 15, 2018 at 07:39:27PM +0900, Kyotaro HORIGUCHI wrote: > At Fri, 02 Nov 2018 14:47:08 +, Nathan Bossart > wrote in > <154117002849.5569.14588306221618961668.p...@coridan.postgresql.org>: >> One argument for instead checking WAL file existence before calling >> archive_command might

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-11-15 Thread Kyotaro HORIGUCHI
At Fri, 02 Nov 2018 14:47:08 +, Nathan Bossart wrote in <154117002849.5569.14588306221618961668.p...@coridan.postgresql.org> > One argument for instead checking WAL file existence before calling > archive_command might be to avoid the increased startup time. > Granted, any added delay from th

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-11-02 Thread Nathan Bossart
One argument for instead checking WAL file existence before calling archive_command might be to avoid the increased startup time. Granted, any added delay from this patch is unlikely to be noticeable unless your archiver is way behind and archive_status has a huge number of files. However, I have

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-09-30 Thread Michael Paquier
On Sat, Sep 29, 2018 at 04:58:57PM +0900, Michael Paquier wrote: > Actually, what you are proposing here sounds much better to me. That's > in the area of what has been done recently with RemoveTempXlogFiles() in > 5fc1008e. Any objections to doing something like that? Okay. I have hacked a pa

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-09-29 Thread Michael Paquier
On Fri, Sep 28, 2018 at 07:16:25PM -0400, Stephen Frost wrote: > An alternative would be to go through the .ready files on crash-recovery > and remove any .ready files that don't have corresponding WAL files, or > if we felt that it was necessary, we could do that on every restart but > do we reall

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-09-28 Thread Stephen Frost
Greetings, * Michael Paquier (mich...@paquier.xyz) wrote: > On Fri, Sep 28, 2018 at 02:36:19PM -0400, Stephen Frost wrote: > > Is there an issue with making the archiver able to understand that > > situation instead of being confused by it..? Seems like that'd probably > > be a good thing to do r

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-09-28 Thread Michael Paquier
On Fri, Sep 28, 2018 at 02:36:19PM -0400, Stephen Frost wrote: > Is there an issue with making the archiver able to understand that > situation instead of being confused by it..? Seems like that'd probably > be a good thing to do regardless of this, but that would then remove the > need for this k

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-09-28 Thread Stephen Frost
Greetings, * Michael Paquier (mich...@paquier.xyz) wrote: > While reviewing the archiving code, I have bumped into the fact that > XLogArchiveCleanup() thinks that it is safe to do only a plain unlink() > for .ready and .done files when removing a past segment. I don't think > that it is a smart

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-09-27 Thread Andres Freund
On September 27, 2018 10:23:31 PM PDT, Michael Paquier wrote: >On Thu, Sep 27, 2018 at 08:40:26PM -0700, Andres Freund wrote: >> On 2018-09-28 12:28:27 +0900, Michael Paquier wrote: >>> While reviewing the archiving code, I have bumped into the fact that >>> XLogArchiveCleanup() thinks that it

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-09-27 Thread Michael Paquier
On Thu, Sep 27, 2018 at 08:40:26PM -0700, Andres Freund wrote: > On 2018-09-28 12:28:27 +0900, Michael Paquier wrote: >> While reviewing the archiving code, I have bumped into the fact that >> XLogArchiveCleanup() thinks that it is safe to do only a plain unlink() >> for .ready and .done files when

Re: Use durable_unlink for .ready and .done files for WAL segment removal

2018-09-27 Thread Andres Freund
Hi, On 2018-09-28 12:28:27 +0900, Michael Paquier wrote: > While reviewing the archiving code, I have bumped into the fact that > XLogArchiveCleanup() thinks that it is safe to do only a plain unlink() > for .ready and .done files when removing a past segment. I don't think > that it is a smart m

Use durable_unlink for .ready and .done files for WAL segment removal

2018-09-27 Thread Michael Paquier
Hi all, While reviewing the archiving code, I have bumped into the fact that XLogArchiveCleanup() thinks that it is safe to do only a plain unlink() for .ready and .done files when removing a past segment. I don't think that it is a smart move, as on a subsequent crash we may still see those, but