Re: Two fsync related performance issues?

2020-12-02 Thread Craig Ringer
On Wed, 2 Dec 2020 at 15:41, Michael Paquier wrote: > On Tue, Dec 01, 2020 at 07:39:30PM +0800, Craig Ringer wrote: > > On Wed, 14 Oct 2020, 13:06 Michael Paquier, wrote: > >> Yeah, it is safer to assume that it is the responsibility of the > >> backup tool to ensure that because it could be

Re: Two fsync related performance issues?

2020-12-01 Thread Michael Paquier
On Tue, Dec 01, 2020 at 07:39:30PM +0800, Craig Ringer wrote: > On Wed, 14 Oct 2020, 13:06 Michael Paquier, wrote: >> Yeah, it is safer to assume that it is the responsibility of the >> backup tool to ensure that because it could be possible that a host is >> unplugged just after taking a backup,

Re: Two fsync related performance issues?

2020-12-01 Thread Craig Ringer
On Wed, 14 Oct 2020, 13:06 Michael Paquier, wrote: > On Wed, Oct 14, 2020 at 02:48:18PM +1300, Thomas Munro wrote: > > On Wed, Oct 14, 2020 at 12:53 AM Michael Banck > > wrote: > >> One question about this: Did you consider the case of a basebackup being > >> copied/restored somewhere and the

Re: Two fsync related performance issues?

2020-10-14 Thread Michael Banck
Hi, On Wednesday, 14.10.2020 at 14:06 +0900, Michael Paquier wrote: > On Wed, Oct 14, 2020 at 02:48:18PM +1300, Thomas Munro wrote: > > On Wed, Oct 14, 2020 at 12:53 AM Michael Banck > > wrote: > > > One question about this: Did you consider the case of a basebackup being > > > copied/restored

Re: Two fsync related performance issues?

2020-10-13 Thread Michael Paquier
On Wed, Oct 14, 2020 at 02:48:18PM +1300, Thomas Munro wrote: > On Wed, Oct 14, 2020 at 12:53 AM Michael Banck > wrote: >> One question about this: Did you consider the case of a basebackup being >> copied/restored somewhere and the restore/PITR being started? Shouldn't >> Postgres then sync the

Re: Two fsync related performance issues?

2020-10-13 Thread Thomas Munro
On Wed, Oct 14, 2020 at 12:53 AM Michael Banck wrote: > On Wednesday, 07.10.2020 at 18:17 +1300, Thomas Munro wrote: > > ... and for comparison/discussion, here is an alternative patch that > > figures out precisely which files need to be fsync'd using information > > in the WAL. > > One

Re: Two fsync related performance issues?

2020-10-13 Thread Michael Banck
Hi, On Wednesday, 07.10.2020 at 18:17 +1300, Thomas Munro wrote: > ... and for comparison/discussion, here is an alternative patch that > figures out precisely which files need to be fsync'd using information > in the WAL. One question about this: Did you consider the case of a basebackup

Re: Two fsync related performance issues?

2020-10-09 Thread Michael Banck
Hi, On Wednesday, 07.10.2020 at 18:17 +1300, Thomas Munro wrote: > On Mon, Oct 5, 2020 at 2:38 PM Thomas Munro wrote: > > On Wed, Sep 9, 2020 at 3:49 PM Thomas Munro wrote: > > > For the record, Andres Freund mentioned a few problems with this > > > off-list and suggested we consider calling

Re: Two fsync related performance issues?

2020-10-07 Thread Thomas Munro
On Wed, Oct 7, 2020 at 6:17 PM Thomas Munro wrote: > On Mon, Oct 5, 2020 at 2:38 PM Thomas Munro wrote: > > On Wed, Sep 9, 2020 at 3:49 PM Thomas Munro wrote: > > > For the record, Andres Freund mentioned a few problems with this > > > off-list and suggested we consider calling Linux syncfs()

Re: Two fsync related performance issues?

2020-10-06 Thread Thomas Munro
On Mon, Oct 5, 2020 at 2:38 PM Thomas Munro wrote: > On Wed, Sep 9, 2020 at 3:49 PM Thomas Munro wrote: > > For the record, Andres Freund mentioned a few problems with this > > off-list and suggested we consider calling Linux syncfs() for each top > > level directory that could potentially be on

Re: Two fsync related performance issues?

2020-10-04 Thread Thomas Munro
On Wed, Sep 9, 2020 at 3:49 PM Thomas Munro wrote: > On Thu, Sep 3, 2020 at 11:30 AM Thomas Munro wrote: > > On Wed, May 27, 2020 at 12:31 AM Craig Ringer wrote: > > > On Tue, 12 May 2020, 08:42 Paul Guo, wrote: > > >> 1. StartupXLOG() does fsync on the whole data directory early in the > >

Re: Two fsync related performance issues?

2020-09-10 Thread Thomas Munro
On Thu, Sep 3, 2020 at 12:09 PM Thomas Munro wrote: > On Tue, May 12, 2020 at 12:43 PM Paul Guo wrote: > > RecreateTwoPhaseFile(gxact->xid, buf, len); > I hadn't previously focused on this second part of your email. I > think the fsync() call in RecreateTwoPhaseFile() might be a candidate

Re: Two fsync related performance issues?

2020-09-08 Thread Thomas Munro
On Thu, Sep 3, 2020 at 11:30 AM Thomas Munro wrote: > On Wed, May 27, 2020 at 12:31 AM Craig Ringer wrote: > > On Tue, 12 May 2020, 08:42 Paul Guo, wrote: > >> 1. StartupXLOG() does fsync on the whole data directory early in the crash > >> recovery. I'm wondering if we could skip some

Re: Two fsync related performance issues?

2020-09-02 Thread Thomas Munro
On Tue, May 12, 2020 at 12:43 PM Paul Guo wrote: > 2. CheckPointTwoPhase() > > This may be a small issue. > > See the code below, > > for (i = 0; i < TwoPhaseState->numPrepXacts; i++) > RecreateTwoPhaseFile(gxact->xid, buf, len); > > RecreateTwoPhaseFile() writes a state file for a prepared

Re: Two fsync related performance issues?

2020-09-02 Thread Thomas Munro
On Wed, May 27, 2020 at 12:31 AM Craig Ringer wrote: > On Tue, 12 May 2020, 08:42 Paul Guo, wrote: >> 1. StartupXLOG() does fsync on the whole data directory early in the crash >> recovery. I'm wondering if we could skip some directories (at least the >> pg_log/, table directories) since wal,

Re: Two fsync related performance issues?

2020-05-26 Thread Craig Ringer
On Tue, 12 May 2020, 08:42 Paul Guo, wrote: > Hello hackers, > > 1. StartupXLOG() does fsync on the whole data directory early in the crash > recovery. I'm wondering if we could skip some directories (at least the > pg_log/, table directories) since wal, etc could ensure consistency. Here > is

Re: Two fsync related performance issues?

2020-05-20 Thread Robert Haas
On Tue, May 19, 2020 at 4:31 PM Thomas Munro wrote: > What would a precise version of this look like? Maybe we really only > need to fsync relation files that recovery modifies (as we already > do), plus those that it would have touched but didn't because of the > page LSN (a new behaviour to

Re: Two fsync related performance issues?

2020-05-19 Thread Thomas Munro
On Wed, May 20, 2020 at 12:51 AM Robert Haas wrote: > On Mon, May 11, 2020 at 8:43 PM Paul Guo wrote: > > I have this concern since I saw an issue in a real product environment that > > the startup process needs 10+ seconds to start wal replay after relaunch > > due to elog(PANIC) (it was seen

Re: Two fsync related performance issues?

2020-05-19 Thread Robert Haas
On Mon, May 11, 2020 at 8:43 PM Paul Guo wrote: > I have this concern since I saw an issue in a real product environment that > the startup process needs 10+ seconds to start wal replay after relaunch due > to elog(PANIC) (it was seen on postgres based product Greenplum but it is a > common

Re: Two fsync related performance issues?

2020-05-18 Thread Tom Lane
Paul Guo writes: > table directories & wal fsync probably dominates the fsync time. Do we > know any possible real scenario that requires table directory fsync? Yes, there are filesystems where that's absolutely required. See past discussions that led to putting in those fsyncs (we did not

Re: Two fsync related performance issues?

2020-05-18 Thread Paul Guo
Thanks for the replies. On Tue, May 12, 2020 at 2:04 PM Michael Paquier wrote: > On Tue, May 12, 2020 at 12:55:37PM +0900, Fujii Masao wrote: > > On 2020/05/12 9:42, Paul Guo wrote: > >> 1. StartupXLOG() does fsync on the whole data directory early in > >> the crash recovery. I'm wondering if

Re: Two fsync related performance issues?

2020-05-12 Thread Michael Paquier
On Tue, May 12, 2020 at 12:55:37PM +0900, Fujii Masao wrote: > On 2020/05/12 9:42, Paul Guo wrote: >> 1. StartupXLOG() does fsync on the whole data directory early in >> the crash recovery. I'm wondering if we could skip some >> directories (at least the pg_log/, table directories) since wal, >>

Re: Two fsync related performance issues?

2020-05-11 Thread Fujii Masao
On 2020/05/12 9:42, Paul Guo wrote: Hello hackers, 1. StartupXLOG() does fsync on the whole data directory early in the crash recovery. I'm wondering if we could skip some directories (at least the pg_log/, table directories) since wal, etc could ensure consistency. I agree that we can

Two fsync related performance issues?

2020-05-11 Thread Paul Guo
Hello hackers, 1. StartupXLOG() does fsync on the whole data directory early in the crash recovery. I'm wondering if we could skip some directories (at least the pg_log/, table directories) since wal, etc could ensure consistency. Here is the related code. if (ControlFile->state !=