Re: Map WAL segment files on PMEM as WAL buffers
As discussed in [1], we're taking this opportunity to return some patchsets that don't appear to be getting enough reviewer interest. This is not a rejection, since we don't necessarily think there's anything unacceptable about the entry, but it differs from a standard "Returned with Feedback" in that there's probably not much actionable feedback at all. Rather than code changes, what this patch needs is more community interest. You might - ask people for help with your approach, - see if there are similar patches that your code could supplement, - get interested parties to agree to review your patch in a CF, or - possibly present the functionality in a way that's easier to review overall. (Doing these things is no guarantee that there will be interest, but it's hopefully better than endlessly rebasing a patchset that is not receiving any feedback from the community.) Once you think you've built up some community support and the patchset is ready for review, you (or any interested party) can resurrect the patch entry by visiting https://commitfest.postgresql.org/38/3181/ and changing the status to "Needs Review", and then changing the status again to "Move to next CF". (Don't forget the second step; hopefully we will have streamlined this in the near future!) Thanks, --Jacob [1] https://postgr.es/m/86140760-8ba5-6f3a-3e6e-5ca6c060b...@timescale.com
Re: Map WAL segment files on PMEM as WAL buffers
Hi Andres, Thank you for your report. I rebased and made patchset v9 attached to this email. Note that v9-0009 and v9-0010 are for those who want to pass their own Cirrus CI. Regards, Takashi On Tue, Mar 22, 2022 at 9:44 AM Andres Freund wrote: > > Hi, > > On 2022-01-20 14:55:13 +0900, Takashi Menjo wrote: > > Here is patchset v8. It will have "make check-world" and Cirrus to > > pass. > > This unfortunately does not apply anymore: > http://cfbot.cputube.org/patch_37_3181.log > > Could you rebase? > > - Andres -- Takashi Menjo v9-0003-Add-wal_pmem_map-to-postgresql.conf.sample.patch Description: Binary data v9-0001-Add-with-libpmem-option-for-PMEM-support.patch Description: Binary data v9-0002-Add-wal_pmem_map-to-GUC.patch Description: Binary data v9-0005-Map-WAL-segment-files-on-PMEM-as-WAL-buffers.patch Description: Binary data v9-0004-Export-InstallXLogFileSegment.patch Description: Binary data v9-0006-WAL-statistics-in-cases-of-wal_pmem_map-true.patch Description: Binary data v9-0007-Update-document.patch Description: Binary data v9-0008-Preallocate-and-initialize-more-WAL-if-wal_pmem_m.patch Description: Binary data v9-0009-For-CI-only-Setup-Cirrus-CI-for-with-libpmem.patch Description: Binary data v9-0010-For-CI-only-Modify-initdb-for-wal_pmem_map-on.patch Description: Binary data
Re: Map WAL segment files on PMEM as WAL buffers
Hi, On 2022-01-20 14:55:13 +0900, Takashi Menjo wrote: > Here is patchset v8. It will have "make check-world" and Cirrus to > pass. This unfortunately does not apply anymore: http://cfbot.cputube.org/patch_37_3181.log Could you rebase? - Andres
Re: Map WAL segment files on PMEM as WAL buffers
Hi Justin, Here is patchset v8. It will have "make check-world" and Cirrus to pass. Would you try this one? The v8 squashes some patches in v7 into related ones, and adds the following patches: - v8-0003: Add wal_pmem_map to postgresql.conf.sample. It also helps v8-0011. - v8-0009: Fix wrong handling of missingContrecPtr for test/recovery/t/026 to pass. It is the cause of the error. Thanks for your report. - v8-0010 and v8-0011: Each of the two is for CI only. v8-0010 adds --with-libpmem and v8-0011 enables "wal_pmem_map = on". Please note that, unlike your suggestion, in my patchset PMEM_IS_PMEM_FORCE=1 will be given as an environment variable in .cirrus.yml and "wal_pmem_map = on" will be given by initdb. Regards, Takashi -- Takashi Menjo v8-0001-Add-with-libpmem-option-for-PMEM-support.patch Description: Binary data v8-0002-Add-wal_pmem_map-to-GUC.patch Description: Binary data v8-0005-Map-WAL-segment-files-on-PMEM-as-WAL-buffers.patch Description: Binary data v8-0003-Add-wal_pmem_map-to-postgresql.conf.sample.patch Description: Binary data v8-0004-Export-InstallXLogFileSegment.patch Description: Binary data v8-0007-Update-document.patch Description: Binary data v8-0008-Preallocate-and-initialize-more-WAL-if-wal_pmem_m.patch Description: Binary data v8-0006-WAL-statistics-in-cases-of-wal_pmem_map-true.patch Description: Binary data v8-0009-Fix-wrong-handling-of-missingContrecPtr.patch Description: Binary data v8-0011-For-CI-only-Modify-initdb-for-wal_pmem_map-on.patch Description: Binary data v8-0010-For-CI-only-Setup-Cirrus-CI-for-with-libpmem.patch Description: Binary data
Re: Map WAL segment files on PMEM as WAL buffers
Hi Justin, I can reproduce the error you reported, with PMEM_IS_PMEM_FORCE=1. Moreover, I can reproduce it **on a real PMem device**. So the causes are in my patchset, not in PMem environment. I'll fix it in the next patchset version. Regards, Takashi -- Takashi Menjo
Re: Map WAL segment files on PMEM as WAL buffers
Hi Justin, Thanks for your help. I'm making an additional patch for Cirrus CI. I'm also trying to reproduce the "make check-world" error you reported, on my Linux environment that has neither a real PMem nor an emulated one, with PMEM_IS_PMEM_FORCE=1. I'll keep you updated. Regards, Takashi On Mon, Jan 17, 2022 at 4:34 PM Justin Pryzby wrote: > > On Thu, Jan 06, 2022 at 10:43:37PM -0600, Justin Pryzby wrote: > > On Fri, Jan 07, 2022 at 12:50:01PM +0900, Takashi Menjo wrote: > > > > But in this case it really doesn't work :( > > > > > > > > running bootstrap script ... 2022-01-05 23:17:30.244 CST [12088] FATAL: > > > > file not on PMEM: path "pg_wal/00010001" > > > > > > Do you have a real PMEM device such as NVDIMM-N or Intel Optane PMem? > > > > No - the point is that we'd like to have a way to exercise this patch on the > > cfbot. Particularly the new code introduced by this patch, not just the > > --without-pmem case... > .. > > I think you should add a patch which does what Thomas suggested: 1) add to > > ./.cirrus.yaml installation of the libpmem package for > > debian/bsd/mac/windows; > > 2) add setenv to main(), as above; 3) change configure.ac and guc.c to > > default > > to --with-libpmem and wal_pmem_map=on. This should be the last patch, for > > cfbot only, not meant to be merged. > > I was able to get the cirrus CI to compile on linux and bsd with the below > changes. I don't know if there's an easy package installation for mac OSX. I > think it's okay if mac CI doesn't use --enable-pmem for now. > > > You can test that the package installation part works before mailing > > patches to > > the list with the instructions here: > > > > src/tools/ci/README: > > Enabling cirrus-ci in a github repository.. > > I ran the CI under my own github account. > Linux crashes in the recovery check. > And freebsd has been stuck for 45min. > > I'm not sure, but maybe those are legimate consequence of using > PMEM_IS_PMEM_FORCE (?) If so, maybe the recovery check would need to be > disabled for this patch to run on CI... Or maybe my suggestion to enable it > by > default for CI doesn't work for this patch. It would need to be specially > tested with real hardware. > > https://cirrus-ci.com/task/6245151591890944 > > https://cirrus-ci.com/task/6162551485497344?logs=test_world#L3941 > #2 0x55ff43c6edad in ExceptionalCondition (conditionName=0x55ff43d18108 > "!XLogRecPtrIsInvalid(missingContrecPtr)", errorType=0x55ff43d151c4 > "FailedAssertion", fileName=0x55ff43d151bd "xlog.c", lineNumber=8297) at > assert.c:69 > > commit 15533794e465a381eb23634d67700afa809a0210 > Author: Justin Pryzby > Date: Thu Jan 6 22:53:28 2022 -0600 > > tmp: enable pmem by default, for CI > > diff --git a/.cirrus.yml b/.cirrus.yml > index 677bdf0e65e..0cb961c8103 100644 > --- a/.cirrus.yml > +++ b/.cirrus.yml > @@ -81,6 +81,7 @@ task: > mkdir -m 770 /tmp/cores > chown root:postgres /tmp/cores > sysctl kern.corefile='/tmp/cores/%N.%P.core' > +pkg install -y devel/pmdk > ># NB: Intentionally build without --with-llvm. The freebsd image size is ># already large enough to make VM startup slow, and even without llvm > @@ -99,6 +100,7 @@ task: > --with-lz4 \ > --with-pam \ > --with-perl \ > +--with-libpmem \ > --with-python \ > --with-ssl=openssl \ > --with-tcl --with-tclconfig=/usr/local/lib/tcl8.6/ \ > @@ -138,6 +140,7 @@ LINUX_CONFIGURE_FEATURES: _CONFIGURE_FEATURES >- >--with-lz4 >--with-pam >--with-perl > + --with-libpmem >--with-python >--with-selinux >--with-ssl=openssl > @@ -188,6 +191,9 @@ task: > mkdir -m 770 /tmp/cores > chown root:postgres /tmp/cores > sysctl kernel.core_pattern='/tmp/cores/%e-%s-%p.core' > +echo 'deb http://deb.debian.org/debian bullseye universe' > >>/etc/apt/sources.list > +apt-get update > +apt-get -y install libpmem-dev > >configure_script: | > su postgres <<-EOF > @@ -267,6 +273,7 @@ task: >make \ >openldap \ >openssl \ > + pmem \ >python \ >tcl-tk > > @@ -301,6 +308,7 @@ task: >--with-libxslt \ >--with-lz4 \ >--with-perl \ > + --with-libpmem \ >--with-python \ >--with-ssl=openssl \ >--with-tcl --with-tclconfig=${brewpath}/opt/tcl-tk/lib/ \ > diff --git a/src/backend/main/main.c b/src/backend/main/main.c > index 9124060bde7..b814269675d 100644 > --- a/src/backend/main/main.c > +++ b/src/backend/main/main.c > @@ -69,6 +69,7 @@ main(int argc, char *argv[]) > #endif > > progname = get_progname(argv[0]); > + setenv("PMEM_IS_PMEM_FORCE", "1", 0); > > /* > * Platform-specific startup hacks > diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c > index ffc55f33e86..32d650cb9b2 100644 > --- a/src/backend/utils/misc/guc.c > +++ b/src/backend/utils/misc/guc.c > @@ -1354,7
Re: Map WAL segment files on PMEM as WAL buffers
On Thu, Jan 06, 2022 at 10:43:37PM -0600, Justin Pryzby wrote: > On Fri, Jan 07, 2022 at 12:50:01PM +0900, Takashi Menjo wrote: > > > But in this case it really doesn't work :( > > > > > > running bootstrap script ... 2022-01-05 23:17:30.244 CST [12088] FATAL: > > > file not on PMEM: path "pg_wal/00010001" > > > > Do you have a real PMEM device such as NVDIMM-N or Intel Optane PMem? > > No - the point is that we'd like to have a way to exercise this patch on the > cfbot. Particularly the new code introduced by this patch, not just the > --without-pmem case... .. > I think you should add a patch which does what Thomas suggested: 1) add to > ./.cirrus.yaml installation of the libpmem package for debian/bsd/mac/windows; > 2) add setenv to main(), as above; 3) change configure.ac and guc.c to default > to --with-libpmem and wal_pmem_map=on. This should be the last patch, for > cfbot only, not meant to be merged. I was able to get the cirrus CI to compile on linux and bsd with the below changes. I don't know if there's an easy package installation for mac OSX. I think it's okay if mac CI doesn't use --enable-pmem for now. > You can test that the package installation part works before mailing patches > to > the list with the instructions here: > > src/tools/ci/README: > Enabling cirrus-ci in a github repository.. I ran the CI under my own github account. Linux crashes in the recovery check. And freebsd has been stuck for 45min. I'm not sure, but maybe those are legimate consequence of using PMEM_IS_PMEM_FORCE (?) If so, maybe the recovery check would need to be disabled for this patch to run on CI... Or maybe my suggestion to enable it by default for CI doesn't work for this patch. It would need to be specially tested with real hardware. https://cirrus-ci.com/task/6245151591890944 https://cirrus-ci.com/task/6162551485497344?logs=test_world#L3941 #2 0x55ff43c6edad in ExceptionalCondition (conditionName=0x55ff43d18108 "!XLogRecPtrIsInvalid(missingContrecPtr)", errorType=0x55ff43d151c4 "FailedAssertion", fileName=0x55ff43d151bd "xlog.c", lineNumber=8297) at assert.c:69 commit 15533794e465a381eb23634d67700afa809a0210 Author: Justin Pryzby Date: Thu Jan 6 22:53:28 2022 -0600 tmp: enable pmem by default, for CI diff --git a/.cirrus.yml b/.cirrus.yml index 677bdf0e65e..0cb961c8103 100644 --- a/.cirrus.yml +++ b/.cirrus.yml @@ -81,6 +81,7 @@ task: mkdir -m 770 /tmp/cores chown root:postgres /tmp/cores sysctl kern.corefile='/tmp/cores/%N.%P.core' +pkg install -y devel/pmdk # NB: Intentionally build without --with-llvm. The freebsd image size is # already large enough to make VM startup slow, and even without llvm @@ -99,6 +100,7 @@ task: --with-lz4 \ --with-pam \ --with-perl \ +--with-libpmem \ --with-python \ --with-ssl=openssl \ --with-tcl --with-tclconfig=/usr/local/lib/tcl8.6/ \ @@ -138,6 +140,7 @@ LINUX_CONFIGURE_FEATURES: _CONFIGURE_FEATURES >- --with-lz4 --with-pam --with-perl + --with-libpmem --with-python --with-selinux --with-ssl=openssl @@ -188,6 +191,9 @@ task: mkdir -m 770 /tmp/cores chown root:postgres /tmp/cores sysctl kernel.core_pattern='/tmp/cores/%e-%s-%p.core' +echo 'deb http://deb.debian.org/debian bullseye universe' >>/etc/apt/sources.list +apt-get update +apt-get -y install libpmem-dev configure_script: | su postgres <<-EOF @@ -267,6 +273,7 @@ task: make \ openldap \ openssl \ + pmem \ python \ tcl-tk @@ -301,6 +308,7 @@ task: --with-libxslt \ --with-lz4 \ --with-perl \ + --with-libpmem \ --with-python \ --with-ssl=openssl \ --with-tcl --with-tclconfig=${brewpath}/opt/tcl-tk/lib/ \ diff --git a/src/backend/main/main.c b/src/backend/main/main.c index 9124060bde7..b814269675d 100644 --- a/src/backend/main/main.c +++ b/src/backend/main/main.c @@ -69,6 +69,7 @@ main(int argc, char *argv[]) #endif progname = get_progname(argv[0]); + setenv("PMEM_IS_PMEM_FORCE", "1", 0); /* * Platform-specific startup hacks diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index ffc55f33e86..32d650cb9b2 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -1354,7 +1354,7 @@ static struct config_bool ConfigureNamesBool[] = "traditional volatile ones."), }, _pmem_map, - false, + true, NULL, NULL, NULL }, #endif
Re: Map WAL segment files on PMEM as WAL buffers
On Fri, Jan 07, 2022 at 12:50:01PM +0900, Takashi Menjo wrote: > > But in this case it really doesn't work :( > > > > running bootstrap script ... 2022-01-05 23:17:30.244 CST [12088] FATAL: > > file not on PMEM: path "pg_wal/00010001" > > Do you have a real PMEM device such as NVDIMM-N or Intel Optane PMem? No - the point is that we'd like to have a way to exercise this patch on the cfbot. Particularly the new code introduced by this patch, not just the --without-pmem case... I was able to make this pass "make check" by adding this to main() in src/backend/main/main.c: | setenv("PMEM_IS_PMEM_FORCE", "1", 0); I think you should add a patch which does what Thomas suggested: 1) add to ./.cirrus.yaml installation of the libpmem package for debian/bsd/mac/windows; 2) add setenv to main(), as above; 3) change configure.ac and guc.c to default to --with-libpmem and wal_pmem_map=on. This should be the last patch, for cfbot only, not meant to be merged. You can test that the package installation part works before mailing patches to the list with the instructions here: src/tools/ci/README: Enabling cirrus-ci in a github repository.. > If you don't, you have two alternatives below. Note that neither of > them ensures durability. Each of them is just for testing. > 2. Set the environment variable PMEM_IS_PMEM_FORCE=1 to tell libpmem > to treat any devices as if they were PMEM. The next revision should surely squish all the fixes into their corresponding patches to be fixed. Each of the patches ought to be compile and pass tests without depending on the "following" patches: 0001 without 0002-, 0001-0002 without 0003-, etc. -- Justin
Re: Map WAL segment files on PMEM as WAL buffers
Hi Justin, Thank you for your build test and comments. The v7 patchset attached to this email fixes the issues you reported. > The cfbot showed issues compiling on linux and windows. > http://cfbot.cputube.org/takashi-menjo.html > > https://cirrus-ci.com/task/6125740327436288 > [02:30:06.538] In file included from xlog.c:38: > [02:30:06.538] ../../../../src/include/access/xlogpmem.h:32:42: error: > unknown type name ‘tli’ > [02:30:06.538]32 | PmemXLogEnsurePrevMapped(XLogRecPtr ptr, tli) > [02:30:06.538] | ^~~ > [02:30:06.538] xlog.c: In function ‘GetXLogBuffer’: > [02:30:06.538] xlog.c:1959:19: warning: implicit declaration of function > ‘PmemXLogEnsurePrevMapped’ [-Wimplicit-function-declaration] > [02:30:06.538] 1959 |openLogSegNo = PmemXLogEnsurePrevMapped(endptr, > tli); > > https://cirrus-ci.com/task/6688690280857600?logs=build#L379 > [02:33:25.752] c:\cirrus\src\include\access\xlogpmem.h(33,1): error C2081: > 'tli': name in formal parameter list illegal (compiling source file > src/backend/access/transam/xlog.c) [c:\cirrus\postgres.vcxproj] > > I'm attaching a probable fix. Unfortunately, for patches like this, most of > the functionality isn't exercised unless the library is installed and > compilation and runtime are enabled by default. I got the same error when without --with-libpmem. Your fix looks reasonable. My v7-0008 fixes this error. > In 0009: recaluculated => recalculated v7-0011 fixes this typo. > 0010-Update-document should be squished with 0003-Add-wal_pmem_map-to-GUC (and > maybe 0002 and 0001). I believe the patches after 0005 are more WIP, so it's > fine if they're not squished yet. As you say, the patch updating document should melt into a related fix, probably "Add wal_pmem_map to GUC". For now I want it to be a separate patch (v7-0014). > I'm not sure what the point is of this one: > 0008-Let-wal_pmem_map-be-constant-unl If USE_LIBPMEM is not defined (that is, no --with-libpmem), wal_pmem_map is always false and is never used essentially. Using #if(n)def everywhere is not good for code readability, so I let wal_pmem_map be constant. This may help compilers optimize conditional branches. v7-0005 adds the comment above. > + ereport(ERROR, > + (errcode_for_file_access(), > +errmsg("could not pmem_map_file \"%s\": %m", > path))); > > => The outer parenthesis are not needed since e3a87b4. v7-0009 fixes this. > But in this case it really doesn't work :( > > running bootstrap script ... 2022-01-05 23:17:30.244 CST [12088] FATAL: file > not on PMEM: path "pg_wal/00010001" Do you have a real PMEM device such as NVDIMM-N or Intel Optane PMem? If so, please use a PMEM mounted with Filesystem DAX option for pg_wal, or the FATAL error will occur. If you don't, you have two alternatives below. Note that neither of them ensures durability. Each of them is just for testing. 1. Emulate PMEM with memmap=nn[KMG]!ss[KMG]. This can be used only on Linux. Please see [1][2] for details; or 2. Set the environment variable PMEM_IS_PMEM_FORCE=1 to tell libpmem to treat any devices as if they were PMEM. Regards, Takashi [1] https://www.intel.com/content/www/us/en/developer/articles/training/how-to-emulate-persistent-memory-on-an-intel-architecture-server.html [2] https://nvdimm.wiki.kernel.org/ -- Takashi Menjo v7-0004-Let-wal_pmem_map-be-constant-unless-with-libpmem.patch Description: Binary data v7-0002-Support-build-with-MSVC-on-Windows.patch Description: Binary data v7-0001-Add-with-libpmem-option-for-PMEM-support.patch Description: Binary data v7-0005-Comment-for-constant-wal_pmem_map.patch Description: Binary data v7-0003-Add-wal_pmem_map-to-GUC.patch Description: Binary data v7-0006-Export-InstallXLogFileSegment.patch Description: Binary data v7-0008-Fix-invalid-declaration-of-PmemXLogEnsurePrevMapp.patch Description: Binary data v7-0007-Map-WAL-segment-files-on-PMEM-as-WAL-buffers.patch Description: Binary data v7-0009-Remove-redundant-parentheses-from-ereport-call.patch Description: Binary data v7-0011-Fix-typo-in-comment.patch Description: Binary data v7-0013-WAL-statistics-in-cases-of-wal_pmem_map-true.patch Description: Binary data v7-0012-Compatible-to-Windows.patch Description: Binary data v7-0010-Ensure-WAL-mappings-before-assertion.patch Description: Binary data v7-0014-Update-document.patch Description: Binary data v7-0015-Preallocate-and-initialize-more-WAL-if-wal_pmem_m.patch Description: Binary data
Re: Map WAL segment files on PMEM as WAL buffers
On Thu, Jan 06, 2022 at 05:52:01PM +1300, Thomas Munro wrote: > On Thu, Jan 6, 2022 at 5:00 PM Justin Pryzby wrote: > > I'm attaching a probable fix. Unfortunately, for patches like this, most of > > the functionality isn't exercised unless the library is installed and > > compilation and runtime are enabled by default. > > By the way, you could add a separate patch marked not-for-commit that > adds, say, an apt-get command to the Linux task in the .cirrus.yml > file, and whatever --with-blah stuff you might need to the configure > part, if that'd be useful to test this code. In general, I think that's more or less essential. But in this case it really doesn't work :( running bootstrap script ... 2022-01-05 23:17:30.244 CST [12088] FATAL: file not on PMEM: path "pg_wal/00010001" -- Justin
Re: Map WAL segment files on PMEM as WAL buffers
On Thu, Jan 6, 2022 at 5:00 PM Justin Pryzby wrote: > I'm attaching a probable fix. Unfortunately, for patches like this, most of > the functionality isn't exercised unless the library is installed and > compilation and runtime are enabled by default. By the way, you could add a separate patch marked not-for-commit that adds, say, an apt-get command to the Linux task in the .cirrus.yml file, and whatever --with-blah stuff you might need to the configure part, if that'd be useful to test this code. Eventually, if we wanted to support that permanently for all CI testing, we'd want to push package installation down to the image building scripts (not in the pg source tree) so that CI starts with everything we need pre-installed, but one of the goals of the recent CI work was to make it possible for patches that include dependency changes to be possible (for example the alternative SSL library threads).
Re: Map WAL segment files on PMEM as WAL buffers
ions}->{'--with-pgport'}; $cfg .= " --with-pgport=$port" if defined($port); diff --git a/src/tools/msvc/config_default.pl b/src/tools/msvc/config_default.pl index 460c0375d4b..774730c9a8f 100644 --- a/src/tools/msvc/config_default.pl +++ b/src/tools/msvc/config_default.pl @@ -25,7 +25,8 @@ our $config = { xml => undef,# --with-libxml= xslt => undef,# --with-libxslt= iconv => undef,# (not in configure, path to iconv) - zlib => undef # --with-zlib= + zlib => undef,# --with-zlib= + pmem => undef # --with-libpmem= }; 1; -- 2.17.1 >From 7c2d6665a925dc5615f6ffd999374e10c3ea2199 Mon Sep 17 00:00:00 2001 From: Takashi Menjo Date: Thu, 11 Mar 2021 17:55:53 +0900 Subject: [PATCH 03/13] Add wal_pmem_map to GUC --- src/backend/access/transam/xlog.c | 51 --- src/backend/utils/misc/guc.c | 14 + src/include/access/xlog.h | 1 + 3 files changed, 55 insertions(+), 11 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 87cd05c9454..02f63c31387 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -115,6 +115,7 @@ int CommitSiblings = 5; /* # concurrent xacts needed to sleep */ int wal_retrieve_retry_interval = 5000; int max_slot_wal_keep_size_mb = -1; bool track_wal_io_timing = false; +bool wal_pmem_map = false; #ifdef WAL_DEBUG bool XLOG_DEBUG = false; @@ -5194,13 +5195,28 @@ XLOGShmemSize(void) { Size size; + /* + * If we use WAL segment files as WAL buffers, we don't use the given + * value of wal_buffers. Instead, we set it to the value based on the + * segment size and the page size. This should be done before calculating + * the size of xlblocks array. + */ + if (wal_pmem_map) + { + int npages; + char buf[32]; + + npages = wal_segment_size / XLOG_BLCKSZ; + snprintf(buf, sizeof(buf), "%d", (int) npages); + SetConfigOption("wal_buffers", buf, PGC_POSTMASTER, PGC_S_OVERRIDE); + } /* * If the value of wal_buffers is -1, use the preferred auto-tune value. * This isn't an amazingly clean place to do this, but we must wait till * NBuffers has received its final value, and must do it before using the * value of XLOGbuffers to do anything important. */ - if (XLOGbuffers == -1) + else if (XLOGbuffers == -1) { char buf[32]; @@ -5216,10 +5232,17 @@ XLOGShmemSize(void) size = add_size(size, mul_size(sizeof(WALInsertLockPadded), NUM_XLOGINSERT_LOCKS + 1)); /* xlblocks array */ size = add_size(size, mul_size(sizeof(XLogRecPtr), XLOGbuffers)); - /* extra alignment padding for XLOG I/O buffers */ - size = add_size(size, XLOG_BLCKSZ); - /* and the buffers themselves */ - size = add_size(size, mul_size(XLOG_BLCKSZ, XLOGbuffers)); + + /* + * If we use WAL segment files as WAL buffers, we don't need volatile ones. + */ + if (!wal_pmem_map) + { + /* extra alignment padding for XLOG I/O buffers */ + size = add_size(size, XLOG_BLCKSZ); + /* and the buffers themselves */ + size = add_size(size, mul_size(XLOG_BLCKSZ, XLOGbuffers)); + } /* * Note: we don't count ControlFileData, it comes out of the "slop factor" @@ -5313,13 +5336,19 @@ XLOGShmemInit(void) } /* - * Align the start of the page buffers to a full xlog block size boundary. - * This simplifies some calculations in XLOG insertion. It is also - * required for O_DIRECT. + * If we use WAL segment files as WAL buffers, we don't need volatile ones. */ - allocptr = (char *) TYPEALIGN(XLOG_BLCKSZ, allocptr); - XLogCtl->pages = allocptr; - memset(XLogCtl->pages, 0, (Size) XLOG_BLCKSZ * XLOGbuffers); + if (!wal_pmem_map) + { + /* + * Align the start of the page buffers to a full xlog block size boundary. + * This simplifies some calculations in XLOG insertion. It is also + * required for O_DIRECT. + */ + allocptr = (char *) TYPEALIGN(XLOG_BLCKSZ, allocptr); + XLogCtl->pages = allocptr; + memset(XLogCtl->pages, 0, (Size) XLOG_BLCKSZ * XLOGbuffers); + } /* * Do basic initialization of XLogCtl shared data. (StartupXLOG will fill diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index f9504d3aec4..ee18a9cf338 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -1344,6 +1344,20 @@ static struct config_bool ConfigureNamesBool[] = NULL, NULL, NULL }, +#ifdef USE_LIBPMEM + { + {"wal_pmem_map", PGC_POSTMASTER, WAL_SETTINGS, + gettext_noop("Map WAL segment files on PMEM as WAL buffers."), + gettext_noop("If true, postgres will memory-map WAL segment files " + "on PMEM to use them as WAL buffers instead of the " + "traditional volatile ones."), + }, + _pmem_map, + false, + NULL, NULL, NULL + }, +#endif + { {"log_checkpoints", PGC_SIGHUP, LOGGING_WHAT, gettext_noop("Logs each checkpoint.
Re: Map WAL segment files on PMEM as WAL buffers
Rebased. On Fri, Nov 5, 2021 at 3:47 PM Takashi Menjo wrote: > > Hi Daniel, > > The issue you told has been fixed. I attach the v5 patchset to this email. > > The v5 has all the patches in the v4, and in addition, has the > following two new patches: > > - (v5-0002) Support build with MSVC on Windows: Please have > src\tools\msvc\config.pl as follows to "configure --with-libpmem:" > > $config->{pmem} = 'C:\path\to\pmdk\x64-windows'; > > - (v5-0006) Compatible to Windows: This patch resolves conflicting > mode_t typedefs and libpmem API variants (U or W, like Windows API). > > Best regards, > Takashi > > On Thu, Nov 4, 2021 at 5:46 PM Takashi Menjo wrote: > > > > Hello Daniel, > > > > Thank you for your comment. I had the following error message with > > MSVC on Windows. It looks the same as what you told me. I'll fix it. > > > > | > cd src\tools\msvc > > | > build > > | (..snipped..) > > | Copying pg_config_os.h... > > | Generating configuration headers... > > | undefined symbol: HAVE_LIBPMEM at src/include/pg_config.h line 347 > > at C:/Users/menjo/Documents/git/postgres/src/tools/msvc/Mkvcbuild.pm > > line 860. > > > > Best regards, > > Takashi > > > > > > On Wed, Nov 3, 2021 at 10:04 PM Daniel Gustafsson wrote: > > > > > > > On 28 Oct 2021, at 08:09, Takashi Menjo wrote: > > > > > > > Rebased, and added the patches below into the patchset. > > > > > > Looks like the 0001 patch needs to be updated to support Windows and > > > MSVC. See > > > src/tools/msvc/Mkvcbuild.pm and Solution.pm et.al for inspiration on how > > > to add > > > the MSVC equivalent of --with-libpmem. Currently the patch fails in the > > > "Generating configuration headers" step in Solution.pm. > > > > > > -- > > > Daniel Gustafsson https://vmware.com/ > > > > > > > > > -- > > Takashi Menjo > > > > -- > Takashi Menjo -- Takashi Menjo v6-0003-Add-wal_pmem_map-to-GUC.patch Description: Binary data v6-0001-Add-with-libpmem-option-for-PMEM-support.patch Description: Binary data v6-0004-Export-InstallXLogFileSegment.patch Description: Binary data v6-0002-Support-build-with-MSVC-on-Windows.patch Description: Binary data v6-0005-Map-WAL-segment-files-on-PMEM-as-WAL-buffers.patch Description: Binary data v6-0008-Let-wal_pmem_map-be-constant-unless-with-libpmem.patch Description: Binary data v6-0006-Compatible-to-Windows.patch Description: Binary data v6-0009-Ensure-WAL-mappings-before-assertion.patch Description: Binary data v6-0007-WAL-statistics-in-cases-of-wal_pmem_map-true.patch Description: Binary data v6-0010-Update-document.patch Description: Binary data v6-0011-Preallocate-and-initialize-more-WAL-if-wal_pmem_m.patch Description: Binary data
Re: Map WAL segment files on PMEM as WAL buffers
Hi Daniel, The issue you told has been fixed. I attach the v5 patchset to this email. The v5 has all the patches in the v4, and in addition, has the following two new patches: - (v5-0002) Support build with MSVC on Windows: Please have src\tools\msvc\config.pl as follows to "configure --with-libpmem:" $config->{pmem} = 'C:\path\to\pmdk\x64-windows'; - (v5-0006) Compatible to Windows: This patch resolves conflicting mode_t typedefs and libpmem API variants (U or W, like Windows API). Best regards, Takashi On Thu, Nov 4, 2021 at 5:46 PM Takashi Menjo wrote: > > Hello Daniel, > > Thank you for your comment. I had the following error message with > MSVC on Windows. It looks the same as what you told me. I'll fix it. > > | > cd src\tools\msvc > | > build > | (..snipped..) > | Copying pg_config_os.h... > | Generating configuration headers... > | undefined symbol: HAVE_LIBPMEM at src/include/pg_config.h line 347 > at C:/Users/menjo/Documents/git/postgres/src/tools/msvc/Mkvcbuild.pm > line 860. > > Best regards, > Takashi > > > On Wed, Nov 3, 2021 at 10:04 PM Daniel Gustafsson wrote: > > > > > On 28 Oct 2021, at 08:09, Takashi Menjo wrote: > > > > > Rebased, and added the patches below into the patchset. > > > > Looks like the 0001 patch needs to be updated to support Windows and MSVC. > > See > > src/tools/msvc/Mkvcbuild.pm and Solution.pm et.al for inspiration on how to > > add > > the MSVC equivalent of --with-libpmem. Currently the patch fails in the > > "Generating configuration headers" step in Solution.pm. > > > > -- > > Daniel Gustafsson https://vmware.com/ > > > > > -- > Takashi Menjo -- Takashi Menjo v5-0001-Add-with-libpmem-option-for-PMEM-support.patch Description: Binary data v5-0002-Support-build-with-MSVC-on-Windows.patch Description: Binary data v5-0003-Add-wal_pmem_map-to-GUC.patch Description: Binary data v5-0004-Export-InstallXLogFileSegment.patch Description: Binary data v5-0005-Map-WAL-segment-files-on-PMEM-as-WAL-buffers.patch Description: Binary data v5-0006-Compatible-to-Windows.patch Description: Binary data v5-0007-WAL-statistics-in-cases-of-wal_pmem_map-true.patch Description: Binary data v5-0008-Let-wal_pmem_map-be-constant-unless-with-libpmem.patch Description: Binary data v5-0009-Ensure-WAL-mappings-before-assertion.patch Description: Binary data v5-0010-Update-document.patch Description: Binary data v5-0011-Preallocate-and-initialize-more-WAL-if-wal_pmem_m.patch Description: Binary data
Re: Map WAL segment files on PMEM as WAL buffers
Hello Daniel, Thank you for your comment. I had the following error message with MSVC on Windows. It looks the same as what you told me. I'll fix it. | > cd src\tools\msvc | > build | (..snipped..) | Copying pg_config_os.h... | Generating configuration headers... | undefined symbol: HAVE_LIBPMEM at src/include/pg_config.h line 347 at C:/Users/menjo/Documents/git/postgres/src/tools/msvc/Mkvcbuild.pm line 860. Best regards, Takashi On Wed, Nov 3, 2021 at 10:04 PM Daniel Gustafsson wrote: > > > On 28 Oct 2021, at 08:09, Takashi Menjo wrote: > > > Rebased, and added the patches below into the patchset. > > Looks like the 0001 patch needs to be updated to support Windows and MSVC. > See > src/tools/msvc/Mkvcbuild.pm and Solution.pm et.al for inspiration on how to > add > the MSVC equivalent of --with-libpmem. Currently the patch fails in the > "Generating configuration headers" step in Solution.pm. > > -- > Daniel Gustafsson https://vmware.com/ > -- Takashi Menjo
Re: Map WAL segment files on PMEM as WAL buffers
> On 28 Oct 2021, at 08:09, Takashi Menjo wrote: > Rebased, and added the patches below into the patchset. Looks like the 0001 patch needs to be updated to support Windows and MSVC. See src/tools/msvc/Mkvcbuild.pm and Solution.pm et.al for inspiration on how to add the MSVC equivalent of --with-libpmem. Currently the patch fails in the "Generating configuration headers" step in Solution.pm. -- Daniel Gustafsson https://vmware.com/
Re: Map WAL segment files on PMEM as WAL buffers
Hi, Rebased, and added the patches below into the patchset. - (0006) Let wal_pmem_map be constant unless --with-libpmem wal_pmem_map never changes from false in that case, so let it be constant. Thanks, Matthias! - (0007) Ensure WAL mappings before assertion This fixes SIGSEGV abortion in GetXLogBuffer when --enable-cassert. - (0008) Update document This adds a new entry for wal_pmem_map in the section Write Ahead Log -> Settings. Best regards, Takashi On Fri, Oct 8, 2021 at 5:07 PM Takashi Menjo wrote: > > Hello Matthias, > > Thank you for your comment! > > > > [ v3-0002-Add-wal_pmem_map-to-GUC.patch ] > > > +extern bool wal_pmem_map; > > > > A lot of the new code in these patches is gated behind this one flag, > > but the flag should never be true on !pmem systems. Could you instead > > replace it with something like the following? > > > > +#ifdef USE_LIBPMEM > > +extern bool wal_pmem_map; > > +#else > > +#define wal_pmem_map false > > +#endif > > > > A good compiler would then eliminate all the dead code from being > > generated on non-pmem builds (instead of the compiler needing to keep > > that code around just in case some extension decides to set > > wal_pmem_map to true on !pmem systems because it has access to that > > variable). > > That sounds good. I will introduce it in the next update. > > > > [ v3-0004-Map-WAL-segment-files-on-PMEM-as-WAL-buffers.patch ] > > > +if ((uintptr_t) addr & ~PG_DAX_HUGEPAGE_MASK) > > > +elog(WARNING, > > > + "file not mapped on DAX hugepage boundary: path \"%s\" addr > > > %p", > > > + path, addr); > > > > I'm not sure that we should want to log this every time we detect the > > issue; It's likely that once it happens it will happen for the next > > file as well. Maybe add a timeout, or do we generally not deduplicate > > such messages? > > Let me give it some thought. I have believed this WARNING is most > unlikely to happen, and is mutually independent from other happenings. > I will try to find a case where the WARNING happens repeatedly; or I > will de-duplicate the messages if it is easier. > > Best regards, > Takashi > > -- > Takashi Menjo -- Takashi Menjo v4-0001-Add-with-libpmem-option-for-PMEM-support.patch Description: Binary data v4-0002-Add-wal_pmem_map-to-GUC.patch Description: Binary data v4-0003-Export-InstallXLogFileSegment.patch Description: Binary data v4-0005-WAL-statistics-in-cases-of-wal_pmem_map-true.patch Description: Binary data v4-0004-Map-WAL-segment-files-on-PMEM-as-WAL-buffers.patch Description: Binary data v4-0006-Let-wal_pmem_map-be-constant-unless-with-libpmem.patch Description: Binary data v4-0007-Ensure-WAL-mappings-before-assertion.patch Description: Binary data v4-0008-Update-document.patch Description: Binary data v4-0009-Preallocate-and-initialize-more-WAL-if-wal_pmem_m.patch Description: Binary data
Re: Map WAL segment files on PMEM as WAL buffers
Hello Matthias, Thank you for your comment! > > [ v3-0002-Add-wal_pmem_map-to-GUC.patch ] > > +extern bool wal_pmem_map; > > A lot of the new code in these patches is gated behind this one flag, > but the flag should never be true on !pmem systems. Could you instead > replace it with something like the following? > > +#ifdef USE_LIBPMEM > +extern bool wal_pmem_map; > +#else > +#define wal_pmem_map false > +#endif > > A good compiler would then eliminate all the dead code from being > generated on non-pmem builds (instead of the compiler needing to keep > that code around just in case some extension decides to set > wal_pmem_map to true on !pmem systems because it has access to that > variable). That sounds good. I will introduce it in the next update. > > [ v3-0004-Map-WAL-segment-files-on-PMEM-as-WAL-buffers.patch ] > > +if ((uintptr_t) addr & ~PG_DAX_HUGEPAGE_MASK) > > +elog(WARNING, > > + "file not mapped on DAX hugepage boundary: path \"%s\" addr > > %p", > > + path, addr); > > I'm not sure that we should want to log this every time we detect the > issue; It's likely that once it happens it will happen for the next > file as well. Maybe add a timeout, or do we generally not deduplicate > such messages? Let me give it some thought. I have believed this WARNING is most unlikely to happen, and is mutually independent from other happenings. I will try to find a case where the WARNING happens repeatedly; or I will de-duplicate the messages if it is easier. Best regards, Takashi -- Takashi Menjo
Re: Map WAL segment files on PMEM as WAL buffers
On Wed, 30 Jun 2021 at 06:53, Takashi Menjo wrote: > > Rebased. Thanks for these patches! I recently took a dive into the WAL subsystem, and got to this patchset through the previous ones linked from [0]. This patchset seems straightforward to understand, so thanks! I've looked over the patches and added some comments below. I haven't yet tested this though; finding out how to get PMEM on WSL without actual PMEM is probably going to be difficult. > [ v3-0002-Add-wal_pmem_map-to-GUC.patch ] > +extern bool wal_pmem_map; A lot of the new code in these patches is gated behind this one flag, but the flag should never be true on !pmem systems. Could you instead replace it with something like the following? +#ifdef USE_LIBPMEM +extern bool wal_pmem_map; +#else +#define wal_pmem_map false +#endif A good compiler would then eliminate all the dead code from being generated on non-pmem builds (instead of the compiler needing to keep that code around just in case some extension decides to set wal_pmem_map to true on !pmem systems because it has access to that variable). > [ v3-0004-Map-WAL-segment-files-on-PMEM-as-WAL-buffers.patch ] > +if ((uintptr_t) addr & ~PG_DAX_HUGEPAGE_MASK) > +elog(WARNING, > + "file not mapped on DAX hugepage boundary: path \"%s\" addr %p", > + path, addr); I'm not sure that we should want to log this every time we detect the issue; It's likely that once it happens it will happen for the next file as well. Maybe add a timeout, or do we generally not deduplicate such messages? Kind regards, Matthias [0] https://wiki.postgresql.org/wiki/Persistent_Memory_for_WAL#Basic_performance
Re: Map WAL segment files on PMEM as WAL buffers
Rebased. -- Takashi Menjo v3-0001-Add-with-libpmem-option-for-PMEM-support.patch Description: Binary data v3-0002-Add-wal_pmem_map-to-GUC.patch Description: Binary data v3-0003-Export-InstallXLogFileSegment.patch Description: Binary data v3-0004-Map-WAL-segment-files-on-PMEM-as-WAL-buffers.patch Description: Binary data v3-0005-WAL-statistics-in-cases-of-wal_pmem_map-true.patch Description: Binary data v3-0006-Preallocate-and-initialize-more-WAL-if-wal_pmem_m.patch Description: Binary data