RE: [PoC] Non-volatile WAL buffer

2021-03-08 Thread tsunakawa.ta...@fujitsu.com
From: Takashi Menjo > > The other question is whether simply placing WAL on DAX (without any > > code changes) is safe. If it's not, then all the "speedups" are > > computed with respect to unsafe configuration and so are useless. And > > BTT should be used instead, which would of course produce

Re: [PoC] Non-volatile WAL buffer

2021-03-08 Thread Takashi Menjo
Hi Tomas, > Hello Takashi-san, > > On 3/5/21 9:08 AM, Takashi Menjo wrote: > > Hi Tomas, > > > > Thank you so much for your report. I have read it with great interest. > > > > Your conclusion sounds reasonable to me. My patchset you call "NTT / > > segments" got as good performance as "NTT /

Re: [PoC] Non-volatile WAL buffer

2021-03-05 Thread Tomas Vondra
Hello Takashi-san, On 3/5/21 9:08 AM, Takashi Menjo wrote: > Hi Tomas, > > Thank you so much for your report. I have read it with great interest. > > Your conclusion sounds reasonable to me. My patchset you call "NTT / > segments" got as good performance as "NTT / buffer" patchset. I have >

Re: [PoC] Non-volatile WAL buffer

2021-03-05 Thread Takashi Menjo
Hi Tomas, Thank you so much for your report. I have read it with great interest. Your conclusion sounds reasonable to me. My patchset you call "NTT / segments" got as good performance as "NTT / buffer" patchset. I have been worried that calling mmap/munmap for each WAL segment file could have a

Re: [PoC] Non-volatile WAL buffer

2021-02-28 Thread Takashi Menjo
Hi Sawada, I am relieved to hear that the performance problem was solved. And I added a tip about PMEM namespace and partitioning in PG wiki[1]. Regards, [1] https://wiki.postgresql.org/wiki/Persistent_Memory_for_WAL#Configure_and_verify_DAX_hugepage_faults -- Takashi Menjo

Re: [PoC] Non-volatile WAL buffer

2021-02-24 Thread Masahiko Sawada
On Sat, Feb 13, 2021 at 12:18 PM Masahiko Sawada wrote: > > On Thu, Jan 28, 2021 at 1:41 AM Tomas Vondra > wrote: > > > > On 1/25/21 3:56 AM, Masahiko Sawada wrote: > > >> > > >> ... > > >> > > >> On 1/21/21 3:17 AM, Masahiko Sawada wrote: > > >>> ... > > >>> > > >>> While looking at the two

Re: [PoC] Non-volatile WAL buffer

2021-02-23 Thread Takashi Menjo
Hi, I ran a performance test in another environment. The steps, setup, and postgresql.conf of the test are the same as the ones sent by me on Feb 17 [1], except the following items: # Setup - Distro: Red Hat Enterprise Linux release 8.2 (Ootpa) - C compiler: gcc-8.3.1-5.el8.x86_64 - libc:

Re: [PoC] Non-volatile WAL buffer

2021-02-19 Thread Konstantin Knizhnik
Thank you for your feedback. On 19.02.2021 6:25, Tomas Vondra wrote: On 1/22/21 5:04 PM, Konstantin Knizhnik wrote: ... I have heard from several DBMS experts that appearance of huge and cheap non-volatile memory can make a revolution in database system architecture. If all database can fit

Re: [PoC] Non-volatile WAL buffer

2021-02-18 Thread Tomas Vondra
On 1/22/21 5:04 PM, Konstantin Knizhnik wrote: > ... > > I have heard from several DBMS experts that appearance of huge and > cheap non-volatile memory can make a revolution in database system > architecture. If all database can fit in non-volatile memory, then we > do not need buffers, WAL, ...>

Re: [PoC] Non-volatile WAL buffer

2021-02-17 Thread Takashi Menjo
Hi Sawada, Thank you for your performance report. First, I'd say that the latest v5 non-volatile WAL buffer patchset looks not bad itself. I made a performance test for the v5 and got better performance than the original (non-patched) one and our previous work. See the attached figure for

Re: [PoC] Non-volatile WAL buffer

2021-02-16 Thread Takashi Menjo
Hi Takayuki, Thank you for your helpful comments. In "Allocates WAL buffers on shared buffers", "shared buffers" should be > DRAM because shared buffers in Postgres means the buffer cache for database > data. > That's true. Fixed. > I haven't tracked the whole thread, but could you collect

RE: [PoC] Non-volatile WAL buffer

2021-02-16 Thread tsunakawa.ta...@fujitsu.com
From: Takashi Menjo > I made a new page at PostgreSQL Wiki to gather and summarize information and > discussion about PMEM-backed WAL designs and implementations. Some parts of > the page are TBD. I will continue to maintain the page. Requests are welcome. > > Persistent Memory for WAL >

Re: [PoC] Non-volatile WAL buffer

2021-02-15 Thread Takashi Menjo
Hi, I made a new page at PostgreSQL Wiki to gather and summarize information and discussion about PMEM-backed WAL designs and implementations. Some parts of the page are TBD. I will continue to maintain the page. Requests are welcome. Persistent Memory for WAL

RE: [PoC] Non-volatile WAL buffer

2021-02-14 Thread tsunakawa.ta...@fujitsu.com
From: Masahiko Sawada > I've done some performance benchmarks with the master and NTT v4 > patch. Let me share the results. > ... > master NTT master-unlogged > 32 113209 67107 154298 > 64 144880 54289 178883 > 96 151405 50562 180018 > > "master-unlogged" is
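To make the gap in the quoted table easier to read, here is a small sketch that divides each column by the master column. The figures are copied from the quoted message; the meaning of the first column (presumably a client count) and the unit of the figures are not stated in this excerpt, so they are left unlabeled. The names `RESULTS` and `relative_to_master` are hypothetical helpers for illustration only.

```python
# Figures copied from the quoted benchmark table. The first column's meaning
# is an assumption (likely a client or connection count); the excerpt does
# not label it, nor does it name the unit of the figures.
RESULTS = {
    32: {"master": 113209, "NTT": 67107, "master-unlogged": 154298},
    64: {"master": 144880, "NTT": 54289, "master-unlogged": 178883},
    96: {"master": 151405, "NTT": 50562, "master-unlogged": 180018},
}


def relative_to_master(results):
    """Return each configuration's figure divided by the master figure."""
    return {
        row: {name: value / cols["master"] for name, value in cols.items()}
        for row, cols in results.items()
    }


if __name__ == "__main__":
    for row, ratios in relative_to_master(RESULTS).items():
        print(row, {name: round(ratio, 2) for name, ratio in ratios.items()})
```

In these numbers the NTT figure is roughly 0.59, 0.37, and 0.33 of the master figure for the three rows, which is the regression the message goes on to discuss.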

Re: [PoC] Non-volatile WAL buffer

2021-02-12 Thread Masahiko Sawada
On Thu, Jan 28, 2021 at 1:41 AM Tomas Vondra wrote: > > On 1/25/21 3:56 AM, Masahiko Sawada wrote: > >> > >> ... > >> > >> On 1/21/21 3:17 AM, Masahiko Sawada wrote: > >>> ... > >>> > >>> While looking at the two methods: NTT and simple-no-buffer, I realized > >>> that in XLogFlush(), NTT patch

Re: [PoC] Non-volatile WAL buffer

2021-01-29 Thread Takashi Menjo
Hi Tomas, I'd answer your questions. (Not all for now, sorry.) > Do I understand correctly that the patch removes "regular" WAL buffers and instead writes the data into the non-volatile PMEM buffer, without writing that to the WAL segments at all (unless in archiving mode)? > Firstly, I guess

RE: [PoC] Non-volatile WAL buffer

2021-01-27 Thread tsunakawa.ta...@fujitsu.com
From: Tomas Vondra > (c) As mentioned before, PMEM behaves differently with concurrent > access, i.e. it reaches peak throughput with relatively low number of > threads writing data, and then the throughput drops quite quickly. I'm > not sure if the same thing applies to pmem_drain() too - if it

Re: [PoC] Non-volatile WAL buffer

2021-01-27 Thread Tomas Vondra
On 1/25/21 3:56 AM, Masahiko Sawada wrote: >> >> ... >> >> On 1/21/21 3:17 AM, Masahiko Sawada wrote: >>> ... >>> >>> While looking at the two methods: NTT and simple-no-buffer, I realized >>> that in XLogFlush(), NTT patch flushes (by pmem_flush() and >>> pmem_drain()) WAL without acquiring

Re: [PoC] Non-volatile WAL buffer

2021-01-27 Thread Takashi Menjo
Hi, Now I have caught up with this thread. I see that many of you are interested in performance profiling. I share my slides in SNIA SDC 2020 [1]. In the slides, I had profiles focused on XLogInsert and XLogFlush (mainly the latter) for my non-volatile WAL buffer patchset. I found that the time

Re: [PoC] Non-volatile WAL buffer

2021-01-26 Thread Takashi Menjo
Dear everyone, Tomas, First of all, the "v4" patchset for non-volatile WAL buffer attached to the previous mail is actually v5... Please read "v4" as "v5." Then, to Tomas: Thank you for your crash report you gave on Nov 27, 2020, regarding msync patchset. I applied the latest msync patchset v3

Re: [PoC] Non-volatile WAL buffer

2021-01-26 Thread Takashi Menjo
Dear everyone, I'm sorry for the late reply. I rebase my two patchsets onto the latest master 411ae64. The one patchset prefixed with v4 is for non-volatile WAL buffer; the other prefixed with v3 is for msync. I will reply to your thankful feedbacks one by one within days. Please wait for a

Re: [PoC] Non-volatile WAL buffer

2021-01-24 Thread Masahiko Sawada
On Fri, Jan 22, 2021 at 11:32 AM Tomas Vondra wrote: > > > > On 1/21/21 3:17 AM, Masahiko Sawada wrote: > > On Thu, Jan 7, 2021 at 2:16 AM Tomas Vondra > > wrote: > >> > >> Hi, > >> > >> I think I've managed to get the 0002 patch [1] rebased to master and > >> working (with help from Masahiko

Re: [PoC] Non-volatile WAL buffer

2021-01-22 Thread Konstantin Knizhnik
On 22.01.2021 5:32, Tomas Vondra wrote: On 1/21/21 3:17 AM, Masahiko Sawada wrote: On Thu, Jan 7, 2021 at 2:16 AM Tomas Vondra wrote: Hi, I think I've managed to get the 0002 patch [1] rebased to master and working (with help from Masahiko Sawada). It's not clear to me how it could

Re: [PoC] Non-volatile WAL buffer

2021-01-21 Thread Tomas Vondra
On 1/21/21 3:17 AM, Masahiko Sawada wrote: On Thu, Jan 7, 2021 at 2:16 AM Tomas Vondra wrote: Hi, I think I've managed to get the 0002 patch [1] rebased to master and working (with help from Masahiko Sawada). It's not clear to me how it could have worked as submitted - my theory is that

Re: [PoC] Non-volatile WAL buffer

2021-01-20 Thread Masahiko Sawada
On Thu, Jan 7, 2021 at 2:16 AM Tomas Vondra wrote: > > Hi, > > I think I've managed to get the 0002 patch [1] rebased to master and > working (with help from Masahiko Sawada). It's not clear to me how it > could have worked as submitted - my theory is that an incomplete patch > was submitted by

Re: [PoC] Non-volatile WAL buffer

2020-11-27 Thread Tomas Vondra
On 11/27/20 1:02 AM, Tomas Vondra wrote: > > Unfortunately, that patch seems to fail for me :-( > > The patches seem to be for PG12, so I applied them on REL_12_STABLE (all > the parts 0001-0005) and then I did this: > > LIBS="-lpmem" ./configure --prefix=/home/tomas/pg-12-pmem --enable-debug >

Re: [PoC] Non-volatile WAL buffer

2020-11-26 Thread Tomas Vondra
On 11/26/20 10:19 PM, Tomas Vondra wrote: > > > On 11/26/20 9:59 PM, Heikki Linnakangas wrote: >> On 26/11/2020 21:27, Tomas Vondra wrote: >>> Hi, >>> >>> Here's the "simple patch" that I'm currently experimenting with. It >>> essentially replaces open/close/write/fsync with pmem calls >>>

Re: [PoC] Non-volatile WAL buffer

2020-11-26 Thread Tomas Vondra
On 11/26/20 9:59 PM, Heikki Linnakangas wrote: > On 26/11/2020 21:27, Tomas Vondra wrote: >> Hi, >> >> Here's the "simple patch" that I'm currently experimenting with. It >> essentially replaces open/close/write/fsync with pmem calls >> (map/unmap/memcpy/persist variants), and it's by no means

Re: [PoC] Non-volatile WAL buffer

2020-11-26 Thread Tomas Vondra
Hi, Here's the "simple patch" that I'm currently experimenting with. It essentially replaces open/close/write/fsync with pmem calls (map/unmap/memcpy/persist variants), and it's by no means committable. But it works well enough for experiments / measurements, etc. The numbers (5-minute pgbench
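The substitution described above (open/close/write/fsync replaced by map/unmap/memcpy/persist) can be illustrated with a minimal sketch. This is not the patch and does not use libpmem: it uses Python's standard `mmap` module on an ordinary file as a stand-in, so the flush step is an `msync()`-style call and not a PMEM persist. All function names (`make_segment`, `append_with_write_fsync`, `append_with_mmap`, `demo`) are hypothetical, and the sketch assumes a Unix-like system.

```python
import mmap
import os
import tempfile


def make_segment(path, size):
    """Create a zero-filled, fixed-size file standing in for a WAL segment."""
    with open(path, "wb") as f:
        f.write(b"\0" * size)


def append_with_write_fsync(path, offset, record):
    """Conventional path: open, write at an offset, fsync, close."""
    fd = os.open(path, os.O_RDWR)
    try:
        os.pwrite(fd, record, offset)
        os.fsync(fd)
    finally:
        os.close(fd)


def append_with_mmap(path, offset, record):
    """Mapped path: map the file, copy bytes in, flush, unmap."""
    fd = os.open(path, os.O_RDWR)
    try:
        size = os.fstat(fd).st_size
        # On Unix the mapping is shared and read/write by default.
        with mmap.mmap(fd, size) as mapped:
            mapped[offset:offset + len(record)] = record  # the "memcpy" step
            mapped.flush()  # msync() here; a stand-in for a PMEM persist
    finally:
        os.close(fd)


def demo():
    """Write one record through each path and read both back."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "segment")
        make_segment(path, 8192)
        append_with_write_fsync(path, 0, b"rec-A")
        append_with_mmap(path, 4096, b"rec-B")
        with open(path, "rb") as f:
            data = f.read()
        return data[0:5], data[4096:4101]
```

Both paths leave the same bytes in the file; what the thread measures is how much each path costs, which this sketch does not attempt to show.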

Re: [PoC] Non-volatile WAL buffer

2020-11-26 Thread Heikki Linnakangas
On 26/11/2020 21:27, Tomas Vondra wrote: Hi, Here's the "simple patch" that I'm currently experimenting with. It essentially replaces open/close/write/fsync with pmem calls (map/unmap/memcpy/persist variants), and it's by no means committable. But it works well enough for experiments /

Re: [PoC] Non-volatile WAL buffer

2020-11-24 Thread Tomas Vondra
On 11/25/20 2:10 AM, Ashwin Agrawal wrote: > On Sun, Nov 22, 2020 at 5:23 PM Tomas Vondra > wrote: > >> I'm not entirely sure whether the "pmemdax" (i.e. unpatched instance >> with WAL on PMEM DAX device) is actually safe, but I included it anyway >> to see what difference is. > > I am curious

Re: [PoC] Non-volatile WAL buffer

2020-11-24 Thread Tomas Vondra
On 11/25/20 1:27 AM, tsunakawa.ta...@fujitsu.com wrote: > From: Tomas Vondra >> It's interesting that they only place the tail of the log on PMEM, >> i.e. the PMEM buffer has limited size, and the rest of the log is >> not on PMEM. It's a bit as if we inserted a PMEM buffer between our >> wal

Re: [PoC] Non-volatile WAL buffer

2020-11-24 Thread Ashwin Agrawal
On Sun, Nov 22, 2020 at 5:23 PM Tomas Vondra wrote: > I'm not entirely sure whether the "pmemdax" (i.e. unpatched instance > with WAL on PMEM DAX device) is actually safe, but I included it anyway > to see what difference is. I am curious to learn more on this aspect. Kernels have provided

Re: [PoC] Non-volatile WAL buffer

2020-11-24 Thread Tomas Vondra
On 11/24/20 7:34 AM, tsunakawa.ta...@fujitsu.com wrote: > From: Tomas Vondra >> So I wonder if using PMEM for the WAL buffer is the right way forward. >> AFAIK the WAL buffer is quite concurrent (multiple clients writing >> data), which seems to contradict the PMEM vs. DRAM trade-offs. >> >>

RE: [PoC] Non-volatile WAL buffer

2020-11-23 Thread tsunakawa.ta...@fujitsu.com
From: Tomas Vondra > So I wonder if using PMEM for the WAL buffer is the right way forward. > AFAIK the WAL buffer is quite concurrent (multiple clients writing > data), which seems to contradict the PMEM vs. DRAM trade-offs. > > The design I've originally expected would look more like this > >

Re: [PoC] Non-volatile WAL buffer

2020-11-23 Thread Tomas Vondra
Hi, On 11/23/20 3:01 AM, Tomas Vondra wrote: > Hi, > > On 10/30/20 6:57 AM, Takashi Menjo wrote: >> Hi Heikki, >> >>> I had a new look at this thread today, trying to figure out where >>> we are. >> >> I'm a bit confused. >>> >>> One thing we have established: mmap()ing WAL files performs worse

Re: [PoC] Non-volatile WAL buffer

2020-11-22 Thread Tomas Vondra
Hi, On 10/30/20 6:57 AM, Takashi Menjo wrote: > Hi Heikki, > >> I had a new look at this thread today, trying to figure out where >> we are. > > I'm a bit confused. >> >> One thing we have established: mmap()ing WAL files performs worse >> than the current method, if pg_wal is not on a

Re: [PoC] Non-volatile WAL buffer

2020-11-22 Thread Tomas Vondra
Hi, These patches no longer apply :-( A rebased version would be nice. I've been interested in what performance improvements this might bring, so I've been running some extensive benchmarks on a machine with PMEM hardware. So let me share some interesting results. (I used commit from early

Re: [PoC] Non-volatile WAL buffer

2020-11-05 Thread Takashi Menjo
Hi Gang, I appreciate your patience. I reproduced the results you reported to me, on my environment. First of all, the condition you gave to me was a little unstable on my environment, so I made the values of {max_,min_,nv}wal_size larger and the pre-warm duration longer to get stable

Re: [PoC] Non-volatile WAL buffer

2020-10-29 Thread Takashi Menjo
Hi Heikki, > I had a new look at this thread today, trying to figure out where we are. I'm a bit confused. > > One thing we have established: mmap()ing WAL files performs worse than the current method, if pg_wal is not on > a persistent memory device. This is because the kernel faults in existing

Re: [PoC] Non-volatile WAL buffer

2020-10-26 Thread Heikki Linnakangas
I had a new look at this thread today, trying to figure out where we are. I'm a bit confused. One thing we have established: mmap()ing WAL files performs worse than the current method, if pg_wal is not on a persistent memory device. This is because the kernel faults in existing content of

RE: [PoC] Non-volatile WAL buffer

2020-10-14 Thread Takashi Menjo
, Takashi -- Takashi Menjo NTT Software Innovation Center > -Original Message- > From: Deng, Gang > Sent: Friday, October 9, 2020 3:10 PM > To: Takashi Menjo > Cc: pgsql-hack...@postgresql.org; 'Takashi Menjo' > Subject: RE: [PoC] Non-volatile WAL buff

RE: [PoC] Non-volatile WAL buffer

2020-10-09 Thread Deng, Gang
Original Message- From: Takashi Menjo Sent: Tuesday, October 6, 2020 4:49 PM To: Deng, Gang Cc: pgsql-hack...@postgresql.org; 'Takashi Menjo' Subject: RE: [PoC] Non-volatile WAL buffer Hi Gang, I have tried to, but cannot yet reproduce the performance degradation you reported when inserting 32

RE: [PoC] Non-volatile WAL buffer

2020-10-06 Thread Takashi Menjo
v4 -- Takashi Menjo NTT Software Innovation Center > -Original Message- > From: Takashi Menjo > Sent: Thursday, September 24, 2020 2:38 AM > To: Deng, Gang > Cc: pgsql-hack...@postgresql.org; Takashi Menjo > > Subject: Re: [PoC] Non-volatile WAL buffer >

Re: [PoC] Non-volatile WAL buffer

2020-09-23 Thread Takashi Menjo
> Throughput (10^3 TPS) > 13.0 16.9 > > CPU Time % of CopyXlogRecordToWAL > 3.0 1.6 > > CPU Time % of XLogInsertRecord > 23.0 16.4 > > CPU Time % of XLogFlush >

RE: [PoC] Non-volatile WAL buffer

2020-09-20 Thread Deng, Gang
2.3 5.9 Best Regards, Gang From: Takashi Menjo Sent: Thursday, September 10, 2020 4:01 PM To: Takashi Menjo Cc: pgsql-hack...@postgresql.org Subject: Re: [PoC] Non-volatile WAL buffer Rebased. On Wed, Jun 24, 2020 at 16:44 Takashi Menjo

RE: [PoC] Non-volatile WAL buffer

2020-06-24 Thread Takashi Menjo
angas' > ; 'Amit Langote' > > Subject: RE: [PoC] Non-volatile WAL buffer > > Dear hackers, > > I rebased my non-volatile WAL buffer's patchset onto master. A new v2 > patchset is attached to this mail. > > I also measured performance before and after patchset,

RE: [PoC] Non-volatile WAL buffer

2020-02-20 Thread Takashi Menjo
-Original Message- > From: Amit Langote > Sent: Monday, February 17, 2020 5:21 PM > To: Takashi Menjo > Cc: Robert Haas ; Heikki Linnakangas > ; PostgreSQL-development > > Subject: Re: [PoC] Non-volatile WAL buffer > > Hello, > > On Mon, Feb 17, 2020

Re: [PoC] Non-volatile WAL buffer

2020-02-19 Thread Andres Freund
Hi, On 2020-02-17 13:12:37 +0900, Takashi Menjo wrote: > I applied my patchset that mmap()-s WAL segments as WAL buffers to > refs/tags/REL_12_0, and measured and analyzed its performance with > pgbench. Roughly speaking, When I used *SSD and ext4* to store WAL, > it was "obviously worse" than

Re: [PoC] Non-volatile WAL buffer

2020-02-17 Thread Amit Langote
Hello, On Mon, Feb 17, 2020 at 4:16 PM Takashi Menjo wrote: > Hello Amit, > > > I apologize for not having any opinion on the patches themselves, but let > > me point out that it's better to base these > > patches on HEAD (master branch) than REL_12_0, because all new code is > > committed to

RE: [PoC] Non-volatile WAL buffer

2020-02-16 Thread Takashi Menjo
rt Haas ; Heikki Linnakangas > ; PostgreSQL-development > > Subject: Re: [PoC] Non-volatile WAL buffer > > Menjo-san, > > On Mon, Feb 17, 2020 at 1:13 PM Takashi Menjo > wrote: > > I applied my patchset that mmap()-s WAL segments as WAL buffers to > > refs/tags/R

Re: [PoC] Non-volatile WAL buffer

2020-02-16 Thread Amit Langote
Menjo-san, On Mon, Feb 17, 2020 at 1:13 PM Takashi Menjo wrote: > I applied my patchset that mmap()-s WAL segments as WAL buffers to > refs/tags/REL_12_0, and measured and analyzed its performance with pgbench. > Roughly speaking, When I used *SSD and ext4* to store WAL, it was "obviously >

RE: [PoC] Non-volatile WAL buffer

2020-02-16 Thread Takashi Menjo
day, February 10, 2020 6:30 PM > To: 'Robert Haas' ; 'Heikki Linnakangas' > > Cc: 'pgsql-hack...@postgresql.org' > Subject: RE: [PoC] Non-volatile WAL buffer > > Dear hackers, > > I made another WIP patchset to mmap WAL segments as WAL buffers. Note that > this is

RE: [PoC] Non-volatile WAL buffer

2020-02-10 Thread Takashi Menjo
ect: Re: [PoC] Non-volatile WAL buffer > > On Tue, Jan 28, 2020 at 3:28 AM Takashi Menjo > wrote: > > I think our concerns are roughly classified into two: > > > > (1) Performance > > (2) Consistency > > > > And your "different concern" is rat

Re: [PoC] Non-volatile WAL buffer

2020-02-03 Thread Andres Freund
Hi, On 2020-01-27 13:54:38 -0500, Robert Haas wrote: > On Mon, Jan 27, 2020 at 2:01 AM Takashi Menjo > wrote: > > It sounds reasonable, but I'm sorry that I haven't tested such a program > > yet. I'll try it to compare with my non-volatile WAL buffer. For now, I'm > > a little worried about

Re: [PoC] Non-volatile WAL buffer

2020-01-28 Thread Robert Haas
On Tue, Jan 28, 2020 at 3:28 AM Takashi Menjo wrote: > I think our concerns are roughly classified into two: > > (1) Performance > (2) Consistency > > And your "different concern" is rather into (2), I think. Actually, I think it was mostly a performance concern (writes triggering lots of

RE: [PoC] Non-volatile WAL buffer

2020-01-28 Thread Takashi Menjo
Hello Robert, I think our concerns are roughly classified into two: (1) Performance (2) Consistency And your "different concern" is rather into (2), I think. I'm also worried about it, but I have no good answer for now. I suppose mmap(flags|=MAP_SHARED) called by multiple backend processes

Re: [PoC] Non-volatile WAL buffer

2020-01-27 Thread Robert Haas
On Mon, Jan 27, 2020 at 2:01 AM Takashi Menjo wrote: > It sounds reasonable, but I'm sorry that I haven't tested such a program > yet. I'll try it to compare with my non-volatile WAL buffer. For now, I'm > a little worried about the overhead of mmap()/munmap() for each WAL segment > file. I

RE: [PoC] Non-volatile WAL buffer

2020-01-26 Thread Takashi Menjo
Hello Heikki, > I have the same comments on this that I had on the previous patch, see: > > https://www.postgresql.org/message-id/2aec6e2a-6a32-0c39-e4e2-aad854543aa8%40iki.fi Thanks. I re-read your messages [1][2]. What you meant, AFAIU, is how about using memory-mapped WAL segment files as

RE: [PoC] Non-volatile WAL buffer

2020-01-26 Thread Takashi Menjo
Hello Fabien, Thank you for your +1 :) > Is it possible to emulate something without the actual hardware, at least > for testing purposes? Yes, you can emulate PMEM using DRAM on Linux, via "memmap=nnG!ssG" kernel parameter. Please see [1] and [2] for emulation details. If your emulation does
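As a small illustration of the "memmap=nnG!ssG" form mentioned above: in the Linux kernel parameter, nn is the size of the region to mark as protected and ss is the physical start offset, so the region runs from ss to ss+nn. The helper below only builds the string; `memmap_param` is a hypothetical name, and the values 4 and 12 in the usage note are arbitrary. A suitable start offset depends on the machine's memory map, which this sketch does not check.

```python
def memmap_param(size_gib, start_gib):
    """Build a 'memmap=nnG!ssG' kernel parameter string.

    size_gib  -- nn: size of the region to reserve, in GiB
    start_gib -- ss: physical start offset of the region, in GiB
    """
    if size_gib <= 0:
        raise ValueError("size_gib must be positive")
    if start_gib < 0:
        raise ValueError("start_gib must not be negative")
    return f"memmap={size_gib}G!{start_gib}G"


if __name__ == "__main__":
    # Arbitrary example values: reserve 4 GiB starting at the 12 GiB mark.
    print(memmap_param(4, 12))
```

For example, `memmap_param(4, 12)` yields `memmap=4G!12G`, which would be appended to the kernel command line before rebooting.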

Re: [PoC] Non-volatile WAL buffer

2020-01-24 Thread Heikki Linnakangas
On 24/01/2020 10:06, Takashi Menjo wrote: I propose "non-volatile WAL buffer," a proof-of-concept new feature. It enables WAL records to be durable without output to WAL segment files by residing on persistent memory (PMEM) instead of DRAM. It improves database performance by reducing copies

Re: [PoC] Non-volatile WAL buffer

2020-01-24 Thread Fabien COELHO
Hello, +1 on the idea. By quickly looking at the patch, I notice that there are no tests. Is it possible to emulate something without the actual hardware, at least for testing purposes? -- Fabien.

[PoC] Non-volatile WAL buffer

2020-01-24 Thread Takashi Menjo
Dear hackers, I propose "non-volatile WAL buffer," a proof-of-concept new feature. It enables WAL records to be durable without output to WAL segment files by residing on persistent memory (PMEM) instead of DRAM. It improves database performance by reducing copies of WAL and shortening the time