Re: 回复:Re: Cache relation sizes?

2021-07-14 Thread Anastasia Lubennikova
On Wed, Jun 16, 2021 at 9:24 AM Thomas Munro wrote: > No change yet, just posting a rebase to keep cfbot happy. > > Hi, Thomas I think that the proposed feature is pretty cool not only because it fixes some old issues with lseek() performance and reliability, but also because it opens the door

Re: 回复:Re: Cache relation sizes?

2021-06-16 Thread Thomas Munro
No change yet, just posting a rebase to keep cfbot happy. One thing I'm wondering about is whether it'd be possible, and if so, a good idea, to make a kind of tiny reusable cache replacement algorithm, something modern, that can be used to kill several birds with one stone (SLRUs, this object

Re: 回复:Re: Cache relation sizes?

2021-03-11 Thread Thomas Munro
On Thu, Mar 4, 2021 at 2:39 AM David Steele wrote: > On 1/18/21 10:42 PM, 陈佳昕(步真) wrote: > > I want to share a patch with you, I change the replacement algorithm > > from fifo to a simple lru. > > What do you think of this change? Ok, if I'm reading this right, it changes the replacement

Re: 回复:Re: Cache relation sizes?

2021-03-03 Thread David Steele
Hi Thomas, On 1/18/21 10:42 PM, 陈佳昕(步真) wrote: I want to share a patch with you, I change the replacement algorithm from fifo to a simple lru. What do you think of this change? Also, your patch set from [1] no longer applies (and of course this latest patch is confusing the tester as well).

回复:Re: Cache relation sizes?

2021-01-18 Thread 陈佳昕(步真)
Hi Thomas I want to share a patch with you, I change the replacement algorithm from fifo to a simple lru. Buzhen 0001-update-fifo-to-lru-to-sweep-a-valid-cache.patch Description: Binary data
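
For readers skimming the index, the difference between the two replacement policies named in this message can be sketched as follows. This is only an illustration over a hypothetical fixed-size pool: `Pool`, `Slot`, `POOL_SIZE`, `pool_touch`, `fifo_victim` and `lru_victim` are names invented here and are not taken from the attached patch, whose contents are not shown in this listing.

```c
#include <assert.h>
#include <stdint.h>

#define POOL_SIZE 4             /* tiny pool, for illustration only */

/* Hypothetical slot: remembers when it was last accessed. */
typedef struct Slot
{
    uint64_t last_used;         /* logical clock value at last access */
} Slot;

/* Hypothetical fixed-size pool with the bookkeeping both policies need. */
typedef struct Pool
{
    Slot     slots[POOL_SIZE];
    uint64_t clock;             /* incremented on every access */
    int      fifo_next;         /* next slot to evict under FIFO */
} Pool;

static void
pool_init(Pool *p)
{
    for (int i = 0; i < POOL_SIZE; i++)
        p->slots[i].last_used = 0;
    p->clock = 0;
    p->fifo_next = 0;
}

/* Record an access to a slot (only the LRU policy looks at this). */
static void
pool_touch(Pool *p, int slot)
{
    assert(slot >= 0 && slot < POOL_SIZE);
    p->slots[slot].last_used = ++p->clock;
}

/* FIFO: evict slots in rotation, ignoring how recently they were used. */
static int
fifo_victim(Pool *p)
{
    int victim = p->fifo_next;

    p->fifo_next = (p->fifo_next + 1) % POOL_SIZE;
    return victim;
}

/* Simple LRU: evict the slot with the oldest access time (linear scan). */
static int
lru_victim(const Pool *p)
{
    int victim = 0;

    for (int i = 1; i < POOL_SIZE; i++)
        if (p->slots[i].last_used < p->slots[victim].last_used)
            victim = i;
    return victim;
}
```

With slots 0 to 3 touched in order and slot 0 touched again, FIFO would still pick slot 0 while this LRU picks slot 1. The linear scan keeps the sketch short; a real pool would more likely use a list or a clock sweep.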

Re: Cache relation sizes?

2021-01-07 Thread 陈佳昕(步真)
-- Original Message -- From: Thomas Munro Sent: Fri Jan 8 00:56:17 2021 To: 陈佳昕(步真) Cc: Amit Kapila, Konstantin Knizhnik, PostgreSQL Hackers Subject: Re: Cache relation sizes? On Wed, Dec 30, 2020 at 4:13 AM 陈佳昕(步真) wrote: > I found some other problems which I want to share my change with

Re: Re: Re: Cache relation sizes?

2020-12-29 Thread Thomas Munro
On Wed, Dec 30, 2020 at 5:52 PM Thomas Munro wrote: > and requires it on retry s/requires/reacquires/

Re: Re: Re: Cache relation sizes?

2020-12-29 Thread Thomas Munro
On Wed, Dec 30, 2020 at 4:13 AM 陈佳昕(步真) wrote: > I found some other problems which I want to share my change with you to make > you confirm. > <1> I changed the code in smgr_alloc_sr to avoid dead lock. > > LWLockAcquire(mapping_lock, LW_EXCLUSIVE); > flags = smgr_lock_sr(sr); >

Re: Cache relation sizes?

2020-12-29 Thread Thomas Munro
On Mon, Dec 28, 2020 at 5:24 PM Andy Fan wrote: > lseek(..., SEEK_END) = 9670656 > write(...) = 8192 > lseek(..., SEEK_END) = 9678848 > fsync(...) = -1 > lseek(..., SEEK_END) = 9670656 > > I got 2 information from above. a) before the fsync, the lseek(fd, 0, > SEEK_END) > returns a correct

Re: Cache relation sizes?

2020-12-29 Thread Andres Freund
Hi, On 2020-11-19 18:01:14 +1300, Thomas Munro wrote: > From ac3c61926bf947a3288724bd02cf8439ff5c14bc Mon Sep 17 00:00:00 2001 > From: Thomas Munro > Date: Fri, 13 Nov 2020 14:38:41 +1300 > Subject: [PATCH v2 1/2] WIP: Track relation sizes in shared memory. > > Introduce a fixed size pool of

回复:Re: Re: Cache relation sizes?

2020-12-29 Thread 陈佳昕(步真)
:Tue Dec 29 22:52:59 2020 To: 陈佳昕(步真) Cc: Amit Kapila, Konstantin Knizhnik, PostgreSQL Hackers Subject: Re: Re: Cache relation sizes? On Wed, Dec 23, 2020 at 1:31 AM 陈佳昕(步真) wrote: > I studied your patch these days and found there might be a problem. > When execute 'drop database', the

Re: Cache relation sizes?

2020-12-27 Thread Andy Fan
On Thu, Dec 24, 2020 at 6:59 AM Thomas Munro wrote: > On Thu, Dec 17, 2020 at 10:22 PM Andy Fan > wrote: > > Let me try to understand your point. Suppose process 1 extends a file to > > 2 blocks from 1 block, and fsync is not called, then a). the lseek *may* > still > > return 1 based on the

Re: Re: Cache relation sizes?

2020-12-23 Thread Thomas Munro
On Wed, Dec 23, 2020 at 1:31 AM 陈佳昕(步真) wrote: > I studied your patch these days and found there might be a problem. > When execute 'drop database', the smgr shared pool will not be removed > because of no call 'smgr_drop_sr'. Function 'dropdb' in dbcommands.c remove > the buffer from

Re: Cache relation sizes?

2020-12-23 Thread Thomas Munro
On Thu, Dec 17, 2020 at 10:22 PM Andy Fan wrote: > Let me try to understand your point. Suppose process 1 extends a file to > 2 blocks from 1 block, and fsync is not called, then a). the lseek *may* still > return 1 based on the comments in the ReadBuffer_common ("because > of buggy Linux

回复:Re: Cache relation sizes?

2020-12-22 Thread 陈佳昕(步真)
hen -- Original Message -- From: Thomas Munro Sent: Tue Dec 22 19:57:35 2020 To: Amit Kapila Cc: Konstantin Knizhnik, PostgreSQL Hackers Subject: Re: Cache relation sizes? On Tue, Nov 17, 2020 at 10:48 PM Amit Kapila wrote: > Yeah, it is good to verify VACUUM stuff but I have another qu

Re: Cache relation sizes?

2020-12-17 Thread Andy Fan
Hi Thomas, Thank you for your quick response. On Thu, Dec 17, 2020 at 3:05 PM Thomas Munro wrote: > Hi Andy, > > On Thu, Dec 17, 2020 at 7:29 PM Andy Fan wrote: > > I spent one day studying the patch and I want to talk about one question > for now. > > What is the purpose of calling

Re: Cache relation sizes?

2020-12-16 Thread Thomas Munro
Hi Andy, On Thu, Dec 17, 2020 at 7:29 PM Andy Fan wrote: > I spent one day studying the patch and I want to talk about one question for > now. > What is the purpose of calling smgrimmedsync to evict a DIRTY sr (what will > happen > if we remove it and the SR_SYNCING and SR_JUST_DIRTIED flags)?

Re: Cache relation sizes?

2020-12-16 Thread Andy Fan
On Thu, Nov 19, 2020 at 1:02 PM Thomas Munro wrote: > On Tue, Nov 17, 2020 at 10:48 PM Amit Kapila > wrote: > > Yeah, it is good to verify VACUUM stuff but I have another question > > here. What about queries having functions that access the same > > relation (SELECT c1 FROM t1 WHERE c1 <=

Re: Cache relation sizes?

2020-11-18 Thread Thomas Munro
On Tue, Nov 17, 2020 at 10:48 PM Amit Kapila wrote: > Yeah, it is good to verify VACUUM stuff but I have another question > here. What about queries having functions that access the same > relation (SELECT c1 FROM t1 WHERE c1 <= func(); assuming here function > access the relation t1)? Now, here

Re: Cache relation sizes?

2020-11-17 Thread Amit Kapila
On Tue, Nov 17, 2020 at 4:13 AM Thomas Munro wrote: > > On Mon, Nov 16, 2020 at 11:01 PM Konstantin Knizhnik > wrote: > > I will look at your implementation more precisely latter. > > Thanks! Warning: I thought about making a thing like this for a > while, but the patch itself is only a

Re: Cache relation sizes?

2020-11-17 Thread Kyotaro Horiguchi
At Mon, 16 Nov 2020 20:11:52 +1300, Thomas Munro wrote in > After recent discussions about the limitations of relying on SEEK_END > in a nearby thread[1], I decided to try to prototype a system for > tracking relation sizes properly in shared memory. Earlier in this > thread I was talking

RE: Cache relation sizes?

2020-11-16 Thread k.jami...@fujitsu.com
On Tuesday, November 17, 2020 9:40 AM, Tsunakawa-san wrote: > From: Thomas Munro > > On Mon, Nov 16, 2020 at 11:01 PM Konstantin Knizhnik > > wrote: > > > I will look at your implementation more precisely latter. > > > > Thanks! Warning: I thought about making a thing like this for a > > while,

RE: Cache relation sizes?

2020-11-16 Thread tsunakawa.ta...@fujitsu.com
From: Thomas Munro > On Mon, Nov 16, 2020 at 11:01 PM Konstantin Knizhnik > wrote: > > I will look at your implementation more precisely latter. > > Thanks! Warning: I thought about making a thing like this for a > while, but the patch itself is only a one-day prototype, so I am sure > you

Re: Cache relation sizes?

2020-11-16 Thread Thomas Munro
On Mon, Nov 16, 2020 at 11:01 PM Konstantin Knizhnik wrote: > I will look at your implementation more precisely latter. Thanks! Warning: I thought about making a thing like this for a while, but the patch itself is only a one-day prototype, so I am sure you can find many bugs... Hmm, I guess

Re: Cache relation sizes?

2020-11-16 Thread Konstantin Knizhnik
On 16.11.2020 10:11, Thomas Munro wrote: On Tue, Aug 4, 2020 at 2:21 PM Thomas Munro wrote: On Tue, Aug 4, 2020 at 3:54 AM Konstantin Knizhnik wrote: This shared relation cache can easily store relation size as well. In addition it will solve a lot of other problems: - noticeable overhead

Re: Cache relation sizes?

2020-11-16 Thread Konstantin Knizhnik
On 16.11.2020 10:11, Thomas Munro wrote: On Tue, Aug 4, 2020 at 2:21 PM Thomas Munro wrote: On Tue, Aug 4, 2020 at 3:54 AM Konstantin Knizhnik wrote: This shared relation cache can easily store relation size as well. In addition it will solve a lot of other problems: - noticeable overhead

Re: Cache relation sizes?

2020-11-15 Thread Thomas Munro
On Tue, Aug 4, 2020 at 2:21 PM Thomas Munro wrote: > On Tue, Aug 4, 2020 at 3:54 AM Konstantin Knizhnik > wrote: > > This shared relation cache can easily store relation size as well. > > In addition it will solve a lot of other problems: > > - noticeable overhead of local relcache warming > > -

Re: Cache relation sizes?

2020-08-03 Thread Thomas Munro
On Tue, Aug 4, 2020 at 3:54 AM Konstantin Knizhnik wrote: > So in this thread three solutions are proposed: > 1. "bullet-proof general shared invalidation" > 2. recovery-only solution avoiding shared memory and invalidation > 3. "relaxed" shared memory cache with simplified invalidation Hi

Re: Cache relation sizes?

2020-08-03 Thread Pavel Stehule
On Mon, Aug 3, 2020 at 17:54, Konstantin Knizhnik <k.knizh...@postgrespro.ru> wrote: > > > On 01.08.2020 00:56, Thomas Munro wrote: > > On Fri, Jul 31, 2020 at 2:36 PM Thomas Munro > wrote: > >> There's still the matter of crazy numbers of lseeks in regular > >> backends; looking at all

Re: Cache relation sizes?

2020-08-03 Thread Konstantin Knizhnik
On 01.08.2020 00:56, Thomas Munro wrote: On Fri, Jul 31, 2020 at 2:36 PM Thomas Munro wrote: There's still the matter of crazy numbers of lseeks in regular backends; looking at all processes while running the above test, I get 1,469,060 (9.18 per pgbench transaction) without -M prepared,

Re: Cache relation sizes?

2020-07-31 Thread Thomas Munro
On Fri, Jul 31, 2020 at 2:36 PM Thomas Munro wrote: > There's still the matter of crazy numbers of lseeks in regular > backends; looking at all processes while running the above test, I get > 1,469,060 (9.18 per pgbench transaction) without -M prepared, and > 193,722 with -M prepared (1.21 per

Re: Cache relation sizes?

2020-07-30 Thread Thomas Munro
On Sat, Jun 20, 2020 at 10:32 AM Thomas Munro wrote: > Rebased. I'll add this to the open commitfest. I traced the recovery process while running pgbench -M prepared -c16 -j16 -t1 (= 160,000 transactions). With the patch, the number of lseeks went from 1,080,661 (6.75 per pgbench

Re: Cache relation sizes?

2020-06-19 Thread Thomas Munro
On Sat, Apr 11, 2020 at 4:10 PM Thomas Munro wrote: > I received a report off-list from someone who experimented with the > patch I shared earlier on this thread[1], using a crash recovery test > similar to one I showed on the WAL prefetching thread[2] (which he was > also testing, separately).

Re: Cache relation sizes?

2020-04-10 Thread Thomas Munro
On Fri, Feb 14, 2020 at 1:50 PM Thomas Munro wrote: > On Thu, Feb 13, 2020 at 7:18 PM Thomas Munro wrote: > > ... (1) I'm pretty sure some systems would not be happy > > about that (see claims in this thread) ... > > I poked a couple of people off-list and learned that, although the > Linux and

Re: Cache relation sizes?

2020-02-13 Thread Thomas Munro
On Thu, Feb 13, 2020 at 7:18 PM Thomas Munro wrote: > ... (1) I'm pretty sure some systems would not be happy > about that (see claims in this thread) ... I poked a couple of people off-list and learned that, although the Linux and FreeBSD systems I tried could do a million lseek(SEEK_END) calls

Re: Cache relation sizes?

2020-02-12 Thread Thomas Munro
On Tue, Feb 4, 2020 at 2:23 AM Andres Freund wrote: > On 2019-12-31 17:05:31 +1300, Thomas Munro wrote: > > There is one potentially interesting case that doesn't require any > > kind of shared cache invalidation AFAICS. XLogReadBufferExtended() > > calls smgrnblocks() for every buffer access,

Re: Cache relation sizes?

2020-02-03 Thread Andres Freund
Hi, On 2019-12-31 17:05:31 +1300, Thomas Munro wrote: > There is one potentially interesting case that doesn't require any > kind of shared cache invalidation AFAICS. XLogReadBufferExtended() > calls smgrnblocks() for every buffer access, even if the buffer is > already in our buffer pool. Yea,

Re: Cache relation sizes?

2019-12-30 Thread Thomas Munro
On Tue, Dec 31, 2019 at 4:43 PM Kyotaro HORIGUCHI wrote: > I still believe that one shared memory element for every > non-mapped relation is not only too-complex but also too-much, as > Andres (and implicitly I) wrote. I feel that just one flag for > all works fine but partitioned flags (that is,

Re: Cache relation sizes?

2019-02-14 Thread Kyotaro HORIGUCHI
On Thu, Feb 14, 2019 at 20:41, Kyotaro HORIGUCHI (horiguchi.kyot...@lab.ntt.co.jp) wrote: > At Wed, 13 Feb 2019 05:48:28 +, "Jamison, Kirk" < > k.jami...@jp.fujitsu.com> wrote in > > > On February 6, 2019, 8:57 AM +, Andres Freund wrote: > > > Maybe I'm missing something here, but why is it

Re: Cache relation sizes?

2019-02-14 Thread Kyotaro HORIGUCHI
At Wed, 13 Feb 2019 05:48:28 +, "Jamison, Kirk" wrote in > On February 6, 2019, 8:57 AM +, Andres Freund wrote: > > Maybe I'm missing something here, but why is it actually necessary to > > have the sizes in shared memory, if we're just talking about caching > > sizes? It's pretty darn

RE: Cache relation sizes?

2019-02-12 Thread Jamison, Kirk
On February 6, 2019, 8:57 AM +, Andres Freund wrote: > Maybe I'm missing something here, but why is it actually necessary to > have the sizes in shared memory, if we're just talking about caching > sizes? It's pretty darn cheap to determine the filesize of a file that > has been recently

Re: Cache relation sizes?

2019-02-06 Thread and...@anarazel.de
On 2019-02-06 08:50:45 +, Jamison, Kirk wrote: > On February 6, 2019, 08:25AM +, Kyotaro HORIGUCHI wrote: > > >At Wed, 6 Feb 2019 06:29:15 +, "Tsunakawa, Takayuki" > > wrote: > >> Although I haven't looked deeply at Thomas's patch yet, there's currently > >> no place to store the

RE: Cache relation sizes?

2019-02-06 Thread Jamison, Kirk
On February 6, 2019, 08:25AM +, Kyotaro HORIGUCHI wrote: >At Wed, 6 Feb 2019 06:29:15 +, "Tsunakawa, Takayuki" > wrote: >> Although I haven't looked deeply at Thomas's patch yet, there's currently no >> place to store the size per relation in shared memory. You have to wait for >> the

RE: Cache relation sizes?

2019-02-06 Thread Tsunakawa, Takayuki
From: Kyotaro HORIGUCHI [mailto:horiguchi.kyot...@lab.ntt.co.jp] > Just one counter in the patch *seems* to give significant gain > comparing to the complexity, given that lseek is so complex or it > brings latency, especially on workloads where file is scarcely > changed. Though I didn't run it

Re: Cache relation sizes?

2019-02-06 Thread Kyotaro HORIGUCHI
At Wed, 6 Feb 2019 06:29:15 +, "Tsunakawa, Takayuki" wrote in <0A3221C70F24FB45833433255569204D1FB955DF@G01JPEXMBYT05> > From: Jamison, Kirk [mailto:k.jami...@jp.fujitsu.com] > > On the other hand, the simplest method I thought that could also work is > > to only cache the file size

RE: Cache relation sizes?

2019-02-05 Thread Tsunakawa, Takayuki
From: Jamison, Kirk [mailto:k.jami...@jp.fujitsu.com] > On the other hand, the simplest method I thought that could also work is > to only cache the file size (nblock) in shared memory, not in the backend > process, since both nblock and relsize_change_counter are uint32 data type > anyway. If

RE: Cache relation sizes?

2019-02-05 Thread Ideriha, Takeshi
>From: Jamison, Kirk [mailto:k.jami...@jp.fujitsu.com] >On the other hand, the simplest method I thought that could also work is to >only cache >the file size (nblock) in shared memory, not in the backend process, since >both nblock >and relsize_change_counter are uint32 data type anyway. If

RE: Cache relation sizes?

2019-01-08 Thread Jamison, Kirk
Hi Thomas, On Friday, December 28, 2018 6:43 AM Thomas Munro wrote: > [...]if you have ideas about the validity of the assumptions, the reason it > breaks initdb, or any other aspect of this approach (or alternatives), please > don't let me stop you, and of course please feel free to submit

Re: Cache relation sizes?

2018-12-27 Thread Thomas Munro
On Thu, Dec 27, 2018 at 8:00 PM Jamison, Kirk wrote: > I also find this proposed feature to be beneficial for performance, > especially when we want to extend or truncate large tables. > As mentioned by David, currently there is a query latency spike when we make > generic plan for partitioned

RE: Cache relation sizes?

2018-12-26 Thread Jamison, Kirk
Hello, I also find this proposed feature to be beneficial for performance, especially when we want to extend or truncate large tables. As mentioned by David, currently there is a query latency spike when we make generic plan for partitioned table with many partitions. I tried to apply Thomas'

Re: Cache relation sizes?

2018-11-29 Thread David Rowley
On Fri, 16 Nov 2018 at 12:06, Thomas Munro wrote: > Oh, I just found the throw-away patch I wrote ages ago down the back > of the sofa. Here's a rebase. It somehow breaks initdb so you have > to initdb with unpatched. Unfortunately I couldn't seem to measure > any speed-up on a random EDB test

Re: Cache relation sizes?

2018-11-15 Thread Thomas Munro
On Fri, Nov 9, 2018 at 4:42 PM David Rowley wrote: > On 7 November 2018 at 11:46, Andres Freund wrote: > > On 2018-11-07 11:40:22 +1300, Thomas Munro wrote: > >> PostgreSQL likes to probe the size of relations with lseek(SEEK_END) a > >> lot. For example, a fully prewarmed pgbench -S

Re: Cache relation sizes?

2018-11-08 Thread David Rowley
On 7 November 2018 at 11:46, Andres Freund wrote: > Hi, > > On 2018-11-07 11:40:22 +1300, Thomas Munro wrote: >> PostgreSQL likes to probe the size of relations with lseek(SEEK_END) a >> lot. For example, a fully prewarmed pgbench -S transaction consists >> of recvfrom(), lseek(SEEK_END),

Re: Cache relation sizes?

2018-11-08 Thread Edmund Horner
On Wed, 7 Nov 2018 at 11:41, Thomas Munro wrote: > > Hello, > > PostgreSQL likes to probe the size of relations with lseek(SEEK_END) a > lot. For example, a fully prewarmed pgbench -S transaction consists > of recvfrom(), lseek(SEEK_END), lseek(SEEK_END), sendto(). I think > lseek() is probably

Re: Cache relation sizes?

2018-11-06 Thread Andres Freund
Hi, On 2018-11-07 11:40:22 +1300, Thomas Munro wrote: > PostgreSQL likes to probe the size of relations with lseek(SEEK_END) a > lot. For example, a fully prewarmed pgbench -S transaction consists > of recvfrom(), lseek(SEEK_END), lseek(SEEK_END), sendto(). I think > lseek() is probably about

Cache relation sizes?

2018-11-06 Thread Thomas Munro
Hello, PostgreSQL likes to probe the size of relations with lseek(SEEK_END) a lot. For example, a fully prewarmed pgbench -S transaction consists of recvfrom(), lseek(SEEK_END), lseek(SEEK_END), sendto(). I think lseek() is probably about as cheap as a syscall can be so I doubt it really costs
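
As a rough illustration of the size probe this opening message describes, the sketch below asks the kernel for a file's end offset with `lseek(fd, 0, SEEK_END)` and converts it to a block count. `BLOCK_SIZE`, `probe_nblocks` and `nblocks_of_path` are names assumed for this example (8192 is PostgreSQL's default block size); this is not PostgreSQL's own code.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 8192         /* assumed; PostgreSQL's default BLCKSZ */

/*
 * Hypothetical helper: number of whole blocks in the file behind fd,
 * or -1 on error. One lseek() system call per probe.
 */
static long
probe_nblocks(int fd)
{
    off_t end;

    assert(fd >= 0);            /* caller must pass an open descriptor */
    end = lseek(fd, 0, SEEK_END);
    if (end < 0)
        return -1;
    return (long) (end / BLOCK_SIZE);
}

/* Convenience wrapper: open the file read-only, probe it, close it. */
static long
nblocks_of_path(const char *path)
{
    int  fd = open(path, O_RDONLY);
    long nblocks;

    if (fd < 0)
        return -1;
    nblocks = probe_nblocks(fd);
    close(fd);
    return nblocks;
}
```

Every call costs a system call, which is why the thread goes on to discuss caching the result instead of probing again each time.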