RE: [Patch] Optimize dropping of relation buffers using dlist

2020-11-25 Thread k.jami...@fujitsu.com
> From: k.jami...@fujitsu.com > On Thursday, November 19, 2020 4:08 PM, Tsunakawa, Takayuki wrote: > > From: Andres Freund > > > DropRelFileNodeBuffers() in recovery? The most common path is > > > DropRelationFiles()->smgrdounlinkall()->DropRelFileNodesAllBuffers() > > > , which 3/4 doesn't

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-11-25 Thread k.jami...@fujitsu.com
On Thursday, November 19, 2020 4:08 PM, Tsunakawa, Takayuki wrote: > From: Andres Freund > > DropRelFileNodeBuffers() in recovery? The most common path is > > DropRelationFiles()->smgrdounlinkall()->DropRelFileNodesAllBuffers(), > > which 3/4 doesn't address and 4/4 doesn't mention. > > > > 4/4

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-11-18 Thread tsunakawa.ta...@fujitsu.com
From: Andres Freund > DropRelFileNodeBuffers() in recovery? The most common path is > DropRelationFiles()->smgrdounlinkall()->DropRelFileNodesAllBuffers(), > which 3/4 doesn't address and 4/4 doesn't mention. > > 4/4 seems to address DropRelationFiles(), but only talks about TRUNCATE? Yes.

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-11-18 Thread Amit Kapila
On Wed, Nov 18, 2020 at 11:43 PM Andres Freund wrote: > > Hi, > > On 2020-11-18 17:34:31 +0530, Amit Kapila wrote: > > Yeah, that won't be a bad idea especially because the patch being > > discussed in the thread you referred is still in an exploratory phase. > > I haven't tested or done a

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-11-18 Thread Andres Freund
Hi, On 2020-11-18 17:34:31 +0530, Amit Kapila wrote: > Yeah, that won't be a bad idea especially because the patch being > discussed in the thread you referred is still in an exploratory phase. > I haven't tested or done a detailed review but I feel there shouldn't > be many problems if we agree

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-11-18 Thread Amit Kapila
On Wed, Nov 18, 2020 at 2:34 PM k.jami...@fujitsu.com wrote: > > On Thursday, November 12, 2020 1:14 PM, Tsunakawa-san wrote: > I forgot to reply. > Thank you very much Tsunakawa-san for testing and to everyone > who has provided their reviews and insights as well. > > Now thinking about

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-11-18 Thread k.jami...@fujitsu.com
On Thursday, November 12, 2020 1:14 PM, Tsunakawa-san wrote: > The patch looks OK. I think as Thomas-san suggested, we can remove the > modification to smgrnblocks() and don't care wheter the size is cached or not. > But I think the current patch is good too, so I'd like to leave it up to a >

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-11-11 Thread tsunakawa.ta...@fujitsu.com
The patch looks OK. I think as Thomas-san suggested, we can remove the modification to smgrnblocks() and don't care wheter the size is cached or not. But I think the current patch is good too, so I'd like to leave it up to a committer to decide which to choose. I measured performance in a

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-11-11 Thread k.jami...@fujitsu.com
On Tuesday, November 10, 2020 3:10 PM, Tsunakawa-san wrote: > From: Jamison, Kirk/ジャミソン カーク > > So I proceeded to update the patches using the "cached" parameter and > > updated the corresponding comments to it in 0002. > > OK, I'm in favor of the name "cached" now, although I first agreed with

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-11-09 Thread tsunakawa.ta...@fujitsu.com
From: Jamison, Kirk/ジャミソン カーク > So I proceeded to update the patches using the "cached" parameter and > updated the corresponding comments to it in 0002. OK, I'm in favor of the name "cached" now, although I first agreed with Horiguchi-san in that it's better to use a name that represents the

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-11-09 Thread Thomas Munro
On Tue, Nov 10, 2020 at 6:18 PM Amit Kapila wrote: > Do we always truncate all the blocks? What if the vacuum has cleaned > last N (say 100) blocks then how do we handle it? Oh, hmm. Yeah, that idea only works for DROP, not for truncate last N.

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-11-09 Thread Amit Kapila
On Tue, Nov 10, 2020 at 10:00 AM Thomas Munro wrote: > > On Sat, Nov 7, 2020 at 12:40 AM Amit Kapila wrote: > > I think one of the problems is returning fewer rows and that too > > without any warning or error, so maybe that is a bigger problem but we > > seem to be okay with it as that is

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-11-09 Thread k.jami...@fujitsu.com
g > Subject: Re: [Patch] Optimize dropping of relation buffers using dlist > > At Tue, 10 Nov 2020 08:33:26 +0530, Amit Kapila > wrote in > > On Tue, Nov 10, 2020 at 8:19 AM k.jami...@fujitsu.com > > wrote: > > > > > > I repeated the recovery performance test f

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-11-09 Thread Thomas Munro
On Sat, Nov 7, 2020 at 12:40 AM Amit Kapila wrote: > I think one of the problems is returning fewer rows and that too > without any warning or error, so maybe that is a bigger problem but we > seem to be okay with it as that is already a known thing though I > think that is not documented

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-11-09 Thread Kyotaro Horiguchi
At Tue, 10 Nov 2020 08:33:26 +0530, Amit Kapila wrote in > On Tue, Nov 10, 2020 at 8:19 AM k.jami...@fujitsu.com > wrote: > > > > I repeated the recovery performance test for vacuum. (I made a mistake > > previously in NBuffers/128) > > The 3 kinds of thresholds are almost equally performant.

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-11-09 Thread Amit Kapila
On Tue, Nov 10, 2020 at 8:19 AM k.jami...@fujitsu.com wrote: > > > From: k.jami...@fujitsu.com > > On Thursday, October 22, 2020 3:15 PM, Kyotaro Horiguchi > > wrote: > > > I'm not sure about the exact steps of the test, but it can be expected > > > if we have many small relations to truncate.

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-11-09 Thread k.jami...@fujitsu.com
> From: k.jami...@fujitsu.com > On Thursday, October 22, 2020 3:15 PM, Kyotaro Horiguchi > wrote: > > I'm not sure about the exact steps of the test, but it can be expected > > if we have many small relations to truncate. > > > > Currently BUF_DROP_FULL_SCAN_THRESHOLD is set to Nbuffers / 512, >

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-11-06 Thread Amit Kapila
On Fri, Nov 6, 2020 at 11:10 AM Thomas Munro wrote: > > On Fri, Nov 6, 2020 at 5:09 PM Amit Kapila wrote: > > > > It is not very clear to me how this argument applies to the patch > > in-discussion where we are relying on the cached value of blocks > > during recovery. I understand your point

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-11-05 Thread Thomas Munro
On Fri, Nov 6, 2020 at 5:09 PM Amit Kapila wrote: > On Fri, Nov 6, 2020 at 5:02 AM Thomas Munro wrote: > > Here's a devil's advocate position I thought about: It's OK to leave > > stray buffers (clean or dirty) in the buffer pool if files are > > truncated underneath us by gremlins, as long as

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-11-05 Thread Amit Kapila
On Fri, Nov 6, 2020 at 5:02 AM Thomas Munro wrote: > > On Thu, Nov 5, 2020 at 10:47 PM Amit Kapila wrote: > > I still feel 'cached' is a better name. > > Amusingly, this thread is hitting all the hardest problems in computer > science according to the well known aphorism... > > Here's a devil's

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-11-05 Thread k.jami...@fujitsu.com
On Thursday, October 22, 2020 3:15 PM, Kyotaro Horiguchi wrote: > I'm not sure about the exact steps of the test, but it can be expected if we > have many small relations to truncate. > > Currently BUF_DROP_FULL_SCAN_THRESHOLD is set to Nbuffers / 512, > which is quite arbitrary that comes from

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-11-05 Thread Thomas Munro
On Thu, Nov 5, 2020 at 10:47 PM Amit Kapila wrote: > I still feel 'cached' is a better name. Amusingly, this thread is hitting all the hardest problems in computer science according to the well known aphorism... Here's a devil's advocate position I thought about: It's OK to leave stray buffers

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-11-05 Thread Amit Kapila
On Thu, Nov 5, 2020 at 1:59 PM Kyotaro Horiguchi wrote: > > At Thu, 5 Nov 2020 11:07:21 +0530, Amit Kapila > wrote in > > On Thu, Nov 5, 2020 at 8:26 AM k.jami...@fujitsu.com > > wrote: > > > > > Few comments on patches: > > > > > == > > > > >

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-11-05 Thread Kyotaro Horiguchi
At Thu, 5 Nov 2020 11:07:21 +0530, Amit Kapila wrote in > On Thu, Nov 5, 2020 at 8:26 AM k.jami...@fujitsu.com > wrote: > > > > Few comments on patches: > > > > == > > > > v29-0002-Add-bool-param-in-smgrnblocks-for-cached-blocks > > > >

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-11-04 Thread Amit Kapila
On Thu, Nov 5, 2020 at 8:26 AM k.jami...@fujitsu.com wrote: > > On Thursday, November 5, 2020 10:22 AM, Horiguchi-san wrote: > > > > > > Can we do a Truncate optimization once we decide about your other > > > patch as I see a few problems with it? If we can get the first patch > > > (vacuum

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-11-04 Thread k.jami...@fujitsu.com
On Thursday, November 5, 2020 10:22 AM, Horiguchi-san wrote: > Hello. > > Many of the quetions are on the code following my past suggestions. Yeah, I was also about to answer with the feedback you have given. Thank you for replying and taking a look too. > At Wed, 4 Nov 2020 15:59:17 +0530,

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-11-04 Thread Kyotaro Horiguchi
Hello. Many of the quetions are on the code following my past suggestions. At Wed, 4 Nov 2020 15:59:17 +0530, Amit Kapila wrote in > On Wed, Nov 4, 2020 at 8:28 AM k.jami...@fujitsu.com > wrote: > > > > Hi, > > > > I've updated the patch 0004 (Truncate optimization) with the previous > >

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-11-04 Thread Amit Kapila
On Wed, Nov 4, 2020 at 8:28 AM k.jami...@fujitsu.com wrote: > > Hi, > > I've updated the patch 0004 (Truncate optimization) with the previous > comments of > Tsunakawa-san already addressed in the patch. (Thank you very much for the > review.) > The change here compared to the previous version

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-11-03 Thread k.jami...@fujitsu.com
Hi, I've updated the patch 0004 (Truncate optimization) with the previous comments of Tsunakawa-san already addressed in the patch. (Thank you very much for the review.) The change here compared to the previous version is that in DropRelFileNodesAllBuffers() we don't check for the accurate

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-28 Thread tsunakawa.ta...@fujitsu.com
The patch looks almost good except for the minor ones: (1) + for (i = 0; i < nnodes; i++) + { + RelFileNodeBackend rnode = smgr_reln[i]->smgr_rnode; + + rnodes[i] = rnode; + } You can write: + for (i = 0; i < nnodes; i++) +

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-28 Thread k.jami...@fujitsu.com
Hi everyone, Attached are the updated set of patches (V28). 0004 - Truncate optimization is a new patch, while the rest are similar to V27. This passes the build, regression and TAP tests. Apologies for the delay. I'll post the benchmark test results on SSD soon, considering the suggested

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-22 Thread tsunakawa.ta...@fujitsu.com
From: Thomas Munro > > I'm probably being silly, but can't we avoid the problem by using fstat() > instead of lseek(SEEK_END)? Would they return the same value from the > i-node? > > Amazingly, st_size can disagree with SEEK_END when using the Linux NFS > client, but its behaviour is worse.

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-10-22 Thread Thomas Munro
On Thu, Oct 22, 2020 at 8:32 PM tsunakawa.ta...@fujitsu.com wrote: > I'm probably being silly, but can't we avoid the problem by using fstat() > instead of lseek(SEEK_END)? Would they return the same value from the i-node? Amazingly, st_size can disagree with SEEK_END when using the Linux NFS

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-10-22 Thread Amit Kapila
On Thu, Oct 22, 2020 at 2:20 PM Kyotaro Horiguchi wrote: > > At Thu, 22 Oct 2020 07:31:55 +, "tsunakawa.ta...@fujitsu.com" > wrote in > > From: Thomas Munro > > > On Thu, Oct 22, 2020 at 7:33 PM Kyotaro Horiguchi > > > wrote: > > > > Mmm. Not exact. The requirement here is that we must be

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-10-22 Thread Thomas Munro
On Thu, Oct 22, 2020 at 9:50 PM Kyotaro Horiguchi wrote: > By the way, heap scan finds the size of target relation using > smgrnblocks(). I'm not sure why we don't miss recently-extended pages > on a heap-scan? It seems to be possible that concurrent checkpoint > fsyncs relation files inbetween

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-10-22 Thread Kyotaro Horiguchi
At Thu, 22 Oct 2020 07:31:55 +, "tsunakawa.ta...@fujitsu.com" wrote in > From: Thomas Munro > > On Thu, Oct 22, 2020 at 7:33 PM Kyotaro Horiguchi > > wrote: > > > Mmm. Not exact. The requirement here is that we must be certain that > > > the we don't have a buffuer for blocks after the

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-22 Thread tsunakawa.ta...@fujitsu.com
From: Thomas Munro > On Thu, Oct 22, 2020 at 7:33 PM Kyotaro Horiguchi > wrote: > > Mmm. Not exact. The requirement here is that we must be certain that > > the we don't have a buffuer for blocks after the file size known to > > the process. While recoverying, If the first lseek() returned

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-10-22 Thread Kyotaro Horiguchi
At Thu, 22 Oct 2020 18:54:43 +1300, Thomas Munro wrote in > On Thu, Oct 22, 2020 at 5:52 PM Tom Lane wrote: > > Per the referenced bug-reporting thread, it was ReiserFS and/or NFS on > > SLES 9.3; so, dubious storage choices on an ancient-even-then Linux > > kernel. > > O. I can

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-10-22 Thread Thomas Munro
On Thu, Oct 22, 2020 at 7:33 PM Kyotaro Horiguchi wrote: > At Thu, 22 Oct 2020 14:16:37 +0900 (JST), Kyotaro Horiguchi > wrote in > > smgrtruncate and msgrextend modifies that cache from their parameter, > > not from lseek(). At the very first the value in the cache comes from > > lseek() but

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-10-22 Thread Kyotaro Horiguchi
At Thu, 22 Oct 2020 14:16:37 +0900 (JST), Kyotaro Horiguchi wrote in > At Thu, 22 Oct 2020 16:35:27 +1300, Thomas Munro > wrote in > > On Thu, Oct 22, 2020 at 3:07 PM k.jami...@fujitsu.com > > wrote: > > But... does the proposed caching behaviour and "accurate" flag really > > help with any

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-10-22 Thread Kyotaro Horiguchi
At Thu, 22 Oct 2020 01:33:31 +, "tsunakawa.ta...@fujitsu.com" wrote in > From: Jamison, Kirk/ジャミソン カーク > > The table below shows the vacuum execution time for non-recovery case. > > I've also subtracted the execution time when VACUUM (truncate off) is set. > > > > [NON-RECOVERY CASE -

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-10-21 Thread Thomas Munro
On Thu, Oct 22, 2020 at 5:52 PM Tom Lane wrote: > Per the referenced bug-reporting thread, it was ReiserFS and/or NFS on > SLES 9.3; so, dubious storage choices on an ancient-even-then Linux > kernel. O. I can reproduce that on a modern Linux box by forcing writeback to a full NFS

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-10-21 Thread Kyotaro Horiguchi
At Thu, 22 Oct 2020 16:35:27 +1300, Thomas Munro wrote in > On Thu, Oct 22, 2020 at 3:07 PM k.jami...@fujitsu.com > wrote: > +/* > + * Get the total number of to-be-invalidated blocks of a relation as well > + * as the total blocks for a given fork. The cached value returned by >

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-10-21 Thread Tom Lane
Thomas Munro writes: > Hmmm. The Linux comment led me to commit ffae5cc and a 2006 thread[1] > showing a buggy sequence of system calls. Hah, blast from the past ... > AFAICS it was not even an > SMP/race problem of the type you might half expect, it was a single > process not seeing its own

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-10-21 Thread Thomas Munro
On Thu, Oct 22, 2020 at 3:07 PM k.jami...@fujitsu.com wrote: +/* + * Get the total number of to-be-invalidated blocks of a relation as well + * as the total blocks for a given fork. The cached value returned by + * smgrnblocks could be smaller than the actual number of existing

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-21 Thread k.jami...@fujitsu.com
On Thursday, October 22, 2020 10:34 AM, Tsunakwa-san wrote: > > I have confirmed that the above comment (commenting out the lines in > > RelationTruncate) solves the issue for non-recovery case. > > The attached 0004 patch is just for non-recovery testing and is not > > included in the final set

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-21 Thread tsunakawa.ta...@fujitsu.com
The patch looks good except for the minor one: (1) +* as the total nblocks for a given fork. The cached value returned by nblocks -> blocks Regards Takayuki Tsunakawa

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-21 Thread tsunakawa.ta...@fujitsu.com
> As for 0004: > When testing TRUNCATE, remove the change to storage.c because it was > intended to troubleshoot the VACUUM test. I meant vacuum.c. Sorry. Regards Takayuki Tsunakawa

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-21 Thread tsunakawa.ta...@fujitsu.com
From: Jamison, Kirk/ジャミソン カーク > I have confirmed that the above comment (commenting out the lines in > RelationTruncate) solves the issue for non-recovery case. > The attached 0004 patch is just for non-recovery testing and is not included > in > the final set of patches to be committed for

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-21 Thread k.jami...@fujitsu.com
On Wednesday, October 21, 2020 4:37 PM, Tsunakawa-san wrote: > RelationTruncate() invalidates the cached fork sizes as follows. This causes > smgrnblocks() return accurate=false, resulting in not running optimization. > Try commenting out for non-recovery case. > > /* > * Make sure

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-21 Thread tsunakawa.ta...@fujitsu.com
RelationTruncate() invalidates the cached fork sizes as follows. This causes smgrnblocks() return accurate=false, resulting in not running optimization. Try commenting out for non-recovery case. /* * Make sure smgr_targblock etc aren't pointing somewhere past new end */

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-20 Thread tsunakawa.ta...@fujitsu.com
From: tsunakawa.ta...@fujitsu.com > Can you measure the time DropRelFileNodeBuffers()? You can call > GetTimestamp() at the beginning and end of the function, and use > TimestampDifference() to calculate the difference. Then, for instance, > elog(WARNING, "time is | %u.%u", sec, usec) at the

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-20 Thread tsunakawa.ta...@fujitsu.com
From: Jamison, Kirk/ジャミソン カーク > However, I still can't seem to find the cause of why the non-recovery > performance does not change when compared to master. (1 min 15 s for the > given test case below) Can you check and/or try the following? 1. Isn't the vacuum cost delay

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-15 Thread tsunakawa.ta...@fujitsu.com
From: Jamison, Kirk/ジャミソン カーク > 2. Non-recovery Performance > However, I still can't seem to find the cause of why the non-recovery > performance does not change when compared to master. (1 min 15 s for the > given test case below) ... > 5. Measure VACUUM timing > \timing > VACUUM; Oops, why are

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-12 Thread tsunakawa.ta...@fujitsu.com
From: Jamison, Kirk/ジャミソン カーク (1) > Alright. I also removed nTotalBlocks in v24-0003 patch. > > for (i = 0; i < nforks; i++) > { > if (nForkBlocks[i] != InvalidBlockNumber && > nBlocksToInvalidate < BUF_DROP_FULL_SCAN_THRESHOLD) > { > Optimization loop > } >

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-10-12 Thread Amit Kapila
On Mon, Oct 12, 2020 at 3:08 PM k.jami...@fujitsu.com wrote: > > Hmm. When I repeated the performance measurement for non-recovery, > I got almost similar execution results for both master and patched. > > Execution Time (in seconds) > | s_b | master | patched | %reg | >

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-12 Thread k.jami...@fujitsu.com
On Friday, October 9, 2020 11:12 AM, Horiguchi-san wrote: > I have some comments on the latest patch. Thank you for the feedback! I've attached the latest patches. > visibilitymap_prepare_truncate(Relation rel, BlockNumber nheapblocks) { > BlockNumber newnblocks; > + boolcached;

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-10-08 Thread Kyotaro Horiguchi
Oops! Sorry for the mistake. At Fri, 09 Oct 2020 11:12:01 +0900 (JST), Kyotaro Horiguchi wrote in > At Fri, 9 Oct 2020 00:41:24 +, "tsunakawa.ta...@fujitsu.com" > wrote in > > From: Jamison, Kirk/ジャミソン カーク > > > > (6) > > > > + bufHdr->tag.blockNum

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-10-08 Thread Kyotaro Horiguchi
At Fri, 9 Oct 2020 00:41:24 +, "tsunakawa.ta...@fujitsu.com" wrote in > From: Jamison, Kirk/ジャミソン カーク > > > (6) > > > + bufHdr->tag.blockNum >= > > > firstDelBlock[j]) > > > + InvalidateBuffer(bufHdr); /* > > > releases

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-08 Thread tsunakawa.ta...@fujitsu.com
From: Jamison, Kirk/ジャミソン カーク > > (6) > > + bufHdr->tag.blockNum >= > > firstDelBlock[j]) > > + InvalidateBuffer(bufHdr); /* > > releases spinlock */ > > > > The right side of >= should be cur_block. > > Fixed. >= should

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-08 Thread k.jami...@fujitsu.com
Hi, > Attached are the updated patches. Sorry there was an error in the 3rd patch. So attached is a rebase one. Regards, Kirk Jamison 0001-v1-Prevent-invalidating-blocks-in-smgrextend-during-recovery.patch Description: 0001-v1-Prevent-invalidating-blocks-in-smgrextend-during-recovery.patch

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-08 Thread k.jami...@fujitsu.com
On Thursday, October 8, 2020 3:38 PM, Tsunakawa-san wrote: > Hi Kirk san, Thank you for looking into my patches! > (1) > + * This returns an InvalidBlockNumber when smgr_cached_nblocks is not > + * available and when not in recovery path. > > + /* > + * We cannot believe the result

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-08 Thread tsunakawa.ta...@fujitsu.com
Hi Kirk san, (1) + * This returns an InvalidBlockNumber when smgr_cached_nblocks is not + * available and when not in recovery path. + /* +* We cannot believe the result from smgr_nblocks is always accurate +* because lseek of buggy Linux kernels doesn't account for a

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-07 Thread tsunakawa.ta...@fujitsu.com
From: Jamison, Kirk/ジャミソン カーク > With the latest patches attached, and removing the recovery check in > smgrnblocks, I tested the performance of vacuum. > (3 trial runs, 3.5 GB db populated with 1000 tables) > > Execution Time (seconds) > | s_b | master | patched | %reg | >

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-07 Thread k.jami...@fujitsu.com
On Monday, October 5, 2020 8:50 PM, Amit Kapila wrote: > On Mon, Oct 5, 2020 at 3:04 PM k.jami...@fujitsu.com > > > 2. Also, the other thing is I have asked for some testing to avoid > > > the small regression we have for a smaller number of shared buffers > > > which I don't see the results nor

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-10-05 Thread Amit Kapila
On Mon, Oct 5, 2020 at 3:04 PM k.jami...@fujitsu.com wrote: > > On Monday, October 5, 2020 3:30 PM, Amit Kapila wrote: > > > + for (i = 0; i < nforks; i++) > > + { > > + /* Get the total nblocks for a relation's fork */ nForkBlocks = > > + smgrcachednblocks(smgr_reln, forkNum[i]); > > + > > + if

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-05 Thread k.jami...@fujitsu.com
On Monday, October 5, 2020 3:30 PM, Amit Kapila wrote: > + for (i = 0; i < nforks; i++) > + { > + /* Get the total nblocks for a relation's fork */ nForkBlocks = > + smgrcachednblocks(smgr_reln, forkNum[i]); > + > + if (nForkBlocks == InvalidBlockNumber) { nTotalBlocks = > + InvalidBlockNumber;

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-10-05 Thread Amit Kapila
On Mon, Oct 5, 2020 at 6:59 AM k.jami...@fujitsu.com wrote: > > On Friday, October 2, 2020 11:45 AM, Horiguchi-san wrote: > > > Thaks for the new version. > > Thank you for your thoughtful reviews! > I've attached an updated patch addressing the comments below. > Few comments: === 1.

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-10-04 Thread Amit Kapila
On Fri, Oct 2, 2020 at 8:14 AM Kyotaro Horiguchi wrote: > > At Thu, 1 Oct 2020 12:55:34 +, "k.jami...@fujitsu.com" > wrote in > > On Thursday, October 1, 2020 4:52 PM, Tsunakawa-san wrote: > > > > (I'm still mildly opposed to the function name, which seems exposing > detail too much.) >

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-04 Thread k.jami...@fujitsu.com
On Friday, October 2, 2020 11:45 AM, Horiguchi-san wrote: > Thaks for the new version. Thank you for your thoughtful reviews! I've attached an updated patch addressing the comments below. 1. > The following description is found in the comment for FlushRelationBuffers. > > > * XXX

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-10-01 Thread Kyotaro Horiguchi
At Fri, 02 Oct 2020 11:44:46 +0900 (JST), Kyotaro Horiguchi wrote in > At Thu, 1 Oct 2020 12:55:34 +, "k.jami...@fujitsu.com" > wrote in > - * XXX currently it sequentially searches the buffer pool, should > be > - * changed to more clever ways of searching.

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-10-01 Thread Kyotaro Horiguchi
At Thu, 1 Oct 2020 12:55:34 +, "k.jami...@fujitsu.com" wrote in > On Thursday, October 1, 2020 4:52 PM, Tsunakawa-san wrote: > > > From: Kyotaro Horiguchi > > > I thought that the advantage of this optimization is that we don't > > > need to visit all buffers? If we need to run a

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-01 Thread k.jami...@fujitsu.com
On Thursday, October 1, 2020 4:52 PM, Tsunakawa-san wrote: > From: Kyotaro Horiguchi > > I thought that the advantage of this optimization is that we don't > > need to visit all buffers? If we need to run a full-scan for any > > reason, there's no point in looking-up already-visited buffers

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-01 Thread tsunakawa.ta...@fujitsu.com
From: Kyotaro Horiguchi > I thought that the advantage of this optimization is that we don't > need to visit all buffers? If we need to run a full-scan for any > reason, there's no point in looking-up already-visited buffers > again. That's just wastefull cycles. Am I missing somethig? > > I

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-10-01 Thread tsunakawa.ta...@fujitsu.com
From: Jamison, Kirk/ジャミソン カーク > For non-recovery path, did you mean by any chance > measuring the cache hit rate for varying shared_buffers? No. You can test the speed of DropRelFileNodeBuffers() during normal operation, i.e. by running TRUNCATE on psql, instead of performing recovery. To

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-09-30 Thread k.jami...@fujitsu.com
On Thursday, October 1, 2020 11:49 AM, Amit Kapila wrote: > On Thu, Oct 1, 2020 at 8:11 AM tsunakawa.ta...@fujitsu.com > wrote: > > > > From: Jamison, Kirk/ジャミソン カーク > > > Recovery performance measurement results below. > > > But it seems there are overhead even with large shared buffers. > > >

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-09-30 Thread Kyotaro Horiguchi
At Thu, 1 Oct 2020 04:20:27 +, "tsunakawa.ta...@fujitsu.com" wrote in > From: Kyotaro Horiguchi > > In more detail, if smgrcachednblocks() returned InvalidBlockNumber for > > any of the forks, we should give up the optimization at all since we > > need to run a full scan anyway. On the

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-09-30 Thread tsunakawa.ta...@fujitsu.com
From: Kyotaro Horiguchi > In more detail, if smgrcachednblocks() returned InvalidBlockNumber for > any of the forks, we should give up the optimization at all since we > need to run a full scan anyway. On the other hand, if any of the > forks is smaller than the threshold, we still can use the

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-09-30 Thread Kyotaro Horiguchi
At Thu, 1 Oct 2020 02:40:52 +, "tsunakawa.ta...@fujitsu.com" wrote in > With the following code, when the main fork does not meet the > optimization criteria, other forks are not optimized as well. You > want to determine each fork's optimization separately, don't you? In more detail, if

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-09-30 Thread tsunakawa.ta...@fujitsu.com
From: Amit Kapila > I have one idea for performance testing. We can even test this for > non-recovery paths by removing the recovery-related check like only > use it when there are cached blocks. You can do this if testing via > recovery path is difficult because at the end performance should be

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-09-30 Thread Amit Kapila
On Thu, Oct 1, 2020 at 8:11 AM tsunakawa.ta...@fujitsu.com wrote: > > From: Jamison, Kirk/ジャミソン カーク > > Recovery performance measurement results below. > > But it seems there are overhead even with large shared buffers. > > > > | s_b | master | patched | %reg | > >

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-09-30 Thread tsunakawa.ta...@fujitsu.com
From: Jamison, Kirk/ジャミソン カーク > Recovery performance measurement results below. > But it seems there are overhead even with large shared buffers. > > | s_b | master | patched | %reg | > |---||-|---| > | 128MB | 36.052 | 39.451 | 8.62% | > | 1GB | 21.731 | 21.73 |

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-09-28 Thread k.jami...@fujitsu.com
On Tuesday, September 29, 2020 10:35 AM, Horiguchi-san wrote: > FWIW, I (and maybe Amit) am thinking that the property we need here is not it > is cached or not but the accuracy of the returned file length, and that the > "cached" property should be hidden behind the API. > > Another reason for

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-09-28 Thread Amit Kapila
On Tue, Sep 29, 2020 at 7:21 AM tsunakawa.ta...@fujitsu.com wrote: > > From: Jamison, Kirk/ジャミソン カーク > > Also, as Amit-san mentioned, the cause of the slight performance regression > when shared_buffers is small needs to be investigated and addressed. > Yes, I think it is mainly because extra

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-09-28 Thread tsunakawa.ta...@fujitsu.com
From: Jamison, Kirk/ジャミソン カーク > I also did not remove the duplicate code from smgrnblocks because Amit-san > mentioned that when the caching for non-recovery cases is implemented, we > can use it for non-recovery cases as well. But the extra code is not used now. The code for future usage

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-09-28 Thread Kyotaro Horiguchi
At Mon, 28 Sep 2020 08:57:36 +, "k.jami...@fujitsu.com" wrote in > On Monday, September 28, 2020 5:08 PM, Tsunakawa-san wrote: > > > From: Jamison, Kirk/ジャミソン カーク > > > Is my understanding above correct? > > > > No. I simply meant DropRelFileNodeBuffers() calls the following

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-09-28 Thread k.jami...@fujitsu.com
On Monday, September 28, 2020 5:08 PM, Tsunakawa-san wrote: > From: Jamison, Kirk/ジャミソン カーク > > Is my understanding above correct? > > No. I simply meant DropRelFileNodeBuffers() calls the following function, > and avoids the optimization if it returns InvalidBlockNumber. > > >

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-09-28 Thread tsunakawa.ta...@fujitsu.com
From: Jamison, Kirk/ジャミソン カーク > Is my understanding above correct? No. I simply meant DropRelFileNodeBuffers() calls the following function, and avoids the optimization if it returns InvalidBlockNumber. BlockNumber smgrcachednblocks(SMgrRelation reln, ForkNumber forknum) {

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-09-28 Thread k.jami...@fujitsu.com
On Monday, September 28, 2020 11:50 AM, Tsunakawa-san wrote: > From: Amit Kapila > > I agree with the above two points. > > Thank you. I'm relieved to know I didn't misunderstand. > > > > > * Then, add a new function, say, smgrnblocks_cached() that simply > > > returns > > the cached block

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-09-27 Thread tsunakawa.ta...@fujitsu.com
From: Amit Kapila > I agree with the above two points. Thank you. I'm relieved to know I didn't misunderstand. > > * Then, add a new function, say, smgrnblocks_cached() that simply returns > the cached block count, and DropRelFileNodeBuffers() uses it instead of > smgrnblocks(). > > > > I am

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-09-26 Thread Amit Kapila
On Fri, Sep 25, 2020 at 1:49 PM k.jami...@fujitsu.com wrote: > > Hi. > > > I'll send performance measurement results in the next email. Thanks a lot > > for > > the reviews! > > Below are the performance measurement results. > I was only able to use low-spec machine: > CPU 4v, Memory 8GB, RHEL,

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-09-26 Thread Amit Kapila
On Fri, Sep 25, 2020 at 2:25 PM tsunakawa.ta...@fujitsu.com wrote: > > From: Amit Kapila > > No, during recovery also we need to be careful. We need to ensure that > > we use cached value during recovery and cached value is always > > up-to-date. We can't rely on lseek and I have provided some

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-09-25 Thread k.jami...@fujitsu.com
On Friday, September 25, 2020 6:02 PM, Tsunakawa-san wrote: > From: Jamison, Kirk/ジャミソン カーク > > [Results] > > Recovery/Failover performance (in seconds). 3 trial runs. > > > > | shared_buffers | master | patch | %reg| > > ||||-| > > | 128MB |

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-09-25 Thread tsunakawa.ta...@fujitsu.com
From: Jamison, Kirk/ジャミソン カーク > [Results] > Recovery/Failover performance (in seconds). 3 trial runs. > > | shared_buffers | master | patch | %reg| > ||||-| > | 128MB | 32.406 | 33.785 | 4.08% | > | 1GB| 36.188 | 32.747 |

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-09-25 Thread tsunakawa.ta...@fujitsu.com
From: Amit Kapila > No, during recovery also we need to be careful. We need to ensure that > we use cached value during recovery and cached value is always > up-to-date. We can't rely on lseek and I have provided some scenario > up thread [1] where such behavior can cause problem and then see the

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-09-25 Thread k.jami...@fujitsu.com
Hi. > I'll send performance measurement results in the next email. Thanks a lot for > the reviews! Below are the performance measurement results. I was only able to use low-spec machine: CPU 4v, Memory 8GB, RHEL, xfs filesystem. [Failover/Recovery Test] 1. (Master) Create table (ex. 10,000

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-09-24 Thread Kyotaro Horiguchi
Hello. At Wed, 23 Sep 2020 05:37:24 +, "tsunakawa.ta...@fujitsu.com" wrote in > From: Jamison, Kirk/ジャミソン カーク # Wow. I'm surprised to read it.. > > I revised the patch based from my understanding of Horiguchi-san's comment, > > but I could be wrong. > > Quoting: > > > > " > > +

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-09-24 Thread k.jami...@fujitsu.com
On Thursday, September 24, 2020 1:27 PM, Tsunakawa-san wrote: > (1) > + for (cur_blk = firstDelBlock[j]; cur_blk < > nblocks; cur_blk++) > > The right side of "cur_blk <" should not be nblocks, because nblocks is not > the number of the relation fork anymore. Right.

RE: [Patch] Optimize dropping of relation buffers using dlist

2020-09-23 Thread tsunakawa.ta...@fujitsu.com
In v15: (1) + for (cur_blk = firstDelBlock[j]; cur_blk < nblocks; cur_blk++) The right side of "cur_blk <" should not be nblocks, because nblocks is not the number of the relation fork anymore. (2) + BlockNumber nblocks; +

Re: [Patch] Optimize dropping of relation buffers using dlist

2020-09-23 Thread Amit Kapila
On Wed, Sep 23, 2020 at 12:00 PM tsunakawa.ta...@fujitsu.com wrote: > > From: Amit Kapila > > The idea is that we can't use this optimization if the value is not > > cached because we can't rely on lseek behavior. See all the discussion > > between Horiguchi-San and me in the thread above. So,

<    1   2   3   >