Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)
On Fri, Nov 23, 2018 at 6:39 PM Thomas Munro wrote: > On Mon, Nov 19, 2018 at 5:24 PM Kyotaro HORIGUCHI > wrote: > > At Sat, 17 Nov 2018 11:15:54 -0300, Alvaro Herrera < > alvhe...@2ndquadrant.com> wrote in > <20181117141554.4dkx2u4j6md3bqdh@alvherre.pgsql> > > > Is this patch committable now? > > > > I don't think so. We should make a decision on a point. > > > > I was a bit confused (sorry) but IIUIC Haribabu suggested that > > the RBM_ZERO_ON_ERROR case should be included to read (not just > > ignored). I'm on it. I think that RPM_ZERO_* other than ON_ERROR > > cases could be a kind of hit but I don't insist on it. > > > > So, I think we should decide on at least the ON_ERROR case before > > this becomes commttable. > > Agreed, RBM_ZERO_ON_ERROR needs to be counted as a read. Here's a new > version like that. > > > The another counter could be another issue. I don't object to add > > the counter but I'm not sure what is the suitable name. > > I think that might be interesting information in the future, but let's > consider that for a later patch. > Thanks for the updated patch. It looks good. I marked it in the commitfest as ready for committer. Regards, Haribabu Kommi Fujitsu Australia
Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)
On Mon, Nov 19, 2018 at 5:24 PM Kyotaro HORIGUCHI wrote: > At Sat, 17 Nov 2018 11:15:54 -0300, Alvaro Herrera > wrote in <20181117141554.4dkx2u4j6md3bqdh@alvherre.pgsql> > > Is this patch committable now? > > I don't think so. We should make a decision on a point. > > I was a bit confused (sorry) but IIUIC Haribabu suggested that > the RBM_ZERO_ON_ERROR case should be included to read (not just > ignored). I'm on it. I think that RPM_ZERO_* other than ON_ERROR > cases could be a kind of hit but I don't insist on it. > > So, I think we should decide on at least the ON_ERROR case before > this becomes commttable. Agreed, RBM_ZERO_ON_ERROR needs to be counted as a read. Here's a new version like that. > The another counter could be another issue. I don't object to add > the counter but I'm not sure what is the suitable name. I think that might be interesting information in the future, but let's consider that for a later patch. -- Thomas Munro http://www.enterprisedb.com 0001-Don-t-count-zero-filled-buffers-as-read-in-EXPLAI-v2.patch Description: Binary data
Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)
At Sat, 17 Nov 2018 11:15:54 -0300, Alvaro Herrera wrote in <20181117141554.4dkx2u4j6md3bqdh@alvherre.pgsql> > Is this patch committable now? I don't think so. We should make a decision on a point. I was a bit confused (sorry) but IIUIC Haribabu suggested that the RBM_ZERO_ON_ERROR case should be included to read (not just ignored). I'm on it. I think that RPM_ZERO_* other than ON_ERROR cases could be a kind of hit but I don't insist on it. So, I think we should decide on at least the ON_ERROR case before this becomes commttable. The another counter could be another issue. I don't object to add the counter but I'm not sure what is the suitable name. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)
Is this patch committable now? -- Álvaro Herrerahttps://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)
At Thu, 12 Jul 2018 10:31:46 +1200, Thomas Munro wrote in > I suppose someone might argue that even when it's not a hit and it's > not a read, we might still want to count this buffer interaction in > some other way. Perhaps there should be a separate counter? It may > technically be a kind of cache miss, but it's nowhere near as > expensive as a synchronous system call like read() so I didn't propose > that. At Thu, 12 Jul 2018 10:19:29 +1000, Haribabu Kommi wrote in > Thanks for the details. I got your point. But we need to include > RBM_ZERO_ON_ERROR case read operations, excluding others > are fine. FWIW I agree to Haribabu's comment that the case of RBM_ZERO_* other than ON_ERROR is a hit. It seems to show zheap's disk I /O reduction by itself in certain extent. At Thu, 12 Jul 2018 13:28:37 +1200, David Rowley wrote in > On 12 July 2018 at 12:19, Haribabu Kommi wrote: > > Yes, I agree that we may need a new counter that counts the buffers that > > are just allocated (no read or no write). But currently, may be the counter > > value is very less, so people are not interested. > The existing counters don't show up in EXPLAIN (ANALYZE, BUFFERS) when > they're zero. The new counter would surely work the same way, so the > users wouldn't be burdened by additional explain output it when > they're not affected by it. I don't object strongly neither to the new counter but I'm not sure it is enough distinctive from hits, in the view that "a hit is where we found a buffer for the requested page". regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)
On 12 July 2018 at 12:19, Haribabu Kommi wrote: > Yes, I agree that we may need a new counter that counts the buffers that > are just allocated (no read or no write). But currently, may be the counter > value is very less, so people are not interested. The existing counters don't show up in EXPLAIN (ANALYZE, BUFFERS) when they're zero. The new counter would surely work the same way, so the users wouldn't be burdened by additional explain output it when they're not affected by it. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)
On Thu, Jul 12, 2018 at 8:32 AM Thomas Munro wrote: > On Thu, Jul 12, 2018 at 12:46 AM, Haribabu Kommi > wrote: > >> > On 2018-04-30 14:59:31 +1200, Thomas Munro wrote: > >> >> In EXPLAIN (BUFFERS), there are two kinds of cache misses that show > up > >> >> as "reads" when in fact they are not reads at all: > >> >> > >> >> 1. Relation extension, which in fact writes a zero-filled block. > >> >> 2. The RBM_ZERO_* modes, which provoke neither read nor write. > > > > I checked the patch and I agree with the change 1). And regarding change > 2) > > whether it is zeroing the contents of the page or not, it does read? > > because if it exists in the buffer pool, we are counting them as hits > > irrespective > > of the mode? Am I missing something? > > Further down in the function you can see that there is no read() > system call for the RBM_ZERO_* modes: > > if (mode == RBM_ZERO_AND_LOCK || mode == > RBM_ZERO_AND_CLEANUP_LOCK) > MemSet((char *) bufBlock, 0, BLCKSZ); > else > { > ... > smgrread(smgr, forkNum, blockNum, (char *) > bufBlock); > ... > } > Thanks for the details. I got your point. But we need to include RBM_ZERO_ON_ERROR case read operations, excluding others are fine. > I suppose someone might argue that even when it's not a hit and it's > not a read, we might still want to count this buffer interaction in > some other way. Perhaps there should be a separate counter? It may > technically be a kind of cache miss, but it's nowhere near as > expensive as a synchronous system call like read() so I didn't propose > that. > Yes, I agree that we may need a new counter that counts the buffers that are just allocated (no read or no write). But currently, may be the counter value is very less, so people are not interested. > Some more on my motivation: In our zheap prototype, when the system > is working well and we have enough space, we constantly allocate > zeroed buffer pages at the insert point (= head) of an undo log and > drop pages at the discard point (= tail) in the background; > effectively a few pages just go round and round via the freelist and > no read() or write() syscalls happen. That's something I'm very happy > about and it's one of our claimed advantages over the traditional heap > (which tends to read and dirty more pages), but EXPLAIN (BUFFERS) > hides this virtuous behaviour when comparing with the traditional > heap: it falsely and slanderously reports that zheap is reading undo > pages when it is not. Of course I don't intent to litigate zheap > design in this thread, I just I figured that since this accounting is > wrong on principle and affects current PostgreSQL too (at least in > theory) I would propose this little patch independently. It's subtle > enough that I wouldn't bother to back-patch it though. > OK. May be it is better to implement the buffer allocate counter along with zheap to provide better buffer results? Regards, Haribabu Kommi Fujitsu Australia
Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)
On Thu, Jul 12, 2018 at 12:46 AM, Haribabu Kommi wrote: >> > On 2018-04-30 14:59:31 +1200, Thomas Munro wrote: >> >> In EXPLAIN (BUFFERS), there are two kinds of cache misses that show up >> >> as "reads" when in fact they are not reads at all: >> >> >> >> 1. Relation extension, which in fact writes a zero-filled block. >> >> 2. The RBM_ZERO_* modes, which provoke neither read nor write. > > I checked the patch and I agree with the change 1). And regarding change 2) > whether it is zeroing the contents of the page or not, it does read? > because if it exists in the buffer pool, we are counting them as hits > irrespective > of the mode? Am I missing something? Further down in the function you can see that there is no read() system call for the RBM_ZERO_* modes: if (mode == RBM_ZERO_AND_LOCK || mode == RBM_ZERO_AND_CLEANUP_LOCK) MemSet((char *) bufBlock, 0, BLCKSZ); else { ... smgrread(smgr, forkNum, blockNum, (char *) bufBlock); ... } I suppose someone might argue that even when it's not a hit and it's not a read, we might still want to count this buffer interaction in some other way. Perhaps there should be a separate counter? It may technically be a kind of cache miss, but it's nowhere near as expensive as a synchronous system call like read() so I didn't propose that. Some more on my motivation: In our zheap prototype, when the system is working well and we have enough space, we constantly allocate zeroed buffer pages at the insert point (= head) of an undo log and drop pages at the discard point (= tail) in the background; effectively a few pages just go round and round via the freelist and no read() or write() syscalls happen. That's something I'm very happy about and it's one of our claimed advantages over the traditional heap (which tends to read and dirty more pages), but EXPLAIN (BUFFERS) hides this virtuous behaviour when comparing with the traditional heap: it falsely and slanderously reports that zheap is reading undo pages when it is not. Of course I don't intent to litigate zheap design in this thread, I just I figured that since this accounting is wrong on principle and affects current PostgreSQL too (at least in theory) I would propose this little patch independently. It's subtle enough that I wouldn't bother to back-patch it though. -- Thomas Munro http://www.enterprisedb.com
Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)
On Mon, Apr 30, 2018 at 1:38 PM Thomas Munro wrote: > On Mon, Apr 30, 2018 at 3:13 PM, Andres Freund wrote: > > On 2018-04-30 14:59:31 +1200, Thomas Munro wrote: > >> In EXPLAIN (BUFFERS), there are two kinds of cache misses that show up > >> as "reads" when in fact they are not reads at all: > >> > >> 1. Relation extension, which in fact writes a zero-filled block. > >> 2. The RBM_ZERO_* modes, which provoke neither read nor write. > > > > Just for understanding: 2) can never happen for buffers showing up in > > EXPLAIN, right? > > > > I'm not saying you shouldn't fix the accounting... > > Maybe the hash AM can reach that in _hash_getinitbuf() while adding > overflow pages to bucket chains? Admittedly case 2 is obscure and > rare if not unreachable and probably no one would care too much about > that in practice (whereas case 1 can be seen by simply inserting stuff > into a new empty table). Other patches I'm working on for later > proposal do it more often (think accessing known-empty pages in a kind > of preallocated extent), and it occurred to me that it's clearly a bug > on principle, hence this patch. > I checked the patch and I agree with the change 1). And regarding change 2) whether it is zeroing the contents of the page or not, it does read? because if it exists in the buffer pool, we are counting them as hits irrespective of the mode? Am I missing something? Regards, Haribabu Kommi Fujitsu Australia
Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)
On Mon, Apr 30, 2018 at 3:13 PM, Andres Freundwrote: > On 2018-04-30 14:59:31 +1200, Thomas Munro wrote: >> In EXPLAIN (BUFFERS), there are two kinds of cache misses that show up >> as "reads" when in fact they are not reads at all: >> >> 1. Relation extension, which in fact writes a zero-filled block. >> 2. The RBM_ZERO_* modes, which provoke neither read nor write. > > Just for understanding: 2) can never happen for buffers showing up in > EXPLAIN, right? > > I'm not saying you shouldn't fix the accounting... Maybe the hash AM can reach that in _hash_getinitbuf() while adding overflow pages to bucket chains? Admittedly case 2 is obscure and rare if not unreachable and probably no one would care too much about that in practice (whereas case 1 can be seen by simply inserting stuff into a new empty table). Other patches I'm working on for later proposal do it more often (think accessing known-empty pages in a kind of preallocated extent), and it occurred to me that it's clearly a bug on principle, hence this patch. -- Thomas Munro http://www.enterprisedb.com
Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)
On 2018-04-30 14:59:31 +1200, Thomas Munro wrote: > Hi, > > In EXPLAIN (BUFFERS), there are two kinds of cache misses that show up > as "reads" when in fact they are not reads at all: > > 1. Relation extension, which in fact writes a zero-filled block. > 2. The RBM_ZERO_* modes, which provoke neither read nor write. Just for understanding: 2) can never happen for buffers showing up in EXPLAIN, right? I'm not saying you shouldn't fix the accounting... Greetings, Andres Freund
Accounting of zero-filled buffers in EXPLAIN (BUFFERS)
Hi, In EXPLAIN (BUFFERS), there are two kinds of cache misses that show up as "reads" when in fact they are not reads at all: 1. Relation extension, which in fact writes a zero-filled block. 2. The RBM_ZERO_* modes, which provoke neither read nor write. Here's a suggested fix. I noticed this because I have some patches in development that hit these cases a bit more and the numbers didn't match my expectation. I suppose someone might want a separate counter for zero-filled buffers (they're still buffer interactions and cache misses) but it seems like it's probably below the kind of thing we're interested in counting (though in passing, I notice recently that some kernels keep some free pages pre-zeroed so they can supply them faster, which is curious). -- Thomas Munro http://www.enterprisedb.com 0001-Don-t-count-zero-filled-buffers-as-read-in-EXPLAIN.patch Description: Binary data