Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)

2018-11-26 Thread Haribabu Kommi
On Fri, Nov 23, 2018 at 6:39 PM Thomas Munro 
wrote:

> On Mon, Nov 19, 2018 at 5:24 PM Kyotaro HORIGUCHI
>  wrote:
> > At Sat, 17 Nov 2018 11:15:54 -0300, Alvaro Herrera <
> alvhe...@2ndquadrant.com> wrote in
> <20181117141554.4dkx2u4j6md3bqdh@alvherre.pgsql>
> > > Is this patch committable now?
> >
> > I don't think so. We should make a decision on a point.
> >
> > I was a bit confused (sorry) but IIUIC Haribabu suggested that
> > the RBM_ZERO_ON_ERROR case should be included to read (not just
> > ignored). I'm on it.  I think that RPM_ZERO_* other than ON_ERROR
> > cases could be a kind of hit but I don't insist on it.
> >
> > So, I think we should decide on at least the ON_ERROR case before
> > this becomes commttable.
>
> Agreed, RBM_ZERO_ON_ERROR needs to be counted as a read.  Here's a new
> version like that.
>
> > The another counter could be another issue. I don't object to add
> > the counter but I'm not sure what is the suitable name.
>
> I think that might be interesting information in the future, but let's
> consider that for a later patch.
>

Thanks for the updated patch. It looks good.
I marked it in the commitfest as ready for committer.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)

2018-11-22 Thread Thomas Munro
On Mon, Nov 19, 2018 at 5:24 PM Kyotaro HORIGUCHI
 wrote:
> At Sat, 17 Nov 2018 11:15:54 -0300, Alvaro Herrera  
> wrote in <20181117141554.4dkx2u4j6md3bqdh@alvherre.pgsql>
> > Is this patch committable now?
>
> I don't think so. We should make a decision on a point.
>
> I was a bit confused (sorry) but IIUIC Haribabu suggested that
> the RBM_ZERO_ON_ERROR case should be included to read (not just
> ignored). I'm on it.  I think that RPM_ZERO_* other than ON_ERROR
> cases could be a kind of hit but I don't insist on it.
>
> So, I think we should decide on at least the ON_ERROR case before
> this becomes commttable.

Agreed, RBM_ZERO_ON_ERROR needs to be counted as a read.  Here's a new
version like that.

> The another counter could be another issue. I don't object to add
> the counter but I'm not sure what is the suitable name.

I think that might be interesting information in the future, but let's
consider that for a later patch.

-- 
Thomas Munro
http://www.enterprisedb.com


0001-Don-t-count-zero-filled-buffers-as-read-in-EXPLAI-v2.patch
Description: Binary data


Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)

2018-11-18 Thread Kyotaro HORIGUCHI
At Sat, 17 Nov 2018 11:15:54 -0300, Alvaro Herrera  
wrote in <20181117141554.4dkx2u4j6md3bqdh@alvherre.pgsql>
> Is this patch committable now?

I don't think so. We should make a decision on a point.

I was a bit confused (sorry) but IIUIC Haribabu suggested that
the RBM_ZERO_ON_ERROR case should be included to read (not just
ignored). I'm on it.  I think that RPM_ZERO_* other than ON_ERROR
cases could be a kind of hit but I don't insist on it.

So, I think we should decide on at least the ON_ERROR case before
this becomes commttable.

The another counter could be another issue. I don't object to add
the counter but I'm not sure what is the suitable name.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)

2018-11-17 Thread Alvaro Herrera
Is this patch committable now?

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)

2018-11-14 Thread Kyotaro HORIGUCHI
At Thu, 12 Jul 2018 10:31:46 +1200, Thomas Munro 
 wrote in 

> I suppose someone might argue that even when it's not a hit and it's
> not a read, we might still want to count this buffer interaction in
> some other way.  Perhaps there should be a separate counter?  It may
> technically be a kind of cache miss, but it's nowhere near as
> expensive as a synchronous system call like read() so I didn't propose
> that.

At Thu, 12 Jul 2018 10:19:29 +1000, Haribabu Kommi  
wrote in 
> Thanks for the details. I got your point. But we need to include
> RBM_ZERO_ON_ERROR case read operations, excluding others
> are fine.

FWIW I agree to Haribabu's comment that the case of RBM_ZERO_*
other than ON_ERROR is a hit. It seems to show zheap's disk I /O
reduction by itself in certain extent.

At Thu, 12 Jul 2018 13:28:37 +1200, David Rowley  
wrote in 
> On 12 July 2018 at 12:19, Haribabu Kommi  wrote:
> > Yes, I agree that we may need a new counter that counts the buffers that
> > are just allocated (no read or no write). But currently, may be the counter
> > value is very less, so people are not interested.

> The existing counters don't show up in EXPLAIN (ANALYZE, BUFFERS) when
> they're zero. The new counter would surely work the same way, so the
> users wouldn't be burdened by additional explain output it when
> they're not affected by it.

I don't object strongly neither to the new counter but I'm not
sure it is enough distinctive from hits, in the view that "a hit
is where we found a buffer for the requested page".

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)

2018-07-11 Thread David Rowley
On 12 July 2018 at 12:19, Haribabu Kommi  wrote:
> Yes, I agree that we may need a new counter that counts the buffers that
> are just allocated (no read or no write). But currently, may be the counter
> value is very less, so people are not interested.

The existing counters don't show up in EXPLAIN (ANALYZE, BUFFERS) when
they're zero. The new counter would surely work the same way, so the
users wouldn't be burdened by additional explain output it when
they're not affected by it.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)

2018-07-11 Thread Haribabu Kommi
On Thu, Jul 12, 2018 at 8:32 AM Thomas Munro 
wrote:

> On Thu, Jul 12, 2018 at 12:46 AM, Haribabu Kommi
>  wrote:
> >> > On 2018-04-30 14:59:31 +1200, Thomas Munro wrote:
> >> >> In EXPLAIN (BUFFERS), there are two kinds of cache misses that show
> up
> >> >> as "reads" when in fact they are not reads at all:
> >> >>
> >> >> 1.  Relation extension, which in fact writes a zero-filled block.
> >> >> 2.  The RBM_ZERO_* modes, which provoke neither read nor write.
> >
> > I checked the patch and I agree with the change 1). And regarding change
> 2)
> > whether it is  zeroing the contents of the page or not, it does read?
> > because if it exists in the buffer pool, we are counting them as hits
> > irrespective
> > of the mode? Am I missing something?
>
> Further down in the function you can see that there is no read()
> system call for the RBM_ZERO_* modes:
>
> if (mode == RBM_ZERO_AND_LOCK || mode ==
> RBM_ZERO_AND_CLEANUP_LOCK)
> MemSet((char *) bufBlock, 0, BLCKSZ);
> else
> {
> ...
> smgrread(smgr, forkNum, blockNum, (char *)
> bufBlock);
> ...
> }
>

Thanks for the details. I got your point. But we need to include
RBM_ZERO_ON_ERROR case read operations, excluding others
are fine.


> I suppose someone might argue that even when it's not a hit and it's
> not a read, we might still want to count this buffer interaction in
> some other way.  Perhaps there should be a separate counter?  It may
> technically be a kind of cache miss, but it's nowhere near as
> expensive as a synchronous system call like read() so I didn't propose
> that.
>

Yes, I agree that we may need a new counter that counts the buffers that
are just allocated (no read or no write). But currently, may be the counter
value is very less, so people are not interested.


> Some more on my motivation:  In our zheap prototype, when the system
> is working well and we have enough space, we constantly allocate
> zeroed buffer pages at the insert point (= head) of an undo log and
> drop pages at the discard point (= tail) in the background;
> effectively a few pages just go round and round via the freelist and
> no read() or write() syscalls happen.  That's something I'm very happy
> about and it's one of our claimed advantages over the traditional heap
> (which tends to read and dirty more pages), but EXPLAIN (BUFFERS)
> hides this virtuous behaviour when comparing with the traditional
> heap: it falsely and slanderously reports that zheap is reading undo
> pages when it is not.  Of course I don't intent to litigate zheap
> design in this thread, I just I figured that since this accounting is
> wrong on principle and affects current PostgreSQL too (at least in
> theory) I would propose this little patch independently.  It's subtle
> enough that I wouldn't bother to back-patch it though.
>

OK. May be it is better to implement the buffer allocate counter along with
zheap to provide better buffer results?

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)

2018-07-11 Thread Thomas Munro
On Thu, Jul 12, 2018 at 12:46 AM, Haribabu Kommi
 wrote:
>> > On 2018-04-30 14:59:31 +1200, Thomas Munro wrote:
>> >> In EXPLAIN (BUFFERS), there are two kinds of cache misses that show up
>> >> as "reads" when in fact they are not reads at all:
>> >>
>> >> 1.  Relation extension, which in fact writes a zero-filled block.
>> >> 2.  The RBM_ZERO_* modes, which provoke neither read nor write.
>
> I checked the patch and I agree with the change 1). And regarding change 2)
> whether it is  zeroing the contents of the page or not, it does read?
> because if it exists in the buffer pool, we are counting them as hits
> irrespective
> of the mode? Am I missing something?

Further down in the function you can see that there is no read()
system call for the RBM_ZERO_* modes:

if (mode == RBM_ZERO_AND_LOCK || mode ==
RBM_ZERO_AND_CLEANUP_LOCK)
MemSet((char *) bufBlock, 0, BLCKSZ);
else
{
...
smgrread(smgr, forkNum, blockNum, (char *) bufBlock);
...
}

I suppose someone might argue that even when it's not a hit and it's
not a read, we might still want to count this buffer interaction in
some other way.  Perhaps there should be a separate counter?  It may
technically be a kind of cache miss, but it's nowhere near as
expensive as a synchronous system call like read() so I didn't propose
that.

Some more on my motivation:  In our zheap prototype, when the system
is working well and we have enough space, we constantly allocate
zeroed buffer pages at the insert point (= head) of an undo log and
drop pages at the discard point (= tail) in the background;
effectively a few pages just go round and round via the freelist and
no read() or write() syscalls happen.  That's something I'm very happy
about and it's one of our claimed advantages over the traditional heap
(which tends to read and dirty more pages), but EXPLAIN (BUFFERS)
hides this virtuous behaviour when comparing with the traditional
heap: it falsely and slanderously reports that zheap is reading undo
pages when it is not.  Of course I don't intent to litigate zheap
design in this thread, I just I figured that since this accounting is
wrong on principle and affects current PostgreSQL too (at least in
theory) I would propose this little patch independently.  It's subtle
enough that I wouldn't bother to back-patch it though.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)

2018-07-11 Thread Haribabu Kommi
On Mon, Apr 30, 2018 at 1:38 PM Thomas Munro 
wrote:

> On Mon, Apr 30, 2018 at 3:13 PM, Andres Freund  wrote:
> > On 2018-04-30 14:59:31 +1200, Thomas Munro wrote:
> >> In EXPLAIN (BUFFERS), there are two kinds of cache misses that show up
> >> as "reads" when in fact they are not reads at all:
> >>
> >> 1.  Relation extension, which in fact writes a zero-filled block.
> >> 2.  The RBM_ZERO_* modes, which provoke neither read nor write.
> >
> > Just for understanding: 2) can never happen for buffers showing up in
> > EXPLAIN, right?
> >
> > I'm not saying you shouldn't fix the accounting...
>
> Maybe the hash AM can reach that in _hash_getinitbuf() while adding
> overflow pages to bucket chains?  Admittedly case 2 is obscure and
> rare if not unreachable and probably no one would care too much about
> that in practice (whereas case 1 can be seen by simply inserting stuff
> into a new empty table).  Other patches I'm working on for later
> proposal do it more often (think accessing known-empty pages in a kind
> of preallocated extent), and it occurred to me that it's clearly a bug
> on principle, hence this patch.
>

I checked the patch and I agree with the change 1). And regarding change 2)
whether it is  zeroing the contents of the page or not, it does read?
because if it exists in the buffer pool, we are counting them as hits
irrespective
of the mode? Am I missing something?

Regards,
Haribabu Kommi
Fujitsu Australia


Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)

2018-04-29 Thread Thomas Munro
On Mon, Apr 30, 2018 at 3:13 PM, Andres Freund  wrote:
> On 2018-04-30 14:59:31 +1200, Thomas Munro wrote:
>> In EXPLAIN (BUFFERS), there are two kinds of cache misses that show up
>> as "reads" when in fact they are not reads at all:
>>
>> 1.  Relation extension, which in fact writes a zero-filled block.
>> 2.  The RBM_ZERO_* modes, which provoke neither read nor write.
>
> Just for understanding: 2) can never happen for buffers showing up in
> EXPLAIN, right?
>
> I'm not saying you shouldn't fix the accounting...

Maybe the hash AM can reach that in _hash_getinitbuf() while adding
overflow pages to bucket chains?  Admittedly case 2 is obscure and
rare if not unreachable and probably no one would care too much about
that in practice (whereas case 1 can be seen by simply inserting stuff
into a new empty table).  Other patches I'm working on for later
proposal do it more often (think accessing known-empty pages in a kind
of preallocated extent), and it occurred to me that it's clearly a bug
on principle, hence this patch.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Accounting of zero-filled buffers in EXPLAIN (BUFFERS)

2018-04-29 Thread Andres Freund
On 2018-04-30 14:59:31 +1200, Thomas Munro wrote:
> Hi,
> 
> In EXPLAIN (BUFFERS), there are two kinds of cache misses that show up
> as "reads" when in fact they are not reads at all:
> 
> 1.  Relation extension, which in fact writes a zero-filled block.
> 2.  The RBM_ZERO_* modes, which provoke neither read nor write.

Just for understanding: 2) can never happen for buffers showing up in
EXPLAIN, right?

I'm not saying you shouldn't fix the accounting...

Greetings,

Andres Freund



Accounting of zero-filled buffers in EXPLAIN (BUFFERS)

2018-04-29 Thread Thomas Munro
Hi,

In EXPLAIN (BUFFERS), there are two kinds of cache misses that show up
as "reads" when in fact they are not reads at all:

1.  Relation extension, which in fact writes a zero-filled block.
2.  The RBM_ZERO_* modes, which provoke neither read nor write.

Here's a suggested fix.

I noticed this because I have some patches in development that hit
these cases a bit more and the numbers didn't match my expectation.  I
suppose someone might want a separate counter for zero-filled buffers
(they're still buffer interactions and cache misses) but it seems like
it's probably below the kind of thing we're interested in counting
(though in passing, I notice recently that some kernels keep some free
pages pre-zeroed so they can supply them faster, which is curious).

-- 
Thomas Munro
http://www.enterprisedb.com


0001-Don-t-count-zero-filled-buffers-as-read-in-EXPLAIN.patch
Description: Binary data