Re: [HACKERS] show Heap Fetches in EXPLAIN for index-only scans

2012-02-02 Thread Noah Misch
On Wed, Jan 25, 2012 at 08:42:05PM -0500, Robert Haas wrote:
 Only the first pass of vacuum knows how to mark pages all-visible.
 After the update, the first pass of the first vacuum sees a dead tuple
 on the old page and truncates it to a dead line pointer.  When it
 comes to the new page, it observes that the page is now all-visible
 and marks it so.  It then does index vacuuming and returns to the
 first page, marking the dead line pointer unused.  But during this
 second visit to the old page, there's no possibility of marking the
 page as all-visible, because the code doesn't know how to do that.
 The next vacuum's first pass, however, can do so, because there are no
 longer any dead tuples on the page.
 
 We could fix this by modifying lazy_vacuum_page(): since we have to
 dirty the buffer anyway, we could recheck whether all the remaining
 tuples on the page are now all-visible, and if so set the visibility
 map bit.  This is probably desirable even apart from index-only scans,
 because it will frequently save the next vacuum the cost of reading,
 dirtying, and writing extra pages.  There will be some incremental CPU
 cost, but that seems likely to be more than repaid by the I/O savings.
 
 Thoughts?  Should we do this at all?  If so, should we squeeze it into
 9.2 or leave it for 9.3?

Sounds like a good idea.  It has bothered me that two consecutive VACUUMs of a
table, with no intervening activity, can dirty a page twice.  Making that less
frequent is a good thing.  I'd hold the change for 9.3, but that's probably an
unusually-conservative viewpoint.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] show Heap Fetches in EXPLAIN for index-only scans

2012-01-25 Thread Robert Haas
On Sat, Jan 21, 2012 at 9:50 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 A review:

 [ review ]

Thanks.  Committed with hopefully-appropriate revisions.

 As a side-note, I noticed that I needed to run vacuum twice in a row
 to get the Heap Fetches to drop to zero.  I vaguely recall that only
 one vacuum was needed when ios first went in (and I had instrumented
 it to elog heap-fetches).  Does anyone know if this the expected
 consequence of one of the recent changes we made to vacuum?

No, that's not expected.  The change we made to vacuum was to skip
pages that are busy - but it shouldn't be randomly skipping pages for
no reason.  I can reproduce what you're observing, though:

[rhaas 16384]$ pg_filedump 16411 | grep TLI.*Flags | grep -v 'Flags: 0x0004'
 TLI: 0x0001  Prune XID: 0x  Flags: 0x0005 (HAS_FREE_LINES)
 TLI: 0x0001  Prune XID: 0x  Flags: 0x0005 (HAS_FREE_LINES)

After updating a row in the table and checkpointing, the page the rows
was on is marked full and the page that gets the new version becomes
not-all-visible:

[rhaas 16384]$ pg_filedump 16411 | grep TLI.*Flags | grep -v 'Flags: 0x0004'
 TLI: 0x0001  Prune XID: 0x03fb  Flags: 0x0003 (HAS_FREE_LINES|PAGE_FULL)
 TLI: 0x0001  Prune XID: 0x  Flags: 0x0001 (HAS_FREE_LINES)

Now I vacuum the relation and checkpoint, and the page the *new*
relation is on becomes all-visible:

[rhaas 16384]$ pg_filedump 16411 | grep TLI.*Flags | grep -v 'Flags: 0x0004'
 TLI: 0x0001  Prune XID: 0x  Flags: 0x0001 (HAS_FREE_LINES)
 TLI: 0x0001  Prune XID: 0x  Flags: 0x0005 (HAS_FREE_LINES)

Now I vacuum it again and checkpoint, and now the old page also
becomes all-visible:

[rhaas 16384]$ pg_filedump 16411 | grep TLI.*Flags | grep -v 'Flags: 0x0004'
 TLI: 0x0001  Prune XID: 0x  Flags: 0x0005 (HAS_FREE_LINES)
 TLI: 0x0001  Prune XID: 0x  Flags: 0x0005 (HAS_FREE_LINES)

But it seems to me that this is expected (if non-optimal) behavior.
Only the first pass of vacuum knows how to mark pages all-visible.
After the update, the first pass of the first vacuum sees a dead tuple
on the old page and truncates it to a dead line pointer.  When it
comes to the new page, it observes that the page is now all-visible
and marks it so.  It then does index vacuuming and returns to the
first page, marking the dead line pointer unused.  But during this
second visit to the old page, there's no possibility of marking the
page as all-visible, because the code doesn't know how to do that.
The next vacuum's first pass, however, can do so, because there are no
longer any dead tuples on the page.

We could fix this by modifying lazy_vacuum_page(): since we have to
dirty the buffer anyway, we could recheck whether all the remaining
tuples on the page are now all-visible, and if so set the visibility
map bit.  This is probably desirable even apart from index-only scans,
because it will frequently save the next vacuum the cost of reading,
dirtying, and writing extra pages.  There will be some incremental CPU
cost, but that seems likely to be more than repaid by the I/O savings.

Thoughts?  Should we do this at all?  If so, should we squeeze it into
9.2 or leave it for 9.3?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] show Heap Fetches in EXPLAIN for index-only scans

2012-01-21 Thread Jeff Janes
On Fri, Jan 13, 2012 at 7:21 AM, Robert Haas robertmh...@gmail.com wrote:
 I think that people who are using index-only scans are going to want
 some way to find out the degree to which the scans are in fact
 index-only.

 So here's a 5-line patch that adds the number of heap fetches to the
 EXPLAIN ANALYZE output.  This might not be all the instrumentation
 we'll ever want here, but I think we at least want this much.

A review:

It is not in context diff format.

It applies, builds, and runs cleanly, including under --enable-cassert

No docs, but the output format of other EXPLAIN ANALYZE fields
generally aren't documented either.  But it is fairly self-explanatory
(provided one knows what an index-only scan is in the first place).

No regression tests, but that too seems appropriate.

It does what it says.  We (or I, anyway) want what it does.  As
mentioned, we might want counts also exposed via the stats collector,
but that is another matter.

I don't see how it could possibly cause a meaningful slow-down, and
tests on all-memory index-only scans cannot detect a slow down.

On my machine, using EXPLAIN ANALYZE causes a 16 fold slow down on an
in memory -i -s100 select count(*) from pgbench_accounts conducted
after a run of pgbench -T30, so that there are some heap fetches
needed.  Compared to that ANALYZE slowdown, any additional slow down
from this patch is ridiculously slow.  It would be nice if you could
get the output of this patch (and of the BUFFERS option to EXPLAIN)
without incurring the timing overhead of ANALYZE.  But again, that is
not the subject of this patch.

I wondered about the type of ioss_HeapFetches.  Other members of
execnodes.h structs tend to be int64, not long.  But those others are
not reported in EXPLAIN.  Other things reported in explain.c tend to
be long.  This seems to be a foot in both worlds, and your response to
Peter is convincing.

I agree with Tom on the pre increment versus post increment of ioss_HeapFetches.

I also wondered if ioss_HeapFetches ought to be initialized to zero.
I looked for analogous code but didn't find enough analogy to convince
me one way or the other.

All the other members of IndexOnlyScanState have entries in the big
comment block preceding the typedef, so I would expect
ioss_HeapFetches should have one as well.

These are all minor issues, and since you are a committer I'm going to
mark it ready for committer.

As a side-note, I noticed that I needed to run vacuum twice in a row
to get the Heap Fetches to drop to zero.  I vaguely recall that only
one vacuum was needed when ios first went in (and I had instrumented
it to elog heap-fetches).  Does anyone know if this the expected
consequence of one of the recent changes we made to vacuum?

Thanks,

Jeff

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] show Heap Fetches in EXPLAIN for index-only scans

2012-01-13 Thread Robert Haas
I think that people who are using index-only scans are going to want
some way to find out the degree to which the scans are in fact
index-only.

So here's a 5-line patch that adds the number of heap fetches to the
EXPLAIN ANALYZE output.  This might not be all the instrumentation
we'll ever want here, but I think we at least want this much.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


explain-heap-fetches.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] show Heap Fetches in EXPLAIN for index-only scans

2012-01-13 Thread Magnus Hagander
On Fri, Jan 13, 2012 at 16:21, Robert Haas robertmh...@gmail.com wrote:
 I think that people who are using index-only scans are going to want
 some way to find out the degree to which the scans are in fact
 index-only.

 So here's a 5-line patch that adds the number of heap fetches to the
 EXPLAIN ANALYZE output.  This might not be all the instrumentation
 we'll ever want here, but I think we at least want this much.

Agreed.

Would also be good to have counter sin pg_stat_* for this, since you'd
usually want to look at this kind of data over time as well. In your
plans? ;)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] show Heap Fetches in EXPLAIN for index-only scans

2012-01-13 Thread Peter Geoghegan
On 13 January 2012 15:21, Robert Haas robertmh...@gmail.com wrote:
 I think that people who are using index-only scans are going to want
 some way to find out the degree to which the scans are in fact
 index-only.

 So here's a 5-line patch that adds the number of heap fetches to the
 EXPLAIN ANALYZE output.  This might not be all the instrumentation
 we'll ever want here, but I think we at least want this much.

Good idea. The fact that that information wasn't available did bother me.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] show Heap Fetches in EXPLAIN for index-only scans

2012-01-13 Thread Robert Haas
On Fri, Jan 13, 2012 at 10:29 AM, Magnus Hagander mag...@hagander.net wrote:
 On Fri, Jan 13, 2012 at 16:21, Robert Haas robertmh...@gmail.com wrote:
 I think that people who are using index-only scans are going to want
 some way to find out the degree to which the scans are in fact
 index-only.

 So here's a 5-line patch that adds the number of heap fetches to the
 EXPLAIN ANALYZE output.  This might not be all the instrumentation
 we'll ever want here, but I think we at least want this much.

 Agreed.

 Would also be good to have counter sin pg_stat_* for this, since you'd
 usually want to look at this kind of data over time as well. In your
 plans? ;)

Not really.  I don't have a clear enough idea about what that should
look like, and I expect a vigorous debate over the distributed cost of
another counter.  But I'm happy to have someone who feels more
strongly about it than I do take up the cause.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] show Heap Fetches in EXPLAIN for index-only scans

2012-01-13 Thread Peter Geoghegan
On 13 January 2012 15:31, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 10:29 AM, Magnus Hagander mag...@hagander.net wrote:
 Would also be good to have counter sin pg_stat_* for this, since you'd
 usually want to look at this kind of data over time as well. In your
 plans? ;)

 Not really.  I don't have a clear enough idea about what that should
 look like, and I expect a vigorous debate over the distributed cost of
 another counter.  But I'm happy to have someone who feels more
 strongly about it than I do take up the cause.

Maybe the the ioss_HeapFetches field should be a uint32? That's all
it's going to be on some platforms anyway.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] show Heap Fetches in EXPLAIN for index-only scans

2012-01-13 Thread Robert Haas
On Fri, Jan 13, 2012 at 10:41 AM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 On 13 January 2012 15:31, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Jan 13, 2012 at 10:29 AM, Magnus Hagander mag...@hagander.net 
 wrote:
 Would also be good to have counter sin pg_stat_* for this, since you'd
 usually want to look at this kind of data over time as well. In your
 plans? ;)

 Not really.  I don't have a clear enough idea about what that should
 look like, and I expect a vigorous debate over the distributed cost of
 another counter.  But I'm happy to have someone who feels more
 strongly about it than I do take up the cause.

 Maybe the the ioss_HeapFetches field should be a uint32? That's all
 it's going to be on some platforms anyway.

I originally had it that way, but then it didn't match what
ExplainPropertyLong() was looking for.  I didn't think it was worth
further complicating explain.c for this...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] show Heap Fetches in EXPLAIN for index-only scans

2012-01-13 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 So here's a 5-line patch that adds the number of heap fetches to the
 EXPLAIN ANALYZE output.  This might not be all the instrumentation
 we'll ever want here, but I think we at least want this much.

Cosmetic gripes:

1. Please initialize the counter in ExecInitIndexOnlyScan.  We don't
generally rely on node fields to init as zeroes.

2. Project style is to use foo++, not ++foo, in contexts where
it doesn't actually matter which is used.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers