Re: [HACKERS] Set visibility map bit after HOT prune

2013-01-07 Thread Pavan Deolasee
Sorry for a long pause on this thread. A new arrival at home kept me
occupied all the time.

This thread saw a lot of ideas and suggestions from different people. I
don't think we had an agreement one way or the other on any of them, but
let me summarize the discussion for archival and taking further action if
deemed necessary.

*Suggestion 1: Set the visibility map bit after HOT prune
*
The rational for this idea is to improve the chances of an index-only scan
happening after HOT prune. This is especially interesting when the a table
gets random updates or deletes, each of which will clear the VM bit. The
table may not still come up for vacuum, either because the number of
updates/deletes are not over the vac threshold or because subsequent HOT
prune did not leave any work for vacuum. The only place where we set the VM
bits again is during vacuum. So this idea would add another path where VM
bits are set. This would also help vacuums to avoid visiting those heap
pages that don't have any work to be done.

The main objection to this idea is that this may result in too much
flip-flopping of the bit, especially if the HOT prune is to be followed by
an UPDATE to the page. This is a valid concern. But the way HOT prune works
today, it has no linkage to the future UPDATE operations other than the
fact that it frees up space for future UPDATE operations. But the prune can
happen even in a select code path. Suggestion 2 below is about changing
this behavior, but my point is to consider 1 unless and until we do 2. Tom
and Simon opposed saying we need to take a holistic view.  Another concern
with this idea is that VM bit set operation now generates WAL and repeated
setting/clearing of the bit may increase WAL activity. I suggested to piggy
back the VM bit set logging with the HOT prune WAL log. Robert raised some
doubts regarding increased full-page writes if VM set LSN is recorded in
the heap page LSN. I am not sure if that applies if we piggy back the
operation though because HOT prune WAL would anyway record LSN in the heap
page.

If we do this, we should also consider updating FSM after prune because
vacuum may not scan this page at all.

*Suggestion 2: Move HOT prune logic somewhere else
*
Tom/Simon suggested that we ought to consider moving HOT prune to some
other code path. When we implemented HOT a few years back, we wanted to
make it as less invasive as possible. But now the code has proven
stability, we can experiment a few more things. Especially, we would like
to prune only if the page is going to receive an UPDATE soon. Otherwise,
pruning may unnecessarily add overheads to a simple read-only query and the
space freed up by prune may not even be used soon/ever. Tom suggested that
we can teach planner/executor to distinguish between a scan on a normal
relation vs result relation. I'd some concerns that even if we have such
mechanism, it may not be enough because a scan does not guarantee that the
tuple will be finally updated because it may fail qualification etc.

Simon has strong views regarding burdening SELECTs with maintenance work,
but Robert and I are not convinced that its necessarily a bad idea to let
SELECTs do a little extra work which can help to keep the overall state
healthy.  But in general, it might be a good idea to try such approaches
and see if we can extract more out of the system. Suggestion 5 and 6 also
popped up to handle this problem in a slightly different way.

*Suggestion 3: Don't clear visibility map bit after HOT update
*
I proposed this during the course of discussion and Andreas F
liked/supported the idea. This could be useful when most/all updates are
HOT updates. So the page does not need any work during vacuum (assuming HOT
prune will take care of it) and index-only scans still work because the
index pointers will be pointing to a valid HOT chain. Tom/Simon didn't
quite like it because they were worried that this will change the meaning
on the VM. I (and I think even Andreas) don't think that way. Of course,
there are some concerns because this will break the use of PD_ALL_VISIBLE
flag for avoiding MVCC checks during heap scans. There are couple of
suggestions to fix that like having another page level bit to differentiate
these two states. Doing that will help us skip MVCC checks even if there
are one or more DEAD line pointers in the page. We should also run some
performance tests to see how much benefit is really served by skipping MVCC
checks in heap scans. We can weigh that against the benefit of keeping the
VM bit set.

*Suggestion 4: Freeze tuples during HOT prune*
*
*
I suggested that we can also freeze tuples during HOT prune. The rational
for doing it this way is to remove unnecessary work from vacuum by
piggy-backing the freeze logging in the HOT prune WAL record. Today vacuum
will generate additional WAL and dirty the buffers again just to freeze the
tuples. There are couple of objections to this idea. One is pushes
background work into foreground 

Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-20 Thread Simon Riggs
On 20 December 2012 00:43, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Dec 19, 2012 at 12:39 PM, Simon Riggs si...@2ndquadrant.com wrote:
 The benefit of saying that only UPDATEs clean the block is that this
 penalises only the workload making the mess, rather than everybody
 cleaning up repeatedly over one messy guy.

 Right, but there are plenty of situations where having everybody clean
 up after the messy guy is better than waiting around and hoping that
 Mom (aka vacuum) will do it.

The problems I see are that cleaning on SELECT is too frequent,
interferes with foreground performance and re-dirties data blocks too
often.

Waiting for Mom is configurable, since we can set parameters for
autovacuum. But we can't turn off the cleaning by SELECTs, which makes
the configurability of autovacuum somewhat moot.

We could also contact the Cleaner instead.

ISTM that if someone spots a block that could use cleanup, they mark
the block as BM_PIN_COUNT_WAITER, but don't set pid. Then when they
unpin the block they send a signal/queue work for a special cleaning
process to come in and do the work now that nobody is waiting. Logic
would allow VACUUMs to go past that by setting the pid. If we
prioritised cleanup onto blocks that are already dirty we would
minimise I/O.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-20 Thread Simon Riggs
On 20 December 2012 00:10, Pavan Deolasee pavan.deola...@gmail.com wrote:

 I just thought that we can fairly easily limit the damage if we are
 really worried about SELECTs being penalised. What if we set a
 configurable limit on *extra* things that a query may do which is
 otherwise not very useful for the query itself, but is useful to keep
 the system healthy and steady. HOT prune definitely counts as one of
 them and may be even setting of hint bits. (This is a topic for a
 separate thread though)

I like this idea

transaction_cleanup_limit = -1 (default), 0, 1+

-1 means no limit on number of cleanups in this transaction, which is
current behaviour.
Other numbers are the number of cleanups that will be tolerated in
this transaction; once we hit the limit we don't attempt cleanup
anymore we just get on with it. The limit is ignored for UPDATEs since
they need to clear space for their work.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-20 Thread Robert Haas
On Wed, Dec 19, 2012 at 11:12 PM, Pavan Deolasee
pavan.deola...@gmail.com wrote:
 Yeah, VM buffer contention can become prominent if we break the
 invariant that page level bit status implies the vm bit status, at
 least when its clear.OTOH IMHO we need some mechanism to address the
 issue of aggressive clearing of the VM bits, but a very lame
 corresponding set operation.

I agree.

 Today we don't have much contention on
 the VM page, but we must be sacrificing its usability in return. IOS
 as well as vacuum optimizations using VMs will turn out not so useful
 for many workloads.

I have that fear too, but the evidence isn't really in yet.  The
testing that people have reported on this list has had mostly positive
outcomes.  Of course that doesn't mean that it will work as well in
the field as it did in the lab, but it doesn't mean that it won't,
either.

 I'm very reluctant to suggest that we can solve
 this my setting aside another page-level bit to track visibility of
 tuples for heapscans. Or even have a bit in the tuple header itself to
 track this information at that level to avoid repeated visibility
 check for a tuple which is known to be visible to all current and
 future transactions.

This has been suggested before, as an alternative to freezing tuples.
It seems to have some potential although I think more thought and work
is needed to figure out exactly where to go with it.

 And we expect vacuums to be very less or none. AFAIR in pgbench, it
 now takes hours for accounts table to get chosen for vacuum and we
 should be happy about it. But IOS are almost impossible for pgbench
 kind of workloads today because of our aggressive strategy to clear
 the VM bits.

IMHO, it's probably fairly hopeless to make a pure pgbench workload
show a benefit from index-only scans.  A large table under a very
heavy, completely random write workload is just about the worst
possible case for index-only scans.  Index-only scans are a way of
avoiding unnecessary visibility checks when the target data hasn't
changed recently, not a magic bullet to escape all heap access.  If
the target data has changed, you're going to have to touch the heap.
And while I agree that we aren't aggressive enough in setting the VM
bits right now, I also think it wouldn't be too hard to go too far in
the opposite direction: we could easily spend more effort trying to
make index-only scans effective than we could ever hope to recoup from
the scans themselves.

Now, if you set up N threads of which 10% are doing regular pgbench
and the other 90% are doing pgbench -S, or something like that, then
you might start to hope for some benefit from index-only scans.  But I
think you might also GET some benefit in that case, even at steady
state.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-20 Thread Robert Haas
On Thu, Dec 20, 2012 at 4:58 AM, Simon Riggs si...@2ndquadrant.com wrote:
 ISTM that if someone spots a block that could use cleanup, they mark
 the block as BM_PIN_COUNT_WAITER, but don't set pid. Then when they
 unpin the block they send a signal/queue work for a special cleaning
 process to come in and do the work now that nobody is waiting. Logic
 would allow VACUUMs to go past that by setting the pid. If we
 prioritised cleanup onto blocks that are already dirty we would
 minimise I/O.

I don't favor that particular signaling mechanism, but I agree that
there is quite a bit of potential utility in having foreground
processes notice that work (like a HOT prune, or setting the VM bit)
needs to be done and pass those requests off to a background process.
I'm hoping the new background worker framework in 9.3 will make that
sort of thing easier for people to play around with.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-20 Thread Pavan Deolasee
On Thu, Dec 20, 2012 at 9:23 PM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Dec 19, 2012 at 11:12 PM, Pavan Deolasee


 I'm very reluctant to suggest that we can solve
 this my setting aside another page-level bit to track visibility of
 tuples for heapscans. Or even have a bit in the tuple header itself to
 track this information at that level to avoid repeated visibility
 check for a tuple which is known to be visible to all current and
 future transactions.

 This has been suggested before, as an alternative to freezing tuples.
 It seems to have some potential although I think more thought and work
 is needed to figure out exactly where to go with it.


Ok. Will try to read archives to see what was actually suggested and
why it was put on back burner. BTW at the risk of being shot down
again, I wonder if can we push down the freeze operation to HOT prune
also. A single WAL record can then record this action as well. Also,
it saves us from repeated checks for transaction status flags in
heap_freeze_tuple(). Of course, we do all these only if HOT prune has
work on its on and just try to piggyback.

I wonder if we should add a flag to heap_page_prune and try to do some
additional work if its being called from lazy vacuum such as setting
the VM bit and the tuple freeze. IIRC I had put something like that in
the early patches, but then ripped of for simplicity. May be its time
to play with that again.

In fact, I'd also suggested ripping off the line pointer scan in lazy
vacuum since its preceded by a HOT prune which does bulk of the work
anyways. I remember Tom taking objection to that, but can't remember
why. Will try to read up the old thread again.

 And we expect vacuums to be very less or none. AFAIR in pgbench, it
 now takes hours for accounts table to get chosen for vacuum and we
 should be happy about it. But IOS are almost impossible for pgbench
 kind of workloads today because of our aggressive strategy to clear
 the VM bits.

 IMHO, it's probably fairly hopeless to make a pure pgbench workload
 show a benefit from index-only scans.  A large table under a very
 heavy, completely random write workload is just about the worst
 possible case for index-only scans.  Index-only scans are a way of
 avoiding unnecessary visibility checks when the target data hasn't
 changed recently, not a magic bullet to escape all heap access.  If
 the target data has changed, you're going to have to touch the heap.

Not always. Not clearing the VM bit at HOT update is one such idea we
discussed. Of course, there are open issues with that, but they are
not unsolvable. The advantage of not touching heap is just too big to
ignore.

 And while I agree that we aren't aggressive enough in setting the VM
 bits right now, I also think it wouldn't be too hard to go too far in
 the opposite direction: we could easily spend more effort trying to
 make index-only scans effective than we could ever hope to recoup from
 the scans themselves.


I agree. I also started having that worry. We are at one extreme right
now and it might not help to get to the other extreme. Looks like I'm
coming along the idea of somehow detecting if the scan is happening on
the result relation of a ModifyTable and avoid setting VM bit in that
case.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-20 Thread Tom Lane
Pavan Deolasee pavan.deola...@gmail.com writes:
 Ok. Will try to read archives to see what was actually suggested and
 why it was put on back burner. BTW at the risk of being shot down
 again, I wonder if can we push down the freeze operation to HOT prune
 also.

Seems unlikely to be a win.  We only care about freezing tuples in the
context of being able to advance a relation-wide relfrozenxid horizon.
Freezing pages retail accomplishes nothing whatsoever towards that goal,
unless you have some way to know that no new freeze work will be needed
on the page before the next vacuum freeze happens.  Otherwise, you're
just moving portions of the work from background vacuuming into
foreground processes, with no benefit gained thereby.  In fact, you
might well be *creating* work that would otherwise not have had to be
done at all --- the tuple might get deleted before the next freeze
happens.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-20 Thread Robert Haas
On Thu, Dec 20, 2012 at 11:49 AM, Pavan Deolasee
pavan.deola...@gmail.com wrote:
 I wonder if we should add a flag to heap_page_prune and try to do some
 additional work if its being called from lazy vacuum such as setting
 the VM bit and the tuple freeze. IIRC I had put something like that in
 the early patches, but then ripped of for simplicity. May be its time
 to play with that again.

That seems unlikely to be a good trade-off.  If VACUUM is going to do
extra stuff, it's better to have that in the vacuum-specific code,
rather than in code that is also traversed from other places.
Otherwise the conditional logic might impose a penalty on people who
aren't taking those branches.

 IMHO, it's probably fairly hopeless to make a pure pgbench workload
 show a benefit from index-only scans.  A large table under a very
 heavy, completely random write workload is just about the worst
 possible case for index-only scans.  Index-only scans are a way of
 avoiding unnecessary visibility checks when the target data hasn't
 changed recently, not a magic bullet to escape all heap access.  If
 the target data has changed, you're going to have to touch the heap.

 Not always. Not clearing the VM bit at HOT update is one such idea we
 discussed. Of course, there are open issues with that, but they are
 not unsolvable. The advantage of not touching heap is just too big to
 ignore.

I don't really agree.  Sure, not touching the heap is nice, but mostly
because you avoid pulling pages into shared_buffers that aren't
otherwise needed.  IIRC, an index-only scan isn't faster than an index
scan if all the necessary table and index pages are already cached.
Touching already-resident pages just isn't that expensive.  And of
course, if a page has recently suffered an insert, update, or delete,
it is more likely to be resident.  You can construct access patterns
where this isn't so - e.g. update the page, wait for it to get paged
out, and then SELECT from it with an index-only scan, wait for it to
get paged out again, etc. - but I'm not sure how much of a problem
that is in the real world.

 And while I agree that we aren't aggressive enough in setting the VM
 bits right now, I also think it wouldn't be too hard to go too far in
 the opposite direction: we could easily spend more effort trying to
 make index-only scans effective than we could ever hope to recoup from
 the scans themselves.

 I agree. I also started having that worry. We are at one extreme right
 now and it might not help to get to the other extreme. Looks like I'm
 coming along the idea of somehow detecting if the scan is happening on
 the result relation of a ModifyTable and avoid setting VM bit in that
 case.

It's unclear to me that that's the right way to slice it.  There are
several different sets of concerns here: (1) avoiding setting the
all-visible bit when it'll be cleared again just after, (2) avoiding
slowing down SELECT with hot-pruning, and (3) avoiding slowing down
repeated SELECTs by NOT having the first one do HOT-pruning.  And
maybe others.  The right thing to do depends on which problems you
think are relatively more important.  That question might not even
have one right answer, but even if it does we don't have consensus on
what it is.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-20 Thread Pavan Deolasee
On Thu, Dec 20, 2012 at 10:55 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Pavan Deolasee pavan.deola...@gmail.com writes:
 Ok. Will try to read archives to see what was actually suggested and
 why it was put on back burner. BTW at the risk of being shot down
 again, I wonder if can we push down the freeze operation to HOT prune
 also.

 Seems unlikely to be a win.  We only care about freezing tuples in the
 context of being able to advance a relation-wide relfrozenxid horizon.
 Freezing pages retail accomplishes nothing whatsoever towards that goal,
 unless you have some way to know that no new freeze work will be needed
 on the page before the next vacuum freeze happens. Otherwise, you're
 just moving portions of the work from background vacuuming into
 foreground processes, with no benefit gained thereby.

If we can establish an invariant that a all-visible page is always
fully freezed, then vacuum freeze does not need to look at those pages
again. Another advantage is that we are holding the right lock and
piggyback freeze with cleanup WAL-logging, thus avoiding re-dirtying
of the page and additional WAL logging.

  In fact, you
 might well be *creating* work that would otherwise not have had to be
 done at all --- the tuple might get deleted before the next freeze
 happens.


Yeah, there will be cases where it might not add any value or even add
little overhead. Don't know what will serve better on an average or
majority of the workloads though. Vacuum freeze has known to add
sudden and unexpected load on the system, so I thought this might
mitigate that to a certain extent.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-20 Thread Pavan Deolasee
On Thu, Dec 20, 2012 at 10:59 PM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Dec 20, 2012 at 11:49 AM, Pavan Deolasee
 pavan.deola...@gmail.com wrote:
 I wonder if we should add a flag to heap_page_prune and try to do some
 additional work if its being called from lazy vacuum such as setting
 the VM bit and the tuple freeze. IIRC I had put something like that in
 the early patches, but then ripped of for simplicity. May be its time
 to play with that again.

 That seems unlikely to be a good trade-off.  If VACUUM is going to do
 extra stuff, it's better to have that in the vacuum-specific code,
 rather than in code that is also traversed from other places.
 Otherwise the conditional logic might impose a penalty on people who
 aren't taking those branches.


Thats a call we need to take between code duplication vs customising
execution. We do that all over the code. Not sure if it will be any
different here.


 It's unclear to me that that's the right way to slice it.  There are
 several different sets of concerns here: (1) avoiding setting the
 all-visible bit when it'll be cleared again just after, (2) avoiding
 slowing down SELECT with hot-pruning, and (3) avoiding slowing down
 repeated SELECTs by NOT having the first one do HOT-pruning.  And
 maybe others.  The right thing to do depends on which problems you
 think are relatively more important.  That question might not even
 have one right answer, but even if it does we don't have consensus on
 what it is.

Hmm. We tossed and discussed many interesting ideas in this thread. It
will be sad if none of them go anywhere. When I look at archives, I
see we might have discussed some of these even in the past but never
got an agreement because there always be a workload which may not be
served well by any specific idea. And many a times, they are so
interrelated that we either have to do all or none. Unfortunately,
trying to do all is too-much and too-invasive most often.

May be what we need an official experimental branch where such ideas
can be checked-in and encourage people to try out those branches in
their real world tests or set up dedicated benchmark machines to run
regular tests. Tested and proven ideas can then be merged into the
main trunk. That will be the only way to know efficacy of such ideas.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-20 Thread Tom Lane
Pavan Deolasee pavan.deola...@gmail.com writes:
 On Thu, Dec 20, 2012 at 10:55 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Seems unlikely to be a win.  We only care about freezing tuples in the
 context of being able to advance a relation-wide relfrozenxid horizon.
 Freezing pages retail accomplishes nothing whatsoever towards that goal,
 unless you have some way to know that no new freeze work will be needed
 on the page before the next vacuum freeze happens. Otherwise, you're
 just moving portions of the work from background vacuuming into
 foreground processes, with no benefit gained thereby.

 If we can establish an invariant that a all-visible page is always
 fully freezed, then vacuum freeze does not need to look at those pages
 again.

We're not going to do that, because it would require freezing tuples
immediately after they fall below the RecentGlobalXmin horizon.  This
would be a significant loss of capability from a forensic standpoint,
not to mention breaking existing applications that look at xmin to
determine whether a tuple has recently been updated.  Besides which,
I think it would result in a large increase in the WAL volume emitted
by prune operations (hint bit setting doesn't require WAL, unlike
freezing).  I don't believe for a minute your argument that it would
result in a net reduction in WAL.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-20 Thread Robert Haas
On Thu, Dec 20, 2012 at 1:30 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Pavan Deolasee pavan.deola...@gmail.com writes:
 On Thu, Dec 20, 2012 at 10:55 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Seems unlikely to be a win.  We only care about freezing tuples in the
 context of being able to advance a relation-wide relfrozenxid horizon.
 Freezing pages retail accomplishes nothing whatsoever towards that goal,
 unless you have some way to know that no new freeze work will be needed
 on the page before the next vacuum freeze happens. Otherwise, you're
 just moving portions of the work from background vacuuming into
 foreground processes, with no benefit gained thereby.

 If we can establish an invariant that a all-visible page is always
 fully freezed, then vacuum freeze does not need to look at those pages
 again.

 We're not going to do that, because it would require freezing tuples
 immediately after they fall below the RecentGlobalXmin horizon.  This
 would be a significant loss of capability from a forensic standpoint,
 not to mention breaking existing applications that look at xmin to
 determine whether a tuple has recently been updated.  Besides which,
 I think it would result in a large increase in the WAL volume emitted
 by prune operations (hint bit setting doesn't require WAL, unlike
 freezing).  I don't believe for a minute your argument that it would
 result in a net reduction in WAL.

I don't think the above makes sense, because making a page all-visible
already requires emitting a WAL record.  Pavan didn't say freeze the
page every time we set a hint bit; he said freeze the page every
time it gets marked all-visible.  And that's already WAL-logged.

Now, there is a downside: right now, we play a tricky little game
where we emit a WAL record for setting the visibility map bit, but we
don't actually set the LSN of the heap page.  It's OK because it's
harmless if the PD_ALL_VISIBLE bit makes it to disk and the
visibility-map doesn't, and also because the PD_ALL_VISIBLE bit can be
set without relying on the previous page contents.  But doing anything
more complicated with the same WAL record, like freezing, is likely to
require setting the LSN on the heap page.  And that will result in a
huge increase in WAL traffic when vacuuming an insert-only table.
Whee, crash recovery is fun.

With respect to the forensic problem, we've previously discussed
setting a HEAP_XMIN_FROZEN bit in the tuple header rather than
overwriting the xmin with FrozenXID.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-20 Thread Jeff Janes
On Wednesday, December 19, 2012, Robert Haas wrote:

 On Wed, Dec 19, 2012 at 12:26 PM, Pavan Deolasee
 pavan.deola...@gmail.com javascript:; wrote:

  I would like to run some pgbench tests where we get the system in a
  steady state such as all/most updates are HOT updates (not entirely
  unlikely scenario for many real life cases). And then try running some
  concurrent queries which can be executed via IOS. My gut feel is that,
  today we will see slow and continuous drop in performance for these
  queries because IOS will slowly stop working.

 If there are no vacuums, I agree.


If the table is randomly updated over its entire size, then pretty much
every block will be not-all-visible (and so disqualified from IOS) before
you hit the default 20% vacuum threshold.  I wonder if there ought not be
another vac threshold, based on vm density rather than estimated obsolete
tuple density.

Cheers,

Jeff


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-19 Thread Robert Haas
On Sat, Dec 15, 2012 at 4:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 On the other hand, the HOT prune operation itself is worthless when
 we're running a SELECT.  The only reason we do it that way is that we
 have to prune before the query starts to use the page, else pruning
 might invalidate pointers-to-tuples that are being held within the
 query plan tree.

 Maybe it's time to look at what it'd take for the low-level scan
 operations to know whether they're scanning the target relation of
 an UPDATE query, so that we could skip pruning altogether except
 when a HOT update could conceivably ensue.  I think this was discussed
 back when HOT went in, but nobody wanted to make the patch more invasive
 than it had to be.

I think it's wrong to assume that HOT pruning has no value except in
this case.  Truncating dead tuples to line pointers and collapsing HOT
chains speeds up future page scans, and if we were able to set the
all-visible bit, that would help even more.  The problem is that this
is all somewhat prospective: HOT pruning the page doesn't help the
*current* scan - in fact, it can sometimes slow it down considerably -
but it can be a great help to the next scan that comes through.  We
say, oh, don't worry, VACUUM will take care of it, but there are
plenty of cases where a page can be scanned a very large number of
times before VACUUM comes along; and you do can lose a lot of
performance in those cases.

That having been said, I agree with the concerns expressed elsewhere
in this thread that setting the visibility map bit too aggressively
will be a waste.  If the page is about to get dirtied again we surely
don't want to go there.  Aside from the obvious problem of doing work
that may not be necessary, it figures to create buffer-lock contention
on the visibility map page.  One of the strengths of the current
design is that we avoid that pretty effectively.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-19 Thread Pavan Deolasee
On Wed, Dec 19, 2012 at 8:32 PM, Robert Haas robertmh...@gmail.com wrote:
 On Sat, Dec 15, 2012 at 4:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 On the other hand, the HOT prune operation itself is worthless when
 we're running a SELECT.  The only reason we do it that way is that we
 have to prune before the query starts to use the page, else pruning
 might invalidate pointers-to-tuples that are being held within the
 query plan tree.

 Maybe it's time to look at what it'd take for the low-level scan
 operations to know whether they're scanning the target relation of
 an UPDATE query, so that we could skip pruning altogether except
 when a HOT update could conceivably ensue.  I think this was discussed
 back when HOT went in, but nobody wanted to make the patch more invasive
 than it had to be.

 I think it's wrong to assume that HOT pruning has no value except in
 this case.  Truncating dead tuples to line pointers and collapsing HOT
 chains speeds up future page scans, and if we were able to set the
 all-visible bit, that would help even more.

Good point.

 The problem is that this
 is all somewhat prospective: HOT pruning the page doesn't help the
 *current* scan - in fact, it can sometimes slow it down considerably -
 but it can be a great help to the next scan that comes through.  We
 say, oh, don't worry, VACUUM will take care of it, but there are
 plenty of cases where a page can be scanned a very large number of
 times before VACUUM comes along; and you do can lose a lot of
 performance in those cases.


Also, since we discount for number of tuples pruned by HOT pruning
while tracking number of dead tuples in a table, in a perfectly stable
system, autovacuum may not ever pick the table for vacuuming, slowly
stopping index-only scans from working. Soon we will have a situation
when all VM bits are clear, but autovacuum would fail to pick the
table. Tom had a good suggestion to periodically count vm bits to
choose tables for vacuuming even if there are no dead tuples or dead
line pointers to remove. I'm not sure though if the extra vacuum will
be better than setting the bit after HOT prune. Also, deciding when to
count the bits can be tricky. Do it every vacuum cycle ? Or after
every 5/10 cycles ? I don't have the answer.

 That having been said, I agree with the concerns expressed elsewhere
 in this thread that setting the visibility map bit too aggressively
 will be a waste.  If the page is about to get dirtied again we surely
 don't want to go there.

Yeah, I agree. If we could figure out that we are soon going to UPDATE
a tuple in the page again, it will be worthless to set the bit. But
predicting that also could turn out to be tricky. Even if we could
somehow tell that the scan is happening on the result relation of an
UPDATE operation, not every page may receive updates because of where
quals etc. So we may get lots of false positives.

 Aside from the obvious problem of doing work
 that may not be necessary, it figures to create buffer-lock contention
 on the visibility map page.  One of the strengths of the current
 design is that we avoid that pretty effectively.


Its a valid concern, though my limited pgbench tests did not show any
drop in the number. But thats hardly any proof. We can possibly
mitigate this by conditional update to the VM bit. Do it only if you
get a conditional exclusive lock on the buffer page.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-19 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sat, Dec 15, 2012 at 4:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Maybe it's time to look at what it'd take for the low-level scan
 operations to know whether they're scanning the target relation of
 an UPDATE query, so that we could skip pruning altogether except
 when a HOT update could conceivably ensue.  I think this was discussed
 back when HOT went in, but nobody wanted to make the patch more invasive
 than it had to be.

 I think it's wrong to assume that HOT pruning has no value except in
 this case.  Truncating dead tuples to line pointers and collapsing HOT
 chains speeds up future page scans, and if we were able to set the
 all-visible bit, that would help even more.  The problem is that this
 is all somewhat prospective: HOT pruning the page doesn't help the
 *current* scan - in fact, it can sometimes slow it down considerably -
 but it can be a great help to the next scan that comes through.

Well, no.  The problem with the way we do it now is that doing it every
time a query scan arrives at a page is too often, resulting in a lot of
wasted work.  That wasted work is somewhat tolerable as long as it only
involves looking at the current page and ending up not actually changing
it.  If we start generating a lot of useless WAL activity and I/O as
a result of thrashing the all-visible bit, it won't be so tolerable
anymore.  But the problem is not so much the desire to set the bit as
that we're doing this whole activity at the wrong place and time.

Perhaps doing it every time an UPDATE arrives at the page is too far
in the other direction, and we need to look for some other mechanism
entirely.

I think my core point still stands: the way that HOT pruning is done now
is an artifact of having wanted to shoehorn it into the system with
minimum changes.  Which was reasonable at the time given the
experimental status of the feature, but now it's time to reconsider.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-19 Thread Pavan Deolasee
On Wed, Dec 19, 2012 at 9:51 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  If we start generating a lot of useless WAL activity and I/O as
 a result of thrashing the all-visible bit, it won't be so tolerable
 anymore.

What if we wrap that into the WAL generated by HOT prune itself ?
Would that address your concerns for extra WAL logging ? I also
suggested doing it conditionally to avoid contention on the VM buffer.

(I actually wonder why we WAL-log set operation at all except for HS
to be able to do IOS, but thats a topic for separate thread may be)

Also, if extra WAL-logging is really worrisome, may be we should again
seriously reconsider my idea of *not* clearing the bit at HOT update.
My apologies for repeating myself. But that seems very important in a
steady system when almost every update is a HOT update. So you don't
clear the bit at HOT update and so don't need to set it back either,
thus saving two WAL activity. You definitely don't need any vacuum in
this case if pruning keeps reclaiming dead space at appropriate time
and make it available for the next update. More so, IOS still works
great. Why is this so bad ? I haven't forgotten your complaints about
changed meaning of the bit, but I tried to explain that we can read it
in a slightly different way and still show it as an invariant.


 I think my core point still stands: the way that HOT pruning is done now
 is an artifact of having wanted to shoehorn it into the system with
 minimum changes.  Which was reasonable at the time given the
 experimental status of the feature, but now it's time to reconsider.


ISTM that you already have concret ideas about what are those places
where HOT prune would be more effective. My worry is changing anything
there is going to be a lot trickier and will require heavy testing.
Our initial work has served us well so far. Of course, I've no problem
changing that if its going to benefit users.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-19 Thread Robert Haas
On Wed, Dec 19, 2012 at 11:49 AM, Pavan Deolasee
pavan.deola...@gmail.com wrote:
 Also, if extra WAL-logging is really worrisome, may be we should again
 seriously reconsider my idea of *not* clearing the bit at HOT update.
 My apologies for repeating myself. But that seems very important in a
 steady system when almost every update is a HOT update. So you don't
 clear the bit at HOT update and so don't need to set it back either,
 thus saving two WAL activity. You definitely don't need any vacuum in
 this case if pruning keeps reclaiming dead space at appropriate time
 and make it available for the next update. More so, IOS still works
 great. Why is this so bad ?

It's bad because then sequential scans will return wrong answers,
unless we also rip out the optimization that uses PD_ALL_VISIBLE as an
excuse to skip all visibility checks for the page.  That optimization
is worth a significant amount of performance.

It's also bad because then vacuum won't visit the page, and it really
should.  It's much better to have vacuum prune the page in the
background than to have some query do it in the foreground, although
the latter is still better than not doing it at all.

We could potentially have two or three bits per page to suit these
different needs: (1) page can benefit from a vacuum, (2) page is safe
for IOS purposes, and (3) page is safe for seqscan purposes.  But I
think that might be overengineering.

IMHO, the goal here should be to have some method of setting the
visibility map, in some set of circumstances, outside of vacuum.
Figuring out which set of circumstances is appropriate is the hard
part.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-19 Thread Pavan Deolasee
On Wed, Dec 19, 2012 at 10:40 PM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Dec 19, 2012 at 11:49 AM, Pavan Deolasee
 pavan.deola...@gmail.com wrote:
 Also, if extra WAL-logging is really worrisome, may be we should again
 seriously reconsider my idea of *not* clearing the bit at HOT update.
 My apologies for repeating myself. But that seems very important in a
 steady system when almost every update is a HOT update. So you don't
 clear the bit at HOT update and so don't need to set it back either,
 thus saving two WAL activity. You definitely don't need any vacuum in
 this case if pruning keeps reclaiming dead space at appropriate time
 and make it available for the next update. More so, IOS still works
 great. Why is this so bad ?

 It's bad because then sequential scans will return wrong answers,
 unless we also rip out the optimization that uses PD_ALL_VISIBLE as an
 excuse to skip all visibility checks for the page.  That optimization
 is worth a significant amount of performance.


This can be handled by breaking 1-to-1 mapping on VM bit and
PD_ALL_VISIBLE bit. So seq scans will only look at PD_ALL_VISIBLE. It
was proposed by Andres up thread, but shot down by Tom and Simon. But
I still feel that was over reaction and there is a lot of merit in the
idea. As I said elsewhere, it will also help the case when there are
DEAD line pointers in a page. Today we can't mark such pages
all-visible, but if we break this mapping, we can do that.

I would like to run some pgbench tests where we get the system in a
steady state such as all/most updates are HOT updates (not entirely
unlikely scenario for many real life cases). And then try running some
concurrent queries which can be executed via IOS. My gut feel is that,
today we will see slow and continuous drop in performance for these
queries because IOS will slowly stop working.

 It's also bad because then vacuum won't visit the page, and it really
 should.  It's much better to have vacuum prune the page in the
 background than to have some query do it in the foreground, although
 the latter is still better than not doing it at all.


Hmm. This is a good point and I don't have an easy answer. I'm not
sure how this will pan out in real life cases though. We definitely
made great progress by having HOT, though the same concerns were
raised even then that we are moving work from background to
foreground. But I think generally HOT made great difference to the
system as a whole, may be at a cost of slowdown for some read-only,
select queries. And HOT prune is not the only operation that we do in
foreground. We also set hint bits and make buffers dirty in an
otherwise read-only queries.


 IMHO, the goal here should be to have some method of setting the
 visibility map, in some set of circumstances, outside of vacuum.
 Figuring out which set of circumstances is appropriate is the hard
 part.


Yeah, if we can figure that out conclusively, I'm sure we might be
able to auto-tune the system even further.

Thanks,
Pavan
-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-19 Thread Simon Riggs
On 19 December 2012 16:21, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Sat, Dec 15, 2012 at 4:48 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Maybe it's time to look at what it'd take for the low-level scan
 operations to know whether they're scanning the target relation of
 an UPDATE query, so that we could skip pruning altogether except
 when a HOT update could conceivably ensue.  I think this was discussed
 back when HOT went in, but nobody wanted to make the patch more invasive
 than it had to be.

 I think it's wrong to assume that HOT pruning has no value except in
 this case.  Truncating dead tuples to line pointers and collapsing HOT
 chains speeds up future page scans, and if we were able to set the
 all-visible bit, that would help even more.  The problem is that this
 is all somewhat prospective: HOT pruning the page doesn't help the
 *current* scan - in fact, it can sometimes slow it down considerably -
 but it can be a great help to the next scan that comes through.

 Well, no.  The problem with the way we do it now is that doing it every
 time a query scan arrives at a page is too often, resulting in a lot of
 wasted work.  That wasted work is somewhat tolerable as long as it only
 involves looking at the current page and ending up not actually changing
 it.  If we start generating a lot of useless WAL activity and I/O as
 a result of thrashing the all-visible bit, it won't be so tolerable
 anymore.  But the problem is not so much the desire to set the bit as
 that we're doing this whole activity at the wrong place and time.

 Perhaps doing it every time an UPDATE arrives at the page is too far
 in the other direction, and we need to look for some other mechanism
 entirely.

The benefit of saying that only UPDATEs clean the block is that this
penalises only the workload making the mess, rather than everybody
cleaning up repeatedly over one messy guy.

Having a random SELECT clean the block causes both delay in
non-UPDATEing process, contention and additional writes.

We definitely know we write too often; this has been measured and
discussed over a period of years.

It would be useful to have a table-level option of hot_cleanup= SELECT
| UPDATE | NONE to allow people to minimise cleanup and test the
difference this makes.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-19 Thread Tom Lane
Pavan Deolasee pavan.deola...@gmail.com writes:
 On Wed, Dec 19, 2012 at 9:51 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 If we start generating a lot of useless WAL activity and I/O as
 a result of thrashing the all-visible bit, it won't be so tolerable
 anymore.

 What if we wrap that into the WAL generated by HOT prune itself ?

What WAL?  The case we're worried about here is that there's nothing
else for HOT prune to do.

 I think my core point still stands: the way that HOT pruning is done now
 is an artifact of having wanted to shoehorn it into the system with
 minimum changes.  Which was reasonable at the time given the
 experimental status of the feature, but now it's time to reconsider.

 ISTM that you already have concret ideas about what are those places
 where HOT prune would be more effective.

No, I don't; I'm just suggesting that we ought to think outside the box
of the way it's being done now.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-19 Thread Simon Riggs
On 19 December 2012 17:26, Pavan Deolasee pavan.deola...@gmail.com wrote:

 We definitely
 made great progress by having HOT

Yes, definitely. Great work. That is not for debate.

 But I think generally HOT made great difference to the
 system as a whole, may be at a cost of slowdown for some read-only,
 select queries. And HOT prune is not the only operation that we do in
 foreground. We also set hint bits and make buffers dirty in an
 otherwise read-only queries.

And those last things are being debated hotly. We definitely need to
ask whether the way things are now can be tweaked to be better. The
major mechanics need not be reviewed, but the tradeoffs and balances?
Definitely.

Anything we do in foreground needs evaluation. Assuming eager actions
give a good payoff is not always a useful thought.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-19 Thread Pavan Deolasee
On Thu, Dec 20, 2012 at 12:22 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Pavan Deolasee pavan.deola...@gmail.com writes:


 What if we wrap that into the WAL generated by HOT prune itself ?

 What WAL?  The case we're worried about here is that there's nothing
 else for HOT prune to do.


Does such a case exist ? Or at least, is it that common ? I mean, we
have enough checks in place to ensure that HOT prune is attempted only
when there is something interesting in the page to be done. Otherwise
we don't even attempt getting a cleanup lock on the page. Of course,
they are just hints, but they serve very well. Prune XID is what I've
in mind in particular.

So the fact that the visibility map bit is cleared, it could be
because either at least one tuple in the page was updated, deleted or
inserted. The first two would have set prune XID and will trigger an
HOT action and HOT prune will indeed do something useful. I think
aborted non-HOT update may create a scenario that you're talking about
i.e. HOT prune will have nothing to do, but the page again turned
all-visible. Being an abort path, I wonder if its really that common
though.

That leaves us with the inserts which will clear the VM bit, but may
not have anything for HOT prune to do. But we don't set prune XID for
inserts either. So we won't get into hot_page_prune() for such pages.

So my point is, for fairly large and common cases, often we will set
the bit only when HOT prune did something useful, though not every
useful HOT prune will necessarily set the bit. And even if we slip
through all the safety nets on HOT prune, we can choose to set the bit
only if HOT did something useful to avoid any extra WAL logging
assuming we are still worried about those corner cases.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-19 Thread Pavan Deolasee
On Thu, Dec 20, 2012 at 5:05 AM, Simon Riggs si...@2ndquadrant.com wrote:


 And those last things are being debated hotly. We definitely need to
 ask whether the way things are now can be tweaked to be better. The
 major mechanics need not be reviewed, but the tradeoffs and balances?
 Definitely.


I have zero objection to do that, just that I don't have solid ideas
right now. And its not because I haven't thought hard enough.

 Anything we do in foreground needs evaluation. Assuming eager actions
 give a good payoff is not always a useful thought.


I don't disagree. Your field experience is much larger than mine, but
I have spent hours testing PostgreSQL's performance, so can talk with
some degree of conviction. I think when we do things that can reduce
read/write IO or bloat of a large table in general, the system as a
whole benefits, may be at a cost of some genuinely good guy doing a
simple SELECT in this case. Often the SELECTs are also benefited
because one of their good siblings helped us reduce bloat of the table
and hence seq scans had to scan order of magnitude less blocks.

I just thought that we can fairly easily limit the damage if we are
really worried about SELECTs being penalised. What if we set a
configurable limit on *extra* things that a query may do which is
otherwise not very useful for the query itself, but is useful to keep
the system healthy and steady. HOT prune definitely counts as one of
them and may be even setting of hint bits. (This is a topic for a
separate thread though)

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-19 Thread Robert Haas
On Wed, Dec 19, 2012 at 12:26 PM, Pavan Deolasee
pavan.deola...@gmail.com wrote:
 This can be handled by breaking 1-to-1 mapping on VM bit and
 PD_ALL_VISIBLE bit. So seq scans will only look at PD_ALL_VISIBLE. It
 was proposed by Andres up thread, but shot down by Tom and Simon. But
 I still feel that was over reaction and there is a lot of merit in the
 idea. As I said elsewhere, it will also help the case when there are
 DEAD line pointers in a page. Today we can't mark such pages
 all-visible, but if we break this mapping, we can do that.

Sure, but you're zipping rather blithely past the disadvantages of
such an approach.  Jeff Davis recently proposed getting rid of
PD_ALL_VISIBLE, and Tom and I both expressed considerable skepticism
about that; this proposal has the same problems.  One of the major
benefits of PD_ALL_VISIBLE is that, when it isn't set, inserts,
updates, and deletes to the page can ignore the visibility map.  That
means that a server under heavy concurrency is much less likely to
encounter contention on the visibility map blocks.  Now, maybe that's
not really a problem, but I sure haven't seen enough evidence to make
me believe it.  If it's really true that PD_ALL_VISIBLE needn't fill
this role, then Heikki wasted an awful lot of time implementing it,
and I wasted an awful lot of time keeping it working when I made the
visibility map crash-safe for IOS.  That could be true, but I tend to
think it isn't.

 I would like to run some pgbench tests where we get the system in a
 steady state such as all/most updates are HOT updates (not entirely
 unlikely scenario for many real life cases). And then try running some
 concurrent queries which can be executed via IOS. My gut feel is that,
 today we will see slow and continuous drop in performance for these
 queries because IOS will slowly stop working.

If there are no vacuums, I agree.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-19 Thread Robert Haas
On Wed, Dec 19, 2012 at 12:39 PM, Simon Riggs si...@2ndquadrant.com wrote:
 The benefit of saying that only UPDATEs clean the block is that this
 penalises only the workload making the mess, rather than everybody
 cleaning up repeatedly over one messy guy.

Right, but there are plenty of situations where having everybody clean
up after the messy guy is better than waiting around and hoping that
Mom (aka vacuum) will do it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-19 Thread Pavan Deolasee
On Thu, Dec 20, 2012 at 6:12 AM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Dec 19, 2012 at 12:26 PM, Pavan Deolasee
 pavan.deola...@gmail.com wrote:
 This can be handled by breaking 1-to-1 mapping on VM bit and
 PD_ALL_VISIBLE bit. So seq scans will only look at PD_ALL_VISIBLE. It
 was proposed by Andres up thread, but shot down by Tom and Simon. But
 I still feel that was over reaction and there is a lot of merit in the
 idea. As I said elsewhere, it will also help the case when there are
 DEAD line pointers in a page. Today we can't mark such pages
 all-visible, but if we break this mapping, we can do that.

 Sure, but you're zipping rather blithely past the disadvantages of
 such an approach.

Hmm. You're right. I did not think about the disadvantages and now
that you mention them, I feel they are important.

  Jeff Davis recently proposed getting rid of
 PD_ALL_VISIBLE, and Tom and I both expressed considerable skepticism
 about that; this proposal has the same problems.  One of the major
 benefits of PD_ALL_VISIBLE is that, when it isn't set, inserts,
 updates, and deletes to the page can ignore the visibility map.  That
 means that a server under heavy concurrency is much less likely to
 encounter contention on the visibility map blocks.  Now, maybe that's
 not really a problem, but I sure haven't seen enough evidence to make
 me believe it.  If it's really true that PD_ALL_VISIBLE needn't fill
 this role, then Heikki wasted an awful lot of time implementing it,
 and I wasted an awful lot of time keeping it working when I made the
 visibility map crash-safe for IOS.  That could be true, but I tend to
 think it isn't.


Yeah, VM buffer contention can become prominent if we break the
invariant that page level bit status implies the vm bit status, at
least when its clear.OTOH IMHO we need some mechanism to address the
issue of aggressive clearing of the VM bits, but a very lame
corresponding set operation. Today we don't have much contention on
the VM page, but we must be sacrificing its usability in return. IOS
as well as vacuum optimizations using VMs will turn out not so useful
for many workloads. I'm very reluctant to suggest that we can solve
this my setting aside another page-level bit to track visibility of
tuples for heapscans. Or even have a bit in the tuple header itself to
track this information at that level to avoid repeated visibility
check for a tuple which is known to be visible to all current and
future transactions.

 I would like to run some pgbench tests where we get the system in a
 steady state such as all/most updates are HOT updates (not entirely
 unlikely scenario for many real life cases). And then try running some
 concurrent queries which can be executed via IOS. My gut feel is that,
 today we will see slow and continuous drop in performance for these
 queries because IOS will slowly stop working.

 If there are no vacuums, I agree.


And we expect vacuums to be very less or none. AFAIR in pgbench, it
now takes hours for accounts table to get chosen for vacuum and we
should be happy about it. But IOS are almost impossible for pgbench
kind of workloads today because of our aggressive strategy to clear
the VM bits.

Thanks,
Pavan




-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-19 Thread Amit Kapila
On Thursday, December 20, 2012 6:14 AM Robert Haas wrote:
 On Wed, Dec 19, 2012 at 12:39 PM, Simon Riggs si...@2ndquadrant.com
 wrote:
  The benefit of saying that only UPDATEs clean the block is that this
  penalises only the workload making the mess, rather than everybody
  cleaning up repeatedly over one messy guy.
 
 Right, but there are plenty of situations where having everybody clean
 up after the messy guy is better than waiting around and hoping that
 Mom (aka vacuum) will do it.

If we see for similar situation in index, during index scan, it just marks
the tuple as DEAD without taking X lock and then during split (when it
already has X lock) it free's the actual space. 
So not sure if it's good idea to take X lock for cleanup during heap scan,
where write operation's happens more frequently and have better chance of
cleanup.

With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-18 Thread Amit Kapila
On Sunday, December 16, 2012 11:14 PM Tom Lane wrote:
 Pavan Deolasee pavan.deola...@gmail.com writes:
  On Sun, Dec 16, 2012 at 3:10 PM, Simon Riggs si...@2ndquadrant.com
 wrote:
  As explained above, I disagree that it looks like a good idea, and
  you've shown no evidence it would be or is true.
 
  Lets separate out these two issues. What you are suggesting as a
  follow up to Tom's idea, I've no objection to that and that might be
  worthwhile optimisation to try out. But this patch itself does not
  attempt to deal with that and its a separate work item and will
  require invasive changes and tests.
 


 Another thing that would need to be considered, if we do want to
 restrict when pruning happens, is whether it is worth introducing some
 other path altogether for setting the all-visible bit.  Or perhaps we
 should modify autovacuum's rules so that it will fire on relations for
 which there might be lots of unmarked all-visible pages.

Can something similar be also used for putting deleted index pages into FSM.

The reason is that currently deleted index pages are recorded in FSM in the
next Vacuum cycle. 
So if after bulk index update (always increasing order), even if the auto
vacuum is done once, it still does not put
deleted index pages into FSM. 
Now let's assume there are no operations which can lead to auto-vacuum on
same table, next cycle of bulk update will allocate
new index pages. 
I had observed this in one of the tests that if bulk index update happens
such that new value is always increasing, then index bloat happens. As per
initial analysis, it seems one of the reasons is what I described above.
If required, I can create a self-containing test which can show that
bulk-index update can lead to index bloat.


With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-17 Thread Pavan Deolasee
On Sun, Dec 16, 2012 at 11:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  And I agree with him that your proposed
 redefinition of the bit's meaning to avoid that is pretty horrid;
 it's ugly, complicates the invariant quite a lot, and breaks some
 existing usages of the bit.

(slammed.. feels the pain)  You definitely have better intuitions and
understanding of the system than me and I've to be really shameful to
defend my idea which was out of the hat anyways, but let me try
nevertheless at the risk of being slammed again :-)

So whats the current meaning of the VM bit ? If set, it tells us that
all tuples in the page are visible to all current and future
transactions. But thats not entirely true. Otherwise why wouldn't we
mark a page all-visible if it contains a DEAD line pointer ? Of
course, we don't do that to support index only scans and make sure
that vacuum picks up such page. So would I be too wrong if I take
liberty to read VM bit as: if set, every index pointer to the heap
page is for sure pointing to *only* all-visible tuple. I also consider
vacuum as an activity that is needed because we can't clean up dead
line pointers in the heap without first removing the index pointers.
So visibility maps go hand-in-hand with indexes. In fact, if a table
does not have any index, I am not sure if we even need a VM for that
table, especially if we can teach HOT prune to record free space with
FSM.

My half-cooked idea extends that and says: if VM bit is set, every
index pointer to the heap page is pointing to either a all-visible
tuple or a valid HOT-chain that has all-visible tuple at the end
(aborted HOT tuples at the end are not counted as part of a valid
chain). OTOH the page-level bit is set if every tuple in the heap page
is all-visible. DEAD line pointers are not counted while arriving at
page level bit. So in this new scheme of things, there is a loose
correspondence between these two. VM bit is useful for index-only
scans and vacuum optimisations while page-level bit has limited use of
optimising heap scans. Andres probably said the same thing. Jeff Davis
is already talking about removing the page-level bit on other threads,
so I don't see much problem breaking a 1-to-1 mapping with the VM bit
and the page-level bit.

Having said all of this, I am not insisting on this unless we see a
value in it. And there could be when you reach a steady state so that
almost every update is a HOT update. Today you will still need
periodic vacuums just so that index-only scans work. Of course, you
talked about other code paths to set all-visible bits.


 If we decide that we don't want to restrict pruning like that, then
 this patch probably has merit.  But we can't evaluate the two issues
 independently.


Fair enough. If someone starts actively working on moving HOT prune
logic to the UPDATE path, I'll withdraw this patch. But any change to
the HOT prune logic will require extensive and long duration testing
to prove its value. After all, we spent hours testing it when we wrote
it and had seen how small changes can cause drop in performance.

If nobody is volunteering to do this change at the moment, I feel that
the patch has value, as you yourself noted, and should be considered
on its merits, as things stand today and not based on things that we
might do in future.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-16 Thread Simon Riggs
On 16 December 2012 07:53, Pavan Deolasee pavan.deola...@gmail.com wrote:
 On Sun, Dec 16, 2012 at 3:18 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 Doing that only makes sense when we're running a SELECT. Setting the
 all visible bit immediately prior to an UPDATE that clears it again is
 pointless effort, generating extra work for no reason.

 On the other hand, the HOT prune operation itself is worthless when
 we're running a SELECT.  The only reason we do it that way is that we
 have to prune before the query starts to use the page, else pruning
 might invalidate pointers-to-tuples that are being held within the
 query plan tree.


 Right. HOT prune may and often would be called in the SELECT path or
 heap/index scan leading to UPDATE/DELETE. But whenever its called, it
 looks like a good idea to set the visibility map bit.

As explained above, I disagree that it looks like a good idea, and
you've shown no evidence it would be or is true.

I agree with Tom that cleaning on SELECT can be worthless and I'd
definitely like to be able to turn it off conditionally or
permanently.

 There is some
 additional overhad to check if a  LIVE tuple is all-visible or not,
 but that doesn't look too much. I did run some pgbench tests for fully
 cached tables and did not see any difference in tps.

 Another idea could have been to NOT clear the visibility bit when a
 HOT update happens. Such tuple can get pruned by HOT prune, so we
 don't need vacuum per se, and the index-only scans are not affected
 because the update was a HOT update, so the index keys did not change
 either. So index-only scans would continue to return the same result.
 Don't know if this would work with hot standby, probably not.

 Maybe it's time to look at what it'd take for the low-level scan
 operations to know whether they're scanning the target relation of
 an UPDATE query, so that we could skip pruning altogether except
 when a HOT update could conceivably ensue.

 Yeah, we could do that. We may not be able to predict whether the
 coming update is HOT or not, but I don't think that matters.

We can predict that an update is HOT in advance, if none of the index
columns are touched in the UPDATE. Yes, there are some cases where it
might not be, but we could probably store that in the statement cache.

Making that checkat run time must cost some block contention, so it
would be good to remove it from every update.

 OTOH its
 probably not too bad to prune in any scan (like we do today) because
 there is fairly high chance that the page will be dirtied for hint bit
 updates. So may be its better to just prune as well. We have already
 put in several optimisations to do so only when required and without
 any unnecessary contention. Of course, benchmarks can prove me wrong.

I think we could use some measurement/stats there so we can check.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-16 Thread Andres Freund
On 2012-12-16 13:23:56 +0530, Pavan Deolasee wrote:
 Another idea could have been to NOT clear the visibility bit when a
 HOT update happens. Such tuple can get pruned by HOT prune, so we
 don't need vacuum per se, and the index-only scans are not affected
 because the update was a HOT update, so the index keys did not change
 either. So index-only scans would continue to return the same result.
 Don't know if this would work with hot standby, probably not.

For IOSs that sounds like an interesting and itself easy to implement
idea, you basically only would need to add a single !use_hot_update in
the if blocks doing the PageClearAllVisible in heap_update.
This probably could make IOSs far more likely in some scenarios.

The complicated bit seems to be the heapgetpage() logic arround
all_visible, because HOT updates are obviously relevant in normal heap
scans. It seems to me that would require the vm bit continuing to be set
while the page level bit get unset.
I *think* thats actually ok because whenever we set/clear the
visibilitymap we will still log it properly, so the crash safety
guarantees seem to hold true. Obviously we would have to change the
escape hatch for exactly that condition in vacuumlazy, but thats not a
problem...

I don't immediately see a problem with HS, your logic seems to hold
equally true there. The replay logic would need to be refined slightly,
but it looks possible. That is without having checked the code...

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-16 Thread Andres Freund
On 2012-12-15 16:48:08 -0500, Tom Lane wrote:
 Simon Riggs si...@2ndquadrant.com writes:
  Doing that only makes sense when we're running a SELECT. Setting the
  all visible bit immediately prior to an UPDATE that clears it again is
  pointless effort, generating extra work for no reason.

 On the other hand, the HOT prune operation itself is worthless when
 we're running a SELECT.  The only reason we do it that way is that we
 have to prune before the query starts to use the page, else pruning
 might invalidate pointers-to-tuples that are being held within the
 query plan tree.

 Maybe it's time to look at what it'd take for the low-level scan
 operations to know whether they're scanning the target relation of
 an UPDATE query, so that we could skip pruning altogether except
 when a HOT update could conceivably ensue.  I think this was discussed
 back when HOT went in, but nobody wanted to make the patch more invasive
 than it had to be.

FWIW I think that would be a pretty worthwile optimization - I have seen
workloads where hot pruning lead to considerable contention.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-16 Thread Pavan Deolasee
On Sun, Dec 16, 2012 at 3:10 PM, Simon Riggs si...@2ndquadrant.com wrote:


 As explained above, I disagree that it looks like a good idea, and
 you've shown no evidence it would be or is true.


Lets separate out these two issues. What you are suggesting as a
follow up to Tom's idea, I've no objection to that and that might be
worthwhile optimisation to try out. But this patch itself does not
attempt to deal with that and its a separate work item and will
require invasive changes and tests.

*Whenever* we HOT prune, either in SELECT path or UPDATE path, what
I'm suggesting is lets try to set the visibility map bit if the
conditions are favorable. The only extra work that we are doing (as in
the submitted patch) is to check few additional things for LIVE tuples
such as if xmin precedes the OldestXmin or not. That itself does not
seem too costly. What we gain is: 1. next vacuum may skip that page
because its marked all-visible and 2. index-only scan will not visit
the heap page. These two improvements will avoid an useless heap page
IO and may justify a little more work in HOT prune.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-16 Thread Simon Riggs
On 16 December 2012 14:41, Andres Freund and...@2ndquadrant.com wrote:
 On 2012-12-16 13:23:56 +0530, Pavan Deolasee wrote:
 Another idea could have been to NOT clear the visibility bit when a
 HOT update happens. Such tuple can get pruned by HOT prune, so we
 don't need vacuum per se, and the index-only scans are not affected
 because the update was a HOT update, so the index keys did not change
 either. So index-only scans would continue to return the same result.
 Don't know if this would work with hot standby, probably not.

 For IOSs that sounds like an interesting and itself easy to implement
 idea, you basically only would need to add a single !use_hot_update in
 the if blocks doing the PageClearAllVisible in heap_update.
 This probably could make IOSs far more likely in some scenarios.

Doing that would completely change the meaning of the visibility map
from a heap visibility map into an index-only map.

IndexOnly scans would still work, but nothing else would ever and it
would be hard to confirm the validity of the vm.

-1

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-16 Thread Andres Freund
On 2012-12-16 16:25:03 +, Simon Riggs wrote:
 On 16 December 2012 14:41, Andres Freund and...@2ndquadrant.com wrote:
  On 2012-12-16 13:23:56 +0530, Pavan Deolasee wrote:
  Another idea could have been to NOT clear the visibility bit when a
  HOT update happens. Such tuple can get pruned by HOT prune, so we
  don't need vacuum per se, and the index-only scans are not affected
  because the update was a HOT update, so the index keys did not change
  either. So index-only scans would continue to return the same result.
  Don't know if this would work with hot standby, probably not.
 
  For IOSs that sounds like an interesting and itself easy to implement
  idea, you basically only would need to add a single !use_hot_update in
  the if blocks doing the PageClearAllVisible in heap_update.
  This probably could make IOSs far more likely in some scenarios.

 Doing that would completely change the meaning of the visibility map
 from a heap visibility map into an index-only map.

 IndexOnly scans would still work, but nothing else would ever and it
 would be hard to confirm the validity of the vm.

I don't think it would change the meaning that much - the visibilitymap
would still work for vacuum as normal heap updates would still unset the
all-visible flag. Vacuum would skip pages that already were all-visible
and then only got hot updated, true, but that seems like its an
acceptable tradeoff as that dead space can be fully cleaned up by hot
pruning.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-16 Thread Tom Lane
Pavan Deolasee pavan.deola...@gmail.com writes:
 On Sun, Dec 16, 2012 at 3:10 PM, Simon Riggs si...@2ndquadrant.com wrote:
 As explained above, I disagree that it looks like a good idea, and
 you've shown no evidence it would be or is true.

 Lets separate out these two issues. What you are suggesting as a
 follow up to Tom's idea, I've no objection to that and that might be
 worthwhile optimisation to try out. But this patch itself does not
 attempt to deal with that and its a separate work item and will
 require invasive changes and tests.

 *Whenever* we HOT prune, either in SELECT path or UPDATE path, what
 I'm suggesting is lets try to set the visibility map bit if the
 conditions are favorable.

I don't believe it's clear at all that this is a good idea.  If we
restrict pruning to occur only when there's a fairly good chance of
an ensuing HOT update, then Simon's original objection (that we're
probably going to have to clear the bit again right away) has
considerable force.  And I agree with him that your proposed
redefinition of the bit's meaning to avoid that is pretty horrid;
it's ugly, complicates the invariant quite a lot, and breaks some
existing usages of the bit.

If we decide that we don't want to restrict pruning like that, then
this patch probably has merit.  But we can't evaluate the two issues
independently.

Another thing that would need to be considered, if we do want to
restrict when pruning happens, is whether it is worth introducing some
other path altogether for setting the all-visible bit.  Or perhaps we
should modify autovacuum's rules so that it will fire on relations for
which there might be lots of unmarked all-visible pages.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-15 Thread Simon Riggs
On 15 December 2012 12:42, Pavan Deolasee pavan.deola...@gmail.com wrote:
 We discussed this idea in the past [1] and Robert recently again
 mentioned this in another thread [2]. Please see a rebased/revised
 patch attached with the mail. This is mostly similar to what I's
 submitted in [1] except some additions to also compute visibility
 cut-off XID. I also removed a warning that I'd added to the previous
 patch to report the case when the page's all-visible bit is already
 set, but our HOT prune scan finds it otherwise. This is not to hide
 the warning the earlier reviewer had reported, but I think its not
 required because we do those consistency checks at other places
 anyways.

Doing that only makes sense when we're running a SELECT. Setting the
all visible bit immediately prior to an UPDATE that clears it again is
pointless effort, generating extra work for no reason.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-15 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 Doing that only makes sense when we're running a SELECT. Setting the
 all visible bit immediately prior to an UPDATE that clears it again is
 pointless effort, generating extra work for no reason.

On the other hand, the HOT prune operation itself is worthless when
we're running a SELECT.  The only reason we do it that way is that we
have to prune before the query starts to use the page, else pruning
might invalidate pointers-to-tuples that are being held within the
query plan tree.

Maybe it's time to look at what it'd take for the low-level scan
operations to know whether they're scanning the target relation of
an UPDATE query, so that we could skip pruning altogether except
when a HOT update could conceivably ensue.  I think this was discussed
back when HOT went in, but nobody wanted to make the patch more invasive
than it had to be.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set visibility map bit after HOT prune

2012-12-15 Thread Pavan Deolasee
On Sun, Dec 16, 2012 at 3:18 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 Doing that only makes sense when we're running a SELECT. Setting the
 all visible bit immediately prior to an UPDATE that clears it again is
 pointless effort, generating extra work for no reason.

 On the other hand, the HOT prune operation itself is worthless when
 we're running a SELECT.  The only reason we do it that way is that we
 have to prune before the query starts to use the page, else pruning
 might invalidate pointers-to-tuples that are being held within the
 query plan tree.


Right. HOT prune may and often would be called in the SELECT path or
heap/index scan leading to UPDATE/DELETE. But whenever its called, it
looks like a good idea to set the visibility map bit. There is some
additional overhad to check if a  LIVE tuple is all-visible or not,
but that doesn't look too much. I did run some pgbench tests for fully
cached tables and did not see any difference in tps.

Another idea could have been to NOT clear the visibility bit when a
HOT update happens. Such tuple can get pruned by HOT prune, so we
don't need vacuum per se, and the index-only scans are not affected
because the update was a HOT update, so the index keys did not change
either. So index-only scans would continue to return the same result.
Don't know if this would work with hot standby, probably not.

 Maybe it's time to look at what it'd take for the low-level scan
 operations to know whether they're scanning the target relation of
 an UPDATE query, so that we could skip pruning altogether except
 when a HOT update could conceivably ensue.

Yeah, we could do that. We may not be able to predict whether the
coming update is HOT or not, but I don't think that matters. OTOH its
probably not too bad to prune in any scan (like we do today) because
there is fairly high chance that the page will be dirtied for hint bit
updates. So may be its better to just prune as well. We have already
put in several optimisations to do so only when required and without
any unnecessary contention. Of course, benchmarks can prove me wrong.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers