Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-09-01 Thread Andres Freund
Hi Simon,

On 2015-04-20 19:21:24 +0100, Simon Riggs wrote:
> Since we have many votes in favour of change in this area I'll post a new
> version and look for an early review/commit for next release.

If I see correctly there's been no new patch version since, right? The
patch is included in the current commitfest as needing review
nonetheless - that's possibly because it wasn't possible to "return with
feedback" for a while?

I marked it as RWF for now, if I missed something we can change it back.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-23 Thread Jim Nasby

On 4/23/15 8:25 AM, Robert Haas wrote:

Some users are partitioning tables just so that each
partition can be autovac'd separately.  That really shouldn't be
required.


Are they doing this for improved heap scan performance? Index scan 
performance? If the table wasn't partitioned, would they need more than 
one pass through the indexes due to exhausting maintenance_work_mem?


There's probably some fairly low-hanging fruit for parallelizing vacuum, 
but it really depends on what problems people are running into.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-23 Thread Andres Freund
On 2015-04-23 15:40:36 -0400, Robert Haas wrote:
 The issue is that you have to vacuum a table frequently enough to
 avoid accumulating bloat.  The frequency with which you need to vacuum
 varies depending on the size of the table and how frequently it's
 updated.  However, a large, heavily-updated table can take long enough
 to vacuum that, by the time you get done, it's already overdue to be
 vacuumed again.  That's a problem.

Especially because the indexes are scanned fully. In many cases I've
observed the heap scans themselves being fast; but scanning hundreds
(yes) of gigabytes of indexes taking ages.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-23 Thread Robert Haas
On Thu, Apr 23, 2015 at 10:44 AM, Jim Nasby jim.na...@bluetreble.com wrote:
 On 4/23/15 8:25 AM, Robert Haas wrote:
 Some users are partitioning tables just so that each
 partition can be autovac'd separately.  That really shouldn't be
 required.

 Are they doing this for improved heap scan performance? Index scan
 performance? If the table wasn't partitioned, would they need more than one
 pass through the indexes due to exhausting maintenance_work_mem?

I don't know of anyone with a properly-configured system who needs
more than one pass through the indexes due to exhausting
maintenance_work_mem.  The issue is that you have to vacuum a table
frequently enough to avoid accumulating bloat.  The frequency with
which you need to vacuum varies depending on the size of the table and
how frequently it's updated.  However, a large, heavily-updated table
can take long enough to vacuum that, by the time you get done, it's
already overdue to be vacuumed again.  That's a problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-23 Thread Robert Haas
On Wed, Apr 22, 2015 at 5:17 PM, Bruce Momjian br...@momjian.us wrote:
 On Wed, Apr 22, 2015 at 06:07:00PM -0300, Alvaro Herrera wrote:
  Good point, but doesn't vacuum remove the need for pruning as it removes
  all the old rows?

 Sure.  The point, I think, is to make autovacuum runs of some sort that
 don't actually vacuum but only do HOT-pruning.  Maybe this is a
 reasonable solution to the problem that queries don't prune anymore
 after Simon's patch.  If we made autovac HOT-prune periodically, we
 could have read-only queries prune only already-dirty pages.  Of course,
 that would need further adjustments to default number of autovac
 workers, I/O allocation, etc.

 Do we really want to make vacuum more complex for this?  vacuum does
 have the delay settings we would need though.

I think it's abundantly clear that, as wonderful as autovacuum is
compared with what we had before autovacuum, it's not good enough.
This is one area where I think improvement is definitely needed, and
I've suggested it before.  Discussion began here:

http://www.postgresql.org/message-id/AANLkTimd3ieGCm9pXV39ci6-owy3rX0mzz_N1tL=0...@mail.gmail.com

Some of the things I suggested then seem dumb in hindsight, but I
think the basic concept is still valid: if we scan the heap and find
only a few dead tuples, the expense of scanning all of the indexes may
not be justified.  Also, the fact that a relation can currently only
be vacuumed by one process at a time is coming to seem like a major
limitation.  Some users are partitioning tables just so that each
partition can be autovac'd separately.  That really shouldn't be
required.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-22 Thread Bruce Momjian
On Tue, Apr 21, 2015 at 04:36:53PM -0400, Robert Haas wrote:
  Keep in mind there's a disconnect between dirtying a page and writing it
  to storage.  A page could remain dirty for a long time in the buffer
  cache.  This writing of sequential pages would occur at checkpoint time
  only, which seems the wrong thing to optimize.  If some other process
  needs to evict pages to make room to read some other page in, surely
  it's going to try one page at a time, not write many sequential dirty
  pages.
 
 Well, for a big sequential scan, we use a ring buffer, so we will
 typically be evicting the pages that we ourselves read in moments
 before.  So in this case we would do a lot of sequential writes of
 dirty pages.

Ah, yes, this again supports the prune-then-skip approach, rather than
doing the first X% pruneable pages seen.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-22 Thread Jim Nasby

On 4/21/15 4:07 PM, Peter Eisentraut wrote:

On 4/21/15 4:45 PM, Jim Nasby wrote:
In order for a background worker to keep up with some of the workloads
that have been presented as counterexamples, you'd need multiple
background workers operating in parallel and preferring to work on
certain parts of a table.  That would require a lot more sophisticated
job management than we currently have for, say, autovacuum.


My thought was that the foreground queries would send page IDs to the 
bgworker via a shmq. If the queries have to do much waiting at all on IO 
then I'd expect the bgworker to be able to keep pace with a bunch of 
them since it's just grabbing buffers that are already in the pool (and 
only those in the pool; it wouldn't make sense for it to pull it back 
from the kernel, let alone disk).


We'd need to code this so that if a queue fills up the query doesn't 
block; we just skip that opportunity to prune. I think that'd be fine.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-22 Thread Greg Stark
On Mon, Apr 20, 2015 at 8:48 PM, Bruce Momjian br...@momjian.us wrote:

 But if the entire table is very hot, I think that that is just another of way
 of saying that autovacuum is horribly misconfigured.  I think the purpose of

 Well, we have to assume there are many misconfigured configurations ---
 autovacuum isn't super-easy to configure, so we can't just blame the
 user if this makes things worse.  In fact, page pruning was designed
 spefically for cases where autovacuum wasn't running our couldn't keep
 up.

Well autovacuum isn't currently considering HOT pruning part of its
job at all. It's hard to call it misconfigured when there's
literally *no* way to configure it correctly.

If you update less than autovacuum_vacuum_scale_factor fraction of the
table and then never update another row autovacuum will never run.
Ever. Every select will forevermore need to follow hot chains on that
table. Until eventually transaction wraparound forces a vacuum on that
table if that ever happens.

Possibly autovacuum could be adjusted to count how many selects are
happening on the table and decide to vacuum it when the cost of the
selects following the dead tuples is balanced by the cost of doing a
vacuum. But that's not something included in the design of autovacuum
today.

The original design of tuple storage was aimed at optimizing the
steady state where most tuples were not recently updated. It
guaranteed that except for tuples that were in the process of being
updated or were recently updated a tuple read didn't have to read the
CLOG, didn't have to follow any chains, didn't have to do any I/O or
other work other than to read the bits on the tuple itself. When a
tuple is updated it's put into a state where everyone who comes along
has to do extra work but as soon as practical the hint bits get set
and that extra work stops.

We had similar discussions about setting hint bits in the past. I'm
not sure why HOT pruning is the focus now because I actually think
hint bit setting is a larger source of I/O in innocent looking selects
even today. And it's a major headache, people are always being
surprised that their selects cause lots of I/O and slow down
dramatically after a big update or data load has finished. It's
characterized as why is the database writing everything twice (and
saying it's actually writing everything three times doesn't make
people feel better). In the new age of checksums with hint bit logging
I wonder if it's even a bigger issue.

It occurs to me that generating these dirty pages isn't really that
expensive individually. It's only that there's a sudden influx of a
large number of dirty pages that causes them to get translated
immediately into filesystem I/O. Perhaps we should dirty pages on hint
bit updates and do HOT pruning only to the extent it can be done
without causing I/O. Of course it's hard to tell that in advance  but
maybe something like if the current buffer had to be fetched and
caused a dirty buffer to be evicted then skip hot pruning and don't
dirty it for any hint bit updates would at least mean that once the
select fills up its share of buffers with dirty buffers it stops
dirtying more. It would dirty pages only as fast as bgwriter or
checkpoints manage to write them out.

That sounds a bit weird but I think the right solution should have
that combination of properties. It should guarantee that hint bits get
set and hot chains pruned within some length of time but that no one
select causes a storm of dirty buffers that then need to be flushed to
disk.


-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-22 Thread Peter Eisentraut
On 4/22/15 11:37 AM, Jim Nasby wrote:
 On 4/21/15 4:07 PM, Peter Eisentraut wrote:
 On 4/21/15 4:45 PM, Jim Nasby wrote:
 In order for a background worker to keep up with some of the workloads
 that have been presented as counterexamples, you'd need multiple
 background workers operating in parallel and preferring to work on
 certain parts of a table.  That would require a lot more sophisticated
 job management than we currently have for, say, autovacuum.
 
 My thought was that the foreground queries would send page IDs to the
 bgworker via a shmq. If the queries have to do much waiting at all on IO
 then I'd expect the bgworker to be able to keep pace with a bunch of
 them since it's just grabbing buffers that are already in the pool (and
 only those in the pool; it wouldn't make sense for it to pull it back
 from the kernel, let alone disk).
 
 We'd need to code this so that if a queue fills up the query doesn't
 block; we just skip that opportunity to prune. I think that'd be fine.

I think a to-clean-up map would work better.  But basically we need a
way to remember where to clean up later if we're not going to do it in
the foreground.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-22 Thread Bruce Momjian
On Tue, Apr 21, 2015 at 05:07:52PM -0400, Peter Eisentraut wrote:
 On 4/21/15 4:45 PM, Jim Nasby wrote:
  This comment made me wonder... has anyone considered handing the pruning
  work off to a bgworker, at least for SELECTs? That means the selects
  themselves wouldn't be burdened by the actual prune work, only in
  notifying the bgworker. While that's not going to be free, presumably
  it's a lot cheaper...
 
 The nice thing about having foreground queries to the light cleanup is
 that they can work in parallel and naturally hit the interesting parts
 of the table first.
 
 In order for a background worker to keep up with some of the workloads
 that have been presented as counterexamples, you'd need multiple
 background workers operating in parallel and preferring to work on
 certain parts of a table.  That would require a lot more sophisticated
 job management than we currently have for, say, autovacuum.

Well, the visibility map tells us where _not_ to clean up, so using
another map to tell use _where_ to cleanup might make sense.  However,
the density of the map might be low enough that a list makes more sense,
as you suggested.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-22 Thread Kevin Grittner
Greg Stark st...@mit.edu wrote:

 And it's a major headache, people are always being surprised that
 their selects cause lots of I/O and slow down dramatically after
 a big update or data load has finished. It's characterized as
 why is the database writing everything twice (and saying it's
 actually writing everything three times doesn't make people feel
 better).

When I looked at the life-cycle of a heap tuple in a database I was
using, I found that (ignoring related index access and ignoring
WAL-file copying, etc., for our backups), each tuple that existed
long enough to freeze and be eventually deleted caused a lot of
writes.

(1) WAL log the insert.
(2) Write the tuple.
(3) Hint and rewrite the tuple.
(4) WAL log the freeze of the tuple.
(5) Rewrite the frozen tuple.
(6) WAL-log the delete.
(7) Rewrite the deleted tuple.
(8) Prune and rewrite the page.
(9) Free line pointers and rewrite the page.

If I was lucky some of the writes could be combined in cache
because they happened close enough together.  Also, one could hope
that not too much of the WAL-logging involved full page writes to
the WAL -- again, keeping steps close together in time helps with
that.  If all of (1) through (5) are done in quick succession, you
save two physical writes of the heap page and save one full page

write to WAL.  If steps (7) through (9) are done in quick
succession, you save two more physical writes to the heap.  This is
part of what makes the aggressive incremental freezing being
discussed on a nearby thread appealing -- at least for some
workloads.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-22 Thread Alvaro Herrera
Bruce Momjian wrote:
 On Wed, Apr 22, 2015 at 04:36:17PM +0100, Greg Stark wrote:
  On Mon, Apr 20, 2015 at 8:48 PM, Bruce Momjian br...@momjian.us wrote:
   Well, we have to assume there are many misconfigured configurations ---
   autovacuum isn't super-easy to configure, so we can't just blame the
   user if this makes things worse.  In fact, page pruning was designed
   spefically for cases where autovacuum wasn't running our couldn't keep
   up.
  
  Well autovacuum isn't currently considering HOT pruning part of its
  job at all. It's hard to call it misconfigured when there's
  literally *no* way to configure it correctly.
 
 Good point, but doesn't vacuum remove the need for pruning as it removes
 all the old rows?

Sure.  The point, I think, is to make autovacuum runs of some sort that
don't actually vacuum but only do HOT-pruning.  Maybe this is a
reasonable solution to the problem that queries don't prune anymore
after Simon's patch.  If we made autovac HOT-prune periodically, we
could have read-only queries prune only already-dirty pages.  Of course,
that would need further adjustments to default number of autovac
workers, I/O allocation, etc.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-22 Thread Bruce Momjian
On Wed, Apr 22, 2015 at 06:07:00PM -0300, Alvaro Herrera wrote:
  Good point, but doesn't vacuum remove the need for pruning as it removes
  all the old rows?
 
 Sure.  The point, I think, is to make autovacuum runs of some sort that
 don't actually vacuum but only do HOT-pruning.  Maybe this is a
 reasonable solution to the problem that queries don't prune anymore
 after Simon's patch.  If we made autovac HOT-prune periodically, we
 could have read-only queries prune only already-dirty pages.  Of course,
 that would need further adjustments to default number of autovac
 workers, I/O allocation, etc.

Do we really want to make vacuum more complex for this?  vacuum does
have the delay settings we would need though.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-22 Thread Jim Nasby

On 4/22/15 1:51 PM, Kevin Grittner wrote:

(1) WAL log the insert.
(2) Write the tuple.
(3) Hint and rewrite the tuple.
(4) WAL log the freeze of the tuple.
(5) Rewrite the frozen tuple.
(6) WAL-log the delete.
(7) Rewrite the deleted tuple.
(8) Prune and rewrite the page.
(9) Free line pointers and rewrite the page.

If I was lucky some of the writes could be combined in cache
because they happened close enough together. Also, one could hope
that not too much of the WAL-logging involved full page writes to
the WAL -- again, keeping steps close together in time helps with
that.


This is why I like the idea of methods that tell us where we need to do 
cleanup... they provide us with a rough ability to track what tuples are 
in what part of their lifecycle. The VM helps with this a small amount, 
but really it only applies after 1 and 6; it doesn't help us with any 
other portions.


Having a way to track recently created tuples would allow us to be much 
more efficient with 1-3, and with aggressive freezing, 1-5. A way to 
track recently deleted tuples would help with 6-7, possibly 6-9 if no 
indexes.


If we doubled the size of the VM, that would let us track 4 states for 
each page:


- Page has newly inserted tuples
- Page has newly deleted tuples
- Page is all visible
- Page is frozen

though as discussed elsewhere, we could probably combine all visible and 
frozen.


The win from doing this would be easily knowing what pages need hinting 
(newly inserted) and pruning (newly deleted). Unfortunately we still 
wouldn't know whether we could do real work without visiting the page 
itself, but I suspect that for many workloads just having newly 
inserted/deleted would be a serious win.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-22 Thread Bruce Momjian
On Wed, Apr 22, 2015 at 04:36:17PM +0100, Greg Stark wrote:
 On Mon, Apr 20, 2015 at 8:48 PM, Bruce Momjian br...@momjian.us wrote:
  Well, we have to assume there are many misconfigured configurations ---
  autovacuum isn't super-easy to configure, so we can't just blame the
  user if this makes things worse.  In fact, page pruning was designed
  spefically for cases where autovacuum wasn't running our couldn't keep
  up.
 
 Well autovacuum isn't currently considering HOT pruning part of its
 job at all. It's hard to call it misconfigured when there's
 literally *no* way to configure it correctly.

Good point, but doesn't vacuum remove the need for pruning as it removes
all the old rows?

 If you update less than autovacuum_vacuum_scale_factor fraction of the
 table and then never update another row autovacuum will never run.
 Ever. Every select will forevermore need to follow hot chains on that
 table. Until eventually transaction wraparound forces a vacuum on that
 table if that ever happens.

Yes, that is a very good point, and it matches my concerns.  Of course,
Simon's concern is to avoid overly-aggressive pruning where the row is
being pruned but will soon be modified, making the prune, and its WAL
volume, undesirable.  We have to consider both cases in any final
solution.

 Possibly autovacuum could be adjusted to count how many selects are
 happening on the table and decide to vacuum it when the cost of the
 selects following the dead tuples is balanced by the cost of doing a
 vacuum. But that's not something included in the design of autovacuum
 today.

Well, autovacuum is also going to clean indexes, which seem like
overkill for pruning HOT updates.

 The original design of tuple storage was aimed at optimizing the
 steady state where most tuples were not recently updated. It
 guaranteed that except for tuples that were in the process of being
 updated or were recently updated a tuple read didn't have to read the
 CLOG, didn't have to follow any chains, didn't have to do any I/O or
 other work other than to read the bits on the tuple itself. When a
 tuple is updated it's put into a state where everyone who comes along
 has to do extra work but as soon as practical the hint bits get set
 and that extra work stops.

Yes, Simon is right that doing everything as-soon-as-possible is not
optimal.  I think the trick is knowing when we should give up waiting
for something else to dirty the page and prune it.

 We had similar discussions about setting hint bits in the past. I'm
 not sure why HOT pruning is the focus now because I actually think
 hint bit setting is a larger source of I/O in innocent looking selects
 even today. And it's a major headache, people are always being
 surprised that their selects cause lots of I/O and slow down
 dramatically after a big update or data load has finished. It's
 characterized as why is the database writing everything twice (and
 saying it's actually writing everything three times doesn't make
 people feel better). In the new age of checksums with hint bit logging
 I wonder if it's even a bigger issue.

What would be the downside of only doing pruning during SELECT hint bit
setting?  Hinting is delayed by long-running transactions, but so is
pruning.  I assume you can do more pruning than setting all_visible
hints because the old prunable rows are older by definition, but I am
unclear how much older they are.

FYI, while hint bit setting causes page writes, it does not cause WAL
writes unless you have wal_log_hints set or page-level checksums are
enabled.  By doing pruning at the same time as hint bit setting, you are
sharing the same page write, but are generating more WAL.  Of course, if
you are setting all-visible, then you are by definition waiting longer
to prune than before, and this might be enough to make it a win for all
use cases.  You wouldn't never-prune in a read-only workload because
your hint bits would eventually cause the pruning.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-21 Thread Bruce Momjian
On Mon, Apr 20, 2015 at 07:13:38PM -0300, Alvaro Herrera wrote:
 Bruce Momjian wrote:
  On Mon, Apr 20, 2015 at 04:19:22PM -0300, Alvaro Herrera wrote:
   Bruce Momjian wrote:
   
   This seems simple to implement: keep two counters, where the second one
   is pages we skipped cleanup in.  Once that counter hits SOME_MAX_VALUE,
   reset the first counter so that further 5 pages will get HOT pruned.  5%
   seems a bit high though.  (In Simon's design, SOME_MAX_VALUE is
   essentially +infinity.)
  
  This would tend to dirty non-sequential heap pages --- it seems best to
  just clean as many as we are supposed to, then skip the rest, so we can
  write sequential dirty pages to storage.
 
 Keep in mind there's a disconnect between dirtying a page and writing it
 to storage.  A page could remain dirty for a long time in the buffer
 cache.  This writing of sequential pages would occur at checkpoint time
 only, which seems the wrong thing to optimize.  If some other process
 needs to evict pages to make room to read some other page in, surely
 it's going to try one page at a time, not write many sequential dirty
 pages.

Yes, it might be too much optimization to try to get the checkpoint to
flush all those pages sequentially, but I was thinking of our current
behavior where, after an update of all rows, we effectively write out
the entire table because we have dirtied every page.  I guess with later
prune-based writes, we aren't really writing all the pages as we have
the pattern where pages with prunable content is kind of random. I guess
I was just wondering what value there is to your write-then-skip idea,
vs just writing the first X% of pages we find?  Your idea certainly
spreads out the pruning, and doesn't require knowing the size of the
table, though I though that information was easily determined.

One thing to consider is how we handle pruning of index scans that hit
multiple heap pages.  Do we still write X% of the pages in the table, or
%X of the heap pages we actually access via SELECT?  With the
write-then-skip approach, we would do X% of the pages we access, while
with the first-X% approach, we would probably prune all of them as we
would not be accessing most of the table.  I don't think we can do the
first first-X% of pages and have the percentage based on the number of
pages accessed as we have no way to know how many heap pages we will
access from the index.  (We would know for bitmap scans, but that
complexity doesn't seem worth it.)  That would argue, for consistency
with sequential and index-based heap access, that your approach is best.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-21 Thread Peter Eisentraut
On 4/21/15 4:45 PM, Jim Nasby wrote:
 This comment made me wonder... has anyone considered handing the pruning
 work off to a bgworker, at least for SELECTs? That means the selects
 themselves wouldn't be burdened by the actual prune work, only in
 notifying the bgworker. While that's not going to be free, presumably
 it's a lot cheaper...

The nice thing about having foreground queries to the light cleanup is
that they can work in parallel and naturally hit the interesting parts
of the table first.

In order for a background worker to keep up with some of the workloads
that have been presented as counterexamples, you'd need multiple
background workers operating in parallel and preferring to work on
certain parts of a table.  That would require a lot more sophisticated
job management than we currently have for, say, autovacuum.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-21 Thread Robert Haas
On Mon, Apr 20, 2015 at 6:13 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Bruce Momjian wrote:
 On Mon, Apr 20, 2015 at 04:19:22PM -0300, Alvaro Herrera wrote:
  Bruce Momjian wrote:
  This seems simple to implement: keep two counters, where the second one
  is pages we skipped cleanup in.  Once that counter hits SOME_MAX_VALUE,
  reset the first counter so that further 5 pages will get HOT pruned.  5%
  seems a bit high though.  (In Simon's design, SOME_MAX_VALUE is
  essentially +infinity.)

 This would tend to dirty non-sequential heap pages --- it seems best to
 just clean as many as we are supposed to, then skip the rest, so we can
 write sequential dirty pages to storage.

 Keep in mind there's a disconnect between dirtying a page and writing it
 to storage.  A page could remain dirty for a long time in the buffer
 cache.  This writing of sequential pages would occur at checkpoint time
 only, which seems the wrong thing to optimize.  If some other process
 needs to evict pages to make room to read some other page in, surely
 it's going to try one page at a time, not write many sequential dirty
 pages.

Well, for a big sequential scan, we use a ring buffer, so we will
typically be evicting the pages that we ourselves read in moments
before.  So in this case we would do a lot of sequential writes of
dirty pages.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-21 Thread Jim Nasby

On 4/21/15 10:04 AM, Bruce Momjian wrote:

One thing to consider is how we handle pruning of index scans that hit
multiple heap pages.  Do we still write X% of the pages in the table, or
%X of the heap pages we actually access via SELECT?  With the
write-then-skip approach, we would do X% of the pages we access, while
with the first-X% approach, we would probably prune all of them as we
would not be accessing most of the table.  I don't think we can do the
first first-X% of pages and have the percentage based on the number of
pages accessed as we have no way to know how many heap pages we will
access from the index.


This comment made me wonder... has anyone considered handing the pruning 
work off to a bgworker, at least for SELECTs? That means the selects 
themselves wouldn't be burdened by the actual prune work, only in 
notifying the bgworker. While that's not going to be free, presumably 
it's a lot cheaper...

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-21 Thread Robert Haas
On Tue, Apr 21, 2015 at 11:04 AM, Bruce Momjian br...@momjian.us wrote:
 Yes, it might be too much optimization to try to get the checkpoint to
 flush all those pages sequentially, but I was thinking of our current
 behavior where, after an update of all rows, we effectively write out
 the entire table because we have dirtied every page.  I guess with later
 prune-based writes, we aren't really writing all the pages as we have
 the pattern where pages with prunable content is kind of random. I guess
 I was just wondering what value there is to your write-then-skip idea,
 vs just writing the first X% of pages we find?  Your idea certainly
 spreads out the pruning, and doesn't require knowing the size of the
 table, though I though that information was easily determined.

 One thing to consider is how we handle pruning of index scans that hit
 multiple heap pages.  Do we still write X% of the pages in the table, or
 %X of the heap pages we actually access via SELECT?  With the
 write-then-skip approach, we would do X% of the pages we access, while
 with the first-X% approach, we would probably prune all of them as we
 would not be accessing most of the table.  I don't think we can do the
 first first-X% of pages and have the percentage based on the number of
 pages accessed as we have no way to know how many heap pages we will
 access from the index.  (We would know for bitmap scans, but that
 complexity doesn't seem worth it.)  That would argue, for consistency
 with sequential and index-based heap access, that your approach is best.

I actually implemented something like this for setting hint bits a few
years ago:

http://www.postgresql.org/message-id/aanlktik5qzr8wts0mqcwwmnp-qhgrdky5av5aob7w...@mail.gmail.com
http://www.postgresql.org/message-id/aanlktimgkag7wdu-x77gnv2gh6_qo5ss1u5b6q1ms...@mail.gmail.com

At least in later versions, the patch writes a certain number of
hinted pages, then skips writing a run of pages, then writes another
run of hinted pages.  The basic problem here is that, after the fsync
queue compaction patch went in, the benefits on my tests were pretty
modest.  Yeah, it costs something to write out lots of dirty pages,
but before the fsync queue compaction stuff, the initial scan of an
unhinted table took like 6x the time on the machine I tested on, but
after that, it was like 1.5x the time.  Blunting that spike just
wasn't exciting enough.

It strikes me that it would be better to have an integrated strategy
for this problem.  It doesn't make sense to have one strategy for
deciding whether to set hint bits and a separate strategy for deciding
whether to HOT-prune.  And if we decide to set hint bits and
HOT-prune, it might be smart to try to mark the page all-visible, too,
if it is and we're not about to update it.  I believe we're losing a
lot of performance on OLTP workloads by re-dirtying the same pages
over and over again.  We've probably all hit cases where there is an
obvious loss of performance because of this sort of thing, but I'm
starting to think it's hurting us in a lot of less-obvious ways.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-20 Thread Alvaro Herrera
Bruce Momjian wrote:
 On Mon, Apr 20, 2015 at 04:19:22PM -0300, Alvaro Herrera wrote:
  Bruce Momjian wrote:
  
  This seems simple to implement: keep two counters, where the second one
  is pages we skipped cleanup in.  Once that counter hits SOME_MAX_VALUE,
  reset the first counter so that further 5 pages will get HOT pruned.  5%
  seems a bit high though.  (In Simon's design, SOME_MAX_VALUE is
  essentially +infinity.)
 
 This would tend to dirty non-sequential heap pages --- it seems best to
 just clean as many as we are supposed to, then skip the rest, so we can
 write sequential dirty pages to storage.

Keep in mind there's a disconnect between dirtying a page and writing it
to storage.  A page could remain dirty for a long time in the buffer
cache.  This writing of sequential pages would occur at checkpoint time
only, which seems the wrong thing to optimize.  If some other process
needs to evict pages to make room to read some other page in, surely
it's going to try one page at a time, not write many sequential dirty
pages.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-20 Thread Bruce Momjian
On Mon, Apr 20, 2015 at 04:19:22PM -0300, Alvaro Herrera wrote:
 Bruce Momjian wrote:
 
  I think the limit has to be in terms of a percentage of the table size. 
  For example, if we do one SELECT on a table with all non-dirty pages, it
  would be good to know that 5% of the pages were pruned --- that tells me
  that another 19 SELECTs will totally prune the table, assuming no future
  writes.
 
 This seems simple to implement: keep two counters, where the second one
 is pages we skipped cleanup in.  Once that counter hits SOME_MAX_VALUE,
 reset the first counter so that further 5 pages will get HOT pruned.  5%
 seems a bit high though.  (In Simon's design, SOME_MAX_VALUE is
 essentially +infinity.)

This would tend to dirty non-sequential heap pages --- it seems best to
just clean as many as we are supposed to, then skip the rest, so we can
write sequential dirty pages to storage.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-20 Thread Robert Haas
On Mon, Apr 20, 2015 at 3:28 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 But why should 1 SELECT or 20 SELECTs or 200 SELECTs have to do a job, while
 the user waits, which is fundamentally VACUUM's duty to do in the
 background?  If there are a handful of very hot pages, then it makes sense
 not to wait for vacuum to get to them.  And that is what a block-count limit
 does.

I think that's a fundamental mischaracterization of the problem.  As
soon as you define this as vacuum's problem, then of course it makes
no sense to prune in the foreground, ever.  But if you define the
problem as get the best overall system performance, then it clearly
DOES sometimes make sense to prune in the foreground, as benchmark
results upthread demonstrate.

The fact is that on a workload like pgbench - and it doesn't have to
be exactly pgbench, just any kind of workload where there are lots of
changes to the table - vacuum can at any given time be pruning at most
one page of the table.  That is because only one vacuum process can be
running in a given table at one time, and it can't be doing two things
at once.  But there can be many processes doing inserts, updates, or
deletes on that table, as many as whatever you have max_connections
set to.  There can easily be dozens even on a well-configured system;
on a poorly configured system, there could be hundreds.  It seems
obvious that if you can have dozens or hundreds of processes creating
garbage and at most one process cleaning it up, there will be cases
where you get further and further behind.

Now, it might well be that the right solution to that problem is to
allow multiple vacuum processes in the same database, or add
background workers to help with opportunistic HOT-pruning of pages so
it doesn't get done in the foreground.  Fine.  But as of today, on a
heavily-modified table, the ONLY way that we can possibly remove junk
from the table as fast as we're creating junk is if the backends
touching the table do some of the work.  Now, Simon is making the
argument that it should be good enough to have people *modifying* the
table help with the cleanup rather than imposing that load on the
people who are only *reading* it, and that's not a dumb argument, but
there are still cases where that strategy loses - specifically, where
the table churn has stopped or paused, by autovacuum hasn't run yet.
If you're going to do 1 sequential scan of the table and then go home
for the day, HOT-pruning is dumb even in that case.  If you're going
to do 1000 sequential scans of that table in a row, HOT-pruning may
very well be smart.  There's no guarantee that the table has met the
autovacuum threshold, but HOT-pruning it could well be a win anyway.
Or it might be a loss.  You can make any policy here look smart or
dumb by picking a particular workload, and you don't even have to
invent crazy things that will never happen in real life to do it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-20 Thread Bruce Momjian
On Mon, Apr 20, 2015 at 09:56:20PM +0100, Simon Riggs wrote:
 On 20 April 2015 at 20:28, Jeff Janes jeff.ja...@gmail.com wrote:
  
 
 But why should 1 SELECT or 20 SELECTs or 200 SELECTs have to do a job,
 while the user waits, which is fundamentally VACUUM's duty to do in the
 background? 
 
 
 Agreed. I don't see a % as giving us anything at all.
 
 The idea is that we want to turn an O(N) problem for one query into an O(1)
 task.
  
 
 The use case I see for this is when there is a mixed workload.  There is
 one select which reads the entire table, and hundreds of thousands of
 selects/updates/insert that don't, and of course vacuum comes along every
 now and then and does it thing.  Why should the one massive SELECT have
 horrible performance just because it was run right before autovacuum would
 have kicked in instead of right after if finished?
 
 
 +1

You can +1 all you want, but if you ignore the specific workloads I
mentioned, you are not going to get much traction.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-20 Thread Simon Riggs
On 20 April 2015 at 20:28, Jeff Janes jeff.ja...@gmail.com wrote:


 But why should 1 SELECT or 20 SELECTs or 200 SELECTs have to do a job,
 while the user waits, which is fundamentally VACUUM's duty to do in the
 background?


Agreed. I don't see a % as giving us anything at all.

The idea is that we want to turn an O(N) problem for one query into an O(1)
task.


 The use case I see for this is when there is a mixed workload.  There is
 one select which reads the entire table, and hundreds of thousands of
 selects/updates/insert that don't, and of course vacuum comes along every
 now and then and does it thing.  Why should the one massive SELECT have
 horrible performance just because it was run right before autovacuum would
 have kicked in instead of right after if finished?


+1

-- 
Simon Riggshttp://www.2ndQuadrant.com/
http://www.2ndquadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-20 Thread Jeff Janes
On Mon, Sep 29, 2014 at 2:13 AM, Andres Freund and...@anarazel.de wrote:

 On 2014-09-28 19:51:36 +0100, Simon Riggs wrote:
  On 27 September 2014 09:29, Andres Freund and...@anarazel.de wrote:
   On 2014-09-27 10:23:33 +0300, Heikki Linnakangas wrote:
   This patch has gotten a fair amount of review, and has been
rewritten once
   during the commitfest. I think it's pretty close to being
committable, the
   only remaining question seems to be what to do with system catalogs.
I'm
   marking this as Returned with feedback, I take it that Simon can
proceed
   from here, outside the commitfest.
  
   FWIW, I don't think it is, even with that. As is it seems very likely
   that it's going to regress a fair share of workloads. At the very
least
   it needs a fair amount of benchmarking beforehand.
 
  There is some doubt there. We've not seen a workload that does
  actually exhibit a negative behaviour.

 Neither is there much data about the magnitude of positive effect the
 patch has...

  I'm not saying one doesn't exist, but it does matter how common/likely
  it is. If anyone can present a performance test case that demonstrates
  a regression, I think it will make it easier to discuss how wide that
  case is and what we should do about it. Discussing whether to do
  various kinds of limited pruning are moot until that is clear.

 I doubt it'll be hard to construct a case where it'll show. My first try
 of using a pgbench scale 100, -M prepared, -cj8 with a custom file with
 1 write and 5 read transaction yielded the following on my laptop:

 Baseline:
  relname| pgbench_tellers
  pg_total_relation_size | 458752
  relname| pgbench_accounts
  pg_total_relation_size | 1590337536
  relname| pgbench_branches
  pg_total_relation_size | 286720
  relname| pgbench_history
  pg_total_relation_size | 49979392
 Patched:
  relname| pgbench_tellers
  pg_total_relation_size | 516096
  relname| pgbench_accounts
  pg_total_relation_size | 1590337536
  relname| pgbench_branches
  pg_total_relation_size | 360448
  relname| pgbench_history
  pg_total_relation_size | 49528832

 So, there's a noticeable increase in size. Mostly on the smaller tables,
 so probably HOT cleanup was sometimes skipped during UPDATEs due to
 locks.

 Baseline was:
 tps = 9655.486532 (excluding connections establishing)
 Patched was:
 tps = 9466.158701 (including connections establishing)

Was this reproducible?  I've run your custom sql file with 4 clients (that
is how many CPUs I have) on a machine
with a BBU.  I had wal_level = hot_standby, but the archive_command just
returned true without archiving anything. And using the latest patch.

The size of the pgbench_tellers and pgbench_branches relations were
surprisingly variable in both patched and unpatched, but there was no
reliable difference between them, just within them.

On the TPS front, there was a hint that patched one was slightly slower but
the within sample variation was also high, and the p-val for difference was
only 0.214 on n of 66.

test case attached.

 That's not a unrealistic testcase.

 I'm pretty sure this could be made quite a bit more pronounced by not
 using a uniform distribution in the pgbench runs. And selecting a test
 that's more vulnerable to the change (e.g. using a wider distribution
 for the read only statements than the modifying ones) would make the the
 CPU overhead of the additional heap_hot_search_buffer() overhead
 heavier.

Sorry I don't understand this description.  Why would queries selecting
data that is not changing have any extra overhead?

Is the idea that the hot part of the table for updates would move around
over time, but the hot part for selects would be even throughout?  I'm not
sure how to put that to the test.

Cheers,

Jeff


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-20 Thread Andres Freund
On 2015-04-20 01:04:18 -0700, Jeff Janes wrote:
 Was this reproducible?

Yes, at least with an old version of the patch.

I don't think you could see a difference using exactly that with the
newer versions which have the 5 page limit. After all it'll pretty much
never reach it.

  That's not a unrealistic testcase.
 
  I'm pretty sure this could be made quite a bit more pronounced by not
  using a uniform distribution in the pgbench runs. And selecting a test
  that's more vulnerable to the change (e.g. using a wider distribution
  for the read only statements than the modifying ones) would make the the
  CPU overhead of the additional heap_hot_search_buffer() overhead
  heavier.

 Sorry I don't understand this description.  Why would queries selecting
 data that is not changing have any extra overhead?

The idea, I think, was that by having a uniform (or just wider)
distribution of the reads they'd be more likely to land on values that
have been updated at some point, but not been pruned since (because at
that point the patch IIRC didn't prune during reads at all). I.e. ones
wer

 Is the idea that the hot part of the table for updates would move around
 over time, but the hot part for selects would be even throughout?

Pretty much.

 I'm not sure how to put that to the test.

That pretty much was what I'd tried to model, yea. I guess it'd be
possible to model this by inserting NOW()/updating values NOW() - 5 and
selecting values up to NOW() - 60. That'd roughly model some realistic
insert/update/select patterns I've seen.

To possibly see any difference with the new patch this would have to be
done in a way that regularly a couple of pages would be touched, with
not that many selected tuples on each.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-20 Thread Simon Riggs
On 20 April 2015 at 18:33, Bruce Momjian br...@momjian.us wrote:


 Also, I am also not sure we should be designing features at this stage
 in our release process.


I see this more as a process of gaining approval. I don't think patches at
the back of the queue should get the its too late treatment just because
they are at the back of the queue. They are logically all at the same
stage. There should be a way to allow people that show patience and respect
for the process to get the same timeshare as those that push their patches
daily.

Anyway, in this case, the patch conflicts with other things going in now,
so changing things isn't really sensible for this release, in terms of my
time.

We clearly need to do a better job of piggybacking actions on a dirty
block. The discussion has been about whether to do early cleanup, but we
should also consider post-access cleanup if we set hint bits or dirtied the
block in other ways.

Since we have many votes in favour of change in this area I'll post a new
version and look for an early review/commit for next release.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
http://www.2ndquadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-20 Thread Bruce Momjian
On Thu, Apr 16, 2015 at 03:41:54PM +0100, Simon Riggs wrote:
 That is how we arrive at the idea of a cleanup limit, further enhanced by a
 limit that applies only to dirtying clean blocks, which we have 4? recent 
 votes
 in favour of.
 
 I would personally be in favour of a parameter to control the limit, since
 whatever we chose is right/wrong depending upon circumstances. I am however
 comfortable with not having a parameter if people think it is hard to tune
 that, which I agree it would be, hence no parameter in the patch.

I think the limit has to be in terms of a percentage of the table size. 
For example, if we do one SELECT on a table with all non-dirty pages, it
would be good to know that 5% of the pages were pruned --- that tells me
that another 19 SELECTs will totally prune the table, assuming no future
writes.  If there are future writes, they would dirty the pages and
cause even more pruning, but the 5% gives me the maximum pruning number
of SELECTs.  If there aren't another 19 SELECTs, do I care if the table
is pruned or not?  Probably not.  Measuring in page count doesn't do
that, and a large table could receive millions of selects before being
fully cleaned.

Also, I am also not sure we should be designing features at this stage
in our release process.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-20 Thread Alvaro Herrera
Bruce Momjian wrote:

 I think the limit has to be in terms of a percentage of the table size. 
 For example, if we do one SELECT on a table with all non-dirty pages, it
 would be good to know that 5% of the pages were pruned --- that tells me
 that another 19 SELECTs will totally prune the table, assuming no future
 writes.

This seems simple to implement: keep two counters, where the second one
is pages we skipped cleanup in.  Once that counter hits SOME_MAX_VALUE,
reset the first counter so that further 5 pages will get HOT pruned.  5%
seems a bit high though.  (In Simon's design, SOME_MAX_VALUE is
essentially +infinity.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-20 Thread Bruce Momjian
On Mon, Apr 20, 2015 at 04:19:22PM -0300, Alvaro Herrera wrote:
 Bruce Momjian wrote:
 
  I think the limit has to be in terms of a percentage of the table size. 
  For example, if we do one SELECT on a table with all non-dirty pages, it
  would be good to know that 5% of the pages were pruned --- that tells me
  that another 19 SELECTs will totally prune the table, assuming no future
  writes.
 
 This seems simple to implement: keep two counters, where the second one
 is pages we skipped cleanup in.  Once that counter hits SOME_MAX_VALUE,
 reset the first counter so that further 5 pages will get HOT pruned.  5%
 seems a bit high though.  (In Simon's design, SOME_MAX_VALUE is
 essentially +infinity.)

Oh, I pulled 5% out of the air.  Thinking of a SELECT-only workload,
which would be our worse case, I was thinking how many SELECTS running
through HOT update chains would it take to be slower than generating the
WAL to prune the page.  I see the percentage as something that we could
reasonably balance, while a fixed page count couldn't be analyzed in
that way.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-20 Thread Jeff Janes
On Mon, Apr 20, 2015 at 10:33 AM, Bruce Momjian br...@momjian.us wrote:

 On Thu, Apr 16, 2015 at 03:41:54PM +0100, Simon Riggs wrote:
  That is how we arrive at the idea of a cleanup limit, further enhanced
 by a
  limit that applies only to dirtying clean blocks, which we have 4?
 recent votes
  in favour of.
 
  I would personally be in favour of a parameter to control the limit,
 since
  whatever we chose is right/wrong depending upon circumstances. I am
 however
  comfortable with not having a parameter if people think it is hard to
 tune
  that, which I agree it would be, hence no parameter in the patch.

 I think the limit has to be in terms of a percentage of the table size.
 For example, if we do one SELECT on a table with all non-dirty pages, it
 would be good to know that 5% of the pages were pruned --- that tells me
 that another 19 SELECTs will totally prune the table, assuming no future
 writes.


But why should 1 SELECT or 20 SELECTs or 200 SELECTs have to do a job,
while the user waits, which is fundamentally VACUUM's duty to do in the
background?  If there are a handful of very hot pages, then it makes sense
not to wait for vacuum to get to them.  And that is what a block-count
limit does.

But if the entire table is very hot, I think that that is just another of
way of saying that autovacuum is horribly misconfigured.  I think the
purpose of this patch is to fix something that can't be fixed through
configuration alone.


 If there are future writes, they would dirty the pages and
 cause even more pruning, but the 5% gives me the maximum pruning number
 of SELECTs.  If there aren't another 19 SELECTs, do I care if the table
 is pruned or not?


The use case I see for this is when there is a mixed workload.  There is
one select which reads the entire table, and hundreds of thousands of
selects/updates/insert that don't, and of course vacuum comes along every
now and then and does it thing.  Why should the one massive SELECT have
horrible performance just because it was run right before autovacuum would
have kicked in instead of right after if finished?

Cheers,

Jeff


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-20 Thread Bruce Momjian
On Mon, Apr 20, 2015 at 12:28:11PM -0700, Jeff Janes wrote:
 But why should 1 SELECT or 20 SELECTs or 200 SELECTs have to do a job, while
 the user waits, which is fundamentally VACUUM's duty to do in the background? 
 If there are a handful of very hot pages, then it makes sense not to wait for
 vacuum to get to them.  And that is what a block-count limit does.  
 
 But if the entire table is very hot, I think that that is just another of way
 of saying that autovacuum is horribly misconfigured.  I think the purpose of

Well, we have to assume there are many misconfigured configurations ---
autovacuum isn't super-easy to configure, so we can't just blame the
user if this makes things worse.  In fact, page pruning was designed
spefically for cases where autovacuum wasn't running our couldn't keep
up.

 this patch is to fix something that can't be fixed through configuration 
 alone.
  
 
 If there are future writes, they would dirty the pages and
 cause even more pruning, but the 5% gives me the maximum pruning number
 of SELECTs.  If there aren't another 19 SELECTs, do I care if the table
 is pruned or not? 
 
 
 The use case I see for this is when there is a mixed workload.  There is one
 select which reads the entire table, and hundreds of thousands of selects/
 updates/insert that don't, and of course vacuum comes along every now and then
 and does it thing.  Why should the one massive SELECT have horrible 
 performance
 just because it was run right before autovacuum would have kicked in instead 
 of
 right after if finished?

I see your point, but what about the read-only workload after a big
update?  Do we leave large tables to be non-pruned for a long time? 
Also, consider cases where you did a big update, the autovacuum
thresh-hold was not met, so autovacuum doesn't run on that table ---
again, do we keep those non-pruned rows around for millions of scans?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-16 Thread Greg Stark
On 15 Apr 2015 15:43, Simon Riggs si...@2ndquadrant.com wrote:

 It all depends upon who is being selfish. Why is a user selfish for
 not wanting to clean every single block they scan, when the people
 that made the mess do nothing and go faster 10 minutes from now?
 Randomly and massively penalising large SELECTs makes no sense. Some
 cleanup is OK, with reasonable limits, which is why that is proposed.

I don't think it's productive to think of a query as a different actor with
only an interest in its own performance and no interest in overall system
performance.

From a holistic point of view the question is how many times is a given hit
chain going to need to be followed before it's pruned. Or to put it another
way, how expensive is creating a hot chain. Does it cause a single prune? a
fixed number of chain readers followed by a prune? Does the amount of work
depend on the workload or is it consistent?

My intuition is that a fixed cutoff like five pages is dangerous because
if you update many pages there's no limit to the number of times they'll be
read before they're all pruned. The steady state could easily be that every
query is having to read hot chains forever.

My intuition, again, is that what we need is a percentage such as do 10
prunes then ignore the next 1000 clean pages with hot chains. That
guarantees that after 100 selects the hot chains will all be pruned but
each select will only prune 1% of the clean pages it sees.


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-16 Thread Pavan Deolasee
On Thu, Apr 16, 2015 at 2:47 PM, Greg Stark st...@mit.edu wrote:


 On 15 Apr 2015 15:43, Simon Riggs si...@2ndquadrant.com wrote:
 
  It all depends upon who is being selfish. Why is a user selfish for
  not wanting to clean every single block they scan, when the people
  that made the mess do nothing and go faster 10 minutes from now?
  Randomly and massively penalising large SELECTs makes no sense. Some
  cleanup is OK, with reasonable limits, which is why that is proposed.

 I don't think it's productive to think of a query as a different actor
 with only an interest in its own performance and no interest in overall
 system performance.

 From a holistic point of view the question is how many times is a given
 hit chain going to need to be followed before it's pruned. Or to put it
 another way, how expensive is creating a hot chain. Does it cause a single
 prune? a fixed number of chain readers followed by a prune? Does the amount
 of work depend on the workload or is it consistent?


IMO the size or traversal of the HOT chain is not that expensive compared
to the cost of either pruning too frequently, which generates WAL as well
as makes buffers dirty. OTOH cost of less frequent pruning could also be
very high. It can cause severe table bloat which may just stay for a very
long time. Even if dead space is recovered within a page, truncating a
bloated heap is not always possible. In such cases, even SELECTs would be
slowed down just because they need to read/scan far more pages than they
otherwise would have. IOW its probably wrong to assume that not-pruning
quickly enough will have impact only on the non-SELECT queries.

I also concur with arguments upthread that this change needs to be
carefully calibrated because it can lead to significant degradation for
certain workloads.

 My intuition, again, is that what we need is a percentage such as do 10
 prunes then ignore the next 1000 clean pages with hot chains. That
 guarantees that after 100 selects the hot chains will all be pruned but
 each select will only prune 1% of the clean pages it sees.

I think some such proposal was made in the last. There could be knob to
control how much a read-only query (or may be a read-only transaction)
should do HOT cleanup, say as a percentage of pages it looks at. The
default can be left at 100% in the first release so that the current
behaviour is not suddenly disrupted. But it will allow others to play with
the percentages and then based on field reports, we can change defaults in
the next releases.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-16 Thread Alvaro Herrera
Pavan Deolasee wrote:
 On Thu, Apr 16, 2015 at 2:47 PM, Greg Stark st...@mit.edu wrote:

  From a holistic point of view the question is how many times is a given
  hit chain going to need to be followed before it's pruned. Or to put it
  another way, how expensive is creating a hot chain. Does it cause a single
  prune? a fixed number of chain readers followed by a prune? Does the amount
  of work depend on the workload or is it consistent?
 
 IMO the size or traversal of the HOT chain is not that expensive compared
 to the cost of either pruning too frequently, which generates WAL as well
 as makes buffers dirty. OTOH cost of less frequent pruning could also be
 very high. It can cause severe table bloat which may just stay for a very
 long time. Even if dead space is recovered within a page, truncating a
 bloated heap is not always possible.

I think you're failing to consider that in the patch there is a
distinction between read-only page accesses and page updates.  During a
page update, HOT cleanup is always done even with the patch, so there
won't be any additional bloat that would not be there without the patch.
It's only the read-only accesses to the patch that skip the HOT pruning.

Of course, as Greg says there will be some additional scans of the HOT
chain by read-only processes.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-16 Thread Andres Freund
On 2015-04-16 10:20:20 -0300, Alvaro Herrera wrote:
 I think you're failing to consider that in the patch there is a
 distinction between read-only page accesses and page updates.  During a
 page update, HOT cleanup is always done even with the patch, so there
 won't be any additional bloat that would not be there without the
 patch.

That's not really true (and my benchmark upthread proves it). The fact
that hot pruning only happens when we can get a cleanup lock means that
we can end up with more pages that are full, if we prune on select less
often. Especially if SELECTs are more frequent than write accesses -
pretty darn common - the likelihood of SELECTs getting the lock is
correspondingly higher.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-16 Thread Pavan Deolasee
On Thu, Apr 16, 2015 at 6:50 PM, Alvaro Herrera alvhe...@2ndquadrant.com
wrote:

 Pavan Deolasee wrote:
  On Thu, Apr 16, 2015 at 2:47 PM, Greg Stark st...@mit.edu wrote:

   From a holistic point of view the question is how many times is a given
   hit chain going to need to be followed before it's pruned. Or to put it
   another way, how expensive is creating a hot chain. Does it cause a
 single
   prune? a fixed number of chain readers followed by a prune? Does the
 amount
   of work depend on the workload or is it consistent?
 
  IMO the size or traversal of the HOT chain is not that expensive compared
  to the cost of either pruning too frequently, which generates WAL as well
  as makes buffers dirty. OTOH cost of less frequent pruning could also be
  very high. It can cause severe table bloat which may just stay for a very
  long time. Even if dead space is recovered within a page, truncating a
  bloated heap is not always possible.

 I think you're failing to consider that in the patch there is a
 distinction between read-only page accesses and page updates.  During a
 page update, HOT cleanup is always done even with the patch, so there
 won't be any additional bloat that would not be there without the patch.
 It's only the read-only accesses to the patch that skip the HOT pruning.


Ah, Ok. I'd not read the patch. But now that I do, I feel much more
comfortable with the change. In fact, I wonder if its just enough to either
do full HOT prune for target relations and not at all for all other
relations involved in the query. My apologies if this is done based on
discussions upthread. I haven't read the entire thread yet.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-16 Thread Alvaro Herrera
Andres Freund wrote:
 On 2015-04-16 10:20:20 -0300, Alvaro Herrera wrote:
  I think you're failing to consider that in the patch there is a
  distinction between read-only page accesses and page updates.  During a
  page update, HOT cleanup is always done even with the patch, so there
  won't be any additional bloat that would not be there without the
  patch.
 
 That's not really true (and my benchmark upthread proves it). The fact
 that hot pruning only happens when we can get a cleanup lock means that
 we can end up with more pages that are full, if we prune on select less
 often. Especially if SELECTs are more frequent than write accesses -
 pretty darn common - the likelihood of SELECTs getting the lock is
 correspondingly higher.

Interesting point.  Of course, this code should count HOT cleanups
against the total limit when they are effectively carried out, and
ignore those that are skipped because of inability to acquire the
cleanup lock.  Not sure whether the submitted code does that.

Can we keep stats on how many pages we don't clean in the updating
process due to failure to acquire cleanup lock?  My intuition says that
it should be similar to the number of backends running concurrently,
but that might be wrong.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-16 Thread Simon Riggs
On 16 April 2015 at 15:21, Andres Freund and...@anarazel.de wrote:

 On 2015-04-16 10:20:20 -0300, Alvaro Herrera wrote:
  I think you're failing to consider that in the patch there is a
  distinction between read-only page accesses and page updates.  During a
  page update, HOT cleanup is always done even with the patch, so there
  won't be any additional bloat that would not be there without the
  patch.

 That's not really true (and my benchmark upthread proves it). The fact
 that hot pruning only happens when we can get a cleanup lock means that
 we can end up with more pages that are full, if we prune on select less
 often. Especially if SELECTs are more frequent than write accesses -
 pretty darn common - the likelihood of SELECTs getting the lock is
 correspondingly higher.


Your point that we *must* do *some* HOT cleanup on SELECTs is proven beyond
question. Alvaro has not disputed that, ISTM you misread that. Pavan has
questioned that point but the results upthread are there, he explains he
hasn't read that yet.

The only question is how much cleanup on SELECT? Having one SELECT hit
10,000 cleanups while another hits 0 creates an unfairness and
unpredictability in the way we work. Maybe some people running a backup
actually like the fact it cleans the database; others think that is a bad
thing. Few people issuing large queries think it is good behaviour. Anybody
running replication also knows that this causes a huge slam of WAL which
can increase replication delay, which is a concern for HA.

That is how we arrive at the idea of a cleanup limit, further enhanced by a
limit that applies only to dirtying clean blocks, which we have 4? recent
votes in favour of.

I would personally be in favour of a parameter to control the limit, since
whatever we chose is right/wrong depending upon circumstances. I am however
comfortable with not having a parameter if people think it is hard to tune
that, which I agree it would be, hence no parameter in the patch.


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-15 Thread Simon Riggs
On 15 April 2015 at 08:04, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Apr 15, 2015 at 3:37 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 14 April 2015 at 21:53, Robert Haas robertmh...@gmail.com wrote:
 Peter commented previously that README.HOT should get an update.  The
 relevant section seems to be When can/should we prune or
 defragment?.

 That's easy enough to change once we agree to commit.

 I wonder if it would be a useful heuristic to still prune pages if
 those pages are already dirty.

 Useful for who? This is about responsibility. Why should someone
 performing a large SELECT take the responsibility for cleaning pages?

 Because it makes it subsequent accesses to the page cheaper.

Cheaper for whom?

 Of
 course, that applies in all cases, but when the page is already dirty,
 the cost of pruning it is probably quite small - we're going to have
 to write the page anyway, and pruning it before it gets evicted
 (perhaps even by our scan) will be cheaper than writing it now and
 writing it again after it's pruned.  When the page is clean, the cost
 of pruning is significantly higher.

We aren't going to have to write the page, but someone will.

In a single workload, the mix of actions can be useful. In separate
workloads, where some guy just wants to run a report or a backup, its
not right that we slow them down because of someone else's actions.

 I won't take responsibility for paying my neighbor's tax bill, but I
 might take responsibility for picking up his mail while he's on
 holiday.

That makes it sound like this is an occasional, non-annoying thing.

It's more like, whoever fetches the mail needs to fetch it for
everybody. So we are slowing down one person disproportionately, while
others fly through without penalty. There is no argument that one
workload necessarily needs to perform that on behalf of the other
workload.

The actions you suggest are reasonable and should ideally be the role
of a background process. But that doesn't mean in the absence of that
we should pay the cost in the foreground.

Let me apply this patch.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, RemoteDBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-15 Thread Simon Riggs
On 15 April 2015 at 09:10, Andres Freund and...@anarazel.de wrote:
 On 2015-04-15 08:42:33 -0400, Simon Riggs wrote:
  Because it makes it subsequent accesses to the page cheaper.

 Cheaper for whom?

 Everyone.

I think what you mean is Everyone else. It is demonstrably quicker
and more consistent for a process when it limits the amount of pruning
it does, as well as the fact that it causes additional WAL traffic
when it does so, causing replication lag.

I love it when someone cleans up for me. I just don't think they'll
accept the argument that they should clean up for me because it makes
their life easier.   Certainly doesn't work with my kids.


 I don't really see the downside to this suggestion.

The suggestion makes things better than they are now but is still less
than I have proposed.

If what you both mean is IMHO this is an acceptable compromise, I
can accept it also, at this point in the CF.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, RemoteDBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-15 Thread Robert Haas
On Wed, Apr 15, 2015 at 3:37 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 14 April 2015 at 21:53, Robert Haas robertmh...@gmail.com wrote:
 Peter commented previously that README.HOT should get an update.  The
 relevant section seems to be When can/should we prune or
 defragment?.

 That's easy enough to change once we agree to commit.

 I wonder if it would be a useful heuristic to still prune pages if
 those pages are already dirty.

 Useful for who? This is about responsibility. Why should someone
 performing a large SELECT take the responsibility for cleaning pages?

Because it makes it subsequent accesses to the page cheaper.  Of
course, that applies in all cases, but when the page is already dirty,
the cost of pruning it is probably quite small - we're going to have
to write the page anyway, and pruning it before it gets evicted
(perhaps even by our scan) will be cheaper than writing it now and
writing it again after it's pruned.  When the page is clean, the cost
of pruning is significantly higher.

I won't take responsibility for paying my neighbor's tax bill, but I
might take responsibility for picking up his mail while he's on
holiday.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-15 Thread Andres Freund
On 2015-04-15 08:42:33 -0400, Simon Riggs wrote:
  Because it makes it subsequent accesses to the page cheaper.
 
 Cheaper for whom?

Everyone. Including further readers. Following HOT chains in read mostly
workloads can be really expensive. If you have workloads with a 'hot'
value range that's frequently updated, but that range moves you can
easily end up with heavily chained tuples which won't soon be touched by
a writer again.

And writers will often not yet be able to prune the page because there's
still live readers for the older versions (like other updaters).

  Of
  course, that applies in all cases, but when the page is already dirty,
  the cost of pruning it is probably quite small - we're going to have
  to write the page anyway, and pruning it before it gets evicted
  (perhaps even by our scan) will be cheaper than writing it now and
  writing it again after it's pruned.  When the page is clean, the cost
  of pruning is significantly higher.
 
 We aren't going to have to write the page, but someone will.

If it's already dirty that doesn't change at all. *Not* pruning in that
moment actually will often *increase* the total amount of writes to the
OS. Because now the pruning will happen on the next write access or
vacuum - when the page already might have been undirtied.

I don't really see the downside to this suggestion.

 The actions you suggest are reasonable and should ideally be the role
 of a background process. But that doesn't mean in the absence of that
 we should pay the cost in the foreground.

I'm not sure that's true. A background process will either cause
additional read IO to find worthwhile pages, or it'll not find
worthwhile pages because they're already paged out.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-15 Thread Alvaro Herrera
Simon Riggs wrote:
 On 15 April 2015 at 09:10, Andres Freund and...@anarazel.de wrote:

  I don't really see the downside to this suggestion.
 
 The suggestion makes things better than they are now but is still less
 than I have proposed.
 
 If what you both mean is IMHO this is an acceptable compromise, I
 can accept it also, at this point in the CF.

Let me see if I understand things.

What we have now is: when reading a page, we also HOT-clean it.  This
runs HOT-cleanup a large number of times, and causes many pages to
become dirty.

Your patch is when reading a page, HOT-clean it, but only 5 times in
each scan.  This runs HOT-cleanup at most 5 times, and causes at most 5
pages to become dirty.

Robert's proposal is when reading a page, if dirty HOT-clean it; if not
dirty, also HOT-clean it but only 5 times in each scan.  This runs
HOT-cleanup some number of times (as many as there are dirty), and
causes at most 5 pages to become dirty.


Am I right in thinking that HOT-clean in a dirty page is something that
runs completely within CPU cache?  If so, it would be damn fast and
would have benefits for future readers, for very little cost.

Dirtying a page is very different; if buffer reads are common, the
system is later bogged down trying to find clean pages to read uncached
buffers (including the read-only scan itself, so it becomes slower.)


If I have understood things correctly, then I stand behind Robert's
suggestion.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-15 Thread Heikki Linnakangas

On 04/15/2015 05:44 PM, Alvaro Herrera wrote:

Simon Riggs wrote:

On 15 April 2015 at 09:10, Andres Freund and...@anarazel.de wrote:



I don't really see the downside to this suggestion.


The suggestion makes things better than they are now but is still less
than I have proposed.

If what you both mean is IMHO this is an acceptable compromise, I
can accept it also, at this point in the CF.


Let me see if I understand things.

What we have now is: when reading a page, we also HOT-clean it.  This
runs HOT-cleanup a large number of times, and causes many pages to
become dirty.

Your patch is when reading a page, HOT-clean it, but only 5 times in
each scan.  This runs HOT-cleanup at most 5 times, and causes at most 5
pages to become dirty.

Robert's proposal is when reading a page, if dirty HOT-clean it; if not
dirty, also HOT-clean it but only 5 times in each scan.  This runs
HOT-cleanup some number of times (as many as there are dirty), and
causes at most 5 pages to become dirty.


Am I right in thinking that HOT-clean in a dirty page is something that
runs completely within CPU cache?  If so, it would be damn fast and
would have benefits for future readers, for very little cost.


If there are many tuples on the page, it takes some CPU effort to scan 
all the HOT chains and move tuples around. Also, it creates a WAL 
record, which isn't free.


Another question is whether the patch can reliably detect whether it's 
doing a read-only scan or not. I haven't tested, but I suspect it'd 
not do pruning when you do something like INSERT INTO foo SELECT * FROM 
foo WHERE blah. I.e. when the target relation is referenced twice in 
the same statement: once as the target, and second time as a source. 
Maybe that's OK, though.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-15 Thread Alvaro Herrera
Heikki Linnakangas wrote:
 On 04/15/2015 05:44 PM, Alvaro Herrera wrote:

 Robert's proposal is when reading a page, if dirty HOT-clean it; if not
 dirty, also HOT-clean it but only 5 times in each scan.  This runs
 HOT-cleanup some number of times (as many as there are dirty), and
 causes at most 5 pages to become dirty.
 
 
 Am I right in thinking that HOT-clean in a dirty page is something that
 runs completely within CPU cache?  If so, it would be damn fast and
 would have benefits for future readers, for very little cost.
 
 If there are many tuples on the page, it takes some CPU effort to scan all
 the HOT chains and move tuples around. Also, it creates a WAL record, which
 isn't free.

But if the page is in CPU cache, the CPU effort shouldn't be all that
noticeable, should it?  That's my point, but then maybe I'm wrong.  Now,
the WAL logging is annoying, so let's limit that too -- do it at most
for, say, 20 dirty pages and at most 5 clean pages.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-15 Thread Robert Haas
On Wed, Apr 15, 2015 at 8:42 AM, Simon Riggs si...@2ndquadrant.com wrote:
 I won't take responsibility for paying my neighbor's tax bill, but I
 might take responsibility for picking up his mail while he's on
 holiday.

 That makes it sound like this is an occasional, non-annoying thing.

 It's more like, whoever fetches the mail needs to fetch it for
 everybody. So we are slowing down one person disproportionately, while
 others fly through without penalty. There is no argument that one
 workload necessarily needs to perform that on behalf of the other
 workload.

Sure there is.  It's called a tragedy of the commons - everybody acts
in their own selfish interest (it's not *my* responsibility to limit
grazing on public land, or prune this page that I'm not modifying) and
as a result some resource that everybody cares about (grass,
system-wide I/O) gets trashed to everyone's detriment.  Purely selfish
behavior can only be justified here if we assume that the selfish
actor intends to participate in the system only once: I'm going to run
one big reporting query which must run as fast as possible, and then
I'm getting on a space ship to Mars.  So if my refusal to do any
pruning during that reporting query causes lots of extra I/O on the
system ten minutes from now, I don't care, because I'll have left the
playing field forever at that point.

As Heikki points out, any HOT pruning operation generates WAL and has
a CPU cost.  However, pruning a page that is currently dirty
*decreases* the total volume of writes to the data files, whereas
pruning a page that is currently clean *increases* the total volume of
writes to the data files.  In the first case, if we prune the page
right now while it's still dirty, we can't possibly cause any
additional data-file writes, and we may save one, because it's
possible that someone else would later have pruned it when it was
clean and there was no other reason to dirty it.  In the second case,
if we prune the page that is currently clean, it will become dirty.
That will cost us no additional I/O if the page is again modified
before it's written out, but otherwise it costs an additional data
file write.  I think there's a big difference between those two cases.
Sure, from the narrow point of view of how much work it takes this
scan to process this page, it's always better not to prune.  But if
you make the more realistic assumption that you will keep on issuing
queries on the system, then what you're doing to the overall system
I/O load is pretty important.

By the way, was anything ever done about this:

http://www.postgresql.org/message-id/20140929091343.ga4...@alap3.anarazel.de

That's just a workload that is 5/6th pgbench -S and 1/6th pgbench,
which is in no way an unrealistic workload, and showed a significant
regression with an earlier version of the patch.  You seem very eager
to commit this patch after four months of inactivity, but I think this
is a pretty massive behavior change that deserves careful scrutiny
before it goes in.  If we push something that changes longstanding
behavior and can't even be turned off, and it regresses behavior for a
use case that common, our users are going to come after us with
pitchforks.  That's not to say some people won't be happy, but in my
experience it takes a lot of happy users to make up for getting
stabbed with even one pitchfork.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-15 Thread Heikki Linnakangas

On 04/15/2015 07:11 PM, Alvaro Herrera wrote:

Heikki Linnakangas wrote:

On 04/15/2015 05:44 PM, Alvaro Herrera wrote:



Robert's proposal is when reading a page, if dirty HOT-clean it; if not
dirty, also HOT-clean it but only 5 times in each scan.  This runs
HOT-cleanup some number of times (as many as there are dirty), and
causes at most 5 pages to become dirty.


Am I right in thinking that HOT-clean in a dirty page is something that
runs completely within CPU cache?  If so, it would be damn fast and
would have benefits for future readers, for very little cost.


If there are many tuples on the page, it takes some CPU effort to scan all
the HOT chains and move tuples around. Also, it creates a WAL record, which
isn't free.


But if the page is in CPU cache, the CPU effort shouldn't be all that
noticeable, should it?  That's my point, but then maybe I'm wrong.  Now,
the WAL logging is annoying, so let's limit that too -- do it at most
for, say, 20 dirty pages and at most 5 clean pages.


There isn't much difference between that and just doing it on first 5 
pages. Both of those numbers were pulled out of thin air, anyway. I'd 
rather just keep it simple.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-15 Thread Simon Riggs
On 15 April 2015 at 12:39, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Apr 15, 2015 at 8:42 AM, Simon Riggs si...@2ndquadrant.com wrote:
 I won't take responsibility for paying my neighbor's tax bill, but I
 might take responsibility for picking up his mail while he's on
 holiday.

 That makes it sound like this is an occasional, non-annoying thing.

 It's more like, whoever fetches the mail needs to fetch it for
 everybody. So we are slowing down one person disproportionately, while
 others fly through without penalty. There is no argument that one
 workload necessarily needs to perform that on behalf of the other
 workload.

 Sure there is.  It's called a tragedy of the commons - everybody acts
 in their own selfish interest (it's not *my* responsibility to limit
 grazing on public land, or prune this page that I'm not modifying) and
 as a result some resource that everybody cares about (grass,
 system-wide I/O) gets trashed to everyone's detriment.  Purely selfish
 behavior can only be justified here if we assume that the selfish
 actor intends to participate in the system only once: I'm going to run
 one big reporting query which must run as fast as possible, and then
 I'm getting on a space ship to Mars.  So if my refusal to do any
 pruning during that reporting query causes lots of extra I/O on the
 system ten minutes from now, I don't care, because I'll have left the
 playing field forever at that point.

It all depends upon who is being selfish. Why is a user selfish for
not wanting to clean every single block they scan, when the people
that made the mess do nothing and go faster 10 minutes from now?
Randomly and massively penalising large SELECTs makes no sense. Some
cleanup is OK, with reasonable limits, which is why that is proposed.

On 04/15/2015 05:44 PM, Alvaro Herrera wrote:
 Robert's proposal is when reading a page, if dirty HOT-clean it; if not
 dirty, also HOT-clean it but only 5 times in each scan.  This runs
 HOT-cleanup some number of times (as many as there are dirty), and
 causes at most 5 pages to become dirty.

My understanding of Robert's proposal was when reading a page,
HOT-clean it, but only do this up to 5 times on clean pages, but
continue to do this indefinitely when the page is already dirty..
Andres said that was the only way and I have agreed to it.

Are you now saying not to commit your proposal at all?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, RemoteDBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-15 Thread Alvaro Herrera
Simon Riggs wrote:
 On 15 April 2015 at 12:39, Robert Haas robertmh...@gmail.com wrote:

 On 04/15/2015 05:44 PM, Alvaro Herrera wrote:
  Robert's proposal is when reading a page, if dirty HOT-clean it; if not
  dirty, also HOT-clean it but only 5 times in each scan.  This runs
  HOT-cleanup some number of times (as many as there are dirty), and
  causes at most 5 pages to become dirty.
 
 My understanding of Robert's proposal was when reading a page,
 HOT-clean it, but only do this up to 5 times on clean pages, but
 continue to do this indefinitely when the page is already dirty..

To me, both statements look identical.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-15 Thread Simon Riggs
On 15 April 2015 at 16:01, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 Simon Riggs wrote:
 On 15 April 2015 at 12:39, Robert Haas robertmh...@gmail.com wrote:

 On 04/15/2015 05:44 PM, Alvaro Herrera wrote:
  Robert's proposal is when reading a page, if dirty HOT-clean it; if not
  dirty, also HOT-clean it but only 5 times in each scan.  This runs
  HOT-cleanup some number of times (as many as there are dirty), and
  causes at most 5 pages to become dirty.

 My understanding of Robert's proposal was when reading a page,
 HOT-clean it, but only do this up to 5 times on clean pages, but
 continue to do this indefinitely when the page is already dirty..

 To me, both statements look identical.

I didn't read it that way, apologies for any confusion. But that is good news.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, RemoteDBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-15 Thread Peter Geoghegan
On Wed, Apr 15, 2015 at 6:10 AM, Andres Freund and...@anarazel.de wrote:
  Of
  course, that applies in all cases, but when the page is already dirty,
  the cost of pruning it is probably quite small - we're going to have
  to write the page anyway, and pruning it before it gets evicted
  (perhaps even by our scan) will be cheaper than writing it now and
  writing it again after it's pruned.  When the page is clean, the cost
  of pruning is significantly higher.

 We aren't going to have to write the page, but someone will.

 If it's already dirty that doesn't change at all. *Not* pruning in that
 moment actually will often *increase* the total amount of writes to the
 OS. Because now the pruning will happen on the next write access or
 vacuum - when the page already might have been undirtied.

 I don't really see the downside to this suggestion.

+1. I think, in general, the opportunity cost of not pruning when a
page is already dirty is likely to be rather high. In general, it's
likely to be worth it.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-15 Thread Simon Riggs
On 14 April 2015 at 21:53, Robert Haas robertmh...@gmail.com wrote:

 Peter commented previously that README.HOT should get an update.  The
 relevant section seems to be When can/should we prune or
 defragment?.

That's easy enough to change once we agree to commit.

 I wonder if it would be a useful heuristic to still prune pages if
 those pages are already dirty.

Useful for who? This is about responsibility. Why should someone
performing a large SELECT take the responsibility for cleaning pages?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, RemoteDBA, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-14 Thread Robert Haas
On Tue, Apr 14, 2015 at 6:07 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On 11 March 2015 at 20:55, Peter Eisentraut pete...@gmx.net wrote:
 I don't know how to move forward.  We could give users a knob: This
 might make your queries faster or not -- good luck.  But of course
 nobody will like that either.

 What is clear is that large SELECT queries are doing the work VACUUM
 should do. We should not be doing large background tasks (block
 cleanup) during long running foreground tasks. But there is no need
 for changing behaviour during small SELECTs. So the setting of 4 gives
 current behaviour for small SELECTs and new behaviour for larger
 SELECTs.

Peter commented previously that README.HOT should get an update.  The
relevant section seems to be When can/should we prune or
defragment?.

I wonder if it would be a useful heuristic to still prune pages if
those pages are already dirty.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-14 Thread Simon Riggs
On 11 March 2015 at 20:55, Peter Eisentraut pete...@gmx.net wrote:

 I don't know how to move forward.  We could give users a knob: This
 might make your queries faster or not -- good luck.  But of course
 nobody will like that either.

What is clear is that large SELECT queries are doing the work VACUUM
should do. We should not be doing large background tasks (block
cleanup) during long running foreground tasks. But there is no need
for changing behaviour during small SELECTs. So the setting of 4 gives
current behaviour for small SELECTs and new behaviour for larger
SELECTs.

The OP said this...
op
We also make SELECT clean up blocks as it goes. That is useful in OLTP
workloads, but it means that large SQL queries and pg_dump effectively
do much the same work as VACUUM, generating huge amounts of I/O and
WAL on the master, the cost and annoyance of which is experienced
directly by the user. That is avoided on standbys.

Effects of that are that long running statements often run much longer
than we want, increasing bloat as a result. It also produces wildly
varying response times, depending upon extent of cleanup required.
/op

This is not a performance patch. This is about one user doing the
cleanup work for another. People running large SELECTs should not be
penalised. The patch has been shown to avoid that and no further
discussion should be required.

I don't really care whether we have a parameter for this or not. As
long as we have the main feature.

It's trivial to add/remove a parameter to control this. Currently
there isn't one.

I'd like to commit this.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, RemoteDBA, Training  Services


hot_disable.v9.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-04-14 Thread David Steele
On 4/14/15 6:07 PM, Simon Riggs wrote:
 On 11 March 2015 at 20:55, Peter Eisentraut pete...@gmx.net wrote:
 
 I don't know how to move forward.  We could give users a knob: This
 might make your queries faster or not -- good luck.  But of course
 nobody will like that either.
 
 What is clear is that large SELECT queries are doing the work VACUUM
 should do. We should not be doing large background tasks (block
 cleanup) during long running foreground tasks. But there is no need
 for changing behaviour during small SELECTs. So the setting of 4 gives
 current behaviour for small SELECTs and new behaviour for larger
 SELECTs.
 
 The OP said this...
 op
 We also make SELECT clean up blocks as it goes. That is useful in OLTP
 workloads, but it means that large SQL queries and pg_dump effectively
 do much the same work as VACUUM, generating huge amounts of I/O and
 WAL on the master, the cost and annoyance of which is experienced
 directly by the user. That is avoided on standbys.
 
 Effects of that are that long running statements often run much longer
 than we want, increasing bloat as a result. It also produces wildly
 varying response times, depending upon extent of cleanup required.
 /op
 
 This is not a performance patch. This is about one user doing the
 cleanup work for another. People running large SELECTs should not be
 penalised. The patch has been shown to avoid that and no further
 discussion should be required.
 
 I don't really care whether we have a parameter for this or not. As
 long as we have the main feature.
 
 It's trivial to add/remove a parameter to control this. Currently
 there isn't one.
 
 I'd like to commit this.

+1 from me.  One of the last databases I worked on had big raw
partitions that were written to and then sequentially scanned exactly
once before being dropped.  It was painful to see all those writes
happening for nothing.

In other cases there were sequential scans that happened directly after
the main writes, but then the next read might be days in the future (if
ever) and the system was basically idle for a while which would have
allowed vacuum to come in and do the job without affecting performance
of the main job.

I think that in batch-oriented databases this patch will definitely be a
boon to performance.

-- 
- David Steele
da...@pgmasters.net



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-03-12 Thread Andres Freund
On 2015-03-11 20:55:18 -0400, Peter Eisentraut wrote:
 I don't think so.  Andres basically wanted a nontrival algorithm to
 determine how much pruning to do during a read-only scan.  And Robert
 basically said, that's not really possible.

I don't think either of us made really strong statements.

 We have seen some benchmarks that show significant improvements.  We
 have seen some (constructed ones) that show problems.

FWIW, it's not that constructed. It's just a mixture of read with write
load.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-03-12 Thread Robert Haas
On Thu, Mar 12, 2015 at 3:48 PM, Peter Eisentraut pete...@gmx.net wrote:
 On 3/12/15 5:41 AM, Andres Freund wrote:
 On 2015-03-11 20:55:18 -0400, Peter Eisentraut wrote:
 I don't think so.  Andres basically wanted a nontrival algorithm to
 determine how much pruning to do during a read-only scan.  And Robert
 basically said, that's not really possible.

 I don't think either of us made really strong statements.

 I didn't mean to put words in your mouth.  I just wanted to summarize
 the thread as, Andres wanted more fine-tuning on the behavior, Robert
 expressed serious doubts that that will lead to an acceptable result.

Or to put that another way, I'm not sure there's one behavior here
that will please everybody.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-03-12 Thread Peter Eisentraut
On 3/12/15 5:41 AM, Andres Freund wrote:
 On 2015-03-11 20:55:18 -0400, Peter Eisentraut wrote:
 I don't think so.  Andres basically wanted a nontrival algorithm to
 determine how much pruning to do during a read-only scan.  And Robert
 basically said, that's not really possible.
 
 I don't think either of us made really strong statements.

I didn't mean to put words in your mouth.  I just wanted to summarize
the thread as, Andres wanted more fine-tuning on the behavior, Robert
expressed serious doubts that that will lead to an acceptable result.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-03-11 Thread Peter Eisentraut
On 12/17/14 3:39 AM, Simon Riggs wrote:
 On 15 December 2014 at 20:26, Jeff Janes jeff.ja...@gmail.com wrote:
 
 I still get the compiler error in contrib:

 pgstattuple.c: In function 'pgstat_heap':
 pgstattuple.c:279: error: too few arguments to function
 'heap_beginscan_strat'

 Should it pass false for the always_prune?
 
 Yes.
 
 New version attached.

README.HOT section When can/should we prune or defragment? needs a
major update as a result of this patch.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-03-11 Thread Peter Eisentraut
On 3/9/15 1:36 PM, Jeff Janes wrote:
 Did versions 7 and 8 of this patch address Andres' concern about
 performance regressions?

I don't think so.  Andres basically wanted a nontrival algorithm to
determine how much pruning to do during a read-only scan.  And Robert
basically said, that's not really possible.

The presented patch actually has a hardcoded prune limit of 4 per scan,
which I don't see mentioned in the discussion anywhere (except in very
early versions, where this was exposed as a knob).

I think most people were of the opinion that scans on system catalogs
should not be affected by this behavior change.  Makes sense to me:
System catalog bloat is likely a bigger problem than speeding up queries
on catalogs with large live data.

And then there is still some disagreement whether just turning this on
is tolerable for all uses.  Andres mentioned workloads that have trouble
getting a cleanup lock.  README.HOT seems to think that cleaning up
during reads is important because skipping over dead tuples is
expensive.  Nobody seems to like the idea of (implicitly) pushing more
responsibility on VACUUM.  We have seen some benchmarks that show
significant improvements.  We have seen some (constructed ones) that
show problems.

I don't know how to move forward.  We could give users a knob: This
might make your queries faster or not -- good luck.  But of course
nobody will like that either.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-03-09 Thread Jeff Janes
On Wed, Dec 17, 2014 at 12:39 AM, Simon Riggs si...@2ndquadrant.com wrote:

 On 15 December 2014 at 20:26, Jeff Janes jeff.ja...@gmail.com wrote:

  I still get the compiler error in contrib:
 
  pgstattuple.c: In function 'pgstat_heap':
  pgstattuple.c:279: error: too few arguments to function
  'heap_beginscan_strat'
 
  Should it pass false for the always_prune?

 Yes.

 New version attached.


This no longer applies directly against head, but if I apply to an older
checkout and then do git checkout -m origin it rolls forward cleanly.

Did versions 7 and 8 of this patch address Andres' concern about
performance regressions?

Thanks,

Jeff


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2015-02-12 Thread Michael Paquier
On Wed, Dec 17, 2014 at 5:39 PM, Simon Riggs si...@2ndquadrant.com wrote:

 On 15 December 2014 at 20:26, Jeff Janes jeff.ja...@gmail.com wrote:

  I still get the compiler error in contrib:
 
  pgstattuple.c: In function 'pgstat_heap':
  pgstattuple.c:279: error: too few arguments to function
  'heap_beginscan_strat'
 
  Should it pass false for the always_prune?

 Yes.

 New version attached.


Moved patch to CF 2015-02 with same status Needs review. It visibly needs
more work, and numbers to show increase in performance while only cases
showing up performance decrease showed up.
-- 
Michael


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-12-17 Thread Simon Riggs
On 15 December 2014 at 20:26, Jeff Janes jeff.ja...@gmail.com wrote:

 I still get the compiler error in contrib:

 pgstattuple.c: In function 'pgstat_heap':
 pgstattuple.c:279: error: too few arguments to function
 'heap_beginscan_strat'

 Should it pass false for the always_prune?

Yes.

New version attached.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


hot_disable.v8.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-12-11 Thread Simon Riggs
On 17 November 2014 at 22:08, Simon Riggs si...@2ndquadrant.com wrote:
 On 17 November 2014 21:09, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 What happened to this patch?  I'm going over something that could use
 the concept of clean some stuff up when reading this page, but only if
 we're already writing or similar.

 I see some cases were presented that had a performance decrease.  Did we
 get any numbers for the increase in performance in some other
 interesting cases?

 It's not dead; it just needs more work. Maybe for next CF, or you can now.

Latest version attached for next CF

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


hot_disable.v7.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-11-17 Thread Alvaro Herrera
What happened to this patch?  I'm going over something that could use
the concept of clean some stuff up when reading this page, but only if
we're already writing or similar.

I see some cases were presented that had a performance decrease.  Did we
get any numbers for the increase in performance in some other
interesting cases?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-11-17 Thread Simon Riggs
On 17 November 2014 21:09, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 What happened to this patch?  I'm going over something that could use
 the concept of clean some stuff up when reading this page, but only if
 we're already writing or similar.

 I see some cases were presented that had a performance decrease.  Did we
 get any numbers for the increase in performance in some other
 interesting cases?

It's not dead; it just needs more work. Maybe for next CF, or you can now.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-29 Thread Andres Freund
On 2014-09-28 19:51:36 +0100, Simon Riggs wrote:
 On 27 September 2014 09:29, Andres Freund and...@anarazel.de wrote:
  On 2014-09-27 10:23:33 +0300, Heikki Linnakangas wrote:
  This patch has gotten a fair amount of review, and has been rewritten once
  during the commitfest. I think it's pretty close to being committable, the
  only remaining question seems to be what to do with system catalogs. I'm
  marking this as Returned with feedback, I take it that Simon can proceed
  from here, outside the commitfest.
 
  FWIW, I don't think it is, even with that. As is it seems very likely
  that it's going to regress a fair share of workloads. At the very least
  it needs a fair amount of benchmarking beforehand.

 There is some doubt there. We've not seen a workload that does
 actually exhibit a negative behaviour.

Neither is there much data about the magnitude of positive effect the
patch has...

 I'm not saying one doesn't exist, but it does matter how common/likely
 it is. If anyone can present a performance test case that demonstrates
 a regression, I think it will make it easier to discuss how wide that
 case is and what we should do about it. Discussing whether to do
 various kinds of limited pruning are moot until that is clear.

I doubt it'll be hard to construct a case where it'll show. My first try
of using a pgbench scale 100, -M prepared, -cj8 with a custom file with
1 write and 5 read transaction yielded the following on my laptop:

Baseline:
 relname| pgbench_tellers
 pg_total_relation_size | 458752
 relname| pgbench_accounts
 pg_total_relation_size | 1590337536
 relname| pgbench_branches
 pg_total_relation_size | 286720
 relname| pgbench_history
 pg_total_relation_size | 49979392
Patched:
 relname| pgbench_tellers
 pg_total_relation_size | 516096
 relname| pgbench_accounts
 pg_total_relation_size | 1590337536
 relname| pgbench_branches
 pg_total_relation_size | 360448
 relname| pgbench_history
 pg_total_relation_size | 49528832

So, there's a noticeable increase in size. Mostly on the smaller tables,
so probably HOT cleanup was sometimes skipped during UPDATEs due to
locks.

Baseline was:
tps = 9655.486532 (excluding connections establishing)
Patched was:
tps = 9466.158701 (including connections establishing)

That's not a unrealistic testcase.

I'm pretty sure this could be made quite a bit more pronounced by not
using a uniform distribution in the pgbench runs. And selecting a test
that's more vulnerable to the change (e.g. using a wider distribution
for the read only statements than the modifying ones) would make the the
CPU overhead of the additional heap_hot_search_buffer() overhead
heavier.


 My memory was that it took months for people to understand the
 frequent update use case, since catching it in flagrante delicto was
 hard. That may be the case here, or not, but negative-benefit
 experimental results very welcome.

 Updated patch attached to address earlier comments.

contrib (at least pgstattuple) doesn't currently compile with
this... Easily patched up tho.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-29 Thread Andres Freund
On 2014-09-29 11:13:43 +0200, Andres Freund wrote:
 I doubt it'll be hard to construct a case where it'll show. My first try
 of using a pgbench scale 100, -M prepared, -cj8 with a custom file with
 1 write and 5 read transaction yielded the following on my laptop:

Hm. On second thought that testcase probably *decreased* the bad effects
of this because pgbench's readonly statements only touch one table...

patched:

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-29 Thread Andres Freund
On 2014-09-29 11:31:11 +0200, Andres Freund wrote:
 On 2014-09-29 11:13:43 +0200, Andres Freund wrote:
  I doubt it'll be hard to construct a case where it'll show. My first try
  of using a pgbench scale 100, -M prepared, -cj8 with a custom file with
  1 write and 5 read transaction yielded the following on my laptop:
 
 Hm. On second thought that testcase probably *decreased* the bad effects
 of this because pgbench's readonly statements only touch one table...

I've attached the file nonetheless, for posterities sake.

Note it has the scale hardcoded...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
\set scale 100
\set nbranches 1 * :scale
\set ntellers 10 * :scale
\set naccounts 10 * :scale
\setrandom aid 1 :naccounts
\setrandom bid 1 :nbranches
\setrandom tid 1 :ntellers
\setrandom delta -5000 5000
BEGIN;
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, 
:aid, :delta, CURRENT_TIMESTAMP);
END;

\set naccounts 10 * :scale
\setrandom aid 1 :naccounts
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

\set naccounts 10 * :scale
\setrandom aid 1 :naccounts
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

\set naccounts 10 * :scale
\setrandom aid 1 :naccounts
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

\set naccounts 10 * :scale
\setrandom aid 1 :naccounts
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

\set naccounts 10 * :scale
\setrandom aid 1 :naccounts
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-28 Thread Simon Riggs
On 27 September 2014 09:29, Andres Freund and...@anarazel.de wrote:
 On 2014-09-27 10:23:33 +0300, Heikki Linnakangas wrote:
 This patch has gotten a fair amount of review, and has been rewritten once
 during the commitfest. I think it's pretty close to being committable, the
 only remaining question seems to be what to do with system catalogs. I'm
 marking this as Returned with feedback, I take it that Simon can proceed
 from here, outside the commitfest.

 FWIW, I don't think it is, even with that. As is it seems very likely
 that it's going to regress a fair share of workloads. At the very least
 it needs a fair amount of benchmarking beforehand.

There is some doubt there. We've not seen a workload that does
actually exhibit a negative behaviour. I'm not saying one doesn't
exist, but it does matter how common/likely it is. If anyone can
present a performance test case that demonstrates a regression, I
think it will make it easier to discuss how wide that case is and what
we should do about it. Discussing whether to do various kinds of
limited pruning are moot until that is clear.

My memory was that it took months for people to understand the
frequent update use case, since catching it in flagrante delicto was
hard. That may be the case here, or not, but negative-benefit
experimental results very welcome.

Updated patch attached to address earlier comments.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


hot_disable.v6.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-27 Thread Heikki Linnakangas
This patch has gotten a fair amount of review, and has been rewritten 
once during the commitfest. I think it's pretty close to being 
committable, the only remaining question seems to be what to do with 
system catalogs. I'm marking this as Returned with feedback, I take it 
that Simon can proceed from here, outside the commitfest.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-27 Thread Andres Freund
On 2014-09-27 10:23:33 +0300, Heikki Linnakangas wrote:
 This patch has gotten a fair amount of review, and has been rewritten once
 during the commitfest. I think it's pretty close to being committable, the
 only remaining question seems to be what to do with system catalogs. I'm
 marking this as Returned with feedback, I take it that Simon can proceed
 from here, outside the commitfest.

FWIW, I don't think it is, even with that. As is it seems very likely
that it's going to regress a fair share of workloads. At the very least
it needs a fair amount of benchmarking beforehand.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-21 Thread Robert Haas
On Fri, Sep 19, 2014 at 5:42 PM, Andres Freund and...@anarazel.de wrote:
 On 2014-09-19 17:29:08 -0400, Robert Haas wrote:
  I generally have serious doubts about disabling it generally for
  read workloads. I imagine it e.g. will significantly penalize
  workloads where its likely that a cleanup lock can't be acquired
  every time...

 I share that doubt.  But I understand why Simon wants to do something,
 too, because the current situation is not great either.

 Right, I totally agree. I doubt a simple approach like this will work in
 the general case, but I think something needs to be done.

 I think limiting the amount of HOT cleanup for readonly queries is a
 good idea, but I think it has to be gradual. Say after a single cleaned
 up page at least another 500 pages need to have been touched till the
 next hot cleanup. That way a single query won't be penalized with
 cleaning up everything, but there'll be some progress.

I tried this kind of thing several years ago with hint-bit-setting and
was unimpressed by the results.

http://www.postgresql.org/message-id/aanlktik5qzr8wts0mqcwwmnp-qhgrdky5av5aob7w...@mail.gmail.com
http://www.postgresql.org/message-id/aanlktimgkag7wdu-x77gnv2gh6_qo5ss1u5b6q1ms...@mail.gmail.com

Granted, I never tried a ratio as low as 500:1, and HOT pruning is not
the same thing as setting hint bits, but I think the basic problems
are similar, namely:

1. You can't know how many times the page is going to be referenced in
the future before it again gets modified.  If that number is small,
then you shouldn't bother with hint bits, or HOT-pruning, or freezing.
But if it's big, you should do all of those things as soon as possible
because the benefits are quite significant.  Therefore, any change in
this area is  guaranteed to lose on some easy-to-construct workload,
because I just described two of them that want opposing things.

2. Dirtying every N'th page is a great way to generate lots of random
I/O that will quite possibly make your disk almost as sad - or even
sadder - than dirtying all of them, but without anywhere as near as
much performance benefit.

Variations on this idea have been proposed so many times over the
years that I'm tempted to give some credence to the theory that we
ought to adopt one of them.  But there will certainly be losers, as
well as winners.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-19 Thread Robert Haas
On Mon, Sep 15, 2014 at 5:13 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On 15 September 2014 17:09, Robert Haas robertmh...@gmail.com wrote:
 Do we really want to disable HOT for all catalog scans?

 The intention of the patch is that catalog scans are treated
 identically to non-catalog scans. The idea here is that HOT cleanup
 only occurs on scans on target relations, so only INSERT, UPDATE and
 DELETE do HOT cleanup.

 It's possible that many catalog scans don't follow the normal target
 relation logic, so we might argue we should use HOT every time. OTOH,
 since we now have separate catalog xmins we may find that using HOT on
 catalogs is no longer effective. So I could go either way on how to
 proceed; its an easy change either way.

What I'm thinking about is that the smarts to enable pruning is all in
the executor nodes.  So anything that updates the catalog without
going through the executor will never be subject to pruning.  That
includes nearly all catalog-modifying code throughout the backend.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-19 Thread Simon Riggs
On 19 September 2014 13:04, Robert Haas robertmh...@gmail.com wrote:

 What I'm thinking about is that the smarts to enable pruning is all in
 the executor nodes.  So anything that updates the catalog without
 going through the executor will never be subject to pruning.  That
 includes nearly all catalog-modifying code throughout the backend.

Are you saying this is a problem or a benefit? (and please explain why).

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-19 Thread Andres Freund
On September 19, 2014 10:16:35 PM CEST, Simon Riggs si...@2ndquadrant.com 
wrote:
On 19 September 2014 13:04, Robert Haas robertmh...@gmail.com wrote:

 What I'm thinking about is that the smarts to enable pruning is all
in
 the executor nodes.  So anything that updates the catalog without
 going through the executor will never be subject to pruning.  That
 includes nearly all catalog-modifying code throughout the backend.

Are you saying this is a problem or a benefit? (and please explain
why).

I have no idea what Robert is thinking of, but I'd imagine its horrible for 
workloads with catalog bloat. Like ones involving temp tables.

I generally have serious doubts about disabling it generally for read 
workloads. I imagine it e.g. will significantly penalize workloads where its 
likely that a cleanup lock can't be acquired every time...

Andres

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-19 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 On September 19, 2014 10:16:35 PM CEST, Simon Riggs si...@2ndquadrant.com 
 wrote:
 Are you saying this is a problem or a benefit? (and please explain
 why).

 I have no idea what Robert is thinking of, but I'd imagine its horrible for 
 workloads with catalog bloat. Like ones involving temp tables.

Yeah.  But it's also the case that we know a good deal more about the
access patterns for system-driven catalog updates than we do about user
queries.  ISTM we could probably suppress HOT pruning during catalog
*scans* and instead try to do it when a system-driven heap_update
occurs.

Having said that, this could reasonably be considered outside the scope
of a patch that's trying to improve the behavior for user queries.
But if the patch author doesn't want to expand the scope like that,
ISTM he ought to ensure that the behavior *doesn't* change for system
accesses, rather than trying to convince us that disabling HOT for
system updates is a good idea.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-19 Thread Robert Haas
On Fri, Sep 19, 2014 at 4:30 PM, Andres Freund and...@anarazel.de wrote:
 On September 19, 2014 10:16:35 PM CEST, Simon Riggs si...@2ndquadrant.com 
 wrote:
On 19 September 2014 13:04, Robert Haas robertmh...@gmail.com wrote:

 What I'm thinking about is that the smarts to enable pruning is all
in
 the executor nodes.  So anything that updates the catalog without
 going through the executor will never be subject to pruning.  That
 includes nearly all catalog-modifying code throughout the backend.

Are you saying this is a problem or a benefit? (and please explain
why).

 I have no idea what Robert is thinking of, but I'd imagine its horrible for 
 workloads with catalog bloat. Like ones involving temp tables.

Right, that's what I was going for.

 I generally have serious doubts about disabling it generally for read 
 workloads. I imagine it e.g. will significantly penalize workloads where its 
 likely that a cleanup lock can't be acquired every time...

I share that doubt.  But I understand why Simon wants to do something,
too, because the current situation is not great either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-19 Thread Andres Freund
On 2014-09-19 17:29:08 -0400, Robert Haas wrote:
  I generally have serious doubts about disabling it generally for
  read workloads. I imagine it e.g. will significantly penalize
  workloads where its likely that a cleanup lock can't be acquired
  every time...
 
 I share that doubt.  But I understand why Simon wants to do something,
 too, because the current situation is not great either.

Right, I totally agree. I doubt a simple approach like this will work in
the general case, but I think something needs to be done.

I think limiting the amount of HOT cleanup for readonly queries is a
good idea, but I think it has to be gradual. Say after a single cleaned
up page at least another 500 pages need to have been touched till the
next hot cleanup. That way a single query won't be penalized with
cleaning up everything, but there'll be some progress.

The other thing I think might be quite worthwile would be to abort hot
cleanup when the gain is only minimal. If e.g. only 1 small tuple is
removed from a half full page it's not worth the cost of the wal logging
et al.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-19 Thread Simon Riggs
On 19 September 2014 15:35, Tom Lane t...@sss.pgh.pa.us wrote:

 Having said that, this could reasonably be considered outside the scope
 of a patch that's trying to improve the behavior for user queries.
 But if the patch author doesn't want to expand the scope like that,
 ISTM he ought to ensure that the behavior *doesn't* change for system
 accesses, rather than trying to convince us that disabling HOT for
 system updates is a good idea.

As I said, I could make an argument to go either way, so I was unsure.

I'm happy to avoid changing behaviour for catalog scans in this patch.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-19 Thread Andres Freund
On 2014-09-19 16:35:19 -0400, Tom Lane wrote:
 Andres Freund and...@anarazel.de writes:
  On September 19, 2014 10:16:35 PM CEST, Simon Riggs si...@2ndquadrant.com 
  wrote:
  Are you saying this is a problem or a benefit? (and please explain
  why).
 
  I have no idea what Robert is thinking of, but I'd imagine its horrible for 
  workloads with catalog bloat. Like ones involving temp tables.
 
 Yeah.  But it's also the case that we know a good deal more about the
 access patterns for system-driven catalog updates than we do about user
 queries.  ISTM we could probably suppress HOT pruning during catalog
 *scans* and instead try to do it when a system-driven heap_update
 occurs.
 
 Having said that, this could reasonably be considered outside the scope
 of a patch that's trying to improve the behavior for user queries.
 But if the patch author doesn't want to expand the scope like that,
 ISTM he ought to ensure that the behavior *doesn't* change for system
 accesses, rather than trying to convince us that disabling HOT for
 system updates is a good idea.

I think it'd have to change for anything not done via the
executor. There definitely is user defined code out there doing manual
heap_* stuff. I know because i've written some. And I know I'm not the
only one.

If such paths suddenly stop doing HOT cleanup we'll cause a noticeable
amount of pain.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-17 Thread Emanuel Calvo

El 15/09/14 18:13, Simon Riggs escribió:
 On 15 September 2014 17:09, Robert Haas robertmh...@gmail.com wrote:

 Do we really want to disable HOT for all catalog scans?
 The intention of the patch is that catalog scans are treated
 identically to non-catalog scans. The idea here is that HOT cleanup
 only occurs on scans on target relations, so only INSERT, UPDATE and
 DELETE do HOT cleanup.

 It's possible that many catalog scans don't follow the normal target
 relation logic, so we might argue we should use HOT every time. OTOH,
 since we now have separate catalog xmins we may find that using HOT on
 catalogs is no longer effective. So I could go either way on how to
 proceed; its an easy change either way.


I setup a more concurrent scenario and the difference is quite larger:

Without patch:
1st concurrent with writes:
tps = 5705.261620 (including connections establishing)
tps = 5945.338557 (excluding connections establishing)
2nd no writes being executed:
tps = 9988.792575 (including connections establishing)
tps = 11059.671351 (excluding connections establishing)


Patched version:
1st concurrent with writes:
tps = 9476.741707 (including connections establishing)
tps = 10274.831185 (excluding connections establishing)
2nd no writes being executed:
tps = 12993.644808 (including connections establishing)
tps = 15171.214744 (excluding connections establishing)


Stats (writes have been run with a time limit, not by tx):

hotcleanup=# select relname ,n_live_tup, n_dead_tup, n_tup_hot_upd from
pg_stat_user_tables where relname ~ 'pgbench';
 relname  | n_live_tup | n_dead_tup | n_tup_hot_upd
--+++---
 pgbench_tellers  |500 |  0 |   2044192
 pgbench_accounts |5109728 | 310842 |   1969264
 pgbench_history  |2265882 |  0 | 0
 pgbench_branches | 50 |  0 |   2237167
(4 rows)

hotcleanup=# select relname ,n_live_tup, n_dead_tup, n_tup_hot_upd from
pg_stat_user_tables where relname ~ 'pgbench';
 relname  | n_live_tup | n_dead_tup | n_tup_hot_upd
--+++---
 pgbench_history  |2148946 |  0 | 0
 pgbench_tellers  |500 |  0 |   1969675
 pgbench_branches | 50 |  0 |   2150655
 pgbench_accounts |5098774 | 300123 |   1897484
(4 rows)

  

I ran the regression tests over the patched version and they passed ok.


-- 
--
Emanuel Calvo http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-17 Thread Simon Riggs
On 15 September 2014 22:13, Simon Riggs si...@2ndquadrant.com wrote:
 On 15 September 2014 17:09, Robert Haas robertmh...@gmail.com wrote:

 Do we really want to disable HOT for all catalog scans?

 The intention of the patch is that catalog scans are treated
 identically to non-catalog scans. The idea here is that HOT cleanup
 only occurs on scans on target relations, so only INSERT, UPDATE and
 DELETE do HOT cleanup.

Since INSERT, UPDATE and DELETE can only be called when
!RecoveryInProgress(), we can completely avoid making this test at the
top of each heap_page_prune_opt() call.

I very much like the simplicity of saying no target, no cleanup.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-15 Thread Robert Haas
On Sun, Sep 14, 2014 at 4:37 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On 12 September 2014 18:19, Simon Riggs si...@2ndquadrant.com wrote:
 On 12 September 2014 15:30, Tom Lane t...@sss.pgh.pa.us wrote:

 After a little bit I remembered there was already a function for this.
 So specifically, I'd suggest using ExecRelationIsTargetRelation()
 to decide whether to mark the scan as requiring pruning.

 Sounds cool. Thanks both, this is sounding like a viable route now.

 Yes, this is viable.

 Patch attached, using Alvaro's idea of use-case specific pruning and
 Tom's idea of aiming at target relations. Patch uses or extends
 existing infrastructure, so its shorter than it might have been, yet
 with all that bufmgr yuck removed.

 This is very, very good because while going through this I notice the
 dozen or more places where we were pruning blocks in annoying places I
 didn't even know about such as about 4-5 constraint checks. In more
 than a few DDL commands like ALTER TABLE and CLUSTER we were even
 pruning the old relation prior to rewrite.

Do we really want to disable HOT for all catalog scans?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-15 Thread Simon Riggs
On 15 September 2014 17:09, Robert Haas robertmh...@gmail.com wrote:

 Do we really want to disable HOT for all catalog scans?

The intention of the patch is that catalog scans are treated
identically to non-catalog scans. The idea here is that HOT cleanup
only occurs on scans on target relations, so only INSERT, UPDATE and
DELETE do HOT cleanup.

It's possible that many catalog scans don't follow the normal target
relation logic, so we might argue we should use HOT every time. OTOH,
since we now have separate catalog xmins we may find that using HOT on
catalogs is no longer effective. So I could go either way on how to
proceed; its an easy change either way.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-15 Thread Emanuel Calvo

El 14/09/14 17:37, Simon Riggs escribió:
 On 12 September 2014 18:19, Simon Riggs si...@2ndquadrant.com wrote:
 On 12 September 2014 15:30, Tom Lane t...@sss.pgh.pa.us wrote:
 After a little bit I remembered there was already a function for this.
 So specifically, I'd suggest using ExecRelationIsTargetRelation()
 to decide whether to mark the scan as requiring pruning.
 Sounds cool. Thanks both, this is sounding like a viable route now.
 Yes, this is viable.

 Patch attached, using Alvaro's idea of use-case specific pruning and
 Tom's idea of aiming at target relations. Patch uses or extends
 existing infrastructure, so its shorter than it might have been, yet
 with all that bufmgr yuck removed.

 This is very, very good because while going through this I notice the
 dozen or more places where we were pruning blocks in annoying places I
 didn't even know about such as about 4-5 constraint checks. In more
 than a few DDL commands like ALTER TABLE and CLUSTER we were even
 pruning the old relation prior to rewrite.


A simple performance test with the following variables:
LOOP=50
CONN=60
TXSS=500
SCALE=30


Select only:
WITH PATCH
Average:  20716.1 tps

NO PATCH
Average:  19141.7 tps


With writes:
WITH PATCH
Average:  2602.65

NO PATCH
Average:  2565.32


TODO:
- Consistency check.
- ALTER and CLUSTER test.






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-14 Thread Simon Riggs
On 12 September 2014 18:19, Simon Riggs si...@2ndquadrant.com wrote:
 On 12 September 2014 15:30, Tom Lane t...@sss.pgh.pa.us wrote:

 After a little bit I remembered there was already a function for this.
 So specifically, I'd suggest using ExecRelationIsTargetRelation()
 to decide whether to mark the scan as requiring pruning.

 Sounds cool. Thanks both, this is sounding like a viable route now.

Yes, this is viable.

Patch attached, using Alvaro's idea of use-case specific pruning and
Tom's idea of aiming at target relations. Patch uses or extends
existing infrastructure, so its shorter than it might have been, yet
with all that bufmgr yuck removed.

This is very, very good because while going through this I notice the
dozen or more places where we were pruning blocks in annoying places I
didn't even know about such as about 4-5 constraint checks. In more
than a few DDL commands like ALTER TABLE and CLUSTER we were even
pruning the old relation prior to rewrite.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


hot_disable.v5.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-12 Thread Michael Paquier
On Mon, Feb 3, 2014 at 3:42 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Wed, Jan 15, 2014 at 2:43 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 8 January 2014 08:33, Simon Riggs si...@2ndquadrant.com wrote:

 Patch attached, implemented to reduce writes by SELECTs only.

This patch is registered in this CF. It does not apply anymore and
needs a rebase. Robert and Amit have provided as well some comments
but they have not been addressed. Is it fair to mark it as returned
with feedback even if it has not been reviewed within the last month?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-12 Thread Michael Paquier
On Fri, Sep 12, 2014 at 3:19 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Mon, Feb 3, 2014 at 3:42 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Wed, Jan 15, 2014 at 2:43 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 8 January 2014 08:33, Simon Riggs si...@2ndquadrant.com wrote:

 Patch attached, implemented to reduce writes by SELECTs only.

 This patch is registered in this CF. It does not apply anymore and
 needs a rebase. Robert and Amit have provided as well some comments
 but they have not been addressed. Is it fair to mark it as returned
 with feedback even if it has not been reviewed within the last month?
For the time being, status has been changed to waiting on author.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-12 Thread Alvaro Herrera
Michael Paquier wrote:
 On Fri, Sep 12, 2014 at 3:19 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
  On Mon, Feb 3, 2014 at 3:42 PM, Amit Kapila amit.kapil...@gmail.com wrote:
  On Wed, Jan 15, 2014 at 2:43 AM, Simon Riggs si...@2ndquadrant.com wrote:
  On 8 January 2014 08:33, Simon Riggs si...@2ndquadrant.com wrote:
 
  Patch attached, implemented to reduce writes by SELECTs only.
 
  This patch is registered in this CF. It does not apply anymore and
  needs a rebase. Robert and Amit have provided as well some comments
  but they have not been addressed. Is it fair to mark it as returned
  with feedback even if it has not been reviewed within the last month?
 For the time being, status has been changed to waiting on author.

As it happens, I was studying this patch yesterday on the flight back
home.  I gave it a quick look; I noticed it was in the commitfest and
hadn't seen any review activity for many months, which seemed odd.

Anyway I first read the whole thread to know what to focus on, before
going over the patch itself.  Once I finished reading the emails, I had
a vague idea of how I thought it would work: my thinking was that
heap/index scans would either call heap_page_prune_opt, or not,
depending on whether they were part of a read-only executor node.  So if
you have a query that updates a certain table, and while doing so scans
another table in read-only mode, then the HOT updates would be enabled
for the table being written, but disabled for the one being read.

As it turns out, the patch as written is nothing at all like that, and
TBH I don't think I like it very much.

My idea is that we would have a new executor flag, say
EXEC_FLAG_READ_ONLY; we would set it on nodes that are known to be
read-only, and reset it on those that aren't, such as LockRows and
ModifyTable (obviously we need to pass it down correctly from parent to
children).  Then in ExecInitSeqScan and ExecInitIndexScan, if we see the
flag set, we call heap/index_set_allow_prune(false) for the heap scan;
same thing in index scans.  (I envisioned it as a boolean rather than
enabling a certain number of cleanups per scan.)

I tried to code this but I think it doesn't work correctly, and no time
for debug currently.  Anyway let me know what you think of this general
idea.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 2b336b0..b859f89 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -342,9 +342,11 @@ heapgetpage(HeapScanDesc scan, BlockNumber page)
 	snapshot = scan-rs_snapshot;
 
 	/*
-	 * Prune and repair fragmentation for the whole page, if possible.
+	 * Prune and repair fragmentation for the whole page, if possible and
+	 * enabled.
 	 */
-	heap_page_prune_opt(scan-rs_rd, buffer);
+	if (scan-rs_allow_prune)
+		heap_page_prune_opt(scan-rs_rd, buffer);
 
 	/*
 	 * We must hold share lock on the buffer content while examining tuple
@@ -1349,6 +1351,9 @@ heap_beginscan_internal(Relation relation, Snapshot snapshot,
 	scan-rs_allow_sync = allow_sync;
 	scan-rs_temp_snap = temp_snap;
 
+	/* HOT pruning is initially allowed; caller can turn it off if desired */
+	scan-rs_allow_prune = true;
+
 	/*
 	 * we can use page-at-a-time mode if it's an MVCC-safe snapshot
 	 */
@@ -1440,6 +1445,12 @@ heap_endscan(HeapScanDesc scan)
 	pfree(scan);
 }
 
+void
+heap_set_allow_prune(HeapScanDesc scan, bool allow)
+{
+	scan-rs_allow_prune = allow;
+}
+
 /* 
  *		heap_getnext	- retrieve next tuple in scan
  *
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index 53cf96f..cb2f075 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -253,6 +253,7 @@ index_beginscan(Relation heapRelation,
 	 */
 	scan-heapRelation = heapRelation;
 	scan-xs_snapshot = snapshot;
+	scan-xs_allow_prune = true;
 
 	return scan;
 }
@@ -387,6 +388,12 @@ index_endscan(IndexScanDesc scan)
 	IndexScanEnd(scan);
 }
 
+void
+index_set_allow_prune(IndexScanDesc scan, bool allow_prune)
+{
+	scan-xs_allow_prune = allow_prune;
+}
+
 /* 
  *		index_markpos  - mark a scan position
  * 
@@ -520,9 +527,9 @@ index_fetch_heap(IndexScanDesc scan)
 			 ItemPointerGetBlockNumber(tid));
 
 		/*
-		 * Prune page, but only if we weren't already on this page
+		 * Prune page if enabled, but only if we weren't already on this page
 		 */
-		if (prev_buf != scan-xs_cbuf)
+		if (prev_buf != scan-xs_cbuf  scan-xs_allow_prune)
 			heap_page_prune_opt(scan-heapRelation, scan-xs_cbuf);
 	}
 
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index fbd7492..8f4964c 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -1351,7 +1351,7 @@ BeginCopy(bool is_from,
 		 *
 		 * ExecutorStart computes a result tupdesc for us
 		 */
-		

Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-12 Thread Simon Riggs
On 12 September 2014 14:54, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 My idea is that we would have a new executor flag, say
 EXEC_FLAG_READ_ONLY; we would set it on nodes that are known to be
 read-only, and reset it on those that aren't, such as LockRows and
 ModifyTable (obviously we need to pass it down correctly from parent to
 children).  Then in ExecInitSeqScan and ExecInitIndexScan, if we see the
 flag set, we call heap/index_set_allow_prune(false) for the heap scan;
 same thing in index scans.  (I envisioned it as a boolean rather than
 enabling a certain number of cleanups per scan.)

 I tried to code this but I think it doesn't work correctly, and no time
 for debug currently.  Anyway let me know what you think of this general
 idea.

Thanks for looking at this.

My concern was to ensure that UPDATEs and DELETEs continue to call
heap_page_prune_opt while larger SELECTs do not.

This is achieved without a counter, so after some thought like it
better; simple is good. Happy to progress from here, or you can?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-12 Thread Alvaro Herrera
Simon Riggs wrote:
 On 12 September 2014 14:54, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 
  My idea is that we would have a new executor flag, say
  EXEC_FLAG_READ_ONLY; we would set it on nodes that are known to be
  read-only, and reset it on those that aren't, such as LockRows and
  ModifyTable (obviously we need to pass it down correctly from parent to
  children).  Then in ExecInitSeqScan and ExecInitIndexScan, if we see the
  flag set, we call heap/index_set_allow_prune(false) for the heap scan;
  same thing in index scans.  (I envisioned it as a boolean rather than
  enabling a certain number of cleanups per scan.)
 
  I tried to code this but I think it doesn't work correctly, and no time
  for debug currently.  Anyway let me know what you think of this general
  idea.
 
 Thanks for looking at this.
 
 My concern was to ensure that UPDATEs and DELETEs continue to call
 heap_page_prune_opt while larger SELECTs do not.
 
 This is achieved without a counter, so after some thought like it
 better; simple is good. Happy to progress from here, or you can?

Please feel free to take over.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-12 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On 12 September 2014 14:54, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 My idea is that we would have a new executor flag, say
 EXEC_FLAG_READ_ONLY; we would set it on nodes that are known to be
 read-only, and reset it on those that aren't, such as LockRows and
 ModifyTable (obviously we need to pass it down correctly from parent to
 children).  Then in ExecInitSeqScan and ExecInitIndexScan, if we see the
 flag set, we call heap/index_set_allow_prune(false) for the heap scan;
 same thing in index scans.  (I envisioned it as a boolean rather than
 enabling a certain number of cleanups per scan.)
 
 I tried to code this but I think it doesn't work correctly, and no time
 for debug currently.  Anyway let me know what you think of this general
 idea.

 Thanks for looking at this.

 My concern was to ensure that UPDATEs and DELETEs continue to call
 heap_page_prune_opt while larger SELECTs do not.

I think there's another way to think about it: what about saying that
the query's target relation(s) are subject to pruning, while others
are not?  Then you do not need an executor flag, you just need to
look at the estate-es_result_relations array (or maybe even only at
estate-es_result_relation_info).  This would have the advantage of
doing what-I-think-is-the-right-thing for updates/deletes involving
joins to other tables.  The mechanism Alvaro describes would probably
have to prune all tables involved in such a query; do we really want
that?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-12 Thread Tom Lane
I wrote:
 I think there's another way to think about it: what about saying that
 the query's target relation(s) are subject to pruning, while others
 are not?  Then you do not need an executor flag, you just need to
 look at the estate-es_result_relations array (or maybe even only at
 estate-es_result_relation_info).

After a little bit I remembered there was already a function for this.
So specifically, I'd suggest using ExecRelationIsTargetRelation()
to decide whether to mark the scan as requiring pruning.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-09-12 Thread Simon Riggs
On 12 September 2014 15:30, Tom Lane t...@sss.pgh.pa.us wrote:
 I wrote:
 I think there's another way to think about it: what about saying that
 the query's target relation(s) are subject to pruning, while others
 are not?  Then you do not need an executor flag, you just need to
 look at the estate-es_result_relations array (or maybe even only at
 estate-es_result_relation_info).

 After a little bit I remembered there was already a function for this.
 So specifically, I'd suggest using ExecRelationIsTargetRelation()
 to decide whether to mark the scan as requiring pruning.

Sounds cool. Thanks both, this is sounding like a viable route now.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-02-02 Thread Amit Kapila
On Wed, Jan 15, 2014 at 2:43 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 8 January 2014 08:33, Simon Riggs si...@2ndquadrant.com wrote:

 Patch attached, implemented to reduce writes by SELECTs only.

This is really a valuable improvement over current SELECT behaviour
w.r.t Writes.

While going though patch, I observed few points, so thought of
sharing with you:

+ /*
+ * If we are tracking pruning in SELECTs then we can only get
+ * here by heap_page_prune_opt() call that cleans a block,
+ * so in that case, register it as a pruning operation.
+ * Make sure we don't double count during VACUUMs.
+ */
+ if (PrunePageDirtyLimit  -1)
+ PrunePageDirty++;

a. As PrunePageDirtyLimit variable is not initialized for DDL flow,
   any statement like Create Function().. will have value of
   PrunePageDirtyLimit as 4 (default) and in such cases MarkBufferDirty()
   will increment the wrong counter.

b. For DDL statements like Create Materialized view, it will behave as
   Select statement.
   Ex.
   Create Materialized view mv1 as select * from t1;

   Now here I think it might not be a problem, because for t1 anyway there
   will be no write, so skipping pruning should not be a problem and for
   materialized views also there will no dead rows, so skipping should be
   okay, but I think it is not strictly adhering to statement to reduce writes
   by SELECTs only and purpose of patch which is to avoid only when
   Top level statement is SELECT.
   Do you think it's better to consider such cases and optimize for them
   or should we avoid it by following thumb rule that pruning will be avoided
   only for top level SELECT?

2. + Allow cleanup of shared buffers by foreground processes, allowing
later cleanup by VACUUM,
This line is not clear, what do you mean to say by allowing later cleanup
by VACUUM, if already foreground process has done cleanup, then it
should save effort of Vacuum.


In general, though both the optimisations (allow_buffer_cleanup and
prune_page_dirty_limit )  used in patch have similarity in the sense
that they will be used to avoid pruning, but still I feel they are for different
cases (READ ONLY OP and WRITE ON SMALL TABLES) and also as there
are more people inclined to do this for only SELECT operations, do you think
it will be a good idea to make them as separate patches?

I think there can be some applications or use cases which can be benefited
by avoiding pruning for WRITE ON SMALL TABLES, but the case for SELECT
is more general and more applications can get benefit with this optimisation,so
it would be better if we first try to accomplish that case.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-01-17 Thread Robert Haas
On Wed, Jan 15, 2014 at 5:14 PM, Simon Riggs si...@2ndquadrant.com wrote:
 We already know that HOT is ineffective in areas of high contention
 (previous thread by me). Prior experience was that smaller tables
 didn't show much apparent benefit from using HOT either; its
 effectiveness was limited to medium and large tables being updated.
 The two already stated use cases that would apply are these ones

Do you have a link to that previous thread?  I don't happen to recall
that conversation.

I've found that HOT can be very important on smaller tables, so I'm
skeptical of that as a general conclusion.  What I think might be true
is that if VACUUM is going to hit the table often enough to make you
happy, then you don't really need HOT.  In other words, if the update
rate is non-zero but low, not too much cruft will accumulate before
the table gets vacuumed, and you may be OK.  If the update rate is
high, though, I think disabling HOT will be painful on a table of any
size.  There might be exceptions, but I can't think of what the are
off-hand.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Turning off HOT/Cleanup sometimes

2014-01-15 Thread Robert Haas
On Tue, Jan 14, 2014 at 4:13 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On 8 January 2014 08:33, Simon Riggs si...@2ndquadrant.com wrote:
 VACUUM cleans up blocks, which is nice because it happens offline in a
 lazy manner.

 We also make SELECT clean up blocks as it goes. That is useful in OLTP
 workloads, but it means that large SQL queries and pg_dump effectively
 do much the same work as VACUUM, generating huge amounts of I/O and
 WAL on the master, the cost and annoyance of which is experienced
 directly by the user. That is avoided on standbys.

 Effects of that are that long running statements often run much longer
 than we want, increasing bloat as a result. It also produces wildly
 varying response times, depending upon extent of cleanup required.

 It is a simple task to make that behaviour optional on the master.

 I propose a USERSET parameter, prune_cost_limit (---insert better name here)
 which will make the behaviour optional, default -1, in normal user
 processes. VACUUM will ignore this parameter and so its actions will
 never be deferred.

 In detail, this parameter would disable pruning for any scan larger
 than the cost limit. So large scans will disable the behaviour. The
 default, -1, means never disable pruning, which is the current
 behavour.

 We track the number of pages dirtied by the current statement. When
 this reaches prune_cost_limit, we will apply these behaviours to all
 shared_buffer block accesses...

 (1) avoid running heap_page_prune_opt()

 (2) avoid dirtying the buffer for hints. (This is safe because the
 hinted changes will either be lost or will be part of the full page
 image when we make a logged-change).

 (i.e. doesn't apply to temp tables)

 For example, if we set prune_cost_limit = 4 this behaviour allows
 small index lookups via bitmapheapscan to continue to cleanup, while
 larger index and seq scans will avoid cleanup.



 There would be a postgresql.conf parameter prune_cost_limit, as well
 as a table level parameter that would prevent pruning except via
 VACUUM.

 This will help in these ways
 * Reduce write I/O from SELECTs and pg_dump - improving backups and BI 
 queries
 * Allow finer grained control over Hot Standby conflicts
 * Potentially allow diagnostic inspection of older data via SeqScan

 Prototype patch shows this is possible and simple enough for 9.4.
 Major objections? Or should I polish up and submit?

 Patch attached, implemented to reduce writes by SELECTs only.

I am still not sure whether we want this, but I think it's definitely
an improvement over the previous version.  Assorted comments:

- Naming consistency seems to me to dictate that there should be more
similarity between the reloption name (allow_buffer_cleanup) and the
GUC (prune_page_dirty_limit).

- The documentation doesn't describe the use case where suppressing
cleanup on a per-table basis would be desirable, and I can't think of
one, either.

- There are a variety of ways to limit pruning; here, you've chosen to
limit it to a particular number of pruning operations per executor
invocation.  But the flag is global, not part of the executor state,
so a query that calls a PL/pgsql function during execution will reset
the counter for the parent query also, which doesn't seem very
principled.

In a patch I posted a few years ago to set hint bits only sometimes, I
settled on an algorithm where I dirtied the first 50 pages per scan
and then skipped the next 950, or something like that.  The idea was
that you wanted the pages that did get dirtied to be clustered
together to avoid random I/O; and also that you wanted table of
arbitrary size to get hinted within a certain number of scans (e.g.
20).  The limiting here is much more aggressive, so on large tables it
will amount to basically no pruning at all.  I dunno whether that's a
good idea or not.  But if the idea of making this an integer rather
than a boolean is to allow some pruning to still happen while keeping
it checked within reasonable bounds, I'm not sure it will succeed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


  1   2   >