Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-16 Thread Peter Eisentraut
On Sat, 2009-08-15 at 16:55 -0700, Jeff Davis wrote:
 Similarly, you could call vacuum_freeze_min_age the maximum age a
 tuple
 can be before a vacuum will freeze it.

Heh, you could also call max_connections the minimum number of
connections before the server will refuse new connection attempts.

It's not easy ... ;-)




Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-15 Thread Peter Eisentraut
On Fri, 2009-08-14 at 13:57 -0700, Jeff Davis wrote:
 Looking at the definitions of vacuum_freeze_min_age and
 autovacuum_freeze_max_age there seems to be almost no distinction
 between min and max in those two names.

For min, the action happens at or above the min value.  For max, the
action happens at or below the max value.

With those two particular parameters, the freezing happens exactly
between the min and the max value.
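
For illustration, the two values being contrasted can be read straight off
a running server (just a sketch; the defaults differ between releases):

  SHOW vacuum_freeze_min_age;        -- freezing starts at or above this tuple age
  SHOW autovacuum_freeze_max_age;    -- a forced autovacuum kicks in once the
                                     -- table's relfrozenxid age reaches this value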





Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-15 Thread Jeff Davis
On Sun, 2009-08-16 at 02:02 +0300, Peter Eisentraut wrote:
 For min, the action happens at or above the min value.  For max, the
 action happens at or below the max value.

From the docs, 23.1.4:

"autovacuum is invoked on any table that might contain XIDs older than
the age specified by the configuration parameter
autovacuum_freeze_max_age"

I interpret that to mean that the forced autovacuum run happens above
the value. You could reasonably call it the minimum age of relfrozenxid
that will cause autovacuum to forcibly run a vacuum. 
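
As a sketch of that comparison, the per-table age can be checked against
the setting like this:

  SELECT c.relname,
         age(c.relfrozenxid) AS xid_age,
         current_setting('autovacuum_freeze_max_age')::bigint AS forced_vacuum_at
  FROM pg_class c
  WHERE c.relkind = 'r'
  ORDER BY age(c.relfrozenxid) DESC
  LIMIT 10;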

Similarly, you could call vacuum_freeze_min_age the maximum age a tuple
can be before a vacuum will freeze it.

I'm not trying to be argumentative, I'm just trying to show that it can
be confusing if you interpret it the wrong way. The first time I saw
those configuration names, I was confused, and ever since, I have to
think about it: is that variable called "min" or "max"?

My general feeling is that both of these are thresholds. The only real
maximum happens near wraparound.

 With those two particular parameters, the freezing happens exactly
 between the min and the max value.

Thanks, that's a helpful way to remember it.

That mnemonic may be a little obsolete, because the freezing will now
normally happen between vacuum_freeze_min_age and vacuum_freeze_table_age;
but at least I should be able to remember which of the other parameters
is the "min" and which one is the "max".
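
For reference, all three related settings in 8.4 can be listed side by
side (a sketch):

  SELECT name, setting
  FROM pg_settings
  WHERE name IN ('vacuum_freeze_min_age',
                 'vacuum_freeze_table_age',
                 'autovacuum_freeze_max_age')
  ORDER BY setting::bigint;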

Regards,
Jeff Davis




Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-14 Thread Tom Lane
Jeff Davis <pg...@j-davis.com> writes:
 Yes. There are two ways to do the threshold:
   1. Constant fraction of vacuum_freeze_min_age
   2. Extra GUC

 I lean toward #1, because it avoids an extra GUC*, and it avoids the
 awkwardness when the "lower" setting is higher than the "higher"
 setting.

I tend to agree with Josh that you do need to offer two knobs.  But
expressing the second knob as a fraction (with range 0 to 1) might be
better than an independent min parameter.  As you say, that'd be
useful to prevent people from setting them inconsistently.
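
As a sketch of that arithmetic (the fraction knob is purely hypothetical
here; call it vacuum_freeze_dirty_fraction = 0.5), the derived threshold
can never exceed the existing setting:

  SELECT current_setting('vacuum_freeze_min_age')::bigint AS freeze_min_age,
         (0.5 * current_setting('vacuum_freeze_min_age')::bigint)::bigint
                                                          AS dirty_page_threshold;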

 *: As an aside, these GUCs already have incredibly confusing names, and
 an extra variable would increase the confusion. For instance, they seem
 to use min and max interchangeably.

Some of them are in fact max's, I believe.  They are complicated :-(.
It might be worth somebody taking two steps back and seeing if we need
quite so many knobs.  I think we got here partly by not wanting to
predetermine vacuuming strategies, but it doesn't help to offer
flexibility if people can't figure out how to use it.

regards, tom lane



Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-14 Thread Jeff Davis
On Fri, 2009-08-14 at 14:37 -0400, Tom Lane wrote:
 I tend to agree with Josh that you do need to offer two knobs.  But
 expressing the second knob as a fraction (with range 0 to 1) might be
 better than an independent min parameter.  As you say, that'd be
 useful to prevent people from setting them inconsistently.

Ok. Any ideas for a name?

Josh suggests vacuum_freeze_dirty_age (or perhaps he was using that as a
placeholder). I don't particularly like that name, but I can't think of
anything better without renaming vacuum_freeze_min_age.

  *: As an aside, these GUCs already have incredibly confusing names, and
  an extra variable would increase the confusion. For instance, they seem
  to use min and max interchangeably.
 
 Some of them are in fact max's, I believe.

Looking at the definitions of vacuum_freeze_min_age and
autovacuum_freeze_max_age there seems to be almost no distinction
between min and max in those two names. I've complained about this
before:

http://archives.postgresql.org/pgsql-hackers/2008-12/msg01731.php

I think both are essentially thresholds, so giving them two names with
opposite meaning is misleading.

Regards,
Jeff Davis




Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Alvaro Herrera
Jeff Davis wrote:

 Why aren't we more opportunistic about freezing tuples? For instance, if
 we already have a dirty buffer in cache, we should be more aggressive
 about freezing those tuples than freezing tuples on disk.

The most widely cited reason is that you lose forensics data.  Although
they are increasingly rare, there are still situations in which the heap
tuple machinery messes up and the xmin/xmax/etc fields of the tuple are
the best/only way to find out what happened and thus fix the bug.  If
you freeze early, there's just no way to know.

-- 
Alvaro Herrera                          http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Jeff Davis
On Thu, 2009-08-13 at 17:58 -0400, Alvaro Herrera wrote:
 The most widely cited reason is that you lose forensics data.  Although
 they are increasingly rare, there are still situations in which the heap
 tuple machinery messes up and the xmin/xmax/etc fields of the tuple are
 the best/only way to find out what happened and thus fix the bug.  If
 you freeze early, there's just no way to know.

As it stands, it looks like it's not just one extra write for each
buffer, but potentially many (theoretically, as many as there are tuples
on a page). I suppose the reasoning is that tuples on the same page have
approximately the same xmin, and are likely to be frozen at the same
time. But it seems entirely reasonable that the xmins on one page span
several VACUUM runs, and that seems more likely with the FSM. That means
that a few tuples on the page are older than 100M and get frozen, and
the rest are only about 95M transactions old, so we have to come back
and freeze them again later.

Let's say that we had a range like 50-100M, where if it's older than
100M, we freeze it, and if it's older than 50M we freeze it only if it's
on a dirty page. We would still have forensic evidence, but we could
make a range such that we avoid writing multiple times.

And people who don't care about forensic evidence can set it to 0-100M.
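
To make the proposed split concrete, here is a sketch against a made-up
table, using the example 50M/100M range from above (these are example
numbers, not GUCs):

  SELECT ctid,
         age(xmin) AS tuple_age,
         CASE
           WHEN age(xmin) >= 100000000 THEN 'freeze unconditionally'
           WHEN age(xmin) >=  50000000 THEN 'freeze only if the page is already dirty'
           ELSE 'leave alone'
         END AS proposed_action
  FROM some_table
  LIMIT 20;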

Regards,
Jeff Davis




Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Robert Haas
On Thu, Aug 13, 2009 at 5:33 PM, Jeff Davis <pg...@j-davis.com> wrote:
 Or, perhaps when the bgwriter is flushing dirty buffers, it can look for
 opportunities to set hint bits or freeze tuples.

One of the tricky things here is that the time you are most likely
to want to do this is when you are loading a lot of data.  But in that
case shared buffers are likely to be written back to disk before
transaction commit, so it'll be too early to do anything.

...Robert



Re: [PERFORM] Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Josh Berkus

 Why aren't we more opportunistic about freezing tuples? For instance, if
 we already have a dirty buffer in cache, we should be more aggressive
 about freezing those tuples than freezing tuples on disk.
 
 The most widely cited reason is that you lose forensics data.  Although
 they are increasingly rare, there are still situations in which the heap
 tuple machinery messes up and the xmin/xmax/etc fields of the tuple are
 the best/only way to find out what happened and thus fix the bug.  If
 you freeze early, there's just no way to know.

That argument doesn't apply.  If the page is in memory and is being
written anyway, and some of the rows are past vacuum_freeze_min_age,
then why not freeze them rather than waiting for a vacuum process to
read them off disk and rewrite them?

We're not talking about freezing every tuple as soon as it's out of
scope.  Just the ones which are more than 100m (or whatever the setting
is) old.  I seriously doubt that anyone is doing useful forensics using
xids which are 100m old.

-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com



Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Tom Lane
Jeff Davis <pg...@j-davis.com> writes:
 Let's say that we had a range like 50-100M, where if it's older than
 100M, we freeze it, and if it's older than 50M we freeze it only if it's
 on a dirty page. We would still have forensic evidence, but we could
 make a range such that we avoid writing multiple times.

Yeah, making the limit slushy would doubtless save some writes, with
not a lot of downside.

 And people who don't care about forensic evidence can set it to 0-100M.

Everybody *thinks* they don't care about forensic evidence.  Until they
need it.

regards, tom lane



Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Jeff Davis
On Thu, 2009-08-13 at 18:46 -0400, Tom Lane wrote:
 Yeah, making the limit slushy would doubtless save some writes, with
 not a lot of downside.

OK, then should we make this a TODO? I'll make an attempt at this.

  And people who don't care about forensic evidence can set it to 0-100M.
 
 Everybody *thinks* they don't care about forensic evidence.  Until they
 need it.

We already allow setting vacuum_freeze_min_age to zero, so I don't see a
solution here other than documentation.
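
For example (just a sketch, with a made-up table name), this already
freezes everything the vacuum touches; VACUUM FREEZE is shorthand for the
same thing:

  SET vacuum_freeze_min_age = 0;
  VACUUM my_table;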

Regards,
Jeff Davis




Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Tom Lane
Jeff Davis <pg...@j-davis.com> writes:
 On Thu, 2009-08-13 at 18:46 -0400, Tom Lane wrote:
 Everybody *thinks* they don't care about forensic evidence.  Until they
 need it.

 We already allow setting vacuum_freeze_min_age to zero, so I don't see a
 solution here other than documentation.

Yeah, we allow it.  I just don't want to encourage it ... and definitely
not make it the default.

What are you envisioning exactly?  If vacuum finds any reason to dirty
a page (or it's already dirty), then freeze everything on the page that's
got age > some lower threshold?

regards, tom lane



Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Josh Berkus
Jeff, Tom,

 Let's say that we had a range like 50-100M, where if it's older than
 100M, we freeze it, and if it's older than 50M we freeze it only if it's
 on a dirty page. We would still have forensic evidence, but we could
 make a range such that we avoid writing multiple times.
 
 Yeah, making the limit slushy would doubtless save some writes, with
 not a lot of downside.

This would mean two settings: vacuum_freeze_min_age and
vacuum_freeze_dirty_age.  And we'd need to add those to the
autovacuum settings for each table as well.  While we could just make
one setting 1/2 of the other, that prevents me from saying:

"freeze this table aggressively if it's in memory, but wait a long time
to vacuum if it's on disk."

I can completely imagine a table which has a vacuum_freeze_dirty_age of
1 and a vacuum_freeze_min_age of 1m.
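
The min-age half of that can already be expressed per table with 8.4's
storage parameters; the dirty-age half would be the new knob. A sketch,
with a made-up table name:

  ALTER TABLE hot_table SET (autovacuum_freeze_min_age = 1000000);
  -- hypothetical, does not exist today:
  -- ALTER TABLE hot_table SET (autovacuum_freeze_dirty_age = 1);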

-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com



Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Josh Berkus

 What are you envisioning exactly?  If vacuum finds any reason to dirty
 a page (or it's already dirty), then freeze everything on the page that's
 got age > some lower threshold?

I was envisioning, if the page is already dirty and in memory *for any
reason*, then freeze rows below some threshold.

-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com



Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Jeff Davis
On Thu, 2009-08-13 at 19:05 -0400, Tom Lane wrote:
 What are you envisioning exactly?  If vacuum finds any reason to dirty
 a page (or it's already dirty), then freeze everything on the page that's
 got age > some lower threshold?

Yes. There are two ways to do the threshold:
  1. Constant fraction of vacuum_freeze_min_age
  2. Extra GUC

I lean toward #1, because it avoids an extra GUC*, and it avoids the
awkwardness when the "lower" setting is higher than the "higher"
setting.

However, #2 might be nice for people who want to live on the edge or
experiment with new values. But I suspect most of the advantage would be
had just by saying that we opportunistically freeze tuples older than
50% of vacuum_freeze_min_age.
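
That fixed 50% rule needs no new setting at all; as a sketch of the
arithmetic:

  SELECT current_setting('vacuum_freeze_min_age')::bigint     AS freeze_min_age,
         current_setting('vacuum_freeze_min_age')::bigint / 2 AS opportunistic_freeze_age;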

Regards,
Jeff Davis

*: As an aside, these GUCs already have incredibly confusing names, and
an extra variable would increase the confusion. For instance, they seem
to use min and max interchangeably.




Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Jeff Davis
On Thu, 2009-08-13 at 18:25 -0400, Robert Haas wrote:
 On Thu, Aug 13, 2009 at 5:33 PM, Jeff Davis <pg...@j-davis.com> wrote:
  Or, perhaps when the bgwriter is flushing dirty buffers, it can look for
  opportunities to set hint bits or freeze tuples.
 
 One of the tricky things here is that the time you are most likely
 to want to do this is when you are loading a lot of data.  But in that
 case shared buffers are likely to be written back to disk before
 transaction commit, so it'll be too early to do anything.

I think it would be useful in other cases, like avoiding repeated
freezing of different tuples on the same page.

Regards,
Jeff Davis




Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Tom Lane
Josh Berkus <j...@agliodbs.com> writes:
 What are you envisioning exactly?  If vacuum finds any reason to dirty
 a page (or it's already dirty), then freeze everything on the page that's
 got age > some lower threshold?

 I was envisioning, if the page is already dirty and in memory *for any
 reason*, then freeze rows below some threshold.

I believe we've had this discussion before.  I do *NOT* want freezing
operations pushed into any random page access, and in particular will
do my best to veto any attempt to put them into the bgwriter.  Freezing
requires accessing the clog and emitting a WAL record, and neither is
appropriate for low-level code like bgwriter.  The deadlock potential
alone is sufficient reason why not.

regards, tom lane



Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Greg Stark
On Fri, Aug 14, 2009 at 12:07 AM, Josh Berkus <j...@agliodbs.com> wrote:
 "freeze this table aggressively if it's in memory, but wait a long time
 to vacuum if it's on disk."

Waitasec, "in memory"?

There are two projects here:

1) Make vacuum, when it's freezing tuples, freeze every tuple > the lesser
age if it finds any tuples which are > max_age (or, I suppose, if the
page is already dirty due to vacuum or something else). Vacuum still
has to read in all the pages before it finds out that they don't need
freezing, so it doesn't mean distinguishing "in memory" from "needs to
be read in".

2) Have something like bgwriter check if the page is dirty and vacuum
and freeze things based on the lesser threshold. This would
effectively only be vacuuming things that are in memory.

However, the latter is a more complex and fraught project. We looked at
this a while back in EDB and we found that the benefits were less than
we expected and the complexities more than we expected.  I would
recommend sticking with (1) for now and only looking at (2) if we have
a more detailed plan and solid testable use cases.

-- 
greg
http://mit.edu/~gsstark/resume.pdf



Re: [HACKERS] freezing tuples ( was: Why is vacuum_freeze_min_age 100m? )

2009-08-13 Thread Greg Stark
On Fri, Aug 14, 2009 at 12:21 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
 I was envisioning, if the page is already dirty and in memory *for any
 reason*, then freeze rows below some threshold.

 I believe we've had this discussion before.  I do *NOT* want freezing
 operations pushed into any random page access, and in particular will
 do my best to veto any attempt to put them into the bgwriter.

It's possible Josh accidentally waved this red flag and really meant
just to make it conditional on whether the page is dirty rather than
on whether vacuum dirtied it.

However, he did give me a thought.

With the visibility map, vacuum currently only covers pages that are
known to have in-doubt tuples. That's why we have the anti-wraparound
vacuums. However, it could also check whether the pages it's skipping are
in memory and process them if they are, even if they don't have in-doubt
tuples.

Or it could first go through RAM and process any pages that are in
cache before going to the visibility map and starting from page 0,
which would hopefully avoid having to read them in later when we get
to them and find they've been flushed out.
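
As a sketch of what "check if the pages it's skipping are in memory" could
look like from SQL, the contrib pg_buffercache module shows which blocks
of a table currently sit in shared buffers (table name made up):

  SELECT b.relblocknumber, b.isdirty
  FROM pg_buffercache b
  JOIN pg_class c ON c.relfilenode = b.relfilenode
  WHERE c.relname = 'my_table'
  ORDER BY b.relblocknumber;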

I'm just brainstorming here. I'm not sure if either of these are
actually worth the complexity and danger of finding new bottlenecks in
special case optimization codepaths.

-- 
greg
http://mit.edu/~gsstark/resume.pdf
