Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-12 Thread Bruce Momjian
Greg Stark wrote:
 On Thu, Jun 3, 2010 at 11:41 AM, Greg Stark gsst...@mit.edu wrote:
  I think to make it work you need to store a whole 64-bit reference
  transaction id consisting of both a cycle counter and a transaction
  id. The invariant for the page is that every xid on the page can be
  compared to that reference transaction id using normal transactionid
  semantics. Actually I think the easiest way to do that is to set it to
  the oldest xid on the page. The first thing to do before comparing any
  transaction id on the page with a real transaction id would be to
  figure out whether the reference xid is comparable to the live xid,
  which if it's the oldest xid on the page implies they'll all be
  comparable.
 
  The way to maintain that invariant would be that any xid insertion on
  the page must advance the reference xid if it's not comparable to the
  newly inserted xid. It has to be advanced to the oldest xid that's
  still comparable with the newly inserted xid. Any xids on the page
  that are older than the new refernce xid have to be frozen or removed.
  I'm not sure how to do that without keeping clog forever though.
 
 So the more I think about this the more I think it's unavoidable that
 we would need to retain clog forever.
 
 I think the goal here is to be able to load data into the database and
 then never write the data ever again. Even if you visit the page years
 later after the transaction ids have wrapped around several times. In
 that case there's no avoiding that you'll need to know whether that
 transaction committed or aborted.

I think we might need two bits, one commited and all visible, and
another aborted and all vislble.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-12 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 I think we might need two bits, one commited and all visible, and
 another aborted and all vislble.

Huh?  The latter means vacuumable.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-10 Thread Jan Wieck

Seems I underestimated the importance of forensic breadcrumbs.


On 6/9/2010 12:09 PM, Tom Lane wrote:

I do like the idea of using a status bit rather than FrozenXid to mark a
frozen tuple, because that eliminates the conflict between wanting to
freeze aggressively for performance reasons and wanting to preserve Xids
for forensic reasons.  But it doesn't seem to do much for Josh's
original problem.


Depends. Specifically on transaction profiles and how long the blocks 
linger around before being written. If you can set the all visible bit 
by the time, the page is written the first time, what bit including the 
is-frozen one cannot be set at that time too?


Maybe some analysis on the typical behavior of such system is in order. 
Especially the case Josh was mentioning seems to be a typical single 
insert logging style application, with little else going on on that 
particular database. I can't reveal specifics about that particular 
case, but think of something like taking frequent sensor readings, that 
need to be kept for years for forensics in case there is a product 
recall some day.


And even if some cases still required another page write because those 
frozen bits cannot be set on first write, this seems to be a win-win. We 
would get rid of the FrozenXid completely and shift to a bit, so we can 
effectively have a min_ freeze_age of zero while keeping the xid's forever.



Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-10 Thread Greg Stark
On Thu, Jun 3, 2010 at 11:41 AM, Greg Stark gsst...@mit.edu wrote:
 I think to make it work you need to store a whole 64-bit reference
 transaction id consisting of both a cycle counter and a transaction
 id. The invariant for the page is that every xid on the page can be
 compared to that reference transaction id using normal transactionid
 semantics. Actually I think the easiest way to do that is to set it to
 the oldest xid on the page. The first thing to do before comparing any
 transaction id on the page with a real transaction id would be to
 figure out whether the reference xid is comparable to the live xid,
 which if it's the oldest xid on the page implies they'll all be
 comparable.

 The way to maintain that invariant would be that any xid insertion on
 the page must advance the reference xid if it's not comparable to the
 newly inserted xid. It has to be advanced to the oldest xid that's
 still comparable with the newly inserted xid. Any xids on the page
 that are older than the new refernce xid have to be frozen or removed.
 I'm not sure how to do that without keeping clog forever though.

So the more I think about this the more I think it's unavoidable that
we would need to retain clog forever.

I think the goal here is to be able to load data into the database and
then never write the data ever again. Even if you visit the page years
later after the transaction ids have wrapped around several times. In
that case there's no avoiding that you'll need to know whether that
transaction committed or aborted.

Now we could make a bet that most transactions commit and therefore we
could keep a list of aborted transactions only which we might be able
to keep forever in very little space if very few transactions abort.
Presumably we would only use this form once the transaction was about
to be truncated out of clog. I'm not too happy with the assumption
that there aren't many aborts though. Someone could come along with a
use case where they have lots of aborts and run into strange
limitations and performance characteristics.

Alternatively we could do something like keeping a list of tables
touched by any transaction. Then vacuum could look for any
non-committed transactions old enough to be in danger of aging out of
clog and ensure those tables are frozen. But any tables which have
never been touched by any such old transaction could be left alone.
when we read in the page we'll be able to recognize the old
transactions as committed if they're beyond the end of the clog
horizon.

I don't really like that idea either because it leaves performance
really quite unpredictable. I could have a large table that goes
unvacuumed for a long time -- then when I come along with some tiny
query where I hit C-c and cause an abort I suddenly set a trap which
causes a huge vacuum freeze to fire off.


-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-10 Thread Tom Lane
Jan Wieck janwi...@yahoo.com writes:
 Depends. Specifically on transaction profiles and how long the blocks 
 linger around before being written. If you can set the all visible bit 
 by the time, the page is written the first time, what bit including the 
 is-frozen one cannot be set at that time too?

All-visible and is-frozen would be the same bit ...

 And even if some cases still required another page write because those 
 frozen bits cannot be set on first write, this seems to be a win-win. We 
 would get rid of the FrozenXid completely and shift to a bit, so we can 
 effectively have a min_ freeze_age of zero while keeping the xid's forever.

Right.  I don't see any downside, other than eating another status bit
per tuple, which we can afford.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-09 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Jun 8, 2010 at 6:35 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 But none of this accomplishes a damn thing towards the original goal,
 which was to avoid an extra disk write associated with freezing (not
 to mention an extra write for setting the transaction-committed hint
 bit).  Setting a bit is no cheaper from that standpoint than changing
 the xmin field.

 Except for insert-only tables, I don't believe this is true.

But insert-only tables are exactly the case that Josh complained about
to start with.

 If you
 freeze all tuples by the time the pages are marked all-visible,
 perhaps via the xmin-preserving mechanism Simon suggested, then you
 can use the visibility map to skip anti-wraparound vacuum as well as
 regular vacuum.  That sounds to me like it's accomplishing something.
 Is it a complete solution? No.  Is it better than what we have now?
 Yes.

I do like the idea of using a status bit rather than FrozenXid to mark a
frozen tuple, because that eliminates the conflict between wanting to
freeze aggressively for performance reasons and wanting to preserve Xids
for forensic reasons.  But it doesn't seem to do much for Josh's
original problem.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-09 Thread marcin mank
On Wed, Jun 9, 2010 at 12:35 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 On Tue, 2010-06-08 at 18:03 -0400, Robert Haas wrote:
 OK, yes, I see what you're getting at now.  There are two possible
 ways to do freeze the tuples and keep the xmin: we can either rely on
 the PD_ALL_VISIBLE page-level bit (as I previously proposed) or we can
 additionally have a HEAP_XMIN_FROZEN bit as you propose here.  I am
 not sure which way is better.

 Doing it at tuple level is more flexible and allows more aggressive
 freezing. It also works better with existing tuple visibility code.

 I agree, relying on a page-level bit (or field) is unpleasant in a
 number of ways.

 But none of this accomplishes a damn thing towards the original goal,
 which was to avoid an extra disk write associated with freezing (not
 to mention an extra write for setting the transaction-committed hint
 bit).  Setting a bit is no cheaper from that standpoint than changing
 the xmin field.


Could a tuple wih the bit set be considered frozen already? Would we
actually ever need to rewrite the xmin, even for anti-wraparound
reasons?

Greetings
Marcin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-09 Thread Simon Riggs
On Tue, 2010-06-08 at 18:35 -0400, Tom Lane wrote:
 Simon Riggs si...@2ndquadrant.com writes:
  On Tue, 2010-06-08 at 18:03 -0400, Robert Haas wrote:
  OK, yes, I see what you're getting at now.  There are two possible
  ways to do freeze the tuples and keep the xmin: we can either rely on
  the PD_ALL_VISIBLE page-level bit (as I previously proposed) or we can
  additionally have a HEAP_XMIN_FROZEN bit as you propose here.  I am
  not sure which way is better.
 
  Doing it at tuple level is more flexible and allows more aggressive
  freezing. It also works better with existing tuple visibility code.
 
 I agree, relying on a page-level bit (or field) is unpleasant in a
 number of ways.
 
 But none of this accomplishes a damn thing towards the original goal,
 which was to avoid an extra disk write associated with freezing (not
 to mention an extra write for setting the transaction-committed hint
 bit).  Setting a bit is no cheaper from that standpoint than changing
 the xmin field.

No, it doesn't of itself, but if you raise a complaint then we must
first address the complaint as a sub-topic before we continue the main
discussion around $TOPIC. My proposal removes the barrier that early
freezing would overwrite xmin values, so early freezing need not have
any negative effects.

The general idea is to hide the third write (freezing) on a tuple by
making it happen at the same time as another tuple's second
write (hint bit setting).

I'm happy to let that continue by the OPs.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-09 Thread Tom Lane
marcin mank marcin.m...@gmail.com writes:
 Could a tuple wih the bit set be considered frozen already? Would we
 actually ever need to rewrite the xmin, even for anti-wraparound
 reasons?

That's exactly what Simon is suggesting: if we had a tuple status flag
with the semantics of this xmin is known visible to all current and
future transactions, we could consider setting that bit to be the moral
equivalent of freezing the tuple.  The tuple visibility tests would
never actually consult clog for such an xmin and thus we'd never have to
replace it with FrozenXid.

But this doesn't in itself save us any work: we'd still need to treat
setting that bit as a WAL-logged operation, and we'd still need to have
VACUUM track the oldest not-thus-hinted xmins.  What it does do is
eliminate the conflict between wanting to freeze tuples aggressively for
various performance reasons and wanting to preserve original xmin values
for forensic reasons.

I wonder how this might play into Heikki's ideas about making the
visibility map trustworthy.  If we WAL-logged the operation of set all
the per-tuple VISIBLE-TO-ALL bits on this page, as well as the page's
bit in the visibility map, then that end of things would be
trustworthy.  And all the operations that have to unset the map bit
are already WAL-logged.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-09 Thread Robert Haas
On Wed, Jun 9, 2010 at 12:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 If you
 freeze all tuples by the time the pages are marked all-visible,
 perhaps via the xmin-preserving mechanism Simon suggested, then you
 can use the visibility map to skip anti-wraparound vacuum as well as
 regular vacuum.  That sounds to me like it's accomplishing something.
 Is it a complete solution? No.  Is it better than what we have now?
 Yes.

 I do like the idea of using a status bit rather than FrozenXid to mark a
 frozen tuple, because that eliminates the conflict between wanting to
 freeze aggressively for performance reasons and wanting to preserve Xids
 for forensic reasons.  But it doesn't seem to do much for Josh's
 original problem.

OK, I see.  So maybe we add a Todo to implement that, and then keep
thinking about how to fix Josh's problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-08 Thread Greg Stark
On Tue, Jun 8, 2010 at 2:26 AM, Stephen Frost sfr...@snowman.net wrote:
 Just an off-the-wall thought, but, would it be possible to have a tool
 which read WAL backwards and compared entries in the WAL against entries
 on disk?  I realize that you'd only see one version of a particular
 block and then have to skip any updates which are earlier than it, but
 it seems like you could cover a pretty large chunk of the recent changes
 to the database using this approach..

I assume you mean back out the changes incrementally until you find a
full_page_write and see if it matches? And continue comparing with
full_page_writes once per checkpoint? I don't think the WAL has enough
information to replay backwards though. For example vacuum cleanup
records just list the tids to remove. They don't have the contents to
replace there.



-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-08 Thread Stephen Frost
* Greg Stark (gsst...@mit.edu) wrote:
 I assume you mean back out the changes incrementally until you find a
 full_page_write and see if it matches?

To be honest, you're already assuming I know more about how this all
works than I do. :)  The gist of my thought was simply- we write out
block changes to the WAL, including data in many cases.  If we were to
look at the very end of the WAL, at the last piece of data written
there, and the data files have supposedly been flushed, then what's in
the WAL at that point should match what's in the data files, right?  If
it doesn't, that'd be bad.

 And continue comparing with
 full_page_writes once per checkpoint?

If we could only do it when there's a full page write, then perhaps that
would work as well, but I thought we tracked them at a lower level.  In
any case, the idea is the same- compare what's in WAL to what's supposed
to be on disk, and alarm whenever there's a clear error.

 I don't think the WAL has enough
 information to replay backwards though. For example vacuum cleanup
 records just list the tids to remove. They don't have the contents to
 replace there.

Right, you couldn't actually move the database backwards in time using
this tool (because we only write out new data, we don't write out what
was in that block/page before the write)- that isn't the idea or intent.
It would just be a tool that someone could run against a database where
they've detected corruption (or, I dunno, more frequently, to perhaps
catch corruption faster?), to see if the problem is a PG bug or a
hardware/kernel/etc issue.  In fact, if you could somehow do this
against a backup that's been taken using pg_start_backup/pg_stop_backup,
that would be pretty awesome.

I know that if such a tool existed, I'd be happy to run it as part of my
regular backup routines- I *always* have all the WALs from my last
backup to my next backup (and typically farther back than that, eg: if I
run full backups weekly, I'll have 4 full backups + all 4 weeks of WALs,
to be able to replay back to any point in the month..).

The big question that I have is- would this actually be productive?
Would it actually be able to catch hardware corruption or help at all
with PG bugs?  Those are the things I'm not really sure about.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-08 Thread Jan Wieck

On 6/8/2010 8:27 AM, Greg Stark wrote:

On Tue, Jun 8, 2010 at 2:26 AM, Stephen Frost sfr...@snowman.net wrote:

Just an off-the-wall thought, but, would it be possible to have a tool
which read WAL backwards and compared entries in the WAL against entries
on disk?  I realize that you'd only see one version of a particular
block and then have to skip any updates which are earlier than it, but
it seems like you could cover a pretty large chunk of the recent changes
to the database using this approach..


I assume you mean back out the changes incrementally until you find a
full_page_write and see if it matches? And continue comparing with
full_page_writes once per checkpoint? I don't think the WAL has enough
information to replay backwards though. For example vacuum cleanup
records just list the tids to remove. They don't have the contents to
replace there.


You can't back out changes. WAL does not contain before images.


Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-08 Thread Simon Riggs
On Fri, 2010-06-04 at 10:18 -0400, Tom Lane wrote:
 Jan Wieck janwi...@yahoo.com writes:
  On 6/2/2010 3:10 PM, Alvaro Herrera wrote:
  I'd prefer a setting that would tell the system to freeze all tuples
  that fall within a safety range whenever any tuple in the page is frozen
  -- weren't you working on a patch to do this?  (was it Jeff Davis?)
 
  I just see a lot of cost caused by this safety range. I yet have to 
  see its real value, other than feel good.
 
 Jan, you don't know what you're talking about.  I have repeatedly had
 cases where being able to look at xmin was critical to understanding
 a bug.  I *will not* hold still for a solution that effectively reduces
 min_freeze_age to zero.

Recent history shows Tom's view to be the most useful one: its useful to
keep seeing the xmin. The last time we altered the way we set hint bits
we caused multiple data loss bugs doing it. We will need to debug things
and the WAL is always long gone (great idea though).

Why not just have a separate flag for HEAP_XMIN_FROZEN, that way we can
keep the xmin but also can see it is frozen?

We already WAL-log certain flag settings, so why not this one also?

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-08 Thread Robert Haas
On Tue, Jun 8, 2010 at 4:55 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Fri, 2010-06-04 at 10:18 -0400, Tom Lane wrote:
 Jan Wieck janwi...@yahoo.com writes:
  On 6/2/2010 3:10 PM, Alvaro Herrera wrote:
  I'd prefer a setting that would tell the system to freeze all tuples
  that fall within a safety range whenever any tuple in the page is frozen
  -- weren't you working on a patch to do this?  (was it Jeff Davis?)

  I just see a lot of cost caused by this safety range. I yet have to
  see its real value, other than feel good.

 Jan, you don't know what you're talking about.  I have repeatedly had
 cases where being able to look at xmin was critical to understanding
 a bug.  I *will not* hold still for a solution that effectively reduces
 min_freeze_age to zero.

 Recent history shows Tom's view to be the most useful one: its useful to
 keep seeing the xmin. The last time we altered the way we set hint bits
 we caused multiple data loss bugs doing it. We will need to debug things
 and the WAL is always long gone (great idea though).

 Why not just have a separate flag for HEAP_XMIN_FROZEN, that way we can
 keep the xmin but also can see it is frozen?

We could do that, but I think the point of this exercise is to reduce
I/O - specifically, I/O caused by anti-wraparound vacuums - and I'm
not clear how such a flag would help with that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-08 Thread Simon Riggs
On Tue, 2010-06-08 at 16:58 -0400, Robert Haas wrote:
 On Tue, Jun 8, 2010 at 4:55 PM, Simon Riggs si...@2ndquadrant.com wrote:
  On Fri, 2010-06-04 at 10:18 -0400, Tom Lane wrote:
  Jan Wieck janwi...@yahoo.com writes:
   On 6/2/2010 3:10 PM, Alvaro Herrera wrote:
   I'd prefer a setting that would tell the system to freeze all tuples
   that fall within a safety range whenever any tuple in the page is frozen
   -- weren't you working on a patch to do this?  (was it Jeff Davis?)
 
   I just see a lot of cost caused by this safety range. I yet have to
   see its real value, other than feel good.
 
  Jan, you don't know what you're talking about.  I have repeatedly had
  cases where being able to look at xmin was critical to understanding
  a bug.  I *will not* hold still for a solution that effectively reduces
  min_freeze_age to zero.
 
  Recent history shows Tom's view to be the most useful one: its useful to
  keep seeing the xmin. The last time we altered the way we set hint bits
  we caused multiple data loss bugs doing it. We will need to debug things
  and the WAL is always long gone (great idea though).
 
  Why not just have a separate flag for HEAP_XMIN_FROZEN, that way we can
  keep the xmin but also can see it is frozen?
 
 We could do that, but I think the point of this exercise is to reduce
 I/O - specifically, I/O caused by anti-wraparound vacuums - and I'm
 not clear how such a flag would help with that.

Hmmm: You suggested a variant of this idea, so whatever reasoning was
behind your suggestion would be shared here, surely?

Tom has been saying we cannot freeze early because we need to keep
xmins. I agree with that. This suggestion shows it is possible to freeze
a tuple AND keep its xmin. So that removes the argument that we should
freeze more aggressively (whenever we write the block) and can thus
reduce longer term I/O costs.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-08 Thread Robert Haas
On Tue, Jun 8, 2010 at 5:08 PM, Simon Riggs si...@2ndquadrant.com wrote:
  Why not just have a separate flag for HEAP_XMIN_FROZEN, that way we can
  keep the xmin but also can see it is frozen?

 We could do that, but I think the point of this exercise is to reduce
 I/O - specifically, I/O caused by anti-wraparound vacuums - and I'm
 not clear how such a flag would help with that.

 Hmmm: You suggested a variant of this idea, so whatever reasoning was
 behind your suggestion would be shared here, surely?

 Tom has been saying we cannot freeze early because we need to keep
 xmins. I agree with that. This suggestion shows it is possible to freeze
 a tuple AND keep its xmin. So that removes the argument that we should
 freeze more aggressively (whenever we write the block) and can thus
 reduce longer term I/O costs.

OK, yes, I see what you're getting at now.  There are two possible
ways to do freeze the tuples and keep the xmin: we can either rely on
the PD_ALL_VISIBLE page-level bit (as I previously proposed) or we can
additionally have a HEAP_XMIN_FROZEN bit as you propose here.  I am
not sure which way is better.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-08 Thread Simon Riggs
On Tue, 2010-06-08 at 18:03 -0400, Robert Haas wrote:
 OK, yes, I see what you're getting at now.  There are two possible
 ways to do freeze the tuples and keep the xmin: we can either rely on
 the PD_ALL_VISIBLE page-level bit (as I previously proposed) or we can
 additionally have a HEAP_XMIN_FROZEN bit as you propose here.  I am
 not sure which way is better.

Doing it at tuple level is more flexible and allows more aggressive
freezing. It also works better with existing tuple visibility code.

-- 
 Simon Riggs   www.2ndQuadrant.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-08 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On Tue, 2010-06-08 at 18:03 -0400, Robert Haas wrote:
 OK, yes, I see what you're getting at now.  There are two possible
 ways to do freeze the tuples and keep the xmin: we can either rely on
 the PD_ALL_VISIBLE page-level bit (as I previously proposed) or we can
 additionally have a HEAP_XMIN_FROZEN bit as you propose here.  I am
 not sure which way is better.

 Doing it at tuple level is more flexible and allows more aggressive
 freezing. It also works better with existing tuple visibility code.

I agree, relying on a page-level bit (or field) is unpleasant in a
number of ways.

But none of this accomplishes a damn thing towards the original goal,
which was to avoid an extra disk write associated with freezing (not
to mention an extra write for setting the transaction-committed hint
bit).  Setting a bit is no cheaper from that standpoint than changing
the xmin field.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-08 Thread Alvaro Herrera
Excerpts from Tom Lane's message of mar jun 08 18:35:00 -0400 2010:

 But none of this accomplishes a damn thing towards the original goal,
 which was to avoid an extra disk write associated with freezing (not
 to mention an extra write for setting the transaction-committed hint
 bit).  Setting a bit is no cheaper from that standpoint than changing
 the xmin field.

... unless the bit is outside the page itself -- so we get back to the
idea of a freeze map.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-08 Thread Robert Haas
On Tue, Jun 8, 2010 at 6:35 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 On Tue, 2010-06-08 at 18:03 -0400, Robert Haas wrote:
 OK, yes, I see what you're getting at now.  There are two possible
 ways to do freeze the tuples and keep the xmin: we can either rely on
 the PD_ALL_VISIBLE page-level bit (as I previously proposed) or we can
 additionally have a HEAP_XMIN_FROZEN bit as you propose here.  I am
 not sure which way is better.

 Doing it at tuple level is more flexible and allows more aggressive
 freezing. It also works better with existing tuple visibility code.

 I agree, relying on a page-level bit (or field) is unpleasant in a
 number of ways.

 But none of this accomplishes a damn thing towards the original goal,
 which was to avoid an extra disk write associated with freezing (not
 to mention an extra write for setting the transaction-committed hint
 bit).  Setting a bit is no cheaper from that standpoint than changing
 the xmin field.

Except for insert-only tables, I don't believe this is true.  If you
freeze all tuples by the time the pages are marked all-visible,
perhaps via the xmin-preserving mechanism Simon suggested, then you
can use the visibility map to skip anti-wraparound vacuum as well as
regular vacuum.  That sounds to me like it's accomplishing something.
Is it a complete solution? No.  Is it better than what we have now?
Yes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-07 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 ... my perspective is that it would be A Good Thing if it could
 just be turned on when needed.  If you have recurring bug that can
 be arranged, but in those cases you have other options; so I'm
 assuming you want this kept because it is primarily of forensic
 value after a non-repeatable bug has munged something?
 
Yeah, that's exactly the problem.  When you realize you need it,
it's too late.

 The best thought I've had so far
 is that if someone kept WAL files long enough the evidence might be
 in there somewhere

Hm, that is an excellent point.  The WAL trace would actually be a lot
superior in terms of being able to figure out what went wrong.  But
I don't quite see how we tell people either keep xmin or keep your
old WAL.  Also, for production sites the amount of WAL you'd have to
hang onto seems a bit daunting.  Other problems are the cost of shipping
it to a developer, and the impracticality of sanitizing private data in
it before you show it to somebody.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-07 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 
 The best thought I've had so far is that if someone kept WAL
 files long enough the evidence might be in there somewhere
 
 Hm, that is an excellent point.  The WAL trace would actually be a
 lot superior in terms of being able to figure out what went wrong.
 But I don't quite see how we tell people either keep xmin or keep
 your old WAL.  Also, for production sites the amount of WAL you'd
 have to hang onto seems a bit daunting.
 
Any thoughts on how far back the WAL would need to go to deal with
the issues where such information has been useful?  (For example, we
always have at least two weeks worth, but I don't know if that's a
useful range or not.)
 
 Other problems are the cost of shipping it to a developer, and the
 impracticality of sanitizing private data in it before you show it
 to somebody.
 
Yeah, this wouldn't be a practical answer to the need unless
PostgreSQL shipped with a tool which could scan WAL and extract the
relevant information (probably under direction of someone from the
list or a private support organization).  Is the required
information predictable enough to make developing such a tool a
tractable problem?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-07 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 If we were actually going in this direction we'd want to write a
 much better WAL-text-dump tool than we have, and then in principle
 somebody could sanitize the text output before shipping it off.
 
I wouldn't think this would be practical unless there was a way to
scan the WAL files and dump only the bits related to the affected
pages, and probably with at least an option (possibly default) to
just mention the data type and length, rather than showing the
actual values -- how often are the values relevant, anyway?  (Not
rhetorical; I really don't know.)
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-07 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 Hm, that is an excellent point.  The WAL trace would actually be a
 lot superior in terms of being able to figure out what went wrong.
 But I don't quite see how we tell people either keep xmin or keep
 your old WAL.  Also, for production sites the amount of WAL you'd
 have to hang onto seems a bit daunting.
 
 Any thoughts on how far back the WAL would need to go to deal with
 the issues where such information has been useful?  (For example, we
 always have at least two weeks worth, but I don't know if that's a
 useful range or not.)

Well, it's a how long does it take you to notice data corruption
kind of issue.  The most recent case I can think of where xmin was
helpful was in trying to sort out a problem with an index being
inconsistent with the heap, which manifested as wrong query answers
for the user.  I don't know how long it took him to recognize and
report the problem.  (We never did locate the bug-if-any, IIRC...
it would have been much more helpful to have the WAL trace.  xmin
did let me rule out some theories, though.)
 
 Other problems are the cost of shipping it to a developer, and the
 impracticality of sanitizing private data in it before you show it
 to somebody.
 
 Yeah, this wouldn't be a practical answer to the need unless
 PostgreSQL shipped with a tool which could scan WAL and extract the
 relevant information (probably under direction of someone from the
 list or a private support organization).  Is the required
 information predictable enough to make developing such a tool a
 tractable problem?

Hard to tell.  If we were actually going in this direction we'd
want to write a much better WAL-text-dump tool than we have, and then
in principle somebody could sanitize the text output before shipping
it off.  But going through a large volume of data that way could be
pretty impractical.  Also, we (or at least I) have nearly zip experience
with trying to debug problems by examining WAL, so it's not real clear
to me which details might be important.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-07 Thread Greg Smith

Tom Lane wrote:

If we were actually going in this direction we'd
want to write a much better WAL-text-dump tool than we have, and then
in principle somebody could sanitize the text output before shipping
it off.  But going through a large volume of data that way could be
pretty impractical.  Also, we (or at least I) have nearly zip experience
with trying to debug problems by examining WAL, so it's not real clear
to me which details might be important.
  


There's another interesting thing about moving in this direction too.  
Systems that have WAL archiving setup that run into a problem could end 
up being a much richer source for historical analysis of how the system 
got into the bad state than is available right now.  Typically those can 
have longer histories available than you'll find on a primary that's 
recycling segments all the time.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-07 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 If we were actually going in this direction we'd
 want to write a much better WAL-text-dump tool than we have, and then
 in principle somebody could sanitize the text output before shipping
 it off.  But going through a large volume of data that way could be
 pretty impractical.  Also, we (or at least I) have nearly zip experience
 with trying to debug problems by examining WAL, so it's not real clear
 to me which details might be important.

Just an off-the-wall thought, but, would it be possible to have a tool
which read WAL backwards and compared entries in the WAL against entries
on disk?  I realize that you'd only see one version of a particular
block and then have to skip any updates which are earlier than it, but
it seems like you could cover a pretty large chunk of the recent changes
to the database using this approach..

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-04 Thread Jan Wieck

On 6/2/2010 2:16 PM, Robert Haas wrote:

On Wed, Jun 2, 2010 at 2:05 PM, Tom Lane t...@sss.pgh.pa.us wrote:

Alvaro Herrera alvhe...@commandprompt.com writes:

The problem is that vacuum doesn't know that a certain part of the table
is already frozen.  It needs to scan it completely anyways.  If we had a
frozen map, we could mark pages that are completely frozen and thus do
not need any vacuuming; but we don't (I don't recall the reasons for
this.  Maybe it's just that no one has gotten around to it, or maybe
there's something else).


Offhand I think the reason is that you'd have to trust the frozen bit
to be 100% correct (or at least never set to 1 in error).  Currently,
both the FSM and visibility forks are just hints, and we won't suffer
data corruption if they're wrong; so we don't get too tense about WAL
logging or fsync'ing updates.  I believe Heikki is looking into what
it'd take to make the visibility map 100% reliable, in connection with
the desire for index-only scans.  If we get that and the overhead isn't
too terrible maybe we could build a frozen-status map the same way.


We could, but I think we'd be better off just freezing at the time we
mark the page PD_ALL_VISIBLE and then using the visibility map for
both purposes.  Keeping around the old xmin values after every tuple
on the page is visible to every running transaction is useful only for
forensics, and building a whole new freeze map just to retain that
information longer (and eventually force a massive anti-wraparound
vacuum) seems like overkill.


Agreed.

The whole business of minimum freeze age always struck me as leaving 
bread crumbs behind. Other than forensics, what is the actual value of 
that overhead?



Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-04 Thread Jan Wieck

On 6/2/2010 3:10 PM, Alvaro Herrera wrote:

Excerpts from Robert Haas's message of mié jun 02 14:16:33 -0400 2010:


We could, but I think we'd be better off just freezing at the time we
mark the page PD_ALL_VISIBLE and then using the visibility map for
both purposes.  Keeping around the old xmin values after every tuple
on the page is visible to every running transaction is useful only for
forensics, and building a whole new freeze map just to retain that
information longer (and eventually force a massive anti-wraparound
vacuum) seems like overkill.


Reducing the xid wraparound horizon a bit is reasonable, but moving it
all the way forward to OldestXmin is a bit much, methinks.


Why?



Besides, there's another argument for not freezing tuples immediately:
they may be updated shortly thereafter, causing extra churn for no gain.


What extra churn does it create if the tuple can be frozen before the 
bgwriter ever writes the page in the first place?




I'd prefer a setting that would tell the system to freeze all tuples
that fall within a safety range whenever any tuple in the page is frozen
-- weren't you working on a patch to do this?  (was it Jeff Davis?)


I just see a lot of cost caused by this safety range. I yet have to 
see its real value, other than feel good.



Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-04 Thread Tom Lane
Jan Wieck janwi...@yahoo.com writes:
 On 6/2/2010 3:10 PM, Alvaro Herrera wrote:
 I'd prefer a setting that would tell the system to freeze all tuples
 that fall within a safety range whenever any tuple in the page is frozen
 -- weren't you working on a patch to do this?  (was it Jeff Davis?)

 I just see a lot of cost caused by this safety range. I yet have to 
 see its real value, other than feel good.

Jan, you don't know what you're talking about.  I have repeatedly had
cases where being able to look at xmin was critical to understanding
a bug.  I *will not* hold still for a solution that effectively reduces
min_freeze_age to zero.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-04 Thread Robert Haas
On Fri, Jun 4, 2010 at 10:18 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Jan Wieck janwi...@yahoo.com writes:
 On 6/2/2010 3:10 PM, Alvaro Herrera wrote:
 I'd prefer a setting that would tell the system to freeze all tuples
 that fall within a safety range whenever any tuple in the page is frozen
 -- weren't you working on a patch to do this?  (was it Jeff Davis?)

 I just see a lot of cost caused by this safety range. I yet have to
 see its real value, other than feel good.

 Jan, you don't know what you're talking about.  I have repeatedly had
 cases where being able to look at xmin was critical to understanding
 a bug.  I *will not* hold still for a solution that effectively reduces
 min_freeze_age to zero.

So, we're talking in circles here.  I've already proposed a method
that would avoid the need to wipe out the xmins:

http://archives.postgresql.org/pgsql-hackers/2010-05/msg01485.php

And you said that if we were going to do that we might as well just
freeze sooner:

http://archives.postgresql.org/pgsql-hackers/2010-05/msg01548.php

If you don't want to freeze sooner, let's go back to the method
described in the first email.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-04 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Jan Wieck janwi...@yahoo.com writes:
 
 I just see a lot of cost caused by this safety range. I yet
 have to see its real value, other than feel good.
 
 Jan, you don't know what you're talking about.  I have repeatedly
 had cases where being able to look at xmin was critical to
 understanding a bug.  I *will not* hold still for a solution that
 effectively reduces min_freeze_age to zero.
 
In my experience with my own environment, I can honestly say that
it's clear that not freezing tuples quickly adds more cost than
running with cassert on.  If we had to run in production with one or
the other, I would definitely choose cassert from a performance
perspective; which one would do more to find bugs?  Why do we view
them so differently?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-04 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 In my experience with my own environment, I can honestly say that
 it's clear that not freezing tuples quickly adds more cost than
 running with cassert on.  If we had to run in production with one or
 the other, I would definitely choose cassert from a performance
 perspective; which one would do more to find bugs?  Why do we view
 them so differently?

The reason for not recommending cassert in production builds is not
cost but stability.  Per the fine manual:

 Also, having the tests turned on won't necessarily enhance the
 stability of your server!  The assertion checks are not categorized
 for severity, and so what might be a relatively harmless bug will
 still lead to server restarts if it triggers an assertion
 failure.  This option is not recommended for production use, but
 you should have it on for development work or when running a beta
 version.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-04 Thread Robert Haas
On Fri, Jun 4, 2010 at 11:45 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 In my experience with my own environment, I can honestly say that
 it's clear that not freezing tuples quickly adds more cost than
 running with cassert on.  If we had to run in production with one or
 the other, I would definitely choose cassert from a performance
 perspective; which one would do more to find bugs?  Why do we view
 them so differently?

 The reason for not recommending cassert in production builds is not
 cost but stability.  Per the fine manual:

         Also, having the tests turned on won't necessarily enhance the
         stability of your server!  The assertion checks are not categorized
         for severity, and so what might be a relatively harmless bug will
         still lead to server restarts if it triggers an assertion
         failure.  This option is not recommended for production use, but
         you should have it on for development work or when running a beta
         version.

We routinely castigate people for benchmarking done with cassert
turned on, and tell them their numbers are meaningless.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-04 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Jun 4, 2010 at 11:45 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 The reason for not recommending cassert in production builds is not
 cost but stability.

 We routinely castigate people for benchmarking done with cassert
 turned on, and tell them their numbers are meaningless.

I didn't say it wasn't expensive ;-).  But Kevin's question seemed to
be based on the assumption that runtime cost was the only negative.
It wouldn't be terribly hard to make a variant of cassert that skips
two or three of the most expensive things (particularly memory context
checking and CLOBBER_FREED_MEMORY), and from a cost perspective that
would be totally reasonable to run in production.  We haven't done it
because of the stability issue.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-04 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 But Kevin's question seemed to be based on the assumption that
 runtime cost was the only negative.  It wouldn't be terribly hard
 to make a variant of cassert that skips two or three of the most
 expensive things (particularly memory context checking and
 CLOBBER_FREED_MEMORY), and from a cost perspective that would be
 totally reasonable to run in production.  We haven't done it
 because of the stability issue.
 
Fair enough.  I was thinking of them both as debugging features,
which had various ideas roiling around in my head.  Having run
hundreds of databases 24/7 for years without ever needing this
information, but paying the cost for it one way or another every
day, my perspective is that it would be A Good Thing if it could
just be turned on when needed.  If you have recurring bug that can
be arranged, but in those cases you have other options; so I'm
assuming you want this kept because it is primarily of forensic
value after a non-repeatable bug has munged something?
 
Another thought bouncing around was that these breadcrumbs are
expensive; I was trying to think of some other way to capture the
information which would be cheaper, but I haven't thought of
anything, and I'm far from certain that cheaper breadcrumbs to
answer the need can be developed.  The best thought I've had so far
is that if someone kept WAL files long enough the evidence might be
in there somewhere
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-04 Thread Bruce Momjian
Kevin Grittner wrote:
 Fair enough.  I was thinking of them both as debugging features,
 which had various ideas roiling around in my head.  Having run
 hundreds of databases 24/7 for years without ever needing this
 information, but paying the cost for it one way or another every
 day, my perspective is that it would be A Good Thing if it could
 just be turned on when needed.  If you have recurring bug that can
 be arranged, but in those cases you have other options; so I'm
 assuming you want this kept because it is primarily of forensic
 value after a non-repeatable bug has munged something?
  
 Another thought bouncing around was that these breadcrumbs are
 expensive; I was trying to think of some other way to capture the
 information which would be cheaper, but I haven't thought of
 anything, and I'm far from certain that cheaper breadcrumbs to
 answer the need can be developed.  The best thought I've had so far
 is that if someone kept WAL files long enough the evidence might be
 in there somewhere

The idea that thousands of Postgres installations are slower just so we
can occasionally debug xmin/xmax issues seems way off balance to me.
If people want debugging, let them modify the freeze age settings;  the
defaults should not favor debugging when there is a measurable cost
involved.  How many times in the past five years have we even needed
such debugging information, and also are cases where we could not have
told the user to change freeze settings to get us that info?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-04 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 The idea that thousands of Postgres installations are slower just so we
 can occasionally debug xmin/xmax issues seems way off balance to me.

There's no evidence whatsoever that the scope of the problem is that large.

 If people want debugging, let them modify the freeze age settings;  the
 defaults should not favor debugging when there is a measurable cost
 involved.  How many times in the past five years have we even needed
 such debugging information, and also are cases where we could not have
 told the user to change freeze settings to get us that info?

You're missing the point here: this is something we need when trying
to make sense of cases that are hard or impossible to reproduce.
Retroactively changing the freeze policy isn't possible.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-04 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Bruce Momjian br...@momjian.us writes:
 The idea that thousands of Postgres installations are slower just
 so we can occasionally debug xmin/xmax issues seems way off
 balance to me.
 
 There's no evidence whatsoever that the scope of the problem is
 that large.
 
Well, are we agreed that the current approach means that insertion
of a heap tuple normally requires it to be written to disk three
times, with two of those WAL-logged?  And that deletion of a tuple
generally requires the same?  I'd say that constitutes prima facie
evidence that any PostgreSQL installation doing any significant
number of writes is slower because of this.  Are you suggesting
there aren't thousands of such installations, or that the repeated
disk writes are generally free?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-04 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  The idea that thousands of Postgres installations are slower just so we
  can occasionally debug xmin/xmax issues seems way off balance to me.
 
 There's no evidence whatsoever that the scope of the problem is that large.
 
  If people want debugging, let them modify the freeze age settings;  the
  defaults should not favor debugging when there is a measurable cost
  involved.  How many times in the past five years have we even needed
  such debugging information, and also are cases where we could not have
  told the user to change freeze settings to get us that info?
 
 You're missing the point here: this is something we need when trying
 to make sense of cases that are hard or impossible to reproduce.
 Retroactively changing the freeze policy isn't possible.

With in-place VACUUM FULL gone in 9.0, will there be as much need for
xmin/xmax forensics?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-04 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 With in-place VACUUM FULL gone in 9.0, will there be as much need for
 xmin/xmax forensics?

You know perfectly well that no one could answer that question.
(Or at least not answer it on the basis of facts available today.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-04 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  With in-place VACUUM FULL gone in 9.0, will there be as much need for
  xmin/xmax forensics?
 
 You know perfectly well that no one could answer that question.
 (Or at least not answer it on the basis of facts available today.)

Well, guess then.  In the past, how many forensic cases were needed for
in-place VACUUM FULL bugs, vs. other cases?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-04 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
 With in-place VACUUM FULL gone in 9.0, will there be as much need for
 xmin/xmax forensics?
 
 You know perfectly well that no one could answer that question.
 (Or at least not answer it on the basis of facts available today.)

 Well, guess then.

I already told you my opinion on this matter.  Since you're prepared
to discount that, I don't see why you'd put any credence in my
evidence-free guesses.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-04 Thread Alvaro Herrera
Excerpts from Bruce Momjian's message of vie jun 04 15:39:07 -0400 2010:
 Tom Lane wrote:
  Bruce Momjian br...@momjian.us writes:
   With in-place VACUUM FULL gone in 9.0, will there be as much need for
   xmin/xmax forensics?
  
  You know perfectly well that no one could answer that question.
  (Or at least not answer it on the basis of facts available today.)
 
 Well, guess then.  In the past, how many forensic cases were needed for
 in-place VACUUM FULL bugs, vs. other cases?

I don't understand the question.  I know I have debugged a bunch of
cases of data corruption, and having xmin/xmax around has been truly
useful.  VACUUM FULL has never been involved (that I know of -- most of
our customers tend not to run it AFAIK), so why would I care about
whether it's gone in 9.0?  Note that it's not always about PG bugs; but
in the cases where xmin=FrozenXid for all/most involved tuples, the
problems are more difficult to track down.

Yes, VACUUM FULL had bugs too -- I, for one, welcome our new
not-in-place VACUUM FULL overlord.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-04 Thread Bruce Momjian
Alvaro Herrera wrote:
 Excerpts from Bruce Momjian's message of vie jun 04 15:39:07 -0400 2010:
  Tom Lane wrote:
   Bruce Momjian br...@momjian.us writes:
With in-place VACUUM FULL gone in 9.0, will there be as much need for
xmin/xmax forensics?
   
   You know perfectly well that no one could answer that question.
   (Or at least not answer it on the basis of facts available today.)
  
  Well, guess then.  In the past, how many forensic cases were needed for
  in-place VACUUM FULL bugs, vs. other cases?
 
 I don't understand the question.  I know I have debugged a bunch of
 cases of data corruption, and having xmin/xmax around has been truly
 useful.  VACUUM FULL has never been involved (that I know of -- most of
 our customers tend not to run it AFAIK), so why would I care about
 whether it's gone in 9.0?  Note that it's not always about PG bugs; but
 in the cases where xmin=FrozenXid for all/most involved tuples, the
 problems are more difficult to track down.
 
 Yes, VACUUM FULL had bugs too -- I, for one, welcome our new
 not-in-place VACUUM FULL overlord.

OK, so we had lots of forensics the didn't involve VACUUM FULL.  That's
what I wanted to know.
-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-03 Thread Greg Stark
So I think the scheme in the original post of this thread is workable.
Not as described but could be made to work. In which case I think it's
preferable to a freeze map -- which I had previously assumed we would
need eventually.

The problem with the scheme originally described is that it assumed
you could have an cycle counter and then arrange that all the xids on
the page are within that cycle. That doesn't work because you could
easily have two live xids on the page that belong to two cycles -- one
FirstNormalTransactionId and one MaxTransactionId.

I think to make it work you need to store a whole 64-bit reference
transaction id consisting of both a cycle counter and a transaction
id. The invariant for the page is that every xid on the page can be
compared to that reference transaction id using normal transactionid
semantics. Actually I think the easiest way to do that is to set it to
the oldest xid on the page. The first thing to do before comparing any
transaction id on the page with a real transaction id would be to
figure out whether the reference xid is comparable to the live xid,
which if it's the oldest xid on the page implies they'll all be
comparable.

The way to maintain that invariant would be that any xid insertion on
the page must advance the reference xid if it's not comparable to the
newly inserted xid. It has to be advanced to the oldest xid that's
still comparable with the newly inserted xid. Any xids on the page
that are older than the new refernce xid have to be frozen or removed.
I'm not sure how to do that without keeping clog forever though.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-02 Thread Russell Smith
On 28/05/10 04:00, Josh Berkus wrote:
  Consider a table that is
 regularly written but append-only.  Every time autovacuum kicks in,
 we'll go and remove any dead tuples and then mark the pages
 PD_ALL_VISIBLE and set the visibility map bits, which will cause
 subsequent vacuums to ignore the all-visible portions of the table...
 until anti-wraparound kicks in, at which point we'll vacuum the entire
 table and freeze everything.

 If, however, we decree that you can't write a new tuple into a
 PD_ALL_VISIBLE page without freezing the existing tuples, then you'll
 still have the small, incremental vacuums but those are pretty cheap,
 
 That only works if those pages were going to be autovacuumed anyway.  In
 the case outlined above (which I've seen at 3 different production sites
 this year), they wouldn't be; a table with less than 2% updates and
 deletes does not get vacuumed until max_freeze_age for any reason.  For
 that matter, pages which are getting autovacuumed are not a problem,
 period; they're being read and written and freezing them is not an issue.

 I'm not seeing a way of fixing this common issue short of overhauling
 CLOG, or of creating a freeze_map.  Darn.
   
Don't you not get a positive enough effect by adjusting the table's
autovacuum_min_freeze_age and autovacuum_max_freeze_age.  If you set
those numbers small, it appears to me that you would get very quickly to
a state where the vacuum would example only the most recent part of the
table rather than the whole thing.  Does that give you enough of a win
that it stops the scanning and writing of the whole table which reduces
the performance problem being experienced.  It's not a complete
solution, but does it go someway?

Regards

Russell


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-02 Thread Alvaro Herrera
Excerpts from Russell Smith's message of mié jun 02 06:38:35 -0400 2010:

 Don't you not get a positive enough effect by adjusting the table's
 autovacuum_min_freeze_age and autovacuum_max_freeze_age.  If you set
 those numbers small, it appears to me that you would get very quickly to
 a state where the vacuum would example only the most recent part of the
 table rather than the whole thing.

The problem is that vacuum doesn't know that a certain part of the table
is already frozen.  It needs to scan it completely anyways.  If we had a
frozen map, we could mark pages that are completely frozen and thus do
not need any vacuuming; but we don't (I don't recall the reasons for
this.  Maybe it's just that no one has gotten around to it, or maybe
there's something else).

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-02 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 The problem is that vacuum doesn't know that a certain part of the table
 is already frozen.  It needs to scan it completely anyways.  If we had a
 frozen map, we could mark pages that are completely frozen and thus do
 not need any vacuuming; but we don't (I don't recall the reasons for
 this.  Maybe it's just that no one has gotten around to it, or maybe
 there's something else).

Offhand I think the reason is that you'd have to trust the frozen bit
to be 100% correct (or at least never set to 1 in error).  Currently,
both the FSM and visibility forks are just hints, and we won't suffer
data corruption if they're wrong; so we don't get too tense about WAL
logging or fsync'ing updates.  I believe Heikki is looking into what
it'd take to make the visibility map 100% reliable, in connection with
the desire for index-only scans.  If we get that and the overhead isn't
too terrible maybe we could build a frozen-status map the same way.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-02 Thread Robert Haas
On Wed, Jun 2, 2010 at 2:05 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
 The problem is that vacuum doesn't know that a certain part of the table
 is already frozen.  It needs to scan it completely anyways.  If we had a
 frozen map, we could mark pages that are completely frozen and thus do
 not need any vacuuming; but we don't (I don't recall the reasons for
 this.  Maybe it's just that no one has gotten around to it, or maybe
 there's something else).

 Offhand I think the reason is that you'd have to trust the frozen bit
 to be 100% correct (or at least never set to 1 in error).  Currently,
 both the FSM and visibility forks are just hints, and we won't suffer
 data corruption if they're wrong; so we don't get too tense about WAL
 logging or fsync'ing updates.  I believe Heikki is looking into what
 it'd take to make the visibility map 100% reliable, in connection with
 the desire for index-only scans.  If we get that and the overhead isn't
 too terrible maybe we could build a frozen-status map the same way.

We could, but I think we'd be better off just freezing at the time we
mark the page PD_ALL_VISIBLE and then using the visibility map for
both purposes.  Keeping around the old xmin values after every tuple
on the page is visible to every running transaction is useful only for
forensics, and building a whole new freeze map just to retain that
information longer (and eventually force a massive anti-wraparound
vacuum) seems like overkill.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-02 Thread Alvaro Herrera
Excerpts from Robert Haas's message of mié jun 02 14:16:33 -0400 2010:

 We could, but I think we'd be better off just freezing at the time we
 mark the page PD_ALL_VISIBLE and then using the visibility map for
 both purposes.  Keeping around the old xmin values after every tuple
 on the page is visible to every running transaction is useful only for
 forensics, and building a whole new freeze map just to retain that
 information longer (and eventually force a massive anti-wraparound
 vacuum) seems like overkill.

Reducing the xid wraparound horizon a bit is reasonable, but moving it
all the way forward to OldestXmin is a bit much, methinks.

Besides, there's another argument for not freezing tuples immediately:
they may be updated shortly thereafter, causing extra churn for no gain.

I'd prefer a setting that would tell the system to freeze all tuples
that fall within a safety range whenever any tuple in the page is frozen
-- weren't you working on a patch to do this?  (was it Jeff Davis?)

(BTW maybe instead of separate visibility and freeze maps we could have
two bits in the visibility map?)

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-02 Thread Bruce Momjian
Alvaro Herrera wrote:
 Excerpts from Robert Haas's message of mi?? jun 02 14:16:33 -0400 2010:
 
  We could, but I think we'd be better off just freezing at the time we
  mark the page PD_ALL_VISIBLE and then using the visibility map for
  both purposes.  Keeping around the old xmin values after every tuple
  on the page is visible to every running transaction is useful only for
  forensics, and building a whole new freeze map just to retain that
  information longer (and eventually force a massive anti-wraparound
  vacuum) seems like overkill.
 
 Reducing the xid wraparound horizon a bit is reasonable, but moving it
 all the way forward to OldestXmin is a bit much, methinks.
 
 Besides, there's another argument for not freezing tuples immediately:
 they may be updated shortly thereafter, causing extra churn for no gain.
 
 I'd prefer a setting that would tell the system to freeze all tuples
 that fall within a safety range whenever any tuple in the page is frozen
 -- weren't you working on a patch to do this?  (was it Jeff Davis?)
 
 (BTW maybe instead of separate visibility and freeze maps we could have
 two bits in the visibility map?)

Yeah, the two-bits idea was suggested during the conversation core had
about the issue.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-06-02 Thread Robert Haas
On Wed, Jun 2, 2010 at 3:10 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Excerpts from Robert Haas's message of mié jun 02 14:16:33 -0400 2010:

 We could, but I think we'd be better off just freezing at the time we
 mark the page PD_ALL_VISIBLE and then using the visibility map for
 both purposes.  Keeping around the old xmin values after every tuple
 on the page is visible to every running transaction is useful only for
 forensics, and building a whole new freeze map just to retain that
 information longer (and eventually force a massive anti-wraparound
 vacuum) seems like overkill.

 Reducing the xid wraparound horizon a bit is reasonable, but moving it
 all the way forward to OldestXmin is a bit much, methinks.

Why?  If it's just for forensics, those are some pretty expensive
forensics - it eventually costs you an additional complete rewrite of
every page.

 Besides, there's another argument for not freezing tuples immediately:
 they may be updated shortly thereafter, causing extra churn for no gain.

But if you were going to update PD_ALL_VISIBLE, then you were going to
write the page anyway.  You might as well freeze everything at the
same time so you don't have to come back.

Alternatively, you could do what I suggested upthread and just believe
PD_ALL_VISIBLE over the individual tuple xmins.  Then you don't have
to freeze the page until it's next written, but you still get to keep
your forensic info.

 I'd prefer a setting that would tell the system to freeze all tuples
 that fall within a safety range whenever any tuple in the page is frozen
 -- weren't you working on a patch to do this?  (was it Jeff Davis?)

Not me.  I don't think that's going to help a whole lot, though.  In
many of the painful scenarios, every tuple on the page will have the
same XID, and therefore they'll all be frozen at the same time anyway.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-28 Thread Heikki Linnakangas

On 27/05/10 22:56, Robert Haas wrote:

On Thu, May 27, 2010 at 3:52 PM, Kevin Grittner
kevin.gritt...@wicourts.gov  wrote:

Robert Haasrobertmh...@gmail.com  wrote:

On Thu, May 27, 2010 at 3:15 PM, Kevin Grittner



(a)  The tuples were written within the same transaction which
created or truncated the table.



In case (a), you mess up visibility with respect to other
command-IDs within the transaction.


Surely that problem is surmountable?


I proposed an idea at PGCon, but I believe Tom and Heikki thought it
was far too grotty to consider.


No, I think it's surmountable too. We discussed hacks to teach the MVCC 
checks that all frozen tuples on a table that was created in the same 
transaction (i.e. the same cases where we skip WAL logging) were 
actually created by the running transaction, and check commandid 
accordingly.


Or detect simple DML commands where we know that the command doesn't 
read the table. COPY would usually fall into that category, though 
non-immutable input functions make that a bit iffy.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-27 Thread Heikki Linnakangas

On 27/05/10 08:56, Jesper Krogh wrote:

Just a thought. Wouldn't a All-visible bit also enable index only scans
to some degree?


Yes. In fact, that's one reason I implemented the visibility map in the 
first place. I started working on index-only scans based on that last 
year, if you search the archives for index-only scans you'll find those 
discussions.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-27 Thread Josh Berkus

 Well, maybe I'm confused here, but arranging things so that we NEVER
 have to visit the page after initially writing it seems like it's
 setting the bar almost impossibly high. 

That is the use case, though.  What I've encountered so far at 3 client
sites is tables which are largely append-only, with a few selects and
very few updates ( 2%) on recent data.   In general, once data gets
flushed out of memory, it goes to disk and never gets recalled, and
certainly not written.  Thinks are hunky-dory until we reach
max_freeze_age, at which point the server has to chew through hundreds
of gigabytes of old data just to freeze them, sometimes bringing the
application to a halt in the process.

The user's perspective on this is quite reasonable: if I haven't
selected these pages, and I haven't written to them, why does autovacuum
need to visit them and screw up my server performance?

  Consider a table that is
 regularly written but append-only.  Every time autovacuum kicks in,
 we'll go and remove any dead tuples and then mark the pages
 PD_ALL_VISIBLE and set the visibility map bits, which will cause
 subsequent vacuums to ignore the all-visible portions of the table...
 until anti-wraparound kicks in, at which point we'll vacuum the entire
 table and freeze everything.

 If, however, we decree that you can't write a new tuple into a
 PD_ALL_VISIBLE page without freezing the existing tuples, then you'll
 still have the small, incremental vacuums but those are pretty cheap,

That only works if those pages were going to be autovacuumed anyway.  In
the case outlined above (which I've seen at 3 different production sites
this year), they wouldn't be; a table with less than 2% updates and
deletes does not get vacuumed until max_freeze_age for any reason.  For
that matter, pages which are getting autovacuumed are not a problem,
period; they're being read and written and freezing them is not an issue.

I'm not seeing a way of fixing this common issue short of overhauling
CLOG, or of creating a freeze_map.  Darn.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-27 Thread Josh Berkus
On 5/26/10 6:32 PM, Robert Haas wrote:
 Hmm, yeah.  Maybe we should freeze when we set PD_ALL_VISIBLE; that
 might be just as good, and simpler.  Assuming the visibility map is
 sufficiently crash-safe/non-buggy, we could then teach VACUUM that
 it's OK to advance relfrozenxid even when doing just a partial vacuum
 - because any pages that were skipped must contain only frozen tuples.
  Previously you've objected to proposals in this direction because
 they might destroy forensic information, but maybe we should do it
 anyway.

It would be an improvement, and easier than the various ways of never
having to visit the pages, which are all fairly intensive.  Given the
destruction of rollback information, though, we'd probably want a way to
switch this behaviour on and off as an autovacuum setting.

Does this send us down the wrong path, though?  I thought we wanted to
think about removing hint bits so that we could implement things like
CRCs.  No?

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-27 Thread Robert Haas
On Thu, May 27, 2010 at 2:17 PM, Josh Berkus j...@agliodbs.com wrote:
 On 5/26/10 6:32 PM, Robert Haas wrote:
 Hmm, yeah.  Maybe we should freeze when we set PD_ALL_VISIBLE; that
 might be just as good, and simpler.  Assuming the visibility map is
 sufficiently crash-safe/non-buggy, we could then teach VACUUM that
 it's OK to advance relfrozenxid even when doing just a partial vacuum
 - because any pages that were skipped must contain only frozen tuples.
  Previously you've objected to proposals in this direction because
 they might destroy forensic information, but maybe we should do it
 anyway.

 It would be an improvement, and easier than the various ways of never
 having to visit the pages, which are all fairly intensive.  Given the
 destruction of rollback information, though, we'd probably want a way to
 switch this behaviour on and off as an autovacuum setting.

It's not going to destroy anything that is needed for rollback unless
there's a bug - PD_ALL_VISIBLE only gets set when all tuples on the
page are visible to all backends.  That can't happen until all
transactions that wrote the page, and all others that have a lower
xmin, have committed.  That having been said, if making it a GUC makes
people less nervous about doing it, then +1 from me.

 Does this send us down the wrong path, though?  I thought we wanted to
 think about removing hint bits so that we could implement things like
 CRCs.  No?

PD_ALL_VISIBLE is a page-level bit, not a tuple-level bit, and I
strongly suspect it's not going anywhere.  It's critical
infrastructure for index-only scans, among other things.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-27 Thread Robert Haas
On Thu, May 27, 2010 at 2:00 PM, Josh Berkus j...@agliodbs.com wrote:
 Well, maybe I'm confused here, but arranging things so that we NEVER
 have to visit the page after initially writing it seems like it's
 setting the bar almost impossibly high.

 That is the use case, though.  What I've encountered so far at 3 client
 sites is tables which are largely append-only, with a few selects and
 very few updates ( 2%) on recent data.   In general, once data gets
 flushed out of memory, it goes to disk and never gets recalled, and
 certainly not written.

We might be able to optimize this case if the transactions are small,
such that they commit before dirtying too large a fraction of
shared_buffers.  We could - at least in theory - teach the bgwriter or
some other process to freeze them before writing them to disk the
first time.  But if the blocks have to be written to disk before
transaction commit it seems to me we're DOA, unless we're willing to
retain arbitrarily large amounts of CLOG.

What might be more practical is to try to find ways to spread out the
I/O so that it doesn't happen all at once in a huge ornery spike.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-27 Thread Jesper Krogh


On 27/05/2010, at 20.00, Josh Berkus j...@agliodbs.com wrote:




Well, maybe I'm confused here, but arranging things so that we NEVER
have to visit the page after initially writing it seems like it's
setting the bar almost impossibly high.


That is the use case, though.  What I've encountered so far at 3  
client

sites is tables which are largely append-only, with a few selects and
very few updates ( 2%) on recent data.   In general, once data gets
flushed out of memory, it goes to disk and never gets recalled, and
certainly not written.  Thinks are hunky-dory until we reach
max_freeze_age, at which point the server has to chew through hundreds
of gigabytes of old data just to freeze them, sometimes bringing the
application to a halt in the process.


The data doesn't get in there in  no time if autovacuum was aware of  
inserts too it would incrementally freeze the table as it grows.


It would still cause it to be read in again but not in a big chunck.

Couldn't pages that are totally filled by the same transaction, be  
frozen on the initial write?


Jesper - given my limited knowledge about how it works.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-27 Thread Kevin Grittner
Jesper Krogh jes...@krogh.cc wrote:
 
 Couldn't pages that are totally filled by the same transaction, be
 
 frozen on the initial write?
 
As far as I'm aware, that can only be done if:
 
(a)  The tuples were written within the same transaction which
created or truncated the table.
 
*or*
 
(b)  The writing transaction and all transactions concurrent to it
have completed by the time the page is about to be written.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-27 Thread Robert Haas
On Thu, May 27, 2010 at 3:15 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Jesper Krogh jes...@krogh.cc wrote:

 Couldn't pages that are totally filled by the same transaction, be

 frozen on the initial write?

 As far as I'm aware, that can only be done if:

 (a)  The tuples were written within the same transaction which
 created or truncated the table.

 *or*

 (b)  The writing transaction and all transactions concurrent to it
 have completed by the time the page is about to be written.

Actually, I think this is true only in case (b).  In case (a), you
mess up visibility with respect to other command-IDs within the
transaction.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-27 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 On Thu, May 27, 2010 at 3:15 PM, Kevin Grittner
 
 (a)  The tuples were written within the same transaction which
 created or truncated the table.
 
 In case (a), you mess up visibility with respect to other
 command-IDs within the transaction.
 
Surely that problem is surmountable?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-27 Thread Robert Haas
On Thu, May 27, 2010 at 3:52 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Robert Haas robertmh...@gmail.com wrote:
 On Thu, May 27, 2010 at 3:15 PM, Kevin Grittner

 (a)  The tuples were written within the same transaction which
 created or truncated the table.

 In case (a), you mess up visibility with respect to other
 command-IDs within the transaction.

 Surely that problem is surmountable?

I proposed an idea at PGCon, but I believe Tom and Heikki thought it
was far too grotty to consider.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-27 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 I proposed an idea at PGCon, but I believe Tom and Heikki thought
 it was far too grotty to consider.
 
Well, as an alternative -- don't we have some information about the
relation pinned which could hold the xid of its creator?  If the
tuple is frozen check to see if your transaction is the creator and
behave like you created the tuple (which, in fact, you did)?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-27 Thread Joe Conway
On 05/27/2010 12:39 PM, Robert Haas wrote:
 On Thu, May 27, 2010 at 3:15 PM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 Jesper Krogh jes...@krogh.cc wrote:

 Couldn't pages that are totally filled by the same transaction, be

 frozen on the initial write?

 As far as I'm aware, that can only be done if:

 (a)  The tuples were written within the same transaction which
 created or truncated the table.

 *or*

 (b)  The writing transaction and all transactions concurrent to it
 have completed by the time the page is about to be written.
 
 Actually, I think this is true only in case (b).  In case (a), you
 mess up visibility with respect to other command-IDs within the
 transaction.
 

(a) can work if it is all in one command, CREATE TABLE AS SELECT...

Additionally we were discussing COPY in the FROM clause, which means you
could CREATE TABLE AS SELECT ... FROM (COPY ...). That would allow bulk
loading with hint bits already set (and tuples frozen?).

Joe



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-27 Thread Kevin Grittner
Joe Conway m...@joeconway.com wrote:
 
 (a) can work if it is all in one command, CREATE TABLE AS
 SELECT...
 
 Additionally we were discussing COPY in the FROM clause, which
 means you could CREATE TABLE AS SELECT ... FROM (COPY ...). That
 would allow bulk loading with hint bits already set (and tuples
 frozen?).
 
As long as it's hinted and frozen after a pg_dump -1 | psql I'll be
happy.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-26 Thread Josh Berkus
On 5/25/10 10:04 PM, Heikki Linnakangas wrote:
 On 25/05/10 23:56, Josh Berkus wrote:
 Do we get a bit in the visibility map for a page which has aborted
 transaction rows on it?
 
 If there's a tuple with an aborted xmin on a page, the bit in the
 visibility map is not set. A tuple with aborted xmax doesn't matter.

Then it seems like pages in the visibility map, at least, would not need
to be vacuumed or frozen.  Do pages persist in the visibility map
indefinitely?

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-26 Thread Heikki Linnakangas

On 26/05/10 21:35, Josh Berkus wrote:

On 5/25/10 10:04 PM, Heikki Linnakangas wrote:

On 25/05/10 23:56, Josh Berkus wrote:

Do we get a bit in the visibility map for a page which has aborted
transaction rows on it?


If there's a tuple with an aborted xmin on a page, the bit in the
visibility map is not set. A tuple with aborted xmax doesn't matter.


Then it seems like pages in the visibility map, at least, would not need
to be vacuumed or frozen.  Do pages persist in the visibility map
indefinitely?


In theory, until any tuple on the page is inserted/updated/deleted 
again. However, we've been operating on the assumption that it's always 
safe to clear any bit in the visibility map, without affecting 
correctness. I would not like to give up that assumption, it makes life 
easier.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-26 Thread Josh Berkus

 In theory, until any tuple on the page is inserted/updated/deleted
 again. However, we've been operating on the assumption that it's always
 safe to clear any bit in the visibility map, without affecting
 correctness. I would not like to give up that assumption, it makes life
 easier.

It wouldn't affect correctness, it would just force that page to be
vacuumed-and-frozen.  I think I can make this work, let me just hammer
it out.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-26 Thread Robert Haas
On Wed, May 26, 2010 at 2:44 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 26/05/10 21:35, Josh Berkus wrote:
 On 5/25/10 10:04 PM, Heikki Linnakangas wrote:
 On 25/05/10 23:56, Josh Berkus wrote:
 Do we get a bit in the visibility map for a page which has aborted
 transaction rows on it?

 If there's a tuple with an aborted xmin on a page, the bit in the
 visibility map is not set. A tuple with aborted xmax doesn't matter.

 Then it seems like pages in the visibility map, at least, would not need
 to be vacuumed or frozen.  Do pages persist in the visibility map
 indefinitely?

 In theory, until any tuple on the page is inserted/updated/deleted again.
 However, we've been operating on the assumption that it's always safe to
 clear any bit in the visibility map, without affecting correctness. I would
 not like to give up that assumption, it makes life easier.

What if we drove it off of the PD_ALL_VISIBLE bit on the page itself,
rather than the visibility map bit?  It would be safe to clear the
visibility map bit without touching the page, but if you clear the
PD_ALL_VISIBLE bit on the page itself then you set all the hint bits
and freeze all the tuples.  In the case where the visibility map bit
gets cleared but the page-level bit is still set, a future vacuum can
notice and reset the visibility map bit.  But whenever the visibility
map bit is set, you know that the page-level bit MUST be set, so you
needn't vacuum those pages, even for anti-wraparound: you know they'll
be frozen when and if they ever get written again.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-26 Thread Josh Berkus

 What if we drove it off of the PD_ALL_VISIBLE bit on the page itself,
 rather than the visibility map bit?  It would be safe to clear the
 visibility map bit without touching the page, but if you clear the
 PD_ALL_VISIBLE bit on the page itself then you set all the hint bits
 and freeze all the tuples.  In the case where the visibility map bit
 gets cleared but the page-level bit is still set, a future vacuum can
 notice and reset the visibility map bit.  But whenever the visibility
 map bit is set, you know that the page-level bit MUST be set, so you
 needn't vacuum those pages, even for anti-wraparound: you know they'll
 be frozen when and if they ever get written again.

How does that get us out of reading and writing old pages, though?  If
we're going to set a bit on them, we might as well freeze them.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-26 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 How does that get us out of reading and writing old pages, though?

Yeah.  Neither PD_ALL_VISIBLE nor the visibility map are going to solve
your problem, because they cannot become set without having visited the
page.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-26 Thread Robert Haas
On Wed, May 26, 2010 at 8:06 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Josh Berkus j...@agliodbs.com writes:
 How does that get us out of reading and writing old pages, though?

 Yeah.  Neither PD_ALL_VISIBLE nor the visibility map are going to solve
 your problem, because they cannot become set without having visited the
 page.

Well, maybe I'm confused here, but arranging things so that we NEVER
have to visit the page after initially writing it seems like it's
setting the bar almost impossibly high.   Consider a table that is
regularly written but append-only.  Every time autovacuum kicks in,
we'll go and remove any dead tuples and then mark the pages
PD_ALL_VISIBLE and set the visibility map bits, which will cause
subsequent vacuums to ignore the all-visible portions of the table...
until anti-wraparound kicks in, at which point we'll vacuum the entire
table and freeze everything.

If, however, we decree that you can't write a new tuple into a
PD_ALL_VISIBLE page without freezing the existing tuples, then you'll
still have the small, incremental vacuums but those are pretty cheap,
and in any event, I don't see any way to get rid of them unless
someone can devise a scheme to do away with vacuum entirely.  But you
won't need the full-table vacuum to freeze tuples, because you can
freeze them opportunistically the next time those pages are written
(at which point freezing will be very cheap because the page has to be
written to disk at that point no matter what).

Maybe I'm confused.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-26 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, May 26, 2010 at 8:06 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yeah.  Neither PD_ALL_VISIBLE nor the visibility map are going to solve
 your problem, because they cannot become set without having visited the
 page.

 Well, maybe I'm confused here, but arranging things so that we NEVER
 have to visit the page after initially writing it seems like it's
 setting the bar almost impossibly high.

Well, that was the use-case that Josh was on about when this idea came
up: high-volume append-only log tables that in most cases will never be
read, so his client wants to get rid of the extra I/O for maintenance
visits to once-written pages.

If you're willing to allow one visit and rewrite of each page, then
we can do that today with maybe a bit of rejiggering of vacuum's
when-to-freeze heuristics.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-26 Thread Robert Haas
On Wed, May 26, 2010 at 8:59 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Wed, May 26, 2010 at 8:06 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yeah.  Neither PD_ALL_VISIBLE nor the visibility map are going to solve
 your problem, because they cannot become set without having visited the
 page.

 Well, maybe I'm confused here, but arranging things so that we NEVER
 have to visit the page after initially writing it seems like it's
 setting the bar almost impossibly high.

 Well, that was the use-case that Josh was on about when this idea came
 up: high-volume append-only log tables that in most cases will never be
 read, so his client wants to get rid of the extra I/O for maintenance
 visits to once-written pages.

Well, I'll just note that using PD_ALL_VISIBLE as I'm proposing is
basically equivalent to Josh's original proposal of using an XID epoch
except that it addresses all three of the obvious issues which he
noted in his original email; plus it doesn't prevent truncating CLOG
(on the assumption that we rejigger things not to consult clog when
the page is marked PD_ALL_VISIBLE).

 If you're willing to allow one visit and rewrite of each page, then
 we can do that today with maybe a bit of rejiggering of vacuum's
 when-to-freeze heuristics.

Hmm, yeah.  Maybe we should freeze when we set PD_ALL_VISIBLE; that
might be just as good, and simpler.  Assuming the visibility map is
sufficiently crash-safe/non-buggy, we could then teach VACUUM that
it's OK to advance relfrozenxid even when doing just a partial vacuum
- because any pages that were skipped must contain only frozen tuples.
 Previously you've objected to proposals in this direction because
they might destroy forensic information, but maybe we should do it
anyway.

Either way, I think if we do this it *basically* gets rid of
anti-wraparound vacuum.  Yeah, we'll still do routine partial vacuums,
but what you won't have is... write the table, vacuum, vacuum, vacuum,
vacuum, OK, everything's visible to everyone, don't need to vacuum any
more...  months pass...  boom, unexpected full-table vacuum.  The
beginning part is the same, but you get rid of the boom at the end.
The only way I see to cut down vacuum activity even further is to
freeze them (and set the visibility map bit) before evicting them from
shared_buffers.  That's really the only way to get write once and
only once, but it's pretty hit or miss, because the xmin horizon
might not advance fast enough to make it actually work.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-26 Thread Jesper Krogh


On 27/05/2010, at 02.48, Robert Haas robertmh...@gmail.com wrote:


On Wed, May 26, 2010 at 8:06 PM, Tom Lane t...@sss.pgh.pa.us wrote:

Josh Berkus j...@agliodbs.com writes:

How does that get us out of reading and writing old pages, though?


Yeah.  Neither PD_ALL_VISIBLE nor the visibility map are going to  
solve
your problem, because they cannot become set without having visited  
the

page.


Well, maybe I'm confused here, but arranging things so that we NEVER
have to visit the page after initially writing it seems like it's
setting the bar almost impossibly high.   Consider a table that is
regularly written but append-only.  Every time autovacuum kicks in,
we'll go and remove any dead tuples and then mark the pages
PD_ALL_VISIBLE and set the visibility map bits, which will cause
subsequent vacuums to ignore the all-visible portions of the table...
until anti-wraparound kicks in, at which point we'll vacuum the entire
table and freeze everything.


Just a thought.  Wouldn't a All-visible bit also enable index only  
scans to some degree?


Jesper

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-25 Thread Heikki Linnakangas

On 24/05/10 22:49, Alvaro Herrera wrote:

Excerpts from Josh Berkus's message of vie may 21 17:57:35 -0400 2010:


Problem: currently, if your database has a large amount of cold data,
such as 350GB of 3-year-old sales transactions, in 8.4 vacuum no longer
needs to touch it thanks to the visibility map.  However, every
freeze_age transactions, very old pages need to be sucked into memory
and rewritten just in order to freeze those pages.  This can have a huge
impact on system performance, and seems unjustified because the pages
are not actually being used.


I think this is nonsense.  If you have 3-years-old sales transactions,
and your database has any interesting churn, tuples those pages have
been frozen for a very long time *already*.  The problem is vacuum
reading them in so that it can verify there's nothing to do.  If we want
to avoid *reading* those pages, this solution is useless:


Suggested resolution: we would add a 4-byte field to the *page* header
which would track the XID wraparound count.


because you still have to read the page.


What's missing from the suggestion is that relfrozenxid and datfrozenxid 
also need to be expanded to 8-bytes. That way you effectively have 
8-byte XIDs, which means that you never need to vacuum to avoid XID 
wraparound.


You still need to freeze to truncate clog, though, but if you have the 
disk space, you can now do that every 100 billion transactions for 
example if you wish.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-25 Thread Alvaro Herrera
Excerpts from Heikki Linnakangas's message of mar may 25 04:41:30 -0400 2010:
 On 24/05/10 22:49, Alvaro Herrera wrote:

  I think this is nonsense.  If you have 3-years-old sales transactions,
  and your database has any interesting churn, tuples those pages have
  been frozen for a very long time *already*.

 What's missing from the suggestion is that relfrozenxid and datfrozenxid 
 also need to be expanded to 8-bytes. That way you effectively have 
 8-byte XIDs, which means that you never need to vacuum to avoid XID 
 wraparound.

Hmm, so are we going to use the xid epoch more officially?  That's
entirely a new line of development, perhaps it opens new possibilities.

This sounds like extending Xid to 64 bits, without having to store the
high bits everywhere.  Was this discussed in the PGCon devs meeting?

-- 
Álvaro Herrera alvhe...@alvh.no-ip.org

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-25 Thread Tom Lane
Alvaro Herrera alvhe...@alvh.no-ip.org writes:
 This sounds like extending Xid to 64 bits, without having to store the
 high bits everywhere.  Was this discussed in the PGCon devs meeting?

Yeah, that's what it would amount to.  It was not discussed at the dev
meeting --- it was an idea that came up one evening at PGCon.

I'm not sure whether this would imply having to widen xid to 64 bits
internally.  That could be a bit unpleasant as far as CPU and shared
memory space go, although every year that goes by makes 32-bit machines
less interesting as DB servers.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-25 Thread Josh Berkus
Alvaro,

 This sounds like extending Xid to 64 bits, without having to store the
 high bits everywhere.  Was this discussed in the PGCon devs meeting?

Essentially, yes.

One of the main objections to raising XID to 64-bit has been the per-row
overhead.  But adding 4 bytes per page wouldn't be much of an impact.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-25 Thread Jan Wieck

On 5/24/2010 9:30 AM, Heikki Linnakangas wrote:

On 22/05/10 16:35, Tom Lane wrote:

Josh Berkusj...@agliodbs.com  writes:

   From a discussion at dinner at pgcon, I wanted to send this to the list
for people to poke holes in it:


Somebody (I think Joe or Heikki) poked a big hole in this last night at
the Royal Oak.


Me.


 Although the scheme would get rid of the need to replace
old XIDs with FrozenXid, it does not get rid of the need to set hint
bits before you can truncate CLOG.


Hmm, we don't rely on setting hint bits to truncate CLOG anymore 
(http://archives.postgresql.org/pgsql-committers/2006-11/msg00026.php). 
It's the replacement of xids with FrozenXid that matters, the hint bits 
are really just hints.


Doesn't change the conclusion, though: you still need to replace XIDs 
with FrozenXids to truncate the clog. Conceivably we could keep around 
more than 2^32 transactions in clog with this scheme, but then you need 
a lot more space for the clog. But perhaps it would be better to do that 
than to launch anti-wraparound vacuums, or to refuse more updates in the 
extreme cases.


Correct. The problem actually are aborted transactions. Just because an 
XID is really old doesn't mean it was committed.



Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-25 Thread Josh Berkus

 Correct. The problem actually are aborted transactions. Just because an
 XID is really old doesn't mean it was committed.

Yes, that's the main issue with my idea; XIDs which fell off the CLOG
would become visible even if they'd aborted.

Do we get a bit in the visibility map for a page which has aborted
transaction rows on it?

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-25 Thread Heikki Linnakangas

On 25/05/10 23:56, Josh Berkus wrote:

Do we get a bit in the visibility map for a page which has aborted
transaction rows on it?


If there's a tuple with an aborted xmin on a page, the bit in the 
visibility map is not set. A tuple with aborted xmax doesn't matter.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-24 Thread Heikki Linnakangas

On 22/05/10 16:35, Tom Lane wrote:

Josh Berkusj...@agliodbs.com  writes:

   From a discussion at dinner at pgcon, I wanted to send this to the list
for people to poke holes in it:


Somebody (I think Joe or Heikki) poked a big hole in this last night at
the Royal Oak.


Me.


 Although the scheme would get rid of the need to replace
old XIDs with FrozenXid, it does not get rid of the need to set hint
bits before you can truncate CLOG.


Hmm, we don't rely on setting hint bits to truncate CLOG anymore 
(http://archives.postgresql.org/pgsql-committers/2006-11/msg00026.php). 
It's the replacement of xids with FrozenXid that matters, the hint bits 
are really just hints.


Doesn't change the conclusion, though: you still need to replace XIDs 
with FrozenXids to truncate the clog. Conceivably we could keep around 
more than 2^32 transactions in clog with this scheme, but then you need 
a lot more space for the clog. But perhaps it would be better to do that 
than to launch anti-wraparound vacuums, or to refuse more updates in the 
extreme cases.



 So in your example of an insert-only
table that's probably never read again, there's still a minimum of one
update visit required on every old page.  Now that's still better than
two update visits ... but we could manage that already, just by tweaking
vacuum's heuristics about when to freeze vs when to set hint bits.


(As also discussed in the Royal Oak) I think we should simply not dirty 
a page when a hint bit is updated. Reading a page from disk is 
expensive, setting hint bits on the access is generally cheap compared 
to that. But that is orthogonal to the idea of a per-page XID epoch.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-24 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 (As also discussed in the Royal Oak) I think we should simply not dirty 
 a page when a hint bit is updated. Reading a page from disk is 
 expensive, setting hint bits on the access is generally cheap compared 
 to that. But that is orthogonal to the idea of a per-page XID epoch.

I'm not sure it's cheap.  What you suggest would result in a substantial
increase in clog accesses, which means (1) more I/O and (2) more
contention.  Certainly it's worth experimenting with, but it's no
guaranteed win.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-24 Thread Josh Berkus



I'm not sure it's cheap.  What you suggest would result in a substantial
increase in clog accesses, which means (1) more I/O and (2) more
contention.  Certainly it's worth experimenting with, but it's no
guaranteed win.


It seems like there's a number of issues we could fix by making the CLOG 
more efficient somehow -- from the elimination of hint bits to the 
ability to freeze pages without writing them.


Not, of course, that I have any idea how to do that.

--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-24 Thread Alvaro Herrera
Excerpts from Josh Berkus's message of vie may 21 17:57:35 -0400 2010:

 Problem: currently, if your database has a large amount of cold data, 
 such as 350GB of 3-year-old sales transactions, in 8.4 vacuum no longer 
 needs to touch it thanks to the visibility map.  However, every 
 freeze_age transactions, very old pages need to be sucked into memory 
 and rewritten just in order to freeze those pages.  This can have a huge 
 impact on system performance, and seems unjustified because the pages 
 are not actually being used.

I think this is nonsense.  If you have 3-years-old sales transactions,
and your database has any interesting churn, tuples those pages have
been frozen for a very long time *already*.  The problem is vacuum
reading them in so that it can verify there's nothing to do.  If we want
to avoid *reading* those pages, this solution is useless:

 Suggested resolution: we would add a 4-byte field to the *page* header 
 which would track the XID wraparound count.

because you still have to read the page.

I think what you're looking for is for this Xid wraparound count to be
stored elsewhere, not inside the page.  That way vacuum can read it and
skip the page without reading it altogether.  I think a freeze map has
been mentioned downthread.

I remember mentioning some time ago that we could declare some tables as
frozen, i.e. not needing vacuum.  This strikes me as similar, except
at the page level rather than table level.

-- 
Álvaro Herrera alvhe...@alvh.no-ip.org

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-22 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
  From a discussion at dinner at pgcon, I wanted to send this to the list 
 for people to poke holes in it:

Somebody (I think Joe or Heikki) poked a big hole in this last night at
the Royal Oak.  Although the scheme would get rid of the need to replace
old XIDs with FrozenXid, it does not get rid of the need to set hint
bits before you can truncate CLOG.  So in your example of an insert-only
table that's probably never read again, there's still a minimum of one
update visit required on every old page.  Now that's still better than
two update visits ... but we could manage that already, just by tweaking
vacuum's heuristics about when to freeze vs when to set hint bits.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-22 Thread Josh Berkus



Somebody (I think Joe or Heikki) poked a big hole in this last night at
the Royal Oak.  Although the scheme would get rid of the need to replace
old XIDs with FrozenXid, it does not get rid of the need to set hint
bits before you can truncate CLOG.  So in your example of an insert-only
table that's probably never read again, there's still a minimum of one
update visit required on every old page.  Now that's still better than
two update visits ... but we could manage that already, just by tweaking
vacuum's heuristics about when to freeze vs when to set hint bits.


Yeah, someone pointed that out to me too and suggested that a freeze map 
was the better solution.  I still think there's something we can do with 
pages on the visibility map but I'll have to think about it some more.


--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-22 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 Somebody (I think Joe or Heikki) poked a big hole in this last night at
 the Royal Oak.  Although the scheme would get rid of the need to replace
 old XIDs with FrozenXid, it does not get rid of the need to set hint
 bits before you can truncate CLOG.  So in your example of an insert-only
 table that's probably never read again, there's still a minimum of one
 update visit required on every old page.  Now that's still better than
 two update visits ... but we could manage that already, just by tweaking
 vacuum's heuristics about when to freeze vs when to set hint bits.

 Yeah, someone pointed that out to me too and suggested that a freeze map 
 was the better solution.  I still think there's something we can do with 
 pages on the visibility map but I'll have to think about it some more.

It occurred to me on the flight home that maybe we could salvage
something from this if there were some mechanism that caused hint bits
to get set before the page got written out from shared buffers the first
time.  This assumes that you have enough slack in shared-buffer space
that the transactions that touched a particular page all commit or abort
before the page first gets flushed to disk.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-22 Thread Jan Wieck

On 5/22/2010 9:16 PM, Tom Lane wrote:

Josh Berkus j...@agliodbs.com writes:

Somebody (I think Joe or Heikki) poked a big hole in this last night at
the Royal Oak.  Although the scheme would get rid of the need to replace
old XIDs with FrozenXid, it does not get rid of the need to set hint
bits before you can truncate CLOG.  So in your example of an insert-only
table that's probably never read again, there's still a minimum of one
update visit required on every old page.  Now that's still better than
two update visits ... but we could manage that already, just by tweaking
vacuum's heuristics about when to freeze vs when to set hint bits.


Yeah, someone pointed that out to me too and suggested that a freeze map 
was the better solution.  I still think there's something we can do with 
pages on the visibility map but I'll have to think about it some more.


It occurred to me on the flight home that maybe we could salvage
something from this if there were some mechanism that caused hint bits
to get set before the page got written out from shared buffers the first
time.  This assumes that you have enough slack in shared-buffer space
that the transactions that touched a particular page all commit or abort
before the page first gets flushed to disk.


At least the background writer should have a few spare cycles to look 
over a to be flushed page before writing it.



Jan

--
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-21 Thread Josh Berkus
From a discussion at dinner at pgcon, I wanted to send this to the list 
for people to poke holes in it:


Problem: currently, if your database has a large amount of cold data, 
such as 350GB of 3-year-old sales transactions, in 8.4 vacuum no longer 
needs to touch it thanks to the visibility map.  However, every 
freeze_age transactions, very old pages need to be sucked into memory 
and rewritten just in order to freeze those pages.  This can have a huge 
impact on system performance, and seems unjustified because the pages 
are not actually being used.


Suggested resolution: we would add a 4-byte field to the *page* header 
which would track the XID wraparound count.  Any page whose wraparound 
count was not equal to the current one would be considered to have all 
frozen tuples.  This would remove the necessity to read and write old 
pages just to freeze them, a humongous gain for databases with long data 
retention horizons, let alone data warehouses.


All xids on the page would, necessarily, need to belong to the same 
wraparound; if a page gets updated and its wraparound count (hereafter 
WCID) is lower than current, all tuples on the page would be frozen 
before any data is written to it.  XIDs which were before the max_freeze 
horizon on a page which was being written anyway would be frozen as they 
are now.


Obvious issues:

(1) In a case of rows written close to the wraparound point, this would 
cause a set of tuples to be frozen sooner than they would be in the 
current system.


(2) It's not clear what to do with a page where there are XIDs which are 
just before wraparound (like XID # 2.4b) which are still visible and 
receives a write with a new cycle xid (#1).


(3) This will require changing the page structure, with all that 
entails.  So it should probably be done when we're making another change 
(like adding CRCs).


--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Idea for getting rid of VACUUM FREEZE on cold pages

2010-05-21 Thread Florian Pflug
On May 21, 2010, at 23:57 , Josh Berkus wrote:
 From a discussion at dinner at pgcon, I wanted to send this to the list for 
 people to poke holes in it:
 
 Problem: currently, if your database has a large amount of cold data, such 
 as 350GB of 3-year-old sales transactions, in 8.4 vacuum no longer needs to 
 touch it thanks to the visibility map.  However, every freeze_age 
 transactions, very old pages need to be sucked into memory and rewritten just 
 in order to freeze those pages.  This can have a huge impact on system 
 performance, and seems unjustified because the pages are not actually being 
 used.
 
 Suggested resolution: we would add a 4-byte field to the *page* header which 
 would track the XID wraparound count.  Any page whose wraparound count was 
 not equal to the current one would be considered to have all frozen tuples.  
 This would remove the necessity to read and write old pages just to freeze 
 them, a humongous gain for databases with long data retention horizons, let 
 alone data warehouses.

If I understand this correctly, VACUUM usually only frees old tuples, but never 
increases the oldest xid in the pg_class record. Once that value becomes older 
than freeze_age, VACUUM needs to scan the whole relation to freeze old tuples. 
That results in most of the pages being marked dirty and subsequently being 
written out, causing an IO storm. If, OTOH, the wraparound count was stored in 
the page header, VACUUM would still need to read those pages, but wouldn't need 
to write them out.

Alternatively, VACUUM could freeze a few pages on each run, even if the xids 
are below freeze_age. It could pick those pages randomly, or maybe even prefer 
pages whose tuples have older xmin/xmas values. That would spread the load out 
more evenly, much like we try to spread checkpoints out over the whole 
checkpoint interval.

best regards,
Florian Pflugi


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers