Re: [HACKERS] Aggressive freezing in lazy-vacuum

2007-03-06 Thread ITAGAKI Takahiro
Tom Lane wrote:

> > This is a stand-alone patch for aggressive freezing. I'll propose
> > to use OldestXmin instead of FreezeLimit as the freeze threshold
> > in the circumstances below:
>
> I think it's a really bad idea to freeze that aggressively under any
> circumstances except being told to (ie, VACUUM FREEZE).  When you
> freeze, you lose history information that might be needed later --- for
> forensic purposes if nothing else.

I don't think we can provide that kind of historical-database functionality
here, because even with care we could guarantee it only for INSERTed tuples.
Autovacuum is already enabled by default, so we cannot predict when the next
vacuum will start, and recently UPDATEd and DELETEd tuples are removed at
unpredictable times. Furthermore, HOT will accelerate the removal of expired
tuples. Instead, we had better use WAL or something like audit logs to keep
history information.


> You need to show a fairly amazing
> performance gain to justify that, and I don't think you can.

Thank you for your advice. I found that aggressive freezing made things
worse when applied to pages that were merely already dirty, but it was
useful for pages that contain other tuples being frozen or dead tuples.

I did an acceleration test for XID wraparound vacuum.
I initialized the database with

  $ ./pgbench -i -s100
  # VACUUM FREEZE accounts;
  # SET vacuum_freeze_min_age = 6;

and repeated the following queries.

  CHECKPOINT;
  UPDATE accounts SET aid=aid WHERE random() < 0.005;
  SELECT count(*) FROM accounts WHERE xmin <> 2;
  VACUUM accounts;

Once the freeze threshold reached vacuum_freeze_min_age (run >= 3),
VACUUM became faster with aggressive freezing. I think the gain comes
from piggybacking multiple freezing operations: the number of
unfrozen tuples was kept lower.

* Durations of VACUUM [sec]
 run     |  HEAD  | freeze
---------+--------+--------
 1       |    5.8 |    8.2
 2       |    5.2 |    9.0
 3       |  118.2 |  102.0
 4       |  122.4 |   99.8
 5       |  121.0 |   79.8
 6       |  122.1 |   77.9
 7       |  123.8 |  115.5
---------+--------+--------
 avg 3-7 |  121.5 |   95.0

* Numbers of unfrozen tuples
 run     |  HEAD  | freeze
---------+--------+--------
 1       |  50081 |  50434
 2       |  99836 | 100072
 3       | 100047 |  86484
 4       | 100061 |  86524
 5       |  99766 |  87046
 6       |  99854 |  86824
 7       |  99502 |  86595
---------+--------+--------
 avg 3-7 |  99846 |  86695
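
As a quick cross-check of the two tables, the following sketch recomputes the averages for runs 3 to 7 and the relative change between HEAD and the patched build. The figures are copied from the tables; the variable and function names are only illustrative and are not part of the patch or the test harness.

```python
# Measurements for runs 3-7, copied from the two tables in this message.
vacuum_secs = {
    "HEAD":   [118.2, 122.4, 121.0, 122.1, 123.8],
    "freeze": [102.0, 99.8, 79.8, 77.9, 115.5],
}
unfrozen_tuples = {
    "HEAD":   [100047, 100061, 99766, 99854, 99502],
    "freeze": [86484, 86524, 87046, 86824, 86595],
}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def relative_reduction(baseline, patched):
    """Fraction by which `patched` is lower than `baseline`."""
    return (baseline - patched) / baseline


avg_secs = {name: mean(v) for name, v in vacuum_secs.items()}
avg_unfrozen = {name: mean(v) for name, v in unfrozen_tuples.items()}

time_saved = relative_reduction(avg_secs["HEAD"], avg_secs["freeze"])
tuples_saved = relative_reduction(avg_unfrozen["HEAD"], avg_unfrozen["freeze"])

print(f"VACUUM time:     HEAD {avg_secs['HEAD']:.1f} s, "
      f"freeze {avg_secs['freeze']:.1f} s ({time_saved:.1%} lower)")
print(f"unfrozen tuples: HEAD {avg_unfrozen['HEAD']:.0f}, "
      f"freeze {avg_unfrozen['freeze']:.0f} ({tuples_saved:.1%} lower)")
```

On these figures the patched build spends roughly a fifth less time in VACUUM and leaves roughly an eighth fewer unfrozen tuples, with a noticeable spread between individual runs.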

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Aggressive freezing in lazy-vacuum

2007-03-06 Thread Gregory Stark

ITAGAKI Takahiro writes:

> I don't think we can provide that kind of historical-database functionality
> here, because even with care we could guarantee it only for INSERTed tuples.
> Autovacuum is already enabled by default, so we cannot predict when the next
> vacuum will start, and recently UPDATEd and DELETEd tuples are removed at
> unpredictable times. Furthermore, HOT will accelerate the removal of expired
> tuples. Instead, we had better use WAL or something like audit logs to keep
> history information.

Well, comparing the data to WAL is precisely the kind of debugging that I think
Tom is concerned with.

The hoped-for gain here is that vacuum finds fewer pages with tuples that
exceed vacuum_freeze_min_age? That seems useful, though vacuum is still going
to have to read every page, and I suspect most of the writes pertain to dead
tuples, not freezing tuples.

This strikes me as something that will be more useful once we have the DSM,
especially if it ends up including a frozen map. Once we have the DSM, vacuum
will no longer be visiting every page, so it will be much easier for pages to
get quite old and only be caught by a vacuum freeze. The less I/O that vacuum
freeze has to do, the better. If we get a freeze map, then aggressive freezing
would help keep pages out of that map so they never need to be vacuumed just
to freeze the tuples in them.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com



Re: [HACKERS] Aggressive freezing in lazy-vacuum

2007-03-06 Thread Tom Lane
ITAGAKI Takahiro writes:
> Tom Lane wrote:
>> I think it's a really bad idea to freeze that aggressively under any
>> circumstances except being told to (ie, VACUUM FREEZE).  When you
>> freeze, you lose history information that might be needed later --- for
>> forensic purposes if nothing else.
>
> I don't think we can provide that kind of historical-database functionality
> here, because even with care we could guarantee it only for INSERTed tuples.
> Autovacuum is already enabled by default, so we cannot predict when the next
> vacuum will start, and recently UPDATEd and DELETEd tuples are removed at
> unpredictable times.

I said nothing about expired tuples.  The point of not freezing is to
preserve information about the insertion time of live tuples.  And your
test case is unconvincing, because no sane DBA would run with such a
small value of vacuum_freeze_min_age.

regards, tom lane



Re: [HACKERS] Aggressive freezing in lazy-vacuum

2007-03-06 Thread ITAGAKI Takahiro
Gregory Stark wrote:

> The hoped-for gain here is that vacuum finds fewer pages with tuples that
> exceed vacuum_freeze_min_age? That seems useful, though vacuum is still going
> to have to read every page, and I suspect most of the writes pertain to dead
> tuples, not freezing tuples.

Yes. In some cases VACUUM dirties a page only to freeze tuples that have
exceeded vacuum_freeze_min_age, and I think we can reduce those writes by
keeping the number of unfrozen tuples low.

Freezing has three additional costs:
  1. CPU cost of changing the xids of the target tuples.
  2. Write cost of the WAL entries for FREEZE (log_heap_freeze).
  3. Write cost of newly dirtied pages.

I did the additional freezing only in the following two cases. In both, we
will already have dirtied the buffer and written WAL entries for the required
operations, so I think additional costs 2 and 3 are negligible, though cost 1
still applies.

| - There is another tuple to be frozen in the same page.
| - There are dead tuples in the same page.
|   Freezing is delayed until the heap vacuum phase.
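
The two cases above can be sketched as a per-page choice of freeze threshold. This is only an illustrative model in Python, not the patch itself: the HeapTuple class and both function names are hypothetical, XIDs are treated as plain integers (wraparound is ignored), and FreezeLimit is assumed to be older than OldestXmin.

```python
from dataclasses import dataclass


@dataclass
class HeapTuple:
    xmin: int             # XID of the inserting transaction
    dead: bool = False    # True if this VACUUM will remove the tuple
    frozen: bool = False  # True once xmin has been replaced by the frozen XID


def pick_freeze_threshold(page, freeze_limit, oldest_xmin):
    """Return the XID below which live tuples on this page get frozen.

    If the page will be dirtied anyway (a tuple is already past
    freeze_limit, or dead tuples will be removed), use the more
    aggressive oldest_xmin; otherwise keep the normal freeze_limit.
    """
    has_dead = any(t.dead for t in page)
    must_freeze = any(
        not t.dead and not t.frozen and t.xmin < freeze_limit for t in page
    )
    if has_dead or must_freeze:
        return oldest_xmin
    return freeze_limit


def freeze_page(page, freeze_limit, oldest_xmin):
    """Freeze eligible live tuples in place; return how many were frozen."""
    threshold = pick_freeze_threshold(page, freeze_limit, oldest_xmin)
    frozen_now = 0
    for t in page:
        if not t.dead and not t.frozen and t.xmin < threshold:
            t.frozen = True
            frozen_now += 1
    return frozen_now
```

A page holding only recent live tuples is left alone, while a page that already needs one freeze or one dead-tuple removal has all of its tuples older than OldestXmin frozen in the same pass.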


> This strikes me as something that will be more useful once we have the DSM,
> especially if it ends up including a frozen map. Once we have the DSM, vacuum
> will no longer be visiting every page, so it will be much easier for pages to
> get quite old and only be caught by a vacuum freeze. The less I/O that vacuum
> freeze has to do, the better. If we get a freeze map, then aggressive freezing
> would help keep pages out of that map so they never need to be vacuumed just
> to freeze the tuples in them.

Yeah, I was planning a 2 bits/page DSM for exactly that purpose. One of the
bits means to-be-vacuumed and the other means to-be-frozen. It helps us avoid
full scans of the pages for XID wraparound vacuums, but for that the DSM must
be more reliable and must not lose any information. I made an attempt to
accomplish this in the DSM, but I understand that I need to demonstrate to you
that it works as designed.
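
A 2 bits/page map of this kind could be sketched as below. This is a toy in-memory model in Python, not the DSM design from the patch: the class name, the flag constants, and the bit layout (four pages per byte, low bit for to-be-vacuumed) are all assumptions made for illustration, and the reliability concerns mentioned above (crash safety, not losing bits) are not addressed.

```python
TO_BE_VACUUMED = 0b01   # page contains dead tuples to remove
TO_BE_FROZEN = 0b10     # page contains tuples that still need freezing
PAGES_PER_BYTE = 4      # two bits per page


class TwoBitPageMap:
    """Toy two-bits-per-page map, packed four pages to a byte."""

    def __init__(self, n_pages):
        self.n_pages = n_pages
        n_bytes = (n_pages + PAGES_PER_BYTE - 1) // PAGES_PER_BYTE
        self.bits = bytearray(n_bytes)

    def _locate(self, page):
        """Return (byte index, bit shift) for a page number."""
        if not 0 <= page < self.n_pages:
            raise IndexError(page)
        return page // PAGES_PER_BYTE, (page % PAGES_PER_BYTE) * 2

    def set(self, page, flag):
        byte, shift = self._locate(page)
        self.bits[byte] |= flag << shift

    def clear(self, page, flag):
        byte, shift = self._locate(page)
        self.bits[byte] &= ~(flag << shift) & 0xFF

    def test(self, page, flag):
        byte, shift = self._locate(page)
        return bool(self.bits[byte] & (flag << shift))

    def pages_with(self, flag):
        """Pages a vacuum of the given kind would have to visit."""
        return [p for p in range(self.n_pages) if self.test(p, flag)]
```

With such a map, an XID wraparound vacuum would only visit `pages_with(TO_BE_FROZEN)` instead of scanning the whole table, which is why a lost bit would be a correctness problem and not just a missed optimization.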

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





Re: [HACKERS] Aggressive freezing in lazy-vacuum

2007-03-06 Thread ITAGAKI Takahiro

Tom Lane wrote:

> I said nothing about expired tuples.  The point of not freezing is to
> preserve information about the insertion time of live tuples.

I don't know what good it will do -- for debugging?
Why don't you use CURRENT_TIMESTAMP?


> And your
> test case is unconvincing, because no sane DBA would run with such a
> small value of vacuum_freeze_min_age.

I chose that value deliberately, to run the test in accelerated time.
In normal use the costs of freezing are spread over a long period,
but we still pay them bit by bit.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center





Re: [HACKERS] Aggressive freezing in lazy-vacuum

2007-03-06 Thread Tom Lane
ITAGAKI Takahiro writes:
> Tom Lane wrote:
>> I said nothing about expired tuples.  The point of not freezing is to
>> preserve information about the insertion time of live tuples.
>
> I don't know what good it will do -- for debugging?

Exactly.  As an example, I've been chasing offline a report from Merlin
Moncure about duplicate entries in a unique index; I still don't know
what exactly is going on there, but the availability of knowledge about
which transactions inserted which entries has been really helpful.  If
we had a system designed to freeze tuples as soon as possible, that info
would have been gone forever pretty soon after the problem happened.

I don't say that this behavior can never be acceptable, but you need
much more than a marginal performance improvement to convince me that
it's worth the loss of forensic information.

regards, tom lane



Re: [HACKERS] Aggressive freezing in lazy-vacuum

2007-03-05 Thread Tom Lane
ITAGAKI Takahiro writes:
> This is a stand-alone patch for aggressive freezing. I'll propose
> to use OldestXmin instead of FreezeLimit as the freeze threshold
> in the circumstances below:

I think it's a really bad idea to freeze that aggressively under any
circumstances except being told to (ie, VACUUM FREEZE).  When you
freeze, you lose history information that might be needed later --- for
forensic purposes if nothing else.  You need to show a fairly amazing
performance gain to justify that, and I don't think you can.

regards, tom lane



Re: [HACKERS] Aggressive freezing in lazy-vacuum

2007-03-05 Thread Florian G. Pflug

Tom Lane wrote:
> ITAGAKI Takahiro writes:
>> This is a stand-alone patch for aggressive freezing. I'll propose
>> to use OldestXmin instead of FreezeLimit as the freeze threshold
>> in the circumstances below:
>
> I think it's a really bad idea to freeze that aggressively under any
> circumstances except being told to (ie, VACUUM FREEZE).  When you
> freeze, you lose history information that might be needed later --- for
> forensic purposes if nothing else.  You need to show a fairly amazing
> performance gain to justify that, and I don't think you can.

There could be a GUC vacuum_freeze_limit, and the actual FreezeLimit 
would be calculated as

GetOldestXmin() - vacuum_freeze_limit

The default for vacuum_freeze_limit would be MaxTransactionId/2, just
as it is now.
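
The calculation proposed above could be sketched as follows. This is a Python model, not PostgreSQL code: vacuum_freeze_limit is the GUC suggested in this message, not an existing setting, and the function names are hypothetical. It assumes 32-bit XIDs that wrap around, with values below 3 reserved as special XIDs, so the subtraction is done modulo 2^32 and a result that lands on a reserved value is bumped to the first normal XID.

```python
XID_SPACE = 2 ** 32                  # XIDs are 32-bit and wrap around
FIRST_NORMAL_XID = 3                 # 0, 1 and 2 are reserved special XIDs
MAX_TRANSACTION_ID = XID_SPACE - 1
# Default suggested in the message for the proposed setting.
DEFAULT_VACUUM_FREEZE_LIMIT = MAX_TRANSACTION_ID // 2


def compute_freeze_limit(oldest_xmin,
                         vacuum_freeze_limit=DEFAULT_VACUUM_FREEZE_LIMIT):
    """FreezeLimit = GetOldestXmin() - vacuum_freeze_limit, modulo 2^32."""
    limit = (oldest_xmin - vacuum_freeze_limit) % XID_SPACE
    if limit < FIRST_NORMAL_XID:
        # Never return a reserved XID as the limit.
        limit = FIRST_NORMAL_XID
    return limit


def xid_precedes(a, b):
    """True if normal XID `a` is logically older than normal XID `b`.

    Mirrors a signed 32-bit comparison of (a - b): `a` is older when
    the wrapped difference falls in the upper half of the XID space.
    """
    return (a - b) % XID_SPACE >= 2 ** 31
```

For example, `compute_freeze_limit(1000, 100)` gives 900, and a setting of 0 would make the limit equal to OldestXmin, which is the aggressive behaviour discussed in this thread.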

greetings, Florian Pflug
