Re: [HACKERS] Aggressive freezing in lazy-vacuum
Tom Lane <[EMAIL PROTECTED]> wrote:
>> This is a stand-alone patch for aggressive freezing. I propose to use
>> OldestXmin instead of FreezeLimit as the freeze threshold in the
>> circumstances below:
> I think it's a really bad idea to freeze that aggressively under any
> circumstances except being told to (ie, VACUUM FREEZE). When you freeze,
> you lose history information that might be needed later --- for forensic
> purposes if nothing else.

I don't think we can provide such historical-database functionality here,
because we can guarantee it only for INSERTed tuples, even if we are careful.
We've already enabled autovacuum by default, so we cannot predict when the
next vacuum will start; recently UPDATEd and DELETEd tuples are removed at
unpredictable times. Furthermore, HOT will also accelerate the removal of
expired tuples. Instead, we had better use WAL or something like audit logs
to keep history information.

> You need to show a fairly amazing performance gain to justify that, and
> I don't think you can.

Thank you for your advice. I found that aggressive freezing of already-dirty
pages made things worse, but freezing pages that contain other tuples being
frozen, or dead tuples, was useful.

I ran an accelerated test of XID-wraparound vacuum. I initialized the
database with

  $ ./pgbench -i -s100
  # VACUUM FREEZE accounts;
  # SET vacuum_freeze_min_age = 6;

and repeated the following queries:

  CHECKPOINT;
  UPDATE accounts SET aid = aid WHERE random() < 0.005;
  SELECT count(*) FROM accounts WHERE xmin <> 2;
  VACUUM accounts;

After the freeze threshold reached vacuum_freeze_min_age (run >= 3), VACUUM
became faster with aggressive freezing. I think this comes from piggybacking
multiple freezing operations -- the number of unfrozen tuples was kept lower.
* Durations of VACUUM [sec]

   run |  HEAD | freeze
  -----+-------+-------
     1 |   5.8 |    8.2
     2 |   5.2 |    9.0
     3 | 118.2 |  102.0
     4 | 122.4 |   99.8
     5 | 121.0 |   79.8
     6 | 122.1 |   77.9
     7 | 123.8 |  115.5
  -----+-------+-------
   avg | 121.5 |   95.0   (runs 3-7)

* Numbers of unfrozen tuples

   run |  HEAD  | freeze
  -----+--------+--------
     1 |  50081 |  50434
     2 |  99836 | 100072
     3 | 100047 |  86484
     4 | 100061 |  86524
     5 |  99766 |  87046
     6 |  99854 |  86824
     7 |  99502 |  86595
  -----+--------+--------
   avg |  99846 |  86695   (runs 3-7)

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

---(end of broadcast)---
TIP 4: Have you searched our list archives?
       http://archives.postgresql.org
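The piggybacking effect measured above can be illustrated with a toy
simulation (a sketch only: the model, parameter values, and function names
below are invented for illustration and are far smaller than the pgbench
run):

```python
import random

def run(piggyback, rounds=7, n_pages=200, per_page=50,
        freeze_min_age=3, update_frac=0.01, seed=1):
    # age[p][i] = rounds since tuple i on page p was last written; None = frozen
    rng = random.Random(seed)
    age = [[0] * per_page for _ in range(n_pages)]
    for _ in range(rounds):
        for page in age:                          # one more "transaction" passes
            for i, a in enumerate(page):
                if a is not None:
                    page[i] = a + 1
        dirty = set()                             # UPDATE a random fraction
        for _ in range(int(n_pages * per_page * update_frac)):
            p = rng.randrange(n_pages)
            age[p][rng.randrange(per_page)] = 0   # rewritten tuple is young again
            dirty.add(p)
        for p, page in enumerate(age):            # VACUUM
            to_freeze = [i for i, a in enumerate(page)
                         if a is not None and a > freeze_min_age]
            if piggyback and (to_freeze or p in dirty):
                # aggressive: the page is written anyway, so freeze everything
                to_freeze = [i for i, a in enumerate(page)
                             if a is not None and a > 0]
            for i in to_freeze:
                page[i] = None
    return sum(a is not None for page in age for a in page)

head = run(piggyback=False)
aggressive = run(piggyback=True)
# aggressive freezing should leave no more unfrozen tuples than HEAD
assert aggressive <= head
```

With identical update patterns (same seed), the piggybacking run can only
freeze a superset of what the plain run freezes, which mirrors the lower
unfrozen-tuple counts in the table above.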
Re: [HACKERS] Aggressive freezing in lazy-vacuum
ITAGAKI Takahiro <[EMAIL PROTECTED]> writes:
> I don't think we can provide such historical-database functionality here,
> because we can guarantee it only for INSERTed tuples, even if we are
> careful. We've already enabled autovacuum by default, so we cannot predict
> when the next vacuum will start; recently UPDATEd and DELETEd tuples are
> removed at unpredictable times. Furthermore, HOT will also accelerate the
> removal of expired tuples. Instead, we had better use WAL or something
> like audit logs to keep history information.

Well, comparing the data to the WAL is precisely the kind of debugging that
I think Tom is concerned with.

The hoped-for gain here is that vacuum finds fewer pages with tuples that
exceed vacuum_freeze_min_age? That seems useful, though vacuum is still
going to have to read every page, and I suspect most of the writes pertain
to dead tuples, not to freezing tuples.

This strikes me as something that will be more useful once we have the DSM,
especially if it ends up including a frozen map. Once we have the DSM,
vacuum will no longer visit every page, so it will be much easier for pages
to get quite old and only be caught by a vacuum freeze. The less I/O that
vacuum freeze has to do, the better. If we get a freeze map, then aggressive
freezing would help keep pages out of that map so they never need to be
vacuumed just to freeze the tuples in them.

--
Gregory Stark
EnterpriseDB          http://www.enterprisedb.com
Re: [HACKERS] Aggressive freezing in lazy-vacuum
ITAGAKI Takahiro <[EMAIL PROTECTED]> writes:
> Tom Lane <[EMAIL PROTECTED]> wrote:
>> I think it's a really bad idea to freeze that aggressively under any
>> circumstances except being told to (ie, VACUUM FREEZE). When you freeze,
>> you lose history information that might be needed later --- for forensic
>> purposes if nothing else.
> I don't think we can provide such historical-database functionality here,
> because we can guarantee it only for INSERTed tuples, even if we are
> careful. We've already enabled autovacuum by default, so we cannot predict
> when the next vacuum will start; recently UPDATEd and DELETEd tuples are
> removed at unpredictable times.

I said nothing about expired tuples. The point of not freezing is to
preserve information about the insertion time of live tuples.

And your test case is unconvincing, because no sane DBA would run with such
a small value of vacuum_freeze_min_age.

			regards, tom lane
Re: [HACKERS] Aggressive freezing in lazy-vacuum
Gregory Stark <[EMAIL PROTECTED]> wrote:
> The hoped-for gain here is that vacuum finds fewer pages with tuples that
> exceed vacuum_freeze_min_age? That seems useful, though vacuum is still
> going to have to read every page, and I suspect most of the writes pertain
> to dead tuples, not to freezing tuples.

Yes. VACUUM dirties pages only to freeze over-threshold tuples in particular
cases, and I think we can reduce those writes by keeping the number of
unfrozen tuples low.

There are three additional costs in FREEZE:
  1. CPU cost for changing the xids of the target tuples.
  2. Write cost for the WAL entries of FREEZE (log_heap_freeze).
  3. Write cost for newly created dirty pages.

I do the additional freezing in the following two cases. We will already
have created dirty buffers and WAL entries for the required operations then,
so I think the additional costs 2 and 3 are negligible, though 1 still
affects us.

| - There is another tuple to be frozen in the same page.
| - There are other dead tuples in the same page.
|   (Freezing is delayed until the heap vacuum phase.)

> This strikes me as something that will be more useful once we have the
> DSM, especially if it ends up including a frozen map. [...] If we get a
> freeze map, then aggressive freezing would help keep pages out of that
> map so they never need to be vacuumed just to freeze the tuples in them.

Yeah, I was planning a 2-bits/page DSM for exactly that purpose: one bit
means to-be-vacuumed and the other means to-be-frozen. It would help us
avoid full scans of the table for XID-wraparound vacuums, but the DSM must
be reliable and must not lose any information. I have made an attempt to
accomplish this in the DSM, but I understand I need to demonstrate to you
that it works as designed.
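The 2-bits/page idea can be sketched as follows (a hypothetical sketch: the
class and method names are invented, and the real DSM proposal would keep
the map in shared memory and WAL-protect it, not hold it in a Python
object):

```python
TO_VACUUM = 0b01   # page contains dead tuples to remove
TO_FREEZE = 0b10   # page contains old, unfrozen tuples

class DeadSpaceMap:
    # One byte per page here for clarity; a real map would pack 2 bits/page.
    def __init__(self, n_pages):
        self.bits = bytearray(n_pages)

    def mark(self, page, flag):
        self.bits[page] |= flag

    def clear(self, page):              # called after the page is cleaned
        self.bits[page] = 0

    def pages_to_scan(self, wraparound=False):
        # A normal vacuum visits only TO_VACUUM pages; a wraparound vacuum
        # must also visit TO_FREEZE pages, but still skips clean frozen ones.
        mask = TO_VACUUM | TO_FREEZE if wraparound else TO_VACUUM
        return [p for p, b in enumerate(self.bits) if b & mask]

dsm = DeadSpaceMap(8)
dsm.mark(2, TO_VACUUM)
dsm.mark(5, TO_FREEZE)
assert dsm.pages_to_scan() == [2]
assert dsm.pages_to_scan(wraparound=True) == [2, 5]
```

The point of the second bit is visible in `pages_to_scan`: a wraparound
vacuum no longer needs to read every page, only those still carrying
unfrozen tuples, which is why keeping pages out of the map matters.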
Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
Re: [HACKERS] Aggressive freezing in lazy-vacuum
Tom Lane <[EMAIL PROTECTED]> wrote:
> I said nothing about expired tuples. The point of not freezing is to
> preserve information about the insertion time of live tuples.

I don't know what good that will do -- for debugging? Why don't you use
CURRENT_TIMESTAMP?

> And your test case is unconvincing, because no sane DBA would run with
> such a small value of vacuum_freeze_min_age.

I intended that value for an accelerated test. In normal use the penalties
of freezing are spread over the long term, but we surely suffer them bit by
bit.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
Re: [HACKERS] Aggressive freezing in lazy-vacuum
ITAGAKI Takahiro <[EMAIL PROTECTED]> writes:
> Tom Lane <[EMAIL PROTECTED]> wrote:
>> I said nothing about expired tuples. The point of not freezing is to
>> preserve information about the insertion time of live tuples.
> I don't know what good that will do -- for debugging?

Exactly. As an example, I've been chasing (offline) a report from Merlin
Moncure about duplicate entries in a unique index. I still don't know
exactly what is going on there, but the availability of knowledge about
which transactions inserted which entries has been really helpful. If we
had a system designed to freeze tuples as soon as possible, that info would
have been gone forever pretty soon after the problem happened.

I don't say that this behavior can never be acceptable, but you need much
more than a marginal performance improvement to convince me that it's worth
the loss of forensic information.

			regards, tom lane
[HACKERS] Aggressive freezing in lazy-vacuum
Jim C. Nasby <[EMAIL PROTECTED]> wrote:
>> * Aggressive freezing: use OldestXmin as the threshold to freeze tuples
>> in dirty pages or pages that have some dead tuples. Otherwise, many
>> UNFROZEN pages will remain after vacuum, and they will cost us in the
>> next vacuum preventing XID wraparound.
> Another good idea. If it's not too invasive I'd love to see that as a
> stand-alone patch so that we know it can get in.

This is a stand-alone patch for aggressive freezing. I propose to use
OldestXmin instead of FreezeLimit as the freeze threshold in the
circumstances below:

- The page is already dirty.
- There is another tuple to be frozen in the same page.
- There are other dead tuples in the same page.
  (Freezing is delayed until the heap vacuum phase.)

In these cases we create new dirty buffers and/or write WAL anyway, so the
additional freezing is almost free. By keeping the number of unfrozen
tuples low, we can reduce the cost of the next XID-wraparound vacuum and
piggyback multiple freezing operations on the same page.

The following test shows the difference in the number of unfrozen tuples
with and without the patch. Without the patch, recently inserted tuples are
not frozen immediately (1), and even if there are some dead tuples in the
same page, unfrozen live tuples are not frozen (2). With the patch, the
count after the first vacuum is already low (3), because the pages
containing recently inserted tuples were dirty and not yet written out, so
aggressive freezing was applied to them. Moreover, if there are dead tuples
in a page, the other live tuples in the same page are also frozen (4).
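A minimal sketch of the threshold rule above (simplified: the real patch
works on heap pages inside lazy vacuum, in C; the function names and plain
integer xids here are illustrative, and real xid comparisons must use the
wraparound-aware TransactionIdPrecedes()):

```python
def freeze_threshold(page_is_dirty, has_tuple_past_limit, has_dead_tuples,
                     oldest_xmin, freeze_limit):
    # The aggressive threshold applies when the page will be dirtied and
    # WAL-logged anyway, so the extra freezing is almost free.
    if page_is_dirty or has_tuple_past_limit or has_dead_tuples:
        return oldest_xmin    # freeze everything no snapshot can still see
    return freeze_limit       # normal rule: freeze only very old tuples

def tuples_to_freeze(xmins, threshold):
    # simplified: plain integer comparison stands in for TransactionIdPrecedes()
    return [x for x in xmins if x < threshold]

# On a dirty page, every committed tuple older than OldestXmin is frozen:
t = freeze_threshold(True, False, False, oldest_xmin=1000, freeze_limit=900)
assert tuples_to_freeze([850, 950, 990], t) == [850, 950, 990]
```

On a clean page with nothing else to do, the same call returns the normal
FreezeLimit and only the xmin-850 tuple would be frozen, which is the whole
difference the patch makes.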
# CREATE CAST (xid AS integer) WITHOUT FUNCTION AS IMPLICIT;

[without patch]
$ ./pgbench -i -s1   (including vacuum)
# SELECT count(*) FROM accounts WHERE xmin <> 2;    => 100000   (1)
# UPDATE accounts SET aid = aid WHERE aid % 20 = 0; => UPDATE 5000
# SELECT count(*) FROM accounts WHERE xmin <> 2;    => 100000
# VACUUM accounts;
# SELECT count(*) FROM accounts WHERE xmin <> 2;    => 100000   (2)

[with patch]
$ ./pgbench -i -s1   (including vacuum)
# SELECT count(*) FROM accounts WHERE xmin <> 2;    => 2135     (3)
# UPDATE accounts SET aid = aid WHERE aid % 20 = 0; => UPDATE 5000
# SELECT count(*) FROM accounts WHERE xmin <> 2;    => 7028
# VACUUM accounts;
# SELECT count(*) FROM accounts WHERE xmin <> 2;    => 0        (4)

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center

aggressive_freeze.patch
Description: Binary data
Re: [HACKERS] Aggressive freezing in lazy-vacuum
ITAGAKI Takahiro <[EMAIL PROTECTED]> writes:
> This is a stand-alone patch for aggressive freezing. I propose to use
> OldestXmin instead of FreezeLimit as the freeze threshold in the
> circumstances below:

I think it's a really bad idea to freeze that aggressively under any
circumstances except being told to (ie, VACUUM FREEZE). When you freeze,
you lose history information that might be needed later --- for forensic
purposes if nothing else.

You need to show a fairly amazing performance gain to justify that, and I
don't think you can.

			regards, tom lane
Re: [HACKERS] Aggressive freezing in lazy-vacuum
Tom Lane wrote:
> ITAGAKI Takahiro <[EMAIL PROTECTED]> writes:
>> This is a stand-alone patch for aggressive freezing. I propose to use
>> OldestXmin instead of FreezeLimit as the freeze threshold in the
>> circumstances below:
> I think it's a really bad idea to freeze that aggressively under any
> circumstances except being told to (ie, VACUUM FREEZE). [...] You need
> to show a fairly amazing performance gain to justify that, and I don't
> think you can.

There could be a GUC vacuum_freeze_limit, and the actual FreezeLimit would
be calculated as

    GetOldestXmin() - vacuum_freeze_limit

The default for vacuum_freeze_limit would be MaxTransactionId/2, just as it
is now.

greetings, Florian Pflug
Re: [PATCHES] [HACKERS] Aggressive freezing in lazy-vacuum
Florian G. Pflug wrote:
> There could be a GUC vacuum_freeze_limit, and the actual FreezeLimit
> would be calculated as GetOldestXmin() - vacuum_freeze_limit

We already have that: it's called vacuum_freeze_min_age, and the default is
100 million transactions. IIRC we added it late in the 8.2 release cycle,
when we changed the clog truncation point to depend on the freeze limit.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
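How a freeze threshold follows from such a GUC can be sketched like this (a
simplified sketch of the idea, not PostgreSQL's actual vacuum.c code: xid
arithmetic wraps at 2^32 and xids 0-2 are reserved, so the subtraction must
be done modulo 2^32 and must avoid the reserved values):

```python
FIRST_NORMAL_XID = 3       # 0 = Invalid, 1 = Bootstrap, 2 = FrozenTransactionId
XID_MODULUS = 2 ** 32      # xids are 32-bit counters that wrap around

def compute_freeze_limit(oldest_xmin, vacuum_freeze_min_age=100_000_000):
    # Tuples whose xmin precedes this limit get frozen by a plain VACUUM.
    limit = (oldest_xmin - vacuum_freeze_min_age) % XID_MODULUS
    if limit < FIRST_NORMAL_XID:
        limit = FIRST_NORMAL_XID      # never hand out a reserved xid
    return limit

assert compute_freeze_limit(200_000_000) == 100_000_000
```

With the 100-million default, only tuples at least 100 million transactions
old are frozen, which is why forensic xmin information normally survives for
a long time.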
Re: [PATCHES] [HACKERS] Aggressive freezing in lazy-vacuum
Heikki Linnakangas wrote:
> We already have that: it's called vacuum_freeze_min_age, and the default
> is 100 million transactions. IIRC we added it late in the 8.2 release
> cycle, when we changed the clog truncation point to depend on the freeze
> limit.

Ok, that explains why I didn't find it when I checked the source -- I was
looking at the 8.1 sources by accident ;-)

Anyway, thanks for pointing that out ;-)

greetings, Florian Pflug