Re: [HACKERS] VACUUM FULL versus TOAST
On Sun, Aug 14, 2011 at 7:43 PM, Greg Stark <st...@mit.edu> wrote:
> On Sun, Aug 14, 2011 at 5:15 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> There would be some merit in your suggestion if we knew that all/most
>> toasted columns would actually get fetched out of the catcache entry
>> at some point.  Then we'd only be moving the cost around, and might
>> even save something on repeated accesses.  But I don't think we know
>> that.  In the specific example at hand (pg_statistic entries) it's
>> entirely plausible that the planner would only need the histogram, or
>> only need the MCV list, depending on the sorts of queries it was
>> coping with.
>
> Fwiw detoasting statistics entries sounds like a fine idea to me.  I've
> often seen queries that are unexpectedly slow to plan and chalked it up
> to statistics entries getting toasted.  If it's ok to read either the
> histogram or MCV list from disk every time we plan a query, then why
> are we bothering with an in-memory cache of the statistics at all?
>
> The only thing that gives me pause is that it's possible these entries
> are *really* large.  If you have a decent number of tables that are all
> a few megabytes of histograms then things could go poorly.  But I don't
> think having to read in these entries from pg_toast every time you plan
> a query is going to go much better for you either.

Yep; in fact, I've previously submitted test results showing that
repeatedly decompressing TOAST entries can significantly slow down query
planning.  That having been said, Tom's fix seems safer to back-patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] VACUUM FULL versus TOAST
Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> writes:
> On 14.08.2011 01:13, Tom Lane wrote:
>> I am thinking that the most reasonable solution is instead to fix
>> VACUUM FULL/CLUSTER so that they don't change existing toast item OIDs
>> when vacuuming a system catalog.  They already do some pretty ugly
>> things to avoid changing the toast table's OID in this case, and
>> locking down the item OIDs too doesn't seem that much harder.  (Though
>> I've not actually looked at the code yet...)
>
> How about detoasting all datums before caching them?  It's surprising
> that a datum that is supposedly in a catalog cache, actually needs disk
> access to use.

Don't really want to fix a VACUUM-FULL-induced problem by inserting
distributed overhead into every other operation.

There would be some merit in your suggestion if we knew that all/most
toasted columns would actually get fetched out of the catcache entry at
some point.  Then we'd only be moving the cost around, and might even
save something on repeated accesses.  But I don't think we know that.
In the specific example at hand (pg_statistic entries) it's entirely
plausible that the planner would only need the histogram, or only need
the MCV list, depending on the sorts of queries it was coping with.

There's also a concern of bloating the catcaches if we do that ...

			regards, tom lane
Re: [HACKERS] VACUUM FULL versus TOAST
I wrote:
> I am thinking that the most reasonable solution is instead to fix
> VACUUM FULL/CLUSTER so that they don't change existing toast item OIDs
> when vacuuming a system catalog.  They already do some pretty ugly
> things to avoid changing the toast table's OID in this case, and
> locking down the item OIDs too doesn't seem that much harder.  (Though
> I've not actually looked at the code yet...)

Attached is a proposed patch for this.

> The main potential drawback here is that if any varlena items that had
> not previously been toasted got toasted, they would require additional
> OIDs to be assigned, possibly leading to a duplicate-OID failure.  This
> should not happen unless somebody decides to play with the attstorage
> properties of a system catalog, and I don't feel too bad about a small
> possibility of VAC FULL failing after that.  (Note it should eventually
> succeed if you keep trying, since the generated OIDs would keep
> changing.)

I realized that there is an easy fix for that: since tuptoaster.c
already knows what the old toast table OID is, it can just look into
that table to see if each proposed new OID is already in use, and
iterate till it gets a non-conflicting OID.  This may seem kind of
inefficient, but since it's such a corner case, I don't think the code
path will get hit often enough to matter.

Comments?

			regards, tom lane


diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c
index 4f4dd69291fd50008f8e313176a02cd5bc955e08..785c679879012508b4925b4f8ce93e1c712f7ec4 100644
*** a/src/backend/access/heap/tuptoaster.c
--- b/src/backend/access/heap/tuptoaster.c
*************** do { \
*** 74,80 ****
  static void toast_delete_datum(Relation rel, Datum value);
! static Datum toast_save_datum(Relation rel, Datum value, int options);
  static struct varlena *toast_fetch_datum(struct varlena * attr);
  static struct varlena *toast_fetch_datum_slice(struct varlena * attr,
  					int32 sliceoffset, int32 length);
--- 74,82 ----
  static void toast_delete_datum(Relation rel, Datum value);
! static Datum toast_save_datum(Relation rel, Datum value,
! 			  struct varlena *oldexternal, int options);
! static bool toast_valueid_exists(Oid toastrelid, Oid valueid);
  static struct varlena *toast_fetch_datum(struct varlena * attr);
  static struct varlena *toast_fetch_datum_slice(struct varlena * attr,
  					int32 sliceoffset, int32 length);
*************** toast_insert_or_update(Relation rel, Hea
*** 431,436 ****
--- 433,439 ----
  	bool		toast_oldisnull[MaxHeapAttributeNumber];
  	Datum		toast_values[MaxHeapAttributeNumber];
  	Datum		toast_oldvalues[MaxHeapAttributeNumber];
+ 	struct varlena *toast_oldexternal[MaxHeapAttributeNumber];
  	int32		toast_sizes[MaxHeapAttributeNumber];
  	bool		toast_free[MaxHeapAttributeNumber];
  	bool		toast_delold[MaxHeapAttributeNumber];
*************** toast_insert_or_update(Relation rel, Hea
*** 466,471 ****
--- 469,475 ----
  	 * ----------
  	 */
  	memset(toast_action, ' ', numAttrs * sizeof(char));
+ 	memset(toast_oldexternal, 0, numAttrs * sizeof(struct varlena *));
  	memset(toast_free, 0, numAttrs * sizeof(bool));
  	memset(toast_delold, 0, numAttrs * sizeof(bool));
*************** toast_insert_or_update(Relation rel, Hea
*** 550,555 ****
--- 554,560 ----
  			 */
  			if (VARATT_IS_EXTERNAL(new_value))
  			{
+ 				toast_oldexternal[i] = new_value;
  				if (att[i]->attstorage == 'p')
  					new_value = heap_tuple_untoast_attr(new_value);
  				else
*************** toast_insert_or_update(Relation rel, Hea
*** 676,682 ****
  		{
  			old_value = toast_values[i];
  			toast_action[i] = 'p';
! 			toast_values[i] = toast_save_datum(rel, toast_values[i], options);
  			if (toast_free[i])
  				pfree(DatumGetPointer(old_value));
  			toast_free[i] = true;
--- 681,688 ----
  		{
  			old_value = toast_values[i];
  			toast_action[i] = 'p';
! 			toast_values[i] = toast_save_datum(rel, toast_values[i],
! 							   toast_oldexternal[i], options);
  			if (toast_free[i])
  				pfree(DatumGetPointer(old_value));
  			toast_free[i] = true;
*************** toast_insert_or_update(Relation rel, Hea
*** 726,732 ****
  		i = biggest_attno;
  		old_value = toast_values[i];
  		toast_action[i] = 'p';
! 		toast_values[i] = toast_save_datum(rel, toast_values[i], options);
  		if (toast_free[i])
  			pfree(DatumGetPointer(old_value));
  		toast_free[i] = true;
--- 732,739 ----
  		i = biggest_attno;
  		old_value = toast_values[i];
  		toast_action[i] = 'p';
! 		toast_values[i] = toast_save_datum(rel, toast_values[i],
! 						   toast_oldexternal[i], options);
  		if (toast_free[i])
  			pfree(DatumGetPointer(old_value));
  		toast_free[i] = true;
*************** toast_insert_or_update(Relation rel, Hea
*** 839,845 ****
  		i = biggest_attno;
  		old_value = toast_values[i];
  		toast_action[i] = 'p';
! 		toast_values[i] = toast_save_datum(rel, toast_values[i], options);
  		if (toast_free[i])
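
[Editorial sketch, not part of the thread or the patch: a minimal, self-contained
illustration of the retry loop Tom describes above -- keep drawing candidate OIDs
until one is found that is not already present in the old toast table.
pick_candidate_oid() and value_exists_in_old_toast_table() are hypothetical
stand-ins for the backend's OID generator and for the index probe that the new
toast_valueid_exists() performs in the real patch.]

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t Oid;

/* Hypothetical stand-in for the value OIDs already used in the old toast
 * table; the real code probes the toast table's index instead of an array. */
static const Oid old_toast_valueids[] = {500001, 500002, 500003};

static int
value_exists_in_old_toast_table(Oid candidate)
{
    for (size_t i = 0; i < sizeof(old_toast_valueids) / sizeof(Oid); i++)
        if (old_toast_valueids[i] == candidate)
            return 1;
    return 0;
}

/* Hypothetical OID generator; deliberately starts on a colliding value so
 * the loop below actually iterates once. */
static Oid
pick_candidate_oid(void)
{
    static Oid next = 500002;
    return next++;
}

int
main(void)
{
    Oid     new_valueid;

    /* Iterate until the candidate OID does not conflict with an existing one */
    do
        new_valueid = pick_candidate_oid();
    while (value_exists_in_old_toast_table(new_valueid));

    printf("assigned non-conflicting toast value OID %u\n", new_valueid);
    return 0;
}

Since conflicts can only arise when a previously untoasted value becomes
toasted (the attstorage corner case above), the loop terminates almost
immediately in practice.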
Re: [HACKERS] VACUUM FULL versus TOAST
On Sun, Aug 14, 2011 at 5:15 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> There would be some merit in your suggestion if we knew that all/most
> toasted columns would actually get fetched out of the catcache entry at
> some point.  Then we'd only be moving the cost around, and might even
> save something on repeated accesses.  But I don't think we know that.
> In the specific example at hand (pg_statistic entries) it's entirely
> plausible that the planner would only need the histogram, or only need
> the MCV list, depending on the sorts of queries it was coping with.

Fwiw detoasting statistics entries sounds like a fine idea to me.  I've
often seen queries that are unexpectedly slow to plan and chalked it up
to statistics entries getting toasted.  If it's ok to read either the
histogram or MCV list from disk every time we plan a query, then why are
we bothering with an in-memory cache of the statistics at all?

The only thing that gives me pause is that it's possible these entries
are *really* large.  If you have a decent number of tables that are all
a few megabytes of histograms then things could go poorly.  But I don't
think having to read in these entries from pg_toast every time you plan
a query is going to go much better for you either.

-- 
greg
[HACKERS] VACUUM FULL versus TOAST
So I've gotten things fixed to the point where the regression tests seem
to not fall over when contending with concurrent vacuum full pg_class,
and now expanded the scope of the testing to all the system catalogs.
What's failing for me now is this chunk in opr_sanity:

*** 209,219 ****
      NOT p1.proisagg AND NOT p2.proisagg AND
      (p1.proargtypes[3] < p2.proargtypes[3])
  ORDER BY 1, 2;
!  proargtypes | proargtypes 
! -------------+-------------
!         1114 |        1184
! (1 row)
! 
  SELECT DISTINCT p1.proargtypes[4], p2.proargtypes[4]
  FROM pg_proc AS p1, pg_proc AS p2
  WHERE p1.oid != p2.oid AND
--- 209,215 ----
      NOT p1.proisagg AND NOT p2.proisagg AND
      (p1.proargtypes[3] < p2.proargtypes[3])
  ORDER BY 1, 2;
! ERROR:  missing chunk number 0 for toast value 23902886 in pg_toast_2619
  SELECT DISTINCT p1.proargtypes[4], p2.proargtypes[4]
  FROM pg_proc AS p1, pg_proc AS p2
  WHERE p1.oid != p2.oid AND

On investigation, this turns out to occur when the planner is trying to
fetch the value of a toasted attribute in a cached pg_statistic tuple,
and a concurrent vacuum full pg_statistic has just finished.  The problem
of course is that vacuum full reassigned all the toast item OIDs in
pg_statistic, so the one we have our hands on is no longer correct.

In general, *any* access to a potentially toasted attribute value in a
catcache entry is at risk here.  I don't think it's going to be feasible,
either from a notational or efficiency standpoint, to insist that callers
always re-lock the source catalog before fetching a catcache entry from
which we might wish to extract a potentially toasted attribute.

I am thinking that the most reasonable solution is instead to fix VACUUM
FULL/CLUSTER so that they don't change existing toast item OIDs when
vacuuming a system catalog.  They already do some pretty ugly things to
avoid changing the toast table's OID in this case, and locking down the
item OIDs too doesn't seem that much harder.  (Though I've not actually
looked at the code yet...)

The main potential drawback here is that if any varlena items that had
not previously been toasted got toasted, they would require additional
OIDs to be assigned, possibly leading to a duplicate-OID failure.  This
should not happen unless somebody decides to play with the attstorage
properties of a system catalog, and I don't feel too bad about a small
possibility of VAC FULL failing after that.  (Note it should eventually
succeed if you keep trying, since the generated OIDs would keep changing.)

Thoughts?

			regards, tom lane
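
[Editorial sketch, not PostgreSQL source: a toy model of the failure mode
described above.  The struct mirrors the fields of the varatt_external TOAST
pointer of this era (raw size, external size, value OID, toast relation OID);
the in-memory "toast table", the lookup function, and the OID 9999 used for
pg_toast_2619's relation OID are all hypothetical stand-ins.  The point is
only that the cached pointer identifies the value by OID, so once VACUUM FULL
reassigns those OIDs the cached copy dangles.]

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t Oid;

/* Mirrors the layout of the external TOAST pointer kept in a cached tuple */
typedef struct varatt_external
{
    int32_t     va_rawsize;     /* original data size */
    int32_t     va_extsize;     /* size as stored in the toast table */
    Oid         va_valueid;     /* identifies the value within the toast table */
    Oid         va_toastrelid;  /* OID of the toast table holding it */
} varatt_external;

/* Hypothetical stand-in for the value OIDs present in pg_toast_2619 after
 * "vacuum full pg_statistic" rewrote it with freshly assigned OIDs. */
static const Oid rewritten_valueids[] = {23999001, 23999002, 23999003};

static int
value_present(Oid valueid)
{
    for (size_t i = 0; i < sizeof(rewritten_valueids) / sizeof(Oid); i++)
        if (rewritten_valueids[i] == valueid)
            return 1;
    return 0;
}

int
main(void)
{
    /* TOAST pointer copied into the catcache before the vacuum ran;
     * 9999 is a placeholder for pg_toast_2619's relation OID. */
    varatt_external cached = {8192, 2048, 23902886, 9999};

    /* The planner now tries to detoast the cached attribute ... */
    if (!value_present(cached.va_valueid))
        printf("ERROR:  missing chunk number 0 for toast value %u in pg_toast_2619\n",
               cached.va_valueid);
    return 0;
}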
Re: [HACKERS] VACUUM FULL versus TOAST
On 14.08.2011 01:13, Tom Lane wrote:
> On investigation, this turns out to occur when the planner is trying to
> fetch the value of a toasted attribute in a cached pg_statistic tuple,
> and a concurrent vacuum full pg_statistic has just finished.  The
> problem of course is that vacuum full reassigned all the toast item
> OIDs in pg_statistic, so the one we have our hands on is no longer
> correct.
>
> In general, *any* access to a potentially toasted attribute value in a
> catcache entry is at risk here.  I don't think it's going to be
> feasible, either from a notational or efficiency standpoint, to insist
> that callers always re-lock the source catalog before fetching a
> catcache entry from which we might wish to extract a potentially
> toasted attribute.
>
> I am thinking that the most reasonable solution is instead to fix
> VACUUM FULL/CLUSTER so that they don't change existing toast item OIDs
> when vacuuming a system catalog.  They already do some pretty ugly
> things to avoid changing the toast table's OID in this case, and
> locking down the item OIDs too doesn't seem that much harder.  (Though
> I've not actually looked at the code yet...)

How about detoasting all datums before caching them?  It's surprising
that a datum that is supposedly in a catalog cache, actually needs disk
access to use.

-- 
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
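
[Editorial sketch, not existing PostgreSQL code: a rough picture of what the
alternative Heikki suggests would mean.  is_external and
fetch_from_toast_table() are hypothetical stand-ins for VARATT_IS_EXTERNAL()
and the tuptoaster fetch path.  The idea is to expand any externally stored
attribute while the tuple is copied into the catalog cache, so later lookups
never touch the toast table -- at the price of doing that work, and holding
the expanded data, whether or not it is ever used (the distributed overhead
and catcache bloat Tom pushes back on in his reply).]

#include <stdio.h>

typedef unsigned int Oid;

/* Toy attribute: either carries its data inline or points at external storage */
typedef struct
{
    int         is_external;    /* stand-in for VARATT_IS_EXTERNAL() */
    const char *inline_data;    /* valid when is_external == 0 */
    Oid         toast_valueid;  /* valid when is_external != 0 */
} ToyDatum;

/* Hypothetical stand-in for the tuptoaster fetch path (goes to disk) */
static const char *
fetch_from_toast_table(Oid valueid)
{
    printf("(reading toast value %u from disk)\n", valueid);
    return "expanded histogram data";
}

/* Expand an externally stored datum while building the cache entry,
 * so later cache hits never need disk access. */
static ToyDatum
detoast_for_cache(ToyDatum d)
{
    if (d.is_external)
    {
        d.inline_data = fetch_from_toast_table(d.toast_valueid);
        d.is_external = 0;      /* the cached copy is now self-contained */
    }
    return d;
}

int
main(void)
{
    ToyDatum    on_disk = {1, NULL, 23902886};
    ToyDatum    cached = detoast_for_cache(on_disk);    /* cost paid up front */

    /* Every later planner lookup reads the cached copy directly */
    printf("cached value: %s\n", cached.inline_data);
    return 0;
}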