Re: [HACKERS] To Do wiki
On 10.04.2012 03:32, Jeff Janes wrote: The To Do wiki says not to add things to the page with discussing here. So here are some things to discuss. Assuming the discussion is a brief yup or nope, it seems to make sense to lump them into one email: Vacuuming a table with a large GIN index is painfully slow, because the index is vacuumed in logical order not physical order. Is making a vacuum in physical order a to-do? Does this belong to vacuuming, or to GIN indexing? Looking at the complexity of how this was done for btree index, I would say this is far from easy. I wonder if there is an easier way that is still good enough, for example every time you split a page, check to see if a vacuum is in the index, and if so only move tuples physically rightward. If the table is so active that there is essentially always a vacuum in the index, this could lead to bloat. But if the table is that large and active, under the current non-physical order the vacuum would likely take approximately forever to finish and so the bloat would be just as bad under that existing system. Yup, seems like a todo. It doesn't sound like a good idea to force tuples to be moved right when a vacuum is in progress, that could lead to bloating, but it should be feasible to implement the same cycleid-mechanism in gin that we did in b-tree. Speed up COUNT(*) is marked as done. While index-only-scans should speed this up in certain cases, it is nothing compared to the speed up that could be obtained by use a fixed row count and a +/- count to follow MVCC visibility rules, and that speed-up is the one people used to MyISAM are expecting. We might not want to actually implement the fixed row count +/- MVCC count idea, but we probably shouldn't mark the whole thing as done because just one approach to it was implemented. I think the way we'd speed up COUNT(*) further would be to implement materialized views. Then you could define a materialized view on COUNT(*), and essentially get a row counter similar to MyISAM. I think it's fair to mark this as done. sort_support was implemented for plain tuple sorting only, To Do is extend to index-creation sorts (item 2 from message 1698.1323222...@sss.pgh.pa.us) Index-creation sorts are already handled, Tom is referring to using the new comparator API for index searches in that email. The change would go to _bt_compare(). -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] To Do wiki
On Tue, Apr 10, 2012 at 7:27 AM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: I think the way we'd speed up COUNT(*) further would be to implement materialized views. Then you could define a materialized view on COUNT(*), and essentially get a row counter similar to MyISAM. I think it's fair to mark this as done. If only because it comes up so frequently it would be good to have this noted in the TODO, either under materialized views or as a pointer to them. A good materialized views implementation including automatically determining what delta data to keep sure would be nice to have. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] To Do wiki
On Mon, Apr 9, 2012 at 11:27 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 10.04.2012 03:32, Jeff Janes wrote: The To Do wiki says not to add things to the page with discussing here. ... sort_support was implemented for plain tuple sorting only, To Do is extend to index-creation sorts (item 2 from message 1698.1323222...@sss.pgh.pa.us) Index-creation sorts are already handled, Tom is referring to using the new comparator API for index searches in that email. The change would go to _bt_compare(). If I do select count(distinct bid) from pgbench_accounts I get many calls to btint4fastcmp, but if I do create index on pgbench_accounts (bid) I instead get many calls to btint4cmp. If the index build is using SortSupport, shouldn't it also be calling btint4fastcmp like the distinct does? Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] To Do wiki
On 10.04.2012 18:31, Jeff Janes wrote: On Mon, Apr 9, 2012 at 11:27 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 10.04.2012 03:32, Jeff Janes wrote: The To Do wiki says not to add things to the page with discussing here. ... sort_support was implemented for plain tuple sorting only, To Do is extend to index-creation sorts (item 2 from message 1698.1323222...@sss.pgh.pa.us) Index-creation sorts are already handled, Tom is referring to using the new comparator API for index searches in that email. The change would go to _bt_compare(). If I do select count(distinct bid) from pgbench_accounts I get many calls to btint4fastcmp, but if I do create index on pgbench_accounts (bid) I instead get many calls to btint4cmp. If the index build is using SortSupport, shouldn't it also be calling btint4fastcmp like the distinct does? Oh, sorry, you're right. I stand corrected. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] To Do wiki
On 10 April 2012 16:40, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 10.04.2012 18:31, Jeff Janes wrote: If I do select count(distinct bid) from pgbench_accounts I get many calls to btint4fastcmp, but if I do create index on pgbench_accounts (bid) I instead get many calls to btint4cmp. If the index build is using SortSupport, shouldn't it also be calling btint4fastcmp like the distinct does? Oh, sorry, you're right. I stand corrected. There is an impedance mismatch between tuplesort_begin_heap and tuplesort_begin_index_btree that prevented this from being done with the initial commit. Strangely, the SortSupport commit message didn't comment on this. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] To Do wiki
The To Do wiki says not to add things to the page with discussing here. So here are some things to discuss. Assuming the discussion is a brief yup or nope, it seems to make sense to lump them into one email: Vacuuming a table with a large GIN index is painfully slow, because the index is vacuumed in logical order not physical order. Is making a vacuum in physical order a to-do? Does this belong to vacuuming, or to GIN indexing? Looking at the complexity of how this was done for btree index, I would say this is far from easy. I wonder if there is an easier way that is still good enough, for example every time you split a page, check to see if a vacuum is in the index, and if so only move tuples physically rightward. If the table is so active that there is essentially always a vacuum in the index, this could lead to bloat. But if the table is that large and active, under the current non-physical order the vacuum would likely take approximately forever to finish and so the bloat would be just as bad under that existing system. Speed up COUNT(*) is marked as done. While index-only-scans should speed this up in certain cases, it is nothing compared to the speed up that could be obtained by use a fixed row count and a +/- count to follow MVCC visibility rules, and that speed-up is the one people used to MyISAM are expecting. We might not want to actually implement the fixed row count +/- MVCC count idea, but we probably shouldn't mark the whole thing as done because just one approach to it was implemented. sort_support was implemented for plain tuple sorting only, To Do is extend to index-creation sorts (item 2 from message 1698.1323222...@sss.pgh.pa.us) Cheers, Jeff -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers