Re: [HACKERS] To Do wiki

2012-04-10 Thread Heikki Linnakangas

On 10.04.2012 03:32, Jeff Janes wrote:

The To Do wiki says not to add things to the page with discussing here.

So here are some things to discuss.  Assuming the discussion is a
brief yup or nope, it seems to make sense to lump them into one email:

Vacuuming a table with a large GIN index is painfully slow, because
the index is vacuumed in logical order not physical order.  Is making
a vacuum in physical order a to-do?  Does this belong to vacuuming, or
to GIN indexing?  Looking at the complexity of how this was done for
btree index, I would say this is far from easy.  I wonder if there is
an easier way that is still good enough, for example every time you
split a page, check to see if a vacuum is in the index, and if so only
move tuples physically rightward.  If the table is so active that
there is essentially always a vacuum in the index, this could lead to
bloat.  But if the table is that large and active, under the current
non-physical order the vacuum would likely take approximately forever
to finish and so the bloat would be just as bad under that existing
system.


Yup, seems like a todo. It doesn't sound like a good idea to force 
tuples to be moved right when a vacuum is in progress, that could lead 
to bloating, but it should be feasible to implement the same 
cycleid-mechanism in gin that we did in b-tree.



Speed up COUNT(*)  is marked as done.  While index-only-scans should
speed this up in certain cases, it is nothing compared to the speed up
that could be obtained by use a fixed row count and a +/- count to
follow MVCC visibility rules, and that speed-up is the one people
used to MyISAM are expecting.  We might not want to actually implement
the fixed row count +/- MVCC count idea, but we probably shouldn't
mark the whole thing as done because just one approach to it was
implemented.


I think the way we'd speed up COUNT(*) further would be to implement 
materialized views. Then you could define a materialized view on 
COUNT(*), and essentially get a row counter similar to MyISAM. I think 
it's fair to mark this as done.



sort_support was implemented for plain tuple sorting only, To Do is
extend to index-creation sorts (item 2 from message
1698.1323222...@sss.pgh.pa.us)


Index-creation sorts are already handled, Tom is referring to using the 
new comparator API for index searches in that email. The change would go 
to _bt_compare().


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] To Do wiki

2012-04-10 Thread Greg Stark
On Tue, Apr 10, 2012 at 7:27 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 I think the way we'd speed up COUNT(*) further would be to implement
 materialized views. Then you could define a materialized view on COUNT(*),
 and essentially get a row counter similar to MyISAM. I think it's fair to
 mark this as done.

If only because it comes up so frequently it would be good to have
this noted in the TODO, either under materialized views or as a
pointer to them.

A good materialized views implementation including automatically
determining what delta data to keep sure would be nice to have.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] To Do wiki

2012-04-10 Thread Jeff Janes
On Mon, Apr 9, 2012 at 11:27 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 10.04.2012 03:32, Jeff Janes wrote:

 The To Do wiki says not to add things to the page with discussing here.
...

 sort_support was implemented for plain tuple sorting only, To Do is
 extend to index-creation sorts (item 2 from message
 1698.1323222...@sss.pgh.pa.us)


 Index-creation sorts are already handled, Tom is referring to using the new
 comparator API for index searches in that email. The change would go to
 _bt_compare().

If I do select count(distinct bid) from pgbench_accounts I get many
calls to btint4fastcmp, but if I do create index on pgbench_accounts
(bid) I instead get many calls to btint4cmp.  If the index build is
using SortSupport, shouldn't it also be calling btint4fastcmp like the
distinct does?

Cheers,

Jeff

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] To Do wiki

2012-04-10 Thread Heikki Linnakangas

On 10.04.2012 18:31, Jeff Janes wrote:

On Mon, Apr 9, 2012 at 11:27 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

On 10.04.2012 03:32, Jeff Janes wrote:


The To Do wiki says not to add things to the page with discussing here.

...



sort_support was implemented for plain tuple sorting only, To Do is
extend to index-creation sorts (item 2 from message
1698.1323222...@sss.pgh.pa.us)


Index-creation sorts are already handled, Tom is referring to using the new
comparator API for index searches in that email. The change would go to
_bt_compare().


If I do select count(distinct bid) from pgbench_accounts I get many
calls to btint4fastcmp, but if I do create index on pgbench_accounts
(bid) I instead get many calls to btint4cmp.  If the index build is
using SortSupport, shouldn't it also be calling btint4fastcmp like the
distinct does?


Oh, sorry, you're right. I stand corrected.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] To Do wiki

2012-04-10 Thread Peter Geoghegan
On 10 April 2012 16:40, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 10.04.2012 18:31, Jeff Janes wrote:
 If I do select count(distinct bid) from pgbench_accounts I get many
 calls to btint4fastcmp, but if I do create index on pgbench_accounts
 (bid) I instead get many calls to btint4cmp.  If the index build is
 using SortSupport, shouldn't it also be calling btint4fastcmp like the
 distinct does?


 Oh, sorry, you're right. I stand corrected.

There is an impedance mismatch between tuplesort_begin_heap and
tuplesort_begin_index_btree that prevented this from being done with
the initial commit. Strangely, the SortSupport commit message didn't
comment on this.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] To Do wiki

2012-04-09 Thread Jeff Janes
The To Do wiki says not to add things to the page with discussing here.

So here are some things to discuss.  Assuming the discussion is a
brief yup or nope, it seems to make sense to lump them into one email:

Vacuuming a table with a large GIN index is painfully slow, because
the index is vacuumed in logical order not physical order.  Is making
a vacuum in physical order a to-do?  Does this belong to vacuuming, or
to GIN indexing?  Looking at the complexity of how this was done for
btree index, I would say this is far from easy.  I wonder if there is
an easier way that is still good enough, for example every time you
split a page, check to see if a vacuum is in the index, and if so only
move tuples physically rightward.  If the table is so active that
there is essentially always a vacuum in the index, this could lead to
bloat.  But if the table is that large and active, under the current
non-physical order the vacuum would likely take approximately forever
to finish and so the bloat would be just as bad under that existing
system.

Speed up COUNT(*)  is marked as done.  While index-only-scans should
speed this up in certain cases, it is nothing compared to the speed up
that could be obtained by use a fixed row count and a +/- count to
follow MVCC visibility rules, and that speed-up is the one people
used to MyISAM are expecting.  We might not want to actually implement
the fixed row count +/- MVCC count idea, but we probably shouldn't
mark the whole thing as done because just one approach to it was
implemented.

sort_support was implemented for plain tuple sorting only, To Do is
extend to index-creation sorts (item 2 from message
1698.1323222...@sss.pgh.pa.us)

Cheers,

Jeff

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers