On Jul 15, 2009, at 2:52 PM, Dimitri Fontaine wrote:
On Jul 15, 2009, at 02:01, Glen Parker wrote:
Sounds to me like another reason to separate index definition from creation. If an index can be defined but not yet created or valid, then you could imagine syntax like:

DEFINE INDEX blahblah1 ON mytable (some fields);
DEFINE INDEX blahblah2 ON mytable (some other fields);
[RE]INDEX TABLE mytable;

...provided that REINDEX TABLE could recreate all indexes simultaneously as you suggest.

Well, to me it sounded much more like:
 BEGIN;
  CREATE INDEX idx_a ON t(a) DEFERRED;
  CREATE INDEX idx_b ON t(b) DEFERRED;
 COMMIT;

And at commit time, PostgreSQL would build all of the transaction's indexes in one pass over the heap but, as Tom already pointed out, using only 1 CPU. Maybe that would be a way to limit the overall I/O bandwidth usage while not consuming too many CPU resources at the same time.

I mean, right now we have a choice: either sync-scan the table heap on multiple CPUs, saving I/O but using 1 CPU per index, or limit ourselves to 1 CPU but then scan the heap once per index. The intermediate option of using 1 CPU while still making a single heap scan could surely be worthwhile to some?
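
To make the trade-off concrete, here is a toy Python sketch (not PostgreSQL code; the names build_one_scan_per_index and build_single_scan are made up for illustration, and an "index" is modelled as a plain sorted list of (key, row position) pairs). It only shows that both strategies on one CPU yield the same indexes, with a different number of passes over the heap:

```python
from typing import Dict, List, Tuple

Row = Dict[str, int]
# Toy stand-in for an index: sorted (key, row position) pairs.
Index = List[Tuple[int, int]]


def build_one_scan_per_index(heap: List[Row], columns: List[str]) -> Tuple[Dict[str, Index], int]:
    """One CPU, one full pass over the heap for every index."""
    scans = 0
    indexes: Dict[str, Index] = {}
    for col in columns:
        scans += 1  # each index re-reads the whole heap
        entries = [(row[col], pos) for pos, row in enumerate(heap)]
        indexes[col] = sorted(entries)
    return indexes, scans


def build_single_scan(heap: List[Row], columns: List[str]) -> Tuple[Dict[str, Index], int]:
    """One CPU, a single pass over the heap feeding every index."""
    pending: Dict[str, Index] = {col: [] for col in columns}
    for pos, row in enumerate(heap):  # the only pass over the heap
        for col in columns:
            pending[col].append((row[col], pos))
    # The sorts still run one after another on the same CPU.
    indexes = {col: sorted(entries) for col, entries in pending.items()}
    return indexes, 1
```

In this model the sort work is identical in both cases; only the number of heap passes changes, which is the I/O saving being discussed.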


Here's an off-the-wall thought... since most of the CPU time is in the sort, what about allowing a backend to fork off dedicated sort processes? Aside from building multiple indexes at once, that functionality could also be useful in general queries.
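
A very rough sketch of that idea in Python, purely as an illustration and not a proposal for how a backend would actually do it: the key lists for each index are assumed to have been collected already, and each list is handed to its own forked worker process to sort. The function name sort_in_workers is hypothetical, and the "fork" start method makes this Unix-only.

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor
from typing import Dict, List, Tuple

# Toy stand-in for index entries: (key, row position) pairs.
Entries = List[Tuple[int, int]]


def sort_in_workers(pending: Dict[str, Entries]) -> Dict[str, Entries]:
    """Sort each index's entries in a separate forked worker process."""
    columns = list(pending)
    # Assumption: fork-based workers, so this runs on Unix-like systems only.
    ctx = multiprocessing.get_context("fork")
    workers = max(1, len(columns))  # the pool needs at least one worker
    with ProcessPoolExecutor(max_workers=workers, mp_context=ctx) as pool:
        # The builtin sorted() runs in the workers, one list per task.
        sorted_lists = pool.map(sorted, (pending[col] for col in columns))
        return dict(zip(columns, sorted_lists))
```

Here the parent process plays the role of the backend that scanned the heap, and the workers only sort. Shipping the data to the workers and back has a cost of its own, which this sketch ignores.
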
--
Decibel!, aka Jim C. Nasby, Database Architect  deci...@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)