On Jul 15, 2009, at 2:52 PM, Dimitri Fontaine wrote:
On Jul 15, 2009, at 02:01, Glen Parker wrote:
Sounds to me like another reason to separate index definition from
creation. If an index can be defined but not yet created or
valid, then you could imagine syntax like:
DEFINE INDEX blahblah1 ON mytable (some fields);
DEFINE INDEX blahblah2 ON mytable (some other fields);
[RE]INDEX TABLE mytable;
...provided that REINDEX TABLE could recreate all indexes
simultaneously as you suggest.
Well to me it sounded much more like:
BEGIN;
CREATE INDEX idx_a ON t(a) DEFERRED;
CREATE INDEX idx_b ON t(b) DEFERRED;
COMMIT;
And at commit time, PostgreSQL would build all of the transaction's
indexes in one pass over the heap but, as Tom already pointed out,
using only 1 CPU. Maybe that'd be a way to limit overall I/O
bandwidth usage while not consuming too many CPU resources at the
same time.
I mean, right now we have a choice: either sync-scan the table heap on
multiple CPUs, saving I/O but using 1 CPU per index, or limit CPU use
to only 1 but then scan the heap once per index. The intermediate
option of using 1 CPU while still making a single heap scan could
surely be worthwhile to some?
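To make the "1 CPU, single heap scan" option concrete, here is a rough sketch in Python rather than anything resembling PostgreSQL internals. The heap is just a list of tuples, and the names build_indexes_one_pass, key_funcs and the row layout are made up for illustration. It only shows the shape of the idea: one pass over the rows feeds every pending index, and the sorts then run one after another in the same process.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Tuple[int, str]        # hypothetical heap row: (a, b)
Entry = Tuple[object, int]   # (index key, position of the row in the heap)


def build_indexes_one_pass(
    heap: Sequence[Row],
    key_funcs: Dict[str, Callable[[Row], object]],
) -> Dict[str, List[Entry]]:
    """Scan the heap once, feed every pending index, then sort each one."""
    pending: Dict[str, List[Entry]] = {name: [] for name in key_funcs}
    for tid, row in enumerate(heap):              # the single heap scan
        for name, key_of in key_funcs.items():
            pending[name].append((key_of(row), tid))
    # The sorts still run sequentially, in this one process.
    return {name: sorted(entries) for name, entries in pending.items()}


heap = [(3, "c"), (1, "z"), (2, "a")]
indexes = build_indexes_one_pass(
    heap,
    {"idx_a": lambda r: r[0], "idx_b": lambda r: r[1]},
)
print(indexes["idx_a"])
print(indexes["idx_b"])
```

The trade-off is visible in the loop structure: the heap is read once however many indexes are pending, but nothing here runs in parallel.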
Here's an off-the-wall thought... since most of the CPU time is in
the sort, what about allowing a backend to fork off dedicated sort
processes? Aside from building multiple indexes at once, that
functionality could also be useful in general queries.
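As a toy illustration of forking off dedicated sort processes, here is a minimal Python sketch, assuming a Unix-like system where os.fork is available. It is not how a PostgreSQL backend would do it; the helper names fork_sorter and sort_in_workers are invented for the example, and each child simply sorts one index's entries and sends the result back over a pipe.

```python
import os
import pickle
from typing import Dict, List, Tuple

Entry = Tuple[object, int]  # (index key, row position)


def fork_sorter(entries: List[Entry]) -> Tuple[int, int]:
    """Fork a child that sorts `entries`; return (child pid, pipe read fd)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child: sort, send the result back, exit immediately
        status = 1
        try:
            os.close(read_fd)
            with os.fdopen(write_fd, "wb") as out:
                pickle.dump(sorted(entries), out)
            status = 0
        finally:
            os._exit(status)
    os.close(write_fd)  # parent keeps only the read end
    return pid, read_fd


def sort_in_workers(pending: Dict[str, List[Entry]]) -> Dict[str, List[Entry]]:
    """Start one sort process per index, then collect the sorted results."""
    children = {name: fork_sorter(entries) for name, entries in pending.items()}
    results: Dict[str, List[Entry]] = {}
    for name, (pid, read_fd) in children.items():
        with os.fdopen(read_fd, "rb") as inp:
            results[name] = pickle.load(inp)
        os.waitpid(pid, 0)  # reap the child
    return results
```

All children are started before any result is read, so the sorts can overlap on separate CPUs while the parent waits.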
--
Decibel!, aka Jim C. Nasby, Database Architect deci...@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)