Quoting Arjen van der Meijden <[EMAIL PROTECTED]>: > Hi list, > > I noticed on a forum a query taking a surprisingly large amount of time > in MySQL. Of course I wanted to prove PostgreSQL 8.0.1 could do it much > better. To my surprise PostgreSQL was ten times worse on the same > machine! And I don't understand why. > > I don't really need this query to be fast since I don't use it, but the > range-thing is not really an uncommon query I suppose. So I'm wondering > why it is so slow and this may point to a wrong plan being chosen or > generated. > > Here are table definitions: > > Table "public.postcodes" > Column | Type | Modifiers > -------------+---------------+----------- > postcode_id | smallint | not null > range_from | smallint | > range_till | smallint | > Indexes: > "postcodes_pkey" PRIMARY KEY, btree (postcode_id) > "range" UNIQUE, btree (range_from, range_till) > > Table "public.data_main" > Column | Type | Modifiers > --------+----------+----------- > userid | integer | not null > range | smallint | > Indexes: > "data_main_pkey" PRIMARY KEY, btree (userid) > > And here's the query I ran: > > SELECT COUNT(*) FROM > data_main AS dm, > postcodes AS p > WHERE dm.range BETWEEN p.range_from AND p.range_till
I just posted an answer to this (via webcafe webmail; can't recall which pg-list), that might interest you. BTree indexes as they stand (multi-column, ...) answer what most people need for queries. Unfortunately, out-of-the-box, they have no good way of handling range queries. To compensate, you can use a small amount of kinky SQL. This is in the same line as the tricks used to implement hierarchic queries in relational SQL. [1] Create a table "widths"(wid int) of powers of 2, up to what will just cover max(range_till-range_from). Since your "range" column is a smallint, this table can have no more than 15 rows. You can get as fussy as you want about keeping this table to a minimum. [2] Change postcodes: ALTER TABLE postcodes ADD wid INT USING 2 ^ CEIL(LOG(range_from - range_till,2)); ALTER TABLE postcodes ADD start INT USING range_from - (range_from % wid); CREATE INDEX postcodes_wid_start_index ON (wid, start); ANALYZE postcodes; [4] Write your query as: SELECT COUNT(*) FROM data_main AS dm CROSS JOIN widths -- yes, CROSS JOIN. For once, it HELPS performance. JOIN postcodes AS p ON dm.wid = widths.wid AND dm.start = p.range - p.range % widths.wid WHERE dm.range BETWEEN p.range_from AND p.range_till This uses BTREE exact-match to make a tight restriction on which rows to check. YMMV, but this has worked even for multi-M table joins. -- "Dreams come true, not free." ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster