Hi, I am looking to improve the initial query speed for the following query:

select email_id from email, to_tsquery('default','example') as q where q@@fts;

This is running on 8.2.4 on Windows Server 2K3.

The initial output from explain analyse is as follows.

"Nested Loop (cost=8.45..76.70 rows=18 width=8) (actual time=5776.347..27364.248 rows=14938 loops=1)" " -> Function Scan on q (cost=0.00..0.01 rows=1 width=32) (actual time=0.023..0.024 rows=1 loops=1)" " -> Bitmap Heap Scan on email (cost=8.45..76.46 rows=18 width=322) (actual time=5776.314..27353.344 rows=14938 loops=1)"
"        Filter: (q.q @@ email.fts)"
" -> Bitmap Index Scan on email_fts_index (cost=0.00..8.44 rows=18 width=0) (actual time=5763.355..5763.355 rows=15118 loops=1)"
"              Index Cond: (q.q @@ email.fts)"
"Total runtime: 27369.091 ms"

Subsequent output is considerably faster. (I am guessing that is because email_fts_index is cached.

"Nested Loop (cost=8.45..76.70 rows=18 width=8) (actual time=29.241..264.712 rows=14938 loops=1)" " -> Function Scan on q (cost=0.00..0.01 rows=1 width=32) (actual time=0.008..0.010 rows=1 loops=1)" " -> Bitmap Heap Scan on email (cost=8.45..76.46 rows=18 width=322) (actual time=29.224..256.135 rows=14938 loops=1)"
"        Filter: (q.q @@ email.fts)"
" -> Bitmap Index Scan on email_fts_index (cost=0.00..8.44 rows=18 width=0) (actual time=28.344..28.344 rows=15118 loops=1)"
"              Index Cond: (q.q @@ email.fts)"
"Total runtime: 268.663 ms"

The table contains text derived from emails and therefore its contents and the searches can vary wildly.

Table construction as follows:

CREATE TABLE email
(
email_id bigint NOT NULL DEFAULT nextval(('public.email_email_id_seq'::text)::regclass),
 send_to text NOT NULL DEFAULT ''::text,
 reply_from character varying(100) NOT NULL DEFAULT ''::character varying,
 cc text NOT NULL DEFAULT ''::text,
 bcc text NOT NULL DEFAULT ''::text,
 subject text NOT NULL DEFAULT ''::text,
 "content" text NOT NULL DEFAULT ''::text,
 time_tx_rx timestamp without time zone NOT NULL DEFAULT now(),
 fts tsvector,
 CONSTRAINT email_pkey PRIMARY KEY (email_id),
)
WITH (OIDS=FALSE);

-- Index: email_fts_index

CREATE INDEX email_fts_index
 ON email
 USING gist
 (fts);

CREATE INDEX email_mailbox_id_idx
 ON email
 USING btree
 (mailbox_id);


-- Trigger: fts_trigger on email
CREATE TRIGGER fts_trigger
 BEFORE INSERT OR UPDATE
 ON email
 FOR EACH ROW
EXECUTE PROCEDURE tsearch2('fts', 'send_to', 'reply_from', 'cc', 'content', 'subject');


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Reply via email to