Hi,

I'm looking for a *fast* solution to search through ~4 million records of book descriptions. I've installed PostgreSQL 8.0.1 on a dual Opteron server with 8 GB of memory, running Linux 2.6. I haven't done a lot of tuning on PostgreSQL itself, but here are the settings I have changed so far:

shared_buffers = 2000 (anything much bigger and the kernel doesn't allow it; I still have to look into that)
effective_cache_size = 32768


Here's my table:

ilab=# \d books
Table "public.books"
Column | Type | Modifiers
---------------+------------------------+----------------------------------------------------------
recordnumber | integer | not null default nextval('books_recordnumber_seq'::text)
membernumber | integer | not null default 0
booknumber | character varying(20) | not null default ''::character varying
author | character varying(60) | not null default ''::character varying
titel | text | not null
description | character varying(100) | not null default ''::character varying
descriprest | text | not null
price | bigint | not null default 0::bigint
keywords | character varying(100) | not null default ''::character varying
dollarprice | bigint | not null default 0::bigint
countrynumber | smallint | not null default 0::smallint
entrydate | date | not null
status | smallint | not null default 0::smallint
recordtype | smallint | not null default 0::smallint
bookflags | smallint | not null default 0::smallint
year | smallint | not null default 0::smallint
firstedition | smallint | not null default 0::smallint
dustwrapper | smallint | not null default 0::smallint
signed | smallint | not null default 0::smallint
cover | smallint | not null default 0::smallint
specialfield | smallint | not null default 0::smallint
idxfti | tsvector |
Indexes:
"recordnumber_idx" unique, btree (recordnumber)
"idxfti_idx" gist (idxfti)


idxfti is a tsvector of concatenated description and descriprest.
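For reference, the column is populated and maintained roughly like this (a sketch in tsearch2 style; the configuration name and trigger details here are assumptions, my actual setup may differ slightly):

```sql
-- Sketch: populate idxfti from the two text columns
-- ('default' is the assumed tsearch2 configuration name):
UPDATE books
   SET idxfti = to_tsvector('default', description || ' ' || descriprest);

-- Sketch: keep idxfti current on insert/update using the
-- tsearch2 trigger function shipped with the contrib module:
CREATE TRIGGER books_idxfti_update
    BEFORE INSERT OR UPDATE ON books
    FOR EACH ROW
    EXECUTE PROCEDURE tsearch2(idxfti, description, descriprest);
```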

ilab=# select avg(character_length(description)),avg(character_length(descriprest)) from books;
avg | avg
---------------------+----------------------
89.1596992873947218 | 133.0468689304200538


Queries take forever to run. Right now we run a MySQL server, on which we maintain our own indexes: we split the description fields by word and keep separate tables for words and the book descriptions they appear in.
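Roughly, that hand-rolled inverted index looks like this (table and column names here are illustrative, not our exact MySQL schema):

```sql
-- Illustrative sketch of our hand-maintained inverted index
-- (names made up for this example):
CREATE TABLE words (
    wordid INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    word   VARCHAR(60) NOT NULL,
    UNIQUE KEY (word)
);

CREATE TABLE word_books (
    wordid       INT NOT NULL,
    recordnumber INT NOT NULL,
    PRIMARY KEY (wordid, recordnumber)
);

-- A single-word search then becomes a join:
SELECT b.recordnumber, b.author, b.titel
  FROM words w
  JOIN word_books wb ON wb.wordid = w.wordid
  JOIN books b ON b.recordnumber = wb.recordnumber
 WHERE w.word = 'terminology';
```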

For example, a query for the word 'terminology' on our MySQL setup takes 5.8 seconds and returns 375 results. The same query on PostgreSQL using the tsearch2 index takes 30802.105 ms and returns 298 results.
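The PostgreSQL query I'm timing looks roughly like this (tsearch2 syntax; the 'default' configuration name is an assumption):

```sql
-- Sketch of the slow tsearch2 query, wrapped in EXPLAIN ANALYZE
-- to show where the time goes ('default' config assumed):
EXPLAIN ANALYZE
SELECT recordnumber, author, titel
  FROM books
 WHERE idxfti @@ to_tsquery('default', 'terminology');
```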

How do I speed this up? Should I change settings, add or change indexes, or... what?

Rick Jansen
--
Systems Administrator for Rockingstone IT
http://www.rockingstone.com
http://www.megabooksearch.com - Search many book listing sites at once
