Re: [PERFORM] Tsearch2 performance on big database

Oleg Bartunov Tue, 22 Mar 2005 07:53:22 -0800

Mike,

no comments before Rick post tsearch configs and increased buffers !
Union shouldn't be faster than (term1|term2).
tsearch2 internals description might help you understanding tsearch2 
limitations.
See  http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_internals
Also, don't miss my notes:
http://www.sai.msu.su/~megera/oddmuse/index.cgi/Tsearch_V2_Notes

Oleg
On Tue, 22 Mar 2005, Mike Rylander wrote:

On Tue, 22 Mar 2005 14:25:19 +0100, Rick Jansen <[EMAIL PROTECTED]> wrote:


ilab=# explain analyze select count(titel) from books where idxfti @@
to_tsquery('default', 'buckingham | palace');
                                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
  Aggregate  (cost=35547.99..35547.99 rows=1 width=56) (actual
time=125968.119..125968.120 rows=1 loops=1)
    ->  Index Scan using idxfti_idx on books  (cost=0.00..35525.81
rows=8869 width=56) (actual time=0.394..125958.245 rows=3080 loops=1)
          Index Cond: (idxfti @@ '\'buckingham\' | \'palac\''::tsquery)
  Total runtime: 125968.212 ms
(4 rows)

Time: 125969.264 ms
ilab=#


Ahh... I should have qualified my claim.  I am creating a google-esqe
search interface and almost every query uses '&' as the term joiner.
'AND' queries and one-term queries are orders of magnitude faster than
'OR' queries, and fortunately are the expected default for most users.
(Think, "I typed in these words, therefore I want to match these
words"...)  An interesting test may be to time multiple queries
independently, one for each search term, and see if the combined cost
is less than a single 'OR' search.  If so, you could use UNION to join
the results.

However, the example you originally gave ('terminology') should be
very fast.  On a comparable query ("select count(value) from
metabib.full_rec where index_vector @@ to_tsquery('default','jane');")
I get 12ms.

Oleg, do you see anything else on the surface here?

Try:

EXPLAIN ANALYZE
 SELECT titel FROM books WHERE idxfti @@
   to_tsquery('default', 'buckingham')
 UNION
 SELECT titel FROM books WHERE idxfti @@
   to_tsquery('default', 'palace');

and see if using '&' instead of '|' where you can helps out.  I
imagine you'd be surprised by the speed of:

 SELECT titel FROM books WHERE idxfti @@
   to_tsquery('default', 'buckingham&palace');


> As an example of what I think you *should* be seeing, I have a similar
> box (4 procs, but that doesn't matter for one query) and I can search
> a column with tens of millions of rows in around a second.
>

That sounds very promising, I'd love to get those results.. could you
tell me what your settings are, howmuch memory you have and such?


16G of RAM on a dedicated machine.


shared_buffers = 15000          # min 16, at least max_connections*2, 8KB 
each
work_mem = 10240                # min 64, size in KB
maintenance_work_mem = 1000000  # min 1024, size in KB
# big m_w_m for loading data...

random_page_cost = 2.5          # units are one sequential page fetch cost
# fast drives, and tons of RAM


    Regards,
                Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [email protected], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Re: [PERFORM] Tsearch2 performance on big database

Reply via email to