I'm having some issues with fulltext searching.  

I've gone though the list archives and stack overflow, but can't seem to get 
the exact answers.  hoping someone can help.

Thanks in advance and apologies for these questions being rather basic.  I just 
felt the docs and some online posts are leading me into possibly making the 
wrong decision and I want to make sure I"m doing this right.


1.  I need to make both 'title' and 'description' searchable.   What is the 
current proper way to index multiple columns of a table ( ie, not one ) ?

        I've essentially seen the following in the docs, mailing list, and 
various websites:

        A unified index
                CREATE INDEX CONCURRENTLY unified_tsvector_idx ON mytable USING 
gin(to_tsvector('english', title || ' ' || description ));

        Individual indexes
                CREATE INDEX CONCURRENTLY title_tsvector_idx ON mytable USING 
gin(to_tsvector('english', title ));
                CREATE INDEX CONCURRENTLY description_tsvector_idx ON mytable 
USING gin(to_tsvector('english', description ));

        Using dedicated columns ( one or more )
                ALTER TABLE .... 
                create trigger ....

        I can't figure out which one to use.  This is on a steadily growing 
table of around 20MM rows that gets 20-80k new records a day, but existing 
records are rarely updated.
                

2. I've been getting a handful of 'can not index words longer than 2047 
characters' in my tests.  

        if this 2047 character max is on tokens, is there a way to lower it?  
or to profile the index for distribution of tokens ?  I don't think we have to 
support any tokens larger than 20chars or so.

3a. What should EXPLAIN ANALYZE show if it is using the index ?  i couldn't 
find an example.

3b. Depending on how I index the column, what do I need to pass into the query 
so that it uses the index ?

        1.      if the index is created like 
                        gin(to_tsvector('english', title ));

                do i have to search in this format ?
                        to_tsvector('english',title) @@ to_tsquery('english', 
'dog') ;

        2.      if i use an index like 
                         gin(to_tsvector('english', title || ' ' || description 
));
 
                what is the correct way to query the database and let the 
planner know I want to use the index ?





-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Reply via email to