>>>>> "Mike" == Mike<mickalo>Blezien  <[EMAIL PROTECTED]> writes:

Mike> I am trying to put together a SELECT query to perform a search
Mike> on a column with various keywords. The column(searchwords) has
Mike> keywords something like this: (keyword1 keyword2 keyword3..etc)
Mike> separated by a space.

This isn't a DBI-specific response, but...

Note that this will *always* be very slow.  You are forcing your
database engine to scan the entire contents of the database every time
you call the suggested SELECT query.  This will make any time you
spend preparing the query seem insignificant, I would imagine.

If the words are always whole, you will get much better performance by
normalizing this into another table.  Add a (or use an existing)
unique identifier to your "bus_info" table, then use that unique ID to
establish a 1-to-many relationship into another table of nothing but:

| CREATE TABLE bus_search_words
| (
|     bus_id      INTEGER REFERENCES bus_info ( bus_id ),
|     search_word VARCHAR(20),
|     INDEX ( search_word )
| )

Then you can grab the matching businesses by joining against this
table:

| SELECT bi.*
|   FROM bus_info bi, bus_search_words bsw
|   WHERE bsw.bus_id = bi.bus_id
|     AND bsw.search_word = 'target'

Variations on this (probably adding DISTINCT or GROUP BY with some
sort of counting option, etc) are probably necessary.  If you're
searching for multiple whole words, I'd suggest an "IN" clause.

| my $word_list = join ', ', map $dbh->quote($_), @words;
| my $sql = ( "SELECT bus_id" .
|             "  FROM bus_search_words" .
|             "  WHERE search_word IN ( $word_list )" );
| my $ids_aref = $dbh->selectcol_arrayref($sql);

If your search words are not always whole, I'm not sure what to
suggest.  Prefixes will still work well with the index (although
you'll have to revert to joining a bunch of "LIKE" conditions with
"OR"); search terms that are unspecified on the left will pretty much
always suck.  

There are techniques you can use to get around this (letter quads,
keeping another version of reversed words so you can do quick suffix
searchs, keeping a letter+count mask to filter...) but at some point
your original scheme might be better after all.  Brute-force text
scans are rarely the best answer, but ... *shrug*

Also, some databases support this sort of thing internally, and you
don't need to do your own search engine at all (e.g. ConText /
InterMedia for Oracle).  Alternatively, there are many well-known
search engines available in source form at various levels of freedom:
harvest, glimpse, ht:dig.

Mike> Basically what I have is a SELECT query I need to generate on
Mike> the fly to construct the search on the keywords entered into the
Mike> form: $FORM{'words'} The query looks something like this:

If you stick with the "one string of multiple words" method, I would
suspect you'd have better performance by creating a more complex
query; you're forcing a full table scan anyway, so you could do:

| my $sql = ( 'SELECT * FROM bus_info WHERE ' .
|             join ' OR ', map "searchwords LIKE \"\%$_\%\"", @words );

Mike> My question is what is the best or most efficient way to build
Mike> the query on the fly. Any tips, suggestions much appreciated.

As I point out above, the time to build the query will be nothing
compared to the execution time, if this is at all a large DB.

t.

Reply via email to