Jamie, I think your approach of a cross-reference table is a good start. This is similar to creating a stemming index. Perhaps you might like to look into, for example, some perl Modules for stemming (like Linga::Stem) to further reduce your data space.
Perhaps since there may be a large resultset from the first select, and then feeding this information back to the server, and the second search also returning a large resultset, perhaps you'd be better off trying to obtain the results in one select statement --- what about something like: select eventlog.* from eventlog left join crossref on (eventlog.id=crossref.id and word = 'HELLO') ORDER BY eventlog.id DESC LIMIT 20; ? Regards, Dan > -----Original Message----- > From: Jaime Teng [mailto:[EMAIL PROTECTED]] > Sent: Monday, 25 February 2002 2:34 p.m. > To: [EMAIL PROTECTED] > Subject: help with big table search > > > Hi, > > I have a table and currently has about 1.6 million entries. > It is a table of events with date/time and description of the event. > > mysql> describe eventlog; > +-------------+------------------+------+-----+---------+----------------+ > | Field | Type | Null | Key | Default | Extra | > +-------------+------------------+------+-----+---------+----------------+ > | id | int(10) unsigned | | PRI | NULL | auto_increment | > | timestamp | int(10) unsigned | | MUL | 0 | | > | description | char(100) | | | | | > +-------------+------------------+------+-----+---------+----------------+ > > My task is to search this table for any particular word that may occur > anywhere in the description field: > ie. > > SELECT * FROM eventlog > WHERE description LIKE '%pattern%' > ORDER BY id DESC LIMIT 20; > > Considering that I am using LIKE instead of = as a search option, making > "description" into an index would not do any good. > > This search works well ONLY if the items to search are relatively near > the top of the table *AND* most importantly, there are at least '20' > matching items available on the table. *IF* the table contains only > 19 matches or less, then the SELECT will search through the whole > table and I may have to wait some 2~3 minutes to get the result. > > > Then I started using cross-reference table. > > mysql> describe crossref; > +-------+------------------+------+-----+---------+-------+ > | Field | Type | Null | Key | Default | Extra | > +-------+------------------+------+-----+---------+-------+ > | word | char(15) | | MUL | | | > | id | int(10) unsigned | | MUL | 0 | | > +-------+------------------+------+-----+---------+-------+ > word is an independent index, > id is an independent index > > For every entry into the eventlog table, I broken down each word from > description and inserted them into the crossref table. This way, > whenever I want to find the word "HELLO", all i need to do > is: > SELECT id FROM crossref WHERE word = 'HELLO' ORDER BY id DESC limit 20; > and then use the results into another search: > SELECT * from eventlog where id in (previous result); > > However still, for whatever reason, this search isnt working well. > Though the search time is better than before, it still takes about > 30~60 seconds for an answer. (sometimes fairly fast <5 seconds). > SOMETIMES, searching through this crossref were actually slower. > > I'd like to know how you people come up with a very good table and > search. Eventlog is currently 200MB in size. > > Jaime > > > --------------------------------------------------------------------- > Before posting, please check: > http://www.mysql.com/manual.php (the manual) > http://lists.mysql.com/ (the list archive) > > To request this thread, e-mail <[EMAIL PROTECTED]> > To unsubscribe, e-mail > <[EMAIL PROTECTED]> > Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php > > --------------------------------------------------------------------- Before posting, please check: http://www.mysql.com/manual.php (the manual) http://lists.mysql.com/ (the list archive) To request this thread, e-mail <[EMAIL PROTECTED]> To unsubscribe, e-mail <[EMAIL PROTECTED]> Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php