Thanks, Mark.
----- Original Message -----
From: "Mark C. Roduner, Jr." <[EMAIL PROTECTED]>
Sent: Tuesday, March 25, 2003 3:45 PM
Subject: RE: Your professional opinion Please...
> Brian,
> Here are some hints on an efficient way to index the data:
> Regular Expressions:
>   ([\w\d]{5,64}) - matches each run of 5 to 64 word
>   characters (letters, digits, underscore) in a given string
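
A minimal sketch of how that pattern could pull indexable words out of a file's text. It is written in Python rather than the Java suggested further down, purely to keep the example short. The function name `extract_words` and the lower-casing step are assumptions for illustration, not part of the original suggestion.

```python
import re

# Pattern taken from the message. [\w\d] is equivalent to \w, so this
# matches runs of 5 to 64 word characters (letters, digits, underscore).
WORD_RE = re.compile(r"[\w\d]{5,64}")


def extract_words(text):
    """Return the set of unique words found in text.

    Lower-casing is an assumption made here so that "MySQL" and "mysql"
    end up as a single entry in the word table.
    """
    return {match.lower() for match in WORD_RE.findall(text)}
```

Note that anything shorter than five characters is skipped, so short search terms would never match against an index built this way.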
> Database
>   Tables
>     file : [int id][char*255 name]
>       (Populate this with file names)
>     word : [int id][char*64 name]
>       (Populate this with *unique* words)
>     map  : [int id][int word][int file]
>       (Populate this with `word`.`id`, `file`.`id`
>       for each word `word`.`name` found in the file
>       named by `file`.`name`)
> Queries
>   To find files containing any of the given words:
>     SELECT `file`.`name` FROM `file`, `word`, `map`
>     WHERE (`word`.`name` IN ('word1', 'word2', 'word3'))
>       AND (`map`.`word` = `word`.`id`
>       AND `map`.`file` = `file`.`id`)
>     GROUP BY `file`.`name`;
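
The three tables and the lookup query can be tried end to end with a small script. This is only a sketch under stated assumptions: it uses Python's built-in `sqlite3` as a stand-in for MySQL, and the helper names `build_index` and `find_files`, the `UNIQUE` constraints, and the `require_all` option are hypothetical additions. The query as written in the message returns files containing any of the words; the `HAVING` clause shows one way to require all of them.

```python
import re
import sqlite3

WORD_RE = re.compile(r"[\w\d]{5,64}")  # pattern from the message

# Table and column names follow the query in the message; the UNIQUE
# constraints are an assumption added to keep entries from repeating.
SCHEMA = """
CREATE TABLE file (id INTEGER PRIMARY KEY, name VARCHAR(255) UNIQUE);
CREATE TABLE word (id INTEGER PRIMARY KEY, name VARCHAR(64) UNIQUE);
CREATE TABLE map  (id INTEGER PRIMARY KEY, word INTEGER, file INTEGER,
                   UNIQUE (word, file));
"""


def build_index(documents):
    """Index a dict of {file name: text} into an in-memory database."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    for name, text in documents.items():
        file_id = db.execute(
            "INSERT INTO file (name) VALUES (?)", (name,)).lastrowid
        for w in {m.lower() for m in WORD_RE.findall(text)}:
            # Insert the word only if it is new, then look up its id.
            db.execute("INSERT OR IGNORE INTO word (name) VALUES (?)", (w,))
            word_id = db.execute(
                "SELECT id FROM word WHERE name = ?", (w,)).fetchone()[0]
            db.execute(
                "INSERT OR IGNORE INTO map (word, file) VALUES (?, ?)",
                (word_id, file_id))
    db.commit()
    return db


def find_files(db, words, require_all=False):
    """Return names of files containing any (or all) of the words."""
    words = sorted({w.lower() for w in words})
    if not words:
        return []
    placeholders = ", ".join("?" for _ in words)
    sql = ("SELECT file.name FROM file, word, map "
           f"WHERE word.name IN ({placeholders}) "
           "AND map.word = word.id AND map.file = file.id "
           "GROUP BY file.name")
    params = list(words)
    if require_all:
        # Keep only files matched by every distinct search word.
        sql += " HAVING COUNT(DISTINCT word.id) = ?"
        params.append(len(words))
    sql += " ORDER BY file.name"
    return [row[0] for row in db.execute(sql, params)]
```

For example, after `db = build_index({"a.txt": "mysql index tables", "b.txt": "mysql query cache"})`, calling `find_files(db, ["mysql", "index"])` lists both files, while passing `require_all=True` narrows the result to `a.txt`.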
> Room for Improvement
>   Add a field to the MAP table that gives the offset (in
>   words) where the word was found. This would prove useful
>   for "quoted queries" (i.e., phrase searching).
>   Add a blob field to the FILE table for easier access to
>   the data (very optional, _will_ bloat your database).
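
The word-offset idea could look roughly like the sketch below, again using Python's `sqlite3` in place of MySQL. The column name `pos`, the helper names, and the self-join strategy are all assumptions for illustration. Offsets here count only the words the regular expression keeps, so words under five characters are ignored when matching a phrase.

```python
import re
import sqlite3

WORD_RE = re.compile(r"[\w\d]{5,64}")  # pattern from the message


def tokens(text):
    """Indexable words in order of appearance, lower-cased (assumption)."""
    return [m.lower() for m in WORD_RE.findall(text)]


def build_positional_index(documents):
    """Index {file name: text}, storing one map row per word occurrence."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE file (id INTEGER PRIMARY KEY, name VARCHAR(255) UNIQUE);
        CREATE TABLE word (id INTEGER PRIMARY KEY, name VARCHAR(64) UNIQUE);
        -- pos is the hypothetical offset field: the word's position,
        -- counted in indexed words, within the file.
        CREATE TABLE map (id INTEGER PRIMARY KEY, word INTEGER,
                          file INTEGER, pos INTEGER);
    """)
    for name, text in documents.items():
        file_id = db.execute(
            "INSERT INTO file (name) VALUES (?)", (name,)).lastrowid
        for pos, w in enumerate(tokens(text)):
            db.execute("INSERT OR IGNORE INTO word (name) VALUES (?)", (w,))
            word_id = db.execute(
                "SELECT id FROM word WHERE name = ?", (w,)).fetchone()[0]
            db.execute(
                "INSERT INTO map (word, file, pos) VALUES (?, ?, ?)",
                (word_id, file_id, pos))
    db.commit()
    return db


def find_phrase(db, phrase):
    """Return names of files where the phrase words appear consecutively."""
    words = tokens(phrase)
    if not words:
        return []
    tables, conds, params = [], [], []
    for i, w in enumerate(words):
        # One (word, map) alias pair per phrase word.
        tables += [f"word w{i}", f"map m{i}"]
        conds += [f"w{i}.name = ?", f"m{i}.word = w{i}.id"]
        params.append(w)
        if i > 0:
            # The i-th word must sit i positions after the first one,
            # in the same file.
            conds += [f"m{i}.file = m0.file", f"m{i}.pos = m0.pos + {i}"]
    sql = ("SELECT DISTINCT file.name FROM file, " + ", ".join(tables)
           + " WHERE file.id = m0.file AND " + " AND ".join(conds)
           + " ORDER BY file.name")
    return [row[0] for row in db.execute(sql, params)]
```

Storing one row per occurrence makes the map table considerably larger than the one-row-per-file-and-word layout, which is the trade-off for being able to answer quoted queries.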
Probably a little more than I can do in the allotted time.
> If you're willing to pay for it, I'll write it for you.
Unfortunately there is no budget for this project.
> BTW, I recommend Java for writing the reader program
> (regular expressions are much easier and cleaner there), and
> PHP (v4.x) for the search program (easier UI).
Understood.
Appreciate the feedback.
Best regards,
Brian
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql