It might be more productive to create a list of all the unique words in a given 
blob record and make *that* searchable instead of looking at the contents of 
individual blobs each time you search. This is how more sophisticated searches 
are done on big databases.

You update your word-search list each time you update the record:

Just parse the space/tab/punctuation delimited words you are about to put into 
your blob, and search your word table for a match on each parsed word.  If not 
found then append the new word and get its index number. If you do find the 
word in your word table, you get the old word index number. You can also deal 
with RTF metadata in the parser, if you need to - just filter it out/skip it.

Next, append the word index number(s) to a look-up table and add the unique ID 
of the record containing the blob into a second column.

You can use case insensitive searching in English - just force your saved words 
to lower case. In German (I think) and other languages where case is important, 
you may have to save the same words with different capitalizations.

words table:
--------------
id:integer (autoincrement)
word:string

index the word column above for faster searching.

wordlookup table:
-----------------
wordid:integer
blobrecid:integer

Make sure you index both the above columns for faster searching

Now you can quickly locate all the records containing a list of words using a 
simple query that joins the word table and the index lookup table. The query's 
result set will be a list of the record IDs of all records with blobs 
containing the word(s) you are looking for. You can do logical AND/OR searches 
by making appropriate queries of the word table. 

To search for blob records containing "blue" or "widget":

SELECT wl.blobrecid
FROM words w, wordlookup wl
where w.word="widget" or w.word="blue" and w.id=wl.wordid

The above query works in MySQL - you may have to modify it to work in other 
databases.

Before launching a search, toss out any stop words before forming the query:

You need a list of "stop words" to avoid indexing. Stop Words are words like 
"if", "and", "what", "a" and so on.  These do not contribute much to the 
information found in the text and would simply clutter up the search tables.  I 
think I have seen English stop word lists available for free download on the 
Internet.

You could also dynamically create your own stop words list from words that have 
a high statistical probability of appearing in a blob.

When deleting a blob record, don't forget to clean up your wordlookup table so 
it doesn't point to deleted records.

HTH!

Kevin G. McCoy

--- In [email protected], David Smith <djsmith_1...@...> wrote:
>
> Any text over 240 characters in length is stored in a linked file, so your 
> only option is to load the blob field into a string and search that. If you 
> are storing richtext you have to keep in mind tha the meta-data will be seen 
> by your search routine, so it may be more practical to load it into a hidden 
> richedit for searching and create a listbox or whatever to show intermediate 
> results, like you'd see in Outlook Express.
> 
> Dave
> 
> --- On Mon, 8/2/10, haroutbulbulian2000 <haroutbulbulian2...@...> wrote:
> 
> From: haroutbulbulian2000 <haroutbulbulian2...@...>
> Subject: [delphi-en] Re: ado "locate " search in a mdb memo field
> To: [email protected]
> Date: Monday, August 2, 2010, 8:00 AM
> 
> You can make your Own Searching prcedure by moving through the Records in the 
> table and compare your text (or Lines) to the memo.
> 
> 
> 
> --- In [email protected], Prakash Shirodkar <prakash.shirodkar@> 
> wrote:
> 
> >
> 
> > I agree with with Peter.
> 
> > 
> 
> > On Fri, Jul 16, 2010 at 4:00 AM, Peter Luijer <p.luijer@> wrote:
> 
> > 
> 
> > >
> 
> > >
> 
> > > Hello Andries,
> 
> > >
> 
> > > As far as I know (though I'm not a database expert), memo-fields are 
> > > stored
> 
> > > as blobs and
> 
> > >
> 
> > > can't therefor not be used to index or sort.
> 
> > >
> 
> > > Maybe someone can share a better light at this, but it's the only thing I
> 
> > > could think of.
> 
> > >
> 
> > > Greetz,
> 
> > >
> 
> > > Peter.
> 
> > >
> 
> > > (AKA VideoRipper on NLDelphi)
> 
> > >
> 
> > > _____
> 
> > >
> 
> > > From: [email protected] <delphi-en%40yahoogroups.com> [mailto:
> 
> > > [email protected] <delphi-en%40yahoogroups.com>] On Behalf
> 
> > > Of Andries Bos
> 
> > > Sent: donderdag 15 juli 2010 22:15
> 
> > > To: delphi advanced group
> 
> > > Subject: [delphi-en] ado "locate " search in a mdb memo field
> 
> > >
> 
> > >
> 
> > > Hello all,
> 
> > >
> 
> > > I build a small application that displays some information within TDBEDit
> 
> > > and
> 
> > > TDBMemo field. The record that should be dispayed is connection to a
> 
> > > 'locate'
> 
> > > command that actually search in all field of the database.
> 
> > >
> 
> > > all went well, except searching in the memo field: This will result in a
> 
> > > EOleExceptionand expanded with an error " 'sorting order' that could not 
> > > be
> 
> > >
> 
> > > used" (sorry, the proper eror message is in Dutch: 'De sorteervolgorde kan
> 
> > > niet
> 
> > > worden toegepast"). I connect to a mdb database via ado.
> 
> > >
> 
> > > I have search the web for problems related to 'locate ' and memo field but
> 
> > > could not find any articles related to this problem; Only some examples
> 
> > > where
> 
> > > they use the locate to search within fields.
> 
> > >
> 
> > > Could I use locate to search within a mdb memo fields?
> 
> > >
> 
> > > Regards
> 
> > >
> 
> > > Andries
> 


Reply via email to