On 23/11/2006, at 5:21 AM, Dr Gerard Hammond wrote:
At 2:16 PM -0700 22/11/06, Norman Palardy wrote:
On Nov 22, 2006, at 1:42 PM, Dr Gerard Hammond wrote:
When I do a SQLselect and search for a fragment of a string in a
text field, will the SQLselect actually search the entire
contents of my TEXT and BLOB columns.
ie is full-text search builtin?
you can get it to but it does not seem to be indexed in any way
that makes it run faster
if you do
like '%something%'
on a big blob or text it can still take a while
Thanks. Now I understand. full-text search = indexed full text
search.
Looking at the SQLlite pages
http://www.sqlite.org/cvstrac/wiki?p=FtsOne shows the example:
sqlite> select name, snippet(poem, '[', ']', '%%') from poem
where text match 'land';
ozymandias|i met a traveller from an antique [land] who said:
two vast and trunkless legs of %%
This seems to be what I would call keyword-indexed searching, or what
I used to refer to as "FileMaker searches" because it was my
observation back in the early nineties that FM users expected to be
able to type any word into a search field and have it be found.
eg: given a name field containing Andy Dent, users expect to be able
to type in Dent and find the same entry as they do when they type in
Andy, unless we have spent a lot of hours educating them out of this.
I implemented this in 4D for various systems, one enterprise system
of which has been in use continuously for about 15 years now, and it
was something people loved. Things they especially appreciated were
being able to do a "word starting with" and "word sounds like" search
(NOT Soundex).
I made sure the first version of our OOFILE C++ database engine had
support for these features!
I'm not sure but I *think* Valentina may have something smarter to
speedup fragment searches. I have had a few conversations with Ruslan
over the years.
Something which people tend to dismiss is implementation differences
affecting sheer speed of reading records from a database and doing
whole text searches. SQL engines vary widely in the cost of scanning
all records in a table.
The classic text on indexing and searching text is "Managing
Gigabytes". See http://www.cs.mu.oz.au/mg/ for more information and
source code.
KEYWORD vs FRAGMENT searching
A keyword search for 'dent' will find '"Andy Dent" but not "The page
indent".
It may match "the car was dented".
A wildcard search for '%dent%' will match all the above.
TIP
One simple optimisation I built into OOFILE but you can code in any
application on top of your database is to factor out single wildcard
searches, if you have an indexed way to do "starts with" searches.
Say someone is searching for "den%d"
Break that down into a search for startsWith("den') and a sub-search
on the results for those matching "den%d'.
Given the speed of modern computers (vs the kinds of Macs I was
delivering on in say 1994) this is probably only worth worrying about
for collections in the hundreds of thousands of records, or certainly
to be considered if similar searches are taking more than a second.
Andy Dent
databaser of yore
_______________________________________________
Unsubscribe or switch delivery mode:
<http://www.realsoftware.com/support/listmanager/>
Search the archives of this list here:
<http://support.realsoftware.com/listarchives/lists.html>