Re: [sqlite] FTS and postfix search

2009-08-06 Thread Olaf Schmidt

"Hugh Sasse"  schrieb im
Newsbeitrag
news:alpine.lfd.2.00.0908061712390.30...@trueman.cs.cse.dmu.ac.uk...

> I wonder if the idea of suffix arrays would belp.
> http://www.cs.dartmouth.edu/~doug/sarray/

I was about suggesting basically the same thing...

In case of Lukas' topics-table, one would have to
create an additional (search)table 'topics_suffixes',
where each word from within 'topics' is "expanded",
resulting in multiple record-entries.

In case the table 'topics' contains the word
'Motor' (with an ID of e.g. 12345) - the 'topics_suffixes'-
table should get the following insertions:
topic_id   |   word_suffixes
  12345   |   motor
  12345   |   otor
  12345   |   tor
  12345   |   or

In the above sequence the listing is stopped at a "maximum-
two-chars"-suffix (sparing us the storage or the last, singlechar),
to safe a bit of space, since single-char Like-queries, formulated
in "contains-format" (as e.g. ... word_suffixes Like '%t%' ...)
are probably not that interesting regarding their usual larger
recordcount-output (from the users point of view in such
"live-search-scenarios").

But all these "contains-queries", searching for wordparts,
larger than one single char can now be performed with
larger speed against the (agreed much larger) 'topics_suffixes'
table using:
Select Distinct topic_id Where word_suffixes Like 'somepart%'
(with a proper index on word_suffixes) - maybe combined in
a Join to table topics, depending on the Apps implementation.

The size of the topics_suffixes-table (and its index on
word_suffixes) dependent on the average-wordlenght
in topics of course - it's the usual tradeoff between
"used space" and speed.

Olaf




___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS and postfix search

2009-08-06 Thread Jim Showalter
You can lazy-init the table of substrings and substring matches when 
you save words, so you only use space for substrings that appear in 
words that have been saved in your database. That avoids entering 
substrings that never occur in English, and substrings that occur in 
English but not in any words you are searching.

- Original Message - 
From: "Jim Showalter" <j...@jimandlisa.com>
To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
Sent: Thursday, August 06, 2009 8:59 AM
Subject: Re: [sqlite] FTS and postfix search


>I hadn't thought about it until now, but there's no reason you can't 
>treat the table of substrings as substrings appearing anywhere in a 
>word, and join each substring of a word to the substrings table.
>
> For 26 letters, you have > 17k three-letter substrings and almost 
> 500k four-letter substrings, so this technique probably is limited 
> to three-letter substrings. But that's not bad--it means that if 
> character sequences are evenly distributed in words (they're not, 
> but assume they are for a moment), then the set of all possible 
> words is cut into 1/>17,000 on the first search. There are only 
> about 600,000 words in the English language, so with even 
> distribution your first search cuts the results to ~34 words. But 
> uneven character-sequence distributions will skew that--some 
> sequences will result in more words than others. Sill, even if you 
> get back 500 words to search for the full match, that's better than 
> searching all of your words. (You also have to store all one-letter 
> and two-letter substrings, but those don't take up much additional 
> room.)
>
> You would have to have a separate table that has the multiple joins 
> (one per substring) in your words, because now each word can 
> potentially have M substrings in it, so M foreign keys.
>
> For your example, Motor/motor would match mot, oto, and tor. You 
> would use a rule to arbitrarily match the first N letters in a the 
> search term, so if someone searched for "%motor%", that would be a 
> search in the substrings table for "mot". This would return the 
> primary key for "mot". Then you search the substring-matches table 
> for all words that have that substring, and this gives you back a 
> list of words containing "mot". Then you search that result set for 
> "%motor%". Similarly, if someone searchs for "%oto%", you get the 
> substring key for "oto" and find words containing that substring.
>
> For all I know, full-text-search engines may do something like this.
>
> - Original Message - 
> From: "Lukas Haase" <lukasha...@gmx.at>
> To: <sqlite-users@sqlite.org>
> Sent: Thursday, August 06, 2009 6:54 AM
> Subject: Re: [sqlite] FTS and postfix search
>
>
>> Hi Jim,
>>
>> and thank you for the great idea. But I thought it would be 
>> possible to
>> search '*word*' - but this is not possible with this method either.
>>
>> Is there any chance for searching '*word*' quickly?
>>
>> Regards,
>> Luke
>>
>> Jim Showalter schrieb:
>>> You could store the words reversed (in addition to storing them in
>>> forward order). Then like 'xxx%' would be fast.
>>>
>>> This would double your disk footprint, but could give you the 
>>> search
>>> performance you're looking for.
>>>
>>> If that's too goofy, you could create a table of all one, two, and
>>> three-character word endings, and join to it from all of your 
>>> words
>>> (stored in forward order). Then search first for the primary key 
>>> of
>>> the word ending you want to search for, then search your words for
>>> that key.
>>>
>>> Index the join.
>>>
>>> - Original Message - 
>>> From: "Lukas Haase" <lukasha...@gmx.at>
>>> To: <sqlite-users@sqlite.org>
>>> Sent: Wednesday, August 05, 2009 6:16 PM
>>> Subject: Re: [sqlite] FTS and postfix search
>>>
>>>
>>>> Wes Freeman schrieb:
>>>>> I clearly am not in the right mindset to be answering list 
>>>>> emails.
>>>>> Please ignore my response (it's too late now)--back to my 
>>>>> stressful
>>>>> deadline.
>>>> :-)
>>>>
>>>>> Strange that it's implemented for prefix and not postfix?
>>>> Well, an explanation is easy: Same as with LIKE, LIKE 'xxx' or 
>>>> LIKE
>>>> 'xxx%' can be performed easy because only the beginning of words
>>>&g

Re: [sqlite] FTS and postfix search

2009-08-06 Thread Jim Showalter
I hadn't thought about it until now, but there's no reason you can't 
treat the table of substrings as substrings appearing anywhere in a 
word, and join each substring of a word to the substrings table.

For 26 letters, you have > 17k three-letter substrings and almost 500k 
four-letter substrings, so this technique probably is limited to 
three-letter substrings. But that's not bad--it means that if 
character sequences are evenly distributed in words (they're not, but 
assume they are for a moment), then the set of all possible words is 
cut into 1/>17,000 on the first search. There are only about 600,000 
words in the English language, so with even distribution your first 
search cuts the results to ~34 words. But uneven character-sequence 
distributions will skew that--some sequences will result in more words 
than others. Sill, even if you get back 500 words to search for the 
full match, that's better than searching all of your words. (You also 
have to store all one-letter and two-letter substrings, but those 
don't take up much additional room.)

You would have to have a separate table that has the multiple joins 
(one per substring) in your words, because now each word can 
potentially have M substrings in it, so M foreign keys.

For your example, Motor/motor would match mot, oto, and tor. You would 
use a rule to arbitrarily match the first N letters in a the search 
term, so if someone searched for "%motor%", that would be a search in 
the substrings table for "mot". This would return the primary key for 
"mot". Then you search the substring-matches table for all words that 
have that substring, and this gives you back a list of words 
containing "mot". Then you search that result set for "%motor%". 
Similarly, if someone searchs for "%oto%", you get the substring key 
for "oto" and find words containing that substring.

For all I know, full-text-search engines may do something like this.

- Original Message - 
From: "Lukas Haase" <lukasha...@gmx.at>
To: <sqlite-users@sqlite.org>
Sent: Thursday, August 06, 2009 6:54 AM
Subject: Re: [sqlite] FTS and postfix search


> Hi Jim,
>
> and thank you for the great idea. But I thought it would be possible 
> to
> search '*word*' - but this is not possible with this method either.
>
> Is there any chance for searching '*word*' quickly?
>
> Regards,
> Luke
>
> Jim Showalter schrieb:
>> You could store the words reversed (in addition to storing them in
>> forward order). Then like 'xxx%' would be fast.
>>
>> This would double your disk footprint, but could give you the 
>> search
>> performance you're looking for.
>>
>> If that's too goofy, you could create a table of all one, two, and
>> three-character word endings, and join to it from all of your words
>> (stored in forward order). Then search first for the primary key of
>> the word ending you want to search for, then search your words for
>> that key.
>>
>> Index the join.
>>
>> - Original Message - 
>> From: "Lukas Haase" <lukasha...@gmx.at>
>> To: <sqlite-users@sqlite.org>
>> Sent: Wednesday, August 05, 2009 6:16 PM
>> Subject: Re: [sqlite] FTS and postfix search
>>
>>
>>> Wes Freeman schrieb:
>>>> I clearly am not in the right mindset to be answering list 
>>>> emails.
>>>> Please ignore my response (it's too late now)--back to my 
>>>> stressful
>>>> deadline.
>>> :-)
>>>
>>>> Strange that it's implemented for prefix and not postfix?
>>> Well, an explanation is easy: Same as with LIKE, LIKE 'xxx' or 
>>> LIKE
>>> 'xxx%' can be performed easy because only the beginning of words
>>> need to
>>> be compared.
>>>
>>> However, there /is/ a way to also do postfix searches. I have the
>>> *same*
>>> database in *.hlp format and with WinHelp it's possible to search
>>> '*otor' (and others) with almost zero CPU and time consumption. 
>>> I'd
>>> be
>>> curious how they did this.
>>>
>>> For a solution for SQLite I would accept a small performance 
>>> penalty
>>> in
>>> that case (but very few secs max); additionally I would also 
>>> accept
>>> the
>>> index being bigger.
>>>
>>> Regards,
>>> Luke
>>>
>>>> Wes
>>>>
>>>> On Wed, Aug 5, 2009 at 8:58 PM, Lukas Haase<lukasha...@gmx.at>
>>>> wrote:
>>>>> Wes Freeman schrieb:
>>>>>> Why not LIKE '%otor'?
>>>>> SELECT topic_title FROM topics
&g

Re: [sqlite] FTS and postfix search

2009-08-06 Thread Lukas Haase
Hi Jim,

and thank you for the great idea. But I thought it would be possible to 
search '*word*' - but this is not possible with this method either.

Is there any chance for searching '*word*' quickly?

Regards,
Luke

Jim Showalter schrieb:
> You could store the words reversed (in addition to storing them in 
> forward order). Then like 'xxx%' would be fast.
> 
> This would double your disk footprint, but could give you the search 
> performance you're looking for.
> 
> If that's too goofy, you could create a table of all one, two, and 
> three-character word endings, and join to it from all of your words 
> (stored in forward order). Then search first for the primary key of 
> the word ending you want to search for, then search your words for 
> that key.
> 
> Index the join.
> 
> - Original Message - 
> From: "Lukas Haase" <lukasha...@gmx.at>
> To: <sqlite-users@sqlite.org>
> Sent: Wednesday, August 05, 2009 6:16 PM
> Subject: Re: [sqlite] FTS and postfix search
> 
> 
>> Wes Freeman schrieb:
>>> I clearly am not in the right mindset to be answering list emails.
>>> Please ignore my response (it's too late now)--back to my stressful
>>> deadline.
>> :-)
>>
>>> Strange that it's implemented for prefix and not postfix?
>> Well, an explanation is easy: Same as with LIKE, LIKE 'xxx' or LIKE
>> 'xxx%' can be performed easy because only the beginning of words 
>> need to
>> be compared.
>>
>> However, there /is/ a way to also do postfix searches. I have the 
>> *same*
>> database in *.hlp format and with WinHelp it's possible to search
>> '*otor' (and others) with almost zero CPU and time consumption. I'd 
>> be
>> curious how they did this.
>>
>> For a solution for SQLite I would accept a small performance penalty 
>> in
>> that case (but very few secs max); additionally I would also accept 
>> the
>> index being bigger.
>>
>> Regards,
>> Luke
>>
>>> Wes
>>>
>>> On Wed, Aug 5, 2009 at 8:58 PM, Lukas Haase<lukasha...@gmx.at> 
>>> wrote:
>>>> Wes Freeman schrieb:
>>>>> Why not LIKE '%otor'?
>>>> SELECT topic_title FROM topics
>>>> WHERE topic LIKE '%otor%'
>>>> ORDER BY topic_title ASC;
>>>>
>>>> This is very, very slow, especially on my > 100 MB database. 
>>>> "Realtime"
>>>> search in the GUI is a requirement. This is exactly the reason why 
>>>> I
>>>> want to use FTS instead of LIKE...
>>>>
>>>> Regards,
>>>> Luke
>>>>
>>>>> Wes
>>>>>
>>>>> On Wed, Aug 5, 2009 at 7:47 PM, Lukas Haase<lukasha...@gmx.at> 
>>>>> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> It's me again, sorry. The next big problem concerning FTS. I 
>>>>>> have the
>>>>>> requirement to do postfix searches, like:
>>>>>>
>>>>>> SELECT topic_title FROM topics
>>>>>> WHERE topic MATCH '*otor'
>>>>>> ORDER BY topic_title ASC;
>>>>>>
>>>>>> should find Motor, motor, Monotor etc. But this does not seem to 
>>>>>> work.
>>>>>> Is there any chance to get this working?
>>>>>>
>>>>>> Best regards,
>>>>>> Luke
>>>>>>
>>>>>> ___
>>>>>> sqlite-users mailing list
>>>>>> sqlite-users@sqlite.org
>>>>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>>>>
>>>>> ___
>>>>> sqlite-users mailing list
>>>>> sqlite-users@sqlite.org
>>>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>>>
>>>> ___
>>>> sqlite-users mailing list
>>>> sqlite-users@sqlite.org
>>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>>
>>> ___
>>> sqlite-users mailing list
>>> sqlite-users@sqlite.org
>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users 
> 
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS and postfix search

2009-08-06 Thread Lukas Haase
Roger Binns schrieb:
> [...]
> 
> Lukas Haase wrote:
>> additionally I would also accept the index being bigger.
> 
> You could have a second FTS table where you store the keywords in reverse
> order :-)

Hi,

DAMN!! Damn, damn. Thank you for the great idea. But unfortunately I 
thought I could combine postfix and prefix. But with this method this is 
not possible either. For example, consider the german word 
"Telefonanschlusskabel" (consisting of "Telefon", "Anschluss", "Kabel").

In this case '*anschluss*' should also find "Telefonanschlusskabel".

Or, to be more specific, any '*partial*' query should work.

Actually I need to solve this problem somehow. One possibility would be 
to write a complete FTS system on my own but this is really not what I 
want...

Regards,
Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS and postfix search

2009-08-05 Thread John Machin
On 6/08/2009 12:07 PM, Jim Showalter wrote:
> Sorry--I read my emails arrival order, not reverse chronological--so I 
> didn't see that John had already solved it.

Not me ... this is ancient lore e.g. Knuth vol 3 of TAOCP 1973 edition 
page 391 "If we make two copies of the file, one in which the keys are 
in normal alphabetic order and another in which they are ordered from 
right to left (as if the words were spelled backwards), a misspelled 
word will probably agree up to half or more of its length with an entry 
in one of those two files."
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS and postfix search

2009-08-05 Thread Jim Showalter
Sorry--I read my emails arrival order, not reverse chronological--so I 
didn't see that John had already solved it.

- Original Message - 
From: "John Machin" <sjmac...@lexicon.net>
To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
Sent: Wednesday, August 05, 2009 6:40 PM
Subject: Re: [sqlite] FTS and postfix search


> On 6/08/2009 11:16 AM, Lukas Haase wrote:
>> Wes Freeman schrieb:
>
>>
>>> Strange that it's implemented for prefix and not postfix?
>>
>> Well, an explanation is easy: Same as with LIKE, LIKE 'xxx' or LIKE
>> 'xxx%' can be performed easy because only the beginning of words 
>> need to
>> be compared.
>>
>> However, there /is/ a way to also do postfix searches. I have the 
>> *same*
>> database in *.hlp format and with WinHelp it's possible to search
>> '*otor' (and others) with almost zero CPU and time consumption. I'd 
>> be
>> curious how they did this.
>
> In memory: maybe a suffix tree.
>
> In a database: have a column with the words stored backwards. SELECT 
> ...
> WHERE back_word LIKE "roto%"
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS and postfix search

2009-08-05 Thread Jim Showalter
Forgot to conclude by saying the search gives you a list of words that 
you then need to further reduce by the actual number of characters you 
want to search by.

That's why storing them in reverse order might be preferable.

Also, just thought of something--if you store them in reverse order, 
you don't need to also store them in forward order. Just reverse the 
strings before displaying them.

- Original Message - 
From: "Jim Showalter" <j...@jimandlisa.com>
To: "General Discussion of SQLite Database" <sqlite-users@sqlite.org>
Sent: Wednesday, August 05, 2009 7:04 PM
Subject: Re: [sqlite] FTS and postfix search


> You could store the words reversed (in addition to storing them in 
> forward order). Then like 'xxx%' would be fast.
>
> This would double your disk footprint, but could give you the search 
> performance you're looking for.
>
> If that's too goofy, you could create a table of all one, two, and 
> three-character word endings, and join to it from all of your words 
> (stored in forward order). Then search first for the primary key of 
> the word ending you want to search for, then search your words for 
> that key.
>
> Index the join.
>
> - Original Message - 
> From: "Lukas Haase" <lukasha...@gmx.at>
> To: <sqlite-users@sqlite.org>
> Sent: Wednesday, August 05, 2009 6:16 PM
> Subject: Re: [sqlite] FTS and postfix search
>
>
>> Wes Freeman schrieb:
>>> I clearly am not in the right mindset to be answering list emails.
>>> Please ignore my response (it's too late now)--back to my 
>>> stressful
>>> deadline.
>>
>> :-)
>>
>>> Strange that it's implemented for prefix and not postfix?
>>
>> Well, an explanation is easy: Same as with LIKE, LIKE 'xxx' or LIKE
>> 'xxx%' can be performed easy because only the beginning of words 
>> need to
>> be compared.
>>
>> However, there /is/ a way to also do postfix searches. I have the 
>> *same*
>> database in *.hlp format and with WinHelp it's possible to search
>> '*otor' (and others) with almost zero CPU and time consumption. I'd 
>> be
>> curious how they did this.
>>
>> For a solution for SQLite I would accept a small performance 
>> penalty in
>> that case (but very few secs max); additionally I would also accept 
>> the
>> index being bigger.
>>
>> Regards,
>> Luke
>>
>>> Wes
>>>
>>> On Wed, Aug 5, 2009 at 8:58 PM, Lukas Haase<lukasha...@gmx.at> 
>>> wrote:
>>>> Wes Freeman schrieb:
>>>>> Why not LIKE '%otor'?
>>>> SELECT topic_title FROM topics
>>>> WHERE topic LIKE '%otor%'
>>>> ORDER BY topic_title ASC;
>>>>
>>>> This is very, very slow, especially on my > 100 MB database. 
>>>> "Realtime"
>>>> search in the GUI is a requirement. This is exactly the reason 
>>>> why I
>>>> want to use FTS instead of LIKE...
>>>>
>>>> Regards,
>>>> Luke
>>>>
>>>>> Wes
>>>>>
>>>>> On Wed, Aug 5, 2009 at 7:47 PM, Lukas Haase<lukasha...@gmx.at> 
>>>>> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> It's me again, sorry. The next big problem concerning FTS. I 
>>>>>> have the
>>>>>> requirement to do postfix searches, like:
>>>>>>
>>>>>> SELECT topic_title FROM topics
>>>>>> WHERE topic MATCH '*otor'
>>>>>> ORDER BY topic_title ASC;
>>>>>>
>>>>>> should find Motor, motor, Monotor etc. But this does not seem 
>>>>>> to work.
>>>>>> Is there any chance to get this working?
>>>>>>
>>>>>> Best regards,
>>>>>> Luke
>>>>>>
>>>>>> ___
>>>>>> sqlite-users mailing list
>>>>>> sqlite-users@sqlite.org
>>>>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>>>>
>>>>> ___
>>>>> sqlite-users mailing list
>>>>> sqlite-users@sqlite.org
>>>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>>>
>>>> ___
>>>> sqlite-users mailing list
>>>> sqlite-users@sqlite.org
>>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>>
>>> ___
>>> sqlite-users mailing list
>>> sqlite-users@sqlite.org
>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>
>>
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS and postfix search

2009-08-05 Thread Jim Showalter
You could store the words reversed (in addition to storing them in 
forward order). Then like 'xxx%' would be fast.

This would double your disk footprint, but could give you the search 
performance you're looking for.

If that's too goofy, you could create a table of all one, two, and 
three-character word endings, and join to it from all of your words 
(stored in forward order). Then search first for the primary key of 
the word ending you want to search for, then search your words for 
that key.

Index the join.

- Original Message - 
From: "Lukas Haase" <lukasha...@gmx.at>
To: <sqlite-users@sqlite.org>
Sent: Wednesday, August 05, 2009 6:16 PM
Subject: Re: [sqlite] FTS and postfix search


> Wes Freeman schrieb:
>> I clearly am not in the right mindset to be answering list emails.
>> Please ignore my response (it's too late now)--back to my stressful
>> deadline.
>
> :-)
>
>> Strange that it's implemented for prefix and not postfix?
>
> Well, an explanation is easy: Same as with LIKE, LIKE 'xxx' or LIKE
> 'xxx%' can be performed easy because only the beginning of words 
> need to
> be compared.
>
> However, there /is/ a way to also do postfix searches. I have the 
> *same*
> database in *.hlp format and with WinHelp it's possible to search
> '*otor' (and others) with almost zero CPU and time consumption. I'd 
> be
> curious how they did this.
>
> For a solution for SQLite I would accept a small performance penalty 
> in
> that case (but very few secs max); additionally I would also accept 
> the
> index being bigger.
>
> Regards,
> Luke
>
>> Wes
>>
>> On Wed, Aug 5, 2009 at 8:58 PM, Lukas Haase<lukasha...@gmx.at> 
>> wrote:
>>> Wes Freeman schrieb:
>>>> Why not LIKE '%otor'?
>>> SELECT topic_title FROM topics
>>> WHERE topic LIKE '%otor%'
>>> ORDER BY topic_title ASC;
>>>
>>> This is very, very slow, especially on my > 100 MB database. 
>>> "Realtime"
>>> search in the GUI is a requirement. This is exactly the reason why 
>>> I
>>> want to use FTS instead of LIKE...
>>>
>>> Regards,
>>> Luke
>>>
>>>> Wes
>>>>
>>>> On Wed, Aug 5, 2009 at 7:47 PM, Lukas Haase<lukasha...@gmx.at> 
>>>> wrote:
>>>>> Hi,
>>>>>
>>>>> It's me again, sorry. The next big problem concerning FTS. I 
>>>>> have the
>>>>> requirement to do postfix searches, like:
>>>>>
>>>>> SELECT topic_title FROM topics
>>>>> WHERE topic MATCH '*otor'
>>>>> ORDER BY topic_title ASC;
>>>>>
>>>>> should find Motor, motor, Monotor etc. But this does not seem to 
>>>>> work.
>>>>> Is there any chance to get this working?
>>>>>
>>>>> Best regards,
>>>>> Luke
>>>>>
>>>>> ___
>>>>> sqlite-users mailing list
>>>>> sqlite-users@sqlite.org
>>>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>>>
>>>> ___
>>>> sqlite-users mailing list
>>>> sqlite-users@sqlite.org
>>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>>
>>> ___
>>> sqlite-users mailing list
>>> sqlite-users@sqlite.org
>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS and postfix search

2009-08-05 Thread Roger Binns
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Lukas Haase wrote:
> additionally I would also accept the index being bigger.

You could have a second FTS table where you store the keywords in reverse
order :-)

Roger
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkp6ONUACgkQmOOfHg372QTg8wCgvUYTChJ48xnUBcNVeRUHoSQY
5X4AoMwt7vsNzc5yMHDo1x9gXImuiJKw
=q1HB
-END PGP SIGNATURE-
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS and postfix search

2009-08-05 Thread John Machin
On 6/08/2009 11:16 AM, Lukas Haase wrote:
> Wes Freeman schrieb:

> 
>> Strange that it's implemented for prefix and not postfix?
> 
> Well, an explanation is easy: Same as with LIKE, LIKE 'xxx' or LIKE 
> 'xxx%' can be performed easy because only the beginning of words need to 
> be compared.
> 
> However, there /is/ a way to also do postfix searches. I have the *same* 
> database in *.hlp format and with WinHelp it's possible to search 
> '*otor' (and others) with almost zero CPU and time consumption. I'd be 
> curious how they did this.

In memory: maybe a suffix tree.

In a database: have a column with the words stored backwards. SELECT ... 
WHERE back_word LIKE "roto%"
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS and postfix search

2009-08-05 Thread Lukas Haase
Wes Freeman schrieb:
> I clearly am not in the right mindset to be answering list emails.
> Please ignore my response (it's too late now)--back to my stressful
> deadline.

:-)

> Strange that it's implemented for prefix and not postfix?

Well, an explanation is easy: Same as with LIKE, LIKE 'xxx' or LIKE 
'xxx%' can be performed easy because only the beginning of words need to 
be compared.

However, there /is/ a way to also do postfix searches. I have the *same* 
database in *.hlp format and with WinHelp it's possible to search 
'*otor' (and others) with almost zero CPU and time consumption. I'd be 
curious how they did this.

For a solution for SQLite I would accept a small performance penalty in 
that case (but very few secs max); additionally I would also accept the 
index being bigger.

Regards,
Luke

> Wes
> 
> On Wed, Aug 5, 2009 at 8:58 PM, Lukas Haase wrote:
>> Wes Freeman schrieb:
>>> Why not LIKE '%otor'?
>> SELECT topic_title FROM topics
>> WHERE topic LIKE '%otor%'
>> ORDER BY topic_title ASC;
>>
>> This is very, very slow, especially on my > 100 MB database. "Realtime"
>> search in the GUI is a requirement. This is exactly the reason why I
>> want to use FTS instead of LIKE...
>>
>> Regards,
>> Luke
>>
>>> Wes
>>>
>>> On Wed, Aug 5, 2009 at 7:47 PM, Lukas Haase wrote:
 Hi,

 It's me again, sorry. The next big problem concerning FTS. I have the
 requirement to do postfix searches, like:

 SELECT topic_title FROM topics
 WHERE topic MATCH '*otor'
 ORDER BY topic_title ASC;

 should find Motor, motor, Monotor etc. But this does not seem to work.
 Is there any chance to get this working?

 Best regards,
 Luke

 ___
 sqlite-users mailing list
 sqlite-users@sqlite.org
 http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

>>> ___
>>> sqlite-users mailing list
>>> sqlite-users@sqlite.org
>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS and postfix search

2009-08-05 Thread Wes Freeman
I clearly am not in the right mindset to be answering list emails.
Please ignore my response (it's too late now)--back to my stressful
deadline.

Strange that it's implemented for prefix and not postfix?

Wes

On Wed, Aug 5, 2009 at 8:58 PM, Lukas Haase wrote:
> Wes Freeman schrieb:
>> Why not LIKE '%otor'?
>
> SELECT topic_title FROM topics
> WHERE topic LIKE '%otor%'
> ORDER BY topic_title ASC;
>
> This is very, very slow, especially on my > 100 MB database. "Realtime"
> search in the GUI is a requirement. This is exactly the reason why I
> want to use FTS instead of LIKE...
>
> Regards,
> Luke
>
>>
>> Wes
>>
>> On Wed, Aug 5, 2009 at 7:47 PM, Lukas Haase wrote:
>>> Hi,
>>>
>>> It's me again, sorry. The next big problem concerning FTS. I have the
>>> requirement to do postfix searches, like:
>>>
>>> SELECT topic_title FROM topics
>>> WHERE topic MATCH '*otor'
>>> ORDER BY topic_title ASC;
>>>
>>> should find Motor, motor, Monotor etc. But this does not seem to work.
>>> Is there any chance to get this working?
>>>
>>> Best regards,
>>> Luke
>>>
>>> ___
>>> sqlite-users mailing list
>>> sqlite-users@sqlite.org
>>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>>
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS and postfix search

2009-08-05 Thread Lukas Haase
Wes Freeman schrieb:
> Why not LIKE '%otor'?

SELECT topic_title FROM topics
WHERE topic LIKE '%otor%'
ORDER BY topic_title ASC;

This is very, very slow, especially on my > 100 MB database. "Realtime" 
search in the GUI is a requirement. This is exactly the reason why I 
want to use FTS instead of LIKE...

Regards,
Luke

> 
> Wes
> 
> On Wed, Aug 5, 2009 at 7:47 PM, Lukas Haase wrote:
>> Hi,
>>
>> It's me again, sorry. The next big problem concerning FTS. I have the
>> requirement to do postfix searches, like:
>>
>> SELECT topic_title FROM topics
>> WHERE topic MATCH '*otor'
>> ORDER BY topic_title ASC;
>>
>> should find Motor, motor, Monotor etc. But this does not seem to work.
>> Is there any chance to get this working?
>>
>> Best regards,
>> Luke
>>
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] FTS and postfix search

2009-08-05 Thread Wes Freeman
Why not LIKE '%otor'?

Wes

On Wed, Aug 5, 2009 at 7:47 PM, Lukas Haase wrote:
> Hi,
>
> It's me again, sorry. The next big problem concerning FTS. I have the
> requirement to do postfix searches, like:
>
> SELECT topic_title FROM topics
> WHERE topic MATCH '*otor'
> ORDER BY topic_title ASC;
>
> should find Motor, motor, Monotor etc. But this does not seem to work.
> Is there any chance to get this working?
>
> Best regards,
> Luke
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] FTS and postfix search

2009-08-05 Thread Lukas Haase
Hi,

It's me again, sorry. The next big problem concerning FTS. I have the 
requirement to do postfix searches, like:

SELECT topic_title FROM topics
WHERE topic MATCH '*otor'
ORDER BY topic_title ASC;

should find Motor, motor, Monotor etc. But this does not seem to work. 
Is there any chance to get this working?

Best regards,
Luke

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users