Re: [sqlite] Substring (LIKE "%key%") searches, would FTS3 with suffix-tree tokenizer be the fast way?

Sam Roberts Fri, 06 Aug 2010 09:43:03 -0700

On Fri, Aug 6, 2010 at 6:11 AM, Adam DeVita <adev...@verifeye.com> wrote:
> A variant on Simon's plan.
> Are the 10,000 rows static, slowly changing, or frequently changing?


Never change, it's read-only.

>  Does
> it make sense to pre-calculate some counts at the time data is loaded?
>  Is
> this memory constrained so much that you can't afford 1 or 2 MB to let you
> look up based on ints? (I'm assuming that one letter is all you are after,
> either 'starts with' or 'contains' and not in order combinations.)

No, substrings, it's just that I then need a count of matching
substrings by first char.

Good idea, there are a number of other queries where pre-calculating
is linear in the space cost, but here the the usage is interactive
search, where as they type more of the name, it narrows down the
search results as people type in more.

Pre-calculating would be about 40 factorial in space, there are about
64000 3-character strings, and then once  they typed the 4th char in
it would be slow again. Of course, not all of those exist. Hm. Maybe
I'll try to precalculate the suffix tree, and see how many results
there really are, I don't need to store zero results.

The fastest I've found so far is using FTS3. Its a little slow, but
not unusably so. There are only 2500 rows now, I hope that it will
scale well as the DB increases in size. I'm still considering other
approaches, maybe a custom b-tree.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Substring (LIKE "%key%") searches, would FTS3 with suffix-tree tokenizer be the fast way?

Reply via email to