Re: [sqlite] Substring (LIKE "%key%") searches, would FTS3 with suffix-tree tokenizer be the fast way?

Mohd Radzi Ibrahim Fri, 06 Aug 2010 18:08:47 -0700

Have you not consider loading the whole rows into memory array and use simple 
string search or regexp? I'm sure 10,000 records could be search a blink.


best regards,
Radzi.
On 6-Aug-2010, at 3:42 AM, Sam Roberts wrote:

> I'd appreciate any suggestions on good ways to do this, I'm neither an SQL or
> sqlite expert, so I might be thinking about it all wrong.
> 
> I have something like a (read-only) address book/rolodex, with interactive
> searching. As users type into the search box, I need to first know for each
> section how many rows match the substring typed so far.  I only display the
> rows that are visible on screen.
> 
> I have two queries:
> 
> (A) I count the rows in a letter group.
> 
> If they typed "e":
> 
> select substr(name,1,1), count(*) from my_table where name like '%e%'
> group by substr(name,1,1);
> A|94
> B|118
> C|131
> ...
> 
> This is too slow, ~3sec, with 2500 rows, and we want to have 10000 rows.
> 
> Worse, when they type "es", the search is as slow after they type "s" as when
> they typed "e", even though the "es" rows are a sub-set of the rows that
> matched "e".
> 
> FTS3 only searches full terms/words by default, but I think if I built a 
> custom
> tokenizer that returned all the suffix trees for a name:
> 
> "fu bar" => [ "r", "ar", "bar", " bar", "u bar", "fu bar"]
> 
> That I could do rewrite query (A) like this:
> 
> select substr(name,1,1), count(*) from my_table where name match 'e*'
> group by substr(name,1,1);
> 
> Is this a reasonable approach? Is there a better way? Has somebody
> else done this?
> 
> 
> 
> (B) I access specific rows within a letter group.
> 
> For visible rows, I fetch them by offset into a letter group, so row 4 in the
> "g" section of names containing "e" would be:
> 
> select * from my_table where name like "g%" and name like "%e%" order
> by name limit 1 offset 4;
> 
> The performance for this is OK, right now, I think it's because the first LIKE
> can use the index, so the linear scan is over only a few hundred rows. Or it
> could be that the on-screen display of each row is slower than the DB search. 
> I
> think it might become a problem, though.
> 
> I'm not sure how I would rewrite this to use FTS3 if it turns out to be to 
> slow
> for a larger DB, maybe a tokenizer that puts the first letter of  the name as
> the first letter of every suffix?
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] Substring (LIKE "%key%") searches, would FTS3 with suffix-tree tokenizer be the fast way?

Reply via email to