hi,

On Sunday 04 December 2011 14:23:09 Black, Michael (IS) wrote:
> It says "here's token 'hal'" and if you return the pointer to "h" it points
> to the same place so it returns "hal" right back to you....ergo the loop.

I have read through the ext/fts3/fts3_expr.c code and found the following:
piEndOffset must point to the zero byte after the returned token, and fts3
expects the tokenizer to generate exactly one token for each search string.
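
To make the offset rule concrete, here is a toy model in Python (not the real
fts3 C code). It assumes that the caller advances its cursor by whatever end
offset the tokenizer reports. The names consume, shortest_prefix_first and
whole_word_first are made up for this sketch.

```python
def consume(text, next_token, max_calls=10):
    """Toy caller: asks for a token, then advances by the reported end offset.

    Hypothetical stand-in for the query parser, not actual fts3 code.
    max_calls caps the loop so a stalled tokenizer cannot hang the demo.
    """
    pos = 0
    tokens = []
    calls = 0
    while pos < len(text) and calls < max_calls:
        token, start, end = next_token(text[pos:])
        tokens.append(token)
        pos += end  # an end offset of 0 leaves the cursor where it was
        calls += 1
    stalled = pos < len(text)
    return tokens, stalled


def shortest_prefix_first(s):
    # Problem case: 1-character prefix with start = end = 0.
    return s[:1], 0, 0


def whole_word_first(s):
    # End offset points just past the returned token.
    return s, 0, len(s)


if __name__ == "__main__":
    print(consume("hal", shortest_prefix_first))
    print(consume("hal", whole_word_first))
```

With the end offset left at 0 the cursor never moves and only the call cap
ends the loop. With the end offset just past the token, a single call consumes
the whole search string.
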
The first call to my xNext always returned the prefix of length 1 with
piStartOffset = piEndOffset = 0. fts3 therefore advanced its internal pointer
by 0 after each iteration and then called xNext on the same string again. I
fixed this by returning the longest prefix (the given word itself) first and
pointing piEndOffset just past the returned string. Now it works.

> You don't say why you're doing this. FTS already supports prefix queries.

The FTS documentation states that if I want to search for prefixes
efficiently, I should give the maximum size of such prefixes so that FTS can
optimize for them. I want to search efficiently for prefixes of any length.

The drawback of my tokenizer is that it consumes a lot of space: for 56 MB of
strings I get a 1.2 GB file.

I assume that, since everything is done in trees, a search with my tokenizer
is O(log(n)), where n is the number of tokens in the table. Is it still
O(log(n)) if I write a tokenizer for which input = output and use the FTS
prefix search?

Greetings
johannes