On 8/20/07, Cesar D. Rodas <[EMAIL PROTECTED]> wrote:
> As far as I know (I could be wrong), SQLite Full Text Search only matches
> whole words, right? It could not be
> And also, no FT extension for a database (as far as I know) is tolerant of
> misspellings,

Yes, fts matches exactly. There is some primitive support for English
stemming using the Porter stemmer but, honestly, it's not well-exercised.

> And I've found this paper that talks about *Using Superimposed Coding Of
> N-Gram Lists For Efficient Inexact Matching*
> http://citeseer.ist.psu.edu/cache/papers/cs/22812/http:zSzzSzwww.novodynamics.comzSztrenklezSzpaperszSzatc92v.pdf/william92using.pdf
>
> I was reading it and it is not so hard to implement. It costs extra
> storage space, but I think the benefits outweigh that.
>
> Also, following this paper, there could be a way to match fragments of
> words... what do you think of it?

It's an interesting paper, and I must say that anything which involves
Bloom filters automatically draws my attention :-).

While I think spelling suggestion might be valuable for fts in the longer
term, I'm not very enthusiastic about this particular model. It seems much
more useful in the standard indexing model of building the index, manually
tweaking it, and then doing a ton of queries against it. fts is fairly
constrained in this respect, because many use cases are closer to updating
the index quite a bit and querying it only a few times.

Also, I think the concepts in the paper might have very significant
problems handling Unicode, because the bit vectors will get so very large.
I may be wrong; sometimes the overlapping-vector approach can have
surprising relevance, depending on the frequency distribution of the things
in the vector. It would need some experimentation to figure that out.

Certainly something to bookmark, though.

Thanks,
scott
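
For anyone following the thread, here is a minimal Python sketch of the
general idea behind superimposed n-gram signatures. It is not the paper's
exact scheme and has nothing to do with how fts is implemented. The names
(SIG_BITS, HASHES, trigrams, signature, similarity, suggest), the trigram
padding, and the 0.6 threshold are all assumptions made for illustration.
Each trigram is hashed into a fixed-width bit vector, Bloom-filter style,
so the vector size does not grow with the alphabet; the price is more
collisions, and whether that trade-off holds up for Unicode text would
still need the experimentation mentioned above.

```python
import hashlib

SIG_BITS = 256   # signature width in bits (assumed; tune to the vocabulary)
HASHES = 2       # bit positions set per trigram (assumed)

def trigrams(word):
    """Return the set of character trigrams of a word, padded with spaces."""
    padded = f"  {word.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def bit_positions(gram):
    """Map one trigram to HASHES bit positions using a stable hash."""
    digest = hashlib.sha256(gram.encode("utf-8")).digest()
    return [int.from_bytes(digest[4 * k:4 * k + 4], "big") % SIG_BITS
            for k in range(HASHES)]

def signature(word):
    """Superimpose (OR together) the bits of every trigram in the word."""
    sig = 0
    for gram in trigrams(word):
        for pos in bit_positions(gram):
            sig |= 1 << pos
    return sig

def similarity(query, candidate):
    """Fraction of the query's signature bits also set in the candidate."""
    q = signature(query)
    return bin(q & signature(candidate)).count("1") / bin(q).count("1")

def suggest(query, vocabulary, threshold=0.6):
    """Return vocabulary words whose signatures overlap the query's, best first."""
    scored = [(similarity(query, w), w) for w in vocabulary]
    return [w for s, w in sorted(scored, reverse=True) if s >= threshold]
```

A call such as suggest("serch", ["search", "sqlite", "index"]) scores each
vocabulary word by signature overlap with the query. Because unrelated
trigrams can hash to the same bits, the candidates it returns may include
false positives and should be verified against the real words. Every
update to the vocabulary also means recomputing signatures, which is part
of why this fits a build-once, query-often index better than a frequently
updated one.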