Stephen Woodbridge wrote:
> I would be interested in having something like this also.
>
> What I don't understand in your approach is how you compute the
> (Levenshtein) distance during a search. It seems like you have a fixed
> set of tokens from your document text and these are indexed. Then you
> have a query token that you want to compare to the index based on some
> fuzzy distance. Since every query can be different, I think you have to
> compute the distance for every key in the index? That would require
> doing a full index scan.
>
> If there were a function that you could run a token through that would
> give you that token's "location" in some space, then you could generate a
> similar "location" for the query token and then use the rtree and
> distance. I'm not aware of any such functions, but my expertise is more
> in GIS than search searching.
Hmmm, that was supposed to say text searching.

> Thoughts?
>
> Best,
> -Steve
>
> Martin Pfeifle wrote:
>> Hi, I think there is nothing available except FTS. Doing a full table
>> scan and computing for each string the (Levenshtein) distance to the
>> query object is too time-consuming. So what I would like to see is
>> the implementation of a generic metric index which takes as one
>> parameter a metric distance function. Based on such a distance
>> function you could then do similarity search on any objects, e.g.
>> images, strings, etc. One possible index would be the M-tree (which
>> you can also organize relationally, as was done with the R*-tree).
>> The idea is that you have a hierarchical index and each node is
>> represented by a database object o and a covering radius r
>> reflecting the maximal distance of all objects in that subtree to the
>> object o. If you do a range query now, you compute the distance of
>> your query object to the object o. If this distance minus the
>> covering radius r is bigger than your query range, you can prune that
>> subtree. You can either implement such a similarity module as its own
>> extension, similar to FTS or the Spatial module, or integrate it into
>> FTS and use it only for strings. Personally, I need the second
>> solution because I'd like to do full and fuzzy text search. Are there
>> any plans to implement something like this? If yes, I would like to
>> take part in such a development.
>>
>> Best,
>> Martin
>>
>> ----- Original Message ----
>> From: Alberto Simões <[EMAIL PROTECTED]>
>> To: General Discussion of SQLite Database <sqlite-users@sqlite.org>
>> Sent: Thursday, 3 July 2008, 21:52:05
>> Subject: [sqlite] Fuzzy Matching
>>
>> Hello
>>
>> Although I am quite certain that the answer is that SQLite does not
>> provide any mechanism to help me with this, it doesn't hurt to ask. Who
>> knows if anybody has any suggestions.
>>
>> Basically, I am using SQLite for a dictionary, and I want to let the
>> user do fuzzy searches. OK, a simple Levenshtein distance of one
>> or two would do the trick, probably.
>>
>> I imagine that SQLite (given the "lite") does not provide any kind of
>> near-miss search. But has somebody here perhaps done something similar
>> in any language?
>>
>> Cheers
>> Alberto
>
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
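
For reference, here is a minimal Python sketch of the two ideas discussed in this thread. It is not something any of the posters wrote, only an illustration under stated assumptions. It registers a Levenshtein function with SQLite through the standard `sqlite3` module and runs the full-scan query that Stephen and Martin describe as too slow for large tables, then expresses Martin's pruning rule (distance to the node object minus the covering radius exceeds the query range) as a one-line check. The table name `dictionary`, the column `word`, and the helpers `fuzzy_scan`, `can_prune`, and `make_demo_db` are hypothetical names chosen for the example.

```python
import sqlite3


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit cost for insert, delete, and substitute."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]


def make_demo_db(words):
    """In-memory database with a hypothetical `dictionary(word)` table."""
    conn = sqlite3.connect(":memory:")
    # Expose the Python function to SQL as levenshtein(x, y).
    conn.create_function("levenshtein", 2, levenshtein)
    conn.execute("CREATE TABLE dictionary (word TEXT PRIMARY KEY)")
    conn.executemany("INSERT INTO dictionary VALUES (?)",
                     [(w,) for w in words])
    return conn


def fuzzy_scan(conn, query, max_dist):
    """Full table scan: the distance is computed once per row."""
    rows = conn.execute(
        "SELECT word FROM dictionary "
        "WHERE levenshtein(word, ?) <= ? ORDER BY word",
        (query, max_dist),
    )
    return [row[0] for row in rows]


def can_prune(query, node_object, covering_radius, query_range):
    """Pruning rule from the thread: a subtree can be skipped when
    d(query, o) - r > query_range. Valid only for a true metric."""
    return levenshtein(query, node_object) - covering_radius > query_range


if __name__ == "__main__":
    conn = make_demo_db(["house", "mouse", "horse", "moose", "table"])
    print(fuzzy_scan(conn, "house", 1))
    print(can_prune("house", "table", 1, 1))
```

The scan is fine for a small dictionary and a distance of one or two, as in Alberto's case. The pruning check only pays off once the words are grouped into subtrees with known covering radii, which is the part an M-tree style index would have to maintain.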