I would be interested in having something like this also.

What I don't understand in your approach is how you compute the 
(Levenshtein) distance during a search. It seems like you have a fixed 
set of tokens from your document text and these are indexed. Then you 
have a query token that you want to compare to the index based on some 
fuzzy distance. Since every query can be different, I think you have to 
compute the distance for every key in the index? That would require 
doing a full index scan.
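
To make the cost concrete, here is a minimal sketch of that brute-force
approach in Python. It assumes a hypothetical table words(word) and
registers a user-defined levenshtein() SQL function through the standard
sqlite3 module; neither the table nor the function is something SQLite
ships. Because the function is opaque to the query planner, every row
has to be visited for every query.

```python
import sqlite3


def levenshtein(a, b):
    """Edit distance by dynamic programming, O(len(a) * len(b)) time."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,                # delete ca
                cur[j - 1] + 1,             # insert cb
                prev[j - 1] + (ca != cb),   # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


def open_dictionary(words):
    """Build an in-memory database with a hypothetical words(word) table."""
    conn = sqlite3.connect(":memory:")
    # Expose the Python function to SQL as a two-argument function.
    conn.create_function("levenshtein", 2, levenshtein)
    conn.execute("CREATE TABLE words (word TEXT PRIMARY KEY)")
    conn.executemany("INSERT INTO words VALUES (?)", [(w,) for w in words])
    return conn


def fuzzy_lookup(conn, query, max_dist):
    """Return all words within max_dist edits of query (full scan)."""
    rows = conn.execute(
        "SELECT word FROM words WHERE levenshtein(word, ?) <= ? ORDER BY word",
        (query, max_dist),
    )
    return [r[0] for r in rows]
```

For example, fuzzy_lookup(open_dictionary(["house", "mouse", "table"]),
"house", 1) evaluates the distance against all three rows, even though
the primary-key index exists, because the index cannot answer a
"distance <= k" predicate.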

If there were a function that you could run a token through that would 
give you that token's "location" in some space, then you could generate a 
similar "location" for the query token and then use the rtree and 
distance. I'm not aware of any such functions, but my expertise is more 
in GIS than in text searching.

Thoughts?

Best,
   -Steve

Martin Pfeifle wrote:
> Hi, I think there is nothing available except FTS. Doing a full table
> scan and computing for each string the (Levenshtein) distance to the
> query object is too time-consuming. So what I would like to see is
> the implementation of a generic metric index which takes a metric
> distance function as a parameter. Based on such a distance
> function you could then do similarity search on any objects, e.g.
> images, strings, etc. One possible index would be the M-tree (which
> you can also organize relationally, as was done with the R*-tree).
> The idea is that you have a hierarchical index and each node is
> represented by a database object o and a covering radius r
> reflecting the maximal distance of all objects in that subtree to the
> object o. If you do a range query now, you compute the distance of
> your query object to the object o. If this distance minus the
> covering radius r is bigger than your query range, you can prune that
> subtree. You can either implement such a similarity module as its own
> extension, similar to FTS or the Spatial module, or integrate it into
> FTS and use it only for strings. Personally, I need the second
> solution because I'd like to do full and fuzzy text search. Are there
> any plans to implement something like this? If yes, I would like to
> take part in such a development.
>
> Best, Martin
> 
> 
> 
> 
> ----- Original Message ----
> From: Alberto Simões <[EMAIL PROTECTED]>
> To: General Discussion of SQLite Database <sqlite-users@sqlite.org>
> Sent: Thursday, 3 July 2008, 21:52:05
> Subject: [sqlite] Fuzzy Matching
> 
> Hello
> 
> Although I am quite certain that the answer is that SQLite does not 
> provide any mechanism to help me with this, it doesn't hurt to ask.
> Who knows, maybe somebody has a suggestion.
> 
> Basically, I am using SQLite for a dictionary, and I want to let the 
> user do fuzzy searches. OK, some simple Levenshtein distance of one
> or two would do the trick, probably.
> 
> I imagine that SQLite (given the "lite") does not provide any kind of 
> near-miss search. But perhaps somebody here has done something similar
> in some language?
> 
> Cheers Alberto
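
The pruning rule from Martin's message (skip a subtree when the distance
to its routing object o minus its covering radius r exceeds the query
range) can be sketched as follows. This is only an illustration of the
range-query test, assuming a hand-built two-level tree; it is not a full
M-tree (no insertion or node splitting), and the names Node, make_leaf,
make_inner and range_query are hypothetical.

```python
from dataclasses import dataclass, field


def levenshtein(a, b):
    """Edit distance by dynamic programming; it is a metric on strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


@dataclass
class Node:
    pivot: str                                     # routing object o
    radius: int                                    # covering radius r
    children: list = field(default_factory=list)   # sub-nodes (inner node)
    words: list = field(default_factory=list)      # stored strings (leaf)


def make_leaf(pivot, words):
    # r is the largest distance from the pivot to any stored word.
    radius = max(levenshtein(pivot, w) for w in words)
    return Node(pivot, radius, words=list(words))


def make_inner(pivot, children):
    # By the triangle inequality, d(pivot, child.pivot) + child.radius
    # is a safe upper bound on the distance to anything below the child.
    radius = max(levenshtein(pivot, c.pivot) + c.radius for c in children)
    return Node(pivot, radius, children=list(children))


def range_query(node, query, max_dist):
    """Collect all words within max_dist of query, pruning subtrees."""
    d = levenshtein(query, node.pivot)
    if d - node.radius > max_dist:
        return []  # nothing in this subtree can be close enough
    hits = [w for w in node.words if levenshtein(query, w) <= max_dist]
    for child in node.children:
        hits.extend(range_query(child, query, max_dist))
    return hits
```

With leaves around "house" and "table" under one root, a query for
"hose" with range 1 computes a single distance for the "table" leaf and
then skips its contents entirely. How much is saved in practice depends
on how tight the covering radii are, which this sketch does not address.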

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
