I'm not sure how well this works with PostgreSQL, but for Levenshtein you can screen out strings whose length differs from your starting string's by more than your target threshold, since every extra or missing character costs at least one edit. So, if you have a 7-character string and want an edit distance of 2, your candidate pool can be limited to strings with lengths 5-9. (Off the top of my head.) If the PostgreSQL character_length() function is fast enough, this can help. For bigram comparisons, you could potentially apply a similar pre-filter.
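To make the length screen concrete, here is a rough sketch in Python rather than SQL or 4D code. The names levenshtein() and close_matches() are made up for illustration, and the distance function is a plain textbook version, not whatever PostgreSQL uses internally. The point is only the order of operations: the cheap length check runs first, and the expensive distance calculation runs only on the survivors.

```python
def levenshtein(a, b):
    """Textbook dynamic-programming edit distance, keeping two rows."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def close_matches(target, candidates, max_distance):
    """Return candidates within max_distance edits of target.

    Hypothetical helper: the length check is the cheap pre-filter,
    and levenshtein() is only called on strings that pass it.
    """
    matches = []
    for candidate in candidates:
        # A length gap larger than max_distance already guarantees
        # an edit distance larger than max_distance, so skip early.
        if abs(len(candidate) - len(target)) > max_distance:
            continue
        if levenshtein(target, candidate) <= max_distance:
            matches.append(candidate)
    return matches


words = ["example", "exampel", "exam", "examples!!", "sample", "exemplar"]
result = close_matches("example", words, 2)
```

With the 7-character target "example" and a threshold of 2, "exam" (4 characters) and "examples!!" (10 characters) are dropped by the length check alone, without ever reaching the distance calculation. In the database, the equivalent would presumably be a condition on character_length() placed ahead of the distance call, assuming the planner evaluates it cheaply.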
Not sure if this pays off in your case, but if you can do fast filters before applying the more intensive coefficient calculation, maybe it would be of some help?

