I will second Rick's approach and have implemented something very similar for a client when soundex feel short of expectation. It worked very well.
On Mon, Jun 3, 2013 at 5:43 PM, Rick James <rja...@yahoo-inc.com> wrote: > Soundex is the 'right' approach, but it needs improvement. So, find an > improvement, then do something like this... > Store the Soundex value in a column of its own, INDEX that column, and JOIN > on that column using "=". Thus, ... > * You have spent the effort to convert to Soundex once, not on every call. > * Multiple strings will have the same Soundex, but generally not many will > have the same. Hence, the JOIN won't be 1:1, but rather some small number. > > Other approaches (eg, Levenshtein) need both strings in the computation. It > _may_ be possible to work around that by the following. > Let's say you wanted to a "match" if > * one letter was dropped or added or changed, or > * one pair of adjacent letters was swapped. > Then... For a N-letter word, store N+1 rows: > * The word, as is, > * The N words, each shortened by one letter. > Then an equal match on that hacked column will catch single > dropped/added/changed letter with only N+1 matches. > (Minor note: doubled letters make the count less than N+1.) > >> -----Original Message----- >> From: h...@tbbs.net [mailto:h...@tbbs.net] >> Sent: Monday, June 03, 2013 8:30 AM >> To: mysql@lists.mysql.com >> Subject: string-likeness >> >> I wish to join two tables on likeness, not equality, of character strings. >> Soundex does not work. I am using the Levenstein edit distance, written in >> SQL, a very costly test, and I am in no position to write it in C and link >> it to MySQL--and joining on equality takes a fraction of a second, and >> this takes hours. Any good ideas? >> >> >> -- >> MySQL General Mailing List >> For list archives: http://lists.mysql.com/mysql >> To unsubscribe: http://lists.mysql.com/mysql > > > -- > MySQL General Mailing List > For list archives: http://lists.mysql.com/mysql > To unsubscribe: http://lists.mysql.com/mysql > -- - michael dykman - mdyk...@gmail.com May the Source be with you. -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/mysql