I second Rick's approach; I implemented something very similar for
a client when Soundex fell short of expectations.  It worked very
well.

On Mon, Jun 3, 2013 at 5:43 PM, Rick James <rja...@yahoo-inc.com> wrote:
> Soundex is the 'right' approach, but it needs improvement.  So, find an 
> improvement, then do something like this...
> Store the Soundex value in a column of its own, INDEX that column, and JOIN 
> on that column using "=".  Thus, ...
> * You have spent the effort to convert to Soundex once, not on every call.
> * Multiple strings will share a Soundex value, but generally not many.
> Hence, the JOIN won't be 1:1, but rather 1:some small number.
>
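A minimal sketch of the stored-key join described above, in Python with an
in-memory SQLite database standing in for MySQL.  The table names a and b,
the column sx, the sample names, and the soundex() helper are all assumptions
made for illustration; the helper is the classic 4-character Soundex, a
placeholder for whatever improved key you settle on, and MySQL's built-in
SOUNDEX() may not return identical keys.

```python
import sqlite3

# Letter -> digit table for classic American Soundex.
CODES = {c: d for d, letters in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                                 "4": "L", "5": "MN", "6": "R"}.items()
         for c in letters}

def soundex(word):
    """Classic 4-character Soundex key (placeholder for an improved key)."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    out = letters[0]
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        code = CODES.get(c, "")
        if code and code != prev:
            out += code
        if c not in "HW":          # H and W do not separate equal codes
            prev = code
    return (out + "000")[:4]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (name TEXT, sx TEXT);
    CREATE TABLE b (name TEXT, sx TEXT);
    CREATE INDEX a_sx ON a (sx);
    CREATE INDEX b_sx ON b (sx);
""")

# Compute the key once, at insert time, instead of on every query.
for table, names in (("a", ["Robert", "Smith", "Jackson"]),
                     ("b", ["Rupert", "Smyth", "Miller"])):
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?)",
                     [(n, soundex(n)) for n in names])

# Plain equality join on the indexed key column.
matches = conn.execute(
    "SELECT a.name, b.name FROM a JOIN b ON a.sx = b.sx ORDER BY a.name"
).fetchall()
print(matches)
```

The join itself never calls the key function, so its cost is that of an
ordinary indexed equality join.
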
> Other approaches (eg, Levenshtein) need both strings in the computation.  It 
> _may_ be possible to work around that by the following.
> Let's say you wanted a "match" if
> * one letter was dropped or added or changed, or
> * one pair of adjacent letters was swapped.
> Then...  For an N-letter word, store N+1 rows:
> * The word, as is,
> * The N words, each shortened by one letter.
> Then an equality match on that hacked column will catch a single 
> dropped/added/changed letter with only N+1 matches.
> (Minor note:  doubled letters make the count less than N+1.)
>
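The N+1-row scheme above could be sketched roughly as follows, again in
Python with in-memory SQLite standing in for MySQL.  The tables a_variant and
b_variant, the variants() helper, and the sample words are hypothetical names
chosen for illustration.  Note that two words can share a shortened form and
still differ by two edits, so treat the result as a candidate list; running
the expensive edit-distance check on only those candidates is one option.

```python
import sqlite3

def variants(word):
    """The word itself plus every form with one letter removed.

    Returned as a set, so doubled letters give fewer than N+1 entries.
    """
    return {word} | {word[:i] + word[i + 1:] for i in range(len(word))}

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a_variant (word TEXT, variant TEXT);
    CREATE TABLE b_variant (word TEXT, variant TEXT);
    CREATE INDEX a_v ON a_variant (variant);
    CREATE INDEX b_v ON b_variant (variant);
""")

words_a = ["smith", "letter", "jones"]
words_b = ["smyth", "smit", "lettre", "brown"]
for table, words in (("a_variant", words_a), ("b_variant", words_b)):
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?)",
                     [(w, v) for w in words for v in variants(w)])

# Equality join on the variant column; DISTINCT collapses pairs that
# share more than one variant.
pairs = conn.execute("""
    SELECT DISTINCT a.word, b.word
    FROM a_variant AS a JOIN b_variant AS b ON a.variant = b.variant
    ORDER BY a.word, b.word
""").fetchall()
print(pairs)
```

Here "smith"/"smyth" is a changed letter, "smith"/"smit" a dropped letter,
and "letter"/"lettre" an adjacent swap; "jones" and "brown" pair with nothing.
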
>> -----Original Message-----
>> From: h...@tbbs.net [mailto:h...@tbbs.net]
>> Sent: Monday, June 03, 2013 8:30 AM
>> To: mysql@lists.mysql.com
>> Subject: string-likeness
>>
>> I wish to join two tables on likeness, not equality, of character strings.
>> Soundex does not work. I am using the Levenshtein edit distance, written in
>> SQL, which is a very costly test, and I am in no position to write it in C
>> and link it to MySQL. Joining on equality takes a fraction of a second;
>> this takes hours. Any good ideas?
>>
>>
>> --
>> MySQL General Mailing List
>> For list archives: http://lists.mysql.com/mysql
>> To unsubscribe:    http://lists.mysql.com/mysql



-- 
 - michael dykman
 - mdyk...@gmail.com

 May the Source be with you.
