On Sun, 2005-04-17 at 07:46 +1000, Howard Lowndes wrote:
> Does anyone know of a better FOSS algorithm than soundex for fuzzy name 
> matching?

Soundex is stuffed.  It only works for white American surnames circa
1950.

Look in the source tarball of GNU Gettext for a file called fstrcmp.c.
It does a fuzzy strcmp and returns a double, from 0.0 if completely
different to 1.0 for identical.  It is based on an edit distance in the
Levenshtein family, the same kind diff uses to find minimal diffs.
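
As a rough illustration of the idea (not the gettext code itself), a
normalised edit-distance score could look something like the Python
sketch below.  The names levenshtein and similarity are made up for this
example, and scaling by the length of the longer string is an
assumption; fstrcmp.c may well normalise differently.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical helper: 0.0 = completely different, 1.0 = identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # assumption: two empty strings count as identical
    return 1.0 - levenshtein(a, b) / longest


if __name__ == "__main__":
    # Compare a few spellings of a surname against a reference spelling.
    for candidate in ("Lowndes", "Lownds", "Lounds", "Miller"):
        print(candidate, round(similarity("Lowndes", candidate), 3))
```

For name matching you would typically lower-case and strip both strings
first, then accept anything scoring above a threshold you have tuned on
your own data.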

I've been lobbying for fstrcmp to be added to GNU libc, without success.

-- 
Regards
Peter Miller <[EMAIL PROTECTED]>
/\/\*        http://www.canb.auug.org.au/~millerp/

PGP public key ID: 1024D/D0EDB64D
fingerprint = AD0A C5DF C426 4F03 5D53  2BDB 18D8 A4E2 D0ED B64D
See http://www.keyserver.net or any PGP keyserver for public key.

-- 
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html