John Machin wrote:
> I have developed a table which maps most latin-decorated Unicode 
> characters into the non-decorated basic form. 

There is a fascinating article by Sean Burke (a linguist) about
converting all Unicode characters into US-ASCII.  The conversion is
based primarily on sound, so in theory running soundex on the result
could be somewhat useful.

  http://interglacial.com/~sburke/tpj/as_html/tpj22.html

You can find his tables, encoded as Perl data structures, at this link:

  http://cpansearch.perl.org/src/SBURKE/Text-Unidecode-0.04/lib/Text/Unidecode/
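
As a rough illustration of the "reduce to ASCII, then soundex" idea, here
is a minimal Python sketch.  It does not use Sean Burke's tables: as a
stand-in it only strips Latin decorations via Unicode NFKD decomposition
and silently drops anything that still is not ASCII, whereas his tables
map such characters by sound.  The function names strip_decorations and
soundex are made up for this example, and the soundex shown is the
common American variant.

```python
import unicodedata


def strip_decorations(text):
    """Reduce decorated Latin letters to their base form.

    Assumption: this is NOT the Text::Unidecode mapping. It decomposes
    with NFKD, drops combining marks, and discards whatever is still
    non-ASCII (for example, the German sharp s is simply lost).
    """
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return no_marks.encode("ascii", "ignore").decode("ascii")


# Letter -> digit table for American soundex.
_CODES = {}
for digit, letters in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for letter in letters:
        _CODES[letter] = str(digit)


def soundex(word):
    """Return the 4-character soundex code of an ASCII word ('' if no letters)."""
    letters = [ch for ch in word.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        # H and W do not separate letters with the same code; vowels do.
        if ch not in "HW":
            prev = code
    return (result + "000")[:4]


if __name__ == "__main__":
    for name in ("Müller", "Muller", "José", "Jose"):
        print(name, strip_decorations(name), soundex(strip_decorations(name)))
```

With this, a decorated and an undecorated spelling of the same name end
up with the same code, which is the property you would want for fuzzy
matching.  Swapping strip_decorations for a lookup in the tables above
should give better results for non-Latin scripts.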

Roger
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users