> Alternatively, we can try to be more fancy and attempt some
> language-specific analysis and treatment, so depending on the language
> of the document and/or of the field used, we would do various stuff to
> the text.

Would something like Unidecode help?
http://www.tablix.org/~avian/blog/archives/2009/01/unicode_transliteration_in_python/

We could update author records to have several transliterations as
alternate names.
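
A rough sketch of what I mean, in case it helps. It assumes the
third-party unidecode package; if that is not installed, it falls back
to stripping combining marks with the standard library, which only
approximates what Unidecode does. The alternate_names helper is made up
for illustration and is not part of any existing code.

```python
import unicodedata

try:
    # Third-party package; assumed to be installed for full transliteration.
    from unidecode import unidecode
except ImportError:
    def unidecode(text):
        # Fallback (an assumption, not Unidecode itself): decompose the
        # string and drop combining marks, which handles accented Latin
        # letters but not other scripts.
        decomposed = unicodedata.normalize("NFKD", text)
        return "".join(c for c in decomposed if not unicodedata.combining(c))


def alternate_names(name):
    """Return the original name plus its transliteration, if different."""
    variants = [name]
    ascii_form = unidecode(name)
    if ascii_form not in variants:
        variants.append(ascii_form)
    return variants


print(alternate_names("José Müller"))
```

This yields only one transliteration per name. Language-specific
variants (for example "Mueller" for "Müller") would need extra rules on
top of this.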

Joe

