> Alternatively, we can try to be more fancy and attempt some
> language-specific analysis and treatment, so depending on the language
> of the document and/or of the field used, we would do various stuff to
> the text.

Would something like Unidecode help?

http://www.tablix.org/~avian/blog/archives/2009/01/unicode_transliteration_in_python/

We could update author records to have several transliterations as
alternate names.

Joe
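
P.S. A rough sketch of the alternate-names idea, in case it helps the discussion. It uses only the standard library's `unicodedata` as a stand-in, not Unidecode itself: NFKD decomposition strips diacritics from Latin names but simply drops characters that have no ASCII base, which is exactly where a real transliteration table like Unidecode's would be needed. The helper names `ascii_fold` and `alternate_names` are made up for illustration, and how the variants would be stored on an author record is left open.

```python
import unicodedata


def ascii_fold(name):
    """Strip diacritics using NFKD decomposition.

    Characters with no ASCII base letter (e.g. Cyrillic, CJK) are dropped,
    so this is only a partial substitute for a transliteration library.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def alternate_names(name):
    """Return the original name plus its ASCII variant, if it differs.

    Hypothetical helper: a real version might collect several
    transliterations per name rather than just one.
    """
    variants = [name]
    folded = ascii_fold(name)
    if folded and folded != name:
        variants.append(folded)
    return variants
```

With this, `alternate_names("Müller, José")` gives the original string plus `"Muller, Jose"`, while a plain ASCII name comes back unchanged as a single-element list.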
