On Wed, May 18, 2005 at 02:46:31PM -0700, David M. Cook wrote:
> On Tue, May 17, 2005 at 04:05:34PM -0700, Gregory K. Ruiz-Ade wrote:
> 
> > The part that's nailing me is the letters with diacritics.
> 

As always, there's a shorter way to do it:

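from unicodedata import normalize

# s must be a Unicode string; keeps only ASCII characters after decomposition
remove_diacritics = lambda s: ''.join([c for c in normalize('NFD', s)
                                       if ord(c) < 128])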
                                       
But this will throw away some characters, since normalize just returns the
character itself when it has no decomposition. If you have characters like
that and want a sensible translation (like ß to ss), you may have to create
a translation table for them.
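
Here is a rough sketch of what that translation table might look like,
assuming Python 3 strings. The name FALLBACKS and its entries are made up
for illustration and are nowhere near complete; extend the table with
whatever characters actually show up in your data.

```python
from unicodedata import normalize

# Hypothetical fallback table for characters that NFD leaves untouched.
# The entries are illustrative only, not an exhaustive list.
FALLBACKS = {
    '\u00df': 'ss',  # sharp s
    '\u00f8': 'o',   # o with stroke
    '\u00d8': 'O',   # O with stroke
    '\u00e6': 'ae',  # ae ligature
    '\u00c6': 'AE',  # AE ligature
    '\u0111': 'd',   # d with stroke
    '\u0142': 'l',   # l with stroke
}

def remove_diacritics(s):
    """Return an ASCII approximation of the Unicode string s."""
    out = []
    for c in normalize('NFD', s):
        if ord(c) < 128:
            # Plain ASCII, including base letters split off by NFD.
            out.append(c)
        elif c in FALLBACKS:
            # No decomposition available, so use the hand-written mapping.
            out.append(FALLBACKS[c])
        # Anything else (combining marks, unmapped characters) is dropped.
    return ''.join(out)

print(remove_diacritics('Stra\u00dfe'))
```

Characters that are neither ASCII nor in the table are still silently
dropped, same as in the one-liner, so you may want to log them while you
are building up the table.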

Dave

-- 
[email protected]
http://www.kernel-panic.org/cgi-bin/mailman/listinfo/kplug-lpsg
