At Mon, 13 Dec 2010 13:57:59 -0800,
Karen Coyle wrote:
>
> I'm not sure this is the same problem, but here's a bug report:
> https://bugs.launchpad.net/openlibrary/+bug/389217
> that covers at least some of the issues. I remember that this has come
> up before, and may have something to do with Solr. Definitely, one
> needs to be able to search on accented and unaccented characters with
> the same query. Also, those of us with dumb ASCII keyboards find it
> very hard to key accented characters, although we may want to search
> on names with diacritics.
FYI, an easy way in Python to strip (many) diacritics to allow us
Americans to search the way we like, and the way our keyboards
support:
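
import unicodedata

def strip_accents(s):
    # Python 2: unicode(s) decodes a byte string with the default (ASCII)
    # codec, so pass in a unicode object if s may contain non-ASCII bytes.
    return ''.join(c for c in unicodedata.normalize('NFD', unicode(s))
                   if unicodedata.category(c) != 'Mn')
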
Normalize to NFD (the decomposed form), then strip every character in
category Mn (Mark, Nonspacing). Obviously you want to index both the
accented and stripped versions.
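
For anyone on Python 3, here is a rough sketch of the same idea plus
the "index both versions" step. The index_terms helper is hypothetical,
just for illustration; it is not anything in Open Library or Solr. It
also shows why I said "(many)": a letter like ø has no decomposed form,
so it passes through untouched.

```python
import unicodedata

def strip_accents(s: str) -> str:
    """Python 3 variant of strip_accents: str is already Unicode."""
    decomposed = unicodedata.normalize('NFD', s)
    # Drop the combining marks (category Mn) left behind by decomposition.
    return ''.join(c for c in decomposed if unicodedata.category(c) != 'Mn')

def index_terms(name: str) -> set:
    """Hypothetical helper: the forms you might index for one name."""
    return {name, strip_accents(name)}

print(strip_accents('José Martí'))    # Jose Marti
print(strip_accents('Søren'))         # Søren (ø does not decompose)
print(sorted(index_terms('Gödel')))   # ['Godel', 'Gödel']
```

Letters like ø or ł would need an explicit mapping table on top of this
if you want them folded to plain ASCII as well.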
best, Erik