Re: [ol-tech] Bulk Download and Request

Erik Hetzner Mon, 13 Dec 2010 14:08:34 -0800

At Mon, 13 Dec 2010 13:57:59 -0800,
Karen Coyle wrote:
> 
> I'm not sure this is the same problem, but here's a bug report:
>     https://bugs.launchpad.net/openlibrary/+bug/389217
> that covers at least some of the issues. I remember that this has come  
> up before, and may have something to do with Solr. Definitely, one  
> needs to be able to search on accented and unaccented characters with  
> the same query. Also, those of us with dumb ASCII keyboards find it  
> very hard to key accented characters, although we may want to search  
> on names with diacritics.


FYI, an easy way in Python to strip (many) diacritics to allow us
Americans to search the way we like, and the way our keyboards
support:

  import unicodedata

  def strip_accents(s):
      # Decompose to NFD, then drop nonspacing marks (category 'Mn').
      return ''.join(c for c in unicodedata.normalize('NFD', unicode(s))
                     if unicodedata.category(c) != 'Mn')

Normalize to NFD (decomposed), then strip every character in the
"Mark, nonspacing" (Mn) category. Obviously you want to index both the
accented and stripped versions.
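
For anyone on Python 3, where str is already Unicode and unicode() no
longer exists, here is a minimal sketch of the same idea. The
index_forms helper is hypothetical and only illustrates producing both
forms for indexing; it is not part of Open Library or Solr.

```python
import unicodedata


def strip_accents(s):
    """Return s with nonspacing marks (category 'Mn') removed."""
    # NFD splits a precomposed letter into base letter + combining marks.
    decomposed = unicodedata.normalize('NFD', s)
    return ''.join(c for c in decomposed if unicodedata.category(c) != 'Mn')


def index_forms(s):
    """Hypothetical helper: the distinct forms to index for one value."""
    stripped = strip_accents(s)
    # Avoid indexing the same string twice when nothing was stripped.
    return [s] if stripped == s else [s, stripped]
```

This only strips "many" diacritics: letters such as "ø" or "ł" have no
canonical decomposition, so they pass through unchanged and would need
a separate folding table.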

best, Erik