On Mon, Dec 13, 2010 at 9:45 AM, Lee Passey <[email protected]> wrote:
> In your case the encoding can be a little confusing, because OL has used the
> "Combining diacritical marks" set (range 300-36f) [1]. These Unicode
> "characters" are designed not to be used as standalone characters, but rather
> as a means of modifying the /preceding/ character.

Very interesting; I didn't know Unicode had that.  I think I
understand now how some of my wildcard author searches pick up accent
variations, while others don't.

This search:
http://openlibrary.org/search/authors?q=kha*lid
gives results with both accented and unaccented "a"s, because the
wildcard can match the combining diacritical that follows the "a".  But
this search
http://openlibrary.org/search/authors?q=k*alid
only gives results with unaccented "a"s, because the literal "alid"
leaves no place in the search pattern for a combining diacritical
between the "a" and the "l".  Right?
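
To check my reasoning, here is a rough Python sketch of the same effect.
It is only an illustration, not how the OL search is actually
implemented: I am assuming the accented "a" is stored as a plain "a"
followed by U+0304 (combining macron), and I am using Python's fnmatch
as a stand-in for the wildcard matching.  The names "matches" and
"strip_combining" are made up for this example.

```python
import fnmatch
import unicodedata

# Assumption: the accented "a" is stored as "a" + U+0304 COMBINING MACRON,
# one of the marks in the combining diacritical range. The real records
# may use a different mark.
COMBINING_MACRON = "\u0304"

accented = "kha" + COMBINING_MACRON + "lid"
unaccented = "khalid"


def matches(pattern, name):
    """Shell-style wildcard match, where "*" matches any run of characters."""
    return fnmatch.fnmatchcase(name, pattern)


def strip_combining(name):
    """Decompose to NFD, then drop every combining mark."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Show which pattern/name pairs match.
for pattern in ("kha*lid", "k*alid"):
    for name in (accented, unaccented):
        print(pattern, ascii(name), matches(pattern, name))
```

In this sketch "kha*lid" matches both names, while "k*alid" matches only
the unaccented one.  Running strip_combining() over the names before
matching would make both patterns match both names, though I don't know
whether the search does anything like that.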

I learn something new every day  :-)

- Alan
_______________________________________________
Ol-tech mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
To unsubscribe from this mailing list, send email to 
[email protected]
