On Mon, Dec 13, 2010 at 9:45 AM, Lee Passey <[email protected]> wrote:
> In your case the encoding can be a little confusing, because OL has used the
> "Combining diacritical marks" set (range 300-36f) [1]. These Unicode
> "characters" are designed not to be used as standalone characters, but rather
> as a means of modifying the /preceding/ character.

Very interesting; I didn't know Unicode had that. I think I now understand why some of my wildcard author searches pick up accent variations while others don't.

This search:

http://openlibrary.org/search/authors?q=kha*lid

gives results with both accented and unaccented "a"s, because the combining diacritical mark falls within the span matched by the wildcard. But this search:

http://openlibrary.org/search/authors?q=k*alid

only gives results with unaccented "a"s, because there is no place in the search pattern to match the combining diacritical mark. Right?

I learn something new every day :-)

- Alan
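
The reasoning above can be checked with a minimal Python sketch. It rests on assumptions: the wildcard is treated as a plain glob over code points (the standard library's fnmatch stands in here for Open Library's search, which may tokenize or normalize differently), and the name "Khālid" stored as "a" plus U+0304 COMBINING MACRON is only an illustrative example, not a record taken from OL.

```python
import unicodedata
from fnmatch import fnmatchcase

# Illustrative name only: "Khālid", decomposed (NFD) so that the accented
# letter becomes "a" followed by U+0304 COMBINING MACRON, then lowercased.
name = unicodedata.normalize("NFD", "Kh\u0101lid").lower()

# Show the individual code points; the combining mark sits right after "a".
code_points = [f"U+{ord(c):04X}" for c in name]
print(code_points)

def matches(pattern, text):
    """Glob-style match over code points (assumed stand-in for OL search)."""
    return fnmatchcase(text, pattern)

# The "*" sits exactly where the combining mark is, so it can absorb it.
print(matches("kha*lid", name))

# Here "a" must be followed directly by "l", but the mark is in between.
print(matches("k*alid", name))

# The unaccented spelling matches both patterns.
print(matches("kha*lid", "khalid"), matches("k*alid", "khalid"))
```

Under these assumptions, the decomposed name matches kha*lid but not k*alid, while plain "khalid" matches both, which is the behaviour described above.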
