Parsers for the UnicodeSet notation?

Eric Muller Wed, 23 Jul 2014 15:25:28 -0700

I would like to work with the exemplarCharacters data in the CLDR. That uses the UnicodeSet notation. Is there somewhere a parser for that notation, that would return me just the list of characters in the set? Something a bit like the UnicodeSet utility at <http://unicode.org/cldr/utility/list-unicodeset.jsp>, but for use in apps/shell.

I suspect that the exemplarCharacters use a restricted form of the UnicodeSet notation (e.g. do not use property values). Is that correct, and if so, what's the subset?

Incidentally, I copy/pasted the punctuation exemplar characters for he.xml into the utility, and it reported that the set contains 8,130 code points, including the ASCII letters. Somehow, that seems incorrect. What did I do wrong?


Thanks,
Eric.


