From the first line, I guess you mean that all three questions have to do with the Sentence_Break property values. Namely:

http://www.unicode.org/reports/tr29/proposed.html#Table_Sentence_Break_Property_Values
http://www.unicode.org/reports/tr29/proposed.html#SContinue

Mark

On Thu, Mar 8, 2018 at 9:25 AM, fantasai via Unicode <[email protected]> wrote:

> Given that the comma and colon are categorized as SContinue,
> why is the semicolon also not SContinue?
> Also, why is the Greek Question Mark not categorized with
> the rest of the question marks?
>

As I recall, both are because the semicolon can also represent a Greek question mark (they are canonically equivalent, so you can't reliably distinguish between them).

BTW, here is a table of property differences for codepoint X, toNfc(X) (if a single character) and toNfkc(X) (again, if a single character).

https://docs.google.com/spreadsheets/d/1ZExxhAujA8kX42F8KBK3okX_So7Dt5YZvyanL8dH8tM/edit#gid=0

It was a quick dump so no guarantees that all the dots are crossed. It skips comparing properties that are purposefully different across NFC (like Decomposition_Mapping) or different code points (like Name or Block), and most CJK properties (ones starting with 'k').

> Why aren't the vertical presentation forms categorized with
> the things they are presenting?
>

At least some of them are:

U+FE10 ( ︐ ) PRESENTATION FORM FOR VERTICAL COMMA
U+FE11 ( ︑ ) PRESENTATION FORM FOR VERTICAL IDEOGRAPHIC COMMA
U+FE13 ( ︓ ) PRESENTATION FORM FOR VERTICAL COLON
U+FE31 ( ︱ ) PRESENTATION FORM FOR VERTICAL EM DASH
U+FE32 ( ︲ ) PRESENTATION FORM FOR VERTICAL EN DASH

>
> Thanks~
> ~fantasai
>
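
For anyone who wants to check the canonical-equivalence point locally, here is a minimal sketch using Python's standard unicodedata module. The helper name single_char_forms is hypothetical and only mirrors the "toNfc(X) (if a single character)" idea from the table description; it is not the code that produced the spreadsheet. Python's unicodedata does not expose Sentence_Break, so this only shows the normalization mappings, not the property values.

```python
import unicodedata


def single_char_forms(ch):
    """Return (NFC, NFKC) of ch; an entry is None if that form is not one character.

    Hypothetical helper for illustration only.
    """
    forms = []
    for form in ("NFC", "NFKC"):
        out = unicodedata.normalize(form, ch)
        forms.append(out if len(out) == 1 else None)
    return tuple(forms)


if __name__ == "__main__":
    # U+037E GREEK QUESTION MARK has a canonical (singleton) decomposition,
    # so even NFC replaces it with U+003B SEMICOLON.
    nfc, nfkc = single_char_forms("\u037e")
    print(unicodedata.name(nfc), "/", unicodedata.name(nfkc))

    # U+FE10 PRESENTATION FORM FOR VERTICAL COMMA only has a compatibility
    # decomposition, so NFC leaves it alone and NFKC maps it to U+002C COMMA.
    nfc, nfkc = single_char_forms("\ufe10")
    print(unicodedata.name(nfc), "/", unicodedata.name(nfkc))
```

Once text has been through NFC, a Greek question mark is indistinguishable from a semicolon, whereas the vertical presentation forms survive NFC and are only folded by NFKC.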

