I realize that Terminal_Punctuation is only an informative property,
but I have a question concerning it and characters that the Line
Breaking Algorithm identifies as being word dividers.

UAX #14 gives the following in its list of characters of Line
Break class BA:

  Other forms of visible word dividers that provide break opportunities.

  0F0B   TIBETAN MARK INTERSYLLABIC TSHEG
  1361   ETHIOPIC WORDSPACE
  17D5   KHMER SIGN BARIYOOSAN
  10100  AEGEAN WORD SEPARATOR LINE
  10101  AEGEAN WORD SEPARATOR DOT
  10102  AEGEAN CHECK MARK
  1039F  UGARITIC WORD DIVIDER

Of these seven characters, only two, U+1361 and U+17D5, have
the Terminal_Punctuation property.  Of the other five, U+10102 is
a symbol and thus is not punctuation, but what is the distinction
that causes the remaining four to not also have the Terminal_Punctuation
property?  Is it because Terminal_Punctuation is informative
that these four have slipped through the cracks, or is there
a reason I should be noticing, but am not?


