> >00B7;MIDDLE DOT;Po;0;ON;;;;;N;;;;;
> >10101;AEGEAN WORD SEPARATOR DOT;Po;0;ON;;;;;N;;;;;
> >16EB;RUNIC SINGLE PUNCTUATION;Po;0;L;;;;;N;;;;;
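For anyone who wants to check those three records against a local copy of the character database, here is a minimal sketch using Python's standard `unicodedata` module. The `CANDIDATES` list and the `describe` helper are names made up for this illustration, and the values reported depend on the Unicode version bundled with the interpreter. Note that the module does not expose the Script property, so the Scripts.txt assignment cannot be checked this way.

```python
import unicodedata

# Code points taken from the UnicodeData.txt records quoted above.
# CANDIDATES and describe() are illustrative names, not part of any standard API.
CANDIDATES = [0x00B7, 0x10101, 0x16EB]


def describe(cp):
    """Return (name, general category, bidi class) for one code point."""
    ch = chr(cp)
    return (
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.bidirectional(ch),
    )


# Look up each candidate once and keep the results for later inspection.
rows = {cp: describe(cp) for cp in CANDIDATES}

for cp, (name, gc, bidi) in rows.items():
    print(f"U+{cp:04X} {name}: gc={gc} bidi={bidi}")
```

In the fields shown, the three candidates share the general category and differ only in the bidirectional class.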
> I was meaning to ask about this. I'm all over not encoding Yet Another
> middle dot, but I was wondering. In my research on Samaritan, I've
> found that they frequently write (you guessed it) a middle dot to
> separate words (they like to use space to enable them to do this cool
> columnar writing thing). I was assuming that this could be conflated
> with someone else's middle-dot-word-separator; would that be U+10101?

As far as I am concerned, U+00B7 should be sufficient for that.

But if you were looking for a punctuation mark distinguished from
U+00B7, specifically for archaic textual practice, my choice would be
U+16EB (and the Runic double dot, U+16EC) as an alternative.
Scripts.txt treats these as common punctuation:

16EB..16ED    ; Common # Po   [3] RUNIC SINGLE PUNCTUATION..RUNIC CROSS PUNCTUATION

Unfortunately, software may be making over-aggressive assumptions
about script identity in some cases, which can throw off
implementations that pick up punctuation out of another script block.

Note that as part of the ongoing work to cover Greek paleographic
needs, a large number of multiple dot punctuation characters are
currently under ballot for addition to 10646 (and Unicode). See
2056, 2058..205E at:

http://www.unicode.org/alloc/Pipeline.html

These are (proposed to be) encoded in the General Punctuation block
to ensure that *everyone* is clear that their intended use is general,
so we don't have to keep cloning more and more such dot combinations
to handle the dot punctuation for each different paleographic
tradition.

--Ken

