Tom Emerson wrote:
> One gotcha, that I run into every six months or so, is forgetting that
> the punctuation characters in the Basic Latin block are classified as
> Latin script. This trips me up because most of my text processing work
> involves CJK, so I'll write something to filter latin characters with
> (in Rosette notation):
That must be a Rosette-specific behavior: in UTR#24 (and in its database <Scripts.txt>), the only ASCII-range code-points classified as "Latin" are the upper- and lower-case letters.

_ Marco
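
To make the point concrete, here is a minimal Python sketch of a script lookup over the ASCII range. It is not Rosette notation and not anyone's actual filter. The embedded table is a hand-condensed, hypothetical excerpt written in the "range ; script" style of Scripts.txt (the real file lists much finer-grained ranges with comments), and the names parse_ranges, script_of and strip_latin are made up for illustration.

```python
# Hand-condensed, hypothetical excerpt in the "range ; script" style of
# Scripts.txt. The real file splits these spans into many smaller entries.
SCRIPTS_EXCERPT = """\
0000..0040 ; Common
0041..005A ; Latin
005B..0060 ; Common
0061..007A ; Latin
007B..007F ; Common
"""


def parse_ranges(text):
    """Parse 'XXXX..YYYY ; Script' lines into (low, high, script) tuples."""
    ranges = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        span, script = (part.strip() for part in line.split(";"))
        first, _, last = span.partition("..")
        low = int(first, 16)
        high = int(last, 16) if last else low  # single code point entry
        ranges.append((low, high, script))
    return ranges


def script_of(ch, ranges):
    """Return the script listed for ch, or 'Unknown' if no range covers it."""
    cp = ord(ch)
    for low, high, script in ranges:
        if low <= cp <= high:
            return script
    return "Unknown"


RANGES = parse_ranges(SCRIPTS_EXCERPT)


def strip_latin(text):
    """Drop every character whose script is Latin; keep everything else."""
    return "".join(ch for ch in text if script_of(ch, RANGES) != "Latin")
```

With this table, strip_latin("Hello, world!") returns ", !": the letters are removed, but the comma, space and exclamation mark survive because they are not listed as Latin. A filter that is also meant to drop ASCII punctuation would need to test for it separately.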

