At 05:52 AM 11/20/2003, Philippe Verdy wrote:
We need a comprehensive new technical report that lists all the exceptions
to the general category system, as these line-breaking or word-breaking or
grapheme cluster breaking properties are orthogonal to the basic GC system
and to the combining class
At 05:44 AM 11/19/2003, Philippe Verdy wrote:
However, a couple of paragraphs up, the definition for No-Break
Space says:
U+00A0 [No-Break Space] behaves like the following coded
character sequence: U+FEFF [Zero Width No-Break Space] +
U+0020 [Space] + U+FEFF [Zero Width No-Break Space].
On 19/11/2003 17:44, Philippe Verdy wrote:
...
This trick doesn't work if any of the CC's are in combining class zero.
Of course, but which combining character of combining class 0 needs to
combine with NBSP in a way that affects renderers?
Are you thinking of sequences like NBSP,CGJ?
Or
From: Peter Kirk [EMAIL PROTECTED]
As for line breaking (UAX14), WJ explicitly prohibits this; ZWJ and ZWNJ
are not listed, and so as Cf characters are ignored in the line breaking
algorithm. I note also that the combining mark CGJ is listed as GL and
so is not CM. The descriptive text of
In the online 4.0 book, chapter 15
http://www.unicode.org/versions/Unicode4.0.0/ch15.pdf
the definition for Word Joiner says:
Until Unicode 3.1.1, U+FEFF was the only code point with word
joining semantics, but because it is more commonly used as
byte order mark, the use of U+2060 [word
From: Pim Blokland [EMAIL PROTECTED]
However, a couple of paragraphs up, the definition for No-Break
Space says:
U+00A0 [No-Break Space] behaves like the following coded
character sequence: U+FEFF [Zero Width No-Break Space] +
U+0020 [Space] + U+FEFF [Zero Width No-Break Space].
Is this
On 19/11/2003 01:49, Pim Blokland wrote:
In the online 4.0 book, chapter 15
http://www.unicode.org/versions/Unicode4.0.0/ch15.pdf
the definition for Word Joiner says:
Until Unicode 3.1.1, U+FEFF was the only code point with word
joining semantics, but because it is more commonly used as
From: Peter Kirk [EMAIL PROTECTED]
Does this equivalence hold when combining characters are applied to the
NBSP? Is the sequence NBSP, CC (recommended for spacing diacritics,
where CC is any sequence of combining characters) equivalent to ZWNBS,
SP, ZWNBS, CC? Or should the equivalence be to
From: Philippe Verdy [EMAIL PROTECTED]
So, NBSP,CC must not be treated as if it was:
WJ,SP,WJ,CC
but really rather as:
WJ,SP,CC,WJ
Note here the inversion.
The inversion here acts as if WJ was a combining character of combining
class 256 (i.e. with a class higher than the combining
On 19/11/2003 16:26, Philippe Verdy wrote:
From: Philippe Verdy [EMAIL PROTECTED]
So, NBSP,CC must not be treated as if it was:
WJ,SP,WJ,CC
but really rather as:
WJ,SP,CC,WJ
Note here the inversion.
The inversion here acts as if WJ was a combining character of combining
class 256
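
The reordering described in this message can be sketched as follows. This is only an illustration, not anything defined by Unicode: the pseudo class value 256 for WJ comes from the message above, while the helper names (pseudo_class, expand_nbsp, reorder) are hypothetical, and the sort simply imitates canonical ordering on runs of characters with a non-zero class. It also shows the limitation raised elsewhere in the thread: a combining character of class zero (CGJ here) blocks the reordering.

```python
import unicodedata

WJ = "\u2060"    # WORD JOINER
NBSP = "\u00A0"  # NO-BREAK SPACE
SP = "\u0020"    # SPACE

# Assumption from the thread: WJ is treated as if it had a combining
# class above every real one. This is NOT a Unicode property.
PSEUDO_WJ_CLASS = 256


def pseudo_class(ch):
    """Real canonical combining class, except WJ gets the pseudo class."""
    return PSEUDO_WJ_CLASS if ch == WJ else unicodedata.combining(ch)


def expand_nbsp(text):
    """Replace NBSP by WJ, SP, WJ (the equivalence quoted from chapter 15,
    written with WJ in place of ZWNBSP as in the message above)."""
    return text.replace(NBSP, WJ + SP + WJ)


def reorder(text):
    """Stable-sort each run of non-zero-class characters by pseudo class,
    in the manner of canonical ordering."""
    out, run = [], []
    for ch in text:
        if pseudo_class(ch) == 0:
            # A class-zero character ends the current run and blocks reordering.
            out.extend(sorted(run, key=pseudo_class))
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sorted(run, key=pseudo_class))
    return "".join(out)


ACUTE = "\u0301"      # combining class 230
DOT_BELOW = "\u0323"  # combining class 220
CGJ = "\u034F"        # combining class 0

one_mark = reorder(expand_nbsp(NBSP + ACUTE))
two_marks = reorder(expand_nbsp(NBSP + DOT_BELOW + ACUTE))
blocked = reorder(expand_nbsp(NBSP + CGJ))
```

With these inputs, one_mark and two_marks end with the trailing WJ after the diacritics (WJ,SP,CC,WJ), while blocked stays as WJ,SP,WJ,CGJ because CGJ has class zero.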
From: Peter Kirk [EMAIL PROTECTED]
Of course this is not a standard normalization form, but using this
pseudo combining class may help render the last two coded strings (in my
quote above) equivalently in renderers.
This works even in the case where there are multiple diacritics (noted CC1