TUS does not prevent anyone from putting noncharacter code points in Unicode
strings. As a matter of fact, p. 23 of TUS 3.0 reads U+ is reserved for
private program use as a sentinel or other signal. I would expect this to
hold true for the noncharacters that were introduced later too. It may
David Hopwood said:
At 09:01 AM 3/19/02 -0800, Yves Arrouye wrote:
TUS does not prevent anyone from putting noncharacter code points in Unicode
strings. As a matter of fact, p. 23 of TUS 3.0 reads U+ is reserved
for private program use as a sentinel or other signal.
But it is
Markus Scherer wrote:
How about U+10?
It is a non-character, which gives it a high (unassigned
character) weight in the UCA. It is the highest code point =
the last character.
That is definitely not what I was looking for. It is an illegal codepoint,
while I was looking for a
Asmus Freytag wrote:
At 09:01 AM 3/19/02 -0800, Yves Arrouye wrote:
TUS does not prevent anyone from putting noncharacter code points in Unicode
strings. As a matter of fact, p. 23 of TUS 3.0 reads U+ is reserved
for private program use as a sentinel or other
Markus Scherer wrote:
How about U+10?
It is a non-character, which gives it a high (unassigned
character) weight in the UCA. It is the highest code point =
the last character.
That is definitely not what I was looking for. It is an illegal codepoint,
while I was looking for a legal
Lars Kristan responded:
Markus Scherer wrote:
How about U+10?
It is a non-character, which gives it a high (unassigned
character) weight in the UCA. It is the highest code point =
the last character.
That is definitely not what I was looking for. It is an illegal codepoint,
Since collation depends on the language and not the code point or encoding
or anything else, there is no absolute last character that would be the last
character in every possible collation?
MichKa
Michael Kaplan
Trigeminal Software, Inc. -- http://www.trigeminal.com/
----- Original Message -----
Lars Kristan asked:
Is there a character (codepoint), that is guaranteed to be sorted (collated)
after all other codepoints?
Like:
_WantThisOneOnTop
Able
Baker
NoMatterWhat
^WantThisOneOnBottom
^^and_so_on
Where _ is the underscore, which is usually collated 'quite high'.
And ^
Kenneth Whistler wrote:
In the Unicode Collation Algorithm (UTS #10), there is no explicit
weight assigned corresponding to S, but a primary weight
assignment of 0x is guaranteed to be higher than that of
any Han character.
Well, then I am proposing to introduce such a character.
At 11:13 AM 3/15/02 -0800, you wrote:
Once again, if you want a *character* to
correspond to that highest weight, then you have to tailor the
table to do so. But then, of course, you could assign any character
you want to have that highest weight value, including a private
use character or even a
How about U+10?
It is a non-character, which gives it a high (unassigned character) weight in the UCA.
It is the highest code point = the last character.
It cannot be a Private-Use character, so few people will be tempted to tailor it to
something other than its default UCA weight.
It also
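The tailoring idea discussed in this thread (give one chosen character the highest weight, so that it sorts after everything else) can be sketched in a few lines of Python. This is only a toy illustration, not the Unicode Collation Algorithm: the `tailored_key` function, the `SENTINEL` name, and the choice of the private-use code point U+E000 are all assumptions made for this example, and ordinary characters are simply compared by code point.

```python
# Toy illustration only: this is NOT the Unicode Collation Algorithm (UTS #10).
# It mimics a "tailoring" in which one chosen character gets the highest weight.

SENTINEL = "\uE000"  # assumption: a private-use code point picked for this sketch


def tailored_key(s, sentinel=SENTINEL):
    """Return a sort key in which `sentinel` outranks every other character."""
    # Each character maps to (rank, code point). Rank 1 is reserved for the
    # sentinel, so it compares greater than any ordinary character (rank 0).
    return [(1, 0) if ch == sentinel else (0, ord(ch)) for ch in s]


names = [
    "NoMatterWhat",
    SENTINEL + "WantThisOneOnBottom",
    "\U0001F600 has a higher code point than the sentinel",
    "Baker",
    "Able",
]

# Plain code point order would put the U+1F600 string last;
# the tailored key puts the sentinel string last instead.
ordered = sorted(names, key=tailored_key)
```

Because the sentinel's position comes from the key function rather than from its code point, any character could play this role, which matches the point made above that the highest weight has to be assigned by tailoring.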