RE: Collation - last character?

2002-03-22 Thread Yves Arrouye
TUS does not prevent anyone to put noncharacter code points in Unicode strings. As a matter of fact, p. 23 of TUS 3.0 reads U+ is reserved for private program use as a sentinel or other signal. I would expect this to hold true for the noncharacters that were introduced later too. It may

Re: Collation - last character?

2002-03-20 Thread Kenneth Whistler
David Hopwood said: At 09:01 AM 3/19/02 -0800, Yves Arrouye wrote: TUS does not prevent anyone to put noncharacter code points in Unicode strings. As a matter of fact, p. 23 of TUS 3.0 reads U+ is reserved for private program use as a sentinel or other signal. But it is

RE: Collation - last character?

2002-03-19 Thread Yves Arrouye
Markus Scherer wrote: How about U+10FFFF? It is a non-character, which gives it a high (unassigned character) weight in the UCA. It is the highest code point = the last character. That is definitely not what I was looking for. It is an illegal codepoint, while I was looking for a

Re: Collation - last character?

2002-03-19 Thread David Hopwood
Asmus Freytag wrote: At 09:01 AM 3/19/02 -0800, Yves Arrouye wrote: TUS does not prevent anyone to put noncharacter code points in Unicode strings. As a matter of fact, p. 23 of TUS 3.0 reads U+ is reserved for private program use as a sentinel or other

RE: Collation - last character?

2002-03-18 Thread Lars Kristan
Markus Scherer wrote: How about U+10FFFF? It is a non-character, which gives it a high (unassigned character) weight in the UCA. It is the highest code point = the last character. That is definitely not what I was looking for. It is an illegal codepoint, while I was looking for a legal

RE: Collation - last character?

2002-03-18 Thread Kenneth Whistler
Lars Kristan responded: Markus Scherer wrote: How about U+10FFFF? It is a non-character, which gives it a high (unassigned character) weight in the UCA. It is the highest code point = the last character. That is definitely not what I was looking for. It is an illegal codepoint,

Re: Collation - last character?

2002-03-15 Thread Michael (michka) Kaplan
Since collation depends on the language and not the code point or encoding or anything else, there is no absolute last character that would be the last character in every possible collation? MichKa

Re: Collation - last character?

2002-03-15 Thread Kenneth Whistler
Lars Kristan asked: Is there a character (codepoint), that is guaranteed to be sorted (collated) after all other codepoints? Like: _WantThisOneOnTop Able Baker NoMatterWhat ^WantThisOneOnBottom ^^and_so_on Where _ is the underscore, which is usually collated 'quite high'. And ^

RE: Collation - last character?

2002-03-15 Thread Lars Kristan
Kenneth Whistler wrote: In the Unicode Collation Algorithm (UTS #10), there is no explicit weight assigned corresponding to S, but a primary weight assignment of 0x is guaranteed to be higher than that of any Han character. Well, then I am proposing to introduce such a character.

Re: Collation - last character?

2002-03-15 Thread Asmus Freytag
At 11:13 AM 3/15/02 -0800, you wrote: Once again, if you want a *character* to correspond to that highest weight, then you have to tailor the table to do so. But then, of course, you could assign any character you want to have that highest weight value, including a private use character or even a

Re: Collation - last character?

2002-03-15 Thread Markus Scherer
How about U+10FFFF? It is a non-character, which gives it a high (unassigned character) weight in the UCA. It is the highest code point = the last character. It cannot be a Private-Use character, so few people will be tempted to tailor it to something other than its default UCA weight. It also