Markus Scherer wrote:
> How about U+10ffff?
> It is a non-character, which gives it a high (unassigned
> character) weight in the UCA. It is the highest code point =
> "the last character".

That is definitely not what I was looking for. It is an illegal codepoint, whereas I was looking for a legal codepoint, and one that would not merely 'happen to be' the last but would be 'defined as' the last.

Initially, I wanted a codepoint that would serve as a counterpart to the underscore (_). That is, it would be a valid alpha character (one that is guaranteed to be accepted in identifiers, even as the first character) and would have a non-zero-width representation.

Asmus Freytag also noted that there could be a use for such characters in user interfaces. For that kind of usage, however, it would be preferable to have two zero-width, non-breaking characters that would typically NOT be allowed in user input. The application could then keep reserved items at the top or bottom of a sorted list, knowing that the user can never delete them or add an item with the same name, as long as these characters are screened at the point of input. Things get more complicated if you allow reversed sort order, so I cannot say at this point whether anyone would really choose to use such an approach.

The question, if we pursue this issue, is then: are we looking for a single character that would be a counterpart to the underscore, or for four characters, two alpha characters and two zero-width spaces? To allow for the latter, I now think that these would fit better in the General Punctuation block than in the Specials block.

Lars Kristan

