Peter Kirk wrote: Well, I was thinking that as soon as a new character is defined which is not punctuation, it is automatically not an identifier. Obviously there are a few problems of detail there, but only of the types which have to be faced by any program which has to deal with as yet undefined characters. I suppose we just have to say that behaviour with undefined characters is undefined! - as we don't yet know if they are punctuation or not.
But the other way round is less of a problem. So I am suggesting that for now we define all punctuation characters except for those with specifically defined operator functions, also all undefined characters, as giving a syntax error. This makes it possible to define additional punctuation characters, even those in so far undefined scripts like Tifinagh, as valid operators in future versions.
Yes, but this makes it impossible to use any as-yet undefined scripts in identifiers! E.g., you'd never be able to write a variable name in Tifinagh letters in future versions!
Unless you are still thinking of non-fixed sets, in which case I must remind you again that there are no balls or door-keepers in a card game... :-)
Ciao. Marco
I note the following from the TR31 draft:
For stability, the property values will be absolutely invariant; not changing with successive versions of Unicode. Of course, this doesn't limit the ability of the Unicode Standard to add more symbol or whitespace characters, but the syntax and whitespace characters recommended for use in patterns would not change.
I am not sure what is meant here by "symbol ... characters", not otherwise defined in this draft. Maybe this is an error for "syntax ... characters" as later in the sentence. Is the meaning of this "absolute invariant" that once a character is defined as a syntax character, or as a whitespace character, it will always remain one, but that additional characters may be defined as syntax characters, or as whitespace characters, in later versions? If so, we don't have a problem, as we can add Tifinagh punctuation, and Arabic and Hebrew punctuation, in later versions as required. My problem is if the list is to be understood as complete now for all time. I would see this as both unnecessary and problematic. The way round this is to define syntax relative to a specific version of Unicode.
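To make the version-relative idea concrete, here is a rough Python sketch of the rule I am suggesting. It is only an illustration under my own assumptions: the `OPERATORS` set and the `classify` function are hypothetical names, the operator list is arbitrary, and I use Unicode general categories from Python's `unicodedata` module as a stand-in for the syntax and whitespace properties in the TR31 draft, which that module does not expose. I have also lumped symbols, controls and line breaks in with punctuation for simplicity.

```python
import unicodedata

# The Unicode version this classification is relative to.
UNICODE_VERSION = unicodedata.unidata_version

# Hypothetical operator set, for illustration only.
OPERATORS = frozenset("+-*/=<>")


def classify(ch: str) -> str:
    """Classify one character as operator, identifier, whitespace, or error."""
    if ch in OPERATORS:
        return "operator"
    major = unicodedata.category(ch)[0]  # e.g. "L" from "Lu"
    if major in "LMN":  # letters, marks, numbers
        return "identifier"
    if major == "Z":  # space, line, and paragraph separators
        return "whitespace"
    # Everything else is a syntax error for now: punctuation and symbols
    # without an operator role, controls, and unassigned code points ("Cn").
    return "error"
```

Under this scheme a character that is an error today can become an operator or an identifier character in a later version without invalidating any program that was valid before, because no valid program could have contained it.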
On the point that having a small fixed list saves storage space, I can see that it might in the short term, but also that in the medium term it will increase complexity, as so many workarounds will be necessary - just as with the incorrectly fixed combining classes in Hebrew, etc.
As for goalkeepers (not door-keepers), I don't see your point. You can accuse me of trying to change soccer into American football or vice versa if you like, to which my reply is that the current rules are only a proposed draft, and my rules (like soccer!) are more globally acceptable, even for Tifinagh users. But I am talking about the same general kind of ball game, with some adjustments.
-- Peter Kirk http://www.qaya.org/

