On Jun 5, 2014, at 12:41 PM, Hans Aberg <haber...@telia.com> wrote:

> On 5 Jun 2014, at 17:46, Jeff Senn <s...@maya.com> wrote:
> 
>> That is: are identifiers merely sequences of characters or intended to be 
>> comparable as “Unicode strings” (under some sort of compatibility rule)?
> 
> In computer languages, identifiers are normally compared only for equality, 
> as it reduces lookup time complexity.

Well, in this case we are talking about parsing a source file and generating 
internal symbols, so the complexity of the comparison operation is a red 
herring.

The real question is how the source identifier gets mapped into a 
(compiled) symbol.  (In C++, for example, this is not an obvious operation.)

If your implication is that there should be no canonicalization (the string 
from the source is used only as a sequence of characters mapped directly to a 
symbol), then I predict sticky problems in the future.  The most obvious is 
that in some cases I will be able to change the semantics of the compiled 
program by (accidentally) canonicalizing the source text (an operation, I 
will point out, that is invisible to the user in many (most?) Unicode-aware 
editors).
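
To make the hazard concrete, here is a minimal Python sketch. The identifier 
"café", the dictionary-based symbol tables, and the canonical() helper are all 
made up for illustration, and NFC is just one possible choice of canonical 
form; this is not how any particular compiler does it.

```python
import unicodedata

# Two spellings of the same visible identifier "café".
composed = "caf\u00e9"      # U+00E9, precomposed e-acute (what NFC produces)
decomposed = "cafe\u0301"   # 'e' + U+0301 COMBINING ACUTE ACCENT (NFD form)

# Hypothetical symbol table keyed on the raw source characters,
# i.e. no canonicalization at all.
raw_symbols = {composed: "symbol_1"}
raw_hit = decomposed in raw_symbols

# Hypothetical helper: normalize an identifier before it becomes a symbol.
def canonical(name):
    return unicodedata.normalize("NFC", name)

# Same table, but keyed on the canonicalized identifier.
canonical_symbols = {canonical(composed): "symbol_1"}
canonical_hit = canonical(decomposed) in canonical_symbols

print(raw_hit, canonical_hit)
```

With raw keys, an editor that silently rewrites one spelling into the other 
turns a reference to an existing symbol into a reference to a different 
(probably undefined) one.  With canonicalized keys, both spellings resolve to 
the same symbol.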




_______________________________________________
Unicode mailing list
Unicode@unicode.org
http://unicode.org/mailman/listinfo/unicode
