> From: Philippe Verdy [mailto:[EMAIL PROTECTED]
> Sent: Monday, May 24, 2004 3:28 PM
> Is it a joke? UTF-8 designates Unicode codepoints referring to
> Unicode abstract characters with all their semantic (including
> the character name and properties).

No, it is not a tweak. For years, many scholars working with electronic
versions of Biblical texts have used the MCW (not MCS -- a typo on my
part) representation, which is effectively a Latin cipher of Hebrew and
Greek characters. The abstract characters are entirely Basic Latin
characters, but they are standing for Hebrew or Greek characters.

> You can't say that the table above is ASCII not either Unicode.
> It's only a separate legacy 7-bit encoding.

It certainly could be considered ASCII or Unicode Basic Latin
characters: they are always documented as such, and viewed as such. One
*could* also consider it a legacy encoding of non-Latin characters, but
in practice it's not used that way -- it's only at a higher level of
interpretation (on the part of the user, not the system) that these are
Hebrew or Greek characters.

> which is probably
> not widely interoperable because unimplemented or not documented
> in the same common places as where ASCII and Unicode are defined.

Well, actually, it *is* interoperable within the sizeable community
that has adopted that convention -- they can and do interchange data
using this. You can find content using this representation in such
places as the Oxford Text Archive.

Peter Constable
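
To make the two levels of interpretation concrete, here is a minimal Python sketch. The three-entry table and the names ILLUSTRATIVE_MAP, is_basic_latin and interpret are assumptions invented for this illustration; the table is not the actual MCW table. As far as the system is concerned the data is plain Basic Latin, and the Hebrew reading only appears when a convention is applied on top of it.

```python
# Illustrative subset only: these three pairings are assumptions made for
# this sketch, not a statement of the actual MCW table.
ILLUSTRATIVE_MAP = {
    "B": "\u05D1",  # HEBREW LETTER BET
    "G": "\u05D2",  # HEBREW LETTER GIMEL
    "D": "\u05D3",  # HEBREW LETTER DALET
}


def is_basic_latin(text):
    """True if every code point lies in the Basic Latin block (U+0000..U+007F)."""
    return all(ord(ch) < 0x80 for ch in text)


def interpret(text, table):
    """Apply the higher-level reading; characters not in the table pass through."""
    return "".join(table.get(ch, ch) for ch in text)


sample = "BGD"

# System level: the stored data is ordinary ASCII, and its UTF-8 form is
# byte-for-byte identical to its ASCII form.
encoded = sample.encode("utf-8")
assert encoded == sample.encode("ascii")

# User level: only after applying the convention do Hebrew letters appear.
interpreted = interpret(sample, ILLUSTRATIVE_MAP)
```

Nothing in the stored bytes marks the text as Hebrew; that knowledge lives entirely in the table the reader chooses to apply.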

