> From: Philippe Verdy [mailto:[EMAIL PROTECTED]
> Sent: Monday, May 24, 2004 3:28 PM


> Is it a joke? UTF-8 designates Unicode codepoints referring to
> Unicode abstract characters with all their semantics (including
> the character name and properties).

No, it is not a joke. For years, many scholars working with electronic
versions of Biblical texts have used the MCW (not MCS -- a typo on my
part) representation, which is effectively a Latin cipher of Hebrew and
Greek characters. The abstract characters are entirely Basic Latin
characters, but they stand for Hebrew or Greek characters.



> You can't say that the table above is ASCII, nor Unicode either.
> It's only a separate legacy 7-bit encoding.

It certainly could be considered ASCII or Unicode Basic Latin
characters: they are always documented as such, and viewed as such. One
*could* also consider it a legacy encoding of non-Latin characters, but
in practice it's not used that way -- it's only at a higher level of
interpretation (on the part of the user, not the system) that these are
Hebrew or Greek characters.
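
To make the two levels concrete, here is a minimal Python sketch. The
table is a hypothetical, partial excerpt (six consonants only) written
from memory of the commonly cited Michigan-Claremont style
transliteration, so treat the specific pairings as an assumption, not
as the documented convention. The names MCW_TO_HEBREW, is_basic_latin
and decode_mcw are made up for illustration.

```python
# Assumed, partial table: a few consonants in the Michigan-Claremont
# style transliteration. Illustrative only; not the full convention.
MCW_TO_HEBREW = {
    ")": "\u05D0",  # alef
    "B": "\u05D1",  # bet
    "R": "\u05E8",  # resh
    "$": "\u05E9",  # shin
    "Y": "\u05D9",  # yod
    "T": "\u05EA",  # tav
}


def is_basic_latin(text):
    """True if every character lies in the Basic Latin (ASCII) range."""
    return all(ord(ch) < 0x80 for ch in text)


def decode_mcw(text):
    """Apply the user-level interpretation: map each known cipher
    character to a Hebrew letter, leaving unknown characters as-is."""
    return "".join(MCW_TO_HEBREW.get(ch, ch) for ch in text)


sample = "BR)$YT"

# System level: the data is ordinary ASCII and encodes as such.
encoded = sample.encode("ascii")

# User level: the same characters are read as Hebrew letters.
hebrew = decode_mcw(sample)
```

At the system level `sample` is nothing but Basic Latin; only the
lookup in `decode_mcw`, which stands in for the reader's
interpretation, turns it into Hebrew letters.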


> which is probably
> not widely interoperable because unimplemented or not documented
> in the same common places as where ASCII and Unicode are defined.

Well, actually, it *is* interoperable within the sizeable community that
has adopted that convention -- they can and do interchange data using
this. You can find content using this representation in such places as
the Oxford Text Archive.



Peter Constable





