On Fri, Jan 11, 2002 at 03:35:55AM -0800, Edward Cherlin wrote:
> :Kenneth Whistler <[EMAIL PROTECTED]> wrote:
> ...
> :> Most
> :> of us, including those of us culpable in the definition of the
> :> tag characters (which John Cowan pointed out were defined to head
> :> off a worse threat to UTF-8) would prefer not to see them in
> :> wide use, but rather the use of standard tagging mechanisms like
> :> XML or HTML.
> :
> :Wow. You too.
> :
> :I honestly had no idea that the use of Plane 14 language tags,
> :defined as they are in a Unicode Technical Report, was so strongly
> :deprecated by everyone "in the know" about Unicode, including their
> :own creators. I had read UTR #7 at face value, as describing an
> :optional mechanism that might help with certain processes but which
> :we were under no obligation to use, but now it appears that Plane 14
> :language tags have the RFC 1815 nature ("Here's something you can
> :use, but for God's sake, please don't use it").
> ...
In other words, "if you can use XML, use it." This can't.
> I don't understand why markup is not an option.
It's not my decision, and it appears to be one that's been made and
committed to.
> Certainly some people want to. I'm arguing that they don't need to.
So protocols should only let people do what they absolutely *need*, and
nothing more?
> Anyway, give us an example. Either a message in one language that
> cannot be displayed correctly from the plain text, or a message in
> more than one language where rendering in the user's preferred font
> loses information for that user.
Andries gave a couple of examples. I'll give another: the backslash/yen
problem.
It's not going away, and the most reasonable fix is Tomohiro's: pretend
they're two glyphs of the same character. If we know the language, we
can choose yen symbols for Japanese, and backslashes otherwise.
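Here's a rough sketch of that idea in Python, just to make it concrete.
It assumes the language arrives as a tag like "ja" or "ja-JP"; the name
display_form and the use of U+00A5 as the yen glyph are my own choices
for illustration, not anything a protocol or library defines. The stored
text is never changed, only what gets handed to the renderer.

```python
from typing import Optional

BACKSLASH = "\\"       # U+005C, the character actually stored
YEN_SIGN = "\u00a5"    # U+00A5, used here only as a display glyph


def display_form(text: str, lang: Optional[str]) -> str:
    """Return the string to render; the stored text stays untouched.

    Hypothetical helper: treats backslash and yen as two glyphs of one
    character and picks the glyph from the language tag.
    """
    # Look only at the primary subtag, so "ja-JP" counts as Japanese.
    primary = (lang or "").split("-")[0].lower()
    if primary == "ja":
        return text.replace(BACKSLASH, YEN_SIGN)
    # Unknown or non-Japanese language: show backslashes as they are.
    return text
```

With this, display_form("C:\\temp", "ja-JP") shows a yen sign and
display_form("C:\\temp", "en") shows a backslash, while the underlying
character is U+005C in both cases.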
(Now, this might be fixable at the editor level: if the editor notices
the user is entering text in CP932, it could translate backslashes to
yen symbols. That would break round-trip compatibility, however; it
might not seem important to maintain this for the two different yen
symbols in CP932, but for some people it's a showstopper.)
(I'm aware you're in favor of no such compromise, so let's try to keep
that debate in its own thread.)
(You could probably say "well, you can figure out by context whether a \
is supposed to be a backslash or yen, so you don't really *need* this."
Very well, then it's *wanted* enough.)
--
Glenn Maynard
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/