Kaixo!

On Tue, Jan 08, 2002 at 11:03:14PM -0500, Glenn Maynard wrote:
 
> What, exactly, needs to be done by an application (or rather, its data
> formats) to accommodate CJK in Unicode (and other languages with similar
> ambiguities)?

Be able to load an appropriate font; that's all.

> A couple people on a Vorbis list are suggesting allowing RFC2047
> encoding in Ogg tags, to let people use encodings other than UTF-8, as a
> "fix" for these problems.

A *VERY BAD* idea.
The only result will be that most implementations ignore the non-Unicode
encodings.

The encoding used internally by a file format such as Ogg is simply *none of
the user's business*; it is an internal encoding, period. Users don't have
to know or worry about it.
The application just has to do, where needed, the appropriate conversions
from/to the user's encoding at interface boundaries, depending on the
user's locale, and give the user a way to choose the font he/she wants.
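To make the "conversion at interface boundaries" point concrete, here is a
minimal Python sketch. It assumes the file stores its text as UTF-8 and that
the user's encoding is known from the locale; the function names are made up
for illustration and are not part of any Ogg or Vorbis library.

```python
import locale


def current_user_encoding() -> str:
    """Return the encoding implied by the user's locale (e.g. 'UTF-8', 'KOI8-R')."""
    return locale.getpreferredencoding(False)


def tag_bytes_from_user_text(user_bytes: bytes, user_encoding: str) -> bytes:
    """Input boundary: convert what the user typed into the UTF-8 stored in the file."""
    return user_bytes.decode(user_encoding).encode("utf-8")


def user_bytes_from_tag(tag_bytes: bytes, user_encoding: str) -> bytes:
    """Output boundary: convert stored UTF-8 into the user's encoding for display.

    Characters the user's encoding cannot represent are replaced with '?'.
    """
    return tag_bytes.decode("utf-8").encode(user_encoding, errors="replace")


# Hypothetical usage: a user working in a KOI8-R locale tags a file.
typed = "Привет".encode("koi8-r")
stored = tag_bytes_from_user_text(typed, "koi8-r")   # UTF-8 inside the file
shown = user_bytes_from_tag(stored, "koi8-r")        # back to KOI8-R for display
```

The file itself only ever contains UTF-8; the legacy encoding exists only on
the user's side of the boundary.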

Proposing the use of non-Unicode encodings for new formats or protocols
is a very weird thing to do today.

> One of them appears to consider Unicode
> currently useless for real-world data exchange in CJK, and believes this
> to be a consensus among Asian users.

And probably those people also use a widespread operating system that
happens to use Unicode internally, and also use TTF fonts encoded in Unicode,
and so on endlessly...

> I think RFC2047 is a fairly horrible
> solution.  An alternative is simply to store the language of the text;
> is that sufficient, or are there deeper problems?

Why even bother with that?
When you write a web page, for example, do you put language tags everywhere,
around every piece of text?
And is there any browser that actually uses language tags for rendering?

The main purpose of a sound format is to contain sound. The text portions
are just informative; they are not intended to have all the flexibility
of a word processor and produce LaTeX-quality printouts; they are intended
to be just plaintext.
Simply put, UTF-8 is the ASCII of the new millennium.

> What other languages have similar problems? Something was mentioned
> about Russian, as well.  What fixes do they need?

People choose fonts according to their needs.
Those "problems" are *not* encoding problems, only font problems.

Saying that Unicode is no good for the above reasons is nearly as stupid
as saying Unicode is no good for Latin-based languages because it doesn't
disambiguate between serif and sans-serif styles.

-- 
Ki ça vos våye bén,
Pablo Saratxaga

http://www.srtxg.easynet.be/            PGP Key available, key ID: 0x8F0E4975

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
