Hi,

At Wed, 04 Jul 2001 11:31:45 +0300,
Shaul Karl <[EMAIL PROTECTED]> wrote:

> 1) It is my opinion that only the Description field and the fields names
> should have a translation. Does this sounds reasonable?

Yes, it would be nice to have a translation mechanism for Description
fields (titles and contents).

> 2) An important issue that should be agreed upon is the character encoding
> scheme of these i18n files. If I remember correctly the `Introduction to
> i18n' suggests that UTF-8 should be chosen.

There are three possibilities, I think.

1. Use locale-dependent encodings. Since different encodings cannot
   coexist in a single file, the translations will have to be separated
   into different files. So far almost all translation-related things,
   such as man pages, info pages, message catalogs (aka gettext),
   debconf templates, and so on, work this way. I have heard that some
   translation mechanisms (old Gnome?) violate this rule, i.e., they
   put different encodings into one file, and that this annoys
   translators (translations in different encodings sometimes get
   broken).

2. Use UTF-8, a universal encoding. This allows all translations to be
   included in one file. However, software that handles Descriptions
   will need to convert from UTF-8 to the locale-dependent encoding.
   Fortunately, GNU libc (since version 2.2) supplies nl_langinfo()
   and iconv() for this purpose.

3. Use ISO-2022, another universal encoding. Like UTF-8, this will
   require encoding conversion. (iconv(3) of GNU libc doesn't support
   ISO-2022.)

I think (2) is the best, as you wrote, because the advantage of (1)
will decrease in the future: (1) assumes one fixed encoding per
language (for example, ISO-8859-1 for French, EUC-JP for Japanese,
KOI8-R for Russian, ...). Moreover, we might use UTF-8 for all
languages in the future. We are moving in this direction, though I
don't know how many years we will need to complete this migration to
UTF-8.
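
As a rough illustration of the conversion that option (2) requires, here
is a minimal C sketch. The helper names convert_from_utf8() and
description_for_locale() are made up for this example and do not come
from any existing tool. It assumes a system whose iconv_open() accepts
both "UTF-8" and the name returned by nl_langinfo(CODESET), such as a
GNU libc system.

```c
#include <assert.h>
#include <iconv.h>
#include <langinfo.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert a NUL-terminated UTF-8 string into the
 * encoding named by to_codeset. Returns a malloc()ed NUL-terminated
 * string, or NULL if the conversion is unsupported or fails. */
char *convert_from_utf8(const char *utf8, const char *to_codeset)
{
    assert(utf8 != NULL && to_codeset != NULL);

    iconv_t cd = iconv_open(to_codeset, "UTF-8");
    if (cd == (iconv_t)-1)
        return NULL;                    /* encoding pair not supported */

    size_t in_left = strlen(utf8);
    size_t out_size = in_left * 4 + 16; /* generous guess at output size */
    char *out = malloc(out_size + 1);   /* +1 for the terminating NUL */
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *in_ptr = (char *)utf8;        /* iconv() wants char **; input is not modified */
    char *out_ptr = out;
    size_t out_left = out_size;

    /* Convert the text, then flush any pending shift sequence. */
    if (iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left) == (size_t)-1 ||
        iconv(cd, NULL, NULL, &out_ptr, &out_left) == (size_t)-1) {
        free(out);
        iconv_close(cd);
        return NULL;                    /* invalid input or unconvertible character */
    }

    *out_ptr = '\0';
    iconv_close(cd);
    return out;
}

/* Hypothetical helper: convert a UTF-8 Description into the encoding of
 * the current locale, as reported by nl_langinfo(CODESET). */
char *description_for_locale(const char *utf8)
{
    return convert_from_utf8(utf8, nl_langinfo(CODESET));
}
```

A caller would first run setlocale(LC_ALL, "") so that
nl_langinfo(CODESET) reflects the user's locale, then pass each UTF-8
Description to description_for_locale() and free() the result. A NULL
return covers both an unsupported encoding name and a character that
the locale encoding cannot represent, so a real tool would probably
want a fallback such as showing the untranslated Description.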

(Many mechanisms, like man pages and message catalogs, assume such a
"fixed encoding for one language", and we will need great effort and
cooperation with upstreams for this migration.)

The drawback of (2) is that the related software will have to
implement encoding conversion, and that encoding conversion code
sometimes lacks portability. The portability problem is that the
software has to use nl_langinfo(CODESET) and iconv(), that
iconv_open() has to accept conversion between UTF-8 and the locale
encodings, and that the names of these encodings are not
standardized. However, if we can limit portability to GNU libc
systems, this is not a problem.

Also, the cost of (2), namely that software will have to implement
encoding conversion, is a limited problem, because only a limited
number of programs handle the Description field.

(Imagine migrating man pages to UTF-8. Unlike the Description field,
there are many man pages written in non-ASCII encodings. You would
have to modify man parsers and browsers to assume UTF-8, and you would
have to convert ALL man pages to UTF-8 at the same time. Otherwise
your system would not work correctly. If you think about asking
upstream to change man pages to UTF-8, you will also have to think
about migrating the encoding of man pages all over the world,
including proprietary OSes, at the same time!)

P.S. Thanks for referring to my document. :-)

---
Tomohiro KUBOTA <[EMAIL PROTECTED]>
http://www.debian.or.jp/~kubota/
"Introduction to I18N"
http://www.debian.org/doc/manuals/intro-i18n/

