Re: [I18n]Please do not use en_US.UTF-8 outside the US

2002-05-06 Thread Juliusz Chroboczek

JS>   I had to make up ko_KR.UTF-8 different from en_US.UTF-8 to make my
JS> transition to ko_KR.UTF-8 work as I intended.

Fair point.

Of course, the long-term solution is to use font technologies that do
language-dependent and contextual font and glyph substitution.
Client- or server-side.

Juliusz

___
I18n mailing list
[EMAIL PROTECTED]
http://XFree86.Org/mailman/listinfo/i18n



Re: [I18n]Please do not use en_US.UTF-8 outside the US

2002-05-01 Thread Ienup Sung

How are you, Markus,

I would just like to point out that we never suggested that
en_US.UTF-8 is the only locale that you will ever need. On the
contrary, we've been pointing out that each region/country should use
its own Unicode locale.

Yes, it is absolutely right that globalization isn't just a matter of encoding
or coded character set; Unicode alone cannot resolve every issue, even
though it is absolutely a good thing to have a widely accepted universal
character set like Unicode.

With regards,

Ienup


] Date: Tue, 30 Apr 2002 21:32:39 +0100
] From: Markus Kuhn [EMAIL PROTECTED]
] Subject: [I18n]Please do not use en_US.UTF-8 outside the US
] To: [EMAIL PROTECTED]
] Cc: [EMAIL PROTECTED]




[I18n]Please do not use en_US.UTF-8 outside the US

2002-04-30 Thread Markus Kuhn

As we are talking about en_US.UTF-8:

General warning: Please do not use the locale name en_US.UTF-8 anywhere
outside North America. Some older Solaris documentation suggested that
this is the only UTF-8 locale you'll ever need, as locales don't change
much of significance beyond the encoding anyway. This is no longer the
case today!

An increasing number of programs of US origin are finally starting to abandon
the annoying old habit of assuming Legal paper and non-metric units as
default conventions everywhere, which required 95% of the world population to
figure out how to reconfigure to the standard conventions.

More recent software releases instead determine the default setting for
conventions such as paper format and units of measurement with code
similar to the following (feel free to copy it into your software as
well):


#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* LC_PAPER and LC_MEASUREMENT were introduced in ISO/IEC TR 14652 */

int main()
{
  char *units = "mm";
  char *paper = "A4";
  char *s;

  if (((s = getenv("LC_ALL"))         && *s) ||
      ((s = getenv("LC_PAPER"))       && *s) ||
      ((s = getenv("LANG"))           && *s))
    if (strstr(s, "_US") || strstr(s, "_CA"))
      paper = "Letter";
  if (((s = getenv("LC_ALL"))         && *s) ||
      ((s = getenv("LC_MEASUREMENT")) && *s) ||
      ((s = getenv("LANG"))           && *s))
    if (strstr(s, "_US"))
      units = "inches";

  printf("Paper: %s\nUnits: %s\n", paper, units);

  return 0;
}


This leads to portable and agreeable default settings, using the
standard values UNLESS you are in a locale that explicitly says that
you are in North America. I think that's a very good implementation
practice, but it means that if you explain to an international
audience how to activate UTF-8 locales, you should use a non-US/CA
locale. (en_GB.UTF-8 for instance seems like an excellent choice ... :)

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: http://www.cl.cam.ac.uk/~mgk25/




Re: [I18n]Please do not use en_US.UTF-8 outside the US

2002-04-30 Thread Jungshik Shin

On Tue, 30 Apr 2002, Dr Andrew C Aitchison wrote:

> On Tue, 30 Apr 2002, Markus Kuhn wrote:

> > As we are talking about en_US.UTF-8:
> >
> > General warning: Please do not use the locale name en_US.UTF-8 anywhere
> > outside North America.

> > practice, but it requires that if you explain to an international
> > audience how to activate UTF-8 locales, you should better use a non-US/
> > CA locale. (en_GB.UTF-8 for instance seems like an excellent choice ... :)

> % find xc -name "*UTF-8*" -print
> xc/nls/Compose/en_US.UTF-8.ct


> Given that en_US.UTF-8 is the only instance of a locale file with UTF-8
> in its name, how do I find the names of other locales which use UTF-8 ?

  Have you looked into the Glibc locale directory? Mandrake has a bunch
of UTF-8 locales there, I believe.  Glibc 2.2.x has been supporting
ll_CC.UTF-8 locales for a while.  If your system doesn't have them, you can
just generate whatever ll_CC.UTF-8 locales you may need with localedef.
As for XLC_LOCALE, you can always make one as I wrote in my message
yesterday. RedHat and Mandrake Linux may not have XLC_LOCALE files for
locales other than en_US.UTF-8, but some other Linux distributions
(e.g. TurboLinux) have zh_CN.UTF-8 and zh_TW.UTF-8.  BTW,
the first UTF-8 locale other than en_US.UTF-8 shipped with Solaris
- Solaris 7? - (and AIX 4.x as well) was ko_KR.UTF-8, IIRC.

<a bit off-topic>
  Now I'm almost done with switching to ko_KR.UTF-8 on my Linux box. It
works more or less fine in that I can do *more than* what I could
do under ko_KR.EUC-KR.  Still missing is Middle Korean support,
but it seems that xterm-16x can be used to *display* Middle Korean
text encoded with a sequence of U+1100 Hangul Conjoining Jamos
(http://chem.skku.ac.kr/~wkpark/screenshot/2002_04_30_221718_shot.png).
Vim 6.1 already supports up to two combining characters and Middle
Korean only needs 'two combining characters' *most of the time*. (Even
modern Korean needs more than two 'combining characters' in some
cases, though: http://jshin.net/i18n/uyeo.html). Hopefully, with a little
more tweaking in Vim 6.1 and some major enhancements in Korean
XIM (e.g. Ami), I'll be able to typeset Middle Korean with LaTeX
sooner or later.  (The LaTeX side is almost ready, too.)
</a bit off-topic>

  Jungshik Shin

