On Tue, 2003-07-01 at 23:18, Paul Eggert wrote:
> gregory mott <[EMAIL PROTECTED]> writes:
>
> > can you point me to an appropriate RTFM that ideally would layout what
> > encodings are used by what locales, or how to tell what encoding you
> > have/need, etc usw?
>
> Sorry, no; this stuff tends to be scattered around all over the place.
>
> On my Debian GNU/Linux 3.0r1 system, the file
> /usr/share/i18n/SUPPORTED lists the encodings used by locales, but
> things may be different on your system.
>
> For general info about encodings you might try www.li18nux.org and/or
> Ken Lunde's book on encodings and character sets
> <http://www.praxagora.com/lunde/cjkv-ip.html>.

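besides the SUPPORTED file mentioned above, the POSIX `locale` utility can report which encoding a given locale name resolves to. a minimal sketch, assuming `locale` is on the PATH; the helper name `charmap_of` is made up for illustration:

```shell
# charmap_of NAME: print the character encoding (charmap) that the
# locale NAME resolves to. LC_ALL overrides LANG and every LC_* variable.
# If NAME is not installed, the query typically falls back to the "C"
# locale, so the result should be checked against `locale -a`.
charmap_of() {
    LC_ALL="$1" locale charmap 2>/dev/null
}

# Encoding of the plain "C" locale on this system.
charmap_of C
```

calling `charmap_of en_AU.UTF-8` or `charmap_of g.UTF-8` and comparing the names against the output of `locale -a` may show whether a name resolves to an installed locale at all.
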
i've read things hither and yon, i remain in the dark..

when i pass textual input to sort, how does sort come to decide or infer
the encoding? you seem to say that a locale is associated with a
particular encoding. well, hmm. on rh9, the locale definitions (eg
/usr/share/i18n/locales/en_IN) appear to be in unicode. i do not see
where a locale becomes associated with any particular encoding (such as
UTF-8 or ISO-8859-15).

it seems i can "fix" the en_AU "failure" by specifying:

$ LC_CTYPE=en_AU.UTF-8 LC_COLLATE=en_AU.UTF-8 sort /tmp/sos
groan
grosr
groß
grost
red
résumé
resumed

but that approach doesn't seem to help my personal locale definition:

$ LC_CTYPE=g.UTF-8 LC_COLLATE=g.UTF-8 sort /tmp/sos
groan
grosr
grost
groß
red
resumed
résumé

i fail to understand. i've used the same stock definitions:

# ---> /usr/share/i18n/locales/g <---
# build with:
#    localedef -i g -c g
LC_CTYPE
copy "i18n"
END LC_CTYPE
LC_COLLATE
copy "iso14651_t1"
END LC_COLLATE
LC_TIME
d_fmt "<U0025><U0059><U002F><U0025><U006D><U002F><U0025><U0064>"
END LC_TIME

can you/anyone give me a clue?

_______________________________________________
Bug-coreutils mailing list
[EMAIL PROTECTED]
http://mail.gnu.org/mailman/listinfo/bug-coreutils
