Li18nux Locale Name Guideline Public Review

2002-01-21 Thread Tomohiro KUBOTA

Hi,

I found the 2nd public review of Li18nux Locale Name Guideline
has started.

http://www.hauN.org/ml/b-l-j/a/800/840.html
http://www.li18nux.org/subgroups/sa/locnameguide/index.html

The page says that comments are welcome until 14 Feb 2002.

Any additions from Li18nux insiders?

---
Tomohiro KUBOTA [EMAIL PROTECTED]
http://www.debian.or.jp/~kubota/
Introduction to I18N  http://www.debian.org/doc/manuals/intro-i18n/
--
Linux-UTF8:   i18n of Linux on all levels
Archive:  http://mail.nl.linux.org/linux-utf8/




xterm KSYM mode encoding

2002-01-21 Thread Michael B Allen

On Sun, 20 Jan 2002 14:55:17 +
Markus Kuhn [EMAIL PROTECTED] wrote:
 
 What you could do is either prefix each release with some release
 indicator symbol, or add for instance 0x20 to a Unicode character to
 turn it into a release code. Both approaches allow you to use a normal
 UTF-8 decoder at the receiver's end.

Sounds like a good idea, but unfortunately Xutf8LookupString,
XmbLookupString, and XwcLookupString are not supposed to be used with
XKeyReleasedEvents. Apparently it messes up the input context. I've
resorted to plain XLookupString and writing post-modifier KeySyms as
4-byte integers. That works for now, but I'd also like to provide
transparent support for the Linux console and PuTTY. If I'm going to
normalize on something, I should at least use Unicode to avoid table
lookups in the end-user code.

Ideas would be appreciated.

 
 There is no standard for what you want to do, as this is getting very
 far away from the classic VT100 / ISO 6429 terminal semantics. No matter
 what you do, it will be your private encoding that isn't compatible with
 anything else.
 
 Make sure that the ESC sequence that you use to activate this private
 mark/break mode is as long and obscure as possible (at least 10 bytes,
 but still within the ECMA-48 syntax for ESC sequences!), to minimize
 the chance that it is ever sent to the terminal by accident.

With a hint from Mr. Dickey, I believe the private \E[?1515h and
\E[?1515l sequences are appropriate.
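
For reference, a minimal sketch of how an application might build the
two sequences named above. The function name private_mode and the
constant KSYM_MODE are hypothetical; only the "CSI ? number h" and
"CSI ? number l" shape and the number 1515 come from this thread, and
1515 is a proposed private mode, not a standard one.

```python
ESC = "\x1b"
CSI = ESC + "["  # 7-bit form of the Control Sequence Introducer


def private_mode(number: int, enable: bool) -> str:
    """Build a DEC-style private mode set (h) or reset (l) sequence."""
    final = "h" if enable else "l"
    return f"{CSI}?{number}{final}"


KSYM_MODE = 1515  # mode number proposed in this thread (assumption, not a standard)

if __name__ == "__main__":
    # repr() keeps the escape bytes visible instead of sending them to the terminal.
    print(repr(private_mode(KSYM_MODE, True)))
    print(repr(private_mode(KSYM_MODE, False)))
```

An application would write the first string to the terminal when it
starts and the second before it exits, so the terminal is left in its
normal mode.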

 
 http://www.ecma.ch/ecma1/STAND/ECMA-048.HTM

This looks important :~)

Thanks,
Mike

-- 
May The Source be with you.




Re: Li18nux Locale Name Guideline Public Review

2002-01-21 Thread Bram Moolenaar


Tomohiro Kubota wrote:

 I found the 2nd public review of Li18nux Locale Name Guideline
 has started.
 
 http://www.hauN.org/ml/b-l-j/a/800/840.html
 http://www.li18nux.org/subgroups/sa/locnameguide/index.html
 
 The page says that comments are welcome until 14 Feb 2002.
 
 Any additions from Li18nux insiders?

Here are a few remarks from my side:


  87  All of the fields (i.e. LANGUAGE, TERRITORY, CODESET and MODIFIERS)
  88  shall be treated as case sensitive.

For users it's quite difficult to remember which field is uppercase or
lowercase.  It's easy to make a mistake.  I don't see a field where
there would be any confusion when case is ignored.  So why not ignore
case?  I know this makes it a bit more difficult for developers, but
that's a small price to pay for usability.
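
To make the suggestion concrete, here is a rough sketch of what
ignoring case could look like on the application side. It assumes the
usual LANGUAGE[_TERRITORY][.CODESET][@MODIFIERS] layout; the names
parse_locale_name and same_locale_ignoring_case are made up for this
illustration and are not part of the guideline or of any library. Only
A-Z is folded, so the comparison does not depend on the current locale.

```python
import string
from typing import NamedTuple


class LocaleName(NamedTuple):
    language: str
    territory: str
    codeset: str
    modifiers: str


# Fold A-Z only, so the result is the same whatever locale is active.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def parse_locale_name(name: str) -> LocaleName:
    """Split LANGUAGE[_TERRITORY][.CODESET][@MODIFIERS] into its fields."""
    rest, _, modifiers = name.partition("@")
    rest, _, codeset = rest.partition(".")
    language, _, territory = rest.partition("_")
    return LocaleName(language, territory, codeset, modifiers)


def _folded(name: str) -> tuple:
    return tuple(field.translate(_ASCII_LOWER) for field in parse_locale_name(name))


def same_locale_ignoring_case(a: str, b: str) -> bool:
    """Compare two locale names field by field, ignoring ASCII case."""
    return _folded(a) == _folded(b)
```

With this, "de_DE.iso-8859-1@euro" and "DE_de.ISO-8859-1@EURO" compare
equal, while "de_DE" and "de_AT" do not.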


  114  If a two-letter code is not available in ISO 3166-1 for a territory,
  115  no standard value is defined for the territory.
   
  116  In order not to conflict with future extension of ISO 3166-1,
  117  user/implementation-defined values for the TERRITORY field shall
  118  include lowercase letters or consist of more than two letters.

Why not always require a non-standard code to be three (or more) letters?
That avoids a lot of confusion and will make it easy to detect a
non-standard name.
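
The difference between the two rules is easy to see in code. This is
only a sketch: both function names are hypothetical, and the first
function is my reading of lines 116-118 of the draft quoted above.

```python
def is_user_defined_by_draft_rule(territory: str) -> bool:
    """Draft rule: contains a lowercase letter, or has more than two letters."""
    has_lowercase = any("a" <= ch <= "z" for ch in territory)
    return has_lowercase or len(territory) > 2


def is_user_defined_by_length_rule(territory: str) -> bool:
    """Suggested rule: three or more letters always means non-standard."""
    return len(territory) >= 3


if __name__ == "__main__":
    for value in ("DE", "De", "XYZ"):
        print(value,
              is_user_defined_by_draft_rule(value),
              is_user_defined_by_length_rule(value))
```

Under the draft rule a value such as "De" is a legal user-defined
territory that differs from the standard "DE" only by case. Under the
length rule it is simply not allowed as a non-standard name, and a
checker needs to look at nothing but the length.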

-- 
Tips for aliens in New York: Land anywhere.  Central Park, anywhere.
No one will care or indeed even notice.
-- Douglas Adams, The Hitchhiker's Guide to the Galaxy

 ///  Bram Moolenaar -- [EMAIL PROTECTED] -- http://www.moolenaar.net  \\\
(((   Creator of Vim -- http://vim.sf.net -- ftp://ftp.vim.org/pub/vim   )))
 \\\  Help me helping AIDS orphans in Uganda - http://iccf-holland.org  ///




Re: xterm KSYM mode encoding

2002-01-21 Thread Thomas E. Dickey

On Mon, 21 Jan 2002, Michael B Allen wrote:

 With a hint from Mr. Dickey, I believe the private \E[?1515h and
 \E[?1515l sequences are appropriate.

the pointer which Paul Williams gave (for key position mode) is probably
more promising.

-- 
T.E.Dickey [EMAIL PROTECTED]
http://invisible-island.net
ftp://invisible-island.net





Re: xterm KSYM mode encoding

2002-01-21 Thread Paul Williams

Thomas E. Dickey wrote:
 
 On Mon, 21 Jan 2002, Michael B Allen wrote:
 
  With a hint from Mr. Dickey, I believe the private \E[?1515h
  and \E[?1515l sequences are appropriate.
 
 the pointer which Paul Williams gave (for key position mode) is
 probably more promising.

Heh heh, I'll have to correct myself now you've mentioned it!
Unfortunately the VT500-series Programmer References have a typo, which
I blindly copied. The sequences are CSI ? 81 h and CSI ? 81 l, not CSI
81 h, etc.

However, they should not be used to turn on a mode other than KPM. Any
custom UTF-8 trickery should use a new mode code.

I'd like to hear more from the original poster about the requirement for
this encoding, and why it would need to be transparent to Linux console
and PuTTY. Without wishing to be rude, I think the problem should be
defined before the solution.

- Paul




Re: xterm KSYM mode encoding

2002-01-21 Thread Thomas E. Dickey

On Mon, 21 Jan 2002, Paul Williams wrote:

 Thomas E. Dickey wrote:
 
  On Mon, 21 Jan 2002, Michael B Allen wrote:
 
   With a hint from Mr. Dickey, I believe the private \E[?1515h
   and \E[?1515l sequences are appropriate.
 
  the pointer which Paul Williams gave (for key position mode) is
  probably more promising.

 Heh heh, I'll have to correct myself now you've mentioned it!
 Unfortunately the VT500-series Programmer References have a typo, which
 I blindly copied. The sequences are CSI ? 81 h and CSI ? 81 l, not CSI
 81 h, etc.

good (I was a little surprised to see it as a non-private mode, but wasn't
at a good point to research it - I have it as a private mode in vttest:
not the first typo in DEC's manuals).

 However, they should not be used to turn on a mode other than KPM. Any
 custom UTF-8 trickery should use a new mode code.

From what I understand of the request, KPM would be adequate, and UTF-8 is
just brought in because he's still learning what is involved.  Ultimately,
since scan-codes are not really portable in the sense that's requested,
I'm not sure how useful any of this is.

-- 
T.E.Dickey [EMAIL PROTECTED]
http://invisible-island.net
ftp://invisible-island.net





Re: [I18n]Re: Li18nux Locale Name Guideline Public Review

2002-01-21 Thread Edmund GRIMLEY EVANS

   setenv LANG de_DE.iso-8859-1@euro
   setenv LANG DE_de.ISO-8859-1@euro
   setenv LANG de_DE.Iso-8859-1@EURO
 
 Do you think an average user can guess which one of these he has to
 type?  No GUI available!

If the average user is having to choose between those 3 possibilities,
then presumably those 3 possibilities were presented by some program
or included in some list. That program, or that list, should be
modified to only give valid possibilities.

Edmund




Re: [I18n]Re: Li18nux Locale Name Guideline Public Review

2002-01-21 Thread Henry Spencer

On Mon, 21 Jan 2002, Bram Moolenaar wrote:
   setenv LANG de_DE.iso-8859-1@euro
   setenv LANG DE_de.ISO-8859-1@euro
   setenv LANG de_DE.Iso-8859-1@EURO
 Do you think an average user can guess which one of these he has to
 type?  No GUI available!

If the user has to *type* one of those, the system is broken.

You can write a menu system, which will list the legitimate choices and
ask him which one he wants, in twenty lines of shell script.  It will run
on any ASCII terminal.  There is no need to have X and Tcl/Tk to write
interfaces that are more novice-friendly than setenv.
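
A sketch of such a menu, written in Python here rather than shell. The
function name choose_locale is hypothetical, and the list of choices is
assumed to come from somewhere that knows which locales are installed.
It reads from and writes to plain text streams, so it needs nothing
more than an ASCII terminal.

```python
import sys
from typing import Sequence, TextIO


def choose_locale(choices: Sequence[str],
                  inp: TextIO = sys.stdin,
                  out: TextIO = sys.stdout) -> str:
    """Print a numbered menu and return the entry the user picks."""
    while True:
        for number, name in enumerate(choices, start=1):
            print(f"{number:3d}) {name}", file=out)
        print("Which locale do you want? ", end="", file=out, flush=True)
        line = inp.readline()
        if not line:  # end of input: nothing was chosen
            raise EOFError("no selection made")
        answer = line.strip()
        if answer.isascii() and answer.isdigit() and 1 <= int(answer) <= len(choices):
            return choices[int(answer) - 1]
        print("Please type one of the numbers shown.", file=out)


if __name__ == "__main__":
    # Example list only; a real tool would list the installed locales.
    print(choose_locale(["de_DE.UTF-8", "fr_FR.UTF-8"]))
```

The user never types a locale name, so the case of the name cannot be
mistyped. A child process cannot change its parent's environment, so
the calling shell would still have to assign the printed name to LANG.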

  Henry Spencer
   [EMAIL PROTECTED]





Re: [I18n]Re: Li18nux Locale Name Guideline Public Review

2002-01-21 Thread Bram Moolenaar


Henry Spencer wrote:

 On Mon, 21 Jan 2002, Bram Moolenaar wrote:
  setenv LANG de_DE.iso-8859-1@euro
  setenv LANG DE_de.ISO-8859-1@euro
  setenv LANG de_DE.Iso-8859-1@EURO
  Do you think an average user can guess which one of these he has to
  type?  No GUI available!
 
 If the user has to *type* one of those, the system is broken.
 
 You can write a menu system, which will list the legitimate choices and
 ask him which one he wants, in twenty lines of shell script.  It will run
 on any ASCII terminal.  There is no need to have X and Tcl/Tk to write
 interfaces that are more novice-friendly than setenv.

Nice idea.  However, that this menu system shell script still has to be
made is enough proof that, in practice, people will use setenv.  I often
use env LANG=locale gvim arguments to test message translations.

-- 
hundred-and-one symptoms of being an internet addict:
33. You name your children Eudora, Mozilla and Dotcom.

 ///  Bram Moolenaar -- [EMAIL PROTECTED] -- http://www.moolenaar.net  \\\
(((   Creator of Vim -- http://vim.sf.net -- ftp://ftp.vim.org/pub/vim   )))
 \\\  Help me helping AIDS orphans in Uganda - http://iccf-holland.org  ///




Re: [I18n]Re: Li18nux Locale Name Guideline Public Review

2002-01-21 Thread Henry Spencer

On Mon, 21 Jan 2002, Bram Moolenaar wrote:
  You can write a menu system, which will list the legitimate choices and
  ask him which one he wants, in twenty lines of shell script...
 
 Nice idea.  However, that this menu system shell script still has to be
 made is enough proof that, in practice, people will use setenv...

For small values of people. :-)  Only the experts will.  The experts
presumably can get the case of a locale name right.

  Henry Spencer
   [EMAIL PROTECTED]





Re: [I18n]Re: Li18nux Locale Name Guideline Public Review

2002-01-21 Thread Alastair McKinstry

On Mon, 2002-01-21 at 15:31, Bram Moolenaar wrote:
 
 Henry Spencer wrote:
 
  On Mon, 21 Jan 2002, Bram Moolenaar wrote:
 setenv LANG de_DE.iso-8859-1@euro
 setenv LANG DE_de.ISO-8859-1@euro
 setenv LANG de_DE.Iso-8859-1@EURO
   Do you think an average user can guess which one of these he has to
   type?  No GUI available!
  
  If the user has to *type* one of those, the system is broken.
  
  You can write a menu system, which will list the legitimate choices and
  ask him which one he wants, in twenty lines of shell script.  It will run
  on any ASCII terminal.  There is no need to have X and Tcl/Tk to write
  interfaces that are more novice-friendly than setenv.
 
 Nice idea.  However, that this menu system shell script still has to be
 made is enough proof that, in practice, people will use setenv.  I often
 use env LANG=locale gvim arguments to test message translations.
 
Try locale -a. It'll give you a list of valid locales and aliases.
It reads its contents from /etc/locale.aliases.
Over on debian-devel, I've been proposing changing the locale-gen
command in Debian (and similar tools in other distributions...) to
automagically edit locale.aliases so that it includes only aliases to
locales that are present on the system (e.g. in /usr/lib/locale).
Then locale -a would give you


de_DE.UTF-8@euro
german
fr_FR.UTF-8@euro
french

...
env LANG=french gvim args
would work.
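
A rough sketch of the pruning step, to show how little is involved. It
assumes an alias file with one "alias locale" pair per line and "#"
comments; the function names and the sample data are made up for the
example and are not taken from locale-gen.

```python
from typing import Dict, Set


def parse_aliases(text: str) -> Dict[str, str]:
    """Read 'alias locale' pairs, skipping blank lines and # comments."""
    aliases: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        if len(parts) == 2:
            alias, target = parts
            aliases[alias] = target
    return aliases


def prune_aliases(aliases: Dict[str, str], installed: Set[str]) -> Dict[str, str]:
    """Keep only the aliases that point at a locale present on the system."""
    return {alias: target for alias, target in aliases.items() if target in installed}


if __name__ == "__main__":
    sample = """
    # alias    locale
    german     de_DE.UTF-8@euro
    french     fr_FR.UTF-8@euro
    japanese   ja_JP.eucJP
    """
    installed = {"de_DE.UTF-8@euro", "fr_FR.UTF-8@euro"}
    for alias, target in sorted(prune_aliases(parse_aliases(sample), installed).items()):
        print(alias, target)
```

In the sample, "japanese" is dropped because its target locale is not
installed, so every alias that remains is safe to offer to the user.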

In principle I agree, though: keep it case sensitive. Work should be
aimed at making the GUI simple to use, and the CLI consistent and simple.



--
Alastair McKinstry [EMAIL PROTECTED]





Re: [I18n]Re: Li18nux Locale Name Guideline Public Review

2002-01-21 Thread Pablo Saratxaga

Kaixo!

On Mon, Jan 21, 2002 at 02:47:48PM +0100, Bram Moolenaar wrote:
 
  Usability is provided by GUI selection tools, not by softening syntax
  specs. The case sensitivity makes a lot of sense as ISO's language and

   setenv LANG de_DE.iso-8859-1@euro
   setenv LANG DE_de.ISO-8859-1@euro
   setenv LANG de_DE.Iso-8859-1@EURO
 
 Do you think an average user can guess which one of these he has to
 type?  No GUI available!

To use the command line you must be able to read the documentation and
copy correctly what it says; you are also expected to know that the
command line is case sensitive.

 The underscore is sufficient to separate the language and region.
 Upper/lower case doesn't really help me anyway, it's only an extra thing
 to know.

But it's also the way things have been done since forever (or at least
as far back as I can look); why break it and introduce compatibility
problems?

 If we can agree on case insensitivity, then case differences are not
 aliases.  You can type them any way you like and they would still be the
 same locale.

Note that case insensitivity is locale dependent; by introducing case
insensitivity you may get some very strange behaviours, such as
the locale being recognized when you first set it (because you were in
another locale previously), and then, once the change is done, you start
getting errors (because the new locale defines new case-insensitivity
rules, and the string that was previously considered the same is no
longer the same).
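
The best-known example is Turkish, where the lowercase of "I" is the
dotless "ı". The sketch below does not switch locales; the second
translation table is a hand-written stand-in for Turkish-style
lowercasing, and all names in it are hypothetical.

```python
import string

# Plain ASCII folding, as done in the C locale.
ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)

# Stand-in for Turkish-style rules: identical, except 'I' becomes dotless U+0131.
TURKISH_STYLE_LOWER = str.maketrans(
    string.ascii_uppercase,
    string.ascii_lowercase.replace("i", "\u0131"),
)


def matches(requested: str, canonical: str, table: dict) -> bool:
    """Case-insensitive comparison under the given lowercasing table."""
    return requested.translate(table) == canonical.translate(table)


if __name__ == "__main__":
    print(matches("FI_FI", "fi_FI", ASCII_LOWER))          # same name under ASCII rules
    print(matches("FI_FI", "fi_FI", TURKISH_STYLE_LOWER))  # no longer the same name
```

The same pair of strings compares equal under one set of rules and
unequal under the other, which is the behaviour described above.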

Also, remember that the filesystem is still case sensitive, which means
that if you introduce case insensitivity for locale names in variables,
you need to change the sources of the libc and other sensitive libraries
so that the files where locale data is stored are found even when the
user requests a name different from that of the actual directories...
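
As an illustration of the extra work, every lookup would have to scan
the directory instead of opening the path directly. This is only a
sketch; find_locale_dir is a hypothetical helper, not something libc
provides.

```python
import os
import string
from typing import Optional

_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def find_locale_dir(root: str, requested: str) -> Optional[str]:
    """Return the directory under root whose name matches, ignoring ASCII case."""
    wanted = requested.translate(_ASCII_LOWER)
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        if entry.translate(_ASCII_LOWER) == wanted and os.path.isdir(path):
            return path
    return None  # nothing installed under that name
```

A direct open of root/requested costs one system call; this version has
to read the whole directory, and it still has to decide what to do when
two entries differ only by case.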

The supposed benefits are too small compared to the troubles.

-- 
Ki ça vos våye bén,
Pablo Saratxaga

http://www.srtxg.easynet.be/    PGP Key available, key ID: 0x8F0E4975





Re: [I18n]Re: Li18nux Locale Name Guideline Public Review

2002-01-21 Thread Pablo Saratxaga

Kaixo!

On Mon, Jan 21, 2002 at 10:05:23AM -0500, Henry Spencer wrote:
 On Mon, 21 Jan 2002, Bram Moolenaar wrote:
  setenv LANG de_DE.iso-8859-1@euro
  setenv LANG DE_de.ISO-8859-1@euro
  setenv LANG de_DE.Iso-8859-1@EURO
  Do you think an average user can guess which one of these he has to
  type?  No GUI available!
 
 If the user has to *type* one of those, the system is broken.

Well, indeed! None of the three above makes any sense, as there is no
way to have the euro sign in iso-8859-1 :)
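
This is easy to check with any codec library. A small sketch using
Python's built-in codecs (the helper name can_encode is made up for the
example):

```python
EURO = "\u20ac"  # EURO SIGN


def can_encode(text: str, codeset: str) -> bool:
    """Report whether every character of text exists in the given codeset."""
    try:
        text.encode(codeset)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    for codeset in ("iso-8859-1", "iso-8859-15", "utf-8"):
        print(codeset, can_encode(EURO, codeset))
```

ISO-8859-1 has no euro sign at all; ISO-8859-15 puts it at 0xA4, the
position of the currency sign in ISO-8859-1.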

-- 
Ki ça vos våye bén,
Pablo Saratxaga

http://www.srtxg.easynet.be/    PGP Key available, key ID: 0x8F0E4975





Re: Li18nux Locale Name Guideline Public Review

2002-01-21 Thread David Starner

On Mon, Jan 21, 2002 at 07:18:09PM +0900, Tomohiro KUBOTA wrote:
 Hi,
 
 I found the 2nd public review of Li18nux Locale Name Guideline
 has started.
 
 http://www.hauN.org/ml/b-l-j/a/800/840.html
 http://www.li18nux.org/subgroups/sa/locnameguide/index.html
 
 The page says that comments are welcome until 14 Feb 2002.

Starting from the top:

This is Linux, not proprietary Unixes. What is the point of not
standardizing on the IANA names, especially when you don't standardize
on the Unix ones, either? Especially - TCA-Big5 and TCA-BIG5-HKSCS.
Lovely names there.

It doesn't seem that the authors were very familiar with the IANA names,
as BIG5-HKSCS, ISO-8859-13, ISO-8859-15 and TIS-620 are registered. 

Why all the IBM code pages? glibc currently supports two - 1251 (be_BY,
bg_BG) and 1255 (yi_US). Is there anyone who really needs all the
others? They are a step backward on Unix.

They're missing a bunch of charsets currently in glibc's supported list.

glibc does not and will not support VISCII, as it puts graphic
characters in the ASCII range. And I'm sure Ulrich Drepper will bite
your head off for even asking. 

As a final note - why does this exist? Linux has a locale standard, in
the same way that Perl has a standard - it's called glibc. If you feel
compelled to write a formal standard, you have to write one that defines
what the standard implementation does. 

-- 
David Starner - [EMAIL PROTECTED], dvdeug/jabber.com (Jabber)
Pointless website: http://dvdeug.dhis.org
When the aliens come, when the deathrays hum, when the bombers bomb,
we'll still be freakin' friends. - Freakin' Friends




Re: [I18n]Li18nux Locale Name Guideline Public Review

2002-01-21 Thread Tomohiro KUBOTA

Hi,

At Mon, 21 Jan 2002 19:18:09 +0900,
Tomohiro KUBOTA wrote:

 I found the 2nd public review of Li18nux Locale Name Guideline
 has started.
 
 http://www.hauN.org/ml/b-l-j/a/800/840.html
 http://www.li18nux.org/subgroups/sa/locnameguide/index.html

One important note.  I am not a member of Li18nux.  Thus,
people who have opinions should send them to Li18nux.  The
above web page explains how to comment.

---
Tomohiro KUBOTA [EMAIL PROTECTED]
http://www.debian.or.jp/~kubota/
Introduction to I18N  http://www.debian.org/doc/manuals/intro-i18n/




Re: xterm KSYM mode encoding

2002-01-21 Thread Michael B Allen

On Mon, 21 Jan 2002 12:19:09 +
Paul Williams [EMAIL PROTECTED] wrote:
 
 I'd like to hear more from the original poster about the requirement for
 this encoding, and why it would need to be transparent to Linux console
 and PuTTY. Without wishing to be rude, I think the problem should be
 defined before the solution.

I'm writing a text mode application framework called Terminal MVC that
reads an XML screen definition to get a DOM tree (the Model), around which
it renders Views much like the Java AWT or the BView class from BeOS
but with box and line drawing characters, for which the root viewport
is the terminal window (the View), and uses non-mnemonic 10-key easy
navigation based around Enter, Esc, and cursor keys (the Controller). If
a C module is specified in a frame tag, it is dlopen'd and initialized,
passing it the DOM tree onto which event handlers can be registered. My
intention is that it be the easiest way to write a UI for a C program.

The model and recursive composition of frames and basic text are stable
enough for me to move on to the Controller. I want to be able to use
the Print Screen key to render the current UI on the PostScript display
device and send it to the printer. For the text component I want to be able
to hold the shift key down while I hit the up arrow key to select entire
lines of text. To do this I must be able to detect key releases.

Terminal MVC will be most useful for Linux configuration applications
running on servers, or POS-like applications running remotely from
cheap PCs. Servers don't have X Windows, so I want this to work on the
Linux console. Admins like administering their boxes from PuTTY on their
Windows workstations too.

As for the requirement of the encoding, I do not want to encode keycodes
or scancodes as Thomas thought. The original 5 minute hack to xterm that
I sent him did this. I want to encode the UCS codes corresponding to the
KeySym of the keycode. In other words, I want the X server to do as much
work for me as possible because it knows all about the user's keyboard,
custom mappings, etc. So I just need to convert the KeySym to Unicode.
Xutf8LookupString is the way you normally do this, but it doesn't work with
KeyReleasedEvents for some reason. Control keys will require additional
representation. All of these UCS key syms will be augmented with an
extra bit of information to indicate that the key associated with the
UCS code was released as opposed to pressed. I do not believe I need to
communicate modifiers, although it might be nice if there's space left
over.

I have tried without success to find information on these DEC
terminals Paul speaks of, but the client X server on which the end user
program is being displayed has already worked out the keyboard layout and
other platform-specific issues, so I don't think this is the direction I
want to take anyway. The same goes for PuTTY: it knows it's operating in
a PC environment, so we can just convert to UCS codes there. The point
is that the clients handle all the portability issues. The end user program
can then just use the UCS codes and a few constants for control keys.

So ideally, I want to take a keycode and convert it to a KeySym with
modifiers applied and then send UCS codes with the high bit on if it's a
release. Of course I don't know for sure if that will work but I wouldn't
be asking these fandangled questions here if I did ;-)
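
To make the idea concrete, here is a sketch of that wire format (it is
not the attached ksym.c). It assumes one 32-bit big-endian word per
event with bit 31 as the release flag; the byte order, the constant and
the function names are assumptions made for this example.

```python
import struct
from typing import Tuple

RELEASE_BIT = 0x80000000  # high bit of the 32-bit word (assumed layout)


def encode_key_event(ucs: int, released: bool) -> bytes:
    """Pack a UCS code point and a press/release flag into 4 bytes."""
    if not 0 <= ucs <= 0x10FFFF:
        raise ValueError("not a UCS code point")
    value = ucs | RELEASE_BIT if released else ucs
    return struct.pack(">I", value)


def decode_key_event(data: bytes) -> Tuple[int, bool]:
    """Return (ucs, released) from a 4-byte event."""
    (value,) = struct.unpack(">I", data)
    return value & 0x7FFFFFFF, bool(value & RELEASE_BIT)


if __name__ == "__main__":
    for released in (False, True):
        event = encode_key_event(ord("A"), released)
        print(event.hex(), decode_key_event(event))
```

The receiving program needs no KeySym tables, only the mask. The
trade-off is that the stream is no longer valid UTF-8, which is what
the prefix-symbol suggestion earlier in the thread avoids.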

xterm-165-ksym2.patch and sample ksym.c program attached.

Thanks,
Mike

-- 
May The Source be with you.



xterm-165-ksym2.patch
Description: Binary data


ksym.c
Description: Binary data


Re: Li18nux Locale Name Guideline Public Review

2002-01-21 Thread David Starner

On Tue, Jan 22, 2002 at 12:49:56PM +0900, Stephen J. Turnbull wrote:
 However, it's important to remember that a bad standard is better than
 no standard.  It is extremely difficult to change a bad standard, it
 is true.  But it's even harder to change no standard, and in the
 meantime users suffer much more.

I'm not sure I agree. A lot of programming languages and a lot of
systems have done well without a formal standard - Perl, Python, Fortran
prior to 1966. But a bad standard, that's hard to implement or is
painful to use, will drive away users and implementers, and discourage
the creation of a new standard.
 
 Telling the
 relevant Li18nux/LSB working group Debian has looked at the Li18nux
 proposal.  However, we intend to {use the IANA names, not impose
 unstandardized names, deprecate IBM code pages to compatibility
 packages} for these reasons:  would be great.  The Debian name
 commands a fair amount of respect because of Debian's continuing
 commitment to standards, both international and internal.

I can't honestly say I speak for Debian. I don't think anybody can
honestly say they speak for Debian on this, besides maybe Ben Collins
(libc maintainer). It's the whole herd of cats thing.
 
  David == David Starner [EMAIL PROTECTED] writes:
 David Why all the IBM code pages? glibc currently supports two -
 David 1251 (be_BY, bg_BG) and 1255 (yi_US).
 
 What do you mean by support?  For code pages, I would say iconv is
 the relevant functionality.  

I have no argument with iconv supporting any charset in use. But we're
talking about locale charsets, the charsets that every program can be
expected to handle, the master charsets for a user. Users should be able
to expect that they can send a file from one Linux box to another in the
same locale without having to recode it. While this isn't universally
true, adding charsets that aren't better than ones already in use
doesn't help anything. Furthermore, if possible, a charset should leave
C1 free of graphic characters, like ISO-8859-1 and EUC-JP do, and
UTF-8 does in a ham-handed way, and must leave C0 free of graphic
characters.
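
For single-byte charsets the C1 criterion can be checked mechanically.
A sketch using Python's built-in codecs; the function name is made up
for this example, and the check says nothing useful about multi-byte
charsets such as EUC-JP or UTF-8.

```python
import unicodedata
from typing import List


def graphic_bytes_in_c1(codeset: str) -> List[int]:
    """List the byte values 0x80-0x9F that decode to non-control characters."""
    found = []
    for value in range(0x80, 0xA0):
        try:
            char = bytes([value]).decode(codeset)
        except UnicodeDecodeError:
            continue  # byte is unassigned in this codeset
        if unicodedata.category(char) != "Cc":
            found.append(value)
    return found


if __name__ == "__main__":
    for codeset in ("iso-8859-1", "cp1251"):
        print(codeset, [hex(b) for b in graphic_bytes_in_c1(codeset)])
```

ISO-8859-1 yields an empty list, while CP1251 reports most of the
0x80-0x9F range as letters and punctuation.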

What I mean by support is that it is included in the list of tested and
supported locales (/usr/share/doc/locales/SUPPORTED.gz on my system) -
attached to the bottom of this message.

 David As a final note - why does this exist? Linux has a locale
 David standard, in the same way that Perl has a standard
 
 Aka, why I use Python. :-)

Does Python have a formal standard? It would surprise me.
 
 David If you feel compelled to write a formal standard, you have
 David to write one that defines what the standard implementation
 David does.
 
 Note what taking that to extremes implies: forget POSIX, which
 doesn't describe any real OS.  

Large parts of POSIX are directly based on existing implementations.
Also, POSIX needed implementing; there were many diverging Unixes.
There's one locale implementation used on Linux - glibc's.

 While that's mostly a joke, there's something important here.  And
 that is that if we stick consistently to the specify, then implement
 approach, we end up with something workable not so far from where we
 actually are.  

This sounds an awful lot like creationism.

- Jargon File (4.3.0, 30 APR 2001) [jargon]:

creationism n. The (false) belief that large, innovative software
designs can be completely specified in advance and then painlessly
magicked out of the void by the normal efforts of a team of normally
talented programmers. In fact, experience has shown repeatedly that good
designs arise only from evolutionary, exploratory interaction between
one (or at most a small handful of) exceptionally able designer(s) and
an active user population -- and that the first try at a big new idea is
always wrong. Unfortunately, because these truths don't fit the planning
models beloved of {management}, they are generally ignored.
  
We've had the evolutionary, exploratory interaction, and now, for the
most part, glibc supports the locales and charsets people need.

 I'm not recommending knuckling under to Emerson's hobgoblin, but I
 hope Debian will lean toward specifying desiderata (== standards)
 independently of current implementations, rather than falling into the
 trap of making the standards overly dependent on the implementations.

Why haven't you standardized Emacs yet? What would you do with an Emacs
standard that ignored many of the good points of recent Emacsen? IMO, we
have a poorly thought-out standard, in an area without multiple
implementations and hence without the need for a standard.

af_ZA ISO-8859-1
ar_AE ISO-8859-6
ar_BH ISO-8859-6
ar_DZ ISO-8859-6
ar_EG ISO-8859-6
ar_IN UTF-8
ar_IQ ISO-8859-6
ar_JO ISO-8859-6
ar_KW ISO-8859-6
ar_LB ISO-8859-6
ar_LY ISO-8859-6
ar_MA ISO-8859-6
ar_OM ISO-8859-6
ar_QA ISO-8859-6
ar_SA ISO-8859-6
ar_SD ISO-8859-6
ar_SY ISO-8859-6
ar_TN ISO-8859-6
ar_YE ISO-8859-6
be_BY CP1251
bg_BG CP1251
br_FR ISO-8859-1
bs_BA ISO-8859-2
ca_ES