Re: Status of UTF-8 Debian changelogs

2003-06-09 Thread Henrique de Moraes Holschuh
On Sun, 08 Jun 2003, Wouter Verhelst wrote:
 [EMAIL PROTECTED]:~$ echo $LANG
 nl_BE.UTF-8

Is it in locale.gen? Otherwise, you will NOT have the locale information...
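
One way to answer that question is to look for an uncommented entry in
the locale.gen file. A minimal sketch, assuming the usual Debian location
/etc/locale.gen and its "LOCALE CHARSET" line format; the helper name
locale_enabled is made up for illustration:

```shell
# locale_enabled FILE LOCALE
# Succeeds if FILE (in locale.gen format) has an uncommented line whose
# first field is exactly LOCALE, e.g. "nl_BE.UTF-8 UTF-8".
# Hypothetical helper for illustration; not part of any Debian package.
locale_enabled() {
    awk -v want="$2" '
        /^[[:space:]]*#/ { next }          # skip commented-out entries
        $1 == want       { found = 1 }
        END              { exit (found ? 0 : 1) }
    ' "$1"
}

# Example (the path is the usual Debian default, assumed here):
#   locale_enabled /etc/locale.gen nl_BE.UTF-8 && echo "enabled"
```

Running 'locale -a' additionally lists the locales that have actually
been generated on the system.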

 which means that uxterm manually ensures that $LANG is set to
 something.UTF-8, since I set my $LANG to nl_BE.

Ick.

-- 
  One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie. -- The Silicon Valley Tarot
  Henrique Holschuh



Re: Status of UTF-8 Debian changelogs

2003-06-09 Thread Colin Walters
On Mon, 2003-06-09 at 12:05, Henrique de Moraes Holschuh wrote:
 On Sun, 08 Jun 2003, Wouter Verhelst wrote:
  [EMAIL PROTECTED]:~$ echo $LANG
  nl_BE.UTF-8
 
 Is it in locale.gen? Otherwise, you will NOT have the locale information...

Ah, good call.  We should have that in the default locale.gen.



Re: Status of UTF-8 Debian changelogs

2003-06-09 Thread Henrique de Moraes Holschuh
Hi Colin!

On Mon, 09 Jun 2003, Colin Walters wrote:

 On Mon, 2003-06-09 at 12:05, Henrique de Moraes Holschuh wrote:
  On Sun, 08 Jun 2003, Wouter Verhelst wrote:
   [EMAIL PROTECTED]:~$ echo $LANG
   nl_BE.UTF-8
  
  Is it in locale.gen? Otherwise, you will NOT have the locale information...
 
 Ah, good call.  We should have that in the default locale.gen.

You'd need to add UTF-8 locales for every locale, then. And they're often
unsupported. I know for a fact pt_BR.UTF-8 is unsupported (even if locale-gen
claims it managed to generate it).

-- 
  One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie. -- The Silicon Valley Tarot
  Henrique Holschuh



Re: Status of UTF-8 Debian changelogs

2003-06-07 Thread Dmitry Borodaenko
On Thu, Jun 05, 2003 at 08:57:06PM -0400, Colin Walters wrote:
 JR the only thing that will change is that if someone complains at
 JR people who use UTF-8 in changelogs, a new retort will be
 JR available, THE POLICY MADE ME DO IT!!1!, or similar.
 CW Why would someone complain?

I would complain.

I am using a KOI8-R terminal, which cannot display Latin-1 characters, and
it seems backward to me to mandate or even allow _usage_ of UTF-8 ahead
of getting it _supported_ across the system. I'd rather have 7-bit ASCII
changelogs: why are Latin-1 users privileged to use the native spelling of
their names, while Cyrillic, Kanji, and other users have to resort to
transliteration?

-- 
Dmitry Borodaenko



Re: Status of UTF-8 Debian changelogs

2003-06-07 Thread Colin Watson
On Sat, Jun 07, 2003 at 04:59:29PM +0300, Dmitry Borodaenko wrote:
 On Thu, Jun 05, 2003 at 08:57:06PM -0400, Colin Walters wrote:
  JR the only thing that will change is that if someone complains at
  JR people who use UTF-8 in changelogs, a new retort will be
  JR available, THE POLICY MADE ME DO IT!!1!, or similar.
  CW Why would someone complain?
 
 I would complain.
 
 I am using KOI8-R terminal which can not display Latin-1 characters,

Where did Latin-1 come into this?

 and it seems backward to me to mandate or even allow _usage_ of UTF-8
 ahead of getting it _supported_ across the system.

If you find yourself with a UTF-8 file, use a program which knows how to
recode on the fly to your native encoding. Such programs are
increasingly common.

What do you lose here? Those who have fonts that can display the
character in question will be able to do so; those who don't won't, but
will see some reasonably obvious indicator like a ? or a filled-in
square to show that the character is one they can't display. This is
superior to the situation where those who don't have such fonts just see
some gibberish.

 I'd rather have 7-bit ASCII changelogs: why Latin-1 users are
 privileged to use native spelling of their names, while Cyrillic and
 Kanji and other users have to resort to transliteration?

They aren't so privileged. They may decide to do it anyway, but since
the encoding of changelogs is not yet specified you currently take pot
luck on anything outside 7-bit ASCII.

I believe you've just contradicted yourself, anyway. Nobody wants to
have to transliterate their name. I don't want to have to transliterate
the names of people who help me with my packages when I credit them in
the changelog; in some cases I may not even know how to transliterate
their names correctly. UTF-8 allows me to spell their names correctly.
At worst, a couple of characters may not be displayed properly for
people using legacy encodings who don't have software that can recode
for them, but if I'd artificially transliterated to 7-bit ASCII then
nobody would get to see the correct spellings anyway.

Since UTF-8 includes ASCII, all the technical content of my changelogs
will still appear normally no matter what locale you're using, but
suddenly it becomes possible for me to credit my contributors properly
regardless of whether they come from Spain, Russia, or Japan.

We're not talking about mandating the use of UTF-8 across the whole
system here. We're talking about recommending its use in one particular
case where it gives a small but real benefit, and where the consequences
of getting it wrong are not very important (we can always go back and
recode a few changelogs if some unforeseen badness results). Think of it
as a safe experiment in advance of wider deployment of UTF-8 later on.

Package maintainers who aren't set up for writing UTF-8 can always
resort to transliteration into ASCII if need be.

-- 
Colin Watson  [EMAIL PROTECTED]



Re: Status of UTF-8 Debian changelogs

2003-06-07 Thread Colin Walters
On Sat, 2003-06-07 at 09:59, Dmitry Borodaenko wrote:

 I am using KOI8-R terminal which can not display Latin-1 characters, and
 it seems backward to me to mandate or even allow _usage_ of UTF-8 ahead
 of getting it _supported_ across the system. 

A growing amount of software in Debian has UTF-8 support.  I have been
using a fully UTF-8 locale for some time.  At least gnome-terminal has
excellent support for UTF-8; xterm has support too if you invoke it as
'uxterm'.  So I think the support is already here.

 I'd rather have 7-bit ASCII
 changelogs: why Latin-1 users are privileged to use native spelling of
 their names, while Cyrillic and Kanji and other users have to resort to
 transliteration?

I think after we switch to UTF-8, there's no reason why you should.



Re: Status of UTF-8 Debian changelogs

2003-06-07 Thread Dmitry Borodaenko
On Sat, Jun 07, 2003 at 04:21:33PM +0100, Colin Watson wrote:
 DB I am using KOI8-R terminal which can not display Latin-1
 DB characters,
 CW Where did Latin-1 come into this?

I said characters, not encoding, and I mean that KOI8-R character set
does not include characters from Latin-1. Therefore, these characters
need to be replaced with '?', as you point out below.

 CW What do you lose here? Those who have fonts that can display the
 CW character in question will be able to do so; those who don't won't,
 CW but will see some reasonably obvious indicator like a ? or a
 CW filled-in square to show that the character is one they can't
 CW display. This is superior to the situation where those who don't
 CW have such fonts just see some gibberish.
 ...

I don't see it as a proper credit to your contributors if their name
appears as 'J?rg?n' (or even '' in case of Kanji) on my display.
Were it transliterated, I would at least be able to pronounce it (and
there are standard rules for such transliteration anyway (I even think
iconv should have an option to do lossy transliteration for characters
outside of target character set)).
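
For what it is worth, the iconv shipped with GNU libc does offer this:
appending //TRANSLIT to the target charset asks it to approximate
characters the target cannot represent. A small sketch; the function name
to_ascii is illustrative, and the replacement chosen (or a '?' when none
is found) depends on the iconv implementation and the active locale, so
no output is shown:

```shell
# to_ascii: read UTF-8 on stdin, write a lossy 7-bit ASCII approximation.
# Relies on the //TRANSLIT suffix supported by GNU iconv; other iconv
# implementations may reject it or behave differently.
to_ascii() {
    iconv -f UTF-8 -t ASCII//TRANSLIT
}

# Example: "Jürgen" given as octal escapes so this snippet stays ASCII.
#   printf 'J\303\274rgen\n' | to_ascii
```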

 DB I'd rather have 7-bit ASCII changelogs: why Latin-1 users are
 DB privileged to use native spelling of their names, while Cyrillic
 DB and Kanji and other users have to resort to transliteration?
 CW They aren't so privileged. They may decide to do it anyway, but
 CW since the encoding of changelogs is not yet specified you currently
 CW take pot luck on anything outside 7-bit ASCII.

What I objected to is that they may: I'd rather they may not. I'd rather
encoding of changelogs was specified to be 7-bit ASCII.

 CW I believe you've just contradicted yourself, anyway. Nobody wants
 CW to have to transliterate their name.

Excuse me for ad hominem, but how many foreign languages do you speak?
The reason I'm asking is that my observation is that people from
countries with completely non-ASCII writing system (as opposed to
European Latin-based languages) almost always do transliterate their
names when they communicate with someone speaking a different language.
Do you observe a different pattern?

You see, it is not only a technical issue, it is a communication issue.
If you can't read Cyrillic, native spelling of my name wouldn't help you
to read it, even if it is displayed correctly.

 ...
 CW Package maintainers who aren't set up for writing UTF-8 can always
 CW resort to transliteration into ASCII if need be.

The biggest compromise that argument can win from me is to allow
non-ASCII names in UTF-8 in changelogs, but only if each such name is
accompanied by an ASCII transliteration. But that solution is
substantially more complex than just limiting changelogs to 7-bit ASCII,
and there is no easy way to check for compliance.

-- 
Dmitry Borodaenko



Re: Status of UTF-8 Debian changelogs

2003-06-07 Thread Colin Walters
On Sat, 2003-06-07 at 13:43, Dmitry Borodaenko wrote:

 I don't see it as a proper credit to your contributors if their name
 appears as 'J?rg?n' (or even '' in case of Kanji) on my display.

That's a problem with your display.

 What I objected to is that they may: I'd rather they may not. I'd rather
 encoding of changelogs was specified to be 7-bit ASCII.

I think that's just like giving up.  It will make life more painful for
everyone.

 Excuse me for ad hominem, but how many foreign languages do you speak?
 The reason I'm asking is that my observation is that people from
 countries with completely non-ASCII writing system (as opposed to
 European Latin-based languages) almost always do transliterate their
 names when they communicate with someone speaking a different language.

Of course, this is likely because it wasn't until fairly recently (i.e.
the last year or two) that GNU/Linux got some basic support for their
writing systems.  So they essentially had to transliterate.  But now
with UTF-8 there's a better choice, and they can use their real name.

 The biggest compromise you can convince me to with that argument, is to
 allow to put non-ASCII names in UTF-8 into changelogs, but only if such
 name is accompanied by ASCII transliteration. But that solution is
 substantially more complex than just limiting changelogs to 7-bit ASCII,
 and there is no easy way to check for compliance.

That's something that an individual maintainer could decide to do. 
Perhaps they could include a transliteration in quotation marks, like:

カゼチ "Junichiro Koizumi" [EMAIL PROTECTED].

My apologies if the above is some grave insult in Japanese; I just
picked some random Katakana in gucharmap :)

Anyways, I think transliteration is largely a separate issue from the
encoding of the changelog.  Using UTF-8 doesn't force people to stop
transliterating.



Re: Status of UTF-8 Debian changelogs

2003-06-07 Thread Wouter Verhelst
On Sat, Jun 07, 2003 at 04:21:33PM +0100, Colin Watson wrote:
 What do you lose here? Those who have fonts that can display the
 character in question will be able to do so; those who don't won't, but
 will see some reasonably obvious indicator like a ? or a filled-in
 square to show that the character is one they can't display. This is
 superior to the situation where those who don't have such fonts just see
 some gibberish.

Superior? No way, it's just as bad. Whether the noise is gibberish, or
whether it consists of question marks or cute little squares doesn't make
any difference at all.

-- 
Wouter Verhelst
Debian GNU/Linux -- http://www.debian.org
Nederlandstalige Linux-documentatie -- http://nl.linux.org
An expert can usually spot the difference between a fake charge and a
full one, but there are plenty of dead experts. 
  -- National Geographic Channel, in a documentary about large African beasts.




Re: Status of UTF-8 Debian changelogs

2003-06-07 Thread Steve Langasek
On Sat, Jun 07, 2003 at 09:31:26PM +0200, Wouter Verhelst wrote:
 On Sat, Jun 07, 2003 at 04:21:33PM +0100, Colin Watson wrote:
  What do you lose here? Those who have fonts that can display the
  character in question will be able to do so; those who don't won't, but
  will see some reasonably obvious indicator like a ? or a filled-in
  square to show that the character is one they can't display. This is
  superior to the situation where those who don't have such fonts just see
  some gibberish.

 Superior? No way, it's just as bad. Whether the noise is gibberish, or
 whether it consist of question marks or cute little squares doesn't make
 any difference at all.

Except that UTF8 is non-destructive when interpreted as any other
character set.  The same cannot be said of many other character sets:
trying to display some Western charsets on some CJK terminals can cause
codepage shifts that corrupt the display of the remainder of the text,
IIRC.

-- 
Steve Langasek
postmodern programmer




Re: Status of UTF-8 Debian changelogs

2003-06-07 Thread Colin Walters
On Sat, 2003-06-07 at 15:36, Wouter Verhelst wrote:

 Yeah, but it's not always as good as the legacy support is. For
 instance, last I tried uxterm (like, 2 minutes ago), I put in a euro
 sign somewhere. Which appeared correctly (hurray), but doing backspace
 over that didn't do what it was supposed to do, in that only one of the
 three unicode bytes was removed (bug not filed yet, will do if I don't
 forget, and find the time to investigate properly).

Are you using zsh?  I get that kind of behavior with it, but bash works
ok.  This is unfortunate because I really like zsh otherwise :/



Re: Status of UTF-8 Debian changelogs

2003-06-07 Thread Wouter Verhelst
On Sat, Jun 07, 2003 at 04:17:15PM -0400, Colin Walters wrote:
 On Sat, 2003-06-07 at 15:36, Wouter Verhelst wrote:
 
  Yeah, but it's not always as good as the legacy support is. For
  instance, last I tried uxterm (like, 2 minutes ago), I put in a euro
  sign somewhere. Which appeared correctly (hurray), but doing backspace
  over that didn't do what it was supposed to do, in that only one of the
  three unicode bytes was removed (bug not filed yet, will do if I don't
  forget, and find the time to investigate properly).
 
 Are you using zsh?  I get that kind of behavior with it, but bash works
 ok.

No, I'm using bash...

-- 
Wouter Verhelst
Debian GNU/Linux -- http://www.debian.org
Nederlandstalige Linux-documentatie -- http://nl.linux.org
An expert can usually spot the difference between a fake charge and a
full one, but there are plenty of dead experts. 
  -- National Geographic Channel, in a documentary about large African beasts.



Re: Status of UTF-8 Debian changelogs

2003-06-07 Thread Colin Walters
[ no need to CC me ]

On Sat, 2003-06-07 at 17:39, Wouter Verhelst wrote:

 No, I'm using bash...

Weird.  It works here.   What's your $LANG?  If you're inputting Unicode
it should probably be something.UTF-8.



Re: Status of UTF-8 Debian changelogs

2003-06-07 Thread Wouter Verhelst
On Sat, Jun 07, 2003 at 05:58:28PM -0400, Colin Walters wrote:
 [ no need to CC me ]
 
 On Sat, 2003-06-07 at 17:39, Wouter Verhelst wrote:
 
  No, I'm using bash...
 
 Weird.  It works here.   What's your $LANG?  If you're inputting Unicode
 it should probably be something.UTF-8.

it is:

[EMAIL PROTECTED]:~$ echo $LANG
nl_BE.UTF-8

which means that uxterm manually ensures that $LANG is set to
something.UTF-8, since I set my $LANG to nl_BE.

Anyway, this is way off-topic here. My point was: yes, there is some
Unicode support in Debian, but no, it's not working flawlessly yet. If
you have any other ideas (finding the exact issue is still a worthwhile
goal), please send them by private mail.

-- 
Wouter Verhelst
Debian GNU/Linux -- http://www.debian.org
Nederlandstalige Linux-documentatie -- http://nl.linux.org
An expert can usually spot the difference between a fake charge and a
full one, but there are plenty of dead experts. 
  -- National Geographic Channel, in a documentary about large African beasts.



Re: Status of UTF-8 Debian changelogs

2003-06-06 Thread Bill Allombert
On Fri, Jun 06, 2003 at 01:17:00PM +0200, Jérôme Marant wrote:
  I don't see all those (7|8)-bit-charset-using people requiring the
  same...
 
   Policy would mean all of them in the same charset, UTF-8 that is.

The issue calls for two comments:

1) Changelogs are required to be written in English, so non-7-bit
characters should be rare, and use of non-Latin-1 characters is
probably not a good idea. For example, writing the name of a
developer in Japanese characters might make it hard for people
reading the changelog to understand who is referred to. This is
unfortunate.

2) People write changelogs with whatever locale they use for development.
Requiring them to use a special tool for writing changelogs would be a
pain. I don't know how far lintian can check for UTF-8 encoding.

Cheers,
-- 
Bill. [EMAIL PROTECTED]

Imagine a large red swirl here. 



Re: Status of UTF-8 Debian changelogs

2003-06-06 Thread Steve Langasek
On Fri, Jun 06, 2003 at 06:37:11PM +0200, Bill Allombert wrote:
 On Fri, Jun 06, 2003 at 01:17:00PM +0200, Jérôme Marant wrote:
   I don't see all those (7|8)-bit-charset-using people requiring the
   same...

    Policy would mean all of them in the same charset, UTF-8 that is.

 The issue call for two comments:

 1) Changelog are required to be written in english, so non 7bit
 characters should be rare, and use of non latin-1 characters are 
 probably not a good idea. For example, writing the name of a 
 developer with japanese characters might cause problem to people 
 reading the changelog understanding who is referred to. This is
 unfortunate.

 2) People write changelog with whatever locales they use for development.
 Requiring them to use special tool for writing changelog would be a
 pain. I don't know how far lintian can check for UTF-8 encoding. 

Of course, these comments give contradictory rationales.  The one says
that mandating UTF-8 is bad because people shouldn't use non-ASCII
characters in changelogs; the other says that mandating UTF-8 is bad
because it makes it harder for people to use non-ASCII characters in
changelogs.  I argue that the latter is a *good* thing; and where
exceptions are permitted, they should be encoded using a common
character set.

Checking for non-UTF8 characters in a changelog is trivial.  Dump the
file through 'iconv -f utf-8 -t ucs-4', discard the output, and check
the return value.  If there are any characters in the stream which are
invalid UTF-8 sequences, iconv will exit with an error code; and this
will be the case for the vast majority of other character sets.
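
The check described above can be wrapped in a small shell function. This
is a sketch only: the name is_valid_utf8 is made up here, and the
conversion to UCS-4 serves purely to make iconv decode every byte.

```shell
# is_valid_utf8 FILE
# Succeeds if FILE decodes cleanly as UTF-8. The converted output is
# discarded; only iconv's exit status matters.
is_valid_utf8() {
    iconv -f UTF-8 -t UCS-4 < "$1" > /dev/null 2>&1
}

# Example:
#   is_valid_utf8 debian/changelog || echo "changelog is not valid UTF-8"
```

Pure ASCII files pass as well, since ASCII is a subset of UTF-8. A file
in a legacy encoding can in rare cases happen to form valid UTF-8
sequences, so a pass is strong evidence rather than proof.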

-- 
Steve Langasek
postmodern programmer




Re: Status of UTF-8 Debian changelogs

2003-06-06 Thread Colin Walters
On Fri, 2003-06-06 at 12:37, Bill Allombert wrote:

 1) Changelog are required to be written in english, so non 7bit
 characters should be rare, and use of non latin-1 characters are 
 probably not a good idea. For example, writing the name of a 
 developer with japanese characters might cause problem to people 
 reading the changelog understanding who is referred to. This is
 unfortunate.

 2) People write changelog with whatever locales they use for development.
 Requiring them to use special tool for writing changelog would be a
 pain.

For some of us, the locale encoding is UTF-8.  Besides, if you want to
continue using a legacy editor, it should be trivial to convert from
whatever locale encoding you're using into UTF-8 when building the
binary package using iconv.  Basically just something like this:

iconv -f ISO-8859-1 -t UTF-8 debian/changelog \
  > debian/foo/usr/share/doc/foo/changelog.Debian
gzip -9qf debian/foo/usr/share/doc/foo/changelog.Debian
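
Since some maintainers already write UTF-8, a build script would probably
want to convert only when needed. A rough sketch along those lines,
assuming the legacy encoding is known in advance (ISO-8859-1 by default,
as in the command above); install_changelog is a hypothetical helper, not
an existing debhelper command:

```shell
# install_changelog SRC DEST [LEGACY_CHARSET]
# Writes SRC to DEST as UTF-8. If SRC already decodes as UTF-8 (which
# includes plain ASCII) it is copied unchanged; otherwise it is assumed
# to be in LEGACY_CHARSET (default ISO-8859-1) and converted.
install_changelog() {
    src=$1 dest=$2 legacy=${3:-ISO-8859-1}
    if iconv -f UTF-8 -t UCS-4 < "$src" > /dev/null 2>&1; then
        cp "$src" "$dest"
    else
        iconv -f "$legacy" -t UTF-8 < "$src" > "$dest"
    fi
}

# Example, followed by the same gzip step as before:
#   install_changelog debian/changelog \
#       debian/foo/usr/share/doc/foo/changelog.Debian
```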

  I don't know how far lintian can check for UTF-8 encoding. 

Actually I sent in a patch for this 153 days ago.  Bug #175318.



Status of UTF-8 Debian changelogs

2003-06-05 Thread Jérôme Marant

Hi,

  I've seen some UTF-8-encoded debian/changelog files but I haven't
  seen anything mentioning it is allowed in Debian Policy.

  According to #174982, the proposal has been accepted but the bug
  is still open. When is this planned for?

  Thanks.

--
Jérôme Marant



Re: Status of UTF-8 Debian changelogs

2003-06-05 Thread Josip Rodin
On Thu, Jun 05, 2003 at 01:35:38PM +0200, Jérôme Marant wrote:
   I've seen some UTF-8-encoded debian/changelog files but I haven't
   seen anything mentioning it is allowed in Debian Policy.
 
   According to #174982, the proposal has been accepted but the bug
   is still open. When is this planned for?

Ahm. You need it written in the Policy manual to use a 16-bit charset?
I don't see all those (7|8)-bit-charset-using people requiring the same...

-- 
 2. That which causes joy or happiness.



Re: Status of UTF-8 Debian changelogs

2003-06-05 Thread Steve Langasek
On Thu, Jun 05, 2003 at 02:23:36PM +0200, Josip Rodin wrote:
 On Thu, Jun 05, 2003 at 01:35:38PM +0200, Jérôme Marant wrote:
    I've seen some UTF-8-encoded debian/changelog files but I haven't
    seen anything mentioning it is allowed in Debian Policy.
  
    According to #174982, the proposal has been accepted but the bug
    is still open. When is this planned for?

 Ahm. You need it written in the Policy manual to use a 16-bit charset?
                                                        ^^^^^^^^^^^^^^

Multibyte encoding.  If they were using a 16-bit character set,
we'd have to kill them for creating files that can't be processed as
C strings. :)
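
To illustrate the C-string point: a 16-bit encoding such as UTF-16 stores
each ASCII character together with a zero byte, which a NUL-terminated
string cannot hold, while UTF-8 leaves ASCII bytes untouched. A small
sketch using iconv and od (the helper name hexbytes is made up):

```shell
# hexbytes CHARSET: read ASCII text on stdin and print its bytes in
# CHARSET as space-separated hex pairs on one line.
hexbytes() {
    iconv -f ASCII -t "$1" | od -An -tx1 | xargs
}

printf 'Hi' | hexbytes UTF-8      # prints: 48 69
printf 'Hi' | hexbytes UTF-16BE   # prints: 00 48 00 69
```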

-- 
Steve Langasek
postmodern programmer




Re: Status of UTF-8 Debian changelogs

2003-06-05 Thread Colin Walters
On Thu, 2003-06-05 at 08:23, Josip Rodin wrote:

 Ahm. You need it written in the Policy manual to use a 16-bit charset?

As Steve points out, the size of the code space isn't particularly
relevant.

 I don't see all those (7|8)-bit-charset-using people requiring the same...

The problem is that we have no way to know what encoding an individual
Debian Changelog entry is in.  This is actually important for stuff like
apt-listchanges.  I constantly see broken characters in Debian
changelogs in apt-listchanges from people using ISO-8859-1, when my
terminal speaks UTF-8 natively.  If you're using an ISO-8859-1 terminal,
then apt-listchanges could recode the changelogs from UTF-8 to
ISO-8859-1 (or try, anyways).  And since my terminal speaks UTF-8,
apt-listchanges could just pass it on as is.
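
What such recoding could look like, as a sketch rather than anything
apt-listchanges actually does: assume the changelog is UTF-8, ask the
locale for the terminal's charset with 'locale charmap', and let iconv
convert. The function name show_changelog and the //TRANSLIT fallback (a
GNU iconv feature) are choices made for this example.

```shell
# show_changelog FILE [CHARSET]
# Writes FILE (assumed to be UTF-8) to stdout in CHARSET, which defaults
# to the charset of the current locale as reported by `locale charmap`.
show_changelog() {
    target=${2:-$(locale charmap)}
    case $target in
        UTF-8|utf-8|utf8)
            cat "$1" ;;                                  # pass through as is
        *)
            iconv -f UTF-8 -t "$target//TRANSLIT" < "$1" ;;  # recode, approximating
    esac
}

# Example:
#   show_changelog /usr/share/doc/foo/changelog.Debian ISO-8859-1
```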

A situation where it can just be any encoding (or even a mix, if say a
speaker of an ISO-8859-2 language later takes over from the previous
ISO-8859-1 maintainer) is just terribly broken.

UTF-8 is the one and only sane choice.  This policy amendment got a
number of seconds, so unless you can raise a coherent objection, I think
it should go in.




Re: Status of UTF-8 Debian changelogs

2003-06-05 Thread Josip Rodin
On Thu, Jun 05, 2003 at 02:58:12PM -0400, Colin Walters wrote:
 The problem is that we have no way to know what encoding an individual
 Debian Changelog entry is in.

The problem is that my point entirely flew over your head. The point was,
as usual, that Policy is not designed to be a stick to beat people with,
and that it does not have to precede implementation.

You can already complain at people who use e.g. Latin 1 in changelogs. Once
a released version of the Policy manual gets a shiny and bright new sentence
saying "Use Unicode" (just in a roundabout, somewhat patronizing kind of
way), the only thing that will change is that if someone complains at people
who use UTF-8 in changelogs, a new retort will be available, "THE POLICY
MADE ME DO IT!!1!", or similar.

Oh, and insert another standard rant here on how the fact something hasn't
been done does not automatically imply that those who haven't done it are
obstructionist sadistic bastards.

-- 
 2. That which causes joy or happiness.



Re: Status of UTF-8 Debian changelogs

2003-06-05 Thread Steve Langasek
On Thu, Jun 05, 2003 at 10:40:07PM +0200, Josip Rodin wrote:
 On Thu, Jun 05, 2003 at 02:58:12PM -0400, Colin Walters wrote:
  The problem is that we have no way to know what encoding an individual
  Debian Changelog entry is in.

 The problem is that my point entirely flew over your head. The point was,
 as usual, that Policy is not designed to be a stick to beat people with,
 and that it does not have to precede implementation.

 You can already complain at people who use e.g. Latin 1 in changelogs. Once
 a released version of the Policy manual gets a shiny and bright new sentence
 saying Use Unicode (just in a roundabout, somewhat patronizing kind of
 way), the only thing that will change is that if someone complains at people
 who use UTF-8 in changelogs, a new retort will be available, THE POLICY
 MADE ME DO IT!!1!, or similar.

Common sense already dictates that untagged, non-ASCII characters should
not be used in documents that must be parsed in a multilingual
environment (e.g., the planet Earth).  Specifying UTF8 as an encoding
for changelogs is to *permit* something which is desirable but not
sensibly achievable in the absence of a policy for it.

I'm more than happy to beat people for using non-UTF8 characters in
changelog with the stick I'm currently holding -- no need to roll up
Policy for this purpose. ;)

-- 
Steve Langasek
postmodern programmer




Re: Status of UTF-8 Debian changelogs

2003-06-05 Thread Colin Walters
On Thu, 2003-06-05 at 16:40, Josip Rodin wrote:
 On Thu, Jun 05, 2003 at 02:58:12PM -0400, Colin Walters wrote:
  The problem is that we have no way to know what encoding an individual
  Debian Changelog entry is in.
 
 The problem is that my point entirely flew over your head. The point was,
 as usual, that Policy is not designed to be a stick to beat people with,
 and that it does not have to precede implementation.

You certainly had a strange way of stating this; your initial reply
seemed to focus on the size of the code space of the character sets.

Anyways, you could consider this as already mostly implemented; the vast
majority of changelogs are pure ASCII; there are only a few people using
ISO-8859-x and UTF-8.  Given the disadvantages of the former, we should
standardize on the latter, and that's what this policy amendment is all
about.

 You can already complain at people who use e.g. Latin 1 in changelogs. Once
 a released version of the Policy manual gets a shiny and bright new sentence
 saying Use Unicode (just in a roundabout, somewhat patronizing kind of
 way), 

I see no reason for it to be either roundabout or patronizing; perhaps
you could suggest an alternative wording that would remove these
perceived qualities?

 the only thing that will change is that if someone complains at people
 who use UTF-8 in changelogs, a new retort will be available, THE POLICY
 MADE ME DO IT!!1!, or similar.

Why would someone complain?

 Oh, and insert another standard rant here on how the fact something hasn't
 been done does not automatically imply that those who haven't done it are
 obstructionist sadistic bastards.

I never implied such, or if I did it was certainly not my intention.  I
think you've been doing a great job as a policy editor, and I assume
that not adding this amendment was just an oversight.