Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2010-09-06 Thread Andrew McMillan
On Sat, 2010-09-04 at 16:08 +0200, Bill Allombert wrote: On Sat, Sep 04, 2010 at 12:37:07AM +0200, Samuel Thibault wrote: Aurelien Jarno, le Fri 03 Sep 2010 19:16:40 +0200, a écrit : On Fri, Sep 03, 2010 at 04:20:27PM +0200, Samuel Thibault wrote: Roger Leigh, le Fri 03 Sep 2010 14:52:39

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2010-09-04 Thread Bill Allombert
On Sat, Sep 04, 2010 at 12:37:07AM +0200, Samuel Thibault wrote: Aurelien Jarno, le Fri 03 Sep 2010 19:16:40 +0200, a écrit : On Fri, Sep 03, 2010 at 04:20:27PM +0200, Samuel Thibault wrote: Roger Leigh, le Fri 03 Sep 2010 14:52:39 +0100, a écrit : There were no objections to having a

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2010-09-03 Thread Thorsten Glaser
Russ Allbery dixit: I agree with others in this thread that having a UTF-8 locale without the collation changes implied by en_US is very useful for various software packages such as automated test suites that want reproducible results and were originally written for the C locale. Same for

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2010-09-03 Thread Giacomo A. Catenazzi
On 03.09.2010 01:46, Russ Allbery wrote: Samuel Thibault sthiba...@debian.org writes: Well, it's mostly - some people saying it's useless, - while other people saying I need it, and also - en_US.UTF-8 is just fine vs. - en_US.UTF-8 sucks, we really need C.UTF-8 instead without any

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2010-09-03 Thread Samuel Thibault
Thorsten Glaser, le Fri 03 Sep 2010 13:02:31 +, a écrit : Russ Allbery dixit: I agree with others in this thread that having a UTF-8 locale without the collation changes implied by en_US is very useful for various software packages such as automated test suites that want reproducible

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2010-09-03 Thread Samuel Thibault
Giacomo A. Catenazzi, le Fri 03 Sep 2010 15:26:47 +0200, a écrit : BTW I think we should wait some more time. Last week I saw a bug on the debian-glibc list: printf fails if it finds an invalid UTF-8 character (when the locale uses UTF-8). Note it is allowed in POSIX, which distinguishes raw strings

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2010-09-03 Thread Roger Leigh
On Fri, Sep 03, 2010 at 01:37:24AM +0200, Samuel Thibault wrote: Russ Allbery, le Thu 02 Sep 2010 16:24:56 -0700, a écrit : Generally what that means is that someone needs to digest the discussion in the thread Well, it's mostly - some people saying it's useless, - while other people

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2010-09-03 Thread Samuel Thibault
Roger Leigh, le Fri 03 Sep 2010 14:52:39 +0100, a écrit : On Fri, Sep 03, 2010 at 01:37:24AM +0200, Samuel Thibault wrote: without any convergence. I think reading back through the entire log, Thanks for having done it! people who were initially rather opposed to the proposal did come

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2010-09-03 Thread Ben Finney
Roger Leigh rle...@codelibre.net writes: There were no objections to having a UTF-8 locale installed and available by default, just to it *being* the default. […] Would a less confusing way to make this distinction be to say something like: “The minimal Debian installation must have a locale

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2010-09-03 Thread Thorsten Glaser
Samuel Thibault dixit: LC_CTYPE has differences between locales, transliterations notably. For Oh, okay – good to know… I'd say go on :) OK. (of course we'll need to wait for libc to provide the locale (post-squeeze I guess) before changing the policy). Sure. Maybe think of something to

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2010-09-03 Thread Thorsten Glaser
Samuel Thibault dixit: believe that's something that shouldn't break Squeeze at all. I also believe it cannot possibly do that. bye, //mirabilos -- “It is inappropriate to require that a time represented as seconds since the Epoch precisely represent the number of seconds between the

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2010-09-03 Thread Aurelien Jarno
On Fri, Sep 03, 2010 at 04:20:27PM +0200, Samuel Thibault wrote: Roger Leigh, le Fri 03 Sep 2010 14:52:39 +0100, a écrit : On Fri, Sep 03, 2010 at 01:37:24AM +0200, Samuel Thibault wrote: without any convergence. I think reading back through the entire log, Thanks for having done it!

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2010-09-03 Thread Russ Allbery
Ben Finney ben+deb...@benfinney.id.au writes: Would a less confusing way to make this distinction be to say something like: “The minimal Debian installation must have a locale available that uses the UTF-8 character encoding.”? The other angle here is that it can't just be any UTF-8 locale,

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2010-09-03 Thread Samuel Thibault
Aurelien Jarno, le Fri 03 Sep 2010 19:16:40 +0200, a écrit : On Fri, Sep 03, 2010 at 04:20:27PM +0200, Samuel Thibault wrote: Roger Leigh, le Fri 03 Sep 2010 14:52:39 +0100, a écrit : There were no objections to having a UTF-8 locale installed and available by default, just to it *being*

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2010-09-02 Thread Samuel Thibault
Hello, No news on this? Hurd's console needs a UTF-8 locale to be able to use wcwidth() for proper double-width support. Note: debian-installer is already providing a C.UTF-8 locale to d-i components, so it works there. Samuel
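As a rough illustration of what "double-width support" means here, the following Python sketch approximates column widths from the Unicode East Asian Width property. It is an assumption-laden stand-in, not wcwidth() itself: the helper name `display_width` is hypothetical, and combining or control characters are not handled.

```python
import unicodedata

def display_width(text: str) -> int:
    """Approximate terminal columns: Wide/Fullwidth characters count as 2.

    Hypothetical helper for illustration; real wcwidth() also handles
    combining and non-printable characters, which this sketch ignores.
    """
    width = 0
    for ch in text:
        # 'W' (Wide) and 'F' (Fullwidth) occupy two columns on a terminal.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width

ascii_width = display_width("abc")
cjk_width = display_width("日本語")
```

The point of the request in the thread is that the C library can only make this kind of per-character decision once the locale tells it the input is UTF-8.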

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2010-09-02 Thread Russ Allbery
Samuel Thibault sthiba...@debian.org writes: No news on this? Hurd's console needs a UTF-8 locale to be able to use wcwidth() for proper double-width support. Note: debian-installer is already providing a C.UTF-8 locale to d-i components, so it works there. Does libc in Debian provide a

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2010-09-02 Thread Samuel Thibault
Russ Allbery, le Thu 02 Sep 2010 15:53:50 -0700, a écrit : Samuel Thibault sthiba...@debian.org writes: No news on this? Hurd's console needs a UTF-8 locale to be able to use wcwidth() for proper double-width support. Note: debian-installer is already providing a C.UTF-8 locale to d-i

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2010-09-02 Thread Russ Allbery
Samuel Thibault sthiba...@debian.org writes: Russ Allbery, le Thu 02 Sep 2010 15:53:50 -0700, a écrit : Does libc in Debian provide a C.UTF-8 locale? It doesn't yet but it's easy to do, that's not the question. See the questions in the bug thread. I think that's a prerequisite for doing

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2010-09-02 Thread Samuel Thibault
Russ Allbery, le Thu 02 Sep 2010 16:07:25 -0700, a écrit : Samuel Thibault sthiba...@debian.org writes: Russ Allbery, le Thu 02 Sep 2010 15:53:50 -0700, a écrit : Does libc in Debian provide a C.UTF-8 locale? It doesn't yet but it's easy to do, that's not the question. See the

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2010-09-02 Thread Russ Allbery
Samuel Thibault sthiba...@debian.org writes: Russ Allbery, le Thu 02 Sep 2010 16:07:25 -0700, a écrit : Ah, then no, in that case there has been no progress. I don't believe anyone is currently working on this. Well, no work is needed, what is needed is to agree on what work to do. That's

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2010-09-02 Thread Samuel Thibault
Russ Allbery, le Thu 02 Sep 2010 16:24:56 -0700, a écrit : Generally what that means is that someone needs to digest the discussion in the thread Well, it's mostly - some people saying it's useless, - while other people saying I need it, and also - en_US.UTF-8 is just fine vs. - en_US.UTF-8

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2010-09-02 Thread Russ Allbery
Samuel Thibault sthiba...@debian.org writes: Well, it's mostly - some people saying it's useless, - while other people saying I need it, and also - en_US.UTF-8 is just fine vs. - en_US.UTF-8 sucks, we really need C.UTF-8 instead without any convergence. I think the way to get past

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-12-01 Thread Giacomo A. Catenazzi
Thorsten Glaser wrote: Albert Cahalan dixit: Unless plain C goes UTF-8 Not going to happen, it’s not binary-safe. (I fought that in MirBSD with the OPTU-8/16 encoding scheme.) Why not? Note that usual functions work on bytes, not on characters, and on POSIX utilities the old/classical

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-12-01 Thread Thorsten Glaser
Giacomo A. Catenazzi dixit: Not going to happen, it’s not binary-safe. (I fought that in MirBSD with the OPTU-8/16 encoding scheme.) Why not? Note that usual functions work on bytes Not really. The difference between 'tr u x' on binary files can, depending on the implementation of tr (if it
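The binary-safety concern can be sketched in Python. This is only an illustration of byte-oriented versus character-oriented processing, not how any tr(1) is implemented; the helper names `translate_bytes` and `translate_chars` are hypothetical.

```python
def translate_bytes(data: bytes, src: bytes, dst: bytes) -> bytes:
    """Byte-oriented: replace each byte in src with the matching byte in dst."""
    return data.translate(bytes.maketrans(src, dst))

def translate_chars(data: bytes, src: str, dst: str) -> bytes:
    """Character-oriented: decode as UTF-8 first, so invalid input raises."""
    text = data.decode("utf-8")  # UnicodeDecodeError on invalid sequences
    return text.translate(str.maketrans(src, dst)).encode("utf-8")

binary_blob = b"u\xff\xfeu\x80"  # arbitrary binary data, not valid UTF-8

# Works on any input, because no encoding is assumed.
byte_result = translate_bytes(binary_blob, b"u", b"x")

# Fails on the same input once UTF-8 decoding is required.
try:
    translate_chars(binary_blob, "u", "x")
    char_mode_failed = False
except UnicodeDecodeError:
    char_mode_failed = True
```

A tool that assumes UTF-8 must either reject such input or define what it does with invalid sequences, which is the behaviour change being objected to for the plain C locale.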

Bug#522776: Subject: Re: Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-11-27 Thread Thorsten Glaser
Albert Cahalan dixit: Any imperfection in a locale results in C, as ASCII as can be. Yes, and C shall not imply latin1 but 7-bit ASCII but 8-bit transparent. //mirabilos -- Sometimes they [people] care too much: pretty printers [and syntax highligh- ting, d.A.] mechanically produce pretty

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-11-27 Thread Thorsten Glaser
Albert Cahalan dixit: Unless plain C goes UTF-8 Not going to happen, it’s not binary-safe. (I fought that in MirBSD with the OPTU-8/16 encoding scheme.) The stupid broken en_US.UTF-8 fucks up the sort order. So true… (and paper size!) We really need a do-nothing locale that follows the

Bug#522776: Subject: Re: Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-11-27 Thread Thorsten Glaser
Albert Cahalan dixit: Giacomo A. Catenazzi writes: I think nobody should use C or C.UTF-8 as user encoding. I’d use it. Debian doesn't ship a proper locale. I want sorting according to the raw Unicode values. Also called ASCIIbetically ☺ But C exists, C.UTF-8 doesn’t. * All ISO8859 locales
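Sorting "according to the raw Unicode values" can be illustrated with a short Python sketch; the word list is made up for the example. Python's default string comparison is by code point, and comparing the UTF-8 encodings byte by byte gives the same order, which is why a byte-wise collation is enough for this.

```python
# Sample data chosen for illustration only.
words = ["zebra", "Apple", "apple", "Zebra", "éclair", "日本"]

# Default str ordering in Python compares code points.
by_code_point = sorted(words)

# Comparing UTF-8 encodings byte-wise preserves code point order.
by_utf8_bytes = sorted(words, key=lambda w: w.encode("utf-8"))
```

Uppercase sorts before lowercase and all ASCII before non-ASCII, unlike the dictionary-style collation of a language locale such as en_US.UTF-8.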

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-11-26 Thread Albert Cahalan
Steve Langasek writes: On Mon, Apr 06, 2009 at 05:33:35PM +, Thorsten Glaser wrote: If you need a specific locale (as seems from mksh, not sure if it is a bug in that program), you need to set it. You can only set a locale on a glibc-based system if it's installed beforehand, which root

Bug#522776: Subject: Re: Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-11-26 Thread Albert Cahalan
Roger Leigh writes: On Tue, Apr 07, 2009 at 09:24:38PM +0200, Adeodato Simó wrote: + Thorsten Glaser (Tue, 07 Apr 2009 18:54:59 +): Except the ton which sets LC_ALL=C to get sane (parsable, dependable, historically compatible) output. These would then unset all other LC_* and LANG and

Bug#522776: Subject: Re: Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-11-26 Thread Albert Cahalan
Andrew McMillan writes: On Wed, 2009-04-08 at 10:15 +0200, Giacomo A. Catenazzi wrote: So I've a question: what does UTF-8 mean in this context (C.UTF-8) ? ... So given a character which is outside of the 0x00-0x7f range, in an environment which does not specify an encoding, I would like to

Bug#522776: Subject: Re: Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-11-26 Thread Albert Cahalan
Giacomo A. Catenazzi writes: [Andrew McMillan probably] I think nobody should use C or C.UTF-8 as user encoding. And I really hope that Debian will try to convince users to use a proper locale. Debian doesn't ship a proper locale. I want sorting according to the raw Unicode values. I want

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-05-07 Thread Joey Hess
FWIW, the installation-locale udeb provides a C.UTF-8 locale, which d-i runs under. Takes about 168k. -- see shy jo

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-09 Thread Giacomo A. Catenazzi
Thorsten Glaser wrote: Giacomo A. Catenazzi dixit: I think you misunderstand the mksh part of the problem. mksh has two modes: a legacy mode, in which it does not make any assumptions about charsets or encodings and is 8-bit clean and mostly 8-bit transparent, save a few mostly past bugs and

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-09 Thread Thorsten Glaser
Giacomo A. Catenazzi dixit: This is a good way to do things! Thanks. Or a debhelper (or similar) utility that constructs it for build needs. That’s already done, as I said – vorlon gave me an idea, I implemented it, it works, I uploaded a new mksh package… and then I saw someone’s added it to the

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-09 Thread Giacomo A. Catenazzi
Thorsten Glaser wrote: Giacomo A. Catenazzi dixit: a real locale), but in this case I would also test some UTF-16 or Asian locale (mksh should not assume UTF-8 in these cases). It doesn’t. This test is already run for the C locale. Besides, there are no UTF-16 or somesuch locales on UNIX®

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-08 Thread Giacomo A. Catenazzi
Roger Leigh wrote: I wasn't aware that this level of checking was performed, though it does make sense. But, does it not reject non 7-bit input in the C locale for completeness? Should tools doing raw I/O not be using lower level interfaces such as fread() and fwrite() rather than the

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-08 Thread Giacomo A. Catenazzi
Andrew McMillan wrote: On Tue, 2009-04-07 at 22:32 +0200, Adeodato Simó wrote: It is my impression that more packages than mksh could use an UTF-8 locale at build time (I’m afraid I don’t have pointers, but I’m sure I’ve come across at least a couple). Wouldn’t it be just better to change

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-08 Thread Roger Leigh
On Tue, Apr 07, 2009 at 10:47:00PM +, Thorsten Glaser wrote: Roger Leigh dixit: Are you sure? Not entirely, but I recall fgetc (or was it fgetwc?) being affected. Ah, fgetc/fputc are specified in the standard as byte oriented rather than character-oriented, so are probably

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-08 Thread Giacomo A. Catenazzi
Roger Leigh wrote: On Tue, Apr 07, 2009 at 09:24:38PM +0200, Adeodato Simó wrote: + Thorsten Glaser (Tue, 07 Apr 2009 18:54:59 +): Except the ton which sets LC_ALL=C to get sane (parsable, dependable, historically compatible) output. These would then unset all other LC_* and LANG and

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-08 Thread Roger Leigh
On Wed, Apr 08, 2009 at 09:41:18AM +0200, Giacomo A. Catenazzi wrote: Roger Leigh wrote: I wasn't aware that this level of checking was performed, though it does make sense. But, does it not reject non 7-bit input in the C locale for completeness? Should tools doing raw I/O not be using

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-08 Thread Giacomo A. Catenazzi
Roger Leigh wrote: On Tue, Apr 07, 2009 at 10:36:20AM +0200, Giacomo A. Catenazzi wrote: Roger Leigh wrote: I can't help but feel that your reply completely missed the purpose of what I want to do, and why. I hope the following response clears things up. I know that I missed the original

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-08 Thread Andrew McMillan
On Wed, 2009-04-08 at 10:15 +0200, Giacomo A. Catenazzi wrote: So I've a question: what does UTF-8 mean in this context (C.UTF-8) ? It is not a stupid question, and the answer is not the UTF-8 algorithm to code/decode unicode. I'm still thinking that you are confusing the various meanings.

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-08 Thread Roger Leigh
On Wed, Apr 08, 2009 at 10:22:15AM +0200, Giacomo A. Catenazzi wrote: Roger Leigh wrote: On Tue, Apr 07, 2009 at 09:24:38PM +0200, Adeodato Simó wrote: + Thorsten Glaser (Tue, 07 Apr 2009 18:54:59 +): Except the ton which sets LC_ALL=C to get sane (parsable, dependable, historically

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-08 Thread Giacomo A. Catenazzi
Andrew McMillan wrote: On Wed, 2009-04-08 at 10:15 +0200, Giacomo A. Catenazzi wrote: So I've a question: what does UTF-8 mean in this context (C.UTF-8) ? It is not a stupid question, and the answer is not the UTF-8 algorithm to code/decode unicode. I'm still thinking that you are confusing

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-08 Thread Thorsten Glaser
Giacomo A. Catenazzi dixit: The locale C is already a UTF-8 compatible locale. It is UTF-8 transparent but that's its pro and con. It does not tell the system that UTF-8 encoding is to be used. It basically says the encoding is none/unknown. Why does a build need to depend on a locale? [...] For

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-08 Thread Andrew McMillan
On Wed, 2009-04-08 at 15:31 +0200, Giacomo A. Catenazzi wrote: We have the same objective, but two different ways. Indeed, but it seems to me that you are pushing for a much bigger change than I am. So the smallest step which is in the same direction both of us want to go, is for *a* UTF-8

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-07 Thread Giacomo A. Catenazzi
Roger Leigh wrote: On Mon, Apr 06, 2009 at 11:09:17AM -0700, Steve Langasek wrote: On Mon, Apr 06, 2009 at 05:33:35PM +, Thorsten Glaser wrote: If you need a specific locale (as seems from mksh, not sure if it is a bug in that program), you need to set it. You can only set a locale on a

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-07 Thread Adeodato Simó
+ Thorsten Glaser (Tue, 07 Apr 2009 18:54:59 +): Except the ton which sets LC_ALL=C to get sane (parsable, dependable, historically compatible) output. These would then unset all other LC_* and LANG and LANGUAGE, and only set LC_CTYPE to C.UTF-8 to get old behaviour but with UTF-8 (and
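The environment change described here (drop LC_ALL, the other LC_* variables, LANG and LANGUAGE, then set only LC_CTYPE) might be sketched in Python as below. The function name `ctype_only_utf8_env` is hypothetical, and whether a locale named C.UTF-8 actually exists on a given system is exactly what the thread is debating.

```python
def ctype_only_utf8_env(base, utf8_locale="C.UTF-8"):
    """Return a copy of base with every locale variable removed except
    LC_CTYPE, which is set to utf8_locale.

    Assumes utf8_locale is installed; this sketch does not check that.
    """
    env = {
        key: value
        for key, value in base.items()
        # Drop LC_ALL and all other LC_* categories, plus LANG and LANGUAGE.
        if not (key.startswith("LC_") or key in ("LANG", "LANGUAGE"))
    }
    env["LC_CTYPE"] = utf8_locale
    return env

# Example input mimicking a script that forced LC_ALL=C.
sample = {"PATH": "/usr/bin", "LC_ALL": "C", "LANG": "de_DE.UTF-8",
          "LANGUAGE": "de", "LC_COLLATE": "C"}
cleaned = ctype_only_utf8_env(sample)
```

The resulting mapping could be passed as the `env` argument of `subprocess.run`. With only LC_CTYPE set, the other categories fall back to the C locale, so messages and collation stay as before while character handling becomes UTF-8 aware.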

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-07 Thread Bill Allombert
On Mon, Apr 06, 2009 at 10:56:25PM +0100, Roger Leigh wrote: On Mon, Apr 06, 2009 at 04:18:59PM +0200, Bill Allombert wrote: On Mon, Apr 06, 2009 at 02:06:55PM +0200, Thorsten Glaser wrote: Package: debian-policy Version: 3.8.1.0 Severity: wishlist For the mksh regression tests,

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-07 Thread Thorsten Glaser
Adeodato Simó dixit: + Thorsten Glaser (Tue, 07 Apr 2009 18:54:59 +): Except the ton which sets LC_ALL=C to get sane (parsable, dependable, historically compatible) output. These would then unset all other LC_* and LANG and LANGUAGE, and only set LC_CTYPE to C.UTF-8 to get old behaviour

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-07 Thread Roger Leigh
On Tue, Apr 07, 2009 at 06:54:59PM +, Thorsten Glaser wrote: Bill Allombert dixit: Fortunately, since Sarge, debian-installer sets LANG in /etc/environment so programs almost never run under the C locale anymore. Except the ton which sets LC_ALL=C to get sane (parsable, dependable,

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-07 Thread Thorsten Glaser
Bill Allombert dixit: Fortunately, since Sarge, debian-installer sets LANG in /etc/environment so programs almost never run under the C locale anymore. Except the ton which sets LC_ALL=C to get sane (parsable, dependable, historically compatible) output. These would then unset all other LC_* and

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-07 Thread Roger Leigh
On Tue, Apr 07, 2009 at 09:24:38PM +0200, Adeodato Simó wrote: + Thorsten Glaser (Tue, 07 Apr 2009 18:54:59 +): Except the ton which sets LC_ALL=C to get sane (parsable, dependable, historically compatible) output. These would then unset all other LC_* and LANG and LANGUAGE, and

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-07 Thread Adeodato Simó
+ Steve Langasek (Mon, 06 Apr 2009 11:09:17 -0700): On Mon, Apr 06, 2009 at 05:33:35PM +, Thorsten Glaser wrote: If you need a specific locale (as seems from mksh, not sure if it is a bug in that program), you need to set it. You can only set a locale on a glibc-based system if it’s

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-07 Thread Thorsten Glaser
Roger Leigh dixit: However, I would ideally like the C/POSIX locales to be UTF-8 by default as on other systems (with a C.ASCII variant if required). No, this has the potential to break, for example, tr(1). I lived through that on MirBSD. //mirabilos -- “It is inappropriate to require that a

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-07 Thread Thorsten Glaser
Adeodato Simó dixit: I would go as far as suggesting that some package like libc6 itself FWIW: -rw-r--r-- 1 tg tg 238336 Apr 7 22:59 en_US.UTF-8/LC_CTYPE It's not *that* much... Finally, this stuff that Roger proposes about making “C” be UTF-8, and create some C.ASCII for people needing

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-07 Thread Andrew McMillan
On Tue, 2009-04-07 at 22:32 +0200, Adeodato Simó wrote: It is my impression that more packages than mksh could use an UTF-8 locale at build time (I’m afraid I don’t have pointers, but I’m sure I’ve come across at least a couple). Wouldn’t it be just better to change Debian’s default to

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-07 Thread Roger Leigh
On Tue, Apr 07, 2009 at 10:36:20AM +0200, Giacomo A. Catenazzi wrote: I can't help but feel that your reply completely missed the purpose of what I want to do, and why. I hope the following response clears things up. Roger Leigh wrote: On Mon, Apr 06, 2009 at 11:09:17AM -0700, Steve Langasek

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-07 Thread Roger Leigh
On Tue, Apr 07, 2009 at 09:00:50PM +, Thorsten Glaser wrote: Adeodato Simó dixit: I would go as far as suggesting that some package like libc6 itself FWIW: -rw-r--r-- 1 tg tg 238336 Apr 7 22:59 en_US.UTF-8/LC_CTYPE It's not *that* much... Finally, this stuff that Roger

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-07 Thread Thorsten Glaser
Roger Leigh dixit: But, does it not reject non 7-bit input in the C locale for completeness? No, it doesn't - we (before my time though, I think) fought hard for eight-bit transparence and eight-bit cleanliness. Should tools doing raw I/O not be using lower level interfaces such as fread() and

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-07 Thread Roger Leigh
On Tue, Apr 07, 2009 at 10:01:16PM +, Thorsten Glaser wrote: Roger Leigh dixit: But, does it not reject non 7-bit input in the C locale for completeness? No, it doesn't - we (before my time though, I think) fought hard for eight-bit transparence and eight-bit cleanliness. Should

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-07 Thread Thorsten Glaser
Roger Leigh dixit: Are you sure? Not entirely, but I recall fgetc (or was it fgetwc?) being affected. //mirabilos -- “It is inappropriate to require that a time represented as seconds since the Epoch precisely represent the number of seconds between the referenced time and the Epoch.”

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-06 Thread Thorsten Glaser
Package: debian-policy Version: 3.8.1.0 Severity: wishlist For the mksh regression tests, I need a UTF-8 locale working; most systems either provide “en_US.UTF-8” or “en_US.utf8” with the former being recommended. Build-depending on locales-all has worked for me so far, except it won’t do in

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-06 Thread Giacomo A. Catenazzi
Thorsten Glaser wrote: For the mksh regression tests, I need a UTF-8 locale working; most systems either provide “en_US.UTF-8” or “en_US.utf8” with the former being recommended. Build-depending on locales-all has worked for me so far, except it won’t do in Kubuntu where said package does not

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-06 Thread Bill Allombert
On Mon, Apr 06, 2009 at 02:06:55PM +0200, Thorsten Glaser wrote: Package: debian-policy Version: 3.8.1.0 Severity: wishlist For the mksh regression tests, I need a UTF-8 locale working; most systems either provide “en_US.UTF-8” or “en_US.utf8” with the former being recommended. Hello

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-06 Thread Thorsten Glaser
Giacomo A. Catenazzi dixit: If you need a specific locale (as seems from mksh, not sure if it is a bug in that program), you need to set it. You can only set a locale on a glibc-based system if it’s installed beforehand, which root needs to do. Why does mksh need UTF-8? The regression tests

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-06 Thread Thorsten Glaser
Bill Allombert dixit: What about LC_COLLATE (which is a major problem with sort(1)) ? 1:1, just like the C locale does. What about packages that run before /usr is mounted ? They do not have /usr/*/locale/ anyway. This is a glibc problem. What about embedded systems with tight space

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-06 Thread Steve Langasek
On Mon, Apr 06, 2009 at 05:33:35PM +, Thorsten Glaser wrote: If you need a specific locale (as seems from mksh, not sure if it is a bug in that program), you need to set it. You can only set a locale on a glibc-based system if it’s installed beforehand, which root needs to do. You can

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-06 Thread Roger Leigh
On Mon, Apr 06, 2009 at 11:09:17AM -0700, Steve Langasek wrote: On Mon, Apr 06, 2009 at 05:33:35PM +, Thorsten Glaser wrote: If you need a specific locale (as seems from mksh, not sure if it is a bug in that program), you need to set it. You can only set a locale on a glibc-based

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-06 Thread Roger Leigh
On Mon, Apr 06, 2009 at 11:09:17AM -0700, Steve Langasek wrote: On Mon, Apr 06, 2009 at 05:33:35PM +, Thorsten Glaser wrote: If you need a specific locale (as seems from mksh, not sure if it is a bug in that program), you need to set it. You can only set a locale on a glibc-based

Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale

2009-04-06 Thread Roger Leigh
On Mon, Apr 06, 2009 at 04:18:59PM +0200, Bill Allombert wrote: On Mon, Apr 06, 2009 at 02:06:55PM +0200, Thorsten Glaser wrote: Package: debian-policy Version: 3.8.1.0 Severity: wishlist For the mksh regression tests, I need a UTF-8 locale working; most systems either provide