Re: Alignment bug in ls with UTF-8 filenames under Mac OS X

2007-02-04 Thread Vincent Lefevre
On 2007-01-19 03:43:02 +0100, Vincent Lefevre wrote: On 2007-01-18 17:39:40 +0100, Bruno Haible wrote: Vincent, do you have time to report that to the Apple people? No need to mention 'ls' - a simple printf 'E\xcc\x81\t2nd column\nFoo\t2nd column\n' should be all you need to

Re: Alignment bug in ls with UTF-8 filenames under Mac OS X

2007-01-19 Thread Jim Meyering
Vincent Lefevre wrote: On 2007-01-19 01:23:44 +0100, Bruno Haible wrote: Apple Terminal version 1.4.6, part of MacOS X 10.3.9, is affected. I forgot to say. This is still not fixed in Terminal 1.5 (133) from Mac OS X 10.4.8. Thanks. I've checked this in: *

Re: Alignment bug in ls with UTF-8 filenames under Mac OS X

2007-01-18 Thread sci-fi
On 2007-01-15 21:05:53 -0600, Vincent Lefevre said: Hi, Under Mac OS X 10.4.8 with ls (GNU coreutils) 5.97 (installed via MacPorts), in an 80-column terminal (uxterm), I get: $ ls É y123456789012345678901234567890 x123456789012345678901234567890

Re: Alignment bug in ls with UTF-8 filenames under Mac OS X

2007-01-18 Thread Bruno Haible
Vincent Lefevre wrote: Hmm... I forgot that ls was an alias (the same one on all my accounts). So, back on Mac OS X: prunille:~/blah \ls -C --color=always | hexdump -C 1b 5b 30 30 6d 1b 5b 30 6d 45 cc 81 1b 5b 30 30 |.[00m.[0mE...[00| 0010 6d 20 20 20 20 20 20 20 20 20 20

Re: Alignment bug in ls with UTF-8 filenames under Mac OS X

2007-01-18 Thread Jim Meyering
Bruno Haible wrote: Vincent Lefevre wrote: ... I see that the first call to wcwidth() gives: wcwidth(0x0301) = 1. U+0301 is COMBINING ACUTE ACCENT. So here is the problem: MacOS' wcwidth is buggy for combining characters like accents. OK. Can't autoconf detect that and
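A rough illustration of the width computation at issue, written as a Python sketch rather than the C code in ls: a combining mark such as U+0301 should add zero columns, whereas a wcwidth that returns 1 for it makes the two-code-point name one column too wide. The helper names display_width and naive_width are made up for this example, and the rule is deliberately simplified (it ignores double-width East Asian characters and control characters).

```python
import unicodedata


def display_width(text):
    """Columns occupied by text, counting combining marks as zero-width."""
    return sum(0 if unicodedata.combining(ch) else 1 for ch in text)


def naive_width(text):
    """Width obtained when every code point is counted as one column."""
    return len(text)


name = "E\u0301"  # 'E' followed by U+0301 COMBINING ACUTE ACCENT
print(display_width(name), naive_width(name))
```

If padding is computed from the larger value but the terminal renders the name in a single column, the next column starts one cell too early, which matches the kind of misalignment reported in this thread.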

Re: Alignment bug in ls with UTF-8 filenames under Mac OS X

2007-01-18 Thread Bruno Haible
Jim Meyering wrote: As I understand the goal, you'd like to make ls act differently (outputting spaces, not TABs, for column alignment) on all systems for each line containing a non-ASCII byte. Yes, this is what the proposed patch does. That change would contradict the documentation of -T

Re: Alignment bug in ls with UTF-8 filenames under Mac OS X

2007-01-18 Thread Jim Meyering
Bruno Haible wrote: Jim Meyering wrote: As I understand the goal, you'd like to make ls act differently (outputting spaces, not TABs, for column alignment) on all systems for each line containing a non-ASCII byte. Yes, this is what the proposed patch does. That change

Re: Alignment bug in ls with UTF-8 filenames under Mac OS X

2007-01-18 Thread Bruno Haible
Jim Meyering wrote: Um... it *is* possible to use TABs after non-ASCII bytes and get correct alignment. The only requirement is that you be using a reasonable (non-buggy) terminal emulator. Yes, sure. I was only pointing out that the proposed change wouldn't need a doc change, because the

Re: Alignment bug in ls with UTF-8 filenames under Mac OS X

2007-01-18 Thread Jim Meyering
Bruno Haible wrote: in the meantime, advise people to use -T0 (or set TABSIZE=0 in their environment) if they care about alignment when using a buggy version of that particular terminal emulator. Do you really think it would be better to make everyone pay (even a tiny
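To make the -T0 / TABSIZE=0 advice concrete, here is a small Python sketch of column padding that emits tabs when a tab size is set and only spaces when it is zero. It is an illustration under assumptions, not the code in ls: the function name padding and its signature are invented for this example.

```python
def padding(from_col, to_col, tabsize):
    """Return the characters that move the cursor from from_col to to_col.

    A tab is used only when it cannot overshoot to_col; with tabsize == 0
    the result consists of spaces only.
    """
    out = []
    while from_col < to_col:
        if tabsize != 0 and to_col // tabsize > (from_col + 1) // tabsize:
            out.append("\t")
            from_col += tabsize - from_col % tabsize  # jump to next tab stop
        else:
            out.append(" ")
            from_col += 1
    return "".join(out)


print(repr(padding(1, 32, 8)))  # tabs where possible
print(repr(padding(1, 32, 0)))  # spaces only, as with -T0
```

With spaces only, the output no longer depends on how the terminal positions tab stops, at the cost of a few more bytes per line.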

Re: Alignment bug in ls with UTF-8 filenames under Mac OS X

2007-01-18 Thread Bruno Haible
Paul Eggert wrote: Long ago I regularly used terminal emulators that mishandled tabs. Eventually they got fixed (or I stopped using them). Long ago I used terminals where the tab stops were customizable, and the previous user had set them to weird values. At that time, I stopped using tabs.

Re: Alignment bug in ls with UTF-8 filenames under Mac OS X

2007-01-18 Thread Vincent Lefevre
On 2007-01-18 17:39:40 +0100, Bruno Haible wrote: The --color option also has the effect of turning tabs into spaces; yet this is undocumented. Actually the doc states `ls' uses tabs where possible in the output, for efficiency. If COLS is zero, do not use tabs at all. and the

Re: Alignment bug in ls with UTF-8 filenames under Mac OS X

2007-01-18 Thread Vincent Lefevre
On 2007-01-19 01:23:44 +0100, Bruno Haible wrote: Apple Terminal version 1.4.6, part of MacOS X 10.3.9, is affected. I forgot to say. This is still not fixed in Terminal 1.5 (133) from Mac OS X 10.4.8.

Re: Alignment bug in ls with UTF-8 filenames under Mac OS X

2007-01-17 Thread Bruno Haible
Eric Blake wrote: coreutils does not handle multi-byte locales well. True. The problem is that no one has yet written a patch that makes it easy to handle multibyte locales without penalizing single-byte locales. There are patches for multibyte locale support for many of the text utilities,

Re: Alignment bug in ls with UTF-8 filenames under Mac OS X

2007-01-17 Thread Bruno Haible
Vincent Lefevre wrote: Therefore: can you also show wrong behaviour when you set LC_ALL=en_US.UTF-8 ? Yes: prunille:~/blah export LC_ALL=en_US.UTF-8 prunille:~/blah locale LANG=POSIX LC_COLLATE=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_MONETARY=en_US.UTF-8

Re: Alignment bug in ls with UTF-8 filenames under Mac OS X

2007-01-17 Thread Vincent Lefevre
On 2007-01-18 03:14:37 +0100, Bruno Haible wrote: Conclusion: What you see is not an ls bug, but an Apple Terminal bug with tabs. I don't use the Apple Terminal (and never use it). As I said in my bug report, I'm using uxterm here. More precisely: prunille:~ uxterm -version XFree86

Re: Alignment bug in ls with UTF-8 filenames under Mac OS X

2007-01-16 Thread Vincent Lefevre
On 2007-01-15 22:29:41 -0800, Paul Eggert wrote: Most likely this has something to do with how mbrtowc and/or wcwidth behaves on MacOS X. Perhaps you can debug the quote_name function of 'ls' on the affected file name, and see why it's computing the width that it's computing? First, do you

Re: Alignment bug in ls with UTF-8 filenames under Mac OS X

2007-01-16 Thread Andreas Schwab
Vincent Lefevre writes: First, do you know any freely available test suite for functions such as mbrtowc and wcwidth? It would be easier to know where the problem is. There are some tests in glibc. For most of them it should be possible to run them standalone, too.

Alignment bug in ls with UTF-8 filenames under Mac OS X

2007-01-15 Thread Vincent Lefevre
Hi, Under Mac OS X 10.4.8 with ls (GNU coreutils) 5.97 (installed via MacPorts), in an 80-column terminal (uxterm), I get: $ ls É y123456789012345678901234567890 x123456789012345678901234567890 z123456789012345678901234567890 instead of: $ ls É

Re: Alignment bug in ls with UTF-8 filenames under Mac OS X

2007-01-15 Thread Eric Blake
According to Vincent Lefevre on 1/15/2007 8:05 PM: Hi, Under Mac OS X 10.4.8 with ls (GNU coreutils) 5.97 (installed via MacPorts), in an 80-column terminal (uxterm), I get: $ ls É y123456789012345678901234567890

Re: Alignment bug in ls with UTF-8 filenames under Mac OS X

2007-01-15 Thread Vincent Lefevre
On 2007-01-15 20:13:02 -0700, Eric Blake wrote: According to Vincent Lefevre on 1/15/2007 8:05 PM: Under Mac OS X 10.4.8 with ls (GNU coreutils) 5.97 (installed via MacPorts), in an 80-column terminal (uxterm), I get: $ ls É y123456789012345678901234567890

Re: Alignment bug in ls with UTF-8 filenames under Mac OS X

2007-01-15 Thread Paul Eggert
Vincent Lefevre writes: In fact the problem seems to be due to the combining character under Mac OS X. The filename É is encoded as 45 cc 81. Most likely this has something to do with how mbrtowc and/or wcwidth behaves on MacOS X. Perhaps you can debug the quote_name
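For readers who want to see what the bytes 45 cc 81 stand for, the following Python sketch decodes them and compares the result with the precomposed form of the same letter. It only illustrates the encoding; the variable names are arbitrary and nothing here is taken from ls itself.

```python
import unicodedata

raw = bytes([0x45, 0xCC, 0x81])  # the byte sequence quoted in the thread
decomposed = raw.decode("utf-8")  # 'E' followed by U+0301
composed = unicodedata.normalize("NFC", decomposed)  # single code point U+00C9

print([hex(ord(ch)) for ch in decomposed])  # ['0x45', '0x301']
print(" ".join(f"{b:02x}" for b in composed.encode("utf-8")))  # c3 89
```

Both forms display as the same letter, but the decomposed one contains two code points, so any width routine that counts the combining accent as a column of its own will overestimate the name's width.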