fold UTF ready?

2007-10-13 Thread jidanni
fold (GNU coreutils) 5.97
* counts bytes, not columns, even without -b
* has no compassion for multibyte chars, turning UTF-8 into illegal sequences.
export LC_ALL=zh_TW.utf8
echo 不准作台灣人|fold --help
echo 不准作台灣人|fold -w 2
echo 不准作台灣人|fold -w 3
echo 不准 作台灣人|fold -w 3
echo 不准 作 台灣人|fold -w 3
echo 不准 作 台灣人|fold -w 3 -b
echo 不准 作 台灣人|fold -w 3 -s
echo 不准 作 台灣人|fold -w 4 -s
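
The two complaints above can be reproduced without fold. The following is a minimal Python sketch that models the reported behaviour; it is not fold's actual code. The helper names (`display_width`, `fold_bytes`, `is_valid_utf8`) are hypothetical, and the column width is approximated from Unicode East Asian Width data on the assumption that wide characters occupy two terminal columns.

```python
import unicodedata

text = "不准作台灣人"
encoded = text.encode("utf-8")


def display_width(s):
    """Approximate terminal columns: wide/fullwidth characters count as 2."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in s
    )


def fold_bytes(data, width):
    """Split every `width` bytes, modelling the byte-counting behaviour reported."""
    return [data[i:i + width] for i in range(0, len(data), width)]


def is_valid_utf8(chunk):
    """Return True if the chunk decodes cleanly as UTF-8."""
    try:
        chunk.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 6 characters, 18 bytes, roughly 12 columns: three different "widths".
print(len(text), len(encoded), display_width(text))  # 6 18 12

# Width 2 cuts every 3-byte character apart, so no chunk is valid UTF-8.
print([is_valid_utf8(c) for c in fold_bytes(encoded, 2)])

# Width 3 happens to line up with the 3-byte encoding of these characters.
print([c.decode("utf-8") for c in fold_bytes(encoded, 3)])
```

Under this model, `-w 3` only looks correct because every character in the sample is exactly three bytes long in UTF-8.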


___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils


Re: fold UTF ready?

2007-10-13 Thread Eric Blake
According to [EMAIL PROTECTED] on 10/13/2007 6:57 PM:
> fold (GNU coreutils) 5.97

Consider upgrading.  The latest stable coreutils is at 6.9.

> * counts bytes, not columns, even without -b
> * has no compassion for multibyte chars, turning UTF-8 into illegal sequences.

You've brought this up before, and the answer is the same as before.
Coreutils does not yet support multibyte locales, because no one has yet
contributed a patch that is usable across all the coreutils that handle
text, which is easy to maintain, and which does not penalize performance
on single-byte locales.

--
Don't work too hard, make some time for fun as well!

Eric Blake [EMAIL PROTECTED]


Re: fold UTF ready?

2007-10-13 Thread jidanni
>>>>> "EB" == Eric Blake [EMAIL PROTECTED] writes:

EB> According to [EMAIL PROTECTED] on 10/13/2007 6:57 PM:
>> fold (GNU coreutils) 5.97

EB> Consider upgrading.  The latest stable coreutils is at 6.9.

I'll tell Debian to upgrade.

>> * counts bytes, not columns, even without -b
>> * has no compassion for multibyte chars, turning UTF-8 into illegal sequences.

EB> You've brought this up before, and the answer is the same as before.
EB> Coreutils does not yet support multibyte locales, because no one has yet
EB> contributed a patch that is usable across all the coreutils that handle
EB> text, which is easy to maintain, and which does not penalize performance
EB> on single-byte locales.

OK, then the --help output should note that "columns" just refers to
things like tabs; otherwise it sounds as if multibyte is supported.
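
For illustration, here is a minimal Python sketch of what "columns" means in this sense, based on the behaviour the fold documentation describes: a tab advances to the next tab stop, a backspace moves back one column, a carriage return resets to column zero, and every other byte counts as one column. The names `advance_column` and `columns` are hypothetical, and the tab width of 8 is an assumption of this sketch.

```python
TAB_WIDTH = 8  # assumed tab-stop spacing for this sketch


def advance_column(column, byte):
    """Return the column reached after emitting one byte."""
    if byte == 0x08:  # backspace: step back, but never below column 0
        return max(column - 1, 0)
    if byte == 0x0D:  # carriage return: back to the start of the line
        return 0
    if byte == 0x09:  # tab: jump to the next multiple of TAB_WIDTH
        return column + TAB_WIDTH - column % TAB_WIDTH
    return column + 1  # anything else, including each byte of a multibyte char


def columns(data):
    """Column position after processing all bytes of `data`."""
    col = 0
    for b in data:
        col = advance_column(col, b)
    return col


print(columns(b"a\tb"))             # 9: 'a', tab to column 8, then 'b'
print(columns("不".encode("utf-8")))  # 3: one character, counted as three bytes
```

Under this model, a single CJK character counts as three columns because each of its UTF-8 bytes is counted separately.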

