Your message dated Sun, 15 Apr 2007 02:48:31 +0200
with message-id <[EMAIL PROTECTED]>
and subject line Bug#316147: iconv: options for illegal characters
has caused the attached Bug report to be marked as done.
This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.
(NB: If you are a system administrator and have no idea what I am
talking about this indicates a serious mail system misconfiguration
somewhere. Please contact me immediately.)
Debian bug tracking system administrator
(administrator, Debian Bugs database)
--- Begin Message ---
Package: libc6
Version: 2.3.2.ds1-22
Severity: wishlist
File: /usr/bin/iconv
Tags: upstream
-c is nice, but it would be nice to know just how many invalid
characters were omitted from the output.
--verbose won't say, but should.
$ iconv -f gb2312 -t big5 gdxw08.htm | wc -c
iconv: illegal input sequence at position 906
906
$ iconv -f gb2312 -t big5 -c gdxw08.htm | wc -c - gdxw08.htm
4585 -
4585 gdxw08.htm
9170 total
The man page says "Omit invalid characters from output". Maybe it
should say more, like "just send the character it can't deal with
through to the output unconverted".
Or better yet, give the user the choice of deleting them, sending them
through, or redirecting them, etc.
Best of all would be an option to "mark unconvertible characters
with @--> <--@ [or customizable]"
--- End Message ---
--- Begin Message ---
On Wed, Jun 29, 2005 at 01:53:33AM +0800, Dan Jacobson wrote:
> Package: libc6
> Version: 2.3.2.ds1-22
> Severity: wishlist
> File: /usr/bin/iconv
> Tags: upstream
>
> -c is nice, but it would be nice to know just how many invalid
> characters were omitted from the output.
> --verbose won't say, but should.
>
> $ iconv -f gb2312 -t big5 gdxw08.htm | wc -c
> iconv: illegal input sequence at position 906
> 906
> $ iconv -f gb2312 -t big5 -c gdxw08.htm | wc -c - gdxw08.htm
> 4585 -
> 4585 gdxw08.htm
> 9170 total
iconv is meant to be strict. If you want it to omit errors, then append
//IGNORE to the target encoding name, or //TRANSLIT to try approximate
transliterations.
If you want more subtle ways, recode is the tool you want.
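For illustration, here is a minimal sketch of the two suffixes. It assumes a
glibc-style iconv and uses a made-up UTF-8 sample ("café") converted to ASCII
rather than the gb2312 file from the report. The variable names are only for
the example.

```shell
# Hypothetical sample: "café" in UTF-8; the é has no ASCII equivalent.
sample=$(printf 'caf\303\251')

# Default (strict) behaviour: iconv stops at the unconvertible character
# and exits non-zero.
strict=$(printf '%s' "$sample" | iconv -f UTF-8 -t ASCII 2>/dev/null) \
    || echo "strict conversion failed"

# //IGNORE on the target encoding drops what cannot be converted.
# Some iconv versions still exit non-zero here, so the status is not checked.
ignored=$(printf '%s' "$sample" | iconv -f UTF-8 -t ASCII//IGNORE 2>/dev/null || true)

# //TRANSLIT substitutes an approximation instead; the exact replacement
# depends on the iconv implementation and the current locale.
translit=$(printf '%s' "$sample" | iconv -f UTF-8 -t ASCII//TRANSLIT 2>/dev/null || true)

echo "ignored:  $ignored"
echo "translit: $translit"

# For the file in the report, the equivalent would presumably be:
#   iconv -f gb2312 -t big5//IGNORE gdxw08.htm
```

Neither suffix reports how many characters were affected, so this does not
cover the counting wish in the original report.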
--
·O· Pierre Habouzit
··O [EMAIL PROTECTED]
OOO http://www.madism.org
--- End Message ---