Package: libc6
Version: 2.3.2-8
Severity: normal

UTF-8 encoding is specified in RFC 2279 as follows:

UCS-4 range (hex.)    UTF-8 octet sequence (binary)
0000 0000-0000 007F   0xxxxxxx
0000 0080-0000 07FF   110xxxxx 10xxxxxx
0000 0800-0000 FFFF   1110xxxx 10xxxxxx 10xxxxxx
0001 0000-001F FFFF   11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
0020 0000-03FF FFFF   111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
0400 0000-7FFF FFFF   1111110x 10xxxxxx ... 10xxxxxx

This means that ASCII characters (hex 20 - 7F range) have multiple
representations. In fact, this is a well-known issue in security analysis.
E.g. the '.' character has the following representations:

2E
C0 AE
E0 80 AE
F0 80 80 AE
F8 80 80 80 AE
FC 80 80 80 80 AE

However, iconv can handle only the first of these representations:

[EMAIL PROTECTED]:~> printf '\x2E\n' | iconv -f utf-8 -t us-ascii
.
[EMAIL PROTECTED]:~> printf '\xC0\xAE\n' | iconv -f utf-8 -t us-ascii
iconv: illegal input sequence at position 0
[EMAIL PROTECTED]:~> printf '\xE0\x80\xAE\n' | iconv -f utf-8 -t us-ascii
iconv: illegal input sequence at position 0
[EMAIL PROTECTED]:~> printf '\xF0\x80\x80\xAE\n' | iconv -f utf-8 -t us-ascii
iconv: illegal input sequence at position 0
[EMAIL PROTECTED]:~> printf '\xF8\x80\x80\x80\xAE\n' | iconv -f utf-8 -t us-ascii
iconv: illegal input sequence at position 0
[EMAIL PROTECTED]:~> printf '\xFC\x80\x80\x80\x80\xAE\n' | iconv -f utf-8 -t us-ascii
iconv: illegal input sequence at position 0

-- System Information:
Debian Release: 3.0
Architecture: i386
Kernel: Linux sercond 2.4.21 #1 Срд Июл 30 22:24:06 MSD 2003 i686
Locale: LANG=ru_RU.KOI8-R, LC_CTYPE=ru_RU.KOI8-R

Versions of packages libc6 depends on:
ii  libdb1-compat  2.1.3-7  The Berkeley database routines [gl

-- no debconf information
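
For anyone who wants to reproduce the byte sequences for '.' listed in this report, they can be derived mechanically from the RFC 2279 table. The following is only a rough sketch in Python: the names encode_with_length and strict_decodes are made up for this illustration, and Python's built-in UTF-8 decoder is used as a stand-in for a strict decoder. It is not iconv and says nothing about how glibc is implemented.

```python
def encode_with_length(code_point: int, length: int) -> bytes:
    """Encode code_point in exactly `length` octets following the RFC 2279
    bit layout, even if a shorter form exists (hypothetical helper)."""
    if length == 1:
        if code_point > 0x7F:
            raise ValueError("code point does not fit in one octet")
        return bytes([code_point])
    if not 2 <= length <= 6:
        raise ValueError("RFC 2279 defines sequences of 1 to 6 octets")

    # Free bits: (7 - length) in the lead octet plus 6 per trailing octet.
    payload_bits = (7 - length) + 6 * (length - 1)
    if code_point >= 1 << payload_bits:
        raise ValueError("code point does not fit in this many octets")

    # Lead octet starts with `length` one-bits followed by a zero bit.
    lead_prefix = (0xFF << (8 - length)) & 0xFF

    # Fill the trailing octets (10xxxxxx) from the low-order bits upward.
    tail = []
    for _ in range(length - 1):
        tail.append(0x80 | (code_point & 0x3F))
        code_point >>= 6

    return bytes([lead_prefix | code_point] + tail[::-1])


def strict_decodes(data: bytes) -> bool:
    """Return True if Python's own UTF-8 decoder accepts `data`."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    for n in range(1, 7):
        seq = encode_with_length(0x2E, n)
        hex_form = " ".join(f"{b:02X}" for b in seq)
        print(f"{hex_form:<20} accepted: {strict_decodes(seq)}")
```

Run as a script, this prints the six sequences from the list above (2E through FC 80 80 80 80 AE). Python's decoder accepts only the one-octet form and rejects the other five, which matches what the iconv transcript shows.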

