Bug#525301: /usr/bin/isutf8: accepts UTF-8-encoded UTF-16 surrogates

2009-05-05 Thread Jakub Wilk
* Lars Wirzenius l...@liw.fi, 2009-05-03, 19:36:
$ man utf-8 | grep -A 2 UTF-16 | sed -e 's/^ *//'
The UCS code values 0xd800–0xdfff (UTF-16 surrogates) as well as 0xfffe and 0xffff (UCS non-characters) should not appear in conforming UTF-8 streams.
$ s='\xed\xa0\x88\xed\xbd\x85' # 0xd808 +

Bug#525301: /usr/bin/isutf8: accepts UTF-8-encoded UTF-16 surrogates

2009-05-03 Thread Lars Wirzenius
Thu, 2009-04-23 at 16:52 +0200, Jakub Wilk wrote:
Package: moreutils
Version: 0.34
Severity: normal
File: /usr/bin/isutf8
$ man utf-8 | grep -A 2 UTF-16 | sed -e 's/^ *//'
The UCS code values 0xd800–0xdfff (UTF-16 surrogates) as well as 0xfffe and 0xffff (UCS non-characters) should

Bug#525301: /usr/bin/isutf8: accepts UTF-8-encoded UTF-16 surrogates

2009-04-23 Thread Jakub Wilk
Package: moreutils
Version: 0.34
Severity: normal
File: /usr/bin/isutf8
$ man utf-8 | grep -A 2 UTF-16 | sed -e 's/^ *//'
The UCS code values 0xd800–0xdfff (UTF-16 surrogates) as well as 0xfffe and 0xffff (UCS non-characters) should not appear in conforming UTF-8 streams.
$
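
For illustration only, the check this report asks for can be sketched in Python. This is not the isutf8 code or its eventual fix; the function name find_encoded_surrogate is hypothetical. It relies on the fact that a code point in 0xd800–0xdfff, if pushed through the UTF-8 encoding, always comes out as the byte 0xed followed by a byte in 0xa0–0xbf. The sketch only looks for that pattern and does not validate the rest of the stream.

```python
def find_encoded_surrogate(data: bytes) -> int:
    """Return the offset of the first UTF-8-encoded UTF-16 surrogate, or -1.

    Hypothetical helper for illustration. 0xED is never a continuation
    byte, so 0xED followed by 0xA0..0xBF can only be an attempt to encode
    a code point in 0xd800..0xdfff.
    """
    for i in range(len(data) - 1):
        if data[i] == 0xED and 0xA0 <= data[i + 1] <= 0xBF:
            return i
    return -1


# The byte string quoted in the 2009-05-05 message of this thread.
sample = b"\xed\xa0\x88\xed\xbd\x85"
print(find_encoded_surrogate(sample))

# Python 3's strict UTF-8 decoder also refuses these bytes.
try:
    sample.decode("utf-8")
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```

The first three bytes of the sample decode arithmetically to 0xd808, matching the comment in the quoted shell line, so the helper reports offset 0.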