Egmont wrote:
> > UTF-8 is clearly defined by RFC 2279 which maintains the clear
> > 1-to-6-bytes encoding scheme of RFC 2044 with no confusion - and will
> > hopefully remain so.
> FYI: RFC 2279 is obsoleted by RFC 3629 which defines UTF-8 as a 1-to-4-bytes
> encoding scheme. Sad but true...

I see. This is not noted in RFC 2279, as it usually would be...
These well-known documents are called "RFC", so what is the actual process
for submitting a comment on them? I see no such mechanism, nor any mechanism
for being notified in time about a planned change to an RFC.

François Yergeau, could you please comment on the issue I have raised
(see the previous message referenced below)? Merci.

Marcin wrote:
> Why sad? There weren't going to be any characters defined above U+10FFFF
> anyway.

Because, as I tried to point out in my previous message
<http://mail.nl.linux.org/linux-utf8/2007-05/msg00002.html>
this may cause authors of terminal emulators (xterm, rxvt, ...) to change
the display behaviour of 5- and 6-byte sequences, which would cause entirely
unnecessary confusion and additional inconsistency in the already chaotic
width handling, the recognition of terminal properties, and the
interoperation with applications.

Thomas

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
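
The difference between the two definitions discussed above can be sketched
as follows. This is only a minimal illustration in Python: the function name
utf8_sequence_length and the allow_rfc2279 flag are made up for this example,
and rejecting the lead bytes 0xC0 and 0xC1 in both modes is a simplification
chosen here, since those bytes can only begin overlong forms.

```python
def utf8_sequence_length(lead, allow_rfc2279=False):
    """Return the sequence length announced by a lead byte, or None.

    With allow_rfc2279=False only the 1-to-4-byte forms of RFC 3629 are
    accepted. With allow_rfc2279=True the 5- and 6-byte forms of the
    older RFC 2279 scheme are recognised as well.
    """
    if lead < 0x80:
        return 1                      # 0xxxxxxx: ASCII
    if lead < 0xC2:
        return None                   # continuation byte, or 0xC0/0xC1
    if lead < 0xE0:
        return 2                      # 110xxxxx
    if lead < 0xF0:
        return 3                      # 1110xxxx
    if lead < 0xF5:
        return 4                      # 11110xxx, up to U+10FFFF
    if not allow_rfc2279:
        return None                   # 0xF5..0xFF never start a sequence
    if lead < 0xF8:
        return 4                      # 11110xxx, beyond U+10FFFF
    if lead < 0xFC:
        return 5                      # 111110xx
    if lead < 0xFE:
        return 6                      # 1111110x
    return None                       # 0xFE and 0xFF are unused in both


if __name__ == "__main__":
    for byte in (0x41, 0xC3, 0xE2, 0xF4, 0xF8, 0xFC):
        print(hex(byte),
              utf8_sequence_length(byte),
              utf8_sequence_length(byte, allow_rfc2279=True))
```

A terminal emulator that wants to treat a 5- or 6-byte sequence as one unit
needs the lengths from the older scheme; a decoder written strictly against
RFC 3629 has no length to offer for the lead bytes 0xF8 to 0xFD.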