Re: wcwidth update

2007-07-08 Thread Bruno Haible
Hello Markus, > > Could you update your wcwidth implementation at > > > > http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c > > > > to latest unicode data? > > Done. This code assigns width 2 to U+4DC0..U+4DFF. But they are marked as 'N' in Unicode 5.0.0's ucd/EastAsianWidth.txt, therefore they sho
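As a rough illustration of the point about U+4DC0..U+4DFF, the sketch below parses lines in the EastAsianWidth.txt format and maps each class to a column count. The sample data, the helper names, and the rule "W and F take two columns, everything else one" are assumptions made for this example; it is not the logic of wcwidth.c.

```python
import bisect

# Sample lines in EastAsianWidth.txt format. The first range and its 'N'
# class are the ones cited in the message above; the second line is an
# illustrative wide range added for contrast (an assumption).
SAMPLE = """\
4DC0..4DFF;N
4E00..9FBB;W
"""

def parse_ranges(text):
    """Return a sorted list of (first, last, width_class) tuples."""
    ranges = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop trailing comments
        if not line:
            continue
        span, cls = (part.strip() for part in line.split(";"))
        first, _, last = span.partition("..")
        last = last or first                    # single code point entries
        ranges.append((int(first, 16), int(last, 16), cls))
    return sorted(ranges)

def east_asian_class(ranges, cp, default="N"):
    """Binary-search the range table for code point cp."""
    # The key sorts after every real entry whose first code point is <= cp.
    i = bisect.bisect_right(ranges, (cp, 0x110000, "")) - 1
    if i >= 0 and ranges[i][0] <= cp <= ranges[i][1]:
        return ranges[i][2]
    return default

def columns(ranges, cp):
    """Hypothetical rule: Wide and Fullwidth take 2 columns, the rest 1."""
    return 2 if east_asian_class(ranges, cp) in ("W", "F") else 1
```

With this table, `columns(parse_ranges(SAMPLE), 0x4DC0)` yields one column, because the range is classed 'N' in the sample data.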

Re: wcwidth update

2007-05-30 Thread Thomas Wolff
Egmont wrote: > > UTF-8 is clearly defined by RFC 2279 which maintains the clear > > 1-to-6-bytes encoding scheme of RFC 2044 with no confusion - and will > > hopefully remain so. > FYI: RFC 2279 is obsoleted by RFC 3629 which defines UTF-8 as a 1-to-4-bytes > encoding scheme. Sad but true... I

Re: wcwidth update

2007-05-29 Thread Marcin 'Qrczak' Kowalczyk
On Tue, 29-05-2007 at 19:24 +0200, Egmont Koblinger wrote: > FYI: RFC 2279 is obsoleted by RFC 3629 which defines UTF-8 as a 1-to-4-bytes > encoding scheme. Sad but true... Why sad? There weren't going to be any characters defined above U+10FFFF anyway. -- __("< Marcin K

Re: wcwidth update

2007-05-29 Thread Egmont Koblinger
Hi, > UTF-8 is clearly defined by RFC 2279 which maintains the clear > 1-to-6-bytes encoding scheme of RFC 2044 with no confusion - and will > hopefully remain so. FYI: RFC 2279 is obsoleted by RFC 3629 which defines UTF-8 as a 1-to-4-bytes encoding scheme. Sad but true... -- Egmont -- Linu
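The difference between the two definitions can be sketched by looking only at the lead byte of a sequence. The function names below are made up for this example, and the sketch checks lead bytes only; it does not validate continuation bytes, overlong forms, or surrogates.

```python
def seq_len_rfc2279(lead: int) -> int:
    """Sequence length implied by a lead byte in the older 1-to-6-byte scheme.

    Returns 0 for bytes that cannot start a sequence (continuation bytes
    0x80..0xBF and the unused values 0xFE, 0xFF).
    """
    if lead < 0x80:
        return 1
    if 0xC0 <= lead <= 0xDF:
        return 2          # 0xC0/0xC1 only start overlong forms
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF7:
        return 4
    if 0xF8 <= lead <= 0xFB:
        return 5
    if 0xFC <= lead <= 0xFD:
        return 6
    return 0

def seq_len_rfc3629(lead: int) -> int:
    """Sequence length in the 1-to-4-byte scheme, capped at U+10FFFF.

    RFC 3629 states that 0xC0, 0xC1 and 0xF5..0xFF never appear.
    """
    if lead < 0x80:
        return 1
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 0
```

A lead byte such as 0xF8 therefore starts a 5-byte sequence under the first function and is simply invalid under the second.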

Re: wcwidth update

2007-05-29 Thread Thomas Wolff
Egmont wrote: > On Fri, May 25, 2007 at 06:12:13PM +0200, Thomas Wolff wrote: > > I have not heard anything like this before (about changing behaviour > > of emitted replacement characters) > So far there lived two concurrent definitions of UTF-8, one defined it to be > at most 4 bytes long, wh

Re: wcwidth update -REMOVE PLEASE-

2007-05-26 Thread Markus Kuhn
"[EMAIL PROTECTED]" wrote on 2007-05-25 21:33 UTC: > REMOVE PLEASE As the mail header said: List-unsubscribe: Markus -- Markus Kuhn, Computer Laboratory, University of Cambridge http://www.cl.cam.ac.uk/~mgk25/ || CB3 0FD, Great Britain -- Linux-UTF8: i18n of Li

Re: wcwidth update -REMOVE PLEASE-

2007-05-25 Thread [EMAIL PROTECTED]
REMOVE PLEASE - Original Message - From: "Egmont Koblinger" <[EMAIL PROTECTED]> To: Sent: Friday, May 25, 2007 12:09 PM Subject: Re: wcwidth update Hi, > http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c > to latest unicode data? Done. Now that recently ev

Re: wcwidth update

2007-05-25 Thread SrinTuar
If I correctly understand the thread that's just been discussed on this list, starting at: http://mail.nl.linux.org/linux-utf8/2007-04/msg00050.html then from now on everyone defines UTF-8 to be at most 4 bytes long. And in this case I think the proper behavior would be to emit 5 or 6 bytes. Thi

Re: wcwidth update

2007-05-25 Thread Egmont Koblinger
On Fri, May 25, 2007 at 06:31:01PM +0200, I wrote: > And in this case I think the proper behavior would be to emit 5 or 6 bytes. Sorry, typo: to emit 5 or 6 replacement characters. bye, Egmont -- Linux-UTF8: i18n of Linux on all levels Archive: http://mail.nl.linux.org/linux-utf8/
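The corrected wording (one replacement character per byte of an obsolete 5- or 6-byte sequence) can be tried out with a decoder that treats 0xF8..0xFD as invalid lead bytes. The sketch below uses CPython 3's built-in UTF-8 codec as one such decoder; the helper name and the sample byte strings are chosen for this example, and nothing here claims that the stress test prescribes this exact behavior.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER

def count_replacements(data: bytes) -> int:
    """Decode leniently and count how many U+FFFD characters come out."""
    return data.decode("utf-8", errors="replace").count(REPLACEMENT)

# Byte patterns that the older 1-to-6-byte scheme would have read as one
# 5-byte and one 6-byte sequence (sample values, chosen for illustration).
five_byte = bytes([0xF8, 0x88, 0x80, 0x80, 0x80])
six_byte = bytes([0xFC, 0x84, 0x80, 0x80, 0x80, 0x80])

print(count_replacements(five_byte))
print(count_replacements(six_byte))
```

Because neither the lead byte nor the stray continuation bytes can start a valid sequence, each byte is replaced on its own.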

Re: wcwidth update

2007-05-25 Thread Egmont Koblinger
On Fri, May 25, 2007 at 06:12:13PM +0200, Thomas Wolff wrote: > I have not heard anything like this before (about changing behaviour > of emitted replacement characters) So far there lived two concurrent definitions of UTF-8, one defined it to be at most 4 bytes long, while the other one defined

Re: wcwidth update

2007-05-25 Thread Thomas Wolff
Hi, Egmont wrote: > Now that recently every standard seemed to agree that UTF-8 uses at most 4 > (and not 6) bytes and the highest valid Unicode value is U+1FFFFF, I wonder U+10FFFF, actually. > whether the stress test should be updated, too. As far as I understand, the > preferred new behavior f

Re: wcwidth update

2007-05-25 Thread Egmont Koblinger
Hi, > > http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c > > to latest unicode data? > Done. Now that recently every standard seemed to agree that UTF-8 uses at most 4 (and not 6) bytes and the highest valid Unicode value is U+1FFFFF, I wonder whether the stress test should be updated, too. As far as

Re: wcwidth update

2007-05-25 Thread Markus Kuhn
Emanuele Giaquinta wrote on 2007-05-25 11:05 UTC: > Hi, > > Could you update your wcwidth implementation at > > http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c > > to latest unicode data? Done. Markus -- Markus Kuhn, Computer Laboratory, University of Cambridge http://www.cl.cam.ac.uk/~mgk25/ |