Re: [PHP-DEV] UTF-8 encoding

2002-08-25 Thread Stefan Esser
On Sun, Aug 25, 2002 at 09:21:01PM +0200, Stig Venaas wrote: > Great, I've been wondering why UTF-8 wasn't defined like that > in the first place. Could you please give me a pointer to the > addition? It is defined in RFC 2279. Regards, Stefan -- PHP Development Mailing List

Re: [PHP-DEV] UTF-8 encoding

2002-08-25 Thread Stig Venaas
On Sun, Aug 25, 2002 at 09:26:29PM +0200, [EMAIL PROTECTED] wrote: > On Sun, 25 Aug 2002, Stig Venaas wrote: > > > BTW It seems that for some reason I can't post to php-dev anymore, > > at least some of you get this... > > I was not in the list, but got your message through php-dev. Thanks for

Re: [PHP-DEV] UTF-8 encoding

2002-08-25 Thread derick
On Sun, 25 Aug 2002, Stig Venaas wrote: > BTW It seems that for some reason I can't post to php-dev anymore, > at least some of you get this... I was not in the list, but got your message through php-dev. Derick --- Did I

Re: [PHP-DEV] UTF-8 encoding

2002-08-25 Thread Stig Venaas
On Sun, Aug 25, 2002 at 06:28:44PM +0100, Wez Furlong wrote: > Hi Stefan, > > I borrowed that code from the mbstring extension. Either I misinterpreted > the code, or mbstring also has its UTF-8 decoder incorrect. > > --Wez. > > On 08/25/02, "Stefan Esser" <[EMAIL PROTECTED]> wrote: > > Hello

Re: [PHP-DEV] UTF-8 encoding

2002-08-25 Thread Wez Furlong
Hi Stefan, I borrowed that code from the mbstring extension. Either I misinterpreted the code, or mbstring also has its UTF-8 decoder incorrect. --Wez. On 08/25/02, "Stefan Esser" <[EMAIL PROTECTED]> wrote: > Hello, > > html.c / get_next_char() has a UTF-8 decoder. The implementation > is a

[PHP-DEV] UTF-8 encoding

2002-08-25 Thread Stefan Esser
Forget my last mail... I just found the addition myself. Stefan -- PHP Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php

[PHP-DEV] UTF-8 encoding

2002-08-25 Thread Stefan Esser
Hello, html.c / get_next_char() has a UTF-8 decoder. The implementation is a little bit fishy. AFAIK UTF-8 sequences are 1 up to 4 bytes long, but this one supports 5- and 6-byte UTF-8 sequences. I wonder where this addition to the standard is defined. The problem is the following: the German ue is 0xFC w
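
For readers following the thread, here is a minimal sketch of the sequence-length rule being discussed. It is not the code from html.c or mbstring; the function names utf8_seq_len_rfc2279 and utf8_decode_rfc2279 are hypothetical and exist only for this illustration. It assumes the RFC 2279 layout, in which the lead byte selects a length of 1 to 6 bytes, so a raw byte 0xFC (the Latin-1 code for the German ue) looks like the start of a 6-byte sequence. The later RFC 3629 restricts UTF-8 to at most 4 bytes. The sketch does not reject overlong encodings, which a production decoder should do.

```c
#include <stddef.h>

/* Hypothetical helper: sequence length implied by a lead byte under
 * RFC 2279 (1..6), or 0 if the byte cannot start a sequence. */
static int utf8_seq_len_rfc2279(unsigned char lead)
{
    if (lead < 0x80) return 1;  /* 0xxxxxxx: plain ASCII           */
    if (lead < 0xC0) return 0;  /* 10xxxxxx: continuation byte     */
    if (lead < 0xE0) return 2;  /* 110xxxxx                        */
    if (lead < 0xF0) return 3;  /* 1110xxxx                        */
    if (lead < 0xF8) return 4;  /* 11110xxx                        */
    if (lead < 0xFC) return 5;  /* 111110xx: RFC 2279 only         */
    if (lead < 0xFE) return 6;  /* 1111110x: RFC 2279 only         */
    return 0;                   /* 0xFE and 0xFF are never valid   */
}

/* Hypothetical helper: decode one sequence from s[0..n-1] into *out.
 * Returns the number of bytes consumed, or 0 if the input is
 * truncated or malformed. Overlong forms are NOT rejected here. */
static size_t utf8_decode_rfc2279(const unsigned char *s, size_t n,
                                  unsigned long *out)
{
    /* Payload bits carried by the lead byte, indexed by length. */
    static const unsigned char lead_mask[7] =
        { 0x00, 0x7F, 0x1F, 0x0F, 0x07, 0x03, 0x01 };
    unsigned long cp;
    int len, i;

    if (n == 0) return 0;
    len = utf8_seq_len_rfc2279(s[0]);
    if (len == 0 || (size_t)len > n) return 0;

    cp = (unsigned long)(s[0] & lead_mask[len]);
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;  /* not a continuation */
        cp = (cp << 6) | (unsigned long)(s[i] & 0x3F);
    }
    *out = cp;
    return (size_t)len;
}
```

With these helpers, the two bytes 0xC3 0xBC decode to the code point 0xFC and consume 2 bytes, while a lone Latin-1 byte 0xFC followed by ordinary text is treated as the start of a 6-byte sequence and fails the continuation check.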