Yes. As far as I can tell from the archives, there has been no discussion of 
the charset-related backwards compatibility problem. That is why I asked 
whether there were discussions "*exactly* about this issue".

You may want to redirect me to bug #9392 (http://bugs.php.net/bug.php?id=9392), but it 
doesn't seem to help...

In addition, I found that determining the internal charset from LC_CTYPE is 
dangerous, because setlocale() is not thread-safe in some libc 
implementations (glibc seems to be one of them).
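
As a rough illustration of the LC_CTYPE approach, here is a minimal C sketch. 
It is not taken from the patch or from the PHP source; the helper name 
charset_from_lc_ctype() is hypothetical. It relies only on the standard 
setlocale() and the POSIX nl_langinfo(CODESET) calls, and the comment marks 
the step that touches process-wide state.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not PHP source): copy the codeset name of the
 * environment's LC_CTYPE locale into buf.
 * Returns 0 on success, -1 on failure or if buf is too small. */
static int charset_from_lc_ctype(char *buf, size_t buflen)
{
    const char *codeset;

    assert(buf != NULL && buflen > 0);

    /* setlocale() changes the locale of the whole process, so a call
     * here can race with any other thread that reads or sets the
     * locale at the same time. */
    if (setlocale(LC_CTYPE, "") == NULL) {
        return -1;
    }

    codeset = nl_langinfo(CODESET);
    if (codeset == NULL || strlen(codeset) >= buflen) {
        return -1;
    }

    strcpy(buf, codeset);
    return 0;
}
```

The returned name depends entirely on the environment (LANG, LC_ALL and so 
on), so no particular value should be expected from it.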

I'm going to read the archives more carefully, though I think that even 
handling the charset only in phpinfo() will lead to the same discussion again 
in the future.
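
To make the underlying problem from the quoted message concrete, here is a 
small C sketch of an escaper that treats every byte as ISO-8859-1. It is only 
an illustration under stated assumptions: latin1_entity() and 
escape_as_latin1() are hypothetical names, the entity table covers just a 
handful of bytes, and UTF-8 is used as the example multibyte encoding. This is 
not the code of php_escape_html_entities().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced entity table: maps a single byte,
 * interpreted as ISO-8859-1, to its HTML entity (NULL if none). */
static const char *latin1_entity(unsigned char c)
{
    switch (c) {
    case '<':  return "&lt;";
    case '>':  return "&gt;";
    case '&':  return "&amp;";
    case 0xA9: return "&copy;";    /* copyright sign in ISO-8859-1 */
    case 0xC3: return "&Atilde;";  /* A with tilde in ISO-8859-1 */
    case 0xE9: return "&eacute;";  /* e with acute in ISO-8859-1 */
    default:   return NULL;
    }
}

/* Escape `in` one byte at a time with no charset information.
 * Output is truncated rather than overflowed; returns bytes written. */
static size_t escape_as_latin1(const char *in, char *out, size_t outlen)
{
    size_t used = 0;

    assert(in != NULL && out != NULL && outlen > 0);

    for (; *in != '\0'; in++) {
        const char *ent = latin1_entity((unsigned char)*in);
        char single[2] = { *in, '\0' };
        const char *piece = ent ? ent : single;
        size_t n = strlen(piece);

        if (used + n >= outlen) {
            break;
        }
        memcpy(out + used, piece, n);
        used += n;
    }
    out[used] = '\0';
    return used;
}
```

With this sketch, the ISO-8859-1 byte "\xE9" becomes "&eacute;" as intended, 
but the UTF-8 encoding of the same character, "\xC3\xA9", becomes 
"&Atilde;&copy;": the lead byte and the trail byte are each escaped as if 
they were separate characters.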


Moriyoshi Koizumi

"Wez Furlong" <[EMAIL PROTECTED]> wrote:

> Search the archives for the discussion.
> phpinfo could determine the charset as your patch does at the start,
> and then pass the info in php_escape_html_entities.
> 
> Seems easy to me.
> 
> --Wez.
> 
> On 10/16/02, "Moriyoshi Koizumi" <[EMAIL PROTECTED]> wrote:
> > Wez Furlong <[EMAIL PROTECTED]> wrote:
> > > Unfortunately, we absolutely must remain 100% backwards compatible with
> > > htmlentities(), so this patch should not be applied.
> > 
> > Were there any discussions exactly about this issue? I assume there is 
> > some historical reason, but I don't understand why 100% backwards 
> > compatibility is required for htmlentities().
> > The patched htmlentities() behaves the same way under the default 
> > configuration, and IMHO defaulting to iso-8859-1 is quite meaningless for 
> > scripts that use charsets other than that.
> > 
> > Hmm... otherwise I would like to suggest a mbstring function like 
> > mb_htmlentities(), but it would sound like a reinvention of the same 
> > wheel...
> > 
> > > However, I don't see a problem with making phpinfo determine the charset
> > > and passing that on to the internal htmlentities function?
> > 
> > The problem is that php_info_html_esc() in ext/standard/info.c calls 
> > php_escape_html_entities() with no charset information specified. Without 
> > the patch, every character is treated as ISO-8859-1 even if a fetched 
> > character is actually a mere first byte of a multibyte character.
> > 
> > 
> > Moriyoshi Koizumi
> > 
> > 
> > 
> > -- 
> > PHP Development Mailing List <http://www.php.net/>
> > To unsubscribe, visit: http://www.php.net/unsub.php
> 
> 
> 


