2012/3/13 Rasmus Lerdorf <ras...@lerdorf.com>:
> On 03/12/2012 03:05 AM, Yasuo Ohgaki wrote:
>> I thought default_charset became UTF-8, so I was expecting
>> following HTTP header.
>>
>> content-type  text/html; charset=UTF-8
>>
>> However, I got empty charset (missing 'charset=UTF-8').
>> So I looked up to source and found the line in SAPI.h
>>
>> 293   #define SAPI_DEFAULT_CHARSET        ""
>>
>> Empty string should be "UTF-8", isn't it?
>
> No, we can't force an output charset on people since it would end up
> breaking a lot of sites.

Right, so maybe for the next major release? 5.5.0?

As the first XSS advisory in 2000 states, explicitly setting the character
encoding will prevent certain XSS attacks. Recent browsers have much better
encoding handling, but setting the encoding explicitly is still better for
security.
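
Until the default changes, a script can declare the charset itself. This is
only a minimal sketch: content_type_header() is a hypothetical helper name
made up for illustration, and header() and headers_sent() are the only PHP
built-ins it relies on.

```php
<?php
// Hypothetical helper: build an explicit Content-Type header value
// so the browser never has to guess the encoding.
function content_type_header($charset = 'UTF-8')
{
    return 'Content-Type: text/html; charset=' . $charset;
}

// Send the header before any output is written.
if (!headers_sent()) {
    header(content_type_header());
}
```

This must run before the first byte of output, otherwise the header can no
longer be sent.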

> PHP 5.3's determine_charset behaves exactly like 5.4's. In 5.3 we have:
>
>    if (charset_hint == NULL)
>                return cs_8859_1;
>
> and in 5.4 we have:
>
>    if (charset_hint == NULL)
>                return cs_utf_8;
>
> So there is no difference in their guessing when there is no hint, the
> only difference is that in 5.4 we choose utf8 and in 5.3 we choose
> 8859-1 in that case.

I got this with 5.3:
<?php
echo htmlentities('<日本語UTF-8>', ENT_QUOTES);
echo htmlentities('<日本語UTF-8>', ENT_QUOTES, 'UTF-8');

&lt;&aelig;�&yen;&aelig;�&not;&egrave;&ordf;�UTF8
&gt;&lt;日本語UTF-8&gt;

So people migrating from 5.3 to 5.4 should not have problems.
Migrating from versions older than 5.3 to 5.4 will be problematic.
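
Passing the encoding explicitly makes the result independent of the version's
default. The sketch below assumes the script file itself is saved as UTF-8;
the ISO-8859-1 call is there only to reproduce the garbled case on purpose.

```php
<?php
$input = '<日本語UTF-8>';

// Explicit UTF-8: the multibyte characters pass through untouched.
$explicit = htmlentities($input, ENT_QUOTES, 'UTF-8');
echo $explicit, "\n"; // prints &lt;日本語UTF-8&gt;

// Explicit ISO-8859-1: each UTF-8 byte is treated as a separate
// character, so the text turns into entities such as &aelig; and &yen;.
$latin1 = htmlentities($input, ENT_QUOTES, 'ISO-8859-1');
echo $latin1, "\n";
```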

I always set all parameters for htmlentities/htmlspecialchars, so I had not
noticed that this changed from 5.3. People who hit problems may be migrating
from 5.2 or older. (RHEL5 uses 5.1.)

Since PHP does not have a default multibyte module, it may be good to have

input_encoding
internal_encoding
output_encoding

php.ini settings, with multibyte modules using them when they are set.
Alternatively, just make mbstring a default module.
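
A rough sketch of how a module might pick up such shared settings. Everything
here is hypothetical: resolve_encoding() is not an existing PHP function, the
array stands in for the proposed php.ini values, and the precedence (a
module's own value first, then the shared setting, then a fallback) is only
one possible assumption.

```php
<?php
// Hypothetical: look up an encoding for a module.
// $shared      stands in for the proposed php.ini settings.
// $moduleValue is the module's own setting, if it has one.
function resolve_encoding(array $shared, $name, $fallback, $moduleValue = null)
{
    // Assumption: an explicit module-level value takes precedence.
    if ($moduleValue !== null && $moduleValue !== '') {
        return $moduleValue;
    }
    // Otherwise use the shared setting when it is set.
    if (isset($shared[$name]) && $shared[$name] !== '') {
        return $shared[$name];
    }
    // Nothing set anywhere: use the caller's fallback.
    return $fallback;
}

$shared = array(
    'input_encoding'    => 'UTF-8',
    'internal_encoding' => 'UTF-8',
    'output_encoding'   => '', // left unset
);

echo resolve_encoding($shared, 'internal_encoding', 'ISO-8859-1'), "\n";     // prints UTF-8
echo resolve_encoding($shared, 'output_encoding', 'ISO-8859-1'), "\n";       // prints ISO-8859-1
echo resolve_encoding($shared, 'input_encoding', 'ISO-8859-1', 'SJIS'), "\n"; // prints SJIS
```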

This is a rather big change for a released version, but it is a simple, easy
change to make.

Regards,

--
Yasuo Ohgaki

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
