From:             
Operating system: 
PHP version:      5.3.2
Package:          SimpleXML related
Bug Type:         Bug
Bug description:  simplexml_load_file() doesn't use HTTP headers

Description:
------------
Seen at http://stackoverflow.com/questions/2899274/

If you use simplexml_load_file() to load a remote document via HTTP,
SimpleXML assumes that the content is UTF-8 regardless of the HTTP headers.
In the test script below, at the time of writing, Google's web server
returns something like:

-------------

HTTP/1.1 200 OK
Content-Type: text/xml; charset=GB2312
Date: Tue, 25 May 2010 05:05:17 GMT
Pragma: no-cache
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Cache-Control: no-cache, no-store, must-revalidate
expires=Thu, 24-May-2012 05:05:17 GMT; path=/; domain=.google.com
X-Content-Type-Options: nosniff
Server: igfe
X-XSS-Protection: 1; mode=block
Transfer-Encoding: chunked

<?xml version="1.0"?><xml_api_reply version="1">
<!-- GB2312-encoded content -->
</xml_api_reply>

-------------

The server advertises the content as "text/xml; charset=GB2312", but since
the XML declaration doesn't name an encoding, SimpleXML assumes the document
is UTF-8 and fails to load it.

If it is at all possible, SimpleXML (and DOM, I assume) should look at the
HTTP headers to find the document's encoding.
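
In the meantime, a userland workaround might look like the sketch below. It
is only an illustration of the idea, not a proposed patch: the helper names
charset_from_headers(), xml_body_to_utf8() and load_remote_xml() are
hypothetical, and it assumes the iconv extension is available and knows the
charset the server advertises.

```php
<?php
// Hypothetical helper: return the charset named in a Content-Type header,
// or null if no header carries one.
function charset_from_headers(array $headers)
{
    $charset = null;
    foreach ($headers as $header) {
        if (preg_match('/^Content-Type:.*;\s*charset\s*=\s*"?([^\s;"]+)/i', $header, $m)) {
            $charset = $m[1]; // a later header (e.g. after a redirect) wins
        }
    }
    return $charset;
}

// Hypothetical helper: convert the body to UTF-8 when the headers name a
// charset and the XML declaration does not name an encoding itself.
function xml_body_to_utf8($body, $charset)
{
    $declaresEncoding = preg_match('/^<\?xml[^>]*\sencoding\s*=/', $body) === 1;
    if ($charset === null || $declaresEncoding || strcasecmp($charset, 'UTF-8') === 0) {
        return $body;
    }
    $converted = iconv($charset, 'UTF-8', $body);
    // Fall back to the untouched body if iconv cannot convert it.
    return $converted === false ? $body : $converted;
}

// Hypothetical wrapper: fetch the document, then parse the converted body.
function load_remote_xml($url)
{
    $body = file_get_contents($url);
    if ($body === false) {
        return false;
    }
    // The HTTP wrapper fills $http_response_header in the local scope.
    $charset = charset_from_headers($http_response_header);
    return simplexml_load_string(xml_body_to_utf8($body, $charset));
}
```

With this, load_remote_xml('http://www.google.com/ig/api?weather=11791&hl=zh-CN')
would stand in for the simplexml_load_file() call. The sketch leaves the body
alone when the XML declaration already names an encoding, on the assumption
that the parser handles that case itself.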

Test script:
---------------
simplexml_load_file('http://www.google.com/ig/api?weather=11791&hl=zh-CN');

Actual result:
--------------
PHP Warning:  simplexml_load_file():
http://www.google.com/ig/api?weather=11791&hl=zh-CN:1: parser error : Input
is not proper UTF-8, indicate encoding !
Bytes: 0xC7 0xE7 0x22 0x2F in Command line code on line 1

Warning: simplexml_load_file():
http://www.google.com/ig/api?weather=11791&hl=zh-CN:1: parser error : Input
is not proper UTF-8, indicate encoding !
Bytes: 0xC7 0xE7 0x22 0x2F in Command line code on line 1
PHP Warning:  simplexml_load_file(): t_system
data="SI"/></forecast_information><current_conditions><condition data=" in
Command line code on line 1

Warning: simplexml_load_file(): t_system
data="SI"/></forecast_information><current_conditions><condition data=" in
Command line code on line 1
PHP Warning:  simplexml_load_file():
                                        ^ in Command line code on line 1

Warning: simplexml_load_file():

-- 
Edit bug report at http://bugs.php.net/bug.php?id=51903&edit=1