From:             brett at brettbrewer dot com
Operating system: Linux xq41.cyberlnc.com 2.6.18-5
PHP version:      5.2.6
PHP Bug Type:     SimpleXML related
Bug description:  Apostrophe character code (&#8217;) converted to garbage by
SimpleXML or LibXML

Description:
------------
When parsing an XML feed (WordPress) containing the character code for a
right single curly quote (&#8217;), the character is converted into ’.
Unfortunately I'm not able to get complete access to the server to
deactivate Zend Optimizer, ionCube, etc., so I'm pulling the OS info from
phpinfo(). I've included the URL of the actual feed that is causing the
problem. I found a really old, similar bug report for PHP 4.3.2, but
nothing for PHP 5. Here's the old bug report URL:

http://bugs.php.net/bug.php?id=24863&edit=2
I also found:
http://bugs.php.net/bug.php?id=26964&edit=2

which suggests a similar problem with htmlentities and html_entity_decode,
but I don't know if it's related. I'm sure my feed is UTF-8: if I
convert it to ISO9xxx-1 before passing it to my SimpleXML object, then
SimpleXML complains that it's not in UTF-8 format and aborts, so I'm pretty
sure it's not a UTF-8 encoding issue with the feed. I've included the feed
URL in the code sample below. The sample assumes it is inside a class, but
you can probably reproduce the symptoms just by removing the "$this->" in
two places.

Reproduce code:
---------------
$this->blog_url = "http://75.126.106.225/blog/feed/";
$rawFeed = file_get_contents($this->blog_url);
$xml = new SimpleXMLElement($rawFeed);

// You can see the results of the incorrect parsing of the feed in the left
// sidebar at http://75.126.106.225
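
A self-contained variant of the code above may be easier to run, since it
does not depend on the feed being reachable. This is only a sketch: the
inline XML string is a made-up stand-in for the feed (not its real content),
and the hex dump is there just to show which bytes SimpleXML hands back for
the title.

```php
<?php
// Hypothetical stand-in for the feed; the real feed content is not copied here.
$rawFeed = '<?xml version="1.0" encoding="UTF-8"?>'
    . '<rss><channel><item>'
    . '<title>It&#8217;s a test</title>'
    . '</item></channel></rss>';

$xml = new SimpleXMLElement($rawFeed);

// Cast the element to a string to get its text content.
$title = (string) $xml->channel->item->title;

// Dump the raw bytes so the result does not depend on how the
// terminal or browser decodes the output.
echo bin2hex($title), "\n";
```

If the dump contains the byte sequence e28099, the parser has replaced the
numeric reference with the UTF-8 encoding of U+2019 rather than keeping the
reference as written.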

Expected result:
----------------
Code should keep the &#8217; entity code intact or possibly convert it to
a plain apostrophe (').
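
As a rough illustration of the first option, the numeric reference can be
put back after parsing. This is a sketch of a possible workaround, not
something SimpleXML does itself. It assumes the mbstring extension is
loaded, and $title is a hypothetical value standing in for a string taken
from the parsed feed.

```php
<?php
// Assumption: the mbstring extension is available.
// Hypothetical title as returned after parsing, with U+2019 as UTF-8 bytes.
$title = "It\xE2\x80\x99s a test";

// Conversion map: every code point from 0x80 to 0x10FFFF,
// offset 0, mask 0x1FFFFF (keeps the code point unchanged).
$convmap = array(0x80, 0x10FFFF, 0, 0x1FFFFF);

// Replace each non-ASCII character with its decimal numeric reference.
$escaped = mb_encode_numericentity($title, $convmap, 'UTF-8');

echo $escaped, "\n";
```

ASCII characters are left alone, so only characters such as the curly quote
are turned back into numeric references.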

Actual result:
--------------
SimpleXML constructor seems to convert all instances of &#8217; into ’

If you use SimpleXML to parse the feed at http://75.126.106.225/blog/feed/
you should see the problem in the <title> of the second item in the feed. 

-- 
Edit bug report at http://bugs.php.net/?id=46129&edit=1