From: borys dot forytarz at gmail dot com
Operating system: Linux
PHP version: 5.2.3
PHP Bug Type: DOM XML related
Bug description: Really strange encoding problem - should work, but doesn't
Description:
------------
There is a problem with DOM and encoding. I have two separate files. The
first is a complete XHTML document (DTD, head, meta, body and further
content) saved in UTF-8. Its meta declaration says UTF-8, and the server
sends it as UTF-8 too. The second file is a simple fragment without any DTD,
head, meta or body, also saved in UTF-8. The problem is that when I import
nodes from the second file using importNode(), the output contains
incorrectly encoded characters (the ones that come from the second file).
This is strange because, as far as I have read, DOM works in UTF-8, so there
should be no such problem.
What is more, when I inspected properties such as actualEncoding, they
reported UTF-8.
If this is not a bug (though I think it is), how can I fix it? I cannot add
a DTD, head and body elements to the second file.
Reproduce code:
---------------
$this->dom = new DOMDocument('1.0','UTF-8');
$this->dom->encoding = 'UTF-8';
$this->dom->formatOutput = self::$formatOutput;
$this->dom->preserveWhiteSpace = self::$preserveWhiteSpace;
@$this->dom->loadHTMLFile($html);
...
echo $this->dom->saveXML();
The above works well for the complete XHTML file. But when I load an
incomplete file (also encoded in UTF-8) and import nodes from that second
document into the first one, the imported characters are not encoded
properly in the output.
I tried converting the whole output with iconv() and mb_convert_encoding(),
but it made no difference at all.
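A minimal, self-contained sketch of the same scenario may make the steps
easier to follow. It is an assumption-laden stand-in, not the original
class: the variable names ($fullHtml, $fragment, $main, $part) are
hypothetical, the two files are replaced by inline strings, and loadHTML()
is used on those strings instead of loading from disk. The output is
deliberately not shown, since it may depend on the PHP and libxml versions
in use.

```php
<?php
// Hypothetical stand-in for the complete XHTML file (UTF-8 declared in meta).
// "\xC5\x82" is the UTF-8 byte sequence for U+0142, a non-ASCII test character.
$fullHtml = '<html><head>'
    . '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />'
    . '<title>t</title></head>'
    . "<body><p>full \xC5\x82</p></body></html>";

// Hypothetical stand-in for the second file: no DTD, head, meta or body.
$fragment = "<content id=\"something\">\n<h1>some string \xC5\x82</h1>\n</content>";

// Load the complete document; @ hides parser warnings, as in the report.
$main = new DOMDocument('1.0', 'UTF-8');
@$main->loadHTML($fullHtml);

// Load the fragment into its own document.
$part = new DOMDocument('1.0', 'UTF-8');
@$part->loadHTML($fragment);

// Copy the <h1> from the fragment document into the body of the main one.
$h1 = $part->getElementsByTagName('h1')->item(0);
$body = $main->getElementsByTagName('body')->item(0);
$imported = $main->importNode($h1, true);
$body->appendChild($imported);

// Compare the non-ASCII character in <p> (from the complete document)
// with the one in <h1> (imported from the fragment).
echo $main->saveXML();
```

Running it and comparing the character inside <p> with the one inside the
imported <h1> shows whether the two sources are treated differently.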
Expected result:
----------------
Properly encoded characters from both the complete XHTML file and the
second, "poor" file. The second file looks like this:
<content id="something">
<h1>some string</h1>
</content>
Actual result:
--------------
Incorrectly encoded characters inside the <content> element.
--
Edit bug report at http://bugs.php.net/?id=41980&edit=1