On 3/21/07, Eli <[EMAIL PROTECTED]> wrote:
> What about html_entity_decode?
> http://www.php.net/html_entity_decode

No. It doesn't help in this case.

The DOMDocument->saveHTML() method converts any non-ASCII characters into
entities.
For example, if the DOM document has a text node with the value:
       שלום
it converts the string to entities:
       &#1513;&#1500;&#1493;&#1501;
This happens even though the string is already in UTF-8: the DOMDocument is
initialized with version "1.0" and encoding "UTF-8", the PHP file is in
UTF-8, and the XML file is in UTF-8 and has the <?xml version="1.0"
encoding="UTF-8"?> header.

Example:
<?php
$dom = new DOMDocument('1.0','utf-8');
$dom->loadXML("<html><body>שלום</body></html>");
$output = $dom->saveHTML();
header("Content-Type: text/html; charset=UTF-8");
echo $output;
?>

-thanks

Did you pass UTF-8 as the charset argument to html_entity_decode()?
Your code would then become:
<?php
$dom = new DOMDocument('1.0','utf-8');
$dom->loadXML("<html><body>שלום</body></html>");
$output = $dom->saveHTML();
header("Content-Type: text/html; charset=UTF-8");
echo html_entity_decode($output,ENT_QUOTES,"UTF-8");
?>

I'm not really sure about it, since I'm not using UTF-8 myself. Also have a
look at the comments under the html_entity_decode function, and at the
functions/comments of utf8_encode/utf8_decode.
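
To show what the charset argument changes, here is a minimal sketch that
skips the DOM part and only feeds html_entity_decode() the entity string
from the example. The variable names and the second sample string are made
up for illustration; I have not run this against the full saveHTML() output.

```php
<?php
// The entity string that saveHTML() produced in the example above.
$encoded = '&#1513;&#1500;&#1493;&#1501;';

// With the charset given explicitly, numeric entities come back as UTF-8 bytes.
$decoded = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');

// Caveat: markup-significant entities are decoded too, so running this over
// a whole page can turn escaped text into real markup.
$escaped = '<p>1 &lt; 2 &amp;&amp; 3 &gt; 2</p>';
$broken  = html_entity_decode($escaped, ENT_QUOTES, 'UTF-8');
```

So decoding the complete output should be fine for plain text like the
example, but it is probably unsafe as soon as the page contains escaped
characters such as &lt; or &amp; that have to stay escaped.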

Tijnema

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

