[PHP] Re: PHP DOM saveHTML outputs entities

2007-03-21 Thread Eli

> What about html_entity_decode?
> http://www.php.net/html_entity_decode


No. It doesn't help in this case.

The DOMDocument->saveHTML() method converts any non-ASCII characters into
entities.

For example, if the DOM document has a text node with the value:
שלום
it converts the string to entities:
&#1513;&#1500;&#1493;&#1501;
This happens although the string is already in UTF-8. The DOMDocument is
initialized with version 1.0 and encoding UTF-8, the PHP file is in
UTF-8, and the XML file is in UTF-8 and has the
<?xml version="1.0" encoding="UTF-8"?> header.


Example:
<?php
$dom = new DOMDocument('1.0', 'utf-8');
$dom->loadXML("<html><body>שלום</body></html>");
$output = $dom->saveHTML();
header("Content-Type: text/html; charset=UTF-8");
echo $output;
?>
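
The behaviour described above can also be checked from the command line, without a browser. The following is only a minimal sketch: it assumes the script file itself is saved as UTF-8 and that the DOM extension is loaded, and whether entities actually appear may depend on the PHP and libxml2 versions in use.

```php
<?php
// Sketch: inspect what saveHTML() produced for a non-ASCII text node.
$dom = new DOMDocument('1.0', 'utf-8');
$dom->loadXML('<html><body>שלום</body></html>');
$output = $dom->saveHTML();

// True when numeric character references such as &#1513; are present.
$hasEntities = (bool) preg_match('/&#\d+;/', $output);
echo $hasEntities ? "entities found\n" : "no numeric entities found\n";

// Whichever form was emitted, decoding should give the original word back.
$decoded = html_entity_decode($output, ENT_QUOTES, 'UTF-8');
var_dump(strpos($decoded, 'שלום') !== false);
```

A browser shows the same word in both cases, because it resolves the numeric references itself; the difference is only visible in the page source.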

-thanks

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Re: PHP DOM saveHTML outputs entities

2007-03-21 Thread Tijnema !

On 3/21/07, Eli [EMAIL PROTECTED] wrote:

> > What about html_entity_decode?
> > http://www.php.net/html_entity_decode
>
> No. It doesn't help in this case.
>
> The DOMDocument->saveHTML() method converts any non-ASCII characters into
> entities.
> For example, if the DOM document has a text node with the value:
>    שלום
> it converts the string to entities:
>    &#1513;&#1500;&#1493;&#1501;
> This happens although the string is already in UTF-8. The DOMDocument is
> initialized with version 1.0 and encoding UTF-8, the PHP file is in
> UTF-8, and the XML file is in UTF-8 and has the
> <?xml version="1.0" encoding="UTF-8"?> header.
>
> Example:
> <?php
> $dom = new DOMDocument('1.0', 'utf-8');
> $dom->loadXML("<html><body>שלום</body></html>");
> $output = $dom->saveHTML();
> header("Content-Type: text/html; charset=UTF-8");
> echo $output;
> ?>
>
> -thanks


Did you set the UTF-8 charset in the html_entity_decode function?
Your code would then become:
<?php
$dom = new DOMDocument('1.0', 'utf-8');
$dom->loadXML("<html><body>שלום</body></html>");
$output = $dom->saveHTML();
header("Content-Type: text/html; charset=UTF-8");
echo html_entity_decode($output, ENT_QUOTES, "UTF-8");
?>

I'm not really sure about it, as I'm not using UTF-8 myself. Also have a look
at the comments under the html_entity_decode function, and at the
functions/comments of utf8_encode/utf8_decode.
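
One caveat with running html_entity_decode over the whole output is that it also decodes references such as &lt; and &amp;, which can change the markup if the page contains escaped text. A possible refinement, shown here only as a sketch, is to decode just those references that stand for non-ASCII characters. The function name decode_non_ascii_entities is made up for this illustration and is not part of PHP.

```php
<?php
// Hypothetical helper (not a PHP built-in): decode only those character
// references whose decoded form contains a non-ASCII character, so that
// &lt;, &gt;, &amp;, &quot; and similar stay escaped.
function decode_non_ascii_entities(string $html): string
{
    $pattern = '/&(?:#\d+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);/';

    $result = preg_replace_callback(
        $pattern,
        function (array $m): string {
            $decoded = html_entity_decode($m[0], ENT_QUOTES | ENT_HTML401, 'UTF-8');
            // Keep the reference as written unless it decodes to non-ASCII.
            return preg_match('/[^\x00-\x7F]/', $decoded) ? $decoded : $m[0];
        },
        $html
    );

    // preg_replace_callback returns null on error; fall back to the input.
    return $result ?? $html;
}

$escaped = '<p>&#1513;&#1500;&#1493;&#1501; &lt;b&gt; &amp;</p>';
echo decode_non_ascii_entities($escaped), "\n";
```

In this sketch the Hebrew word is turned back into UTF-8 text, while &lt;b&gt; and &amp; are left as they were.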

Tijnema






Re: [PHP] Re: PHP DOM saveHTML outputs entities

2007-03-21 Thread Eli

Tijnema ! wrote:

> Did you set the UTF-8 charset in the html_entity_decode function?
> Your code would then become:
> <?php
> $dom = new DOMDocument('1.0', 'utf-8');
> $dom->loadXML("<html><body>שלום</body></html>");
> $output = $dom->saveHTML();
> header("Content-Type: text/html; charset=UTF-8");
> echo html_entity_decode($output, ENT_QUOTES, "UTF-8");
> ?>


Yes. This works... thanks! :-)

But what I actually wanted was to prevent the saveHTML() method from
converting to HTML entities in the first place, if that is possible at all.
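
One possible way to sidestep the conversion, sketched here under assumptions rather than offered as a fix for saveHTML() itself, is to serialize the root element with saveXML() instead. This swaps HTML serialization for XML serialization, so empty elements come out in XML style (for example <br/>). In the PHP/libxml2 builds I am aware of, text nodes are then written as UTF-8 rather than as numeric references, but this is worth checking on the target installation. The sketch assumes the script file is saved as UTF-8 and that the input carries an XML declaration naming UTF-8.

```php
<?php
// Sketch: serialize as XML instead of HTML to keep non-ASCII text as UTF-8.
$dom = new DOMDocument('1.0', 'utf-8');
$dom->loadXML(
    '<?xml version="1.0" encoding="UTF-8"?>' .
    '<html><body>שלום</body></html>'
);

// Passing a node to saveXML() dumps just that subtree,
// without the leading XML declaration.
$xmlOutput = $dom->saveXML($dom->documentElement);

// Report whether any numeric character references slipped in.
$escaped = (bool) preg_match('/&#(?:\d+|[xX][0-9a-fA-F]+);/', $xmlOutput);
echo $escaped ? "still contains entities\n" : "no numeric entities\n";
echo $xmlOutput, "\n";
```

If the output must be HTML rather than XML, decoding after saveHTML() as suggested earlier in the thread remains the fallback.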


-thanks!
