Well this is an interesting turn of events :)

We should now run over to the libxml folks and see if there is anything that can be done.

There *are* encoding options when you set up the DOMDocument, so the options seem to be there but are not working properly for one reason or another.
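
As a minimal sketch of those encoding options: the DOMDocument constructor takes a version and an encoding, and a commonly used workaround for loadHTML() is to put a charset declaration in front of the markup. The helper name loadHtmlAsUtf8() and the meta-tag hint are assumptions for illustration, not something confirmed in this thread, and the behavior may differ between libxml2 versions.

```php
<?php
// Hypothetical helper (not from this thread): load an HTML fragment as UTF-8.
function loadHtmlAsUtf8(string $fragment): DOMDocument
{
    // Version and encoding are the constructor's two optional arguments.
    $doc = new DOMDocument('1.0', 'UTF-8');

    // Assumption: a leading charset declaration tells the libxml2 HTML
    // parser how to decode the bytes that follow it.
    $hint = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">';
    $doc->loadHTML($hint . $fragment);

    return $doc;
}

$doc  = loadHtmlAsUtf8('<p>שלום</p>');
$text = $doc->getElementsByTagName('p')->item(0)->textContent;
echo $text, "\n";
```

If the hint is honored, $text should hold the original Hebrew string rather than mis-decoded bytes.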

On Apr 13, 2009, at 8:01 AM, Raymond Irving <xwis...@yahoo.com> wrote:



Michael,

You are absolutely right! It's loadHTML() that's causing the problems.


Best regards,
__
Raymond Irving


--- On Mon, 4/13/09, Michael A. Peters <mpet...@mac.com> wrote:

From: Michael A. Peters <mpet...@mac.com>
Subject: Re: [PHP] Generate XHTML (HTML compatible) Code using DOMDocument
To: "Michael Shadle" <mike...@gmail.com>
Cc: "Raymond Irving" <xwis...@yahoo.com>, "php- gene...@lists.php.net" <php-general@lists.php.net>
Date: Monday, April 13, 2009, 5:36 AM
Michael A. Peters wrote:


function makeHTML($document) {
    $buffer = $document->saveHTML();
    $output = html_entity_decode($buffer, ENT_QUOTES, "UTF-8");
    return $output;
}

I'll try it and see what it does.


Huh - I have not tried the above yet - but with

$test = $myxhtml->createElement('p','שלום');
$xmlBody->appendChild($test);

both saveXML() and saveHTML() do the right thing.

However if I have the string

<p>שלום</p>

and load it into a DOM -

With loadHTML() the UTF-8 is lost regardless of whether I
use saveXML() or saveHTML().

With loadXML() the UTF-8 is preserved regardless of whether
I use saveXML() or saveHTML().

php 5.2.9
libxml2 2.6.26-2.1.2.7 (CentOS 5.3)

I wonder if the real UTF-8 problem people experience is
really with loadHTML() and not with saveHTML()?
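
The comparison described above can be reproduced with a short script along these lines. It is only a sketch: the variable names are made up for illustration, and what loadHTML() yields depends on the PHP and libxml2 versions in use, so no particular result is claimed for that branch.

```php
<?php
$utf8   = 'שלום';
$markup = '<p>' . $utf8 . '</p>';

// Parse the same string with the XML parser...
$fromXml = new DOMDocument();
$fromXml->loadXML($markup);
$xmlText = $fromXml->getElementsByTagName('p')->item(0)->textContent;

// ...and with the HTML parser, giving it no charset hint.
$fromHtml = new DOMDocument();
$fromHtml->loadHTML($markup);
$htmlText = $fromHtml->getElementsByTagName('p')->item(0)->textContent;

// Report whether each parser handed back the original text.
var_dump($xmlText === $utf8);
var_dump($htmlText === $utf8);
```

Checking textContent right after loading keeps saveXML() and saveHTML() out of the picture, so any difference between the two results comes from the load step.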


--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

