ID:               30975
 Updated by:       [EMAIL PROTECTED]
 Reported By:      justin at jwd dot co dot uk
-Status:           Open
+Status:           Bogus
 Bug Type:         XML related
 Operating System: Windows XP
 PHP Version:      5.0.2
 New Comment:

Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

This is indeed expected: all XML extensions in PHP work internally with
UTF-8, so that is the encoding they return.
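
For readers who need the text back in the page's original charset, the
conversion has to be done explicitly after reading from the DOM. The
following is only a minimal sketch, assuming PHP 7 or later with the
iconv extension enabled and a target charset of ISO-8859-1; the helper
name from_dom_utf8() is hypothetical and not part of the DOM extension.

```php
<?php
// Hypothetical helper (not part of ext/dom): convert a UTF-8 string
// obtained from the DOM into the charset the caller needs.
function from_dom_utf8(string $text, string $charset): string
{
    $converted = iconv('UTF-8', $charset, $text);
    if ($converted === false) {
        // iconv() returns false when a character cannot be represented.
        throw new RuntimeException("Cannot convert text to $charset");
    }
    return $converted;
}

$html = '<html><head>'
      . '<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">'
      . '</head><body><p class="test_paragraph">Test&nbsp;Paragraph</p></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);

$xpath = new DOMXPath($doc);
$utf8  = $xpath->query('//p[@class="test_paragraph"]/text()')->item(0)->nodeValue;

// The DOM hands back UTF-8: the non-breaking space is the two bytes C2 A0.
echo strlen($utf8), "\n";    // 15

// After conversion the non-breaking space is the single byte A0.
$latin1 = from_dom_utf8($utf8, 'ISO-8859-1');
echo strlen($latin1), "\n";  // 14
```

Comparing byte lengths (or bin2hex() output) is a more reliable check
than echoing the strings, since a non-breaking space looks like an
ordinary space in most terminals and browsers.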


Previous Comments:
------------------------------------------------------------------------

[2004-12-03 13:24:16] justin at jwd dot co dot uk

Description:
------------
When retrieving sections of text from an HTML page using the new DOM
functions, the output is encoded as UTF-8 even though the input is
correctly detected as ISO-8859-1. This means extra code is needed to
convert back to the original charset of the input text. Surely
the DOM functions should either encode according to the detected input
encoding or at least provide some mechanism for setting the output
encoding? Or am I being stupid here?

Reproduce code:
---------------
<pre><?php
$xhtml= <<<HTML_END
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Untitled Document</title>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"
/></head>
<body><p class="test_paragraph">Test&nbsp;Paragraph</p></body>
HTML_END;

$in=new DomDocument();
$in->loadHTML($xhtml);
$xin=new DomXpath($in);

$text=$xin->query('//p[@class="test_paragraph"]/text()')->item(0)->nodeValue;

echo(htmlspecialchars($text)."\n"); // Outputs "Test Paragraph"

$text=iconv("UTF-8", "ISO-8859-1", $text);
echo(htmlspecialchars($text)."\n"); // Outputs "Test Paragraph"
?></pre>

Expected result:
----------------
Test Paragraph
Test Paragraph

Actual result:
--------------
Test Paragraph
Test Paragraph


------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=30975&edit=1
