ID:               36415
 Comment by:       jona at oismail dot com
 Reported By:      memoimyself at yahoo dot com dot br
 Status:           No Feedback
 Bug Type:         XSLT related
 Operating System: Windows XP
 PHP Version:      5.1.2
 New Comment:

Sample PHP Script that reproduces the bug for ISO-8859-15.
If the script is changed to use UTF-8 encoding and the Danish
characters are run through utf8_encode(), å Æ Ø Å will display fine but
æ ø will not.

<?php
// XML Document with Danish Characters
$xml = '<?xml version="1.0" encoding="ISO-8859-15"?>';
$xml .= '<root>æ ø å Æ Ø Å</root>';
// XSL Stylesheet
$xsl = '<?xml version="1.0" encoding="ISO-8859-15"?>';
$xsl .= '<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">';
$xsl .= '<xsl:output method="xml" version="1.0" encoding="ISO-8859-15"
indent="yes" media-type="application/xhtml+xml"
doctype-public="-//WAPFORUM//DTD XHTML Mobile 1.0//EN"
doctype-system="http://www.openmobilealliance.org/DTD/xhtml-mobile10.dtd"
omit-xml-declaration="no" />';
$xsl .= '<xsl:template match="/">';
$xsl .= '<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="da">';
$xsl .= '<head>';
$xsl .= '<meta http-equiv="Content-Type"
content="application/xhtml+xml; charset=ISO-8859-15" />';
$xsl .= '</head>';
$xsl .= '<body>';
$xsl .= '<xsl:value-of select="/root" />';
$xsl .= '</body>';
$xsl .= '</html>';
$xsl .= '</xsl:template>';
$xsl .= '</xsl:stylesheet>';

// Load XSL Stylesheet into memory
$obj_XML = new DOMDocument("1.0", "ISO-8859-15");
$obj_XSLT = new XSLTProcessor();
$obj_XML->loadXML($xsl);
$obj_XSLT->importStylesheet($obj_XML);

// Load XML document into memory
$obj_XML = new DOMDocument("1.0", "ISO-8859-15");
$obj_XML->loadXML($xml);

// Perform transformation
echo $obj_XSLT->transformToXml($obj_XML);
?>
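
A minimal sketch, not part of the original script: one way to keep the
test independent of the encoding the script file is saved in is to write
the six Danish characters as explicit ISO-8859-15 byte escapes and
convert them with iconv(). The variable names are illustrative
assumptions, and the sketch assumes the iconv extension is available.

```php
<?php
// Assumption: the six test characters as ISO-8859-15 bytes, written as
// escapes so the encoding of this file cannot alter them.
$latin9 = "\xE6 \xF8 \xE5 \xC6 \xD8 \xC5"; // æ ø å Æ Ø Å

// Convert explicitly from ISO-8859-15 to UTF-8.
$utf8 = iconv('ISO-8859-15', 'UTF-8', $latin9);

// Dump both byte sequences so they can be compared by eye.
echo bin2hex($latin9), "\n";
echo bin2hex($utf8), "\n";
```

Comparing the two hex dumps shows whether each character became the
expected two-byte UTF-8 sequence before it ever reaches the XSLT
processor.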


Previous Comments:
------------------------------------------------------------------------

[2006-03-21 19:15:24] vodka_carambar_lovely_spam at yahoo dot com

I got the same problem with a similar setup. The only difference was
the OS (Ubuntu). Same PHP version and Apache version. Double check that
the content of what is going into transformToXML really is valid UTF-8.
The slightest encoding slip-up anywhere in the code will screw everything
up.

In my case, I had to open a file that was supposed to be UTF-8, copy and
paste it into another document, and "save as" with UTF-8 encoding.
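
A minimal sketch of such a check, assuming a hypothetical helper named
is_valid_utf8() (not part of PHP or of the code in this report). It
relies on the PCRE /u modifier, which makes preg_match() fail on
malformed UTF-8; mb_check_encoding() is an alternative where the
mbstring extension is installed.

```php
<?php
// Hypothetical helper: true when $bytes is a well-formed UTF-8 sequence.
// With the /u modifier PCRE validates the subject as UTF-8; on malformed
// input preg_match() returns false instead of 1.
function is_valid_utf8($bytes)
{
    return preg_match('//u', $bytes) === 1;
}

var_dump(is_valid_utf8("\xC3\xA6")); // æ encoded as UTF-8
var_dump(is_valid_utf8("\xE6"));     // æ as a single ISO-8859-1/15 byte
```

Running the helper on the XML string right before loadXML() would show
whether the input is already broken at that point.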

------------------------------------------------------------------------

[2006-02-24 01:00:03] php-bugs at lists dot php dot net

No feedback was provided for this bug for over a week, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".

------------------------------------------------------------------------

[2006-02-16 13:17:21] [EMAIL PROTECTED]

Thank you for this bug report. To properly diagnose the problem, we
need a short but complete example script to be able to reproduce
this bug ourselves. 

A proper reproducing script starts with <?php and ends with ?>,
is max. 10-20 lines long and does not require any external 
resources such as databases, etc.

If possible, make the script source available online and provide
a URL to it here. Try to avoid embedding huge scripts into the report.

This is not a reproducible script.
We need something we can copy & paste and run.

And please do not open 2 reports for the same problem.

------------------------------------------------------------------------

[2006-02-16 12:59:42] memoimyself at yahoo dot com dot br

Description:
------------
An instance of the XSLTProcessor class, via its transformToXML method,
is used to transform XML documents using an XSL stylesheet.

The XSL document is in a file that is encoded in UTF-8. 

The PHP script is in a file also encoded in UTF-8.

The XML documents are created at run time from XML strings stored in a
MySQL 5 database whose character set is
UTF-8 and whose tables all have UTF-8 as their character set as well.

All the XML strings stored in the database are duly encoded in UTF-8.

Prior to data retrieval, a 'SET NAMES "utf8"' query is run to ensure
that all i/o operations use the UTF-8 character set.

Upon transformation, the results are output to the client preceded by
"header('Content-Type: text/html; charset=UTF-8')" to ensure that the
browser uses the correct character set.

The XSL file has the following top-level (child node of the document
element, as it should be) element:

<xsl:output encoding="utf-8" method="html"/>

When this code is run on a Windows server (Win XP, Apache 2.0.55, PHP
5.1.2), the transformation NEVER outputs UTF-8 text (it seems to output
ISO-8859-1), even if the 'method' attribute in the above element is
changed to 'xml', and even if a 'media-type' attribute is also used.

When run on a Linux server (also running PHP 5.1.2), the transformation
runs as expected and outputs proper UTF-8 text to the browser.

Reproduce code:
---------------
PHP code:

$dbo = new PDO(BD_DSN, BD_USERNAME, BD_PWD);
$dbo->query('SET NAMES "utf8"');
$sql = 'SELECT Report FROM reports WHERE Id =
'.$dbo->quote(strip_gpc_slashes($_GET['rid'])).' 
AND Author = '.$dbo->quote($_SESSION['user']);
$result = $dbo->query($sql);
$row = $result->fetch(PDO::FETCH_OBJ);
$xml = new DOMDocument('1.0', 'UTF-8');
$xml->loadXML($row->Report);
$xsl = new DOMDocument('1.0', 'UTF-8');
$xsl->load('/path/to/xsl/file.xsl');
$proc = new XSLTProcessor;
$proc->importStyleSheet($xsl);
$output = $proc->transformToXML($xml);
header('Content-Type: text/html; charset=utf-8');
print $output;

Start of XSL document:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3.org/2005/11/schema-for-xslt20.xsd">
        <xsl:output encoding="utf-8" method="html"/>
        <xsl:template match="/">...

Expected result:
----------------
All text output to the browser should be proper UTF-8. If the browser's
character encoding is set to UTF-8 (which it should be, given the
"content-type" header above), all accented characters should be
displayed correctly.

Actual result:
--------------
When the code is run on a Windows XP server, the text output to the
browser is NOT proper UTF-8 and all accented characters are replaced by
weird symbols.

When the code is run on a Linux server (also equipped with PHP 5.1.2),
everything works as expected and the output is proper UTF-8.
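
A hedged sketch of how the behavior described above might be checked
without the database or an external stylesheet file. The sample text
(é ç ã), the <p> wrapper, and the variable names are assumptions for
illustration; only the xsl:output element is taken from the report. It
assumes the dom and xsl extensions are loaded.

```php
<?php
// Self-contained check: no database, no external files. UTF-8 bytes are
// written as escapes so the encoding of this file is irrelevant.
$xmlSource = '<?xml version="1.0" encoding="UTF-8"?>'
           . "<root>\xC3\xA9 \xC3\xA7 \xC3\xA3</root>"; // é ç ã

$xslSource = '<?xml version="1.0" encoding="utf-8"?>'
           . '<xsl:stylesheet version="1.0"'
           . ' xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
           . '<xsl:output encoding="utf-8" method="html"/>'
           . '<xsl:template match="/">'
           . '<p><xsl:value-of select="/root"/></p>'
           . '</xsl:template>'
           . '</xsl:stylesheet>';

$xml = new DOMDocument();
$xml->loadXML($xmlSource);
$xsl = new DOMDocument();
$xsl->loadXML($xslSource);

$proc = new XSLTProcessor();
$proc->importStylesheet($xsl);
$output = $proc->transformToXml($xml);

// Report the raw bytes and whether they form well-formed UTF-8.
echo bin2hex($output), "\n";
var_dump(preg_match('//u', $output) === 1);
```

Running the same script on both the Windows and the Linux server and
comparing the hex dumps would show at the byte level whether the two
platforms really produce different output.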


------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=36415&edit=1
