From: maddam at volny dot cz
Operating system: Win XP
PHP version: 5.0.2
PHP Bug Type: *XML functions
Bug description: XML parser splits character data at the first accented
character (á, ý and so on)
Description:
------------
XML file element:
<data>Jak se má holoubátko ?</data>
This is Czech text with accented characters.
<?php
function characterData($parser, $data)
{
    $getdata = $data;
    echo $getdata; // should show 'Jak se má holoubátko ?'
}
?>
But the parser stops early and $getdata contains only 'Jak se m'.
On the next call for the same element, the parser delivers the rest of
the text and $getdata contains 'á holoubátko ?'.
This bug occurs in 5.0.2. In 4.3.9 and earlier everything works. I have
not tested 5.0.0 and 5.0.1.
Description: When character data contains accented characters (such as
á or ý), the parser splits the data into two parts. The first part holds
the characters up to the first accented character, and the second part
holds the rest of the element's text.
This bug does not appear with English text, which does not use such
characters.
Sorry for my English; I hope you understand. Contact me at [EMAIL PROTECTED]
or ICQ 25684007.
Reproduce code:
---------------
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE rokam [
<!ELEMENT data (#PCDATA)>
]>
<rokam>
<data>Jak se má</data>
<data>Zde doma je dobrý ocet</data>
</rokam>
<?php
function characterData($parser, $data)
{
    $getdata = $data;
    echo $getdata . '<br />';
}
?>
Expected result:
----------------
Output on screen, produced by two calls to characterData:
Jak se má
Zde doma je dobrý ocet
Actual result:
--------------
With PHP 5.0.2 the parser needs four calls to characterData and
outputs:
Jak se m
á
Zde doma je dobr
ý ocet
-------------------------------------------------
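The number of calls described above can be checked with a small diagnostic. This is only a sketch: traceCharacterData is a hypothetical helper name, the sample document is written as UTF-8 without the DOCTYPE (the report used ISO-8859-1), and it assumes a modern PHP (7 or later) with closures. How many chunks appear may depend on the PHP version and on the XML library PHP was built against, so no expected output is shown.

```php
<?php
// Hypothetical diagnostic: record every chunk that the parser passes
// to the character data handler, in order.
function traceCharacterData(string $xml): array
{
    $chunks = [];
    $parser = xml_parser_create();

    xml_set_character_data_handler(
        $parser,
        function ($parser, $data) use (&$chunks) {
            $chunks[] = $data;
        }
    );

    // xml_parse() returns 1 on success and 0 on failure.
    if (xml_parse($parser, $xml, true) !== 1) {
        $message = (string) xml_error_string(xml_get_error_code($parser));
        throw new RuntimeException($message);
    }

    return $chunks;
}

// Sample document, written here as UTF-8 (an assumption).
$xml = '<rokam><data>Jak se má</data><data>Zde doma je dobrý ocet</data></rokam>';

foreach (traceCharacterData($xml) as $i => $chunk) {
    // var_export() makes chunk boundaries and stray whitespace visible.
    echo $i + 1, ': ', var_export($chunk, true), "\n";
}
```

If one element's text shows up as more than one chunk, the handler is being called several times for that element, which matches the behaviour in this report.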
This bug can be worked around with the code below, which joins the two
parts delivered by the parser into one variable, $data. It joins
'Jak se m' with 'á'.
function characterData($parser, $data) {
    global $currentTag;
    // <code for repair start>
    global $lastdata, $lastTag;
    if (strcmp($lastTag, $currentTag) == 0) {
        // Second part of the same element: join it to the stored first part.
        $data = $lastdata . trim($data);
        $lastdata = $lastTag = '';
    } else {
        // First part: remember it and wait for the next call.
        $lastdata = $data;
        $lastTag = $currentTag;
        return;
    }
    // <code for repair end>
    // The normal code for function characterData can continue here.
}
Note that trim($data) must be there: the parser seems to add CR (0x0D)
LF (0x0A) to the end of the string $data (I think), and it must be
trimmed for the code to work properly.
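The workaround above assumes that each element arrives in exactly two parts. A more general sketch is to append every chunk to a buffer and use the text only when the element ends. The helper name collectDataElements is hypothetical, the sample is written as UTF-8, and the code assumes a modern PHP (7 or later) with closures; it is an illustration, not the fix applied in PHP itself.

```php
<?php
// Hypothetical helper: return the complete text of every <data> element,
// however many chunks the character data handler receives.
function collectDataElements(string $xml): array
{
    $results = [];   // finished element texts
    $buffer  = null; // null while the parser is outside a <data> element

    $parser = xml_parser_create();

    xml_set_element_handler(
        $parser,
        // Start tag: open a fresh buffer for <data>.
        function ($parser, $name, $attribs) use (&$buffer) {
            // Case folding is on by default, so compare case-insensitively.
            if (strcasecmp($name, 'data') === 0) {
                $buffer = '';
            }
        },
        // End tag: the buffer now holds the whole text of the element.
        function ($parser, $name) use (&$buffer, &$results) {
            if (strcasecmp($name, 'data') === 0 && $buffer !== null) {
                $results[] = $buffer;
                $buffer = null;
            }
        }
    );

    xml_set_character_data_handler(
        $parser,
        // Append each chunk; nothing is processed until the end tag.
        function ($parser, $data) use (&$buffer) {
            if ($buffer !== null) {
                $buffer .= $data;
            }
        }
    );

    // xml_parse() returns 1 on success and 0 on failure.
    if (xml_parse($parser, $xml, true) !== 1) {
        $message = (string) xml_error_string(xml_get_error_code($parser));
        throw new RuntimeException($message);
    }

    return $results;
}

// Sample document, written here as UTF-8 (an assumption).
$xml = '<rokam><data>Jak se má</data><data>Zde doma je dobrý ocet</data></rokam>';
print_r(collectDataElements($xml));
```

Because the text is used only at the end tag, the result does not depend on how many times the handler is called, and no trim() is needed to join the parts.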
--
Edit bug report at http://bugs.php.net/?id=30887&edit=1