ID:               36775
 User updated by:  ez at daoldskool dot org
 Reported By:      ez at daoldskool dot org
-Status:           Feedback
+Status:           Open
 Bug Type:         WDDX related
 Operating System: OSX Tiger 10.4.5
 PHP Version:      5.1.2
 New Comment:

once again the proof is live, here :

http://peoplemode.daoldskool.org:88/__dev/test/test_NATIVE.php

and the source is here :

http://peoplemode.daoldskool.org:88/__dev/test/test_NATIVE.php.s

PLUS you have it described here :

http://de2.php.net/manual/en/function.wddx-deserialize.php

and stop fooling me, i've been into the code : 
PHP_FUNCTION(wddx_deserialize) is a wrapper for
int php_wddx_deserialize_ex(char *value, int vallen, zval *return_value)

what is php_wddx_deserialize_ex if not an instance of the EXPAT 
parser ? see line 1140 : parser = XML_ParserCreate("ISO-8859-1")

are you really the author of these lines ?

thanx


Previous Comments:
------------------------------------------------------------------------

[2006-03-18 21:31:16] [EMAIL PROTECTED]

>if you don't want the wddx_deserializer to mess with a 
>utf8 encoded document, you have to pass it utf8 encoded
Okay. Show me.

>the bug has been already reported several times and is still open 
No, it's not. It's closed as bogus.

>and YES wddx functions ARE using EXPAT :
>from the 5.1.2 release sources :
>ext/wddx.c, line 25 :
>#include "ext/xml/expat_compat.h"
Huh? Did you try to look into this file?
It's included *exactly* because libxml is used everywhere instead of
expat.

Please, give me short and complete reproduce code with expected and
actual results, and enough talking about what's crazy and what's not.
That's all I want to get from you.

------------------------------------------------------------------------

[2006-03-18 21:16:29] ez at daoldskool dot org

Well, tony, the problem is pretty self evident :

if you don't want the wddx_deserializer to mess with a utf8 
encoded document, you have to pass it utf8 encoded

doesn't this sound weird to you ? wddx_deserializer can only 
work on a document that has been utf8 encoded twice

it's crazy !

the bug has been already reported several times and is still 
open :

http://bugs.php.net/bug.php?id=35241

and look at the contributions in the documentation :

http://de2.php.net/manual/en/function.wddx-deserialize.php

it seems like this bug was introduced with release 5

and YES wddx functions ARE using EXPAT :

from the 5.1.2 release sources :

ext/wddx.c, line 25 :
#include "ext/xml/expat_compat.h"

ext/wddx.c, line 1140 :
parser = XML_ParserCreate("ISO-8859-1");

---

BTW, why force the encoding here ? EXPAT should recognize 
the encoding from the encoding declaration in the 
document itself :
http://www.xml.com/pub/a/1999/09/expat/reference.html

all i am asking is to be able to work transparently on 
unicode documents without the pain of encoding them twice
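
To make the "encoded twice" complaint concrete, here is a minimal C sketch of what happens in general when bytes that are already UTF-8 are read as if each byte were one ISO-8859-1 character and then written out as UTF-8 again. The helper latin1_to_utf8 is hypothetical and written for this illustration only; it is not taken from ext/wddx.c or from Expat, and it models the general mis-decoding effect rather than the exact code path inside wddx_deserialize.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of PHP or Expat): re-encode a buffer as
 * UTF-8, treating every input byte as one ISO-8859-1 character.
 * `out` must have room for at least 2 * len + 1 bytes.
 * Returns the number of bytes written, not counting the final NUL. */
size_t latin1_to_utf8(const unsigned char *in, size_t len, unsigned char *out)
{
    size_t n = 0;
    size_t i;

    assert(in != NULL && out != NULL);

    for (i = 0; i < len; i++) {
        if (in[i] < 0x80) {
            /* ASCII is identical in ISO-8859-1 and UTF-8. */
            out[n++] = in[i];
        } else {
            /* U+0080..U+00FF become a two-byte UTF-8 sequence. */
            out[n++] = (unsigned char)(0xC0 | (in[i] >> 6));
            out[n++] = (unsigned char)(0x80 | (in[i] & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}
```

Feeding it the two UTF-8 bytes of "é" (0xC3 0xA9) yields four bytes, 0xC3 0x83 0xC2 0xA9, which a UTF-8 aware viewer shows as "Ã©". Pure ASCII input passes through unchanged, which is why the problem only shows up with accented or other non-ASCII characters.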

did you look at this code : 
http://peoplemode.daoldskool.org:88/__dev/test/test_NATIVE.php
http://peoplemode.daoldskool.org:88/__dev/test/test_NATIVE.php.s

doesn't it look strange to you that i have to utf8_encode 
the XML stream before passing it to wddx_deserialize ? the 
XML stream is already unicode

this is for real, check it !

------------------------------------------------------------------------

[2006-03-18 18:15:39] [EMAIL PROTECTED]

>it seems like wddx functions are still using the EXPAT xml parser
Only if you compiled them this way.

Sorry, I still don't get what is the problem and what are you
proposing.

------------------------------------------------------------------------

[2006-03-18 13:19:10] ez at daoldskool dot org

Got the cli binary compiled from sources (stable release 5.1.2 & cvs
trunk) on OS X, and could reproduce the bug

it seems like wddx functions are still using the EXPAT xml parser

according to EXPAT api documentation, the method XML_ParserCreate can
recognize the document encoding based on the document declaration
headers

otherwise, XML_ParserCreate can work on those 4 different encodings
US-ASCII, UTF-8, UTF-16, ISO-8859-1 

so i am working to find a bulletproof way to check the document
encoding declaration within xml headers

only if the xml stream has no encoding declaration is it
legitimate to decode strings while parsing the tree
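
The check described above could look roughly like the following C sketch. The function name xml_declared_encoding is hypothetical and is not part of PHP or Expat; the sketch assumes an ASCII-compatible byte stream (it does not handle UTF-16) and does only a simple textual scan of the XML declaration, so it is a starting point rather than a bulletproof detector.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* XML whitespace: space, tab, CR, LF. */
static int is_xml_ws(char c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

/* Hypothetical helper: copy the encoding named in the XML declaration of
 * `xml` into `out` (capacity `cap`, always NUL-terminated).
 * Returns 1 if an encoding declaration was found, 0 otherwise. */
int xml_declared_encoding(const char *xml, char *out, size_t cap)
{
    const unsigned char *u;
    const char *end, *p, *q;
    char quote;
    size_t len;

    assert(xml != NULL && out != NULL && cap > 0);
    out[0] = '\0';

    /* Skip a UTF-8 byte order mark if one is present. */
    u = (const unsigned char *)xml;
    if (u[0] == 0xEF && u[1] == 0xBB && u[2] == 0xBF)
        xml += 3;

    /* The declaration, if any, must open the document. */
    if (strncmp(xml, "<?xml", 5) != 0 || !is_xml_ws(xml[5]))
        return 0;
    end = strstr(xml, "?>");
    if (end == NULL)
        return 0;

    /* Look for encoding = "name" or encoding = 'name' inside it. */
    p = strstr(xml, "encoding");
    if (p == NULL || p > end)
        return 0;
    p += strlen("encoding");
    while (p < end && is_xml_ws(*p))
        p++;
    if (p >= end || *p != '=')
        return 0;
    p++;
    while (p < end && is_xml_ws(*p))
        p++;
    if (p >= end || (*p != '"' && *p != '\''))
        return 0;
    quote = *p++;
    q = memchr(p, quote, (size_t)(end - p));
    if (q == NULL)
        return 0;

    len = (size_t)(q - p);
    if (len == 0 || len >= cap)
        return 0;
    memcpy(out, p, len);
    out[len] = '\0';
    return 1;
}
```

A caller would pass the raw packet and a small buffer, and fall back to a default encoding only when the function returns 0, i.e. when the document declares nothing itself.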

MHO

am i missing something ? anyone agree ?

anyone

------------------------------------------------------------------------

[2006-03-17 19:49:24] ez at daoldskool dot org

alright, let's roll !

------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
    http://bugs.php.net/36775
