ID:               36775
User updated by:  ez at daoldskool dot org
Reported By:      ez at daoldskool dot org
-Status:          Feedback
+Status:          Open
Bug Type:         WDDX related
Operating System: OSX Tiger 10.4.5
PHP Version:      5.1.2
New Comment:

once again the proof is live, here :
http://peoplemode.daoldskool.org:88/__dev/test/test_NATIVE.php

and the source is here :
http://peoplemode.daoldskool.org:88/__dev/test/test_NATIVE.php.s

PLUS you have it described here :
http://de2.php.net/manual/en/function.wddx-deserialize.php

and stop fooling me, i've been into the code :

PHP_FUNCTION(wddx_deserialize)
is a wrapper for
int php_wddx_deserialize_ex(char *value, int vallen, zval *return_value)

what is php_wddx_deserialize_ex if not an instance of the EXPAT parser :
line 1140
parser = XML_ParserCreate("ISO-8859-1")

are you really the author of these lines ?

thanx


Previous Comments:
------------------------------------------------------------------------

[2006-03-18 21:31:16] [EMAIL PROTECTED]

>if you don't want the wddx_deserializer to mess with an
>utf8 encoded document, you have to pass it utf8 encoded

Okay. Show me.

>the bug has been already reported several times and is still open

No, it's not. It's closed as bogus.

>and YES wddx functions ARE using EXPAT :
>from the 5.1.2 release sources :
>ext/wddx.c, line 25 :
>#include "ext/xml/expat_compat.h"

Huh? Did you try to look into this file?
It's included *exactly* because libxml is used everywhere instead of
expat.

Please, give me short and complete reproduce code with expected and
actual results, and enough talking about what's crazy and what's not.
That's all I want to get from you.

------------------------------------------------------------------------

[2006-03-18 21:16:29] ez at daoldskool dot org

Well, tony, the problem is pretty self evident :

if you don't want the wddx_deserializer to mess with an utf8 encoded
document, you have to pass it utf8 encoded

doesn't this sound weird to you ?
wddx_deserializer can only work on a document that is utf8 encoded twice
it's crazy !

the bug has been already reported several times and is still open :
http://bugs.php.net/bug.php?id=35241

and look at the contributions in the documentation :
http://de2.php.net/manual/en/function.wddx-deserialize.php

it seems like this bug was introduced with release 5

and YES wddx functions ARE using EXPAT :
from the 5.1.2 release sources :
ext/wddx.c, line 25 :
#include "ext/xml/expat_compat.h"
ext/wddx.c, line 1140 :
parser = XML_ParserCreate("ISO-8859-1");

--- BTW, why force the encoding here ? EXPAT should recognize the
encoding, according to the encoding declaration in the document itself :
http://www.xml.com/pub/a/1999/09/expat/reference.html

all i am asking is to be able to work transparently on unicode documents
without the pain of encoding them twice

did you look at this code :
http://peoplemode.daoldskool.org:88/__dev/test/test_NATIVE.php
http://peoplemode.daoldskool.org:88/__dev/test/test_NATIVE.php.s

doesn't it look strange to you that i have to utf8_encode the XML stream
before passing it to wddx_deserialize : the XML stream is already
unicode

this is for real, check it !

------------------------------------------------------------------------

[2006-03-18 18:15:39] [EMAIL PROTECTED]

>it seems like wddx functions are still using the EXPAT xml parser

Only if you compiled them this way.
Sorry, I still don't get what is the problem and what are you
proposing.

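
For reference, the "encoded twice" situation described in the 21:16:29 comment can be shown at the byte level. The sketch below is a plain ISO-8859-1 to UTF-8 conversion, the kind of conversion PHP's utf8_encode() performs. The function name latin1_to_utf8 is hypothetical and the code is not taken from ext/wddx.c. It only illustrates what happens when bytes that are already UTF-8 get treated as ISO-8859-1 and converted again.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Convert ISO-8859-1 bytes to UTF-8 (hypothetical helper, for illustration).
 * dst must have room for 2 * len + 1 bytes.
 * Returns the number of bytes written, not counting the trailing NUL.
 */
size_t latin1_to_utf8(const unsigned char *src, size_t len, unsigned char *dst)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);

    for (size_t i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            /* ASCII range: one byte, unchanged */
            dst[out++] = c;
        } else {
            /* 0x80..0xFF: two-byte UTF-8 sequence */
            dst[out++] = (unsigned char)(0xC0 | (c >> 6));
            dst[out++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    dst[out] = '\0';
    return out;
}
```

Feeding it the single ISO-8859-1 byte E9 ("é") gives the UTF-8 sequence C3 A9. Feeding it those two UTF-8 bytes again gives C3 83 C2 A9, which is the double-encoded form the reporter says he has to produce before calling wddx_deserialize.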
------------------------------------------------------------------------

[2006-03-18 13:19:10] ez at daoldskool dot org

Got the cli binary compiled from sources (stable release 5.1.2 & cvs
trunk) on OS X, and could reproduce the bug

it seems like wddx functions are still using the EXPAT xml parser

according to EXPAT api documentation, the method XML_ParserCreate can
recognize the document encoding based on the document declaration
headers
otherwise, XML_ParserCreate can work on those 4 different encodings :
US-ASCII, UTF-8, UTF-16, ISO-8859-1

so i am working to find a bulletproof way to check the document encoding
declaration within xml headers

only if the xml stream has no encoding declaration is it legitimate to
decode strings while parsing the tree

MHO

am i missing something ? anyone agree ? anyone

------------------------------------------------------------------------

[2006-03-17 19:49:24] ez at daoldskool dot org

alright, let's roll !

------------------------------------------------------------------------

The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at
    http://bugs.php.net/36775

-- 
Edit this bug report at http://bugs.php.net/?id=36775&edit=1
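
The check proposed in the 13:19:10 comment (look for an encoding declaration in the XML header before deciding how to decode) might be sketched as follows. This is a minimal illustration, not code from PHP or expat: the function name xml_declared_encoding is hypothetical, the input is assumed to be a NUL-terminated, ASCII-compatible byte stream (so UTF-16 input is not handled), and a leading byte order mark is not skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Advance p past XML whitespace, never going beyond end. */
static const char *skip_ws(const char *p, const char *end)
{
    while (p < end && (*p == ' ' || *p == '\t' || *p == '\r' || *p == '\n'))
        p++;
    return p;
}

/*
 * Copy the encoding name from a leading XML declaration into out.
 * Returns 1 when a declaration with an encoding is found, 0 otherwise.
 * Hypothetical helper for illustration only.
 */
int xml_declared_encoding(const char *xml, char *out, size_t outsize)
{
    assert(xml != NULL && out != NULL && outsize > 0);
    out[0] = '\0';

    /* The declaration must be the very first thing in the document. */
    if (strncmp(xml, "<?xml", 5) != 0)
        return 0;
    /* Reject other processing instructions such as <?xml-stylesheet ...?> */
    if (xml[5] != ' ' && xml[5] != '\t' && xml[5] != '\r' && xml[5] != '\n')
        return 0;

    const char *end = strstr(xml, "?>");
    if (end == NULL)
        return 0;

    /* Only accept "encoding" when it appears inside the declaration. */
    const char *p = strstr(xml, "encoding");
    if (p == NULL || p >= end)
        return 0;

    p = skip_ws(p + strlen("encoding"), end);
    if (p >= end || *p != '=')
        return 0;
    p = skip_ws(p + 1, end);
    if (p >= end || (*p != '"' && *p != '\''))
        return 0;

    char quote = *p++;
    const char *close = memchr(p, quote, (size_t)(end - p));
    if (close == NULL)
        return 0;

    size_t n = (size_t)(close - p);
    if (n == 0 || n >= outsize)
        return 0;

    memcpy(out, p, n);
    out[n] = '\0';
    return 1;
}
```

A caller could use the result to decide whether to fall back to a default encoding: a return value of 0 means the stream carries no usable declaration, which is the only case where the comment argues that a forced decoding is legitimate.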