ID:               34928
User updated by:  clemens at gutweiler dot net
Reported By:      clemens at gutweiler dot net
Status:           Open
Bug Type:         WDDX related
Operating System: Linux
PHP Version:      4.4.0
New Comment:

Hint: With PHP 4.3.11, WDDX returns the correct characters.


Previous Comments:
------------------------------------------------------------------------

[2005-10-24 16:44:00] clemens at gutweiler dot net

"Real" Unicode characters do not work either. But utf8_encode() of
chr() should return valid characters.

Test for "real" Unicode characters:

<?php
header( 'Content-Type: text/html; charset=UTF-8' );
?>
<html>
<body>
<form method="post">
<textarea name="text"><?php echo $_POST['text'] ?></textarea>
<input type="submit" />
</form>
</body>
</html>
<pre>
<?php
$text = 'umlaute: '.chr( 220 ).chr( 228 ).chr( 246 ).chr( 223 );
$text = utf8_encode( $text );

if( isset( $_POST['text'] ) )
    $text = $_POST['text'];

var_dump( $text );

$wddx = wddx_serialize_value( $text );
var_dump( htmlentities( $wddx ) );

$data = wddx_deserialize( $wddx );
var_dump( $data );

$data = wddx_deserialize( '<?xml version="1.0" encoding="UTF-8" ?>'."\n".$wddx );
var_dump( $data );

$data = wddx_deserialize( '<?xml version="1.0" encoding="ISO-8859-1" ?>'."\n".$wddx );
var_dump( $data );

show_source( __FILE__ );
?>
</pre>

With PHP 4.4.0 and 5.0.5 I do not get valid UTF-8 characters back.

------------------------------------------------------------------------

[2005-10-21 19:23:42] [EMAIL PROTECTED]

Try to use real UTF8 chars instead of chr().

------------------------------------------------------------------------

[2005-10-20 12:01:24] clemens at gutweiler dot net

Why is this bug bogus?
It uses utf8_encode() to encode the data to UTF-8.

The manual also says: "Note: If you want to serialize non-ASCII
characters you have to convert your data to UTF-8 first (see
utf8_encode() and iconv()).".

So the code should be correct?

------------------------------------------------------------------------

[2005-10-20 11:20:53] [EMAIL PROTECTED]

Please do not submit the same bug more than once. An existing
bug report already describes this very problem.
Even if you feel that your issue is somewhat different, the resolution
is likely to be the same.

Thank you for your interest in PHP.

See bug #34913.

------------------------------------------------------------------------

[2005-10-20 11:01:33] clemens at gutweiler dot net

Description:
------------
Umlaut characters in UTF-8 are not encoded and decoded correctly by
wddx_serialize_value() and wddx_deserialize(), respectively.

In PHP 5, the code with the XML header and UTF-8 encoding returns the
ISO-8859-1 characters and not the UTF-8 characters. Is that a bug too?

Reproduce code:
---------------
<?php
header( 'Content-Type: text/html; charset=UTF-8' );
echo '<pre>';

$original = utf8_encode( 'umlaute: '.chr( 220 ).chr( 228 ).chr( 246 ).chr( 223 ) );
var_dump( $original );

$wddx = wddx_serialize_value( $original );
#var_dump( htmlentities( $wddx ) );

$data = wddx_deserialize( $wddx );
var_dump( $data );

$data = wddx_deserialize( '<?xml version="1.0" encoding="UTF-8" ?>'."\n".$wddx );
var_dump( $data );
?>

Expected result:
----------------
string(17) "umlaute: Üäöß"
string(17) "umlaute: Üäöß"
string(17) "umlaute: Üäöß"

Actual result:
--------------
string(17) "umlaute: Üäöß"
string(17) "umlaute: ÿäöÿ"
string(17) "umlaute: ÿäöÿ"

------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=34928&edit=1