 ID:               41554
 Updated by:       [EMAIL PROTECTED]
 Reported By:      victorepand at gmail dot com
-Status:           Open
+Status:           Feedback
-Bug Type:         *Languages/Translation
+Bug Type:         Strings related
 Operating System: Linux
 PHP Version:      4.4.7
 New Comment:

Thank you for this bug report. To properly diagnose the problem, we
need a short but complete example script to be able to reproduce
this bug ourselves.

A proper reproducing script starts with <?php and ends with ?>,
is max. 10-20 lines long and does not require any external
resources such as databases, etc. If the script requires a
database to demonstrate the issue, please make sure it creates
all necessary tables, stored procedures etc.

Please avoid embedding huge scripts into the report.


Previous Comments:
------------------------------------------------------------------------

[2007-06-01 01:32:55] [EMAIL PROTECTED]

My gut reaction to your problem is to mention that you've probably
mixed up ISO 8859-1 and Windows-1252: the two are commonly confused
with each other, the Windows encoding containing several more
characters:

However, that behavior does not precisely match your predicament, as
© and ® are part of ISO 8859-1. Furthermore, the URL you supplied is
already encoded in UTF-8. Perhaps you are double encoding?

Either way, this is not a problem with the documentation, except
possibly the fact that the user notes on utf8_encode are waaaaay too
long and some of the info needs to be integrated into the main docs.

------------------------------------------------------------------------

[2007-06-01 00:57:31] victorepand at gmail dot com

Description:
------------
I have used the function utf8_encode to encode iso-8859-1 pages into
UTF-8 and displayed them on my site, but strange and funny characters
are appearing such as "" and "Â".

It turns out that the iso-8859-1 page contains characters such as
these: ©,,,,,®,,

These characters display fine in my browser on the iso-8859-1 page,
but when I use the utf8_encode function and display the result on my
utf-8 page, it is garbled.

So I have found that the only solution is to manually convert all of
the characters above before using the utf8_encode function. That
solves the problem crudely, but it is not a perfect solution. What if
I have missed any characters? Isn't there a cleaner method, a PHP
function, that will do all this conversion without worry and without
missing any characters?

Reproduce code:
---------------
Here is an example of an iso-8859-1 page which displays fine in my
browser, but contains characters such as ©,,,,,®,, mentioned above:

http://www.jardenstore.com/product.aspx?bid=18&pid=1251

Expected result:
----------------
After using the utf8_encode function, I expected to see the page
displaying correctly again on my UTF-8 page with these characters
intact: ©,,,,,®,,

Actual result:
--------------
Instead, the result was garbled like this:

â,â,â,â,Â,ââ¢,ââ¢,â,é,ð,â¢,,,è,Ž,

------------------------------------------------------------------------
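
For anyone hitting the same symptom, the two suspicions raised in the
01:32:55 comment (a Windows-1252 page treated as ISO 8859-1, and double
encoding) can be reproduced in a few lines. This is only a minimal
sketch, not the reporter's script: the sample bytes are made up, it
assumes the source page is really Windows-1252, and it assumes the iconv
extension is available. iconv() with ISO-8859-1 as the source charset is
used to stand in for utf8_encode(), which always assumes ISO-8859-1
input.

```php
<?php
// Hypothetical sample input: "Acme", then byte 0x99 (the trademark sign
// in Windows-1252), a space, and byte 0xA9 (the copyright sign in both
// Windows-1252 and ISO-8859-1).
$bytes = "Acme\x99 \xA9";

// Read as ISO-8859-1, which is what utf8_encode() assumes:
// 0x99 becomes the invisible C1 control character U+0099.
$asLatin1 = iconv('ISO-8859-1', 'UTF-8', $bytes);

// Read as Windows-1252: 0x99 becomes U+2122 (trademark sign).
$asCp1252 = iconv('Windows-1252', 'UTF-8', $bytes);

// Double encoding: text that is already UTF-8 is treated as
// ISO-8859-1 and encoded a second time.
$double = iconv('ISO-8859-1', 'UTF-8', $asCp1252);

echo bin2hex($asLatin1), "\n"; // 41636d65c29920c2a9
echo bin2hex($asCp1252), "\n"; // 41636d65e284a220c2a9
echo bin2hex($double), "\n";   // 41636d65c3a2c284c2a220c382c2a9
```

The last line decodes to "Acme", then "â", an invisible control
character and "¢", followed by " Â©", which resembles the "â¢" and "Â"
fragments in the actual result above. Note that a few byte values
(0x81, 0x8D, 0x8F, 0x90, 0x9D) are undefined in Windows-1252, so
iconv() may fail on input containing them, depending on the iconv
implementation.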
